🥕 Instacart vs Me: A Data War
How does my purchasing behavior on specific days of the week compare with that of other Instacart users?
How It All Started
After reading Dear Data by Giorgia Lupi and Stefanie Posavec, I was inspired to track data on something within the confines of my home, and to understand and analyze real-world observations through data visualization. I did this as an individual project for a class taught by Professor Nick Cawthon.
1 Analyze
The subject I chose was the grocery consumption pattern in my house, so that I could use the data to make more conscious purchasing and consumption decisions based on the insights.
Before settling on my data collection parameters, I did the following:
∙ Analyze the items currently in the refrigerator that are frequently bought and disposed of -- Items like bread, eggs, and milk were standard and most frequently bought, but fruits and other dairy products went stale more often and were disposed of.
∙ Plan the timeline for collecting data -- I decided to collect data for two weeks, as that gave me ample time to analyze how often I order groceries and to track repeat orders.
∙ Decide which grocery stores to focus on for data collection -- I chose Instacart as my primary source of data, as it would be easy to maintain since all of the invoicing and transactions happen digitally.
2 Compile
As depicted in the image above, I began to identify each activity I performed on the items I purchased during the period of 28th February to 10th March. These activities included when the item was purchased, consumed, finished, or disposed of due to expiry/going stale. It also included the placement of the items in the kitchen -- outside on the platform, refrigerated, or frozen.
3 Collect
I was looking for public datasets with US grocery purchase data when I came across a dataset uploaded by Instacart on Kaggle. It holds data on over 3 million purchase orders, including products, aisles, departments, repeat orders, and a timestamp for each order.
4 Combine
I created my own dataset by digitizing the data I had captured manually. This was important so that the two datasets I needed to compare were in the same format and could be mapped to each other.
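To give a feel for what a digitized entry might look like, here is a minimal Python sketch. The class and field names (`ItemEvent`, `Activity`, `Placement`, `disposed_items`) and the sample rows are made up for illustration; they are not the layout of my actual sheet. The categories come from the activities and placements listed under Compile.

```python
from dataclasses import dataclass
from enum import Enum


class Activity(Enum):
    PURCHASED = "purchased"
    CONSUMED = "consumed"
    FINISHED = "finished"
    DISPOSED = "disposed"  # thrown away due to expiry or going stale


class Placement(Enum):
    PLATFORM = "platform"  # kept outside on the kitchen platform
    REFRIGERATED = "refrigerated"
    FROZEN = "frozen"


@dataclass(frozen=True)
class ItemEvent:
    """One hand-logged observation about a grocery item (hypothetical layout)."""
    item: str
    activity: Activity
    placement: Placement
    day: str  # free-text date label, e.g. "28 Feb"


def disposed_items(log):
    """Return the names of items that were disposed of at least once."""
    return {event.item for event in log if event.activity is Activity.DISPOSED}


# Made-up rows, only to show the shape of the data.
log = [
    ItemEvent("milk", Activity.PURCHASED, Placement.REFRIGERATED, "28 Feb"),
    ItemEvent("milk", Activity.FINISHED, Placement.REFRIGERATED, "4 Mar"),
    ItemEvent("strawberries", Activity.PURCHASED, Placement.REFRIGERATED, "28 Feb"),
    ItemEvent("strawberries", Activity.DISPOSED, Placement.REFRIGERATED, "7 Mar"),
]

print(disposed_items(log))
```

Keeping one row per event, rather than one row per item, makes it easy to later count purchases or disposals separately.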
I then pulled the Instacart datasets into Tableau to merge some of the sheets. The three sheets I needed from the Instacart dataset were:
1 Products
2 Orders-Products Mapping
3 Orders
5 Configure
The "Products" sheet held the following data:
∙ Product ID
∙ Product Name
∙ Aisle ID
∙ Department ID
The "Orders-Products Mapping" sheet held the following data:
∙ Product ID
∙ Order ID
∙ Reordered
∙ Add to Cart Order
The "Orders" sheet held the following data:
∙ Order DOW (Day of Week)
∙ Order Hour of Day
∙ Order ID
∙ Order Number
∙ User ID
This is the process I conducted to reach my merged dataset:
Step 1
First, I linked "Products" with "Orders-Products Mapping" on Product ID.
Step 2
I then linked "Orders-Products Mapping" and "Orders" on Order ID to bring the Order DOW (Day of Week) into the merged dataset. I ended up with a dataset that held over 900K rows.
Step 3
To match this data with my own purchasing info, I filtered the dataset to show only the products I had purchased. This brought it down to around 728K rows.
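I did the merge in Tableau, but the same three steps can be sketched in plain Python for anyone who wants to follow along without it. This is only a rough sketch under a few assumptions: the column names are snake_case versions of the fields listed above, the links behave like inner joins (rows without a match are dropped), and the function name `merge_and_filter` and the tiny sample rows are made up for illustration.

```python
def merge_and_filter(products, order_products, orders, my_products):
    """Join the three sheets, then keep only the products on my list.

    Assumes inner-join behavior: a mapping row is dropped when its
    product_id or order_id has no match in the other sheets.
    """
    # Step 1: look up product names by Product ID.
    name_by_product = {p["product_id"]: p["product_name"] for p in products}
    # Step 2: look up the day of week by Order ID.
    dow_by_order = {o["order_id"]: o["order_dow"] for o in orders}

    merged = []
    for row in order_products:
        name = name_by_product.get(row["product_id"])
        dow = dow_by_order.get(row["order_id"])
        # Compare against None explicitly, since a day-of-week code of 0 is valid.
        if name is None or dow is None:
            continue
        # Step 3: keep only products I purchased myself.
        if name not in my_products:
            continue
        merged.append(
            {"order_id": row["order_id"], "product_name": name, "order_dow": dow}
        )
    return merged


# Made-up sample rows, only to show the shape of the data.
products = [
    {"product_id": 1, "product_name": "Milk"},
    {"product_id": 2, "product_name": "Bread"},
    {"product_id": 3, "product_name": "Soda"},
]
order_products = [
    {"order_id": 10, "product_id": 1},
    {"order_id": 10, "product_id": 3},
    {"order_id": 11, "product_id": 2},
    {"order_id": 12, "product_id": 1},  # order 12 is missing from `orders`
]
orders = [
    {"order_id": 10, "order_dow": 0},
    {"order_id": 11, "order_dow": 4},
]

merged = merge_and_filter(products, order_products, orders, {"Milk", "Bread"})
print(len(merged))
```

Building the two lookup dictionaries first means each mapping row is processed once, which matters when the mapping sheet has hundreds of thousands of rows.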
6 Visualize & Observe
This bar chart shows the number of Instacart purchases per day of the week for each product on my shopping list. As I only bought groceries on a Sunday, a Monday, and a Thursday within the span of two weeks, I compared Instacart's purchases only for those days.
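The numbers behind a chart like this are simply counts of rows per product and day. Here is a minimal Python sketch of that count, assuming each merged row carries a `product_name` and a numeric `order_dow`. The function name `purchases_per_day`, the sample rows, and the day codes in `my_days` are placeholders; check which number stands for which weekday in the dataset before relying on them.

```python
from collections import Counter


def purchases_per_day(rows, days):
    """Count purchases per (product, day of week), limited to the given days."""
    counts = Counter()
    for row in rows:
        if row["order_dow"] in days:
            counts[(row["product_name"], row["order_dow"])] += 1
    return counts


# Made-up merged rows, only to show the shape of the data.
rows = [
    {"product_name": "Milk", "order_dow": 0},
    {"product_name": "Milk", "order_dow": 0},
    {"product_name": "Milk", "order_dow": 3},
    {"product_name": "Bread", "order_dow": 4},
]

# Placeholder codes for the days I shopped on (hypothetical values).
my_days = {0, 1, 4}

counts = purchases_per_day(rows, my_days)
for (product, dow), n in sorted(counts.items()):
    print(product, dow, n)
```

Each key in `counts` corresponds to one bar: a product on a given day, with the count as its height.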
Here were my findings --
🥕
Thanks for reading, here's a carrot.