Challenge #8: Aggregate Consumer Purchases
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Aggregate Consumer Purchases:
For this week’s exercise we will look at customer purchase behavior to decide if we should offer a “Meal Deal” that would add a side and drink to a purchase of pizza or a burger. The incoming data is larger than usual for these exercises so I have packaged the workflow as an Alteryx Package. The link to the solution for last challenge #7 is HERE.
This week’s Objective:
In order to decide if we should start including a new "Meal Deal" on our menu we want to study the potential impact on recent transactions. Please identify the number and percentage of orders since July 1, 2013 which include the following categories of food: Pizza OR Burger along with a Side and Drink.
Summary of Data:
Point of Sale data includes the ticket level information, and the lookup table categorizes items into higher level food categories.
Hint:
Don't forget to join to the lookup table and filter by date.
As always we look forward to your feedback and suggestions!
UPDATE 01/18/2016:
The solution has been uploaded.
UPDATE 12/28/2016:
The challenge, text and solution have been updated.
- Labels:
- Basic
- Core
- Data Analysis
- Join
- Preparation
- Transform
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks for playing along. The solution has been uploaded to the article.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks for sharing the solution, but in this case I'm wondering if it actually fits with the instructions of the exercise. Those ask for orders that have a burger or pizza and a salad. The way I understand the solution however identifies those orders that have a burger or pizza and a side. Am I missing something?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
You are corect, the workfolw does not match the text. The Objective should read:
We are thinking of including a new "Meal Deal" on our menu and want to study the potential impact on recent transactions. Identify the number and percentage of tickets since June 2013 which include the following categories of food: Pizza OR Burger along with a Side and Drink.
Great catch, thanks for letting us know!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi,
Maybe a subtle language point, but I think the "since June 2013" text is misleading - I see the solution filters from July onwards. I took that to mean from the start of June.
Either way, another good exercise, so thanks for that.
Regards,
Tom
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I've added a slightly different approach for anyone who is interested - if I may say so I think it's slightly cleaner. You don't need the parsing tools in this workflow, but I think in a real-world scenario your data is unlikely to be as clean as the example, so would then be useful.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I perhaps made a long-form solution without the use of a flag table. Basically I broke the tables apart with filters and used the Unions to help remove the Tickets without the food types we needed. A basic count of all tickets vs the tickets that remained after the unions made the formula possible.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I think my approach is similar to others:
I then applied a filter to remove the irrelevant food types.
Now the two streams contained the relevant records, I joined them on the description.
I then applied a "Title" field, which was just text for use in pivoting the data. I sorted by ticket ID and used the "Crosstab" tool with the text field as the column title. This put all food types on the order into a single cell per ticket. I then filtered out the records that included the appropriate combinations, and created a count of the Ticket ID field:
I also created a count of tickets on the primary stream. This gave me the total number of tickets since June 2013.
I then simply joined the streams and used a formula to calculate the percentage:
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Very similar solution to @TomWelgemoed
- Join to the lookup data
- Summarize by food type
- Transpose to get this into columns (one ticket ID with a count of each food type)
- Clean up nulls
- Simple formula that evaluates to 1 if it hits the deal criteria; and 0 otherwise - DealCount
- Summarize to count the tickets, and get an overall Sum of the DealCount column
- Simple formula for percentage
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator