Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Aggregate data by year for each store

jleslie
5 - Atom

Hi,

 

I have searched this site for a while and still can't find the solution to my problem. I was hoping someone here could point me to the right tool/formula. As a toy example, I am trying to aggregate the number of boxes bought in year and the sum of the cost of boxes purchased in this year by store. I have a row of the past say 20 boxes bought (although this will be increased), specifically the year in which the box was bought and its cost.

 

My general data is set up like this:

 

StoreDate Box 1Date Box 2Date Box 3...Cost Box 1Cost Box 2Cost Box3
A201720172016 1.001.500.50
B2009   0.10  
C201520142014 1.201.001.00
...       

 

The goal is to aggregate by store and year:

 

StoreYearNumber of boxesTotal Cost 
A201722.50 
 201610.50 
 ...   
 2009   
B201700 
 ...   
 200910.1 
...............

 

I would really appreciate someone's push in the right direction. The challenge with doing this in excel is that there are many stores and many boxes. Please let me know if I need to provide any more information. Group by company collapses the years unfortunately. I tried concatenating the year and store and then using group by, but then the count of boxes bought and costs sum across all years :/

 

Kind regards.

5 REPLIES 5
YK_dup_378
7 - Meteor

Hello,

 

I have come up with a workflow that let's you do what you require i.e. present a count of boxes (number of boxes) and total cost of boxes for each Store-Year combo.

 

Please note that I have assumed:

 

(1) Your overall data will have same number of total columns for Date Box # and Cost Box #.

(2) I have also assumed that each Date Box # will match one to one to a Cost Box # e.g. Date Box 1 will only match Cost Box 1 and so on.

(3) A blank/null in a Date Box # implies no boxes for that year and hence Cost Box # will also be a blank/null (for summary purposes that entire record gets removed).

 

The way I approached the problem is as below (and please see attached workflow):

 

(1) Split the original data into just Date Box records and just Cost Box records using Select tool (please note that it was easy to select/deselect fields for this smaller sample data but if you have 20 or more fields this may be a bit of a chore).

(2) Transpose each split data from columns to rows.

(3) Filter out any blanks/nulls (see point 3).

(4) Assign each split data unique record numbers (this is to help join the split data again in subsequent step)

(5) Join the split data using a Join tool.

(6) Summarize data by grouping by Store and then Year. Then you can count number of years (to get number of boxes) and do a sum for total cost.

 

I am not sure if my approach is the best way for handling a large data set with lots of DATE BOX and COST BOX fields - both in terms of workflow efficiency and setting up the Select tools (as I mention in point 1)...  So I hope some Alteryx veterans will also please respond to this post and guide us....

 

Thanks,

 

Yusuf

 

YK_dup_378
7 - Meteor

Hi,

 

I re-read your post and I believe you would like to show all possible years for each store even if a particular year had 0 boxes. As per your example, you would like to present the results as below:

 

StoreYearNumber of boxesTotal Cost 
A201722.50 
 201610.50 
 ...   
 2009   
B201700 
 ...   
 200910.1 
............

 

I have made some adjustments to my original work flow to allow for above presentation and also remove the need for Select Tools (no need to manually select required fields). Please see attached workflow.

 

Please also note that I have used the minimum and maximum year for entire data to come up with year range that you would like to present results for  - the workflow calculates this range dynamically based on data. You can of course also set these as global constants.

 

Thanks,

 

Yusuf

YK_dup_378
7 - Meteor

Sorry here is the correct workflow....

jleslie
5 - Atom

Hi YK,

 

Thank you so much for your help on this! I'm sure your solution is what I was trying to accomplish given your explanation of the steps. It seems I need to update my alteryx to run the workflow, so I will have to do that tomorrow and accept the solution and give credit then too. I hope that is fine with you and thanks again for all the help :)

 

Kind regards.

YK_dup_378
7 - Meteor

Hi, no problem! I think the summary is given in ascending order of years. Let me know if you need this to be in descending order of years (latest year to earliest year) - you can include a sort tool and reorder. Cheers, Yusuf

Labels