Hi,
I have searched this site for a while and still can't find the solution to my problem. I was hoping someone here could point me to the right tool/formula. As a toy example, I am trying to aggregate the number of boxes bought in year and the sum of the cost of boxes purchased in this year by store. I have a row of the past say 20 boxes bought (although this will be increased), specifically the year in which the box was bought and its cost.
My general data is set up like this:
Store | Date Box 1 | Date Box 2 | Date Box 3 | ... | Cost Box 1 | Cost Box 2 | Cost Box3 |
A | 2017 | 2017 | 2016 | 1.00 | 1.50 | 0.50 | |
B | 2009 | 0.10 | |||||
C | 2015 | 2014 | 2014 | 1.20 | 1.00 | 1.00 | |
... |
The goal is to aggregate by store and year:
Store | Year | Number of boxes | Total Cost | |
A | 2017 | 2 | 2.50 | |
2016 | 1 | 0.50 | ||
... | ||||
2009 | ||||
B | 2017 | 0 | 0 | |
... | ||||
2009 | 1 | 0.1 | ||
... | ... | ... | ... | ... |
I would really appreciate someone's push in the right direction. The challenge with doing this in excel is that there are many stores and many boxes. Please let me know if I need to provide any more information. Group by company collapses the years unfortunately. I tried concatenating the year and store and then using group by, but then the count of boxes bought and costs sum across all years :/
Kind regards.
Solved! Go to Solution.
Hello,
I have come up with a workflow that let's you do what you require i.e. present a count of boxes (number of boxes) and total cost of boxes for each Store-Year combo.
Please note that I have assumed:
(1) Your overall data will have same number of total columns for Date Box # and Cost Box #.
(2) I have also assumed that each Date Box # will match one to one to a Cost Box # e.g. Date Box 1 will only match Cost Box 1 and so on.
(3) A blank/null in a Date Box # implies no boxes for that year and hence Cost Box # will also be a blank/null (for summary purposes that entire record gets removed).
The way I approached the problem is as below (and please see attached workflow):
(1) Split the original data into just Date Box records and just Cost Box records using Select tool (please note that it was easy to select/deselect fields for this smaller sample data but if you have 20 or more fields this may be a bit of a chore).
(2) Transpose each split data from columns to rows.
(3) Filter out any blanks/nulls (see point 3).
(4) Assign each split data unique record numbers (this is to help join the split data again in subsequent step)
(5) Join the split data using a Join tool.
(6) Summarize data by grouping by Store and then Year. Then you can count number of years (to get number of boxes) and do a sum for total cost.
I am not sure if my approach is the best way for handling a large data set with lots of DATE BOX and COST BOX fields - both in terms of workflow efficiency and setting up the Select tools (as I mention in point 1)... So I hope some Alteryx veterans will also please respond to this post and guide us....
Thanks,
Yusuf
Hi,
I re-read your post and I believe you would like to show all possible years for each store even if a particular year had 0 boxes. As per your example, you would like to present the results as below:
Store | Year | Number of boxes | Total Cost | |
A | 2017 | 2 | 2.50 | |
2016 | 1 | 0.50 | ||
... | ||||
2009 | ||||
B | 2017 | 0 | 0 | |
... | ||||
2009 | 1 | 0.1 | ||
... | ... | ... | ... |
I have made some adjustments to my original work flow to allow for above presentation and also remove the need for Select Tools (no need to manually select required fields). Please see attached workflow.
Please also note that I have used the minimum and maximum year for entire data to come up with year range that you would like to present results for - the workflow calculates this range dynamically based on data. You can of course also set these as global constants.
Thanks,
Yusuf
Hi YK,
Thank you so much for your help on this! I'm sure your solution is what I was trying to accomplish given your explanation of the steps. It seems I need to update my alteryx to run the workflow, so I will have to do that tomorrow and accept the solution and give credit then too. I hope that is fine with you and thanks again for all the help :)
Kind regards.
Hi, no problem! I think the summary is given in ascending order of years. Let me know if you need this to be in descending order of years (latest year to earliest year) - you can include a sort tool and reorder. Cheers, Yusuf