I have about 20 different input files and my goal is to get them into one output file/tab with a column inserted to summarize. I have used an arrange tool to grab the two columns for each input however, I am running into a problem when trying to join the input files. These files have the same layout, but some have more data than others. I tried making a basic example below:
File 1:
Item | Amount |
Bread | 1 |
Milk | 2 |
Cheese | 1 |
Eggs | 1 |
Total | 5 |
File 2:
Item | Amount |
Mustard | 3 |
Meat | 2 |
Bread | 2 |
Mayo | 2 |
Lettuce | 1 |
Cheese | 3 |
Total | 13 |
File 3:
Item | Amount |
Milk | 1 |
Eggs | 4 |
Cereal | 3 |
Waffles | 2 |
Fruit | 5 |
Total | 15 |
I would want to put it together as
Item | Summary | File 1 | File 2 | File 3 |
Mustard | 3 | 0 | 3 | 0 |
Bread | 3 | 1 | 2 | 0 |
Milk | 3 | 2 | 0 | 1 |
Eggs | 5 | 1 | 0 | 4 |
Ceral | 3 | 0 | 0 | 3 |
Cheese | 4 | 1 | 3 | 0 |
Fruit | 5 | 0 | 0 | 5 |
Meat | 2 | 0 | 2 | 0 |
Lettuce | 1 | 0 | 1 | 0 |
Mayo | 2 | 0 | 2 | 0 |
Waffles | 2 | 0 | 0 | 2 |
Total | 33 | 5 | 13 | 15 |
I created a table that included all of the items in the column 1 for each input file and then used join multiple by specific fields to try this on my inputs to join each file to the "item" column, but Alteryx returns an error that 236,459,580,739,200,000 records were generated by the join and only 20 are displayed. However, I can't do join multiple based on record position because some files have a slightly different list of items than others and the amounts go into the wrong rows. Am I missing something in my multiple join by field that is causing so many possible records?
As @binuacs pointed out - this is a Union - not a join. If you want to do this dynamically with tons of files - you'd use a batch macro which assigns the filename to a new column...
Thanks to you both. I believe I've made my example simpler than what I need to accomplish. Using the three FakeCo files below, how could I get to the FakeCo Summary file? The summary piece I am on board with, but post union I am unsure how to use cross-tab to generate all of the different items that should go under deductible, perpetual differences, and short-term differences under one column and then values for each of the Co's. Would cross-tab still be the appropriate tool here? I am assuming I would need to use multiple?