Hello!
I have a several inputs that all have roughly the same format except with a different amount of rows. The format is a list of items followed by an amount of items, but some files have more items than other. For example, File 1 might have
Apples 5
Orange 3
Bananas 6
and File 2 might just have
Apples 3
Bananas 10
I want to do a union and crosstab to get a result similar to this:
Fruit FIle 1 File 2
Apples 5 3
Orange 3
Bananas 6 10
I don't have a problem getting to this point but I'm having some trouble preserving the item order. I tried to do record ID and group by RecordID in the cross tab tool but that doesn't work because each file has a different amount of rows therefore Apples might not have the same RecordID in each file causing the output to have a row for each instance of Apple rather than showing up in one row. Any help would be greatly appreciated, thanks!
Solved! Go to Solution.
Thanks for this workflow. However, I see that the output is still in alphabetical order - Apples, Bananas, then Oranges.
I was hoping to preserve the order in File 1 such that it displays Apples, Oranges, then Bananas.
Here's the quick way to do it:
Read in data that has fields in the desired order. Sample 1 record. Sample and skip first 1 records.
Now bring that "format" into a UNION and make it the 1st data read (checkbox at bottom of config). Now bring in the real data. Configure by name and you should get the data in whichever order you specified with #1.
Cheers,
Mark
Could you elaborate a little more? Sorry, I'm a beginner.
I'm not quite following how sampling will help or the order of tools in my workflow. Right now, the union output is formatted correctly however I'm still having trouble with that order carrying over after using the cross tab tool to split it back into two columns.
Thank you!
Thank you so much, this is amazing! This perfectly addressed my problem and your workflow and video were really helpful for a beginner like me. You really went above and beyond in adding file N because that scenario does apply in my case. Although my fruit example was simple, I'm actually unioning 11 files so there are bound to be new "fruits" that I would have struggled with, if not for your help.
I will definitely subscribe. I also have a follow-up question. Instead of setting new fruit at the bottom, is there a way to summarize the fruit order within all the input files to cut out having to manually sort the new fruit at the end? This would be assuming each input file follows the same non-alphabetical fruit ordering system. For example, Apples always comes before Oranges which always comes before Bananas and so forth. Essentially, I'm trying to create a "master file" of fruit groupings from separate files which all have the same ordering but may not individually contain all the fruits.
I thought the best way to do this would be to start from the union output which has one long list of all fruits:
Apples
Oranges
Bananas
Apples
Bananas
Apples
Guava
Ideally, I would want some way to remove duplicates such as 'if Apples is already above, delete this duplicate Apples row'. However, I know that the Unique tool or Summarize tool would automatically re-alphabetize so that wouldn't work. I also tried using Tile tool set on unique value and thought it would tile each instance of Apple to be 1 but that was not the case - it just numbered down the list from 1 to N.
Thanks in advance for any ideas you have to help.
Through the use of a UNIQUE tool, you can take the fruit listing and it will output the 1st occurrence of each fruit. If the incoming data is in order, then the unique fruit output is in the right order. Use that and you're there.
Cheers,
Mark