Hi there!
I am trying to "arrange" a huge dataset with over 2,000 fields. My goal is to essentially transpose the dataset to its "flattest' format - only a few columns: ID, VarName, VarValue, Crosscut Layer 1, Crosscut Layer 2, Crosscut Layer 3 (max number of crosscut layers is 3), but a lot of rows.
Theoretically this could be done using the "arrange" tool, but I'm having problems because there are just too many fields and it is painful to scroll through all of them and manually select what I need for Key Fields and Output Fields. Not to mention manually selecting the fields for each of the crosscut layers. I was wondering if there is a way to automate the "arrange" process, via either a Macro or some other tool?
Thanks in advance for your help!
Solved! Go to Solution.
Not sure I fully understand this but one way to approach the problem is to break the calculations into multiple parallel streams and then join the resulting values of each stream.
This will help me better understand your exercise:
1. Will the number of rows in your resulting table have ( 2,000+ times the number of source table rows ) in the source table?
2. Will each of the Crosscut layer values (1,2 and 3) be the same for each unique combination of ID and VarName?
3. Will the values for VarName be equal to each of your 2,000+ field names (except for ID) ?
Hi Derek - Thanks for your response. Sorry that my question wasn't clear. See my answers below -
1. Will the number of rows in your resulting table have ( 2,000+ times the number of source table rows ) in the source table?
Yes, that's exactly right
2. Will each of the Crosscut layer values (1,2 and 3) be the same for each unique combination of ID and VarName?
No. Crosscut layer is dependent on varName. Some variables have crosscuts, some don't. I have a crosswalk that logs what crosscuts exist for each varName.
3. Will the values for VarName be equal to each of your 2,000+ field names (except for ID) ?
Yes. So basically each ID will have a corresponding varValue for each of the 2,000+ varName.
I have assumed that your crosscut values are fixed for each unique varName and then created a small prototype where the source table has 200 fields (not 2,000) and 50 of the varName values have related crosscut values. You will have to see how well this scales up. I have attached the workflow and the source file (XLSX).