I am trying to read multiple excel files into Alteryx, but came across an issue with the column order being inconsistent across files. Is there a way to correct that while reading the files using a macro? Or can anyone think of a solution once I have brought the files into Alteryx? Below I've provided an example of what I'm looking at, as well as my desired output. I have attached the excel file as well.
your schema is different in your second record. probably f1 is being read in as a different format - or there are phantom columns. if column names are the same - swap out union on position to union on name in interface designer in your batch macro.
These excel files are put together by different individuals so some start on different columns, but they contain the same reports. There are no headers unfortunately so the union on name doesn't work. Any other ideas?
If you can logically differentiate specific formats of the records, then you could use a Filter to split the data into multiple streams, use a Select to set the field names and data types, then Union the streams back together. In your example the logic for the Filter would be IsEmpty([Field1]) and you'd have two streams.
if you attach a datacleanse tool in macro you can set it up to drop entirely null columns. You would then union by position. All of this assumes that your sheets are basically schematically consistent. My hunch is that may not be the case.
The only tricky part is logically understanding the differences that can occur and determine how many unique situations you need to split the flow into. Then the cleanup becomes simple. I put together a sample where there are three different record formats.