Hello Community members,
I am working on how to load and standardize 100 Source input files that have inconsistent headers, schemas, sheet names, and file formats, but contain the same underlying data, into one unified table in Alteryx.
I have added #3 sample Dataset for Working purposes.
I would appreciate it if I could get any help from you guys!!
Hi @AdminYogesh
Here is a way of doing it, utilizing Batch Macros to read in the files - split out into one for xlsx files and one for csv - this runs on the assumption that all the csv will be the same, and all the xlsx will be the same, but you can adjust accordingly.
Have annotated the tools accordingly.
Hi @davidskaife, but the field name should be the same in the final output. If you can see that 3 different files have different field names, and that should be dynamic. Change: this will be a common field
(Employee ID
First Name
Last Name
Department
Salary
Start Date )
Hi @AdminYogesh
In that case simply rename the columns to your desired headers in the final Select tool
Yes, but the problem with the field's Structure will be inconsistent in every file.
> there will be Inconsistent headers, schemas, sheet names, and file formats, but contain the same underlying data.
Hi @AdminYogesh
On the files you provided initially the columns were not out of order...
Anyway, I have updated both Macros to handle out of order columns, by updating the column names based on some matching criteria within an IF Statement (i.e if header contains 'First' then its renamed First_Name, if it contains 'Dep' then its renamed Department etc etc). The Union is now auto configured by name (as they should all be the same) and columns are re-aligned to the correct order.
Note: This is based on the files you have provided. If there are still other issues then please provide additional files