Hi Alteryx engineer,
I have several files in a directory that i need to read the data out and combine into one dataset, so I used directory tool and dynamic input to implement it. all working if the files are all with data, while if there is a file only with column but no data in it, it will throw “Has a different schema than the 1st file in the set” issue, how can i solve it?
Hey @187, you'll need to handle this with a batch macro. If I remember correctly, a fully null field will cause the field to default to a double data type so if you're trying to wildcard input/dynamically input a bunch of files, the schema will change if the others files' fields aren't all doubles. To get around this, you can force all fields into a string before outputting from the batch macro. I've attached a copy that I built a while ago for someone else - you'll just need to change the Control Parameter to update the entire filepath instead, and set all of the potentially problematic fields to a forced string (as I currently have for [Code]) in the macro.
More info on inputting files with varying schema etc here: https://community.alteryx.com/t5/Alteryx-Designer-Desktop-Knowledge-Base/The-Ultimate-Input-Data-Flo...
Hi @187 ,
I tried to re-create the error on my end.
The error did not occur when the input data is .csv files.
But it occurred when the input data is .xlsx files and one of the input files contains no record but the header row.
I could not figure out how to avoid the error with the setting of getting the table schema from .xlsx file.
But I found the error did not occur when "First Row Contains Data" in the configuration of Input Data tool was set to ON.
In this case, the column header of each file is treated as a data row.
So you need to rename the column and delete unnecessary header rows in some way as below.
Output
I am not sure if this is the best solution, but at least it works as a workaround.
I hope this may help.
Hi @187
Indeed reading multiple Excel files is a lot of fun, and as @DataNath mentioned you will need a macro, then Googling about I found this community thread/article load multiple excel files from a folder; if you scroll it till kcgreen reply, he posted a couple of links, one to this Gallery macro Read_All_Excel_Files; this macro is very versatile, it support all 4 possible excel files, the community entry reference these blogs wirtten by CameronS (retired Altery employee), It is a good reading and worth reading it.
I downloaded the macro from the gallery, and tested it locally with no dramas, it worked, then I tried to copy the macro to a test folder with broke the macro as it has dependencies with 3 more macros, and two supporting files plus two Excel files XLS.xls and XLSX.xlsx (read the blog entries, part 1 and part 2 for more details on this macro).
My tests involved 5 Excel files, four of them with columns A, B, and C, their data being some sheers with Alpha data, others with Numeric, another one with no data, and the 5th one with 4 columns, the file "BookNothing.xlsx" was the only one it did not read because it contained no data.
This is a picture of the results I got
Have fun,
Arnaldo