Hi,
I want to import multiple excel file( to one data input tool) with different schema and union it?
I have come across multiple post which describes the method somehow I am facing issue implementing it.
Here for example I have include few file.
Solved! Go to Solution.
Now you are asking the questions! 🙂
1-.You should always go with the fullpath, unless yo are replacing a specific part of the given path on the macro, that is up to you.
2-.The file specification and directory depends on your needs, if it is going always to read from the same place there is no need to change, otherwise you would need to specify it.
3.-You can in deed read different files using the same macro, but it would need to specify the file format. Attached you will find the macro I used for my example. You will see there are numbers assigned for each file format. You can get those numbers by adding an action tool to the input, there, under File, you will see a field called "@FileFormat - value= " this will give you the number for the file format.
Think @afv2688 explained this perfectly. Hopefully that covers it!
Hi @afv2688 ,
Thank you for your detail explanations. I have few queries would like to further clarify with you.
1.I'm a bit lost on this part of the explanations below and can you share the screenshot on how can i get the numbers under file?
You can get those numbers by adding an action tool to the input, there, under File, you will see a field called "@FileFormat - value= "
2. I noticed that a lot of macro workflow , they will use fullpath formula as shown in the screenshot below :May i know what is the purpose of we doing at this step ? Could you briefly explain on the circled part as per screenshot below as i not sure how to read this formula and i plan to edit it if necessary. Why we use ///Sheet1 in this case and how we going to configure it if necessary?
3.May i know why we need to use find and replace tool here in this case? What are the things that we need to replace?
4. Normally how you decide at which step you need to contact to the macro. If there is no formula tool and find & replace in this case, will the macro still can function by just directly connected to the input data?
Hello @SH_94 ,
Regarding the questions:
1.- Yo can get the numbers on by adding an action tool and checking the file format like this
2.- What I am adding to the paths on the excel files is the sheed within the excel book. Yo need to specify it to tell the workflow which sheet it needs to read, there is no default on which it will read the sheet even if there is just one.
3.- With the find replace what I am doing is adding the column with the file format. There are many ways to do it, I just used this one.
4.- The more information you dinamically add to the macro the more versatile it will be. The first one I shared had a fixed file format and could only read csv files, this one can now switch between csv and excel files but only when the excel files have the sheetname "Sheet1". We could add a dynamic input to get the sheetnames automatically for example and do much more to make it more versatile.
It depends basically on your needs and what you want to achieve.
Regards
Hi @afv2688 ,
Thanks again for the detail explanations. I have additional queries which would like to clarify with you as below:
1. Would like to confirm my understanding on the two control parameters below, is it one for the xlsx and another one for csv? Let said if we have additional xlsb , it mean we need to create three control parameters? Means that one control parameter can only read one format?
2. Noted on this and can i confirm that we would need to build the following formula if we have both csv and xlsx file? May i know how about if we have multiple sheet in the excel, can we choose all or specific tabs to be imported? Is it all just need to be adjust under this formula tools? Could you share with me how will the formula look like if we want to choose specific tab or all the tab within the excel?
3 Could you shared with me how is the workflow look like if you are adding the dynamic input tools?
Thank you
Hi @SH_94 ,
This is where it gets a little subjective, but personally, I would prefer to split the top workflow (after the directory tool) to execute different macros as soon as the file type (csv/Excel/ other) starts to differ. Not required, but perhaps easier to differentiate.
With Excels that have multiple sheets (where you most likely don't know the name), your workflow needs to first read the list of Excels and then read (into another macro) the list of all the sheets in that Excel. Sounds tricky, but you're really just doing the same thing, but prior to the 2nd macro your only reading the sheets of the Excel that you're processing. Then you pass that list of sheet names into the 2nd macro and reading them one by one.
The one thing you do start to wonder is - if all these files have different schemas - perhaps you don't want to combine them all in one output? But that's for you to decide of course.
Regards,
Tom
Hi @TomWelgemoed ,
Thank you for the detail explanations.
1.Personally i also feel it it more easier to differentiate if we split it starting from Directory tools.Could you share with me the workflow which have multiple macro for me to refer as i want to see how you build the workflow and connect it with each other.
2.Do you have workflow which read the list of excels and then read the list of all the sheets in that excel? Could you share with me if you have it the workflow?
3. Yup, i also thinking on why we need to combine them if it is different schema. Can we adjust the position of the specific row to match with other row which contains similar type of information? May i know do you have the workflow tool as well?
Normally how you will decide whether you want to combine or not if there is different schema? Would like to hear your opinion on this
Hi @SH_94 ,
Looking at your point 3: I think we're circling back perhaps to the original question... why did you want to combine all the files? I think we ( @afv2688 & I) assumed that the data is quite consistent and needs to be combined, but perhaps there is more here.
Maybe before we go into solutions, do you want to restate your requirement/objective? You have all these files - what is the end goal here?
Best,
Tom
Hi @TomWelgemoed ,
Yes sure.
Perhaps let me explain in the way presented below:
Type of file availables - Different type of files with different number of tabs within the excel files. They also consist of different name of header. But all the file contain certain information, for instance invoice amount. However, different name of header may used by different header. For instance, some ppl put it as Invoice Amount, some may put it as Inv Amount , or some may put total balance.
My Objective- My objective is i plan to combine all the data in alteryx and can we adjust different position of column (for specific row) to the same position of column ? So that we can use it for the calculation purpose.
Hello @SH_94 ,
1.- That's not like it, one control parameter adds the file name while the other specifies the file type. If you would like to add xlsb too you would just need to add the file format number on the text input on the find replace tool, thats it.
2.- To add multiple sheets you would need to change the solution like the one I just added
Regards