Alteryx Designer

Definitive answers from Designer experts.

Read in Multiple Excel Files, with Multiple Tabs that have Different Schemas

Alteryx
Alteryx
Created

One of the best things about Alteryx is the ability to read in multiple files very easily and automatically combine them into a single dataset. This becomes a bit trickier when dealing with files that have different schemas or Excel files with multiple tabs. Adding both multiple excel files with multiple tabs, and having the schema change within each tab takes it to another level.

 

If your tabs have the same Schema, the article you want to read is here.

 

The way to accomplish the task if the tabs have (or may have) different schemas (field names change depending on sheet) is to use nested batch macros. I've attached a sample workflow built in 11.0 that demonstrates the process.

 main workflow.png

 

In the main workflow pictured above, the Directory Tool input pulls in the file paths of all of the XLSX files in the directory you're pointing to (note you may need to redirect this tool in the sample to a directory on your machine).

 Macro1.png

 

Most of the magic happens in the macro pictured above. This macro takes the FullPath field and updates the main Input Data Tool to read the first file in the list from the Directory Input. It is configured to read the list of sheet names within that first file, but also to output the Full Path from the Input Data Tool. A new field is formatted in the formula for an acceptable full file path for an Excel file, including the desired sheet name. This final file path is passed into the 2nd macro as the control parameter.

 

Macro2.png

The second macro is very simple. It takes the file path received from the first macro, updates the Input Data Tool, reads in that file and then passes it back to the first macro. It repeats this process once for each sheet in each of the files being passed from the Directory Input Tool.

 

Each batch macro holds the data until each batch is completed and combines it all into one large data set.

 

Note: the sample was created in 11.0 - it will not open in earlier versions.

Attachments
Comments
6 - Meteoroid

Hello, 

Is it possible to add a record ID tool into the macro so that each row of data outputted as an id? I'm not sure where it could be added...after the union?

Thanks

6 - Meteoroid

Is it also possible to adjust the macro so it also outputs a field for Fullpath? This column is excluded in the output. Thanks

Alteryx Partner

@Jamesrich

 

You need to explain how you would like the numbering to work. If you simply want the numbering from 1-N without starting over, you can put a record ID outside of the macro. If you want the numbering to start at 1 with each new sheet, you need to put the Record ID tool inside the "Test2" macro immediately before the macro "t output". 

 

As for the file name, you need to edit the "Test2" macro - see the attachment. 


BenALT.PNG

6 - Meteoroid

Hi Ben, thanks very much for your reply. I would like the Record ID within the macro. Is it possible to have a record id for each row within each sheet being read in? Minus the first row of each sheet which is the header, so record id 1 starts on row two of the sheet. The numbering would then start over from 1 on the second sheet being read in. 

Thanks, James 

Alteryx Partner

@Jamesrich 

 

Yes, you just need to put the record ID in like the picture in "Test2".  Make sure Box 6 is unchecked.

Capture.PNG

5 - Atom

Hi, 

 

Can someone please help with this. 

 

For some reason, i cannot get past this, I have followed the steps outlined, including the formula and selected the full (V_String) however the macros is coming up as not being able to find the sheets in the file.

What is it that i am doing wrong.

 

Thank you

 

chaniel_1-1573931803932.png

7 - Meteor

Hii @AndrewL, thanks very much for this topic. 

 

I have a question: what if my excel files / tabs don't have the columns?

 

For example, file / tab 1 has 3  columns and file / tab 2 has 4 columns, including the 3 columns of file / tab 1.   

 

Thanks !

8 - Asteroid
My version is 2019.2.10.64688. Seems not working for this. I can unzip the attached yxzp file into 3 documents and open one by one. Then have no idea of how it works. My case actually is not so complicated now requiring multiple sheets from each file. I just need to combine each single sheet from multiple excel files. It just happed in one of the files, some fields automatically changed from v-Wstring to v-string and double to date, making the Dynamic input tool malfunctioning.
8 - Asteroid

Hi Andrew,

 

this is exactly what I need. However, you talk of three macros above and I see only one attachment. I am looking for the macro that unions the from all the sheets of a single file and adds a column against each line item with the sheet name (or full file path name).

 

Thanks

Deepa

5 - Atom

@AndrewL How do you export the combined excel sheets? 

5 - Atom

Hi everyone and thanks @AndrewL !

 

I've been attempting to recreate this with just the part that reads in the multiple sheets, as I have only one file with multiple sheets to read in.

However, for some reason that remains obscure to me, it only reads in the first line of the first sheet and then it stops.

 

I must be doing something wrong, but fail to see what... any help would be most appreciated!

 

I've attached my files with this post...

 

Thanks!

5 - Atom

 

Hi everyone and thanks so much to @AndrewL !
I am quite new in the community, quite new in alteryx at all. Instead I loved these topic and the solution described. Also the interaction and collaboration that you evidence.
Please accept my apologies if my question is a little basic, but I want to know if it is possible to add a these two tools inside batch macro. Possible is, but I would like to know if them work. I couldn’t get what I need, that was first to deletes some rows and then make a transposition.
Thanks in advance.
Bye

 

 

 

D_Alteryx.PNG

5 - Atom

@rlopmor Yes this should be possible, but you'll need to look on how you want to accomplish "deleting rows". Are these rows that are always present in all your input files? otherwise the output may get stuck in the process...

5 - Atom
Thanks pieternobels!!
I will verify this. As I told you I am new in this area and maybe I made
a mistake in the way I am training to solve the problem.

I have 20 Excels files that contain information that is not needed. I need
just a range from everyone, particularly data allocated in one row that I
need to get into one table where each column must be allocated as a row.

FILE 1:
XXXXXXX
XXXXXX
Nota 1 Nota 2 Nota 3
1 8 10

Pais1 Pais2 Dont matter this
information
Ale FRa

FILE 2:
XXXXX
XXXXX
XXXXX
Nota 5 Nota 7 Nota 8
7 4 5

Need
Nota Nro. Nota
Nota1 1
Nota2 8
Nota3 10
Nota5 7
Nota7 4
Nota8 5

I thought that all of them have the same structure so I was trying to get
the range from each one, then clean the rows with no relevant data and then
transpose the table. but from your advice I noticed that one of them has
one more row above the one I need. I couldn 't realize if this could be
the problem or simply it is not the best way to solve it.



--