Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

Read in Multiple Excel Files, with Multiple Tabs that have Different Schemas

AndrewL
Alteryx
Alteryx
Created

One of the best things about Alteryx is the ability to read in multiple files very easily and automatically combine them into a single dataset. This becomes a bit trickier when dealing with files that have different schemas or Excel files with multiple tabs. Adding both multiple excel files with multiple tabs, and having the schema change within each tab takes it to another level.

If your tabs have the same Schema, the article you want to read is here.

The way to accomplish the task if the tabs have (or may have) different schemas (field names change depending on sheet) is to use nested batch macros. I've attached a sample workflow built in 11.0 that demonstrates the process.

main workflow.png

In the main workflow pictured above, the Directory Tool input pulls in the file paths of all of the XLSX files in the directory you're pointing to (note you may need to redirect this tool in the sample to a directory on your machine).

Macro1.png

Most of the magic happens in the macro pictured above. This macro takes the FullPath field and updates the main Input Data Tool to read the first file in the list from the Directory Input. It is configured to read the list of sheet names within that first file, but also to output the Full Path from the Input Data Tool. A new field is formatted in the formula for an acceptable full file path for an Excel file, including the desired sheet name. This final file path is passed into the 2nd macro as the control parameter.

Macro2.png

The second macro is very simple. It takes the file path received from the first macro, updates the Input Data Tool, reads in that file and then passes it back to the first macro. It repeats this process once for each sheet in each of the files being passed from the Directory Input Tool.

Each batch macro holds the data until each batch is completed and combines it all into one large data set.

Note: the sample was created in 11.0 - it will not open in earlier versions.

Attachments
Comments
Jamesrich
7 - Meteor

Hello, 

Is it possible to add a record ID tool into the macro so that each row of data outputted as an id? I'm not sure where it could be added...after the union?

Thanks

Jamesrich
7 - Meteor

Is it also possible to adjust the macro so it also outputs a field for Fullpath? This column is excluded in the output. Thanks

bb213
8 - Asteroid

@Jamesrich

 

You need to explain how you would like the numbering to work. If you simply want the numbering from 1-N without starting over, you can put a record ID outside of the macro. If you want the numbering to start at 1 with each new sheet, you need to put the Record ID tool inside the "Test2" macro immediately before the macro "t output". 

 

As for the file name, you need to edit the "Test2" macro - see the attachment. 


BenALT.PNG

Jamesrich
7 - Meteor

Hi Ben, thanks very much for your reply. I would like the Record ID within the macro. Is it possible to have a record id for each row within each sheet being read in? Minus the first row of each sheet which is the header, so record id 1 starts on row two of the sheet. The numbering would then start over from 1 on the second sheet being read in. 

Thanks, James 

bb213
8 - Asteroid

@Jamesrich 

 

Yes, you just need to put the record ID in like the picture in "Test2".  Make sure Box 6 is unchecked.

Capture.PNG

chaniel
5 - Atom

Hi, 

 

Can someone please help with this. 

 

For some reason, i cannot get past this, I have followed the steps outlined, including the formula and selected the full (V_String) however the macros is coming up as not being able to find the sheets in the file.

What is it that i am doing wrong.

 

Thank you

 

chaniel_1-1573931803932.png

nicolasdeldalle
7 - Meteor

Hii @AndrewL, thanks very much for this topic. 

 

I have a question: what if my excel files / tabs don't have the columns?

 

For example, file / tab 1 has 3  columns and file / tab 2 has 4 columns, including the 3 columns of file / tab 1.   

 

Thanks !

JokeFun
8 - Asteroid
My version is 2019.2.10.64688. Seems not working for this. I can unzip the attached yxzp file into 3 documents and open one by one. Then have no idea of how it works. My case actually is not so complicated now requiring multiple sheets from each file. I just need to combine each single sheet from multiple excel files. It just happed in one of the files, some fields automatically changed from v-Wstring to v-string and double to date, making the Dynamic input tool malfunctioning.
deepaprash
8 - Asteroid

Hi Andrew,

 

this is exactly what I need. However, you talk of three macros above and I see only one attachment. I am looking for the macro that unions the from all the sheets of a single file and adds a column against each line item with the sheet name (or full file path name).

 

Thanks

Deepa

yasmeenjohnson
5 - Atom

@AndrewL How do you export the combined excel sheets? 

pieternobels
5 - Atom

Hi everyone and thanks @AndrewL !

 

I've been attempting to recreate this with just the part that reads in the multiple sheets, as I have only one file with multiple sheets to read in.

However, for some reason that remains obscure to me, it only reads in the first line of the first sheet and then it stops.

 

I must be doing something wrong, but fail to see what... any help would be most appreciated!

 

I've attached my files with this post...

 

Thanks!

rlopmor
5 - Atom

 

Hi everyone and thanks so much to @AndrewL !
I am quite new in the community, quite new in alteryx at all. Instead I loved these topic and the solution described. Also the interaction and collaboration that you evidence.
Please accept my apologies if my question is a little basic, but I want to know if it is possible to add a these two tools inside batch macro. Possible is, but I would like to know if them work. I couldn’t get what I need, that was first to deletes some rows and then make a transposition.
Thanks in advance.
Bye

 

 

 

D_Alteryx.PNG

pieternobels
5 - Atom

@rlopmor Yes this should be possible, but you'll need to look on how you want to accomplish "deleting rows". Are these rows that are always present in all your input files? otherwise the output may get stuck in the process...

rlopmor
5 - Atom
Thanks pieternobels!!
I will verify this. As I told you I am new in this area and maybe I made
a mistake in the way I am training to solve the problem.

I have 20 Excels files that contain information that is not needed. I need
just a range from everyone, particularly data allocated in one row that I
need to get into one table where each column must be allocated as a row.

FILE 1:
XXXXXXX
XXXXXX
Nota 1 Nota 2 Nota 3
1 8 10

Pais1 Pais2 Dont matter this
information
Ale FRa

FILE 2:
XXXXX
XXXXX
XXXXX
Nota 5 Nota 7 Nota 8
7 4 5

Need
Nota Nro. Nota
Nota1 1
Nota2 8
Nota3 10
Nota5 7
Nota7 4
Nota8 5

I thought that all of them have the same structure so I was trying to get
the range from each one, then clean the rows with no relevant data and then
transpose the table. but from your advice I noticed that one of them has
one more row above the one I need. I couldn 't realize if this could be
the problem or simply it is not the best way to solve it.



--
PCAM
7 - Meteor

Beautiful! 54,000 + lines in 4.4 seconds. Thank you!

SarahSaei754
5 - Atom

Hello there,

 

This is probably a rudimentary question but how does one use the output in test.yxmc? I've been at this for hours 😞 

 

suby
11 - Bolide

hi Andrew,

 

 

his is great, thank you Andrew. Is there a way to select the specific sheets you want with this workflow? 

 

In my scenario i have 2 files file 1 and file 2 ( both file1 and file 2 has two tabs) - which works perfectly fine.

 

Say if i have 3 tabs on file 1 and 2 tabs on file 2 and i want import only two tabs from file and file 2 what should i do ?

skotian1289
8 - Asteroid

Hi All, This Macro is super helpful and solves 90% of my problem, however I have small question which might sound stupid. Is it possible to add a column as sheet Name which will tell us which data is extracted from which specific sheet name. because I could all my sheets values are merged however I cannot distinguish between the sheets. 

 

Thanks

SK

IvanMtz
6 - Meteoroid

Awesome Work keep it up!

jdkatz1
8 - Asteroid

I have followed each step and my workflow will only bring in the first excel. I map one of the excels to macro 1 and 2, and the main workflow shows all of my 13 excels coming in, all .xlsm. I need 1 commonly named sheet from each of them. Can anyone help me troubleshoot?

jdkatz1
8 - Asteroid

I have multiple excels and i want to compile data from 1 tab from each. Its investment names and some other info. I mapped 1 of them as the example as mentioned in the original posting however the workflow only brings in the data that matches that excel exactly. (if the input excel has 5 investments, it will only pull in those 5 for all the other excels even if they have more investments.) I have mapped everything according to the pictures and i have no idea how to troubleshoot this. Been working on this for a while, so would appreciate if someone could provide a solution. 

Bakiyaraja
7 - Meteor

The above solution is reading data from multiple Multiple excel (different schema )and writing output in to single sheet

but our requirement is to read and write output in different sheet.(Eg. output filename should have sheet1,sheet2,sheet3 so on)  

 

INPUT: 

2 excel sheets with different schema

INPUT 1 FROM EXCEL 1:

Bakiyaraja_0-1657872575880.png

 

INPUT 2 FROM EXCEL 2:

Bakiyaraja_1-1657872706700.png

 

 

output expectation:

one excel with 2 sheets

Bakiyaraja_2-1657872858672.png

 

Bakiyaraja_3-1657872876456.png

could you please help with workflow/solution please

 

 

SurbhiMittal
7 - Meteor

Hi Andrew,

Thank you for sharing this workflow.

It's really helpful.I just changed the directory path and it worked.

 

Can you please share with me the workflows for both the macros created by you?

Also,I would like to know that how have you updated the full path in control parameter?

 

Being a beginner in Alteryx,i would like to learn more and more about the formulas,macros etc.

 

 

Can you also provide us with the corporate training for our organization?

 

WorkflowMan
5 - Atom

This example qualifies for my "First Shot" prize .. worked perfectly the first time I tried it. Thank you, Thank you!