Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.
Don't forget to submit your entry for the Excellence Awards by October 30! | Need more information about the program? Check out the blog here

Import multiple files and remove first row from each

Highlighted
5 - Atom

Hello,

 

I have multiple excel files that I would like to input to my workflow.  They all have the same format.

 

Normally, this is straightforward for me.  I can use a wildcard in my Input tool to grab all of the files and go.

 

In this case, however, I need to strip out the first row of each file before using the second row as a header. 

 

It appears that the row issue is causing problems.  When I run my workflow (just an input tool), I get a warning saying that the second file I am trying to import "has a different schema than the 1st file in the set and will be skipped."  Same goes for all subsequent files.

 

I think I'm overlooking something, but I'm not sure what it is.  Is there an easy way to strip out the first row of each file when using input on multiple files?

 

Thanks!

Highlighted
Alteryx
Alteryx

@lunchtime have you tried using the macro? it will bring in all the excel files in the folder. 

 

https://gallery.alteryx.com/#!app/Read_All_Excel_Files/58dd51b3a18e9e18fca64172

Digan
Alteryx
Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

Hi @lunchtime ,

 

Yes, there is!

You can use the option to start loading data from a specific row as image below.

fmvizcaino_0-1576536737514.png

 

Best,

Fernando V.

Highlighted
5 - Atom

Hi @fmvizcaino,

 

Thanks for the suggestion.  I did try starting the data import on row 2, but had the same problem with different schemas.

 

Here's the odd thing: I get the schema error when using a wildcard in my Input Data (eg, importing "workbook*.xlsx").  But if I use a separate Input Data for each file (Input 1 is "workbook1.xlsx" and Input 2i is "workbook2.xlsx") and then Union them together, I have no problems at all.  I could probably solve my particular problem by using Input Data ~100 times, but the whole reason behind getting Alteryx was to automate that sort of repetitive task.

 

I wish I had a better understanding of what the schema error really means.  I've looked over my files and I'm certain they're all formatted the same.  I haven't been able to find much good information about what Alteryx is looking at when it determines that the schema is not the same.

 

I've seen other posts in Community by users with similar issues, but nothing that seems to resolve my issue.

 

Very frustrating.

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

@lunchtime ,

 

I experienced that once and didn't find the true cause of the issue.

One thing you can do and I guarantee it will work is to use a macro to get the files one by one and unite them automatically. I know it doesn't explain the error you are having but it is a great workaround that you will use in other workflows for sure!

 

I'm sharing with you a solution where I used a macro to open and concatenate all excel tabs based in its names. You could use a directory tool to get file names if it is not excel you are using.

 

Let me know if that works for you.

Best,

Fernando Vizcaino

Highlighted
5 - Atom

@fmvizcaino,

 

Thanks for the response.  What you described -- using Input Data >70 times via brute force -- is ultimately what I wound up doing.  It's ugly.

 

Further reading suggests that the data typing issue I mentioned in my earlier post is what is going on.  I have several files that have columns full of nulls.

 

Elsewhere I had seen the macro that you shared, but I don't really understand how it work or what to do with it.

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

Hi @lunchtime ,

 

Let me know if you want a quick explanation about the macro I shared with you.

Best,

Fernando Vizcaino

Labels