This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
on 05-09-2016 08:27 AM - edited 4 weeks ago by kleinpm7
"I have a bajillion files that I need to bring into my workflow... wayyyyyy too many for me to use individual Input Data tools! What's the best way to do this?"
- every Alteryx user ever
Bringing in multiple files at once can sometimes seem like a daunting task, but is often a necessary piece in developing those wonderful time-saving workflows. Here's a handy flowchart to help you get those tricky files in without a hitch!
1. Input Data with wildcard
Most of the time you just need an Input Data tool to get your data in, which is great, because it's by far the easiest and fastest method here.
How to do it:
So in our flowchart above, this connection string will look in the directory C:\Users\akoszycki\Desktop\Input Data\Data to Input and then pick up every file that starts with "CO Store File - " and ends with ".xlsx".
2. Dynamic Input with file paths
Sometimes the universe conspires against you and you can't use a wildcard. Why? Well maybe it's impossible to have all of your files located in the same directory, or maybe there are a ton of files in the directory that you don't want a wildcard to pick up. In this case you'll need to use a list of filepaths to feed a Dynamic Input.
How to do it:
Working with tables:
Some file types require a table or sheet specification, such as Microsoft Excel or Access database files. If the sheet or table name is always going to be the same, you won't have a problem with the Dynamic Input tool. However, if your files have different tables or sheets that you're pulling from, you may get an error like...
If this happens, you'll have to add the table/sheet to the end of the file path in the form ‹file path›|||‹sheet/table›. For example, in the flowchart above I'm specifying "Sheet1" as the worksheet to import from the Excel files by using a simple formula:[FullPath]+"|||Sheet1$".
3. Batch Macro with file paths
No matter what you do, it seems you always get these pesky messages that keep your files from coming in. This is when we bring out the big guns - Batch Macros. Though this last method is the most work-intensive initially, let's remember to keep in mind what my good buddy Confucius always says...
The batch macro will run a workflow one time for each file, updating the Input Data tool each time. It would generally look like this:
The Batch Macro will automatically read in files which have the same field schema without errors. But what happens if your field schema is different from one file to another?
This is where the batch macro really comes into its own! Navigate to the Interface Designer (View > Interface Designer) output mode , and either use ‘Auto Configure by Name (Wait until All iterations Run)’ or ‘Auto Configure by Position (Wait Until All Iterations Run)’. 'Auto Configure by Name' will union the fields with the same names, and 'Auto Configure by Position' will union field 1 from file 1 on top of field 1 from file 2 and so on for every file.
Once you have configured the Interface Designer you can save the macro and insert it into your workflow (Right click on the canvas Insert > Macro...). There is an example workflow attached which includes dummy files and the Batch Macro to see how the process works.
That should do it for inputting your data! Comment below or reach out to Customer Support with any questions!
That you didn't understand the question implies that you are not taking that important step. Follow the link in my post above to learn more, but in a nutshell, use Options > Export Workflow then make sure the macro is checked. Specify the location you want to save to, and click Save. Then send your colleague the .yxzp. The macro (ending with .yxmc) will be packaged with the rest of the workflow, and when your colleague gets it, opening the package will prompt import of the workflow.
I am having trouble getting Batch Input to work. the output is multiple instances of the first file.
I am trying to input 31 files. The first file has 30 records. After running the macro. I am getting 930 records all from the first file.
It appears that the batch macro is not updating with the information from the other 30 files.
Does anyone has experience with using the Batch Input Macro from this post.
Does anyone have a batch macro that can input multiple macro enabled Excel files (.xlsm) from one 1 folder?
thanks in advance.
This is fantastic, but what if the Excel files contain multiple worksheets? Didn't seem to find a clear answer in the above responses.
That is a good question. If you are looking for the same sheet name in every file, you can address that pretty easily with a Formula tool as shown in the original article's orange section. If they are different sheet names in every file, then you'll need to use some conditional logic in the Formula tool to make sure that the |||SheetName$ gets appended to the correct path and filename.
Instead of using the .xlsx suffix, make it .xlsm. You'll want to make that change in the Input Data tool within the macro as well as any Directory tool you use.
If you have the same file opening multiple times, check the following:
Thanks @LisaL. I think the formula option will work, but I only see a screenshot in the original post, so I'm not able to see what that formula is. Do you know what it is?
How you configure the workflow will totally depend on your use case. The Virtual Solution Center has an option for "I need help building something."
SUPER practical and TO THE POINT!
Thank you for this article!
Thanks for sharing this; I'm sure it will come in handy!
I am trying to import data from three multiple macro enabled Excel files. My settings are on par with the first use case (the files are into the same directory, and the structure - column's names and position - is the same across all of them).
But I am facing an issue when using wildcard to bring in data from my files at once. It outputs only data from the first file.
Below are my Workflow, the Input tool configuration and the files name.
Need help please. Thank you in advance.
This was explained so well, thank you!
This is very helpful. Thank you for your help.
Can anyone help me explaining how to implement the second flow chart
Importing the files from multiple directory
I know this is old thread, but looking at this example given specifically method 3. I have downloaded the sample and made a change in one of the file ( just changing the column order ( county), I was expecting the same amount of columns as the batch macro is supposed to work based on the column name when combining different files with same columns with different order. But I am seeing new column has been created with the same name for the changed the spread sheet and when it parses the value for county, it is adding under that column. The bottom line is that if two work sheets has 12 columns and third work sheet has just different order, still creating new column by the same name . So in total I see 13 columns in the output. Is this normal behavior ? if yes, what is the benefit of setting interdface designer property ?
I am new to this tool but liked the examples give to try out. Any thoughts, highly appreciated
Thank you, this is great!
Is the video explanation of this content available?
'The journey of a thousand files starts with a single batch macro.' - Indeed, this is my favourite quote by Confucius.
thankyou for the wonderful Q's.
I auto configured by name but its still saying my set will be skipped
This looks like a problem with how you configured the Action tool. It appears that you are still loading all the files with the asterisk rather than specifying each one from the Control Parameter. Can you show the configuration of the Action tool?
thanks for sharing
It's really Informative
I have 2 input files that I get by connecting to a directory. Anything with .txt extension and date stamp of today matches my criteria. I'm trying to solve for the use case when I have more than 1 text input file. I want to run them iteratively into the dynamic input tool. How can I accomplish this?
Hi! Thanks for the detailed solution but when I run the macro for a directory, if there are 21 files in a directory it runs only the file attached to the original flow 21 times. I am attaching the macro flow screenshot for reference. ￼
Just a recommended update to this - for the dynamic input tool, the field names do not have to be in the same order (Alteryx will stack the fields with the same names regardless of if they are in the same order). If the fields have different field names then Alteryx will stack the fields positionally. Refer to this training video: Connecting to Multiple Sheets at Once - Alteryx Community.
clear and concise
I am using the 3 solution since one file has a different schema, but at me the workflow finds 5 input files. when It enters the macro it reads 5 times the first file. What did I do wrong?