This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
on 05-09-201608:27 AM- edited
2 weeks ago
"I have a bajillion files that I need to bring into my workflow... wayyyyyy too many for me to use individual Input Data tools! What's the best way to do this?"
- every Alteryx user ever
Bringing in multiple files at once can sometimes seem like a daunting task, but is often a necessary piece in developing those wonderful time saving workflows. Here's a handy flowchart to help you get those tricky files in without a hitch!
1. Input Data with wildcard
Most of the time you just need an Input Data tool to get your data in, which is great, because it's by far the easiest and fastest method here.
You need all of your files to be in the same directory.
All files must be the same type (ex. csv, xlsx, yxdb)
How to do it:
Configure an Input Data tool to bring in one of the files in your directory.
In the connection string, replace the filename (or a portion of the filename) with an asterisk ( * ). The asterisk is a wildcard that essentially takes the place of anything.
So in our flowchart above, this connection string will look in the directory C:\Users\akoszycki\Desktop\Input Data\Data to Input and then pick up every file that starts with "CO Store File - " and ends with ".xlsx".
All of the data from these files is stacked on top of each other, essentially the same way as the Union tool stacks data when it's using the "Auto Config by Name" option.
If you want to see the file from which the data comes from, use the "Output File Name as Field" option in your Input Data tool configuration.
2. Dynamic Input with file paths
Sometimes the universe conspires against you and you can't use a wildcard. Why? Well maybe it's impossible to have all of your files located in the same directory, or maybe there are a ton of files in the directory that you don't want a wildcard to pick up. In this case you'll need to use a list of filepaths to feed a Dynamic Input.
A list of filepaths to import.
A file to populate the "Input Data Source Template" configuration of the Dynamic Input tool. This file will always have to be available for the tool to base it's field schema on, and can be any of the files you wish to input. It *will not* be input if it is missing from the aforementioned file list.
How to do it:
Configure the Input Data Source Template in the Dynamic Input just as if it were a regular Input Data tool.
This tool includes an option to "Read a List of Data Sources" that will update the connection string based on your list of filenames.
Working with tables:
Some file types require a table or sheet specification, such as Microsoft Excel or Access database files. If the sheet or table name is always going to be the same, you won't have a problem with the Dynamic Input tool. However, if your files have different tables or sheets that you're pulling from, you may get an error like...
If this happens, you'll have to add the table/sheet to the end of the file path in the form <file path>|||<sheet/table>. For example, in the flowchart above I'm specifying "Sheet1" as the worksheet to import in the Excel files by using a simple formula: [FullPath]+"|||Sheet1$".
3. Batch Macro with file paths
No matter what you do, it seems you always get these pesky messages that keep your files from coming in. This is when we bring out the big guns - Batch Macros. Though this last method is technically the most work-intensive, let's remember to keep in mind what my good buddy Confucius always says...
The batch macro will run a workflow one time for each file, updating the Input Data tool each time. It would generally look like this:
The Batch Macro will automatically read in files which have the same field schema without errors. But what happens if your field schema is different from one file to another?
This is where the batch macro really comes into its own! Navigate to theInterface Designer (View>>Interface Designer) output mode , and either use ‘Auto Configure by Name (Wait until All iterations Run)’ or ‘Auto Configure by Position (Wait Until All Iterations Run)’. 'Auto Configure by Name' will union the fields with the same names, and 'Auto Configure by Position' will union field 1 from file 1 on top of field 1 from file 2 for every file.
Once you have configured the Interface Designer you can save the macro and insert it into your workflow (Right click on the canvas>>insert>>Macros). There is an example workflow attached which includes dummy files and the Batch Macro to see how the process works.
***This workflow was built in Alteryx Designer 10.5.
That should do it for inputting your data! Comment below or reach out to Customer Support with any questions!