Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

How to run multiple files through one workflow and result in multiple outputs

Matt_Hancock
7 - Meteor

Hi,

 

I have a workflow that creates 25 separate files, one for each distribution center.  I created an additional workflow that takes one of the files for one distribution center, applies a series of formulas to add additional data to the file, and then output the file in excel.

 

I need to dynamically input all 25 files, run each through the workflow, and then create a separate output for each of the 25 distribution center files.  I may have found a solution to the dynamic input, but I haven't tried it yet and I don't have a solution for the dynamic output.

 

Any thoughts?

 

Thank you for any help you can provide.

 

Matt

11 REPLIES 11
NicholasM
Alteryx Alumni (Retired)

Hey @Matt_Hancock 

 

I believe you maybe running into a scenario where the sheets in the excel files are named differently. Sometimes when you use the wildcard symbol (*) to pull in multiple .xlsx files, all the sheets have to be named similarly as well. 

 

If you want to get around this. I would suggest taking a look at this macro, available on our gallery, which assists with pulling in all excel files. More in depth explanation of how the macro works is located here. 

fharper
12 - Quasar

The wildcard is just a pattern match technique pure and simple.  It doesn't care about file type, only the string that is the file name and extension.  It is essentially the same as doing a "DIR" from the command line with a wildcard, like "dir myfi*.csv", if you know command line or powershell this makes sense. 

The extent that it works for you is dependent on how much pattern you include.

Consider this list of files:

testfile.yxdb

ecm_testfile1.csv

ecm_testfile2.csv

ecm_testfile3.csv

testfile.csv

Alteryx_Log_1583557998_1.log

Alteryx_Log_1583557998_2.log

Alteryx_Log_1583561608_1.log

Alteryx_Log_2020-03-09.zip

  • if you specify testfile*.* then you get 2 files testfile.yxdb & testfile.csv.  Because there is no * at the beginning of the pattern it does not include the 3 other files starting with "ecm"
  • if you specify testfile*.csv then you get 1 file testfile.csv
  • if you specify *testfile*.* then you get 5 files where the filename contains "testfile"
  • if you specify *.log then you get 3 files where the file extension is .log

It is simple pattern matching against a directory list.

You could use a directory tool to get a list of files in a folder, you could pre-filter in the Directory tool using a wild card to get only csv's files using *.csv.  Then further filter the file list in the resulting stream with a filter tool, for example filter off files older than a certain create date. then feed that file list into a dynamic input tool to read.  An input tool, on its own, can only filter by wildcard pattern.

 

In either case the files you read need to match schema and if reading excel the sheet names must also match or you get schema issues and it skips the file that don't match the first file's schema.

 

Inconsistent schema can come from multiple sources.  CSVs are all string so data type isn't really an issue, just column names and number of columns must match.  Excel files can have multiple sheets and often we get sent excel files with the same "file" name but different sheet names. name variances will cause schema problems.  Another source that can be independent of file type, xlsx, csv, yxdb, is the actual field definitions.  If column names change, even a spelling error, or the data type changes, etc.  In excel assumptions are made by excel as to the data types and lengths assigned and Alteryx generally accepts those assignments so if one file was assigned a number data type due to the type of data it contained in that file yet another file assigned the same field/column as string due to alphanumeric data being present in that column of that specific file then you have inconsistent schema's.

 

This can also throw you off when you read a file of parameters into a Dynamic Input and use String Replacement in the SQL.  If the length of the string changes in one of the variables used as a source for the string value then it will cause an inconsistent schema error also.  Especially true when substituting a field value from the input as a string constant in the SQL, if the substituted value changes from one iteration to the next in the dynamic Input this causes inconsistent schema.  In this scenario Alteryx needs the input string to be the exact same size in each iteration.  I used a formula to pad the string with blanks to get around this.

 

So it may be that the file names may vary due to "Tab" names for excel based files but for any file it may be the perceived data types coming into the Dynamic Input changed from one file to another or finally if using SQL string Replacement the variable values must have the same length.

 

Also if someone gets another tab in as the "sheet1" or default sheet in Excel's mind then alteryx may try to read it instead of the correct tab, if you don't explicitly identify the tab/sheet to read.  If you get excel files where the sheet name varies on a base like MonthEnd 202002, then next month its MonthEnd 202003 you can read in the the list of sheets and use logic to choose the most recent or what ever makes sense then feed that fully constructed name into an Input tool in a macro or into a Dynamic input tool. 

 

If you must read files of various schema you can use an Input tool within a batch macro and use a parameter and action tool and macro setting to feed file names into the macro and standardize the output by column name.  Do a search on that to find one of the several articles on that topic. A macro of that type essentially becomes a Dynamic Input tool relative to flat files (csv, xlsx, etc.)

 

Labels