Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

The Ultimate Input Data Flowchart

AlexKo
Alteryx Alumni (Retired)
Created

"I have a bajillion files that I need to bring into my workflow... wayyyyyy too many for me to use individual Input Data tools! What's the best way to do this?"

- every Alteryx user ever

 

Bringing in multiple files at once can sometimes seem like a daunting task, but is often a necessary piece in developing those wonderful time-saving workflows. Here's a handy flowchart to help you get those tricky files in without a hitch!

 

flowchart.PNG

 

1. Input Data with wildcard

 

Most of the time you just need an Input Data tool to get your data in, which is great, because it's by far the easiest and fastest method here.

 

kiss.PNG

 

Requirements:

  • You need all of your files to be in the same directory.
  • All files must be the same type (e.g. csv, xlsx, yxdb)
  • All files must have the same schema.  (The same field names must be in the same order and have the same data types.)

How to do it:

  1. Configure an Input Data tool to bring in one of the files in your directory.
  2. In the connection string, replace the filename (or a portion of the filename) with an asterisk (*). The asterisk is a wildcard that essentially takes the place of anything.

input1.JPG

So in our flowchart above, this connection string will look in the directory C:\Users\akoszycki\Desktop\Input Data\Data to Input and then pick up every file that starts with "CO Store File - " and ends with ".xlsx".

 

Results:

  • All of the data from these files is stacked on top of each other.
  • If you want to see which file the data comes from, use the "Output File Name as Field" option in your Input Data tool configuration.

file name.PNG

 

macros2.jpg

 

2. Dynamic Input with file paths

 

Sometimes the universe conspires against you and you can't use a wildcard. Why? Well maybe it's impossible to have all of your files located in the same directory, or maybe there are a ton of files in the directory that you don't want a wildcard to pick up. In this case you'll need to use a list of filepaths to feed a Dynamic Input.

 

Requirements:

  • All files must have the same schema.  (The same field names must be in the same order and have the same data types.)
  • A list of filepaths to import. fullpths.PNG
  • A file to populate the "Input Data Source Template" configuration of the Dynamic Input tool. This file will always have to be available for the tool to base its field schema on, and can be any of the files you wish to input. It *will not* be input if it is missing from the aforementioned file list.

How to do it:

  1. Configure the Input Data Source Template in the Dynamic Input just as if it were a regular Input Data tool.
  2. This tool includes an option to "Read a List of Data Sources" that will update the connection string based on your list of filenames.

 

macros3.jpg

Working with tables:

Some file types require a table or sheet specification, such as Microsoft Excel or Access database files. If the sheet or table name is always going to be the same, you won't have a problem with the Dynamic Input tool. However, if your files have different tables or sheets that you're pulling from, you may get an error like...

sheetname error.PNG

 If this happens, you'll have to add the table/sheet to the end of the file path in the form ‹file path›|||‹sheet/table›. For example, in the flowchart above I'm specifying "Sheet1" as the worksheet to import from the Excel files by using a simple formula:[FullPath]+"|||Sheet1$".

 

3. Batch Macro with file paths

No matter what you do, it seems you always get these pesky messages that keep your files from coming in. This is when we bring out the big guns - Batch Macros. Though this last method is the most work-intensive initially, let's remember to keep in mind what my good buddy Confucius always says...
 

1000steps.jpg

 

The batch macro will run a workflow one time for each file, updating the Input Data tool each time. It would generally look like this:

 

batch.PNG

 

The Batch Macro will automatically read in files which have the same field schema without errors. But what happens if your field schema is different from one file to another?

 

This is where the batch macro really comes into its own! Navigate to the Interface Designer (View > Interface Designer) output mode idea filter, and either use ‘Auto Configure by Name (Wait until All iterations Run)’ or ‘Auto Configure by Position (Wait Until All Iterations Run)’. 'Auto Configure by Name' will union the fields with the same names, and 'Auto Configure by Position' will union field 1 from file 1 on top of field 1 from file 2 and so on for every file.

 

macros4.jpg

 

Once you have configured the Interface Designer you can save the macro and insert it into your workflow (Right click on the canvas Insert > Macro...). There is an example workflow attached which includes dummy files and the Batch Macro to see how the process works.

 

That should do it for inputting your data! Comment below or reach out to Customer Support with any questions!



null
Attachments
Comments
dataMack
12 - Quasar

This is definitely one of the more common questions.  Thank you for laying out the options in this very clear flowchart - I'm definitely bookmarking it!

serendipitytech
8 - Asteroid

I've been looking at all sorts of posts on this process, and your's definitely has been most helpful. However, I just posted my own question as I'm just not able to get to where I'm trying to go. 

 

I have 5 files that are all different in structure (different columns in each with some ID fields that connect some of the files), so I'm thinking based on your writeup that I need to use the Macro option, however, when I open the Interface Designer, I get no actionable objects, just a greyed out area. I can't seem to find how to activate this area to do anything with, so I'm not sure what step I'm missing. 

 

My end goal is to import a set of 5 files and run a workflow against them. I would need to repeat this same workflow for the same set of 5 files from each customer, but of course, each customer will send the files slightly different - different file names, different formats (xls, txt, csv, etc), different headers (or no headers), different column orders, some include more than the bare minimum we require. I'm looking for the most streamlined way of running the same data analysis workflow against the set of files from each customer. 

 

UPDATE

Not sure what combination I went through, but I got the interface designer panel to operate and created the macro workflow, so I think I'm making progress on that front. 

AlexKo
Alteryx Alumni (Retired)

Hi @serendipitytech! It sound's like you're off to a good start. If you have trouble with this and would like some help, let us know over in Customer Support (support@alteryx.com) and we'll be happy to help.

jishnusyaman
5 - Atom

this was pretty helpful..one question...if i have multiple sheets like Data1, Data2,Data3 and i need to import only Data2 from all files...could be some shortcut i am missing ?

AlexKo
Alteryx Alumni (Retired)

Hey @jishnusyaman - yeah so you could just use option #1 for this.

 

wildcard.PNG

 

 Use the wildcard to read in all the files you need, and just point at the specific sheet in 'Table or Query'

Raghu_s
8 - Asteroid

 Hello, 

 

Am trying to use this workflow to solve a problem where I need to union all text (tab delimited)  files that has different schema. As am very new to this product am not able to understand how to configure Place holder file input in the batch macro setup. 

 

info: 

total files - 11 txt files 

total records - 490 

format 1 - 4 files - 57 columns 

format 2 - 7 files - 56 columns

difference - same schema but one additional column. 

Problem - At any given point we never know when the extra columns will append to which file. 

Solution - we need only identical columns as output. 

 

For example, 

macro 1.PNG

 

The above is the macro i recreated. While using the macro in another WF the loop fails to pick the other files specified in the directory as in the below 

 

macro 2.PNG

 

The total records in the 11 file are 490 but the WF gets out only 176 records post processing. I look for some checkpoints that I need to follow to get all the records as output. 

 

Thanks in advance. looking forward. 

Raghu_s
8 - Asteroid

Any update on on this friends? 

andre347
10 - Fireball
Did you change the Interface Designer properties in the macro? You'd need to set that to 'Auto Configure by Name' - this will put nulls in the columns where the files don't have that column.
Raghu_s
8 - Asteroid

Yes I did, not luck. 

andre347
10 - Fireball
Can you be a bit more specific regarding what's the error message and what is going wrong? From the screenshots you posted above I can see that in the Action tool (2) you haven't told it what to update. So if you click on 'File' (and it's highlighted in blue) then you are going to do your action on the file. At the moment it's only selected input.
Raghu_s
8 - Asteroid

Thanks @andre347

 

Yes, I didn't get any errors as such but warning stating that the files was not readable.

 

I was not sure of which action type to pick to update the file path as there was only limited examples in the Tool Mastery | Action  section that is available in the forum. But after seeing you question I was able to play around and set the action type as - Update Input Action Tool and now its able to read all the files as expected. am getting to know the various usage of the options within each tool. 

 

Thanks mate! 

 

 

 

 

sriniprad08
11 - Bolide

Hi, thank you. I have 20 csv files with different names. How can i create one file out of these 20? can you please help with the workflow? thanks

andre347
10 - Fireball

@sriniprad08 put them in the same folder. Load in 1 file. Remove the filename and replace it with a *. This should load an union all the files. Unless they have different column headers. 

prajgiri
5 - Atom

 Hi All,

 

I have set of text files in a folder with SQL codes in them which are updated whenever there is a need. These text files generate values that needs to be captured in the output tool. New text files get added very often to this folder.

 

Could someone please advise on how to incrementally access each of these files, run the query in dynamic input tool and capture the output and move on to the next file and repeat? 

 

Regards,

Praj

mb1824
9 - Comet

UPDATE - I retried the below process from beginning and it worked. Not sure what I was doing wrong the first time

 

 

 

Hi, 

 

I have got the batch macro idea to work with my own data, however, I want to make the following changes:

  1. Start Data Import on Line: 5  (rather than line 1)

  2. Import only a selection of columns. 

I would like these steps to occur before the data is outputted to my workflow. 

 

For #1, I made the change within Batch Correct.yxmc in the Input Data tool (the placeholder file)

For #2, I added a Select tool after the input data tool but before the Macro output tool. 

 

See below screenshot. 

 

I then saved the Macro and inserted it into my workflow, however neither change worked. 

 

Does anyone have any ideas? Alteryx_batch macro.JPG

 

RJS
7 - Meteor

Can anyone guide me to a solution for my Input data connection issue?

Seems I can not get the program to connect to a .tab file on a common drive.

 

Strangely, It connects fine in access as an excel linked file.

 

Error:

Input Data (2) N:\TechOps\Component Mtc\Operational Support & Planning\Production Control\Complete OSR.tab could not be opened as a MapInfo dataset.

 

Thank you!

jstewart
7 - Meteor

Hello, 

 

First off I am new to Alteryx so there is a lot of room for user error here.

 

I am having problems with a very similar workflow that this thread describes. I migrated about 100 .xlsx files to one folder to simplify the process. They have varying number of tabs but they all have the same one tab of interest with the same name. I am attempting to use the batch macro to combine the one tab from all files in the same folder. Unfortunately the wildcard option does not work seemingly bc I cannot effectively identify the tab name for each file, again it is the same name for all. 

 

Using the batch import option identified above I receive an error, "tab name" does not match a sheet or named range... the Excel file is corrupt, or similar errors depending on configuration. As for configuration I have tried auto configure by position and name (either should be sufficient for the data of interest but not for all data in the tabs), I selected the file path with a wild card for the place holder as I didn't see any other viable option, I have tried update data tool and update value for action config., directory and file path options for the macro itself, and I selected the folder path for the directory input. 

 

I am not certain where each of the unique elements should be defined in Alteryx so any and all help is appreciated. I have done many searches on this topic which has gotten me to this point. Capture.PNG

 

Cheers and thanks for your help. 

 

I'll be off and running with edits once I can access all data in one venue. 

jstewart
7 - Meteor

Follow up on my previous issue posted above. I found the link below to connect the dots for me and am off and running with the batch macro. Overall it was very useful noting where setting are and are not required for all tools used. Ultimately clarifying the action tool settings in the macro and the filter tool settings using the macro is what brought everything home for me.  

 

https://community.alteryx.com/t5/Alteryx-Knowledge-Base/CS-Macro-Dev-Reading-in-multiple-files-with-...

 

 

Cheers

Bobe
7 - Meteor

This doesn't want to work for me, hitting the following snags: 

 

  • If I use the macro as described and set the "Full path" as parameter for the macro input, I get an error message stating that I need to specify a sheet
    • I've specified a sheet name in the input tool within the macro, so this error doesn't make sense 

 

  • If I work around the problem above by using a function to append the sheet name to each file path string, the batch macro only iterates one single time and only returns the results from a single file, stating the macro was "(ended by a downstream tool)"
    • I've set the Output Mode of the macro to "Auto Configure by Name (Wait Until All Iterations Run)" and I can see there are >1 records showing up in the input anchor of the macro  

Can't figure out what I'm overlooking, but any insight would be greatly appreciated. 

BBouri
5 - Atom

Thanks all for sharing. It really helped but I'm not yet there.

 

My objective is to import multiple (different) .csv files into tables in Microsoft SQL Server. Since it concerns different files; all files have different schemas. With your help I can succesfully import each file in a seperate table in SQL server. The issue is however that for each imported table it considers the appended schema of all files. Refer to below example:

 

Files

File A

ID

Text

1

3

2

4


File B

Customer

Name

ABC

ABCDEF

123

123456

 

Output

Table A in SQL

ID

Text

Customer

Name

1

3

 

 

2

4

 

 

 

Table B in SQL

ID

Text

Customer

Name

 

 

ABC

ABCDEF

 

 

123

123456

 

Desired Output

Table A in SQL

ID

Text

1

3

2

4

 

Table B in SQL

Customer

Name

ABC

ABCDEF

123

123456

rdelmund
7 - Meteor

Hi All,

 

First time to use macro on a workflow. Just want to check if we're going to publish the workflow on a Alteryx Co-lab gallery, do I also need to publish the macro file? or I just need to run it on my Alteryx Desktop once and just publish the workflow?

 

 

thanks,
Rob

Ranjith15
7 - Meteor

I have tried all these methods and worked fine if all files contains only one sheet. Is there any way to handle the the files with multiple sheets? 

 

Thanks in Advance

jdemeaux
8 - Asteroid

Can I use a question mark instead of a wildcard, ie "Directory\?.xlsx" instead of "Directory\*.xlsx" to read multiple files?
I believe I read that somewhere but can't find out where.

RodLight
8 - Asteroid

@jdemeaux 

Actually the asterisk and question mark are old DOS based wild cards so you could google on "DOS wildcards" to get a full explanation. The question mark indicates a wildcard for a single character. 

So "Directory\201?.xlsx" will find...

2019.xlsx

2018.xlsx

2017.xlsx

 

but not

2009.xlsx

AIChudnovsky
7 - Meteor

I was having the same issue as @Bobe, where the batch macro I added to a workflow was only displaying results from one file and in my case only one record.  Once I added a tool (e.g. Browse Tool) to the output anchor of the macro, it ran multiple iteration of the macro and displayed all of the results.

offerdan
6 - Meteoroid

Thank you, this is great!

Inactive User
Not applicable

Great stuff, thank you for the content, very helpful!

JokeFun
8 - Asteroid
This is great! I finally got to know how to make macro work and how to read in multiple excel files with different schemas. However, when I used the cross tab tool to summary a report by certain parameters, it cannot group the same items. And I have no idea of how to fix this problem.
Sri9
8 - Asteroid

Does this imports xlsb file ?

srikantap
5 - Atom

real time and interesting pieces of Q&A. helpful. A must read pieces of information.

 

Thanks!

saratsai_99
5 - Atom

Great. This is helpful

Luyanda001
5 - Atom

Hi, I'm having similar issues. My files are not in my Computer, they are in Azure Data Lake in a Folder and updated hourly as JSON. How can I access this folder in Alteryx instead of reading the files individuallly?

sangeethadetne
7 - Meteor

I have downloaded COVID -19 data  which has JSON and CSV files And i am trying to  import the file suing the Macro input but i am not able to do so . Pls let me know if i am doing correct  if not ,pls suggest me the solutions of how to import the different files in Alteryx.

lepome
Alteryx Alumni (Retired)

@BBouri 

I suspect you'd figured this out some time ago, but for the convenience of those who might have the same question, what you need to do  is to include the step writing the table inside the macro.  In that way, you'll have the table match the schema of the specific data file used in that iteration of the macro.

trettelap
8 - Asteroid

Awesome guide!

DPRC
8 - Asteroid

I am having problems with this too and receiving an error message based on the picture below. The error message keeps popping up concerning row 296 of the data.

Annotation 2020-07-28 130848_LI.jpg

lepome
Alteryx Alumni (Retired)

@DPRC 

Without the workflow or data, it's difficult to say what the problem is here, but I would certainly try it without the Data Cleansing tool.  You might also try the Virtual Solution Center if you don't get the help you seek here.

Pamelasdb1215
5 - Atom

Thank you - will be useful!

AnthonyFGCR
5 - Atom

Simply....excellent! as simple as that, love this kind of contributions, makes my feel pride about this discipline. Thank you.

Sara94
7 - Meteor

such a life saver! thanks!!

Xdejesus
5 - Atom

Makes sense!

mjsnyder
7 - Meteor

This will be a useful took moving forward.

RDelPonte
5 - Atom

Thank you!! This was very helpful!

amitagarwal2185
5 - Atom

How do i combine multiple excel files with different schema and each excel file has multiple sheets?

CoreyS
8 - Asteroid

@AlexKo  thanks for sharing! 

Kaustubh17
8 - Asteroid

Wonderfull explanation I understand everything let me try this process to my end thank @AlexKo for simple explanation.

Thanks,

Kaustubh

sriniprad08
11 - Bolide

Hi,

Great macro thanks for building this. One issue i am facing. when i share my workflow with others there is an error in the path of the macro. Every time we have to reconnect it with the downloaded macro. Any reason for this.

sriniprad08_0-1633096003050.png

 

trettelap
8 - Asteroid

@amitagarwal2185 Depending on how many tabs, I would just do the applicable steps form above and then union the results together. Otherwise you would probably have to do a "macro within a macro" so to say to merge the sheets within the macro and then join the files in the original workflow...hopefully this helps!

lepome
Alteryx Alumni (Retired)

@sriniprad08 When you share your workflow with others, are you packaging it first?

sriniprad08
11 - Bolide

@lepome  : Thank you for the reply. Sorry i didn't understand your question.

I generally send the workflow which has the macro to my colleague. But when he opens it he finds the question mark.