Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Read several excel files and loop workflow until all files have been processed

C9272
6 - Meteoroid

Dear All,

 

I would like to achieve the following in Alteryx:

 

  1. Read in several Excel files from a specified folder
  2. Extract the date (e. g. 20210720) from the excel file (date is contained in the file name of each excel file; see simplified example files below) and append this date in a separate column named "reporting period"
  3. Run my "normal workflow" for each single Excel file
  4. Repeat steps 1.-3. until all excel files contained in the source folder have been processed and run through the workflow
  5. Union the results
  6. Write the results back in a single excel table with one sheet containing all the results and delete all previous records contained in the excel table.

 

 

Example Input Files

All input files share the same file structure.

 

File 1_20210720.xlsx

 

Product

Value

Tomato

100

Apple

250

Potato

150

 

File 2_20210615.xlsx

 

Product

Value

Apple

100

Apple

250

Pineapple

150

Apple

100

 

 

I have tried so far to use the "dynamic input tool" as well as the "Alteryx Holiday Gift of 2015 Read all excel worksheets macro" (https://community.alteryx.com/t5/Engine-Works/The-Ultimate-Alteryx-Holiday-gift-of-2015-Read-All-Exc...)  for the above mentioned purpose. Unfortunately these seem to be reading in all excel files first and afterwards run the "normal" workflow (once). This leads to wrong results in my workflow as this e. g. contains several summarize tools in which I would then have to include the "reporting period" as a "group by" criterium. As my workflow has become quite complex in the meanwhile, I would like to avoid this extra effort.

Instead, I would like to loop the workflow until each excel file has been read in, processed and write the results back to a single excel table that gets updated weekly.

 

 

Any help would be greatly appreciated!

 

Thanks in advance

Christopher 

12 REPLIES 12
JagdeeshN
12 - Quasar
12 - Quasar

Hi @C9272 ,

 

Please find attached a sample workflow that should do what you are looking for.

 

The input tool within the workflow is designed to read all excel files in a particular directory (*.xlsx) along with the files names. The date is then extracted from this name before outputting as a combined output to an excel file.

 

This will still run the workflow just once. But because the 'reporting period' is introduced as a column in the beginning itself, your summarize tools should still be able to group by the reporting period.

 

If this does not work for you, one way of running the workflow once for each files is to encapsulate the logic as a batch macro, with a directory tool used to pass one file name at a time.

 

Best,

Jagdeesh

C9272
6 - Meteoroid

Hi @jagdeeshn @atcodedog05,

 

Thank you very much for your input. This is really appreciated. 

I already tried your approach @jagdeeshn before. I did not follow up on this approach further because this would have meant that I had to "scan" my whole workflow for the occurence of "sort", "summarize" etc. tools which show different results, depending on the workflow being run with "only one file at once" compared to "several files unioned". To avoid this extra effort I was looking for a way to keep my workflow that currently only handles one file at once "as it is" and only repeat the workflow automatically for each input file separately until all files have been processed. 

 

Attached please find one of many attempts to solve the problem. The example was provided by another user in this forum and I tried to adapt it to my needs. 

So far I have unfortunately not managed to make the batch input run through all files. It keeps displaying only the values of the first file I selected in the macro options. I hope, that the workflow helps to further investigate my problem. 

 

Kind regards

Christopher

KilianL
Alteryx Alumni (Retired)

Hi @C9272 ,

 

I had a look at your example batch macro and it looks fine. I did minor changes, see attached if it works for you.

 

Please mark this as the solution if it answers your question, it will help others to find solutions quicker.

 

Kind Regards,
Kilian
Solutions Engineer - Alteryx

C9272
6 - Meteoroid

Hi Kilian, 

 

Thank you very much for your support. 🙂 This solved my problem!

I knew that there must have been something wrong in my workflow. 😁

 

Kind regards

Christopher 

C9272
6 - Meteoroid

Dear Community,

 

I just came across another question while working on my workflow:

How can I integrate a "nested" batch macro in my workflow, i. e. what do I have to do if I have got a workflow that relies on the input of more than one datasource during execution?

Let's say I have got a batch macro that reads in all the data for the sales of tomatoes (can be easily achieved by Kilians solution) and now I would like to enrich my workflow with additional information on the current market development which is saved in another data source.

For example, I would have five files for reporting periods January to May, containing the sales of tomatoes and five files for the reporting periods January to May containing the current market development. How can I achieve, that the five files for the sales of the tomatoes (from one datasource) as well as the five files regarding the current market development (from another datasource) are processed and all files for tomatoes and the market development that share the same period are being "merged"?

 

The processing order should be as follows:

 

  1. First file for tomatoes is read in
  2. First file for tomatoes is processed in the workflow
  3. First file for market development is read in and being processed
  4. First file for market development is joined/ unioned (depending on the use case) with first file for tomatoes and is further processed
  5. Second file for tomatoes is read in
  6. And so on…
  7. All results are unioned and written in one file

 

In the end I would like to end up with one file, just like in my first example with the only difference being that I need to introduce a second data source at a later step during the workflow that should be run as a batch macro as well.

The principle, that every "reporting period" is treated separately should thereby be kept.

 

Many thanks in advance and kind regards

Christopher

KilianL
Alteryx Alumni (Retired)

Hi @C9272 ,

 

You can add a second input anchor to the batch macro to load the second file.

Inside the batch macro, you can simply copy-paste the part before the output - this will add a second input anchor to the macro you can map to the second input (you might want to adjust the input data tool in the macro).

 

Now inside the macro, it will load both files at the same time, and in there you can merge both before the macro output.

 

Another way would be to use a separate batch macro for your second dataset and merge both afterwards.

 

Kind Regards,

Kilian

C9272
6 - Meteoroid

Dear @KilianL, 

 

Thanks for your reply. I have taken the "Batch Example" File you provided before and just copy-paste the part before the Macro Output as you proposed but I still ended up with only one macro input in my main workflow. 

Would it perhaps be possible to create a simple "mock-up" of a macro with two input anchors with dummy data so that I can figure out what I might have done wrong? 

 

Many thanks in advance and kind regards

Christopher 

KilianL
Alteryx Alumni (Retired)

Hi @C9272 ,

 

sorry I mixed up Macro Input tools with the Control Parameter tool. You only get one Input Anchor for the control parameter for the batch macro. But you can assign a different column to the second 'control parameter' tool.

Not sure if this is easier than using a separate batch macro, but it solves it in the way you described it before.

 

You can see an example attached. Inside the macro I join both datasets by record position, this can be adjusted or extended inside the macro as needed.

 

Hope it makes sense.

 

-Kilian

Labels
Top Solution Authors