Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Comparing three Excel files - multiple tabs

datamonkey
8 - Asteroid

Hi,

 

I have a process I am trying to create in Alteryx and I am hoping someone can help me with the last piece I am stuck on.

 

Requirements:

1) Take three spreadsheets with multiple tabs (each tab has a different schema, but each file has the same set of tabs) and compare the tab A from file A, file B, file C;

2) output the differences (in a very verbose format)

3) output a merged version of the tab according to a set of rules about which file's changes take priority.

4) Repeat 1-3 for every other tab on files A, B, C

 

I have already done this with just a single tab and steps 1-3 work fine. I just need to abstract this process to handle every tab, allow selection of the input files at runtime, etc.

 

Following the tutorial at Input-Multiple-Excel-tabs-at-once I have moved my main comparison workflow into a batch macro, and created a new workflow to pass in a list of file paths with tab names included. The problem is that my comparison workflow expects three inputs, but the macro can only have one; it seems like I need to pass in a list of all file paths for all three files and try to separate them out again inside the macro..?

 

Is this the only way, and if so how is the best way to do it?

 

(note that steps 2 and 3 are inflexible in that the business requirement is quite specific regarding the output format and content)

6 REPLIES 6
JBLove
10 - Fireball

I've done something similar by using the Macro Input tool to pass in the file path details then you could use two filters to diverge the workflow into three parts one for Tab A, Tab B and Tab C.  Each part would utilize a dynamic input tool and whatever tools you used to do your compare and output the verbose differences.

datamonkey
8 - Asteroid

Actually another issue is that I really need to pass in three file paths per iteration of the macro - one to file A tab A, one to file B tab A, and one to file C tab A. At the moment my macro only sees e.g. file A tab A so it has nothing to compare to. Not sure if it should be a standard macro instead? I'm always a bit hazy on when to use one vs the other...

DavidP
17 - Castor
17 - Castor

Your macro can have as many macro inputs as you want - just add more macro input tools. Same with macro output tools.

 

Wrt your question about the difference between a standard and a batch macro, here's my best effort at explaining.

 

A standard macro is just a normal workflow where all the inputs and outputs are defined as placeholders and the whole thing becomes a single tool that you can then use in other workflows. When it is run, it only executes once. If you have your comparison workflow for instance, that let's say, loads sheet A from 3 different files, compares them and writes the results to a template file with a specific name, you can convert the 3 inputs to macro inputs and the output data tools can stay in the macro and you can give it a dynamic filename that contains the sheet name for instance.

 

So if you use your standard macro in a workflow, it will allow you to specify the files, but the comparison will only be on Sheet A.

 

Now take that standard macro with 3 inputs and add a control parameter input, which makes it a batch macro. As the control input, create a list of the sheet names in the files and configure the Control parameter through 3 action tools to change the Sheet name in each of the 3 Input data tools. 

 

When you now run the workflow with the batch macro in it, it will run an iteration for every sheet name defined in the Control input.

 

There might be one further complication. You said that the sheets all have different schemas, which means a dynamic input tool won't work. The problem with dynamic input is that you have to specify a template file and sheet. The files that you then load has to have the same schema as the template. To overcome this you'll probably need 3 simple batch macros that sits inside your bigger batch macro to be able to load any sheet of the 3 input files. You could also consider using the xlsx wildcard macro from the CreW macro pack.

 

I know all this sounds a bit complicated - it's not really that bad. If you're willing to share some sample data and a version of your comparison workflow, I'd be happy to try and help.

datamonkey
8 - Asteroid

Thanks DavidP - I got pulled onto a different project and completely forgot to check back here until today when I am returning to this problem.

 

I've attached the workflow that I know works in the simpler case of a one sheet comparison, as well as sample data for each of the three inputs as well as the expected output (which this workflow successfully produces).

I was going to include my attempt at the scaled up version to compare all tabs, but I have four different ones I started and scrapped when it seemed like they were the wrong approach, not sure any are worth including!

 

I had considered the xlsx wildcard macro but a) it seems more tailored toward taking in a whole directory rather than three specific files and b) towards taking in every sheet except pre-determined ones to exclude, while I would rather say "only look at every sheet beginning with BAU in the name" as there are other sheets irrelevant to this exercise and there may be more in the future.

 

Thanks in advance for any insight you can provide.

DavidP
17 - Castor
17 - Castor

Hi @datamonkey,

 

Have a look at the attached below. I converted your workflow to a batch macro with 4 control parameters - one for each of the 3 files and one for the sheet name since the macro will execute for one sheet name at a time.

 

I also built a small app that allows you to select 3 files for Base, Prod and Test. The app reads the sheet names for each and then groups the filenames together by sheet name.

 

The app therefore creates a list of sheet names and the associated full filepath with sheetname for each of the 3 input files. This is the input for the macro, which will execute once for each row in the list.

 

I've also included a filter to only keep sheet names staring with BAU in the list.

 

The macro itself reads the relevant sheet from each of the 3 files, executes your logic and then writes the output to the 2 files where the sheetname is updated as the relevant sheet name.

 

When you run the app it will show that there is no output, but if you click on the output log link you'll see the files that were written.

 

The files you provided only have 1 sheet, but it should also work for files with multiple sheets.

 

Give it a try and let me know if there are any issues.

 

DavidP_0-1593703377474.png

 

datamonkey
8 - Asteroid

That worked perfectly, thank you for your excellent explanation and assistance!

Labels