Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Join an excel tab with the same name from two excel files with similar names

JamesGray
7 - Meteor

Hi,

 

I have a list of around 40 excel files which within them have financial statement data that eventually go into two tabs within the file called "P&L" and "BALANCE SHEET".

I have these excel files for this year and last year. The name of these is like "XXX - Dec 20" and "XXX - Dec 21" (this would be 2 of the 40 files), then another two files such as "YYY - Dec 20" and "YYY - Dec 21". 

So in summary, 40 files, for 20 companies, all with the same P&L and BS tabs inside.

 

I want to be able to perform a join between the last year and new year file and then add a column to show the movement. I have made this individually for one company where I take an input for each file and join then add the formula tool to calc the movement. However, this is slow and I still need to select the tab I want from the input tool.

 

How can I automate this??

 

I believe it is possible but I have not been able to find something similar in the community.

 

Any help would be great.

8 REPLIES 8
Felipe_Ribeir0
16 - Nebula

Hi @JamesGray 

 

Instead of using the Input Data Tool, for this case i understand that it is better to use the directory + dynamic input tool.

 

The directory tool will pass to the workflow the filename of all files inside a folder, and the dynamic input tool will union  the configured sheet of these files.

 

I made one dummy example based on what you explained here, take a look at the attached file.

 

Felipe_Ribeir0_0-1665228045350.png

 

If something still not clear, let me know!

 

If the provided answer helped you to solve the problem/was correct, please accept it as a solution :)

 

Thanks.

danilang
19 - Altair
19 - Altair

Hi @JamesGray 

 

Try using the wildcard functionality of the Input tool.  Use "??? - DEC 20.xlsx!!!P&L" in one Input Data tool for the 2020 files and "??? - DEC 21.xlsx!!!P&L" in another for the ones from 2021.  Make sure to include the filename in the output. This will give you all the data from each year in two separate datasets.  Then use a Formula tool with  Left(filename,3) called "Company" to extract the XXX part.  You can then join the two datasets on Company(plus the other required keys that you already use in your current workflow).  then you can proceed to compare the fields from the two years.  Repeat the process for "Balance Sheet" 

 

One thing to note is the the wildcard functionality requires the fields to be identical in name and size.  If they're not, you get a warning similar to this file is not in the same format as the first one and will be skipped.  To work around this, try this solution which uses a batch macro to read the files.

 

Dan

     

Felipe_Ribeir0
16 - Nebula

Hi @JamesGray 

 

It worked?

 

 

JamesGray
7 - Meteor

Hi both,

 

Thank you for your responses. Both really helpful and help me gain a better understanding of these tools. Using  @Felipe_Ribeir0 workflow I got an error around different schema. I did some further research and found a macro which worked around this and combined the two workflows to reach the answer I was trying to get.

 

I will post an update this weekend with the details and credit to the other macro provider.

 

Thanks

Felipe_Ribeir0
16 - Nebula

Yes, the dynamic input just work if the schema (columns and datatypes) is the same for each different file.

 

I assume that you are now using the macro attached to this post, right? https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/The-Ultimate-Input-Data-Flowchart/t...

 

This post has also more explanation about the input alternative methods for files.

 

JamesGray
7 - Meteor

@Felipe_Riberi0

 

Hi,

 

So I have used this macro workflow I found called:Import Multiple Schema Excel Files 

 
This combined with your workflow has produced the solution I was after. Although undoubtedly they is possibly a cleaner option that I just haven't found.
I was able to combine all my files and specific sheet names with no schema errors.
 
However, one thing I have found is that I had a number of files which when I run the workflow, in the first macro (to extract sheet names). I get an error "Unable to open file for read" in the file name of the error message it shows it beginning with a ~$ Do you know how I solve this? I have tried opening and saving the file again but not sure what else to do.
 
Thanks
 
 
JamesGray_0-1665949614485.png

 

Felipe_Ribeir0
16 - Nebula

Hi @JamesGray 

 

Filter out the files that have this '~$' before of the macro.

 

Felipe_Ribeir0_0-1665950439673.png

 

 

'

Felipe_Ribeir0_1-1665950439410.png

 

 

Here you can see some explanation about why these files are being shown for you.

What’s that ~ squiggly mark in front of my filenames? :: Accurate Die Design Software

 

JamesGray
7 - Meteor

That's brilliant, not sure why I hadn't thought of that. Thank you! :) 

Labels