Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Looking to sequentially process excel files in a directory

Sheila29
7 - Meteor

I have a directory which have a number of excel files (all different formats/worksheets). All the files are named in a specific format though which represents a document number e.g. 13456.xlsx, 12458.xlsx, etc. I have a separate table which has the document number and another reference field which I need later on to be able to run a conditional workflow.

 

I'm stuck at the first stage of writing a workflow that will process each file sequentially and get the reference field from my other table for each file processed using the document number.

 

Can anyone assist with this please?

 

 

9 REPLIES 9
Philip
12 - Quasar

Hi Sheila29

 

Can you provide a few samples of the data?

 

From what I understand of your problem, you'll need to create a Read Reference Field macro and feed in the spreadsheet names. The tricky part is handling spreadsheets of different formats. I can't describe without seeing how the spreadsheets are laid out.

Sheila29
7 - Meteor

Sure, please see attached. Any ideas would be most appreciated!

Sheila29
7 - Meteor

Also below is an example of the workflow I'm running for just the one file. I have outputted the file name as a column so that I can link it to this other table to get the reference I need to run the conditional workflow.

The point of this workflow is to cleanse all the data in all the excel spreadsheets and put it in  a standard format. I have a workflow for each reference.

I will like to be able to repeat this process for all the files in my directory.

Thanks v much.

 

Joe_Mako
12 - Quasar

The biggest complexity in your situation is that your file extensions do not match their actual file type. In the attached 7z file (use 7-zip from http://www.7-zip.org to extract, this forum software did not allow for a normal zip file), there is a .vbs that is used for reading the sheet names from the files (since not all your files are actually .xlsx), two macros for reading in the two file types, and the Alteryx workflow to check if the file type matches the extension, rename as needed and load all data from all sheets in all files into a single output. The comment box area can be removed if your files contents matched their extension.

 

To use the workflow, have these four files in the same directory (can be different from your data) and point the first Directory step at your folder of Excel files.

 

Does this get you closer to what you are looking for?

 

Load Excel.png

Sheila29
7 - Meteor

Hi Joe, many thanks for this. It looks like this could work in principle. To make your workflow work for me, I have pointed the directory tool to my folder with all my excel files. However, your dynamic imput tool is pointing to a .csv file? Could you send me an example of this file please? The first filter also has a rule which says

([Field_1]='PK' AND EndsWith([FileName], '.xlsx'))
OR
([Field_1]!='PK' AND EndsWith([FileName], 'xls'))

 

What is Field_1 meant to represent and what is 'PK'? Sorry, it's not obvious to me!

 

Thanks for pointing out that all my files were not genuine .xlsx files. Actually, all these files were initially blobs that came from another workflow that's extracting these documents from a database and putting it in a network folder. I haven't found a way to convert these blobs to an Excel file yet so I manually changed the extension to xlsx.

 

 

Sheila29
7 - Meteor

Hi Joe,

 

Ignore my previous message! I managed to get your worklfow working :)

Sheila29
7 - Meteor

Hi Joe,

 

Sorry, the workflow ran the first time. Now that I'm running it again, I get this error message:

Error: Run Command (18): The external program "C:\Windows\SysWOW64\cmd.exe" returned an error code: 1

 

If you know what this is about, please let me know!

 

Thank you,

Sheila.

 

Joe_Mako
12 - Quasar

However, your dynamic imput tool is pointing to a .csv file? Could you send me an example of this file please?

Please do not edit the dynamic input step, leave it with the red excitation mark, you are safe to ignor it, it is there to read in the first two characters of your files, to determine what extension to use. If you edit the step, you may have inadverantly change the configuration. The details I set are (if you did edit the step, you will want to ensure all setting match these, or just copy paste from my origianl workflow):

config.png


What is Field_1 meant to represent and what is 'PK'? Sorry, it's not obvious to me!

 


"Field_1" is the first two characters of the files, PK means it is a zip file (interestingly it is the initials of the format's creator), and therfore a .xlsx file, and if it does not start with a "PK" is not s zip file and likely a .xls file. You might be able to do this check wtih your blob processing.

Joe_Mako
12 - Quasar

Is the alteryx workflow you are running in the same directory as the .vbs file? You will need to save the workflow in the same directory, or edit the run command tool accordingly.

 

Is your OS 64bit? I doubt this is the issue, but the cmd.exe I referenced is the 64bit version. If you have 32bit, replace that reference with just:

cmd.exe

 

What happens if you run the .vbs from by double clicking on the .vbs file? It should give a more informative error.

Labels