In part 1, I gave an overview of how Alteryx can now pull Excel sheet names dynamically. I attached the workflow, Python_Excel_exmplar.yxmd, to pull sheet names and data as well as the installer workflow, run_me_as_admin_to_install_dependencies.yxmd. The workflow, Python_Excel_exemplar.yxmd, contains two macros:
python_excel_sheet_name_fetcher.yxmc
excel_data_fetcher.yxmc
Here in part 2, I will discuss the workflow and macros in depth.
Upon opening the Python_Excel_exmplar.yxmd, one will notice that I take credit for versions 2 and 3 and give credit for version 1 to @CameronS.
I first became acquainted with the ability to pull Excel sheet names by reading his post. I was working on a project and had a need to read data from Excel files with unpredictable sheet names without manual intervention. His solution uses a completely different method for fetching sheet names and did not fit my project needs. My exemplar workflow in which a list of Excel files is scanned for sheet names and then data is extracted is inspired by his work and that is why he receives credit for the exemplar.
The excel_data_fetcher.yxmc macro is in some sense a generic macro. It comprises an input tool and some interface tools. The input tool is set to read .xls files and this same driver can conveniently read the other three types of Excel files (.xlsx, .xlsm, .xlsb). Anyone can build this batch macro and I don't take any special credit for it. @CameronS included something similar in his solution. It fits into my solution as the data fetcher. A list of Excel files and their sheet names are fed into this macro using a control parameter and output using the macro output. The list of files and sheets are wrangled into a field I call "replacement_value". This is an entire string that replaces the full path of an Excel file as well as its sheet name. This entire string tells the input tool how to get data. The way I get and build the "replacement_value" is where I take credit.
The python_excel_sheet_name_fetcher.yxmc macro, built entirely from scratch, is my contribution to the Alteryx community. It starts with a list of Excel files from a directory tool sent into the macro input.
I identify the extension of the file using the FileGetExt() function.
FileGetExt([FullPath])
Once I have the extension, I bucket the file by how I will get the sheet names. These buckets, "function_call", will be useful in the Python script and I will discuss them at length in part 4.
IF [Excel_type] IN ('.xlsx', '.xlsm')
THEN 'fetch_xlsx_xlsm_sheets'
ELSEIF [Excel_type] IN ('.xlsb')
THEN 'fetch_xlsb_sheets'
ELSEIF [Excel_type] IN ('.xls')
THEN 'fetch_xls_sheets'
ELSE Null()
ENDIF
If any non-Excel files have inadvertently flowed into the macro, they are filtered and sent out via a separate macro output.
!IsNull([function_call])
I reduce the fields to only the good stuff using a select tool. I only need the file and the function call to flow into the Python tool.
The list of Excel files and associated Python functions flow into a Python tool. I will address the Python script in part 4.
The Python tool uses the script I wrote to examine each Excel file, read the names of the associated sheets, and then send the list of files and their sheet names to output 1.
This list from output 1 moves to a formula tool, where the "replacement_value" is built. This is a concatenation of the full path of the file, three vertical bars, a back tick, the sheet name, and a dollar sign and back tick.
[FullPath]
+ '|||`'
+ [SheetName]
+ '$`'
This is how Alteryx expects a full string to look before it reads data from Excel.
These data are sent out of the macro using a macro output.
That is what my macro does. It takes any and all Excel files and outputs their sheet names as well as a convenient string that can be used in a batch macro like the one I included, excel_data_fetcher.yxmc.
The shortest path to victory is a directory tool listing your Excel files and my python_excel_sheet_name_fetcher.yxmc to fetch their sheet names. As I mentioned before, you can use the included macro to then fetch the data or build your own.
In part 1, I gave an overview of my solution.
In part 3, i will release some additional exemplars and macros.
In part 4, I will delve further into the Python script, why I made certain design choices, and some performance metrics.
Solved! Go to Solution.
New post, part 3!
Click here.
New post, part 4!
Click here.
You ran the macro in a workflow and it works. Have you attempted to use the workflow in your crew runner?
Yes, I attempted to run but it failed to run, whereas individually the workflow is running and getting expected result.
This seems like a problem with the configuration of the crew runner and the workflow. Impossible to troubleshoot this with no information and no examples. Please make a new post on the community and you will get plenty of help.