Alteryx Designer

Find answers, ask questions, and share expertise about Alteryx Designer.
Register for the upcoming Live Community Q&A Session - and don't forget to submit your questions for @DeanS regarding the future role of analytics here.
SOLVED

Alteryx can dynamically pull Excel sheet names!

Highlighted
Alteryx Certified Partner

There aren't many challenges to dynamic processing in Alteryx. Unfortunately, reading Excel files with unknown and changing sheet names is one such challenge.

 

Skip to the bottom to download the installer and exemplar.

 

This is a challenge I've tackled in the past using a Python script. In the current version, I used the Python tool and rewrote my script to use pandas. The logic is completely encapsulated in the Python tool. This makes this look and feel more like an out-of-the-box Alteryx tool, especially since the process is wrapped in a macro with a custom icon! In part 2, I will discuss the macro in depth. My macro is depicted below.

seven_1-1588716550673.png

Simply feed a list of Excel full paths into the macro and it outputs sheet names for each file. Use the Excel Data Fetcher macro to get all of the data from the Excel files. In part 3, i will release some additional exemplars and macros.

 

seven_0-1588716088165.png

Watch this input

seven_5-1588717672126.png

turn into this output

seven_4-1588717640294.png

in moments.

 

That's the magic! Take that output and feed it into another macro and watch all the data come out dynamically.

 

For a closer look at the Python script in the macro, keep reading.

I wrote and rewrote the Python script entirely from scratch. I have enclosed it below. In part 4, I will delve further into the Python script, why I made certain design choices, and some performance metrics.

#################################
from ayx import Package


#################################
# list all non-standard packages to be imported
from ayx import Alteryx
import pandas as pd
import openpyxl # read xlsx and xlsm
import xlrd # read xls
from pyxlsb import open_workbook # read xlsb
import warnings


#################################
# read in data from input anchor (after running the workflow)
df_in = Alteryx.read("#1")


#################################
df_out = []

# 'main' function
for row in df_in.itertuples():
    if row.function_call == 'fetch_xlsx_xlsm_sheets':
        '''
        openpyxl can throw a nuisance warning for some files 
        with headers or footers. Ignore the warning.
        '''
        warnings.filterwarnings("ignore",category=UserWarning)
        temp_workbook_object = openpyxl.load_workbook(row.FullPath)
        sheet_list = temp_workbook_object.sheetnames
    elif row.function_call == 'fetch_xls_sheets':
        temp_workbook_object = xlrd.open_workbook(row.FullPath)
        sheet_list = temp_workbook_object.sheet_names()
    elif row.function_call == 'fetch_xlsb_sheets':
        with open_workbook(row.FullPath) as temp_workbook_object:
            sheet_list = temp_workbook_object.sheets
    else:
        None
    sheet_series = pd.Series(sheet_list)
    df_out.append(pd.DataFrame({'FullPath': row.FullPath, 'SheetName': sheet_series.values}))

# create the output dataframe comprising 
# any intermediate dataframes from 'main' function
df_out = pd.concat(df_out, ignore_index=True, sort=False)


#################################
# read in metadata from input anchor
md = Alteryx.readMetadata('#1')
# drop metadata for unused field
del md['function_call']
# add metadata for new field
md['SheetName'] = {
    'name': 'SheetName', 
    'type': 'V_WString', 
    'length': 1073741823,
    'description': 'script author: Kacper Ksieski'
}
md


#################################
# write out the data with updated metadata to output 1
Alteryx.write(df_out, 1, md)

 

Here is the install script. Make sure to run the installer in Designer in Admin Mode.

#################################
# List all non-standard packages 
# only missing packages will be installed
from ayx import Package
Package.installPackages(['pandas','xlrd','openpyxl','pyxlsb'])

 

In part 2, I will discuss the macro in depth.

In part 3, i will release some additional exemplars and macros.

In part 4, I will delve further into the Python script, why I made certain design choices, and some performance metrics.

 

Stay tuned for the next posts and leave feedback! Thank you.

 

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

Hi @seven was there a reason you didn't make use of the wildcard xlsx tool that is available on the gallery, or build your own custom tool using the code free tools that exist in the product (with the 'input <list of sheet names>' and then a batch macro)?

 

I'm just trying to understand if you identified gaps in the way these tools worked and hence went the python route, or you just wanted to showcase another method of doing it?

 

 

Highlighted
Alteryx Certified Partner

@BenMossGreat question. I haven't seen the wildcard xlsx macro to which you refer. A search for "wildcard" or "xls" on the Gallery does not yield results that appear to be what you describe. See images below:

seven_0-1588772458440.png

seven_1-1588772495586.png

Why didn't I use code free tools that exist in the product? That is a limitation of Microsoft, not Alteryx. Microsoft Excel files of type (xlsx, xlsm) allow Alteryx to fetch a list of sheet names as you mentioned. Microsoft Excel files of type (xls, xlsb) do not.This is a key part of my solution and I apologise that I did not include it in my post.

 

As far as I know, my solution is the only one that operates on all four Microsoft Excel types (xlsx, xlsm, xls, xlsb) in a way that is seamless to the Alteryx end user.

Highlighted
Alteryx Certified Partner
Alteryx Certified Partner

Perfect! That's what I had hoped, I know a lot of people get frustrated with this restriction with xls files!

 

I'll make sure to bookmark this for those cases!

 

Ben

Highlighted
Alteryx Certified Partner

New post, part 2!

Click here.

Highlighted
Alteryx Certified Partner

New post, part 3!

Click here.

Highlighted
Alteryx Certified Partner

New post, part 4!

Click here.

Labels