Hello,
I'd like to know how to list the sheet names from a XLS file.
I can't convert the file into XLSX, so I'm trying to find a solution to list the sheet names from an XLS file.
I've found the following macro: https://community.alteryx.com/t5/Engine-Works-Blog/The-Ultimate-Alteryx-Holiday-gift-of-2015-Read-All-Excel/ba-p/5558
But I really don't understand the macro "SheetFinder" itself:
Do I need to change anything in the following Run Command tool?
It looks a bit sophisticated for simply reading a single XLS file... Could you help me out on this?
Attached is a workflow using the Python method if you want to give it a try.
This is because xls files are legacy at this point and use old Microsoft drivers. The method for determining the tabs from these files is found in the supporting files folder using the VBScript file ExcelTabsScript.vbs in the run tool.
If you just use the "Read All Excel Files" Macro and don't change anything in it, it actually includes the FileName which has the sheet names as well. To keep things simple it might be easiest to just use the macro and do a summarize on the FileName field in order to get the list of paths and sheet names.
Another option would be to use Python with the xlrd library. The path to the Excel file can be supplied from an incoming tool (shown in example) or hard-coded in the Python code. The code could also be modified to take in/process multiple files.
Python Code:
from ayx import PackagePackage.installPackages('xlrd')
from ayx import Alteryximport xlrdimport pandas as pd
# Read in the first cellinputfile = Alteryx.read("#1").iloc[0][0]
# Open Excel filewb = xlrd.open_workbook(filename=inputfile)
# Get the sheet names and send outAlteryx.write(pd.DataFrame(wb.sheet_names(), columns=['TabName']),1)
I wish I could use it but I keep on having the following error
Error: Sheet Finder (2): Record #1: Tool #33: The external program "cscript.exe" returned an error code: 1
What should I do?
@KP_DML That Python workflow is fantastic. It solves the exact problem I was having. Thanks!
Hi Charles,
great solution, works perfectly
would you have the code to allow multiple input lines?
Thanks anyway
Hi @fmorazzini,
Attached is a variation of the workflow that supports multiple (or just one) Excel files.
What does this indicate?
Error: Python (2): ---------------------------------------------------------------------------CalledProcessError Traceback (most recent call last)<ipython-input-1-c63f9486f906> in <module>2 # script here (only missing packages will be installed)3 from ayx import Package----> 4 Package.installPackages('xlrd')c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\ayx\Package.py in installPackages(package, install_type, debug)200 print(pip_install_result["msg"])201 if not pip_install_result["success"]:--> 202 raise pip_install_result["err"]c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\ayx\Utils.py in runSubprocess(args_list, debug)118119 try:--> 120 result = subprocess.check_output(args_list, stderr=subprocess.STDOUT)121 if debug:122 print("[Subprocess success!]")c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\subprocess.py in check_output(timeout, *popenargs, **kwargs)354355 return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,--> 356 **kwargs).stdout357358c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\subprocess.py in run(input, timeout, check, *popenargs, **kwargs)436 if check and retcode:437 raise CalledProcessError(retcode, process.args,--> 438 output=stdout, stderr=stderr)439 return CompletedProcess(process.args, retcode, stdout, stderr)440CalledProcessError: Command '['c:\\program files\\alteryx\\bin\\miniconda3\\envs\\jupytertool_venv\\python.exe', '-m', 'pip', 'install', 'xlrd']' returned non-zero exit status 1.
Error: Python (2): ---------------------------------------------------------------------------ModuleNotFoundError Traceback (most recent call last)<ipython-input-2-4fca0f91e045> in <module>1 from ayx import Alteryx----> 2 import xlrd3 import pandas as pd45ModuleNotFoundError: No module named 'xlrd'
Nice! Thanks for sharing. Once resolving the xlrd package install it was off to the races! Thanks again.