Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
ned_blog
8 - Asteroid

There was a question on the Alteryx forums today about reading multiple XLS tables. This is the kind of thing that the DynamicInput tool is designed for, except for 1 problem. There is no way in Alteryx to get a list of tables/worksheets from XLS. We could of course add a feature to Alteryx to do that, but customers don't have that ability. So what to do? How about use the RunCommand Tool?

 

So now we need a command line that gets the tables from an XLS file. Some quick Googling on line turned up some good VB script examples. The advantage of VB script is that you don't need any special developer tools on your machine. The following script (with no error checking) will take a OleDb connection string and return a list of tables:

 

Set con = CreateObject( "ADODB.Connection")
con.Open(wscript.Arguments(0))
Const adSchemaTables = 20
Set tables = con.OpenSchema(adSchemaTables)
Do While Not tables.EOF
WScript.echo(tables("TABLE_NAME"))
tables.MoveNext
Loop

 

Again some Googling can figure out the OleDb connection string for XLS files. After this, it is just a matter of making a module that runs our little VBScript and reads all the tables in one stream. After this, it is a simple matter to take all the tables and read them into a single stream.

 

This only works if the schema's of all your tables are the same. If they are different, you might consider a batch macro which gives some more flexability.

Technical notes:

 

  • As Provided, this module will only work in a 32bit version of Alteryx. The reason is that the XLS OleDb provider is only available in 32bit. If running a 64bit Alteryx, change cmd.exe to c:\windows\syswow64\cmd.exe in the RunCommand Tool.
  • The RunCommand is doing another cool trick. It is piping the output to a file. If you notice, The above VBS example wasn't doing anything special with it's output. If you notice at the end of the command line there is a > temp.csv which causes the output to be written to the file named temp.csv.

The modules & scripts can be downloaded here.