community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Combine Different Schema based on output tempalte

Meteoroid

Hi community!

 

Hoping you can help me with the following as most of what I have read leads me to believe data cleaning has to be done manually before reading in files.

 

I have about 150 excel workbooks, each have a tab called "details". I am trying to get a workflow which allows me to read in all the information from the "details" tab and then allows me to select which columns and corresponding date should be output, based on the output template.

 

I already have a macro which takes the details tab from every workbook and combines them into one single workbook. The issue is the details tabs' schema differ across roughly 20 - 25 formats. I have the output template and the columns in the output are present in all the inputs but the positioning varies.

 

Hoping I can read in the singular workbook and only the columns in the singular workbook  which headers match the template columns are then consolidated and output in the template format.

 

Does anyone have suggestions or a workflow already made for this process?

Aurora

Hi @kspence703 

 

You can try something similar to this macro based solution, which will allow you to read only the columns you want from the 150 individual files, without the need for an Excel macro to combine them before hand.

 

Main.pngMain

 

The main program just gets list of files from the directory (Control parameter) and also the list of columns (C input) to read from those files.

 

Macro.pngMacro

 

The macro is a modification of a standard Read-a-Sheet-From-Multiple-Excel-files batch macro.  The filenames are passed into the control parameter.  The column names enter the C input and have a new field with the name "New_" prepended to it.  The Dynamic Rename uses these 2 columns to rename the required columns from the current workbook.  The Dynamic Select selects the FileName column from the Input tool and also all the columns that start with "New_".  The second Dynamic Rename Changes the column names back to originals by removing the "New_".  The batch macro is configured with a differing output schema in case some of the columns only appear in some of the files.

The results look like this 

 

Results.png

 

A, C and D are the column names that I entered into the list in the main program.  The 3 Excel files that I used as input(attached) have different combinations of columns from A to E in them.  They were initially in a subdirectory called "C:\Temp\Alteryx\Combine Different Schema based on output tempalte"

 

Try this out and let me know what you think.

 

Dan

 

Meteoroid

When I run the workflow with your files i get completely different results

 

clipboard_image_0.png

The above is a picture of the output I receive.

 

Also what is the impact if the file names are different. I tried to do this also with my own random workbooks and the workflow also did not work. Files attached.

 

 

Aurora

Hi @kspence703 

 

Sorry about the mix up there.  When I created the .yxzp package, alteryx changed the file paths in the macro from "c:\temp\Alteryx..." to "_externals\1\...".  However the Action tool tried to replace the specific string "c:\temp\alteryx..." which wasn't there any more.

 

Here's a new version which should be fixed.  The results with your files look like this

 

R.png

 

Dan

 

 

Meteoroid

Wow Danil this is helpful I feel like you have been doing this a long time haha please excuse my lack of knowledge so far. What adjustments would I need to make to account for multiple workbooks with multiple sheets. I know the name of the sheet from which I want the data to come "Details" but there are also other sheets within the workbooks which I do not want to access.

Aurora

Hi @kspence703 

 

I've only been using Alteryx for one year longer than you have(May 2018).   The thing that helped me the most was was doing all the Weekly Challenges.    

 

Here's an update to the main workflow that has a sheet name Text Input tool.  This is appended to the file path to build the complete sheet specifier in the form path\Filename.xlsx|||'SheetName'.  

 

Just copy this WF and paste in to the same directory as the original main.yxmd file

 

Dan

Labels