I would like to check how to use UNION tool if one of the dataset is missing during runtime. I am working on scenario where I am trying to combine data from two excel sheets of same workbook with same metadata. This file may or may not have data for both sheets at a time. In such case, that sheet will be missing from workbook.
Scenario 1: PRODUCT_ORIGINAL and PRODUCT_SPECIAL are two sheets available in worksheet with same number of columns.
Output 1: UNION tool works fine while combining data from these two sheets.
Scenario 2: Either of sheets is missing (let's say PRODUCT_SPECIAL is not present in this case)
Output 2: UNION tool output doesn't show data at all even for PRODUCT_ORIGINAL tab along with error (for missing PRODUCT_SPECIAL tab) which can be ignored while data processing.
Thanks for replying. I am attaching sample workflow here with dummy data. In my scenario, input files are generated based on source data availability so it may or may not have all the sheets at any point of time.
In Scenario 2, I am expecting both North Sales and South Sales sheets in PRODUCT_FINAL.xlsx output document but i can see only South Sales.
Really nice problem to solve, awesome! Thanks (i learned something)!
See my attached workflow, it works for your case. Use the dynamic input tool and set your input tool to "List sheet of names", it will extract all sheets via the dynamic input tool but obviously only sheets that exist. Luckily, the dynamic input tool will union the dataset whenever there are more sheets available and since your sheets have the same format, it working nicely :-).
A caveat about Dynamic Input: it's very fussy. Even when the data in the two sheets appear to match, Dynamic Input will not work if a field is read in as VString in one of the worksheets and as VWString in another. This can happen quite easily in Excel files, and has happened to me often.
My workaround, when Dynamic input is failing, is to use a simple Batch Macro.
I'll start with an Input tool, and set it to pull the list of sheet names, and return the name of the full path of the file:
Then, a formula tool to combine the File Name and sheet name
That will feed into a very simple batch macro, consisting of a control parameter, an input tool, and a macro output. The control parameter drives what file is used for the input.
Configure the macro to allow for different schema:
This macro will allow the sheets to have slight variations in the data types, which Dynamic Input doesn't allow for.
Pro tip: Put a browse tool after the macro. Alteryx needs another tool to execute all of the rows feeding into the batch macro. You can learn more about batch macros here
I raised with Alteryx support team and they have also suggested me same solution to use Dynamic Input tool which I tried that earlier before raising ticket but it was not working as expected earlier. This is due to one of the column datatype changed to Double when there is no data which was very annoying to me because i cannot set datatype in Dynamic Input tool itself.
Now i used below workaround using Dynamic Input tool along with option First row contains data and updated my workflow as shown below.
1. Input tool -- Gives list of all the sheet names.
2. FIlter tool -- Filtering only sheets which needs to be combined and rest sheets can be considered separately.
3. Dynamic Input tool -- Field = sheetname &Action = Change File/Table Name along with First row contains data checked
4. Dynamic Rename -- To bring header of records back using Rename mode - Take Field Names from First row of data.
5. Filter tool -- Remove unwanted header of second dataset.
6. Select tool -- To change datatype of columns from VSTRING to required datatype.