on 09-13-2016 06:43 PM - edited on 04-20-2023 08:29 PM by yhchen
This article is part of the Tool Mastery Series, a compilation of Knowledge Base contributions to introduce diverse working examples for Designer Tools. Here we’ll delve into uses of the Union Tool on our way to mastering the Alteryx Designer:
The Union Tool, the aptly named join category tool with DNA on it, accepts multiple input streams of data and combines them into one, unified, data stream. Whereas the Join Toolcombines datasets horizontally (either by a record ID or record position), the Union Tool combines datasets vertically. Not unlike how two nucleic acid strands are unified to form the double helical DNA.
We know, great puns are in our DNA.
The Union Tool has a handful of great applications besides side-stitching punchlines, too. Check them out below:
Have common fields in multiple datasets? Stack them into a single stream with the Union Tool by field name, position, or with manual arrangement:
Don't worry - your datasets don't have to have identical. Any uncommon fields will be at the end of the table, with any fields that are not in a given dataset being populated with null values.
The Alteryx Join Tool has 3 outputs:
These look like:
If you’re used to SQL joins, these are the left, inner, and right joins, respectively. The Union Tool allows you to effortlessly combine these Join outputs (shaded areas above) to create other, more complex,SQL join configurationslike the ones below:
Simply take your reporting elements and specify an Output Order in the Union Tool to stack them vertically - without creating a single reporting element from the combination like the Layout Tool does:
See the attached workflow, Union.yxmd, for the stack, join, and reporting examples shown above!
By now, you should have expert-level proficiency with the Union Tool! If you can think of a use case we left out, feel free to use the comments section below! Consider yourself a Tool Master already? Let us know at community@alteryx.com if you’d like your creative tool uses to be featured in the Tool Mastery Series.
Stay tuned with our latest posts every Tool Tuesday by following Alteryx on Twitter! If you want to master all the Designer tools, consider subscribing for email notifications.
Additional Information
Click on the corresponding language link below to access this article in another language -
Portuguese
Spanish
French
German
Japanese
How can I add a column to output to indicate source of input data ( such as from 1st source or second or third etc. )?
According to the graphics, while the middle output of the join tool is indeed equivalent to a SQL inner join, the other two outputs do not represent left and right joins, but probably the result of an except statement. SQL left and right joins are represented by the graphics shown after the unions are applied.
Great Articles... Making Learning easy and fun... 🙂
I had a query with respect to Union Tool -
Without specifying any output order (Keeping 'Set a specific Output order' tab unchecked)
Is there a default ordering behaviour associated with the Tool? If not, Wouldn't it be great to have a default ordering such as 1st data set should always be the one on the top and then the 2nd data set irrespective of the type of options used for performing the Union tasks.
Is it possible to select multiple inputs and drag them all to the Union tool at once?
Yes @TaraCronin TWiG it is definitely possible to select multiple inputs and drag them into Union tools, in fact Union Tool is one of the very few tools in Alteryx that accepts multiple inputs. That's why it has >>> in its input tag.
When running the Union Tool based on column positions, is it important to have have same data type (and size) for the columns in same position?
Also, what determines the order of stacking, by default. I thought the first input connected to the Union tool will be placed first and the remaining inputs will be placed below that. Also, from which input is the column headers chosen?
I'll clarify my question from earlier: Is it possible to select multiple inputs at the same time and connect them just once, as a selected group, to the Union tool. At this point I need to drag the input to the Union tool one by one. Sometimes there are twenty + files I'm looking to union. Thanks
If I have 4 data sources to be put together by union tool, is there a way to add a column to the output data sheet, where a name can be assigned for each row to identify which data source the rows are coming from?
In the following example, column 1 to 4 are coming from the 4 different data source, which will be put together by union on column 1. Hope someone can help me add a column before column 1 to show the data source name? Can I do this by using union tool or I need to use a different function after the union tool? Thank you.
Blank column to be added | Column 1 | Column 2 | Column 3 | Column 4 |
You can do it during the input tool, if that's your source for such multiple files (instead of multiple tools throughout the workflow), by using a WildCard (? for single digit differentiation or * for extensive differentiation among file names/sheets).
Ex:
file_a1
file_b1
Input: file_?1
file_sales_NY
file_sales_FL
Input: file_sales_*
"We know, great puns are in our DNA."
omg, I'm dying lol
is there a limit on how many source you pull from?
This is interesting and very helpful. Thank you.
thank you
This is interesting and very helpful. Thank you
hhtps://community.altery.com/t5/alteryx-community-english/ct-p/externa
This is interesting and very helpful. Thank you.
According to the graphics, while the middle output of the join tool is indeed equivalent to a SQL inner join, the other two outputs do not represent left and right joins, but probably the result of an except statement. SQL left and right joins are represented by the graphics shown after the unions are applied.