Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

Join vs Union



How do I know if I need to use a Jointool or a Uniontool?


This is a fairly common question. You have two datasets and you need to combine them into one larger data set, but how do you go about doing that? Alteryx has two tools that make this very easy depending on what you're trying to accomplish; the Join Tool and the Union Tool.

So what is the difference anyway?

The Join tool will make your file wider, meaning it adds fields to your data or combines your inputs horizontally. You want to use this tool if you're looking to add information for existing records. For instance, maybe you have a list of customers with the store ID that they shop at, and then have a separate list of the Store locations. If you want to add the Store information to the Customer information, you'd need to do a Join based on the Store ID field. This will match record by record the Stores with their Customers so your Customer file now has the Store information on it. Note that this can result in duplicate records depending on how your data sets are set up so be prepared to check your data.

The Union tool makes your file longer, meaning it adds records to your data or combines your inputs vertically. You want to use this tool if you're looking to stack two files on top of each other. For instance, maybe you have a list of customers from each region of your sales organization and want a single master customer file. All your region files have the exact same fields. The Union tool will align your data sets based on the Field names (either automatically or you can do it manually if there are slight variations) and you'll end up with one master file with all the same fields.

Take a look at the attached sample (built in Alteryx Designer 11.3) that demonstrates the difference!

8 - Asteroid

Nice article.

I have a unique situation where I have two data sets that need to come together to form a wider data set, presumably the union tool should make this happen. However, the first data set represents year to date sales information for each product and the second data set represents weekly sales information. Not all items have the weekly sales information as every item does not need to be tracked to this granularity. 


What I am noticing in the final data is that since I used the union tool, every time sales information is added that is being tracked on a weekly basis, the year to date information for that particular item is doubled (tripled etc...) in the year to date sales data. In essence, the line item is duplicated since there is weekly data associated to it.


Am I using the right tool in this instance? Is there an alternative tool I can reference, or even possibly, isolate the year to date line items from displaying more than once (somehow use unique or something).


I wish I could share the workflow or create a sample workflow, but it is hard to show. 





7 - Meteor


Can you take the file with the weekly sales and group the sales so that you calculate the YTD sales?

This should remove the duplicates for you.


Also, the union tool will make your data longer, not wider so without seeing what you are working on, I'm not positive you are using the right tool.