We are celebrating the 10-year anniversary of the Alteryx Community! Learn more and join in on the fun here.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Join versus Union

Mikebarb1
5 - Atom

How do I decide when to use the join tool versus the union tool?

3 REPLIES 3
Luke_C
17 - Castor
17 - Castor

Hi @Mikebarb1 

 

You might look at this knowledge article: https://knowledge.alteryx.com/index/s/article/Join-vs-Union-1583460916101

 

Essentially a join is similar to a vlookup in excel, adding or supplementing additional information (more columns) to your data. Union is combining datasets to form a larger dataset (from a record count perspective)

Qiu
21 - Polaris
21 - Polaris

@Mikebarb1 
In simple word, Join arranges the two data stream horizontally, and Union arranges them vertically. 😁

jirikrecek
8 - Asteroid

As others have said, think of these as either data lookups (join) or a data stacker (union)

 

  • Join - similar to a VLOOKUP in Excel, but way more powerful
    • you need some sort of a key, which will be the same in both tables
    • For instance, you'd have a customer ID in left table where you have order info - like items in a shopping cart
    • Then you have a right table, which also has customer ID, but also has customer address
    • You used Join tool to use the Customer ID to pull the address columns from right table and add them to the left table
    • So, your amount of records (ideally) remains the same, but you enrich your data with additional columns from another data source
    • Join has 3 output anchors
      • L - records which are found in left table, but not found a match in right table
      • J - records were found in both right and left table and a match was made based on your key
      • R - records were found in right table, but not the left table
    • Having these 3 outputs is super helpful and allows you to do all sorts of things
    • If you do some SQL, you can think of L as left outer join, J as an inner join and R anchor as right outer join - behaves more or less the same in this Join tool
  • Union - Excel doesn't have a built in tool to do this, but think of it as a data stacker
    • Example would be having data with 12 tabs in Excel, each tab is for one month and all tabs have the same columns
    • If you want to merge all this data in Excel into one huge table for the whole year you'd need to create a 13th tab
    • Then copy and paste the data under each other on teh 13th tab to make up the whole year
    • So, if each month had 100 records in each tab, your final stacked tab for the whole year would have 1200 records
    • Lots of work and prone to human error
    • Union does this for you!
    • As long as your data sources have the same data types and column order, you can all point them into a Union tool and it wills tack the data for you into one giant table - either by position of the column or by column name
    • Very nifty when you are trying to either
      • merge many files of same structure
      • do some extra calculations in Alteryx where you split the data and then you need to merge it back together

 

 

The dark green column is a primary key used to do the lookup in the Join tool.

Union tool doesn't need a primary key - it merges (stacks) the data by either column name or column position.

Screenshot 2025-02-26 212221.png

Labels
Top Solution Authors