Alteryx Designer Desktop Knowledge Base

Definitive answers from Designer Desktop experts.

How to Use the Join Tool with a Union Tool and Change Field Names and their Order

lepome
Alteryx Alumni (Retired)
Created

Joining fields with name changes and reordering can be tricky.  It's important that the data from Join tool anchors have field names that allow the Union tool to stack the data appropriately.

Prerequisites
  • Alteryx Designer
    • All versions

Procedure

  1. If you've read the Tool Mastery article that covers the Join tool and then played with it on your own some, you've probably noticed some of its more subtle features.  One is that Joining two tables can generate a staggeringly large number of rows.  That's often due to selecting fields that have some repeated values, and is covered in a fine article titled "Why Your Join Is Getting More Records than Expected."

  2. As described in the Help you can combine the data from Join output anchors with a Union tool.
  3. Sometimes you want to change the names of the various fields or change their order.  If you are using the Join tool alone to work with just the J output (the Inner Join situation) you can do everything at once and use the "Select" functionality built into that tool.  If you are using any other anchors from the Join tool, you'll need to add a Select tool or two to get the results you want because changes you make to the field names and their order within the Join tool configuration only apply to the data coming out of the J anchor.  To fully understand this, let's look at the data from each anchor:

    1. The L and R anchors each contain the unmatched records from their respective data sources.  In this case, the L data had 10 fields and the R data had 5 fields, so that's what comes out of those anchors, completely unchanged.
      idea Skyscrapersidea Skyscrapers
    2. What comes out of the J anchor is configurable. You can output subsets of the fields from whichever source you like; you can rename those fields; you can change their order.  Any or all of those options will still leave the records coming out from the outer two anchors unchanged.  This is important if you want to combine the data from the L and the J anchors.  The Union tool can match up the data from the two (or three) anchors by field name, but does not know what you intend--only what it encounters.  This is why you may need to use a Select tool if you are using a Union tool with a Join.  Consider the situation where the data you join are clean, complete, and have field names that correspond, but don't match exactly.  I've rearranged the fields of the J anchor to demonstrate.
      idea Skyscrapersidea Skyscrapers
    3. In this case, it's not going to matter which of the duplicated fields you use going forward because they contain the same information.  You can select whichever fields in the Join you're going to ultimately use.  If you're going to Union L & J, you can use the fields with the names from the Left plus the newly-joined Sales Rep field.  If you Union the J and the R, you can use the fields with the names from the Right plus the newly-joined State, County, Region, Total Customers, Avg Sales per Customer, and Perf Indicator.
      idea Skyscrapersidea Skyscrapers
    4. But if only one of the files has data with which you are confident, then it matters how you combine the data.  The Union tool will stack data having the same field name and keep different fields separate.  You can use Select tools before the Join, between the Join and the Union, or after both depending on what you want.  If you rearrange and rename the data in the Join and then try to Union without a Select tool or two, you will probably be dissatisfied with the results.  Here, all the Union tools are set to Auto Config by Name and to Output All Fields with the same output order, but you can download the package and then browse the data to see how the outputs differ.
      idea Skyscrapersidea Skyscrapers

Additional Resources
Attachments
Comments
jacob_kahn
12 - Quasar

This was a really awesome read. Thank you for taking the time to put this together and for sharing with us!

DataMeister
7 - Meteor

Lisa: I came, I read, I didn't quite conquer! Maybe an Office Hours session would help clarify for me how the Union has to interact with Select and Join.

lepome
Alteryx Alumni (Retired)

@DataMeister 

Office hours and working sessions, by any name, are best arranged through the Virtual Solution Center (VSC).  Then you can share a screen with an Alteryx associate and ask questions with the workflows in front of you so you both can see exactly what you mean.