community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx Knowledge Base

Definitive answers from Designer experts.
Upgrade Alteryx Designer in 10 Steps

Debating whether or not to upgrade to the latest version of Alteryx Designer?

LEARN MORE

Tool Mastery | Join

Alteryx
Alteryx
Created on

Join.pngThis article is part of the Tool Mastery Series, a compilation of Knowledge Base contributions to introduce diverse working examples for Designer Tools. Here we’ll delve into uses of the Join Tool on our way to mastering the Alteryx Designer:  

 

The Join Tool is the quintessential tool for data blending within Alteryx.  As such, it is also one of the most widely used tools. The Join Tool allows you to join data together from two different sources in two different ways: by record position and by specific fields. Selecting by record position will attach the two datasets together where it will match up each record by the position it is in. Thus record 1 of the left dataset will be in the same row as record 1 on the right in the J output and so on. If one dataset from either side has more records than the other those records will not be joined and they will be placed in there corresponding right or left output (L or R). Joining by specific field will match records up based on a specific field or multiple fields. This article goes into how that option works in more depth and detail. I highly recommend it as a read, as it covers frequent behaviors of the tool that you might run into.

 

Below are some examples of how you might use a join:

 

  • Joining to bring a field not available in other data sources. As you can see in the two tables below, both have fields that are exclusive to those tables and we want to bring those fields together. We would do this by joining on "Fruit" as this is the common field/identifier between both datasets:

     

J1.jpg

 

 

  • Using Join as a filter. You can also use a join to filter out records if you have a secondary dataset or list you want to filter by. Seen below, since Customer ID 3 and 4 is not contained in Table 2, records containing 3 and 4 in Customer ID will get dropped from the center join:

 

customer ID table join.jpg                                                                                                                          customer ID table join 2.jpg

Table 1                                                                                                                                                            Table 2

 

  

customer ID table join 3.jpg

Joined result of Table 1 and Table 2

 

 

If you're noticing your join output has fewer records than anticipated, be sure to check out this article!

 

By now, you should have expert-level proficiency with the Join Tool! If you can think of a use case we left out, feel free to use the comments section below! Consider yourself a Tool Master already? Let us know at community@alteryx.com if you’d like your creative tool uses to be featured in the Tool Mastery Series.

 

Stay tuned with our latest posts every Tool Tuesday by following Alteryx on Twitter! If you want to master all the Designer tools, consider subscribing for email notifications.

Comments
Meteor

Hello, After using the join in alteryx I have drawn a hypothesis. So I would like to know, What I am thinking It is true or not. When joining from two datasource with more than one key, What I am realising It does not do the samething what SQL server do. 

 

So for example I have 3 field A, B, C which is common in both two table. So after joining If B and C is NULL for some row then SQL server does not join the data. But alteryx join the data only based on A key, which is not null.

 

So the Hypothesis is when there is more than one key to join,

Alteryx joins data from two table when atleast one key is available, but SQL server join if and only if all key are not null in case of simple join or inner join.

Community Data Engineer
Community Data Engineer

Hey DataPsycho!

 

This is likely because the Join Tool recognizes NULL cells as identical values - that is, it finds a match in the A key, and then joins every NULL of B and C to every NULL of B and C in your other dataset. You could confirm this by checking to see if there are duplicates exiting your join. The behavior is described here in the context of one join condition field: https://community.alteryx.com/t5/Alteryx-Knowledge-Base/How-the-Join-Tool-Works-and-Why-You-Might-Be...

 

Happy Blending!

MattD

Meteoroid

Hello,

 

Thanks for this article.

 

Is there a limit to the number of rows when performing a join?

 

 

I am very impressed with the speed at which Alteryx Designer is able to load and join two datasets using key fields (see example below).

 

Note: Academic exercise only using government/NHS data available on the public internet.

 

 

BrowseProfile.png

 

 

JoinBrowse.png

 

 


I am impressed that it took only 2 minutes to join. I tried this on a different analytics tool and it took 20 minutes!

 

Part of the challenge in data blending is to locate the key fields in desperate sources. However the GUI here helps to make the data exploration easier.

I am also enjoying the on demand videos available at: https://www.alteryx.com/on-demand-training

Keep up the good work!

 

Alteryx
Alteryx

Hi @plim,

 

There is no limit in the number of rows. The only limit that exists is the hardware on your machine and its ability to handle large data. Glad you are impressed with the tool. This is what we strive to hear from our users.

 

Cheers!

Asteroid

I am currently encountering that hardware limit that Ozzie is mentioning; I have been running queries that take up to an hour to run; too large files, but I need a faster computer with more memory.