Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
The Expert Exam is now live online! Read about the specifics and what it took to bring it to life in the blog by our very own Elizabeth Bonnell!
Alteryx Community Team
Alteryx Community Team

Have you ever noticed that you can't do a left join or a right join in the traditional sense in a single step with Alteryx? Normally to accomplish this you'd add a union tool after the join like so:

 Left join

I thought it would be nice to reduce this to one step, as pointed out here. While I was at it, I thought I'd add some functionality to filter the data in the same step, reducing the need to add a filter tool after the join, as is often the case. Sandeep expressed an interest here in being able to do this with date ranges. 

 

Attached are some example uses of the macro I came up with to accomplish this. The first example shows a left join. The second example, "Inner join with condition (date range)", performs an inner join on a certain column, then filters the joined dataset by checking if a date from one incoming dataset falls between two dates from the other. In SQL you would accomplish the same thing like so:

 

SQL join

 

In SAS (assuming the datasets are sorted properly and the join by columns have the same name), you'd do this:

 

SAS join

 

Another thing you can do with this macro is a cross join (Cartesian product of rows from incoming data) with an optional filter condition. But be careful, before applying the filter, the resulting join will contain as many rows as the product of the number of rows of the incoming datasets. Meaning if you're joining 2 datasets that each contain a million records, the resulting dataset will contain one trillion records! A potential use for the cross join is illustrated in the third example.

 

Feel free to download the macro and tailor it to suit your needs.

 

-Neil

Neil Ryan
Sr Program Manager, Community Content

Neil Ryan is the Sr Program Manager, Community Content, responsible for the content in the Alteryx Community. He held previous roles at Alteryx including Advanced Analytics Product Manager and Content Engineer, and had prior gigs doing fraud detection analytics consulting and creating actuarial pricing models. Neil's industry experience and technical skills are wide ranging and well suited to drive compelling content tailored for Community members to rank up in their careers.

Neil Ryan is the Sr Program Manager, Community Content, responsible for the content in the Alteryx Community. He held previous roles at Alteryx including Advanced Analytics Product Manager and Content Engineer, and had prior gigs doing fraud detection analytics consulting and creating actuarial pricing models. Neil's industry experience and technical skills are wide ranging and well suited to drive compelling content tailored for Community members to rank up in their careers.

Comments
12 - Quasar

Neil

This is a Awsome Macro 🙂

thanks 

Alteryx Partner

What does this mean:

Error: Advanced Join (27): Tool #23: 63 records were generated by an N-Dimensional Join - Join Key: (__a__: 1009:2)

 

i get this error when doing a conventional join with a condition.

Alteryx Community Team
Alteryx Community Team

@jz For a conventional join the join tool within the macro (tool #23) is set to error on multidimensional joins of more than 16 records to keep the output from blowing up. So you'll either need to address this within your incoming data (for example by removing records that have duplicates of the join key on one side of the join) or you can change the configuration of the join multiple tool #23 within the macro to "Allow all multidimensional joins".

Alteryx Partner

Thnx Neal!

5 - Atom

Thanks, I needed this one. May sound obvious, but with no dates I used condition 1=1. May not be needed, but I love a good combinatorial explosion.

11 - Bolide

Very useful!  I've needed this for joining on date logic for a while now.

9 - Comet

this is fantastic! 

 

What if i want to join on a date condition as shown here AND join by, say, a "group id"? 

 

Do the join tool on group id first and then the advanced join back to the prior input? 

Alteryx Community Team
Alteryx Community Team

The second example in the package attached to the post is doing just that, I believe...

Capture.PNG

5 - Atom

Hi all - 

 

@NeilR, Thanks for this macro; it's very useful!

 

Is there a syntax with which we can set the condition to return the record with the maximum start_date, in the event that multiple start_date's are < date?

 

i.e. In the following table, if the date was 2015-02-15, I would want to return "mcintosh"; but if I had another record where date was 2015-01-28, I would want to return "red delcious"

 

Table 1: 

start_date                         extra_col1              extra_col2

2015-01-01                      apple                      red delicious

2015-02-01                      apple                      mcintosh

2015-03-01                      apple                      granny smith

11 - Bolide

@cando235,

 

You could achieve this a few ways.  Here's one method I use on occasion:

 

  1. Perform the advanced join as you described
    1. extra_col1 = extra_col1
    2. (Condition) start_date  < date
    3. Use Left Join
  2. Attach a summary tool
    1. Group by extra_col1
    2. Group by date
    3. Select Max start_date as Max_start_date
  3. Using a traditional join tool, inner join the summary results to the results from the advanced join:
    1. assuming the advanced results are LEFT and the summary results are RIGHT
    2. (left) start_date = (right) Max_start_date
    3. (left) extra_col1 = (right) extra_col1
    4. (left) date = (right) extra_col1
  4. Select all of the left fields only

 

Link to Example Workflow (can't add attachments in blog comments for some reason)

Here's a preview of the tool layout:

 

advanced_join.png

7 - Meteor

Does this macro works for  Connect In-DB?It seems only works with "Input Data" tools?

pls let me know.

 

Thanks,

Brij

Alteryx Community Team
Alteryx Community Team

@brij Correct, it only works for the standard in memory tools. For in database joins use the Join In-DB Tool.

5 - Atom

Is there a current version of this Macro which will work with 2018.4?

Alteryx Community Team
Alteryx Community Team

@KelvinK, 5+ years later, the macro is still working fine for me on the latest version of Designer.

5 - Atom

Hi Neil,

 

I am a new user and it's possible I don't understand the real issue.  My org has not updated beyond 2018.4.  When I download the Macro I receive this message:

 

KelvinK_0-1584045619363.png

 

When I attempt to add the macro to a workflow I get this message:

 

KelvinK_1-1584045693682.png

 

I am interpreting this as my version of Designer being too outdated to run the Advanced Join, but this is also my first time trying to install a new macro.  Is there something else I am missing?

 

Thanks,

Kelvin

 

Alteryx Community Team
Alteryx Community Team

@KelvinK Alteryx gives you a warning when you're opening a file that was created using a newer version, but you can usually just click OK and it's fine. Except when it's part of a package. You've now extracted the package and you're macro looks to be sitting in C:\users\...\downloads\advanced+join(1)Advanced Join.yxmc. Try opening this file directly now. It should give you the warning but you should be able to open it.

7 - Meteor

OK, terminally stupid question, but here goes:  How do I turn this Macro into a button (like I see in your example workflow image) in my Join tab?

 

DISREGARD. ID 10 T error.

Alteryx Community Team
Alteryx Community Team

Save the Advanced Join examples.yxzp file attached to the post. Open it in Designer. When you do that you'll be prompted to extract the package to a directory of your choosing:

Capture.PNG

 

Now the macro itself, Advanced Join.yxmc, will be in that folder. Then follow instructions here.

7 - Meteor

Hi Nail,

 

Hope you are doing well.

 

This Macro is very helpful. But I am unable to add this 'Advanced Join' tool in my Alteryx Tools. I am going to your link ( https://gallery.alteryx.com/#!app/Advanced-Join/5d92671b826fd30b8453779e ) for 'Advanced Join' tool in Alteryx Gallery and then I am downloading the tool. But it is not getting added in the Alteryx Tool section. Please help me to add this tool in my Alterys.

 

It will be a great help if you can give me the steps to add the 'Advanced Join' tool in the Alteryx.

 

Regards,

 

Gaurab

 

 
 
 

 

7 - Meteor

gabh

 

As he indicated in the instructions, follow this link for instructions: https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Macro-Installation/ta-p/13032

 

Make sure the macro is within whatever folder you have told Alteryx to use for Macros.

 

Please note the part about how the Macro's Meta Info will determine where the icon shows up in Alteryx.

7 - Meteor

Hi 

 

7 - Meteor

Hi Nail,

 

I have 2 different tables. When I am joining two tables based on a condition, I am getting the following error :

 

'Advanced Join (123) Tool #23: 18 records were generated by an N-Dimensional Join - Join Key: (__a__: 205887102)'

gabh_0-1598912939197.png

What is the meaning of this error and how to resolve it ?

 

Regards,

 

Gaurab

Alteryx Community Team
Alteryx Community Team

Hi @gabh, @jz had the same question above. Here was my answer:

 

For a conventional join the join tool within the macro (tool #23) is set to error on multidimensional joins of more than 16 records to keep the output from blowing up. So you'll either need to address this within your incoming data (for example by removing records that have duplicates of the join key on one side of the join) or you can change the configuration of the join multiple tool #23 within the macro to "Allow all multidimensional joins".

7 - Meteor

Hi Nail,

 

Thanks for the reply. It is working now.

 

Regards,

 

Gaurab