Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
NeilR
Alteryx Alumni (Retired)

Have you ever noticed that you can't do a left join or a right join in the traditional sense in a single step with Alteryx? Normally to accomplish this you'd add a union tool after the join like so:

 Left join

I thought it would be nice to reduce this to one step, as pointed out here. While I was at it, I thought I'd add some functionality to filter the data in the same step, reducing the need to add a filter tool after the join, as is often the case. Sandeep expressed an interest here in being able to do this with date ranges. 

 

Attached are some example uses of the macro I came up with to accomplish this. The first example shows a left join. The second example, "Inner join with condition (date range)", performs an inner join on a certain column, then filters the joined dataset by checking if a date from one incoming dataset falls between two dates from the other. In SQL you would accomplish the same thing like so:

 

SQL join

 

In SAS (assuming the datasets are sorted properly and the join by columns have the same name), you'd do this:

 

SAS join

 

Another thing you can do with this macro is a cross join (Cartesian product of rows from incoming data) with an optional filter condition. But be careful, before applying the filter, the resulting join will contain as many rows as the product of the number of rows of the incoming datasets. Meaning if you're joining 2 datasets that each contain a million records, the resulting dataset will contain one trillion records! A potential use for the cross join is illustrated in the third example.

 

Feel free to download the macro and tailor it to suit your needs.

 

-Neil

Neil Ryan
Sr Program Manager, Community Content

Neil Ryan (he/him) is the Sr Manager, Community Content, responsible for the content in the Alteryx Community. He held previous roles at Alteryx including Advanced Analytics Product Manager and Content Engineer, and had prior gigs doing fraud detection analytics consulting and creating actuarial pricing models. Neil's industry experience and technical skills are wide ranging and well suited to drive compelling content tailored for Community members to rank up in their careers.

Neil Ryan (he/him) is the Sr Manager, Community Content, responsible for the content in the Alteryx Community. He held previous roles at Alteryx including Advanced Analytics Product Manager and Content Engineer, and had prior gigs doing fraud detection analytics consulting and creating actuarial pricing models. Neil's industry experience and technical skills are wide ranging and well suited to drive compelling content tailored for Community members to rank up in their careers.

Comments
s_pichaipillai
12 - Quasar

Neil

This is a Awsome Macro :)

thanks 

jz
7 - Meteor

What does this mean:

Error: Advanced Join (27): Tool #23: 63 records were generated by an N-Dimensional Join - Join Key: (__a__: 1009:2)

 

i get this error when doing a conventional join with a condition.

NeilR
Alteryx Alumni (Retired)

@jz For a conventional join the join tool within the macro (tool #23) is set to error on multidimensional joins of more than 16 records to keep the output from blowing up. So you'll either need to address this within your incoming data (for example by removing records that have duplicates of the join key on one side of the join) or you can change the configuration of the join multiple tool #23 within the macro to "Allow all multidimensional joins".

jz
7 - Meteor

Thnx Neal!

Flanigan
5 - Atom

Thanks, I needed this one. May sound obvious, but with no dates I used condition 1=1. May not be needed, but I love a good combinatorial explosion.

Coxta45
11 - Bolide

Very useful!  I've needed this for joining on date logic for a while now.

davidhenington
10 - Fireball

this is fantastic! 

 

What if i want to join on a date condition as shown here AND join by, say, a "group id"? 

 

Do the join tool on group id first and then the advanced join back to the prior input? 

NeilR
Alteryx Alumni (Retired)

The second example in the package attached to the post is doing just that, I believe...

Capture.PNG

cando235
5 - Atom

Hi all - 

 

@NeilR, Thanks for this macro; it's very useful!

 

Is there a syntax with which we can set the condition to return the record with the maximum start_date, in the event that multiple start_date's are < date?

 

i.e. In the following table, if the date was 2015-02-15, I would want to return "mcintosh"; but if I had another record where date was 2015-01-28, I would want to return "red delcious"

 

Table 1: 

start_date                         extra_col1              extra_col2

2015-01-01                      apple                      red delicious

2015-02-01                      apple                      mcintosh

2015-03-01                      apple                      granny smith

Coxta45
11 - Bolide

@cando235,

 

You could achieve this a few ways.  Here's one method I use on occasion:

 

  1. Perform the advanced join as you described
    1. extra_col1 = extra_col1
    2. (Condition) start_date  < date
    3. Use Left Join
  2. Attach a summary tool
    1. Group by extra_col1
    2. Group by date
    3. Select Max start_date as Max_start_date
  3. Using a traditional join tool, inner join the summary results to the results from the advanced join:
    1. assuming the advanced results are LEFT and the summary results are RIGHT
    2. (left) start_date = (right) Max_start_date
    3. (left) extra_col1 = (right) extra_col1
    4. (left) date = (right) extra_col1
  4. Select all of the left fields only

 

Link to Example Workflow (can't add attachments in blog comments for some reason)

Here's a preview of the tool layout:

 

advanced_join.png

brij
7 - Meteor

Does this macro works for  Connect In-DB?It seems only works with "Input Data" tools?

pls let me know.

 

Thanks,

Brij

NeilR
Alteryx Alumni (Retired)

@brij Correct, it only works for the standard in memory tools. For in database joins use the Join In-DB Tool.

KelvinK
5 - Atom

Is there a current version of this Macro which will work with 2018.4?

NeilR
Alteryx Alumni (Retired)

@KelvinK, 5+ years later, the macro is still working fine for me on the latest version of Designer.

KelvinK
5 - Atom

Hi Neil,

 

I am a new user and it's possible I don't understand the real issue.  My org has not updated beyond 2018.4.  When I download the Macro I receive this message:

 

KelvinK_0-1584045619363.png

 

When I attempt to add the macro to a workflow I get this message:

 

KelvinK_1-1584045693682.png

 

I am interpreting this as my version of Designer being too outdated to run the Advanced Join, but this is also my first time trying to install a new macro.  Is there something else I am missing?

 

Thanks,

Kelvin

 

NeilR
Alteryx Alumni (Retired)

@KelvinK Alteryx gives you a warning when you're opening a file that was created using a newer version, but you can usually just click OK and it's fine. Except when it's part of a package. You've now extracted the package and you're macro looks to be sitting in C:\users\...\downloads\advanced+join(1)Advanced Join.yxmc. Try opening this file directly now. It should give you the warning but you should be able to open it.

jrobiso2
8 - Asteroid

OK, terminally stupid question, but here goes:  How do I turn this Macro into a button (like I see in your example workflow image) in my Join tab?

 

DISREGARD. ID 10 T error.

NeilR
Alteryx Alumni (Retired)

Save the Advanced Join examples.yxzp file attached to the post. Open it in Designer. When you do that you'll be prompted to extract the package to a directory of your choosing:

Capture.PNG

 

Now the macro itself, Advanced Join.yxmc, will be in that folder. Then follow instructions here.

Gaurab_Bhattacharya
7 - Meteor

Hi Nail,

 

Hope you are doing well.

 

This Macro is very helpful. But I am unable to add this 'Advanced Join' tool in my Alteryx Tools. I am going to your link ( https://gallery.alteryx.com/#!app/Advanced-Join/5d92671b826fd30b8453779e ) for 'Advanced Join' tool in Alteryx Gallery and then I am downloading the tool. But it is not getting added in the Alteryx Tool section. Please help me to add this tool in my Alterys.

 

It will be a great help if you can give me the steps to add the 'Advanced Join' tool in the Alteryx.

 

Regards,

 

Gaurab

 

 
 
 

 

jrobiso2
8 - Asteroid

gabh

 

As he indicated in the instructions, follow this link for instructions: https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Macro-Installation/ta-p/13032

 

Make sure the macro is within whatever folder you have told Alteryx to use for Macros.

 

Please note the part about how the Macro's Meta Info will determine where the icon shows up in Alteryx.

Gaurab_Bhattacharya
7 - Meteor

Hi 

 

Gaurab_Bhattacharya
7 - Meteor

Hi Nail,

 

I have 2 different tables. When I am joining two tables based on a condition, I am getting the following error :

 

'Advanced Join (123) Tool #23: 18 records were generated by an N-Dimensional Join - Join Key: (__a__: 205887102)'

gabh_0-1598912939197.png

What is the meaning of this error and how to resolve it ?

 

Regards,

 

Gaurab

NeilR
Alteryx Alumni (Retired)

Hi @Gaurab_Bhattacharya, @jz had the same question above. Here was my answer:

 

For a conventional join the join tool within the macro (tool #23) is set to error on multidimensional joins of more than 16 records to keep the output from blowing up. So you'll either need to address this within your incoming data (for example by removing records that have duplicates of the join key on one side of the join) or you can change the configuration of the join multiple tool #23 within the macro to "Allow all multidimensional joins".

Gaurab_Bhattacharya
7 - Meteor

Hi Nail,

 

Thanks for the reply. It is working now.

 

Regards,

 

Gaurab

SeanAdams
17 - Castor
17 - Castor

Hey @NeilR - any chance we can get this into the main product?

If it helps - I'll submit an idea to this effect under the Ideas board?

NeilR
Alteryx Alumni (Retired)

@SeanAdams I wonder what kind of interest there is to include this type of functionality in Designer OOTB. When I wrote this post I was coming from a job where I coded in SAS/SQL regularly. Seven years later I find my brain works differently - defaulting to visual workflows and only thinking in SQL left/right joins if forced to. Point being, I don't find this macro as useful any more, but would be curious what the community says with their Ideas votes.  

SeanAdams
17 - Castor
17 - Castor

@NeilR - I get what you're saying, once you're used to the way that Alteryx does things, you find that the ways around things become habitual.

However - I find myself needing to do outer joins on a very frequent basis - so the Join + two formula tools + Union + Select becomes a common pattern.     Even if you don't adjust the fields - this becomes a relatively long series of steps to achieve a simple outcome.

 

I'll do a post on this in the ideas section to make the case, and see how many votes we get!

 

 

Hamder83
11 - Bolide

Hi @NeilR 

I just learned about this macro, and it works great :) 


is there any way to have what matches and what does not? Like in a reguar join.

I just tested with a simple customer number and a data I wanna compare into a start and end date.:

Hamder83_0-1646225135914.png

 



MiriamFucek
5 - Atom

Is there a solution in case you need to merge dataset by multiple fields? (not just one ID number)