ALTERYX INSPIRE | Join us this May for for a multi-day virtual analytics + data science experience like no other! Register Now

Alteryx Designer Knowledge Base

Definitive answers from Designer experts.

Fewer Than Expected Records Output from Join

DanC
Alteryx
Alteryx
Created
 

 

One common reason fewer records than expected come out of a join can be due to mixed cases within the data of the join fields between the two datasets. For example, and as shown below, "Orange" from the first dataset is not going to join to "orange" in the second dataset. And this does not just apply to the leading character. Another common reason can be due to the existence of trailing or leading spaces. As also shown in the example below, "Plum " is not going to join with "Plum". The little red triangle in the upper right-hand corner of "Plum" indicates that there are leading and/or trailing spaces.

 

 

Example_FullSize.png

 

 

 A simple method to get around this is to attach a formula tool after each Input tool and use the following formula for, in this example, the field "Fruit":

 

TRIM(UPPERCASE([Fruit]))

 

This will convert all records for this field in each dataset to upper case, while also removing any leading or trailing spaces, allowing for a more successful join. If you are joining by multiple fields, then simply use the Multi-Field Formula Tool to convert all of the fields.

  

In a related issue, you might find yourself getting too many records than you were expecting. If this is the case, check out this Community article about inadvertently getting duplicates when joining.

 

Thank you!

Comments
iludeiro
5 - Atom

Very helpful! 

I tried this in a workflow that I had created and I was able to join more records. However, there are still a few that should be joined, based on name field, but I am not sure what tool to use to join (vlookup) the following inputs:

 

Scenario 1:

Input 1: Alexandra Smith

Input 2: Alex Smith

 

The "J" anchor is not listing Alexandra, however, it should since it is the same person. Similar situation happens for the following scenarios:

 

Scenario 2: 

Input 1: Ron McDonald

Input 2: Ron Mc Donald

 

Scenario 3:

Input 1: Lee-Va Tran

Input 2: Lee Va Tran

 

Scenario 4:

Input 1: Hellen Garcia de la Guardia

Input 2: Hellen Garcia DLG

 

Scenario 5:

Input 1: Lili Smith

Input 2: Lilia Smith

 

Scenario 6:

Input 1: Lauren Smith

Input 2: Lauren Smith-Gonzalez

 

Scenario 7:

Input 1: Maria de Dolores Cruz

Input 2: Maria Dolores Cruz

 

Scenario 8:

Input 1: Allison Brown

Input 2: Ally Brown

 

Could you please advise what is the best way to perfom this vlookup when the fields are not exactly identical between two sources?

 

BrandonKO
Alteryx
Alteryx

Hey iludeiro, I will send you some follow up resources. But for the benefit of others who might have the same question, the tool you would be looking for is the Fuzzy Match tool. The Fuzzy Match gives you the ability to join fields together when their names/values are different. It’s kind of an art and science, in the sense that you configure the tool to try and get you the best match rate based on the different matching algorithms in the tool. Here is a link to a Community article about it!

 

https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Tool-Mastery-Fuzzy-Match/ta-p/45485