Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Comparing Records between two columns for similarity

JR1
7 - Meteor

Hello,

 

I was wondering what tool I should use for obtaining similarity between two fields.  I know that you can use an IF conditional, but this will only put out T or F if  it is an exact match.  I want to be able to compare the first record from one column with the first record from another column and see what the resulting match percentage is.

 

Example:

ABC ABC, 123  
DEF DEF  
GHI GIH  

 

Result:

ABC ABC, 123 80%
DEF DEF 100%
GHI GIH 90%

 

I know that Fuzzy Match allows you to compare records from one column.  I was wondering if there was something similar with a different tool, or if you can achieve this with Fuzzy Match.

4 REPLIES 4
DultonM
11 - Bolide

Hi @JR1! Fuzzy matching can compare data from 2 columns too! Merge Mode in the tool allows you to compare records from different sources. The attached workflow accomplishes your question using Fuzzy Matching, giving all 3 rows a score! Because Fuzzy Matching is meant to exclude pairs that don't match very well, I had to configure the Fuzzy Match tool in a very particular way that opens up the tool so all rows are returned. This introduces a few extra tools that may not be intuitive, so I added a lot of comments. Let me know if you would like some more explanation on how Fuzzy Matching in Alteryx works and why the work-arounds in the workflow were necessary. I hope this solved your question!

JR1
7 - Meteor

Hi @DultonM,

 

The solution you provided works great, and is very well explained!  Thank you very much!

AshishD
7 - Meteor

Hello DultonM,

 

Thank you so much for this self-explanatory workflow. This is almost what I was looking for.

The only problem I am having it to properly configure the Fuzzy tool for strings comparison in my case.

 

I have attached the sample file and the scores. As you can see that the scores do not reflect the correct matching in a few cases.

Is there any better way to make the Fuzzy tool work on this sort of data?

 

Thanks & Regards,

Ashish

Phill
8 - Asteroid

Thanks @DultonM your workflow is really helpful. I'm just trying to modify it so that I can compare all the data fields between two columns. Would that just be flipping it to purge? 

Labels