Alteryx Designer Desktop Discussions

ravikumar060987 · ‎07-20-2022

Dear All,

I need to compare two columns that contain company names and determine how many match and how many do not, based on a 'Match Score'.

Example:

Name	Name2
ABC LIMITED	ABC LTD
XYZ PRIVATE LTD	XYZ PVT LTD
123 PUBLIC LTD	123 PUBLIC LTD

I need output like the following:

Name	Name2	Match Score
ABC LIMITED	ABC LTD	90
XYZ PRIVATE LTD	XYZ PVT LTD	90
123 PUBLIC LTD	123 PUBLIC LTD	100

What tool will generate this output, and how should it be configured? Kindly assist. I tried the Fuzzy Match tool, but had no luck.

IraWatt · ‎07-20-2022

Hey @ravikumar060987,

Here is one way to do this:

I checked the example workflow here:

In their example, they put everything in one column to match on companies.

Any questions or issues please ask :)
HTH!
Ira

ravikumar060987 · ‎07-20-2022

@IraWatt - Thanks for the quick update.

Quick clarification: why is there a duplicate value for the second and third rows, but not the first?

christine_assaad · ‎07-20-2022

Hello @ravikumar060987

I did it using Fuzzy Match. See below

This video also explains the process: https://community.alteryx.com/t5/Archived-Training/Fuzzy-Matching-Intermediate-Users/m-p/43852

Cheers!

mbarone · ‎07-20-2022

Hi @ravikumar060987 ,

The Alteryx Academy is a great place to look for content on how to use some of the more advanced tools, like the Fuzzy Match tool. It will indeed give you what you want, but it is a difficult tool to master and will take some effort to learn. Whenever I use it, I have to refresh myself on it using some of the great free resources Alteryx provides.

Here are just a few:

On the tool itself in the Join palette, you can click on it and select "Open Example".
HERE is a great video on the tool.
And HERE is a section of the Tool Mastery Index which is also a great place to look for help on certain tools.

Hopefully this gets you on your way; cheers!

IraWatt · ‎07-20-2022

@ravikumar060987 I think it's because they have different match keys (I'm not a huge expert on matching):

However, a simple Summarize tool can fix it:

christine_assaad · ‎07-20-2022

Fuzzy Match can return several rows for the same name, one for each other name it matches, each with its own Match Score. That's why it's a best practice to sort "Match Score" in descending order, then add a Unique tool to keep only the ID/name with the highest score.
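For anyone who wants to see the sort-then-unique logic spelled out, here is a minimal Python sketch of the same idea outside Alteryx. The `keep_best_match` function and the sample rows (including the 55 and 60 scores) are hypothetical and only illustrate the pattern; they are not output from the Fuzzy Match tool.

```python
from typing import List, Tuple

Row = Tuple[str, str, int]  # (name, matched name, match score)


def keep_best_match(rows: List[Row]) -> List[Row]:
    """Sort by score descending, then keep the first row seen per name."""
    # Equivalent of a Sort tool on "Match Score" in descending order.
    ordered = sorted(rows, key=lambda r: r[2], reverse=True)

    # Equivalent of a Unique tool on the name column: first occurrence wins,
    # which after the sort is the highest-scoring match for that name.
    seen = set()
    best = []
    for name, candidate, score in ordered:
        if name not in seen:
            seen.add(name)
            best.append((name, candidate, score))
    return best


# Hypothetical rows, with extra lower-scoring matches for two of the names.
rows = [
    ("ABC LIMITED", "ABC LTD", 90),
    ("XYZ PRIVATE LTD", "123 PUBLIC LTD", 55),
    ("XYZ PRIVATE LTD", "XYZ PVT LTD", 90),
    ("123 PUBLIC LTD", "123 PUBLIC LTD", 100),
    ("123 PUBLIC LTD", "XYZ PVT LTD", 60),
]

for row in keep_best_match(rows):
    print(row)
```

Each name appears once in the result, paired with its highest-scoring match.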

ravikumar060987 · ‎07-20-2022

@christine_assaad, thank you for the quick update.

However, I have the source data in a tool, and because the data set is so large, switching to another input file is not an option.

Is there another way to get this done quickly?

ravikumar060987 · ‎07-20-2022

@IraWatt - That's a good piece of information.

christine_assaad · ‎07-20-2022

Hi @ravikumar060987

In this case you can use Fuzzy Match in Purge mode. Purge is used for deduping when all records are coming from the same source.

The process will look similar to what @IraWatt sent. It's attached as well.
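As a rough cross-check outside Alteryx, the original ask (one score per row for Name vs. Name2) can be sketched in Python. This is a different technique from Fuzzy Match: it uses the standard library's `difflib.SequenceMatcher`, so the numbers will not equal Alteryx's Match Score, and the `match_score` helper is a hypothetical name used only for illustration.

```python
from difflib import SequenceMatcher


def match_score(a: str, b: str) -> int:
    """Return a 0-100 similarity score between two names.

    Assumption: names are compared case-insensitively after trimming
    surrounding whitespace. This is difflib's ratio, not Alteryx's score.
    """
    ratio = SequenceMatcher(None, a.strip().upper(), b.strip().upper()).ratio()
    return round(ratio * 100)


# The sample pairs from the question, compared row by row.
pairs = [
    ("ABC LIMITED", "ABC LTD"),
    ("XYZ PRIVATE LTD", "XYZ PVT LTD"),
    ("123 PUBLIC LTD", "123 PUBLIC LTD"),
]

for name, name2 in pairs:
    print(name, name2, match_score(name, name2), sep="\t")
```

Identical names score 100; anything lower shows how far the two spellings differ under this particular measure.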

Comparing the 2 names