Alteryx Designer Desktop Discussions

To conduct fuzzy matching on a column in one set of data vs a list of strings

kyokoleung
5 - Atom

I am pretty new to Alteryx. I have:

(i) Data set 1: a client transaction report covering transactions made with third parties. Each row is one transaction, and one of the columns holds the third party's name.

(ii) Data set 2: a list of third party names.

My objective is to see whether any clients in data set 1 made transactions with the third parties in data set 2.

However, the third party names in data sets 1 and 2 might not match exactly, so I would like to apply fuzzy logic in the matching.

How can I get an output showing, say, the clients in data set 1 who made transactions with third parties whose names match any of the names in data set 2 by more than 80%?

 

Thanks!

2 REPLIES
mceleavey
17 - Castor

Hi @kyokoleung ,

 

You can find out all about Fuzzy Matching in Alteryx at the following links:

 

https://help.alteryx.com/designer-cloud/fuzzy-match-tool

This is the help documentation for the Fuzzy Matching tool.

 

https://community.alteryx.com/t5/Videos/Fuzzy-Matching-for-Beginners/td-p/330575

This is a video, "Fuzzy Matching for Beginners".

 

You can also open an example workflow from the Fuzzy Match tool itself by clicking "Open Example":

 

[Screenshot: mceleavey_0-1649243333407.png]

 

This will open a worked example in a workflow with data, which will give you a great starting point on your fuzzy matching journey!

 

I hope this helps,

 

M.



MarqueeCrew
20 - Arcturus

@kyokoleung ,

 

There's a lot of tinkering that goes into a fuzzy match on business names. Let's take a business name and a mall listing as an example. Suppose you have "Rivertown Crossings Mall Coach #123" on a transaction report and try to match that to "COACH". The fuzzy match will NOT find it, because most of the longer string has nothing to do with the short one. What's also true is that the fuzzy match may link "Rivertown Crossings Mall Buckle" to "Rivertown Crossings Mall Coach", because the shared mall name makes up most of both strings.
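To make the mall example concrete outside of Alteryx, here is a minimal Python sketch. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity score; that is an assumption for illustration only and not the algorithm the Fuzzy Match tool uses, so the exact numbers will differ in Designer. The `similarity` helper is a hypothetical name.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# The brand name is buried inside a much longer mall listing.
short_vs_long = similarity("Rivertown Crossings Mall Coach #123", "COACH")

# Two different stores that share the same long mall prefix.
same_mall = similarity(
    "Rivertown Crossings Mall Buckle",
    "Rivertown Crossings Mall Coach",
)

print(f"Coach listing vs COACH: {short_vs_long:.2f}")
print(f"Buckle vs Coach in the same mall: {same_mall:.2f}")
```

With this stand-in score, the genuine Coach match falls well below an 80% threshold while the two different stores in the same mall land above it, which is why a single threshold on the raw names is risky.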

 

There are plenty of articles about fuzzy match, but I find that you need to look at the data and expect that one size does not fit all. You'll want to use a JOIN first so the exact matches are handled cleanly. Next, maybe use FIND REPLACE to catch names that contain a known string. Finally, with the stragglers, you might find success with one or more fuzzy approaches.
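The staged idea (exact matches first, fuzzy only for the stragglers) can be sketched in Python as follows. This is a rough illustration under stated assumptions, not an Alteryx workflow: `normalize` and `match_transactions` are hypothetical helpers, the sample names are made up, `difflib.SequenceMatcher` stands in for whatever match style you pick in the Fuzzy Match tool, and the 0.8 threshold mirrors the 80% in the question.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def match_transactions(transactions, watchlist, threshold=0.8):
    """Return (client, third_party, matched_name, score) for each hit."""
    lookup = {normalize(n): n for n in watchlist}
    results = []
    for client, third_party in transactions:
        key = normalize(third_party)
        # Stage 1: exact match on the cleaned name (the "join" step).
        if key in lookup:
            results.append((client, third_party, lookup[key], 1.0))
            continue
        # Stage 2: fuzzy match only for the stragglers.
        best_name, best_score = None, 0.0
        for norm, original in lookup.items():
            score = SequenceMatcher(None, key, norm).ratio()
            if score > best_score:
                best_name, best_score = original, score
        if best_score >= threshold:
            results.append((client, third_party, best_name, round(best_score, 2)))
    return results


# Made-up sample data: data set 1 rows and the data set 2 name list.
transactions = [
    ("Client A", "Acme Trading Ltd."),
    ("Client B", "Acme Tradng Ltd"),
    ("Client C", "Zenith Holdings"),
]
watchlist = ["ACME TRADING LTD", "Globex Corporation"]

results = match_transactions(transactions, watchlist)
for row in results:
    print(row)
```

In this sample, Client A is caught by the exact stage after cleaning, Client B is caught by the fuzzy stage despite the typo, and Client C is left out. Keeping the exact stage separate means the fuzzy threshold only has to be tuned against the names that really need it.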

 

Business name alone is a big ask for fuzzy matching. You'll want to use phone and/or address too.

 

Cheers,

 

Mark

Alteryx ACE & Top Community Contributor
