To conduct fuzzy matching on a column in one set of data vs a list of strings
I am pretty new to Alteryx. I have:
(i) Data set 1 - a client transaction report listing transactions made with third parties. Each row is one transaction, and one of the columns holds the third party's name.
(ii) Data set 2 - a list of third party names
My objective is to see if there are any clients in data set 1 who made transactions with the third parties in data set 2.
However, the third party names in data set 1 and 2 might not match exactly so I would like to apply fuzzy logic in the matching.
May I know how I can get an output with, say, clients in data set 1 who made transactions with third parties whose names match >80% with any of the names in data set 2?
Thanks!
- Labels:
- Fuzzy Match
Hi @kyokoleung ,
You can find out all about Fuzzy Matching in Alteryx at the following links:
https://help.alteryx.com/designer-cloud/fuzzy-match-tool
This is the help documentation for the Fuzzy Matching tool.
https://community.alteryx.com/t5/Videos/Fuzzy-Matching-for-Beginners/td-p/330575
This is a video, Fuzzy Matching for Beginners
You can also open an example workflow from the Fuzzy Match tool itself by clicking "Open Example".
This will open a worked example in a workflow with data, which will give you a great starting point on your fuzzy matching journey!
I hope this helps,
M.
There's a lot of tinkering that goes into a fuzzy match on business names. Let's take a business name and a mall listing as an example. Suppose you have "Rivertown Crossings Mall Coach #123" on a transaction report and try to match that to "COACH". The fuzzy match will NOT find it, because the brand name is only a small part of the full string. What's also true is that the fuzzy match may link "Rivertown Crossings Mall Buckle" to "Rivertown Crossings Mall Coach", because the two listings share most of their characters.
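To make the mall example concrete, here is a minimal sketch in Python. It uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure; that is an assumption for illustration, not the algorithm the Alteryx Fuzzy Match tool runs, so the exact scores will differ from what Alteryx reports. The helper name `similarity` is hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (difflib stand-in, not Alteryx's scoring)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# The full mall listing compared with the bare brand name.
score_brand = similarity("Rivertown Crossings Mall Coach #123", "COACH")

# Two different stores that share the same mall prefix.
score_neighbor = similarity(
    "Rivertown Crossings Mall Buckle",
    "Rivertown Crossings Mall Coach",
)

print(f"listing vs brand:        {score_brand:.2f}")
print(f"Buckle vs Coach listing: {score_neighbor:.2f}")
```

With this measure the first score lands well below an 80% threshold and the second lands above it, which is the opposite of what you want. The shared mall name dominates the comparison in both cases.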
There are plenty of articles about fuzzy match, but I find that you need to look at the data and expect that one size does not fit all. You'll want to JOIN first so the exact matches are cleanly handled. Then maybe use FIND REPLACE to clean up known variations, and finally, with the stragglers, you might find success with one or more fuzzy approaches.
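The staged approach described above (exact match first, then cleanup, then fuzzy on whatever is left) could be sketched outside Alteryx roughly as follows. This is only an illustration under stated assumptions: the sample names, the `normalize` and `match_transactions` helpers, and the 0.8 threshold are all hypothetical, and `difflib` again stands in for the Fuzzy Match tool's own scoring.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Cleanup step: lowercase, drop store numbers and punctuation, collapse spaces."""
    name = re.sub(r"#?\d+", " ", name.lower())
    name = re.sub(r"[^a-z\s]", " ", name)
    return " ".join(name.split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] (difflib stand-in, not Alteryx's scoring)."""
    return SequenceMatcher(None, a, b).ratio()


def match_transactions(transactions, watchlist, threshold=0.8):
    """Return (client, third_party, matched_name, score, stage) for each hit."""
    if not watchlist:
        return []
    exact = {normalize(w): w for w in watchlist}
    hits = []
    for client, third_party in transactions:
        key = normalize(third_party)
        if key in exact:  # stage 1: exact match on the cleaned name
            hits.append((client, third_party, exact[key], 1.0, "exact"))
            continue
        # stage 2: fuzzy match for the stragglers, keeping the best candidate
        best = max(watchlist, key=lambda w: similarity(key, normalize(w)))
        score = similarity(key, normalize(best))
        if score >= threshold:
            hits.append((client, third_party, best, round(score, 2), "fuzzy"))
    return hits


# Hypothetical sample data: (client, third-party name) per transaction.
transactions = [
    ("Client A", "Acme Trading Ltd."),
    ("Client B", "Acme Tradng Ltd"),
    ("Client C", "Globex Corporation #12"),
    ("Client D", "Initech LLC"),
]
watchlist = ["Acme Trading Ltd", "Globex Corporation"]

results = match_transactions(transactions, watchlist)
for row in results:
    print(row)
```

In this sample, Client A and Client C are caught by the exact stage once punctuation and the store number are cleaned away, Client B's misspelling is caught by the fuzzy stage, and Client D is dropped. Doing the exact and cleanup stages first keeps the fuzzy stage small, which makes its results easier to review by hand.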
Business name alone is a big ask for fuzzy. You'll want to use phone and/or address too.
Cheers,
Mark
