Alteryx Designer Desktop Discussions

Fuzzy Matching Names from 2 Dirty Datasets

kevin0109
5 - Atom

Dear Alteryx Fam,

I am seeking assistance with fuzzy matching names between two datasets. My workflow returns an incorrect number of matched records; there should be 80 matched records.

I would greatly appreciate your support in resolving this. My workflow is attached for your reference.

Thank you in advance for your help.

2 REPLIES
Zok
8 - Asteroid

Hello,

You have 144 matches, but only 68 are unique.

If you know which 12 are missing, you can examine those records to see what the problem is.
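To illustrate the check described above outside of Alteryx, here is a minimal Python sketch. It counts match rows versus unique matched records and lists the expected records that never matched. The function name, IDs, and sample data are hypothetical and are not taken from the attached workflow.

```python
def summarize_matches(match_pairs, expected_ids):
    """Return (set of uniquely matched IDs, sorted list of unmatched expected IDs)."""
    # A record that matches several candidates appears in several rows,
    # so collapse the rows to the distinct left-hand record IDs.
    matched = {left_id for left_id, _ in match_pairs}
    missing = sorted(set(expected_ids) - matched)
    return matched, missing


# Hypothetical data: 5 match rows, but record 1 matched three candidates.
match_pairs = [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (3, "e")]
expected_ids = [1, 2, 3, 4, 5]

matched, missing = summarize_matches(match_pairs, expected_ids)
print(len(match_pairs), "match rows,", len(matched), "unique records")
print("unmatched expected records:", missing)
```

The unmatched list is the place to start looking: those names usually show which spelling or formatting differences the current settings fail to bridge.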

mceleavey
17 - Castor

Hi @kevin0109,

I've attached the workflow with a few tweaks:

mceleavey_0-1681465375976.png

I've changed how the Source and RecordID fields are used. I've also switched to the merge method, with a source field created on each incoming stream to identify the records as coming from different sources.
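For readers who want to see the idea in code, the sketch below mimics this setup in plain Python: both inputs are stacked into one list, each record gets a record ID and a source tag, and only pairs from different sources are compared. This is an illustration of the concept, not how the Fuzzy Match tool is implemented; the field names and sample names are assumptions.

```python
from itertools import combinations


def stack_sources(named_sources):
    """Stack several name lists into one list of records tagged with their source."""
    stacked = []
    for source, names in named_sources:
        for name in names:
            # record_id is unique across the combined stream.
            stacked.append(
                {"record_id": len(stacked) + 1, "source": source, "name": name}
            )
    return stacked


def cross_source_pairs(records):
    """Yield pairs of records that come from different sources only."""
    for a, b in combinations(records, 2):
        if a["source"] != b["source"]:
            yield a, b


# Hypothetical sample data for two dirty name lists.
left = ["Jon Smith", "Ann Lee"]
right = ["John Smith", "Anne Lee", "Bob Ray"]

stacked = stack_sources([("A", left), ("B", right)])
pairs = list(cross_source_pairs(stacked))
print(len(stacked), "records,", len(pairs), "cross-source pairs")
```

Without the source tag, records would also be compared against others in their own dataset, which inflates the match count with within-source duplicates.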

I have changed the algorithms used in the Fuzzy Match tool to focus on names, and slightly adjusted the thresholds:

mceleavey_1-1681465475078.png
mceleavey_2-1681465498339.png

This also returns the non-identical matches, along with their match scores against records from the other source.

mceleavey_3-1681465602417.png
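As a rough illustration of threshold-based name matching, here is a Python sketch that scores each name against the names from the other source and keeps the best candidate when it clears a threshold. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure; it is not the algorithm the Fuzzy Match tool uses, and the threshold of 80 and the sample names are assumptions rather than the settings in the screenshots.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop commas and periods, and collapse repeated whitespace."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def name_score(a, b):
    """Similarity between two names on a 0-100 scale."""
    return 100 * SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_matches(left, right, threshold=80.0):
    """For each left name, keep its best right-hand match if it clears the threshold."""
    results = []
    for left_name in left:
        scored = [(name_score(left_name, r), r) for r in right]
        score, best = max(scored, default=(0.0, None))
        if best is not None and score >= threshold:
            results.append((left_name, best, round(score, 1)))
    return results


# Hypothetical sample data.
left = ["Jon Smith", "Zed Q"]
right = ["John Smith", "Bob Ray"]

for left_name, right_name, score in best_matches(left, right):
    print(left_name, "->", right_name, score)
```

Lowering the threshold returns more non-identical matches at the cost of more false positives, so it is worth reviewing the scores of borderline pairs before settling on a value.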

 

I've attached the workflow for you.

I hope this helps,

M.