
SOLVED

Clean company names with Fuzzy Match, overcome slight variations in name

hellyars
13 - Pulsar

I am trying to clean organization names using Fuzzy Match. My reference point is https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/Tool-Mastery-Fuzzy-Match/ta-p/45485

 

But I am still having problems, so I must be doing something wrong. Fuzzy Match is struggling with variations of an organization's name, such as PROJECT vs. PROJECTS or ADVANCE vs. ADVANCED. I even tried the Text Pre-Processing (Text Mining) tool.

 

In the examples below, I can't get it to generate one common name for each organization.

 

DEFENSE ADVANCE RESEARCH PROJECTS AGENCY 
DEFENSE ADVANCED RESEARCH PROJECT AGENCY 
DEFENSE ADVANCED RESEARCH PROJECTS AGENCY 
DEFENSE FINANCE ACCOUNTING SERVICE 
DEFENSE FINANCE AND ACCOUNTING SERVICE 
1 REPLY
Blake
12 - Quasar

Hey @hellyars 

 

See attached for a Fuzzy Match example that gives results similar to what I think you're looking for. A word of caution: Fuzzy Match is not exact, so scrutinize the results closely before putting this into production.

 

fmwf.png

You'll probably want to adjust some of the Fuzzy Match configuration options to match your datasets and needs, but this should get you started.
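
To see why the match threshold matters so much for names like these, here is a rough Python sketch of the general idea. It is not what the Fuzzy Match tool does internally: it swaps in Python's standard `difflib.SequenceMatcher` as the similarity score, and the function names, the greedy grouping, the 0.9 threshold, and the "longest variant wins" rule are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# The five names from the original question.
NAMES = [
    "DEFENSE ADVANCE RESEARCH PROJECTS AGENCY",
    "DEFENSE ADVANCED RESEARCH PROJECT AGENCY",
    "DEFENSE ADVANCED RESEARCH PROJECTS AGENCY",
    "DEFENSE FINANCE ACCOUNTING SERVICE",
    "DEFENSE FINANCE AND ACCOUNTING SERVICE",
]


def similarity(a, b):
    """Character-level similarity between 0.0 and 1.0 (difflib's ratio)."""
    return SequenceMatcher(None, a, b).ratio()


def group_names(names, threshold=0.9):
    """Greedy grouping (hypothetical helper): each name joins the first
    group whose first member is at least `threshold` similar to it,
    otherwise it starts a new group."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


def canonical(group):
    """Assumed rule: use the longest variant as the common name."""
    return max(group, key=len)


if __name__ == "__main__":
    for group in group_names(NAMES, threshold=0.9):
        print(canonical(group), "<-", group)
```

With a threshold of 0.9 the five names fall into two groups, while raising it to 0.98 splits ADVANCE/ADVANCED and the missing AND back into separate groups. That is the same kind of sensitivity you will see when tuning the match threshold in the tool, so it is worth testing a few values against names you know should and should not match.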

 

Thanks, good luck! 

 
