Fuzzy Match between two data sources with ranked criteria

FraM
6 - Meteoroid

Dear Community, 

 

I have two lists of company names that I would like to match. The best unique identifier would be the website, but I do not have a website entry for every company. For companies without a website, I would like to match on the company name instead. The names may be spelled differently in the two lists, and not every name appears in both lists.

 

Input

 

Match order | List 1          | List 2
1           | CompanyWebsite  | CompanyWebsite
2           | CompanyName     | CompanyName
            | No more columns | Columns x-y

 

Desired output 

 

CompanyName from List 1 | Match score with List 2 | CompanyName from List 2 | Columns x-y from List 2
Company 1a              | x%                      | Company 1b              |

 

What I tried so far: 

- Added both lists via data input

- Removed duplicates

- Added a column for source file in each list

- Added a union tool to combine the lists

- Added the Fuzzy Match tool: Merge mode, source file name as the Source ID field, CompanyName as the Record ID field, match style Company Name, and ticked the options to output the match score and unmatched records. This did not work, and it also has no match style for the website, which should be my first match criterion.

 

Any ideas how to approach this? 

ChrisTX
15 - Aurora

Suggestion: use a "waterfall" approach:

1) Exact match on CompanyWebsite using a Join tool, continue processing only unmatched records

2) Exact match on CompanyName using a Join tool, continue processing only unmatched records

3a) Fuzzy Match (tool 1 of 2) on CompanyWebsite

3b) Fuzzy Match (tool 2 of 2) on CompanyName
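
To make the order of the stages concrete, here is a rough sketch of the same waterfall logic in Python, outside Alteryx. It is only an illustration under some assumptions: the function names (normalize, similarity, waterfall_match) and the threshold of 80 are made up for this example, and difflib's SequenceMatcher is a stand-in similarity measure, not the algorithm used by the Fuzzy Match tool. Each stage only sees the records that the earlier stages left unmatched.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace; a missing value becomes ''."""
    return " ".join((text or "").lower().split())


def similarity(a, b):
    """Return a 0-100 similarity score (stand-in for a fuzzy match score)."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return round(100 * ratio, 1)


def waterfall_match(list1, list2, threshold=80.0):
    """Match list1 records against list2 in ranked stages.

    Records are dicts with 'CompanyName' and an optional 'CompanyWebsite'.
    Returns (matches, unmatched); each match is (rec1, rec2, score, stage).
    Simplification: a list2 record may be matched by several list1 records.
    """
    matches = []
    remaining = list(list1)

    # Stages 1 and 2: exact joins, website first, then name.
    for field in ("CompanyWebsite", "CompanyName"):
        lookup = {}
        for rec in list2:
            key = normalize(rec.get(field))
            if key:  # skip records with no value in this field
                lookup.setdefault(key, rec)
        still_unmatched = []
        for rec in remaining:
            hit = lookup.get(normalize(rec.get(field)))
            if hit is not None:
                matches.append((rec, hit, 100.0, "exact " + field))
            else:
                still_unmatched.append(rec)
        remaining = still_unmatched  # only unmatched records continue

    # Stages 3a and 3b: fuzzy match, website first, then name.
    for field in ("CompanyWebsite", "CompanyName"):
        still_unmatched = []
        for rec in remaining:
            value = rec.get(field)
            candidates = [
                (similarity(value, other[field]), other)
                for other in list2
                if value and other.get(field)
            ]
            best = max(candidates, key=lambda c: c[0], default=None)
            if best is not None and best[0] >= threshold:
                matches.append((rec, best[1], best[0], "fuzzy " + field))
            else:
                still_unmatched.append(rec)
        remaining = still_unmatched

    return matches, remaining
```

Each match carries the score and the stage that produced it, so the output can be shaped like the desired table in the question (CompanyName from List 1, match score, CompanyName from List 2, plus the extra List 2 columns). The threshold would need tuning against real data.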

 

Under the Help menu, the workflow for "Merge to a master file with fuzzy matching" is a good example of a waterfall approach.

 

 

Chris
