Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Fuzzy Match between two data sources with ranked criteria

FraM
6 - Meteoroid

Dear Community, 

 

I have two lists with company names that I would like to match. The 'best' unique identifier would be the website, but I do not have a website entry for all companies. So for those without a website I would like to match via the the company names (different spellings, not all names incluced in both lists). 

 

Input

 

Match orderList 1List 2
1CompanyWebsiteCompanyWebsite
2CompanyNameCompanyName
 No more columnsColumns x-y 

 

Desired output 

 

CompanyName from List 1Match score with list 2CompanyName from List 2Columns x-y from List 2
Company 1ax%Company 1b 

 

What I tried so far: 

- Added both lists via data input

- Removed duplicates

- Added a column for source file in each list

- Added a union tool to combine the lists

- Added the fuzzy match tool: Merge, source file names as source ID field, companyName as record ID field, match style company name, ticked output match score & unmatched records -> did not work and also missing the match style for website as this should be my first match criterium 

 

Any ideas how to approach this? 

2 REPLIES 2
ChrisTX
15 - Aurora

Suggestion: use a "waterfall" approach:

1) Exact match on CompanyWebsite using a Join tool, continue processing only unmatched records

2) Exact match on CompanyName using a Join tool, continue processing only unmatched records

3a) Fuzzy Match (tool 1 of 2)  on CompanyWebsite

3b) Fuzzy Match (tool 2 of 2)  on CompanyName

 

Under the Help menu, the workflow for "Merge to a master file with fuzzy matching" is a good example of a waterfall approach.

 

ChrisTX_0-1656003617649.png

 

Chris

ArtApa
Alteryx
Alteryx
Labels