Dear Community,
I have two lists with company names that I would like to match. The 'best' unique identifier would be the website, but I do not have a website entry for all companies. So for those without a website I would like to match via the the company names (different spellings, not all names incluced in both lists).
Input
Match order | List 1 | List 2 |
1 | CompanyWebsite | CompanyWebsite |
2 | CompanyName | CompanyName |
No more columns | Columns x-y |
Desired output
CompanyName from List 1 | Match score with list 2 | CompanyName from List 2 | Columns x-y from List 2 |
Company 1a | x% | Company 1b |
What I tried so far:
- Added both lists via data input
- Removed duplicates
- Added a column for source file in each list
- Added a union tool to combine the lists
- Added the fuzzy match tool: Merge, source file names as source ID field, companyName as record ID field, match style company name, ticked output match score & unmatched records -> did not work and also missing the match style for website as this should be my first match criterium
Any ideas how to approach this?
Suggestion: use a "waterfall" approach:
1) Exact match on CompanyWebsite using a Join tool, continue processing only unmatched records
2) Exact match on CompanyName using a Join tool, continue processing only unmatched records
3a) Fuzzy Match (tool 1 of 2) on CompanyWebsite
3b) Fuzzy Match (tool 2 of 2) on CompanyName
Under the Help menu, the workflow for "Merge to a master file with fuzzy matching" is a good example of a waterfall approach.
Chris
Hi @FraM - Here is a training video that may help: https://community.alteryx.com/t5/Videos/Fuzzy-Matching-for-Beginners/td-p/330575