
Alteryx Designer Desktop Discussions


Trying to do a Fuzzy Lookup with two files to identify similar names

ferab91966
6 - Meteoroid

Hi, I'm attempting to do a fuzzy lookup with two different files to identify similar names. I'd like to show the similar names by match percentage and then include the sign up date in the final outcome.

 

List 1

Name
Bobbie Smith
Pete Townsend
Will Ferrell
Santa Claus
Freddie Mercury
Sara Stone

 

List 2

Name | Sign Up Date
Mercury, Fred | 9/5/1955
Stone, Lauren S. | 11/27/1972
Townsend, Peter | 5/7/1982
Smith, Robert | 1/4/1989
Ferrell, John W. | 8/2/2005
Rabbit, Peter | 10/3/1995

 

Desired Outcome (after the Fuzzy Lookup, showing only matches over 75%, sorted by match %)

Name 1 | Name 2 | Match % | Sign Up Date
Pete Townsend | Townsend, Peter | 95 | 5/7/1982
Freddie Mercury | Mercury, Fred | 95 | 9/5/1955
Will Ferrell | Ferrell, John W. | 90 | 8/2/2005
Bobbie Smith | Smith, Robert | 85 | 1/4/1989
Sara Stone | Stone, Lauren S. | 80 | 11/27/1972

 

I tried doing this with the "Union" tool followed by "Fuzzy Match", but I'm concerned that, because the Union tool stacks both lists into one data set, my output includes matches within the same list as well as duplicates. I'm very new to Alteryx and still learning all the different tools.
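To illustrate what "compare list 1 against list 2 only" means, here is a minimal Python sketch. It is not how the Fuzzy Match tool works internally: it uses the standard-library difflib.SequenceMatcher as a stand-in scorer, and the helper names (normalize, similarity, cross_match) are made up for this example. The scores it produces will not necessarily equal the illustrative percentages in the desired outcome above.

```python
from difflib import SequenceMatcher

list1 = ["Bobbie Smith", "Pete Townsend", "Will Ferrell",
         "Santa Claus", "Freddie Mercury", "Sara Stone"]
list2 = [("Mercury, Fred", "9/5/1955"), ("Stone, Lauren S.", "11/27/1972"),
         ("Townsend, Peter", "5/7/1982"), ("Smith, Robert", "1/4/1989"),
         ("Ferrell, John W.", "8/2/2005"), ("Rabbit, Peter", "10/3/1995")]

def normalize(name):
    """Lowercase and turn 'Last, First' into 'first last'."""
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first.strip()} {last.strip()}"
    return name.lower()

def similarity(a, b):
    """Similarity of two names as a whole number from 0 to 100."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return round(100 * ratio)

def cross_match(left, right, threshold=75):
    """Score every left name against every right name, never left vs. left."""
    rows = []
    for name1 in left:
        for name2, signup in right:
            score = similarity(name1, name2)
            if score >= threshold:
                rows.append((name1, name2, score, signup))
    # Highest score first, as in the desired outcome
    return sorted(rows, key=lambda r: r[2], reverse=True)

for row in cross_match(list1, list2):
    print(row)
```

Because the two lists stay separate inputs, a name from list 1 is never scored against another name from list 1, which is the concern with stacking both lists first.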

2 REPLIES
ferab91966
6 - Meteoroid

Okay, I think this example (https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Fuzzy-Match-Merge-Mode-against-two-Dat...) answers my question. I added the record ID and joins and this basically gives me what I was looking for. 

 

When using the "Union" tool, is the fuzzy match comparing records across the two sources, or is it just finding duplicates within the combined data?

 

Also, any suggestions on how to improve the fuzzy match for names? Between my two lists there are a lot of nicknames and middle names / middle initials being used.
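One common way to handle nicknames and middle initials is to clean both name columns into the same shape before matching. The Python sketch below shows the idea only; the NICKNAMES table and the clean_name function are hypothetical and would need to be built out for real data.

```python
# Hypothetical nickname table: maps a nickname to a canonical first name.
NICKNAMES = {
    "bobbie": "robert", "bob": "robert",
    "pete": "peter",
    "fred": "frederick", "freddie": "frederick",
}

def clean_name(raw):
    """Return a 'first last' key with nicknames resolved and middle parts dropped."""
    name = raw.strip().lower()
    # 'Last, First Middle' -> 'first middle last'
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first.strip()} {last.strip()}"
    tokens = [t.strip(".") for t in name.split()]
    # Drop single-letter tokens such as middle initials
    tokens = [t for t in tokens if len(t) > 1]
    # Keep only the first and last token, which drops full middle names
    if len(tokens) > 2:
        tokens = [tokens[0], tokens[-1]]
    if tokens:
        tokens[0] = NICKNAMES.get(tokens[0], tokens[0])
    return " ".join(tokens)

print(clean_name("Bobbie Smith"))      # robert smith
print(clean_name("Ferrell, John W."))  # john ferrell
```

Dropping middle initials is a trade-off: it helps with "John W." versus "John", but it loses information when someone goes by their middle name, so it may be worth matching on both the cleaned and the original name.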

ferab91966
6 - Meteoroid

Okay, the solution seems to work, except that I still end up with duplicates in my fuzzy match results. Is there a way to show only the best match for each person instead of several potential matches?
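Keeping only the best match usually comes down to grouping the results by the name from list 1 and keeping the row with the highest score in each group. A rough Python sketch of that logic follows; best_match_per_name is a made-up helper, and the 78 score for "Rabbit, Peter" is an invented value used only to create a duplicate.

```python
def best_match_per_name(rows):
    """rows are (name1, name2, score, signup_date); keep the top-scoring row per name1."""
    best = {}
    for row in rows:
        name1, score = row[0], row[2]
        # Replace the stored row only when this one scores strictly higher
        if name1 not in best or score > best[name1][2]:
            best[name1] = row
    return sorted(best.values(), key=lambda r: r[2], reverse=True)

matches = [
    ("Pete Townsend", "Townsend, Peter", 95, "5/7/1982"),
    ("Pete Townsend", "Rabbit, Peter", 78, "10/3/1995"),  # invented score
    ("Will Ferrell", "Ferrell, John W.", 90, "8/2/2005"),
]

for row in best_match_per_name(matches):
    print(row)
```

Inside Designer, a similar effect can probably be had by sorting on the match score in descending order and then keeping the first record per Record ID from list 1, for example with a Sample tool grouped on that ID; treat that as a suggestion to try rather than a confirmed recipe.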
