
Alteryx Designer Desktop Discussions

SOLVED

Fuzzy Matching Entire File Against One String

bryan_ram2613
8 - Asteroid

Hello all. I am trying to filter some healthcare data and am running into messy values. My idea is to fuzzy match the whole file against one value, or a few different values, based on trends I have found. For example, I would like a match score between the reference and the payor for each row below.

 

The issue is that I can't get the Fuzzy Match tool configured to give the output I want. Open to any suggestions! Thanks!

 

Reference Column    | Payor
Medicare Advantage  | AARP UHC MEDICARE ADVANTAGE PLAN
Medicare Advantage  | AMERIGROUP MEDICARE ADVANTAGE
Medicare Advantage  | ANTHEM MEDICARE ADVANTAGE
Medicare Advantage  | BCBS ADVANTAGE
Medicare Advantage  | AMERI ADVANTAGE
Thableaus
17 - Castor

Hi @bryan_ram2613 

 

Could you please give more details about the filter you're doing? What are you trying to match?


Cheers,

bryan_ram2613
8 - Asteroid

Hey @Thableaus 

 

I am trying to filter through a large number of payer names. Some of them are Medicare Advantage names, which are the ones I want. The issue is that there are 30 or more variations of Medicare Advantage to sift through, maybe even more. Some of the variations are "mcr advantage" or "mcre advantage". I am trying to avoid building a custom filter for each variation and instead rely on a match score to speed up the process.

Thableaus
17 - Castor

@bryan_ram2613 

 

I played around with your data a bit, and I think we can find a solution to your problem.

 

[Screenshot: Fuzzy1.PNG]

 

First of all, I strongly recommend using the Data Cleansing tool to standardize your data and remove anything that might lower the Fuzzy Match score.
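
For anyone who wants to see what that kind of standardization amounts to outside of Designer, here is a minimal Python sketch. It is not what the Data Cleansing tool does internally; the `clean` function and its rules (uppercase, punctuation to spaces, collapsed whitespace) are assumptions chosen for illustration.

```python
import re


def clean(text: str) -> str:
    """Standardize a payor name before scoring (hypothetical helper)."""
    # Uppercase so "mcr advantage" and "MCR ADVANTAGE" compare equal.
    text = text.upper()
    # Replace anything that is not a letter, digit, or space with a space.
    text = re.sub(r"[^A-Z0-9 ]+", " ", text)
    # Collapse runs of spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    for raw in ["  Medicare   Advantage ", "mcr-advantage."]:
        print(repr(clean(raw)))
```

Cleaning both the reference and the payor values the same way keeps case and stray punctuation from dragging the score down.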

 

The rest of what I did was based on this Tool Mastery article.

 

I split the data into two streams, since you're comparing against a single string. Using Merge Mode correctly restricts the comparisons to "Medicare Advantage".

 

The one thing you should be very aware of is the Match Threshold in the Match Style configuration. I lowered it to 20% so I could see how all records behave. The default is 85%, which is probably why you weren't seeing results.
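
To illustrate why the threshold matters, here is a rough Python sketch that scores every payor against a single reference string and keeps only the rows at or above a cutoff. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure, so the scores will not equal what the Fuzzy Match tool reports. The names `REFERENCE`, `score`, and `matches` are made up for this example.

```python
from difflib import SequenceMatcher

# Single reference string every payor is compared against (assumption).
REFERENCE = "MEDICARE ADVANTAGE"


def score(payor: str, reference: str = REFERENCE) -> float:
    """Similarity between 0 and 100; a stand-in, not Alteryx's algorithm."""
    return 100 * SequenceMatcher(None, reference.upper(), payor.upper()).ratio()


def matches(payors, threshold=20.0):
    """Return (payor, score) pairs whose score meets the threshold."""
    results = []
    for payor in payors:
        s = score(payor)
        if s >= threshold:
            results.append((payor, round(s, 1)))
    return results


if __name__ == "__main__":
    payors = [
        "AARP UHC MEDICARE ADVANTAGE PLAN",
        "AMERIGROUP MEDICARE ADVANTAGE",
        "ANTHEM MEDICARE ADVANTAGE",
        "BCBS ADVANTAGE",
        "AMERI ADVANTAGE",
    ]
    # A low threshold shows every record; a high one hides most of them.
    for threshold in (20.0, 85.0):
        print(threshold, matches(payors, threshold))
```

Starting with a low cutoff and raising it once you have seen how the real variations score is usually easier than guessing a strict value up front.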

 

 

[Screenshot: fuzzy2.PNG]

[Screenshot: fuzzy3.PNG]

 

I've attached the workflow, built in version 2018.4. I hope it gives you some ideas on how to deal with your data.

 

This Live Training also clears up a lot of common problems users run into.

 

Cheers,

bryan_ram2613
8 - Asteroid

Thanks @Thableaus, that pointed me in the right direction.
