Alteryx Designer Desktop Discussions

Sieversd · 10-11-2019

I'm new to Alteryx and having trouble solving an issue. I have an address list (Business Name, Address, City, State) that contains duplicates: one business can show up multiple times within the data set. The catch is that the business name or address may be a little off, so the records don't match 100% (100 North Main Street vs 100 N Main St, just as an example). I'm trying to use the Fuzzy Match tool to identify the duplicate lines, but I can't figure out how to get a single unique output record for each business.

Ideally, my output would remove all duplication and return one record for each business. I tried using the Fuzzy Match tool in Purge mode, but I don't understand what to do after that step to get the output I need.

Any guidance or suggestions are greatly appreciated!

PeterA1 · 10-11-2019

Hey @Sieversd, by any chance can you attach some sample data? That might help me build something out for you. Fuzzy matching is definitely an art and a science, but it's really great once you are able to configure it!

Essentially, in Purge mode the tool compares all of the records in a single list against each other and outputs each record alongside the record it matched with. There is an excellent single-tool example in Alteryx, with workflows that demonstrate all of the possible configurations of this tool.
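For anyone who finds it easier to think about this in code, here is a rough Python sketch of the idea behind comparing a list against itself. It is not how the Fuzzy Match tool is implemented: the standard library's `difflib.SequenceMatcher` stands in for Alteryx's match functions, and `similarity`, `find_matches`, and the 0.7 threshold are made-up names and values for illustration only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_matches(records, threshold=0.7):
    """Compare every record with every other record in the same list.

    `records` maps a record ID to its text. Returns (id_a, id_b, score)
    for each pair scoring at or above `threshold` (an arbitrary cutoff,
    not an Alteryx default).
    """
    matches = []
    for (id_a, text_a), (id_b, text_b) in combinations(records.items(), 2):
        score = similarity(text_a, text_b)
        if score >= threshold:
            matches.append((id_a, id_b, score))
    return matches


records = {
    1: "100 North Main Street",
    2: "100 N Main St",
    3: "42 Oak Avenue",
}
pairs = find_matches(records)
for id_a, id_b, score in pairs:
    print(id_a, id_b, round(score, 2))
```

With this sample data only records 1 and 2 are paired. Note that the output is a list of matched pairs, not a de-duplicated list, which is why a further grouping step is still needed.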

I have attached an example of what you might be looking for taken from that single tool example (found by clicking on the tool in your tool palette, and selecting to open the example).

You essentially need to use a Make Group tool and then a Find Replace tool to cleanse the data set.
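As a rough illustration of those two steps outside of Alteryx, the Python sketch below takes matched pairs of record IDs, collapses them into groups, and keeps one record per group. It is only an analogy under my own assumptions: `make_groups` and `dedupe` are hypothetical helper names, the grouping uses a simple union-find, and "keep the first record seen in each group" is an arbitrary choice of survivor.

```python
def make_groups(pairs):
    """Map each record ID that appears in `pairs` to a group ID.

    Records linked directly or through a chain of matches
    (1 matches 2, 2 matches 3) end up in the same group.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the chain as we walk it
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Use the smaller ID as the group ID so results are deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)
    return {x: find(x) for x in parent}


def dedupe(records, pairs):
    """Replace each record ID with its group ID and keep one row per group."""
    groups = make_groups(pairs)
    kept = {}
    for rec_id, row in records.items():
        group_id = groups.get(rec_id, rec_id)  # unmatched records stand alone
        kept.setdefault(group_id, row)  # first record seen in a group wins
    return list(kept.values())


records = {
    1: "100 North Main Street",
    2: "100 N Main St",
    3: "100 N. Main Street",
    4: "42 Oak Avenue",
}
pairs = [(1, 2), (2, 3)]  # e.g. matched pairs from a fuzzy-match step
unique_rows = dedupe(records, pairs)
print(unique_rows)
```

Here records 1, 2, and 3 collapse into one group and record 4 is left alone, so two rows come out. In a real workflow you would probably want a smarter rule for which version of the business to keep, such as the most complete address.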

Let me know if this is what you are looking for.

Sieversd · 10-11-2019

Thanks for your response! I finally figured out what I needed after doing a little more digging within the community. I didn't even think about find/replace. I am going to give that a whirl as well!

De-Dup Address List