I'm new to Alteryx and having trouble solving an issue. I have an address list (Business Name, Address, City, State) that contains duplicates: one business can show up multiple times within the data set. The issue is that the business name or address could be a little off, so the records don't match 100% (100 North Main Street vs 100 N Main St, just as an example). I'm trying to use the Fuzzy Match tool to identify the duplicate lines, but I can't figure out how to get a unique output for every record.
Ideally, my output would remove all duplication and return one record for each business. I tried using Fuzzy Match in Purge Mode, but I don't understand what to do after that step to get the output I need.
Any guidance or suggestions are greatly appreciated!
Hey @Sieversd, by any chance can you attach some sample data? That might help me build something out for you. Fuzzy matching is definitely an art and a science, but it's really great once you are able to configure it!
Essentially, in Purge mode the tool compares all of the records in a single list against each other and, for each match, outputs the record together with the record it matched. There is an excellent single-tool example in Alteryx with workflows that demonstrate all of the possible configurations of this tool.
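To make the pairwise comparison idea concrete, here is a rough Python sketch. It is not how Alteryx implements Fuzzy Match; it swaps in `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity score, and the record IDs, sample rows, and the 0.8 threshold are all made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_similar_pairs(records, threshold=0.8):
    """Compare every record with every other record in the same list.

    `records` is a list of (record_id, text) tuples. Returns a list of
    (id_a, id_b, score) for each pair scoring at or above `threshold`.
    The threshold is an assumed value, not an Alteryx default.
    """
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(records, 2):
        score = similarity(text_a, text_b)
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 2)))
    return pairs


# Hypothetical sample rows: name and address concatenated into one string.
records = [
    (1, "Acme Hardware 100 North Main Street"),
    (2, "Acme Hardware 100 N Main St"),
    (3, "Blue Sky Cafe 42 Elm Avenue"),
]

pairs = find_similar_pairs(records)
for id_a, id_b, score in pairs:
    print(f"record {id_a} matches record {id_b} (score {score})")
```

Like the Purge mode output, the result is a list of matched pairs of record IDs rather than a de-duplicated list, which is why a further grouping step is needed afterwards.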
I have attached an example of what you might be looking for, taken from that single-tool example (found by clicking on the tool in your tool palette and choosing to open the example).
You essentially need to use a Make Group tool to pull the matched pairs together into groups, and then a Find Replace tool to cleanse the data set.
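As a rough illustration of that last step, the sketch below groups matched pairs and keeps one record per group. It is only an analogy in Python, not what the Alteryx tools do internally: it uses a small union-find to build the groups, and the record IDs, sample rows, matched pairs, and the rule "keep the lowest ID in each group" are all assumptions made for the example.

```python
def make_groups(keys, pairs):
    """Assign every key to a group, given pairs of keys that matched.

    Keys linked directly or through a chain of pairs end up in the same
    group. The group label is the smallest key in the group.
    """
    parent = {k: k for k in keys}

    def find(k):
        # Walk up to the root of k's group, shortening the path as we go.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Attach the larger root under the smaller one.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return {k: find(k) for k in keys}


# Hypothetical records keyed by record ID.
records = {
    1: "Acme Hardware, 100 North Main Street",
    2: "Acme Hardware, 100 N Main St",
    3: "Blue Sky Cafe, 42 Elm Avenue",
    4: "ACME Hardware, 100 N. Main St.",
}

# Hypothetical output of a fuzzy matching step: pairs of matching IDs.
matched_pairs = [(1, 2), (2, 4)]

groups = make_groups(records, matched_pairs)

# Keep one record per group (here: the one whose ID is the group label).
deduped = {g: records[g] for g in sorted(set(groups.values()))}

for record_id, text in deduped.items():
    print(record_id, text)
```

Records 1, 2, and 4 land in one group even though 1 and 4 were never matched directly, because both matched record 2. Record 3 had no matches, so it stays in a group of its own and is kept as is.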