Alteryx Designer Discussions


Data Cleaning


Currently, I do some cleaning at the SQL query level on a data set we get from a data aggregator. Beyond the typical stray spaces and unexpected characters, the name fields contain many creative spellings of the same entity. Are there recommendations beyond the Alteryx Data Cleansing tool? I can implement many of those rules in SQL, but hard rules are hitting their limits given the ongoing issues with creative spellings, and users are screaming for the variants to be recast as the same entity where applicable. The cleaned data has to be reloaded to the server for Power BI reporting.

Alteryx

You can use the Data Cleansing tool to clean up the majority of your data quality issues, such as extra whitespace and unwanted characters.

For the spellings of names, you might want to look into the Fuzzy Match tool. Essentially, it lets you use different matching algorithms to identify names and terms that are closely related to each other. From there, you can use a Find Replace tool to normalize the matched values to a single spelling. To see it in action, click the Fuzzy Match tool in the tool palette and open the example linked from there.
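If it helps to see the idea outside of Designer, here is a rough Python sketch of the same two steps: score each raw name against a list of canonical names, then replace the close matches. This is not what the Fuzzy Match tool does internally. It uses the standard library's difflib.SequenceMatcher as a stand-in for the tool's algorithms, and the function names, the canonical list, and the 0.85 threshold are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, turn punctuation into spaces, and collapse repeated spaces."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 on the normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def build_replacements(names, canonical, threshold=0.85):
    """Map each raw name to its closest canonical name when the score clears
    the threshold (0.85 is an assumed starting point, tune it on your data)."""
    mapping = {}
    for name in names:
        best = max(canonical, key=lambda c: similarity(name, c))
        if similarity(name, best) >= threshold:
            mapping[name] = best
    return mapping


def apply_replacements(names, mapping):
    """Find-and-replace step: unmatched names pass through unchanged."""
    return [mapping.get(n, n) for n in names]


if __name__ == "__main__":
    canonical = ["Acme Corporation", "Globex Industries"]  # hypothetical master list
    raw = ["ACME Corporation", "Acme  Corporaton", "Globex Industires", "Initech"]
    print(apply_replacements(raw, build_replacements(raw, canonical)))
```

Names that fall below the threshold are left alone, so they can be reviewed by hand rather than silently merged into the wrong entity.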

Hope this helps!
