Alteryx Designer Desktop Ideas

Create a fuzzy join operator

I think it would be incredibly helpful for Alteryx to include a "Fuzzy Join" operator, similar to what is described in this article: http://www.decisivedata.net/blog/alteryx-fuzzy-join-workflow/

 

On virtually every client project I work on, there is a need to clean up data.  Most of the time, that involves standardizing to some existing list of data.  However, as we all know, data from different systems, or data collected manually, will not match perfectly in all cases.  This is most often when I use the Fuzzy Match tool.

 

However, I have to use a lot of weird steps to effectively create a "Fuzzy Join", which is something I've done using database functions in the past.  I think it would be great if a new tool were created that would do the following:

  • Accept two inputs, one for the "raw" data and another for the "list" of data to match to.
  • Perform a fuzzy join based on functionality similar to the Fuzzy Match tool: convert data to metaphone keys and then run Jaro/Levenshtein matches.  By default, return only the highest-scoring match.
  • Expand the pre-process functionality to include words to exclude from the analysis (beyond just "and", "the" and "in").  
  • Match on the whole string.  No need to try and do joins based on partial words within a string.
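To make the request concrete, here is a rough sketch in Python of the behavior described in the list above. It is only an illustration under some assumptions, not how Alteryx's Fuzzy Match tool works internally: the function names, the stop-word list, and the 0.8 threshold are all made up for the example, the metaphone key step is left out, and the score uses plain Levenshtein distance rather than Jaro.

```python
import re

# Hypothetical exclusion list; the idea is that users could extend it.
STOP_WORDS = {"and", "the", "in", "inc", "llc", "co"}


def normalize(text, stop_words=STOP_WORDS):
    """Lowercase, strip punctuation, and drop excluded words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(w for w in words if w not in stop_words)


def levenshtein(a, b):
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Score in [0, 1]; 1.0 means the whole strings are identical."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def fuzzy_join(raw, reference, threshold=0.8):
    """Pair each raw value with its single best reference value, or None."""
    ref_keys = [(normalize(r), r) for r in reference]
    results = []
    for value in raw:
        key = normalize(value)
        best, best_score = None, 0.0
        for ref_key, ref in ref_keys:
            score = similarity(key, ref_key)  # compares whole strings only
            if score > best_score:
                best, best_score = ref, score
        if best_score < threshold:
            best = None  # assumed behavior: no match below the threshold
        results.append((value, best, round(best_score, 3)))
    return results


if __name__ == "__main__":
    raw_data = ["Acme Inc.", "Globex Corporaton", "Zzz"]
    standard_list = ["ACME", "Globex Corporation", "Initech"]
    for row in fuzzy_join(raw_data, standard_list):
        print(row)
```

The two arguments to `fuzzy_join` correspond to the "raw" and "list" inputs, and each raw value comes back with at most one match, which is the "highest matching result" default I have in mind.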

 

This seems like a very common need (I've created a macro for it anyway) that could be made simpler for everyday use.

 

Thanks!

11 Comments
marco_zara
8 - Asteroid

Any chance this idea will be reviewed by the team again?

Text-based datasets are on the rise, and this could be merged with the case-insensitive join idea.