Alteryx Designer Desktop Discussions

kwieto · ‎03-22-2023

Hello, I'm a new to Alteryx and still learning the tool, can you help me with following task?

I have a list of the data divided into categories and subcategories (each row have its unique ID), which I want compare to the given list of categories and subcategories. The task is to check if a pair of category and subcategory match the list (that is easy to do with Join tool), but if there is no exact match available, then, Based on the exact match of the categories, suggest the closest match for subcategory if there is any similar value in the reference table.

Additionally, it is nice to have the evaluation how close the best match is to the given value (i.e. 90%)

The sample list is here:

ID	Category	Subcategory
1	Classic	Symphony
2	Jazz	Acid Jazz
3	Rock	Death and Roll
4	Rock	Hard Rok
5	Jazz	Jazz

The table with reference data:

Category	Subcategory
Classic	Chamber
Classic	Opera
Classic	Symphonic
Jazz	Acid
Jazz	Classic
Jazz	Fusion
Jazz	Modern
Rock	Death 'n' Roll
Rock	Hard Rock

Desired output:

ID	Category	Subcategory	Closest match (Subcategory)
1	Classic	Symphony	Symphonic
2	Jazz	Acid Jazz	Acid
3	Rock	Death and Roll	Death 'n' Roll
4	Rock	Hard Rok	Hard Rock
5	Jazz	Jazz

I thought about fuzzy match tool, but don't know how to properly configure it. Or maybe is is not the right tool for the task?

Qiu · ‎03-23-2023

@kwieto
Just want you to know, this is a bit stretchy. 😁
Aside from Fuzzy Match, I also used a filter tool to identify the case for "Acid Jazz" and "Jazz".

kwieto · ‎03-23-2023

Great, that was fast 😀

I assume that if there will be a spelling error in the "Acid Jazz" (i.e. Acit Jazz), the filter won't work and the solution would be more tricky to do?

Qiu · ‎03-23-2023

@kwieto
I think you are right.

Text Similarity is always tricky I think, unless you invovlve something big like Machine Learning, that would be out of my reach.
An article discussing about the Levenshtein distance you may want to take a look.

https://towardsdatascience.com/text-similarity-w-levenshtein-distance-in-python-2f7478986e75

kwieto · ‎04-05-2023

Thanks, I'll look into the article, for now I'm trying to do some changes to the logic with various effects.

Sorry for late reply, got stuck with some other things.

Alteryx Designer Desktop Discussions

Finding best (closest) match To the given value from another list