
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

How to keep only the longest unique values and remove substrings

annhood
5 - Atom
Is there a way to keep only the longest values of a unique sequence? For example, if we have the following list:

Test
Test123
Test12345
Test67
Test689
Example

We would want to be left with only Test12345, Test689, and Example. The others, which are substrings, would be filtered out.

With a large dataset, is there an automated way to check whether each value is a substring of any other value in the column, and to keep or remove it on that basis? I have been leaning towards using formulas, filters, and fuzzy match, but haven't figured out exactly how to do what I want.

Thanks!
1 REPLY
OllieClarke
16 - Nebula

Hi @annhood, like you said, fuzzy matching will probably solve your problem. I'm not very good at it, so I came up with this solution instead. It does involve a Cartesian join, so it might not be the most performant option on a large dataset.

clipboard_image_0.png

clipboard_image_1.png
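
For anyone who can't see the screenshots, here is a rough Python sketch of the general idea of comparing every value against every other value (the Cartesian-join approach). It is not a transcription of the workflow above. The function name `keep_longest_unique` is made up for illustration, and the sketch assumes a case-sensitive, strict substring test.

```python
def keep_longest_unique(values):
    """Return the values that are not a substring of any other distinct value.

    Hypothetical helper for illustration only. Every value is compared
    against every other value, so the cost grows quadratically with the
    number of unique values.
    """
    # De-duplicate while preserving the original order.
    unique = list(dict.fromkeys(values))

    kept = []
    for candidate in unique:
        # Case-sensitive check: is the candidate contained in a different value?
        is_substring = any(
            candidate != other and candidate in other
            for other in unique
        )
        if not is_substring:
            kept.append(candidate)
    return kept


values = ["Test", "Test123", "Test12345", "Test67", "Test689", "Example"]
print(keep_longest_unique(values))
```

One thing to watch: under a strict substring test, Test67 is not contained in Test689 (the 6 is followed by an 8, not a 7), so this sketch keeps Test67 as well as Test12345, Test689, and Example. If values like Test67 should also be dropped, the matching rule would need to be something looser than "is a substring of", such as a shared-prefix or fuzzy comparison.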
