Alteryx Designer Desktop Discussions

Remove duplicates in same string

hcao
5 - Atom

I have a cell in this format, containing an unknown number of strings separated by pipes. Many of those strings are duplicates.

 

hello (123) | hello (123) | bye (456) | bye (789) | bye (456)

 

What regex formula can I use to look ahead and delete duplicates? 

 

The expected output should be this:

hello (123) | bye (456) | bye (789)

 

 

6 REPLIES
Bren_Spill
11 - Bolide

Hi @hcao - I took a different approach instead of using regex. See the attached workflow and let me know if it works for you.

 

Thanks!

 

 
 
 
gawa
15 - Aurora

Hi @hcao

In your case, the Text To Columns tool would be a better fit than the RegEx tool.

First, split the data on the delimiter "|" with the Text To Columns tool, then find the unique values with the Unique tool, and finally concatenate them with the Summarize tool.

Please see the attached workflow for the detailed process.
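
For readers who want to see the logic outside Alteryx, the split, de-duplicate, and concatenate steps above can be sketched in Python. This is only an illustration under assumptions, not the attached workflow: the function name `dedupe_pipe_items` is hypothetical, items are assumed to be separated by `|`, and surrounding spaces are trimmed before items are compared.

```python
def dedupe_pipe_items(cell: str, sep: str = "|") -> str:
    """Split a cell on `sep`, keep the first occurrence of each item, rejoin.

    Hypothetical helper for illustration only.
    """
    seen = set()
    unique_items = []
    for item in cell.split(sep):
        item = item.strip()  # ignore the padding around each delimiter
        if item and item not in seen:
            seen.add(item)
            unique_items.append(item)  # preserves first-seen order
    return " | ".join(unique_items)


sample = "hello (123) | hello (123) | bye (456) | bye (789) | bye (456)"
print(dedupe_pipe_items(sample))
# prints: hello (123) | bye (456) | bye (789)
```

Because each cell is handled on its own, the same function can be applied row by row regardless of how many rows or duplicates there are.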


hcao
5 - Atom

Thank you for this. However, I would prefer not to use the Text To Columns tool because I have a lot of rows and duplicates to deal with; the example above was just a snippet. I may use Text To Columns as a last resort.

Qiu
20 - Arcturus

@hcao 
I found this reply and it should help.

It is a bit beyond my reach, though. 😁

https://community.alteryx.com/t5/Alteryx-Designer-Desktop-Discussions/Regex-help-to-remove-duplicate...
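
The regex from the linked reply is not reproduced here. As a rough idea of what a lookahead-based pattern can look like, here is a Python sketch under stated assumptions: the function name `dedupe_keep_last` is hypothetical, the delimiter is assumed to be exactly ` | ` (space, pipe, space), and the syntax is Python's `re` flavor, so it may need adjusting for Alteryx. Note that a lookahead deletes an item when the same item appears again later, so it keeps the last occurrence of each item rather than the first, and the order can differ from the expected output in the question.

```python
import re

# Match one item plus its trailing delimiter, but only if the same item
# appears again later in the string as a whole item.
DUPLICATE_AHEAD = re.compile(
    r"(?:^|(?<= \| ))"                 # item starts at string start or after a delimiter
    r"([^|]+?) \| "                    # group 1: the item, followed by its delimiter
    r"(?=(?:.* \| )?\1(?: \| |$))"     # lookahead: the same whole item occurs later
)


def dedupe_keep_last(cell: str) -> str:
    """Remove every item that is repeated later in the cell (hypothetical helper)."""
    return DUPLICATE_AHEAD.sub("", cell)


sample = "hello (123) | hello (123) | bye (456) | bye (789) | bye (456)"
print(dedupe_keep_last(sample))
# prints: hello (123) | bye (789) | bye (456)
```

If the first-occurrence order matters, a split-and-rejoin approach is usually simpler than trying to force it into a single pattern.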

Bren_Spill
11 - Bolide

@hcao - hopefully the regex provided by @Qiu works for you. I also made a few updates to my workflow so that it handles additional rows and duplicates, if that is helpful.

 

Thanks 

malteryx1
6 - Meteoroid

You can simply use the Text To Columns tool, followed by the Unique tool and the Summarize tool.

Please find the workflow below.

 
