Remove duplicates in same string

hcao
6 - Meteoroid

I have a cell in this format with an unknown number of strings separated by pipes. Many of those strings are duplicates.

 

hello (123) | hello (123) | bye (456) | bye (789) | bye (456)

 

What regex formula can I use to look ahead and delete duplicates? 

 

The expected output should be this:

hello (123) | bye (456) | bye (789)
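
For readers wondering what a look-ahead regex for this would look like, here is a minimal sketch in Python's `re` module. It is an illustration only: the pattern and the function name `drop_earlier_duplicates` are hypothetical, and the sketch makes no claim about how Alteryx's own regex functions behave. Note the limitation: a look-ahead can only see what comes later, so it removes the earlier copies and keeps the last occurrence of each item. The order therefore differs from the expected output above.

```python
import re

# Matches one whole item plus its trailing pipe, but only when the same item
# appears again later in the string (that is what the look-ahead checks).
ITEM_WITH_LATER_COPY = re.compile(
    r"(?:^|(?<=\|))\s*([^|]+?)\s*\|(?=(?:[^|]*\|)*\s*\1\s*(?:\||$))"
)

def drop_earlier_duplicates(cell: str) -> str:
    """Remove every item that has an identical copy later in the cell."""
    return ITEM_WITH_LATER_COPY.sub("", cell).strip()

cell = "hello (123) | hello (123) | bye (456) | bye (789) | bye (456)"
print(drop_earlier_duplicates(cell))
# hello (123) | bye (789) | bye (456)
```

Keeping the first occurrence instead would need a variable-length look-behind, which Python's `re` does not support, so a split-and-deduplicate approach is usually simpler.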

 

 

8 REPLIES
Bren_Spill
12 - Quasar

Hi @hcao - I took a different approach rather than using regex. See the attached workflow and let me know if this works for you.

 

Thanks!

 

 
 
 
gawa
16 - Nebula

hi @hcao 

In your case, the Text to Columns tool would be a better fit than the RegEx tool.

First, split the data on the delimiter "|" with the Text to Columns tool, then find the unique values with the Unique tool, and finally concatenate them with the Summarize tool.

Please see the attached workflow for the detailed process.
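
The three steps above (split, unique, concatenate) can be sketched in Python for a single cell. This is only an analogy for the logic, not the attached workflow; the function name `dedupe_cell` is hypothetical, and the sketch assumes that surrounding spaces should be trimmed and that the first occurrence of each value is the one to keep.

```python
def dedupe_cell(cell: str, delimiter: str = "|") -> str:
    """Split on the delimiter, keep first occurrences, and join again."""
    # Step 1: split the cell into trimmed items (skipping empty pieces).
    items = [part.strip() for part in cell.split(delimiter)]
    items = [item for item in items if item]
    # Step 2: keep unique items; dict keys preserve insertion order.
    unique_items = list(dict.fromkeys(items))
    # Step 3: concatenate the survivors with the same delimiter.
    return f" {delimiter} ".join(unique_items)

cell = "hello (123) | hello (123) | bye (456) | bye (789) | bye (456)"
print(dedupe_cell(cell))
# hello (123) | bye (456) | bye (789)
```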


hcao
6 - Meteoroid

Thank you for this. However, I would prefer not to use the Text to Columns tool because I have a lot of rows and duplicates to deal with; the above was just a snippet. I may use Text to Columns as a last resort.

Qiu
21 - Polaris

@hcao 
I found this reply and it should help.

It is a bit beyond my reach, though. 😁

https://community.alteryx.com/t5/Alteryx-Designer-Desktop-Discussions/Regex-help-to-remove-duplicate...

Bren_Spill
12 - Quasar

@hcao - hopefully the regex provided by @Qiu works for you. I also made a few updates to my workflow so that it handles additional rows and duplicates, in case that is helpful.

 

Thanks 

malteryx1
6 - Meteoroid

You can simply use the Text to Columns tool, followed by the Unique tool and the Summarize tool.

Please find the workflow below.

 

Vinayp29
7 - Meteor

Guys, one question. This works when we have a single row, but not with multiple rows. Any suggestions on how to apply this to multiple rows?
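
One common way to make the split / unique / concatenate pattern work across many rows is to tag every input row with an ID before splitting, deduplicate on the pair (ID, value), and then group by that ID when concatenating. The Python sketch below illustrates that idea only; it is an assumption about what would help here, not something taken from the workflows in this thread, and the name `dedupe_rows` is hypothetical.

```python
from collections import defaultdict

def dedupe_rows(cells: list[str], delimiter: str = "|") -> list[str]:
    """Deduplicate delimited values within each row, independently per row."""
    # Step 1: give each row an ID and split it into (row_id, item) pairs.
    long_rows = [
        (row_id, part.strip())
        for row_id, cell in enumerate(cells, start=1)
        for part in cell.split(delimiter)
        if part.strip()
    ]
    # Step 2: keep unique (row_id, item) pairs, preserving first-seen order.
    unique_rows = list(dict.fromkeys(long_rows))
    # Step 3: group by row_id and concatenate the items of each group.
    grouped = defaultdict(list)
    for row_id, item in unique_rows:
        grouped[row_id].append(item)
    return [
        f" {delimiter} ".join(grouped[row_id])
        for row_id in range(1, len(cells) + 1)
    ]

rows = ["a | a | b", "b | b", "a | c | a"]
print(dedupe_rows(rows))
# ['a | b', 'b', 'a | c']
```

Because the row ID is part of the uniqueness key, a value that appears in two different rows is kept in both; only repeats within the same row are dropped.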

 
