
Alteryx Designer Desktop Discussions

SOLVED

Regex help to remove duplicate strings in the same cell

csh8428
11 - Bolide

I've looked through various posts throughout the community and have found some similar problems/solutions, but have not been able to resolve what I'm looking for. I'm also not proficient enough with Regex to modify the solutions in some of the other posts.

 

I have a cell containing an unknown number of strings separated by commas. Many of those strings are duplicates.

 

Data looks like this:

Name | Titles Held
John | HRBP, HRBP, HRBP, Sr HRBP, SVP HR, SVP HR
Dave | Sales, Sr. Sales, Sr. Sales, Director Sales

 

 

I'm trying to get it to look like this:

Name | Titles Held
John | HRBP, Sr HRBP, SVP HR
Dave | Sales, Sr. Sales, Director Sales

 

I really don't want to do the text to rows method because I have to do this to a number of columns.

 

Thanks for any help!

 

Craig

5 REPLIES
jdelaguila
8 - Asteroid

I had the same issue when I concatenated a field. I just ended up using a Find and Replace Tool. There's probably a better way of doing it, but I thought I should share what I ended up doing.

 

Javier Delaguila

TerryT
Alteryx Alumni (Retired)

Hi Craig,

 

That was a tough one!!!

 

This seems to work:  REGEX_Replace([Titles Held], '(\b[^,]+)(?=.*, *\1(?:,|$)), *', '')
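
For anyone who wants to try the pattern outside Designer, here is a minimal Python sketch of the same expression. The function name `dedupe_titles` is made up for illustration, and Python's `re` module is only a stand-in for the regex engine Alteryx uses, so edge cases may behave differently. Python also matches case-sensitively unless told otherwise.

```python
import re

# Same pattern as the REGEX_Replace call above, written as a Python raw string.
# (\b[^,]+)              one list item, captured as group 1
# (?=.*, *\1(?:,|$))     only if the identical item appears again later
# , *                    also consume the separator that follows the item
DUPLICATE_ITEM = re.compile(r"(\b[^,]+)(?=.*, *\1(?:,|$)), *")


def dedupe_titles(cell: str) -> str:
    """Remove every list item that reappears later in the same cell.

    Hypothetical helper name; not part of Alteryx.
    """
    return DUPLICATE_ITEM.sub("", cell)


print(dedupe_titles("HRBP, HRBP, HRBP, Sr HRBP, SVP HR, SVP HR"))
# HRBP, Sr HRBP, SVP HR
print(dedupe_titles("Sales, Sr. Sales, Sr. Sales, Director Sales"))
# Sales, Sr. Sales, Director Sales
```

Because the lookahead only removes an item when an identical one follows, the last occurrence of each value is the one that survives. "Sr HRBP" is not treated as a duplicate of "HRBP" because the repeated item has to start right after a comma and end at a comma or the end of the string.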


 

Good luck!

 

Terry T

 

csh8428
11 - Bolide

Works great. Thanks!

Fredy_Katpitia
5 - Atom

Hello,

I am also new to Alteryx and Regex. I tried the 2nd step, but it didn't work for me, as I have multiple fields with the data type shown below.

 

I am trying the REGEX_Replace formula but can't seem to get it to work. Can you help me, please?

 

Below is what my string contains:

 

([NAME_LEVEL10]='ABC123' or [NAME_LEVEL10]='ABC123' or [NAME_LEVEL10]='ABC123' or [NAME_LEVEL10]='ABC123' or [NAME_LEVEL10]='ABC123' or [NAME_LEVEL10]='ABC123' or [NAME_LEVEL10]='ABC123' OR [NAME_LEVEL11]='ABC123' OR [NAME_LEVEL11]='ABC123')

 

What I am looking for is:
([NAME_LEVEL10]='ABC123' OR [NAME_LEVEL11]='ABC123')

 

Thanks & Regards

Fredy
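
One possible way to adapt the accepted pattern to a string like this, sketched in Python rather than as a tested Alteryx formula: treat each `[FIELD]='value'` comparison as one item and use " or " (in any letter case) as the separator instead of a comma. The function name `dedupe_conditions` and the assumption that every item has exactly the `[FIELD]='value'` shape are illustrative only and are not confirmed anywhere in this thread.

```python
import re

# Hypothetical adaptation of the comma-based pattern:
# (\[[^\]]+\]='[^']*')            one [FIELD]='value' comparison (group 1)
# (?=.* or \1(?: or |\)|$))       only if the same comparison appears later
#  or                             also consume the separator that follows
DUPLICATE_CONDITION = re.compile(
    r"(\[[^\]]+\]='[^']*')(?=.* or \1(?: or |\)|$)) or ",
    re.IGNORECASE,  # lets " or " also match " OR "
)


def dedupe_conditions(expr: str) -> str:
    """Remove every comparison that reappears later in the expression."""
    return DUPLICATE_CONDITION.sub("", expr)


sample = (
    "("
    + " or ".join(["[NAME_LEVEL10]='ABC123'"] * 7)
    + " OR [NAME_LEVEL11]='ABC123' OR [NAME_LEVEL11]='ABC123')"
)
print(dedupe_conditions(sample))
# ([NAME_LEVEL10]='ABC123' OR [NAME_LEVEL11]='ABC123')
```

As with the comma version, the last occurrence of each comparison is kept, along with whichever separator followed it in the original string.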

mark-spain
8 - Asteroid

Amazing! Exactly what I need for a hassle-free way to remove duplicates from a string list.
