Get Inspire insights from former attendees in our AMA discussion thread on Inspire Buzz. ACEs and other community members are on call all week to answer!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Remove Duplicates in a single cell

siddharthpotlia
5 - Atom

Hi, I have data in the below format (In a single cell) Deutsche Boerse AG (DB); Deutsche Boerse AG (DB); Bolsa Mexicana de Valores (BMV); London Stock Exchange (LSE); Pink Sheets LLC (OTCPK); Pink Sheets LLC (OTCPK) I want to remove all duplicates in here Ideally the result shall look like: Deutsche Boerse AG (DB); Bolsa Mexicana de Valores (BMV); London Stock Exchange (LSE); Pink Sheets LLC (OTCPK) How do I do this in Alteryx, I know Regex is one way but I am not able to figure out the correct code for it. I found the a code shared in the community for this regex_replace([field],"\b(\w+),(?=.*\b\1,?)",""), but this is not working for me (Most likely the code has to be customized in line with the data but I do not know much about stack overflow at the moment, can anyone please help out in this. Thanks in advance.

12 REPLIES 12
mceleavey
17 - Castor
17 - Castor

Hi @siddharthpotlia ,

 

the unique tool is what you need after you've split the cell contents out to analyse them:

 

mceleavey_0-1621948801527.png

Workflow attached.

 

M.

 

 



Bulien

atcodedog05
22 - Nova
22 - Nova

Hi @siddharthpotlia 

 

I know this not regex but this how i would do it.

atcodedog05_0-1621948976155.png

 

1. Use record id as key.

2. Split fields to rows using ; as delimiter

3. Using unique tool to keep unique instance

4. Conacting it back with ; as delimiter

 

Hope this helps 🙂

atcodedog05
22 - Nova
22 - Nova

Hi @siddharthpotlia 

 

And RegEx method i had to try. Taking reference from weekly challenge https://community.alteryx.com/t5/Weekly-Challenge/Challenge-130-Is-There-an-Echo-in-Here/td-p/206817

 

Workflow:

atcodedog05_0-1621949703637.png

 

 

TrimRight(
REGEX_Replace([Field1]+";", "(.+\);)\s.*\1", "${1}"),
";")

 

 

Hope this works 😅 & Hope this helps 🙂

 

vlad_kutateladze
8 - Asteroid

@atcodedog05 Why the Record ID tool is necessary for this workflow? The Sample tool gives the same result while grouping only by Field1.

mceleavey
17 - Castor
17 - Castor

@vlad_kutateladze , I used a method that doesn't need it. You can parse out the contents to rows, unique, then use the sum tool to concatenate back to the same format.

 

@atcodedog05 just used a different method.

In Alteryx, there are always multiple ways to do the same thing 🙂



Bulien

atcodedog05
22 - Nova
22 - Nova

Hi @vlad_kutateladze 

 

If you have multiple rows a key field is necessary to split and join back. RecordID is used as groupby in summarize tool (key filed for rows) for joining back.

Qiu
20 - Arcturus
20 - Arcturus

@atcodedog05 
While I am trying...

B12.PNG

siddharthpotlia
5 - Atom

Thanks for your prompt response it is really helpful. One challenge over here is that I have multiple companies (~100)  for which I need to run this. So the data set looks like this

 

Company 1Deutsche Boerse AG (DB); Deutsche Boerse AG (DB); Bolsa Mexicana de Valores (BMV); London Stock Exchange (LSE); Pink Sheets LLC (OTCPK); Pink Sheets LLC (OTCPK)
Company 2Botswana Share Market (BSM); Deutsche Boerse AG (DB); Deutsche Boerse AG (DB); Bolsa Mexicana de Valores (BMV); Namibian Stock Exchange (NMSE); The Johannesburg Securities Exchange (JSE); SIX Swiss Exchange (SWX); London Stock Exchange (LSE); XETRA Trading Platform (XTRA); XETRA Trading Platform (XTRA); Pink Sheets LLC (OTCPK); Pink Sheets LLC (OTCPK)
Company 3Deutsche Boerse AG (DB); Bolsa Mexicana de Valores (BMV); London Stock Exchange (LSE); Pink Sheets LLC (OTCPK)
mceleavey
17 - Castor
17 - Castor

Hi @siddharthpotlia ,

 

this is a simple fix to my original, just added a Record ID field.

Alternatively (it wasn't in the data) you can simply group by the company ID in the sum tool. (EDIT: I've added this)

 

Workflow attached.

 

M.



Bulien

Labels