Hi, I have data in the below format (In a single cell) Deutsche Boerse AG (DB); Deutsche Boerse AG (DB); Bolsa Mexicana de Valores (BMV); London Stock Exchange (LSE); Pink Sheets LLC (OTCPK); Pink Sheets LLC (OTCPK) I want to remove all duplicates in here Ideally the result shall look like: Deutsche Boerse AG (DB); Bolsa Mexicana de Valores (BMV); London Stock Exchange (LSE); Pink Sheets LLC (OTCPK) How do I do this in Alteryx, I know Regex is one way but I am not able to figure out the correct code for it. I found the a code shared in the community for this regex_replace([field],"\b(\w+),(?=.*\b\1,?)",""), but this is not working for me (Most likely the code has to be customized in line with the data but I do not know much about stack overflow at the moment, can anyone please help out in this. Thanks in advance.
Hi @siddharthpotlia ,
the unique tool is what you need after you've split the cell contents out to analyse them:
Workflow attached.
M.
I know this not regex but this how i would do it.
1. Use record id as key.
2. Split fields to rows using ; as delimiter
3. Using unique tool to keep unique instance
4. Conacting it back with ; as delimiter
Hope this helps 🙂
And RegEx method i had to try. Taking reference from weekly challenge https://community.alteryx.com/t5/Weekly-Challenge/Challenge-130-Is-There-an-Echo-in-Here/td-p/206817
Workflow:
TrimRight(
REGEX_Replace([Field1]+";", "(.+\);)\s.*\1", "${1}"),
";")
Hope this works 😅 & Hope this helps 🙂
@atcodedog05 Why the Record ID tool is necessary for this workflow? The Sample tool gives the same result while grouping only by Field1.
@vlad_kutateladze , I used a method that doesn't need it. You can parse out the contents to rows, unique, then use the sum tool to concatenate back to the same format.
@atcodedog05 just used a different method.
In Alteryx, there are always multiple ways to do the same thing 🙂
If you have multiple rows a key field is necessary to split and join back. RecordID is used as groupby in summarize tool (key filed for rows) for joining back.
@atcodedog05
While I am trying...
Thanks for your prompt response it is really helpful. One challenge over here is that I have multiple companies (~100) for which I need to run this. So the data set looks like this
Company 1 | Deutsche Boerse AG (DB); Deutsche Boerse AG (DB); Bolsa Mexicana de Valores (BMV); London Stock Exchange (LSE); Pink Sheets LLC (OTCPK); Pink Sheets LLC (OTCPK) |
Company 2 | Botswana Share Market (BSM); Deutsche Boerse AG (DB); Deutsche Boerse AG (DB); Bolsa Mexicana de Valores (BMV); Namibian Stock Exchange (NMSE); The Johannesburg Securities Exchange (JSE); SIX Swiss Exchange (SWX); London Stock Exchange (LSE); XETRA Trading Platform (XTRA); XETRA Trading Platform (XTRA); Pink Sheets LLC (OTCPK); Pink Sheets LLC (OTCPK) |
Company 3 | Deutsche Boerse AG (DB); Bolsa Mexicana de Valores (BMV); London Stock Exchange (LSE); Pink Sheets LLC (OTCPK) |
Hi @siddharthpotlia ,
this is a simple fix to my original, just added a Record ID field.
Alternatively (it wasn't in the data) you can simply group by the company ID in the sum tool. (EDIT: I've added this)
Workflow attached.
M.
User | Count |
---|---|
19 | |
14 | |
13 | |
9 | |
8 |