Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Remove Duplicates from Cell after Concatenation

nathalie93
6 - Meteoroid

Hi all, 

 

I'm having trouble with removing duplicates from a cell after concatenation. Basically, in my Summarize tool, I regroup my data and concatenate multiple fields. But since in the concatenation, I can have the same value multiple times, I want to remove the duplicate values from the cell and only have a concatenation of unique values.

 

Examples:

EUR, EUR, USD, CHF, EUR ---> EUR, USD, CHF

01, 03, 01, 04, 01 ---> 01, 03, 04

 

Since I'm doing this on multiple fields at the same time, it's not very efficient to do it by splitting to rows and then using the unique tool. I've found a REGEX formula for this, but this only works on String values, not on numbers formatted as String, where it also removes the comma between the values (01, 01, 02 ---> 0102)

 

Trim(regex_replace([_CurrentField_],"\b(.+),(?=.*\b\1,?)",""),",")

 

Anyone have a formula or way of doing this easily?

 

Thanks in advance!

 

 

7 REPLIES 7
ggruccio
ACE Emeritus
ACE Emeritus

Hi @nathalie93,

 

One easy way may be to Transpose your data, then do Text to Columns to split to rows, then Group By, then Crosstab.

 

ggruccio_0-1587735339859.png

Assuming this is your start data:

ggruccio_1-1587735391154.png

The Transpose tool will Stack them

ggruccio_2-1587735423172.png

 

Text to Columns will allow you to split to rows:

ggruccio_3-1587735492098.png

Summarize - Group By (on both columns) will let you remove the duplicates

 

ggruccio_4-1587735564369.png

 

 

And Crosstab - concatenate will let you put them back in the original table format:

 

ggruccio_5-1587735579752.png

 

 

This should work for as many columns as you have!

 

 

I've attached a workflow for reference:

AbhilashR
15 - Aurora
15 - Aurora

Hi @nathalie93, I took your regex_replace approach and included leading and lagging whitespace to come up with a feasible solution. Give it a try on your larger dataset and let us know if this works.

AbhilashR_0-1587750289640.png

REPLACE(
    TRIM(
	REGEX_Replace([_CurrentField_], "\b(.+),(?=.*\b\1,?)", "")
    )
  ," ",", "
)

 

  

nathalie93
6 - Meteoroid

Thanks @AbhilashR for the formula, I just had to add another one to remove any duplicate commas that remained, but works great now! 

nathalie93
6 - Meteoroid

Hi @ggruccio, I'm sure this works as well, but as I have quite a large dataset and workflow, I prefer to have a shorter solution, such as the RegEx formula. But thanks for the answer!

ggruccio
ACE Emeritus
ACE Emeritus

Hi @nathalie93, no problem!  Glad @AbhilashR was able to provide a solution that fit your need!

Priyam_Goel
5 - Atom

Hi, I tried this for one of my worksheets. But it creates unnecessary commas or spaces. 

 

sjpostlethwaite
5 - Atom

Great solution, thanks so much for sharing :)

Labels