Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Replace multiple occurrences of string with One occurrence of string

bsharbo
11 - Bolide

Hello everyone. I have a data file that often has comment data that I need to clean up a bit as part of a requiement.

 

Basically I need a function in alteryx that searches for multiple occurences of a string in sequence, and then replaces them wtih ONE occurance of that string.... For example: if a sentence had 2 or more spaces in a row, we would want to replace it with one space, however it CANNOT replace every occurance of that field.

 

So:

The  quick  brown  fox was  cool

 

Becomes

The quick brown fox was cool. 

 

If possible can we do this with line feed characters?  I need this fucntion to also be able to do something like...

ReplaceText(mystring, \n ,' '), but only work if there are multiple \n (line feed characters) in a row....

 

 

Thanks everyone!

3 REPLIES 3
jdunkerley79
ACE Emeritus
ACE Emeritus

I think:

REGEX_Replace([Text],"(\s)(\1+)","$1")

Will do what you need.

 

It will match any repeated white space character and replace with the first instance.

bsharbo
11 - Bolide

Oh that is an awesome regex function! Thank you!

gpmyers111
6 - Meteoroid

I was having a similar issue and stumbled on this thread. In my case, I am using the Summarize tool to concatenate string values into one field. This works great, but in some cases it was creating a bunch of commas where the string values were null. 

I could get the leading and trailing commas out with a TRIM function, but the ones in the middle were a little perplexing. 

My data looked like this:

 

00707513,,,,,,,,,,,,,,,,,,,,,,,,,,00707309
00707309,,,,,,,,,,,,,,,,,,,,,,,,00707513
00707513,,,,,,,,,,,,,,,,,,,,,,,,,00707309
00707309,00707513
00765154,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,00765152
00765154,00765152
00765154,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,00765152
00765154,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,00765152

So I took the formula shared here and had to research it and figured out how to change it from looking for a repeated word to a non-word, in my case a repeated comma. 

 

REGEX_Replace([StringField],"(\W)(\1+)","$1")

 

I just replaced the first switch with the \W and there you have it. Sample attached. 

Labels