Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Regex Challenge: Selectively removing the same word from string?

waqaraphoto
7 - Meteor

Hi everyone,

 

Anyone know how to do this via regex? I would like to ONLY keep instances of the word Region that end in either a comma or nothing at all.

 

Example of what Id' like to remove in Red and what I'd like to keep in Blue.

 

Svc:Network,Region:Aspac,Country:China,Region,Pm:Jerry Mao, Region:EMEA, Region

 

As we can see, Region shows up twice.. but Id' like to treat the word "Region" as a separate entity in my output so just removing anything after the Word Region would not work.

 

In others remove "Region:Aspac" and "Region:EMEA" values  but keep all instances of where ONLY Region shows up.

 

6 REPLIES 6
LordNeilLord
15 - Aurora

Hey @waqaraphoto

 

Using regex replace you can use the following Regex: (Region\:\w.*?,)

 RegEx Replace.PNG

 

fmvizcaino
17 - Castor
17 - Castor

Hi waqaraphoto,

 

Try using this expression in Regex with Replace output method.

Region:.*?,

 

Regex community.JPG

 

Best,

Fernando V.

NicoleJohnson
ACE Emeritus
ACE Emeritus

Try this formula!

 

REGEX_Replace([Field1],"Region:\w*,+","")

 

Because it is specifically calling out to replace only Region when it is followed by a colon & other letters (and an optional comma), you should still see any instances of Region that you're aiming to keep. (Note that this might give you some duplicate whitespace that you'd want to remove with another Replace formula or a Data Cleanse tool.)

 

Hope that helps!

 

Cheers,

NJ

waqaraphoto
7 - Meteor

Hi Nicole,

 

I'm seeing all instances of the instances of the word "Region" that I do wish to keep.. But I believe I'm still also seeing those which are followed by :text.

 

This is a browse output at the end of the formula tool containing the formula.

 

(Append Region is the column Id' like to show instances of the word Region Only. As we can see we're seeing the unwanted version as well.)

 

Thanks for looking into this!

W

MarqueeCrew
20 - Arcturus
20 - Arcturus

 

Spoiler
REGEX_Replace([section],"Region:.*?,","")

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
NicoleJohnson
ACE Emeritus
ACE Emeritus

I might have a solution, I think I understand now what you're looking for... sounds like a RegEx_CountMatches scenario (or you could skip RegEx altogether and go with Contains).

 

If REGEX_CountMatches([Field1],"Region:") > 1 then null()
Elseif REGEX_CountMatches([Field1],"Region") > 1 then "Region"
else null() endif

 

If Contains([Field1],"Region:") then null()

Elseif Contains([Field1],"Region") then "Region"

Else null() endif

 

Sometimes, though I hate to admit this, RegEx is not always the easiest answer :) Play around with it some more - RegEx101.com is a great resource for practicing various scenarios to determine if your syntax will work!

 

Cheers,

NJ

Labels
Top Solution Authors