Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Remove left of a string base on mulitple scenarios

BosKev
8 - Asteroid

Hi All,

 

I got a bunch text with characters and words I want to clean up , I have come up with a Regex_Replace formula, it seems to only do part of what I want ( it doesn't remove anything after specific word) ...could someone help trouble shoot this formula? Thanks

 

find below item and replace with blanks:

 

1. remove all after period (.)

2. remove all after semicolon (:)

3  remove all after certain word (Respect) or (Behalf)

 

formula below:

 

trim(REGEX_Replace(
(if Contains([Line Items Description], ".")
then Left([Line Items Description], FindString([Line Items Description], "."))
elseif Contains([Line Items Description],"-")
then Left([Line Items Description], FindString([Line Items Description], "-"))
elseif Contains([Line Items Description]," : ")
then Left([Line Items Description], FindString([Line Items Description], " : "))
elseif Contains([Line Items Description],"RESPECT")
then Left([Line Items Description], FindString([Line Items Description], "RESPECT"))

elseif Contains([Line Items Description],"BEHALF")
then Left([Line Items Description], FindString([Line Items Description], "BEHALF"))
else [Line Items Description]
endif),"(.*)\-(.*)", "$1"))

 

 

Sample below:

FEE FOR PROFESSIONAL SERVICES RENDERED IN RESPECT OF AN APPLICATION FOR SHORT STAY WORK VISA . FEE APPROVED BY xxxxx ON -OCT-. AUD$. +GST CONVERTED TO USD - OUT OF SCOPE
GOVERNMENT FEES PAID ON YOUR BEHALF. AUD .
FOR PROFESSIONAL SERVICES RENDERED WITH RESPECT TO THE COORDINATION OF IMMIGRATION REQUIREMENTS - WORK PERMIT CANCELLATION - USD 
FOR PROFESSIONAL SERVICES RENDERED WITH RESPECT TO THE ASSESSMENT FOR WORK PERMIT   USD, 
GOVERNMENT FEES PAID ON YOUR BEHALF - RESIDENCE PERMIT . CNY
7 REPLIES 7
Thableaus
17 - Castor
17 - Castor

Hi @BosKev 

 

Try this:

 

REGEX_Replace([Field1], "(^.*?)([\.:]|respect|behalf).*", "$1$2")

 

It removes everything after the first occurence of any of these ".", ":" "respect", "behalf"

 

Cheers,

Ladarthure
14 - Magnetar
14 - Magnetar

Hi could you please show what you expect for each line?

BosKev
8 - Asteroid

Hi Thableaus,

 

This is works, very helpful in shorting my original formula.

 

Is it possible to explain how this works? I want to be able to update the conditions if anything changes.

 

for example if new words pops up or special character.

 

-Boskev

 

Thableaus
17 - Castor
17 - Castor

Glad it worked, @BosKev 

 

Sure, I can explain shortly to you:

 

REGEX_Replace([Field1], "(^.*?)([\.:]|respect|behalf).*", "$1$2")

 

The expression we need to replace:

 

(^.*?) - ^ beginning of the sentence, .*? any character 0 or more times until it finds the next character in our expression (that's why we put the "?" character). We put between parenthesis for further reference in the replacement expression.

[\.:] - this means a dot "." or a semicolon ":". The dot is escaped (\) because it's a metacharacter. 

| - this is an alternation character

respect and behalf are the exact words you're looking for

([\.:]|respect|behalf) - this whole expression means you can find either a dot, a semicolon, "respect" or "behalf". And it's between parenthesis because we need to reference that in our replacement expression.

.* - any character 0 or more times. This basically means anything after what we specified in our previous expression.

 

The replacement expression:

It's basically the first expression in parenthesis ($1) followed by the second ($2). So we get rid of everything we didn't reference and put between parenthesis.

 

If something is not clear, let me know.

 

Cheers,

BosKev
8 - Asteroid

Great!

 

just one follow up regarding comment below:

 

"The replacement expression:

... So we get rid of everything we didn't reference and put between parenthesis."

 

do you mean we keep everything not referenced because the goal is to remove any text string after :,-, or respect/behalf?

 

 

Thableaus
17 - Castor
17 - Castor

@BosKev 

Basically we have this:

(^.*?)([\.:]|respect|behalf).*

 

(^.*?) - This means $1 - expression 1

([\.:]|respect|behalf) - This means $2 - expression 2

.* - not referenced, not between parenthesis.

 

We're replacing (^.*?)([\.:]|respect|behalf).* - the whole expression

for

$1$2 - which is basically (^.*?)([\.:]|respect|behalf)

 

Cheers,

BosKev
8 - Asteroid

Thank you!

Labels