Alteryx Designer Desktop Discussions

MsBindy · ‎09-08-2017

I have a large file that has one field of text. The text includes a description. part of that description includes someones name and ID number. I'm trying to remove that information from the description to result in a generic description.

I think RegEx is the answer, but I'm finding this a little too complicated for me.

Here's a sample of what some of my data looks like and what my desired stripped description would be

Workflow ~~801~~ Initiated ~~by MS BINDY (P0084430)~~ Template Auto Selected.

Workflow Initiated Template Auto Selected

Edits for Preliminary Bill created ~~by MS BINDY (P002294C)~~

Edits for Preliminary Bill created

Review/Approval Request Email sent ~~to Assigned Approver MS BINDY (P0018670)~~

Review/Approval Request Email sent

I'm trying to remove the text after either of the words (to|by) up through and including the close paren after the ID. (I'm also going to want to remove the numeric after the word "Workflow" if it exists.) I substituted Ms Bindy for the name...it's not always going to be me.

My approach so far is to use Reg Ex with named and unnamed groups, but I'm not having great success.

JohnJPS · ‎09-08-2017

Hi @MsBindy,

I also used groups, but rather than try to do it all at once, I used a pair of Regex_REPLACE:

REGEX_Replace(
  REGEX_Replace([fld1], "(.*\<workflow\>) [0-9]+(.*)", "$1$2"),
  "(.*) (\<to\>|\<by\>) (.*)(\))(.*)", "$1$5")

The inner one removes the numeric if it's there; the outer handles the to/by piece.

Hope that helps!

John

MsBindy · ‎09-08-2017

That's great! I'll have to study that this afternoon. I ran it though my data and have to make a few tweaks. The part where the number after the word 'workflow' is stripped out can be a number with a comma in it, and the replace didn't handle that quite right. And I found another place in some of the data where I need to strip something out.

But, this gets me 95% there, and you saved me at least a day of trial and error! Thank you!

Alteryx Designer Desktop Discussions

Stripping away parts of text (reg ex?)

Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Re: Zero to Advanced in 20 days

Passed the Advanced Certification Exam!

Re: Identify duplicates in a specific column, and ...