I have a large file that has one field of text. The text includes a description. part of that description includes someones name and ID number. I'm trying to remove that information from the description to result in a generic description.
I think RegEx is the answer, but I'm finding this a little too complicated for me.
Here's a sample of what some of my data looks like and what my desired stripped description would be
Workflow Workflow Initiated Template Auto Selected |
Edits for Preliminary Bill created Edits for Preliminary Bill created
|
Review/Approval Request Email sent to Assigned Approver MS BINDY (P0018670)
Review/Approval Request Email sent
I'm trying to remove the text after either of the words (to|by) up through and including the close paren after the ID. (I'm also going to want to remove the numeric after the word "Workflow" if it exists.) I substituted Ms Bindy for the name...it's not always going to be me.
My approach so far is to use Reg Ex with named and unnamed groups, but I'm not having great success.
Solved! Go to Solution.
Hi @MsBindy,
I also used groups, but rather than try to do it all at once, I used a pair of Regex_REPLACE:
REGEX_Replace( REGEX_Replace([fld1], "(.*\<workflow\>) [0-9]+(.*)", "$1$2"), "(.*) (\<to\>|\<by\>) (.*)(\))(.*)", "$1$5")
The inner one removes the numeric if it's there; the outer handles the to/by piece.
Hope that helps!
John
That's great! I'll have to study that this afternoon. I ran it though my data and have to make a few tweaks. The part where the number after the word 'workflow' is stripped out can be a number with a comma in it, and the replace didn't handle that quite right. And I found another place in some of the data where I need to strip something out.
But, this gets me 95% there, and you saved me at least a day of trial and error! Thank you!