Alteryx Designer Desktop Discussions

SOLVED

Using RegEx to Parse a Freeform Field on Multiple Phrases

tfinn
7 - Meteor

Hi. I’m trying to use RegEx to parse a freeform Notes field into 3 columns using 3 phrases and so far I’ve been unsuccessful.  The phrases are Root Cause, Resolution Plan and Resolution Date.  The case may vary but the order should be consistent.  Here is a sample of what the Notes field may contain:

 

Janie K, Aug 30 2018 8:44AM - Action plan is not on tab for reconciling item but is in comments on homepage. Root Cause-Past reconcilers were unable to determine the course to take to clear the open items. These were all system generated transaction & unknown how to clear these. Resolution Plan-Working with past reconciler &  Business Analyst  to try to get resolved. Resolution date:  Expected to clear by 12/31/2018

 

The only thing I’ve been able to muster so far is to create the 3 fields using this regular expression:

(root cause.*)|(resolution plan.*)|(resolution date.*)

 

Of course, this expression does not stop any of the captures at the next key phrase; each one just runs to the end of the Note. And I can’t even pull this off when there is no carriage return in the Notes field. This Note, for example, only parsed on Root Cause, not the other two:

 

John S, Aug 28 2018 1:53PM - Root Cause: backlog of aged items.  Resolution Plan: Reconciler was working to get account compliant by 6/30, but this was pushed out due to resource constraints due to the V2V transition.  Reconciler has requested additional resource to assist in clearing, but the approval is still pending. The reconciler has not attached backup for all outstanding items, but will need to do so in order for account to be compliant. Resolution Date: 12/31/18

 

I would like the Root Cause and Resolution Plan fields to include their key phrases and to stop at the beginning of the next key phrase. The date field does not need to include the phrase Resolution Date. In this last example the 3 columns would be:

 

Root Cause:

Root Cause: backlog of aged items.

 

Resolution Plan:

Resolution Plan: Reconciler was working to get account compliant by 6/30, but this was pushed out due to resource constraints due to the V2V transition. Reconciler has requested additional resource to assist in clearing, but the approval is still pending. The reconciler has not attached backup for all outstanding items, but will need to do so in order for account to be compliant.

 

Resolution Date:

12/31/18

 

Can anyone assist, please?

8 REPLIES
ramatp30
7 - Meteor

Hi @tfinn

 

Could you please provide a sample of the expected output?

That would make it easier for people to provide solutions.

jasperlch
12 - Quasar

Hi @tfinn

 

This could be achieved by the Regular Expression as shown below:

Capture1.PNG
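
The expression in the screenshot is not reproduced in the text of this thread, so here is a minimal sketch of one pattern that does what the original question asks, written in Python rather than in the RegEx tool. It is not necessarily the pattern from the screenshot. It assumes all three phrases are present and in order, and the flags and variable names are choices made for this sketch only.

```python
import re

# Lazy quantifiers (.*?) stop each capture at the next key phrase
# instead of running to the end of the Note.
# IGNORECASE handles varying case; DOTALL lets "." cross line breaks.
PATTERN = re.compile(
    r"(root cause.*?)\s*(resolution plan.*?)\s*resolution date\W*(.*)",
    re.IGNORECASE | re.DOTALL,
)

# Shortened version of the second sample Note from the question.
note = (
    "John S, Aug 28 2018 1:53PM - Root Cause: backlog of aged items.  "
    "Resolution Plan: Reconciler was working to get account compliant by 6/30. "
    "Resolution Date: 12/31/18"
)

match = PATTERN.search(note)
root_cause, resolution_plan, resolution_date = match.groups()

print(root_cause)       # Root Cause: backlog of aged items.
print(resolution_plan)  # Resolution Plan: Reconciler was working to get account compliant by 6/30.
print(resolution_date)  # 12/31/18
```

The key difference from `(root cause.*)|(resolution plan.*)|(resolution date.*)` is that the three groups are matched in sequence rather than as alternatives, and each `.*?` is lazy, so it gives way as soon as the next phrase can match.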

tfinn
7 - Meteor

Thanks for the response!

 

I’m discovering that sometimes the freeform data is missing one of the three phrases, and I need to be able to pull in the two that are there. Here is an example of a Note that is missing Resolution Plan:

 

Cindy S, Aug 29 2018 2:50PM - Root cause: Pending response from Reinsurance Finance. A follow up email has been sent by the reconciler 08/16/18. If no response is received, the reviewer of this reconciliation will follow up. Resolution Date: 10/31/18.

 

I still need to have Root Cause and Resolution Date populated, even though there is no Resolution Plan in the Note. 

 

Also, the Resolution Date is not always expressed as a date, so I need to pull whatever follows Resolution Date. Here is an example of a Note whose Resolution Date is not an actual date:

 

Jane L, Aug 30 2018 3:06PM - Root Cause: Unable to identify person to clear Resolution Plan: Monthly meeting with AP group to determine next steps to clear items.  Next meeting scheduled for 9/14/18. Resolution Date: No resolution date established

 

I think it will be best to pull Resolution Date and everything to the end of the Note.

 

Hopefully this is all doable?

 

Thanks!

tfinn
7 - Meteor

Hi.  Per the earlier request, I've attached a spreadsheet with the 4 data samples and what I'd like the output to look like.

tcroberts
12 - Quasar

Hey,

 

I've got a working example for you, but the regex is a little complicated.

 

Essentially, rather than using only the RegEx tool, I use Regex_CountMatches and Regex_Replace in a Formula tool to check for each of the three fields, then fill them.

 

I'll attach a screenshot and the workflow. Let me know if you have any issues or questions about it.
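
Since the workflow itself is only attached as a file, here is a rough sketch of the same check-then-fill idea in Python. It is not the attached workflow: `parse_note`, `FLAGS` and `NEXT_PHRASE` are names invented for this sketch, and `re.search` stands in for the count-then-replace steps described above.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL
# Lookahead: stop just before the next key phrase, or at the end of the Note.
NEXT_PHRASE = r"(?=\s*resolution (?:plan|date)|$)"

def parse_note(note):
    """Return (root_cause, resolution_plan, resolution_date); absent fields are None."""
    root = plan = date = None

    # Check for each phrase separately so a missing one does not block the others.
    m = re.search(r"root cause.*?" + NEXT_PHRASE, note, FLAGS)
    if m:
        root = m.group(0).strip()

    m = re.search(r"resolution plan.*?" + NEXT_PHRASE, note, FLAGS)
    if m:
        plan = m.group(0).strip()

    # Resolution Date: drop the label and keep everything to the end of the Note.
    m = re.search(r"resolution date[\s:-]*(.*)", note, FLAGS)
    if m:
        date = m.group(1).strip()

    return root, plan, date

# Shortened version of the sample Note that has no Resolution Plan.
cindy = (
    "Cindy S, Aug 29 2018 2:50PM - Root cause: Pending response from "
    "Reinsurance Finance. Resolution Date: 10/31/18."
)
print(parse_note(cindy))
```

Because each field is searched for independently, a Note that lacks Resolution Plan still yields Root Cause and Resolution Date, and a Resolution Date such as "No resolution date established" is returned as free text.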

 

regexparsefreeform.PNG

 

Cheers!

tfinn
7 - Meteor

This looks great @tcroberts!  I will go through the results and let you know for sure.  Thanks!

tfinn
7 - Meteor

Hi @tcroberts.  Sorry, but one of my users identified a couple of issues with your solution.  I've attached two examples where something is disrupting the logic.  These two examples have carriage returns, but so do many other rows, and your solution works on them.  I tried to fix it but was unable to.  Do you have any ideas?

tcroberts
12 - Quasar

Hi @tfinn

 

I think I've got a working fix. I used a RegEx tool to split Field1 into 2 columns around the carriage returns, then concatenated these back together separated by a space. Then I slightly modified the regex to handle "-" as well as ":" as the separator between field names and values.

 

I do recognize that the second issue has two carriage returns; however, it parsed correctly after removing just one. I think the problem arises when the carriage return occurs before the "Root Cause:" heading.

 

You could probably modify this regex to split into more columns if you know there's consistently 3+ carriage returns, but I'll leave that to you as you'll have a better idea about the data you're dealing with. I think this is roughly the idea of how I'd first try to approach the problem.
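
For anyone reading without the attachment, here is a small Python sketch of the two ideas: flattening line breaks and accepting either "-" or ":" after a field name. It takes a slightly different route from the workflow by replacing every run of line breaks with a space instead of splitting into a fixed number of columns. `normalize_note` and `field_after` are hypothetical helper names, and the sample text is made up for illustration.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

def normalize_note(note):
    """Replace each run of carriage returns / line feeds with a single space."""
    return re.sub(r"\s*[\r\n]+\s*", " ", note).strip()

def field_after(label, text):
    """Return the text after `label` and a ':' or '-' separator, up to the
    next labelled key phrase or the end of the text; None if not found."""
    pattern = (
        label
        + r"\s*[:\-]\s*(.*?)"
        + r"(?=\s*resolution (?:plan|date)\s*[:\-]|$)"
    )
    m = re.search(pattern, text, FLAGS)
    return m.group(1) if m else None

# Hypothetical Note with line breaks and mixed separators.
raw = (
    "Root Cause-Past reconcilers were unable\r\nto clear items.\r\n"
    "Resolution Plan-Working with analyst. "
    "Resolution date:  Expected to clear by 12/31/2018"
)
clean = normalize_note(raw)

print(field_after("root cause", clean))       # Past reconcilers were unable to clear items.
print(field_after("resolution plan", clean))  # Working with analyst.
print(field_after("resolution date", clean))  # Expected to clear by 12/31/2018
```

Unlike the output requested earlier in the thread, this sketch returns the values without their labels. Requiring a separator after the next key phrase in the lookahead also keeps text such as "No resolution date established" from being cut short.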

 

regexfreeformparse.PNG

 

Let me know if this helps,

 

Cheers!
