Hi,
Need help with a parse / regex question. My fields are trapped in a single field called RAW_TXT. In this case, I am trying to isolate the values (amounts) found at the end of RAW_TXT. I then parse the values (if more than one) into their respective fields in a later step.
I am using the expression (.*\u\S*) (.*), which works for 99% of the records. The exception is when the comment ends with "...line \d{1,3}" (see record 2 below). In these instances, I do NOT want to capture the digits immediately following "line".
How can I modify my expression to account for this?
RECORD | RAW_TXT | LINE_NMBR and TITLE | VALUES
1 | 58 E-3 172141 118068 | 58 E-53 | 172141 118068
2 | NATO AWACS -Air Force requested transfer to line 87 -36401 | -36401
3 | IPEC B-kit NRE unjustified growth -6593 | -6593
I might use the tokenize option for this. You can use a negative lookaround to avoid capturing the numbers that come directly after "line", but you may have to fiddle with it a bit if there are other scenarios like that.
Here is the explanation of the regex too: regexr.com/5j2fi
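For anyone who wants to try the negative-lookaround idea outside the workflow, here is a minimal sketch in Python's re module. It is not the expression from the link above; the pattern, the function name split_raw_txt, and the assumption that values are whole numbers with an optional leading minus sign are all mine. It anchors on the end of the string and refuses to treat a number as a value when it sits directly after "line" and a single whitespace character.

```python
import re

# Hypothetical pattern: a lazily matched title, then one or more trailing
# integers, none of which may be immediately preceded by "line" + one space.
TRAILING_VALUES = re.compile(
    r"^(.*?)((?:\s+(?<!line\s)-?\d+)+)\s*$",
    re.IGNORECASE,
)


def split_raw_txt(raw_txt):
    """Return (title, values) where values is the list of trailing amounts."""
    m = TRAILING_VALUES.match(raw_txt)
    if m is None:
        # No trailing amounts found: keep the text as the title.
        return raw_txt, []
    return m.group(1), m.group(2).split()


for row in (
    "58 E-3 172141 118068",
    "NATO AWACS -Air Force requested transfer to line 87 -36401",
    "IPEC B-kit NRE unjustified growth -6593",
):
    print(split_raw_txt(row))
```

The lookbehind is fixed-width, so "line" followed by two or more spaces before the number would slip through; that case would need its own handling. Lookbehind support and syntax can also vary between regex engines, so treat this as a starting point to adapt.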
Hope that helps,
Greg
Regex makes my eyes cross. For this I would use a formula instead:
reversestring(left(reversestring(RAW_TXT),findstring(reversestring(RAW_TXT),' ')))
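To make the logic of that formula easier to follow, here is a rough Python equivalent. The function name last_token is hypothetical, and returning the whole string when there is no space is my own choice, not necessarily what the formula does in that case. The idea is the same: reverse the text, take everything before the first space, and reverse it back.

```python
def last_token(raw_txt):
    """Return the text after the final space, mirroring the reverse/left/find idea."""
    reversed_txt = raw_txt[::-1]
    first_space = reversed_txt.find(" ")  # index of the last space in the original
    if first_space == -1:
        # Assumption: with no space at all, treat the whole string as the token.
        return raw_txt
    return reversed_txt[:first_space][::-1]


print(last_token("NATO AWACS -Air Force requested transfer to line 87 -36401"))
print(last_token("58 E-3 172141 118068"))
```

Note that this only returns the final space-delimited token, so for record 1 it yields 118068 and leaves 172141 behind in the text. Rows with more than one amount would need the step repeated or a different split.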