Hi,
My data set contains data that falls broadly into 3 categories:
1) Data with an AAAA number - e.g. ABC21864 - 123 - XYZ - AAAA-5VF4Z0 - ABCDEF - FFF+ZZZ - TTFN + FDA
2) Data with parentheses - e.g. abcd\ABC123456 - XYZ - 1234-12-1 ABC DEF (abcdef)
3) Data without either of the above - e.g. ABC1904 - 4MGB - BSOD - ABCDEFGHIJKLMNOP - ABCDEFGH Carrara - TTFN + FDA
I am trying to parse these using the RegEx tool as below:
1) Parse up to the end of AAAA number
2) Parse up to and excluding the opening parenthesis
3) Parse the first 42 characters from left
I am using the RegEx expression:
(ABC.*?AAAA-\w+|ABC.+?\(|ABC.{42})
With this expression, the middle alternative matches up to and including the opening parenthesis instead of stopping before it.
Please advise how to parse up to (but excluding) the first parenthesis.
Thank you
Use (ABC.*?AAAA-\w+|[^\(]*|ABC.{42}). The [^\(]* part matches every character that is not a "(", so it stops just before the first one.
Dan
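The behaviour of these patterns can be checked outside Alteryx. Below is a minimal sketch in Python's re module, which may differ from the Alteryx RegEx tool in details such as default case sensitivity. The names samples, original, suggested, lookahead and extract are hypothetical, and the lookahead variant is an assumed alternative that does not come from the thread. One thing the sketch shows: because [^\(]* can match zero characters, that branch always succeeds, so in Python the ABC.{42} branch is never reached and a string with no parenthesis comes back whole.

```python
import re

# Sample strings copied from the question, one per category.
samples = [
    "ABC21864 - 123 - XYZ - AAAA-5VF4Z0 - ABCDEF - FFF+ZZZ - TTFN + FDA",
    r"abcd\ABC123456 - XYZ - 1234-12-1 ABC DEF (abcdef)",
    "ABC1904 - 4MGB - BSOD - ABCDEFGHIJKLMNOP - ABCDEFGH Carrara - TTFN + FDA",
]

# Pattern from the question: the middle branch consumes the "(" itself.
original = re.compile(r"(ABC.*?AAAA-\w+|ABC.+?\(|ABC.{42})")

# Pattern from the reply: [^\(]* stops before "(", but it can also match
# nothing at all, so the third branch is never tried.
suggested = re.compile(r"(ABC.*?AAAA-\w+|[^\(]*|ABC.{42})")

# Assumed alternative (not from the thread): the lookahead (?=\() requires
# a "(" to follow but leaves it out of the match, so the third branch is
# still available when there is no parenthesis.
lookahead = re.compile(r"(ABC.*?AAAA-\w+|ABC[^(]*(?=\()|ABC.{42})")


def extract(pattern, text):
    """Return the first captured match of pattern in text, or None."""
    m = pattern.search(text)
    return m.group(1) if m else None


for text in samples:
    print(repr(extract(original, text)))
    print(repr(extract(suggested, text)))
    print(repr(extract(lookahead, text)))
    print()
```

The lookahead match for the second sample still ends with the space that precedes the parenthesis, so a trim step afterwards may be needed. Note also that ABC.{42} captures "ABC" plus the next 42 characters, 45 in total.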
Hi Dan,
I am wondering whether you can help with the issue below.
I am trying to parse three sections of a string into separate columns (screen snip I). When the three expressions are combined into a single line, the data doesn't parse and null values are returned (screen snip II).
Are you able to assist?
Screen snip I
Screen snip II -
(ABC\d+)(\<XYZ|xy*)(AAAA-\w+|AYAA-\w+)
Thank you
Uthpala
Your combined expression doesn't match the groups in your input string because it doesn't take into account the extra characters between them. It would only match a string where the three parts sit directly next to each other, such as "ABC2260xyAAAA-5LNI5G - TEST OUTPUT 12345 (12)". Try something like this:
(ABC\d+).*?(\<XYZ|xy*).*?(AAAA-\w+|AYAA-\w+)
Dan
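The difference between the two combined expressions can be sketched in Python's re module. This is an illustration under assumptions, not the Alteryx configuration from the screenshots: Python has no \< word-start anchor, so \b is substituted here, matching is case-sensitive, and the input string spaced is a hypothetical stand-in for the data in the screen snips. The names tight and gapped are also hypothetical.

```python
import re

# Expression without gaps: the three groups must be directly adjacent.
tight = re.compile(r"(ABC\d+)(\bXYZ|xy*)(AAAA-\w+|AYAA-\w+)")

# Expression with lazy gaps: .*? skips the characters between the groups.
gapped = re.compile(r"(ABC\d+).*?(\bXYZ|xy*).*?(AAAA-\w+|AYAA-\w+)")

# Hypothetical input with separators between the three parts.
spaced = "ABC2260 - xyz - AAAA-5LNI5G - TEST OUTPUT 12345 (12)"

# Hypothetical input with the three parts directly adjacent.
adjacent = "ABC2260xyAAAA-5LNI5G - TEST OUTPUT 12345 (12)"

print(tight.search(spaced))            # None: the separators block the match
print(tight.search(adjacent).groups())
print(gapped.search(spaced).groups())
```

In this sketch the second group captures only "xy" from "xyz", because xy* means an "x" followed by any number of "y" characters.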
Hi Dan,
Thanks so much for this.
It works perfectly if all three instances are given in the text string.
However, if not all three instances are present in the string, then the columns appear null.
Is there a way to fix this?
Uthpala
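One possible way to handle missing parts, which is not taken from the thread, is to search for each part on its own instead of in one combined expression, so that a missing part leaves only its own column empty. The sketch below shows the idea in Python's re module; PARTS, parse_parts and the column names are hypothetical, \b again stands in for \<, and the sample strings are made up for illustration. In Alteryx the rough equivalent would presumably be one parse step per column, which is an assumption rather than something the thread confirms.

```python
import re

# One pattern per output column (hypothetical column names).
PARTS = {
    "code": re.compile(r"ABC\d+"),
    "marker": re.compile(r"\bXYZ|xy*"),
    "ref": re.compile(r"AAAA-\w+|AYAA-\w+"),
}


def parse_parts(text):
    """Search for each part independently; a missing part becomes None."""
    result = {}
    for name, pattern in PARTS.items():
        m = pattern.search(text)
        result[name] = m.group(0) if m else None
    return result


# Made-up inputs: all parts present, only the first, only the last.
print(parse_parts("ABC2260 - xyz - AAAA-5LNI5G"))
print(parse_parts("ABC2260 - TEST OUTPUT"))
print(parse_parts("note AYAA-77Q"))
```

The trade-off is that independent searches no longer enforce the order of the three parts within the string, which the combined expression did.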