I welcome your expertise. I have data formatted in a long text field in Excel and need to extract the number field (such as 590288-01 in the first example below.) My challenge is that this number is not necessarily in the same place within each record. The number typically, but not always has an extension of 01, 02, etc.
Sample records:
Originator To Beneficiary Info: /ROC/PLAN15987 OTabcW 590288-01 AMIGOIN FUNHOUSE, 23
Originator To Beneficiary Info: REFERENCE 1260422-01 BOB CASS IDY AND ASSOCIATES
Originator To Beneficiary Info: THE DIRECTED ACCOUNT PLAN PLAN 385222-02
Thank you!
Originator To Beneficiary Info: STATE OF BLISS #98988-01
Originator To Beneficiary Info: Applied Pumpkin Plan 45486901
Solved! Go to Solution.
Hi,
Based on your examples, I identified the following defining characteristics. If these are inaccurate, the below REGEX will not work.
the target number field is one of the following
a set of digits up to 6 in length followed by a dash ("-") and two digits
a set of digits of length 7 or 8 with no dashes or spaces.
((?:\d{0,6}-\d{2})|(?:\d{7,8}))
This regex looks for either a set of 0 to 6 digits, followed by a dash and 2 digits, OR a set of 7 or 8 digits.
Hope this helps!
Along with @Claje, I used a regex expression to create your desired output.
REGEX_Replace([Field1], ".*([0-9]{5,7}\-{0,1}[0-9]{2}).*", '$1')
\d is a shortcut for the set of characters between 0 and 9. I look for a group (enclosed within parenthesis) that comes after any characters where the group contains:
This solves for your desired results too.
Cheers,
Mark
Thank you! This is fabulous. This works for many of the records, but some of the records have other numbers in them in that are not part of the "Originator To Beneficiary Info:" string. These values are being returned in my output when I only need the value of 590288-01 below. For example, I have the following in one Excel field:
Originator To Beneficiary Info: /ROC/PLAN15987 OTabcW 590288-01 AMIGOIN FUNHOUSE, 23
Reference #: P482973
ID:893723
Thanks SO much!
When using my solution (above), I get 90288-01 as a result. So I made a minor tweak to the expression:
REGEX_Replace([Field1], ".*?([0-9]{5,8}\-{0,1}[0-9]{2}).*", '$1')
The ? will look to the first occurrence of the pattern. Now I get 590288-01
Cheers,
Mark
Thanks so much Mark, for your help. I appreciate it. I may be do something wrong (if so, I'm uncertain what...) because the parse returns no data. Alteryx says "0 records were successfully parsed..." Do you have any tips? Thank you!
here's my test module.
You are awesome! Thanks for your help. This is fabulous because it extracts the first string, but unfortunately the string I am seeking is not necessarily the first string in the field, but it is always preceded by "Originator To Beneficiary Info:" Sorry for so much back and forth, here's an example of the entire field that I am searching. I am trying to extract the string = 590283-01
R Number: 181117498
Fed Ref: 00905898724
Date/Time Received: 11/15/2018 10:02:00 AM
Originator: Amigonnafun 2600 SHERIDAN DRIVE TONAWANDA NY 14250
Receiving Bank: 1022349021 Chicago
Beneficiary: /10-9-9133286049
Beneficiary Ref: 124264289
Thank you!
Originator To Beneficiary Info: /ROC/PLAN15987 OTCFGW 590283-01 Pumpkintime. DXXXXX,,
I think that this is what you need:
REGEX_Replace([Field1], ".*Originator To Beneficiary Info:.*?([0-9]{5,8}\-{0,1}[0-9]{2}).*", '$1')
you can modify my prior workflow and use this formula.
Cheers,
Mark
You are a rock star/Alteryx star! This works. Thanks so very much. I'm very new to Alteryx and this is a huge help.
Many thanks to you both!