Hi, I'm fairly new to Alteryx and have inherited some REGEX code that is not working quite how I need it to. Within a string I'm looking for an IBAN code which always starts with 2 letters (representing a country) and then depending on the country this may be followed by 2 number 4 letters 17 number or any sort of combination of letters and numbers. The REGEX that I have successfully finds the IBAN when it is at the start of the string, which contain pipe de-limiters, but not in the middle of a string.
For example, in this string |DE9999991112222333344|ANYWHERE, ANY|ANYROAD|ANYTOWN the regex will identify the IBAN DE9999991112222333344
but not in this string ABC LIMITED|DE9999991112222333344|DEKTDE7G|ANYWHERE, ANYWHERE RD|AW - ANYWHERE - ANYCOUNTRY.
There is a two part process the first regex Filters out anything not recognised as an IBAN
REGEX_Match(replace([Concat_PAYM_NOTE],' ',''), '\|*/*([a-zA-Z]{2})\d{2}\w{4}\d{0,7}(\w?){0,16}/*.*?\|*.*')
and then it goes into the next one which separately identifies the 2 digit country code at the beginning and the IBAN code as a whole. CODE = REGEX_replace(replace([Concat_PAYM_NOTE],' ' ,''), '\|*/*([a-zA-Z]{2})\d{2}\w{4}\d{0,7}(\w?){0,16}/*.*?\|*.*','$1')
For info, CONCAT_PAYM_NOTEr is the name of the field
Any pointers, gratefully received. Thanks for your time.
Solved! Go to Solution.
Hi @DoneF ,
Attached is an example showing how to do it. Please check in your database if it is working for your full dataset.
Let me know if that works for you.
Best,
Fernando Vizcaino
Thanks for your help, it seemed to work on my first few tests but for this
ABU DHABI|ABU DHABI AE|AE123456712345671234567|CITIUS33XXX|ANY OLD THING|PO BOX HOLD HOLD
it returned ABU DHABI AE rather than AE123456712345671234567.
Hi @DoneF ,
You can use the following regex expression.
.*\|(.*)\|.*\|.*\|.*
Please mark the answer as correct if you think I achieve your expectations, as it would help other users with the same issues.
Best,
Fernando Vizcaino
Hello @DoneF,
This seems to be working:
Edit: Changed the formula to adapt maybe better to your needs:
REGEX_Replace([Field1], '(.*)?(\u{2}\d*\u*\d+)(\|.*)', '$2')
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Regards
Hi, thanks for your time. This has worked really well for most of my issues, unfortunately my data set is 70,000 rows so it takes a bit of time to work through the different permutations that may occur. I have some examples where my IBAN has a '/' at the beginning and it doesn't recognise . my string looks like this
/GB11ABCD22222233333333|1/ANY-THINNG-HERE|2/A4-1205,OTHER STUFF HERE|2/ANYTHINGHERE|3/AA/ANYTHING|CITIUS33XXX
CD22222233333333 instead of GB11ABCD22222233333333
Would you be able to advise on that?
Any help gratefully appreciated.
Hello @DoneF,
I updated the formula and now is working with your new example too:
REGEX_Replace([Field1], '(.*)?(\u{2})(\d{2})?(\u{4})?(\d+)(\|.*)', '$2$3$4$5')
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Regards
Hi, I got the same results as before. Thanks
Thanks for all your help. It seems that it can recognise an IBAN when the structure is something AE123123123123123 but struggles to recognise when the structure is GB12ABCD123456789. It's frustrating as it's very close to being how I need to see it.
Any further ideas gratefully received.