Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

REGEX

DoneF
6 - Meteoroid

Hi, I'm fairly new to Alteryx and have inherited some REGEX code that is not working quite how I need it to. Within a string I'm looking for an IBAN code which always starts with 2 letters (representing a country) and then depending on the country this may be followed by 2 number 4 letters 17 number or any sort of combination of letters and numbers. The REGEX that I have successfully finds the IBAN when it is at the start of the string, which contain pipe de-limiters, but not in  the middle of a string.

For example, in this string  |DE9999991112222333344|ANYWHERE, ANY|ANYROAD|ANYTOWN the regex will identify the IBAN DE9999991112222333344

but not in this string ABC LIMITED|DE9999991112222333344|DEKTDE7G|ANYWHERE, ANYWHERE RD|AW - ANYWHERE - ANYCOUNTRY.

There is a two part process the first regex Filters out anything not recognised as an IBAN

REGEX_Match(replace([Concat_PAYM_NOTE],' ',''), '\|*/*([a-zA-Z]{2})\d{2}\w{4}\d{0,7}(\w?){0,16}/*.*?\|*.*')

and then it goes into the next one which separately identifies the 2 digit country code at the beginning and the IBAN code as a whole. CODE = REGEX_replace(replace([Concat_PAYM_NOTE],' ' ,''), '\|*/*([a-zA-Z]{2})\d{2}\w{4}\d{0,7}(\w?){0,16}/*.*?\|*.*','$1')

For info, CONCAT_PAYM_NOTEr is the name of the field

Any pointers, gratefully received. Thanks for your time.

13 REPLIES 13
fmvizcaino
17 - Castor
17 - Castor

Hi @DoneF ,

 

Attached is an example showing how to do it. Please check in your database if it is working for your full dataset.

 

Let me know if that works for you.

Best,

Fernando Vizcaino

DoneF
6 - Meteoroid

 Thanks for your help, it seemed to work on my first few tests but for this

ABU DHABI|ABU DHABI AE|AE123456712345671234567|CITIUS33XXX|ANY OLD THING|PO BOX HOLD HOLD

it returned ABU DHABI AE rather than AE123456712345671234567.

 

fmvizcaino
17 - Castor
17 - Castor

Hi @DoneF ,

 

You can use the following regex expression.

.*\|(.*)\|.*\|.*\|.*

 

Please mark the answer as correct if you think I achieve your expectations, as it would help other users with the same issues.

 

Best,

Fernando Vizcaino

DoneF
6 - Meteoroid

HI, I tried again and some work and some fail. I've attached a document showing the inputs. High-lighted in yellow works and RED font doesn't work. On the one that doesn't work I should be seeing CH1234567891234567891. Thanks for your time

afv2688
16 - Nebula
16 - Nebula

Hello @DoneF,

 

This seems to be working:

 

Untitled.png

 

Edit: Changed the formula to adapt maybe better to your needs:

 

REGEX_Replace([Field1], '(.*)?(\u{2}\d*\u*\d+)(\|.*)', '$2')

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Regards

DoneF
6 - Meteoroid

Hi, thanks for your time. This has worked really well for most of my issues, unfortunately my data set is 70,000 rows so it takes a bit of time to work through the different permutations that may occur. I have some examples where my IBAN has a '/' at the beginning and it doesn't recognise . my string looks like this

/GB11ABCD22222233333333|1/ANY-THINNG-HERE|2/A4-1205,OTHER STUFF HERE|2/ANYTHINGHERE|3/AA/ANYTHING|CITIUS33XXX

CD22222233333333 instead of GB11ABCD22222233333333

Would you be able to advise on that?

Any help gratefully appreciated.

afv2688
16 - Nebula
16 - Nebula

Hello @DoneF,

 

I updated the formula and now is working with your new example too:

 

REGEX_Replace([Field1], '(.*)?(\u{2})(\d{2})?(\u{4})?(\d+)(\|.*)', '$2$3$4$5')

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Regards

DoneF
6 - Meteoroid

Hi, I got the same results as before. Thanks

DoneF
6 - Meteoroid

Thanks for all your help. It seems that it can recognise an IBAN when the structure is something AE123123123123123 but struggles to recognise when the structure is GB12ABCD123456789. It's frustrating as it's very close to being how I need to see it.

Any further ideas gratefully received.

Labels