
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Regex - extracting text after multiple words

RogerA
8 - Asteroid

Hello, I'm new to regex and have been reading a lot of forum posts trying to figure out what I need to do, with no luck. I'm sure this is simple, so apologies in advance!

 

I want to extract all words after several different identifiers in the data, so example data would be something like:

 

Supplier Shop XY Green Cat

Customer XY Blue Dog

Supplier AB Pink Snake

Wholesaler GH Black Frog

 

How can I use regex to parse all the text after XY, AB, or GH? I can do this individually by using XY +(.*), but how can I write it so that it searches on all of these values?

 

 

11 REPLIES
Inactive User
Not applicable

^.+\s\u\u\s(.*)$

ivoller
12 - Quasar

If you only want to do this for specific words, then perhaps something like

 

^.+\s(?:XY|AB|GH)\s(.*)$

 

Cheers,

Iain
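
An alternation group such as (?:XY|AB|GH) matches one whole identifier from a fixed list. Below is a rough Python sketch of that idea on the sample rows from the question. It uses Python's re module rather than the Alteryx RegEx tool, and the names IDENTIFIERS and text_after_identifier are made up for illustration.

```python
import re

# Fixed identifiers from the question; extend this list as needed.
IDENTIFIERS = ["XY", "AB", "GH"]

# (?:XY|AB|GH) matches one whole identifier. The leading \b and the
# trailing \s+ keep "AB" from matching inside a longer word.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(i) for i in IDENTIFIERS) + r")\s+(.*)$"
)


def text_after_identifier(row):
    """Return the text after the first identifier, or None if none is found."""
    match = PATTERN.search(row)
    return match.group(1) if match else None


rows = [
    "Supplier Shop XY Green Cat",
    "Customer XY Blue Dog",
    "Supplier AB Pink Snake",
    "Wholesaler GH Black Frog",
]

for row in rows:
    print(text_after_identifier(row))
```

This only helps when the identifiers are known in advance; a row containing none of them returns None.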

 


 

RogerA
8 - Asteroid
Thanks, that's worked on about 75% of my data. I've noticed that there are also some 3-character uppercase identifiers too.

Is it possible to account for different lengths of uppercase characters?
RogerA
8 - Asteroid
Thanks Iain! I'm out right now but will try this when I'm home.
Inactive User
Not applicable

^.+\s\u\u+\s(.*)$

RogerA
8 - Asteroid

I've not managed to get either solution working perfectly, but Ryan's was closest. It didn't work completely, though, as it extracts only the last word, not all the words after the uppercase values.
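
One possible cause of the "last word only" result is case-insensitive matching. This is a guess, not something confirmed in the thread. If the match ignores case, an "uppercase" class also matches ordinary words such as "Green", and the greedy ^.+ then pushes the match as far right as it can. The Python sketch below reproduces that effect. Python's re module has no \u shorthand, so [A-Z] stands in for it here.

```python
import re

# [A-Z] stands in for \u ("uppercase letter"), which Python's re lacks.
PATTERN = r"^.+\s[A-Z][A-Z]+\s(.*)$"

row = "Supplier Shop XY Green Cat"

# Case-sensitive: only "XY" qualifies as the run of uppercase letters.
case_sensitive = re.match(PATTERN, row).group(1)

# Case-insensitive: "Green" also qualifies, and the greedy .+ prefers it
# because it sits further to the right.
case_insensitive = re.match(PATTERN, row, re.IGNORECASE).group(1)

print(case_sensitive)    # Green Cat
print(case_insensitive)  # Cat
```

If this is what is happening, it may be worth checking whether the RegEx tool is configured to ignore case.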

RogerA
8 - Asteroid

Maybe some more representative data will help show why the above ideas are not working for me. My data has more variance in it:

 

EF1234 Pluto ABC Planet six

EF2345 Mars ABC Chocolate flavour

F-Type Jaguar DEF Animal jungle

A-Type Mercedes A70 Car types 2

DF-Class Casserole MK2 Dinner time tonight

 

The items in bold (ABC, DEF, A70 and MK2 above) are common throughout the data, and I need to extract everything after them. There is no formatting in the data itself; they are bolded just to highlight them above.

 

 

RogerA
8 - Asteroid

I have managed to get it working, though it's probably not the optimal solution. I did:

 

ABC(.*)|DEF(.*)|A70(.*)|MK2(.*) 

 

and then after this joined all the outputs using a formula.
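
With ABC(.*)|DEF(.*)|A70(.*)|MK2(.*), the result lands in a different capture group depending on which identifier matched, which is why the outputs need joining afterwards. The Python sketch below compares that with a single-group variant that wraps the identifiers in one non-capturing alternation. It is illustrative only: it uses Python's re module rather than the RegEx tool, and the helper names are made up.

```python
import re

rows = [
    "EF1234 Pluto ABC Planet six",
    "EF2345 Mars ABC Chocolate flavour",
    "F-Type Jaguar DEF Animal jungle",
    "A-Type Mercedes A70 Car types 2",
    "DF-Class Casserole MK2 Dinner time tonight",
]

# Pattern from the post: four capture groups, only one filled per row.
SEPARATE = re.compile(r"ABC(.*)|DEF(.*)|A70(.*)|MK2(.*)")

# Variant: the identifiers share one non-capturing group, so the text
# after the identifier is always in group 1.
COMBINED = re.compile(r"(?:ABC|DEF|A70|MK2)\s+(.*)")


def joined_groups(row):
    """Mimic 'join all the outputs': concatenate the groups, trim spaces."""
    match = SEPARATE.search(row)
    if match is None:
        return None
    return "".join(group or "" for group in match.groups()).strip()


def single_group(row):
    """Return group 1 of the combined pattern, or None if nothing matches."""
    match = COMBINED.search(row)
    return match.group(1) if match else None


for row in rows:
    print(joined_groups(row), "|", single_group(row))
```

On these five rows both approaches return the same text; the combined form just skips the join step.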

 


vishwa_0308
11 - Bolide

Hi @RogerA,

 

You can parse with the expression below; I hope this will work for all your given strings:

 

.*\s\w{2}\s(.*)|.*\s[A-Z]\d{2}\s(.*)|.*\s\w{3}\s(.*)

 

 

Thanks,

Vishwa
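
A quick way to see how the expression above behaves is to run it over the sample rows. The Python sketch below does that with Python's re module, which may differ in details from the RegEx tool. The helper name first_group is made up, and the last row tested is a hypothetical one, not from the thread. It is added to show that \w{3} matches any three-character word, not just the identifiers.

```python
import re

# Expression from the reply, unchanged.
PATTERN = re.compile(r".*\s\w{2}\s(.*)|.*\s[A-Z]\d{2}\s(.*)|.*\s\w{3}\s(.*)")


def first_group(row):
    """Return whichever capture group was filled, or None if no match."""
    match = PATTERN.match(row)
    if match is None:
        return None
    return next((g for g in match.groups() if g is not None), None)


rows = [
    "EF1234 Pluto ABC Planet six",
    "EF2345 Mars ABC Chocolate flavour",
    "F-Type Jaguar DEF Animal jungle",
    "A-Type Mercedes A70 Car types 2",
    "DF-Class Casserole MK2 Dinner time tonight",
]

for row in rows:
    print(first_group(row))

# Hypothetical row (not from the thread): "fox" is also a three-character
# word, and the greedy .* prefers the right-most candidate.
print(first_group("EF1234 Pluto ABC Red fox runs"))  # runs
```

The five rows from the thread come out as intended, but the hypothetical row shows the pattern keying on word length rather than on the identifiers themselves, so rows with short words after the identifier may be cut too late.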
