Alteryx Designer Desktop Discussions

SOLVED

Help! Regex Tokenize multiple expression

Ultralightbeam
8 - Asteroid

GY12_Word_31Dec20_Word2_d1.10_G12_1234

 

I have an underscore-delimited string with 7 to 9 segments, and I want to check whether each segment is in the proper format or has an appropriate value.

I am using RegEx Tokenize, but for some reason it doesn't seem to work.

 

Current Value | How to check if it's correct
GY20 | First two letters should be GY followed by two digit number 18 to current last two digit of year
Word | Any word must not contain special characters excluding period and underscore.
31Dec20 | \d{2}\w+\d{2,4}
Word2 | Contains word 'Licensing' or any 3 character length
d1.10 | left must be lowercase of v and second number must be a period.
G12 | G\d{1,2}
1234 | \d{3,4}

 

Is it possible to write this as one regex expression, or should I split to columns and then apply a regex expression to each?

Tyro_abc
11 - Bolide

attached workflow

Regards 
Arundhuti

Ultralightbeam
8 - Asteroid

@Tyro_abc  I need to get the correct pattern for each since the pattern for each row is different. 

Qiu
21 - Polaris

@Ultralightbeam 

It seems you should use REGEX_Match instead.

However, you said there are 7 to 9 underscore-delimited segments, which makes things complicated.

Is there a fixed pattern for each of the 7-, 8-, and 9-segment cases separately?

Ultralightbeam
8 - Asteroid

@Qiu 

 

Basically, the count varies from 7 to 9 because of Word1: sometimes Word1 itself contains two extra delimiters, and those pieces can be concatenated into one.

 

The original, standard format is GY12_Word_31Dec20_Word2_d1.10_G12_1234 (7 segments).

 

Sometimes it looks like GY12_Word_Word.1_Word.2_31Dec20_Word2_d1.10_G12_1234

 

Therefore Word_Word.1_Word.2 must be treated as one pattern: any word that contains no special characters other than period and underscore. (There is really no pattern for this.) I am thinking of just using (\w+).
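One caveat on (\w+): in Perl-style regex engines, \w covers letters, digits, and the underscore, but not the period, so it would stop at the first "." in a segment such as Word.1. The sketch below illustrates this with Python's re module standing in for the Alteryx RegEx tool (an assumption made for illustration only); the helper name matches_whole is hypothetical.

```python
import re

# Sample middle segment taken from the post above.
MIDDLE = "Word_Word.1_Word.2"

def matches_whole(pattern: str, text: str) -> bool:
    """Return True only if the pattern consumes the entire text."""
    return re.fullmatch(pattern, text) is not None

# \w covers letters, digits and underscore, but not the period.
print(matches_whole(r"\w+", MIDDLE))     # False
# A character class that also allows the period spans all three parts.
print(matches_whole(r"[\w.]+", MIDDLE))  # True
```

If periods can appear in the word segment, a class such as [\w.]+ is probably closer to the stated rule than \w+ alone.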

 

 

Ultralightbeam
8 - Asteroid

@Qiu I actually got it with (\w+). My next problem is Word2, which should either equal Licensing or be three characters long; both should be accepted.

Qiu
21 - Polaris

@Ultralightbeam 

like this?

Licensing|\w{3}

 

Ultralightbeam
8 - Asteroid

The word should either contain "Licensing" or be 3 characters long (LCA).

 

Both should be captured.

I have the word Licensing,

and in some instances there is the word LCA.

Qiu
21 - Polaris

@Ultralightbeam 

So the one I gave suits your requirement.

Licensing or \w{3}.

"|" means either.
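A small illustration of the alternation, again using Python's re module as a stand-in for the Alteryx RegEx tool (an assumption). It also shows why the alternation is best grouped and matched against the whole segment: left unanchored, \w{3} is satisfied by the first three characters of any longer word. The name is_valid_word2 is hypothetical.

```python
import re

# Group the alternation so it can be matched against a whole segment.
WORD2 = re.compile(r"(?:Licensing|\w{3})")

def is_valid_word2(segment: str) -> bool:
    """True if the segment is exactly 'Licensing' or exactly 3 word characters."""
    return WORD2.fullmatch(segment) is not None

print(is_valid_word2("Licensing"))  # True
print(is_valid_word2("LCA"))        # True
print(is_valid_word2("License"))    # False

# Unanchored search: \w{3} happily matches the start of a longer word.
print(re.search(r"Licensing|\w{3}", "License").group())  # Lic
```

Inside a longer expression, the surrounding underscores play the role of the anchors, as long as the alternation is wrapped in parentheses.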

Tyro_abc
11 - Bolide

Another try; it might need some more fine-tuning, but it works with my sample data.

 


 

 

(GY[1-2][\d])_(\w+)_(\d{2}[a-zA-Z]{3}\d{2,4})_(Licensing|[A-Z]{3})_(\l\d\.\d{2})_(G\d{1,2})_(\d{3,4})
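For readers who want to try the expression outside Alteryx, here is a rough Python adaptation. Two things are changed and are assumptions of this sketch, not part of the expression above: Python's re module has no \l escape, so [a-z] is used for the lowercase letter, and the second group uses [\w.]+ so that periods inside the word segment are accepted. The sample strings use LCA and Licensing in place of the placeholder Word2.

```python
import re

# Adapted pattern: [a-z] replaces \l, and [\w.]+ replaces \w+ (both are assumptions).
PATTERN = re.compile(
    r"(GY[1-2]\d)"                # GY plus two digits, first digit 1 or 2
    r"_([\w.]+)"                  # word segment, may itself contain _ and .
    r"_(\d{2}[a-zA-Z]{3}\d{2,4})" # date such as 31Dec20
    r"_(Licensing|[A-Z]{3})"      # Licensing or three capital letters
    r"_([a-z]\d\.\d{2})"          # version such as d1.10
    r"_(G\d{1,2})"                # G plus one or two digits
    r"_(\d{3,4})"                 # three or four digits
)

def parse(code: str):
    """Return the seven captured groups, or None if the string does not fit."""
    m = PATTERN.fullmatch(code)
    return m.groups() if m else None

print(parse("GY12_Word_31Dec20_LCA_d1.10_G12_1234"))
print(parse("GY12_Word_Word.1_Word.2_31Dec20_Licensing_d1.10_G12_1234"))
```

In the second call the greedy word group absorbs Word_Word.1_Word.2, so the longer variant still yields seven groups.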

 

 

Regards

Arundhuti
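On the original question of one expression versus splitting first: a single expression only reports pass or fail for the whole string, and GY[1-2]\d accepts 10 to 29 instead of "18 to the current year". A split-then-check approach can report each segment separately and compare the year numerically. The following is only a sketch in Python, not an Alteryx workflow; the function name validate, the result keys, and the fixed date used in the example are all hypothetical.

```python
import re
from datetime import date

def validate(code: str, today: date = None) -> dict:
    """Check each underscore-delimited segment and report pass/fail per rule."""
    parts = code.split("_")
    if not 7 <= len(parts) <= 9:
        return {"segment_count": False}
    prefix = parts[0]
    # Everything between the prefix and the last five segments is the word part.
    word = "_".join(parts[1:-5])
    date_s, word2, version, g_code, number = parts[-5:]
    current_yy = (today or date.today()).year % 100
    m = re.fullmatch(r"GY(\d{2})", prefix)
    return {
        "segment_count": True,
        "prefix": m is not None and 18 <= int(m.group(1)) <= current_yy,
        "word": re.fullmatch(r"[\w.]+", word) is not None,
        "date": re.fullmatch(r"\d{2}[A-Za-z]{3}\d{2,4}", date_s) is not None,
        "word2": re.fullmatch(r"Licensing|\w{3}", word2) is not None,
        "version": re.fullmatch(r"[a-z]\d\.\d{2}", version) is not None,
        "g_code": re.fullmatch(r"G\d{1,2}", g_code) is not None,
        "number": re.fullmatch(r"\d{3,4}", number) is not None,
    }

# Hypothetical "current date" so the example is reproducible.
result = validate("GY20_Word_31Dec20_LCA_d1.10_G12_1234", today=date(2021, 1, 1))
print(result)
```

Each key that comes back False points at the segment that needs attention, which a single combined expression cannot do.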
