RegEx Pattern Desired Outputs Part 2

JonaV90
7 - Meteor

Hi,

 

Following up on my last answered question -> https://community.alteryx.com/t5/Alteryx-Designer-Discussions/RegEx-Pattern-Desired-Ouputs/m-p/10865...

 

I have another task, to pull out the desired outputs from the attached data set. I need 6 different outputs in this instance where there are many combinations of the data.

 

Please see below a sample of patterns from the data set.

 

Can someone help me pull out these 6 different columns from the one string column using one RegEx tool?

 

 

JonaV90_0-1677110757326.png

 

Thanks,

 

 

BS_THE_ANALYST
14 - Magnetar

@JonaV90 I won't lie, this was tough due to all the variation. The RegEx I conjured up is wild. I got everything to match except one missing piece (highlighted in the picture below). This seems to be as far as my RegEx ability takes me at the moment.

BS_THE_ANALYST_0-1677171320021.png

This was the RegEx: 

^([\w]+)\s(.*?)(?=\(|incl\.|[A-Z]$)(?:\(([a-zA-Z]+)-?(.*?)\))?(?:.*(incl.*))?([A-Z])$

Hopefully that helps. 
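For anyone who wants to try the pattern outside Alteryx, here is a minimal Python sketch. The pattern is copied unchanged from above; the two sample strings are made up for illustration (the real data is only visible in the screenshot), and `extract` is a hypothetical helper name.

```python
import re

# Pattern copied unchanged from the post above (split over two lines only).
PATTERN = re.compile(
    r"^([\w]+)\s(.*?)(?=\(|incl\.|[A-Z]$)"
    r"(?:\(([a-zA-Z]+)-?(.*?)\))?(?:.*(incl.*))?([A-Z])$"
)

# Hypothetical rows: assumed to resemble the data, not taken from it.
samples = [
    "AB123 Widget Pro (EU-2021) incl. New DSP P",
    "AB123 Widget R",
]

def extract(text):
    """Return the six capture groups, or None if the row does not match."""
    m = PATTERN.match(text)
    return m.groups() if m else None

for s in samples:
    # repr() makes any leading or trailing spaces in a capture visible.
    print([repr(g) for g in extract(s)])
```

On these made-up rows the second and fifth captures keep a trailing space (for example `'Widget Pro '`), so some trimming afterwards is probably still needed.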


When building RegEx I think regex101.com is really nice. I used it for this:

BS_THE_ANALYST_0-1677171801908.png

https://regex101.com/r/uy5EE1/1 

 

 

JonaV90
7 - Meteor

Great, thanks! I cleaned everything else up with a Data Cleansing tool and a Formula tool.

OllieClarke
15 - Aurora

@JonaV90 

 

I think doing this in a single RegEx statement is asking for trouble - and filters/conditional formulae would be a preferred solution here.
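To give a rough idea of what the step-by-step alternative could look like, here is a sketch in Python rather than Alteryx tools. It assumes each string is laid out as code, description, optional "(letters-detail)" group, optional "incl. ..." text, and optional trailing P or R. The function name and the six field names are hypothetical, since the real column names are only in the screenshot.

```python
import re

def split_record(text):
    """Peel one piece off the end of the string per step (hypothetical helper)."""
    text = text.strip()
    out = {"code": None, "description": None, "prefix": None,
           "detail": None, "extra": None, "flag": None}

    # Step 1: a trailing P or R that is not the last letter of an acronym.
    m = re.search(r"(?<![A-Z])[PR]$", text)
    if m:
        out["flag"] = m.group(0)
        text = text[:m.start()].rstrip()

    # Step 2: everything from "incl." to the end of what is left.
    idx = text.find("incl.")
    if idx != -1:
        out["extra"] = text[idx:]
        text = text[:idx].rstrip()

    # Step 3: a trailing "(letters-detail)" group.
    m = re.search(r"\(([A-Za-z]+)[-\s]*(.*?)\)$", text)
    if m:
        out["prefix"], out["detail"] = m.group(1), m.group(2)
        text = text[:m.start()].rstrip()

    # Step 4: the first word is the code, the rest is the description.
    code, _, description = text.partition(" ")
    out["code"] = code
    out["description"] = description.strip() or None
    return out

print(split_record("AB123 Widget Pro (EU-2021) incl. New DSP P"))
```

Each step is small enough to test and change on its own, which is the main appeal over one long expression.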

 

Having said that, I took @BS_THE_ANALYST's RegEx and tweaked it to get your desired output

^([\w]+)\s(.*?)\s*(?=\(|incl\.|[A-Z]$)(?:\(([a-zA-Z]+)[-\s]*(.*?)\))?\s*((?:incl\. New [DC]SP)?.*?)((?<![A-Z])[PR])?$

OllieClarke_0-1677174722955.png

I hope no one ever has to inherit it...
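If it helps whoever does inherit it, here is a small Python sketch that runs the tweaked pattern. The pattern is copied unchanged; the sample strings are invented for illustration and `extract` is a hypothetical helper name.

```python
import re

# Tweaked pattern copied unchanged from above (split over three lines only).
TWEAKED = re.compile(
    r"^([\w]+)\s(.*?)\s*(?=\(|incl\.|[A-Z]$)"
    r"(?:\(([a-zA-Z]+)[-\s]*(.*?)\))?\s*"
    r"((?:incl\. New [DC]SP)?.*?)((?<![A-Z])[PR])?$"
)

# Hypothetical rows: assumed to resemble the data, not taken from it.
samples = [
    "AB123 Widget Pro (EU - 2021) incl. New DSP",
    "AB123 Widget Pro R",
]

def extract(text):
    """Return the six capture groups, or None if the row does not match."""
    m = TWEAKED.match(text)
    return m.groups() if m else None

for s in samples:
    print(extract(s))
```

On these rows the `\s*` pieces sit outside the capture groups, so the spaces around the bracketed part are consumed without ending up in any column.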

 

Ollie

BS_THE_ANALYST
14 - Magnetar

@OllieClarke nice! I haven't learned about Look Behinds yet. Thanks for this.
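A quick way to see what the look-behind does is to compare the pattern with and without it. This is a minimal Python sketch; the test strings are made up and `trailing_flag` is a hypothetical helper name.

```python
import re

# Without the look-behind: any final P or R counts as the flag.
plain = re.compile(r"[PR]$")
# With the negative look-behind: the final P or R only counts when the
# character before it is not an uppercase letter.
guarded = re.compile(r"(?<![A-Z])[PR]$")

def trailing_flag(pattern, text):
    """Return the trailing flag letter matched by pattern, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None

for text in ["incl. New DSP", "Widget Pro R"]:
    print(text, "|", trailing_flag(plain, text), "|", trailing_flag(guarded, text))
```

The look-behind checks the preceding character without consuming it, so the last letter of "DSP" is not mistaken for a separate P flag.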

 

JonaV90
7 - Meteor

@OllieClarke can you share the Alteryx workflow for this?

 

Thanks,

OllieClarke
15 - Aurora

@JonaV90 sure:

JonaV90
7 - Meteor

@OllieClarke I have to say, your RegEx is cleaner in that the columns come out without leading or trailing spaces. Thanks a lot!
