Alteryx Designer Desktop Discussions

RegEx Pattern Desired Outputs Part 2

JonaV90
7 - Meteor

Hi,

 

Following up on my last answered question -> https://community.alteryx.com/t5/Alteryx-Designer-Discussions/RegEx-Pattern-Desired-Ouputs/m-p/10865...

 

I have another task, to pull out the desired outputs from the attached data set. I need 6 different outputs in this instance where there are many combinations of the data.

 

Please see below a sample of patterns from the data set.

 

Can someone help me pull out these 6 different columns from the one string column using one RegEx tool?

 

 

[Image: JonaV90_0-1677110757326.png]

 

Thanks,

 

 

8 REPLIES
BS_THE_ANALYST
15 - Aurora

@JonaV90 I won't lie, this was tough due to all the variation. The RegEx expression I conjured up is wild. I got everything to match except one missing piece (highlighted in the picture below). This is as far as my RegEx ability takes me at the moment.

[Image: BS_THE_ANALYST_0-1677171320021.png]

This was the RegEx: 

^([\w]+)\s(.*?)(?=\(|incl\.|[A-Z]$)(?:\(([a-zA-Z]+)-?(.*?)\))?(?:.*(incl.*))?([A-Z])$

Hopefully that helps. 
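
For anyone who wants to try the pattern outside Alteryx, here is a minimal Python sketch. The two sample strings are made up for illustration (the real data is only visible in the screenshot), and Python's `re` module is not the engine Alteryx uses, so treat the results as an approximation rather than a guarantee of what the RegEx tool returns.

```python
import re

# Pattern copied verbatim from the post above, split over two lines for width.
BS_PATTERN = re.compile(
    r"^([\w]+)\s(.*?)(?=\(|incl\.|[A-Z]$)"
    r"(?:\(([a-zA-Z]+)-?(.*?)\))?(?:.*(incl.*))?([A-Z])$"
)

# Hypothetical inputs, NOT taken from the original data set.
SAMPLES = [
    "A123 Widget Large (ABC-123) incl. New DSP P",
    "B456 Gadget R",
]


def parse(text):
    """Return the six capture groups as a tuple, or None if nothing matches."""
    match = BS_PATTERN.match(text)
    return match.groups() if match else None


if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), "->", parse(sample))
```

On these made-up inputs, groups 2 and 5 keep a trailing space, so some trimming after the parse is probably still needed.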


When building RegEx I think regex101.com is really nice. I used it for this:

[Image: BS_THE_ANALYST_0-1677171801908.png]

https://regex101.com/r/uy5EE1/1 

 

 

All the best,
BS

JonaV90
7 - Meteor

Great, thanks! I cleaned everything else up with a Data Cleansing tool and a Formula tool.

OllieClarke
16 - Nebula

@JonaV90 

 

I think doing this in a single RegEx statement is asking for trouble - and filters/conditional formulae would be a preferred solution here.

 

Having said that, I took @BS_THE_ANALYST's RegEx and tweaked it to get your desired output:

^([\w]+)\s(.*?)\s*(?=\(|incl\.|[A-Z]$)(?:\(([a-zA-Z]+)[-\s]*(.*?)\))?\s*((?:incl\. New [DC]SP)?.*?)((?<![A-Z])[PR])?$

[Image: OllieClarke_0-1677174722955.png]

I hope no one ever has to inherit it...
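
To make the expression a little easier to inherit, the sketch below restates it in Python's verbose mode with one comment per piece. The comments are my reading of the pattern, the sample strings are hypothetical, and Python's `re` is only a stand-in for the Alteryx engine, so this is an illustration rather than a drop-in replacement.

```python
import re

# One-line pattern copied verbatim from the post above.
ONE_LINE = re.compile(
    r"^([\w]+)\s(.*?)\s*(?=\(|incl\.|[A-Z]$)"
    r"(?:\(([a-zA-Z]+)[-\s]*(.*?)\))?\s*"
    r"((?:incl\. New [DC]SP)?.*?)((?<![A-Z])[PR])?$"
)

# Same pattern in verbose mode. Literal spaces are written as [ ] because
# re.VERBOSE ignores unescaped whitespace outside character classes.
ANNOTATED = re.compile(r"""
    ^([\w]+)                          # 1: leading token of word characters
    \s
    (.*?)                             # 2: free text, as short as possible
    \s*                               # swallow spaces so group 2 ends clean
    (?=\(|incl\.|[A-Z]$)              # stop before a bracket, incl., or a final capital
    (?:\(([a-zA-Z]+)[-\s]*(.*?)\))?   # 3 and 4: optional bracketed letters and remainder
    \s*
    ((?:incl\.[ ]New[ ][DC]SP)?.*?)   # 5: optional incl. New DSP or CSP, then any leftover
    ((?<![A-Z])[PR])?                 # 6: optional final P or R not preceded by a capital
    $
""", re.VERBOSE)

# Hypothetical inputs, NOT taken from the original data set.
SAMPLES = [
    "A123 Widget Large (ABC-123) P",
    "B456 Gadget incl. New CSP",
]


def parse(text, pattern=ANNOTATED):
    """Return the six capture groups as a tuple, or None if nothing matches."""
    match = pattern.match(text)
    return match.groups() if match else None


if __name__ == "__main__":
    for sample in SAMPLES:
        # Both spellings of the pattern should agree on every sample.
        assert parse(sample, ONE_LINE) == parse(sample, ANNOTATED)
        print(repr(sample), "->", parse(sample))
```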

 

Ollie

BS_THE_ANALYST
15 - Aurora

@OllieClarke nice! I haven't learned about Look Behinds yet. Thanks for this.
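
For reference, a negative lookbehind such as `(?<![A-Z])` checks the character just before the current position without consuming it. A small Python sketch of that one piece, using made-up strings and the hypothetical helper name `trailing_flag`:

```python
import re

# A final P or R, but only when the character before it is not a capital letter.
FLAG = re.compile(r"(?<![A-Z])[PR]$")


def trailing_flag(text):
    """Return the trailing P or R flag, or None when there is none."""
    match = FLAG.search(text)
    return match.group() if match else None


if __name__ == "__main__":
    # The P closing "DSP" follows a capital S, so it is not treated as a flag.
    for text in ["Widget P", "(ABC-123)R", "incl. New DSP"]:
        print(repr(text), "->", trailing_flag(text))
```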

 

All the best,
BS

JonaV90
7 - Meteor

@OllieClarke can you share the Alteryx workflow for this?

 

Thanks,

JonaV90
7 - Meteor

.

OllieClarke
16 - Nebula

@JonaV90 sure:

JonaV90
7 - Meteor

@OllieClarke I have to say, your RegEx is cleaner in that the columns come out without leading or trailing spaces. Thanks a lot!
