We are celebrating the 10-year anniversary of the Alteryx Community! Learn more and join in on the fun here.
Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Regex tool to parse data points in varying positions over thousands of rows of data

Deano_280
7 - Meteor

Hello all, requesting your expertise in how to navigate this issue I'm stuck on. 

 

I'm trying to parse a particular data point from a string. This is currently a proof of concept but the data set will significantly increase to cover thousands of rows of data so I'm looking for a concise way to do so which I can then use on other parts at a later time.

 

I've amended the data however it would be a similar format to this:

 

(Country Of Origin in (FRANCE, GERMANY, AUSTRIA, NETHERLANDS)) and (Local Currency in (Euro)) and (Safety Rating (Lowest, Across agencies, Regional/National, One or Both) = CCC or higher) and (Days Until Review is less than 10950)

 

My aim:

- Separate Country of Origin and the countries contained within the brackets. Please note, country of origin could be contained anywhere within the string so it won't always appear at the beginning.

- Additionally, I will be parsing out other parts of the string such as Local Currency and the currency contained within, Safety Rating all the way up to Higher, and the Days Until Review. Similar to Country of Origin, the other data points mentioned could fall anywhere within the string.

- A lot of strings won't contain the data points mentioned above, for example some will only contain Safety Rating and Days Until Review.

 

Please let me know if I can clarify anything further 

7 REPLIES 7
mceleavey
17 - Castor
17 - Castor

Hi @Deano_280 ,

 

You said it could be anywhere, can you provide a few examples that would be more representative of your data otherwise you're going to get correct solutions for what you've provided which don't fit your actual dataset.

 

M.



Bulien

Deano_280
7 - Meteor

Hi mceleavey,

 

Apologies, please find further below. There are certain attributes that can be included within these rows, however not all attributes will always be included.

 

There are the following attributes which can be included:

- Country of Origin.

- Safety Rating.

- Local Currency (attribute can be used twice).

- Days Until Review (attribute can be used twice).

 

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review is greater than 1825) and (Days Until Review = 3650 or lower)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review is greater than 3650)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review = 365 or lower)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 365) and (Days Until Review = 1825 or lower)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 1825) and (Days Until Review = 3650 or lower)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 3650)


(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) is less than BBB-) and (Days Until Review is greater than 3650) and (Days Until Review is less than 10950)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) is less than BBB-) and (Days Until Review = 3650 or lower)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review = 365 or lower)


(Country of Origin in (FRANCE, GERMANY)) and (Local Currency in (Euro)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = CCC or higher) and (Days Until Review is less than 10950)


(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review is greater than 365) and (Days Until Review = 1825 or lower)


(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review is greater than 1825) and (Days Until Review = 3650 or lower)


(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review = 365 or lower)


(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 365) and (Days Until Review = 1825 or lower)


(Country Of Origin in (FRANCE, GERMANY, AUSTRIA, NETHERLANDS)) and (Local Currency in (Euro)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = CCC or higher) and (Days Until Review is less than 10950)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Days Until Review = 366 or higher) and (Days Until Review = 1825 or lower)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Days Until Review = 1826 or higher) and (Days Until Review = 3650 or lower)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Days Until Review = 3651 or higher)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) is less than BBB-) and (Days Until Review = 3650 or lower)


(Safety Rating (Lowest, Across agency, Regional/National, One or Both) is less than BBB-) and (Days Until Review = 3651 or higher)


(Country Of Origin in (GERMANY, FRANCE)) and (Local Currency in (Euro)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Days Until Review is less than 10950)


(Country Of Origin in (UNITED STATES)) and (Local Currency in (US Dollar)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Days Until Review is less than 10950)


(Country Of Origin in (UNITED KINGDOM, SPAIN, NETHERLANDS)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Local Currency in (British Pound Sterling, Euro)) and (Days Until Review = 3650 or higher) and (Days Until Review = 10950 or lower)

 

Thank you!

Deano_280
7 - Meteor

Hi mceleavey,

 

Please find additional examples below. Apologies for being bunched up but having issues with it containing HTML.

 

Main attributes that will be included:

- Country of Origin.

- Local Currency.

- Safety Rating (attribute can appear twice, not always).

- Days Until Review ((attribute can appear twice, not always).

 

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review is greater than 1825) and (Days Until Review = 3650 or lower)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review is greater than 3650)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review = 365 or lower)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 365) and (Days Until Review = 1825 or lower)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 1825) and (Days Until Review = 3650 or lower)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 3650)

(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) is less than BBB-) and (Days Until Review is greater than 3650) and (Days Until Review is less than 10950)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) is less than BBB-) and (Days Until Review = 3650 or lower)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review = 365 or lower)

(Country of Origin in (FRANCE, GERMANY)) and (Local Currency in (Euro)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = CCC or higher) and (Days Until Review is less than 10950)

(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review is greater than 365) and (Days Until Review = 1825 or lower)

(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review is greater than 1825) and (Days Until Review = 3650 or lower)

(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review = 365 or lower)

(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 365) and (Days Until Review = 1825 or lower)

(Country Of Origin in (FRANCE, GERMANY, AUSTRIA, NETHERLANDS)) and (Local Currency in (Euro)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = CCC or higher) and (Days Until Review is less than 10950)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Days Until Review = 366 or higher) and (Days Until Review = 1825 or lower)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Days Until Review = 1826 or higher) and (Days Until Review = 3650 or lower)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Days Until Review = 3651 or higher)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) is less than BBB-) and (Days Until Review = 3650 or lower)

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) is less than BBB-) and (Days Until Review = 3651 or higher)

(Country Of Origin in (GERMANY, FRANCE)) and (Local Currency in (Euro)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Days Until Review is less than 10950)

(Country Of Origin in (UNITED STATES)) and (Local Currency in (US Dollar)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Days Until Review is less than 10950)

(Country Of Origin in (UNITED KINGDOM, SPAIN, NETHERLANDS)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Local Currency in (British Pound Sterling, Euro)) and (Days Until Review = 3650 or higher) and (Days Until Review = 10950 or lower)

 

Thank you!

Deano_280
7 - Meteor

Hi @mceleavey 

 

My responses keep disappearing after I have sent them, so I'll need to send a reduced list.

 

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 1825) and (Days Until Review = 3650 or lower)

 

(Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 3650)

 

(Country of Origin in (FRANCE, GERMANY)) and (Local Currency in (Euro)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = CCC or higher) and (Days Until Review is less than 10950)

 

(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review is greater than 365) and (Days Until Review = 1825 or lower)

 

(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = AA- or higher) and (Days Until Review is greater than 1825) and (Days Until Review = 3650 or lower)

 

(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review = 365 or lower)

 

(Country of Origin in (FRANCE, GERMANY)) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = BBB- or higher) and (Safety Rating (Lowest, Across agency, Regional/National, One or Both) = A+ or lower) and (Days Until Review is greater than 365) and (Days Until Review = 1825 or lower)

mceleavey
17 - Castor
17 - Castor

Hi @Deano_280 ,

 

thanks for that.

I've built this using the Regex tool to parse out the two requested sections. You didn't specify the output but I've done it so it parses to rows, then concatenates to the single row again with multiple columns, so you can pick and choose.

 

I built the workflow as follows:

 

workflow.jpg

With the regex configured as follows:

 

regex.jpg

I repeated this for both sections, changing only the leading text.

The remaining part of the workflow is to pivot the data accordingly, resulting in:

results.jpg

 

I've attached the workflow.

 

I hope this helps,

 

M.



Bulien

Deano_280
7 - Meteor

This is brilliant thank you so much!

 

Would you mind explaining how it is this expression works please? 

 

mceleavey
17 - Castor
17 - Castor

no problem.gif



Bulien

Labels
Top Solution Authors