Alteryx Designer Desktop Discussions

SOLVED

Parsing phrase from text string

NH0520
5 - Atom

Hi,

I'm trying to parse two columns out of various text strings. The first column should hold the number (5 or five) and the second should hold the unit (year or years) from strings like the ones below. Each full string is anywhere from a sentence to a paragraph long.

 

 

1) have 5-year option

2) of 2, 5 years option

3) two five-year options

StephenR
Alteryx

Is it always going to be 5|five, or will you have other numbers that you want to pull out?

Regards,
Stephen Ruhl
Principal Customer Support Engineer

rarmstrong
8 - Asteroid

Use the RegEx tool with 'Parse' as the output method:

(.*)(year.*)
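To see what those two groups capture on the sample strings from the question, here is a rough sketch in Python. Python's `re` module only stands in for the RegEx tool here (an assumption, the two engines are not identical), and `split_at_year` is a made-up helper name.

```python
import re

# Sample fragments copied from the question.
samples = [
    " have 5-year option",
    "of 2, 5 years option",
    "two five-year options",
]

# Group 1: everything before "year". Group 2: "year" through the end.
pattern = re.compile(r"(.*)(year.*)")


def split_at_year(text):
    """Return (before, from_year), or None when "year" does not occur."""
    m = pattern.search(text)
    return m.groups() if m else None


for s in samples:
    print(repr(s), "->", split_at_year(s))
```

Note that group 1 still carries the surrounding text along with the number, so this splits the string at "year" rather than isolating the 5/five.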

StephenR
Alteryx

I would change the first group of that expression to:

(\d+|one|two|three|four|five|six|seven|eight|nine|ten)

If you need to go higher, you can add eleven through nineteen, then twenty, thirty, forty, and so on. It's long and awkward, but a cursory search didn't turn up a better option.

Regards,
Stephen Ruhl
Principal Customer Support Engineer

rarmstrong
8 - Asteroid

Ah, yes, I may have misread what the result needs to be. If the first column should contain only the number, then this is the only route I can think of. If the second column should contain only "year" or "years", change the second group to (years|year).
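Putting the two suggestions together might look roughly like the sketch below. It uses Python's `re` as a stand-in for the RegEx tool. The `\b` in front of the number group and the single `\W` separator between the groups are assumptions added for this sketch (neither appears in the replies above), and `parse_term` is a hypothetical helper name.

```python
import re

# Number group as suggested: digits or the spelled-out words one..ten.
NUMBER = r"\d+|one|two|three|four|five|six|seven|eight|nine|ten"

# Assumptions for this sketch: \b keeps "one" from matching inside a longer
# word, and \W stands for the single hyphen or space before "year".
pattern = re.compile(rf"\b({NUMBER})\W(years|year)")


def parse_term(text):
    """Return (number, unit) for the first match, or None."""
    m = pattern.search(text)
    return m.groups() if m else None


for s in [" have 5-year option", "of 2, 5 years option", "two five-year options"]:
    print(repr(s), "->", parse_term(s))
```

A word outside the list, such as "twelve", is not matched until it is added to the alternation.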

Claje
14 - Magnetar

This regex may be too brittle depending on your actual use case, but it works in this example:

(.*\W)(\w+)\W(years*)(.*)

RegExOut2 and RegExOut3 should be your number and the word year/years.
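For anyone who wants to check the groups outside Designer, here is a quick sketch with Python's `re` (an assumption, it only approximates the RegEx tool). `number_and_unit` is a made-up name, and groups 2 and 3 are meant to correspond to RegExOut2 and RegExOut3.

```python
import re

# The expression from this reply, unchanged.
pattern = re.compile(r"(.*\W)(\w+)\W(years*)(.*)")


def number_and_unit(text):
    """Return (group 2, group 3), or None when the pattern does not match."""
    m = pattern.search(text)
    return (m.group(2), m.group(3)) if m else None


for s in [" have 5-year option", "of 2, 5 years option", "two five-year options"]:
    print(repr(s), "->", number_and_unit(s))
```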


From there you can probably do some kind of conversion for the word "five".
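One possible shape for that conversion is a small lookup table, sketched here in Python. The table only covers one through ten, and `WORD_TO_NUMBER` and `to_int` are hypothetical names for illustration; inside Designer the same idea would need a lookup of your own.

```python
# Hypothetical lookup table; extend it if larger numbers appear in the data.
WORD_TO_NUMBER = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def to_int(token):
    """Convert "5" or "five" to 5; return None for anything unrecognised."""
    token = token.strip().lower()
    if token.isdecimal():
        return int(token)
    return WORD_TO_NUMBER.get(token)


print(to_int("5"), to_int("five"), to_int("twelve"))
```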

This assumes that the number of years always comes directly before the word "year", and that there is always a single non-alphanumeric character between the two. Depending on the data you are dealing with, you may need to modify this, or cleanse your data into this format in some way.
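To illustrate the single-separator assumption, the sketch below feeds the expression a made-up string with two spaces before "years". The loosened variant using `\W+` is a hypothetical modification, not something proposed in this thread, and Python's `re` again only stands in for the RegEx tool.

```python
import re

# Expression as posted: exactly one separator character before "year".
strict = re.compile(r"(.*\W)(\w+)\W(years*)(.*)")
# Hypothetical variant: one or more separator characters.
loose = re.compile(r"(.*\W)(\w+)\W+(years*)(.*)")


def number_and_unit(pattern, text):
    """Return (group 2, group 3), or None when the pattern does not match."""
    m = pattern.search(text)
    return (m.group(2), m.group(3)) if m else None


text = "a 5  years option"  # made-up input with two spaces before "years"
print(number_and_unit(strict, text))
print(number_and_unit(loose, text))
```

The strict form finds nothing in that string, while the loosened form still pulls out the number and the unit.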

Inactive User
Not applicable

If the sentence starts with the number, as in "5 year plan", the first group has to be made optional with "?":

(.*\W)?(\w+)(\W)(year)

 

Below are example results without and with the "?":

Without the "?":

Wthout.JPG

With the "?":

with.JPG
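The same comparison can be reproduced in text form. This is a sketch using Python's `re` as a stand-in for the RegEx tool (an assumption), with `number_before_year` as a made-up helper name.

```python
import re

# First group required versus optional; the rest of the expression is the same.
without_q = re.compile(r"(.*\W)(\w+)(\W)(year)")
with_q = re.compile(r"(.*\W)?(\w+)(\W)(year)")


def number_before_year(pattern, text):
    """Return group 2 (the token before "year"), or None if nothing matches."""
    m = pattern.search(text)
    return m.group(2) if m else None


for s in ["5 year plan", " have 5-year option"]:
    print(repr(s), number_before_year(without_q, s), number_before_year(with_q, s))
```

When the string starts with the number, only the version with "?" finds it. When there is text in front of the number, both versions behave the same.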
