Alteryx Designer Desktop Discussions

How to parse sentences

maksu
6 - Meteoroid

Hello community,

 

I would like to parse the values in a column using the RegEx tool. I only need the word located between the numbers. In this case the results should be "car", "plane", "boat", and "road", respectively.

 

 

Orange MMM-9999 car 01.-12.22
Blue-899 plane 12.2022
Red LL KK-2222 boat  01.-12.22
Green G WWW-3333 road 05.22

 

Thank you in advance

6 REPLIES
binuacs
21 - Polaris

@maksu try the below regex

image.png

CoG
14 - Magnetar

Here is another example that should work, that uses "Tokenize" to handle multiple such wordings between numbers in any given row:

_Main.png

_Regex.png
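
For readers outside Alteryx, here is a rough Python sketch of the "Tokenize" idea. The expression in the screenshots is not reproduced here, so the pattern below is only an assumption: a run of letters with a digit before the preceding whitespace and a digit after the following whitespace. `re.findall` returns every captured group, which plays the role of one token per match.

```python
import re

# Assumed pattern (NOT the one from the screenshots): letters that sit
# between a digit + whitespace on the left and whitespace + digit on the right.
BETWEEN_NUMBERS = re.compile(r"(?<=\d)\s+([A-Za-z]+)\s+(?=\d)")

rows = [
    "Orange MMM-9999 car 01.-12.22",
    "Blue-899 plane 12.2022",
    "Red LL KK-2222 boat  01.-12.22",
    "Green G WWW-3333 road 05.22",
]


def tokenize(text):
    """Return every captured word in the row, one list entry per match."""
    return BETWEEN_NUMBERS.findall(text)


results = [tokenize(row) for row in rows]
for row, words in zip(rows, results):
    print(f"{row!r} -> {words}")
```

This only covers the four rows from the original question; a row with several such words would simply yield a longer list.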

maksu
6 - Meteoroid

hi @binuacs, thank you for your reply.

 

Your solution works, but I noticed that there are more combinations in the dataset. Could you please also help me with the ones below?

 

Orange MMM-9999 car 01.-12.22
Blue-899 plane 12.2022
Red LL KK-2222 boat  01.-12.22
Green G WWW-3333 road 05.22
Orange car 04.22
KKK.2222 car 06.2022
Red, Basel plane-9999 12.22
Green-3333 - boat
Green C3333 road 02.22-01.23
Green-green-plane 06.-08.22
Red plane 01.-12.22
plane-333 09.22

 

Thank you

binuacs
21 - Polaris

@maksu a regex works from the pattern in the given data. If your data follows several different patterns, it is very difficult to write a single regex. Does your actual data follow any pattern? Also, what are the expected results for the last 3 rows?

CoG
14 - Magnetar

To build on what @binuacs has already said, take for example the first and last rows of the most recent table you shared:

Orange MMM-9999 car 01.-12.22
plane-333 09.22

 

If the output for row 1 is "car" and the output for row 2 is supposed to be "plane", how is an algorithm supposed to know that you didn't want "Orange MMM" from row 1? Just like "plane" in row 2, it is followed by a hyphen and then numbers. Pinning down the exact structure of your data will be crucial to getting the information you need out of it.
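
The ambiguity can be seen with a small Python sketch. The rule used here, "a run of letters immediately followed by a hyphen and a digit", is a hypothetical one chosen only to illustrate the point; it is not a pattern anyone in the thread proposed.

```python
import re

# Hypothetical rule: letters immediately followed by a hyphen and a digit.
LETTERS_BEFORE_HYPHEN_NUMBER = re.compile(r"[A-Za-z]+(?=-\d)")

first_row = "Orange MMM-9999 car 01.-12.22"
last_row = "plane-333 09.22"

first_hit = LETTERS_BEFORE_HYPHEN_NUMBER.findall(first_row)
last_hit = LETTERS_BEFORE_HYPHEN_NUMBER.findall(last_row)

print(first_hit)  # ['MMM']   -> not the wanted "car"
print(last_hit)   # ['plane'] -> the wanted word
```

The same rule returns the wanted word for one row and an unwanted code for the other, so something else about the data (letter case, position, a known word list) has to break the tie.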

flying008
15 - Aurora

Hi, @maksu 

 

FYI.

 

(?:[[:alnum:]]\s|\-\s?|^)([a-z]+)(?=\s|$|\-\d)

 

 

录制_2024_01_02_10_28_48_318.gif

 

Txt | Get
Orange MMM-9999 car 01.-12.22 | car
Blue-899 plane 12.2022 | plane
Red LL KK-2222 boat  01.-12.22 | boat
Green G WWW-3333 road 05.22 | road
Orange car 04.22 | car
KKK.2222 car 06.2022 | car
Red, Basel plane-9999 12.22 | plane
Green-3333 - boat | boat
Green C3333 road 02.22-01.23 | road
Green-green-plane 06.-08.22 | plane
Red plane 01.-12.22 | plane
plane-333 09.22 | plane
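
The expression can also be checked outside Alteryx with a short Python sketch. Two assumptions apply: Python's `re` module does not support the POSIX class `[[:alnum:]]`, so it is swapped for `[A-Za-z0-9]` here, and the match is case-sensitive, which the results above appear to rely on (a case-insensitive match would pick up "Orange" in the first row).

```python
import re

# The posted pattern, with [[:alnum:]] rewritten as [A-Za-z0-9] for Python.
PATTERN_TEXT = r"(?:[A-Za-z0-9]\s|-\s?|^)([a-z]+)(?=\s|$|-\d)"
PATTERN = re.compile(PATTERN_TEXT)  # case-sensitive on purpose

rows = [
    "Orange MMM-9999 car 01.-12.22",
    "Blue-899 plane 12.2022",
    "Red LL KK-2222 boat  01.-12.22",
    "Green G WWW-3333 road 05.22",
    "Orange car 04.22",
    "KKK.2222 car 06.2022",
    "Red, Basel plane-9999 12.22",
    "Green-3333 - boat",
    "Green C3333 road 02.22-01.23",
    "Green-green-plane 06.-08.22",
    "Red plane 01.-12.22",
    "plane-333 09.22",
]


def get_word(text):
    """Return the first lowercase word the pattern captures, or None."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


extracted = [get_word(row) for row in rows]
for row, word in zip(rows, extracted):
    print(f"{row} | {word}")
```

Note that the pattern depends on the wanted word being all lowercase while the codes around it are uppercase or numeric, so it would need revisiting if the real data mixes case differently.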

 

BTW, if you already know all the keywords, you could use the Find/Replace tool to match them.

 

录制_2024_01_02_14_16_57_270.gif
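
As a rough Python equivalent of the keyword approach (not the Find/Replace tool's own logic), the sketch below searches each row for the first word from a known list. The keyword list is hypothetical and taken from the sample data; in practice it would come from a lookup table.

```python
import re

# Hypothetical keyword list, taken from the sample rows in this thread.
KEYWORDS = ["car", "plane", "boat", "road"]

# \b keeps "car" from matching inside a longer word such as "cargo".
KEYWORD_RE = re.compile(r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")\b")


def find_keyword(text):
    """Return the first known keyword found in the text, or None."""
    match = KEYWORD_RE.search(text)
    return match.group(1) if match else None


samples = [
    "Red, Basel plane-9999 12.22",
    "Green-3333 - boat",
    "Green-green-plane 06.-08.22",
    "cargo 12.22",
]
found = [find_keyword(s) for s in samples]
print(found)
```

This avoids guessing at the structure of the codes and dates entirely, at the cost of maintaining the keyword list.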
