Alteryx Designer Desktop Discussions

How to parse sentences

maksu
6 - Meteoroid

Hello community,

 

I would like to parse the values in a column using the RegEx tool. I only need the word located between the numbers. In this case the results should be "car", "plane", "boat", and "road", respectively.

Orange MMM-9999 car 01.-12.22
Blue-899 plane 12.2022
Red LL KK-2222 boat  01.-12.22
Green G WWW-3333 road 05.22

 

Thank you in advance

binuacs
21 - Polaris

@maksu try the below regex

[Screenshot: image.png]

CoG
14 - Magnetar

Here is another example that should work. It uses the "Tokenize" output method, so it can handle several such words between numbers in a single row:
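The regex from the screenshots is not reproduced in the text of this thread, so the following is only a rough Python sketch of the tokenize idea, not the expression actually used. The pattern `WORD_BETWEEN_NUMBERS` and the last sample row are made up for illustration; `re.findall` stands in for Tokenize by returning every match in a row.

```python
import re

# Hypothetical pattern (not the one from the screenshots): a lowercase word
# that has "digit + whitespace" before it and "whitespace + digit" after it.
WORD_BETWEEN_NUMBERS = re.compile(r"(?<=\d\s)[a-z]+(?=\s+\d)")

rows = [
    "Orange MMM-9999 car 01.-12.22",
    "Blue-899 plane 12.2022",
    "Red LL KK-2222 boat  01.-12.22",
    "Green G WWW-3333 road 05.22",
    "X-1 car 2 boat 3",  # made-up row with two words between numbers
]

def tokenize(text):
    """Return every word that sits between numbers, like a Tokenize output."""
    return WORD_BETWEEN_NUMBERS.findall(text)

for row in rows:
    print(row, "->", tokenize(row))
```

This only covers rows where the word is surrounded by numbers on both sides, which is the shape of the first four sample rows.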

[Screenshots: _Main.png, _Regex.png]

maksu
6 - Meteoroid

Hi @binuacs, thank you for your reply.

Your solution works, but I noticed that the dataset contains more combinations. Could you please also help me with the ones below?

Orange MMM-9999 car 01.-12.22
Blue-899 plane 12.2022
Red LL KK-2222 boat  01.-12.22
Green G WWW-3333 road 05.22
Orange car 04.22
KKK.2222 car 06.2022
Red, Basel plane-9999 12.22
Green-3333 - boat
Green C3333 road 02.22-01.23
Green-green-plane 06.-08.22
Red plane 01.-12.22
plane-333 09.22

 

Thank you

binuacs
21 - Polaris

@maksu a regex works off the pattern in the given data. If your data follows several different patterns, it becomes very difficult to write a single regex for it. Does your actual data follow any pattern? Also, what are the expected results for the last 3 rows?

CoG
14 - Magnetar

To build further on what @binuacs has already said, take for example the first and last rows of the most recent table you shared:

Orange MMM-9999 car 01.-12.22
plane-333 09.22

 

If the output for row 1 is "car" and the output for row 2 is supposed to be "plane", how is an algorithm supposed to know that you didn't want "Orange MMM" from row 1? Just like "plane" in row 2, it is followed by a hyphen and then numbers. Pinning down the exact structure of your data will be crucial to getting the information that you need out of it.
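To make the ambiguity concrete, here is a small Python sketch. The rule `LETTERS_BEFORE_HYPHEN_DIGITS` is a hypothetical one chosen only to illustrate the point ("take the letters right before a hyphen and digits"); it is not a proposed solution.

```python
import re

# Hypothetical rule for illustration: letters immediately followed by
# a hyphen and one or more digits.
LETTERS_BEFORE_HYPHEN_DIGITS = re.compile(r"([A-Za-z]+)-\d+")

first_row = "Orange MMM-9999 car 01.-12.22"
last_row = "plane-333 09.22"

for row in (first_row, last_row):
    print(row, "->", LETTERS_BEFORE_HYPHEN_DIGITS.findall(row))
```

The same rule that finds "plane" in the last row picks up "MMM" in the first one, so the rule alone cannot tell the wanted word from the unwanted one.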

flying008
15 - Aurora

Hi @maksu,

FYI:

(?:[[:alnum:]]\s|\-\s?|^)([a-z]+)(?=\s|$|\-\d)
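For anyone who wants to check this pattern outside Alteryx, below is a minimal Python sketch that runs it against the twelve sample rows from the question. It rests on two assumptions: the POSIX class `[[:alnum:]]` is rewritten as `[A-Za-z0-9]` because Python's `re` module does not support POSIX classes, and matching is case-sensitive. The helper name `extract_word` is made up for this example.

```python
import re

# Python port of the pattern above. Assumption: [[:alnum:]] is rewritten as
# [A-Za-z0-9] because Python's re module has no POSIX character classes.
# Matching is case-sensitive, so capitalized words such as "Orange" are skipped.
PATTERN = re.compile(r"(?:[A-Za-z0-9]\s|-\s?|^)([a-z]+)(?=\s|$|-\d)")

ROWS = [
    "Orange MMM-9999 car 01.-12.22",
    "Blue-899 plane 12.2022",
    "Red LL KK-2222 boat  01.-12.22",
    "Green G WWW-3333 road 05.22",
    "Orange car 04.22",
    "KKK.2222 car 06.2022",
    "Red, Basel plane-9999 12.22",
    "Green-3333 - boat",
    "Green C3333 road 02.22-01.23",
    "Green-green-plane 06.-08.22",
    "Red plane 01.-12.22",
    "plane-333 09.22",
]

def extract_word(text):
    """Return the first captured lowercase word, or None if nothing matches."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

for row in ROWS:
    print(f"{row!r} -> {extract_word(row)}")
```

The pattern depends on the wanted word being all lowercase. With `re.IGNORECASE` added, the first row would return "Orange" instead of "car", so in Alteryx the RegEx tool's case-insensitive option would presumably need to be switched off as well.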

 

 

[Screen recording: 录制_2024_01_02_10_28_48_318.gif]

 

Txt | Get
Orange MMM-9999 car 01.-12.22 | car
Blue-899 plane 12.2022 | plane
Red LL KK-2222 boat  01.-12.22 | boat
Green G WWW-3333 road 05.22 | road
Orange car 04.22 | car
KKK.2222 car 06.2022 | car
Red, Basel plane-9999 12.22 | plane
Green-3333 - boat | boat
Green C3333 road 02.22-01.23 | road
Green-green-plane 06.-08.22 | plane
Red plane 01.-12.22 | plane
plane-333 09.22 | plane

 

BTW, if you already know all the keywords, you could use the Find/Replace tool to match them.
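The keyword idea can be sketched in Python as a plain lookup against a fixed list. This is only an analogue of the approach, not a description of how the Find/Replace tool works internally; the `KEYWORDS` list is assumed from the four words in the question, and `find_keyword` is a made-up helper name.

```python
import re

# Assumed keyword list, taken from the four words in the original question.
KEYWORDS = ["car", "plane", "boat", "road"]

# Match a keyword only when it is not embedded in a longer run of letters.
KEYWORD_RE = re.compile(
    r"(?<![A-Za-z])(" + "|".join(map(re.escape, KEYWORDS)) + r")(?![A-Za-z])"
)

def find_keyword(text):
    """Return the first known keyword found in text, or None."""
    match = KEYWORD_RE.search(text)
    return match.group(1) if match else None

print(find_keyword("Green-green-plane 06.-08.22"))
print(find_keyword("Green-3333 - boat"))
```

With a known list, the surrounding structure of each row no longer matters, which sidesteps the ambiguity between rows.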

 

[Screen recording: 录制_2024_01_02_14_16_57_270.gif]
