Hello community,
I would like to parse the inputs in a column by using Regex tool. I only need wording that located between numbers. In this case results should be "car", "plane", "boat", "road" respectively
| Orange MMM-9999 car 01.-12.22 |
| Blue-899 plane 12.2022 |
| Red LL KK-2222 boat 01.-12.22 |
| Green G WWW-3333 road 05.22 |
Thank you in advance
@maksu try the below regex
hi @binu_acs, thank you for your reply.
Your solution works, but I noticed that there are more combinations in the dataset. Could you please also assist me for these below?
| Orange MMM-9999 car 01.-12.22 |
| Blue-899 plane 12.2022 |
| Red LL KK-2222 boat 01.-12.22 |
| Green G WWW-3333 road 05.22 |
| Orange car 04.22 |
| KKK.2222 car 06.2022 |
| Red, Basel plane-9999 12.22 |
| Green-3333 - boat |
| Green C3333 road 02.22-01.23 |
| Green-green-plane 06.-08.22 |
| Red plane 01.-12.22 |
| plane-333 09.22 |
Thank you
@maksu regex works based on the pattern in the given data, if your data follow different patterns then it is very difficult to write a regex formula. Does your actual data follow any pattern? Also what are the expected results for the last 3 rows?
To further build off of what @binu_acs - has already said. Take for example your first and last row in the most recent table you shared:
| Orange MMM-9999 car 01.-12.22 |
| plane-333 09.22 |
If the output for row 1 is "car" and row 2 is supposed to be "plane", how is an algorithm supposed to know that you didn't want to include "Orange MMM" from row 1? It, just like row 2, is followed by a hyphen and then numbers. Pinning down the exact structure for your data will be crucial to getting the information that you need out of it.
Hi, @maksu
FYI.
(?:[[:alnum:]]\s|\-\s?|^)([a-z]+)(?=\s|$|\-\d)
| Txt | Get |
| Orange MMM-9999 car 01.-12.22 | car |
| Blue-899 plane 12.2022 | plane |
| Red LL KK-2222 boat 01.-12.22 | boat |
| Green G WWW-3333 road 05.22 | road |
| Orange car 04.22 | car |
| KKK.2222 car 06.2022 | car |
| Red, Basel plane-9999 12.22 | plane |
| Green-3333 - boat | boat |
| Green C3333 road 02.22-01.23 | road |
| Green-green-plane 06.-08.22 | plane |
| Red plane 01.-12.22 | plane |
| plane-333 09.22 | plane |
BTW, If you already know all keyword, maybe can use Find/Replace tool to match it.

