
Alteryx Designer Desktop Discussions

SOLVED

RegEx parsing of text within multiple instances of the same tag / identifier

andreasoszkiel
5 - Atom
Data contained in a field:
Lorem ipsum dolor sit amet, [Start]consectetur adipiscing elit[Stop]. Maecenas gravida odio justo, ac pretium diam tempor vel. Quisque scelerisque sed elit venenatis condimentum. [Start]Aliquam[Stop] ligula mi, rutrum quis dolor ut, semper tempor massa. Integer urna dui, semper eget vulputate vel, aliquet quis ante. Donec accumsan velit vel enim porta, ac consectetur est condimentum. Vestibulum sed magna eleifend, dictum sem at, scelerisque tellus. Nulla in lorem lectus.
 
[Start]Pellentesque nunc arcu, porttitor in nisi et, tincidunt consequat leo. Aliquam ac mauris at ligula tincidunt varius ac vitae ex.[Stop]
 
I have tried to solve this with the RegEx tool using \[Start\](.*)\[Stop\]
but that seems to return only the text between the first instance of [Start] and the last instance of [Stop].
 
Expected result I would like returned (in a new field or fields):
consectetur adipiscing elit, Aliquam, Pellentesque nunc arcu, porttitor in nisi et, tincidunt consequat leo. Aliquam ac mauris at ligula tincidunt varius ac vitae ex.
 
I would welcome any suggestions!
Cheers,
Andreas
 
jdunkerley79
ACE Emeritus

The expression:

\[Start\](.*)\[Stop\]

contains the greedy expression (.*), i.e. it will match as much text as it can while the overall expression still matches. That is why you get everything from the first [Start] to the last [Stop].

 

If you use: 

\[Start\](.*?)\[Stop\]

 

Then it will match just the text between each [Start] and the next [Stop].

 

If the RegEx tool is set to Tokenize mode, it should then do what you expect.
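
For anyone who wants to see the greedy vs. non-greedy difference outside Alteryx, here is a minimal Python sketch. It is not the RegEx tool itself: `re.findall` is used here as a rough stand-in for Tokenize mode, and the shortened sample text is an assumption made for illustration.

```python
import re

# Shortened, single-line sample in the style of the original field (assumed for illustration)
text = (
    "Lorem ipsum dolor sit amet, [Start]consectetur adipiscing elit[Stop]. "
    "Maecenas gravida. [Start]Aliquam[Stop] ligula mi."
)

# Greedy: (.*) grabs as much as possible, so the match runs from the
# first [Start] all the way to the last [Stop]
greedy = re.findall(r"\[Start\](.*)\[Stop\]", text)

# Non-greedy: (.*?) stops at the nearest following [Stop], giving one
# captured token per [Start]...[Stop] pair
lazy = re.findall(r"\[Start\](.*?)\[Stop\]", text)

print(greedy)
print(lazy)
```

The greedy pattern returns a single capture that still contains an inner [Stop] and [Start], while the non-greedy pattern returns each tagged piece separately, which is the behaviour you are after.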

andreasoszkiel
5 - Atom

Thanks for the fast response jdunkerley79!

I played around with tokenise and the non-greedy match but didn't combine the two. That solves it.
