Alteryx Designer Desktop Discussions

SOLVED

RegEx: Only want text between <p> and </p>

Billbisco
7 - Meteor

I'm trying to use the RegEx tool and I only want text between <p> and </p>

 

This expression removes everything up to and including <p>:

 

^[^_]*<p>

 

However, I still get </p> and all the other junk after that I don't want.  Can anyone help?

MSalvage
11 - Bolide

@Billbisco

 

Try using this expression:

 

Whoops, EDIT: [<]p[>](.*)[<]\/p[>]

 

Best,

MSalvage

danrh
13 - Pulsar

In case you don't want to go the regex route :)

 

Substring([YourData],
FindString([YourData], '<p>')+3,
FindString([YourData], '</p>')-FindString([YourData], '<p>')-3)
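
For readers more used to a scripting language, the same find-and-slice idea can be sketched in Python. This is only an illustration: the function name between_tags and the sample string are made up for the example, and it assumes FindString returns a 0-based position the way Python's str.find does. Unlike the formula above, the sketch looks for the closing tag only after the opening tag and returns an empty string when either tag is missing.

```python
def between_tags(text: str, open_tag: str = "<p>", close_tag: str = "</p>") -> str:
    """Return the text between the first open_tag and the next close_tag."""
    start = text.find(open_tag)          # position of "<p>", or -1 if absent
    if start == -1:
        return ""
    start += len(open_tag)               # skip past the tag itself (the "+3")
    end = text.find(close_tag, start)    # search for "</p>" after the open tag
    if end == -1:
        return ""
    return text[start:end]


# Hypothetical sample input, not taken from the thread.
sample = "<div><p>Keep this text</p><span>junk</span></div>"
print(between_tags(sample))  # Keep this text
```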

Billbisco
7 - Meteor

Hi, that gets rid of all of <p> and everything inside of it! I want to keep the stuff inside of it!

MSalvage
11 - Bolide

@Billbisco

 

Hmm, sounds like you just need to change the RegEx tool's Output Method from Replace to Parse for my solution.

 


MSalvage
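
The difference between the two output methods can be approximated outside Alteryx. The Python sketch below is an analogy rather than the RegEx tool itself: it assumes Replace was running with empty replacement text, which would explain why the tags and their contents disappeared, and it treats Parse as returning the capture group. The sample string is hypothetical.

```python
import re

# MSalvage's expression; the capture group (.*) is the text between the tags.
PATTERN = r"[<]p[>](.*)[<]\/p[>]"

# Hypothetical sample input, not taken from the thread.
sample = "<div><p>Keep this text</p><span>junk</span></div>"

# Replace-style: the whole match (tags and inner text) is swapped for "".
replaced = re.sub(PATTERN, "", sample)

# Parse-style: only the contents of the capture group are returned.
match = re.search(PATTERN, sample)
parsed = match.group(1) if match else None

print(replaced)  # <div><span>junk</span></div>
print(parsed)    # Keep this text
```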

 

 

Billbisco
7 - Meteor

Alright, that worked. Thank you! I was not aware of the Parse option in the RegEx tool. Where can I learn about it?

mceleavey
17 - Castor

Hi @Billbisco,

 

You can read more about Regex parsing here: https://help.alteryx.com/9.5/RegEx.htm

 

You could also have used the lazy pattern .*? for this task, which takes the shortest run of text between the two strings while retaining the strings themselves.

For example, from the following text:

"Regex is particularly useful for parsing chunks of text"

 

With the RegEx tool set to Tokenize, the expression useful .*? chunks will return "useful for parsing chunks".
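
As a rough check of that example outside Alteryx, here is a small Python sketch. It uses re.findall as a stand-in for tokenizing, which is an assumption about equivalence and not a description of the RegEx tool's internals. The second half, with a made-up two-paragraph string, shows why the lazy .*? matters once more than one <p> block is present.

```python
import re

# The sentence from the example above.
text = "Regex is particularly useful for parsing chunks of text"
tokens = re.findall(r"useful .*? chunks", text)
print(tokens)  # ['useful for parsing chunks']

# Hypothetical input with two paragraphs, to contrast lazy and greedy matching.
html = "<p>one</p><p>two</p>"
lazy = re.findall(r"<p>(.*?)</p>", html)    # stops at the nearest </p>
greedy = re.findall(r"<p>(.*)</p>", html)   # runs on to the last </p>
print(lazy)    # ['one', 'two']
print(greedy)  # ['one</p><p>two']
```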



Bulien

BPurcell2
9 - Comet

For me, the live training has been very helpful:

 

https://community.alteryx.com/t5/Live-Training/Live-Training-Introduction-to-RegEx/m-p/66489#M116

 

Also, when I need a quick reference, I use the examples provided when you click on the RegEx icon on the toolbar.

 
