Alteryx Designer Desktop Discussions

SOLVED

Regex extract text from html using tags

jt_edin
8 - Asteroid

Try as I might, I can't find the example I need to do this simple regex extract. Please can someone help real quick?

 

Here is the relevant line of html as it appears in the source of my browser:

 

<p><strong>Name:</strong> J. Robertson</p>

 

I want to capture the name that follows the bold label above (J. Robertson)

 

So in regex speak I believe I want to search for this text:

 

<p><strong>Name:</strong>

 

...and then start a marked group, and capture everything until the next instance of </p> which closes the marked group. Pretty simple huh, but I can't figure it out and I can't find a help example. How do I do this? Thanks!
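
A minimal sketch of the approach described above, written in Python rather than as an Alteryx RegEx tool configuration. The `re` module and the variable names are assumptions for illustration and are not part of the original post. The pattern matches the literal opening tags, then lazily captures everything up to the next `</p>`:

```python
import re

html_line = "<p><strong>Name:</strong> J. Robertson</p>"

# Literal prefix, optional whitespace, then a lazy marked group that
# stops at the first closing </p>.
pattern = re.compile(r"<p><strong>Name:</strong>\s*(.*?)</p>")

match = pattern.search(html_line)
name = match.group(1) if match else None
print(name)
```

The lazy quantifier `*?` matters when a line contains more than one `</p>`: a greedy `.*` would run on to the last one.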

 

3 REPLIES
mceleavey
17 - Castor

Hi @jt_edin ,

 

I've attached the workflow. The first one parses out a single instance of the name, the second is where there are multiple names and it parses out to rows. There's probably a better way of doing it if I had the full HTML, but given what I can see, that should work.

 

Hope this helps,

 

M.




fmvizcaino
17 - Castor

Hi @jt_edin ,

 

Attached is an example showing how to do it.

I'm using the Tokenize method to get all occurrences of that structure.
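
As a rough analogue outside Alteryx, returning every occurrence of a marked group can be sketched in Python with `re.findall`. The exact expression in the attached workflow is not shown in the thread, so the pattern below (the literal prefix followed by the marked group `([^<]+)`) and the sample input are assumptions:

```python
import re

# Hypothetical input with several names; the real HTML is not shown in the thread.
html = (
    "<p><strong>Name:</strong> J. Robertson</p>\n"
    "<p><strong>Name:</strong> A. Smith</p>\n"
)

# Assumed expression: the literal prefix followed by the marked group ([^<]+).
token_pattern = re.compile(r"<p><strong>Name:</strong>\s*([^<]+)")

# With one marked group, findall returns only that group, one entry per match.
names = token_pattern.findall(html)
print(names)
```

Each entry in `names` corresponds to one row (or column) that a tokenize-style split would produce.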


 

Best,

Fernando Vizcaino

 

 

jt_edin
8 - Asteroid

Thanks both. I have accepted @fmvizcaino's solution as it most closely matches the single-tool approach I had in mind; however, @mceleavey's is excellent for working through the problem step by step, so thanks.

 

@fmvizcaino Would you be able to explain what happens within the parentheses of the marked group, both for my benefit and others?

 

([^<]+)

 

What do these symbols mean, and where would you recommend we go for help to understand them? I find Regex baffling and I'm sure I'm not the only one!
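
For readers with the same question, a hedged reading of the symbols as they behave in most regex flavours: the parentheses create a marked (capturing) group, `[^<]` is a negated character class that matches any single character except `<`, and `+` means one or more of the preceding item. The Python sketch below is an illustration only, with assumed sample strings, not the Alteryx configuration itself:

```python
import re

text = " J. Robertson</p>"

# ( ... )  marked group: the part of the match that is returned
# [^<]     any single character that is not "<"
# +        one or more of the preceding item, as many as possible
group = re.match(r"([^<]+)", text).group(1)
print(repr(group))

# No match is possible at a "<", because [^<]+ needs at least one other character.
no_match = re.match(r"([^<]+)", "</p>")
print(no_match)
```

Because the group stops at the first `<`, it captures the text up to the closing tag without needing to spell out `</p>`. Note that the captured value keeps its leading space unless whitespace is consumed before the group or trimmed afterwards.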
