Alteryx Designer Desktop Discussions

SOLVED

Regex tokenise

AndrewW
11 - Bolide

I have the following HTML. Is it possible to use RegEx to split it into columns, pulling out only the header text? That is, it should return 31 column headers.

The attached workflow includes the HTML below as a Text Input.

<tr>
<th align="center">A</th>
<th align="center">B</th>
<th align="center">C</th>
<th align="center">D</th>
<th align="center">E</th>
<th align="center">F</th>
<th align="center">G</th>
<th align="center">H</th>
<th align="center">I</th>
<th align="center">J</th>
<th align="center">K</th>
<th align="center">L</th>
<th align="center">M</th>
<th align="center">N</th>
<th align="center">O</th>
<th align="center">P</th>
<th align="center">Q</th>
<th align="center">R</th>
<th align="center">S</th>
<th align="center">T</th>
<th align="center">U</th>
<th align="center">V</th>
<th align="center">W</th>
<th align="center">X</th>
<th align="center">Y</th>
<th align="center">Z</th>
<th align="center">AA</th>
<th align="center">AB</th>
<th align="center">AC</th>
<th align="center">AD</th>
<th align="center">AE</th>

4 REPLIES
jdunkerley79
ACE Emeritus

Yep:

<th[^>]*>(.*?)</th>

should work
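
For anyone who wants to try the pattern outside Alteryx, here is a minimal sketch in Python. It assumes Python's `re` module as a stand-in for the RegEx tool, so details may differ from Designer. The `sample_html` string is a shortened, hypothetical version of the HTML in the question.

```python
import re

# Pattern from the reply above: an opening <th ...> tag, a lazy capture, then </th>.
HEADER_PATTERN = re.compile(r"<th[^>]*>(.*?)</th>")

# Shortened stand-in for the HTML in the question (hypothetical sample).
sample_html = """<tr>
<th align="center">A</th>
<th align="center">B</th>
<th align="center">AA</th>
"""

# With a single capture group, findall returns only the captured text of each match.
headers = HEADER_PATTERN.findall(sample_html)
print(headers)  # ['A', 'B', 'AA']
```

Each list element corresponds to one token, which is roughly what splitting to columns would produce for the full 31-header input.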

AndrewW
11 - Bolide

That's great. Are you able to explain what this part does: [^>]*

 

I don't fully understand that component and was wrestling in vain with alternative ways to do that.

ZacharyM
Alteryx Alumni (Retired)

@jdunkerley79 is the Regex Wizard, and his solution does a great job grabbing out the characters.

 

However, my attachment takes a different approach and also sets these values as the column headers. Take a look!

 

Cheers,

Zak

jdunkerley79
ACE Emeritus

This part, [^>]*, matches any run of characters that are not >, so it consumes anything after the <th up to the next > (here, the align="center" attribute).

 

I set up an example on RegExr at regexr.com/44i4d; it has a decent explanation section at the bottom.
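
To see what [^>]* contributes, a small sketch in Python may help. It again assumes Python's `re` module rather than the Alteryx RegEx tool, and the `tag` string with its extra `class` attribute is a made-up example.

```python
import re

# Hypothetical header cell carrying two attributes.
tag = '<th align="center" class="hdr">AB</th>'

# Without [^>]*, the literal "<th>" cannot match a tag that has attributes.
without_class = re.findall(r"<th>(.*?)</th>", tag)
print(without_class)  # []

# [^>]* consumes everything after "<th" and stops at the first ">".
with_class = re.findall(r"<th[^>]*>(.*?)</th>", tag)
print(with_class)  # ['AB']

# Capturing the [^>]* part on its own shows exactly what it matched.
attrs = re.search(r"<th([^>]*)>", tag).group(1)
print(repr(attrs))  # ' align="center" class="hdr"'
```

Because the class excludes >, the match cannot run past the end of the opening tag, whatever attributes the tag happens to carry.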

 

 

 

 

 
