
SOLVED

Using Regular Expression - Parse

jmilller2018
5 - Atom

I have a set of data pulled from a website, where each value sits between two specific blocks of characters:

 

<td>1,911</td>
<td>181</td>
<td>828</td>
<td>440</td>
<td>106</td>
<td>2,213</td>
<td>1,285<sup id="cite_ref-4" class="reference"><a href="#cite_note-4">[4]</a></sup></td>

 

I want to set up a RegEx that pulls anything between a ">" and a "<", so it returns the numbers in the middle of the <td> tags. I know I could use Text to Columns, but I want to learn regular expression syntax better. Is there a way to set it up so that it returns just the digits and commas between the > and <?

 

Thanks!

 

Jason

4 REPLIES
BenMoss
ACE Emeritus

Below is the syntax I used for a similar task last week:

 

<(\<\w+\>)>(.+)</\<\w+\>>

This is with the mode set to Parse. In my version, the value within the opening tag declared the header for the value. That does not seem to be the case in yours, so I would amend the statement to read:

 

<\<\w+\>>(.+)</\<\w+\>>

Ben
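For anyone who wants to try the second expression outside Alteryx, here is a rough Python sketch. It rests on a few assumptions: Python's `re` module has no `\<` / `\>` word-boundary tokens, so `\b` is swapped in for both, and `rows` and `parse_value` are made-up names for illustration. It is not a reproduction of the RegEx tool itself.

```python
import re

# Assumption: \b replaces the \< and \> word-boundary tokens,
# which Python's re module does not support.
TAG_VALUE = re.compile(r"<\b\w+\b>(.+)</\b\w+\b>")

# A few rows from the question (illustrative subset).
rows = [
    "<td>1,911</td>",
    "<td>181</td>",
    '<td>1,285<sup id="cite_ref-4" class="reference"><a href="#cite_note-4">[4]</a></sup></td>',
]

def parse_value(row):
    """Return whatever sits between the first opening tag and the last closing tag."""
    match = TAG_VALUE.search(row)
    return match.group(1) if match else None

values = [parse_value(row) for row in rows]
for value in values:
    print(value)
```

Because `(.+)` is greedy, the capture runs to the last closing tag on the line. On the simple rows that yields just the number, but on the row with the footnote the nested `<sup>` markup comes along with it, so a further cleanup step may be needed there.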

LordNeilLord
15 - Aurora

Hey @jmilller2018

 

I'd use a regular expression like this:

 

<td.*?>(.*?)<.*\/td>

 

I'm no expert in regex, but this works on your example.
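As a quick check outside Alteryx, the same expression can be run over the sample rows in Python. This is only a sketch: `sample` and `cells` are names invented here, and Python's regex engine is assumed to treat this particular pattern the same way the RegEx tool does.

```python
import re

# The pattern from the post, unchanged. The lazy (.*?) stops at the first "<"
# after the opening <td>, so nested markup is left out of the capture.
CELL = re.compile(r"<td.*?>(.*?)<.*\/td>")

sample = """<td>1,911</td>
<td>181</td>
<td>828</td>
<td>440</td>
<td>106</td>
<td>2,213</td>
<td>1,285<sup id="cite_ref-4" class="reference"><a href="#cite_note-4">[4]</a></sup></td>
"""

# One capture per row; every row in the sample matches.
cells = [CELL.search(line).group(1) for line in sample.splitlines()]
print(cells)
```

The key piece is the lazy quantifier `.*?`, which takes as little as possible, followed by a greedy `<.*\/td>` that swallows everything up to the closing tag, footnote markup included.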

jmilller2018
5 - Atom

Thank you! They both worked like a charm. 
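For the narrower ask in the original question (only the digits and commas between a ">" and a "<"), a character class is another option. The sketch below is in Python rather than Alteryx, and the pattern `>([\d,]+)<` is a suggestion that does not come from either reply above; `sample`, `matches`, and `numbers` are illustrative names.

```python
import re

# Capture one or more digits or commas sitting directly between ">" and "<".
DIGITS_BETWEEN = re.compile(r">([\d,]+)<")

sample = """<td>1,911</td>
<td>181</td>
<td>828</td>
<td>440</td>
<td>106</td>
<td>2,213</td>
<td>1,285<sup id="cite_ref-4" class="reference"><a href="#cite_note-4">[4]</a></sup></td>
"""

matches = DIGITS_BETWEEN.findall(sample)

# Drop the thousands separators to get integers.
numbers = [int(m.replace(",", "")) for m in matches]
print(matches)
print(numbers)
```

The footnote marker `[4]` is skipped because the square brackets are not in the character class, so only the cell values come back.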

Cc_1
8 - Asteroid

Hi BenMoss-

 

  Please, I need a solution to the attached problem. I would appreciate whatever input you can provide. I am trying to produce the part of the workbook that is highlighted in yellow, while the unhighlighted area on the left is the current state. Thanks
