SOLVED

Parse all content between HTML tags using the element ID

geeman
8 - Asteroid

Hi,

 

I'm looking for an approach to read all the content inside an HTML tag with an ID that I derive dynamically in the workflow.

E.g.:

 

<HTML>

...

<DIV id="section-1" class="someclass" > <table><tr>...</tr></table>

</DIV>

<DIV id="section-2" class="someclass" > ...

</DIV>

...

</HTML>

 

I'm after something like the regex <div class=\"someclass\" id=\"' + [DIV_ID] + '\" .*?>.*?<\/div> (which is not working) that generates this output:

---

<DIV id="section-1" class="someclass" > <table><tr>...</tr></table>

</DIV>

---

And eventually the goal is to extract and list out the <table> contents.
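Judging only from the sample above, the pattern may be missing for three reasons: it expects class before id while the sample has id first, the sample uses uppercase DIV, and in many regex engines `.` does not cross line breaks unless a flag is set. Below is a minimal Python sketch of an order-independent, case-insensitive version with the ID passed in as a parameter. The names `extract_div_block` and `SAMPLE` are hypothetical, and the sketch assumes the target DIV contains no nested DIVs (the lazy match stops at the first closing tag).

```python
import re
from typing import Optional

# Hypothetical sample mirroring the structure in the question.
SAMPLE = (
    '<HTML>\n'
    '<DIV id="section-1" class="someclass" > <table><tr><td>a</td></tr></table>\n'
    '</DIV>\n'
    '<DIV id="section-2" class="someclass" > <table><tr><td>b</td></tr></table>\n'
    '</DIV>\n'
    '</HTML>'
)

def extract_div_block(html: str, div_id: str) -> Optional[str]:
    """Return the whole <div id="div_id" ...>...</div> block, or None if absent.

    Assumes no nested <div> inside the target block.
    """
    pattern = re.compile(
        # [^>]* keeps the match inside the opening tag, so the id attribute
        # may appear before or after class; re.escape guards the dynamic ID.
        r'<div\b[^>]*\sid="' + re.escape(div_id) + r'"[^>]*>.*?</div>',
        re.IGNORECASE | re.DOTALL,  # match DIV/div and let . span newlines
    )
    match = pattern.search(html)
    return match.group(0) if match else None

block = extract_div_block(SAMPLE, "section-1")
print(block)
```

For HTML with nested DIVs or attributes in unusual quoting, a real HTML parser is likely a safer choice than a regex.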

 

Could anyone provide some suggestions? Thank you in advance!

 

 

BenMoss
ACE Emeritus

Do you have some sample data, @geeman?

 

Ben

CharlieS
17 - Castor

If it's all on the same line/record, then you could use the Substring function. Here's an example:

 

Substring([Field1],findstring([Field1],"<table>")+7,findstring([Field1],"</table>")-(findstring([Field1],"<table>")+7))
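For readers who want to check the index arithmetic, here is a rough Python equivalent of the expression above. It is only a sketch: `between_table_tags` is a hypothetical name, and it assumes the opening tag is exactly `<table>` with no attributes, as the formula does. The `+7` in the formula is the length of `<table>`.

```python
def between_table_tags(field: str) -> str:
    """Return the text between the first <table> and the next </table>."""
    start_tag, end_tag = "<table>", "</table>"
    start = field.find(start_tag)
    if start == -1:
        return ""                      # no opening tag found
    start += len(start_tag)            # skip past "<table>" (7 characters)
    end = field.find(end_tag, start)   # look for the close after the open
    if end == -1:
        return ""                      # no closing tag found
    return field[start:end]

print(between_table_tags("<div><table><tr>x</tr></table></div>"))
```

Like the formula, this only handles the first table in the record.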

 

Otherwise, like @BenMoss said: an example input file for us to use would be best.

geeman
8 - Asteroid

Thanks for the replies, @CharlieS & @BenMoss! I have attached a sample file for your reference. The actual HTML file is very large and grows dynamically as Ajax calls load additional data into the tables.

CharlieS
17 - Castor

I'm not sure exactly what you're looking for as far as table parsing goes, but I've attached a solution that will isolate the table contents with the section name appended.

geeman
8 - Asteroid

Hi @CharlieS, this is close to what I'm looking for. The only additional ask is to be able to pass the HTML element ID (the DIV id, in this case) to the Multi-Row Formula dynamically, as a variable/parameter. Thank you so much for your help!

 

CharlieS
17 - Castor

How about a Join as a filter?

geeman
8 - Asteroid

@CharlieS, thank you, I appreciate your help! I had actually thought about putting a Filter on the specific 'Sections', and it works similarly to the Join you suggested.

But again, is it possible to pass a parameter into the Multi-Row Formula? The table content needs to come only from section-1; your solution returns the table content for both section-1 and section-2. This is why I was looking to get only the section-1 block. Any suggestions?
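One way to picture "only the tables from one parameterised section" outside the Multi-Row Formula is the Python sketch below, which uses the standard library's `html.parser`. It is not the attached workflow's method. `SectionTableParser` and `section_table_rows` are hypothetical names, and the sketch assumes the section is a DIV identified by its `id` attribute and that cells are plain `td`/`th` elements.

```python
from html.parser import HTMLParser
from typing import List

class SectionTableParser(HTMLParser):
    """Collect table rows (lists of cell text) inside the <div> with a given id."""

    def __init__(self, div_id: str) -> None:
        super().__init__()
        self.div_id = div_id
        self.depth = 0        # div nesting depth inside the target (0 = outside)
        self.in_cell = False
        self.rows: List[List[str]] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so DIV matches "div".
        if tag == "div":
            if self.depth > 0:
                self.depth += 1               # nested div inside the target
            elif dict(attrs).get("id") == self.div_id:
                self.depth = 1                # entered the target section
        elif self.depth > 0:
            if tag == "tr":
                self.rows.append([])
            elif tag in ("td", "th") and self.rows:
                self.rows[-1].append("")
                self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
        elif tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell and self.depth > 0:
            self.rows[-1][-1] += data

def section_table_rows(html: str, div_id: str) -> List[List[str]]:
    """Return the table rows found inside the div whose id equals div_id."""
    parser = SectionTableParser(div_id)
    parser.feed(html)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows]
```

Calling `section_table_rows(html, "section-1")` returns rows from that section only, so section-2's tables are never collected rather than filtered out afterwards.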
