Hello,
I am streaming a large HTML document into the workflow as a single string value.
The HTML string contains an unknown number of occurrences of a specific element. Every element I want to find will look like this:
<div>Endorsements: <span class='gen-ai-file-name'>XXXXXX</span></div>
The XXXXXX will not always be the same length: it might be a few characters, a long sentence, or even another embedded div or span.
What I really want to get out of this is the XXXXXX value, but I'd settle for getting the entire <div></div> substring.
This could appear 1 time in the HTML, or it might appear 100 times. It might not show up at all.
Ideally, what I'd like to return out of the parse is 1 row per occurrence, with either the XXXXXX value or the entire "<div>Endorsements: <span class='gen-ai-file-name'>XXXXXX</span></div>" value. So if it shows up 1 time there will be 1 row, if it shows up 100 times there will be 100 rows, and if it's not in there, there won't be any rows.
I'm sure I can use XMLparse to do this, but I'm not very skilled with it. And this particular <div> element may be a parent element, a child element, or buried seventeen layers deep in stacked divs and spans and whatnot.
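To make the desired output concrete, here is a minimal sketch in Python using only the standard library's html.parser module. It is not an XMLparse solution, just an illustration of "one row per occurrence, at any nesting depth". It assumes the class name gen-ai-file-name is enough on its own to identify the span (the surrounding "Endorsements:" div is not checked), and the names SpanTextExtractor and extract_values are hypothetical, made up for this example.

```python
from html.parser import HTMLParser


class SpanTextExtractor(HTMLParser):
    """Collects the text inside every <span class='gen-ai-file-name'>.

    Assumption: the class name alone identifies the element of interest.
    """

    TARGET_CLASS = "gen-ai-file-name"

    def __init__(self):
        super().__init__()
        self.rows = []    # one entry per occurrence found
        self._depth = 0   # > 0 while inside a target span
        self._parts = []  # text fragments of the current occurrence

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # Already inside a target span: count nested spans so the
            # matching closing tag can be recognised later.
            if tag == "span":
                self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and self.TARGET_CLASS in classes:
            self._depth = 1
            self._parts = []

    def handle_endtag(self, tag):
        if self._depth and tag == "span":
            self._depth -= 1
            if self._depth == 0:
                # The target span just closed: emit one row.
                self.rows.append("".join(self._parts).strip())

    def handle_data(self, data):
        if self._depth:
            self._parts.append(data)


def extract_values(html):
    """Return a list with one string per matching span (empty if none)."""
    parser = SpanTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling extract_values on the HTML string would give a list with one item per occurrence, and an empty list when the element never appears. Any tags nested inside the span are dropped and only their text is kept; returning the whole <div> substring instead would need extra bookkeeping that this sketch does not attempt.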
The data is proprietary so I can't post a sample, but hopefully I've been clear enough that someone who understands parsing and text mining can help.
Thanks in advance.