
Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Help with HTML Parsing Workflow

ivesbr
7 - Meteor

Hi:

 

I've created an HTML parsing workflow that pulls data from a FF website and then uses a formula to replace the </tr> and </td> HTML tags with ~ and | symbols so the data can be broken into rows and columns.

 

I noticed, however, that one of the desired column headers (Yds /Comp) does not come through the Download tool with a </td> tag (workflow attached). As a result, the workflow merges two columns (Yds /Comp and TDs) into one.

 

Any suggestions on potential solutions for this would be appreciated.  Thanks!  

5 REPLIES
danilang
19 - Altair

Hi @ivesbr 

 

Since the <td> open tag is always there, use this in the formula:

 

Replace(Replace([DownloadData], '</tr>','~'),'<td>', '|<td>')

And continue splitting on "|"
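
For readers who want to see the effect outside Alteryx, here is a minimal Python sketch of the same idea. The `sample` markup and the `split_cells` helper are hypothetical (the real page's HTML is not shown in this thread); the point is only that marking the opening `<td>` still separates a cell whose closing `</td>` is missing.

```python
import re

# Hypothetical markup: the "Yds /Comp" header cell has no closing </td>,
# mimicking the problem described in the question.
sample = (
    "<tr><td>Player</td><td>Yds /Comp<td>TDs</td></tr>"
    "<tr><td>Smith</td><td>11.2</td><td>3</td></tr>"
)

def split_cells(html):
    """Mark row ends with '~' and cell starts with '|', then split on both."""
    marked = html.replace("</tr>", "~").replace("<td>", "|<td>")
    table = []
    for row in marked.split("~"):
        if not row.strip():
            continue  # skip the empty piece after the final '~'
        # The piece before the first '|' is just "<tr>", so drop it.
        cells = row.split("|")[1:]
        # Strip any remaining tags from each cell.
        table.append([re.sub(r"<[^>]+>", "", c).strip() for c in cells])
    return table

print(split_cells(sample))
```

This sketch assumes plain `<td>` tags with no attributes and no literal `|` or `~` in the cell text; either would need extra handling.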

 

Dan

 

PhilipMannering
16 - Nebula

If @danilang hasn't sorted it (which I doubt),

 

This might help you on your way,

 

[Attached image: parse xml.jpg]

ivesbr
7 - Meteor

Awesome!  Thank you @PhilipMannering and @danilang 

 

Follow up question for @danilang.  How does the formula know to parse out that one column when you write the expression like this - Replace(Replace([DownloadData], '</tr>','~'),'<td>', '|<td>')?  Just for my own edification.  

 

@PhilipMannering ... I'm going to pore over your much more advanced workflow to see if I can pick up some new learnings.

 

Thanks again!

 

 

danilang
19 - Altair

Hi @ivesbr 

 

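The formula Replace(Replace([DownloadData], '</tr>','~'),'<td>', '|<td>') is just a nested use of the Replace function. The Alteryx engine works from the inside out in these cases. The inner Replace([DownloadData], '</tr>','~') is performed first, with the string in [DownloadData] being modified. The modified string is then used as the input to the outer Replace, which puts a '|' in front of every '<td>'. Because every cell has an opening <td> tag, the Yds /Comp column gets its own delimiter even though its </td> is missing.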
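
The inside-out evaluation can be checked with a small Python sketch. The `replace` helper below is a hypothetical stand-in that only mirrors the argument order of the Alteryx function, and `download_data` is made-up sample text; the nested call and the two-step version should give the same string.

```python
def replace(text, target, replacement):
    """Stand-in for the formula's Replace: swap every target for replacement."""
    return text.replace(target, replacement)

# Made-up sample standing in for the [DownloadData] field.
download_data = "<tr><td>A</td><td>B</td></tr>"

# Nested form: the inner call runs first and its result feeds the outer call.
nested = replace(replace(download_data, "</tr>", "~"), "<td>", "|<td>")

# The same work written as two explicit steps.
step1 = replace(download_data, "</tr>", "~")   # inner Replace
step2 = replace(step1, "<td>", "|<td>")        # outer Replace

print(nested == step2)
```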

 

Dan  

ivesbr
7 - Meteor

Got it ... thank you @danilang!
