Hi:
I've created an HTML parsing workflow that pulls data from a FF website and then uses a formula to replace the </tr> and </td> HTML tags with ~ and | symbols so the data can then be split into rows and columns.
I noticed, however, that one of the desired column headers (Yds /Comp) does not come through the Download tool with a closing </td> tag (workflow attached). As a result, the workflow merges two columns (Yds /Comp and TDs) into one.
Any suggestions on potential solutions for this would be appreciated. Thanks!
Hi @ivesbr
Since the opening <td> tag is always there, use this in the formula:
Replace(Replace([DownloadData], '</tr>','~'),'<td>', '|<td>')
Then continue splitting on "|".
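For anyone who wants to see why marking the opening tag helps, here is a rough Python sketch of the same idea (not the Alteryx workflow itself). The sample row and the function names are made up for illustration; the only thing taken from the thread is that the "Yds/Comp" cell arrives without its closing </td>.

```python
# Hypothetical sample row: the "Yds/Comp" cell is missing its closing </td>,
# which mimics the problem described in the question.
SAMPLE = "<tr><td>Comp</td><td>Yds/Comp<td>TDs</td></tr>"


def split_on_close_tag(html: str) -> list[str]:
    """Original approach: put the column marker where each </td> is."""
    marked = html.replace("</tr>", "~").replace("</td>", "|")
    first_row = marked.split("~")[0]
    # Drop the empty piece left after the trailing marker.
    return [cell for cell in first_row.split("|") if cell]


def split_on_open_tag(html: str) -> list[str]:
    """Suggested approach: put the column marker in front of each <td>."""
    marked = html.replace("</tr>", "~").replace("<td>", "|<td>")
    first_row = marked.split("~")[0]
    # The piece before the first marker is just "<tr>", so skip it.
    return first_row.split("|")[1:]


close_cells = split_on_close_tag(SAMPLE)  # "Yds/Comp" and "TDs" stay merged
open_cells = split_on_open_tag(SAMPLE)    # one piece per <td>
print(len(close_cells), len(open_cells))
```

With this sample, the closing-tag version yields two pieces and the opening-tag version yields three, because every cell has an opening tag even when the closing one is missing. Leftover tags in each piece would still need to be cleaned up afterwards.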
Dan
Awesome! Thank you @PhilipMannering and @danilang
Follow-up question for @danilang. How does the formula know to parse out that one column when you write the expression like this: Replace(Replace([DownloadData], '</tr>','~'),'<td>', '|<td>')? Just for my own edification.
@PhilipMannering ... I'm going to pore over your much more advanced workflow to see if I can pick up some new learnings.
Thanks again!
Hi @ivesbr
The formula Replace(Replace([DownloadData], '</tr>','~'),'<td>', '|<td>') is just a nested version of the Replace function. The Alteryx engine works from the inside out in these cases. The inner part, Replace([DownloadData], '</tr>','~'), is performed first, with the string in [DownloadData] being modified. The modified string is then used as the input to the outer Replace function, which puts a | in front of every <td>.
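The inside-out order can be sketched in Python, purely as an illustration. The `replace` helper below is a hypothetical stand-in for the Alteryx Replace function, and the sample string is invented.

```python
def replace(text: str, target: str, replacement: str) -> str:
    """Hypothetical stand-in for Alteryx's Replace(String, Target, Replacement)."""
    return text.replace(target, replacement)


# Invented sample data: two rows with two cells each.
download_data = "<tr><td>A</td><td>B</td></tr><tr><td>C</td><td>D</td></tr>"

# Step 1: the inner call runs first and marks the end of every row.
inner = replace(download_data, "</tr>", "~")

# Step 2: the outer call receives the result of step 1 and marks every cell.
outer = replace(inner, "<td>", "|<td>")

# Writing both steps as one nested expression gives the same string.
nested = replace(replace(download_data, "</tr>", "~"), "<td>", "|<td>")

print(inner)
print(outer)
print(nested == outer)
```

The nested expression is simply the two steps written on one line: the inner result never gets a name, it is passed straight into the outer call.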
Dan
Got it ... thank you @danilang!