Challenge #13: HTML Table Parsing
The link to the solution for last challenge #12 is HERE.
For this challenge, let’s look at parsing an HTML table that is stored in a single field. As always, there are several ways to do this challenge. I have designated it as advanced because more complex functions such as RegEx can be used, but they are not strictly necessary.
The use case:
We have HTML data in a single field, and that HTML contains a table.
The input holds a series of name/value pairs within the description field. The description field has an HTML table containing 14 name/value pairs inside <td> tags, with each pair on its own row (designated by the <tr> tag).
The objective is to produce a table containing the 14 name/value pairs.
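For readers who want to see the objective outside Alteryx, here is a minimal Python sketch using the standard library's html.parser. It is not the posted solution; the class name TablePairParser, the function parse_pairs, and the sample HTML are all hypothetical, since the real description field is not shown in the thread.

```python
from html.parser import HTMLParser


class TablePairParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a <td> cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_pairs(description):
    """Return (name, value) tuples, one per table row with at least two cells."""
    parser = TablePairParser()
    parser.feed(description)
    parser.close()
    return [(row[0], row[1]) for row in parser.rows if len(row) >= 2]


# Hypothetical sample; the challenge data has 14 such rows.
sample = (
    "<html><body><table>"
    "<tr><td>Name</td><td>Main St Bridge</td></tr>"
    "<tr><td>Status</td><td>Open</td></tr>"
    "</table></body></html>"
)
pairs = parse_pairs(sample)
print(pairs)
```

Using a parser rather than pattern matching is a design choice here: it avoids depending on how whitespace and attributes happen to be laid out in the field.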
Good luck, I look forward to your feedback.
Update: As of 9/20/19, the start and solution files were updated. Your solution may not match those posted by Community members prior to this date.
The solution has been uploaded.
I'm not familiar with HTML, so I'm not sure how well my solution would work on other HTML examples. My bonus was that I got to use one of the tools in the Laboratory menu!
I added the second record to make sure the Make Columns tool would work with more than one record.
I have never seen the "Make Columns" tool in use before. Nice job!
Good work Alex!
Another proof that there are a lot of ways to accomplish a goal in Alteryx!
The first formula line does a large part of the work; two more formulas then mark the row and column boundaries.
REGEX_Replace(Description,'<[t][^>]*>\W{0,1}|<\/[^t][^>]*>\W{0,1}|Null>','')
This strips the markup while keeping the </td> and </tr> closing tags.
REGEX_Replace(Description,'</tr>','~') marks the end of each row.
REGEX_Replace(Description,'</td>','|') marks the end of each cell.
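To make the three REGEX_Replace steps above easier to follow, here is a rough Python translation using the re module. The patterns are copied from the post; everything else is an assumption: the function name split_table, the final split-and-trim step, the sample HTML (which puts a line break after each tag), and the use of re.IGNORECASE to mirror REGEX_Replace's case-insensitive default.

```python
import re

# Pattern copied from the post. IGNORECASE is an assumption meant to mirror
# the case-insensitive default of Alteryx's REGEX_Replace.
STRIP_TAGS = re.compile(
    r"<[t][^>]*>\W{0,1}|<\/[^t][^>]*>\W{0,1}|Null>", re.IGNORECASE
)


def split_table(description):
    """Apply the three replacements, then split on the ~ and | markers."""
    text = STRIP_TAGS.sub("", description)                    # step 1
    text = re.sub(r"</tr>", "~", text, flags=re.IGNORECASE)   # step 2: row end
    text = re.sub(r"</td>", "|", text, flags=re.IGNORECASE)   # step 3: cell end

    rows = []
    for chunk in text.split("~"):
        # Trim whitespace and drop empty fragments left between markers.
        cells = [c.strip() for c in chunk.split("|")]
        cells = [c for c in cells if c]
        if len(cells) >= 2:  # skips leftovers such as a trailing </table>
            rows.append(cells)
    return rows


# Hypothetical sample with a line break after each tag.
sample = (
    "<table>\n<tr>\n<td>Name</td>\n<td>Main St Bridge</td>\n</tr>\n"
    "<tr>\n<td>Status</td>\n<td>Open</td>\n</tr>\n</table>"
)
rows = split_table(sample)
print(rows)
```

One thing to watch: \W{0,1} consumes a single non-word character after each removed tag. With a line break after the tag that is harmless, but if two tags sit directly next to each other it would eat the < of the following tag, so the pattern depends on the layout of the input.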
I'm not overly proud of this one @JoeM 😞
This post has been edited by Community Moderation to redact sensitive attachments. The original attachment has been replaced by post_placeholder.txt.
Kinda half-way between the provided solution from @GeneR, and the solution from @MarqueeCrew
My solution. I might bookmark this one to practice again down the line when I feel a bit more comfortable with my XML Parsing & RegEx skills...
PS. @alex, the "Make Columns" tool has made my day. This will be SO useful for some of my work applications!! I went back & tested it out with my workflow on this challenge after I'd solved it with my original method, and it's so much cleaner, allows for fewer tools... Very cool tool!