Do you have the skills to make it to the top? Subscribe to our weekly challenges. Try your best to solve the problem, share your solution, and see how others tackled the same problem. We share our answer too.
Weekly Challenge

Challenge #13: HTML Table Parsing

Alteryx Alumni (Retired)

Here is this week’s challenge. I hope everyone had a good Presidents Day. The link to the solution for last week's challenge #12 is HERE. For this challenge, let’s look at parsing an HTML table that is stored in a single field. As always, there are several ways to do this challenge. I have designated it as an advanced challenge because some of the more complex functions like RegEx can be used, but they are not absolutely necessary.

 

The use case:

 

We have HTML data that is in a single field, the HTML contains an HTML Table.

 

The input contains a series of name/value pairs within the description field. The description field has an HTML table that contains 14 name/value pairs within <td> tags. Each pair sits on its own row (designated by the <tr> tag).

 

The objective is to produce a table containing the 14 name/value pairs.
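
For readers who want to see the target outside of Alteryx, here is a minimal Python sketch of the same objective using only the standard library's html.parser module. The sample markup, the field names in it, and the helper names (TableRowParser, parse_pairs) are made up for illustration and are not taken from the challenge data. It assumes a flat table with two <td> cells per row and does not try to handle nested tables.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names, so <TR> and <tr> both arrive as "tr"
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_pairs(description):
    """Return (name, value) tuples for rows that have exactly two cells."""
    parser = TableRowParser()
    parser.feed(description)
    parser.close()
    return [(row[0], row[1]) for row in parser.rows if len(row) == 2]


# Hypothetical sample, not the challenge input
sample = (
    "<html><body><table>"
    "<tr><td>NAME</td><td>Main Street</td></tr>"
    "<tr><td>LENGTH</td><td>1.5</td></tr>"
    "</table></body></html>"
)
pairs = parse_pairs(sample)
```

Each tuple in pairs corresponds to one row of the output table the challenge asks for.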

 

Good luck, I look forward to your feedback.

 

Update 2/20/2016:

The solution has been uploaded.

 
Creative Director

The solution has been uploaded.

Tara McCoy
Bolide

I'm not familiar with HTML, so I'm not sure how well my solution would work on other HTML examples. My bonus was that I got to use one of the tools in the Laboratory menu!

Spoiler

Week13Image.PNG
I added the second record to make sure the Make Columns tool would work with more than one record.
Week13dataImage.PNG
Alteryx Alumni (Retired)

I have never seen the "Make Columns" tool in use before. Nice job!

Alteryx Certified Partner

Good work Alex!

 

Another proof that there are a lot of ways to accomplish a goal in Alteryx!

 

Asteroid
Spoiler
I'm enjoying these weekly challenges. I'm applying quite a bit of this on my projects, one of which is parsing HTML tables.

Capture.JPG

The first formula line does a large part of the work, then two more formulas mark the rows and columns:

REGEX_Replace(Description,'<[t][^>]*>\W{0,1}|<\/[^t][^>]*>\W{0,1}|Null>','') will remove all but /td and /tr

REGEX_Replace(Description,'</tr>','~')

REGEX_Replace(Description,'</td>','|')
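
The same three-step idea can be sketched in Python with the re module. This is an illustration only: the tag-stripping pattern below is a simplified stand-in written for this sketch, not a translation of the first expression above, and the sample string and function names are hypothetical. It assumes the data itself never contains "~" or "|".

```python
import re


def table_to_delimited(description):
    """Strip tags, turning </tr> into '~' and </td> into '|'."""
    # 1. Drop every tag except the closing </td> and </tr> markers.
    text = re.sub(r"<(?!/t[dr]\s*>)[^>]*>", "", description, flags=re.IGNORECASE)
    # 2. End of row becomes "~", end of cell becomes "|".
    text = re.sub(r"</tr\s*>", "~", text, flags=re.IGNORECASE)
    text = re.sub(r"</td\s*>", "|", text, flags=re.IGNORECASE)
    return text


def split_rows(delimited):
    """Split the delimited string into rows of cells."""
    rows = []
    for row in delimited.split("~"):
        # Every cell ends with "|", so the piece after the last "|" is dropped.
        cells = [cell.strip() for cell in row.split("|")[:-1]]
        if cells:
            rows.append(cells)
    return rows


# Hypothetical sample, not the challenge input
sample = (
    "<table>\n"
    "<tr><td>NAME</td><td>Main Street</td></tr>\n"
    "<tr><td>LENGTH</td><td>1.5</td></tr>\n"
    "</table>"
)
delimited = table_to_delimited(sample)
rows = split_rows(delimited)
```

Splitting on "~" first and "|" second mirrors the rows-then-columns order described above.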

 

 

Alteryx Certified Partner
Spoiler

I made this alternative solution, largely driven by wanting to use the XML Parse tool. This was quite a challenge, as I'm new to Alteryx and don't know much about HTML, so it took me a while to figure out. I explain each step in more detail in this blog post.

Numbered solution.png
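
As a rough analogue of treating the table as XML, here is a minimal Python sketch using xml.etree.ElementTree. It is not the workflow pictured above. It assumes the table fragment is already well-formed XML with lower-case tag names (real HTML often is not, and ET.fromstring raises ParseError in that case). The sample markup and the function name pairs_from_xml are hypothetical.

```python
import xml.etree.ElementTree as ET


def pairs_from_xml(table_markup):
    """Read (name, value) pairs from a well-formed <table> fragment."""
    root = ET.fromstring(table_markup)  # raises ParseError if not well-formed
    pairs = []
    for tr in root.iter("tr"):
        # itertext() also picks up text inside nested tags such as <b>
        cells = ["".join(td.itertext()).strip() for td in tr.findall("td")]
        if len(cells) == 2:
            pairs.append((cells[0], cells[1]))
    return pairs


# Hypothetical sample, not the challenge input
sample = (
    "<table>"
    "<tr><td>NAME</td><td><b>Main Street</b></td></tr>"
    "<tr><td>LENGTH</td><td>1.5</td></tr>"
    "</table>"
)
xml_pairs = pairs_from_xml(sample)
```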

 

Alteryx Certified Partner

I'm not overly proud of this one @JoeM :(

 

 

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
Aurora

Kinda half-way between the provided solution from @GeneR, and the solution from @MarqueeCrew

 

Spoiler
- Used a few formulas to strip out everything up to the end of the body header, and the </body> so that we're left with just the inner body of the page
- Then did a vertical tokenize using Regex for table rows. Note: because of the embedded table row, I had to do this again with a formula
- Then did a horizontal tokenize using Regex for <TD>
- Stripped off Row 1, and converted nulls - job done!

I wouldn't say this is bullet-proof enough to point at any arbitrary page, but because it does the vertical and then the horizontal tokenize, it's stable under different lengths or widths of table
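
The vertical-then-horizontal tokenize described above can be sketched in Python roughly as follows. This is an approximation under stated assumptions, not the Alteryx workflow itself: the regular expressions, the padding of short rows with empty strings (one reading of "converted nulls"), the sample markup, and the function names are all choices made for this illustration. Like the original, it is not safe for nested tables, because the non-greedy row match stops at the first </tr>.

```python
import re

# Assumed patterns: one match per table row, then one match per cell
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<td\b[^>]*>(.*?)</td>", re.IGNORECASE | re.DOTALL)


def tokenize_rows(body):
    """Vertical tokenize into rows, then horizontal tokenize into cells."""
    rows = []
    for row_html in ROW_RE.findall(body):
        rows.append([cell.strip() for cell in CELL_RE.findall(row_html)])
    return rows


def to_table(body, skip_rows=1):
    """Drop the leading row(s) and pad short rows to a common width."""
    rows = tokenize_rows(body)[skip_rows:]
    width = max((len(row) for row in rows), default=0)
    return [row + [""] * (width - len(row)) for row in rows]


# Hypothetical sample, not the challenge input
sample = (
    "<table>"
    "<tr><td>Header</td></tr>"
    "<tr><td>NAME</td><td>Main Street</td></tr>"
    "<tr><td>LENGTH</td></tr>"
    "</table>"
)
table = to_table(sample)
```

Because rows and cells are tokenized separately, adding rows or cells to the sample changes only the shape of the result, not the code.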

My solution. I might bookmark this one to practice again down the line when I feel a bit more comfortable with my XML Parsing & RegEx skills... 

 

PS. @alex, the "Make Columns" tool has made my day. This will be SO useful for some of my work applications!! I went back & tested it out with my workflow on this challenge after I'd solved it with my original method, and it's so much cleaner, allows for fewer tools... Very cool tool!

 

 

Spoiler
WeeklyChallenge13.JPG