Weekly Challenge

Challenge #13: HTML Table Parsing

Alteryx Alumni (Retired)

The link to the solution for last challenge #12 is HERE.

 

For this challenge, let's look at parsing an HTML table into name/value pairs. As always, there are several ways to do this challenge. I have designated it as an advanced challenge because some of the more complex functions, like RegEx, can be used, but they are not absolutely necessary.

 

The use case:

 

We have HTML data in a single field, and that HTML contains an HTML table.

 

The input contains a series of name/value pairs within the description field. The description field has an HTML table that contains 14 name/value pairs inside <td> tags. Each pair sits on its own row (designated by the <tr> tag).

 

The objective is to produce a table containing the 14 name/value pairs.
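
For anyone who wants to cross-check their workflow output outside Designer, here is a minimal sketch of the same objective in Python, using only the standard library. The sample HTML, the class name `TableRowParser`, and the function `name_value_pairs` are hypothetical illustrations, not part of the challenge files, and the sketch assumes each row holds exactly two <td> cells (a name and a value).

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <td>.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def name_value_pairs(html):
    """Return (name, value) tuples for every two-cell table row."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [(row[0], row[1]) for row in parser.rows if len(row) == 2]


# Hypothetical sample; the real challenge input has 14 rows.
SAMPLE = (
    "<html><body><table>"
    "<tr><td>Name</td><td>Main Street</td></tr>"
    "<tr><td>Length</td><td>1.5</td></tr>"
    "</table></body></html>"
)

pairs = name_value_pairs(SAMPLE)
```

Each tuple in `pairs` corresponds to one output record, so a correct run on the challenge data should give 14 of them.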

 

Good luck, I look forward to your feedback.

 

Update: As of 9/20/19, the start and solution files were updated.  Your solution may not match those posted by Community members prior to this date. 

Alteryx

The solution has been uploaded.

Tara McCoy
11 - Bolide

I'm not familiar with HTML, so I'm not sure how well my solution would work on other HTML examples. My bonus was that I got to use one of the tools in the Laboratory menu!

Spoiler

Week13Image.PNG
I added the second record to make sure the Make Columns tool would work with more than one record.
Week13dataImage.PNG
Alteryx Alumni (Retired)

I have never seen the "Make Columns" tool in use before. Nice job!

Alteryx Certified Partner

Good work Alex!

 

Another proof that there are a lot of ways to accomplish a goal in Alteryx!

 

8 - Asteroid
Spoiler
I'm enjoying these weekly challenges. I'm applying quite a bit of this to my projects, one of which involves parsing HTML tables.

Capture.JPG

The first formula line does a large part of the work, then two more formulas identify the rows and columns:

REGEX_Replace(Description,'<[t][^>]*>\W{0,1}|<\/[^t][^>]*>\W{0,1}|Null>','') will remove all but /td and /tr

REGEX_Replace(Description,'</tr>','~')

REGEX_Replace(Description,'</td>','|')
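
For readers more at home in a scripting language, the same delimiter idea can be sketched in Python. This is an illustration of the approach rather than a translation of the formulas above: the pattern is a simplified one written for Python's `re` module, the function name `split_table` is hypothetical, and the sketch assumes that `~` and `|` never occur in the data itself.

```python
import re


def split_table(description):
    """Turn an HTML table string into a list of rows of cell strings."""
    # Step 1: drop every tag except the closing </td> and </tr> markers.
    text = re.sub(r"<(?!/t[dr]>)[^>]*>", "", description, flags=re.IGNORECASE)
    # Step 2: mark the end of each row with "~" and of each cell with "|".
    text = re.sub(r"</tr>", "~", text, flags=re.IGNORECASE)
    text = re.sub(r"</td>", "|", text, flags=re.IGNORECASE)

    rows = []
    for row in text.split("~"):
        cells = [cell.strip() for cell in row.split("|")]
        # Every cell ends with "|", so the final piece is always empty.
        if cells and cells[-1] == "":
            cells.pop()
        if cells:
            rows.append(cells)
    return rows


# Hypothetical sample input, not the challenge data.
sample = (
    "<table><tr><td>Name</td><td>Main Street</td></tr>"
    "<tr><td>Length</td><td>1.5</td></tr></table>"
)
rows = split_table(sample)
```

Splitting on the row marker first and the cell marker second mirrors the rows/columns step in the workflow: the position in the outer list is the row number and the position in the inner list is the column number.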

 

 

Alteryx Certified Partner
Spoiler

I made this alternative solution, largely driven by trying to use the XML Parse tool. It was quite a challenge: I'm new to Alteryx and don't know much about HTML, so it took me a while to figure out. I explain each step in more detail in this blog post.

Numbered solution.png

 

Alteryx Certified Partner

I'm not overly proud of this one @JoeM 

 

 


This post has been edited by Community Moderation to redact sensitive attachments. The original attachment has been replaced by post_placeholder.txt.

Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.
16 - Nebula

Kinda half-way between the provided solution from @GeneR, and the solution from @MarqueeCrew

 

Spoiler

This post has been edited by Community Moderation to redact sensitive attachments. The original attachment has been replaced by post_placeholder.txt.

14 - Magnetar

My solution. I might bookmark this one to practice again down the line when I feel a bit more comfortable with my XML Parsing & RegEx skills... 

 

PS. @alex, the "Make Columns" tool has made my day. This will be SO useful for some of my work applications!! I went back & tested it out with my workflow on this challenge after I'd solved it with my original method, and it's so much cleaner, allows for fewer tools... Very cool tool!

 

 

 


This post has been edited by Community Moderation to redact sensitive attachments. The original attachment has been replaced by post_placeholder.txt.