Challenge #13: HTML Table Parsing

PhilipMannering
16 - Nebula

Love a bit of regex

This post has been edited by Community Moderation to redact sensitive attachments. The original attachment has been replaced by post_placeholder.txt.

jasperlch
12 - Quasar

Solution attached



aclaxton
8 - Asteroid

This one was ugly and I think I committed a mortal sin by forcing regex on HTML :P

blairmbailey
8 - Asteroid

Solution attached - thanks!



michalsicak
7 - Meteor

parse/formula/rinse/repeat.

 



JosephSerpis
17 - Castor

Challenge Completed



asabau
8 - Asteroid

Here is number 13!



kcgreen
8 - Asteroid

If I ever get a new dog and the family gives me naming rights, I'm naming it Regex because Regex is freaking awesome.

https://regex101.com/ is an outstanding resource for learning and testing out regex.



mceleavey
17 - Castor

Nice and straightforward, but I still long for the day when Alteryx will automatically parse HTML tables, and we can all spend those extra few minutes dancing, staring into space, painting butterflies etc.

Anyway...





KOBoyle
11 - Bolide

Solution attached. For the first step, I used the Formula tool (FindString) to isolate the second, nested table. I had to do some googling to find the regex to parse the <tr> and <td> tag contents. This approach allowed me to complete it in four steps.
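For anyone who wants to see the same idea outside of Alteryx, here is a rough Python sketch of those steps. It is not the attached workflow: `str.find` stands in for FindString, and the sample HTML, the function names and the regex patterns are all made up for illustration. It assumes the data table is the second `<table` in the page and contains no further nested tables.

```python
import re

# Hypothetical sample input: an outer layout table wrapping the data table.
SAMPLE_HTML = """
<table>
  <tr><td>
    <table>
      <tr><td>Name</td><td>Score</td></tr>
      <tr><td>Alice</td><td>90</td></tr>
      <tr><td>Bob</td><td>85</td></tr>
    </table>
  </td></tr>
</table>
"""

# Non-greedy patterns; DOTALL lets a row or cell span several lines.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<t[dh]\b[^>]*>(.*?)</t[dh]>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def isolate_nested_table(html):
    """Return the second <table>...</table> block (assumed to be the nested one)."""
    lower = html.lower()
    first = lower.find("<table")
    second = lower.find("<table", first + 1)
    if first == -1 or second == -1:
        raise ValueError("expected a table nested inside another table")
    end = lower.find("</table>", second)
    if end == -1:
        raise ValueError("nested table is not closed")
    return html[second:end + len("</table>")]


def parse_rows(table_html):
    """Split a table into rows, then rows into cell text with inner tags stripped."""
    rows = []
    for row_html in ROW_RE.findall(table_html):
        cells = [TAG_RE.sub("", cell).strip() for cell in CELL_RE.findall(row_html)]
        rows.append(cells)
    return rows


rows = parse_rows(isolate_nested_table(SAMPLE_HTML))
for row in rows:
    print(row)
```

Like any regex-on-HTML approach, this only holds up while the markup stays regular. Unclosed tags or a table nested inside the data table would break it.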

 

After seeing others' comments and the product enhancement request logged in 2016 (https://community.alteryx.com/t5/Alteryx-Product-Ideas/Tool-to-Parse-Tables-in-HTML/idi-p/39400), I was disappointed that an HTML table parser is still not available. I love Alteryx, but I think this use case was too painful and time-consuming. This task can also be accomplished in Google Sheets with a single function call (ImportHTML) and in Excel on the Data tab with a couple of button clicks. In addition to being much quicker and easier, neither of these options requires any inspection of the HTML source.
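To illustrate what a parser-based approach buys you, here is a minimal sketch using Python's standard-library `html.parser` rather than regex. This is not an Alteryx feature, and it is not how ImportHTML or Excel work internally. The `TableCollector` class and the sample HTML are hypothetical names invented for this example.

```python
from html.parser import HTMLParser

# Hypothetical sample input: an outer layout table wrapping the data table.
SAMPLE_HTML = """
<table>
  <tr><td>
    <table>
      <tr><td>Name</td><td>Score</td></tr>
      <tr><td>Alice</td><td>90</td></tr>
    </table>
  </td></tr>
</table>
"""


class TableCollector(HTMLParser):
    """Collects every <table> as a list of rows, each row a list of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # all tables, in the order their <table> tag appears
        self._open = []    # stack of tables still being read (handles nesting)
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            table = []
            self.tables.append(table)
            self._open.append(table)
            # Simplification: a cell that wraps a nested table is dropped.
            self._cell = None
        elif tag == "tr" and self._open:
            self._open[-1].append([])
        elif tag in ("td", "th") and self._open and self._open[-1]:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._open[-1][-1].append("".join(self._cell).strip())
            self._cell = None
        elif tag == "table" and self._open:
            self._open.pop()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


collector = TableCollector()
collector.feed(SAMPLE_HTML)
nested = collector.tables[1]  # second table in document order
print(nested)
```

The parser tracks the tag structure itself, so picking the nested table becomes an index lookup instead of a string search through the source.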

 

-Ken

