
Weekly Challenges

Solve the challenge, share your solution and summit the ranks of our Community!


Challenge #40: Parsing a HTML File

JoshuaM
9 - Comet

Wasn't the prettiest but I did it!


 

JasonHu
8 - Asteroid

I struggled with the regex for a long time, then checked the answer to get an idea.

TraceyD
Alteryx Alumni (Retired)

Interesting - learning to scrape a webpage!

clmc9601
13 - Pulsar

My solution

aiahwieder
9 - Comet

Trickier than it seemed at first...

jwjeong
5 - Atom

I parsed every field with a regular expression.

 

However, the solution seems more logical and easier.

 

Thanks for Challenge #40.

 

20210125

 


 

phottovy
13 - Pulsar

My solution is very inefficient, but at least I had fun, which is really the most important thing, right?

Hub119
12 - Quasar

Solution attached.

JP_SDAK
8 - Asteroid

This was a great challenge - love REGEX for this (and everything, truth be told). I also discovered the error with row 649 being parsed incorrectly.


 

apathetichell
20 - Arcturus

Kept the errors on rows 374 and 649, which the output solution also had... I don't even know if 374 counts as an error - perhaps Dr. Harken transcends concepts of location and medical specialization - the information isn't included in the original text.