Challenge #18: Predicting Baseball Wins

mithily
8 - Asteroid
Spoiler
losing my fear of predictive tools
jzkyburz
8 - Asteroid
Spoiler
18.PNG
Computernerd
8 - Asteroid

These Predictive challenges are tough. I did have to peek at the solution on this one.

daiphuongngo
9 - Comet
Spoiler
Screenshot 2023-10-09 164251.png

JXEC
7 - Meteor

Had to shake off some rust regarding regression. The tools for data prediction in Alteryx are rather solid too.

JBLove
10 - Fireball

Solution attached.

CoG
14 - Magnetar

This was quite challenging, but also very fun! Got to learn a few new tools, and really enjoyed making one seamless workflow that's totally automated and needs no extra output files.

Spoiler
Workflow.png
Rob-Silk
8 - Asteroid

My solution below:

challenge_18_RS_screencap.PNG

LorenzNacilla
8 - Asteroid

First time using predictive tools, enjoyed this challenge

Spoiler
Challenge 18 Solution.png

RWvanLeeuwen
11 - Bolide

Prepping for the Expert exam, so I'm doing this challenge once more. It is a weird challenge, and ultimately I ended up creating two random forests instead of a linear regression or count regression (those tools would not let me select a target field, thanks to some magnificent bugs).

Spoiler
The drop-down is broken across multiple tools and tool versions -> hooray!

I used 2 models so that I could ensemble them (average). Forest models are awesome; however, these models should never be used: the population data is wrong and small, the granularity is awkward, the meaning of the field names is unclear, and I have some target leakage risk because of PCA.
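
For anyone curious what "ensemble them (avg)" amounts to outside of Alteryx, here is a minimal Python sketch of averaging the row-by-row predictions of two models. It is not the poster's workflow: the function name `average_ensemble` and the prediction values are hypothetical, made up purely for illustration, and the two lists stand in for the scores of two already-trained forest models.

```python
from statistics import mean


def average_ensemble(*model_predictions):
    """Average per-row predictions from several models (simple mean ensemble)."""
    if not model_predictions:
        raise ValueError("need at least one model's predictions")
    lengths = {len(p) for p in model_predictions}
    if len(lengths) != 1:
        raise ValueError("all models must predict the same number of rows")
    # zip(*...) groups the predictions row by row across models.
    return [mean(row) for row in zip(*model_predictions)]


# Hypothetical win predictions for three teams from two forest models
# (illustrative numbers only, not taken from the challenge data).
forest_a = [81.0, 92.5, 70.0]
forest_b = [79.0, 95.5, 74.0]

ensemble = average_ensemble(forest_a, forest_b)
print(ensemble)  # [80.0, 94.0, 72.0]
```

Averaging tends to smooth out the individual models' errors, but it does nothing about the data problems listed above (small population, unclear fields, possible target leakage).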