Challenge #430: Inspire 2024 – Grand Prix (Lap 3)

patrick_digan
17 - Castor
Spoiler
image.png
ARussell34
8 - Asteroid

I found the driver!

AR_430.png

Reesetrain2
9 - Comet

My submission!

Spoiler
Screenshot 2024-07-26 165030.png

Garrett_Stoker
8 - Asteroid

Regex just to make it interesting.

Spoiler
Screenshot 2024-08-16 113123.png
Erin
11 - Bolide
Spoiler
430.png
DawnDuong
13 - Pulsar

Good refresher of the Stepwise and LR tools. Thank you for sharing this challenge.

Caramel8
7 - Meteor

Spoiler

I learned that using the Join tool can somehow cause a header-matching error in the Score tool.

2024-09-18 22_14_28-Alteryx Designer x64 - Challenge_430_start_file.yxmd_.png

Alfie_King1
8 - Asteroid
Spoiler
Screenshot 2024-10-23 150926.png

OllieClarke
16 - Nebula

Here's my solution which matches the output, and some thoughts on oversampling

Spoiler
image.png
If the Podium finish column is "Yes" in only ~24% of records, should we not be oversampling here?
23.68% Yes

I tried it, and it broke everything (I think because too few records were left over after the undersampling).
You get a 100% accurate logistic regression (which warns you about the lack of rows), but after scoring, no drivers are predicted to podium.
No one gets a podium

If we do actually oversample, though (rather than using the tool):
Oversampling the "Yes" rather than undersampling the "No"

We get a more accurate logistic regression than the basic workflow (although less accurate than the 100% accurate one from the tool).
Oversampled logistic regression
We also get a model that outputs the actual 3 podium finishers as the 3 most likely to podium (with Leclerc 4th most likely).
image.png

The oversampling section might be too much for a grand prix leg, and there's not a lot of data anyway, but is oversampling the correct approach here?
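
For anyone who wants to see the difference outside Designer, here is a minimal Python sketch of the two balancing strategies discussed above: undersampling the "No" rows versus oversampling the "Yes" rows. It is not how any Alteryx tool is implemented. The function names, the `podium` and `driver_id` fields, and the 100-row toy dataset (24 "Yes", 76 "No") are all assumptions made up for illustration, chosen only to roughly mirror the ~24% share mentioned in the post.

```python
import random


def split_by_class(rows, label_key, minority):
    """Split rows into (minority_rows, majority_rows) by label value."""
    minority_rows = [r for r in rows if r[label_key] == minority]
    majority_rows = [r for r in rows if r[label_key] != minority]
    return minority_rows, majority_rows


def undersample_majority(rows, label_key, minority, seed=0):
    """Keep every minority row and a random, equally sized subset of majority rows."""
    rng = random.Random(seed)
    minority_rows, majority_rows = split_by_class(rows, label_key, minority)
    k = min(len(minority_rows), len(majority_rows))
    return minority_rows + rng.sample(majority_rows, k)


def oversample_minority(rows, label_key, minority, seed=0):
    """Keep every row and append random duplicates of minority rows until balanced."""
    rng = random.Random(seed)
    minority_rows, majority_rows = split_by_class(rows, label_key, minority)
    shortfall = max(len(majority_rows) - len(minority_rows), 0)
    extra = rng.choices(minority_rows, k=shortfall) if minority_rows else []
    return list(rows) + extra


# Hypothetical toy data: 24 "Yes" rows and 76 "No" rows.
rows = [{"driver_id": i, "podium": "Yes" if i < 24 else "No"} for i in range(100)]

under = undersample_majority(rows, "podium", "Yes")
over = oversample_minority(rows, "podium", "Yes")

print(len(under))  # 48 rows: 24 "Yes" + 24 "No"
print(len(over))   # 152 rows: 76 "Yes" (with duplicates) + 76 "No"
```

With so few rows to begin with, undersampling throws away most of the "No" records, which fits the "too few records" warning described above. Oversampling keeps everything, but the duplicated "Yes" rows mean accuracy measured on the same data will tend to look better than it really is.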

Bobbyt23
13 - Pulsar

Good practice with predictive tools. Couldn't do it under pressure on stage though!!