
# Weekly Challenge

Solve the challenge, share your solution and summit the ranks of our Community!

## Challenge #18: Predicting Baseball Wins

Alteryx Partner

I'm just beginning to learn the predictive tools, so this one took a bit of time. Fortunately, we didn't need to know what all those fields actually mean.

Spoiler
Process:
- Association Analysis tool to determine the correlation of each variable to Wins. I copied the output to a spreadsheet and sorted it to find the top 10
- Regress those 10 variables against Wins
- Join initial stats to list of 6 teams of interest
- Input the data for the 6 teams into the regression model from above using the Score tool
- Round Projected Wins (the output variable)
- Projected Losses = 162 - Projected Wins
- Sort by Projected Wins (desc) then Projected Losses (asc) then Team
- Calculate Games Back temp as the difference between Projected Wins for the prior record and Projected Wins for the current record
- Calculate Games Back as the running total of Games Back temp
- Clean-up
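For anyone who wants to check the arithmetic outside of Alteryx, the last few steps (round, losses, sort, Games Back as a running total) can be sketched in plain Python. This is only an illustration: the team names and projected wins below are made up, and `scored` stands in for whatever the Score tool produced.

```python
# Hypothetical Score tool output: team -> unrounded projected wins (made-up numbers).
scored = {"Team A": 93.6, "Team B": 88.2, "Team C": 93.4, "Team D": 71.7}

SEASON_GAMES = 162  # games per season, as used in the thread

rows = []
for team, raw_wins in scored.items():
    wins = round(raw_wins)  # Round Projected Wins
    rows.append({"team": team, "wins": wins, "losses": SEASON_GAMES - wins})

# Sort by Projected Wins (desc), then Projected Losses (asc), then Team.
rows.sort(key=lambda r: (-r["wins"], r["losses"], r["team"]))

# Games Back temp = prior record's wins minus this record's wins;
# Games Back = running total of Games Back temp.
games_back = 0
prior_wins = None
for row in rows:
    games_back_temp = 0 if prior_wins is None else prior_wins - row["wins"]
    games_back += games_back_temp
    row["games_back"] = games_back
    prior_wins = row["wins"]

for row in rows:
    print(row["team"], row["wins"], row["losses"], row["games_back"])
```

Because every team's losses are 162 minus its wins, the gap in wins alone matches the usual Games Back formula (half the sum of the win gap and the loss gap).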

Asteroid

This was definitely helpful, as I've struggled a few times getting Alteryx to successfully run and score the model (I had tried this one a couple of times a while back). Realized that...

Spoiler

... by filtering down to the teams before generating the linear model, there were not enough degrees of freedom (df) for the model to be generated. Filtering had to be done after the model creation. Learned on this one for sure!
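
A quick way to see the degrees-of-freedom problem is to count it out. This is a rough sketch, not what Alteryx does internally: it uses the standard residual df for a linear regression (rows minus predictors minus one for the intercept). The 6 teams and 10 predictors come from this thread; the 30-row full data set is an assumption for illustration only.

```python
def residual_df(n_rows: int, n_predictors: int, intercept: bool = True) -> int:
    """Residual degrees of freedom for an ordinary least squares fit."""
    return n_rows - n_predictors - (1 if intercept else 0)

# Filtering to the 6 teams of interest first, then fitting 10 predictors:
filtered_first = residual_df(n_rows=6, n_predictors=10)

# Fitting on the full data first (30 rows is an assumed size, not from the challenge):
full_data = residual_df(n_rows=30, n_predictors=10)

print(filtered_first)  # negative: more coefficients than rows, so no usable fit
print(full_data)       # positive: the model can be estimated
```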
Asteroid

Meteor

Solution attached.

Asteroid
Spoiler

Correlation analysis tool to find the top 10 variables; use those in the Linear Regression tool; Score tool to test it on the filtered data for the specified teams; subtract from 162 to get losses; Summarize to find the max; Append; subtract wins from the max to find games back.
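
The Summarize (max), Append, subtract pattern for games back looks roughly like this in Python. It is a sketch only, with made-up team names and projected wins standing in for the scored output.

```python
# Hypothetical rounded projected wins per team (made-up numbers).
projected_wins = {"Team A": 94, "Team B": 88, "Team C": 93}

SEASON_GAMES = 162

# "Summarize" step: the maximum projected wins across the teams.
max_wins = max(projected_wins.values())

# "Append" the max to every row, then subtract each team's wins from it.
standings = {
    team: {
        "wins": wins,
        "losses": SEASON_GAMES - wins,
        "games_back": max_wins - wins,
    }
    for team, wins in projected_wins.items()
}

for team, row in standings.items():
    print(team, row)
```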

Meteoroid

My approach is a little different in that I "automated" the selection of the top 10 predictor variables. First I identify the 10 with the highest correlation (Sort by the correlation in descending order, Sample first 10 records) and then Join this list to the original data set which has been Transposed (so that each team and variable is a single row). This allows me to simply "select all" in the Linear Regression tool (and just deselect the two irrelevant variables).
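
The "automated" top-N selection can be sketched outside of Alteryx like this. It is an illustration under assumptions: the column names and values are made up, `pearson` and `top_predictors` are hypothetical helpers, and plain Pearson correlation is used as a stand-in for whatever the Alteryx tool reports. It sorts by correlation in descending order as described above; ranking by absolute value instead would also pick up strong negative predictors.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (no zero-variance guard)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def top_predictors(columns, target, k=10):
    """Names of the k columns most positively correlated with the target column."""
    scores = {
        name: pearson(values, columns[target])
        for name, values in columns.items()
        if name != target
    }
    # Sort by correlation descending and keep the first k (like Sort + Sample).
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Made-up data: one row per team, one list per variable.
columns = {
    "Wins": [60, 70, 80, 90, 100],
    "stat_a": [1, 2, 3, 4, 5],
    "stat_b": [5, 1, 4, 2, 3],
    "stat_c": [2, 3, 3, 5, 6],
}

selected = top_predictors(columns, "Wins", k=2)
print(selected)
```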
