This week, let’s have some fun and look at predicting baseball wins. For those looking for the solution to last week’s challenge (#17), the link is HERE.
The use case: The baseball season has ended and it's time to project next year's win totals.
The objective: Determine the top 10 variables that correlate most strongly with wins (excluding [Win_Pct] and [Games] from the correlation). Use those top 10 variables to predict the number of wins each team will have next season.
Limit the analysis to these teams: Boston - BOS, Los Angeles Angels of Anaheim - LAA, Chicago Cubs - CHC, San Francisco Giants - SFG, Colorado Rockies - COL, and Texas Rangers - TEX.
Produce the projected final standings and show how many games out of first place each team finishes, assuming each team plays 162 games.
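For anyone who wants to sanity-check the variable selection outside of Alteryx, here is a minimal Python sketch of the "top variables by correlation" step. It is only an illustration under assumptions: the column names Runs, ERA and SB and all of the numbers are made up, and ranking by absolute Pearson correlation is one reasonable reading of "correlate to wins", not necessarily what the official solution does.

```python
from math import sqrt

# Columns never used as predictors: the team id, the target, and the two
# fields the challenge says to exclude from the correlation.
EXCLUDED = {"Tm", "Wins", "Win_Pct", "Games"}


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists.

    Note: a constant column has zero variance and would divide by zero here.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def top_correlated(rows, target="Wins", k=10):
    """Return up to k column names, strongest absolute correlation first."""
    y = [row[target] for row in rows]
    scores = {
        name: abs(pearson([row[name] for row in rows], y))
        for name in rows[0]
        if name not in EXCLUDED
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up rows for illustration only; Runs, ERA and SB are hypothetical fields.
rows = [
    {"Tm": "AAA", "Wins": 95, "Win_Pct": 0.586, "Games": 162, "Runs": 800, "ERA": 3.5, "SB": 90},
    {"Tm": "BBB", "Wins": 88, "Win_Pct": 0.543, "Games": 162, "Runs": 760, "ERA": 3.8, "SB": 60},
    {"Tm": "CCC", "Wins": 81, "Win_Pct": 0.500, "Games": 162, "Runs": 700, "ERA": 4.1, "SB": 110},
    {"Tm": "DDD", "Wins": 70, "Win_Pct": 0.432, "Games": 162, "Runs": 640, "ERA": 4.9, "SB": 80},
]

print(top_correlated(rows, k=2))
```

Taking the absolute value matters: a variable that is strongly negatively correlated with wins (runs allowed, for example) is just as useful to the regression as a strongly positive one.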
Good luck! I hope you are having fun with these exercises and expanding your knowledge of Alteryx. Thanks to all who have provided feedback.
Request to @TaraM - some of the terminology in this one caught me - for example "how many games out of first place", which is probably a natural way of describing the question for folks who grew up with baseball.
Would you mind amending the challenge text a little to say something like "what is the difference in projected win-count between the overall winner and each other team" or something similar?
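For readers who did not grow up with baseball: "games behind" is conventionally computed as ((leader wins - team wins) + (team losses - leader losses)) / 2. When every team plays the same 162 games, that reduces to exactly the difference in win count suggested above. A small Python sketch, with made-up win totals and a hypothetical function name, shows the two agree:

```python
def standings(projected_wins, games=162):
    """Rank teams by projected wins and compute games behind the leader.

    projected_wins maps a team code to its projected win total.
    Returns a list of (team, wins, games_behind) tuples, leader first.
    """
    ranked = sorted(projected_wins.items(), key=lambda kv: kv[1], reverse=True)
    leader_wins = ranked[0][1]
    leader_losses = games - leader_wins
    table = []
    for team, wins in ranked:
        losses = games - wins
        # Conventional games-behind formula; with equal games played it
        # simplifies to leader_wins - wins.
        games_behind = ((leader_wins - wins) + (losses - leader_losses)) / 2
        table.append((team, wins, games_behind))
    return table


# Made-up projections for illustration only, not the challenge answer.
for team, wins, gb in standings({"BOS": 90, "CHC": 95, "TEX": 81}):
    print(team, wins, gb)
```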
Good challenge - useful to have an excuse to use the predictive tools!
My solution! I saved this one for last because I didn't read the question all the way (or at all) and for some reason thought it had something to do with drafting a Fantasy Baseball team. I was hoping to modify it to help get a leg up on Fantasy Football drafting later this summer... turns out, that was not at all what the challenge was about. Oops. However, dearest Alteryx Challenge-Generating Team, I do believe a Fantasy Football Drafting Challenge would be a lovely idea for an early summer brain teaser... just sayin'... ;)
Anyway. I originally had more or less the same workflow as others, but decided to take a slightly different route because I don't like having to go in and manually select things (like variables) in my workflows. So I figured out a (rather rambling) path that let me select all variables (except Tm & Wins) in the Linear Regression tool: I first narrow the data feeding the tool down to just the top 10 variables, rather than solving for the top 10 and then manually picking them from the list of all variables.
(It's the little things in life that bring me joy...)
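The same "narrow the data first, then select everything" idea can be sketched in Python. This is only an analogy to the workflow described above, not the workflow itself: the function names are hypothetical, Runs, ERA and SB are made-up fields, and the list of top variables is assumed to come from an earlier correlation step.

```python
def keep_top_columns(rows, top_names, id_col="Tm", target="Wins"):
    """Drop every column except the id, the target, and the chosen variables."""
    keep = [id_col, target, *top_names]
    return [{name: row[name] for name in keep} for row in rows]


def predictor_columns(rows, id_col="Tm", target="Wins"):
    """'Select all' for the model step: every column except id and target."""
    return [name for name in rows[0] if name not in (id_col, target)]


# Made-up data for illustration only.
rows = [
    {"Tm": "AAA", "Wins": 95, "Runs": 800, "ERA": 3.5, "SB": 90},
    {"Tm": "BBB", "Wins": 70, "Runs": 640, "ERA": 4.9, "SB": 80},
]

reduced = keep_top_columns(rows, ["Runs", "ERA"])
print(predictor_columns(reduced))
```

Because the predictor list is derived from whatever columns survive the filter, nothing needs to be re-selected by hand when the top variables change.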
@SeanAdams, @NicoleJohnson, don't know about racing to the finish (I can't always get to the challenges when they're posted) but I will 100% cross the finish line with you!!!
Are you going to Inspire this year? I'd love to meet up in person if you are. I will be there, and cannot wait. Last year was, well, quite inspiring :)