## Challenge #18: Predicting Baseball Wins

This week let’s have some fun and look at predicting baseball wins. For those looking for the solution to last week’s challenge (#17), the link is HERE.

The use case: The baseball season has ended and it's time to project next year's win totals.

The objective: Determine the 10 variables that correlate most strongly with wins (excluding [Win_Pct] and [Games] from the correlation). Use those 10 variables to predict the number of wins each team will have in next year’s season.
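
For anyone who wants to see the ranking step outside of Designer, here is a minimal Python sketch of one way to pick the variables most correlated with wins. It is not the posted solution. It assumes a target column called `Wins` and a team column called `Tm`; the `Runs` and `Errors` columns and every number in `toy_rows` are made up purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def top_correlated(rows, target="Wins", exclude=("Win_Pct", "Games"), k=10):
    """Rank numeric columns by absolute correlation with the target; return the top k."""
    target_values = [row[target] for row in rows]
    scores = {}
    for column in rows[0]:
        if column == target or column in exclude:
            continue
        values = [row[column] for row in rows]
        if not all(isinstance(v, (int, float)) for v in values):
            continue  # skip text columns such as the team code
        scores[column] = pearson(values, target_values)
    ranked = sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Made-up toy rows, for illustration only (not the challenge data).
toy_rows = [
    {"Tm": "AAA", "Wins": 95, "Games": 162, "Win_Pct": 0.586, "Runs": 800, "Errors": 80},
    {"Tm": "BBB", "Wins": 85, "Games": 162, "Win_Pct": 0.525, "Runs": 740, "Errors": 95},
    {"Tm": "CCC", "Wins": 75, "Games": 162, "Win_Pct": 0.463, "Runs": 690, "Errors": 90},
    {"Tm": "DDD", "Wins": 65, "Games": 162, "Win_Pct": 0.401, "Runs": 640, "Errors": 110},
]
top = top_correlated(toy_rows, k=2)
```

Ranking by absolute value keeps strongly negative predictors in the running alongside positive ones, which seems to be the intent of "top 10 variables that correlate to wins".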

Restrict the analysis to the following teams: Boston - BOS, Los Angeles Angels of Anaheim - LAA, Chicago Cubs - CHC, San Francisco Giants - SFG, Colorado Rockies - COL and Texas Rangers - TEX.

Produce the projected final standings and show how many games out of first place each team finishes, assuming each team plays 162 games.
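
If "games out of first place" is unfamiliar, the following Python sketch shows the usual games-behind arithmetic. It is only an illustration under stated assumptions: the team codes and predicted win totals are placeholders, and the function names `games_behind` and `standings` are hypothetical.

```python
def games_behind(leader_wins, leader_losses, team_wins, team_losses):
    """Standard games-behind figure: half the combined gap in wins and losses."""
    return ((leader_wins - team_wins) + (team_losses - leader_losses)) / 2


def standings(predicted_wins, season_games=162):
    """Return (team, wins, games_behind) tuples, best team first."""
    ranked = sorted(predicted_wins.items(), key=lambda item: item[1], reverse=True)
    leader_wins = ranked[0][1]
    table = []
    for team, wins in ranked:
        # With a fixed schedule, losses are simply the games not won.
        gb = games_behind(
            leader_wins, season_games - leader_wins, wins, season_games - wins
        )
        table.append((team, wins, gb))
    return table


# Placeholder teams and predicted win totals, not results from the challenge data.
example = standings({"BBB": 88.5, "AAA": 92, "CCC": 81})
```

Because every team is assumed to play the same 162 games, the figure reduces to the leader's projected wins minus the team's projected wins.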

Good luck! I hope you are having fun with these exercises and expanding your knowledge of Alteryx. Thanks to all who have provided feedback.

A solution has been posted to this article.

Tara McCoy
I pretty much had the same solution except I used a multi-row tool instead of Summarize/Append to calculate games behind.

Pretty much the same solution as @GeneR / @TaraM and @alex. @alex's method of using a join to filter instead of a formula is much more scalable on large data.
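
To illustrate the join-as-filter idea in code terms, here is a rough Python sketch. It is not @alex's actual workflow. It assumes the team code lives in a column named `Tm`; the lookup list uses the team codes from the challenge text, while `sample_rows` and the `XXX` and `YYY` codes are placeholders.

```python
# Lookup table of teams to keep. In a workflow this would be a separate input
# joined on the team code, so the list can grow without editing an expression.
teams_to_keep = [
    {"Tm": "BOS"},
    {"Tm": "LAA"},
    {"Tm": "CHC"},
    {"Tm": "SFG"},
    {"Tm": "COL"},
    {"Tm": "TEX"},
]


def filter_by_join(rows, lookup, key="Tm"):
    """Inner-join style filter: keep rows whose key appears in the lookup table."""
    wanted = {entry[key] for entry in lookup}  # set membership is a hash lookup
    return [row for row in rows if row[key] in wanted]


# Placeholder rows, for illustration only.
sample_rows = [{"Tm": "BOS"}, {"Tm": "XXX"}, {"Tm": "TEX"}, {"Tm": "YYY"}]
kept = filter_by_join(sample_rows, teams_to_keep)
```

The formula equivalent would be a chain of `Tm == "BOS" or Tm == "LAA" or ...` comparisons, which has to be rewritten every time the team list changes.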

Request to @TaraM - some of the terminology in this one caught me - for example "how many games out of first place", which is probably a natural way of describing the question for folks who grew up with baseball.

Would you mind amending the challenge text a little to say something like "what is the difference in projected win-count between the overall winner and each other team" or something similar?

Good challenge - useful to have an excuse to use the predictive tools!

Fun!!

There are a few different ways to determine the top 10 metrics for predicting wins. I used the Association Analysis tool.

First time using the Association Analysis tool... worked well :)

My solution! I saved this one till last because I didn't read the question all the way (or at all) and for some reason thought it had something to do with drafting a Fantasy Baseball team, and I was hoping to modify it later to help get a leg up on Fantasy Football drafting later this summer... turns out, that was not at all what the challenge was about. Oops. However, dearest Alteryx Challenge-Generating Team, I do believe a Fantasy Football Drafting Challenge would be a lovely idea for an early summer brain teaser... just sayin'... ;)

Anyway. I originally had more or less the same workflow as others, but decided to take a slightly different detour because I don't like having to go in and manually select things (like variables) in my workflows. So I figured out a (rather rambling) path that more or less allowed me to select all variables (except Tm & Wins) in the Linear Regression tool by narrowing it down to just the 10 top variables in the data being used in the tool first (rather than solving for the 10 variables first and then manually picking them from the list of all variables).

(It's the little things in life that bring me joy...)

And on that note... I believe I have officially finished all 70 of the current Weekly Challenges to date!!

Game on, @NicoleJohnson. It will be interesting to see the action replays of the photo finish for the 75th one.

Any other takers in the community to make this a 4 or 5 way sprint to the finish line?

I will 100% cross the finish line with you!!!

Are you going to Inspire this year? I'd love to meet up in person if you are. I will be there, and cannot wait. Last year was, well, quite inspiring :)
