# Weekly Challenge

Solve the challenge, share your solution and summit the ranks of our Community!

## Challenge #18: Predicting Baseball Wins

10 - Fireball

I get slightly different results, but they're close enough; my workflow is almost identical to the solution.

Spoiler
8 - Asteroid

Alteryx needs a function to select the top 10 fields in a linear regression automatically, as a .yxfd file.

Alteryx
Spoiler
David Wilcox
Senior Software Engineer
Alteryx
Alteryx Certified Partner

Challenge 18 is done!

Spoiler
7 - Meteor
Spoiler

8 - Asteroid

All,

I guess that I will go deep on this one!

-In my workflow I have my own solution, which uses my personally selected variables, alongside the variables of the official solution, which I vehemently oppose.

-The reason is that, although I am able to get the same output, we are really taking predictive modeling out of context when we don't examine variable selection.

-I would rather see the Stepwise tool used next time instead of a simple correlation.

Spoiler
Common Variables: OBP, RBI, TB

| My Variables | Solution Variables |
|---|---|
| BatAge | BA |
| CS | HR |
| GDP | OPS |
| IBB | OPS_Adj |
| SB | R |
| X2B | R_G |
| X_Bat | SLG |

| Variable | GVIF | DF | Std_GVIF |
|---|---|---|---|
| OPS | 10231.97529 | 1 | 101.1532 |
| SLG | 5140.467053 | 1 | 71.69705 |
| R_G | 2390.894591 | 1 | 48.89677 |
| R | 2206.742874 | 1 | 46.97598 |
| OBP | 1373.834561 | 1 | 37.06527 |
| RBI | 261.1853983 | 1 | 16.16123 |
| TB | 60.38742787 | 1 | 7.770935 |
| HR | 31.45761731 | 1 | 5.608709 |
| BA | 9.184712977 | 1 | 3.030629 |
| OPS_Adj | 6.802812459 | 1 | 2.60822 |

As we can see, three variables carry over: on-base percentage (OBP), total bases (TB), and runs batted in (RBI). These three are pretty universal in this Moneyball-style challenge. Teams that can get on base, bat runs in, and advance on the bases generally score more runs over time, which leads to more victories.

-Looking at the variance inflation factors (VIFs) for the solution model, we see that the VIF is well above 6 for all variables except batting average (BA) and adjusted OPS (OPS_Adj).

-Long story short, these variables, while helpful, have already been accounted for by OBP, RBI, and TB. My model uses more of the "negative" baseball stats, like caught stealing (CS) and times grounded into double play (GDP), where you could say that every instance costs your team about one-seventh of a game, exclusive of everything else. Batters' age is surprising: every additional year gets you about 2.75 games, and I see this as a proxy for experienced players. A young team just doesn't have the experience and will be more likely to ground into double plays and less likely to get on base than more experienced players (who are also more likely to be plucked from lesser teams).
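
For anyone who wants to see where a VIF comes from, here is a rough Python sketch outside Alteryx. It relies on the standard identity that each predictor's VIF is the matching diagonal entry of the inverse of the predictors' correlation matrix. The function names (`pearson`, `invert`, `vif`) and the toy columns `x1`, `x2`, `x3` are made up for illustration; they are not the challenge data and not what the Alteryx tools run internally.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] becomes [I | A^-1].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return {name: VIF} for a dict of predictor columns."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}


# Made-up toy predictors: x2 is nearly a copy of x1, x3 is mostly unrelated.
toy = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [1.1, 1.9, 3.2, 3.8, 5.1, 6.0],
    "x3": [2.0, -1.0, 0.0, 1.0, -2.0, 0.0],
}
for name, value in vif(toy).items():
    print(f"{name}: VIF = {value:.2f}")
```

With only one degree of freedom per variable, as in the table above, the Std_GVIF column is just the square root of GVIF, so either column tells the same story about which predictors overlap.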

Cheers!

Matt
8 - Asteroid

I definitely need more work on the predictive tools. This exercise was a good challenge.

Spoiler
8 - Asteroid

Laziness breeds efficiency. I started with the Association Analysis tool, but didn't want to sift through the data to figure out the top 10. I looked through the other Data Investigation tools and found the Pearson Correlation tool, which let me find the top 10 much more easily (and more reliably than looking through a big table :) ).
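
The same "rank by correlation, keep the top few" idea can be sketched in plain Python, roughly as below. This is only an illustration of the approach, not what the Pearson Correlation tool does internally; the function names and the toy fields `f1`, `f2`, `f3` are invented for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def top_k_by_correlation(fields, target, k=10):
    """Rank fields by |correlation with target| and return the k strongest as (name, r) pairs."""
    scores = {name: pearson(values, target) for name, values in fields.items()}
    ranked = sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]


# Made-up toy data standing in for team stats and wins.
wins = [1.0, 2.0, 3.0, 4.0, 5.0]
fields = {
    "f1": [2.0, 4.0, 6.0, 8.0, 10.0],   # rises with wins
    "f2": [5.0, 4.0, 3.0, 2.0, 1.0],    # falls as wins rise
    "f3": [1.0, -1.0, 0.0, 1.0, -1.0],  # only weakly related
}
for name, r in top_k_by_correlation(fields, wins, k=2):
    print(f"{name}: r = {r:+.3f}")
```

Ranking on the absolute value keeps strongly negative fields in the running, but it looks at each field on its own, so it will happily pick several fields that all measure the same thing.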

Spoiler
Alteryx Partner

Here's my solution to challenge 18.

Spoiler