## Challenge #18: Predicting Baseball Wins

It was enlightening to see that you could derive the top 10 correlation coefficients from the Spearman rank correlation and the Pearson correlation outputs.
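For anyone who wants to see the idea outside the workflow, here is a minimal sketch of ranking predictors by Pearson and Spearman correlation with the target. The column names and numbers are made up for illustration (they are not the challenge data), and `top_predictors` is a hypothetical helper, not an Alteryx function.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def top_predictors(columns, target, method, n=10):
    """Names of the n columns with the largest absolute correlation to target."""
    scores = {name: method(vals, target) for name, vals in columns.items()}
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)[:n]

# Hypothetical toy data, one value per team-season.
wins = [95, 88, 81, 74, 66]
columns = {
    "runs_scored": [820, 790, 730, 700, 650],
    "errors": [80, 95, 90, 110, 120],
    "stolen_bases": [100, 60, 120, 70, 90],
}

print(top_predictors(columns, wins, pearson, n=2))
print(top_predictors(columns, wins, spearman, n=2))
```

Sorting by absolute value matters here, because a strong negative correlation (errors, in this toy data) is just as useful to a regression as a strong positive one.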

However, I was not able to get the predictors' p-values to show up in the output.

It would be nice if Alteryx gave you the option of choosing which values from the underlying R output to view. The significance of each predictor is one of the things to check when building a linear prediction model.
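In case it helps, here is a rough sketch of how a predictor's p-value can be worked out by hand for the simplest case, a regression with one predictor. It is not what the tool does internally: the data is invented, `slope_p_value` and `two_sided_p` are hypothetical helpers, and the t-distribution tail is integrated numerically with Simpson's rule rather than looked up from a statistics library. With several predictors the standard errors come from the full coefficient covariance matrix, which this sketch does not cover.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 1 - 2 * (area under the t density from 0 to |t|).

    Uses Simpson's rule, so steps must be even. Intended for small samples;
    math.gamma overflows for very large df.
    """
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)

def slope_p_value(x, y):
    """Least-squares slope of y on x, its t statistic, and two-sided p-value."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    df = n - 2                       # two estimated parameters
    std_err = sqrt(sse / df / sxx)   # standard error of the slope
    t_stat = slope / std_err
    return slope, t_stat, two_sided_p(t_stat, df)

# Hypothetical toy data, not the challenge data set.
wins = [95, 88, 81, 74, 66]
runs_scored = [820, 790, 730, 700, 650]
stolen_bases = [100, 60, 120, 70, 90]

for name, values in [("runs_scored", runs_scored), ("stolen_bases", stolen_bases)]:
    slope, t_stat, p = slope_p_value(values, wins)
    print(f"{name}: slope={slope:.4f} t={t_stat:.2f} p={p:.4f}")
```

A small p-value suggests the slope is unlikely to be zero by chance, which is the check being asked for above; a large one is a hint that the predictor may not be pulling its weight.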

I know sod all about building and validating regression models - so a very useful challenge. Solution attached.

I tried to make the selection of variables for the Linear Regression tool dynamic, but only managed to narrow the list down. The clicking is still a manual process.

Catching up on the old ones. I was curious to compare the three methods, and it turns out the top 10 predictor variables are actually the same in all of them.

A nice introduction to Linear Regression.

A quick glance through tells me I've done this pretty much the same way as everyone else.

I first ran the data into the Association Analysis tool, and deselected the appropriate fields. This then gave me the top ten associated fields.

I then used these to choose the predictor variables in the Linear Regression tool.
I then simply filtered the required teams, scored them against the Linear Regression model, calculated the max number of wins, and calculated the other fields accordingly.
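Roughly, the filter, score, and max steps above could be sketched like this. Everything in it is an assumption for illustration: the team names, the single `run_diff` predictor, the numbers, and the `wins_behind_leader` field standing in for "the other fields". The real workflow uses the ten selected predictors and the tool's own scoring.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def score(model, value):
    """Predicted wins for one predictor value."""
    intercept, slope = model
    return intercept + slope * value

# Hypothetical training data: run differential vs. wins in past seasons.
history_run_diff = [150, 90, 20, -40, -120]
history_wins = [96, 90, 83, 77, 69]
model = fit_line(history_run_diff, history_wins)

# Hypothetical teams to score; only some of them are required.
current = [
    {"team": "Team A", "run_diff": 60},
    {"team": "Team B", "run_diff": -30},
    {"team": "Team C", "run_diff": 110},
    {"team": "Team D", "run_diff": 10},
]
required = {"Team A", "Team B", "Team D"}

# Filter the required teams, then score each one against the model.
scored = [
    {**row, "predicted_wins": score(model, row["run_diff"])}
    for row in current
    if row["team"] in required
]

# Max predicted wins, plus one derived field (an assumed example).
best = max(scored, key=lambda row: row["predicted_wins"])
for row in scored:
    row["wins_behind_leader"] = best["predicted_wins"] - row["predicted_wins"]

for row in scored:
    print(row)
```

Filtering before scoring keeps the max limited to the teams of interest, which matches the order of steps described above.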

Here is my version. I played around with the methods from other users.

