A solution to last week's Challenge has been posted HERE!
I learned the worst thing to say to a bunch of football fans: "The Super Bowl? It's just another football game." The looks of horror on my colleagues' faces when I said that are burned into my memory forever. And let me warn you: don't say that when you have anything important to do, because you will be subject to an hour-long debate about player stats, offense/defense match-ups, the importance of turnovers (and not just the apple-flavored ones) and the proper way to make chili.
Anyway, after this debate, I did what any data nerd would do: I took to the internet in search of datasets and fired up my Alteryx Designer to answer this question: Is the Super Bowl just another game? I decided that I'd do a little experiment. Using the Predictive tools and data from the New England Patriots' 2016 and 2017 seasons, I wanted to see how a linear regression model developed on regular season games (including post-season) performed when used to predict the number of points the Patriots would score during the Super Bowl.
First, I downloaded data (source: here) for the New England Patriots for the 2016 and 2017 seasons (provided as inputs in the Start File), which required a bit of parsing to prepare for later use. Then, I set out on some data investigation to begin my linear regression model development. My approach (which may not be the same one you use in your modeling approach) was to choose the four (4) variables from the "Score", "Offense" and "Defense" data categories with the most significant relationship to the variable "TM", which indicates the number of points the Patriots scored. With my variables selected, I began the model creation. My approach (which may differ from yours): develop the model on all values except for one pair of regular season games and the Super Bowl games.
What's the difference between your predicted values and actual values for your regular season games and Super Bowl games? Is the Super Bowl just another game?
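For anyone who wants to sanity-check their workflow outside of Designer, here is a rough Python sketch of the same idea: pick the four predictors most related to points scored, fit a linear regression on everything except a few held-out games, then compare predicted and actual values for the held-out rows. Everything specific in it is an assumption for illustration: the data is randomly generated rather than the real Patriots file, the column names other than the target are made up, the held-out row positions are arbitrary, and absolute Pearson correlation stands in for "most significant relationship" (your workflow may rank by p-value instead).

```python
import numpy as np


def top_k_by_correlation(X, y, names, k=4):
    """Return the k column names with the largest absolute correlation to y."""
    corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    order = np.argsort(corr)[::-1][:k]
    return [names[j] for j in order]


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to new rows."""
    return coef[0] + X @ coef[1:]


# Synthetic stand-in data (NOT the real game data); column names are hypothetical.
rng = np.random.default_rng(0)
names = ["FirstDowns", "PassYds", "RushYds", "Turnovers", "OppPoints", "Noise"]
n_games = 38
X = rng.normal(size=(n_games, len(names)))
# Hypothetical "TM" target: points scored, driven by the first four columns.
y = 24 + 4 * X[:, 0] + 3 * X[:, 1] + 2 * X[:, 2] - 3 * X[:, 3] + rng.normal(size=n_games)

# Hold out a pair of regular season games and the two Super Bowl rows.
# These row positions are arbitrary placeholders.
holdout = np.array([10, 11, 18, 37])
train = np.setdiff1d(np.arange(n_games), holdout)

# Select variables using training rows only, then fit and score the holdout.
selected = top_k_by_correlation(X[train], y[train], names, k=4)
cols = [names.index(name) for name in selected]
coef = fit_ols(X[train][:, cols], y[train])
residuals = y[holdout] - predict(coef, X[holdout][:, cols])

for row, res in zip(holdout, residuals):
    print(f"game {row}: actual minus predicted = {res:+.1f}")
```

If the Super Bowl residuals look no larger than the regular season ones, that is at least a hint the model treats it as "just another game", though four held-out rows is far too few to say anything firm.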
Extra Challenge: How'd you do on this Jeopardy Category? Admittedly, I was in good company with these contestants!
Here is my attempt! I leveraged some of what I learned from Challenge 18 (specifically @samjohnson)
Solution is attached. Played around with a couple of different variables. Because we are predicting points scored, I figured it would be best to look solely at the offensive stats as opposed to any defensive stats. Something like Time of Possession would have been interesting.
Also E-A-G-L-E-S. EAGLES! Super Bowl Champs!
Once again this week I find myself working with data and a topic I know nothing about, so I stuck to developing my model based purely on p-values.
I may have gone a little overboard. Particularly since I was rooting for the Eagles. (For no other reason than I'm a Seahawks fan, and the Eagle is also a bird.)