# Weekly Challenge

Solve the challenge, share your solution and summit the ranks of our Community!

## Challenge #103: Just another game?

Alteryx Alumni (Retired)

A solution to last week's Challenge has been posted HERE!

I learned the worst thing to say to a bunch of football fans: "The Super Bowl? It's just another football game". The looks of horror on my colleagues' faces when I said that are burned into my memory forever. And let me warn you: don't say that when you have anything important to do, because you will be subject to an hour-long debate about player stats, offense/defense match-ups, the importance of turnovers (and not just the apple-flavored ones) and the proper way to make chili.

Anyway, after this debate, I did what any data nerd would do: I took to the internet in search of datasets and fired up my Alteryx Designer to answer this question: Is the Super Bowl just another game? I decided that I'd do a little experiment. Using the Predictive tools and data from the New England Patriots' 2016 and 2017 seasons, I wanted to see how a linear regression model developed on regular season games (including post-season) performed when used to predict the number of points the Patriots would score during the Super Bowl.

First, I downloaded data (source: here) for the New England Patriots for the 2016 and 2017 seasons (provided as inputs in the Start File), which required a bit of parsing to prepare for later use. Then, I set out on some data investigation to begin my linear regression model development. My approach (which may not be the same one you use in your modeling approach) was to choose the four (4) variables from the "Score", "Offense" and "Defense" data categories with the most significant relationship to the variable "TM", which indicates the number of points the Patriots scored. With my variables selected, I began the model creation. My approach (which may differ from yours): develop the model on all values except for one pair of regular season games and the Super Bowl games.
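For anyone who wants to sanity-check the variable selection step outside Designer, here is a minimal Python sketch of one way to do it: rank each candidate column by the absolute value of its Pearson correlation with "TM" and keep the strongest four. This is an assumption about what "most significant relationship" could mean, not necessarily the method used in the posted solution. The numbers are made-up toy values rather than the real Patriots data, and the column names `Defense_TO` and `Offense_RushY` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def top_predictors(columns, target, k=4):
    """Rank candidate columns by |correlation| with the target; keep the k strongest."""
    scores = {
        name: abs(pearson(values, columns[target]))
        for name, values in columns.items()
        if name != target
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up toy values for illustration only, NOT the real game data.
games = {
    "TM":            [27, 23, 31, 16, 35, 30],
    "Offense_TotYd": [380, 340, 420, 290, 460, 410],
    "Offense_1stD":  [22, 20, 25, 15, 28, 24],
    "Offense_PassY": [270, 250, 300, 200, 330, 280],
    "Defense_TO":    [1, 2, 0, 1, 3, 1],        # hypothetical column name
    "Offense_RushY": [110, 90, 120, 90, 130, 130],  # hypothetical column name
}

selected = top_predictors(games, "TM", k=4)
print(selected)
```

Correlation only captures linear, one-variable-at-a-time relationships, so a ranking like this is a starting point for model building rather than a final answer.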

What's the difference between your predicted values and actual values for your regular season games and Super Bowl games?  Is the Super Bowl just another game?

Extra Challenge: How'd you do on this Jeopardy Category?  Admittedly, I was in good company with these contestants!

16 - Nebula

Here is my attempt! I leveraged some of what I learned from Challenge 18 (specifically @samjohnson)

Spoiler

I dynamically parsed the fields to columns. Based on correlation, I ended up using Offensive 1st Downs, Total Yards, Passing Yards, Defensive Turnovers, Expected Offensive Points, and Expected Defensive Points. I used PCA. I tried Linear Regression, Neural Network, Forest Model, and Gamma Regression. Neural Network had the lowest Root Mean Squared Error.
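As a rough illustration of the comparison step, this Python sketch computes the root mean squared error for several sets of predictions and picks the lowest. The actual and predicted values are made up, and `model_a`, `model_b` and `model_c` are hypothetical placeholders rather than the results of the models named above.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


# Made-up points scored and made-up predictions, for illustration only.
actual = [27, 23, 31, 16]
predictions = {
    "model_a": [25, 24, 33, 20],
    "model_b": [26, 23, 30, 18],
    "model_c": [30, 20, 28, 21],
}

scores = {name: rmse(actual, preds) for name, preds in predictions.items()}
best = min(scores, key=scores.get)  # model with the lowest RMSE
print(scores, best)
```

Because RMSE squares each error before averaging, a single badly missed game weighs more heavily than several small misses.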

11 - Bolide

Solution is attached. Played around with a couple of different variables. Because we are predicting points scored, I figured it would be best to look solely at the offensive stats as opposed to any defensive stats. Something like Time of Possession would have been interesting.

Also E-A-G-L-E-S. EAGLES!  Super Bowl Champs!

6 - Meteoroid

9 - Comet

Once again this week I find myself working with data and a topic I know nothing about, so I stuck to developing my model based purely on p-values.

Spoiler
I used only 3 variables (Offense_PassY, Offense_TotYd and Offense_1stD) because only they were statistically significant, with a p-value < 0.05. I think that makes sense now that I've read @nick_ceneviva's comment that points scored should be influenced by offence stats rather than defence.

The prediction for week 13 is way better than for the Super Bowl game.

7 - Meteor

A pretty terrible but repeatable bit of data prep. Trying to combine the first 2 rows for the column headers wasn't necessary, but I thought it would be good to have if this is used in the future or for other purposes!

Completed

8 - Asteroid

First Predictive attempt, really enjoyed this challenge.

14 - Magnetar

Here's my solution.

Spoiler
ACE Emeritus

This was a lot of fun to put together!

14 - Magnetar

I may have gone a little overboard. Particularly since I was rooting for the Eagles. (For no other reason than I'm a Seahawks fan, and the Eagle is also a bird.)

Spoiler
I chose to solve this with a macro so that I could see which week of regular season play was best suited to the predictions made from the remaining weeks, as well as which combination of weeks was the best predictor of Super Bowl performance (by excluding a particular week of anomalous performance). My iterative macro went through a linear regression model using all weeks except the Super Bowl plus each progressive week based on iteration number, through the entire season. Then I found the minimum absolute value point differential for both the regular season performance and the Super Bowl performance to find my best predictive groupings.

Based on the results, it appears that Week 15 was the most consistent with overall performance from 2016-2017, and the Super Bowl was best predicted by excluding the results from Week 4.
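The hold-out loop described above can be sketched in Python roughly as follows. This is a simplified stand-in for the iterative macro, not a reproduction of it: it assumes a single predictor, fits an ordinary least-squares line on every week except one, and records the absolute point differential for the excluded week. The helper names and the toy values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def holdout_errors(weeks, xs, ys):
    """For each week, fit on all other weeks and return |actual - predicted|."""
    errors = {}
    for i, week in enumerate(weeks):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors[week] = abs(ys[i] - (slope * xs[i] + intercept))
    return errors


# Made-up toy values, for illustration only. Week 4 is a deliberate outlier.
weeks = [1, 2, 3, 4, 5]
total_yards = [300, 350, 400, 450, 500]
points = [20, 25, 30, 45, 40]

errors = holdout_errors(weeks, total_yards, points)
most_typical = min(errors, key=errors.get)  # week best predicted by the rest
print(errors, most_typical)
```

The same loop could be repeated with the Super Bowl row as the fixed prediction target to see which excluded week improves that prediction the most.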

Also, this workflow (with the addition of a couple of unnecessary Select tools) lent itself quite nicely to a goalpost layout. Sooooo... I went for the 2-point conversion. 🙂

Cheers!

NJ