
## Examining the Accuracy of the Predictions in the 2016 Presidential Election App

Alteryx

This very long election cycle has finally come to an end, complete with what, for many, was a surprising conclusion based on the pre-election polls (or, more accurately, the interpretation of those polls). Given these pre-election polls and the subsequent critical press coverage, it is natural to wonder whether the data presented in the 2016 Presidential Election App had some "issues." While it appears that we overestimated Hillary Clinton's lead in the popular vote share (as with the national polling averages, we placed the difference at 3.9 percentage points, while it appears that the true difference is about 1.2 percentage points in Clinton's favor), our predictions at the county level were correct for over 96% of counties based on the election returns data available at the time of analysis. Not all the votes have been counted in this election as I write this, but most of the remaining votes are in California and Washington State, and are unlikely to have a substantive effect on the results reported here. In this blog post I will provide a description of the comparison measures used, and of how the model stacks up against those measures.

We have comparison data for 3,111 counties, covering all states except Alaska, which does not report data at the county level.¹ The election returns data used were those reported by National Public Radio via their website. I refer to the model that we developed for the Presidential Election App as the two-part model, a name based on the process used to construct it.

Figure 1 provides a simple scatter plot of the fitted and actual values of the Democratic candidate major party vote share (the percentage of votes cast for the two major parties that went to the Democratic candidate). The figure indicates a fairly tight bunching of points along a roughly 45 degree line, which is what we are hoping to see. The tightness of the points along a 45 degree line corresponds to a correlation coefficient of 0.95 between the fitted and actual values.

Figure 1. Predicted versus Actual Value for the Two-Part Model
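As a minimal sketch of the two quantities discussed here, the Python below computes the Democratic major party vote share from vote counts and the Pearson correlation between predicted and actual shares. The function names and the three sample counties are hypothetical and made up for illustration; they are not the App's data or code.

```python
import math

def dem_major_party_share(dem_votes, rep_votes):
    """Share of the two-party vote that went to the Democratic candidate."""
    return dem_votes / (dem_votes + rep_votes)

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up (Democratic, Republican) vote counts for three hypothetical counties
actual_votes = [(6000, 4000), (2500, 7500), (4800, 5200)]
actual = [dem_major_party_share(d, r) for d, r in actual_votes]

# Made-up predicted Democratic major party shares for the same counties
predicted = [0.57, 0.30, 0.50]

r = pearson_correlation(predicted, actual)
print(f"correlation between predicted and actual shares: {r:.3f}")
```

Third-party votes are excluded from the denominator, which matches the definition of the major party vote share given above.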

Even though the predicted values are numeric values, we can declare a winner in each county based on the candidate that received the greatest number of predicted votes, allowing for a categorization of the results for each candidate into predicted won/lost groups. In turn, this allows us to create what is known (somewhat unintuitively) as a confusion matrix. In this case the confusion matrix is a two-by-two table. In the table, the columns give the number of counties each candidate won, while the rows give the number of counties each candidate was predicted to win. The interior cells of the table give the number of counties that fall in each cross classification. If the first column gives the number of counties that Clinton won, and the first row gives the number of counties that Clinton was expected to win, then the first cell in the table contains the number of counties that Clinton both won and was expected to win. Similarly, the second cell in the first row of the table gives the number of counties that Clinton was expected to win, but which Trump actually won. The full confusion matrix of the results is provided in Table 1.

| Predicted | Actual: Democratic | Actual: Republican |
| --- | --- | --- |
| Democratic | 457 | 91 |
| Republican | 29 | 2534 |

Table 1. The Confusion Matrix of the Predicted and Actual County-Level Winner
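The overall and per-candidate accuracy figures follow directly from the four counts in Table 1. The sketch below, with hypothetical function names, shows one way to tally a confusion matrix from predicted and actual shares and then recomputes the accuracies from the Table 1 counts. Treating a share of exactly 0.5 as a Republican win is an arbitrary assumption made only for this illustration.

```python
from collections import Counter

def winner_from_share(dem_share):
    """Label the county winner from the Democratic major party share.
    Assumption for this sketch: a share of exactly 0.5 counts as Republican."""
    return "Democratic" if dem_share > 0.5 else "Republican"

def confusion_matrix(predicted_shares, actual_shares):
    """Count counties by (predicted winner, actual winner)."""
    counts = Counter()
    for p, a in zip(predicted_shares, actual_shares):
        counts[(winner_from_share(p), winner_from_share(a))] += 1
    return counts

def accuracies(cm):
    """Overall accuracy plus accuracy within the counties each candidate won."""
    dd = cm[("Democratic", "Democratic")]
    dr = cm[("Democratic", "Republican")]  # predicted Democratic, went Republican
    rd = cm[("Republican", "Democratic")]  # predicted Republican, went Democratic
    rr = cm[("Republican", "Republican")]
    return {
        "overall": (dd + rr) / (dd + dr + rd + rr),
        "democratic": dd / (dd + rd),  # share of Democratic-won counties predicted correctly
        "republican": rr / (dr + rr),  # share of Republican-won counties predicted correctly
    }

# Counts taken from Table 1, keyed as (predicted, actual)
table1 = {
    ("Democratic", "Democratic"): 457,
    ("Democratic", "Republican"): 91,
    ("Republican", "Democratic"): 29,
    ("Republican", "Republican"): 2534,
}

for name, value in accuracies(table1).items():
    print(f"{name}: {value:.4f}")
```

With the Table 1 counts this gives 2991/3111 overall, 457/486 for the counties Clinton won, and 2534/2625 for the counties Trump won.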

The values in the confusion matrix can be used to quickly calculate the percentage correctly predicted in total and for each of the two candidates, which are common metrics used to examine the predictive efficacy of classification models. In addition, since the models are actually predicting numeric quantities, we can also use comparison metrics designed for these types of models. These measures include the root mean square error (or RMSE), which is the square root of the mean of the squared errors; the mean absolute deviation (or MAD), which is the mean of the absolute values of the errors; and the mean absolute percentage error (or MAPE), which is the mean of the absolute value of the error given as a percentage of the actual value. For all three of these measures, smaller values are preferred to larger ones. In addition, the correlation between the actual and fitted values is another summary measure, and values closer to one are preferred for this measure. Table 2 provides the full set of summary measures.

| Measure | Value |
| --- | --- |
| Overall Accuracy | 0.9614 |
| Clinton Accuracy | 0.9403 |
| Trump Accuracy | 0.9653 |
| Correlation | 0.9497 |
| RMSE | 0.0579 |
| MAD | 0.0451 |
| MAPE | 20.13 |

Table 2. The County-Level Prediction Accuracy Measures for the 2016 Election
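The three error measures in Table 2 can be sketched in a few lines of Python, following the verbal definitions given above. The function names and the sample shares are hypothetical. The sketch also assumes that RMSE and MAD are on the 0 to 1 vote share scale while MAPE is reported in percent, which is how the values in Table 2 appear to be scaled.

```python
import math

def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mad(actual, predicted):
    """Mean absolute deviation: mean of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent of the actual value."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Made-up Democratic major party shares for three hypothetical counties
actual = [0.60, 0.25, 0.48]
predicted = [0.57, 0.30, 0.50]

print(f"RMSE: {rmse(actual, predicted):.4f}")
print(f"MAD:  {mad(actual, predicted):.4f}")
print(f"MAPE: {mape(actual, predicted):.2f}")
```

Note that `mape` divides by the actual value, so it is undefined for a county where the actual share is zero.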

The accuracy figures indicate that the model was able to very accurately predict the counties that both Clinton (at 94% accuracy) and Trump (at nearly 97% accuracy) won. The correlation between fitted and actual values is nearly 0.95, which is very good. The root mean square error of just under 0.06 is very reasonable, as is the mean absolute deviation of 0.045 (this suggests the predictions were off on average by about 4.5 percentage points across counties). The mean absolute percentage error is a bit higher than we would like (at around 20%), but this is likely due to a known issue with this metric: when the actual value is low, even a small absolute error translates into a large percentage error. Overall, the summary measures indicate that the two-part model predicted the actual county-level results very well, and, as a result, the projections contained in the 2016 Presidential Election App appear to have a high degree of accuracy.
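To illustrate why low actual values inflate the mean absolute percentage error, the short sketch below applies the same absolute error of 0.045 (roughly the reported MAD) to three made-up actual vote shares. The function name and the shares are hypothetical and are not drawn from the App's data.

```python
def absolute_percentage_error(actual, predicted):
    """Absolute error expressed as a percentage of the actual value."""
    return 100 * abs(actual - predicted) / abs(actual)

error = 0.045  # same absolute error in every case

# Made-up actual Democratic major party shares, from low to high
for actual_share in (0.15, 0.30, 0.60):
    ape = absolute_percentage_error(actual_share, actual_share + error)
    print(f"actual share {actual_share:.2f}: absolute percentage error {ape:.1f}%")
```

The same 4.5 point miss is a 30% error when the actual share is 0.15 but only a 7.5% error when the actual share is 0.60, so counties with lopsided results pull the average percentage error upward.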

### Comparing the Predicted County-Level Map with the Actual County-Level Map

Figure 2 provides the actual county-level results map, while Figure 3 provides the final map of county-level predictions from the 2016 Presidential Election App. An examination of these two maps indicates some differences, but the maps themselves are fundamentally very similar. The one noticeable difference is that the hue of red is somewhat darker in the Great Plains and the northern portions of the Intermountain West in the actual map than in the final predicted values. Despite this, the maps further demonstrate the predictive efficacy of the two-part model used in creating the Presidential Election App.

Figure 2. The Actual County-Level Results

Figure 3. The Final Presidential Election App County-Level Predictions

Figure 4 shows the counties that were incorrectly predicted, with the color indicating the party of the candidate that actually won (blue for Clinton, red for Trump). While there are some counties that were incorrectly predicted, very few of these touched one another, and in some cases neighboring errors favored the opposite candidate. The thing that does stand out is the number of counties in upstate New York and northern New England that went to Donald Trump, but were predicted to go to Hillary Clinton. Interestingly, Clinton still won all of these states, although Trump did receive an electoral vote from Maine's second congressional district. The Southeast also shows a tendency to have a concentration of counties that went for Trump that were expected to go to Clinton, while the far western states have more counties that went to Clinton than the model predicted. However, these patterns are not strong. What we don't see are systematic incorrect predictions in the states that voted for Trump, but were thought to be part of Clinton's "blue wall". Specifically, Wisconsin has three counties that were expected to go to Clinton, but instead went to Trump, and three counties that were expected to go to Trump, but went to Clinton. In Michigan there is only one county that was not predicted correctly, and that county went to Clinton. Finally, Pennsylvania also had only one county incorrectly predicted, although that county went for Trump.

Figure 4. The Incorrectly Predicted Counties

The relative lack of spatially grouped errors in the predictions of the 2016 Presidential Election App is in marked contrast to the 2012 Presidential Election App. This reflects improvements in the data and methodology used in this year’s election compared to 2012, a topic we turn to now.

## Comparing the Accuracy in the 2016 Presidential Election App with the 2012 Presidential Election App

This is the second election in a row for which we have produced a Presidential Election App, and there are some substantial differences in both the data and the models used between the two elections. The 2012 App was based on polling data from Gallup, and in a year when the polls overall performed well, Gallup's did not. In addition, we only had data from just over 3,000 respondents (over three waves of the Gallup poll) in 2012 to work with, while we had slightly over 36,000 respondents in the October 14 to 24 wave of the SurveyMonkey polling data we used in this year's election app. Finally, and probably most importantly, the Gallup data did not include a respondent's county of residence, so we were not able to augment the polling data with county-level data the way we could this year. This is particularly important when it comes to the use of the Partisan Voting Index (or PVI) as a predictor.

Given the substantial change in the data and predictor variables used, comparing the accuracy of the 2012 Election App data with that from this year's App is a bit of an apples and oranges comparison. However, it should indicate the extent to which the data and methodology used have improved between this election and the last presidential election. In the 2012 Presidential Election App, we correctly predicted 85% of counties, which is good, but not nearly as good as the 96% we achieved this year. In terms of the major party candidates, we correctly predicted 84% of the counties Barack Obama won (compared to 94% for Hillary Clinton), and 86% of the counties Mitt Romney won (compared to 97% for Donald Trump). The correlation between the fitted and actual values for the Democratic major party share in 2012 is 0.80, compared with 0.95 in 2016. Based on this, the improvements that have been made in the data and methods between the two elections have been substantial.

## There is One More Blog Left

It turns out that we developed an even better set of county-level predictions, which is based on a simple average of the two-part model presented here and a so-called "fundamentals" model. In the last post associated with this election, I'll describe what a fundamentals model is, what went into the one we created, how well it and the average of the two models did in predicting the actual election results, and finally examine several characteristics of the small number of counties that both models incorrectly predicted. The insights from this should help us to improve on the approach for the next presidential election, although thinking about the next election is a daunting prospect at this moment.

¹ Alaska calls its county equivalents boroughs. However, boroughs are an optional level of government, and there are many areas in the state that do not fall into an organized borough.

Dan Putler
Chief Scientist

Dr. Dan Putler is the Chief Scientist at Alteryx, where he is responsible for developing and implementing the product road map for predictive analytics. He has over 30 years of experience in developing predictive analytics models for companies and organizations that cover a large number of industry verticals, ranging from the performing arts to B2B financial services. He is co-author of the book, “Customer and Business Analytics: Applied Data Mining for Business Decision Making Using R”, which is published by Chapman and Hall/CRC Press. Prior to joining Alteryx, Dan was a professor of marketing and marketing research at the University of British Columbia's Sauder School of Business and Purdue University’s Krannert School of Management.


Alteryx Partner

I would love to replicate this for the coming presidency referendum in Turkey...
