Statistical inference

Statistical Inference is the branch of statistics dedicated to distinguishing patterns arising from signal versus those arising from chance. It is a broad topic and, in this section, we review the basics using polls as a motivating example. To illustrate the concepts, we supplement mathematical formulas with Monte Carlo simulations and R code. We motivate the concepts with election forecasting as a case study.

The day before the 2008 presidential election, Nate Silver’s FiveThirtyEight stated that “Barack Obama appears poised for a decisive electoral victory”. They went further and predicted that Obama would win the election with 349 electoral votes to 189, and the popular vote by a margin of 6.1%. FiveThirtyEight also attached a probabilistic statement to their prediction claiming that Obama had a 91% chance of winning the election. The predictions were quite accurate since, in the final results, Obama won the electoral college 365 to 173 and the popular vote by a 7.2% difference. Their performance in the 2008 election brought FiveThirtyEight to the attention of political pundits and TV personalities. Four years later, the week before the 2012 presidential election, FiveThirtyEight’s Nate Silver was giving Obama a 90% chance of winning despite many of the experts thinking the final results would be closer. Political commentator Joe Scarborough said during his show1:

Anybody that thinks that this race is anything but a toss-up right now is such an ideologue … they’re jokes.

To which Nate Silver responded via Twitter:

If you think it’s a toss-up, let’s bet. If Obama wins, you donate $1,000 to the American Red Cross. If Romney wins, I do. Deal?

In 2016, Silver was not as certain and gave Hillary Clinton only a 71% of winning. In contrast, many other forecasters were almost certain she would win. She lost. But 71% is still more than 50%, so was Mr. Silver wrong? And what does probability mean in this context anyway? Are dice being tossed or cards being dealt somewhere?

In this part of the book, we will demonstrate how the probability concepts covered in the previous part can be applied to develop statistical approaches that render polls effective tools. Although in the United States the popular vote does not determine the result of the presidential election, we will use it as an illustrative and straightforward example to introduce the main concepts of statistical inference. Forecasting an election is a more complex process that involves combining results from 50 states and DC. We will delve into this subject in the last chapter, after we cover all the basic concepts. Specifically, we will learn the statistical concepts necessary to define estimates and margins of errors for the popular vote, and show how these are used to construct confidence intervals. Once we grasp these ideas, we will be able to understand statistical power and p-values, concepts that are ubiquitous in, for example, the academic literature. We will then aggregate data from different pollsters to highlight the shortcomings of the models used by traditional pollsters and present a method for improving these models. To understand probabilistic statements about the chances of a candidate winning, we will introduce Bayesian modeling. Finally, we put it all together using hierarchical models to recreate the simplified version of the FiveThirtyEight model and apply it to the 2016 election.