Statistical Inference
Statistical Inference is the branch of statistics dedicated to distinguishing patterns arising from signal versus those arising from chance. It is a broad topic and, in this section, we review the basics using polls as a motivating example. To illustrate the concepts, we supplement mathematical formulas with Monte Carlo simulations and R code. We demonstrate the practical value of these concepts by using election forecasting as a case study. Readers already familiar with statistical inference theory and interested primarily in the case study can skip ahead to Chapter 12, Hierarchical Models.
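As a first taste of the kind of Monte Carlo simulation we use throughout, here is a minimal sketch: we repeatedly draw polls from a population with known support for a candidate and watch how much the sample proportion varies by chance alone. The support rate `p` and poll size `N` below are hypothetical values chosen only for illustration.

```r
# Simulate B polls of size N from a population in which a proportion p
# supports a candidate, then examine the chance variability of the
# sample proportions. (p, N, and B are hypothetical choices.)
set.seed(1)
p <- 0.52    # true proportion supporting the candidate
N <- 1000    # poll sample size
B <- 10000   # number of simulated polls
p_hats <- replicate(B, mean(sample(c(1, 0), N, replace = TRUE,
                                   prob = c(p, 1 - p))))
mean(p_hats)  # close to p: the poll estimate is unbiased
sd(p_hats)    # spread due to chance alone: roughly sqrt(p * (1 - p) / N)
```

Simulations like this one let us check the mathematical formulas we derive, and they are often the easiest way to see why a pattern in poll data might be nothing more than chance.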
The day before the 2008 presidential election, Nate Silver’s FiveThirtyEight stated that “Barack Obama appears poised for a decisive electoral victory”. They went further and predicted that Obama would win the election with 349 electoral votes to 189, and the popular vote by a margin of 6.1%. FiveThirtyEight also attached a probabilistic statement to their prediction claiming that Obama had a 91% chance of winning the election. The predictions were quite accurate since, in the final results, Obama won the electoral college 365 to 173 and the popular vote by a 7.2% difference. Their performance in the 2008 election brought FiveThirtyEight to the attention of political pundits and TV personalities. Four years later, the week before the 2012 presidential election, FiveThirtyEight’s Nate Silver was giving Obama a 90% chance of winning despite many of the experts thinking the final results would be closer. Political commentator Joe Scarborough said during his show¹:
Anybody that thinks that this race is anything but a toss-up right now is such an ideologue … they’re jokes.
To which Nate Silver responded via Twitter:
If you think it’s a toss-up, let’s bet. If Obama wins, you donate $1,000 to the American Red Cross. If Romney wins, I do. Deal?
Obama won the election.
In 2016, Silver was not as certain and gave Hillary Clinton only a 71% chance of winning. In contrast, many other forecasters were almost certain she would win. She lost. But 71% is still more than 50%, so was Mr. Silver wrong? And what does probability mean in this context anyway?
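One way to interpret such a statement: if forecasts like the 71% figure are well calibrated, the trailing candidate should still win roughly 29% of comparable races, so a single upset does not by itself show the forecast was wrong. A quick simulation, with the number of hypothetical elections made up for illustration, shows how often we would expect the favorite to lose:

```r
# Simulate many hypothetical elections in which the favorite wins
# each one with probability 0.71. (The 10,000 repetitions are a
# hypothetical choice; the point is the long-run frequency.)
set.seed(2016)
outcomes <- rbinom(10000, size = 1, prob = 0.71)  # 1 = favorite wins
mean(outcomes == 0)  # proportion of upsets, close to 0.29
```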
In this part of the book, we show how the probability concepts from the previous part can be applied to develop statistical methods that make polls effective tools for understanding public opinion. Although in the United States the popular vote does not determine the presidential outcome, we use it as a simple and illustrative example to introduce the core ideas of statistical inference.
We begin by learning how to define estimates and margins of error for the popular vote and how these lead naturally to confidence intervals. We then extend this framework by aggregating data from multiple pollsters to examine the limitations of traditional models and explore improvements. To interpret probabilistic statements about the likelihood of a candidate winning, we introduce Bayesian modeling. Finally, we combine these ideas through hierarchical models to build a simplified version of the FiveThirtyEight election model and apply it to the 2016 election.
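As a preview of the first of these steps, here is a sketch of the estimate and margin of error developed in the coming chapters, applied to a hypothetical poll; the observed proportion and sample size below are made-up values, not data from any real poll.

```r
# Estimate, standard error, and 95% confidence interval for a
# hypothetical poll in which a proportion x_hat of N respondents
# favor a candidate.
N <- 1000          # hypothetical poll size
x_hat <- 0.48      # hypothetical observed proportion
se_hat <- sqrt(x_hat * (1 - x_hat) / N)  # estimated standard error
moe <- 1.96 * se_hat                     # margin of error
c(x_hat - moe, x_hat + moe)              # approximate 95% confidence interval
```

The 1.96 multiplier comes from the normal approximation to the distribution of the sample proportion, a result we derive and then probe with Monte Carlo simulations.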
We conclude with two widely taught topics that, while not required for the case study, are central to statistical practice: statistical power and p-values. The part ends with a brief introduction to the bootstrap, an inferential method we will revisit in the machine learning part.
¹ https://www.youtube.com/watch?v=TbKkjm-gheY