Donald Trump’s victory in the 2016 US Presidential election was a shocker – all the poll-based statistical models had shown Hillary Clinton as the strong favourite, with the probability of her winning ranging from 70% to as high as 99%. The predictable outcome of the 2012 US election, and the success of these models in that cycle, had bred a lot of complacency among some of the data journalists this time around. Further, the mainstream media had started blindly believing the outputs of the models, in many cases without understanding the assumptions that went into them, their limitations or the implications of what they were saying.
This herd mentality among the mainstream media, the pundits and the data journalists became so extreme in the final week that Nate Silver was subjected to intense criticism because his model showed a comparatively low 65%-70% probability of Clinton winning the election, while some of the other models were showing a Clinton Presidency as a near certainty. In my last post, I had explained why the model employed by Fivethirtyeight (the website run by Nate Silver) was more conservative in projecting a Clinton victory and why, in my view, it was right in doing so.
A statistical model is an imperfect simulation of the real world. Since it is impossible to replicate a chaotic, massive and dynamic process like the US presidential election, forecasters instead construct simple models, which take some input variables and, through a pre-defined interaction between these variables, find the most probable outcome. Election forecasting models take in state and national level polls (and, in some cases, demographic and economic factors) and try to predict the outcome of the US election on the basis of how these polls change.
However, a model, by its very nature, is a simplistic rendering of a complex process and hence there is some uncertainty attached to its output. A well calibrated model is one where this uncertainty is properly accounted for, i.e. if the model is used to predict a high frequency event, then over the long run the probability of the event as predicted by the model and the frequency with which it actually occurs in the real world should converge.
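The idea of calibration can be made concrete with a small sketch. The code below is only an illustration on synthetic data, not any forecaster's actual method: the function name `calibration_table`, the number of bins and the simulated events are all assumptions made for this example. It groups forecasts by predicted probability and compares each group's average forecast with how often the event actually happened.

```python
import random

def calibration_table(forecasts, outcomes, n_bins=5):
    """Group forecasts into equal-width probability bins and compare the
    mean forecast in each bin with the observed frequency of the event."""
    bins = [[] for _ in range(n_bins)]
    for p, happened in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, happened))
    table = []
    for members in bins:
        if not members:
            continue  # skip empty bins
        mean_forecast = sum(p for p, _ in members) / len(members)
        observed = sum(1 for _, h in members if h) / len(members)
        table.append((mean_forecast, observed, len(members)))
    return table

# Synthetic "high frequency" events (an assumption for illustration only):
# each event occurs with exactly the forecast probability, so the
# forecasts are well calibrated by construction.
rng = random.Random(0)
forecasts = [rng.random() for _ in range(50_000)]
outcomes = [rng.random() < p for p in forecasts]

for mean_forecast, observed, n in calibration_table(forecasts, outcomes):
    print(f"forecast {mean_forecast:.2f}  observed {observed:.2f}  (n={n})")
```

For a well calibrated model the two columns track each other closely in every bin. An overconfident model would show observed frequencies noticeably below its highest forecasts.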
The US presidential election, though, is not a high frequency event. It is not possible to run the election 10,000 times to check whether Clinton wins 7,000 of them, as was predicted by Silver’s model. However, as Silver had mentioned repeatedly and as was mentioned in this post, there were a number of sources of uncertainty around the outcome, which were apparent even during the days of the pre-election consensus among pundits that Clinton had more or less won the election.
Unfortunately for Clinton, and unfortunately for the models, almost all the sources of uncertainty (i.e. the things that could have gone wrong for Clinton) went wrong on Election Day. Here is a litany of the factors that made us relatively bearish on Clinton’s chances, almost all of which came true:
- The final average of national polls has historically differed from the result on Election Day by around 2 percentage points. There have been some years when the polls have differed more markedly. For example, in 2012, the difference was around 3 percentage points, a detail that is often missed in discussions of how stable and predictable the election was that year. In fact, if the error had been in the other direction, Mitt Romney would have won the election. This time, the error favoured the Republicans. Clinton is expected to win the popular vote this year, perhaps by 1-1.5 percentage points by the time all the votes are counted. In contrast, the Fivethirtyeight model had Clinton winning the national vote by 3.6 percentage points. Thus, the polling error, at least at the national level, was mostly in line with the historical errors.
- There was a lot of volatility in the polling data in a number of swing states. For example, even as the national polls started recovering in the week after news came out that the FBI was re-opening the investigation into Clinton’s emails, a number of swing state polls started showing extremely tight races. In contrast, there were also a number of polls which showed Clinton ahead by multiple points in the states which were part of her firewall. The biggest example was New Hampshire, where the polls of Clinton’s performance varied by around 15 percentage points. The volatile state polls were an indication of the uncertainty of the outcome, which sadly went unheeded at that time.
- The polls also swung wildly over the course of the election – from a narrow Trump victory to a decisive Clinton win. Unfortunately for Clinton, one of the worst stretches for her campaign came just before the election, when the polls tightened considerably. Although the polls showed some rebound for Clinton in the dying days of the campaign, it was not enough to bring her out of the woods.
- There were a large number of undecided and third party voters in this election, many more than in 2012. Exit polls showed that a higher share of such late-deciding voters chose Trump, which contributed to the polls being skewed in favour of Clinton.
- Even in the days before the election, there was an unusually large number of swing states that were very closely contested. It was plausible for either candidate to win in almost 15 states. Such a high number of swing states made many Electoral College combinations possible, thus increasing the uncertainty of the race. As the results came in, many of these states indeed turned out to be too close to call. In fact, Clinton lost Florida, Michigan, Pennsylvania and Wisconsin – all by extremely narrow margins. If she had won these states, she would have won the presidency.
- There was additional uncertainty on account of the problems being faced by the polling industry in general – the increasing cost of carrying out surveys and falling respondent participation rates. These had led pollsters to badly misjudge multiple high profile events in the recent past, like the UK parliamentary election, the Israeli Knesset election and the Brexit referendum. The poor performance of the polling industry continued on Election Day in the USA. Even though the error in national polls was in line with the historical average, in a number of states it was glaring. For example, in Wisconsin, the RealClearPolitics average of polls had Clinton leading Trump by 6.5 percentage points. Clinton did not trail in a single poll in the entire election cycle in that state. Even then, she lost the vote by 1 percentage point. Similarly, the polls were very bullish on a Clinton victory in both Pennsylvania and Michigan, part of Clinton’s so-called firewall. To exacerbate the issue, many of these states, including Michigan and Minnesota, were not polled very frequently in the days leading up to the election, which perhaps lulled the Democrats into a false sense of security based on limited data.
- The polling errors in different states, especially geographically and demographically similar states, are generally correlated with each other, i.e. the errors move in the same direction. For example, if the polls are understating the level of Trump support in Michigan, they are likely to do so in neighbouring Pennsylvania and Minnesota as well. This is what happened on election night, as the polls badly missed the mark in all the critical Rust Belt and Midwest states. If the errors in the polls had instead cancelled each other out, Clinton could have won comfortably in some of these states.
- Clinton was always at a disadvantage in the Electoral College, relative to the popular vote. As her base of Hispanic voters is more concentrated in some red and blue states, she over-performed Obama in a handful of such states. However, in almost all the swing states, her performance was much worse than that of Obama. As a result, Clinton was always going to be the underdog if her lead over Trump fell to around 1 percentage point. This outcome came true on election night, leading to the bizarre scenario of Clinton winning the popular vote narrowly while losing the Electoral College decisively.
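The point above about correlated state errors can be illustrated with a small Monte Carlo sketch. This is not the Fivethirtyeight model or any other published model: the function `prob_lose_all`, the three equal polled leads, the error size and the split between shared and state-specific error are all hypothetical values chosen only to show the effect.

```python
import random

def prob_lose_all(leads, total_sd, shared_fraction, n_sims=20_000, seed=1):
    """Estimate the chance that the poll leader loses every state.

    leads: polled leads in percentage points (hypothetical values).
    total_sd: standard deviation of the polling error in each state.
    shared_fraction: share of the error variance common to all states
        (0.0 = independent errors, 1.0 = the same error everywhere).
    """
    rng = random.Random(seed)
    shared_sd = total_sd * shared_fraction ** 0.5
    local_sd = total_sd * (1.0 - shared_fraction) ** 0.5
    lost_all = 0
    for _ in range(n_sims):
        shared = rng.gauss(0.0, shared_sd)  # error common to all states
        # The leader loses a state when lead + shared + local error < 0.
        if all(lead + shared + rng.gauss(0.0, local_sd) < 0 for lead in leads):
            lost_all += 1
    return lost_all / n_sims

leads = [3.0, 3.0, 3.0]  # hypothetical polled leads, not real 2016 data
independent = prob_lose_all(leads, total_sd=4.0, shared_fraction=0.0)
correlated = prob_lose_all(leads, total_sd=4.0, shared_fraction=0.75)
print(f"independent errors: {independent:.3f}")
print(f"correlated errors:  {correlated:.3f}")
```

Each state has the same total error in both runs. Only the share of that error common to all states changes. Under these assumptions, losing all three states together is far more likely with correlated errors, which is why a model that treats state errors as independent tends to be overconfident.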
While all these factors, the values of which were not known at the time of forecasting, went against Clinton, there were also some other factors which were considered positive for Clinton and yet proved to be false dawns and red herrings for her. For example, the models did not consider the early voting data, which almost conclusively pointed to a Clinton victory in Nevada and, more tentatively, to some advantage in Florida and North Carolina. On Election Day, Clinton was able to hold on to her lead in Nevada, but in Florida and North Carolina it dissipated in the face of massive rural, white voting in favour of Trump. Further, Clinton’s extensive investment in the ground game and ‘get out the vote’ operations was expected to lead to her over-performing the polls, especially in swing states. But the results indicated that there was hardly any turnout advantage for Clinton in most swing states.
To conclude, Clinton was doomed by a mixture of uncertain factors, almost all of which ultimately broke against her. A large portion of Trump’s unexpected victory can be explained by a variety of known unknowns, i.e. factors which were known and whose outcomes were uncertain, but which ultimately favoured Trump. Given that the real world is vastly complex, models are designed to be simple and presidential elections are discrete, infrequent events, such errors cannot be ruled out. This is why it is always a good idea for modellers to recognize these uncertainties and calibrate their models accordingly. The Fivethirtyeight model had accounted for most of these factors, but many others had not, leading to a preposterous level of confidence in a Clinton victory that never came.