The Monumental Task Facing Bernie Sanders

While the Republican Party presidential primary has largely degenerated into an unpredictable, messy, nasty free-for-all, the Democratic Party primary has proceeded mostly along predictable lines. The race has an overwhelming establishment-backed favourite (Hillary Clinton) and an insurgent (Bernie Sanders) waging an improbable issue-based campaign against her from the left. As often happens in such contests, the underdog has won a few races and caused a few flutters here and there, but the establishment candidate has been able to lock up the critical contests and looks on course to register an easy victory in the end.

Political contests dominated by two major candidates are often predictable along demographic lines, i.e. one candidate is the strongly preferred choice of certain demographic groups and the other candidate of the rest. It is no different in this year’s Democratic primary. Clinton has been heavily favoured by African American voters and, to a lesser extent, by Latino voters, whereas Sanders has drawn his votes mostly from white voters.

The following chart, which plots the vote share of Sanders against the black share of the population in each of the fifteen states that have voted so far in the primaries, sums up the relationship.

Chart I

For every one percentage point increase in the black share of a state's population, the vote share of Sanders comes down by roughly 1.5 percentage points.

Sanders also does worse among Hispanics, although the relationship is very weak. Further, from the limited evidence that we have had so far, he has done better in the New England states and in states that have conducted caucuses. Thus, we can build a slightly more complex model by incorporating the percentage of Hispanic population in each state, a dummy variable for the location of the state (1 for Vermont, the home state of Sanders, 0.5 for other New England states and 0 for other states) and a dummy variable indicating whether the vote is held through a primary or a caucus (0 for primary and 1 for caucus). We can carry out a multiple regression with the vote share of Sanders in each state as the dependent variable and the above-mentioned variables as the independent variables.
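For readers who want to see the mechanics, here is a minimal sketch of such a multiple regression using ordinary least squares. The state-level data behind the post is not reproduced here, so the inputs below are synthetic and the "true" coefficients are made up purely to generate them; the variable names are my own. Only the structure (an intercept plus the four explanatory variables described above) follows the text.

```python
import numpy as np

# Synthetic inputs for 15 hypothetical states (NOT the actual primary data).
rng = np.random.default_rng(0)
n_states = 15
black_share = rng.uniform(0.01, 0.40, n_states)      # proportion of population
hispanic_share = rng.uniform(0.01, 0.30, n_states)   # proportion of population
# Location dummy: 1 = Vermont, 0.5 = other New England state, 0 = elsewhere.
region = np.array([1.0, 0.5, 0.5] + [0.0] * 12)
# Contest dummy: 0 = primary, 1 = caucus.
caucus = np.array([0, 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0], dtype=float)

# Made-up coefficients, used only to generate illustrative vote shares.
true_coef = np.array([0.50, -0.80, 0.30, 0.10, -0.10])

# Design matrix: a column of ones for the intercept, then the four variables.
X = np.column_stack([np.ones(n_states), black_share, region, caucus, hispanic_share])
vote_share = X @ true_coef + rng.normal(0.0, 0.02, n_states)

# Ordinary least squares: choose coefficients minimising squared residuals.
coef, _, rank, _ = np.linalg.lstsq(X, vote_share, rcond=None)
residuals = vote_share - X @ coef

print("estimated coefficients:", np.round(coef, 2))
```

With only 15 noisy observations the estimates will generally differ somewhat from the coefficients used to generate the data, which is the same small-sample problem the real model faces.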

The results of the regression are shown in the following table:

Chart II

It throws up the following simple equation:

Y = 0.49-0.83*X1+0.32*X2+0.12*X3-0.12*X4


Y = Percentage of Democratic voters likely to support Sanders in a particular state

X1 = Percentage of Black Population in the state

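X2 = Dummy Variable indicating if the state is in New England (1 for Vermont, 0.5 for other New England states, 0 for other states)

X3 = Dummy Variable indicating whether the state held a primary or a caucus (0 for primary, 1 for caucus)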

X4 = Percentage of Hispanic Population in the state
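As a quick illustration, the equation can be written as a small function. This is only a sketch: I am assuming that Y, X1 and X4 are expressed as proportions between 0 and 1 (which is what an intercept of 0.49 suggests), and the function name and the example state are hypothetical, not taken from the data.

```python
def predicted_sanders_share(black_share, new_england, caucus, hispanic_share):
    """Apply the fitted equation Y = 0.49 - 0.83*X1 + 0.32*X2 + 0.12*X3 - 0.12*X4.

    black_share, hispanic_share: proportions in [0, 1] (assumed units).
    new_england: 1 for Vermont, 0.5 for other New England states, 0 otherwise.
    caucus: 1 for a caucus, 0 for a primary.
    """
    return (0.49
            - 0.83 * black_share
            + 0.32 * new_england
            + 0.12 * caucus
            - 0.12 * hispanic_share)


# Hypothetical state: 10% black, 5% Hispanic, outside New England.
as_primary = predicted_sanders_share(0.10, 0, 0, 0.05)
as_caucus = predicted_sanders_share(0.10, 0, 1, 0.05)
print(round(as_primary, 3))  # 0.401
print(round(as_caucus, 3))   # 0.521
```

The same hypothetical state moves from about 40% to about 52% for Sanders simply by switching from a primary to a caucus, which shows how much weight the model places on the format of the contest.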

We can say with 90% confidence that each of these variables has a significant relationship with the dependent variable, apart from X4, which at best has a tenuous relationship with the vote share of Bernie Sanders as of now. The variable has nevertheless been retained in the model, as many of the states scheduled to vote later in the calendar have very high percentages of Latino voters.

The plot of the actual vote shares of Sanders against the vote shares predicted by the model is shown below:

Chart III

The regression shows a strong relationship between the independent variables and the dependent variable. Around 89% of the variation in Bernie Sanders’s vote share across the states that have voted so far is explained by this simple model.
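The "share of variation explained" is the R-squared of the regression. A minimal sketch of how it is computed follows; the three actual and predicted values are toy numbers chosen for easy arithmetic, not the state results.

```python
def r_squared(actual, predicted):
    """R-squared = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total


# Toy values only: three made-up vote shares and model predictions.
toy_actual = [0.60, 0.40, 0.50]
toy_predicted = [0.55, 0.45, 0.50]
toy_r2 = r_squared(toy_actual, toy_predicted)
print(round(toy_r2, 2))  # 0.75
```

An R-squared of 0.89 on 15 observations with four explanatory variables should be read with some caution, since a small sample makes a good in-sample fit easier to achieve.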

If we use the equation thrown up by the model to predict the vote share of Sanders in the states that vote later in the calendar, we get the following forecasts:

Name of the State | Scheduled to Vote on (DD-MM-YYYY) | Delegates Offered | Predicted Vote Share
Kansas | 05-03-2016 | 33 | 54%
Louisiana | 05-03-2016 | 51 | 21%
Nebraska | 05-03-2016 | 25 | 55%
Maine | 06-03-2016 | 25 | 76%
Michigan | 08-03-2016 | 130 | 37%
Mississippi | 08-03-2016 | 36 | 17%
Florida | 15-03-2016 | 214 | 32%
Illinois | 15-03-2016 | 156 | 35%
Missouri | 15-03-2016 | 71 | 39%
North Carolina | 15-03-2016 | 107 | 30%
Ohio | 15-03-2016 | 143 | 38%
Arizona | 22-03-2016 | 75 | 42%
Idaho | 22-03-2016 | 23 | 59%
Utah | 22-03-2016 | 33 | 58%
Alaska | 26-03-2016 | 16 | 59%
Hawaii | 26-03-2016 | 25 | 58%
Washington | 26-03-2016 | 101 | 56%
Wisconsin | 05-04-2016 | 86 | 43%
Wyoming | 09-04-2016 | 14 | 58%
New York | 19-04-2016 | 247 | 32%
Connecticut | 26-04-2016 | 55 | 54%
Delaware | 26-04-2016 | 21 | 30%
Maryland | 26-04-2016 | 95 | 23%
Pennsylvania | 26-04-2016 | 189 | 39%
Rhode Island | 26-04-2016 | 24 | 57%
Indiana | 03-05-2016 | 83 | 40%
West Virginia | 10-05-2016 | 29 | 46%
Kentucky | 17-05-2016 | 55 | 42%
Oregon | 17-05-2016 | 61 | 46%
Puerto Rico | 05-06-2016 | 60 | 27%
California | 07-06-2016 | 475 | 39%
Montana | 07-06-2016 | 21 | 48%
New Jersey | 07-06-2016 | 126 | 34%
New Mexico | 07-06-2016 | 34 | 41%
North Dakota | 07-06-2016 | 18 | 47%
South Dakota | 07-06-2016 | 20 | 47%
District of Columbia | 14-06-2016 | 20 | 7%

Now, before we go any further, let us be clear about the problems with this model. First of all, the number of data points, which is only 15, is woefully inadequate to make predictions with a high degree of confidence. Secondly, there is no guarantee that the relationship the regression analysis has come up with shall hold in the future. Thirdly, the sample is not random. In fact, it features a disproportionate share of states from the Northeast and the South and has very few states from the Midwest or the West. Fourthly, because of the low number of data points, the standard errors of the coefficients are pretty high, i.e. the 95% confidence interval of each prediction is pretty wide.

Even with these caveats in place, it makes sense to gather some insight from the vote shares being predicted here. The table above shows that Sanders is strongest in the states of the West and in New England, especially in places like Kansas, Nebraska, Maine, Idaho, Utah, Alaska, Hawaii, Washington, Wyoming, Rhode Island and Connecticut, most of which are also holding caucuses instead of primaries. He is also expected to be competitive in a handful of states in the Midwest, Appalachia and the West Coast.

It may be worthwhile to mention here that the Democratic primary has two types of delegates: pledged and unpledged. The pledged delegates are elected through the state-wise primaries and caucuses and are bound to support a particular candidate at the convention, depending on the results of the primary or caucus of the state they are representing. The unpledged delegates (also called superdelegates), on the other hand, are members of the party establishment who are not bound to support any candidate at the convention. There are a total of 4051 pledged delegates and 712 unpledged delegates. Thus, in order to win the Democratic nomination, a candidate has to win the support of at least 2382 of the total of 4763 delegates on offer.
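The majority threshold follows directly from the delegate counts quoted above. A short sketch of the arithmetic (variable names are mine):

```python
pledged_delegates = 4051
unpledged_delegates = 712

total_delegates = pledged_delegates + unpledged_delegates
# A simple majority is the smallest whole number strictly above half the total.
majority_needed = total_delegates // 2 + 1

print(total_delegates, majority_needed)  # 4763 2382
```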

Now Clinton already has a lead of around 439 among the 479 superdelegates who have committed to support one of the two candidates. She also has a lead of around 160 among pledged delegates, on the basis of her performance in the states that have voted so far. If we assume that the remaining uncommitted superdelegates and the US territories (like the Virgin Islands, the Northern Mariana Islands, etc.) shall split their support equally between the two candidates (which is generous to Sanders, considering that superdelegates have overwhelmingly committed to Clinton so far), Sanders will need to win 1798 of the delegates in the states which are left to vote.

However, the problem for Sanders is that he is not very competitive in delegate-rich states like California, New York, Florida, Pennsylvania, Ohio, Michigan, etc. The Democratic primary system largely allots delegates on a proportional basis. If we assume that Sanders performs exactly the way the model has suggested, in the various states as well as in the various congressional districts within each state, he will end up with around 1154 delegates from these states as opposed to 1843 delegates for Hillary Clinton.

So, how much does Sanders need to overperform relative to his performance till now? In the table below, I have calculated the number of delegates that Sanders may hope to win from the remaining states in the base case scenario (i.e. as predicted by the model) and with his vote share raised across all states in steps of 5 percentage points.
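The base case can be roughly reproduced from the forecast table with a strictly proportional split. The sketch below is a simplification under my own assumptions: it uses the rounded percentages as published, splits each state's delegates in exact proportion to vote share, and ignores district-level allocation, whole-delegate rounding and the party's 15% viability threshold. The names are hypothetical.

```python
# (state, delegates offered, predicted Sanders vote share in %), from the forecast table.
FORECAST = [
    ("Kansas", 33, 54), ("Louisiana", 51, 21), ("Nebraska", 25, 55),
    ("Maine", 25, 76), ("Michigan", 130, 37), ("Mississippi", 36, 17),
    ("Florida", 214, 32), ("Illinois", 156, 35), ("Missouri", 71, 39),
    ("North Carolina", 107, 30), ("Ohio", 143, 38), ("Arizona", 75, 42),
    ("Idaho", 23, 59), ("Utah", 33, 58), ("Alaska", 16, 59),
    ("Hawaii", 25, 58), ("Washington", 101, 56), ("Wisconsin", 86, 43),
    ("Wyoming", 14, 58), ("New York", 247, 32), ("Connecticut", 55, 54),
    ("Delaware", 21, 30), ("Maryland", 95, 23), ("Pennsylvania", 189, 39),
    ("Rhode Island", 24, 57), ("Indiana", 83, 40), ("West Virginia", 29, 46),
    ("Kentucky", 55, 42), ("Oregon", 61, 46), ("Puerto Rico", 60, 27),
    ("California", 475, 39), ("Montana", 21, 48), ("New Jersey", 126, 34),
    ("New Mexico", 34, 41), ("North Dakota", 18, 47), ("South Dakota", 20, 47),
    ("District of Columbia", 20, 7),
]


def proportional_split(forecast):
    """Return (sanders, clinton, total) delegates under a purely proportional split."""
    total = sum(delegates for _, delegates, _ in forecast)
    sanders = sum(delegates * pct for _, delegates, pct in forecast) / 100
    return sanders, total - sanders, total


sanders, clinton, total = proportional_split(FORECAST)
print(total, round(sanders), round(clinton))  # 2997 1152 1845
```

The result lands within a couple of delegates of the 1154 to 1843 split quoted above; the small gap presumably comes from the percentages in the table being rounded.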

Scenario | Delegates won by Sanders | Delegates won by Clinton
Base Case | 1154 | 1843
5% increase across states | 1304 | 1693
10% increase across states | 1454 | 1543
15% increase across states | 1604 | 1393
20% increase across states | 1754 | 1243
25% increase across states | 1903 | 1094

In order to reach the magic number of 1798 delegates from the remaining states, Sanders will need a uniform vote swing of roughly 21 to 22 percentage points (the table above puts 1798 between the 20% and 25% scenarios), which is almost impossible. Even if we ignore superdelegates, in order to overcome his deficit of around 160 pledged delegates, Sanders will need to win 1579 delegates from the remaining states, i.e. he will need a vote swing of almost 15 percentage points across states, which is no easy task.
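These requirements can be back-calculated with a simple linear approximation, assuming a uniform swing adds the same fraction of every state's delegates to the Sanders column. The sketch uses the base case of 1154 delegates out of 2997 (1154 + 1843) and the pledged-delegate lead of around 160 mentioned earlier; the function names are my own.

```python
REMAINING_DELEGATES = 2997   # 1154 + 1843 in the base case
BASE_SANDERS = 1154          # base-case delegates for Sanders


def delegates_after_swing(swing):
    """Sanders delegates if his vote share rises by `swing` (a fraction) in every state."""
    return BASE_SANDERS + swing * REMAINING_DELEGATES


def required_swing(target_delegates):
    """Uniform swing (a fraction) needed to reach a delegate target."""
    return (target_delegates - BASE_SANDERS) / REMAINING_DELEGATES


def target_to_overcome(deficit):
    """Smallest delegate haul whose margin over the opponent exceeds `deficit`."""
    return (REMAINING_DELEGATES + deficit) // 2 + 1


print(round(delegates_after_swing(0.05)))    # 1304
print(target_to_overcome(160))               # 1579
print(round(required_swing(1798) * 100, 1))  # 21.5
print(round(required_swing(1579) * 100, 1))  # 14.2
```

Because the allocation is treated as exactly proportional, each percentage point of swing is worth about 30 delegates, so targets that look close in delegate terms can still be far apart in vote-share terms.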

In short, Sanders is facing an uphill climb. He has to dramatically increase his appeal to demographics which have not been so favourable to him till now and he will need to do it in a very short span of time. Otherwise, Hillary Clinton will easily become the Democratic Party nominee.





