Both Geography and Race Matter in the Democratic Primary

We have known for some time now that the vote shares of the two candidates in the Democratic primary race are highly dependent on the racial makeup of the voters. Bernie Sanders typically does well in regions with a large white population, whereas Hillary Clinton is much more dominant in more racially diverse areas, especially in regions with a large percentage of African American voters.

In order to find out how strong that relationship is, I created scatter plots showing how the vote shares of Bernie Sanders and Hillary Clinton vary with the percentage of African American population in each of the counties that have voted so far. I could not get the data for two states, Minnesota and Kansas, which reported their results by congressional district rather than by county. The data for the remaining states has been taken from the New York Times website and the Politico website.

Figure: Clinton’s Vote Share in Each County Plotted Against Black Population

Figure: Sanders’s Vote Share in Each County Plotted Against Black Population

From the above charts, it is obvious that there is a significant relationship between the vote share of the two candidates and the percentage of African American population.

However, notice that the vote share of the two candidates is widely dispersed at low black population levels. In counties where the black population is 0% (which is true for a lot of counties, especially outside the South), the vote share of Clinton ranges anywhere between 10% and 80%. But as the percentage of African American population in a county increases, it starts to vote along more predictable lines.

There is also a large geographical disparity in this relationship between the vote share of the candidates and the percentage of African American population. In the South, for instance, the relationship is more or less obvious.

Figure: Clinton’s Vote Share in Each County Plotted Against Black Population, Only in the South

The relationship is far less obvious in the non-southern states which have voted so far.

Figure: Clinton’s Vote Share in Each County Plotted Against Black Population, Outside the South

Of course, one problem is the amount of data we have. Most of the states which have voted so far either have a large black population and belong to the South, or have an overwhelmingly white population. Michigan was in fact the first state that had a significant black population and did not belong to the South. When we constructed a demographic model that assumed black voters in Michigan would vote exactly the way black voters in the South did, it predicted that Sanders’s vote share would be restricted to the low 40s. Instead he won almost 50% of the vote, indicating that black voters outside the South may be voting differently from black voters in the South.
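The post does not spell out how the demographic model is built, so the following is only a minimal sketch of the general idea, assuming the simplest possible form: a candidate's statewide share is the average of his support in two groups (black voters and everyone else), weighted by each group's share of the electorate. The function name and all input values are hypothetical and are not the figures used for Michigan.

```python
def predicted_share(black_share, support_among_black, support_among_other):
    """Weighted average of a candidate's support across two voter groups.

    Assumption: the electorate is split into only two groups, which is a
    simplification and not necessarily the model used in this post.
    """
    if not 0.0 <= black_share <= 1.0:
        raise ValueError("black_share must be a fraction between 0 and 1")
    return (black_share * support_among_black
            + (1.0 - black_share) * support_among_other)


# Hypothetical inputs, for illustration only.
example = predicted_share(black_share=0.2,
                          support_among_black=0.1,
                          support_among_other=0.5)
print(f"Predicted vote share: {example:.0%}")
```

Under this form, a prediction that falls short of the actual result means that at least one of the assumed group-level support figures was too low, which is the inference drawn from Michigan above.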

In fact, this leads us to an interesting question: is the unqualified success of Clinton in various Southern states a function of the fact that she is polling better across the board in Southern states, or a function of the fact that she is getting a higher percentage of votes from black voters than from white voters (and that black voters are more numerous in Southern states)?

The answer is probably that it is a combination of both factors. Clinton has consistently polled better among black voters than among white voters. But at the same time, voters across demographics have been more supportive of Clinton in the Southern states than in the states outside the South.

In fact, we can find significant correlations if we plot Clinton’s margin of victory or loss in each state separately against the percentage of African American population in the state and against the geography of the state (represented by a dummy variable: +1 for Vermont, +0.5 for other New England states, 0 for non-Southern states outside New England and -1 for Southern states).
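As a rough illustration of the coding scheme just described, here is a minimal sketch in plain Python. The set of Southern states is passed in as a parameter because the post does not list which states it counts as Southern; the function names are hypothetical, and the Pearson correlation is one common way to measure such a relationship, not necessarily the one used for the charts here.

```python
from math import sqrt

NEW_ENGLAND = {"Connecticut", "Maine", "Massachusetts",
               "New Hampshire", "Rhode Island", "Vermont"}


def location_code(state, southern_states):
    """Code a state's geography: +1 Vermont, +0.5 other New England,
    -1 Southern, 0 everything else."""
    if state == "Vermont":
        return 1.0
    if state in NEW_ENGLAND:
        return 0.5
    if state in southern_states:
        return -1.0
    return 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Assumed classification, for illustration only.
southern = {"Mississippi"}
for name in ["Vermont", "Massachusetts", "Michigan", "Mississippi"]:
    print(name, location_code(name, southern))
```

With the per-state margins and location codes in two lists, `pearson(codes, margins)` would give the strength of the relationship shown in the location chart.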

Figure: Difference Between Vote Share of Clinton and Sanders in Each State Plotted Against Black Population

Figure: Difference Between Vote Share of Clinton and Sanders in Each State Plotted Against Dummy Variable Representing Location of the State

We may also look at the exit poll data on how Clinton fared in various Southern and non-Southern states.

Figure: Performance of the Candidates Among White Voters

Figure: Performance of the Candidates Among Black Voters

There are some data constraints here. A number of the non-Southern states which have voted so far did so through caucuses, for which no exit polls were conducted. Moreover, the black population in most of the non-Southern states that have voted so far is so small that it did not form a meaningful sample in the exit polls. But the overall picture is very clear: there is a large geographic variation in the preferences of white voters as well as black voters. And in Southern states, both white and black voters prefer Clinton by a margin higher than that seen in non-Southern states. Observe how the only two states with a significant black population where less than 80% of black voters support Clinton are located outside the South.

In fact, this partly explains why on the last Super Tuesday, 8th March 2016, Clinton won so comfortably in Mississippi while she lost Michigan in a close fight. Our demographic model, which did not differentiate between the voting patterns of Southern and non-Southern states, correctly predicted Clinton’s margin of victory in Mississippi but overestimated her vote share in Michigan by around 10%.

So what does it mean for the election going forward? We can revise our previous model by assigning a value of -1 to Southern states in the location variable (in addition to +1 for Vermont, +0.5 for other New England states and 0 for other states). There was no such value in the previous model since it was not yet clear whether Clinton’s support level would actually come down among black voters living outside the South.

If we carry out our predictions with this revised model (there is a problem with it, since the two independent variables, the percentage of African American population in a state and its location, are fairly correlated with each other; but let’s bear with it for now), we find that among the Super Tuesday states, Sanders would lose Florida and North Carolina badly (by 33 and 39 percentage points respectively), would lose Illinois by around 8 points and would more or less tie Clinton in both Ohio and Missouri, with maybe a slight edge in Missouri. The model also shows that if the remaining states continue to vote in this fashion, Clinton would gain a lead of around 135 pledged delegates among the states yet to vote, adding to the lead of around 221 pledged delegates she already has. If, instead, Sanders is to erase Clinton’s lead among the pledged delegates awarded so far, he needs to overperform his fundamentals by 7%, which is probably somewhere between difficult and improbable.
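To make the caveat about correlated independent variables concrete, here is a minimal sketch, assuming the revised model is an ordinary least-squares fit of the margin on two predictors (black population share and the location code). That functional form is an assumption, as are the function names; the data below is synthetic and is not election data. The variance inflation factor (VIF) is a standard measure of how much the correlation between predictors inflates the uncertainty of the fitted coefficients.

```python
def _centered_sums(x1, x2):
    """Return the means and centered sums of squares/products of two predictors."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    return m1, m2, s11, s22, s12


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = intercept + b1*x1 + b2*x2 (closed form)."""
    n = len(y)
    if not (len(x1) == len(x2) == n) or n < 3:
        raise ValueError("need three equal-length sequences of length >= 3")
    m1, m2, s11, s22, s12 = _centered_sums(x1, x2)
    my = sum(y) / n
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are perfectly collinear or constant")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


def variance_inflation_factor(x1, x2):
    """VIF for a two-predictor model: 1 / (1 - r^2), r = corr(x1, x2)."""
    _, _, s11, s22, s12 = _centered_sums(x1, x2)
    if s11 == 0 or s22 == 0:
        raise ValueError("VIF is undefined for a constant predictor")
    r_squared = s12 ** 2 / (s11 * s22)
    return float("inf") if r_squared >= 1.0 else 1.0 / (1.0 - r_squared)


# Synthetic data, for illustration only: y = 1 + 2*x1 + 3*x2 exactly.
x1 = [0.0, 1.0, 2.0, 3.0]
x2 = [1.0, 0.0, 1.0, 0.0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]
print(fit_two_predictors(x1, x2, y))
print(variance_inflation_factor(x1, x2))
```

A VIF close to 1 means the two predictors carry largely independent information; the more strongly black population share and the location code move together, the larger the VIF and the less reliably the model can separate the effect of race from the effect of geography.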

Note what that implies for the states voting today. Sanders will need to hold Clinton to a margin of around 20%-25% in Florida and North Carolina, win Illinois by around 5% and win Ohio and Missouri by low double digits, in order to be on course to win the nomination. If that sounds like a daunting task, that is because it is.

A more realistic and yet optimistic scenario for Sanders may be to pull off close victories in Ohio, Illinois and Missouri and hold Clinton to a lead of less than 20% in both Florida and North Carolina. This will ensure he gets a lot of positive press coverage from today. After this Super Tuesday, Sanders enters a very favourable stretch and he may be expected to win at least seven (Arizona, Idaho, Utah, Alaska, Washington, Wisconsin and Wyoming) of the next eight states, some of them by huge margins. It may be argued that Sanders is also the favourite in Hawaii, the eighth state, but we have hardly any information on how Asian-American voters have voted so far in this cycle. Nevertheless, this would mean Clinton would go without a single meaningful victory for more than a month and would enter the home stretch of the campaign looking far less like an inevitable nominee. Sanders will still need to sweep the Appalachian states (West Virginia and Kentucky), post commanding victories in the remaining New England states (Connecticut and Rhode Island) and even eke out wins in states like California, Pennsylvania and Indiana. But the race will continue to drag on till June and will challenge Clinton more than she ever expected.

On the other hand, Clinton may just post handy victories in all the five states voting today, including huge margins in the two Southern states (North Carolina and Florida), opening up a delegate lead that may simply be impossible for Sanders to close.

No matter what happens today though, one thing is for sure: we shall get a far clearer picture of where the race is headed once the votes in the five states are counted.








