Correlations Are Real

By Mark Roulo

Last Updated: August-2019


Predicting the 2016 Presidential Election

The pre-election polling models had Wisconsin, Michigan and Pennsylvania going for Hillary Clinton with the following probabilities (from 538): Wisconsin 84%, Michigan 79%, Pennsylvania 77%. Most models assumed that the results in the three states were independent, so Hillary would be expected to win at least two of the states about 90% of the time, with about a 50% chance of winning all three.

Combining the electoral votes per state with the individual probabilities of winning each state, we get the following table of outcome probabilities (the rows above the line are the ones where Hillary wins at least two states):

             Wisconsin  Michigan  Pennsylvania   Electoral Votes
Probability  10         16        20             Hillary  Trump
51%          Hillary    Hillary   Hillary        46       0
10%          Trump      Hillary   Hillary        36       10
14%          Hillary    Trump     Hillary        30       16
15%          Hillary    Hillary   Trump          26       20
----------------------------------------------------------------
3%           Trump      Trump     Hillary        20       26
3%           Trump      Hillary   Trump          16       30
4%           Hillary    Trump     Trump          10       36
<1%          Trump      Trump     Trump          0        46
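
The independent-case table can be reproduced by enumerating all eight outcomes and multiplying the per-state probabilities. The following is a minimal Python sketch of that arithmetic; the variable names are illustrative, and the only inputs are the 538 probabilities and electoral vote counts quoted above.

```python
from itertools import product

# Clinton win probabilities quoted in the text (from 538) and electoral votes.
p_clinton = {"Wisconsin": 0.84, "Michigan": 0.79, "Pennsylvania": 0.77}
electoral_votes = {"Wisconsin": 10, "Michigan": 16, "Pennsylvania": 20}

states = list(p_clinton)
total_ev = sum(electoral_votes.values())

# Each outcome is a tuple of booleans: True means Clinton wins that state.
rows = []
for outcome in product([True, False], repeat=len(states)):
    prob = 1.0
    clinton_ev = 0
    for state, clinton_wins in zip(states, outcome):
        # Independence assumption: multiply the per-state probabilities.
        prob *= p_clinton[state] if clinton_wins else 1.0 - p_clinton[state]
        if clinton_wins:
            clinton_ev += electoral_votes[state]
    rows.append((prob, outcome, clinton_ev, total_ev - clinton_ev))

p_at_least_two = sum(prob for prob, outcome, _, _ in rows if sum(outcome) >= 2)
p_all_three = sum(prob for prob, outcome, _, _ in rows if all(outcome))
p_trump_sweep = sum(prob for prob, outcome, _, _ in rows if not any(outcome))

for prob, outcome, clinton_ev, trump_ev in rows:
    winners = ["Hillary" if w else "Trump" for w in outcome]
    print(f"{prob:6.1%}  {winners}  Hillary {clinton_ev}  Trump {trump_ev}")

print(f"Hillary wins at least two states: {p_at_least_two:.1%}")
print(f"Hillary wins all three states:    {p_all_three:.1%}")
print(f"Trump wins all three states:      {p_trump_sweep:.1%}")
```

The three summary lines correspond to the "about 90%", "about 50%" and "less than 1%" figures discussed in the text.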

If the three states are totally correlated (so they all have the same result, whatever it is), then the probability table looks much more like this:

             Wisconsin  Michigan  Pennsylvania   Electoral Votes
Probability  10         16        20             Hillary  Trump
80%          Hillary    Hillary   Hillary        46       0
20%          Trump      Trump     Trump          0        46

This is a big difference: in the uncorrelated case the chance of Trump winning all three states is less than 1%, but in the correlated case it is close to 20%.
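
To get a feel for how quickly the risk grows between these two extremes, here is a rough Python sketch that blends them. The blending weight is purely hypothetical: it is an illustrative assumption, not something taken from any polling model, and a weight of 0 reproduces the independent case while a weight of 1 reproduces the fully correlated case.

```python
# Clinton win probabilities quoted in the text (from 538).
p_clinton = [0.84, 0.79, 0.77]

# Independent case: Trump has to win each state separately.
p_sweep_independent = 1.0
for p in p_clinton:
    p_sweep_independent *= 1.0 - p

# Fully correlated case: one shared outcome, roughly 20% for Trump per the text.
p_sweep_correlated = 0.20


def trump_sweep(weight):
    """Chance Trump wins all three states under a simple blend.

    `weight` is a hypothetical mixing parameter: the share of the time the
    states behave as fully correlated rather than independent.
    """
    return weight * p_sweep_correlated + (1.0 - weight) * p_sweep_independent


for weight in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"weight={weight:.2f}  P(Trump wins all three) = {trump_sweep(weight):.1%}")
```

Even a modest weight on the correlated case moves the chance of a Trump sweep well above the sub-1% figure from the independent table.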

The reality is somewhere between these two, but possibly closer to the totally correlated result table. Consider the results for the twelve previous presidential elections:

Year Wisconsin Michigan Pennsylvania
1968 Republican Democrat Democrat
1972 Republican Republican Republican
1976 Democrat Republican Democrat
1980 Republican Republican Republican
1984 Republican Republican Republican
1988 Democrat Republican Republican
1992 Democrat Democrat Democrat
1996 Democrat Democrat Democrat
2000 Democrat Democrat Democrat
2004 Democrat Democrat Democrat
2008 Democrat Democrat Democrat
2012 Democrat Democrat Democrat

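In these 12 elections prior to 2016, the three states voted identically nine times. Under complete independence, and taking each state as equally likely to go to either party, all three states match with probability 2 × (1/2)^3 = 1/4, so a binomial expansion would have predicted identical votes in only about three of the 12 elections. It would therefore have been reasonable to assume some fairly high correlation.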
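
The count of nine and the independence baseline of three can be checked with a short Python sketch. It assumes, as the figure of three implies, that each state is equally likely to go to either party in a given election; that 50/50 split is an assumption of this sketch, and the binomial tail at the end is an extra illustration rather than a number from the text.

```python
from math import comb

# (Wisconsin, Michigan, Pennsylvania) winners from the table above.
results = {
    1968: ("R", "D", "D"), 1972: ("R", "R", "R"), 1976: ("D", "R", "D"),
    1980: ("R", "R", "R"), 1984: ("R", "R", "R"), 1988: ("D", "R", "R"),
    1992: ("D", "D", "D"), 1996: ("D", "D", "D"), 2000: ("D", "D", "D"),
    2004: ("D", "D", "D"), 2008: ("D", "D", "D"), 2012: ("D", "D", "D"),
}

n = len(results)
# An election counts as "identical" when all three states picked the same party.
identical = sum(1 for winners in results.values() if len(set(winners)) == 1)

# Assumption: independent states, each 50/50, so all three match
# with probability 2 * (1/2)**3 = 1/4 (all D or all R).
p_same = 2 * 0.5 ** 3
expected = n * p_same

# Binomial probability of seeing at least `identical` matching years
# if the states really were independent coin flips.
tail = sum(
    comb(n, k) * p_same ** k * (1 - p_same) ** (n - k)
    for k in range(identical, n + 1)
)

print(f"Elections: {n}, identical results: {identical}")
print(f"Expected identical results under independence: {expected:.1f}")
print(f"P(at least {identical} identical | independence) = {tail:.4%}")
```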

This fairly high correlation would have implied that Hillary had a better chance of winning all three states than the non-correlated case suggests: closer to 80% than to 50%. Much more dangerous for her, she also had a much better chance of losing all three states than expected: closer to 20% than to 1%.