Monday, July 29, 2019

Clustering States

I was curious about clustering states into regions based on voting behaviour, similar to what Nate Silver did back in 2008. The basic idea is to use each state's presidential vote to find correlated behaviour in neighboring states, then form clusters. The result:

The colors are chosen only to distinguish neighboring clusters; they encode no other relationship and are purely aesthetic. There are 11 distinct clusters.

As a first stab, for a given state, I created a list of the percentage of votes each party received in every presidential election since 1976. Schematically, one such list looks like: (1976 D %, 1976 R %, 1976 third %, 1980 D %, ...). Each list describes one single state.
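A minimal sketch of how such a list could be built for one state. The function name `vote_vector`, the party labels, and the numbers are all hypothetical placeholders for illustration; they are not real election data and not necessarily how the original code is organized.

```python
# Hypothetical vote shares (percent) for one state; NOT real election data.
results = {
    1976: {"D": 50.0, "R": 48.0, "other": 2.0},
    1980: {"D": 41.0, "R": 51.0, "other": 8.0},
}


def vote_vector(results_by_year, parties=("D", "R", "other")):
    """Flatten {year: {party: pct}} into one list, ordered by year, then party."""
    vector = []
    for year in sorted(results_by_year):
        for party in parties:
            # A party missing from a given year is treated as 0% (an assumption).
            vector.append(results_by_year[year].get(party, 0.0))
    return vector


vector = vote_vector(results)
```

Keeping the same year and party order for every state matters, since the lists are compared position by position later on.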

With each state having a list of these percentages, I compute the correlation between each pair of states' voting behaviour. This produces a long list of connections between states, along with the correlation for each. I throw out all connections whose correlation falls below the 92.5th percentile (keeping only the top 7.5%), then cluster neighboring states if they have a strong enough correlation.
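The steps above could be sketched roughly as follows. This is an illustration under several assumptions, not the actual code: it assumes Pearson correlation, assumes the cutoff is taken over all state pairs before filtering to neighbors, and merges the surviving neighbor links with a simple union-find. The names `pearson`, `cluster_states`, and `keep_fraction`, and the toy data, are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes nonzero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cluster_states(vectors, neighbors, keep_fraction=0.075):
    """Group neighboring states whose correlation is in the top keep_fraction."""
    # Correlation for every unordered pair of states.
    pairs = {
        (a, b): pearson(vectors[a], vectors[b])
        for a, b in combinations(sorted(vectors), 2)
    }
    # Cutoff: the weakest correlation still inside the top keep_fraction.
    ranked = sorted(pairs.values(), reverse=True)
    n_keep = max(1, round(keep_fraction * len(ranked)))
    cutoff = ranked[n_keep - 1]

    # Union-find over states; each state starts as its own cluster.
    parent = {s: s for s in vectors}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    for (a, b), r in pairs.items():
        adjacent = b in neighbors.get(a, ()) or a in neighbors.get(b, ())
        if r >= cutoff and adjacent:
            parent[find(a)] = find(b)

    clusters = {}
    for s in vectors:
        clusters.setdefault(find(s), set()).add(s)
    return list(clusters.values())


# Toy data with made-up "states" and vote lists.
vectors = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [1.0, 3.0, 2.0, 4.0],
}
neighbors = {"A": {"B", "C"}, "B": {"A"}, "C": {"A", "D"}, "D": {"C"}}
clusters = cluster_states(vectors, neighbors)
```

With this toy input only the A and B link survives the cutoff, so they merge and the other two stay on their own. States left alone this way are the kind that would need a manual decision.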

Texas, Vermont, and West Virginia did not correlate within the top 7.5% with any neighboring states, but they still correlated quite strongly with their neighbors. Strongly enough for me to manually cluster them with specific neighbors. They were the only states I assigned manually.

It would be interesting to include voting for House members and Senators. I have not investigated this avenue yet.

As always, the code related to this is available on GitHub.
