Monday, July 29, 2019

Clustering States

I was curious about clustering states into regions based on voting behaviour, similar to what Nate Silver did back in 2008. The basic idea is to use each state's presidential vote to find correlated behaviour in neighboring states, then form clusters. The result:

The colors are chosen only to distinguish neighboring clusters; the choice is purely aesthetic and encodes no relationship between clusters. There are 11 distinct clusters.

As a first stab, for a given state, I created a list of the percentage of the vote each party received in every presidential election since 1976. One such list might schematically look like: (1976 D %, 1976 R %, 1976 third %, 1980 D %, ...). This is all for one single state.

With each state having such a list of percentages, I compute the correlation between each pair of states' voting behaviour. This produces a long list of connections between states, along with the correlation for each. I throw out all connections whose correlation falls in the bottom 92.5% (keeping the top 7.5%), then cluster neighboring states if they have a strong enough correlation.
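The procedure just described can be sketched roughly as follows. This is a minimal illustration rather than the code from the repository: the names `pearson` and `cluster_states`, the `borders` set of adjacent-state pairs, and the use of union-find to merge linked states are all assumptions made for the example, and "strong enough" is read as "within the top 7.5% of all pairwise correlations".

```python
import math
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of vote shares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def cluster_states(vote_lists, borders, keep_fraction=0.075):
    """vote_lists: state -> list of shares; borders: set of frozenset({a, b})."""
    states = sorted(vote_lists)
    # Every pairwise correlation, strongest first.
    links = sorted(
        ((pearson(vote_lists[a], vote_lists[b]), a, b)
         for a, b in combinations(states, 2)),
        reverse=True,
    )
    # Keep only the strongest fraction of links (assumed reading of the cutoff).
    kept = links[:math.ceil(len(links) * keep_fraction)]

    # Union-find: merge two states when a kept link joins neighbors.
    parent = {s: s for s in states}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for _, a, b in kept:
        if frozenset((a, b)) in borders:
            parent[find(a)] = find(b)

    groups = {}
    for s in states:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())
```

States left in a cluster of size one by this sketch correspond to the cases that needed manual attention.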

Texas, Vermont, and West Virginia did not correlate within the top 7.5% with any neighboring states, but they did correlate quite strongly with neighbors regardless. Strong enough for me to manually cluster them with specific neighbors. They were the only states I did manually.

It would be interesting to include voting for House members and Senators. At present, I have not investigated this avenue.

As always, the code related to this is available on GitHub.

Saturday, July 6, 2019

Post-Mortem of 2016 (Fragment)

How did Trump win 2016? It appeared to turn the world upside down, seemingly defying polls and reason. What happened? Was it really surprising or were we misled?

The game plan will be first to examine a few popular myths, eliminate these explanations, then examine the results of 2016. By articulating why 2016 was surprising, we will find the factors responsible for the outcome.

Executive Summary: It's all Gary Johnson's fault.

Remark (On Obama-Trump Voters). One plausible explanation for Trump's victory is the shift of Obama voters of 2012 who came to vote for Trump in 2016, the so-called "Obama-Trump voter". This is worth mentioning, only in passing, because the media has an irrational fascination with such voters.

Myth #1: Shy Trump Supporter

A popular conjecture is that a subpopulation of Trump supporters believed supporting Trump was "socially undesirable", hence would not publicly acknowledge backing Trump. Alexander Coppock conclusively tested this theory, and found no evidence of a shy Trump supporter existing. The predictive models with shy Trump supporters make statistically indistinguishable predictions from those without shy Trump supporters.

Andrew Gelman independently reached the same conclusion along different lines. Also, Gelman reasons, Republican candidates outperformed expectations in the Senate races, which casts doubt on the model in which respondents would not admit they supported Trump; rather, the Senate results are consistent with differential nonresponse or unexpected turnout or opposition to Hillary Clinton. It is possible that the anti-media, anti-elite, and even anti-pollster sentiment stoked by the Trump campaign has been a part of the reason for the low response of Trump supporters in states with large rural populations. Emphasis added.

It is worth remembering that Politico/Morning Consult released a poll, gathered online and via live phone calls, indicating despite different methodologies the different results show only a slight, not-statistically-significant difference in their effect on voters’ preferences for president. In other words, it didn't matter if a respondent talked to a pollster on the phone (where shyness would prevent the respondent from announcing support for Trump) or if the respondent communicated online, the results were statistically indistinguishable.

That three different tests all reach the same conclusion seriously undermines the "null hypothesis" that "shy Trump supporters" exist. We can discard this explanation as lacking empirical support.

Myth #2: Comey Ruined the Election

Hillary Clinton has personally blamed the election outcome on Comey's public announcement that he was re-opening the FBI investigation into Clinton's email servers the Friday prior to the vote. This explanation has become popular, presumably for the proximity of Comey's announcement to the perceived surprising loss. Neither evidence nor reason supports this misconception.

This claim is a contentious matter. But the Comey letter is the sort of mirage which our cognitive biases lead us to mistake for something real. We must heed Thucydides's words (1.21), stressing that the search for truth strains the patience of most people, who would rather believe the first things that come to hand.

Using rather rudimentary post-stratified modeling techniques, Chad P. Kiewiet de Jonge and Gary Langer have shown that Clinton in fact began losing the projected electoral college the Tuesday prior to Comey's announcement. Why didn't anyone else have this insight? The other prognosticating psephologists used some combination of poll-weighting and a likely voter model, which would have missed it.[1]

[1] Nate Silver noted, regarding Comey's letter, "As of Oct. 28, the polls-plus version of FiveThirtyEight’s forecast, which accounts for these factors, expected Clinton to lose a point or so off her lead before Election Day." Silver's model did not detect the deviations which MRP modeling found.

The New York Times released a poll days prior to the letter showing Trump ahead of Clinton by 4 points. Bloomberg/Selzer released a poll showing Trump ahead of Clinton by 2 points. Could we trust this poll? Well, FiveThirtyEight has awarded Selzer & Co. an "A+" pollster grade (and similarly high marks for both Siena College and the New York Times).[2] Although this is weak evidence supporting the claim Clinton began losing before Comey's announcement, the point we stress is that this is a second approach to thinking about the matter.

[2] As of both November 11, 2016 and July 3, 2019, Selzer & Co. received an "A+". The New York Times, in collaboration with CBS, received the more modest "A-" grade, but Siena College won a solid "A".

Could it be that Comey's letter accelerated Clinton's decline? From post-stratified modeling based on polling released afterwards, there is insufficient evidence that Comey's letter impacted Clinton's standing. The MRP model using polling data from surveys held November 4–6 shows Clinton recovering, albeit not enough to win the election or recover significant ground. Such results challenge and undermine the claim Comey impacted Clinton at all.

The raw data, and refined statistics, both reach the same conclusion: Clinton began losing the election before Comey even spoke.

What Happened?

To better answer this question, we should first note what people expected. If, for each state, we take a rolling mean of the proportion of the vote for each party (bundling all "third parties" together into a single "third party"), then normalize the result per state (so we have proportions again), the result is precisely the proportion of the vote one might expect to have found on election day of 2016. The results may be described by the following map:

The electoral college count would have been 332 for the Democratic candidate, 206 for the Republican candidate. We can actually quantify how surprising the actual results were using the Kullback–Leibler divergence, but to make sense of this we should compare it to previous elections:

Election Year   Surprise (in bits)
1988             0.8504352
1992            33.3329646
1996             2.8056336
2000             2.8825127
2004             1.3989161
2008             0.7251642
2012             0.3414356
2016             3.5202876

For 1992, remember Ross Perot captured about 19% of the popular vote, the most a third party had received since Teddy Roosevelt ran for a third term on the Progressive Party ticket in 1912,[3] which is why it is the most surprising row in our table.

[3] Perot's 1992 performance stands third in the rankings of "percent of popular vote a third party candidate received in the presidential election". Teddy in 1912 stands at first with 27.39% of the vote, Millard Fillmore's 1856 bid on the American Party ticket ranks second at 21.54% of the vote, and Perot's 1992 bid comes in third at 18.91% of the popular vote. In contrast, Gary Johnson's 2016 bid received 3.28% of the popular vote.
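For readers who want to reproduce a "surprise" figure of this kind, here is a minimal sketch of the Kullback–Leibler divergence in bits. Two things are assumptions here, not statements about the original scratchwork: the divergence is taken from the expected proportions to the actual ones, and the per-state divergences are summed to get one number per election. The function names are hypothetical.

```python
import math

def kl_bits(actual, expected):
    """D_KL(actual || expected) in bits for two lists of proportions."""
    return sum(a * math.log2(a / e) for a, e in zip(actual, expected) if a > 0)

def election_surprise(actual_by_state, expected_by_state):
    """Sum the per-state divergences (assumed aggregation)."""
    return sum(
        kl_bits(actual_by_state[s], expected_by_state[s])
        for s in actual_by_state
    )
```

A state whose actual proportions match the expectation contributes zero bits; the further the actual result strays from the expectation, the larger its contribution.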

So what happened? Why was this so surprising? There are a variety of ways to approach answering this. We may take the expected vote proportions and compare them to the actual vote proportions, and extrapolate out the difference in votes (assuming voter turnout remained the same in this hypothetical 2016 election as in the actual 2016 election). We find the difference in votes:

Party           Difference in votes   As Fraction of Total Vote
Third Parties    4,397,608             0.0321493
Democratic      -2,605,576            -0.0190484
Republican      -1,792,032            -0.0131009

Computed by lumping all third parties into a single "third party". Then the rolling mean for each state and for each party was taken from 1976 until 2012. The result was renormalized in each state, then multiplied by the total votes cast in that state in 2016. The second column shows the actual votes minus the expected votes, summed over all states; the third column expresses that difference as a fraction of the total vote.
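The computation for a single state can be sketched as below. This is a hedged illustration: the window of the rolling mean is not spelled out above, so the sketch simply averages over every election supplied, and the names `expected_shares` and `vote_differences` are hypothetical.

```python
def expected_shares(history):
    """history: list of (dem, rep, third) share tuples, one per past election.

    Averages each party's share, then renormalizes so the shares sum to 1.
    """
    n = len(history)
    means = [sum(year[i] for year in history) / n for i in range(3)]
    total = sum(means)
    return [m / total for m in means]

def vote_differences(history, actual_votes):
    """actual_votes: (dem, rep, third) raw counts for the election in question.

    Returns actual minus expected votes per party, holding turnout fixed.
    """
    turnout = sum(actual_votes)
    expected = expected_shares(history)
    return [v - p * turnout for v, p in zip(actual_votes, expected)]
```

Summing the per-state differences for each party gives a national table like the one above; because turnout is held fixed, the three differences in any one state cancel to zero.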

Although we find Trump underperformed by 1.31%, we find that Clinton underperformed by 1.9% of the vote. Third parties overperformed by roughly 3.21% of the popular vote. Note: Johnson received 3.28% of the popular vote in 2016 — does this account for the overperformance of third party candidates in 2016?

Two questions immediately emerge: (a) Which third party candidate overperformed? (b) If we supposed the third parties received fewer votes (i.e., they received the expected votes), how would the difference be re-allocated between Trump and Clinton?

Measuring Third Party Overperformance

We can actually measure the Kullback–Leibler divergence for how Johnson performed in 2016 compared to 2012 (lumping all non-Johnson votes in one category). This measures the surprise in Johnson's 2016 performance relative to 2012 expectations, an adequate way to gauge improvement. We may do similarly for Stein, since both ran in 2012 and 2016. The result may be summed up thus:

The ratio of "Johnson's improvement" to "Stein's improvement" averaged 22.88781 — that's over an order of magnitude improvement!
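As a rough sketch of the per-candidate measure, the two-category divergence (the candidate versus everyone else) can be computed from raw vote counts as follows. The function name is hypothetical, and the sketch assumes the candidate received a nonzero share in both years, so states where a candidate was absent in the earlier election would have to be handled separately.

```python
import math

def candidate_surprise(votes_now, total_now, votes_before, total_before):
    """Binary KL divergence (bits) of a candidate's later share vs. earlier share."""
    p = votes_now / total_now        # share in the later election
    q = votes_before / total_before  # share in the earlier election
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))
```

The ratio of one candidate's value to another's in the same state then gives a per-state comparison of their improvement, which can be averaged across states.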

In the states Trump carried in 2016 which Obama had carried, in whole or in part, in 2008 or 2012 (specifically: Florida, Michigan, Pennsylvania, Wisconsin, Iowa, Indiana, North Carolina, Ohio, Nebraska), Johnson at least doubled his votes in 2016 compared to 2012 (in Florida, Johnson saw his votes grow from 44,726 to 207,043). Johnson's average improvement among these swing states was 336% more votes in 2016 compared to 2012. We should observe that Johnson didn't run in either Michigan or Wisconsin in 2012, though.

These numbers tell us, quite simply, that Johnson improved considerably between 2012 and 2016. This alone is quite surprising, since third party candidates seldom improve so drastically. Jill Stein, on the other hand, saw very little improvement in votes. We may safely conclude that Johnson is the dominant (sole?) dynamo for the third party's surprise improvement, which answers the first question the previous section posed: which third party candidate overperformed? We may safely answer, it was Johnson.

A Wonderful Life

The Economist's Lexington, stating the obvious, informs us, "Most of those who voted for Mr Johnson in 2016 were protesting against the alternatives." But if there were a timeline where there was no Libertarian ticket in 2016, or any other third party for that matter, how would the election have changed?

Looking at the numbers, third party voters changed the outcome in Florida, Michigan, Pennsylvania, and Wisconsin. (They did so in North Carolina, but they would have to break 95.7% for Clinton, which is implausible.) If 68.997%+1 of third party voters broke for Clinton in this hypothetical, then Clinton would have won an additional 75 delegates in the electoral college. This would have changed the outcome of the election. Is this feasible?
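The break-even share follows from a short calculation. If third party voters make up a fraction T of a state's vote, Trump's margin is m (also as a fraction of the total vote), and a fraction f of the third party voters break for Clinton with the rest going to Trump, then Clinton nets f·T − (1 − f)·T, which exceeds m exactly when f > 1/2 + m/(2T). A minimal sketch, with a hypothetical function name and with the state's third party share treated as an input that is assumed to be known:

```python
def break_even_share(trump_margin, third_party_share):
    """Fraction of third party voters Clinton must win to erase Trump's margin.

    Both arguments are fractions of the state's total vote. A result above 1
    means third party voters alone could not have flipped the state.
    """
    return 0.5 + trump_margin / (2 * third_party_share)
```

Evaluating this for each state and taking the largest value among the states to be flipped gives the share of third party voters Clinton would have needed across all of them.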

FiveThirtyEight's Harry Enten asked this very question in his article, "Election Update: Is Gary Johnson Taking More Support From Clinton Or Trump?" If we take his observations as a jumping-off point, then third party voters are divided up thus: 1.19864% (of the total vote) is taken from the third party candidates and given to Clinton, then the remaining third party voters are divvied up evenly between Trump and Clinton. This effectively erases the margin of victory for Trump. (With the exception of Florida, only a small fraction of third party voters need to be shaved off to change the outcome.)

State          Delegates   Trump margin   Shift to Clinton
Florida        29          0.0119863      0.0207737
Michigan       16          0.0022303      0.0311395
Pennsylvania   20          0.0072427      0.0228425
Wisconsin      10          0.0076434      0.0366399

This alternate timeline, where third parties vanished and their constituents had to pick between Trump and Clinton, would have produced a drastically different result.

Even if we suppose Harry Enten's findings are too generous, and Clinton's edge was not 1% but a more conservative 0.7643%+1 (the margin just enough to win Wisconsin), Clinton would still lose Florida but win Michigan, Pennsylvania, and Wisconsin. This would have given Clinton an additional 46 electoral delegates, enough to make her delegate count 278 to Trump's 260. Again we find Johnson acted as spoiler, prevented Clinton's victory, and delivered to us a Trump presidency.

Remark. Let us suppose Enten's findings could be used to construct a random variable describing how third party voters will likely vote. Given that polls have a margin of error of 0.04 at a 95% confidence level, we can construct a normally distributed random variable X centered at 0.01 [which Enten determined is the edge Clinton has] with a 0.02 sigma [from the noise for polling] for the third party supporter who just "randomly" picks who to vote for as follows: generate a random real number following this distribution and, if it is positive, vote for Clinton, otherwise vote for Trump. In this scheme, Trump receives 30.85375% of the third party voters, Clinton receives 69.14625%, enough for Clinton to win Florida, Michigan, Wisconsin, and Pennsylvania.
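The split in this remark is just the probability that X is positive, which for a normal distribution is Φ(μ/σ). A minimal sketch using only the standard library (the function name is hypothetical):

```python
import math

def share_for_clinton(mean=0.01, sigma=0.02):
    """P(X > 0) for X ~ Normal(mean, sigma), via the error function."""
    return 0.5 * (1.0 + math.erf(mean / (sigma * math.sqrt(2.0))))

clinton = share_for_clinton()   # about 0.6914625
trump = 1.0 - clinton           # about 0.3085375
```

With a mean of 0.01 and a sigma of 0.02 this is Φ(0.5), which reproduces the 69.14625% and 30.85375% quoted above.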

Exercise 1. The New York Times's "Libertarian Gary Johnson Polls at 10 Percent. Who Are His Supporters?" surveys the demographics of Johnson supporters. Consider using an MRP model (like Gelman et al.'s in arXiv:1802.00842) to estimate the preference of third party voters.

Exercise 2. The argument produced above, and the results of exercise 1, give two different ways to show Johnson was a spoiler candidate and Clinton would have won the election had Johnson not run. But it is not wise to go to sea with two chronometers (take either one or three). Think of another test for showing Johnson was a spoiler candidate.

What Remains to be Investigated?

Aside from the exercises for myself, points worth pursuing include: what could Clinton have done differently? It's one thing for us to sit back and say, "Well, well, third parties ruined everything." But it's more useful to consider how third parties attracted voters, and what Clinton could have done to counter this effect.

Also, comparing the demographics of Obama-Trump supporters to Johnson supporters may be insightful. If it turns out these two share a suitably similar political culture, then we may have found one stratum of swing voters. It remains to be seen if they are so fed up with Trump that they abstain from even voting in 2020.

But also worth considering is the newly energized Democratic base which didn't materialize for Clinton in 2016 but sure as Hell materialized to protest Trump and vote out Republicans in 2018. If the newly energized base is larger than the Obama-Trump and Johnson voters, especially in the swing states, then it may be worth considering alternative 2020 strategies.

All the scratchwork for this post may be found on GitHub.

Thursday, July 4, 2019

Explanations in Psephology

What qualifies as an "explanation"? Specifically, when will a statistical analysis explain why candidate A lost the election to candidate B?

Good explanations have a variety of characteristics (it's independent of the method, it's contrastive, social, etc.). The late Cambridge philosophy professor Peter Lipton defines an explanation (as quoted in arXiv:1811.03163)

To explain why P rather than Q, we must cite a causal difference between P and not-Q, consisting of a cause of P and the absence of a corresponding event in the history of not-Q.

The question I'm pursuing (implicitly, through a number of posts) is "Why did Trump win 2016?" The first puzzle: will this generate the same explanations as "Why did Clinton lose 2016"?

There are many variants on these questions which may be worth considering:

  • What could Clinton have done differently to win 2016?
  • What did Trump do (as opposed to [generic Republican candidate]) which contributed to winning 2016?
  • Could a generic Republican candidate have won 2016?
  • Could a generic Democrat have defeated Trump in 2016?

And could we rank how decisively each factor in the answers contributed to the outcome?

All of these questions raise different answers, particularly since we're focusing on different actors. For our purposes, understanding how the election of 2016 unfolded as it did, all of these questions may be worth investigating.

But What's a "Good Explanation"?

The second puzzle is what qualifies as a "good explanation". Let's try to examine a few (hypothetical) propositions, and see if they qualify as "explanations".

Proposition 1. If 50%+1 of voters who voted for Obama in 2012 and Trump in 2016 had changed their vote to Clinton in 2016, then Clinton would have won the election.

This gives us a path to victory, but it does not illuminate why Obama-Trump supporters jumped ship from Obama to Trump. As Lipton phrased it, this gives us knowledge but not understanding. We do not understand voter "issue preferences", to borrow a game theoretic term.

Proposition 2. Clinton didn't alter her campaign sufficiently in 2016 compared to her past campaigns.

This explanation gives some understanding of why she lost, but it is incomplete or not fully fleshed out...depending on what question we're really trying to answer. Proposition 2 only partially explains why she lost 2016: we would implicitly need to explain that "typical campaigning" didn't work against Trump (which hardly seems like something worth explaining to anyone who lived through it).

How could we empirically test this proposition? This is an orthogonal concern: providing evidence for an explanation. It is worth pondering, though, what data would suffice to merit this explanation.

Concerns for proof notwithstanding, as an explanation, proposition 2 has the quality of understanding and some flavor of causal reasoning.

Proposition 3. Johnson acted as a spoiler candidate, particularly among swing voters.

Explanations, like proposition 3, tend to sound more like excuses. How can we rigorously test such a proposition? How can we avoid fooling ourselves?

There is a clear counter-factual claim we could make, premised on proposition 3, that: had Johnson not run, Clinton would have been President. So proposition 3 qualifies as an "explanation" per se, but there are lurking factors hidden beneath it...why did Johnson get so many votes? How could Clinton have campaigned differently?

But that's a story for another day...

Tuesday, July 2, 2019

Swing Voters: A Glance at the Literature

Puzzle 0. Journalists have introduced the term "swing voters". (a) Can we make this notion rigorous? Who is a swing voter? Assuming this notion is well-defined, we have follow up queries: (b) Can we "model" swing voters (in some sense)? Is there some "psychological profile" for "swing voters"? (c) What correlates with the number of swing voters in a state?

William Mayer, in his book The Swing Voter in American Politics (2008, pg 2), describes a "swing voter" as a voter who is persuadable:

In simple terms, a swing voter is, as the name implies, a voter who could go either way: a voter who is not so solidly committed to one candidate or the other as to make all the efforts at persuasion futile.[a] If some voters are firm, clear, dependable supporters of one candidate or the other, swing voters are the opposite: those whose final allegiance is in some doubt all the way up until Election Day. Put another way, swing voters are ambivalent or, to use a term with a somewhat better political science lineage, cross-pressured.[b] Rather than seeing one party as the embodiment of all virtue and the other as the quintessence of vice, swing voters are pulled—or repulsed—in both directions.

[a] As indicated in the text, among media articles that do provide an explicit definition of the swing voter, this is the most common approach. See, for example, Joseph Perkins, "Which Candidate Can Get Things Done?" San Diego Union-Tribune, October 20, 2000, p. B-11; Saeed Ahmed, "Quick Hits from the Trail," Atlanta Constitution, October 26, 2000, p. 14A; and "Power of the Undecideds," New York Times, November 5, 2000, sec. IV, p. 14.

[b] Though it never employed the term “swing voter,” one antecedent to the analysis in this chapter is the discussion in most of the great early voting studies of social and attitudinal cross-pressures within the electorate. See, in particular, Lazarsfeld, Berelson, and Gaudet [The People’s Choice: How the Voter Makes Up His Mind in a Presidential Campaign] (1948, pp. 56–64); Berelson, Lazarsfeld, and McPhee [Voting: A Study of Opinion Formation in a Presidential Campaign] (1954, pp. 128–32); Campbell, Gurin, and Miller [The Voter Decides] (1954, pp. 157–64); and Campbell and others [The American Voter] (1960, pp. 78–88). There was, however, never any agreement as to how to operationalize this concept (Lazarsfeld and his collaborators tended to look at demographic characteristics; the Michigan school used attitudinal data); and almost the only empirical finding of this work was that cross-pressured voters tended to be late deciders. For reasons that are not immediately clear, more recent voting studies have almost entirely ignored the concept. The term appears nowhere in Nie, Verba, and Petrocik [The Changing American Voter] (1976); Fiorina [Retrospective Voting in American National Elections] (1981); or Miller and Shanks [The New American Voter] (1996).

The American National Election Studies have surveyed voters in every presidential election since 1972. We can use their so-called "feeling thermometer questions", which give a value between 0 and 100 for each candidate. Mayer constructs a new statistic by taking the Republican's "feeling thermometer value" and subtracting the Democrat's "feeling thermometer value". The voters around 0 degrees, Mayer suggests, are the swing voters.
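As a small illustration of this statistic, the sketch below computes the thermometer gap and flags respondents near zero. The function names are hypothetical, and the width of the "around 0 degrees" band (here 15 points either side) is an assumption chosen for the example, not a value taken from the text above.

```python
def thermometer_gap(rep_rating, dem_rating):
    """Republican thermometer rating minus Democratic rating (each 0 to 100)."""
    return rep_rating - dem_rating

def is_swing_voter(rep_rating, dem_rating, band=15):
    """True when the gap lies within `band` degrees of zero (assumed cutoff)."""
    return abs(thermometer_gap(rep_rating, dem_rating)) <= band
```

A respondent rating the candidates 55 and 50 would count as a swing voter under this cutoff, while one rating them 85 and 15 would not.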

Since Mayer's book was published, we have acquired more data about swing voters using ad-tracking technology. Quartz's Ashley Rodriguez reviewed findings for 2016 swing voters. While the minute details of these studies are fascinating, if true, they don't tell us about any "macro-statistics" correlated with "swing-iness".

After the 2018 midterm elections, Vox's Matthew Yglesias argues swing voters still exist, but his argument rests on uncompelling circumstantial evidence. There are voters who cast their 2012 vote for Obama yet 2016 vote for Trump (and similarly those who voted for Romney in 2012 and Clinton in 2016), but this data alone is insufficient to prove all such voters are "swing voters". We need more to establish such voters are "swingers".

Palfrey and Poole (1987) have shown low information voters tend to constitute the majority of swing voters, as Mayer has defined them.

Puzzle 1. Can we reproduce Palfrey and Poole's results? Has there been more modern work confirming this?

Here's the unintuitive thing: if we control for partisans masquerading as "independents", as Gelman et al. (2014) have done, then swing voters in 2012 are sample artifacts whose effects are quite small. Dr Gelman wrote a piece in the Washington Post explaining his findings, in simpler terms.

Puzzle 2. How do Gelman et al.'s findings hold up in light of Palfrey and Poole? Are uninformed voters no longer "swingable"? Or have uninformed voters vanished (or, at least, no longer vote)?

Happily, Mayer has a resolution for this puzzle. It's been well-known (since at least Keith's 1992 work, building upon many others' works from the '80s) that self-proclaimed "independents" are "hidden partisans". Indeed, using "political independent" as synonymous with "swing voter" is a Bad Idea.

The "undecided voter" is a much closer concept to a "swing voter". Although conceptually similar, it is harder to gauge if a voter is really "undecided" or not. It turns out people eagerly claim to be "undecided", more than matches reality.

Voter Model

There are a variety of voter models we could consider. I'm going to summarize the models as presented in R. Douglas Arnold's The Logic of Congressional Action, and they're really quite simple decision rules.

Party Performance Rule. A voter asks themselves, "Are things better off than they were at the last election? Which party is 'in charge' [of the White House]?" If things are deteriorating, the voter will cast their vote against the President's party. If conditions are improving, the voter will cast their vote supporting the President's party.

Incumbent Performance Rule. Voters, Arnold describes, "first evaluate current conditions in society, decide how acceptable those conditions are, and then either reward or punish incumbent legislators for actions that they think contributed to the current state of affairs" (pg. 44). Although very similar to the party performance rule, the difference lies in who is held responsible: the members of the President's party, or the legislators themselves.

Party Position Rule. A citizen first identifies the party offering the most pleasant package of policy positions, then votes for candidates belonging to that party.

Candidate Position Rule. A citizen first identifies the candidate offering the most pleasant package of policy positions, then votes for that candidate.

So...which rule is it? Arnold suggests, arguably, a fifth decision rule which resembles aspects of all these rules. Basically, a voter keeps four "accounts" (integers) in his brain, one for each party, another for the incumbent, and the fourth for the challenger. These values may be positive or negative.

The two accounts for the parties are given some initial values during childhood. As the voter acquires information about the parties' achievements in office, the voter updates their "accounts" for the parties using something like the party position and party performance rules.

When the voter learns about the incumbent, our voter opens up a third account for that incumbent. As our voter learns about the incumbent's positions and accomplishments, the voter updates the incumbent's account using some amalgam of candidate position and incumbent performance rules. Ancillary information (like extramarital affairs, hiking the Appalachian trail, etc.) nudges the account, one way or another.

Finally, a challenger appears, and the voter opens up a fourth account for that challenger. The value assigned to this fourth account is a variant of the candidate position rule, combined with extra-information adjustments. (Is the challenger's party responsible for a disastrous war? Did the economy collapse? Etc.)

On election day, the intrepid voter goes to the polls, combines these four values, then decides how to vote. The simplest model adds the four values together, then if the sum is positive votes for the incumbent (otherwise, the voter sides with the challenger).

How well does this "Impression-driven" model work? There's evidence this model is something along the lines of how people actually decide to vote, but that's a contentious point among academics. I will side-step arguments, and just note that cognitive heuristics probably account for most (all?) of the decision-making process, and some variant of this "impression-driven" model probably works "good enough".

Remark. I suspect something like a moving average formula is used to update these accounts. Doubtless there are countless variants of this model, depending on what formulas we want to use.
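To make the bookkeeping concrete, here is a toy sketch of such a voter. Everything specific in it is an assumption made for illustration: the class and method names, the exponential moving average with a smoothing `weight` of 0.3, floating-point rather than integer accounts, and the sign convention (each account stores goodwill toward that actor, so the challenger-side accounts are subtracted when the four values are combined).

```python
from dataclasses import dataclass

@dataclass
class Voter:
    incumbent_party: float = 0.0   # goodwill toward the incumbent's party
    challenger_party: float = 0.0  # goodwill toward the challenger's party
    incumbent: float = 0.0         # goodwill toward the incumbent personally
    challenger: float = 0.0        # goodwill toward the challenger personally
    weight: float = 0.3            # hypothetical smoothing factor

    def update(self, account, impression):
        """Fold a new impression into one account with a moving average."""
        old = getattr(self, account)
        setattr(self, account, (1 - self.weight) * old + self.weight * impression)

    def vote(self):
        """Combine the four accounts; a positive net favors the incumbent."""
        net = (self.incumbent_party + self.incumbent) - (
            self.challenger_party + self.challenger
        )
        return "incumbent" if net > 0 else "challenger"
```

For example, a fresh voter who records a single favorable impression of the incumbent will vote for the incumbent, until a larger favorable impression of the challenger tips the net below zero.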

How do Swing Voters fit in this Model?

It seems there are multiple narratives one could generate to produce a swing voter. But the only ones which I can think of produce voters whose "accounts" are all "near zero" around election time.

There are some genuine "party switchers", like Reagan-Democrats or Obama-Trump voters. These voters seem to be dissatisfied with the Democratic party and/or their candidate, and update their "accounts" accordingly. This "swing" is a re-evaluation of party performance, or candidate performance, rather than "Starting uninformed and scrambling to form an opinion."

Swing voters seem to have their accounts return to "near zero" after the election, paying little attention to politics. Empirically, it is hard to find covariates correlating with this quality. Mayer's book discusses this in greater detail.

References

Swing Voters

  • Gary Cox, "Swing voters, core voters, and distributive politics". In Political Representation (edited by Ian Shapiro, Susan C. Stokes, Elisabeth Jean Wood, Alexander S. Kirshner), Cambridge University Press, 2010, pp.342–357. Eprint.
  • Timothy J. Feddersen, Wolfgang Pesendorfer, "The Swing Voter's Curse". The American Economic Review 86, no. 3 (1996) pp. 408–424. Provides a decision-theoretic model for voters abstaining from voting.
  • Andrew Gelman, Sharad Goel, Douglas Rivers, and David Rothschild, The Mythical Swing Voter. (2014)
  • S. Kelley, Interpreting Elections. Princeton University Press, 1983.
  • William G. Mayer (ed.), The Swing Voter in American Politics. Brookings Institution Press, 2008. See esp. ch. 1.
  • Thomas R. Palfrey, Keith T. Poole, "The Relationship between Information, Ideology, and Voting Behavior". American Journal of Political Science 31, no. 3 (1987) pp. 511–530.
  • Ashley Rodriguez, Undecided voters are as scared as the rest of us, and other insights from a trove of data on swing voters, Quartz, October 8, 2016.
  • Nate Silver, The Invisible Undecided Voter, FiveThirtyEight, Jan. 23, 2017.
  • Matthew Yglesias, Swing voters are extremely real, Vox, July 23, 2018.

Voter Model

  • R. Douglas Arnold, The Logic of Congressional Action. Yale University Press, 1990. See chapter 3.
  • Richard Lau and David Redlawsk, "Advantages and disadvantages of cognitive heuristics in political decision making". American Journal of Political Science 45, No. 4 (2001): 951–971.
  • Milton Lodge, Kathleen M. McGraw, Patrick Stroh, "An Impression-Driven Model of Candidate Evaluation". The American Political Science Review 83, No. 2 (1989), pp. 399–419.
  • Milton Lodge, Marco R. Steenbergen, Shawn Brau, "The Responsive Voter: Campaign Information and the Dynamics of Candidate Evaluation". The American Political Science Review 89, No. 2. (1995), pp. 309–326.

Addendum. FiveThirtyEight has published an article rehashing much the same, but it skimps on the references.