We briefly introduced the idea of issue spaces as a formalization of the political spectrum. Now we want to figure out where legislators are on that political spectrum.

Game theory models each political actor as an ideal point in an issue space, equipped with a utility function on that issue space. But how do we estimate (unobservable) ideal points? One strategy is to infer them from recorded votes, which is the basis for the (i) Item-Response and (ii) NOMINATE families of algorithms.

Basic Idea

The policy space consists of s dimensions. Legislator i's utility for voting yes (y subscript) on measure j is a function of the "distance" from the legislator's ideal point to the proposed legislation's point. We use a slightly generalized version of the Pythagorean theorem, where the "distance" first dilates each coordinate difference d_{i j y k} by the subjective weight w_k the legislator places on dimension k of the policy space:

l_{i j y}^{2} = \sum_{k = 1}^{s} w_{k}^{2} d_{i j y k}^{2}
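To make the formula concrete, here is a minimal Python sketch of the weighted distance. The function name and the sample numbers are hypothetical, chosen only for illustration; d_k is taken to be the coordinate difference between the ideal point and the measure in dimension k.

```python
import math

def weighted_distance(ideal_point, policy_point, weights):
    """Return l = sqrt(sum_k w_k^2 * d_k^2), where d_k = ideal_point[k] - policy_point[k]."""
    if not (len(ideal_point) == len(policy_point) == len(weights)):
        raise ValueError("all arguments need one entry per policy dimension")
    squared = sum(
        (w * (x - z)) ** 2
        for x, z, w in zip(ideal_point, policy_point, weights)
    )
    return math.sqrt(squared)

# With unit weights this reduces to the ordinary Euclidean distance.
print(weighted_distance([0.0, 0.0], [3.0, 4.0], [1.0, 1.0]))  # 5.0
# Halving the second weight shrinks that dimension's contribution.
print(weighted_distance([0.0, 0.0], [3.0, 4.0], [1.0, 0.5]))
```

A legislator who cares less about a dimension (small w_k) sees measures that differ along that dimension as "closer".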

A legislator's utility function is then some "suitably nice" function of these distances, u(l). This isn't quite the end of the story: because we're dealing with statistical regression, this is only the "deterministic part" of the utility function. We also have the "random noise" ε:

U (l_{i j y}) = u (l_{i j y}) + ε

We fix u to be either a Gaussian function or a quadratic polynomial. The only condition is that "it looks like a frown": it has a global maximum at the legislator's ideal point (l = 0) and is strictly decreasing as the distance l grows.
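As a rough illustration of the two "frown" shapes, the sketch below uses β exp(-l²/2) for the Gaussian case and -l² for the quadratic case. Both specific forms, and the default β = 1, are assumptions made for the example rather than the exact parameterization of any particular package.

```python
import math

def gaussian_utility(distance, beta=1.0):
    """Gaussian 'frown': equals beta at distance 0 and decays toward 0."""
    return beta * math.exp(-0.5 * distance ** 2)

def quadratic_utility(distance):
    """Quadratic 'frown': equals 0 at distance 0 and falls without bound."""
    return -(distance ** 2)

distances = [0.0, 0.5, 1.0, 2.0]
gaussian = [gaussian_utility(l) for l in distances]
quadratic = [quadratic_utility(l) for l in distances]

# Both shapes peak at the ideal point and strictly decrease with distance.
print(all(a > b for a, b in zip(gaussian, gaussian[1:])))
print(all(a > b for a, b in zip(quadratic, quadratic[1:])))
```

The practical difference is in the tails: the Gaussian utility flattens out for far-away measures, while the quadratic one keeps penalizing them ever more steeply.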

How to progress? We represent the probability of voting "yea" in terms of the utility function (how this is done varies model by model), then estimate the parameters (the w_k for each legislator, each legislator's ideal point, and each motion's location in the policy space) using something like maximum likelihood or expectation maximization.
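Whatever the model, the quantity being maximized is typically a log-likelihood over the observed votes. The sketch below assumes votes are independent Bernoulli draws given the model's "yea" probabilities; the function name and toy numbers are hypothetical.

```python
import math

def log_likelihood(votes, yea_probabilities):
    """Bernoulli log-likelihood of a vote matrix.

    votes[i][j] is 1 for "yea" and 0 for "nay"; yea_probabilities[i][j]
    is the model's probability that legislator i votes "yea" on roll call j.
    """
    total = 0.0
    for vote_row, prob_row in zip(votes, yea_probabilities):
        for vote, p in zip(vote_row, prob_row):
            total += math.log(p) if vote == 1 else math.log(1.0 - p)
    return total

votes = [[1, 0], [1, 1]]
good_fit = [[0.9, 0.2], [0.8, 0.7]]  # probabilities that agree with the votes
poor_fit = [[0.5, 0.5], [0.5, 0.5]]  # a model that knows nothing

# Parameters that explain the votes better earn a higher log-likelihood.
print(log_likelihood(votes, good_fit) > log_likelihood(votes, poor_fit))  # True
```

An estimation routine would search over ideal points, weights, and motion locations for the probabilities that make this number as large as possible.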

NOMINATE models

The utility functions are Gaussian functions. If there are s dimensions to the policy space, legislator i has the following utility for voting yes (y subscript) on measure j, where w_k d_{i j y k} measures the "cost" of deviating from the legislator's ideal point in the k^th dimension of the issue space:

u_{i j y} = β \exp [- \sum_{k = 1}^{s} w_{k}^{2} d_{i j y k}^{2} / 2]

Observe the exponent is just -l_{i j y}^{2} / 2, minus one half of the squared "cost" l for the legislator to support the measure, so the utility peaks when the measure sits exactly at the legislator's ideal point. As before, this is only the "deterministic part" of the utility function. Adding the "random noise" ε gives the utility function U as

U_{i j y} = u_{i j y} + ε_{i j y}

Note: we can similarly define the utility for voting "nay" by considering instead the location of the status quo in policy space and computing the legislator's distance to that point. This gives u_{i j n}, the deterministic utility for voting "nay". The stochastic term ε of the utility function is assumed to follow an "extreme value distribution", which lets us write the probability that legislator i votes "yea" on roll call j as:

Pr (Yea) = P_{i j y} = \frac{\exp (u_{i j y})}{\exp (u_{i j y}) + \exp (u_{i j n})}
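Here is a small sketch of that choice probability. It assumes the Gaussian utility carries a negative exponent, so that utility falls with distance and matches the "frown" shape; all function names and numbers are hypothetical and this is not the wnominate implementation.

```python
import math

def nominate_utility(ideal, outcome, weights, beta):
    """Deterministic utility: beta * exp(-0.5 * sum_k (w_k * d_k)^2)."""
    exponent = -0.5 * sum(
        (w * (x - z)) ** 2 for x, z, w in zip(ideal, outcome, weights)
    )
    return beta * math.exp(exponent)

def prob_yea(ideal, yea_point, nay_point, weights, beta):
    """Logit choice probability between the "yea" and "nay" outcomes."""
    u_yea = nominate_utility(ideal, yea_point, weights, beta)
    u_nay = nominate_utility(ideal, nay_point, weights, beta)
    # Algebraically equal to exp(u_yea) / (exp(u_yea) + exp(u_nay)).
    return 1.0 / (1.0 + math.exp(u_nay - u_yea))

# A legislator equidistant from both outcomes is a coin flip.
print(prob_yea([0.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [1.0, 1.0], beta=5.0))  # 0.5
# A legislator much closer to the "yea" outcome probably votes "yea".
print(prob_yea([0.5, 0.0], [0.6, 0.1], [-0.4, 0.0], [1.0, 0.5], beta=5.0))
```

Larger β makes the utility gap matter more relative to the noise, pushing probabilities toward 0 or 1.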

The exact details of this variant of the NOMINATE algorithm may be found in "Scaling Roll Call Votes with wnominate in R"; it works for a single session of Congress.

The models above estimate the ideal points for a finite set of legislators within the same session of Congress. But how do we handle "progress", i.e., how ideal points evolve over time (across multiple sessions of Congress)? One answer is to model the legislator's ideal point as a polynomial in t (the number of sessions since joining), supposing the legislator has served T terms so far:

x_{i t} = x_{i 0} + x_{i 1} P_{1} (\frac{t - 1}{T - 1}) + \dots + x_{i ν} P_{ν} (\frac{t - 1}{T - 1})

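Here P_k is the Legendre polynomial of degree k, and the coefficients x_{i 0}, ..., x_{i ν} are more parameters to be determined. Why use Legendre polynomials? It's unclear to me; presumably for their completeness relation (any reasonably well-behaved function on a bounded interval can be adequately approximated by "enough" Legendre polynomials). Note that the standard Legendre polynomials are orthogonal on -1 ≤ x ≤ 1 rather than 0 < x < 1, so exploiting that orthogonality means rescaling time onto the symmetric interval, e.g. 2 (t - 1)/(T - 1) - 1. This is the DW-NOMINATE variant.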
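To see what such a trajectory looks like, the sketch below evaluates a one-dimensional ideal point over time, using the (t - 1)/(T - 1) rescaling from the formula. The function names, the handling of T = 1, and the coefficients are all hypothetical; a real implementation may well rescale time differently (for instance onto [-1, 1]).

```python
def legendre_values(x, max_degree):
    """Return [P_0(x), ..., P_max_degree(x)] using Bonnet's recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    values = [1.0, x]
    for n in range(1, max_degree):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: max_degree + 1]

def ideal_point_at(t, total_terms, coefficients):
    """x_{it} = x_{i0} + sum_{k>=1} x_{ik} P_k(s), with s = (t - 1) / (T - 1)."""
    # Assumption: a legislator with a single term has no time variation.
    s = 0.0 if total_terms == 1 else (t - 1) / (total_terms - 1)
    basis = legendre_values(s, len(coefficients) - 1)
    return sum(c * p for c, p in zip(coefficients, basis))

# A legislator who starts at 0.2 and drifts linearly to about 0.3 over 5 terms.
coefficients = [0.2, 0.1]  # hypothetical x_{i0}, x_{i1}
for t in range(1, 6):
    print(t, ideal_point_at(t, 5, coefficients))
```

With only x_{i0} the legislator never moves; adding x_{i1} allows a linear drift, and higher-order coefficients allow curvature.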

Problems

Although ubiquitous in the literature, there are some problems with the NOMINATE scores.

First, the dimensions are not as easy to interpret as its proponents claim. The first dimension is always interpreted as the "partisanship" dimension, but there's no clear way to glean that other than guessing.

Second, it poorly describes how someone's views evolve over time. This is important if we wanted to discuss, e.g., "party realignments" (Are the Republicans from the 1990s "the same as" the Republicans in 2019?).

Third, NOMINATE requires a lot of data before it can produce decent results. This has probably improved since the original algorithm; there are so many variants now that it's hard to keep track.

Fourth, it's not Bayesian. This is unfortunate from a performance perspective. If I have just computed the NOMINATE scores for legislators based on the first session of Congress, and 6 months into the next session I want to update those scores, I have to recompute everything from scratch. This isn't as terrible as the previous problems, but it is irritating.

Item-Response Models

The basic idea is to treat votes as if they were responses to a survey, then use the already developed Item-Response Theory. As applied to ideal points of legislators, this means considering roughly a probit model for the probability that legislator i will vote "yea" on roll call j:

Pr (y_{i j} = 1) = Φ ({\vec{β}}_{j} \cdot {\vec{x}}_{i} - α_{j})

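where Φ is the CDF of the standard Normal distribution, {\vec{x}}_{i} is legislator i's ideal point, and {\vec{β}}_{j} and α_{j} are parameters of roll call j.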
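The probit probability is easy to compute with the standard library alone. This is a minimal sketch with hypothetical names and numbers, not the code of any IRT package.

```python
import math

def standard_normal_cdf(z):
    """Φ(z), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_yea_irt(discrimination, ideal_point, difficulty):
    """Pr(y_ij = 1) = Φ(β_j · x_i - α_j)."""
    linear = sum(b * x for b, x in zip(discrimination, ideal_point)) - difficulty
    return standard_normal_cdf(linear)

# When β_j · x_i equals α_j the legislator is indifferent.
print(prob_yea_irt([2.0, 0.0], [0.25, 0.7], 0.5))  # 0.5
# Moving the ideal point along the discriminating dimension raises Pr(yea).
print(prob_yea_irt([2.0, 0.0], [1.0, 0.7], 0.5))
```

Note that the second coordinate of the ideal point has no effect here because the roll call's discrimination vector is zero in that dimension.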

This can be reinterpreted as an Item-Response model used (apparently) in educational testing, where β_j is the "item discrimination parameter" and α_j is the item difficulty parameter. Clinton, Jackman, and Rivers' "The Statistical Analysis of Roll Call Data" (2004) was the first to approach ideal point identification using Item-Response theory, at least so far as I can tell from the literature.

This led to a multitude of variants: emIRT improved performance, for example, while Martin and Quinn's work on Supreme Court justices' ideal points produced innovative algorithms which are Bayesian and dynamical (ideal points take a "random walk" in the issue space, as it were).

The dynamic approach turns out to be superior for analyzing how ideal points change over time. Specifically for questions of party realignment, Caughey and Schickler (2014) advise using a dynamic IRT model. Although such models are computationally intensive, progress has been made, and they now come conveniently packaged (e.g., in the idealstan R package).

Problems with Item-Response Models

First, Item-Response models are scale-invariant: we can rescale the coordinates for the policy space however much we want. So the numeric values of the ideal points may not matter so much as their relationship to each other.

Second, for policy spaces which are not 1-dimensional, item-response values are rotation invariant. For 1-dimensional policy spaces, item-response doesn't know whether to order values from most liberal to most conservative or vice-versa.

But both these problems can be solved using semi-informative priors in the Bayesian approaches.
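Both invariances can be checked numerically in the one-dimensional probit model. The sketch below uses hypothetical parameter values: flipping the signs of β and x together, or stretching x by a factor while shrinking β by the same factor, leaves every vote probability unchanged, so the data alone cannot tell these configurations apart.

```python
import math

def prob_yea(beta, x, alpha):
    """One-dimensional probit: Φ(β x - α)."""
    return 0.5 * (1.0 + math.erf((beta * x - alpha) / math.sqrt(2.0)))

beta, x, alpha = 1.5, 0.4, 0.2  # hypothetical item and legislator parameters

original = prob_yea(beta, x, alpha)
reflected = prob_yea(-beta, -x, alpha)             # swap "left" and "right"
rescaled = prob_yea(beta / 10.0, x * 10.0, alpha)  # stretch the axis tenfold

print(math.isclose(original, reflected), math.isclose(original, rescaled))  # True True
```

A prior that pins down, say, the sign and rough magnitude of a few legislators' ideal points is one way to break the tie.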

The third problem, perhaps more grave, is we are restricted to certain dimensions due to computational constraints. The NOMINATE algorithm could handle 8 dimensions, no problem; but item response algorithms struggle with determining ideal points in more than 2 dimensions within a sensible period of time.

Conclusion

If you are interested in an overview (a "big picture" of Congress) without concern about nuance, the NOMINATE scores may be good enough.

Although NOMINATE produces a decent approximate ideal point for legislators, it fails to adequately capture how a legislator evolves over multiple sessions. This makes it less than ideal for making any claims about party realignments.

Further, it fails to capture issue-specific nuances for each legislator. Presumably higher dimensionality fixes the problem, but giving, say, 16 numbers worsens the intuitive picture for a single legislator. It is unclear if the Item-Response families suffer the same problem. (See arXiv:1209.6004 for details.)

References

Nolan McCarty, Measuring Legislative Preferences. This review fleshes out more sordid details underpinning the general notion of "ideal points" than I have written about.

NOMINATE algorithms

Keith Poole and Howard Rosenthal's Congress: A Political-Economic History of Roll Call Voting. Oxford University Press, 1997.
Devin Caughey and Eric Schickler, "Substance and Change in Congressional Ideology: NOMINATE and Its Alternatives".
Keith Poole, Jeffrey Lewis, James Lo, Royce Carroll, "Scaling Roll Call Votes with wnominate in R". Journal of Statistical Software 42, 14 (2011).
Royce Carroll, Jeffrey B. Lewis, James Lo, Keith T. Poole, "Measuring Bias and Uncertainty in DW-NOMINATE Ideal Point Estimates via the Parametric Bootstrap". Eprint.

Item-Response Algorithms

Joshua Clinton, Simon Jackman, Douglas Rivers, "The Statistical Analysis of Roll Call Data". American Political Science Review 98, 2 (2004) pp. 355-370. Journal page.
idealstan: an R Package for Ideal Point Modeling with Stan
Andrew D. Martin and Kevin M. Quinn, "Dynamic Ideal Point Estimation via Markov Chain Monte Carlo for the U.S. Supreme Court, 1953-1999". Political Analysis 10 (2002) 134-153. (See esp. section 3 for the basic model.)
Imai, Kosuke, James Lo, and Jonathan Olmsted, "Fast Estimation of Ideal Points with Massive Data." American Political Science Review 110, No. 4 (2016), pp. 631-656. Eprint
Sean M. Gerrish, David M. Blei, "The Issue-Adjusted Ideal Point Model". arXiv:1209.6004

Political Arithmetic

Friday, April 26, 2019

Estimating Legislator Ideal Points