When we say things like "The South is solidly red," what does that mean? How do we measure it?

There are a variety of statistics ("indices"?) to gauge the magnitude of partisanship for the constituency of a region.

Cook PVI

The Cook PVI score tells us the party lean for a congressional district (or state), written as "R+n", "D+n", or "EVEN", where n is a positive integer (e.g., 1, 2, 3, 4, 5, etc.). It is basically computed as follows in pidgin code:

if (current_president_party == "D") {
    local_d_avg = (local_d_percent(last_election) +
                     local_d_percent(last_election - 4))/2;
    national_d_avg = (national_d_percent(last_election) +
                        national_d_percent(last_election - 4))/2;

    // round the difference once, so a tiny gap can never print "D+0"
    lean = round(local_d_avg - national_d_avg);
    if (lean > 0) {
        return "D+" + lean;
    } else if (lean < 0) {
        return "R+" + (-lean);
    } else {
        return "EVEN";
    }
} else if (current_president_party == "R") {
    local_r_avg = (local_r_percent(last_election) +
                     local_r_percent(last_election - 4))/2;
    national_r_avg = (national_r_percent(last_election) +
                        national_r_percent(last_election - 4))/2;

    // same idea, measured from the Republican side
    lean = round(local_r_avg - national_r_avg);
    if (lean > 0) {
        return "R+" + lean;
    } else if (lean < 0) {
        return "D+" + (-lean);
    } else {
        return "EVEN";
    }
} // other parties handled similarly

The "region" is either a congressional district or an entire state, but could in principle be arbitrary (e.g., each county could get its own PVI).

One problem with this is the reliance on voting for a single candidate for determining partisanship or political leaning. But a region whose voters (1) split their tickets, voting for the incumbent president and the opposition party for Congress, has a different political culture from one whose voters (2) vote the same party "down ballot".

Another line of criticism takes issue with working with 2 elections at a time. But really the Cook PVI is just the deviation of the simple moving average for a region compared to the simple moving average for the nation. This smooths out some of the fluctuating "noise".
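The moving-average reading can be sketched in Python. This is only a sketch, not Cook's official method: the names `sma` and `cook_pvi_label` are hypothetical, the shares are assumed to be Democratic percentages of the two-party presidential vote (in which case the Democratic and Republican branches of the pidgin code collapse into one), and the numbers in the usage lines are made up for illustration.

```python
def sma(values):
    """Simple moving average over the supplied window of values."""
    return sum(values) / len(values)


def cook_pvi_label(local_dem, national_dem):
    """Label a region's lean from Democratic two-party vote percentages.

    local_dem and national_dem are assumed to hold the two most recent
    presidential results as percentages (0-100). Hypothetical helper,
    not an official Cook Political Report implementation.
    """
    # Deviation of the regional moving average from the national one.
    lean = round(sma(local_dem) - sma(national_dem))
    if lean > 0:
        return f"D+{lean}"
    if lean < 0:
        return f"R+{-lean}"
    return "EVEN"


# Made-up inputs: a region versus the nation over two elections.
print(cook_pvi_label([58.0, 57.0], [51.0, 52.0]))  # D+6
print(cook_pvi_label([44.0, 45.0], [51.0, 52.0]))  # R+7
```

Widening the window to three or more elections only means passing longer lists, which is one way to experiment with the "2 elections at a time" criticism.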

Partisan Propensity Index

FiveThirtyEight's PPI is explained as:

Thus, Partisan Propensity Index, or PPI, is defined as the percentage chance that the Democrat would have won an open-seat race for Congress in a particular district given the conditions present, on average, between 2002 and 2008 (a period which conveniently featured two good cycles for Republicans and two good ones for Democrats). It contains only two variables: the Presidential vote share in the district and the percentage of households there with incomes below $25,000.

This is computed by using a logistic regression (apparently trained on "1" for Democratic winners, "0" for Republican winners).
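To make the shape of such a model concrete, here is a rough sketch of a two-variable logistic score. The function name `ppi`, its parameter names, and especially the coefficient values are assumptions of mine: FiveThirtyEight's fitted coefficients are not given here, so the numbers below are placeholders chosen only to exercise the function.

```python
import math


def ppi(pres_dem_share, low_income_share, intercept, b_pres, b_income):
    """Percentage chance (0-100) of a Democratic win under a logistic model.

    Both inputs are fractions in [0, 1]. The coefficients must come from a
    fitted regression; none are given in the text, so callers supply them.
    """
    z = intercept + b_pres * pres_dem_share + b_income * low_income_share
    return 100.0 / (1.0 + math.exp(-z))


# Placeholder coefficients, invented purely for illustration.
HYPOTHETICAL = dict(intercept=-12.0, b_pres=22.0, b_income=4.0)

print(ppi(0.50, 0.25, **HYPOTHETICAL))
print(ppi(0.60, 0.25, **HYPOTHETICAL))
```

Whatever the real coefficients are, the logistic form guarantees the output stays strictly between 0 and 100, which is what lets it be read as a percentage chance.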

One reservation I have, as with the Cook PVI, is that a lot of people vote one party for President and the opposition party for Congress. At present, I am not convinced the PPI avoids this problem.

The last reservation that comes to mind is the reliance on an economic indicator. This is nothing serious, per se, but could produce spurious results in an economic collapse or a sudden migration.

The prudent reader will realize that the Cook PVI and the PPI both use federal election data, i.e., results for sending candidates to the White House or to Capitol Hill. Both measures ignore state-level data (e.g., for the gubernatorial race, or for state legislature races).

Ranney Index

The Ranney Index measures (over a specified time period) the average of:

the proportion of seats held by the Democrats in the state's lower legislature,
the proportion of seats held by the Democrats in the state's upper legislature,
the Democratic gubernatorial candidate's percentage in the election,
the proportion of time the Democrats held both (a) majorities in both chambers of the state legislature, and (b) the governorship.

Since each of these factors is itself between 0 and 1, the Ranney index will be between 0 (super Republican) and 1 (super Democratic).

The folded Ranney index, computed as 1 − abs(Ranney − 0.5), tells us how competitive the state is. It ranges from 0.5 for solidly partisan states (Ranney near 0 or 1) to 1.0 for competitive states (Ranney near 0.5).
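Both quantities are simple enough to sketch directly. The function and parameter names below are my own, and the four proportions in the usage lines are made up. Note that with the formula as written, the folded value is 1.0 at a Ranney index of 0.5 and drops to 0.5 at either extreme.

```python
def ranney_index(lower_seats, upper_seats, gov_vote, unified_control):
    """Average the four Democratic proportions, each a fraction in [0, 1]."""
    parts = [lower_seats, upper_seats, gov_vote, unified_control]
    return sum(parts) / len(parts)


def folded_ranney(ranney):
    """Fold the index around 0.5: 1.0 is most competitive, 0.5 is one-party."""
    return 1 - abs(ranney - 0.5)


# Made-up proportions for one state over some period.
r = ranney_index(0.60, 0.55, 0.52, 0.25)
print(r, folded_ranney(r))
```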

Another variant is to use a moving average of the Ranney Index computed for each election, with the "time interval" being 2 gubernatorial elections. This mirrors the simple moving average scheme used in the Cook PVI.

This has the exact opposite problem as the Cook PVI and the PPI statistics, namely an exclusive focus on state-level elections while ignoring federal elections.

Major Party Index

James W. Ceaser and Robert P. Saldin's A New Measure of Party Strength (2005) presents a different metric, the Major Party Index (MPI). The MPI is, like the Ranney index, a weighted average, but split evenly between federal elections and state-level elections.

The first component of the MPI and the first nationally based measure is the two-party vote for president in each state's most recent presidential election. Thus in 2000, the value is based on the election result of that year. For 2002 the same value is entered because there was no presidential election in 2002. This factor accounts for 25 percent of the total index value for each state.

The second MPI component, also at the national level, is the two-party vote in each state's two most recent United States Senate elections and accounts for 12.5 percent of the total index value. To take an example, Idaho's U.S. Senate value is calculated by averaging the results from the 1998 and 2002 Senate elections. By taking both of the Senate elections into account, despite the time lag on the former, the MPI attempts to reflect partisan voting from year to year without over-emphasizing current partisan swings. By including a Senate result every year, it evens out results among different states.

The third component, and the final national measure, is the total two-party average of all votes in each state's biennial U.S. House elections. Virginia's value, for example, is obtained by adding Republican and Democratic votes in all congressional districts and calculating each party's percentage of this total. This method provides a more accurate measure of the overall state partisan choice than would be obtained by averaging the two-party percentages of each district because districts differ in population and turnout. In addition, it reduces the impact of uncontested seats. This component of the MPI accounts for 12.5 percent of each state's total score.

The fourth component of the MPI, and the first state-level measure, is the two-party vote percentage in each state's most recent gubernatorial election. This component accounts for 25 percent of each state's total.

The fifth measure, also at the state level, is the two-party percentage of all seats in each state's Senate. The Major Party Index value is determined in this case by dividing the number of Republican seats by the sum of Democratic and Republican seats. There were two reasons in this case for using seats (as opposed to votes) as the basis of calculation: vote totals for state legislative elections are difficult to find, and many of these races are uncontested (a large number of uncontested seats skews the two-party vote totals). This component is weighted as 12.5 percent of the total index.

The final component of the MPI, and the third state-level measure, is the two-party percentage of seats in state houses. This score is calculated in the same way as that for the state senate and is worth 12.5 percent of the total index value.

Nebraska has a nonpartisan, unicameral state legislature. For Nebraska we omit the state legislature factors, then reweight everything accordingly. (Presumably the president factor is weighted 1/3, the U.S. House and Senate factors are each weighted 1/6, and the governor is weighted 1/3?)

Thus the formula is:

MPI = ((Most recent 2-Party Republican Presidential Vote)*0.25) + 
    ((Average of Two Most Recent Republican 2-Party Votes for US Senate)*0.125) + 
    ((Republican 2-Party Percent of all US House Votes)*0.125) +
    ((Most Recent 2-Party Republican Vote for Governor)*0.25) +
    ((2-Party Republican Percentage of Seats in the State Senate)*0.125) +
    ((2-Party Republican Percentage of Seats in the State House)*0.125).

We could, for ease, rescale this to be between −1 for Democratic states to +1 for Republican states.
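The formula, the tentative Nebraska adjustment, and the rescaling can be sketched together. The names `WEIGHTS`, `mpi`, and `rescale` are hypothetical, the component values are assumed to be Republican two-party shares expressed as fractions in [0, 1], and renormalizing the remaining weights is my guess at how a missing component should be handled, not something the paper's formula states. The numbers in the usage lines are made up.

```python
# Weights taken from the formula; they sum to 1.
WEIGHTS = {
    "president": 0.25,
    "us_senate": 0.125,
    "us_house": 0.125,
    "governor": 0.25,
    "state_senate": 0.125,
    "state_house": 0.125,
}


def mpi(components):
    """Weighted average of Republican two-party shares (fractions in [0, 1]).

    Components absent from the mapping (e.g. Nebraska's legislature) are
    dropped and the remaining weights are renormalized to sum to 1.
    """
    used = {k: w for k, w in WEIGHTS.items() if k in components}
    total = sum(used.values())
    return sum(components[k] * w for k, w in used.items()) / total


def rescale(value):
    """Map [0, 1] onto [-1, +1]: -1 is Democratic, +1 is Republican."""
    return 2 * value - 1


# Made-up shares for a state with no partisan legislature data.
nebraska_like = {"president": 0.60, "us_senate": 0.58,
                 "us_house": 0.62, "governor": 0.57}
score = mpi(nebraska_like)
print(score, rescale(score))
```

Dropping the two legislature entries leaves weights of 0.25, 0.125, 0.125, and 0.25, which renormalize to 1/3, 1/6, 1/6, and 1/3.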

The only problem with this approach, and this stems from my own laziness, is there is no good data source for Governor election data. Or, at least, none that I could find.

Also, turnover from resignations is far higher in state legislatures. (State legislators resign in disgrace far more frequently than federal legislators.) It's unclear to me how to handle the situation when the replacements belong to different parties.

Equally unclear to me is how special elections are handled for the federal offices. When a senator dies or resigns, some states have a special election for replacing the senator. Do we include special election percentages in the MPI?

Concluding Remarks

It would be useful to compute the Cook PVI for the states, as well as the MPI for the states, to see if these correlate or not. Similarly if there's any correlation between the Cook PVI and the Ranney index, that would be interesting to investigate.

We could in principle restrict the region size to counties (or county-equivalents) and use vote data to estimate indices at that level. Although county populations range from about 100 (rural Texas) to 10 million (Los Angeles), the intent is to apply the formulas to different sub-regions of states, to see how they compare within a state.

(This was more or less a summary I've had lying around for over a year, and I thought I ought to quit polishing it and just publish the thing.)

Political Arithmetic

Friday, July 24, 2020

Measuring a region's partisanship: a survey