Tentative Definition. A random variable assigns to each outcome of a given "random phenomenon" or "random process" some (real) number, or a vector ("list") of numbers.
Examples. The following are all random variables.
- Let X be the number of heads in 10 coin flips.
- Let R be the number of times a given baseball pitcher strikes out an opponent in the course of a given game.
- Let S be the number of "successes" (heads) until the first "failure" (tail) in a repeated trial (flipping a coin over and over again).
- Let T be the waiting time (in minutes) until the next bus arrives.
Slightly more formal Definition. If we represent the possible outcomes for a given "random process" (i.e., the set underlying its sigma algebra) by Ω, then a random variable is a function \(X\colon\Omega\to\mathbb{R}\) (or possibly \(X\colon\Omega\to\mathbb{R}^{k}\) for some fixed positive integer k) such that for any \(x\in\mathbb{R}\) the preimage of smaller values is an event \(\{\omega\in\Omega:X(\omega)\leq x\}\in\Sigma\) ("is measurable", i.e., we can assign a probability to that preimage). We will write \(\{X\leq x\} = \{\omega\in\Omega:X(\omega)\leq x\}\) as an abuse of notation.
Remark. We should remember that the data describing a "random phenomenon" is not just the space of outcomes \(\Omega\), but also a specific set of well-defined events \(\Sigma\subseteq\mathcal{P}(\Omega)\) (the sigma algebra). The events must satisfy a few requirements, for instance "something happens", \(\Omega\in\Sigma\); for any event \(E\in\Sigma\) its complement is also a well-defined event, \(\Omega\setminus E\in\Sigma\) ["an event does not happen"]; for any countable family of events \(\{E_{j}\}_{j\in J}\subseteq\Sigma\) their union is also a well-defined event, \(\bigcup_{j\in J}E_{j}\in\Sigma\) ["at least one of these events happens"]; and so forth. Implicitly we also fix a probability measure on the set of well-defined events, generically denoted \(\Pr(-)\). Altogether, this is the data necessary to describe some "random phenomenon".
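As a concrete illustration, here is a minimal sketch of this data for two fair coin flips: the outcomes, the events (for a finite sample space we can take every subset), and a probability measure. The names `Omega`, `Sigma`, and `prob` are our own.

```python
from itertools import combinations

# Sample space for two fair coin flips.
Omega = {"HH", "HT", "TH", "TT"}

# For a finite sample space we may take the sigma algebra to be the full
# power set: every subset of Omega is a well-defined event.
def power_set(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

Sigma = power_set(Omega)

# A probability measure: each outcome is equally likely, so the probability
# of an event is proportional to how many outcomes it contains.
def prob(event):
    return len(event) / len(Omega)

print(prob(frozenset({"HH", "HT"})))  # Pr("first flip is heads") = 0.5
print(prob(frozenset(Omega)))         # Pr(Omega) = 1.0
```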
Example. Let E be any event. Then the indicator function \(I_{E}\), defined by \(I_{E}(\omega)=0\) for any \(\omega\notin E\) and \(I_{E}(\omega)=1\) for all \(\omega\in E\), is a random variable. For multiple events \(E_{1},\dots, E_{n}\), we find \(I_{E_{1}\cup\dots\cup E_{n}}(\omega)=\max(I_{E_{1}}(\omega),\dots,I_{E_{n}}(\omega))\) and \(I_{E_{1}\cap\dots\cap E_{n}}(\omega) = \prod_{j} I_{E_{j}}(\omega)\). For discrete random phenomena, these indicator functions are the basic building blocks for constructing other random variables.
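We can check these identities on a toy sample space; a quick sketch (the helper `indicator` is our own name):

```python
# Indicator random variables on the sample space of a single die roll.
Omega = [1, 2, 3, 4, 5, 6]
E1 = {2, 4, 6}   # "the roll is even"
E2 = {4, 5, 6}   # "the roll is at least 4"

def indicator(E):
    """Return the indicator random variable I_E as a function on Omega."""
    return lambda omega: 1 if omega in E else 0

I_union = indicator(E1 | E2)
I_inter = indicator(E1 & E2)

for omega in Omega:
    assert I_union(omega) == max(indicator(E1)(omega), indicator(E2)(omega))
    assert I_inter(omega) == indicator(E1)(omega) * indicator(E2)(omega)
```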
What happens to all that sigma-algebra baggage? Given a random variable, we can ask "What is the probability its value will be in a given range?" For example, "What is the probability the starting pitcher for the Dodgers will strike out at least 30 batters?"
This would be computed by first assembling all possible outcomes which satisfy this condition, \(\mathcal{E}=\{\omega\in\Omega : R(\omega)\geq30\}\subseteq\Omega\), and then applying the probability measure on the sample space for the random process, \(\Pr(\mathcal{E})\), or more imaginatively \(\Pr(R\geq30)\).
As a caveat, we hasten to add, the definition only guarantees that inequalities like \(\Pr(X\leq x)\) are well-defined. We nevertheless abuse notation and write \(\Pr(X=x)\) for the probability mass function of a discrete random variable. (This gets tricky with subtle nuances when dealing with continuous random variables instead of discrete ones, where \(\Pr(X=x)\) is typically zero.)
This induces a nice mathematical structure on the image of the random variable \(R(\Omega)\), namely we can "transport" the probability distribution from the sigma algebra \(\Sigma\) on \(\Omega\) to \(R(\Omega)\). This is the (cumulative) Distribution Function for the random variable, \(F_{R}(x) = \Pr(R\leq x)\).
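For a discrete random variable on a finite sample space, the distribution function can be computed directly from the definition. A minimal sketch (the function `cdf` and the two-flip example are our own):

```python
from fractions import Fraction

# Two fair coin flips; R counts the number of heads.
Omega = ["HH", "HT", "TH", "TT"]
Pr = {omega: Fraction(1, 4) for omega in Omega}
R = {omega: omega.count("H") for omega in Omega}

def cdf(x):
    """F_R(x) = Pr(R <= x), summing over the outcomes in the preimage."""
    return sum(p for omega, p in Pr.items() if R[omega] <= x)

print(cdf(0))  # 1/4
print(cdf(1))  # 3/4
print(cdf(2))  # 1
```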
Equivalence relation of Random Variables. If we have two random variables, say, \(X\) and \(Y\), we say they are Equivalent (or equal in distribution) if for any \(x\in\mathbb{R}\) we have \(\Pr(X\leq x) = \Pr(Y\leq x)\). This is usually denoted \(X\sim Y\).
Algebra of Random Variables. Given some random variables X, Y on the same sample space, we can define new random variables \(X+Y\), \(X - Y\), \(XY\), \(X/Y\) provided Y is never zero, and exponentiation \(X^{Y} = \exp(Y\log(X))\) provided X is always positive. The intuition to have is that the operations are done as operations on real-valued functions.
So, specifically, if \(\omega\in\Omega\), then \((X+Y)(\omega)=X(\omega)+Y(\omega)\), \((X - Y)(\omega) = X(\omega) - Y(\omega)\), \((XY)(\omega) = X(\omega)Y(\omega)\), \((X/Y)(\omega) = X(\omega)/Y(\omega)\) provided \(Y(\omega)\neq0\), and exponentiation \((X^{Y})(\omega) = \exp(Y(\omega)\log(X(\omega)))\).
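In code this pointwise picture is literal: a random variable is just a function on \(\Omega\), and the operations combine function values. A sketch with our own names (the example X is always positive, as the formula \(\exp(Y\log X)\) requires):

```python
import math

Omega = ["HH", "HT", "TH", "TT"]
X = lambda omega: omega.count("H") + 1   # always positive, so X**Y makes sense
Y = lambda omega: omega.count("T")

def rv_add(X, Y): return lambda omega: X(omega) + Y(omega)
def rv_mul(X, Y): return lambda omega: X(omega) * Y(omega)
def rv_pow(X, Y): return lambda omega: math.exp(Y(omega) * math.log(X(omega)))

print([rv_add(X, Y)(omega) for omega in Omega])  # pointwise sums
print([rv_mul(X, Y)(omega) for omega in Omega])  # pointwise products
print([rv_pow(X, Y)(omega) for omega in Omega])  # X(omega) ** Y(omega), as floats
```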
Probability Distributions. We have a few "standard" distributions which are the "template" for various random processes. For example, the number of heads when we flip a coin once follows a Bernoulli Distribution, and the number of heads when we flip it N times follows a Binomial Distribution.
The notation for these families may vary from reference to reference. A Bernoulli distribution with probability p of success is usually denoted \(\mathrm{Bernoulli}(p)\) or \(\mathrm{Ber}(p)\).
To indicate a random variable is distributed like one of these standard distributions, we abuse notation and write \(X\sim \mathrm{Bernoulli}(p)\).
We can build more distributions out of a handful of basic ones. For example, \(Y = X_{1} + \dots + X_{N}\), where the \(X_{j}\sim\mathrm{Bernoulli}(p)\), describes flipping a coin N times and counting the number of "successes" ("heads"). This Y follows the Binomial distribution; for instance, \(\Pr(Y\leq k)\) is the probability that there are at most k heads in N coin flips.
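We can check this numerically: simulate N Bernoulli(p) trials many times and compare the observed frequencies of Y against the binomial probability mass function \(\binom{N}{k}p^{k}(1-p)^{N-k}\). A sketch using only the standard library (the parameter values are our own choices):

```python
import random
from math import comb

N, p, trials = 10, 0.3, 100_000
random.seed(0)

# Y = X_1 + ... + X_N, each X_j ~ Bernoulli(p).
counts = [0] * (N + 1)
for _ in range(trials):
    y = sum(1 for _ in range(N) if random.random() < p)
    counts[y] += 1

for k in range(N + 1):
    empirical = counts[k] / trials
    exact = comb(N, k) * p**k * (1 - p)**(N - k)   # Binomial(N, p) pmf
    print(f"k={k:2d}  empirical={empirical:.4f}  exact={exact:.4f}")
```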
We can specify a probability distribution by its parameters (like p in the Bernoulli distribution), the probability mass function and/or the probability density function. Often it's useful to give other summary statistics alongside this data.
Expected Value. We also have for any random variable X its Expected Value, given by \(\mathbb{E}[X] = \sum_{x\in X(\Omega)}x\Pr(X=x)\) (or replacing the sum with an integral for continuous random variables). The intuition we should have for the expected value of a random variable is that it captures the "average value" of the random variable.
If we have some function \(f\colon\mathbb{R}\to\mathbb{R}\), then we have \(\mathbb{E}[f(X)] = \sum_{x\in X(\Omega)}f(x)\Pr(X=x)\) (and again, an integral instead of a sum for continuous random variables, with the restriction that f is an integrable function).
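Both formulas are direct sums for a discrete random variable. A minimal sketch (the names `pmf` and `expectation` are ours) computing \(\mathbb{E}[X]\) and \(\mathbb{E}[X^{2}]\) for the number of heads in two fair coin flips:

```python
# pmf for X = number of heads in two fair flips.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def expectation(f, pmf):
    """E[f(X)] = sum over x of f(x) * Pr(X = x)."""
    return sum(f(x) * p for x, p in pmf.items())

print(expectation(lambda x: x, pmf))      # E[X]   = 1.0
print(expectation(lambda x: x**2, pmf))   # E[X^2] = 1.5
```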
Note that, in general, \(\mathbb{E}[X^{2}]\neq(\mathbb{E}[X])^{2}\) and, more generally, \(\mathbb{E}[XY]\neq \mathbb{E}[X]\mathbb{E}[Y]\) (equality holds when X and Y are uncorrelated, e.g. when they are independent). But we do have \(\mathbb{E}[X + Y] = \mathbb{E}[X] + \mathbb{E}[Y]\) and, for any real number \(a\in\mathbb{R}\), \(\mathbb{E}[aX] = a\mathbb{E}[X]\).
Exercise. Let \(E\in\Sigma\) be an event in a sigma algebra, and \(I_{E}\) be the indicator function on E. What is \(\mathbb{E}[I_{E}]\)?
Theorem. Let X and Y be random variables, and let a and b be real numbers. Then (a numerical spot-check follows the list):
- \(\mathbb{E}[aX + b] = a\mathbb{E}[X] + b\)
- \(\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y]\)
- \(\displaystyle\mathbb{E}[XY] = \sum_{\omega\in\Omega}X(\omega)Y(\omega)\Pr(\omega)\) (for a discrete sample space, writing \(\Pr(\omega)\) for \(\Pr(\{\omega\})\))
- \(\displaystyle\mathbb{E}[X/Y] = \sum_{\omega\in\Omega}\frac{X(\omega)}{Y(\omega)}\Pr(\omega)\) provided \(Y(\omega)\neq0\) for every \(\omega\in\Omega\)
- \(\displaystyle\mathbb{E}[X^{Y}] = \sum_{\omega\in\Omega}\exp(Y(\omega)\log(X(\omega)))\Pr(\omega)\) provided \(X(\omega)>0\) for every \(\omega\in\Omega\)
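A numerical spot-check of the first two identities on the two-flip sample space, a minimal sketch with the expectation over \(\Omega\) written out explicitly:

```python
from fractions import Fraction

Omega = ["HH", "HT", "TH", "TT"]
Pr = {omega: Fraction(1, 4) for omega in Omega}
X = {omega: omega.count("H") for omega in Omega}
Y = {omega: omega.count("T") for omega in Omega}

def E(Z):
    """Expected value of a random variable given as a dict over Omega."""
    return sum(Z[omega] * Pr[omega] for omega in Omega)

a, b = 3, 2
assert E({w: a * X[w] + b for w in Omega}) == a * E(X) + b   # E[aX + b] = a E[X] + b
assert E({w: X[w] + Y[w] for w in Omega}) == E(X) + E(Y)     # E[X + Y] = E[X] + E[Y]
```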
Variance. If expected value tells us what neighborhood a random variable is likely to live in, the variance tells us how spread out this neighborhood is. We define it as \[\mathrm{Var}[X] = \mathbb{E}[X^2] - (\mathbb{E}[X])^2 = \sum_{\omega}(X(\omega) - \mathbb{E}[X])^{2}\Pr(\omega).\] The variance and expected value for a random variable contain a lot of useful information, which we use when trying to infer parameters from data.
More generally, we have for any two random variables X and Y a measure of how they vary together, the covariance \[\mathrm{Cov}(X,Y) = \sum_{\omega}(X(\omega)-\mathbb{E}[X])(Y(\omega)-\mathbb{E}[Y])\Pr(\omega)\] which is such that \(\mathrm{Cov}(X,X)=\mathrm{Var}[X]\). Correlatedness is not the same as independence: independent random variables are uncorrelated, but uncorrelated random variables may or may not be independent (an "all salmon are fish, but not all fish are salmon" type of statement). So uncorrelated is a "weaker" property than independence.
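The classic example of uncorrelated but dependent random variables takes X uniform on \(\{-1,0,1\}\) and \(Y = X^{2}\): then \(\mathrm{Cov}(X,Y)=0\) even though Y is completely determined by X. A sketch (the helper names are ours):

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}, Y = X^2: dependent, yet uncorrelated.
Omega = [-1, 0, 1]
Pr = {omega: Fraction(1, 3) for omega in Omega}
X = {omega: omega for omega in Omega}
Y = {omega: omega**2 for omega in Omega}

def E(Z):
    return sum(Z[w] * Pr[w] for w in Omega)

def cov(A, B):
    """Cov(A, B) = sum over omega of (A - E[A])(B - E[B]) Pr(omega)."""
    mA, mB = E(A), E(B)
    return sum((A[w] - mA) * (B[w] - mB) * Pr[w] for w in Omega)

print(cov(X, Y))   # 0: uncorrelated, despite Y being a function of X
print(cov(X, X))   # 2/3, which equals Var[X]
```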
Applications?
Iterate! Note that we can treat the parameters of these distributions (like p, the probability of success in a Bernoulli trial) as random variables themselves. This is precisely what Bayesian data analysis does: the parameters are random variables following prior probability distributions, which we update as new data becomes available using Bayes's theorem.
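For instance, treating the Bernoulli parameter p as a random variable with a \(\mathrm{Beta}(\alpha,\beta)\) prior, observing k successes in n trials updates it to a \(\mathrm{Beta}(\alpha+k,\beta+n-k)\) posterior (the standard conjugate pair; the numbers below are made up for illustration):

```python
# Beta-Bernoulli conjugate update: prior Beta(alpha, beta), data = k successes in n trials.
alpha, beta = 1, 1     # uniform prior on p
k, n = 7, 10           # hypothetical data: 7 heads in 10 flips

alpha_post, beta_post = alpha + k, beta + (n - k)
posterior_mean = alpha_post / (alpha_post + beta_post)
print(f"posterior: Beta({alpha_post}, {beta_post}), mean {posterior_mean:.3f}")
```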
Regressions! A linear regression basically says that the observations are really values of a random variable, i.e., \(Y\sim\mathcal{N}(aX + b, \sigma^{2})\) where \(\mathcal{N}\) is the normal distribution. There are other useful regressions, but this is the basic idea.
We should admit that this is one formulation of regressions in terms of random variables. The other uses conditional random variables, \(Y|X\sim f(\beta\cdot X,\theta)\), when the regressors (X) are stochastic (i.e., "not controlled by the experimenter/statistician"). Formally these are different models, but when actually carrying out the regression they are treated "the same".
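A minimal sketch of the first formulation, assuming NumPy is available: simulate data from \(Y = aX + b + \text{noise}\) and recover a and b by least squares, which is the maximum-likelihood fit under the normal model (the true values 2 and 1 are our own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, size=n)
y = 2.0 * x + 1.0 + rng.normal(0, 1.5, size=n)   # Y ~ N(2x + 1, 1.5^2)

# Least squares for [a, b] in y = a*x + b, i.e. the MLE under the normal model.
A = np.column_stack([x, np.ones(n)])
(a_hat, b_hat), *_ = np.linalg.lstsq(A, y, rcond=None)
print(a_hat, b_hat)   # should come out close to 2 and 1
```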
Tests! We often do an experiment, producing some data points \(x_{1},\dots,x_{n}\), which we interpret as values of a random variable X following a prescribed distribution. We test the assumption (that X follows the given distribution with specific parameters) by comparing the sample mean \[ \mu = \frac{1}{n}\sum_{j=1}^{n}x_{j} \] to the expected value \(\mathbb{E}[X]\). The central limit theorem suggests that \(\sqrt{n}(\mu - \mathbb{E}[X])\) looks like a normal random variable centered at 0 with variance approximately \(\mathrm{Var}[X]\), which we estimate by the variance of the data points (loosely speaking).
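We can watch the central limit theorem at work by simulation: draw many samples of size n from a non-normal distribution (here Bernoulli(0.3), our own choice), compute \(\sqrt{n}(\mu - \mathbb{E}[X])\) for each sample, and look at its mean and spread. A sketch using only the standard library:

```python
import random
import statistics

random.seed(0)
p, n, reps = 0.3, 200, 5_000
EX, VarX = p, p * (1 - p)          # E[X] and Var[X] for Bernoulli(p)

zs = []
for _ in range(reps):
    sample = [1 if random.random() < p else 0 for _ in range(n)]
    mu = sum(sample) / n           # sample mean
    zs.append(n**0.5 * (mu - EX))  # sqrt(n) * (mu - E[X])

print(statistics.mean(zs))       # close to 0
print(statistics.variance(zs))   # close to Var[X] = 0.21
```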
Reading List
- Discrete Random Variables and their Expected Values, MIT OCW 18-05 Introduction to Probability and Statistics
- Expected Value and Variance, lecture notes.
- Statistical Machine Learning, by Han Liu and Larry Wasserman, specifically chapter 12 on Bayesian inference using random variables.