These are my random notes on polling. I don't expect anything revolutionary to be contained here; I'm just hoping to consolidate them in one place.
Topics:
- Margin of error
- Likely voters
- Averaging polls
- Polls conducted by phone versus by internet (think about, add later)
Margin of Error
Big idea. When we conduct a poll, we ask a subset of the population for their response. There is some error when we extrapolate the results from this sample of the population to the population as a whole. By "error" I mean the estimates will be off by a few percentage points; I do not mean the extrapolation is invalid.
The "off by a few percentage points" is called the sampling error. We can estimate it, and the estimate is referred to as the "margin of error."
Mathematical Details
Intuition: The margin of error for a poll of n respondents (out of a population of N individuals) asked a question is the half-width of the confidence interval for the response (the interval is the estimate plus or minus the margin of error).
For a "large enough sample" on a binary question, we have a binomially distributed sample, and can use the normal approximation. We then determine some level of confidence \(\gamma\) to determine a z-score \(z_{\gamma}\) using the quantile function for the Normal distribution, which tells us how many standard deviations wide the confidence interval needs to be. We approximate the standard deviation using the "standard error", which in turn is approximated by \(\sqrt{s^{2}/n}\) the squareroot of the sample variance of the response divided by the sample size.
This is relatively unenlightening, and there are technical matters which (I think) are contentious (at least from a Bayesian perspective). It's also really hard to interpret the margin of error: it's easy to misread it as "95% probability the true value lies in this interval", whereas it's really saying "if we repeated this poll a large number of times, 95% of those polls would result in a confidence interval containing the population parameter".
Puzzle MOE1. Is there a better Bayesian replacement for the margin of error for a given poll? Presumably credible intervals, but is there a quick way to get one without heavy computation?
Heuristic. The 95%-confidence margin of error for a binary question on a survey of n respondents may be approximated as \(1/\sqrt{n}\).
This is because the margin of error is bounded by (i.e., less than or equal to) its value in the case where the true probability (the proportion of "yes" responses) is 1/2, which produces \(moe = z_{0.95}\sqrt{0.5(1-0.5)/n} \approx 1.96\times 0.5/\sqrt{n}\leq 1/\sqrt{n}\).
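A quick numerical check of the heuristic against the worst-case formula (the sample sizes here are arbitrary):

```python
# Compare the worst-case (p = 1/2) 95% margin of error with the
# pessimist's 1/sqrt(n) shortcut; sample sizes are arbitrary.
from math import sqrt

for n in (500, 1000, 2000):
    exact = 1.96 * sqrt(0.5 * 0.5 / n)   # worst-case 95% margin of error
    heuristic = 1 / sqrt(n)              # the 1/sqrt(n) shortcut
    print(n, round(exact, 4), round(heuristic, 4))
```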
Coincidentally, if we used Bayesian reasoning and estimated the posterior distribution of the proportion of the population who would answer "yes" using a Beta distribution updated with the survey data, then the half-width of the 95% credible interval is also decently approximated by \(1/\sqrt{n}\). (Using \(2\sqrt{\operatorname{var}[\theta]}\) gives approximately the same result, but \(1/\sqrt{n}\) is for pessimists like me.)
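A minimal sketch of that conjugate Beta update, assuming a uniform Beta(1, 1) prior (my assumption) and made-up poll numbers; the Beta quantile function hands us the credible interval without any heavy computation:

```python
# Conjugate Beta update for the "yes" proportion, assuming a uniform
# Beta(1, 1) prior; poll numbers are made up for illustration.
from math import sqrt
from scipy.stats import beta

n, yes = 1000, 480                        # respondents and "yes" answers
posterior = beta(1 + yes, 1 + (n - yes))  # Beta posterior for the "yes" proportion

lo, hi = posterior.ppf([0.025, 0.975])    # central 95% credible interval
print((hi - lo) / 2)                      # half-width, roughly 0.031
print(2 * posterior.std())                # 2 * posterior standard deviation, similar
print(1 / sqrt(n))                        # the pessimist's 1/sqrt(n), about 0.0316
```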
Nonresponse Error
One difficulty to note is when someone being polled by phone hangs up before completing the survey (or, if in person, walks away from the questioner, or whatever). If this happens frequently enough, it impacts the reliability of the poll and effectively increases its margin of error.
For many years, the response rate was viewed as a measure of a poll's quality. This heuristic is hard to validate.
We don't have an adequate way to digest polls with a high incompletion rate, or even a clear notion of what qualifies as a "high incompletion rate".
Puzzle MOE2. Can we have some approximate formula relating the nonresponse rate to the poll quality?
Likely Voters
Some polling firms ask questions to gauge whether the respondent is a likely voter or not. What does this mean? Not every registered voter votes, and we'd like to filter out the nonvoters; what's left are generically referred to as "likely voters". The exact statistical model sometimes remains undisclosed, as it's the "secret sauce" for polling firms. (Gallup is a notable exception.)
There was some work done by Pew suggesting the likely voter model works fairly well, but can be improved if the respondent's voting history is known (and improved further with some magical machine learning algorithms).
The quality of a poll improves when it reports the results from likely voters, though this is far more costly to the polling firm.
Puzzle LV1. Is there some statistical way to infer how the reliability improves when a poll surveys likely voters as opposed to registered voters?
Poll Aggregation
This is the fancy term used for "combining polls". Let's consider some real data I just took from RealClearPolitics (RV = registered voters, LV = likely voters):
Poll | Date | Sample | MOE (±pts) | Biden (%) | Trump (%) | Margin |
---|---|---|---|---|---|---|
Economist/YouGov | 5/23 - 5/26 | 1157 RV | 3.4 | 45 | 42 | Biden +3 |
FOX News | 5/17 - 5/20 | 1207 RV | 3.0 | 48 | 40 | Biden +8 |
Rasmussen Reports | 5/18 - 5/19 | 1000 LV | 3.0 | 48 | 43 | Biden +5 |
CNBC | 5/15 - 5/17 | 1424 LV | 2.6 | 48 | 45 | Biden +3 |
Quinnipiac | 5/14 - 5/18 | 1323 RV | 2.7 | 50 | 39 | Biden +11 |
The Hill/HarrisX | 5/13 - 5/14 | 950 RV | 3.2 | 42 | 41 | Biden +1 |
Harvard-Harris | 5/13 - 5/14 | 1854 RV | 2.0 | 53 | 47 | Biden +6 |
There are a variety of ways to go about combining them. The most dangerous way is what RealClearPolitics does: just take the unweighted average of the responses. For example, take the column of respondents favoring Biden and average it (which R tells me gives 47.71429%). For Trump, the simple average is 42.42857%. Together these sum to 90.14286% (only one poll, Harvard-Harris, has Biden's and Trump's support sum to 100%).
We don't have any way to gauge the margin of error of this estimate, though, and we don't reward larger polls any more than smaller polls.
If we took the weighted mean (weighted by sample size), Biden would receive 48.30791% and Trump 42.80864%, with 91.11655% favoring one or the other.
We can further adjust the weights, rewarding likely voter polls (or, equivalently, penalizing all others; for example, weight registered-voter polls in proportion to the fraction of registered voters who turned out to vote in the last presidential election, something like 0.58).
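Here's a sketch of all three averages (simple, sample-size-weighted, and turnout-adjusted) computed from the table above; the 0.58 turnout factor for registered-voter polls is the rough figure mentioned above, not an official number:

```python
# Three averaging schemes applied to the RealClearPolitics table above.
import numpy as np

sample = np.array([1157, 1207, 1000, 1424, 1323, 950, 1854])        # respondents
lv     = np.array([False, False, True, True, False, False, False])  # likely-voter poll?
biden  = np.array([45, 48, 48, 48, 50, 42, 53])
trump  = np.array([42, 40, 43, 45, 39, 41, 47])

# 1. Simple (RealClearPolitics-style) average.
print(biden.mean(), trump.mean())              # 47.714..., 42.428...

# 2. Weighted by sample size.
print(np.average(biden, weights=sample),       # 48.307...
      np.average(trump, weights=sample))       # 42.808...

# 3. Sample-size weights, with registered-voter polls down-weighted by the
#    rough turnout among registered voters in the last presidential election.
w = sample * np.where(lv, 1.0, 0.58)
print(np.average(biden, weights=w), np.average(trump, weights=w))
```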
Since the margin of error is all too frequently misinterpreted, it's probably better not to contrive some composite margin of error for the aggregate.
References
- Margin of Error
  - Andrew Mercer, 5 key things to know about the margin of error in election polls. Pew Research Center, 8 September 2016.
  - Gary Langer, Sampling Error: What It Means. ABC News, 8 October 2008.
- Likely Voters
  - Scott Clement, Why the ‘likely voter’ is the holy grail of polling. Washington Post, 7 January 2016.
  - Gallup(?), What is the difference between registered voters and likely voters? Gallup, not dated.
  - Scott Keeter and Ruth Igielnik, Can Likely Voter Models Be Improved? Pew Research Center, 7 January 2016.
  - Carl Bialik, Election Handicappers Are Using Risky Tool: Mixed Poll Averages. Wall Street Journal, 15 February 2008.