How do we handle propositional logic when we don't know with certainty the truth or falsehood of a proposition? What if we had some probability that a proposition is true? Subjective logic is a toolkit to handle this problem.
Opinions
Definition 1. The basic idea is that, for a proposition x, an actor A holds an Opinion about x, which is a 4-tuple \(\omega_{x}^{A} = (b_{x}^{A}, d_{x}^{A}, u_{x}^{A}, a_{x}^{A})\) where the superscript tracks the opinion-holder, the subscript tracks the proposition opined about, and the components are interpreted as:
- b is the belief that x is true;
- d is the disbelief in x, i.e., the belief that x is false;
- u is the uncertainty in x, i.e., the unallocated belief about x;
- a is the prior probability that, absent all evidence, x is true.
The components b, d, and u are nonnegative and satisfy \(b_{x}^{A} + d_{x}^{A} + u_{x}^{A} = 1\).
We may omit the superscript when it's clear we're talking about a single actor. (We may, out of laziness, omit the subscript, but only when it's absolutely clear what proposition is being discussed.)
Remark 1.1 (Cromwell's Law). In the real world, we never have "absolute belief" \(b_{x}=1\), or "absolute disbelief" \(d_{x}=1\). Similar statements could be made about the prior \(a_{x}\) and uncertainty \(u_{x}\). This is Cromwell's rule as manifested in subjective logic. If \(u_{x}=0\) for some reason, then we need to be careful about modifying every definition in subjective logic to handle that case (divisions by \(u_{x}\) appear throughout). There is literature on this topic, but it's not germane for the cases we're interested in (and it goes against the virtue of humility-in-forecasting).
Definition 2. We can translate an opinion into a probability by \[\Pr(\omega_{x})=b_{x}+a_{x}u_{x}.\] (End of definition)
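To make the bookkeeping concrete, here is a minimal Python sketch of a binary opinion (the class and field names are mine, not standard notation): it enforces \(b_{x}+d_{x}+u_{x}=1\) and computes the probability from Definition 2.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BinaryOpinion:
    """A binary opinion (b, d, u, a) as in Definition 1."""
    belief: float       # b: belief that the proposition is true
    disbelief: float    # d: belief that the proposition is false
    uncertainty: float  # u: unallocated belief
    prior: float        # a: prior probability, absent all evidence

    def __post_init__(self):
        if abs(self.belief + self.disbelief + self.uncertainty - 1.0) > 1e-9:
            raise ValueError("b + d + u must equal 1")
        if not all(0.0 <= v <= 1.0 for v in
                   (self.belief, self.disbelief, self.uncertainty, self.prior)):
            raise ValueError("all components must lie in [0, 1]")

    def probability(self) -> float:
        """Pr(omega_x) = b_x + a_x * u_x (Definition 2)."""
        return self.belief + self.prior * self.uncertainty
```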
Example 3 (Completely uninformed belief). I was reading a newsletter which cited an unfamiliar news source. Although I trust my newsletter, I do not know anything about this unfamiliar source. The completely uninformed opinion I have about this source being reliable could be modelled as \((b_{x}=0, d_{x}=0, u_{x}=1, a_{x}=1/2)\).
Why is u = 1? Because we have no supporting evidence for the opinion (so we cannot have \(b_{x}\gt0\)), nor any evidence to the contrary (so we cannot have \(d_{x}\gt0\)). The constraint \(b_{x}+d_{x}+u_{x}=1\) thus forces our hand.
Why is a = 0.5? One way to justify it: absent any evidence, the source is as likely to be reliable as unreliable, so the probability of it being reliable should be 50%. In this particular case \(\Pr(\omega_{x}) = b_{x} + a_{x}u_{x} = a_{x}\), which sets the value of a.
Example 3.1 (Limiting cases). We should be able to recover propositional logic from subjective logic. That's a reasonable "sanity check" when generalizing a concept: we should recover the simpler version under simplifying assumptions. It turns out we recover propositional logic when taking TRUE to be the opinion (1, 0, 0, 0.5) and FALSE to be (0, 1, 0, 0.5).
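Continuing the sketch above, the uninformed opinion and the two limiting cases project to the probabilities we would expect:

```python
# Assumes the BinaryOpinion sketch from above is in scope.
vacuous  = BinaryOpinion(belief=0.0, disbelief=0.0, uncertainty=1.0, prior=0.5)
true_op  = BinaryOpinion(belief=1.0, disbelief=0.0, uncertainty=0.0, prior=0.5)
false_op = BinaryOpinion(belief=0.0, disbelief=1.0, uncertainty=0.0, prior=0.5)

print(vacuous.probability())   # 0.5
print(true_op.probability())   # 1.0
print(false_op.probability())  # 0.0
```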
Example 4. Suppose a source of information may or may not be reliable. We can model this situation (the reliability of our source) as a binary opinion, where here we interpret \(b_{x}\) as the probability the source is reliable, \(d_{x}\) is the probability the source is unreliable (and should not be used), and \(u_{x}\) is the uncertainty in whether or not the source is reliable. Our initial faith placed in the source is described by \(a_{x}\).
Beta Distribution
The beta distribution is a probability distribution useful for modelling the probability of success in a Bernoulli trial. An opinion translates into parameters of the beta distribution, namely \(\alpha = (2b_{x}/u_{x}) + 2a_{x}\) and \(\beta = (2d_{x}/u_{x}) + 2(1-a_{x})\).
We interpret \(2b_{x}/u_{x}\) as the number of observations that x is true, and \(2d_{x}/u_{x}\) as the number of observations that x is false.
Conversely, if we begin with s observations that x is true and n observations that x is false, then we may form an opinion (a code sketch of both conversions follows the list) with
- \(b_{x} = s/(2 + s + n)\)
- \(d_{x} = n/(2 + s + n)\)
- \(u_{x} = 2/(2 + s + n)\)
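Here is a minimal sketch of both conversions (the function names are mine): from an opinion to the Beta parameters, and from evidence counts s and n back to an opinion.

```python
def opinion_to_beta(b, d, u, a):
    """Beta parameters alpha = 2b/u + 2a and beta = 2d/u + 2(1 - a).
    Requires u > 0 (cf. Cromwell's rule above)."""
    alpha = 2.0 * b / u + 2.0 * a
    beta = 2.0 * d / u + 2.0 * (1.0 - a)
    return alpha, beta

def evidence_to_opinion(s, n, prior=0.5):
    """Form an opinion from s supporting and n challenging observations."""
    total = 2.0 + s + n
    return (s / total, n / total, 2.0 / total, prior)

# Example: 8 observations that x is true, 2 that x is false.
b, d, u, a = evidence_to_opinion(8, 2)
print((b, d, u))                    # (0.666..., 0.166..., 0.166...)
print(opinion_to_beta(b, d, u, a))  # (9.0, 3.0), i.e. (s + 2a, n + 2(1 - a))
```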
Puzzle 1. Investigate the possibility of having the prior \(a_{x}\) of an opinion itself be obtained from an opinion, i.e., \(a_{x} = \Pr(\omega_{a})\) or otherwise statistically inferred.
Generalization
We can generalize the notion of an opinion, as we have introduced it, from being an opinion about a proposition to an opinion about an outcome. As a stochastic process, an opinion about a proposition is modelled after a Bernoulli distribution ("coin flip"), which has two possible outcomes (the proposition is true or false). We can generalize this to an opinion which can be refined into one of N possible outcomes (i.e., generalize to a Multinomial process [roll of an N-sided die]).
Definition 5. When an opinion is regarding a proposition (which can be either true or false), we call it variously a Binary Opinion (or Bernoulli opinion or Binomial opinion, or an opinion about a proposition). In such a case, we interpret \(b_{x}\) as the probability the proposition is true, \(d_{x}\) the probability the proposition is false, and \(u_{x}\) as the unallocated confidence either way. We interpret \(a_{x}\) as the prior probability the proposition is true (what proportion of the uncertainty we'd be inclined to consider "true").
Definition 5.1. Let \(\omega_{x}=(b_{x},d_{x},u_{x},a_{x})\) be a binary opinion. We will refer to the Evidence associated to the opinion as the pseudo-observations of the associated Beta distribution without the priors, i.e., \(\alpha=2b_{x}/u_{x}\) and \(\beta=2d_{x}/u_{x}\) are the observations supporting and challenging the proposition, respectively. The Total amount of evidence is the sum \(\alpha+\beta\), and it grows as our uncertainty in the proposition shrinks: \(u_{x}=2/(2+\alpha+\beta)\).
Definition 6. When an opinion is regarding N possible distinct outcomes (for example: rolling a die, what face is showing), we refer to it as a Multinomial Opinion. In this case, the belief function maps each outcome to the associated probability of its truth, i.e., \(b_{X}(x)\) is the probability outcome or state \(x\) is true. We also have a prior probability function \(a_{X}(x)\) which is the prior probability we would assign, given no information, that outcome \(x\) is true. We have an uncertainty scalar \(u_{X}\) which is the probability left unallocated. We require \[ \sum_{x}a_{X}(x)=1 \] and \[ u_{X} + \sum_{x}b_{X}(x)=1. \]
Example 7. Observe that a binary opinion is the special case of a multinomial opinion with \(N=2\).
Example 8. There are "election raters" like Cook Political Report or Inside Elections with Nathan Gonzales. They rate a race as one of several categories (solid R, likely R, lean R, toss up, lean D, likely D, solid D). There are 7 possible values for a single race. We would model an opinion as a belief function \(b_{X}(-)\) which would produce a probability for each of these values, an uncertainty u, and a prior probability function \(a_{X}(-)\) which assigns a probability to each of these values. The conditions demanded of these functions become: \[u_{X} + \sum_{r}b_{X}(r) = 1\] where we sum over all ratings r, and \[\sum_{r}a_{X}(r) = 1.\] (End of Example)
We can similarly translate a multinomial opinion into parameters for a Dirichlet distribution \(\alpha_{X}(x)\) by \[\alpha_{X}(x) = \frac{2b_{X}(x)}{u_{X}} + 2a_{X}(x).\] Then we can define the probability for a particular outcome according to an opinion using the Dirichlet distribution.
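As an illustration with made-up numbers (the names and values below are mine, purely hypothetical), the election-rating example could be encoded and translated into Dirichlet parameters like so:

```python
RATINGS = ["solid R", "likely R", "lean R", "toss up", "lean D", "likely D", "solid D"]

def dirichlet_params(belief, prior, uncertainty):
    """alpha_X(x) = 2 * b_X(x) / u_X + 2 * a_X(x); requires u_X > 0."""
    assert abs(sum(belief.values()) + uncertainty - 1.0) < 1e-9
    assert abs(sum(prior.values()) - 1.0) < 1e-9
    return {x: 2.0 * belief[x] / uncertainty + 2.0 * prior[x] for x in belief}

# Hypothetical beliefs about one race, leaving 10% of the mass unallocated.
belief = {"solid R": 0.05, "likely R": 0.10, "lean R": 0.15, "toss up": 0.30,
          "lean D": 0.15, "likely D": 0.10, "solid D": 0.05}
prior = {r: 1.0 / len(RATINGS) for r in RATINGS}  # uniform prior over the 7 ratings
uncertainty = 0.10

alpha = dirichlet_params(belief, prior, uncertainty)
# e.g. alpha["toss up"] = 2 * 0.30 / 0.10 + 2/7 ≈ 6.29
```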
There are ways to generalize opinions further. We could consider the probability that the race favors the Republican candidate by combining the probabilities of solid R, likely R, lean R, etc. An opinion which assigns belief to composite outcomes like this (i.e., to subsets of the possible outcomes, such as "the election favors the Republican candidate") is called a Hyper-Opinion.
Definition 9. For a multinomial opinion, we refer to the Evidence associated to the opinion as the parameters of the associated Dirichlet distribution minus the priors, i.e., \(r_{X}(x) = 2b_{X}(x)/u_{X}\). The Total amount of evidence is the sum \[ r_{X} = \sum_{x}r_{X}(x) \] and it grows as our opinion's uncertainty shrinks. We may reconstruct the opinion given the evidence by \[ b_{X}(x) = \frac{r_{X}(x)}{2 + r_{X}} \] and the uncertainty \[ u_{X} = \frac{2}{2 + r_{X}} \] similar to the binary case.
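A small sketch of the reconstruction in Definition 9, going from per-outcome evidence counts back to beliefs and uncertainty (the function name is mine):

```python
def opinion_from_evidence(evidence):
    """Given evidence counts r_X(x), recover b_X(x) = r_X(x) / (2 + r_X)
    and u_X = 2 / (2 + r_X), where r_X is the total evidence."""
    total = sum(evidence.values())
    beliefs = {x: r / (2.0 + total) for x, r in evidence.items()}
    return beliefs, 2.0 / (2.0 + total)

# Example: 6 observations of "toss up", 2 of "lean D", none of the rest.
recovered_belief, recovered_u = opinion_from_evidence({"toss up": 6, "lean D": 2})
print(recovered_belief)  # {'toss up': 0.6, 'lean D': 0.2}
print(recovered_u)       # 0.2
```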
Remark 9.1. For this reason, if we're given rough proportions of outcomes for a multinomial event (like an election) and want to use them as the basis for estimating a multinomial opinion's beliefs, we may want to use a method-of-moments estimator for the Dirichlet distribution, possibly rescaled depending on how much confidence we have in the given evidence.
Operations on Opinions
Broadly speaking, we have the familiar logical connectives (like negation, conjunction, disjunction) which translate to operations on opinions. There are also operations which are unique to subjective logic, because an actor A may have an opinion on the reliability of another actor B, and A may then take information given by B to form a new opinion. Subjective logic generalizes this, permitting a "judicial trial" way to combine testimony from witnesses into an opinion.
Definition 10. Let \(\omega_{X}\) be a multinomial opinion. The Probability of outcome \(x\) is \[ \Pr(x|\omega_{X}) = b_{X}(x) + a_{X}(x)u_{X} \] analogous to the binary opinion case.
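Continuing the election sketch from above, the probability of each rating would be computed as:

```python
# Assumes belief, prior, and uncertainty from the election sketch above are in scope.
def outcome_probability(belief, prior, uncertainty):
    """Pr(x | omega_X) = b_X(x) + a_X(x) * u_X for each outcome x (Definition 10)."""
    return {x: belief[x] + prior[x] * uncertainty for x in belief}

probs = outcome_probability(belief, prior, uncertainty)
# e.g. probs["toss up"] = 0.30 + (1/7) * 0.10 ≈ 0.314, and the values sum to 1.
```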
Logical Connectives
Logical connectives have equivalents for opinions, but laws familiar from propositional logic do not necessarily carry over to subjective logic.
Our intuition is guided by the probabilities of independent events, \(\Pr(X\land Y)=\Pr(X)\Pr(Y)\) and \(\Pr(X\lor Y)=\Pr(X)+\Pr(Y)-\Pr(X)\Pr(Y)\), then working backwards to determine the belief, disbelief, and uncertainty of the conjunction and disjunction in terms of those of the original constituents X and Y.
Negation
Negation is the first operation worth covering. Basically, an actor's belief in \(\bar{x}=\neg x\) is that actor's disbelief in x, and the prior to believe in \(\bar{x}\) is precisely \(a_{\bar{x}}=1-a_{x}\).
More formally, if \(\omega_{x}=(b_{x}, d_{x}, u_{x}, a_{x})\) is an opinion, then its negation is \(\omega_{\bar{x}}=(d_{x}, b_{x}, u_{x}, 1-a_{x})\) in terms of the original opinion.
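As a sketch, negation acting on a plain \((b, d, u, a)\) tuple:

```python
def negate(opinion):
    """Negation: swap belief and disbelief, complement the prior."""
    b, d, u, a = opinion
    return (d, b, u, 1.0 - a)

print(negate((0.7, 0.1, 0.2, 0.5)))  # (0.1, 0.7, 0.2, 0.5)
```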
Conjunction
When one observer holds opinions about two distinct, independent propositions x and y, they may be conjoined into a new opinion about the proposition x∧y. The opinion the observer holds about the conjunction is denoted \(\omega_{x\land y}=\omega_{x}\cdot\omega_{y}\) and consists of the components (a code sketch follows the list):
- \(b_{x\land y} = b_{x}b_{y} + [(1 - a_{x})b_{x}a_{y}u_{y} + a_{x}u_{x}(1 - a_{y})b_{y}]/(1 - a_{x}a_{y})\)
- \(d_{x\land y} = d_{x} + d_{y} - d_{x}d_{y}\)
- \(u_{x\land y} = u_{x}u_{y} + [b_{x}(1 - a_{y})u_{y} + (1 - a_{x})u_{x}b_{y}]/(1 - a_{x}a_{y})\)
- \(a_{x\land y} = a_{x}a_{y}\)
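A sketch of this conjunction on plain \((b, d, u, a)\) tuples, with a numeric check that the projected probabilities multiply, as the intuition above suggests:

```python
def conjoin(x, y):
    """Conjunction (product) of opinions about independent propositions x and y."""
    bx, dx, ux, ax = x
    by, dy, uy, ay = y
    k = 1.0 - ax * ay
    b = bx * by + ((1.0 - ax) * bx * ay * uy + ax * ux * (1.0 - ay) * by) / k
    d = dx + dy - dx * dy
    u = ux * uy + (bx * (1.0 - ay) * uy + (1.0 - ax) * ux * by) / k
    return (b, d, u, ax * ay)

x = (0.5, 0.3, 0.2, 0.5)
y = (0.4, 0.4, 0.2, 0.5)
print(conjoin(x, y))  # ≈ (0.26, 0.58, 0.16, 0.25)
# Projected probabilities multiply: Pr(x) * Pr(y) = 0.6 * 0.5 = 0.3,
# and indeed 0.26 + 0.25 * 0.16 = 0.30.
```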
Disjunction
When a single observer holds opinions about two independent, distinct propositions x and y, we may combine them using the disjunction ("logical OR") into a new proposition \(x\lor y\) and form the opinion denoted \(\omega_{x\lor y}=\omega_{x}\sqcup\omega_{y}\), whose components are (a code sketch follows the list):
- \(b_{x\lor y} = b_{x} + b_{y} - b_{x}b_{y}\)
- \(d_{x\lor y} = d_{x}d_{y} + [a_{x}d_{x}(1 - a_{y})u_{y} + (1 - a_{x})u_{x}a_{y}d_{y}]/(a_{x} + a_{y} - a_{x}a_{y})\)
- \(u_{x\lor y} = u_{x}u_{y} + [d_{x}a_{y}u_{y} + a_{x}u_{x}d_{y}]/(a_{x} + a_{y} - a_{x}a_{y})\)
- \(a_{x\lor y} = a_{x} + a_{y} - a_{x}a_{y}\)
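And the corresponding sketch for disjunction:

```python
def disjoin(x, y):
    """Disjunction (coproduct) of opinions about independent propositions x and y."""
    bx, dx, ux, ax = x
    by, dy, uy, ay = y
    k = ax + ay - ax * ay
    b = bx + by - bx * by
    d = dx * dy + (ax * dx * (1.0 - ay) * uy + (1.0 - ax) * ux * ay * dy) / k
    u = ux * uy + (dx * ay * uy + ax * ux * dy) / k
    return (b, d, u, ax + ay - ax * ay)

print(disjoin((0.5, 0.3, 0.2, 0.5), (0.4, 0.4, 0.2, 0.5)))
# ≈ (0.7, 0.167, 0.133, 0.75); note 0.7 + 0.75 * 0.133... = 0.8 = 0.6 + 0.5 - 0.6 * 0.5.
```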
Caution: in propositional logic, we expect "AND" distributes over "OR" \(x\land(y\lor z) = (x\land y)\lor(x\land z)\). This no longer holds in subjective logic, which shouldn't be surprising. Remember we required the opinions to be about independent propositions. So \(\omega_{x\land(y\lor z)}\neq\omega_{x\land y}\sqcup\omega_{x\land z}\) shouldn't surprise us, since \(x\land y\) is not independent of \(x\land z\).
The familiar de Morgan's law (relating the negation of a conjunction to the disjunction of the negated propositions, and vice-versa) still holds in subjective logic, though.
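A quick numeric check of de Morgan's law using the sketches above:

```python
# Assumes negate, conjoin, and disjoin from the sketches above are in scope.
x = (0.5, 0.3, 0.2, 0.5)
y = (0.4, 0.4, 0.2, 0.5)

lhs = negate(conjoin(x, y))          # not (x and y)
rhs = disjoin(negate(x), negate(y))  # (not x) or (not y)
assert all(abs(l - r) < 1e-9 for l, r in zip(lhs, rhs))
```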
Other binary operators
There are about a half-dozen other binary operators, but I've covered the ones I believe to be most germane to my interests.
Conclusion
What have we done? We have expressed propositions as opinions, i.e., as probabilities. Admittedly these opinions are for finite random variables (i.e., for situations which are true-or-false [a Bernoulli trial] or which have finitely many distinct outcomes, like the roll of a die [multinomial]).
We've related the various degree of belief or disbelief to the amount of evidence available for a particular outcome or state.
We also have discussed a few operators on opinions familiar from logic, like negation or conjunction.
Next time, we'll discuss how to combine opinions together. One germane example will be a member of a jury listening to a trial and evaluating evidence to determine whether the accused is guilty.
In this way, subjective logic is a "coarser" version of Bayesian inference.
References
It seems most of the work on this subject comes from Audun Jøsang and his students.
- Audun Jøsang, "A logic for uncertain probabilities". International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 9, no. 3 (2001), pp. 279–311, Eprint
- Audun Jøsang, "Subjective Logic". Manuscript, Draft, 18 February 2013, Eprint
- Audun Jøsang, "Belief Calculus". arXiv:cs/0606029
- Mohd Anuar Mat Isa, Ramlan Mahmod, Nur Izura Udzir, Jamalul-lail Ab Manan, Audun Jøsang, Ali Dehghan Tanha, "A Formal Calculus for International Relations Computation and Evaluation". arXiv:1606.02239 for an application to international relations (which generalizes to American politics)