Political Arithmetic: Agents are Instrumentally Rational

Game Plan: We'll introduce the notion of "instrumental rationality" as an ordering of alternatives subject to some technical conditions. Then we'll discuss measuring "preference" via utility functions. We conclude by discussing maximizing utility under uncertainty.

Loosely put, individuals who are instrumentally rational have preferences over various "things" (e.g., baby-back ribs are preferred to chicken, and chicken is preferred to bread). Such individuals are deemed "rational" for picking actions which satisfy those preferences. The only constraint is that preferences are ordered in some suitably "weakly coherent" way (e.g., ribs are still preferred to bread if there is no chicken).

By convention, the "things" over which an individual has preferences are called "Alternatives".

Definition 1. Let an actor be choosing between countably many possible different alternatives x₁, x₂, x₃, …. An actor is called Instrumentally Rational if the actor has preferences satisfying the following conditions:

Reflexivity: No alternative x_i is less preferred than itself.

Completeness: For any two alternatives x_i and x_j, either (1) x_i is strictly preferred over x_j, (2) x_j is strictly preferred over x_i, or (3) the actor is indifferent between the two alternatives.

Transitivity: For any alternatives x_i, x_j, x_k, if x_i is no less desired than x_j, and if x_j is no less desired than x_k, then x_i is no less desired than x_k.

Continuity: For any alternatives x_i, x_j, x_k, if x_i is (strictly) preferred to x_j, and if x_j is (strictly) preferred to x_k, then there exists some "composite" of x_i and x_k (call it y) which is equally as desired as x_j.

Remark 1 (On Continuity). There are two ways to interpret the continuity axiom. The first perspective is to think of y as a "basket" containing "bits" of x_i and "bits" of x_k. For example, if x_i is "18 ribs", x_j is "half a roasted chicken", and x_k is "10 rolls", then there is some composite ("9 ribs and 5 rolls") which is equally as desirable as half a chicken.

The other perspective is to think of y as a lottery, where the actor obtains x_i with probability p (0 < p < 1) and x_k with probability 1 − p. The continuity axiom then says there is some p for which the actor is indifferent between the lottery y and the alternative x_j.
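For a finite set of alternatives, the first three axioms of Definition 1 can be checked mechanically. The following is only a minimal sketch: the function name `check_axioms`, the ranking, and the cyclic example are my own illustrative assumptions, and a weak preference is modelled as a function returning True when the first alternative is no less preferred than the second.

```python
from itertools import product

def check_axioms(alternatives, weakly_prefers):
    """Check reflexivity, completeness and transitivity on a finite set.

    weakly_prefers(x, y) should return True when x is no less preferred than y.
    """
    reflexive = all(weakly_prefers(x, x) for x in alternatives)
    complete = all(
        weakly_prefers(x, y) or weakly_prefers(y, x)
        for x, y in product(alternatives, repeat=2)
    )
    transitive = all(
        weakly_prefers(x, z)
        for x, y, z in product(alternatives, repeat=3)
        if weakly_prefers(x, y) and weakly_prefers(y, z)
    )
    return {"reflexive": reflexive, "complete": complete, "transitive": transitive}

# Hypothetical ranking: a lower number means "more preferred".
rank = {"ribs": 0, "chicken": 1, "bread": 2}

def by_rank(x, y):
    return rank[x] <= rank[y]

# Hypothetical incoherent preference: ribs > chicken > bread > ribs.
cycle = {("ribs", "chicken"), ("chicken", "bread"), ("bread", "ribs")}

def cyclic(x, y):
    return x == y or (x, y) in cycle

coherent = check_axioms(list(rank), by_rank)
incoherent = check_axioms(list(rank), cyclic)
```

The ranked preference passes all three checks, while the cyclic preference is reflexive and complete but fails transitivity, which is exactly the kind of incoherence the axioms are meant to exclude.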

Remark 2 (Ordering, Utility Functions). The first three axioms taken together imply the actor has a well-defined preference ordering (in the mathematical sense). When the continuity axiom is added, the preference ordering may be "represented" by a utility function (i.e., a function assigning to each alternative x_i some real number U(x_i) reflecting the "utility" or "desire" for that alternative). An actor making choices to satisfy his or her preference ordering can be viewed "as if" maximizing his or her utility function.

Now, discussions of the "utility" of an alternative should not be confused with the philosophy of Utilitarianism. A utility function just assigns numbers such that the ordering induced by it is the same as the actor's preference relation. That is to say, U(x_i) > U(x_j) if and only if x_i is strictly preferred to x_j. The numbers U(x_i) are measured in utils, a unit which is Agent-dependent and measures that Agent's preference for the given alternative.

Ordinal Utilities, Cardinal Utilities, Maximizing Expected Utility

Definition 2. If we assign utility "arbitrarily" but in a manner consistent with the preference ordering (e.g., for any alternatives X and Y such that X is preferred to Y, we assign the utilities such that U(X) > U(Y) but otherwise the quantities remain arbitrary), then we call such utility the Ordinal Utility.

Here we must stress two important consequences of assigning utility in a manner which captures only the preference ordering (and nothing else).

First, ordinal utility does not describe the agent's "intensity of desire" for an alternative. The "strength of preference" is not captured by this notion. So how much more I want ribs than chicken is not described by this notion, just the fact that I prefer ribs to chicken (and chicken to bread).

Second, ordinal utility cannot be compared "across agents". The ordinal utility I assign to a full rack of baby-back ribs cannot be compared to anyone else's ordinal utility for, say, Lasagna. We can only compare my ordinal utility for baby-back ribs against my ordinal utility for Lasagna.
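The first point can be made concrete with a small sketch. The two utility assignments below are numbers I made up for illustration; they differ wildly in size yet represent exactly the same preference ordering, which is why ordinal utility says nothing about intensity.

```python
def induced_ordering(utility):
    """Return the alternatives sorted from most to least preferred."""
    return sorted(utility, key=utility.get, reverse=True)

# Two hypothetical ordinal utility assignments for the same agent.
u1 = {"ribs": 3, "chicken": 2, "bread": 1}
u2 = {"ribs": 1000, "chicken": 2, "bread": -7}

# Both assignments rank ribs over chicken over bread.
same_ordering = induced_ordering(u1) == induced_ordering(u2)
```

Any relabelling of the numbers that keeps them in the same order would do equally well as an ordinal utility.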

Dealing with Uncertainty

My local BBQ joint smokes 1 pig per day, and when it's all sold, there's no more. If I am hungry, should I go before the lunch rush or afterwards?

Here we must talk of Prospects: outcomes paired with their associated probabilities.

For our particular situation, there is a decision I must make (go before the lunch rush or after) and two outcomes (there is food left, or they ran out of food). One prospect is given by the possible outcomes of choosing to go before the lunch rush (go before lunch AND they have food, go before lunch AND no more food; p, 1 − p). The other prospect is given by the possible outcomes of choosing to go after the lunch rush (go after lunch AND they have food, go after lunch AND no more food; q, 1 − q).

Observe that each decision has different possible outcomes, but the probabilities for the outcomes of a given decision must sum to 100%: something must happen when I make a decision.

We now need to consider preference ordering over prospects.

Definition 3. Suppose a person must choose between actions with uncertain outcomes, in the sense that each action has various possible outcomes associated with it, each occurring with some probability. We call such an action a Prospect and represent it by a pairing of the possible outcomes with their respective probabilities (y₁, y₂, ...; p₁, p₂, ...), where the outcome y_i occurs with probability p_i and the probabilities sum to one, p₁ + p₂ + ... = 1 (since there must be some outcome to the action).

Remark. There is a "nested structure" to prospects, in the sense that y_i might be an "atomic outcome" (e.g., "there will be food", "there will be no food", "it will rain", "the world will end", etc.) or another prospect (imagine "I flip a coin; if it is heads, then I do this action, but if it is tails then I do some other action").
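To illustrate the nested structure, here is a rough sketch that reduces a nested prospect to its atomic outcomes by multiplying probabilities along each branch. Treating a compound prospect as equivalent to its reduced form is an extra assumption on my part (it is not among the axioms stated here), and the representation as a list of (outcome, probability) pairs, the function name `flatten`, and the numbers are all hypothetical.

```python
import math

def flatten(prospect):
    """Reduce a (possibly nested) prospect to atomic outcomes.

    A prospect is a list of (outcome, probability) pairs. An outcome is
    either a string (atomic) or another such list (a nested prospect).
    """
    total = sum(p for _, p in prospect)
    if not math.isclose(total, 1.0):
        raise ValueError(f"probabilities sum to {total}, not 1")
    flat = {}
    for outcome, p in prospect:
        if isinstance(outcome, str):
            flat[outcome] = flat.get(outcome, 0.0) + p
        else:
            # Nested prospect: weight its atomic outcomes by p.
            for atom, q in flatten(outcome).items():
                flat[atom] = flat.get(atom, 0.0) + p * q
    return flat

# Illustrative numbers only.
before = [("get food", 0.95), ("no food", 0.05)]
after = [("get food", 0.1), ("no food", 0.9)]

# "I flip a fair coin; heads I go before the rush, tails I go after."
coin_flip = [(before, 0.5), (after, 0.5)]
reduced = flatten(coin_flip)
```

The reduced prospect again has probabilities summing to 1, as Definition 3 requires.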

An actor's Preferences over Prospects are called Consistent if they satisfy the first three axioms of Definition 1 (Reflexivity, Completeness, and Transitivity), together with the following:

Continuity: Consider three prospects y_i, y_j, and y_k, and suppose the first is preferred to the second and the second is preferred to the third. Then there exists some probability p such that the prospect (y_i, y_k; p, 1 − p) is exactly as preferable as y_j (compare to the second interpretation put forward in Remark 1).

Preference increasing with probability: If y_i is preferred to y_j, and we let y_m = (y_i, y_j; p₁, 1 − p₁) and y_n = (y_i, y_j; p₂, 1 − p₂), then y_m is preferred to y_n if and only if p₁ > p₂.

Independence: For any three prospects y_i, y_j, and y_k, if y_i is preferred to y_j, then for every probability p the prospect (y_i, y_k; p, 1 − p) is no less desired than (y_j, y_k; p, 1 − p). In other words, mixing both sides with the same third prospect y_k does not reverse the preference.

Given this notion of "consistent preferences over uncertain prospects", how can we develop a notion of instrumental rationality?

Maximizing Expected Utility

The first step is to introduce the notion of Cardinal Utility, which assigns to a given outcome y_i a number u(y_i) measuring the intensity of the agent's preference for that outcome.

The second step is to consider the Expected Utility of a Prospect y = (y₁, y₂, ...; p₁, p₂, ...) as the sum E_u[y] = u(y₁)p₁ + u(y₂)p₂ + ..., which is the expected value of the utility u(y_i) when the outcome y_i is treated as a random variable occurring with probability p_i.

Now, an agent with cardinal utility u(·) is considered instrumentally rational if it picks the action whose prospect has the maximum expected utility.
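With a cardinal utility in hand, the probability promised by the continuity axiom can be computed directly for an expected-utility maximizer: setting p·u(y_i) + (1 − p)·u(y_k) equal to u(y_j) and solving gives p = (u(y_j) − u(y_k)) / (u(y_i) − u(y_k)). Here is a small sketch of that calculation; the function names and the three utility values are hypothetical.

```python
def expected_utility(utilities, probabilities):
    """Sum of u(y_i) * p_i over the outcomes of a prospect."""
    return sum(u * p for u, p in zip(utilities, probabilities))

def indifference_probability(u_best, u_middle, u_worst):
    """Probability p making (best, worst; p, 1 - p) as good as the middle outcome."""
    if not (u_best > u_middle > u_worst):
        raise ValueError("need u_best > u_middle > u_worst")
    return (u_middle - u_worst) / (u_best - u_worst)

# Hypothetical cardinal utilities for three outcomes.
u_best, u_middle, u_worst = 10.0, 0.0, -30.0

p = indifference_probability(u_best, u_middle, u_worst)
lottery_value = expected_utility([u_best, u_worst], [p, 1 - p])
```

With these numbers p works out to 0.75, and the lottery's expected utility equals the utility of the middle outcome, so the agent is indifferent between the two.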

Example 1. If I go to my favorite BBQ restaurant before the lunch rush, the prospect looks like ("get food", "no food"; 0.95, 0.05). If I leave after the lunch rush, the prospect looks like ("get food", "no food"; 0.1, 0.9).

My cardinal utility function looks like u(get food) = 10, u(no food) = −30.

The expected utility for going before the lunch rush is then

E[before] = u(get food)×0.95 + u(no food)×0.05
          = 10×0.95 − 30×0.05
          = 9.5 − 1.5
          = 8

The expected utility for going after the lunch rush is then

E[after] = u(get food)×0.1 + u(no food)×0.9
         = 10×0.1 − 30×0.9
         = 1 − 27
         = −26

Since 8 > −26, it is rational to go before the lunch rush to try to get food.
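The same calculation can be sketched in a few lines of Python. The utilities and probabilities are the ones from Example 1; the dictionary layout and the names `prospects`, `scores`, and `best_action` are just one convenient way of organizing them.

```python
# Cardinal utilities from Example 1.
utility = {"get food": 10, "no food": -30}

# One prospect per action, mapping each outcome to its probability.
prospects = {
    "before": {"get food": 0.95, "no food": 0.05},
    "after": {"get food": 0.1, "no food": 0.9},
}

def expected_utility(prospect, utility):
    """Probability-weighted sum of the utilities of the outcomes."""
    return sum(utility[outcome] * p for outcome, p in prospect.items())

scores = {action: expected_utility(p, utility) for action, p in prospects.items()}

# The instrumentally rational choice maximizes expected utility.
best_action = max(scores, key=scores.get)
```

Up to floating-point rounding, the scores are 8 for going before the rush and −26 for going after, so `best_action` is "before".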

Next time, we'll discuss flaws with this notion of instrumental rationality, both logical and empirical.

References

Shaun Hargreaves Heap and Yanis Varoufakis, Game Theory: A Critical Introduction. Second ed., Routledge. (This is the axiomatization scheme I am following.)
John Searle, Rationality in Action. MIT Press, 2001. (This provides a different set of axioms for rational behaviour, equivalent to the axioms of game theory, and discusses implicit assumptions & its flaws.)

Saturday, November 11, 2017
