Thursday, November 16, 2017

Animals and Rational Behaviour

Having introduced a notion of Instrumental Rationality, perhaps a good question to ask is "What is an example of a rational agent which is not human?"

It might seem that animals behave rationally at times; for example, worker bees give up their own reproduction in favor of the queen's other offspring, so presumably this sacrifice must confer some benefit on the donor. The general principle extracted from situations like this is known as Hamilton's Rule, and it doesn't really work: among other objections, it has been argued to be neither testable nor predictive.

With the bees, the altruist (i.e., worker bee) cooperates by giving a benefit b to the recipient (another offspring) at a cost c to itself. Both b and c are measured in terms of fitness, specifically the expected number of offspring. Naively one might expect b > c to suffice, but Hamilton's major insight was that relatedness ("degree of kinship") r between donor and recipient must enter into the equation, giving us Hamilton's rule br > c.
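To make the inequality concrete, here is a minimal sketch in Python; the numbers are my own illustrative assumptions, not data from the literature, and the function name is hypothetical. The only biological input is the standard observation that full sisters in haplodiploid species like honeybees are related by r = 0.75 on average (assuming a singly-mated queen).

    # A minimal sketch of Hamilton's rule rb > c, with illustrative numbers.
    # b and c are measured in expected offspring; r is the coefficient of relatedness.

    def hamilton_predicts_altruism(b: float, c: float, r: float) -> bool:
        """Return True when Hamilton's rule rb > c favors the altruistic act."""
        return r * b > c

    # Hypothetical worker bee: helping raises 3 extra siblings (b = 3) at the cost of
    # 1 offspring of her own (c = 1); full sisters under haplodiploidy have r = 0.75.
    print(hamilton_predicts_altruism(b=3, c=1, r=0.75))   # True, since 2.25 > 1
    print(hamilton_predicts_altruism(b=3, c=1, r=0.0))    # False: no altruism toward non-kin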

Game theorists are overjoyed to hear this can be derived from utility maximization, and one might expect it to have a status similar to Newton's laws in physics. However, Nowak, Tarnita, and Wilson have argued that Hamilton's rule almost never holds. In short, simple game theoretic models here fail to describe the biological situation.

Decades ago, evolutionary biologists would have treated Hamilton's rule as an "iron law". That no longer seems to be the case. For more on this, see de Vladar and Szathmáry's "Beyond Hamilton's Rule".

Saturday, November 11, 2017

Agents are Instrumentally Rational

Game Plan: We'll introduce the notion of "instrumental rationality" as an ordering of alternatives satisfying some technical conditions. Then we'll discuss measuring "preference" via utility functions, and we'll conclude by discussing how to maximize utility under uncertainty.

Loosely put, individuals who are instrumentally rational have preferences over various "things" (e.g., baby-back ribs are preferred to chicken, and chicken is preferred to bread). Such individuals are deemed "rational" for picking actions which satisfy those preferences. The only constraint is that preferences are ordered in some suitably "weakly coherent" way (e.g., ribs are still preferred to bread if there is no chicken).

The convention is to call these preferred "things" Alternatives.

Definition 1. Let an actor be choosing between countably many possible different alternatives x1, x2, x3, …. An actor is called Instrumentally Rational if the actor has preferences satisfying the following conditions:
  1. Reflexivity: Every alternative xi is at least as preferred as itself.
  2. Completeness: For any two alternatives xi and xj, either (1) xi is strictly preferred over xj, (2) xj is strictly preferred over xi, or (3) the actor is indifferent between the two alternatives.
  3. Transitivity: For any alternatives xi, xj, xk, if xi is no less desired than xj, and if xj is no less desired than xk, then xi is no less desired than xk.
  4. Continuity: For any alternatives xi, xj, xk, if xi is (strictly) preferred to xj, and if xj is (strictly) preferred to xk, then there exists some "composite" of xi and xk (call it y) which is equally as desired as xj.

Remark 1 (On Continuity). There are two ways to interpret the continuity axiom. The first perspective is to think of y as a "basket" containing "bits" of xi and "bits" of xk. For example, if xi is "18 ribs", xj is "half a roasted chicken", and xk is "10 rolls", then there is some composite ("9 ribs and 5 rolls") which is equally as desirable as half a chicken.

The other perspective is to think of y as a lottery, where the actor obtains xi with probability p (0 < p < 1) and xk with probability 1 − p. The continuity axiom then says there is some p for which the actor is indifferent between the lottery y and the alternative xj.

Remark 2 (Ordering, Utility Functions). The first three axioms taken together imply that the actor has a well-defined preference ordering (in the mathematical sense). When the continuity axiom is added, the preference ordering may be "represented" by a utility function (i.e., a function assigning to each alternative xi some real number U(xi) reflecting the "utility" or "desire" for that alternative). An actor making choices to satisfy his or her preference ordering can then be viewed as acting "as if" maximizing his or her utility function.

Now, the "utility" of an alternative should not be confused with the philosophy of Utilitarianism. A utility function just assigns numbers such that the ordering induced by them is the same as the actor's preference relation; that is, U(xi) > U(xj) if and only if xi is strictly preferred to xj. The numbers U(xi) are measured in "utils", which are agent-dependent and measure that agent's preference for the given alternative.
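As a quick illustration of the "representation" idea, here is a minimal sketch in Python; the alternatives and the particular numbers are made up. It assigns utilities consistent with a strict ranking and checks that U(xi) > U(xj) exactly when xi is strictly preferred to xj.

    # A minimal sketch: represent a strict preference ordering by a utility function.
    # The ranking and the numbers are illustrative assumptions, not canonical values.

    ranking = ["ribs", "chicken", "bread"]          # most preferred first

    # Any assignment of numbers decreasing along the ranking will do.
    U = {alt: float(len(ranking) - i) for i, alt in enumerate(ranking)}
    # U == {"ribs": 3.0, "chicken": 2.0, "bread": 1.0}

    def strictly_prefers(a: str, b: str) -> bool:
        """The actor strictly prefers a to b when a appears earlier in the ranking."""
        return ranking.index(a) < ranking.index(b)

    # The representation property: U(a) > U(b) iff a is strictly preferred to b.
    for a in ranking:
        for b in ranking:
            assert (U[a] > U[b]) == strictly_prefers(a, b)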

Ordinal Utilities, Cardinal Utilities, Maximizing Expected Utility

Definition 2. If we assign utility "arbitrarily" but in a manner consistent with the preference ordering (e.g., for any alternatives X and Y such that X is preferred to Y, we assign utilities such that U(X) > U(Y), but otherwise the quantities remain arbitrary), then we call such an assignment an Ordinal Utility.

Here we must stress again that there are two important consequences of assigning utility in a manner which captures only the preference ordering (and nothing else).

First, ordinal utility does not describe the agent's "intensity of desire" for an alternative: the "strength of preference" is not captured by this notion. How much more I want ribs than chicken goes unrecorded; all that is captured is that I want ribs right now (and not chicken, much less bread).

Second, ordinal utility cannot be compared "across agents". The ordinal utility I assign to a full rack of baby-back ribs cannot be compared to anyone else's ordinal utility for, say, Lasagna. We can only compare my ordinal utility for baby-back ribs against my ordinal utility for Lasagna.
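To see both points in miniature, here is a sketch in Python; all names and numbers are invented for illustration. Two wildly different utility assignments for the same agent induce exactly the same choices, and numbers assigned by different agents cannot be meaningfully compared.

    # A minimal sketch of both caveats, using made-up numbers.

    # Two ordinal utility assignments for the SAME agent, related by a strictly
    # increasing transformation; they induce identical choices, so the magnitudes
    # (and differences like "how much more") carry no meaning.
    U1 = {"ribs": 3.0, "chicken": 2.0, "bread": 1.0}
    U2 = {alt: 10 ** u for alt, u in U1.items()}   # 1000, 100, 10

    menu = ["chicken", "bread", "ribs"]
    assert max(menu, key=U1.get) == max(menu, key=U2.get) == "ribs"

    # Two ordinal utility assignments for DIFFERENT agents; comparing numbers
    # across them (my U for ribs vs. your U for lasagna) is meaningless, because
    # each scale was chosen arbitrarily.
    mine  = {"ribs": 2.0, "lasagna": 1.0}
    yours = {"ribs": 50.0, "lasagna": 99.0}
    # yours["lasagna"] > mine["ribs"] tells us nothing about who wants what more.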

Dealing with Uncertainty

My local BBQ joint smokes 1 pig per day, and when it's all sold, there's no more. If I am hungry, should I go before the lunch rush or afterwards?

Here we must talk of Prospects: pairings of outcomes with their associated probabilities.

For our particular situation, there is a decision I must make (go before the lunch rush or after) and two outcomes (there is food left, or they ran out of food). One prospect is given by the possible outcomes of going before the lunch rush (go before lunch AND they have food, go before lunch AND no more food; p, 1 − p). The other prospect is given by the decision to go after the lunch rush (go after lunch AND they have food, go after lunch AND no more food; q, 1 − q).

Observe that each decision has different possible outcomes, but the probabilities of the outcomes for a given decision must sum to 100%: something must happen once I make a decision.

We now need to consider preference ordering over prospects.

Definition 3. Suppose a person must choose between actions with uncertain outcomes, in the sense that each action has various possible outcomes associated with it, each occurring with some probability. We call such an action a Prospect and represent it by a pairing of the possible outcomes with their respective probabilities (y1, y2, ...; p1, p2, ...), where the outcome yi occurs with probability pi and the probabilities sum to 1 = p1 + p2 + ... (since something must happen when the action is taken).

Remark. There is a "nested structure" to prospects, in the sense that yi might be an "atomic outcome" (e.g., "there will be food", "there will be no food", "it will rain", "the world will end", etc.) or another prospect (imagine "I flip a coin; if it is heads, then I do this action, but if it is tails then I do some other action").
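As a small data-structure sketch of Definition 3 and the remark (Python; the outcomes and probabilities are illustrative), a prospect can be stored as a list of (outcome, probability) pairs, where an outcome may itself be another prospect, and nested prospects can be flattened by multiplying probabilities along each branch.

    # A minimal sketch of prospects as (outcome, probability) pairs, where an
    # outcome may itself be a nested prospect.  Names and numbers are made up.

    Prospect = list   # a prospect is a list of (outcome, probability) pairs

    def is_valid(prospect: Prospect, tol: float = 1e-9) -> bool:
        """Probabilities must sum to 1: something must happen."""
        return abs(sum(p for _, p in prospect) - 1.0) < tol

    def flatten(prospect: Prospect) -> dict:
        """Reduce a nested prospect to atomic outcomes with compound probabilities."""
        atoms = {}
        for outcome, p in prospect:
            if isinstance(outcome, list):                 # outcome is itself a prospect
                for atom, q in flatten(outcome).items():
                    atoms[atom] = atoms.get(atom, 0.0) + p * q
            else:
                atoms[outcome] = atoms.get(outcome, 0.0) + p
        return atoms

    before_lunch = [("food", 0.95), ("no food", 0.05)]
    after_lunch  = [("food", 0.10), ("no food", 0.90)]
    # "Flip a fair coin: heads -> go before the rush, tails -> go after."
    coin_flip    = [(before_lunch, 0.5), (after_lunch, 0.5)]

    assert is_valid(before_lunch) and is_valid(coin_flip)
    print(flatten(coin_flip))   # {'food': 0.525, 'no food': 0.475}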

An actor's Preferences over Prospects are called Consistent if the preferences satisfy axioms (1), (2), and (3) of Definition 1, and additionally:

  1. Continuity: Consider three prospects yi, yj, and yk, and suppose the first is preferred to the second and the second is preferred to the third. Then there exists some probability p such that the prospect (yi, yk; p, 1 − p) is equally preferable to yj (compare to the second interpretation forwarded in Remark 1).
  2. Preference increasing with probability: If yi is preferred to yj, and we let ym = (yi, yj; p1, 1 − p1) and yn = (yi, yj; p2, 1 − p2), then ym is preferred to yn if and only if p1 > p2.
  3. Independence: For any three prospects yi, yj, and yk, if yi is no less desired than yj, then for any probability p the prospect (yi, yk; p, 1 − p) is no less desired than (yj, yk; p, 1 − p).

Given this notion of "consistent preferences over uncertain prospects", how can we develop a notion of instrumental rationality?

Maximizing Expected Utility

The first step is to introduce the notion of Cardinal Utility, which assigns to a given outcome yi a number u(yi) reflecting the intensity of the agent's preference for that outcome.

The second step is to consider the Expected Utility of a prospect y = (y1, y2, ...; p1, p2, ...), defined as the sum Eu[y] = u(y1)p1 + u(y2)p2 + ..., which is the expected value of the "random variable" assigning to each outcome its utility.

Now, an agent with cardinal utility u(·) is considered instrumentally rational if it picks the action whose prospect has the maximum expected utility.
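Here is a minimal sketch of this decision rule in Python (the function names are my own): compute the expected utility of each action's prospect and pick the action with the largest value.

    # A minimal sketch of "maximize expected utility" over a set of actions.
    # Each action maps to a prospect: a list of (outcome, probability) pairs.

    def expected_utility(prospect, u):
        """E[u] = sum of u(outcome) * probability over the prospect."""
        return sum(u[outcome] * p for outcome, p in prospect)

    def rational_choice(actions, u):
        """Pick the action whose prospect has maximum expected utility."""
        return max(actions, key=lambda a: expected_utility(actions[a], u))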

Example 1. If I go to my favorite BBQ restaurant before the lunch rush, the prospect looks like ("get food", "no food"; 0.95, 0.05). If I go after the lunch rush, the prospect looks like ("get food", "no food"; 0.1, 0.9).

My cardinal utility function looks like u(get food) = 10, u(no food) = −30.

The expected utility for going before the lunch rush is then

E[before] = u(get food)×0.95 + u(no food)×0.05
=10×0.95 − 30×0.05
=9.5 − 1.5
=8

The expected utility for going after the lunch rush is then

E[after] = u(get food)×0.1 + u(no food)×0.9
=10×0.1 − 30×0.9
=1 − 27
=−26

Since 8 > −26, it is rational to go before the lunch rush to try to get food.
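Feeding the numbers from Example 1 into the sketch above reproduces the comparison:

    u = {"get food": 10, "no food": -30}
    actions = {
        "before lunch rush": [("get food", 0.95), ("no food", 0.05)],
        "after lunch rush":  [("get food", 0.10), ("no food", 0.90)],
    }
    print(expected_utility(actions["before lunch rush"], u))   # 8.0
    print(expected_utility(actions["after lunch rush"], u))    # -26.0
    print(rational_choice(actions, u))                         # 'before lunch rush'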

Next time, we'll discuss flaws with this notion of instrumental rationality, both logical and empirical.

References

  • Shaun Hargreaves Heap and Yanis Varoufakis, Game Theory: A Critical Introduction. Second ed., Routledge. (This is the axiomatization scheme I am following.)
  • John Searle, Rationality in Action. MIT Press, 2001. (This provides a different set of axioms for rational behaviour, equivalent to the axioms of game theory, and discusses the implicit assumptions and their flaws.)