Individual Choice Behavior: A Theoretical Analysis

Summary

This influential treatise presents upper-level undergraduates and graduate students with a mathematical analysis of choice behavior. It begins with the statement of a general axiom upon which the rest of the book rests; the following three chapters, which may be read independently of each other, are devoted to applications of the theory to substantive problems: psychophysics, utility, and learning. Applications to psychophysics include considerations of time- and space-order effects, the Fechnerian assumption, the power law and its relation to discrimination data, interaction of continua, discriminal processes, signal detectability theory, and ranking of stimuli. The next major theme, utility theory, features unusual results that suggest an experiment to test the theory. The final chapters explore learning-related topics, analyzing the stochastic theories of learning as the basic approach, with the exception that distributions of response strengths are assumed to be transformed rather than response probabilities. The author arrives at three classes of learning operators, both linear and nonlinear, and the text concludes with a useful series of appendixes.

Book Preview

Individual Choice Behavior - R Duncan Luce

chapter 1

THE BASIC THEORY

A. INTRODUCTION

One large portion of psychology—including at least the topics of sensation, motivation, simple selective learning, and reaction time—has a common theme: choice. To be sure, in the study of sensation the choices are among stimuli, in learning they are among responses, and in motivation, among alternatives having different preference evaluations; and some psychologists hold that these distinctions, at least the one between stimulus and response, are basic to an understanding of behavior. This book attempts a partial mathematical description of individual choice behavior in which the distinction is not made except in the language used in different interpretations of the theory. Thus the more neutral word alternative is used to include the several cases.

In essence, the approach taken—in this respect, by no means novel—is orthogonal to that of S-R psychology, but not at variance with it. Rather than search for lawfulness between stimuli and responses and attempt to formulate a theory to describe those relationships, we shall be concerned with possible lawfulness found among different, but related, choice situations, whether these are choices among stimuli or among responses. Possibly the simplest prototype of this type of theory is the frequently assumed rule of transitivity among choices: given that a person chooses a over b and that he chooses b over c, then he chooses a over c when a and c are offered. This assumption, were it true, would be a law relating a person’s choice in one situation to those in two others, not a law relating responses to stimuli. It is evident that a sufficiently rich set of relations of this sort, coupled with a few simple S-R connections, will allow one to derive many more, and possibly quite complicated, S-R connections.

Such an approach seems to merit careful consideration, since several decades of pure S-R psychology have not resulted in notably simple laws of behavior. However, there seems little point in trying to discuss in detail its merits and demerits now, except to mention it in order to avoid confusion later. The results that follow—which seem to afford some insight into, and some integration of, psychological and psychophysical scaling, utility theory, and learning theory—will implicitly serve as the argument for the course taken.

1. Probabilistic vs. Algebraic Theories

A basic presupposition of this book is that choice behavior is best described as a probabilistic, not an algebraic, phenomenon. That is to say, at any instant when a person reaches a decision between, say, a and b we will assume that there is a probability P(a, b) that the choice will be a rather than b. These probabilities will generally be different from 0 and 1, although these extreme (and important) cases will not be excluded. The alternative is to suppose that the probabilities are always 0 and 1 and that the observed choices tell us which it is; in this case the algebraic theory of relations seems to be the most appropriate mathematical tool.

The decision between these two approaches does not seem to be empirical in nature. Various sorts of data (intransitivities of choices and inconsistencies when the same choices are offered several times) suggest the probabilistic model, but they are far from conclusive. Both of these phenomena can be explained within an algebraic framework provided that the choice pattern is allowed to change over time, either because of learning or because of other changes in the internal state of the organism. The presently unanswerable question is which approach will, in the long run, give a more parsimonious and complete explanation of the total range of phenomena.

The probabilistic philosophy is by now a commonplace in much of psychology, but it is a comparatively new and unproven point of view in utility theory. To be sure, economists when pressed will admit that the psychologist’s assumption is probably the more accurate, but they have argued that the resulting simplicity warrants an algebraic idealization. Ironically, some of the following results suggest that, on the contrary, the idealization may actually have made the utility problem artificially difficult.

2. Multiple Alternative Choices

Once choice behavior is assumed to be probabilistic, a problem arises which does not exist in the algebraic models. Complete data concerning the choices that a person makes from each possible pair of alternatives taken from a set of three or more alternatives do not appear to determine what choice he will make when the whole set is presented. Because they cannot escape multiple alternative choice problems economists have been particularly sensitive to this feature of probabilistic models, and it has undoubtedly been one source of their resistance in admitting imperfect discrimination. Early psychologists, particularly learning theorists, studied multiple alternatives experimentally, but since the data seemed dreadfully complicated a trend set in toward fewer and fewer alternatives until now many studies employ only two. For the most part, present-day psychologists have been willing to ignore—or, to be more accurate, to bypass and postpone—the connections between pairwise choices and more general ones. And so the relations have remained obscure.

We shall center our attention on this problem. The method of attack is to introduce a single axiom relating the various probabilities of choices from different finite sets of alternatives. It is a simple and, I feel, intuitively compelling axiom that appears to illuminate many of the more traditional problems, in particular the question of whether or not a comparatively unique numerical scale exists which reflects choice behavior. Such a scale, unique except for its unit, is shown to exist very generally. It appears to be the formal counterpart of the intuitive idea of utility (or value) in economics, of incentive value in motivation, of subjective sensation in psychophysics, and of response strength in learning theory.

3. Well-Defined Sets of Alternatives

So far, there seems to have been an implicit assumption that no difficulty is encountered in deciding among what it is that an organism makes its choices. Actually, in practice, it is extremely difficult to know, and much experimental technique is devoted to arranging matters so that the organism and the experimenter are (thought to be) in agreement about what the alternatives are. All of our procedures for data collection and analysis require the experimenter to make explicit decisions about whether a certain action did or did not occur, and all of our choice theories—including this one—begin with the assumption that we have a mathematically well-defined set, the elements of which can be identified with the choice alternatives. How these sets come to be defined for organisms, how they may or may not change with experience, how to detect such changes, etc., are questions that have received but little illumination so far. There are limited experimental results on these topics, but nothing like a coherent theory. Indeed, the whole problem still seems to be floundering at a conceptual level, with us hardly able to talk about it much less to know what experiments to perform.

More than any other single thing, in my opinion, this Achilles’ heel has limited the applicability of current theories of choice: it certainly has been a significant stumbling block in the use of information theory in psychology, it has limited learning theory applications to a rather special class of phenomena typified by T-maze experiments, etc. The present theory is no different in this respect from the others.

B. PROBABILITY AXIOMS

Throughout the book we shall suppose that a universal set U is given which is to be interpreted as the universe of possible alternatives (stimuli or responses). In practice U will have to possess a certain homogeneity: the decision maker will have to be able to evaluate the elements of U according to some comparative dimension and to be able to select from certain finite subsets of U the elements that he thinks are superior (or inferior or distinguished in some way) along that dimension. For example, in economics U may be taken to be a set of commodity bundles among which a person can express preferences; in psychophysics it may be the set of possible sound energies (at a fixed frequency) which a subject can be asked to evaluate according to loudness; or in learning theory U may be the set of alternative responses available to the organism. Note that U may be finite or infinite.

In general, a subject is not asked to make a choice from the whole of U but rather from some (small) finite subsets. In a great many experiments only two alternatives are presented to the subject at a time, and he is required to choose the one he prefers or the one he deems louder, etc. Of course, larger subsets could be used, although for the most part they have not been, and certainly most daily decisions are from larger subsets (e.g., the choice of a meal from a menu or the choice among several jobs, etc.).

Let T be a finite subset¹ of U and suppose that an element must be chosen from T. If x is an element of T (written x ∈ T), let P_T(x) denote the probability that the selected element is x. Slightly more generally, if S is a subset of T (written S ⊂ T), let P_T(S) denote the probability that the selected element lies in the subset S. These probabilities are the basic ingredients of the following theory.

In most choice models we would write P(x) for P_T(x) because the choice set T is held invariant throughout the discussion; in fact, we would let T and U be the same set. Here, however, several different choice sets are to be considered at once. Let us suppose that we are working with 1000 cps tones at different intensities measured in db above some reference level; let w, x, y, and z denote, respectively, the 50, 52, 54, and 56 db tones. Let T = {w, x, y} and T′ = {x, y, z} and consider choices according to loudness. There is assumed to be some probability, denoted by P_T(x), that x, the 52-db tone, will be called loudest when T is presented, and another, generally different, probability P_{T′}(x) that x will be called loudest when T′ is presented. There is no reason to expect these probabilities to be the same, and the purpose of the subscripts is to make the several probabilities identifiable.

It must not be forgotten, however, that all of the probabilities having the same subscript T form an ordinary probability measure on the subsets of T. This means, explicitly, that the following is assumed:

The ordinary probability axioms.

(i) For S ⊂ T, 0 ≤ P_T(S) ≤ 1.

(ii) P_T(T) = 1.

(iii) If R, S ⊂ T and R ∩ S = ∅, then P_T(R ∪ S) = P_T(R) + P_T(S).

Repeated application of part iii implies that

P_T(S) = Σ_{x ∈ S} P_T(x);

therefore, it is always sufficient to state results just for P_T(x).

Note that, given our interpretation of these probabilities, part ii means that the subject is forced to make a choice: the probability is 1 that his choice is in T when he must confine his choice to T.
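The three axioms and the summation consequence can be checked mechanically for a small choice set. The following is a minimal sketch, assuming made-up element probabilities for a set T = {w, x, y}; the numbers and the helper names `subsets`, `measure_from_points`, and `satisfies_probability_axioms` are hypothetical and serve only as illustration.

```python
from fractions import Fraction
from itertools import combinations


def subsets(T):
    """Yield every subset of T (including the empty set) as a frozenset."""
    items = sorted(T)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def measure_from_points(point_probs):
    """Extend element probabilities P_T(x) to every subset S of T by summation."""
    T = frozenset(point_probs)
    return {S: sum((point_probs[x] for x in S), Fraction(0)) for S in subsets(T)}


def satisfies_probability_axioms(P, T):
    """Check parts (i), (ii), and (iii) for a measure P defined on the subsets of T."""
    T = frozenset(T)
    all_subsets = list(subsets(T))
    in_range = all(0 <= P[S] <= 1 for S in all_subsets)            # part (i)
    total_one = P[T] == 1                                          # part (ii)
    additive = all(                                                # part (iii)
        P[R | S] == P[R] + P[S]
        for R in all_subsets
        for S in all_subsets
        if not (R & S)
    )
    return in_range and total_one and additive


# Hypothetical element probabilities, chosen only for illustration.
point_probs = {"w": Fraction(1, 10), "x": Fraction(3, 10), "y": Fraction(6, 10)}
P_T = measure_from_points(point_probs)
```

Exact fractions are used so that the additivity check in part (iii) is not disturbed by floating-point rounding.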

For simplicity of notation, and to conform to standard usage, P(x, y) is written to stand for P_{x,y}(x) when x ≠ y, and P(x, x) is taken to be 1/2, so that certain equations (e.g., P(x, y) + P(y, x) = 1) can be written without any restriction on the values assumed by x and y.

C. CHOICE AXIOM

1. Statement of Axiom

The axioms of ordinary probability theory establish certain restraints upon each of the measures P_T, but no connections are assumed among the several measures. However, one suspects that, at least for choice behavior, the several measures cannot be completely independent. The relationship we shall investigate can be stated as follows:

Axiom 1. Let T be a finite subset of U such that, for every S ⊂ T, P_S is defined.

(i) If P(x, y) ≠ 0, 1 for all x, y ∈ T, then for R ⊂ S ⊂ T

P_T(R) = P_S(R) P_T(S);

(ii) If P(x, y) = 0 for some x, y ∈ T, then for every S ⊂ T

P_T(S) = P_{T − {x}}(S − {x}).

Throughout the book the expression "axiom 1 holds for the set T" is used to mean not only that it holds for T itself but also that it holds for every subset of T.
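Part i can be illustrated numerically. The sketch below assumes one particular family of measures, in which every alternative carries a positive weight v(x) and P_T(R) is the weight of R divided by the weight of T. This ratio rule, the weights, and the function names are assumptions made for the example and are not part of the axiom's statement. With positive weights every pairwise probability lies strictly between 0 and 1, so part i is the relevant case.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(T):
    """Yield every nonempty subset of T as a frozenset."""
    items = sorted(T)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def choice_prob(v, T, R):
    """P_T(R) under the assumed ratio rule: weight of R over weight of T."""
    return Fraction(sum(v[x] for x in R), sum(v[x] for x in T))


def part_i_holds(v, T):
    """Check P_T(R) = P_S(R) * P_T(S) for every nested pair R within S within T."""
    T = frozenset(T)
    for S in nonempty_subsets(T):
        for R in nonempty_subsets(S):
            if choice_prob(v, T, R) != choice_prob(v, S, R) * choice_prob(v, T, S):
                return False
    return True


# Hypothetical positive weights for four alternatives.
v = {"w": 1, "x": 2, "y": 3, "z": 4}
holds = part_i_holds(v, v)
```

The check runs over every nested pair of nonempty subsets, so it also covers the convention that the axiom holding for T means it holds for each subset of T.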

2. Discussion

There are a number of points, both technical and conceptual, that should be made about the axiom.

a. Interpretation. Part ii of the axiom simply states that if y is invariably chosen over x then x may be deleted from T when considering choices from T. This seems reasonable. If one never selects liver in preference to roast beef, then in choosing among liver, roast beef, and chicken one can immediately reduce the problem to consideration of roast beef and chicken.

Lemma 1. If axiom 1 holds for T and if P(x, y) = 0 for some y ∈ T, then P_T(x) = 0.

PROOF. For z ∈ T, z ≠ x, part ii of axiom 1 implies

P_T(z) = P_{T − {x}}(z).

By parts ii and iii of the probability axioms,

P_T(x) = 1 − Σ_{z ∈ T − {x}} P_T(z) = 1 − Σ_{z ∈ T − {x}} P_{T − {x}}(z) = 1 − 1 = 0,

and the result follows.

By repeated applications of part ii of axiom 1, the choice set can be reduced to one in which only cases of imperfect discrimination (P(x, y) ≠ 0 or 1) occur, and then part i becomes applicable. So let us consider that part.
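The reduction described here can be sketched as a simple deletion loop. The pairwise probabilities below are hypothetical values built around the liver, roast beef, and chicken example, and `pair_prob` and `reduce_choice_set` are illustrative names rather than anything defined in the text.

```python
from fractions import Fraction

# Hypothetical pairwise probabilities P(x, y), stored once per unordered pair.
PAIRS = {
    ("liver", "roast beef"): Fraction(0),       # liver is never chosen over roast beef
    ("liver", "chicken"): Fraction(0),          # assumed for the example
    ("roast beef", "chicken"): Fraction(3, 5),  # imperfect discrimination
}


def pair_prob(x, y):
    """Return P(x, y), using P(x, y) + P(y, x) = 1 for the reversed pair."""
    if (x, y) in PAIRS:
        return PAIRS[(x, y)]
    return 1 - PAIRS[(y, x)]


def reduce_choice_set(T, pair_prob):
    """Repeatedly delete any x with P(x, y) = 0 for some y still in the set."""
    remaining = set(T)
    changed = True
    while changed:
        changed = False
        for x in sorted(remaining):
            if any(pair_prob(x, y) == 0 for y in remaining if y != x):
                remaining.remove(x)
                changed = True
                break  # restart the scan on the smaller set
    return remaining


menu = {"liver", "roast beef", "chicken"}
reduced = reduce_choice_set(menu, pair_prob)
```

After the loop ends, every remaining pair has a probability different from 0 and 1, which is the situation in which part i applies.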

To deal with complicated decisions, it is usual to subdivide them into two or more stages: the alternatives are grossly categorized in some fashion and a first decision is made among these categories; the one chosen is further categorized and a second decision is made, etc. It is commonly accepted, and it is probably true, that when such a multistage process is needed the over-all result depends significantly upon which intermediate partitionings are employed. One senses, however, that if the decision situation is quite simple (so that a two-stage process is not really needed) then the intermediate categorization, if used, will not matter. That is to say, the product P_S(R)P_T(S) will not depend upon S. But, by taking S = T, for which P_T(T) = 1 by part ii of the probability axioms, we see that this product must be P_T(R), which is part i of axiom 1.
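The claim that the intermediate categorization does not matter can be illustrated with a two-stage computation. This is a sketch under the same assumed ratio rule as before (positive weights, with P_T(R) equal to the weight of R over the weight of T); the weights, the two partitions, and the names `ratio_prob` and `two_stage_prob` are hypothetical.

```python
from fractions import Fraction


def ratio_prob(v, T, R):
    """P_T(R) under the assumed ratio rule: weight of R over weight of T."""
    return Fraction(sum(v[x] for x in R), sum(v[x] for x in T))


def two_stage_prob(v, partition, x):
    """Probability of ending at x: choose the category S containing x from T,
    then choose x from within S."""
    T = frozenset().union(*partition)
    S = next(block for block in partition if x in block)
    return ratio_prob(v, T, S) * ratio_prob(v, S, {x})


# Hypothetical positive weights and two different ways of categorizing T.
v = {"w": 1, "x": 2, "y": 3, "z": 4}
partition_a = [frozenset({"w", "x"}), frozenset({"y", "z"})]
partition_b = [frozenset({"w", "y", "z"}), frozenset({"x"})]

direct = ratio_prob(v, v.keys(), {"x"})
via_a = two_stage_prob(v, partition_a, "x")
via_b = two_stage_prob(v, partition_b, "x")
```

Under these assumptions both partitions give the same probability as the direct one-stage choice, which is the invariance that part i expresses.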