
American Economic Association

Rationality in Extensive-Form Games. Author: Philip J. Reny. Source: The Journal of Economic Perspectives, Vol. 6, No. 4 (Autumn, 1992), pp. 103-118. Published by: American Economic Association. Stable URL: http://www.jstor.org/stable/2138271. Accessed: 22/04/2014.

This content downloaded from 103.243.237.5 on Tue, 22 Apr 2014 01:58:59 AM All use subject to JSTOR Terms and Conditions

Journal of Economic Perspectives, Volume 6, Number 4, Fall 1992, Pages 103-118

Rationality in Extensive-Form Games

Philip J. Reny

Since its inception, the Theory of Games has striven to describe a "rational" course of action for each of the participants involved in a situation of strategic interaction, ranging from the merely amusing realm of parlor games to the truly frightening arena of war. A formidable task, to be sure. Even imagining that a single principle of "rationality" could unite such diverse settings constitutes a major intellectual achievement. For the theory to encompass such a variety of strategic situations, the elements making up the theory must, of course, set aside features specific to any one. In their classic treatment, von Neumann and Morgenstern (1944) focus attention on a few elements common to many strategic situations and which, together, are intended to capture the essential details necessary for their analyses.¹ They are: i) the collection of players; ii) the physical order in which play proceeds; iii) the choices available whenever it is a player's turn to move; iv) the information about previous choices made by others available to a player whose turn it is to move; and v) the payoffs to each of the players resulting from any play of the game.² Any particular specification of these five elements constitutes an extensive-form description of a game, or simply an extensive-form game. Remarkably, one is hard-pressed to uncover a real-life strategic situation which cannot be usefully modelled by a carefully chosen extensive-form game.
¹For simplicity here, I omit any "moves of nature," like dice-rolling or card-shuffling, where chance plays a role. Of course, von Neumann and Morgenstern do not omit this possibility.
²Attention shall be restricted to finite games. Consequently, a play is a finite sequence of choices resulting in the end of the game.

* Philip J. Reny is Associate Professor of Economics, University of Western Ontario, London, Ontario, Canada.


This paper will restrict its attention to a particularly simple class of extensive-form games: those involving but two players, and having perfect information. The requirement of perfect information means that whenever it is a player's turn to move, that player is informed of all previous moves. Both chess and tic-tac-toe have this property. A game like tic-tac-toe is too simple to be interesting to play, yet its simplicity is a virtue when expositing the fundamental technique game theorists have typically employed to "solve" games with perfect information: backward induction.

As a matter of terminology, let's say that a "position" constitutes an entire configuration of the tic-tac-toe board. Thus, to describe the current position is to describe the location of every X and every O that has been played up to this point. For instance, ignoring symmetric positions, there are 3 positions with precisely one X in play, 12 positions with one X and one O in play, and so on. The backward induction solution technique assigns a definite value, say in terms of whether or not player one wins, to any possible position that could arise during the course of playing tic-tac-toe. Armed with this information, any novice would immediately become an expert player since, for instance, the novice would be aware of all positions that would ensure a win.

The important question is, of course: How are these values assigned? The idea is first to assign values to positions that are the simplest to analyze and, with this knowledge in hand, then move on to assign values to more complex positions. Regarding tic-tac-toe, a novice is probably unaware that if X, the first to move, were to first mark a corner, then the only non-losing response of O is to mark the center. However, even a novice knows that when X faces the situation in Figure 1, the only possible move will win the game. The technique of backward induction begins with these simplest positions.
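This position-by-position valuation is easy to sketch in code. The following is a minimal Python sketch (my own illustration, not from the article): positions are nine-character strings, and `value` assigns each one a backward-induction value from X's viewpoint (+1 win, 0 draw, -1 loss).

```python
from functools import lru_cache

# The eight winning lines of the 3x3 board, as index triples.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, to_move):
    """Backward-induction value of a position, from X's viewpoint:
    +1 if X can force a win, 0 if both players can force a draw, -1 if O
    can force a win. End positions are valued directly; every other
    position takes the value of the mover's best successor."""
    w = winner(board)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    if ' ' not in board:          # board full with no winner: a draw
        return 0
    successors = [value(board[:i] + to_move + board[i + 1:],
                        'O' if to_move == 'X' else 'X')
                  for i, sq in enumerate(board) if sq == ' ']
    return max(successors) if to_move == 'X' else min(successors)

print(value(' ' * 9, 'X'))  # 0: under optimal play, tic-tac-toe is a draw
```

Running the sketch from the empty board reproduces the conclusion the text draws from experience: both players can force a draw.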
We now illustrate by applying the technique to tic-tac-toe. Any position in which eight squares are filled, as in Figure 1 for example, can trivially be valued as either "win," "loss" or "draw" (for X). Consider now a position arising after 7 moves have been made in total, leaving two unmarked squares. Since a definite value has been obtained for every position involving eight moves, any move player O next makes has therefore been assigned a definite value. Among these, the best choice (there may be more than one) available for player O is then chosen, and this determines a value for the initial position after 7 moves. All positions arising after 7 moves are similarly assigned values. One now proceeds backward, inductively, assigning values to all positions arising after 6 moves, then 5, 4, and finally, to all first moves by X. The best first move by X then determines the outcome of tic-tac-toe. As we know from experience, the outcome is that both players can force a draw.

The same general backward induction technique is used to solve any game with perfect information. Positions in which the game is over according to the rules ("end positions") yield payoffs to the players as given by the rules. Positions penultimate to end positions are then assigned a "payoff vector" by choosing an end position which follows that is best for the player whose turn it


Figure 1: Backward Induction in Tic-Tac-Toe (a position with eight squares filled, in which X's only available move wins; figure not reproduced)
is to move, and so on. The backward induction technique not only assigns values to every possible position arising in a game, it also provides an optimal move for the player whose turn it is once that position is reached. As a matter of terminology, let us define a strategy for a player participating in a game as "a rule which designates a legal choice for that player whenever it is that player's turn to move." The aim of backward induction then is to provide each player with an optimal strategy.

Chess, of course, is an immensely more complex game. Nonetheless, in principle, the backward induction solution technique is equally capable of determining the outcome of chess, thereby rendering it as "uninteresting" as tic-tac-toe. The only limitation in applying backward induction to "solve" the game of chess is the enormous (although finite) number of positions that might potentially arise. There are simply too many positions for even the most sophisticated computers of the day to assign values to. Backward induction has, in any event, left its mark on the world of chess. In 1912, Zermelo used it to prove that, in principle, one of White or Black can guarantee at least a draw regardless of how the opponent plays. Unfortunately, the argument gives no indication as to whether it is White or Black, or how play ought to proceed. Chess remains, quite comfortably, a game beyond full analysis, in the sense that a lack of sufficient memory space prevents making the appropriate calculations.

Abstracting from computational limitations, backward induction provides impeccable advice to two "experts" playing any game with perfect information where the possible outcomes are either win, loss or draw. For an argument analogous to Zermelo's (1912) establishes that in such games, for every position, either at least one player has a strategy ensuring a win once that position is reached, or both players have a strategy ensuring each of them a draw from that position.
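For games whose end positions carry general payoff vectors rather than win, loss or draw labels, the same procedure applies. Here is a minimal sketch on an arbitrary game tree (the `Node` class and the example tree are hypothetical, chosen only to illustrate the payoff-vector assignment described above):

```python
class Node:
    """A position in a two-player perfect-information game tree."""
    def __init__(self, player=0, payoffs=None, children=None):
        self.player = player          # whose turn it is to move: 0 or 1
        self.payoffs = payoffs        # payoff vector at an end position
        self.children = children or []

def backward_induction(node):
    """Assign every position the payoff vector of the successor that is
    best for the player whose turn it is to move, working from the end
    positions back toward the root."""
    if node.payoffs is not None:      # end position: payoffs given by the rules
        return node.payoffs
    successor_values = [backward_induction(c) for c in node.children]
    return max(successor_values, key=lambda v: v[node.player])

# Player 0 moves first; going left hands the move to player 1.
leaf = lambda a, b: Node(payoffs=(a, b))
tree = Node(player=0, children=[
    Node(player=1, children=[leaf(3, 1), leaf(0, 2)]),  # player 1 would pick (0, 2)
    leaf(2, 0),
])
print(backward_induction(tree))  # (2, 0)
```

In this toy tree, player 0 forgoes the branch containing the payoff of 3, anticipating that player 1 would steer play to (0, 2); anticipating later optimal play is exactly what the backward pass encodes.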
Moreover, any pair of strategies induced by the backward induction technique (henceforth called backward induction strategies) has these properties.

Although the strategies given by backward induction appear "unequivocally" optimal, they are, in fact, definitely not optimal when pitted against certain kinds of "irrational" opponents. For instance, suppose that a computer has been programmed to play tic-tac-toe, and that the program contains a few bugs. In particular, whenever the computer finds itself in a position from which


it can guarantee itself a win, it then plays to lose if this is still possible. Otherwise, it plays without error. Suppose now that you are playing against this computer. The computer has the first move and marks an X in a corner. The unique response dictated by backward induction is to mark an O in the center. Any other move would guarantee X a win. However, this is precisely what you want. You know that after placing the computer in a position from which it can guarantee a win, it will play to let you win. Consequently, avoiding the backward induction choice becomes optimal!³

The lesson here is that backward induction strategies are optimal in win, lose or draw games so long as you believe that your opponent will never move from a position in which the opponent can guarantee a win to one in which at most a draw is possible. Since, loosely speaking, no "expert" would ever make such a blunder, backward induction strategies are indeed optimal when the game is played by experts. Finally, since the classical view of the role of a theory of games is to determine, again loosely speaking, "expert" strategies, the backward induction technique has arisen as the fundamental method by which "rational behavior" is determined not only in win, lose or draw games, but in all games with perfect information.

Let us then adopt the classical point of view that a theory of games is a description of "rational" behavior. Consequently, equipped with a book entitled "Theory of Games," any individual in any strategic situation need only consult the book to make a "rational" decision. One of the questions to address in this context is indeed whether or not strategies other than those provided by backward induction can ever appear in such a book. In offering an answer, we shall also explore the logical limits within which any "Theory of Games" must operate.

Backward Induction Paradoxes and Resolutions


In a win, lose or draw game, backward induction strategies must surely be considered those that are optimal by virtually any reasonable standard of "rational play." The same cannot be so confidently asserted for games not having the win, lose or draw payoff structure, however. The most well-known paradox associated with backward induction is that of the finitely repeated prisoner's dilemma.⁴ To stick with games of perfect information (which the prisoner's dilemma game is not, since each prisoner must choose whether or not to be a stool-pigeon in ignorance of the other's choice) we'll consider a
³I wish to thank Ariel Rubinstein for pointing this out to me in the context of these "win, lose or draw" games. In the simple example in the text, choosing the center might also be optimal. However, the entire backward induction strategy is not.
⁴For a lucid account of the prisoner's dilemma, see Luce and Raiffa (1957) or Dixit and Nalebuff (1991). Many other paradoxical examples can be found. Among these are Rosenthal's (1981) centipede game and Selten's (1978) chain-store game.


Figure 2: Take It or Leave It (extensive-form tree with alternating Take and Leave moves; figure not reproduced)
different game that illustrates essentially the same point. The game is called Take It or Leave It, or TOL for short. A referee announces to two players that he or she is equipped with N dollars and places one on the table in front of them. Player I can either take the dollar, ending the game, or leave it. If he leaves it, the referee places a second dollar on the table. Player II can now take the two dollars, ending the game, or she can leave it. Play continues in an alternating fashion with the referee adding one dollar to the pot whenever it is not taken. As soon as one of the players decides that the pot is large enough to take, the game ends with the other player receiving a payoff of zero. The game cannot go on forever, of course, since the referee has only N dollars. If the pot grows to N dollars, then the player whose turn it is to move can either take the N dollars or not, in which case it is given to the other player. Let's assume that the players care only about maximizing their own monetary payoff.

An extensive-form tree diagram of TOL, where N is odd, is provided in Figure 2. The game begins at the left-most "decision node" (small black circle) at which point it is player I's turn to move (a "I" is therefore placed above the decision node). The edges emanating from decision nodes represent the choices available to the player whose turn it is to move. The vectors give the payoffs to the players whenever a sequence of moves ends the game. For instance, if I and II alternate by playing Leave, Leave, Take, then player I receives a payoff of three and player II a payoff of zero. Before continuing, the reader may wish to think about how to play TOL when the pot might potentially grow to, say, $1,000, or your favorite six-figure amount.

The backward induction analysis of TOL proceeds as follows. Were the pot to grow to size N (odd, say), player I, whose turn it would be to move, does best by taking the money rather than leaving it for player II.
At the previous stage, when it is player II's turn to move and the pot is of size N - 1, II then does best to take the money, since leaving it will result in a payoff of zero for II. But, given this, at stage N - 2, player I does best to take the pot, and so it goes until we reach the beginning of the game whereupon the conclusion of backward induction is that player I should always take the one dollar and immediately end the game. And this is regardless of the value of N! To most, this conclusion


is at least counterintuitive, while to some it is utterly preposterous. It is a typical backward induction paradox. While the logic of backward induction appears impeccable, the "paradox" is that one nonetheless remains somehow dissatisfied and unpersuaded to act in the manner that it prescribes. Who would take the first dollar in a million-dollar TOL game? To resolve this paradox is to somehow bridge the gap between the (perhaps flawed) intuition expressed by those who would allow the pot to grow and the apparently "rational" strategy of always taking the current pot.

Kreps et al. (1982) suggest a possible resolution in the context of the paradox associated with the finitely repeated prisoner's dilemma. We shall adapt their ideas to the game of TOL. Suppose then, that instead of all players being interested in maximizing their monetary payoff, some players simply choose to leave the money whenever it is their turn to move, even in the last stage. There are two equivalent ways of thinking about these players. Either they always prefer leaving the money to taking it, in which case the payoffs given in Figure 2 do not reflect these players' preferences, or the payoffs do represent their preferences over the outcomes and these players are simply irrational: they do not always act to choose a most preferred outcome. Since it is simpler not to alter the payoffs, we adopt (as do Kreps et al.) the latter interpretation.
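The backward-induction analysis of TOL can also be checked mechanically. Below is a minimal sketch (the function name and stage bookkeeping are my own) that computes the payoff vector backward induction assigns to the start of the game:

```python
def solve_tol(N):
    """Backward-induction value of Take It or Leave It with N dollars.
    At stage k the pot holds k dollars; player I moves at odd stages,
    player II at even ones. Returns the payoff vector (to I, to II)
    assigned to the start of the game under optimal play."""
    # Stage N: taking the N dollars beats leaving them (which yields zero).
    value = (N, 0) if N % 2 == 1 else (0, N)
    for k in range(N - 1, 0, -1):
        mover = 0 if k % 2 == 1 else 1          # 0 = player I, 1 = player II
        take = (k, 0) if mover == 0 else (0, k)
        leave = value                           # continuation value of stage k + 1
        value = take if take[mover] >= leave[mover] else leave
    return value

print(solve_tol(1000))  # (1, 0): player I takes the very first dollar
```

For every N the backward pass collapses to (1, 0), mirroring the paradoxical conclusion in the text that player I should immediately end the game.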

To capture the idea that such "Leavers" are present, but that not all opponents are Leavers, suppose that with probability p, the opponent one faces is a Leaver. In addition, suppose that the payoff numbers in Figure 2 represent the players' von Neumann-Morgenstern utilities and that those who are not Leavers (call them "Maximizers") act so as to maximize expected utility.⁵

It's not hard to see that for any positive value of p, no matter how small, if N exceeds 1/p then a Maximizer will leave the first dollar. The reasoning is straightforward. If player I takes the first dollar, his expected utility is, of course, 1. Consider, however, the strategy for player I of always leaving the money unless it is his turn in the last stage, in which case he takes it. With probability p, player II is a Leaver and in this event player I's strategy will yield utility N. With probability (1 - p), player II is a Maximizer, and player I's resulting utility in this event will depend upon how II plays. However, it cannot be less than zero, no matter how II plays. Hence, this strategy yields player I an expected utility of at least pN which, by assumption, exceeds 1. Although this doesn't show that this strategy is optimal for player I, it does show that it is better than taking the first dollar. Consequently, for N > 1/p no Maximizer takes the first dollar.
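The lower-bound calculation in the last paragraph is a one-liner; the numbers below are illustrative, chosen only so that N exceeds 1/p:

```python
def leave_until_last_stage_bound(p, N):
    """Lower bound on a Maximizer's expected utility from the strategy
    'always leave the money unless it is my turn in the last stage':
    with probability p the opponent is a Leaver and the full pot of N is
    collected; otherwise the payoff is at least zero."""
    return p * N + (1 - p) * 0

p, N = 0.01, 200                           # N > 1/p = 100
print(leave_until_last_stage_bound(p, N))  # 2.0, which exceeds the sure payoff of 1
```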
⁵With this specification, Maximizers are risk-neutral. This is immaterial. Any von Neumann-Morgenstern utility function that is strictly increasing in money will produce the same qualitative features.


A full analysis of this situation involves more subtle reasoning than is suggested by the above considerations. For instance, what if p is small enough relative to N so that N < 1/p? Might p be so small as to again render taking the first dollar the only "rational" choice by a Maximizer? The answer is no. For suppose this were indeed the case. Then, whenever player II happened to be called upon to play because the first dollar was not taken, she would conclude that her opponent must be a Leaver (since Maximizers always take the first dollar). Player II's optimal response against a Leaver is to leave the money herself until the last stage is reached. But given all of this, a Maximizer who plays first would then have an incentive to leave the first dollar, pretending to be a Leaver and then to snatch the pot away from a shocked player II at the last opportunity. But this contradicts the initial hypothesis that it is always best for Maximizers to take the first dollar.

It turns out that "rational play" in this situation, as described by an appropriate generalization of backward induction to take into account the inherent uncertainty, involves Maximizers, at every stage but perhaps the last two, leaving the pot with some positive probability, regardless of the values of p and N.⁶ Fixing p, for large enough values of N, Maximizers must allow the pot to grow with certainty for a number of periods (which grows with N). Toward the end of the game, however, and again for N large enough, Maximizers will take the pot with some positive probability less than one (equal to one in the last stage, of course).

The lesson here is that no matter how slight the possibility that the opponent is a Leaver, rational play must involve leaving the money with some positive probability. This goes a long way toward resolving the original paradox. It doesn't quite go all the way, however, for true skeptics would insist that they leave the first dollar without the slightest doubt that they are playing against a Maximizer. Perhaps some simply won't listen to reason. Or perhaps ...

Taking A Closer Look


As shown by Kreps et al. and as illustrated in the discussion above for TOL, when players believe that they might be playing against a non-Maximizer (a Leaver, for instance), non-backward induction outcomes (of the original game) may well be "rational." It turns out that this finding is a corollary of a more basic point. Indeed, even if players do believe with certainty (henceforth know) that their opponents are Maximizers, it is enough that they place a positive probability on the event that some other player does not know that they are a Maximizer for the same conclusion to hold. As a matter of terminology, if for
⁶The only instances in which a Maximizer in the second-last stage can take the pot for sure is when (N - 2)/N < p < (N - 1)/N. Of course, Maximizers always take the pot in the last stage.


any specification of players i, j, k, ..., l all statements of the form "i knows that j knows that k knows that ... knows that l is a Maximizer" hold, then we'll say that it is common knowledge that all players are Maximizers. The more basic point is that so long as it is not common knowledge that all players are Maximizers, play which conflicts with backward induction may be rational. Kreps et al. were aware of this, and we shall focus on this particular message of theirs. On the other hand, if throughout the game it is common knowledge that all players are Maximizers, then the only outcomes consistent with all players actually being Maximizers are those given by backward induction.⁷

It would appear then that a very strong case can be made for the view that only backward induction strategies can be part of a theory of "rational" behavior. For one need only insist that a "rational" theory must be consistent with the view that all players are Maximizers, that all players know the theory (and wouldn't it be odd if "rational" players didn't know the theory of "rational" behavior?), that all players know that all players know the theory (again, how could this be false?), and so on. Since this would imply that the theory is common knowledge among the players and because the theory is consistent with maximizing behavior, it would imply that maximizing behavior is common knowledge among the players. And this would then imply that the theory must be one supporting backward induction, right?

Well, not quite. To move from the common knowledge assumption to backward induction, it is necessary that throughout the game it is common knowledge that all players are Maximizers. This qualification is crucial, since if for some position arising in the game, maximizing behavior is not common knowledge, then the lesson from Kreps et al. holds, and play other than that prescribed by backward induction may be rational from that point on.
Furthermore, if setting aside backward induction is rational at some position, it may well also be rational even before this position arises, since the optimality of backward induction play early in the game relies upon backward induction play later on (recall the example of the misprogrammed tic-tac-toe computer). So the previous argument supporting backward induction relies on the assumption that maximizing behavior is common knowledge among the players throughout the game. So what? The importance of recognizing this qualification lies in the fact that for many games it is simply not possible for maximizing

⁷According to the definition of "knowledge" provided in the early formal literature on the subject beginning with Aumann (1976), if an event was "known" to have occurred, then, in particular, it must have. Hence, according to this definition, it would be impossible to "know" something and later be presented with evidence to the contrary. In this essay we're following the usage employed by Brandenburger (in this symposium) where to "know an event has occurred" means only that you believe with certainty that this is the case. Consequently, it is possible to know something and subsequently find out that what you previously "knew" in fact was incorrect. Adopting a common terminology has obvious advantages and it is meant to be helpful. The reader is warned however to keep in mind that a statement such as "player I knows that player II is not a Maximizer" does not imply that indeed II is not a Maximizer, but only that player I believes this with certainty.


behavior to be common knowledge throughout.⁸ Before demonstrating this with an example, let us consider the implications. If some games offer positions where it is not possible for maximizing behavior to be common knowledge, then the argument of Kreps et al. implies that from that point on, play differing from backward induction play may be optimal, which may then render other than backward induction play optimal even before this position arises, perhaps even at the beginning of the game. The result can be that setting aside backward induction can make sense at the beginning of the game, even if initially it is common knowledge that all players are Maximizers!

For instance, suppose indeed that maximizing behavior is common knowledge at the beginning of the game of TOL. Moreover, suppose (and we'll show below that this is the case) that if player I leaves the first dollar then it is no longer possible for maximizing behavior to be common knowledge. Consequently, if called upon to play, player II's beliefs about whether or not player I is a Maximizer, or II's beliefs about the initial common knowledge, must change. And depending upon how II's beliefs change, player II may find it optimal, given these new beliefs, to leave the two dollars. But this may then make it optimal for player I to leave the first dollar, despite the initial common knowledge.

Note the subtle, yet important, difference between the above argument for allowing the pot to grow and that of Kreps et al. Their argument begins with the assumption that from the very start maximizing behavior is not common knowledge (neither player knows that the other is a Maximizer; the opponent may be a Leaver with probability p). Hence, their rationale for allowing the pot to grow in TOL is based upon an "irrationality," namely the presence of Leavers (or at least the belief in their presence).
In contrast, the argument put forward here does not require that common knowledge of maximizing behavior fail at the beginning of the game. The reason for this is that (according to the proposition below) by leaving the first dollar, player I can ensure that from then on maximizing behavior cannot be common knowledge regardless of the players' initial knowledge. And this may make player I better off (from his point of view) than taking the first dollar, since without the common knowledge player II may rationally allow the pot to grow.

Clearly, the key to the present argument is the claim that maximizing behavior cannot be common knowledge once player I leaves the first dollar in TOL. Before proving this, note that allowing the pot to grow to any size, even N, is consistent with the hypothesis that both players are Maximizers, since a Maximizer would indeed allow the pot to grow to N if playing against a Leaver. Of course, if the pot is allowed to grow to size N, then either the player whose turn it was at stage N - 1 is not a Maximizer, or that player does not know that
⁸Roughly, within the class of two-person finite games having perfect information, the only exceptions are those for which the unique backward induction play of the game reaches each of every player's decision nodes (Reny, 1992a).


the opponent is. In either case, at stage N it is impossible for the players to know that each is a Maximizer and that each knows the other is a Maximizer. Consequently, at stage N it is certainly impossible for maximizing behavior to be common knowledge. With this fact in hand, we're ready to state and prove a proposition.⁹

Proposition: If player I leaves the first dollar in TOL, then from that point on it is no longer possible for maximizing behavior to be common knowledge.

Proof: Consider any stage n ≥ 2 in TOL. Suppose, by way of contradiction, that maximizing behavior is common knowledge at stage n. Consequently (since it must be possible), suppose that two Maximizers play so that stage n is reached and that at stage n maximizing behavior is common knowledge. Therefore at stage n, the following two statements hold and are common knowledge:

1) Player I knows that player II is a Maximizer.
2) Player II knows that player I is a Maximizer.

Now, for concreteness suppose that it was player I who chose to leave the n - 1 dollars in the previous stage. Since player I is a Maximizer, this only makes sense if he believes that there is some chance that player II will leave the n dollars in this stage. This translates to the following additional fact at stage n:

3) Player I knows that player II will, with positive probability, leave the n dollars.

Next, note that because statement 3 follows only from the fact that player I is a Maximizer, and because maximizing behavior is common knowledge at stage n, statement 3 must also be common knowledge at stage n. Now, according to statement 3, player I would not be surprised if player II left the n dollars (after all, player I assigns this event positive probability). Consequently, player I's assessment of player II's future behavior (i.e. at stages n + 2, n + 4, ...) will, according to Bayes' rule, remain the same after II chooses to leave the n dollars as before the choice is made. But by statement 1 this behavior reflects that player II is a Maximizer. Hence, 1 and 3 together with Bayes' rule yield that at stage n + 1:

4) Player I knows that player II is a Maximizer.

In addition, because player II's actual choice at stage n does not influence her assessment of player I's subsequent behavior (at stages n + 1, n + 3, ...) and because statement 2 holds, player II will continue to know that player I is a Maximizer if she chooses to leave the n dollars. Therefore we also have that at stage n + 1:

5) Player II knows that player I is a Maximizer.
9 In the argument to follow, it is assumed that the payoffs, in particular, as well as the entire specification of the game is and remains common knowledge among the players. As mentioned earlier, allowing the payoffs to change is equivalent to changing the players' views about one another's behavior. Not insisting that other parts of the game remain common knowledge would only render it easier to obtain outcomes not prescribed by backward induction.

This content downloaded from 103.243.237.5 on Tue, 22 Apr 2014 01:58:59 AM All use subject to JSTOR Terms and Conditions

Philip J. Reny

113

But because 4) and 5) follow from 1-3, and 1-3 are common knowledge, 4) and 5) must also be common knowledge. That is, at stage n + 1, maximizing behavior is common knowledge.

What does this show? In TOL, if maximizing behavior is common knowledge at any stage n ≥ 2, then it is also common knowledge at stage n + 1. But this demonstration can then be repeated to ultimately show that it is also common knowledge at stage N. But we already know that this is impossible. We must conclude therefore that maximizing behavior cannot be common knowledge at any stage n ≥ 2. The proof is complete.

Figure 3: The Three Dollar TOL Game
[Game tree: at each of three stages the current mover chooses Take or Leave; a payoff pair labels each terminal node.]
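For comparison, the backward induction prescription for TOL can be computed mechanically. The sketch below is an illustration, not the paper's construction; it assumes the payoff convention described in the text (the mover who takes the pot of n dollars receives n and the opponent receives nothing, with the stage-N mover taking the whole pot), and the function name is ours:

```python
def backward_induction_tol(N):
    """Backward induction in the N-dollar take-it-or-leave-it (TOL) game.

    Assumed convention: at stage n the mover may Take the pot of n dollars
    (receiving n, the opponent 0) or Leave, passing a pot of n + 1 to the
    opponent; at stage N the mover simply takes the pot.  Returns the
    prescribed action at each stage and the payoff pair (stage-1 mover,
    opponent) when both players follow the prescription.
    """
    # value[n] = (payoff to the stage-n mover, payoff to the other player)
    # if play reaches stage n and both follow backward induction.
    value = {N: (N, 0)}            # the last mover takes the whole pot
    action = {N: "Take"}
    for n in range(N - 1, 0, -1):
        nxt_mover, nxt_other = value[n + 1]   # continuation, roles swapped
        take = (n, 0)
        leave = (nxt_other, nxt_mover)
        if take[0] >= leave[0]:    # the mover takes unless Leave is strictly better
            value[n], action[n] = take, "Take"
        else:
            value[n], action[n] = leave, "Leave"
    return action, value[1]

actions, payoffs = backward_induction_tol(3)
print(actions)   # {3: 'Take', 2: 'Take', 1: 'Take'}
print(payoffs)   # (1, 0)
```

Running it for the three-dollar game of Figure 3 confirms the familiar prescription: Take at every stage, so player I ends the game immediately with one dollar.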

It's now time to consider an example: the three dollar TOL game illustrated by Figure 3. Below, we describe beliefs that the players might entertain about one another's behavior at each but the last stage of this game.10 The beliefs we describe imply that maximizing behavior is common knowledge at the beginning of the game (that is, at stage 1). Nonetheless, an optimal choice for player I is to leave the first dollar. The beliefs are:

STAGE 1:
1.1 Player II knows (i.e., believes with certainty) that player I will, with probability one, take the dollar, and
1.2 Player I knows that if he leaves the dollar, player II will, with probability two-thirds, take the two dollars, and
1.3 Statements 1.1, 1.2, 2.1 and 2.2 are common knowledge.

STAGE 2:
2.1 Player II knows that if she leaves the two dollars, player I will, with probability one-third, take the three dollars, and
2.2 Player I knows that player II will, with probability two-thirds, take the two dollars, and
2.3 Statements 1.1, 1.2, 2.1, and 2.2 are common knowledge.

10 Since player II has no further opportunity to play by the time the pot reaches three dollars (i.e., at the third stage), player I's choice there is not disciplined by his beliefs about player II. Hence, these beliefs, when stage three is reached, are redundant.


114

Journal of Economic Perspectives

This is more than a mouthful. Let's consume it one piece at a time.

In stage 1, player II knows that player I is a Maximizer. Why? It suffices to show that II knows that I is making an expected utility maximizing choice. To see that this is so, note that by 1.3 and 1.2 player II knows that player I knows that leaving the first dollar would yield an expected utility equal to one. Taking the first dollar is then an expected utility maximizing choice. Since, by 1.1, player II knows that this is the choice player I will make, the conclusion follows.11

Player I also knows that II is a Maximizer. This follows by putting together 1.3 and 2.1 to conclude that in stage 1 player I knows that if he left the first dollar so that stage 2 were reached, player II would then no longer know that player I is a Maximizer, but would instead know that if given the chance, player I would, with probability two-thirds, leave the three dollars for player II rather than take them himself. Consequently, taking the two dollars in stage 2 yields the same expected utility for II as leaving them. But this means, by 1.2, that player I in stage 1 knows that II is a Maximizer.

Note then that 1.1, 1.2 and 1.3 together imply that in stage 1, each player knows that the other is a Maximizer. But by 1.3, both of 1.1 and 1.2 are common knowledge and 1.3, it turns out, is also common knowledge.12 Hence, all of the above is common knowledge in stage 1, which means then that in stage 1 it is common knowledge that both players are Maximizers. Since, in addition, according to player I's beliefs (as expressed by what he knows) about II in stage 1, it is certainly optimal for player I to leave the first dollar, the example is complete.

Incidentally, by 1.3 and 2.3, the players' beliefs (knowledge) about one another's behavior are common knowledge throughout the game. Since this is true, in particular, in stage 1 when it is also common knowledge that both players are Maximizers, the players' stage 1 beliefs about one another in this example actually constitute a Nash equilibrium of the game (as Brandenburger explains in this symposium). Finally, note that by the earlier proposition, in stage 2 at least one of the statements (player II knows that player I is a Maximizer; player II knows that player I knows that player II is a Maximizer; and so on) must be false. In the example, the first of these is false.

A brief summary of this section is probably worthwhile. First, whenever maximizing behavior is not common knowledge, the potential arises for non-backward induction behavior to be optimal, as argued by Kreps et al. Second, if player I leaves the first dollar in TOL then from that point on it is impossible for maximizing behavior to be common knowledge.
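The indifference arithmetic behind the example can be checked directly. A minimal sketch, using exact rationals (the terminal payoffs are reconstructed from the discussion above, and the variable names are of course not from the paper):

```python
from fractions import Fraction

# Payoffs (reconstructed from the text): taking the pot at stage n gives
# the mover n dollars and the opponent nothing; leaving the three dollars
# at stage 3 gives player II the three dollars instead.
p_II_takes = Fraction(2, 3)   # belief 1.2: II takes the two dollars
p_I_takes3 = Fraction(1, 3)   # belief 2.1: I takes the three dollars

# Player I at stage 1: Take yields 1 for sure; Leave yields 3 only when
# II leaves the two dollars and I then takes the pot himself.
eu_I_take = Fraction(1)
eu_I_leave = (1 - p_II_takes) * 3
assert eu_I_take == eu_I_leave == 1   # I is indifferent: leaving is optimal

# Player II at stage 2: Take yields 2 for sure; Leave yields 3 only when
# I then leaves the three dollars (probability 2/3 under belief 2.1).
eu_II_take = Fraction(2)
eu_II_leave = (1 - p_I_takes3) * 3
assert eu_II_take == eu_II_leave == 2  # II is indifferent as well

print(eu_I_leave, eu_II_leave)  # 1 2
```

Both comparisons come out exactly equal, which is what makes leaving the first dollar a maximizing choice even while maximizing behavior is common knowledge at stage 1.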
11 We remind the reader that simply because II knows (that is, believes with certainty) that player I will take the first dollar does not imply that player I will indeed take the first dollar. Player II may be incorrect.
12 Think of what it means to say that 1.3 is common knowledge, then note that this follows from 1.3 itself! In general, if an event is common knowledge, then it is common knowledge that the event is common knowledge.


Rationality in Extensive-Form Games 115

Third, it is possible for maximizing behavior to be common knowledge at the beginning of TOL and yet optimal for player I to leave the first dollar. And this is because when the second stage (or beyond) is reached, maximizing behavior cannot be common knowledge. In particular, player II might believe that player I is not a Maximizer. The example has precisely this feature.

Concluding Remarks
The example demonstrates that in the game of TOL, it may be perfectly "rational" for player I to leave the first dollar. Indeed, one can construct examples in which N exceeds three dollars and two maximizing players still allow the pot to grow as large as one wishes, even when maximizing behavior is common knowledge at the beginning of the game. Ben-Porath (1992) shows that for a large class of perfect information games, the outcomes consistent with common knowledge of maximizing behavior at the beginning of the game are precisely those that can result after one round of elimination of weakly dominated strategies and then iteratively eliminating strictly dominated strategies.13

Examples exhibiting the properties of TOL are not isolated. Reny (1992a) characterizes those two-person, finite games with perfect information within which it is possible for maximizing behavior to be common knowledge throughout, for it is in these games that a compelling argument for backward induction can be made. The characterization makes it clear that there are "few" such games.

It's now high time to consider an objection that the reader may have been harboring. For concreteness, let's consider the game of TOL. Apparently, it may be rational for the players to allow the pot to grow even to size N. But if this is so, won't the player whose turn it is to move at stage N - 1 then know that he is playing against a Maximizer (after all, allowing the pot to grow until that point is rational for both players), and won't that player then definitely take the N - 1 dollars rather than allow the opponent to take the N dollars in the last stage? And if this is so, won't the player at stage N - 2 know this and hence take the pot at stage N - 2? And ultimately then, won't player I take the first dollar? The answer is "no, not necessarily." It is simply impossible to be sure that one is playing against a Maximizer at stage N - 1 or in any stage beyond stage one.
If one could somehow prove at stage N - 1 that each player is a Maximizer, then presumably this proof would be common knowledge, and then it would be common knowledge that both players are Maximizers at stage N - 1. But this is impossible for N ≥ 3. That is, once the first dollar is not taken in TOL, although there is a rational explanation for this, it is also consistent with the hypothesis that player I is not a Maximizer or that player I does not know that II is a Maximizer. Moreover, without the availability of such "irrational" explanations, a rational explanation would not be possible at all. Returning to the example in which N = 3, for instance, if it is "obvious" that player I is a Maximizer even after leaving the first dollar, then player II will certainly take the two dollars. But if this is so "obvious," then doesn't that make player I's choice to leave the first dollar an irrational one? It simply cannot be "obvious" at stage 2 that player I is a Maximizer.

This sort of reasoning is analogous to that in Binmore (1985), where it is argued that backward induction play cannot be a complete expression of rationality, since if a choice not rooted in backward induction is made, this would violate a player's rationality, which upsets the backward induction argument in the part of the game to follow (and hence the preceding part as well). All of this, however, appears to be a consequence of the more basic observation that maximizing behavior cannot be common knowledge at some points in many games.

What are the implications of this for game theory in general? If one takes, as we did, the view that a theory of games is to provide a player with a rational decision whenever a decision must potentially be made, then in TOL the theory of games must provide player II with a rational decision if player I leaves the first dollar.

13 A strategy is weakly dominated by another if the other always yields a payoff that is at least as large regardless of the strategies employed by the other players, and for some choice of strategy for the other players it yields a strictly higher payoff. A strategy is strictly dominated by another if the other always yields a strictly higher payoff.
Since at that point in the game it is impossible for maximizing behavior to be common knowledge, the theory provided must allow for the possibility that player I is not a Maximizer or that I knows that II is not a Maximizer; or the theory cannot be common knowledge among the players.14 Thus, a theory of "rational" behavior cannot be developed without also implicitly or explicitly developing a theory of "irrational" behavior as well, expressed as some form of non-maximizing behavior or as some lack of knowledge of the theory.15

What then constitutes rational behavior in extensive form games? This paper has not provided (nor was it intended to provide) a definitive answer. We do hope to have convinced the reader, however, that at least for the game of TOL, violating backward induction and allowing the pot to grow may be perfectly rational. Consequently, the logic of backward induction is not always impeccable. Moreover, this conclusion extends well beyond the game of TOL. The reader who now has a feel for the arguments leading to the impossibility of common knowledge of maximizing behavior should be able to apply these to other backward induction paradoxes: Rosenthal's (1981) centipede game, Selten's (1978) chain-store game16 and even the finitely repeated prisoner's dilemma. In each case the reader will discover that common knowledge of maximizing behavior cannot hold throughout. And here, perhaps, lies a rather fundamental resolution of these and other backward induction paradoxes.

14 Otherwise maximizing behavior would be common knowledge, which is impossible at this point in the game.
15 Some recent examples of such theories include Gul's (1989) tau-theories and Reny's (1992b) explicable equilibria. A very early example of a theory of rational behavior developed together with a theory of irrational behavior is Selten's (1975) perfect equilibrium, although the kinds of irrationalities allowed there are very special. See Binmore (1985) for a nice critique.

* I wish to thank Preston McAfee, Motty Perry, Arthur Robson and the editors for their comments on an earlier draft, and the Social Sciences and Humanities Research Council of Canada for financial support.

16 Rosenthal's centipede game is very similar to TOL except that by choosing to keep the game going, the immediate effect on your cumulative winnings is to decrease them by one dollar, whereas your opponent's are increased by two. Consequently, by "agreeing" to continue play for a long time, both players can amass huge winnings. However, backward induction yields both players no winnings at all. In Selten's chain-store game, an incumbent finds that it is optimal, when up against any single potential entrant, to give it some market share if indeed it does enter rather than fight vigorously to retain the entire market. Consequently, the potential entrant finds it in its interest to enter. With many potential entrants, however, one would expect that fighting off one would be a good lesson to the others, and this would ensure the incumbent a healthy market share. However, backward induction prescribes that the incumbent acquiesce whenever a potential entrant enters. Consequently, all of the many potential entrants do indeed enter, and the incumbent's market share and profits are driven to near zero.
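Footnote 16's payoff rule for the centipede game can likewise be checked by computation. The sketch below is an illustration only (the function name, the zero starting totals, and the fixed number of decision nodes are our assumptions, not the paper's); it confirms that under the stated rule backward induction has every mover stop immediately, leaving both players with no winnings at all.

```python
def centipede(num_moves):
    """Backward induction in a Rosenthal-style centipede game.

    Payoff rule from footnote 16: choosing to continue lowers the mover's
    cumulative winnings by one dollar and raises the opponent's by two;
    stopping ends the game at the current totals.  Starting totals of
    (0, 0) and a fixed horizon are assumptions made for illustration.
    Players alternate moves.  Returns each node's prescribed action and
    the payoff pair (first mover, opponent) under backward induction.
    """
    # value[n] = (payoff to the node-n mover, payoff to the opponent),
    # measured as changes relative to the totals held when node n is reached.
    action = {}
    value = {num_moves + 1: (0, 0)}   # after the last node the game ends
    for n in range(num_moves, 0, -1):
        nxt_mover, nxt_other = value[n + 1]
        stop = (0, 0)                           # keep the current totals
        cont = (-1 + nxt_other, 2 + nxt_mover)  # shift totals, roles swap
        if stop[0] >= cont[0]:
            action[n], value[n] = "stop", stop
        else:
            action[n], value[n] = "continue", cont
    return action, value[1]

actions, payoffs = centipede(6)
print(payoffs)  # (0, 0)
```

Every mover prefers stopping to the continuation value, so the first mover stops at once and both players finish with nothing, exactly as the footnote asserts.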


References
Aumann, R., "Agreeing to Disagree," Annals of Statistics, 1976, 4, 1236-39.
Ben-Porath, Elchanan, "Common Belief of Rationality in Perfect Information Games," in preparation, Tel Aviv University, Tel Aviv.
Binmore, K. G., "Modelling Rational Players," mimeo, London School of Economics and University of Pennsylvania, 1985.
Dixit, A., and B. Nalebuff, Thinking Strategically. New York: Norton, 1991.
Gul, F., "Rational Strategic Behavior and the Notion of Equilibrium," mimeo, Stanford Graduate School of Business, 1989.
Kreps, D., et al., "Rational Cooperation in the Finitely Repeated Prisoner's Dilemma," Journal of Economic Theory, August 1982, 27:2, 245-52.
Luce, R. D., and H. Raiffa, Games and Decisions. New York: Wiley, 1957.
Neumann, J. von, and O. Morgenstern, Theory of Games and Economic Behavior. Princeton: Princeton University Press, 1944.
Reny, P. J., "Common Belief and the Theory of Games with Perfect Information," Journal of Economic Theory, forthcoming, 1992a.
Reny, P. J., "Backward Induction, Normal Form Perfection and Explicable Equilibria," Econometrica, May 1992b, 60, 627-49.
Rosenthal, R. W., "Games of Perfect Information, Predatory Pricing and the Chain-Store Paradox," Journal of Economic Theory, August 1981, 25:1, 92-100.
Selten, R., "Reexamination of the Perfectness Concept for Equilibrium Points in Extensive Games," International Journal of Game Theory, 1975, 4:1, 25-55.
Selten, R., "The Chain-Store Paradox," Theory and Decision, 1978, 9, 127-59.
