
It is important to realize that at this point I do not claim that such equilibrium exists—we shall look for one that has these properties. Also, I do not claim that if it does exist, it is the unique SPE of the game. We shall prove this later. However, given that the subgames are structurally identical, there is no a priori reason to think that oﬀers must be non-stationary, and, if this is the case, that there should be any reason to delay agreement (given that doing so is costly). So it makes sense to look for an SPE with these properties.

Let x∗ denote player A's equilibrium oﬀer and y∗ denote player B's equilibrium oﬀer (again, because of stationarity, there is only one such oﬀer). Consider now some arbitrary time t at which player A has to make an oﬀer to player B. From the two properties, it follows that if B rejects the oﬀer, she will then oﬀer y∗ in the next period (stationarity), which A will accept (no delay). So, B's payoﬀ to rejecting A's oﬀer is δy∗. Subgame perfection requires that B reject any oﬀer x with π − x < δy∗ and accept any oﬀer x with π − x > δy∗. From the no delay property, this implies π − x∗ ≥ δy∗. However, it cannot be the case that π − x∗ > δy∗ because player A could increase his payoﬀ by oﬀering some x such that

π − x∗ > π − x > δy∗. Hence:

π − x∗ = δy∗    (6)

Equation 6 states that in equilibrium, player B must be indiﬀerent between accepting and rejecting player A's equilibrium oﬀer. By a symmetric argument it follows that in equilibrium, player A must be indiﬀerent between accepting and rejecting player B's equilibrium oﬀer: π − y∗ = δx∗. Solving these two equations yields x∗ = y∗ = π/(1 + δ).
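The two indiﬀerence conditions can be checked numerically. Here is a small sanity-check sketch (the function name and the particular values of π and δ are mine, chosen only for illustration):

```python
# Solve the two indifference conditions by substitution:
#   pi - x = delta * y  and  pi - y = delta * x
# which yield x = y = pi / (1 + delta).

def stationary_offers(pi, delta):
    """Return (x_star, y_star) for the alternating-offers game."""
    x_star = pi / (1 + delta)
    y_star = pi / (1 + delta)
    return x_star, y_star

pi, delta = 1.0, 0.9
x, y = stationary_offers(pi, delta)

# Both indifference conditions hold:
assert abs((pi - x) - delta * y) < 1e-12   # B indifferent at A's offer
assert abs((pi - y) - delta * x) < 1e-12   # A indifferent at B's offer
```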

Proposition 1. The following pair of strategies is a subgame perfect equilibrium of the alternating-oﬀers game:

• player A always oﬀers x ∗ = π /(1 + δ) and always accepts oﬀers y ≤ y ∗,

• player B always oﬀers y ∗ = π /(1 + δ) and always accepts oﬀers x ≤ x ∗.

Proof. We show that player A's strategy as speciﬁed in the proposition is optimal given player B's strategy. Consider an arbitrary period t where player A has to make an oﬀer. If he follows the equilibrium strategy, his payoﬀ is x∗. If he deviates and oﬀers x < x∗, player B would accept, leaving A strictly worse oﬀ. Therefore, such a deviation is not proﬁtable. If he instead deviates by oﬀering x > x∗, then player B would reject. Since player B always rejects such oﬀers and never oﬀers more than y∗, the best that player A can hope for in this case is max{δ(π − y∗), δ²x∗}. That is, either he accepts player B's oﬀer in the next period or rejects it and A's oﬀer in the period after the next one is accepted. (Anything further down the road will be worse because of discounting.) However, δ²x∗ < x∗ and also δ(π − y∗) = δ²x∗ < x∗ (since π − y∗ = δx∗), so such a deviation is not proﬁtable either. Therefore, by the one-shot deviation principle, player A's proposal rule is optimal given B's strategy.

Consider now player A’s acceptance rule. At some arbitrary time t player A must decide how to respond to an oﬀer made by player B. From the above argument we know that player A’s optimal proposal is to oﬀer x ∗, which implies that it is optimal to accept an oﬀer y if and only if π − y ≥ δx ∗. Solving this inequality yields y ≤ π − δx ∗ and substituting for x ∗ yields y ≤ y ∗, just as the proposition claims.

This establishes the optimality of player A’s strategy. By a symmetric argument, we can show the optimality of player B’s strategy. Given that these strategies are mutually best responses at any point in the game, they constitute a subgame perfect equilibrium.
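The inequalities invoked in the proof can also be veriﬁed numerically. A quick check (the values of π and δ are arbitrary illustrations, not from the text):

```python
# Verify the conditions used in the proof of Proposition 1:
#   delta^2 * x_star < x_star         (waiting two periods is worse)
#   delta * (pi - y_star) < x_star    (accepting y_star next period is worse)
#   pi - delta * x_star == y_star     (A's acceptance threshold)

pi, delta = 1.0, 0.9
x_star = pi / (1 + delta)
y_star = pi / (1 + delta)

assert delta**2 * x_star < x_star
assert delta * (pi - y_star) < x_star
assert abs((pi - delta * x_star) - y_star) < 1e-12
```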

This is good but so far we have only proven that there is a unique SPE that satisﬁes the no delay and stationarity properties. We have not shown that there are no other subgame perfect equilibria in this game. The following proposition, whose proof involves knowing some (not much) real analysis, states this result.16

Proposition 2. The subgame perfect equilibrium described in Proposition 1 is the unique subgame perfect equilibrium of the alternating-oﬀers game.

If you know what a supremum of a set is, you can ask me and I will tell you how the proposition can be proved. There is a very elegant proof due to Shaked and Sutton that is much easier to follow than the extremely complicated original proof by Rubinstein.

We are now in game theory heaven! The rather complicated-looking bargaining game has a unique SPE in which agreement is reached immediately. Player A oﬀers x ∗ at t = 0 and player B immediately accepts this oﬀer. The shares obtained by player A and player B in the unique equilibrium are x ∗ = π /(1 + δ) and π − x ∗ = δπ /(1 + δ) respectively.
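To see how the split varies with patience, a short illustration (hypothetical values, not from the text): as δ → 1 the division approaches an even split, while impatient players concede almost everything to the ﬁrst mover.

```python
# Equilibrium shares as a function of the common discount factor delta.

def shares(pi, delta):
    a = pi / (1 + delta)       # player A's share
    return a, pi - a           # (A's share, B's share)

for delta in (0.1, 0.5, 0.9, 0.99):
    a, b = shares(1.0, delta)
    assert a > b                       # first-mover advantage for any delta < 1
    assert abs(a + b - 1.0) < 1e-12    # the whole pie is divided

a, b = shares(1.0, 0.999)
assert abs(a - 0.5) < 0.001   # near-even split when players are very patient
```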

In equilibrium, the share depends on the discount factor and player A’s equilibrium share is strictly greater than player B’s equilibrium share. In this game there exists a “ﬁrst-mover” advantage because player A is able to extract all the surplus from what B must forego if she rejects the initial proposal. In your homework you will be asked to ﬁnd the unique stationary no delay SPE when the two players use diﬀerent discount factors.

The Rubinstein bargaining model makes an important contribution to the study of negotiations. First, the stylized representation captures characteristics of most real-life negotiations:

(a) players attempt to reach an agreement by making oﬀers and counteroﬀers, and (b) bargaining imposes costs on both players.

Some people may argue that the inﬁnite horizon assumption is implausible because players have ﬁnite lives. However, this involves a misunderstanding of what the inﬁnite time horizon really represents. Rather than modeling a reality where bargaining can continue forever, it models a reality where players do not stop bargaining after some exogenously given predeﬁned time limit. The ﬁnite horizon assumption would have the two players stop bargaining even though each would prefer to continue doing so if agreement has not been reached. Unless there is a good explanation of who or what prevents them from continuing to bargain, the inﬁnite horizon assumption is appropriate. (There are other good reasons to use the assumption and they have to do with the speed with which oﬀers can be made. There are also some interesting models that explore bargaining in the context of deadlines for reaching an agreement. All this is very neat stuﬀ and you are strongly encouraged to read it.)

**4.6 Bargaining with Fixed Costs**

Osborne and Rubinstein also study an alternative speciﬁcation of the alternating-oﬀers bargaining game where delay costs are modeled not as time preferences but as direct per-period costs. These models do not behave nearly as nicely as the one we studied here, and they have not achieved widespread use in the literature.

As before, there are two players who bargain using the alternating-oﬀers protocol with time periods indexed by t, (t = 0, 1, 2,...). Instead of discounting future payoﬀs, they pay per-period costs of delay, c2 > c1 > 0. That is, if agreement is reached at time t on (x, π − x), then player 1's payoﬀ is x − tc1 and player 2's payoﬀ is π − x − tc2.
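The payoﬀ speciﬁcation can be sketched directly (the helper name and the particular cost values are illustrative assumptions):

```python
# Payoffs in the fixed-cost model: agreement on (x, pi - x) at time t.

def payoffs(x, t, pi, c1, c2):
    return x - t * c1, (pi - x) - t * c2

pi, c1, c2 = 1.0, 0.01, 0.02   # c2 > c1 > 0 as in the text
u1_now, u2_now = payoffs(0.5, 0, pi, c1, c2)
u1_late, u2_late = payoffs(0.5, 3, pi, c1, c2)

# Delay hurts both players, and the high-cost player (2) more.
assert u1_late < u1_now and u2_late < u2_now
assert (u1_now - u1_late) < (u2_now - u2_late)
```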

Let’s look for a stationary no-delay SPE as before. Consider a period t in which player 1 makes a proposal. If player 2 rejects, then she can obtain y ∗ − (t + 1)c2 by our assumptions.

If she accepts, on the other hand, she gets π − x − tc2 because of the t-period delay. Hence, player 2 will accept any oﬀer x such that π − x − tc2 ≥ y∗ − (t + 1)c2, or π − x ≥ y∗ − c2. To ﬁnd now the maximum she can expect to demand, note that by rejecting her oﬀer in t + 1, player 1 will get x∗ − (t + 2)c1 and by accepting it, he will get π − y − (t + 1)c1 because of the (t + 1)-period delay up to his acceptance. Therefore, he will accept any oﬀer y such that π − y − (t + 1)c1 ≥ x∗ − (t + 2)c1, which reduces to π − y ≥ x∗ − c1. Since player 2 will be demanding the most that player 1 will accept, it follows that y∗ = π − x∗ + c1. This now means that player 2 cannot credibly commit to reject any period-t oﬀer that satisﬁes:

π − x ≥ π − x∗ + c1 − c2, or equivalently, x∗ − x ≥ c1 − c2.

Observe now that since c1 < c2, it follows that the RHS of the second inequality is negative.

Suppose now that x∗ < π. Then it is always possible to ﬁnd x > x∗ such that 0 > x∗ − x ≥ c1 − c2. For instance, take x = x∗ − (c1 − c2) = x∗ + (c2 − c1) > x∗, which works because c2 > c1.

Therefore, if x∗ < π, it is possible to ﬁnd x > x∗ such that player 1 will prefer to propose x instead of x∗, which contradicts the stationarity assumption. Therefore, x∗ = π. This now pins down y∗ = π − x∗ + c1 = c1. This yields the following result.
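A quick numeric check of this ﬁxed point (the cost values are chosen only for illustration):

```python
# Stationary no-delay conditions in the fixed-cost model:
#   y_star = pi - x_star + c1  (player 1's acceptance constraint binds)
#   x_star = pi                (else player 1 could profitably raise his demand)

pi, c1, c2 = 1.0, 0.01, 0.02   # c2 > c1 > 0
x_star = pi
y_star = pi - x_star + c1

assert abs(y_star - c1) < 1e-12   # player 2's offer leaves player 1 exactly c1
# Player 2 accepts x_star = pi because pi - x >= y_star - c2 reduces to
# x <= pi - c1 + c2, which holds at x = pi since c2 > c1:
assert x_star <= pi - c1 + c2
```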

Proposition 3. The following pair of strategies constitutes the unique stationary no-delay subgame perfect equilibrium in the alternating-oﬀers bargaining game with per-period costs of delay c2 > c1 > 0:

• player 1 always oﬀers x ∗ = π and always accepts oﬀers y ≤ c1 ;

• player 2 always oﬀers y ∗ = c1 and always accepts oﬀers x ≤ π.

The SPE outcome is that player 1 grabs the entire pie in the ﬁrst period.

Obviously, if c1 > c2 > 0 instead, then player 1 will get c2 in the ﬁrst period and the rest will go to player 2. In other words, the player with the lower cost of delay extracts the entire bargaining surplus, which in this case is heavily asymmetric. If the low-cost player gets to make the ﬁrst oﬀer, he will obtain the entire pie. It turns out that this SPE is also the unique SPE (if c1 = c2, then there can be multiple SPE, including some with delay).

This model is not well-behaved in the following sense. First, no matter how small the cost discrepancy is, the player with the lower cost gets everything. That is, it could be that player 1's cost is c1 = c2 − ε, where ε > 0 is arbitrarily small. Still, in the unique SPE, he obtains the entire pie. The solution is totally insensitive to the cardinal diﬀerence in the costs; it responds only to their ordinal ranking. Note now that if the costs are very close to each other and we tweak them ever so slightly such that c1 > c2, then player 2 will get π − c2 ; i.e., the prediction is totally reversed! This is not something you want in your models. It is perhaps for this reason that the ﬁxed-cost bargaining model has not found wide acceptance as a workhorse model.
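This knife-edge behavior can be made concrete with a tiny sketch (the function and the cost values are hypothetical, built from the results stated above):

```python
# First mover's SPE share as a function of (c1, c2), per the results above.

def first_mover_share(pi, c1, c2):
    if c1 < c2:
        return pi       # low-cost first mover grabs the whole pie
    elif c1 > c2:
        return c2       # high-cost first mover gets only c2
    else:
        return None     # c1 == c2: multiple SPE, no unique prediction

pi, eps = 1.0, 1e-9
# An arbitrarily small change in costs flips the prediction completely:
assert first_mover_share(pi, 0.05 - eps, 0.05) == pi
assert first_mover_share(pi, 0.05 + eps, 0.05) == 0.05
```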

**5 Critiques of Backward Induction and Subgame Perfection**

Although backward induction and subgame perfection give compelling arguments for reasonable play in simple two-stage games of perfect information, things get uglier once we consider games with many players or games where each player moves several times.

**5.1 Critiques of Backward Induction**

There are two criticisms of BI and both have to do with questions about reasonable behavior.

In my mind, the second critique has more bite than the ﬁrst one, but I will give you both.

First, consider a game with n players that has the structure depicted in Fig. 24 (p. 44).

Since this is a game of perfect information, we can apply the backward induction algorithm.

The unique equilibrium is the proﬁle where each player chooses C and in the outcome each player gets 2.

People have argued that this is unreasonable because in order to get the payoﬀ of 2, all n − 1 players must choose C. If the probability that any player chooses C is p < 1, independent of the others, then the probability that all n − 1 will choose C is p^(n−1), which can be quite small if n is large even if p itself is very close to 1. For example, with p = .999 and n = 1001, the probability that all 1,000 other players choose C is .999^1000 ≈ .37.
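The arithmetic behind this example is a one-line check:

```python
import math

# Probability that all n - 1 = 1000 other players choose C when each
# does so independently with probability p = 0.999.
p, n = 0.999, 1001
prob_all = p ** (n - 1)

assert abs(prob_all - math.exp((n - 1) * math.log(p))) < 1e-12
assert prob_all < 0.37   # roughly e^{-1}, despite p being very close to 1
```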

The backward-induction solution is that players choose S at every information set. However, suppose that contrary to expectations player 1 chooses C at the initial node. What should player 2 do? The backward-induction solution says to play S, because player 1 will play S given a chance. However, player 1 should have played S at the initial node but did not. Since player 2’s optimal behavior depends on her beliefs about player 1’s behavior in the future, how does she form these beliefs following a 0-probability event? For example, if she believes that player 1 will stop with probability less than 2/3, then she should play C because doing so will get her at least 3, which is the best she obtains from stopping.

How does player 2 form these beliefs and what beliefs are reasonable? There are two ways to address this problem. First, we may introduce some payoﬀ uncertainty and interpret deviations from expected play as evidence that the payoﬀs diﬀer from those originally thought to be most likely. Instead of conditioning beliefs on probability-0 events, this approach conditions them on the payoﬀs that are most likely given the "deviation".

Second, we may interpret the extensive form game as implicitly including the possibility that players sometimes make small "mistakes" or "trembles" whenever they act. If the probabilities of "trembles" are independent across diﬀerent information sets, then no matter how often past play has failed to conform to the predictions of backward induction, a player is still justiﬁed in continuing to use backward induction for the rest of the game. There is a "trembling-hand perfect" equilibrium due to Selten that formalizes this idea. (This is a defense of backward induction.)

The question now becomes one of choosing between two possible interpretations of deviations. In Fig. 25 (p. 44), if player 2 observes C, will she interpret this as a small "mistake" by player 1 or as a signal that player 1 will choose C if given a chance? Who knows? I am more inclined toward the latter interpretation but your mileage may vary.

To see why it may make sense to treat deviations as a signal, suppose we extend the centipede game to 40 periods and now suppose we ﬁnd ourselves in period 20; that is, both players have played C 10 times. Is it reasonable to suppose these were all mistakes? Or that perhaps players are trying to get closer to the endgame where they would get better payoﬀs?

In experimental settings, players usually do continue for a while although they do tend to stop well short of the end. One way we can think about this is that the game is not actually capturing everything about the players. In particular, in experiments a player may doubt the rationality of the opponent (so he may expect her to continue) or he may believe she doubts his own rationality (so she expects him to continue, which in turn makes him expect her to continue as well). At any rate, small doubts like this may move the play beyond the game-stopping ﬁrst choice by player 1.
This does not mean that backward induction is “wrong.” What it does mean is that the full information common knowledge assumptions behind it may not be captured in experiments where real people play the Centipede Game. My reaction to this is not to abandon backward induction but to modify the model and ask: what will happen if players with small doubts about each other’s rationality play the Centipede Game? This is a topic for another discussion, though.

**5.2 Critiques of Subgame Perfection**

Obviously, all the critiques of backward induction apply here as well. However, in addition to these problems, SP also requires that players agree on the play in a subgame even when BI cannot predict the play.

The coordination game between players 1 and 3 has three Nash equilibria: two in pure strategies with payoﬀs (7, 10, 7), and one in mixed strategies with payoﬀs (3.5, 5, 3.5).17 If we specify an equilibrium in which players 1 and 3 successfully coordinate, then player 2 will choose R, and so player 1 will choose R as well, expecting a payoﬀ of 7. If we specify the MSNE, then player 2 will choose L because R yields an expected payoﬀ of 5 (coordination will fail half of the time). Again player 1 will choose R, expecting a payoﬀ of 8. Thus, in all SPE of this game player 1 chooses R.
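Assuming, for illustration, a symmetric coordination subgame in which matching pays players 1 and 3 seven each (and player 2 ten) while mismatching pays everyone zero — payoﬀs consistent with those reported here, though the actual ﬁgure is not shown — the MSNE expected payoﬀs work out as follows:

```python
# Mixed-strategy equilibrium of the hypothesized third-stage coordination game:
# each coordinating player plays A with probability 1/2.
q = 0.5
p_coord = q * q + (1 - q) * (1 - q)   # probability players 1 and 3 match

assert p_coord == 0.5
assert p_coord * 7 == 3.5    # expected payoff to players 1 and 3
assert p_coord * 10 == 5.0   # expected payoff to player 2
```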

Suppose, however, player 1 did not see a way to coordinate in the third stage, and hence expected a payoﬀ of 3.5 conditional on this stage being reached, but feared that player 2 would believe that the play in the third stage would result in coordination on an eﬃcient equilibrium. (This is not unreasonable since the two pure strategy Nash equilibria there are the eﬃcient ones.) If player 2 had such expectations, then she would choose R, which means that player 1 would go L at the initial node!

(Footnote 17: In this MSNE, each player chooses A with probability 1/2, as you should readily see.)

The problem with SPE is that all players must expect the same Nash equilibria in all subgames. So, while this was not a big problem for subgames with unique Nash equilibria, the critique has signiﬁcant bite in cases like the one just shown. Is such a common expectation reasonable? Who knows? (It depends on the reason the equilibrium arises in the ﬁrst place, which is not something we can say a whole lot about yet.)