«Contents. 1 Nash Equilibrium in Extensive Form Games 2 1.1 Selten’s Game......... 1.2 The Little Horsey....... 1.3 Giving Gifts.. ...»
Suppose now that player 1 never oﬀers the Star Trek manual, so s ∗ = 0, but oﬀers the Game Theory book with positive probability, so r ∗ 0. In this case, q = 1, and player 2’s sequentially rational response is to accept for sure. However, if she accepts the oﬀer with certainty, then player 1 would prefer to oﬀer even the Star Trek manual: doing so would net For this reason, some authors call the strategies “partially separating” or “partially pooling.” him a payoﬀ of 1 while not oﬀering saddles him with 0. In other words, there can be no equilibrium of this type.
In this case, the player 1’s strategy is separating because it allows player 2 to infer what he knows with certainty. Not surprisingly, if he allowed her to do that, he would be at a disadvantage. It is not diﬃcult to see that separating in a way that convinces player 2 that the oﬀer contains the Game Theory book cannot be optimal: she would accept the oﬀer given this belief but then player 1 would have an incentive to oﬀer the Star Trek manual as well, which means player 2 cannot sustain her belief that she is oﬀered the Game Theory book for sure.
Conversely, player 1 could induce a posterior belief that the book is the Star Trek manual for sure by playing the separating strategy (r ∗ = 0, s ∗ = 1); that is, he never oﬀers the Game Theory book and always oﬀers the Star Trek manual. But in this case, player 2 will always reject the oﬀer, in which case player 1 will be strictly better oﬀ not oﬀering the Star Trek manual. So this cannot be an equilibrium either. This game has no separating equilibria.
The ﬁnal scenario to examine is r ∗ = s ∗ = 0. This leaves player 2’s information set oﬀ the equilibrium path of play, so Bayes rule cannot pin down her posterior beliefs. Can we sustain these strategies in equilibrium? To induce player 1 never to oﬀer even the Game Theory book, it must be the case that U1 (G|GT) ≤ U1 (NT |GT) ⇒ 3α − 1 ≤ 0 ⇒ α ≤ 1/3. That is, player 2 must threaten that she will reject the oﬀer with at least 2/3 probability. This threat will be credible if she believed the gift was bad with probability at least 1/2. In other words, if q ≤ 1/2, then player 2 can credibly reject the oﬀer. Observe now that this is suﬃcient to deter player 1 from oﬀering the Star Trek manual as well: U1 (G|ST) = 2α − 1 2( 1/3) − 1 = − 1/3 0 = U1 (NG|ST).
Therefore, we have multiple PBE that all take the same form:
r ∗ = 0, s ∗ = 0, α = 0 with q ≤ 1/2.
We have some latitude in specifying oﬀ-the-equilibrium-path beliefs here but as long as q ≤ 1/, these strategies can be sustained in PBE. Note that technically when q = 1/, any α ≤ 1/ would do the trick. However, as a knife-edge case that does not change the equilibrium path of play, we would normally ignore this.
Observe also that this PBE exists regardless of the prior. In this situation, we have multiple
equilibria. It is important to realize that the semi-separating and the other pooling equilibrium in which player 1 always makes an oﬀer are not “multiple equilibria” in that sense:
depending on the value of the exogenous parameter p, the equilibrium takes either the partially separating or the pooling form. If it weren’t for the pooling PBE where player 1 never oﬀers the gift, this game would have a unique solution for any value of this exogenous parameter. Unfortunately, with this second pooling PBE, this is no longer the case: the game now has at least two solutions (in terms of the equilibrium path of play) over the entire range of p, an indeterminacy that we would have to address somehow.
3 Backward Induction
Before learning how to characterize PBE in arbitrary games of imperfect information, we study how to use backward induction to compute PBE in games of complete information. In games of complete and perfect information, we can apply this technique to ﬁnd the “rollback” equilibrium, which we then generalize to “subgame-perfect” equilibrium. All of these are PBE but do not require the entire belief machinery.
3.1 Rollback Equilibrium
Computing PBE in games of complete and perfect information is very easy and one does not need to introduce the machinery of beliefs at all because all information sets are singletons.
Sequential rationality will suﬃce (recall that the conditional beliefs would always require that the probability of being at a node in a singleton information set is exactly 1).
These games can be solved by backward induction, a technique which involves starting from the last stage of the game, determining the last mover’s best action at his information set there, and then replacing the information set with the payoﬀs from the outcome that the optimal action would produce. Continuing in this way, we work upwards through the tree until we reach the ﬁrst mover’s choice at the initial node.
In 1913 Zermelo proved that chess has an optimal solution. He reasoned as follows. Since chess is a ﬁnite game (it has quite a few moves, but they are not inﬁnite), this means that it has a set of penultimate nodes. That is, nodes whose immediate successors are terminal nodes. The optimal strategy speciﬁes that the player who can move at each of these nodes chooses the move that yields him the highest payoﬀ (in case of a tie he makes an arbitrary selection). Now, the optimal strategies specify that the player who moves at the nodes whose immediate successors are the penultimate nodes chooses the action which maximizes his payoﬀ over the feasible successors given that the other player moves there in the way we just speciﬁed. We continue doing so until we reach the beginning of the tree. When we are done, we will have speciﬁed an optimal strategy for each player.
These strategies constitute a Nash equilibrium because each player’s strategy is optimal given the other player’s strategy. In fact, these strategies also meet the stronger requirements of subgame perfection, which we shall examine in the next section. (Kuhn’s paper provides a proof that any ﬁnite extensive form game has an equilibrium in pure strategies. It was also in this paper that he distinguished between mixed and behavior strategies for extensive form
games.) Hence the following result:
Theorem 4 (Zermelo 1913; Kuhn 1953). A ﬁnite game of perfect information has a pure strategy Nash equilibrium.
We can ﬁnd this equilibrium by backward induction in the following way. Start with the penultimate node, and determine the sequentially rational strategy for the player who gets to move there. Replace the node with the outcome produced when that player chooses this strategy. Continue working backward until you reach the initial node. The sequentially rational strategies you now have constitute the PBE of this game, which is called a rollback equilibrium.
Since Selten’s game in Fig. 2 (p. 3) is one of complete and perfect information, we can apply backward induction to ﬁnd its rollback equilibrium. At her information set, player 2 would choose L. This reduces player 1’s choices between D (which, given player 2’s strategy would yield 3) and U, which yields 2. Therefore, player 1 would choose D. The rollback equilibrium of this game is D, L. As before, this is also Nash and PBE.
Consider now the game in Fig. 10 (p. 19).
What are the Nash equilibria of this game? As usual, convert this to strategic form, as shown in Fig. 11 (p. 19). (We shall keep the non-reduced version to illustrate a point.) The Nash equilibria in pure strategies are (∼e, a), r, (∼e, ∼a), r, and (e, a), ∼r.
Applying backward induction produces the rollback equilibrium (∼e, ∼a), r illustrated in Fig. 12 (p. 19). This eliminates two of the pure-strategy Nash equilibria, and demonstrates
The only reason why ∼e is sequentially rational at player 1’s ﬁrst information set is because player 2’s sequentially rational strategy prescribes r, which in turn is only rational because she expects player 1 to choose ∼a at his last information set, where this is the sequentially rational choice. In other words, the optimality of player 1’s initial action depends on the optimality of his action at his second information set. This is precisely why we cannot determine optimality of strategies unless they specify what to do for all information sets. Note that in this case, the second information set is not reached if the strategy (∼e, ∼a) is followed, but we still need to know the action there.
The problem with the Nash equilibrium proﬁles (∼e, a), r and (e, a), ∼r is that they leave information sets oﬀ the equilibrium path of play and so Nash optimality cannot pin down behavior at sets that are never reached. For example, (e, a), ∼r leaves player 1’s second information set oﬀ the path of play, which causes the strategy to miss the fact that a is not sequentially rational at that set. This incredible threat rationalizes player 2’s choice of ∼r causing her to take an action that leaves this information set oﬀ the path. Similarly, (∼e, a), r leaves both player 1’s second information set and player 2’s information set oﬀ the path of play and causes the strategies to miss two problems: player 1’s choice of a is not sequentially rational at his second information set, and given that choice, player 2’s choice of r is not rational either! In other words, since the equilibrium path of play does not reach some continuation games, Nash cannot pin down optimal behavior there.
Let’s apply backward induction to the game in Fig. 13 (p. 20). There are three penultimate information sets, all of them for player 2. For each of these sets we determine player 2’s sequentially rational strategy. Because all information sets are singletons, we do not need to specify conditional beliefs (they are trivial and assign probability 1 to the single node in the information set). So, if the (2, 0) set is ever reached, then player 2 is indiﬀerent between accepting and rejecting. Any of these actions is plausible, so player 2’s sequentially rational strategy at (2, 0) includes both actions. In other words, player 2 will be willing to mix at this information set. On the other hand, player 2’s sequentially rational strategy at (1, 1) is unique: it is always better to accept. And so in the case of (0, 2). Therefore, player 2 has only two credible pure strategies, yyy and nyy.
(2, 0) (0, 2) (1, 1) y y y n n n 2,0 0,0 1,1 0,0 0,2 0,0
Given player 2’s best responses to each of his actions, player 1 can now optimize his own behavior. He knows that if he chooses (2, 0), then player 2 can either accept of reject, if he oﬀers (1, 1) or (0, 2), then player 2 will always accept. Given these conditions, player 1 will never oﬀer (0, 2) in equilibrium because it yields him a payoﬀ of 0. Given these strategies, player 1 has two optimal actions, either (2, 0) if he thinks player 2 will choose the ﬁrst strategy and (1, 1) if he thinks that player 2 will choose the second. Thus, we conclude that the game has only two rollback equilibria in pure strategies: (2, 0), yyy and (1, 1), nyy.4 Kuhn’s theorem makes no claims about uniqueness of the equilibrium. Indeed, as we saw in the example above, the game has two rollback equilibria. However, it should be clear that if no player is indiﬀerent between any two outcomes, then the equilibrium will be unique.
Note that all rollback equilibria are Nash equilibria. (The converse, of course, is not true. We have just gone to great lengths to eliminate Nash equilibria that seem implausible.)
3.2 Subgame-Perfect Equilibrium
If you accept the logic of backward induction, then the following discussion should seem a natural extension. Consider the game in Fig. 14 (p. 21). Here, neither of player 2’s choices is Player 2 is indiﬀerent between accepting and rejecting after (2, 0) oﬀer, so she can mix after receiving it. If she accepts with probability greater than 1/2, then player 1 is better oﬀ demanding (2, 0), and we have a continuum of SPE there. If, however, she accepts with probability less than 1/2, then player 1 is better oﬀ demanding (1, 1) instead (for another continuum of SPE). If she accepts exactly with probability 1/2, then player 1 is indiﬀerent between (2, 0) and (1, 1), so he can mix between these two strategies as well (for another continuum of SPE).
dominated at her second information set: she is better oﬀ choosing D if player 1 plays A and is better oﬀ choosing C if player 1 plays B. Hence, we cannot apply rollback.
However, we can reason in the following way. The game that begins with player 1’s second information set—the one following the history (D, R)—is a zero-sum simultaneous move game. We have seen similar games, e.g., Matching Pennies. The expected payoﬀs from the unique mixed strategy Nash equilibrium of this game are (0, 0). Therefore, player 2 should only choose R if she believes that she will be able to outguess player 1 in the simultaneousmove game. In particular, the probability of obtaining 2 should be high enough (in outweighing the probability of obtaining −2) that the expected payoﬀ from R is larger than 1 (the payoﬀ he would get if he played L). This can only happen if player 2 believes she can outguess player 1 with a probability of at least 3/4, in which case the expected payoﬀ from R will be at least 3/4(2) + 1/4(−2) = 1. But, since player 2 knows that player 1 is rational (and therefore just as cunning as she is), it is unreasonable for her to assume that she can outguess player 1 with such high probability. Therefore, player 2 should choose L, and so player 1 should go D. The equilibrium obtain by backward induction in the game in Fig. 14 (p. 21), then, is (D, 1/2[A]), (L, 1/2[C]).
This now is the logic of subgame perfection: replace every “proper subgame” of the tree with one of its Nash equilibrium payoﬀs and perform backward induction on the reduced tree.5 For the game in Fig. 14 (p. 21), once we replace the subgame that starts at player 1’s second information set with the Nash equilibrium outcome, the game becomes the one in Fig. 2 (p. 3), which we have already analyzed and for which we found that the backwardinduction equilibrium is D, L.