# «Contents. 1 Nash Equilibrium in Extensive Form Games 2 1.1 Selten’s Game......... 1.2 The Little Horsey....... 1.3 Giving Gifts.. ...»

There is no gain from folding because it yields at most 1. Therefore, choosing r at this node is sequentially rational. Consider now his information set b (chance has chosen the losing black card). If player 1 raises, he would get +1 if player 2 passes, and −2 if player 2 meets.

Under the assumption that player 2 implements her equilibrium strategy, the expected payoff from raising is 1/3 × 1 + 2/3 × (−2) = −1. If player 1 folds, his expected payoff is also −1. Hence, player 1 is indifferent between raising and folding at this information set and must be willing to randomize. In other words, mixing at this information set is sequentially rational, just like the equilibrium strategy specifies.
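The indifference condition can be checked with a few lines of exact arithmetic. This is only a sketch: the function name `raise_payoff_on_black` is ours, and the numbers (player 2 meets with probability 2/3, folding yields −1) are the ones used in the calculation above.

```python
from fractions import Fraction

def raise_payoff_on_black(meet_prob):
    """Player 1's expected payoff from raising when he holds the losing black card.

    He gets +1 if player 2 passes and -2 if she meets.
    """
    pass_prob = 1 - meet_prob
    return pass_prob * 1 + meet_prob * (-2)

FOLD_PAYOFF = Fraction(-1)   # payoff from folding, as stated in the text
meet_prob = Fraction(2, 3)   # player 2's equilibrium probability of meeting

raise_value = raise_payoff_on_black(meet_prob)
# Player 1 is willing to randomize only if the two payoffs coincide.
print(raise_value, raise_value == FOLD_PAYOFF)
```

Any other meeting probability breaks the indifference, which is why player 2's mixing probability is pinned down by player 1's incentives rather than her own.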

Consider now player 2’s information set h. Let qb denote the left node (that is, the node that follows player 1 raising if the card is black), and let qc denote the right node (that is, the node that follows player 1 raising if the card is red). We first calculate the probabilities of reaching these nodes under σ given Nature’s moves:

## 2.4 Beliefs After Zero-Probability Events

You may have detected some hand-waving in Requirement 3 in the “whenever possible” clause. You would be right. How do we update beliefs in the strategy profile (NN, N) in Fig. 5 (p. 5)? The probability of reaching player 2’s information set is 0 if these strategies are followed. We cannot use Bayes rule in such situations because it would involve division by 0. However, player 2’s belief is still meaningful under these conditions: it is the belief when she is “surprised” by being offered a gift. This problem only arises off the equilibrium path, never on it (because along the equilibrium path there is never a zero probability of reaching an information set).

In the case when Bayes rule does not pin down posterior beliefs, any beliefs are admissible.

This means that every action can be chosen as long as it is sequentially rational for some belief. Notice that in the gift game, N is not sequentially rational for any possible belief, and so it still would not be chosen. This is because N is strictly dominated by Y.

To help illustrate these ideas, consider the following motivational example in Fig. 8 (p. 13), where three players play a game that can end if player 1 opts out. Let p denote player 3’s belief that he is at the left node in his information set conditional on this set being reached by the path of play.

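[Figure 8: a three-player game in which player 1 chooses between O, which ends the game with payoffs (2, 0, 0), and I, which continues play.]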

Consider ﬁrst the strategy proﬁle (I, U, R). It is a Nash equilibrium of the game, and these strategies along with p = 1 satisfy Requirements 1 through 3 because there is no information set oﬀ the equilibrium path.

Consider now the strategy proﬁle (O, U, L) and the belief p = 0. These strategies satisfy Requirements 1 and 2 but fail Requirement 3. The strategies do form a Nash equilibrium because no player wants to deviate unilaterally. Player 3 has a belief and acts optimally given that belief, and players 1 and 2 both act optimally given the strategies of the other players.

However, this Nash equilibrium fails the requirement of consistent beliefs: player 3’s belief is inconsistent with player 2’s strategy. Because the information set is never reached, Nash equilibrium cannot pin the belief down. Requirement 3 does, because it forces player 3 to form beliefs consistent with the other players’ strategies. Since player 2 chooses U in this profile, the only belief player 3 can hold is p = 1, in which case his strategy of playing L is no longer sequentially rational, and so it fails Requirement 2.

Let’s now modify this example to demonstrate the “whenever possible” clause. Consider the game in Fig. 9 (p. 14), where player 2 now has the option to quit, which also ends the game.

Consider some equilibrium in which player 1’s optimal strategy is to play O, so that player 3’s information set is off the equilibrium path, as above. Now, however, Requirement 3 may not pin down player 3’s beliefs: player 2 may choose Q, in which case the probability of reaching player 3’s information set conditional on this strategy is zero, and the weak consistency requirement does not restrict the beliefs there. We can assign anything we want there (a rather dissatisfying thing to do). Other refinements do put additional restrictions to handle these cases. Note, of course, that if player 2 chooses U with probability q1, D with probability q2, and Q with probability 1 − q1 − q2, then Requirement 3 does have a bite whenever player 3’s information set is reached with positive probability given player 2’s strategy: we require that p = q1/(q1 + q2) whenever q1 + q2 > 0.
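The “whenever possible” clause is easy to see in a small sketch. The helper name `player3_belief` is hypothetical; it returns the belief p from the formula above when Bayes rule applies, and `None` when player 2 quits for sure, so that weak consistency places no restriction on p.

```python
from fractions import Fraction

def player3_belief(q1, q2):
    """Belief p that player 3 is at the left node (the one following U).

    q1 and q2 are the probabilities with which player 2 plays U and D;
    she quits (Q) with the remaining probability 1 - q1 - q2.
    Returns None when the information set is reached with probability zero,
    because Bayes rule would then require division by 0.
    """
    reach = q1 + q2
    if reach == 0:
        return None  # any belief is admissible here
    return Fraction(q1) / Fraction(reach)

print(player3_belief(Fraction(1, 2), Fraction(1, 4)))  # Bayes rule applies
print(player3_belief(1, 0))                            # player 2 plays U for sure
print(player3_belief(0, 0))                            # player 2 quits for sure
```

Returning `None` rather than an arbitrary number makes the point explicit: in that case the belief is a free parameter of the assessment, not something derived from the strategies.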

What beliefs players have after zero-probability events is not a minor technical issue, but an extremely important question, and much research in game theory has been directed at deciding what sort of beliefs are “reasonable” to have.

## 2.5 Perfect Bayesian Equilibrium

We have everything in place to define our solution concept, which is a stronger version of Nash equilibrium; i.e., it eliminates certain Nash equilibria that fail the additional requirements. We shall call the pair of a strategy profile and a belief vector, (σ, π), an assessment.

Definition 3. An assessment (σ∗, π∗) is a perfect Bayesian equilibrium (PBE) if the strategies specified by the profile σ∗ are sequentially rational given beliefs π∗, and the beliefs π∗ are weakly consistent with σ∗.

A PBE is a set of strategies and beliefs such that, at any stage in the game, strategies are optimal given the beliefs, and the beliefs are obtained from the equilibrium strategies and observed actions via Bayes rule whenever possible. Beliefs are elevated to the level of strategies here, and the equilibrium consists not just of a strategy for each player but also includes a belief for each player at each information set where the player has to move. Before, we insisted that players choose reasonable strategies; we now also require that they hold reasonable beliefs.

The deﬁnition of PBE is circular in the sense that strategies must be optimal given beliefs and beliefs are derived from the strategies. This means that we must solve for strategies and beliefs simultaneously, like a system of equations. Sometimes this is quite involved, and we shall spend quite a bit of time practicing diﬀerent ways of approaching these games.

However, at least we do know that if we look for PBE, we shall ﬁnd at least one in every game we are likely to solve in this class. The following result establishes this claim.

Theorem 3. If (σ, π) is a perfect Bayesian equilibrium of an extensive-form game with perfect recall, then σ is a mixed-strategy Nash equilibrium. For any finite extensive-form game, a perfect Bayesian equilibrium exists.

This theorem tells us that the set of perfect Bayesian equilibria is really a subset of the set of Nash equilibria. That is, any PBE is also a Nash equilibrium. The converse, of course, is not true: there are Nash equilibria that are not PBE. However, the theorem guarantees that the additional restrictions will not eliminate all Nash equilibria from consideration. That is, each ﬁnite game will have at least one PBE. This, as you can imagine, is very important if we want to solve games.

So remember, a PBE is a Nash equilibrium where strategies are sequentially rational given the beliefs, and the beliefs are weakly consistent with these strategies (updated via Bayes rule whenever possible).

Going back to our Gift-Giving Game in Fig. 7 (p. 10), recall that player 2’s sequentially rational strategy is to accept whenever q > 1/2, reject whenever q < 1/2, and either one (including mixtures) otherwise. Let player 1’s strategy be denoted by (r, s), where r is the probability of offering the Game Theory book, and s is the probability of offering the Star Trek manual. Bayes rule then yields player 2’s posterior belief:

$$q = \frac{pr}{pr + (1 - p)s}.$$

To find the PBE, we must find mixing probabilities for the two players that are sequentially rational and that are also such that q is consistent with player 1’s equilibrium strategy. Suppose first that player 1’s strategy is completely mixed; that is, he randomizes at both information sets, so r, s ∈ (0, 1). Take an arbitrary (possibly degenerate) mixed strategy for player 2 in which she accepts with probability α ∈ [0, 1]. Since player 1 is willing to mix, he must be indifferent between offering the gift and not offering it at both information sets. Not offering gives him a payoff of 0 in either case. Offering, on the other hand, yields a payoff U1(G|GT) = 2α + (1 − α)(−1) if the book is on Game Theory and U1(G|ST) = α + (1 − α)(−1) if it is on Star Trek. Observe now that because player 1 must be indifferent to mix, it follows that U1(G|ST) = U1(NG|ST), or α + (1 − α)(−1) = 0 ⇒ 2α − 1 = 0 ⇒ α = 1/2.

That is, if player 1 is indifferent between offering the Star Trek manual and not offering it, it must be the case that player 2 will accept the offer with probability exactly equal to 1/2. This now implies that

$$U_1(G \mid GT) = 2\left(\tfrac{1}{2}\right) + \left(1 - \tfrac{1}{2}\right)(-1) = 1 - \tfrac{1}{2} = \tfrac{1}{2} > 0 = U_1(NG \mid GT).$$

In other words, α = 1/2, which must hold for player 1 to mix when he holds the Star Trek manual, also ensures that he cannot possibly mix when he holds the Game Theory book: he strictly prefers to offer it. This contradicts the supposition that player 1 mixes in both cases. We conclude that if player 1 mixes on the Star Trek manual in equilibrium, he must be offering the Game Theory book for sure.
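The argument that player 1 cannot mix at both information sets can be replayed numerically. The sketch below uses only the payoffs quoted above (offering yields 2 or 1 if accepted, −1 if rejected, and not offering yields 0); the function names are ours.

```python
from fractions import Fraction

NO_OFFER = 0  # payoff from not offering, whatever the gift is

def offer_payoff_gt(alpha):
    """U1(G|GT): expected payoff from offering the Game Theory book."""
    return 2 * alpha + (1 - alpha) * (-1)

def offer_payoff_st(alpha):
    """U1(G|ST): expected payoff from offering the Star Trek manual."""
    return alpha + (1 - alpha) * (-1)

alpha_st = Fraction(1, 2)  # acceptance probability that makes the ST type indifferent
alpha_gt = Fraction(1, 3)  # acceptance probability that makes the GT type indifferent

# Each type is indifferent at a different alpha, so no single alpha
# can make player 1 willing to randomize at both information sets.
print(offer_payoff_st(alpha_st) == NO_OFFER, offer_payoff_gt(alpha_st))
print(offer_payoff_gt(alpha_gt) == NO_OFFER, offer_payoff_st(alpha_gt))
```

At α = 1/2 the Game Theory type strictly gains from offering, and at α = 1/3 the Star Trek type strictly loses, which is the contradiction used in the text.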

Since we now know that s∗ > 0 ⇒ r∗ = 1, let’s see if there is a PBE with these properties.

From the discussion above, we know that this equilibrium requires α∗ = 1/2 or else player 1 would not randomize with the Star Trek manual. This now implies that q = 1/2 or else player 2 would not randomize in her acceptance decision.

Putting everything together yields:

$$q = \frac{p(1)}{p(1) + (1 - p)s^*} = \frac{1}{2} \;\Rightarrow\; s^* = \frac{p}{1 - p}.$$

We need to ensure that s∗ is a valid mixing probability; that is, we must make sure that s∗ ∈ (0, 1). It is clearly positive because p ∈ (0, 1). To ensure that s∗ < 1, we also need p < 1/2. Hence, we found a PBE. Writing it in behavioral strategies yields the following:

$$r^* = 1, \quad s^* = \frac{p}{1 - p}, \quad \alpha = \frac{1}{2}, \quad \text{provided } p < \frac{1}{2}.$$

This equilibrium is intuitive: since player 1 always offers the Game Theory book and only sometimes offers the Star Trek manual, player 2 is willing to risk accepting his offer. Of course, her estimate about this risk depends on her prior belief. Player 1’s strategy is precisely calibrated to take into account this belief when he tries to bluff player 2 into accepting what he knows is a gift she would not want. Note that p < 1/2 is a necessary condition for this equilibrium. As we shall see shortly, if player 2’s priors are too optimistic, she would accept the offer for sure, in which case we would expect player 1 to offer her even the Star Trek manual for sure.
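As a sanity check, the semi-separating equilibrium can be computed for a given prior and fed back into Bayes rule. This is a minimal sketch under the assumptions of the text: r is the probability of offering the Game Theory book, s the probability of offering the Star Trek manual, and the posterior is q = pr / (pr + (1 − p)s). The function names and the dictionary layout are ours.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def posterior(p, r, s):
    """Player 2's belief that an offered gift is the Game Theory book."""
    reach = p * r + (1 - p) * s
    if reach == 0:
        return None  # no offer is ever made, so Bayes rule does not apply
    return p * r / reach

def semi_separating(p):
    """Return the semi-separating PBE for prior p, or None if p is not in (0, 1/2)."""
    if not 0 < p < HALF:
        return None
    return {"r": Fraction(1), "s": p / (1 - p), "alpha": HALF}

p = Fraction(1, 4)
eq = semi_separating(p)
q = posterior(p, eq["r"], eq["s"])
# The bluffing probability s is calibrated so that the posterior is exactly 1/2,
# which is what keeps player 2 willing to randomize.
print(eq["s"], q)
```

For priors of 1/2 or more the function returns `None`, since the implied s would no longer be a probability strictly below 1.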

Observe now that player 2 has learned something from player 1’s strategy that she did not know before: her equilibrium posterior belief is q = 1/2 > p. That is, she started out with a prior which assigned less than 50% chance to the gift being a Game Theory book and then updated this belief to 50% upon seeing the gift being offered. However, she still is not sure just what type of gift she is being offered. Player 1’s strategy is called semi-separating because player 1’s actions allow player 2 to learn something, but not everything, about the information he has: she can “separate” the Game Theory gift from the Star Trek manual only partially. The residual uncertainty player 2 has is a common feature of the other player choosing a semi-separating strategy.

We have exhausted the possibilities in which player 1 mixes when he has the Star Trek manual. Only two possible types of strategies remain: he either always oﬀers it or never does.

Suppose first that player 1 always offers the Star Trek manual in equilibrium, so s∗ = 1. From our calculations above, we know that this means player 2 would have to accept his offer with α ≥ 1/2, which in turn implies that q ≥ 1/2 as well. If player 2 accepts with probability at least as high as 1/2, then player 1 will always offer the Game Theory book: U1(G|GT) = 3α − 1 > 0 for any α > 1/3. This now means that player 1 always offers the gift regardless of its type, which implies q = p. Since we require that q ≥ 1/2, it follows that this equilibrium only exists if p ≥ 1/2. Hence, we found a PBE for that range of priors:

$$r^* = 1, \quad s^* = 1, \quad \alpha = 1, \quad \text{provided } p \ge \frac{1}{2}.$$

Note that if p = 1/2, then player 2 is indifferent between accepting and rejecting, so she can mix with any probability as long as α ≥ 1/2, so there is a continuum of PBE in this case. However, requiring that the prior equal a particular value is an extremely demanding condition and these solutions are extremely fragile: the smallest deviation from p = 1/2 would immediately produce one of the PBE we identified above. Normally, we would ignore solutions that depend on knife-edge conditions like that. It is important to note that whereas p = 1/2 is a knife-edge condition we can ignore, q = 1/2 in our semi-separating PBE is not. Unlike the prior, the posterior probability is strategically induced by the behavior of the player.

In this equilibrium, player 1 is playing a pooling strategy because he “pools” on the same action (oﬀering the gift) no matter what he knows about the gift’s type. Not surprisingly, whenever a player uses a pooling strategy, his opponent cannot learn anything from the behavior she observes. As we have seen, her posterior is exactly equal to her prior.
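The pooling case can be checked the same way. The sketch below assumes the pooling profile r = s = 1 with player 2 accepting for sure (α = 1), and uses the payoffs and the acceptance threshold q ≥ 1/2 from the discussion above; the function names are ours.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def posterior(p, r, s):
    """Player 2's belief that an offered gift is the Game Theory book."""
    reach = p * r + (1 - p) * s
    if reach == 0:
        return None  # Bayes rule does not apply
    return p * r / reach

def pooling_is_pbe(p):
    """Check whether 'always offer, always accept' survives for prior p."""
    alpha = Fraction(1)                        # player 2 accepts for sure
    q = posterior(p, Fraction(1), Fraction(1))
    offer_gt = 2 * alpha + (1 - alpha) * (-1)  # U1(G|GT)
    offer_st = alpha + (1 - alpha) * (-1)      # U1(G|ST)
    player1_ok = offer_gt >= 0 and offer_st >= 0
    player2_ok = q >= HALF                     # accepting must be sequentially rational
    return player1_ok and player2_ok

p = Fraction(3, 4)
# Under pooling, the offer carries no information: the posterior equals the prior.
print(posterior(p, Fraction(1), Fraction(1)) == p)
print(pooling_is_pbe(p), pooling_is_pbe(Fraction(1, 4)))
```

Player 1's incentives are satisfied for any prior once α = 1; it is player 2's acceptance decision that ties the existence of this equilibrium to p ≥ 1/2.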