











Game-Theoretic Artificial Intelligence^1 Spring 2007 Professor Amy Greenwald Topic #
This lecture is concerned with the Nobel Prize winning work of John Nash. In particular, we define the notion of mixed strategies in matrix games, and we present Nash’s argument on the existence of mixed strategy (Nash) equilibrium.
The most well-known game-theoretic scenario is the paradoxical situation known as the Prisoners’ Dilemma, which was popularized by Axelrod [1] in his popular science book. The following is one (uncommon) variant of the story.^2
A crime has been committed for which two prisoners are held incommunicado. The district attorney is assigned to question the prisoners. He designs the following incentive structure to induce the prisoners to talk. If neither prisoner talks, both prisoners automatically receive mild sentences (payoff 4). But if exactly one prisoner squeals on the other, the squealer is let off scot free (payoff 5), while the “squealee” is subject to a severe sentence (payoff 0). Finally, if both prisoners squeal, they share the severe punishment (payoff 1).
The Prisoners’ Dilemma is a two-player matrix (or strategic, or normal-form) game. Such games are easily described by payoff matrices, where the strategies of player 1 and player 2 serve as row and column labels, respectively, and the corresponding payoffs are listed as pairs in matrix cells such that the first (second) number is the payoff to player 1 (2). The payoff matrix which describes the Prisoners’ Dilemma is depicted in Figure 1, with C denoting “cooperate” (i.e., stay silent), and D denoting “defect” or “don’t cooperate” (i.e., squeal).
This game is known as the Prisoners’ Dilemma because the only rational outcome is (D, D), which yields suboptimal payoffs of (1, 1). The reasoning is as follows. If player 1 plays C, then player 2 is better off playing D, since D yields a payoff of 5, whereas C yields only 4; but if player 1 plays D, then player 2 is again better off playing D, since D yields a payoff of 1, whereas C yields only 0. By symmetry, the same reasoning applies to player 1, so both players defect.
(^1) Copyright © Amy Greenwald, 2007
(^2) The original anecdote, due to A.W. Tucker, appears in Rapoport [5]; the latter author is the two-time winner of the Prisoners’ Dilemma computer tournament organized by Axelrod.
Figure 1: The Prisoners’ Dilemma
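The dominance argument for the Prisoners’ Dilemma can be checked mechanically. The following is a minimal sketch in Python, using the payoffs from the story (4 if neither talks, 5 and 0 if exactly one squeals, 1 if both squeal) and reading C as staying silent and D as squealing. The names `PAYOFF`, `ACTIONS`, and `strictly_dominates` are illustrative choices, not part of the lecture.

```python
# Payoffs from the story: PAYOFF[(a1, a2)] = (payoff to player 1, payoff to player 2).
PAYOFF = {
    ("C", "C"): (4, 4),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
ACTIONS = ("C", "D")


def strictly_dominates(player, a, b):
    """Return True if action `a` gives `player` (0 or 1) a strictly higher
    payoff than action `b` against every action of the opponent."""
    for other in ACTIONS:
        profile_a = (a, other) if player == 0 else (other, a)
        profile_b = (b, other) if player == 0 else (other, b)
        if PAYOFF[profile_a][player] <= PAYOFF[profile_b][player]:
            return False
    return True


for player in (0, 1):
    print(f"player {player + 1}: D strictly dominates C?",
          strictly_dominates(player, "D", "C"))
```

Under these payoffs the check succeeds for both players, which is the sense in which (D, D) is the only rational outcome even though (C, C) would leave both players better off.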
Another popular two-player game is called the Battle of the Sexes. A man and a woman would like to spend an evening out together; however, the man prefers to go to a football game (strategy F ), while the woman prefers to go to the ballet (strategy B). Both the man and the woman prefer to be together, even at the event that is not to their liking, rather than go out alone. The payoffs of this coordination game are shown in Figure 2; the woman is player 1 and the man is player 2. In this game, there are two coordination equilibria, one which is preferred by the woman, and another which is preferred by the man.
Figure 2: Battle of the Sexes
The stag hunt game (see Figure 3) is a prototypical social contract. Rousseau tells an early version of the story in A Discourse on Inequality:
If it was a matter of hunting a deer, everyone realized that he must remain well faithful to his post; but if a hare happened to pass within reach of one of them, we cannot doubt that he would have gone off in pursuit of it without scruple...
This game has two equilibria, one of which Pareto dominates the other: i.e., all players are simultaneously better off at one equilibrium than the other. But action H risk-dominates action D, since action H is safer for each player, given his/her uncertainty about the other player’s action.
A Nash equilibrium is a strategy profile from which none of the players has any incentive to deviate. In particular, no player can achieve strictly greater payoffs by choosing any strategy other than the one prescribed by the profile, given that all other players choose their prescribed strategies. In this sense, a Nash equilibrium specifies optimal strategic choices for all players.
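The definition translates directly into a brute-force test: a pure strategy profile is a Nash equilibrium if no unilateral deviation strictly raises the deviator's payoff. The sketch below, in Python, applies this test to the Prisoners’ Dilemma payoffs given earlier; the function name `pure_nash_equilibria` and the dictionary layout are assumptions made for illustration, and the search covers pure strategies only.

```python
from itertools import product


def pure_nash_equilibria(actions, payoff):
    """Enumerate pure strategy profiles from which no player can gain
    strictly by deviating unilaterally.

    actions: list of per-player action lists.
    payoff:  dict mapping each action profile (a tuple) to a tuple of payoffs.
    """
    equilibria = []
    for profile in product(*actions):
        stable = True
        for i, own_actions in enumerate(actions):
            for deviation in own_actions:
                deviated = profile[:i] + (deviation,) + profile[i + 1:]
                if payoff[deviated][i] > payoff[profile][i]:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria


# Prisoners' Dilemma payoffs as stated in the story.
PRISONERS_DILEMMA = {
    ("C", "C"): (4, 4),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

print(pure_nash_equilibria([["C", "D"], ["C", "D"]], PRISONERS_DILEMMA))
# [('D', 'D')]
```

The comparison is strict, matching the phrase "strictly greater payoffs" in the definition: a profile with a payoff-neutral deviation still counts as an equilibrium.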
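Let us examine the Nash equilibria in the aforementioned examples. In the Prisoners’ Dilemma, the unique Nash equilibrium is (D, D). In the Battle of the Sexes, there are two pure strategy Nash equilibria, namely the coordination outcomes (B, B) and (F, F). In the stag hunt, there are likewise two pure strategy Nash equilibria, (D, D) and (H, H), the first of which Pareto dominates the second.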
Matching Pennies is another well-known example of a two-player, zero-sum game. In this game, each of the players, the matcher and the mismatcher,^3 flips a coin, and the payoffs are determined as follows. If the coins come up matching (i.e., both heads or both tails), then the matcher wins, so the mismatcher pays the matcher $1. If the coins do not match (i.e., one head and one tail), then the mismatcher wins, so the matcher pays the mismatcher $1. In Figure 5, player 1 is the matcher and player 2 is the mismatcher. This game is called zero-sum because the payoffs in each cell of the matrix sum to zero.
Figure 5: Matching Pennies
(^3) The mismatcher is often affectionately referred to as Miss Matcher.
In the game of Matching Pennies, there is no pure strategy Nash equilibrium. If player 1 plays H, then the best response of player 2 is T; but if player 2 plays T, the best response of player 1 is not H, but T. Moreover, if player 1 plays T, then the best response of player 2 is H; but if player 2 plays H, then the best response of player 1 is not T, but H. This game, however, does have a mixed strategy Nash equilibrium. A mixed strategy is a randomization over a set of pure strategies. In particular, the probabilistic strategy profile in which both players choose H with probability 1/2 and T with probability 1/2 is the unique (mixed strategy) Nash equilibrium in the game of Matching Pennies.
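The equilibrium claim for Matching Pennies rests on indifference: when the opponent plays H with probability 1/2, both H and T earn the same expected payoff, so no deviation helps. The following Python sketch checks this with exact fractions. The payoffs are written from the matcher's side (+1 on a match, -1 otherwise), with the mismatcher receiving the negative; the helper names are hypothetical.

```python
from fractions import Fraction

ACTIONS = ("H", "T")


def matcher_payoff(a_matcher, a_mismatcher):
    """+1 to the matcher if the coins match, -1 otherwise (zero-sum)."""
    return 1 if a_matcher == a_mismatcher else -1


def expected_payoffs_vs(prob_heads, opponent_is_matcher):
    """Expected payoff of each pure action against an opponent who plays H
    with probability `prob_heads`. If `opponent_is_matcher` is True, the
    payoffs returned are the mismatcher's; otherwise they are the matcher's."""
    mix = {"H": prob_heads, "T": 1 - prob_heads}
    result = {}
    for own in ACTIONS:
        if opponent_is_matcher:
            # We are the mismatcher: our payoff is minus the matcher's.
            result[own] = sum(-matcher_payoff(opp, own) * mix[opp] for opp in ACTIONS)
        else:
            result[own] = sum(matcher_payoff(own, opp) * mix[opp] for opp in ACTIONS)
    return result


half = Fraction(1, 2)
print(expected_payoffs_vs(half, opponent_is_matcher=False))  # matcher's view
print(expected_payoffs_vs(half, opponent_is_matcher=True))   # mismatcher's view
print(expected_payoffs_vs(Fraction(3, 4), opponent_is_matcher=False))
```

Against a 1/2 mix, every action of either player has expected payoff 0. Against any other mix (3/4 in the last line), one action becomes strictly better, so the opponent's mix could not be part of an equilibrium.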
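A matrix game is a 3-tuple $\Gamma = (N, (A_i, R_i)_{1 \le i \le n})$, where $N = \{1, \ldots, n\}$ is a set of players; $A_i$ is the set of strategies (actions) available to player $i$; and $R_i : A \to \mathbb{R}$ is the payoff function of player $i$, with $A = \prod_i A_i$ the set of joint actions.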
Matrix games are also sometimes called games in strategic, or normal, form.
In this formalism, the Prisoners’ Dilemma consists of a set of players $N = \{1, 2\}$, with strategy (action) sets $A_1 = A_2 = \{C, D\}$, and payoffs as follows:
$$
\begin{aligned}
R_1(C, C) &= R_2(C, C) = 4 \\
R_1(C, D) &= R_2(D, C) = 0 \\
R_1(D, D) &= R_2(D, D) = 1 \\
R_1(D, C) &= R_2(C, D) = 5
\end{aligned}
$$
The mixed strategy set for player $i$ is the set of probability distributions over the action set $A_i$, which can be described by the simplex operator $\Delta$:
$$
\Delta(A_i) = \left\{ q_i : A_i \to [0, 1] \;\middle|\; \sum_{a_i \in A_i} q_i(a_i) = 1 \right\}
$$
For convenience, let $Q_i \equiv \Delta(A_i)$. The usual notational conventions extend to mixed strategies: e.g., $Q = \prod_i Q_i$ and $q = (q_i, q_{-i}) \in Q$. In the context of mixed strategies, the expected payoffs to player $i$ from strategy profile $q$ are:
$$
E_{a \sim q}[R_i(a)] = \sum_{a \in A} q(a) R_i(a)
$$
where
$$
q(a) = \prod_{j=1}^{n} q_j(a_j)
$$
In what follows, we write $R_i(q)$ for this expected payoff.
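The expected payoff formula can be evaluated directly by summing over joint actions and weighting each by the product of the players' independent probabilities. Below is a minimal Python sketch; the function name `expected_payoff` and the representation of a mixed strategy as a dictionary from actions to probabilities are assumptions made for illustration.

```python
from itertools import product
from math import prod


def expected_payoff(i, mixed_profile, payoff):
    """E_{a~q}[R_i(a)] = sum over joint actions a of q(a) * R_i(a),
    where q(a) is the product of the players' independent probabilities.

    mixed_profile: list of dicts, one per player, mapping action -> probability.
    payoff:        dict mapping joint action tuples to payoff tuples.
    """
    total = 0.0
    action_sets = [list(q_j) for q_j in mixed_profile]
    for a in product(*action_sets):
        q_a = prod(q_j[a_j] for q_j, a_j in zip(mixed_profile, a))
        total += q_a * payoff[a][i]
    return total


# Prisoners' Dilemma payoffs from the formal description.
PD = {
    ("C", "C"): (4, 4),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
uniform = {"C": 0.5, "D": 0.5}
print(expected_payoff(0, [uniform, uniform], PD))  # 2.5
```

With both players mixing uniformly, each joint action has probability 1/4, so player 1 expects (4 + 0 + 5 + 1) / 4 = 2.5. A pure strategy is the special case of a distribution that puts probability 1 on a single action.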
Brouwer’s Fixed Point Theorem. Let $X \subset \mathbb{R}^n$ be nonempty, compact, and convex. If $f : X \to X$ is a continuous function, then $f$ has a fixed point: i.e., there exists $x^* \in X$ s.t. $x^* = f(x^*)$.
2.2 Proof of Existence
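The proof of existence of Nash equilibrium is a direct application of Kakutani’s fixed point theorem, which generalizes Brouwer’s theorem from functions to correspondences. Here the relevant correspondence is the best-response correspondence $br : Q \rightrightarrows Q$, where $br(q) = \prod_i br_i(q_{-i})$ and $br_i(q_{-i}) = \arg\max_{q_i \in Q_i} R_i(q_i, q_{-i})$; a mixed strategy profile $q$ is a Nash equilibrium if and only if $q \in br(q)$. It suffices to show that the set of mixed strategies $Q$ is nonempty, compact, and convex, and that $br$ is nonempty- and convex-valued, with a closed graph.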
Lemma The set of mixed strategies Q is nonempty, compact, and convex.
Proof Recall that $Q = \prod_i Q_i$, where $Q_i$ is the set of probability distributions over player $i$’s action set $A_i$. The set $Q$ is nonempty (assuming $A_i$ is nonempty, for all players $i$).
Consider a sequence $\{(q_1^m, \ldots, q_n^m)\}$ of mixed strategy profiles that converges to $(q_1^*, \ldots, q_n^*)$. This limit point is indeed a mixed strategy profile: i.e., for each player $i$, $q_i^*(a_i) \ge 0$ for all $a_i \in A_i$ and $\sum_{a_i \in A_i} q_i^*(a_i) = 1$. The former claim follows from the fact that the limit of a sequence of non-negative points is itself non-negative. The latter claim follows from the fact that the sum of the limits equals the limit of the sum. Thus, $Q$ is closed. Moreover, $Q$ is bounded in each component by 0 and 1. Therefore, $Q$ is compact, being a closed and bounded subset of a Euclidean space.
The set of mixed strategies $Q_i$ for each player $i$ is convex: i.e., for all $q_i, p_i \in Q_i$, for all $\lambda \in [0, 1]$, the convex combination $\lambda q_i + (1 - \lambda) p_i \in Q_i$. Thus, given two elements $(q_1, \ldots, q_n), (p_1, \ldots, p_n) \in Q$, the convex combination $\lambda (q_1, \ldots, q_n) + (1 - \lambda)(p_1, \ldots, p_n) = (\lambda q_1 + (1 - \lambda) p_1, \ldots, \lambda q_n + (1 - \lambda) p_n) \in Q$, for all $\lambda \in [0, 1]$.
Thus, the set of mixed strategies Q is nonempty, compact, and convex.
Lemma The best-response correspondence is nonempty.
Proof By Weierstrass’ theorem, any real-valued continuous function on a compact set attains a maximum. Recall that the set $Q_i$ is compact. Since $R_i$ is linear in player $i$’s own mixed strategy $q_i$, it is continuous in $q_i$. Thus, $br_i(q_{-i})$ is nonempty for every $q_{-i}$ and for all players $i$, from which it follows that $br$ is nonempty-valued.
Lemma The best-response correspondence is convex-valued.
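Proof If $q_i^*, p_i^* \in br_i(q_{-i})$ are best replies of player $i$ to $q_{-i}$, then for any $\lambda \in [0, 1]$, $R_i(q_i^*, q_{-i}) = R_i(p_i^*, q_{-i}) = \lambda R_i(q_i^*, q_{-i}) + (1 - \lambda) R_i(p_i^*, q_{-i})$. Now, by the linearity of $R_i$ in player $i$’s own strategy, $\lambda R_i(q_i^*, q_{-i}) + (1 - \lambda) R_i(p_i^*, q_{-i}) = R_i(\lambda q_i^* + (1 - \lambda) p_i^*, q_{-i})$. Thus, the convex combination attains the same maximal payoff, so $\lambda q_i^* + (1 - \lambda) p_i^* \in br_i(q_{-i})$. Since $q_{-i}$ was arbitrary, $br_i$ is convex-valued. Since $i$ was arbitrary, $br$ is convex-valued.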
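The two preceding lemmas can be seen concretely in a small game. Because expected payoff is linear in a player's own mixed strategy, some pure action always attains the maximum (nonemptiness), and the full best-response set is the set of all mixtures over the maximizing pure actions (convexity). The Python sketch below computes those maximizing pure actions for the matcher in Matching Pennies; the name `pure_best_replies` and the tolerance `tol` are illustrative assumptions.

```python
def pure_best_replies(own_payoff, opponent_mix, tol=1e-12):
    """Pure actions that maximize expected payoff against `opponent_mix`.

    own_payoff:   dict mapping (own action, opponent action) -> own payoff.
    opponent_mix: dict mapping opponent action -> probability.
    The best-response set is the set of all mixtures over the returned actions.
    """
    own_actions = sorted({own for own, _ in own_payoff})
    value = {
        own: sum(p * own_payoff[(own, opp)] for opp, p in opponent_mix.items())
        for own in own_actions
    }
    best = max(value.values())
    return [own for own in own_actions if value[own] >= best - tol]


# Matcher's payoffs in Matching Pennies, keyed by (matcher's coin, mismatcher's coin).
MATCHER = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}

for prob_heads in (0.25, 0.5, 0.75):
    mix = {"H": prob_heads, "T": 1 - prob_heads}
    print(prob_heads, pure_best_replies(MATCHER, mix))
```

When the opponent plays H with probability 0.5, both H and T are returned, so every mixture of the two is a best reply: the best-response set is a whole line segment rather than a single point. This is why the existence proof works with a correspondence instead of a function.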
Lemma The graph of the best-response correspondence is closed.
Proof We must show $p \in br(q)$, given sequences $q^m, p^m \in Q$ s.t. $q^m \to q$ and $p^m \to p$, with $p^m \in br(q^m)$ for all $m$. Suppose not: i.e., suppose there exists a player $i$ s.t. $p_i \notin br_i(q_{-i})$. It follows that there exists $q_i \in Q_i$ s.t. $R_i(q_i, q_{-i}) > R_i(p_i, q_{-i})$. Now let $\delta \equiv R_i(q_i, q_{-i}) - R_i(p_i, q_{-i}) > 0$. Since $R_i$ is continuous, for all $\epsilon > 0$, there exists $M_\epsilon \in \mathbb{N}$ s.t. for all $m \ge M_\epsilon$, $|R_i(p_i^m, q_{-i}^m) - R_i(p_i, q_{-i})| < \epsilon$ and $|R_i(q_i, q_{-i}^m) - R_i(q_i, q_{-i})| < \epsilon$. Now
$$
\begin{aligned}
R_i(q_i, q_{-i}^m) &> R_i(q_i, q_{-i}) - \epsilon \\
&= R_i(p_i, q_{-i}) + R_i(q_i, q_{-i}) - R_i(p_i, q_{-i}) - \epsilon \\
&= R_i(p_i, q_{-i}) + \delta - \epsilon \\
&> R_i(p_i^m, q_{-i}^m) + \delta - 2\epsilon
\end{aligned}
$$
If $\epsilon = \delta/2$, then $R_i(q_i, q_{-i}^m) > R_i(p_i^m, q_{-i}^m)$, for all $m \ge M_{\delta/2}$. But then $p_i^m \notin br_i(q_{-i}^m)$ for all $m \ge M_{\delta/2}$. Contradiction. Therefore, the graph of $br$ is closed.
Exercise Compute the best-response correspondences for the game depicted in Figure 6, a version of Hawks and Doves. Plot these correspondences, and compute all Nash equilibria.
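[Figure 6 payoff matrix: player 1 chooses a row and player 2 a column, each labeled H and D. The cell entries are incomplete; the recoverable fragments are 1, 2, 0, and −1.]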
Figure 6: Hawks and Doves
In this lecture, we defined Nash equilibrium and reproved Nash’s theorem guaranteeing its existence in all (finite) matrix games.
Although Nash equilibrium is the generally accepted solution concept in the deductive analysis of matrix games, the Nash equilibria in our examples are somewhat peculiar. In the Prisoners’ Dilemma, the Nash equilibrium payoffs are sub-optimal. In the game of Matching Pennies, there is no pure strategy Nash equilibrium; the unique Nash equilibrium is probabilistic. Finally, in the coordination and miscoordination games, the Nash equilibrium is not unique.
In future lectures, alternative notions of equilibria are discussed.