Nash Equilibrium
A Nash equilibrium is a set of strategies, one for each player, with the property that no player benefits from unilaterally changing their own strategy. Intuitively, this means that if any given player were told the strategies of all their opponents, they would still choose to retain their original strategy. For example, in the game of trying to guess 2/3 of the average guess, the unique Nash equilibrium is (counterintuitively) for all players to choose 0.
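To build intuition for that last claim, the following Python sketch (a simplification introduced here, assuming every player starts from the same common guess) repeatedly applies each player's best response; the common guess collapses toward 0.

```python
# Sanity check on the "guess 2/3 of the average" game: if every player
# currently makes the same guess, the best response is 2/3 of that guess.
# Iterating this reasoning drives the common guess toward 0, the unique
# Nash equilibrium.

def best_response(common_guess: float) -> float:
    """Best reply when every opponent guesses `common_guess`."""
    return (2 / 3) * common_guess

guess = 50.0  # arbitrary starting point
for _ in range(25):
    guess = best_response(guess)

print(round(guess, 4))  # ~0.002, and it keeps shrinking toward 0
```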
Nash equilibria are useful in analyzing the result of competitive scenarios, especially as applied to conflict such as war. For similar reasons, they are often used in analyzing economic factors, such as markets, currencies, and auctions. They are also used to enforce cooperation through self-interest, by contriving the relative rewards of actions in such a way that each actor would independently choose the desired action; a famous example is the prisoner's dilemma problem.
Formal Definition
Let \(S_i\) denote the set of strategies for the \(i\)th player, and \(S = S_1 \times S_2 \times \cdots \times S_n\) denote the set of strategy profiles. This means that the elements of \(S\) are all possible combinations of individual strategies. Let \(f_i(s)\) denote the payoff to player \(i\) when evaluated at strategy profile \(s \in S\); note that the payoff to an individual player depends on the strategies of the other players as well.
An individual mixed strategy is a probability distribution on the set of available strategies. For example, selecting one of "rock", "paper", or "scissors" uniformly at random is an example of a mixed strategy. There can also be a choice of weighting so that strategies are picked with different probabilities. A pure strategy is one that does not involve randomization at all, instead choosing a particular strategy all of the time. Note that, importantly:
A pure strategy is simply a special case of a mixed strategy, in which one strategy is chosen 100% of the time.
This means that the same methods used to find mixed-strategy equilibria can also be used to find pure-strategy equilibria.
A Nash equilibrium is a strategy profile \(s=(s_1, s_2, \ldots, s_n)\) with the property that
\[f_i(s) \geq f_i(s_1, s_2, \ldots, s_i', \ldots, s_n)\]
for all \(i\) and for every alternative strategy \(s_i' \in S_i\) available to player \(i\).
If this inequality is strict, i.e.
\[f_i(s) > f_i(s_1, s_2, \ldots, s_i', \ldots, s_n)\]
for all \(i\) and all \(s_i' \neq s_i\), the profile \(s\) is called a strict Nash equilibrium. Otherwise, \(s\) is called a weak Nash equilibrium.
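For finite games, the defining inequality can be checked mechanically by enumerating each player's possible deviations. The Python sketch below (the helper names and the rock-paper-scissors payoffs are our own illustration) tests a pure strategy profile against the inequality; for rock-paper-scissors no pure profile passes, so its equilibrium must involve mixing (it is in fact the uniform mixed strategy mentioned earlier).

```python
from itertools import product

# Check the defining inequality f_i(s) >= f_i(s_1, ..., s_i', ..., s_n):
# a profile is a (pure) Nash equilibrium if no player gains by deviating alone.

def is_pure_nash(profile, strategy_sets, payoff):
    """profile: tuple of strategies, one per player.
    strategy_sets: strategy_sets[i] is the set S_i for player i.
    payoff(i, s): payoff to player i at the profile s."""
    for i in range(len(profile)):
        for alt in strategy_sets[i]:
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if payoff(i, deviated) > payoff(i, profile):
                return False  # player i profits from switching to `alt`
    return True

# Two-player rock-paper-scissors: win = +1, lose = -1, tie = 0.
beats = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def rps_payoff(i, s):
    mine, theirs = s[i], s[1 - i]
    if mine == theirs:
        return 0
    return 1 if beats[mine] == theirs else -1

strategies = [list(beats), list(beats)]
print(any(is_pure_nash(s, strategies, rps_payoff)
          for s in product(*strategies)))  # False: no pure equilibrium exists
```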
Nash's existence theorem guarantees that as long as \(S_i\) is finite for all \(i\) and there are a finite number of players, at least one Nash equilibrium exists (possibly involving mixed strategies).
Amusingly, this result was dismissed by John von Neumann with the response "That's trivial, you know. That's just a fixed point theorem", a year before Nash's publication and more than 40 years before Nash won the Nobel Memorial Prize in Economics for (in part) this result. Indeed, Nash's existence theorem is a consequence of the Brouwer fixed point theorem (or of the more general Kakutani fixed point theorem), a foundational result in topology.
Examples
The simplest example of Nash equilibrium is the coordination game, in which both players benefit from coordinating but may also hold individual preferences. For instance, suppose two friends wish to plan an evening around either partying or watching a movie. Both friends prefer to engage in the same activity, but one prefers partying to movies by a factor of 2, and the other prefers movies to partying by the same ratio. This can be modeled by the following payoff matrix:
|       | Party | Movie |
| Party | 2,1   | 0,0   |
| Movie | 0,0   | 1,2   |
where the payoff vector is listed under the appropriate strategy profile (the first player's strategies are listed on the left). In this case, both {Party, Party} and {Movie, Movie} are Nash equilibria, as neither side would choose to deviate when informed of the other's choice.
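These equilibria can be verified mechanically by scanning every cell of the bimatrix and checking that neither player gains from a unilateral deviation; a minimal Python sketch (the function name is ours) applied to the party/movie game:

```python
# Brute-force search for pure Nash equilibria of a two-player game given as a
# bimatrix: bimatrix[r][c] = (row player's payoff, column player's payoff).

def pure_nash_equilibria(bimatrix):
    rows, cols = len(bimatrix), len(bimatrix[0])
    found = []
    for r in range(rows):
        for c in range(cols):
            u_row, u_col = bimatrix[r][c]
            # Row player must not prefer another row (column held fixed),
            # column player must not prefer another column (row held fixed).
            row_ok = all(bimatrix[r2][c][0] <= u_row for r2 in range(rows))
            col_ok = all(bimatrix[r][c2][1] <= u_col for c2 in range(cols))
            if row_ok and col_ok:
                found.append((r, c))
    return found

# Coordination game above: index 0 = Party, index 1 = Movie.
game = [[(2, 1), (0, 0)],
        [(0, 0), (1, 2)]]
print(pure_nash_equilibria(game))  # [(0, 0), (1, 1)]: {Party, Party} and {Movie, Movie}
```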
The most famous example of a Nash equilibrium, however, is the prisoner's dilemma, in which each of two prisoners has the choice of "cooperating" with the other prisoner by keeping quiet, or "defecting" by confessing. If both prisoners cooperate, they will face little jail time, but if exactly one of them defects, the defector will immediately go free and the cooperator will face a long sentence. The catch is that if both prisoners choose to defect, they will both face a moderate amount of jail time. This can be modeled by a payoff matrix of the following form (the exact numbers are illustrative):

|           | Cooperate | Defect  |
| Cooperate | -1,-1     | -10,0   |
| Defect    | 0,-10     | -5,-5   |

where lower jail sentences correspond to higher payoffs. In this scenario, there is exactly one Nash equilibrium: both players choose to defect, since in any other profile a cooperating prisoner would do better by defecting. This is despite the fact that both prisoners would improve their situation by both cooperating, meaning that the Nash equilibrium is globally inferior to the "both cooperate" outcome.
The practical application of this is clear: by contriving the values appropriately, authorities can make it optimal for a suspect to confess rather than cooperate with their accessories.
Finding Nash equilibria
Generally, finding a "pure" Nash equilibrium (in which no randomization occurs) is fairly easy, as verifying one only requires comparing a small number of potential payoffs. For instance, consider a game with the following payoff matrix:
|   | 1  | 2  | 3  |
| 1 | -1 | -2 | -1 |
| 2 |  2 |  2 |  1 |
| 3 | -1 | -1 |  0 |
In this game, each player has three strategies to choose from, and each cell lists the payoff to the first player (the second player's payoff is its negative, making this a zero-sum game). The first player's goal is to maximize this value, while the second player's goal is to minimize it.
A Nash equilibrium of this game occurs when neither player has any incentive to change their strategy, even if they know their opponent's strategy. This means that
For a cell to represent a (pure) Nash equilibrium, it must be the minimum of its row and the maximum of its column
as this is the only way neither player would choose to change their strategy. In the above game, the unique pure equilibrium is player 1 choosing strategy 2 and player 2 choosing strategy 3, as neither player wishes to deviate from the resulting payoff of 1.
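The row-minimum/column-maximum test translates directly into a few lines of code; the Python sketch below (function name ours) scans the matrix above and reports its saddle points as (row, column, value), indexed from 0.

```python
# Find pure Nash equilibria (saddle points) of a zero-sum game whose matrix
# stores the row player's payoff: a cell qualifies exactly when it is the
# minimum of its row and the maximum of its column.

def saddle_points(matrix):
    points = []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value == min(row) and value == max(m[c] for m in matrix):
                points.append((r, c, value))
    return points

matrix = [[-1, -2, -1],
          [ 2,  2,  1],
          [-1, -1,  0]]
print(saddle_points(matrix))  # [(1, 2, 1)]: player 1 picks strategy 2, player 2 picks strategy 3
```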
Of course, a "pure" Nash equilibrium is a special case of a mixed strategy (where one strategy is chosen with probability 1), so the more general approach below is universally valid.
In the case of mixed strategies, the situation becomes slightly more complex: the key idea is to choose probabilities that leave the opponent indifferent between the strategies they randomize over. For instance, consider the following game: each player can show either one or two fingers, and
- If an odd number of fingers are showing, the first player scores the number of shown fingers.
- If an even number of fingers are showing, the second player scores the number of shown fingers.
This corresponds to the payoff matrix
|   | 1   | 2   |
| 1 | 0,2 | 3,0 |
| 2 | 3,0 | 0,4 |
It is immediately evident that this game has no pure equilibrium (as the losing player would always choose to switch their move), so an analysis of mixed strategies is necessary. Surprisingly, the Nash equilibrium in this game favors the first player, despite the apparent symmetry of the problem.
To find the (or a) Nash equilibrium of the game, assume that it consists of the first player choosing 1 with probability \(p\) (and 2 with probability \(1-p\)), and the second player choosing 1 with probability \(q\) (and 2 with probability \(1-q\)). Note that Nash's theorem guarantees that at least one Nash equilibrium exists, so this step is valid. Now, player 1's expected net payoff (his score minus player 2's score) is
\[(0-2) \cdot p \cdot q + (3-0) \cdot p \cdot (1-q) + (3-0) \cdot (1-p) \cdot q + (0-4) \cdot (1-p) \cdot (1-q) = -12pq+7p+7q-4\]
Since this is a Nash equilibrium, player 1 would not choose to adjust \(p\) knowing \(q\). But the payoff can be written as \(p(7-12q)+7q-4\), so if
- \(q>\frac{7}{12}\), player 1 would wish to minimize \(p\) (set \(p=0\))
- \(q<\frac{7}{12}\), player 1 would wish to maximize \(p\) (set \(p=1\))
This implies that at the Nash equilibrium point, \(q=\frac{7}{12}\): since the game has no pure equilibrium, player 1 must randomize, and randomizing is only optimal when he is indifferent, i.e. when \(q=\frac{7}{12}\).
Analogously, player 2's expected net payoff (her score minus player 1's score) is
\[(2-0) \cdot p \cdot q + (0-3) \cdot p \cdot (1-q) + (0-3) \cdot (1-p) \cdot q + (4-0) \cdot (1-p) \cdot (1-q) = 12pq-7p-7q+4\]
which is precisely as expected, since the two net payoffs must sum to zero (the game is zero-sum when measured by net score). Thus, by analogous reasoning, \(p=\frac{7}{12}\) at the Nash equilibrium point.
Thus, at the Nash equilibrium point, player 1's expected net payoff is positive:
\[-12 \cdot \frac{7}{12} \cdot \frac{7}{12} + 7 \cdot \frac{7}{12} + 7 \cdot \frac{7}{12} - 4 = \frac{49}{12} - 4 = \frac{1}{12}.\]
This means that the game is inherently unfair; by choosing 1 with probability \(\frac{7}{12}\), player 1 guarantees an expected payoff of at least \(\frac{1}{12}\) (and player 2 chooses the same strategy in order to hold player 1's expected payoff down to \(\frac{1}{12}\)).
Generally speaking, both players adopted the same approach: compute the opponent's expected payoff as a function of the probability distributions, then adjust their own probabilities to "cancel out" the opponent's influence on it. Since the expected payoff is linear in each player's probabilities, this process leads to a system of linear indifference equations; Nash's theorem guarantees that an equilibrium, and hence a solution within the valid probability range, exists.
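For a \(2 \times 2\) zero-sum game such as the finger game, that cancellation amounts to solving one linear indifference equation per player. The Python check below (using exact fractions; the helper names are ours) reproduces \(p = q = \frac{7}{12}\) and the game value \(\frac{1}{12}\).

```python
from fractions import Fraction

# Net payoff to player 1 (his score minus player 2's score) in the finger game;
# rows = player 1 shows 1 or 2 fingers, columns = player 2 shows 1 or 2 fingers.
A = [[Fraction(-2), Fraction(3)],
     [Fraction(3), Fraction(-4)]]

denom = A[0][0] + A[1][1] - A[0][1] - A[1][0]

# Player 2 picks q so that player 1 is indifferent between his two rows;
# player 1 picks p so that player 2 is indifferent between her two columns.
q = (A[1][1] - A[0][1]) / denom
p = (A[1][1] - A[1][0]) / denom

value = (p * q * A[0][0] + p * (1 - q) * A[0][1]
         + (1 - p) * q * A[1][0] + (1 - p) * (1 - q) * A[1][1])

print(p, q, value)  # 7/12 7/12 1/12
```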
As an exercise: Alice and Bob are playing 900 games of Rock-Paper-Scissors, but Alice is not allowed to choose scissors in any of the games. If both players choose their strategies optimally (i.e. Nash equilibrium is reached), what is the expected number of games Bob will win?
"Optimally" means that both players want to maximize the difference between the number of games they win and the number of games the opponent wins.
Practical Limitations
Nash equilibrium requires several conditions to hold in order to apply:
- All players are interested only in maximizing their own expected payoff, and will act accordingly.
- All players execute their strategies perfectly.
- All players are infinitely intelligent, or at least intelligent enough to determine the solution.
- Every player knows (or can deduce) the planned equilibrium strategy of all other players, and knows that changing their own strategy will not cause the other players to change theirs.
- All of this is common knowledge, meaning that every player knows every other player satisfies the above four conditions.
In practice, it is rare for all these conditions to be directly satisfied. For instance,
- A prisoner in the prisoner's dilemma may face other considerations; for example, a prisoner expecting retribution for defecting would be facing far less of a dilemma.
- A player may accidentally (or intentionally) execute a strategy imperfectly, which could potentially lead to a loss, but could also potentially lead to a win due to invalidation of the common knowledge criterion.
- A player may not be sufficiently intelligent to work out the solution; for instance, a young child playing tic-tac-toe would not necessarily be able to deduce the optimal play.
- Players may believe, rightly or wrongly, that their fellow players will not be perfectly rational. This is a major concern, for example, in arms races -- particularly, as in recent times, nuclear ones.
For this reason, most practical situations are not modeled particularly well by Nash equilibrium. The concept is mostly useful for explaining trends in economics and evolutionary biology, in which strategies that do not maximize utility (e.g. money in economics, or survival in biology) are rejected through natural competition. Indeed, research in these fields tends to support the theory that the system tends to its Nash equilibrium.