Nash Equilibrium: Existence & Computation

IIIS, Tsinghua University, zoy.blood@gmail.com, March 28, 2016

Overview
1. Existence: Fixed-Point Theorems (Kakutani & Brouwer); Proof of Existence
2. Computation: A Brute Force Solution; Lemke-Howson Algorithm

History
John Forbes Nash, Jr. (source: Wikipedia, https://en.wikipedia.org/)
Who proved it? When and how?
Via the Kakutani fixed-point theorem [Nash, 1950].
Via the Brouwer fixed-point theorem [Nash, 1951].

Nash Equilibrium
Nash equilibrium: every player plays a mixed strategy that is a best response to the other players' strategies.
Mixed strategy: a probability distribution over the pure strategies; the player picks a pure strategy at random according to this distribution.
Best response: a strategy $\pi_i$ is a best response if it maximizes player $i$'s utility with the other players' strategies $\pi_{-i}$ held fixed.
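As a concrete reading of the definition, the sketch below tests whether a pair of mixed strategies is a Nash equilibrium of a two-player game. It assumes payoff matrices given as nested lists, and it uses the fact that a mixed strategy is a best response exactly when every pure strategy in its support is one. The function names and the matching-pennies payoffs are illustrative choices, not taken from the slides.

```python
def pure_utilities(payoff, opponent_mix):
    """Expected utility of each own pure strategy against the opponent's mix."""
    return [sum(u * q for u, q in zip(row, opponent_mix)) for row in payoff]


def is_best_response(payoff, own_mix, opponent_mix, tol=1e-9):
    """True if every pure strategy played with positive probability is optimal."""
    utils = pure_utilities(payoff, opponent_mix)
    best = max(utils)
    return all(p <= tol or u >= best - tol for p, u in zip(own_mix, utils))


def is_nash_equilibrium(R, C, x, y, tol=1e-9):
    """R[j][k], C[j][k]: payoffs of the row / column player at actions (j, k)."""
    C_transposed = [list(col) for col in zip(*C)]  # column player's own view
    return (is_best_response(R, x, y, tol)
            and is_best_response(C_transposed, y, x, tol))


# Matching pennies (hypothetical example): only the uniform profile is stable.
R = [[1, -1], [-1, 1]]
C = [[-1, 1], [1, -1]]
print(is_nash_equilibrium(R, C, [0.5, 0.5], [0.5, 0.5]))  # True
print(is_nash_equilibrium(R, C, [1.0, 0.0], [1.0, 0.0]))  # False
```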

Brouwer Fixed-Point Theorem
Theorem (Brouwer fixed-point theorem). For a non-empty, convex and compact domain $D$, every continuous function $f : D \to D$ has a fixed point, i.e., $\exists p \in D$ s.t. $f(p) = p$.
Convex: every line segment is contained in $D$ whenever its two endpoints are in $D$.
Compact: closed and bounded (in Euclidean space).

Examples for Brouwer
One-dimensional case.
Two-dimensional case.
Failure examples.
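For the one-dimensional case, a fixed point of a continuous $f : [0, 1] \to [0, 1]$ can be located by bisection on $g(x) = f(x) - x$, since $g(0) \ge 0 \ge g(1)$. The sketch below is only an illustration of that argument; the function name, tolerance, and the choice of $\cos$ as a test function are assumptions, and the method does not extend to higher dimensions.

```python
import math


def fixed_point_1d(f, lo=0.0, hi=1.0, tol=1e-12):
    """Locate p in [lo, hi] with f(p) = p, for continuous f mapping [lo, hi] into itself."""
    # Invariant: g(lo) >= 0 >= g(hi) for g(x) = f(x) - x. It holds initially
    # because f(lo) >= lo and f(hi) <= hi.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) - mid >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# cos maps [0, 1] into [cos 1, 1], which is a subset of [0, 1].
p = fixed_point_1d(math.cos)
print(abs(math.cos(p) - p) < 1e-9)  # True
```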

Kakutani Fixed-Point Theorem
Theorem (Kakutani fixed-point theorem). Let $D$ be a non-empty, compact and convex subset of $\mathbb{R}^n$. Any set-valued function $\varphi : D \to 2^D$ with (1) a closed graph and (2) a non-empty and convex function value for each $x \in D$ has a fixed point, i.e., $\exists p \in D$ s.t. $p \in \varphi(p)$.
(1) Closed graph: the set $\{ (x, y) \mid y \in \varphi(x) \}$ is closed.
(2) Non-empty and convex function value: $\varphi(x)$ is non-empty and convex.

Proof Idea
Consider a finite game $G$ with $n$ players and $m$ actions for each player.
Mixed action space for player $i$: $\Delta_i = \{ (p_1, p_2, \ldots, p_m) : p_j \ge 0, \ \sum_j p_j = 1 \}$.
Mixed action profile space: $\Pi = \Delta_1 \times \Delta_2 \times \cdots \times \Delta_n$.
The best response is then a set-valued function $\varphi^{br} : \Pi \to 2^{\Pi}$, where $\varphi^{br}_i(\pi) = \text{best-response}_i(\pi_{-i})$.

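Proof Idea (cont'd)
$\varphi^{br}(\pi) = \langle \varphi^{br}_1(\pi), \ldots, \varphi^{br}_n(\pi) \rangle$ (the product of the players' best-response sets), with $\varphi^{br}_i(\pi) = \text{best-response}_i(\pi_{-i})$.
By the Kakutani fixed-point theorem, there exists $\pi^* \in \Pi$ such that $\pi^* \in \varphi^{br}(\pi^*)$, i.e., $\pi^*_i \in \varphi^{br}_i(\pi^*)$ for all $i \in [n]$.
This is exactly the definition of a Nash equilibrium: $\pi^*_i$ is a best response to $\pi^*_{-i}$ for all $i \in [n]$.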
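A small worked instance may make the set-valued map concrete. Take matching pennies (an example chosen here, not one from the slides): the row player receives $+1$ if the two coins match and $-1$ otherwise, and the column player receives the negative. Writing $p$ and $q$ for the probabilities with which the row and column player show heads, the best-response correspondence and its fixed point are:

```latex
u_1(p, q) = (2p - 1)(2q - 1), \qquad u_2(p, q) = -u_1(p, q)

\varphi^{br}_1(p, q) =
\begin{cases}
\{1\}   & q > 1/2 \\
[0, 1]  & q = 1/2 \\
\{0\}   & q < 1/2
\end{cases}
\qquad
\varphi^{br}_2(p, q) =
\begin{cases}
\{0\}   & p > 1/2 \\
[0, 1]  & p = 1/2 \\
\{1\}   & p < 1/2
\end{cases}

(p^*, q^*) \in \varphi^{br}_1(p^*, q^*) \times \varphi^{br}_2(p^*, q^*)
\iff (p^*, q^*) = (1/2, 1/2)
```

Every value of the correspondence is a non-empty closed interval, hence convex, and the only fixed point is the uniform profile. No single-valued continuous best response exists here, which is why the set-valued theorem is needed.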

Theorem
Theorem ([Nash, 1951]). Every finite game $G$ has a mixed Nash equilibrium.
Proof. Verify that $\Pi$ and $\varphi^{br}$ meet the requirements of the Kakutani fixed-point theorem. By Kakutani $\implies$ done.

Verification
Recall $\Delta_i = \{ (p_1, p_2, \ldots, p_m) : p_j \ge 0, \ \sum_j p_j = 1 \}$, $\Pi = \Delta_1 \times \Delta_2 \times \cdots \times \Delta_n$, and $\varphi^{br}_i(\pi) = \text{best-response}_i(\pi_{-i})$.
$\Pi$: a non-empty, compact and convex subset of $\mathbb{R}^{nm}$.
$\varphi^{br}(\pi)$: non-empty and convex.
$\varphi^{br}$: has a closed graph.

Attention!
Is $\varphi^{br}(\pi)$ non-empty? Why? Because the game is finite (important).
Closed graph? Show that the set $\{ (\pi, \pi') \mid \pi' \in \varphi^{br}(\pi) \}$ is closed.
Closed means: for any sequence of elements of this set, $(\pi^1, \pi'^1), \ldots, (\pi^k, \pi'^k), \ldots$, that converges to $(\pi, \pi')$, the limit $(\pi, \pi')$ is also in the set.
This can be verified from the definition.

Proof via Brouwer
Consider a finite game $G$ with $n$ players and $m$ actions for each player.
Mixed action space for player $i$: $\Delta_i = \{ (p_1, p_2, \ldots, p_m) : p_j \ge 0, \ \sum_j p_j = 1 \}$.
Mixed action profile space: $\Pi = \Delta_1 \times \Delta_2 \times \cdots \times \Delta_n$.
Goal: construct a continuous function $f : \Pi \to \Pi$ such that $f$ has a fixed point $\implies$ $G$ has a mixed Nash equilibrium.

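A Brute Force Solution
By definition, a Nash equilibrium $\pi$ can be written as a feasibility mixed-integer program.
W.l.o.g., assume that each player's utility lies in $[0, 1]$, i.e., $u : \Pi \to [0, 1]^n$.
Auxiliary integer variables $s_{ij}$ indicate whether action $j$ is in the support of player $i$, i.e., $s_{ij} = \mathbb{I}[\, j \in \mathrm{Supp}(\pi_i) \,]$.
Find $\pi, s$ satisfying
$u_i(j, \pi_{-i}) - u_i(\pi_i, \pi_{-i}) \ge s_{ij} - 1, \quad \forall i \in [n], \ j \in [m]$,
$\pi \le s$,
$\pi \in \Pi, \quad s \in \{0, 1\}^{n \times m}$.
The first constraint is vacuous when $s_{ij} = 0$ (utilities lie in $[0, 1]$); when $s_{ij} = 1$ it forces action $j$ to do at least as well as $\pi_i$. Actions outside the support must still satisfy the equilibrium condition $u_i(j, \pi_{-i}) \le u_i(\pi_i, \pi_{-i})$.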
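One way to see the brute-force nature of this program is to enumerate the support indicators explicitly instead of handing them to an integer solver. The sketch below does this for two players only, under the extra assumption that equilibrium supports have equal size (true for non-degenerate games). It uses exact rational arithmetic. All function names are hypothetical, and this is a swapped-in support enumeration, not the slides' MIP.

```python
from fractions import Fraction
from itertools import combinations


def solve(A, b):
    """Solve the square system A x = b exactly; return None if A is singular."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def equalizing_mix(payoff, own_support, opp_support, opp_size):
    """Opponent mix on opp_support that makes all of own_support equally good."""
    k = len(opp_support)
    # Unknowns: k probabilities followed by the common utility v.
    A = [[payoff[i][j] for j in opp_support] + [-1] for i in own_support]
    A.append([1] * k + [0])  # probabilities sum to one
    sol = solve(A, [0] * len(own_support) + [1])
    if sol is None or any(p < 0 for p in sol[:k]):
        return None
    mix = [Fraction(0)] * opp_size
    for j, p in zip(opp_support, sol[:k]):
        mix[j] = p
    return mix


def is_best(payoff, own_mix, opp_mix):
    """True if every own action played with positive probability is optimal."""
    utils = [sum(u * q for u, q in zip(row, opp_mix)) for row in payoff]
    best = max(utils)
    return all(p == 0 or u == best for p, u in zip(own_mix, utils))


def support_enumeration(R, C):
    """Equilibria with equal-size supports of the bimatrix game (R, C)."""
    m, n = len(R), len(R[0])
    Ct = [list(col) for col in zip(*C)]
    found = []
    for k in range(1, min(m, n) + 1):
        for I in combinations(range(m), k):
            for J in combinations(range(n), k):
                y = equalizing_mix(R, I, J, n)   # row actions in I indifferent
                x = equalizing_mix(Ct, J, I, m)  # column actions in J indifferent
                if x is not None and y is not None \
                        and is_best(R, x, y) and is_best(Ct, y, x):
                    found.append((x, y))
    return found


# Matching pennies (hypothetical example): one equilibrium, both players uniform.
print(len(support_enumeration([[1, -1], [-1, 1]], [[-1, 1], [1, -1]])))  # 1
```

The number of support pairs grows exponentially in $m$, which is the cost the integer variables $s_{ij}$ hide.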

Lemke-Howson Algorithm
Due to Lemke and Howson [1964].
Solves two-person normal-form general-sum games (bimatrix games).
Exponential time in the worst case.
NASH is PPAD-complete, and so is 2-person NASH [Chen and Deng, 2006].

Notations for Bimatrix Games
Let $R$ denote the utility matrix of the row player ($i = 1$), i.e., $R_{jj'} = u_1(j, j')$. Similarly, $C$ is the utility matrix of the column player ($i = 2$).
Assumption (w.l.o.g.). The given bimatrix game $(R, C)$ is symmetric, i.e., $R = C^T$.
Why w.l.o.g.? Otherwise, construct the symmetric bimatrix game $(\tilde{R}, \tilde{R}^T)$, where
$\tilde{R} = \begin{bmatrix} -\mathbf{1} & R \\ C^T & -\mathbf{1} \end{bmatrix}$
and $-\mathbf{1}$ denotes a block with every entry equal to $-1$.

Rewrite the MIP
Utilities: $u$ is given by the single matrix $R$.
Strategies: $\pi_1 = \pi_2 = z$ (a symmetric profile).
Unnormalized variable: $z' \ge 0$ and $Rz' \le \mathbf{1}$.
Find $z' \ne 0$ satisfying $(Rz')_j = 1$ or $z'_j = 0$ for all $j \in [m]$.
Assumption (non-degenerate, w.l.o.g.). Every $m + 1$ equations (out of the $2m$ equations $(Rz')_j = 1$ and $z'_j = 0$) are linearly independent. In other words, no $m + 1$ of the corresponding hyperplanes intersect at one point.

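$z'$ Implies a Symmetric Nash Equilibrium
Lemma. $(z, z)$ is a symmetric Nash equilibrium (SNE) of the bimatrix game $(R, R^T)$, where $z = \mathrm{normalize}(z')$.
Proof.
$z'_j = 0 \implies j \notin \mathrm{Supp}(z)$.
$z'_j \ne 0 \implies j \in \mathrm{Supp}(z)$, and meanwhile $(Rz')_j = 1$, so $(Rz)_j = (\mathbf{1}^T z')^{-1}$. Since $Rz' \le \mathbf{1}$ gives $(Rz)_k \le (\mathbf{1}^T z')^{-1}$ for every $k$, this implies $j \in \mathrm{br}(R, z) = \arg\max_j (Rz)_j$.
Together, $\forall j \in [m]$, $j \in \mathrm{Supp}(z) \implies j \in \mathrm{br}(R, z)$, hence $(z, z) \in \mathrm{SNE}(R)$.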
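The lemma can be checked numerically on a small instance. The sketch below uses the $3 \times 3$ matrix $R$ from the Example slide of this deck together with a candidate $z' = (1/6, 1/3, 0)$ worked out by hand; that value is an assumption of this sketch and is not stated in the slides. The helper names are illustrative.

```python
from fractions import Fraction


def normalize(z_prime):
    """Scale a non-zero, non-negative vector so that its entries sum to one."""
    total = sum(z_prime)
    return [v / total for v in z_prime]


def satisfies_complementarity(R, z_prime):
    """Check z' >= 0, R z' <= 1, and for every j: (R z')_j = 1 or z'_j = 0."""
    products = [sum(r * v for r, v in zip(row, z_prime)) for row in R]
    return all(v >= 0 and p <= 1 and (p == 1 or v == 0)
               for v, p in zip(z_prime, products))


def is_symmetric_equilibrium(R, z):
    """(z, z) is an equilibrium of (R, R^T) iff Supp(z) is inside argmax_j (Rz)_j."""
    payoffs = [sum(r * v for r, v in zip(row, z)) for row in R]
    best = max(payoffs)
    return all(v == 0 or p == best for v, p in zip(z, payoffs))


R = [[0, 3, 0], [2, 2, 2], [3, 0, 0]]
z_prime = [Fraction(1, 6), Fraction(1, 3), Fraction(0)]  # hand-computed candidate
z = normalize(z_prime)
print(satisfies_complementarity(R, z_prime))  # True
print(is_symmetric_equilibrium(R, z))         # True
```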

How to Find $z'$?
Recall that $z'$ satisfies, for all $j \in [m]$, $(Rz')_j = 1$ or $z'_j = 0$.
The LH algorithm operates on the polytope $P = \{ z : Rz \le \mathbf{1}, \ z \ge 0 \}$.
By the non-degeneracy assumption, each vertex of $P$ is defined by exactly $m$ equations (the intersection of $m$ hyperplanes).
Label each vertex by the indices $j$ of the equations defining it; an index for which both $(Rz)_j = 1$ and $z_j = 0$ hold is written with a superscript 2.
For example, the origin is labeled $1 2 3 \cdots m$, and the vertex with label $2^2 3 4 \cdots m$ is on the first axis next to the origin.
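The labeling rule is mechanical enough to state in a few lines. The sketch below computes the label multiset of a given point of $P$, using the $3 \times 3$ matrix from the Example slide; the three test points are hand-computed vertices and are assumptions of this sketch. A repeated entry in the returned list corresponds to the superscript 2 in the slide notation.

```python
from fractions import Fraction


def labels(R, z):
    """Sorted 1-based labels of the point z in P = {z : Rz <= 1, z >= 0}."""
    found = []
    for j, row in enumerate(R):
        if z[j] == 0:
            found.append(j + 1)  # the equation z_j = 0 is tight
        if sum(r * v for r, v in zip(row, z)) == 1:
            found.append(j + 1)  # the equation (Rz)_j = 1 is tight
    return sorted(found)


R = [[0, 3, 0], [2, 2, 2], [3, 0, 0]]
origin = [Fraction(0)] * 3
on_third_axis = [Fraction(0), Fraction(0), Fraction(1, 2)]
candidate = [Fraction(1, 6), Fraction(1, 3), Fraction(0)]
print(labels(R, origin))         # [1, 2, 3]
print(labels(R, on_third_axis))  # [1, 2, 2]
print(labels(R, candidate))      # [1, 2, 3]
```

A non-origin point whose labels are exactly $1, 2, \ldots, m$ is the $z'$ the algorithm is looking for.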

How to Find $z'$? (Lemke-Howson Algorithm)
1. Start from the origin, relax the equation with $j = m$, and move to the adjacent vertex of $P$.
2. If the label of the current vertex is $1 2 3 \cdots m$ (each number exactly once), return this vertex.
3. Otherwise, some $j$ appears twice in the label. Of the two equations with that $j$, relax the one that does not lead back to the previous vertex, and move to the next vertex.
4. Go to step 2.

Example
$R = \begin{bmatrix} 0 & 3 & 0 \\ 2 & 2 & 2 \\ 3 & 0 & 0 \end{bmatrix}$
[Figure: the polytope $P$ drawn on axes $Z_1$, $Z_2$, $Z_3$, with each vertex marked by its label, e.g. $123$, $12^2$, $1^2 2$, $2^2 3$, $23^2$.]
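The path-following steps can be carried out with tableau pivoting on the system $Rz + w = \mathbf{1}$, where the slack $w_j = 0$ means $(Rz)_j = 1$. This is a common way to implement the vertex moves, but the slides do not prescribe it, so the sketch below should be read as one possible realization. It assumes $R$ has non-negative entries, no all-zero column, and is non-degenerate. The function name and the `dropped` parameter are hypothetical.

```python
from fractions import Fraction


def lemke_howson(R, dropped=None):
    """Follow the Lemke-Howson path for the symmetric game (R, R^T).

    Returns the normalized symmetric equilibrium strategy z.
    """
    m = len(R)
    k = m - 1 if dropped is None else dropped  # 0-based label relaxed first
    # Tableau columns: z_1..z_m, slacks w_1..w_m, right-hand side.
    T = [[Fraction(v) for v in row]
         + [Fraction(int(i == j)) for j in range(m)]
         + [Fraction(1)]
         for i, row in enumerate(R)]
    basis = list(range(m, 2 * m))  # origin: every slack is basic, z = 0
    entering = k                   # relax the equation z_k = 0
    while True:
        # Ratio test: the first constraint that becomes tight along the edge.
        rows = [i for i in range(m) if T[i][entering] > 0]
        if not rows:
            raise ValueError("unbounded edge: R violates the assumptions")
        r = min(rows, key=lambda i: T[i][-1] / T[i][entering])
        pivot = T[r][entering]
        T[r] = [v / pivot for v in T[r]]
        for i in range(m):
            if i != r and T[i][entering] != 0:
                factor = T[i][entering]
                T[i] = [a - factor * b for a, b in zip(T[i], T[r])]
        leaving, basis[r] = basis[r], entering
        if leaving % m == k:       # the missing label is back: fully labeled
            break
        # The label of `leaving` now appears twice; relax its other equation.
        entering = leaving + m if leaving < m else leaving - m
    z_prime = [Fraction(0)] * m
    for i, var in enumerate(basis):
        if var < m:
            z_prime[var] = T[i][-1]
    total = sum(z_prime)
    return [v / total for v in z_prime]


R = [[0, 3, 0], [2, 2, 2], [3, 0, 0]]
z = lemke_howson(R)
print([str(v) for v in z])  # ['1/3', '2/3', '0']
```

On this matrix the path visits the labels $123 \to 12^2 \to 1^2 2 \to 123$, ending at $z' = (1/6, 1/3, 0)$, whose normalization is the strategy printed above.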

Proof
Proof of termination. Each internal vertex on the path (other than the origin and the target) has exactly two neighbors. An incompletely labeled vertex is never visited twice. The number of vertices is finite.
Proof of correctness. Label $m$ is missing at all internal vertices on the path. Only one neighbor is reachable from the origin by removing label $m$. Hence the output cannot be the origin.

Additional Comments for the LH Algorithm
$\pi_i$ is a best response to $\pi_{-i}$ $\iff$ all pure strategies in the support of $\pi_i$ are best responses to $\pi_{-i}$.
The origin is fully labeled, while the internal vertices are not.
Any internal vertex has a label with $m$ missing and some $k$ appearing twice (proved by induction), i.e., $1 2 \cdots k^2 \cdots (m - 1)$.
Any internal vertex has in-degree exactly 1 and out-degree exactly 1, so there is never a ρ-shaped loop.

Additional Comments for the LH Algorithm (cont'd)
The origin has exactly $m$ neighbors, each reached by removing one label $j \in [m]$. Denote them $v_1, \ldots, v_m$ respectively.
Since there is never a ρ-shaped loop, the path cannot return to the origin via $v_m$ (otherwise it would form a ρ shape).
Since no internal vertex has the label $m$, the path cannot return to the origin via any of $v_1, \ldots, v_{m-1}$ (because the label of $v_j$, $j \le m - 1$, includes $m$).
Therefore there is no loop.

References
Chen, Xi, and Xiaotie Deng. Settling the complexity of 2-player Nash-equilibrium. Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS), 2006.
Lemke, Carlton E., and Joseph T. Howson, Jr. Equilibrium points of bimatrix games. Journal of the Society for Industrial & Applied Mathematics 12.2 (1964): 413-423.
Nash, John F. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences 36.1 (1950): 48-49.
Nash, John. Non-cooperative games. Annals of Mathematics (1951): 286-295.

Thanks!

Proof Idea (Proof via Brouwer)
Consider a finite game $G$ with $n$ players and $m$ actions for each player.
Mixed action space for player $i$: $\Delta_i = \{ (p_1, p_2, \ldots, p_m) : p_j \ge 0, \ \sum_j p_j = 1 \}$.
Mixed action profile space: $\Pi = \Delta_1 \times \Delta_2 \times \cdots \times \Delta_n$.
Goal: construct a continuous function $f : \Pi \to \Pi$ such that $f$ has a fixed point $\implies$ $G$ has a mixed Nash equilibrium.

Theorem
Theorem ([Nash, 1951]). Every finite game $G$ has a mixed Nash equilibrium.
Proof. Define $\varphi_i(\pi, j) = \max\{ 0, \ u_i(j, \pi_{-i}) - u_i(\pi_i, \pi_{-i}) \}$, and write $\varphi_i(\pi) = (\varphi_i(\pi, 1), \ldots, \varphi_i(\pi, m))$.
Then construct $f$ as $f_i(\pi) = \mathrm{normalize}(\pi_i + \varphi_i(\pi))$.
Verify the following facts to complete the proof:
$f$ is continuous, and $\Pi$ is convex and compact.
$f$ has a fixed point $\implies$ $G$ has a mixed Nash equilibrium.
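To make the construction of $f$ tangible, the sketch below applies it once to a profile of a two-player game. The restriction to two players, the nested-list payoff format, and the names `nash_map` and `improve` are assumptions made for illustration. The code evaluates $f$ only; it does not search for a fixed point.

```python
def nash_map(R, C, x, y):
    """One application of f to the profile (x, y) of the bimatrix game (R, C)."""
    def improve(payoff, own, opp):
        pure = [sum(u * q for u, q in zip(row, opp)) for row in payoff]
        current = sum(p * u for p, u in zip(own, pure))  # u_i(pi_i, pi_-i)
        gains = [max(0.0, u - current) for u in pure]    # phi_i(pi, j)
        total = 1.0 + sum(gains)                         # 1^T (pi_i + phi_i(pi))
        return [(p + g) / total for p, g in zip(own, gains)]

    C_transposed = [list(col) for col in zip(*C)]  # column player's own view
    return improve(R, x, y), improve(C_transposed, y, x)


# Matching pennies (hypothetical example).
R = [[1, -1], [-1, 1]]
C = [[-1, 1], [1, -1]]
print(nash_map(R, C, [0.5, 0.5], [0.5, 0.5]))  # ([0.5, 0.5], [0.5, 0.5])
x_new, y_new = nash_map(R, C, [1.0, 0.0], [1.0, 0.0])
print(x_new)  # [1.0, 0.0]
```

The uniform profile is left unchanged, as a Nash equilibrium should be. From the pure profile, the column player's strategy shifts toward the action that would have paid more, while the row player, who has no profitable deviation, stays put.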

$f$ Is Continuous on $\Pi$, and $\Pi$ Is Convex and Compact
$\Delta_i$ is convex and compact $\implies$ $\Pi$ is convex and compact.
To establish the continuity of $f$ on $\Pi$, we show that each $f_i$ is continuous:
$u_i(\pi)$ is continuous.
$\varphi_i(\pi, j)$ is nonnegative and continuous.
$\mathbf{1}^T(\pi_i + \varphi_i(\pi)) \ge 1$ and is continuous.
Hence $f_i(\pi) = \mathrm{normalize}(\pi_i + \varphi_i(\pi))$ is continuous.

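$f$ Has a Fixed Point $\implies$ $G$ Has a Mixed Nash Equilibrium
Suppose $\pi^*$ is a fixed point: $f(\pi^*) = \pi^* \implies \forall i \in [n], \ f_i(\pi^*) = \pi^*_i$.
Combined with the definition of $f_i$, we have $\forall i \in [n]$, $\varphi_i(\pi^*) = \alpha_i \pi^*_i$, where $\alpha_i \ge 0$ is some constant.
It is easy to see that there exists $j \in \mathrm{Supp}(\pi^*_i)$ such that $\varphi_i(\pi^*, j) = 0$.
If $\alpha_i \ne 0$, then every $j$ with $\varphi_i(\pi^*, j) = 0$ has $\pi^*_i(j) = 0$, which contradicts $j \in \mathrm{Supp}(\pi^*_i)$.
Therefore $\alpha_i = 0$, and hence $\pi^*$ is a mixed Nash equilibrium.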

Additional Comments on the Last Slide
We prove that $\pi^*$ is a mixed NE by showing the following (according to the definition of NE):
$\forall i \in [n], \ \forall j \in [m], \quad u_i(j, \pi^*_{-i}) \le u_i(\pi^*_i, \pi^*_{-i})$.
This is equivalent to (recall the definition of $\varphi_i$) $\forall i \in [n], \ \varphi_i(\pi^*) = 0$.
By the definition of $f$, for all $i \in [n]$,
$\pi^*_i = f_i(\pi^*) = \mathrm{normalize}(\pi^*_i + \varphi_i(\pi^*)) \implies \varphi_i(\pi^*) = \alpha_i \pi^*_i$.
In other words, $\varphi_i(\pi^*)$ is proportional to $\pi^*_i$.
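The proportionality step can be spelled out by writing the normalization explicitly. The derivation below uses only $\mathbf{1}^T \pi^*_i = 1$ and the definition of $f_i$; the summation index $k$ is introduced here for illustration.

```latex
\pi^*_i
  = \frac{\pi^*_i + \varphi_i(\pi^*)}{\mathbf{1}^T \bigl( \pi^*_i + \varphi_i(\pi^*) \bigr)}
  = \frac{\pi^*_i + \varphi_i(\pi^*)}{1 + \sum_{k \in [m]} \varphi_i(\pi^*, k)}

\Longrightarrow\quad
\Bigl( 1 + \sum_{k \in [m]} \varphi_i(\pi^*, k) \Bigr) \pi^*_i = \pi^*_i + \varphi_i(\pi^*)

\Longrightarrow\quad
\varphi_i(\pi^*) = \alpha_i \, \pi^*_i,
\qquad \alpha_i = \sum_{k \in [m]} \varphi_i(\pi^*, k) \ \ge 0
```

In particular $\alpha_i = 0$ holds exactly when every $\varphi_i(\pi^*, k)$ vanishes, which is the no-profitable-deviation condition.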

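Proving $\alpha_i = 0$
To complete the proof, we need only show that $\alpha_i = 0$. This is implied by the following fact:
$\forall i \in [n], \ \exists j \in [m]$ s.t. $\pi^*_i(j) > 0$ and $\varphi_i(\pi^*, j) = 0$.
Notice that
$\sum_{j \in \mathrm{Supp}(\pi^*_i)} \pi^*_i(j) = 1, \qquad u_i(\pi^*_i, \pi^*_{-i}) = \sum_{j \in \mathrm{Supp}(\pi^*_i)} \pi^*_i(j) \, u_i(j, \pi^*_{-i})$.
Hence
$\sum_{j \in \mathrm{Supp}(\pi^*_i)} \pi^*_i(j) \bigl( u_i(j, \pi^*_{-i}) - u_i(\pi^*_i, \pi^*_{-i}) \bigr) = 0$,
so $\exists j \in \mathrm{Supp}(\pi^*_i)$ such that $u_i(j, \pi^*_{-i}) - u_i(\pi^*_i, \pi^*_{-i}) \le 0 \implies \varphi_i(\pi^*, j) = 0$. Done.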

Intuition behind the Constructive Proof
$f_i(\pi) = \mathrm{normalize}(\pi_i + \varphi_i(\pi))$: moves smoothly toward a better strategy for player $i$ (with the others' strategies fixed).
Why not choose $f_i(\pi)$ to be the best response to $\pi_{-i}$? Because the best response is discontinuous.
$+\varphi_i(\pi)$: increases the profitable components in proportion to their gains.
$\alpha_i = 0$ (i.e., $\varphi_i(\pi^*) = 0$): no profitable deviation, hence a Nash equilibrium.