Nash Equilibrium: Existence & Computation

IIIS, Tsinghua University. zoy.blood@gmail.com. March 28, 2016.

Overview
1. Fixed-Point Theorems: Kakutani & Brouwer; Proof of Existence
2. A Brute Force Solution; Lemke-Howson Algorithm


History

John Forbes Nash, Jr. Who proved it? When and how?
Via the Kakutani fixed-point theorem [Nash, 1950].
Via the Brouwer fixed-point theorem [Nash, 1951].
[1] From Wikipedia. https://en.wikipedia.org/

Nash Equilibrium

Nash equilibrium: everyone plays a mixed strategy that is a best response to the other players' strategies.
Mixed strategy: a probability distribution over the pure strategies, i.e., the player plays pure strategies at random according to the distribution.
Best response: a strategy π_i is a best response if it maximizes player i's utility with the other players' strategies held fixed.
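The definition can be restated formally (a standard restatement, not on the original slide; π_{-i} denotes the other players' strategies and Δ_i denotes player i's mixed-strategy simplex):

```latex
\pi^* \text{ is a Nash equilibrium} \iff
\forall i \in [n],\ \forall \sigma_i \in \Delta_i:\quad
u_i(\pi^*_i, \pi^*_{-i}) \ge u_i(\sigma_i, \pi^*_{-i}).
```

Since u_i is linear in player i's own strategy, it suffices to check pure deviations σ_i = j for each pure action j; this equivalence is used again in the Lemke-Howson part.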

Brouwer Fixed-Point Theorem

Theorem (Brouwer fixed-point theorem). For a convex and compact domain D, every continuous function f : D → D has a fixed point, i.e., there exists p ∈ D such that f(p) = p.
Convex: every line segment whose two endpoints are in D lies entirely in D.
Compact: closed and bounded (in Euclidean space).

Examples for Brouwer

One-dimensional case. Two-dimensional case. Failure examples.
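As a worked instance of the one-dimensional case, here is the standard argument via the intermediate value theorem (not spelled out on the slides):

```latex
\begin{gather*}
\text{Let } f\colon[0,1]\to[0,1] \text{ be continuous and set } g(x)=f(x)-x. \\
g(0)=f(0)\ge 0,\qquad g(1)=f(1)-1\le 0. \\
\text{By the intermediate value theorem, } \exists\, p\in[0,1]:\ g(p)=0,\ \text{i.e., } f(p)=p.
\end{gather*}
```

For the failure examples, dropping any hypothesis admits a map without a fixed point: a rotation of an annulus (non-convex domain), x ↦ x/2 on the open interval (0, 1) (non-compact domain), or a jump function on [0, 1] (discontinuity).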

Kakutani Fixed-Point Theorem

Theorem (Kakutani fixed-point theorem). Let D be a non-empty, compact and convex subset of R^n. Any set-valued function φ : D → 2^D with (1) a closed graph and (2) a non-empty, convex value for each x ∈ D has a fixed point, i.e., there exists p ∈ D such that p ∈ φ(p).
(1) Closed graph: the set {(x, y) : y ∈ φ(x)} is closed.
(2) Non-empty and convex values: φ(x) is non-empty and convex.
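A tiny illustration (our own example, not from the slides): on D = [0, 1], the correspondence below has non-empty convex values and a closed graph, and its fixed points are every p ≥ 1/2, so the theorem guarantees existence but not uniqueness.

```latex
D=[0,1],\qquad \varphi(x)=[\,1-x,\ 1\,],\qquad
p\in\varphi(p)\iff p\ge 1-p \iff p\ge \tfrac12 .
```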

Proof Idea

Consider a finite game G with n players and m actions for each player.
Mixed action space for player i: Δ_i = { (p_1, p_2, ..., p_m) : p_j ≥ 0, Σ_j p_j = 1 }.
Mixed action profile space: Π = Δ_1 × Δ_2 × ... × Δ_n.
The best response is then a set-valued function φ^br : Π → 2^Π, where φ^br_i(π) = best-response_i(π_{-i}).

Proof Idea (Cont'd)

φ^br(π) = ( φ^br_1(π), ..., φ^br_n(π) ),  with  φ^br_i(π) = best-response_i(π_{-i}).
By applying the Kakutani fixed-point theorem, there exists π* ∈ Π such that π* ∈ φ^br(π*), i.e., π*_i ∈ φ^br_i(π*) for all i ∈ [n].
Recall the definition of Nash equilibrium: π*_i is a best response to π*_{-i} for all i ∈ [n], so π* is a Nash equilibrium.
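As a small numerical sketch of the best-response correspondence for the two-player case (the function names here are ours, not from the slides): given the opponent's mixed strategy, player i's mixed best responses are exactly the distributions supported on the pure actions of maximal expected utility, which is also why φ^br_i(π) will turn out to be non-empty and convex.

```python
# A minimal best-response sketch for the row player of a bimatrix game,
# assuming utility matrix R (rows = own actions, columns = opponent actions).
import numpy as np

def pure_best_responses(R, opponent_mix, tol=1e-12):
    """Indices of the row player's pure best responses to the opponent's mix."""
    expected = np.asarray(R, float) @ np.asarray(opponent_mix, float)
    return np.flatnonzero(expected >= expected.max() - tol)

def one_best_response(R, opponent_mix):
    """One element of the best-response set: uniform over pure best responses."""
    best = pure_best_responses(R, opponent_mix)
    x = np.zeros(np.asarray(R).shape[0])
    x[best] = 1.0 / len(best)
    return x

# Example: matching pennies against a 50/50 opponent; both actions are best.
R = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(pure_best_responses(R, [0.5, 0.5]))   # -> [0 1]
print(one_best_response(R, [0.5, 0.5]))     # -> [0.5 0.5]
```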

Theorem

Theorem ([Nash, 1951]). Every finite game G has a mixed Nash equilibrium.
Proof. Verify that Π and φ^br meet the requirements of the Kakutani fixed-point theorem. Then Kakutani ⇒ done.

Verification

Recall Δ_i = { (p_1, p_2, ..., p_m) : p_j ≥ 0, Σ_j p_j = 1 }, Π = Δ_1 × Δ_2 × ... × Δ_n, and φ^br_i(π) = best-response_i(π_{-i}). To check:
Π is a non-empty, compact and convex subset of R^{nm}.
φ^br(π) is non-empty and convex for each π.
φ^br has a closed graph.

Attention!

Is φ^br(π) non-empty? Why? Because the game is finite (important).
Closed graph? Show that the set {(π, π') : π' ∈ φ^br(π)} is closed.
Closed means: for any sequence (π^1, π'^1), ..., (π^k, π'^k), ... of elements of this set that converges to (π, π'), the limit (π, π') is also in the set. This can be verified from the definition.
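One way to fill in the verification asked for above (a sketch of the standard arguments, not spelled out on the slides):

```latex
\begin{itemize}
  \item \textbf{Non-empty:} $u_i(\cdot,\pi_{-i})$ is linear on the compact set $\Delta_i$,
        so $\max_{\sigma_i\in\Delta_i} u_i(\sigma_i,\pi_{-i}) = \max_{j\in[m]} u_i(j,\pi_{-i})$
        is attained (the game is finite).
  \item \textbf{Convex:} $\varphi^{\mathrm{br}}_i(\pi)
        = \{\sigma_i \in \Delta_i : \operatorname{Supp}(\sigma_i) \subseteq B_i(\pi)\}$
        where $B_i(\pi) = \arg\max_{j} u_i(j,\pi_{-i})$; this is a face of the simplex,
        hence convex (and non-empty).
  \item \textbf{Closed graph:} if $(\pi^k,\hat\pi^k) \to (\pi,\hat\pi)$ with
        $\hat\pi^k_i \in \varphi^{\mathrm{br}}_i(\pi^k)$, then
        $u_i(\hat\pi^k_i,\pi^k_{-i}) \ge u_i(\sigma_i,\pi^k_{-i})$ for all $\sigma_i\in\Delta_i$;
        letting $k\to\infty$ and using continuity of $u_i$ gives
        $\hat\pi_i \in \varphi^{\mathrm{br}}_i(\pi)$.
\end{itemize}
```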

Proof via Brouwer

Consider a finite game G with n players and m actions for each player.
Mixed action space for player i: Δ_i = { (p_1, p_2, ..., p_m) : p_j ≥ 0, Σ_j p_j = 1 }.
Mixed action profile space: Π = Δ_1 × Δ_2 × ... × Δ_n.
Goal: construct a continuous function f : Π → Π such that the existence of a fixed point of f implies that G has a mixed Nash equilibrium.

Overview
1. Fixed-Point Theorems: Kakutani & Brouwer; Proof of Existence
2. A Brute Force Solution; Lemke-Howson Algorithm

A Brute Force Solution

A Nash equilibrium π can be characterized, by definition, as a feasible point of a mixed-integer program. W.l.o.g., assume that every player's utility lies in [0, 1], i.e., u : Π → [0, 1]^n.
Auxiliary integer variables s_ij indicate whether action j is in the support of player i, i.e., s_ij = I[ j ∈ Supp(π_i) ].
Find (π, s) satisfying
  u_i(j, π_{-i}) − u_i(π_i, π_{-i}) ≥ s_ij − 1,   ∀ i ∈ [n], j ∈ [m],
  π ∈ Π,  s ∈ {0, 1}^{n×m}.
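A runnable sketch of the brute-force idea for the two-player (bimatrix) case, which is an assumption of this sketch (the slide states the program for n players): instead of a mixed-integer program, guess the two supports, which plays the role of the integer variables s_ij, and solve a linear feasibility problem for each guess.

```python
# Support enumeration for a bimatrix game (R for the row player, C for the
# column player). Exponential in the number of actions; for illustration only.
import itertools
import numpy as np
from scipy.optimize import linprog

def support_enumeration(R, C):
    """Return a mixed Nash equilibrium (x, y) of the bimatrix game (R, C)."""
    R, C = np.asarray(R, float), np.asarray(C, float)
    m, k = R.shape
    nv = m + k + 2                              # variables: x (m), y (k), v1, v2
    V1, V2 = m + k, m + k + 1

    def row(coeffs):                            # one constraint row over all variables
        r = np.zeros(nv)
        for idx, val in coeffs:
            r[idx] = val
        return r

    def supports(size):
        return itertools.chain.from_iterable(
            itertools.combinations(range(size), s) for s in range(1, size + 1))

    for S1 in supports(m):
        for S2 in supports(k):
            A_eq, b_eq, A_ub, b_ub = [], [], [], []
            A_eq.append(row([(i, 1.0) for i in range(m)])); b_eq.append(1.0)       # sum x = 1
            A_eq.append(row([(m + j, 1.0) for j in range(k)])); b_eq.append(1.0)   # sum y = 1
            for i in range(m):                  # (Ry)_i = v1 on S1, <= v1 off S1
                c = [(m + j, R[i, j]) for j in range(k)] + [(V1, -1.0)]
                (A_eq if i in S1 else A_ub).append(row(c))
                (b_eq if i in S1 else b_ub).append(0.0)
                if i not in S1:                 # actions outside the support get weight 0
                    A_eq.append(row([(i, 1.0)])); b_eq.append(0.0)
            for j in range(k):                  # (C^T x)_j = v2 on S2, <= v2 off S2
                c = [(i, C[i, j]) for i in range(m)] + [(V2, -1.0)]
                (A_eq if j in S2 else A_ub).append(row(c))
                (b_eq if j in S2 else b_ub).append(0.0)
                if j not in S2:
                    A_eq.append(row([(m + j, 1.0)])); b_eq.append(0.0)
            bounds = [(0, None)] * (m + k) + [(None, None)] * 2
            res = linprog(np.zeros(nv),
                          A_ub=np.array(A_ub) if A_ub else None,
                          b_ub=np.array(b_ub) if b_ub else None,
                          A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                          bounds=bounds, method="highs")
            if res.success:
                return res.x[:m], res.x[m:m + k]
    return None                                 # unreachable for finite games (Nash's theorem)

# Example: matching pennies; the unique equilibrium plays (1/2, 1/2) for both players.
print(support_enumeration([[1, -1], [-1, 1]], [[-1, 1], [1, -1]]))
```

Any feasible (x, y) found this way is an equilibrium: actions inside the guessed supports attain the common values v1, v2, and all other actions attain at most those values.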

Lemke-Howson Algorithm

Lemke and Howson [1964]. Solves two-person normal-form general-sum games (bimatrix games). Exponential time in the worst case. NASH is PPAD-complete, and so is 2-player NASH [Chen and Deng, 2006].

Notation for Bimatrix Games

Let R denote the utility matrix of the row player (i = 1), i.e., R_{jj'} = u_1(j, j'). Similarly, let C denote the utility matrix of the column player (i = 2).
Assumption (w.l.o.g.). The given bimatrix game (R, C) is symmetric, i.e., R = C^T.
Why w.l.o.g.? Otherwise, consider constructing a symmetric bimatrix game (R̃, R̃^T), where
  R̃ = [ 1    R ]
       [ C^T  1 ].
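A tiny numpy sketch of this symmetrization step; reading the "1" entries of the block matrix as all-ones blocks is an assumption of this reconstruction (other presentations use zero blocks instead), and payoffs are assumed to lie in [0, 1] as above.

```python
import numpy as np

def symmetrize(R, C):
    """Build the block matrix R~ from the slide, with all-ones diagonal blocks."""
    R, C = np.asarray(R, float), np.asarray(C, float)
    m, k = R.shape
    return np.block([[np.ones((m, m)), R],
                     [C.T, np.ones((k, k))]])

# Hypothetical payoffs in [0, 1]; (R_sym, R_sym.T) is a symmetric bimatrix game.
R = np.array([[0.2, 0.8], [0.5, 0.1]])
C = np.array([[0.7, 0.3], [0.4, 0.9]])
R_sym = symmetrize(R, C)
```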

Rewrite the MIP

For the symmetric game (R, R^T), look for a symmetric equilibrium π_1 = π_2 ∝ z, with z ≥ 0 and Rz ≤ 1.
Find z* ≥ 0 satisfying (Rz*)_j = 1 or z*_j = 0 for every j ∈ [m].
Assumption (Non-degeneracy, w.l.o.g.). Any m + 1 of the 2m equations are linearly independent; in other words, the corresponding hyperplanes do not intersect in a single point.

z* Implies a Symmetric Nash Equilibrium

Lemma. (z, z) is a symmetric Nash equilibrium (SNE) of the bimatrix game (R, R^T), where z = normalize(z*) = z*/(1^T z*) (assuming z* ≠ 0).
Proof. z*_j = 0 ⇒ j ∉ Supp(z). z*_j > 0 ⇒ j ∈ Supp(z), and meanwhile (Rz)_j = (1^T z*)^{−1}, which is the largest any entry of Rz can be (since Rz* ≤ 1), so j ∈ br(R, z) = argmax_{j'} (Rz)_{j'}.
Together: for every j ∈ [m], j ∈ Supp(z) ⇒ j ∈ br(R, z), hence (z, z) ∈ SNE(R).

How to Find z*?

Recall that z* satisfies, for every j ∈ [m], (Rz*)_j = 1 or z*_j = 0.
The LH algorithm operates on the polytope P = {z : Rz ≤ 1, z ≥ 0}. By assumption, each vertex of P is defined by exactly m equations (the intersection of m hyperplanes).
Label each vertex by the indices j of the equations defining it. For example, the origin is labeled 1 2 3 ⋯ m, and the vertex with label 2 2 3 4 ⋯ m is on the first axis next to the origin.

How to Find z*?

Lemke-Howson Algorithm
1. Start from the origin: relax the equation with label j = m and move to another vertex of P.
2. If the label of the current vertex is 1 2 3 ⋯ m (each index exactly once), return this vertex.
3. Otherwise, relax the equation whose index j appears twice and that does not lead back to the previous vertex, and move on.
4. Go to step 2.

Example

R = [ 0 3 0 ]
    [ 2 2 2 ]
    [ 3 0 0 ]

(Figure: the polytope P = {z : Rz ≤ 1, z ≥ 0} for this R, drawn in (z_1, z_2, z_3)-space with each vertex marked by its labels, e.g., 12, 23, 123.)
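Below is an illustrative sketch of the label-following walk just described (not from the slides, and deliberately not efficient: it enumerates all vertices of P up front, which is exponential, whereas Lemke-Howson pivots from vertex to vertex). It assumes the non-degeneracy condition above and a bounded P; constraint i < m is z_i = 0, constraint m + j is (Rz)_j = 1, and both carry the label (index mod m).

```python
import itertools
import numpy as np

def lemke_howson_sketch(R, tol=1e-9):
    """Follow the LH path on P = {z : Rz <= 1, z >= 0} for a symmetric game R."""
    R = np.asarray(R, float)
    m = R.shape[0]
    rows = np.vstack([np.eye(m), R])                  # constraint i: rows[i] @ z = rhs[i]
    rhs = np.concatenate([np.zeros(m), np.ones(m)])

    def vertex(tight):                                # solve the m chosen tight equations
        idx = sorted(tight)
        A, b = rows[idx], rhs[idx]
        if abs(np.linalg.det(A)) < tol:
            return None
        z = np.linalg.solve(A, b)
        ok = np.all(z >= -tol) and np.all(R @ z <= 1 + tol)
        return z if ok else None                      # keep only feasible points (vertices of P)

    verts = {}                                        # vertex, keyed by its m tight constraints
    for tight in itertools.combinations(range(2 * m), m):
        z = vertex(tight)
        if z is not None:
            verts[frozenset(tight)] = z

    current = frozenset(range(m))                     # the origin: z_j = 0 for all j
    leave = m - 1                                     # step 1: relax the label-m constraint
    while True:
        edge = current - {leave}                      # m-1 tight constraints define an edge
        nxt = next(t for t in verts if t != current and edge <= t)
        entered = next(iter(nxt - edge))              # the constraint that just became tight
        if len({c % m for c in nxt}) == m:            # completely labeled: step 2, return
            z = verts[nxt]
            return z / z.sum()                        # normalize(z*), as in the lemma
        dup = entered % m                             # the duplicated label
        # Step 3: relax the other constraint carrying the duplicate label.
        leave = next(c for c in nxt if c % m == dup and c != entered)
        current = nxt

# The example matrix above; the result should be a symmetric NE of (R, R^T) by the lemma.
R = [[0, 3, 0], [2, 2, 2], [3, 0, 0]]
print(lemke_howson_sketch(R))
```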

Proof

Proof of termination.
Each internal vertex on the path (other than the origin and the target) has exactly two neighbors on the path.
The walk never visits an incompletely labeled vertex twice.
P has finitely many vertices.
Proof of correctness.
Label m is missing from every internal vertex on the path.
Only one neighbor of the origin is reachable by removing label m.
Hence the output cannot be the origin.

Additional Comments on the LH Algorithm

π_i is a best response to π_{-i} ⟺ all pure strategies in the support of π_i are best responses to π_{-i}.
The origin is fully labeled, while the internal vertices on the path are not.
Any internal vertex has label m missing and some label k appearing twice (proved by induction), i.e., its label reads 1 2 ⋯ k k ⋯ (m − 1).
Any internal vertex has in-degree exactly 1 and out-degree exactly 1, so there can never be a ρ-shaped loop.

Additional Comments on the LH Algorithm (Cont'd)

The origin has exactly m neighbors, one reached by removing each label j ∈ [m]; denote them v_1, ..., v_m.
Since there is never a ρ-shaped loop, the path cannot return to the origin via v_m (otherwise it would form a ρ-shape).
Since no internal vertex carries label m, the path cannot return to the origin via any of v_1, ..., v_{m−1} (because the label of v_j, j ≤ m − 1, includes m).
Therefore there is no loop.

References

Chen, Xi, and Xiaotie Deng. Settling the complexity of 2-player Nash equilibrium. Proceedings of the Annual Symposium on Foundations of Computer Science (FOCS), 2006.
Lemke, Carlton E., and Joseph T. Howson, Jr. Equilibrium points of bimatrix games. Journal of the Society for Industrial and Applied Mathematics 12(2) (1964): 413-423.
Nash, John F. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences 36(1) (1950): 48-49.
Nash, John. Non-cooperative games. Annals of Mathematics (1951): 286-295.

Thanks!

Proof via Brouwer: Proof Idea

Consider a finite game G with n players and m actions for each player.
Mixed action space for player i: Δ_i = { (p_1, p_2, ..., p_m) : p_j ≥ 0, Σ_j p_j = 1 }.
Mixed action profile space: Π = Δ_1 × Δ_2 × ... × Δ_n.
Goal: construct a continuous function f : Π → Π such that the existence of a fixed point of f implies that G has a mixed Nash equilibrium.

Theorem

Theorem ([Nash, 1951]). Every finite game G has a mixed Nash equilibrium.
Proof. Define φ_i(π, j) = max{ 0, u_i(j, π_{-i}) − u_i(π_i, π_{-i}) }. Then construct f componentwise:
  f_i(π) = normalize( π_i + φ_i(π) ).
Verify the following facts to complete the proof:
f is continuous, and Π is convex and compact.
f has a fixed point ⇒ G has a mixed Nash equilibrium.
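A numerical sketch of this construction for the two-player case (the function names are ours; iterating f is not, in general, a procedure for computing equilibria — the map only serves the fixed-point argument):

```python
# One application of Nash's map f to a profile (x, y) of a bimatrix game (R, C):
# phi_i(pi, j) = max(0, u_i(j, pi_{-i}) - u_i(pi)), f_i(pi) = normalize(pi_i + phi_i(pi)).
import numpy as np

def nash_map(R, C, x, y):
    R, C = np.asarray(R, float), np.asarray(C, float)
    ux, uy = x @ R @ y, x @ C @ y                    # current expected utilities
    gain_x = np.maximum(0.0, R @ y - ux)             # phi_1(pi, j) for each pure j
    gain_y = np.maximum(0.0, C.T @ x - uy)           # phi_2(pi, j) for each pure j
    fx = (x + gain_x) / (1.0 + gain_x.sum())         # normalize(x + phi_1), denominator >= 1
    fy = (y + gain_y) / (1.0 + gain_y.sum())         # normalize(y + phi_2)
    return fx, fy

# At a Nash equilibrium every gain is 0, so the profile is a fixed point:
R = np.array([[1.0, -1.0], [-1.0, 1.0]])             # matching pennies
x = y = np.array([0.5, 0.5])
print(nash_map(R, -R, x, y))                         # returns (x, y) unchanged
```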

f is Continuous on Π, and Π is Convex and Compact

Each Δ_i is convex and compact ⇒ Π is convex and compact.
To establish the continuity of f on Π, we show that each f_i is continuous:
u_i(π) is continuous.
φ_i(π, j) is nonnegative and continuous.
1^T( π_i + φ_i(π) ) ≥ 1 and is continuous.
Hence f_i(π) = normalize( π_i + φ_i(π) ) is continuous (the normalizing denominator never vanishes).

f Has a Fixed Point ⇒ G Has a Mixed Nash Equilibrium

Suppose π* is a fixed point: f(π*) = π*, i.e., f_i(π*) = π*_i for all i ∈ [n].
Combined with the definition of f_i, we get that for all i ∈ [n], φ_i(π*) = α_i π*_i for some constant α_i.
It is easy to see that there exists j ∈ [m] with π*_i(j) > 0 and φ_i(π*, j) = 0 (made precise on the next slides). If α_i ≠ 0, every such j would have π*_i(j) = 0, a contradiction.
Therefore α_i = 0, and hence π* is a mixed Nash equilibrium.

Additional Comments on the Last Slide

We prove that π* is a mixed NE by showing the following (according to the definition of NE):
  ∀ i ∈ [n], ∀ j ∈ [m]:  u_i(j, π*_{-i}) ≤ u_i(π*_i, π*_{-i}).
Equivalently (recall the definition of φ_i):
  ∀ i ∈ [n]:  φ_i(π*) = 0.
By the definition of f, we have, for all i ∈ [n],
  π*_i = f_i(π*) = normalize( π*_i + φ_i(π*) )  ⇒  φ_i(π*) = α_i π*_i.
In other words, φ_i(π*) is proportional to π*_i.

Proving α_i = 0

To complete the proof, we only need to show that α_i = 0. This is implied by the following fact:
  ∀ i ∈ [n], ∃ j ∈ [m] such that π*_i(j) > 0 and φ_i(π*, j) = 0.
Notice that Σ_{j ∈ Supp(π*_i)} π*_i(j) = 1 and
  u_i(π*_i, π*_{-i}) = Σ_{j ∈ Supp(π*_i)} π*_i(j) u_i(j, π*_{-i}),
hence
  Σ_{j ∈ Supp(π*_i)} π*_i(j) ( u_i(j, π*_{-i}) − u_i(π*_i, π*_{-i}) ) = 0.
So there exists j ∈ Supp(π*_i) such that
  u_i(j, π*_{-i}) − u_i(π*_i, π*_{-i}) ≤ 0  ⇒  φ_i(π*, j) = 0.
Done.

Intuition behind the Constructive Proof

f_i(π) = normalize( π_i + φ_i(π) ): smoothly move toward a better strategy for player i (with the others held fixed).
Why not choose f_i(π) to be a best response to π_{-i}? Because the best response is not continuous in π.
+φ_i(π): increase the profitable components in proportion to their improvement.
α_i = 0 (i.e., φ_i(π*) = 0): no profitable deviation, hence a Nash equilibrium.