Game Theory and Control

Lecture 4: Potential Games
Saverio Bolognani, Ashish Hota, Maryam Kamgarpour
Automatic Control Laboratory, ETH Zürich

Course Outline
1. Introduction
   22.02  Lecture 1: Introduction to games
2. Single-stage games
   01.03  Lecture 2: Zero-sum games
   08.03  Lecture 3: Non-zero-sum games
   15.03  Lecture 4: Potential games
   22.03  Lecture 5: Convex games I
   29.03  Lecture 6: Convex games II
   10.04  Homework 1 due
3. Multi-stage games
   12.04  Lecture 6: Feedback games
   19.04  Lecture 7: Randomized strategies for feedback games
   26.04  Lecture 8: Dynamic games I
   03.05  Lecture 9: Dynamic games II
   15.05  Homework 2 due
   17.05  Lecture 10: Stackelberg games
   24.05  Lecture 11: Auctions I
   31.05  Lecture 12: Auctions II
   12.06  Homework 3 due

Recall: Finite non-zero-sum games

Let there be n < ∞ players. Player i has m_i < ∞ pure actions available to it. Denote the set of pure actions of player i by S_i, |S_i| = m_i. A pure strategy of player i is denoted by π_{i,j}. The set of mixed strategies of player i, denoted by X_i, is the set

X_i = { (x_{i,1}, ..., x_{i,m_i}) : Σ_{j=1}^{m_i} x_{i,j} = 1, x_{i,j} ≥ 0, j = 1, ..., m_i },

where x_{i,j} is the probability with which player i selects action j ∈ S_i.

Recall: Utility under pure and mixed strategies

Consider a set of pure strategies of all players π = (π_{1,j_1}, π_{2,j_2}, ..., π_{n,j_n}), where π_{i,j_i} ∈ S_i for every player i. The utility of player i is v_i(π) = v_i(π_{i,j_i}, π_{−i}).

Consider a set of mixed strategies of all players x = {x_1, x_2, ..., x_n}, where x_i ∈ X_i for every player i. We also denote it by x = (x_i, x_{−i}), where x_{−i} is the joint mixed strategy of all players other than i, i.e., x_{−i} = (x_1, x_2, ..., x_{i−1}, x_{i+1}, ..., x_n). The expected utility of player i is

U_i(x_i, x_{−i}) = Σ_{j_1=1}^{m_1} Σ_{j_2=1}^{m_2} ... Σ_{j_n=1}^{m_n} v_i(π_{1,j_1}, π_{2,j_2}, ..., π_{n,j_n}) x_{1,j_1} x_{2,j_2} ... x_{n,j_n}.
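The nested sum above can be evaluated directly by enumerating all pure-action profiles. A minimal sketch (the example game is an assumption for illustration, not from the slides):

```python
import itertools

def expected_utility(v, x):
    """U_i(x): sum over all pure profiles of v_i(profile) times the
    product of the probabilities x_{j, a_j} of each chosen action."""
    total = 0.0
    for profile in itertools.product(*(range(len(xi)) for xi in x)):
        prob = 1.0
        for player, action in enumerate(profile):
            prob *= x[player][action]
        total += v[profile] * prob
    return total

# Illustrative 2-player game: player 1's payoff in odds-and-evens,
# actions indexed 0 and 1.
v1 = {(0, 0): -1, (0, 1): 1, (1, 0): 1, (1, 1): -1}
uniform = [[0.5, 0.5], [0.5, 0.5]]
print(expected_utility(v1, uniform))  # → 0.0
```

Under uniform mixing the four profiles are equally likely, so the ±1 payoffs cancel.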

Recall: Pure and mixed Nash equilibrium

Pure Nash equilibrium: A pure strategy profile π* = (π*_1, π*_2, ..., π*_n) is a pure Nash equilibrium if for every player i,
v_i(π*_i, π*_{−i}) ≥ v_i(π_{i,j}, π*_{−i})  for all π_{i,j} ∈ S_i.

Mixed Nash equilibrium: A mixed strategy profile x* = (x*_1, x*_2, ..., x*_n) is a mixed Nash equilibrium if for every player i,
U_i(x*_i, x*_{−i}) ≥ U_i(x_i, x*_{−i})  for all x_i ∈ X_i.

Every pure Nash equilibrium is a mixed Nash equilibrium.

Recall: Nash's theorem

Theorem: Every finite game has a mixed Nash equilibrium.

Proof. Consider the set X = X_1 × X_2 × ... × X_n. X is compact and convex. Consider the map f : X → X defined earlier; f is continuous. By Brouwer's fixed point theorem, there exists a fixed point x* such that f(x*) = x*. By Proposition 2, every fixed point of f is a mixed Nash equilibrium.

Lecture outline

Topics covered today:
- How do we compute a pure Nash equilibrium?
- Best response dynamics
- A special class of games: potential games
- Application to traffic equilibrium

Example: Pure Nash equilibrium

Utility of Red Car:
               Blue: Go   Blue: Wait
  Red: Go         5          10
  Red: Wait       9           8

Utility of Blue Car:
               Blue: Go   Blue: Wait
  Red: Go         3           9
  Red: Wait      10           8

Recall: v_r(g, g) = 5, v_b(g, w) = 9, and so on.

Which of the following joint pure strategies are Nash equilibria?
(Go, Go):  (Go, Wait):  (Wait, Go):  (Wait, Wait):
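One way to check the definition numerically, a sketch using the payoff tables as printed (action 0 = Go, 1 = Wait):

```python
import itertools

# Row player = Red, column player = Blue; entries from the tables above.
RED  = [[5, 10], [9, 8]]
BLUE = [[3, 9], [10, 8]]

def pure_nash(red, blue):
    """Return all pure profiles where no player gains by a unilateral
    deviation, i.e. each player's action is a best response."""
    eq = []
    for r, b in itertools.product(range(2), range(2)):
        red_ok  = red[r][b]  >= max(red[rr][b] for rr in range(2))
        blue_ok = blue[r][b] >= max(blue[r][bb] for bb in range(2))
        if red_ok and blue_ok:
            eq.append((r, b))
    return eq

print(pure_nash(RED, BLUE))  # → [(0, 1), (1, 0)]: (Go, Wait) and (Wait, Go)
```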

Minimization vs. maximization

Suppose each player i minimizes a cost function c_i instead of maximizing a utility function v_i.

Pure Nash equilibrium: A pure strategy profile π* = (π*_1, π*_2, ..., π*_n) is a pure Nash equilibrium if for every player i,
c_i(π*_i, π*_{−i}) ≤ c_i(π_{i,j}, π*_{−i})  for all π_{i,j} ∈ S_i.

Example: Prisoner's dilemma

                 Betray      Stay silent
  Betray        (10, 10)      (0, 11)
  Stay silent   (11, 0)       (1, 1)

i.e., A = [10 0; 11 1], B = [10 11; 0 1]. Entries represent costs, which players minimize.

Myopic strategy

Let π_{−i} = (π_{1,j_1}, ..., π_{i−1,j_{i−1}}, π_{i+1,j_{i+1}}, ..., π_{n,j_n}) be the pure strategy profile of all players other than i. How should player i choose her strategy? What about the strategy that maximizes her utility?

Pure best response: The pure best response of player i is the set S*_i(π_{−i}) ⊆ S_i such that π*_i ∈ S*_i(π_{−i}) if and only if
v_i(π*_i, π_{−i}) ≥ v_i(π_{i,j}, π_{−i})  for all π_{i,j} ∈ S_i.

In other words, S*_i(π_{−i}) := argmax_{π_i ∈ S_i} v_i(π_i, π_{−i}).

S*_i(π_{−i}) is not necessarily single-valued; it is set-valued. S*_i(π_{−i}) is a function of the joint strategies of the other players.

Example: Best response

Utility of Red Car:
               Blue: Go   Blue: Wait
  Red: Go         5          10
  Red: Wait       9           8

Utility of Blue Car:
               Blue: Go   Blue: Wait
  Red: Go         3           9
  Red: Wait      10           8

Recall: v_r(g, g) = 5, v_b(g, w) = 9, and so on.

What are the best responses?
S*_r(g): the best response of the red car when the blue car chooses Go?
S*_r(w): the best response of the red car when the blue car chooses Wait?
S*_b(g):
S*_b(w):

Best response and Nash equilibrium

Proposition: A pure strategy profile π* = (π*_1, π*_2, ..., π*_n) is a pure Nash equilibrium if and only if π*_i ∈ S*_i(π*_{−i}) for every player i.

Fixed point interpretation: consider a set-valued map S* such that for π ∈ S,
S*(π) := [S*_1(π_{−1}), S*_2(π_{−2}), ..., S*_n(π_{−n})].
Note: S*(π) ⊆ S. A pure strategy profile π* ∈ S is a Nash equilibrium if and only if π* ∈ S*(π*).

We require a stronger version of Brouwer's fixed point theorem to show the existence of fixed points of set-valued maps.

Best response dynamics

1. Consider an initial pure strategy profile π^0 = (π^0_1, π^0_2, ..., π^0_n).
2. If π^k is a pure Nash equilibrium: stop.
3. Else there exists a player i and π^{k+1}_i ≠ π^k_i such that v_i(π^{k+1}_i, π^k_{−i}) > v_i(π^k_i, π^k_{−i}).
4. Update: π^{k+1} := (π^{k+1}_i, π^k_{−i}).
5. Repeat steps 2-4.

Does this dynamics converge? If yes, to which joint strategy?
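A minimal sketch of the loop above, in the common variant where the deviating player moves to a best response (the slides allow any improving deviation; this is a special case):

```python
def best_response_dynamics(utilities, n_actions, start, max_iters=100):
    """While some player has a profitable deviation, let one such
    player switch to a best response. utilities[i](profile) is
    player i's payoff (maximized). Returns the pure NE reached,
    or None if no fixed point is found within max_iters."""
    profile = tuple(start)
    for _ in range(max_iters):
        improved = False
        for i, u in enumerate(utilities):
            best = max(range(n_actions[i]),
                       key=lambda a: u(profile[:i] + (a,) + profile[i+1:]))
            if u(profile[:i] + (best,) + profile[i+1:]) > u(profile):
                profile = profile[:i] + (best,) + profile[i+1:]
                improved = True
                break
        if not improved:        # no profitable deviation: pure NE
            return profile
    return None                 # cycled (as in odds and evens)

# Prisoner's dilemma as costs (so utility = -cost), 0 = Betray, 1 = Silent:
A = [[10, 0], [11, 1]]
B = [[10, 11], [0, 1]]
us = [lambda p: -A[p[0]][p[1]], lambda p: -B[p[0]][p[1]]]
print(best_response_dynamics(us, [2, 2], (1, 1)))  # → (0, 0), i.e. (Betray, Betray)
```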

Example: Best response dynamics in odds and evens

If the sum of both numbers is odd: P1 wins 1 Franc, P2 loses 1 Franc.
If the sum of both numbers is even: P2 wins 1 Franc, P1 loses 1 Franc.
P1 maximizes, P2 maximizes.

               P2: 1       P2: 2
  P1: 1      (−1, 1)     (1, −1)
  P1: 2      (1, −1)     (−1, 1)

Let π^0 = (1, 1). Does the best response dynamics converge?

Example: Best response dynamics in the prisoner's dilemma

                 Betray      Stay silent
  Betray        (10, 10)      (0, 11)
  Stay silent   (11, 0)       (1, 1)

i.e., A = [10 0; 11 1], B = [10 11; 0 1]. Entries represent costs, which players minimize.

Let π^0 = (silent, silent). Does the best response dynamics converge?

Potential game: Definition

Ordinal potential function: A function P : S_1 × S_2 × ... × S_n → R is an ordinal potential function if for every player i and every π_{−i},
v_i(π_{i,j_1}, π_{−i}) − v_i(π_{i,j_2}, π_{−i}) > 0  iff  P(π_{i,j_1}, π_{−i}) − P(π_{i,j_2}, π_{−i}) > 0,
for every π_{i,j_1}, π_{i,j_2} ∈ S_i.

Note: the potential function assigns a value to each joint strategy profile. When player i deviates to a strategy that improves her utility, the potential increases.

Potential game: Definition (cont.)

Exact potential function: A function P : S_1 × S_2 × ... × S_n → R is an exact potential function if for every player i and every π_{−i},
v_i(π_{i,j_1}, π_{−i}) − v_i(π_{i,j_2}, π_{−i}) = P(π_{i,j_1}, π_{−i}) − P(π_{i,j_2}, π_{−i}),
for every π_{i,j_1}, π_{i,j_2} ∈ S_i.

A game is an (ordinal/exact) potential game if it admits an (ordinal/exact) potential function.

Example: Prisoner's dilemma

Payoff matrices are given by

                 Betray         Stay silent
  Betray       (−10, −10)        (0, −11)
  Stay silent  (−11, 0)         (−1, −1)

i.e., A = [−10 0; −11 −1], B = [−10 −11; 0 −1]. Both players maximize.

Is the following a potential function?

                 Betray      Stay silent
  Betray           0            −1
  Stay silent     −1            −2

Note: if a player deviates, the change in potential equals the change in utility of the deviating player.
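The note can be checked mechanically. A sketch, assuming the maximization payoffs A = [[−10, 0], [−11, −1]], B = [[−10, −11], [0, −1]] and the candidate potential P = [[0, −1], [−1, −2]] from the example:

```python
def is_exact_potential(A, B, P):
    """Check v_i(a', a_-i) - v_i(a, a_-i) == P(a', a_-i) - P(a, a_-i)
    for every unilateral deviation of either player in a 2-player game."""
    n, m = len(A), len(A[0])
    for j in range(m):                  # player 1 deviations, column fixed
        for a in range(n):
            for a2 in range(n):
                if A[a][j] - A[a2][j] != P[a][j] - P[a2][j]:
                    return False
    for i in range(n):                  # player 2 deviations, row fixed
        for b in range(m):
            for b2 in range(m):
                if B[i][b] - B[i][b2] != P[i][b] - P[i][b2]:
                    return False
    return True

A = [[-10, 0], [-11, -1]]   # player 1 payoffs (maximized)
B = [[-10, -11], [0, -1]]   # player 2 payoffs
P = [[0, -1], [-1, -2]]     # candidate potential
print(is_exact_potential(A, B, P))  # → True
```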

Existence of pure Nash equilibrium

Proposition: Finite games with an ordinal potential function possess a pure Nash equilibrium. Furthermore, best response dynamics converges.

Proof idea: the joint strategy profile that maximizes the potential function is a Nash equilibrium.

Improvement paths

Let us introduce some terminology. Let S := S_1 × S_2 × ... × S_n.

A path in S is a sequence z = (z^0, z^1, ...), z^k ∈ S, such that for every k ≥ 1 there exists a unique player i_k such that z^k = (π_{i_k,j}, z^{k−1}_{−i_k}) for some π_{i_k,j} ∈ S_{i_k}, π_{i_k,j} ≠ z^{k−1}_{i_k}.

A path z is an improvement path if at every k ≥ 1, v_{i_k}(z^k) > v_{i_k}(z^{k−1}).

Proposition: In a finite ordinal potential game, every improvement path is finite.

This property is known as the finite improvement property (FIP). Does the converse hold?

Potential game and FIP

Generalized ordinal potential function: A function P : S_1 × S_2 × ... × S_n → R is a generalized ordinal potential function if for every player i and every π_{−i},
v_i(π_{i,j_1}, π_{−i}) − v_i(π_{i,j_2}, π_{−i}) > 0  ⟹  P(π_{i,j_1}, π_{−i}) − P(π_{i,j_2}, π_{−i}) > 0,
for every π_{i,j_1}, π_{i,j_2} ∈ S_i.

Proposition: A finite game has the finite improvement property (FIP) if and only if it admits a generalized ordinal potential.

Potential game characterization

Consider any finite path z = (z^0, z^1, ..., z^m); z need not be an improvement path. Define

I(z, v) := Σ_{k=1}^{m} [v_{i_k}(z^k) − v_{i_k}(z^{k−1})],

where i_k is the player with z^k_{i_k} ≠ z^{k−1}_{i_k}.

A path is closed if z^0 = z^m. A path is simple if z^i ≠ z^j for every i ≠ j (except z^0 and z^m).

Question: Suppose the game admits an exact potential function. Let z be a closed path. Then I(z, v) = ?

Potential game characterization (cont.)

Proposition [4]: Consider a finite game. Then the following are equivalent:
1. The game admits an exact potential function.
2. I(z, v) = 0 for every finite closed path z.
3. I(z, v) = 0 for every finite simple closed path z.
4. I(z, v) = 0 for every finite simple closed path z of length 4.
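The quantity I(z, v) is easy to compute on the 4-cycle through all profiles of a 2×2 game. A sketch, using the coordination game and the odds-and-evens game from the surrounding slides:

```python
def path_integral(utilities, path):
    """I(z, v) = sum_k [v_{i_k}(z^k) - v_{i_k}(z^{k-1})], where i_k is
    the unique player whose action changes at step k."""
    total = 0
    for prev, curr in zip(path, path[1:]):
        (i_k,) = [i for i in range(len(prev)) if prev[i] != curr[i]]
        total += utilities[i_k](curr) - utilities[i_k](prev)
    return total

# Closed simple path of length 4, deviators alternating:
cycle = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]

# Coordination game (an exact potential game): I vanishes on the cycle.
A = [[2, 0], [0, 1]]; B = [[1, 0], [0, 2]]
us = [lambda p: A[p[0]][p[1]], lambda p: B[p[0]][p[1]]]
print(path_integral(us, cycle))   # → 0

# Odds and evens (no exact potential): I is nonzero.
A2 = [[-1, 1], [1, -1]]; B2 = [[1, -1], [-1, 1]]
us2 = [lambda p: A2[p[0]][p[1]], lambda p: B2[p[0]][p[1]]]
print(path_integral(us2, cycle))  # → 8
```

By item 4 of the proposition, the nonzero value certifies that odds and evens admits no exact potential.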

Example: Coordination game

Payoff matrices are given by

               Movie      Football
  Movie        (2, 1)      (0, 0)
  Football     (0, 0)      (1, 2)

Is it a potential game? Can you construct a potential function?

Example: Odds and evens game

Exercise:
- P1 maximizes, P2 maximizes.

               P2: 1       P2: 2
  P1: 1      (−1, 1)     (1, −1)
  P1: 2      (1, −1)     (−1, 1)

- Is this game a potential game?
- Evaluate I(z, v) for a closed simple path of length 4.

Congestion game

Consider a game with n players and m resources. Each player chooses a resource, i.e., S_i = {1, 2, ..., m} for every player i.

Consider a pure strategy profile π. Denote the load on resource j by

σ_j(π) := |{1 ≤ i ≤ n : π_i = j}|,

the number of players who choose resource j in strategy profile π. The cost of a player depends on the load on the resource it chose:

c_i(π) = f_j(σ_j(π))  when π_i = j.

The function f_j is resource-specific. All players who choose a given resource experience the same cost.

Example: Traffic routing

[Network: two parallel routes from city A to city B. North route: road (15 + 0.1n minutes) followed by a ferry (40 minutes). South route: ferry (40 minutes) followed by a road (15 + 0.1n minutes).]

There are two ways to reach city B from city A, and both include some driving and a trip on the ferry. The two paths are perfectly equivalent; the only difference is whether you first drive or first take the ferry. The time needed for the trip depends on what the other travellers do:
- The ferry time is constant: 40 minutes.
- The road time depends on the number n of cars on the road: 15 + 0.1n minutes.
We consider a population of N = 200 travellers.

Example: Traffic routing (cont.)

Formulation as a non-zero-sum N-person game:
- Each traveller is a player.
- Each path is a resource.
- Each player decides to take the North or the South path:
  γ^(i) = 1 for North, γ^(i) = 0 for South.

All players have the identical cost function

c_i(γ^(i), γ^(−i)) = 40 + 15 + 0.1 Σ_j γ^(j)         if γ^(i) = 1,
c_i(γ^(i), γ^(−i)) = 40 + 15 + 0.1 Σ_j (1 − γ^(j))    if γ^(i) = 0.
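The cost function above can be evaluated directly from a 0/1 choice vector. A minimal sketch (function name and vector encoding are illustrative assumptions):

```python
def travel_cost(i, gamma):
    """Travel time (minutes) for traveller i given the 0/1 route
    choices gamma: 1 = North (road first), 0 = South (ferry first)."""
    n_north = sum(gamma)
    n_south = len(gamma) - n_north
    if gamma[i] == 1:
        return 40 + 15 + 0.1 * n_north   # ferry + road shared with North users
    return 40 + 15 + 0.1 * n_south       # ferry + road shared with South users

# 100 travellers North, 100 South:
gamma = [1] * 100 + [0] * 100
print(travel_cost(0, gamma))    # ≈ 65 minutes
print(travel_cost(150, gamma))  # ≈ 65 minutes
```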

Potential function

Theorem: The following is an exact potential function for congestion games:

P(π) = Σ_{j=1}^{m} Σ_{k=1}^{σ_j(π)} f_j(k).

Proof. Consider a player i and two joint pure strategies π^1 = (p, π_{−i}) and π^2 = (q, π_{−i}). Note that
- σ_p(π^2) = σ_p(π^1) − 1,
- σ_q(π^2) = σ_q(π^1) + 1,
- σ_j(π^2) = σ_j(π^1) for every resource j ≠ p, q.
It suffices to show that P(π^1) − P(π^2) = c_i(π^1) − c_i(π^2).
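The identity the proof establishes can be checked on a toy instance. A sketch, assuming a small congestion game (3 players, 2 resources, cost functions chosen for illustration):

```python
def load(profile, j):
    """sigma_j(pi): number of players choosing resource j."""
    return sum(1 for choice in profile if choice == j)

def potential(profile, f, m):
    """P(pi) = sum_j sum_{k=1}^{sigma_j(pi)} f_j(k)."""
    return sum(f[j](k) for j in range(m)
               for k in range(1, load(profile, j) + 1))

def cost(i, profile, f):
    """c_i(pi) = f_j(sigma_j(pi)) for the resource j chosen by i."""
    j = profile[i]
    return f[j](load(profile, j))

# Assumed instance: f_0(k) = 2k, f_1(k) = k + 3.
f = [lambda k: 2 * k, lambda k: k + 3]
pi1 = (0, 0, 1)
pi2 = (1, 0, 1)   # player 0 deviates from resource 0 to resource 1
lhs = potential(pi1, f, 2) - potential(pi2, f, 2)
rhs = cost(0, pi1, f) - cost(0, pi2, f)
print(lhs, rhs)   # → -1 -1: the potential change equals the cost change
```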

Proof (cont.)

P(π^1) − P(π^2) = Σ_{j=1}^{m} Σ_{k=1}^{σ_j(π^1)} f_j(k) − Σ_{j=1}^{m} Σ_{k=1}^{σ_j(π^2)} f_j(k)
              = [Σ_{k=1}^{σ_p(π^1)} f_p(k) + Σ_{k=1}^{σ_q(π^1)} f_q(k)] − [Σ_{k=1}^{σ_p(π^2)} f_p(k) + Σ_{k=1}^{σ_q(π^2)} f_q(k)]
              = f_p(σ_p(π^1)) − f_q(σ_q(π^2))
              = c_i(π^1) − c_i(π^2).

Consequently, congestion games admit a pure Nash equilibrium.

Example: Traffic routing (cont.)

Are there pure Nash equilibria? Suppose 100 players choose the North path and 100 choose the South path. The travel cost of each player is

c_i(γ^(i), γ^(−i)) = 40 + 15 + 0.1 · (200/2) = 65 minutes.

Can you improve your outcome by unilaterally deviating from the NE?

Example: Braess paradox

[Network: as before, plus a bridge (0 minutes) connecting the end of the North road to the start of the South road.]

Assume a bridge is built to help reduce traffic. It takes no time to cross the bridge, and it allows travellers to go from city A to city B without taking the ferry (road, bridge, road).

New Nash equilibrium: all travellers avoid the ferry.

c_i(γ^(1), ..., γ^(N)) = 2 · (15 + 0.1 · 200) = 70 minutes

Can you improve your outcome by unilaterally deviating from the NE? No, road + ferry now takes 15 + 0.1 · 200 + 40 = 75 minutes!
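The equilibrium and deviation numbers above can be reproduced by evaluating the three route costs. A sketch (route names and load arguments are illustrative assumptions; n_road1 and n_road2 are the total loads on the first and second road segments):

```python
N = 200

def route_cost(route, n_road1, n_road2):
    """Travel time for each route after the bridge is added."""
    times = {
        "north":  (15 + 0.1 * n_road1) + 40,                    # road, ferry
        "south":  40 + (15 + 0.1 * n_road2),                    # ferry, road
        "bridge": (15 + 0.1 * n_road1) + (15 + 0.1 * n_road2),  # road, road
    }
    return times[route]

# New equilibrium: everyone takes road + bridge + road.
print(route_cost("bridge", N, N))      # ≈ 70 minutes
# Unilateral deviation back to road + ferry: the first road still
# carries all 200 travellers (199 on the bridge plus the deviator).
print(route_cost("north", N, N - 1))   # ≈ 75 minutes, so deviating does not pay
```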

Example: Braess paradox (cont.)

Without the bridge: c_i^NE = 65 minutes. With the bridge: c_i^NE = 70 minutes.

With the new link in the transportation graph:
- the original choice (road + ferry) is still present,
- the new link is intensively used,
- yet all agents experience a higher cost!

"What if They Closed 42d Street and Nobody Noticed?" (25 December 1990)

"On Earth Day this year, New York City's Transportation Commissioner decided to close 42d Street, which as every New Yorker knows is always congested. [...] But to everyone's surprise, Earth Day generated no historic traffic jam. Traffic flow actually improved when 42d Street was closed."

And many other real-life cases in road traffic, data networks, etc.

Social welfare

Welfare function: In an n-person game, let x_i ∈ X_i be the (possibly mixed) strategy played by agent i, and let x ∈ X := X_1 × X_2 × ... × X_n be the system-wide strategy. A welfare cost W : X → R is a measure of the efficiency of each joint strategy, i.e., of the social cost of the population of agents.

Let c_i(x) be the individual cost that player i wants to minimize. For example:
- W(x) = Σ_i c_i(x)
- W(x) = Σ_i log c_i(x)
- W(x) = max_i c_i(x)

These have different meanings: think of income.

Price of Anarchy

The Price of Anarchy is defined as the ratio

PoA := max_{x ∈ X_NE} W(x) / min_{x ∈ X} W(x),

where X is the set of all possible strategies for all agents, while X_NE is the set of all strategies which are Nash equilibria.

In the Braess paradox example, with W = Σ_{i=1}^{N} c_i(x): PoA = 70/65 ≈ 108%.

Theorem [5]: When the delay functions are affine for every edge, PoA ≤ (3 + √5)/2.
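A minimal sketch of the ratio for the Braess example, taking, as on the slide, the 100/100 road-ferry split (65 minutes per traveller) as the benchmark for the minimum social cost:

```python
N = 200

# Welfare: total travel time over all travellers.
worst_ne_welfare = N * 70   # after the bridge: everyone on road + bridge + road
benchmark_welfare = N * 65  # the 100/100 split from the slide

poa = worst_ne_welfare / benchmark_welfare
print(f"PoA = {poa:.3f}")   # → PoA = 1.077, i.e. about 108%
```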

Outlook

- The smaller the PoA, the better the quality of the Nash equilibrium.
- Every finite potential game is isomorphic to a congestion game [4].
- Many different types of learning dynamics can be shown to converge to a Nash equilibrium in potential games [3, 2].
- Every finite game can be decomposed into a potential game and a harmonic game [1].

Next lecture: games with an infinite number of pure strategies or continuous pure strategy sets.

References

[1] Ozan Candogan, Ishai Menache, Asuman Ozdaglar, and Pablo A. Parrilo. Flows and decompositions of games: Harmonic and potential games. Mathematics of Operations Research, 36(3):474-503, 2011.
[2] Jason R. Marden, Gürdal Arslan, and Jeff S. Shamma. Joint strategy fictitious play with inertia for potential games. IEEE Transactions on Automatic Control, 54(2):208-220, 2009.
[3] Dov Monderer and Lloyd S. Shapley. Fictitious play property for games with identical interests. Journal of Economic Theory, 68(1):258-265, 1996.
[4] Dov Monderer and Lloyd S. Shapley. Potential games. Games and Economic Behavior, 14(1):124-143, 1996.

[5] Tim Roughgarden. Selfish Routing. PhD thesis, Cornell University, 2002.