
General-sum games
You could still play a minimax strategy in general-sum games, i.e., pretend that the opponent is only trying to hurt you. But this is not rational:

        Left    Right
Up      0, 0    3, 1
Down    1, 0    2, 1

If Column were trying to hurt Row, Column would play Left, so Row should play Down. In reality, Column will play Right (it is strictly dominant), so Row should play Up. Is there a better generalization of minimax strategies in zero-sum games to general-sum games?

Nash equilibrium [Nash 50]
A vector of strategies (one for each player) is called a strategy profile. A strategy profile (σ_1, σ_2, ..., σ_n) is a Nash equilibrium if each σ_i is a best response to σ_{-i}. That is, for any i and any σ_i', u_i(σ_i, σ_{-i}) ≥ u_i(σ_i', σ_{-i}). Note that this does not say anything about multiple agents changing their strategies at the same time. In any (finite) game, at least one Nash equilibrium (possibly using mixed strategies) exists [Nash 50]. (Note: singular is equilibrium, plural is equilibria.)
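To make the best-response condition concrete, here is a minimal sketch (not from the slides) that checks whether a pure-strategy profile is a Nash equilibrium of a two-player game; the function name and arrays are just illustrative, applied to the general-sum example above.

```python
# Small illustrative check (not part of the original slides): is a given
# pure-strategy profile of a bimatrix game a Nash equilibrium?
import numpy as np

def is_pure_nash(U1, U2, r, c):
    """True if (row r, column c) is a Nash equilibrium of the bimatrix game
    with row-player payoffs U1 and column-player payoffs U2."""
    row_best = U1[r, c] >= U1[:, c].max()   # no profitable row deviation
    col_best = U2[r, c] >= U2[r, :].max()   # no profitable column deviation
    return row_best and col_best

# The general-sum example above: Up/Down vs. Left/Right.
U1 = np.array([[0, 3], [1, 2]])  # Row's payoffs
U2 = np.array([[0, 1], [0, 1]])  # Column's payoffs
print(is_pure_nash(U1, U2, 0, 1))  # (Up, Right) -> True
print(is_pure_nash(U1, U2, 1, 0))  # (Down, Left) -> False
```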

Nash equilibria of chicken

        D       S
D       0, 0    -1, 1
S       1, -1   -5, -5

(D, S) and (S, D) are Nash equilibria. They are pure-strategy Nash equilibria: nobody randomizes. They are also strict Nash equilibria: changing your strategy will make you strictly worse off. There are no other pure-strategy Nash equilibria.

Nash equilibria of chicken

        D       S
D       0, 0    -1, 1
S       1, -1   -5, -5

Is there a Nash equilibrium that uses mixed strategies? Say, where player 1 uses a mixed strategy? Recall: if a mixed strategy is a best response, then all of the pure strategies that it randomizes over must also be best responses. So we need to make player 1 indifferent between D and S. Writing p^c_D and p^c_S for the probabilities that the column player places on D and S:
Player 1's utility for playing D = -p^c_S
Player 1's utility for playing S = p^c_D - 5 p^c_S = 1 - 6 p^c_S
So we need -p^c_S = 1 - 6 p^c_S, which means p^c_S = 1/5. Then, player 2 needs to be indifferent as well. Mixed-strategy Nash equilibrium: ((4/5 D, 1/5 S), (4/5 D, 1/5 S)). People may die! Expected utility is -1/5 for each player.
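For a quick numerical check of this indifference argument (not part of the original slides), the sketch below solves the two indifference equations for the column player's mix and recovers the -1/5 expected utility; the array names are illustrative.

```python
# Minimal sketch (not from the slides): solving player 1's indifference
# condition for the chicken game above and checking the -1/5 expected utility.
import numpy as np

# Row player's payoff matrix for chicken; rows and columns are ordered (D, S).
U1 = np.array([[0.0, -1.0],
               [1.0, -5.0]])

# Indifference: u1(D, p) = u1(S, p) for the column mix p = (p_D, p_S),
# i.e. (U1[0,:] - U1[1,:]) @ p = 0, together with p_D + p_S = 1.
A = np.vstack([U1[0, :] - U1[1, :],
               np.ones(2)])
b = np.array([0.0, 1.0])
p = np.linalg.solve(A, b)
print(p)           # [0.8 0.2]  ->  (4/5 D, 1/5 S)
print(p @ U1 @ p)  # -0.2, i.e. expected utility -1/5 when both players play this mix
```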

The presentation game
(Rows: Audience; columns: Presenter; payoffs are (audience, presenter).)

                              Put effort into     Do not put effort into
                              presentation (E)    presentation (NE)
Pay attention (A)             4, 4                -16, -14
Do not pay attention (NA)     0, -2               0, 0

Pure-strategy Nash equilibria: (A, E), (NA, NE). Mixed-strategy Nash equilibrium: ((1/10 A, 9/10 NA), (4/5 E, 1/5 NE)), with utility 0 for the audience and -14/10 for the presenter. We can see that some equilibria are strictly better for both players than other equilibria, i.e., some equilibria Pareto-dominate other equilibria.

The equilibrium selection problem
You are about to play a game that you have never played before with a person that you have never met. According to which equilibrium should you play? Possible answers:
The equilibrium that maximizes the sum of utilities (social welfare), or at least not a Pareto-dominated equilibrium.
A so-called focal equilibrium. ("Meet in Paris" game: you and a friend were supposed to meet in Paris at noon on Sunday, but you forgot to discuss where and you cannot communicate. All you care about is meeting your friend. Where will you go?)
An equilibrium that is the convergence point of some learning process.
An equilibrium that is easy to compute.
Equilibrium selection is a difficult problem.

Stackelberg (commitment) games

        L        R
L       1, -1    3, 1
R       2, 1     4, -1

The unique Nash equilibrium is (R, L): R is strictly dominant for the row player, so the column player plays L. This has a payoff of (2, 1).

Commitment

        L        R
L       (1,-1)   (3,1)
R       (2,1)    (4,-1)

What if the officer has the option to (credibly) announce where he will be patrolling? This would give him the power to commit to being at one of the buildings. This would be a pure-strategy Stackelberg game.

Commitment

        L        R
L       (1,-1)   (3,1)

If the officer can commit to always being at the left building, then the vandal's best response is to go to the right building. This leads to an outcome of (3, 1).

Committing to mixed strategies

        L        R
L       (1,-1)   (3,1)
R       (2,1)    (4,-1)

What if we give the officer even more power: the ability to commit to a mixed strategy? This results in a mixed-strategy Stackelberg game. E.g., the officer commits to flipping a weighted coin that decides where he patrols.

Committing to mixed strategies is more powerful

        L        R
L       (1,-1)   (3,1)
R       (2,1)    (4,-1)

Suppose the officer commits to the following strategy: {(.5+ε) L, (.5-ε) R}. The vandal's best response is R. As ε goes to 0, this converges to a payoff of (3.5, 0).
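This claim is easy to check numerically. The following sketch (not from the slides) evaluates a committed mixture in the officer/vandal game: it computes the vandal's best response and both players' expected payoffs, breaking follower ties in the leader's favor; the function name is illustrative.

```python
# Illustrative sketch (not from the slides): payoff of committing to a mixed
# strategy in the officer/vandal game, with follower ties broken in the
# leader's favor.
import numpy as np

U1 = np.array([[1.0, 3.0],   # officer (leader) payoffs, rows = (L, R), cols = (L, R)
               [2.0, 4.0]])
U2 = np.array([[-1.0, 1.0],  # vandal (follower) payoffs
               [1.0, -1.0]])

def commitment_payoffs(p):
    """p = probability the officer patrols the left building."""
    mix = np.array([p, 1.0 - p])
    follower_utils = mix @ U2                 # follower's expected utility per column
    best = np.flatnonzero(np.isclose(follower_utils, follower_utils.max()))
    t = max(best, key=lambda j: mix @ U1[:, j])  # tie-break in the leader's favor
    return mix @ U1[:, t], follower_utils[t]

for eps in (0.1, 0.01, 0.0):
    print(commitment_payoffs(0.5 + eps))  # approaches (3.5, 0) as eps -> 0
```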

Stackelberg games in general
One of the agents (the leader) has some advantage that allows her to commit to a strategy (pure or mixed). The other agent (the follower) then chooses his best response to this.

Visualization

        L      C      R
U       0,1    1,0    0,0
M       4,0    0,1    0,0
D       0,0    1,0    1,1

[Figure: the simplex of the leader's mixed strategies over (U, M, D), with vertices (1,0,0) = U, (0,1,0) = M, (0,0,1) = D, partitioned into the regions where the follower's best response is L, C, or R.]

Easy polynomial-time algorithm for two players
For every column t, we solve separately for the best mixed row strategy (defined by the p_s) that induces player 2 to play t:

maximize  Σ_s p_s u_1(s, t)
subject to
  for any t',  Σ_s p_s u_2(s, t) ≥ Σ_s p_s u_2(s, t')
  Σ_s p_s = 1

(This may be infeasible for some t.) Then pick the t that is best for player 1.
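Here is a minimal sketch of this multiple-LP approach (not from the slides), written with scipy.optimize.linprog and applied to the officer/vandal game above; it assumes the usual convention that the follower breaks ties in favor of the leader.

```python
# Illustrative sketch (not from the slides) of the multiple-LP algorithm for
# an optimal mixed strategy to commit to, on the officer/vandal game above.
import numpy as np
from scipy.optimize import linprog

U1 = np.array([[1.0, 3.0],   # leader (row) payoffs
               [2.0, 4.0]])
U2 = np.array([[-1.0, 1.0],  # follower (column) payoffs
               [1.0, -1.0]])

m, n = U1.shape
best_value, best_p, best_t = -np.inf, None, None
for t in range(n):  # solve one LP per follower action t
    # maximize sum_s p_s u1(s,t)  <=>  minimize -U1[:,t] @ p
    c = -U1[:, t]
    # incentive constraints: p @ U2[:,t'] <= p @ U2[:,t] for all t' != t
    A_ub = np.array([U2[:, tp] - U2[:, t] for tp in range(n) if tp != t])
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(len(A_ub)),
                  A_eq=np.ones((1, m)), b_eq=[1.0], bounds=[(0, 1)] * m)
    if res.success and -res.fun > best_value:  # the LP may be infeasible
        best_value, best_p, best_t = -res.fun, res.x, t

print(best_t, best_p, best_value)  # expect t = 1 (i.e., R), p ~ (0.5, 0.5), value 3.5
```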

(A particular kind of) Bayesian games

leader utilities:
2  4
1  0

follower utilities (type 1), probability .6:
1  0
1  3

follower utilities (type 2), probability .4:
0  1
1  3

Multiple types: visualization
[Figure: one best-response region diagram over the leader's strategy simplex (vertices (1,0,0), (0,1,0), (0,0,1)) per follower type, with regions L, C, R, and a combined diagram partitioned into regions labeled by the pair of best responses, e.g. (R, C).]

Solving Bayesian games
There is a known mixed-integer program (MIP) for this [1]. Details omitted because it is rather nasty. The main trick of the MIP is encoding an exponential number of LPs into a single MIP. It is used in the ARMOR system deployed at LAX.
[1] Paruchuri et al., Playing Games for Security: An Efficient Exact Algorithm for Solving Bayesian Stackelberg Games.

(In)approximability
(# types)-approximation: optimize for each type separately using the LP method; pick the solution that gives the best expected utility against the entire type distribution. We cannot do any better in polynomial time, unless P = NP (reduction from INDEPENDENT-SET). For adversarially chosen types, we cannot decide in polynomial time whether it is possible to guarantee positive utility, unless P = NP.
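As an illustration (not from the slides) of this (# types)-approximation, the sketch below computes an optimal commitment against each follower type separately using the single-type LP method, then keeps whichever of those commitments does best in expectation over the whole type distribution; it reuses the two-type example shown above, assumes follower ties are broken in the leader's favor, and all function names are made up.

```python
# Illustrative sketch (not from the slides) of the (# types)-approximation:
# optimize against each follower type separately, then keep the commitment
# that performs best against the whole type distribution.
import numpy as np
from scipy.optimize import linprog

U1 = np.array([[2.0, 4.0],             # leader payoffs (two-type example above)
               [1.0, 0.0]])
follower_types = [                     # (probability, follower payoff matrix)
    (0.6, np.array([[1.0, 0.0], [1.0, 3.0]])),
    (0.4, np.array([[0.0, 1.0], [1.0, 3.0]])),
]
m, n = U1.shape

def best_commitment(U2):
    """Single-type multiple-LP method: best mixed strategy to commit to."""
    best_val, best_p = -np.inf, None
    for t in range(n):
        A_ub = np.array([U2[:, tp] - U2[:, t] for tp in range(n) if tp != t])
        res = linprog(-U1[:, t], A_ub=A_ub, b_ub=np.zeros(len(A_ub)),
                      A_eq=np.ones((1, m)), b_eq=[1.0], bounds=[(0, 1)] * m)
        if res.success and -res.fun > best_val:
            best_val, best_p = -res.fun, res.x
    return best_p

def expected_leader_utility(p):
    """Leader's expected utility of commitment p over the type distribution,
    with each type best-responding (ties broken in the leader's favor)."""
    total = 0.0
    for prob, U2 in follower_types:
        futil = p @ U2
        ties = np.flatnonzero(np.isclose(futil, futil.max()))
        t = max(ties, key=lambda j: p @ U1[:, j])
        total += prob * (p @ U1[:, t])
    return total

candidates = [best_commitment(U2) for _, U2 in follower_types]
best = max(candidates, key=expected_leader_utility)
print(best, expected_leader_utility(best))
```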

Reduction from independent set
[Figure: example graph with vertices 1, 2, 3.]

leader utilities:
        A   B
a_1     1   0
a_2     1   0
a_3     1   0

follower utilities (type 1):
        A   B
a_1     3   1
a_2     0   10
a_3     0   1

follower utilities (type 2):
        A   B
a_1     0   10
a_2     3   1
a_3     0   10

follower utilities (type 3):
        A   B
a_1     0   1
a_2     0   10
a_3     3   1

Value

        L        R
L       (1,-1)   (3,1)
R       (2,1)    (4,-1)

We define three ratios (illustrated on the game above):
Value of Pure Commitment (VoPC): 3/2
Value of Mixed Commitment (VoMC): 3.5/2
Mixed vs Pure (MvP): 3.5/3

Related concepts
Value of mediation [1]: the ratio of the combined utility of all players between the best correlated equilibrium and the best Nash equilibrium.
Price of anarchy [2]: measures the loss of welfare due to the selfishness of the players; the ratio between the worst Nash equilibrium and the optimal social welfare.
One difference is that we are only concerned with player 1's utility.
[1] Ashlagi et al. '08. [2] Koutsoupias and Papadimitriou '99.

Normal form (2×2)

        L       R
U       ϵ, 1    1, 0
D       0, 0    1-ϵ, 1

VoPC = VoMC =

Normal form (2×2), continued

        L       R
U       ϵ, 1    1, 0
D       0, 0    2ϵ, 1

MvP =

Summary of value results

Game type                        VoPC        VoMC        MvP
Normal-form (2×2)                ∞ (ISD)     ∞ (ISD)     ∞ (ISD)
Symmetric normal-form (3×3)      ∞ (ISD)     ∞ (ISD)     ∞ (ISD)
Extensive-form (3 leaves)        ∞ (SPNE)    ∞ (SPNE)    ∞ (SPNE)
Security games (2×2)             ∞ (UNash)   ∞ (UNash)   ∞ (UNash)
Atomic selfish routing:
  Linear costs (n players)       n (ISD)     n (ISD)     n (ISD)
  Quadratic costs (n players)    n^2 (ISD)   n^2 (ISD)   n^2 (ISD)
  k-nomial costs (n players)     n^k (ISD)   n^k (ISD)   n^k (ISD)
  Arbitrary costs (2 players)    ∞ (ISD)     ∞ (ISD)     ∞ (ISD)

Learning
Single follower type; unknown follower payoffs. Repeated play: commit to a mixed strategy, see the follower's (myopic) response.

        L       R
U       1, ?    3, ?
D       2, ?    4, ?

Visualization

        L      C      R
U       0,1    1,0    0,0
M       4,0    0,1    0,0
D       0,0    1,0    1,1

[Figure: the simplex of the leader's mixed strategies over (U, M, D), with vertices (1,0,0) = U, (0,1,0) = M, (0,0,1) = D, partitioned into the follower's best-response regions L, C, and R.]

Sampling
[Figure: randomly sampled points in the leader's strategy simplex, each labeled with the follower's observed best response (L, C, or R).]

Three main techniques in the learning algorithm:
Find one point in each region (using random sampling).
Find a point on an unknown hyperplane.
Starting from a point on an unknown hyperplane, determine the hyperplane completely.

Finding a point on an unknown hyperplane
Step 1. Sample in the overlapping region.
Step 2. Connect the new point to the point in the region that doesn't match.
Step 3. Binary search along this line.
[Figure: intermediate state of the region estimates, with regions L, C, R, an overlapping region labeled "R or L", and the true region R.]
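To make step 3 concrete, here is a small sketch (not from the slides) of the binary-search subroutine: given two leader mixed strategies whose observed follower responses differ, it closes in on a point that is (approximately) on the boundary hyperplane between the two best-response regions. The follower is simulated with the payoffs of the 3×3 visualization game above, which the learner is assumed not to know; the function names are illustrative.

```python
# Illustrative sketch (not from the slides): binary search along the line
# between two leader mixed strategies whose observed follower responses
# differ, to locate a point close to the boundary hyperplane.
import numpy as np

# Stand-in for the unknown follower: in the real setting we only observe the
# best response, not these payoffs (here, the follower payoffs of the 3x3
# visualization game above).
U2_HIDDEN = np.array([[1, 0, 0],
                      [0, 1, 0],
                      [0, 0, 1]])

def observe_response(p):
    """Commit to mixed strategy p and observe the follower's myopic response."""
    return int(np.argmax(p @ U2_HIDDEN))

def boundary_point(p_a, p_b, tol=1e-6):
    """Binary search between p_a and p_b (which must get different responses);
    returns a point within tol of the boundary between the two regions."""
    r_a = observe_response(p_a)
    assert observe_response(p_b) != r_a
    while np.linalg.norm(p_b - p_a) > tol:
        mid = (p_a + p_b) / 2.0
        if observe_response(mid) == r_a:
            p_a = mid            # still in a's region: move the 'a' endpoint
        else:
            p_b = mid            # in the other region: move the 'b' endpoint
    return (p_a + p_b) / 2.0

# Example: between the U-vertex and the M-vertex of the simplex.
print(boundary_point(np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])))
```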

Determining the hyperplane
Step 1. Sample a regular d-simplex centered at the point.
Step 2. Connect d lines between points on opposing sides.
Step 3. Binary search along these lines.
Step 4. Determine the hyperplane (and update the region estimates with this information).
[Figure: intermediate state of the region estimates, with regions L, C, R and an overlapping region labeled "R or L".]

Bound on number of samples
Theorem. Finding all of the hyperplanes necessary to compute the optimal mixed strategy to commit to requires O(F k log(k) + d L k^2) samples, where F depends on the size of the smallest region, L depends on the desired precision, k is the number of follower actions, and d is the number of leader actions.