An Introduction to Noncooperative Games

Alberto Bressan
Department of Mathematics, Penn State University

- introduction to non-cooperative games: solution concepts
- differential games in continuous time
- applications to economic models

Optimal decision problem

    maximize:  Φ(x, y)

[figure: level curves Φ = constant in the (x, y) plane, with the maximizer (x*, y*)]

The choice (x*, y*) ∈ R² yields the maximum payoff.

A game for two players

Player A wishes to maximize his payoff Φ^A(a, b)
Player B wishes to maximize his payoff Φ^B(a, b)

[figure: level curves Φ^A = constant and Φ^B = constant in the (a, b) plane]

Player A chooses the value of a ∈ A
Player B chooses the value of b ∈ B

A cooperative solution

[figure: level curves of Φ^A and Φ^B, with the cooperative choice C = (a_C, b_C)]

maximize the sum of payoffs  Φ^A(a, b) + Φ^B(a, b)

split the total payoff fairly among the two players (how???)

The best reply map

If Player A adopts the strategy a, the set of best replies for Player B is

    R^B(a) = { b ; Φ^B(a, b) = max_{s ∈ B} Φ^B(a, s) }

If Player B adopts the strategy b, the set of best replies for Player A is

    R^A(b) = { a ; Φ^A(a, b) = max_{s ∈ A} Φ^A(s, b) }

[figure: level curves of Φ^A and Φ^B, with the best reply curves b = R^B(a) and a = R^A(b)]

Nash equilibrium solutions

A couple of strategies (a*, b*) is a Nash equilibrium if

    a* ∈ R^A(b*)  and  b* ∈ R^B(a*)

[figure: the Nash equilibrium N = (a*, b*) at the intersection of the two best reply curves]

Antoine Augustin Cournot (1838), John Nash (1950)

Existence of Nash equilibria

Theorem. Assume:
- the sets of available strategies for the two players, A, B ⊂ R^n, are compact and convex
- the payoff functions Φ^A, Φ^B : A × B → R are continuous
- for each a ∈ A, the set of best replies R^B(a) ⊆ B is convex
- for each b ∈ B, the set of best replies R^A(b) ⊆ A is convex

Then the game admits at least one Nash equilibrium.

Proof. If the best reply maps are single valued, the map (a, b) ↦ (R^A(b), R^B(a)) is a continuous map from the compact convex set A × B into itself. By Brouwer's fixed point theorem, it has a fixed point (a*, b*).

If R^A, R^B are multivalued with convex values, by Kakutani's fixed point theorem there exists (a*, b*) ∈ R^A(b*) × R^B(a*).
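A minimal numerical sketch of this fixed-point idea (not taken from the slides; the game and all numbers are hypothetical): for a Cournot duopoly with inverse demand P(q) = 1 − q_1 − q_2 and marginal costs c_1, c_2, the best replies have the closed form R_i(q_j) = max(0, (1 − c_i − q_j)/2), and iterating the best reply map converges to its fixed point, a Nash equilibrium.

def best_reply(q_other, c):
    # best reply of a firm with marginal cost c: maximize q*(1 - q - q_other) - c*q over q >= 0
    return max(0.0, (1.0 - c - q_other) / 2.0)

def nash_by_best_reply_iteration(c1=0.1, c2=0.2, tol=1e-12, max_iter=1000):
    # iterate (q1, q2) -> (R1(q2), R2(q1)) until the pair stops moving
    q1, q2 = 0.0, 0.0
    for _ in range(max_iter):
        new_q1, new_q2 = best_reply(q2, c1), best_reply(q1, c2)
        if abs(new_q1 - q1) + abs(new_q2 - q2) < tol:
            return new_q1, new_q2
        q1, q2 = new_q1, new_q2
    return q1, q2

q1, q2 = nash_by_best_reply_iteration()
print("best-reply iteration:", q1, q2)
print("closed form:         ", (1 - 2*0.1 + 0.2) / 3, (1 - 2*0.2 + 0.1) / 3)

For this particular game the best-reply iteration is a contraction; in general it need not converge, which is why the existence result rests on fixed point theorems rather than on iteration.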

One-dimensional version of Brouwer's and Kakutani's theorems

[figure, two panels: Brouwer 1910 — a continuous map f of [0, 1] into itself and its fixed point x* = f(x*); Kakutani 1941 — a convex-valued multifunction F of [0, 1] into itself, its fixed point x* ∈ F(x*), and a continuous approximation f_ε of F]

Luitzen Egbertus Jan Brouwer (1910), Shizuo Kakutani (1941), Arrigo Cellina (1969)

Stackelberg equilibrium

Player A (the leader) announces his strategy a ∈ A in advance.
Player B (the follower) adopts his best reply: b ∈ R^B(a) ⊆ B.

What is the best strategy for the leader?

    max_{a ∈ A} Φ^A(a, R^B(a))

[figure: level curves of Φ^A and Φ^B, the follower's best reply curve R^B(a), and the Stackelberg point S = (a*, b*)]

A couple of strategies (a*, b*) is a Stackelberg equilibrium if

    b* ∈ R^B(a*)  and  Φ^A(a*, b*) ≥ Φ^A(a, b)  for all a ∈ A, b ∈ R^B(a)
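Continuing the same hypothetical Cournot example from above, a sketch of the leader's problem max_a Φ^A(a, R^B(a)): the follower's reply is substituted into the leader's payoff, which is then maximized, here simply by a grid search.

def follower_reply(q1, c2=0.2):
    # follower's best reply to the announced leader quantity q1
    return max(0.0, (1.0 - c2 - q1) / 2.0)

def leader_payoff(q1, c1=0.1, c2=0.2):
    q2 = follower_reply(q1, c2)
    price = max(0.0, 1.0 - q1 - q2)
    return q1 * price - c1 * q1

# grid search for max_{q1} leader_payoff(q1, R^B(q1))
grid = [i / 10000.0 for i in range(10001)]          # q1 in [0, 1]
q1_star = max(grid, key=leader_payoff)
print("Stackelberg leader/follower:", q1_star, follower_reply(q1_star))
# closed form for these numbers: q1* = (1 - 2*c1 + c2)/2 = 0.5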

Game theoretical models in Economics and Finance

- Sellers (choosing the prices charged) vs. buyers (choosing the quantities bought)
- Companies competing for market share (choosing production levels, prices, amounts spent on research & development or advertising)
- Auctions, bidding games
- Economic growth. Leading player: central bank (choosing the prime rate); followers: private companies (choosing investment levels)
- Debt management. Lenders (choosing the interest rate) vs. borrower (choosing a repayment strategy)
- ...

Differential games in finite time horizon

x(t) ∈ R^n = state of the system

Dynamics:  ẋ(t) = f(x(t), u_1(t), u_2(t)),   x(t_0) = x_0

u_1(·), u_2(·) = controls implemented by the two players

Goal of the i-th player:

    maximize:  J_i = ψ_i(x(T)) − ∫_{t_0}^{T} L_i(x(t), u_1(t), u_2(t)) dt

             = [terminal payoff] − [running cost]

Differential games in infinite time horizon

Dynamics:  ẋ = f(x, u_1, u_2),   x(0) = x_0

Goal of the i-th player:

    maximize:  J_i = ∫_0^{+∞} e^{−γt} Ψ_i(x(t), u_1(t), u_2(t)) dt

(running payoff, exponentially discounted in time)

Example 1: an advertising game

Two companies, competing for market share

state variable:  x(t) ∈ [0, 1] = market share of company 1, at time t

dynamics:  ẋ = (1 − x) u_1 − x u_2

controls:  u_1, u_2 = advertising rates

payoffs:  J_i = N x_i(T) p_i − ∫_0^T c_i u_i(t) dt,   i = 1, 2

N = expected number of items purchased by consumers
p_i = profit made by player i on each sale
c_i = advertising cost
x_i = market share of player i  (x_1 = x, x_2 = 1 − x)
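A small simulation sketch for this example (parameter values are hypothetical, and the advertising rates are held constant rather than chosen optimally), just to show how the payoffs J_1, J_2 are evaluated along a trajectory of the dynamics:

def simulate_advertising(u1, u2, x0=0.5, T=5.0, N=100.0,
                         p=(1.0, 1.0), c=(0.5, 0.5), dt=1e-3):
    # forward Euler for  dx/dt = (1 - x)*u1 - x*u2,  with constant rates u1, u2
    x, cost1, cost2, t = x0, 0.0, 0.0, 0.0
    while t < T:
        x += dt * ((1.0 - x) * u1 - x * u2)
        cost1 += dt * c[0] * u1
        cost2 += dt * c[1] * u2
        t += dt
    J1 = N * x * p[0] - cost1           # x1(T) = x(T)
    J2 = N * (1.0 - x) * p[1] - cost2   # x2(T) = 1 - x(T)
    return J1, J2

print(simulate_advertising(u1=0.3, u2=0.2))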

Example 2: harvesting of marine resources

x(t) = amount of fish in a lake, at time t

dynamics:  ẋ = αx(M − x) − x u_1 − x u_2

controls:  u_1, u_2 = harvesting efforts by the two players

payoffs:  J_i = ∫_0^{+∞} e^{−γt} (p x u_i − c_i u_i) dt

p = selling price of fish
c_i = harvesting cost

Example 3: a producer vs. consumer game

State variables:  p = price,   q = size of the inventory

Controls:  a(t) = production rate,   b(t) = consumption rate

The system evolves in time according to

    ṗ = p ln(q_0 / q),    q̇ = a − b

Here q_0 is an appropriate inventory level.

Payoffs:

    J_producer = ∫_0^{+∞} e^{−γt} [ p(t) b(t) − c(a(t)) ] dt

    J_consumer = ∫_0^{+∞} e^{−γt} [ φ(b(t)) − p(t) b(t) ] dt

c(a) = production cost,  φ(b) = utility to the consumer

Solution concepts

No outcome can be optimal simultaneously for all players.

Different outcomes may arise, depending on
- the information available to the players
- their ability and willingness to cooperate

Nash equilibria (in infinite time horizon)

Seek feedback strategies u_1*(x), u_2*(x) with the following properties.

Given the strategy u_2 = u_2*(x) adopted by the second player, for every initial data x(0) = y the assignment u_1 = u_1*(x) provides a solution to the optimal control problem for the first player:

    max_{u_1(·)} ∫_0^{+∞} e^{−γt} Ψ_1(x, u_1, u_2*(x)) dt

    subject to  ẋ = f(x, u_1, u_2*(x)),   x(0) = y

Similarly, given the strategy u_1 = u_1*(x) adopted by the first player, the feedback control u_2 = u_2*(x) provides a solution to the optimal control problem for the second player.

Solving an optimal control problem by PDE methods

    V(y) = inf_{u(·)} ∫_0^{+∞} e^{−γt} L(x(t), u(t)) dt

    subject to:  ẋ(t) = f(x(t), u(t)),   x(0) = y,   u(t) ∈ U

V(y) = minimum cost, if the system is initially at y

A PDE for the value function

[figure: trajectory starting at y = x(0), reaching ≈ y + ε f(y, ω) at time ε, then continuing along an optimal trajectory x(t)]

If we use the constant control u(t) = ω for t ∈ [0, ε] and then play optimally for t ∈ [ε, +∞[, the total cost is

    J_{ε,ω} = ( ∫_0^ε + ∫_ε^{+∞} ) e^{−γt} L(x(t), u(t)) dt

            = ε L(y, ω) + e^{−γε} V(y + ε f(y, ω)) + o(ε)

            = ε L(y, ω) + (1 − γε) V(y) + ∇V(y) · ε f(y, ω) + o(ε)  ≥  V(y)

Since this is true for every ω ∈ U,

    V(y) ≤ V(y) − γε V(y) + ε min_{ω ∈ U} { L(y, ω) + ∇V(y) · f(y, ω) } + o(ε)

The Hamilton-Jacobi PDE for the value function

Choosing ω = u*(y), the optimal control, we should have equality. Hence

    V(y) = V(y) − γε V(y) + ε min_{ω ∈ U} { L(y, ω) + ∇V(y) · f(y, ω) } + o(ε)

Letting ε → 0 we obtain

    γ V(y) = min_{ω ∈ U} { L(y, ω) + ∇V(y) · f(y, ω) }  ≐  H(y, ∇V(y))

If V(·) is known, the optimal feedback control can be recovered by

    u*(y) = argmin_{u ∈ U} { L(y, u) + ∇V(y) · f(y, u) }

An example

    minimize:  ∫_0^{+∞} e^{−γt} ( φ(x(t)) + u²(t)/2 ) dt

    subject to:  ẋ = f(x) + g(x) u,   x(0) = y,   u(t) ∈ R

The value function V(y) = minimum cost, starting at y, satisfies the PDE

    γ V(x) = min_{ω ∈ R} { φ(x) + ω²/2 + ∇V(x) · (f(x) + g(x) ω) }

           = φ(x) + ∇V(x) · f(x) − (1/2) (∇V(x) · g(x))²

Finding the optimal feedback control

    ẋ = f(x) + g(x) u

[figure: level sets V = const. and the vector ∇V(x) g(x) at a point x]

If V(·) is known, the optimal control can be recovered by

    u*(x) = argmin_{u ∈ R} { ∇V(x) · g(x) u + u²/2 } = −∇V(x) · g(x)
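A rough numerical sketch of the last two slides for a scalar state (the data f(x) = −x, g(x) = 1, φ(x) = x², γ = 0.5 are hypothetical): instead of solving the stationary PDE directly, the value function is approximated by iterating the dynamic programming step from the derivation above on a grid, and the feedback u*(x) = −V′(x) g(x) is then read off by finite differences.

import numpy as np

gamma, eps = 0.5, 0.02
xs = np.linspace(-2.0, 2.0, 201)      # state grid
ws = np.linspace(-3.0, 3.0, 121)      # control grid

f   = lambda x: -x                    # hypothetical drift
g   = lambda x: np.ones_like(x)       # hypothetical control coefficient
phi = lambda x: x**2                  # hypothetical running cost

V = np.zeros_like(xs)
for _ in range(5000):
    best = np.full_like(V, np.inf)
    for w in ws:
        # dynamic programming step with constant control w on a time interval of length eps
        y_next = xs + eps * (f(xs) + g(xs) * w)
        cand = eps * (phi(xs) + 0.5 * w**2) \
               + (1.0 - gamma * eps) * np.interp(y_next, xs, V)   # np.interp clamps at the grid ends
        best = np.minimum(best, cand)
    if np.max(np.abs(best - V)) < 1e-8:
        V = best
        break
    V = best

# feedback recovery as on the slide above: u*(x) = -V'(x) g(x)
u_star = -np.gradient(V, xs) * g(xs)
print("V(0) ≈", V[100], "  u*(0) ≈", u_star[100])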

Solving a differential game by PDE methods

Dynamics:  ẋ = f(x) + g_1(x) u_1 + g_2(x) u_2

Player i seeks to minimize:

    J_i = ∫_0^{+∞} e^{−γt} ( φ_i(x(t)) + u_i²(t)/2 ) dt

Given the strategy u_2*(x) of Player 2, the optimal control problem for Player 1 is:

    minimize J_1 subject to:  ẋ = f(x) + g_1(x) u_1 + g_2(x) u_2*(x)

PDE for the value function V_1(y) = minimum cost starting at y:

    γ V_1 = φ_1 + ∇V_1 · f + (∇V_1 · g_2) u_2* − (1/2) (∇V_1 · g_1)²

Optimal feedback control for Player 1:

    u_1*(x) = −∇V_1(x) · g_1(x)

A system of PDEs for the value functions

The value functions V_1, V_2 for the two players satisfy the system of H-J equations

    γ V_1 = (f · ∇V_1) − (1/2) (g_1 · ∇V_1)² − (g_2 · ∇V_1)(g_2 · ∇V_2) + φ_1

    γ V_2 = (f · ∇V_2) − (1/2) (g_2 · ∇V_2)² − (g_1 · ∇V_1)(g_1 · ∇V_2) + φ_2

Optimal feedback controls:  u_i*(x) = −∇V_i(x) · g_i(x),   i = 1, 2

highly nonlinear, implicit!

Differential games in finite time horizon

Player i seeks to maximize:

    J_i = ψ_i(x(T)) − ∫_{t_0}^{T} ( φ_i(x(t)) + u_i²(t)/2 ) dt

subject to  ẋ = f(x) + g_1(x) u_1 + g_2(x) u_2,   x(t_0) = x_0.

The value functions satisfy the system of H-J equations

    V_{1,t} = φ_1 − (f · ∇V_1) − (1/2) (g_1 · ∇V_1)² − (g_2 · ∇V_1)(g_2 · ∇V_2)

    V_{2,t} = φ_2 − (f · ∇V_2) − (1/2) (g_2 · ∇V_2)² − (g_1 · ∇V_1)(g_1 · ∇V_2)

Terminal conditions:  V_1(T, x) = ψ_1(x),   V_2(T, x) = ψ_2(x)

Optimal feedback controls:  u_i*(x) = ∇V_i(x) · g_i(x),   i = 1, 2

a hard PDE problem to solve!

Linear-quadratic games

Assume that the dynamics is linear and the cost functions are quadratic:

    ẋ = (Ax + b_0) + b_1 u_1 + b_2 u_2,   x(0) = y

    J_i = ∫_0^{+∞} e^{−γt} ( a_i x + x^T P_i x + c_i u_i + u_i²/2 ) dt

Then the system of PDEs has a special solution given by quadratic polynomials:

    V_i(x) = α_i + β_i x + x^T Γ_i x,   i = 1, 2

optimal controls:

    u_i*(x) = argmin_{ω ∈ R} { c_i ω + ω²/2 + (β_i + 2 x^T Γ_i) b_i ω } = −c_i − (β_i + 2 x^T Γ_i) b_i

To find this solution, it suffices to determine the coefficients α_i, β_i, Γ_i by solving a system of algebraic equations.
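A sketch of this last step in the scalar case (all numerical data hypothetical; only one solution of the algebraic system is found, and only if the root finder converges from the chosen initial guess): since both sides of each player's H-J equation are quadratic polynomials in x, requiring the residual to vanish at three sample points yields the algebraic system for α_i, β_i, Γ_i, which is handed to a standard root finder.

from scipy.optimize import fsolve

gamma, A, b0 = 0.1, -0.5, 0.0
b = [1.0, 1.0]      # b_1, b_2
a = [0.0, 0.0]      # a_1, a_2   (linear state cost)
P = [1.0, 1.0]      # P_1, P_2   (quadratic state cost)
c = [0.0, 0.0]      # c_1, c_2   (linear control cost)

def residuals(z):
    al, be, Ga = z[0:2], z[2:4], z[4:6]
    out = []
    for i in (0, 1):
        for x in (-1.0, 0.0, 1.0):                          # three sample points suffice
            dV = [be[k] + 2.0 * Ga[k] * x for k in (0, 1)]  # V_k'(x)
            u  = [-(c[k] + dV[k] * b[k]) for k in (0, 1)]   # u_k*(x)
            drift = A * x + b0 + b[0] * u[0] + b[1] * u[1]
            rhs = a[i] * x + P[i] * x**2 + c[i] * u[i] + 0.5 * u[i]**2 + dV[i] * drift
            lhs = gamma * (al[i] + be[i] * x + Ga[i] * x**2)
            out.append(lhs - rhs)
    return out

z0 = [0.0, 0.0, 0.0, 0.0, 0.3, 0.3]   # guess for alpha_1, alpha_2, beta_1, beta_2, Gamma_1, Gamma_2
sol = fsolve(residuals, z0)
print("alpha:", sol[0:2], "beta:", sol[2:4], "Gamma:", sol[4:6])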

Validity of linear-quadratic approximations?

Assume the dynamics is almost linear

    ẋ = f_0(x) + g_1(x) u_1 + g_2(x) u_2  ≈  (Ax + b_0) + b_1 u_1 + b_2 u_2,   x(0) = y

and the cost functions are almost quadratic

    J_i = ∫_0^{+∞} e^{−γt} ( φ_i(x) + u_i²/2 ) dt  ≈  ∫_0^{+∞} e^{−γt} ( a_i x + x^T P_i x + u_i²/2 ) dt

Is it true that the nonlinear game has a feedback solution close to the one of the linear-quadratic game?

References

On optimal control:
- A. Bressan and B. Piccoli, Introduction to the Mathematical Theory of Control, AIMS Series in Applied Mathematics, 2007.
- W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control, Springer-Verlag, 1975.
- S. Lenhart and J. T. Workman, Optimal Control Applied to Biological Models, Chapman and Hall/CRC, 2007.

On differential games:
- A. Bressan, Noncooperative differential games. A tutorial, Milan J. Math. 79 (2011), 357-427.
- T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory, SIAM, Philadelphia, 1999.
- E. Dockner, S. Jorgensen, N. Van Long, and G. Sorger, Differential Games in Economics and Management Science, Cambridge University Press, 2000.