Alberto Bressan. Department of Mathematics, Penn State University


Non-cooperative Differential Games: A Homotopy Approach

Alberto Bressan, Department of Mathematics, Penn State University

Differential Games

$$\frac{d}{dt}x(t) = G\big(x(t), u_1(t), u_2(t)\big), \qquad x(0) = y, \qquad u_i(t) \in U_i$$

$x$ = state of the system; $u_1, u_2$ = controls of the two players.

Goal of the $i$-th player: maximize

$$J_i = \int_0^\infty e^{-\rho t}\, L_i(x, u_1, u_2)\, dt$$

Nash non-cooperative equilibrium solution, in feedback form

A pair of feedbacks $u_1 = u_1^*(x)$, $u_2 = u_2^*(x)$ is a non-cooperative equilibrium solution if:

Given the dynamics $\dot x = G\big(x, u_1, u_2^*(x)\big)$, the assignment $u_1 = u_1^*(x)$ provides an optimal solution to the optimal control problem

$$\text{maximize} \quad J_1 = \int_0^\infty e^{-\rho t}\, L_1\big(x, u_1, u_2^*(x)\big)\, dt$$

for every initial state $x(0) = y$ of the system. Similarly, $u_2 = u_2^*(x)$ should provide a solution to the corresponding optimal control problem for the second player, given that $u_1 = u_1^*(x)$.
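The fixed-point requirement in this definition can be tested numerically, one player at a time, in a toy scalar case. The sketch below uses hypothetical dynamics and payoff (not from the slides): with player 2 frozen, dynamics $\dot x = -x + u_1$, and payoff $\int_0^\infty e^{-\rho t}(-x^2 - u_1^2)\,dt$ with $\rho = 0.1$, the linear feedback $u_1 = -0.4\,x$ solves the resulting optimal control problem, so perturbing the gain can only lower the payoff.

```python
from math import exp

def payoff(k, rho=0.1, x0=1.0, dt=1e-3, T=40.0):
    """Discounted payoff J = int e^{-rho t} (-x^2 - u^2) dt under the feedback
    u = -k x, for the scalar dynamics xdot = -x + u (explicit Euler scheme)."""
    x, t, J = x0, 0.0, 0.0
    while t < T:
        u = -k * x
        J += exp(-rho * t) * (-(x * x) - u * u) * dt
        x += (-x + u) * dt
        t += dt
    return J

J_star = payoff(0.4)                      # candidate optimal feedback gain
J_low, J_high = payoff(0.0), payoff(1.0)  # perturbed feedback gains
```

In closed form $J(k) = -(1+k^2)/(\rho + 2 + 2k)$ for $x_0 = 1$, which is maximized exactly at $k = 0.4$; the simulation reproduces this ordering.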

Basic assumptions:

- each player has complete information about the state of the system;
- players do not communicate, or make deals with each other.

Goal of the analysis:

- provide indications of the best, or the rational, course of action for each player;
- predict the players' behavior in real-life situations;
- set the rules so that the outcome of the game is favorable from the higher standpoint of the collectivity.

Is the Nash non-cooperative equilibrium an appropriate model?

PDE approach: study the value functions

$V_i(y)$ = value function for the $i$-th player (= expected payoff, if the game starts at $y$)

$$\dot x = G_1(x, u_1) + G_2(x, u_2), \qquad J_i \doteq \int_0^\infty e^{-\rho t}\big[ L_{i1}(x, u_1) + L_{i2}(x, u_2) \big]\, dt$$

Optimal feedback controls:

$$u_i^*(x, \nabla V_i) = \operatorname*{argmax}_{\omega \in U_i} \Big\{ \nabla V_i(x) \cdot G_i(x, \omega) + L_{ii}(x, \omega) \Big\}$$

Hamilton-Jacobi equations for the value functions

$$\rho V_1(x) = H^{(1)}(x, \nabla V_1, \nabla V_2), \qquad \rho V_2(x) = H^{(2)}(x, \nabla V_1, \nabla V_2)$$

$$H^{(1)}(x, \xi_1, \xi_2) = \xi_1 \cdot G\big(x, u_1^*(x, \xi_1), u_2^*(x, \xi_2)\big) + L_1\big(x, u_1^*(x, \xi_1), u_2^*(x, \xi_2)\big)$$

$$H^{(2)}(x, \xi_1, \xi_2) = \xi_2 \cdot G\big(x, u_1^*(x, \xi_1), u_2^*(x, \xi_2)\big) + L_2\big(x, u_1^*(x, \xi_1), u_2^*(x, \xi_2)\big)$$

A highly nonlinear system of PDEs: HARD!

Zero-sum games

Assume: $L_1(x, u_1, u_2) = -L_2(x, u_1, u_2)$. Then $V_1(x) = -V_2(x)$.

The value function can be characterized as the unique viscosity solution of a scalar H-J equation

$$\rho V(x) = H\big(x, \nabla V(x)\big)$$

Linear-quadratic differential games (T. Basar, G. Olsder, ...)

$$\dot x = Ax + B_1 u_1 + B_2 u_2, \qquad x \in \mathbb{R}^n$$

$$J_i = \int_0^\infty e^{-\rho t}\big[ x^T P_i x + x^T Q_i u + u_i^T R_i u_i + h_i^T x + k_i^T u \big]\, dt$$

Value functions can be found within the family of quadratic polynomials

$$V_i(x) = x^T M_i x + b_i^T x + c_i$$

This reduces the problem to a system of algebraic equations for $M_i, b_i, c_i$.
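As a minimal sketch of how the algebraic system arises (a hypothetical scalar example, not the slides' general setup): with dynamics $\dot x = ax + u_1 + u_2$, payoffs $\int_0^\infty e^{-\rho t}(-q_i x^2 - u_i^2)\,dt$ to be maximized, and the quadratic guess $V_i(x) = -m_i x^2$, the feedbacks become $u_i^* = -m_i x$ and the H-J equations reduce to two coupled quadratic equations for $(m_1, m_2)$, which a Newton iteration can solve.

```python
def F(m, a=-1.0, rho=0.1, q=(1.0, 1.0)):
    """Residuals of the coupled algebraic equations
    m_i^2 + 2 m_i (a - m_1 - m_2) + q_i - rho m_i = 0, i = 1, 2."""
    m1, m2 = m
    drift = a - m1 - m2
    return [m1 * m1 + 2 * m1 * drift + q[0] - rho * m1,
            m2 * m2 + 2 * m2 * drift + q[1] - rho * m2]

def newton(m, tol=1e-12, h=1e-7):
    """Damped-free Newton iteration with a finite-difference 2x2 Jacobian."""
    for _ in range(100):
        f = F(m)
        if max(abs(v) for v in f) < tol:
            break
        J = [[0.0, 0.0], [0.0, 0.0]]
        for j in range(2):
            mp = list(m)
            mp[j] += h
            fp = F(mp)
            for i in range(2):
                J[i][j] = (fp[i] - f[i]) / h
        det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
        d1 = (f[0] * J[1][1] - J[0][1] * f[1]) / det  # Cramer's rule
        d2 = (J[0][0] * f[1] - f[0] * J[1][0]) / det
        m = [m[0] - d1, m[1] - d2]
    return m

m1, m2 = newton([0.5, 0.5])
```

In the symmetric case used here ($a = -1$, $\rho = 0.1$, $q_1 = q_2 = 1$) the system collapses to $3m^2 + 2.1m - 1 = 0$, which the iterate should satisfy.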

Beyond Linear-Quadratic Games

Finite horizon problem with terminal cost: a backward Cauchy problem for the value functions

$$\partial_t V_i + \nabla_x V_i \cdot g(x, u_1^*, u_2^*) = -L_i(x, u_1^*, u_2^*), \qquad V_i(T, x) = \psi_i(x), \qquad i = 1, 2$$

Is the Cauchy problem well posed?

Setting $\xi_i = \partial_x V_i$, with terminal data $\xi_i(T, x) = \partial_x \psi_i(x)$, one obtains a system of conservation laws

$$(\xi_i)_t + \big[ \xi_i\, g(x, \xi_1, \xi_2) + L_i(x, \xi_1, \xi_2) \big]_x = 0$$

- In one space dimension, if $\partial_x V_1 \cdot \partial_x V_2 > 0$, the system is hyperbolic and the Cauchy problem is well posed.
- In one space dimension, if $\partial_x V_1 \cdot \partial_x V_2 < 0$, the system is NOT hyperbolic and the Cauchy problem is ill posed.
- In several space dimensions, generically the system is NOT hyperbolic and the Cauchy problem is ill posed.

A. Bressan, W. Shen, Small BV solutions of hyperbolic non-cooperative differential games, SIAM J. Control Optim. 43 (2004), 194-215.

A. Bressan, W. Shen, Semi-cooperative strategies for differential games, Intern. J. Game Theory 32 (2004), 561-593.
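A quick numerical way to test hyperbolicity of such a $2\times 2$ system (an illustrative sketch, not part of the slides): form the Jacobian of the flux by finite differences and check whether its eigenvalues are real, which for a $2\times 2$ matrix is the condition $\mathrm{tr}^2 - 4\det \ge 0$.

```python
def is_hyperbolic(F, xi, h=1e-6):
    """Check for real eigenvalues of the 2x2 flux Jacobian dF/dxi at state xi.

    F  : callable [xi1, xi2] -> [F1, F2], the flux of the 2x2 system
    A 2x2 matrix has real eigenvalues iff trace^2 - 4*det >= 0.
    """
    f0 = F(xi)
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        xp = list(xi)
        xp[j] += h
        fp = F(xp)
        for i in range(2):
            J[i][j] = (fp[i] - f0[i]) / h
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    return tr * tr - 4.0 * det >= 0.0

# hypothetical fluxes: two decoupled Burgers equations (hyperbolic)
hyp = is_hyperbolic(lambda x: [x[0] ** 2 / 2, x[1] ** 2 / 2], [1.0, 2.0])
# a rotation flux with eigenvalues +-i (elliptic, i.e. NOT hyperbolic)
ell = is_hyperbolic(lambda x: [-x[1], x[0]], [1.0, 2.0])
```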

A Homotopy Technique

Basic setting (one space dimension):

$$\dot x = G(x, u_1, u_2, \theta) = G_1(x, u_1, \theta) + \theta\, G_2(x, u_2, \theta)$$

$$L_i = L_{i1}(x, u_1, \theta) + L_{i2}(x, u_2, \theta)$$

Goal of the $i$-th player: minimize

$$J_i \doteq \int_0^\infty e^{-\rho t}\, L_i(x, u_1, u_2, \theta)\, dt$$

$\theta = 0$ $\Rightarrow$ the second player adopts the myopic strategy

$$u_2 = u_2^\sharp(x) = \operatorname*{argmin}_{\omega \in U_2} L_{22}(x, \omega)$$

This leaves an optimal control problem for the first player:

minimize $\displaystyle \int_0^\infty e^{-\rho t}\, L_1\big(x, u_1, u_2^\sharp(x)\big)\, dt$ subject to $\dot x = G_1(x, u_1)$.

The value function $V_1(x)$ satisfies

$$\rho V_1 = H^{(1)}(x, V_1'), \qquad H^{(1)}(x, \xi) = \min_{\omega \in U_1} \big\{ \xi\, G_1(x, \omega) + L_1\big(x, \omega, u_2^\sharp(x)\big) \big\}$$

Assume: on a fixed interval $I = [a, b]$, for $\theta = 0$ the optimal feedback $u_1^*(x)$ yields a dynamics $\dot x = G_1\big(x, u_1^*(x)\big)$ having a single stable equilibrium point $\bar x$.

[Figure: graph of $V_1$ over $[a, b]$, with the dynamics converging to the stable equilibrium $\bar x$]

Add smoothness assumptions and transversality assumptions (generically true).

QUESTION: what happens for $\theta > 0$?

An implicit system of ODEs

$$\rho V_i = H^{(i)}(x, V_1', V_2'), \qquad i = 1, 2$$

$$u_i^*(x, \xi_i) = \operatorname*{argmin}_{\omega} \big\{ \xi_i\, G_i(x, \omega) + L_{ii}(x, \omega) \big\}, \qquad H^{(i)}(x, \xi_1, \xi_2) = \xi_i\, G(x, u_1^*, u_2^*) + L_i(x, u_1^*, u_2^*)$$

Set $\xi_i = V_i'$ and differentiate w.r.t. $x$:

$$\rho\, \xi_i = H^{(i)}_x + H^{(i)}_{\xi_1}\, \xi_1' + H^{(i)}_{\xi_2}\, \xi_2'$$

Since $H^{(1)}_{\xi_1} = H^{(2)}_{\xi_2} = G$, one obtains

$$\begin{pmatrix} G & H^{(1)}_{\xi_2} \\[2pt] H^{(2)}_{\xi_1} & G \end{pmatrix} \begin{pmatrix} \xi_1' \\[2pt] \xi_2' \end{pmatrix} = \begin{pmatrix} \rho\,\xi_1 - H^{(1)}_x \\[2pt] \rho\,\xi_2 - H^{(2)}_x \end{pmatrix}$$

Bifurcation analysis

Write the system as

$$\begin{pmatrix} G & \theta^2 \alpha \\ \beta & G \end{pmatrix} \begin{pmatrix} \xi' \\ \eta' \end{pmatrix} = \begin{pmatrix} \varphi \\ \psi \end{pmatrix}$$

If the determinant of the matrix does not vanish, this is equivalent to

$$\begin{pmatrix} \xi' \\ \eta' \end{pmatrix} = \frac{1}{G^2 - \theta^2 \alpha\beta} \begin{pmatrix} G & -\theta^2 \alpha \\ -\beta & G \end{pmatrix} \begin{pmatrix} \varphi \\ \psi \end{pmatrix}$$

Goal: find regular solutions in a neighborhood of a point $P = (\bar x, \bar\xi, \bar\eta)$ where $G = \varphi = 0$.

CASE 1: $\alpha\beta(P) < 0$ $\Rightarrow$ for $\theta > 0$ infinitely many solutions exist.

CASE 2: $\alpha\beta(P) > 0$ $\Rightarrow$ under a generic transversality condition, for $\theta > 0$ small a unique solution exists.

The implicit system of ODEs

$$G\,\frac{d\xi}{dx} + \theta^2\alpha\,\frac{d\eta}{dx} = \varphi, \qquad \beta\,\frac{d\xi}{dx} + G\,\frac{d\eta}{dx} = \psi$$

can be written as a Pfaffian system:

$$\omega_1 = -\varphi\, dx + G\, d\xi + \theta^2\alpha\, d\eta = 0, \qquad \omega_2 = -\psi\, dx + \beta\, d\xi + G\, d\eta = 0$$

Solutions are obtained by concatenating trajectories of the vector field

$$v^\theta = \omega_1 \times \omega_2 = \begin{pmatrix} G^2 - \theta^2\alpha\beta \\ G\varphi - \theta^2\alpha\psi \\ G\psi - \beta\varphi \end{pmatrix}$$
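As a sanity check on these components (an illustrative sketch, not from the slides), one can verify numerically that the cross product of the two 1-forms, written as coefficient vectors in the $(dx, d\xi, d\eta)$ basis, reproduces the stated components of $v^\theta$:

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# arbitrary sample values for G, alpha, beta, phi, psi, theta
G, al, be, ph, ps, th = 0.7, 1.3, -0.4, 0.2, 0.9, 0.1

w1 = [-ph, G, th ** 2 * al]   # omega_1 coefficients on (dx, d_xi, d_eta)
w2 = [-ps, be, G]             # omega_2 coefficients
v = cross(w1, w2)

# the three components of v_theta as written on the slide
expected = [G ** 2 - th ** 2 * al * be,
            G * ph - th ** 2 * al * ps,
            G * ps - be * ph]
```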

Seek: a concatenation of trajectories of $v^\theta$ providing the graph of a smooth function $x \mapsto (\xi(x), \eta(x))$.

Note: along the two surfaces

$$\Sigma^\pm_\theta \doteq \big\{ (x, \xi, \eta)\,;\ G = \pm\,\theta\sqrt{\alpha\beta}\, \big\}$$

the first component of $v^\theta$ vanishes, hence $v^\theta$ is vertical. These surfaces can only be crossed along the two curves where $v^\theta = 0$:

$$\gamma^\pm_\theta \doteq \Big\{ (x, \xi, \eta)\,;\ G = \pm\,\theta\sqrt{\alpha\beta}, \ \ \varphi = \pm\,\theta\sqrt{\tfrac{\alpha}{\beta}}\;\psi \Big\}$$

Find a heteroclinic orbit joining a point $p^- \in \gamma^-_\theta$ with a point $p^+ \in \gamma^+_\theta$.

[Figure: the surfaces $\Sigma^-_\theta$, $\Sigma^+_\theta$ with the curves $\gamma^-_\theta$, $\gamma^+_\theta$ and the heteroclinic orbit from $p^-$ to $p^+$]

Under a generic condition, the stable and unstable manifolds intersect transversally, hence a unique heteroclinic orbit exists.

Example 1: a linear-quadratic game

$$\dot x = -x + u + \theta v$$

Payoffs to be maximized:

$$J_u \doteq \int_0^\infty e^{-\rho t}\Big( ax - \frac{u^2}{2} \Big)\, dt, \qquad J_v \doteq \int_0^\infty e^{-\rho t}\Big( bx - \frac{v^2}{2} \Big)\, dt, \qquad a, b \neq 0$$

Value functions: $U(x)$, $V(x)$; set $\xi = U'$, $\eta = V'$. The implicit system becomes

$$\begin{pmatrix} \xi - x + \theta^2\eta & \theta^2\xi \\ \eta & \xi - x + \theta^2\eta \end{pmatrix} \begin{pmatrix} \xi' \\ \eta' \end{pmatrix} = \begin{pmatrix} (1+\rho)\xi - a \\ (1+\rho)\eta - b \end{pmatrix}$$

$$G = \xi - x + \theta^2\eta, \qquad \varphi = (1+\rho)\xi - a, \qquad G_\eta = \theta^2, \qquad \varphi_\eta = 0$$

When $\theta = 0$, the singular point is

$$(\bar x, \bar\xi, \bar\eta) = \Big( \frac{a}{1+\rho},\ \frac{a}{1+\rho},\ \frac{b}{1+\rho} \Big)$$

$$G = \xi - x + \theta^2\eta, \qquad \alpha = \xi, \qquad \beta = \eta, \qquad \varphi = (1+\rho)\xi - a, \qquad \psi = (1+\rho)\eta - b$$

Check: $\alpha\beta = \dfrac{ab}{(1+\rho)^2}$.

- $ab > 0$ $\Rightarrow$ a unique solution also for $\theta > 0$;
- $ab < 0$ $\Rightarrow$ infinitely many solutions for $\theta > 0$.
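A small numerical sanity check (with hypothetical parameter values): at the $\theta = 0$ singular point all of $G$, $\varphi$, $\psi$ vanish, and the sign of $\alpha\beta = ab/(1+\rho)^2$ classifies the two cases above.

```python
def singular_point(a, b, rho):
    """Theta = 0 singular point of Example 1 and the product alpha*beta there."""
    xbar = a / (1 + rho)
    xi, eta = a / (1 + rho), b / (1 + rho)
    G = xi - xbar                # theta = 0, so the theta^2 * eta term drops
    phi = (1 + rho) * xi - a
    psi = (1 + rho) * eta - b
    alpha_beta = xi * eta        # alpha = xi, beta = eta
    return G, phi, psi, alpha_beta

G, phi, psi, ab_pos = singular_point(1.0, 2.0, 0.5)   # a*b > 0: unique solution
_, _, _, ab_neg = singular_point(1.0, -2.0, 0.5)      # a*b < 0: infinitely many
```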

Example 2: a sticky price game

$$\dot p = (b - a)\, p$$

$p$ = price, $a(t)$ = production rate, $b(t)$ = consumption rate.

$$J_{prod} = \int_0^\infty e^{-\rho t}\big[ p(t)\, b(t) - c(a(t)) \big]\, dt, \qquad J_{cons} = \int_0^\infty e^{-\rho t}\big[ \phi(b(t)) - p(t)\, b(t) \big]\, dt$$

$c(a)$ = production cost, $\phi(b)$ = utility for the consumer.

Optimal feedback strategies

$V(p)$ = value function for the producer, $W(p)$ = value function for the consumer.

$$a^*(p, V') = \operatorname*{argmax}_a \big\{ V'(p)\,(-pa) - c(a) \big\}, \qquad b^*(p, W') = \operatorname*{argmax}_b \big\{ W'(p)\, pb + \phi(b) - p\, b \big\}$$

Myopic strategy for the consumer:

$$b^\sharp(p) = \operatorname*{argmax}_b \big\{ \phi(b) - p\, b \big\}$$

A small coalition of consumers

- A fraction $\theta \in [0, 1]$ of all consumers join together and play a long-term strategy.
- The remaining fraction $(1 - \theta)$ consists of individual consumers who adopt the myopic strategy.

$$\dot p = p\,\big[ (1-\theta)\, b^\sharp(p) + \theta\, b - a \big]$$

$$J_{prod} = \int_0^\infty e^{-\rho t}\big[ (1-\theta)\, p\, b^\sharp(p) + \theta\, p\, b - c(a) \big]\, dt, \qquad J_{cons} = \int_0^\infty e^{-\rho t}\big[ \phi(b) - p\, b \big]\, dt$$

$V(p)$ = value function for the producer, $W(p)$ = value function for the strategic consumer group.

$\theta = 0$ $\Rightarrow$ an optimal control problem for the producer.

Example: take $c(a) = \dfrac{a^2}{2}$, $\phi(b) = 2\sqrt{b}$.

Myopic strategy for the consumer: $b^\sharp(p) = \dfrac{1}{p^2}$.

The value function for the producer satisfies

$$\rho V = \frac{1 + V'(p)}{p} + \frac{\big[\, p\, V'(p) \,\big]^2}{2}$$
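For each pair $(p, V)$ this is a quadratic equation in $V'(p)$, so there are (at most) two solution branches. A minimal numerical sketch (illustrative, not from the slides), solving $\tfrac{p^2}{2} w^2 + \tfrac{w}{p} + \tfrac{1}{p} - \rho V = 0$ for $w = V'(p)$:

```python
from math import sqrt

def vprime_branches(p, V, rho):
    """Solve (p^2/2) w^2 + w/p + (1/p - rho*V) = 0 for w = V'(p).

    Returns the two real roots, or None if the discriminant is negative."""
    A = p * p / 2.0
    B = 1.0 / p
    C = 1.0 / p - rho * V
    disc = B * B - 4.0 * A * C
    if disc < 0.0:
        return None
    r = sqrt(disc)
    return ((-B - r) / (2.0 * A), (-B + r) / (2.0 * A))

roots = vprime_branches(2.0, 1.0, 0.5)   # hypothetical point (p, V) and rho
```

Feeding each root back into the quadratic should give a zero residual; continuing one of the two branches in $p$ produces the two curves shown on the next slide.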

[Figure: the two solution branches of the value function $V(p)$, plotted for $0 \le p \le 8$, $0 \le V \le 2$]

Choose the branch corresponding to a stable dynamics: $\dot p\, (p - p_0) < 0$.

Bifurcation Problem

$$\rho V = \frac{(p V')^2}{2} + (1 + V') \left[ \frac{1-\theta}{p} + \frac{\theta}{(1 - \theta W')^2\, p} \right]$$

$$\rho W = p^2\, V' W' + \frac{1-\theta}{p}\, W' + \frac{1 - \theta W'}{(1 - \theta W')^2\, p}$$

Set $\xi \doteq V'$, $\eta \doteq W' < 0$. At the singular point,

$$\alpha\beta = \frac{2(p^3 - 1)\,\eta}{p^2} < 0 \qquad \Longrightarrow \qquad \text{infinitely many solutions for } \theta > 0$$

Open questions:

- If infinitely many solutions exist, can we select a best one?
- What happens in the generic one-dimensional case, when the optimal feedback has jumps?

[Figure: a value function $V_1$ on $[a, b]$ whose optimal feedback jumps at two points $x_1$, $x_2$]

- Can a branch of solutions be continued for all $\theta \in [0, 1]$? How do we recognize a critical value of $\theta$ where a further bifurcation occurs?
- Study bifurcation problems in two or more space variables. Price-Inventory game: state of the system described by $(p, I)$.

Best reply map: $B(u_1, u_2) = (\hat u_1, \hat u_2)$

$\hat u_1(\cdot)$ = best reply of player 1, to the strategy $u_2(\cdot)$ of the second player;
$\hat u_2(\cdot)$ = best reply of player 2, to the strategy $u_1(\cdot)$ of the first player.

The couple of strategies $(u_1, u_2)$ is a Nash equilibrium solution iff it provides a fixed point of $B$:

$$B(u_1, u_2) = (u_1, u_2)$$

For a fixed $\theta > 0$ small, does the sequence of iterates

$$B(u_1^0, u_2^0),\quad B^2(u_1^0, u_2^0),\quad B^3(u_1^0, u_2^0),\ \ldots$$

converge? Here the initial guess is $(u_1^0, u_2^0)$, i.e. the solution for $\theta = 0$.
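To illustrate the best-reply iteration in the simplest possible setting (a hypothetical static Cournot-style example, not the differential game above): when each player's best reply is an affine map and their composition is a contraction, the iterates $B^n(u^0_1, u^0_2)$ converge to the Nash fixed point.

```python
def best_reply_iteration(B1, B2, u0, n=200):
    """Iterate the best reply map B(u1, u2) = (B1(u2), B2(u1)) n times."""
    u1, u2 = u0
    for _ in range(n):
        u1, u2 = B1(u2), B2(u1)
    return u1, u2

# hypothetical best replies: B1(u2) = (1 - u2)/2, B2(u1) = (1 - u1)/2;
# the Nash fixed point solves u = (1 - u)/2, i.e. u = 1/3 for both players
u1, u2 = best_reply_iteration(lambda v: (1 - v) / 2,
                              lambda v: (1 - v) / 2,
                              (0.0, 0.0))
```

Each sweep halves the distance to the fixed point, so after a few hundred iterations the pair is numerically indistinguishable from $(1/3, 1/3)$.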