On the Stability of the Best Reply Map for Noncooperative Differential Games


Alberto Bressan and Zipeng Wang

Department of Mathematics, Penn State University, University Park, PA 16802, USA
DPMMS, University of Cambridge, Cambridge, CB3 0WB, UK

e-mails: bressan@math.psu.edu, zw55@cam.ac.uk

January 3,

Abstract

Consider a differential game for two players in infinite time horizon, with exponentially discounted costs. A pair of feedback controls $(u_1^*(x), u_2^*(x))$ is a Nash equilibrium solution if $u_1^*$ is the best strategy for Player 1 in reply to $u_2^*$, and $u_2^*$ is the best strategy for Player 2 in reply to $u_1^*$. The aim of the present note is to investigate the stability of the best reply map $(u_1, u_2) \mapsto (R_1(u_2), R_2(u_1))$. For linear-quadratic games, we derive a condition which yields asymptotic stability, within the class of feedbacks which are affine functions of the state $x \in \mathbb{R}^n$. An example shows that stability is lost as soon as nonlinear perturbations are considered.

1 Introduction

Consider a non-cooperative differential game for two players, where the state of the system evolves according to

$$\dot x = f(x, u_1, u_2). \tag{1.1}$$

Here $x \in \mathbb{R}^n$, while the control functions $u_1$ and $u_2$ range over the domains $u_1 \in U_1 \subseteq \mathbb{R}^{m_1}$, $u_2 \in U_2 \subseteq \mathbb{R}^{m_2}$. The upper dot denotes a derivative w.r.t. time. The goal of each player is to minimize his own cost functional

$$J_i = J_i(x, u) = \int_0^\infty e^{-\gamma t} L_i\big(x(t), u_1(t), u_2(t)\big)\, dt, \qquad i = 1, 2. \tag{1.2}$$

This corresponds to an infinite horizon problem, with exponentially discounted cost. Given a feedback strategy $x \mapsto u_1(x)$ implemented by Player 1, we say that the feedback strategy $x \mapsto u_2(x)$ is a best reply for Player 2, and write $u_2 \in R_2(u_1)$, if $u_2$ is optimal for the problem

$$\text{minimize:} \quad J_2 = \int_0^\infty e^{-\gamma t} L_2\big(x(t), u_1(x(t)), u_2\big)\, dt, \tag{1.3}$$

with dynamics

$$\dot x = f(x, u_1(x), u_2). \tag{1.4}$$

More precisely, we are here requiring that

(i) for every initial state $x_0 \in \mathbb{R}^n$, the Cauchy problem

$$\dot x = f(x, u_1(x), u_2(x)), \qquad x(0) = x_0, \tag{1.5}$$

has at least one Carathéodory solution, defined for all times $t \ge 0$;

(ii) every Carathéodory solution of (1.5) is optimal in the sense that, given the initial state $x_0$ and the feedback $u_1$, the control $u_2$ minimizes the cost $J_2$ among all strategies available to Player 2.

Given a feedback strategy $x \mapsto u_2(x)$ implemented by Player 2, a best reply $u_1 \in R_1(u_2)$ for Player 1 is defined in a similar way. Two feedback strategies $u_1 = u_1^*(x)$, $u_2 = u_2^*(x)$ constitute a Nash equilibrium solution to the differential game (1.1)-(1.2) if at the same time one has

$$u_1^* \in R_1(u_2^*), \qquad u_2^* \in R_2(u_1^*). \tag{1.6}$$

One can regard the Nash solution as a fixed point of the best reply map

$$(u_1, u_2) \mapsto \big(R_1(u_2), R_2(u_1)\big). \tag{1.7}$$

Assuming that this best reply is unique, a natural question is whether the fixed point is asymptotically stable. In other words, let $(u_1^{(0)}, u_2^{(0)})$ be a pair of feedback controls sufficiently close to a Nash equilibrium $(u_1^*, u_2^*)$, and define the iterates

$$\big(u_1^{(k+1)}, u_2^{(k+1)}\big) \doteq \big(R_1(u_2^{(k)}), R_2(u_1^{(k)})\big).$$

Is it true that $(u_1^{(k)}, u_2^{(k)}) \to (u_1^*, u_2^*)$ as $k \to \infty$?

In the present note we provide a positive answer for a class of games with linear dynamics and quadratic costs, as long as the feedback controls $u_i^{(k)}$ are affine functions of the state $x \in \mathbb{R}^n$. In general, this convergence is not expected to hold for nonlinear systems. Indeed, we show by an example that, even for a linear-quadratic game, the fixed point of the best reply map is not stable w.r.t. nonlinear perturbations of the feedback controls, even if these perturbations have uniformly bounded support and are arbitrarily small in any $\mathcal{C}^k$ norm.

The paper is organized as follows. Section 2 reviews the basic theory of linear-quadratic optimal control problems in infinite time horizon, focusing on how the optimal feedback changes as a consequence of small variations in the dynamics and in the cost function.
In Section 3 we briefly review the equations determining a Nash equilibrium solution to a linear-quadratic differential game. After these preliminaries, Section 4 analyzes the stability of the best reply maps for linear-quadratic games, within the class of feedback controls which are affine functions of the state. Finally, in Section 5 we provide an example showing that, even for a simple linear-quadratic game, the Nash equilibrium feedbacks are not stable w.r.t. nonlinear perturbations.

For a general introduction to differential games we refer to [1, 5]. Linear-quadratic games have an extensive literature, see for example [6, 11] and references therein. For nonlinear problems, the papers [7, 8] analyze the relations between solutions to differential games and optimal feedback controls. Examples of feedback solutions to nonlinear differential games in infinite time horizon were studied in [4, 9]. For a special class of nearly decoupled games, an iterative procedure yielding a Nash solution can be found in [10].

2 Review of the linear-quadratic optimal control problem

Consider a system on $\mathbb{R}^n$, with dynamics

$$\dot x = Ax + Bu + c \tag{2.1}$$

and a cost functional

$$J = J(x, u) \doteq \int_0^\infty e^{-\gamma t}\Big[x^\top P x + x^\top Q u + \frac{|u|^2}{2} + a^\top x + b^\top u\Big]\, dt. \tag{2.2}$$

Here $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$ are column vectors, while the superscript $\top$ denotes transposition. Moreover $A, P \in \mathbb{R}^{n\times n}$, $B, Q \in \mathbb{R}^{n\times m}$, $a, c \in \mathbb{R}^n$, $b \in \mathbb{R}^m$. Without loss of generality we can assume $P = P^\top$. We also remark that, if $R \in \mathbb{R}^{m\times m}$ is a symmetric, strictly positive definite matrix, then the more general cost functional

$$J \doteq \int_0^\infty e^{-\gamma t}\Big[x^\top P x + x^\top Q u + \frac12 u^\top R u + a^\top x + b^\top u\Big]\, dt$$

can be reduced to (2.2). Indeed, it suffices to use a new set of control variables $v = \Lambda u$, with $\Lambda^\top \Lambda = R$. This yields $\frac12 |v|^2 = \frac12 u^\top \Lambda^\top \Lambda u = \frac12 u^\top R u$.

Let $V$ be the value function for the optimization problem (2.1)-(2.2). More precisely, let $t \mapsto x(t, u, x_0)$ be the solution to (2.1) corresponding to the control function $u$ and the initial data

$$x(0) = x_0. \tag{2.3}$$

We then define

$$V(x_0) \doteq \inf_u J\big(x(\cdot, u, x_0), u\big). \tag{2.4}$$

Calling $\xi \doteq \nabla V(x) = (V_{x_1}, \ldots, V_{x_n})$ the row vector describing the gradient of $V$, the optimal feedback control $u^*$ is provided by

$$u^*(x, \xi) = \underset{\omega}{\mathrm{argmin}} \Big\{ \xi B \omega + x^\top Q \omega + \frac{|\omega|^2}{2} + b^\top \omega \Big\}. \tag{2.5}$$

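Differentiating w.r.t. $\omega$, we obtain

$$\xi B + x^\top Q + (u^*)^\top + b^\top = 0, \qquad u^*(x, \xi) = -\big(B^\top \xi^\top + Q^\top x + b\big). \tag{2.6}$$

It is well known that, if the value function $V$ is continuously differentiable, then it provides a solution to the Hamilton-Jacobi equation

$$\gamma V = \nabla V \cdot \dot x + L(x, u^*), \tag{2.7}$$

where

$$\dot x = Ax + Bu^* + c, \tag{2.8}$$

$$L(x, u) \doteq x^\top P x + x^\top Q u + \frac{|u|^2}{2} + a^\top x + b^\top u. \tag{2.9}$$

After some calculations, this yields

$$\begin{aligned} \gamma V ={}& \nabla V \Big[ Ax - B\big(B^\top \nabla V^\top + Q^\top x + b\big) + c \Big] + x^\top P x + a^\top x \\ & - \big(x^\top Q + b^\top\big)\big(B^\top \nabla V^\top + Q^\top x + b\big) + \frac12 \big(\nabla V B + x^\top Q + b^\top\big)\big(B^\top \nabla V^\top + Q^\top x + b\big). \end{aligned} \tag{2.10}$$

This can be written as

$$\gamma V = \nabla V \tilde A\, \nabla V^\top + \nabla V \tilde B x + \nabla V \tilde c + x^\top \tilde P x + \tilde q^\top x + \tilde r, \tag{2.11}$$

where

$$\tilde A = -\tfrac12 BB^\top, \quad \tilde B = A - BQ^\top, \quad \tilde c = c - Bb, \quad \tilde P = P - \tfrac12 QQ^\top, \quad \tilde q = a - Qb, \quad \tilde r = -\tfrac12 b^\top b. \tag{2.12}$$

Now assume

$$V(x) = \tfrac12 x^\top M x + n^\top x + e \tag{2.13}$$

for some symmetric matrix $M$, some vector $n$ and a scalar $e$. This implies

$$\nabla V(x) = x^\top M + n^\top. \tag{2.14}$$

Inserting (2.13)-(2.14) in (2.11) one obtains

$$\frac{\gamma}{2} x^\top M x + \gamma n^\top x + \gamma e = (x^\top M + n^\top)\tilde A (Mx + n) + (x^\top M + n^\top)\tilde B x + (x^\top M + n^\top)\tilde c + x^\top \tilde P x + \tilde q^\top x + \tilde r. \tag{2.15}$$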

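Equating homogeneous terms of the same degree, and observing that $x^\top M \tilde B x = x^\top \tilde B^\top M x$ because $M$ is symmetric, we finally obtain

$$\begin{aligned} \frac{\gamma}{2} M &= M \tilde A M + \frac12\big(M \tilde B + \tilde B^\top M\big) + \tilde P, \\ \gamma n^\top &= 2 n^\top \tilde A M + n^\top \tilde B + \tilde c^\top M + \tilde q^\top, \\ \gamma e &= n^\top \tilde A n + n^\top \tilde c + \tilde r. \end{aligned} \tag{2.16}$$

The first equation in (2.16) is a quadratic equation for the $n \times n$ symmetric matrix $M$, whose solution depends on the matrices $A, B, P, Q$. Existence and uniqueness of solutions is not guaranteed, in general. As soon as the matrix $M$ is determined, the last two equations in (2.16) yield the values of the vector $n$ and of the scalar $e$. These last two equations are linear w.r.t. $n, e$.

Next, assume that a solution of the equations (2.16) has been found. According to (2.6) the optimal feedback control is

$$u^*(x) = -\big[B^\top \nabla V^\top + Q^\top x + b\big] = -\big[B^\top (Mx + n) + Q^\top x + b\big]. \tag{2.17}$$

The system dynamics is thus described by

$$\dot x = Ax - B\big[B^\top (Mx + n) + Q^\top x + b\big] + c. \tag{2.18}$$

Our basic assumption will be that this dynamics is asymptotically stable. Equivalent conditions for this to happen are:

(i) The linear homogeneous system

$$\dot x = Yx, \qquad Y \doteq A - BB^\top M - BQ^\top \tag{2.19}$$

is asymptotically stable.

(ii) All the eigenvalues of the matrix $Y$ in (2.19) have strictly negative real part.

(iii) There exist constants $C, \omega > 0$ such that the norm of the exponential matrix satisfies

$$\big\| e^{tY} \big\| \le C e^{-t\omega} \qquad \text{for all } t \ge 0. \tag{2.20}$$

We write (2.17) in the form

$$u^*(x) = Ux + v, \tag{2.21}$$

where the $m \times n$ matrix $U$ and the $m$-vector $v$ are given by

$$U = -\big(B^\top M + Q^\top\big), \qquad v = -\big(B^\top n + b\big). \tag{2.22}$$

Clearly, this optimal feedback $u^*$ depends on the coefficients $A, B, c$ of the system (2.1) and on the coefficients $P, Q, a, b$ of the cost functional (2.2). Keeping $B, Q, b$ constant, we wish to understand how $U, v$ depend on the remaining parameters $A, P, c, a$. In other words, we study the differential of the map

$$\Phi : (A, c, P, a) \mapsto (U, v). \tag{2.23}$$

Assume that we replace $A, c, P, a$ by $A + \varepsilon A'$, $c + \varepsilon c'$, $P + \varepsilon P'$, $a + \varepsilon a'$. Then the optimal feedback will have the form

$$u^*_\varepsilon(x) = (U + \varepsilon U')x + (v + \varepsilon v') + o(\varepsilon). \tag{2.24}$$

We seek an expression for $U', v'$ as a linear function of $A', P', c', a'$, depending on $A, P, c, a$. Inserting $M^\varepsilon = M + \varepsilon M' + o(\varepsilon)$ inside the first equation in (2.16) and recalling (2.12) one obtains

$$\gamma M^\varepsilon = -M^\varepsilon BB^\top M^\varepsilon + M^\varepsilon\big(A + \varepsilon A' - BQ^\top\big) + \big(A + \varepsilon A' - BQ^\top\big)^\top M^\varepsilon + 2(P + \varepsilon P') - QQ^\top.$$

Collecting terms of order $\varepsilon$ we find

$$\gamma M' = -M' BB^\top M - M BB^\top M' + M'\big(A - BQ^\top\big) + \big(A - BQ^\top\big)^\top M' + MA' + (A')^\top M + 2P',$$

$$\gamma M' + M'\big[BB^\top M - A + BQ^\top\big] + \big[BB^\top M - A + BQ^\top\big]^\top M' = MA' + (MA')^\top + 2P'.$$

We thus seek an $n \times n$ symmetric matrix $M'$ such that

$$\gamma M' - M' Y - Y^\top M' = S, \tag{2.25}$$

where

$$Y \doteq A - BB^\top M - BQ^\top, \qquad S \doteq MA' + (MA')^\top + 2P'. \tag{2.26}$$

Notice that $S$ is symmetric, but $Y$ may not be symmetric. Of course, the left hand side of (2.25) is always symmetric. The solution to (2.25) is provided by the formula

$$M' = \int_0^\infty e^{-\gamma t}\, e^{tY^\top} S\, e^{tY}\, dt. \tag{2.27}$$

Notice that the integral is absolutely convergent, because of (2.20). Observing that $M'$ is symmetric, the above formula can be verified by writing

$$\gamma M' - M' Y - Y^\top M' = -\int_0^\infty \frac{d}{dt}\Big[ e^{-\gamma t}\, e^{tY^\top} S\, e^{tY} \Big]\, dt = S. \tag{2.28}$$

According to (2.22), the first order variations of the matrix $U$ and of the vector $v$ are computed by

$$U' = -B^\top M', \qquad v' = -B^\top n'. \tag{2.29}$$

The symmetric matrix $M'$ has already been computed at (2.27). Notice that $M'$ depends only on the first order variations $A', P'$ and not on $c', a'$. For future use, we seek an expression for $n'$ in the special case where $A' = 0$, $P' = 0$. By (2.12), the second equation in (2.16) yields

$$n = \big(\gamma I + MBB^\top - A^\top + QB^\top\big)^{-1}\big(M(c - Bb) + a - Qb\big).$$

Hence

$$n' = \big(\gamma I + MBB^\top - A^\top + QB^\top\big)^{-1}\big(Mc' + a'\big). \tag{2.30}$$

Notice that the matrix

$$\gamma I + MBB^\top - A^\top + QB^\top = \gamma I - Y^\top$$

is certainly invertible. Indeed, the stability assumption (2.20) implies that all of its eigenvalues have real part $> \gamma$.

3 A linear-quadratic differential game

We now consider a differential game for two players, with controls $u_1 \in \mathbb{R}^{m_1}$, $u_2 \in \mathbb{R}^{m_2}$. In place of (2.1), the dynamics is now described by

$$\dot x = Ax + B_1 u_1 + B_2 u_2 + f. \tag{3.1}$$

Assume that the cost functionals for the two players have the form

$$J_i = \int_0^\infty e^{-\gamma t}\Big( x^\top P_i x + a_i^\top x + \frac{|u_i|^2}{2} + \sum_{j=1,2} \big(x^\top Q_{ij} + b_{ij}^\top\big) u_j \Big)\, dt, \qquad i = 1, 2. \tag{3.2}$$

Assume that the value functions $V_1, V_2$ are second order polynomials in the space variables $x_1, \ldots, x_n$:

$$V_i(x) = \tfrac12 x^\top M_i x + n_i^\top x + e_i, \qquad i = 1, 2, \tag{3.3}$$

so that

$$\nabla V_i(x)^\top = M_i x + n_i. \tag{3.4}$$

Then the optimal feedback controls $u_1^*, u_2^*$ for the two players are determined by

$$u_i^*(x) = \underset{\omega}{\mathrm{argmin}} \Big\{ \nabla V_i B_i \omega + \frac{|\omega|^2}{2} + x^\top Q_{ii}\, \omega + b_{ii}^\top \omega \Big\} = -\big(B_i^\top (M_i x + n_i) + Q_{ii}^\top x + b_{ii}\big). \tag{3.5}$$

The strategies (3.5) yield a Nash equilibrium solution in feedback form provided that the value functions $V_1, V_2$ satisfy the system of P.D.E.s

$$\gamma V_i = \nabla V_i \cdot \dot x + L_i(x, u_1^*, u_2^*), \qquad i = 1, 2, \tag{3.6}$$

where

$$\dot x = Ax + B_1 u_1^* + B_2 u_2^* + f, \tag{3.7}$$

$$L_i(x, u_1, u_2) = x^\top P_i x + a_i^\top x + \frac{|u_i|^2}{2} + \sum_{j=1,2} \big(x^\top Q_{ij} + b_{ij}^\top\big) u_j. \tag{3.8}$$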
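The single-player formulas of Section 2, on which each feedback in (3.5) relies, can be sanity-checked numerically in the scalar case $n = m = 1$. The sketch below is an illustration, not part of the original analysis: the function names and all numerical values are assumptions, and the value function is taken in the form $V(x) = \frac12 M x^2 + n x + e$. It solves the scalar version of (2.16) (choosing the root for which the closed loop is stable), evaluates the residual of the Hamilton-Jacobi equation (2.7), and compares the sensitivity formula (2.25)-(2.27) with a finite difference.

```python
import math

def solve_scalar_lq(A, B, c, P, Q, a, b, gamma):
    """Scalar analogue of (2.16)-(2.22) for V(x) = M x^2 / 2 + n x + e.

    Returns (M, n, e, U, v, Y), selecting the Riccati root with Y < 0."""
    Bt = A - B * Q                                   # scalar version of A - B Q^T
    # Riccati equation: B^2 M^2 + (gamma - 2 Bt) M - (2 P - Q^2) = 0
    disc = (gamma - 2.0 * Bt) ** 2 + 4.0 * B * B * (2.0 * P - Q * Q)
    if B == 0.0 or disc < 0.0:
        raise ValueError("scalar Riccati equation has no usable real root")
    M = (-(gamma - 2.0 * Bt) + math.sqrt(disc)) / (2.0 * B * B)
    Y = A - B * B * M - B * Q                        # closed-loop coefficient, cf. (2.19)
    if Y >= 0.0:
        raise ValueError("closed-loop dynamics is not asymptotically stable")
    n = (M * (c - B * b) + a - Q * b) / (gamma - Y)
    e = (-0.5 * (B * n) ** 2 + n * (c - B * b) - 0.5 * b * b) / gamma
    U = -(B * M + Q)                                 # feedback gain, cf. (2.22)
    v = -(B * n + b)
    return M, n, e, U, v, Y

def hj_residual(x, params, sol):
    """gamma V - [V' (A x + B u + c) + L(x, u)] evaluated at u = U x + v."""
    A, B, c, P, Q, a, b, gamma = params
    M, n, e, U, v, _ = sol
    V = 0.5 * M * x * x + n * x + e
    dV = M * x + n
    u = U * x + v
    L = P * x * x + Q * x * u + 0.5 * u * u + a * x + b * u
    return gamma * V - (dV * (A * x + B * u + c) + L)

def dM_formula(M, Y, gamma, dA, dP):
    """Scalar version of (2.25)-(2.27): M' = S / (gamma - 2 Y)."""
    S = 2.0 * M * dA + 2.0 * dP
    return S / (gamma - 2.0 * Y)

# Illustrative data: A, B, c, P, Q, a, b, gamma
params = (-1.0, 1.0, 0.3, 1.0, 0.2, 0.5, -0.4, 0.5)
sol = solve_scalar_lq(*params)
residuals = [hj_residual(x, params, sol) for x in (-2.0, 0.0, 1.5)]

# Finite-difference check of the first order variation of M
dA, dP, eps = 0.7, -0.3, 1e-6
A, B, c, P, Q, a, b, gamma = params
M_eps = solve_scalar_lq(A + eps * dA, B, c, P + eps * dP, Q, a, b, gamma)[0]
dM_fd = (M_eps - sol[0]) / eps
dM_exact = dM_formula(sol[0], sol[5], gamma, dA, dP)
print(max(abs(r) for r in residuals), dM_fd, dM_exact)
```

In the matrix case the same two steps correspond to an algebraic Riccati equation for $M$ and a Lyapunov-type equation for $M'$; the scalar version is used here only because both can then be solved in closed form.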

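Inserting in (3.6) the expressions (3.3) for $V_1, V_2$, and using (3.4), (3.7), (3.8), one obtains

$$\begin{aligned} \frac{\gamma}{2} x^\top M_1 x + \gamma n_1^\top x + \gamma e_1 ={}& (x^\top M_1 + n_1^\top)\Big[ Ax - B_1\big(B_1^\top (M_1 x + n_1) + Q_{11}^\top x + b_{11}\big) - B_2\big(B_2^\top (M_2 x + n_2) + Q_{22}^\top x + b_{22}\big) + f \Big] \\ & + x^\top P_1 x + a_1^\top x - \sum_{j=1,2} \big(x^\top Q_{1j} + b_{1j}^\top\big)\big(B_j^\top (M_j x + n_j) + Q_{jj}^\top x + b_{jj}\big) \\ & + \frac12 \big((x^\top M_1 + n_1^\top) B_1 + x^\top Q_{11} + b_{11}^\top\big)\big(B_1^\top (M_1 x + n_1) + Q_{11}^\top x + b_{11}\big), \end{aligned} \tag{3.9}$$

and an entirely similar equation holds for $V_2$. We regard (3.9) as an identity between two polynomials of degree 2 in the variables $x_1, \ldots, x_n$. Equating the coefficients of the quadratic terms, one obtains a system of algebraic equations for the coefficients of the symmetric matrices $M_1, M_2$. A direct computation yields

$$\begin{aligned} -\tfrac12 M_1 B_1 B_1^\top M_1 - M_1 B_2 B_2^\top M_2 + M_1 C - M_2 B_2 Q_{12}^\top + C_1 &= 0, \\ -\tfrac12 M_2 B_2 B_2^\top M_2 - M_2 B_1 B_1^\top M_1 + M_2 C - M_1 B_1 Q_{21}^\top + C_2 &= 0, \end{aligned} \tag{3.10}$$

where

$$C \doteq A - \frac{\gamma}{2} I - B_1 Q_{11}^\top - B_2 Q_{22}^\top, \qquad C_1 \doteq P_1 - \tfrac12 Q_{11} Q_{11}^\top - Q_{12} Q_{22}^\top, \qquad C_2 \doteq P_2 - \tfrac12 Q_{22} Q_{22}^\top - Q_{21} Q_{11}^\top. \tag{3.11}$$

Since (3.10) comes from equating quadratic forms, each identity is understood for the symmetric part of its left hand side.

Throughout the following, we assume that there exists a pair of $n \times n$ symmetric matrices $M_1, M_2$ providing a solution to the above system. Moreover, we assume that the resulting dynamics is asymptotically stable, so that all the eigenvalues of the matrix

$$Y \doteq A - \big(B_1 B_1^\top M_1 + B_2 B_2^\top M_2 + B_1 Q_{11}^\top + B_2 Q_{22}^\top\big) \tag{3.12}$$

have strictly negative real parts.

Next, equating the coefficients of the linear terms in (3.9), we obtain a system of linear equations for the vectors $n_1, n_2$:

$$\begin{pmatrix} Y^\top - \gamma I & -R_1 \\ -R_2 & Y^\top - \gamma I \end{pmatrix} \begin{pmatrix} n_1 \\ n_2 \end{pmatrix} = \begin{pmatrix} T_1 \\ T_2 \end{pmatrix}, \tag{3.13}$$

where

$$R_1 \doteq M_1 B_2 B_2^\top + Q_{12} B_2^\top, \qquad R_2 \doteq M_2 B_1 B_1^\top + Q_{21} B_1^\top,$$

$$T_1 = -a_1 - M_1 f + \sum_{j=1,2} \big(M_1 B_j b_{jj} + Q_{1j} b_{jj} + Q_{jj} b_{1j}\big) + M_2 B_2 b_{12} - Q_{11} b_{11},$$

$$T_2 = -a_2 - M_2 f + \sum_{j=1,2} \big(M_2 B_j b_{jj} + Q_{2j} b_{jj} + Q_{jj} b_{2j}\big) + M_1 B_1 b_{21} - Q_{22} b_{22}.$$

Finally, equating the constant terms on the two sides of (3.9), one obtains an expression for $e_1, e_2$:

$$\gamma e_i = -n_i^\top \sum_{j=1,2} B_j \big(B_j^\top n_j + b_{jj}\big) - \sum_{j=1,2} b_{ij}^\top \big(B_j^\top n_j + b_{jj}\big) + \frac12 \big| B_i^\top n_i + b_{ii} \big|^2 + n_i^\top f. \tag{3.14}$$

4 Affine perturbations

Now consider the differential game (3.1)-(3.2). Let the feedback controls

$$u_i^*(x) = U_i^* x + v_i^*, \qquad i = 1, 2, \tag{4.1}$$

provide a Nash equilibrium solution. Let $V_1, V_2$ be the value functions for the two players, so that all the identities (3.3)-(3.14) hold. We wish to understand whether this solution is stable w.r.t. iterations of the best reply map. In this section we study the case of affine perturbations, while the next section is concerned with general nonlinear perturbations. As we shall see, the answer is quite different in the two cases.

To state the question more precisely, consider two perturbed feedback controls

$$u_i^{(0)}(x) = U_i^{(0)} x + v_i^{(0)}, \qquad i = 1, 2. \tag{4.2}$$

By induction on $k$, define a sequence of feedback controls

$$u_i^{(k)}(x) = U_i^{(k)} x + v_i^{(k)}, \qquad i = 1, 2, \tag{4.3}$$

where $u_1^{(k)}$ is the optimal feedback for Player 1, in reply to the feedback $u_2^{(k-1)}$ implemented by Player 2, and similarly $u_2^{(k)}$ is the optimal feedback for Player 2, in reply to $u_1^{(k-1)}$. We seek conditions which guarantee that, if the pair $(u_1^{(0)}, u_2^{(0)})$ is sufficiently close to $(u_1^*, u_2^*)$, then one has the convergence

$$\big(u_1^{(k)}, u_2^{(k)}\big) \to (u_1^*, u_2^*) \qquad \text{as } k \to \infty. \tag{4.4}$$

For this purpose, we consider the two maps

$$\Phi_1 : U_1^{(k-1)} \mapsto U_2^{(k)}, \qquad \Phi_2 : U_2^{(k)} \mapsto U_1^{(k+1)}, \tag{4.5}$$

and compute the differential of the composite mapping $\Phi_2 \circ \Phi_1$ at the equilibrium point $(U_1^*, U_2^*)$. Consider a small perturbation of the feedback for the first player:

$$u_{1,\varepsilon}(x) = (U_1^* + \varepsilon U_1') x + v_1^* + \varepsilon v_1'.$$

Then the optimal feedback for the second player will also be changed, say

$$u_{2,\varepsilon}(x) = (U_2^* + \varepsilon U_2') x + v_2^* + \varepsilon v_2' + o(\varepsilon).$$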
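The iteration (4.2)-(4.4) can be tried out numerically for a scalar game ($n = m_1 = m_2 = 1$). In the sketch below, which is an illustration and not the authors' computation, each best reply is obtained by solving the scalar Riccati equation of the single-player problem that one player faces when the other uses an affine feedback, with value functions of the form $\frac12 M x^2 + n x + e$. The dictionary layout, the function names and all numerical values are assumptions, chosen so that the coupling between the two players is weak.

```python
import math

def best_reply(U_o, v_o, me, g):
    """Affine best reply (U, v) of player `me` (0 or 1) to the feedback
    u_other(x) = U_o * x + v_o of the other player, for a scalar game."""
    o = 1 - me
    # Effective single-player data seen by player `me`
    A = g["A"] + g["B"][o] * U_o
    c = g["B"][o] * v_o + g["f"]
    P = g["P"][me] + g["Q"][me][o] * U_o
    a = g["a"][me] + g["Q"][me][o] * v_o + g["b"][me][o] * U_o
    B, Q, b, gamma = g["B"][me], g["Q"][me][me], g["b"][me][me], g["gamma"]
    # Scalar Riccati equation for V(x) = M x^2 / 2 + n x + e
    Bt = A - B * Q
    disc = (gamma - 2.0 * Bt) ** 2 + 4.0 * B * B * (2.0 * P - Q * Q)
    if disc < 0.0:
        raise ValueError("no real solution of the Riccati equation")
    M = (-(gamma - 2.0 * Bt) + math.sqrt(disc)) / (2.0 * B * B)
    Y = A - B * B * M - B * Q
    if Y >= 0.0:
        raise ValueError("closed loop is not asymptotically stable")
    n = (M * (c - B * b) + a - Q * b) / (gamma - Y)
    return -(B * M + Q), -(B * n + b)

def iterate_best_reply(g, u1, u2, steps):
    """Simultaneous iteration (u1, u2) -> (R1(u2), R2(u1)); each u is (U, v)."""
    history = [(u1, u2)]
    for _ in range(steps):
        u1, u2 = best_reply(*u2, 0, g), best_reply(*u1, 1, g)
        history.append((u1, u2))
    return history

def gap(p, q):
    """Largest difference between two pairs of affine feedbacks."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

# Illustrative weakly coupled game; Q[i][j] and b[i][j] stand for Q_{i+1,j+1}, b_{i+1,j+1}
game = {
    "A": -1.0, "f": 0.1, "gamma": 0.5,
    "B": (1.0, 0.3),
    "P": (1.0, 1.0),
    "a": (0.2, -0.1),
    "Q": ((0.1, 0.05), (0.05, 0.1)),
    "b": ((0.0, 0.1), (-0.2, 0.0)),
}
hist_a = iterate_best_reply(game, (0.0, 0.0), (0.0, 0.0), 40)
hist_b = iterate_best_reply(game, (0.5, -1.0), (-0.8, 0.3), 40)
print(hist_a[-1])
print(gap(hist_a[-1], hist_a[-2]), gap(hist_a[-1], hist_b[-1]))
```

For these data the successive iterates settle down quickly and the two starting points lead to the same pair of feedbacks. Making the second player's control matrix larger strengthens the coupling, and the iteration then need not converge.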

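The perturbation $U_2'$ is computed as in (2.29), (2.27). Indeed, from the point of view of the second player, perturbing $U_1^*$ and $v_1^*$ amounts to replacing the dynamics

$$\dot x = (A + B_1 U_1^*) x + B_2 u_2 + B_1 v_1^* + f \tag{4.6}$$

with

$$\dot x = \big(A + B_1 (U_1^* + \varepsilon U_1')\big) x + B_2 u_2 + B_1 (v_1^* + \varepsilon v_1') + f. \tag{4.7}$$

Moreover, the cost functional

$$J_2 = \int_0^\infty e^{-\gamma t}\Big[ x^\top P_2 x + a_2^\top x + \frac{|u_2|^2}{2} + \big(x^\top Q_{22} + b_{22}^\top\big) u_2 + \big(x^\top Q_{21} + b_{21}^\top\big)\big(U_1^* x + v_1^*\big) \Big]\, dt \tag{4.8}$$

is replaced by

$$J_{2,\varepsilon} = \int_0^\infty e^{-\gamma t}\Big[ x^\top P_2 x + a_2^\top x + \frac{|u_2|^2}{2} + \big(x^\top Q_{22} + b_{22}^\top\big) u_2 + \big(x^\top Q_{21} + b_{21}^\top\big)\big(U_1^* x + \varepsilon U_1' x + v_1^* + \varepsilon v_1'\big) \Big]\, dt. \tag{4.9}$$

By the analysis in Section 2, the differential of the map $\Phi_1$ in (4.5) is the linear map $U_1' \mapsto U_2'$ defined by

$$U_2' = -B_2^\top M_2', \tag{4.10}$$

$$M_2' = \int_0^\infty e^{-\gamma t}\, e^{tY^\top} S_2\, e^{tY}\, dt, \tag{4.11}$$

$$S_2 \doteq M_2 A' + (M_2 A')^\top + 2P' = M_2 B_1 U_1' + (M_2 B_1 U_1')^\top + Q_{21} U_1' + (Q_{21} U_1')^\top. \tag{4.12}$$

The differential of the map $\Phi_2$ is computed by the same formulas (4.10)-(4.12), permuting the indices 1, 2. We now consider the composition of two iterations, say

$$\tilde U_1' = D\Phi_2 \cdot U_2' = D\Phi_2 \cdot D\Phi_1 \cdot U_1'.$$

Observe that

$$\tilde U_1' = -B_1^\top \tilde M_1', \qquad U_2' = -B_2^\top M_2', \qquad U_1' = -B_1^\top M_1',$$

where the matrices $B_1, B_2$ do not change from one iteration to the next. The norm of $M_2'$ can be estimated by

$$\begin{aligned} \| M_2' \| &= \Big\| \int_0^\infty e^{-\gamma t}\, e^{tY^\top} S_2\, e^{tY}\, dt \Big\| = \Big\| \int_0^\infty e^{-\gamma t}\, e^{tY^\top} \Big[ M_2 B_1 U_1' + (M_2 B_1 U_1')^\top + Q_{21} U_1' + (Q_{21} U_1')^\top \Big] e^{tY}\, dt \Big\| \\ &\le 2 \int_0^\infty e^{-\gamma t}\, \big\| e^{tY^\top} \big\|\, \big\| e^{tY} \big\|\, dt \cdot \| M_2 B_1 + Q_{21} \|\, \| B_1 \|\, \| M_1' \|. \end{aligned}$$

Applying the same estimate to the second iteration, we obtain

$$\| \tilde M_1' \| \le 4 \Big( \int_0^\infty e^{-\gamma t}\, \big\| e^{tY^\top} \big\|\, \big\| e^{tY} \big\|\, dt \Big)^2 \| M_1 B_2 + Q_{12} \|\, \| M_2 B_1 + Q_{21} \|\, \| B_1 \|\, \| B_2 \|\, \| M_1' \|.$$

Calling

$$I \doteq \int_0^\infty e^{-\gamma t}\, \big\| e^{tY^\top} \big\|\, \big\| e^{tY} \big\|\, dt, \tag{4.13}$$

we can now state

Theorem 1. Let $V_i$, $i = 1, 2$, be the value functions corresponding to a Nash equilibrium feedback solution. Assume that

$$4\, \| M_1 B_2 + Q_{12} \|\, \| M_2 B_1 + Q_{21} \|\, \| B_1 \|\, \| B_2 \|\, I^2 < 1, \tag{4.14}$$

$$\big\| (\gamma I - Y^\top)^{-1} \big\|^2\, \| M_1 B_2 + Q_{12} \|\, \| M_2 B_1 + Q_{21} \|\, \| B_1 \|\, \| B_2 \| < 1. \tag{4.15}$$

Then the Nash equilibrium is asymptotically stable w.r.t. iterations of the best reply map, within the class of piecewise affine feedback controls.

Proof. Consider the composite mapping

$$(U_1, v_1) \mapsto (U_2, v_2) \mapsto (\tilde U_1, \tilde v_1), \tag{4.16}$$

where $u_2(x) = U_2 x + v_2$ is the best reply for Player 2 to the feedback $u_1(x) = U_1 x + v_1$ used by Player 1, while $\tilde u_1(x) = \tilde U_1 x + \tilde v_1$ is the best reply for Player 1 to the feedback $u_2(x) = U_2 x + v_2$. Consider the corresponding value functions

$$V_1(x) = \tfrac12 x^\top M_1 x + n_1^\top x + e_1, \qquad V_2(x) = \tfrac12 x^\top M_2 x + n_2^\top x + e_2, \qquad \tilde V_1(x) = \tfrac12 x^\top \tilde M_1 x + \tilde n_1^\top x + \tilde e_1. \tag{4.17}$$

By (3.5),

$$U_i = -\big(B_i^\top M_i + Q_{ii}^\top\big), \qquad v_i = -\big(B_i^\top n_i + b_{ii}\big).$$

Therefore, to prove convergence of the iterates of the best reply map, it suffices to prove the convergence of the iterates of the composite map

$$(M_1, n_1) \mapsto (M_2, n_2) \mapsto (\tilde M_1, \tilde n_1). \tag{4.18}$$

Consider the differential of this mapping, computed at the Nash equilibrium:

$$\Lambda \doteq \begin{pmatrix} \partial \tilde M_1 / \partial M_1 & \partial \tilde M_1 / \partial n_1 \\ \partial \tilde n_1 / \partial M_1 & \partial \tilde n_1 / \partial n_1 \end{pmatrix}. \tag{4.19}$$

To prove the theorem, it suffices to show that the powers of this matrix converge to zero:

$$\Lambda^N \to 0 \qquad \text{as } N \to \infty. \tag{4.20}$$

Since $\partial \tilde M_1 / \partial n_1 = 0$, the convergence in (4.20) will follow from the two inequalities

$$\big\| \partial \tilde M_1 / \partial M_1 \big\| < 1, \qquad \big\| \partial \tilde n_1 / \partial n_1 \big\| < 1. \tag{4.21}$$

By (4.13), the first inequality follows from the assumption (4.14). Next, applying (2.30) with

$$c' = B_1 v_1' = -B_1 B_1^\top n_1', \qquad a' = Q_{21} v_1' = -Q_{21} B_1^\top n_1',$$

we obtain

$$n_2' = -(\gamma I - Y^\top)^{-1} \big(M_2 B_1 + Q_{21}\big) B_1^\top n_1'.$$

Similarly,

$$\tilde n_1' = -(\gamma I - Y^\top)^{-1} \big(M_1 B_2 + Q_{12}\big) B_2^\top n_2'.$$

Hence

$$\big\| \partial \tilde n_1 / \partial n_1 \big\| \le \big\| (\gamma I - Y^\top)^{-1} \big\|^2\, \| M_1 B_2 + Q_{12} \|\, \| M_2 B_1 + Q_{21} \|\, \| B_1 \|\, \| B_2 \|,$$

and the second inequality in (4.21) follows from the assumption (4.15).

Example 1. The previous result applies, in particular, to a differential game with one weak player, considered in [2, 3]. Let the cost functionals be as in (3.2), but assume that the dynamics has the form

$$\dot x = Ax + B_1 u_1 + \theta B_2 u_2 + f. \tag{4.22}$$

When $\theta = 0$, the Nash equilibrium solution is attained after one iteration. Indeed, in this case the second player cannot affect the evolution of the system. His best strategy is thus the myopic one:

$$u_2^*(x) = \underset{\omega}{\mathrm{argmin}} \Big\{ \frac{|\omega|^2}{2} + x^\top Q_{22}\, \omega + b_{22}^\top \omega \Big\} = -\big(Q_{22}^\top x + b_{22}\big),$$

regardless of the feedback $u_1$ implemented by the first player. On the other hand, for $\theta > 0$, replacing the term $B_2$ by $\theta B_2$ it is clear that the assumptions (4.14)-(4.15) are satisfied, as long as $\theta$ remains small enough.

5 Nonlinear perturbations

The convergence result proved in the previous section strongly relied on the linear-quadratic structure of the game, and on the fact that all perturbed feedback controls remained within the class of affine functions of the state $x$. In this section we show that, even for a linear-quadratic game, the iterates of the best reply map may fail to converge, as soon as we consider feedback controls which are not affine functions of the state.

Example 2. Consider the game with linear dynamics and quadratic payoff functionals

$$\dot x = f(x, u_1, u_2) \doteq -x + u_1 + u_2, \tag{5.1}$$

$$J_1 \doteq \int_0^\infty e^{-t}\Big[ a x - \frac{u_1^2}{2} \Big]\, dt, \tag{5.2}$$

$$J_2 \doteq \int_0^\infty e^{-t}\Big[ b x - \frac{u_2^2}{2} \Big]\, dt. \tag{5.3}$$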

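We assume here $0 < a < b$. Call $V_1(x), V_2(x)$ the value functions for the two players. Their derivatives will be written as

$$\xi \doteq V_1' = V_{1,x}, \qquad \eta \doteq V_2' = V_{2,x}.$$

The optimal controls are then computed as

$$u_1^*(x, \xi) = \underset{\omega}{\mathrm{argmax}} \Big\{ \xi \omega - \frac{\omega^2}{2} \Big\} = \xi, \tag{5.4}$$

$$u_2^*(x, \eta) = \underset{\omega}{\mathrm{argmax}} \Big\{ \eta \omega - \frac{\omega^2}{2} \Big\} = \eta. \tag{5.5}$$

The value functions $V_1, V_2$ can be found by solving the system of Hamilton-Jacobi equations

$$\begin{cases} V_1 = H(x, V_1', V_2'), \\ V_2 = K(x, V_1', V_2'), \end{cases} \tag{5.6}$$

where

$$H(x, \xi, \eta) = \xi \big( -x + u_1^*(x, \xi) + u_2^*(x, \eta) \big) + a x - \frac{\big(u_1^*(x, \xi)\big)^2}{2} = \frac{\xi^2}{2} + (a - \xi) x + \xi \eta, \tag{5.7}$$

$$K(x, \xi, \eta) = \eta \big( -x + u_1^*(x, \xi) + u_2^*(x, \eta) \big) + b x - \frac{\big(u_2^*(x, \eta)\big)^2}{2} = \xi \eta + (b - \eta) x + \frac{\eta^2}{2}. \tag{5.8}$$

Notice that

$$H_\xi = K_\eta = -x + \xi + \eta = f(x, \xi, \eta). \tag{5.9}$$

Differentiating (5.6) we obtain the system

$$\begin{cases} \xi = H_x + H_\xi\, \xi' + H_\eta\, \eta', \\ \eta = K_x + K_\xi\, \xi' + K_\eta\, \eta'. \end{cases} \tag{5.10}$$

In matrix notation, this can be written as

$$\begin{pmatrix} \xi + \eta - x & \xi \\ \eta & \xi + \eta - x \end{pmatrix} \begin{pmatrix} \xi' \\ \eta' \end{pmatrix} = \begin{pmatrix} 2\xi - a \\ 2\eta - b \end{pmatrix}. \tag{5.11}$$

The system (5.11) admits the constant solution

$$\xi \equiv \frac{a}{2}, \qquad \eta \equiv \frac{b}{2}. \tag{5.12}$$

This corresponds to a Nash equilibrium solution of the differential game consisting of two constant feedback controls:

$$u_1^*(x) = \frac{a}{2}, \qquad u_2^*(x) = \frac{b}{2}. \tag{5.13}$$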
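The constant solution (5.12) is easy to verify numerically. The following sketch assumes the dynamics $\dot x = -x + u_1 + u_2$ and the payoffs $\int_0^\infty e^{-t}[a x - u_1^2/2]\,dt$, $\int_0^\infty e^{-t}[b x - u_2^2/2]\,dt$ of Example 2, with illustrative values of $a, b$; the helper names are hypothetical. It checks that the affine functions $V_1(x) = \frac{a}{2}x + \frac{a^2}{8} + \frac{ab}{4}$ and $V_2(x) = \frac{b}{2}x + \frac{b^2}{8} + \frac{ab}{4}$ satisfy (5.6), and compares $V_1$ with the discounted payoff computed by direct quadrature along the closed-loop trajectory.

```python
import math

def H(x, xi, eta, a):
    """Hamiltonian of Player 1, cf. (5.7)."""
    return 0.5 * xi * xi + (a - xi) * x + xi * eta

def K(x, xi, eta, b):
    """Hamiltonian of Player 2, cf. (5.8)."""
    return 0.5 * eta * eta + xi * eta + (b - eta) * x

def value_functions(a, b):
    """Candidate value functions with constant derivatives a/2 and b/2."""
    V1 = lambda x: 0.5 * a * x + a * a / 8.0 + a * b / 4.0
    V2 = lambda x: 0.5 * b * x + b * b / 8.0 + a * b / 4.0
    return V1, V2

def payoff_player1(x0, a, b, T=40.0, steps=40000):
    """Trapezoidal approximation of int_0^T e^{-t} [a x(t) - u1^2 / 2] dt with
    u1 = a/2, u2 = b/2, so that x(t) = m + (x0 - m) e^{-t}, m = (a + b)/2."""
    m, u1 = 0.5 * (a + b), 0.5 * a
    h = T / steps

    def integrand(t):
        x = m + (x0 - m) * math.exp(-t)
        return math.exp(-t) * (a * x - 0.5 * u1 * u1)

    total = 0.5 * (integrand(0.0) + integrand(T))
    total += sum(integrand(i * h) for i in range(1, steps))
    return h * total

a, b = 1.0, 2.0                      # illustrative values with 0 < a < b
V1, V2 = value_functions(a, b)
checks = []
for x in (-1.0, 0.5, 3.0):
    r1 = V1(x) - H(x, a / 2, b / 2, a)       # residual of the first equation in (5.6)
    r2 = V2(x) - K(x, a / 2, b / 2, b)       # residual of the second equation in (5.6)
    r3 = V1(x) - payoff_player1(x, a, b)     # value function vs. simulated payoff
    checks.append((r1, r2, r3))
    print(x, r1, r2, r3)
```

The quadrature is truncated at a finite horizon $T$; because of the discount factor $e^{-t}$ the neglected tail is far below the discretization error.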

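Next, we study whether there exist other solutions, possibly described by nonlinear feedbacks $u_1(x), u_2(x)$. Following [2, 3], we write (5.11) as a Pfaffian system

$$\begin{aligned} \omega_1 &\doteq (a - 2\xi)\, dx + (\xi + \eta - x)\, d\xi + \xi\, d\eta = 0, \\ \omega_2 &\doteq (b - 2\eta)\, dx + \eta\, d\xi + (\xi + \eta - x)\, d\eta = 0. \end{aligned} \tag{5.14}$$

The graph of a solution to (5.11) can then be obtained by suitably concatenating trajectories of the vector field

$$v = \omega_1 \wedge \omega_2 = \begin{pmatrix} (\xi + \eta - x)^2 - \xi \eta \\ (\xi + \eta - x)(2\xi - a) - \xi (2\eta - b) \\ (\xi + \eta - x)(2\eta - b) - \eta (2\xi - a) \end{pmatrix}. \tag{5.15}$$

Notice that the first component of $v$ vanishes along the conical surface

$$\Gamma \doteq \big\{ (x, \xi, \eta)\,;\ (\xi + \eta - x)^2 - \xi \eta = 0 \big\}. \tag{5.16}$$

At a point $P \in \Gamma$ we either have $v(P) = 0 \in \mathbb{R}^3$, or else the vector $v(P)$ is vertical. The only way to connect trajectories of the vector field $v$ forming the graph of a smooth function $x \mapsto W(x) = (y(x), z(x))$ is to cross the surface $\Gamma$ somewhere along the two curves where $v$ vanishes, namely

$$\gamma^\pm \doteq \Big\{ (x, \xi, \eta)\,;\ x = \xi + \eta \pm \sqrt{\xi \eta}, \quad 2\xi - a = \mp \sqrt{\frac{\xi}{\eta}}\, (2\eta - b) \Big\}. \tag{5.17}$$

Observe that the map

$$t \mapsto P(t) \doteq \Big( x(t), \frac{a}{2}, \frac{b}{2} \Big), \qquad x(t) = \frac{p^+ + p^- e^{\kappa t}}{1 + e^{\kappa t}}, \qquad \kappa = \sqrt{ab}, \qquad p^\pm = \frac{a + b}{2} \pm \frac{\sqrt{ab}}{2}, \tag{5.18}$$

describes a heteroclinic orbit of the vector field $v$ in (5.15), connecting the two stationary points

$$P^+ = \Big( \frac{a + b + \sqrt{ab}}{2}, \frac{a}{2}, \frac{b}{2} \Big) \in \gamma^+, \qquad P^- = \Big( \frac{a + b - \sqrt{ab}}{2}, \frac{a}{2}, \frac{b}{2} \Big) \in \gamma^-.$$

Setting $X(t) \doteq \frac{a + b}{2} - x(t)$, the Jacobian matrix of the vector field $v$ at the point $P(t)$ is

$$A(t) = \begin{pmatrix} -2X(t) & 2X(t) - \frac{b}{2} & 2X(t) - \frac{a}{2} \\ 0 & 2X(t) & -a \\ 0 & -b & 2X(t) \end{pmatrix}. \tag{5.19}$$

The eigenvalues $\lambda_i$ and the corresponding eigenvectors $w_i$ of the matrix $A(t)$ are:

$$\lambda_1(t) = -2X(t), \qquad \lambda_2(t) = 2X(t) + \sqrt{ab}, \qquad \lambda_3(t) = 2X(t) - \sqrt{ab}, \tag{5.20}$$

$$w_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad w_2 = \begin{pmatrix} (\sqrt a - \sqrt b)/2 \\ \sqrt a \\ -\sqrt b \end{pmatrix}, \qquad w_3 = \begin{pmatrix} (\sqrt a + \sqrt b)/2 \\ \sqrt a \\ \sqrt b \end{pmatrix}, \tag{5.21}$$

where each eigenvector is determined up to a nonzero scalar factor. In particular, at the point $P^-$ the three eigenvalues are $-\sqrt{ab} < 0 < 2\sqrt{ab}$, while at the point $P^+$ the eigenvalues are $-2\sqrt{ab} < 0 < \sqrt{ab}$. Since the eigenvectors are constant, the general solution to the linear equation $\dot y = A(t) y$ can be written as

$$y(t) = c_1 \exp\Big\{ \int_0^t \lambda_1(s)\, ds \Big\} w_1 + c_2 \exp\Big\{ \int_0^t \lambda_2(s)\, ds \Big\} w_2 + c_3 \exp\Big\{ \int_0^t \lambda_3(s)\, ds \Big\} w_3, \tag{5.22}$$

for some constants $c_1, c_2, c_3$. By (5.18)-(5.20), we have

$$\lim_{t \to -\infty} |y(t)| = +\infty \ \text{ if and only if } \ c_3 \ne 0, \qquad \lim_{t \to +\infty} |y(t)| = +\infty \ \text{ if and only if } \ c_2 \ne 0. \tag{5.23}$$

Let us call $\Sigma^+$ the 2-dimensional manifold of points $P_0 = (x_0, \xi_0, \eta_0)$ such that the solution to

$$\dot P = v(P), \qquad P(0) = P_0 \tag{5.24}$$

satisfies $P(t) \to \gamma^+$ as $t \to -\infty$. Similarly, call $\Sigma^-$ the 2-dimensional manifold of points $P_0 = (x_0, \xi_0, \eta_0)$ such that the solution of (5.24) satisfies $P(t) \to \gamma^-$ as $t \to +\infty$. By (5.23), these two manifolds intersect transversally along the segment $P^- P^+$. We conclude that there can be no other solutions to (5.11), in a neighborhood of the constant solution (5.12).

We claim that this pair of feedback controls, providing a fixed point of the best reply map, is unstable w.r.t. nonlinear perturbations. More precisely:

Proposition 1. There exists $\delta > 0$ and a sequence of smooth perturbations $\varphi_k \in \mathcal{C}^\infty_c(\mathbb{R})$ such that the following holds. The $\mathcal{C}^k$ norms of $\varphi_k$ satisfy

$$\| \varphi_k \|_{\mathcal{C}^k} \to 0. \tag{5.25}$$

For any $k \ge 1$, starting with the values

$$u_1^{(0)}(x) = \frac{a}{2} + \varphi_k(x), \qquad u_2^{(0)}(x) = \frac{b}{2}, \tag{5.26}$$

the iterates $(u_1^{(N)}, u_2^{(N)})$ of the best reply map do not converge to the solution $(u_1^*, u_2^*)$ in (5.13). Indeed,

$$\limsup_{N \to \infty}\ \sup \Big\{ \Big| u_1^{(N)}(x) - \frac{a}{2} \Big| + \Big| u_2^{(N)}(x) - \frac{b}{2} \Big|\,;\ \ \frac{a + b}{2} < x < \frac{a + b}{2} + 6\delta \Big\} > \delta. \tag{5.27}$$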
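Before entering the proof, the explicit orbit (5.18) and the spectral data (5.19)-(5.20) can be cross-checked numerically. The sketch below assumes that the vector field in (5.15) is $v = \big((\xi+\eta-x)^2 - \xi\eta,\ (\xi+\eta-x)(2\xi-a) - \xi(2\eta-b),\ (\xi+\eta-x)(2\eta-b) - \eta(2\xi-a)\big)$ and that $\kappa = \sqrt{ab}$, uses illustrative values of $a, b$, and picks one particular normalization of the eigenvectors. None of the function names come from the original text.

```python
import math

def v_field(x, xi, eta, a, b):
    """Vector field assumed for (5.15)."""
    s = xi + eta - x
    return (s * s - xi * eta,
            s * (2 * xi - a) - xi * (2 * eta - b),
            s * (2 * eta - b) - eta * (2 * xi - a))

def orbit_x(t, a, b):
    """x-component of the heteroclinic orbit (5.18)."""
    kappa = math.sqrt(a * b)
    p_plus = 0.5 * (a + b) + 0.5 * kappa
    p_minus = 0.5 * (a + b) - 0.5 * kappa
    if t > 0:
        # Same function, rewritten with e^{-kappa t} to avoid overflow
        E = math.exp(-kappa * t)
        return (p_plus * E + p_minus) / (E + 1.0)
    E = math.exp(kappa * t)
    return (p_plus + p_minus * E) / (1.0 + E)

def jacobian(X, a, b):
    """Jacobian of v at (x, a/2, b/2), with X = (a + b)/2 - x, cf. (5.19)."""
    return [[-2 * X, 2 * X - b / 2, 2 * X - a / 2],
            [0.0, 2 * X, -a],
            [0.0, -b, 2 * X]]

def eigenpairs(X, a, b):
    """Eigenvalues (5.20) with one choice of eigenvectors (any scaling works)."""
    ra, rb, k = math.sqrt(a), math.sqrt(b), math.sqrt(a * b)
    return [(-2 * X, (1.0, 0.0, 0.0)),
            (2 * X + k, (0.5 * (ra - rb), ra, -rb)),
            (2 * X - k, (0.5 * (ra + rb), ra, rb))]

def check(a, b, times, h=1e-5):
    """Largest residuals of dP/dt = v(P) along the orbit and of A w = lambda w."""
    orbit_err = eig_err = 0.0
    for t in times:
        x = orbit_x(t, a, b)
        dxdt = (orbit_x(t + h, a, b) - orbit_x(t - h, a, b)) / (2 * h)
        vx, vxi, veta = v_field(x, a / 2, b / 2, a, b)
        orbit_err = max(orbit_err, abs(dxdt - vx), abs(vxi), abs(veta))
        X = 0.5 * (a + b) - x
        A = jacobian(X, a, b)
        for lam, w in eigenpairs(X, a, b):
            Aw = [sum(A[i][j] * w[j] for j in range(3)) for i in range(3)]
            eig_err = max(eig_err, max(abs(Aw[i] - lam * w[i]) for i in range(3)))
    return orbit_err, eig_err

orbit_err, eig_err = check(1.0, 2.0, (-3.0, -0.5, 0.0, 1.0, 4.0))
print(orbit_err, eig_err)
```

Both residuals should be at the level of the finite-difference and rounding errors, which supports the sign pattern of the eigenvalues at $P^\pm$ used in (5.23).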

Proof. To construct the nonlinear perturbations $\varphi_k$, choose $0 < \delta \ll 1$ and let $\bar x \doteq \frac{a + b}{2} + 3\delta$. Consider the standard $\mathcal{C}^\infty$ function with compact support

$$\varphi(s) \doteq \begin{cases} \exp\Big\{ \dfrac{1}{s^2 - 1} \Big\} & \text{if } |s| < 1, \\ 0 & \text{if } |s| \ge 1. \end{cases}$$

Then define

$$\varphi_k(x) = c_k\, \varphi\Big( \frac{x - \bar x}{\varepsilon_k} \Big).$$

Choosing sequences of numbers $0 < c_k \ll \varepsilon_k$ converging to zero sufficiently fast, the condition (5.25) can be easily satisfied.

We now examine the sequence of iterations of the best reply map, starting from the initial feedbacks

$$\xi_0(x) = \frac{a}{2} + \varphi_k(x), \qquad \eta_0(x) = \frac{b}{2}. \tag{5.28}$$

Assume $\xi_N, \eta_N$ have been determined. By (5.10), the next iteration yields a pair of functions $\xi_{N+1}(x)$ and $\eta_{N+1}(x)$, respectively providing solutions to

$$\xi' = \frac{2\xi - a - \eta_N'(x)\, \xi}{\xi + \eta_N(x) - x}, \qquad \eta' = \frac{2\eta - b - \xi_N'(x)\, \eta}{\eta + \xi_N(x) - x}. \tag{5.29}$$

Iterating once more, we find that $\xi_{N+2}$ provides a solution to

$$\xi' = \frac{1}{\xi + \eta_{N+1}(x) - x} \left[ 2\xi - a - \frac{2\eta_{N+1}(x) - b - \xi_N'(x)\, \eta_{N+1}(x)}{\eta_{N+1}(x) + \xi_N(x) - x}\, \xi \right]. \tag{5.30}$$

At this stage, one should observe that the system of ODEs (5.11), as well as the two ODEs in (5.29), are not supplemented by initial data. It is the singularity in the coefficients that determines one particular, globally defined solution.

For the system (5.11), the matrix of coefficients fails to be invertible on the conical surface $\Gamma$ at (5.16). A globally smooth solution is found by concatenating trajectories of the vector field $v$ in (5.15). As we have seen, these trajectories must cross $\Gamma$ at points where $v$ vanishes. Since the manifolds $\Sigma^+, \Sigma^-$ intersect transversally, the segment $P^+ P^-$ is the only heteroclinic orbit, and the unique Nash equilibrium solution is (5.13).

On the other hand, the right hand sides of (5.29) are both singular at points where $\xi + \eta - x = 0$. At each iteration, the only solution which is regular across the surface where $\xi + \eta - x = 0$ is the one for which

$$\xi_N(x) = \frac{a}{2}, \qquad \eta_N(x) = \frac{b}{2} \qquad \text{for all } x \le \bar x - \varepsilon_k, \ N \ge 1. \tag{5.31}$$

The functions $\xi_{N+1}, \eta_{N+1}$ can thus be constructed by solving (5.29) on the half line $[\bar x - \varepsilon_k, +\infty[$, with boundary data

$$\xi(\bar x - \varepsilon_k) = \frac{a}{2}, \qquad \eta(\bar x - \varepsilon_k) = \frac{b}{2}. \tag{5.32}$$

Observe that the Cauchy problem for the system ξ ξ + η x ξ = η ξ + η x ξη η ξ + η x ξ a η b with initial data 5.3 is well posed in a neighborhood of x, and its solution provides a fixed point of the iteration scheme 5.9. However, this iteration scheme, motivated by the best-reply map, is highly unstable for x close to a+b. Indeed, contrary to the usual Picard integral map, the right hand sides of 5.9 depend also on the derivatives ξ N, η N. Since ξ a/, η b/, x a + b/ + 3δ, from 5.3 it follows ξ N+ x ab a + b x 4 ξ N x, We thus expect that ξ N x + for x ε k < x < x. A more careful argument is as follows. Assume a ξ Nx a + δ, b δ η N+x b + δ, for all x [x ε k, ȳ], for some ȳ [x ε k, x ]. Then 5.3 yields 5.33 ξ N+ x ab 3δ 4 ξ N x δ η N+ x b. 5.34 Moreover, the second equation in 5.9 yields η N+ x = ] [ξ N x η ξ N x x η η + b b δ + δ Hence, integrating and inserting in 5.34, we obtain ξ N+ x 3δ ab 4 ξ N x δ b δ + 4 x x ε k ξ N It is not restrictive to assume that ε k > is so small that b δ + 4 all x [x ε k, ȳ] one has ξ N+ x a = x x ε k ξ N+ y dy ξ N x 5.35 y dy. 5.36 ε k ab 7. If this holds, for ab x 3δ 4 ξ N y dy b x x ε k δ δ + 4 [x x ε k ] ξ N y dy x ε k ab x 3δ 8 ξ N y dy = ab x ε k 3δ 8 ξ N x a 5.37 Since < δ <<, the above estimate shows that, for every x ε k < x < ȳ, the sequence ξ N x a/ keeps increasing, as long as the bounds 5.33 hold. By continuity, for any given N we can now find yn > x ε k such that the estimates 5.33 hold for all N N and x [x ε k, yn ]. By 5.37, the sequence ξ N x keeps increasing. Hence there exists N > N such that one of the bounds in 5.33 fails at some x [x ε k, yn ]. This establishes the limit 5.7. 7
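To get a feeling for the size of the instability, one can evaluate the heuristic amplification factor $\dfrac{ab}{4\,\big(\frac{a+b}{2} - x\big)^2}$ relating $\xi_{N+2}'$ to $\xi_N'$ near $\xi = a/2$, $\eta = b/2$. This factor is obtained by linearizing the two-step recursion for $\xi_{N+2}'$; it is an approximation and not a rigorous bound. The values of $a$, $b$, $\delta$ below are illustrative assumptions, and the function name is hypothetical.

```python
def amplification(a, b, x):
    """Heuristic factor multiplying xi_N' after two best-reply iterations,
    valid near xi = a/2, eta = b/2."""
    gap = 0.5 * (a + b) - x
    return a * b / (4.0 * gap * gap)

a, b, delta = 1.0, 2.0, 0.05
x_bar = 0.5 * (a + b) + 3.0 * delta          # centre of the perturbation
factor = amplification(a, b, x_bar)          # equals a*b / (36 delta^2)
growth = [factor ** j for j in range(4)]     # growth over 0, 2, 4, 6 iterations
print(factor, growth)
```

Since the factor behaves like $ab/(36\,\delta^2)$ at $x = \bar x$, it exceeds 1 as soon as $\delta$ is small, which is why arbitrarily small perturbations of the derivative are magnified along the iteration.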

Acknowledgment. The first author was partially supported by NSF through grant DMS-0807420, "New problems in nonlinear control".

References

[1] T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory, reprint of the second edition, SIAM, Philadelphia, 1999.
[2] A. Bressan, From optimal control to non-cooperative differential games: a homotopy approach. Control and Cybernetics 38 (2009), 1081-1106.
[3] A. Bressan, Bifurcation analysis of a noncooperative differential game with one weak player. J. Differential Equations 248 (2010), 1297-1314.
[4] A. Bressan and F. Priuli, Infinite horizon noncooperative differential games. J. Differential Equations 227 (2006), 230-257.
[5] E. Dockner, S. Jorgensen, N. Van Long, and G. Sorger, Differential games in economics and management science, Cambridge University Press, 2000.
[6] J. Engwerda, Linear quadratic differential games: an overview. In: Advances in dynamic games and their applications, pp. 37-71, Ann. Internat. Soc. Dynam. Games 10, Birkhäuser, Boston, 2009.
[7] S. Mirica, Verification theorems for optimal feedback strategies in differential games. Int. Game Theory Rev. 5 (2003), 167-189.
[8] S. Mirica, Reducing a differential game to a pair of optimal control problems. In: Differential Equations, Chaos and Variational Problems, pp. 269-283, Birkhäuser, Basel, 2008.
[9] F. S. Priuli, Infinite horizon noncooperative differential games with nonsmooth costs. J. Math. Anal. Appl. 336 (2007), 156-170.
[10] R. Srikant and T. Basar, Iterative computation of noncooperative equilibria in nonzero-sum differential games with weakly coupled players. J. Optim. Th. Appl. 71 (1991), 137-168.
[11] A.J.T.M. Weeren, J.M. Schumacher, and J.C. Engwerda, Asymptotic analysis of linear feedback Nash equilibria in nonzero-sum linear-quadratic differential games. J. Optim. Theory Appl. 101 (1999), 693-722.