On the Solvability of Risk-Sensitive Linear-Quadratic Mean-Field Games

Similar documents
Linear-Quadratic Stochastic Differential Games with General Noise Processes

Stability of Stochastic Differential Equations

Robust control and applications in economic theory

Mean-Field Games with non-convex Hamiltonian

Risk-Sensitive Control with HARA Utility

Risk-Sensitive and Robust Mean Field Games

Large-population, dynamical, multi-agent, competitive, and cooperative phenomena occur in a wide range of designed

An Introduction to Malliavin calculus and its applications

1 Lattices and Tarski s Theorem

On a class of stochastic differential equations in a financial network model

An Uncertain Control Model with Application to. Production-Inventory System

On Stochastic Adaptive Control & its Applications. Bozenna Pasik-Duncan University of Kansas, USA

arxiv: v4 [math.oc] 24 Sep 2016

Linear Quadratic Zero-Sum Two-Person Differential Games Pierre Bernhard June 15, 2013

MEAN FIELD GAMES WITH MAJOR AND MINOR PLAYERS

Chapter 2 Event-Triggered Sampling

Non-zero Sum Stochastic Differential Games of Fully Coupled Forward-Backward Stochastic Systems

Control of McKean-Vlasov Dynamics versus Mean Field Games

Man Kyu Im*, Un Cig Ji **, and Jae Hee Kim ***

Explicit solutions of some Linear-Quadratic Mean Field Games

Linear Quadratic Zero-Sum Two-Person Differential Games

An Introduction to Noncooperative Games

Chapter Introduction. A. Bensoussan

Decentralized Convergence to Nash Equilibria in Constrained Deterministic Mean Field Control

Game Theory Extra Lecture 1 (BoB)

Backward Stochastic Differential Equations with Infinite Time Horizon

Introduction to Diffusion Processes.

Thomas Knispel Leibniz Universität Hannover

I forgot to mention last time: in the Ito formula for two standard processes, putting

A Concise Course on Stochastic Partial Differential Equations

A MODEL FOR THE LONG-TERM OPTIMAL CAPACITY LEVEL OF AN INVESTMENT PROJECT

On the Stability of the Best Reply Map for Noncooperative Differential Games

Prof. Erhan Bayraktar (University of Michigan)

HJB equations. Seminar in Stochastic Modelling in Economics and Finance January 10, 2011

c 2004 Society for Industrial and Applied Mathematics

Weak solutions of mean-field stochastic differential equations

Information and Credit Risk

An introduction to Mathematical Theory of Control

UNCERTAINTY FUNCTIONAL DIFFERENTIAL EQUATIONS FOR FINANCE

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3

Stochastic Differential Equations.

Noncooperative continuous-time Markov games

An Overview of Risk-Sensitive Stochastic Optimal Control

On the connection between symmetric N-player games and mean field games

Solution of Stochastic Optimal Control Problems and Financial Applications

Mean-field SDE driven by a fractional BM. A related stochastic control problem

Some aspects of the mean-field stochastic target problem

Half of Final Exam Name: Practice Problems October 28, 2014

Bernardo D Auria Stochastic Processes /10. Notes. Abril 13 th, 2010

Dynamic Bertrand and Cournot Competition

Worst Case Portfolio Optimization and HJB-Systems

On the quantiles of the Brownian motion and their hitting times.

6.254 : Game Theory with Engineering Applications Lecture 7: Supermodular Games

Continuous Time Finance

Mean-Field optimization problems and non-anticipative optimal transport. Beatrice Acciaio

A numerical method for solving uncertain differential equations

Lecture 4: Ito s Stochastic Calculus and SDE. Seung Yeal Ha Dept of Mathematical Sciences Seoul National University

Controlled Diffusions and Hamilton-Jacobi Bellman Equations

Asymptotic Perron Method for Stochastic Games and Control

Some SDEs with distributional drift Part I : General calculus. Flandoli, Franco; Russo, Francesco; Wolf, Jochen

Differential Games II. Marc Quincampoix Université de Bretagne Occidentale ( Brest-France) SADCO, London, September 2011

Information, Utility & Bounded Rationality

A Class of Fractional Stochastic Differential Equations

Non-Zero-Sum Stochastic Differential Games of Controls and St

Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control

The Pedestrian s Guide to Local Time

March 16, Abstract. We study the problem of portfolio optimization under the \drawdown constraint" that the

On the Multi-Dimensional Controller and Stopper Games

Path integrals for classical Markov processes

Matthew Zyskowski 1 Quanyan Zhu 2

arxiv: v1 [math.pr] 18 Oct 2016

Control Theory in Physics and other Fields of Science

Rough Burgers-like equations with multiplicative noise

OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS

The multidimensional Ito Integral and the multidimensional Ito Formula. Eric Mu ller June 1, 2015 Seminar on Stochastic Geometry and its applications

Deterministic Dynamic Programming

arxiv: v1 [math.oc] 7 Dec 2018

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

Multi-dimensional Stochastic Singular Control Via Dynkin Game and Dirichlet Form

Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )

MDP Preliminaries. Nan Jiang. February 10, 2019

A NOTE ON STOCHASTIC INTEGRALS AS L 2 -CURVES

ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME

Generalized Riccati Equations Arising in Stochastic Games

Learning in Mean Field Games

Continuous dependence estimates for the ergodic problem with an application to homogenization

On Robust Arm-Acquiring Bandit Problems

Skorokhod Embeddings In Bounded Time

Point Process Control

L 2 -induced Gains of Switched Systems and Classes of Switching Signals

Computing Solution Concepts of Normal-Form Games. Song Chong EE, KAIST

Mean-square Stability Analysis of an Extended Euler-Maruyama Method for a System of Stochastic Differential Equations

Nested Uncertain Differential Equations and Its Application to Multi-factor Term Structure Model

Risk-Sensitive Filtering and Smoothing via Reference Probability Methods

MEAN FIELD GAMES AND SYSTEMIC RISK

Near-Potential Games: Geometry and Dynamics

LOCAL FIXED POINT THEORY INVOLVING THREE OPERATORS IN BANACH ALGEBRAS. B. C. Dhage. 1. Introduction

Order book modeling and market making under uncertainty.

Reflected Brownian Motion

Transcription:

arxiv:141.37v1 [math.oc] 6 Nov 14 On the Solvability of Risk-Sensitive Linear-Quadratic Mean-Field Games Boualem Djehiche and Hamidou Tembine December, 14 Abstract In this paper we formulate and solve a mean-field game described by a linear stochastic dynamics and a quadratic or exponential-quadratic cost functional for each generic player. The optimal strategies for the players are given explicitly using a simple and direct method based on square completion and a Girsanov-type change of measure, suggested in Duncan et al. in e.g. [3, 4] for the mean-field free case. This approach does not use the well-known solution methods such as the Stochastic Maximum Principle and the Dynamic Programming Principle with Hamilton-Jacobi-Bellman-Isaacs equation and Fokker-Planck-Kolmogorov equation. In the risk-neutral linear-quadratic mean-field game, we show that there is unique best responsestrategy tothemean ofthestateandprovideasimplesufficient condition of existence and uniqueness of mean-field equilibrium. This approach gives a basic insight into the solution by providing a simple explanation for the additional term in the robust or risk-sensitive Riccati equation, compared to the risk-neutral Riccati equation. Sufficient conditions for existence and uniqueness of mean-field equilibria are obtained when the horizon length and risk-sensitivity index are small enough. The method is then extended to the linear-quadratic robust mean-field games under small disturbance, formulated as a minimax mean-field game. B. Djehiche is with KTH Royal Institute of Technology, boualem@kth.se H. Tembine is with New York University, tembine@nyu.edu

1 Introduction Mean-field games [] with very large number of players have been widely studied recently. Various solution methods such as the Stochastic Maximum Principle (SMP) ([1]) and the Dynamic Programming Principle (DPP) with Hamilton-Jacobi-Bellman-Isaacs equation and Fokker-Planck-Kolmogorov equation have been proposed [, 1]. Most studies illustrated these solution methods in the linear-quadratic (LQ) game [8, 9]. In this paper, we propose a simple argument that gives the best-response strategy and the best-response cost for the LQ-game without use of the well-known solution methods (SMP and DPP). We apply a simple square completion and a Girsanov-type change of measure(whenitapplies), successfully appliedbyduncanetal. [3,4,5,15,7] in the mean-field-free case. This method is well suited to LQ games and can hardly be extended to other dynamics and performance functionals. Applying the solution methodology related to the DPP or the SMP requires involved (stochastic) analysis (e.g. in the risk-sensitive case) and convexity arguments to insure necessary and sufficient optimality criteria. We avoid all this with this method. Although, for this simple case, we note that both DPP and SMP give the same linear structure of the best-response strategies as the actual square completion method. In the LQ-mean-field game problems the state process can be modeled by a set of linear stochastic differential equations of McKean-Vlasov type and the preferences are formalized by quadratic or exponential of integral of quadratic cost functions with mean-field terms. These game problems are very popular in the literature and a detailed exposition of this theory can be found in [1, 11]. The popularity of these game problems is due to practical considerations in signal processing, pattern recognition, filtering and prediction, economics and management science [1]. To some extent, most of the risk-neutral versions of these optimal controls are analytically and numerically solvable [4, 15, 7]. On the other hand, the linear quadratic risk-sensitive setting naturally appears if the decision makers objective is to minimize the effect of a small perturbation and related variance of the optimally controlled nonlinear process. By solving a linear quadratic risk-sensitive game problem, and using the implied optimal control actions, players can significantly reduce the variance (and the cost) incurred by this perturbation. While the risk-sensitive LQ optimal control has been widely investigated in the literature starting from Jacobson 1973 [1], the mean-field version of the problem has been introduced only recently in [18,

]. In that paper, the authors established a stochastic maximum principle for risk-sensitive mean-field-type control where the key mean-field term is the mean state. The linear-quadratic risk-sensitive mean-field game has been first introduced in [19]. Contribution Our contribution can be summarized as follows. In the risk-neutral linearquadratic mean-field game, we show that there is a unique admissible bestresponse strategy to any mean-field process. The argument to derive the best response is a simple square completion method and is not based on the classical solution methods used in the literature. We derive a fixed-point equation for the mean-field equilibrium, through the state process. The fixedpoint equation is shown to have a unique solution on the entire trajectory when the length of the horizon is small enough. In the risk-sensitive linearquadratic mean-field game, we establish a risk-sensitive Riccati equation with mean-field term involved in the coefficients. However the existence a positive admissible solution requires a condition on the risk sensitivity index. There is a unique admissible best response strategy when θσ < b, where θ is the r risk sensitivity index, b is the control action coefficient drift, r is the weight on the quadratic cost and σ is the diffusion coefficient. Further, we consider robust mean-field games, formulated as a minmax games, and establish a simple connection between the risk-sensitive best and the robust best response. Interestingly, the technique does not require min-max type partial differential equation (such as the Hamilton-Jacobi-Isaacs equation). We also derive a unique admissible best-response strategy under worst-case disturbance. Finally, a risk-sensitive and robust mean-field game are considered in a similar way. Structure The remainder of the paper is organized as follows. In the next section we present linear-quadratic mean-field games with risk-neutral cost functional. In Section 3 we focus on risk-sensitive mean-field games. Section 4 examines robust mean-field games. In Section 5 we consider the robust risk-sensitive mean-field games. Section 6 concludes the paper.

Notation and Preliminaries Let T > be a fixed time horizon and (Ω,F,lF,lP) be a given filtered probability space on which a one-dimensional standard Brownian motion B = {B s } s is given, and the filtration lf = {F s, s T} is the natural filtration of B augmented by lp null sets of F. We introduce the following notation. Let k 1. L k (,T;lR) is the set of functions f : [,T] R such that f(t) k dt <. L k lf (,T;lR) is the set of lf-adapted lr-valued processes X( ) such that [ ] T E X(t) k dt <. For t [,T], ψ t := sup s t ψ(s). An admissible control strategy u is an lf-adapted and square-integrable process with values in a non-empty subset U of lr. We denote the set of all admissible controls by U: U = {u( ) L lf (,T;lR); u(t) U a.e.t [,T], lp a.s.} Given u U, consider the following controlled linear SDE. { dx(t) = [ax(t)ām(t)bu(t)]dtσdb(t), x() = x lr, (1) where, a,ā,b,σ are real numbers and m is a deterministic function such that m T <. Then the following holds Lemma 1. (1) There exists a constant C T > depending on T,a,ā,b,σ and m T such that ( [ ]) E[ x T ] C T 1 x() E u(t) dt () () If f is a Borel function from [,T] lr into lr and is of linear growth i.e. there exists a constant C > such that f(t,x) C(1 x ), for any (t,x) [,T] lr, then the process defined, for t T, by ( t E t (x) := exp f(s,x(s))db(s) 1 t ) f(s,x(s)) ds, (3)

is an lf-martingale. In particular, E[E T (x)] = 1. (4) Proof. The first assertion follows from the Gronwall and the Burkholder- Davis-Gundy s inequalities. The second assertion follows from Corollary 3.5.16 in [1]. Linear-quadratic mean-field game: riskneutral case Consider a very large number of risk-neutral decision-makers. The bestresponse of a player is the following risk-neutral linear-quadratic mean-field problem where the mean-field term is through the mean of all the states of all the players: 1 inf u( ) U E[q(T)x (T) q(t)[x(t) m(t)] ] T q(t)x (t) q(t)(x(t) m(t)) r(t)u (t)dt, subject to (5) dx(t) = [ax(t)ām(t)bu(t)]dtσdb(t), x() R, where, q(t), q(t),r(t) >, and a,ā,b,σ are real numbers and where m(t) is the mean state trajectory created by all players at an equilibrium (if it exists). Any ū( ) U satisfying the infimum in (5) is called a risk-neutral bestresponse strategy of a generic player to the mean-field term (m(t)) t. The corresponding state process, solution of (5), is denoted by x( ) := xū[m]( ). The risk-neutral mean-field equilibrium problem we are concerned with is to characterize the triplet ( x,ū,m) solution of the problem (5) and the mean state created by all the players coincide with m, i.e., which is a fixed-point equation. m(t) = E[xū[m](t)], Definition 1. A mean-field equilibrium problem is a collection ( x, ū, m) such that ū minimizes (5) and E[ x] = m.

Note that we define a mean-field equilibrium through the mean state and not the entire distribution itself..1 Determining the best-response of a player Let L(u) = 1 [q(t)x (T) q(t)[x(t) m(t)] {q(t)x q(t)(x(t) m(t)) r(t)u }dt]. Then, the risk-neutral cost functional in (5) is E[L(u)], the expected value fo L. Let f(t,x) = 1 β(t)x α(t)x γ(t), where, α,β and γ are deterministic function of time, such that f(t,x) = 1 [ q(t)x q(t)[x m(t)] ], i.e. β(t) = q(t) q(t), α(t) = q(t)m(t), γ(t) = q(t) m (T). By applying Itô s formula to f(t,x(t)), we have (6) = = f(t,x(t)) f(,x()) [f t (t,x(t))(ax(t)ām(t)bu(t))f x (t,x(t)) σ f xx(t,x(t))]dt σf x (t,x(t))db(t) { β (t)aβ(t)}x (t)( α(t)aα(t)āβ(t)m(t))x(t)dt ( γ(t)āα(t)m(t) σ β(t))bu(β(t)x(t)α(t))dt σ(β(t)x(t)α(t))db(t). (7)

We evaluate L(u) 1 β()x () α()x() γ() : L(u) 1 β()x () α()x() γ() (8) = f(t,x(t)) f(,x()) = Thus, 1 {q(t)x (t) q(t)(x(t) m(t)) r(t)u (t)}dt { β (t)aβ(t)}x (t)dt ( α(t) aα(t) āβ(t)m(t))x(t)dt ( γ(t)āα(t)m(t) σ β(t))bu(t)(β(t)x(t)α(t))dt [ q(t) q(t) x (t) q(t)m(t)x(t) q(t) m (t) r(t) u (t)] dt σ(β(t)x(t) α(t))db(t). L(u) 1 β()x () α()x() γ() (9) = { β(t) q(t) q(t) aβ(t) }x (t)dt ( α(t) aα(t) āβ(t)m(t) q(t)m(t))x(t)dt ( γ(t)āα(t)m(t) σ q(t) β(t) m (t))dt [bu(t)(β(t)x(t)α(t)) r(t) u (t)] dt σ(β(t)x(t) α(t))db(t) (1)

We now use the following simple square completion relation: Hence, bu(βxα) r u (11) = r (u br ) (βxα) b r (βxα) = r (u br ) (βxα) b β r x b αβ x b α r r. L(u) 1 β()x () α()x() γ() (1) = { β(t) aβ(t) q(t) q(t) b β (t) r(t) }x (t) ( α(t)aα(t)āβ(t)m(t) q(t)m(t) b α(t)β(t) )x(t)dt r(t) ( γ(t)āα(t)m(t) σ q(t) β(t) m (t) b α (t) r(t) )dt [ r(t) u(t) b ] r(t) (β(t)x(t)α(t)) dt σ(β(t)x(t) α(t))db(t). (13) Since the expected value of the last term (13) is zero and r >, we have E[L(u)] 1 β()x ()α()x()γ(), u U, (14) if and only if β(t)aβ(t) b r(t) β (t)q(t) q(t) =, β(t) = q(t) q(t), α(t)aα(t)(āβ(t) q(t))m(t) b α(t)β(t) =, r(t) α(t) = q(t)m(t), γ(t)āα(t)m(t) σ q(t) β(t) m (t) b r(t) α (t) =, γ(t) = q(t) m (T). (15)

with equality if and only if u(t) = b r(t) (β(t)x(t)α(t)). Since b >,r > by assumption, the risk-neutral Riccati equation in (15), has a unique positive solution β. Incorporating it into the equation satisfied by α one gets a solution α(m) by direct integration. Thus, the unique bestresponse strategy is ū(t) = b (β(t) x(t)α[m](t)), (16) r(t) where, α, β are solutions of (15). Moreover, the associated performance is E[L(ū)] = 1 β()x ()α()x()γ(). (17). Risk-Neutral Mean-Field Equilibrium We now look for an equilibrium of the risk-neutral linear-quadratic game, which we call risk-neutral mean-field equilibrium. If ( x, ū, m) is a mean-field equilibrium then m solves the following fixed-point equation: m(t) = m t [(aā)m(s) b (β(s)m(s)α[m](s))]ds =: Φ[m](t), (18) r(s) which can be rewritten as m = Φ[m]. The Banach fixed-point theorem (also known as the contraction mapping theorem) is an important tool in the theory of complete metric spaces. It guarantees the existence and uniqueness of fixed points of certain self-maps of complete metric spaces, and provides a constructive method to find those fixed points by Banach-Picard iterates. From the inequality in Lemma 1, we deduce that the operator Φ maps L k (,T;lR) into itself, and L k (,T;lR) is a complete metric space. Therefore, a simple sufficient condition for having a unique fixed-point of Φ is given by a strict contraction ofφ, which is ensured if its Lipschitz constant L Φ isstrictly less than one. A standard Gronwall inequality yields [ ] L Φ < T g 1 g 1 ( q T ǫ 1 e T a b β/r T ).

where g 1 = aā b β/r T, g 1 = b /r T ǫ 1 = āβ q T. (19) Thus, we have proved the following Proposition 1. Suppose that r >,s >,q q > then there exists a unique best response strategy ū = b (βx α), where α and β solve the r Riccati equations (15). [ ] In addition, if T g 1 g 1 ( q T ǫ 1 e T a b β/r T ) < 1 then there is a unique risk-neutral mean-field equilibrium. We now refine the equation satisfied by α by setting α = ηm. { α(t)aα(t)(āβ(t) q(t))m(t) b α(t)β(t) =, r(t) α(t) = q(t)m(t). () Then, { η [aā b r β]η b r η (āβ q) =, η(t) = q(t). (1) Closed-Form Expression of Mean-Field Equilibrium Whenever α and β does not blow-up within [,T], the explicit mean-field equilibrium is given by m(t) = m()e t (aā b r (βη))dt β(t)aβ(t) b r(t) β (t)q(t) q(t) =, β(t) = q(t) q(t), η(t)[aā b r(t) η(t) = q(t) ū = b r(t) (β(t)x(t)η(t)m(t)) β(t)]η(t) b r(t) η (t)(āβ(t) q(t)) =, 3 Linear exponential-quadratic mean-field game: risk-sensitive case () We now consider a very large number of risk-sensitive decision-makers. While the risk-sensitive Linear Quadratic Gaussian optimal control have been widely

investigated in the literature starting from Jacobson 1973[1], the mean-field game version of the problem has been introduced only recently [19]. The best-response of a player is the following risk-sensitive linear-quadratic mean-field problem where the mean-field term is through the mean of all the states of all the players: inf u( ) U Ee θl(u), subject to dx(t) = [ax(t)ām(t)bu(t)]dtσdb(t), x() R. (3) Similarlyasabove, anyū( ) U satisfying theinfimumin(3)iscalledarisksensitivebest-responseofagenericplayertothemean-fieldterm(m(t)) t. The corresponding state process, solution of (3), is denoted by x( ) := xū[m]( ). The risk-sensitive mean-field equilibrium problem we are concerned with is to characterize the triplet ( x,ū,m) solution of the problem (3) and the state created by all the players coincide with m, i.e., m(t) = E[xū[m](t)], which is a fixed-point equation. Wecompletewiththeterm 1 T θ σ (β(t)x(t)α(t)) dtθ [1 θσ ](β(t)x(t) α(t)) dt in L(u) to obtain = θ θ[l(u) 1 β()x () α()x() γ()] (4) { β (t)aβ(t) q(t) q(t) ( b r(t) 1 θσ )β (t)}x (t)dt θ ( α(t)aα(t)āβ(t)m(t) q(t)m(t)( b r(t) θσ (t))α(t)β(t))x(t)dt θ ( γ(t)āα(t)m(t) σ q(t) β(t) m (t) b α (t) 1 r(t) θσ α (t))dt [ r θ u(t) b ] r(t) (β(t)x(t)α(t)) dt θσ(β(t)x(t)α(t))db(t) 1 θ σ (β(t)x(t)α(t)) dt. (5)

Taking the exponential and the expectation yields [ ] Ee θ[l(u) 1 β()x () α()x() γ()] = E E T (x)e θ r(t) [u(t) b r(t) (β(t)x(t)α(t))] dt E[E T (x)] (6) if and only if (β,α,γ) satisfies the following system of equations, where we call the equation satisfied by β the risk-sensitive Riccati equation: β(t)aβ(t) ( b r(t) θσ )β (t)q(t) q(t) = α(t)aα(t)(āβ(t) q(t))m(t)( b r(t) θσ )α(t)β(t) = γ(t)āα(t)m(t) σ q(t) β(t) m (t) b α (t) (7) θσ r(t) α (t) =, γ(t) = q(t) m (T), with equality if and only if ū(t) = b (β(t) x(t)α(t)), (8) r(t) where, x solves the dynamics in (3) associated with ū, and ( t E t (x) := exp θσ(β(s)x(s)α(s))db(s) 1 t ) θ σ (β(s)x(s)α(s)) ds. Now, the risk-sensitive Riccati equation in β above has a unique positive solution if b r θσ >, i.e., θ < b, in which case, in view of Lemma 1, we rσ have E[E T (x)] = 1, u U. Hence Ee θl(u) e θ[1 β()x ()α()x()γ()], u U, with equality for u = ū, which constitutes the unique best response to the mean-field process. Hence, the cost functional associated to the best response ū given by (8) is Ee θl(ū) = e θ(1 β()x ()α()x()γ()). (9) Note that the presence of the term θσ in the risk-sensitive Riccati equation (7) comes from the completion of the exponential martingale. One gets the risk-neutral Riccati equation as θ vanishes.

For T and θ sufficiently small enough, the risk-sensitive mean-field game is completely solvable. Note that, however that for large θ, the solution β may blow-up in finite time and the control b (βxα) becomes non-admissible. r Thus, we have proved the following Proposition. Consider the linear-quadratic risk-sensitive mean-field game associated with (3). Suppose that r >,s >,q q > and θ < b then rσ there exists a unique best response strategy ū = b (βxα), where α and β r solves the risk-sensitive [ Riccati equation (7). ] In addition, if T g g ( q T ǫ e T a( b /rθσ )β T ) < 1 then there is a unique robust risk-sensitive mean-field equilibrium, where g = a ā ( b /r)β T, g = b /r T, ǫ = āβ q T, which are modification of g 1, g 1 and ǫ 1 given by (19). 4 Robust mean-field game: the risk-neutral case Consider a very large number of risk-neutral players and a malicious/disturbance term v. We refer to [] for an interesting application of robust mean-field games to crowd seeking problems in social networks. We assume that the disturbance term v takes values a subset V of lr and define the set of disturbance strategies V in a similar fashion as U. Next, we formulate a risk-neutral robust mean-field game as a minmax mean-field game as follows. Let L (u,v) = 1 T {q(t)x (t) q(t)(x(t) m(t)) r(t)u (t) s(t)v (t)}dt 1 [q(t)x (T) q(t)[x(t) m(t)] ] (3) be the cost functional of a generic player with strategy u under disturbance v when the mean-field process is m. The best-response of a generic player is the following risk-neutral linearquadratic robust mean-field problem under worst case disturbance is inf u( ) U sup v( ) V E[L (u,v)], subject to dx(t) = [axām(t)bu(t)cv(t)]dtσdb(t), x() R, (31)

where, q(t), q(t),r(t) >, and a,ā,b,σ are real numbers and where m(t) is the mean state trajectory created by all players at robust equilibrium (if it exists). Any (ū( ), v( )) U V satisfying the min-max in (31) is called a riskneutral robust best-response of a generic player to the mean-field process (m(t)) t under worst case disturbance v. The corresponding state process, solution of (31), is denoted by x( ) := xū, v [m]( ). The robust mean-field equilibrium problem we are concerned with is to characterize the collection ( x, ū, v, m) solution of the problem (31) and the mean state created by all the players coincide with m, i.e., which is a fixed-point equation. m(t) = E[xū, v [m](t)], Definition. A robust mean-field equilibrium problem is a collection( x, ū, v, m) such that ū minimizes (31) under worst disturbance v and E[ x] = m. At a robust mean-field equilibrium, the best-response to the mean-field under worst case disturbance, should reproduce the mean-field itself. 4.1 Determining the robust best-response of a player TheterminalcostissimilarasabovebutnowItô sformulatothefunctionf is differentbecauseofthedisturbance. Letf(t,x) = 1 β(t)x (t)α(t)x(t)γ(t). By applying Itô s formula, we have = = f(t,x(t)) f(,x()) [f t (t,x(t))(ax(t)ām(t)bu(t)cv(t))f x (t,x(t)) σ f xx(t,x(t))]dt σf x (t,x(t))db(t) { β(t) aβ(t)}x (t)( α(t)aα(t)āβ(t)m(t))x(t)dt ( γ(t)āα(t)m(t) σ β(t))(bu(t)cv(t))(β(t)x(t)α(t))dt σ(β(t)x(t) α(t))db(t). (3)

We now compute the difference L (u,v) 1 β()x () α()x() γ(). We have L (u,v) 1 β()x () α()x() γ() = { β(t) aβ(t) q(t) q(t) ( α(t)aα(t)āβ(t)m(t) q(t)m(t))x(t)dt ( γ(t)āα(t)m(t) σ We now use the following relations: β(t) q(t) m (t))dt r(t) [(bu(t)cv(t))(β(t)x(t)α(t)) u (t) s(t) }x (t) dt v (t)] dt σ(β(t)x(t)α(t))db(t). (33) bu(βxα) r u = r (u br ) (βxα) b β r x b αβ x b α r r and cv(βxα) s v = s to obtain = (v c s (βxα) ) c β L (u,v) 1 β()x () α()x() γ() { β(t) aβ(t) q(t) q(t) s x c αβ r x c α s ( b r(t) c s(t) )β (t)}x (t) ( α(t)aα(t)āβ(t)m(t) q(t)m(t)( b r(t) c s(t) )α(t)β(t))x(t)dt ( γ(t)āα(t)m(t) σ q(t) β(t) m (t)( b r(t) c s(t) )α (t))dt [ r(t) u(t) b ] r(t) (β(t)x(t)α(t)) s(t) [ v(t) c s(t) (β(t)x(t)α(t)) Therefore, ] dt σ(β(t)x(t) α(t))db(t). (34) E[L (u,v)] 1 β()x ()α()x()γ(), (u,v) U V, (35)

if and only if (α,β,γ) solves the following system of equations where we call the equation β solves, robust Riccati equation: β(t)aβ(t)( b c r(t) s(t) )β (t)q(t) q(t) =, β(t) = q(t) q(t), α(t)aα(t)(āβ(t) q(t))m(t)( b c )α(t)β(t) =, r(t) s(t) (36) α(t) = q(t)m(t), γ(t)āα(t)m(t) σ γ(t) = q(t) m (T), with equality if and only if β(t) q(t) m (t)( b r(t) c s(t) )α (t) ū(t) = b c (β(t)x(t)α(t)), v(t) = (β(t)x(t)α(t)). (37) r(t) s(t) Since b >,r >,s > by assumption, the robust Riccati equation (36) in β has a unique positive solution if b c >. Injecting β into the r s equation satisfied by α we get a solution α(m) by direct integration. Thus, there exists a unique best-response strategy whenever b > c >. r s Remark 1. Setting c /s = θσ in the robust Riccati equation (36), we obtain the risk-sensitive Riccati equation (7). 4. Risk-Neutral Robust Mean-Field Equilibrium If ( x,ū, v,m) is a robust mean-field equilibrium then m solves the following fixed-point equation: m = Φ[m], (38) where, Φ[m](t) := m t (aā)m(t )( b r(t ) c s(t ) )(β(t )m(t )α[m](t ))dt. Since the term ( b c ) is changed compared to the previous analysis and r s [ ] L Φ < T g 3T g 3T ( q T ǫ 3 e T a( b /rc /s)β T ), where, g 3 = aā( b /rc /s)β T, g 3 = ( b /rc /s) T, ǫ 3 = āβ q T. Thus, we have proved the following

Proposition 3. Consider the robust linear-quadratic game associated with (31). Suppose that r >,s >, and b > c > then there exists a unique r s best response strategy ū = b (βx α), and the worst case disturbance is r v = c (βxα). where α and β solves the robust Riccati equation (36). s [ ] In addition, if T g 3T g 3T ( q T ǫ 3 e T a( b /rc /s)β T ) < 1, then there is a unique risk-neutral robust mean-field equilibrium. 5 Linear exponential-quadratic robust meanfield game The risk-sensitive robust best-response of a player is a solution to the following minmax problem: inf u( ) U sup v( ) V Ee θl (u,v), subject to dx(t) = [ax(t)ām(t)bu(t)cv(t)]dtσdb(t), x() R. (39) Similarly as above, any (ū( ), v( )) U V satisfying the min-max in (39) is called a risk-sensitive robust best-response of a generic player to the meanfieldprocess (m(t)) t under worst casedisturbance v. The corresponding state process, solution of (39), is denoted by x( ) := xū, v [m]( ). The risk-sensitive robust mean-field equilibrium problem we are concerned with is to characterize the pair ( x,ū, v) solution of the problem (39) and the state created by all the players coincide with m, i.e., m(t) = E[xū, v [m](t)], which is a fixed-point equation. Completing with the term 1 θ σ (β(t)x(t)α(t)) dt 1 θ θσ (β(t)x(t)α(t)) dt

in L (u,v) we obtain θ[l (u,v) 1 β()x () α()x() γ()] (4) = θ θ ( α(t)aα(t)āβ(t)m(t) q(t)m(t)( b r(t) c s(t) θσ )α(t)β(t))x(t)dt θ ( γ(t)āα(t)m(t) σ q(t) β(t) m (t)( b r(t) c s(t) θ σ )α (t))dt [ r(t) θ u b ] r(t) (β(t)x(t)α(t)) s(t) [ v(t) c ] s(t) (β(t)x(t)α(t)) dt { β(t) aβ(t) q(t) q(t) θσ(β(t)x(t)α(t))db(t) 1 ( b r(t) c s(t) θ σ )β (t)}x (t)dt θ σ (β(t)x(t)α(t)) dt. (41) Taking exponential and the expectation yields [ ] 1 Ee θ[l(u,v)] E[E T (x)]expθ β()x ()α()x()γ(), (u,v) U V, where, ( E T (x) := exp θσ(β(t)x(t)α(t))db(t) 1 ) θ σ (β(t)x(t)α(t)) dt, if and only if β(t)aβ(t)( b c r(t) s(t) θσ )β (t)q(t) q(t) =, β(t) = q(t) q(t), α(t)aα(t)(āβ(t) q(t))m(t) ( b c r(t) s(t) θσ )α(t)β(t) =, α(t) = q(t)m(t), γ(t)āα(t)m(t) σ q(t) β(t) m (t)( b c θ r(t) s(t) σ )α (t), γ(t) = q(t) m (T), (4) with equality if and only if ū(t) = b c (β(t)x(t)α(t)), v(t) = (β(t)x(t)α(t)). (43) r(t) s(t)

The risk-sensitive robust Riccati equation in β above has a unique positive solution β(t) if b r c s θσ >, which is satisfied if θ andcare relatively small enough. In this case, in view of Lemma 1, E[E T (x)] = 1, which yields Ee θ[l (u,v)] e θ[1 β()x ()α()x()γ()], (u,v) U V, with equality if (u,v) = (ū, v), which is a unique best response strategy to the mean-field for b > c r s θσ >. For T and θ, c sufficiently small enough, the risk-sensitive mean-field game is completely solvable. Note however that for large θ, or c, the solution β may blow-up in finite time and the control b (βxα) becomes non-admissible. r Thus, we have proved the following Proposition 4. Suppose that r >,s >,q >, q and b > c r s θσ > then there exists a unique best response strategy ū = b (βxα) and the worst r case disturbance is v = c (βx α). where α and β solves the risk-sensitive s Riccati equations [(4). ] In addition, if T g 4 g 4 ( q ǫ 4 e T a( b /rc /sθσ )β T ) < 1 then there is a unique robust risk-sensitive mean-field equilibrium, where, g 4 = a ā ( b /r c /s)β T, g 4 = ( b /r c /s) T, ǫ 4 = āβ q T. Remark. If the mean-field term is a function of the equilibrium control action ξ(e[ū]) instead of the mean state m, the same methodology applies, except that the fixed-point equation is now different E[ū] = b r [β mα[ξ(e[ū])]]. This type of mean-field structures is observed in smart grids where the meanfield term which modulates the price of electricity is a function of aggregate demand and supply. The demand is generated by the users (residential consumers, commercial, industrial, transportation etc). It is also relevant in wireless networking where the mean-field term is the interference E[ū. x ] created by other users and ū is the transmission power of the user. See [13] for more details. 6 Concluding remarks The method described in this paper provides an alternative to both the Hamilton-Jacobi-Bellman-Isaacs equation and the robust stochastic maximum principle for the particular case of LQ-games. The method exhibits

the best-response strategy of the player and the best-response cost directly in the problem solution. The mean-field equilibrium is then formulated as a fixed-point of the best-response. Sufficient conditions for existence and uniqueness of mean-field equilibria are provided. The approach can be extended to more general LQ-games in several ways: (i) matrix form, (ii) games with anomalous diffusions, (iii) partially observable games, (iv) robust cooperative mean-field-type games. We leave these questions for future work. We note that this method can hardly be extended to games with arbitrary nonlinear dynamics and cost performance. References [1] Dockner E. J., Jorgensen S., Van Long N., and Sorger G.: Differential Games in Economics and Management Science, Cambridge University Press, Nov 16, - Business and Economics - 38 pages [] Lasry J. M., Lions,P. L. (7). Mean field games. Japanese Journal of Mathematics, Volume, Issue 1, pp 9-6 [3] Duncan T. E. and Pasik-Duncan B., Linear-quadratic fractional Gaussian control, SIAM J. Control Optim. 51 (13), 464-4619. [4] Duncan T. E. and Pasik-Duncan B., Linear-exponential-quadratic control for stochastic equations in a Hilbert space, Dynamic Systems and Applications 1 (1), 47-416. 111. [5] Duncan T. E., Linear-exponential-quadratic Gaussian control, IEEE Trans. Autom. Control, 58 (13), 91-911. [6] Duncan T. E. and Pasik-Duncan B., A direct method for solving stochastic control problems, Commun. Info. Systems, (special issue for H. F. Chen), 1 (1), 1-14. [7] Duncan T. E., Linear-quadratic stochastic differential games with general noise processes, Models and Methods in Economics and Management Science: Essays in Honor of Charles S. Tapiero, (eds. F. El Ouardighi and K. Kogan), Operations Research and Management Series, Springer Intern. Publishing, Switzerland, Vol. 198, 14, 17-6.

[8] Bardi M., Explicit solutions of some linear-quadratic mean field games, Netw. Heterog. Media 7 (1), 43-61. [9] Bardi Martino and Priuli, Fabio S. Linear-Quadratic N-person and Mean-Field Games with Ergodic Cost, July 14. [1] Bensoussan A., Frehse J., Yam S.C.P.: Mean Field Games and Mean Field Type Control Theory, SpringerBriefs in Mathematics, Springer, 13. [11] Bensoussan A., Explicit solutions of linear quadratic differential games, Chapter in Stochastic Processes, Optimization, and Control Theory: Applications in Financial Engineering, Queueing Networks, and Manufacturing Systems, International Series in Operations Research and Management Science Volume 94, 6, pp 19-34. [1] Jacobson D. H: Optimal Stochastic Linear Systems with Exponential Performance Criteria and Their Relation to Deterministic Differential Games, IEEE Transactions on Automatic Control, 18(), 14-131, 1973. [13] Tembine H., Energy-constrained mean-field games in wireless networks, Strategic Behavior and the Environment, Vol 4, Issue, 14, pp 187-11. [14] Whittle P. : Risk-sensitive Optimal Control. John Wiley and Sons, New York, 199. [15] Lukes D. L. and Russell D. L.. A Global Theory of Linear-Quadratic Differential Games. Journal of Mathematical Analysis and Applications, 33:96-13, 1971. [16] Engwerda J. C.. On the open-loop Nash equilibrium in LQ-games. Journal of Economic Dynamics and Control. :79-76, 1998. [17] Lim A E B, Zhou X. A new risk-sensitive maximum principle. IEEE Trans Autom Cont, 5, 5(7): 958-966. [18] Djehiche B., Tembine H and Tempone,R: A Stochastic Maximum Principle for Risk-Sensitive Mean-Field-Type Control, 14. Preprint. [19] Tembine H. and Zhu Q. and Basar T., Risk-sensitive mean-field games, IEEE Transactions on Automatic Control, 59(4): 835-85, 14.

[] Tembine H, Bauso D., Basar T.: Robust linear quadratic mean-field games in crowd-seeking social networks. CDC 13: 3134-3139. [1] Karatzas I. and Shreve S.E.: Brownian Motion and Stochastic Calculus. nd ed., Springer, New York, 1991. [] Tembine H., Distrobuted Strategic Learning for Wireless Engineers, CRC Press, Taylor & Francis, 1, 496 pages.