Mean Field Game Theory for Systems with Major and Minor Agents: Nonlinear Theory and Mean Field Estimation
1 Mean Field Game Theory for Systems with Major and Minor Agents: Nonlinear Theory and Mean Field Estimation
Peter E. Caines, McGill University, Montreal, Canada
2nd Workshop on Mean-Field Games and Related Topics, Department of Mathematics, University of Padova, September 2013
Joint work with Mojtaba Nourian and Arman C. Kizilkale
2 Background: Mean Field Game (MFG) Theory
Modeling Framework and Central Problem of MFG Theory (Huang, Malhamé, PEC '03, '06, '07; Lasry-Lions '06, '07):
Framework: games over time with a large number of stochastic dynamical agents such that:
- Each agent interacts with a mass effect (e.g. the average) of the other agents via couplings in their individual cost functions and individual dynamics
- Each agent is minor in the sense that, asymptotically as the population size goes to infinity, it has a negligible influence on the overall system, while the mass effect on the agent remains significant
Problem: establish the existence and uniqueness of equilibria and the corresponding strategies of the agents
3 Background: Mean Field Game (MFG) Theory
Solution Concepts for MFG Theory:
- The existence of Nash equilibria between the individual agents and the mass in the infinite population limit, where (a) the individual strategy of each agent is a best response to the mass effect, and (b) the set of strategies collectively replicates that mass effect
- The ε-Nash Approximation Property: if the agents in a finite population system apply the infinite population equilibrium strategies, then an approximation to the infinite population equilibrium results
4 Non-linear Major-Minor Mean Field Systems
5 MFG Theory Involving Major-Minor Agents
Extension of the LQG MFG model for major and minor agents (Huang 2010, Nguyen-Huang 2011) to the case of nonlinear dynamical systems:
- Dynamic game models involving nonlinear stochastic systems with (i) a major agent, and (ii) a large population of minor agents
- Partially observed systems become meaningful, and hence estimation of the major agent's state and of the mean field becomes a meaningful problem
Motivation and Applications:
- Economic and social models with both minor and massive agents
- Power markets with large consumers and large utilities, together with many domestic consumers and generators using smart meters
6 MFG Nonlinear Major-Minor Agent Formulation
Problem Formulation. Notation: subscript 0 for the major agent A_0 and an integer-valued subscript for the minor agents {A_i : 1 ≤ i ≤ N}. The states of A_0 and A_i are R^n-valued and denoted z_0^N(t) and z_i^N(t).
Dynamics of the Major and Minor Agents:

dz_0^N(t) = (1/N) Σ_{j=1}^N f_0(t, z_0^N(t), u_0^N(t), z_j^N(t)) dt
          + (1/N) Σ_{j=1}^N σ_0(t, z_0^N(t), z_j^N(t)) dw_0(t),   z_0^N(0) = z_0(0),  0 ≤ t ≤ T,

dz_i^N(t) = (1/N) Σ_{j=1}^N f(t, z_i^N(t), u_i^N(t), z_j^N(t)) dt
          + (1/N) Σ_{j=1}^N σ(t, z_i^N(t), z_j^N(t)) dw_i(t),   z_i^N(0) = z_i(0),  1 ≤ i ≤ N.
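As a numerical illustration of these coupled dynamics, the following is a minimal Euler-Maruyama sketch of the finite-population system. The coupling functions below (tanh drifts, constant diffusions) are hypothetical stand-ins for f_0, f, σ_0, σ chosen only to satisfy the boundedness and Lipschitz conditions of (A4), and the controls are set to zero.

```python
import math
import random

def simulate_mm(N=50, T=1.0, steps=100, seed=0):
    """Euler-Maruyama simulation of the finite-population major-minor
    dynamics.  The coefficients here are illustrative stand-ins, not
    those of any particular application."""
    rng = random.Random(seed)
    dt = T / steps
    sq = math.sqrt(dt)
    # illustrative bounded, Lipschitz coefficients, as required by (A4)
    f0 = lambda t, x0, u0, xj: math.tanh(xj - x0)        # major drift coupling
    f = lambda t, xi, ui, xj: math.tanh(xj - xi) + ui    # minor drift coupling
    sig0 = lambda t, x0, xj: 0.1
    sig = lambda t, xi, xj: 0.1
    z0 = 1.0                                      # major agent state
    z = [rng.gauss(0.0, 1.0) for _ in range(N)]   # minor agent states
    for k in range(steps):
        t = k * dt
        dw0 = sq * rng.gauss(0.0, 1.0)
        # couplings averaged over the whole minor population
        drift0 = sum(f0(t, z0, 0.0, zj) for zj in z) / N
        diff0 = sum(sig0(t, z0, zj) for zj in z) / N
        znew = []
        for zi in z:
            dwi = sq * rng.gauss(0.0, 1.0)
            drift = sum(f(t, zi, 0.0, zj) for zj in z) / N
            diff = sum(sig(t, zi, zj) for zj in z) / N
            znew.append(zi + drift * dt + diff * dwi)
        z0 = z0 + drift0 * dt + diff0 * dw0
        z = znew
    return z0, z

z0_T, z_T = simulate_mm()
```

Note the O(N^2) cost per step of the averaged couplings; this is exactly the population-size burden that the mean field limit removes.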
7 MFG Nonlinear Major-Minor Agent Formulation
Cost Functions for Major and Minor Agents. The objective of each agent is to minimize its finite time horizon cost function given by

J_0^N(u_0^N; u_{-0}^N) := E ∫_0^T (1/N) Σ_{j=1}^N L_0[t, z_0^N(t), u_0^N(t), z_j^N(t)] dt,
J_i^N(u_i^N; u_{-i}^N) := E ∫_0^T (1/N) Σ_{j=1}^N L[t, z_i^N(t), u_i^N(t), z_0^N(t), z_j^N(t)] dt.

The major agent has a non-negligible influence on the mean field (mass) behaviour of the minor agents due to the presence of z_0^N in the cost function of each minor agent. A consequence is that the mean field is no longer a deterministic function of time.
Notation:
(Ω, F, {F_t}_{t≥0}, P): a complete filtered probability space
F_t := σ{z_j(s), w_j(s) : 0 ≤ j ≤ N, 0 ≤ s ≤ t}
F_t^{w_0} := σ{z_0(0), w_0(s) : 0 ≤ s ≤ t}
8 Assumptions
Assumptions: let the empirical distribution of the N minor agents' initial states be defined by F^N(x) := (1/N) Σ_{i=1}^N 1_{{z_i(0) < x}}.
(A1) The initial states {z_j(0) : 0 ≤ j ≤ N} are F_0-adapted random variables, mutually independent and independent of all Brownian motions, and there exists a constant k independent of N such that sup_{0≤j≤N} E|z_j(0)|² ≤ k < ∞.
(A2) {F^N : N ≥ 1} converges weakly to the probability distribution F.
(A3) U_0 and U are compact metric spaces.
(A4) f_0[t, x, u, y], σ_0[t, x, y], f[t, x, u, y] and σ[t, x, y] are continuous and bounded with respect to all their parameters, and Lipschitz continuous in (x, y). In addition, their first order derivatives (w.r.t. x) are all uniformly continuous and bounded with respect to all their parameters, and Lipschitz continuous in y.
(A5) f_0[t, x, u, y] and f[t, x, u, y] are Lipschitz continuous in u.
9 Assumptions
Assumptions (ctd):
(A6) L_0[t, x, u, y] and L[t, x, u, y, z] are continuous and bounded with respect to all their parameters, and Lipschitz continuous in (x, y, z). In addition, their first order derivatives (w.r.t. x) are all uniformly continuous and bounded with respect to all their parameters, and Lipschitz continuous in (y, z).
(A7) (Non-degeneracy Assumption) There exists a positive constant α such that

σ_0[t, x, y] σ_0^T[t, x, y] ≥ αI,   σ[t, x, y] σ^T[t, x, y] ≥ αI,   for all (t, x, y).

Otherwise, a notion of viscosity-like solutions seems necessary.
10 McKean-Vlasov Approximation for MFG Analysis
Closed-Loop Major and Minor Agent Dynamics: assume φ_0(ω, t, x) ∈ L²_{F_t^{w_0}}([0, T]; U_0) and φ(ω, t, x) ∈ L²_{F_t^{w_0}}([0, T]; U) are two arbitrary F_t^{w_0}-measurable stochastic processes, Lipschitz continuous in x, constituting the major and minor agent control laws. Then:

dz_0^N(t) = (1/N) Σ_{j=1}^N f_0(t, z_0^N(t), φ_0(t, z_0^N(t)), z_j^N(t)) dt
          + (1/N) Σ_{j=1}^N σ_0(t, z_0^N(t), z_j^N(t)) dw_0(t),   z_0^N(0) = z_0(0),  0 ≤ t ≤ T,

dz_i^N(t) = (1/N) Σ_{j=1}^N f(t, z_i^N(t), φ(t, z_i^N(t), z_0^N(t)), z_j^N(t)) dt
          + (1/N) Σ_{j=1}^N σ(t, z_i^N(t), z_j^N(t)) dw_i(t),   z_i^N(0) = z_i(0),  1 ≤ i ≤ N.
11 McKean-Vlasov Approximation for MFG Analysis
Major-Minor Agent McKean-Vlasov System: for an arbitrary function g and a probability distribution µ_t on R^n, set

g[t, z, ψ, µ_t] = ∫_{R^n} g(t, z, ψ, x) µ_t(dx).

The pair of infinite population McKean-Vlasov (MV) SDE systems corresponding to the collection of finite population systems above is given by:

dz_0(t) = f_0[t, z_0(t), φ_0(t, z_0(t)), µ_t] dt + σ_0[t, z_0(t), µ_t] dw_0(t),
dz(t) = f[t, z(t), φ(t, z(t), z_0(t)), µ_t] dt + σ[t, z(t), µ_t] dw(t),   0 ≤ t ≤ T,

with initial conditions (z_0(0), z(0)).
In using the MV system it is assumed that the behaviour of an infinite population of parameter-uniform minor agents can be modelled by the collection of sample paths of agents with independent initial conditions and independent Brownian sample paths.
In the above MV system, (z_0(·), z(·), µ_(·)) is a consistent solution if (z_0(·), z(·)) is a solution to the above SDE system and µ_t is the conditional law of z(t) given F_t^{w_0}, i.e. µ_t := L(z(t) | F_t^{w_0}).
12 McKean-Vlasov Approximation for MFG Analysis
We shall use the notation

dz_0(t) = f_0[t, z_0(t), φ_0(t, z_0(t)), µ_t] dt + σ_0[t, z_0(t), µ_t] dw_0(t),   0 ≤ t ≤ T,
dz_i(t) = f[t, z_i(t), φ(t, z_i(t), z_0(t)), µ_t] dt + σ[t, z_i(t), µ_t] dw_i(t),   1 ≤ i ≤ N,

with initial conditions z_j(0), 0 ≤ j ≤ N, to describe N independent samples of the MV SDE system.
Theorem (McKean-Vlasov Convergence Result). Assume (A1) and (A3)-(A5) hold. Then unique solutions exist for the finite and MV SDE equation schemes, and

sup_{0≤j≤N} sup_{0≤t≤T} E |ẑ_j^N(t) − z_j(t)| = O(1/√N),

where the right hand side may depend upon the terminal time T.
The proof is based on the Cauchy-Schwarz inequality, Gronwall's lemma and the conditional independence of the minor agents given F_t^{w_0}.
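The O(1/√N) rate can be illustrated numerically on a toy linear special case with no major agent, where each minor agent drifts toward the population mean and the MV mean field is identically zero. The finite system and its N independent McKean-Vlasov copies are coupled through the same Brownian increments and initial conditions; the model and all parameter values below are illustrative assumptions, not taken from the slides.

```python
import math
import random

def coupled_vs_mckean_vlasov(N, runs=50, T=1.0, steps=50, sigma=0.5, seed=1):
    """For the linear interaction dz_i = (mean - z_i) dt + sigma dw_i,
    couple the N-agent system and N independent McKean-Vlasov copies
    (whose mean field is m(t) = 0) through the same Brownian increments,
    and return the time-sup of the average |z_i^N(t) - z_i(t)|,
    averaged over Monte Carlo runs."""
    rng = random.Random(seed)
    dt = T / steps
    sq = math.sqrt(dt)
    err = 0.0
    for _ in range(runs):
        zN = [rng.gauss(0.0, 1.0) for _ in range(N)]   # finite system
        zMV = list(zN)                                 # MV copies, same start
        worst = 0.0
        for _ in range(steps):
            mean = sum(zN) / N
            dw = [sq * rng.gauss(0.0, 1.0) for _ in range(N)]
            zN = [z + (mean - z) * dt + sigma * dw[i] for i, z in enumerate(zN)]
            zMV = [z + (0.0 - z) * dt + sigma * dw[i] for i, z in enumerate(zMV)]
            gap = sum(abs(a - b) for a, b in zip(zN, zMV)) / N
            worst = max(worst, gap)
        err += worst
    return err / runs

err_small = coupled_vs_mckean_vlasov(N=10)
err_large = coupled_vs_mckean_vlasov(N=160)
```

With the larger population the coupling gap should shrink, reflecting the O(1/√N) bound of the theorem.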
13 Distinct Feature of the Major-Minor MFG Theory
The non-standard nature of the SOCPs is due to the fact that the minor agents are optimizing with respect to the future stochastic evolution of the major agent's state, which is partly a result of that agent's future best response control actions. Hence the mean field becomes stochastic.
This feature vanishes in the non-game-theoretic setting of one controller with one cost function defined on the trajectories of all the system components (the classical SOCP); moreover, it also vanishes in the infinite population limit of the standard MFG models with no major agent.
The non-standard feature of the SOCPs here gives rise to the analysis of systems with stochastic parameters, and hence BSDEs enter the analysis.
14 An SOCP with Random Coefficients (after Peng '92)
Let (W(t))_{t≥0} and (B(t))_{t≥0} be mutually independent standard Brownian motions in R^m. Denote

F_t^{W,B} := σ{W(s), B(s) : s ≤ t},   F_t^W := σ{W(s) : s ≤ t},
U := { u(·) ∈ U : u(t) is adapted to the σ-field F_t^{W,B} and E ∫_0^T |u(t)|² dt < ∞ }.

Dynamics and optimal control problem for a single agent:

dz(t) = f[t, z, u] dt + σ[t, z] dW(t) + ς[t, z] dB(t),   0 ≤ t ≤ T,
inf_{u∈U} J(u) := inf_{u∈U} E [ ∫_0^T L[t, z(t), u(t)] dt ],

where the coefficients f, σ, ς and L are F_t^W-adapted stochastic processes.
The value function φ(·,·) is defined to be the F_t^W-adapted process

φ(t, x_t) = inf_{u∈U} E_{F_t^W} ∫_t^T L[s, x(s), u(s)] ds,

where x_t is the initial condition for the process x at time t.
15 Solution to the Optimal Control Problem with Random Coefficients
Let u^o(·) be the optimal control with corresponding closed-loop solution x(·). By the Principle of Optimality, the process

ζ(t) := φ(t, x(t)) + ∫_0^t L[s, x(s), u^o(s, x(s))] ds

is an {F_t^W}_{0≤t≤T}-martingale. Next, by the martingale representation theorem, along the optimal trajectory there exists an F_t^W-adapted process ψ(·, x(·)) such that

φ(t, x(t)) = ∫_t^T L[s, x(s), u^o(s, x(s))] ds − ∫_t^T ψ^T(s, x(s)) dW(s)
           =: ∫_t^T Γ(s, x(s)) ds − ∫_t^T ψ^T(s, x(s)) dW(s),   t ∈ [0, T].
16 Extended Itô-Kunita Formula
Theorem (Extended Itô-Kunita formula, after Peng '92). Let φ(t, x) be a stochastic process represented by

dφ(t, x) = Γ(t, x) dt + Σ_{k=1}^m ψ_k(t, x) dW_k(t),   (t, x) ∈ [0, T] × R^n,

where Γ(t, x) and ψ_k(t, x), 1 ≤ k ≤ m, are F_t^W-adapted stochastic processes. Let x(·) = (x^1(·), …, x^n(·)) be a continuous semimartingale of the form

dx^i(t) = f_i(t) dt + Σ_{k=1}^m σ_{ik}(t) dW_k(t) + Σ_{k=1}^m ς_{ik}(t) dB_k(t),   1 ≤ i ≤ n,

where f_i, σ_i = (σ_{i1}, …, σ_{im}) and ς_i = (ς_{i1}, …, ς_{im}), 1 ≤ i ≤ n, are F_t^W-adapted stochastic processes.
17 Extended Itô-Kunita Formula
Then the composition map φ(·, x(·)) is also a continuous semimartingale, which has the form

dφ(t, x(t)) = Γ(t, x(t)) dt + Σ_{k=1}^m ψ_k(t, x(t)) dW_k(t) + Σ_{i=1}^n ∂_{x_i}φ(t, x(t)) f_i(t) dt
  + Σ_{i=1}^n Σ_{k=1}^m ∂_{x_i}φ(t, x(t)) σ_{ik}(t) dW_k(t) + Σ_{i=1}^n Σ_{k=1}^m ∂_{x_i}φ(t, x(t)) ς_{ik}(t) dB_k(t)
  + Σ_{i=1}^n Σ_{k=1}^m ∂_{x_i}ψ_k(t, x(t)) σ_{ik}(t) dt
  + (1/2) Σ_{i,j=1}^n Σ_{k=1}^m ∂²_{x_i x_j}φ(t, x(t)) σ_{ik}(t) σ_{jk}(t) dt
  + (1/2) Σ_{i,j=1}^n Σ_{k=1}^m ∂²_{x_i x_j}φ(t, x(t)) ς_{ik}(t) ς_{jk}(t) dt.
18 A Stochastic Hamilton-Jacobi-Bellman (SHJB) Equation
Using the extended Itô-Kunita formula and the Principle of Optimality, it may be shown that the pair (φ(s, x), ψ(s, x)) satisfies the following backward in time SHJB equation:

−dφ(t, ω, x) = [ H[t, ω, x, D_x φ(t, ω, x)] + ⟨σ[t, ω, x], D_x ψ(t, ω, x)⟩ + (1/2) Tr( a[t, ω, x] D²_{xx} φ(t, ω, x) ) ] dt − ψ^T(t, ω, x) dW(t, ω),
φ(T, x) = 0,   in [0, T] × R^n,

where a[t, ω, x] := σ[t, ω, x] σ^T[t, ω, x] + ς[t, ω, x] ς^T[t, ω, x], and the stochastic Hamiltonian H is given by

H[t, ω, z, p] := inf_{u∈U} { ⟨f[t, ω, z, u], p⟩ + L[t, ω, z, u] }.
19 Unique Solution of the SHJB Equation
Assumptions:
(H1) (Continuity Assumptions 1) f[t, x, u] and L[t, x, u] are a.s. continuous in (x, u) for each t, a.s. continuous in t for each (x, u), f[t, 0, 0] ∈ L²_{F_t}([0, T]; R^n) and L[t, 0, 0] ∈ L²_{F_t}([0, T]; R_+). In addition, they and all their first derivatives (w.r.t. x) are a.s. continuous and bounded.
(H2) (Continuity Assumptions 2) σ[t, x] and ς[t, x] are a.s. continuous in x for each t, a.s. continuous in t for each x, and σ[t, 0], ς[t, 0] ∈ L²_{F_t}([0, T]; R^{n×m}). In addition, they and all their first derivatives (w.r.t. x) are a.s. continuous and bounded.
(H3) (Non-degeneracy Assumption) There exist non-negative constants α_1 and α_2, α_1 + α_2 > 0, such that

σ[t, ω, x] σ^T[t, ω, x] ≥ α_1 I,   ς[t, ω, x] ς^T[t, ω, x] ≥ α_2 I,   a.s., for all (t, ω, x).

Theorem (Peng '92). Assume (H1)-(H3) hold. Then the SHJB equation has a unique forward in time F_t^W-adapted solution pair (φ(t, ω, x), ψ(t, ω, x)) ∈ ( L²_{F_t}([0, T]; R), L²_{F_t}([0, T]; R^m) ).
20 Best Response Control Action and Verification Theorem
The optimal control process

u^o(t, ω, x) := arg inf_{u∈U} H^u[t, ω, x, D_x φ(t, ω, x), u] = arg inf_{u∈U} { ⟨f[t, ω, x, u], D_x φ(t, ω, x)⟩ + L[t, ω, x, u] }

is a forward in time F_t^W-adapted process for any fixed x.
By a verification theorem approach, Peng showed that if a unique solution (φ, ψ)(t, x) to the SHJB equation exists, and if it satisfies
(i) for each t, (φ, ψ)(t, x) is a C²(R^n) map,
(ii) for each x, (φ, ψ)(t, x) and (D_x φ, D²_{xx} φ, D_x ψ)(t, x) are continuous F_t^W-adapted stochastic processes,
then φ(t, x) coincides with the value function of the optimal control problem.
21 The MFG Consistency Condition
The functional dependence loop of observation and control of the major and minor agents yields the following proof iteration loop, initiated with a nominal measure µ_t(ω):

µ_(·)(ω) → [M-SHJB] → (φ_0(·, ω, x), ψ_0(·, ω, x)) → [M-SBR] → u_0^o(·, ω, x) → [M-SMV] → z_0^o(·, ω)
        → [m-SHJB] → (φ(·, ω, x), ψ(·, ω, x)) → [m-SBR] → u^o(·, ω, x) → [m-SMV] → µ̂_(·)(ω)

Closing the Loop: by substituting u^o into the generic minor agent's dynamics we get the SMV dynamics

dz^o(t, ω) = f[t, z^o(t, ω), u^o(t, ω, z^o), µ_t(ω)] dt + σ[t, z^o(t, ω), µ_t(ω)] dw(t),   z^o(0) = z(0),

where f and σ are random processes via µ and u^o, which depend on the Brownian motion w_0 of the major agent. Let µ̂_t(ω) be the conditional law of z^o(t) under the control u^o given F_t^{w_0}. Then:
The MFG or Nash certainty equivalence (NCE) consistency condition: the measure and control mapping loop for the MM MFG equation schema is closed if µ̂_t(ω) = µ_t(ω) a.s., 0 ≤ t ≤ T, which constitutes the measure-valued part of the solution.
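The consistency loop can be sketched numerically in a scalar, deterministic LQ special case (no major agent, no noise), where the Riccati and offset equations of the later LQG-MFG slides reduce to ODEs (with c = η = λ_1 = 0) and the loop becomes a Picard iteration on the mean field path. The model and all parameter values below are illustrative assumptions.

```python
def solve_mfg_fixed_point(a=0.0, b=1.0, r=1.0, lam=0.5, z_bar0=1.0,
                          T=1.0, steps=200, iters=25):
    """Picard iteration for the MFG consistency loop on a scalar,
    deterministic LQ toy model: each agent tracks lam * (population mean).
    Returns the converged mean field path and the per-iteration updates."""
    dt = T / steps
    # backward Riccati: dPi/dt = (b^2/r) Pi^2 - 2 a Pi - 1, Pi(T) = 0
    Pi = [0.0] * (steps + 1)
    for k in range(steps, 0, -1):
        Pi[k - 1] = Pi[k] - dt * ((b * b / r) * Pi[k] ** 2 - 2 * a * Pi[k] - 1.0)
    m = [z_bar0] * (steps + 1)      # initial guess for the mean field path
    diffs = []
    for _ in range(iters):
        # backward offset: ds/dt = -(a - (b^2/r) Pi) s + lam * m, s(T) = 0
        s = [0.0] * (steps + 1)
        for k in range(steps, 0, -1):
            s[k - 1] = s[k] + dt * ((a - (b * b / r) * Pi[k]) * s[k] - lam * m[k])
        # forward mean field under the best response u = -(b/r)(Pi z + s)
        z = [z_bar0] + [0.0] * steps
        for k in range(steps):
            z[k + 1] = z[k] + dt * ((a - (b * b / r) * Pi[k]) * z[k]
                                    - (b * b / r) * s[k])
        diffs.append(max(abs(x - y) for x, y in zip(z, m)))
        m = z                       # close the loop with the updated mean field
    return m, diffs

m_path, diffs = solve_mfg_fixed_point()
```

Here the "close the loop" step is the deterministic analogue of requiring µ̂_t = µ_t; with lam < 1 the map is a contraction and the updates shrink geometrically.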
22 Major-Minor Agent Stochastic MFG System
Summary of the Major Agent's Stochastic MFG (SMFG) System:

[MFG-SHJB]  −dφ_0(t, ω, x) = [ inf_{u∈U_0} H_0[t, ω, x, u, D_x φ_0(t, ω, x)] + ⟨σ_0[t, x, µ_t(ω)], D_x ψ_0(t, ω, x)⟩ + (1/2) Tr( a_0[t, ω, x] D²_{xx} φ_0(t, ω, x) ) ] dt − ψ_0^T(t, ω, x) dw_0(t, ω),   φ_0(T, x) = 0

[MFG-SBR]  u_0^o(t, ω, x) = arg inf_{u∈U_0} H_0[t, ω, x, u, D_x φ_0(t, ω, x)]

[MFG-SMV]  dz_0^o(t, ω) = f_0[t, z_0^o(t, ω), u_0^o(t, ω, z_0^o), µ_t(ω)] dt + σ_0[t, z_0^o(t, ω), µ_t(ω)] dw_0(t, ω),   z_0^o(0) = z_0(0)

where a_0[t, ω, x] := σ_0[t, x, µ_t(ω)] σ_0^T[t, x, µ_t(ω)], and the stochastic Hamiltonian H_0 is

H_0[t, ω, x, u, p] := ⟨f_0[t, x, u, µ_t(ω)], p⟩ + L_0[t, x, u, µ_t(ω)].
23 Major-Minor Agent Stochastic MFG System
Summary of the Minor Agents' SMFG System:

[MFG-SHJB]  −dφ(t, ω, x) = [ inf_{u∈U} H[t, ω, x, u, D_x φ(t, ω, x)] + (1/2) Tr( a[t, ω, x] D²_{xx} φ(t, ω, x) ) ] dt − ψ^T(t, ω, x) dw_0(t, ω),   φ(T, x) = 0

[MFG-SBR]  u^o(t, ω, x) = arg inf_{u∈U} H[t, ω, x, u, D_x φ(t, ω, x)]

[MFG-SMV]  dz^o(t, ω) = f[t, z^o(t, ω), u^o(t, ω, z^o), µ_t(ω)] dt + σ[t, z^o(t, ω), µ_t(ω)] dw(t)

where a[t, ω, x] := σ[t, x, µ_t(ω)] σ^T[t, x, µ_t(ω)], and the stochastic Hamiltonian H is

H[t, ω, x, u, p] := ⟨f[t, x, u, µ_t(ω)], p⟩ + L[t, x, u, z_0^o(t, ω), µ_t(ω)].
24 Major-Minor Agent Stochastic MFG System
Solution to the Major-Minor Agent SMFG System: the solution of the major-minor SMFG system consists of the 8-tuple of F_t^{w_0}-adapted random processes

( φ_0(t, ω, x), ψ_0(t, ω, x), u_0^o(t, ω, x), z_0^o(t, ω), φ(t, ω, x), ψ(t, ω, x), u^o(t, ω, x), z^o(t, ω) ),

where z^o(t, ω) generates the random measure µ_t(ω).
The solution to the major-minor SMFG system is a stochastic mean field, in contrast to the deterministic mean field of the standard MFG problems (HCM '03, HMC '06, LL '06).
25 Analysis of the MM-SMFG System
Existence and uniqueness of solutions to the Major and Minor (MM) agents' SMFG system: a fixed point argument with random parameters in the space of stochastic probability measures, over the mapping loop

µ_(·)(ω) → [M-SHJB] → (φ_0(·, ω, x), ψ_0(·, ω, x)) → [M-SBR] → u_0^o(·, ω, x) → [M-SMV] → z_0^o(·, ω)
        → [m-SHJB] → (φ(·, ω, x), ψ(·, ω, x)) → [m-SBR] → u^o(·, ω, x) → [m-SMV] → µ̂_(·)(ω)

Theorem (Existence and Uniqueness of Solutions; Nourian and PEC, SIAM Journal on Control and Optimization, 2013). Under technical conditions, including a contraction gain condition, there exists a unique fixed point of the map Γ, and hence a unique solution to the major and minor agents' MM-SMFG system.
26 ɛ-nash Equilibrium of the MFG Control Laws Given ɛ > 0, the set of controls {u o j; 0 j N} generates an ɛ-nash equilibrium w.r.t. the costs J N j, 1 j N} if, for each j, J N j (u 0 j, u 0 j) ɛ inf u j U j J N j (u j, u 0 j) J N j (u 0 j, u 0 j). Theorem (Nourian, PEC, SIAM Jnl Control and Optimization, 2013) Subject to technical conditions, there exists a unique solution to the MM-MFG system such that the set of infinite population MF best response control processes in a finite N population system of minor agents (u o 0,, u o N ) generates an ɛ N -Nash equilibrium where ɛ N = O(1/ N) Agent y is a maximizer Agent x is a minimizer y x 0 1 2
27 Major-Minor Agent LQG-MFG Systems
28 Major-Minor Agent LQG-MFG Systems
Application of the nonlinear theory to MM LQG-MFG systems yields retrieval of the MM MFG-LQG equations of [Nguyen-Huang '11] (given here in the uniform agent class case).
Dynamics:

A_0:  dz_0^N(t) = ( a_0 z_0^N(t) + b_0 u_0^N(t) + c_0 z^{(N)}(t) ) dt + σ_0 dw_0(t)
A_i:  dz_i^N(t) = ( a z_i^N(t) + b u_i^N(t) + c z^{(N)}(t) ) dt + σ dw_i(t),   1 ≤ i ≤ N,

where z^{(N)}(·) := (1/N) Σ_{i=1}^N z_i^N(·) is the average state of the minor agents.
Costs:

A_0:  J_0^N(u_0^N, u_{-0}^N) = E ∫_0^T [ ( z_0^N(t) − (λ_0 z^{(N)}(t) + η_0) )² + r_0 ( u_0^N(t) )² ] dt
A_i:  J_i^N(u_i^N, u_{-i}^N) = E ∫_0^T [ ( z_i^N(t) − (λ z^{(N)}(t) + λ_1 z_0^N(t) + η) )² + r ( u_i^N(t) )² ] dt,

where r_0, r > 0.
29 Major-Minor Agent LQG-MFG Systems
The Major Agent's LQG-MFG System (z̄^o(·) denotes the minor agents' mean field path):

[Backward SDE]  −ds_0(t) = [ ( a_0 − (b_0²/r_0) Π_0(t) ) s_0(t) − η_0 + ( c_0 Π_0(t) − λ_0 ) z̄^o(t) ] dt − q_0(t) dw_0(t),   s_0(T) = 0
[Best Response]  u_0^o(t) := −(b_0/r_0) ( Π_0(t) z_0^o(t) + s_0(t) )
[Forward SDE]  dz_0^o(t) = ( a_0 z_0^o(t) + b_0 u_0^o(t) + c_0 z̄^o(t) ) dt + σ_0 dw_0(t),   z_0^o(0) = z_0(0)

where Π_0(·) ≥ 0 is the unique solution of the Riccati equation

∂_t Π_0(t) + 2 a_0 Π_0(t) − (b_0²/r_0) Π_0²(t) + 1 = 0,   Π_0(T) = 0.

The Minor Agents' LQG-MFG System:

[Backward SDE]  −ds_i(t) = [ ( a − (b²/r) Π(t) ) s_i(t) − η − λ_1 z_0^o(t) + ( c Π(t) − λ ) z̄^o(t) ] dt − q_i(t) dw_0(t),   s_i(T) = 0
[Best Response]  u_i^o(t) := −(b/r) ( Π(t) z_i^o(t) + s_i(t) )
[Forward SDE]  dz_i^o(t) = ( a z_i^o(t) + b u_i^o(t) + c z̄^o(t) ) dt + σ dw_i(t),   z_i^o(0) = z_i(0)

where Π(·) ≥ 0 is the unique solution of the Riccati equation

∂_t Π(t) + 2 a Π(t) − (b²/r) Π²(t) + 1 = 0,   Π(T) = 0.
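The scalar Riccati equation above is straightforward to integrate numerically backward from Π(T) = 0. The sketch below uses a backward RK4 step; when a = 0 and b²/r = 1 the equation has the closed-form solution Π(t) = tanh(T − t), which serves as a check. The parameter values are illustrative.

```python
import math

def riccati_backward(a, b, r, T, steps=400):
    """Backward RK4 integration of the scalar Riccati equation
    dPi/dt + 2a Pi - (b^2/r) Pi^2 + 1 = 0, Pi(T) = 0, i.e.
    dPi/dt = (b^2/r) Pi^2 - 2a Pi - 1, stepped from t = T down to t = 0.
    Returns the grid of values [Pi(0), ..., Pi(T)]."""
    h = T / steps
    f = lambda p: (b * b / r) * p * p - 2.0 * a * p - 1.0
    Pi = [0.0] * (steps + 1)
    for k in range(steps, 0, -1):          # step backward: t_k -> t_{k-1}
        p = Pi[k]
        k1 = f(p)
        k2 = f(p - 0.5 * h * k1)
        k3 = f(p - 0.5 * h * k2)
        k4 = f(p - h * k3)
        Pi[k - 1] = p - (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return Pi

# with a = 0, b = r = 1 the exact solution is Pi(t) = tanh(T - t)
Pi = riccati_backward(a=0.0, b=1.0, r=1.0, T=1.0)
Pi_exact0 = math.tanh(1.0)
```

The same routine solves both the major agent's Π_0 equation and the minor agents' Π equation, since they share this scalar form.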
30 Major-Minor Agent LQG-MFG Systems
The Minor Agents' Mean Field z̄^o(·) Equations:

[Backward SDE]  −ds(t) = [ ( a − (b²/r) Π(t) ) s(t) − η − λ_1 z_0^o(t) + ( c Π(t) − λ ) z̄^o(t) ] dt − q(t) dw_0(t),   s(T) = 0
[Best Response]  u^o(t) := −(b/r) ( Π(t) z̄^o(t) + s(t) )
[MF FDE]  dz̄^o(t) = ( (a + c) z̄^o(t) + b u^o(t) ) dt.

A key assumption for the existence and uniqueness of solutions of the MM-MFG system is that all drift and cost functions f and L and their derivatives are bounded; this clearly does not hold for the MM-MFG LQG problem (as in classical LQG control), so particular methods must be used to deal with it.
31 The LQG-MFG Solution: The Generalized Four-Step Scheme
The Generalized Four-Step Scheme for the SMFG-LQG System: for given z̄^o(·) we set s_0(t) = θ(t, z_0^o(t)), where the function θ is to be determined. By Itô's formula:

ds_0(t) = dθ(t, z_0^o(t)) = { θ_t(t, z_0^o(t)) + θ_x(t, z_0^o(t)) [ ( a_0 − (b_0²/r_0) Π_0(t) ) z_0^o(t) − (b_0²/r_0) θ(t, z_0^o(t)) + c_0 z̄^o(t) ] + (1/2) σ_0² θ_xx(t, z_0^o(t)) } dt + σ_0 θ_x(t, z_0^o(t)) dw_0(t).

Comparing this to the Major agent's backward SDE implies that θ should satisfy the equations

θ_t(t, z_0^o(t)) + θ_x(t, z_0^o(t)) [ ( a_0 − (b_0²/r_0) Π_0(t) ) z_0^o(t) − (b_0²/r_0) θ(t, z_0^o(t)) + c_0 z̄^o(t) ] + (1/2) σ_0² θ_xx(t, z_0^o(t)) = −[ a_0 − (b_0²/r_0) Π_0(t) ] θ(t, z_0^o(t)) + η_0 − [ c_0 Π_0(t) − λ_0 ] z̄^o(t),
σ_0 θ_x(t, z_0^o(t)) = q_0(t).

We can get similar equations for the minor agents by setting s(t) = θ̌(t, z_0^o(t), z̄^o(t)).
32 The Four-Step Scheme for the LQG-MFG System
Step 1. For given z̄^o(·) solve the following parabolic PDE for θ(t, x):

θ_t(t, x) + θ_x(t, x) [ ( a_0 − (b_0²/r_0) Π_0(t) ) x − (b_0²/r_0) θ(t, x) + c_0 z̄^o(t) ] + (1/2) σ_0² θ_xx(t, x) = −[ a_0 − (b_0²/r_0) Π_0(t) ] θ(t, x) + η_0 − [ c_0 Π_0(t) − λ_0 ] z̄^o(t),   θ(T, x) = 0.

Step 2. Use θ from Step 1 to solve the following forward SDE:

dz_0^o(t) = [ ( a_0 − (b_0²/r_0) Π_0(t) ) z_0^o(t) − (b_0²/r_0) θ(t, z_0^o(t)) + c_0 z̄^o(t) ] dt + σ_0 dw_0(t),   z_0^o(0) = z_0(0).

Step 3. Set

q_0(t) = σ_0 θ_x(t, z_0^o(t)),   s_0(t) = θ(t, z_0^o(t)),

by the use of θ and z_0^o(t) obtained in Steps 1 and 2.
33 The Four-Step Scheme for the LQG-MFG System (ctd)
Step 4. Use θ to solve the following PDE for θ̌(t, x, y):

θ̌_t(t, x, y) + θ̌_x(t, x, y) [ ( a_0 − (b_0²/r_0) Π_0(t) ) x − (b_0²/r_0) θ(t, x) + c_0 y ] + (1/2) σ_0² θ̌_xx(t, x, y) + θ̌_y(t, x, y) [ ( a + c − (b²/r) Π(t) ) y − (b²/r) θ̌(t, x, y) ] = −[ a − (b²/r) Π(t) ] θ̌(t, x, y) + η + λ_1 x − ( c Π(t) − λ ) y,   θ̌(T, x, y) = 0.

Step 5. Use z_0^o(t) and θ̌ obtained in Steps 2 and 4 to solve the following forward equation:

dz̄^o(t) = [ ( a + c − (b²/r) Π(t) ) z̄^o(t) − (b²/r) θ̌(t, z_0^o(t), z̄^o(t)) ] dt,   z̄^o(0) given.

Step 6. Set

q(t) = σ_0 θ̌_x(t, z_0^o(t), z̄^o(t)),   s(t) = θ̌(t, z_0^o(t), z̄^o(t)),

by the use of z_0^o(t), θ̌ and z̄^o(t) obtained in Steps 2, 4 and 5.
34 Partially Observed Major-Minor Agent Mean Field Systems
35 Infinite Horizon Completely Observed MM MFG Problem Formulation (Huang 2010)
Dynamics (completely observed, finite population):

Major Agent:  dx_0 = [ A_0 x_0 + B_0 u_0 ] dt + D_0 dw_0
Minor Agents:  dx_i = [ A(θ_i) x_i + B(θ_i) u_i + G x_0 ] dt + D dw_i,   i ∈ N.

The individual infinite horizon cost for the major agent:

J_0(u_0, u_{-0}) = E ∫_0^∞ e^{−ρt} { |x_0 − Φ(x^N)|²_{Q_0} + |u_0|²_{R_0} } dt,
Φ(·) = H_0 x^N + η_0,   x^N = (1/N) Σ_{i=1}^N x_i.

The individual infinite horizon cost for a minor agent i, i ∈ N:

J_i(u_i, u_{-i}) = E ∫_0^∞ e^{−ρt} { |x_i − Ψ(x^N)|²_Q + |u_i|²_R } dt,
Ψ(·) = H_1 x_0 + H_2 x^N + η.
36 Minor Agents' Types
Minor Agents' Types:

I_k = { i : θ_i = k, i ∈ N },   N_k = |I_k|,   1 ≤ k ≤ K.

π^N = (π_1^N, …, π_K^N), π_k^N = N_k / N, 1 ≤ k ≤ K, denotes the empirical distribution of the parameters (θ_1, …, θ_N) of the agents A_i, i ∈ N.
Assumption: there exists π such that lim_{N→∞} π^N = π a.s.
37 Major Agent and Minor Agents
Introduce the (auxiliary) state means:

x̄_k^N = (1/N_k) Σ_{i∈I_k} x_i,   1 ≤ k ≤ K.

If it exists, the L² limit x̄ = [x̄_1, …, x̄_K] of the system state means x̄^N = [x̄_1^N, …, x̄_K^N] constitutes the system mean field. Subject to time-invariant (local state plus mean field plus major agent state) feedback control, x̄ satisfies the mean field equation

dx̄_k = Σ_{j=1}^K Ā_{k,j} x̄_j dt + Ḡ_k x_0 dt + m̄_k dt,   1 ≤ k ≤ K,

i.e. dx̄(t) = Ā x̄(t) dt + Ḡ x_0(t) dt + m̄(t) dt, where the quantities Ḡ_k, m̄_k are to be solved for in the tracking solution.
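In the scalar case the mean field equation above is just a linear ODE driven by the major agent's state path, and can be integrated by Euler stepping. The coefficient values below are illustrative assumptions.

```python
def mean_field_path(A_bar, G_bar, m_bar, x0_path, xbar0, dt):
    """Euler integration of the scalar version of the mean field equation
    d xbar = (A_bar * xbar + G_bar * x0(t) + m_bar) dt, driven by a given
    sample path of the major agent's state."""
    xbar = [xbar0]
    for x0 in x0_path[:-1]:
        xbar.append(xbar[-1] + dt * (A_bar * xbar[-1] + G_bar * x0 + m_bar))
    return xbar

# for a constant major state x0 and A_bar < 0, xbar relaxes monotonically
# to the equilibrium -(G_bar * x0 + m_bar) / A_bar = 1.1 in this example
dt, n = 0.01, 1000
path = mean_field_path(A_bar=-1.0, G_bar=0.5, m_bar=0.1,
                       x0_path=[2.0] * n, xbar0=0.0, dt=dt)
```

In the full model x0_path is itself random (driven by w_0), which is precisely why the mean field here is a stochastic process rather than a deterministic function of time.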
38 Major Agent and Minor Agents LQG-MFG
When MF plus x_0 plus local state dependent controls are applied, the MF-dependent extended state closes the system equations into state equations.
Major Agent's Extended State: the major agent's state extended by the mean field, [x_0^T, x̄^T]^T.
Minor Agents' Extended States: each minor agent's state extended by the major agent's state and the mean field, [x_i^T, x_0^T, x̄^T]^T.
39 Major Agent and Minor Agents (Infinite Population)
Major Agent's Dynamics (Infinite Population):

d[x_0; x̄] = 𝔸_0 [x_0; x̄] dt + 𝔹_0 u_0 dt + [0_{n×1}; m̄] dt + [D_0 dw_0; 0_{nK×1}]

with

𝔸_0 = [ A_0, 0_{n×nK} ; Ḡ, Ā ],   𝔹_0 = [ B_0 ; 0_{nK×m} ],   M_0 = [ 0_{n×1} ; m̄ ],
Q_0^π = [ Q_0, −Q_0 H_0^π ; −H_0^{πT} Q_0, H_0^{πT} Q_0 H_0^π ],   η̄_0 = [ I_{n×n}, −H_0^π ]^T Q_0 η_0,
H_0^π = π ⊗ H_0 = [ π_1 H_0, π_2 H_0, …, π_K H_0 ].
40 Major Agent and Minor Agents (Infinite Population)
Minor Agents' Dynamics (Infinite Population), with the major agent's best response u_0 = −R_0^{−1} 𝔹_0^T (Π_0 [x_0; x̄] + s_0) substituted:

d[x_i; x_0; x̄] = 𝔸_k [x_i; x_0; x̄] dt + 𝔹_k u_i dt + M dt + [ D dw_i ; D_0 dw_0 ; 0_{nK×1} ]

with

𝔸_k = [ A_k, [G, 0_{n×nK}] ; 0_{(nK+n)×n}, 𝔸_0 − 𝔹_0 R_0^{−1} 𝔹_0^T Π_0 ],   𝔹_k = [ B_k ; 0_{(nK+n)×m} ],
M = [ 0_{n×1} ; M_0 − 𝔹_0 R_0^{−1} 𝔹_0^T s_0 ],   η̄ = [ I_{n×n}, −H_1, −H_2^π ]^T Q η,   H_2^π = π ⊗ H_2.
41 Infinite Population Cost Functions
The individual cost for the major agent:

J_0(u_0, u_{-0}) = E ∫_0^∞ e^{−ρt} { |x_0 − Φ(x̄)|²_{Q_0} + |u_0|²_{R_0} } dt,   Φ(·) = H_0^π x̄ + η_0.

The individual cost for a minor agent i, i ∈ N:

J_i(u_i, u_{-i}) = E ∫_0^∞ e^{−ρt} { |x_i − Ψ(x̄)|²_Q + |u_i|²_R } dt,   Ψ(·) = H_1 x_0 + H_2^π x̄ + η.
42 Control Actions (Infinite Population)
Major Agent Tracking Problem Solution:

ρΠ_0 = Π_0 𝔸_0 + 𝔸_0^T Π_0 − Π_0 𝔹_0 R_0^{−1} 𝔹_0^T Π_0 + Q_0^π
ρs_0 = ds_0/dt + ( 𝔸_0 − 𝔹_0 R_0^{−1} 𝔹_0^T Π_0 )^T s_0 + Π_0 M_0 − η̄_0
u_0^o = −R_0^{−1} 𝔹_0^T [ Π_0 (x_0^T, x̄^T)^T + s_0 ]

Minor Agent Tracking Problem Solution:

ρΠ_k = Π_k 𝔸_k + 𝔸_k^T Π_k − Π_k 𝔹_k R^{−1} 𝔹_k^T Π_k + Q^π
ρs_k = ds_k/dt + ( 𝔸_k − 𝔹_k R^{−1} 𝔹_k^T Π_k )^T s_k + Π_k M − η̄
u_i^o = −R^{−1} 𝔹_k^T [ Π_k (x_i^T, x_0^T, x̄^T)^T + s_k ]
43 Major-Minor MF Equations
Define

Π_k = [ Π_{k,11}, Π_{k,12}, Π_{k,13} ; Π_{k,21}, Π_{k,22}, Π_{k,23} ; Π_{k,31}, Π_{k,32}, Π_{k,33} ],   1 ≤ k ≤ K,
e_k = [ 0_{n×n}, …, 0_{n×n}, I_n, 0_{n×n}, …, 0_{n×n} ],

where the n × n identity matrix I_n is at the kth block.
Major-Minor MF Equations for Ā, Ḡ, m̄ (Consistency Requirements):

ρΠ_0 = Π_0 𝔸_0 + 𝔸_0^T Π_0 − Π_0 𝔹_0 R_0^{−1} 𝔹_0^T Π_0 + Q_0^π,
ρΠ_k = Π_k 𝔸_k + 𝔸_k^T Π_k − Π_k 𝔹_k R^{−1} 𝔹_k^T Π_k + Q^π,   for all k,
Ā_k = [ A_k − B_k R^{−1} B_k^T Π_{k,11} ] e_k − B_k R^{−1} B_k^T Π_{k,13},   for all k,
Ḡ_k = −B_k R^{−1} B_k^T Π_{k,12},   for all k,
ρs_0 = ds_0/dt + ( 𝔸_0 − 𝔹_0 R_0^{−1} 𝔹_0^T Π_0 )^T s_0 + Π_0 M_0 − η̄_0,
ρs_k = ds_k/dt + ( 𝔸_k − 𝔹_k R^{−1} 𝔹_k^T Π_k )^T s_k + Π_k M − η̄,   for all k,
m̄_k = −B_k R^{−1} B_k^T s_k,   for all k.
44 Assumptions
Define

M_1 = diag( A_1 − B_1 R^{−1} B_1^T Π_{1,11}, …, A_K − B_K R^{−1} B_K^T Π_{K,11} ),
M_2 = [ −B_1 R^{−1} B_1^T Π_{1,13} ; … ; −B_K R^{−1} B_K^T Π_{K,13} ],
M_3 = [ A_0, 0 ; Ḡ, Ā ]  (with Ā assembled from M_1 and M_2),   L_{0,H} = Q_0^{1/2} [ I, −H_0^π ].

(H1) There exists a probability vector π such that lim_{N→∞} π^N = π.
(H2) The initial states are independent, E x_i(0) = 0 for each i ≥ 1, and sup_{j≥0} E |x_j(0)|² ≤ c.
(H3) The pair (L_{0,H}, M_3) is observable.
(H4) The pair (L_a, 𝔸_0 − (ρ/2)I) is detectable, and for each k = 1, …, K, the pair (L_b, 𝔸_k − (ρ/2)I) is detectable, where L_a = Q_0^{1/2} [I, −H_0^π] and L_b = Q^{1/2} [I, −H_1, −H_2^π]. The pair (A_0 − (ρ/2)I, B_0) is stabilizable and (A_k − (ρ/2)I, B_k) is stabilizable for each k = 1, …, K.
45 Major-Minor: MF Equilibrium
Theorem (Huang, 2010) (Major and Minor Agents: MF Equilibrium). Subject to H1-H4, the MF equations generate a set of stochastic control laws U_MF^N := {u_i^o : 0 ≤ i ≤ N}, 1 ≤ N < ∞, such that
(i) All agent systems S(A_i), 0 ≤ i ≤ N, are second order stable.
(ii) {U_MF^N : 1 ≤ N < ∞} yields an ε-Nash equilibrium, i.e. for all ε > 0 there exists N(ε) such that for all N ≥ N(ε),

J_i^N(u_i^o, u_{-i}^o) − ε ≤ inf_{u_i ∈ U_g} J_i^N(u_i, u_{-i}^o) ≤ J_i^N(u_i^o, u_{-i}^o).
46 Simulation
[Figure: state trajectories of the major agent and the minor agents over time]
47 Partially Observed Major-Minor Mean Field Systems
48 Partially Observed Major-Minor Agent Systems
Recall: estimation of the major agent's state and of the mean field becomes a meaningful problem in the MM case.

Major Agent:  dx_0 = [ A_0 x_0 + B_0 u_0 ] dt + D_0 dw_0
Minor Agents:  dx_i = [ A(θ_i) x_i + B(θ_i) u_i + G x_0 ] dt + D dw_i,   i ∈ N.

The observation process for minor agent A_i:

dy_i(t) = L [ x_i^T, x_0^T, x̄^T ]^T dt + dv_i(t),   where L = [ L_1, L_2, 0 ].

Complete observations process for the major agent A_0:

dy_o(t) = dx_0(t).
49 Partially Observed Major-Minor Agent Systems
The major agent is assumed to have complete observations of its own state. This permits the minor agents to form conditional expectations of the major agent's MFG control action u_0, since it is a (linear) function of the major agent's state.
Such an estimate could not in general be generated by the minor agents in the case where the major agent's control action is a (linear) function of the conditional expectation E[x_0 | F_0] of its state, where F_0 is the major agent's observation σ-field.
50 Estimation
The Riccati equation associated with the Kalman filtering equations for [x_i^T, x_0^T, x̄^T]^T:

V̇(t) = 𝔸_k V(t) + V(t) 𝔸_k^T − K(t) R_v K^T(t) + Q_w,

where Q_w = diag( Σ_i, Σ_ζ, 0 ),

𝔸_k = [ A_k, [G, 0_{n×nK}] ; 0_{(nK+n)×n}, [ A_0, 0 ; Ḡ, Ā ] ],

and

V(0) = E [ ( [x_i^T, x_0^T, x̄^T]^T(0) − [x̂_i^T, x̂_0^T, x̄̂^T]^T(0) ) ( [x_i^T, x_0^T, x̄^T]^T(0) − [x̂_i^T, x̂_0^T, x̄̂^T]^T(0) )^T ].
51 Estimation
The innovation process is

dν_i = dy_i − L [ x̂_i^T, x̂_0^T, x̄̂^T ]^T dt,

where x̂_i, x̂_0, x̄̂ denote the conditional expectations given F^{y_i}, and the Kalman filter gain is given by

K(t) = V(t) L^T R_v^{−1}.

(H5) The system parameter set Θ satisfies: [𝔸_k, Q_w] is controllable and [L, 𝔸_k] is observable for 1 ≤ k ≤ K.

The Filtering Equations:

d[x̂_i; x̂_0; x̄̂] = [ A_k, [G, 0_{n×nK}] ; 0_{(nK+n)×n}, [ A_0, 0 ; Ḡ, Ā ] ] [x̂_i; x̂_0; x̄̂] dt
  + [ B_k ; 0_{n×m} ; 0_{nK×m} ] u_i dt + [ 0_{n×1} ; B_0 û_0 ; 0_{nK×1} ] dt + [ 0_{n×1} ; 0_{n×1} ; m̄ ] dt + K dν_i,

where û_0 denotes the conditional expectation of u_0 given F^{y_i}.
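A scalar sketch of these filtering equations (state, observation, variance and gain) is given below; the Euler discretization and all parameter values are illustrative assumptions.

```python
import math
import random

def kalman_bucy_scalar(a=0.5, sigma=1.0, L=1.0, Rv=1.0, x0=1.0,
                       T=10.0, steps=10000, seed=3):
    """Scalar Kalman-Bucy filter: state dx = a x dt + sigma dw,
    observation dy = L x dt + dv, variance dV/dt = 2a V + sigma^2
    - (V L)^2 / Rv, gain K = V L / Rv, and filter
    dxh = a xh dt + K (dy - L xh dt).  Euler discretization."""
    rng = random.Random(seed)
    dt = T / steps
    sq = math.sqrt(dt)
    x, xh, V = x0, x0, 0.0   # filter started at the (known) initial state
    Vs = []
    for _ in range(steps):
        K = V * L / Rv
        dw, dv = sq * rng.gauss(0, 1), sq * rng.gauss(0, 1)
        dy = L * x * dt + dv
        x += a * x * dt + sigma * dw
        xh += a * xh * dt + K * (dy - L * xh * dt)
        V += dt * (2 * a * V + sigma ** 2 - (V * L) ** 2 / Rv)
        Vs.append(V)
    return x, xh, Vs

x_T, xh_T, Vs = kalman_bucy_scalar()
# V(t) approaches the positive root of 2a V + sigma^2 - V^2 L^2 / Rv = 0
V_star = 0.5 + math.sqrt(0.5 ** 2 + 1.0)
```

The variance recursion is deterministic, so the steady-state gain can be computed offline; in the MM setting this corresponds to the fact that the filter Riccati equation above does not depend on the observed sample path.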
52 Separation Principle for MM-MFG Systems
The control-law-dependent summand of the individual cost for the major agent A_0:

J_0^N(u_0, u_{-0}) = E ∫_0^∞ e^{−ρt} { | x_0 − H_0 x̂^N_{F^0} − η_0 |²_{Q_0} + |u_0|²_{R_0} } dt,   x̂^N_{F^0} = (1/N) Σ_{i=1}^N x̂_i|_{F^0}.

The control-law-dependent summand of the individual cost for a minor agent A_i, i ∈ N:

J_i^N(u_i, u_{-i}) = E ∫_0^∞ e^{−ρt} { | x̂_i|_{F^{y_i}} − H_1 x̂_0|_{F^{y_i}} − H_2 x̂^N|_{F^{y_i}} − η |²_Q + |u_i|²_R } dt.
53 Separation Principle for PO MM-MFG Systems
Key Steps to the Main Result:
(1) The major agent and individual minor agent state estimation recursive equation schemes are given by the MM KF-MF equations (for finite populations of size N and for infinite populations).
(2) Apply the Separation Theorem strategy of reducing a partially observed SOC problem to a completely observed SOC problem for the controlled state estimate processes.
(3) Step 2 transforms the J_0 and J_i MM performance functions into LQG MM tracking performance functions on the controlled state estimate processes in the infinite and finite population cases.
54 Separation Principle for PO MM-MFG Systems
(4) The problem in (2) for the state estimate processes is solved using the completely observed LQG MM-MFG methodology, which yields the û_0 and û_i control laws and the J_0 and J_i performance function values.
(5) The major agent performance value J_0 and the minor agent performance values J_i necessarily correspond to infinite population Nash equilibria.
(6) Approximation analysis gives ε-Nash equilibria with respect to J_0^N and J_i^N (approximating J_0 and J_i) in finite N populations.
55 Nash Equilibria for Partially Observed MM-MFG Systems
Theorem (PEC, AK, 2013) (ε-Nash Equilibria for PO MM-MF Systems). Subject to H1-H5, the KF-MF state estimation scheme plus the MM-MFG equations generate the set of control laws Û_MF^N := {û_i^o : 1 ≤ i ≤ N}, 1 ≤ N < ∞, together with u_0^o, given by

u_0^o = −R_0^{−1} 𝔹_0^T [ Π_0 (x_0^T, x̄^T)^T + s_0 ],
û_i^o = −R^{−1} 𝔹_k^T [ Π_k ( x̂_i^T, x̂_0^T, x̄̂^T )^T + s_k ]   (conditional expectations given F^{y_i}),

such that
(i) All agent systems S(A_i), 0 ≤ i ≤ N, are second order stable.
(ii) {Û_MF^N : 1 ≤ N < ∞} yields an ε-Nash equilibrium, i.e. for all ε > 0 there exists N(ε) such that for all N ≥ N(ε),

J_i^N(û_i^o, û_{-i}^o) − ε ≤ inf_{u_i ∈ U_g} J_i^N(u_i, û_{-i}^o) ≤ J_i^N(û_i^o, û_{-i}^o).
56 Concluding Remarks and Future Work
- Building upon the MM-LQG-MFG theory for partially observed MM systems and the nonlinear MM-MFG theory, the next step is to generate an MM-MFG theory for partially observed nonlinear MM systems.
- Investigate applications to power markets via the systematic application of MM-MFG theory to markets where minor agents (customers and suppliers) receive intermittent observations on active major agents (such as utilities and international energy prices) and on passive major agents (such as wind and ocean behaviour).
More informationStability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games
Stability of Feedback Solutions for Infinite Horizon Noncooperative Differential Games Alberto Bressan ) and Khai T. Nguyen ) *) Department of Mathematics, Penn State University **) Department of Mathematics,
More informationWeak solutions of mean-field stochastic differential equations
Weak solutions of mean-field stochastic differential equations Juan Li School of Mathematics and Statistics, Shandong University (Weihai), Weihai 26429, China. Email: juanli@sdu.edu.cn Based on joint works
More informationControl of McKean-Vlasov Dynamics versus Mean Field Games
Control of McKean-Vlasov Dynamics versus Mean Field Games René Carmona (a),, François Delarue (b), and Aimé Lachapelle (a), (a) ORFE, Bendheim Center for Finance, Princeton University, Princeton, NJ 8544,
More informationDeterministic Dynamic Programming
Deterministic Dynamic Programming 1 Value Function Consider the following optimal control problem in Mayer s form: V (t 0, x 0 ) = inf u U J(t 1, x(t 1 )) (1) subject to ẋ(t) = f(t, x(t), u(t)), x(t 0
More informationProf. Erhan Bayraktar (University of Michigan)
September 17, 2012 KAP 414 2:15 PM- 3:15 PM Prof. (University of Michigan) Abstract: We consider a zero-sum stochastic differential controller-and-stopper game in which the state process is a controlled
More informationA new approach for investment performance measurement. 3rd WCMF, Santa Barbara November 2009
A new approach for investment performance measurement 3rd WCMF, Santa Barbara November 2009 Thaleia Zariphopoulou University of Oxford, Oxford-Man Institute and The University of Texas at Austin 1 Performance
More informationThe concentration of a drug in blood. Exponential decay. Different realizations. Exponential decay with noise. dc(t) dt.
The concentration of a drug in blood Exponential decay C12 concentration 2 4 6 8 1 C12 concentration 2 4 6 8 1 dc(t) dt = µc(t) C(t) = C()e µt 2 4 6 8 1 12 time in minutes 2 4 6 8 1 12 time in minutes
More informationMEAN FIELD GAMES WITH MAJOR AND MINOR PLAYERS
MEAN FIELD GAMES WITH MAJOR AND MINOR PLAYERS René Carmona Department of Operations Research & Financial Engineering PACM Princeton University CEMRACS - Luminy, July 17, 217 MFG WITH MAJOR AND MINOR PLAYERS
More informationA Concise Course on Stochastic Partial Differential Equations
A Concise Course on Stochastic Partial Differential Equations Michael Röckner Reference: C. Prevot, M. Röckner: Springer LN in Math. 1905, Berlin (2007) And see the references therein for the original
More informationBackward Stochastic Differential Equations with Infinite Time Horizon
Backward Stochastic Differential Equations with Infinite Time Horizon Holger Metzler PhD advisor: Prof. G. Tessitore Università di Milano-Bicocca Spring School Stochastic Control in Finance Roscoff, March
More informationLarge Deviations Principles for McKean-Vlasov SDEs, Skeletons, Supports and the law of iterated logarithm
Large Deviations Principles for McKean-Vlasov SDEs, Skeletons, Supports and the law of iterated logarithm Gonçalo dos Reis University of Edinburgh (UK) & CMA/FCT/UNL (PT) jointly with: W. Salkeld, U. of
More informationProduction and Relative Consumption
Mean Field Growth Modeling with Cobb-Douglas Production and Relative Consumption School of Mathematics and Statistics Carleton University Ottawa, Canada Mean Field Games and Related Topics III Henri Poincaré
More informationBackward martingale representation and endogenous completeness in finance
Backward martingale representation and endogenous completeness in finance Dmitry Kramkov (with Silviu Predoiu) Carnegie Mellon University 1 / 19 Bibliography Robert M. Anderson and Roberto C. Raimondo.
More informationOn the Multi-Dimensional Controller and Stopper Games
On the Multi-Dimensional Controller and Stopper Games Joint work with Yu-Jui Huang University of Michigan, Ann Arbor June 7, 2012 Outline Introduction 1 Introduction 2 3 4 5 Consider a zero-sum controller-and-stopper
More informationMean Field Consumption-Accumulation Optimization with HAR
Mean Field Consumption-Accumulation Optimization with HARA Utility School of Mathematics and Statistics Carleton University, Ottawa Workshop on Mean Field Games and Related Topics 2 Padova, September 4-7,
More informationSDE Coefficients. March 4, 2008
SDE Coefficients March 4, 2008 The following is a summary of GARD sections 3.3 and 6., mainly as an overview of the two main approaches to creating a SDE model. Stochastic Differential Equations (SDE)
More informationWeak solutions to Fokker-Planck equations and Mean Field Games
Weak solutions to Fokker-Planck equations and Mean Field Games Alessio Porretta Universita di Roma Tor Vergata LJLL, Paris VI March 6, 2015 Outlines of the talk Brief description of the Mean Field Games
More informationMathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )
Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca November 26th, 2014 E. Tanré (INRIA - Team Tosca) Mathematical
More informationRobust control and applications in economic theory
Robust control and applications in economic theory In honour of Professor Emeritus Grigoris Kalogeropoulos on the occasion of his retirement A. N. Yannacopoulos Department of Statistics AUEB 24 May 2013
More informationUNCERTAINTY FUNCTIONAL DIFFERENTIAL EQUATIONS FOR FINANCE
Surveys in Mathematics and its Applications ISSN 1842-6298 (electronic), 1843-7265 (print) Volume 5 (2010), 275 284 UNCERTAINTY FUNCTIONAL DIFFERENTIAL EQUATIONS FOR FINANCE Iuliana Carmen Bărbăcioru Abstract.
More informationLeader-Follower Stochastic Differential Game with Asymmetric Information and Applications
Leader-Follower Stochastic Differential Game with Asymmetric Information and Applications Jie Xiong Department of Mathematics University of Macau Macau Based on joint works with Shi and Wang 2015 Barrett
More informationVariational approach to mean field games with density constraints
1 / 18 Variational approach to mean field games with density constraints Alpár Richárd Mészáros LMO, Université Paris-Sud (based on ongoing joint works with F. Santambrogio, P. Cardaliaguet and F. J. Silva)
More informationLecture 4: Ito s Stochastic Calculus and SDE. Seung Yeal Ha Dept of Mathematical Sciences Seoul National University
Lecture 4: Ito s Stochastic Calculus and SDE Seung Yeal Ha Dept of Mathematical Sciences Seoul National University 1 Preliminaries What is Calculus? Integral, Differentiation. Differentiation 2 Integral
More informationA class of globally solvable systems of BSDE and applications
A class of globally solvable systems of BSDE and applications Gordan Žitković Department of Mathematics The University of Texas at Austin Thera Stochastics - Wednesday, May 31st, 2017 joint work with Hao
More informationStrong Solutions and a Bismut-Elworthy-Li Formula for Mean-F. Formula for Mean-Field SDE s with Irregular Drift
Strong Solutions and a Bismut-Elworthy-Li Formula for Mean-Field SDE s with Irregular Drift Thilo Meyer-Brandis University of Munich joint with M. Bauer, University of Munich Conference for the 1th Anniversary
More informationI forgot to mention last time: in the Ito formula for two standard processes, putting
I forgot to mention last time: in the Ito formula for two standard processes, putting dx t = a t dt + b t db t dy t = α t dt + β t db t, and taking f(x, y = xy, one has f x = y, f y = x, and f xx = f yy
More informationStochastic contraction BACS Workshop Chamonix, January 14, 2008
Stochastic contraction BACS Workshop Chamonix, January 14, 2008 Q.-C. Pham N. Tabareau J.-J. Slotine Q.-C. Pham, N. Tabareau, J.-J. Slotine () Stochastic contraction 1 / 19 Why stochastic contraction?
More informationContinuous Time Finance
Continuous Time Finance Lisbon 2013 Tomas Björk Stockholm School of Economics Tomas Björk, 2013 Contents Stochastic Calculus (Ch 4-5). Black-Scholes (Ch 6-7. Completeness and hedging (Ch 8-9. The martingale
More informationControl Theory: From Classical to Quantum Optimal, Stochastic, and Robust Control
Control Theory: From Classical to Quantum Optimal, Stochastic, and Robust Control Notes for Quantum Control Summer School, Caltech, August 2005 M.R. James Department of Engineering Australian National
More informationSystemic Risk and Stochastic Games with Delay
Systemic Risk and Stochastic Games with Delay Jean-Pierre Fouque Collaborators: R. Carmona, M. Mousavi, L.-H. Sun, N. Ning and Z. Zhang Mean Field Games 2017 IPAM at UCLA JP Fouque (UC Santa Barbara) Stochastic
More informationAn Introduction to Malliavin calculus and its applications
An Introduction to Malliavin calculus and its applications Lecture 3: Clark-Ocone formula David Nualart Department of Mathematics Kansas University University of Wyoming Summer School 214 David Nualart
More informationQuadratic BSDE systems and applications
Quadratic BSDE systems and applications Hao Xing London School of Economics joint work with Gordan Žitković Stochastic Analysis and Mathematical Finance-A Fruitful Partnership 25 May, 2016 1 / 23 Backward
More informationExplicit solutions of some Linear-Quadratic Mean Field Games
Explicit solutions of some Linear-Quadratic Mean Field Games Martino Bardi Department of Pure and Applied Mathematics University of Padua, Italy Men Field Games and Related Topics Rome, May 12-13, 2011
More informationOn the connection between symmetric N-player games and mean field games
On the connection between symmetric -player games and mean field games Markus Fischer May 14, 214; revised September 7, 215; modified April 15, 216 Abstract Mean field games are limit models for symmetric
More informationChapter Introduction. A. Bensoussan
Chapter 2 EXPLICIT SOLUTIONS OFLINEAR QUADRATIC DIFFERENTIAL GAMES A. Bensoussan International Center for Decision and Risk Analysis University of Texas at Dallas School of Management, P.O. Box 830688
More informationUniformly Uniformly-ergodic Markov chains and BSDEs
Uniformly Uniformly-ergodic Markov chains and BSDEs Samuel N. Cohen Mathematical Institute, University of Oxford (Based on joint work with Ying Hu, Robert Elliott, Lukas Szpruch) Centre Henri Lebesgue,
More information= m(0) + 4e 2 ( 3e 2 ) 2e 2, 1 (2k + k 2 ) dt. m(0) = u + R 1 B T P x 2 R dt. u + R 1 B T P y 2 R dt +
ECE 553, Spring 8 Posted: May nd, 8 Problem Set #7 Solution Solutions: 1. The optimal controller is still the one given in the solution to the Problem 6 in Homework #5: u (x, t) = p(t)x k(t), t. The minimum
More informationMean-field SDE driven by a fractional BM. A related stochastic control problem
Mean-field SDE driven by a fractional BM. A related stochastic control problem Rainer Buckdahn, Université de Bretagne Occidentale, Brest Durham Symposium on Stochastic Analysis, July 1th to July 2th,
More informationNon-linear wave equations. Hans Ringström. Department of Mathematics, KTH, Stockholm, Sweden
Non-linear wave equations Hans Ringström Department of Mathematics, KTH, 144 Stockholm, Sweden Contents Chapter 1. Introduction 5 Chapter 2. Local existence and uniqueness for ODE:s 9 1. Background material
More informationTheoretical Tutorial Session 2
1 / 36 Theoretical Tutorial Session 2 Xiaoming Song Department of Mathematics Drexel University July 27, 216 Outline 2 / 36 Itô s formula Martingale representation theorem Stochastic differential equations
More information13 The martingale problem
19-3-2012 Notations Ω complete metric space of all continuous functions from [0, + ) to R d endowed with the distance d(ω 1, ω 2 ) = k=1 ω 1 ω 2 C([0,k];H) 2 k (1 + ω 1 ω 2 C([0,k];H) ), ω 1, ω 2 Ω. F
More informationON MEAN FIELD GAMES. Pierre-Louis LIONS. Collège de France, Paris. (joint project with Jean-Michel LASRY)
Collège de France, Paris (joint project with Jean-Michel LASRY) I INTRODUCTION II A REALLY SIMPLE EXAMPLE III GENERAL STRUCTURE IV TWO PARTICULAR CASES V OVERVIEW AND PERSPECTIVES I. INTRODUCTION New class
More informationMean Field Control: Selected Topics and Applications
Mean Field Control: Selected Topics and Applications School of Mathematics and Statistics Carleton University Ottawa, Canada University of Michigan, Ann Arbor, Feb 2015 Outline of talk Background and motivation
More informationMultilevel Monte Carlo for Stochastic McKean-Vlasov Equations
Multilevel Monte Carlo for Stochastic McKean-Vlasov Equations Lukasz Szpruch School of Mathemtics University of Edinburgh joint work with Shuren Tan and Alvin Tse (Edinburgh) Lukasz Szpruch (University
More informationControlled Diffusions and Hamilton-Jacobi Bellman Equations
Controlled Diffusions and Hamilton-Jacobi Bellman Equations Emo Todorov Applied Mathematics and Computer Science & Engineering University of Washington Winter 2014 Emo Todorov (UW) AMATH/CSE 579, Winter
More informationMean Field Games on networks
Mean Field Games on networks Claudio Marchi Università di Padova joint works with: S. Cacace (Rome) and F. Camilli (Rome) C. Marchi (Univ. of Padova) Mean Field Games on networks Roma, June 14 th, 2017
More informationDifferential Games II. Marc Quincampoix Université de Bretagne Occidentale ( Brest-France) SADCO, London, September 2011
Differential Games II Marc Quincampoix Université de Bretagne Occidentale ( Brest-France) SADCO, London, September 2011 Contents 1. I Introduction: A Pursuit Game and Isaacs Theory 2. II Strategies 3.
More informationAlberto Bressan. Department of Mathematics, Penn State University
Non-cooperative Differential Games A Homotopy Approach Alberto Bressan Department of Mathematics, Penn State University 1 Differential Games d dt x(t) = G(x(t), u 1(t), u 2 (t)), x(0) = y, u i (t) U i
More informationPROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS
PROBABILITY: LIMIT THEOREMS II, SPRING 218. HOMEWORK PROBLEMS PROF. YURI BAKHTIN Instructions. You are allowed to work on solutions in groups, but you are required to write up solutions on your own. Please
More informationLarge deviations and averaging for systems of slow fast stochastic reaction diffusion equations.
Large deviations and averaging for systems of slow fast stochastic reaction diffusion equations. Wenqing Hu. 1 (Joint work with Michael Salins 2, Konstantinos Spiliopoulos 3.) 1. Department of Mathematics
More informationThe Convergence of the Minimum Energy Estimator
The Convergence of the Minimum Energy Estimator Arthur J. Krener Department of Mathematics, University of California, Davis, CA 95616-8633, USA, ajkrener@ucdavis.edu Summary. We show that under suitable
More informationMean-Field optimization problems and non-anticipative optimal transport. Beatrice Acciaio
Mean-Field optimization problems and non-anticipative optimal transport Beatrice Acciaio London School of Economics based on ongoing projects with J. Backhoff, R. Carmona and P. Wang Thera Stochastics
More informationSEPARABLE TERM STRUCTURES AND THE MAXIMAL DEGREE PROBLEM. 1. Introduction This paper discusses arbitrage-free separable term structure (STS) models
SEPARABLE TERM STRUCTURES AND THE MAXIMAL DEGREE PROBLEM DAMIR FILIPOVIĆ Abstract. This paper discusses separable term structure diffusion models in an arbitrage-free environment. Using general consistency
More informationLecture 12: Diffusion Processes and Stochastic Differential Equations
Lecture 12: Diffusion Processes and Stochastic Differential Equations 1. Diffusion Processes 1.1 Definition of a diffusion process 1.2 Examples 2. Stochastic Differential Equations SDE) 2.1 Stochastic
More informationStochastic Calculus. Kevin Sinclair. August 2, 2016
Stochastic Calculus Kevin Sinclair August, 16 1 Background Suppose we have a Brownian motion W. This is a process, and the value of W at a particular time T (which we write W T ) is a normally distributed
More informationMEAN FIELD GAMES AND SYSTEMIC RISK
MEA FIELD GAMES AD SYSTEMIC RISK REÉ CARMOA, JEA-PIERRE FOUQUE, AD LI-HSIE SU Abstract. We propose a simple model of inter-bank borrowing and lending where the evolution of the logmonetary reserves of
More informationp 1 ( Y p dp) 1/p ( X p dp) 1 1 p
Doob s inequality Let X(t) be a right continuous submartingale with respect to F(t), t 1 P(sup s t X(s) λ) 1 λ {sup s t X(s) λ} X + (t)dp 2 For 1 < p
More informationIntroduction Optimality and Asset Pricing
Introduction Optimality and Asset Pricing Andrea Buraschi Imperial College Business School October 2010 The Euler Equation Take an economy where price is given with respect to the numéraire, which is our
More informationBranching Processes II: Convergence of critical branching to Feller s CSB
Chapter 4 Branching Processes II: Convergence of critical branching to Feller s CSB Figure 4.1: Feller 4.1 Birth and Death Processes 4.1.1 Linear birth and death processes Branching processes can be studied
More informationThermodynamic limit and phase transitions in non-cooperative games: some mean-field examples
Thermodynamic limit and phase transitions in non-cooperative games: some mean-field examples Conference on the occasion of Giovanni Colombo and Franco Ra... https://events.math.unipd.it/oscgc18/ Optimization,
More informationMean-Field Games with non-convex Hamiltonian
Mean-Field Games with non-convex Hamiltonian Martino Bardi Dipartimento di Matematica "Tullio Levi-Civita" Università di Padova Workshop "Optimal Control and MFGs" Pavia, September 19 21, 2018 Martino
More informationNonlinear Control Systems
Nonlinear Control Systems António Pedro Aguiar pedro@isr.ist.utl.pt 3. Fundamental properties IST-DEEC PhD Course http://users.isr.ist.utl.pt/%7epedro/ncs2012/ 2012 1 Example Consider the system ẋ = f
More informationNonlinear Systems Theory
Nonlinear Systems Theory Matthew M. Peet Arizona State University Lecture 2: Nonlinear Systems Theory Overview Our next goal is to extend LMI s and optimization to nonlinear systems analysis. Today we
More informationOptimal Trade Execution with Instantaneous Price Impact and Stochastic Resilience
Optimal Trade Execution with Instantaneous Price Impact and Stochastic Resilience Ulrich Horst 1 Humboldt-Universität zu Berlin Department of Mathematics and School of Business and Economics Vienna, Nov.
More informationRandom attractors and the preservation of synchronization in the presence of noise
Random attractors and the preservation of synchronization in the presence of noise Peter Kloeden Institut für Mathematik Johann Wolfgang Goethe Universität Frankfurt am Main Deterministic case Consider
More informationSingular Perturbations of Stochastic Control Problems with Unbounded Fast Variables
Singular Perturbations of Stochastic Control Problems with Unbounded Fast Variables Joao Meireles joint work with Martino Bardi and Guy Barles University of Padua, Italy Workshop "New Perspectives in Optimal
More informationAn introduction to Mathematical Theory of Control
An introduction to Mathematical Theory of Control Vasile Staicu University of Aveiro UNICA, May 2018 Vasile Staicu (University of Aveiro) An introduction to Mathematical Theory of Control UNICA, May 2018
More informationThreshold behavior and non-quasiconvergent solutions with localized initial data for bistable reaction-diffusion equations
Threshold behavior and non-quasiconvergent solutions with localized initial data for bistable reaction-diffusion equations P. Poláčik School of Mathematics, University of Minnesota Minneapolis, MN 55455
More informationOn Stochastic Adaptive Control & its Applications. Bozenna Pasik-Duncan University of Kansas, USA
On Stochastic Adaptive Control & its Applications Bozenna Pasik-Duncan University of Kansas, USA ASEAS Workshop, AFOSR, 23-24 March, 2009 1. Motivation: Work in the 1970's 2. Adaptive Control of Continuous
More informationExercises. T 2T. e ita φ(t)dt.
Exercises. Set #. Construct an example of a sequence of probability measures P n on R which converge weakly to a probability measure P but so that the first moments m,n = xdp n do not converge to m = xdp.
More informationMetric Spaces and Topology
Chapter 2 Metric Spaces and Topology From an engineering perspective, the most important way to construct a topology on a set is to define the topology in terms of a metric on the set. This approach underlies
More informationOn a class of stochastic differential equations in a financial network model
1 On a class of stochastic differential equations in a financial network model Tomoyuki Ichiba Department of Statistics & Applied Probability, Center for Financial Mathematics and Actuarial Research, University
More informationDynamic and Stochastic Brenier Transport via Hopf-Lax formulae on Was
Dynamic and Stochastic Brenier Transport via Hopf-Lax formulae on Wasserstein Space With many discussions with Yann Brenier and Wilfrid Gangbo Brenierfest, IHP, January 9-13, 2017 ain points of the
More informationApproximation of BSDEs using least-squares regression and Malliavin weights
Approximation of BSDEs using least-squares regression and Malliavin weights Plamen Turkedjiev (turkedji@math.hu-berlin.de) 3rd July, 2012 Joint work with Prof. Emmanuel Gobet (E cole Polytechnique) Plamen
More informationQuasi-invariant Measures on Path Space. Denis Bell University of North Florida
Quasi-invariant Measures on Path Space Denis Bell University of North Florida Transformation of measure under the flow of a vector field Let E be a vector space (or a manifold), equipped with a finite
More informationNoncooperative continuous-time Markov games
Morfismos, Vol. 9, No. 1, 2005, pp. 39 54 Noncooperative continuous-time Markov games Héctor Jasso-Fuentes Abstract This work concerns noncooperative continuous-time Markov games with Polish state and
More informationMean Field Game Systems with Common Noise and Markovian Latent Processes
arxiv:189.7865v2 [math.oc 29 Oct 218 Mean Field Game Systems with Common Noise and Markovian Latent Processes 1 Abstract Dena Firoozi, Peter E. Caines, Sebastian Jaimungal A class of non-cooperative stochastic
More informationNon-Zero-Sum Stochastic Differential Games of Controls and St
Non-Zero-Sum Stochastic Differential Games of Controls and Stoppings October 1, 2009 Based on two preprints: to a Non-Zero-Sum Stochastic Differential Game of Controls and Stoppings I. Karatzas, Q. Li,
More informationLinear Quadratic Zero-Sum Two-Person Differential Games Pierre Bernhard June 15, 2013
Linear Quadratic Zero-Sum Two-Person Differential Games Pierre Bernhard June 15, 2013 Abstract As in optimal control theory, linear quadratic (LQ) differential games (DG) can be solved, even in high dimension,
More informationWeak Convergence of Numerical Methods for Dynamical Systems and Optimal Control, and a relation with Large Deviations for Stochastic Equations
Weak Convergence of Numerical Methods for Dynamical Systems and, and a relation with Large Deviations for Stochastic Equations Mattias Sandberg KTH CSC 2010-10-21 Outline The error representation for weak
More informationd(x n, x) d(x n, x nk ) + d(x nk, x) where we chose any fixed k > N
Problem 1. Let f : A R R have the property that for every x A, there exists ɛ > 0 such that f(t) > ɛ if t (x ɛ, x + ɛ) A. If the set A is compact, prove there exists c > 0 such that f(x) > c for all x
More informationHalf of Final Exam Name: Practice Problems October 28, 2014
Math 54. Treibergs Half of Final Exam Name: Practice Problems October 28, 24 Half of the final will be over material since the last midterm exam, such as the practice problems given here. The other half
More informationIntroduction to Mean Field Game Theory Part I
School of Mathematics and Statistics Carleton University Ottawa, Canada Workshop on Game Theory, Institute for Math. Sciences National University of Singapore, June 2018 Outline of talk Background and
More informationContinuous dependence estimates for the ergodic problem with an application to homogenization
Continuous dependence estimates for the ergodic problem with an application to homogenization Claudio Marchi Bayreuth, September 12 th, 2013 C. Marchi (Università di Padova) Continuous dependence Bayreuth,
More informationLipschitz continuity for solutions of Hamilton-Jacobi equation with Ornstein-Uhlenbeck operator
Lipschitz continuity for solutions of Hamilton-Jacobi equation with Ornstein-Uhlenbeck operator Thi Tuyen Nguyen Ph.D student of University of Rennes 1 Joint work with: Prof. E. Chasseigne(University of
More informationThe Nash Certainty Equivalence Principle and McKean-Vlasov Systems: An Invariance Principle and Entry Adaptation
Proceedings of the 46th IEEE Conference on Decision and Control ew Orleans, LA, USA, Dec. 12-14, 27 WeA5.1 The ash Certainty Equivalence Principle and McKean-Vlasov Systems: An Invariance Principle and
More informationON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME
ON THE POLICY IMPROVEMENT ALGORITHM IN CONTINUOUS TIME SAUL D. JACKA AND ALEKSANDAR MIJATOVIĆ Abstract. We develop a general approach to the Policy Improvement Algorithm (PIA) for stochastic control problems
More informationPROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS
PROBABILITY: LIMIT THEOREMS II, SPRING 15. HOMEWORK PROBLEMS PROF. YURI BAKHTIN Instructions. You are allowed to work on solutions in groups, but you are required to write up solutions on your own. Please
More informationAn homotopy method for exact tracking of nonlinear nonminimum phase systems: the example of the spherical inverted pendulum
9 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June -, 9 FrA.5 An homotopy method for exact tracking of nonlinear nonminimum phase systems: the example of the spherical inverted
More information