Mean Field Game Systems with Common Noise and Markovian Latent Processes

arXiv v2 [math.OC], 29 Oct 2018

Dena Firoozi, Peter E. Caines, Sebastian Jaimungal

1 Abstract

A class of non-cooperative stochastic games with major and minor agents is investigated in which the agents interact with a completely observed common process. The common process, however, is modulated by a latent Markov chain and a latent Wiener process (the common noise), neither of which is observable to the agents. Consequently, the Wonham filter is used to generate a posteriori estimates of the latent processes based on the realized trajectories of the common process. Convex analysis is then further developed to (i) solve the mean field game limit of the problem, (ii) demonstrate that the best response strategies generate an ε-Nash equilibrium for the finite player game, and (iii) obtain explicit characterizations of the best response strategies.

2 Introduction

In this paper, a mean field game (MFG) framework is considered in which there exist one major agent and a large number of minor agents, all subject to linear dynamics and quadratic cost functionals. Each agent interacts with the other agents in the system through the coupling of its cost functional with a common process. The common process is modulated by a latent Markov chain and a latent Wiener process, which are not directly observed by the agents but rather are inferred from the agents' observation processes. We refer to the latent Wiener process as the common noise process. Moreover, the common process is impacted by the major agent's state, the major agent's control action, the average state of all the minor agents, and the average control action of all the minor agents. We obtain the best response strategies for the major agent and each individual minor agent

in the infinite population limit, which collectively yield an ε-Nash equilibrium for the finite population system.

Motivation: Financial and economic systems (among others) are often driven by latent factors, and these latent factors also affect the cost (profit) functionals of the traders involved. Moreover, the agents in these systems often act in a non-cooperative manner, and hence play a large stochastic game with one another; while they may control aspects of the system, they are also at the whim of factors they can neither control nor observe. For example, in optimal execution problems (where traders aim to sell or buy shares of an asset), all traders are subject to the same asset price process and must make their trading decisions based on the observed price. The asset price dynamics may be driven by a common Wiener process, which accounts for so-called noise (uninformed) traders. In addition, the effect of unobserved factors on the price dynamics, beyond the major agent's trading action and the aggregate impact of the minor agents' trading actions, is an important feature to incorporate (see e.g. [1], [2]) in specifying the best response trading strategies and the ε-Nash equilibrium.

Methodology: Although the latent processes are not directly observable, the information provided by the realized trajectories of the common process and the evolution of the system's aggregate state (mean field) can be used to obtain a posteriori estimates, and subsequently to partially predict the future behavior of the common process [1]. Certain versions of such problems can then be recast as MFG systems with common noise. A variation of this type of MFG system has been investigated in [3], where the case of correlated randomness in a nonlinear setting is analyzed. Here we utilize a different approach in order to address the existence of a latent process together with the common noise.
Specifically, we treat the common process as a major agent and further extend the major-minor LQG MFG analysis of [4] to incorporate such a latent process in the dynamics. Then, we utilize the convex analysis approach of [5] to obtain the best response strategies for all agents, which yield an ε-Nash equilibrium. The rest of the paper is organized as follows. Section 3 introduces a class of major-minor MFG problems with a common process as well as a latent process. The MFG formulation of the problem is then presented in Section 4. Concluding remarks are made in Section 5.

3 Major-Minor Mean Field Game Systems with a Common Process

3.1 Dynamics: Finite Population

We consider a large population of N minor agents and a major agent, where the agents are coupled through their individual cost functionals with a common process.

3.1.1 Major and Minor Agents

The underlying dynamics of the major and minor agents are assumed to be given, respectively, by

    dx^0_t = [A_0 x^0_t + B_0 u^0_t + b_0(t)] dt + σ_0 dw^0_t,    (1)
    dx^i_t = [A_k x^i_t + B_k u^i_t + b_k(t)] dt + σ_k dw^i_t,    (2)

where t ∈ [0,T], i ∈ N, N = {1,...,N}, N < ∞, and the subscript k ∈ K, K = {1,...,K}, K ≤ N, denotes the type of a minor agent. Here, x^i_t ∈ R^n, i ∈ N_0, N_0 = {0,...,N}, are the states, u^i_t ∈ R^m, i ∈ N_0, are the control inputs, and {w^i_t, i ∈ N_0} denotes (N+1) independent standard Wiener processes in R^r, where w^i is progressively measurable with respect to the filtration F^w := (F^w_t)_{t∈[0,T]}. All matrices in (1) and (2) are constant and of appropriate dimension; the vector processes b_0(t) and b_k(t) are deterministic functions of time.

Assumption 1. The initial states {x^i_0, i ∈ N_0} are identically distributed, mutually independent, and independent of F^w; E[w^i_t (w^i_t)^T] = Σ, i ∈ N_0. Moreover, E x^i_0 = 0 and E|x^i_0|^2 ≤ C < ∞, i ∈ N_0, with Σ and C independent of N.

Minor Agents' Types: Minor agents are given in K distinct types with 1 ≤ K < ∞. The notation Ψ_k ≜ Ψ(θ_i), θ_i = k, is introduced, where θ_i ∈ Θ, with Θ being the parameter set, and Ψ may be any dynamical parameter in (2) or weight matrix in the cost functional (6). The symbol

I_k denotes I_k = {i : θ_i = k, i ∈ N}, k ∈ K, where the cardinality of I_k is denoted by N_k = |I_k|. Then π^N = (π^N_1, ..., π^N_K), with π^N_k = N_k/N, k ∈ K, denotes the empirical distribution of the N parameters (θ_1, ..., θ_N), sampled independently of the initial conditions and Wiener processes of the agents A_i, i ∈ N. The first assumption is as follows.

Assumption 2. There exists π such that lim_{N→∞} π^N = π a.s.

3.1.2 Common Process: Finite Population

We consider systems where the major agent and any minor agent A_i, i ∈ N, observe a common stochastic process y_t, where both the agent's state and the common process y_t appear in the agent's cost functional as introduced in Section 3.2. The common process y_t ∈ R^n is governed by

    dy_t = dy^L_t + (F u^(N)_t + F_0 u^0_t + H x^(N)_t + H_0 x^0_t) dt,    (3)

where y^L_t evolves as

    dy^L_t = f(t, y^L_t, Γ_t) dt + σ dw_t.    (4)

In (4), the process Γ := (Γ_t)_{t∈[0,T]} denotes a latent continuous-time Markov chain with Γ_t ∈ {γ_j, j ∈ M}, M = {1,...,M}, M < ∞; the vector f(t, y^L_t, Γ_t) denotes a deterministic nonlinear function of t, y^L, and Γ; w_t ∈ R^r denotes a latent Wiener process independent of {w^i_t, i ∈ N_0}; and the matrices F, F_0, H, H_0, and σ are deterministic, constant, and of appropriate dimensions. Moreover, by substituting (4) into (3), it is evident that the common process y_t is impacted by

1) the latent Markov chain process Γ_t,
2) the major agent's state x^0_t,
3) the major agent's control action u^0_t,
4) the average state of the minor agents, i.e. x^(N)_t = (1/N) Σ_{i=1}^N x^i_t,
5) the average control action of the minor agents, i.e. u^(N)_t = (1/N) Σ_{i=1}^N u^i_t,
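As an illustration of the latent dynamics (4), the following sketch simulates a two-state continuous-time Markov chain Γ_t jointly with an Euler-Maruyama discretization of y^L_t. All numerical values (transition rates, regime values γ_j, the drift f, and σ) are hypothetical placeholders, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state chain: rate matrix, regime values gamma_j, drift f(t, y, gamma)
Q_rate = np.array([[-1.0,  1.0],   # transition-rate matrix (rows sum to 0)
                   [ 2.0, -2.0]])
gamma = np.array([-0.5, 1.0])      # latent regimes gamma_1, gamma_2
f = lambda t, y, g: g - 0.2 * y    # mean-reverting drift modulated by Gamma_t
sigma, T, n_steps = 0.3, 1.0, 1000
dt = T / n_steps

state, y = 0, 0.0
path = [y]
for k in range(n_steps):
    # Markov-chain jump over [t, t+dt]: leave state i with prob. ~ -Q_rate[i,i]*dt
    if rng.random() < -Q_rate[state, state] * dt:
        probs = Q_rate[state].copy(); probs[state] = 0.0
        probs /= probs.sum()
        state = rng.choice(len(gamma), p=probs)
    # Euler-Maruyama step for dy^L = f(t, y^L, Gamma_t) dt + sigma dw_t
    y += f(k * dt, y, gamma[state]) * dt + sigma * np.sqrt(dt) * rng.normal()
    path.append(y)

print(len(path), path[-1])
```

Note that only y^L (here `path`) would be observable to the agents; the trajectory of `state` is the latent object the Wonham filter of Section 4.2 estimates.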

6) the latent Wiener (common noise) process w_t ∈ R^r, independent of w^0_t and w^i_t, i ∈ N.

Assumption 3. The major agent A_0 completely observes its own state and the common process y_t.

Assumption 4. Each minor agent A_i, i ∈ N, completely observes its own state, the major agent's state, and the common process y_t.

We again emphasize that the latent processes Γ_t and w_t are not directly observed by the agents A_i, i ∈ N_0. However, each agent may obtain their a posteriori estimates based on its complete observations of the common process y_t. We refer to the latent Wiener process as the common noise process in this work.

3.1.3 Control σ-Fields

We denote by F^i := (F^i_t)_{t∈[0,T]}, i ∈ N, the natural filtration generated by the i-th minor agent's state (x^i_t)_{t∈[0,T]}; by F^0 := (F^0_t)_{t∈[0,T]} the natural filtration generated by the major agent's state (x^0_t)_{t∈[0,T]}; and by F := (F_t)_{t∈[0,T]} the natural filtration generated by the states of all agents ((x^i_t)_{i∈N}, x^0_t)_{t∈[0,T]}. Moreover, we denote by G := (G_t)_{t∈[0,T]} the natural filtration generated by (Γ_t, w_t)_{t∈[0,T]}, and by F^y := (F^y_t)_{t∈[0,T]} the natural filtration generated by (y_t)_{t∈[0,T]}.

Next, we introduce the admissible control sets. Let U_0 denote the set of feedback control laws with second moment lying in L^1[0,T], for any finite T, which are adapted to the smaller filtration F^{0,r} := (F^{0,r}_t)_{t∈[0,T]}, where F^{0,r} := F^0 ∨ F^y. The set of control inputs U^i, i ∈ N, based upon the local information set of the minor agent A_i, i ∈ N, consists of the feedback control laws adapted to the smaller filtration F^{i,r} := (F^{i,r}_t)_{t∈[0,T]}, where F^{i,r} := F^i ∨ F^0 ∨ F^y, i ∈ N, while U^N_g is adapted to the general filtration F^g := (F^g_t)_{t∈[0,T]}, where F^g := F ∨ F^y ∨ G, and the L^1[0,T] constraint on second moments applies in each case. We note in passing the significant differences between the information structures specified here and those in the team theory literature [6].

Assumption 5 (Major Agent's Linear Control Laws).
For the major agent A_0, the set of control laws U_{0,L} ⊂ U_0 is defined to be the collection of linear feedback control laws adapted to F^{0,r}.

Assumption 6 (Minor Agent's Linear Control Laws). For each minor agent A_i, i ∈ N, the set of control laws U_{i,L} ⊂ U^i, i ∈ N, is defined to be the collection of linear feedback control laws adapted to F^{i,r}, i ∈ N.

3.2 Cost Functionals: Finite Population

Given the vector z^0_t defined as

    z^0_t = [y_t^T, (x^0_t)^T]^T,

the major agent's cost functional to be minimized is formulated as

    J_0(u^0, u^{-0}) = (1/2) E[ (z^0_T)^T G_0 z^0_T + ∫_0^T { (z^0_s)^T Q_0 z^0_s + 2 (z^0_s)^T N_0 u^0_s + (u^0_s)^T R_0 u^0_s } ds ],    (5)

where u^{-0} = (u^1, u^2, ..., u^N).

Assumption 7. For the cost functional (5) to be convex, we assume that G_0 ≥ 0, R_0 > 0, and Q_0 - N_0 R_0^{-1} N_0^T ≥ 0.

Similarly, given the vector z^i_t, i ∈ N, defined as

    z^i_t = [(x^i_t)^T, y_t^T]^T,

the cost functional to be minimized for minor agent A_i, i ∈ N, of type k, 1 ≤ k ≤ K, is formulated as

    J_i(u^i, u^{-i}) = (1/2) E[ (z^i_T)^T G_k z^i_T + ∫_0^T { (z^i_s)^T Q_k z^i_s + 2 (z^i_s)^T N_k u^i_s + (u^i_s)^T R_k u^i_s } ds ],    (6)

where u^{-i} = (u^0, ..., u^{i-1}, u^{i+1}, ..., u^N).

Assumption 8. For the cost functional (6) to be convex, we assume that G_k ≥ 0, R_k > 0, and Q_k - N_k R_k^{-1} N_k^T ≥ 0 for k ∈ K.

4 Major-Minor LQG Mean Field Games Approach

In the mean field game methodology with a major agent [7], [4], the problem is first solved in the infinite population case, where the average terms in the finite population dynamics and cost functional of each agent are replaced with their infinite population limit, i.e. the mean field. For this purpose, the major agent's
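The convexity conditions of Assumptions 7-8 can be checked numerically with an eigenvalue test. A minimal sketch (the helper name and all matrices below are illustrative stand-ins, not the paper's data):

```python
import numpy as np

def is_convex_lq(G, Q, N, R, tol=1e-10):
    """Check G >= 0, R > 0, and Q - N R^{-1} N^T >= 0 via eigenvalues."""
    psd = lambda M: np.all(np.linalg.eigvalsh(M) >= -tol)   # positive semidefinite
    pd  = lambda M: np.all(np.linalg.eigvalsh(M) > tol)     # positive definite
    S = Q - N @ np.linalg.solve(R, N.T)                     # Schur-type complement
    return bool(psd(G) and pd(R) and psd(S))

G = np.eye(2)
R = np.array([[2.0]])
Q = np.array([[1.0, 0.0], [0.0, 1.0]])
N = np.array([[1.0], [0.0]])        # cross term; Q - N R^{-1} N^T = diag(0.5, 1)
print(is_convex_lq(G, Q, N, R))     # -> True
```

Enlarging the cross-weight N (or shrinking R) eventually violates Q - N R^{-1} N^T >= 0, which is exactly the regime Assumptions 7-8 rule out.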

state is extended with the mean field, while the minor agent's state is extended with the major agent's state and the mean field; this yields stochastic optimal control problems for each agent linked only through the major agent's state and the mean field. Finally, the infinite population best response strategies are applied to the finite population system, which yields an ε-Nash equilibrium.

To address major-minor mean field game systems involving a common process and a latent Markov chain process, the following steps are taken. We first note that the common process in this work represents an extended form of the common noise in [3]. However, a different approach is followed to incorporate the common process into the major-minor LQG mean field game framework. First, in Section 4.1, the evolution of the state mean field and the control mean field in the infinite population case is derived. Then, the F^{0,r}-adapted and F^{i,r}-adapted, i ∈ N, forms of the common process in the infinite population case are presented in Section 4.2. Next, in Sections 4.3 and 4.4, the common process is perceived as a major agent in the major-minor LQG MFG framework. Subsequently, the major-minor LQG analysis described above is further extended: the major agent's state is extended with the mean field and the F^{0,r}-adapted common process, while a minor agent's state is extended with the major agent's state, the mean field, and the F^{i,r}-adapted common process. Finally, a convex analysis is performed in Section 4.5 to obtain the best response strategies, which yield the infinite population Nash equilibrium and the finite population ε-Nash equilibrium.

4.1 Mean Field Evolution

The common process y_t governed by (3) involves the empirical average of the minor agents' states, i.e. x^(N)_t, as well as the empirical average of the minor agents' control actions, i.e. u^(N)_t.
To attain the infinite population limit ȳ_t of y_t, the state mean field x̄_t and the control mean field ū_t are introduced as the infinite population limits of x^(N)_t and u^(N)_t, respectively.

4.1.1 Control Mean Field

The empirical average of the control actions of the minor agents of type k is introduced as

    u^(N_k)_t = (1/N_k) Σ_{j=1}^{N_k} u^{j,k}_t,    k ∈ K,    (7)

and the vector u^(N)_t = [(u^(N_1)_t)^T, (u^(N_2)_t)^T, ..., (u^(N_K)_t)^T]^T is defined, where the pointwise-in-time L^2 limit of u^(N)_t, if it exists, is called the control mean field of the system and is denoted by ū_t = [(ū^1_t)^T, ..., (ū^K_t)^T]^T.

We consider for each minor agent A_i, i ∈ N, of type k, k ∈ K, a uniform (with respect to i) state feedback control u^{i,k}_t ∈ U_{i,L} of the form

    u^{i,k}_t = L^k_1 x^{i,k}_t + Σ_{l=1}^K Σ_{j=1}^{N_l} L^{k,l}_2 x^{j,l}_t + L^k_3 x^0_t + L^k_4 y_t + m^k_t,    (8)

where t ∈ [0,T], L^k_1, L^{k,l}_2, L^k_3 and L^k_4 are constant matrices of appropriate dimensions, and m^k_t is an F^y_t-measurable process. If we take the average of the control actions u^{i,k}_t over the population of agents of type k, k ∈ K, and hence calculate u^(N)_t, it can be shown that the L^2 limit ū_t of u^(N)_t as N → ∞, i.e. the control mean field, is given by

    ū_t = C̄ x̄_t + D̄ x^0_t + Ē ȳ_t + r̄_t,    (9)

where x̄_t, if it exists, denotes the state mean field introduced in Section 4.1.2, ȳ_t denotes the limiting process associated with the common process y_t as N → ∞ (see Section 4.2), and r̄_t is an F^y_t-measurable process. Furthermore, the matrices in (9), i.e. the stacked row-blocks

    C̄ = [C̄_1; ...; C̄_K],    D̄ = [D̄_1; ...; D̄_K],    Ē = [Ē_1; ...; Ē_K],    r̄_t = [r̄^1_t; ...; r̄^K_t],    (10)

are to be solved for using the mean field consistency equations (47)-(48) derived in Section 4.5.

4.1.2 State Mean Field

Similarly, the empirical state average is introduced as

    x^(N_k)_t = (1/N_k) Σ_{j=1}^{N_k} x^{j,k}_t,    k ∈ K,    (11)

and the vector x^(N)_t = [(x^(N_1)_t)^T, (x^(N_2)_t)^T, ..., (x^(N_K)_t)^T]^T is defined, where the pointwise-in-time L^2 limit of x^(N)_t, if it exists, is called the state mean field of the system and is denoted by x̄_t = [(x̄^1_t)^T, ..., (x̄^K_t)^T]^T.
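The pointwise-in-time L^2 convergence behind (7) and (11) can be seen in a toy Monte Carlo experiment: for i.i.d. scalar closed-loop paths, the fluctuation of the empirical average around its deterministic limit decays like 1/N. The dynamics and numbers below are illustrative only, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(1)
A_cl, sigma_k, T, n = -1.0, 1.0, 1.0, 100   # scalar closed-loop dynamics (illustrative)
dt = T / n

def empirical_average(N):
    # N i.i.d. paths of dx = A_cl x dt + sigma_k dw, x_0 = 0; return x^(N)_T
    x = np.zeros(N)
    for _ in range(n):
        x += A_cl * x * dt + sigma_k * np.sqrt(dt) * rng.normal(size=N)
    return x.mean()

# The deterministic limit here is 0; Var(x^(N)_T) shrinks like 1/N
variances = {N: np.var([empirical_average(N) for _ in range(30)])
             for N in (10, 100, 5000)}
for N, v in variances.items():
    print(N, v)
```

The same averaging argument, applied type-by-type to the closed-loop system (8)-(2), is what produces the deterministic mean field equations (9) and (12).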

If we substitute (8) into (2) for i ∈ N, take the average of the states of the minor agents' closed-loop systems of type k, k ∈ K, and hence calculate x^(N)_t, it can be shown that the L^2 limit x̄_t of x^(N)_t, i.e. the state mean field, satisfies

    dx̄_t = Ā x̄_t dt + Ḡ x^0_t dt + L̄ ȳ_t dt + m̄_t dt,    (12)

where ȳ_t denotes the infinite population limit of the common process y_t (see Section 4.2), m̄_t is an F^y_t-measurable process, and the stacked row-blocks

    Ā = [Ā_1; ...; Ā_K],    Ḡ = [Ḡ_1; ...; Ḡ_K],    L̄ = [L̄_1; ...; L̄_K],    m̄_t = [m̄^1_t; ...; m̄^K_t],    (13)

are again to be solved for using the mean field consistency equations (47)-(48) derived in Section 4.5. By abuse of language, the mean value of the system's Gaussian mean field given by the state process x̄_t = [(x̄^1_t)^T, ..., (x̄^K_t)^T]^T shall also be termed the system's mean field. (The derivation of the state mean field equation above may be performed using the methods of [8], [9] and [4].)

4.2 Common Process: Infinite Population

Each agent completely observes the common process y_t but has no observations of the latent Markov chain process Γ_t. In order to resolve the issue of the unobserved latent process Γ_t, the Wonham filtering method is used to estimate the distribution of Γ_t based on each agent's observations of y_t, i.e. F^y_t. Subsequently, f(t, y^L_t, Γ_t) and w_t in (4) are presented in their F^y_t-adapted forms (see e.g. [1], [10]).

Denote the transition probabilities of the continuous-time Markov chain process Γ by

    p_{ij} = P(Γ_{t+h} = γ_j | Γ_t = γ_i),    1 ≤ i, j ≤ M,    (14)

and the corresponding transition rates by ν_{ij}, with

    ν_i = Σ_{j=1, j≠i}^M ν_{ij},    i ∈ M.    (15)

The posterior distribution of Γ_t conditional on F^y_t is denoted by Π_t = {π^j_t, j ∈ M, t ∈ [0,T]}, where

    π^j_t = E[ 1_{Γ_t = γ_j} | F^y_t ],    j ∈ M,  t ∈ [0,T],    (16)

with initial distribution {π^j_0, j ∈ M}.

Remark 1. As a result of Assumptions 3-4, the major agent A_0 and each minor agent A_i, i ∈ N, completely observe the unaffected common process y^L_t given by (3). Consequently

    π^j_t = E[ 1_{Γ_t = γ_j} | F^y_t ] ≡ E[ 1_{Γ_t = γ_j} | F^{y^L}_t ].    (17)

Lemma 1 (Wonham Filter). [10] If σ > 0, the posterior distribution Π_t of Γ_t is given by

    dπ^j_t = ( -ν_j π^j_t + Σ_{i=1, i≠j}^M ν_{ij} π^i_t ) dt
             - σ^{-2} [ f(t, y^L_t, γ_j) - Σ_{i=1}^M π^i_t f(t, y^L_t, γ_i) ] ( Σ_{i=1}^M π^i_t f(t, y^L_t, γ_i) ) π^j_t dt
             + σ^{-2} [ f(t, y^L_t, γ_j) - Σ_{i=1}^M π^i_t f(t, y^L_t, γ_i) ] π^j_t dy^L_t,    j ∈ M.    (18)

Lemma 2. [10] Define the process ŵ = (ŵ_t, t ∈ [0,T]) as

    ŵ_t = w_t + σ^{-1} ∫_0^t ( f_τ - f̄_τ ) dτ,    (19)

where f̄ = (f̄_t, t ∈ [0,T]) is an F^y_t-adapted process defined as

    f̄_t = E[ f(t, y^L_t, Γ_t) | F^y_t ],    (20)

and is computed by

    f̄_t = f̂(t, y^L_t, Π_t) = Σ_{j=1}^M π^j_t f(t, y^L_t, γ_j).    (21)

Then the process ŵ_t is an F^y_t-adapted Wiener process.
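A minimal explicit-Euler discretization of the filter equation (18) can be sketched as follows. The function name, rate convention (zero diagonal in `nu`), clipping/renormalization safeguard, and all numerical values are illustrative assumptions, not part of the paper.

```python
import numpy as np

def wonham_step(pi, dy, y, t, dt, nu, f, sigma):
    """One explicit-Euler step of the Wonham filter (18).

    pi : current posterior (pi^1_t, ..., pi^M_t)
    dy : observed increment of y^L over [t, t+dt]
    nu : M x M matrix of transition rates nu_ij with zero diagonal
    f  : f(t, y, j) -> drift value f(t, y^L_t, gamma_j)
    """
    M = len(pi)
    f_vals = np.array([f(t, y, j) for j in range(M)])
    f_bar = pi @ f_vals                               # f̄_t as in (21)
    gen = nu - np.diag(nu.sum(axis=1))                # generator: diag = -nu_i
    prior = gen.T @ pi                                # chain (prior) part of (18)
    innov = (pi * (f_vals - f_bar) / sigma**2) * (dy - f_bar * dt)  # innovation part
    pi_new = np.clip(pi + prior * dt + innov, 1e-12, None)
    return pi_new / pi_new.sum()                      # renormalize (Euler is not exact)

# Toy run: two regimes with drifts -1 and +1; synthetic data generated under regime 2
rng = np.random.default_rng(0)
nu = np.array([[0.0, 0.1], [0.1, 0.0]])
f = lambda t, y, j: (-1.0, 1.0)[j]
sigma, dt = 0.5, 0.01
pi = np.array([0.5, 0.5])
for k in range(500):
    dy = 1.0 * dt + sigma * np.sqrt(dt) * rng.normal()   # true drift = +1 (regime 2)
    pi = wonham_step(pi, dy, 0.0, k * dt, dt, nu, f, sigma)
print(pi)   # posterior mass should concentrate on regime 2
```

In the paper's setting, every agent can run this recursion on the observed common process (Remark 1), so all agents share the same posterior Π_t.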

According to Lemma 1 and Lemma 2, equation (4) can be rewritten as

    dy^L_t = f̄_t dt + σ dŵ_t,    (22)

and by substituting (22) into (3), the F^{0,r}_t-adapted dynamics of the common process in the infinite population case, i.e. ȳ_t, are given by

    dȳ_t = [ f̄_t + F_π ū_t + F_0 u^0_t + H_π x̄_t + H_0 x^0_t ] dt + σ dŵ_t,    (23)

where the average terms x^(N)_t and u^(N)_t in (3) have been replaced with their L^2 limits as N → ∞, i.e. the state mean field x̄_t and the control mean field ū_t, respectively. Moreover, F_π = π ⊗ F and H_π = π ⊗ H, where ⊗ denotes the Kronecker product of the corresponding matrices.

Remark 2. Since the state and the control action of each individual minor agent A_i, i ∈ N, do not affect the infinite population evolution of the common process ȳ_t, the F^{i,r}_t-adapted and F^{0,r}_t-adapted dynamics of ȳ_t in the infinite population limit are identical and given by (23).

4.3 Major Agent's Regulation Problem: Infinite Population

First, the major agent's state x^0_t is extended with the state mean field x̄_t and the infinite population common process ȳ_t to form the major agent's extended state X^0_t = [ȳ_t^T, (x^0_t)^T, x̄_t^T]^T, which is governed by

    dX^0_t = A_0 X^0_t dt + B_0 u^0_t dt + M^0_t dt + Σ_0 dW^0_t.    (24)

By substituting (9) into (23), the matrices in the extended major agent's dynamics (24) are given (with the A_0, B_0, b_0, σ_0 on the right-hand sides denoting the matrices of (1)) by

    A_0 = [ F_π Ē,    F_π D̄ + H_0,  F_π C̄ + H_π ;
            0_{n×n},  A_0,           0_{n×nK}    ;
            L̄,        Ḡ,             Ā           ],
    B_0 = [ F_0 ; B_0 ; 0_{nK×m} ],
    M^0_t = [ f̄_t + F_π r̄_t ; b_0(t) ; m̄_t ],
    Σ_0 = [ σ,        0_{n×r}  ;
            0_{n×r},  σ_0      ;
            0_{nK×r}, 0_{nK×r} ],
    W^0_t = [ ŵ_t ; w^0_t ].    (25)
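The Kronecker-product notation F_π = π ⊗ F in (23) simply forms the population-weighted aggregate over types: applied to the stacked control mean field, it returns Σ_k π_k F ū^k_t. A two-type numerical check (the values of π, F, and ū are made up for illustration):

```python
import numpy as np

pi = np.array([0.3, 0.7])          # limiting type distribution (pi_1, pi_2)
F = np.array([[1.0, 2.0]])         # per-type input matrix (here 1 x m with m = 2)
F_pi = np.kron(pi, F)              # F_pi = pi ⊗ F = [0.3*F, 0.7*F]

# Applied to the stacked control mean field [ū^1; ū^2], F_pi forms the
# population-weighted aggregate  sum_k pi_k F ū^k:
u_bar = np.array([1.0, 0.0, 0.0, 1.0])   # ū^1 = (1, 0), ū^2 = (0, 1)
print(F_pi @ u_bar)                       # equals 0.3*F@ū^1 + 0.7*F@ū^2
```

H_π = π ⊗ H acts on the stacked state mean field x̄_t in exactly the same way.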

Next, the major agent's extended cost functional is given as

    J^ex_0(u^0) = (1/2) E[ (X^0_T)^T G_0 X^0_T + ∫_0^T { (X^0_s)^T Q_0 X^0_s + 2 (X^0_s)^T N_0 u^0_s + (u^0_s)^T R_0 u^0_s } ds ],    (26)

where the corresponding weight matrices are given (in terms of the weights of (5)) by

    G_0 = [I_{2n}, 0_{2n×nK}]^T G_0 [I_{2n}, 0_{2n×nK}],    (27)
    Q_0 = [I_{2n}, 0_{2n×nK}]^T Q_0 [I_{2n}, 0_{2n×nK}],    (28)
    N_0 = [ N_0 ; 0_{nK×m} ].    (29)

The minimization of the extended cost functional (26) subject to the extended dynamics (24) constitutes a stochastic optimal control problem for the major agent in the infinite population limit. Then, according to Theorem 3, the major agent's optimal control action is given by

    u^{0,*}_t = -R_0^{-1} [ N_0^T X^0_t + B_0^T ( Π_0(t) X^0_t + s^0_t ) ],    (30)

where Π_0(t) and s^0_t are to be solved for using the deterministic Riccati equation

    Π̇_0 + Π_0 A_0 + A_0^T Π_0 - (B_0^T Π_0 + N_0^T)^T R_0^{-1} (B_0^T Π_0 + N_0^T) + Q_0 = 0,    Π_0(T) = G_0,    (31)

and the FBSDE

    ds^0_t + ( [ (A_0 - B_0 R_0^{-1} N_0^T)^T - Π_0 B_0 R_0^{-1} B_0^T ] s^0_t + Π_0 M^0_t ) dt + ( Π_0 Σ_0 - q^0_t ) dW^0_t = 0,    s^0_T = 0.    (32)

The Riccati equation (31) and the offset equation (32) shall be derived in Section 4.5. Finally, the closed-loop dynamics of the major agent A_0 when the control action (30) is substituted into (1) are given by

    dx^0_t = ( A_0 x^0_t - B_0 R_0^{-1} [ N_0^T X^0_t + B_0^T ( Π_0(t) X^0_t + s^0_t ) ] + b_0(t) ) dt + σ_0 dw^0_t.    (33)
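A Riccati equation of the form (31) can be integrated backward from the terminal condition Π(T) = G with a simple explicit-Euler sweep. The sketch below uses made-up scalar matrices purely as a sanity check; for the scalar case A = 0, B = Q = R = 1, N = G = 0, the solution is Π(t) = tanh(T - t), which tends to the algebraic-Riccati value 1 for a long horizon.

```python
import numpy as np

def solve_riccati(A, B, Q, R, N, G, T, n_steps=2000):
    """Integrate the Riccati ODE (31) backward from Pi(T) = G by explicit Euler:
       -dPi/dt = Pi A + A^T Pi - (B^T Pi + N^T)^T R^{-1} (B^T Pi + N^T) + Q."""
    dt = T / n_steps
    Pi = G.copy()
    for _ in range(n_steps):
        S = B.T @ Pi + N.T
        dPi = Pi @ A + A.T @ Pi - S.T @ np.linalg.solve(R, S) + Q
        Pi = Pi + dt * dPi          # stepping backward: Pi(t - dt) = Pi(t) + dt * RHS
        Pi = 0.5 * (Pi + Pi.T)      # keep symmetry against round-off
    return Pi

# Scalar sanity check (illustrative numbers): A=0, B=1, Q=1, R=1, N=0, G=0, T=10
Pi0 = solve_riccati(np.zeros((1, 1)), np.eye(1), np.eye(1), np.eye(1),
                    np.zeros((1, 1)), np.zeros((1, 1)), T=10.0)
print(Pi0)
```

The same routine applies verbatim to the minor agents' Riccati equation (39), only with the extended matrices (35)-(37) in place of (25)-(29).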

4.4 Minor Agent's Regulation Problem: Infinite Population

First, minor agent A_i's state, i ∈ N, is extended with the infinite population common process ȳ_t, the major agent's state x^0_t, and the state mean field x̄_t to form the minor agent's extended state X^i_t = [(x^i_t)^T, ȳ_t^T, (x^0_t)^T, x̄_t^T]^T, which satisfies

    dX^i_t = A_k X^i_t dt + B_k u^i_t dt + M^k_t dt + Σ_k dW^i_t.    (34)

To attain the extended matrices in (34), the joint dynamics of (i) minor agent A_i's system given by (2), (ii) the common process ȳ_t given by (23), with (9) and (30) substituted, (iii) the major agent A_0's closed-loop system given by (33), and (iv) the state mean field x̄_t given by (12), are utilized, which results in

    A_k = [ A_k,            0_{n×(2n+nK)}                                   ;
            0_{(2n+nK)×n},  A_0 - B_0 R_0^{-1} N_0^T - B_0 R_0^{-1} B_0^T Π_0 ],
    B_k = [ B_k ; 0_{(2n+nK)×m} ],
    M^k_t = [ b_k(t) ; M^0_t - B_0 R_0^{-1} B_0^T s^0(t) ],
    Σ_k = [ σ_k,            0_{n×2r} ;
            0_{(2n+nK)×r},  Σ_0      ],
    W^i_t = [ w^i_t ; W^0_t ],    (35)

where the lower-right blocks are written in terms of the extended major agent matrices of (25). Next, the minor agent A_i's extended cost functional is formed as

    J^ex_i(u^i) = (1/2) E[ (X^i_T)^T G_k X^i_T + ∫_0^T { (X^i_s)^T Q_k X^i_s + 2 (X^i_s)^T N_k u^i_s + (u^i_s)^T R_k u^i_s } ds ],    (36)

where the corresponding weight matrices are given (in terms of the weights of (6)) by

    G_k = [I_{2n}, 0_{2n×(n+nK)}]^T G_k [I_{2n}, 0_{2n×(n+nK)}],
    Q_k = [I_{2n}, 0_{2n×(n+nK)}]^T Q_k [I_{2n}, 0_{2n×(n+nK)}],
    N_k = [ N_k ; 0_{(n+nK)×m} ].    (37)

The dynamics (34) together with the cost functional (36) constitute a stochastic optimal control problem for minor agent A_i, i ∈ N, in the infinite population limit. Then, according to Theorem 3, the minor agent A_i's optimal control action in the infinite population case is given by

    u^{i,*}_t = -R_k^{-1} [ N_k^T X^i_t + B_k^T ( Π_k(t) X^i_t + s^{i,k}_t ) ],    (38)

where Π_k(t), k ∈ K, is the solution to the following deterministic Riccati equation

    Π̇_k + Π_k A_k + A_k^T Π_k - (B_k^T Π_k + N_k^T)^T R_k^{-1} (B_k^T Π_k + N_k^T) + Q_k = 0,    Π_k(T) = G_k,    (39)

and s^{i,k}_t, k ∈ K, is the solution to the following FBSDE

    ds^{i,k}_t + ( [ (A_k - B_k R_k^{-1} N_k^T)^T - Π_k B_k R_k^{-1} B_k^T ] s^{i,k}_t + Π_k M^k_t ) dt + ( Π_k Σ_k - q^i_t ) dW^i_t = 0,    s^{i,k}_T = 0.    (40)

The complete derivation of (39)-(40) will be discussed in Section 4.5. Finally, the control action (38) is substituted into (2), which gives minor agent A_i's closed-loop system, i ∈ N, as

    dx^i_t = ( A_k x^i_t - B_k R_k^{-1} [ N_k^T X^i_t + B_k^T ( Π_k X^i_t + s^{i,k}_t ) ] + b_k ) dt + σ_k dw^i_t.    (41)

Remark 3. We note that for the case where there exists no latent process, i.e. y^L_t = 0, t ∈ [0,T], the diffusion terms of (32) and (40) become zero and they reduce to the deterministic offset equations of classical major-minor LQG mean field games as in [4].

4.5 Nash and ε-Nash Equilibria

To derive the mean field consistency equations, which specify the matrices in the control and state mean field equations (9) and (12), respectively, the closed-loop system (41) of minor agent A_i is rewritten as

    dx^i_t = ( A_k x^i_t - B_k R_k^{-1} (N_k^T + B_k^T Π_k) [ (x^i_t)^T, ȳ_t^T, (x^0_t)^T, x̄_t^T ]^T - B_k R_k^{-1} B_k^T s^{i,k}_t + b_k ) dt + σ_k dw^i_t,    (42)

where i ∈ N, k ∈ K. Then the block matrices

    Π_k = [ Π_{k,11}, Π_{k,12}, Π_{k,13}, Π_{k,14} ;
            Π_{k,21}, Π_{k,22}, Π_{k,23}, Π_{k,24} ;
            Π_{k,31}, Π_{k,32}, Π_{k,33}, Π_{k,34} ;
            Π_{k,41}, Π_{k,42}, Π_{k,43}, Π_{k,44} ],
    N_k = [ N_{k,1} ; N_{k,2} ; N_{k,3} ; N_{k,4} ],

    e_k = [ 0_{n×n}, ..., 0_{n×n}, I_n, 0_{n×n}, ..., 0_{n×n} ],    (43)

are defined, where Π_{k,11}, Π_{k,22}, Π_{k,33} ∈ R^{n×n}, Π_{k,44} ∈ R^{nK×nK}; N_{k,1}, N_{k,2}, N_{k,3} ∈ R^{n×m}, N_{k,4} ∈ R^{nK×m}; and e_k ∈ R^{n×nK}, k ∈ K, denotes a matrix which has the identity matrix I_n in its k-th block and the zero matrix 0_{n×n} in the other (K-1) blocks.

Now, if the average of (42) over the N_k minor agents of type k, k ∈ K, is taken, followed by its L^2 limit as the number N_k of agents within subpopulation k goes to infinity (i.e. N_k → ∞), it yields

    dx̄^k_t = [ ( A_k - B_k R_k^{-1} (N_{k,1}^T + B_k^T Π_{k,11}) ) e_k - B_k R_k^{-1} (N_{k,4}^T + B_k^T Π_{k,14}) ] x̄_t dt
              - B_k R_k^{-1} (N_{k,3}^T + B_k^T Π_{k,13}) x^0_t dt - B_k R_k^{-1} (N_{k,2}^T + B_k^T Π_{k,12}) ȳ_t dt
              + ( b_k - B_k R_k^{-1} B_k^T s̄^k_t ) dt.    (44)

In (44), s̄^k_t is obtained by taking the average, and then the L^2 limit, of (40) over the subpopulation k ∈ K as N_k → ∞, and is given by

    ds̄^k_t + ( [ (A_k - B_k R_k^{-1} N_k^T)^T - Π_k B_k R_k^{-1} B_k^T ] s̄^k_t + Π_k M^k_t ) dt + ( Π_k Σ_k - q̄_t ) dW̄_t = 0,    s̄^k_T = 0,    (45)

where

    W̄_t = [ 0_{r×1} ; W^0_t ],    (46)

since lim_{N_k→∞} (1/N_k) Σ_{i=1}^{N_k} w^i_t = 0; hence q̄_t is an F^{0,r}_t-adapted process. Then, equating (44) with (12) results in the following sets of equations:

    Π̇_0 + Π_0 A_0 + A_0^T Π_0 - (N_0^T + B_0^T Π_0)^T R_0^{-1} (N_0^T + B_0^T Π_0) + Q_0 = 0,    Π_0(T) = G_0,
    Π̇_k + Π_k A_k + A_k^T Π_k - (N_k^T + B_k^T Π_k)^T R_k^{-1} (N_k^T + B_k^T Π_k) + Q_k = 0,    Π_k(T) = G_k,
    C̄_k = -R_k^{-1} (N_{k,1}^T + B_k^T Π_{k,11}) e_k - R_k^{-1} (N_{k,4}^T + B_k^T Π_{k,14}),
    D̄_k = -R_k^{-1} (N_{k,3}^T + B_k^T Π_{k,13}),
    Ē_k = -R_k^{-1} (N_{k,2}^T + B_k^T Π_{k,12}),
    Ā_k = A_k e_k + B_k C̄_k,    Ḡ_k = B_k D̄_k,    L̄_k = B_k Ē_k,    (47)

    ds^0_t + ( [ (A_0 - B_0 R_0^{-1} N_0^T)^T - Π_0 B_0 R_0^{-1} B_0^T ] s^0_t + Π_0 M^0_t ) dt + ( Π_0 Σ_0 - q^0_t ) dW^0_t = 0,    s^0_T = 0,
    ds̄^k_t + ( [ (A_k - B_k R_k^{-1} N_k^T)^T - Π_k B_k R_k^{-1} B_k^T ] s̄^k_t + Π_k M^k_t ) dt + ( Π_k Σ_k - q̄_t ) dW̄_t = 0,    s̄^k_T = 0,
    r̄^k_t = -R_k^{-1} B_k^T s̄^k_t,    m̄^k_t = B_k r̄^k_t + b_k.    (48)

Equations (47)-(48) are called the mean field consistency equations (see [4]), from which the matrices in (9) and (12) can be calculated. Now, according to the asymptotic equilibrium analysis performed in [4], the following matrices are defined:

    M_1 = diag( A_k - B_k R_k^{-1} (N_{k,1}^T + B_k^T Π_{k,11}) )_{k∈K},
    M̂_1 = [ -π_1 F R_1^{-1} (N_{1,1}^T + B_1^T Π_{1,11}), ..., -π_K F R_K^{-1} (N_{K,1}^T + B_K^T Π_{K,11}) ],
    M_2 = [ -B_1 R_1^{-1} (N_{1,4}^T + B_1^T Π_{1,14}) ; ... ; -B_K R_K^{-1} (N_{K,4}^T + B_K^T Π_{K,14}) ],
    M̂_2 = [ -π_1 F R_1^{-1} (N_{1,4}^T + B_1^T Π_{1,14}), ..., -π_K F R_K^{-1} (N_{K,4}^T + B_K^T Π_{K,14}) ],

    M_3 = [ F_π Ē,    F_π D̄ + H_0,  F_π C̄ + H_π ;
            0_{n×n},  A_0,           0_{n×nK}    ;
            L̄,        Ḡ,             Ā           ],

where, by (47), the blocks of M_3 can equivalently be expressed through M_1, M̂_1, M_2, M̂_2; for instance Ā_k = [M_1]_{kk} e_k + [M_2]_k, and F_π C̄ = Σ_{k∈K} ( [M̂_1]_k e_k + [M̂_2]_k ). Furthermore,

    L_H = Q^{1/2} [ 0_{n×n},   I_n,       0_{n×nK} ;
                    0_{nK×n},  0_{nK×n},  I_{nK}   ],
    L_a = Q_0^{1/2} [ I_{2n}, 0_{2n×nK} ],    L_b = Q_k^{1/2} [ I_{2n}, 0_{2n×(n+nK)} ].    (49)

Assumption 9. The matrix M_1 is Hurwitz.

Assumption 10. The pair (L_H, M_3) is observable.
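Once the minor agents' Riccati equation (39) is solved, the consistency gains C̄_k, D̄_k, Ē_k in (47) are read off from blocks of Π_k and N_k. A sketch of this bookkeeping, with the extended state ordered as [x^i; ȳ; x^0; x̄] and x̄ of size nK (the function name and the toy inputs are illustrative assumptions):

```python
import numpy as np

def consistency_gains(Pi_k, N_k, B_k, R_k, n, K, k):
    """Extract the gains C̄_k, D̄_k, Ē_k of (47) from the (3+K)n x (3+K)n matrix Pi_k."""
    blk = lambda r, c: Pi_k[r * n:(r + 1) * n, c * n:(c + 1) * n]   # n x n block (r, c)
    Pi11, Pi12, Pi13 = blk(0, 0), blk(0, 1), blk(0, 2)
    Pi14 = Pi_k[0:n, 3 * n:3 * n + n * K]
    N1, N2, N3 = N_k[0:n], N_k[n:2 * n], N_k[2 * n:3 * n]
    N4 = N_k[3 * n:3 * n + n * K]
    e_k = np.kron(np.eye(K)[k], np.eye(n))                          # I_n in k-th block
    Rinv = np.linalg.inv(R_k)
    C_bar = -Rinv @ (N1.T + B_k.T @ Pi11) @ e_k - Rinv @ (N4.T + B_k.T @ Pi14)
    D_bar = -Rinv @ (N3.T + B_k.T @ Pi13)
    E_bar = -Rinv @ (N2.T + B_k.T @ Pi12)
    return C_bar, D_bar, E_bar

# Toy check: n=1, K=2, first type (k=0), Pi_k = I, N_k = 0, B_k = R_k = I
n, K = 1, 2
C_bar, D_bar, E_bar = consistency_gains(np.eye((3 + K) * n),
                                        np.zeros(((3 + K) * n, 1)),
                                        np.eye(1), np.eye(1), n, K, k=0)
print(C_bar, D_bar, E_bar)
```

In a full consistency solve these gains would be fed back into Ā_k = A_k e_k + B_k C̄_k, Ḡ_k = B_k D̄_k, L̄_k = B_k Ē_k and iterated with the Riccati solutions until (47)-(48) hold simultaneously.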

The analysis above leads to the following theorem, in which convex analysis and asymptotic MFG equilibrium analysis are utilized to establish the infinite population Nash equilibrium and the finite population ε-Nash equilibrium.

Theorem 3. Subject to Assumptions 1-10, the mean field equations (47)-(48) together with the system equations (1)-(3) and (5)-(6) generate an infinite family of stochastic control laws U^∞_MF, with finite subfamilies U^N_MF := {u^{i,*}_t ; i ∈ N_0}, 1 ≤ N < ∞, given by (30)-(32) and (38)-(40), such that

(i) U^∞_MF yields a unique Nash equilibrium within the set of linear control laws U_L such that

    J^∞_i(u^{i,*}, u^{-i,*}) = inf_{u^i ∈ U_L} J^∞_i(u^i, u^{-i,*});

(ii) all agent systems i ∈ N_0 are second order stable in the sense that sup_{t∈[0,T], i∈N_0} E( |x^i_t|^2 + |x^(N)_t|^2 + |x^0_t|^2 + |y_t|^2 ) < C, with C independent of N;

(iii) {U^N_MF ; 1 ≤ N < ∞} yields a unique ε-Nash equilibrium within the set of linear control laws U^N_L for all ε > 0, i.e. for all ε > 0 there exists N(ε) such that for all N ≥ N(ε)

    J^N_i(u^{i,*}, u^{-i,*}) - ε ≤ inf_{u^i ∈ U^N_L} J^N_i(u^i, u^{-i,*}) ≤ J^N_i(u^{i,*}, u^{-i,*}),

where J^N_i(u^{i,*}, u^{-i,*}) → J^∞_i(u^{i,*}, u^{-i,*}), i ∈ N_0, as N → ∞.

Proof. We use the convex analysis method developed in [5] to obtain the best response strategies (30)-(32) and (38)-(40); this proves parts (i) and (ii) of the theorem. Then, following the asymptotic equilibrium analysis of [4], the set of infinite population control actions yields an ε-Nash equilibrium for the large population system, which proves part (iii) of the theorem.

First, the convex analysis is performed for the major agent A_0's extended system to derive the major agent's optimal control action in the infinite population limit. Using Theorem 2 in [5], the Gâteaux derivative of the major agent's

extended cost J^ex_0(u^0) in the direction ω^0 ∈ U_0 is given by

    ⟨D J^ex_0(u^0), ω^0⟩ = E[ ∫_0^T (ω^0_t)^T { N_0^T X^{0,u^0}_t + R_0 u^0_t + B_0^T ( e^{-A_0^T t} M^0_t - ∫_0^t e^{A_0^T (s-t)} ( Q_0 X^{0,u^0}_s + N_0 u^0_s ) ds ) } dt ],    (50)

where the martingale M^0_t is specified by

    M^0_t = E[ e^{A_0^T T} G_0 X^{0,u^0}_T + ∫_0^T e^{A_0^T s} ( Q_0 X^{0,u^0}_s + N_0 u^0_s ) ds | F^{0,r}_t ].    (51)

Given that Assumption 7 holds, according to Theorem 2 in [5], the optimal control action u^{0,*}_t for the major agent A_0 in the infinite population limit is given by

    u^{0,*}_t = -R_0^{-1} [ N_0^T X^{0,*}_t + B_0^T ( e^{-A_0^T t} M^0_t - ∫_0^t e^{A_0^T (s-t)} ( Q_0 X^{0,*}_s + N_0 u^{0,*}_s ) ds ) ],    (52)

which is obtained by setting (50) to zero for all possible paths of ω^0_t ∈ U_0. Now, let us define p^0_t as

    p^0_t = e^{-A_0^T t} M^0_t - ∫_0^t e^{A_0^T (s-t)} ( Q_0 X^{0,*}_s + N_0 u^{0,*}_s ) ds,    (53)

which is the adjoint process for the major agent's system in the stochastic maximum principle framework. Next, we adopt an ansatz for p^0_t given by

    p^0_t = Π_0(t) X^{0,*}_t + s^0_t,    (54)

whose substitution into (52) yields a linear state feedback form for the major agent's optimal control action, i.e.

    u^{0,*}_t = -R_0^{-1} [ N_0^T X^{0,*}_t + B_0^T ( Π_0(t) X^{0,*}_t + s^0_t ) ].    (55)

To find Π_0(t) ∈ R^{(2+K)n×(2+K)n} and s^0_t ∈ R^{(2+K)n}, first both sides of (54) are differentiated and then (24) and (55) are substituted, which gives

    dp^0_t = ( Π̇_0 + Π_0 A_0 - Π_0 B_0 R_0^{-1} N_0^T - Π_0 B_0 R_0^{-1} B_0^T Π_0 ) X^{0,*}_t dt
             + ( -Π_0 B_0 R_0^{-1} B_0^T s^0_t + Π_0 M^0_t ) dt + ds^0_t + Π_0 Σ_0 dW^0_t.    (56)

Next, both sides of (53) are differentiated to yield

    dp^0_t = ( -A_0^T p^0_t - Q_0 X^{0,*}_t - N_0 u^{0,*}_t ) dt + e^{-A_0^T t} dM^0_t.    (57)

According to the martingale representation theorem, the martingale M^0_t can be written as

    M^0_t = M^0_0 + ∫_0^t Z^0_s dW^0_s,    (58)

where Z^0_t is an F^{0,r}_t-adapted process. Differentiating both sides of (58) yields

    dM^0_t = Z^0_t dW^0_t.    (59)

Then, (55) and (59) are substituted into (57), which gives rise to

    dp^0_t = [ ( -Q_0 + N_0 R_0^{-1} N_0^T + N_0 R_0^{-1} B_0^T Π_0 - A_0^T Π_0 ) X^{0,*}_t + ( N_0 R_0^{-1} B_0^T - A_0^T ) s^0_t ] dt + q^0_t dW^0_t,    (60)

where q^0_t = e^{-A_0^T t} Z^0_t. Finally, (56) and (60) are equated, which results in the deterministic Riccati equation

    Π̇_0 + Π_0 A_0 + A_0^T Π_0 - (B_0^T Π_0 + N_0^T)^T R_0^{-1} (B_0^T Π_0 + N_0^T) + Q_0 = 0,    Π_0(T) = G_0,    (61)

and the stochastic offset equation

    ds^0_t + ( [ (A_0 - B_0 R_0^{-1} N_0^T)^T - Π_0 B_0 R_0^{-1} B_0^T ] s^0_t + Π_0 M^0_t ) dt + ( Π_0 Σ_0 - q^0_t ) dW^0_t = 0,    s^0_T = 0.    (62)

To derive the optimal control action for minor agent A_i, i ∈ N, as well as the corresponding Riccati and offset equations, a similar approach is followed. Utilizing Theorem 2 in [5], the Gâteaux derivative of the extended cost functional J^ex_i(u^i), k ∈ K, for minor agent A_i, i ∈ N, is computed as

    ⟨D J^ex_i(u^i), ω^i⟩ = E[ ∫_0^T (ω^i_t)^T { N_k^T X^{i,u^i}_t + R_k u^i_t + B_k^T ( e^{-A_k^T t} M^i_t - ∫_0^t e^{A_k^T (s-t)} ( Q_k X^{i,u^i}_s + N_k u^i_s ) ds ) } dt ],    (63)

where the martingale M^i_t is defined by

    M^i_t = E[ e^{A_k^T T} G_k X^{i,u^i}_T + ∫_0^T e^{A_k^T s} ( Q_k X^{i,u^i}_s + N_k u^i_s ) ds | F^{i,r}_t ].    (64)

Given Assumption 8, as per Theorem 2 in [5], the optimal control action u^{i,*}_t for minor agent A_i, i ∈ N, in the infinite population limit is given by

    u^{i,*}_t = -R_k^{-1} [ N_k^T X^{i,*}_t + B_k^T ( e^{-A_k^T t} M^i_t - ∫_0^t e^{A_k^T (s-t)} ( Q_k X^{i,*}_s + N_k u^{i,*}_s ) ds ) ],    (65)

which is obtained by setting (63) to zero for all possible paths of ω^i_t ∈ U^i. Let us define p^i_t as

    p^i_t = e^{-A_k^T t} M^i_t - ∫_0^t e^{A_k^T (s-t)} ( Q_k X^{i,*}_s + N_k u^{i,*}_s ) ds,    (66)

which is in fact the adjoint process for minor agent A_i's system in the stochastic maximum principle framework. Then we adopt an ansatz for p^i_t given by

    p^i_t = Π_k(t) X^{i,*}_t + s^{i,k}_t,    (67)

whose substitution into (65) results in a linear state feedback form for u^{i,*}_t, i.e.

    u^{i,*}_t = -R_k^{-1} [ N_k^T X^{i,*}_t + B_k^T ( Π_k(t) X^{i,*}_t + s^{i,k}_t ) ].    (68)

To find Π_k(t) ∈ R^{(3+K)n×(3+K)n} and s^{i,k}_t ∈ R^{(3+K)n}, first both sides of (67) are differentiated and then (34) and (68) are substituted, which yields

    dp^i_t = ( Π̇_k + Π_k A_k - Π_k B_k R_k^{-1} N_k^T - Π_k B_k R_k^{-1} B_k^T Π_k ) X^{i,*}_t dt
             + ( -Π_k B_k R_k^{-1} B_k^T s^{i,k}_t + Π_k M^k_t ) dt + ds^{i,k}_t + Π_k Σ_k dW^i_t.    (69)

Next, both sides of (66) are differentiated to yield

    dp^i_t = ( -A_k^T p^i_t - Q_k X^{i,*}_t - N_k u^{i,*}_t ) dt + e^{-A_k^T t} dM^i_t.    (70)

According to the martingale representation theorem, the martingale M^i_t can be written as

    M^i_t = M^i_0 + ∫_0^t Z^i_s dW^i_s,    (71)

or equivalently, when both sides of (71) are differentiated, as

    dM^i_t = Z^i_t dW^i_t,    (72)

where Z^i_t is an F^{i,r}_t-adapted process. Then, (68) and (72) are substituted into (70), which gives

    dp^i_t = [ ( -Q_k + N_k R_k^{-1} N_k^T + N_k R_k^{-1} B_k^T Π_k - A_k^T Π_k ) X^{i,*}_t + ( N_k R_k^{-1} B_k^T - A_k^T ) s^{i,k}_t ] dt + q^i_t dW^i_t,    (73)

where q^i_t = e^{-A_k^T t} Z^i_t. Finally, (69) is equated with (73), which yields

    Π̇_k + Π_k A_k + A_k^T Π_k - (B_k^T Π_k + N_k^T)^T R_k^{-1} (B_k^T Π_k + N_k^T) + Q_k = 0,    Π_k(T) = G_k,    (74)

    ds^{i,k}_t + ( [ (A_k - B_k R_k^{-1} N_k^T)^T - Π_k B_k R_k^{-1} B_k^T ] s^{i,k}_t + Π_k M^k_t ) dt + ( Π_k Σ_k - q^i_t ) dW^i_t = 0,    s^{i,k}_T = 0,    (75)

for i ∈ N, k ∈ K. Finally, following the asymptotic equilibrium analysis of [4], the set of control actions U^N_MF := {u^{i,*}_t ; i ∈ N_0}, 1 ≤ N < ∞, yields an ε-Nash equilibrium for the large population system given by (1)-(3) and (5)-(6).

5 Conclusions

In this paper, we introduced and formulated a new class of major-minor MFG systems motivated by financial and economic systems. In this novel setup, the major agent and each of the mass of minor agents interact with a common process, which also enters their cost functionals. The common process is influenced by (i) a latent process which is not observed, (ii) a common Wiener process, (iii) the major agent's state and control action, and (iv) the average state and control action of all minor agents. We then developed a convex analysis method to establish the best response strategies for all agents, which yield an ε-Nash equilibrium. Our framework can be easily extended to the case where each agent's dynamics are also influenced by the common process.

References

[1] P. Casgrain and S. Jaimungal, "Trading algorithms with learning in latent alpha models," SSRN, 2016.

[2] P. Casgrain and S. Jaimungal, "Algorithmic trading with partial information: A mean field game approach," arXiv, 2018.

[3] R. Carmona, F. Delarue, and D. Lacker, "Mean field games with common noise," Annals of Probability, vol. 44, 2016.

[4] M. Huang, "Large-population LQG games involving a major player: The Nash certainty equivalence principle," SIAM Journal on Control and Optimization, vol. 48, no. 5, 2010.

[5] D. Firoozi, P. E. Caines, and S. Jaimungal, "Convex analysis for LQG systems with applications to major minor LQG mean field game systems," arXiv, 2018.

[6] A. Nayyar, A. Gupta, C. Langbort, and T. Başar, "Common information based Markov perfect equilibria for stochastic games with asymmetric information: Finite games," IEEE Transactions on Automatic Control, vol. 59, no. 3, 2014.

[7] M. Nourian and P. E. Caines, "ε-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents," SIAM Journal on Control and Optimization, vol. 51, no. 4, 2013.

[8] P. E. Caines and A. C. Kizilkale, "ε-Nash equilibria for partially observed LQG mean field games with a major player," IEEE Transactions on Automatic Control, vol. 62, no. 7, 2017.

[9] P. E. Caines and A. C. Kizilkale, "Recursive estimation of common partially observed disturbances in MFG systems with application to large scale power markets," in Proceedings of the 52nd IEEE Conference on Decision and Control (CDC), Florence, Italy, Dec. 2013.

[10] W. M. Wonham, "Some applications of stochastic differential equations to optimal nonlinear filtering," Journal of the Society for Industrial and Applied Mathematics, Series A: Control, vol. 2, no. 3.


More information

Asymptotic properties of maximum likelihood estimator for the growth rate for a jump-type CIR process

Asymptotic properties of maximum likelihood estimator for the growth rate for a jump-type CIR process Asymptotic properties of maximum likelihood estimator for the growth rate for a jump-type CIR process Mátyás Barczy University of Debrecen, Hungary The 3 rd Workshop on Branching Processes and Related

More information

On Irregular Linear Quadratic Control: Deterministic Case

On Irregular Linear Quadratic Control: Deterministic Case 1 On Irregular Linear Quadratic Control: Deterministic Case Huanshui Zhang and Juanjuan Xu arxiv:1711.9213v2 math.oc 1 Mar 218 Abstract This paper is concerned with fundamental optimal linear quadratic

More information

Verona Course April Lecture 1. Review of probability

Verona Course April Lecture 1. Review of probability Verona Course April 215. Lecture 1. Review of probability Viorel Barbu Al.I. Cuza University of Iaşi and the Romanian Academy A probability space is a triple (Ω, F, P) where Ω is an abstract set, F is

More information

On continuous time contract theory

On continuous time contract theory Ecole Polytechnique, France Journée de rentrée du CMAP, 3 octobre, 218 Outline 1 2 Semimartingale measures on the canonical space Random horizon 2nd order backward SDEs (Static) Principal-Agent Problem

More information

ON ADAPTIVE CONTROL FOR THE CONTINUOUS TIME-VARYING JLQG PROBLEM

ON ADAPTIVE CONTROL FOR THE CONTINUOUS TIME-VARYING JLQG PROBLEM Int. J. Appl. Math. Comput. Sci., 25, Vol. 15, No. 1, 53 62 B ON ADAPTIVE CONTROL FOR THE CONTINUOUS TIME-VARYING JLQG PROBLEM ADAM CZORNIK, ANDRZEJ ŚWIERNIAK Department of Automatic Control Silesian University

More information

Stochastic and Adaptive Optimal Control

Stochastic and Adaptive Optimal Control Stochastic and Adaptive Optimal Control Robert Stengel Optimal Control and Estimation, MAE 546 Princeton University, 2018! Nonlinear systems with random inputs and perfect measurements! Stochastic neighboring-optimal

More information

Networked Sensing, Estimation and Control Systems

Networked Sensing, Estimation and Control Systems Networked Sensing, Estimation and Control Systems Vijay Gupta University of Notre Dame Richard M. Murray California Institute of echnology Ling Shi Hong Kong University of Science and echnology Bruno Sinopoli

More information

Proving the Regularity of the Minimal Probability of Ruin via a Game of Stopping and Control

Proving the Regularity of the Minimal Probability of Ruin via a Game of Stopping and Control Proving the Regularity of the Minimal Probability of Ruin via a Game of Stopping and Control Erhan Bayraktar University of Michigan joint work with Virginia R. Young, University of Michigan K αρλoβασi,

More information

Classical and Restricted Impulse Control for the Exchange Rate under a Stochastic Trend Model

Classical and Restricted Impulse Control for the Exchange Rate under a Stochastic Trend Model Classical and Restricted Impulse Control for the Exchange Rate under a Stochastic Trend Model Wolfgang J. Runggaldier Department of Mathematics University of Padua, Italy runggal@math.unipd.it Kazuhiro

More information