Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach


Ashutosh Nayyar, Aditya Mahajan and Demosthenis Teneketzis

arXiv: v1 [cs.SY] 8 Sep 2012

Abstract

A general model of decentralized stochastic control called partial history sharing information structure is presented. In this model, at each step the controllers share part of their observation and control history with each other. This general model subsumes several existing models of information sharing as special cases. Based on the information commonly known to all the controllers, the decentralized problem is reformulated as an equivalent centralized problem from the perspective of a coordinator. The coordinator knows the common information and selects prescriptions that map each controller's local information to its control actions. The optimal control problem at the coordinator is shown to be a partially observable Markov decision process (POMDP) which is solved using techniques from Markov decision theory. This approach provides (a) structural results for optimal strategies, and (b) a dynamic program for obtaining optimal strategies for all controllers in the original decentralized problem. Thus, this approach unifies the various ad-hoc approaches taken in the literature. In addition, the structural results on optimal control strategies obtained by the proposed approach cannot be obtained by the existing generic approach (the person-by-person approach) for obtaining structural results in decentralized problems; and the dynamic program obtained by the proposed approach is simpler than that obtained by the existing generic approach (the designer's approach) for obtaining dynamic programs in decentralized problems.

Index Terms: Decentralized Control, Stochastic Control, Information Structures, Markov Decision Theory, Team Theory

I. INTRODUCTION

Stochastic control theory provides analytic and computational techniques for centralized decision making in stochastic systems with noisy observations.
(A preliminary version of this paper appeared in the proceedings of the 46th Allerton conference on communication, control, and computation, 2008; see [1].)

For specific models such as Markov decision processes and linear quadratic and Gaussian systems, stochastic control gives

results that are intuitively appealing and computationally tractable. However, these results are derived under the assumption that all decisions are made by a centralized decision maker who sees all observations and perfectly recalls past observations and actions. This assumption of a centralized decision maker is not true in a number of modern control applications such as networked control systems, communication and queuing networks, sensor networks, and smart grids. In such applications, decisions are made by multiple decision makers who have access to different information. In this paper, we investigate such problems of decentralized stochastic control.

The techniques from centralized stochastic control cannot be directly applied to decentralized control problems. Nonetheless, two general solution approaches that indirectly use techniques from centralized stochastic control have been used in the literature: (i) the person-by-person approach, which takes the viewpoint of an individual decision maker (DM); and (ii) the designer's approach, which takes the viewpoint of the collective team of DMs.

The person-by-person approach investigates the decentralized control problem from the viewpoint of one DM, say DM $i$, and proceeds as follows: (i) arbitrarily fix the strategy of all DMs except DM $i$; and (ii) use centralized stochastic control to derive structural properties for the optimal best-response strategy of DM $i$. If such a structural property does not depend on the choice of the strategy of other DMs, then it also holds for the globally optimal strategy of DM $i$. By cyclically using this approach for all DMs, we can identify the structure of globally optimal strategies for all DMs.

A variation of this approach may be used to identify person-by-person optimal strategies. The variation proceeds iteratively as follows. Start with an initial guess for the strategies of all DMs. At each iteration, select one DM (say DM $i$), and change its strategy to the best response strategy given the strategy of all other DMs.
Repeat the process until a fixed point is reached, i.e., when no DM can improve performance by unilaterally changing its strategy. The resulting strategies are person-by-person optimal [2] and, in general, not globally optimal.

In summary, the person-by-person approach identifies structural properties of globally optimal strategies and provides an iterative method to obtain person-by-person optimal strategies. This method has been successfully used to identify structural properties of globally optimal strategies for various applications including real-time communication [3]–[7], decentralized hypothesis testing and quickest change detection [8]–[16], and networked control systems [17]–[19]. Under

certain conditions, the person-by-person optimal strategies found by this approach are globally optimal [2], [20], [21].

The designer's approach, which is developed in [22], [23], investigates the decentralized control problem from the viewpoint of the collective team of DMs or, equivalently, from the viewpoint of a system designer who knows the system model and probability distribution of the primitive random variables and chooses control strategies for all DMs. Effectively, the designer is solving a centralized planning problem. The designer's approach proceeds by: (i) modeling this centralized planning problem as a multi-stage, open-loop stochastic control problem in which the designer's decision at each time is the control law for that time for all DMs; and (ii) using centralized stochastic control to obtain a dynamic programming decomposition. Each step of the resulting dynamic program is a functional optimization problem (in contrast to centralized dynamic programming, where each step is a parameter optimization problem).

The designer's approach is often used in tandem with the person-by-person approach as follows. First, the person-by-person approach is used to identify structural properties of globally optimal strategies. Then, restricting attention to strategies with the identified structural property, the designer's approach is used to obtain a dynamic programming decomposition for selecting the globally optimal strategy. Such a tandem approach has been used in various applications including real-time communication [4], [24], [25], decentralized hypothesis testing [13], and networked control systems [17], [18].

In addition to the above general approaches, other specialized approaches have been developed to address specific problems in decentralized systems. Decentralized problems with partially nested information structure were defined and studied in [26]. Decentralized linear quadratic Gaussian (LQG) control problems with two controllers and partially nested information structure were studied in [27], [28].
Partially nested decentralized LQG problems with controllers connected via a graph were studied in [29], [30]. A generalization of partial nestedness called stochastic nestedness was defined and studied in [31]. An important property of LQG control problems with partially nested information structure is that there exists an affine control strategy which is globally optimal. In general, the problem of finding the best affine control strategies may not be a convex optimization problem. Conditions under which the problem of determining optimal control strategies within the class of affine control strategies becomes a convex optimization problem were identified in [32], [33].

Decentralized stochastic control problems with specific models of information sharing among controllers have also been studied in the literature. Examples include systems with delayed sharing information structures [34]–[36], systems with periodic sharing information structure [37], control sharing information structure [38], [39], systems with broadcast information structure [19], and systems with common and private observations [1].

In this paper, we present a new general model of decentralized stochastic control called partial history sharing information structure. In this model, we assume that: (i) controllers sequentially share part of their past data (past observations and control) with each other by means of a shared memory; and (ii) all controllers have perfect recall of commonly available data (common information). This model subsumes a large class of decentralized control models in which information is shared among the controllers. For this model, we present a general solution methodology that reformulates the original decentralized problem into an equivalent centralized problem from the perspective of a coordinator. The coordinator knows the common information and selects prescriptions that map each controller's local information to its control actions. The optimal control problem at the coordinator is shown to be a partially observable Markov decision process (POMDP) which is solved using techniques from Markov decision theory. This approach provides (a) structural results for optimal strategies, and (b) a dynamic program for obtaining optimal strategies for all controllers in the original decentralized problem. Thus, this approach unifies the various ad-hoc approaches taken in the literature.

A similar solution approach is used in [36] for a model that is a special case of the model presented in this paper. We present an information state (Eq. (51)) for the model of [36] that is simpler than that presented in [36, Theorem 2].
A preliminary version of the general solution approach presented here was presented in [1] for a model that had features (e.g., direct but noisy communication links between controllers) that are not necessary for partial history sharing. However, it can be shown that by suitable redefinition of variables, the model in [1] can be recast as an instance of the model in this paper and vice versa (see Appendix C). The information state for partial history sharing that is presented in this paper (see Theorem 4) is simpler than that presented in [1, Eq. (39)].

A. Common Information Approach for a Static Team Problem

We first illustrate how common information can be used in a static team problem with two controllers. Let $X$ denote the state of nature and $Y, Y^1, Y^2$ be three correlated random variables that depend on $X$. Assume that the joint distribution of $(X, Y, Y^1, Y^2)$ is given. Controller $i$, $i = 1, 2$, observes $(Y, Y^i)$ and chooses a control action $U^i = g^i(Y, Y^i)$. The system incurs a cost $l(X, U^1, U^2)$. The control objective is to choose $(g^1, g^2)$ to minimize

$$J(g^1, g^2) := \mathbb{E}^{(g^1, g^2)}[l(X, U^1, U^2)].$$

If all the system variables are finite valued, we can solve the above optimization problem by a brute force search over all control strategies $(g^1, g^2)$. For example, if all variables are binary valued, each $g^i$ is one of $2^4 = 16$ maps from $(Y, Y^i)$ to $U^i$, so we need to compute the performance of $2^4 \times 2^4 = 256$ control strategies and choose the one with the best performance.

In this example, both controllers have a common observation $Y$. One of the main ideas of this paper is to use such common information among the controllers to simplify the search process as follows. Instead of specifying the control strategies $(g^1, g^2)$ directly, we consider a coordinated system in which a coordinator observes the common information $Y$ and chooses prescriptions $(\Gamma^1, \Gamma^2)$, where $\Gamma^i$ is a mapping from $Y^i$ to $U^i$, $i = 1, 2$. Hence, $(\Gamma^1, \Gamma^2) = d(Y)$, where $d$ is called the coordination strategy. The coordinator then communicates these prescriptions to the controllers, who simply use them to choose $U^i = \Gamma^i(Y^i)$, $i = 1, 2$.

It is easy to verify (see Proposition 3 for a formal proof) that choosing the control strategies $(g^1, g^2)$ in the original system is equivalent to choosing a coordination strategy $d$ in the coordinated system. The problem of choosing the best coordination strategy, however, is a centralized problem in which the coordinator is the only decision-maker. For example, consider the case when all system variables are binary valued. For any coordination strategy $d$, let $(\gamma^1_0, \gamma^2_0) = d(0)$ and $(\gamma^1_1, \gamma^2_1) = d(1)$.
Then, the cost associated with this coordination strategy is given as:

$$J(d) := \mathbb{E}^{(d)}[l(X, U^1, U^2)] = \mathbb{P}(Y = 0)\, \mathbb{E}[l(X, \gamma^1_0(Y^1), \gamma^2_0(Y^2)) \mid Y = 0] + \mathbb{P}(Y = 1)\, \mathbb{E}[l(X, \gamma^1_1(Y^1), \gamma^2_1(Y^2)) \mid Y = 1].$$

To minimize the above cost, we can minimize the two terms separately. Therefore, to find the best coordination strategy $d$, we can search for optimal prescriptions for the cases $Y = 0$ and $Y = 1$

separately. Searching for the best prescriptions for each of these cases involves computing the performance of $2^2 \times 2^2 = 16$ prescription pairs and choosing the one with the best performance. Thus, to find the best coordination strategy, we need to evaluate the performance of $16 + 16 = 32$ prescription pairs. Contrast this with the 256 strategies whose costs we need to evaluate to solve the original problem by brute force.

The above example described a static system and illustrates that common information can be exploited to convert the decentralized optimization problem into a centralized optimization problem involving a coordinator. In this paper, we build upon this basic idea and present a solution approach based on common information that works for dynamical decentralized systems as well. Our approach converts the decentralized problem into a centralized stochastic control problem (in particular, a partially observable Markov decision process), identifies structure of optimal control strategies, and provides a dynamic program like decomposition for the decentralized problem.

B. Contributions of the Paper

We introduce a general model of decentralized stochastic control problem in which multiple controllers share part of their information with each other. We call this model the partial history sharing information structure. This model subsumes several existing models of information sharing in decentralized stochastic control as special cases (see Section II-B). We establish two results for our model. Firstly, we establish a structural property of optimal control strategies. Secondly, we provide a dynamic programming decomposition of the problem of finding optimal control strategies. As in [1], [36], our results are derived using a common information based approach (see Section III). This approach differs from the person-by-person approach and the designer's approach mentioned earlier. In particular, the structural properties found in this paper cannot be found by the person-by-person approach described earlier.
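To make the counting in the static team example of Section I-A concrete, the following Python sketch compares the brute-force search over control strategies with the coordinator's search over prescriptions when all variables are binary. The joint distribution `pmf` and the cost `loss` are arbitrary placeholders chosen only for illustration; they are assumptions and do not come from the paper.

```python
import itertools
import random

B = (0, 1)  # all variables are binary valued


def all_maps(domain, codomain):
    """Enumerate every function domain -> codomain as a dict."""
    domain = list(domain)
    for values in itertools.product(codomain, repeat=len(domain)):
        yield dict(zip(domain, values))


# Hypothetical joint pmf over (x, y, y1, y2); any pmf would do here.
_rng = random.Random(0)
_outcomes = list(itertools.product(B, repeat=4))
_weights = [_rng.random() for _ in _outcomes]
pmf = {o: w / sum(_weights) for o, w in zip(_outcomes, _weights)}


def loss(x, u1, u2):
    """Hypothetical cost l(x, u1, u2), chosen only for illustration."""
    return (x - (u1 ^ u2)) ** 2 + 0.1 * (u1 + u2)


def brute_force():
    """Search over all pairs (g1, g2), where g^i maps (y, y^i) to u^i."""
    obs = list(itertools.product(B, B))
    best, evaluated = float("inf"), 0
    for g1 in all_maps(obs, B):
        for g2 in all_maps(obs, B):
            evaluated += 1
            cost = sum(p * loss(x, g1[(y, y1)], g2[(y, y2)])
                       for (x, y, y1, y2), p in pmf.items())
            best = min(best, cost)
    return best, evaluated


def coordinator():
    """For each value of the common observation y, search prescription pairs."""
    total, evaluated = 0.0, 0
    for y in B:
        best_y = float("inf")
        for gam1 in all_maps(B, B):
            for gam2 in all_maps(B, B):
                evaluated += 1
                # Equals P(Y = y) * E[l | Y = y] for this prescription pair.
                cost = sum(p * loss(x, gam1[y1], gam2[y2])
                           for (x, yy, y1, y2), p in pmf.items() if yy == y)
                best_y = min(best_y, cost)
        total += best_y
    return total, evaluated
```

Under these assumptions, `brute_force()` evaluates 256 strategy pairs and `coordinator()` evaluates 32 prescription pairs, and both return the same optimal cost up to floating-point rounding.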
Moreover, the dynamic programming decomposition found in this paper is distinct from and simpler than the dynamic programming decomposition based on the designer's approach. For a general framework for using common information in sequential decision making problems, see [40].

C. Notation

Random variables are denoted by upper case letters; their realization by the corresponding lower case letter. For integers $a \le b$ and $c \le d$, $X_{a:b}$ is a short hand for the vector $(X_a, X_{a+1}, \dots, X_b)$

while $X^{c:d}$ is a short hand for the vector $(X^c, X^{c+1}, \dots, X^d)$. When $a > b$, $X_{a:b}$ equals the empty set. The combined notation $X^{c:d}_{a:b}$ is a short hand for the vector $(X^j_i : i = a, a+1, \dots, b,\ j = c, c+1, \dots, d)$. In general, subscripts are used as time index while superscripts are used to index controllers. Bold letters $\mathbf{X}$ are used as a short hand for the vector $(X^{1:n})$.

$\mathbb{P}(\cdot)$ is the probability of an event, $\mathbb{E}(\cdot)$ is the expectation of a random variable. For a collection of functions $g$, we use $\mathbb{P}^g(\cdot)$ and $\mathbb{E}^g(\cdot)$ to denote that the probability measure/expectation depends on the choice of functions in $g$. $\mathbb{1}_A(\cdot)$ is the indicator function of a set $A$. For singleton sets $\{a\}$, we also denote $\mathbb{1}_{\{a\}}(\cdot)$ by $\mathbb{1}_a(\cdot)$.

For a singleton $a$ and a set $B$, $\{a, B\}$ denotes the set $\{a\} \cup B$. For two sets $A$ and $B$, $\{A, B\}$ denotes the set $A \cup B$. For two finite sets $A, B$, $F(A, B)$ is the set of all functions from $A$ to $B$. Also, if $A = \emptyset$, $F(A, B) := B$. For a finite set $A$, $\Delta(A)$ is the set of all probability mass functions over $A$. For the ease of exposition, we assume that all state, observation and control variables take values in finite sets.

For two random variables (or random vectors) $X$ and $Y$ taking values in $\mathcal{X}$ and $\mathcal{Y}$, $\mathbb{P}(X = x \mid Y)$ denotes the conditional probability of the event $\{X = x\}$ given $Y$ and $\mathbb{P}(X \mid Y)$ denotes the conditional PMF (probability mass function) of $X$ given $Y$, that is, it denotes the collection of conditional probabilities $\mathbb{P}(X = x \mid Y)$, $x \in \mathcal{X}$. Finally, all equalities involving random variables are to be interpreted as almost sure equalities (that is, they hold with probability one).

D. Organization

The rest of this paper is organized as follows. We present our model of a decentralized stochastic control problem in Section II. We also present several special cases of our model in this section. We prove our main results in Section III. We apply our result to some special cases in Section III-B. We present a simplification of our result and a generalization of our model in Section IV. We consider the infinite time-horizon discounted cost analogue of our problem in Section V. Finally, we conclude in Section VI.

II.
PROBLEM FORMULATION

A. Basic model: Partial History Sharing Information Structure

1) The Dynamic System: Consider a dynamic system with $n$ controllers. The system operates in discrete time for a horizon $T$. Let $X_t \in \mathcal{X}_t$ denote the state of the system at time $t$, $U^i_t \in \mathcal{U}^i_t$

denote the control action of controller $i$, $i = 1, \dots, n$ at time $t$, and $\mathbf{U}_t$ denote the vector $(U^1_t, \dots, U^n_t)$. The initial state $X_1$ has a probability distribution $Q_1$ and evolves according to

$$X_{t+1} = f_t(X_t, \mathbf{U}_t, W^0_t), \tag{1}$$

where $\{W^0_t\}_{t=1}^T$ is a sequence of i.i.d. random variables with probability distribution $Q^0_W$.

2) Data available at the controller: At any time $t$, each controller has access to three types of data: current observation, local memory, and shared memory.

(i) Current local observation: Each controller makes a local observation $Y^i_t \in \mathcal{Y}^i_t$ on the state of the system at time $t$,

$$Y^i_t = h^i_t(X_t, W^i_t), \tag{2}$$

where $\{W^i_t\}_{t=1}^T$ is a sequence of i.i.d. random variables with probability distribution $Q^i_W$. We assume that the random variables in the collection $\{X_1, W^j_t,\ t = 1, \dots, T,\ j = 0, 1, \dots, n\}$, called primitive random variables, are mutually independent.

(ii) Local memory: Each controller stores a subset $M^i_t$ of its past local observations and its past actions in a local memory:

$$M^i_t \subset \{Y^i_{1:t-1}, U^i_{1:t-1}\}. \tag{3}$$

At $t = 1$, the local memory is empty, $M^i_1 = \emptyset$.

(iii) Shared memory: In addition to its local memory, each controller has access to a shared memory. The contents $C_t$ of the shared memory at time $t$ are a subset of the past local observations and control actions of all controllers:

$$C_t \subset \{\mathbf{Y}_{1:t-1}, \mathbf{U}_{1:t-1}\}, \tag{4}$$

where $\mathbf{Y}_t$ and $\mathbf{U}_t$ denote the vectors $(Y^1_t, \dots, Y^n_t)$ and $(U^1_t, \dots, U^n_t)$ respectively. At $t = 1$, the shared memory is empty, $C_1 = \emptyset$.

Controller $i$ chooses action $U^i_t$ as a function of the total data $(Y^i_t, M^i_t, C_t)$ available to it. Specifically, for every controller $i$, $i = 1, \dots, n$,

$$U^i_t = g^i_t(Y^i_t, M^i_t, C_t), \tag{5}$$

where $g^i_t$ is called the control law of controller $i$. The collection $g^i = (g^i_1, \dots, g^i_T)$ is called the control strategy of controller $i$. The collection $g^{1:n} = (g^1, \dots, g^n)$ is called the control strategy of the system.
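As a rough illustration of how the pieces in (1), (2) and (5) fit together, the Python sketch below simulates one sample path of such a system. Several things here are assumptions made only for the example: the state-evolution and observation functions are taken to be time-invariant, the noise variables are drawn uniformly from $[0, 1)$, memory contents are stored as tagged tuples, and the rule that fills the local and shared memories is left as a caller-supplied function `protocol` (a hypothetical name). The toy rule `control_sharing` shares each action and keeps each observation locally.

```python
import random


def simulate(f, h, g, protocol, x1, n, T, rng):
    """Simulate one sample path of an n-controller system for T steps.

    f(x, u, w0)             -> next state, as in (1)
    h[i](x, wi)             -> local observation of controller i, as in (2)
    g[i](t, y, m, c)        -> action of controller i, as in (5)
    protocol(t, i, m, y, u) -> (z, m_next): data sent to the shared memory
                               and the new local memory (hypothetical helper)
    """
    x = x1
    local = [frozenset() for _ in range(n)]  # M^i_1 is empty
    shared = frozenset()                     # C_1 is empty
    path = []
    for t in range(1, T + 1):
        ys = [h[i](x, rng.random()) for i in range(n)]
        # Every controller acts on (Y^i_t, M^i_t, C_t) before any memory changes.
        us = tuple(g[i](t, ys[i], local[i], shared) for i in range(n))
        path.append((x, us))
        for i in range(n):
            z, local[i] = protocol(t, i, local[i], ys[i], us[i])
            shared = shared | z
        x = f(x, us, rng.random())
    return path


# Toy binary instance (all functions below are illustrative assumptions).
def f(x, u, w0):
    return x ^ (u[0] & u[1]) ^ int(w0 < 0.1)


def h_i(x, w):
    return x ^ int(w < 0.2)


def g_i(t, y, m, c):
    # Combine the current observation with the actions shared at time t - 1.
    prev = [v for (kind, j, s, v) in c if s == t - 1]
    return y if not prev else int(y + sum(prev) >= 2)


def control_sharing(t, i, m, y, u):
    # Send the current action to the shared memory, keep the observation locally.
    return frozenset({("U", i, t, u)}), m | {("Y", i, t, y)}


path = simulate(f, [h_i, h_i], [g_i, g_i], control_sharing,
                x1=0, n=2, T=5, rng=random.Random(1))
```

Each entry of `path` is a pair (state, action vector); a cost function could be summed along the path to estimate the performance of a given control strategy by Monte Carlo.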

3) Update of local and shared memories:

(i) Shared memory update: After taking the control action at time $t$, the local information at controller $i$ consists of the contents $M^i_t$ of its local memory, its local observation $Y^i_t$ and its control action $U^i_t$. Controller $i$ sends a subset $Z^i_t$ of this local information $\{M^i_t, Y^i_t, U^i_t\}$ to the shared memory. The subset $Z^i_t$ is chosen according to a pre-specified protocol. The contents of shared memory are nested in time, that is, the contents $C_{t+1}$ of the shared memory at time $t + 1$ are the contents $C_t$ at time $t$ augmented with the new data $\mathbf{Z}_t = (Z^1_t, Z^2_t, \dots, Z^n_t)$ sent by all the controllers at time $t$:

$$C_{t+1} = \{C_t, \mathbf{Z}_t\}. \tag{6}$$

(ii) Local memory update: After taking the control action and sending data to the shared memory at time $t$, controller $i$ updates its local memory according to a pre-specified protocol. The content $M^i_{t+1}$ of the local memory can at most equal the total local information $\{M^i_t, Y^i_t, U^i_t\}$ at the controller. However, to ensure that the local and shared memories at time $t + 1$ don't overlap, we assume that

$$M^i_{t+1} \subset \{M^i_t, Y^i_t, U^i_t\} \setminus Z^i_t. \tag{7}$$

Figure 1 shows the time order of observations, actions and memory updates. We refer to the above model as the partial history sharing information structure.

[Fig. 1. Time ordering of observations, actions and memory updates.]

4) The optimization problem: At time $t$, the system incurs a cost $l(X_t, \mathbf{U}_t)$. The performance of the control strategy of the system is measured by the expected total cost

$$J(g^{1:n}) := \mathbb{E}^{g^{1:n}}\Big[\sum_{t=1}^{T} l(X_t, \mathbf{U}_t)\Big], \tag{8}$$

where the expectation is with respect to the joint probability measure on $(X_{1:T}, \mathbf{U}_{1:T})$ induced by the choice of $g^{1:n}$. We are interested in the following optimization problem.

Problem 1: For the model described above, given the state evolution functions $f_t$, the observation functions $h^i_t$, the protocols for updating local and shared memory, the cost function $l$, the distributions $Q_1$, $Q^i_W$, $i = 0, 1, \dots, n$, and the horizon $T$, find a control strategy $g^{1:n}$ for the system that minimizes the expected total cost given by (8).

B. Special Cases: The Models

In the above model, although we have not specified the exact protocols by which controllers update the local and shared memories, we assume that pre-specified protocols are being used. Different choices of this protocol result in different information structures for the system. In this section, we describe several models of decentralized control systems that can be viewed as special cases of our model by assuming a particular choice of protocol for local and shared memory updates.

1) Delayed Sharing Information Structure: Consider the following special case of the model of Section II-A.

(i) The shared memory at the beginning of time $t$ is $C_t = \{\mathbf{Y}_{1:t-s}, \mathbf{U}_{1:t-s}\}$, where $s \ge 1$ is a fixed number. The local memory at the beginning of time $t$ is $M^i_t = \{Y^i_{t-s+1:t-1}, U^i_{t-s+1:t-1}\}$.

(ii) At each time $t$, after taking the action $U^i_t$, controller $i$ sends $Z^i_t = \{Y^i_{t-s+1}, U^i_{t-s+1}\}$ to the shared memory and the shared memory at $t + 1$ becomes $C_{t+1} = \{\mathbf{Y}_{1:t-s+1}, \mathbf{U}_{1:t-s+1}\}$.

(iii) After sending $Z^i_t = \{Y^i_{t-s+1}, U^i_{t-s+1}\}$ to the shared memory, controller $i$ updates the local memory to $M^i_{t+1} = \{Y^i_{t-s+2:t}, U^i_{t-s+2:t}\}$.

In this special case, the observations and control actions of each controller are shared with every other controller after a delay of $s$ time steps. Hence, the above special case corresponds to the delayed sharing information structure considered in [34], [36], [41].

2) Delayed State Sharing Information Structure: A special case of the delayed sharing information structure (which itself is a special case of our basic model) is the delayed state sharing information structure [35].
This information structure can be obtained from the delayed sharing information structure by making the following assumptions:

(i) The state of the system at time $t$ is an $n$-dimensional vector $X_t = (X^1_t, X^2_t, \dots, X^n_t)$.

(ii) At each time $t$, the current local observation of controller $i$ is $Y^i_t = X^i_t$, for $i = 1, 2, \dots, n$.

In this special case, the complete state vector $X_t$ is available to all controllers after a delay of $s$ time steps.

3) Periodic Sharing Information Structure: Consider the following special case of the model of Section II-A where controllers update the shared memory periodically with period $s \ge 1$:

(i) For time $ks < t \le (k + 1)s$, where $k = 0, 1, 2, \dots$, the shared memory at the beginning of time $t$ is $C_t = \{\mathbf{Y}_{1:ks}, \mathbf{U}_{1:ks}\}$. The local memory at the beginning of time $t$ is $M^i_t = \{Y^i_{ks+1:t-1}, U^i_{ks+1:t-1}\}$.

(ii) At each time $t = (k + 1)s$, $k = 0, 1, 2, \dots$, after taking the action $U^i_t$, controller $i$ sends $Z^i_t = \{Y^i_{ks+1:(k+1)s}, U^i_{ks+1:(k+1)s}\}$ to the shared memory. At other times, each controller does not send anything (thus $Z^i_t = \emptyset$).

(iii) After sending $Z^i_t$ to the shared memory, controller $i$ updates the local memory to $M^i_{t+1} = \{M^i_t, Y^i_t, U^i_t\} \setminus Z^i_t$.

In this special case, the entire history of observations and control actions is shared periodically between controllers with period $s$. Hence, the above special case corresponds to the periodic sharing information structure considered in [37].

4) Control Sharing Information Structure: Consider the following special case of the model of Section II-A.

(i) The shared memory at the beginning of time $t$ is $C_t = \{\mathbf{U}_{1:t-1}\}$. The local memory at the beginning of time $t$ is $M^i_t = \{Y^i_{1:t-1}\}$.

(ii) At each time $t$, after taking the action $U^i_t$, controller $i$ sends $Z^i_t = \{U^i_t\}$ to the shared memory.

(iii) After sending $Z^i_t = \{U^i_t\}$ to the shared memory, controller $i$ updates the local memory to $M^i_{t+1} = \{Y^i_{1:t}\}$.

In this special case, the control actions of each controller are shared with every other controller after a delay of 1 time step. Hence, the above special case corresponds to the control sharing information structure considered in [38].

5) No Shared Memory with or without finite local memory: Consider the following special case of the model of Section II-A.

(i) The shared memory at each time is empty, $C_t = \emptyset$, and the local memory at the beginning of time $t$ is $M^i_t = \{Y^i_{t-s:t-1}, U^i_{t-s:t-1}\}$, where $s \ge 1$ is a fixed number.

(ii) Controllers do not send any data to shared memory, $Z^i_t = \emptyset$.

(iii) At the end of time $t$, controllers update their local memories to $M^i_{t+1} = \{Y^i_{t-s+1:t}, U^i_{t-s+1:t}\}$.

In this special case, the controllers don't share any data. The above model is related to the finite-memory controller model of [42]. A related special case is the situation where the local memory at each controller consists of all of its past local observations and its past actions, that is, $M^i_t = \{Y^i_{1:t-1}, U^i_{1:t-1}\}$.

Remark 1: All the special cases considered above are examples of symmetric sharing. That is, different controllers update their local memories according to identical protocols and the data sent by a controller to the shared memory is selected according to identical protocols. However, this symmetry is not required for our model. Consider, for example, the delayed sharing information structure where at the end of time $t$, controller $i$ sends $Y^i_{t-s^i}, U^i_{t-s^i}$ to the shared memory, with $s^i$, $i = 1, 2, \dots, n$, being fixed, but not necessarily identical, numbers. This kind of asymmetric sharing is also a special case of our model.

III. MAIN RESULTS

For centralized systems, stochastic control theory provides two important analytical results. Firstly, it provides a structural result. This result states that there is an optimal control strategy which selects control actions as a function only of the controller's posterior belief on the state of the system conditioned on all its observations and actions till the current time. The controller's posterior belief is called its information state. Secondly, stochastic control theory provides a dynamic programming decomposition of the problem of finding optimal control strategies in centralized systems. This dynamic programming decomposition allows one to evaluate the optimal action for each realization of the controller's information state in a backward inductive manner.
In this section, we provide a structural result and a dynamic programming decomposition for the decentralized stochastic control problem with partial information sharing formulated above (Problem 1). The main idea of the proof is to formulate an equivalent centralized stochastic control problem; solve the equivalent problem using classical stochastic-control techniques; and translate the results back to the basic model. For that matter, we proceed as follows:

1) Formulate a centralized coordinated system from the point of view of a coordinator that observes only the common information among the controllers in the basic model, i.e., the coordinator observes the shared memory $C_t$ but not the local memories ($M^i_t$, $i = 1, \dots, n$) or local observations ($Y^i_t$, $i = 1, \dots, n$). The coordinator chooses prescriptions $\Gamma_t = (\Gamma^1_t, \dots, \Gamma^n_t)$, where $\Gamma^i_t$ is a mapping from $(Y^i_t, M^i_t)$ to $U^i_t$, $i = 1, \dots, n$.

2) Show that the coordinated system is a POMDP (partially observable Markov decision process).

3) For the coordinated system, determine the structure of an optimal coordination strategy and a dynamic program to find an optimal coordination strategy.

4) Show that any strategy of the coordinated system is implementable in the basic model with the same value of the total expected cost. Conversely, any strategy of the basic model is implementable in the coordinated system with the same value of the total expected cost. Hence, the two systems are equivalent.

5) Translate the structural results and dynamic programming decomposition of the coordinated system (obtained in stage 3) to the basic model.

Stage 1: The coordinated system

Consider a coordinated system that consists of a coordinator and $n$ passive controllers. The coordinator knows the shared memory $C_t$ at time $t$, but not the local memories ($M^i_t$, $i = 1, \dots, n$) or local observations ($Y^i_t$, $i = 1, \dots, n$). At each time $t$, the coordinator chooses mappings $\Gamma^i_t : \mathcal{Y}^i_t \times \mathcal{M}^i_t \to \mathcal{U}^i_t$, $i = 1, 2, \dots, n$, according to

$$\Gamma_t = d_t(C_t, \Gamma_{1:t-1}), \tag{9}$$

where $\Gamma_t = (\Gamma^1_t, \Gamma^2_t, \dots, \Gamma^n_t)$. The function $d_t$ is called the coordination rule at time $t$ and the collection of functions $d := (d_1, \dots, d_T)$ is called the coordination strategy. The selected $\Gamma^i_t$ is communicated to controller $i$ at time $t$. The function $\Gamma^i_t$ tells controller $i$ how to process its current local observation and its local memory at time $t$; for that reason, we call $\Gamma^i_t$ the coordinator's prescription to controller $i$. Controller $i$ generates an action using its prescription as follows:

$$U^i_t = \Gamma^i_t(Y^i_t, M^i_t). \tag{10}$$

For this coordinated system, the system dynamics, the observation model and the cost are the same as the basic model of Section II-A: the system dynamics are given by (1), each controller's current observation is given by (2) and the instantaneous cost at time $t$ is $l(X_t, \mathbf{U}_t)$. As before, the performance of a coordination strategy is measured by the expected total cost

$$\hat{J}(d) = \mathbb{E}\Big[\sum_{t=1}^{T} l(X_t, \mathbf{U}_t)\Big], \tag{11}$$

where the expectation is with respect to a joint measure on $(X_{1:T}, \mathbf{U}_{1:T})$ induced by the choice of $d$. In this coordinated system, we are interested in the following optimization problem:

Problem 2: For the model of the coordinated system described above, find a coordination strategy $d$ that minimizes the total expected cost given by (11).

Stage 2: The coordinated system as a POMDP

We will now show that the coordinated system is a partially observable Markov decision process. For that matter, we first describe the model of POMDPs [43].

POMDP Model: A partially observable Markov decision process consists of a state process $S_t \in \mathcal{S}$, an observation process $O_t \in \mathcal{O}$, an action process $A_t \in \mathcal{A}$, $t = 1, 2, \dots, T$, and a single decision-maker where

1) The action at time $t$ is chosen by the decision-maker as a function of observation and action history, that is,

$$A_t = d_t(O_{1:t}, A_{1:t-1}), \tag{12}$$

where $d_t$ is the decision rule at time $t$.

2) After the action at time $t$ is taken, the new state and new observation are generated according to the transition probability rule

$$\mathbb{P}(S_{t+1}, O_{t+1} \mid S_{1:t}, O_{1:t}, A_{1:t}) = \mathbb{P}(S_{t+1}, O_{t+1} \mid S_t, A_t). \tag{13}$$

3) At each time, an instantaneous cost $\tilde{l}(S_t, A_t)$ is incurred.

4) The optimization problem for the decision-maker is to choose a decision strategy $d := (d_1, \dots, d_T)$ to minimize a total cost given as

$$\mathbb{E}\Big[\sum_{t=1}^{T} \tilde{l}(S_t, A_t)\Big]. \tag{14}$$
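For a finite POMDP of the kind just described, the decision-maker's posterior distribution on the current state can be propagated by Bayes' rule using only the one-step kernel in (13). The Python sketch below shows one such update; the function name `belief_update`, the dictionary representation of distributions, and the two-state example kernel are all assumptions made for illustration.

```python
def belief_update(theta, a, o_next, kernel):
    """One Bayes-rule update of the posterior on the state of a finite POMDP.

    theta        : dict mapping state s -> P(S_t = s | history)
    a            : action taken at time t
    o_next       : observation received at time t + 1
    kernel(s, a) : dict mapping (s_next, o) -> P(S_{t+1} = s_next, O_{t+1} = o | s, a)
    """
    unnormalized = {}
    for s, p in theta.items():
        for (s_next, o), q in kernel(s, a).items():
            if o == o_next:
                unnormalized[s_next] = unnormalized.get(s_next, 0.0) + p * q
    z = sum(unnormalized.values())
    if z == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: v / z for s, v in unnormalized.items()}


def example_kernel(s, a):
    """Hypothetical two-state kernel: the state persists with probability 0.9
    (the action is ignored) and the observation equals the new state with
    probability 0.8."""
    out = {}
    for s_next in (0, 1):
        p_s = 0.9 if s_next == s else 0.1
        for o in (0, 1):
            p_o = 0.8 if o == s_next else 0.2
            out[(s_next, o)] = p_s * p_o
    return out


posterior = belief_update({0: 0.5, 1: 0.5}, a=None, o_next=1, kernel=example_kernel)
```

Starting from a uniform belief, the predicted state is still uniform in this example, so after observing 1 the posterior puts probability 0.8 on state 1 and 0.2 on state 0.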

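The following well-known result provides the structure of optimal strategies and a dynamic program for POMDPs. For details, see [43].

Theorem 1 (POMDP Result): Let $\Theta_t$ be the conditional probability distribution of the state $S_t$ at time $t$ given the observations $O_{1:t}$ and actions $A_{1:t-1}$,

$$\Theta_t(s) = \mathbb{P}(S_t = s \mid O_{1:t}, A_{1:t-1}), \quad s \in \mathcal{S}.$$

Then,

1) $\Theta_{t+1} = \eta_t(\Theta_t, A_t, O_{t+1})$, where $\eta_t$ is the standard non-linear filter: If $\theta_t, a_t, o_{t+1}$ are the realizations of $\Theta_t$, $A_t$ and $O_{t+1}$, then the realization of the $s$-th element of the vector $\Theta_{t+1}$ is

$$\theta_{t+1}(s) = \frac{\sum_{s'} \theta_t(s')\, \mathbb{P}(S_{t+1} = s, O_{t+1} = o_{t+1} \mid S_t = s', A_t = a_t)}{\sum_{\hat{s}, s'} \theta_t(\hat{s})\, \mathbb{P}(S_{t+1} = s', O_{t+1} = o_{t+1} \mid S_t = \hat{s}, A_t = a_t)} =: \eta^s_t(\theta_t, a_t, o_{t+1}) \tag{15}$$

and $\eta_t(\theta_t, a_t, o_{t+1})$ is the vector $(\eta^s_t(\theta_t, a_t, o_{t+1}))_{s \in \mathcal{S}}$.

2) There exists an optimal decision strategy of the form $A_t = \hat{d}_t(\Theta_t)$. Further, such a strategy can be found by the following dynamic program:

$$V_T(\theta) = \inf_a \mathbb{E}\{\tilde{l}(S_T, a) \mid \Theta_T = \theta\}, \tag{16}$$

and for $1 \le t \le T - 1$,

$$V_t(\theta) = \inf_a \mathbb{E}\{\tilde{l}(S_t, a) + V_{t+1}(\eta_t(\theta, a, O_{t+1})) \mid \Theta_t = \theta, A_t = a\}. \tag{17}$$

We will now show that the coordinated system can be viewed as an instance of the above POMDP model by defining the state process as $S_t := \{X_t, \mathbf{Y}_t, \mathbf{M}_t\}$, the observation process as $O_t := \mathbf{Z}_{t-1}$, and the action process as $A_t := \Gamma_t$.

Lemma 1: For the coordinated system of Problem 2,

1) There exist functions $\tilde{f}_t$ and $\tilde{h}_t$, $t = 1, \dots, T$, such that

$$S_{t+1} = \tilde{f}_t(S_t, \Gamma_t, W^0_t, \mathbf{W}_{t+1}), \tag{18}$$

and

$$\mathbf{Z}_t = \tilde{h}_t(S_t, \Gamma_t). \tag{19}$$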

In particular, we have that

$$\mathbb{P}(S_{t+1}, \mathbf{Z}_t \mid S_{1:t}, \mathbf{Z}_{1:t-1}, \Gamma_{1:t}) = \mathbb{P}(S_{t+1}, \mathbf{Z}_t \mid S_t, \Gamma_t). \tag{20}$$

2) Furthermore, there exists a function $\tilde{l}$ such that

$$l(X_t, \mathbf{U}_t) = \tilde{l}(S_t, \Gamma_t). \tag{21}$$

Thus, the objective of minimizing (11) is the same as minimizing

$$\hat{J}(d) = \mathbb{E}\Big[\sum_{t=1}^{T} \tilde{l}(S_t, \Gamma_t)\Big]. \tag{22}$$

Proof: The existence of $\tilde{f}_t$ follows from (1), (2), (10), (7) and the definition of $S_t$. The existence of $\tilde{h}_t$ follows from the fact that $Z^i_t$ is a fixed subset of $\{M^i_t, Y^i_t, U^i_t\}$, equation (10) and the definition of $S_t$. Equation (20) follows from (18) and the independence of $W^0_t, \mathbf{W}_{t+1}$ from all random variables in the conditioning in the left hand side of (20). The existence of $\tilde{l}$ follows from the definition of $S_t$ and (10).

Recall that the coordinator is choosing its actions according to a coordination strategy of the form

$$\Gamma_t = d_t(C_t, \Gamma_{1:t-1}) = d_t(\mathbf{Z}_{1:t-1}, \Gamma_{1:t-1}). \tag{23}$$

Equation (23) and Lemma 1 imply that the coordinated system is an instance of the POMDP model described above.

Stage 3: Structural result and dynamic program for the coordinated system

Since the coordinated system is a POMDP, Theorem 1 gives the structure of the optimal coordination strategies. For that matter, define the coordinator's information state

$$\Pi_t := \mathbb{P}(S_t \mid \mathbf{Z}_{1:t-1}, \Gamma_{1:t-1}) = \mathbb{P}(S_t \mid C_t, \Gamma_{1:t-1}). \tag{24}$$

Then, we have the following:

Proposition 1: For Problem 2, there is no loss of optimality in restricting attention to coordination rules of the form

$$\Gamma_t = \hat{d}_t(\Pi_t). \tag{25}$$

Furthermore, an optimal coordination strategy of the form (25) can be found using a dynamic program. To that end, observe that we can write

$$\Pi_{t+1} = \eta_t(\Pi_t, Z_t, \Gamma_t) \tag{26}$$

where $\eta_t$ is the standard non-linear filtering update function (see Appendix A). We denote by $\mathcal{B}_t$ the space of possible realizations of $\Pi_t$. Thus,

$$\mathcal{B}_t := \Delta(\mathcal{X}_t \times \mathcal{Y}^1_t \times \mathcal{M}^1_t \times \dots \times \mathcal{Y}^n_t \times \mathcal{M}^n_t). \tag{27}$$

Recall that $\mathcal{F}(\mathcal{Y}^i_t \times \mathcal{M}^i_t, \mathcal{U}^i_t)$ is the set of all functions from $\mathcal{Y}^i_t \times \mathcal{M}^i_t$ to $\mathcal{U}^i_t$ (see Section I-C). Then, we have the following result.

Proposition 2 For all $\pi$ in $\mathcal{B}_t$, define

$$V_T(\pi) = \inf_{\{\gamma^i_T \in \mathcal{F}(\mathcal{Y}^i_T \times \mathcal{M}^i_T,\, \mathcal{U}^i_T),\ 1 \le i \le n\}} \mathbb{E}[\tilde{l}(S_T, \Gamma_T) \mid \Pi_T = \pi, \Gamma_T = (\gamma^1_T, \dots, \gamma^n_T)], \tag{28}$$

and for $1 \le t \le T-1$,

$$V_t(\pi) = \inf_{\{\gamma^i_t \in \mathcal{F}(\mathcal{Y}^i_t \times \mathcal{M}^i_t,\, \mathcal{U}^i_t),\ 1 \le i \le n\}} \mathbb{E}[\tilde{l}(S_t, \Gamma_t) + V_{t+1}(\eta_t(\Pi_t, \Gamma_t, Z_t)) \mid \Pi_t = \pi, \Gamma_t = (\gamma^1_t, \dots, \gamma^n_t)]. \tag{29}$$

Then the arg inf at each time step gives the coordinator's optimal prescriptions for the controllers when the coordinator's information state is $\pi$.

Proposition 2 gives a dynamic program for the coordinator's problem (Problem 2). Since the coordinated system is a POMDP, computational algorithms for POMDPs can be used to solve the dynamic program for the coordinator's problem as well. We refer the reader to [44] and references therein for a review of algorithms to solve POMDPs.

Stage 4: Equivalence between the two models

We first observe that since $C_s \subset C_t$ for all $s < t$, under any given coordination strategy $d$, we can use $C_t$ to evaluate the past prescriptions by recursive substitution. For example, for $t = 2, 3$, the past prescriptions can be evaluated as functions of $C_2$, $C_3$ as follows:

$$\Gamma_1 = d_1(C_1) =: \tilde{d}_1(C_2), \qquad \Gamma_2 = d_2(C_2, \Gamma_1) = d_2(C_2, \tilde{d}_1(C_2)) =: \tilde{d}_2(C_3).$$
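The recursive substitution above can be sketched in a few lines of Python. In this illustration a coordination strategy is a list of functions `d[k](c, gammas)`, and the common information $C_t$ is modelled as the tuple $(Z_1, \dots, Z_{t-1})$ so that each earlier $C_s$ is a prefix of it. Both representations, and all names, are assumptions made for the example; the prescriptions are stood in for by plain labels rather than functions.

```python
def past_prescriptions(d, common_info):
    """Recover Gamma_1, ..., Gamma_{t-1} from C_t alone.

    d           : list of coordination rules; d[k](c, gammas) returns the
                  prescription at time k+1 given common information c and
                  the tuple of earlier prescriptions (hypothetical interface).
    common_info : tuple (Z_1, ..., Z_{t-1}) standing in for C_t, so that
                  C_s is its prefix of length s-1.
    """
    gammas = ()
    for k in range(len(common_info)):
        c_k = common_info[:k]           # C_{k+1}, a prefix of C_t
        gammas += (d[k](c_k, gammas),)  # Gamma_{k+1} = d_{k+1}(C_{k+1}, Gamma_{1:k})
    return gammas

# Toy strategy: the first rule ignores its inputs, the second one
# combines the first shared symbol with the first prescription.
d = [
    lambda c, g: "a",
    lambda c, g: (c[0], g[0]),
]
result = past_prescriptions(d, ("z1", "z2"))  # plays the role of C_3
```

Because every step only reads a prefix of `common_info`, each controller can run the same computation locally, which is the key fact used in part (b) of the equivalence result that follows.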

We can now state the following result.

Proposition 3 The basic model of Section II-A and the coordinated system are equivalent. More precisely:

(a) Given any control strategy $g^{1:n}$ for the basic model, choose a coordination strategy $d$ for the coordinated system of Stage 1 as

$$d_t(C_t) = \big(g^1_t(\cdot, \cdot, C_t), \dots, g^n_t(\cdot, \cdot, C_t)\big).$$

Then $\hat{J}(d) = J(g^{1:n})$.

(b) Conversely, for any coordination strategy $d$ for the coordinated system, choose a control strategy $g^{1:n}$ for the basic model as

$$g^i_1(\cdot, \cdot, C_1) = d^i_1(C_1), \quad \text{and} \quad g^i_t(\cdot, \cdot, C_t) = d^i_t(C_t, \Gamma_{1:t-1}),$$

where $\Gamma_k = d_k(C_k, \Gamma_{1:k-1})$, $k = 1, 2, \dots, t-1$, and $d^i_t(\cdot)$ is the $i$-th component of $d_t(\cdot)$ (that is, $d^i_t(\cdot)$ gives the coordinator's prescription for the $i$-th controller). Then, $J(g^{1:n}) = \hat{J}(d)$.

Proof: See Appendix B.

Stage 5: Structural result and dynamic program for the basic model

Combining Proposition 1 with Proposition 3, we get the following structural result for Problem 1.

Theorem 2 (Structural Result for Optimal Control Strategies) In Problem 1, there exist optimal control strategies of the form

$$U^i_t = \hat{g}^i_t(Y^i_t, M^i_t, \Pi_t), \quad i = 1, 2, \dots, n, \tag{30}$$

where $\Pi_t$ is the conditional distribution on $X_t, Y_t, M_t$ given $C_t$, defined as

$$\Pi_t(x, y, m) := \mathbb{P}^{\hat{g}^{1:n}_{1:t-1}}(X_t = x, Y_t = y, M_t = m \mid C_t), \tag{31}$$

for all possible realizations $(x, y, m)$ of $(X_t, Y_t, M_t)$.

We call $\Pi_t$ the common information state. Recall that $\Pi_t$ takes values in the set $\mathcal{B}_t$ defined in (27).
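To make the structure in (30) concrete, the sketch below carries out the terminal-stage minimization (28) by brute force for two controllers with finite observation and action sets. It is an illustration under simplifying assumptions, not the paper's algorithm: local memories are omitted, the realization `pi` is indexed as `pi[x, y1, y2]`, and the function name, array layouts and toy cost are all hypothetical.

```python
import itertools
import numpy as np

def best_prescriptions(pi, cost, n_u):
    """Terminal-stage minimization over prescription pairs (two controllers).

    pi   : shape (X, Y1, Y2), a realization of the common information state
           (local memories are left out of this sketch).
    cost : shape (X, U1, U2), cost[x, u1, u2] = l(x, u1, u2).
    n_u  : (|U1|, |U2|), sizes of the two action sets.
    Returns (value, gamma1, gamma2), where gamma_i[y] is the action
    prescribed to controller i when its local observation is y.
    """
    n_x, n_y1, n_y2 = pi.shape
    best = (np.inf, None, None)
    # A prescription is a tuple of actions, one entry per local observation.
    for g1 in itertools.product(range(n_u[0]), repeat=n_y1):
        for g2 in itertools.product(range(n_u[1]), repeat=n_y2):
            value = sum(
                pi[x, y1, y2] * cost[x, g1[y1], g2[y2]]
                for x in range(n_x)
                for y1 in range(n_y1)
                for y2 in range(n_y2)
            )
            if value < best[0]:
                best = (value, g1, g2)
    return best

# Toy instance: X is a fair bit, controller 1 observes X exactly,
# controller 2 observes nothing (a single dummy observation).
pi = np.zeros((2, 2, 1))
pi[0, 0, 0] = 0.5
pi[1, 1, 0] = 0.5
# Each controller pays 1 when its action differs from X.
cost = np.array([[[float(u1 != x) + float(u2 != x) for u2 in range(2)]
                  for u1 in range(2)] for x in range(2)])

value, gamma1, gamma2 = best_prescriptions(pi, cost, n_u=(2, 2))
```

The returned pair `(gamma1, gamma2)` plays the role of $(\hat{g}^1_T(\cdot, \pi), \hat{g}^2_T(\cdot, \pi))$ for this one realization $\pi$. The enumeration grows as $|\mathcal{U}^i|^{|\mathcal{Y}^i|}$ per controller, so it is only meant for very small examples.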

Consider a control strategy $\hat{g}^i$ for controller $i$ of the form specified in Theorem 2. The control law $\hat{g}^i_t$ at time $t$ is a function from the space $\mathcal{Y}^i_t \times \mathcal{M}^i_t \times \mathcal{B}_t$ to the space of decisions $\mathcal{U}^i_t$. Equivalently, the control law $\hat{g}^i_t$ can be represented as a collection of functions $\{\hat{g}^i_t(\cdot, \cdot, \pi)\}_{\pi \in \mathcal{B}_t}$, where each element of this collection is a function from $\mathcal{Y}^i_t \times \mathcal{M}^i_t$ to $\mathcal{U}^i_t$. An element $\hat{g}^i_t(\cdot, \cdot, \pi)$ of this collection specifies a control action for each possible realization of $Y^i_t$, $M^i_t$ and a fixed realization $\pi$ of $\Pi_t$. We call $\hat{g}^i_t(\cdot, \cdot, \pi)$ the partial control law of controller $i$ at time $t$ for the given realization $\pi$ of the common information state $\Pi_t$.

We now use Proposition 2 to describe a dynamic programming decomposition of the problem of finding optimal control strategies. This dynamic programming decomposition allows us to evaluate optimal partial control laws for each realization $\pi$ of the common information state in a backward inductive manner. Recall that $\mathcal{B}_t$ is the space of all possible realizations of $\Pi_t$ (see (27)) and $\mathcal{F}(\mathcal{Y}^i_t \times \mathcal{M}^i_t, \mathcal{U}^i_t)$ is the set of all functions from $\mathcal{Y}^i_t \times \mathcal{M}^i_t$ to $\mathcal{U}^i_t$ (see Section I-C).

Theorem 3 (Dynamic Programming Decomposition) Define the functions $V_t : \mathcal{B}_t \to \mathbb{R}$, for $t = 1, \dots, T$, as follows:

$$V_T(\pi) = \inf_{\{\gamma^i_T \in \mathcal{F}(\mathcal{Y}^i_T \times \mathcal{M}^i_T,\, \mathcal{U}^i_T),\ 1 \le i \le n\}} \mathbb{E}\{l(X_T, \gamma^1_T(Y^1_T, M^1_T), \dots, \gamma^n_T(Y^n_T, M^n_T)) \mid \Pi_T = \pi\}, \tag{32}$$

and for $1 \le t \le T-1$,

$$V_t(\pi) = \inf_{\{\gamma^i_t \in \mathcal{F}(\mathcal{Y}^i_t \times \mathcal{M}^i_t,\, \mathcal{U}^i_t),\ 1 \le i \le n\}} \mathbb{E}\{l(X_t, \gamma^1_t(Y^1_t, M^1_t), \dots, \gamma^n_t(Y^n_t, M^n_t)) + V_{t+1}(\eta_t(\pi, \gamma^1_t, \dots, \gamma^n_t, Z_t)) \mid \Pi_t = \pi\}, \tag{33}$$

where $\eta_t$ is a $\mathcal{B}_{t+1}$-valued function defined in (26) and Appendix A.

For $t = 1, \dots, T$ and for each $\pi \in \mathcal{B}_t$, an optimal partial control law for controller $i$ is the minimizing choice of $\gamma^i$ in the definition of $V_t(\pi)$. Let $\Psi_t(\pi)$ denote the arg inf of the right hand side of $V_t(\pi)$, and $\Psi^i_t$ denote its $i$-th component. Then, an optimal control strategy is given by:

$$\hat{g}^i_t(\cdot, \cdot, \pi) = \Psi^i_t(\pi). \tag{34}$$

A. Comparison with Person by Person and Designer Approaches

The common information based approach adopted above differs from the person-by-person approach and the designer's approach mentioned in the introduction. In particular, the structural

result of Theorem 2 cannot be found by the person-by-person approach. If we fix the strategies of all but the $i$-th controller to an arbitrary choice, then it is not necessarily optimal for controller $i$ to use a strategy of the form in Theorem 2. This is because if controller $j$'s strategy uses the entire common information $C_t$, then controller $i$, in general, would need to consider the entire common information to better predict controller $j$'s actions, and hence controller $i$'s optimal choice of action may also depend on the entire common information. The use of the common information based approach allowed us to prove that all controllers can jointly use strategies of the form in Theorem 2 without loss of optimality.

The dynamic programming decomposition of Theorem 3 is simpler than any dynamic programming decomposition obtained using the designer's approach. As described earlier, the designer's approach models the decentralized control problem as an open-loop centralized planning problem in which a designer at each stage chooses control laws $g^i_t$ that map $(Y^i_t, M^i_t, C_t)$ to $U^i_t$, $i = 1, \dots, n$. On the other hand, the common-information approach developed in this paper models the decentralized control problem as a closed-loop centralized planning problem in which a coordinator at each stage chooses the partial control laws $\gamma^i_t$ that map $(Y^i_t, M^i_t)$ to $U^i_t$, $i = 1, \dots, n$. The space of partial control laws is never larger than the space of full control laws, and it is strictly smaller whenever the common information is non-empty. Thus, the dynamic programming decomposition of Theorem 3 is simpler than that obtained by the designer's approach. This simplification is best illustrated by the example of Section IV-C1 where all controllers receive a common observation $Y^{com}_t$. For this example, we show that our information state (and hence our dynamic program) reduces to $\mathbb{P}(X_t \mid Y^{com}_{1:t})$, which is identical to the information state of centralized stochastic control. In contrast, the information state $\mathbb{P}(X_t, Y^{com}_{1:t})$ obtained by the designer's approach is much more complicated.

B. Special Cases: The Results

In Section II-B, we described several models of decentralized control problems that are special cases of the model described in Section II-A. In this section, we state the results of Theorems 2 and 3 for these models.

1) Delayed Sharing Information Structure:

Corollary 1 In the delayed sharing information structure of Section II-B1, there exist optimal

control strategies of the form

$$U^i_t = \hat{g}^i_t(Y^i_{t-s+1:t}, U^i_{t-s+1:t-1}, \Pi_t), \quad i = 1, 2, \dots, n, \tag{35}$$

where

$$\Pi_t := \mathbb{P}^{\hat{g}^{1:n}_{1:t-1}}(X_t, Y_{t-s+1:t}, U_{t-s+1:t-1} \mid C_t). \tag{36}$$

Moreover, optimal control strategies can be obtained by a dynamic program similar to that of Theorem 3.

The above result is analogous to the result in [36].

2) Delayed State Sharing Information Structure:

Corollary 2 In the delayed state sharing information structure of Section II-B2, there exist optimal control strategies of the form

$$U^i_t = \hat{g}^i_t(X^i_{t-s+1:t}, U^i_{t-s+1:t-1}, \Pi_t), \quad i = 1, 2, \dots, n, \tag{37}$$

where

$$\Pi_t := \mathbb{P}^{\hat{g}^{1:n}_{1:t-1}}(X_{t-s+1:t}, U_{t-s+1:t-1} \mid C_t). \tag{38}$$

Moreover, optimal control strategies can be obtained by a dynamic program similar to that of Theorem 3.

The above result is analogous to the result in [36].

3) Periodic Sharing Information Structure:

Corollary 3 In the periodic sharing information structure of Section II-B3, there exist optimal control strategies of the form

$$U^i_t = \hat{g}^i_t(Y^i_{ks+1:t}, U^i_{ks+1:t-1}, \Pi_t), \quad i = 1, 2, \dots, n, \quad ks < t \le (k+1)s, \tag{39}$$

where

$$\Pi_t := \mathbb{P}^{\hat{g}^{1:n}_{1:t-1}}(X_t, Y_{ks+1:t}, U_{ks+1:t-1} \mid C_t), \quad ks < t \le (k+1)s. \tag{40}$$

Moreover, optimal control strategies can be obtained by a dynamic program similar to that of Theorem 3.

The above result gives a finer dynamic programming decomposition than [37]. In [37], the dynamic programming decomposition is only carried out at the times of information sharing, $t = ks$, $k = 1, 2, \dots$; and at each step the partial control laws until the next sharing instant are chosen. In contrast, in the above dynamic program, the partial control laws of each step are chosen sequentially.

4) Control Sharing Information Structure:

Corollary 4 In the control sharing information structure of Section II-B4, there exist optimal control strategies of the form

$$U^i_t = \hat{g}^i_t(Y^i_{1:t}, \Pi_t), \quad i = 1, 2, \dots, n, \tag{41}$$

where

$$\Pi_t := \mathbb{P}^{\hat{g}^{1:n}_{1:t-1}}(X_t, Y_{1:t} \mid C_t). \tag{42}$$

Moreover, optimal control strategies can be obtained by a dynamic program similar to that of Theorem 3.

5) No Shared Memory with or without finite local memory:

Corollary 5 In the information structure of Section II-B5, there exist optimal control strategies of the form

$$U^i_t = \hat{g}^i_t(Y^i_t, M^i_t, \Pi_t) \tag{43}$$

where

$$\Pi_t = \mathbb{P}^{\hat{g}^{1:n}_{1:t-1}}(X_t, Y_t, M_t). \tag{44}$$

Moreover, optimal control strategies can be obtained by a dynamic program similar to that of Theorem 3.

Note that, since the common information is empty, the common information state $\Pi_t$ is now an unconditional probability. In particular, $\Pi_t$ is a constant random variable and takes a fixed value that depends only on the choice of past control laws. Therefore, we can define an appropriate control law $g^i_t$ such that $\hat{g}^i_t(Y^i_t, M^i_t, \Pi_t) = g^i_t(Y^i_t, M^i_t)$ with probability 1. Hence, the structural result of (43) may be simplified to

$$U^i_t = \hat{g}^i_t(Y^i_t, M^i_t, \Pi_t) = g^i_t(Y^i_t, M^i_t).$$

This result is redundant since all control laws are of the above form. Nonetheless, Corollary 5 gives a procedure for finding such control laws using the dynamic program of Theorem 3.

The above result is similar to the results in [42] for the case of one controller with finite memory and to those in [23] for the case of two controllers with finite memories.

IV. SIMPLIFICATIONS AND GENERALIZATIONS

A. Simplification of the Common Information State

Theorems 2 and 3 identify the conditional probability distribution on $(X_t, Y_t, M_t)$ given $C_t$ as the common information state for our problem. In the following lemma, we make the simple observation that in our model the conditional distribution on $(X_t, Y_t, M_t)$ given $C_t$ is completely determined by the conditional distribution on $(X_t, M_t)$ given $C_t$.

Lemma 2 For any choice of control laws $\hat{g}^{1:n}_{1:t-1}$, define the conditional distribution on $X_t, M_t$ given $C_t$ as

$$\Pi^{new}_t(x, m) := \mathbb{P}^{\hat{g}^{1:n}_{1:t-1}}(X_t = x, M_t = m \mid C_t),$$

for all possible realizations $(x, m)$ of $(X_t, M_t)$. Also define $\mathcal{B}^{new}_t := \Delta(\mathcal{X}_t \times \mathcal{M}^1_t \times \dots \times \mathcal{M}^n_t)$. Then,

$$\Pi^{new}_t(x, m) = \sum_y \Pi_t(x, y, m). \tag{45}$$

Therefore, $\Pi^{new}_t = \chi_t(\Pi_t)$, where each component of the $\mathcal{B}^{new}_t$-valued function $\chi_t$ is determined by the right hand side of (45). Also,

$$\Pi_t(x, y, m) = \Pi^{new}_t(x, m)\, \mathbb{P}(Y_t = y \mid X_t = x), \tag{46}$$

where the second term on the right hand side of (46) is determined by the fixed distribution of the observation noises. Therefore, $\Pi_t = \zeta_t(\Pi^{new}_t)$, where each component of the $\mathcal{B}_t$-valued function $\zeta_t$ is determined by the right hand side of (46).

Lemma 2 implies that the results of Theorems 2 and 3 can be written in terms of $\Pi^{new}_t$.

Theorem 4 (Alternative Common Information State) In Problem 1, there exist optimal control strategies of the form

$$U^i_t = \hat{g}^i_t(Y^i_t, M^i_t, \Pi^{new}_t), \quad i = 1, 2, \dots, n, \tag{47}$$

where

$$\Pi^{new}_t := \mathbb{P}^{\hat{g}^{1:n}_{1:t-1}}(X_t, M_t \mid C_t). \tag{48}$$

Further, define the functions $V^{new}_t : \mathcal{B}^{new}_t \to \mathbb{R}$, for $t = 1, \dots, T$, as follows:

$$V^{new}_T(\pi^{new}) = \inf_{\{\gamma^i_T \in \mathcal{F}(\mathcal{Y}^i_T \times \mathcal{M}^i_T,\, \mathcal{U}^i_T),\ 1 \le i \le n\}} \mathbb{E}\{l(X_T, \gamma^1_T(Y^1_T, M^1_T), \dots, \gamma^n_T(Y^n_T, M^n_T)) \mid \Pi_T = \zeta_T(\pi^{new})\}, \tag{49}$$

and for $1 \le t \le T-1$,

$$V^{new}_t(\pi^{new}) = \inf_{\{\gamma^i_t \in \mathcal{F}(\mathcal{Y}^i_t \times \mathcal{M}^i_t,\, \mathcal{U}^i_t),\ 1 \le i \le n\}} \mathbb{E}\{l(X_t, \gamma^1_t(Y^1_t, M^1_t), \dots, \gamma^n_t(Y^n_t, M^n_t)) + V^{new}_{t+1}(\chi_{t+1}(\eta_t(\Pi_t, \gamma^1_t, \dots, \gamma^n_t, Z_t))) \mid \Pi_t = \zeta_t(\pi^{new})\}, \tag{50}$$

where $\zeta_t$, $\chi_t$ are defined in Lemma 2, and $\eta_t$ is defined in (26) and Appendix A.

For $1 \le t \le T$ and for each $\pi^{new}$, an optimal partial control law for controller $i$ is the minimizing choice of $\gamma^i$ in the definition of $V^{new}_t(\pi^{new})$.

Proof: For any $\pi^{new} \in \mathcal{B}^{new}_t$ and any $\pi \in \mathcal{B}_t$, it is straightforward to establish using a backward induction argument that $V^{new}_t(\pi^{new}) = V_t(\zeta_t(\pi^{new}))$ and $V_t(\pi) = V^{new}_t(\chi_t(\pi))$, where $V_t(\cdot)$ is the value function from the dynamic program in Theorem 3. The optimality of the new dynamic program then follows from the optimality of the dynamic program in Theorem 3.

The result of Theorem 4 is conceptually the same as the results in Theorems 2 and 3. Theorem 4 implies that the corollaries of Section III-B can be restated in terms of new information states by simply removing $Y_t$ from the definition of the original information states. For example, the result of Corollary 1 for the delayed sharing information structure is also true when $\Pi_t$ is replaced by

$$\Pi^{new}_t := \mathbb{P}^{\hat{g}^{1:n}_{1:t-1}}(X_t, Y_{t-s+1:t-1}, U_{t-s+1:t-1} \mid C_t). \tag{51}$$

This result is simpler than that of [36, Theorem 2].

B. Generalization of the Model

The methodology described in Section III relies on the fact that the shared memory is common information among all controllers. Since the coordinator in the coordinated system knows only the common information, any coordination strategy can be mapped to an equivalent control strategy in the basic model (see Stage 4 of Section III). In some cases, in addition to the shared

memory, the current observation (or if the current observation is a vector, some components of it) may also be commonly available to all controllers. The general methodology of Section III can be easily modified to include such cases as well. Consider the model of Section II-A with the following modifications:

1) In addition to their current local observation, all controllers have a common observation at time $t$,

$$Y^{com}_t = h^{com}_t(X_t, V_t) \tag{52}$$

where $\{V_t, t = 1, \dots, T\}$ is a sequence of i.i.d. random variables with probability distribution $Q_V$ which is independent of all other primitive random variables.

2) The shared memory $C_t$ at time $t$ is a subset of $\{Y^{com}_{1:t-1}, Y_{1:t-1}, U_{1:t-1}\}$.

3) Each controller selects its action using a control law of the form

$$U^i_t = g^i_t(Y^i_t, M^i_t, C_t, Y^{com}_t). \tag{53}$$

4) After taking the control action at time $t$, controller $i$ sends a subset $Z^i_t$ of $\{M^i_t, Y^i_t, U^i_t, Y^{com}_t\}$ that necessarily includes $Y^{com}_t$. That is, $Y^{com}_t \in Z^i_t \subset \{M^i_t, Y^i_t, U^i_t, Y^{com}_t\}$. This implies that the history of common observations is necessarily a part of the shared memory, that is, $Y^{com}_{1:t-1} \subset C_t$.

The rest of the model is the same as in Section II-A. In particular, the local memory update satisfies (7), so the local memory and shared memory at time $t+1$ do not overlap. The instantaneous cost is given by $l(X_t, U_t)$ and the objective is to minimize an expected total cost given by (8).

The arguments of Section III are also valid for this model. The observation process in Lemma 1 is now defined as $R_{t+1} = \{Z_t, Y^{com}_{t+1}\}$. The analysis of Section III leads to structural results and dynamic programming decompositions analogous to Theorems 2 and 3 with $\Pi_t$ now defined as

$$\Pi_t := \mathbb{P}^{g^{1:n}_{1:t-1}}(X_t, Y_t, M_t \mid C_t, Y^{com}_t). \tag{54}$$

Using an argument similar to Lemma 2, we can show that the result of Theorem 4 is true for the above model with $\Pi^{new}_t$ defined as

$$\Pi^{new}_t := \mathbb{P}^{\hat{g}^{1:n}_{1:t-1}}(X_t, M_t \mid C_t, Y^{com}_t). \tag{55}$$
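The Lemma 2 argument invoked here rests on the two maps $\chi_t$ and $\zeta_t$ of (45) and (46). A minimal NumPy sketch of those maps for finite spaces, written for the basic model of Lemma 2, is given below. The array layouts `pi[x, y, m]`, `pi_new[x, m]` and `obs_kernel[x, y]`, and the toy numbers, are assumptions made for this illustration.

```python
import numpy as np

def chi(pi):
    """Map (45): marginalize y out of pi[x, y, m] to obtain pi_new[x, m]."""
    return pi.sum(axis=1)

def zeta(pi_new, obs_kernel):
    """Map (46): pi[x, y, m] = pi_new[x, m] * P(Y_t = y | X_t = x).

    obs_kernel : shape (X, Y), obs_kernel[x, y] = P(Y_t = y | X_t = x),
                 fixed by the observation-noise distribution.
    """
    # Broadcast pi_new over the y axis and the kernel over the m axis.
    return pi_new[:, None, :] * obs_kernel[:, :, None]

# Toy numbers: two states, two observations, two memory values.
pi_new = np.array([[0.1, 0.2], [0.3, 0.4]])
obs_kernel = np.array([[0.7, 0.3], [0.4, 0.6]])  # each row sums to 1

pi_full = zeta(pi_new, obs_kernel)  # shape (2, 2, 2)
recovered = chi(pi_full)            # equals pi_new up to rounding
```

Since each row of `obs_kernel` sums to one, `chi(zeta(pi_new, obs_kernel))` returns `pi_new`, which is the sense in which the smaller state $\Pi^{new}_t$ carries the same information as $\Pi_t$.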

C. Examples of the Generalized Model

1) Controllers with Identical Information: Consider the following special case of the above generalized model.

1) All controllers only make the common observation $Y^{com}_t$; controllers have no local observation or local memory.

2) The shared memory at time $t$ is $C_t = Y^{com}_{1:t-1}$. Thus, at time $t$, all controllers have identical information given as $\{C_t, Y^{com}_t\} = Y^{com}_{1:t}$.

3) After taking the action at time $t$, each controller sends $Z^i_t = Y^{com}_t$ to the shared memory.

Recall that the coordinator's prescriptions $\Gamma^i_t$ in Section III are chosen from the set of functions from $\mathcal{Y}^i_t \times \mathcal{M}^i_t$ to $\mathcal{U}^i_t$. Since, in this case, $\mathcal{Y}^i_t = \mathcal{M}^i_t = \emptyset$, we interpret the coordinator's prescriptions as prescribed actions. That is, $\Gamma^i_t \in \mathcal{U}^i_t$. With this interpretation, the common information state becomes

$$\Pi_t := \mathbb{P}^{g^{1:n}_{1:t-1}}(X_t \mid Y^{com}_{1:t}) \tag{56}$$

and the dynamic program of Theorem 3 becomes

$$V_T(\pi) = \inf_{\{u^i_T \in \mathcal{U}^i_T,\ 1 \le i \le n\}} \mathbb{E}\{l(X_T, u^1_T, \dots, u^n_T) \mid \Pi_T = \pi\}, \tag{57}$$

and for $1 \le t \le T-1$,

$$V_t(\pi) = \inf_{\{u^i_t \in \mathcal{U}^i_t,\ 1 \le i \le n\}} \mathbb{E}\{l(X_t, u^1_t, \dots, u^n_t) + V_{t+1}(\eta_t(\pi, u^1_t, \dots, u^n_t, Y^{com}_{t+1})) \mid \Pi_t = \pi\}. \tag{58}$$

Since all the controllers have identical information, the above results correspond to the centralized dynamic program of Theorem 1 with a single controller choosing all the actions.

2) Coupled subsystems with control sharing information structure: Consider the following special case of the above generalized model.

1) The state of the system at time $t$ is an $(n+1)$-dimensional vector $X_t = (X^1_t, X^2_t, \dots, X^n_t, X^*_t)$, where $X^i_t$, $i = 1, \dots, n$, corresponds to the local state of subsystem $i$, and $X^*_t$ is a global state of the system.

2) The state update function is such that the global state evolves according to

$$X^*_{t+1} = f^*_t(X^*_t, U_t, N^0_t),$$

while the local state of subsystem $i$ evolves according to

$$X^i_{t+1} = f^i_t(X^i_t, X^*_t, U_t, N^i_t),$$

where $\{N^0_t, t = 1, \dots, T\}, \dots, \{N^n_t, t = 1, \dots, T\}$ are mutually independent i.i.d. noise processes that are independent of the initial state $X_1 = (X^1_1, X^2_1, \dots, X^n_1, X^*_1)$.

3) At time $t$, the common observation of all controllers is given by $Y^{com}_t = X^*_t$.

4) At time $t$, the local observation of controller $i$ is given by $Y^i_t = X^i_t$, $i = 1, \dots, n$.

5) The shared memory at time $t$ is $C_t = \{X^*_{1:t-1}, U_{1:t-1}\}$. At each time $t$, after taking the action $U^i_t$, controller $i$ sends $Z^i_t = \{X^*_t, U^i_t\}$ to the shared memory.

The above special case corresponds to the model of coupled subsystems with control sharing considered in [39], where several applications of this model are also presented. It is shown in [39] that there is no loss of optimality in restricting attention to controllers with no local memory, i.e., $M^i_t = \emptyset$. With this additional restriction, the results of Theorems 1 and 2 apply for this model with $\Pi_t$ defined as

$$\Pi_t := \mathbb{P}^{g^{1:n}_{1:t-1}}(X^*_t, X^1_t, \dots, X^n_t \mid X^*_{1:t}, U_{1:t-1}).$$

Note that $\Pi_t$ can be evaluated from $X^*_t$ and $\mathbb{P}^{g^{1:n}_{1:t-1}}(X^1_t, \dots, X^n_t \mid X^*_{1:t}, U_{1:t-1})$. It is shown in [39] that $X^1_t, X^2_t, \dots, X^n_t$ are conditionally independent given $X^*_{1:t}, U_{1:t-1}$; hence the joint distribution $\mathbb{P}^{g^{1:n}_{1:t-1}}(X^1_t, \dots, X^n_t \mid X^*_{1:t}, U_{1:t-1})$ is a product of its marginal distributions.

3) Broadcast information structure: Consider the following special case of the above generalized model.

1) The state of the system at time $t$ is an $n$-dimensional vector $X_t = (X^1_t, X^2_t, \dots, X^n_t)$, where $X^i_t$, $i = 1, \dots, n$, corresponds to the local state of subsystem $i$. The first component, $i = 1$, is special and called the central node. Other components, $i = 2, \dots, n$, are called peripheral nodes.

2) The state update function is such that the state of the central node evolves according to

$$X^1_{t+1} = f^1_t(X^1_t, U^1_t, N^1_t)$$

while the state of the peripheral nodes evolves according to

$$X^i_{t+1} = f^i_t(X^i_t, X^1_t, U^i_t, U^1_t, N^i_t)$$

where $\{N^i_t, i = 1, 2, \dots, n;\ t = 1, \dots\}$ are noise processes that are independent across time and independent of each other.

3) At time $t$, the common observation of all controllers is given by $Y^{com}_t = X^1_t$.


More information

Appendix to Online l 1 -Dictionary Learning with Application to Novel Document Detection

Appendix to Online l 1 -Dictionary Learning with Application to Novel Document Detection Appendix o Online l -Dicionary Learning wih Applicaion o Novel Documen Deecion Shiva Prasad Kasiviswanahan Huahua Wang Arindam Banerjee Prem Melville A Background abou ADMM In his secion, we give a brief

More information

Lecture Notes 2. The Hilbert Space Approach to Time Series

Lecture Notes 2. The Hilbert Space Approach to Time Series Time Series Seven N. Durlauf Universiy of Wisconsin. Basic ideas Lecure Noes. The Hilber Space Approach o Time Series The Hilber space framework provides a very powerful language for discussing he relaionship

More information

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality

Matrix Versions of Some Refinements of the Arithmetic-Geometric Mean Inequality Marix Versions of Some Refinemens of he Arihmeic-Geomeric Mean Inequaliy Bao Qi Feng and Andrew Tonge Absrac. We esablish marix versions of refinemens due o Alzer ], Carwrigh and Field 4], and Mercer 5]

More information

Matlab and Python programming: how to get started

Matlab and Python programming: how to get started Malab and Pyhon programming: how o ge sared Equipping readers he skills o wrie programs o explore complex sysems and discover ineresing paerns from big daa is one of he main goals of his book. In his chaper,

More information

Georey E. Hinton. University oftoronto. Technical Report CRG-TR February 22, Abstract

Georey E. Hinton. University oftoronto.   Technical Report CRG-TR February 22, Abstract Parameer Esimaion for Linear Dynamical Sysems Zoubin Ghahramani Georey E. Hinon Deparmen of Compuer Science Universiy oftorono 6 King's College Road Torono, Canada M5S A4 Email: zoubin@cs.orono.edu Technical

More information

Notes on Kalman Filtering

Notes on Kalman Filtering Noes on Kalman Filering Brian Borchers and Rick Aser November 7, Inroducion Daa Assimilaion is he problem of merging model predicions wih acual measuremens of a sysem o produce an opimal esimae of he curren

More information

Online Appendix to Solution Methods for Models with Rare Disasters

Online Appendix to Solution Methods for Models with Rare Disasters Online Appendix o Soluion Mehods for Models wih Rare Disasers Jesús Fernández-Villaverde and Oren Levinal In his Online Appendix, we presen he Euler condiions of he model, we develop he pricing Calvo block,

More information

Distributed Fictitious Play for Optimal Behavior of Multi-Agent Systems with Incomplete Information

Distributed Fictitious Play for Optimal Behavior of Multi-Agent Systems with Incomplete Information Disribued Ficiious Play for Opimal Behavior of Muli-Agen Sysems wih Incomplee Informaion Ceyhun Eksin and Alejandro Ribeiro arxiv:602.02066v [cs.g] 5 Feb 206 Absrac A muli-agen sysem operaes in an uncerain

More information

LECTURE 1: GENERALIZED RAY KNIGHT THEOREM FOR FINITE MARKOV CHAINS

LECTURE 1: GENERALIZED RAY KNIGHT THEOREM FOR FINITE MARKOV CHAINS LECTURE : GENERALIZED RAY KNIGHT THEOREM FOR FINITE MARKOV CHAINS We will work wih a coninuous ime reversible Markov chain X on a finie conneced sae space, wih generaor Lf(x = y q x,yf(y. (Recall ha q

More information

The expectation value of the field operator.

The expectation value of the field operator. The expecaion value of he field operaor. Dan Solomon Universiy of Illinois Chicago, IL dsolom@uic.edu June, 04 Absrac. Much of he mahemaical developmen of quanum field heory has been in suppor of deermining

More information

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model Modal idenificaion of srucures from roving inpu daa by means of maximum likelihood esimaion of he sae space model J. Cara, J. Juan, E. Alarcón Absrac The usual way o perform a forced vibraion es is o fix

More information

Planning in POMDPs. Dominik Schoenberger Abstract

Planning in POMDPs. Dominik Schoenberger Abstract Planning in POMDPs Dominik Schoenberger d.schoenberger@sud.u-darmsad.de Absrac This documen briefly explains wha a Parially Observable Markov Decision Process is. Furhermore i inroduces he differen approaches

More information

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter Sae-Space Models Iniializaion, Esimaion and Smoohing of he Kalman Filer Iniializaion of he Kalman Filer The Kalman filer shows how o updae pas predicors and he corresponding predicion error variances when

More information

Energy Storage Benchmark Problems

Energy Storage Benchmark Problems Energy Sorage Benchmark Problems Daniel F. Salas 1,3, Warren B. Powell 2,3 1 Deparmen of Chemical & Biological Engineering 2 Deparmen of Operaions Research & Financial Engineering 3 Princeon Laboraory

More information

Christos Papadimitriou & Luca Trevisan November 22, 2016

Christos Papadimitriou & Luca Trevisan November 22, 2016 U.C. Bereley CS170: Algorihms Handou LN-11-22 Chrisos Papadimiriou & Luca Trevisan November 22, 2016 Sreaming algorihms In his lecure and he nex one we sudy memory-efficien algorihms ha process a sream

More information

1. An introduction to dynamic optimization -- Optimal Control and Dynamic Programming AGEC

1. An introduction to dynamic optimization -- Optimal Control and Dynamic Programming AGEC This documen was generaed a :45 PM 8/8/04 Copyrigh 04 Richard T. Woodward. An inroducion o dynamic opimizaion -- Opimal Conrol and Dynamic Programming AGEC 637-04 I. Overview of opimizaion Opimizaion is

More information

Notes for Lecture 17-18

Notes for Lecture 17-18 U.C. Berkeley CS278: Compuaional Complexiy Handou N7-8 Professor Luca Trevisan April 3-8, 2008 Noes for Lecure 7-8 In hese wo lecures we prove he firs half of he PCP Theorem, he Amplificaion Lemma, up

More information

Dynamic Oligopoly Games with Private Markovian Dynamics

Dynamic Oligopoly Games with Private Markovian Dynamics Dynamic Oligopoly Games wih Privae Markovian Dynamics Yi Ouyang, Hamidreza Tavafoghi and Demoshenis Tenekezis Absrac We analyze a dynamic oligopoly model wih sraegic sellers and buyers/consumers over a

More information

Econ107 Applied Econometrics Topic 7: Multicollinearity (Studenmund, Chapter 8)

Econ107 Applied Econometrics Topic 7: Multicollinearity (Studenmund, Chapter 8) I. Definiions and Problems A. Perfec Mulicollineariy Econ7 Applied Economerics Topic 7: Mulicollineariy (Sudenmund, Chaper 8) Definiion: Perfec mulicollineariy exiss in a following K-variable regression

More information

INTRODUCTION TO MACHINE LEARNING 3RD EDITION

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN The MIT Press, 2014 Lecure Slides for INTRODUCTION TO MACHINE LEARNING 3RD EDITION alpaydin@boun.edu.r hp://www.cmpe.boun.edu.r/~ehem/i2ml3e CHAPTER 2: SUPERVISED LEARNING Learning a Class

More information

Overview. COMP14112: Artificial Intelligence Fundamentals. Lecture 0 Very Brief Overview. Structure of this course

Overview. COMP14112: Artificial Intelligence Fundamentals. Lecture 0 Very Brief Overview. Structure of this course OMP: Arificial Inelligence Fundamenals Lecure 0 Very Brief Overview Lecurer: Email: Xiao-Jun Zeng x.zeng@mancheser.ac.uk Overview This course will focus mainly on probabilisic mehods in AI We shall presen

More information

11!Hí MATHEMATICS : ERDŐS AND ULAM PROC. N. A. S. of decomposiion, properly speaking) conradics he possibiliy of defining a counably addiive real-valu

11!Hí MATHEMATICS : ERDŐS AND ULAM PROC. N. A. S. of decomposiion, properly speaking) conradics he possibiliy of defining a counably addiive real-valu ON EQUATIONS WITH SETS AS UNKNOWNS BY PAUL ERDŐS AND S. ULAM DEPARTMENT OF MATHEMATICS, UNIVERSITY OF COLORADO, BOULDER Communicaed May 27, 1968 We shall presen here a number of resuls in se heory concerning

More information

On Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature

On Measuring Pro-Poor Growth. 1. On Various Ways of Measuring Pro-Poor Growth: A Short Review of the Literature On Measuring Pro-Poor Growh 1. On Various Ways of Measuring Pro-Poor Growh: A Shor eview of he Lieraure During he pas en years or so here have been various suggesions concerning he way one should check

More information

Linear Response Theory: The connection between QFT and experiments

Linear Response Theory: The connection between QFT and experiments Phys540.nb 39 3 Linear Response Theory: The connecion beween QFT and experimens 3.1. Basic conceps and ideas Q: How do we measure he conduciviy of a meal? A: we firs inroduce a weak elecric field E, and

More information

Technical Report Doc ID: TR March-2013 (Last revision: 23-February-2016) On formulating quadratic functions in optimization models.

Technical Report Doc ID: TR March-2013 (Last revision: 23-February-2016) On formulating quadratic functions in optimization models. Technical Repor Doc ID: TR--203 06-March-203 (Las revision: 23-Februar-206) On formulaing quadraic funcions in opimizaion models. Auhor: Erling D. Andersen Convex quadraic consrains quie frequenl appear

More information

Cash Flow Valuation Mode Lin Discrete Time

Cash Flow Valuation Mode Lin Discrete Time IOSR Journal of Mahemaics (IOSR-JM) e-issn: 2278-5728,p-ISSN: 2319-765X, 6, Issue 6 (May. - Jun. 2013), PP 35-41 Cash Flow Valuaion Mode Lin Discree Time Olayiwola. M. A. and Oni, N. O. Deparmen of Mahemaics

More information

GMM - Generalized Method of Moments

GMM - Generalized Method of Moments GMM - Generalized Mehod of Momens Conens GMM esimaion, shor inroducion 2 GMM inuiion: Maching momens 2 3 General overview of GMM esimaion. 3 3. Weighing marix...........................................

More information

GENERALIZATION OF THE FORMULA OF FAA DI BRUNO FOR A COMPOSITE FUNCTION WITH A VECTOR ARGUMENT

GENERALIZATION OF THE FORMULA OF FAA DI BRUNO FOR A COMPOSITE FUNCTION WITH A VECTOR ARGUMENT Inerna J Mah & Mah Sci Vol 4, No 7 000) 48 49 S0670000970 Hindawi Publishing Corp GENERALIZATION OF THE FORMULA OF FAA DI BRUNO FOR A COMPOSITE FUNCTION WITH A VECTOR ARGUMENT RUMEN L MISHKOV Received

More information

Object tracking: Using HMMs to estimate the geographical location of fish

Object tracking: Using HMMs to estimate the geographical location of fish Objec racking: Using HMMs o esimae he geographical locaion of fish 02433 - Hidden Markov Models Marin Wæver Pedersen, Henrik Madsen Course week 13 MWP, compiled June 8, 2011 Objecive: Locae fish from agging

More information

Final Spring 2007

Final Spring 2007 .615 Final Spring 7 Overview The purpose of he final exam is o calculae he MHD β limi in a high-bea oroidal okamak agains he dangerous n = 1 exernal ballooning-kink mode. Effecively, his corresponds o

More information

Class Meeting # 10: Introduction to the Wave Equation

Class Meeting # 10: Introduction to the Wave Equation MATH 8.5 COURSE NOTES - CLASS MEETING # 0 8.5 Inroducion o PDEs, Fall 0 Professor: Jared Speck Class Meeing # 0: Inroducion o he Wave Equaion. Wha is he wave equaion? The sandard wave equaion for a funcion

More information

RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY

RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY ECO 504 Spring 2006 Chris Sims RANDOM LAGRANGE MULTIPLIERS AND TRANSVERSALITY 1. INTRODUCTION Lagrange muliplier mehods are sandard fare in elemenary calculus courses, and hey play a cenral role in economic

More information

Dynamic Oligopoly Games with Private Markovian Dynamics

Dynamic Oligopoly Games with Private Markovian Dynamics Dynamic Oligopoly Games wih Privae Markovian Dynamics Yi Ouyang, Hamidreza Tavafoghi and Demoshenis Tenekezis Absrac We analyze a dynamic oligopoly model wih sraegic sellers and buyers/consumers over a

More information

Chapter 2. Models, Censoring, and Likelihood for Failure-Time Data

Chapter 2. Models, Censoring, and Likelihood for Failure-Time Data Chaper 2 Models, Censoring, and Likelihood for Failure-Time Daa William Q. Meeker and Luis A. Escobar Iowa Sae Universiy and Louisiana Sae Universiy Copyrigh 1998-2008 W. Q. Meeker and L. A. Escobar. Based

More information

Hamilton- J acobi Equation: Weak S olution We continue the study of the Hamilton-Jacobi equation:

Hamilton- J acobi Equation: Weak S olution We continue the study of the Hamilton-Jacobi equation: M ah 5 7 Fall 9 L ecure O c. 4, 9 ) Hamilon- J acobi Equaion: Weak S oluion We coninue he sudy of he Hamilon-Jacobi equaion: We have shown ha u + H D u) = R n, ) ; u = g R n { = }. ). In general we canno

More information

Tom Heskes and Onno Zoeter. Presented by Mark Buller

Tom Heskes and Onno Zoeter. Presented by Mark Buller Tom Heskes and Onno Zoeer Presened by Mark Buller Dynamic Bayesian Neworks Direced graphical models of sochasic processes Represen hidden and observed variables wih differen dependencies Generalize Hidden

More information

Robotics I. April 11, The kinematics of a 3R spatial robot is specified by the Denavit-Hartenberg parameters in Tab. 1.

Robotics I. April 11, The kinematics of a 3R spatial robot is specified by the Denavit-Hartenberg parameters in Tab. 1. Roboics I April 11, 017 Exercise 1 he kinemaics of a 3R spaial robo is specified by he Denavi-Harenberg parameers in ab 1 i α i d i a i θ i 1 π/ L 1 0 1 0 0 L 3 0 0 L 3 3 able 1: able of DH parameers of

More information

Essential Microeconomics : OPTIMAL CONTROL 1. Consider the following class of optimization problems

Essential Microeconomics : OPTIMAL CONTROL 1. Consider the following class of optimization problems Essenial Microeconomics -- 6.5: OPIMAL CONROL Consider he following class of opimizaion problems Max{ U( k, x) + U+ ( k+ ) k+ k F( k, x)}. { x, k+ } = In he language of conrol heory, he vecor k is he vecor

More information

) were both constant and we brought them from under the integral.

) were both constant and we brought them from under the integral. YIELD-PER-RECRUIT (coninued The yield-per-recrui model applies o a cohor, bu we saw in he Age Disribuions lecure ha he properies of a cohor do no apply in general o a collecion of cohors, which is wha

More information

Optimality Conditions for Unconstrained Problems

Optimality Conditions for Unconstrained Problems 62 CHAPTER 6 Opimaliy Condiions for Unconsrained Problems 1 Unconsrained Opimizaion 11 Exisence Consider he problem of minimizing he funcion f : R n R where f is coninuous on all of R n : P min f(x) x

More information

Inventory Control of Perishable Items in a Two-Echelon Supply Chain

Inventory Control of Perishable Items in a Two-Echelon Supply Chain Journal of Indusrial Engineering, Universiy of ehran, Special Issue,, PP. 69-77 69 Invenory Conrol of Perishable Iems in a wo-echelon Supply Chain Fariborz Jolai *, Elmira Gheisariha and Farnaz Nojavan

More information

This document was generated at 1:04 PM, 09/10/13 Copyright 2013 Richard T. Woodward. 4. End points and transversality conditions AGEC

This document was generated at 1:04 PM, 09/10/13 Copyright 2013 Richard T. Woodward. 4. End points and transversality conditions AGEC his documen was generaed a 1:4 PM, 9/1/13 Copyrigh 213 Richard. Woodward 4. End poins and ransversaliy condiions AGEC 637-213 F z d Recall from Lecure 3 ha a ypical opimal conrol problem is o maimize (,,

More information

On-line Adaptive Optimal Timing Control of Switched Systems

On-line Adaptive Optimal Timing Control of Switched Systems On-line Adapive Opimal Timing Conrol of Swiched Sysems X.C. Ding, Y. Wardi and M. Egersed Absrac In his paper we consider he problem of opimizing over he swiching imes for a muli-modal dynamic sysem when

More information

Lecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still.

Lecture 2-1 Kinematics in One Dimension Displacement, Velocity and Acceleration Everything in the world is moving. Nothing stays still. Lecure - Kinemaics in One Dimension Displacemen, Velociy and Acceleraion Everyhing in he world is moving. Nohing says sill. Moion occurs a all scales of he universe, saring from he moion of elecrons in

More information

Comparing Means: t-tests for One Sample & Two Related Samples

Comparing Means: t-tests for One Sample & Two Related Samples Comparing Means: -Tess for One Sample & Two Relaed Samples Using he z-tes: Assumpions -Tess for One Sample & Two Relaed Samples The z-es (of a sample mean agains a populaion mean) is based on he assumpion

More information

Chapter 6. Systems of First Order Linear Differential Equations

Chapter 6. Systems of First Order Linear Differential Equations Chaper 6 Sysems of Firs Order Linear Differenial Equaions We will only discuss firs order sysems However higher order sysems may be made ino firs order sysems by a rick shown below We will have a sligh

More information

Analyze patterns and relationships. 3. Generate two numerical patterns using AC

Analyze patterns and relationships. 3. Generate two numerical patterns using AC envision ah 2.0 5h Grade ah Curriculum Quarer 1 Quarer 2 Quarer 3 Quarer 4 andards: =ajor =upporing =Addiional Firs 30 Day 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 andards: Operaions and Algebraic Thinking

More information

Games Against Nature

Games Against Nature Advanced Course in Machine Learning Spring 2010 Games Agains Naure Handous are joinly prepared by Shie Mannor and Shai Shalev-Shwarz In he previous lecures we alked abou expers in differen seups and analyzed

More information

8. Basic RL and RC Circuits

8. Basic RL and RC Circuits 8. Basic L and C Circuis This chaper deals wih he soluions of he responses of L and C circuis The analysis of C and L circuis leads o a linear differenial equaion This chaper covers he following opics

More information

EKF SLAM vs. FastSLAM A Comparison

EKF SLAM vs. FastSLAM A Comparison vs. A Comparison Michael Calonder, Compuer Vision Lab Swiss Federal Insiue of Technology, Lausanne EPFL) michael.calonder@epfl.ch The wo algorihms are described wih a planar robo applicaion in mind. Generalizaion

More information

Optimal Decentralized State-Feedback Control with Sparsity and Delays

Optimal Decentralized State-Feedback Control with Sparsity and Delays Opimal Decenralized Sae-Feedback Conrol wih Sparsiy and Delays Andrew Lamperski Lauren Lessard Submied o Auomaica Absrac This work presens he soluion o a class of decenralized linear quadraic sae-feedback

More information

A Primal-Dual Type Algorithm with the O(1/t) Convergence Rate for Large Scale Constrained Convex Programs

A Primal-Dual Type Algorithm with the O(1/t) Convergence Rate for Large Scale Constrained Convex Programs PROC. IEEE CONFERENCE ON DECISION AND CONTROL, 06 A Primal-Dual Type Algorihm wih he O(/) Convergence Rae for Large Scale Consrained Convex Programs Hao Yu and Michael J. Neely Absrac This paper considers

More information

Sensors, Signals and Noise

Sensors, Signals and Noise Sensors, Signals and Noise COURSE OUTLINE Inroducion Signals and Noise: 1) Descripion Filering Sensors and associaed elecronics rv 2017/02/08 1 Noise Descripion Noise Waveforms and Samples Saisics of Noise

More information

CENTRALIZED VERSUS DECENTRALIZED PRODUCTION PLANNING IN SUPPLY CHAINS

CENTRALIZED VERSUS DECENTRALIZED PRODUCTION PLANNING IN SUPPLY CHAINS CENRALIZED VERSUS DECENRALIZED PRODUCION PLANNING IN SUPPLY CHAINS Georges SAHARIDIS* a, Yves DALLERY* a, Fikri KARAESMEN* b * a Ecole Cenrale Paris Deparmen of Indusial Engineering (LGI), +3343388, saharidis,dallery@lgi.ecp.fr

More information

di Bernardo, M. (1995). A purely adaptive controller to synchronize and control chaotic systems.

di Bernardo, M. (1995). A purely adaptive controller to synchronize and control chaotic systems. di ernardo, M. (995). A purely adapive conroller o synchronize and conrol chaoic sysems. hps://doi.org/.6/375-96(96)8-x Early version, also known as pre-prin Link o published version (if available):.6/375-96(96)8-x

More information

Block Diagram of a DCS in 411

Block Diagram of a DCS in 411 Informaion source Forma A/D From oher sources Pulse modu. Muliplex Bandpass modu. X M h: channel impulse response m i g i s i Digial inpu Digial oupu iming and synchronizaion Digial baseband/ bandpass

More information

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t...

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t... Mah 228- Fri Mar 24 5.6 Marix exponenials and linear sysems: The analogy beween firs order sysems of linear differenial equaions (Chaper 5) and scalar linear differenial equaions (Chaper ) is much sronger

More information