Optimal Decentralized State-Feedback Control with Sparsity and Delays



Andrew Lamperski, Laurent Lessard

Submitted to Automatica

Abstract

This work presents the solution to a class of decentralized linear quadratic state-feedback control problems, in which the plant and controller must satisfy the same combination of delay and sparsity constraints. Using a novel decomposition of the noise history, the control problem is split into independent subproblems that are solved using dynamic programming. The approach presented herein both unifies and generalizes many existing results.

I Introduction

While optimal decentralized controller synthesis is difficult in general [25, 28], much progress has been made toward identifying tractable subclasses of problems. Two closely related conditions, partial nestedness and quadratic invariance, guarantee, respectively, that the optimal solution for LQG control problems is linear [3], and that optimal synthesis can be cast as convex optimization [6, 9]. These results alone do not necessarily guarantee that the optimal controller can be efficiently computed, since the associated convex programs are large. Indeed, the problem of [3] reduces to a system of linear equations that grows with the time horizon. In [6], a sequence of convex programs of increasing size converges to the optimal solution, and [9] uses vectorization to reduce decentralized problems to much larger centralized ones.

More efficient computational tools have been developed for linear quadratic problems. In particular, linear matrix inequalities have been used to solve the state-feedback [7] and output-feedback [, 8] cases. Output-feedback problems with delays can also be solved using a combination of spectral factorization and quadratic programming [7]. A general drawback of purely computational approaches is that they give little insight into the structure of the optimal controller. When finding the best linear controller, for example, we must often restrict our search to controllers of a fixed dimension, and there is no intuition for the physical meaning of the controller states.
Explicit solutions, which provide efficient computation in addition to a physical interpretation for the states of the controller, have been found for the specific constraint classes of sparsity and delays.

In the delay case, it is assumed all controllers eventually measure the global state, but not necessarily simultaneously. Problems with a one-timestep delay between controllers were solved by extending classical dynamic programming arguments [4, 20, 29]. In the linear quadratic setting, the state-feedback problem with delays characterized by a graph is solved in [6]. The solution is similar in spirit and complexity to computing the optimal centralized LQR controller, but the optimal policy turns out to be dynamic rather than static. Controller states have interpretations as delayed estimates of the global state.

In the sparsity case, all measurements are transmitted instantly, but some controllers never gain access to certain measurements. This amounts to a delayed system where each delay is either zero or infinite. The first explicit solution solved the two-player case using a spectral factorization approach [24]. The results were later extended to a general class of quadratically invariant, also called poset-causal, sparsity patterns [2, 22]. A dynamic programming argument for the two-player problem was also given in [23]. Again, the controller states can be interpreted as estimates of the global states conditioned on the particular subsets of the available information.

This paper unifies the treatment of sparsity and delay constraints by considering an information flow characterized by a directed graph. Each edge may be labeled with either a 0 for instantaneous information transfer, or with a 1 for one-timestep delayed transfer. We use a generalization of the dynamic programming approach of [6] that does not require the graph to be strongly connected. Therefore, this work gives a general method for solving both the delay and sparsity problems described in the previous two paragraphs. In addition, our framework can treat problems that contain a mixture of sparsity and delay constraints.
A preliminary version of this work appeared in the conference paper [8]. The present work differs substantially from [8]. Specifically, the present work includes complete proofs of all results, new illustrative examples, and extensive discussion in Sections IV and V. We also have a new result, Theorem 3, which gives a distributed message-passing implementation of the optimal controller.

In the remainder of this section, we explain the notations and other conventions used in the paper. In Section II, we state the main problem, and we present our solution in Section III. We discuss how our work unifies existing results in Section IV and we discuss limitations

and possible extensions in Section V. We provide two numerical examples in Section VI. The proof of our main result appears in Section VII. Finally, we conclude in Section VIII.

I-A Notation

Lower-case letters are random vectors unless otherwise indicated. Subscripts are time indices and superscripts are subsystem labels. For example, the state of subsystem 2 at time $t$ is $x_t^2$. The global state of all subsystems is indicated by omitting the subsystem, and time sequences are denoted using the colon operator. For example,

$$x_t = \begin{bmatrix} x_t^1 \\ \vdots \\ x_t^N \end{bmatrix} \quad\text{and}\quad x_{0:t} = \begin{bmatrix} x_0 \\ \vdots \\ x_t \end{bmatrix}.$$

For matrices, subscripts denote subblocks, while superscripts are used to select sets of subblocks. For example, a block $2 \times 3$ matrix $M$ is written as

$$M = \begin{bmatrix} M_{11} & M_{12} & M_{13} \\ M_{21} & M_{22} & M_{23} \end{bmatrix}.$$

To select row $r = \{2\}$ and columns $s = \{1, 3\}$, we write $M^{rs} = \begin{bmatrix} M_{21} & M_{23} \end{bmatrix}$. The symbol $I$ denotes a block-identity matrix whose dimensions are to be inferred by context. In the example above, we could have written $M^{rs} = I^{r,\{1,2\}} M I^{\{1,2,3\},s}$.

If $Y = \{y^1, \dots, y^M\}$ is a set of random vectors (possibly of different sizes), we say that $z \in \operatorname{lin} Y$ if there are appropriately sized real matrices $C^1, \dots, C^M$ such that $z = \sum_{i=1}^{M} C^i y^i$.

In general, all matrices in this paper are time-varying but we will often omit time indices if they are clear from context. We use a new notation to indicate a family of equations. When we write $x_+ \overset{t}{=} Ax + Bu + w$, we mean that $x_{t+1} = A_t x_t + B_t u_t + w_t$ for $t = 0, \dots, T-1$. We will use the same time horizon $T$ throughout this paper, so there is no ambiguity. Note that the subscript $+$ means that the time index of the corresponding symbol is incremented by 1. We use a similar notation for other binary relations such as inequalities, by writing a $t$ over the symbol. Finally, a similar notation is used to modify summations. For example, we write

$$\overset{t}{\sum}\; x^{\mathsf T} Q x \quad\text{instead of writing}\quad \sum_{t=0}^{T-1} x_t^{\mathsf T} Q_t x_t.$$

II Problem statement

We begin with some basic definitions. A network graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$ is a directed graph where each edge is labeled with a 0 if the associated link is delay-free, or a 1 if it has a one-timestep delay. The vertices are $\mathcal{V} = \{1, \dots, N\}$.
If there is an edge from $j$ to $i$, we write $(j, i) \in \mathcal{E}$, or simply $j \to i$. When delays are pertinent, they are denoted as $j \overset{0}{\to} i$ or $j \overset{1}{\to} i$.

Associated with the network graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$ is the delay matrix $D$. Each entry $D_{ij}$ is the sum of the delays along the directed path from $j$ to $i$ with the shortest aggregate delay. We assume $D_{ii} = 0$ for all $i$, and if no directed path exists, we set $D_{ij} = \infty$. Delays are assumed to be fixed for all time. Directed cycles are permitted in the network graph, but we assume there are no directed cycles with a total delay of zero. In our framework, all nodes belonging to such a delay-free cycle can be collapsed into a single node and treated as such without any loss of generality. See Fig. 1 for an example of a network graph.

We now state the general class of problems that can be solved using the methods developed in this paper.

Problem 1. Let $\mathcal{G}(\mathcal{V}, \mathcal{E})$ be a network graph with a time-invariant delay matrix $D$. Suppose the following time-varying state-space equations are given:

$$x^i_+ \overset{t}{=} \sum_{j \in \mathcal{V},\ D_{ij} \le 1} \left( A_{ij} x^j + B_{ij} u^j \right) + w^i \quad\text{for all } i \in \mathcal{V}. \tag{1}$$

Stacking the various vectors and matrices, we obtain the more compact representation

$$x_+ \overset{t}{=} Ax + Bu + w. \tag{2}$$

All random disturbances are assumed to be jointly Gaussian and independent from one another. Specifically, the random vectors $\{x^i_0, w^i_0, \dots, w^i_{T-1}\}_{i \in \mathcal{V}}$ are mutually independent. Their means and covariances are given by

$$x^i_0 \sim \mathcal{N}(\mu^i_0, \Sigma^i_0) \quad\text{and}\quad w^i_t \sim \mathcal{N}(0, W^i_t) \quad\text{for all } t \ge 0. \tag{3}$$

The inputs $u_{0:T-1}$ are controlled using state feedback subject to an information constraint. The information set for controller $i$ at time $t$ is as follows.

$$\mathcal{I}^i_t = \left\{ x^j_k : j \in \mathcal{V},\ 0 \le k \le t - D_{ij} \right\}. \tag{4}$$

In other words, $\mathcal{I}^i_t$ is the set of states belonging to nodes that have had sufficient time to reach node $i$ by time $t$. Each decision measures its corresponding information set

$$u^i_t = \gamma^i_t(\mathcal{I}^i_t). \tag{5}$$

The goal is to choose the set of policies $\gamma = \{\gamma^i_{0:T-1}\}_{i \in \mathcal{V}}$ that minimize the expected quadratic cost

$$\min_{\gamma}\ \mathbb{E}^{\gamma} \left( \sum_{t=0}^{T-1} \begin{bmatrix} x_t \\ u_t \end{bmatrix}^{\mathsf T} \begin{bmatrix} Q_t & S_t \\ S_t^{\mathsf T} & R_t \end{bmatrix} \begin{bmatrix} x_t \\ u_t \end{bmatrix} + x_T^{\mathsf T} Q_f x_T \right), \tag{6}$$

where the expectation is taken with respect to the joint probability measure on $(x_{0:T}, u_{0:T-1})$ induced by the choice of $\gamma$.
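We make the standard assumptions that, for all $t$,

$$\begin{bmatrix} Q_t & S_t \\ S_t^{\mathsf T} & R_t \end{bmatrix} \succeq 0, \qquad R_t \succ 0, \qquad Q_f \succeq 0. \tag{7}$$

In Problem 1, we assume that all decision-makers know the system parameters $A$, $B$, $Q$, $R$, $S$, $Q_f$, as well as the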

topology of the underlying network graph $\mathcal{G}(\mathcal{V}, \mathcal{E})$. The problem is cooperative in nature; we are to jointly design the set of policies $\gamma$ to optimize the cost (6).

As mentioned in Section I-A, the parameters of the system in Problem 1 can be time-varying. In fact, their sizes can also vary with time. For example, the input $u_t$ may change dimensions at every timestep, and the matrices $B$, $R$, $S$ would change size accordingly.

We now illustrate how the general problem statement given by (1)-(7) specializes in a simple three-node case.

Example 1. Consider the network graph of Fig. 1.

Figure 1: Network graph for Example 1. Each node represents a subsystem, and the edge labels indicate the propagation delay from one subsystem to another.

The delay matrix for Example 1 is given by

$$D = \begin{bmatrix} 0 & 1 & \infty \\ 1 & 0 & \infty \\ 1 & 0 & 0 \end{bmatrix}. \tag{8}$$

While the delay matrix for Example 1 only contains 0's, 1's, and $\infty$'s, a general delay matrix may contain arbitrarily large nonnegative integers, provided the associated network graph is sufficiently large. The state-space equations for Example 1 are of the form

$$\begin{bmatrix} x^1_+ \\ x^2_+ \\ x^3_+ \end{bmatrix} \overset{t}{=} \begin{bmatrix} A_{11} & A_{12} & 0 \\ A_{21} & A_{22} & 0 \\ A_{31} & A_{32} & A_{33} \end{bmatrix} \begin{bmatrix} x^1 \\ x^2 \\ x^3 \end{bmatrix} + \begin{bmatrix} B_{11} & B_{12} & 0 \\ B_{21} & B_{22} & 0 \\ B_{31} & B_{32} & B_{33} \end{bmatrix} \begin{bmatrix} u^1 \\ u^2 \\ u^3 \end{bmatrix} + \begin{bmatrix} w^1 \\ w^2 \\ w^3 \end{bmatrix}. \tag{9}$$

Note that whenever $D_{ij} > 1$, we have $A_{ij} = 0$ and $B_{ij} = 0$. The policies of the three decision-makers in Example 1 are constrained as follows.

$$\begin{aligned} u^1_t &= \gamma^1_t\left( x^1_{0:t},\ x^2_{0:t-1} \right) \\ u^2_t &= \gamma^2_t\left( x^1_{0:t-1},\ x^2_{0:t} \right) \\ u^3_t &= \gamma^3_t\left( x^1_{0:t-1},\ x^2_{0:t},\ x^3_{0:t} \right) \end{aligned} \qquad\text{for } 0 \le t \le T-1. \tag{10}$$

There is a combination of sparsity and delay constraints; some states may never be available to a particular controller, while other states might be available but delayed. Our goal is to choose the policies $\gamma$ such that we minimize the cost given by (6).

Note that according to our definitions, both the plant (1) and the controller (5) share the same sparsity constraints. In Example 1 for instance, $x^3$ does not influence $x^1$ or $x^2$ via the dynamics, nor can it affect $u^1$ or $u^2$ via the controller. As explained in Subsection VII-A, this condition is sufficient to guarantee that the optimal control policies $\gamma$ are linear, a powerful fact.
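The delay matrix and the information sets of Example 1 can be checked with a few lines of code. The sketch below is illustrative only: the function names `delay_matrix` and `latest_available` are hypothetical, nodes are 0-indexed, and the edge list (1 to 2 and 2 to 1 with delay 1, 2 to 3 with delay 0) is read off from the message-passing description of Example 1 given later in the text. A Floyd-Warshall shortest-path pass is one standard way to obtain the shortest aggregate delays; the paper does not prescribe a particular algorithm.

```python
import math

INF = math.inf  # stands for "no directed path"

def delay_matrix(n, edges):
    """Return D with D[i][j] = shortest aggregate delay from node j to node i.

    edges is a list of (src, dst, delay) triples, nodes are 0-indexed.
    """
    D = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for src, dst, delay in edges:
        D[dst][src] = min(D[dst][src], delay)
    # Floyd-Warshall: allow paths j -> k -> i through each intermediate node k
    for k in range(n):
        for i in range(n):
            for j in range(n):
                D[i][j] = min(D[i][j], D[i][k] + D[k][j])
    return D

def latest_available(D, i, t):
    """For controller i at time t, list the largest k with x^j_k in the
    information set (4) for each node j, or None if no state of j is available."""
    return [t - d if d <= t else None for d in D[i]]

# Example 1 (assumed edge list): 1 -> 2 and 2 -> 1 with delay 1, 2 -> 3 with delay 0
EDGES = [(0, 1, 1), (1, 0, 1), (1, 2, 0)]
D = delay_matrix(3, EDGES)
```

With these inputs, `D` reproduces (8), and `latest_available(D, 0, t)` returns `[t, t - 1, None]`, matching the first line of (10).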
Based on our formulation, the controller may be any function of the past information history, which grows with the size of the time horizon $T$. We will show by construction that there exists an optimal policy that has a finite memory that is independent of $T$.

III Main results

This section presents the main results: explicit state-space solutions for Problem 1. Two equivalent forms of the controller are presented. The first form has states which are functions of the primitive random variables $x_0$ and $w_{0:t-1}$. The second form uses state feedback only and implements the controller in a distributed fashion by message passing.

To describe the controller coordinates, the network graph is transformed into an alternate representation, the information graph. This graph tracks the noise signals $w^i$ as they propagate through the network graph. Formally, we define the information graph as follows. Let $s^j_k$ be the set of nodes reachable from node $j$ within $k$ steps:

$$s^j_k = \{ i \in \mathcal{V} : D_{ij} \le k \}. \tag{11}$$

The information graph $\hat{\mathcal{G}}(\mathcal{U}, \mathcal{F})$ is given by

$$\mathcal{U} = \{ s^j_k : k \ge 0,\ j \in \mathcal{V} \}, \qquad \mathcal{F} = \{ (s^j_k, s^j_{k+1}) : k \ge 0,\ j \in \mathcal{V} \}.$$

Consider Example 1, and track each of the noise signals $w^1$, $w^2$, $w^3$ as they are propagated through the network graph of Fig. 1.

$$\begin{aligned} w^1 &\to \{1\} \to \{1,2,3\} \to \{1,2,3\} \to \cdots \\ w^2 &\to \{2,3\} \to \{1,2,3\} \to \{1,2,3\} \to \cdots \\ w^3 &\to \{3\} \to \{3\} \to \cdots \end{aligned} \tag{12}$$

Each chain in (12) is of the form $w^i \to s^i_0 \to s^i_1 \to s^i_2 \to \cdots$ and tracks the nodes reached by $w^i$ after some number of timesteps. Assembling the paths in (12) and aggregating duplicate nodes, we obtain the information graph for Example 1, shown in Fig. 2. The additional labels $w^i$ are not counted amongst the nodes of $\hat{\mathcal{G}}$ as a matter of convention, but are shown as a reminder of which noise signal is being tracked. We will often write expressions such as $\{ s \in \mathcal{U} : w^i \to s \}$ to denote the set of root nodes of $\hat{\mathcal{G}}$.

The following proposition gives some useful properties of the information graph.
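The construction in (11) can be sketched in code as follows. This is a minimal illustration, not part of the original paper: the function name `information_graph` is hypothetical, the delay matrix is the one for Example 1, and the sets $s^j_k$ are stored as frozensets of 1-indexed node labels. The sketch relies on the observation that $s^j_k$ stops changing once $k$ reaches the largest finite entry of $D$.

```python
import math

INF = math.inf

def information_graph(D):
    """Build the node set U, edge set F and root map of the information graph.

    D[i][j] is the delay from node j to node i (0-indexed). Nodes of the
    information graph are frozensets of 1-indexed network nodes.
    """
    n = len(D)
    # s^j_k is constant in k once k reaches the largest finite delay
    kmax = max(d for row in D for d in row if d != INF)

    def s(j, k):
        return frozenset(i + 1 for i in range(n) if D[i][j] <= k)

    nodes, edges, roots = set(), set(), {}
    for j in range(n):
        roots[j + 1] = s(j, 0)  # w^j enters the graph at s^j_0
        for k in range(kmax + 1):
            nodes.add(s(j, k))
            edges.add((s(j, k), s(j, k + 1)))
    return nodes, edges, roots

# Delay matrix (8) of Example 1
D = [[0, 1, INF], [1, 0, INF], [1, 0, 0]]
U, F, roots = information_graph(D)
for r, nxt in sorted((sorted(a), sorted(b)) for a, b in F):
    print(r, "->", nxt)
```

For Example 1 this yields the four nodes $\{1\}$, $\{2,3\}$, $\{3\}$, $\{1,2,3\}$, with self-loops at $\{3\}$ and $\{1,2,3\}$, in agreement with the chains in (12).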

Figure 2: Information graph for Example 1. The nodes are subsets of nodes in the network graph (see Fig. 1) affected by different noises. For example, $w^2$ is injected at node 2, affects nodes $\{2, 3\}$ immediately, and affects $\{1, 2, 3\}$ after one timestep.

Proposition 1. Given an information graph $\hat{\mathcal{G}}(\mathcal{U}, \mathcal{F})$, the following properties hold.
(i) Every node in $\hat{\mathcal{G}}$ has exactly one descendant. In other words, for every $r \in \mathcal{U}$, there is a unique $s \in \mathcal{U}$ such that $r \to s$.
(ii) Every path eventually hits a node with a self-loop.
(iii) If the network graph satisfies $|\mathcal{V}| = n$, the number of nodes in $\hat{\mathcal{G}}$ is bounded by $n \le |\mathcal{U}| \le n^2 - n + 1$.

The first two properties are immediate by construction. The lower bound on $|\mathcal{U}|$ is achieved by directed acyclic network graphs with zero delay on all edges. The upper bound on $|\mathcal{U}|$ is achieved by a network graph consisting of one large cycle, $1 \to 2 \to \cdots \to n \to 1$, in which each link has a one-timestep delay.

Note that the information graph may have several connected components. This happens whenever the network graph is not strongly connected. For example, Fig. 2 has two connected components because there is no path $3 \to 2$ in Fig. 1.

In (12), we characterized how noise injected at the current timestep propagates to the other nodes at future timesteps. Alternatively, one may consider the history of all past injected noises, and ask which of these noise terms have affected a particular node at the present time. Define the noise history at time $t$ to be the set of all noise terms injected into the system prior to time $t$,

$$\mathcal{H}_t = \{ x_0, w_{0:t-1} \}. \tag{13}$$

The noise partition diagram is formed by arranging the elements of $\mathcal{H}_t$ in a table and grouping them based on which nodes at the current time they can affect. We call these groups label sets, and they are discussed in greater detail in Section VII-B. The noise partition diagram for Example 1 is shown in Fig. 3. Note that as we traverse each row of the noise partition diagram from right to left in Fig. 3, we recover (12).

We are now ready to present the main result of this paper, which expresses the optimal controller as a function of new coordinates induced by the information graph.
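As a brief aside, the grouping used in the noise partition diagram can be reproduced numerically. The sketch below assumes the reading suggested by (12): a noise term $w^j_\tau$ has had $t - 1 - \tau$ steps to spread by time $t$, so it is placed in the group $s^j_{t-1-\tau}$, and the initial state $x^j_0$ is placed in $s^j_t$. The function names `reach` and `noise_partition` and the string labels are hypothetical.

```python
import math
from collections import defaultdict

INF = math.inf

# Delay matrix (8) of Example 1; D[i][j] is the delay from node j to node i
D = [[0, 1, INF], [1, 0, INF], [1, 0, 0]]

def reach(D, j, k):
    """The set s^j_k from (11), as a frozenset of 1-indexed node labels."""
    return frozenset(i + 1 for i in range(len(D)) if D[i][j] <= k)

def noise_partition(D, t):
    """Group the elements of the noise history H_t by the set of nodes
    they can affect at time t (assumed rule: w^j_tau goes to s^j_{t-1-tau})."""
    groups = defaultdict(list)
    for j in range(len(D)):
        groups[reach(D, j, t)].append(f"x{j + 1}_0")
        for tau in range(t):
            groups[reach(D, j, t - 1 - tau)].append(f"w{j + 1}_{tau}")
    return dict(groups)

groups = noise_partition(D, 3)
for label, members in groups.items():
    print(sorted(label), members)
```

At $t = 3$ the most recent noise at node 1 sits alone in the group $\{1\}$, the most recent noise at node 2 sits alone in $\{2,3\}$, everything injected at node 3 stays in $\{3\}$, and all older terms from nodes 1 and 2 fall into $\{1,2,3\}$.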
Figure 3: Noise partition diagram for Example 1 (see Fig. 1 and 2). All past noise disturbances are placed into groups depending on which decisions at time $t$ they can affect.

Theorem 2. Consider Problem 1, and let $\hat{\mathcal{G}}(\mathcal{U}, \mathcal{F})$ be the associated information graph. Define the matrices $\{X^r_{0:T}\}_{r \in \mathcal{U}}$ and $\{K^r_{0:T-1}\}_{r \in \mathcal{U}}$ recursively as follows,

$$\begin{aligned} X^r_T &= Q_f^{rr} \\ K^r &\overset{t}{=} -\left( R^{rr} + B^{sr\mathsf{T}} X^s_+ B^{sr} \right)^{-1} \left( S^{rr} + A^{sr\mathsf{T}} X^s_+ B^{sr} \right)^{\mathsf T} \\ X^r &\overset{t}{=} \left( Q^{rr} + A^{sr\mathsf{T}} X^s_+ A^{sr} \right) + \left( S^{rr} + A^{sr\mathsf{T}} X^s_+ B^{sr} \right) K^r, \end{aligned} \tag{14}$$

where for each $r \in \mathcal{U}$, we have defined $s \in \mathcal{U}$ to be the unique node such that $r \to s$. The optimal control decisions satisfy the following state-space equations

$$\begin{aligned} \zeta^s_0 &= \sum_{w^i \to s} I^{s,\{i\}} x^i_0 \\ \zeta^s_+ &\overset{t}{=} \sum_{r \to s} \left( A^{sr} + B^{sr} K^r \right) \zeta^r + \sum_{w^i \to s} I^{s,\{i\}} w^i \\ u^i &\overset{t}{=} \sum_{r \ni i} I^{\{i\},r} K^r \zeta^r. \end{aligned} \tag{15}$$

The corresponding optimal expected cost is

$$V_0 = \sum_{\substack{i \in \mathcal{V} \\ w^i \to s}} \left( (\mu^i_0)^{\mathsf T} (X^s_0)^{\{i\},\{i\}} (\mu^i_0) + \operatorname{trace}\left( (X^s_0)^{\{i\},\{i\}} \Sigma^i_0 \right) \right) + \sum_{t=0}^{T-1} \sum_{\substack{i \in \mathcal{V} \\ w^i \to s}} \operatorname{trace}\left( (X^s_{t+1})^{\{i\},\{i\}} W^i_t \right). \tag{16}$$

Proof. See Section VII.

In the statement of Theorem 2, matrices with sets used as superscripts, as in $A^{sr}$ or $I^{s,\{i\}}$, specify submatrices. This notation is explained in Section I-A. Note that (15) expresses the controller as a map $w \to u$, generalizing noise-based controllers from [5, 2, 22, 23]. (See [, 7] for other uses of noise-based representations in decentralized control.) In the proof of Theorem 2, we show that this controller is feasible because the noise terms used in computing each $u^i$ can be deduced from the associated information set $\mathcal{I}^i$.

Our second main result, Theorem 3, gives an explicit implementation of Theorem 2 by describing how each

node uses its locally available state information to compute the optimal control decisions. To illustrate how this will work, recall the network graph of Example 1 and its corresponding information graph, Fig. 2. We implement the optimal controller of Theorem 2 by using a combination of local measurement, local computation, and message passing, as follows.

Assume that at time $t$, each node has a copy of the controller states required for computing its local $u^i_t$ using (15). Namely,

$$\begin{aligned} 1 &: \ \zeta^{\{1\}},\ \zeta^{\{1,2,3\}} \\ 2 &: \ \zeta^{\{2,3\}},\ \zeta^{\{1,2,3\}} \\ 3 &: \ \zeta^{\{2,3\}},\ \zeta^{\{1,2,3\}},\ \zeta^{\{3\}}. \end{aligned} \tag{17}$$

At time $t$, messages are passed along the delay-1 edges of the network graph as follows: node 1 sends $\zeta^{\{1\}}_t$ to node 2 and node 2 sends $\zeta^{\{2,3\}}_t$ to node 1. These messages arrive one timestep later, due to the delays associated with these edges of the network graph.

Now advance to timestep $t+1$, such that all these messages have been received. Nodes 1 and 2 now have copies of $\zeta^{\{1\}}_t$, $\zeta^{\{2,3\}}_t$, and $\zeta^{\{1,2,3\}}_t$. Thus, nodes 1 and 2 can compute $\zeta^{\{1,2,3\}}_{t+1}$ via (15) using their current controller state information. We cannot use (15) to compute $\zeta^{\{1\}}_{t+1}$, $\zeta^{\{2,3\}}_{t+1}$, or $\zeta^{\{3\}}_{t+1}$ since it would require knowledge of $w^1_t$, $w^2_t$, or $w^3_t$ respectively. However, we will see later that the $\zeta$-coordinates combine to form the global state:

$$x_t = \sum_{s \in \mathcal{U}} I^{\mathcal{V},s} \zeta^s_t, \quad\text{for } t = 0, \dots, T.$$

The need to know the $w^i$ can thus be avoided. For example, once node 1 measures $x^1_{t+1}$, we can compute $\zeta^{\{1\}}_{t+1} = x^1_{t+1} - (\zeta^{\{1,2,3\}}_{t+1})^{\{1\}}$, and similarly for $\zeta^{\{2,3\}}_{t+1}$.

Finally, node 2 sends $\zeta^{\{2,3\}}_{t+1}$ and $\zeta^{\{1,2,3\}}_{t+1}$ to node 3. These messages arrive instantly because the associated edge has delay 0. Node 3 then measures $x^3_{t+1}$ and computes

$$\zeta^{\{3\}}_{t+1} = x^3_{t+1} - (\zeta^{\{1,2,3\}}_{t+1})^{\{3\}} - (\zeta^{\{2,3\}}_{t+1})^{\{3\}}.$$

At this point, each node has copies of the states indicated by (17) at the incremented time $t+1$. Therefore, $u_{t+1}$ can be computed and we have come full-circle. Fig. 4 shows all the messages passed along the network graph.

Figure 4: Network graph for Example 1 with messages.

Our next main result, given below, generalizes the above procedure to any instance of Problem 1.

Theorem 3. A distributed implementation of the optimal controller in Theorem 2 is given as follows. Each node $i \in \mathcal{V}$ is responsible for storing and updating a local copy of the states $\{\zeta^s\}_{s \ni i}$. Each node $i \in \mathcal{V}$ performs the following steps for $t = 0, \dots, T-1$.
1) Receive and store all inbound delay-0 and delay-1 messages. This provides the $\zeta^r_t$ and $\zeta^r_{t-1}$ values respectively that will be used for local updates in the coming steps. Note that if $t = 0$, there are no delay-1 messages to receive.
2) For non-root nodes $s \ni i$, so $w^i \not\to s$, if $\zeta^s_t$ is available from a delay-0 message, do nothing. Otherwise compute it (or initialize it if $t = 0$) using (15).
3) For the root node $s$ such that $w^i \to s$, measure $x^i_t$ and update as follows.

$$\begin{aligned} (\zeta^s_t)^{s \setminus \{i\}} &= \sum_{r \to s} I^{s \setminus \{i\},s} \left( A^{sr} + B^{sr} K^r \right) \zeta^r_{t-1} \\ (\zeta^s_t)^{\{i\}} &= x^i_t - \sum_{\substack{r \ni i \\ r \ne s}} I^{\{i\},r} \zeta^r_t \end{aligned} \tag{18}$$

4) Compute input $u^i_t$ according to (15).
5) Send all outbound messages $M^{ij}_t$, defined as follows.
If $i \overset{0}{\to} j$: $M^{ij}_t = \{ \zeta^s_t : s \in \mathcal{U},\ i, j \in s \}$.
If $i \overset{1}{\to} j$: $M^{ij}_t = \{ \zeta^s_t : s \in \mathcal{U},\ i \in s,\ j \notin s \}$.

Proof. Proved in Subsection VII-D.

IV Existing results

In this section, we explain how Theorem 2 specializes to the existing results mentioned in Section I. For each such case, we show the associated network graph and information graph.

Centralized case. The first related result is classical state-feedback LQR control. This corresponds to the trivial case where the network graph and information graphs are single nodes, as in Fig. 5. The recursion (14) becomes a standard Riccati difference equation, and $K^{\{1\}}$ is the classical LQR gain. Furthermore, the algorithm of Theorem 3 reduces to the classical state-feedback policy. No messages are passed, none of the update steps are executed, and $\zeta^{\{1\}} = x$.

Figure 5: Classical centralized state-feedback control. (a) network graph, (b) information graph.

Sparsity constraints. For sparsity constraints with no delays, the simplest problem, the two-player problem, is shown in Fig. 6. The information graph consists of two disconnected self-loops, implying that the optimal

controller depends on the solution of two Riccati difference equations and their associated $K$-gains. For sparsity over a general graph with $N$ nodes, the information graph has $N$ disconnected self-loops and so $N$ Riccati difference equations must be solved.

Figure 6: The two-player problem, sparsity only. (a) network graph, (b) information graph.

Delayed sharing. The simplest problem with delays but no sparsity, the one-step information sharing pattern, is shown in Fig. 7. For general strongly connected graphs with one-timestep delays on every edge, the information graph is connected, and all edges eventually lead to the self-loop $\mathcal{V} \to \mathcal{V}$. Thus, a single Riccati difference equation must be solved, which corresponds to the centralized Riccati difference equation, and its solution is propagated according to (14).

Figure 7: One-step delay information sharing pattern. (a) network graph, (b) information graph.

V Further extensions

We now discuss selected topics exploring the limitations of our work and directions for possible future research.

Infinite horizon. While the problem formulation in this paper considers optimization over a finite time horizon, our solution extends naturally to an infinite horizon. To this end, we assume all system parameters are time-invariant, and we seek a stabilizing controller that minimizes the average finite-horizon cost as the length of the horizon tends to infinity. In this limit, the Riccati difference equations (14) for the nodes of the information graph with self-loops $r \to r$ become Algebraic Riccati Equations (AREs). Under classical conditions [2], each $X^r$ approaches a steady-state value as $T \to \infty$, the associated steady-state gain $K^r$ will be stabilizing, and the corresponding $\zeta^r$ state will be stable. One can easily show that the remaining $X^s$ and $K^s$ matrices for the other nodes of the information graph also approach steady-state limits and their corresponding states are also stable. For a numerical example and further discussion, see Example 2 in Section VI.

Output feedback.
In output feedback problems, the decision-makers have access to noisy measurements of states rather than the states themselves. Solutions are known for two-player sparsity [2, 3], n-player broadcast [9], and one-timestep delays [4, 20, 29], but it is not clear how to extend the results of this paper to output feedback over a general graph with mixed delays. Difficulties arise because determining appropriate sufficient statistics for dynamic programming becomes subtle when the delay between any two nodes is at least 2 timesteps [5, 27, 30]. This phenomenon occurs for all sparsity patterns and most complex delay patterns. For systems in this paper, sufficient statistics are computed by projecting the state onto orthogonal subspaces of the information sets. It is unclear if such a decomposition is possible for output-feedback problems. We mention one promising exception [4], in which sufficient statistics are derived along with a solution to the finite-horizon version of the two-player problem. The output-feedback solutions found so far are significantly more complicated than their state-feedback counterparts. For instance, in Theorem 2, the noise covariances $W^i$ only appear in the expression for the optimal cost. In output feedback, however, the policy itself depends explicitly on $W^i$.

Correlated noise. We assume in Problem 1 that the noises injected into the various nodes are independent. This independence is used to show that the $\zeta^s$ states are mutually independent, thus enabling a critical simplification of the value function used in the dynamic programming argument. If the noises are correlated, for example $\mathbb{E}(w^1 w^{2\mathsf{T}}) \ne 0$, our approach fails. Such problems are still partially nested, as defined in Section VII-A, so we expect the optimal controller to be unique and linear, but we do not know how many states it will have. A simple version of this problem with two players and decoupled dynamics was recently solved [0], and the solution is surprisingly complex. If each player has $n$ states, the optimal controller may have a number of states proportional to $n^2$.

Realizability.
In general, causal linear time-invariant systems may be equivalently represented using either state-space or transfer functions. However, the two representations are not equivalent when we impose an underlying graph structure and associated sparsity for the state-space matrices [, 26]. Specifically, every structured state-space realization that is stabilizable and detectable corresponds to a structured transfer function, but the opposite is not true. A counterexample is given in []. We avoid the issue of realizability in this paper by only considering plants for which a structured state-space

realization exists, as implied by (1). Under this assumption, we produce a structured state-space realization for the optimal controller. As discussed earlier in this section, all internal modes $\zeta^s$ of this controller are stable by construction, so our realization is stabilizing. For graphs with pure sparsity (no delays), there is no loss in assuming a realizable plant, because non-realizable plants can never be stabilized using structured controllers []. However, no analogous result is known for systems with delays, as discussed in [26].

VI Examples

In this section, we give two examples to illustrate both the generality and versatility of our results.

Example 2. Consider the four-node system depicted in Fig. 8. We show the network graph and its associated messages from Theorem 3 and the information graph and noise partition diagram derived in Section III.

As a supplement to the discussion on infinite time horizon solutions in Section V, we now present a numerical simulation of this four-node example that shows convergence of the optimal control gains as the time horizon grows. We use time-invariant parameters $A$, $B$, $D$, $Q$, $R$, $S$ [matrix entries lost in extraction] and noise covariance $W = I$. Note that the $A$ and $B$ matrices have a sparsity pattern that conforms to the delay matrix $D$. As explained in Section II, we must have $A_{ij} = 0$ and $B_{ij} = 0$ whenever $D_{ij} \ge 2$.

To guarantee that the Riccati equations for the self-loops converge to stabilizing solutions, it is sufficient that

$$\left( Q^{rr} - S^{rr} (R^{rr})^{-1} S^{rr\mathsf{T}},\ \ A^{rr} - B^{rr} (R^{rr})^{-1} S^{rr\mathsf{T}},\ \ B^{rr} \right)$$

be stabilizable and detectable for $r = \{3, 4\}$, $\{2, 3, 4\}$, and $\{1, 2, 3, 4\}$. A detailed discussion of the convergence of Riccati equations may be found in [2]. A direct calculation shows that $Q^{rr} - S^{rr} (R^{rr})^{-1} S^{rr\mathsf{T}}$ and $B^{rr}$ are invertible for all self-loops, and so stabilizability and detectability are guaranteed. It follows that the $X^r$ matrices, which are solutions to (14), should converge to steady-state values as we get farther from the terminal timestep.
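The convergence behavior can be illustrated on a much smaller instance than Example 2. The sketch below runs the backward recursion (14) for a single self-loop ($r = s$) with scalar data. The values $A = B = Q = R = 1$, $S = 0$, $Q_f = 0$ are hypothetical and chosen only so that the limit is easy to verify by hand; they are not the parameters of the four-node example, and the function names are illustrative.

```python
import math

def riccati_step(X_next, A, B, Q, R, S):
    """One backward step of recursion (14) for scalar data and a self-loop r = s."""
    K = -(S + A * X_next * B) / (R + B * X_next * B)
    X = (Q + A * X_next * A) + (S + A * X_next * B) * K
    return X, K

def backward_pass(T, A, B, Q, R, S, Qf):
    """Iterate (14) backwards from the terminal condition X_T = Qf.

    Returns a list whose k-th entry is X at time T - k.
    """
    X = Qf
    history = [X]
    for _ in range(T):
        X, _ = riccati_step(X, A, B, Q, R, S)
        history.append(X)
    return history

# Hypothetical scalar data, not taken from Example 2
history = backward_pass(T=50, A=1.0, B=1.0, Q=1.0, R=1.0, S=0.0, Qf=0.0)
golden = (1.0 + math.sqrt(5.0)) / 2.0  # positive root of X^2 - X - 1 = 0
print(history[:4], history[-1], golden)
```

For this data the recursion reduces to $X_t = 1 + X_{t+1} / (1 + X_{t+1})$, whose positive fixed point solves $X^2 - X - 1 = 0$, so the iterates settle near $(1 + \sqrt{5})/2$ after a few steps back from the terminal time.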
This fact is supported by the plot of Fig. 9.

Figure 8: Graphical representations for Example 2. (a) network graph with messages, (b) information graph, (c) noise partition diagram.

Figure 9: Plot of $\operatorname{trace}(X^r_t)$ as a function of time for Example 2. A curve is shown for each $r \in \mathcal{U}$. Horizontal axis: Time Step ($t$).

Example 3. To investigate the effect of increased delays on total cost, we consider the 10-node system defined by the linear network graph of Fig. 10. Depending on the value of $k$, we can vary the end-to-end delay $D_{90} = k$.

Figure 10: Network graph for Example 3.

Suppose the system matrices $A$ and $B$ are bi-diagonal [matrix entries lost in extraction]. The cost matrices are $Q = R = I$, $S = 0$, and the noise covariance is $W = 10^{-3} I$. Because of their bi-diagonal structure, these matrices are admissible for any choice of $k$. In Fig. 11, the total cost is plotted as a function of $k$ for the fixed time horizon of $T = 20$. As expected, the cost increases monotonically as we increase the delay.

Figure 11: Plot of the optimal cost as a function of the total delay for Example 3, with a time horizon of $T = 20$. Horizontal axis: Total delay ($k$).

VII Proof of main results

This section contains proofs of Theorems 2 and 3. The proof of Theorem 2 requires three steps.
1) Linearity. We show that the information constraint for Problem 1 is partially nested [3]. Thus the optimal policy is unique, and consists of linear functions of the information history.
2) Feasibility. In (15), $u^i$ depends on $w^i$. In spite of this, we prove that $u^i$ need only measure its associated information set $\mathcal{I}^i$. So the policy is feasible.
3) Optimality. Finally, we must show that the proposed policy is optimal. Our approach uses dynamic programming in the $\zeta^s$ coordinates, with an optimization over policies rather than actions.

VII-A Linearity

Linearity of the optimal policy follows from partial nestedness, a concept first introduced by Ho and Chu in [3]. We state the main definition and result below.

Definition 4. A dynamical system (2) with information structure (5) is partially nested if for every admissible policy $\gamma$, whenever $u^j_\tau$ affects $\mathcal{I}^i_t$, then $\mathcal{I}^j_\tau \subset \mathcal{I}^i_t$.

Lemma 5 (see [3]). Given a partially nested structure, the optimal control law that minimizes a quadratic cost of the form (6) exists, is unique, and is linear.

In other words, an information structure is partially nested if whenever the decision of Player $j$ affects the information used in Player $i$'s decision, then Player $i$ must have access to all the information available to Player $j$. When this is the case, the optimal policy is linear.
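Using partial nestedness, the following lemma shows that the optimal state and input may be expressed as linear functions of terms from the information sets $\mathcal{I}^i_t$ and the noise history $\mathcal{H}_t$, which was defined in (13).

Lemma 6. The information structure described in Problem 1 is partially nested. Furthermore, the optimally controlled states and inputs belong to their associated information subspaces, and the global state and input belong to the noise history subspace. Namely,

$$x^i_t,\ u^i_t \in \operatorname{lin} \mathcal{I}^i_t \ \text{ for all } i \in \mathcal{V}, \quad\text{and}\quad x_t,\ u_t \in \operatorname{lin} \mathcal{H}_t. \tag{19}$$

Proof. See Appendix B.

VII-B Feasibility

It is clear from Lemma 6 that $\operatorname{lin} \mathcal{I}^i_t \subset \operatorname{lin} \mathcal{H}_t$. The inclusion is typically strict, since the information constraints may prevent some noise terms in $\mathcal{H}_t$ from having an immediate influence on all nodes of the network. In this subsection, we begin by characterizing the subspaces of $\mathcal{H}_t$ that are associated with each of the $\mathcal{I}^i_t$. This will eventually lead to an intuitive definition for the new $\zeta^s$ coordinates.

Lemma 7. Consider an information graph $\hat{\mathcal{G}}(\mathcal{U}, \mathcal{F})$ and define the corresponding label sets $\{\mathcal{L}^s_{0:T}\}_{s \in \mathcal{U}}$ recursively by

$$\begin{aligned} \mathcal{L}^s_0 &= \bigcup_{w^i \to s} \{ x^i_0 \} \\ \mathcal{L}^s_{t+1} &= \bigcup_{w^i \to s} \{ w^i_t \} \ \cup\ \bigcup_{r \to s} \mathcal{L}^r_t. \end{aligned} \tag{20}$$

The following properties of the label sets hold.
(i) For every $t \ge 0$, the label sets are a partition of the noise history:

$$\mathcal{L}^r_t \cap \mathcal{L}^s_t = \emptyset \ \text{ when } r \ne s, \quad\text{and}\quad \mathcal{H}_t = \bigcup_{s \in \mathcal{U}} \mathcal{L}^s_t. \tag{21}$$

(ii) For all $i \in \mathcal{V}$,

$$\operatorname{lin} \mathcal{I}^i_t = \operatorname{lin} \bigcup_{s \ni i} \mathcal{L}^s_t. \tag{22}$$

Proof. See Appendix C.

The label sets specify the groupings in the noise partition diagram, and thus we can use the noise partition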

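diagram to find the noise terms available to a given input. Consider Example 1 and Fig. 3 once again. The label sets that contain 2 are $\{2, 3\}$ and $\{1, 2, 3\}$. Therefore,

$$u^2_t \in \operatorname{lin} \mathcal{I}^2_t = \operatorname{lin}\{ \mathcal{L}^{\{2,3\}}_t,\ \mathcal{L}^{\{1,2,3\}}_t \}. \tag{23}$$

Referring to Fig. 3, we may expand the contents of these label sets and obtain

$$u^2_t \in \operatorname{lin}\{ x^1_0,\ x^2_0,\ w^1_{0:t-2},\ w^2_{0:t-1} \}.$$

We now define the $\zeta^s$ coordinates used in the statement of Theorem 2. Combining (19) and (22), we have that

$$x^i_t,\ u^i_t \in \operatorname{lin} \bigcup_{s \ni i} \mathcal{L}^s_t \quad\text{for all } i \in \mathcal{V}. \tag{24}$$

Thus, we may write

$$x_t = \sum_{s \in \mathcal{U}} I^{\mathcal{V},s} \zeta^s_t \quad\text{and}\quad u_t = \sum_{s \in \mathcal{U}} I^{\mathcal{V},s} \varphi^s_t, \tag{25}$$

where $\zeta^s_t \in \operatorname{lin} \mathcal{L}^s_t$ and $\varphi^s_t \in \operatorname{lin} \mathcal{L}^s_t$. As explained in Section I-A, $I^{\mathcal{V},s}$ denotes a submatrix of a large identity matrix with block rows and columns of appropriate dimensions. For example, the input in Example 1 is

$$u = \begin{bmatrix} I \\ 0 \\ 0 \end{bmatrix} \varphi^{\{1\}} + \begin{bmatrix} 0 & 0 \\ I & 0 \\ 0 & I \end{bmatrix} \varphi^{\{2,3\}} + \begin{bmatrix} 0 \\ 0 \\ I \end{bmatrix} \varphi^{\{3\}} + \varphi^{\{1,2,3\}},$$

where the $\varphi^s$ are mutually independent. This structure is due to (23), or more generally (24). For example, if we examine the second row of the above equation, $u^2$ only depends on $\varphi^{\{2,3\}}$ and $\varphi^{\{1,2,3\}}$, which depend on $\mathcal{L}^{\{2,3\}}$ and $\mathcal{L}^{\{1,2,3\}}$, and this agrees with what we found in (23).

Since we assumed in Problem 1 that the random vectors in $\mathcal{H}_t$ are mutually independent and the $\{\mathcal{L}^s_t\}_{s \in \mathcal{U}}$ are a partition of $\mathcal{H}_t$, it follows that

$$\text{the vectors in the set } \left\{ \begin{bmatrix} \zeta^s_t \\ \varphi^s_t \end{bmatrix} \right\}_{s \in \mathcal{U}} \text{ are mutually independent.} \tag{26}$$

Just as (2) characterizes the evolution of $x$, we may also derive state equations for the $\zeta$ coordinates. We state the result in the following lemma.

Lemma 8. The coordinates $\{\zeta^s\}_{s \in \mathcal{U}}$ and $\{\varphi^s\}_{s \in \mathcal{U}}$ satisfy the recursive equations

$$\zeta^s_0 = \sum_{w^i \to s} I^{s,\{i\}} x^i_0 \tag{27}$$

$$\zeta^s_+ \overset{t}{=} \sum_{r \to s} \left( A^{sr} \zeta^r + B^{sr} \varphi^r \right) + \sum_{w^i \to s} I^{s,\{i\}} w^i \tag{28}$$

Proof. See Appendix D.

Note that (27)-(28) agrees with the formula (15) given in the statement of Theorem 2, provided that we set $\varphi^s = K^s \zeta^s$. This choice of policy is feasible because

$$u^i = \sum_{s \ni i} I^{\{i\},s} K^s \zeta^s \in \operatorname{lin} \bigcup_{s \ni i} \mathcal{L}^s = \operatorname{lin} \mathcal{I}^i,$$

where the inclusion follows by definition, and the last equality follows from Lemma 7. This completes the proof of feasibility.

Remark 9. We may interpret $\zeta^s$ and $\varphi^s$ as conditional estimates of $x$ and $u$, respectively.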
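Namely,

$$\zeta^s_t = I^{s,V}\, \mathbb{E}(x_t \mid \mathcal{L}^s_t) \qquad \text{and} \qquad \varphi^s_t = I^{s,V}\, \mathbb{E}(u_t \mid \mathcal{L}^s_t).$$

VII-C Optimality

We now prove that the controller is optimal, and derive an expression for the corresponding minimal expected cost. Our proof uses a dynamic programming argument in which we optimize over policies rather than actions. Let $\gamma_t = \{\gamma^i_t\}_{i \in V}$ be the set of policies at time $t$. By Lemma 5, we may assume the $\gamma^i_t$ are linear. Define the cost-to-go

$$V_t(\gamma_{0:t-1}) = \min_{\gamma_{t:T-1}} \mathbb{E}^{\gamma} \left( \sum_{k=t}^{T-1} \begin{bmatrix} x_k \\ u_k \end{bmatrix}^T \begin{bmatrix} Q_k & S_k \\ S_k^T & R_k \end{bmatrix} \begin{bmatrix} x_k \\ u_k \end{bmatrix} + x_T^T Q_f x_T \right),$$

where the expectation is taken with respect to the joint probability measure on $(x_{t:T}, u_{t:T-1})$ induced by the choice of $\gamma = \gamma_{0:T-1}$. These functions are the minimum expected future cost from time $t$, given that the policies up to time $t-1$ have been fixed. We allow $V_t$ to be a function of past policies, but it turns out that $V_t$ will not depend on them explicitly. By causality, we may iterate the minimizations and write a recursive formulation for the cost-to-go,

$$V_t(\gamma_{0:t-1}) = \min_{\gamma_t} \mathbb{E}^{\gamma} \left( \begin{bmatrix} x_t \\ u_t \end{bmatrix}^T \begin{bmatrix} Q_t & S_t \\ S_t^T & R_t \end{bmatrix} \begin{bmatrix} x_t \\ u_t \end{bmatrix} + V_{t+1}(\gamma_{0:t-1}, \gamma_t) \right). \tag{29}$$

Our objective is to find the optimal cost (16), which is simply $V_0$. Consider the terminal timestep, and use the decomposition (25),

$$V_T(\gamma_{0:T-1}) = \mathbb{E}^{\gamma}\left( x_T^T Q_f x_T \right) = \mathbb{E}^{\gamma} \sum_{s \in U} (\zeta^s_T)^T Q_f^{ss} (\zeta^s_T).$$

In the last step, we used the fact that the $\zeta^s_T$ coordinates are independent (26). Note that $V_T$ depends on the policies up to time $T-1$ because the distribution of $\zeta^s_T$ depends on past policies implicitly through (28). We will prove by induction that the value function always has a similar quadratic form. Specifically, suppose that for some $t \ge 0$, we have

$$V_{t+1}(\gamma_{0:t}) = \mathbb{E}^{\gamma} \sum_{s \in U} (\zeta^s_{t+1})^T X^s_{t+1} (\zeta^s_{t+1}) + c_{t+1},$$

where $\{X^s_{t+1}\}_{s \in U}$ is a set of matrices and $c_{t+1}$ is a scalar. Now compute $V_t(\gamma_{0:t-1})$ using the recursion (29). Substituting the definitions for $\zeta^s_t$ and $\varphi^s_t$ from (25) and using the independence result (26), we obtain

$$V_t(\gamma_{0:t-1}) = \min_{\gamma_t} \mathbb{E}^{\gamma} \left( \sum_{s \in U} \left( \begin{bmatrix} \zeta^s_t \\ \varphi^s_t \end{bmatrix}^T \begin{bmatrix} Q_t^{ss} & S_t^{ss} \\ S_t^{ssT} & R_t^{ss} \end{bmatrix} \begin{bmatrix} \zeta^s_t \\ \varphi^s_t \end{bmatrix} + (\zeta^s_{t+1})^T X^s_{t+1} (\zeta^s_{t+1}) \right) + c_{t+1} \right).$$

Substituting the state equations (28), using the independence result once more and rearranging terms, we obtain

$$V_t(\gamma_{0:t-1}) = \min_{\gamma_t} \mathbb{E}^{\gamma} \sum_{r \in U} \begin{bmatrix} \zeta^r_t \\ \varphi^r_t \end{bmatrix}^T \Gamma^r_t \begin{bmatrix} \zeta^r_t \\ \varphi^r_t \end{bmatrix} + c_t, \tag{30}$$

where $\Gamma^r_{0:T}$ and $c_{0:T}$ are given by:

$$\Gamma^r_t = \begin{bmatrix} Q_t^{rr} & S_t^{rr} \\ S_t^{rrT} & R_t^{rr} \end{bmatrix} + \begin{bmatrix} A^{sr} & B^{sr} \end{bmatrix}^T X^s_{t+1} \begin{bmatrix} A^{sr} & B^{sr} \end{bmatrix} \tag{31}$$

$$c_t = c_{t+1} + \sum_{i \in V,\ w^i \to s} \operatorname{trace}\left( (X^s_{t+1})^{\{i\},\{i\}}\, W^i \right). \tag{32}$$

The terminal conditions are $\Gamma^r_T = Q_f^{rr}$ and $c_T = 0$. In (31), $s$ is the unique node in $\hat{G}(U, F)$ such that $r \to s$; see Proposition 1. In (30), the terms in the sum are independent, so they may be optimized separately. A lower bound on the cost-to-go is found by relaxing the information constraints and performing an unconstrained optimization over the actions $\varphi_t = \{\varphi^s_t\}_{s \in U}$,

$$V_t(\gamma_{0:t-1}) \ \ge\ \mathbb{E}^{\gamma} \sum_{r \in U} \min_{\gamma_t} \begin{bmatrix} \zeta^r_t \\ \varphi^r_t \end{bmatrix}^T \Gamma^r_t \begin{bmatrix} \zeta^r_t \\ \varphi^r_t \end{bmatrix} + c_t \ \ge\ \mathbb{E}^{\gamma} \sum_{r \in U} \min_{\varphi^r_t} \begin{bmatrix} \zeta^r_t \\ \varphi^r_t \end{bmatrix}^T \Gamma^r_t \begin{bmatrix} \zeta^r_t \\ \varphi^r_t \end{bmatrix} + c_t,$$

where the first inequality follows from Fatou's lemma applied to (30), and the second inequality follows from the relaxation mentioned above. Each minimization is a simple quadratic optimization, and the optimal cost and action are given by (14). Substitution yields

$$V_t(\gamma_{0:t-1}) \ \ge\ \mathbb{E}^{\gamma} \sum_{s \in U} (\zeta^s_t)^T X^s_t (\zeta^s_t) + c_t.$$

This lower bound is in fact tight, because the optimal unconstrained actions are $\varphi^s_t = K^s_t \zeta^s_t \in \operatorname{lin} \mathcal{L}^s_t$, which is precisely the admissible set for $\varphi^s_t$. This completes the induction argument as well as the proof that the specified policy is optimal. The optimal cost is given by

$$V_0 = \mathbb{E} \sum_{s \in U} (\zeta^s_0)^T X^s_0 (\zeta^s_0) + c_0 = \mathbb{E} \sum_{i \in V,\ w^i \to s} (x^i_0)^T (X^s_0)^{\{i\},\{i\}} (x^i_0) + c_0, \tag{33}$$

where $c_0$ may be evaluated by starting with $c_T = 0$ and recursing backwards using (32). Finally, (33) evaluates to the desired expression (16) because $x^i_0 \sim \mathcal{N}(\mu^i_0, \Sigma^i_0)$. This completes the proof of Theorem 2.

VII-D Proof of Theorem 3

First we show that the message passing algorithm executes without deadlock. At $t = 0$, deadlock cannot occur, since there are no delay-0 directed cycles.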
Therefore, the delay-0 edges induce a partial order on the nodes by $i \preceq j$ if there is a directed delay-0 path from $i$ to $j$. Any execution ordering that satisfies the partial order will be feasible. Now for $t > 0$, assuming that all delay-1 messages were sent at time $t-1$, the same reasoning shows that no deadlock can occur at time $t$.

Each node computes $u^i_t$ using (15), so we must check that all $\zeta^s_t$ states with $s \ni i$ are available to node $i$ at time $t$. It is straightforward to show that the claim holds for $t = 0$. Suppose it holds at timestep $t$. Computing $\zeta^s_{t+1}$ in step 2 or the first part of step 3 requires knowledge of $\{\zeta^r_t\}_{r \to s}$. If $i \in r$, then $\zeta^r_t$ is already available, by assumption. If $i \notin r$, then there exists $j \in r$ with $j \to i$. Then (18) implies that $\zeta^r_t \in M^{ji}_t$. Thus, $\zeta^r_t$ is made available via an inbound message. For the second part of step 3, we must know all $\zeta^r_{t+1}$ for which $r \ni i$ and $r \ne s$. This happens as long as we evaluate the updates while respecting the partial order explained above. Then all missing $\zeta^r_{t+1}$ are provided by delay-0 messages, as defined in (18). This completes the proof of Theorem 3.

VIII Conclusion

This paper uses dynamic programming to derive optimal policies for a general class of decentralized linear quadratic state-feedback problems. As noted in Section IV, the solution generalizes many existing works on decentralized state-feedback control [6, 21, 22].

The key concept in this paper is the information graph. Its nodes are particular subsets of the set of all nodes in the network, and its edges show how available information evolves over time. This turns out to be closely related to the structure of the optimal decentralized controller. For example, the number of Riccati difference equations that must be propagated is equal to the number of connected components in the information graph.

As discussed in Section V, many possible avenues for future research remain open. For example, some special cases with noisy measurements or correlated noise have been solved, but extensions to general directed graphs with mixed sparsity and delays have yet to be found.
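The Riccati-type recursion (31)-(32) from Section VII-C can be sketched numerically. The following is a minimal illustration, not the authors' implementation. It assumes time-invariant matrices, a scalar state and input at every network node, and the standard closed-form minimizer of a quadratic form, $K = -\Gamma_{\varphi\varphi}^{-1} \Gamma_{\varphi\zeta}$ and $X = \Gamma_{\zeta\zeta} - \Gamma_{\zeta\varphi} \Gamma_{\varphi\varphi}^{-1} \Gamma_{\varphi\zeta}$, in place of the expression (14), which is stated outside this section. The function name `backward_recursion` and the dictionaries `succ` and `source` are hypothetical names introduced only for this sketch.

```python
import numpy as np

def backward_recursion(A, B, Q, R, S, Qf, W, succ, source, T):
    """Propagate X^r_t, K^r_t and c_t backwards from t = T to t = 0.

    succ maps each information-graph node r (a frozenset of 0-based network
    node indices) to its unique successor s. source maps i to the node s with
    w^i -> s. Assumption: every network node has a scalar state and input.
    """
    idx = {r: sorted(r) for r in succ}
    X = {r: Qf[np.ix_(idx[r], idx[r])] for r in succ}  # terminal X^r_T = Qf^{rr}
    c, gains = 0.0, []
    for _ in range(T):
        # (32): c_t = c_{t+1} + sum_i (X^s_{t+1})_{ii} W_ii, where w^i -> s
        for i, s in source.items():
            p = idx[s].index(i)
            c += X[s][p, p] * W[i, i]
        X_new, K = {}, {}
        for r, s in succ.items():
            rr = np.ix_(idx[r], idx[r])
            Asr = A[np.ix_(idx[s], idx[r])]
            Bsr = B[np.ix_(idx[s], idx[r])]
            # Blocks of Gamma^r_t from (31)
            G_zz = Q[rr] + Asr.T @ X[s] @ Asr
            G_zp = S[rr] + Asr.T @ X[s] @ Bsr
            G_pp = R[rr] + Bsr.T @ X[s] @ Bsr
            # Unconstrained minimizer of the quadratic form in phi^r
            K[r] = -np.linalg.solve(G_pp, G_zp.T)
            X_new[r] = G_zz + G_zp @ K[r]
        X = X_new
        gains.insert(0, K)  # gains[t][r] is K^r_t
    return X, gains, c

# Hypothetical three-node information graph with nodes {0}, {1,2}, {2}, {0,1,2}
n0, n12, n2, nall = map(frozenset, ({0}, {1, 2}, {2}, {0, 1, 2}))
succ = {n0: nall, n12: nall, n2: n2, nall: nall}
source = {0: n0, 1: n12, 2: n2}
A3 = np.array([[0.9, 0.3, 0.0], [0.2, 0.8, 0.0], [0.1, 0.4, 0.7]])
B3, Q3, R3 = np.eye(3), np.eye(3), np.eye(3)
X3, K3, c3 = backward_recursion(A3, B3, Q3, R3, np.zeros((3, 3)), np.eye(3),
                                0.5 * np.eye(3), succ, source, T=4)
```

Because the node containing every network node points to itself, its matrix `X3[nall]` follows the ordinary centralized Riccati recursion, which gives a simple sanity check on the sketch.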
IX Acknowledgements

The first author thanks John Doyle for very helpful discussions. The second author would like to thank Ashutosh Nayyar for some very helpful discussions.

References

[1] A. Gattami. Generalized linear quadratic control. IEEE Transactions on Automatic Control, 55(1):131-136, 2010.
[2] M. Green and D. J. N. Limebeer. Linear Robust Control. Prentice Hall, 1995.
[3] Y.-C. Ho and K.-C. Chu. Team decision theory and information structures in optimal control problems, Part I. IEEE Transactions on Automatic Control, 17(1):15-22, 1972.
[4] B.-Z. Kurtaran and R. Sivan. Linear-quadratic-Gaussian control with one-step-delay sharing pattern. IEEE Transactions on Automatic Control, 19(5):571-574, 1974.
[5] A. Lamperski and J. C. Doyle. On the structure of state-feedback LQG controllers for distributed systems with communication delays. In IEEE Conference on Decision and Control, 2011.
[6] A. Lamperski and J. C. Doyle. Dynamic programming solutions for decentralized state-feedback LQG problems with communication delays. In American Control Conference, 2012.
[7] A. Lamperski and J. C. Doyle. Output feedback H2 model matching for decentralized systems with delays. In American Control Conference, 2013. To appear.
[8] A. Lamperski and L. Lessard. Optimal state-feedback control under sparsity and delay constraints. In 3rd IFAC Workshop on Distributed Estimation and Control in Networked Systems, 2012.
[9] L. Lessard. Decentralized LQG control of systems with a broadcast architecture. In IEEE Conference on Decision and Control, 2012.
[10] L. Lessard. Optimal control of a fully decentralized quadratic regulator. In Allerton Conference on Communication, Control, and Computing. IEEE, 2012.
[11] L. Lessard, M. Kristalny, and A. Rantzer. On structured realizability and stabilizability of linear systems. In American Control Conference, 2013. To appear.
[12] L. Lessard and S. Lall. Optimal controller synthesis for the decentralized two-player problem with output feedback. In American Control Conference, 2012.
[13] L. Lessard and S. Lall. Optimal control of two-player systems with output feedback. ArXiv e-prints, Mar.
[14] L. Lessard and A. Nayyar. Structural results and explicit solution for two-player LQG systems on a finite time horizon. ArXiv e-prints, Mar.
[15] A. Nayyar, A. Mahajan, and D. Teneketzis. Optimal control strategies in delayed sharing information structures. IEEE Transactions on Automatic Control, 56(7), 2011.
[16] X. Qi, M. V. Salapaka, P. G. Voulgaris, and M. Khammash. Structured optimal and robust control with multiple criteria: a convex solution. IEEE Transactions on Automatic Control, 49(10).
[17] A. Rantzer. Linear quadratic team theory revisited. In American Control Conference.
[18] A. Rantzer. A separation principle for distributed control. In IEEE Conference on Decision and Control.
[19] M. Rotkowitz and S. Lall. A characterization of convex problems in decentralized control. IEEE Transactions on Automatic Control, 51(2).
[20] N. R. Sandell and M. Athans. Solution of some nonclassical LQG stochastic decision problems. IEEE Transactions on Automatic Control, 19(2):108-116, 1974.
[21] P. Shah and P. A. Parrilo. H2-optimal decentralized control over posets: A state space solution for state-feedback. In IEEE Conference on Decision and Control, 2010.
[22] J. Swigart. Optimal Controller Synthesis for Decentralized Systems. PhD thesis, Stanford University, 2010.
[23] J. Swigart and S. Lall. An explicit dynamic programming solution for a decentralized two-player optimal linear-quadratic regulator. In Symposium on the Mathematical Theory of Networks, 2010.
[24] J. Swigart and S. Lall. An explicit state-space solution for a decentralized two-player optimal linear-quadratic regulator. In American Control Conference, 2010.
[25] J. Tsitsiklis and M. Athans. On the complexity of decentralized decision making and detection problems. IEEE Transactions on Automatic Control, 30(5), 1985.
[26] A. S. M. Vamsi and N. Elia. Network realizability for interconnected systems over arbitrary one-step delay networks. In American Control Conference, 2012.
[27] P. Varaiya and J. Walrand. On delayed sharing patterns. IEEE Transactions on Automatic Control, 23(3), 1978.
[28] H. S. Witsenhausen. A counterexample in stochastic optimum control. SIAM Journal on Control, 6(1):131-147, 1968.
[29] T. Yoshikawa. Dynamic programming approach to decentralized stochastic control problems. IEEE Transactions on Automatic Control, 20(6), 1975.
[30] T. Yoshikawa and H. Kobayashi. Separation of estimation and control for decentralized stochastic control systems. Automatica, 14(6), 1978.

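A Information Set Properties

We now prove some useful properties of information sets.

Lemma 10. The information sets (4) may be expressed recursively as follows,

$$\mathcal{I}^i_{-1} = \emptyset, \qquad \mathcal{I}^i_t = \{ x^j_t : j \in V,\ D_{ij} = 0 \} \ \cup \bigcup_{j \in V,\ D_{ij} \le 1} \mathcal{I}^j_{t-1} \quad \text{for } t \ge 0. \tag{34}$$

Furthermore, suppose $i, j \in V$ and $0 \le k \le t$. The following are equivalent.

(i) $\mathcal{I}^j_k \subseteq \mathcal{I}^i_t$
(ii) $x^j_k \in \mathcal{I}^i_t$
(iii) $D_{ij} \le t - k$

Proof. We first prove that (i)-(iii) are equivalent.

(i) $\Rightarrow$ (ii): It is immediate from (4) that $x^j_k \in \mathcal{I}^j_k$. So it follows that $x^j_k \in \mathcal{I}^i_t$.

(ii) $\Rightarrow$ (iii): It follows from (4) that if $x^j_k \in \mathcal{I}^i_t$, then we must have $0 \le k \le t - D_{ij}$. Therefore $D_{ij} \le t - k$, as required.

(iii) $\Rightarrow$ (i): By the triangle inequality, $D_{i\alpha} - D_{j\alpha} \le D_{ij}$ for any $\alpha \in V$. Therefore, if $D_{ij} \le t - k$, then $k - D_{j\alpha} \le t - D_{i\alpha}$. So for any $l$ that satisfies $0 \le l \le k - D_{j\alpha}$, we must also have $0 \le l \le t - D_{i\alpha}$. It follows from (4) that $\mathcal{I}^j_k \subseteq \mathcal{I}^i_t$.

We now derive the recursive expression for $\mathcal{I}^i_t$. Start with (4), which we rewrite here for convenience,

$$\mathcal{I}^i_t = \{ x^j_k : j \in V,\ 0 \le k \le t - D_{ij} \}.$$

Partition into two cases: when $k = t$ (which implies $D_{ij} = 0$), and when $k \le t - 1$. Then, partition further based on the value of $D_{ij}$.

$$\begin{aligned}
\mathcal{I}^i_t &= \{ x^j_t : D_{ij} = 0 \} \cup \{ x^j_k : 0 \le k \le t - D_{ij},\ k \le t - 1 \} \\
&= \{ x^j_t : j \in V,\ D_{ij} = 0 \} \cup \{ x^j_k : j \in V,\ 0 \le k \le t - 1 - D_{ij},\ D_{ij} = 0 \} \cup \{ x^j_k : j \in V,\ 0 \le k \le t - D_{ij},\ D_{ij} \ge 1 \}.
\end{aligned}$$

In the last term, $D_{ij} \ge 1$ means that there is a path $j \to i$ with an aggregate delay of at least 1, so there exists an intermediate node $l \to i$ where $D_{il} = 1$ and $D_{lj} + 1 = D_{ij}$. Therefore,

$$\begin{aligned}
\mathcal{I}^i_t &= \{ x^j_t : j \in V,\ D_{ij} = 0 \} \cup \{ x^j_k : j \in V,\ 0 \le k \le t - 1 - D_{ij},\ D_{ij} = 0 \} \cup \bigcup_{l \to i,\ D_{il} = 1} \{ x^j_k : j \in V,\ 0 \le k \le t - 1 - D_{lj} \} \\
&= \{ x^j_t : D_{ij} = 0 \} \cup \bigcup_{l \in V,\ D_{il} \le 1} \{ x^j_k : 0 \le k \le t - 1 - D_{lj} \} \\
&= \{ x^j_t : j \in V,\ D_{ij} = 0 \} \cup \bigcup_{j \in V,\ D_{ij} \le 1} \mathcal{I}^j_{t-1},
\end{aligned}$$

as required.

B Proof of Lemma 6

Suppose that $u^j_\tau$ affects $\mathcal{I}^i_t$ in the simplest way possible; namely that $u^j_\tau$ affects $x^l_\sigma$ at a future timestep $\sigma > \tau$ via recursive applications of the state equations (1), and $x^l_\sigma \in \mathcal{I}^i_t$. Then we have

$$u^j_\tau \text{ affects } x^l_\sigma \implies D_{lj} \le \sigma - \tau \tag{35}$$

$$x^l_\sigma \in \mathcal{I}^i_t \implies D_{il} \le t - \sigma. \tag{36}$$

Adding (35)-(36) together and using the triangle inequality, we obtain $D_{ij} \le t - \tau$. By Lemma 10, it follows that $\mathcal{I}^j_\tau \subseteq \mathcal{I}^i_t$, as required. If $u^j_\tau$ affects $\mathcal{I}^i_t$ via a more complicated path, apply the above argument to each consecutive pair of inputs along the path to obtain the chain of inclusions $\mathcal{I}^j_\tau \subseteq \cdots \subseteq \mathcal{I}^i_t$.

With partial nestedness established, Lemma 5 implies that there is a unique linear optimal controller. In particular, the optimal $u^i_t$ is a linear function of $\mathcal{I}^i_t$. The same is trivially true of $x^i_t$ since we have $x^i_t \in \mathcal{I}^i_t$ from (4). Thus the first part of (19) is verified.

We proceed by induction to prove the second part of (19). At $t = 0$, we clearly have $x_0, u_0 \in \operatorname{lin} \mathcal{H}_0$, since

$$\bigcup_{i \in V} \mathcal{I}^i_0 = \{x^1_0, \ldots, x^N_0\} = \mathcal{H}_0.$$

Now suppose that for some $t \ge 0$, we have $x_t, u_t \in \operatorname{lin} \mathcal{H}_t$. Applying the state equations (2), it follows that

$$\begin{aligned}
x_{t+1} &\in \operatorname{lin}\{ \mathcal{H}_t \cup \{w_t\} \} = \operatorname{lin} \mathcal{H}_{t+1}, \\
u_{t+1} &\in \operatorname{lin} \bigcup_{i \in V} \mathcal{I}^i_{t+1} = \operatorname{lin}\{ \{x_{t+1}\} \cup \mathcal{H}_t \} = \operatorname{lin} \mathcal{H}_{t+1},
\end{aligned}$$

and the proof is complete.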
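The recursion (34) of Lemma 10 and the equivalence of (i)-(iii) can be checked by brute force on a small example. The sketch below is only an illustration under stated assumptions: the delay matrix `D` is a hypothetical three-node example (0-based indices, `D[i][j]` being the delay from node `j` to node `i`, infinite when no path exists), and the function names are introduced here for convenience. States $x^j_k$ are represented by pairs `(j, k)`.

```python
import math

INF = math.inf
# Hypothetical delay matrix: shortest-path delays of a graph whose edges have
# delay 0 or 1, so the triangle inequality holds.
D = [[0, 1, INF],
     [1, 0, INF],
     [1, 0, 0]]
N = len(D)

def info_direct(i, t):
    """I^i_t from the definition (4): all x^j_k with 0 <= k <= t - D_ij."""
    return {(j, k) for j in range(N) for k in range(t + 1) if k <= t - D[i][j]}

def info_recursive(i, t):
    """I^i_t from the recursion (34), with I^i_{-1} empty."""
    if t < 0:
        return set()
    result = {(j, t) for j in range(N) if D[i][j] == 0}
    for j in range(N):
        if D[i][j] <= 1:
            result |= info_recursive(j, t - 1)
    return result

# The two constructions agree for every node and every time up to 5.
recursion_ok = all(info_direct(i, t) == info_recursive(i, t)
                   for i in range(N) for t in range(6))

# Statement (i) holds exactly when statement (iii) holds.
equivalence_ok = all(
    (info_direct(j, k) <= info_direct(i, t)) == (D[i][j] <= t - k)
    for i in range(N) for j in range(N)
    for t in range(6) for k in range(t + 1))
```

The check relies on `D` satisfying the triangle inequality with unit-delay steps, as the proof does; an arbitrary matrix would not be expected to pass.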

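C Proof of Lemma 7

Part (i). We proceed by induction. At $t = 0$, we have $\mathcal{H}_0 = \{x^1_0, \ldots, x^N_0\}$. Since each $w^i$ points to exactly one element $s \in U$, it is clear from (20) that $\{\mathcal{L}^s_0\}_{s \in U}$ partitions $\mathcal{H}_0$. Now suppose that $\{\mathcal{L}^s_t\}_{s \in U}$ partitions $\mathcal{H}_t$ for some $t \ge 0$. By Proposition 1, for each $r \in U$ there exists a unique $s \in U$ such that $r \to s$. Therefore each element $w^i_k \in \mathcal{H}_t$ is contained in exactly one label set $\mathcal{L}^s_{t+1}$. It follows from (20) that $\{\mathcal{L}^s_{t+1}\}_{s \in U}$ must partition $\mathcal{H}_{t+1}$ and the proof is complete.

Part (ii). Again, by induction. At $t = 0$,

$$\mathcal{I}^i_0 = \{x^j_0 : D_{ij} = 0\} = \{x^j_0 : s^j_0 \ni i\} = \{x^j_0 : w^j \to s,\ s \ni i\} = \bigcup_{s \ni i} \mathcal{L}^s_0.$$

So the identity holds at $t = 0$. Now suppose it holds for some $t \ge 0$. By Lemma 10,

$$\mathcal{I}^i_{t+1} = \{x^j_{t+1} : j \in V,\ D_{ij} = 0\} \ \cup \bigcup_{j \in V,\ D_{ij} \le 1} \mathcal{I}^j_t. \tag{37}$$

Applying the state equations (1), we have:

$$x^j_{t+1} = \sum_{l \in V,\ D_{jl} \le 1} \left( A^{jl} x^l_t + B^{jl} u^l_t \right) + w^j_t.$$

By Lemma 6, the $x^l_t$ and $u^l_t$ terms belong to $\operatorname{lin} \mathcal{I}^l_t$ where $D_{jl} \le 1$. Therefore, taking the lin of both sides of (37),

$$\begin{aligned}
\operatorname{lin} \mathcal{I}^i_{t+1} &= \operatorname{lin}\{w^j_t : j \in V,\ D_{ij} = 0\} + \sum_{j \in V,\ D_{ij} \le 1} \operatorname{lin} \mathcal{I}^j_t \\
&= \operatorname{lin}\{w^j_t : j \in V,\ D_{ij} = 0\} + \sum_{j \in V,\ D_{ij} \le 1} \operatorname{lin} \bigcup_{r \ni j} \mathcal{L}^r_t \\
&= \operatorname{lin} \bigcup_{s \ni i} \mathcal{L}^s_{t+1},
\end{aligned}$$

where the second equality follows from the induction hypothesis and the third equality follows from the label set recursion (20).

D Proof of Lemma 8

Suppose $\zeta^s_t$ satisfies (27)-(28). The recursive formulation of the label sets (20) implies that $\zeta^s_t \in \operatorname{lin} \mathcal{L}^s_t$ for all $t \ge 0$ and all $s \in U$. All that remains to be shown is that $\zeta^s_t$ indeed provides a decomposition of $x_t$ as in (25). The decomposition of $x_t$ is satisfied at $t = 0$, since for all $i \in V$ there is a unique $s \in U$ with $w^i \to s$. Now assume inductively that (25) holds for some $t \ge 0$. Therefore,

$$\begin{aligned}
\sum_{s \in U} I^{V,s} \zeta^s_{t+1} &= \sum_{s \in U} I^{V,s} \left( \sum_{r \to s} \left( A^{sr} \zeta^r_t + B^{sr} \varphi^r_t \right) + \sum_{w^i \to s} I^{s,\{i\}} w^i_t \right) \\
&= \sum_{s \in U} \left( \sum_{r \to s} \left( A I^{V,r} \zeta^r_t + B I^{V,r} \varphi^r_t \right) + \sum_{w^i \to s} I^{V,\{i\}} w^i_t \right) \\
&= \sum_{s \in U} \left( A I^{V,s} \zeta^s_t + B I^{V,s} \varphi^s_t \right) + \sum_{i \in V} I^{V,\{i\}} w^i_t \\
&= A x_t + B u_t + w_t = x_{t+1},
\end{aligned}$$

where we have substituted the dynamics (28) in the first step, and the induction hypothesis in the fourth step. In the second step, we took advantage of the particular sparsity structures of $A$ and $B$, which imply $A I^{V,r} = I^{V,s} A^{sr}$ and $B I^{V,r} = I^{V,s} B^{sr}$. The third step uses the fact that every $r \in U$ has exactly one successor, so the double sum over $s$ and $r \to s$ visits each node of $U$ once. It follows that (25) holds for $t + 1$.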
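The label-set recursion (20) and the partition property of Lemma 7(i) can be illustrated with a short script. This is a sketch under assumptions, not the paper's code. It assumes the information-graph construction $s^j_k = \{i : D_{ij} \le k\}$ with edges $s^j_k \to s^j_{k+1}$ and $w^j \to s^j_0$, which is consistent with the identity $\mathcal{I}^i_0 = \{x^j_0 : s^j_0 \ni i\}$ used in Appendix C, and it uses a hypothetical three-node delay matrix `D` with 0-based indices. Noise terms are encoded as tuples `("x0", i)` and `("w", i, t)`.

```python
import math

INF = math.inf
# Hypothetical delay matrix; D[i][j] is the delay from node j to node i.
D = [[0, 1, INF],
     [1, 0, INF],
     [1, 0, 0]]

def info_graph(D):
    """Return (succ, source): the successor of each node and the entry node of each w^j."""
    n = len(D)
    succ, source = {}, {}
    for j in range(n):
        prev = None
        for k in range(n + 1):  # the sets stop growing after at most n - 1 steps
            s = frozenset(i for i in range(n) if D[i][j] <= k)
            if k == 0:
                source[j] = s
            else:
                # each node must have a unique successor (Proposition 1)
                assert succ.setdefault(prev, s) == s
            prev = s
    return succ, source

def label_sets(succ, source, T):
    """Return a list whose entry t maps each node s to its label set at time t, per (20)."""
    L = {s: {("x0", i) for i, si in source.items() if si == s} for s in succ}
    history = [L]
    for t in range(T):
        nxt = {s: {("w", i, t) for i, si in source.items() if si == s} for s in succ}
        for r, s in succ.items():
            nxt[s] |= L[r]  # terms in L^r_t move to the successor of r
        L = nxt
        history.append(L)
    return history

def available(history, i, t):
    """Noise terms in the union of L^s_t over all nodes s that contain i, as in (22)."""
    return set().union(*(terms for s, terms in history[t].items() if i in s))

succ, source = info_graph(D)
history = label_sets(succ, source, T=3)
```

For this `D`, `available(history, 1, 3)` contains the initial states of the first two nodes together with `w` terms of the first node up to time 1 and of the second node up to time 2, which matches the pattern $\{x^1_0, x^2_0, w^1_{0:t-2}, w^2_{0:t-1}\}$ quoted after (23) when read with 1-based node labels.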
