Online Partially Model-Free Solution of Two-Player Zero Sum Differential Games

Size: px
Start display at page:

Download "Online Partially Model-Free Solution of Two-Player Zero Sum Differential Games"

Transcription

1 Preprins of he 10h IFAC Inernaional Symposium on Dynamics and Conrol of Process Sysems The Inernaional Federaion of Auomaic Conrol Online Parially Model-Free Soluion of Two-Player Zero Sum Differenial Games Praveen P Shubhendu Bhasin Deparmen of Elecrical Engineering, Indian Insiue of Technology Delhi, India ( eez138263@ee.iid.ac.in) Deparmen of Elecrical Engineering, Indian Insiue of Technology Delhi, India ( sbhasin@ee.iid.ac.in) Absrac: An online adapive dynamic programming based ieraive algorihm is proposed for a wo-player zero sum linear differenial game problem arising in he conrol of process sysems affeced by disurbances. The objecive in such a scenario is o obain an opimal conrol policy ha minimizes he specified performance index or cos funcion in presence of wors case disurbance. Convenional algorihms for he soluion of such problems require full knowledge of sysem dynamics. The algorihm proposed in his paper is parially model-free and solves he wo-player zero sum linear differenial game problem wihou knowledge of sae and conrol inpu marices. Keywords: Two-player zero sum differenial game, Adapive dynamic programming, Approximae Dynamic Programming 1. INTRODUCTION In a wo-player zero sum game, each player s gain is exacly balanced by he loss of his compeior and he ne gain of he players a any poin in ime is zero [Isaacs (1965)]. In scenarios of perfec compeiion such as ha exis in a zero sum game, each player ries o make he bes possible decision aking ino accoun he fac ha his opponen also ries o do he same. The heory of zero sum differenial game finds applicaions in a wide variey of disciplines including ha of conrol heory [Tomlin e al. (2000); Wei and Liu (2012)]. The problem of designing opimal conroller for any process sysem subjec o wors possible disurbance is a minimax opimizaion problem where he conroller ries o minimize and he disurbance ries o maximize he infinie horizon quadraic cos. Using game heoreic framework, he above minimax opimizaion problem can be viewed as a zero sum differenial game where he conroller acs as he minimizer or minimizing player, while he disurbance acs as he maximizer or maximizing player [Basar and Bernhard (1995); Basar and Olsder (1995)]. The opimal conrol in such a scenario is equivalen o finding he Nash equilibrium or saddle poin equilibrium of he corresponding wo-player zero sum differenial game [Basar and Bernhard (1995)]. To obain he saddle poin equilibrium sraegy of each player in he wo-player zero sum linear differenial game, one needs o solve an Algebraic Riccai Equaion (ARE) wih a sign indefinie quadraic erm known as he Game Algebraic Riccai Equaion (GARE). [Kleinman (1968)] proposed an ieraive mehod for solving he ARE wih a sign definie quadraic erm. A series of Lyapunov equaions are consruced a each ieraion and he posiive semi-definie soluions of he Lyapunov equaions are shown o converge o he soluion of ARE. However he algorihm proposed in [Kleinman (1968)] for solving he ARE could no be exended o he GARE due o he presence of a sign indefinie quadraic erm. Subsequenly, several Newon-ype algorihms were proposed for he soluion he GARE [Arnold and Laub (1984); Damm and Hinrichsen (2001); Mehrmann and Tan (1988)]. [Mehrmann (1991)] and [Sima (1996)] proposed a marix sign funcion mehod for solving GARE. A more robus ieraive algorihm for solving GARE was proposed by [Lanzon e al. (2008)], where he GARE wih a sign indefinie quadraic erm is replaced by a sequence of AREs wih sign definie quadraic erms. Each of hese AREs can hen be sequenially solved using Kleinman s algorihm or any oher exising algorihm and he recursive soluion of hese AREs converge o he soluion of GARE. However, all he above resuls require knowledge of full sysem dynamics, which is a severe resricion owing o he uncerainies in sysem modelling. The concep of Adapive Dynamic Programming (ADP) was proposed by [Werbos (1992)] for solving he dynamic programming problems relaed o classical opimal conrol in a forward in ime fashion. ADP is based on he conceps of Dynamic Programming [Bellman (2003)] and Reinforcemen Learning [Suon and Baro (1998)] and has been widely used o reach approximae soluions of opimal conrol problems [Abu-Khalaf and Lewis (2005); Lewis and Liu (2013)]. The classical opimal conrol problem is a single player linear differenial game problem [Isaacs (1965)] and a deailed analysis of he applicaion of ADP for he soluion of single player linear differenial game problems (opimal conrol problems) is provided in [Wang e al. (2009)] and [Lewis e al. (2012)]. Copyrigh 2013 IFAC 696

2 Recenly, [Vrabie e al. (2009)] proposed a parially modelfree, online algorihm for finding he soluion of single player zero sum differenial game in coninuous ime. However he proposed algorihm required parial knowledge of sysem dynamics, namely he conrol inpu marix. Subsequenly, a compleely model-free algorihm was proposed by [Jiang and Jiang (2012)] in order o arrive a he opimal conrol of linear sysem wihou requiring knowledge of he conrol inpu and he sysem drif marices. Exending he principle of ADP o differenial games, a model-free algorihm was proposed by [Al-Tamimi e al. (2007)] for solving he wo-player linear differenial zero sum game appearing in discree ime H conrol. [Abu- Khalaf e al. (2008)] proposed an algorihm o solve coninuous ime wo-player nonlinear differenial game problem wih full knowledge of sysem dynamics. In [Vrabie and Lewis (2011)], he saddle poin soluion of a wo player zero sum differenial game was obained using a parially model-free algorihm. The algorihm required no knowledge of sysem drif dynamics, while he knowledge of conrol inpu and disurbance marices was sill needed. In his paper, an online parially model free ieraive algorihm is proposed o solve he wo-player zero sum differenial game problem. The conribuion of his resul over exising resuls in he lieraure is ha he saddle poin soluion is arrived a wihou he knowledge of sysem drif and conrol inpu marices. Only knowledge of he disurbance inpu marix is required and hence, he proposed algorihm is comparaively more robus han he one proposed in [Vrabie and Lewis (2011)]. The proposed algorihm hereby reduces he need of rigorous sysem idenificaion requiremens, which oherwise is necessary for obaining he sysem drif and conrol inpu marices. The algorihm is daa-driven and he saddle poin soluion of he game problem can be achieved online. The algorihm makes use of wo ieraive seps namely, he ouer level ieraion and he inner level ieraion. In he ouer level ieraion, he wo-player zero sum differenial game problem is convered o a sequence of opimal conrol problems. The inner level ieraion moivaed by he work of [Jiang and Jiang (2012)], is used o find he modelfree soluion of each of hese opimal conrol problems. I is shown ha he soluion of opimal conrol problems converge o he saddle poin equilibrium soluion of he underlying game problem. The algorihm is implemened on a linearized power sysem model [Wang e al. (1963)] o arrive a he opimal conroller in presence of wors case disurbance. The algorihm can be exended o any process sysem wih parially unknown dynamics. 2.1 Problem Formulaion 2. PRELIMINARIES The H conroller design was formulaed by [Basar and Bernhard (1995)] as a wo-player zero sum differenial game. Consider he differenial game formulaion given below, ẋ = Ax + B 1 w + B 2 u (1) y = Cx, (2) where x R n is he sae vecor assumed o be measurable, u R m is he conrol inpu and w R d is he disurbance inpu. The conroller player s objecive is o minimize he infinie horizon cos funcion given by, J(x(0), u(), w()) = (x T C T Cx + u T u w T w)d. (3) A he same ime, he disurbance player who is in perfec compeiion wih he conroller player ries o maximize he cos funcion (3). The opimal sraegies o be adoped by he players in he zero sum game is ermed as he saddle-poin equilibrium. The saddle poin equilibrium sraegy for each player is defined as follows:- Definiion 1. A pair of sraegies {u, w } is in saddle poin equilibrium, if he following se of inequaliies are saisfied for all permissible u and w [Basar and Olsder (1995)]: J(x, u, w) J(x, u, w ) J(x, u, w ) (4) The quaniy J(x, u, w ) is called he saddle poin value of he zero sum game. The wo-player zero sum differenial game described above is a wo-player opimizaion problem and he saddle poin sraegies can be found by using he Ponryagin s Maximum Principle. Hamilonian for he cos funcion in (1) and he sysem in (3) can be defined as, H(x, u, v, λ) = 1 2 (xt C T Cx + u T u w T w) + λ T (Ax + B 1 w + B 2 u). (5) Applying he necessary condiion for opimaliy of H u = 0 and H w = 0, u = B T 2 Π x() (6) w = B T 1 Π x(), (7) where Π is he posiive definie symmeric soluion o he ARE given by, 0 = A T Π + Π A + C T C Π (B 2 B T 2 B 1 B T 1 )Π. (8) The saddle poin equilibrium value is given by, J(x(), u, w ) = x() T Π x(). (9) Hence, o obain he saddle poin equilibrium sraegy of each player in he wo-player zero sum differenial game, one needs o solve he ARE (8) wih he sign indefinie quadraic erm, Π(B 2 B T 2 B 1 B T 1 )Π. 2.2 Ieraive Soluion Using Knowledge of Full Sysem Dynamics The ieraive soluion of he GARE (8) wih full knowledge on sysem dynamics was given by [Lanzon e al. (2008)]. Consider real marices A, B 1, B 2 and C, where (C,A) is observable and (A, B 2 ) is sabilizable. Then, he ieraive mehod for solving he ARE (8) is given as [Lanzon e al. (2008)]: 697

3 1. Sar wih Π 0 = 0. (10) 2. Solve, for he unique posiive definie Z k where k 0, 0 = (A + B 1 B T 1 Π k 1 B 2 B T 2 Π k 1 ) T Z k where, + Z k (A + B 1 B T 1 Π k 1 B 2 B T 2 Π k 1 ) F (Π k 1 ) = A T Π k 1 + Π k 1 A + C T C 3. Updae Z k B 2 B T 2 Z k + F (Π k 1 ), (11) Π k 1 (B 2 B T 2 B 1 B T 1 )Π k 1. (12) Π k = Π k 1 + Z k. (13) where he Π k and Z k possess he following properies: (1) (A + B 1 B T 1 Π k, B 2 ) is sabilizable k (2) F (Π k+1 ) = Z k B 1 B T 1 Z k k (3) A + B 1 B T 1 Π k B 2 B T 2 Π k+1 is Hurwiz k (4) Π Π k+1 Π k 0 k Then, lim k Πk = Π. (14) The convergence proof of he algorihm o he unique posiive definie soluion, Π of he GARE is provided by [Lanzon e al. (2008)]. 3.1 Soluion Approach 3. MAIN RESULTS In he wo-player zero sum differenial game defined hrough (1) and (3), he conroller and he disurbance are wo compeing players. If any of he players adops a consan sraegy, i becomes a single player opimizaion problem or a single player differenial game problem (opimal conrol problem). In he proposed game sraegy, he conroller acs as he acive, opimising player and disurbance acs a passive player, whose policy remains consan during a ime frame. Wih disurbance policy remaining consan, he wo-player zero sum game akes he shape of a classical opimal conrol problem. The soluion approach proposed is o break down he original wo-player zero sum differenial game ino a sequence of opimal conrol problems. The acive conroller player seeks he opimal policy which minimizes he cos funcion of he associaed opimal conrol problem. Afer he conroller player converges o his opimal policy, he disurbance player updaes his policy accordingly. Wih he changed disurbance policy, a new opimal conrol problem is consruced and he conroller player hen seeks he new opimal policy. The aim of his paper is o arrive a he saddle poin soluion of he game wih less dependency on he knowledge of sysem dynamics. This objecive ranslaes o solving each of he opimal conrol problem wihou knowledge of A and B 2 marices. The proposed algorihm only requires knowledge of B 1 for implemenaion. The game soluion is achieved using wo ieraive seps namely ouer level ieraion and he inner level ieraion. In each ouer level ieraion, an opimal conrol problem is consruced from he underlying game problem. The inner level ieraion process is embedded wihin each ouer level ieraion and finds he online soluion of each of hose opimal conrol problems. I is shown ha he inner level ieraions converge o he soluion of he corresponding opimal conrol problem and he ouer level ieraions converge o he saddle poin soluion of he game problem. Ouer Level Ieraion The ouer level ieraion is moivaed by [Lanzon e al. (2008)], where an opimal conrol problem is consruced considering ha he disurbance player acs as a passive player whose policy remains consan during each ieraion. The conroller player, afer considering he disurbance player s policy, hen minimizes he cos funcion by solving he corresponding opimal conrol problem. Le a he k h ouer level ieraion, he disurbance player chooses he sraegy, w = B T 1 Π k 1 x. Considering ha he sraegy of he passive disurbance player remains consan, he game cos funcion (3) can be ransformed ino he following opimal conrol form, J(x(0), u()) = (x T (C T C Π k 1 B 1 B T 1 Π k 1 )x + u T u)d. (15) subjec o sysem dynamics given by,. x = (A + B 1 B T 1 Π k 1 )x + B 2 u. (16) Lemma 2. The soluion of opimal conrol problem defined by (15) and (16) is equivalen o he ieraion seps defined by (11) and (13). Proof. The soluion of he opimal conrol problem defined in (15) and (16) can be found by solving he ARE, 0 = (A + B 1 B T 1 Π k 1 ) T Π k + Π k (A + B 1 B T 1 Π k 1 ) Π k B 2 B T 2 Π k + (C T C Π k 1 B 1 B T 1 Π k 1 ). (17) In he equaion (11), subsiuing for F (Π k 1 ) from (12) and Z k from (13) resul in he ARE (17), which proves he lemma. A he k h ouer level ieraion, he opimal conrol problem o minimize he cos funcion (15) will be found subjec o he modified sysem dynamical equaion (16) resuling in he opimal values, Π k and K k. The conroller sraegy hen becomes, u k = K k x. The disurbance being a passive player, adops he policy w k = B1 T Π k x. Wih he new disurbance policy, he ouer level ieraion is carried ou where he opimal conrol problem defined by (15) and (16) are modified and he opimal conrol soluion is sough again o obain K k+1 and Π k+1. The process is coninued ill convergence where K k+1 K k ɛ 1, where ɛ 1 is a specified hreshold. A convergence, he policies u k = K k+1 x and w k = B T 1 Π k+1 x converge o he saddle poin equilibrium values u = K x and w = B T 1 Π x, for conroller and disurbance respecively. 698

4 Inner Level Ieraion The opimal conrol problem consruced in each ouer level ieraion is solved using he inner level ieraion process. The inner level ieraion is an ADP algorihm based on he work by [Jiang and Jiang (2012)]. I comprises of a policy evaluaion sage and a policy improvemen sage. A he policy evaluaion sage, he curren conrol policy is evaluaed and he policy is updaed in he policy improvemen sage. Lemma 3. The opimal conrol problem defined by (15) and (16) can be solved by sequenially using he following Lyapunov equaions [Kleinman (1968)], (A k i ) T Π i + Π i A k i + C T C Π k 1 B 1 B T 1 Π k 1 + (K i ) T K i = 0. (18) and performing policy updae as, K i+1 = B T 2 Π i, (19) where A k i = A + B 1B T 1 Π k 1 B 2 K i and i is he ieraion. Then, lim K i = K k. i Proof. The proof is provided by [Kleinman (1968)]. The Kleinman s algorihm guaranees he convergence of K i o he opimal value of k h ouer level ieraion, K k. Wih an inenion o arrive a he soluion of opimal conrol problem wihou knowledge on sysem marices A and B 2, he equaion (16) is reformulaed as [Jiang and Jiang (2012)],. x = A k i x + B 2 (u + K i x), (20) where A k i = A + B 1B1 T Π k 1 B 2 K i, k is he ouer level ieraion sage and i is he curren inner level ieraion sage. Wih quadraic paramerizaion, he infinie horizon cos (15) a i h inner level ieraion of k h ouer level ieraion can be wrien as, J i (x()) = x() T Π i x() where Π i is a posiive definie symmeric marix. Then using (18), (19) and (20), dj i d = ẋ()t Π i x() + x() T Π i ẋ() = (A k i x + B 2 (u + K i x)) T Π i x() + x() T Π i (A k i x + B 2 (u + K i x)) = x() T ((A k i ) T Π i + Π i A k i )x() + 2(u + K i x) T B T 2 Π i x() = x() T (C T C Π k 1 B 1 B T 1 Π k 1 + K i T K i )x() + 2(u + K i x) T B T 2 Π i x() (21) Then using K i+1 = B T 2 Π i from (19), i follows from (21) ha, x() T Π i x() x( + T ) T Π i x( + T ) = x T (C T C Π k 1 B 1 B T 1 Π k 1 + K i T K i )dτ 2 where i is he inner ieraion sage. (u + K i x) T K i+1 xdτ, (22) The LHS of he equaion (22) can be reformulaed as x() T Π i x() x(+t ) T Π i x(+t ) = ( x() x(+t )) T ˆΠi. (23) where x() is a modified Kronecker produc vecor wih elemens {x p ()x q ()} p=1:n;q=1:n and ˆΠ i is a vecor formed by sacking he diagonal and upper riangular par of a symmeric marix ino a vecor whose off diagonal erms are muliplied by wo. The inegrand in he second par of RHS can hen be modified as (u + K i x) T K i+1 x = [(x T u T ) + (x T x T )(I n (K i ) T )] ˆK i+1. (24) where denoes Kronecker produc and ˆK i+1 is he vecor form of marix K i+1 sacking columns one over anoher. Now using (23) and (24), he enire equaion (22) can be rewrien as, ( x() x( + T )) T ˆΠi = + 2[(x T u T ) + (x T x T )(I n K T i )] ˆK i+1 x T (C T C Π k 1 B 1 B T 1 Π k 1 + (K i ) T K i )dτ. (25) The equaion (25) can be ransformed ino linear regression form given by Xθ = Y (26) where X = [( x() x(+t )) T 2((x T u T )+(x T x T )(I n Ki T ))], he regression vecor θ = [(Π ˆ i ) T ( ˆK i+1 ) T ] T, and Y = x T (C T C Π k 1 B 1 B T 1 Π k 1 + (K i ) T K i )dτ. Assuming C and B 1 are known, he cos X can be measured along he sae rajecory. Afer geing enough measuremens in Y and X, θ can be obained as a leas square esimae wihou knowledge of A and B 2. ˆθ = (X T X) 1 X T Y. (27) The above esimaion sage is he policy evaluaion sage of ADP, where Π i and K i+1 are esimaed from measuremens of he sysem saes. These measuremens can be considered as reinforcemens from he sysem, analogous o he one described in he heoreical framework of reinforcemen learning. The minimum number of measuremens required for he soluion of he leas square problem (27) equals he number of unknowns in he regression vecor, θ. Afer he policy evaluaion sage, policy updaion sage is performed where he curren policy is improved as u i+1 = K i+1 x + e, where e is he exploraion signal. Remark 4. The algorihm which is based on he ADP conceps of policy evaluaion and policy updaion requires persisence of exciaion (PE) so ha (27) is a well posed leas square problem. In cases wherein he sysem saes reach a saionary phase, persisence of exciaion condiion fails and i may affec he numerical sabiliy [Lee e al. (2012)] of he algorihm. Hence, exploraion signals 699

5 are added o he conrol inpu as i ensures ha sysem saes never become saionary and hence saisfy he PE condiion. Any sufficienly exciing signal like random noise or sinusoidal signal can be used as exploraion signal. The inner level ieraion defined by (26) is done again wih measuremens obained wih new conroller policy, u i+1 and he process is coninued ill convergence. A hreshold ɛ 2 is specified and in each ouer level ieraion, he inner level ieraions are carried ou ill K i+1 K i ɛ 2. The opimal conrol problem specified by (15) and (16) is solved once inner level ieraion is compleed. The opimal values Ki and Π i are hen be passed on for updaion in nex ouer level ieraion as K k and Π k. Remark 5. As K k is direcly esimaed hrough he inner level ieraion process, he opimal conrol can be arrived a wih ou he knowledge of A and B 2 marices. 3.2 Algorihm The proposed algorihm consiss of ouer level and inner level ieraive sages deailed as follows:- (1) Sar he algorihm wih an iniial sabilizing policy K 0 and Π 0 = 0. (2) Le k be he curren ouer ieraion sage wih known values for K k 1 and Π k 1. The disurbance policy is defined as w k = B 1 Π k 1 x. (3) Ouer level ieraion :- Wih he disurbance policy remaining consan, solve he corresponding opimal conrol problem defined by (15) and (16) for Π k and K k using he following inner level ieraion. (4) Inner level ieraion :- (a) Solve online for Π i and K i+1 from sae measuremens where i denoes he inner level ieraion sage, = ( x() x( + T )) T Π i + 2[(x T u T ) + (x T x T )(I n K T i )]K i+1 x T (C T C Π k 1 B 1 B T 1 Π k 1 +K T i K i )dτ. (b) Updae u = K i+1 x + e. (c) Coninue inner level ieraion ill K i+1 K i ɛ 2. (5) Se K k = K i and Π k = Π i. (6) Updae disurbance policy as w k+1 = B T 1 Π k x. Modify he opimal conrol problem described by (15) and (16) using updaed disurbance policy. (7) Coninue ouer level ieraion(seps (3) o (6)) ill K k K k 1 ɛ 1. (8) Se saddle poin policy, K = K k. Remark 6. In he Sep (1) of he algorihm, he iniial sabilizing gain K 0 can be obained by using some nominal knowledge abou he sysem, e.g for sable sysems K 0 can be chosen. 3.3 Convergence Analysis. Theorem 7. The proposed algorihm defined by he ouer level and inner level ieraive sages (Seps (1) o (6) in Secion 3.2) converge o he saddle poin soluion of he (28) wo-player zero sum game. The proposed algorihm creaes a sequence of conrol policies, {K k, k= 1,2,3...} which converges o he saddle poin conrol policy, K so ha lim k Kk K = 0. Proof. The proposed algorihm is based on ouer and inner level ieraive sages. Convergence of proposed algorihm is proved by proving convergence of inner level and ouer level ieraions. Since he inner level ieraion is embedded in he ouer level ieraion, he convergence of inner level ieraion is analyzed firs. The inner level ieraion based on (28) is solved using leas square esimae of K i+1 and Π i using (27). Since (28) is derived from (18) and (19) using (20), (21), (22) and (23), he inner level ieraion sage is equivalen o Kleinman s ieraion (18) and (19). Hence convergence of inner level ieraion is proved using Lemma 3. The ouer level ieraions are based on he soluion of (15) and (16), which is equivalen o he soluion of (11) and (13) using Lemma 2. Then using [Lanzon e al. (2008)], convergence of ouer level ieraion is proved. The convergence resuls of inner level and he ouer level ieraions guaranee convergence of he proposed algorihm o he saddle poin soluion of he wo-player zero sum differenial game. Remark 8. The proposed algorihm assumes knowledge of he disurbance inpu marix while no requiring he knowledge of drif dynamics and conrol inpu marix. 4. SIMULATION RESULTS The algorihm is implemened on a linearized power sysem model proposed in [Wang e al. (1963)]. The sysem marices of he nominal model are A = B 2 = [ ] T, [B 1 = [ ] T,, C = I 4. (29) Wih complee knowledge of sysem dynamics, he feedback gain of he conroller player corresponding o he saddle poin equilibrium can be calculaed using he algorihm pu forward by [Lanzon e al. (2008)] as, K saddle = [ ] (30) The adapive algorihm proposed in his paper does no require he knowledge of A and B 2 and only he knowledge of B 1 is required. Using he proposed algorihm wih random noise as exploraion signal, he following saddle poin policy is obained afer four ouer level ieraions and eleven inner level ieraions. K alg = [ ] (31) 700

6 Fig. 1. Evoluion of Feedback Gain 5. CONCLUSION An adapive dynamic programming based algorihm for finding an online saddle poin soluion of wo-player zero sum linear differenial games is presened in his paper. The proposed algorihm is capable of arriving a he saddle poin soluion of he game problem wih parial knowledge of sysem dynamics, i.e only he knowledge of disurbance inpu marix is required. The algorihm is used o find he opimal conrol policy in a power sysem under wors case disurbance. Fuure effors will be focussed on making he algorihm compleely model-free so ha saddle poin soluion can be achieved wihou any knowledge sysem dynamics. REFERENCES Abu-Khalaf, M., Lewis, F., and Huang, J. (2008). Neurodynamic programming and zero-sum games for consrained conrol sysems. IEEE Transacions on Neural Neworks, 19(7), Abu-Khalaf, M. and Lewis, F.L. (2005). Nearly opimal conrol laws for nonlinear sysems wih sauraing acuaors using a neural nework HJB approach. Auomaica, 41(5), Al-Tamimi, A., Lewis, F.L., and Abu-Khalaf, M. (2007). Model-free Q-learning designs for linear discree-ime zero-sum games wih applicaion o H-infiniy conrol. Auomaica, 43(3), Arnold, W.F., I. and Laub, A. (1984). Generalized eigenproblem algorihms and sofware for Algebraic Riccai Equaions. Proceedings of he IEEE, 72(12), Basar, T. and Bernhard, P. (1995). H-infiniy Opimal Conrol and Relaed Minimax Design Problems. Birkhauser. Basar, T. and Olsder, G.J. (1995). Dynamic Non- Cooperaive Game Theory. SIAM. Bellman, R.E. (2003). Dynamic Programming. Dover Publicaions. Damm, T. and Hinrichsen, D. (2001). Newon s mehod for a raional marix equaion occurring in sochasic conrol. Linear Algebra and is Applicaions, 332(0), Isaacs, R. (1965). Differenial Games: A Mahemaical Theory wih Applicaions o Warfare and Pursui, Conrol and Opimizaion. Dover Publicaions. Jiang, Y. and Jiang, Z. (2012). Compuaional adapive opimal conrol for coninuous-ime linear sysems wih compleely unknown dynamics. Auomaica, 48(10), Kleinman, D. (1968). On an ieraive echnique for Riccai equaion compuaions. IEEE Transacions on Auomaic Conrol, 13(1), Lanzon, A., Feng, Y., Anderson, B., and Rokowiz, M. (2008). Compuing he posiive sabilizing soluion o Algebraic Riccai Equaions wih an indefinie quadraic erm via a recursive mehod. IEEE Transacions on Auomaic Conrol, 53(10), Lee, J.Y., Park, J.B., and Choi, Y.H. (2012). Inegral Q-learning and explorized policy ieraion for adapive opimal conrol of coninuous-ime linear sysems. Auomaica, 48(11), Lewis, F. and Liu, D. (2013). Reinforcemen Learning and Approximae Dynamic Programming for Feedback Conrol. Wiley-IEEE Press. Lewis, F., Vrabie, D., and Vamvoudakis, K. (2012). Reinforcemen learning and feedback conrol: Using naural decision mehods o design opimal adapive conrollers. IEEE Conrol Sysems Magazine, 32(6), Mehrmann, V. and Tan, E. (1988). Defec correcion mehod for he soluion of Algebraicl Riccai Equaions. IEEE Transacions on Auomaic Conrol, 33(7), Mehrmann, V. (1991). The Auonomous Linear Quadraic Conrol Problem: Theory and Numerical Soluion. Lecure Noes in Conrol and Informaion Sciences. Springer. Sima, V. (1996). Algorihms for Linear Quadraic Opimizaion. Marcel Dekker Incorporaed. Suon, R.S. and Baro, A.G. (1998). Reinforcemen Learning: An Inroducion (Adapive Compuaion and Machine Learning). A Bradford Book. Tomlin, C., Lygeros, J., and Sasry, S. (2000). A game heoreic approach o conroller design for hybrid sysems. Proceedings of he IEEE, 88(7), Vrabie, D., Pasravanu, O., Abu-Khalaf, M., and Lewis, F. (2009). Adapive opimal conrol for coninuous-ime linear sysems based on policy ieraion. Auomaica, 45(2), Vrabie, D. and Lewis, F. (2011). Adapive dynamic programming for online soluion of a zero-sum differenial game. Journal of Conrol Theory and Applicaions, 9(3), Wang, F.Y., Zhang, H., and Liu, D. (2009). Adapive dynamic programming: An inroducion. IEEE Compuaional Inelligence Magazine, 4(2), Wang, Y., Zhou, R., and Wen, C. (1963). Robus loadfrequency conroller design for power sysems. IEE Proceedings C Generaion, Transmission and Disribuion, 140(1), Wei, Q. and Liu, D. (2012). Self-learning conrol schemes for wo-person zero-sum differenial games of coninuous-ime nonlinear sysems wih sauraing conrollers. In Inernaional conference on Advances in Neural Neworks, Werbos, P. (1992). Approximae dynamic programming for real-ime conrol and neural modeling. Handbook of Inelligen Conrol: Neural, Fuzzy and Adapive Approaches. 701

Sliding Mode Extremum Seeking Control for Linear Quadratic Dynamic Game

Sliding Mode Extremum Seeking Control for Linear Quadratic Dynamic Game Sliding Mode Exremum Seeking Conrol for Linear Quadraic Dynamic Game Yaodong Pan and Ümi Özgüner ITS Research Group, AIST Tsukuba Eas Namiki --, Tsukuba-shi,Ibaraki-ken 5-856, Japan e-mail: pan.yaodong@ais.go.jp

More information

Lecture 20: Riccati Equations and Least Squares Feedback Control

Lecture 20: Riccati Equations and Least Squares Feedback Control 34-5 LINEAR SYSTEMS Lecure : Riccai Equaions and Leas Squares Feedback Conrol 5.6.4 Sae Feedback via Riccai Equaions A recursive approach in generaing he marix-valued funcion W ( ) equaion for i for he

More information

2.160 System Identification, Estimation, and Learning. Lecture Notes No. 8. March 6, 2006

2.160 System Identification, Estimation, and Learning. Lecture Notes No. 8. March 6, 2006 2.160 Sysem Idenificaion, Esimaion, and Learning Lecure Noes No. 8 March 6, 2006 4.9 Eended Kalman Filer In many pracical problems, he process dynamics are nonlinear. w Process Dynamics v y u Model (Linearized)

More information

Recursive Least-Squares Fixed-Interval Smoother Using Covariance Information based on Innovation Approach in Linear Continuous Stochastic Systems

Recursive Least-Squares Fixed-Interval Smoother Using Covariance Information based on Innovation Approach in Linear Continuous Stochastic Systems 8 Froniers in Signal Processing, Vol. 1, No. 1, July 217 hps://dx.doi.org/1.2266/fsp.217.112 Recursive Leas-Squares Fixed-Inerval Smooher Using Covariance Informaion based on Innovaion Approach in Linear

More information

An introduction to the theory of SDDP algorithm

An introduction to the theory of SDDP algorithm An inroducion o he heory of SDDP algorihm V. Leclère (ENPC) Augus 1, 2014 V. Leclère Inroducion o SDDP Augus 1, 2014 1 / 21 Inroducion Large scale sochasic problem are hard o solve. Two ways of aacking

More information

A Primal-Dual Type Algorithm with the O(1/t) Convergence Rate for Large Scale Constrained Convex Programs

A Primal-Dual Type Algorithm with the O(1/t) Convergence Rate for Large Scale Constrained Convex Programs PROC. IEEE CONFERENCE ON DECISION AND CONTROL, 06 A Primal-Dual Type Algorihm wih he O(/) Convergence Rae for Large Scale Consrained Convex Programs Hao Yu and Michael J. Neely Absrac This paper considers

More information

3.1.3 INTRODUCTION TO DYNAMIC OPTIMIZATION: DISCRETE TIME PROBLEMS. A. The Hamiltonian and First-Order Conditions in a Finite Time Horizon

3.1.3 INTRODUCTION TO DYNAMIC OPTIMIZATION: DISCRETE TIME PROBLEMS. A. The Hamiltonian and First-Order Conditions in a Finite Time Horizon 3..3 INRODUCION O DYNAMIC OPIMIZAION: DISCREE IME PROBLEMS A. he Hamilonian and Firs-Order Condiions in a Finie ime Horizon Define a new funcion, he Hamilonian funcion, H. H he change in he oal value of

More information

Stability and Bifurcation in a Neural Network Model with Two Delays

Stability and Bifurcation in a Neural Network Model with Two Delays Inernaional Mahemaical Forum, Vol. 6, 11, no. 35, 175-1731 Sabiliy and Bifurcaion in a Neural Nework Model wih Two Delays GuangPing Hu and XiaoLing Li School of Mahemaics and Physics, Nanjing Universiy

More information

A DELAY-DEPENDENT STABILITY CRITERIA FOR T-S FUZZY SYSTEM WITH TIME-DELAYS

A DELAY-DEPENDENT STABILITY CRITERIA FOR T-S FUZZY SYSTEM WITH TIME-DELAYS A DELAY-DEPENDENT STABILITY CRITERIA FOR T-S FUZZY SYSTEM WITH TIME-DELAYS Xinping Guan ;1 Fenglei Li Cailian Chen Insiue of Elecrical Engineering, Yanshan Universiy, Qinhuangdao, 066004, China. Deparmen

More information

MATH 5720: Gradient Methods Hung Phan, UMass Lowell October 4, 2018

MATH 5720: Gradient Methods Hung Phan, UMass Lowell October 4, 2018 MATH 5720: Gradien Mehods Hung Phan, UMass Lowell Ocober 4, 208 Descen Direcion Mehods Consider he problem min { f(x) x R n}. The general descen direcions mehod is x k+ = x k + k d k where x k is he curren

More information

( ) ( ) if t = t. It must satisfy the identity. So, bulkiness of the unit impulse (hyper)function is equal to 1. The defining characteristic is

( ) ( ) if t = t. It must satisfy the identity. So, bulkiness of the unit impulse (hyper)function is equal to 1. The defining characteristic is UNIT IMPULSE RESPONSE, UNIT STEP RESPONSE, STABILITY. Uni impulse funcion (Dirac dela funcion, dela funcion) rigorously defined is no sricly a funcion, bu disribuion (or measure), precise reamen requires

More information

Variational Iteration Method for Solving System of Fractional Order Ordinary Differential Equations

Variational Iteration Method for Solving System of Fractional Order Ordinary Differential Equations IOSR Journal of Mahemaics (IOSR-JM) e-issn: 2278-5728, p-issn: 2319-765X. Volume 1, Issue 6 Ver. II (Nov - Dec. 214), PP 48-54 Variaional Ieraion Mehod for Solving Sysem of Fracional Order Ordinary Differenial

More information

SUFFICIENT CONDITIONS FOR EXISTENCE SOLUTION OF LINEAR TWO-POINT BOUNDARY PROBLEM IN MINIMIZATION OF QUADRATIC FUNCTIONAL

SUFFICIENT CONDITIONS FOR EXISTENCE SOLUTION OF LINEAR TWO-POINT BOUNDARY PROBLEM IN MINIMIZATION OF QUADRATIC FUNCTIONAL HE PUBLISHING HOUSE PROCEEDINGS OF HE ROMANIAN ACADEMY, Series A, OF HE ROMANIAN ACADEMY Volume, Number 4/200, pp 287 293 SUFFICIEN CONDIIONS FOR EXISENCE SOLUION OF LINEAR WO-POIN BOUNDARY PROBLEM IN

More information

Georey E. Hinton. University oftoronto. Technical Report CRG-TR February 22, Abstract

Georey E. Hinton. University oftoronto.   Technical Report CRG-TR February 22, Abstract Parameer Esimaion for Linear Dynamical Sysems Zoubin Ghahramani Georey E. Hinon Deparmen of Compuer Science Universiy oftorono 6 King's College Road Torono, Canada M5S A4 Email: zoubin@cs.orono.edu Technical

More information

di Bernardo, M. (1995). A purely adaptive controller to synchronize and control chaotic systems.

di Bernardo, M. (1995). A purely adaptive controller to synchronize and control chaotic systems. di ernardo, M. (995). A purely adapive conroller o synchronize and conrol chaoic sysems. hps://doi.org/.6/375-96(96)8-x Early version, also known as pre-prin Link o published version (if available):.6/375-96(96)8-x

More information

Notes on Kalman Filtering

Notes on Kalman Filtering Noes on Kalman Filering Brian Borchers and Rick Aser November 7, Inroducion Daa Assimilaion is he problem of merging model predicions wih acual measuremens of a sysem o produce an opimal esimae of he curren

More information

1 Review of Zero-Sum Games

1 Review of Zero-Sum Games COS 5: heoreical Machine Learning Lecurer: Rob Schapire Lecure #23 Scribe: Eugene Brevdo April 30, 2008 Review of Zero-Sum Games Las ime we inroduced a mahemaical model for wo player zero-sum games. Any

More information

Chapter 2. First Order Scalar Equations

Chapter 2. First Order Scalar Equations Chaper. Firs Order Scalar Equaions We sar our sudy of differenial equaions in he same way he pioneers in his field did. We show paricular echniques o solve paricular ypes of firs order differenial equaions.

More information

Optimal Path Planning for Flexible Redundant Robot Manipulators

Optimal Path Planning for Flexible Redundant Robot Manipulators 25 WSEAS In. Conf. on DYNAMICAL SYSEMS and CONROL, Venice, Ialy, November 2-4, 25 (pp363-368) Opimal Pah Planning for Flexible Redundan Robo Manipulaors H. HOMAEI, M. KESHMIRI Deparmen of Mechanical Engineering

More information

EE363 homework 1 solutions

EE363 homework 1 solutions EE363 Prof. S. Boyd EE363 homework 1 soluions 1. LQR for a riple accumulaor. We consider he sysem x +1 = Ax + Bu, y = Cx, wih 1 1 A = 1 1, B =, C = [ 1 ]. 1 1 This sysem has ransfer funcion H(z) = (z 1)

More information

EXERCISES FOR SECTION 1.5

EXERCISES FOR SECTION 1.5 1.5 Exisence and Uniqueness of Soluions 43 20. 1 v c 21. 1 v c 1 2 4 6 8 10 1 2 2 4 6 8 10 Graph of approximae soluion obained using Euler s mehod wih = 0.1. Graph of approximae soluion obained using Euler

More information

Mean-square Stability Control for Networked Systems with Stochastic Time Delay

Mean-square Stability Control for Networked Systems with Stochastic Time Delay JOURNAL OF SIMULAION VOL. 5 NO. May 7 Mean-square Sabiliy Conrol for Newored Sysems wih Sochasic ime Delay YAO Hejun YUAN Fushun School of Mahemaics and Saisics Anyang Normal Universiy Anyang Henan. 455

More information

Q-LEARNING (QL) is a popular and powerful off-policy

Q-LEARNING (QL) is a popular and powerful off-policy MANUSCRIPT UNER REVIEW 1 Q-learning for Opimal Conrol of Coninuous-ime Sysems Biao Luo, erong Liu, Fellow, IEEE and Tingwen Huang he opimal conrol problem in conrol communiy. Thus, i is possible and promising

More information

MATH 128A, SUMMER 2009, FINAL EXAM SOLUTION

MATH 128A, SUMMER 2009, FINAL EXAM SOLUTION MATH 28A, SUMME 2009, FINAL EXAM SOLUTION BENJAMIN JOHNSON () (8 poins) [Lagrange Inerpolaion] (a) (4 poins) Le f be a funcion defined a some real numbers x 0,..., x n. Give a defining equaion for he Lagrange

More information

STATE-SPACE MODELLING. A mass balance across the tank gives:

STATE-SPACE MODELLING. A mass balance across the tank gives: B. Lennox and N.F. Thornhill, 9, Sae Space Modelling, IChemE Process Managemen and Conrol Subjec Group Newsleer STE-SPACE MODELLING Inroducion: Over he pas decade or so here has been an ever increasing

More information

Ordinary Differential Equations

Ordinary Differential Equations Ordinary Differenial Equaions 5. Examples of linear differenial equaions and heir applicaions We consider some examples of sysems of linear differenial equaions wih consan coefficiens y = a y +... + a

More information

The Asymptotic Behavior of Nonoscillatory Solutions of Some Nonlinear Dynamic Equations on Time Scales

The Asymptotic Behavior of Nonoscillatory Solutions of Some Nonlinear Dynamic Equations on Time Scales Advances in Dynamical Sysems and Applicaions. ISSN 0973-5321 Volume 1 Number 1 (2006, pp. 103 112 c Research India Publicaions hp://www.ripublicaion.com/adsa.hm The Asympoic Behavior of Nonoscillaory Soluions

More information

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing Applicaion of a Sochasic-Fuzzy Approach o Modeling Opimal Discree Time Dynamical Sysems by Using Large Scale Daa Processing AA WALASZE-BABISZEWSA Deparmen of Compuer Engineering Opole Universiy of Technology

More information

BU Macro BU Macro Fall 2008, Lecture 4

BU Macro BU Macro Fall 2008, Lecture 4 Dynamic Programming BU Macro 2008 Lecure 4 1 Ouline 1. Cerainy opimizaion problem used o illusrae: a. Resricions on exogenous variables b. Value funcion c. Policy funcion d. The Bellman equaion and an

More information

Lecture 2 October ε-approximation of 2-player zero-sum games

Lecture 2 October ε-approximation of 2-player zero-sum games Opimizaion II Winer 009/10 Lecurer: Khaled Elbassioni Lecure Ocober 19 1 ε-approximaion of -player zero-sum games In his lecure we give a randomized ficiious play algorihm for obaining an approximae soluion

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Noes for EE7C Spring 018: Convex Opimizaion and Approximaion Insrucor: Moriz Hard Email: hard+ee7c@berkeley.edu Graduae Insrucor: Max Simchowiz Email: msimchow+ee7c@berkeley.edu Ocober 15, 018 3

More information

An Introduction to Malliavin calculus and its applications

An Introduction to Malliavin calculus and its applications An Inroducion o Malliavin calculus and is applicaions Lecure 5: Smoohness of he densiy and Hörmander s heorem David Nualar Deparmen of Mahemaics Kansas Universiy Universiy of Wyoming Summer School 214

More information

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model Modal idenificaion of srucures from roving inpu daa by means of maximum likelihood esimaion of he sae space model J. Cara, J. Juan, E. Alarcón Absrac The usual way o perform a forced vibraion es is o fix

More information

Sliding Mode Controller for Unstable Systems

Sliding Mode Controller for Unstable Systems S. SIVARAMAKRISHNAN e al., Sliding Mode Conroller for Unsable Sysems, Chem. Biochem. Eng. Q. 22 (1) 41 47 (28) 41 Sliding Mode Conroller for Unsable Sysems S. Sivaramakrishnan, A. K. Tangirala, and M.

More information

An recursive analytical technique to estimate time dependent physical parameters in the presence of noise processes

An recursive analytical technique to estimate time dependent physical parameters in the presence of noise processes WHAT IS A KALMAN FILTER An recursive analyical echnique o esimae ime dependen physical parameers in he presence of noise processes Example of a ime and frequency applicaion: Offse beween wo clocks PREDICTORS,

More information

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar CONROL OF SOCHASIC SYSEMS P.R. Kumar Deparmen of Elecrical and Compuer Engineering, and Coordinaed Science Laboraory, Universiy of Illinois, Urbana-Champaign, USA. Keywords: Markov chains, ransiion probabiliies,

More information

Online Convex Optimization Example And Follow-The-Leader

Online Convex Optimization Example And Follow-The-Leader CSE599s, Spring 2014, Online Learning Lecure 2-04/03/2014 Online Convex Opimizaion Example And Follow-The-Leader Lecurer: Brendan McMahan Scribe: Sephen Joe Jonany 1 Review of Online Convex Opimizaion

More information

Lecture 9: September 25

Lecture 9: September 25 0-725: Opimizaion Fall 202 Lecure 9: Sepember 25 Lecurer: Geoff Gordon/Ryan Tibshirani Scribes: Xuezhi Wang, Subhodeep Moira, Abhimanu Kumar Noe: LaTeX emplae couresy of UC Berkeley EECS dep. Disclaimer:

More information

Lecture 1 Overview. course mechanics. outline & topics. what is a linear dynamical system? why study linear systems? some examples

Lecture 1 Overview. course mechanics. outline & topics. what is a linear dynamical system? why study linear systems? some examples EE263 Auumn 27-8 Sephen Boyd Lecure 1 Overview course mechanics ouline & opics wha is a linear dynamical sysem? why sudy linear sysems? some examples 1 1 Course mechanics all class info, lecures, homeworks,

More information

Chapter 8 The Complete Response of RL and RC Circuits

Chapter 8 The Complete Response of RL and RC Circuits Chaper 8 The Complee Response of RL and RC Circuis Seoul Naional Universiy Deparmen of Elecrical and Compuer Engineering Wha is Firs Order Circuis? Circuis ha conain only one inducor or only one capacior

More information

On Robust Stability of Uncertain Neutral Systems with Discrete and Distributed Delays

On Robust Stability of Uncertain Neutral Systems with Discrete and Distributed Delays 009 American Conrol Conference Hya Regency Riverfron S. Louis MO USA June 0-009 FrC09.5 On Robus Sabiliy of Uncerain Neural Sysems wih Discree and Disribued Delays Jian Sun Jie Chen G.P. Liu Senior Member

More information

Particle Swarm Optimization Combining Diversification and Intensification for Nonlinear Integer Programming Problems

Particle Swarm Optimization Combining Diversification and Intensification for Nonlinear Integer Programming Problems Paricle Swarm Opimizaion Combining Diversificaion and Inensificaion for Nonlinear Ineger Programming Problems Takeshi Masui, Masaoshi Sakawa, Kosuke Kao and Koichi Masumoo Hiroshima Universiy 1-4-1, Kagamiyama,

More information

Optimal Control of Dc Motor Using Performance Index of Energy

Optimal Control of Dc Motor Using Performance Index of Energy American Journal of Engineering esearch AJE 06 American Journal of Engineering esearch AJE e-issn: 30-0847 p-issn : 30-0936 Volume-5, Issue-, pp-57-6 www.ajer.org esearch Paper Open Access Opimal Conrol

More information

Chapter 3 Boundary Value Problem

Chapter 3 Boundary Value Problem Chaper 3 Boundary Value Problem A boundary value problem (BVP) is a problem, ypically an ODE or a PDE, which has values assigned on he physical boundary of he domain in which he problem is specified. Le

More information

Random Walk with Anti-Correlated Steps

Random Walk with Anti-Correlated Steps Random Walk wih Ani-Correlaed Seps John Noga Dirk Wagner 2 Absrac We conjecure he expeced value of random walks wih ani-correlaed seps o be exacly. We suppor his conjecure wih 2 plausibiliy argumens and

More information

SZG Macro 2011 Lecture 3: Dynamic Programming. SZG macro 2011 lecture 3 1

SZG Macro 2011 Lecture 3: Dynamic Programming. SZG macro 2011 lecture 3 1 SZG Macro 2011 Lecure 3: Dynamic Programming SZG macro 2011 lecure 3 1 Background Our previous discussion of opimal consumpion over ime and of opimal capial accumulaion sugges sudying he general decision

More information

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle Chaper 2 Newonian Mechanics Single Paricle In his Chaper we will review wha Newon s laws of mechanics ell us abou he moion of a single paricle. Newon s laws are only valid in suiable reference frames,

More information

(1) (2) Differentiation of (1) and then substitution of (3) leads to. Therefore, we will simply consider the second-order linear system given by (4)

(1) (2) Differentiation of (1) and then substitution of (3) leads to. Therefore, we will simply consider the second-order linear system given by (4) Phase Plane Analysis of Linear Sysems Adaped from Applied Nonlinear Conrol by Sloine and Li The general form of a linear second-order sysem is a c b d From and b bc d a Differeniaion of and hen subsiuion

More information

L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS. NA568 Mobile Robotics: Methods & Algorithms

L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS. NA568 Mobile Robotics: Methods & Algorithms L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS NA568 Mobile Roboics: Mehods & Algorihms Today s Topic Quick review on (Linear) Kalman Filer Kalman Filering for Non-Linear Sysems Exended Kalman Filer (EKF)

More information

Augmented Reality II - Kalman Filters - Gudrun Klinker May 25, 2004

Augmented Reality II - Kalman Filters - Gudrun Klinker May 25, 2004 Augmened Realiy II Kalman Filers Gudrun Klinker May 25, 2004 Ouline Moivaion Discree Kalman Filer Modeled Process Compuing Model Parameers Algorihm Exended Kalman Filer Kalman Filer for Sensor Fusion Lieraure

More information

Continuous Time. Time-Domain System Analysis. Impulse Response. Impulse Response. Impulse Response. Impulse Response. ( t) + b 0.

Continuous Time. Time-Domain System Analysis. Impulse Response. Impulse Response. Impulse Response. Impulse Response. ( t) + b 0. Time-Domain Sysem Analysis Coninuous Time. J. Robers - All Righs Reserved. Edied by Dr. Rober Akl 1. J. Robers - All Righs Reserved. Edied by Dr. Rober Akl 2 Le a sysem be described by a 2 y ( ) + a 1

More information

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter Sae-Space Models Iniializaion, Esimaion and Smoohing of he Kalman Filer Iniializaion of he Kalman Filer The Kalman filer shows how o updae pas predicors and he corresponding predicion error variances when

More information

On-line Adaptive Optimal Timing Control of Switched Systems

On-line Adaptive Optimal Timing Control of Switched Systems On-line Adapive Opimal Timing Conrol of Swiched Sysems X.C. Ding, Y. Wardi and M. Egersed Absrac In his paper we consider he problem of opimizing over he swiching imes for a muli-modal dynamic sysem when

More information

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Simulaion-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Week Descripion Reading Maerial 2 Compuer Simulaion of Dynamic Models Finie Difference, coninuous saes, discree ime Simple Mehods Euler Trapezoid

More information

Signaling equilibria for dynamic LQG games with. asymmetric information

Signaling equilibria for dynamic LQG games with. asymmetric information Signaling equilibria for dynamic LQG games wih asymmeric informaion Deepanshu Vasal and Achilleas Anasasopoulos Absrac We consider a finie horizon dynamic game wih wo players who observe heir ypes privaely

More information

Independent component analysis for nonminimum phase systems using H filters

Independent component analysis for nonminimum phase systems using H filters Independen componen analysis for nonminimum phase sysems using H filers Shuichi Fukunaga, Kenji Fujimoo Deparmen of Mechanical Science and Engineering, Graduae Shool of Engineering, Nagoya Universiy, Furo-cho,

More information

10. State Space Methods

10. State Space Methods . Sae Space Mehods. Inroducion Sae space modelling was briefly inroduced in chaper. Here more coverage is provided of sae space mehods before some of heir uses in conrol sysem design are covered in he

More information

POSITIVE AND MONOTONE SYSTEMS IN A PARTIALLY ORDERED SPACE

POSITIVE AND MONOTONE SYSTEMS IN A PARTIALLY ORDERED SPACE Urainian Mahemaical Journal, Vol. 55, No. 2, 2003 POSITIVE AND MONOTONE SYSTEMS IN A PARTIALLY ORDERED SPACE A. G. Mazo UDC 517.983.27 We invesigae properies of posiive and monoone differenial sysems wih

More information

and Zero Sum Game Solutions

and Zero Sum Game Solutions F.L. Lewis, K. Vamvoudakis, and D. Vrabie Moncrief-O Donnell Endowed Chair Head, Conrols & Sensors Group Auomaion & Roboics Research Insiue (ARRI) he Universiy of exas a Arlingon Suppored by : NSF PAUL

More information

Supplement for Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence

Supplement for Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence Supplemen for Sochasic Convex Opimizaion: Faser Local Growh Implies Faser Global Convergence Yi Xu Qihang Lin ianbao Yang Proof of heorem heorem Suppose Assumpion holds and F (w) obeys he LGC (6) Given

More information

Four Generations of Higher Order Sliding Mode Controllers. L. Fridman Universidad Nacional Autonoma de Mexico Aussois, June, 10th, 2015

Four Generations of Higher Order Sliding Mode Controllers. L. Fridman Universidad Nacional Autonoma de Mexico Aussois, June, 10th, 2015 Four Generaions of Higher Order Sliding Mode Conrollers L. Fridman Universidad Nacional Auonoma de Mexico Aussois, June, 1h, 215 Ouline 1 Generaion 1:Two Main Conceps of Sliding Mode Conrol 2 Generaion

More information

Expert Advice for Amateurs

Expert Advice for Amateurs Exper Advice for Amaeurs Ernes K. Lai Online Appendix - Exisence of Equilibria The analysis in his secion is performed under more general payoff funcions. Wihou aking an explici form, he payoffs of he

More information

Fractional Method of Characteristics for Fractional Partial Differential Equations

Fractional Method of Characteristics for Fractional Partial Differential Equations Fracional Mehod of Characerisics for Fracional Parial Differenial Equaions Guo-cheng Wu* Modern Teile Insiue, Donghua Universiy, 188 Yan-an ilu Road, Shanghai 51, PR China Absrac The mehod of characerisics

More information

THE BELLMAN PRINCIPLE OF OPTIMALITY

THE BELLMAN PRINCIPLE OF OPTIMALITY THE BELLMAN PRINCIPLE OF OPTIMALITY IOANID ROSU As I undersand, here are wo approaches o dynamic opimizaion: he Ponrjagin Hamilonian) approach, and he Bellman approach. I saw several clear discussions

More information

2.7. Some common engineering functions. Introduction. Prerequisites. Learning Outcomes

2.7. Some common engineering functions. Introduction. Prerequisites. Learning Outcomes Some common engineering funcions 2.7 Inroducion This secion provides a caalogue of some common funcions ofen used in Science and Engineering. These include polynomials, raional funcions, he modulus funcion

More information

This document was generated at 1:04 PM, 09/10/13 Copyright 2013 Richard T. Woodward. 4. End points and transversality conditions AGEC

This document was generated at 1:04 PM, 09/10/13 Copyright 2013 Richard T. Woodward. 4. End points and transversality conditions AGEC his documen was generaed a 1:4 PM, 9/1/13 Copyrigh 213 Richard. Woodward 4. End poins and ransversaliy condiions AGEC 637-213 F z d Recall from Lecure 3 ha a ypical opimal conrol problem is o maimize (,,

More information

L p -L q -Time decay estimate for solution of the Cauchy problem for hyperbolic partial differential equations of linear thermoelasticity

L p -L q -Time decay estimate for solution of the Cauchy problem for hyperbolic partial differential equations of linear thermoelasticity ANNALES POLONICI MATHEMATICI LIV.2 99) L p -L q -Time decay esimae for soluion of he Cauchy problem for hyperbolic parial differenial equaions of linear hermoelasiciy by Jerzy Gawinecki Warszawa) Absrac.

More information

Technical Report Doc ID: TR March-2013 (Last revision: 23-February-2016) On formulating quadratic functions in optimization models.

Technical Report Doc ID: TR March-2013 (Last revision: 23-February-2016) On formulating quadratic functions in optimization models. Technical Repor Doc ID: TR--203 06-March-203 (Las revision: 23-Februar-206) On formulaing quadraic funcions in opimizaion models. Auhor: Erling D. Andersen Convex quadraic consrains quie frequenl appear

More information

Improved Approximate Solutions for Nonlinear Evolutions Equations in Mathematical Physics Using the Reduced Differential Transform Method

Improved Approximate Solutions for Nonlinear Evolutions Equations in Mathematical Physics Using the Reduced Differential Transform Method Journal of Applied Mahemaics & Bioinformaics, vol., no., 01, 1-14 ISSN: 179-660 (prin), 179-699 (online) Scienpress Ld, 01 Improved Approimae Soluions for Nonlinear Evoluions Equaions in Mahemaical Physics

More information

Macroeconomic Theory Ph.D. Qualifying Examination Fall 2005 ANSWER EACH PART IN A SEPARATE BLUE BOOK. PART ONE: ANSWER IN BOOK 1 WEIGHT 1/3

Macroeconomic Theory Ph.D. Qualifying Examination Fall 2005 ANSWER EACH PART IN A SEPARATE BLUE BOOK. PART ONE: ANSWER IN BOOK 1 WEIGHT 1/3 Macroeconomic Theory Ph.D. Qualifying Examinaion Fall 2005 Comprehensive Examinaion UCLA Dep. of Economics You have 4 hours o complee he exam. There are hree pars o he exam. Answer all pars. Each par has

More information

Problemas das Aulas Práticas

Problemas das Aulas Práticas Mesrado Inegrado em Engenharia Elecroécnica e de Compuadores Conrolo em Espaço de Esados Problemas das Aulas Práicas J. Miranda Lemos Fevereiro de 3 Translaed o English by José Gaspar, 6 J. M. Lemos, IST

More information

d 1 = c 1 b 2 - b 1 c 2 d 2 = c 1 b 3 - b 1 c 3

d 1 = c 1 b 2 - b 1 c 2 d 2 = c 1 b 3 - b 1 c 3 and d = c b - b c c d = c b - b c c This process is coninued unil he nh row has been compleed. The complee array of coefficiens is riangular. Noe ha in developing he array an enire row may be divided or

More information

Adaptive MPC for Iterative Tasks

Adaptive MPC for Iterative Tasks Adapive MPC for Ieraive Tasks Monimoy Bujarbaruah, Xiaojing Zhang, Ugo Rosolia, Francesco Borrelli This paper is organized as follows: Secion II inroduces he problem seup. Secion III formulaes he Adapive

More information

arxiv: v1 [cs.ai] 9 May 2017

arxiv: v1 [cs.ai] 9 May 2017 Inegral Policy Ieraions for Reinforcemen Learning Problems in Coninuous Time and Space Jae Young Lee, Richard S Suon Deparmen of Compuing Science, Universiy of Albera, Edmonon, AB, Canada, T6G 2R3 arxiv:170503520v1

More information

CHAPTER 2 Signals And Spectra

CHAPTER 2 Signals And Spectra CHAPER Signals And Specra Properies of Signals and Noise In communicaion sysems he received waveform is usually caegorized ino he desired par conaining he informaion, and he undesired par. he desired par

More information

Moving Horizon Estimation for Staged QP Problems

Moving Horizon Estimation for Staged QP Problems 51s IEEE Conference on Decision and Conrol, Maui, HI, December 212 Moving Horizon Esimaion for Saged QP Problems Eric Chu, Arezou Keshavarz, Dimiry Gorinevsky, and Sephen Boyd Absrac his paper considers

More information

Linear Response Theory: The connection between QFT and experiments

Linear Response Theory: The connection between QFT and experiments Phys540.nb 39 3 Linear Response Theory: The connecion beween QFT and experimens 3.1. Basic conceps and ideas Q: How do we measure he conduciviy of a meal? A: we firs inroduce a weak elecric field E, and

More information

Multi-scale 2D acoustic full waveform inversion with high frequency impulsive source

Multi-scale 2D acoustic full waveform inversion with high frequency impulsive source Muli-scale D acousic full waveform inversion wih high frequency impulsive source Vladimir N Zubov*, Universiy of Calgary, Calgary AB vzubov@ucalgaryca and Michael P Lamoureux, Universiy of Calgary, Calgary

More information

4. Advanced Stability Theory

4. Advanced Stability Theory Applied Nonlinear Conrol Nguyen an ien - 4 4 Advanced Sabiliy heory he objecive of his chaper is o presen sabiliy analysis for non-auonomous sysems 41 Conceps of Sabiliy for Non-Auonomous Sysems Equilibrium

More information

arxiv: v1 [math.gm] 4 Nov 2018

arxiv: v1 [math.gm] 4 Nov 2018 Unpredicable Soluions of Linear Differenial Equaions Mara Akhme 1,, Mehme Onur Fen 2, Madina Tleubergenova 3,4, Akylbek Zhamanshin 3,4 1 Deparmen of Mahemaics, Middle Eas Technical Universiy, 06800, Ankara,

More information

) were both constant and we brought them from under the integral.

) were both constant and we brought them from under the integral. YIELD-PER-RECRUIT (coninued The yield-per-recrui model applies o a cohor, bu we saw in he Age Disribuions lecure ha he properies of a cohor do no apply in general o a collecion of cohors, which is wha

More information

RL Lecture 7: Eligibility Traces. R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction 1

RL Lecture 7: Eligibility Traces. R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction 1 RL Lecure 7: Eligibiliy Traces R. S. Suon and A. G. Baro: Reinforcemen Learning: An Inroducion 1 N-sep TD Predicion Idea: Look farher ino he fuure when you do TD backup (1, 2, 3,, n seps) R. S. Suon and

More information

dy dx = xey (a) y(0) = 2 (b) y(1) = 2.5 SOLUTION: See next page

dy dx = xey (a) y(0) = 2 (b) y(1) = 2.5 SOLUTION: See next page Assignmen 1 MATH 2270 SOLUTION Please wrie ou complee soluions for each of he following 6 problems (one more will sill be added). You may, of course, consul wih your classmaes, he exbook or oher resources,

More information

Properties Of Solutions To A Generalized Liénard Equation With Forcing Term

Properties Of Solutions To A Generalized Liénard Equation With Forcing Term Applied Mahemaics E-Noes, 8(28), 4-44 c ISSN 67-25 Available free a mirror sies of hp://www.mah.nhu.edu.w/ amen/ Properies Of Soluions To A Generalized Liénard Equaion Wih Forcing Term Allan Kroopnick

More information

Kriging Models Predicting Atrazine Concentrations in Surface Water Draining Agricultural Watersheds

Kriging Models Predicting Atrazine Concentrations in Surface Water Draining Agricultural Watersheds 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Kriging Models Predicing Arazine Concenraions in Surface Waer Draining Agriculural Waersheds Paul L. Mosquin, Jeremy Aldworh, Wenlin Chen Supplemenal Maerial Number

More information

Final Spring 2007

Final Spring 2007 .615 Final Spring 7 Overview The purpose of he final exam is o calculae he MHD β limi in a high-bea oroidal okamak agains he dangerous n = 1 exernal ballooning-kink mode. Effecively, his corresponds o

More information

Anti-Disturbance Control for Multiple Disturbances

Anti-Disturbance Control for Multiple Disturbances Workshop a 3 ACC Ani-Disurbance Conrol for Muliple Disurbances Lei Guo (lguo@buaa.edu.cn) Naional Key Laboraory on Science and Technology on Aircraf Conrol, Beihang Universiy, Beijing, 9, P.R. China. Presened

More information

Undetermined coefficients for local fractional differential equations

Undetermined coefficients for local fractional differential equations Available online a www.isr-publicaions.com/jmcs J. Mah. Compuer Sci. 16 (2016), 140 146 Research Aricle Undeermined coefficiens for local fracional differenial equaions Roshdi Khalil a,, Mohammed Al Horani

More information

The expectation value of the field operator.

The expectation value of the field operator. The expecaion value of he field operaor. Dan Solomon Universiy of Illinois Chicago, IL dsolom@uic.edu June, 04 Absrac. Much of he mahemaical developmen of quanum field heory has been in suppor of deermining

More information

An Introduction to Stochastic Programming: The Recourse Problem

An Introduction to Stochastic Programming: The Recourse Problem An Inroducion o Sochasic Programming: he Recourse Problem George Danzig and Phil Wolfe Ellis Johnson, Roger Wes, Dick Cole, and Me John Birge Where o look in he ex pp. 6-7, Secion.2.: Inroducion o sochasic

More information

STABILITY OF NONLINEAR NEUTRAL DELAY DIFFERENTIAL EQUATIONS WITH VARIABLE DELAYS

STABILITY OF NONLINEAR NEUTRAL DELAY DIFFERENTIAL EQUATIONS WITH VARIABLE DELAYS Elecronic Journal of Differenial Equaions, Vol. 217 217, No. 118, pp. 1 14. ISSN: 172-6691. URL: hp://ejde.mah.xsae.edu or hp://ejde.mah.un.edu STABILITY OF NONLINEAR NEUTRAL DELAY DIFFERENTIAL EQUATIONS

More information

Zürich. ETH Master Course: L Autonomous Mobile Robots Localization II

Zürich. ETH Master Course: L Autonomous Mobile Robots Localization II Roland Siegwar Margaria Chli Paul Furgale Marco Huer Marin Rufli Davide Scaramuzza ETH Maser Course: 151-0854-00L Auonomous Mobile Robos Localizaion II ACT and SEE For all do, (predicion updae / ACT),

More information

The Arcsine Distribution

The Arcsine Distribution The Arcsine Disribuion Chris H. Rycrof Ocober 6, 006 A common heme of he class has been ha he saisics of single walker are ofen very differen from hose of an ensemble of walkers. On he firs homework, we

More information

Projection-Based Optimal Mode Scheduling

Projection-Based Optimal Mode Scheduling Projecion-Based Opimal Mode Scheduling T. M. Caldwell and T. D. Murphey Absrac This paper develops an ieraive opimizaion echnique ha can be applied o mode scheduling. The algorihm provides boh a mode schedule

More information

Pade and Laguerre Approximations Applied. to the Active Queue Management Model. of Internet Protocol

Pade and Laguerre Approximations Applied. to the Active Queue Management Model. of Internet Protocol Applied Mahemaical Sciences, Vol. 7, 013, no. 16, 663-673 HIKARI Ld, www.m-hikari.com hp://dx.doi.org/10.1988/ams.013.39499 Pade and Laguerre Approximaions Applied o he Acive Queue Managemen Model of Inerne

More information

The Optimal Stopping Time for Selling an Asset When It Is Uncertain Whether the Price Process Is Increasing or Decreasing When the Horizon Is Infinite

The Optimal Stopping Time for Selling an Asset When It Is Uncertain Whether the Price Process Is Increasing or Decreasing When the Horizon Is Infinite American Journal of Operaions Research, 08, 8, 8-9 hp://wwwscirporg/journal/ajor ISSN Online: 60-8849 ISSN Prin: 60-8830 The Opimal Sopping Time for Selling an Asse When I Is Uncerain Wheher he Price Process

More information

Math 334 Fall 2011 Homework 11 Solutions

Math 334 Fall 2011 Homework 11 Solutions Dec. 2, 2 Mah 334 Fall 2 Homework Soluions Basic Problem. Transform he following iniial value problem ino an iniial value problem for a sysem: u + p()u + q() u g(), u() u, u () v. () Soluion. Le v u. Then

More information

A generalization of the Burg s algorithm to periodically correlated time series

A generalization of the Burg s algorithm to periodically correlated time series A generalizaion of he Burg s algorihm o periodically correlaed ime series Georgi N. Boshnakov Insiue of Mahemaics, Bulgarian Academy of Sciences ABSTRACT In his paper periodically correlaed processes are

More information

ARTICLE IN PRESS Neural Networks ( )

ARTICLE IN PRESS Neural Networks ( ) ARTICE IN PRESS Neural Neworks ( ) Conens liss available a ScienceDirec Neural Neworks journal homepage: www.elsevier.com/locae/neune 2009 Special Issue Neural nework approach o coninuous-ime direc adapive

More information

On the Solutions of First and Second Order Nonlinear Initial Value Problems

On the Solutions of First and Second Order Nonlinear Initial Value Problems Proceedings of he World Congress on Engineering 13 Vol I, WCE 13, July 3-5, 13, London, U.K. On he Soluions of Firs and Second Order Nonlinear Iniial Value Problems Sia Charkri Absrac In his paper, we

More information