Robust Adaptive Markov Decision Processes in Multivehicle Applications


The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters.

Citation: Bertuccelli, L.F., B. Bethke, and J.P. How. "Robust adaptive Markov Decision Processes in multi-vehicle applications." American Control Conference, ACC '09. Copyright 2009 IEEE.
Publisher: Institute of Electrical and Electronics Engineers
Version: Final published version
Accessed: Mon Apr 09 EDT 2018
Terms of Use: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.

2009 American Control Conference
Hyatt Regency Riverfront, St. Louis, MO, USA
June 10-12, 2009

WeB9.4

Robust Adaptive Markov Decision Processes in Multi-vehicle Applications

Luca F. Bertuccelli, Brett Bethke, and Jonathan P. How
Aerospace Controls Laboratory
Massachusetts Institute of Technology
{lucab, bbethke, jhow}@mit.edu

Abstract: This paper presents a new robust and adaptive framework for Markov Decision Processes that accounts for errors in the transition probabilities. Robust policies are typically found off-line, but can be extremely conservative when implemented in the real system. Adaptive policies, on the other hand, are specifically suited for on-line implementation, but may display undesirable transient performance as the model is updated through learning. A new method that exploits the individual strengths of the two approaches is presented in this paper. This robust and adaptive framework protects the adaptation process from exhibiting worst-case performance during the model updating, and is shown to converge to the true, optimal value function in the limit of a large number of state transition observations. The proposed framework is investigated in simulation and actual flight experiments, and shown to improve transient behavior in the adaptation process and overall mission performance.

I. INTRODUCTION

Many decision processes, such as Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs), are modeled as a probabilistic process driven by a known Markov Chain. In practice, however, the true parameters of the Markov Chain are frequently unavailable to the modeler, and many researchers have recently addressed the issue of robust performance in these decision systems [1]-[4]. While many authors have studied the problem of MDPs with uncertain transition probabilities [5]-[7], robust counterparts to these MDPs have been obtained only recently. Robust MDP counterparts have been introduced in the work of Bagnell et al. [8], Nilim [1] and Iyengar [2]. Bagnell presented a robust value iteration algorithm for solving the robust MDPs.
The convergence of robust value iteration was formally proved by Nilim [1] and Iyengar [2]. Both Nilim and Iyengar introduced meaningful uncertainty sets for the transition probabilities that could be efficiently solved by adding an additional, inner optimization on the uncertain transition probabilities. One of the methods for finding a robust policy in [1] was to use scenario-based methods, wherein the performance is optimized for different realizations of the transition probabilities. However, it was recently shown that a scenario-based approach may require an extremely large number of realizations to yield a robust policy [4]. This observation motivated the development of a specific scenario selection process using the first two moments of a Bayesian prior to obtain robust policies using far fewer scenarios [4], [21].

Robust methods find robust policies that hedge against errors in the transition probabilities. However, there are many cases when this type of approach is too conservative. For example, it may be possible to identify the transition probabilities by observing state transitions, obtain improved estimates, and re-solve the optimization to find a less conservative policy. Model-based learning of MDPs is closely related to indirect adaptive control [9] in that the transition probabilities are estimated in real-time using a maximum likelihood estimator. At each time step, certainty equivalence is assumed on the transition probabilities, and a new policy is found with the new model estimate [10]. Jaulmes et al. [11], [12] study this problem in an active estimation context using POMDPs. Marbach [13] considers this problem when the transition probabilities depend on a parameter vector. Konda and Tsitsiklis [14] consider the problem of slowly-varying Markov Chains in the context of reinforcement learning. Sato [15] considers this problem and shows asymptotic convergence of the probability estimates, also in the context of dual control. Kumar [16] also considered the adaptation problem. Ford and Moore [17] consider the problem of estimating the parameters of a non-stationary Hidden Markov Model.
This paper demonstrates the need to account for both robust planning and adaptation in MDPs with uncertainty in their transition probabilities. Just as in control [18] or in task assignment problems [19], adaptation alone is generally not sufficient to ensure reliable operation of the overall control system. This paper shows that robustness is critical to mitigating worst-case performance, particularly during the transient periods of the adaptation. This paper contributes a new combined robust and adaptive problem formulation for MDPs with errors in the transition probabilities. The key result of this paper shows that robust and adaptive MDPs can converge to the truly optimal objective in the limit of a large number of observations. We demonstrate the robust component of this approach by using a Bayesian prior, and find the robust policy by using scenario-based methods. We then augment the robust approach with an adaptation scheme that is more effective at incorporating new information in the models. The MDP framework is discussed in Section II, the impact of uncertainty is demonstrated in Section III, and then we present the individual components of robustness and adaptation in

Section IV. The combined robust and adaptive MDP is shown to converge to the true, optimal value function in the limit of a large number of observations. The paper concludes in Section VI with a set of demonstrative numerical simulations and actual flight results on our UAV testbed.

II. MARKOV DECISION PROCESS

A. Problem Formulation

The Markov Decision Process (MDP) framework that we consider in this paper consists of a set of states i ∈ S of cardinality N, a set of control actions u ∈ U of cardinality M with a corresponding policy µ : S → U, a transition model given by A^u_{ij} = Pr(i_{k+1} = j | i_k = i, u_k = u), and a reward model g(i, u). The time-additive objective function is defined as

    J_µ = g_N(i_N) + ∑_{k=0}^{N−1} φ^k g_k(i_k, u_k)    (1)

where 0 < φ < 1 is an appropriate discount factor. The goal is to find an optimal control policy, µ*, that maximizes the expected objective given some known transition model A^u:

    J* = max_µ E[J_µ(i_0)]    (2)

In an infinite horizon setting (N → ∞), the solution to Eq. 2 can be found by solving the Bellman Equation

    J*(i) = max_u [ g(i) + φ ∑_j A^u_{ij} J*(j) ]    (3)

The optimal control is found by solving

    u*(i) ∈ argmax_{u ∈ U} E[J_µ(i_0)],  ∀ i ∈ S    (4)

The optimal policy can be found in many different ways using Value Iteration or Policy Iteration, while Linear Programming can be used for moderately sized problems [20].

III. MODEL UNCERTAINTY

It has been shown that the value function can be biased in the presence of small errors in the transition probabilities [3], and that the optimal policy µ* can be extremely sensitive to small errors in the model parameters. For example, in the context of UAV missions, it has been shown that errors in the state transition matrix, Ã^u, can result in increased UAV crashes when implemented in real systems [21]. An example of this suboptimal performance is reflected in Figure 1, which shows two summary plots for a 2-UAV persistent surveillance mission formulated as an MDP [22], averaged over 100 Monte Carlo simulations.
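Before turning to the effect of model error, the nominal solution method of Section II can be illustrated concretely. The sketch below runs value iteration (the Bellman backup of Eq. 3) on a small two-state, two-action MDP; the transition model `A`, reward `g`, and discount `phi` are made-up illustrative values, not taken from the paper.

```python
import numpy as np

# Toy MDP: N = 2 states, M = 2 actions. A[u][i][j] = Pr(j | i, u).
# All numbers below are illustrative, not from the paper.
A = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # transition model under action 0
    [[0.5, 0.5], [0.6, 0.4]],   # transition model under action 1
])
g = np.array([[1.0, 0.0],       # g[u][i]: reward for taking action u in state i
              [2.0, 0.0]])
phi = 0.9                        # discount factor, 0 < phi < 1

def value_iteration(A, g, phi, tol=1e-8):
    """Iterate the Bellman backup J(i) = max_u [g(i,u) + phi * sum_j A^u_ij J(j)]."""
    N = A.shape[1]
    J = np.zeros(N)
    while True:
        Q = g + phi * (A @ J)          # Q[u][i], one backup for every (u, i)
        J_new = Q.max(axis=0)          # maximize over actions
        if np.max(np.abs(J_new - J)) < tol:
            return J_new, Q.argmax(axis=0)
        J = J_new

J_star, mu_star = value_iteration(A, g, phi)
```

Policy iteration or linear programming [20] would recover the same fixed point; value iteration is shown here because it is the building block reused by the robust and adaptive variants later in the paper.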
The simulations were performed with modeling errors: the policy was found by using an estimated probability shown on the y-axis ("Modeled"), but implemented on the real system that assumed a nominal probability shown on the x-axis ("Actual"). Figure 1(a) shows the mean number of failed vehicles in the mission. Note that in the region labeled "Risky," the failure rate is increased significantly, such that all the vehicles in the mission are lost due to the modeling error. Figure 1(b) shows the penalty in total coverage time when the transition probability is underestimated (in the area denoted as "Risky" in the figure). In this region, the total coverage time decreases from approximately 40 time steps (out of a 50 time step mission) to only 10 time steps.

Fig. 1. Impact of modeling error on the overall mission effectiveness: (a) total number of failed vehicles; (b) mean coverage time vs. mismatched fuel flow probabilities.

It is of paramount importance to develop precise mathematical descriptions for these errors and use this information to find robust policies. While there are many methods to describe uncertainty sets [1], [2], our approach relies on a Bayesian description of this uncertainty. This choice is primarily motivated by the need to update estimates of these probabilities in real-time in a computationally tractable manner. This approach assumes a prior Dirichlet distribution on each row of the transition matrix, and recursively updates this distribution with observations. The Dirichlet distribution f_D at time k for a row p_k = [p_1, p_2, ..., p_N]^T of the N-dimensional transition model, with positive distribution parameters α(k) = [α_1, α_2, ..., α_N]^T, is defined as

    f_D(p_k | α(k)) = K ∏_{i=1}^{N} p_i^{α_i − 1},   ∑_{i=1}^{N} p_i = 1    (5)
                    = K p_1^{α_1 − 1} ⋯ p_{N−1}^{α_{N−1} − 1} (1 − ∑_{i=1}^{N−1} p_i)^{α_N − 1}

where K is a normalizing factor that ensures the probability distribution integrates to unity. Each p_i is the i-th entry of

the m-th row, that is: p_i = A^u_{mi}, with 0 ≤ p_i ≤ 1 and ∑_i p_i = 1. The primary reason for using the Dirichlet distribution is that the mean p̄ satisfies the requirements of a probability vector (0 ≤ p̄_i ≤ 1 and ∑_i p̄_i = 1) by construction. Furthermore, the parameters α_i can be interpreted as counts, or the number of times that a particular state transition was observed. This enables computationally tractable updates of the distribution based on new observations. The uncertainty set description for the Dirichlet is known as a credibility region, and can be found by Monte Carlo integration.

IV. ADAPTATION AND ROBUSTNESS

This section discusses individual methods for adapting to changes in the transition probabilities, as well as methods for accounting for robustness in the presence of the transition probability uncertainty.

A. Adaptation

It is well known that the Dirichlet distribution is conjugate to the multinomial distribution, implying a measurement update step that can be expressed in closed form using the previously observed counts α(k). The posterior distribution f_D(p_{k+1} | α(k+1)) is given in terms of the prior f_D(p_k | α(k)) as

    f_D(p_{k+1} | α(k+1)) ∝ f_D(p_k | α(k)) f_M(β(k) | p_k)
                          = ∏_{i=1}^{N} p_i^{α_i − 1} ∏_{i=1}^{N} p_i^{β_i}
                          = ∏_{i=1}^{N} p_i^{α_i + β_i − 1}

where f_M(β(k) | p_k) is a multinomial distribution with hyperparameters β(k) = [β_1, ..., β_N]. Each β_j is the total number of transitions observed from state i to a new state j, where

    δ_{i,j} = { 1  if transition i → j is observed
              { 0  otherwise

indicates whether a transition was observed from state i to state j. For the next derivations, we assume that only a single transition can occur per time step, so that β_j = δ_{i,j}. Upon receipt of the observations β(k), the parameters α(k) are updated according to

    α_j(k+1) = α_j(k) + δ_{i,j}    (6)

and the mean can be found by normalizing these parameters. Our recent work [23] has shown that the mean and variance

    p̄_i = α_i / α_0,   α_0 = ∑_j α_j    (7)
    Σ_ii = α_i (α_0 − α_i) / (α_0^2 (α_0 + 1))    (8)

can be equivalently expressed recursively in terms of the previous mean and variance as

    p̄_i(k+1) = p̄_i(k) + Σ_ii(k) (δ_{i,j} − p̄_i(k)) / (p̄_i(k)(1 − p̄_i(k)))
    Σ_ii(k+1) = γ_{k+1} Σ_ii(k)

where γ_{k+1} = p̄_i(k+1)(1 − p̄_i(k+1)) / (p̄_i(k)(1 − p̄_i(k)) + Σ_ii(k)).
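The conjugacy argument above can be checked numerically. The sketch below applies the count update of Eq. (6) to one row of a transition matrix and verifies that the mean-variance recursion reproduces the batch Dirichlet moments of Eqs. (7)-(8); the counts are hypothetical illustrative values.

```python
import numpy as np

# Dirichlet counts for one row of the transition matrix (illustrative values).
alpha = np.array([2.0, 3.0, 5.0])

def dirichlet_moments(alpha):
    """Batch mean (Eq. 7) and diagonal variance (Eq. 8) of a Dirichlet row."""
    a0 = alpha.sum()
    mean = alpha / a0
    var = alpha * (a0 - alpha) / (a0**2 * (a0 + 1.0))
    return mean, var

mean, var = dirichlet_moments(alpha)

# One observed transition into state j: the indicator delta of Eq. (6).
j = 2
delta = np.zeros(3)
delta[j] = 1.0

# Recursive mean-variance update, equivalent to incrementing alpha[j] by one.
p_next = mean + var * (delta - mean) / (mean * (1.0 - mean))
gamma = p_next * (1.0 - p_next) / (mean * (1.0 - mean) + var)
var_next = gamma * var
```

Because the gain var / (mean (1 − mean)) equals 1/(α_0 + 1), the recursion is an exact rewriting of the Bayesian count update, not an approximation.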
Furthermore, it was shown that these mean-variance recursions, just like their count-based equivalents, can be slow in detecting changes if the model is non-stationary. Hence, a modified set of recursions was derived, and the following recursions were shown to provide a much more effective change-detection mechanism:

    p̄_i(k+1) = p̄_i(k) + (1/λ_k) Σ_ii(k) (δ_{i,j} − p̄_i(k)) / (p̄_i(k)(1 − p̄_i(k)))    (9)
    Σ_ii(k+1) = (γ_{k+1} / λ_k) Σ_ii(k)    (10)

The key change was the addition of an effective process noise through the use of a discount factor 0 < λ_k ≤ 1, which allowed for a much faster estimator response [23].

B. Robustness

While an adaptation mechanism is useful to account for changes in the transition probabilities, the estimates of the transition probabilities are only guaranteed to converge in the limit of an infinite number of observations. While in practice the estimates do not require an unbounded number of observations, simply replacing the uncertain model Ã with the best estimate Â may lead to a biased value function [3] and sensitive policies, especially if the estimator has not yet converged to the true parameter A. For the purposes of this paper, the robust counterpart of Eq. (2) is defined as [1], [2]

    J*_R = min_{Ã ∈ 𝒜} max_µ E[J_µ(i_0)]    (11)

Like the nominal problem, the objective function is maximized with respect to the control policy; however, for the robust counterpart, the objective is minimized with respect to the uncertainty set 𝒜. When the uncertainty model 𝒜 is described by a Bayesian prior, scenario-based methods can be used to generate realizations of the transition probability model. This gives rise to a scenario-based robust method which can turn out to be computationally intensive, since the total number of scenarios needs to be large [21].
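Returning to the adaptation step, a scalar sketch of the discounted estimator of Eqs. (9)-(10) is shown below, tracking a two-outcome (nominal/off-nominal) transition probability through an abrupt model change. The clipping of the mean into (0, 1) is a defensive implementation choice, not part of the paper's recursions, and all numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def discounted_update(p, sig, delta, lam):
    """One discounted mean-variance update (Eqs. 9-10 style), scalar case.

    lam < 1 discounts old data (faster response); lam = 1 recovers the
    undiscounted estimator.
    """
    gain = (1.0 / lam) * sig / (p * (1.0 - p))
    p_new = p + gain * (delta - p)
    p_new = min(max(p_new, 1e-3), 1.0 - 1e-3)   # keep a valid probability
    gamma = p_new * (1.0 - p_new) / (p * (1.0 - p) + (1.0 / lam) * sig)
    sig_new = gamma * sig / lam                  # lam < 1 keeps sig bounded away from 0
    return p_new, sig_new

# Track Pr(nominal transition) through an abrupt change at k = 150.
p_true = 0.9
p, sig = 0.5, 0.05          # prior mean and variance (illustrative)
history = []
for k in range(300):
    if k == 150:
        p_true = 0.5         # non-stationary model change
    delta = float(rng.random() < p_true)
    p, sig = discounted_update(p, sig, delta, lam=0.9)
    history.append(p)
```

With a constant λ < 1 the variance settles at a positive floor, so the estimator keeps a nonzero gain and responds quickly to changes; letting λ_k → 1 drives the variance to zero, consistent with the convergence discussion later in the paper.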
This motivated our work [4] which, given a prior Dirichlet distribution on the transition probabilities, deterministically generates samples Y_i of each transition probability row (so-called Dirichlet Sigma Points) using the first two statistical moments, p̄ and Σ, of each row of the transition probability matrix:

    Y_0 = p̄
    Y_i = p̄ + β (Σ^{1/2})_i,    i = 1, ..., N
    Y_i = p̄ − β (Σ^{1/2})_i,    i = N+1, ..., 2N

where β is a tuning parameter that depends on the level of desired conservatism, which in turn depends on the size of the credibility region. Here, (Σ^{1/2})_i denotes the i-th row of the matrix square root of Σ. The uncertainty set 𝒜 contains the deterministic samples Y_i, i ∈ {1, 2, ..., 2N}.
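A sketch of the Dirichlet Sigma Point construction for one uncertain row is given below. It uses the full Dirichlet row covariance (diag(p̄) − p̄ p̄ᵀ)/(α_0 + 1) and an eigendecomposition for the matrix square root; the final clip-and-renormalize step, which keeps each sample a valid probability row, is an implementation choice not specified in the paper, and the counts are illustrative.

```python
import numpy as np

def dirichlet_sigma_points(alpha, beta=3.0):
    """Generate the 2N+1 Dirichlet Sigma Points for one row with counts alpha."""
    a0 = alpha.sum()
    p_bar = alpha / a0
    # Covariance of a Dirichlet row: (diag(p) - p p^T) / (a0 + 1).
    Sigma = (np.diag(p_bar) - np.outer(p_bar, p_bar)) / (a0 + 1.0)
    # Symmetric PSD square root via eigendecomposition.
    w, V = np.linalg.eigh(Sigma)
    S = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T
    pts = [p_bar]
    for i in range(len(alpha)):
        pts.append(p_bar + beta * S[i])   # Y_i,     i = 1..N
        pts.append(p_bar - beta * S[i])   # Y_i,     i = N+1..2N
    # Implementation choice: project each point back onto the simplex so that
    # every sample remains a usable transition probability row.
    pts = [np.clip(y, 1e-9, None) for y in pts]
    return [y / y.sum() for y in pts]

Y = dirichlet_sigma_points(np.array([8.0, 1.0, 1.0]))
```

Each of the 2N+1 points is then treated as one scenario in the robust optimization, so the scenario count grows linearly in N rather than with the number of random realizations.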

V. ROBUST ADAPTATION

There are many choices for replanning efficiently using model-based methods, such as Real Time Dynamic Programming (RTDP) [24], [25]. RTDP assumes that the transition probabilities are unknown, and are continually updated through an agent's actions in the state space. Due to computational considerations, only a single sweep of value iteration is performed at each measurement update. The result of Gullapalli [25] shows that if each state and action are executed infinitely often, then the (asynchronous) value iteration algorithm converges to the true value function. An alternative strategy is to perform synchronous value iteration, by using a bootstrapping approach where the old policy is used as the initial guess for the new policy [26].

In this section, we consider the full robust replanning problem (see Algorithm 1). The two main steps are an adaptation step, where the Dirichlet distributions (or alternatively, the Dirichlet Sigma Points) for each row and action are updated based on the most recent observations, and a robust replan step. For this paper, we use the Dirichlet Sigma Points to find the robust policy by using scenario-based methods, but we note that the following theoretical results apply to any robust value function. While it is appealing to account for both robustness and adaptation, it is critical to demonstrate that the proposed algorithm in fact converges to the true, optimal solution in the limit. We show this next.

A. Convergence

Gullapalli and Barto [25] showed that in an adaptive (but non-robust) setting, an asynchronous version of the Value Iteration algorithm converges to the optimal value function.
Theorem 1: [25] (Convergence of an adaptive, asynchronous value iteration algorithm) For any finite state, finite action MDP with an infinite-horizon discounted performance measure, an indirect adaptive asynchronous value iteration algorithm converges to the optimal value function with probability one if: 1) the conditions for convergence of the non-adaptive algorithm are met; 2) in the limit, every action is executed from every state infinitely often; 3) the estimates of the state transition probabilities remain bounded and converge in the limit to their true values with probability one.

Proof: See [25].

Algorithm 1 Robust Replanning
  Initialize uncertainty model: for example, Dirichlet distribution parameters α
  while Not finished do
    Using a statistically efficient estimator, update the estimates of the transition probabilities (for each row, action), for example using the discounted estimator of Eq. 9:
        p̄_i(k+1) = p̄_i(k) + (1/λ_k) Σ_ii(k) (δ_{i,j} − p̄_i(k)) / (p̄_i(k)(1 − p̄_i(k)))
        Σ_ii(k+1) = (γ_{k+1} / λ_k) Σ_ii(k)
    For each uncertain row of the transition probability matrix, find the robust policy using robust DP:
        min_{𝒜} max_µ E[J_µ]    (12)
    For example, update the Dirichlet Sigma Points (for each row, action),
        Y_0 = p̄
        Y_i = p̄ + β (Σ^{1/2})_i,    i = 1, ..., N    (13)
        Y_i = p̄ − β (Σ^{1/2})_i,    i = N+1, ..., 2N
    and find the new robust policy
        min_{A ∈ 𝒜_Y} max_µ E[J_µ]
    Return
  end while

Using the framework of the above theorem, the robust counterpart to this theorem is stated next.

Theorem 2: (Convergence of a robust adaptive, asynchronous value iteration algorithm) For any finite state, finite action MDP with an infinite-horizon discounted performance measure, the robust, indirect adaptive asynchronous value iteration algorithm

    J_{k+1}(i) = { min_{A_k ∈ 𝒜_k} max_µ E[J_µ]   if i ∈ B_k ⊆ S
                 { J_k(i)                          otherwise        (14)

converges to the optimal value function with probability one if the conditions of Theorem 1 are satisfied, and the uncertainty set 𝒜_k converges to the singleton {Â_k}, in other words, lim_{k→∞} 𝒜_k = {Â_k}. Here B_k denotes the subset of states that are updated at each time step.

Proof: The key difference between this theorem and Theorem 1 is the additional optimization over the uncertainty set 𝒜_k.
However, as additional observations are incurred, and by virtue of the convergent, unbiased estimator, the size of the uncertainty set will decrease to the singleton unbiased estimate {Â_k}. Furthermore, the robust operator given by T := min_{A_k ∈ 𝒜_k} max_µ is a contraction mapping [1], [2]. Using both of these arguments, and since this unbiased estimate will in turn converge to the true value of the transition probability, the robust adaptive asynchronous value iteration algorithm will converge to the true, optimal solution.

Corollary 3: (Convergence of synchronous version) The synchronous version of the robust, adaptive MDP will converge to the true, optimal value function.

Proof: In the event that an entire sweep of the state space occurs at each value iteration, the uncertainty set 𝒜_k will still converge to the singleton {Â_k}.

Remark: (Convergence of robust adaptation with Dirichlet Sigma Points) For the Dirichlet Sigma Points, the discounted estimator of Eq. 9 converges in the limit of a large number of observations (with appropriate choice of λ_k), and the covariance Σ is eventually driven to 0; each of the Dirichlet Sigma Points will therefore collapse to the singleton, the unbiased estimate of the true transition probabilities. This means that the model will have converged, and that the robust solution will in fact have converged to the optimal value function.
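The robust replanning backup used in Algorithm 1 can be sketched as a value iteration that takes, for each state-action pair, the worst case over a finite uncertainty set (for example, the Dirichlet Sigma Points) before maximizing over actions, in the spirit of the rectangular uncertainty models of [1], [2]. The toy transition models below are illustrative, not from the paper.

```python
import numpy as np

def robust_value_iteration(A_set, g, phi, tol=1e-8):
    """Robust DP over a finite scenario set.

    A_set: list of transition models, each of shape (M, N, N); g: (M, N).
    For each (action, state) pair the worst-case scenario is taken before
    maximizing over actions, so the backup remains a phi-contraction.
    """
    N = g.shape[1]
    J = np.zeros(N)
    while True:
        Q = np.min([g + phi * (A @ J) for A in A_set], axis=0)  # worst case
        J_new = Q.max(axis=0)                                   # best action
        if np.max(np.abs(J_new - J)) < tol:
            return J_new, Q.argmax(axis=0)
        J = J_new

# Nominal model and one pessimistic scenario (illustrative values).
A_nom = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.5, 0.5], [0.6, 0.4]]])
A_pess = np.array([[[0.6, 0.4], [0.2, 0.8]],
                   [[0.4, 0.6], [0.6, 0.4]]])
g = np.array([[1.0, 0.0],
              [2.0, 0.0]])
J_rob, mu_rob = robust_value_iteration([A_nom, A_pess], g, 0.9)
```

Because the minimum over a fixed scenario set preserves the discount-factor contraction, the iteration converges; and since the nominal model is one of the scenarios, the robust value can never exceed the nominal one, which is the source of the conservatism that adaptation then reduces as the scenario set collapses.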

VI. NUMERICAL RESULTS

This section presents actual flight demonstrations of the proposed robust and adaptive algorithm on a persistent surveillance mission in the RAVEN testbed [22]. The UAVs are initially located at a base location, which is separated by some (possibly large) distance from the surveillance location. The objective of the problem is to maintain a specified number r of requested UAVs over the surveillance location at all times. The base location is denoted by Y_b, the surveillance location is denoted by Y_s, and a discretized set of intermediate locations is denoted by {Y_0, ..., Y_s}. Vehicles can move between adjacent locations at a rate of one unit per time step. The UAVs have a specified maximum fuel capacity F_max, and we assume that the rate Ḟ_burn at which they burn fuel may vary randomly during the mission: the probability of nominal fuel flow is given by p_nom. This uncertainty in the fuel flow may be attributed to aggressive maneuvering that may be required for short time periods, for example. Thus, the total flight time each vehicle achieves on a given flight is a random variable, and this uncertainty must be accounted for in the problem. If a vehicle runs out of fuel while in flight, it crashes and is lost. The vehicles can refuel (at a rate Ḟ_refuel) by returning to the base location.

In this section, the adaptive replanning was implemented by explicitly accounting for the uncertainty in the probability of nominal fuel flow, p_nom. The replanning architecture updates both the mean and variance of the fuel flow transition probability, which is then passed to the online MDP solver, which computes the robust policy. This robust policy is then passed to the policy executer, which implements the control decision on the system. The Dirichlet Sigma Points were formed using the updated mean and variance

    Y_0 = p̂_nom
    Y_1 = p̂_nom + β σ_p
    Y_2 = p̂_nom − β σ_p

and used to find the robust policy. Using the earlier results, appropriate choices of β could range from 1 to 5, where β ≈ 3 corresponds to a 99% certainty region for the Dirichlet (in this case, the Beta density).
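In this scalar case the row distribution reduces to a Beta density, and the sigma points above reduce the robust planning value to p̂_nom − βσ_p. A minimal sketch with hypothetical counts (a nominal and b off-nominal fuel-flow observations; the numbers are illustrative):

```python
import math

# Beta(a, b) posterior over the nominal fuel-flow probability.
# a = nominal observations, b = off-nominal observations (hypothetical counts).
a, b = 18.0, 2.0
a0 = a + b
p_hat = a / a0                                        # posterior mean
sigma_p = math.sqrt(a * b / (a0**2 * (a0 + 1.0)))     # posterior std. dev.

beta = 3.0                                            # ~99% certainty region
p_robust = max(p_hat - beta * sigma_p, 0.0)           # value used by the planner
```

With β = 3 the planner designs against a fuel-flow probability near the pessimistic edge of the certainty region rather than the mean, so a vehicle is recalled earlier than the raw estimate alone would suggest.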
For this scalar problem, the robust solution of the MDP corresponds to using a value of p̂_nom − β σ_p in place of the nominal probability estimate p̂_nom. Flight experiments were performed for a case where the probability estimate p̂_nom was varied in mid-mission, and three different replanning strategies were compared:
- Adaptive only: The first replan strategy involved only an adaptive strategy, with λ = 0.8, using only the estimate p̂_nom.
- Robust replan, undiscounted adaptation: This replan strategy used the undiscounted mean-variance estimator (λ = 1), and set β = 4 for the Dirichlet Sigma Points.
- Robust replan, discounted adaptation: This replan strategy used the discounted mean-variance estimator (λ = 0.8), and set β = 4 for the Dirichlet Sigma Points.

Fig. 2. Experimental results showing vehicle trajectories (red and blue), and the probability estimate used in the planning (black): (a) fast adaptation (λ = 0.8) with no robustness (β = 0); (b) high robustness (β = 4) but slow adaptation (λ = 1).

In all cases, the vehicle takes off from base, travels through 2 intermediate areas, and then reaches the surveillance location. In the nominal fuel flow setting (losing 1 unit of fuel per time step), the vehicle can safely remain at the surveillance region for 4 time steps, but in the off-nominal fuel flow setting (losing 2 units), the vehicle can remain on surveillance for only 1 time step. The main results are shown in Figure 2, where the transition in p_nom occurred at t = 7 time steps. At this point in time, one of the vehicles is just completing the surveillance and is initiating the return to base to refuel, as the second vehicle is heading to the surveillance area. The key to a successful mission, in the sense of avoiding vehicle crashes, is to ensure that the change is detected sufficiently quickly, and that the planner maintains some level of cautiousness in this estimate by embedding robustness. The successful mission will detect this change rapidly, and leave the UAVs on target for a shorter time.
The result of Figure 2(a) ignores any uncertainty in the estimate but has fast adaptation (since it uses the factor λ = 0.8). However, by not embedding the uncertainty, the estimator, while detecting the change in p_nom quickly, nonetheless allocates the second vehicle to remain at the surveillance region. Consequently, one of the vehicles runs out of fuel

and crashes. At the second cycle of the mission, the second vehicle remains at the surveillance area for only 1 time step.

Fig. 3. Fast adaptation (λ = 0.8) with robustness (β = 4).

The result of Figure 2(b) accounts for uncertainty in the estimate but has slow adaptation (since it uses the factor λ = 1). However, while embedding the uncertainty, the replanning is not done quickly, and for this different reason from the adaptive, non-robust example, one of the vehicles runs out of fuel and crashes. At the second cycle of the mission, the second vehicle remains at the surveillance area for only 1 time step.

Figure 3 shows the robustness and adaptation acting together to cautiously allocate the vehicles while responding quickly to changes in p_nom. The second vehicle is allocated to perform surveillance for only 2 time steps (instead of 3), and safely returns to base with no fuel remaining. At the second cycle, both vehicles stay at the surveillance area for only 1 time step. Hence, the robustness and adaptation together have been able to recover mission efficiency by bringing in their relative strengths: the robustness by accounting for uncertainty in the probability, and the adaptation by quickly responding to changes in the probability.

VII. CONCLUSIONS

This paper has presented a combined robust and adaptive framework that accounts for errors in the transition probabilities. This framework is shown to converge to the true, optimal value function in the limit of a large number of observations. The proposed framework has been verified both in simulation and actual flight experiments, and shown to improve transient behavior in the adaptation process and overall mission performance. Our current work is addressing a more active learning mechanism for the transition probabilities, through the use of exploratory actions specifically taken to reduce the uncertainty in the transition probabilities.

ACKNOWLEDGEMENTS

Research supported by AFOSR grant FA
Our future work will consider the problem of decentralization of the robust adaptive framework across multiple vehicles, specifically addressing the issues of model consensus in a multi-agent system, and the impact of any disagreement on the robust solution.

REFERENCES

[1] A. Nilim and L. El Ghaoui, "Robust Solutions to Markov Decision Problems with Uncertain Transition Matrices," Operations Research, vol. 53, no. 5.
[2] G. Iyengar, "Robust Dynamic Programming," Math. Oper. Res., vol. 30, no. 2.
[3] S. Mannor, D. Simester, P. Sun, and J. Tsitsiklis, "Bias and Variance Approximation in Value Function Estimates," Management Science, vol. 52, no. 2.
[4] L. F. Bertuccelli and J. P. How, "Robust Decision-Making for Uncertain Markov Decision Processes Using Sigma Point Sampling," IEEE American Control Conference.
[5] D. E. Brown and C. C. White, "Methods for reasoning with imprecise probabilities in intelligent decision systems," IEEE Conference on Systems, Man and Cybernetics, 1990.
[6] J. K. Satia and R. E. Lave, "Markovian Decision Processes with Uncertain Transition Probabilities," Operations Research, vol. 21, no. 3, 1973.
[7] C. C. White and H. K. Eldeib, "Markov Decision Processes with Imprecise Transition Probabilities," Operations Research, vol. 42, no. 4, 1994.
[8] A. Bagnell, A. Ng, and J. Schneider, "Solving Uncertain Markov Decision Processes," NIPS, 2001.
[9] K. J. Astrom and B. Wittenmark, Adaptive Control. Boston, MA: Addison-Wesley Longman Publishing Co., Inc., 1994.
[10] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning). The MIT Press, 1998.
[11] R. Jaulmes, J. Pineau, and D. Precup, "Active Learning in Partially Observable Markov Decision Processes," European Conference on Machine Learning (ECML).
[12] R. Jaulmes, J. Pineau, and D. Precup, "Learning in Non-Stationary Partially Observable Markov Decision Processes," ECML Workshop on Reinforcement Learning in Non-Stationary Environments.
[13] P. Marbach, Simulation-based Methods for Markov Decision Processes. PhD thesis, MIT, 1998.
[14] V. Konda and J.
Tsitsiklis, "Linear stochastic approximation driven by slowly varying Markov chains," Systems and Control Letters, vol. 50.
[15] M. Sato, K. Abe, and H. Takeda, "Learning Control of Finite Markov Chains with Unknown Transition Probabilities," IEEE Trans. on Automatic Control, vol. AC-27, no. 2, 1982.
[16] P. R. Kumar and W. Lin, "Simultaneous Identification and Adaptive Control of Unknown Systems over Finite Parameter Sets," IEEE Trans. on Automatic Control, vol. AC-28, no. 1, 1983.
[17] J. Ford and J. Moore, "Adaptive Estimation of HMM Transition Probabilities," IEEE Transactions on Signal Processing, vol. 46, no. 5, 1998.
[18] P. A. Ioannou and J. Sun, Robust Adaptive Control. Prentice-Hall, 1996.
[19] M. Alighanbari and J. P. How, "A Robust Approach to the UAV Task Assignment Problem," International Journal of Robust and Nonlinear Control, vol. 18, no. 2.
[20] M. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley.
[21] L. F. Bertuccelli, Robust Decision-Making with Model Uncertainty in Aerospace Systems. PhD thesis, MIT.
[22] B. Bethke, J. How, and J. Vian, "Group Health Management of UAV Teams With Applications to Persistent Surveillance," IEEE American Control Conference.
[23] L. F. Bertuccelli and J. P. How, "Estimation of Non-Stationary Markov Chain Transition Models," IEEE Conference on Decision and Control.
[24] A. Barto, S. Bradtke, and S. Singh, "Learning to Act using Real-Time Dynamic Programming," Artificial Intelligence, vol. 72, pp. 81-138, 1995.
[25] V. Gullapalli and A. Barto, "Convergence of Indirect Adaptive Asynchronous Value Iteration Algorithms," Advances in NIPS, 1994.
[26] B. Bethke, L. Bertuccelli, and J. P. How, "Experimental Demonstration of MDP-Based Planning with Model Uncertainty," AIAA Guidance Navigation and Control Conference, Aug. 2008.


More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

Chapter - 2. Distribution System Power Flow Analysis

Chapter - 2. Distribution System Power Flow Analysis Chapter - 2 Dstrbuton System Power Flow Analyss CHAPTER - 2 Radal Dstrbuton System Load Flow 2.1 Introducton Load flow s an mportant tool [66] for analyzng electrcal power system network performance. Load

More information

EEL 6266 Power System Operation and Control. Chapter 3 Economic Dispatch Using Dynamic Programming

EEL 6266 Power System Operation and Control. Chapter 3 Economic Dispatch Using Dynamic Programming EEL 6266 Power System Operaton and Control Chapter 3 Economc Dspatch Usng Dynamc Programmng Pecewse Lnear Cost Functons Common practce many utltes prefer to represent ther generator cost functons as sngle-

More information

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs

More information

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of

More information

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1

On an Extension of Stochastic Approximation EM Algorithm for Incomplete Data Problems. Vahid Tadayon 1 On an Extenson of Stochastc Approxmaton EM Algorthm for Incomplete Data Problems Vahd Tadayon Abstract: The Stochastc Approxmaton EM (SAEM algorthm, a varant stochastc approxmaton of EM, s a versatle tool

More information

Report on Image warping

Report on Image warping Report on Image warpng Xuan Ne, Dec. 20, 2004 Ths document summarzed the algorthms of our mage warpng soluton for further study, and there s a detaled descrpton about the mplementaton of these algorthms.

More information

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for U Charts. Dr. Wayne A. Taylor

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for U Charts. Dr. Wayne A. Taylor Taylor Enterprses, Inc. Adjusted Control Lmts for U Charts Copyrght 207 by Taylor Enterprses, Inc., All Rghts Reserved. Adjusted Control Lmts for U Charts Dr. Wayne A. Taylor Abstract: U charts are used

More information

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements CS 750 Machne Learnng Lecture 5 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square CS 750 Machne Learnng Announcements Homework Due on Wednesday before the class Reports: hand n before

More information

Tracking with Kalman Filter

Tracking with Kalman Filter Trackng wth Kalman Flter Scott T. Acton Vrgna Image and Vdeo Analyss (VIVA), Charles L. Brown Department of Electrcal and Computer Engneerng Department of Bomedcal Engneerng Unversty of Vrgna, Charlottesvlle,

More information

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2) 1/16 MATH 829: Introducton to Data Mnng and Analyss The EM algorthm (part 2) Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 20, 2016 Recall 2/16 We are gven ndependent observatons

More information

Bayesian predictive Configural Frequency Analysis

Bayesian predictive Configural Frequency Analysis Psychologcal Test and Assessment Modelng, Volume 54, 2012 (3), 285-292 Bayesan predctve Confgural Frequency Analyss Eduardo Gutérrez-Peña 1 Abstract Confgural Frequency Analyss s a method for cell-wse

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

8. Modelling Uncertainty

8. Modelling Uncertainty 8. Modellng Uncertanty. Introducton. Generatng Values From Known Probablty Dstrbutons. Monte Carlo Smulaton 4. Chance Constraned Models 5 5. Markov Processes and Transton Probabltes 6 6. Stochastc Optmzaton

More information

MMA and GCMMA two methods for nonlinear optimization

MMA and GCMMA two methods for nonlinear optimization MMA and GCMMA two methods for nonlnear optmzaton Krster Svanberg Optmzaton and Systems Theory, KTH, Stockholm, Sweden. krlle@math.kth.se Ths note descrbes the algorthms used n the author s 2007 mplementatons

More information

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results.

For now, let us focus on a specific model of neurons. These are simplified from reality but can achieve remarkable results. Neural Networks : Dervaton compled by Alvn Wan from Professor Jtendra Malk s lecture Ths type of computaton s called deep learnng and s the most popular method for many problems, such as computer vson

More information

Hidden Markov Models

Hidden Markov Models CM229S: Machne Learnng for Bonformatcs Lecture 12-05/05/2016 Hdden Markov Models Lecturer: Srram Sankararaman Scrbe: Akshay Dattatray Shnde Edted by: TBD 1 Introducton For a drected graph G we can wrte

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS

A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS HCMC Unversty of Pedagogy Thong Nguyen Huu et al. A PROBABILITY-DRIVEN SEARCH ALGORITHM FOR SOLVING MULTI-OBJECTIVE OPTIMIZATION PROBLEMS Thong Nguyen Huu and Hao Tran Van Department of mathematcs-nformaton,

More information

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals Smultaneous Optmzaton of Berth Allocaton, Quay Crane Assgnment and Quay Crane Schedulng Problems n Contaner Termnals Necat Aras, Yavuz Türkoğulları, Z. Caner Taşkın, Kuban Altınel Abstract In ths work,

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

Outline. Communication. Bellman Ford Algorithm. Bellman Ford Example. Bellman Ford Shortest Path [1]

Outline. Communication. Bellman Ford Algorithm. Bellman Ford Example. Bellman Ford Shortest Path [1] DYNAMIC SHORTEST PATH SEARCH AND SYNCHRONIZED TASK SWITCHING Jay Wagenpfel, Adran Trachte 2 Outlne Shortest Communcaton Path Searchng Bellmann Ford algorthm Algorthm for dynamc case Modfcatons to our algorthm

More information

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016

CS : Algorithms and Uncertainty Lecture 17 Date: October 26, 2016 CS 29-128: Algorthms and Uncertanty Lecture 17 Date: October 26, 2016 Instructor: Nkhl Bansal Scrbe: Mchael Denns 1 Introducton In ths lecture we wll be lookng nto the secretary problem, and an nterestng

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Chapter Newton s Method

Chapter Newton s Method Chapter 9. Newton s Method After readng ths chapter, you should be able to:. Understand how Newton s method s dfferent from the Golden Secton Search method. Understand how Newton s method works 3. Solve

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Time-Varying Systems and Computations Lecture 6

Time-Varying Systems and Computations Lecture 6 Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy

More information

A linear imaging system with white additive Gaussian noise on the observed data is modeled as follows:

A linear imaging system with white additive Gaussian noise on the observed data is modeled as follows: Supplementary Note Mathematcal bacground A lnear magng system wth whte addtve Gaussan nose on the observed data s modeled as follows: X = R ϕ V + G, () where X R are the expermental, two-dmensonal proecton

More information

Week 5: Neural Networks

Week 5: Neural Networks Week 5: Neural Networks Instructor: Sergey Levne Neural Networks Summary In the prevous lecture, we saw how we can construct neural networks by extendng logstc regresson. Neural networks consst of multple

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

Lecture 14: Bandits with Budget Constraints

Lecture 14: Bandits with Budget Constraints IEOR 8100-001: Learnng and Optmzaton for Sequental Decson Makng 03/07/16 Lecture 14: andts wth udget Constrants Instructor: Shpra Agrawal Scrbed by: Zhpeng Lu 1 Problem defnton In the regular Mult-armed

More information

Supporting Information

Supporting Information Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to

More information

1 Convex Optimization

1 Convex Optimization Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,

More information

Expectation Maximization Mixture Models HMMs

Expectation Maximization Mixture Models HMMs -755 Machne Learnng for Sgnal Processng Mture Models HMMs Class 9. 2 Sep 200 Learnng Dstrbutons for Data Problem: Gven a collecton of eamples from some data, estmate ts dstrbuton Basc deas of Mamum Lelhood

More information

Portfolios with Trading Constraints and Payout Restrictions

Portfolios with Trading Constraints and Payout Restrictions Portfolos wth Tradng Constrants and Payout Restrctons John R. Brge Northwestern Unversty (ont wor wth Chrs Donohue Xaodong Xu and Gongyun Zhao) 1 General Problem (Very) long-term nvestor (eample: unversty

More information

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering / Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Probability Theory (revisited)

Probability Theory (revisited) Probablty Theory (revsted) Summary Probablty v.s. plausblty Random varables Smulaton of Random Experments Challenge The alarm of a shop rang. Soon afterwards, a man was seen runnng n the street, persecuted

More information

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor Taylor Enterprses, Inc. Control Lmts for P Charts Copyrght 2017 by Taylor Enterprses, Inc., All Rghts Reserved. Control Lmts for P Charts Dr. Wayne A. Taylor Abstract: P charts are used for count data

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors

Stat260: Bayesian Modeling and Inference Lecture Date: February 22, Reference Priors Stat60: Bayesan Modelng and Inference Lecture Date: February, 00 Reference Prors Lecturer: Mchael I. Jordan Scrbe: Steven Troxler and Wayne Lee In ths lecture, we assume that θ R; n hgher-dmensons, reference

More information

Singular Value Decomposition: Theory and Applications

Singular Value Decomposition: Theory and Applications Sngular Value Decomposton: Theory and Applcatons Danel Khashab Sprng 2015 Last Update: March 2, 2015 1 Introducton A = UDV where columns of U and V are orthonormal and matrx D s dagonal wth postve real

More information

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata TC XVII IMEKO World Congress Metrology n the 3rd Mllennum June 7, 3,

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 16

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 16 STAT 39: MATHEMATICAL COMPUTATIONS I FALL 218 LECTURE 16 1 why teratve methods f we have a lnear system Ax = b where A s very, very large but s ether sparse or structured (eg, banded, Toepltz, banded plus

More information

Appendix B. The Finite Difference Scheme

Appendix B. The Finite Difference Scheme 140 APPENDIXES Appendx B. The Fnte Dfference Scheme In ths appendx we present numercal technques whch are used to approxmate solutons of system 3.1 3.3. A comprehensve treatment of theoretcal and mplementaton

More information

DUE: WEDS FEB 21ST 2018

DUE: WEDS FEB 21ST 2018 HOMEWORK # 1: FINITE DIFFERENCES IN ONE DIMENSION DUE: WEDS FEB 21ST 2018 1. Theory Beam bendng s a classcal engneerng analyss. The tradtonal soluton technque makes smplfyng assumptons such as a constant

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

Reinforcement learning

Reinforcement learning Renforcement learnng Nathanel Daw Gatsby Computatonal Neuroscence Unt daw @ gatsby.ucl.ac.uk http://www.gatsby.ucl.ac.uk/~daw Mostly adapted from Andrew Moore s tutorals, copyrght 2002, 2004 by Andrew

More information

Transfer Functions. Convenient representation of a linear, dynamic model. A transfer function (TF) relates one input and one output: ( ) system

Transfer Functions. Convenient representation of a linear, dynamic model. A transfer function (TF) relates one input and one output: ( ) system Transfer Functons Convenent representaton of a lnear, dynamc model. A transfer functon (TF) relates one nput and one output: x t X s y t system Y s The followng termnology s used: x y nput output forcng

More information

Lecture 4. Instructor: Haipeng Luo

Lecture 4. Instructor: Haipeng Luo Lecture 4 Instructor: Hapeng Luo In the followng lectures, we focus on the expert problem and study more adaptve algorthms. Although Hedge s proven to be worst-case optmal, one may wonder how well t would

More information

COS 521: Advanced Algorithms Game Theory and Linear Programming

COS 521: Advanced Algorithms Game Theory and Linear Programming COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton

More information

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty Addtonal Codes usng Fnte Dfference Method Benamn Moll 1 HJB Equaton for Consumpton-Savng Problem Wthout Uncertanty Before consderng the case wth stochastc ncome n http://www.prnceton.edu/~moll/ HACTproect/HACT_Numercal_Appendx.pdf,

More information

A New Evolutionary Computation Based Approach for Learning Bayesian Network

A New Evolutionary Computation Based Approach for Learning Bayesian Network Avalable onlne at www.scencedrect.com Proceda Engneerng 15 (2011) 4026 4030 Advanced n Control Engneerng and Informaton Scence A New Evolutonary Computaton Based Approach for Learnng Bayesan Network Yungang

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Convergence of random processes

Convergence of random processes DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large

More information

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models

More information

Ryan (2009)- regulating a concentrated industry (cement) Firms play Cournot in the stage. Make lumpy investment decisions

Ryan (2009)- regulating a concentrated industry (cement) Firms play Cournot in the stage. Make lumpy investment decisions 1 Motvaton Next we consder dynamc games where the choce varables are contnuous and/or dscrete. Example 1: Ryan (2009)- regulatng a concentrated ndustry (cement) Frms play Cournot n the stage Make lumpy

More information

An Integrated Asset Allocation and Path Planning Method to to Search for a Moving Target in in a Dynamic Environment

An Integrated Asset Allocation and Path Planning Method to to Search for a Moving Target in in a Dynamic Environment An Integrated Asset Allocaton and Path Plannng Method to to Search for a Movng Target n n a Dynamc Envronment Woosun An Mansha Mshra Chulwoo Park Prof. Krshna R. Pattpat Dept. of Electrcal and Computer

More information

k t+1 + c t A t k t, t=0

k t+1 + c t A t k t, t=0 Macro II (UC3M, MA/PhD Econ) Professor: Matthas Kredler Fnal Exam 6 May 208 You have 50 mnutes to complete the exam There are 80 ponts n total The exam has 4 pages If somethng n the queston s unclear,

More information

STATS 306B: Unsupervised Learning Spring Lecture 10 April 30

STATS 306B: Unsupervised Learning Spring Lecture 10 April 30 STATS 306B: Unsupervsed Learnng Sprng 2014 Lecture 10 Aprl 30 Lecturer: Lester Mackey Scrbe: Joey Arthur, Rakesh Achanta 10.1 Factor Analyss 10.1.1 Recap Recall the factor analyss (FA) model for lnear

More information

Motion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong

Motion Perception Under Uncertainty. Hongjing Lu Department of Psychology University of Hong Kong Moton Percepton Under Uncertanty Hongjng Lu Department of Psychology Unversty of Hong Kong Outlne Uncertanty n moton stmulus Correspondence problem Qualtatve fttng usng deal observer models Based on sgnal

More information

Hongyi Miao, College of Science, Nanjing Forestry University, Nanjing ,China. (Received 20 June 2013, accepted 11 March 2014) I)ϕ (k)

Hongyi Miao, College of Science, Nanjing Forestry University, Nanjing ,China. (Received 20 June 2013, accepted 11 March 2014) I)ϕ (k) ISSN 1749-3889 (prnt), 1749-3897 (onlne) Internatonal Journal of Nonlnear Scence Vol.17(2014) No.2,pp.188-192 Modfed Block Jacob-Davdson Method for Solvng Large Sparse Egenproblems Hongy Mao, College of

More information

MAXIMUM A POSTERIORI TRANSDUCTION

MAXIMUM A POSTERIORI TRANSDUCTION MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,

More information

Hidden Markov Models

Hidden Markov Models Hdden Markov Models Namrata Vaswan, Iowa State Unversty Aprl 24, 204 Hdden Markov Model Defntons and Examples Defntons:. A hdden Markov model (HMM) refers to a set of hdden states X 0, X,..., X t,...,

More information

Introduction to Hidden Markov Models

Introduction to Hidden Markov Models Introducton to Hdden Markov Models Alperen Degrmenc Ths document contans dervatons and algorthms for mplementng Hdden Markov Models. The content presented here s a collecton of my notes and personal nsghts

More information

A Hybrid Variational Iteration Method for Blasius Equation

A Hybrid Variational Iteration Method for Blasius Equation Avalable at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 1932-9466 Vol. 10, Issue 1 (June 2015), pp. 223-229 Applcatons and Appled Mathematcs: An Internatonal Journal (AAM) A Hybrd Varatonal Iteraton Method

More information

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth

More information

Classification as a Regression Problem

Classification as a Regression Problem Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class

More information

Grover s Algorithm + Quantum Zeno Effect + Vaidman

Grover s Algorithm + Quantum Zeno Effect + Vaidman Grover s Algorthm + Quantum Zeno Effect + Vadman CS 294-2 Bomb 10/12/04 Fall 2004 Lecture 11 Grover s algorthm Recall that Grover s algorthm for searchng over a space of sze wors as follows: consder the

More information

Errors for Linear Systems

Errors for Linear Systems Errors for Lnear Systems When we solve a lnear system Ax b we often do not know A and b exactly, but have only approxmatons  and ˆb avalable. Then the best thng we can do s to solve ˆx ˆb exactly whch

More information