Efficient POMDP Forward Search by Predicting the Posterior Belief Distribution Ruijie He and Nicholas Roy

Size: px
Start display at page:

Download "Efficient POMDP Forward Search by Predicting the Posterior Belief Distribution Ruijie He and Nicholas Roy"

Transcription

1 Compuer Science and Arificial Inelligence Laboraory Technical Repor MIT-CSAIL-TR Sepember 23, 2009 Efficien POMDP Forward Search by Predicing he Poserior Belief Disribuion Ruijie He and Nicholas Roy massachuses insiue of echnology, cambridge, ma usa

2 Efficien POMDP Forward Search by Predicing he Poserior Belief Disribuion Ruijie He Compuer Science and Arificial Inelligence Laboraory Massachuses Insiue of Technology Cambridge, MA Nicholas Roy Compuer Science and Arificial Inelligence Laboraory Massachuses Insiue of Technology Cambridge, MA Absrac Online, forward-search echniques have demonsraed promising resuls for solving problems in parially observable environmens. These echniques depend on he abiliy o efficienly search and evaluae he se of beliefs reachable from he curren belief. However, enumeraing or sampling acion-observaion sequences o compue he reachable beliefs is compuaionally demanding; coupled wih he need o saisfy real-ime consrains, exising online solvers can only search o a limied deph. In his paper, we propose ha policies can be generaed direcly from he disribuion of he agen s poserior belief. When he underlying sae disribuion is Gaussian, and he observaion funcion is an exponenial family disribuion, we can calculae his disribuion of beliefs wihou enumeraing he possible observaions. This propery no only enables us o plan in problems wih large observaion spaces, bu also allows us o search deeper by considering policies composed of muli-sep acion sequences. We presen he Poserior Belief Disribuion (PBD) algorihm, an efficien forward-search POMDP planner for coninuous domains, demonsraing ha beer policies are generaed when we can perform deeper forward search. 1 Inroducion The Parially Observable Markov Decision Process (POMDP) is a general framework for sequenial decision making in parially observable environmens, when he agen is unable o exacly observe he sae of is environmen. Tradiionally, a POMDP solver generaes a policy offline, compuing an acion for a se of possible beliefs before policy execuion. However, for problems wih large domains, offline mehods can incur significan compuaion coss. Recenly, online forward-search mehods have demonsraed promising resuls in problems wih large domains (see [17] for a review), suggesing ha POMDP planning can be performed efficienly by only considering he belief saes ha are reachable from he agen s curren belief. If a POMDP solver is able o search deep enough, i will find he opimal policy for he curren belief [6, 14]. Unforunaely, he number of belief saes reachable wihin deph D is ( A Z ) D, where A and Z are he sizes of he acion and observaion ses. No only does he search quickly become inracable as D increases, bu online echniques generally have o mee real-ime consrains, which limis he planning ime available for each ieraion. Exising online, forward search 1

3 algorihms seek o reduce he number of possible observaions ha have o be explored by using branch-and-bound [10], Mone Carlo sampling [9] and heurisic search [15, 21] echniques. Fundamenally, hese algorihms sill branch on he possible individual acions and observaions o deermine he se of reachable poserior beliefs. An alernaive approach would be o consider shallow policies composed of muli-sep acion sequences, or macro-acions [20], branching only a he end of each acion sequence. However, o plan wih muli-sep acion sequences, an algorihm mus have he abiliy o deermine he se of poserior beliefs ha could resul afer he acion sequence, since he goal of a POMDP solver is o generae he policy ha maximizes he agen s expeced discouned reward. This se of beliefs is usually compued by enumeraing or sampling from he se of observaion sequences, which is iself a cosly process and reduces he poenial savings of macroacions. If i were possible o efficienly characerize he disribuion of poserior beliefs afer an acion sequence wihou enumeraing he possible observaions, forward search POMDP planning could hen be done much more efficienly. If he disribuion over poserior beliefs can be compued efficienly and is of a low dimension, hen sampling from his disribuion requires subsanially fewer samples and much less compuaion, allowing much faser search and efficien planning in POMDP problems wih large observaion spaces. In his paper, we demonsrae ha when he agen s belief and observaion models can be represened in parameric form, he disribuion of he agen s poserior beliefs can be direcly compued for a muli-sep acion sequence. Parameric represenaions have previously been proposed [2, 3, 12, 16] as an alernaive for compacly represening high-dimensional belief spaces, and are especially valuable for POMDP problems wih coninuous sae spaces. Specifically, we focus on problems where he agen s belief is reasonably represened as a Gaussian disribuion over a coninuous sae space, and where he ransiion and observaion models belong o any member of he exponenial family of disribuions, such as he linear-gaussian or mulinomial disribuions ha ofen characerize POMDP problems. By also consraining he agen s poserior belief o he Gaussian parameric represenaion, we can direcly compue how he sufficien saisics of he agen s belief are expeced o evolve over muli-sep acion sequences. Furhermore, for Gaussian disribuions, we will see ha he second momen of he belief disribuion (i.e., he covariance) can be compued in amorized O(1) for a given muli-sep acion sequence, increasing he efficiency of he search process. We presen he Poserior Belief Disribuion (PBD) algorihm, an efficien, POMDP forward search algorihm ha can perform much deeper forward searches. 2 POMDPs Formally, a POMDP consiss of a se of saes S, a se of acions A, and a se of observaions Z. I also includes a sae-ransiion model p(s s, a), an observaion model p(z a, s ), a reward model r S (s, a), as well as a discoun facor γ and iniial belief bel 0. The goal of a POMDP solver is o compue a policy π mapping beliefs o acions π : bel a ha will maximize he agen s expeced oal reward over is lifeime. Given a policy π and curren belief bel, he agen akes an acion a = π(bel) and obains an observaion z. I hen updaes is belief according o bel (s ) = τ(bel, a, z) = η p(z a, s ) p(s s, a)bel(s)ds (1) where τ(bel, a, z) represens he belief updae funcion and η is a normalizaion consan. Each policy π is also associaed wih a value funcion V π : bel R, specifying he expeced oal reward of execuing policy π saring from bel V π (bel) = max a A s S [ rb (bel, a) + γ p(z bel, a)v π (τ(bel, a, z)) ] (2) z Z where he funcion r B (bel, a) = s S bel(s)r S(s, a)ds specifies he immediae expeced reward of execuing acion a in belief bel. A POMDP solver seeks o find he opimal policy π (bel 0 ) ha maximizes V π (bel 0 ). Tradiionally, policies have been compued offline, and one class of POMDP solvers ha has achieved paricular success is he poin-based mehods. For discree sae spaces, poin-based approaches such as PBVI [11] and HSVI [18] leverage he piecewise-linear and convex (PWLC) aspecs of he value funcion [19] o obain lower bounds on he value funcion (Eqn. 2), performing 2

4 value updaes only a seleced belief saes. The value funcion has similarly been shown [12] o be PWLC for coninuous sae spaces. 3 Forward Search in Parameric Space Fig. 1: A forward search ree. An acion is chosen a each belief node (OR-node), while all observaions mus be considered a he acion nodes (AND-node) Raher han compuing a policy for every possible belief sae, forward search echniques avoid he compuaional complexiy of full policy compuaion by direcing compuaional effor only owards belief saes ha are reachable from he curren belief under differen acions. These echniques alernae beween a planning and execuion phase, planning online only for he belief a he curren imesep. During he planning phase, a forward search algorihm creaes an AND-OR ree (Fig. 3) of reachable belief saes from he curren belief sae. The ree is expanded using acion-observaion pairs ha are admissible from he curren belief, and he beliefs a he leaf nodes are found using Eqn. 1. By using a value heurisic [17] ha esimaes he value a he fringe nodes, he expeced value of execuing a policy from he curren belief can be propagaed up he ree o he roo node (Eqn. 2). To obain he se of reachable beliefs, exising forward search algorihms branch on he possible acions and observaions a each successive deph. Unforunaely, he branching facor for reasonably large discree or coninuous acion and observaion ses severely limis he maximum search deph achievable in real-ime. Even if we resric our acion space o a se of macro-acions, and compue he expeced reward of each macro-acion by sampling observaion sequences of corresponding lengh [20], he size of he observaion space and sampling complexiy will grow exponenially wih he lengh of he acion sequence. For a paricular macro-acion, he probabiliy of he agen obaining an observaion sequence is equivalen o he probabiliy of obaining he poserior belief associaed wih ha observaion sequence. Seen from anoher angle, every macro-acion generaes a disribuion over beliefs, or a disribuion of disribuions. If we are able o calculae he disribuion over poserior beliefs for every acion sequence, and branch a he end of he acion sequence by sampling poserior beliefs wihin his disribuion, he sampling complexiy is hen independen of he macro-acion lengh. Furhermore, he expeced reward of an acion sequence can hen be compued by finding he expeced rewards wih respec o ha disribuion, raher han by sampling he possible observaions. In his paper, we focus on problems where he agen s belief bel = N(µ, Σ) is normally disribued over he sae space, and he observaions are drawn from an exponenial family disribuion [1]. Wihou loss of generaliy, only he observaions are modeled as an exponenial family disribuion here, hough he same analysis could be applied o he sae-ransiion model. These model assumpions imply ha he poserior belief is no sricly Gaussian, since he Gaussian disribuion is no a conjugae prior for generic exponenial family observaion Fig. 2: Disribuion of poserior beliefs. a) A Gaussian poserior belief resuls afer incorporaing an observaion sequence. b) Over all possible observaion sequences, he disribuion of poserior means is a Gaussian (black line), and for each poserior mean, a Gaussian (blue curve) describes he agen s poserior belief. models. We neverheless assume ha he agen s poserior belief remains Gaussian, and show in Secion 4 ha he disribuion over poserior beliefs is iself a Gaussian over Gaussian beliefs (Fig. 3). We will show ha all poserior beliefs in his disribuion have he same covariance, and he poserior means are normally disribued over he coninuous sae space. Given an acion sequence, he poserior disribuion over beliefs is herefore a join disribuion over he poserior means and 3

5 he corresponding disribuion over saes. We can hen evaluae he expeced reward of an acion sequence by performing Mone Carlo inegraion over his disribuion of disribuions. 4 Gaussian Poserior Predicion Our sae-ransiion and observaion models can be represened as follows: s = A s 1 + B u + ε, s 1 N(µ 1, Σ 1 ), ε N(0, R ) (3) p(z θ ) = exp(z T θ b (θ ) + κ (z )) (4) We assume ha our sae-ransiion model is linear-gaussian, and A and B are he linear ransiion marices. θ and b (θ ) are respecively he canonical parameer and normalizaion facor of he exponenial family disribuion ha generaes he observaion. The exponenial family encompasses a large se of parameric disribuions, including he Gaussian disribuion. When he sae-ransiion and observaion models are normally disribued and linear funcions of he sae, he Kalman filer provides a closed-form soluion for he poserior belief (µ, Σ ), given a prior belief (µ 1, Σ 1 ), µ = A µ 1 + B u µ = µ + K (z C µ ) (5) Σ = A Σ 1 A T + R Σ = (C T Q 1 C + Σ 1 ) 1, (6) where C is he observaion marix, R and Q are he covariances of he Gaussian process and measuremen noise respecively, and K is he Kalman gain. µ and Σ are he mean and covariance afer an acion is aken bu before incorporaing he measuremen. Eqn. 5 and 6 show ha for problems wih linear-gaussian sae-ransiion and observaion models, he covariance updae is independen of he observaion obained. This is because he Fisher informaion associaed wih he observaion model, M = C TQ 1 C, is dependen only on he observaion model parameers, raher han he observaion obained [4]. For linear-gaussian models, M is also consan across he enire sae space. Unforunaely, he linear-gaussian assumpion is highly resricive, and mos POMDP models have observaion funcions ha are non-gaussian. A more general form of he Kalman filer updae exiss, which allows for a closed-form soluion of he poserior belief for problems wih observaion models ha belong o a larger class of parameric disribuions, he exponenial family. Building on saisical economics research for ime-series analysis of non-gaussian observaions [5], a dynamic generalized linear model [22] has been shown o provide he exponenial family equivalen of he Kalman filer (efkf). The key idea is o consruc linear-gaussian models which approximae he non-gaussian exponenial family model in he neighborhood of he condiional mode, s z. The approximae linear-gaussian observaion model can hen be used in a radiional Kalman filer. Since his idea was developed elsewhere, he derivaion of he filer is presened as an Appendix, and we presen he main equaions here. Consrucing he approximae linear-gaussian model requires compuaion of he firs wo momens of he disribuion and linearizing abou he mean esimae a every imesep. For an exponenial family observaion model, he firs wo momens of he disribuion [22] are, E(z θ ) = ḃ = b (θ ) θ θ=w(µ ) V ar(z θ ) = b = 2 b (θ ) θ θ T θ=w(µ ) θ = W(s ), where ḃ and b are he derivaives of he exponenial family disribuion s normalizaion facor, boh linearized abou θ = W(µ ). W(.) is he canonical link funcion, which maps he saes o canonical parameer values, and depends on he paricular member of he exponenial family. Given an acion-observaion sequence, he poserior mean of he agen s belief in he efkf can hen be updaed according o µ = A µ 1 + B u µ = µ + K ( z W(µ )), (8) Σ = A Σ 1 A T + R Σ = (Σ 1 (7) + Y b Y T ) 1, (9) 4

6 b 1 where K = Σ Y (Y Σ Y + ) 1 is he efkf Kalman gain, and z = θ (ḃ z ) is he projecion of he observaion ono he parameer space of he exponenial family observaion model. Y = θ s=µ s is he gradien of he exponenial family disribuion s canonical parameer, linearized abou µ. While our relaxaion of he observaion model o he exponenial family necessarily implies ha he Gaussian poserior belief is an approximaion of he rue poserior, he Gaussian assumpion does allow he use of a Kalman filer varian, which in urn allows us o approximae our disribuion of poserior beliefs efficienly and in closed-form. In he following subsecions, we ake advanage of he efkf o compue he disribuion of he poserior beliefs afer a muli-sep acion sequence. Since our belief is assumed o be Gaussian, expressing a disribuion over he poserior beliefs requires us o have a disribuion over he poserior means and covariances. 4.1 Predicion of Poserior Mean Disribuion Eqn. 8 reveals ha he poserior mean µ direcly depends on he observaion z. Neverheless, we demonsrae in his secion ha given a curren prior belief afer an acion, bel N(µ, Σ ), he expeced disribuion of he poserior means p(µ µ ) is normally disribued abou µ. Given he wo momens of he exponenial family observaion model, we can represen he condiional disribuion z s according o b 1 z s N(W(s ), b 1 ) (10) N(W(µ ) + Y (s µ ), b 1 ) (11) We can hen marginalize ou s using p( z µ ) = p( z s, µ )p(s µ )ds and using linear ransformaions, z µ N(W(µ ), Y Σ Y T µ + K ( z W(µ )) µ N(µ, K (Y Σ Y T 1 + b + ) (12) 1 b ) K T ) (13) µ µ N(µ, Σ Y KT ) (14) Eqn. 14 indicaes ha he poserior mean afer a measuremen updae µ is normally disribued abou he µ, wih a covariance ha depends on he prior covariance Σ and he observaion model parameers Y and b. The observaion model parameers are linearized abou he prior mean µ ; hence, for an acion-observaion sequence of lengh 1, he parameers are independen of he observaion ha will be obained. To obain he poserior mean disribuion afer a muli-sep acion sequence updae, we firs combine he process and measuremen updaes by marginalizing ou µ using p(µ µ 1 ) = p(µ µ )p(µ µ 1 )dµ, obaining µ µ 1 N(A µ 1 + B u, Σ Y KT ) (15) We assumed above ha for a one-sep belief updae, µ 1 is a fixed value. For a muli-sep updae, he mean is a random variable, i.e. µ 1 N(m 1, S 1 ). We can hen marginalize ou µ 1 o obain µ N(A m 1 + B u, S 1 + Σ Y KT ) (16) Equaion 16 can now be used o perform a predicion of he poserior mean disribuion afer a mulisep acion sequence. Assuming ha he agen is currenly a ime and has a paricular prior mean µ 1 N(µ 1, 0), he poserior mean afer an acion sequence of T imeseps is herefore +T µ +T N(f(µ 1, A :+T, B :+T, u :+T ), Σ i Y i KT i ) (17) where f(µ 1, A +1:+T, B +1:+T, u +1:+T ) is he deerminisic ransformaion of he means according o µ +k = A +k µ +k 1 + B +k u +k. Since an observaion on is own does no shif he mean value m +k of he disribuion of poserior means, m +k is dependen only on he saeransiion model parameers and can be calculaed via a recursive updae along he acion sequence. i= 5

7 4.2 Single-sep Predicion of Covariance Eqn. 9 dicaes how he poserior covariance of he agen s belief can be calculaed, afer an acion is aken and an observaion is obained. Given ha he Fisher informaion associaed wih he observaion model M = Y b Y T is independen of he observaions, he poserior covariance can be compued in closed form, and is independen of he poserior mean. For greaer efficiency, i has been shown [13] ha for a Kalman filer model, he process and measuremen updaes can be compiled ino a single linear ransfer funcion by using a facored represenaion of he covariance, Σ 1 = B 1 C 1 1. Adaping his resul o our model, he complee covariance updae sep can be wrien as: [ [ ] [ ] [ B 0 I 0 A T B Ψ = = C] I Y by A RA T, (18) C] 1 which enables us o collapse muli-sep covariance updaes ino a single sep Ψ +1:T = T i=1 Ψ i, recovering he poserior covariance Σ +T from Σ +T = B +T C 1 +T. Once Ψ +1:T is consruced for a given acion sequence, i can be reused for he same acion sequence a fuure beliefs. Calculaing he Fisher informaion associaed wih he observaion model M again requires us o perform linearizaions abou he mean of he agen s prior belief µ 1. This implies ha he covariance updae afer a muli-sep acion sequence depends on he poserior mean afer each imesep. Neverheless, as shown in Secion 4.1, he agen s expecaion of he poserior mean afer a measuremen updae µ +1 is a normal disribuion around he µ. Hence, when considering he effec of aking an acion sequence from he agen s curren belief, we perform our linearizaions abou µ, he prior mean a each sep along he acion sequence. 5 Poserior Belief Disribuion (PBD) Algorihm Secion 4 demonsraes ha, given he agen s curren belief (µ, Σ ), he parameers describing he disribuion of poserior beliefs can be compued wihou enumeraing he observaions. In paricular, he poserior covariance Σ +T can be prediced via a ransfer funcion (Eqn. 18), while he poserior mean µ +T is normally disribued according o Eqn. 17. These resuls enable us o consider a subse of admissible acion sequences and obain he associaed disribuion of poserior beliefs wihou enumeraing he corresponding observaion sequences. We now presen an algorihm, he Poserior Disribuion Predicion (PBD) algorihm, ha akes advanage of he resuls o perform forward search o much greaer dephs. The PBD algorihm, shown in Algs. 1 and 2, builds upon a generic online forward search for POMDPs [17], alernaing beween a planning and execuion phase a every ieraion. During he planning phase, he nex bes acion sequence â seq is chosen, and he firs acion a of his acion sequence is execued. The agen hen updaes is curren belief according o he observaion obained, and he cycle repeas. During he planning phase, he EXPAND() subrouine is execued recursively in a deph-firs search manner. The agen firs samples a number of muli-sep acion sequences ha i can execue from is curren belief. The choice of hese acion sequences is domain-specific. For each of hese acion sequences, he algorihm calculaes he parameers of he agen s poserior belief, i.e., he parameers of he poserior mean disribuion m +T, S +T according o Eqn. 17, as well as he poserior covariance Σ +T according o Eqn. 18. These hree ses of parameers are he sufficien saisics of he disribuion of beliefs shown in Fig. 3. As discussed in Secion 3, we hen sample from he poserior belief disribuion o insaniae beliefs for deeper forward search. Given ha he covariance can be assumed consan, we perform imporance sampling only on he poserior mean disribuion, generaing samples of poserior beliefs by associaing he poserior mean samples {n i } wih he poserior covariance Σ +T. These beliefs are hen used o perform an addiional layer of deph-firs search, and his process repeas for a pre-deermined search deph D. A he leaf nodes, a value heurisic is used o provide an esimae of being a he belief associaed wih he node. These values are hen propagaed up he ree (Alg. 2, Eqn. 12), and consiss of boh he insan rewards of execuing he acion sequence and a 6

8 Algorihm 1 Poserior Belief Disribuion Algorihm Require: b 0 : Agen s iniial belief, D : Maximum search deph, V h : Value heurisic funcion 1: b c b 0 2: while no EXECUTIONEND() do 3: while no PLANNINGEND() do 4: â seq = EXPAND(b c, D, V h ) 5: end while 6: Execue firs acion â of â seq 7: Obain new observaion z 8: Updae curren belief b c τ(b c, â, z ) 9: : end while FVRS[8,5] Ave rewards Online ime (s) Offline ime (s) QMDP HSVI(150s) RTBSS(d1) PBD(d1,s100) Fig. 3: FVRS resuls. Algorihm parameers: HSVI(# seconds), RTBSS(search deph), PBD(muli-sep search deph, # samples) Algorihm 2 EXPAND() Require: b : Belief node o be expanded, d : Deph of expansion under b, V h : Value heurisic funcion 1: if d = 0 hen 2: V (b) V h (b) 3: reurn V (b) 4: else 5: V (b) 6: for all a seq,i A seq do 7: V (b, a i) R B(b, a seq,i) 8: Compue m +T, S +T, Σ +T (Eqn. 17, 18) 9: Sample se of poserior means {n i} according o N(m +T,S +T) 10: for all n i do 11: b N(n i,σ +T) 12: V (b, a seq,i) V (b, a seq,i)+γp(n i m +T,S +T) EXPAND(τ(b, a seq,i, z), d 1) 13: if V (b, a i) > V (b) hen 14: V (b) V (b, a seq,i) 15: a seq a seq,i 16: end if 17: end for 18: end for 19: end if 20: reurn V (b), a seq Mone Carlo inegraion of he value esimaes from he sampled poserior beliefs. A he roo, he algorihm chooses he acion sequence wih he highes expeced discouned value. 5.1 Errors Induced by Linearizaions Sampling he disribuion of poserior means induces an error on he expeced rewards. However, error bounds can be compued using Hoeffding s inequaliy [7], ( p(v S (b, a seq,i ) V (b, a seq,i ) γnǫ) exp 2n 2 ǫ 2 n(v max V min ) 2 ), (19) where V S (.) and V (.) are he sampled and exac value esimaes respecively, V max and V min are he maximum and minimum rewards ha could be obained, and n is he number of samples. For a desired accuracy ǫ, we can recover he appropriae number of samples. In addiion, for a generic exponenial family observaion model, calculaing he poserior mean and covariance afer a single acion (Eqn. 14 and 9) requires us o linearize abou he prior mean o calculae he Jacobians Y and b. When deermining he poserior belief disribuion afer a mulisep acion sequence (Eqn. 17 and 18), we make he assumpion ha we know he fuure prior means a each sep along he acion sequence, in order o perform linearizaion a every sep along ha sequence. This assumpion is an approximaion, since he fuure mean explicily depends on he observaion sequence. As a resul, he Jacobians used o calculae he Kalman gain and covariance are approximae, and he error of he approximaion in he Kalman gains and covariances increases wih macro-acion lengh. Bounds on approximaion errors for he EKF are known o exis [8], and in fuure, we plan o provide analysis of how hese bounds can be used o deermine he lengh of he macro-acions. 7

9 6 Empirical Resuls We provide iniial resuls demonsraing he performance of he PBD algorihm. Unforunaely, mos exising benchmark POMDP problems do no require POMDP solvers o search deeply o generae good policies. For example, he Rocksample problem, originally proposed in [18], and he exension FieldVisionRockSample (FVRS) problem [15] boh allow simple approximaion echniques o perform well. In he FVRS problem, an agen explores and samples rocks in a grid world. The agen is fully aware of he posiion of boh iself and he rocks. A each imesep, i receives a binary, noisy observaion of he value of each rock, and he observaion accuracy increases as he agen moves closer o he rock. The original Rocksample and FVRS problems appear easily solvable wih exising POMDP echniques such as HSVI [18] and AEMS [15]. Moving o sample he rock involves he same acions as moving o acquire more informaion of he rock. For hese problems, a greedy policy of finding he shores pah o all he rocks is a good approximaion o he opimal policy. Fig. 3 shows ha our algorihm is compeiive wih offline solvers such as HSVI, bu suggess ha even a naive QMDP solver will perform well. Our PBD algorihm performs slighly poorer since i approximaes he discree sae space as coninuous, bu he error induced, if any, is no saisically significan. For all he experimens repored in his paper, we adaped he HSVI implemenaion from he ZMDP sofware package 1, while our RTBSS implemenaion uses he QMDP and HSVI alpha-vecors as upper and lower bounds on he value funcion. HSVI ran in C++, while we ran he QMDP, RTBSS, and PBD algorihms in Malab. To beer es he algorihms, we propose he Informaion Search RockSample problem (ISRS) (Fig. 4), modifying he FVRS problem in wo ways. We inroduce a se of beacons (shown as yellow riangles) ha each correspond o a rock (shown as grey circles). Raher han he observaion accuracy depend on he proximiy o each rock, he agen r mus insead move o he beacon RB i, o ge beer informaion. Second, we modified he rock values RV i, o each have a coninuous value of beween 0 and 1, raher han being a binary-valued se. The agen coninues o receive a binary, noisy observaion of he value of each rock z, according o: { 4 r RB i, (m 0.5)2 D 0 z i, = 1 O : p(z i, RV i, = m, r, RB i, ) = 0.5 (m 0.5)2 4 r RB i, 2 D 0 z i, = 0 where D 0 is a uning parameer ha conrols how quickly he accuracy of he observaions decrease wih greaer disance beween he agen and he beacon. The agen s sensor can herefore be modeled as a Bernoulli observaion model, and he observaions are assumed o be more accurae han an unbiased coin. The agen obains a large reward if i samples a rock ha has a high value, bu incurs a large cos for wrongly doing so. Small coss are incurred for moving around he environmen. 6.1 Comparison wih oher algorihms The ISRS problem was used o compare he PBD algorihm agains a fas upper bound (QMDP), a poin-based offline value ieraion echnique (HSVI) and an online forward search algorihm (RTBSS). Macro-acions were generaed for he PBD algorihm by sampling robo poses in he grid world and compuing he sequence of acions necessary o reach he sampled pose. Fig. 4 and 6 illusrae he fundamenally differen policies generaed by our PBD algorihm, relaive o exising echniques. In Fig. 4, he RTBSS solver obains an observaion suggesing ha rock 1 is valuable. Unforunaely, is inabiliy o search beyond deph 1 causes i o fail o realize ha good informaion abou rock 1 can be obained by making a shor deour. Insead, i heads owards rock 1, bu because subsequen observaions are noisy, due o disance from he beacon, he solver is unable o commi o sampling he rock and is suck a a local minimum. In conras, he PBD solver (Fig. 6) is able o evaluae he value of making a deour when seeking informaion abou rock 5. Subsequenly, he agen moves o he region wih muliple beacons, concludes ha here is lile value lef o sample, and exis he problem. Fig. 5 repors he performance of he differen algorihms esed on he ISRS problem. The problem was iniialized wih differen hidden values of he rocks, and 20 rials were conduced for each simulaion. The differen algorihms were also iniialized wih differen offline processing ime (HSVI), 1 ZMDP Sofware for POMDP and MDP Planning. hp:// rey/zmdp/ (20) 8

10 Fig. 4: ISRS world. Blue line indicaes example RTBSS policy ISRS[8,5] (2304s, 5a, 32o) Ave rewards Online ime(s) Offline ime(s) QMDP HSVI(150s) HSVI(500s) HSVI(4000s) RTBSS(d1) RTBSS(d2) PBD(d1,s300) PBD(d2,s30) Fig. 5: Performance of POMDP solvers in ISRS problem Fig. 6: Example PBD policy differen search deph (RTBSS), and differen poserior mean samples (PBD). The resuls demonsrae ha he PBD algorihm is able o perform significanly beer han oher exising algorihms, hereby demonsraing he value of searching deeper. 7 Conclusion and Discussion We have demonsraed he abiliy o perform belief updaes for muli-sep acion sequences in closed form for models ha have specific parameric represenaions. When he models are linear-gaussian, our expression for he poserior belief disribuion afer a muli-sep acion sequence is exac. For observaion models ha are members of he exponenial family, he belief updae can be approximaed by using a linearized varian of he Kalman filer. By being able o compue he agen s disribuion of poserior beliefs afer muli-sep acion sequences, we developed an algorihm ha can perform deeper forward search when planning under uncerainy. While we may sacrifice an exhausive search over all acion sequences up of a paricular lengh, our algorihm enables us o search o an equivalen lengh of muli-sep acion sequences. This effecively allows us o search deeper, allowing us o discover policies ha would oherwise no have been found wih a shallow search deph. In conras o mos POMDP solvers, our algorihm assumes ha he agen s belief and observaion models are represenable as paricular classes of parameric disribuions. This necessary implies ha our algorihm is no a generic POMDP solver. Neverheless, we believe ha no only do hese classes of probabiliy disribuions represen a wide variey of belief disribuions, bu also ha coninuous, parameric represenaions are our only soluion for avoiding he curse of dimensionaliy [2], especially as we seek o solve larger POMDP problems in fuure. In addiion, he noion of predicing he disribuion of poserior beliefs should be exendable o a broader class of belief represenaions. Exponenial family disribuions ha are conjugae priors o exponenial family observaion models are aracive candidaes for represening he belief space, since he poserior belief afer a belief updae belongs o he same parameric class as he prior belief. References [1] O.E. Barndorff-Nielsen. Informaion and exponenial families in saisical heory. Bull. Amer. Mah. Soc. 1 (1979), , 273(0979), [2] A. Brooks, A. Makarenko, S. Williams, and H. Durran-Whye. Parameric POMDPs for planning in coninuous sae spaces. Roboics and Auonomous Sysems, 54(11): , [3] E. Brunskill, L. Kaelbling, T. Lozano-Perez, and N. Roy. Coninuous-sae POMDPs wih hybrid dynamics. In Symposium on Arificial Inelligence and Mahemaics, [4] T.M. Cover and J.A. Thomas. Elemens of informaion heory. Wiley-Inerscience, [5] J. Durbin and SJ Koopman. Time series analysis of non-gaussian observaions based on sae space models from boh classical and Bayesian perspecives. Journal of he Royal Saisical Sociey: Series B (Mehodological), 62(1):3 56,

11 [6] M. Hauskrech. Value-funcion approximaions for parially observable Markov decision processes. Journal of Arificial Inelligence Research, 13(2000):33 94, [7] W. Hoeffding. Probabiliy inequaliies for sums of bounded random variables. Journal of he American Saisical Associaion, pages 13 30, [8] B.F. La Scala, R.R. Bimead, and M.R. James. Condiions for sabiliy of he exended Kalman filer and heir applicaion o he frequency racking problem. Mahemaics of Conrol, Signals, and Sysems (MCSS), 8(1):1 26, [9] D. McAlleser and S. Singh. Approximae planning for facored POMDPs using belief sae simplificaion. In Proceedings of he Fifeenh Conference on Uncerainy in Arificial Inelligence, pages , [10] S. Paque, L. Tobin, and B. Chaib-draa. An online POMDP algorihm for complex muliagen environmens. In Proceedings of he fourh inernaional join conference on Auonomous agens and muliagen sysems, pages ACM New York, NY, USA, [11] J. Pineau, G. Gordon, and S. Thrun. Poin-based value ieraion: An anyime algorihm for POMDPs. In Inernaional Join Conference on Arificial Inelligence, volume 18, pages , [12] J.M. Pora, N. Vlassis, M.T.J. Spaan, and P. Poupar. Poin-based value ieraion for coninuous POMDPs. The Journal of Machine Learning Research, 7: , [13] S. Prenice and N. Roy. The Belief Roadmap: Efficien Planning in Linear POMDPs by Facoring he Covariance. In Proceedings of he 13h Inernaional Symposium of Roboics Research (ISRR), [14] M.L. Puerman. Markov decision processes: discree sochasic dynamic programming. John Wiley & Sons, Inc. New York, NY, USA, [15] S. Ross and B. Chaib-draa. AEMS: An Anyime Online Search Algorihm for Approximae Policy Refinemen in Large POMDPs. In Proceedings of The 20h Join Conference in Arificial Inelligence (IJCAI 2007), Hyderabad, India, [16] S. Ross, B. Chaib-draa, and J. Pineau. Bayesian reinforcemen learning in coninuous POMDPs wih applicaion o robo navigaion. In IEEE Inernaional Conference on Roboics and Auomaion, ICRA 2008, pages , [17] S. Ross, J. Pineau, S. Paque, and B. Chaib-draa. Online planning algorihms for POMDPs. Journal of Arificial Inelligence Research (JAIR), 32: , [18] T. Smih and R. Simmons. Poin-based POMDP algorihms: Improved analysis and implemenaion. In Proc. Uncerainy in Arificial Inelligence, [19] E.J. Sondik. The opimal conrol of parially observable Markov processes over he infinie horizon: Discouned coss. Operaions Research, 26(2): , [20] G. Theocharous and L.P. Kaelbling. Approximae planning in POMDPs wih macro-acions. Advances in Neural Processing Informaion Sysems, 17, [21] R. Washingon. BI-POMDP: Bounded, incremenal parially-observable Markovmodel planning. In Proceedings of he 4h European Conference on Planning (ECP), pages Springer, [22] M. Wes, P.J. Harrison, and H.S. Migon. Dynamic generalized linear models and Bayesian forecasing. Journal of he American Saisical Associaion, pages 73 83,

12 Appendix A: Exponenial Family Kalman Filer Building on saisical economics research for ime-series analysis of non-gaussian observaions [5], we presen he Kalman filer equivalen for sysems wih linear-gaussian sae-ransiions and observaion models ha belong o he exponenial family of disribuions. The sae-ransiion and observaion models can be represened as follow: s = A s 1 + B u + ε, s 1 N(µ 1, Σ 1 ), ε N(0, R ) (21) p(z θ ) = exp(z T θ b (θ ) + κ (z )), θ = W(s ) (22) For he sae-ransiion model, s is he sysem s hidden sae, u is he conrol acions, A and B are he linear ransiion marices, and ǫ is he sae-ransiion Gaussian noise wih covariance R. The observaion model belongs o he exponenial family of disribuions. θ and b (θ ) are he canonical parameer and normalizaion facor of he disribuion, and W(.) maps he saes o canonical parameer values. W(.) depends on he paricular member of he exponenial family. For ease of noaion, we le β (z θ ) = logp(z θ ) = z T θ + b (θ ) + κ (z ) (23) Following he radiional Kalman filer, he process updae can be wrien as µ = A µ 1 + B u, Σ = A Σ 1 A T + R (24) where µ and Σ are he mean and covariances of he poserior belief afer he process updae bu before he measuremen udpae. For he measuremen updae, we seek o find he condiional mode µ = argmax s p(s z ) (25) = argmax s p(z s )bel(s ) (Bayes rule) (26) = argmax s p(z θ )bel(s ) (27) = argmax s exp( J ), where J = log p(z θ ) (s µ ) T Σ 1 (s µ ) 0 = J = s s=µ β (z, θ ) θ + Σ 1 (µ µ θ s ), (29) Taking he derivaive of θ = W(s ) abou he prior mean µ, we le Y = W(s ) s (30) s=µ Similarly, performing Taylor expansion on β(z θ) θ abou θ = W(µ ), β (z θ ) = β (z θ ) θ θ + 2 β (z θ ) θ=θ θ θ T (θ θ ) (31) θ=θ β (z θ ) = θ β + β (θ θ ) (32) where β = ( z T θ + b (θ ) κ (z )) θ, (Eqn. 23) (33) θ=θ = b (θ ) θ z (34) θ=θ (28) β =ḃ z (35) and β = 2 β (z θ ) θ θ T (θ θ ) (36) θ=θ β = b (37) 11

13 Plugging Equaions 35 and 37 ino Equaion 32, and hen ino Equaion 29, b 1 Y (ḃ z + b (θ θ )) = Σ 1 (µ µ ) (38) Y b ( b 1 (ḃ z ) θ + θ ) = Σ 1 (µ µ ) (39) 1 Y b ((θ b (ḃ z )) θ ) =Σ 1 (µ µ ) (40) Y b ( z W(s )) =Σ 1 (µ µ ) (41) where z = (θ (ḃ z )) is he projecion of he observaion ono he parameer space of he exponenial family disribuion, and is independen of s. In Equaion 41 we subsiued θ using Equaion 22. Mean Updae Using Equaion 41 and subsiuing µ for s, Σ 1 (µ µ ) = Y b ( z W(µ )) (42) = Y b ( z W(µ )) + W(µ ) W(µ ) (43) = Y b ( z W(µ )) Y b (W(µ ) W(µ )) (44) Linearizing W(s ) abou µ, W(s ) = W(µ ) + W (s ) s=µ (s µ ) (45) = W(µ ) + Y (µ µ ) (46) Σ 1 (µ µ ) = Y b ( z W(µ )) Y b Y (µ µ ) (47) Y b ( z W(µ )) = (Σ 1 + Y b Y )(µ µ ) (48) = Σ 1 (µ µ ) (49) µ µ = Σ Y b ( z W(µ )) (50) where Σ Y b = K is he Kalman gain for non-gaussian exponenial family disribuions. Via a sandard ransformaion, he Kalman gain can be wrien in erms of covariances oher han Σ, Covariance Updae Given a Gaussian poserior belief, 2 J s 2 Σ 1 1 K = Σ Y (Y Σ Y + b ) 1 (51) and µ = µ + K ( z W(µ )) (52) = 2 J s 2 is he inverse of he covariance of he agen s belief (53) = x (Σ 1 (s µ ) Y b ( z W(s ))) (54) = Σ 1 + Y b Y (55) Σ = (Σ 1 + Y b Y ) 1 (56) 12

14

Planning in POMDPs. Dominik Schoenberger Abstract

Planning in POMDPs. Dominik Schoenberger Abstract Planning in POMDPs Dominik Schoenberger d.schoenberger@sud.u-darmsad.de Absrac This documen briefly explains wha a Parially Observable Markov Decision Process is. Furhermore i inroduces he differen approaches

More information

Notes on Kalman Filtering

Notes on Kalman Filtering Noes on Kalman Filering Brian Borchers and Rick Aser November 7, Inroducion Daa Assimilaion is he problem of merging model predicions wih acual measuremens of a sysem o produce an opimal esimae of he curren

More information

Two Popular Bayesian Estimators: Particle and Kalman Filters. McGill COMP 765 Sept 14 th, 2017

Two Popular Bayesian Estimators: Particle and Kalman Filters. McGill COMP 765 Sept 14 th, 2017 Two Popular Bayesian Esimaors: Paricle and Kalman Filers McGill COMP 765 Sep 14 h, 2017 1 1 1, dx x Bel x u x P x z P Recall: Bayes Filers,,,,,,, 1 1 1 1 u z u x P u z u x z P Bayes z = observaion u =

More information

L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS. NA568 Mobile Robotics: Methods & Algorithms

L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS. NA568 Mobile Robotics: Methods & Algorithms L07. KALMAN FILTERING FOR NON-LINEAR SYSTEMS NA568 Mobile Roboics: Mehods & Algorihms Today s Topic Quick review on (Linear) Kalman Filer Kalman Filering for Non-Linear Sysems Exended Kalman Filer (EKF)

More information

Zürich. ETH Master Course: L Autonomous Mobile Robots Localization II

Zürich. ETH Master Course: L Autonomous Mobile Robots Localization II Roland Siegwar Margaria Chli Paul Furgale Marco Huer Marin Rufli Davide Scaramuzza ETH Maser Course: 151-0854-00L Auonomous Mobile Robos Localizaion II ACT and SEE For all do, (predicion updae / ACT),

More information

T L. t=1. Proof of Lemma 1. Using the marginal cost accounting in Equation(4) and standard arguments. t )+Π RB. t )+K 1(Q RB

T L. t=1. Proof of Lemma 1. Using the marginal cost accounting in Equation(4) and standard arguments. t )+Π RB. t )+K 1(Q RB Elecronic Companion EC.1. Proofs of Technical Lemmas and Theorems LEMMA 1. Le C(RB) be he oal cos incurred by he RB policy. Then we have, T L E[C(RB)] 3 E[Z RB ]. (EC.1) Proof of Lemma 1. Using he marginal

More information

Estimation of Poses with Particle Filters

Estimation of Poses with Particle Filters Esimaion of Poses wih Paricle Filers Dr.-Ing. Bernd Ludwig Chair for Arificial Inelligence Deparmen of Compuer Science Friedrich-Alexander-Universiä Erlangen-Nürnberg 12/05/2008 Dr.-Ing. Bernd Ludwig (FAU

More information

EKF SLAM vs. FastSLAM A Comparison

EKF SLAM vs. FastSLAM A Comparison vs. A Comparison Michael Calonder, Compuer Vision Lab Swiss Federal Insiue of Technology, Lausanne EPFL) michael.calonder@epfl.ch The wo algorihms are described wih a planar robo applicaion in mind. Generalizaion

More information

Sequential Importance Resampling (SIR) Particle Filter

Sequential Importance Resampling (SIR) Particle Filter Paricle Filers++ Pieer Abbeel UC Berkeley EECS Many slides adaped from Thrun, Burgard and Fox, Probabilisic Roboics 1. Algorihm paricle_filer( S -1, u, z ): 2. Sequenial Imporance Resampling (SIR) Paricle

More information

Probabilistic Robotics

Probabilistic Robotics Probabilisic Roboics Bayes Filer Implemenaions Gaussian filers Bayes Filer Reminder Predicion bel p u bel d Correcion bel η p z bel Gaussians : ~ π e p N p - Univariae / / : ~ μ μ μ e p Ν p d π Mulivariae

More information

Announcements. Recap: Filtering. Recap: Reasoning Over Time. Example: State Representations for Robot Localization. Particle Filtering

Announcements. Recap: Filtering. Recap: Reasoning Over Time. Example: State Representations for Robot Localization. Particle Filtering Inroducion o Arificial Inelligence V22.0472-001 Fall 2009 Lecure 18: aricle & Kalman Filering Announcemens Final exam will be a 7pm on Wednesday December 14 h Dae of las class 1.5 hrs long I won ask anyhing

More information

RL Lecture 7: Eligibility Traces. R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction 1

RL Lecture 7: Eligibility Traces. R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction 1 RL Lecure 7: Eligibiliy Traces R. S. Suon and A. G. Baro: Reinforcemen Learning: An Inroducion 1 N-sep TD Predicion Idea: Look farher ino he fuure when you do TD backup (1, 2, 3,, n seps) R. S. Suon and

More information

An introduction to the theory of SDDP algorithm

An introduction to the theory of SDDP algorithm An inroducion o he heory of SDDP algorihm V. Leclère (ENPC) Augus 1, 2014 V. Leclère Inroducion o SDDP Augus 1, 2014 1 / 21 Inroducion Large scale sochasic problem are hard o solve. Two ways of aacking

More information

Introduction to Mobile Robotics

Introduction to Mobile Robotics Inroducion o Mobile Roboics Bayes Filer Kalman Filer Wolfram Burgard Cyrill Sachniss Giorgio Grisei Maren Bennewiz Chrisian Plagemann Bayes Filer Reminder Predicion bel p u bel d Correcion bel η p z bel

More information

CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK

CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK 175 CHAPTER 10 VALIDATION OF TEST WITH ARTIFICAL NEURAL NETWORK 10.1 INTRODUCTION Amongs he research work performed, he bes resuls of experimenal work are validaed wih Arificial Neural Nework. From he

More information

Vehicle Arrival Models : Headway

Vehicle Arrival Models : Headway Chaper 12 Vehicle Arrival Models : Headway 12.1 Inroducion Modelling arrival of vehicle a secion of road is an imporan sep in raffic flow modelling. I has imporan applicaion in raffic flow simulaion where

More information

Tom Heskes and Onno Zoeter. Presented by Mark Buller

Tom Heskes and Onno Zoeter. Presented by Mark Buller Tom Heskes and Onno Zoeer Presened by Mark Buller Dynamic Bayesian Neworks Direced graphical models of sochasic processes Represen hidden and observed variables wih differen dependencies Generalize Hidden

More information

Robust estimation based on the first- and third-moment restrictions of the power transformation model

Robust estimation based on the first- and third-moment restrictions of the power transformation model h Inernaional Congress on Modelling and Simulaion, Adelaide, Ausralia, 6 December 3 www.mssanz.org.au/modsim3 Robus esimaion based on he firs- and hird-momen resricions of he power ransformaion Nawaa,

More information

STATE-SPACE MODELLING. A mass balance across the tank gives:

STATE-SPACE MODELLING. A mass balance across the tank gives: B. Lennox and N.F. Thornhill, 9, Sae Space Modelling, IChemE Process Managemen and Conrol Subjec Group Newsleer STE-SPACE MODELLING Inroducion: Over he pas decade or so here has been an ever increasing

More information

Probabilistic Robotics SLAM

Probabilistic Robotics SLAM Probabilisic Roboics SLAM The SLAM Problem SLAM is he process by which a robo builds a map of he environmen and, a he same ime, uses his map o compue is locaion Localizaion: inferring locaion given a map

More information

Navneet Saini, Mayank Goyal, Vishal Bansal (2013); Term Project AML310; Indian Institute of Technology Delhi

Navneet Saini, Mayank Goyal, Vishal Bansal (2013); Term Project AML310; Indian Institute of Technology Delhi Creep in Viscoelasic Subsances Numerical mehods o calculae he coefficiens of he Prony equaion using creep es daa and Herediary Inegrals Mehod Navnee Saini, Mayank Goyal, Vishal Bansal (23); Term Projec

More information

Probabilistic Robotics The Sparse Extended Information Filter

Probabilistic Robotics The Sparse Extended Information Filter Probabilisic Roboics The Sparse Exended Informaion Filer MSc course Arificial Inelligence 2018 hps://saff.fnwi.uva.nl/a.visser/educaion/probabilisicroboics/ Arnoud Visser Inelligen Roboics Lab Informaics

More information

Online Appendix to Solution Methods for Models with Rare Disasters

Online Appendix to Solution Methods for Models with Rare Disasters Online Appendix o Soluion Mehods for Models wih Rare Disasers Jesús Fernández-Villaverde and Oren Levinal In his Online Appendix, we presen he Euler condiions of he model, we develop he pricing Calvo block,

More information

0.1 MAXIMUM LIKELIHOOD ESTIMATION EXPLAINED

0.1 MAXIMUM LIKELIHOOD ESTIMATION EXPLAINED 0.1 MAXIMUM LIKELIHOOD ESTIMATIO EXPLAIED Maximum likelihood esimaion is a bes-fi saisical mehod for he esimaion of he values of he parameers of a sysem, based on a se of observaions of a random variable

More information

Probabilistic Robotics SLAM

Probabilistic Robotics SLAM Probabilisic Roboics SLAM The SLAM Problem SLAM is he process by which a robo builds a map of he environmen and, a he same ime, uses his map o compue is locaion Localizaion: inferring locaion given a map

More information

Anno accademico 2006/2007. Davide Migliore

Anno accademico 2006/2007. Davide Migliore Roboica Anno accademico 2006/2007 Davide Migliore migliore@ele.polimi.i Today Eercise session: An Off-side roblem Robo Vision Task Measuring NBA layers erformance robabilisic Roboics Inroducion The Bayesian

More information

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010

Simulation-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Simulaion-Solving Dynamic Models ABE 5646 Week 2, Spring 2010 Week Descripion Reading Maerial 2 Compuer Simulaion of Dynamic Models Finie Difference, coninuous saes, discree ime Simple Mehods Euler Trapezoid

More information

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter

State-Space Models. Initialization, Estimation and Smoothing of the Kalman Filter Sae-Space Models Iniializaion, Esimaion and Smoohing of he Kalman Filer Iniializaion of he Kalman Filer The Kalman filer shows how o updae pas predicors and he corresponding predicion error variances when

More information

1 Review of Zero-Sum Games

1 Review of Zero-Sum Games COS 5: heoreical Machine Learning Lecurer: Rob Schapire Lecure #23 Scribe: Eugene Brevdo April 30, 2008 Review of Zero-Sum Games Las ime we inroduced a mahemaical model for wo player zero-sum games. Any

More information

3.1.3 INTRODUCTION TO DYNAMIC OPTIMIZATION: DISCRETE TIME PROBLEMS. A. The Hamiltonian and First-Order Conditions in a Finite Time Horizon

3.1.3 INTRODUCTION TO DYNAMIC OPTIMIZATION: DISCRETE TIME PROBLEMS. A. The Hamiltonian and First-Order Conditions in a Finite Time Horizon 3..3 INRODUCION O DYNAMIC OPIMIZAION: DISCREE IME PROBLEMS A. he Hamilonian and Firs-Order Condiions in a Finie ime Horizon Define a new funcion, he Hamilonian funcion, H. H he change in he oal value of

More information

A Hop Constrained Min-Sum Arborescence with Outage Costs

A Hop Constrained Min-Sum Arborescence with Outage Costs A Hop Consrained Min-Sum Arborescence wih Ouage Coss Rakesh Kawara Minnesoa Sae Universiy, Mankao, MN 56001 Email: Kawara@mnsu.edu Absrac The hop consrained min-sum arborescence wih ouage coss problem

More information

Air Traffic Forecast Empirical Research Based on the MCMC Method

Air Traffic Forecast Empirical Research Based on the MCMC Method Compuer and Informaion Science; Vol. 5, No. 5; 0 ISSN 93-8989 E-ISSN 93-8997 Published by Canadian Cener of Science and Educaion Air Traffic Forecas Empirical Research Based on he MCMC Mehod Jian-bo Wang,

More information

Georey E. Hinton. University oftoronto. Technical Report CRG-TR February 22, Abstract

Georey E. Hinton. University oftoronto.   Technical Report CRG-TR February 22, Abstract Parameer Esimaion for Linear Dynamical Sysems Zoubin Ghahramani Georey E. Hinon Deparmen of Compuer Science Universiy oftorono 6 King's College Road Torono, Canada M5S A4 Email: zoubin@cs.orono.edu Technical

More information

10. State Space Methods

10. State Space Methods . Sae Space Mehods. Inroducion Sae space modelling was briefly inroduced in chaper. Here more coverage is provided of sae space mehods before some of heir uses in conrol sysem design are covered in he

More information

Random Walk with Anti-Correlated Steps

Random Walk with Anti-Correlated Steps Random Walk wih Ani-Correlaed Seps John Noga Dirk Wagner 2 Absrac We conjecure he expeced value of random walks wih ani-correlaed seps o be exacly. We suppor his conjecure wih 2 plausibiliy argumens and

More information

Augmented Reality II - Kalman Filters - Gudrun Klinker May 25, 2004

Augmented Reality II - Kalman Filters - Gudrun Klinker May 25, 2004 Augmened Realiy II Kalman Filers Gudrun Klinker May 25, 2004 Ouline Moivaion Discree Kalman Filer Modeled Process Compuing Model Parameers Algorihm Exended Kalman Filer Kalman Filer for Sensor Fusion Lieraure

More information

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle

Physics 235 Chapter 2. Chapter 2 Newtonian Mechanics Single Particle Chaper 2 Newonian Mechanics Single Paricle In his Chaper we will review wha Newon s laws of mechanics ell us abou he moion of a single paricle. Newon s laws are only valid in suiable reference frames,

More information

Recursive Least-Squares Fixed-Interval Smoother Using Covariance Information based on Innovation Approach in Linear Continuous Stochastic Systems

Recursive Least-Squares Fixed-Interval Smoother Using Covariance Information based on Innovation Approach in Linear Continuous Stochastic Systems 8 Froniers in Signal Processing, Vol. 1, No. 1, July 217 hps://dx.doi.org/1.2266/fsp.217.112 Recursive Leas-Squares Fixed-Inerval Smooher Using Covariance Informaion based on Innovaion Approach in Linear

More information

Diebold, Chapter 7. Francis X. Diebold, Elements of Forecasting, 4th Edition (Mason, Ohio: Cengage Learning, 2006). Chapter 7. Characterizing Cycles

Diebold, Chapter 7. Francis X. Diebold, Elements of Forecasting, 4th Edition (Mason, Ohio: Cengage Learning, 2006). Chapter 7. Characterizing Cycles Diebold, Chaper 7 Francis X. Diebold, Elemens of Forecasing, 4h Ediion (Mason, Ohio: Cengage Learning, 006). Chaper 7. Characerizing Cycles Afer compleing his reading you should be able o: Define covariance

More information

Bias-Variance Error Bounds for Temporal Difference Updates

Bias-Variance Error Bounds for Temporal Difference Updates Bias-Variance Bounds for Temporal Difference Updaes Michael Kearns AT&T Labs mkearns@research.a.com Sainder Singh AT&T Labs baveja@research.a.com Absrac We give he firs rigorous upper bounds on he error

More information

A Shooting Method for A Node Generation Algorithm

A Shooting Method for A Node Generation Algorithm A Shooing Mehod for A Node Generaion Algorihm Hiroaki Nishikawa W.M.Keck Foundaion Laboraory for Compuaional Fluid Dynamics Deparmen of Aerospace Engineering, Universiy of Michigan, Ann Arbor, Michigan

More information

Learning to Take Concurrent Actions

Learning to Take Concurrent Actions Learning o Take Concurren Acions Khashayar Rohanimanesh Deparmen of Compuer Science Universiy of Massachuses Amhers, MA 0003 khash@cs.umass.edu Sridhar Mahadevan Deparmen of Compuer Science Universiy of

More information

An recursive analytical technique to estimate time dependent physical parameters in the presence of noise processes

An recursive analytical technique to estimate time dependent physical parameters in the presence of noise processes WHAT IS A KALMAN FILTER An recursive analyical echnique o esimae ime dependen physical parameers in he presence of noise processes Example of a ime and frequency applicaion: Offse beween wo clocks PREDICTORS,

More information

Motion Planning under Uncertainty using Iterative Local Optimization in Belief Space

Motion Planning under Uncertainty using Iterative Local Optimization in Belief Space Moion Planning under Uncerainy using Ieraive Local Opimizaion in Belief Space Jur van den Berg 1 Sachin Pail 2 Ron Aleroviz 2 1 School of Compuing, Universiy of Uah, berg@cs.uah.edu. 2 Dep. of Compuer

More information

SEIF, EnKF, EKF SLAM. Pieter Abbeel UC Berkeley EECS

SEIF, EnKF, EKF SLAM. Pieter Abbeel UC Berkeley EECS SEIF, EnKF, EKF SLAM Pieer Abbeel UC Berkeley EECS Informaion Filer From an analyical poin of view == Kalman filer Difference: keep rack of he inverse covariance raher han he covariance marix [maer of

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Noes for EE7C Spring 018: Convex Opimizaion and Approximaion Insrucor: Moriz Hard Email: hard+ee7c@berkeley.edu Graduae Insrucor: Max Simchowiz Email: msimchow+ee7c@berkeley.edu Ocober 15, 018 3

More information

14 Autoregressive Moving Average Models

14 Autoregressive Moving Average Models 14 Auoregressive Moving Average Models In his chaper an imporan parameric family of saionary ime series is inroduced, he family of he auoregressive moving average, or ARMA, processes. For a large class

More information

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model

Modal identification of structures from roving input data by means of maximum likelihood estimation of the state space model Modal idenificaion of srucures from roving inpu daa by means of maximum likelihood esimaion of he sae space model J. Cara, J. Juan, E. Alarcón Absrac The usual way o perform a forced vibraion es is o fix

More information

15. Vector Valued Functions

15. Vector Valued Functions 1. Vecor Valued Funcions Up o his poin, we have presened vecors wih consan componens, for example, 1, and,,4. However, we can allow he componens of a vecor o be funcions of a common variable. For example,

More information

Speaker Adaptation Techniques For Continuous Speech Using Medium and Small Adaptation Data Sets. Constantinos Boulis

Speaker Adaptation Techniques For Continuous Speech Using Medium and Small Adaptation Data Sets. Constantinos Boulis Speaker Adapaion Techniques For Coninuous Speech Using Medium and Small Adapaion Daa Ses Consaninos Boulis Ouline of he Presenaion Inroducion o he speaker adapaion problem Maximum Likelihood Sochasic Transformaions

More information

d 1 = c 1 b 2 - b 1 c 2 d 2 = c 1 b 3 - b 1 c 3

d 1 = c 1 b 2 - b 1 c 2 d 2 = c 1 b 3 - b 1 c 3 and d = c b - b c c d = c b - b c c This process is coninued unil he nh row has been compleed. The complee array of coefficiens is riangular. Noe ha in developing he array an enire row may be divided or

More information

Particle Swarm Optimization Combining Diversification and Intensification for Nonlinear Integer Programming Problems

Particle Swarm Optimization Combining Diversification and Intensification for Nonlinear Integer Programming Problems Paricle Swarm Opimizaion Combining Diversificaion and Inensificaion for Nonlinear Ineger Programming Problems Takeshi Masui, Masaoshi Sakawa, Kosuke Kao and Koichi Masumoo Hiroshima Universiy 1-4-1, Kagamiyama,

More information

Excel-Based Solution Method For The Optimal Policy Of The Hadley And Whittin s Exact Model With Arma Demand

Excel-Based Solution Method For The Optimal Policy Of The Hadley And Whittin s Exact Model With Arma Demand Excel-Based Soluion Mehod For The Opimal Policy Of The Hadley And Whiin s Exac Model Wih Arma Demand Kal Nami School of Business and Economics Winson Salem Sae Universiy Winson Salem, NC 27110 Phone: (336)750-2338

More information

Optimal Path Planning for Flexible Redundant Robot Manipulators

Optimal Path Planning for Flexible Redundant Robot Manipulators 25 WSEAS In. Conf. on DYNAMICAL SYSEMS and CONROL, Venice, Ialy, November 2-4, 25 (pp363-368) Opimal Pah Planning for Flexible Redundan Robo Manipulaors H. HOMAEI, M. KESHMIRI Deparmen of Mechanical Engineering

More information

Simulating models with heterogeneous agents

Simulating models with heterogeneous agents Simulaing models wih heerogeneous agens Wouer J. Den Haan London School of Economics c by Wouer J. Den Haan Individual agen Subjec o employmen shocks (ε i, {0, 1}) Incomplee markes only way o save is hrough

More information

m = 41 members n = 27 (nonfounders), f = 14 (founders) 8 markers from chromosome 19

m = 41 members n = 27 (nonfounders), f = 14 (founders) 8 markers from chromosome 19 Sequenial Imporance Sampling (SIS) AKA Paricle Filering, Sequenial Impuaion (Kong, Liu, Wong, 994) For many problems, sampling direcly from he arge disribuion is difficul or impossible. One reason possible

More information

Mean-square Stability Control for Networked Systems with Stochastic Time Delay

Mean-square Stability Control for Networked Systems with Stochastic Time Delay JOURNAL OF SIMULAION VOL. 5 NO. May 7 Mean-square Sabiliy Conrol for Newored Sysems wih Sochasic ime Delay YAO Hejun YUAN Fushun School of Mahemaics and Saisics Anyang Normal Universiy Anyang Henan. 455

More information

CSE-571 Robotics. Sample-based Localization (sonar) Motivation. Bayes Filter Implementations. Particle filters. Density Approximation

CSE-571 Robotics. Sample-based Localization (sonar) Motivation. Bayes Filter Implementations. Particle filters. Density Approximation Moivaion CSE57 Roboics Bayes Filer Implemenaions Paricle filers So far, we discussed he Kalman filer: Gaussian, linearizaion problems Paricle filers are a way o efficienly represen nongaussian disribuions

More information

GMM - Generalized Method of Moments

GMM - Generalized Method of Moments GMM - Generalized Mehod of Momens Conens GMM esimaion, shor inroducion 2 GMM inuiion: Maching momens 2 3 General overview of GMM esimaion. 3 3. Weighing marix...........................................

More information

Final Spring 2007

Final Spring 2007 .615 Final Spring 7 Overview The purpose of he final exam is o calculae he MHD β limi in a high-bea oroidal okamak agains he dangerous n = 1 exernal ballooning-kink mode. Effecively, his corresponds o

More information

Ordinary dierential equations

Ordinary dierential equations Chaper 5 Ordinary dierenial equaions Conens 5.1 Iniial value problem........................... 31 5. Forward Euler's mehod......................... 3 5.3 Runge-Kua mehods.......................... 36

More information

Efficient Optimization of Information-Theoretic Exploration in SLAM

Efficient Optimization of Information-Theoretic Exploration in SLAM Proceedings of he Tweny-Third AAAI Conference on Arificial Inelligence (2008) Efficien Opimizaion of Informaion-Theoreic Exploraion in SLAM Thomas Kollar and Nicholas Roy Compuer Science and Arificial

More information

2.160 System Identification, Estimation, and Learning. Lecture Notes No. 8. March 6, 2006

2.160 System Identification, Estimation, and Learning. Lecture Notes No. 8. March 6, 2006 2.160 Sysem Idenificaion, Esimaion, and Learning Lecure Noes No. 8 March 6, 2006 4.9 Eended Kalman Filer In many pracical problems, he process dynamics are nonlinear. w Process Dynamics v y u Model (Linearized)

More information

Chapter 2. First Order Scalar Equations

Chapter 2. First Order Scalar Equations Chaper. Firs Order Scalar Equaions We sar our sudy of differenial equaions in he same way he pioneers in his field did. We show paricular echniques o solve paricular ypes of firs order differenial equaions.

More information

Ensamble methods: Bagging and Boosting

Ensamble methods: Bagging and Boosting Lecure 21 Ensamble mehods: Bagging and Boosing Milos Hauskrech milos@cs.pi.edu 5329 Senno Square Ensemble mehods Mixure of expers Muliple base models (classifiers, regressors), each covers a differen par

More information

Tracking. Many slides adapted from Kristen Grauman, Deva Ramanan

Tracking. Many slides adapted from Kristen Grauman, Deva Ramanan Tracking Man slides adaped from Krisen Grauman Deva Ramanan Coures G. Hager Coures G. Hager J. Kosecka cs3b Adapive Human-Moion Tracking Acquisiion Decimaion b facor 5 Moion deecor Grascale convers. Image

More information

ACE 562 Fall Lecture 5: The Simple Linear Regression Model: Sampling Properties of the Least Squares Estimators. by Professor Scott H.

ACE 562 Fall Lecture 5: The Simple Linear Regression Model: Sampling Properties of the Least Squares Estimators. by Professor Scott H. ACE 56 Fall 005 Lecure 5: he Simple Linear Regression Model: Sampling Properies of he Leas Squares Esimaors by Professor Sco H. Irwin Required Reading: Griffihs, Hill and Judge. "Inference in he Simple

More information

Presentation Overview

Presentation Overview Acion Refinemen in Reinforcemen Learning by Probabiliy Smoohing By Thomas G. Dieerich & Didac Busques Speaer: Kai Xu Presenaion Overview Bacground The Probabiliy Smoohing Mehod Experimenal Sudy of Acion

More information

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing

Application of a Stochastic-Fuzzy Approach to Modeling Optimal Discrete Time Dynamical Systems by Using Large Scale Data Processing Applicaion of a Sochasic-Fuzzy Approach o Modeling Opimal Discree Time Dynamical Sysems by Using Large Scale Daa Processing AA WALASZE-BABISZEWSA Deparmen of Compuer Engineering Opole Universiy of Technology

More information

Approximation Algorithms for Unique Games via Orthogonal Separators

Approximation Algorithms for Unique Games via Orthogonal Separators Approximaion Algorihms for Unique Games via Orhogonal Separaors Lecure noes by Konsanin Makarychev. Lecure noes are based on he papers [CMM06a, CMM06b, LM4]. Unique Games In hese lecure noes, we define

More information

The Optimal Stopping Time for Selling an Asset When It Is Uncertain Whether the Price Process Is Increasing or Decreasing When the Horizon Is Infinite

The Optimal Stopping Time for Selling an Asset When It Is Uncertain Whether the Price Process Is Increasing or Decreasing When the Horizon Is Infinite American Journal of Operaions Research, 08, 8, 8-9 hp://wwwscirporg/journal/ajor ISSN Online: 60-8849 ISSN Prin: 60-8830 The Opimal Sopping Time for Selling an Asse When I Is Uncerain Wheher he Price Process

More information

References are appeared in the last slide. Last update: (1393/08/19)

References are appeared in the last slide. Last update: (1393/08/19) SYSEM IDEIFICAIO Ali Karimpour Associae Professor Ferdowsi Universi of Mashhad References are appeared in he las slide. Las updae: 0..204 393/08/9 Lecure 5 lecure 5 Parameer Esimaion Mehods opics o be

More information

Guest Lectures for Dr. MacFarlane s EE3350 Part Deux

Guest Lectures for Dr. MacFarlane s EE3350 Part Deux Gues Lecures for Dr. MacFarlane s EE3350 Par Deux Michael Plane Mon., 08-30-2010 Wrie name in corner. Poin ou his is a review, so I will go faser. Remind hem o go lisen o online lecure abou geing an A

More information

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé

Bias in Conditional and Unconditional Fixed Effects Logit Estimation: a Correction * Tom Coupé Bias in Condiional and Uncondiional Fixed Effecs Logi Esimaion: a Correcion * Tom Coupé Economics Educaion and Research Consorium, Naional Universiy of Kyiv Mohyla Academy Address: Vul Voloska 10, 04070

More information

Some Basic Information about M-S-D Systems

Some Basic Information about M-S-D Systems Some Basic Informaion abou M-S-D Sysems 1 Inroducion We wan o give some summary of he facs concerning unforced (homogeneous) and forced (non-homogeneous) models for linear oscillaors governed by second-order,

More information

) were both constant and we brought them from under the integral.

) were both constant and we brought them from under the integral. YIELD-PER-RECRUIT (coninued The yield-per-recrui model applies o a cohor, bu we saw in he Age Disribuions lecure ha he properies of a cohor do no apply in general o a collecion of cohors, which is wha

More information

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar

CONTROL SYSTEMS, ROBOTICS AND AUTOMATION Vol. XI Control of Stochastic Systems - P.R. Kumar CONROL OF SOCHASIC SYSEMS P.R. Kumar Deparmen of Elecrical and Compuer Engineering, and Coordinaed Science Laboraory, Universiy of Illinois, Urbana-Champaign, USA. Keywords: Markov chains, ransiion probabiliies,

More information

Rapid Termination Evaluation for Recursive Subdivision of Bezier Curves

Rapid Termination Evaluation for Recursive Subdivision of Bezier Curves Rapid Terminaion Evaluaion for Recursive Subdivision of Bezier Curves Thomas F. Hain School of Compuer and Informaion Sciences, Universiy of Souh Alabama, Mobile, AL, U.S.A. Absrac Bézier curve flaening

More information

Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach

Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach 1 Decenralized Sochasic Conrol wih Parial Hisory Sharing: A Common Informaion Approach Ashuosh Nayyar, Adiya Mahajan and Demoshenis Tenekezis arxiv:1209.1695v1 [cs.sy] 8 Sep 2012 Absrac A general model

More information

Tracking. Announcements

Tracking. Announcements Tracking Tuesday, Nov 24 Krisen Grauman UT Ausin Announcemens Pse 5 ou onigh, due 12/4 Shorer assignmen Auo exension il 12/8 I will no hold office hours omorrow 5 6 pm due o Thanksgiving 1 Las ime: Moion

More information

Chapter 3 Boundary Value Problem

Chapter 3 Boundary Value Problem Chaper 3 Boundary Value Problem A boundary value problem (BVP) is a problem, ypically an ODE or a PDE, which has values assigned on he physical boundary of he domain in which he problem is specified. Le

More information

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t...

t is a basis for the solution space to this system, then the matrix having these solutions as columns, t x 1 t, x 2 t,... x n t x 2 t... Mah 228- Fri Mar 24 5.6 Marix exponenials and linear sysems: The analogy beween firs order sysems of linear differenial equaions (Chaper 5) and scalar linear differenial equaions (Chaper ) is much sronger

More information

Echocardiography Project and Finite Fourier Series

Echocardiography Project and Finite Fourier Series Echocardiography Projec and Finie Fourier Series 1 U M An echocardiagram is a plo of how a porion of he hear moves as he funcion of ime over he one or more hearbea cycles If he hearbea repeas iself every

More information

hen found from Bayes rule. Specically, he prior disribuion is given by p( ) = N( ; ^ ; r ) (.3) where r is he prior variance (we add on he random drif

hen found from Bayes rule. Specically, he prior disribuion is given by p( ) = N( ; ^ ; r ) (.3) where r is he prior variance (we add on he random drif Chaper Kalman Filers. Inroducion We describe Bayesian Learning for sequenial esimaion of parameers (eg. means, AR coeciens). The updae procedures are known as Kalman Filers. We show how Dynamic Linear

More information

DEPARTMENT OF STATISTICS

DEPARTMENT OF STATISTICS A Tes for Mulivariae ARCH Effecs R. Sco Hacker and Abdulnasser Haemi-J 004: DEPARTMENT OF STATISTICS S-0 07 LUND SWEDEN A Tes for Mulivariae ARCH Effecs R. Sco Hacker Jönköping Inernaional Business School

More information

Introduction to Probability and Statistics Slides 4 Chapter 4

Introduction to Probability and Statistics Slides 4 Chapter 4 Inroducion o Probabiliy and Saisics Slides 4 Chaper 4 Ammar M. Sarhan, asarhan@mahsa.dal.ca Deparmen of Mahemaics and Saisics, Dalhousie Universiy Fall Semeser 8 Dr. Ammar Sarhan Chaper 4 Coninuous Random

More information

Tracking. Many slides adapted from Kristen Grauman, Deva Ramanan

Tracking. Many slides adapted from Kristen Grauman, Deva Ramanan Tracking Man slides adaped from Krisen Grauman Deva Ramanan Coures G. Hager Coures G. Hager J. Kosecka cs3b Adapive Human-Moion Tracking Acquisiion Decimaion b facor 5 Moion deecor Grascale convers. Image

More information

Inventory Analysis and Management. Multi-Period Stochastic Models: Optimality of (s, S) Policy for K-Convex Objective Functions

Inventory Analysis and Management. Multi-Period Stochastic Models: Optimality of (s, S) Policy for K-Convex Objective Functions Muli-Period Sochasic Models: Opimali of (s, S) Polic for -Convex Objecive Funcions Consider a seing similar o he N-sage newsvendor problem excep ha now here is a fixed re-ordering cos (> 0) for each (re-)order.

More information

Timed Circuits. Asynchronous Circuit Design. Timing Relationships. A Simple Example. Timed States. Timing Sequences. ({r 6 },t6 = 1.

Timed Circuits. Asynchronous Circuit Design. Timing Relationships. A Simple Example. Timed States. Timing Sequences. ({r 6 },t6 = 1. Timed Circuis Asynchronous Circui Design Chris J. Myers Lecure 7: Timed Circuis Chaper 7 Previous mehods only use limied knowledge of delays. Very robus sysems, bu exremely conservaive. Large funcional

More information

Article from. Predictive Analytics and Futurism. July 2016 Issue 13

Article from. Predictive Analytics and Futurism. July 2016 Issue 13 Aricle from Predicive Analyics and Fuurism July 6 Issue An Inroducion o Incremenal Learning By Qiang Wu and Dave Snell Machine learning provides useful ools for predicive analyics The ypical machine learning

More information

Energy Storage Benchmark Problems

Energy Storage Benchmark Problems Energy Sorage Benchmark Problems Daniel F. Salas 1,3, Warren B. Powell 2,3 1 Deparmen of Chemical & Biological Engineering 2 Deparmen of Operaions Research & Financial Engineering 3 Princeon Laboraory

More information

Planning in Information Space for a Quadrotor Helicopter in a GPS-denied Environment

Planning in Information Space for a Quadrotor Helicopter in a GPS-denied Environment Planning in Informaion Space for a Quadroor Helicoper in a GPS-denied Environmen Ruijie He, Sam Prenice and Nicholas Roy Absrac This paper describes a moion planning algorihm for a quadroor helicoper flying

More information

Linear Response Theory: The connection between QFT and experiments

Linear Response Theory: The connection between QFT and experiments Phys540.nb 39 3 Linear Response Theory: The connecion beween QFT and experimens 3.1. Basic conceps and ideas Q: How do we measure he conduciviy of a meal? A: we firs inroduce a weak elecric field E, and

More information

Deep Learning: Theory, Techniques & Applications - Recurrent Neural Networks -

Deep Learning: Theory, Techniques & Applications - Recurrent Neural Networks - Deep Learning: Theory, Techniques & Applicaions - Recurren Neural Neworks - Prof. Maeo Maeucci maeo.maeucci@polimi.i Deparmen of Elecronics, Informaion and Bioengineering Arificial Inelligence and Roboics

More information

7630 Autonomous Robotics Probabilistic Localisation

7630 Autonomous Robotics Probabilistic Localisation 7630 Auonomous Roboics Probabilisic Localisaion Principles of Probabilisic Localisaion Paricle Filers for Localisaion Kalman Filer for Localisaion Based on maerial from R. Triebel, R. Käsner, R. Siegwar,

More information

( ) ( ) if t = t. It must satisfy the identity. So, bulkiness of the unit impulse (hyper)function is equal to 1. The defining characteristic is

( ) ( ) if t = t. It must satisfy the identity. So, bulkiness of the unit impulse (hyper)function is equal to 1. The defining characteristic is UNIT IMPULSE RESPONSE, UNIT STEP RESPONSE, STABILITY. Uni impulse funcion (Dirac dela funcion, dela funcion) rigorously defined is no sricly a funcion, bu disribuion (or measure), precise reamen requires

More information

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t

R t. C t P t. + u t. C t = αp t + βr t + v t. + β + w t Exercise 7 C P = α + β R P + u C = αp + βr + v (a) (b) C R = α P R + β + w (c) Assumpions abou he disurbances u, v, w : Classical assumions on he disurbance of one of he equaions, eg. on (b): E(v v s P,

More information

Appendix to Creating Work Breaks From Available Idleness

Appendix to Creating Work Breaks From Available Idleness Appendix o Creaing Work Breaks From Available Idleness Xu Sun and Ward Whi Deparmen of Indusrial Engineering and Operaions Research, Columbia Universiy, New York, NY, 127; {xs2235,ww24}@columbia.edu Sepember

More information

Single-Pass-Based Heuristic Algorithms for Group Flexible Flow-shop Scheduling Problems

Single-Pass-Based Heuristic Algorithms for Group Flexible Flow-shop Scheduling Problems Single-Pass-Based Heurisic Algorihms for Group Flexible Flow-shop Scheduling Problems PEI-YING HUANG, TZUNG-PEI HONG 2 and CHENG-YAN KAO, 3 Deparmen of Compuer Science and Informaion Engineering Naional

More information

Data Fusion using Kalman Filter. Ioannis Rekleitis

Data Fusion using Kalman Filter. Ioannis Rekleitis Daa Fusion using Kalman Filer Ioannis Rekleiis Eample of a arameerized Baesian Filer: Kalman Filer Kalman filers (KF represen poserior belief b a Gaussian (normal disribuion A -d Gaussian disribuion is

More information