Multi-Modal Inference in Time Series


Multi-Modal Inference in Time Series: Non-Linear State Space Models
Master's thesis by Saket Kamthe from Pune, India
January 2014
Fachbereich Informatik/ETiT

Multi-Modal Inference in Time Series: Non-Linear State Space Models
Submitted Master's thesis by Saket Kamthe from Pune, India
1st reviewer: Dr.-Ing. Marc P. Deisenroth
2nd reviewer: Prof. Dr. Jan Peters
3rd reviewer: Prof. Dr.-Ing. Henning Puder
Date of submission:
Please cite this document as: URN: urn:nbn:de:tuda-tuprints URL:
This document is provided by tuprints, E-Publishing-Service of TU Darmstadt, tuprints@ulb.tu-darmstadt.de
This publication is released under the following Creative Commons licence: Attribution, NonCommercial, NoDerivs 2.0 Germany

Declaration for the Master's Thesis

I hereby declare that I have written this Master's thesis without the help of third parties and only with the cited sources and aids. All passages taken from sources are marked as such. This work has not been submitted in the same or a similar form to any examination authority before.

Darmstadt, January 31, 2014 (Saket Kamthe)

Abstract

Multi-modal densities appear frequently in time series and practical applications. However, they cannot be represented by common state estimators, such as the Extended Kalman Filter (EKF) and the Unscented Kalman Filter (UKF), which additionally suffer from the fact that uncertainty is often not captured sufficiently well, which can result in incoherent and divergent tracking performance. In this thesis, we address these issues by devising a non-linear filtering algorithm where densities are represented by Gaussian mixture models, whose parameters are estimated in closed form. The filtered results can be further improved by a backward pass, or smoothing. However, the optimal backward filter does not offer a closed-form solution and, hence, approximations are needed. We propose a novel algorithm for smoothing in non-linear dynamical systems with multi-modal beliefs. The resulting method exhibits superior performance on typical benchmarks.

Acknowledgements

First of all, I would like to thank my supervisors Dr.-Ing. Marc P. Deisenroth and Prof. Jan Peters for accepting me as an external student, and Prof. Henning Puder for his support as an internal supervisor. I have learned a lot from my interactions with Marc, and I am grateful for his continued support even after moving to London. I admire him for his hands-on approach to supervision, especially for encouraging some of my outlandish ideas. I thank him for his suggestions on the approximate inference algorithms and his patience in helping me draft my first research paper. Both the paper and his suggestions constitute a substantial part of this thesis.

I thank Prof. Jan Peters for his support during my thesis, not only with technical supervision and financial assistance but also with guidance on a host of other questions I keep directing towards him. I acknowledge the partial financial support for my thesis by the Akademisches Auslandsamt at Technische Universität Darmstadt.

I thank Felix Schmitt, Dominik Notz, Marco Ewerton and Roberto Calandra for making my stay in the lab enjoyable and for indulging in discussions ranging from abstract mathematics to utter nonsense. My floor-mates Aishwarya Sharma and Divya Yea helped me cope with absolute lows in my personal life, and their support allowed me to concentrate on my thesis; I would like to thank them for that. This space would be too small to express my gratitude to my parents, as I cannot thank them enough for their continued support throughout my studies, without which I could not have reached this stage in my career.

Contents

1. Introduction
    Scope of the Work and Contributions
        Scope
        Contributions
    Bayesian Filter
        Filtering
        Smoothing
    Linear Gaussian Filter
        The Kalman Filter
        The Rauch-Tung-Striebel (RTS) Smoother
    Nonlinear Filters
        Extended Kalman Filter
        Unscented Kalman Filter
        Particle Filters
        Gaussian Sum Filter
    Summary of State of the Art
Multi-Modal Filter and Smoother
    Multi-Modal Filter
        Estimation of the Gaussian Mixture Parameters
        Propagation of Uncertainty
        Filtering
        Mixture Reduction
    Multi-Modal Smoothing
        Gaussian Sum Smoother for Non-linear Systems
    Summary of the Multi-Modal Filter and Smoother
Numerical Results
    Univariate Non-linear Growth Model
        Non-stationary Model
        Stationary Model
    Lorenz System
Discussion and Conclusion
    Conclusion
    Future Work
A. Gaussian Conditioning

1 Introduction

The Hawk-Eye technology (Owens et al., 2003) has revolutionised the way we watch tennis. How does it generate the ball trajectory? The path traced by a dynamical system (here, the ball) is called the trajectory of that system. We express the trajectory as a time series, that is, our trace is over time, because we are interested in predicting the trajectory over time. Dynamical systems are often described by differential equations in time. In Figure 1.1, the Hawk-Eye technology generates the trajectory of the ball. The proprietary system was originally developed for cricket in 2001 and is now used in the English Premier League as well. Hawk-Eye uses a set of cameras that capture the motion of the ball and, based on a model of the dynamical system, estimates the position of the ball in a tennis court (Owens et al., 2003).

Tracking a moving object based on observations is a classic example of estimation or filtering (Bar-Shalom et al., 2004). Filtering is a signal processing term for the estimation of a true signal from noisy data (we filter out the noise). Filtering algorithms that estimate the true signal from noisy time series data have wide applications, from biomedicine to rocket science. In fact, the first (linear) filter for tracking non-stationary systems was proposed by Kalman (Kalman, 1960) and found an early application at the National Aeronautics and Space Administration (NASA). The Kalman filter, the Hawk-Eye system and this work share a common theme: they estimate the latent state of a system (the ball) from sequential noisy observations.

Filtering and estimation in non-linear dynamical systems have been studied extensively and, hence, have resulted in a unique terminology. We use tennis ball tracking as an example to establish the terminology and as a visual aid.¹ We define the ball and the forces acting on it (wind, gravity, etc.) as a system. We wish to track the ball, i.e. the position of the ball in space and its velocity, which tell us the current state of our system. As the state changes over time, we describe it by state variables. We cannot measure these state variables directly without disturbing the ongoing match, so the state of the ball is latent and we can only observe it indirectly. For technical reasons, it is easier to measure the distance of the ball and the angle it makes with respect to the observer; we call these measurements observed variables. We aim to predict how the ball will move in the future based on the measurements of the ball made by the observer. We may be able to approximately model the tennis court, e.g. hard court or clay court, to predict the velocity of the ball once it hits the surface. However, the challenge of accurately modelling the impact forces and the approximations in the model introduce uncertainty into our system, and we can only estimate the position of the ball. How we account for the uncertainty introduced by the approximate model of a complex real-life system is the central theme of this thesis. The unknown noise makes the task even more challenging; we model the noise as Gaussian with known variance. In this report, we restrict ourselves to the Gaussian noise model and non-linear dynamical systems. See the section on the scope of the work for details.

Figure 1.1.: Ball tracking: The shaded trace represents the estimated trajectory of the ball. The trajectory can be represented as a time series describing the time evolution of a dynamical system. (For definitions of the terms, see the text.)

¹ The ball tracking is used only as a visual aid to interpret abstract terminology. The focus of this thesis is not tracking a tennis ball; however, the algorithms proposed here can be used for tracking.

An optimal algorithm for a linear dynamical system with Gaussian noise was proposed by Kalman (Kalman, 1960). In such systems, Gaussianity allows us to derive the recursive filtering equations in closed form. In contrast, for a non-linear system Gaussian uncertainties may become non-Gaussian due to the non-linear transform. Hence, we require approximations, such as linearising the functions, e.g. in the Extended Kalman Filter (EKF) (see 1.4.1), or deterministic sampling, e.g. in the Unscented Kalman Filter (UKF) (see 1.4.2), to approximate a non-Gaussian density by a Gaussian (Julier and Uhlmann, 2004). Such approximations make the limiting implicit assumption that the true densities are uni-modal. Filters based on these approximations often severely under-perform when the true densities are multi-modal. Hence, multi-modal approaches are frequently needed.

Particle filters are a standard approach for representing multi-modal, non-Gaussian densities (see 1.4.3). They are computationally demanding since they often require a large number of particles for good performance, e.g. due to the curse of dimensionality. An insufficient number of particles may fail to capture the tails of the density and lead to degenerate solutions. In practice, we have to compromise between the deterministic and fast filters (UKF/EKF) and the computationally demanding but more accurate Monte Carlo methods (Doucet et al., 2001).

An ideal filter for a non-linear system should allow for multi-modal approximations, and at the same time its approximations should be consistent to avoid degenerate solutions. In this report, we propose a filtering method (see 2.1.3) that approximates a non-Gaussian density by a Gaussian mixture model (GMM). A GMM allows modelling multi-modality as well as representing any density with arbitrary accuracy given a sufficiently large number of Gaussians; see (Anderson and Moore, 2005), Section 8.4, for a proof. Intuitively, we can imagine the extreme case of infinitely many Gaussians with variances tending to zero, forming a train of Dirac deltas that can asymptotically approximate any function with arbitrary precision.

The GMM presents an elegant deterministic filtering solution in the form of the Gaussian Sum filter (Alspach and Sorenson, 1972). The Gaussian Sum filter (GSUM-F) was proposed as a solution to estimation problems with non-Gaussian noise or prior densities. The GSUM-F relies on linear dynamics and on the assumption that the parameters of the Gaussian mixture approximation to the non-Gaussian noise or prior densities are known a priori. The linearity assumption can be relaxed, e.g. by linearisation (EKF GSUM-F) (Anderson and Moore, 2005) or deterministic sampling (UKF GSUM-F) (Luo et al., 2010), but both solutions still require a priori knowledge of the GMM parameters. If, however, the prior and noise densities are Gaussian, the UKF GSUM-F and EKF GSUM-F reduce to the standard UKF and EKF, i.e. they become uni-modal filters.

To account for a possible uni-modal to multi-modal transition in a non-linear system, we need to solve two problems: the propagation of the uncertainty and the parameter estimation of the GMM approximation. Kotecha and Djuric (2003) proposed random sampling for uncertainty propagation and Expectation-Maximization (EM) to estimate the GMM parameters. In this report, we propose to propagate uncertainty deterministically using the Unscented Transform, which also allows for a closed-form expression of the GMM parameters. In our proposed multi-modal filter (Kamthe et al., 2013), we present an approximate solution that addresses the possible uni-modal to multi-modal transition in a non-linear system by Gaussian mixture models, and we derive closed-form expressions to estimate the Gaussian mixture model parameters; for details see Section 2.1.

Filtering is an online estimation, i.e. our current state estimate is based on all the observations up to the current time index. We can, however, look back and correct our previous state estimates based on new observations. The filter that corrects estimates based on future observations is called a fixed-lag smoother, or simply a smoother. The smoother also needs to take the non-linear transition function into account. We propose a novel smoother based on the multi-modal filter in Section 2.2. To the best of our knowledge, a smoothing algorithm for non-linear estimation based on Gaussian sum approximations is not available. Other Gaussian mixture based smoothers in the literature are the two-filter smoother based on the backward information filter (Kitagawa, 1994), the Gaussian mixture smoother (Vo et al., 2012) based on the β recursion, and the Expectation Correction smoother (Barber, 2006), which approximates a Gaussian integral by evaluating the integrand at the mean of the Gaussian uncertainty.

1.1 Scope of the Work and Contributions

As a testimony to the importance of non-linear estimation and time series models, many standard texts are available; in particular, excellent descriptions of filtering and state estimation problems are given in (Kailath et al., 2000) and (Bar-Shalom et al., 2004). We include standard Gaussian filters and particle filters as state-of-the-art non-linear estimators in this introductory chapter. Non-linear filters can be broadly classified as shown in Figure 1.2. We introduce only the standard algorithms for each theme, although more exist; e.g. deterministic uni-modal filters (Gaussian beliefs) also include Cubature Kalman filters (Arasaratnam and Haykin, 2009), Gauss-Hermite quadrature filters, etc.

1.1.1 Scope

In this work we restrict ourselves to Gaussian priors and Gaussian noise densities, and we use standard benchmarks such as the Univariate Non-linear Growth Model (UNGM) to test the new algorithms proposed here. The Gaussian sum filter and multi-modal filters can handle non-Gaussian densities and, hence, will outperform uni-modal non-linear filters in scenarios with non-Gaussian noise densities. However, we avoid non-Gaussian priors in order to demonstrate that the proposed algorithm can predict and handle non-Gaussian densities that are generated by the non-linearity of the system. The proposed multi-modal filter will outperform uni-modal filters under standard testing conditions for non-Gaussian priors;² hence, for a fair comparison with uni-modal filters we only consider Gaussian noise in the system.

[Figure 1.2: tree diagram. Node labels: Non-linear Filters; Unimodal / Gaussian; Multimodal; Random; Deterministic; Gibbs Filter; EKF; UKF; Particle Filter; Gaussian Sum Filter.]

Figure 1.2.: Non-linear Filters: The classification of non-linear filters is abstract, and its main purpose is to define the scope of this work in the context of the state-of-the-art literature. All leaves are non-linear filters and are described in this report. Orange circles represent the standard algorithms on which we build our contributions.

² The multi-modal filter is based on (Alspach and Sorenson, 1972).

Assumptions and Scope

Unless stated otherwise, the following assumptions are made throughout the report.

- Non-linearity: The system function f and the measurement function h are non-linear and assumed to be known and continuous.
- Prior: We assume a Gaussian prior on the initial state variables.
- Noise: The noise density is assumed to be Gaussian, and we assume a standard normal distribution throughout our discussion and examples to maintain consistency.

The following points are NOT covered in this report.

- A thorough comparison with Markov Chain Monte Carlo (MCMC) methods.
- A comparison of the proposed smoother with state-of-the-art Gaussian smoothers. We do, however, include a comparison with the Gibbs smoother (Deisenroth and Ohlsson, 2011) and leave thorough comparisons to future work.

1.1.2 Contributions

The main contributions of this thesis are listed below.

1. We present closed-form expressions for estimating the parameters of the Gaussian mixture model that approximates the multi-modal predictive density of a non-linear dynamical system (see the section on estimation of the Gaussian mixture parameters).
2. The Multi-Modal Filter (M-MF) (Kamthe et al., 2013), introduced in Section 2.1, is a multi-modal approach to filtering in non-linear dynamical systems, where all densities are represented by Gaussian mixtures.
3. We propose a novel multi-modal smoothing pass for non-linear dynamical systems: the result of the Multi-Modal Filter is used to obtain a Gaussian mixture based multi-modal smoother, see Section 2.2. To the best of our knowledge, a smoothing pass based on Gaussian sum approximations is not available in the literature, and this algorithm is the most significant contribution of the thesis.

Note: Gaussian sum approximations were first introduced in (Alspach and Sorenson, 1972) and form the basis of the Gaussian sum filter (Anderson and Moore, 2005). The Gaussian mixture representation provides the flexibility to approximate complex predictive densities, along with the ability to use closed-form expressions for filtering and smoothing.

1.2 Bayesian Filter

[Figure 1.3: graphical model. Hidden states $x_{n-1}$, $x_n$, $x_{n+1}$ are linked by the transition function $f$; each state emits an observation $y_{n-1}$, $y_n$, $y_{n+1}$ through the measurement function $h$. Arrows are labelled "Filtering" and "Smoothing".]

Figure 1.3.: State Space Model (SSM) of a dynamical system. Circles represent the hidden state variables, squares represent observed variables, and subscripts denote time indices. The state transition function $f$ and the measurement function $h$ are assumed known and non-linear. Arrows indicate the information flow in filtering and smoothing. Both estimators are recursive, i.e. in a typical SSM the above sequence repeats itself in both directions (left and right).

We now describe the generic framework of inference in state space models and introduce standard terminology in the context of tennis ball tracking. Readers familiar with state-space models can skip to the section on Bayesian filters.

The mathematics, or the set of equations, used to predict future state values and to correct our estimate of the state based on new measurements is called filtering. When the system is linear (possibly non-stationary) and the noise is Gaussian, we can use the Kalman filter (Kalman, 1960) to obtain optimal estimates (Anderson and Moore, 2005). State-space models are powerful tools for representing many complex problems elegantly; e.g. the whole ball tracking system can be neatly and compactly represented by the state-space diagram shown in Figure 1.3. The state-space representation is simple enough to understand intuitively, and yet it can represent almost all estimation problems of practical significance (Thrun et al., 2005).

In the diagram, the states are represented by circles and the measurements by squares. The circles are called latent variables, as we can ascertain their existence only through our observations, the orange squares in Figure 1.3. The velocity of the tennis ball, for example, is a latent variable, and we can estimate it only from our observations. To facilitate a mathematical formulation of the system we assume the Markov property, i.e. the next state depends entirely on the current state and, given the current state, is independent of all other states. The Markov assumption allows us to write recursive estimation algorithms.

Throughout this thesis we represent the state of the system by the variable $x$ and use the subscript notation $x_n$ to denote the value of the state vector at the discrete time step index $n$. To model sequential data in time series we use the SSM described above. The state transition function $f$ and the measurement (observation) function $h$ are assumed to be known. The variable $Y_j = \{y_1, \dots, y_j\}$ represents all the observations up to time step $j$. The model can be represented as in Figure 1.3.

1.2.1 Filtering

In a recursive Bayesian filter, the current belief, or state distribution, $p(x_n)$ is calculated from the previous state distribution $p(x_{n-1})$ at the discrete time index $n-1$. We initialise the recursive filter with the state $x_0$ and a distribution $p(x_0)$. As our state is latent, every belief is conditioned on the observations available up to that time step, i.e. the belief at time index $n-1$ is $p(x_{n-1} \mid Y_{n-1})$. Filtering is defined as estimating $p(x_n \mid Y_n)$ given the new observation $y_n$ and can be divided into two steps, the time update and the measurement update. We typically alternate between these two steps, i.e. we predict the state (time update), then correct the state estimate (measurement update), use the corrected state to predict the next state, and so on.

Time Update

In the time update we move one step horizontally along the chain (horizontal red arrow) in Figure 1.3. This can be represented as estimating $p(x_n \mid Y_{n-1})$ from $p(x_{n-1} \mid Y_{n-1})$. The time update is defined by the integral

\[ p(x_n \mid Y_{n-1}) = \int p(x_n \mid x_{n-1})\, p(x_{n-1} \mid Y_{n-1})\, \mathrm{d}x_{n-1}. \tag{1.1} \]

In general, this integral cannot be solved in closed form for non-linear systems, and all non-linear filters approximate it. Particle filters, for example, replace the integral by a sum and the probability densities by random samples.

Measurement Update

In the measurement update we incorporate the current observation $y_n$ and estimate the posterior density $p(x_n \mid Y_n)$. This can be interpreted as moving information from the observation to correct the time-update prediction according to the current measurement (vertical red arrow). We use Bayes' theorem to write

\[ p(x_n \mid Y_n) = \frac{p(y_n \mid x_n)\, p(x_n \mid Y_{n-1})}{p(y_n \mid Y_{n-1})}, \tag{1.2} \]

where the marginal density is obtained as $p(y_n \mid Y_{n-1}) = \int p(y_n \mid x_n)\, p(x_n \mid Y_{n-1})\, \mathrm{d}x_n$.

1.2.2 Smoothing

In smoothing, we wish to estimate the distribution of the state $x_n$ given all observations $Y_N$. We assume that all the filtering densities $p(x_n \mid Y_n)$ are available. We estimate $p(x_n \mid Y_N)$ recursively as (Kitagawa, 1994)

\[ p(x_n \mid Y_N) = p(x_n \mid Y_n) \int \frac{p(x_{n+1} \mid Y_N)\, p(x_{n+1} \mid x_n)}{p(x_{n+1} \mid Y_n)}\, \mathrm{d}x_{n+1}. \tag{1.3} \]

In Figure 1.3, this can be interpreted as moving from right to left.
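To make the two updates concrete, the following Python sketch evaluates (1.1) and (1.2) numerically on a one-dimensional grid. It is only an illustration under stated assumptions: the function names (`gaussian_pdf`, `grid_bayes_step`), the additive Gaussian noise model, and the linear example system are hypothetical choices made here, picked so that the result can be compared with the closed-form Kalman solution.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian N(x | mean, var)."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def grid_bayes_step(grid, belief, y, f, h, q_var, r_var):
    """One time update (1.1) and measurement update (1.2) on a uniform 1-D grid."""
    dx = grid[1] - grid[0]
    # trans[i, j] = p(x_n = grid[i] | x_{n-1} = grid[j]) for additive Gaussian noise.
    trans = gaussian_pdf(grid[:, None], f(grid)[None, :], q_var)
    predicted = trans @ belief * dx              # Riemann sum for the integral in (1.1)
    unnormalised = gaussian_pdf(y, h(grid), r_var) * predicted
    evidence = unnormalised.sum() * dx           # p(y_n | Y_{n-1})
    return predicted, unnormalised / evidence    # p(x_n | Y_{n-1}), p(x_n | Y_n)

# Hypothetical linear example, chosen so the answer is known in closed form.
grid = np.linspace(-10.0, 10.0, 801)
prior = gaussian_pdf(grid, 0.0, 1.0)
predicted, posterior = grid_bayes_step(
    grid, prior, y=1.0, f=lambda x: 0.9 * x, h=lambda x: x, q_var=0.5, r_var=0.5
)
dx = grid[1] - grid[0]
posterior_mean = (grid * posterior).sum() * dx
posterior_var = ((grid - posterior_mean) ** 2 * posterior).sum() * dx
```

A grid of this kind is only practical in very low dimensions, which is one reason the filters discussed in this chapter rely on parametric or sample-based approximations instead.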

Tennis Ball Tracking

We now consider Bayesian filtering in the ball tracking context.

- Time Update: We estimate the next state of the ball. Will it move in a straight line? Will it hit the ground (in the next time step)? Was there any spin?
- Measurement Update: We correct our estimate to update our current state, e.g. the ball hit the ground, there was top spin imparted on it, etc.
- Smoothing Update: Smoothing allows us to correct the whole trajectory based on the final observations. For example, now that we know there was top spin, does that change our estimate of the state of the ball at the very beginning? Ideally it should, if the spin imparted was a latent variable and we now know more about the type and amount of spin based on the final result.

So if we know how the system works and have high-speed cameras, why do we need integrals and probabilities? Our observations are noisy or uncertain (we capture a 3D world with 2D sensors), and even with stereoscopic vision we can be tricked (optical illusions). In the Hawk-Eye tracking system the ball and its environment are uncertain, e.g. humidity and the amount of spin imparted decide how the ball moves in the air. To deal with uncertainties we make assumptions, and naturally the accuracy of the estimation depends on how close our assumptions are to reality. In the following we establish a standard approximation used in Bayesian filtering and smoothing.

1.3 Linear Gaussian Filter

The Bayesian filtering problem defined by Equations (1.1) to (1.3) can be solved in closed form for linear Gaussian systems (Thrun et al., 2005). The derivation of the Kalman filter from the Bayesian filter is well known, and we do not attempt to rewrite it. Readers can find a derivation of the Kalman filter based on the Wiener filter in (Kailath et al., 2000); a derivation based on the Bayesian filter approach is given in (Thrun et al., 2005). The Kalman filter exploits the Gaussian conditioning property, i.e. if we have a jointly Gaussian distribution we can obtain the conditional density in closed form. We assume Gaussian uncertainties and approximate densities or beliefs as Gaussians. Filtering based on the Gaussian assumption is called Gaussian filtering.

1.3.1 The Kalman Filter

We consider a dynamical system described by the following equations

\[ x_n = f(x_{n-1}) + w, \qquad w \sim \mathcal{N}(0, Q), \tag{1.4} \]
\[ y_n = h(x_n) + v, \qquad v \sim \mathcal{N}(0, R), \tag{1.5} \]

where $f$ and $h$ are the (non-linear) transition and measurement functions, respectively. The noise processes $w$ and $v$ are i.i.d. zero-mean Gaussian with covariances $Q$ and $R$, respectively. We denote the $D$-dimensional state by $x$, and $y$ is the $E$-dimensional observation.

In Gaussian filtering we assume that all the known densities, i.e. the system noise, the measurement noise and the prior density, are Gaussian. For a linear system this enables us to use linear Gaussian conditioning, and all of Equations (1.1) to (1.3) can be derived in closed form. We assume that our state and measurement variables are jointly Gaussian in order to exploit the Gaussian conditioning property. This assumption reduces Kalman filtering to estimating the auto-covariance and cross-covariance matrices (Deisenroth and Ohlsson, 2011). The joint Gaussian distribution for a latent state $x_n$ and a measurement variable $y_n$ can be written as

\[ \begin{bmatrix} x_n \\ y_n \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu^{x}_n \\ \mu^{y}_n \end{bmatrix}, \begin{bmatrix} \Sigma^{x,x}_n & \Gamma^{x,y}_n \\ (\Gamma^{x,y}_n)^{T} & \Sigma^{y,y}_n \end{bmatrix} \right) \tag{1.6} \]

and for the state variables at time indices $n-1$ and $n$ as

\[ \begin{bmatrix} x_{n-1} \\ x_n \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_{n-1} \\ \mu_n \end{bmatrix}, \begin{bmatrix} \Sigma^{x,x}_{n-1} & \Gamma^{x,x}_n \\ (\Gamma^{x,x}_n)^{T} & \Sigma^{x,x}_n \end{bmatrix} \right). \tag{1.7} \]

Here $\Gamma^{x,y}_n$ denotes the cross-covariance between $x_n$ and $y_n$, and $\Gamma^{x,x}_n$ the cross-covariance between $x_{n-1}$ and $x_n$.
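The Gaussian conditioning property that the Kalman filter relies on can be sketched in a few lines of Python. This is a minimal illustration, assuming a joint Gaussian with the block structure of (1.6); the function name `condition_gaussian` and the example numbers are hypothetical and not taken from the thesis.

```python
import numpy as np

def condition_gaussian(mu_x, mu_y, S_xx, G_xy, S_yy, y):
    """Mean and covariance of p(x | y) for jointly Gaussian (x, y).

    S_xx, S_yy are the auto-covariances and G_xy = cov(x, y) is the
    cross-covariance, matching the block layout of a joint Gaussian.
    """
    # gain = G_xy @ inv(S_yy), computed with a linear solve instead of an inverse.
    gain = np.linalg.solve(S_yy.T, G_xy.T).T
    mean = mu_x + gain @ (y - mu_y)
    cov = S_xx - gain @ G_xy.T
    return mean, cov

# Hypothetical scalar example: unit correlation structure, observation y = 2.
cond_mean, cond_cov = condition_gaussian(
    mu_x=np.array([0.0]), mu_y=np.array([0.0]),
    S_xx=np.array([[2.0]]), G_xy=np.array([[1.0]]), S_yy=np.array([[2.0]]),
    y=np.array([2.0]),
)
```

The measurement update of a Gaussian filter is exactly this operation applied to the joint of state and measurement, with the measurement noise covariance added to the measurement block.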

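Algorithm 1 Kalman Filter: The algorithm illustrates Kalman filtering for linear Gaussian systems. Subscripts denote the time index, superscripts denote the latent ($x$) or observed ($y$) variable, and $\mu$, $\Sigma$ and $\Gamma$ are the mean, covariance and cross-covariance, respectively, of beliefs that are Gaussian (linear systems) or approximated as Gaussian (non-linear systems).

    function KALMANFILTER($f$, $h$, $\mu_{n-1}$, $\Sigma_{n-1}$, $y_n$, $Q$, $R$)
        $\mu_n \leftarrow f(\mu_{n-1})$
        Estimate $\Sigma^{x,x}_n$ using $f$ and $\Sigma_{n-1}$
        $\Sigma_n \leftarrow \Sigma^{x,x}_n + Q$                                        (time update)
        Estimate $\Sigma^{y,y}_n$ and $\Gamma^{x,y}_n$ using $h$ and $\Sigma_n$
        $K \leftarrow \Gamma^{x,y}_n (\Sigma^{y,y}_n + R)^{-1}$                         (Kalman gain $K$)
        $\mu_n \leftarrow \mu_n + K\,(y_n - h(\mu_n))$                                  (measurement update)
        $\Sigma_n \leftarrow \Sigma_n - \Gamma^{x,y}_n (\Sigma^{y,y}_n + R)^{-1} (\Gamma^{x,y}_n)^{T}$
        return $\mu_n$, $\Sigma_n$
    end function

We describe the Kalman filter algorithm based on the joint Gaussian assumption in Algorithm 1. For a linear dynamical system with system matrix $F$ and measurement matrix $H$, the covariances are given by

\[ \Sigma^{x,x}_n = F \Sigma_{n-1} F^{T}, \qquad \Sigma^{y,y}_n = H \Sigma_n H^{T}, \qquad \Gamma^{x,x}_n = \Sigma_{n-1} F^{T}, \qquad \Gamma^{x,y}_n = \Sigma_n H^{T}, \tag{1.8} \]

where $\Sigma_n$ in the expressions for $\Sigma^{y,y}_n$ and $\Gamma^{x,y}_n$ is the predicted covariance after the time update. For a non-linear system with a Gaussian assumption (implicit linearisation), we approximate these covariance matrices using different methods, such as the Unscented Transform or Gibbs sampling (Deisenroth and Ohlsson, 2011).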
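As a rough illustration of Algorithm 1 in the linear case, the following Python sketch performs one time update and one measurement update using the covariances in (1.8). The function name `kalman_step` and the scalar example are illustrative assumptions, not part of the thesis.

```python
import numpy as np

def kalman_step(mu_prev, Sigma_prev, y, F, H, Q, R):
    """One Kalman filter step for x_n = F x_{n-1} + w, y_n = H x_n + v."""
    # Time update: predicted mean and covariance.
    mu_pred = F @ mu_prev
    Sigma_pred = F @ Sigma_prev @ F.T + Q
    # Moments of the joint of state and measurement, as in (1.8).
    S_yy = H @ Sigma_pred @ H.T
    G_xy = Sigma_pred @ H.T
    S = S_yy + R                                   # innovation covariance
    K = np.linalg.solve(S, G_xy.T).T               # Kalman gain G_xy @ inv(S); S is symmetric
    # Measurement update.
    mu = mu_pred + K @ (y - H @ mu_pred)
    Sigma = Sigma_pred - K @ S @ K.T
    return mu, Sigma

# Hypothetical scalar system used only to exercise the function.
mu_filt, Sigma_filt = kalman_step(
    mu_prev=np.array([0.0]), Sigma_prev=np.array([[1.0]]), y=np.array([1.0]),
    F=np.array([[0.9]]), H=np.array([[1.0]]),
    Q=np.array([[0.5]]), R=np.array([[0.5]]),
)
```

For a non-linear system the same skeleton applies once `S_yy`, `G_xy` and the predicted covariance are replaced by approximations, which is where the EKF, the UKF and the Gibbs filter differ.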

1.3.2 The Rauch-Tung-Striebel (RTS) Smoother

In smoothing we move backwards in time to update our estimate of the trajectory. Below we describe a well-known backward pass called the Rauch-Tung-Striebel (RTS) smoother. The derivation of the backward pass is similar to the one presented in (Särkkä, 2008). We describe a procedure to evaluate the integral (1.3).

1. We reproduce Equation (1.3):

\[ p(x_n \mid Y_N) = p(x_n \mid Y_n) \int \frac{p(x_{n+1} \mid Y_N)\, p(x_{n+1} \mid x_n)}{p(x_{n+1} \mid Y_n)}\, \mathrm{d}x_{n+1}. \tag{1.9a} \]

2. We move the filtering result $p(x_n \mid Y_n)$ inside the integral, as it is constant with respect to the integration variable $x_{n+1}$, to obtain

\[ p(x_n \mid Y_N) = \int p(x_{n+1} \mid Y_N)\, \frac{\overbrace{p(x_{n+1} \mid x_n)\, p(x_n \mid Y_n)}^{\text{joint distribution of } x_n \text{ and } x_{n+1}}}{p(x_{n+1} \mid Y_n)}\, \mathrm{d}x_{n+1}. \tag{1.9b} \]

3. We can obtain the conditional distribution of $x_n$ given $x_{n+1}$ and $Y_n$ from the joint distribution $p(x_n, x_{n+1} \mid Y_n)$. This conditional is obtained by merely rearranging the terms of the previous step:

\[ p(x_n \mid Y_N) = \int \underbrace{\frac{p(x_n, x_{n+1} \mid Y_n)}{p(x_{n+1} \mid Y_n)}}_{\text{conditional distribution of } x_n}\, p(x_{n+1} \mid Y_N)\, \mathrm{d}x_{n+1}. \tag{1.9c} \]

4. Exploiting the Markov property, we can write $p(x_n \mid x_{n+1}, Y_N) = p(x_n \mid x_{n+1}, Y_n)$ and, hence,

\[ p(x_n \mid Y_N) = \int p(x_n \mid x_{n+1}, Y_N)\, p(x_{n+1} \mid Y_N)\, \mathrm{d}x_{n+1}. \tag{1.9d} \]

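5. If we assume that the conditional $p(x_n \mid x_{n+1}, Y_N)$ is Gaussian and that $p(x_{n+1} \mid Y_N) = \mathcal{N}(x_{n+1} \mid \mu^{s}_{n+1}, \Sigma^{s}_{n+1})$, then the distribution $p(x_n \mid Y_N)$ is also Gaussian, i.e. $p(x_n \mid Y_N) = \mathcal{N}(x_n \mid \mu^{s}_n, \Sigma^{s}_n)$.

Based on Step 5 above, we can use Gaussian conditioning to write the smoothing algorithm as Algorithm 2.

Algorithm 2 RTS Smoother: The algorithm describes the RTS smoother for linear Gaussian systems. Subscripts denote the time index, superscripts denote the latent ($x$) or observed ($y$) variable, and $\mu$, $\Sigma$ and $\Gamma$ are the mean, covariance and cross-covariance, respectively, of beliefs that are Gaussian (linear systems) or approximated as Gaussian (non-linear systems). The superscript $s$ denotes the final smoothed result, i.e. $\mu^{s}_n$ is the smoothed mean and $\mu_n$ the filtered mean. The one-step predictions from the filtered estimate are written $\mu_{n+1|n}$ and $\Sigma_{n+1|n}$, and $F$ is the system matrix (or the Jacobian of $f$ evaluated at $\mu_n$).

    function RTSSMOOTHER($\mu_n$, $\Sigma_n$, $\mu^{s}_{n+1}$, $\Sigma^{s}_{n+1}$, $Q$)
        $\mu_{n+1|n} = f(\mu_n)$
        $\Sigma_{n+1|n} = F \Sigma_n F^{T} + Q$
        $D = \Sigma_n F^{T} \Sigma_{n+1|n}^{-1}$                                        (smoother gain $D$)
        $\mu^{s}_n = \mu_n + D\,(\mu^{s}_{n+1} - \mu_{n+1|n})$
        $\Sigma^{s}_n = \Sigma_n + D\,(\Sigma^{s}_{n+1} - \Sigma_{n+1|n})\,D^{T}$
        return $\mu^{s}_n$, $\Sigma^{s}_n$
    end function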
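A single backward step of the RTS smoother for a linear system can be sketched as follows. The function name `rts_step` and the numbers in the example are illustrative assumptions; the smoother gain is computed as $\Sigma_n F^{T}$ times the inverse of the predicted covariance.

```python
import numpy as np

def rts_step(mu, Sigma, mu_s_next, Sigma_s_next, F, Q):
    """One backward RTS step: smoothed moments at n from smoothed moments at n+1."""
    # One-step prediction from the filtered estimate at time n.
    mu_pred = F @ mu
    Sigma_pred = F @ Sigma @ F.T + Q
    # Smoother gain D = Sigma F^T inv(Sigma_pred), via a linear solve.
    D = np.linalg.solve(Sigma_pred, F @ Sigma).T
    mu_s = mu + D @ (mu_s_next - mu_pred)
    Sigma_s = Sigma + D @ (Sigma_s_next - Sigma_pred) @ D.T
    return mu_s, Sigma_s

# Hypothetical scalar example: the smoothed estimate at n+1 pulls the estimate at n upwards.
mu_s, Sigma_s = rts_step(
    mu=np.array([1.0]), Sigma=np.array([[1.0]]),
    mu_s_next=np.array([3.0]), Sigma_s_next=np.array([[1.0]]),
    F=np.array([[1.0]]), Q=np.array([[1.0]]),
)
```

If the smoothed moments at $n+1$ coincide with the one-step prediction, the correction terms vanish and the smoothed estimate equals the filtered one, which is a convenient sanity check.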

[Figure 1.4: plot showing the prior density $p(x)$ over $x$, the non-linear function $f(x)$, and the predictive density $p(f(x))$ over $f(x)$.]

Figure 1.4.: Example of a multi-modal predictive density in the UNGM with a standard normal distribution as prior. The blue curve is the standard normal prior and the green curve represents the non-linear function. The predictive density for the non-linear mapping of the standard normal distribution is shown by the red curve. The red curve was generated with a kernel smoothing density estimator using $10^5$ samples. The samples are generated from the Gaussian distribution shown in blue and then mapped through the non-linear function $f$ (in green).

1.4 Nonlinear Filters

The Kalman filter and the RTS smoother allow us to calculate state estimates recursively, and all relations are given in closed form. However, they assume that the system is linear, whereas most real-life applications are non-linear in nature. In our ball tracking system, for example, the observation function, i.e. the video cameras, implicitly uses non-linear transformations to measure the position of the ball, e.g. Cartesian to polar coordinates, or a 3-dimensional world to a 2-dimensional observation. We cannot use the Kalman filter in the form described above unless we use approximations to estimate the covariance matrices in Equations (1.6) and (1.7). The approximations are needed due to the following challenges in non-linear dynamical systems.

1. The predictive distributions, i.e. the integrals used in our Bayesian framework, do not yield closed-form solutions. Approximations are needed.
2. A non-linear mapping results in non-Gaussian distributions, i.e. even if we were somehow able to evaluate the integrals, we could not propagate the results forward due to the lack of a parametric form. As an illustration, consider a hypothetical system that starts as Gaussian and undergoes a non-linear function map as shown in Figure 1.4. The red curve on the left describes a non-Gaussian density; hence, we cannot use the Kalman filter or the RTS smoother.

Most non-linear estimators solve both challenges simultaneously, i.e. by linearising the non-linear system. The process is described in the following sections. We make the above distinction to illustrate our approach to tackling the problem. We treat the above challenges individually, i.e.

1. We provide a Bayesian framework to estimate the predictive density.
2. We keep the density parametric, so that we can use recursive filters such as Gaussian sum filters. This is not necessarily a linearisation, as we allow a non-Gaussian predictive density for a Gaussian prior.

Figure 1.4 shows the approximation for the same non-linear function that we describe in Figure 2.2. If we approximate the predictive density by a Gaussian, we implicitly linearise the system. This implicit linearisation can lead to large approximation errors: in Figure 1.4, any Gaussian approximation to the red curve (predictive density) would result in large errors, as a single Gaussian cannot capture both modes simultaneously.
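The effect shown in Figure 1.4 can be reproduced with a short Monte Carlo sketch. The transition function below is an assumption: it is the commonly used UNGM form $0.5x + 25x/(1+x^2)$ with the time-dependent forcing term left out, and may differ from the exact function used in the thesis. The thresholds 2 and 8 are likewise arbitrary illustrative choices.

```python
import numpy as np

def ungm_transition(x):
    """UNGM-style transition without the time-dependent forcing term (assumed form)."""
    return 0.5 * x + 25.0 * x / (1.0 + x ** 2)

rng = np.random.default_rng(0)
samples = rng.standard_normal(100_000)       # draws from the standard normal prior
mapped = ungm_transition(samples)            # draws from the predictive density

# A moment-matched single Gaussian puts its mean between the two modes,
# where the sampled predictive density has very little mass.
gauss_mean, gauss_std = mapped.mean(), mapped.std()
frac_near_mean = np.mean(np.abs(mapped - gauss_mean) < 2.0)
frac_near_modes = np.mean(np.abs(mapped) > 8.0)
print(f"moment-matched Gaussian: mean={gauss_mean:.2f}, std={gauss_std:.2f}")
print(f"mass within 2 of the mean: {frac_near_mean:.3f}")
print(f"mass with |f(x)| > 8:      {frac_near_modes:.3f}")
```

Most of the probability mass ends up far from the mean of the moment-matched Gaussian, which is the mismatch that motivates a mixture representation.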

Linear Approximation

As described above, in the linear approximation we approximate all beliefs or state distributions by a Gaussian density. The Gaussian assumption allows us to use the Gaussian conditioning property, which enables us to write the state estimation in closed form. So, if we assume $p(x_n) = \mathcal{N}(x_n \mid \mu_n, \Sigma_n)$ and $p(x_{n-1}) = \mathcal{N}(x_{n-1} \mid \mu_{n-1}, \Sigma_{n-1})$, then our linear approximation implies

\[ \begin{bmatrix} x_n \\ x_{n-1} \end{bmatrix} \sim \mathcal{N}\!\left( \begin{bmatrix} \mu_n \\ \mu_{n-1} \end{bmatrix}, \begin{bmatrix} \Sigma_n & \Gamma_n^{T} \\ \Gamma_n & \Sigma_{n-1} \end{bmatrix} \right). \tag{1.10} \]

With the known transition function $f$ we can evaluate the mean of the predictive Gaussian as $\mu_n = f(\mu_{n-1})$. In the following we describe standard methods, classified by the way we approximate the predictive covariances $\Sigma$ and $\Gamma$. A unified view on linear Gaussian systems is given in (Deisenroth and Ohlsson, 2011). We give a brief overview of some non-linear estimators that use the linear Gaussian assumption.

- Extended Kalman Filter: Linearise the function using a Taylor series expansion, see Section 1.4.1.
- Unscented Kalman Filter: Use deterministic sampling to estimate the covariances, see Section 1.4.2.
- Gibbs Filter: Use random sampling to estimate the mean and the covariances.

The list of algorithms is meant to be illustrative, not exhaustive. Each type described above demonstrates a principle or concept used to tackle the intractability inherent in non-linear filtering systems (Deisenroth and Ohlsson, 2011). Several ad-hoc methods are available in the literature.

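1.4.1 Extended Kalman Filter

As described in Section 1.4, we linearise the system and measurement functions. We use the same model as described by Equations (1.4) and (1.5). With our linear approximation model (1.10), the parameters of the predictive Gaussian are calculated as

\[ \begin{aligned} \mu_n &= f(\mu_{n-1}), \\ \Sigma^{x,x}_n &= F_{n-1}(\mu_{n-1})\, \Sigma_{n-1}\, F_{n-1}^{T}(\mu_{n-1}), \\ \Sigma^{y,y}_n &= H_n(\mu_n)\, \Sigma_n\, H_n^{T}(\mu_n), \\ \Gamma^{x,y}_n &= \Sigma_n\, H_n^{T}(\mu_n), \\ \Gamma^{x,x}_n &= \Sigma_{n-1}\, F_{n-1}^{T}(\mu_{n-1}), \end{aligned} \tag{1.11} \]

where $F_{n-1}$ is the Jacobian matrix of $f$, obtained by

\[ \big[F(\mu)\big]_{k,k'} = \left. \frac{\partial f_k(x)}{\partial x_{k'}} \right|_{x=\mu}, \tag{1.12} \]

where $k$ and $k'$ denote the row and column of the matrix. Similarly, for the Jacobian of $h$ we have

\[ \big[H(\mu)\big]_{k,k'} = \left. \frac{\partial h_k(x)}{\partial x_{k'}} \right|_{x=\mu}. \tag{1.13} \]

The matrices in (1.11) can be used in Algorithm 1 to obtain the Extended Kalman Filter (EKF).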
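The EKF time update can be sketched in Python as follows. To keep the example self-contained, the Jacobian in (1.12) is approximated by central finite differences, which is a substitution made here for convenience; an analytic Jacobian would normally be preferred. The names `numerical_jacobian` and `ekf_predict` and the test function are hypothetical.

```python
import numpy as np

def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference approximation of the Jacobian of func at x."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(func(x), dtype=float)
    J = np.zeros((fx.size, x.size))
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        J[:, k] = (np.asarray(func(x + step)) - np.asarray(func(x - step))) / (2.0 * eps)
    return J

def ekf_predict(f, mu_prev, Sigma_prev, Q):
    """EKF time update: predicted mean, predicted covariance, and cross-covariance."""
    F = numerical_jacobian(f, mu_prev)           # Jacobian evaluated at the previous mean
    mu_pred = np.asarray(f(mu_prev), dtype=float)
    Sigma_xx = F @ Sigma_prev @ F.T              # propagated state covariance
    Gamma_xx = Sigma_prev @ F.T                  # cross-covariance of x_{n-1} and x_n
    return mu_pred, Sigma_xx + Q, Gamma_xx

# Hypothetical non-linear transition used only for illustration.
f_demo = lambda x: np.array([np.sin(x[0]), x[0] * x[1]])
mu0 = np.array([0.0, 2.0])
J_demo = numerical_jacobian(f_demo, mu0)
mu_pred, Sigma_pred, Gamma = ekf_predict(f_demo, mu0, np.eye(2), 0.1 * np.eye(2))
```

The measurement side follows the same pattern with $h$ in place of $f$, after which the gain and update steps of Algorithm 1 apply unchanged.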

Figure 1.5.: Example of the UT for mean and covariance propagation. a) actual, b) first-order linearization (EKF), c) UT. The figure is reproduced from (Wan and van der Merwe, 2000).

1.4.2 Unscented Kalman Filter

The unscented Kalman filter is based on the Unscented Transform (UT) (Julier and Uhlmann, 2004), where we use deterministic sampling to estimate the covariance matrices $\Sigma$ and $\Gamma$. As with the EKF, we only need to estimate the covariance matrices (Deisenroth and Ohlsson, 2011) to obtain the unscented Kalman filter. The matrices in (1.14) can be used in Algorithm 1 to obtain the Unscented Kalman Filter (UKF).

To illustrate the UT, consider a variable $x_{n-1} \sim \mathcal{N}(x_{n-1} \mid \mu_{n-1}, \Sigma_{n-1})$. We want to estimate the distribution $p(x_n)$ given that $x_n = f(x_{n-1})$, where $f$ is any non-linear function. We can divide the transform into the following steps.

1. Calculate the sigma points $\chi_{n-1}$ of the distribution $\mathcal{N}(x_{n-1} \mid \mu_{n-1}, \Sigma_{n-1})$.
2. Propagate the sigma points through the non-linear function $f$.
3. Estimate the transformed mean and covariances based on the transformed sigma points.³

³ The $Y$ in (1.14f) and (1.14g) is a matrix and not a vector of observations.

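$$\begin{aligned}
\chi^{[0]}_{n-1} &= \mu_{n-1}, &\text{(1.14a)}\\
\chi^{[j]}_{n-1} &= \mu_{n-1} + \sqrt{c}\,\big(\sqrt{\Sigma_{n-1}}\big)_j, &\text{(1.14b)}\\
\chi^{[j+d]}_{n-1} &= \mu_{n-1} - \sqrt{c}\,\big(\sqrt{\Sigma_{n-1}}\big)_j, &\text{(1.14c)}
\end{aligned}$$

where $j = 1, \dots, d$, $d$ is the dimension of the state, and $\big(\sqrt{\Sigma_{n-1}}\big)_j$ represents the $j$-th row of the Cholesky factor of $\Sigma_{n-1}$. The points $\chi_{n-1}$ are called sigma points as they are spread around the mean at a distance of one sigma. The predictive moments are

$$\begin{aligned}
\mu_n &= X\,w_\mu, &\text{(1.14d)}\\
\Sigma_{x,x} &= X\,W\,X^T, &\text{(1.14e)}\\
\Sigma_{y,y} &= Y\,W\,Y^T, &\text{(1.14f)}\\
\Gamma_{y,x} &= Y\,W\,X^T, &\text{(1.14g)}
\end{aligned}$$

where $X$ and $Y$ are the matrices of transformed sigma points, $\chi_{n-1}$ is the sigma point matrix, $\sqrt{\Sigma_{n-1}}$ is the lower triangular matrix of the Cholesky factorisation, and the function $f(\cdot)$ is applied to each column of its argument matrix. The constant $c$ is obtained by $c = \alpha^2(d + \kappa)$, where $\alpha$ and $\kappa$ are constants of the UT. The vector $w_\mu$ and the matrix $W$ are defined as

$$\begin{aligned}
w_\mu &= \big[W_m^{0} \;\dots\; W_m^{2d}\big]^T, &\text{(1.14h)}\\
W &= \big(I - [w_\mu \dots w_\mu]\big)\,\mathrm{diag}\big(W_c^{0} \dots W_c^{2d}\big)\,\big(I - [w_\mu \dots w_\mu]\big)^T, &\text{(1.14i)}
\end{aligned}$$

where the weights are obtained by

$$\begin{aligned}
W_m^{0} &= \lambda/(d + \lambda), \qquad W_c^{0} = \lambda/(d + \lambda) + (1 - \alpha^2 + \beta), &\text{(1.14j)}\\
W_m^{i} &= 1/\{2(d + \lambda)\}, \qquad W_c^{i} = 1/\{2(d + \lambda)\}, \qquad i = 1, \dots, 2d, &\text{(1.14k)}
\end{aligned}$$

with $\lambda$ defined as

$$\lambda = \alpha^2(d + \kappa) - d. \tag{1.14l}$$

The hyperparameters $\lambda$, $\alpha$ and $\kappa$ control the spread of the sigma points $\chi$.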
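The three steps of the unscented transform can be sketched for a scalar state as follows. This is a minimal illustration that assumes the usual convention λ = α²(d + κ) − d with d = 1; the function name `unscented_transform_1d`, the default hyperparameters (the values α = 1, β = 2, κ = 2 quoted later in the numerical results) and the test map f(x) = x² are choices made for this sketch.

```python
import math


def unscented_transform_1d(f, mu, var, alpha=1.0, beta=2.0, kappa=2.0):
    """Scalar unscented transform of N(mu, var) through f (state dimension d = 1)."""
    d = 1
    c = alpha ** 2 * (d + kappa)          # scaling constant c
    lam = c - d                           # lambda = alpha^2 (d + kappa) - d
    spread = math.sqrt(c * var)           # sqrt(c) times the 1-D Cholesky factor
    # Step 1: sigma points around the mean.
    sigma_points = [mu, mu + spread, mu - spread]
    # Mean and covariance weights for the central and the two outer points.
    w_mean = [lam / (d + lam), 1.0 / (2 * (d + lam)), 1.0 / (2 * (d + lam))]
    w_cov = [lam / (d + lam) + (1 - alpha ** 2 + beta), w_mean[1], w_mean[2]]
    # Step 2: propagate the sigma points through the non-linear function.
    ys = [f(x) for x in sigma_points]
    # Step 3: weighted mean and variance of the transformed points.
    mean = sum(w * y for w, y in zip(w_mean, ys))
    cov = sum(w * (y - mean) ** 2 for w, y in zip(w_cov, ys))
    return mean, cov


mean, cov = unscented_transform_1d(lambda x: x * x, 0.0, 1.0)
print(mean, cov)
```

In one dimension the matrix expression (1.14e) reduces to the weighted sum of squared deviations used in the last step.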

Algorithm 3 Particle Filter: The following algorithm describes a generic particle filter, adapted from (Thrun et al., 2005). $\chi$ represents samples and the subscript denotes the time index. The system function and the measurement function are given by $f$ and $h$, respectively. The algorithm returns particles $\chi_n$ that collectively represent the filtered estimate of the system at time $n$.

    function PARTICLE FILTER($\chi_{n-1}$, $f$, $h$, $y_n$)
        $\bar{\chi}_n = \chi_n = \emptyset$
        for $m = 1$ to $M$ do
            sample $x_n^{[m]} \sim f(x_{n-1}^{[m]}) \equiv p(x_n \mid x_{n-1}^{[m]})$
            $w_n^{[m]} = p(y_n \mid h(x_n^{[m]}))$        (calculate weight (importance) of particle)
            $\bar{\chi}_n = \bar{\chi}_n + \langle x_n^{[m]}, w_n^{[m]} \rangle$
        end for
        for $m = 1$ to $M$ do                             (importance re-sampling)
            sample $i$ with probability $\propto w_n^{[i]}$
            add $x_n^{[i]}$ to $\chi_n$
        end for
        return $\chi_n$
    end function

Particle Filters

Particle filters are a non-parametric implementation of the Bayesian filter. In particle filters we represent our beliefs or probability densities by samples drawn from these densities. See (Thrun et al., 2005) for a detailed overview of particle filters in terms of Bayesian filtering. As an illustrative example, consider a system with prior probability $p(x_0 \mid Y_0) = \mathcal{N}(x_0 \mid \mu_0, \Sigma_0)$. We represent $p(x_0 \mid Y_0)$ by a set $\chi$ of samples drawn randomly from $\mathcal{N}(x_0 \mid \mu_0, \Sigma_0)$, i.e.

$$\chi_0^{[M]} \sim p(x_0 \mid Y_0) = \mathcal{N}(x_0 \mid \mu_0, \Sigma_0),$$

where $M$ represents the number of particles. For the time update we can apply the function $f$ to the set of particles $\chi_{n-1}^{[M]}$, i.e.

$$\chi_n^{[M]} \sim p(x_n \mid Y_{n-1}) \approx f(\chi_{n-1}^{[M]}).$$

The generic particle filter is described in Algorithm 3, which is adapted from (Thrun et al., 2005). For a thorough overview of particle filters and Markov Chain Monte Carlo (MCMC) methods, refer to (Doucet et al., 2001).
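The steps of Algorithm 3 can be sketched for a scalar state as below. The sketch assumes additive Gaussian process and measurement noise with variances `q_var` and `r_var`; these parameters, the helper names and the linear toy model in the usage lines are assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def particle_filter_step(particles, f, h, y, q_var, r_var, rng):
    """One bootstrap particle filter step for a scalar state."""
    # Time update: sample from p(x_n | x_{n-1}) = N(f(x_{n-1}), q_var).
    predicted = [f(x) + rng.gauss(0.0, math.sqrt(q_var)) for x in particles]
    # Importance weights: likelihood p(y_n | x_n) = N(y_n | h(x_n), r_var).
    weights = [gaussian_pdf(y, h(x), r_var) for x in predicted]
    # Importance re-sampling: draw M particles with probability proportional to weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


# Toy usage with a linear model (an assumption of this sketch).
rng = random.Random(0)
particles = [rng.gauss(0.0, 1.0) for _ in range(1000)]
particles = particle_filter_step(
    particles, f=lambda x: 0.9 * x, h=lambda x: x, y=1.0,
    q_var=0.1, r_var=0.5, rng=rng)
posterior_mean = sum(particles) / len(particles)
print(posterior_mean)
```

Multinomial re-sampling via `random.Random.choices` is used here for brevity; other re-sampling schemes, such as the residual scheme mentioned in the numerical results, would replace only the last line of the function.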

Gibbs Filter

Unlike the generic particle filters described in the previous section, the Gibbs filter is parametric. The state densities $p(x_n \mid Y_n)$ are approximated by a parametric distribution, and we use random sampling (a Gibbs sampler) to estimate the parameters of this approximate density. In (Deisenroth and Ohlsson, 2011), the authors use the Gaussian assumption for the predictive densities, and the state covariance and cross-covariance matrices are calculated using a Gibbs sampler. In this work we use the Gibbs filter described in (Deisenroth and Ohlsson, 2011) as a baseline against which to compare the performance of the proposed filter.

Gaussian Sum Filter

The non-linear filters discussed in the preceding sections all share a common philosophy: approximate an intractable predictive distribution by a Gaussian distribution. The Gaussian approximation also implicitly imposes a unimodal belief on our state estimate; however, as the name of this report suggests, we are interested in multi-modal beliefs. If we express our multi-modal beliefs as Gaussian mixtures, then we can use the Gaussian Sum Filter proposed by (Alspach and Sorenson, 1972). We would like to stress that we view the Gaussian sum filter as an algorithm for Kalman-filter-like approximations when beliefs are expressed as Gaussian mixtures. The difference is subtle, and we give the following examples to support our view.

1. The Gaussian sum filter can be used for non-linear systems if and only if we express the approximate beliefs as Gaussian mixtures. The Gaussian sum filter itself has no mechanism to deal with non-linearities. Various ways to approximate functions using optimisation are discussed by the authors in (Alspach and Sorenson, 1972).
2. In Switching Linear Dynamical Systems (SLDS) Gaussian mixtures arise naturally; the Gaussian sum filter can be expressed as an Assumed Density Filter [cite Minka, Barber] where the assumed density is a Gaussian mixture.

An excellent discussion of, and alternative view on, the use of Gaussian sums for non-linear filtering is given in Section 8.4 of (Anderson and Moore, 2005). The chapter covers several pages and we recommend it as a must-read to gain insight into the interpretations of the Gaussian Sum Filter.
However, the important take-away point from this section is the argument that the Gaussian Sum Filter can be used as an approximate inference technique with Gaussian mixture beliefs.

Estimation Scheme                        Mean        Variance    P(X < 0)
χ² distribution
Particle Filter, 1e3 [1e6] particles     [0.9998]    [2.0020]    0
Extended Kalman Filter                   0           N/A         N/A
Unscented Kalman Filter
Multi-modal Filter

Table 1.1.: Comparison of various deterministic non-linear approximations. The table shows approximations for the standard normal distribution $p(x) = \mathcal{N}(0, 1)$ mapped through the quadratic non-linearity $x^2$. The true distribution is chi-squared with 1 degree of freedom, $p(x^2) = \chi^2_1$. The unscented Kalman filter and the multi-modal filter use the same optimal hyperparameters; the multi-modal filter additionally uses the tuning parameter $\alpha = 1.2$. The EKF diverges, see Figure 1.6.

1.5 Summary of State of the Art Filters and Motivation for Multi-Modal Filter

The Extended Kalman filter and the Unscented Kalman filter are standard state-of-the-art algorithms for deterministic inference with Gaussian beliefs; to the best of our knowledge, a similar method for multi-modal beliefs is not available. The success of the EKF and UKF is based on their ease of implementation, yet their approximations may result in severe errors even for the most trivial problems. We motivate the need for a fast deterministic method with a simple example, a one-dimensional quadratic map. This section is motivated by the discussions in (Gustafsson and Hendeby, 2012).

We use a quadratic map to test the methods. A large number of real-life applications involve a quadratic map, e.g. the Euclidean distance in tracking problems. The quadratic map is simple enough that we can calculate the desired values in closed form, and since the resulting probability distribution is not Gaussian, we can quantify the approximation errors. We consider the map $f(x) = x^2$ for a standard Gaussian variable $x \sim \mathcal{N}(x \mid 0, 1)$. Then $p(f(x))$ is a $\chi^2$ distribution with $n$ degrees of freedom, where $n$ is the dimension of the variable $x$. The approximations of the prediction obtained with the UKF, the EKF and a particle filter (sampling) are plotted together with the true $\chi^2_1$ distribution in Figure 1.6. We discuss these results along with the proposed multi-modal filter in Chapter 4.
Here, the main aim of presenting these results is to show three points: both the EKF and the UKF may result in severe errors; particle filters need a substantially large number of particles to obtain good approximations; and an approximation may fail to match the true moments of the transformed variable and yet result in a smaller approximation error, see Figure 1.6 and Table 1.1.
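The sampling-based reference for the quadratic map can be reproduced with a few lines of Monte Carlo. The sketch below draws standard normal samples, squares them and estimates the moments of the result; the sample size and the seed are arbitrary choices for illustration. The exact χ² moments with 1 degree of freedom are mean 1 and variance 2, and no probability mass lies below zero.

```python
import random

rng = random.Random(1)
num_samples = 100_000

# Push standard normal samples through the quadratic map f(x) = x^2.
samples = [rng.gauss(0.0, 1.0) ** 2 for _ in range(num_samples)]

mc_mean = sum(samples) / num_samples
mc_var = sum((s - mc_mean) ** 2 for s in samples) / num_samples
frac_negative = sum(1 for s in samples if s < 0.0) / num_samples

print(mc_mean, mc_var, frac_negative)
```

Repeating the experiment with far fewer samples shows how slowly the variance estimate settles, which is the practical cost of the sampling approach noted above.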

Figure 1.6.: Comparison of various deterministic non-linear approximations. The figure shows predictive densities for a quadratic transformation ($x^2$) of the standard normal distribution obtained by non-linear estimation methods. The green curve shows the ideal result, a $\chi^2$ distribution with 1 degree of freedom ($\chi^2_1$). (a) Cumulative distribution function (axes: variable x, probability) of the predicted densities for the multi-modal filter, the $\chi^2$ distribution, the Unscented Transform and the Extended Kalman Filter; ideally there should be no mass below zero, as shown by the green curve of the $\chi^2$ distribution. The EKF concentrates its whole mass at $x = 0$. (b) Probability distribution function (axes: variable x, probability) of the predictive densities for the multi-modal filter, the $\chi^2$ distribution and the Unscented Transform. Note that the $\chi^2_1$ distribution has a discontinuity at $x = 0$.

2 Multi-Modal Filter and Smoother

We discussed the shortcomings of unimodal approximations in Section 1.5. The ideal filter for a non-linear system should be able to represent multi-modal densities and make consistent estimates. Multi-modal beliefs give the filter additional flexibility to reduce approximation errors. A Gaussian mixture is an ideal approximation for multi-modal beliefs: it is parametric, and we can use the Gaussian sum filter for estimation. With a sufficiently large number of Gaussians in the mixture we can make the approximation error arbitrarily small (Anderson and Moore, 2005; Alspach and Sorenson, 1972).

The challenge with Gaussian mixture approximations is to estimate their parameters. We need a method to fit a Gaussian mixture to the predictive density of a Gaussian undergoing a non-linear transform. Given such a method, we can repeat the procedure to obtain a predictive Gaussian mixture for a non-linear transform of an input Gaussian mixture. The first step, estimating the Gaussian mixture parameters that approximate the non-Gaussian predictive density for a non-linear map of a Gaussian, is described in Section 2.1.1 (Kamthe et al., 2013).

Given the filtering results and all the observations, we can improve our estimate by implementing a backward filter as described in Chapter 1. For multi-modal beliefs the optimal backward filter, Equation (1.3), cannot be evaluated in closed form due to the conditioning on $p(x_{n+1} \mid Y_n)$, which is a Gaussian mixture (see Section 2.1.3). We propose an approximate solution to smoothing with Gaussian mixture beliefs in Section 2.2.

2.1 Multi-Modal Filter

In the following, we devise a closed-form filtering algorithm with multi-modal representations of the state distributions. Our algorithm is inspired by the following observation made by Julier and Uhlmann (Julier and Uhlmann, 2004): Given only the mean and the variance of the underlying distribution, and in the absence of any a priori information, any distribution (with the same mean and variance) used to calculate the transformed mean and variance is trivially optimal.
This observation was the basis for deriving the Unscented Transform and the UKF. However, predictions based on the Unscented Transform often under-estimate the true predictive uncertainty, which can result in incoherent state estimation and divergent tracking performance. To address this issue, we use a different (optimal) representation of the underlying distribution, which still matches the mean and variance: we propose to represent each sigma point in the Unscented Transform by a Gaussian centred at this sigma point. This approximation of the original distribution is effectively a GMM with $2D+1$ components, where $D$ is the dimensionality of $x$. In Section 2.1.1, we derive an optimal GMM representation of a state distribution $p(x_{n-1})$ of which only the mean and variance are known. In Section 2.1.2, we detail how to map this GMM through a non-linear function to obtain a predictive distribution $p(x_n)$, which is again represented by a GMM. We generalise both uncertainty propagation and parameter estimation to the case where $p(x_{n-1})$ is given by a GMM. In Section 2.1.4, we propose a method for pruning the number of

mixture components in a GMM to avoid an exponential increase in the number of components. In Section 2.1.3, we propose the resulting filtering algorithm, which exploits the results from these sections.

2.1.1 Estimation of the Gaussian Mixture Parameters

Let the mean and the variance of the state distribution $p(x_{n-1})$ be given by $\mu$ and $\Sigma$, respectively. Then, we can represent $p(x_{n-1})$ by a GMM $\tilde{p}(x_{n-1}) = \sum_{i=0}^{2D} \delta_i \varphi_i(x_{n-1})$, such that the mean and the variance of the approximate density $\tilde{p}(x_{n-1})$ equal the mean $\mu$ and variance $\Sigma$ of $p(x_{n-1})$. This representation is achieved by the closed-form relations

$$\delta_i = \frac{1}{2D+1}, \qquad \mu_0 = \mu, \qquad \mu_j = \mu + \sigma_j, \qquad \mu_{j+D} = \mu - \sigma_j, \qquad \Sigma_i = \Big(1 - \frac{2\alpha}{2D+1}\Big)\Sigma, \tag{2.1}$$

where $i = 0, \dots, 2D$ and $j = 1, \dots, D$, and $D$ is the dimensionality of the state variable $x_{n-1}$. The variable $\sigma_j$ denotes the $j$-th of the $D$ rows or columns of the matrix square root $\pm\sqrt{\alpha\Sigma}$. From (2.1), we can see that we need to calculate the matrix square root of $\Sigma$ only once for all $2D+1$ Gaussians $\varphi_i(x_{n-1})$. To ensure that $\Sigma_i$ is positive semi-definite, the scaling factor $\alpha$ should be chosen such that $2\alpha \le 2D+1$, see (2.1). For $2\alpha = 2D+1$ in (2.1), the equations above reduce to scaled sigma points (Julier and Uhlmann, 2004). Hence, the GMM representation in (2.1) can be considered a generalisation of the classical sigma point representation of densities employed by the Unscented Transform, where each sigma point becomes an improper probability distribution.

2.1.2 Propagation of Uncertainty

A key step in filtering is the uncertainty propagation step, i.e. estimating the probability distribution of a random variable that has been transformed by means of the transition function $f$. Given $p(x_{n-1})$ and the system dynamics (1.4), we determine $p(x_n)$ by evaluating $\int p(x_n \mid x_{n-1})\,p(x_{n-1})\,dx_{n-1}$. For non-linear functions $f$, the integral above cannot be solved in closed form. Thus, approximate solutions are required. Uncertainty propagation in non-linear systems can be achieved by approximate methods employing linearisation or deterministic sampling, as in the EKF and UKF. In such approaches, the state distribution $p(x_{n-1})$ and the approximate predictive density $p(x_n)$ are well represented by Gaussians.
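The closed-form relations (2.1) can be checked numerically in one dimension, where D = 1 and the mixture has three components. The sketch below builds the mixture and verifies that its first two moments match the original Gaussian; the function names and the example values (including α = 1.2) are assumptions chosen for illustration.

```python
import math


def split_gaussian_1d(mu, var, alpha):
    """Represent N(mu, var) by a 3-component GMM following (2.1) with D = 1."""
    D = 1
    n_comp = 2 * D + 1
    if 2 * alpha > n_comp:
        raise ValueError("need 2*alpha <= 2*D + 1 for a valid component variance")
    sigma = math.sqrt(alpha * var)                 # 1-D square root of alpha * Sigma
    weights = [1.0 / n_comp] * n_comp              # delta_i
    means = [mu, mu + sigma, mu - sigma]           # mu_0, mu_j, mu_{j+D}
    comp_var = (1.0 - 2.0 * alpha / n_comp) * var  # shared component variance Sigma_i
    return weights, means, [comp_var] * n_comp


def mixture_moments(weights, means, variances):
    """Mean and variance of a scalar Gaussian mixture."""
    mean = sum(w * m for w, m in zip(weights, means))
    second = sum(w * (v + m * m) for w, m, v in zip(weights, means, variances))
    return mean, second - mean * mean


weights, means, variances = split_gaussian_1d(mu=0.5, var=2.0, alpha=1.2)
gmm_mean, gmm_var = mixture_moments(weights, means, variances)
print(gmm_mean, gmm_var)
```

Setting `alpha=1.5` makes the component variance zero, which corresponds to the sigma-point limit 2α = 2D + 1 described above.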
If the state distribution $p(x_{n-1})$ is a Gaussian mixture as in (2.1), we can estimate the predictive distribution $p(x_n)$ similarly, e.g. by applying such an approximate update to each mixture component in the GMM. In the multi-modal filter, we propagate each mixture component $\varphi_i(x_{n-1})$ of the GMM through $f$ and approximate $p(x_n)$ by

$$p(x_n) = \int p(x_n \mid x_{n-1}) \sum_{i=0}^{2D} \delta_i \varphi_i(x_{n-1})\,dx_{n-1} \approx \sum_{j=0}^{2D} \delta_j \varphi_j(x_n), \tag{2.2}$$

where the mean and covariance of each $\varphi_j(x_n)$ are computed by means of the Unscented Transform. Figure 2.1 shows an example of uncertainty propagation using the scheme described in Equation (2.2).

Figure 2.1.: Demonstration of predictive density estimation with the multi-modal filter (axes: x and f(x); densities p(x) and p(f(x))). 1) We approximate a Gaussian (blue solid) by a Gaussian mixture (blue dashed) using (2.1). 2) The green solid line represents a non-linear function. 3) The solid red line represents a Gaussian mixture evaluated using EM. The dashed red Gaussians represent the unscented transform of the dashed blue Gaussians. The straight dashed lines represent the sigma points used by an unscented transform.

If the prior density is a Gaussian mixture $p(x_{n-1}) = \sum_{j=0}^{M-1} \beta_j \varphi_j(x_{n-1})$, we repeat the procedure above for each mixture component in $p(x_{n-1})$, i.e. we split each mixture component $\varphi_j$ into $2D+1$ components $\delta_i \varphi_{ji}$, $i = 0, \dots, 2D$, and propagate them forward using the Unscented Transform. For notational convenience, we define this operation on a Gaussian mixture as $(f, p(x_{n-1}))$, such that

$$\begin{aligned}
p(x_n) = (f, p(x_{n-1})) &= \int p(x_n \mid x_{n-1})\,p(x_{n-1})\,dx_{n-1} \\
&= \sum_{j=0}^{M-1} \sum_{i=0}^{2D} \beta_j \delta_i \int p(x_n \mid x_{n-1})\,\varphi_{ij}(x_{n-1})\,dx_{n-1} \\
&\approx \sum_{l=0}^{M(2D+1)-1} \gamma_l \varphi_l(x_n),
\end{aligned} \tag{2.3}$$

where $\gamma_l = \delta_i \beta_j$. We compute the moments of the mixture components $\varphi_l$ by means of the Unscented Transform. From (2.3), it can be observed that during an update step the number of components grows by a factor of $2D+1$, from $M$ to $M(2D+1)$. When applied multiple times, this results in an exponential explosion of

components. We reduce the number of mixture components at each time step; for details, see Section 2.1.4.

Figure 2.2.: Example of a multi-modal predictive density for a non-linear stationary function with a standard normal distribution as prior (each panel is titled with the Hellinger distance from the particle filter; legend: UT: GMM, UT: Multi-modal filter, Particle Filter, Unscented Transform). The blue curve shows the predictive distribution calculated using EM ($10^5$ samples, 3 components). The solid red curve shows the fit with the proposed multi-modal method, and the black solid line represents the Unscented Transform approximation of the predictive density. We use the Hellinger distance (Beran, 1977) as a metric to measure goodness of fit, which shows that the proposed method outperforms the standard Unscented Transform based method for this non-linear function. We can also visually confirm that the multi-modal filter is able to capture both modes of the distribution.

2.1.3 Filtering

In the following, we subsume all derivations in our multi-modal non-linear state estimator, whose time and measurement updates are summarised below.

Time Update

Assume that the filter distribution $p(x_{n-1} \mid Y_{n-1})$ is represented by a GMM with $M$ components. The time update, i.e. the one-step-ahead predictive distribution, is given by

$$p(x_n \mid Y_{n-1}) = \int p(x_n \mid x_{n-1})\,p(x_{n-1} \mid Y_{n-1})\,dx_{n-1}. \tag{2.4}$$
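The propagation step in (2.3), which underlies the time update (2.4), can be sketched for a scalar state as follows: every prior component is split into three Gaussians as in (2.1), and each of these is mapped through f with a scalar unscented transform. The helper names, the default hyperparameters, the additive process noise `q_var` and the example mixture are assumptions made for this sketch.

```python
import math


def ut_1d(f, mu, var, alpha=1.0, beta=2.0, kappa=2.0):
    """Scalar unscented transform; returns the transformed mean and variance."""
    c = alpha ** 2 * (1 + kappa)          # scaling constant for d = 1
    lam = c - 1                           # so that d + lambda = c
    s = math.sqrt(c * var)
    ys = [f(mu), f(mu + s), f(mu - s)]
    wm = [lam / c, 0.5 / c, 0.5 / c]
    wc = [lam / c + (1 - alpha ** 2 + beta), 0.5 / c, 0.5 / c]
    mean = sum(w * y for w, y in zip(wm, ys))
    return mean, sum(w * (y - mean) ** 2 for w, y in zip(wc, ys))


def split_component(weight, mu, var, alpha):
    """Split one weighted Gaussian into 3 weighted Gaussians as in (2.1), D = 1."""
    s = math.sqrt(alpha * var)
    v = (1.0 - 2.0 * alpha / 3.0) * var
    return [(weight / 3.0, m, v) for m in (mu, mu + s, mu - s)]


def propagate_mixture(f, mixture, alpha=1.2, q_var=0.0):
    """Map a scalar GMM [(weight, mean, var), ...] through f, cf. (2.3)."""
    out = []
    for weight, mu, var in mixture:
        for w, m, v in split_component(weight, mu, var, alpha):
            pm, pv = ut_1d(f, m, v)
            out.append((w, pm, pv + q_var))   # additive process noise (assumed)
    return out


prior = [(0.5, -1.0, 0.5), (0.5, 1.0, 0.5)]
predicted = propagate_mixture(lambda x: x * x, prior)
total_weight = sum(w for w, _, _ in predicted)
pred_mean = sum(w * m for w, m, _ in predicted)
print(len(predicted), total_weight, pred_mean)
```

The output mixture has M(2D + 1) components, here 2 × 3 = 6, which is why a reduction step is needed before the next time step.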

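The integral in (2.4) can be evaluated as $(f, p(x_{n-1} \mid Y_{n-1}))$, such that we obtain a GMM representation of the time update as detailed in (2.3):

$$p(x_n \mid Y_{n-1}) = \sum_{j=0}^{M(2D+1)-1} \gamma_j \varphi_j(x_{n|n-1}). \tag{2.5}$$

Measurement Update

The measurement update can be approximated up to a normalisation constant by

$$p(x_n \mid Y_n) \propto p(y_n \mid x_n)\,p(x_n \mid Y_{n-1}), \tag{2.6}$$

where $p(x_n \mid Y_{n-1})$ is the time update (2.5). We now apply a similar operation as in (2.3), with the measurement function $h$ as the non-linear function, and obtain

$$p(y_n \mid Y_{n-1}) = (h, p(x_n \mid Y_{n-1})). \tag{2.7}$$

Substituting (2.7) and (2.5) in (2.6) yields the measurement update, i.e. the filtered state distribution

$$p(x_n \mid Y_n) \propto \sum_{i=0}^{2D} \delta_i \varphi_i(y_{n|n-1}) \sum_{j=0}^{M(2D+1)-1} \gamma_j \varphi_j(x_{n|n-1}) \approx \sum_{l=0}^{M(2D+1)^2-1} \beta_l \varphi_l(x_n). \tag{2.8}$$

We calculate the measurement update for each pair $\varphi_i$ and $\varphi_j$. Recalling that $\varphi_l(x_n) = \mathcal{N}(x_n \mid \mu^{ij}_{n|n}, \Sigma^{ij}_{n|n})$ for $i = 0, \dots, 2D$ and $j = 0, \dots, (2D+1)^2 - 1$, the measurement updates (Särkkä, 2008) and the weight updates (Gaussian sum (Alspach and Sorenson, 1972)) are given by

$$\begin{aligned}
K^j_n &= \Gamma^j_{n|n-1}\big(\Sigma^j_{n|n-1}\big)^{-1}, \\
\mu^{ij}_{n|n} &= \mu^i_{n|n-1} + K^j_n\big(y_n - \mu^j_{n|n-1}\big), \\
\Sigma^{ij}_{n|n} &= \Sigma^i_{n|n-1} - K^j_n\,\Sigma^j_{n|n-1}\,\big(K^j_n\big)^T, \\
\beta_{i,j} &= \frac{\delta_i \gamma_j\, \mathcal{N}\big(x = y_n \mid \mu^j_{n|n-1}, \Sigma^j_{n|n-1}\big)}{\sum_{k,l} \delta_l \gamma_k\, \mathcal{N}\big(x = y_n \mid \mu^k_{n|n-1}, \Sigma^k_{n|n-1}\big)},
\end{aligned} \tag{2.9}$$

where $\Gamma^j_{n|n-1}$ is the cross-covariance matrix $\mathrm{cov}(x_{n-1}, x_n)$ determined via the Unscented Transform (Särkkä, 2008). After the measurement update, we reduce the $M(2D+1)^2$ mixture components in the GMM, see (2.8), to $M$ according to Section 2.1.4.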
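A simplified scalar version of the measurement update can be sketched as below. It is not the exact pairing of (2.8) and (2.9): as an assumption of this sketch, each predicted component is updated once with a Kalman-style step whose moments come from a scalar unscented transform, additive measurement noise with variance `r_var` is assumed, and the weights are re-scaled by the predictive likelihood of the observation in the manner of a Gaussian sum filter. All function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def ut_measurement_1d(h, mu, var, alpha=1.0, beta=2.0, kappa=2.0):
    """Predicted measurement mean, variance and cross-covariance cov(x, h(x))."""
    c = alpha ** 2 * (1 + kappa)
    lam = c - 1
    s = math.sqrt(c * var)
    xs = [mu, mu + s, mu - s]
    ys = [h(x) for x in xs]
    wm = [lam / c, 0.5 / c, 0.5 / c]
    wc = [lam / c + (1 - alpha ** 2 + beta), 0.5 / c, 0.5 / c]
    y_mean = sum(w * y for w, y in zip(wm, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(wc, ys))
    cross = sum(w * (x - mu) * (y - y_mean) for w, x, y in zip(wc, xs, ys))
    return y_mean, y_var, cross


def gaussian_sum_update(h, mixture, y, r_var):
    """Kalman-style update of every component plus Gaussian-sum re-weighting."""
    updated, evidence = [], []
    for weight, mu, var in mixture:
        y_mean, y_var, cross = ut_measurement_1d(h, mu, var)
        s_var = y_var + r_var                 # innovation variance
        gain = cross / s_var                  # scalar Kalman gain
        updated.append((mu + gain * (y - y_mean), var - gain * s_var * gain))
        evidence.append(weight * gaussian_pdf(y, y_mean, s_var))
    total = sum(evidence)                     # normalise so the weights sum to one
    return [(e / total, m, v) for e, (m, v) in zip(evidence, updated)]


posterior = gaussian_sum_update(lambda x: x, [(0.5, -2.0, 1.0), (0.5, 2.0, 1.0)],
                                y=2.0, r_var=1.0)
print(posterior)
```

In the usage line the observation agrees with the second component, so that component receives almost all of the posterior weight while both components are still updated.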

2.1.4 Mixture Reduction

Up to this point, we have considered the case where a density with known mean and variance has been represented by a GMM, which could subsequently be used to estimate the predicted state distribution. When these steps are incorporated into a recursive state estimator for time series, the number of mixture components in (2.3) grows exponentially. One way to mitigate this effect is to represent the estimated densities by a mixture model with a fixed number of components (Anderson and Moore, 2005). To keep the number of mixture components constant, we can reduce them at each time step (Anderson and Moore, 2005). A straightforward and fast approach is to drop the Gaussian components with the lowest weights. Such omissions, however, can result in poor performance of the filter (Kitagawa, 1994). Kitagawa (Kitagawa, 1994) suggested repeatedly merging a pair of Gaussian components, where the pair with the lowest distance in terms of some distance metric is selected. We evaluated multiple distance metrics, e.g. the $L_2$ distance (Williams and Maybeck, 2003), the KL divergence (Runnalls, 2007), and the Cauchy-Schwarz divergence (Kampa et al., 2011). In this report, we used the symmetric KL divergence (Kitagawa, 1994), $D(p, q) = \big(\mathrm{KL}(p \,\|\, q) + \mathrm{KL}(q \,\|\, p)\big)/2$, which outperformed the aforementioned distance measures for mixture reduction in filtering.

2.2 Multi-Modal Smoothing

The smoothing update is based on the forward-backward filter. The optimal backward pass for a dynamical system described by Figure 1.3 is given by Equation (1.3) (Kitagawa, 1994); we repeat it here for quick reference:

$$p(x_n \mid Y_N) = p(x_n \mid Y_n) \int \frac{p(x_{n+1} \mid Y_N)\,p(x_{n+1} \mid x_n)}{p(x_{n+1} \mid Y_n)}\,dx_{n+1}. \tag{2.10}$$

Equation (2.10) is a recursion initiated by setting $p(x_{n+1} \mid Y_N) = p(x_{n+1} \mid Y_{n+1})$, where the index $n+1$ now points at the last observation, i.e. $N = n+1$. The quantities we know are

$$\begin{aligned}
p(x_{n+1} \mid Y_N) &= \sum_{j}^{M} \alpha_j \varphi_j(x_{n+1} \mid Y_N), &&\text{previous smoothing result,} &\text{(2.11)}\\
p(x_n \mid Y_n) &= \sum_{i}^{P} \delta_i \varphi_i(x_n \mid Y_n), &&\text{previous filtering result,} &\text{(2.12)}\\
p(x_{n+1} \mid x_n) &= \sum_{k}^{L} \gamma_k \varphi_k(x_{n+1} \mid x_n), &&\text{forward prediction or time update.} &\text{(2.13)}
\end{aligned}$$

The denominator of (2.10) is a Gaussian mixture that can be evaluated as the integral $\int p(x_{n+1} \mid x_n)\,p(x_n \mid Y_n)\,dx_n$. A Gaussian mixture in the denominator makes an exact closed-form solution of Equation (2.10) impossible, and we need approximations for a Gaussian sum smoother. We propose an approximate solution to the Gaussian sum smoother later in this section. To the best of our knowledge, the proposed solution is not available in the literature. In the following, we list state-of-the-art deterministic approximations to the smoothing Equation (2.10).

Kitagawa's two-filter smoother (Kitagawa, 1994). In the two-filter smoother the integral on the right-hand side is replaced by a backward information filter $p(y_N \mid x_n)$, i.e. we write the smoothing pass as $p(x_n \mid Y_N) \propto p(x_n \mid Y_n)\,p(y_N \mid x_n)$.

Closed-form forward-backward smoothing (Vo et al., 2012). The forward-backward smoother uses a recursion to calculate a likelihood and approximates the smoother as $p(x_n \mid Y_N) = p(x_n \mid Y_n)\,\mathrm{Lik}(Y_N \mid x_n)$, where $\mathrm{Lik}(\cdot)$ denotes the likelihood function. The forward-backward smoother replaces the information filter in Kitagawa's two-filter smoother by the likelihood and avoids the inversion of the transformation matrix (Vo et al., 2012).

Expectation Correction (Barber, 2006). The expectation correction algorithm ignores the integral altogether and evaluates the left-hand side at the mean values of the Gaussians in the mixture, i.e. we replace each Gaussian by its mean. This crude approximation gives excellent results, underlining the importance of a Gaussian sum smoother in estimation.

The Gaussian sum smoothers described above attempt to solve the problem of the backward pass with Gaussian mixture beliefs; however, the system is assumed to be linear (Switching Linear Dynamical Systems (SLDS)). To the best of our knowledge, there is no smoothing algorithm in the literature for inference in a non-linear dynamical system with multi-modal beliefs. The proposed multi-modal smoother is, however, not restricted to non-linear dynamical systems alone and can in principle be applied to SLDS as well.

Gaussian Sum Smoother for Non-linear Systems

Proposition 2.1.
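If we define the probabilities as in Equations (2.11) to (2.13), the Gaussian mixture approximation to the smoothing equation (2.10) is given as

$$p(x_n \mid Y_N) = \sum_{z=1}^{LMP} \beta_z\, \mathcal{N}\big(x_{n|N} \mid \mu^z_{n|N}, \Sigma^z_{n|N}\big), \tag{2.14}$$

where the index $z$ runs over all triples $(i, j, k)$, and the mean $\mu^z_{n|N}$ and covariance $\Sigma^z_{n|N}$ are given by

$$\begin{aligned}
D^k_n &= \Gamma^k_{n+1|n}\big(\Sigma^k_{n+1|n}\big)^{-1}, \\
\mu^{ijk}_{n|N} &= \mu^i_{n|n} + D^k_n\big(\mu^j_{n+1|N} - \mu^k_{n+1|n}\big), \\
\Sigma^{ijk}_{n|N} &= \Sigma^i_{n|n} + D^k_n\big(\Sigma^j_{n+1|N} - \Sigma^k_{n+1|n}\big)\big(D^k_n\big)^T, \\
\beta_{i,j,k} &= \frac{\delta_i \alpha_j \gamma_k\, \mathcal{N}\big(x = \mu^j_{n+1|N} \mid \mu^k_{n+1|n}, \Sigma^k_{n+1|n} + \Sigma^j_{n+1|N}\big)}{\sum_{p,q,r} \delta_p \alpha_q \gamma_r\, \mathcal{N}\big(x = \mu^q_{n+1|N} \mid \mu^r_{n+1|n}, \Sigma^r_{n+1|n} + \Sigma^q_{n+1|N}\big)}.
\end{aligned} \tag{2.15}$$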
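The statement of Proposition 2.1 can be sketched for a scalar state as a triple loop over filtered, smoothed and predicted components. The data layout (tuples of weight, mean, variance and, for predicted components, the cross-covariance) and the function names are assumptions of this sketch; it also does not guard against non-positive variances that mismatched component triples may produce.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gaussian_sum_smooth_1d(filtered, predicted, smoothed_next):
    """Backward step of a scalar Gaussian-sum smoother, cf. (2.15).

    filtered:      [(delta_i, mean, var)]         components of p(x_n | Y_n)
    predicted:     [(gamma_k, mean, var, cross)]  components of p(x_{n+1} | Y_n),
                   where cross = cov(x_n, x_{n+1})
    smoothed_next: [(alpha_j, mean, var)]         components of p(x_{n+1} | Y_N)
    """
    components = []
    for d_i, m_i, v_i in filtered:
        for a_j, m_j, v_j in smoothed_next:
            for g_k, m_k, v_k, cross_k in predicted:
                gain = cross_k / v_k                      # smoother gain D^k
                mean = m_i + gain * (m_j - m_k)
                var = v_i + gain * (v_j - v_k) * gain
                weight = d_i * a_j * g_k * gaussian_pdf(m_j, m_k, v_k + v_j)
                components.append((weight, mean, var))
    total = sum(w for w, _, _ in components)              # normalise the weights
    return [(w / total, m, v) for w, m, v in components]


# Single-component example: x_{n+1} = x_n + noise with unit noise variance.
smoothed = gaussian_sum_smooth_1d(
    filtered=[(1.0, 0.0, 1.0)],
    predicted=[(1.0, 0.0, 2.0, 1.0)],
    smoothed_next=[(1.0, 1.0, 1.0)])
print(smoothed)
```

With one component in every mixture the step reduces to the familiar Rauch-Tung-Striebel update, which makes the single-component case a convenient sanity check.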

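Proof. We can rewrite (2.10) as

$$p(x_n \mid Y_N) = \underbrace{\int \underbrace{p(x_n \mid x_{n+1}, Y_N)}_{\text{Step 1}}\; p(x_{n+1} \mid Y_N)\,dx_{n+1}}_{\text{Step 2}}. \tag{2.16}$$

The conditional is defined as

$$p(x_n \mid x_{n+1}, Y_N) = \frac{p(x_{n+1} \mid x_n)\,p(x_n \mid Y_n)}{p(x_{n+1} \mid Y_n)}. \tag{2.17}$$

The right-hand side of (2.17) cannot be evaluated in closed form, as the denominator $p(x_{n+1} \mid Y_n)$ is a Gaussian mixture, given by Equation (2.13). We rewrite the denominator as

$$p(x_n \mid x_{n+1}, Y_N) = \frac{p(x_{n+1} \mid x_n)\,p(x_n \mid Y_n)}{\int p(x_{n+1} \mid x_n)\,p(x_n \mid Y_n)\,dx_n}. \tag{2.18}$$

The expression $\int p(x_{n+1} \mid x_n)\,p(x_n \mid Y_n)\,dx_n$ in the denominator is independent of $x_n$ and can be treated as a normalisation constant for the probability distribution $p(x_n \mid x_{n+1}, Y_N)$. We obtain

$$p(x_n \mid x_{n+1}, Y_N) \propto p(x_{n+1} \mid x_n)\,p(x_n \mid Y_n) = \beta\,p(x_{n+1} \mid x_n)\,p(x_n \mid Y_n), \tag{2.19}$$

where $\beta$ is a constant independent of $x_n$, defined as

$$\beta = \frac{1}{\int p(x_{n+1} \mid x_n)\,p(x_n \mid Y_n)\,dx_n}. \tag{2.20}$$

It should be noted that $\beta$ is a function of $x_{n+1}$ and, hence, cannot be treated as a constant when marginalising over $x_{n+1}$. Substituting (2.12) and (2.13) in (2.19) yields

$$p(x_n \mid x_{n+1}, Y_N) = \beta \sum_{k}^{L} \gamma_k \varphi_k(x_{n+1} \mid x_n) \sum_{i}^{P} \delta_i \varphi_i(x_n \mid Y_n). \tag{2.21}$$

This derivation completes Step 1 of (2.16). Substituting the result of Step 1 (2.17) in (2.16), together with (2.11), we obtain

$$p(x_n \mid Y_N) = \int \beta \sum_{k}^{L} \gamma_k \varphi_k(x_{n+1} \mid x_n) \sum_{i}^{P} \delta_i \varphi_i(x_n \mid Y_n) \sum_{j}^{M} \alpha_j \varphi_j(x_{n+1} \mid Y_N)\,dx_{n+1}. \tag{2.22}$$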

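Pushing the Gaussian components $\varphi_k$ and $\varphi_i$ inside the sum yields

$$p(x_n \mid Y_N) = \underbrace{\int \beta \sum_{k}^{L} \gamma_k \sum_{i}^{P} \delta_i \sum_{j}^{M} \alpha_j\, \underbrace{\varphi_k(x_{n+1} \mid x_n)\,\varphi_i(x_n \mid Y_n)}_{\varphi_{i,k}(x_n,\, x_{n+1} \mid Y_n)}\, \varphi_j(x_{n+1} \mid Y_N)\,dx_{n+1}}_{\text{Step 2}}. \tag{2.23}$$

We define the individual components of the Gaussian mixture $\varphi$ as

$$\varphi_i(x_n \mid Y_n) = \mathcal{N}\big(x_n \mid \mu^i_{n|n}, \Sigma^i_{n|n}\big), \tag{2.24}$$

$$\varphi_k(x_{n+1} \mid Y_n) = \mathcal{N}\big(x_{n+1} \mid \mu^k_{n+1|n}, \Sigma^k_{n+1|n}\big), \tag{2.25}$$

to evaluate the joint distribution

$$\varphi_{i,k}(x_n, x_{n+1} \mid Y_n) = \mathcal{N}\big(x_n \mid \mu^{i,k}_n, \Sigma^{i,k}_n\big), \tag{2.26}$$

where

$$\begin{aligned}
D^k_n &= \Gamma^k_{n+1|n}\big(\Sigma^k_{n+1|n}\big)^{-1}, \\
\mu^{i,k}_n &= \mu^i_{n|n} + D^k_n\big(x_{n+1} - \mu^k_{n+1|n}\big), \\
\Sigma^{i,k}_n &= \Sigma^i_{n|n} - D^k_n\,\Sigma^k_{n+1|n}\,\big(D^k_n\big)^T.
\end{aligned} \tag{2.27}$$

The covariance $\Sigma$ and the cross-covariance $\Gamma$ can be obtained by any non-linear Gaussian approximation method, such as the Unscented Transform or the extended Kalman filter. By the definition of $\beta$ in (2.20), we write the Gaussian component of the conditional $p(x_n \mid x_{n+1}, Y_n)$ as

$$\varphi_{i,k}(x_n \mid x_{n+1}, Y_n) = \frac{\varphi_{i,k}(x_n, x_{n+1} \mid Y_n)}{\beta(x_{n+1})}. \tag{2.28}$$

Substituting values from (2.28) in (2.23) yields

$$p(x_n \mid Y_N) = \underbrace{\int \beta(x_{n+1}) \sum_{k}^{L} \gamma_k \sum_{i}^{P} \delta_i \sum_{j}^{M} \alpha_j\, \underbrace{\varphi_{i,k}(x_n \mid x_{n+1}, Y_n)}_{\text{Step 1}}\, \varphi_j(x_{n+1} \mid Y_N)\,dx_{n+1}}_{\text{Step 2}}. \tag{2.29}$$

Step 2. We marginalise over $x_{n+1}$ to obtain

$$\varphi_{i,j,k}(x_n \mid x_{n+1}, Y_n) = \mathcal{N}\big(x_n \mid \mu^{i,j,k}_n, \Sigma^{i,j,k}_n\big), \tag{2.30}$$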

where

$$\begin{aligned}
D^k_n &= \Gamma^k_{n+1|n}\big(\Sigma^k_{n+1|n}\big)^{-1}, \\
\mu^{i,j,k}_n &= \mu^i_{n|n} + D^k_n\big(\mu^j_{n+1|N} - \mu^k_{n+1|n}\big), \\
\Sigma^{i,j,k}_n &= \Sigma^i_{n|n} + D^k_n\big(\Sigma^j_{n+1|N} - \Sigma^k_{n+1|n}\big)\big(D^k_n\big)^T.
\end{aligned} \tag{2.31}$$

To evaluate the constant $\beta(x_{n+1})$, we use Theorem 1 to write

$$\beta(x_{n+1}) = \mathcal{N}\big(x_{n+1} \mid \mu^k_{n+1|n}, \Sigma^k_{n+1|n}\big), \tag{2.32}$$

and using (2.29) we obtain

$$\beta_{i,j,k} = \delta_i \alpha_j \gamma_k \int \mathcal{N}\big(x_{n+1} \mid \mu^k_{n+1|n}, \Sigma^k_{n+1|n}\big)\, \mathcal{N}\big(x_{n+1} \mid \mu^j_{n+1|N}, \Sigma^j_{n+1|N}\big)\,dx_{n+1}. \tag{2.33}$$

Using the definition of the product of two Gaussians and the fact that the marginalisation is over $x_{n+1}$, we obtain

$$\begin{aligned}
\beta_{i,j,k} &= \delta_i \alpha_j \gamma_k\, \mathcal{N}\big(x = \mu^j_{n+1|N} \mid \mu^k_{n+1|n}, \Sigma^k_{n+1|n} + \Sigma^j_{n+1|N}\big) \int \mathcal{N}\big(x_{n+1} \mid \mu_{n+1}, \Sigma_{n+1}\big)\,dx_{n+1} \\
&= \delta_i \alpha_j \gamma_k\, \mathcal{N}\big(x = \mu^j_{n+1|N} \mid \mu^k_{n+1|n}, \Sigma^k_{n+1|n} + \Sigma^j_{n+1|N}\big).
\end{aligned} \tag{2.34}$$

To ensure that the Gaussian mixture approximation is a valid probability distribution, all weights must sum to one. Therefore, we normalise $\beta$ according to

$$\beta_{i,j,k} = \frac{\delta_i \alpha_j \gamma_k\, \mathcal{N}\big(x = \mu^j_{n+1|N} \mid \mu^k_{n+1|n}, \Sigma^k_{n+1|n} + \Sigma^j_{n+1|N}\big)}{\sum_{p,q,r} \delta_p \alpha_q \gamma_r\, \mathcal{N}\big(x = \mu^q_{n+1|N} \mid \mu^r_{n+1|n}, \Sigma^r_{n+1|n} + \Sigma^q_{n+1|N}\big)}. \tag{2.35}$$

The right-hand side of (2.29) is completed by (2.35) and (2.31), which completes the proof of Proposition 2.1.

2.3 Summary of the Multi-modal Filter and Smoother

In this chapter we proposed a multi-modal filter and a multi-modal smoother based on the Gaussian sum approximation. The important points of this chapter are listed below.

We described a closed-form solution for estimating the mixture parameters of the density obtained when mapping a Gaussian mixture through a non-linear mapping.

The approximation of a Gaussian by a Gaussian mixture is based on the Cholesky factorisation and can be described as a generalisation of the Unscented Transform. The relative error in the likelihood per data point of Gaussian samples under the Gaussian mixture model shows that our approximation does not introduce large errors (see the corresponding figure).

We used the Gaussian sum filter for filtering with multi-modal beliefs. The important property of the Gaussian sum filter is that it is based on Gaussian sum approximations. A Gaussian sum approximation allows a filter to match the optimal Bayesian filter as the number of mixture components increases (Anderson and Moore, 2005). Hence, our filter, which is based on the Gaussian sum filter, converges to the optimal Bayesian filter as the number of components tends to infinity.

We used the Gaussian sum approximations to devise a closed-form Gaussian sum smoother.

Mixture reduction: We use the symmetric KL divergence as the measure for a distance-based Gaussian mixture reduction technique. It was found to perform better than standard mixture reduction techniques. A comprehensive study of the various mixture reduction techniques is left as future work.

The multi-modal filter and smoother allow us to exploit the flexibility of Gaussian mixture models over Gaussian models. The parametric form makes the filter deterministic and, hence, the numerical results we obtain in the next chapter are reproducible.
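The distance-based mixture reduction summarised above can be sketched for scalar components as follows. The symmetric KL divergence is used to pick the closest pair; the moment-preserving merge rule and all function names are assumptions of this sketch, since the text does not spell out how a selected pair is merged.

```python
import math


def kl_gauss(m_p, v_p, m_q, v_q):
    """KL(p || q) for scalar Gaussians p = N(m_p, v_p) and q = N(m_q, v_q)."""
    return 0.5 * (math.log(v_q / v_p) + (v_p + (m_p - m_q) ** 2) / v_q - 1.0)


def symmetric_kl(a, b):
    """Symmetric KL divergence between two components (weight, mean, var)."""
    return 0.5 * (kl_gauss(a[1], a[2], b[1], b[2]) + kl_gauss(b[1], b[2], a[1], a[2]))


def merge(a, b):
    """Moment-preserving merge of two weighted components (assumed merge rule)."""
    w = a[0] + b[0]
    mean = (a[0] * a[1] + b[0] * b[1]) / w
    second = (a[0] * (a[2] + a[1] ** 2) + b[0] * (b[2] + b[1] ** 2)) / w
    return (w, mean, second - mean ** 2)


def reduce_mixture(mixture, max_components):
    """Repeatedly merge the closest pair until max_components remain."""
    mixture = list(mixture)
    while len(mixture) > max_components:
        pairs = [(symmetric_kl(mixture[i], mixture[j]), i, j)
                 for i in range(len(mixture)) for j in range(i + 1, len(mixture))]
        _, i, j = min(pairs)
        merged = merge(mixture[i], mixture[j])
        mixture = [c for k, c in enumerate(mixture) if k not in (i, j)] + [merged]
    return mixture


reduced = reduce_mixture([(0.3, -2.0, 0.5), (0.3, -1.9, 0.5), (0.4, 3.0, 1.0)], 2)
print(reduced)
```

Because the merge preserves the first two moments of the merged pair, the overall mean and variance of the mixture are unchanged by the reduction; only the shape detail carried by the merged components is lost.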

(a) We generate $10^6$ samples from the prior $p(x_{n-1})$, propagate the density forward using $x_n = f(x_{n-1})$ to obtain $10^6$ samples of $p(x_n)$, and use Expectation Maximisation (EM) to approximate $p(x_n)$ with $M = 3$ components (axes: time (t), variable X, P(x)).

(b) We calculate the density propagation $p(x_n \mid x_{n-1})$ using the proposed multi-modal filter with $\alpha = 1.3$; the density $p(x_n) \approx p(x_n \mid x_{n-1})$ is represented using $M = 3$ components (axes: time (t), variable X, P(x)).

(c) Unscented Transform approximation of $p(x_n) \approx p(x_n \mid x_{n-1})$ (axes: time (t), variable X).

(d) Absolute error $|P_{EM}(x) - P_{MMF}(x)|$ between plots a) and b) (axes: time (t), variable X).

Figure 2.3.: The figure demonstrates the ability of the proposed multi-modal approximation to handle a non-stationary non-linear function. Plots a) and b) show the EM and multi-modal approximations; to facilitate comparison, both plots use the same Y axis and colour coding. Visual inspection shows that the proposed filter is able to track the predictive densities; we confirm this by plotting the error between the EM estimate and the multi-modal estimate. The Y axis is different for plot d).

3 Numerical Results

We evaluated our proposed filtering algorithm on data generated from standard one-dimensional non-linear dynamical systems. Both the UKF and the Multi-Modal Filter (M-MF) use the same parameters for the Unscented Transform, i.e. $\alpha = 1$, $\beta = 2$ and $\kappa = 2$. The prior $p(x_0)$ is a standard Gaussian, $p(x_0) = \mathcal{N}(x \mid 0, 1)$. Densities in the M-MF are represented by a Gaussian mixture with $M = 3$ components. The mean of the filtered state is estimated by the first moment of this Gaussian mixture. The employed particle filter (PF) is a standard particle filter with a residual re-sampling scheme. The PF-UKF (Van Der Merwe et al., 2000), on the other hand, uses the Unscented Transform as proposal distribution and the UKF for filtering. Unlike our M-MF, the PF-UKF uses random sampling and the UKF. We use the root mean square error (RMSE) and the predictive Negative Log-Likelihood (NLL) per observation as metrics to compare the performance of the different filters. Lower values indicate better performance. The results in Table 3.1 were obtained from 100 independent simulations with $T = 100$ time steps for each of the following system models.

3.1 Univariate Non-linear Growth Model

The Univariate Non-stationary Growth Model (UNGM) (Doucet et al., 2001) is a standard benchmark problem for non-linear estimators.

Non-stationary Model

We tested the different filters on a standard system, the Univariate Non-stationary Growth Model (UNGM) from (Doucet et al., 2001):

$$\begin{aligned}
x_n &= \frac{x_{n-1}}{2} + \frac{25\,x_{n-1}}{1 + x_{n-1}^2} + 8\cos\big(1.2(n-1)\big) + w, \qquad w \sim \mathcal{N}(0, 1), \\
y_n &= \frac{x_n^2}{20} + v, \qquad v \sim \mathcal{N}(0, 1).
\end{aligned} \tag{3.1}$$

The true state density of the non-stationary model stated above alternated between multi-modal and uni-modal distributions. The switch from a uni-modal to a bi-modal density occurred when the mean was close to zero. The quadratic measurement function makes it difficult to distinguish between the two modes, as they are symmetric around zero. This symmetry posed a substantial challenge for several filtering algorithms.
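A trajectory of the non-stationary model can be simulated with the short sketch below. The coefficients 0.5, 25 and 8 are the commonly used UNGM values and are an assumption of this sketch, as are the function names and the seed; the noise variances default to 1 as stated for (3.1).

```python
import math
import random


def ungm_transition(x_prev, n):
    """Noise-free UNGM dynamics at time step n (commonly used coefficients, assumed)."""
    return 0.5 * x_prev + 25.0 * x_prev / (1.0 + x_prev ** 2) + 8.0 * math.cos(1.2 * (n - 1))


def ungm_measurement(x):
    """Quadratic measurement function h(x) = x^2 / 20."""
    return x ** 2 / 20.0


def simulate_ungm(T, rng, q_var=1.0, r_var=1.0):
    """Simulate T steps; returns the latent states and the noisy observations."""
    x = rng.gauss(0.0, 1.0)                      # x_0 drawn from the standard Gaussian prior
    states, observations = [], []
    for n in range(1, T + 1):
        x = ungm_transition(x, n) + rng.gauss(0.0, math.sqrt(q_var))
        y = ungm_measurement(x) + rng.gauss(0.0, math.sqrt(r_var))
        states.append(x)
        observations.append(y)
    return states, observations


states, observations = simulate_ungm(T=100, rng=random.Random(0))
print(states[:3], observations[:3])
```

Since `ungm_measurement(x)` equals `ungm_measurement(-x)`, an observation alone cannot determine the sign of the state, which is the symmetry discussed above.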
The PF lost track of the state, as $N = 500$ particles failed to capture the true density, especially in its tails, which led to degeneracy (Doucet et al., 2001). The particle filters in Table 3.1 are in their standard form, and performance may improve if advanced techniques are used (Doucet et al., 2001), as can be seen from the Gibbs filter (Deisenroth and Ohlsson, 2011). The proposed M-MF could track both modes and, hence, led to more consistent estimates. The RMSE performance of the multi-modal filter (M-MF)

(a) Multi-modal filter (multi-modal smoother). RMSE 2.58 (1.28). NLL 1.79 (1.57).
(b) Unscented Kalman filter (unscented Kalman smoother). RMSE (7.734). NLL (13.44).
(c) Extended Kalman filter (extended Kalman smoother). RMSE (13.42). NLL (118.34).
(d) Unscented Kalman filter (multi-modal smoother). RMSE (1.33). NLL (1.577).

Figure 3.1.: An example trajectory from Table 3.1 for the non-stationary model with the quadratic measurement function (UNGM). In panel d) the forward pass is a UKF and the multi-modal smoother is used for the backward pass. The true state is represented by the dashed red line, the shaded region represents the 95% confidence area of the filtering distribution, and the solid green lines represent the 95% confidence area of the smoothing distribution.

Columns (RMSE and NLL for each): Stationary, $h(x) = 5\sin(x)$; Non-stationary, $h(x) = x^2/20$; Non-stationary, $h(x) = 5\sin(x)$.
EKF: 7.59 ± ± ± ± ± ±
UKF: ± ± ± ± ± ± 20.6
M-MF: 0.97 ± ± ± ± ± ± 8.04
PF: 2.4 ± 0.03 N/A 3.5 ± ± ± 3.6 N/A
Gibbs Filter: 3.62 ± ± ± ± ± ± 0.08

Table 3.1.: Average performance of the filters, shown with standard deviation. Lower values are better. The M-MF performs better than the standard filtering methods.

Columns (RMSE and NLL for each): Stationary, $h(x) = 5\sin(x)$; Non-stationary, $h(x) = x^2/20$; Non-stationary, $h(x) = 5\sin(x)$.
EKS: ± ± ± ± ± ± 98.5
UKS: ± ± ± ± ± ± 25.6
M-MS: 0.92 ± ± ± ± ± ± 9.06

Table 3.2.: Average performance of the smoothers, shown with standard deviation. Lower values are better. The M-MS performs better than the standard smoothing methods.

is significantly better than the UKF, with the M-MF outperforming the UKF in terms of a lower mean error and standard deviation. The main advantage of the M-MF is its ability to capture the uncertainty appropriately. The NLL values of the M-MF were significantly better than those of the UKF, even when the same parameters were used to calculate the Unscented Transform, see Table 3.1.

We tested the UNGM with an alternative measurement function $h(x) = 5\sin(x)$. For this function, the performance of the EKF is best in terms of RMSE, since its estimates are more stable. The proposed M-MF could track multiple modes, which resulted in significantly better performance in terms of NLL. Moreover, the filter performance was consistently stable, as indicated by the small standard deviation of the NLL measure.

Stationary Model

The stationary model is based on the Univariate Non-stationary Growth Model (UNGM) described above (Kitagawa, 1996) and can be described by the following equations:

$$f(x) = \frac{x}{2} + \frac{25x}{1 + x^2} + w, \quad w \sim \mathcal{N}(0, 1), \qquad (3.2)$$
$$h(x) = 5\sin(x) + v, \quad v \sim \mathcal{N}(0, 1). \qquad (3.3)$$

We see from the results in Table 3.1 that the UKF was outperformed by all other deterministic filters. The failures of the UKF and the PF-UKF are attributed to their overconfident predictions for sinusoidal functions, which confirms the results in (Deisenroth et al., 2012). The EKF approximates these sinusoidal functions better, but both the UKF and the EKF fail to capture the multi-modal nature of the system dynamics. Thus, the EKF and UKF are inconsistent for the model and settings used in this experiment. The proposed M-MF, on the other hand, performed consistently better in terms of RMSE and NLL values (see Table 3.1). Moreover, the small standard deviation of the NLL suggests that our proposed M-MF is consistent and stable.
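The stationary model in (3.2) and (3.3) can be simulated in a few lines. The following is a minimal sketch rather than the code used in the thesis: the function names, the initial state and the seed are illustrative assumptions, and the coefficients 1/2 and 25 are assumed from the usual form of the UNGM.

```python
import math
import random


def transition(x):
    """Noise-free part of (3.2); coefficients 1/2 and 25 are assumed (usual UNGM form)."""
    return 0.5 * x + 25.0 * x / (1.0 + x * x)


def measurement(x):
    """Noise-free part of (3.3): h(x) = 5 sin(x)."""
    return 5.0 * math.sin(x)


def simulate(num_steps, x0=0.0, seed=0):
    """Draw one state/observation trajectory with unit-variance Gaussian noise.

    x0 and seed are illustrative choices, not values taken from the thesis.
    """
    rng = random.Random(seed)
    states, observations = [], []
    x = x0
    for _ in range(num_steps):
        x = transition(x) + rng.gauss(0.0, 1.0)      # process noise w ~ N(0, 1)
        z = measurement(x) + rng.gauss(0.0, 1.0)     # measurement noise v ~ N(0, 1)
        states.append(x)
        observations.append(z)
    return states, observations
```

Because the sine is periodic, `measurement(x)` and `measurement(x + 2 * math.pi)` coincide, so a single observation is compatible with several well-separated states. This is one way to see why a uni-modal Gaussian belief, as used by the EKF and UKF, can be a poor fit for this model.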

Figure 3.2.: Lorenz time series (reproduced from )

3.2 Lorenz System

The Lorenz system is a non-linear dynamical system proposed by Lorenz in (Lorenz, 1963) as a solution to the heat convection flow in the atmosphere. It is a three-dimensional, non-linear, deterministic dynamical system defined by the coupled differential equations (3.4) (Lorenz, 1963) (Hilborn, 2000). The system is unstable for certain parameter values and can be described as deterministic chaos (Hilborn, 2000). The chaotic behaviour is characterised by an unstable system that is highly sensitive to initial values: a small deviation may result in a completely different evolution of the system states (Lorenz, 1963). We use the parameter values ρ = 28, σ = 10, and β = 8/3 as described by Lorenz in (Lorenz, 1963). The system is bistable and describes a butterfly shape usually associated with non-linear chaos (Hilborn, 2000). The system is described by the differential equations

$$\frac{dx}{dt} = \sigma(y - x), \qquad (3.4)$$
$$\frac{dy}{dt} = x(\rho - z) - y,$$
$$\frac{dz}{dt} = xy - \beta z.$$

We define the system function $f(x, y, z)$ as the solution to the coupled differential equations in (3.4). The solutions are obtained by a solver in MATLAB.
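The sensitivity to initial values can be illustrated numerically. The sketch below integrates (3.4) with a fixed-step fourth-order Runge-Kutta scheme, used here as a stand-in for the MATLAB solver mentioned in the text. The step size, initial state and function names are assumptions made for illustration only.

```python
import math


def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of (3.4) with the parameter values used in the text."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """One classical Runge-Kutta step (an assumed stand-in for the MATLAB solver)."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2.0))
    k3 = lorenz(shifted(state, k2, dt / 2.0))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, dt, num_steps):
    """Return the list of states visited, including the initial one."""
    trajectory = [state]
    for _ in range(num_steps):
        state = rk4_step(state, dt)
        trajectory.append(state)
    return trajectory


# Two runs whose initial x values differ by 1e-8 (illustrative values).
run_a = integrate((1.0, 1.0, 1.0), dt=0.01, num_steps=3000)
run_b = integrate((1.0 + 1e-8, 1.0, 1.0), dt=0.01, num_steps=3000)
largest_gap = max(math.dist(a, b) for a, b in zip(run_a, run_b))
```

Comparing `largest_gap` with the initial offset of 1e-8 shows how quickly the two trajectories separate, which is the behaviour that makes state estimation for this system difficult.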


More information

Statisticians use the word population to refer the total number of (potential) observations under consideration

Statisticians use the word population to refer the total number of (potential) observations under consideration 6 Samplig Distributios Statisticias use the word populatio to refer the total umber of (potetial) observatios uder cosideratio The populatio is just the set of all possible outcomes i our sample space

More information

Chapter 9: Numerical Differentiation

Chapter 9: Numerical Differentiation 178 Chapter 9: Numerical Differetiatio Numerical Differetiatio Formulatio of equatios for physical problems ofte ivolve derivatives (rate-of-chage quatities, such as velocity ad acceleratio). Numerical

More information

6. Kalman filter implementation for linear algebraic equations. Karhunen-Loeve decomposition

6. Kalman filter implementation for linear algebraic equations. Karhunen-Loeve decomposition 6. Kalma filter implemetatio for liear algebraic equatios. Karhue-Loeve decompositio 6.1. Solvable liear algebraic systems. Probabilistic iterpretatio. Let A be a quadratic matrix (ot obligatory osigular.

More information

THE SYSTEMATIC AND THE RANDOM. ERRORS - DUE TO ELEMENT TOLERANCES OF ELECTRICAL NETWORKS

THE SYSTEMATIC AND THE RANDOM. ERRORS - DUE TO ELEMENT TOLERANCES OF ELECTRICAL NETWORKS R775 Philips Res. Repts 26,414-423, 1971' THE SYSTEMATIC AND THE RANDOM. ERRORS - DUE TO ELEMENT TOLERANCES OF ELECTRICAL NETWORKS by H. W. HANNEMAN Abstract Usig the law of propagatio of errors, approximated

More information