Computer Intensive Methods in Mathematical Statistics


1 Computer Intensive Methods in Mathematical Statistics Department of mathematics Lecture 16 Advanced topics in computational statistics 18 May 2017 Computer Intensive Methods (1)

2 Plan of today's lecture 1 Last time: the EM algorithm 2 Computer Intensive Methods (2)

3 Outline 1 Last time: the EM algorithm 2 Computer Intensive Methods (3)

4 Last time: the EM algorithm
The algorithm goes as follows.

Data: initial value $\theta_0$
Result: $\{\theta_l;\ l \in \mathbb{N}\}$
for $l \leftarrow 0, 1, 2, \ldots$ do
    set $Q_{\theta_l}(\theta) \leftarrow \mathbb{E}_{\theta_l}(\log f_\theta(X, Y) \mid Y)$;
    set $\theta_{l+1} \leftarrow \arg\max_{\theta \in \Theta} Q_{\theta_l}(\theta)$;
end

The two steps within the main loop are referred to as the expectation and maximization steps, respectively. Computer Intensive Methods (4)

5 Last time: the EM inequality
By rearranging the terms we obtain the following.

Theorem (the EM inequality)
For all $(\theta, \theta') \in \Theta^2$ it holds that
$$\ell(\theta) - \ell(\theta') \geq Q_{\theta'}(\theta) - Q_{\theta'}(\theta'),$$
where the inequality is strict unless $f_\theta(x \mid Y) = f_{\theta'}(x \mid Y)$.

Thus, by the very construction of $\{\theta_l;\ l \in \mathbb{N}\}$ it is ensured that $\{\ell(\theta_l);\ l \in \mathbb{N}\}$ is non-decreasing. Hence, the EM algorithm is a monotone optimization algorithm. Computer Intensive Methods (5)

6 EM in exponential families
In order to be practically useful, the E- and M-steps of EM have to be feasible. A rather general context in which this is the case is the following.

Definition (exponential family)
The family $\{f_\theta(x, y);\ \theta \in \Theta\}$ defines an exponential family if the complete-data likelihood is of the form
$$f_\theta(x, y) = \exp\big(\psi(\theta)^\top \phi(x) - c(\theta)\big)\, h(x),$$
where $\phi$ and $\psi$ are (possibly) vector-valued functions on $\mathbb{R}^d$ and $\Theta$, respectively, and $h$ is a non-negative real-valued function on $\mathbb{R}^d$. All these quantities may depend on $y$. Computer Intensive Methods (6)

7 EM in exponential families (cont'd)
The intermediate quantity becomes
$$Q_{\theta'}(\theta) = \psi(\theta)^\top \mathbb{E}_{\theta'}(\phi(X) \mid Y) - c(\theta) + \underbrace{\mathbb{E}_{\theta'}(\log h(X) \mid Y)}_{(*)},$$
where $(*)$ does not depend on $\theta$ and may thus be ignored. Consequently, in order to be able to apply EM we need
1 to be able to compute the smoothed sufficient statistics
$$\tau = \mathbb{E}_{\theta'}(\phi(X) \mid Y) = \int \phi(x)\, f_{\theta'}(x \mid Y)\, dx,$$
2 maximization of $\theta \mapsto \psi(\theta)^\top \tau - c(\theta)$ to be feasible for all $\tau$.
Computer Intensive Methods (7)

8 Example: nonlinear Gaussian model
Let $(y_i)_{i=1}^n$ be independent observations of a random variable
$$Y = h(X) + \sigma_y \varepsilon_y,$$
where $h$ is a possibly nonlinear function and $X = \mu + \sigma_x \varepsilon_x$ is not observable. Here $\varepsilon_x$ and $\varepsilon_y$ are independent standard Gaussian noise variables. The parameters $\theta = (\mu, \sigma_x^2)$ governing the distribution of the unobservable variable $X$ are unknown, while it is known that $\sigma_y = 0.4$. Computer Intensive Methods (8)

9 Example: nonlinear Gaussian model (cont.)
In this case the complete data likelihood belongs to an exponential family with
$$\phi(x_{1:n}) = \left(\sum_{i=1}^n x_i^2,\ \sum_{i=1}^n x_i\right), \quad \psi(\theta) = \left(-\frac{1}{2\sigma_x^2},\ \frac{\mu}{\sigma_x^2}\right),$$
and
$$c(\theta) = \frac{n}{2}\log \sigma_x^2 + \frac{n\mu^2}{2\sigma_x^2}.$$
Letting $\tau_i = \mathbb{E}_{\theta_l}(\phi_i(X_{1:n}) \mid Y_{1:n})$, $i \in \{1, 2\}$, leads to
$$\mu_{l+1} = \frac{\tau_2}{n}, \qquad (\sigma_x^2)_{l+1} = \frac{\tau_1}{n} - \mu_{l+1}^2.$$
Computer Intensive Methods (9)

10 Example: nonlinear Gaussian model (cont.)
Since the smoothed sufficient statistics are of additive form it holds that, e.g.,
$$\tau_1 = \mathbb{E}_{\theta_l}\left(\sum_{i=1}^n X_i^2 \,\Big|\, Y_{1:n}\right) = \sum_{i=1}^n \mathbb{E}_{\theta_l}\left(X_i^2 \mid Y_i\right)$$
(and similarly for $\tau_2$). However, computing expectations under
$$f_{\theta_l}(x_i \mid y_i) \propto \exp\left(-\frac{1}{2\sigma_y^2}\{y_i - h(x_i)\}^2 - \frac{1}{2(\sigma_x^2)_l}\{x_i - \mu_l\}^2\right)$$
is in general infeasible (i.e. when the transformation $h$ is a general nonlinear function). Computer Intensive Methods (10)

11 Example: nonlinear Gaussian model (cont.)
Thus, within each iteration of EM we sample each component $f_{\theta_l}(x_i \mid y_i)$ using MH. For simplicity we use the independent proposal $r(x_i) = f_{\theta_l}(x_i)$, yielding the MH acceptance probability
$$\alpha(X_i^{(k)}, X_i^*) = 1 \wedge \frac{f_{\theta_l}(Y_i \mid X_i^*)}{f_{\theta_l}(Y_i \mid X_i^{(k)})} = 1 \wedge \exp\left(-\frac{1}{2\sigma_y^2}\left\{h^2(X_i^*) - h^2(X_i^{(k)}) + 2 Y_i \big(h(X_i^{(k)}) - h(X_i^*)\big)\right\}\right).$$
Computer Intensive Methods (11)
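
The acceptance ratio involves only the observation density, since the prior proposal cancels against the prior factor of the target. Below is a minimal Python sketch of such an independent-proposal MH kernel for a single component x_i; the nonlinearity h = tanh, the data value y_i, the current parameter values and the chain length are all illustrative assumptions, not values from the lecture.

import numpy as np

def mh_independent_prior(y_i, h, mu_l, sx2_l, sigma_y, n_iter, rng):
    # Sample from f_{theta_l}(x_i | y_i) using the prior N(mu_l, sx2_l) as independent proposal.
    x = mu_l  # arbitrary starting point
    chain = np.empty(n_iter)
    for k in range(n_iter):
        x_star = mu_l + np.sqrt(sx2_l) * rng.standard_normal()  # draw from the prior
        # log acceptance ratio: only the likelihood terms remain
        log_alpha = ((y_i - h(x)) ** 2 - (y_i - h(x_star)) ** 2) / (2.0 * sigma_y ** 2)
        if np.log(rng.uniform()) < log_alpha:
            x = x_star
        chain[k] = x
    return chain

rng = np.random.default_rng(0)
draws = mh_independent_prior(y_i=1.3, h=np.tanh, mu_l=1.0, sx2_l=0.16,
                             sigma_y=0.4, n_iter=200, rng=rng)
print(draws.mean(), (draws ** 2).mean())  # MC estimates of E[X_i | Y_i] and E[X_i^2 | Y_i]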

12 Example: nonlinear Gaussian model (cont.)

Data: initial value $\theta_0$; $Y_{1:n}$
Result: $\{\theta_l;\ l \in \mathbb{N}\}$
for $l \leftarrow 0, 1, 2, \ldots$ do
    set $\hat\tau_j \leftarrow 0$, $\forall j$;
    for $i \leftarrow 1, \ldots, n$ do
        run an MH sampler targeting $f_{\theta_l}(x_i \mid Y_i)$, yielding $(X_i^{(k)})_{k=1}^{N_l}$;
        set $\hat\tau_1 \leftarrow \hat\tau_1 + \sum_{k=1}^{N_l} (X_i^{(k)})^2 / N_l$;
        set $\hat\tau_2 \leftarrow \hat\tau_2 + \sum_{k=1}^{N_l} X_i^{(k)} / N_l$;
    end
    set $\mu_{l+1} \leftarrow \hat\tau_2 / n$;
    set $(\sigma_x^2)_{l+1} \leftarrow \hat\tau_1 / n - \mu_{l+1}^2$;
end
Computer Intensive Methods (12)
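
As an illustration, a hedged Python sketch of the full Monte Carlo EM loop above, with the MH step from the previous slide inlined; the nonlinearity h = tanh, the simulated data and the tuning constants (number of EM iterations, chain length N_l) are assumptions made only for this example.

import numpy as np

rng = np.random.default_rng(1)

# illustrative setup (assumptions, not the course's data record)
h = np.tanh
sigma_y = 0.4
n = 40
x_true = 1.0 + 0.4 * rng.standard_normal(n)
y = h(x_true) + sigma_y * rng.standard_normal(n)

mu, sx2 = 0.0, 1.0            # theta_0
n_em, N_l = 50, 200           # EM iterations and MH chain length per component

for l in range(n_em):
    tau1 = tau2 = 0.0
    for i in range(n):
        x = mu                # start the chain at the prior mean
        s1 = s2 = 0.0
        for k in range(N_l):
            x_star = mu + np.sqrt(sx2) * rng.standard_normal()     # prior proposal
            log_alpha = ((y[i] - h(x)) ** 2 - (y[i] - h(x_star)) ** 2) / (2 * sigma_y ** 2)
            if np.log(rng.uniform()) < log_alpha:
                x = x_star
            s1 += x ** 2
            s2 += x
        tau1 += s1 / N_l      # MC estimate of E[X_i^2 | Y_i]
        tau2 += s2 / N_l      # MC estimate of E[X_i | Y_i]
    mu = tau2 / n             # M-step
    sx2 = tau1 / n - mu ** 2

print(mu, sx2)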

13 Example: nonlinear Gaussian model (cont.)
In order to assess the performance of this Monte Carlo EM algorithm we focus on the linear case, $h(x) = x$. A data record comprising $n = 40$ values was produced by simulation under $(\mu, \sigma_x) = (1, 0.4)$ with $\sigma_y = 0.4$. In this case the true MLE is known and given by (check this!)
$$\hat\mu(Y_{1:n}) = \frac{1}{n}\sum_{i=1}^n Y_i = 1.04, \qquad \hat\sigma_x^2(Y_{1:n}) = \frac{1}{n}\sum_{i=1}^n \{Y_i - \hat\mu(Y_{1:n})\}^2 - \sigma_y^2 = 0.27.$$
Computer Intensive Methods (13)
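
As a quick sanity check of these closed-form expressions (a sketch under the assumption h(x) = x, with freshly simulated data in place of the course's data record, and scipy used only for the numerical comparison), one can compare the analytical MLE with a numerical maximization of the observed-data log-likelihood:

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n, mu_true, sx_true, sigma_y = 40, 1.0, 0.4, 0.4
y = mu_true + sx_true * rng.standard_normal(n) + sigma_y * rng.standard_normal(n)

# closed-form MLE: in the linear case Y_i ~ N(mu, sx2 + sigma_y^2)
mu_hat = y.mean()
sx2_hat = np.mean((y - mu_hat) ** 2) - sigma_y ** 2

def neg_loglik(theta):
    mu, sx2 = theta
    v = sx2 + sigma_y ** 2
    return 0.5 * np.sum(np.log(2 * np.pi * v) + (y - mu) ** 2 / v)

opt = minimize(neg_loglik, x0=[0.0, 1.0], bounds=[(None, None), (1e-6, None)])
print((mu_hat, sx2_hat), opt.x)   # the two should agree (up to optimizer tolerance)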

14 Example: nonlinear Gaussian model (cont.)
Figure: EM learning trajectory (blue curve) for $\mu$ in the case $h(x) = x$ ($y$-axis: $\mu$ estimate, $x$-axis: EM iteration index). The red dashed line indicates the true MLE. $N_l = 200$ for all iterations. Computer Intensive Methods (14)

15 Example: nonlinear Gaussian model (cont.)
Figure: EM learning trajectory (blue curve) for $\sigma_x^2$ in the case $h(x) = x$ ($y$-axis: $\sigma_x^2$ estimate, $x$-axis: EM iteration index). The red dashed line indicates the true MLE. $N_l = 200$ for all iterations. Computer Intensive Methods (15)

16 Averaging
Roughly, for the MCEM algorithm one may prove that the variance of $\theta_l - \hat\theta$ (where $\hat\theta$ is the MLE) is of order $1/N_l$. In the idealised situation where the estimates $(\theta_l)_l$ are uncorrelated (which is not the case here) one may obtain an improved estimator of $\hat\theta$ by combining the individual estimates $\theta_l$ in proportion to the inverse of their variance. Starting the averaging at iteration $l_0$ leads to
$$\tilde\theta_l = \sum_{l'=l_0}^{l} \frac{N_{l'}}{\sum_{l''=l_0}^{l} N_{l''}}\, \theta_{l'}, \qquad l \geq l_0.$$
In the idealised situation the variance of $\tilde\theta_l$ decreases as $1 / \sum_{l'=l_0}^{l} N_{l'}$ (= the total number of simulations). Computer Intensive Methods (16)
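
A minimal sketch of this weighted averaging, assuming we have stored the per-iteration MCEM estimates theta_l and the Monte Carlo sample sizes N_l (the arrays below are hypothetical, for illustration only):

import numpy as np

theta = np.array([0.80, 0.95, 1.02, 0.98, 1.05, 1.01])   # hypothetical MCEM iterates
N = np.array([100, 100, 200, 200, 400, 400])             # MC sample sizes per iteration
l0 = 2                                                    # iteration at which averaging starts

weights = N[l0:] / N[l0:].sum()                           # weights proportional to N_l
theta_avg = np.sum(weights * theta[l0:])
print(theta_avg)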

17 Example: nonlinear Gaussian model (cont.)
Figure: EM learning trajectory (blue curve) for $\mu$ in the case $h(x) = x$ ($y$-axis: $\mu$ estimate, $x$-axis: EM iteration index). The red dashed line indicates the true MLE. Averaging after $l_0 = 30$ iterations. Computer Intensive Methods (17)

18 Example: nonlinear Gaussian model (cont.)
Figure: EM learning trajectory (blue curve) for $\sigma_x^2$ in the case $h(x) = x$ ($y$-axis: $\sigma_x^2$ estimate, $x$-axis: EM iteration index). The red dashed line indicates the true MLE. Averaging after $l_0 = 30$ iterations. Computer Intensive Methods (18)

19 Outline 1 Last time: the EM algorithm 2 Computer Intensive Methods (19)

21 General hidden Markov models (HMMs)
A hidden Markov model (HMM) comprises
1 a Markov chain $(X_k)_{k \geq 0}$ with transition density $q$, i.e.
$$X_{k+1} \mid X_k = x_k \sim q(x_{k+1} \mid x_k),$$
which is hidden from us but partially observed through
2 an observation process $(Y_k)_{k \geq 0}$ such that, conditionally on the chain $(X_k)_{k \geq 0}$,
(i) the $Y_k$'s are independent, with
(ii) the conditional distribution of each $Y_k$ depending on the corresponding $X_k$ only.
The density of the conditional distribution $Y_k \mid (X_k)_{k \geq 0} \stackrel{d}{=} Y_k \mid X_k$ will be denoted by $p(y_k \mid x_k)$. Computer Intensive Methods (21)

23 General HMMs (cont.)
Graphically:

  ... --> X_{k-1} --> X_k --> X_{k+1} --> ...   (Markov chain)
              |          |        |
              v          v        v
          Y_{k-1}       Y_k    Y_{k+1}          (Observations)

$Y_k \mid X_k = x_k \sim p(y_k \mid x_k)$   (observation density)
$X_{k+1} \mid X_k = x_k \sim q(x_{k+1} \mid x_k)$   (transition density)
$X_0 \sim \chi(x_0)$   (initial distribution)
Computer Intensive Methods (22)

24 The smoothing distribution
In an HMM, the smoothing distribution $f_n(x_{0:n} \mid y_{0:n})$ is the conditional distribution of $X_{0:n}$ given $Y_{0:n} = y_{0:n}$.

Theorem (smoothing distribution)
$$f_n(x_{0:n} \mid y_{0:n}) = \frac{\chi(x_0)\, p(y_0 \mid x_0) \prod_{k=1}^n p(y_k \mid x_k)\, q(x_k \mid x_{k-1})}{L_n(y_{0:n})},$$
where
$$L_n(y_{0:n}) = \text{density of the observations } y_{0:n} = \int \chi(x_0)\, p(y_0 \mid x_0) \prod_{k=1}^n p(y_k \mid x_k)\, q(x_k \mid x_{k-1})\, dx_{0:n}.$$
Computer Intensive Methods (23)
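
For concreteness, a small sketch of the logarithm of the numerator above, i.e. the unnormalized log smoothing density, for a generic HMM specified through log-density functions log_chi, log_p and log_q; the function names and the linear Gaussian example used to call them are assumptions made for illustration.

import numpy as np

def log_joint(x, y, log_chi, log_p, log_q):
    # log of chi(x_0) p(y_0 | x_0) prod_{k>=1} p(y_k | x_k) q(x_k | x_{k-1});
    # the log smoothing density equals this up to the constant -log L_n(y_{0:n}).
    val = log_chi(x[0]) + log_p(y[0], x[0])
    for k in range(1, len(x)):
        val += log_p(y[k], x[k]) + log_q(x[k], x[k - 1])
    return val

# illustrative linear Gaussian HMM
log_chi = lambda x0: -0.5 * x0 ** 2 - 0.5 * np.log(2 * np.pi)
log_q = lambda x, xp: -0.5 * (x - 0.9 * xp) ** 2 - 0.5 * np.log(2 * np.pi)
log_p = lambda y, x: -0.5 * (y - x) ** 2 / 0.25 - 0.5 * np.log(2 * np.pi * 0.25)
print(log_joint(np.zeros(5), np.zeros(5), log_chi, log_p, log_q))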

25 The goal
We wish, for $n \in \mathbb{N}$, to estimate the expected value
$$\tau_n = \mathbb{E}[h_n(X_{0:n}) \mid Y_{0:n}]$$
under the smoothing distribution. Here $h_n(x_{0:n})$ is of additive form, that is,
$$h_n(x_{0:n}) = \sum_{i=0}^{n-1} h_i(x_{i:i+1}).$$
For instance, the sufficient statistics that are needed for the EM algorithm. Computer Intensive Methods (24)

27 Using basic SISR
Given particles and weights $\{(X_{0:n}^i, \omega_n^i)\}_{i=1}^N$ targeting $f(x_{0:n} \mid y_{0:n})$, we update this to an estimate at time $n + 1$ using the following steps:
Selection: draw indices $\{I_n^i\}_{i=1}^N \sim \mathrm{Mult}(\{\omega_n^i\}_{i=1}^N)$.
Mutation: for $i = 1, \ldots, N$, draw $X_{n+1}^i \sim q(\cdot \mid X_n^{I_n^i})$ and set $X_{0:n+1}^i = (X_{0:n}^{I_n^i}, X_{n+1}^i)$.
Weighting: for $i = 1, \ldots, N$, set $\omega_{n+1}^i = p(y_{n+1} \mid X_{n+1}^i)$.
Estimation: we estimate $\tau_{n+1}$ using
$$\hat\tau_{n+1} = \sum_{i=1}^N \frac{\omega_{n+1}^i}{\sum_{l=1}^N \omega_{n+1}^l}\, h_{n+1}(X_{0:n+1}^i).$$
Computer Intensive Methods (25)
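
A hedged Python sketch of this SISR recursion, specialised to a linear Gaussian toy HMM so that it runs as written; the model, its parameter values and the additive functional (here the sum of the states) are illustrative assumptions, and the paths are stored explicitly to make the estimator of tau_{n+1} concrete.

import numpy as np

rng = np.random.default_rng(3)

# toy linear Gaussian HMM (illustrative assumptions)
phi, sigma, sigma_y = 0.9, 1.0, 0.5
T, N = 50, 500
x = np.zeros(T)
x[0] = rng.standard_normal()
for t in range(1, T):
    x[t] = phi * x[t - 1] + sigma * rng.standard_normal()
y = x + sigma_y * rng.standard_normal(T)

def loglik(y_t, particles):                     # log p(y_t | x_t), up to a constant
    return -0.5 * (y_t - particles) ** 2 / sigma_y ** 2

# initialisation: X_0^i ~ chi = N(0, 1), one column per time index
paths = rng.standard_normal((N, 1))
logw = loglik(y[0], paths[:, 0])
w = np.exp(logw - logw.max()); w /= w.sum()

for n in range(T - 1):
    idx = rng.choice(N, size=N, p=w)            # selection: multinomial resampling
    new = phi * paths[idx, -1] + sigma * rng.standard_normal(N)   # mutation
    paths = np.column_stack([paths[idx], new])  # extend the selected ancestral paths
    logw = loglik(y[n + 1], new)                # weighting
    w = np.exp(logw - logw.max()); w /= w.sum()

h = paths.sum(axis=1)                           # additive functional h_n(x_{0:n}) = sum_k x_k
print("estimated E[sum_k X_k | Y_{0:T-1}]:", np.sum(w * h))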

28 The problem of resampling
We saw when comparing with the SIS algorithm that the resampling was necessary to achieve a stable estimate. But the resampling also depletes the historical trajectories: there will be an integer $k$ such that, at time $n$ in the algorithm, $X_{0:k}^i = X_{0:k}^j$ for all $i$ and $j$. Computer Intensive Methods (26)

29 The problem of resampling (cont.)
Figure: particle trajectories (state value vs. time index) illustrating how resampling depletes the historical paths. Computer Intensive Methods (26)

30 Fixing the degeneracy
The key to fixing this problem is the following observation: conditioned on the observations, the hidden Markov chain is also a Markov chain in the reverse direction. We denote the transition density of this reversed Markov chain by $\overleftarrow{q}(x_k \mid x_{k+1}, y_{0:k})$; this density can be written as
$$\overleftarrow{q}(x_k \mid x_{k+1}, y_{0:k}) = \frac{f(x_k \mid y_{0:k})\, q(x_{k+1} \mid x_k)}{\int f(x' \mid y_{0:k})\, q(x_{k+1} \mid x')\, dx'}.$$
Since we can estimate $f(x_k \mid y_{0:k})$ well using SISR, we can use the following particle approximation of the backward distribution:
$$\overleftarrow{q}_N(x_k^i \mid x_{k+1}^j, y_{0:k}) = \frac{\omega_k^i\, q(x_{k+1}^j \mid x_k^i)}{\sum_{l=1}^N \omega_k^l\, q(x_{k+1}^j \mid x_k^l)}.$$
Computer Intensive Methods (27)

32 Forward Filtering Backward Smoothing (FFBSm)
Using this approximation of the backward transition density we get the Forward Filtering Backward Smoothing algorithm. First we need to run the SISR algorithm and save all the filter estimates $\{(x_k^i, \omega_k^i)\}_{i=1}^N$ for $k = 0, \ldots, n$. We then estimate $\tau_n$ using
$$\hat\tau_n = \sum_{i_0=1}^N \cdots \sum_{i_n=1}^N \left(\prod_{s=0}^{n-1} \frac{\omega_s^{i_s}\, q(x_{s+1}^{i_{s+1}} \mid x_s^{i_s})}{\sum_{l=1}^N \omega_s^l\, q(x_{s+1}^{i_{s+1}} \mid x_s^l)}\right) \frac{\omega_n^{i_n}}{\sum_{l=1}^N \omega_n^l}\, h_n(x_0^{i_0}, x_1^{i_1}, \ldots, x_n^{i_n}).$$
Computer Intensive Methods (28)

33 A forward-only version
The previous algorithm requires two passes over the data: first the forward filtering, then the backward smoothing. When working with functions of additive form it is possible to perform smoothing in the following way. Introduce the auxiliary function $T_k(x)$, which we define as
$$T_k(x) = \mathbb{E}[h_k(X_{0:k}) \mid Y_{0:k}, X_k = x].$$
We can update this function recursively by noting that
$$T_{k+1}(x) = \mathbb{E}[T_k(X_k) + h_k(X_k, X_{k+1}) \mid Y_{0:k}, X_{k+1} = x],$$
that is, the expected value of the previous function plus the next additive term, under the backward distribution! Computer Intensive Methods (29)

34 Using this decomposition
Given estimates of the filter at time $k$, $\{(x_k^i, \omega_k^i)\}_{i=1}^N$, and estimates $\{T_k(x_k^i)\}_{i=1}^N$, propagate the filter estimates using one step of the SISR algorithm to get $\{(x_{k+1}^i, \omega_{k+1}^i)\}_{i=1}^N$. Now, for $i = 1, \ldots, N$, estimate the function $T_{k+1}(x_{k+1}^i)$ using
$$T_{k+1}(x_{k+1}^i) = \sum_{j=1}^N \frac{\omega_k^j\, q(x_{k+1}^i \mid x_k^j)}{\sum_{l=1}^N \omega_k^l\, q(x_{k+1}^i \mid x_k^l)} \left(T_k(x_k^j) + h_k(x_k^j, x_{k+1}^i)\right).$$
$\tau_{k+1}$ is then estimated using
$$\hat\tau_{k+1} = \sum_{i=1}^N \frac{\omega_{k+1}^i}{\sum_{l=1}^N \omega_{k+1}^l}\, T_{k+1}(x_{k+1}^i).$$
Computer Intensive Methods (30)
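
A sketch of this exact O(N^2) forward-only update in Python, operating on generic arrays of particles, weights and statistics; the transition density q and the additive term h_k are passed in as functions, and the concrete choices below (Gaussian random-walk kernel, h_k(x, x') = x x', uniform weights) are illustrative assumptions.

import numpy as np

def forward_only_update(xk, wk, Tk, xk1, wk1, q_dens, h_add):
    # One update of the auxiliary statistics T_{k+1}(x_{k+1}^i) and of tau_{k+1}.
    N = len(xk1)
    Tk1 = np.empty(N)
    for i in range(N):
        bw = wk * q_dens(xk1[i], xk)            # unnormalised backward weights
        bw /= bw.sum()
        Tk1[i] = np.sum(bw * (Tk + h_add(xk, xk1[i])))
    tau = np.sum(wk1 / wk1.sum() * Tk1)
    return Tk1, tau

q_dens = lambda x_new, x_old: np.exp(-0.5 * (x_new - 0.9 * x_old) ** 2)
h_add = lambda x_old, x_new: x_old * x_new
rng = np.random.default_rng(4)
xk, xk1 = rng.standard_normal(100), rng.standard_normal(100)
wk, wk1, Tk = np.ones(100), np.ones(100), np.zeros(100)
print(forward_only_update(xk, wk, Tk, xk1, wk1, q_dens, h_add)[1])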

35 Speeding up the Algorithm
This algorithm works, but it is quite slow. Idea: can we replace calculating the expected value with an MC estimator? Can we sample sufficiently fast from the backward distribution? Formally, we wish to sample $J$ such that, given all particles and weights at times $k$ and $k + 1$ and an index $i$,
$$\mathbb{P}(J = j) = \frac{\omega_k^j\, q(x_{k+1}^i \mid x_k^j)}{\sum_{l=1}^N \omega_k^l\, q(x_{k+1}^i \mid x_k^l)}.$$
We can do this efficiently using accept-reject sampling! Computer Intensive Methods (31)
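
A sketch of the accept-reject sampler for the backward index J, under the standard assumption that the transition density is bounded, q(x' | x) <= q_max: propose J from the filter weights and accept with probability q(x_{k+1}^i | x_k^J) / q_max. The Gaussian random-walk kernel and its bound used below are illustrative assumptions.

import numpy as np

def sample_backward_index(xk, wk, x_next, q_dens, q_max, rng):
    # Draw J with P(J = j) proportional to w_k^j q(x_next | x_k^j) by accept-reject:
    # propose J ~ Mult(w_k), accept with probability q(x_next | x_k^J) / q_max.
    p = wk / wk.sum()
    while True:
        j = rng.choice(len(xk), p=p)
        if rng.uniform() * q_max < q_dens(x_next, xk[j]):
            return j

q_dens = lambda x_new, x_old: np.exp(-0.5 * (x_new - 0.9 * x_old) ** 2) / np.sqrt(2 * np.pi)
q_max = 1.0 / np.sqrt(2 * np.pi)
rng = np.random.default_rng(5)
xk, wk = rng.standard_normal(100), np.ones(100)
print(sample_backward_index(xk, wk, 0.3, q_dens, q_max, rng))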

38 The PaRIS algorithm
Now we can describe the PaRIS algorithm. Given particles and weights $\{(x_k^i, \omega_k^i)\}_{i=1}^N$ targeting the filter distribution at time $k$, together with values $\{T_k(x_k^i)\}_{i=1}^N$ estimating $T_k(x)$, we proceed by first taking one step of the SISR algorithm to get particles and weights $\{(x_{k+1}^i, \omega_{k+1}^i)\}_{i=1}^N$ targeting the filter at time $k + 1$. For $i = 1, \ldots, N$ we draw $\tilde N$ indices $\{J_{\tilde i}\}_{\tilde i=1}^{\tilde N}$ as on the previous slide and calculate
$$T_{k+1}(x_{k+1}^i) = \frac{1}{\tilde N} \sum_{\tilde i=1}^{\tilde N} \left(T_k(x_k^{J_{\tilde i}}) + h_k(x_k^{J_{\tilde i}}, x_{k+1}^i)\right).$$
The estimate of $\tau_{k+1}$ is then given by
$$\hat\tau_{k+1} = \sum_{i=1}^N \frac{\omega_{k+1}^i}{\sum_{l=1}^N \omega_{k+1}^l}\, T_{k+1}(x_{k+1}^i).$$
Computer Intensive Methods (32)
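
Below is a hedged sketch of one PaRIS update, combining a previously computed SISR step with the accept-reject backward sampling of the previous slide; the bounded Gaussian transition kernel, the additive term and the particle inputs are illustrative assumptions.

import numpy as np

def paris_update(xk, wk, Tk, xk1, wk1, q_dens, q_max, h_add, N_tilde, rng):
    # One PaRIS update of the statistics T_{k+1} and of tau_{k+1}, using
    # N_tilde accept-reject draws from the backward kernel for each particle.
    N = len(xk1)
    p = wk / wk.sum()
    Tk1 = np.empty(N)
    for i in range(N):
        acc = 0.0
        for _ in range(N_tilde):
            while True:                          # accept-reject backward draw
                j = rng.choice(N, p=p)
                if rng.uniform() * q_max < q_dens(xk1[i], xk[j]):
                    break
            acc += Tk[j] + h_add(xk[j], xk1[i])
        Tk1[i] = acc / N_tilde
    tau = np.sum(wk1 / wk1.sum() * Tk1)
    return Tk1, tau

# illustrative call with N_tilde = 2 backward samples per particle
q_dens = lambda x_new, x_old: np.exp(-0.5 * (x_new - 0.9 * x_old) ** 2) / np.sqrt(2 * np.pi)
q_max = 1.0 / np.sqrt(2 * np.pi)
rng = np.random.default_rng(6)
xk, xk1 = rng.standard_normal(200), rng.standard_normal(200)
wk, wk1, Tk = np.ones(200), np.ones(200), np.zeros(200)
print(paris_update(xk, wk, Tk, xk1, wk1, q_dens, q_max, lambda a, b: a * b, 2, rng)[1])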

39 The PaRIS algorithm (cont.)
This gives an efficient algorithm for online estimation of the expected value under the joint smoothing distribution. We require the target function to be of additive form. The number of backward samples ($\tilde N$) needed in the algorithm turns out to be 2. We can (under some additional assumptions) prove the following:
The computational complexity grows linearly with $N$.
The estimate is asymptotically consistent.
We can find a CLT for the error, which behaves like $1/\sqrt{N}$.
Computer Intensive Methods (33)

40 An example: parameter estimation
It turns out that the PaRIS algorithm can be used efficiently for online parameter estimation. We tested our method on the stochastic volatility model
$$X_{t+1} = \phi X_t + \sigma V_{t+1}, \qquad Y_t = \beta \exp(X_t / 2)\, U_t, \qquad t \in \mathbb{N},$$
where $\{V_t\}_{t \in \mathbb{N}}$ and $\{U_t\}_{t \in \mathbb{N}}$ are independent sequences of mutually independent standard Gaussian noise variables. The parameters to be estimated are $\theta = (\phi, \sigma^2, \beta^2)$. Computer Intensive Methods (34)
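
As a small illustration of this model, a simulation sketch only; the parameter values below are assumptions for the example, not those of the experiment reported in the lecture.

import numpy as np

rng = np.random.default_rng(7)
phi, sigma, beta = 0.98, 0.16, 0.7        # illustrative parameter values
T = 1000
x = np.zeros(T)
y = np.zeros(T)
x[0] = sigma / np.sqrt(1 - phi ** 2) * rng.standard_normal()       # stationary start
y[0] = beta * np.exp(x[0] / 2) * rng.standard_normal()
for t in range(T - 1):
    x[t + 1] = phi * x[t] + sigma * rng.standard_normal()          # latent log-volatility
    y[t + 1] = beta * np.exp(x[t + 1] / 2) * rng.standard_normal() # observed return
print(y[:5])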

41 Parameter estimation (cont.)
Figure: PaRIS-based RML estimates of $\phi$, $\sigma^2$, and $\beta^2$ as functions of time. Computer Intensive Methods (35)

42 End of the course
That is it for the lectures in this course. Hope you have enjoyed it!
Review of HA2: sent by mail by 24 May, 12:00.
Exam: 30 May, 14:00-19:00.
Re-exam: 14 August, 08:00-13:00.
Computer Intensive Methods (36)
