An intro to particle methods for parameter inference in state-space models

Size: px
Start display at page:

Download "An intro to particle methods for parameter inference in state-space models"

Transcription

1 An intro to particle methods for parameter inference in state-space models PhD course FMS020F NAMS002 Statistical inference for partially observed stochastic processes, Lund University Umberto Picchini Centre for Mathematical Sciences, Lund University

2 This lecture will explore possibilities offered by particle filters, also known as Sequential Monte Carlo (SMC) methods, when applied to parameter estimation problems. We will not give a thorough introduction to SMC methods. Instead, we will introduce the most basic (popular) SMC method with the main goal of performing inference for θ, a vector of unknown parameters entering a state space / Hidden Markov model. Results anticipation Thanks to recent and astonishing results, we show how it is possible to construct a practical method for exact Bayesian inference on the parameters of a state-space model.

3 Acknowledgements (early enough, not to forget...) Part of the presented material, slides, MATLAB codes etc. has been kindly shared by: Magnus Wiktorsson (Matematikcentrum, Lund); Fredrik Lindsten (Automatic Control, Linköping).

4 Working framework We consider hereafter the (important) case where data are modelled according to an Hidden Markov Model (HMM), often denoted state-space model (SSM). (In recent literature the terms SSM and HMM have been used interchangeably. We do the same here, though elsewhere it is sometimes assumed that HMMs discrete space and SSM continuous space) A state-space model refers to a class of probabilistic graphical models that describes the probabilistic dependence between latent state variables and the observed measurements of a dynamical system.

5 Our system of interest is {X t } t 0. This is a latent/unobservable stochastic process. (Unavailable) values of the system on a discrete time grid {0, 1,..., T} are X 0:T = {X 0,..., X T } (we consider integer times t for ease of notation). Actual observations (data) are noisy y 1:T = {y 1,..., y T } That is at time t each y t is a noisy/corrupted version of the true state of the system X t. {X t } t 0 is assumed a Markovian stochastic process. Observations y t are assumed conditionally independent given the corresponding latent state X t. We are going to formalize this in next slide.

6 SSMs as graphical models Graphically: Y t 1 Y t Y t+1 (Observations)... X t 1 X t X t+1... (Markov chain) Y t X t = x t p(y t x t ) X t+1 X t = x t p(x t+1 x t ) X 0 π(x 0 ) (Observation density) (Transition density) (Initial distribution)

7 Example: Gaussian random walk A most trivial example (linear, Gaussian SSM). Therefore x t = x t 1 + ξ t, ξ t iid N(0, q 2 ) y t = x t + e t, e t iid N(0, r 2 ) p(x t x t 1 ) = N(x t ; x t 1, q 2 ) p(y t x t ) = N(y t ; x t, r 2 ) Notation: here and in the following N(µ, σ 2 ) is the Gaussian distribution with mean µ and variance σ 2. N(x; µ, σ 2 ) is the evaluation at x of the pdf of a N(µ, σ 2 ) distribution.

8 Notation reminder Always remember X 0 is NOT the state for the first observational time, that one is X 1. Instead X 0 is the (typically unknown) system s initial state for process {X t }. X 0 can be set to be a deterministic constant (as in the example we are going to discuss soon).

9 Markov property of hidden states Briefly: X t (and actually the whole future X t+1, X t+2,...) given X t 1 is independent from anything that has happened before time t 1: p(x t x 0:t 1, y 1:t 1 ) = p(x t x t 1 ), t = 1,..., T Also, past is independent of the future given the present: p(x t 1 x t:t, y t:t ) = p(x t 1 x t )

10 Conditional independence of measurements The current measurement y t given x t is conditionally independent of the measurement and state histories: p(y t x 0:t, y 1:t 1 ) = p(y t x t ). The Markov property on {X t } and the conditional independence on {Y t } are the key ingredients for defining an HMM or SSM.

11 In general {X t } and {Y t } can be either continuous or discrete valued stochastic processes. However in the following we assume {X t } and {Y t } to be defined on continuous spaces.

12 Our goal In all previous slides we haven t made explicit reference to the presence of unknown constants that we wish to estimate. For example in the Gaussian random walk example the unknowns would be the variances (q 2, r 2 ) (and perhaps the initial state x 0, in some situation). In that case we would like to estimate θ = (q 2, r 2 ) using observations y 1:T. More in general... Main goal We introduce general methods for SSM producing inference for the vector of parameters θ. We will be particularly interested in Bayesian inference. Of course θ can contain all sorts of unknowns, not just variances.

13 A quick look into the final goal p(y 1:T θ) is the likelihood of the measurements conditionally on θ. π(θ) is the prior density of θ (we always assume continuous-valued parameters). It encloses knowledge about θ before we see our current data y 1:T. Bayes theorem gives the posterior distribution: π(θ y 1:T ) = p(y 1:T θ)π(θ) p(y 1:T ) p(y 1:T θ)π(θ) inference based on π(θ y 1:T ) is called Bayesian inference. p(y 1:T ) is the marginal likelihood (evidence), independent of θ. Goal: calculate (sample draws from) π(θ y 1:T ). Remark: θ is a random quantity in the Bayesian framework.

14 As you know... (remember we set some background requirements for taking this course) Bayesian inference can rarely be performed without using some Monte Carlo sampling. Except for simple cases (e.g. data y 1,..., y T independently distributed from some member of the exponential family) it is usually impossible to write the likelihood p(y 1:T θ) in closed form. Since early 90 s MCMC (Markov chain Monte Carlo) has opened the possibility to perform Bayesian inference in practice. Are we obsessed with Bayesian inference? NO! However it sometimes offers the easiest way to deal with complex, non-trivial models. Some good reads The Bayesian approach actually opens the possibility for a surprising result (see later on...).

15 The likelihood function for SSMs First of all, notice that in the Bayesian framework, since θ is random, we do not simply write the likelihood function as p θ (y 1:T ) nor p(y 1:T ; θ), but we must condition on θ and write p(y 1:T θ). In a SSM data are not independent, they are only conditionally independent complication!: p(y 1:T θ) = p(y 1 θ) T p(y t y 1:t 1, θ) =? We don t have a closed for expression for the product above because we do not know how to calculate p(y t y 1:t 1, θ). Let s see why. t=2

16 In a SSM the observed process is assumed to depend on the latent Markov process {X t }: we can write p(y 1:T θ) = p(y 1:T, x 0:T θ)dx 0:T = p(y 1:T x 0:T, θ) } {{ } use cond. indep. T T } = t=1 { p(y t x t, θ) p(x 0 θ) t=1 p(x t x t 1, θ) p(x 0:T θ) } {{ } dx 0:T use Markovianity dx 0:T Problems The expression above is a (T + 1)-dimensional integral For most (nontrivial) models, transition densities p(x t x t 1 ) are unknown

17 Special cases Despite the analytic difficulties, find approximations for the likelihood function is possible (we ll consider some approach soon). In some simple cases, closed form solutions do exist: for example when the SSM is linear and Gaussian (see the Gaussian random walk example) then the classic Kalman filter gives the exact likelihood. In the SSM literature important (Gaussian) approximations are given by the extended and unscented Kalman filters. However, approximations offered by particle filters (a.k.a. sequential Monte Carlo) are presently the state-of-art for general non-linear non-gaussian SSM.

18 What if we had p(y 1:T θ)... Ideally if we had p(y 1:T θ) we could use Metropolis-Hastings (MH) to sample from the posterior π(θ y 1:T ). At iteration r + 1 we have: 1 current value is θ r, propose a new θ q(θ θ r ), e.g. θ N(θ r, Σ θ ) for some covariance matrix Σ θ. 2 draw u U(0, 1) accept θ if u < min(1, A), where A = p(y 1:T θ ) p(y 1:T θ r ) π(θ ) π(θ r ) q(θ r θ ) q(θ θ r ) then set θ r+1 := θ and p(y 1:T θ r+1 ) := p(y 1:T θ ). Otherwise, reject θ, set θ r+1 := θ r. Set r := r + 1, go to 1 and repeat.

19 By repeating steps 1-2 as much as wanted we are guaranteed that, by discarding a long enough number of iterations (burnin), the remaining draws form a Markov chain (hence dependent values) having π(θ y 1:T ) as their stationary distribution. Therefore if you have produced R = R 1 + R 2 iterations of Metropolis-Hastings, where R 1 is a sufficiently long burnin, for scalar θ you can then plot the histogram of the last R 2 draws θ R1 +1,..., θ R R1. Such histogram gives the density π(θ y 1:T ), up to a Monte Carlo error induced by using a finite R 2. for a vector valued θ R p create p separate histograms of the draws pertaining each component of θ. Such histograms represent the posterior marginals π(θ j y), j = 1,..., p.

20 Brush up MH with a simple example A simple toy problem not related to state-space modelling. Data: n = 1, 000 observations i.i.d y j N(3, 1). Now assume µ = E(y j ) unknown. Estimate µ. Assume for example a wide prior: µ N(2, 2 2 ) (do not look at data!) Gaussian random walk to propose µ = µ r ξ, with ξ N(0, 1). Notice Gaussian proposals are symmetric, hence q(µ µ r ) = q(µ r µ ) therefore A = p(y 1:n µ ) p(y 1:n µ r ) π(µ ) π(µ r )

21 Start far at µ = 10. Produce 10,000 iterations posterior π(µ y) after burnin 9 12 posterior prior MCMC iterations True value µ = 3 The posterior on the right plot discards the initial 500 draws (burnin).

22 Remark Notice Metropolis-Hastings is not an optimization algorithm. Unlike in maximum likelihood estimation here we are not trying to converge towards some mode. What we want is to explore thoroughly (i.e. sample from) π(θ data), including its tails. This way we can directly assess the uncertainty about θ by looking at π(θ data), instead of having to resort to asymptotic arguments regarding the sampling distribution of ˆθ n when n as in ML-theory.

23 Example: a nonlinear SSM { xt = 0.5x t x t 1 (1+x 2 t 1 ) + 8 cos(1.2(t 1)) + v t, y t = 0.05x 2 t + e t, with a deterministic x 0 = 0, measurements at times t = 1, 2,..., 100. v t N(0, q), e t N(0, r) all independent. Data generated with q = 0.1 and r = X Y time

24 Priors: q InvGamma(0.01, 0.01), r InvGamma(0.01, 0.01) 0.4 Prior density InvGamma(0.01,0.01) Recall true values are q = 0.1 and r = 1. The chosen priors cover both.

25 Results anticipation... Perform an MCMC (how about the likelihood? we ll see this later). R = 10, 000 iterations (generous burnin = 3, 000 iterations). Proposals: q new N(q old, ), and similarly for r new. Start far away: initial values q init = 1, r init = Trace plot of p(q y 1:T ), acc. rate e+01 percent 20 Empirical PDF of p(q y 1:T ) after burnin Trace plot of p(r y 1:T ), acc. rate e+01 percent 3 Empirical PDF of p(r y 1:T ) after burnin

26 How do we deal with general SSM for which the exact likelihood p(y 1:T θ) is unavailable? That s the case of the previous example. For example, the previously analysed model is nonlinear in x t and therefore the exact Kalman filter can t be used.

27 General Monte Carlo integration Recall our likelihood function: p(y 1:T θ) = p(y 1 θ) T p(y t y 1:t 1, θ) We can write p(y t y 1:t 1, θ) as p(y t y 1:t 1, θ) = p(y t x t, θ)p(x t y 1:t 1, θ)dx t = E(p(y t x t, θ)) Use Monte Carlo integration: generate N draws from p(x t y 1:t 1, θ), then invoke the law of large numbers. produce N independent draws xt i p(x t y 1:t 1, θ), i = 1,..., N for each xt i compute p(y t xt, i θ) and by LLN we have 1 N p(y t x i N t, θ) E(p(y t x t, θ)), N i=1 error term is O(N 1/2 ) regardless the dimension of x t t=2

28 SMC and the bootstrap filter But how to generate good draws (particles) x i t p(x t y 1:t 1, θ)? Here good means that we want particles such that the values of p(y t x i t, θ) are not negligible ( explain a large fraction of the integrand). For SSM, sequential Monte Carlo (SMC) is the winning strategy. We will NOT give a thorough introduction to SMC methods. We only use a few notions to solve our parameter inference problem.

29 Importance sampling Let s remove for a moment the dependence on θ. p(y t y 1:t 1 ) = p(y t x 0:t )p(x 0:t y 1:t 1 )dx 0:t = p(y t x t )p(x t x t 1 )p(x 0:t 1 y 1:t 1 )dx 0:t p(yt x t )p(x t x t 1 )p(x 0:t 1 y 1:t 1 ) = h(x 0:t y 1:t )dx 0:t h(x 0:t y 1:t ) where h( ) is an arbitrary (positive) density function called importance density. Choose one easy to simulate from. 1 simulate N samples: x i 0:t h(x 0:t y 1:t ), i = 1,..., N 2 construct importance weights w i t = p(y t x i t)p(x i t x i t 1 )p(xi 0:t 1 y 1:t 1) h(x i 0:t y 1:t) 3 p(y t y 1:t 1 ) = E(p(y t x t )) 1 N N i=1 wi t

30 Importance Sampling Importance Sampling is an appealing and revolutionary idea for Monte Carlo integration. However generating at each time a cloud of particles x0:t i is not really computationally appealing, and it s not clear how to do so. We need something enabling a sort of sequential mechanism, as t increases.

31 Sequential Importance Sampling When h( ) is chosen in an intelligent way, an important property is the one that allows sequential update of weights. After some derivation on the board, we have (see p. 121 in Särkkä 1 and p. 5 in Cappe et al. 2 ) w i t p(y t x i t)p(x i t x i t 1 ) h(x i t x i 0:t 1, y 1:t) wi t 1. However, the dependence of w t on w t 1 is a curse (with remedy) as described in next slide. 1 Särkkä, available here. 2 Cappe, Godsill and Moulines. Available here.

32 The particle degeneracy problem Particle degeneracy occurs when at time t all but one of the importance weights w i t are close to zero. This implies a poor approximation to p(y t y 1:t 1 ). Notice that when a particle gets a zero weight (or a small positive weight that your computer sets to zero numerical underflow) that particle is doomed! Since w i t p(y t x i t)p(x i t x i t 1 ) h(x i t x i 0:t 1, y 1:t) wi t 1. if for a given i we have w i t 1 = 0 particle i will have zero weight for all subsequent times.

33 The particle degeneracy problem For example, suppose a new data point y t is encountered and this is an outlier. Then most particles will end up far away from y t and their weights will be 0. Those particles will die. Eventually, for a time horizon T which is long enough, all but one particle will have zero weight. This means that the methodology is very fragile and particle degeneracy has affected SMC methods for decades.

34 The Resampling idea A life saving solution is to use resampling with replacement (Gordon et al. 3 ). 1 interpret w i t as the probability to sample x i t from the weighted set {x i t, w i t, i = 1,..., N}, with w i t := w i t/ i wi t. 2 draw N times with replacement from the weighted set. Replace the old particles with the new ones { x 1 t,..., x N t }. 3 Reset weights w i t = 1/N (the resampling has destroyed the information on how we reached time t). Since resampling is done with replacement, a particle with a large weight is likely to be drawn multiple times. Particles with very small weights are not likely to be drawn at all. Nice! 3 Gordon, Salmond and Smith. IEEE Proceedings F. 140(2) 1993.

35 Sequential Importance Sampling Re-sampling (SISR) We are now in the position to introduce a method that samples from h( ), so that these samples can be used as if they were from p(x 0:t y 1:t 1 ) provided these are appropriately re-weighted. 1 Sample x 1 t,..., x N t h( ) (same as before) 2 compute weights w i t (same as before) 3 normalize weights w i t = w i t/ N i=1 wi t 4 We have a discrete probability distribution {x i t, w i t}, i = 1,..., N 5 Resample N times with replacement from the set {x 1 t,..., x N t } having associate probabilities { w 1 t,..., w N t }, to generate another sample { x 1 t,..., x N t }. 6 { x 1 t,..., x N t } is an approximate sample from p(x 0:t y 1:t 1 ). Sketch of Proof on page 111 in Wilkinson s book.

36 The next animation illustrates the concept of sequential importance sampling resampling with N = 5 particles. Light blue: observed trajectory (data) dark blue: forward simulation of the latent process {X t } pink balls: particles x i t green balls: selected particles x i t from resampling red curve: density p(y t x t )

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61 Bootstrap filter This is the method we will actually use to ease Bayesian inference. The bootstrap filter is a simple example of SISR. Choose h(x i t x i 0:t 1, y 1:t) := p(x i t x i t 1 ) Recall in SISR after performing resampling reset all weights w i t 1 = 1 N Thus w i t p(y t x i t)p(x i t x i t 1 ) h(x i t x i 0:t 1, y 1:t) wi t 1 = 1 N p(y t x i t) p(y t x i t), i = 1,..., N This is a SISR strategy therefore we still have that x i t p(x 0:t y 1:t 1 )

62 Bootstrap filter in detail 1 t = 0 (initialize) x0 i π(x 0), assign w i 0 = 1/N, i = 1,..., N 2 at the current t assume we have the weighted particles {xt, i w i t} 3 from the current sample, resample with replacement N times to obtain { x i t, i = 1,..., N}. 4 From your model propagate forward x i t+1 p(x t+1 x i t), i = 1,..., N. 5 Compute w i t+1 = p(y t+1 xt+1 i ) and normalise weights w i t+1 = wi t+1 / N i=1 wi t+1 6 set t := t + 1 and if t < T go to step 2.

63 So the bootstrap filter by et al. (1993) 4 easily provide what we need! ˆp(y t y 1:t 1 ) = 1 N N i=1 wi t Finally a likelihood approximation: ˆp(y 1:T ) = ˆp(y 1 ) Put back θ in the notation so to obtain: approximate maximum likelihood or T ˆp(y t y 1:t 1 ) t=2 θ mle = argmax θ ˆp(y 1:T ; θ) exact Bayesian inference by using ˆp(y 1:T θ) inside Metropolis-Hastings. why exact?? let s check it after the example... 4 Gordon, Salmond and Smith. IEEE Proceedings F. 140(2) 1993.

64 Back to the nonlinear SSM example We can now comment on how we obtained previously shown results (reproposed here). We used the bootstrap filter with N = 500 particles and R = 10, 000 MCMC iterations. forward propagation: x t+1 = 0.5x t + 25 (1+xt 2 ) + 8 cos(1.2t) + v t+1 w i t+1 = N(0.05(xi t+1 )2, r) x t 1.5 Trace plot of p(q y 1:T ), acc. rate e+01 percent 20 Empirical PDF of p(q y 1:T ) after burnin Trace plot of p(r y 1:T ), acc. rate e+01 percent 3 Empirical PDF of p(r y 1:T ) after burnin

65 Numerical issues with particle degeneracy When coding your algorithms try to consider the following before normalizing weights code unnormalised weights on log-scale: e.g. when w i t = N(y t x i t) the exp() in the Gaussian pdf will likely produce an underflow (w i t = 0) for x i t far from y t. Solution: reason in terms of logw instead of w. However afterwards we necessarily have to go back to w:=exp(logw) then normalize and the above might still not be enough. Solution: subtract the maximum (log)weight from each (log)weight, e.g. set logw:=logw-max(logw). This is totally ok, the importance of particles is unaffected, weights are only scaled by the same constant c=max(logw).

66 Safe (log)likelihood computation As usual, better compute the (approximate) loglikelihood instead of the likelihood: log ˆp(y 1:T θ) = T ( log N + log t=1 At time t set c t = max{log w i t, i = 1,..., N} and set w i t := exp(log w i t c t ) Once all the w i t are available compute log N i=1 wi t = c t + log N i=1 wi t. (the latter follows from w i t = w i t exp{c t } which you don t need to evaluate) N i=1 w i t )

67 Incredible result Quite astonishingly Andrieu and Roberts 5 proved that using an unbiased estimate of the likelihood function into the MCMC routine is sufficient to obtain exact Bayesian inference! That is using the acceptance ratio A = ˆp(y 1:T θ ) ˆp(y 1:T θ) π(θ ) π(θ) q(θ θ ) q(θ θ) will return a Markov chain with stationary distribution π(θ y 1:T ) regardless the finite number N of particles used to approximate the likelihood!. The good news is that E(ˆp(y 1:T θ)) = p(y 1:T θ) with ˆp(y 1:T θ) obtained via SMC. 5 Andrieu and Roberts (2009), Annals of Statistics, 37(2)

68 The previous result will be considered, in my opinion, one of the most important statistical results of the XXI century. In fact, it offers an exact-approximate approach, where because of computing limitations we can only produce N < particles, while still be reassured to obtain exact (Bayesian) inference under minor assumptions. But let s give a rapid (technically informal) look at why it works. Key result: unbiasedness (del Moral ) We have that E(ˆp(y 1:T θ)) = ˆp(y 1:T θ, ξ)p(ξ)dξ = p(y 1:T θ) with ξ p(ξ) vector of all random variates generated during SMC (both to propagate forward the state and to perform particles resampling). 6 Easier to look at Pitt, Silva, Giordani, Kohn. J. Econometrics 171, 2012.

69 To prove the exactness of the approach we look at the (easier and less general) argument in sec. 2.2 of Pitt, Silva, Giordani, Kohn. J. Econometrics 171, To simplify the notation take y := y 1:T. ˆπ(θ, ξ y) approximate joint posterior of (θ, ξ) obtained via SMC ˆπ(θ, ξ y) = ˆp(y θ, ξ)p(ξ)π(θ) p(y) (notice ξ and θ are assumed a-priori independent) Notice we put p(y) not ˆp(y) at the denominator: this follows from the unbiasedeness assumption as we obtain ˆp(y θ, ξ)p(ξ)π(θ)dξdθ = π(θ){ ˆp(y θ, ξ)p(ξ)dξ}dθ = π(θ)p(y θ)dθ = p(y).

70 The exact (unavailable) posterior of θ is π(θ y) = p(y θ)π(θ) p(y) therefore the marginal likelihood (evidence) is and p(y) = p(y θ)π(θ) π(θ y) ˆp(y θ, ξ)p(ξ)π(θ) ˆπ(θ, ξ y) = p(y) = π(θ y)ˆp(y θ, ξ)p(ξ) π(θ) p(y θ) π(θ)

71 Now, we know that applying an MCMC targeting ˆπ(θ, ξ y) then discarding the output pertaining to ξ corresponds to integrate-out ξ from the posterior ˆπ(θ, ξ y)dξ = π(θ y) ˆp(y θ, w)p(ξ)dξ = π(θ y) p(y θ) } {{ } E(ˆp(y θ))=p(y θ) We are thus performing a pseudo-marginal approach: marginal because we disregard ξ; pseudo because we use ˆp( ) not p( ). Therefore we proved that, using MCMC on an (artificially) augmented posterior, then discard from the output all the random variates created during SMC returns exact Bayesian inference. Notice that discarding the ξ is something that we naturally do in Metropolis-Hastings hence nothing strange is happening here. The ξ are just instrumental, uninteresting, variates independent of θ and independent of {X t }.

72 One more example Let s study the stochastic Ricker model. { y t Pois(φN t ) N t = r N t 1 e N t 1+e t, e t N(0, σ 2 ) N t is the size of a population at t r is the growth rate φ is a scale parameter e t environmental noise So this example shows a observational model for y t given by a discrete distribution. Classic Kalman based filtering can t accommodate discreteness.

73 We can use the R package pomp to fit this model. pomp supports a number of inference methods of interest to us: particle marginal methods iterated filtering approximate Bayesian computation (ABC) synthetic likelihoods...and more. Essentially a user who s interested in fitting a SSM can benefit from a well tested platform, and experiment with a number of possibilities. I have never used it used it until 4 days ago (!). But now I have a working example (also as an optional exercise!).

74 Here are 50 observations from the Ricker model at times (1, 2,..., 50) generated with (r, σ, φ) = (44.7, 0.3, 10) and starting size N 0 = 7 at t = 0. myobservedricker e N y time You can also notice the path of the (latent) process N t as well as the (not so interesting) realizations of e t.

75 r The pomp pmcmc function runs a pseudo-marginal MCMC. Assume we are only interested in (r, φ) and keep σ = 0.3 constant ( known ). Here we use 500 particles with 5,000 MCMC iterations. We start at (r, φ) = (7.4, 5). We assumed flat (improper) priors. PMCMC convergence diagnostics nfail log.prior loglik phi PMCMC iteration PMCMC iteration

76 We used a non-adaptive Gaussian random walk (variances are kept fixed). You might get better results with an adaptive version. A topic set as an (optional) exercise is to have a thought at why the method fails at estimating parameters of a nearly deterministic (smaller σ) stochastic Ricker model. Furthermore, for N t deterministic (σ = 0), where particle filters are not applicable (nor needed), exact likelihood calculation is also challenging. A great coverage of this issue is in Fasiolo et al. (2015) arxiv: comparing particle marginal methods, approximate Bayesian computation, iterated filtering and more. Very much recommended read.

77 Conclusions We have outlined a powerful methodology for exact Bayesian inference, regardless the number of particles N a too small N will have a negative impact on chain mixing many rejections, sticky chain. the methodology is perfectly suited for state-space modelling. the methodology is plug-and-play as besides knowing the observations density p(y t x t ) nothing else is required. in fact we only need the ability to forward simulate x t given x t 1 which can be performed numerically from the state model. the above implies that knowledge of the transition density expression p(x t x t 1 ) is not required. BINGO!

78 Software I m not very much updated on all available software for parameter inference via SMC (tend to write my code from scratch). Some possibilities are: LiBBi (C++ template library) Biips (C++ with interfaces to MATLAB/R) demo("pmcmc") in the R package smfsb. R package pomp accompanying MATLAB code for the book by S. Särkkä. Code available here. the MATLAB code for the presented example is available at the course webpage.

79 Important issues How to tune the number of particles? Doucet, Pitt, and Kohn. Efficient implementation of Markov chain Monte Carlo when using an unbiased likelihood estimator. arxiv: (2012). Pitt, dos Santos Silva, Giordani and Kohn. On some properties of Markov chain Monte Carlo simulation methods based on the particle filter. Journal of Econometrics 171, no. 2 (2012): Sherlock, Thiery, Roberts and Rosenthal. On the efficiency of pseudo-marginal random walk Metropolis algorithms. arxiv: (2013).

80 Important issues Comparison of resampling schemes. Douc, Cappe and Moulines. Comparison of resampling schemes for particle filtering. download.cgi?id=5755 Hol, Schön and Gustaffson. On resampling algorithms for particle filters, schon/publications/holsg2006.pdf

81 Appendix

82 Multinomial resampling This is the simplest to explain (not the most efficient though) resampling scheme. At time t we wish to sample from a population of weighted particles (x i t, w i t, i = 1,..., N). What we actually do is to sample N times particle indeces with replacement from the population (i, w i t, i = 1,..., N). This will be a sample of size N from a multinomial distribution. Pick at particle from the urn, the larger its probability w i t the more likely it will be picked. Record its index i and put it back in the urn. Repeat for a total of N times. To code the sampling procedure we just need to recall the inverse transform method. For a generic random variable X, let F X be an invertible cdf. We can sample an x from F X using x := FX 1 (u), with u U(0, 1).

83 For example let s start from a simple example of multinomial distribution, the Bernoulli distribution. X {0, 1} with p = P(X = 1), 1 p = P(X = 0). Then 0 x < 0 F X (x) = 1 p 0 x < 1 1 x 1 Draw the stair represented by the plot of F X. Generate a u U(0, 1) and hit the stair s steps. If 0 < u 1 p then set x := 0 and if u > 1 p set x := 1. For the multinomial case it is a simple generalization. Drop time t and set w i = p i. F X is a stair with N steps. Shoot a u U(0, 1) and return index i i := min { i i {1,..., N}; ( p i ) u 0 }. i=1 (1)

84 Cool reads on Bayesian methods (titles are linked) You have an engineeristic/signal processing background: check S. Särkkä Bayesian Filtering and Smoothing (free PDF from the author!) You are a data-scientist: check K. Murphy Machine Learning: a probabilistic perspective. You are a theoretical statistician: C. Robert The Bayesian Choice. You are interested in bioinformatics/systems biology: check D. Wilkinson Stochastic Modelling for Systems Biology, 2ed.. You are interested in inference for SDEs with applications to life sciences: check the book by Wilkinson above and C. Fuchs Inference for diffusion processes.

85 Cool reads on Bayesian methods (titles are linked) You are a computational statistician: check Handbook of MCMC. Older (but excellent) titles are: J. Liu Monte Carlo strategies in Scientific Computing and Casella-Robert Monte Carlo Statistical Methods. You want a practical hands-on and (almost) maths free introduction: check The BUGS Book and Doing Bayesian Data Analysis.

Exercises Tutorial at ICASSP 2016 Learning Nonlinear Dynamical Models Using Particle Filters

Exercises Tutorial at ICASSP 2016 Learning Nonlinear Dynamical Models Using Particle Filters Exercises Tutorial at ICASSP 216 Learning Nonlinear Dynamical Models Using Particle Filters Andreas Svensson, Johan Dahlin and Thomas B. Schön March 18, 216 Good luck! 1 [Bootstrap particle filter for

More information

An introduction to Sequential Monte Carlo

An introduction to Sequential Monte Carlo An introduction to Sequential Monte Carlo Thang Bui Jes Frellsen Department of Engineering University of Cambridge Research and Communication Club 6 February 2014 1 Sequential Monte Carlo (SMC) methods

More information

Sequential Monte Carlo Methods for Bayesian Computation

Sequential Monte Carlo Methods for Bayesian Computation Sequential Monte Carlo Methods for Bayesian Computation A. Doucet Kyoto Sept. 2012 A. Doucet (MLSS Sept. 2012) Sept. 2012 1 / 136 Motivating Example 1: Generic Bayesian Model Let X be a vector parameter

More information

A Review of Pseudo-Marginal Markov Chain Monte Carlo

A Review of Pseudo-Marginal Markov Chain Monte Carlo A Review of Pseudo-Marginal Markov Chain Monte Carlo Discussed by: Yizhe Zhang October 21, 2016 Outline 1 Overview 2 Paper review 3 experiment 4 conclusion Motivation & overview Notation: θ denotes the

More information

Controlled sequential Monte Carlo

Controlled sequential Monte Carlo Controlled sequential Monte Carlo Jeremy Heng, Department of Statistics, Harvard University Joint work with Adrian Bishop (UTS, CSIRO), George Deligiannidis & Arnaud Doucet (Oxford) Bayesian Computation

More information

An Brief Overview of Particle Filtering

An Brief Overview of Particle Filtering 1 An Brief Overview of Particle Filtering Adam M. Johansen a.m.johansen@warwick.ac.uk www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/talks/ May 11th, 2010 Warwick University Centre for Systems

More information

Lecture 8: Bayesian Estimation of Parameters in State Space Models

Lecture 8: Bayesian Estimation of Parameters in State Space Models in State Space Models March 30, 2016 Contents 1 Bayesian estimation of parameters in state space models 2 Computational methods for parameter estimation 3 Practical parameter estimation in state space

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods Tomas McKelvey and Lennart Svensson Signal Processing Group Department of Signals and Systems Chalmers University of Technology, Sweden November 26, 2012 Today s learning

More information

MCMC algorithms for fitting Bayesian models

MCMC algorithms for fitting Bayesian models MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models

More information

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond Lecture 2: From Linear Regression to Kalman Filter and Beyond January 18, 2017 Contents 1 Batch and Recursive Estimation 2 Towards Bayesian Filtering 3 Kalman Filter and Bayesian Filtering and Smoothing

More information

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation

More information

Advanced Computational Methods in Statistics: Lecture 5 Sequential Monte Carlo/Particle Filtering

Advanced Computational Methods in Statistics: Lecture 5 Sequential Monte Carlo/Particle Filtering Advanced Computational Methods in Statistics: Lecture 5 Sequential Monte Carlo/Particle Filtering Axel Gandy Department of Mathematics Imperial College London http://www2.imperial.ac.uk/~agandy London

More information

MCMC for big data. Geir Storvik. BigInsight lunch - May Geir Storvik MCMC for big data BigInsight lunch - May / 17

MCMC for big data. Geir Storvik. BigInsight lunch - May Geir Storvik MCMC for big data BigInsight lunch - May / 17 MCMC for big data Geir Storvik BigInsight lunch - May 2 2018 Geir Storvik MCMC for big data BigInsight lunch - May 2 2018 1 / 17 Outline Why ordinary MCMC is not scalable Different approaches for making

More information

L09. PARTICLE FILTERING. NA568 Mobile Robotics: Methods & Algorithms

L09. PARTICLE FILTERING. NA568 Mobile Robotics: Methods & Algorithms L09. PARTICLE FILTERING NA568 Mobile Robotics: Methods & Algorithms Particle Filters Different approach to state estimation Instead of parametric description of state (and uncertainty), use a set of state

More information

Inferring biological dynamics Iterated filtering (IF)

Inferring biological dynamics Iterated filtering (IF) Inferring biological dynamics 101 3. Iterated filtering (IF) IF originated in 2006 [6]. For plug-and-play likelihood-based inference on POMP models, there are not many alternatives. Directly estimating

More information

List of projects. FMS020F NAMS002 Statistical inference for partially observed stochastic processes, 2016

List of projects. FMS020F NAMS002 Statistical inference for partially observed stochastic processes, 2016 List of projects FMS020F NAMS002 Statistical inference for partially observed stochastic processes, 206 Work in groups of two (if this is absolutely not possible for some reason, please let the lecturers

More information

17 : Markov Chain Monte Carlo

17 : Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models, Spring 2015 17 : Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Heran Lin, Bin Deng, Yun Huang 1 Review of Monte Carlo Methods 1.1 Overview Monte Carlo

More information

Learning Static Parameters in Stochastic Processes

Learning Static Parameters in Stochastic Processes Learning Static Parameters in Stochastic Processes Bharath Ramsundar December 14, 2012 1 Introduction Consider a Markovian stochastic process X T evolving (perhaps nonlinearly) over time variable T. We

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016

Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An

More information

Pseudo-marginal Metropolis-Hastings: a simple explanation and (partial) review of theory

Pseudo-marginal Metropolis-Hastings: a simple explanation and (partial) review of theory Pseudo-arginal Metropolis-Hastings: a siple explanation and (partial) review of theory Chris Sherlock Motivation Iagine a stochastic process V which arises fro soe distribution with density p(v θ ). Iagine

More information

Notes on pseudo-marginal methods, variational Bayes and ABC

Notes on pseudo-marginal methods, variational Bayes and ABC Notes on pseudo-marginal methods, variational Bayes and ABC Christian Andersson Naesseth October 3, 2016 The Pseudo-Marginal Framework Assume we are interested in sampling from the posterior distribution

More information

Sequential Monte Carlo methods for system identification

Sequential Monte Carlo methods for system identification Technical report arxiv:1503.06058v3 [stat.co] 10 Mar 2016 Sequential Monte Carlo methods for system identification Thomas B. Schön, Fredrik Lindsten, Johan Dahlin, Johan Wågberg, Christian A. Naesseth,

More information

Adaptive Monte Carlo methods

Adaptive Monte Carlo methods Adaptive Monte Carlo methods Jean-Michel Marin Projet Select, INRIA Futurs, Université Paris-Sud joint with Randal Douc (École Polytechnique), Arnaud Guillin (Université de Marseille) and Christian Robert

More information

Robert Collins CSE586, PSU Intro to Sampling Methods

Robert Collins CSE586, PSU Intro to Sampling Methods Robert Collins Intro to Sampling Methods CSE586 Computer Vision II Penn State Univ Robert Collins A Brief Overview of Sampling Monte Carlo Integration Sampling and Expected Values Inverse Transform Sampling

More information

Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo

Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo Probabilistic Graphical Models Lecture 17: Markov chain Monte Carlo Andrew Gordon Wilson www.cs.cmu.edu/~andrewgw Carnegie Mellon University March 18, 2015 1 / 45 Resources and Attribution Image credits,

More information

Approximate Bayesian Computation and Particle Filters

Approximate Bayesian Computation and Particle Filters Approximate Bayesian Computation and Particle Filters Dennis Prangle Reading University 5th February 2014 Introduction Talk is mostly a literature review A few comments on my own ongoing research See Jasra

More information

Sequential Monte Carlo Methods (for DSGE Models)

Sequential Monte Carlo Methods (for DSGE Models) Sequential Monte Carlo Methods (for DSGE Models) Frank Schorfheide University of Pennsylvania, PIER, CEPR, and NBER October 23, 2017 Some References These lectures use material from our joint work: Tempered

More information

A Note on Auxiliary Particle Filters

A Note on Auxiliary Particle Filters A Note on Auxiliary Particle Filters Adam M. Johansen a,, Arnaud Doucet b a Department of Mathematics, University of Bristol, UK b Departments of Statistics & Computer Science, University of British Columbia,

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

State-Space Methods for Inferring Spike Trains from Calcium Imaging

State-Space Methods for Inferring Spike Trains from Calcium Imaging State-Space Methods for Inferring Spike Trains from Calcium Imaging Joshua Vogelstein Johns Hopkins April 23, 2009 Joshua Vogelstein (Johns Hopkins) State-Space Calcium Imaging April 23, 2009 1 / 78 Outline

More information

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling 10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel

More information

Computer Intensive Methods in Mathematical Statistics

Computer Intensive Methods in Mathematical Statistics Computer Intensive Methods in Mathematical Statistics Department of mathematics johawes@kth.se Lecture 16 Advanced topics in computational statistics 18 May 2017 Computer Intensive Methods (1) Plan of

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 25: Markov Chain Monte Carlo (MCMC) Course Review and Advanced Topics Many figures courtesy Kevin

More information

Computer Intensive Methods in Mathematical Statistics

Computer Intensive Methods in Mathematical Statistics Computer Intensive Methods in Mathematical Statistics Department of mathematics johawes@kth.se Lecture 7 Sequential Monte Carlo methods III 7 April 2017 Computer Intensive Methods (1) Plan of today s lecture

More information

Particle Filtering for Data-Driven Simulation and Optimization

Particle Filtering for Data-Driven Simulation and Optimization Particle Filtering for Data-Driven Simulation and Optimization John R. Birge The University of Chicago Booth School of Business Includes joint work with Nicholas Polson. JRBirge INFORMS Phoenix, October

More information

An Introduction to Particle Filtering

An Introduction to Particle Filtering An Introduction to Particle Filtering Author: Lisa Turner Supervisor: Dr. Christopher Sherlock 10th May 2013 Abstract This report introduces the ideas behind particle filters, looking at the Kalman filter

More information

Sampling Methods (11/30/04)

Sampling Methods (11/30/04) CS281A/Stat241A: Statistical Learning Theory Sampling Methods (11/30/04) Lecturer: Michael I. Jordan Scribe: Jaspal S. Sandhu 1 Gibbs Sampling Figure 1: Undirected and directed graphs, respectively, with

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

2D Image Processing (Extended) Kalman and particle filter

2D Image Processing (Extended) Kalman and particle filter 2D Image Processing (Extended) Kalman and particle filter Prof. Didier Stricker Dr. Gabriele Bleser Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz

More information

Graphical Models and Kernel Methods

Graphical Models and Kernel Methods Graphical Models and Kernel Methods Jerry Zhu Department of Computer Sciences University of Wisconsin Madison, USA MLSS June 17, 2014 1 / 123 Outline Graphical Models Probabilistic Inference Directed vs.

More information

Bayesian Inference and MCMC

Bayesian Inference and MCMC Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the

More information

Approximate Bayesian Computation: a simulation based approach to inference

Approximate Bayesian Computation: a simulation based approach to inference Approximate Bayesian Computation: a simulation based approach to inference Richard Wilkinson Simon Tavaré 2 Department of Probability and Statistics University of Sheffield 2 Department of Applied Mathematics

More information

Inference in state-space models with multiple paths from conditional SMC

Inference in state-space models with multiple paths from conditional SMC Inference in state-space models with multiple paths from conditional SMC Sinan Yıldırım (Sabancı) joint work with Christophe Andrieu (Bristol), Arnaud Doucet (Oxford) and Nicolas Chopin (ENSAE) September

More information

Bayesian Monte Carlo Filtering for Stochastic Volatility Models

Bayesian Monte Carlo Filtering for Stochastic Volatility Models Bayesian Monte Carlo Filtering for Stochastic Volatility Models Roberto Casarin CEREMADE University Paris IX (Dauphine) and Dept. of Economics University Ca Foscari, Venice Abstract Modelling of the financial

More information

Monte Carlo Inference Methods

Monte Carlo Inference Methods Monte Carlo Inference Methods Iain Murray University of Edinburgh http://iainmurray.net Monte Carlo and Insomnia Enrico Fermi (1901 1954) took great delight in astonishing his colleagues with his remarkably

More information

Auxiliary Particle Methods

Auxiliary Particle Methods Auxiliary Particle Methods Perspectives & Applications Adam M. Johansen 1 adam.johansen@bristol.ac.uk Oxford University Man Institute 29th May 2008 1 Collaborators include: Arnaud Doucet, Nick Whiteley

More information

The Hierarchical Particle Filter

The Hierarchical Particle Filter and Arnaud Doucet http://go.warwick.ac.uk/amjohansen/talks MCMSki V Lenzerheide 7th January 2016 Context & Outline Filtering in State-Space Models: SIR Particle Filters [GSS93] Block-Sampling Particle

More information

Answers and expectations

Answers and expectations Answers and expectations For a function f(x) and distribution P(x), the expectation of f with respect to P is The expectation is the average of f, when x is drawn from the probability distribution P E

More information

Markov chain Monte Carlo methods in atmospheric remote sensing

Markov chain Monte Carlo methods in atmospheric remote sensing 1 / 45 Markov chain Monte Carlo methods in atmospheric remote sensing Johanna Tamminen johanna.tamminen@fmi.fi ESA Summer School on Earth System Monitoring and Modeling July 3 Aug 11, 212, Frascati July,

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.

More information

16 : Approximate Inference: Markov Chain Monte Carlo

16 : Approximate Inference: Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models 10-708, Spring 2017 16 : Approximate Inference: Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Yuan Yang, Chao-Ming Yen 1 Introduction As the target distribution

More information

ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering

ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering Lecturer: Nikolay Atanasov: natanasov@ucsd.edu Teaching Assistants: Siwei Guo: s9guo@eng.ucsd.edu Anwesan Pal:

More information

Markov Chain Monte Carlo Methods for Stochastic Optimization

Markov Chain Monte Carlo Methods for Stochastic Optimization Markov Chain Monte Carlo Methods for Stochastic Optimization John R. Birge The University of Chicago Booth School of Business Joint work with Nicholas Polson, Chicago Booth. JRBirge U of Toronto, MIE,

More information

Particle Filtering Approaches for Dynamic Stochastic Optimization

Particle Filtering Approaches for Dynamic Stochastic Optimization Particle Filtering Approaches for Dynamic Stochastic Optimization John R. Birge The University of Chicago Booth School of Business Joint work with Nicholas Polson, Chicago Booth. JRBirge I-Sim Workshop,

More information

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007 Bayesian inference Fredrik Ronquist and Peter Beerli October 3, 2007 1 Introduction The last few decades has seen a growing interest in Bayesian inference, an alternative approach to statistical inference.

More information

SMC 2 : an efficient algorithm for sequential analysis of state-space models

SMC 2 : an efficient algorithm for sequential analysis of state-space models SMC 2 : an efficient algorithm for sequential analysis of state-space models N. CHOPIN 1, P.E. JACOB 2, & O. PAPASPILIOPOULOS 3 1 ENSAE-CREST 2 CREST & Université Paris Dauphine, 3 Universitat Pompeu Fabra

More information

Reminder of some Markov Chain properties:

Reminder of some Markov Chain properties: Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent

More information

Robert Collins CSE586, PSU Intro to Sampling Methods

Robert Collins CSE586, PSU Intro to Sampling Methods Intro to Sampling Methods CSE586 Computer Vision II Penn State Univ Topics to be Covered Monte Carlo Integration Sampling and Expected Values Inverse Transform Sampling (CDF) Ancestral Sampling Rejection

More information

Lecture 6: Bayesian Inference in SDE Models

Lecture 6: Bayesian Inference in SDE Models Lecture 6: Bayesian Inference in SDE Models Bayesian Filtering and Smoothing Point of View Simo Särkkä Aalto University Simo Särkkä (Aalto) Lecture 6: Bayesian Inference in SDEs 1 / 45 Contents 1 SDEs

More information

Robert Collins CSE586, PSU Intro to Sampling Methods

Robert Collins CSE586, PSU Intro to Sampling Methods Intro to Sampling Methods CSE586 Computer Vision II Penn State Univ Topics to be Covered Monte Carlo Integration Sampling and Expected Values Inverse Transform Sampling (CDF) Ancestral Sampling Rejection

More information

An efficient stochastic approximation EM algorithm using conditional particle filters

An efficient stochastic approximation EM algorithm using conditional particle filters An efficient stochastic approximation EM algorithm using conditional particle filters Fredrik Lindsten Linköping University Post Print N.B.: When citing this work, cite the original article. Original Publication:

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

Kernel adaptive Sequential Monte Carlo

Kernel adaptive Sequential Monte Carlo Kernel adaptive Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) December 7, 2015 1 / 36 Section 1 Outline

More information

Lecture 6: Markov Chain Monte Carlo

Lecture 6: Markov Chain Monte Carlo Lecture 6: Markov Chain Monte Carlo D. Jason Koskinen koskinen@nbi.ku.dk Photo by Howard Jackman University of Copenhagen Advanced Methods in Applied Statistics Feb - Apr 2016 Niels Bohr Institute 2 Outline

More information

Dynamic System Identification using HDMR-Bayesian Technique

Dynamic System Identification using HDMR-Bayesian Technique Dynamic System Identification using HDMR-Bayesian Technique *Shereena O A 1) and Dr. B N Rao 2) 1), 2) Department of Civil Engineering, IIT Madras, Chennai 600036, Tamil Nadu, India 1) ce14d020@smail.iitm.ac.in

More information

Robotics. Mobile Robotics. Marc Toussaint U Stuttgart

Robotics. Mobile Robotics. Marc Toussaint U Stuttgart Robotics Mobile Robotics State estimation, Bayes filter, odometry, particle filter, Kalman filter, SLAM, joint Bayes filter, EKF SLAM, particle SLAM, graph-based SLAM Marc Toussaint U Stuttgart DARPA Grand

More information

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling

CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling CS242: Probabilistic Graphical Models Lecture 7B: Markov Chain Monte Carlo & Gibbs Sampling Professor Erik Sudderth Brown University Computer Science October 27, 2016 Some figures and materials courtesy

More information

Particle Filters. Pieter Abbeel UC Berkeley EECS. Many slides adapted from Thrun, Burgard and Fox, Probabilistic Robotics

Particle Filters. Pieter Abbeel UC Berkeley EECS. Many slides adapted from Thrun, Burgard and Fox, Probabilistic Robotics Particle Filters Pieter Abbeel UC Berkeley EECS Many slides adapted from Thrun, Burgard and Fox, Probabilistic Robotics Motivation For continuous spaces: often no analytical formulas for Bayes filter updates

More information

Learning of state-space models with highly informative observations: a tempered Sequential Monte Carlo solution

Learning of state-space models with highly informative observations: a tempered Sequential Monte Carlo solution Learning of state-space models with highly informative observations: a tempered Sequential Monte Carlo solution Andreas Svensson, Thomas B. Schön, and Fredrik Lindsten Department of Information Technology,

More information

Sequential Monte Carlo Samplers for Applications in High Dimensions

Sequential Monte Carlo Samplers for Applications in High Dimensions Sequential Monte Carlo Samplers for Applications in High Dimensions Alexandros Beskos National University of Singapore KAUST, 26th February 2014 Joint work with: Dan Crisan, Ajay Jasra, Nik Kantas, Alex

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Preliminaries. Probabilities. Maximum Likelihood. Bayesian

More information

Markov Chain Monte Carlo Methods for Stochastic

Markov Chain Monte Carlo Methods for Stochastic Markov Chain Monte Carlo Methods for Stochastic Optimization i John R. Birge The University of Chicago Booth School of Business Joint work with Nicholas Polson, Chicago Booth. JRBirge U Florida, Nov 2013

More information

Pseudo-marginal MCMC methods for inference in latent variable models

Pseudo-marginal MCMC methods for inference in latent variable models Pseudo-marginal MCMC methods for inference in latent variable models Arnaud Doucet Department of Statistics, Oxford University Joint work with George Deligiannidis (Oxford) & Mike Pitt (Kings) MCQMC, 19/08/2016

More information

Stat 451 Lecture Notes Markov Chain Monte Carlo. Ryan Martin UIC

Stat 451 Lecture Notes Markov Chain Monte Carlo. Ryan Martin UIC Stat 451 Lecture Notes 07 12 Markov Chain Monte Carlo Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapters 8 9 in Givens & Hoeting, Chapters 25 27 in Lange 2 Updated: April 4, 2016 1 / 42 Outline

More information

Lecture : Probabilistic Machine Learning

Lecture : Probabilistic Machine Learning Lecture : Probabilistic Machine Learning Riashat Islam Reasoning and Learning Lab McGill University September 11, 2018 ML : Many Methods with Many Links Modelling Views of Machine Learning Machine Learning

More information

Density Estimation. Seungjin Choi

Density Estimation. Seungjin Choi Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence

Bayesian Inference in GLMs. Frequentists typically base inferences on MLEs, asymptotic confidence Bayesian Inference in GLMs Frequentists typically base inferences on MLEs, asymptotic confidence limits, and log-likelihood ratio tests Bayesians base inferences on the posterior distribution of the unknowns

More information

Computer Intensive Methods in Mathematical Statistics

Computer Intensive Methods in Mathematical Statistics Computer Intensive Methods in Mathematical Statistics Department of mathematics johawes@kth.se Lecture 5 Sequential Monte Carlo methods I 31 March 2017 Computer Intensive Methods (1) Plan of today s lecture

More information

Monte Carlo Approximation of Monte Carlo Filters

Monte Carlo Approximation of Monte Carlo Filters Monte Carlo Approximation of Monte Carlo Filters Adam M. Johansen et al. Collaborators Include: Arnaud Doucet, Axel Finke, Anthony Lee, Nick Whiteley 7th January 2014 Context & Outline Filtering in State-Space

More information

Tutorial on ABC Algorithms

Tutorial on ABC Algorithms Tutorial on ABC Algorithms Dr Chris Drovandi Queensland University of Technology, Australia c.drovandi@qut.edu.au July 3, 2014 Notation Model parameter θ with prior π(θ) Likelihood is f(ý θ) with observed

More information

An introduction to particle filters

An introduction to particle filters An introduction to particle filters Andreas Svensson Department of Information Technology Uppsala University June 10, 2014 June 10, 2014, 1 / 16 Andreas Svensson - An introduction to particle filters Outline

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

MCMC and Gibbs Sampling. Kayhan Batmanghelich

MCMC and Gibbs Sampling. Kayhan Batmanghelich MCMC and Gibbs Sampling Kayhan Batmanghelich 1 Approaches to inference l Exact inference algorithms l l l The elimination algorithm Message-passing algorithm (sum-product, belief propagation) The junction

More information

STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01

STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01 STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01 Nasser Sadeghkhani a.sadeghkhani@queensu.ca There are two main schools to statistical inference: 1-frequentist

More information

Modeling and state estimation Examples State estimation Probabilities Bayes filter Particle filter. Modeling. CSC752 Autonomous Robotic Systems

Modeling and state estimation Examples State estimation Probabilities Bayes filter Particle filter. Modeling. CSC752 Autonomous Robotic Systems Modeling CSC752 Autonomous Robotic Systems Ubbo Visser Department of Computer Science University of Miami February 21, 2017 Outline 1 Modeling and state estimation 2 Examples 3 State estimation 4 Probabilities

More information

Introduction. log p θ (y k y 1:k 1 ), k=1

Introduction. log p θ (y k y 1:k 1 ), k=1 ESAIM: PROCEEDINGS, September 2007, Vol.19, 115-120 Christophe Andrieu & Dan Crisan, Editors DOI: 10.1051/proc:071915 PARTICLE FILTER-BASED APPROXIMATE MAXIMUM LIKELIHOOD INFERENCE ASYMPTOTICS IN STATE-SPACE

More information

Probabilistic Machine Learning

Probabilistic Machine Learning Probabilistic Machine Learning Bayesian Nets, MCMC, and more Marek Petrik 4/18/2017 Based on: P. Murphy, K. (2012). Machine Learning: A Probabilistic Perspective. Chapter 10. Conditional Independence Independent

More information

an introduction to bayesian inference

an introduction to bayesian inference with an application to network analysis http://jakehofman.com january 13, 2010 motivation would like models that: provide predictive and explanatory power are complex enough to describe observed phenomena

More information

Introduction to Artificial Intelligence (AI)

Introduction to Artificial Intelligence (AI) Introduction to Artificial Intelligence (AI) Computer Science cpsc502, Lecture 9 Oct, 11, 2011 Slide credit Approx. Inference : S. Thrun, P, Norvig, D. Klein CPSC 502, Lecture 9 Slide 1 Today Oct 11 Bayesian

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

Introduction to Particle Filters for Data Assimilation

Introduction to Particle Filters for Data Assimilation Introduction to Particle Filters for Data Assimilation Mike Dowd Dept of Mathematics & Statistics (and Dept of Oceanography Dalhousie University, Halifax, Canada STATMOS Summer School in Data Assimila5on,

More information

Data-driven Particle Filters for Particle Markov Chain Monte Carlo

Data-driven Particle Filters for Particle Markov Chain Monte Carlo Data-driven Particle Filters for Particle Markov Chain Monte Carlo Patrick Leung, Catherine S. Forbes, Gael M. Martin and Brendan McCabe September 13, 2016 Abstract This paper proposes new automated proposal

More information

Markov chain Monte Carlo Lecture 9

Markov chain Monte Carlo Lecture 9 Markov chain Monte Carlo Lecture 9 David Sontag New York University Slides adapted from Eric Xing and Qirong Ho (CMU) Limitations of Monte Carlo Direct (unconditional) sampling Hard to get rare events

More information

19 : Slice Sampling and HMC

19 : Slice Sampling and HMC 10-708: Probabilistic Graphical Models 10-708, Spring 2018 19 : Slice Sampling and HMC Lecturer: Kayhan Batmanghelich Scribes: Boxiang Lyu 1 MCMC (Auxiliary Variables Methods) In inference, we are often

More information

Lecture 7 and 8: Markov Chain Monte Carlo

Lecture 7 and 8: Markov Chain Monte Carlo Lecture 7 and 8: Markov Chain Monte Carlo 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering University of Cambridge http://mlg.eng.cam.ac.uk/teaching/4f13/ Ghahramani

More information

A new iterated filtering algorithm

A new iterated filtering algorithm A new iterated filtering algorithm Edward Ionides University of Michigan, Ann Arbor ionides@umich.edu Statistics and Nonlinear Dynamics in Biology and Medicine Thursday July 31, 2014 Overview 1 Introduction

More information

The Metropolis-Hastings Algorithm. June 8, 2012

The Metropolis-Hastings Algorithm. June 8, 2012 The Metropolis-Hastings Algorithm June 8, 22 The Plan. Understand what a simulated distribution is 2. Understand why the Metropolis-Hastings algorithm works 3. Learn how to apply the Metropolis-Hastings

More information