HMMs with variable dimension structures and extensions


Slide 1: HMMs with variable dimension structures and extensions
Christian P. Robert, Université Paris Dauphine (xian)
[HMM days, ENST, January 21]

Slide 2: Estimating [not testing] the number of components/states

Slide 3: In the mixture model
$$y_i \sim f(y) = \sum_{j=1}^{k} p_j\, f(y \mid \theta_j), \qquad i = 1, \dots, n,$$
what if $k$ is unknown?!
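
To make the notation concrete, here is a minimal NumPy/SciPy sketch of evaluating this mixture log-likelihood; the data and parameter values are illustrative, not taken from the talk:

```python
import numpy as np
from scipy.stats import norm

def mixture_loglik(y, p, mu, sigma):
    """Log-likelihood of y under the mixture sum_j p_j N(mu_j, sigma_j^2)."""
    # (n, k) matrix of component densities at each observation
    dens = norm.pdf(y[:, None], loc=mu[None, :], scale=sigma[None, :])
    return np.sum(np.log(dens @ p))

# illustrative example with k = 2 components
y = np.random.default_rng(0).normal(size=100)
print(mixture_loglik(y, p=np.array([0.3, 0.7]),
                     mu=np.array([-1.0, 1.0]),
                     sigma=np.array([1.0, 0.5])))
```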

Slide 4: Meaning of the question
- weak identifiability of mixtures
- unsolvable philosophical problem unless $k$ has a proper intrinsic meaning [and even so...]
- hence testing per se is impossible: the data cannot distinguish between $k$ and $k + h$ [unless guided by a firm hand!]
- the choice of a prior $\pi(k)$ on $k$ is thus necessary to translate the degree of detail required [equivalent to penalizing factors in likelihood analysis]

Slide 5: Multiplicity of technical solutions
- Saturated models with $n$ mostly empty components
- Reversible jump MCMC techniques for exploration of most models [Green (1995); Richardson & Green (1997)]
- Birth-and-death and other jump processes [Preston (1976); Ripley (1977); Stephens (1999, 2000)]

Slide 6: Principles of RJMCMC
Births and deaths are proposed with probabilities $\beta_k$ and $\delta_k$, respectively, when the chain has $k$ components. The acceptance probability for a birth move is $\min(A, 1)$, with
$$A = \text{likelihood ratio} \times \frac{\delta_{k+1}}{\beta_k} \times \frac{(1-w)^{k-1}}{b(w, \varphi)}.$$
The acceptance probability for a death move is $\min(A^{-1}, 1)$.
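
A hedged sketch of how this birth move could be coded; loglik, b_sample and b_logpdf are model-specific placeholders, and the prior ratio $\pi(k+1)/\pi(k)$ is omitted (it would enter logA additively in the same way):

```python
import numpy as np

rng = np.random.default_rng(1)

def birth_move(weights, phis, loglik, b_sample, b_logpdf, beta_k, delta_k1):
    """Propose the birth of one component; return the (possibly unchanged) state."""
    k = len(weights)
    w_new, phi_new = b_sample()                     # draw (w, phi) from b
    # renormalize old weights by (1 - w_new) and append the new component
    weights_prop = np.append(weights * (1 - w_new), w_new)
    phis_prop = np.append(phis, phi_new)
    # log A = log-likelihood ratio + log(delta_{k+1}/beta_k)
    #         + (k - 1) log(1 - w_new) - log b(w_new, phi_new)
    logA = (loglik(weights_prop, phis_prop) - loglik(weights, phis)
            + np.log(delta_k1 / beta_k)
            + (k - 1) * np.log(1 - w_new) - b_logpdf(w_new, phi_new))
    if np.log(rng.uniform()) < logA:                # accept with prob min(A, 1)
        return weights_prop, phis_prop
    return weights, phis
```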

Slide 7: Principles of BDMCMC
New components are born according to a Poisson process with rate $\lambda_k$ when the chain has $k$ components. Each component $(w, \varphi)$ dies with rate
$$d(w, \varphi) = \text{likelihood ratio}^{-1} \times \frac{\lambda_k}{k+1} \times \frac{b(w, \varphi)}{(1-w)^{k-1}}.$$
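
Since this is a continuous-time jump process, it can be simulated with exponential holding times: the next event occurs at total rate $\lambda_k + \sum_i d(w_i, \varphi_i)$ and is a birth or a particular death in proportion to the rates. A sketch under that reading; death_rate and sample_new are model-specific placeholders:

```python
import numpy as np

rng = np.random.default_rng(2)

def bd_step(components, lam_k, death_rate, sample_new, t):
    """One jump of the birth-death process; components is a list of (w, phi)."""
    d = np.array([death_rate(w, phi, components) for (w, phi) in components])
    total = lam_k + d.sum()
    t += rng.exponential(1.0 / total)          # holding time before the jump
    if rng.uniform() < lam_k / total:          # birth, with prob lam_k / total
        components = components + [sample_new()]
    else:                                      # otherwise a death, chosen prop. to d_i
        i = rng.choice(len(components), p=d / d.sum())
        components = components[:i] + components[i + 1:]
    return components, t
```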

Slide 8: More general moves
Local balance for Markov jump processes in general:
$$\pi(\theta)\, q(\theta, \theta') = \pi(\theta')\, q(\theta', \theta)$$
For birth-death, split-merge moves, etc.:
$$\pi(k)\,\pi(\theta_k \mid k)\, L(\theta_k)\, \lambda_k\, b(u_k)\, J^{-1} = \pi(k+1)\,\pi(\theta_{k+1} \mid k+1)\, L(\theta_{k+1})\, d(\theta_{k+1}, \theta_k)$$

Slide 9: Comparison of mixing properties
RJMCMC works poorly if
$$A = \text{likelihood ratio} \times \frac{\delta_{k+1}}{\beta_k} \times \frac{(1-w)^{k-1}}{b(w, \varphi)}$$
is small. But if $A$ is small, then
$$d(w, \varphi) = \text{likelihood ratio}^{-1}\, \frac{\lambda_k}{k+1}\, \frac{b(w, \varphi)}{(1-w)^{k-1}} = \frac{N}{k+1}\, A^{-1}$$
is large (under the rescaling $\beta_k = \lambda_k/N$, $\delta_{k+1} \approx 1$ of the next slide), and BDMCMC works poorly.

Slide 10: RJMCMC to BDMCMC: rescaling time
In discrete-time RJMCMC, let the time unit be $1/N$ and put $\beta_k = \lambda_k/N$ and $\delta_k = 1 - \lambda_k/N$. As $N \to \infty$, each birth proposal will be accepted and, when the chain has $k$ components, births occur according to a Poisson process with rate $\lambda_k$.

Slide 11: As $N \to \infty$, a component $(w, \varphi)$ dies with rate
$$\lim_{N\to\infty} N\,\delta_{k+1}\,\frac{1}{k+1}\,\min(A^{-1}, 1)
= \lim_{N\to\infty} \frac{N \beta_k}{k+1}\, \text{likelihood ratio}^{-1}\, \frac{b(w,\varphi)}{(1-w)^{k-1}}
= \text{likelihood ratio}^{-1}\, \frac{\lambda_k}{k+1}\, \frac{b(w,\varphi)}{(1-w)^{k-1}}.$$
Hence RJMCMC converges to BDMCMC. This holds more generally.

Slide 12: Inference with varying $k$
- Little difference from the fixed-$k$ setting
- Inference must be conditional on $k$
- General principle in Bayesian model choice: parameters appearing in different models must be considered as separate entities

Slide 13: Inference on $k$ through posterior probabilities and predictive plots of the regression lines
[Figure: predictive plots for successive values of $k$]

Slide 14: Results on a large uniform sample for the beta mixture
$$p_0 + (1 - p_0) \sum_{i=1}^{k} \frac{\omega_i}{\sum_l \omega_l}\, \mathcal{B}e\bigl(\alpha_i \epsilon_i,\ \alpha_i (1 - \epsilon_i)\bigr)$$
- $k$ never estimated as $0$
- $p_0$ very small
- likelihood widely different from $1$
- curve almost flat

Slide 15: [Figure: histograms of the dataset]

Slide 16: [Figure: histograms of the MCMC output for $p_0$, $k$, the $\alpha$'s, the $\epsilon$'s, and the weights]

Slide 17: Extensions to more challenging structures
Introduce more advanced models by way of additional latent variables, with possible dependence.

Slide 18: Hidden Markov models
$$z_1 \sim \pi, \qquad z_i \mid z_{i-1} \sim p_{z_{i-1}\, z_i}, \qquad x_i \mid z_i \sim \mathcal{N}(\mu_{z_i}, \sigma^2_{z_i})$$
[Figure: MCMC histograms and raw plots of the $\omega$'s, $\mu$'s and $\sigma$'s]
Very similar to the normal mixture, but with additional structure that improves estimation.

Slide 19: Still allows for flat priors
$$\pi(\mu, \sigma, P) \propto \frac{1}{\sigma_1} \prod_{i=1}^{k-1} \exp\left\{ -\frac{(\mu_{i+1} - \mu_i)^2}{2\sigma_i^2} \right\} \mathbb{I}_{\sigma_1 > \cdots > \sigma_k}$$
[Robert & Titterington (1997)]
Gibbs implementation straightforward:
1. Generate the missing data from $p(z_i = j \mid z_{i-1}, z_{i+1}, \theta, P)$

Slide 20: 2. Generate the parameters
$$p_{i\cdot} \sim \mathcal{D}(n_{i1} + 1, \dots, n_{ik} + 1)$$
$$\mu_i \sim \mathcal{N}\left( \frac{n_i \sigma_i^{-2} \bar{x}_i + \alpha_{i-1}\mu_{i-1} + \alpha_{i+1}\mu_{i+1}}{n_i \sigma_i^{-2} + \alpha_{i-1} + \alpha_{i+1}},\ \bigl(n_i \sigma_i^{-2} + \alpha_{i-1} + \alpha_{i+1}\bigr)^{-1} \right)$$
$$\sigma_i^2 \sim \mathcal{IG}\left( \frac{n_i - 1}{2},\ \frac{n_i(\bar{x}_i - \mu_i)^2 + s_i^2}{2} \right) \mathbb{I}_{\sigma_{i-1} < \sigma_i < \sigma_{i+1}}$$
[Celeux, Diebolt & Robert (1993)]
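
A sketch of one Gibbs sweep combining the two steps above, for Gaussian emissions: the state conditional is $P(z_i = j \mid z_{i-1}, z_{i+1}, \theta, P) \propto p_{z_{i-1} j}\, p_{j z_{i+1}}\, \mathcal{N}(x_i; \mu_j, \sigma_j^2)$, followed by the Dirichlet row update. The $\mu$ and $\sigma^2$ updates and the ordering constraints are left out for brevity, and a flat initial distribution is assumed at $i = 0$:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

def gibbs_sweep(x, z, P, mu, sigma):
    n, k = len(x), len(mu)
    # 1. update the hidden states one at a time from their full conditionals
    for i in range(n):
        probs = norm.pdf(x[i], mu, sigma)          # emission term f(x_i | theta_j)
        if i > 0:
            probs = probs * P[z[i - 1], :]         # p_{z_{i-1}, j}
        if i < n - 1:
            probs = probs * P[:, z[i + 1]]         # p_{j, z_{i+1}}
        z[i] = rng.choice(k, p=probs / probs.sum())
    # 2. update each transition row: p_i. ~ Dirichlet(n_i1 + 1, ..., n_ik + 1)
    counts = np.zeros((k, k))
    for a, b in zip(z[:-1], z[1:]):
        counts[a, b] += 1
    for i in range(k):
        P[i] = rng.dirichlet(counts[i] + 1)
    return z, P
```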

Slide 21: A non-Gibbs implementation is also possible, without the missing states, thanks to the forward-backward formulae. Estimation of $k$ is possible via reversible jump [Robert, Rydén & Titterington (1999)] and other jump process methods [Cappé, Robert & Rydén (2001)].
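
For reference, a sketch of the forward recursion that delivers the likelihood with the hidden states integrated out (log scale for numerical stability; the initial distribution init is left to the caller):

```python
import numpy as np
from scipy.stats import norm
from scipy.special import logsumexp

def hmm_loglik(x, P, mu, sigma, init):
    """Forward recursion: log p(x_1:n) for a Gaussian-emission HMM."""
    logB = norm.logpdf(x[:, None], mu[None, :], sigma[None, :])  # (n, k) emissions
    logalpha = np.log(init) + logB[0]
    for t in range(1, len(x)):
        # alpha_t(j) = [sum_i alpha_{t-1}(i) P_ij] * f(x_t | theta_j)
        logalpha = logsumexp(logalpha[:, None] + np.log(P), axis=0) + logB[t]
    return logsumexp(logalpha)
```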

Slide 22: Split-merge moves for HMMs
Parametrisation: $p_{ij} = \omega_{ij} / \sum_l \omega_{il}$, $Y_t \mid X_t = i \sim \mathcal{N}(\mu_i, \sigma_i^2)$.
Move to split component $j$ into $j_1$ and $j_2$:
- $\omega_{i j_1} = \omega_{ij}\, \epsilon_i$, $\omega_{i j_2} = \omega_{ij}(1 - \epsilon_i)$, $\epsilon_i \sim \mathcal{U}(0, 1)$;
- $\omega_{j_1 j} = \omega_{jj}\, \xi_j$, $\omega_{j_2 j} = \omega_{jj} / \xi_j$, $\xi_j \sim \log\mathcal{N}(0, 1)$; similar ideas give $\omega_{j_1 j_2}$ etc.;
- $\mu_{j_1} = \mu_j - 3\sigma_j\, \epsilon_\mu$, $\mu_{j_2} = \mu_j + 3\sigma_j\, \epsilon_\mu$, $\epsilon_\mu \sim \mathcal{N}(0, 1)$;
- $\sigma^2_{j_1} = \sigma^2_j\, \xi_\sigma$, $\sigma^2_{j_2} = \sigma^2_j / \xi_\sigma$, $\xi_\sigma \sim \log\mathcal{N}(0, 1)$.
Split intensity $\lambda_{S,k} = k\, \lambda_B$, with $\lambda_B$ the birth intensity.
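
A sketch of the random part of this split proposal; the $\omega_{j_1 j_2}$ cross entries, the Jacobian, and the acceptance step that the move also requires are omitted:

```python
import numpy as np

rng = np.random.default_rng(4)

def split_parameters(w_col_j, w_row_j, mu_j, sigma2_j):
    """Schematic split of state j into j1 and j2 on the omega-parametrisation."""
    k = len(w_col_j)
    eps = rng.uniform(size=k)                        # eps_i ~ U(0, 1)
    col_j1, col_j2 = w_col_j * eps, w_col_j * (1 - eps)   # omega_{i j1}, omega_{i j2}
    xi = rng.lognormal(size=k)                       # xi ~ logN(0, 1)
    row_j1, row_j2 = w_row_j * xi, w_row_j / xi      # outgoing rows of j1 and j2
    eps_mu = rng.normal()
    mu1 = mu_j - 3 * np.sqrt(sigma2_j) * eps_mu      # mu_{j1}
    mu2 = mu_j + 3 * np.sqrt(sigma2_j) * eps_mu      # mu_{j2}
    xi_s = rng.lognormal()
    s1, s2 = sigma2_j * xi_s, sigma2_j / xi_s        # sigma^2_{j1}, sigma^2_{j2}
    return (col_j1, col_j2), (row_j1, row_j2), (mu1, mu2), (s1, s2)
```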

Slide 23: Fixed-$k$ moves are also used.
Example: wind intensity in Athens.
[Figure: histogram and raw plot of the dataset]

Slide 24: [Figure: MCMC output on $k$ (histogram and raw plot), number of states, number of moves, and corresponding log-likelihood values]

Slide 25: [Figure: MCMC sequence of the parameters of the three components when conditioning on $k = 3$]

Slide 26: [Figure: MCMC evaluation of the marginal density, compared with R's nonparametric density estimate]

Slide 27: Other latent variable models and hidden structures
- Hidden semi-Markov models
- Switching ARMA models
- Stochastic volatility and ARCH models
- Discretised diffusions

Slide 28: Hidden semi-Markov models
Example: ion channel model [Hobson, 1999; Carpenter et al., 2001]
Observables $y = (y_t)_{1 \le t \le T}$ directed by a hidden Gamma process $x = (x_t)_{1 \le t \le T}$:
$$y_t \mid x_t \sim \mathcal{N}(\mu_{x_t}, \sigma^2), \qquad x_t \in \{0, 1\},$$
with durations ($i = 0, 1$)
$$d_j = t_{j+1} - t_j \sim \mathcal{G}a(s_i, \lambda_i) \quad \text{if } x_t = i \text{ for } t_j \le t < t_{j+1}.$$
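
A sketch of simulating from this generative model, which makes the semi-Markov (Gamma-duration) structure explicit; the Gamma durations are rounded to the discrete time grid, and the parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate_hsmm(T, mu, sigma, s, lam):
    """Alternate 0/1 regimes with Ga(s_i, lambda_i) durations; Gaussian observations."""
    x = np.empty(T, dtype=int)
    t, state = 0, 0
    while t < T:
        # duration ~ Ga(s_i, lambda_i), i.e. shape s_i and rate lambda_i
        d = max(1, int(round(rng.gamma(shape=s[state], scale=1.0 / lam[state]))))
        x[t:t + d] = state
        t += d
        state = 1 - state                      # the hidden process alternates 0 <-> 1
    y = rng.normal(mu[x], sigma)               # y_t | x_t ~ N(mu_{x_t}, sigma^2)
    return x, y

x, y = simulate_hsmm(1000, mu=np.array([0.0, 1.0]), sigma=0.3,
                     s=np.array([3.0, 2.0]), lam=np.array([0.1, 0.05]))
```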

Slide 29: [Figure: observed series $y_t$]
Complex likelihood structure with no closed-form expression.

Slide 30: Prior assumptions
- conjugate normal-gamma prior on the $\mu$'s and $\sigma$: $\mu_i \sim \mathcal{N}(\theta_0, \tau\sigma^2)$, $\sigma^2 \sim \mathcal{G}(\zeta, \eta)^{-1}$
- conjugate gamma prior $\mathcal{G}(\alpha, \beta)$ on the $\lambda$'s
- flat prior on the $s$'s on $\{1, \dots, S\}$

Slide 31: Particle system
Generation of a system of particles $(\omega^{(j)}, x^{(j)})$, $j = 1, \dots, J$, where $\omega = (\mu_0, \mu_1, \sigma, \lambda_0, \lambda_1, s_0, s_1)$, based on a proposal/instrumental/importance distribution
$$\pi(\omega \mid y, x)\, \pi_H(x \mid y, \omega)$$

Slide 32: where $\pi_H$ is the full conditional of a fitted hidden Markov model with transition matrix
$$\mathbb{P} = \begin{pmatrix} 1 - \lambda_0/s_0 & \lambda_0/s_0 \\ \lambda_1/s_1 & 1 - \lambda_1/s_1 \end{pmatrix},$$
by analogy with the average sojourn times ($s_i/\lambda_i$) of both models.

Slide 33: Simulation
Use of the forward-backward formulae, of the conjugate structure for the $\mu$'s, $\lambda$'s and $\sigma$, and of the finite support of the $s_i$'s, distributed as
$$\pi(s_i \mid x) \propto \frac{\Gamma(n_i s_i + \alpha)}{(\beta + v_i)^{n_i s_i + \alpha}}\, \frac{\left[\prod_j d_{ij}\right]^{s_i}}{\Gamma(s_i)^{n_i}}\, \mathbb{I}_{\{1, 2, \dots, S\}}(s_i),$$
where $n_i$ is the number of sojourns in state $i$, the $d_{ij}$ their durations, and $v_i$ their sum.
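
Because the support is finite, $\pi(s_i \mid x)$ can be sampled exactly by normalizing this expression over $\{1, \dots, S\}$. A sketch in log scale, with d the vector of durations of the $n_i$ sojourns in state $i$ (and the formula above taken as reconstructed):

```python
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(6)

def sample_s(d, alpha, beta, S):
    """Draw s_i from pi(s_i | x) on {1, ..., S}, with lambda_i integrated out."""
    n, v = len(d), d.sum()
    s = np.arange(1, S + 1)
    logp = (gammaln(n * s + alpha) - (n * s + alpha) * np.log(beta + v)
            + s * np.log(d).sum() - n * gammaln(s))
    logp -= logp.max()                      # stabilize before exponentiating
    p = np.exp(logp)
    return rng.choice(s, p=p / p.sum())
```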

Slide 34: [Figure: fitted series with residuals (top) and allocation probabilities (bottom)]

Slide 35: [Figure: posterior histograms of $\mu_1$, $\mu_2$, $\sigma$, $(\mu_1 - \mu_2)/\sigma$, the $\lambda$'s, the $s$'s, and the mean sojourn times $s_1/\lambda_1$, $s_2/\lambda_2$]

Slide 36: Iterated particle system
Repeated calls to importance sampling, with systematic resampling steps, to improve the fit.
How many steps? Which improvement? Why bother?!

Slide 37: Algorithm
Step 0. Generate ($j = 1, \dots, J$)
1. $\omega^{(j)} \sim \pi(\omega)$
2. $x^{(j)} = (x_t^{(j)})_{1 \le t \le T} \sim \pi_H(x \mid y, \omega^{(j)})$
and compute the weights ($j = 1, \dots, J$)
$$\varrho_j \propto \frac{\pi(\omega^{(j)}, x^{(j)} \mid y)}{\pi(\omega^{(j)})\, \pi_H(x^{(j)} \mid y, \omega^{(j)})}$$

Slide 38: Step i. ($i = 1, \dots$) Generate ($j = 1, \dots, J$)
1. $\omega^{(j)} \sim \pi(\omega \mid y, x^{(j)})$
2. $x_+^{(j)} = (x_t^{(j)})_{1 \le t \le T} \sim \pi_H(x \mid y, \omega^{(j)})$
compute the weights ($j = 1, \dots, J$)
$$\varrho_j \propto \frac{\pi(\omega^{(j)}, x_+^{(j)} \mid y)}{\pi(\omega^{(j)} \mid y, x^{(j)})\, \pi_H(x_+^{(j)} \mid y, \omega^{(j)})},$$
resample the couples $(\omega^{(j)}, x_+^{(j)})$ from the weights $\varrho_j$, and take $x^{(j)} = x_+^{(j)}$ ($j = 1, \dots, J$).
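
A skeleton of the iteration (Step i) with systematic resampling; sample_omega_post, sample_hidden, log_target, log_omega_post and log_hidden_prop are model-specific placeholders standing for $\pi(\omega \mid y, x)$, $\pi_H$, and the weight terms above:

```python
import numpy as np

rng = np.random.default_rng(7)

def systematic_resample(weights):
    """Return indices resampled systematically according to the (normalized) weights."""
    J = len(weights)
    positions = (rng.uniform() + np.arange(J)) / J
    return np.minimum(np.searchsorted(np.cumsum(weights), positions), J - 1)

def iterate_particles(omegas, xs, n_steps, sample_omega_post, sample_hidden,
                      log_target, log_omega_post, log_hidden_prop):
    for _ in range(n_steps):
        omegas = [sample_omega_post(x) for x in xs]       # omega ~ pi(omega | y, x)
        xs_new = [sample_hidden(w) for w in omegas]       # x+ ~ pi_H(x | y, omega)
        logw = np.array([log_target(w, x) - log_omega_post(w, xo)
                         - log_hidden_prop(x, w)
                         for w, x, xo in zip(omegas, xs_new, xs)])
        w = np.exp(logw - logw.max())
        idx = systematic_resample(w / w.sum())            # resample the couples
        omegas = [omegas[i] for i in idx]
        xs = [xs_new[i] for i in idx]
    return omegas, xs
```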

Slide 39: [Figure: history and ancestry of the particle system]

Slide 40: Solving optimization problems

Slide 41: Role of maximum a posteriori estimation in Bayesian inference
$$\theta = (\theta_1, \theta_2) \in \Theta_1 \times \Theta_2, \qquad \theta \sim p(\theta),$$
especially when posterior means are useless. But there is a difficulty with marginal MAP (MMAP) estimates, because the nuisance parameters must be integrated out:
$$\theta_1^{\text{MMAP}} = \arg\max_{\Theta_1} p(\theta_1 \mid y), \quad \text{where} \quad p(\theta_1 \mid y) = \int_{\Theta_2} p(\theta_1, \theta_2 \mid y)\, d\theta_2$$

Slide 42: If the integration is possible in closed form, use the Expectation-Maximization (EM) algorithm [Dempster et al. (1977)], a deterministic algorithm which depends on initialization and is limited to certain classes of models. Stochastic variants: stochastic EM (SEM) and Monte Carlo EM (MCEM) [Celeux & Diebolt (1985); Wei & Tanner (1991)]. The parameter of interest is always updated deterministically in the M step.

Slide 43: Standard (and Markov chain) Monte Carlo: draw random samples from the joint posterior distribution $p(\theta_1, \theta_2 \mid y)$, or an MCMC (approximate, dependent) sample
$$\left\{ \left( \theta_1^{(i)}, \theta_2^{(i)} \right);\ i = 1, \dots, N \right\},$$
and discard the nuisance parameters. More suited to integration than to optimization.

Slide 44: Simulated annealing (SA) for maximizing $p(\theta_1 \mid y)$
Non-homogeneous variant of MCMC for global optimization: the invariant distribution at iteration $i$ is proportional to $p^{\gamma(i)}(\theta_1 \mid y)$, with $\gamma(i)$ an increasing function diverging to infinity. Idea: as $\gamma(i)$ goes to infinity, $p^{\gamma(i)}(\theta_1 \mid y)$ concentrates on the set of global modes.
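
A minimal sketch of such an annealed Metropolis run on a generic log-posterior logp; the Gaussian random-walk proposal and the linear schedule $\gamma(i) = 1 + i/100$ are illustrative choices, not the talk's:

```python
import numpy as np

rng = np.random.default_rng(8)

def simulated_annealing(logp, theta0, n_iter=10_000, step=0.1):
    """Metropolis with target proportional to p(theta | y)^gamma(i), gamma increasing."""
    theta = theta0
    for i in range(n_iter):
        gamma = 1.0 + i / 100.0                 # gamma(i) -> infinity
        prop = theta + step * rng.normal(size=np.shape(theta))
        # accept with prob min(1, [p(prop | y) / p(theta | y)]^gamma)
        if np.log(rng.uniform()) < gamma * (logp(prop) - logp(theta)):
            theta = prop
    return theta
```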

Slide 45: State Augmentation for Marginal Estimation [Doucet, Godsill & Robert (2001)]
Artificially augmented probability model whose marginal distribution is $p^{\gamma}(\theta_1 \mid y)$, via replications of the nuisance parameters:
- Replace $\theta_2$ with $\gamma$ artificial replications $\theta_2(1), \dots, \theta_2(\gamma)$
- Treat the $\theta_2(j)$'s as distinct random variables:
$$q_\gamma(\theta_1, \theta_2(1), \dots, \theta_2(\gamma) \mid y) \propto \prod_{k=1}^{\gamma} p(\theta_1, \theta_2(k) \mid y)$$

Slide 46: Use the corresponding marginal for $\theta_1$:
$$q_\gamma(\theta_1 \mid y) = \int q_\gamma(\theta_1, \theta_2(1), \dots, \theta_2(\gamma) \mid y)\, d\theta_2(1) \cdots d\theta_2(\gamma) \propto \prod_{k=1}^{\gamma} \int p(\theta_1, \theta_2(k) \mid y)\, d\theta_2(k) = p^{\gamma}(\theta_1 \mid y)$$
- Build an MCMC algorithm in the augmented space, with invariant distribution $q_\gamma(\theta_1, \theta_2(1), \dots, \theta_2(\gamma) \mid y)$
- Use the simulated subsequence $\{\theta_1^{(i)};\ i \in \mathbb{N}\}$ as drawn from the marginal posterior $p^{\gamma}(\theta_1 \mid y)$
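
A sketch of the resulting SAME-type Gibbs sampler: each of the $\gamma$ replications is refreshed from its full conditional given $\theta_1$, and $\theta_1$ from its conditional given all replications. sample_theta1 and sample_theta2 are model-specific placeholders, and the replication schedule is illustrative:

```python
def same_algorithm(theta1, theta2_init, sample_theta1, sample_theta2,
                   n_iter=1000, gamma_max=50):
    """State Augmentation for Marginal Estimation via replicated nuisance parameters."""
    replications = [theta2_init]
    for i in range(n_iter):
        gamma = 1 + (i * gamma_max) // n_iter       # slowly increasing gamma(i)
        while len(replications) < gamma:            # grow the augmented space
            replications.append(replications[-1])
        # refresh each replication from its full conditional p(theta2 | theta1, y)
        replications = [sample_theta2(theta1) for _ in replications]
        # refresh theta1 from its conditional given all gamma replications
        theta1 = sample_theta1(replications)
    return theta1
```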

Slide 47: Application to the benchmark galaxy dataset [Roeder (1992)]
82 observations of galaxy velocities from 3 (?) groups.

Algorithm                  EM    MCEM    SAME
Mean log-posterior         -     -       -
Std dev of log-posterior   -     -       -
