HMMs with variable dimension structures and extensions
Christian P. Robert, Université Paris Dauphine [HMM days, ENST, January 21]
Estimating [not testing] the number of components/states
In the mixture model
$$y_i \sim f(y) = \sum_{j=1}^{k} p_j\, f(y\mid\theta_j), \qquad i = 1,\ldots,n,$$
what if k is unknown?!
Meaning of the question
- weak identifiability of mixtures: an insolvable philosophical problem unless k has a proper intrinsic meaning [and even so...]
- hence testing per se is impossible: the data cannot distinguish between k and k + h [unless guided by a firm hand!]
- the choice of a prior π(k) on k is thus necessary, to translate the degree of detail required [equivalent to penalizing factors in likelihood analysis]
Multiplicity of technical solutions
- saturated models with n mostly empty components
- reversible jump MCMC techniques for exploration of most models [Green (1995); Richardson & Green (1997)]
- birth-and-death and other jump processes [Preston (1976); Ripley (1977); Stephens (1999, 2000)]
Principles of RJMCMC

Births and deaths are proposed with probabilities $\beta_k$ and $\delta_k$, respectively, when the chain has k components. The acceptance probability of a birth move is $\min(A, 1)$, with
$$A = \text{likelihood ratio} \times \frac{\delta_{k+1}}{\beta_k} \times \frac{(1-w)^{k-1}}{b(w,\varphi)}.$$
The acceptance probability of the corresponding death move is $\min(A^{-1}, 1)$.
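As an illustration, here is a minimal Python sketch of this birth/death step (not from the original slides: the functions `log_lik_ratio`, `b_pdf` and `sample_b`, and the indexing of $\beta_k$, $\delta_k$ by arrays, are assumptions made for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_A(components, new_comp, k, beta_k, delta_k1, log_lik_ratio, b_pdf):
    """log of A = likelihood ratio * (delta_{k+1}/beta_k) * (1-w)^{k-1} / b(w, phi),
    for a birth from the k-component configuration `components`."""
    w, phi = new_comp
    return (log_lik_ratio(components, new_comp)
            + np.log(delta_k1) - np.log(beta_k)
            + (k - 1) * np.log1p(-w)
            - np.log(b_pdf(w, phi)))

def rjmcmc_step(components, beta, delta, log_lik_ratio, b_pdf, sample_b):
    """One RJMCMC birth-or-death move on a list of (w, phi) components."""
    k = len(components)
    if rng.random() < beta[k]:                    # propose a birth
        new = sample_b()
        la = log_A(components, new, k, beta[k], delta[k + 1], log_lik_ratio, b_pdf)
        if np.log(rng.random()) < min(la, 0.0):   # accept with prob. min(A, 1)
            components.append(new)
    elif k > 1:                                   # propose the death of a random component
        j = int(rng.integers(k))
        rest = components[:j] + components[j + 1:]
        la = log_A(rest, components[j], k - 1, beta[k - 1], delta[k], log_lik_ratio, b_pdf)
        if np.log(rng.random()) < min(-la, 0.0):  # accept with prob. min(1/A, 1)
            components[:] = rest
    return components
```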
Principles of BDMCMC

New components are born according to a Poisson process with rate $\lambda_k$ when the chain has k components. Each component $(w,\varphi)$ dies with rate
$$d(w,\varphi) = \text{likelihood ratio}^{-1} \times \frac{\lambda_k}{k+1} \times \frac{b(w,\varphi)}{(1-w)^{k-1}}.$$
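The corresponding continuous-time mechanics can be sketched as follows (again illustrative: `death_rate` is assumed to implement the formula above for the current configuration):

```python
import numpy as np

rng = np.random.default_rng(1)

def bdmcmc_next_event(components, lambda_k, death_rate):
    """Next jump of the birth-and-death process: exponential holding time at
    total rate lambda_k + sum of the death rates, then a birth or a death is
    chosen proportionally to the rates."""
    k = len(components)
    rates = [death_rate(components[i], components[:i] + components[i + 1:])
             for i in range(k)]
    total = lambda_k + sum(rates)
    holding_time = rng.exponential(1.0 / total)
    if rng.random() < lambda_k / total:           # next event is a birth
        return holding_time, ("birth", None)
    probs = np.array(rates) / sum(rates)          # otherwise one component dies
    return holding_time, ("death", int(rng.choice(k, p=probs)))
```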
More general moves

Local balance for Markov jump processes in general:
$$\pi(\theta)\,q(\theta, \theta') = \pi(\theta')\,q(\theta', \theta)$$
For birth-death, split-merge moves, etc.:
$$\pi(k)\,\pi(\theta_k \mid k)\,L(\theta_k)\,\lambda_k\,b(u_k)\,J^{-1} = \pi(k+1)\,\pi(\theta_{k+1} \mid k+1)\,L(\theta_{k+1})\,d(\theta_{k+1}, \theta_k)$$
where J denotes the Jacobian of the transformation from $(\theta_k, u_k)$ to $\theta_{k+1}$.
Comparison of mixing properties

RJMCMC works poorly if
$$A = \text{likelihood ratio} \times \frac{\delta_{k+1}}{\beta_k} \times \frac{(1-w)^{k-1}}{b(w,\varphi)}$$
is small. But if A is small, then
$$d(w,\varphi) = \text{likelihood ratio}^{-1} \times \frac{\lambda_k}{k+1} \times \frac{b(w,\varphi)}{(1-w)^{k-1}} = \frac{N}{k+1}\,A^{-1}$$
(under the rescaling $\beta_k = \lambda_k/N$, $\delta_{k+1} \approx 1$ introduced on the next slide) is large, and BDMCMC works poorly as well.
RJMCMC → BDMCMC: rescaling time

In discrete-time RJMCMC, let the time unit be 1/N, and put $\beta_k = \lambda_k/N$ and $\delta_k = 1 - \lambda_k/N$. As $N \to \infty$, each birth proposal will be accepted and, with k components, births occur according to a Poisson process with rate $\lambda_k$.
As $N \to \infty$, a component $(w,\varphi)$ dies with rate
$$\lim_{N\to\infty} N\,\delta_{k+1}\,\frac{1}{k+1}\,\min(A^{-1}, 1)
= \lim_{N\to\infty} \frac{N}{k+1}\,\frac{\beta_k}{\delta_{k+1}}\,\text{likelihood ratio}^{-1}\,\frac{b(w,\varphi)}{(1-w)^{k-1}}
= \text{likelihood ratio}^{-1}\,\frac{\lambda_k}{k+1}\,\frac{b(w,\varphi)}{(1-w)^{k-1}}.$$
Hence RJMCMC → BDMCMC. This holds more generally.
Inference with varying k

Little difference from the fixed-k setting:
- inference must be conditional on k
- general principle in Bayesian model choice: parameters appearing in different models must be considered as separate entities
Inference on k through posterior probabilities and predictive plots of the regression lines
[Figure: predictive plots for a range of values of k]
Results on a large uniform sample for the beta mixture
$$p_0 + (1-p_0)\sum_{i=1}^{k} \frac{\omega_i}{\sum_l \omega_l}\,\mathrm{Be}(\alpha_i\epsilon_i,\ \alpha_i(1-\epsilon_i))$$
- k never estimated as 0
- $p_0$ very small
- likelihood widely different from 1
- curve almost flat
[Figure: histograms of the dataset]
[Figure: histograms of the MCMC output for k, $p_0$, the α's, the ε's, the weights and the probabilities]
Extensions to more challenging structures

Introduce more advanced models by way of additional latent variables, with possible dependence.
Hidden Markov models
$$z_1 \sim \pi, \qquad z_i \mid z_{i-1} \sim p_{z_{i-1}\,z_i},$$
$$x_1 \mid z_1 \sim \mathcal{N}(\mu_{z_1}, \sigma^2_{z_1}), \qquad x_i \mid z_i \sim \mathcal{N}(\mu_{z_i}, \sigma^2_{z_i})$$
[Figure: histograms and raw plots of the MCMC output for the ω's, µ's and σ's]
Very similar to the normal mixture, except for the additional structure, which improves estimation.
Still allows for flat priors
$$\pi(\mu, \sigma, P) \propto \frac{1}{\sigma_1} \prod_{i=1}^{k-1} \exp\left\{-\frac{1}{2\sigma_i^2}(\mu_{i+1} - \mu_i)^2\right\} \mathbb{I}_{\sigma_1 > \cdots > \sigma_k}$$
[Robert & Titterington (1997)]

Gibbs implementation straightforward:
1. Generate the missing data from $p(z_i = j \mid z_{i-1}, z_{i+1}, \theta, p)$ (see the sketch below)
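A minimal Python sketch of this single-site state update for a Gaussian HMM (illustrative only; the boundary handling and the array-based names `pi0`, `P`, `mu`, `sigma` are assumptions, not the slides' code):

```python
import numpy as np

rng = np.random.default_rng(2)

def gibbs_states(z, x, pi0, P, mu, sigma):
    """One sweep of p(z_i = j | z_{i-1}, z_{i+1}, theta, P), proportional to
    P[z_{i-1}, j] * P[j, z_{i+1}] * N(x_i; mu_j, sigma_j^2)."""
    n, k = len(x), len(mu)
    for i in range(n):
        logp = -0.5 * ((x[i] - mu) / sigma) ** 2 - np.log(sigma)   # Gaussian emission term
        logp += np.log(pi0) if i == 0 else np.log(P[z[i - 1], :])  # left neighbour (or initial law)
        if i < n - 1:
            logp += np.log(P[:, z[i + 1]])                         # right neighbour
        p = np.exp(logp - logp.max())                              # normalise safely
        z[i] = int(rng.choice(k, p=p / p.sum()))
    return z
```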
2. Generate the parameters
$$p_{i\cdot} \sim \mathcal{D}(n_{i1}+1, \ldots, n_{ik}+1)$$
$$\mu_i \sim \mathcal{N}\left(\frac{n_i\sigma_i^{-2}\bar{x}_i + \alpha_{i-1}\mu_{i-1} + \alpha_{i+1}\mu_{i+1}}{n_i\sigma_i^{-2} + \alpha_{i-1} + \alpha_{i+1}},\ \left(n_i\sigma_i^{-2} + \alpha_{i-1} + \alpha_{i+1}\right)^{-1}\right)$$
$$\sigma_i^2 \sim \mathcal{IG}\left(\frac{n_i - 1}{2},\ \frac{n_i(\bar{x}_i - \mu_i)^2 + s_i^2}{2}\right) \mathbb{I}_{\sigma_{i-1} < \sigma_i < \sigma_{i+1}}$$
[Celeux, Diebolt & Robert (1993)]
A non-Gibbs implementation is also possible, without the missing states, thanks to the forward-backward formulae. Estimation of k is possible via reversible jump [Robert, Rydén & Titterington (1999)] and other jump process methods [Cappé, Robert & Rydén (2001)].
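For concreteness, a minimal sketch of the forward recursion that delivers the likelihood without simulating the states (an illustrative implementation under the Gaussian-emission assumptions above, not the slides' code):

```python
import numpy as np

def forward_log_likelihood(x, pi0, P, mu, sigma):
    """Forward recursion for a Gaussian HMM: returns log p(x_1,...,x_n | theta),
    normalising the forward variables at each step for numerical stability."""
    x = np.asarray(x, dtype=float)
    dens = (np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
            / (np.sqrt(2 * np.pi) * sigma))      # f(x_i | z_i = j), shape (n, k)
    alpha = pi0 * dens[0]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for i in range(1, len(x)):
        alpha = (alpha @ P) * dens[i]            # propagate through the chain
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik
```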
Split-merge moves for HMMs

Parametrisation: $p_{ij} = \omega_{ij}/\sum_l \omega_{il}$, $Y_t \mid X_t = i \sim \mathcal{N}(\mu_i, \sigma_i^2)$.

Move to split component j into $j_1$ and $j_2$ (sketched below):
- $\omega_{ij_1} = \omega_{ij}\,\varepsilon_i$, $\omega_{ij_2} = \omega_{ij}(1-\varepsilon_i)$, $\varepsilon_i \sim \mathcal{U}(0,1)$;
- $\omega_{j_1 j} = \omega_{jj}\,\xi_j$, $\omega_{j_2 j} = \omega_{jj}/\xi_j$, $\xi_j \sim \log\mathcal{N}(0,1)$;
- similar ideas give $\omega_{j_1 j_2}$ etc.;
- $\mu_{j_1} = \mu_j - 3\sigma_j\varepsilon_\mu$, $\mu_{j_2} = \mu_j + 3\sigma_j\varepsilon_\mu$, $\varepsilon_\mu \sim \mathcal{N}(0,1)$;
- $\sigma^2_{j_1} = \sigma^2_j\,\xi_\sigma$, $\sigma^2_{j_2} = \sigma^2_j/\xi_\sigma$, $\xi_\sigma \sim \log\mathcal{N}(0,1)$.

[Split intensity] $\lambda_{S,k} = k\lambda_B$ [Birth intensity]
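A minimal sketch of the split proposal, extrapolating the "similar ideas" clause to all outgoing weights (illustrative Python; the names and conventions are assumptions rather than the authors' code):

```python
import numpy as np

rng = np.random.default_rng(3)

def split_component(omega, mu, sigma2, j):
    """Propose splitting state j of a weight-parametrised HMM into (j1, j2).
    omega is the (k, k) matrix of unnormalised transition weights."""
    k = omega.shape[0]
    eps = rng.uniform(size=k)                  # eps_i ~ U(0, 1), one per row i
    xi = rng.lognormal(size=k)                 # xi ~ logN(0, 1), one per column
    col_j1 = omega[:, j] * eps                 # omega_{i j1} = omega_{ij} * eps_i
    col_j2 = omega[:, j] * (1 - eps)           # omega_{i j2} = omega_{ij} * (1 - eps_i)
    row_j1 = omega[j, :] * xi                  # omega_{j1 l} = omega_{jl} * xi_l
    row_j2 = omega[j, :] / xi                  # omega_{j2 l} = omega_{jl} / xi_l
    eps_mu = rng.normal()                      # eps_mu ~ N(0, 1)
    xi_sig = rng.lognormal()                   # xi_sigma ~ logN(0, 1)
    mu_j1 = mu[j] - 3 * np.sqrt(sigma2[j]) * eps_mu
    mu_j2 = mu[j] + 3 * np.sqrt(sigma2[j]) * eps_mu
    s2_j1, s2_j2 = sigma2[j] * xi_sig, sigma2[j] / xi_sig
    return col_j1, col_j2, row_j1, row_j2, mu_j1, mu_j2, s2_j1, s2_j2
```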
Fixed-k moves are also used.

Example: wind intensity in Athens
[Figure: histogram and raw plot of the dataset]
[Figure: MCMC output on k (histogram and raw plot of the number of states), corresponding log-likelihood values, and number of moves]
[Figure: MCMC sequences of the parameters of the three components when conditioning on k = 3]
[Figure: MCMC evaluation of the marginal density, compared with the R nonparametric density estimate]
Other latent variable models and hidden structures
- hidden semi-Markov models
- switching ARMA models
- stochastic volatility and ARCH models
- discretised diffusions
Hidden semi-Markov models

Example: ion channel model [Hobson, 1999; Carpenter et al., 2001]

Observables $y = (y_t)_{1\le t\le T}$ directed by a hidden Gamma process $x = (x_t)_{1\le t\le T}$:
$$y_t \mid x_t \sim \mathcal{N}(\mu_{x_t}, \sigma^2), \qquad x_t \in \{0, 1\},$$
with durations ($i = 0, 1$)
$$d_j = t_{j+1} - t_j \sim \mathcal{G}a(s_i, \lambda_i) \quad \text{if } x_t = i \text{ for } t_j \le t < t_{j+1}.$$
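To fix ideas, a minimal sketch simulating this two-state model (the parameter values and the rounding of the gamma durations to the discrete time grid are assumptions made for the illustration):

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate_ion_channel(T, mu=(0.0, 1.0), sigma=0.1, s=(3.0, 2.0), lam=(0.5, 0.3)):
    """Simulate the hidden gamma process x in {0, 1} and the observables y."""
    x = np.empty(T, dtype=int)
    t, state = 0, int(rng.integers(2))
    while t < T:
        # duration in `state`: d ~ Ga(s_i, lambda_i), shape s_i and rate lambda_i,
        # rounded to at least one step on the discrete grid (an assumption here)
        d = max(1, int(round(rng.gamma(s[state], 1.0 / lam[state]))))
        x[t:t + d] = state
        t += d
        state = 1 - state                   # the two levels alternate
    y = rng.normal(np.asarray(mu)[x], sigma)
    return x, y
```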
[Figure: a simulated series $y_t$]
Complex likelihood structure with no closed-form expression.
Prior assumptions
- conjugate normal-gamma prior on the µ's and σ: $\mu_i \sim \mathcal{N}(\theta_0, \tau\sigma^2)$, $\sigma^2 \sim \mathcal{G}(\zeta, \eta)^{-1}$
- conjugate gamma priors on the λ's: $\lambda_i \sim \mathcal{G}(\alpha, \beta)$
- flat priors on the s's on $\{1, \ldots, S\}$
Particle system

Generation of a system of particles $(\omega^{(j)}, x^{(j)})$, $j = 1, \ldots, J$, where $\omega = (\mu_0, \mu_1, \sigma, \lambda_0, \lambda_1, s_0, s_1)$, based on a proposal/instrumental/importance distribution
$$\pi(\omega \mid y, x)\,\pi_H(x \mid y, \omega)$$
where $\pi_H$ is the full conditional of a fitted hidden Markov model with transition matrix
$$\mathbb{P} = \begin{pmatrix} 1 - \lambda_0/s_0 & \lambda_0/s_0 \\ \lambda_1/s_1 & 1 - \lambda_1/s_1 \end{pmatrix},$$
by analogy with the average sojourn times of both models.
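The analogy can be spelled out: a $\mathcal{G}a(s_i, \lambda_i)$ duration has mean $s_i/\lambda_i$, while a geometric sojourn with exit probability $p_i$ has mean $1/p_i$, so matching means gives $p_i = \lambda_i/s_i$. A short sketch (illustrative only):

```python
import numpy as np

def fitted_hmm_transition(s, lam):
    """Transition matrix of the HMM proposal, matching the mean sojourn
    times s_i / lambda_i of the gamma durations: exit prob. lambda_i / s_i."""
    p0, p1 = lam[0] / s[0], lam[1] / s[1]
    return np.array([[1 - p0, p0],
                     [p1, 1 - p1]])
```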
Simulation

Use of the forward-backward formulae, of the conjugate structure for the µ's, λ's and σ, and of the finite support of the s's, distributed as
$$\pi(s_i \mid x) \propto \left[\prod_j d_{ij}\right]^{s_i} \frac{\Gamma(n_i s_i + \alpha)}{(\beta + v_i)^{n_i s_i + \alpha}\,\Gamma(s_i)^{n_i}}\,\mathbb{I}_{\{1,2,\ldots,S\}}(s_i),$$
where the $d_{ij}$ are the $n_i$ durations spent in state i and $v_i$ is their sum.
[Figure: fitted series with residuals (top) and allocation probabilities (bottom)]
[Figure: MCMC output for the parameters µ₁, µ₂, σ, λ₀, λ₁, s₀ and s₁]
Iterated particle system

Repeated calls to importance sampling, with systematic resampling steps, to improve the fit.
- How many steps?
- Which improvement?
- Why bother?!
Algorithm

Step 0. Generate, for $j = 1, \ldots, J$,
1. $\omega^{(j)} \sim \pi(\omega)$
2. $x^{(j)} = (x_t^{(j)})_{1\le t\le T} \sim \pi_H(x \mid y, \omega^{(j)})$
and compute the weights
$$\varrho_j \propto \frac{\pi(\omega^{(j)}, x^{(j)} \mid y)}{\pi(\omega^{(j)})\,\pi_H(x^{(j)} \mid y, \omega^{(j)})}$$
Step i. ($i = 1, \ldots$) Generate, for $j = 1, \ldots, J$,
1. $\omega^{(j)} \sim \pi(\omega \mid y, x^{(j)})$
2. $x_+^{(j)} = (x_t^{(j)})_{1\le t\le T} \sim \pi_H(x \mid y, \omega^{(j)})$
compute the weights
$$\varrho_j \propto \frac{\pi(\omega^{(j)}, x_+^{(j)} \mid y)}{\pi(\omega^{(j)} \mid y, x^{(j)})\,\pi_H(x_+^{(j)} \mid y, \omega^{(j)})},$$
resample the couples $(\omega^{(j)}, x_+^{(j)})$ according to the weights $\varrho_j$, and take $x^{(j)} = x_+^{(j)}$, $j = 1, \ldots, J$.
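The systematic resampling step mentioned above can be sketched in a few lines (the standard scheme, not specific to these slides):

```python
import numpy as np

rng = np.random.default_rng(5)

def systematic_resample(weights):
    """Systematic resampling: a single uniform draw placed on J evenly spaced
    positions; returns the indices of the particles to replicate."""
    w = np.asarray(weights, dtype=float)
    J = len(w)
    positions = (rng.random() + np.arange(J)) / J
    return np.searchsorted(np.cumsum(w / w.sum()), positions)
```

The couples $(\omega^{(j)}, x_+^{(j)})$ are then re-indexed by these draws and the weights reset to uniform.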
[Figure: history and ancestry of the particle system]
Solving optimization problems
Role of maximum a posteriori estimation in Bayesian inference, for $\theta = (\theta_1, \theta_2) \in \Theta_1 \times \Theta_2$ with posterior $p(\theta \mid y)$, especially when posterior means are useless.

But there is a difficulty with marginal MAP (MMAP) estimates, because the nuisance parameters must be integrated out:
$$\hat\theta_1^{\mathrm{MMAP}} = \arg\max_{\Theta_1} p(\theta_1 \mid y), \qquad \text{where } p(\theta_1 \mid y) = \int_{\Theta_2} p(\theta_1, \theta_2 \mid y)\,\mathrm{d}\theta_2$$
If the integration is possible in closed form, use the Expectation-Maximization (EM) algorithm [Dempster et al. (1977)]: a deterministic algorithm which depends on its initialization and is limited to certain classes of models. Stochastic variants exist, like Stochastic EM (SEM) or Monte Carlo EM (MCEM) [Celeux & Diebolt (1985); Wei & Tanner (1991)], but the parameter of interest is always updated deterministically in the M step.
Standard (and Markov chain) Monte Carlo: draw random samples from the joint posterior distribution $p(\theta_1, \theta_2 \mid y)$, or an (approximate, dependent) MCMC sample
$$\left\{\left(\theta_1^{(i)}, \theta_2^{(i)}\right);\ i = 1, \ldots, N\right\},$$
and discard the nuisance parameters. More suited to integration than to optimization.
Simulated annealing (SA) for maximizing $p(\theta_1 \mid y)$

Non-homogeneous variant of MCMC for global optimization: the invariant distribution at iteration i is proportional to $p^{\gamma(i)}(\theta_1 \mid y)$, with $\gamma(i)$ an increasing function diverging at infinity. Idea: as $\gamma(i)$ goes to infinity, $p^{\gamma(i)}(\theta_1 \mid y)$ concentrates upon the set of global modes.
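A minimal Metropolis-flavoured sketch of the annealed chain (assuming a symmetric proposal `propose` and a user-supplied schedule `gamma`; an illustration, not the slides' algorithm):

```python
import numpy as np

rng = np.random.default_rng(6)

def simulated_annealing(log_post, propose, theta0, gamma, n_iter):
    """Metropolis moves targeting p(theta_1 | y)^{gamma(i)} at iteration i,
    for a symmetric proposal kernel; returns the best point visited."""
    theta, best = theta0, theta0
    for i in range(1, n_iter + 1):
        prop = propose(theta)
        log_ratio = gamma(i) * (log_post(prop) - log_post(theta))
        if np.log(rng.random()) < min(log_ratio, 0.0):
            theta = prop
        if log_post(theta) > log_post(best):
            best = theta
    return best
```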
State Augmentation for Marginal Estimation (SAME) [Doucet, Godsill & Robert (2001)]

Artificially augmented probability model whose marginal distribution is $p^{\gamma}(\theta_1 \mid y)$, via replications of the nuisance parameters:
- replace $\theta_2$ with γ artificial replications $\theta_2(1), \ldots, \theta_2(\gamma)$
- treat the $\theta_2(j)$'s as distinct random variables:
$$q_\gamma(\theta_1, \theta_2(1), \ldots, \theta_2(\gamma) \mid y) \propto \prod_{k=1}^{\gamma} p(\theta_1, \theta_2(k) \mid y)$$
Use the corresponding marginal for $\theta_1$:
$$q_\gamma(\theta_1 \mid y) = \int q_\gamma(\theta_1, \theta_2(1), \ldots, \theta_2(\gamma) \mid y)\,\mathrm{d}\theta_2(1) \cdots \mathrm{d}\theta_2(\gamma) \propto \prod_{k=1}^{\gamma} \int p(\theta_1, \theta_2(k) \mid y)\,\mathrm{d}\theta_2(k) = p^\gamma(\theta_1 \mid y)$$
Build an MCMC algorithm in the augmented space, with invariant distribution $q_\gamma(\theta_1, \theta_2(1), \ldots, \theta_2(\gamma) \mid y)$, and use the simulated subsequence $\{\theta_1^{(i)};\ i \in \mathbb{N}\}$ as drawn from the marginal posterior $p^\gamma(\theta_1 \mid y)$.
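A Gibbs-flavoured sketch of SAME, assuming the two full conditionals can be simulated (the function names and the increasing replica schedule are assumptions for the illustration):

```python
def same_sampler(sample_theta1, sample_theta2, theta1, gamma_schedule, n_iter):
    """SAME: Gibbs sampling on the augmented target
    q_gamma(theta1, theta2(1), ..., theta2(gamma) | y) ~ prod_k p(theta1, theta2(k) | y).

    sample_theta2(theta1): one draw from p(theta2 | theta1, y)
    sample_theta1(theta2_list): one draw of theta1 from its conditional given all replicas
    gamma_schedule(i): number of replicas at iteration i (increasing, as in SA)
    """
    trace = []
    for i in range(1, n_iter + 1):
        gamma = gamma_schedule(i)
        theta2 = [sample_theta2(theta1) for _ in range(gamma)]  # refresh the replicas
        theta1 = sample_theta1(theta2)                          # conditional of theta1
        trace.append(theta1)
    return trace   # {theta1^(i)} behaves as a sample from p^gamma(theta1 | y)
```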
Application to the benchmark galaxy dataset [Roeder (1992)]: 82 observations of galaxy velocities from 3 (?) groups. Comparison of the EM, MCEM and SAME algorithms through the mean and standard deviation of the attained log-posterior.