Infinite-State Markov-switching for Dynamic Volatility Models: Web Appendix
Arnaud Dufays
Centre de Recherche en Economie et Statistique
March 19

1 Comparison of the two MS-GARCH approximations

Section three of the paper (see Estimation by Bayesian inference and model comparison) details two algorithms to infer the parameters of an MS-GARCH model. These two algorithms differ in the approximation of the MS-GARCH model on which they rely. A more accurate approximation leads to a higher acceptance rate as well as a lower autocorrelation between posterior draws. In order to differentiate the two algorithms, we carry out a Monte Carlo study based on simulated series of 1000 observations and analyze the mixing properties. For each simulation we compute the autocorrelation time [1] as well as the time required to sample one effective posterior draw. [2] We consider four different MS-GARCH models exhibiting a break in ω, with one or multiple switches (CP and MS in Table 1). The data generating processes differ in their persistence (α + β), which is assumed to be equal across regimes. The Monte Carlo study consists of eight hundred different simulated series (one hundred per DGP per algorithm). The main interest of this study lies in the ability to sample the state vector. These MCMC simulations therefore fix the GARCH parameters at their MLE and only draw the state vector given these parameters. Table 1 displays the average difference of the maximum autocorrelation times and the average difference of the effective times for the two algorithms (Kl-MH, for the model of Klaassen (2002), minus HMP-MH, for the model of Haas, Mittnik, and Paolella (2004)).

[1] The autocorrelation time is computed by batch means (see Geyer (1992)) and is defined as 1 + 2 Σ_{i=1}^∞ ρ_i, where ρ_i is the autocorrelation coefficient of order i between the posterior draws of a state variable.
[2] Following the formula: autocorrelation time × elapsed time for N posterior draws / N.
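As a concrete illustration of the diagnostic in footnote [1], the autocorrelation time can be estimated on a simulated chain. The sketch below uses an AR(1) series as a stand-in for the posterior draws of a state variable, and a simple truncated-sum estimator rather than the batch-means implementation used in the appendix; all constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# AR(1) chain as a stand-in for correlated posterior draws; its theoretical
# autocorrelation time is 1 + 2 * sum_k rho^k = (1 + rho) / (1 - rho).
rho, n = 0.8, 200_000
draws = np.empty(n)
draws[0] = 0.0
eps = rng.normal(size=n) * np.sqrt(1.0 - rho**2)
for t in range(1, n):
    draws[t] = rho * draws[t - 1] + eps[t]

def autocorrelation_time(x, max_lag=200):
    """1 + 2 * sum of sample autocorrelations, truncated at the first negative one."""
    x = x - x.mean()
    var = np.dot(x, x) / len(x)
    tau = 1.0
    for lag in range(1, max_lag):
        r = np.dot(x[:-lag], x[lag:]) / (len(x) * var)
        if r < 0:            # simple truncation rule to stop summing noise
            break
        tau += 2.0 * r
    return tau

tau = autocorrelation_time(draws)
print(tau)                   # close to (1 + 0.8) / (1 - 0.8) = 9
# time for one effective draw = tau * elapsed_time_for_n_draws / n
```

A chain with tau = 9 thus delivers one effective draw per nine posterior draws, which is exactly the quantity compared across the two algorithms in Table 1.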
Break in ω: ω_1 = 0.1, ω_2 = 0.7 (columns: increasing persistence)

90% CI of the difference between elapsed times for one effective draw
CP   [-0.23;0.02]    [-0.54;0.02]    [-0.75;0.01]   [-1.10;0.01]
MS   [-0.85;0.15]    [-1.05;0.01]    [-1.19;0.05]   [-1.35;0.29]

90% CI of the difference between autocorrelation times
CP   [-20.91;1.06]   [-49.03;1.65]   [-68.34;0.75]  [-90.24;0.11]
MS   [-79.05;6.55]   [-96.63;-1.10]  [ ;-1.08]      [ ;10.15]

Table 1: The differences are computed as Kl-MH minus HMP-MH, where Kl-MH denotes Klaassen's Metropolis-Hastings and HMP-MH stands for the Haas, Mittnik and Paolella Metropolis-Hastings. A negative value provides evidence in favor of the Kl-MH algorithm.

Although almost all confidence intervals include zero, the distributions are skewed to the left whatever the persistence and the number of breaks. The left limit also moves further away from zero as the persistence and/or the number of switches grows, while no systematic rule can be read from the table for the right limit. These two observations lead us to believe that the Klaassen model provides a better approximation of the MS-GARCH model than the specification of Haas, Mittnik, and Paolella (2004). Further evidence is given in the simulation exercise of the paper (see subsection 4.3, Comparison with other algorithms). The result is not surprising, since the former model keeps track of the preceding variances when a switch in the state occurs. Despite these comments, the HMP model could still be preferred in the IHMM context. Indeed, the beam sampler combined with the Klaassen model somewhat complicates the sampling of the state vector. These computational difficulties do not arise with the HMP model.

2 Estimation of the spline-GARCH parameters by SMC sampler

The SMC sampler discretely approximates an artificial sequence of distributions {π_n}_{n=1}^p by sequential importance sampling. Let x = {α, β, ω_0, ..., ω_{k+1}} denote the set containing all the spline-GARCH parameters.
The artificial sequence of distributions is obtained by introducing an increasing function φ : {1, ..., p} → [0, 1] with φ(1) = 0 and φ(p) = 1, such that

π_n(x | Y_T) ∝ f(Y_T | x)^{φ(n)} f(x),
where f(x) denotes the prior density evaluated at x. Note that when n = p, the distribution π_p coincides with the posterior distribution of interest. Conversely, when n = 1, π_1 is equal to the prior distribution (if the latter is proper). In the paper of Del Moral, Doucet, and Jasra (2006), an SMC algorithm is provided for sequentially approximating each distribution in the artificial sequence. As the function φ is user-defined, one can choose a function that increases smoothly, so that at each iteration of the SMC the approximation of the previous target distribution remains close to the current one.

The algorithm operates as follows. First, sample N draws {x_1^i}_{i=1}^N from the prior distribution and attach uniform weights {W_1^i = 1/N}_{i=1}^N. Then, from n = 2 until n = p, apply steps 1-3:

1. Correction step: for each i ∈ [1, N], re-weight the particle with respect to the nth posterior distribution:
   w_n^i = f(Y_T | x_{n-1}^i)^{φ(n) - φ(n-1)}.
   Normalize the particle weights:
   W_n^i = W_{n-1}^i w_n^i / Σ_{j=1}^N W_{n-1}^j w_n^j.

2. Re-sampling step: compute the Effective Sample Size (ESS) as
   ESS = [ Σ_{i=1}^N (W_n^i)^2 ]^{-1}.
   If ESS < 3N/4, re-sample the particles and reset the weights uniformly.

3. Mutation step: run J steps of an MCMC kernel with invariant distribution π_n(x_n | Y_T) for each particle in the system.

At the end of the procedure, an estimate of the marginal likelihood is given by

   Π_{n=2}^p Σ_{i=1}^N W_{n-1}^i w_n^i.

The SMC sampler contains many user-defined parameters. As MCMC kernel, we use a random block strategy as in Chib and Ramamurthy (2010) combined with a Metropolis update. The covariance matrix of the proposal distribution, which is normal, is directly derived from the particles. The number of MCMC moves J is set to 10.
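A minimal sketch of steps 1-3 on a toy conjugate Gaussian model, not the spline-GARCH model itself: the prior and likelihood, the linear tempering schedule φ, and the plain random-walk Metropolis mutation (in place of the tailored random-block kernel described above) are all illustrative simplifications. Because the toy model is conjugate, the SMC marginal-likelihood estimate can be checked against the exact value.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: prior x ~ N(0, tau^2), likelihood y_t ~ N(x, sigma^2), T = 50.
tau, sigma = 2.0, 1.0
y = rng.normal(1.5, sigma, size=50)

def log_lik(xs):
    return (-0.5 * len(y) * np.log(2 * np.pi * sigma**2)
            - 0.5 * ((y[None, :] - xs[:, None]) ** 2).sum(axis=1) / sigma**2)

def log_prior(xs):
    return -0.5 * np.log(2 * np.pi * tau**2) - 0.5 * xs**2 / tau**2

N, p, J = 1000, 30, 10
phi = np.linspace(0.0, 1.0, p)            # phi(1) = 0, phi(p) = 1
xs = rng.normal(0.0, tau, size=N)          # N draws from the prior
W = np.full(N, 1.0 / N)                    # uniform initial weights
log_ml = 0.0

for n in range(1, p):
    # 1. Correction: incremental weights w_n^i = f(y | x_{n-1}^i)^{phi(n)-phi(n-1)}
    lw = (phi[n] - phi[n - 1]) * log_lik(xs)
    m = lw.max()
    w = np.exp(lw - m)
    log_ml += m + np.log(np.sum(W * w))    # running log of prod_n sum_i W w
    W = W * w
    W /= W.sum()
    # 2. Re-sampling when ESS drops below 3N/4
    if 1.0 / np.sum(W ** 2) < 0.75 * N:
        idx = rng.choice(N, size=N, p=W)
        xs, W = xs[idx], np.full(N, 1.0 / N)
    # 3. Mutation: J Metropolis steps targeting pi_n, scale from the particles
    s = np.std(xs) + 1e-12
    for _ in range(J):
        prop = xs + rng.normal(0.0, s, size=N)
        log_acc = (phi[n] * log_lik(prop) + log_prior(prop)
                   - phi[n] * log_lik(xs) - log_prior(xs))
        accept = np.log(rng.uniform(size=N)) < log_acc
        xs = np.where(accept, prop, xs)

# Exact log marginal likelihood for this conjugate Gaussian toy model.
post_var = 1.0 / (1.0 / tau**2 + len(y) / sigma**2)
exact = (-0.5 * len(y) * np.log(2 * np.pi * sigma**2)
         - 0.5 * np.log(tau**2 / post_var)
         - 0.5 * (np.sum(y**2) / sigma**2 - post_var * (np.sum(y) / sigma**2) ** 2))
print(log_ml, exact)   # the SMC estimate should be close to the exact value
```

The running product of weighted incremental weights is accumulated in log space for numerical stability; this is the same marginal-likelihood estimator as the formula above, just computed stage by stage.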
Finally, the function φ is adapted on the fly, as proposed by Jasra, Stephens, Doucet, and Tsagaris (2011).

3 The sticky infinite hidden Markov model

The sticky infinite hidden Markov model is based on Dirichlet processes and hierarchical Dirichlet processes. The second section of the paper (see Model definition) defines the Dirichlet process and its stick-breaking representation. We go further by deepening some properties of the Dirichlet process and by reviewing the concepts of the hierarchical Dirichlet process and the sticky infinite hidden Markov model.

3.1 More on the Dirichlet process

Consider the posterior distribution of G given n i.i.d. draws {θ_1, ..., θ_n} from G itself (recall that G is a distribution over Θ). Using the fundamental relation of the Dirichlet process (see equation (1) in the paper), Bayes' theorem and the conjugacy of the multinomial distribution with the Dirichlet distribution, it can be shown that

G | θ_1, ..., θ_n ~ DP( η + n, (η/(η + n)) G_0 + (1/(η + n)) Σ_{i=1}^n δ_{θ_i} )   (1)

where δ_θ denotes the probability measure concentrated at θ. Considering the partition {θ_1, ..., θ_n, Θ \ {θ_1, ..., θ_n}} and using equation (1) in the paper together with (1), we directly have the relation:

(G(θ_1), ..., G(θ_n), G(Θ \ {θ_1, ..., θ_n})) | θ_1, ..., θ_n ~ Dir(1, ..., 1, η).   (2)

The above equation highlights the discrete nature of the distribution G, since the probability of observing an already drawn θ is greater than zero. Moreover, it also emphasizes that the expected probability of drawing a new element θ ≠ θ_i, i ∈ [1, n], is equal to η/(η + n). To provide more intuition on the Dirichlet process, the predictive distribution of θ_{n+1} | θ_1, ..., θ_n can be derived as follows:
f(θ_{n+1} | θ_1, ..., θ_n) = E( G(θ_{n+1}) | θ_1, ..., θ_n ), so that

θ_{n+1} | θ_1, ..., θ_n ~ (1/(η + n)) ( η G_0 + Σ_{i=1}^n δ_{θ_i} )   (from eq. (2))

The Pólya urn metaphor (see Blackwell and MacQueen (1973)) helps in interpreting the last equality. Consider that each possible element θ ∈ Θ is associated with a ball of a specific color and that all these balls are gathered in an urn. The predictive scheme iterates as follows. At the beginning, we randomly pick a ball from the urn (i.e. a draw from G_0) and identify its color. The ball is then dropped back into the urn. Before proceeding to the next step, we put a new ball of the corresponding color into a second, initially empty urn. At iteration n, we randomly choose a color from the initial urn (i.e. a draw from G_0) with probability η/(η + n - 1), or from the second urn otherwise. Since the second urn only contains balls with already observed colors, the probability of getting a new color (i.e. a new element of Θ) keeps decreasing as the scheme evolves.

From equation (2), we can also derive the expected number of elements sampled from G_0 (denoted hereafter by m) given the number of draws n. At iteration n + 1, the probability of sampling θ_{n+1} from G_0 is equal to η/(η + n), which gives

E(m | n) = Σ_{i=1}^n η/(η + i - 1),    lim_{n→∞} E(m | n) = lim_{n→∞} η(log n + C),

where C is the Euler-Mascheroni constant, approximately equal to 0.5772. The expected number of distinct elements (which will denote different regimes in the GARCH model) is therefore far smaller than the number of observations. Notice that the concentration parameter η has a direct impact on the number of distinct elements.

3.2 The Hierarchical Dirichlet process

The infinite hidden Markov model assumes an infinite number of states.
It models a doubly stochastic Markov chain in which a sequence of multinomial states {s_1, ..., s_T} is linked via a state transition matrix and, given this unobserved dynamic and the set of parameters it refers to, each element y_t of the observed sequence {y_1, ..., y_T} is drawn from a
parametric distribution. The structure should therefore ensure that, whatever the path followed, reaching a specific state always refers to the same model parameter. For instance, state one should always be related to the same parameter Θ_1. The hierarchical Dirichlet process has been designed for this purpose. The hyper-parameters of the hierarchical Dirichlet process (HDP, Teh, Jordan, Beal, and Blei (2006)) consist of the base distribution G_0 and the concentration parameters η ∈ R_+ and λ ∈ R_+. The HDP is defined as follows:

G | η, G_0 ~ DP(η, G_0)    and    G_j | λ, G ~ DP(λ, G),   j = 1, ..., n.

Hence, given G, the measures G_j and G_i are independent for i ≠ j. As G is a random probability measure over Θ (the support of the base distribution G_0), the hierarchical process defines a set of random probability measures G_j, one for each group, over Θ. The stick-breaking representation of an HDP can be formulated as follows:

G = Σ_{k=1}^∞ π_k δ_{Θ_k}    and    G_j = Σ_{k=1}^∞ p_{jk} δ_{Θ_k},   j = 1, ..., n,

where Θ_k ~ G_0 and π = {π_k}_{k=1}^∞ ~ Stick(η) are mutually independent, δ_{Θ_k} is the probability measure concentrated at Θ_k, and {p_{jk}}_{k=1}^∞ | λ, π ~ DP(λ, π) (as shown in Teh, Jordan, Beal, and Blei (2006)). Notice that, by definition of the DP, each G_j (j ∈ {1, ..., n}) has the same support, which is the support of G. This property of the HDP is essential for developing an infinite hidden Markov model.

The hidden Markov-switching model is driven by two stochastic processes. On the one hand, a Markov chain determines a discrete state vector {s_1, ..., s_T}; on the other hand, the observations follow a specific distribution conditional on the state vector and the parameters of each regime (y_t | s_t, {Θ_k}_{k=1}^∞ ~ F(Θ_{s_t})). The hierarchical Dirichlet process can build this kind of structure with an infinite number of states (and of Dirichlet processes), and it is briefly stated in Table 2.
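This construction can also be illustrated numerically with a truncated (weak-limit) stick-breaking sketch: under a truncation at K atoms, each transition row p_j ~ DP(λ, π) is approximated by a finite Dirichlet with parameters λπ. The truncation level K and the values of η and λ below are arbitrary toy choices, not those used in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

eta, lam, K = 3.0, 5.0, 25       # toy concentration parameters, truncation level

# Stick-breaking draw of the shared weights: pi ~ Stick(eta), truncated at K.
v = rng.beta(1.0, eta, size=K)
pi = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
pi[-1] = 1.0 - pi[:-1].sum()      # lump the remaining stick mass into the last atom
pi = np.maximum(pi, 1e-6)         # guard against numerically vanishing weights
pi /= pi.sum()

# Weak-limit approximation: each row p_j ~ DP(lam, pi) becomes Dirichlet(lam * pi),
# so every row of the transition matrix shares the atoms (states) of pi.
P = rng.dirichlet(lam * pi, size=K)

# Simulate a state sequence from the resulting transition matrix.
T = 500
s = np.empty(T, dtype=int)
s[0] = rng.choice(K, p=pi)
for t in range(1, T):
    s[t] = rng.choice(K, p=P[s[t - 1]])

print(pi.sum(), P.sum(axis=1))    # both are probability vectors (sum to 1)
print(len(np.unique(s)))          # typically far fewer than K states are visited
```

Sharing the atoms of π across all rows is precisely the HDP property emphasized above: whatever the path followed, state k always refers to the same parameter Θ_k.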
1. Dirichlet process:
   G = Σ_{k=1}^∞ π_k δ_{Θ_k} ~ DP(η, G_0)
   π ~ Stick(η)                          Stick-breaking representation of the Dirichlet process
   Θ_k ~ G_0                             Θ_k: parameters of the model related to state k
2. Hierarchical Dirichlet processes:
   G_j | G = Σ_{k=1}^∞ p_{jk} δ_{Θ_k} ~ DP(λ, G)
   p_j = {p_{jk}}_{k=1}^∞ ~ DP(λ, π)     Each row of the transition matrix is driven by a DP
3. Markov-switching model:
   s_t | s_{t-1}, {p_j}_{j=1}^∞ ~ p_{s_{t-1}}   First-order Markovian with transition matrix {p_j}_{j=1}^∞
   y_t | s_t, {Θ_k}_{k=1}^∞ ~ F(Θ_{s_t})        Each state shares the same support (of G_0)

Table 2: Infinite hidden Markov model (IHMM)

3.3 The sticky parameter

Persistence of regimes is a well-known stylized fact of time series. However, the IHMM transition matrix does not exhibit any persistence (i.e. E[p_{jk} | λ, π] = π_k for all j; see Table 2). The IHMM transition in fact does not distinguish between a self-transition and a transition to another state, an unrealistic feature for time series. Fox, Sudderth, Jordan, and Willsky (2011) have developed an IHMM framework which rules out posteriors placing high probability on rapid switching. They call it the sticky HDP-HMM or the sticky IHMM. They specify a new parameter κ for the self-transition bias and set a separate prior on this parameter. Their specification is as follows:

π | η ~ Stick(η)
p_j | λ, π, κ ~ DP( λ + κ, (λπ + κδ_j)/(λ + κ) ),   j = 1, ..., n.

An amount κ > 0 is added to the jth component of the (infinite) vector λπ. The new parameter implies a higher probability of staying in the same state in the next period than under the original model (i.e. E[p_{kk} | λ, κ, π] = (λπ_k + κ)/(λ + κ)). Note that if κ = 0, we are back to the former specification (i.e. the original IHMM of Table 2).

References

Blackwell, D., and J. MacQueen (1973): "Ferguson distributions via Pólya urn schemes," Annals of Statistics, 1.
Chib, S., and S. Ramamurthy (2010): "Tailored randomized block MCMC methods with application to DSGE models," Journal of Econometrics, 155(1).

Del Moral, P., A. Doucet, and A. Jasra (2006): "Sequential Monte Carlo samplers," Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68.

Fox, E., E. Sudderth, M. Jordan, and A. Willsky (2011): "A Sticky HDP-HMM with Application to Speaker Diarization," Annals of Applied Statistics, 5(2A).

Geyer, C. J. (1992): "Practical Markov Chain Monte Carlo," Statistical Science, 7(4).

Haas, M., S. Mittnik, and M. Paolella (2004): "A New Approach to Markov-Switching GARCH Models," Journal of Financial Econometrics, 2.

Jasra, A., D. A. Stephens, A. Doucet, and T. Tsagaris (2011): "Inference for Lévy-Driven Stochastic Volatility Models via Adaptive Sequential Monte Carlo," Scandinavian Journal of Statistics, 38.

Klaassen, F. (2002): "Improving GARCH volatility forecasts with regime-switching GARCH," Empirical Economics, 27(2).

Teh, Y., M. Jordan, M. Beal, and D. M. Blei (2006): "Hierarchical Dirichlet Processes," Journal of the American Statistical Association, 101.
More informationSlice Sampling Mixture Models
Slice Sampling Mixture Models Maria Kalli, Jim E. Griffin & Stephen G. Walker Centre for Health Services Studies, University of Kent Institute of Mathematics, Statistics & Actuarial Science, University
More informationThe Jackknife-Like Method for Assessing Uncertainty of Point Estimates for Bayesian Estimation in a Finite Gaussian Mixture Model
Thai Journal of Mathematics : 45 58 Special Issue: Annual Meeting in Mathematics 207 http://thaijmath.in.cmu.ac.th ISSN 686-0209 The Jackknife-Like Method for Assessing Uncertainty of Point Estimates for
More informationSession 5B: A worked example EGARCH model
Session 5B: A worked example EGARCH model John Geweke Bayesian Econometrics and its Applications August 7, worked example EGARCH model August 7, / 6 EGARCH Exponential generalized autoregressive conditional
More informationChapter 12 PAWL-Forced Simulated Tempering
Chapter 12 PAWL-Forced Simulated Tempering Luke Bornn Abstract In this short note, we show how the parallel adaptive Wang Landau (PAWL) algorithm of Bornn et al. (J Comput Graph Stat, to appear) can be
More informationEvolutionary Clustering by Hierarchical Dirichlet Process with Hidden Markov State
Evolutionary Clustering by Hierarchical Dirichlet Process with Hidden Markov State Tianbing Xu 1 Zhongfei (Mark) Zhang 1 1 Dept. of Computer Science State Univ. of New York at Binghamton Binghamton, NY
More informationBayesian Inference for DSGE Models. Lawrence J. Christiano
Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.
More informationGaussian kernel GARCH models
Gaussian kernel GARCH models Xibin (Bill) Zhang and Maxwell L. King Department of Econometrics and Business Statistics Faculty of Business and Economics 7 June 2013 Motivation A regression model is often
More informationSAMPLING ALGORITHMS. In general. Inference in Bayesian models
SAMPLING ALGORITHMS SAMPLING ALGORITHMS In general A sampling algorithm is an algorithm that outputs samples x 1, x 2,... from a given distribution P or density p. Sampling algorithms can for example be
More informationMultilevel Statistical Models: 3 rd edition, 2003 Contents
Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction
More informationText Mining for Economics and Finance Latent Dirichlet Allocation
Text Mining for Economics and Finance Latent Dirichlet Allocation Stephen Hansen Text Mining Lecture 5 1 / 45 Introduction Recall we are interested in mixed-membership modeling, but that the plsi model
More informationSharing Clusters Among Related Groups: Hierarchical Dirichlet Processes
Sharing Clusters Among Related Groups: Hierarchical Dirichlet Processes Yee Whye Teh (1), Michael I. Jordan (1,2), Matthew J. Beal (3) and David M. Blei (1) (1) Computer Science Div., (2) Dept. of Statistics
More informationGraphical Models and Kernel Methods
Graphical Models and Kernel Methods Jerry Zhu Department of Computer Sciences University of Wisconsin Madison, USA MLSS June 17, 2014 1 / 123 Outline Graphical Models Probabilistic Inference Directed vs.
More informationSequential Monte Carlo Samplers for Applications in High Dimensions
Sequential Monte Carlo Samplers for Applications in High Dimensions Alexandros Beskos National University of Singapore KAUST, 26th February 2014 Joint work with: Dan Crisan, Ajay Jasra, Nik Kantas, Alex
More informationGenerative Models and Stochastic Algorithms for Population Average Estimation and Image Analysis
Generative Models and Stochastic Algorithms for Population Average Estimation and Image Analysis Stéphanie Allassonnière CIS, JHU July, 15th 28 Context : Computational Anatomy Context and motivations :
More informationAn introduction to Sequential Monte Carlo
An introduction to Sequential Monte Carlo Thang Bui Jes Frellsen Department of Engineering University of Cambridge Research and Communication Club 6 February 2014 1 Sequential Monte Carlo (SMC) methods
More informationHmms with variable dimension structures and extensions
Hmm days/enst/january 21, 2002 1 Hmms with variable dimension structures and extensions Christian P. Robert Université Paris Dauphine www.ceremade.dauphine.fr/ xian Hmm days/enst/january 21, 2002 2 1 Estimating
More informationImage segmentation combining Markov Random Fields and Dirichlet Processes
Image segmentation combining Markov Random Fields and Dirichlet Processes Jessica SODJO IMS, Groupe Signal Image, Talence Encadrants : A. Giremus, J.-F. Giovannelli, F. Caron, N. Dobigeon Jessica SODJO
More informationUnsupervised Learning
Unsupervised Learning Bayesian Model Comparison Zoubin Ghahramani zoubin@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc in Intelligent Systems, Dept Computer Science University College
More informationSequential Monte Carlo samplers for Bayesian DSGE models
Sequential Monte Carlo samplers for Bayesian DSGE models Drew Creal First version: February 8, 27 Current version: March 27, 27 Abstract Dynamic stochastic general equilibrium models have become a popular
More informationBayesian Nonparametric Hidden Semi-Markov Models
Journal of Machine Learning Research 14 (2013) 673-701 Submitted 12/11; Revised 9/12; Published 2/13 Bayesian Nonparametric Hidden Semi-Markov Models Matthew J. Johnson Alan S. Willsky Laboratory for Information
More informationStat 535 C - Statistical Computing & Monte Carlo Methods. Lecture February Arnaud Doucet
Stat 535 C - Statistical Computing & Monte Carlo Methods Lecture 13-28 February 2006 Arnaud Doucet Email: arnaud@cs.ubc.ca 1 1.1 Outline Limitations of Gibbs sampling. Metropolis-Hastings algorithm. Proof
More informationKernel Sequential Monte Carlo
Kernel Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) * equal contribution April 25, 2016 1 / 37 Section
More informationHastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model
UNIVERSITY OF TEXAS AT SAN ANTONIO Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model Liang Jing April 2010 1 1 ABSTRACT In this paper, common MCMC algorithms are introduced
More informationSequential Monte Carlo Methods
University of Pennsylvania Econ 722 Part 1 February 13, 2019 Introduction Posterior expectations can be approximated by Monte Carlo averages. If we have draws from {θ s } N s=1 from p(θ Y ), then (under
More informationSequential Monte Carlo samplers for Bayesian DSGE models
Sequential Monte Carlo samplers for Bayesian DSGE models Drew Creal Department of Econometrics, Vrije Universitiet Amsterdam, NL-8 HV Amsterdam dcreal@feweb.vu.nl August 7 Abstract Bayesian estimation
More informationSimple approximate MAP inference for Dirichlet processes mixtures
Vol. 0 (2015) 1 8 Simple approximate MAP inference for Dirichlet processes mixtures Yordan P. Raykov Aston University e-mail: yordan.raykov@gmail.com Alexis Boukouvalas University of Manchester e-mail:
More informationAfternoon Meeting on Bayesian Computation 2018 University of Reading
Gabriele Abbati 1, Alessra Tosi 2, Seth Flaxman 3, Michael A Osborne 1 1 University of Oxford, 2 Mind Foundry Ltd, 3 Imperial College London Afternoon Meeting on Bayesian Computation 2018 University of
More informationBayesian Regression Linear and Logistic Regression
When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we
More informationMonte Carlo Methods. Leon Gu CSD, CMU
Monte Carlo Methods Leon Gu CSD, CMU Approximate Inference EM: y-observed variables; x-hidden variables; θ-parameters; E-step: q(x) = p(x y, θ t 1 ) M-step: θ t = arg max E q(x) [log p(y, x θ)] θ Monte
More informationBayesian Nonparametric Regression for Diabetes Deaths
Bayesian Nonparametric Regression for Diabetes Deaths Brian M. Hartman PhD Student, 2010 Texas A&M University College Station, TX, USA David B. Dahl Assistant Professor Texas A&M University College Station,
More information