Markov Chain Monte Carlo
1 Markov Chain Monte Carlo
Jamie Monogan, University of Georgia, Spring 2013
For more information, including R programs, properties of Markov chains, and Metropolis-Hastings, please see:
Jamie Monogan (UGA) Markov Chain Monte Carlo Spring 2013 / 40
2 Objectives
By the end of this meeting, participants should be able to:
Use WinBUGS to estimate a model.
Describe the properties of Markov chains.
Explain the algorithm behind the Gibbs sampler and apply it to data analysis.
3 Using WinBUGS
4 Specifying Models with BUGS
BUGS vocabulary:
node: values and variables in the model, specified by the researcher.
parent: a node with direct influence on other nodes.
descendant: the opposite of a parent node, though a descendant can also itself be a parent.
constant: a founder node; constants are fixed and have no parents.
stochastic: a node modelled as a random variable (parameters or data).
deterministic: a node that is a logical consequence of other nodes.
5 Linear BUGS Example
Consider the following economic data from the Organization for Economic Cooperation and Development (OECD). They highlight the relationship between commitment to employment protection, measured on an interval scale (0 to 4) indicating the quantity and extent of national legislation to protect jobs, and the difference in total factor productivity growth rates between two periods (see The Economist, September 23, 2000 for a discussion).
6 Linear BUGS Example (cont.)
[Table of employment protection (Prot.) and productivity (Prod.) values for 18 countries: United States, Canada, Australia, New Zealand, Ireland, Denmark, Finland, Austria, Belgium, Japan, Sweden, Netherlands, France, Germany, Greece, Portugal, Italy, Spain. The numeric entries were lost in transcription.]
7 Linear BUGS Example (cont.)
We know from Gauss-Markov theory that the posterior distribution of both the intercept and the slope coefficient is Student's t with n - k - 1 = 17 degrees of freedom. So why are we running BUGS on a linear model? Consider how different the estimation process is here:

β̂ = (X'X)⁻¹X'y

versus the Gibbs alternation

α[1] ~ f(α | β[0]),  β[1] ~ f(β | α[1])
α[2] ~ f(α | β[1]),  β[2] ~ f(β | α[2])
...
α[m] ~ f(α | β[m-1]),  β[m] ~ f(β | α[m]).
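The closed-form half of this comparison is easy to verify numerically. Below is a minimal Python/numpy sketch of the Gauss-Markov estimator; the data values are invented for illustration and are not the OECD numbers from the slides.

```python
import numpy as np

# Hypothetical data standing in for the OECD values:
# x is employment protection, y is productivity growth.
x = np.array([0.2, 0.5, 1.1, 1.6, 2.3, 2.9, 3.4, 3.8])
y = np.array([1.4, 1.2, 1.1, 0.9, 0.6, 0.5, 0.2, 0.1])

# Design matrix with an intercept column.
X = np.column_stack([np.ones_like(x), x])

# The closed form: beta_hat = (X'X)^{-1} X'y, computed via a linear solve
# rather than an explicit matrix inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # [intercept, slope]
```

The Gibbs route, by contrast, never computes this expression; it arrives at comparable point estimates by averaging over the chain's draws.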
8 Linear BUGS Example (cont.)
First define the statistical structure of the model:

mu[i] <- alpha + beta*x[i];
y[i] ~ dnorm(mu[i], tau);

Note that we are indexing across the data here (not chaining!). Now define the variables in the model and their distributional assumptions:

alpha ~ dnorm(0.0, );
beta ~ dnorm(0.0, );
tau ~ dgamma(0.1, 0.1);

The second normal parameter is a precision, not a variance, by convention.
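The precision convention is worth internalizing: dnorm(mu, tau) in BUGS means a normal with variance 1/tau, so tau = 4 implies a standard deviation of 0.5, not 0.25. A quick Python check of what the convention implies (the value tau = 4 is an arbitrary choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
tau = 4.0                    # precision, as in BUGS's dnorm(mu, tau)
sigma = 1.0 / np.sqrt(tau)   # implied standard deviation: 0.5

# Draw many values from N(0, 1/tau) and confirm the empirical spread.
draws = rng.normal(loc=0.0, scale=sigma, size=200_000)
print(draws.std())  # close to 0.5, not 1/tau = 0.25
```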
9 Linear BUGS Example (cont.)
Data are handled in a data statement for BUGS:

data x, y in oecd.dat;
inits in oecd.in;

and generally at the bottom of the source file for WinBUGS:

list(x = c( , , , , , , , , , , , , , , , , , ),
     y = c(0.5, 0.6, 1.3, 0.4, 0.1, 0.9, 0.7, 0.1, 0.4, 0.4,
           -0.5, -0.6, -0.9, -0.2, -0.3, -0.3, -0.3, -1.5),
     N = 18)
list(alpha = 0.0, beta = 0.0, tau = 1.0)

Note the R-like handling of data.
10 Linear BUGS Example (cont.)
Looping through the data (which can be much more complex) is also done in a very R-like manner:

for (i in 1:N) {
   mu[i] <- alpha + beta*x[i];
   y[i] ~ dnorm(mu[i], tau);
}

Programming notes:
In Unix all statements must end with ";"; this is not true in WinBUGS.
The order of the distributional statements and the logical looping does not matter.
The production of chain values is not stipulated by the user in the code.
It is good practice to use defined constants to size vectors and matrices (N = 18 here) rather than embed integers in var definitions.
11 Linear BUGS Example (cont.)

model oecd;
{
   for (i in 1:N) {
      mu[i] <- alpha + beta*x[i];
      y[i] ~ dnorm(mu[i], tau);
   }
   alpha ~ dnorm(0.0, );
   beta ~ dnorm(0.0, );
   tau ~ dgamma(0.1, 0.1);
}
list(x = c( , , , , , , , , , , , , , , , , , ),
     y = c(0.5, 0.6, 1.3, 0.4, 0.1, 0.9, 0.7, 0.1, 0.4, 0.4,
           -0.5, -0.6, -0.9, -0.2, -0.3, -0.3, -0.3, -1.5),
     N = 18);
list(alpha = 0.0, beta = 0.0, tau = 1.0);
12 Linear BUGS Example (cont.)
The next steps are to compile the model in BUGS and run the chain, recording values. The following steps are given for the Unix implementation, where each command corresponds to a specific button in WinBUGS.
Compile:

Bugs> compile("oecd.bug")

Run the chain for a burn-in period:

Bugs> update(10000)
time for updates was 00:00:01
13 Linear BUGS Example (cont.)
Turn on chain value recording:

Bugs> monitor(alpha)
Bugs> monitor(beta)

Run the chain for a much longer series of values:

Bugs> update(50000)
time for updates was 00:00:05

Ask for summary statistics:

Bugs> stats(alpha)
Bugs> stats(beta)
14 Linear BUGS Example (cont.)
Using the posterior mean as a point estimate, we can compare with lm:
[Table comparing the OLS model (estimate, std. error) with the MCMC posterior (mean, std. dev.) for the intercept and slope, plus the number of observations; the numeric entries were lost in transcription.]
15 Details on WinBUGS
Minor stuff:
WinBUGS has lots of bells and whistles to explore, such as running the model straight from the doodle.
The data from any plot can be recovered by double-clicking on it.
Setting the seed may be important to you: leaving the seed as-is exactly replicates chains.
Other features: encoding/doodling documents; fonts/colors in documents; log files to summarize time and errors; fancy windowing schemes.
Note that many WinBUGS programs are saved as .odc files, an omnibus format. Full programs can also be saved as plain text.
16 Specification Tool Window
check model: checks the syntax of your code.
load data: loads data from the same or another file.
num of chains: sets the number of parallel chains to run.
compile: compiles your code as specified.
load inits: loads the starting values for the chain(s).
gen inits: lets WinBUGS specify initial values.
17 Update Window
updates: you specify the number of chain iterations to run this cycle.
refresh: the number of updates between screen redraws for traceplots and other displays.
update: hit this button to begin iterations.
thin: the number of values to thin out of the chain between saved values.
18 Update Window (cont.)
iteration: current status of iterations, by the UPDATE parameter.
over relax: click the box for an option that generates multiple samples at each cycle and picks the sample with the greatest negative correlation to the current value. This trades cycle time for mixing quality.
19 Update Window (cont.)
adapting: this box is clicked automatically while the Metropolis or slice-sampling algorithm (the latter uses intentionally introduced auxiliary variables to improve convergence and mixing) is still tuning its optimization parameters (the first 4000 and 500 iterations, respectively). Other options are greyed out during this period.
20 Sampling Window
node: sets each node of interest for monitoring; type the name and click SET for each variable of interest. Enter * in the window when you are done to do a full monitor.
chains: 1 to 10; sets subsets of chains to monitor if multiple chains are being run.
21 Sampling Window (cont.)
beg, end: the beginning and ending chain values to be monitored. BEG is 1 unless you know the burn-in period.
thin: yet another opportunity to thin the chain.
clear: clears a node from being monitored.
22 Sampling Window (cont.)
trace: dynamic traceplots for monitored nodes.
history: displays a traceplot for the complete history.
density: displays a kernel density estimate.
23 Sampling Window (cont.)
quantiles: displays the running mean with a running 95% CI by iteration number.
auto cor: plots of autocorrelations for each node at lags 1 to 50.
coda: displays the chain history in a window in CODA format; another window appears with CODA ordering information.
24 Sampling Window (cont.)
stats: summary statistics on each monitored node: mean, sd, MC error, current iteration value, starting point of the chain, and the percentiles from the PERCENTILES window.
Notes on stats: WinBUGS regularly provides both

naive SE = √(sample variance / n)

and

MC error = √(spectral density variance / n), the asymptotic SE.
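The gap between the two quantities shows up clearly on a deliberately sticky chain. The Python sketch below uses batch means as a simple stand-in for the spectral-density estimate WinBUGS reports; the AR(1) chain, its coefficient, and the batch size are all arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# A strongly autocorrelated "chain": AR(1) with coefficient 0.9.
n, rho = 50_000, 0.9
chain = np.empty(n)
chain[0] = 0.0
for t in range(1, n):
    chain[t] = rho * chain[t - 1] + rng.normal()

# Naive SE ignores autocorrelation entirely.
naive_se = chain.std(ddof=1) / np.sqrt(n)

# Batch-means MC error: split the chain into long batches and use the
# variability of the batch means (a crude stand-in for a spectral estimate).
b = 500
means = chain[: n // b * b].reshape(-1, b).mean(axis=1)
mc_error = means.std(ddof=1) / np.sqrt(len(means))

print(naive_se, mc_error)  # MC error is several times the naive SE here
```

For independent draws the two estimates would roughly agree; the inflation factor is what the chain's stickiness costs you.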
25 Theory
26 What is a Markov Chain?
A type of stochastic process that will help us estimate posterior quantities.
A stochastic process is a consecutive set of random quantities defined on some known state space, Θ, indexed so that the order is known: {θ[t] : t ∈ T}.
Frequently, but not necessarily, T is the set of positive integers, implying consecutive, evenly spaced time intervals: {θ[t=0], θ[t=1], θ[t=2], ...}.
A stochastic process must also be defined with respect to a state space, Θ, which identifies the range of possible values of θ. This state space is either discrete or continuous depending on how the variable of interest is measured.
27 What is a Markov Chain? (cont.)
A Markov chain is a stochastic process with the property that any specified state in the series, θ[t], is dependent only on the previous value of the chain, θ[t-1]. Therefore values are conditionally independent of all other previous values: θ[0], θ[1], ..., θ[t-2]. Formally:

P(θ[t] ∈ A | θ[0], θ[1], ..., θ[t-2], θ[t-1]) = P(θ[t] ∈ A | θ[t-1]),

where A is any identified set (an event or range of events) on the complete state space. (We will use this A notation extensively.)
Colloquially: a Markov chain wanders around the state space remembering only where it has been in the last period.
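A two-state simulation makes the property concrete: each draw below conditions only on the current state, never on the earlier history. The transition matrix is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(2)

# Row-stochastic transition matrix on the state space {0, 1}:
# row = current state, column = next state.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])

state, path = 0, [0]
for _ in range(10_000):
    # The next draw conditions only on the current state: the Markov property.
    state = rng.choice(2, p=P[state])
    path.append(state)

print(np.mean(path))  # long-run fraction of time spent in state 1
```

For this kernel the long-run fraction of time in state 1 settles near 3/7, the stationary probability implied by P, regardless of the starting state.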
28 What is a Markov Chain? (cont.)
This short-term memory property is very useful: when the chain eventually finds the region of the state space with the highest density, it will wander around there, producing a sample that is only modestly nonindependent. If this is the posterior region, then we can use these empirical values as legitimate posterior sample values.
Thus difficult posterior calculations can be done with MCMC by letting the chain wander around sufficiently long and then producing summary statistics from the recorded values. Sounds simple.
29 What is a Markov Chain? (cont.)
How does the Markov chain decide to move? Define the transition kernel, K, as a general mechanism for describing the probability of moving to some other specified state based on the current chain status.
K(θ, A) is a defined probability measure for all θ points in the state space to the set A ⊆ Θ. So K(θ, A) maps potential transition events to their probability of occurrence.
30 What is a Markov Chain? (cont.)
When the state space is discrete, K is a k × k matrix mapping for the k discrete elements in A, where each cell defines the probability of a state transition from the first term to all possible states:

P_A = [ p(θ1, θ1)  ...  p(θ1, θk) ]
      [    :                :     ]
      [ p(θk, θ1)  ...  p(θk, θk) ]

where the row indicates where the chain is at this period and the column indicates where the chain is going in the next period. Each matrix element is a well-behaved probability: p(θi, θj) ≥ 0 for all i, j ∈ A.
The rows of P_A sum to one and each defines a conditional PMF, since all the entries in a row are specified for the same starting value and cover each possible destination in the state space: for row i, Σ_{j=1}^{k} p(θi, θj) = 1.
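In code, the discrete kernel is literally a row-stochastic matrix, and K(θ, A) is a row sum over the states in A. The numbers below are an arbitrary illustrative kernel, not anything from the slides.

```python
import numpy as np

# A 3-state transition matrix; each row is a conditional PMF.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# Every row sums to one: each row is a proper conditional distribution.
print(P.sum(axis=1))  # [1. 1. 1.]

# K(theta_i, A): the probability of moving from state i into the set A,
# here A = {state 1, state 2} starting from state 0.
A = [1, 2]
K_0_A = P[0, A].sum()
print(K_0_A)  # 0.3 + 0.2 = 0.5
```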
31 What is a Markov Chain? (cont.)
When the state space is continuous, K is a conditional PDF: f(θ | θi), meaning a properly defined probability statement for all θ ∈ A, given some current state θi.
Continuous state space Markov chains have more involved theory, so it is often convenient to think about discrete Markov chains at first.
32 What is a Markov Chain? (cont.)
Transition probabilities between two selected states for arbitrary numbers of steps m can be calculated multiplicatively. The probability of transitioning from the state θi = x at time 0 to the state θj = y in exactly m steps is given by the multiplicative series:

p^m(θi[0] = x, θj[m] = y) = Σ_{θ1} Σ_{θ2} ... Σ_{θm-1} p(θi, θ1) p(θ1, θ2) ... p(θm-1, θj),

where the sums run over all possible paths and each product collects the transition probabilities along one path. So p^m(θi[0] = x, θj[m] = y) is also a stochastic transition matrix: it specifies the product of all the required intermediate steps, summed over all possible paths that reach y from x.
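For a discrete chain this sum over paths is exactly the (i, j) entry of the mth matrix power, which can be checked by brute-force enumeration. The 3-state kernel below is illustrative.

```python
import numpy as np
from itertools import product

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
m, i, j = 3, 0, 2   # probability of going from state 0 to state 2 in 3 steps

# Explicit sum over all intermediate paths theta_1, ..., theta_{m-1}.
total = 0.0
for mid in product(range(3), repeat=m - 1):
    states = (i, *mid, j)
    prob = 1.0
    for a, b in zip(states, states[1:]):
        prob *= P[a, b]   # multiply the one-step transition probabilities
    total += prob

# The same quantity from the m-th matrix power.
print(total, np.linalg.matrix_power(P, m)[i, j])  # the two agree
```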
33 Marginal Distributions
We want the marginal distribution at some mth step from the transition kernel. For the discrete case, the marginal distribution of the chain at the mth step is obtained by inserting the current value of the chain, θi[m], into the row of the transition kernel for the mth step, p^m:

π^m(θ) = [p^m(θ1), p^m(θ2), ..., p^m(θk)].

So the marginal distribution at the first step of a discrete Markov chain is given by:

π^1(θ) = p^1 π^0(θ),

where π^0 is the initial starting value assigned to the chain and p^1 = p is a transition matrix.
34 Marginal Distributions (cont.)
The marginal distribution at some (possibly distant) step for a given starting value is:

π^n = p π^{n-1} = p(p π^{n-2}) = p^2(p π^{n-3}) = ... = p^n π^0.

Since successive products of probabilities quickly result in lower probability values, the property above shows how Markov chains eventually forget their starting points.
The marginal distribution for the continuous case is only slightly more involved, since we cannot simply list the quantity as a vector:

π^m(θj) = ∫_Θ p(θ, θj) π^{m-1}(θ) dθ,

which is the marginal distribution of the chain, given that it is currently on point θj at step m.
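The forgetting of starting points can be watched directly: iterate the marginal update from two very different initial distributions and see them agree. The sketch below uses the row-vector convention π_n = π_{n-1} p (a transposed version of the slides' formula), with an illustrative 3-state kernel.

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

# Two very different starting distributions: mass entirely on state 0 or 2.
pi_a = np.array([1.0, 0.0, 0.0])
pi_b = np.array([0.0, 0.0, 1.0])

# Iterate the marginal update pi_n = pi_{n-1} P (row-vector convention).
for _ in range(100):
    pi_a = pi_a @ P
    pi_b = pi_b @ P

print(pi_a)
print(pi_b)  # the two marginals coincide: the chain forgot its start
```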
35 Properties Markov Chains May Possess (Some Good, Some Bad)
Homogeneity
Irreducibility
Recurrence
Harris recurrence
Stationarity
Periodicity
Ergodicity
36 The Gibbs Sampler
The Gibbs sampler is a transition kernel created by a series of full conditional distributions. It is a Markovian updating scheme based on conditional probability statements.
If the limiting distribution of interest is π(θ), where θ is a k-length vector of coefficients to estimate, then the objective is to produce a Markov chain that cycles through these conditional statements, moving toward and then around this distribution.
The set of full conditional distributions for θ is denoted Θ and defined by π(θ) = π(θi | θ-i) for i = 1, ..., k, where the notation θ-i indicates a specific parametric form from Θ without the θi coefficient.
37 The Gibbs Sampler (cont.)
Steps:
1. Choose starting values: θ[0] = [θ1[0], θ2[0], ..., θk[0]].
2. At the jth iteration, starting at j = 1, complete a single cycle by drawing values from the k distributions given by:

θ1[j] ~ π(θ1 | θ2[j-1], θ3[j-1], ..., θk-1[j-1], θk[j-1])
θ2[j] ~ π(θ2 | θ1[j], θ3[j-1], ..., θk-1[j-1], θk[j-1])
θ3[j] ~ π(θ3 | θ1[j], θ2[j], ..., θk-1[j-1], θk[j-1])
...
θk-1[j] ~ π(θk-1 | θ1[j], θ2[j], θ3[j], ..., θk[j-1])
θk[j] ~ π(θk | θ1[j], θ2[j], θ3[j], ..., θk-1[j])

3. Increment j and repeat until convergence.
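These steps can be sketched for the textbook case of a bivariate normal target with correlation ρ, where both full conditionals are known normal distributions. This is a standard illustration, not the OECD regression model from earlier, and ρ = 0.8 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(3)
rho = 0.8                    # correlation of the target bivariate normal
sd = np.sqrt(1 - rho ** 2)   # conditional standard deviation

n_iter, burn_in = 20_000, 2_000
theta1, theta2 = 0.0, 0.0    # step 1: starting values
draws = np.empty((n_iter, 2))

for j in range(n_iter):
    # Step 2: one full cycle of draws from the full conditionals,
    # each conditioning on the most recent value of the other parameter.
    theta1 = rng.normal(rho * theta2, sd)   # theta1 | theta2
    theta2 = rng.normal(rho * theta1, sd)   # theta2 | theta1
    draws[j] = theta1, theta2

# Step 3 in practice: discard the burn-in and summarize the rest.
posterior = draws[burn_in:]
print(posterior.mean(axis=0))            # near (0, 0)
print(np.corrcoef(posterior.T)[0, 1])    # near rho
```

The recorded draws behave like a (mildly autocorrelated) sample from the target, which is exactly how the BUGS-produced chains are used.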
38 Gibbs Sampler Theory
Properties of the Gibbs sampler:
Since the Gibbs sampler conditions only on values from the last iteration of its chain, it clearly has the Markovian property.
The Gibbs sampler has the true posterior distribution of the parameter vector as its limiting distribution:

θ[i] →d θ ~ π(θ) as i → ∞.

The Gibbs sampler is a homogeneous Markov chain: the consecutive probabilities are independent of n, the current length of the chain.
The Gibbs sampler converges at a geometric rate: the total variation distance between an arbitrary time and the point of convergence decreases at a geometric rate in time (t).
The Gibbs sampler is ergodic.
39 Comments on Burn-in
The burn-in period is the initial portion of the chain that is considered to be pre-stationarity. We run the chain for some time after the starting point and throw those values away.
Convergence assessment is essential (diagnostics to come).
It pays to be conservative in deciding the length of the burn-in period. There is no golden rule here.
40 For March 20
Present me with a map of the data for your final project.
From BCG p. 127, work exercise #4.
More informationWinter 2019 Math 106 Topics in Applied Mathematics. Lecture 9: Markov Chain Monte Carlo
Winter 2019 Math 106 Topics in Applied Mathematics Data-driven Uncertainty Quantification Yoonsang Lee (yoonsang.lee@dartmouth.edu) Lecture 9: Markov Chain Monte Carlo 9.1 Markov Chain A Markov Chain Monte
More informationMARKOV CHAIN MONTE CARLO
MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics, School of Public
More informationShortfalls of Panel Unit Root Testing. Jack Strauss Saint Louis University. And. Taner Yigit Bilkent University. Abstract
Shortfalls of Panel Unit Root Testing Jack Strauss Saint Louis University And Taner Yigit Bilkent University Abstract This paper shows that (i) magnitude and variation of contemporaneous correlation are
More informationBayesian Inference and Decision Theory
Bayesian Inference and Decision Theory Instructor: Kathryn Blackmond Laskey Room 2214 ENGR (703) 993-1644 Office Hours: Tuesday and Thursday 4:30-5:30 PM, or by appointment Spring 2018 Unit 6: Gibbs Sampling
More informationReminder of some Markov Chain properties:
Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent
More informationBrief introduction to Markov Chain Monte Carlo
Brief introduction to Department of Probability and Mathematical Statistics seminar Stochastic modeling in economics and finance November 7, 2011 Brief introduction to Content 1 and motivation Classical
More informationOn the Optimal Scaling of the Modified Metropolis-Hastings algorithm
On the Optimal Scaling of the Modified Metropolis-Hastings algorithm K. M. Zuev & J. L. Beck Division of Engineering and Applied Science California Institute of Technology, MC 4-44, Pasadena, CA 925, USA
More informationIntroduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016
Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An
More informationHamiltonian Monte Carlo
Hamiltonian Monte Carlo within Stan Daniel Lee Columbia University, Statistics Department bearlee@alum.mit.edu BayesComp mc-stan.org Why MCMC? Have data. Have a rich statistical model. No analytic solution.
More informationMCMC for Cut Models or Chasing a Moving Target with MCMC
MCMC for Cut Models or Chasing a Moving Target with MCMC Martyn Plummer International Agency for Research on Cancer MCMSki Chamonix, 6 Jan 2014 Cut models What do we want to do? 1. Generate some random
More informationTEORIA BAYESIANA Ralph S. Silva
TEORIA BAYESIANA Ralph S. Silva Departamento de Métodos Estatísticos Instituto de Matemática Universidade Federal do Rio de Janeiro Sumário Numerical Integration Polynomial quadrature is intended to approximate
More informationInference in Bayesian Networks
Andrea Passerini passerini@disi.unitn.it Machine Learning Inference in graphical models Description Assume we have evidence e on the state of a subset of variables E in the model (i.e. Bayesian Network)
More informationComputer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo
Group Prof. Daniel Cremers 11. Sampling Methods: Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee and Andrew O. Finley 2 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationBayesian Inference and MCMC
Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the
More informationMarkov Chain Monte Carlo (MCMC)
Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can
More informationSimulation. Where real stuff starts
1 Simulation Where real stuff starts ToC 1. What is a simulation? 2. Accuracy of output 3. Random Number Generators 4. How to sample 5. Monte Carlo 6. Bootstrap 2 1. What is a simulation? 3 What is a simulation?
More informationMarkov chain Monte Carlo
Markov chain Monte Carlo Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revised on April 24, 2017 Today we are going to learn... 1 Markov Chains
More informationLecture 8: The Metropolis-Hastings Algorithm
30.10.2008 What we have seen last time: Gibbs sampler Key idea: Generate a Markov chain by updating the component of (X 1,..., X p ) in turn by drawing from the full conditionals: X (t) j Two drawbacks:
More informationPart 6: Multivariate Normal and Linear Models
Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of
More informationMarkov Chains Handout for Stat 110
Markov Chains Handout for Stat 0 Prof. Joe Blitzstein (Harvard Statistics Department) Introduction Markov chains were first introduced in 906 by Andrey Markov, with the goal of showing that the Law of
More informationMarkov Chain Monte Carlo Inference. Siamak Ravanbakhsh Winter 2018
Graphical Models Markov Chain Monte Carlo Inference Siamak Ravanbakhsh Winter 2018 Learning objectives Markov chains the idea behind Markov Chain Monte Carlo (MCMC) two important examples: Gibbs sampling
More informationMonte Carlo Methods for Inference and Learning
Monte Carlo Methods for Inference and Learning Ryan Adams University of Toronto CIFAR NCAP Summer School 14 August 2010 http://www.cs.toronto.edu/~rpa Thanks to: Iain Murray, Marc Aurelio Ranzato Overview
More informationHomework 2. For the homework, be sure to give full explanations where required and to turn in any relevant plots.
Homework 2 1 Data analysis problems For the homework, be sure to give full explanations where required and to turn in any relevant plots. 1. The file berkeley.dat contains average yearly temperatures for
More informationMCMC Review. MCMC Review. Gibbs Sampling. MCMC Review
MCMC Review http://jackman.stanford.edu/mcmc/icpsr99.pdf http://students.washington.edu/fkrogsta/bayes/stat538.pdf http://www.stat.berkeley.edu/users/terry/classes/s260.1998 /Week9a/week9a/week9a.html
More informationBayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida
Bayesian Statistical Methods Jeff Gill Department of Political Science, University of Florida 234 Anderson Hall, PO Box 117325, Gainesville, FL 32611-7325 Voice: 352-392-0262x272, Fax: 352-392-8127, Email:
More informationReducing The Computational Cost of Bayesian Indoor Positioning Systems
Reducing The Computational Cost of Bayesian Indoor Positioning Systems Konstantinos Kleisouris, Richard P. Martin Computer Science Department Rutgers University WINLAB Research Review May 15 th, 2006 Motivation
More informationPart 8: GLMs and Hierarchical LMs and GLMs
Part 8: GLMs and Hierarchical LMs and GLMs 1 Example: Song sparrow reproductive success Arcese et al., (1992) provide data on a sample from a population of 52 female song sparrows studied over the course
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationLect4: Exact Sampling Techniques and MCMC Convergence Analysis
Lect4: Exact Sampling Techniques and MCMC Convergence Analysis. Exact sampling. Convergence analysis of MCMC. First-hit time analysis for MCMC--ways to analyze the proposals. Outline of the Module Definitions
More informationConvex Optimization CMU-10725
Convex Optimization CMU-10725 Simulated Annealing Barnabás Póczos & Ryan Tibshirani Andrey Markov Markov Chains 2 Markov Chains Markov chain: Homogen Markov chain: 3 Markov Chains Assume that the state
More informationGeneral Construction of Irreversible Kernel in Markov Chain Monte Carlo
General Construction of Irreversible Kernel in Markov Chain Monte Carlo Metropolis heat bath Suwa Todo Department of Applied Physics, The University of Tokyo Department of Physics, Boston University (from
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is
More informationMarkov chain Monte Carlo General Principles
Markov chain Monte Carlo General Principles Aaron A. King May 27, 2009 Contents 1 The Metropolis-Hastings algorithm 1 2 Completion (AKA data augmentation) 7 3 The Gibbs sampler 8 4 Block and hybrid MCMC
More informationSpring 2006: Introduction to Markov Chain Monte Carlo (MCMC)
36-724 Spring 2006: Introduction to Marov Chain Monte Carlo (MCMC) Brian Juner February 16, 2006 Hierarchical Normal Model Direct Simulation An Alternative Approach: MCMC Complete Conditionals for Hierarchical
More informationHierarchical models. Dr. Jarad Niemi. August 31, Iowa State University. Jarad Niemi (Iowa State) Hierarchical models August 31, / 31
Hierarchical models Dr. Jarad Niemi Iowa State University August 31, 2017 Jarad Niemi (Iowa State) Hierarchical models August 31, 2017 1 / 31 Normal hierarchical model Let Y ig N(θ g, σ 2 ) for i = 1,...,
More informationR Demonstration ANCOVA
R Demonstration ANCOVA Objective: The purpose of this week s session is to demonstrate how to perform an analysis of covariance (ANCOVA) in R, and how to plot the regression lines for each level of the
More information1. Introduction. Hang Qian 1 Iowa State University
Users Guide to the VARDAS Package Hang Qian 1 Iowa State University 1. Introduction The Vector Autoregression (VAR) model is widely used in macroeconomics. However, macroeconomic data are not always observed
More information