Outline for today. Markov chain Monte Carlo. Example: spatial statistics (Christensen and Waagepetersen 2001)


Markov chain Monte Carlo
Rasmus Waagepetersen, Department of Mathematics, Aalborg University, Denmark

Outline for today: Markov chain Monte Carlo; example: spatial statistics (Christensen and Waagepetersen 2001).

Conditional simulation for high-dimensional U: Markov chain Monte Carlo

Consider $U = (U_1, \dots, U_n) \sim N_n(0, \Sigma)$ with $\mathrm{Cov}(U_i, U_j) = \Sigma_{ij}$ for all $i, j$. Then the $n$-dimensional conditional density of $U$ given $Y = y$,

$$f(u_1, \dots, u_n \mid y) \propto \Big[ \prod_i f(y_i \mid u_1, \dots, u_n) \Big] f(u_1, \dots, u_n),$$

cannot be factorized into lower-dimensional densities.

Example: spatial statistics (Christensen and Waagepetersen 2001)

Observations $Y_i$ are weed counts at spatial locations $(x_i, y_i)$, $i = 1, \dots, n$.

[Figure: map of weed counts over the field; axes Coordinate X (m) and Coordinate Y (m).]

$Y_i \mid U_i$ is Poisson, where $U_i$ is a random effect associated with $(x_i, y_i)$ (soil properties), and

$$\mathrm{Cov}(U_i, U_j) = \tau^2 \exp(-d_{ij}/\alpha),$$

where $d_{ij}$ is the distance between $(x_i, y_i)$ and $(x_j, y_j)$.
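The exponential covariance above is easy to build numerically. Below is a minimal sketch (Python) that simulates the latent Gaussian field and the counts; the coordinates and the values of $\tau^2$ and $\alpha$ are hypothetical stand-ins, and a log link, $\exp(U_i)$ as the Poisson mean, is assumed for illustration:

```python
import numpy as np

# Hypothetical coordinates and parameters, purely for illustration.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(20, 2))   # 20 sites on a 100m x 100m field
tau2, alpha = 1.0, 20.0                      # variance tau^2 and range alpha

# Pairwise distances d_ij and exponential covariance tau^2 * exp(-d_ij / alpha)
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
Sigma = tau2 * np.exp(-d / alpha)

# Latent field U ~ N(0, Sigma) via Cholesky (with jitter for numerical
# stability), then conditionally independent Poisson counts Y_i given U_i.
L = np.linalg.cholesky(Sigma + 1e-10 * np.eye(len(coords)))
U = L @ rng.standard_normal(len(coords))
Y = rng.poisson(np.exp(U))
print(Y)
```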

Example: quantitative genetics (Sorensen and Waagepetersen)

$U_i$, $\tilde U_i$ are random genetic effects influencing the size and variability of $Y_{ij}$, where $Y_{ij}$ is the size of the $j$th litter of the $i$th pig:

$$Y_{ij} \mid U_i = u_i, \tilde U_i = \tilde u_i \sim N\big(\mu + u_i, \exp(\tilde\mu + \tilde u_i)\big)$$

[Figure: histogram of litter sizes; pedigree diagram distinguishing pigs with and without observations.]

$$(U_1, \dots, U_n, \tilde U_1, \dots, \tilde U_n) \sim N(0, G \otimes A)$$

$A$: additive genetic relationship matrix (depending on the pedigree).

$$G = \begin{bmatrix} \sigma_a^2 & \rho \sigma_a \sigma_{\tilde a} \\ \rho \sigma_a \sigma_{\tilde a} & \sigma_{\tilde a}^2 \end{bmatrix}$$

$\rho$: coefficient of genetic correlation between $U$ and $\tilde U$. NB: high dimension $n$.

MCMC

Suppose $U = (U_1, \dots, U_n) \sim \pi(\cdot)$ where $\pi(\cdot)$ is a complicated probability distribution. Markov chain Monte Carlo: generate an ergodic Markov chain $U^1, U^2, U^3, \dots$ (with $U^m = (U^m_1, \dots, U^m_n)$) so that, for large $m$, the distribution of $U^m$ is approximately $\pi(\cdot)$, and

$$E_\pi k(U) \approx \frac{1}{M} \sum_{m=1}^M k(U^m).$$

Example: ergodic and non-ergodic autoregressive chains

$$X_{t+1} = \beta X_t + \epsilon_t, \qquad \epsilon_t \sim N(0, \sigma^2)$$

with $X_1 = 5$, $\sigma = 0.5$, and $\beta$ either below or above 1:

[Figure: trace plots of the two chains against $t$.]

NB: when $|\beta| < 1$ we have convergence to the stationary distribution $N(0, \sigma^2/(1 - \beta^2))$.
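A minimal sketch of the two autoregressive chains (Python; the slide's exact $\beta$ values are garbled in this transcription, so 0.5 and 1.5 are illustrative stand-ins on either side of 1):

```python
import numpy as np

def ar1_path(beta, sigma=0.5, x1=5.0, steps=200, seed=1):
    """Simulate X_{t+1} = beta * X_t + eps_t with eps_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    x = np.empty(steps)
    x[0] = x1
    for t in range(1, steps):
        x[t] = beta * x[t - 1] + rng.normal(0.0, sigma)
    return x

# |beta| < 1: the chain forgets X_1 = 5 and settles into N(0, sigma^2/(1-beta^2));
# |beta| > 1: the chain is explosive and has no stationary distribution.
ergodic = ar1_path(beta=0.5)
explosive = ar1_path(beta=1.5)
print(ergodic[-3:], explosive[-3:])
```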

Joint updating Metropolis-Hastings algorithm

Basic ingredient: a proposal density $q(v \mid u)$, $v \in \mathbb{R}^n$, defined for all $u \in \mathbb{R}^n$ and easy to sample from. Given an initial state $U^0$, generate $U^1, U^2, \dots$ as follows:

1. Conditional on $U^m = u^m$, generate a proposal $V^{m+1} \sim q(\cdot \mid u^m)$.
2. With probability
$$\min\Big\{1, \frac{\pi(V^{m+1})\, q(u^m \mid V^{m+1})}{\pi(u^m)\, q(V^{m+1} \mid u^m)}\Big\}$$
accept $U^{m+1} = V^{m+1}$; otherwise $U^{m+1} = u^m$.

Under mild conditions of irreducibility and aperiodicity this produces an ergodic Markov chain with stationary distribution $\pi(\cdot)$.

Some features of Metropolis-Hastings

- Even if $\pi$ is a very complicated probability density we may choose a simple proposal density $q$ (e.g. a normal distribution).
- We need to know $\pi$ only up to a constant of proportionality. If e.g. $\pi(u) = f(u \mid y) = f(y \mid u) f(u)/f(y)$, then we do not need to know the marginal density $f(y)$, which can often be hard to compute.

Irreducibility and aperiodicity

- Irreducibility: the chain can get from any part of the state space to any other part of the state space (of positive $\pi$-probability).
- Aperiodicity: the chain is not periodic.

Example: random walk Metropolis

$$V^{m+1} \sim N(u^m, \sigma^2_{\text{prop}})$$

where $\sigma^2_{\text{prop}}$ is the proposal variance. Then $q(v \mid u) = q(u \mid v)$, so the Metropolis-Hastings ratio reduces to the Metropolis ratio:

$$\min\Big\{1, \frac{\pi(V^{m+1})\, q(u^m \mid V^{m+1})}{\pi(u^m)\, q(V^{m+1} \mid u^m)}\Big\} = \min\Big\{1, \frac{\pi(V^{m+1})}{\pi(u^m)}\Big\}$$
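A minimal generic implementation of the random walk Metropolis step above (Python sketch; the function and argument names are my own, and log densities are used for numerical stability):

```python
import numpy as np

def metropolis(log_pi, u0, sigma_prop, n_iter, seed=0):
    """Random walk Metropolis: proposals V^{m+1} ~ N(u^m, sigma_prop^2 I).

    log_pi is needed only up to an additive constant, mirroring the fact
    that pi itself is needed only up to proportionality.
    """
    rng = np.random.default_rng(seed)
    u = np.atleast_1d(np.asarray(u0, dtype=float))
    lp = log_pi(u)
    chain = np.empty((n_iter, u.size))
    for m in range(n_iter):
        v = u + sigma_prop * rng.standard_normal(u.size)  # symmetric proposal
        lp_v = log_pi(v)
        # q is symmetric, so accept with probability min{1, pi(v)/pi(u)}
        if np.log(rng.uniform()) < lp_v - lp:
            u, lp = v, lp_v          # accept the proposal
        chain[m] = u                 # a rejected proposal repeats u^m
    return chain

# Example: sample a standard normal known only up to proportionality.
chain = metropolis(lambda u: -0.5 * float(u @ u), u0=[0.0],
                   sigma_prop=1.0, n_iter=2000)
print(chain.mean(), chain.std())
```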

Metropolized AR(1) chain

Take $\beta = 0.5$ and $\sigma = 0.5$ as before, but now introduce a Metropolis accept/reject step in order to sample the stationary distribution $N(0, \sigma^2/(1 - \beta^2))$.

[Figure: trace plot of the metropolized chain against $t$.]

Comparison with rejection sampling

- The proposal for the new state is a perturbation of the previous state, so it is easier to get an acceptance.
- We reject some proposals to maintain the stationary distribution, but we do not throw away rejected states (the chain simply stays in its current state).
- This comes at the expense of the sample not being uncorrelated.

Simple example (Exercise)

$$\pi(u \mid y) = f\big(y \mid \exp(u + \beta)\big) f(u; \tau^2)/L(\theta)$$

where $f(y \mid \lambda)$ is the density of the Poisson distribution with intensity $\lambda$. Random walk Metropolis ratio (the normalizing constant $L(\theta) = f(y; \theta)$ cancels out):

$$\frac{\pi(V^{m+1} \mid y)}{\pi(u^m \mid y)} = \frac{f\big(y \mid \exp(V^{m+1} + \beta)\big) f(V^{m+1}; \tau^2)/L(\theta)}{f\big(y \mid \exp(u^m + \beta)\big) f(u^m; \tau^2)/L(\theta)} = \frac{f\big(y \mid \exp(V^{m+1} + \beta)\big) f(V^{m+1}; \tau^2)}{f\big(y \mid \exp(u^m + \beta)\big) f(u^m; \tau^2)}$$

NB: we need to know $\pi$ only up to a constant of proportionality.

Convergence of Markov chains for the simple example

Plots of $U^1, U^2, U^3, \dots$ for a small proposal variance (high acceptance rate) and a large proposal variance (low acceptance rate):

[Figure: two trace plots of the sampled chain against the iteration index.]
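A sketch of the random walk Metropolis sampler applied to this posterior, reusing the `metropolis` function from the sketch above (Python; the data value $y$ and the parameters $\beta$, $\tau^2$ are hypothetical stand-ins, since the exercise's values are not given here):

```python
import numpy as np
from scipy.stats import norm, poisson

# Hypothetical data and parameters, purely for illustration.
y, beta, tau2 = 4, 0.0, 1.0

def log_pi(u):
    # log of f(y | exp(u + beta)) * f(u; tau^2); L(theta) is constant in u
    # and drops out of the Metropolis ratio.
    return (poisson.logpmf(y, np.exp(u[0] + beta))
            + norm.logpdf(u[0], 0.0, np.sqrt(tau2)))

chain = metropolis(log_pi, u0=[0.0], sigma_prop=1.0, n_iter=5000)
print("posterior mean of U:", chain[:, 0].mean())
```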

Autocorrelation/mixing

Plot of the autocorrelation $\rho(k) = \mathrm{Corr}(U^m, U^{m+k})$ for two proposal variances, one giving quick mixing and one giving slow mixing:

[Figure: ACF plots against the lag $k$ for the two chains.]

Note:

$$\mathrm{Var}\Big(\frac{1}{M}\sum_{m=1}^M U^m\Big) = \frac{\mathrm{Var}\,U}{M^2} \sum_{m=1}^M \sum_{n=1}^M \rho(|m - n|),$$

so small autocorrelation is advantageous.

Single-site Metropolis-Hastings

Update one component in each iteration. Update of the $i$th component:

1. Conditional on $U^m = u^m$, generate $V_i^{m+1} \sim q_i(\cdot \mid u^m)$ and let
$$V^{m+1} = (u^m_1, \dots, u^m_{i-1}, V_i^{m+1}, u^m_{i+1}, \dots, u^m_n).$$
2. With probability
$$\min\Big\{1, \frac{\pi(V^{m+1})\, q_i(u^m_i \mid V^{m+1})}{\pi(u^m)\, q_i(V_i^{m+1} \mid u^m)}\Big\}$$
accept $U^{m+1} = V^{m+1}$; otherwise $U^{m+1} = u^m$.

Repeat for $i = 1, \dots, n$.

Examples

Random walk Metropolis: $V_i^{m+1} \sim N(u^m_i, \sigma^2_{\text{prop}})$ and

$$\min\Big\{1, \frac{\pi(V^{m+1})\, q_i(u^m_i \mid V^{m+1})}{\pi(u^m)\, q_i(V_i^{m+1} \mid u^m)}\Big\} = \min\Big\{1, \frac{\pi(V^{m+1})}{\pi(u^m)}\Big\}$$

Gibbs sampler: $V_i^{m+1} \sim U_i \mid U_j = u^m_j, j \neq i$. Then $q_i(V_i^{m+1} \mid u^m) = \pi_i(V_i^{m+1} \mid u^m_{-i})$ and

$$\frac{\pi(V^{m+1})\, q_i(u^m_i \mid V^{m+1})}{\pi(u^m)\, q_i(V_i^{m+1} \mid u^m)} = \frac{\pi_i(V_i^{m+1} \mid u^m_{-i})\, \pi(u^m_{-i})\, \pi_i(u^m_i \mid u^m_{-i})}{\pi_i(u^m_i \mid u^m_{-i})\, \pi(u^m_{-i})\, \pi_i(V_i^{m+1} \mid u^m_{-i})} = 1,$$

so all proposals are accepted. No need to choose a proposal variance.

MCMC issues

- When has the chain reached equilibrium/the stationary distribution (burn-in)?
- How long chains do we need (precision of the Monte Carlo estimates)?

These questions may be addressed by visual inspection of the time series and estimation of the Monte Carlo error. Pitfalls: high correlation between the components to be updated; multimodality.
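A minimal sketch for estimating $\rho(k)$ from a simulated chain (Python; `acf` is a hypothetical helper name, and the effective-sample-size remark in the comment is a standard consequence of the variance formula above):

```python
import numpy as np

def acf(x, max_lag=30):
    """Empirical autocorrelation rho(k) = Corr(U^m, U^{m+k}) of a 1-d chain."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    var = np.mean(x * x)
    return np.array([np.mean(x[: len(x) - k] * x[k:]) / var
                     for k in range(max_lag + 1)])

# Slow mixing shows up as rho(k) decaying slowly with the lag k; a rough
# effective sample size is M / (1 + 2 * sum_k rho(k)).
# Example: rho = acf(chain[:, 0]) for a chain from the earlier sketches.
```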

Implementation of MCMC using BUGS (Bayesian analysis using Gibbs sampling)

Model specification in BUGS: hierarchical model/directed acyclic graph (DAG).

Example: $\tau^2 = \sigma^2 = 0.5$ and

$$U \mid \tau^2, \sigma^2 \sim N(0, \tau^2), \qquad Y_1, Y_2 \mid U = u, \tau^2, \sigma^2 \sim N(u, \sigma^2)$$

($Y_1$, $Y_2$ conditionally independent).

```
model {
  tauinv <- 2          # precision 1/tau^2
  sigmainv <- 1/0.5    # precision 1/sigma^2
  u ~ dnorm(0.0, tauinv)
  y1 ~ dnorm(u, sigmainv)
  y2 ~ dnorm(u, sigmainv)
}
```

(In BUGS, dnorm is parametrized by mean and precision, hence tauinv and sigmainv.)

We can now sample $(Y_1, Y_2, U)$, $U \mid Y_1, Y_2$, $Y_2, U \mid Y_1$, etc., depending on the data we specify, i.e. which variables we fix/condition on.

Exercises

See the exercise sheet on the webpage.
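As a sanity check on the BUGS example above: for this conjugate model the conditional $U \mid Y_1, Y_2$ is available in closed form, which shows what a correct sampler should produce. A minimal sketch (Python; the observed values y1 and y2 are hypothetical):

```python
import numpy as np

# BUGS example check: tau^2 = sigma^2 = 0.5; normal-normal conjugacy gives
# U | y1, y2 ~ N(mean, 1/prec) with the precision and mean below.
tau2, sigma2 = 0.5, 0.5
y1, y2 = 1.2, 0.8                      # hypothetical observations

prec = 1.0 / tau2 + 2.0 / sigma2       # prior precision + data precision
mean = (y1 + y2) / sigma2 / prec

rng = np.random.default_rng(0)
draws = rng.normal(mean, np.sqrt(1.0 / prec), size=10_000)
print(mean, draws.mean())              # exact mean vs. Monte Carlo estimate
```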