
STOCHASTIC SIMULATION FOR BLOCKED DATA
Stochastic System Analysis and Bayesian Model Updating

Monte Carlo simulation
Rejection sampling
Importance sampling
Markov chain Monte Carlo

Monte Carlo simulation

Introduction: If we know how to directly sample from f(x) to obtain i.i.d. samples {X_i : i = 1, ..., N}, then according to the Law of Large Numbers:

E_{f(x)}[g(X)] \approx \hat{g}_{MCS} = \frac{1}{N} \sum_{i=1}^{N} g(X_i)

Procedure: skipped, since it is trivial.

Analysis:

1. The MCS estimator \hat{g}_{MCS} is unbiased and its variance decays at the 1/N rate:

E_{f(x)}[\hat{g}_{MCS}] = E_{f(x)}\left[ \frac{1}{N} \sum_{i=1}^{N} g(X_i) \right] = E_{f(x)}[g(X)]

Var[\hat{g}_{MCS}] = Var\left[ \frac{1}{N} \sum_{i=1}^{N} g(X_i) \right] = \frac{1}{N} Var_{f(x)}[g(X)]

2. Confidence interval. By the Central Limit Theorem,

\hat{g}_{MCS} = \frac{1}{N} \sum_{i=1}^{N} g(X_i) \rightarrow Normal\left( E_{f(x)}[g(X)], \frac{1}{N} Var_{f(x)}[g(X)] \right)

To build the confidence interval, we need Var_{f(x)}[g(X)], which can be estimated simply as the sample variance of {g(X_i) : i = 1, ..., N}, i.e.

Var_{f(x)}[g(X)] \approx \frac{1}{N} \sum_{i=1}^{N} \left( g(X_i) - \hat{g}_{MCS} \right)^2

Therefore, the following 95.4% confidence interval for the Gaussian can be built:

\hat{g}_{MCS} - 2\sqrt{\frac{Var_{f(x)}[g(X)]}{N}} \leq E_{f(x)}[g(X)] \leq \hat{g}_{MCS} + 2\sqrt{\frac{Var_{f(x)}[g(X)]}{N}}

Remarks:

1. "If we know how to directly sample from f(x)" is a very strict premise, since it usually requires knowledge of F^{-1}, the inverse of the cumulative distribution function, which is available to us only for certain standard probability density functions, e.g. uniform, Gaussian, Gamma, etc. In the case that we know F^{-1}, we can sample from f(x) using the following steps: draw U ~ unif[0, 1] and let X = F^{-1}(U). It can be shown that X ~ f(x).
Proof: P(X \leq x) = P(F^{-1}(U) \leq x) = P(U \leq F(x)) = F(x).
2. The accuracy of MCS does not depend on the dimension of X: \hat{g}_{MCS} is robust against the dimension of X, since Var[\hat{g}_{MCS}] = Var_{f(x)}[g(X)] / N regardless of that dimension.
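As a concrete illustration, here is a minimal numpy sketch of the MCS estimator and its 2-sigma (95.4%) interval. The target f = N(0, 1) and the integrand g(x) = x^2 are arbitrary choices for the example, not part of the notes.

```python
import numpy as np

rng = np.random.default_rng(0)

def mcs_estimate(sample_f, g, N):
    """Plain Monte Carlo: estimate E_f[g(X)] with a 95.4% (2-sigma) interval."""
    X = sample_f(N)                      # i.i.d. samples from f(x)
    gX = g(X)
    g_hat = gX.mean()                    # unbiased MCS estimator
    se = gX.std(ddof=1) / np.sqrt(N)     # sample estimate of sqrt(Var_f[g(X)]/N)
    return g_hat, (g_hat - 2 * se, g_hat + 2 * se)

# Illustration: E[X^2] = 1 for X ~ N(0, 1)
g_hat, ci = mcs_estimate(lambda n: rng.standard_normal(n), lambda x: x**2, 10_000)
print(g_hat, ci)
```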

Rejection sampling

Introduction:

Suppose we don't know how to sample from f(x) directly, but we do know how to evaluate f(x) up to a constant:

f(x) = a \cdot h(x)

where a is an unknown constant and h(x) is a function we know how to calculate. Note that this is the usual situation in Bayesian analysis, where the posterior PDF has the following form:

f(x|Y) = \frac{f(Y|x) f(x)}{f(Y)} = a \cdot h(x), with a = \frac{1}{f(Y)} and h(x) = f(Y|x) f(x)

Assume that we know how to sample and evaluate a chosen PDF q(x).

Procedure:
1. Let K be a number such that K q(x) \geq h(x) for all x.
2. Draw a candidate X_C ~ q(x).
3. Accept X = X_C with probability h(X_C) / (K q(X_C)).
4. Cycle 2-3 until we get enough accepted samples.

Analysis: Same as for Monte Carlo simulation, except that the number of samples should be the number of accepted samples.

Remarks:
1. Different from MCMC, when a sample is rejected we do not repeat the previous sample.
2. The number K can be found as follows: let x* = argmax_x h(x)/q(x); then K = h(x*)/q(x*). That is, solving for K usually requires optimization.
3. How do we do Step 3? Draw U ~ unif[0, 1] and let X = X_C if U \leq h(X_C) / (K q(X_C)).
4. If the shapes of q(x) and h(x) are very different, the efficiency of rejection sampling will be poor. However, it is often the case that the shape of h(x) is unknown to us, so it is hard to choose an efficient q(x).
5. When the dimension of X is high, rejection sampling can be extremely inefficient. This is because it can be much harder to find a good q(x) in a high dimensional space.

Adaptive rejection sampling

Suppose log h(x) is concave.

[Figure: piecewise-exponential envelopes K_1 q_1(x), K_2 q_2(x), K_3 q_3(x), each tangent to h(x) in log space.]

Procedure: Let q_1(x) be an exponential PDF, used as the initial rejection sampling PDF, chosen so that K_1 q_1(x) is tangent to h(x) in the log space. Draw X_C ~ q_1(x). If X_C is accepted, keep on using q_1(x) to generate the next sample. However, if X_C is rejected, let q_2(x) be a second exponential PDF such that K_2 q_2(x) is tangent to h(x) at X_C in the log space. Generate the next sample using the envelope PDF formed by K_1 q_1(x) and K_2 q_2(x). Continue doing so until we get enough accepted samples.

Remarks:
1. Not applicable to non-log-concave PDFs.
2. Still not good for high dimensional X.
3. There are variants of adaptive rejection sampling that do not require evaluating gradients [1] and do not require log-concavity [2].
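The basic rejection loop of Steps 1-4 is short enough to sketch. The target h(x) = exp(-x^4) and the standard normal proposal q are illustrative assumptions, and the grid search for K stands in for the optimization mentioned in remark 2.

```python
import numpy as np

rng = np.random.default_rng(1)

h = lambda x: np.exp(-x**4)                                   # unnormalized target h(x) (illustrative)
q_pdf = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)    # proposal q = N(0, 1)

# Step 1: find K with K*q(x) >= h(x); a grid search stands in for the
# optimization of remark 2 (adequate for this 1-D illustration).
grid = np.linspace(-5, 5, 20_001)
K = np.max(h(grid) / q_pdf(grid))

def rejection_sample(n_accept):
    accepted = []
    while len(accepted) < n_accept:
        xc = rng.standard_normal()                    # Step 2: candidate X_C ~ q
        if rng.uniform() <= h(xc) / (K * q_pdf(xc)):  # Step 3: accept w.p. h/(K q)
            accepted.append(xc)                       # rejected candidates are discarded, not repeated
    return np.array(accepted)

samples = rejection_sample(1_000)
```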

Importance sampling

Introduction: Suppose we don't know how to directly sample from f(x), but we know how to evaluate it. We would like to compute the expected value of some function g(X) with respect to f(x), i.e. to compute

E_{f(x)}[g(X)] = \int g(x) f(x) dx

Let q(x), called the importance sampling PDF, be a PDF that we know how to sample and evaluate, and whose support region contains the support region of f(x). Then

E_{f(x)}[g(X)] = \int g(x) f(x) dx = \int g(x) \frac{f(x)}{q(x)} q(x) dx = E_{q(x)}\left[ g(X) \frac{f(X)}{q(X)} \right]

Procedure: Draw {X_i : i = 1, ..., N} i.i.d. from q(x). According to the Law of Large Numbers, we have

E_{f(x)}[g(X)] = E_{q(x)}\left[ g(X) \frac{f(X)}{q(X)} \right] \approx \hat{g}_{IS} = \frac{1}{N} \sum_{i=1}^{N} g(X_i) \frac{f(X_i)}{q(X_i)}

Analysis:

1. The IS estimator \hat{g}_{IS} is unbiased and its variance decays at the 1/N rate:

E_{q(x)}[\hat{g}_{IS}] = \frac{1}{N} \sum_{i=1}^{N} E_{q(x)}\left[ g(X_i) \frac{f(X_i)}{q(X_i)} \right] = E_{q(x)}\left[ g(X) \frac{f(X)}{q(X)} \right] = \int g(x) \frac{f(x)}{q(x)} q(x) dx = E_{f(x)}[g(X)]

Var[\hat{g}_{IS}] = Var\left[ \frac{1}{N} \sum_{i=1}^{N} g(X_i) \frac{f(X_i)}{q(X_i)} \right] = \frac{1}{N} Var_{q(x)}\left[ g(X) \frac{f(X)}{q(X)} \right]

2. Confidence interval. By the Central Limit Theorem,

\hat{g}_{IS} = \frac{1}{N} \sum_{i=1}^{N} g(X_i) \frac{f(X_i)}{q(X_i)} \rightarrow Normal\left( E_{f(x)}[g(X)], \frac{1}{N} Var_{q(x)}\left[ g(X) \frac{f(X)}{q(X)} \right] \right)

To build the confidence interval, we need Var_{q(x)}[g(X) f(X)/q(X)], which can be estimated simply as the sample variance of {g(X_i) f(X_i)/q(X_i) : i = 1, ..., N}, i.e.

Var_{q(x)}\left[ g(X) \frac{f(X)}{q(X)} \right] \approx \frac{1}{N} \sum_{i=1}^{N} \left( g(X_i) \frac{f(X_i)}{q(X_i)} - \hat{g}_{IS} \right)^2

Therefore, the following 95.4% confidence interval for the Gaussian can be built:

\hat{g}_{IS} - 2\sqrt{\frac{Var_{q(x)}[g(X) f(X)/q(X)]}{N}} \leq E_{f(x)}[g(X)] \leq \hat{g}_{IS} + 2\sqrt{\frac{Var_{q(x)}[g(X) f(X)/q(X)]}{N}}

Remarks:

1. Recall that the variance of the MCS estimator is Var_{f(x)}[g(X)] / N. Here we have seen that the variance of the IS estimator is Var_{q(x)}[g(X) f(X)/q(X)] / N. It can be shown that if the importance sampling PDF q(x) \propto g(x) f(x), the variance of the IS estimator is 0.

We call this the optimal importance sampling PDF. However, this PDF usually cannot be directly sampled and evaluated. It is also quite possible to choose an importance sampling PDF such that the variance of the resulting IS estimator is larger than that of the MCS estimator.

2. Importance weights. Let us define w_i = f(X_i)/q(X_i) as the importance weight of the i-th sample. One can see that the IS estimator is simply

\hat{g}_{IS} = \frac{1}{N} \sum_{i=1}^{N} w_i g(X_i)

Note that E_{q(x)}[w] = E_{q(x)}[f(X)/q(X)] = 1. Also, by the Central Limit Theorem,

\frac{1}{N} \sum_{i=1}^{N} w_i \rightarrow Normal\left( 1, \frac{1}{N} Var_{q(x)}\left[ \frac{f(X)}{q(X)} \right] \right)

3. Importance sampling is inefficient when the dimension of X is high. This is because when the dimension is high, the importance weights may become highly non-uniform. The consequence is that the number of effective samples is small.
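A minimal sketch of the IS estimator and its 2-sigma interval. The target f = N(0, 1), the wider proposal q = N(0, 2^2) (so that the support of q covers that of f), and g(x) = x^2 are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def norm_pdf(x, mu=0.0, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def is_estimate(g, f_pdf, q_pdf, sample_q, N):
    """Importance sampling estimate of E_f[g(X)] with a 2-sigma interval."""
    X = sample_q(N)                      # i.i.d. draws from the importance PDF q
    vals = g(X) * f_pdf(X) / q_pdf(X)    # g(X_i) * w_i with w_i = f(X_i)/q(X_i)
    g_hat = vals.mean()                  # unbiased IS estimator
    se = vals.std(ddof=1) / np.sqrt(N)
    return g_hat, (g_hat - 2 * se, g_hat + 2 * se)

g_hat, ci = is_estimate(
    g=lambda x: x**2,
    f_pdf=lambda x: norm_pdf(x),
    q_pdf=lambda x: norm_pdf(x, sigma=2.0),
    sample_q=lambda n: rng.normal(0.0, 2.0, n),
    N=10_000,
)
```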

Modified importance sampling

Introduction: For Bayesian analysis, we usually only know how to evaluate the posterior PDF f(x|Y) up to a constant:

f(x|Y) = \frac{f(Y|x) f(x)}{f(Y)}

Note that we usually don't know the normalizing constant f(Y), so the importance sampling technique cannot be directly applied, since it requires full evaluation of f(x|Y). Modified importance sampling (MIS) only requires the evaluation of f(x|Y) up to a constant:

f(x|Y) = a \cdot h(x)

where a is a constant we don't know and h(x) is a function we know how to evaluate. Now let the normalized importance weight be

w_i = \frac{h(X_i)/q(X_i)}{\sum_{j=1}^{N} h(X_j)/q(X_j)}

Note that \sum_{i=1}^{N} w_i = 1. The MIS estimator for E[g(X)|Y] is simply

\hat{g}_{MIS} = \sum_{i=1}^{N} w_i g(X_i)

Observation: as N \rightarrow \infty, MIS is the same as IS.
Proof: When N \rightarrow \infty,

\frac{1}{N} \sum_{j=1}^{N} \frac{h(X_j)}{q(X_j)} \rightarrow E_{q(x)}\left[ \frac{h(X)}{q(X)} \right] = \int h(x) dx = \frac{1}{a}

so w_i \rightarrow \frac{1}{N} \cdot \frac{a h(X_i)}{q(X_i)} = \frac{1}{N} \cdot \frac{f(X_i|Y)}{q(X_i)}. Therefore, the normalized importance weight in MIS and the importance weight in IS become identical, and the two estimators coincide in the limit. The consequence is that MIS is only asymptotically unbiased.

Procedure:
1. Draw {X_i : i = 1, ..., N} ~ q(x).
2. Compute

E[g(X)|Y] \approx \hat{g}_{MIS} = \sum_{i=1}^{N} w_i g(X_i), \quad w_i = \frac{h(X_i)/q(X_i)}{\sum_{j=1}^{N} h(X_j)/q(X_j)}

Remarks:

1. When N is large, the statistical properties of the MIS estimator are similar to those of the IS estimator. In fact, MIS is asymptotically unbiased.
2. MIS can be used to estimate the normalizing constant f(Y) in Bayesian analysis. Let \alpha_i = h(X_i)/q(X_i) be the non-normalized importance weight. It can be shown that \frac{1}{N} \sum_{i=1}^{N} \alpha_i is an unbiased estimator of 1/a = f(Y).
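A sketch of the MIS weights, assuming a made-up unnormalized posterior h(x) and a Gaussian importance PDF. Note that the normalizing constant of h is never evaluated, and that the mean of the non-normalized weights gives the evidence estimate of remark 2.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative unnormalized posterior h(x); its normalizing constant plays
# the role of f(Y) and is deliberately never used below.
h = lambda x: np.exp(-0.5 * (x - 1.0) ** 2)

def q_pdf(x, sigma=2.0):                 # importance sampling PDF q = N(0, sigma^2)
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

N = 10_000
X = rng.normal(0.0, 2.0, N)              # draws from q
alpha = h(X) / q_pdf(X)                  # non-normalized weights alpha_i = h(X_i)/q(X_i)
w = alpha / alpha.sum()                  # normalized weights, sum to 1

g_mis = np.sum(w * X**2)                 # MIS estimate of E[X^2 | Y]
f_Y_hat = alpha.mean()                   # estimate of the normalizing constant (remark 2)
```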

Sample-importance resampling

Introduction: Both importance sampling and modified importance sampling are mainly for estimating E[g(X)|Y]. If we would like to obtain samples from f(x|Y), we can adopt SIR, which is just one step beyond MIS. The idea is to resample the MIS samples according to their normalized weights.

Procedure:
1. Do MIS to obtain {(X_i, w_i) : i = 1, ..., N}. Recall that \sum_{i=1}^{N} w_i = 1.
2. Resampling: let X_{new}^j = X_i with probability w_i, for j = 1, ..., M. Then, approximately,

{X_{new}^j : j = 1, ..., M} ~ f(x|Y)

Remarks:

1. There may be repeated samples in {X_{new}^j : j = 1, ..., M}.
2. M should be no larger than N.
3. If N is small, {X_{new}^j : j = 1, ..., M} is not exactly distributed as f(x|Y). This is because MIS is only asymptotically unbiased.
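The resampling step is one call to a weighted random choice. The tiny X and w below are placeholders; in practice they come from the MIS step above, with M no larger than N per remark 2.

```python
import numpy as np

rng = np.random.default_rng(4)

def sir_resample(X, w, M, rng):
    """Resample weighted MIS output {(X_i, w_i)}, sum(w) = 1, into M approximate
    draws from f(x|Y); repeated values are expected (remark 1)."""
    idx = rng.choice(len(X), size=M, replace=True, p=w)
    return X[idx]

# Placeholder weighted samples, standing in for real MIS output.
X = np.array([0.1, 0.5, 0.9, 1.3])
w = np.array([0.1, 0.2, 0.3, 0.4])
draws = sir_resample(X, w, M=4, rng=rng)
```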


Markov chain Monte Carlo

1. Metropolis-Hastings algorithm
2. Gibbs sampler
3. Hybrid Monte Carlo

Metropolis-Hastings algorithm

Introduction: Suppose the target PDF is the posterior PDF in a Bayesian analysis:

f(x|Y) = \frac{f(Y|x) f(x)}{f(Y)} = a \cdot h(x)

The idea of MH is to create a Markov chain whose stationary distribution is the same as the target PDF.

Procedure:
1. Let H(x_C | x) be the chosen proposal PDF.
2. Initialize X^0 at any place.

3. Draw the candidate X_C ~ H(x | X^t) and compute

r = \frac{h(X_C) H(X^t | X_C)}{h(X^t) H(X_C | X^t)}

4. Let X^{t+1} = X_C with probability min(1, r), and X^{t+1} = X^t with probability 1 - min(1, r).
5. Cycle 3-4 to obtain the Markov chain samples {X^t : t = 0, ..., N}. These samples will be asymptotically distributed as f(x|Y) if the Markov chain is ergodic.

First note that the MH algorithm creates a Markov chain that satisfies detailed balance:

F(z|x) h(x) = B(x|z) h(z) for all x, z

where F(z|x) and B(x|z) are the forward and backward probability transition kernels of the Markov chain. The forward and backward kernels in the MH algorithm are

F(z|x) = min(1, r_F(x, z)) H(z|x) + [1 - A(x)] \delta(z - x)

B(x|z) = min(1, r_B(z, x)) H(x|z) + [1 - A(z)] \delta(x - z)

where

r_F(x, z) = \frac{h(z) H(x|z)}{h(x) H(z|x)}, \quad r_B(z, x) = \frac{h(x) H(z|x)}{h(z) H(x|z)} = \frac{1}{r_F(x, z)}

and A(x) is the overall acceptance probability when the chain is at x. The \delta terms contribute only at z = x, where detailed balance holds trivially. For z \neq x:

LHS = F(z|x) h(x) = min\left( 1, \frac{h(z) H(x|z)}{h(x) H(z|x)} \right) H(z|x) h(x) = min( h(x) H(z|x), h(z) H(x|z) )

RHS = B(x|z) h(z) = min\left( 1, \frac{h(x) H(z|x)}{h(z) H(x|z)} \right) H(x|z) h(z) = min( h(z) H(x|z), h(x) H(z|x) ) = LHS

Therefore, as long as the Markov chain is ergodic, the stationary distribution will be unique and equal to the target f(x|Y).

Analysis:

1. The MCMC estimator for E[g(X)|Y] is simply

\hat{g}_{MCMC} = \frac{1}{N} \sum_{t=1}^{N} g(X^t)

The MCMC estimator is asymptotically unbiased because

E[\hat{g}_{MCMC}] = \frac{1}{N} \sum_{t=1}^{N} E[g(X^t)|Y] \rightarrow E[g(X)|Y]

Note that if we discard the MC samples in the burn-in period, the estimator is unbiased.

Var[\hat{g}_{MCMC}] = \frac{1}{N^2} \sum_{t=1}^{N} Var[g(X^t)|Y] + \frac{2}{N^2} \sum_{s < t} cov( g(X^s), g(X^t) | Y )

Note that after the MC reaches its stationary state, the time origin is forgotten, so

cov( g(X^t), g(X^{t+k}) | Y ) = R(k)

depends only on the lag k. Note that R(k) = R(-k) and R(0) = Var[g(X)|Y]. Then

Var[\hat{g}_{MCMC}] = \frac{R(0)}{N} + \frac{2}{N^2} \sum_{s=1}^{N-1} (N - s) R(s) \approx \frac{R(0)}{N} \left[ 1 + 2 \sum_{s=1}^{\infty} \frac{R(s)}{R(0)} \right] = \frac{R(0)}{N} [1 + \gamma]

where \gamma = 2 \sum_{s \geq 1} R(s)/R(0) quantifies the degree of dependence of the MC samples. Note that R(0)/N is the variance of the estimator if the MC samples were all independent. However, the MC samples are dependent, so the actual variance is larger than R(0)/N by a factor of [1 + \gamma]. One can say that the equivalent number of independent samples is N / [1 + \gamma]. Note that R(k) can be estimated from the MC samples as follows:

R(k) \approx \frac{1}{N - k} \sum_{t=1}^{N-k} \left( g(X^t) - \hat{g}_{MCMC} \right) \left( g(X^{t+k}) - \hat{g}_{MCMC} \right)
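The R(k) estimator above translates directly into code. The lag-window truncation of the sum for \gamma below is a pragmatic assumption; the notes do not prescribe one.

```python
import numpy as np

def mcmc_variance(gX, max_lag=None):
    """Estimate Var[g_MCMC] = R(0)(1 + gamma)/N and the equivalent number of
    independent samples N/(1 + gamma) from correlated MC samples g(X^t)."""
    gX = np.asarray(gX, dtype=float)
    N = len(gX)
    d = gX - gX.mean()
    if max_lag is None:
        max_lag = N // 10                 # heuristic truncation of the R(k) sum
    R = np.array([np.dot(d[: N - k], d[k:]) / (N - k) for k in range(max_lag + 1)])
    gamma = 2.0 * R[1:].sum() / R[0]      # dependence factor gamma
    n_eff = N / (1.0 + gamma)             # equivalent number of independent samples
    return R[0] * (1.0 + gamma) / N, n_eff
```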

2. Confidence interval. Note that the Central Limit Theorem and the Law of Large Numbers still hold for weakly dependent samples, i.e.

\hat{g}_{MCMC} = \frac{1}{N} \sum_{t=1}^{N} g(X^t) \rightarrow Normal\left( E[g(X)|Y], \frac{R(0) [1 + \gamma]}{N} \right)

So the following 95.4% confidence interval can be built:

\hat{g}_{MCMC} - 2\sqrt{\frac{R(0) [1 + \gamma]}{N}} \leq E[g(X)|Y] \leq \hat{g}_{MCMC} + 2\sqrt{\frac{R(0) [1 + \gamma]}{N}}

Remarks:

1. Recall that f(x|Y) = a \cdot h(x). We don't need to know how to evaluate the normalizing constant a, since it cancels out when we compute r.
2. When the initial state of the Markov chain is chosen arbitrarily, the chain may need some time to reach its stationary state. We call this period the burn-in period. The MC samples in the burn-in period are not distributed as f(x|Y). We can choose to discard these samples. The usual way of identifying the burn-in period is to plot the time history of the sample values, or of certain chosen statistics, and identify it visually.

[Figure: time history of X^t showing an initial transient; samples in the burn-in period can be discarded.]

3. How do we know the resulting MC is ergodic? In many cases, we don't know for sure. But if we run several independent MCMC chains and find that their behavior is similar, we can argue that the MC may be ergodic. There are also cases where we can immediately see whether the MC is ergodic or not.

[Figure: a multi-modal f(x) with narrow proposal PDFs H(.|x) that cannot bridge the modes, giving a non-ergodic chain.]

4. After reaching the stationary state, the MC samples become identically distributed but dependent. The dependency between two samples decays with the time difference.

5. When H(z|x) = H(x|z), the resulting algorithm is called the Metropolis algorithm. Note that in this case r = h(X_C) / h(X^t).
6. The choice of the proposal PDF H(z|x) can significantly affect the efficiency of MCMC. From our experience, the type of H(z|x) is not critical, but the width of H(z|x) matters. If H(z|x) is very narrow, r \approx 1 and there is almost no rejection, but adjacent samples are highly correlated. If H(z|x) is very wide, r is often small, so there are many rejections and repeated samples, and again the samples can be highly correlated. Therefore, H(z|x) should be neither too narrow nor too wide. The rule of thumb is to take the width of H(z|x) to be of the same order as the width of f(x|Y). Usually, the performance is best when the rejection rate is around 50%.
7. In principle, the MH algorithm should work even when the dimension of X is high. In practice, an efficient proposal PDF can be difficult to build because we usually lack knowledge of the geometry of the target PDF f(x|Y). There are also issues with local-random-walk MH, e.g. MH with a Gaussian proposal PDF. Consider a Gaussian target PDF, and let \sigma_{max} and \sigma_{min} be the maximum and minimum standard deviations (square roots of the eigenvalues of the covariance matrix) along its principal directions. Suppose we choose a Gaussian proposal PDF H(z|x) whose standard deviation in all directions is identical and equal to \sigma_{min}. One can see that in order to have the MC samples travel throughout the significant region of the target PDF, we need at least \sigma_{max}/\sigma_{min} MC time steps.

This difficulty is due to a high aspect ratio of the target PDF and may occur regardless of the dimension of X. However, from my experience, it occurs more frequently when the dimension of X is high. It is definitely not an issue when X is a scalar.
8. MH may not work for multi-modal PDFs, since the resulting MC may be non-ergodic.
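A compact random-walk MH sketch with a symmetric Gaussian proposal (the Metropolis case of remark 5). The unnormalized Gaussian target, the starting point, and the step size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)

def metropolis_hastings(log_h, x0, n_steps, step=1.0, rng=rng):
    """Random-walk MH: the Gaussian proposal is symmetric, so the H terms in r
    cancel and r = h(X_C)/h(X^t)."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    chain = np.empty((n_steps, x.size))
    lh = log_h(x)
    for t in range(n_steps):
        xc = x + step * rng.standard_normal(x.size)   # candidate from H(.|x)
        lhc = log_h(xc)
        if np.log(rng.uniform()) < lhc - lh:          # accept w.p. min(1, r)
            x, lh = xc, lhc
        chain[t] = x                                  # on rejection, repeat the previous sample
    return chain

# Illustration: unnormalized standard Gaussian target; burn-in discarded afterwards
chain = metropolis_hastings(lambda x: -0.5 * np.sum(x**2), x0=[5.0], n_steps=20_000)
samples = chain[2_000:]                               # drop burn-in (remark 2)
```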


Gibbs sampler

Introduction: Is it possible to conduct MCMC without rejecting samples? It turns out to be possible. This can happen when the uncertain variable X can be divided into m groups {X_i : i = 1, ..., m} and, moreover, when it is feasible to directly sample from f(x_i | {x \ x_i}, Y), where {x \ x_i} denotes {x_1, ..., x_{i-1}, x_{i+1}, ..., x_m}.

Procedure:
1. Initialize X^0 = (X_1^0, X_2^0, ..., X_m^0) at any place.
2. Sample

X_1^1 ~ f(x_1 | x_2^0, x_3^0, ..., x_m^0, Y)
X_2^1 ~ f(x_2 | x_1^1, x_3^0, ..., x_m^0, Y)
X_3^1 ~ f(x_3 | x_1^1, x_2^1, x_4^0, ..., x_m^0, Y)
...
X_m^1 ~ f(x_m | x_1^1, ..., x_{m-1}^1, Y)

3. Repeat step 2 to get {X^t : t = 0, 1, ..., N}. These samples will be asymptotically distributed as f(x|Y) if the Markov chain is ergodic.

Note that the Gibbs sampler creates a Markov chain that satisfies detailed balance:

F(z|x) f(x|Y) = B(x|z) f(z|Y) for all x, z

where F(z|x) and B(x|z) are the forward and backward probability transition kernels of the Markov chain. Use the following transition, in which only the third group is updated, as an example:

x = (x_1^{t+1}, x_2^{t+1}, x_3^t, x_4^t, ..., x_m^t) \rightarrow z = (x_1^{t+1}, x_2^{t+1}, x_3^{t+1}, x_4^t, ..., x_m^t)

The forward and backward kernels in the Gibbs sampler algorithm are:

F(z|x) = \delta(z_1 - x_1) \delta(z_2 - x_2) \delta(z_4 - x_4) \cdots \delta(z_m - x_m) \, f(z_3 | x_1, x_2, x_4, ..., x_m, Y)

B(x|z) = \delta(x_1 - z_1) \delta(x_2 - z_2) \delta(x_4 - z_4) \cdots \delta(x_m - z_m) \, f(x_3 | z_1, z_2, z_4, ..., z_m, Y)

On the support of the \delta functions we have x_i = z_i for all i \neq 3, so, writing z_{\3} = (z_1, z_2, z_4, ..., z_m) and \delta(\cdot) for the product of the \delta functions,

F(z|x) f(x|Y) = \delta(\cdot) \, f(z_3 | z_{\3}, Y) \, f(x_3, z_{\3} | Y)
= \delta(\cdot) \, \frac{f(z_3, z_{\3} | Y)}{f(z_{\3} | Y)} \, f(x_3 | z_{\3}, Y) \, f(z_{\3} | Y)
= \delta(\cdot) \, f(x_3 | z_{\3}, Y) \, f(z | Y) = B(x|z) f(z|Y)

Remarks:

1. Unlike MH, the Gibbs sampler needs no proposal PDF, so there is no rejection. But the tradeoff is that we need to know how to sample from f(x_i | {x \ x_i}, Y).
2. In the case that we don't know how to directly sample from f(x_i | {x \ x_i}, Y), we can sample it using sample-importance resampling, MCMC, rejection sampling, etc. A popular choice is to use adaptive rejection sampling [3].
3. The advantages and disadvantages of the Gibbs sampler are similar to those of MH. The analysis is also the same.
4. The Gibbs sampler will not work well when some uncertain parameters are highly correlated conditioned on the data. The consequence is that the MC samples move very slowly when sampling these highly correlated variables. There are variants of the Gibbs sampler that enjoy larger jumps in the MC samples even when such high correlation exists, e.g. the Gibbs sampler with overrelaxation [4, 5].
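A minimal Gibbs sketch for a target whose full conditionals are known in closed form. The zero-mean bivariate normal with correlation rho, where x_1 | x_2 ~ N(rho x_2, 1 - rho^2) and symmetrically for x_2, is a standard illustrative choice, not an example from the notes.

```python
import numpy as np

rng = np.random.default_rng(6)

def gibbs_bivariate_normal(rho, n_steps, rng=rng):
    """Gibbs sampler for a zero-mean bivariate normal with correlation rho;
    each step draws one group from its full conditional, with no rejection."""
    x1, x2 = 0.0, 0.0
    s = np.sqrt(1.0 - rho**2)
    out = np.empty((n_steps, 2))
    for t in range(n_steps):
        x1 = rho * x2 + s * rng.standard_normal()   # draw x1 ~ f(x1 | x2, Y)
        x2 = rho * x1 + s * rng.standard_normal()   # draw x2 ~ f(x2 | x1, Y)
        out[t] = (x1, x2)
    return out

# With rho near 1, the chain moves very slowly (remark 4)
samples = gibbs_bivariate_normal(rho=0.5, n_steps=10_000)
```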

Hybrid Monte Carlo

Introduction: Both MH and the Gibbs sampler create MCs with local random walk behavior. This behavior is not desirable: with it, the resulting MC may take a long time to travel throughout the significant region of f(x|Y), especially when the dimension of X is high. Hybrid Monte Carlo (HMC) is an MCMC algorithm that does not use a local random walk. The basic idea is to add an auxiliary uncertain variable Z to the process, where the new target PDF is

f(x, z) \propto h(x) \, e^{-\frac{1}{2} \sum_{i=1}^{n} z_i^2}

where f(x|Y) = a \cdot h(x) and n is the dimension of X. If we are able to sample from f(x, z), it is clear that the X part of the samples will be distributed as f(x|Y). How do we sample from f(x, z)? We employ MCMC. The main trick in Hybrid Monte Carlo is to give h(x) and z a physical meaning: -log[h(x)] is considered to be the potential energy of a ball with unit mass, where x is the location of the ball (think of -log[h(x)] as the profile of a valley) and z is the velocity of the ball. So the total energy of the ball is

H(x, z) = -log h(x) + \frac{1}{2} \sum_{i=1}^{n} z_i^2 = -log f(x, z) + const

If there is no friction when the ball rolls in the valley, the total energy H(x, z) is conserved, i.e. f(x, z) is conserved, and the ball rolls according to the following equations:

\frac{dx_i}{dt} = \frac{\partial H(x, z)}{\partial z_i} = z_i, \quad \frac{dz_i}{dt} = -\frac{\partial H(x, z)}{\partial x_i} = \frac{\partial \log h(x)}{\partial x_i}, \quad i = 1, ..., n

Procedure:
1. Initialize X^0 and Z^0.
2. Solve x(t), z(t) according to the governing equations above, with initial condition x(0) = X^0 and z(0) = Z^0. Evolve the solution for a randomized duration R. Let X^1 = x(R).

3. Resample Z^1 from the PDF proportional to e^{-\frac{1}{2} \sum_{i=1}^{n} z_i^2}, i.e. Z^1 ~ N(0, I_n).
4. Cycle 2-3 to get {X^t : t = 0, ..., N}. These samples will be asymptotically distributed as f(x|Y) if the Markov chain is ergodic.

Analysis: same as MH.

Remarks:

1. The HMC algorithm indeed samples from f(x, z). To see this, let's view the end product of Step 2 in the algorithm as a candidate, i.e. X_C = x(R), Z_C = z(R) is the candidate of the MCMC algorithm. One can verify that

r = \frac{f(X_C, Z_C)}{f(X^0, Z^0)} = 1

This is because the total energy H(x, z), and hence f(x, z), remains constant along the solution of the Hamiltonian equations. Therefore, the candidate (X_C, Z_C) is always accepted as the next MC sample.
2. The duration R in Step 2 should be randomized to avoid a periodic MC, although the chance of the MC being periodic is very small.
3. Step 3 is necessary since, without it, f(x, z) would remain constant forever, and hence the MC would not explore the entire phase space.
4. In practice, Step 2 cannot be solved analytically and must be solved approximately. Doing so, the resulting r ratio will be only approximately equal to 1. To handle this issue, we can simply add an accept/reject step, i.e. change Step 2 to the following: solve x(t), z(t) approximately according to the governing equations

\frac{dx_i}{dt} = z_i, \quad \frac{dz_i}{dt} = \frac{\partial \log h(x)}{\partial x_i}, \quad i = 1, ..., n

with initial condition x(0) = X^0 and z(0) = Z^0. Evolve the solution for a randomized duration R. Let X_C = x(R) and Z_C = z(R). Accept the candidate, i.e. take X^1 = X_C, with probability min(1, r), where

r = \frac{f(X_C, Z_C)}{f(X^0, Z^0)}

If not accepted, repeat the previous sample, i.e. take X^1 = X^0.

The approximate solution algorithm for the Hamiltonian equations cannot be arbitrary. In fact, the chosen algorithm must be reversible in time: if, starting from the initial condition, we obtain the candidate solution, a reversible algorithm must have the property that starting from that candidate and solving the equations backward in time, we obtain the same initial condition. This is so because the resulting HMC algorithm must satisfy the so-called detailed balance, or reversibility, condition. The following leapfrog finite difference algorithm is reversible and is accurate to second order in the Taylor series expansion:

z\left(t + \frac{\Delta t}{2}\right) = z(t) + \frac{\Delta t}{2} \left. \frac{\partial \log h(x)}{\partial x} \right|_{x = x(t)}

x(t + \Delta t) = x(t) + \Delta t \cdot z\left(t + \frac{\Delta t}{2}\right)

z(t + \Delta t) = z\left(t + \frac{\Delta t}{2}\right) + \frac{\Delta t}{2} \left. \frac{\partial \log h(x)}{\partial x} \right|_{x = x(t + \Delta t)}

5. One can see that HMC does not do a local random walk, and it is possible for HMC to make large jumps. Of course, the tradeoff is that HMC requires solving the Hamiltonian equations, which may require much computation. According to the experience of other researchers, the cost is usually worthwhile compared to MH and the Gibbs sampler.
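The leapfrog update above, wrapped in one HMC transition with the accept/reject step of remark 4, is sketched below. The standard normal target (so that d log h/dx = -x), the step size, and the fixed number of leapfrog steps are illustrative assumptions; remark 2 suggests randomizing the trajectory length.

```python
import numpy as np

rng = np.random.default_rng(7)

def hmc_step(x, log_h, grad_log_h, dt, n_leap, rng=rng):
    """One HMC transition: resample z ~ N(0, I), integrate the Hamiltonian
    dynamics with the leapfrog scheme, then accept/reject (remark 4)."""
    z = rng.standard_normal(x.size)              # Step 3: fresh velocity
    x_new, z_new = x.copy(), z.copy()
    z_new += 0.5 * dt * grad_log_h(x_new)        # initial half-step in z
    for _ in range(n_leap - 1):
        x_new += dt * z_new                      # full step in x
        z_new += dt * grad_log_h(x_new)          # full step in z
    x_new += dt * z_new
    z_new += 0.5 * dt * grad_log_h(x_new)        # final half-step in z
    # r = f(x_new, z_new)/f(x, z); exact dynamics give r = 1, leapfrog only nearly so
    log_r = (log_h(x_new) - 0.5 * z_new @ z_new) - (log_h(x) - 0.5 * z @ z)
    return x_new if np.log(rng.uniform()) < log_r else x

# Illustration: unnormalized standard Gaussian in 10 dimensions
x = np.full(10, 3.0)
draws = []
for _ in range(5_000):
    x = hmc_step(x, lambda v: -0.5 * v @ v, lambda v: -v, dt=0.1, n_leap=20)
    draws.append(x.copy())
```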

References:
[1] Gilks, W. R. (1992). Derivative-free adaptive rejection sampling for Gibbs sampling. In Bayesian Statistics 4, eds. Bernardo, J. M., Berger, J. O., Dawid, A. P., and Smith, A. F. M. Oxford University Press.

[2] Gilks, W. R., Best, N. G., and Tan, K. K. C. (1995). Adaptive rejection Metropolis sampling. Applied Statistics, 44, 455-472.
[3] Gilks, W. R., and Wild, P. (1992). Adaptive rejection sampling for Gibbs sampling. Applied Statistics, 41, 337-348.
[4] Adler, S. L. (1981). Over-relaxation method for the Monte Carlo evaluation of the partition function for multiquadratic actions. Physical Review D - Particles and Fields, 23, 2901-2904.
[5] Neal, R. M. (1995). Suppressing random walks in Markov chain Monte Carlo using ordered overrelaxation. Technical Report 9508, Dept. of Statistics, University of Toronto.
In general: MacKay, D. J. C. "Introduction to Monte Carlo methods."