
Target tracking example

Filtering: X_t | Y_{1:t} (main interest)
Smoothing: X_{1:t} | Y_{1:t} (also given with SIS)

However, as we have seen, the estimate of this distribution breaks down when t gets large due to the weights becoming degenerate (if we don't resample). If we resample, most of the values sampled for X_1 will disappear when t gets large (related to the weight breakdown). So SIS isn't useful for all problems.

Gibbs sampling

A special case of Markov Chain Monte Carlo (MCMC). Instead of generating independent samples, it generates dependent samples via a Markov chain

X^1 -> X^2 -> X^3 -> ...

Useful for a wide range of problems.

Popular for Bayesian analyses, but it is a general sampling procedure. For example, it can be used to do smoothing in the target tracking example.

Similar to SIS in that the random variable X is decomposed into X = {X_1, X_2, ..., X_k} and each piece is simulated separately. However, the conditioning structure is different. When sampling X_j, it is drawn conditional on all other components of X.

Gibbs sampler

A) Starting value: X^0 = {X_1^0, X_2^0, ..., X_k^0}, picked by some mechanism.

B) Sample X^t = {X_1^t, X_2^t, ..., X_k^t} by

1) X_1^t ~ [X_1 | X_2^{t-1}, X_3^{t-1}, ..., X_k^{t-1}]
2) X_2^t ~ [X_2 | X_1^t, X_3^{t-1}, ..., X_k^{t-1}]

j) X_j^t ~ [X_j | X_1^t, ..., X_{j-1}^t, X_{j+1}^{t-1}, ..., X_k^{t-1}]
...
k) X_k^t ~ [X_k | X_1^t, ..., X_{k-1}^t]

Under certain regularity conditions, the realizations X^1, X^2, X^3, ... form a Markov chain with stationary distribution [X]. Thus the realizations can be treated as dependent samples from the desired distribution.
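A minimal sketch of the scan in B), in Python; the function name and the idea of passing the full conditionals in as callables are illustrative conventions, not something from the notes.

```python
import numpy as np

def gibbs_sampler(x0, full_conditionals, n_iter, rng=None):
    """Generic Gibbs sampler (a sketch of steps A and B above).

    x0                : starting value, a length-k array (step A)
    full_conditionals : list of k functions; the j-th takes (x, rng) and returns a
                        draw of X_j given the current values of all other components
    n_iter            : number of complete scans (step B, repeated)
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    draws = np.empty((n_iter, x.size))
    for t in range(n_iter):
        for j, sample_j in enumerate(full_conditionals):
            x[j] = sample_j(x, rng)   # X_j^t ~ [X_j | everything else at its current value]
        draws[t] = x                  # one (dependent) realization of the chain
    return draws
```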

Example: Nuclear pump failure
Gaver & O'Muircheartaigh (Technometrics, 1987); Gelfand & Smith (JASA, 1990)

Observed 10 nuclear reactor pumps and counted the number of failures for each pump.

Pump   Failures (s_i)   Obs Time (t_i)   Obs Rate (l_i)
 1           5              94.320           0.053
 2           1              15.720           0.064
 3           5              62.880           0.080
 4          14             125.760           0.111
 5           3               5.240           0.573
 6          19              31.440           0.604
 7           1               1.048           0.954
 8           1               1.048           0.954
 9           4               2.096           1.910
10          22              10.480           2.099

(Obs Time in 1000's of hours; Obs Rate = Failures / Time)

Want to determine the true failure rate λ_i for each pump with the following hierarchical model:

s_i ~ Poisson(λ_i t_i)
λ_i | β ~ Gamma(α, β)
β ~ IGamma(γ, 1/δ)

Note: β ~ IGamma(γ, 1/δ) is equivalent to 1/β ~ Gamma(γ, 1/δ).

The joint distribution is

$$[\boldsymbol\lambda, \beta, S] = \prod_{i=1}^{10}\frac{(\lambda_i t_i)^{s_i} e^{-\lambda_i t_i}}{s_i!}\;\prod_{i=1}^{10}\frac{\lambda_i^{\alpha-1} e^{-\lambda_i/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}\;\frac{\delta^{\gamma} e^{-\delta/\beta}}{\beta^{\gamma+1}\,\Gamma(\gamma)}$$

Want to determine:
1) [λ_i | S] for each pump i = 1, ..., 10
2) [β | S], where S = {s_1, ..., s_10}

Note that both sets of these distributions are hard to get analytically. Integrating β out of the joint distribution shows that

$$[\boldsymbol\lambda \mid S] \propto \frac{\prod_{i=1}^{10} \lambda_i^{\alpha+s_i-1}\, e^{-\lambda_i t_i}}{\left(\delta + \sum_{i=1}^{10}\lambda_i\right)^{10\alpha+\gamma}}$$

where λ = {λ_1, ..., λ_10}. Note that the λ_i's are correlated and trying to get the marginal for each looks to be intractable analytically.

Run a Gibbs sampler to determine [λ, β | S]. From this sampler we can get the desired distributions [λ_i | S] and [β | S].

A possible Gibbs scheme:

Step 1) Sample λ_1 ~ [λ_1 | λ_(-1), β, S]
...
Step 10) Sample λ_10 ~ [λ_10 | λ_(-10), β, S]
Step 11) Sample β ~ [β | λ, S]

where λ_(-j) = {λ_1, ..., λ_{j-1}, λ_{j+1}, ..., λ_10}.

Need the following conditional distributions:

[λ_j | λ_(-j), β, S] = [λ_j | β, s_j] = Gamma( α + s_j, 1/(t_j + 1/β) )

[β | λ, S] = [β | λ] = IGamma( γ + 10α, δ + Σ_{i=1}^{10} λ_i )

These can be gotten from the joint distribution by including only the terms in the product that contain the random variable of interest:

$$[\boldsymbol\lambda, \beta, S] = \prod_{i=1}^{10}\frac{(\lambda_i t_i)^{s_i} e^{-\lambda_i t_i}}{s_i!}\;\prod_{i=1}^{10}\frac{\lambda_i^{\alpha-1} e^{-\lambda_i/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}\;\frac{\delta^{\gamma} e^{-\delta/\beta}}{\beta^{\gamma+1}\,\Gamma(\gamma)}$$

e.g. for λ_j, which terms above have a λ_j in them?

Equivalently, you can do this by looking at the graph structure of the model, including only the terms that correspond to edges joining the node of interest. E.g. for β, which edges connect with the node for β?

[Graph: β connected to each λ_i, and each λ_i connected to its s_i (the hierarchical model as a graph)]

Example run: α = 1.8, δ = 1, γ = 0.1, n = 1000 imputations, β^0 = λ̄ (the mean observed rate).
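A sketch of this Gibbs scheme for the pump data in Python (an illustration of the scheme, not the code used for the runs below); it treats the second Gamma argument as a scale, consistent with the prior λ_i | β ~ Gamma(α, β), and draws the IGamma as the reciprocal of a Gamma draw.

```python
import numpy as np

# Pump data: failures s_i and observation times t_i (in 1000's of hours)
s = np.array([5, 1, 5, 14, 3, 19, 1, 1, 4, 22], dtype=float)
t = np.array([94.320, 15.720, 62.880, 125.760, 5.240,
              31.440, 1.048, 1.048, 2.096, 10.480])

alpha, delta, gamma_ = 1.8, 1.0, 0.1     # hyperparameters from the example run
n_iter = 1000
rng = np.random.default_rng(1)

lam = s / t                              # observed rates, used only to initialize beta
beta = np.mean(s / t)                    # beta^0 = mean observed rate
lam_draws = np.empty((n_iter, 10))
beta_draws = np.empty(n_iter)

for it in range(n_iter):
    # Steps 1-10: lambda_j | beta, s_j ~ Gamma(alpha + s_j, scale = 1/(t_j + 1/beta))
    lam = rng.gamma(alpha + s, 1.0 / (t + 1.0 / beta))
    # Step 11: beta | lambda ~ IGamma(gamma + 10*alpha, delta + sum(lambda))
    beta = 1.0 / rng.gamma(gamma_ + 10 * alpha, 1.0 / (delta + lam.sum()))
    lam_draws[it], beta_draws[it] = lam, beta

print(lam_draws.mean(axis=0), beta_draws.mean())
```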

[Figure: estimated posterior densities of λ for pumps 1, 2, 7, 8, and 10, and of β, from the Gibbs sampler output]

Posterior summaries:

Pump    Mean     Median   Std Dev
 1      0.0702   0.0668   0.0268
 2      0.1542   0.1363   0.0925
 3      0.1039   0.0988   0.0399
 4      0.1233   0.1206   0.0310
 5      0.6263   0.5805   0.2924
 6      0.6136   0.6040   0.1351
 7      0.8241   0.7102   0.5267
 8      0.8268   0.7129   0.5309
 9      1.2949   1.2040   0.5776
10      1.8404   1.8121   0.3903

Beta    0.4372   0.4161   0.1315

[Figure: scatterplot of successive draws β^(i) vs β^(i+1); Cor(β^(i), β^(i+1)) = 0.302]

[Figure: scatterplots of successive draws λ^(i) vs λ^(i+1) for Pump 1, Cor = 0.012, and Pump 9, Cor = 0.091]

[Figure: scatterplots of successive draws λ^(i) vs λ^(i+1) for Pump 7, Cor = 0.063, and Pump 8, Cor = 0.142]

Target tracking with the Gibbs sampler

As mentioned last time, the smoothing problem, X_{1:k} | Y_{1:k}, isn't solved very well with SIS. However it can be done very easily with Gibbs sampling.

Step j, j = 1, ..., k-1: Draw X_j ~ [X_j | X_{j-1}, X_{j+1}, Y_j]
Step k: Draw X_k ~ [X_k | X_{k-1}, Y_k]

As all the components involved in these conditional distributions are normal, each of these conditional distributions is normal and thus easily sampled.
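As an illustration of why these draws are easy, here is a sketch of one such conditional draw assuming the simplest one-dimensional version of the model, X_j = X_{j-1} + w_j with w_j ~ N(0, Λ) and Y_j = X_j + v_j with v_j ~ N(0, Σ); the tracking model from the SIS lecture has more components (positions and velocities), but the form of the calculation is the same.

```python
import numpy as np

def draw_x_middle(x_prev, x_next, y, Lam, Sig, rng):
    """Draw X_j | X_{j-1}, X_{j+1}, Y_j for a 1-D random-walk state with noisy observation.

    The product of N(x_j; x_prev, Lam), N(x_next; x_j, Lam) and N(y; x_j, Sig)
    is normal in x_j with
        precision = 2/Lam + 1/Sig
        mean      = ((x_prev + x_next)/Lam + y/Sig) / precision
    """
    prec = 2.0 / Lam + 1.0 / Sig
    mean = ((x_prev + x_next) / Lam + y / Sig) / prec
    return rng.normal(mean, np.sqrt(1.0 / prec))

def draw_x_last(x_prev, y, Lam, Sig, rng):
    """Draw X_k | X_{k-1}, Y_k (only one movement term and one observation term)."""
    prec = 1.0 / Lam + 1.0 / Sig
    mean = (x_prev / Lam + y / Sig) / prec
    return rng.normal(mean, np.sqrt(1.0 / prec))
```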

In the SIS analysis, it was assumed that all of the parameters of the movement and measurement error distributions (all variances) and the starting point were known. This can easily be relaxed by putting priors on X_0, Λ, and Σ and sampling them as well as part of the Markov chain.

[Graph: Λ feeding the state chain X_0 -> X_1 -> X_2 -> X_3 -> ..., with observations Z_1, Z_2, Z_3, ... attached to the states and Σ feeding the observations]

The sampler needs to be modified as:

Step 0: Draw X_0 ~ [X_0 | X_1, Λ]
Step j, j = 1, ..., k-1: Draw X_j ~ [X_j | X_{j-1}, X_{j+1}, Y_j, Λ]
Step k: Draw X_k ~ [X_k | X_{k-1}, Y_k, Λ]

Step k+1: Draw Λ ~ [Λ | X_{0:k}]
Step k+2: Draw Σ ~ [Σ | X_{0:k}, Y_{1:k}]

This can be performed by Gibbs sampling if the prior on X_0 is normal and the priors on Λ and Σ are IGamma.

Conditions for Gibbs sampling to work

While you can always run the chain, it may not give the answer you want. That is, the realizations may not have the desired stationary distribution.

One-step transitions: p(x | y)
n-step transitions: p_n(x | y)
Stationary distribution: π(x) = lim_{n→∞} p_n(x | y)

If it exists, it satisfies

$$\pi(x) = \int p(x \mid y)\,\pi(y)\,dy$$

A stronger condition which shows that π(x) is the density of the stationary distribution is

$$\pi(x)\,p(y \mid x) = \pi(y)\,p(x \mid y) \quad \text{for all } x \text{ and } y \qquad \text{(detailed balance)}$$

Note that detailed balance implies stationarity, but stationarity doesn't imply detailed balance.

If the following two conditions hold, the chain will have the desired stationary distribution.

Irreducibility: The chain generated must be irreducible. That is, it is possible to get from each state to every other state in a finite number of steps.

Not all problems lead to irreducible chains.

Example: ABO blood types

The children's data (one child with blood type AB, one with blood type O) implies that the child with blood type AB must have genotype AB and that the child with blood type O must have genotype OO. The only possible way for the two children to inherit those genotypes is for one parent to have genotype AO and for the other parent to have genotype BO. However, it is not possible to say which parent has which genotype with certainty. By a simple symmetry argument,

P[Dad = AO & Mom = BO] = P[Dad = BO & Mom = AO] = 0.5

Let's try running a Gibbs sampler, by first generating Mom's genotype given Dad's and then Dad's given Mom's. Let's start the chain with Dad = AO.

Step 1: Generate Mom
P[Mom = AO | Dad = AO] = 0
P[Mom = BO | Dad = AO] = 1
so Mom = BO.

Step 2: Generate Dad
P[Dad = AO | Mom = BO] = 1
P[Dad = BO | Mom = BO] = 0
so Dad = AO.

This implies that every realization of the chain has Mom = BO & Dad = AO. If the chain is started with Dad = BO, every realization of that chain will have Mom = AO & Dad = BO.
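A tiny sketch of this two-block sampler (the genotype coding and function name are illustrative) makes the reducibility visible: whichever state the chain is started in, it never leaves it, so the 50/50 symmetry is never recovered.

```python
def gibbs_abo(dad_start, n_iter):
    """Gibbs sampler for (Mom, Dad) genotypes given children of blood types AB and O.

    The two states consistent with the data are (Mom, Dad) = (BO, AO) and (AO, BO),
    each with probability 0.5, but both full conditionals are degenerate (0/1), so
    no random draws are even needed.
    """
    dad = dad_start
    visited = set()
    for _ in range(n_iter):
        mom = "BO" if dad == "AO" else "AO"   # P[Mom = the other genotype | Dad] = 1
        dad = "AO" if mom == "BO" else "BO"   # P[Dad = the other genotype | Mom] = 1
        visited.add((mom, dad))
    return visited

print(gibbs_abo("AO", 1000))   # {('BO', 'AO')}: the chain started at Dad = AO never moves
```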

The reducible chain in this case does not have the correct stationary distribution. (Well, reducible chains don't really have a unique stationary distribution anyway.) But running the described Gibbs sampler will not correctly describe the distribution of the mother's and father's genotypes.

Aperiodicity: Don't want a periodic chain (e.g. one where certain states can only occur when t is even). This violates the idea that each state has a long-run marginal frequency.

Starting points

For every chain you need to specify a starting point. There are a number of approaches for choosing this.

1) Prior means. In the pump example, set β^0 = E[β] = δ/γ.

2) Estimate from data. In the pump example, E[λ_i] = αβ, so set β^0 = λ̄/α. In the target tracking example, set the starting positions at each time to the average observed positions, and difference these to get the velocities.

3) Sample from the prior.

4) Ad hoc choices. In the pump example, set β^0 = 0 or β^0 = ∞.

For many problems, this choice can be important. The stationary distribution is an asymptotic property and it may take a long time for the chain to converge.

[Figure: trace plots of β over the first 50 imputations for three starting values: β^0 = λ̄/α, β^0 = ∞, and β^0 = 0]

Starting with β^0 = 0 (actually 10^-100), the initial draws are not consistent with the stationary distribution seen later in the chain. While for this example the problem clears up quickly, for other problems it can take a while. This is more common with larger problems, which might have millions, or maybe billions, of variables being sampled in a single complete scan through the data. This can occur with large space-time problems, such as the Tropical Pacific sea surface temperature predictions discussed at <http://www.stat.ohiostate.edu/~sses/collab_enso.php>.

[Figure: forecast map for December 2002 based on data from January 1970 to May 2002, and the observed December 2002 map]

The usual approach is to have a burn-in period where the initial samples are thrown away, since they may not be representative of samples from the stationary distribution.

The following table contains estimates of the posterior means of the 11 parameters in the pump example with 3 different starting points. The first 200 imputations were discarded and then the next 1000 imputations were sampled.

Pump    β^0 = λ̄/α   β^0 = ∞    β^0 = 0
 1       0.0688      0.0704     0.0715
 2       0.1531      0.1531     0.1575
 3       0.1064      0.1024     0.1050
 4       0.1234      0.1236     0.1221
 5       0.6008      0.6198     0.6319
 6       0.6116      0.6145     0.6163
 7       0.7744      0.8501     0.8118
 8       0.8173      0.8224     0.8190
 9       1.2584      1.2748     1.2857
10       1.8393      1.8536     1.8409
β        0.4256      0.4358     0.4334

Often the bigger the problem, the longer the burn-in period desired. However, those are the problems where time considerations will limit the total number of imputations that can be done. So you do want to think about starting values for your chain.

Gibbs sampling and Bayes: choice of priors

For Gibbs sampling to be efficient, the draws in each step of the procedure need to be feasible. That suggests that conjugate distributions need to be used as part of the hierarchical model, as was done in the pump and target tracking examples. However, conjugacy is not strictly required, as rejection sampling with log-concave distributions might be able to be used in some problems. This idea is sometimes used in the software package WinBUGS (Bayesian analysis Using Gibbs Sampling).

However, for some problems the model you want to analyze is not conjugate and the tricks to get around non-conjugacy won't work. For example, let's change the model for the pump example to

s_i ~ Poisson(λ_i t_i)
λ_i | µ, σ² ~ LogN(µ, σ²)
µ ~ Logistic(ν, τ)
σ² ~ Weibull(α, β)

Good luck running a Gibbs sampler on this model (I think). Other sampling techniques are needed for this and other more complicated problems.

Metropolis-Hastings Algorithm (M-H)

A general approach for constructing a Markov chain that has the desired stationary distribution π = (π_j).

1) Proposal distribution: Assume that X_t = i. Need to propose a new state j with distribution q_i = (q_{ij}).

2) Calculate the Hastings ratio

$$a_{ij} = \min\!\left(\frac{\pi_j\,q_{ji}}{\pi_i\,q_{ij}},\ 1\right)$$

3) Acceptance/rejection step: Generate U ~ U(0,1) and set

X_{t+1} = j if U ≤ a_{ij}, and X_{t+1} = i (= X_t) otherwise.
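A minimal sketch of one M-H update in Python; `log_pi`, `propose`, and `log_q` are placeholders for whatever target and proposal are being used, and the densities only need to be known up to normalizing constants (see note 3 below).

```python
import numpy as np

def mh_step(x, log_pi, propose, log_q, rng):
    """One Metropolis-Hastings update from the current state x = X_t.

    log_pi  : log of the (possibly unnormalized) target density pi
    propose : function (x, rng) -> a proposed state y drawn from q(. | x)
    log_q   : function (y, x) -> log q(y | x), also only needed up to a constant
    """
    y = propose(x, rng)
    # Hastings ratio on the log scale: log[ pi(y) q(x|y) / (pi(x) q(y|x)) ]
    log_hr = (log_pi(y) + log_q(x, y)) - (log_pi(x) + log_q(y, x))
    if np.log(rng.uniform()) < min(0.0, log_hr):   # accept with probability min(HR, 1)
        return y
    return x
```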

Notes:

1) Gibbs sampling is a special case of M-H, as for each step

π_j q_{ji} / (π_i q_{ij}) = 1

which implies the relationship also holds for a complete scan through all the variables.

2) The Metropolis algorithm (Metropolis et al., 1953) was based on a symmetric proposal distribution (q_{ij} = q_{ji}), giving

a_{ij} = min( π_j / π_i, 1 )

So a higher-probability state will always be accepted.

3) As with many other sampling procedures, π and q only need to be known up to normalizing constants, as these cancel out when calculating the Hastings ratio.

4) Periodicity usually isn't a problem. For many proposals, q_{ii} > 0 for all i. Also, if a_{ij} < 1 for some j, then P[X_{t+1} = i | X_t = i] > 0, so such states have period 1, which implies the chain is aperiodic.

5) q_{ij} a_{ij} gives the 1-step transition probabilities of the chain (e.g. its p(x | y) in the earlier notation).

6) Detailed balance is easy. Without loss of generality, assume that

π_j q_{ji} / (π_i q_{ij}) < 1   (which implies a_{ij} < 1 and a_{ji} = 1)

Then

$$\pi_i q_{ij} a_{ij} = \pi_i q_{ij}\,\frac{\pi_j q_{ji}}{\pi_i q_{ij}} = \pi_j q_{ji} = \pi_j q_{ji} a_{ji}$$

7) The big problem is irreducibility. However, setting the proposal to correspond to an irreducible chain solves this.

Proposal distribution ideas:

1) Approximate the distribution. For example, use a normal with similar means and variances, or use a t with a moderate number of degrees of freedom.

2) Random walk: q(y | x) = q(y - x). If there is a continuous state process, you could use

y = x + ε;  ε ~ q

For a discrete process, you could use

q(j | i) = 0.4 if j = i - 1
         = 0.2 if j = i
         = 0.4 if j = i + 1

3) Autoregressive chain:

y = a + B(x - a) + z;  z ~ q

For the random walk and autoregressive chains, q does not need to correspond to a symmetric distribution (though that is common).

4) Independence sampler: q(y | x) = q(y). For an independence sampler you want q to be similar to π, since

a_{ij} = min( (π_j q_i) / (π_i q_j), 1 )

If they are too different, q_i / π_i could get very small, making it difficult to move away from state i. (The chain mixes slowly.)

5) Block at a time: Deal with variables in blocks like the Gibbs sampler. Sometimes referred to as Metropolis-within-Gibbs. Allows complex problems to be broken down into simpler ones. Any M-H style update can be used within each block (e.g. random walk for one block, independence sampler for the next, Gibbs for the one after that). Allows for a Gibbs-style sampler, but without the worry about conjugate distributions in the model to make sampling easier.

Pump example:

s_i ~ Poisson(λ_i t_i)
λ_i | µ, σ² ~ LogN(µ, σ²)
µ ~ N(ν, τ²)
σ² ~ IGamma(γ, δ)

Can perform Gibbs on µ and σ² but not on the λ_i, due to the non-conjugacy of the Poisson and log-normal distributions.

Step i, i = 1, ..., 10 (M-H):

Sample λ_i from [λ_i | s_i, µ, σ²] with proposal

λ_i* ~ logN(log λ_i, θ²)   (multiplicative random walk)

$$\mathrm{HR} = \frac{(\lambda_i^* t_i)^{s_i} e^{-\lambda_i^* t_i}\,\dfrac{1}{\lambda_i^*\sigma}\,\varphi\!\left(\dfrac{\log\lambda_i^* - \mu}{\sigma}\right)\dfrac{1}{\lambda_i\theta}\,\varphi\!\left(\dfrac{\log\lambda_i - \log\lambda_i^*}{\theta}\right)}{(\lambda_i t_i)^{s_i} e^{-\lambda_i t_i}\,\dfrac{1}{\lambda_i\sigma}\,\varphi\!\left(\dfrac{\log\lambda_i - \mu}{\sigma}\right)\dfrac{1}{\lambda_i^*\theta}\,\varphi\!\left(\dfrac{\log\lambda_i^* - \log\lambda_i}{\theta}\right)}$$

a = min(HR, 1)

Step 11 (Gibbs):

Sample µ from [µ | λ, σ², ν, τ²] ~ N(mean, var), where

$$\text{mean} = \text{var}\left(\frac{\sum_i \log\lambda_i}{\sigma^2} + \frac{\nu}{\tau^2}\right),\qquad \text{var} = \left(\frac{n}{\sigma^2} + \frac{1}{\tau^2}\right)^{-1}$$

Step 12 (Gibbs):

Sample σ² from [σ² | λ, µ, γ, δ] ~ IGamma( γ + 5, δ + Σ_i (log λ_i - µ)² / 2 )

Parameters for the run:

Burn-in: 1000
Imputations: 100,000
ν = -50, τ² = 100, γ = 1, δ = 100, θ² = 0.01

Starting values:

λ_i^0 = l_i (the observed rates)
µ^0 = (1/10) Σ_i log l_i
(σ²)^0 = (1/9) Σ_i (log l_i - µ^0)²
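A sketch of this block-at-a-time sampler in Python, using the hyperparameters and starting values just listed; it works on the log scale, where the symmetric φ terms of the log-normal proposal cancel and only the λ_i/λ_i* Jacobian factor survives in the Hastings ratio. It illustrates the scheme and is not the code behind the results that follow.

```python
import numpy as np

# Pump data
s = np.array([5, 1, 5, 14, 3, 19, 1, 1, 4, 22], dtype=float)
t = np.array([94.320, 15.720, 62.880, 125.760, 5.240,
              31.440, 1.048, 1.048, 2.096, 10.480])

# Run parameters from the slide above
nu, tau2, gamma_, delta, theta2 = -50.0, 100.0, 1.0, 100.0, 0.01
n_burn, n_imp = 1000, 100_000
rng = np.random.default_rng(3)

def log_target(lam_i, s_i, t_i, mu, sig2):
    # log of (lam t)^s e^{-lam t} times the logN(mu, sig2) density, up to constants
    return (s_i * np.log(lam_i) - lam_i * t_i
            - np.log(lam_i) - (np.log(lam_i) - mu) ** 2 / (2.0 * sig2))

lam = s / t                                   # starting values: observed rates
mu = np.mean(np.log(lam))
sig2 = np.sum((np.log(lam) - mu) ** 2) / 9.0
draws = []

for it in range(n_burn + n_imp):
    # Steps 1-10 (M-H): multiplicative random walk lam* = lam * exp(theta Z)
    for i in range(10):
        prop = lam[i] * np.exp(np.sqrt(theta2) * rng.standard_normal())
        log_hr = (log_target(prop, s[i], t[i], mu, sig2)
                  - log_target(lam[i], s[i], t[i], mu, sig2)
                  + np.log(prop) - np.log(lam[i]))      # Jacobian of the proposal
        if np.log(rng.uniform()) < log_hr:
            lam[i] = prop
    # Step 11 (Gibbs): mu | lam, sig2 ~ N(mean, var)
    var = 1.0 / (10.0 / sig2 + 1.0 / tau2)
    mu = rng.normal(var * (np.sum(np.log(lam)) / sig2 + nu / tau2), np.sqrt(var))
    # Step 12 (Gibbs): sig2 | lam, mu ~ IGamma(gamma + 5, delta + sum((log lam - mu)^2)/2)
    sig2 = 1.0 / rng.gamma(gamma_ + 5.0,
                           1.0 / (delta + np.sum((np.log(lam) - mu) ** 2) / 2.0))
    if it >= n_burn:
        draws.append(np.concatenate([lam, [mu, sig2]]))

draws = np.array(draws)
```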

Other options:

1) Combine steps 1-10 into a single draw. With this option all the λ_i's change or none do. In the sampler used, whether each λ_i changes is independent of the others. The option used is probably preferable, as it should lead to better mixing of the chain.

2) Combine sampling λ, µ, and σ² into a single M-H step. Probably suboptimal, as the proposal distribution won't be a great match for the joint posterior distribution of λ, µ, and σ².

Rejection rates

Having some rejection can be good. With the multiplicative random walk sampler used, if θ² is too small, there will be very few rejections, but the sampler will move too slowly through the space. Increasing θ² will lead to better mixing, as bigger jumps can be made, though it will also lead to higher rejection rates. You need to find a balance between rejection rates, mixing of the chain, and coverage of the state space. For some problems a rejection rate of 50% is fine, and I've seen reports that for large problems using normal random walk proposals, rejection rates of 75% are optimal.

Rejection rates for the failure rate proposals under different random walk variances θ²:

Pump   θ² = 0.000001   θ² = 0.0001   θ² = 0.01   θ² = 0.04
 1        0.00012        0.00613      0.07045     0.13776
 2        0.00009        0.00531      0.03141     0.06130
 3        0.00034        0.00784      0.07107     0.13754
 4        0.00043        0.01126      0.11705     0.22482
 5        0.00028        0.00691      0.05521     0.10705
 6        0.00126        0.01442      0.13511     0.26028
 7        0.00012        0.00148      0.03027     0.05735
 8        0.00007        0.00414      0.02854     0.05824
 9        0.00024        0.00559      0.06105     0.12131
10        0.00070        0.01461      0.14790     0.27735

[Figure: trace plots of λ_1 over 100,000 iterations for θ² = 0.000001, 0.0001, 0.01, and 0.04]

Standard errors in MCMC

As discussed before, the correlation of the chain must be taken into account when determining standard errors of quantities estimated by the sampler. Suppose we use the sample mean x̄ to estimate the corresponding posterior mean, and that the burn-in period was long enough to get into the stationary distribution. Then

$$\mathrm{Var}(\bar{x}) = \frac{\sigma^2}{n^2}\left[\,n + 2\sum_{j=1}^{n-1}(n-j)\,\rho_j\right]$$

For a reasonable chain, the autocorrelations will die off, so let's assume that they are negligible for j > K. Then the above reduces to

$$\mathrm{Var}(\bar{x}) = \frac{\sigma^2}{n^2}\left[\,n + 2\sum_{j=1}^{K}(n-j)\,\rho_j\right]$$

If the autocorrelations die off fairly quickly, σ² and ρ_j can be estimated consistently (though with some bias) by the usual empirical moments.
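A sketch of this estimate in Python, plugging the usual empirical moments into the truncated formula; the function name and the choice of K are illustrative.

```python
import numpy as np

def mcmc_se(x, K):
    """Standard error of the chain mean using autocorrelations up to lag K.

    Implements Var(xbar) = (sigma^2/n^2) [n + 2 sum_{j<=K} (n-j) rho_j] with
    sigma^2 and rho_j replaced by the empirical variance and autocorrelations.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    sigma2 = np.mean(xc ** 2)
    rho = np.array([np.sum(xc[:-j] * xc[j:]) / (n * sigma2) for j in range(1, K + 1)])
    var_mean = (sigma2 / n ** 2) * (n + 2.0 * np.sum((n - np.arange(1, K + 1)) * rho))
    return np.sqrt(var_mean)
```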

Another approach is blocking. Assume that n = Jm for integers J and m. Then let

$$\bar{x}_j = \frac{1}{m}\sum_{i=(j-1)m+1}^{jm} x_i, \qquad j = 1, \dots, J$$

Note that the average of the x̄_j is x̄. If m is large relative to K, then the correlations between the x̄_j should be negligible and the variance can be estimated as if the x̄_j were independent.

If the correlation is slightly larger, it might be reasonable to assume that the correlation between x̄_j and x̄_{j+1} is some value ρ to be determined, but that correlations at larger lags are negligible. In this case

$$\mathrm{Var}(\bar{x}) \approx \frac{1 + 2\rho}{J}\,\mathrm{Var}(\bar{x}_j)$$
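A sketch of the blocking estimate in Python; it returns both the standard error adjusted by the lag-1 correlation ρ of the block means and ρ itself, matching the columns of the tables that follow (the function name is illustrative).

```python
import numpy as np

def blocked_se(x, m):
    """Standard error of the chain mean by blocking.

    Splits the chain into J = n//m blocks of length m, computes the block means
    xbar_j, and estimates Var(xbar) as Var(xbar_j) (1 + 2 rho) / J, where rho is
    the lag-1 correlation of the block means (setting rho = 0 gives the
    'as if independent' version).
    """
    x = np.asarray(x, dtype=float)
    J = x.size // m
    block_means = x[: J * m].reshape(J, m).mean(axis=1)
    v = block_means.var(ddof=1) / J                              # independent-blocks variance
    rho = np.corrcoef(block_means[:-1], block_means[1:])[0, 1]   # lag-1 correlation
    return np.sqrt(v * (1.0 + 2.0 * rho)), rho
```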

Estimates with m = 100

Parameter      x̄         SE        ρ
λ_1          0.05290   0.00071   0.36116
λ_2          0.06926   0.00277   0.66197
λ_3          0.07837   0.00106   0.35354
λ_4          0.11053   0.00056   0.10520
λ_5          0.56167   0.01119   0.46975
λ_6          0.60546   0.00237   0.10960
λ_7          0.92318   0.04068   0.67346
λ_8          0.90361   0.03766   0.63510
λ_9          1.82900   0.02884   0.33629
λ_10         2.10188   0.00726   0.05263
µ           -2.52492   0.01384   0.41517
σ²          27.15958   0.09967   0.07579

Estimates with m = 1000

Parameter      x̄         SE        ρ
λ_1          0.05290   0.00075   0.13239
λ_2          0.06926   0.00399   0.18756
λ_3          0.07837   0.00088  -0.13079
λ_4          0.11053   0.00045  -0.15794
λ_5          0.56167   0.01205  -0.00838
λ_6          0.60546   0.00226  -0.07845
λ_7          0.92318   0.06081   0.12201
λ_8          0.90361   0.04822   0.04495
λ_9          1.82900   0.03303   0.07779
λ_10         2.10188   0.00757   0.06487
µ           -2.52492   0.01981   0.15224
σ²          27.15958   0.13956   0.29726

Standard error estimates for the pump example

Parameter   m = 1000    m = 100    Independent
λ_1         0.000752    0.000710    0.000075
λ_2         0.003992    0.002769    0.000205
λ_3         0.000885    0.001063    0.000111
λ_4         0.000446    0.000555    0.000094
λ_5         0.012051    0.011193    0.001009
λ_6         0.002258    0.002373    0.000439
λ_7         0.060813    0.040679    0.002970
λ_8         0.048219    0.037656    0.002807
λ_9         0.033030    0.028835    0.002945
λ_10        0.007568    0.007264    0.001428
µ           0.019808    0.013840    0.005729
σ²          0.139560    0.099674    0.056767