Control Variates for Markov Chain Monte Carlo


Dellaportas, P., Kontoyiannis, I., and Tsourti, Z.
Dept. of Statistics, AUEB; Dept. of Informatics, AUEB
1st Greek Stochastics Meeting, "Monte Carlo: Probability and Methods"
Lefkada, August 29, 2009 (42 slides)

Outline of presentation
1. MCMC and variance considerations
2. Theoretical framework for control variates in MCMC
3. Control variates for random-walk Metropolis-Hastings algorithms
4. Application of control variates to RW-MH algorithms
5. Summary and further research

Part 1: MCMC and variance considerations

MCMC framework
Let:
- π(·) a probability measure on (X, B)
- F : X → R a function of interest
- π(F) := E_π[F(X)] := ∫ F dπ, the quantity to be estimated

In the MCMC setting we have a Markov chain (X_n) on X with transition kernel P and stationary probability measure π, and π(F) is estimated by the ergodic average

    μ_n(F) = (1/n) Σ_{i=1}^n F(X_i)

Main theorems for Markov chains
Ergodic Theorem: for ergodic Markov chains and appropriate F (π(|F|) < ∞),

    lim_{n→∞} μ_n(F) = π(F)   a.s.

Central Limit Theorem (CLT): under appropriate additional conditions,

    √n [μ_n(F) − π(F)] = (1/√n) Σ_{i=1}^n [F(X_i) − π(F)] →_D N(0, σ_F²),

where σ_F², the asymptotic variance of F, is

    σ_F² := lim_{n→∞} Var_π[√n μ_n(F)] = Σ_{n=−∞}^{∞} Cov_π[F(X_0), F(X_n)]
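
As a concrete illustration of these two theorems (a minimal sketch of my own, not from the talk): for a stationary Gaussian AR(1) chain with autocorrelation ρ, the asymptotic variance of F(x) = x is known in closed form, σ_F² = (1 + ρ)/(1 − ρ), so the ergodic average and a batch-means estimate of σ_F² can be checked against the truth. All names and parameter choices below are illustrative assumptions.

```python
import numpy as np

def batch_means_var(fs, n_batches=200):
    """Batch-means estimate of the asymptotic variance sigma_F^2:
    split F(X_1),...,F(X_n) into batches and rescale the variance
    of the batch averages by the batch length."""
    b = len(fs) // n_batches
    means = fs[: b * n_batches].reshape(n_batches, b).mean(axis=1)
    return b * means.var(ddof=1)

# Stationary AR(1) chain: X_{t+1} = rho*X_t + sqrt(1-rho^2)*eps_t, pi = N(0,1).
rho, n = 0.5, 200_000
rng = np.random.default_rng(0)
eps = rng.standard_normal(n)
xs = np.empty(n)
xs[0] = rng.standard_normal()
for t in range(1, n):
    xs[t] = rho * xs[t - 1] + np.sqrt(1 - rho**2) * eps[t]

mu_n = xs.mean()                  # ergodic average; pi(F) = 0 here
sigma2_hat = batch_means_var(xs)  # should be near (1+rho)/(1-rho) = 3
```

Batch means is one standard plug-in for σ_F²; the talk does not prescribe a particular estimator.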

Approaches to variance reduction:
- Importance sampling
- Antithetic sampling/variates
- Control variates
- Rao-Blackwellization

Part 2: Theoretical framework for control variates in MCMC
- Introduction of control variates to Markov chains
- Optimal results for reversible chains
- Extension to multiple control variates
- Control variates for MCMC algorithms

Introduction of control variates to Markov chains
Let:
- π(·) target probability measure on (X, B)
- F : X → R function of interest
- π(F) := E_π[F(X)] := ∫ F dπ, the quantity to be estimated
- (X_n) Markov chain on X, with transition kernel P and stationary probability measure π

Let further:
- a function U : X → R with π(U) = 0
- the modified function F_θ = F − θU
- the modified estimator μ_n(F_θ) = μ_n(F) − θ μ_n(U)
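
To make the mechanics of the modified estimator concrete, here is a minimal sketch of my own in the simplest (i.i.d.) setting, with illustrative choices π = N(0,1), F(x) = eˣ, and U(x) = x, so that π(U) = 0 holds exactly; none of this is from the talk itself.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(100_000)      # i.i.d. draws from pi = N(0,1)

F = np.exp(x)                         # pi(F) = e^{1/2} ~ 1.6487
U = x                                 # control variate with pi(U) = 0 exactly

# Variance-minimising coefficient in the i.i.d. case: Cov(F, U) / Var(U).
theta = np.cov(F, U)[0, 1] / x.var()

mu_plain = F.mean()                   # mu_n(F)
mu_cv = F.mean() - theta * U.mean()   # mu_n(F_theta) = mu_n(F) - theta*mu_n(U)
var_ratio = (F - theta * U).var() / F.var()   # per-sample variance ratio
```

Both estimators are unbiased for π(F); the control variate simply subtracts a correlated zero-mean term, which is exactly the structure the slides build on for dependent (Markov chain) samples.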

All the regularity properties of F also hold for F_θ:

    lim_{n→∞} μ_n(F_θ) := lim_{n→∞} (1/n) Σ_{i=1}^n F_θ(X_i) = π(F_θ) = π(F)   a.s.

and

    √n [μ_n(F_θ) − π(F)] = (1/√n) Σ_{i=1}^n [F_θ(X_i) − π(F)] →_D N(0, σ_{F_θ}²)

Research interest: find an appropriate function U and parameter θ so as to significantly reduce σ_{F_θ}² compared with σ_F².

Choice of function U
In this setting we use

    U = G − PG,  for arbitrary G with π(|G|) < ∞,

where

    PG(x) = E[G(X_1) | X_0 = x].

Then: which U? Which G?

Elaboration on the variance of the modified estimator, σ_{F_θ}²
An alternative expression for the variance in the CLT is

    σ_F² = π(F̂² − (PF̂)²),

where F̂ is the solution of Poisson's equation:

    F̂ − PF̂ = F − π(F).

Analogously,

    σ_θ² := σ_{F_θ}² = π(F̂_θ² − (PF̂_θ)²).

Findings from the elaboration of σ_θ²:
- If G = F̂ (taking θ = 1), then σ_{F_θ}² = 0.
- The higher the correlation between F and U, the smaller σ_{F_θ}².
Guideline: choose G as close as possible to F̂, leading to U = G − PG highly correlated with F.

Optimal value of the parameter θ
It can be derived that σ_θ² is a quadratic function of θ. Thus

    θ* = π(F̂G − (PF̂)(PG)) / π(G² − (PG)²).

An alternative expression (proved by Dellaportas and Kontoyiannis, 2008), with practical usefulness:

    θ* = π(F̂G − (PF̂)(PG)) / E_π[(G(X_1) − PG(X_0))²].

Optimal empirical estimate of θ for reversible chains
If the chain (X_n) is reversible, Dellaportas and Kontoyiannis (2008) prove:

Theorem. π(F̂G − (PF̂)(PG)) = π((F − π(F))(G + PG)).

So the optimal value of θ (for reversible chains) can be expressed as

    θ*_rev = π((F − π(F))(G + PG)) / E_π[(G(X_1) − PG(X_0))²].

In this case, θ can be adaptively estimated as

    θ̂_n = [μ_n(F(G + PG)) − μ_n(F) μ_n(G + PG)] / [(1/n) Σ_{i=1}^n (G(X_i) − PG(X_{i−1}))²].
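
The adaptive estimator θ̂_n can be coded directly. The sketch below (my own illustration, not the authors' code) checks it on a stationary Gaussian AR(1) chain, which is reversible and for which PG is available in closed form when G(x) = x: PG(x) = ρx, and the optimal coefficient works out to 1/(1 − ρ).

```python
import numpy as np

def theta_hat(xs, F, G, PG):
    """theta_hat_n = [mu_n(F(G+PG)) - mu_n(F) mu_n(G+PG)]
                     / [(1/n) sum_i (G(X_i) - PG(X_{i-1}))^2]"""
    f, g, pg = F(xs), G(xs), PG(xs)
    s = g + pg
    num = np.mean(f * s) - f.mean() * s.mean()
    den = np.mean((g[1:] - pg[:-1]) ** 2)
    return num / den

# Stationary AR(1) chain (reversible); for G(x) = x, PG(x) = rho*x exactly.
rho, n = 0.5, 50_000
rng = np.random.default_rng(2)
xs = np.empty(n)
xs[0] = rng.standard_normal()
for t in range(1, n):
    xs[t] = rho * xs[t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal()

th = theta_hat(xs, F=lambda x: x, G=lambda x: x, PG=lambda x: rho * x)
# The optimal value here is 1/(1 - rho) = 2: with theta = 2, F - theta*U is
# constant, so the modified estimator has zero asymptotic variance.
```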

Extension to multiple control variates
Assume further k functions U_j = G_j − PG_j : X → R with π(U_j) = 0. In vector notation,

    G = (G_1, ..., G_k)ᵀ : X → Rᵏ,   U = (U_1, ..., U_k)ᵀ : X → Rᵏ,   θ = (θ_1, ..., θ_k)ᵀ ∈ Rᵏ,

and the modified function is

    F_θ = F − ⟨θ, U⟩ = F − Σ_{j=1}^k θ_j U_j.

Analogously to the one-dimensional case, we have:

    σ_{F_θ}² = σ_F² − 2 π(F̂⟨θ, G⟩ − (PF̂)⟨θ, PG⟩) + π(⟨θ, G⟩² − ⟨θ, PG⟩²)

    θ* = K(G)⁻¹ π(F̂G − (PF̂)(PG)),

where

    K(G)_{ij} = E_π[(G_i(X_1) − PG_i(X_0))(G_j(X_1) − PG_j(X_0))].

For reversible chains, Dellaportas and Kontoyiannis (2008) prove:

Theorem. θ* = K(G)⁻¹ π((F − π(F))(G + PG)).
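
A sketch of the multiple-control-variate estimate: solve the empirical system K̂(G) θ = π̂((F − π(F))(G + PG)) with plug-in moments. The AR(1) test case and the closed-form PG_j it uses (PG(x) = ρx for G = x, and PG(x) = ρ²x² + 1 − ρ² for G = x²) are my own illustrative assumptions, not from the talk.

```python
import numpy as np

def theta_hat_multi(xs, F, Gs, PGs):
    """Solve K_hat theta = b_hat, with
    K_hat[i,j] = mean of (G_i(X_t)-PG_i(X_{t-1}))(G_j(X_t)-PG_j(X_{t-1})),
    b_hat[i]   = mu_n(F (G_i+PG_i)) - mu_n(F) mu_n(G_i+PG_i)."""
    f = F(xs)
    g = np.column_stack([G(xs) for G in Gs])
    pg = np.column_stack([PG(xs) for PG in PGs])
    s = g + pg
    b = (f[:, None] * s).mean(axis=0) - f.mean() * s.mean(axis=0)
    d = g[1:] - pg[:-1]                 # G(X_t) - PG(X_{t-1}), row by row
    K = d.T @ d / len(d)
    return np.linalg.solve(K, b)

# AR(1) test chain with Gs = (x, x^2); the optimal vector here is (2, 0),
# since theta_1 = 2 already makes F - <theta, U> constant.
rho, n = 0.5, 50_000
rng = np.random.default_rng(3)
xs = np.empty(n)
xs[0] = rng.standard_normal()
for t in range(1, n):
    xs[t] = rho * xs[t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal()

th = theta_hat_multi(
    xs,
    F=lambda x: x,
    Gs=[lambda x: x, lambda x: x**2],
    PGs=[lambda x: rho * x, lambda x: rho**2 * x**2 + 1 - rho**2],
)
```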

Use of control variates for reducing the variance of MCMC algorithms
- The control variate methodology is directly applicable to reversible MCMC algorithms.
- Derivation of μ_n(F_θ) is straightforward. Quantities needed:
  - F(X_i) and G_j(X_i), for i = 1, ..., n and j = 1, ..., k: readily available.
  - BUT PG_j(X_i)? This issue may require further attention.
- Most popular MCMC algorithms:
  - Gibbs sampler: treated in detail in Dellaportas and Kontoyiannis (2008)
  - Metropolis-Hastings, random walk (RW-MH)

Part 3: Control variates for random-walk Metropolis-Hastings algorithms
- Introduction to the algorithm
- Elaboration on PG(x)

Setting of RW-MH
Assume the target distribution is a d-dimensional probability density π(x). The random-walk Metropolis-Hastings (RW-MH) algorithm for simulating from π can be described as follows:
- Start from an initial value X_0 and propose Y_0 = X_0 + Δ_0, where Δ_0 ~ P, a d-variate symmetric distribution.
- At step t + 1, given X_t = x_t and the proposal Y_t = y_t = x_t + Δ_t, set

    X_{t+1} = y_t with probability ρ(x_t, y_t), and X_{t+1} = x_t with probability 1 − ρ(x_t, y_t),

  where ρ(x_t, y_t) = min{1, π(y_t)/π(x_t)}, and then simulate Δ_{t+1} ~ P.
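
The steps above can be sketched as a generic implementation, using Gaussian increments as the symmetric proposal; the function name, interface, and the example target are my own choices, with the target and proposal variances matching the univariate normal experiment later in the talk.

```python
import numpy as np

def rw_metropolis(logpi, x0, n_steps, step, rng):
    """Random-walk Metropolis: propose Y_t = X_t + Delta_t with symmetric
    (here Gaussian) increments; accept with prob. min(1, pi(Y_t)/pi(X_t))."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    d = x.size
    xs = np.empty((n_steps, d))
    lp = logpi(x)
    accepted = 0
    for t in range(n_steps):
        y = x + step * rng.standard_normal(d)   # symmetric increment
        ly = logpi(y)
        if np.log(rng.uniform()) < ly - lp:     # accept/reject on the log scale
            x, lp = y, ly
            accepted += 1
        xs[t] = x
    return xs, accepted / n_steps

# Example: target N(0, 10) with proposal increments of variance 160.
rng = np.random.default_rng(4)
xs, acc = rw_metropolis(lambda v: -0.5 * float(v @ v) / 10.0,
                        0.0, 50_000, np.sqrt(160.0), rng)
```

Working on the log scale avoids overflow in π(y)/π(x) and only requires π up to a normalizing constant.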

Use of control variates for RW-MH: elaboration on PG(x)

    PG(x) = E[G(X_1) | X_0 = x]
          = E_{Y_0}{ E[G(X_1) | X_0 = x, Y_0 = y] }
          = E_{Y_0}{ ρ(x, Y_0) G(Y_0) + (1 − ρ(x, Y_0)) G(x) }
          = G(x) + E_{Y_0}{ ρ(x, Y_0) (G(Y_0) − G(x)) },   where Y_0 ~ q(y | x).

Several approaches may be considered for the estimation of PG(x):
- Monte Carlo estimation based on P
- Importance sampling based on the proposed values y
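
The first approach, plain Monte Carlo over the proposal, can be sketched as below; the block also cross-checks the estimate against a brute-force simulation of a single MH step. The target, proposal, and all names are my own illustrative choices.

```python
import numpy as np

def pg_mc(x, G, logpi, step, n_pg, rng):
    """Monte Carlo estimate of
    PG(x) = G(x) + E_Y[rho(x, Y)(G(Y) - G(x))],  Y = x + step*eps."""
    ys = x + step * rng.standard_normal(n_pg)
    rho = np.minimum(1.0, np.exp(logpi(ys) - logpi(x)))
    return G(x) + np.mean(rho * (G(ys) - G(x)))

logpi = lambda v: -0.5 * v**2 / 10.0      # target N(0, 10), unnormalised
G = lambda v: v
x0, step = 1.0, np.sqrt(160.0)
rng = np.random.default_rng(5)

pg = pg_mc(x0, G, logpi, step, 200_000, rng)

# Brute-force check: average G over many single MH transitions from x0.
ys = x0 + step * rng.standard_normal(200_000)
accept = rng.uniform(size=200_000) < np.minimum(1.0, np.exp(logpi(ys) - logpi(x0)))
direct = np.mean(np.where(accept, G(ys), G(x0)))
```

Both quantities estimate E[G(X_1) | X_0 = x0], so they should agree up to Monte Carlo error.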

Part 4: Application of control variates to RW-MH algorithms
- The simple case of the univariate normal
- Case study of a survival analysis
- Poisson generation
- Heavy-tailed distributions

Application to RW-MH: univariate normal
- Target: N(0, σ_t² = 10); proposal: N(0, σ_pr² = 160)
- We use F(x) = x and G(x) = x.

Framework for the evaluation of variance reduction:
- Optimal value of θ based on the Dellaportas and Kontoyiannis (2008) approach for reversible chains
- Terms of PG(x) assessed using Monte Carlo estimates from the proposal distribution
- T = 100 repetitions for each simulation scenario
- Variance reduction factor: VRF = S²_{μ_n(F)} / S²_{μ_n(F_θ)}
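
A compact end-to-end sketch of this experiment, reconstructed from the description above (smaller n and T than the talk's, so runtimes stay modest; all code and names are my own, not the authors'):

```python
import numpy as np

def rw_chain(n, step, sigma2, rng):
    """Random-walk Metropolis chain targeting N(0, sigma2)."""
    xs = np.empty(n)
    x = 0.0
    for t in range(n):
        y = x + step * rng.standard_normal()
        if np.log(rng.uniform()) < 0.5 * (x * x - y * y) / sigma2:
            x = y
        xs[t] = x
    return xs

def pg_hat(xs, step, sigma2, n_pg, rng):
    """Monte Carlo estimate of PG at every visited state, for G(x) = x."""
    ys = xs[:, None] + step * rng.standard_normal((len(xs), n_pg))
    rho = np.minimum(1.0, np.exp(0.5 * (xs[:, None] ** 2 - ys ** 2) / sigma2))
    return xs + (rho * (ys - xs[:, None])).mean(axis=1)

sigma2_t, step = 10.0, np.sqrt(160.0)   # target N(0, 10), proposal var 160
rng = np.random.default_rng(6)
plain, cv = [], []
for _ in range(50):                     # repetitions (the talk uses T = 100)
    xs = rw_chain(2_000, step, sigma2_t, rng)
    pg = pg_hat(xs, step, sigma2_t, 50, rng)   # n_PG = 50 proposal draws
    g, s = xs, xs + pg                  # G(x) = x, s = G + PG
    theta = ((np.mean(xs * s) - xs.mean() * s.mean())
             / np.mean((g[1:] - pg[:-1]) ** 2))   # reversible-chain theta_hat
    plain.append(xs.mean())                       # mu_n(F)
    cv.append(xs.mean() - theta * (g - pg).mean())  # mu_n(F_theta)

vrf = np.var(plain) / np.var(cv)        # variance reduction factor
```

A VRF above 1 means the control-variate estimator has smaller variance across repetitions than the plain ergodic average.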

Figure: Ergodic means for different realizations, with n = 100,000 and n_PG = 50.

Table (numerical entries lost in transcription): VRFs in the univariate normal case, normal proposal, simplest G(x) = x, by n_PG and chain length n (1,000 up to 100,000 iterations).

Study of the effect of the G functions:
1. G(x) = x
2. G_1(x) = x, G_2(x) = x²
...
k. G_1(x) = x, G_2(x) = x², ..., G_k(x) = x^k

Figure: Plot of VRF by the order of the polynomial G functions.

Table (numerical entries lost in transcription): VRFs in the univariate normal case, normal proposal, polynomial G functions of order 4, i.e. G_1(x) = x, G_2(x) = x², G_3(x) = x³, G_4(x) = x⁴, by n_PG and chain length n (1,000 up to 100,000 iterations).

Figure: Plot of VRF by different acceptance rates.

Study of the effect of the proposal distribution
Table (numerical entries lost in transcription): VRFs in the univariate normal case, Student-t proposal (3 df, variance 160), simplest function G(x) = x, by n_PG and chain length n (1,000 up to 50,000 iterations).

Table (numerical entries lost in transcription): VRFs in the univariate normal case, uniform proposal on (−5.5 σ_t, +5.5 σ_t), simplest function G(x) = x, by n_PG and chain length n (1,000 up to 50,000 iterations).

Application to RW-MH: survival analysis
Source: Albert, J. (2007). Bayesian Computation with R. Springer.
Data: lifetimes of a number of patients, some of whom had a heart transplant.
Model assumptions:
- Non-transplant patients: X_i ~ Exponential with mean 1/η
- Transplant patients: X_i ~ Exponential with mean 1/(τη)
- Parameter η ~ Gamma(p, λ), i.e. f(η) = [λ^p / Γ(p)] η^{p−1} exp(−λη)
- Unknown parameter vector: (τ, λ, p), all positive
Notation:
- N non-transplant patients: n died, N − n censored; x_i survival time
- M transplant patients: m died, M − m censored; y_j time to transplant, z_j survival time

Likelihood:

    L(τ, λ, p) = Π_{i=1}^n [p λ^p / (λ + x_i)^{p+1}] × Π_{i=n+1}^N [λ / (λ + x_i)]^p
               × Π_{j=1}^m [τ p λ^p / (λ + y_j + τ z_j)^{p+1}] × Π_{j=m+1}^M [λ / (λ + y_j + τ z_j)]^p

Prior distribution of the parameters: g(τ, λ, p) ∝ 1.
Posterior distribution: g(τ, λ, p | data) ∝ L(τ, λ, p).
Transformation φ = (φ_1 := log τ, φ_2 := log λ, φ_3 := log p):

    g(φ | data) ∝ L(e^{φ_1}, e^{φ_2}, e^{φ_3}) e^{φ_1 + φ_2 + φ_3}
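
For MCMC one works with the log-posterior of φ. The sketch below codes it term by term from the likelihood above; the tiny data arrays at the end are made-up placeholders purely to show the call (the real data are in Albert, 2007), and the function name and signature are my own.

```python
import numpy as np

def log_post(phi, x_died, x_cens, y_died, z_died, y_cens, z_cens):
    """Log-posterior of phi = (log tau, log lambda, log p) under the flat
    prior, up to an additive constant; includes the Jacobian e^{sum phi}."""
    tau, lam, p = np.exp(phi)
    ll = (len(x_died) * (np.log(p) + p * np.log(lam))
          - (p + 1) * np.sum(np.log(lam + x_died))           # non-transplant, died
          + p * np.sum(np.log(lam) - np.log(lam + x_cens))   # non-transplant, censored
          + len(y_died) * (np.log(tau) + np.log(p) + p * np.log(lam))
          - (p + 1) * np.sum(np.log(lam + y_died + tau * z_died))    # transplant, died
          + p * np.sum(np.log(lam) - np.log(lam + y_cens + tau * z_cens)))  # censored
    return ll + np.sum(phi)   # Jacobian of the log transformation

# Made-up placeholder data, just to exercise the function.
val = log_post(np.zeros(3),
               x_died=np.array([1.0, 5.0, 10.0]), x_cens=np.array([20.0]),
               y_died=np.array([2.0]), z_died=np.array([3.0]),
               y_cens=np.array([4.0]), z_cens=np.array([1.0]))
```

This log-density (on φ ∈ R³) is what the RW-MH sampler of the previous section would target directly.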

Table (VRF entries lost in transcription): F(τ, λ, p) = log τ = φ_1, with n = 10,000 and n_PG = 50. Forms of G considered:
- A. G_1 = φ_1, G_2 = φ_2, G_3 = φ_3
- B. G_1 = φ_1², G_2 = φ_2², G_3 = φ_3²
- C. G_1 = φ_1, G_2 = φ_1²
- D. G_1 = φ_1, G_2 = e^{φ_1}
- E. G_1 = φ_1, G_2 = φ_1², G_3 = φ_2, G_4 = φ_2², G_5 = φ_3, G_6 = φ_3²
- F. G_1 = φ_1, G_2 = φ_1², G_3 = φ_2, G_4 = φ_2²
- G. G_1 = φ_1, G_2 = φ_1², G_3 = φ_3, G_4 = φ_3²

Figure: Plot of VRF by n, for F(τ, λ, p) = log τ = φ_1 and G = [φ_1, φ_1²].

Application to RW-MH: Poisson generation
Setting:
- Target distribution π: Poisson(λ)
- Proposal: discrete bell-shaped increments,

    P_M(Δ = δ) = (M + 1 − |δ|) / (M(M + 1)),   δ ∈ {−M, ..., −1, +1, ..., M}

Inference is focused on F(λ) = λ. To enhance the estimation of π(F):

    F_{G,θ}(λ) = F(λ) − θ U(λ),   where U = G − PG.

The form of G used here is G(λ) = λ. In the present setting,

    PG(x) = E[G(λ_{t+1}) | λ_t = x] = ... = x + Σ_{j=−M, j≠0}^{M} P_M(j) · j · ρ(x, x + j).

The terms of PG(x) are assessed using two approaches:
(i) the exact formula;
(ii) Monte Carlo estimates from the P_M distribution.
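
The exact formula is straightforward to implement for this discrete proposal; the sketch below also verifies π-invariance numerically (Σ_x π(x) PG(x) = π(G) = λ), a useful sanity check. Function names are my own, and proposals that step below 0 are rejected since π vanishes there.

```python
import numpy as np
from math import lgamma

def log_pois(x, lam):
    """Log pmf of Poisson(lam); -inf off the support."""
    return x * np.log(lam) - lam - lgamma(x + 1) if x >= 0 else -np.inf

def pg_exact(x, lam, m):
    """PG(x) = x + sum over j != 0, |j| <= m of P_m(j) * j * rho(x, x + j),
    with P_m(j) = (m + 1 - |j|) / (m (m + 1)) and rho the MH acceptance
    probability min(1, pi(x+j)/pi(x))."""
    out = float(x)
    for j in list(range(-m, 0)) + list(range(1, m + 1)):
        pj = (m + 1 - abs(j)) / (m * (m + 1))
        rho = min(1.0, np.exp(log_pois(x + j, lam) - log_pois(x, lam)))
        out += pj * j * rho
    return out

lam, m = 5.0, 3
# Invariance check: E_pi[PG(X)] should equal E_pi[G(X)] = lam.
mean_pg = sum(np.exp(log_pois(x, lam)) * pg_exact(x, lam, m) for x in range(60))
```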

Table (numerical entries lost in transcription): VRFs for F(x) = x and G(x) = x, with n = 5,000 (analytic formula for PG(x)), for Poisson(λ) targets with various values of λ and various proposal distributions P_M.

Application to RW-MH: heavy-tailed distributions
Source: Jarner, S.F. and Roberts, G.O. (2007). Convergence of heavy-tailed Monte Carlo Markov chain algorithms. Scandinavian Journal of Statistics, 34.
- Main result for RW-MH: heavy-tailed proposals lead to higher rates of convergence.
- Focus: polynomially ergodic Markov chains; for polynomial target distributions they derive polynomial rates of convergence.

Table: Existence of central limit theorems for the RW-MH algorithm (from Jarner and Roberts, 2007)

                 Target distribution
    Proposal    t(2.5)  t(3)  t(4)  t(5)  t(6)  t(7)
    Uniform       -      -     L     C     C     C
    Normal        -      -     L     C     C     C
    Cauchy        -      L     C     C     C     C
    t(0.5)        L      L     C     C     C     C

C: CLT holds for x; L: CLT holds for x^s, s < 1. (The transcription lost the column alignment; the recovered entries are shown right-aligned, with "-" marking the unrecovered cells.)

Table (numerical entries lost in transcription): VRFs for F(x) = x with G_i = x^i, i = 1, 2, 3 (n = 200,000, n_PG = 50), for the same target and proposal combinations as in the previous table.

Summary and further research
Summary:
- A solid methodological framework has been provided for the development and use of control variates in MCMC.
- For a given G function, consistent estimates of the optimal coefficients θ are defined.
Further research:
- Approaches for the elicitation of G functions and the choice of k
- Techniques for more efficient derivation of PG(x)
- Extension to reversible-jump MCMC algorithms
Main reference: Dellaportas, P. and Kontoyiannis, I. (2008). Control variates for reversible MCMC samplers. Submitted to JRSS (Series B).

Thank you for listening :)


More information

Sampling Algorithms for Probabilistic Graphical models

Sampling Algorithms for Probabilistic Graphical models Sampling Algorithms for Probabilistic Graphical models Vibhav Gogate University of Washington References: Chapter 12 of Probabilistic Graphical models: Principles and Techniques by Daphne Koller and Nir

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).

More information

A Review of Pseudo-Marginal Markov Chain Monte Carlo

A Review of Pseudo-Marginal Markov Chain Monte Carlo A Review of Pseudo-Marginal Markov Chain Monte Carlo Discussed by: Yizhe Zhang October 21, 2016 Outline 1 Overview 2 Paper review 3 experiment 4 conclusion Motivation & overview Notation: θ denotes the

More information

The Metropolis-Hastings Algorithm. June 8, 2012

The Metropolis-Hastings Algorithm. June 8, 2012 The Metropolis-Hastings Algorithm June 8, 22 The Plan. Understand what a simulated distribution is 2. Understand why the Metropolis-Hastings algorithm works 3. Learn how to apply the Metropolis-Hastings

More information

Sequential Monte Carlo Samplers for Applications in High Dimensions

Sequential Monte Carlo Samplers for Applications in High Dimensions Sequential Monte Carlo Samplers for Applications in High Dimensions Alexandros Beskos National University of Singapore KAUST, 26th February 2014 Joint work with: Dan Crisan, Ajay Jasra, Nik Kantas, Alex

More information

Zig-Zag Monte Carlo. Delft University of Technology. Joris Bierkens February 7, 2017

Zig-Zag Monte Carlo. Delft University of Technology. Joris Bierkens February 7, 2017 Zig-Zag Monte Carlo Delft University of Technology Joris Bierkens February 7, 2017 Joris Bierkens (TU Delft) Zig-Zag Monte Carlo February 7, 2017 1 / 33 Acknowledgements Collaborators Andrew Duncan Paul

More information

Kernel adaptive Sequential Monte Carlo

Kernel adaptive Sequential Monte Carlo Kernel adaptive Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) December 7, 2015 1 / 36 Section 1 Outline

More information

CSC 2541: Bayesian Methods for Machine Learning

CSC 2541: Bayesian Methods for Machine Learning CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 3 More Markov Chain Monte Carlo Methods The Metropolis algorithm isn t the only way to do MCMC. We ll

More information

MARKOV CHAIN MONTE CARLO

MARKOV CHAIN MONTE CARLO MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Computer intensive statistical methods

Computer intensive statistical methods Lecture 13 MCMC, Hybrid chains October 13, 2015 Jonas Wallin jonwal@chalmers.se Chalmers, Gothenburg university MH algorithm, Chap:6.3 The metropolis hastings requires three objects, the distribution of

More information

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA Intro: Course Outline and Brief Intro to Marina Vannucci Rice University, USA PASI-CIMAT 04/28-30/2010 Marina Vannucci

More information

Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox Richard A. Norton, J.

Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox Richard A. Norton, J. Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox fox@physics.otago.ac.nz Richard A. Norton, J. Andrés Christen Topics... Backstory (?) Sampling in linear-gaussian hierarchical

More information

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain

More information

Brief introduction to Markov Chain Monte Carlo

Brief introduction to Markov Chain Monte Carlo Brief introduction to Department of Probability and Mathematical Statistics seminar Stochastic modeling in economics and finance November 7, 2011 Brief introduction to Content 1 and motivation Classical

More information

Markov Chain Monte Carlo Using the Ratio-of-Uniforms Transformation. Luke Tierney Department of Statistics & Actuarial Science University of Iowa

Markov Chain Monte Carlo Using the Ratio-of-Uniforms Transformation. Luke Tierney Department of Statistics & Actuarial Science University of Iowa Markov Chain Monte Carlo Using the Ratio-of-Uniforms Transformation Luke Tierney Department of Statistics & Actuarial Science University of Iowa Basic Ratio of Uniforms Method Introduced by Kinderman and

More information

Markov Chain Monte Carlo Methods

Markov Chain Monte Carlo Methods Markov Chain Monte Carlo Methods John Geweke University of Iowa, USA 2005 Institute on Computational Economics University of Chicago - Argonne National Laboaratories July 22, 2005 The problem p (θ, ω I)

More information

Bayesian model selection in graphs by using BDgraph package

Bayesian model selection in graphs by using BDgraph package Bayesian model selection in graphs by using BDgraph package A. Mohammadi and E. Wit March 26, 2013 MOTIVATION Flow cytometry data with 11 proteins from Sachs et al. (2005) RESULT FOR CELL SIGNALING DATA

More information

Simulation of truncated normal variables. Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris

Simulation of truncated normal variables. Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris Simulation of truncated normal variables Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris Abstract arxiv:0907.4010v1 [stat.co] 23 Jul 2009 We provide in this paper simulation algorithms

More information

Generating the Sample

Generating the Sample STAT 80: Mathematical Statistics Monte Carlo Suppose you are given random variables X,..., X n whose joint density f (or distribution) is specified and a statistic T (X,..., X n ) whose distribution you

More information

18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture)

18 : Advanced topics in MCMC. 1 Gibbs Sampling (Continued from the last lecture) 10-708: Probabilistic Graphical Models 10-708, Spring 2014 18 : Advanced topics in MCMC Lecturer: Eric P. Xing Scribes: Jessica Chemali, Seungwhan Moon 1 Gibbs Sampling (Continued from the last lecture)

More information

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods: Markov Chain Monte Carlo Group Prof. Daniel Cremers 11. Sampling Methods: Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative

More information

Lecture 7 and 8: Markov Chain Monte Carlo

Lecture 7 and 8: Markov Chain Monte Carlo Lecture 7 and 8: Markov Chain Monte Carlo 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering University of Cambridge http://mlg.eng.cam.ac.uk/teaching/4f13/ Ghahramani

More information

Statistics & Data Sciences: First Year Prelim Exam May 2018

Statistics & Data Sciences: First Year Prelim Exam May 2018 Statistics & Data Sciences: First Year Prelim Exam May 2018 Instructions: 1. Do not turn this page until instructed to do so. 2. Start each new question on a new sheet of paper. 3. This is a closed book

More information

MCMC: Markov Chain Monte Carlo

MCMC: Markov Chain Monte Carlo I529: Machine Learning in Bioinformatics (Spring 2013) MCMC: Markov Chain Monte Carlo Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington Spring 2013 Contents Review of Markov

More information

Asymptotics and Simulation of Heavy-Tailed Processes

Asymptotics and Simulation of Heavy-Tailed Processes Asymptotics and Simulation of Heavy-Tailed Processes Department of Mathematics Stockholm, Sweden Workshop on Heavy-tailed Distributions and Extreme Value Theory ISI Kolkata January 14-17, 2013 Outline

More information

A quick introduction to Markov chains and Markov chain Monte Carlo (revised version)

A quick introduction to Markov chains and Markov chain Monte Carlo (revised version) A quick introduction to Markov chains and Markov chain Monte Carlo (revised version) Rasmus Waagepetersen Institute of Mathematical Sciences Aalborg University 1 Introduction These notes are intended to

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters

More information

Kernel Sequential Monte Carlo

Kernel Sequential Monte Carlo Kernel Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) * equal contribution April 25, 2016 1 / 37 Section

More information

LECTURE 15 Markov chain Monte Carlo

LECTURE 15 Markov chain Monte Carlo LECTURE 15 Markov chain Monte Carlo There are many settings when posterior computation is a challenge in that one does not have a closed form expression for the posterior distribution. Markov chain Monte

More information

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods

Computer Vision Group Prof. Daniel Cremers. 11. Sampling Methods Prof. Daniel Cremers 11. Sampling Methods Sampling Methods Sampling Methods are widely used in Computer Science as an approximation of a deterministic algorithm to represent uncertainty without a parametric

More information

MCMC Sampling for Bayesian Inference using L1-type Priors

MCMC Sampling for Bayesian Inference using L1-type Priors MÜNSTER MCMC Sampling for Bayesian Inference using L1-type Priors (what I do whenever the ill-posedness of EEG/MEG is just not frustrating enough!) AG Imaging Seminar Felix Lucka 26.06.2012 , MÜNSTER Sampling

More information

Monte Carlo Methods. Leon Gu CSD, CMU

Monte Carlo Methods. Leon Gu CSD, CMU Monte Carlo Methods Leon Gu CSD, CMU Approximate Inference EM: y-observed variables; x-hidden variables; θ-parameters; E-step: q(x) = p(x y, θ t 1 ) M-step: θ t = arg max E q(x) [log p(y, x θ)] θ Monte

More information

Lecture 6: Markov Chain Monte Carlo

Lecture 6: Markov Chain Monte Carlo Lecture 6: Markov Chain Monte Carlo D. Jason Koskinen koskinen@nbi.ku.dk Photo by Howard Jackman University of Copenhagen Advanced Methods in Applied Statistics Feb - Apr 2016 Niels Bohr Institute 2 Outline

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Slice Sampling Mixture Models

Slice Sampling Mixture Models Slice Sampling Mixture Models Maria Kalli, Jim E. Griffin & Stephen G. Walker Centre for Health Services Studies, University of Kent Institute of Mathematics, Statistics & Actuarial Science, University

More information

Bayesian Estimation with Sparse Grids

Bayesian Estimation with Sparse Grids Bayesian Estimation with Sparse Grids Kenneth L. Judd and Thomas M. Mertens Institute on Computational Economics August 7, 27 / 48 Outline Introduction 2 Sparse grids Construction Integration with sparse

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis Summarizing a posterior Given the data and prior the posterior is determined Summarizing the posterior gives parameter estimates, intervals, and hypothesis tests Most of these computations are integrals

More information

Bayesian GLMs and Metropolis-Hastings Algorithm

Bayesian GLMs and Metropolis-Hastings Algorithm Bayesian GLMs and Metropolis-Hastings Algorithm We have seen that with conjugate or semi-conjugate prior distributions the Gibbs sampler can be used to sample from the posterior distribution. In situations,

More information

Introduction to Markov Chain Monte Carlo & Gibbs Sampling

Introduction to Markov Chain Monte Carlo & Gibbs Sampling Introduction to Markov Chain Monte Carlo & Gibbs Sampling Prof. Nicholas Zabaras Sibley School of Mechanical and Aerospace Engineering 101 Frank H. T. Rhodes Hall Ithaca, NY 14853-3801 Email: zabaras@cornell.edu

More information

PSEUDO-MARGINAL METROPOLIS-HASTINGS APPROACH AND ITS APPLICATION TO BAYESIAN COPULA MODEL

PSEUDO-MARGINAL METROPOLIS-HASTINGS APPROACH AND ITS APPLICATION TO BAYESIAN COPULA MODEL PSEUDO-MARGINAL METROPOLIS-HASTINGS APPROACH AND ITS APPLICATION TO BAYESIAN COPULA MODEL Xuebin Zheng Supervisor: Associate Professor Josef Dick Co-Supervisor: Dr. David Gunawan School of Mathematics

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Chapter 5 Markov Chain Monte Carlo MCMC is a kind of improvement of the Monte Carlo method By sampling from a Markov chain whose stationary distribution is the desired sampling distributuion, it is possible

More information

Surveying the Characteristics of Population Monte Carlo

Surveying the Characteristics of Population Monte Carlo International Research Journal of Applied and Basic Sciences 2013 Available online at www.irjabs.com ISSN 2251-838X / Vol, 7 (9): 522-527 Science Explorer Publications Surveying the Characteristics of

More information

Markov Chain Monte Carlo Lecture 4

Markov Chain Monte Carlo Lecture 4 The local-trap problem refers to that in simulations of a complex system whose energy landscape is rugged, the sampler gets trapped in a local energy minimum indefinitely, rendering the simulation ineffective.

More information

Markov chain Monte Carlo

Markov chain Monte Carlo Markov chain Monte Carlo Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revised on April 24, 2017 Today we are going to learn... 1 Markov Chains

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

Bayesian Modelling of Extreme Rainfall Data

Bayesian Modelling of Extreme Rainfall Data Bayesian Modelling of Extreme Rainfall Data Elizabeth Smith A thesis submitted for the degree of Doctor of Philosophy at the University of Newcastle upon Tyne September 2005 UNIVERSITY OF NEWCASTLE Bayesian

More information

Computer Practical: Metropolis-Hastings-based MCMC

Computer Practical: Metropolis-Hastings-based MCMC Computer Practical: Metropolis-Hastings-based MCMC Andrea Arnold and Franz Hamilton North Carolina State University July 30, 2016 A. Arnold / F. Hamilton (NCSU) MH-based MCMC July 30, 2016 1 / 19 Markov

More information

Semi-Parametric Importance Sampling for Rare-event probability Estimation

Semi-Parametric Importance Sampling for Rare-event probability Estimation Semi-Parametric Importance Sampling for Rare-event probability Estimation Z. I. Botev and P. L Ecuyer IMACS Seminar 2011 Borovets, Bulgaria Semi-Parametric Importance Sampling for Rare-event probability

More information

Practical Bayesian Quantile Regression. Keming Yu University of Plymouth, UK

Practical Bayesian Quantile Regression. Keming Yu University of Plymouth, UK Practical Bayesian Quantile Regression Keming Yu University of Plymouth, UK (kyu@plymouth.ac.uk) A brief summary of some recent work of us (Keming Yu, Rana Moyeed and Julian Stander). Summary We develops

More information

An introduction to adaptive MCMC

An introduction to adaptive MCMC An introduction to adaptive MCMC Gareth Roberts MIRAW Day on Monte Carlo methods March 2011 Mainly joint work with Jeff Rosenthal. http://www2.warwick.ac.uk/fac/sci/statistics/crism/ Conferences and workshops

More information

Markov Chain Monte Carlo Inference. Siamak Ravanbakhsh Winter 2018

Markov Chain Monte Carlo Inference. Siamak Ravanbakhsh Winter 2018 Graphical Models Markov Chain Monte Carlo Inference Siamak Ravanbakhsh Winter 2018 Learning objectives Markov chains the idea behind Markov Chain Monte Carlo (MCMC) two important examples: Gibbs sampling

More information

Session 3A: Markov chain Monte Carlo (MCMC)

Session 3A: Markov chain Monte Carlo (MCMC) Session 3A: Markov chain Monte Carlo (MCMC) John Geweke Bayesian Econometrics and its Applications August 15, 2012 ohn Geweke Bayesian Econometrics and its Session Applications 3A: Markov () chain Monte

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo 1 Motivation 1.1 Bayesian Learning Markov Chain Monte Carlo Yale Chang In Bayesian learning, given data X, we make assumptions on the generative process of X by introducing hidden variables Z: p(z): prior

More information

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions

More information

MCMC and Gibbs Sampling. Kayhan Batmanghelich

MCMC and Gibbs Sampling. Kayhan Batmanghelich MCMC and Gibbs Sampling Kayhan Batmanghelich 1 Approaches to inference l Exact inference algorithms l l l The elimination algorithm Message-passing algorithm (sum-product, belief propagation) The junction

More information

Markov chain Monte Carlo

Markov chain Monte Carlo 1 / 26 Markov chain Monte Carlo Timothy Hanson 1 and Alejandro Jara 2 1 Division of Biostatistics, University of Minnesota, USA 2 Department of Statistics, Universidad de Concepción, Chile IAP-Workshop

More information

A Search and Jump Algorithm for Markov Chain Monte Carlo Sampling. Christopher Jennison. Adriana Ibrahim. Seminar at University of Kuwait

A Search and Jump Algorithm for Markov Chain Monte Carlo Sampling. Christopher Jennison. Adriana Ibrahim. Seminar at University of Kuwait A Search and Jump Algorithm for Markov Chain Monte Carlo Sampling Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Adriana Ibrahim Institute

More information

Answers and expectations

Answers and expectations Answers and expectations For a function f(x) and distribution P(x), the expectation of f with respect to P is The expectation is the average of f, when x is drawn from the probability distribution P E

More information

Pseudo-marginal MCMC methods for inference in latent variable models

Pseudo-marginal MCMC methods for inference in latent variable models Pseudo-marginal MCMC methods for inference in latent variable models Arnaud Doucet Department of Statistics, Oxford University Joint work with George Deligiannidis (Oxford) & Mike Pitt (Kings) MCQMC, 19/08/2016

More information

Exercises Tutorial at ICASSP 2016 Learning Nonlinear Dynamical Models Using Particle Filters

Exercises Tutorial at ICASSP 2016 Learning Nonlinear Dynamical Models Using Particle Filters Exercises Tutorial at ICASSP 216 Learning Nonlinear Dynamical Models Using Particle Filters Andreas Svensson, Johan Dahlin and Thomas B. Schön March 18, 216 Good luck! 1 [Bootstrap particle filter for

More information

Stat 451 Lecture Notes Monte Carlo Integration

Stat 451 Lecture Notes Monte Carlo Integration Stat 451 Lecture Notes 06 12 Monte Carlo Integration Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapter 6 in Givens & Hoeting, Chapter 23 in Lange, and Chapters 3 4 in Robert & Casella 2 Updated:

More information

TEORIA BAYESIANA Ralph S. Silva

TEORIA BAYESIANA Ralph S. Silva TEORIA BAYESIANA Ralph S. Silva Departamento de Métodos Estatísticos Instituto de Matemática Universidade Federal do Rio de Janeiro Sumário Numerical Integration Polynomial quadrature is intended to approximate

More information

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1

Parameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1 Parameter Estimation William H. Jefferys University of Texas at Austin bill@bayesrules.net Parameter Estimation 7/26/05 1 Elements of Inference Inference problems contain two indispensable elements: Data

More information

Simulation - Lectures - Part III Markov chain Monte Carlo

Simulation - Lectures - Part III Markov chain Monte Carlo Simulation - Lectures - Part III Markov chain Monte Carlo Julien Berestycki Part A Simulation and Statistical Programming Hilary Term 2018 Part A Simulation. HT 2018. J. Berestycki. 1 / 50 Outline Markov

More information

University of Toronto Department of Statistics

University of Toronto Department of Statistics Norm Comparisons for Data Augmentation by James P. Hobert Department of Statistics University of Florida and Jeffrey S. Rosenthal Department of Statistics University of Toronto Technical Report No. 0704

More information

Theory of Stochastic Processes 8. Markov chain Monte Carlo

Theory of Stochastic Processes 8. Markov chain Monte Carlo Theory of Stochastic Processes 8. Markov chain Monte Carlo Tomonari Sei sei@mist.i.u-tokyo.ac.jp Department of Mathematical Informatics, University of Tokyo June 8, 2017 http://www.stat.t.u-tokyo.ac.jp/~sei/lec.html

More information

Reminder of some Markov Chain properties:

Reminder of some Markov Chain properties: Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent

More information

Metropolis Hastings. Rebecca C. Steorts Bayesian Methods and Modern Statistics: STA 360/601. Module 9

Metropolis Hastings. Rebecca C. Steorts Bayesian Methods and Modern Statistics: STA 360/601. Module 9 Metropolis Hastings Rebecca C. Steorts Bayesian Methods and Modern Statistics: STA 360/601 Module 9 1 The Metropolis-Hastings algorithm is a general term for a family of Markov chain simulation methods

More information