Bayesian Inference and the Parametric Bootstrap. Bradley Efron Stanford University


2 Importance Sampling for Bayes Posterior Distributions

Newton and Raftery (1994, JRSS-B): the nonparametric bootstrap is a good importance-sampling choice (they actually used a smoothed nonparametric bootstrap).

Today: the parametric bootstrap
Good computational properties when applicable
Connection between Bayesian and frequentist inference

Bayesian Inference 1

3 Student Score Data (Mardia, Kent and Bibby)

n = 22 students' scores on two tests: mechanics, vectors
Data y = (y₁, y₂, …, y₂₂) with yᵢ = (mecᵢ, vecᵢ)
Parameter of interest: θ = correlation(mec, vec)
Sample correlation coefficient θ̂ = 0.498 ± ??

4 [Figure: scatterplot of the 22 students' scores, mechanics score (x-axis) vs. vectors score (y-axis); sample correlation coefficient 0.498 ± ??. Sampled from 88 students; Mardia, Kent and Bibby.]

5 R. A. Fisher (1915)

Bivariate normal: yᵢ ~ind N₂(µ, Σ), i = 1, 2, …, n

Density:
f_θ(θ̂) = ((n − 2)/π) (1 − θ²)^((n−1)/2) (1 − θ̂²)^((n−4)/2) ∫₀^∞ [cosh(w) − θθ̂]^(−(n−1)) dw

95% confidence limits (from the z transform): θ ∈ (0.084, 0.754)
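As a sanity check, the reconstructed density can be integrated numerically. This is a minimal sketch assuming SciPy; `fisher_density` is a name introduced here, not from the talk.

```python
import numpy as np
from scipy import integrate

def fisher_density(r_hat, theta, n):
    # Fisher's (1915) density of the sample correlation r_hat when the
    # true bivariate-normal correlation is theta, for n observations
    const = ((n - 2) / np.pi) * (1 - theta**2) ** ((n - 1) / 2) \
        * (1 - r_hat**2) ** ((n - 4) / 2)
    # inner integral over w; cosh(w) - theta*r_hat >= 1 - |theta| > 0
    inner, _ = integrate.quad(
        lambda w: (np.cosh(w) - theta * r_hat) ** (-(n - 1)), 0, np.inf)
    return const * inner

# the density should integrate to 1 over r_hat in (-1, 1)
total, _ = integrate.quad(lambda r: fisher_density(r, 0.498, 22), -1, 1)
print(round(total, 3))  # 1.0
```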

6 [Figure: Fisher density for the correlation coefficient if corr = 0.498, with a histogram of the parametric bootstrap replications; x-axis: correlation; red triangles are Fisher's 95% confidence limits.]

7 Parametric Bootstrap Computations

Assume yᵢ ~ind N₂(µ, Σ); estimate µ̂ and Σ̂ by MLE.
Bootstrap sample: yᵢ* ~ind N₂(µ̂, Σ̂) for i = 1, 2, …, 22
Bootstrap replication: θ̂* = corr coeff for y* = (y₁*, y₂*, …, y₂₂*)
Bootstrap standard deviation sd(θ̂*), from B = 10,000 replications
Percentile 95% confidence limits: (0.109, 0.761)
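The slide's recipe translates directly into code. A minimal sketch assuming NumPy; since the actual 22 score pairs are not reproduced in the slides, a simulated stand-in with correlation near 0.498 is used.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical stand-in for the 22 (mechanics, vectors) score pairs;
# the covariance is chosen to give correlation near the slide's 0.498
mu_true = np.array([39.0, 51.0])
sigma_true = np.array([[124.0, 59.0], [59.0, 112.0]])
y = rng.multivariate_normal(mu_true, sigma_true, size=22)

# MLEs (note bias=True: divide by n, not n - 1)
mu_hat = y.mean(axis=0)
sigma_hat = np.cov(y, rowvar=False, bias=True)

B = 10_000
theta_star = np.empty(B)
for b in range(B):
    y_star = rng.multivariate_normal(mu_hat, sigma_hat, size=22)
    theta_star[b] = np.corrcoef(y_star, rowvar=False)[0, 1]

sd = theta_star.std()
lo, hi = np.percentile(theta_star, [2.5, 97.5])
print(round(sd, 3), round(lo, 3), round(hi, 3))
```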

8 Better Bootstrap Confidence Limits (BCa)

Idea: reweighting the B bootstrap replications improves coverage.
Weights involve z₀ (bias correction) and a (acceleration):

W_BCa(θ) = φ(z_θ/(1 + a z_θ) − z₀) / [(1 + a z_θ)² φ(z_θ + z₀)]

where z_θ = Φ⁻¹(G(θ)) − z₀, G the bootstrap cdf.

Reweighting the B = 10,000 bootstraps gave weighted percentiles (0.075, 0.748); Fisher: (0.084, 0.754)
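A sketch of the reweighting, assuming SciPy; `bca_weights` and the rank-based bootstrap cdf are illustrative choices, and the estimation of z₀ and a is not shown. Setting z₀ = a = 0 makes the weights exactly flat, recovering the raw percentile limits.

```python
import numpy as np
from scipy.stats import norm

def bca_weights(theta_star, z0, a):
    # W_BCa at each replication: z = Phi^{-1}(G(theta)) - z0, G = bootstrap cdf,
    # W = phi(z/(1+a*z) - z0) / ((1+a*z)^2 * phi(z + z0)); normalized to sum 1
    B = len(theta_star)
    ranks = np.argsort(np.argsort(theta_star))   # 0 .. B-1
    G = (ranks + 0.5) / B                        # rank-based bootstrap cdf
    z = norm.ppf(G) - z0
    w = norm.pdf(z / (1 + a * z) - z0) / ((1 + a * z) ** 2 * norm.pdf(z + z0))
    return w / w.sum()

def weighted_percentile(theta_star, w, q):
    order = np.argsort(theta_star)
    cdf = np.cumsum(w[order])
    return theta_star[order][np.searchsorted(cdf, q)]

rng = np.random.default_rng(1)
theta_star = rng.normal(0.5, 0.15, size=10_000)
w = bca_weights(theta_star, z0=0.0, a=0.0)   # flat weights: raw percentiles
lims = (weighted_percentile(theta_star, w, 0.025),
        weighted_percentile(theta_star, w, 0.975))
print(round(lims[0], 2), round(lims[1], 2))
```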

9 Bayes Posterior Intervals

Prior π(θ). Posterior expectation for t(θ):
E{t(θ) | θ̂} = ∫ t(θ) π(θ) f_θ(θ̂) dθ / ∫ π(θ) f_θ(θ̂) dθ

Conversion factor: ratio of likelihood to bootstrap density,
R(θ) = f_θ(θ̂) / f_θ̂(θ)    (θ̂ fixed)

Bootstrap integrals:
E{t(θ) | θ̂} = ∫ t(θ) π(θ) R(θ) f_θ̂(θ) dθ / ∫ π(θ) R(θ) f_θ̂(θ) dθ

10 Bootstrap Estimation of Bayes Expectation E{t(θ) | θ̂}

Parametric bootstrap replications from f_θ̂(·): θ*₁, θ*₂, …, θ*ᵢ, …, θ*_B
With tᵢ = t(θ*ᵢ), πᵢ = π(θ*ᵢ), Rᵢ = R(θ*ᵢ):

Ê{t(θ) | θ̂} = Σᵢ₌₁^B tᵢ πᵢ Rᵢ / Σᵢ₌₁^B πᵢ Rᵢ

Reweighting: weight πᵢRᵢ on θ*ᵢ
Importance sampling estimate: Ê{t(θ) | θ̂} → E{t(θ) | θ̂} as B → ∞
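A toy check of the weighted-bootstrap estimate, assuming NumPy/SciPy: for the translation family θ̂ ~ N(θ, 1) the conversion factor R(θ) is identically 1, so weighting the replications by the prior alone reproduces the known conjugate-normal posterior. The N(0, τ²) prior here is an assumption for the demo, not from the talk.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

theta_hat = 1.3                     # observed estimate; theta_hat ~ N(theta, 1)
B = 100_000
theta_star = rng.normal(theta_hat, 1.0, size=B)   # reps from f_theta_hat

# conversion factor R(theta) = f_theta(theta_hat) / f_theta_hat(theta);
# identically 1 for this translation family
R = norm.pdf(theta_hat, loc=theta_star) / norm.pdf(theta_star, loc=theta_hat)

tau = 2.0                           # hypothetical N(0, tau^2) prior
pi = norm.pdf(theta_star, loc=0, scale=tau)

w = pi * R                          # importance-sampling weights
post_mean = np.sum(theta_star * w) / np.sum(w)

exact = theta_hat * tau**2 / (1 + tau**2)   # conjugate posterior mean
print(round(post_mean, 2), round(exact, 2))
```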

11 Jeffreys Priors

Harold Jeffreys (1930s): theory of uninformative prior distributions ("invariant", "objective", "reference", …)
For a density family f_θ(y), θ ∈ R^p:
π(θ) = c |I(θ)|^(1/2),  I(θ) = Fisher information matrix
Normal correlation: π(θ) = 1/(1 − θ²)

12 [Figure: importance-sampling posterior density for the correlation (black) using the Jeffreys prior π(θ) = 1/(1 − θ²); BCa weighted density (green); raw bootstrap density (red); triangles show 95% posterior limits.]

13 Multiparameter Exponential Families

β̂ ∈ R^p: p-dimensional sufficient statistic
β: p-dimensional parameter vector, β = E{β̂}
(Poisson: e^(−λ) λ^x / x! has β̂ = x, β = λ)

Hoeffding: f_β(β̂) = f_β̂(β̂) e^(−D(β̂, β)/2)
Deviance: D(β₁, β₂) = 2 E_β₁ log{ f_β₁(β̂) / f_β₂(β̂) }
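Hoeffding's formula can be verified exactly in the Poisson case, where the deviance has the closed form D(λ₁, λ₂) = 2[λ₁ log(λ₁/λ₂) − (λ₁ − λ₂)]; the helper names below are introduced for the demo.

```python
import math

def pois_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

def pois_deviance(lam1, lam2):
    # D(lam1, lam2) = 2*E_{lam1} log(f_lam1/f_lam2)
    #              = 2*[lam1*log(lam1/lam2) - (lam1 - lam2)]
    return 2.0 * (lam1 * math.log(lam1 / lam2) - (lam1 - lam2))

# Hoeffding: f_beta(beta_hat) = f_beta_hat(beta_hat) * exp(-D(beta_hat, beta)/2)
x, beta = 7, 4.5                    # beta_hat = x for the Poisson
lhs = pois_pmf(x, beta)
rhs = pois_pmf(x, x) * math.exp(-pois_deviance(x, beta) / 2)
print(abs(lhs - rhs) < 1e-12)  # True
```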

14 Conversion Factor in Exponential Families

Conversion factor R(β) = f_β(β̂) / f_β̂(β)    (with β̂ fixed)

R(β) = ν(β) e^(Δ(β)), where
Δ(β) = [D(β, β̂) − D(β̂, β)] / 2
ν(β) = f_β̂(β̂) / f_β(β) ≈ 1/Jeffreys prior (Laplace)

Jeffreys prior: π(β) R(β) ≈ e^(Δ(β))

15 Example: Multivariate Normal

yᵢ ~ind N_d(µ, Σ) for i = 1, 2, …, n

Δ = (n/2) [ (µ − µ̂)′(Σ̂⁻¹ − Σ⁻¹)(µ − µ̂) + tr(Σ Σ̂⁻¹ − Σ̂ Σ⁻¹) + 2 log(|Σ̂|/|Σ|) ]

Eigenratio (student score data): θ = λ₁/(λ₁ + λ₂), θ̂ = λ̂₁/(λ̂₁ + λ̂₂) = 0.793 ± ??

Bootstrap: yᵢ* ~ind N_d(µ̂, Σ̂), i = 1, 2, …, n  →  θ*₁, θ*₂, …, θ*_B
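A sketch of the eigenratio bootstrap, assuming NumPy; the score data are again replaced by a hypothetical bivariate-normal stand-in (the "5-dim" Jeffreys prior of the next slide presumably refers to the five parameters: µ ∈ R² plus the three free elements of Σ).

```python
import numpy as np

rng = np.random.default_rng(3)

def eigenratio(y):
    # theta = lambda_1/(lambda_1 + lambda_2), eigenvalues of the MLE covariance
    lam = np.linalg.eigvalsh(np.cov(y, rowvar=False, bias=True))  # ascending
    return lam[-1] / (lam[-1] + lam[-2])

# hypothetical stand-in for the student score data
mu_true = np.array([39.0, 51.0])
sigma_true = np.array([[124.0, 59.0], [59.0, 112.0]])
y = rng.multivariate_normal(mu_true, sigma_true, size=22)

mu_hat = y.mean(axis=0)
sigma_hat = np.cov(y, rowvar=False, bias=True)

theta_hat = eigenratio(y)
theta_star = np.array([
    eigenratio(rng.multivariate_normal(mu_hat, sigma_hat, size=22))
    for _ in range(4000)])
print(round(theta_hat, 2), round(theta_star.std(), 3))
```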

16 [Figure: density estimates for the student-score eigenratio (B = 10,000): raw bootstrap (red), Jeffreys 5-dim prior (black), and BCa (green; z₀ = 0.182, a = 0). Triangles show 95% limits: BCa (0.598, 0.890), Bayes (0.650, 0.908).]

17 [Figure: reweighting the parametric bootstrap points; axis labels β̂ and exp(Δ).]

18 Prostate Cancer Study (Singh et al., 2002)

Microarray study: 102 men, 52 prostate cancer patients and 50 healthy controls; 6033 genes
zᵢ = test statistic for H₀ᵢ: no difference, with H₀ᵢ: zᵢ ~ N(0, 1)
Goal: identify genes involved in prostate cancer

19 [Figure: histogram of prostate cancer z-values for the 6033 genes (52 patients vs. 50 healthy controls), with the N(0,1) density (black) and the "log poly 4" fit (red); x-axis: z value.]

20 Poisson Regression Models

Histogram: y_j = #{zᵢ ∈ bin_j}, x_j = midpoint of bin_j, j = 1, 2, …, J
Poisson model: y_j ~ind Poi(µ_j) with log(µ_j) = poly(x_j, degree = m)
Exponential family of degree p = m + 1
Let η_j = log(µ_j); then
Δ = (η − η̂)′(µ + µ̂) − 2 Σ_j (µ_j − µ̂_j)
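The glm-style fit can be sketched with plain NumPy via Newton's method (IRLS); the z-values here are simulated, since the prostate data are not included in the slides.

```python
import numpy as np

rng = np.random.default_rng(4)

# simulated z-values: mostly N(0,1) nulls plus a small non-null component
z = np.concatenate([rng.normal(0.0, 1.0, 5800), rng.normal(3.0, 1.0, 233)])

# histogram counts y_j with bin midpoints x_j
edges = np.linspace(-4.5, 6.5, 45)              # 44 bins of width 0.25
y, _ = np.histogram(z, edges)
x = (edges[:-1] + edges[1:]) / 2

# design matrix for log(mu_j) = poly(x_j, degree = 4), standardized for stability
t = (x - x.mean()) / x.std()
X = np.vander(t, 5, increasing=True)

# start near the optimum, then Newton's method (IRLS, canonical log link)
beta, *_ = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)
for _ in range(15):
    mu = np.exp(np.clip(X @ beta, -30, 30))
    beta += np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (y - mu))

mu_hat = np.exp(X @ beta)
# at the MLE the fitted total matches the observed total
print(int(round(mu_hat.sum())) == y.sum())  # True
```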

21 Parametric False Discovery Rate Estimates

glm(y ∼ poly(x, 4), poisson) → {µ̂_j}   ("log poly 4")

θ = Fdr(3) = [1 − Φ(3)] / [1 − F̂(3)] ≈ Pr{null | z ≥ 3}
(F̂ is a smoothed CDF estimate)

Model 4: Fdr(3) = 0.192 ± ??

Parametric bootstrap: y*_j ~ind Poi(µ̂_j) → {µ̂*_j} → Fdr(3)*
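Given fitted bin means {µ̂_j}, Fdr(3) is a one-line ratio. A sketch assuming SciPy; the µ̂_j below come from a hypothetical two-component mixture, not the actual prostate fit.

```python
import numpy as np
from scipy.stats import norm

# hypothetical fitted bin means mu_hat_j standing in for the "log poly 4" fit:
# a 96% null N(0,1) / 4% N(3,1) mixture over bins of width 0.25
x = np.arange(-4.375, 6.5, 0.25)                 # bin midpoints
mu_hat = 6033 * 0.25 * (0.96 * norm.pdf(x) + 0.04 * norm.pdf(x, loc=3))

# Fdr(3) = [1 - Phi(3)] / [1 - F_hat(3)], F_hat the smoothed cdf of the fit
tail_hat = mu_hat[x >= 3].sum() / mu_hat.sum()   # 1 - F_hat(3)
fdr3 = (1 - norm.cdf(3)) / tail_hat
print(round(fdr3, 3))
```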

22 [Figure: Fdr(3) estimation, Model 4, prostate data, from 4000 parametric bootstraps: Jeffreys Bayes posterior density (black), raw bootstrap (red), BCa (green); triangles show the 95% Jeffreys/BCa limits (0.154, 0.241).]

23 [Figure: Model 4 (line) vs. Model 8 (solid): histograms of 4000 parametric bootstrap replications of Fdr(3); triangles are the 95% BCa limits, (0.141, 0.239) vs. (0.154, 0.241).]

24 Model Selection: AIC = Deviance + 2 df

Table of AIC, bootstrap counts, and Bayes counts for the candidate models; Bayes proportions:
M4 (36%), M5 (12%), M6 (5%), M7 (2%), M8 (45%)

→ Model 8, Jeffreys prior
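The criterion itself is easy to compute for a Poisson fit; `poisson_aic` is an illustrative helper, not code from the talk.

```python
import numpy as np

def poisson_aic(y, mu, df):
    # AIC = deviance + 2*df; Poisson deviance with the 0*log(0) = 0 convention
    term = np.where(y > 0, y * np.log(np.where(y > 0, y, 1.0) / mu), 0.0)
    deviance = 2.0 * np.sum(term - (y - mu))
    return deviance + 2 * df

# toy check on hypothetical counts
y = np.array([3.0, 7.0, 12.0, 8.0, 2.0])
aic_flat = poisson_aic(y, np.full(5, y.mean()), df=1)   # constant fit
aic_sat = poisson_aic(y, y, df=5)                       # saturated: deviance 0
print(round(aic_sat, 1))  # 10.0
```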

25 Internal Accuracy of Bayes Estimates

How large a B for computing Ê{t(β) | β̂} = Σᵢ₌₁^B tᵢπᵢRᵢ / Σᵢ₌₁^B πᵢRᵢ ?

Define rᵢ = πᵢRᵢ and sᵢ = tᵢπᵢRᵢ, with ĉ_ss = Σ(sᵢ − s̄)²/B, etc. Then

CV{Ê}² = (1/B) [ ĉ_ss/s̄² − 2 ĉ_sr/(s̄ r̄) + ĉ_rr/r̄² ]

Example: t = Fdr(3), parametric bootreps, Jeffreys Bayes
Model 4: Ê = ??, ĈV = ??    Model 8: Ê = ??, ĈV = ??
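The CV formula is the delta method applied to the ratio of the two bootstrap averages. A sketch with hypothetical weights; `ratio_cv` is a name introduced here.

```python
import numpy as np

rng = np.random.default_rng(5)

def ratio_cv(s, r):
    # delta-method CV of s_bar/r_bar:
    # CV^2 = (1/B)*(c_ss/s_bar^2 - 2*c_sr/(s_bar*r_bar) + c_rr/r_bar^2)
    B = len(s)
    sb, rb = s.mean(), r.mean()
    c_ss = np.mean((s - sb) ** 2)
    c_rr = np.mean((r - rb) ** 2)
    c_sr = np.mean((s - sb) * (r - rb))
    return np.sqrt((c_ss / sb**2 - 2 * c_sr / (sb * rb) + c_rr / rb**2) / B)

# hypothetical weights r_i = pi_i*R_i and numerators s_i = t_i*r_i
B = 10_000
t = rng.uniform(0.1, 0.3, B)          # stand-in for t_i = Fdr(3)*_i
r = rng.lognormal(0.0, 0.5, B)
s = t * r

E_hat = s.sum() / r.sum()
cv = ratio_cv(s, r)
print(round(E_hat, 3), round(cv, 4))
```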

26 Sampling Variability of Bayes Estimates

How much would Ê{t(β) | β̂} vary for new β̂'s?
(i.e., frequentist properties of Bayes estimates)

Need to bootstrap the parametric bootstrap calculations for Ê{t(β) | β̂}

Shortcut: "bootstrap after bootstrap"

27 Bootstrap-after-Bootstrap Calculations

β*₁, β*₂, …, β*ᵢ, …, β*_B: the original bootstrap reps from f_β̂(·)

Want bootreps under some other value γ̂.
Idea: reweight the originals, with weight Q_γ̂(β) = f_β(γ̂) / f_β(β̂):

Ê{t | γ̂} = Σᵢ₌₁^B tᵢπᵢRᵢ Q_γ̂(β*ᵢ) / Σᵢ₌₁^B πᵢRᵢ Q_γ̂(β*ᵢ)
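A toy check of the BAB reweighting, assuming SciPy: in the translation family β̂ ~ N(β, 1) (with π taken flat and R ≡ 1 for simplicity), reweighting reps drawn under β̂ by Q_γ̂ reproduces expectations under γ̂.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(6)

# translation family: beta_hat ~ N(beta, 1); original reps drawn under beta_hat
beta_hat, gamma_hat = 0.0, 0.4
B = 100_000
beta_star = rng.normal(beta_hat, 1.0, size=B)

# BAB weight Q_gamma(beta) = f_beta(gamma_hat) / f_beta(beta_hat)
Q = norm.pdf(gamma_hat, loc=beta_star) / norm.pdf(beta_hat, loc=beta_star)

# reweighted expectation of t(beta) = beta
bab_mean = np.sum(beta_star * Q) / np.sum(Q)
print(round(bab_mean, 2))  # ≈ 0.4, the expectation under gamma_hat
```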

28 Bootstrap-after-Bootstrap Standard Errors

BAB: (1) draw γ₁, γ₂, …, γ_K from f_β̂(·); (2) compute Ê_k = Ê{t | γ_k}; (3)

se{Ê} = [ Σ_k (Ê_k − Ê·)² / K ]^(1/2)

Example: Model 8, 95% credible limit for Fdr(3) = 0.241 ± ??

29 [Figure: 400 BAB replications of the 97.5% credible upper limit for Fdr(3); line histogram: the same for the BCa upper limit. BAB standard errors: 0.030 (Bayes), 0.034 (BCa); correlation 0.98.]

30 BAB SDs for Bayes Model Selection Proportions

Model   Bayes post. prob.   BAB SD
M4            36%            ±20%
M5            12%            ±14%
M6             5%             ±8%
M7             2%             ±6%
M8            45%            ±27%

Model averaging?

31 A Few Conclusions ...

The parametric bootstrap is closely related to objective Bayes. (That's why it's a good importance-sampling choice.)
When it applies, the parboot approach has both computational and interpretational advantages over MCMC/Gibbs.
Objective Bayes analyses should be checked frequentistically.

32 References

Bootstrap: DiCiccio & Efron (1996), Statist. Sci.
Bayes and bootstrap: Newton & Raftery (1994), JRSS-B, 3–48 (see Discussion!)
Objective priors: Berger & Bernardo (1989), JASA; Ghosh (2011), Statist. Sci.
Exponential families: Appendix 1
MCMC and bootstrap: Efron (2010), Large-Scale Inference; Efron (2011), J. Biopharm. Statist.


More information

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Generalized Linear Models. Last time: Background & motivation for moving beyond linear Generalized Linear Models Last time: Background & motivation for moving beyond linear regression - non-normal/non-linear cases, binary, categorical data Today s class: 1. Examples of count and ordered

More information

Bayesian Inference in Astronomy & Astrophysics A Short Course

Bayesian Inference in Astronomy & Astrophysics A Short Course Bayesian Inference in Astronomy & Astrophysics A Short Course Tom Loredo Dept. of Astronomy, Cornell University p.1/37 Five Lectures Overview of Bayesian Inference From Gaussians to Periodograms Learning

More information

Are a set of microarrays independent of each other?

Are a set of microarrays independent of each other? Are a set of microarrays independent of each other? Bradley Efron Stanford University Abstract Having observed an m n matrix X whose rows are possibly correlated, we wish to test the hypothesis that the

More information

A union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling

A union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling A union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling Min-ge Xie Department of Statistics, Rutgers University Workshop on Higher-Order Asymptotics

More information

BFF Four: Are we Converging?

BFF Four: Are we Converging? BFF Four: Are we Converging? Nancy Reid May 2, 2017 Classical Approaches: A Look Way Back Nature of Probability BFF one to three: a look back Comparisons Are we getting there? BFF Four Harvard, May 2017

More information

Minimum Message Length Analysis of the Behrens Fisher Problem

Minimum Message Length Analysis of the Behrens Fisher Problem Analysis of the Behrens Fisher Problem Enes Makalic and Daniel F Schmidt Centre for MEGA Epidemiology The University of Melbourne Solomonoff 85th Memorial Conference, 2011 Outline Introduction 1 Introduction

More information

The binomial model. Assume a uniform prior distribution on p(θ). Write the pdf for this distribution.

The binomial model. Assume a uniform prior distribution on p(θ). Write the pdf for this distribution. The binomial model Example. After suspicious performance in the weekly soccer match, 37 mathematical sciences students, staff, and faculty were tested for the use of performance enhancing analytics. Let

More information

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D. Ruppert A. EMPIRICAL ESTIMATE OF THE KERNEL MIXTURE Here we

More information

Package IGG. R topics documented: April 9, 2018

Package IGG. R topics documented: April 9, 2018 Package IGG April 9, 2018 Type Package Title Inverse Gamma-Gamma Version 1.0 Date 2018-04-04 Author Ray Bai, Malay Ghosh Maintainer Ray Bai Description Implements Bayesian linear regression,

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth

More information

Estimating empirical null distributions for Chi-squared and Gamma statistics with application to multiple testing in RNA-seq

Estimating empirical null distributions for Chi-squared and Gamma statistics with application to multiple testing in RNA-seq Estimating empirical null distributions for Chi-squared and Gamma statistics with application to multiple testing in RNA-seq Xing Ren 1, Jianmin Wang 1,2,, Song Liu 1,2, and Jeffrey C. Miecznikowski 1,2,

More information

Part 2: One-parameter models

Part 2: One-parameter models Part 2: One-parameter models 1 Bernoulli/binomial models Return to iid Y 1,...,Y n Bin(1, ). The sampling model/likelihood is p(y 1,...,y n ) = P y i (1 ) n P y i When combined with a prior p( ), Bayes

More information

Empirical Likelihood Based Deviance Information Criterion

Empirical Likelihood Based Deviance Information Criterion Empirical Likelihood Based Deviance Information Criterion Yin Teng Smart and Safe City Center of Excellence NCS Pte Ltd June 22, 2016 Outline Bayesian empirical likelihood Definition Problems Empirical

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Empirical Bayes modeling, computation, and accuracy

Empirical Bayes modeling, computation, and accuracy Empirical Bayes modeling, computation, and accuracy Bradley Efron Stanford University Abstract This article is intended as an expositional overview of empirical Bayes modeling methodology, presented in

More information

Spatial Statistics Chapter 4 Basics of Bayesian Inference and Computation

Spatial Statistics Chapter 4 Basics of Bayesian Inference and Computation Spatial Statistics Chapter 4 Basics of Bayesian Inference and Computation So far we have discussed types of spatial data, some basic modeling frameworks and exploratory techniques. We have not discussed

More information

Lecture 5: LDA and Logistic Regression

Lecture 5: LDA and Logistic Regression Lecture 5: and Logistic Regression Hao Helen Zhang Hao Helen Zhang Lecture 5: and Logistic Regression 1 / 39 Outline Linear Classification Methods Two Popular Linear Models for Classification Linear Discriminant

More information

Small Area Confidence Bounds on Small Cell Proportions in Survey Populations

Small Area Confidence Bounds on Small Cell Proportions in Survey Populations Small Area Confidence Bounds on Small Cell Proportions in Survey Populations Aaron Gilary, Jerry Maples, U.S. Census Bureau U.S. Census Bureau Eric V. Slud, U.S. Census Bureau Univ. Maryland College Park

More information

ROW AND COLUMN CORRELATIONS (ARE A SET OF MICROARRAYS INDEPENDENT OF EACH OTHER?) Bradley Efron Department of Statistics Stanford University

ROW AND COLUMN CORRELATIONS (ARE A SET OF MICROARRAYS INDEPENDENT OF EACH OTHER?) Bradley Efron Department of Statistics Stanford University ROW AND COLUMN CORRELATIONS (ARE A SET OF MICROARRAYS INDEPENDENT OF EACH OTHER?) By Bradley Efron Department of Statistics Stanford University Technical Report 244 March 2008 This research was supported

More information

Asymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands

Asymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands Asymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands Elizabeth C. Mannshardt-Shamseldin Advisor: Richard L. Smith Duke University Department

More information

Bootstrap. Director of Center for Astrostatistics. G. Jogesh Babu. Penn State University babu.

Bootstrap. Director of Center for Astrostatistics. G. Jogesh Babu. Penn State University  babu. Bootstrap G. Jogesh Babu Penn State University http://www.stat.psu.edu/ babu Director of Center for Astrostatistics http://astrostatistics.psu.edu Outline 1 Motivation 2 Simple statistical problem 3 Resampling

More information

Preliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference

Preliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference 1 / 171 Bootstrap inference Francisco Cribari-Neto Departamento de Estatística Universidade Federal de Pernambuco Recife / PE, Brazil email: cribari@gmail.com October 2013 2 / 171 Unpaid advertisement

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 00 MODULE : Statistical Inference Time Allowed: Three Hours Candidates should answer FIVE questions. All questions carry equal marks. The

More information

Introduction. Start with a probability distribution f(y θ) for the data. where η is a vector of hyperparameters

Introduction. Start with a probability distribution f(y θ) for the data. where η is a vector of hyperparameters Introduction Start with a probability distribution f(y θ) for the data y = (y 1,...,y n ) given a vector of unknown parameters θ = (θ 1,...,θ K ), and add a prior distribution p(θ η), where η is a vector

More information

2014/2015 Smester II ST5224 Final Exam Solution

2014/2015 Smester II ST5224 Final Exam Solution 014/015 Smester II ST54 Final Exam Solution 1 Suppose that (X 1,, X n ) is a random sample from a distribution with probability density function f(x; θ) = e (x θ) I [θ, ) (x) (i) Show that the family of

More information