Frequentist Accuracy of Bayesian Estimates

Frequentist Accuracy of Bayesian Estimates Bradley Efron Stanford University RSS Journal Webinar

Objective Bayesian Inference Probability family F = {f µ (x), µ Ω} Parameter of interest: θ = t(µ) Prior π(µ) ˆθ = E{θ x} = t(µ)f µ (x)π(µ) dµ Ω / f µ (x)π(µ) dµ Ω Objective Bayes π(µ) uninformative e.g., Jeffreys: π(µ) = I(µ) 1/2 How accurate is ˆθ? Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 2 / 18

General Accuracy Formula x (µ, V µ ) µ, x R p α x (µ) = x log f µ (x) = ( )..., log f µ(x) x i,... Lemma ˆθ = E { t(µ) x } has gradient x ˆθ = cov { t(µ), αx (µ) x } = ĉov Theorem The delta-method standard deviation of ˆθ is sd ( ˆθ) = [ĉov Vx ĉov ] 1/2. Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 3 / 18

Implementation Posterior Sample µ 1, µ 2,..., µ B (MCMC) t(µ i ) = t i and α i = α x (µ i ) ˆθ = B i=1 t i /B ĉov = ŝd ( ˆθ) = [ĉov Vx ĉov ] 1/2 B i=1 ( α i ᾱ ) ( t i t )/ B Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 4 / 18

Diabetes Data Efron et al. (2004), Park and Casella (2008) n = 442 subjects p = 10 predictors: age, sex, bmi, glu,... Response y = disease progression at one year Model: y = X α + e [e N n (0, I)] n 1 n p p 1 n 1 Prior π(α) = e λ α 1 (λ = 0.37) Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 5 / 18

Estimating θ j = x j α The predicted value for patient j MCMC posterior sample {α 1, α 2,..., α, B = 10, 000} B t ji = x j α gives ˆθ j = B i=1 t ji /B ˆθ 125 = 0.248 sd Bayes = 0.072 sd freq = 0.071 sd freq sd Bayes = ( x ˆΣG ˆΣx x ˆΣx ) 1/2 G = X X and ˆΣ = empirical covariance matrix of {α i } Eigen range ( ˆΣ 1/2 G ˆΣ 1/2) 1/2 = (1.007, 0.313) Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 6 / 18

Ratio Freq/Bayes standard errors for the 442 diabetes cases Frequency 0 10 20 30 40 ^ ^ #125 0.85 0.90 0.95 1.00 ratio Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 7 / 18

Posterior CDF of θ 125 Apply accuracy formula to s i = I{t i < c} E{s data} = Pr{θ 125 c data} = ĉdf(c) ĉdf(0.3) = 0.762 ± 0.304 Bayes frequentist sd estimate Upper 95% credible limit = 0.34 ± 0.07 Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 8 / 18

MCMC posterior cdf for patient 125, + one freq standard error (Point is upper 95% credible limit) Prob{gamma125 < c data} 0.0 0.2 0.4 0.6 0.8 1.0 [ ].342 0.1 0.2 0.3 0.4 c value Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 9 / 18

Exponential Families f α (ˆβ ) = e α ˆβ ψ(α) f 0 (ˆβ ) natural parameter α sufficient statistic ˆβ (Now α = µ, ˆβ = x ) α x (µ) = α ŝd ( ( ˆθ) = cov t, α ˆβ ) Vˆα cov ( t, α ˆβ ) Vˆα = ψ(ˆα) = cov α=ˆα {ˆβ } Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 10 / 18

MCMC {α 1, α 2,..., α B } ĝ(α β = ˆβ) prob 1 B on α i Reweighting ĝ(α β = b): prob w i (b) = ce (b ˆβ) α i on α i (exponential family in b) Obtain bootstrap confidence intervals for t(α) without further simulation (BCa, ABC) Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 11 / 18

Frequentist 95% abc confidence limits for #125 cdf Prob{mu125 < c data} 0.0 0.2 0.4 0.6 0.8 1.0 0.1 0.2 0.3 0.4 c value Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 12 / 18

Bootstrap Estimates of a Posterior Distribution F = { f µ (x), µ Ω } MCMC sample {µ 1, µ 2,..., µ B }: Approximate g(µ x) with uniform weight 1 B on each µ i Bootstrap MLE ˆµ bootsample ˆµ i fˆµ, i = 1, 2,..., B Approximate g(µ x) with weight p i on ˆµ i where p i = π(ˆµ i )R i, R i conversion factor (easy in exponential families) Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 13 / 18

Simulation Accuracy How big to take B? Boot sampling q i = t i p i and d i = q i q p i p Then ˆθ = B 1 p it i has simulation coefficient of variation B cv i=1 (about 0.01 for ˆθ 125 with B = 10, 000) d 2 i 1/2 / B [ t i = t (ˆµ )] i Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 14 / 18

How Uninformative is a Prior? Measure of prior s effect on estimate of θ = t(µ) is Bayes MLE For patient 125, E{θ x} t(ˆµ) δ = ŝd MLE δ 125 = 0.248 0.316 0.076 = 0.90 Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 15 / 18

Effect of Park/Cassella prior on patient estimates Frequency 0 10 20 30 40 #125 ^.90 ^ 1.0 0.5 0.0 0.5 1.0 (Bayes MLE)/Sd Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 16 / 18

2 1 0 1 2 3 2 1 0 1 2 3 Bayes estimates vs MLEs for the 442 diabetes patients; Linear Regression Bayes =.968 MLE MLE Bayes #125 Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 17 / 18

References Park and Casella (2008) The Bayesian Lasso JASA 681 686 Singh et al. (2002) Gene expression correlates of clinical prostate cancer behavior Cancer Cell 203 209 Efron, Hastie, Johnstone and Tibshirani (2004) Least angle regression Ann. Statist. 407 499 Efron (2012) Bayesian inference and the parametric bootstrap Ann. Appl. Statist. 1971 1997 Efron (2010) Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction IMS Monographs, Cambridge University Press Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates RSS Journal Webinar 18 / 18