Frequentist Accuracy of Bayesian Estimates

Similar documents
Frequentist Accuracy of Bayesian Estimates

Bayesian Inference and the Parametric Bootstrap. Bradley Efron Stanford University

FREQUENTIST ACCURACY OF BAYESIAN ESTIMATES. Bradley Efron. Division of Biostatistics STANFORD UNIVERSITY Stanford, California

Tweedie s Formula and Selection Bias. Bradley Efron Stanford University

Empirical Bayes Deconvolution Problem

Model Selection, Estimation, and Bootstrap Smoothing. Bradley Efron Stanford University

Correlation, z-values, and the Accuracy of Large-Scale Estimators. Bradley Efron Stanford University

Bayesian inference and the parametric bootstrap

Or How to select variables Using Bayesian LASSO

The bootstrap and Markov chain Monte Carlo

Bayesian Inference. Chapter 2: Conjugate models

Empirical Bayes modeling, computation, and accuracy

Statistical Inference

Large-Scale Hypothesis Testing

The problem is: given an n 1 vector y and an n p matrix X, find the b(λ) that minimizes

MODEL SELECTION, ESTIMATION, AND BOOTSTRAP SMOOTHING. Bradley Efron. Division of Biostatistics STANFORD UNIVERSITY Stanford, California

Multiple Testing. Hoang Tran. Department of Statistics, Florida State University

Estimation of a Two-component Mixture Model

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data

Some Curiosities Arising in Objective Bayesian Analysis

A G-Modeling Program for Deconvolution and Empirical Bayes Estimation

Package IGG. R topics documented: April 9, 2018

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Post-selection inference with an application to internal inference

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

A Very Brief Summary of Statistical Inference, and Examples

The locfdr Package. August 19, hivdata... 1 lfdrsim... 2 locfdr Index 5

STA216: Generalized Linear Models. Lecture 1. Review and Introduction

Adaptive Lasso for correlated predictors

Regularization: Ridge Regression and the LASSO

Consistent high-dimensional Bayesian variable selection via penalized credible regions

An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models

A Large-Sample Approach to Controlling the False Discovery Rate

Integrated Non-Factorized Variational Inference

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

The automatic construction of bootstrap confidence intervals

Aspects of Feature Selection in Mass Spec Proteomic Functional Data

spikeslab: Prediction and Variable Selection Using Spike and Slab Regression

7. Estimation and hypothesis testing. Objective. Recommended reading

Module 22: Bayesian Methods Lecture 9 A: Default prior selection

Selection of Smoothing Parameter for One-Step Sparse Estimates with L q Penalty

Chapter 4 HOMEWORK ASSIGNMENTS. 4.1 Homework #1

Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method

Robust methods and model selection. Garth Tarr September 2015

Default priors and model parametrization

Biostatistics Advanced Methods in Biostatistics IV

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data

Extended Bayesian Information Criteria for Model Selection with Large Model Spaces

STAT 740: Testing & Model Selection

Stat 5101 Lecture Notes

1. Fisher Information

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Bayesian variable selection via. Penalized credible regions. Brian Reich, NCSU. Joint work with. Howard Bondell and Ander Wilson

Bayesian Analysis for Partially Complete Time and Type of Failure Data

Multiple regression. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar

Data Analysis and Uncertainty Part 2: Estimation

Time Series and Dynamic Models

Variability within multi-component systems. Bayesian inference in probabilistic risk assessment The current state of the art

Theory of Maximum Likelihood Estimation. Konstantin Kashin

Least Angle Regression, Forward Stagewise and the Lasso

Bayesian Linear Models

The binomial model. Assume a uniform prior distribution on p(θ). Write the pdf for this distribution.

Introduction to Bayesian Statistics

Bayesian Inference. Chapter 9. Linear models and regression

Introduction. Start with a probability distribution f(y θ) for the data. where η is a vector of hyperparameters

On Simple Step-stress Model for Two-Parameter Exponential Distribution

A Very Brief Summary of Bayesian Inference, and Examples

Introductory Bayesian Analysis

INTRODUCTION TO BAYESIAN ANALYSIS

Lecture 2: Priors and Conjugacy

Package FDRreg. August 29, 2016

A union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling

Bayesian Sparse Linear Regression with Unknown Symmetric Error

Bayesian inference. Rasmus Waagepetersen Department of Mathematics Aalborg University Denmark. April 10, 2017

Tweedie s Formula and Selection Bias

Introduction to Bayesian Methods

Chapter 4: Generalized Linear Models-I

On the Comparison of Fisher Information of the Weibull and GE Distributions

Generalized Linear Models and Its Asymptotic Properties

STA 216: GENERALIZED LINEAR MODELS. Lecture 1. Review and Introduction. Much of statistics is based on the assumption that random

An Introduction to GAMs based on penalized regression splines. Simon Wood Mathematical Sciences, University of Bath, U.K.

LASSO. Step. My modified algorithm getlasso() produces the same sequence of coefficients as the R function lars() when run on the diabetes data set:

Lecture 1: Introduction

Noninformative Priors for the Ratio of the Scale Parameters in the Inverted Exponential Distributions

Confidence Distribution

Integrated Objective Bayesian Estimation and Hypothesis Testing

Probabilistic Inference for Multiple Testing

inferences on stress-strength reliability from lindley distributions

David Giles Bayesian Econometrics

Linear Models A linear model is defined by the expression

Neutral Bayesian reference models for incidence rates of (rare) clinical events

Selective Sequential Model Selection

Bayesian Adjustment for Multiplicity

Distribution-free ROC Analysis Using Binary Regression Techniques

INTRODUCTION TO BAYESIAN STATISTICS

Copula Regression RAHUL A. PARSA DRAKE UNIVERSITY & STUART A. KLUGMAN SOCIETY OF ACTUARIES CASUALTY ACTUARIAL SOCIETY MAY 18,2011

1 Hypothesis Testing and Model Selection

Approximating high-dimensional posteriors with nuisance parameters via integrated rotated Gaussian approximation (IRGA)

Transcription:

Frequentist Accuracy of Bayesian Estimates Bradley Efron Stanford University

Bayesian Inference Parameter: µ Ω Observed data: x Prior: π(µ) Probability distributions: Parameter of interest: { fµ (x), µ Ω } θ = t(µ) E{θ x} = t(µ)f µ (x)π(µ) dµ Ω What if we don t know g? / f µ (x)π(µ) dµ Ω Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 2 / 30

Jeffreysonian Bayes Inference Uninformative Priors Jeffreys: π(µ) = I(µ) 1/2 where I(µ) = cov { µ log f µ (x) } (the Fisher information matrix) Can still use Bayes theorem but how accurate are the estimates? Today: frequentist variability of E { t(µ) x } Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 3 / 30

General Accuracy Formula µ and x R p x (µ, V µ ) α x (µ) = x log f µ (x) = ( ) T..., log f µ(x) x i,... Lemma Ê = E { t(µ) x } has gradient x E = cov { t(µ), α x (µ) x }. Theorem The delta-method standard deviation of E is sd ( Ê ) = [ cov { t(µ), α x (µ) x }T V x cov { t(µ), α x (µ) x }] 1/2. Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 4 / 30

Implementation Posterior Sample µ 1, µ 2,..., µ B (MCMC) ˆθ i = t(µ i ) = t i and α i = α x (µ i ) ŝd ( Ê ) = ĉov = [ ] 1/2 ĉov T V x ĉov B i=1 ( α i ᾱ ) ( t i t )/ B Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 5 / 30

Diabetes Data Efron et al. (2004), LARS n = 442 subjects p = 10 predictors: age, sex, bmi, glu,... Response: y = disease progression at one year Model: y = X α + e [e N n (0, I)] n 1 n p p 1 n 1 Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 6 / 30

Diabetes data. Lasso: min{rss/2 + gamma*l1(beta)} Cp minimized at Step 7, with gamma =.37 beta vals 15 10 5 0 5 10 15 Step 7 tc gamma: 16.44 8.37 5.84 2.41 1.64 1.27 0.37 0.1 0.09 0.04 0.02 0 ltg bmi ldl map tch hdl glu age sex 0 2 4 6 8 10 12 step Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 7 / 30

Bayesian Lasso Park and Casella (2008) Model: y N n (Xα, I) µ = α and x = y Prior: π(α) = e γl 1(α) [γ = 0.37] Then posterior mode at Lasso ˆα γ Subject 125: θ 125 = x T 125 α How accurate are Bayes posterior inferences for θ 125? Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 8 / 30

Bayesian Analysis MCMC: posterior sample { α i for i = 1, 2,..., 10, 000 } Gives { θ = x T 125,i 125 α, i = 1, 2,..., 10, 000} i θ 125,i 0.248 ± 0.072 General accuracy formula: frequentist sd 0.071 for Ê = 0.248 Other subjects freq sd < Bayes sd Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 9 / 30

Figure 2. 10000 MCMC values for Theta =Subject 125 estimate; mean=.248, stdev=.0721; Frequentist Sd for E{theta data} is.0708, giving Coefficient of Variation 29% Frequency 0 100 200 300 400 500 600 700 ^ [ ].248 ^ 0.0 0.1 0.2 0.3 0.4 0.5 MCMC mu125 values Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 10 / 30

Posterior CDF of θ 125 Apply GAC to s i = I { t i < c } so E{s data} = Pr{θ 125 c data} c = 0.3 : Ê = 0.762 ± 0.304 Bayes GAC sd estimate Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 11 / 30

Figure 3. MCMC posterior cdf of mu125, Diabetes data, Prior exp{.37*l1(alpha)}; verts are + One Frequentist Standard Dev Prob{mu125 < c data} 0.0 0.2 0.4 0.6 0.8 1.0 [ ].336 0.0 0.1 0.2 0.3 0.4 c value Upper 95% credible limit is.336 +.071 Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 12 / 30

Exponential Families f α (ˆβ ) = e αt ˆβ ψ(α) f 0 (ˆβ ) x = ˆβ and µ = α General Accuracy Formula α = α x (µ) For Ê = E { t(α) ˆβ } ŝd = [cov ( t, α ) T ˆβ Vˆα cov ( t, α )] 1/2 ˆβ with Vˆα = cov α=ˆα {ˆβ } = ψ(ˆα). Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 13 / 30

Bayesian Estimation Using the Parametric Bootstrap Efron (2012) Parametric bootstrap: fˆα ( ) {ˆβ 1, ˆβ 2,..., ˆβ B } Want to calculate: Ê = E { t(α) ˆβ } for prior π(α) Importance sampling: Ê B / B t i π i R i 1 1 π i R i t i = t (ˆα i ), πi = π (ˆα i ), and Ri = conversion factor (Easy in exponential families.) R i = fˆα (ˆβ ) / ) fˆα (ˆβ i i Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 14 / 30

Bootstrap for GAC p i = π i R i / B 1 π jr j is weight on ith Bootstrap Replication Ê = B i=1 p it i ĉov = B i=1 p i ˆα i ( t i Ê) estimates cov ( α, t ˆβ ) ŝd = [ ] 1/2 ĉov T Vˆα ĉov Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 15 / 30

Prostate Cancer Study Singh et al. (2002) Microarray study: 102 men 52 prostate cancer, 50 healthy controls 6033 genes z i test statistic for H 0i : no difference H 0i : z i N(0, 1) Goal: identify genes involved in prostate cancer Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 16 / 30

Figure 4. Prostate study: 6033 z values and matching N(0,1) density Frequency 0 100 200 300 400 3 4 2 0 2 4 z value Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 17 / 30

Poisson GLM Histogram: 49 bins, c j midpoint of bin j y j = #{z i in bin j} Poisson GLM: y j ind Poi(µ j ) log(µ) = poly(c, degree=8) [MLE: glm(y poly(c,8), Poisson)] Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 18 / 30

Bayesian Estimation for the Poisson Model Model: y Poi(µ), µ = e Xα [X = poly(c,8)] Prior: Jeffreys prior for α / cdf: α µ cdf: F α (z) = µ i µ i Fdr parameter: [MLE: Fdrˆα (3) = 0.183] c i z t(α) = 1 Φ(3) 1 F(3) = Fdr(3) Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 19 / 30

Figure 5. Prostate study: posterior density for Fdr(3) based on 4000 parametric bootstraps from Poisson poly(8) gave posterior (m,sd) =(.183,.025); Frequentist Sd for.183 equalled.026; CV=14% density 0 5 10 15 E{t x}=.183 0.15 0.20 0.25 0.30 Fdr(3) values Internal coefficient of variation =.0023 Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 20 / 30

Model Selection Calculations Full model: y Poi(µ), µ = e Xα [ α R 9, µ R 49] Submodel M m : {µ : only 1st m + 1 coordinates of β 0} Bayesian model selection: prior probabilities on and within each M m Poor man s model estimates: partition R 49 into preference regions R m = {µ closest to M m } Calculate posterior probabilities Pr{R m y} Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 21 / 30

Posterior Model Probabilities Distance: minimum AIC from µ to point in R m Preferred model: is one with smallest distance R m 4 5 6 7 8 posterior prob.36.12.05.02.45 sd.32.16.08.03.40 Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 22 / 30

Empirical Bayes Accuracy z k = observed value for gene k θ k = true effect size z k N(θ k, 1) Unknown prior g( ) θ 1, θ 2,..., θ N [N = 6033] From observations z 1, z 2,..., z N we wish to estimate E z = E { t(θ) / } z = t(θ)f θ (z)g(θ) dθ f θ (z)g(θ) dθ [if t(θ) = δ 0 (θ) then E z = Pr{θ = 0 z}]. Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 23 / 30

Parametric Families of Priors p-parameter exponential family: log g α (θ) = q(θ) T 1 p α the unknown parameter α + constant p 1 Marginal: f α (z) = f θ (z)g α (θ) dθ e.g., q(θ) = ( δ 0 (θ), θ, θ 2, θ 3, θ 4, θ 5) T [ spike and slab ] MLE: α g α f α (z 1, z 2,..., z N ) marginal MLE ˆα Ê z = t(θ)f θ (z)gˆα (θ) dθ / f θ (z)gˆα (θ) dθ ±?? Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 24 / 30

Delta-Method Standard Deviation of Êz sd ( { ) (dez ) T ( ) } 1/2 Ê z = I 1 dez α dα dα Fisher information matrix: q α = q(θ)g α (θ) dθ h α (z) = f θ (z)g α (θ) [q(θ) q] dθ hα (z)h α (z) T I α = N f α (z) dz de z dα = E z w(θ)gα (θ) [q(θ) q] dθ where w(θ) = t(θ)f θ (z)g α (θ) / tfg α f θ (z)g α (θ) / fg α Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 25 / 30

Figure 6. Estimated prob{abs(theta)>2 z value} Prostate study; Bars show + one freq stdev. Using g model {0,ns(5)} prob{ theta >2 z} 0.0 0.2 0.4 0.6 0.8 1.0 3 4 2 0 2 4 6 z value Prob{abs(theta)>2 z=3} =.434 +.071 Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 26 / 30

Estimated False Discovery Rate Local false discovery rate: fdr(z) = Pr {θ = 0 z} = E { t(θ) z } where t(θ) = δ 0 (θ) Next: applied to prostate study Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 27 / 30

Figure 7. Estimated prob{theta=0 z value} Prostate study; Bars show + one freq stdev. Using g model {0,ns(5)} prob{theta=0 z} 0.0 0.2 0.4 0.6 0.8 1.0 3 4 2 0 2 4 6 z value Prob{theta=0 z=3} =.322 +.068, coeff of var =.21 Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 28 / 30

For z = 3: Pr { θ > 2 z = 3} = 0.43 ± 0.07 fdr(3) = 0.32 ± 0.07 locfdr : exfam modeling of z s, not θ s, gave fdr(3) = 0.35 ± 0.03 but doesn t work for θ > 2, etc. Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 29 / 30

References Park and Casella (2008) The Bayesian Lasso JASA 681 686 Singh et al. (2002) Gene expression correlates of clinical prostate cancer behavior Cancer Cell 203 209 Efron, Hastie, Johnstone and Tibshirani (2004) Least angle regression Ann. Statist. 407 499 Efron (2012) Bayesian inference and the parametric bootstrap Ann. Appl. Statist. 1971 1997 Efron (2010) Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction IMS Monographs, Cambridge University Press Bradley Efron (Stanford University) Frequentist Accuracy of Bayesian Estimates 30 / 30