Empirical Bayes Deconvolution Problem

1 Empirical Bayes Deconvolution Problem

2 Bayes Deconvolution Problem
An unknown prior density $g(\theta)$ gives unobserved realizations $\Theta_1, \Theta_2, \ldots, \Theta_N \overset{\text{iid}}{\sim} g(\theta)$.
Each $\Theta_k$ gives an observed $X_k \sim p_{\Theta_k}(x)$ [$p_\theta(x)$ known].
Marginal density: $f(x) = \int p_\theta(x)\, g(\theta)\, d\theta$.
Wish to estimate $g(\theta)$ from $X_1, X_2, \ldots, X_N$.
EB Deconvolution Problem 2 / 27

3 Three Familiar Cases
Poisson: $X_k \sim \mathrm{Poi}(\Theta_k)$, $p_\theta(x) = e^{-\theta}\theta^x / x!$
Normal: $X_k \sim N(\Theta_k, 1)$, $p_\theta(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta)^2}$
Binomial: $X_k \sim \mathrm{Bi}(n_k, \Theta_k)$, $p_\theta(x) = \binom{n_k}{x}\theta^x (1 - \theta)^{n_k - x}$ (depends on $n_k$)

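4 Robbins Estimate (1956)
Observe $X_k \sim \mathrm{Poi}(\Theta_k)$, $k = 1, \ldots, N$.
Let $y_x = \#\{X_k = x\}$. Then $\hat E\{\Theta \mid X = x\} = (x + 1)\, y_{x+1} / y_x$.
[Table: claims $x$, counts $y_x$, and $\hat E\{\Theta \mid x\}$; the values are missing in the source]
Don't need to estimate $g$!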
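A minimal sketch of Robbins' formula $\hat E\{\Theta \mid X = x\} = (x+1)\,y_{x+1}/y_x$ in plain Python. The function name `robbins_estimates` and the toy counts are illustrative assumptions; they are not the claims data from the slide.

```python
from collections import Counter

def robbins_estimates(xs):
    """Robbins' formula: E-hat{Theta | X = x} = (x + 1) * y[x + 1] / y[x]."""
    y = Counter(xs)  # y[x] = number of observations equal to x
    estimates = {}
    for x in sorted(y):
        # Counter returns 0 for unseen values, so the estimate is 0
        # whenever no observation equals x + 1.
        estimates[x] = (x + 1) * y[x + 1] / y[x]
    return estimates

# Hypothetical toy sample (not the data behind the slide's table).
toy = [0] * 50 + [1] * 30 + [2] * 12 + [3] * 4
est = robbins_estimates(toy)
for x, value in est.items():
    print(x, round(value, 3))
```

Note that the estimate uses only the counts; no prior $g$ is fitted at any point, which is the point of the slide.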

5 More Ambitious Goal
Estimate the entire prior density $g(\theta)$. Why?
Ensemble properties of the $\Theta$'s: $\Pr\{\Theta > 2\}$, etc.
Empirical Bayes: estimate the full posterior distribution, $\Pr\{\Theta > 2 \mid X = x\}$, etc.
Asymptotics are discouraging (Carroll and Hall, 1988).

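6 g-Modeling Idea
Model the prior $g(\theta)$ as an exponential family.
Discrete $\Theta$-space: $\Theta \in \{\theta_{(1)}, \theta_{(2)}, \ldots, \theta_{(m)}\}$.
Prior: $g_j = \Pr\{\Theta = \theta_{(j)}\}$, $g = (g_1, \ldots, g_m)$.
Exponential family: $g_\alpha = e^{Q\alpha}/a_\alpha$ ($g_{j\alpha} = e^{Q_j \alpha}/a_\alpha$, $j = 1, \ldots, m$, with $Q_j$ the $j$th row of $Q$).
$Q$ is the $m \times p$ structure matrix; $\alpha$ is the $p$-dimensional natural parameter.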
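The map from $\alpha$ to $g_\alpha$ can be sketched as a normalized exponential, assuming $a_\alpha = \sum_j e^{Q_j\alpha}$ so that $g_\alpha$ sums to one. The function name `g_alpha`, the grid, and the polynomial structure matrix are illustrative assumptions; the slides use a natural-spline basis instead.

```python
import math

def g_alpha(Q, alpha):
    """Return g_alpha = exp(Q alpha) / a_alpha for an m x p matrix Q (list of rows)."""
    scores = [sum(q * a for q, a in zip(row, alpha)) for row in Q]
    top = max(scores)                      # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    a = sum(weights)                       # normalizer, equal to a_alpha / exp(top)
    return [w / a for w in weights]

# Hypothetical structure matrix: columns (theta, theta^2) on a small grid.
theta = [-2.0, -1.0, 0.0, 1.0, 2.0]
Q = [[t, t * t] for t in theta]
g = g_alpha(Q, [0.0, -0.5])                # proportional to exp(-theta^2 / 2)
print([round(v, 4) for v in g])
```

Any $\alpha$ yields a valid probability vector, which is why the MLE can be found by unconstrained optimization over $\alpha$.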

7 Maximum Likelihood Estimation of $g_\alpha$
$\alpha \to \Theta \sim g_\alpha \to X \sim p_\Theta(x)$
Marginal density $f_\alpha(x)$. For $X \sim N(\Theta, 1)$: $f_\alpha = g_\alpha * N(0, 1)$.
Observing $X_1, X_2, \ldots, X_N \sim f_\alpha(x)$ gives $\hat\alpha$, the marginal MLE, and $g_{\hat\alpha}$.

8 Prostate Study Example
102 men: 52 patients and 50 controls.
$N = 6033$ genes. Each gives a z-value $z_i \sim N(\theta_i, \,\cdot\,)$ (variance missing in the source), with $\theta_i$ the effect size.
Wish to estimate $g(\theta)$, the effect size density.

9 [Figure: histogram of the prostate study data, N = 6033 z-values for 52 patients vs 50 controls, with the locfdr empirical null N(0, 1.06^2), p0 = .984. Axes: z values, Frequency.]

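10 g-modeling Estimate
$\Theta \in \{-3.6, -3.4, \ldots, 3.6\}$, $g_\alpha = e^{Q\alpha}/a_\alpha$.
$Q = [\delta_0, \mathrm{ns}(\theta, 5)]$: "spike and slab".
Used nlm to find the MLE $\hat\alpha$, giving $\Pr\{\Theta = 0\}$ = (value missing in the source) and $\Pr\{|\Theta| > 2\} = 0.02$ (locfdr atom at $\Theta = 0$).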

11 [Figure: non-null prior for the prostate data, g-model with Q = (1, ns(theta, 5)), c0 = 1; null atom .946; Prob{|theta| > 2} = .02; + one stdev. Axes: theta, g(theta).]

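12 Discretizing the X Observations
$X \in \{x_{(1)}, \ldots, x_{(n)}\}$, $y_i = \#\{X_k = x_{(i)}\}$, $y = (y_1, \ldots, y_n)$.
$f_i = \Pr\{X = x_{(i)}\}$, $y \sim \mathrm{Mult}_n(N, f)$.
Let $p_{ij} = \Pr\{X = x_{(i)} \mid \Theta = \theta_{(j)}\}$ and $P = (p_{ij})$; then $f = P g$, with $f$ an $n$-vector, $P$ an $n \times m$ matrix, and $g$ an $m$-vector.
MLE: $\alpha \to g_\alpha = e^{Q\alpha}/a_\alpha \to f_\alpha = P g_\alpha \to y \sim \mathrm{Mult}_n(N, f_\alpha)$.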
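The discretized likelihood can be sketched for the normal case $X \sim N(\Theta, 1)$, assuming $X$ is binned and $p_{ij}$ is the probability that $X$ falls in bin $i$ given $\Theta = \theta_{(j)}$. The helper names (`build_P`, `marginal`, `multinomial_loglik`), the bin edges, the prior, and the counts are all hypothetical.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def build_P(edges, theta):
    """P[i][j] = Pr{X in bin i | Theta = theta[j]} for X ~ N(Theta, 1).

    `edges` are interior bin boundaries; the first and last bins are
    open-ended, so each column of P sums to one."""
    cuts = [-math.inf] + list(edges) + [math.inf]
    return [[norm_cdf(cuts[i + 1] - t) - norm_cdf(cuts[i] - t) for t in theta]
            for i in range(len(cuts) - 1)]

def marginal(P, g):
    """f = P g."""
    return [sum(p * gj for p, gj in zip(row, g)) for row in P]

def multinomial_loglik(y, f):
    """Log likelihood of y ~ Mult(N, f), dropping the term that is free of f."""
    return sum(yi * math.log(fi) for yi, fi in zip(y, f) if yi > 0)

theta = [-2.0, 0.0, 2.0]                     # m = 3 support points
g = [0.25, 0.5, 0.25]                        # hypothetical prior
P = build_P([-3.0, -1.0, 1.0, 3.0], theta)   # n = 5 bins
f = marginal(P, g)
y = [3, 20, 50, 22, 5]                       # hypothetical counts, N = 100
ll = multinomial_loglik(y, f)
print([round(v, 4) for v in f], round(ll, 3))
```

Maximizing `multinomial_loglik(y, marginal(P, g_alpha))` over $\alpha$ would give the marginal MLE; the optimizer itself is left out of this sketch.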

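13 Fisher Information Calculations
Define $W_{ij} = g_{\alpha j}\left(\dfrac{p_{ij}}{f_{\alpha i}} - 1\right)$ and $W_\alpha = (W_{ij})$, an $n \times m$ matrix.
Expected Fisher information at $\alpha = \hat\alpha$: $I_{\hat\alpha} = Q^\top \{ W_{\hat\alpha}^\top\, \mathrm{diag}(N f_{\hat\alpha})\, W_{\hat\alpha} \}\, Q$.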
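A small numerical sketch of the information formula $I = Q^\top\{W^\top \mathrm{diag}(N f)\, W\} Q$ with $W_{ij} = g_j(p_{ij}/f_i - 1)$, written with plain lists. The function name `fisher_information` and the toy inputs are assumptions chosen so the result can be checked by hand.

```python
def fisher_information(P, g, Q, N):
    """Return (I, W) with I = Q' { W' diag(N f) W } Q and W[i][j] = g[j] * (P[i][j] / f[i] - 1)."""
    n, m, p = len(P), len(g), len(Q[0])
    f = [sum(P[i][j] * g[j] for j in range(m)) for i in range(n)]
    W = [[g[j] * (P[i][j] / f[i] - 1.0) for j in range(m)] for i in range(n)]
    # WQ is n x p; row i is Q' W_i, the score contribution of one observation in bin i.
    WQ = [[sum(W[i][j] * Q[j][h] for j in range(m)) for h in range(p)]
          for i in range(n)]
    info = [[sum(N * f[i] * WQ[i][a] * WQ[i][b] for i in range(n))
             for b in range(p)] for a in range(p)]
    return info, W

# Hypothetical toy inputs: n = 3 bins, m = 2 support points, p = 1 parameter.
P = [[0.7, 0.1], [0.2, 0.3], [0.1, 0.6]]   # each column sums to one
g = [0.4, 0.6]
Q = [[0.0], [1.0]]
info, W = fisher_information(P, g, Q, N=100)
print(info)
```

With a single parameter and two support points this reduces to the familiar multinomial information $N \sum_i (\partial f_i/\partial\alpha)^2 / f_i$, which gives a direct check on the matrix form.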

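14 Regularization and Accuracy for g-Models
$\hat\alpha = \arg\max_\alpha \{ l_\alpha - s_\alpha \}$, with penalty $s_\alpha = c_0 \|\alpha\|$.
Approximate mean and covariance: $\hat\alpha - \alpha \;\dot\sim\; \Big( \underbrace{-(\hat I + \ddot s_{\hat\alpha})^{-1}\, \dot s_{\hat\alpha}}_{\text{Bias}},\ \underbrace{(\hat I + \ddot s_{\hat\alpha})^{-1}\, \hat I\, (\hat I + \ddot s_{\hat\alpha})^{-1}}_{\widehat{\text{Cov}}} \Big)$
Letting $\hat R = [\mathrm{diag}(\hat g) - \hat g \hat g^\top]\, Q$: $g_{\hat\alpha} - g \;\dot\sim\; \big( \hat R\, \text{Bias},\ \hat R\, \widehat{\text{Cov}}\, \hat R^\top \big)$.
[$c_0 = 1$ made $\mathrm{tr}(\ddot s_{\hat\alpha}) / \mathrm{tr}(\hat I) = 0.03$]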
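For reference, the penalty derivatives that enter the bias and covariance approximations can be written out explicitly. This is a sketch that assumes the penalty is the Euclidean norm, $s_\alpha = c_0 \|\alpha\|_2$, with $\alpha \neq 0$ and $I_p$ the $p \times p$ identity; the slide itself does not spell these out.

```latex
\begin{aligned}
s_\alpha &= c_0 \|\alpha\|, \\
\dot s_\alpha &= \frac{\partial s_\alpha}{\partial \alpha}
  = c_0 \frac{\alpha}{\|\alpha\|}, \\
\ddot s_\alpha &= \frac{\partial^2 s_\alpha}{\partial \alpha\, \partial \alpha^\top}
  = \frac{c_0}{\|\alpha\|} \left( I_p - \frac{\alpha \alpha^\top}{\|\alpha\|^2} \right), \\
\operatorname{tr}(\ddot s_\alpha) &= \frac{c_0\,(p - 1)}{\|\alpha\|}.
\end{aligned}
```

Under this assumption the ratio $\mathrm{tr}(\ddot s_{\hat\alpha})/\mathrm{tr}(\hat I)$ compares the curvature added by the penalty with the curvature of the log likelihood, so a small value indicates light regularization.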

15 A Binomial Example
844 cancer patients: $n_k$ lymph nodes removed, $X_k$ found positive.
Binomial model: $X_k \sim \mathrm{Bi}(n_k, \Theta_k)$ [$\theta \in (0, .05, .10, \ldots, 1)$].
g-modeling prior: $g = e^{Q\alpha}/a_\alpha$ with $Q = [\delta_0, \mathrm{ns}(\theta, 4)]$.
$P$: $P_{kj} = \binom{n_k}{x_k}\, \theta_j^{x_k} (1 - \theta_j)^{n_k - x_k}$; $f_\alpha = P g_\alpha$.
MLE: $l_\alpha = \sum_{k=1}^{844} \log(f_{\alpha k})$ [sd's from $\ddot l_{\hat\alpha}$].
Fan (1991): binomial easier than normal.
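A minimal sketch of the binomial likelihood pieces, $P_{kj} = \binom{n_k}{x_k}\theta_j^{x_k}(1-\theta_j)^{n_k - x_k}$ and $l = \sum_k \log f_k$ with $f_k = \sum_j P_{kj} g_j$. The function names, the uniform prior, and the three $(n_k, x_k)$ pairs are hypothetical; only the $\theta$ grid follows the slide.

```python
import math

def binomial_row(n_k, x_k, theta):
    """Row k of P: C(n_k, x_k) * theta_j^x_k * (1 - theta_j)^(n_k - x_k) for each theta_j."""
    c = math.comb(n_k, x_k)
    return [c * t ** x_k * (1.0 - t) ** (n_k - x_k) for t in theta]

def log_likelihood(cases, theta, g):
    """l = sum_k log(f_k), with f_k = sum_j P[k][j] * g[j]."""
    total = 0.0
    for n_k, x_k in cases:
        f_k = sum(p * gj for p, gj in zip(binomial_row(n_k, x_k, theta), g))
        total += math.log(f_k)
    return total

theta = [j / 20 for j in range(21)]           # 0, .05, .10, ..., 1
g_uniform = [1.0 / len(theta)] * len(theta)   # hypothetical prior, for illustration only
cases = [(10, 0), (5, 2), (20, 20)]           # hypothetical (n_k, x_k) pairs
ll = log_likelihood(cases, theta, g_uniform)
print(round(ll, 4))
```

Because each patient has a different $n_k$, each row of $P$ is built from that patient's own binomial density; there is no single shared set of $x$ bins as in the normal case.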

16 [Figure: nodes study, histogram of the ratio p = x/n for 844 cases; n ranging from 1 to 69. Axes: p = x/nodes, Frequency.]

17 [Figure: g-model estimate of the prior distribution g(theta), 844 cases; Theta the true effect size, nodes study; + stdev. P{Theta <= .25} = .63; P{Theta >= .75} = (value missing in the source). Q = (1, ns(theta, 4)), but the MLE put zero weight on the 'spike'. Axes: Theta, g(theta).]

18 Binomial Nodes Example with Covariates
Now $X_k \sim \mathrm{Bi}(n_k, \pi_k)$ with $\pi_k = \lambda^{-1}\{ \lambda(\theta_k) + U_k \gamma \}$.
$\lambda$ = logistic transform; $U$ = covariate vector; $\Theta_k$ = frailty for patient $k$.
Generalized Mixed Model.
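The covariate adjustment can be sketched as follows, assuming the "logistic transform" is the logit $\lambda(p) = \log\{p/(1-p)\}$ and that $\theta_k$ lies strictly between 0 and 1. The function names and the numerical values are illustrative assumptions, not estimates from the study.

```python
import math

def logit(p):
    """lambda(p) = log(p / (1 - p)), assumed form of the logistic transform."""
    return math.log(p / (1.0 - p))

def expit(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))

def positive_node_prob(theta_k, u_k, gamma):
    """pi_k = lambda^{-1}{ lambda(theta_k) + u_k . gamma }."""
    shift = sum(u * c for u, c in zip(u_k, gamma))
    return expit(logit(theta_k) + shift)

# Hypothetical frailty, covariates, and coefficients.
pi_base = positive_node_prob(0.3, [0.0, 0.0], [0.5, -0.2])   # no covariate effect
pi_up = positive_node_prob(0.3, [1.0, 0.0], [0.5, -0.2])     # positive shift on the logit scale
print(round(pi_base, 4), round(pi_up, 4))
```

With all covariates at zero the model returns $\pi_k = \theta_k$, so the frailty is the patient's probability of a positive node at the covariate baseline.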

19 [Figure: estimated frailty distribution g(theta) for the mixed-model binomial nodes example; u = (sex, age, smok, grade); gamhat = (.187, .088, .102, .706). Dashed curve: estimated g(theta) without covariates. Axes: theta, g(theta).]

20 [Figure: estimated density of Prob{positive node} for best grade (solid) and worst grade (dashed). Axes: Prob{positive node}, g(pi).]

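21 Fourier Method (Stefanski and Carroll, 1990)
$X \sim N(\Theta, 1)$: $F(f) = F(g)\, e^{-t^2/2}$ ($F$ = Fourier transform).
Smoothing: $\hat f(x) = \dfrac{1}{\pi N} \sum_{k=1}^{N} \sin\!\left( \dfrac{X_k - x}{\lambda} \right) \Big/ (X_k - x)$.
Stefanski-Carroll: $\hat g(\theta) = F^{-1}\{ F(\hat f)\, e^{t^2/2} \}$.
Kernel form: $\hat g(\theta) = \dfrac{1}{N} \sum_{k=1}^{N} k_\lambda(X_k - \theta)$, where $k_\lambda(x) = \dfrac{1}{\pi} \int_0^{1/\lambda} e^{t^2/2} \cos(tx)\, dt$.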
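The kernel form can be evaluated numerically. The sketch below approximates $k_\lambda(x) = \frac{1}{\pi}\int_0^{1/\lambda} e^{t^2/2}\cos(tx)\,dt$ with Simpson's rule and averages it over the sample; the function names, the step count, and the choice of Simpson's rule are assumptions, not the authors' implementation.

```python
import math

def k_lambda(x, lam, steps=2000):
    """Approximate (1/pi) * integral_0^{1/lam} exp(t^2 / 2) * cos(t * x) dt by Simpson's rule.

    `steps` must be even."""
    upper = 1.0 / lam
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        # Simpson weights: 1 at the endpoints, 4 at odd nodes, 2 at even interior nodes.
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * math.exp(0.5 * t * t) * math.cos(t * x)
    return total * h / (3.0 * math.pi)

def g_hat(theta, xs, lam):
    """Kernel-form estimate: average of k_lambda(X_k - theta) over the sample."""
    return sum(k_lambda(x - theta, lam) for x in xs) / len(xs)

print(round(k_lambda(0.0, 1.0), 4))
print(round(g_hat(0.0, [-0.4, 0.1, 0.9], 0.5), 4))   # hypothetical three-point sample
```

The factor $e^{t^2/2}$ grows quickly as $1/\lambda$ increases, which is why small $\lambda$ gives a highly variable estimate.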

22 A Test Case
Normal model: $X_k \sim N(\Theta_k, 1)$ with $\Theta \in [-3, 3]$.
$g(\theta)$ is an equal mixture of $N(0, \,\cdot\,)$ (variance missing in the source) and a symmetric density proportional to $|\theta|$.
This gives a triangular-shaped marginal $f(x)$.
Goal: sample from $f(\cdot)$, estimate $g$.
$N = 4000$.

23 [Figure, two panels: test case, true g(theta) (black) and expected Fourier estimate (red) + stdev, N = 4000; one panel for lambda = .50 and one for lambda = .333. Axes: theta, g and ghat.]

24 Parametric f-Modeling
Stefanski-Carroll: $\hat g = k_\lambda * \bar f$, where $\bar f$ is the empirical density $y/N$.
Instead take $\hat g = k_\lambda * \hat f$, with $\hat f$ a parametric estimate of $f$.
Next slide: $\hat f$ is taken from the fitted values of glm(y ~ ns(x, 6), poisson).
Need $X_k = \Theta_k + \epsilon_k$, with $\epsilon_k$ iid noise.

25 [Figure: test case, stdevs for ghat: g-modeling (black), Stefanski-Carroll (red), and parametric f-modeling (green); N = 4000; c0 = 1; Q = ns(th0, 6). Axes: theta, sd.]

26 Summary
Nonparametric f-modeling: deprecated.
Parametric f-modeling: OK for smooth $g$, additive noise (preferred for Robbins/Tweedie situations).
g-modeling: flexible and reasonably efficient for a wide variety of situations.

27 References
Carroll, R. J. and Hall, P. (1988). Optimal rates of convergence for deconvolving a density. J. Amer. Statist. Assoc. 83.
Efron, B. (2014). Two modeling strategies for empirical Bayes estimation. Statist. Sci. 29.
Fan, J. (1991). On the optimal rates of convergence for nonparametric deconvolution problems. Ann. Statist. 19.
Gholami, S., Janson, L., Worhunsky, D., et al. (2015). Number of lymph nodes removed and survival after gastric cancer resection: An analysis from the US Gastric Cancer Collaborative. J. Amer. Coll. Surg. 221.
Stefanski, L. and Carroll, R. J. (1990). Deconvoluting kernel density estimators. Statistics 21.
