Estimating prediction error in mixed models


1 Estimating prediction error in mixed models
Benjamin Saefken, Thomas Kneib (Georg-August University Goettingen)
Sonja Greven (Ludwig-Maximilians-University Munich)

2 GLMM - Generalized linear mixed models
- $g(\mu_i) = x_i \beta + z_i u$.
- Conditional responses from an exponential family distribution $f(y_i \mid \beta, u)$.
- Impose prior distribution on random effects $u \sim N\left(0, G(\tau^2)\right)$.
- Structured additive regression models may be represented as (generalized) mixed models. This includes (generalized) additive models, smoothing-spline models and geoadditive models.

3 Marginal & Conditional perspective
- Marginal log-likelihood: $\log \int f(y_i \mid \beta, u)\, p(u)\, du$
  The random effects model the correlation between responses.
- Conditional log-likelihood: $\log f(y_i \mid \beta, u)$
  Random effects act as ordinary fixed parameters with regularized estimation due to a penalty term induced by the covariance structure of the random effects. For example, in penalized regression the random effects are used as a tool to model penalized parameters.

4 Deviance prediction error
- Deviance error for regression models:
  $$\mathrm{err} = -2 \log f(y_i \mid \hat\beta(y_i)) + 2C$$
  where $C$ is the log-likelihood of the saturated model.
- Omit $C$ if the focus is on model selection.
- err is too optimistic for predicting future values $y^*$. The quantity of interest is the expected deviance prediction error:
  $$\mathrm{Err} = -2 E_{y^*}\left( \log f(y^* \mid \hat\beta(y_i)) - C \right).$$
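As an illustration of the deviance error, here is a minimal sketch for Poisson responses. The function names and the toy counts are assumptions made for this example and do not come from the slides; the log-density is summed over observations, and C is taken as the log-likelihood of the saturated model with mu_i = y_i.

```python
import math

def poisson_loglik(y, mu):
    """Sum of Poisson log-densities log f(y_i | mu_i); 0 * log(0) is taken as 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        if yi > 0:
            total += yi * math.log(mi)
        total -= mi + math.lgamma(yi + 1)
    return total

def deviance_error(y, mu_hat):
    """err = -2 log f(y | fitted means) + 2 C, with C the saturated log-likelihood."""
    c_saturated = poisson_loglik(y, y)  # saturated model sets mu_i = y_i
    return -2.0 * poisson_loglik(y, mu_hat) + 2.0 * c_saturated

y = [3, 0, 2, 5]                      # toy counts (made up for illustration)
mu_hat = [sum(y) / len(y)] * len(y)   # intercept-only fit: one common mean
err = deviance_error(y, mu_hat)
```

Because the same data are used for fitting and for evaluation, this quantity tends to understate the error on future observations, which is what the covariance penalty is meant to correct.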

5 Covariance penalties
- For exponential families with corresponding natural parameter $\theta$:
  $$E(\mathrm{Err}) = E\left[ \mathrm{err} + 2 \sum_i \mathrm{Cov}(\hat\theta_i, y_i) \right].$$
- In GLMs, the approximation $\sum_i \mathrm{Cov}(\hat\theta_i, y_i) \approx p$ is used.
- The resulting criterion is Akaike's information criterion.
- For mixed effects models: prediction may be based either on the conditional distribution $y \mid u$ or on the marginal distribution $y$.
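A short sketch of why the covariance term appears. It assumes the exponential family log-density is written as $\log f(y_i \mid \theta_i) = y_i\theta_i - b(\theta_i) + c(y_i)$ with unit dispersion (the symbols $b$ and $c$ are notation introduced here, not taken from the slides), that the constant $C$ is omitted, and that $y^*$ is an independent copy of $y$ with the same means $\mu_i$:

```latex
\begin{align*}
\mathrm{err} &= -2 \sum_i \left[ y_i \hat\theta_i - b(\hat\theta_i) + c(y_i) \right], \\
\mathrm{Err} &= -2 \sum_i E_{y^*}\left[ y_i^* \hat\theta_i - b(\hat\theta_i) + c(y_i^*) \right]
              = -2 \sum_i \left[ \mu_i \hat\theta_i - b(\hat\theta_i) + E\,c(y_i^*) \right], \\
\mathrm{Err} - \mathrm{err} &= 2 \sum_i (y_i - \mu_i)\,\hat\theta_i
              + 2 \sum_i \left[ c(y_i) - E\,c(y_i^*) \right], \\
E(\mathrm{Err}) - E(\mathrm{err}) &= 2 \sum_i E\left[ (y_i - \mu_i)\,\hat\theta_i \right]
              = 2 \sum_i \mathrm{Cov}(\hat\theta_i, y_i).
\end{align*}
```

The last line uses $E\,c(y_i) = E\,c(y_i^*)$ and $E(y_i - \mu_i) = 0$, so the expected optimism of err is exactly twice the summed covariance between the fitted natural parameters and the responses.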

6 Marginal prediction error
- Appropriate if the focus is on the fixed effects $\beta$ and the predictions $y^*$ have new random effects $u^*$.
- Tempting to use the marginal log-likelihood and $\sum_i \mathrm{Cov}(\hat\theta_i, y_i) \approx q$ with $q = \dim(\beta) + \dim(\tau^2)$, i.e. the marginal AIC (mAIC).
- The marginal responses are not necessarily from an exponential family distribution, so
  $$E(\mathrm{Err}) = E\left[ \mathrm{err} + 2 \sum_i \mathrm{Cov}(\hat\theta_i, y_i) \right]$$
  might not hold.
- The mAIC does not choose the model with the lowest expected deviance prediction error.

7 Conditional prediction error
- Appropriate if the predictions share the same random effects as the observed data.
- The conditional responses are from an exponential family distribution, but
  $$\mathrm{Cov}(\hat\theta, y) = E\left( (y - \mu)\,\hat\theta \right)$$
  is not an observable quantity.
- For Gaussian models with $\hat\theta = \hat y$, use the Stein formula
  $$\mathrm{Cov}(\hat\theta, y) = \sigma^2 E\left( \frac{\partial \hat y}{\partial y} \right).$$
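The Stein formula replaces the unobservable covariance by derivatives of the fit, which can be evaluated on the observed data. The sketch below approximates the sum of the derivatives of the fitted values with respect to their own observations by central differences. The moving-average smoother and the data are hypothetical stand-ins chosen for illustration, not a model from the slides.

```python
def moving_average(y):
    """Hypothetical linear smoother: mean of each point and its direct neighbours."""
    n = len(y)
    fitted = []
    for i in range(n):
        window = y[max(0, i - 1):min(n, i + 2)]
        fitted.append(sum(window) / len(window))
    return fitted

def divergence(fit, y, eps=1e-6):
    """Central-difference estimate of sum_i d yhat_i / d y_i."""
    total = 0.0
    for i in range(len(y)):
        up = list(y)
        up[i] += eps
        down = list(y)
        down[i] -= eps
        total += (fit(up)[i] - fit(down)[i]) / (2.0 * eps)
    return total

y = [1.2, 0.7, 2.9, 3.1, 2.4]         # toy data (made up for illustration)
df = divergence(moving_average, y)    # multiply by sigma^2 for the covariance penalty
```

For a linear smoother the divergence equals the trace of the hat matrix; here the interior points contribute 1/3 each and the two edge points 1/2 each. The same finite-difference routine also applies to fits that are not linear in y, at the cost of two refits per observation.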

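8 Conditional prediction error
- For a linear mixed model $\hat y = Hy = X\hat\beta + Z\hat u$, the covariance penalty reduces to
  $$\mathrm{tr}\left( \frac{\partial \hat y}{\partial y} \right) = \mathrm{tr}(H) = \mathrm{tr}\left[ \begin{pmatrix} X^t X & X^t Z \\ Z^t X & Z^t Z + G(\hat\tau^2) \end{pmatrix}^{-1} \begin{pmatrix} X^t X & X^t Z \\ Z^t X & Z^t Z \end{pmatrix} \right].$$
- $\hat\tau^2$ depends on $y$. Ignoring this dependence induces a bias.
- A corrected criterion can be derived by implicit differentiation:
  $$\mathrm{tr}\left( \frac{\partial \hat y}{\partial y} \right) = \mathrm{tr}(H) + \sum_j \mathrm{tr}\left( \frac{\partial Hy}{\partial \hat\tau_j^2} \frac{\partial \hat\tau_j^2}{\partial y} \right).$$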
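The trace of the hat matrix can be computed directly from the two block matrices. The sketch below does this for a random intercept model under stated assumptions: G = tau2 * I, error variance sigma2, and a penalty block (sigma2 / tau2) * I added to the random-effect part of the cross-product matrix. That scaling, the helper names and the toy design are assumptions of this sketch. It treats tau2 as known, so it gives the uncorrected tr(H) and not the corrected criterion.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(A)
    M = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def effective_df(X, Z, sigma2, tau2):
    """tr(H) = tr[(C'C + penalty)^-1 C'C] with C = (X Z); only random effects are penalized."""
    C = [xr + zr for xr, zr in zip(X, Z)]
    CtC = matmul(transpose(C), C)
    penalized = [row[:] for row in CtC]
    for k in range(len(X[0]), len(penalized)):
        penalized[k][k] += sigma2 / tau2
    M = matmul(inverse(penalized), CtC)
    return sum(M[k][k] for k in range(len(M)))

# Toy design (made up): intercept and one covariate, three groups of two observations.
X = [[1.0, float(x)] for x in range(6)]
groups = [0, 0, 1, 1, 2, 2]
Z = [[1.0 if g == k else 0.0 for k in range(3)] for g in groups]
df = effective_df(X, Z, sigma2=1.0, tau2=1.0)
```

As tau2 shrinks towards zero the random intercepts are shrunk away and tr(H) approaches the number of fixed effects (2 here); as tau2 grows it approaches the rank of (X Z), which is 4 for this design.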

9 Poisson & exponential
- If the response is Poisson distributed, then use the Chen-Stein formula:
  $$\mathrm{Cov}(\hat\theta, y) = E\left( y\left( \hat\theta(y) - \hat\theta(y - 1) \right) \right).$$
- The expected deviance error can be estimated by
  $$\mathrm{err} + 2 \sum_i y_i \left( \hat\theta_i(y_i) - \hat\theta_i(y_i - 1) \right).$$
- For exponentially distributed responses, the covariance penalty is
  $$\mathrm{Cov}(\hat\theta, y) = E\left( y\,\hat\theta(y) - \int_0^y \hat\theta(x)\, dx \right).$$
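The Poisson estimate requires refitting the model once per observation, with that observation decreased by one. A minimal sketch follows, in which the intercept-only fit is a hypothetical stand-in for a mixed model fit and the counts are made up:

```python
import math

def fit_common_mean(y):
    """Hypothetical intercept-only Poisson fit: theta_hat_i = log(mean(y)) for all i."""
    theta = math.log(sum(y) / len(y))
    return [theta] * len(y)

def poisson_cov_penalty(y, fit):
    """Estimate sum_i Cov(theta_hat_i, y_i) by sum_i y_i * (theta_hat_i(y_i) - theta_hat_i(y_i - 1))."""
    theta_full = fit(y)
    penalty = 0.0
    for i, yi in enumerate(y):
        if yi == 0:
            continue  # the term is multiplied by y_i = 0, so no refit is needed
        y_minus = list(y)
        y_minus[i] = yi - 1  # refit with observation i decreased by one
        penalty += yi * (theta_full[i] - fit(y_minus)[i])
    return penalty

y = [3, 0, 2, 5, 1, 4]  # toy counts (made up for illustration)
penalty = poisson_cov_penalty(y, fit_common_mean)
# Estimated expected deviance error: err + 2 * penalty
```

For this one-parameter toy fit the penalty is 15 * log(15 / 14), roughly 1.03, which is close to the parameter count that AIC would use.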

10 Centralized Steinian
- In case of Bernoulli responses, i.e. binary data, the covariance penalty may be rewritten as
  $$\mathrm{Cov}(\hat\theta, y) = E\left( \mu(1 - \mu)\left( \hat\theta(1) - \hat\theta(0) \right) \right).$$
- Since $\mu$ is not available, it can be replaced by a consistent estimator $\hat\mu$:
  $$\mathrm{err} + 2 \sum_i \hat\mu_i (1 - \hat\mu_i) \left( \hat\theta_i(1) - \hat\theta_i(0) \right).$$
- Similarly, for continuous exponential family distributions the expected conditional deviance error can be approximated by
  $$\mathrm{err} + 2 \sum_i \frac{\partial \hat\mu_i}{\partial y_i}.$$
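For binary data the plug-in estimate needs two refits per observation, one with the response set to 1 and one with it set to 0. The following sketch uses a hypothetical intercept-only logistic fit and made-up data to show the mechanics:

```python
import math

def fit_common_logit(y):
    """Hypothetical intercept-only logistic fit: theta_hat_i = logit(mean(y)) for all i.
    Assumes the mean stays strictly between 0 and 1."""
    p = sum(y) / len(y)
    theta = math.log(p / (1.0 - p))
    return [theta] * len(y)

def bernoulli_cov_penalty(y, fit):
    """Plug-in estimate: sum_i mu_hat_i * (1 - mu_hat_i) * (theta_hat_i(1) - theta_hat_i(0))."""
    theta_obs = fit(y)
    penalty = 0.0
    for i in range(len(y)):
        mu_hat = 1.0 / (1.0 + math.exp(-theta_obs[i]))  # estimate of mu_i from the observed fit
        y_one = list(y)
        y_one[i] = 1   # refit with observation i set to 1
        y_zero = list(y)
        y_zero[i] = 0  # refit with observation i set to 0
        penalty += mu_hat * (1.0 - mu_hat) * (fit(y_one)[i] - fit(y_zero)[i])
    return penalty

y = [1, 0, 1, 1, 0, 0, 1, 0]  # toy binary responses (made up for illustration)
penalty = bernoulli_cov_penalty(y, fit_common_logit)
# Estimated expected deviance error: err + 2 * penalty
```

With four successes in eight trials the penalty equals 2 * log(5 / 3), roughly 1.02, again close to the single parameter of the toy fit.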

11 Model selection
- (Random intercept) model 1:
  $$\log\left( \frac{\mu_{ij}}{1 - \mu_{ij}} \right) = \beta_0 + \beta_1 x_i + u_j, \qquad u \sim N(0, \tau^2 I)$$
- (Linear) model 2:
  $$\log\left( \frac{\mu_i}{1 - \mu_i} \right) = \beta_0 + \beta_1 x_i$$
- Choose model with lowest expected deviance.
[Figure: selection frequencies of model 1 plotted against $\tau^2$, panels n = 25 and n = 100; criteria compared: proposed, tr(H), marginal, true.]

12 Summary
- Two prediction perspectives: marginal & conditional
- Choose model with lowest expected conditional deviance error
- Unbiased estimates for Gaussian, Poisson & exponential responses
- Asymptotic estimates for further exponential family distributions
