Estimating prediction error in mixed models

Benjamin Saefken, Thomas Kneib (Georg-August University Goettingen)
Sonja Greven (Ludwig-Maximilians-University Munich)

GLMM

- Generalized linear mixed models: $g(\mu_i) = x_i^\top \beta + z_i^\top u$.
- Conditional responses come from an exponential family distribution $f(y_i \mid \beta, u)$.
- A prior distribution is imposed on the random effects: $u \sim N\!\left(0, G(\tau^2)\right)$.
- Structured additive regression models may be represented as (generalized) mixed models. This includes (generalized) additive models, smoothing-spline models and geoadditive models.
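
As a concrete illustration of this model class, the following minimal sketch (not from the slides; sample sizes and parameter values are illustrative assumptions) simulates one data set from a random-intercept logistic GLMM, $\mathrm{logit}(\mu_{ij}) = \beta_0 + \beta_1 x_{ij} + u_j$ with $u_j \sim N(0, \tau^2)$.

```python
# Minimal sketch: simulate from a random-intercept logistic GLMM.
# All names and parameter values below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
n_groups, n_per_group = 20, 10
beta0, beta1, tau2 = -0.5, 1.0, 0.8

group = np.repeat(np.arange(n_groups), n_per_group)
x = rng.normal(size=n_groups * n_per_group)
u = rng.normal(scale=np.sqrt(tau2), size=n_groups)  # random effects u ~ N(0, tau^2 I)

eta = beta0 + beta1 * x + u[group]                  # linear predictor x'beta + z'u
mu = 1.0 / (1.0 + np.exp(-eta))                     # inverse logit link
y = rng.binomial(1, mu)                             # conditional Bernoulli responses
```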

Marginal & Conditional perspective

- Marginal log-likelihood: $\log \int f(y_i \mid \beta, u)\, p(u)\, du$. The random effects model the correlation between responses.
- Conditional log-likelihood: $\log f(y_i \mid \beta, u)$. The random effects act as ordinary fixed parameters whose estimation is regularized by a penalty term induced by the covariance structure of the random effects. For example, in penalized regression the random effects are used as a tool to model penalized parameters.
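
The two perspectives can be made concrete for the random-intercept logistic model simulated above. The sketch below (function names are my own; the arguments are assumed to match the arrays x, y, group and the parameters from the previous sketch) evaluates the conditional log-likelihood at given random effects and approximates the marginal log-likelihood by integrating the random intercepts out with Gauss-Hermite quadrature.

```python
# Sketch: conditional vs. marginal log-likelihood for a random-intercept
# logistic model; arguments are assumed to match the simulation sketch above.
import numpy as np
from numpy.polynomial.hermite import hermgauss

def cond_loglik(y, eta):
    """Conditional log-likelihood log f(y | beta, u): random effects plugged into eta."""
    return np.sum(y * eta - np.log1p(np.exp(eta)))

def marg_loglik(y, x, group, beta0, beta1, tau2, n_quad=30):
    """Marginal log-likelihood: integrate the random intercept out, group by group."""
    nodes, weights = hermgauss(n_quad)       # quadrature against exp(-z^2)
    u_vals = np.sqrt(2.0 * tau2) * nodes     # change of variables u = sqrt(2) * tau * z
    total = 0.0
    for j in np.unique(group):
        idx = group == j
        eta = beta0 + beta1 * x[idx, None] + u_vals[None, :]
        log_f = np.sum(y[idx, None] * eta - np.log1p(np.exp(eta)), axis=0)
        total += np.log(np.sum(weights * np.exp(log_f)) / np.sqrt(np.pi))
    return total
```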

Deviance prediction error

- Deviance error for regression models:
  $\mathrm{err} = -2 \log f\big(y \mid \hat\beta(y)\big) + 2C,$
  where $C$ is the log-likelihood of the saturated model.
- $C$ can be omitted if the focus is on model selection.
- err is too optimistic as a measure of how well the model predicts future values $y^*$. The quantity of interest is the expected deviance prediction error
  $\mathrm{Err} = 2\, \mathrm{E}_{y^*}\!\left( C - \log f\big(y^* \mid \hat\beta(y)\big) \right).$
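
The optimism of err can be checked directly by simulation. The following sketch (an illustration under assumed values, using a Gaussian linear model with error variance fixed at one, so that up to constants the deviance error is the residual sum of squares) shows that the error on fresh responses exceeds the in-sample error on average, here by roughly 2p.

```python
# Monte Carlo sketch: in-sample deviance error err vs. expected deviance
# prediction error Err for a Gaussian linear model with sigma^2 = 1.
import numpy as np

rng = np.random.default_rng(2)
n, p, n_rep = 50, 5, 2000
X = rng.normal(size=(n, p))
mu = X @ rng.normal(size=p)

err_vals, Err_vals = [], []
for _ in range(n_rep):
    y = mu + rng.normal(size=n)              # observed responses
    y_new = mu + rng.normal(size=n)          # future responses with the same mean
    fit = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    err_vals.append(np.sum((y - fit) ** 2))      # -2 log f(y | beta_hat), up to constants
    Err_vals.append(np.sum((y_new - fit) ** 2))  # the same loss on fresh data
print(np.mean(err_vals), np.mean(Err_vals))      # the second exceeds the first by about 2p
```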

Covariance penalties

- For exponential families with corresponding natural parameter $\theta$:
  $\mathrm{E}(\mathrm{Err}) = \mathrm{E}\!\left[ \mathrm{err} + 2 \sum_i \mathrm{Cov}(\hat\theta_i, y_i) \right].$
- In GLMs, the approximation $\sum_i \mathrm{Cov}(\hat\theta_i, y_i) \approx p$ is used.
- The resulting criterion is Akaike's information criterion.
- For mixed effects models, prediction may be based either on the conditional distribution $y \mid u$ or on the marginal distribution of $y$.
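
The approximation behind AIC can be checked numerically. The sketch below (an illustration with assumed values; fit_poisson is my own small IRLS routine, not from the slides) simulates many replications from a Poisson GLM with log link and estimates $\sum_i \mathrm{Cov}(\hat\theta_i, y_i)$ empirically; the sum comes out close to the number of regression coefficients p.

```python
# Monte Carlo sketch: the covariance penalty of a Poisson GLM is close to p.
import numpy as np

rng = np.random.default_rng(3)
n, p, n_rep = 80, 4, 4000
X = rng.normal(size=(n, p))
X[:, 0] = 1.0
mu_true = np.exp(X @ np.array([0.5, 0.3, -0.2, 0.1]))

def fit_poisson(X, y, n_iter=25):
    """Poisson log-link GLM via iteratively reweighted least squares."""
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ b)
        z = X @ b + (y - mu) / mu                     # working response
        b = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))
    return X @ b                                      # fitted natural parameters theta_hat

theta_hats = np.empty((n_rep, n))
ys = np.empty((n_rep, n))
for r in range(n_rep):
    ys[r] = rng.poisson(mu_true)
    theta_hats[r] = fit_poisson(X, ys[r])
cov_sum = sum(np.cov(theta_hats[:, i], ys[:, i])[0, 1] for i in range(n))
print(cov_sum)    # close to p = 4
```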

Marginal prediction error

- Appropriate if the focus is on the fixed effects $\beta$ and the predictions $y^*$ have new random effects $u^*$.
- It is tempting to use the marginal log-likelihood together with $\sum_i \mathrm{Cov}(\hat\theta_i, y_i) \approx q$, where $q = \dim(\beta) + \dim(\tau^2)$, i.e. the marginal AIC.
- The marginal responses are not necessarily from an exponential family distribution, so
  $\mathrm{E}(\mathrm{Err}) = \mathrm{E}\!\left[ \mathrm{err} + 2 \sum_i \mathrm{Cov}(\hat\theta_i, y_i) \right]$
  might not hold.
- The marginal AIC therefore does not choose the model with the lowest expected deviance prediction error.

Conditional prediction error

- Appropriate if the predictions share the same random effects as the observed data.
- The conditional responses are from an exponential family distribution, but the covariance penalty
  $\mathrm{Cov}(\hat\theta, y) = \mathrm{E}\!\left( (y - \mu)\,\hat\theta \right)$
  is not an observable quantity.
- For Gaussian models, $\hat\theta = \hat y$, so the Stein formula can be used:
  $\mathrm{Cov}(\hat\theta, y) = \sigma^2\, \mathrm{E}\!\left( \frac{\partial \hat y}{\partial y} \right).$
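
In the Gaussian case, Stein's formula can be verified numerically for any linear smoother $\hat y = Hy$, where $\partial \hat y / \partial y = H$ and the covariance penalty is $\sigma^2 \operatorname{tr}(H)$. The sketch below (an illustration with an assumed ridge smoother and $\sigma^2 = 1$) compares a Monte Carlo estimate of $\sum_i \mathrm{Cov}(\hat y_i, y_i)$ with $\operatorname{tr}(H)$.

```python
# Monte Carlo sketch: Stein's formula for a linear (ridge) smoother, sigma^2 = 1.
import numpy as np

rng = np.random.default_rng(4)
n, p, lam, n_rep = 60, 10, 2.0, 5000
X = rng.normal(size=(n, p))
mu = X @ rng.normal(size=p)
H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)   # ridge hat matrix

fits = np.empty((n_rep, n))
ys = np.empty((n_rep, n))
for r in range(n_rep):
    ys[r] = mu + rng.normal(size=n)
    fits[r] = H @ ys[r]
cov_sum = sum(np.cov(fits[:, i], ys[:, i])[0, 1] for i in range(n))
print(cov_sum, np.trace(H))    # the two numbers should roughly agree
```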

Conditional prediction error

- For a linear mixed model, $\hat y = H y = X\hat\beta + Z\hat u$, and the covariance penalty reduces to
  $\operatorname{tr}\!\left(\frac{\partial \hat y}{\partial y}\right) = \operatorname{tr}(H) = \operatorname{tr}\!\left[ \begin{pmatrix} X^\top X & X^\top Z \\ Z^\top X & Z^\top Z + G(\hat\tau^2)^{-1} \end{pmatrix}^{-1} \begin{pmatrix} X^\top X & X^\top Z \\ Z^\top X & Z^\top Z \end{pmatrix} \right].$
- $\hat\tau^2$ depends on $y$; ignoring this dependence induces a bias.
- A corrected criterion can be derived by implicit differentiation:
  $\operatorname{tr}\!\left(\frac{\partial \hat y}{\partial y}\right) = \operatorname{tr}(H) + \sum_j \left(\frac{\partial H y}{\partial \hat\tau^2_j}\right)^{\!\top} \frac{\partial \hat\tau^2_j}{\partial y}.$
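
The block-matrix form of tr(H) is easy to evaluate directly. The sketch below (an illustration with an assumed random-intercept design, $\sigma^2 = 1$ and a fixed value plugged in for $\hat\tau^2$) computes the naive covariance penalty; the correction term for the dependence of $\hat\tau^2$ on $y$ is not included.

```python
# Sketch: naive covariance penalty tr(H) for a random-intercept LMM.
import numpy as np

rng = np.random.default_rng(5)
n_groups, n_per_group, tau2_hat = 15, 8, 0.6
n = n_groups * n_per_group
group = np.repeat(np.arange(n_groups), n_per_group)

X = np.column_stack([np.ones(n), rng.normal(size=n)])      # fixed-effects design
Z = np.zeros((n, n_groups))
Z[np.arange(n), group] = 1.0                               # random-intercept design
C = np.hstack([X, Z])

penalty = np.zeros((C.shape[1], C.shape[1]))
penalty[X.shape[1]:, X.shape[1]:] = np.eye(n_groups) / tau2_hat   # G(tau^2)^{-1} block
trH = np.trace(np.linalg.solve(C.T @ C + penalty, C.T @ C))
print(trH)   # effective degrees of freedom, between p = 2 and p + n_groups
```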

Poisson & exponential

- If the response is Poisson distributed, the Chen-Stein formula can be used:
  $\mathrm{Cov}(\hat\theta, y) = \mathrm{E}\!\left( y \big(\hat\theta(y) - \hat\theta(y - 1)\big) \right).$
- The expected deviance error can then be estimated by
  $\mathrm{err} + 2 \sum_i y_i \big( \hat\theta_i(y_i) - \hat\theta_i(y_i - 1) \big).$
- For exponentially distributed responses, the covariance penalty is
  $\mathrm{Cov}(\hat\theta, y) = \mathrm{E}\!\left( y\,\hat\theta(y) - \int_0^y \hat\theta(x)\, dx \right).$
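
As a concrete illustration of the Poisson case, the sketch below (an assumed intercept-only Poisson model, where the maximum likelihood fit has the closed form $\hat\theta = \log \bar y$, so that refitting with one count lowered by one is trivial) evaluates the Chen-Stein estimate of the expected deviance error.

```python
# Sketch: Chen-Stein covariance penalty for an intercept-only Poisson model.
import numpy as np

rng = np.random.default_rng(6)
n, mu_true = 40, 3.0
y = rng.poisson(mu_true, size=n)

def theta_hat(y):
    """Fitted natural parameter of the intercept-only Poisson model."""
    return np.log(np.mean(y))

# In-sample deviance error err = -2 log f(y | theta_hat) + 2C (the Poisson deviance).
mu_hat = np.exp(theta_hat(y))
err = 2.0 * np.sum(mu_hat - y) + 2.0 * np.sum(y[y > 0] * np.log(y[y > 0] / mu_hat))

# Covariance penalty: refit with the i-th count lowered by one.
penalty = 0.0
for i in range(n):
    if y[i] > 0:                        # a zero count contributes nothing to the sum
        y_minus = y.copy()
        y_minus[i] -= 1
        penalty += y[i] * (theta_hat(y) - theta_hat(y_minus))
print(err + 2.0 * penalty)              # estimate of the expected deviance prediction error
```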

Centralized Steinian

- For Bernoulli responses, i.e. binary data, the covariance penalty may be rewritten as
  $\mathrm{Cov}(\hat\theta, y) = \mathrm{E}\!\left( \mu(1 - \mu)\big(\hat\theta(1) - \hat\theta(0)\big) \right).$
- Since $\mu$ is not available, it can be replaced by a consistent estimator $\hat\mu$:
  $\mathrm{err} + 2 \sum_i \hat\mu_i (1 - \hat\mu_i) \big( \hat\theta_i(1) - \hat\theta_i(0) \big).$
- Similarly, for continuous exponential family distributions the expected conditional deviance error can be approximated by
  $\mathrm{err} + 2 \sum_i \frac{\partial \hat\mu_i}{\partial y_i}.$
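
The Bernoulli estimate only requires refitting the model with the i-th response set to 1 and to 0. The sketch below (an illustration with a plain logistic regression rather than a mixed model, fitted by my own small Newton routine) evaluates err plus the centralized Steinian penalty.

```python
# Sketch: centralized Steinian estimate for Bernoulli responses (plain logistic fit).
import numpy as np

rng = np.random.default_rng(7)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ np.array([-0.3, 1.0])))))

def fit_logit(X, y, n_iter=30):
    """Logistic regression by Newton-Raphson; returns fitted natural parameters theta_hat."""
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ b)))
        W = mu * (1.0 - mu)
        b += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - mu))
    return X @ b

theta = fit_logit(X, y)
mu_hat = 1.0 / (1.0 + np.exp(-theta))
err = -2.0 * np.sum(y * theta - np.log1p(np.exp(theta)))   # C = 0 for binary data

penalty = 0.0
for i in range(n):
    y1, y0 = y.copy(), y.copy()
    y1[i], y0[i] = 1, 0                                    # refit with y_i set to 1 and to 0
    penalty += mu_hat[i] * (1.0 - mu_hat[i]) * (fit_logit(X, y1)[i] - fit_logit(X, y0)[i])
print(err + 2.0 * penalty)   # estimate of the expected deviance prediction error
```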

Model selection

- (random intercept) model 1:
  $\log\!\left(\frac{\mu_{ij}}{1 - \mu_{ij}}\right) = \beta_0 + \beta_1 x_i + u_j, \qquad u \sim N(0, \tau^2 I)$
- (linear) model 2:
  $\log\!\left(\frac{\mu_i}{1 - \mu_i}\right) = \beta_0 + \beta_1 x_i$
- Choose the model with the lowest expected deviance.

[Figure: selection frequencies of model 1 as a function of $\tau^2$ (0 to 1.5), in panels for n = 25 and n = 100, comparing the criteria "proposed", "tr(H)", "marginal" and "true".]

Summary

- Two prediction perspectives: marginal & conditional.
- Choose the model with the lowest expected conditional deviance error.
- Unbiased estimates for Gaussian, Poisson & exponential responses.
- Asymptotic estimates for further exponential family distributions.