Prediction of Random Effects and Effects of Misspecification of Their Distribution

Charles E. McCulloch and John Neuhaus
Division of Biostatistics, Department of Epidemiology and Biostatistics
University of California, San Francisco
West Coast Stata Users Group Meeting, October 2007

Outline

1) Introduction and motivating examples
2) Prediction of random effects
   a) Are parametric assumptions important?
3) Brief review of effects of misspecification (more generally) in mixed models
4) Theoretical calculations (Linear Mixed Model)
5) Theoretical calculations (Binary Matched Pairs)
6) Simulations (Linear Mixed Model)
7) Example: Heart and Estrogen Replacement Study
8) Summary

1. Introduction: Examples

Example 1: Measuring cognitive decline in elderly women (Women Who Maintain Optimal Cognitive Function into Old Age. Barnes DE, Cauley JA, Lui L-Y, Fink HA, McCulloch CE, Stone KL, Yaffe K. J Amer Geriatrics Soc, 2007). A modified Mini-Mental status examination was given at baseline and years 6, 8, 10 and 15 in a prospective cohort study (Study of Osteoporotic Fractures). Which participants are thought to be in mental decline and what predicts that decline?

Example 2: Effect of pre-hypertension at an early age in the CARDIA study. (Prehypertension During Young Adulthood and Presence of Coronary Calcium Later in Life: The Coronary Artery Risk Development In Young Adults (CARDIA) Study. MJ Pletcher, K Bibbins-Domingo, CE Lewis, G Wei, S Sidney, JJ Carr, E Vittinghoff, CE McCulloch, SB Hulley, submitted). Blood pressure measured every five years since 1986. How to approximate previous and cumulative blood pressure exposure?

Example 3: Predicting those at risk for developing high blood pressure in HERS (The Heart and Estrogen Replacement Study - Hulley, et al, J. American Medical Association, 1998). HERS was a randomized, blinded, placebo controlled trial for women with previous coronary disease. We will use it as a prospective cohort study for prediction of high blood pressure. 2,763 women were enrolled and followed yearly for 5 subsequent visits. We will consider only the subset that were not diabetic and with systolic blood pressure less than 140 at the beginning of the study.

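2. Mixed models and prediction of random effects

One way to address the questions above is to utilize mixed models and derive predicted values of the random effects.

Example 1 (cognitive decline):

$$\text{MMSE}_{it} = \text{cognitive measure for participant } i \text{ at time } t$$

$$\text{MMSE}_{it} = b_{0i} + b_{1i}\,t + \text{covariates} + \varepsilon_{it}$$

$$\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \sim \text{indep. } N\!\left( \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \begin{pmatrix} \sigma_{00} & \sigma_{01} \\ \sigma_{01} & \sigma_{11} \end{pmatrix} \right)$$

Calculate $\tilde b_{1i}$ = predicted decline for participant $i$.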

Some realistic but made up data:

. table visit, c(mean mmse n mmse sd mmse)

------------------------------------------
    visit | mean(mmse)   N(mmse)  sd(mmse)
----------+-------------------------------
        0 |      27.08     2,031       2.2
        1 |      27.17     1,931       2.3
        2 |      27.10     1,850       2.3
        3 |      27.08     1,750       2.3
        4 |      27.04     1,361       2.3
        5 |      27.10        69       2.2
------------------------------------------

So little change in average MMSE over time.

xi: xtmixed mmse visit exercise avgdrpwk || pptid: visit, cov(uns)

Performing EM optimization:
Performing gradient-based optimization:
Iteration 0:   log restricted-likelihood = -1166.158
Iteration 1:   log restricted-likelihood = -1166.14
Iteration 2:   log restricted-likelihood = -1166.14
Computing standard errors:

Mixed-effects REML regression           Number of obs      =   9110
Group variable: pptid                   Number of groups   =     03
                                        Obs per group: min =      1
                                                       avg =    4.5
                                                       max =      6
                                        Wald chi2(3)       =    7.4
Log restricted-likelihood = -1166.14    Prob > chi2        = 0.0000

------------------------------------------------------------------------
     mmse |     Coef.  Std. Err.       z   P>|z|   [95% Conf. Interval]
----------+-------------------------------------------------------------
    visit | -.0060353   .0059123   -1.02   0.307   -.0176231   .0055526
 exercise |  .0773954   .0179999    4.30   0.000    .0421162   .1126746
 avgdrpwk | -.0097331   .0037005   -2.63   0.009   -.0169859  -.0024803
    _cons |  27.11017   .0495455  547.18   0.000    27.01307   27.20728
------------------------------------------------------------------------

------------------------------------------------------------------------
 Random-effects Parameters |  Estimate  Std. Err.   [95% Conf. Interval]
---------------------------+--------------------------------------------
pptid: Unstructured        |
                 sd(visit) |  .1942305   .0057221    .1833332   .2057752
                 sd(_cons) |  2.158639   .0349782    2.091152   2.228296
         corr(visit,_cons) | -.0426722   .0310132   -.1032252   .0181959
---------------------------+--------------------------------------------
              sd(Residual) |  .4819752   .0047432    .4727672   .4913616
------------------------------------------------------------------------
LR test vs. linear regression:  chi2(3) = 17033.5   Prob > chi2 = 0.000

. predict rslopedev rintdev, reffects
. gen predslope=_b[visit]+rslopedev
. collapse rslopedev rintdev predslope, by(pptid)
. gen deltammse=6*predslope
. summarize

    Variable |   Obs        Mean   Std. Dev.        Min        Max
-------------+----------------------------------------------------
       pptid |    03     1394.65      794.41          1        761
   rslopedev |    03     1.3e-10       .1421     -.7615      .9239
     rintdev |    03     3.9e-10        .133     -9.875     3.0384
   predslope |    03    -.006035       .1421     -.7676      .9178
   deltammse |    03     -.03611       .8530    -4.6057     5.5071

. summarize deltammse predslope if deltammse<-2

    Variable |   Obs        Mean   Std. Dev.        Min        Max
-------------+----------------------------------------------------
   deltammse |    40     -2.6342       .4937    -4.6057    -2.0331
   predslope |    40      -.4390       .0822     -.7676     -.3388
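The arithmetic behind the generated variables can be written out as a short worked equation. This is only a restatement of the `gen` commands in the session, using the fixed visit coefficient and the most negative predicted slope from the output; the symbol $\widehat{\Delta}_i$ is introduced here purely for illustration.

$$\text{predslope}_i = \hat\beta_{\text{visit}} + \tilde b_{1i}, \qquad \widehat{\Delta}_i = 6 \times \text{predslope}_i$$

$$\tilde b_{1i} = 0:\quad \widehat{\Delta}_i = 6 \times (-0.0060353) \approx -0.036$$

$$\text{predslope}_i = -0.7676:\quad \widehat{\Delta}_i = 6 \times (-0.7676) \approx -4.606$$

A participant with an average slope is therefore predicted to change very little, while the steepest predicted decline corresponds to a change of about 4.6 MMSE points, matching the minimum of deltammse up to rounding.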

Example 2 (pre-hypertension):

$$\text{BP}_{it} = \text{blood pressure for participant } i \text{ at time } t$$

$$\text{BP}_{it} = \text{spline}_i(t) + \text{covariates} + \varepsilon_{it}$$

$$(\text{spline terms}) \sim \text{indep. } N(\cdot,\,\cdot)$$

Calculate the predicted spline for participant $i$ at time $t$.

The area under the predicted blood pressure trajectory between 120 and 140 mmHg was integrated over time as a cumulative prehypertension exposure (in years of mmHg). This was then used as a predictor of coronary calcification.
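A minimal sketch of the exposure calculation, assuming that "area between 120 and 140 mmHg" means the part of the predicted trajectory above 120 mmHg, capped at 140 mmHg, integrated with the trapezoid rule. The function name `cumulative_exposure`, the clipping interpretation, and the example trajectory are illustrative assumptions, not the authors' code.

```python
def cumulative_exposure(times, bp, lower=120.0, upper=140.0):
    """Approximate years-of-mmHg exposure between `lower` and `upper`.

    times : increasing time points in years (hypothetical grid)
    bp    : predicted blood pressure at those time points
    The trajectory is clipped to [lower, upper] and the excess over
    `lower` is integrated with the trapezoid rule.
    """
    excess = [min(max(v, lower), upper) - lower for v in bp]
    area = 0.0
    for k in range(1, len(times)):
        width = times[k] - times[k - 1]
        area += 0.5 * (excess[k] + excess[k - 1]) * width
    return area


if __name__ == "__main__":
    # Made-up predicted trajectory evaluated every five years.
    print(cumulative_exposure([0, 5, 10, 15], [110, 120, 130, 150]))
```

Because clipping is applied only at the grid points, a coarse grid misplaces the times at which the trajectory crosses 120 or 140 mmHg; evaluating the predicted spline on a fine time grid before integrating reduces that error.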

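Example 3 (high blood pressure):

$$\text{BP}_{it} = 1 \text{ if blood pressure is high for subject } i \text{ at time } t, \text{ and } 0 \text{ o/w}$$

$$\text{logit}\big(P\{\text{BP}_{it} = 1\}\big) = b_{0i} + \text{covariates}$$

$$b_{0i} \sim \text{i.i.d. } N(\beta_0, \sigma_b^2)$$

Calculate $\tilde b_{0i}$ = predicted intercept for participant $i$.

Predicted values of random effects are available from gllamm or the new (Ver 10) multilevel logit command xtmelogit.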

The xtmixed, xtmelogit, xt-etc. and gllamm commands fit the models using regular or restricted maximum likelihood. So they use a parametric assumption for both the distribution of the outcome and the distribution of the random effects, the latter typically that the distributions are normal.

Key question: Is the parametric assumption of the random effects distribution important? This is especially crucial since we don't get to directly observe the random effects. Unfortunately, the predicted random effects may not reflect the shape of the underlying distribution. (More on this point later.)

3. Review of impact of misspecification in mixed models

A number of investigations have focused on the effect of misspecifications in parametric mixed models. They can be grouped as:

1. Getting the distributional shape wrong.
2. Falsely assuming the random effect is independent of the cluster size.
3. Falsely assuming the random effect is independent of covariates, e.g.,
   a. Mean of random effects distribution could be associated with a covariate.
   b. Variance of random effects distribution could be associated with a covariate.

Most investigations have concentrated on the impact on estimation of the fixed effects portion of the model. General assessment:

1) Getting the distributional shape wrong has little impact on inferences about the fixed effects.
2) Incorrectly assuming the random effects distribution is independent of the cluster size may affect inferences about the intercept, but does not seriously impact inferences about the regression parameters.
3) However, assuming the random effects distribution is independent of the covariates when it is not is potentially serious. (Related to mean: Neuhaus and McCulloch, JRSSB, 2006; related to the variance: Heagerty and Kurland, Biometrika, 2001).

What about inference about the predictions of the random effects? We'll concentrate on the issue of wrong distributional shape, where fixed effects inferences seem largely unaffected. Intuition: the assumed form of the random effects distribution may be a more crucial assumption in this case.

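4. Theoretical calculations (Linear Mixed Model)

First consider an easy situation. Assumed model is a one-way random effects model with known intercept and variance components and normally distributed random effects:

$$Y_{it} = \mu + b_i + \varepsilon_{it}, \quad t = 1, \ldots, n_i;\; i = 1, \ldots, q$$

$$b_i \sim \text{i.i.d. } N(0, \sigma_b^2), \qquad \varepsilon_{it} \sim \text{i.i.d. } N(0, \sigma_\varepsilon^2)$$

$$\mu,\ \sigma_\varepsilon^2, \text{ and } \sigma_b^2 \text{ known}$$

In which case the Best Linear Unbiased Predictor is given by

$$\tilde b_i = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2 / n_i}\,(\bar Y_i - \mu)$$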

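Writing this out:

$$\tilde b_i = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2/n_i}\,(\bar Y_i - \mu) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2/n_i}\,(\mu + b_i + \bar\varepsilon_i - \mu) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2/n_i}\,(b_i + \bar\varepsilon_i)$$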

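Conditional on $b_i$, the $Y_{it}$ are independent $N(\mu + b_i, \sigma_\varepsilon^2)$. So

$$\tilde b_i \mid b_i \sim N\!\left( \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2/n_i}\, b_i,\; \left( \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2/n_i} \right)^{2} \frac{\sigma_\varepsilon^2}{n_i} \right)$$

and $\tilde b_i$ is conditionally biased. Since the calculations are conditional on $b_i$, results do not depend on the distribution of the $b_i$ and so the conditional bias does not depend on the distribution.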

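$$\tilde b_i = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2/n_i}\,(b_i + \bar\varepsilon_i) \;\to\; \frac{\sigma_b^2}{\sigma_b^2}\,(b_i + 0) = b_i \quad \text{as } n_i \to \infty$$

So $\tilde b_i$ converges in probability to the true value as $n_i \to \infty$. But asymptotic calculations as $n_i \to \infty$ are not usually of interest for a random effects model.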
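The shrinkage behind this limit is easy to tabulate. The sketch below assumes the one-way model with known parameters described above; the function names `shrinkage` and `blup_normal` are illustrative and do not come from the talk.

```python
def shrinkage(sigma_b2, sigma_e2, n):
    """Shrinkage factor sigma_b^2 / (sigma_b^2 + sigma_e^2 / n)."""
    return sigma_b2 / (sigma_b2 + sigma_e2 / n)


def blup_normal(ybar, mu, sigma_b2, sigma_e2, n):
    """Predicted random effect for a cluster with mean `ybar`,
    under the assumed normal model with all parameters known."""
    return shrinkage(sigma_b2, sigma_e2, n) * (ybar - mu)


if __name__ == "__main__":
    # The factor approaches 1 as the cluster size grows, so the
    # predictor approaches the raw deviation ybar - mu.
    for n in (1, 2, 10, 100, 10000):
        print(n, round(shrinkage(1.0, 1.0, n), 4))
```

With unit variances the factor is 0.5 for a single observation and about 0.91 for ten, so for typical cluster sizes the predictor is still pulled noticeably toward zero.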

What does the distribution of the $\tilde b_i$ look like? And what if the assumption of normality for the $b_i$ is incorrect, i.e., not normal?

If $n_i$ is large then each $\tilde b_i$ is close to $b_i$ and hence the distribution is approximately correct. But what about the case when $n_i$ is not large, the usual case of interest? Then the distribution of $\tilde b_i$ is the convolution of the true density with the conditional density of $\tilde b_i$ given $b_i$.

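For example, suppose the true density is exponential(1), shifted to have mean 0. Then the density of $\tilde b_i$ is given by

$$f_{\tilde b_i}(x) = \int_{-1}^{\infty} \frac{1}{\sqrt{2\pi}\, s_i} \exp\!\left\{ -\frac{(x - c_i b)^2}{2 s_i^2} \right\} \exp\{-(b + 1)\}\, db, \qquad c_i = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2/n_i}, \quad s_i^2 = c_i^2\, \frac{\sigma_\varepsilon^2}{n_i}$$

which is straightforward to evaluate numerically.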
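One way to carry out the numerical evaluation is composite Simpson integration over the true random effect. This is a sketch under stated assumptions: the conditional distribution of the predictor given b is normal with mean c*b and standard deviation c*sigma_e/sqrt(n), the true random effect is sigma_b*(E - 1) with E exponential(1), and the function names, default cluster size, and integration limits are all illustrative choices.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def blup_density(x, sigma_b=1.0, sigma_e=1.0, n=2, steps=2000, upper=40.0):
    """Density at x of the normal-theory predictor when the true random
    effect is sigma_b * (E - 1), E ~ exponential(1).

    Integrates normal_pdf(x; c*b, s) * g(b) over b with Simpson's rule,
    where g is the shifted exponential density. `steps` must be even.
    """
    c = sigma_b ** 2 / (sigma_b ** 2 + sigma_e ** 2 / n)
    s = c * sigma_e / math.sqrt(n)
    lo = -sigma_b
    hi = lo + upper * sigma_b  # tail mass beyond this is negligible
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        b = lo + k * h
        g = math.exp(-(b / sigma_b + 1.0)) / sigma_b
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * normal_pdf(x, c * b, s) * g
    return total * h / 3.0


if __name__ == "__main__":
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0, 2.0):
        print(x, round(blup_density(x), 4))
```

The resulting density is a smoothed, shrunken version of the shifted exponential: it keeps some right skew but loses the sharp lower boundary of the true distribution.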

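What is the BLUP under the exponential assumption? Model:

$$Y_{it} = \mu + b_i + \varepsilon_{it}, \quad t = 1, \ldots, n_i;\; i = 1, \ldots, q$$

$$b_i \sim \text{i.i.d. } \sigma_b\,(E(1) - 1), \qquad \varepsilon_{it} \sim \text{i.i.d. } N(0, \sigma_\varepsilon^2)$$

$$\mu,\ \sigma_b, \text{ and } \sigma_\varepsilon \text{ known}$$

Define

$$b_i^* = \frac{\sqrt{n_i}}{\sigma_\varepsilon} \left( \bar Y_i - \mu + \sigma_b - \frac{\sigma_\varepsilon^2}{n_i \sigma_b} \right)$$

Then

$$\tilde b_i = \bar Y_i - \mu - \frac{\sigma_\varepsilon^2}{n_i \sigma_b} + \frac{\sigma_\varepsilon}{\sqrt{n_i}}\, \frac{\phi(b_i^*)}{\Phi(b_i^*)}$$

where $\phi(t)$ and $\Phi(t)$ are the standard normal p.d.f. and c.d.f. How do the assumed normal and assumed exponential BLUPs compare?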

[Figure: BLUPs Under Different Distributional Assumptions. BLUP plotted against the average in cluster, with lines for the raw deviation, the normal BLUP, and the exponential BLUP.]
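The comparison shown in the figure can be recomputed from the two closed forms. The sketch below assumes known parameters and the shifted, scaled exponential model for the random effects; the exponential predictor is the mean of a normal distribution truncated below at -sigma_b. Function names and the parameter values in the demo are illustrative assumptions.

```python
import math


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def blup_normal(ybar, mu, sigma_b, sigma_e, n):
    """Predictor under the assumed normal random effects model."""
    return sigma_b ** 2 / (sigma_b ** 2 + sigma_e ** 2 / n) * (ybar - mu)


def blup_exponential(ybar, mu, sigma_b, sigma_e, n):
    """Predictor under b = sigma_b * (E - 1), E ~ exponential(1).

    Not numerically safe for extremely negative b_star, where the
    normal c.d.f. underflows to zero.
    """
    tau = sigma_e / math.sqrt(n)
    m = ybar - mu - sigma_e ** 2 / (n * sigma_b)
    b_star = (m + sigma_b) / tau
    return m + tau * std_normal_pdf(b_star) / std_normal_cdf(b_star)


if __name__ == "__main__":
    # Illustrative values: mu = 0, unit variances, cluster size 2.
    for ybar in (-4, -2, 0, 2, 4, 6, 8):
        print(ybar,
              round(blup_normal(ybar, 0.0, 1.0, 1.0, 2), 3),
              round(blup_exponential(ybar, 0.0, 1.0, 1.0, 2), 3))
```

For large cluster averages the exponential predictor runs nearly parallel to the raw deviation, while for very low averages it flattens out just above -sigma_b, the lower bound of the assumed distribution. The normal predictor shrinks both tails by the same factor.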

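5. Theoretical calculations (Binary matched pairs)

Assumed model

$$Y_{it} \sim \text{Binomial}(p_{it}), \qquad \text{logit}(p_{it}) = \mu + b_i + \beta\, I\{t = 2\}$$

$$b_i \sim \text{i.i.d. } N(0, \sigma_b^2), \qquad i = 1, \ldots, q;\; t = 1, 2$$

Since there are only 4 data configurations per cluster there are only four possible values for $\tilde b_i$, for a given set of parameter values. For example, when $y_{i1} = y_{i2} = 1$, $\tilde b_i$ is given by (with $p(t) = 1/(1 + e^{-t})$ and $b$ in the integrals denoting the standardized random effect)

$$\tilde b_i = \frac{\int \sigma_b\, b\, \phi(b)\, p(\mu + \sigma_b b)\, p(\mu + \sigma_b b + \beta)\, db}{\int \phi(b)\, p(\mu + \sigma_b b)\, p(\mu + \sigma_b b + \beta)\, db}$$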

These depend on the assumed distribution. The probabilities of the four (actually three) values depend on the true distribution.

[Figure: Probability Distribution for BLUPs. Probability plotted against the best predicted value, comparing the BLUP under the assumed normal with the BLUP under the true (exponential) distribution.]
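The predicted values for the matched-pairs model can be obtained by one-dimensional numerical integration. The sketch below assumes normal random effects with mean 0 and standard deviation sigma_b, integrates over the standardized effect z with the trapezoid rule, and uses illustrative names (`expit`, `predicted_effect`) and made-up parameter values.

```python
import math


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def predicted_effect(y1, y2, mu, sigma_b, beta, half_width=8.0, steps=1600):
    """Posterior mean of b_i = sigma_b * z given the pair (y1, y2),
    under the assumed normal model, by trapezoid integration over z."""
    h = 2.0 * half_width / steps
    num = 0.0
    den = 0.0
    for k in range(steps + 1):
        z = -half_width + k * h
        p1 = expit(mu + sigma_b * z)
        p2 = expit(mu + sigma_b * z + beta)
        lik = (p1 if y1 else 1.0 - p1) * (p2 if y2 else 1.0 - p2)
        w = 0.5 if k in (0, steps) else 1.0
        num += w * sigma_b * z * std_normal_pdf(z) * lik
        den += w * std_normal_pdf(z) * lik
    return num / den


if __name__ == "__main__":
    for pair in ((0, 0), (0, 1), (1, 0), (1, 1)):
        print(pair, round(predicted_effect(pair[0], pair[1], 0.0, 1.0, 1.0), 4))
```

The two discordant configurations give the same value because their likelihoods differ only by the constant factor exp(-beta), which cancels in the ratio. That is why four data configurations yield only three distinct predicted values.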

It is also straightforward to calculate the mean square error of prediction using the assumed and true models under the true model. For example, if the assumed model is normal, but the true is exponential, here are some values of the mean squared error of prediction, $\text{MSEP} = E[(\tilde b_i - b_i)^2]$, with $\mu = 0$, $\sigma_b = 1$:

  β    Normal (assumed)    Exponential (true)    Percent increase
  0         0.77                 0.75                  3.5%
  1         0.82                 0.79                  3.0%
  2         0.85                 0.83                  2.1%
  3         0.87                 0.85                  1.4%

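6. Simulation

We simulated data from the one-way random model:

$$Y_{it} = \mu + b_i + \varepsilon_{it}, \quad t = 1, \ldots, n_i;\; i = 1, \ldots, q$$

$$\varepsilon_{it} \sim \text{i.i.d. } N(0, \sigma_\varepsilon^2)$$

$$b_i \sim \text{i.i.d. } N(0, \sigma_b^2) \quad \text{or} \quad b_i \sim \text{i.i.d. } \sigma_b\{E(1) - 1\}$$

with $q = 10 = n_i$ and using the same random numbers for both the normal and exponential random effects (and the same error terms). 10,000 replications. An assumed normal model was fit.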

Simulation results

Estimates of the parameters

                  True     Ave       SD      Ave SE
Normal
  µ                1       1.00     0.33     0.32
  ln(σ_ε)          0      -0.01     0.075    0.075
  ln(σ_b)          0      -0.07     0.29*    0.27*
Exponential
  µ                1       1.00     0.33     0.31
  ln(σ_ε)          0      -0.01     0.075    0.075
  ln(σ_b)          0      -0.18*    0.47*    0.29*

*Excludes one outlier

Estimates of fixed effects parameters are little affected.

As is the estimate of log of the residual variance.

But the estimate of the random effects variance is off.

Confidence interval coverage for µ was slightly lower than nominal for the normal (92%), and low for the exponential (88%). Mean square error of prediction for the BLUPs was 1.87 for the normal model and 1.84 for the exponential.

Do the BLUPs calculated under the assumption of normality reflect the true underlying shape (exponential)?

For data simulated with normally distributed random effects the average skewness of the BLUPs was -0.01 and the average kurtosis was 2.50 (with a normal having values 0 and 3).

For data simulated with exponentially distributed random effects the average skewness was 0.85 and the average kurtosis was 3.14 (with an exponential(1) having values 2 and 9).
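A simplified version of this check can be simulated directly. The sketch below is not a reproduction of the study: it treats the parameters as known instead of fitting a model by REML, so its numbers will differ from those reported. It does follow the stated design of q = 10 clusters of size 10 and reuses the same uniform draw for the normal and exponential random effects; all function names and the seed are illustrative.

```python
import math
import random
from statistics import NormalDist


def sample_skewness(xs):
    """Moment-based sample skewness m3 / m2^1.5."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def one_replication(rng, q=10, n=10, sigma_b=1.0, sigma_e=1.0):
    """Skewness of normal-theory predictors for one simulated data set,
    under normal and under shifted exponential random effects."""
    c = sigma_b ** 2 / (sigma_b ** 2 + sigma_e ** 2 / n)
    std_normal = NormalDist()
    pred_norm, pred_exp = [], []
    for _ in range(q):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        b_norm = sigma_b * std_normal.inv_cdf(u)
        b_exp = sigma_b * (-math.log(1.0 - u) - 1.0)
        ebar = sum(rng.gauss(0.0, sigma_e) for _ in range(n)) / n
        # With mu known, ybar - mu equals b + ebar.
        pred_norm.append(c * (b_norm + ebar))
        pred_exp.append(c * (b_exp + ebar))
    return sample_skewness(pred_norm), sample_skewness(pred_exp)


def average_skewness(reps=1000, seed=1):
    rng = random.Random(seed)
    tot_norm = tot_exp = 0.0
    for _ in range(reps):
        s_norm, s_exp = one_replication(rng)
        tot_norm += s_norm
        tot_exp += s_exp
    return tot_norm / reps, tot_exp / reps


if __name__ == "__main__":
    print(average_skewness())
```

Two effects pull the skewness of the predictors below the exponential value of 2: the added noise from the cluster-mean error, and the strong downward bias of sample skewness when it is computed from only ten values per data set.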

7. Example (HERS)

Recall the HERS example: We will consider the 1,378 women who did not have high blood pressure and were not diabetic at the baseline visit. We will use the baseline and visits 1 through 3 to predict the blood pressure at visits 4 and 5 and whether or not the woman had developed high blood pressure on either visit 4 or 5.

Brief descriptive statistics:

Variable        Mean/Percentage    SD
Age             66.3               6.9
BMI             27.3               4.9
Weight          70.3 kg            13.4 kg
On BP meds      79%

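Predictive model (for baseline and visits 1, 2 and 3):

$$\text{BP}_{it} = \beta_0 + b_{0i} + \beta_1 \text{BMI}_{it} + \beta_2 \text{EXER}_{it} + \beta_3 \text{AGE}_{it} + \beta_4 \text{MEDS}_{it} + \beta_5 \text{DM}_{it} + \varepsilon_{it}$$

$$b_{0i} \sim \text{i.i.d. } N(0, \sigma_b^2) \quad \text{or} \quad b_{0i} \sim \text{i.i.d. } \sigma_b\{E(1) - 1\}$$

Calculate

$$\widehat{\text{BP}}_{it} = \hat\beta_0 + \tilde b_{0i} + \hat\beta_1 \text{BMI}_{it} + \hat\beta_2 \text{EXER}_{it} + \hat\beta_3 \text{AGE}_{it} + \hat\beta_4 \text{MEDS}_{it} + \hat\beta_5 \text{DM}_{it} \quad \text{(mixed model pred)}$$

or

$$\widehat{\text{BP}}_{it} = \hat\beta_0 + \hat\beta_1 \text{BMI}_{it} + \hat\beta_2 \text{EXER}_{it} + \hat\beta_3 \text{AGE}_{it} + \hat\beta_4 \text{MEDS}_{it} + \hat\beta_5 \text{DM}_{it} \quad \text{(fixed effects only)}$$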

How well do the predictions work? For predicting the actual systolic blood pressure:

Prediction Errors
Method                        Ave     Ave abs    RMSE
Fixed effects only            3.4     13.8       18.1
Mixed model (normal)          3.9     11.0       14.9
Mixed model (exponential)     3.1     11.1       14.9

For predicting high BP or not: Area under the ROC curve: Fixed effects 0.55, Normal 0.80, Exponential 0.80.
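The summary measures in this comparison can be computed with a few lines of code. The sketch below assumes that "Ave" is the mean of predicted minus observed (the sign convention is not stated in the talk) and computes the area under the ROC curve as the proportion of positive-negative pairs ranked correctly, counting ties as one half. The function names and the toy data are illustrative.

```python
import math


def prediction_errors(observed, predicted):
    """Mean error, mean absolute error, and root mean squared error."""
    errs = [p - o for o, p in zip(observed, predicted)]
    n = len(errs)
    return {
        "ave": sum(errs) / n,
        "ave_abs": sum(abs(e) for e in errs) / n,
        "rmse": math.sqrt(sum(e * e for e in errs) / n),
    }


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparisons.

    labels : 1 for an event (for example, high blood pressure), 0 otherwise
    scores : predicted values; higher should indicate higher risk
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(prediction_errors([120, 130, 150], [125, 128, 140]))
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in the number of observations, which is fine for a data set of this size; a rank-based formula gives the same value more quickly for larger samples.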

Do they give the same predicted values? No, but close:

[Figure: Prediction based on Normal Random Effects plotted against Prediction based on Exponential Random Effects, both axes running from 100 to 180.]

Here is a plot of the difference between the predicted values:

[Figure: Difference in Normal and Exponential Predictions (about -15 to 5) plotted against Mean of SBP in Visits 1 to 3 (100 to 200).]

8. Summary

Predicted values of random effects show modest sensitivity to the assumed distributional shape.
Distribution shape of BLUPs often not reflective of true random effects distribution.
The ranking of predicted values is little affected.
Fitting flexible distributional shapes is an easy way to check sensitivity of the results to the assumed shape.

I can be contacted at: chuck@biostat.ucsf.edu
Talk can be downloaded from my website, which can be found by starting at: www.biostat.ucsf.edu