Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models:

Similar documents
You can specify the response in the form of a single variable or in the form of a ratio of two variables denoted events/trials.

Q30b Moyale Observed counts. The FREQ Procedure. Table 1 of type by response. Controlling for site=moyale. Improved (1+2) Same (3) Group only

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study

CO2 Handout. t(cbind(co2$type,co2$treatment,model.matrix(~type*treatment,data=co2)))

Multilevel Methodology

Fitting PK Models with SAS NLMIXED Procedure Halimu Haridona, PPD Inc., Beijing

Simple logistic regression

STA6938-Logistic Regression Model

COMPLEMENTARY LOG-LOG MODEL

Models for binary data

Analysis of Count Data A Business Perspective. George J. Hurley Sr. Research Manager The Hershey Company Milwaukee June 2013

Logistic Regression. Interpretation of linear regression. Other types of outcomes. 0-1 response variable: Wound infection. Usual linear regression

Lecture 3.1 Basic Logistic LDA

Correlated data. Non-normal outcomes. Reminder on binary data. Non-normal data. Faculty of Health Sciences. Non-normal outcomes

ESTIMATE PROP. IMPAIRED PRE- AND POST-INTERVENTION FOR THIN LIQUID SWALLOW TASKS. The SURVEYFREQ Procedure

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

ST3241 Categorical Data Analysis I Multicategory Logit Models. Logit Models For Nominal Responses

Example 7b: Generalized Models for Ordinal Longitudinal Data using SAS GLIMMIX, STATA MEOLOGIT, and MPLUS (last proportional odds model only)

SAS PROC NLMIXED Mike Patefield The University of Reading 12 May

Covariance Structure Approach to Within-Cases

Multinomial Logistic Regression Models

Models for Binary Outcomes

Count data page 1. Count data. 1. Estimating, testing proportions

Section Poisson Regression

Faculty of Health Sciences. Correlated data. Count variables. Lene Theil Skovgaard & Julie Lyng Forman. December 6, 2016

dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" = -/\<>*"; ODS LISTING;

Homework 5: Answer Key. Plausible Model: E(y) = µt. The expected number of arrests arrests equals a constant times the number who attend the game.

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

SAS Analysis Examples Replication C8. * SAS Analysis Examples Replication for ASDA 2nd Edition * Berglund April 2017 * Chapter 8 ;

STAT 705 Generalized linear mixed models

Introduction (Alex Dmitrienko, Lilly) Web-based training program

Overdispersion Workshop in generalized linear models Uppsala, June 11-12, Outline. Overdispersion

6 Applying Logistic Regression Models

Changes Report 2: Examples from the Australian Longitudinal Study on Women s Health for Analysing Longitudinal Data

The GENMOD Procedure. Overview. Getting Started. Syntax. Details. Examples. References. SAS/STAT User's Guide. Book Contents Previous Next

Lab 11. Multilevel Models. Description of Data

ssh tap sas913, sas

Regression without measurement error using proc calis

MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010

ANOVA Longitudinal Models for the Practice Effects Data: via GLM

Chapter 4: Generalized Linear Models-II

Longitudinal Modeling with Logistic Regression

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1

Multilevel Modeling of Non-Normal Data. Don Hedeker Department of Public Health Sciences University of Chicago.

SAS Example 3: Deliberately create numerical problems

Wrap-up. The General Linear Model is a special case of the Generalized Linear Model. Consequently, we can carry out any GLM as a GzLM.

Model Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response)

Outline. Linear OLS Models vs: Linear Marginal Models Linear Conditional Models. Random Intercepts Random Intercepts & Slopes

Appendix: Computer Programs for Logistic Regression

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Applied Statistics Friday, January 15, 2016

Generalized Models: Part 1

ST3241 Categorical Data Analysis I Logistic Regression. An Introduction and Some Examples

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Cohen s s Kappa and Log-linear Models

Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Generalized Linear. Mixed Models. Methods and Applications. Modern Concepts, Walter W. Stroup. Texts in Statistical Science.

The GLIMMIX Procedure (Book Excerpt)

Topic 20: Single Factor Analysis of Variance

Chapter 1. Modeling Basics

Some comments on Partitioning

Models for Ordinal Response Data

Sections 4.1, 4.2, 4.3

The GENMOD Procedure (Book Excerpt)

Odor attraction CRD Page 1

SAS Code: Joint Models for Continuous and Discrete Longitudinal Data

SAS/STAT 14.2 User s Guide. The GENMOD Procedure

Using PROC GENMOD to Analyse Ratio to Placebo in Change of Dactylitis. Irmgard Hollweck / Meike Best 13.OCT.2013

Chapter 14 Logistic and Poisson Regressions

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

ANALYSING BINARY DATA IN A REPEATED MEASUREMENTS SETTING USING SAS

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

Introduction to Generalized Models

STAT 526 Advanced Statistical Methodology

Model Estimation Example

Poisson Data. Handout #4

Introduction to SAS proc mixed

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )

,..., θ(2),..., θ(n)

Logistic regression analysis. Birthe Lykke Thomsen H. Lundbeck A/S

SAS/STAT 14.2 User s Guide. The GLIMMIX Procedure

Answer to exercise: Blood pressure lowering drugs

This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed.

Topic 25 - One-Way Random Effects Models. Outline. Random Effects vs Fixed Effects. Data for One-way Random Effects Model. One-way Random effects

Figure 36: Respiratory infection versus time for the first 49 children.

Missing Data in Longitudinal Studies: Mixed-effects Pattern-Mixture and Selection Models

EPSY 905: Fundamentals of Multivariate Modeling Online Lecture #7

STAT 5200 Handout #23. Repeated Measures Example (Ch. 16)

SAS Syntax and Output for Data Manipulation:

Package HGLMMM for Hierarchical Generalized Linear Models

Biostat Methods STAT 5500/6500 Handout #12: Methods and Issues in (Binary Response) Logistic Regression

Product Held at Accelerated Stability Conditions. José G. Ramírez, PhD Amgen Global Quality Engineering 6/6/2013

Regression Models for Risks(Proportions) and Rates. Proportions. E.g. [Changes in] Sex Ratio: Canadian Births in last 60 years

7.1, 7.3, 7.4. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis 1/ 31

CHL 5225 H Crossover Trials. CHL 5225 H Crossover Trials

Biostat Methods STAT 5820/6910 Handout #5a: Misc. Issues in Logistic Regression

NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION. ST3241 Categorical Data Analysis. (Semester II: ) April/May, 2011 Time Allowed : 2 Hours

Transcription:

Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models: Marginal models: based on the consequences of dependence on estimating model parameters. Target of inference is the population. Random effect models: based on the sources of dependence. Target of inference is the subject. 1

Special case: Linear models Model for mean response: E (Y i ) = X i β Alternative specifications for variance: covariance pattern e.g. AR(1) or Toeplitz; Cov (Y i ) = Σ i 2

random effects E (Y i b i ) = X i β + Z i,j b i with E (b i ) = 0, Cov (b i ) = G, whence Σ i = Z i,j GZ i,j + σ2 I ni 3

Interpretation of β: in both cases, E ( Y i,j ) = X i,j β = k X i,j,k β k so E ( Y i,j ) But also, with random effects, X i,j,k = β k. E ( Y i,j b i ) = X i,j β + Z i,j b i so, if the k th covariate is not associated with a random effect, E ( Y i,j b i ) X i,j,k = β k. 4

That is, β k is the rate of change of the mean response as the k th covariate changes, both on average across subjects (marginally), and for a specific subject with random effects b (conditionally). This might be either a rate of change over time, if the covariate is time (within-subject factor), or a treatment effect, if the covariate is a dummy variable (between-subject factor). 5

Generalized Linear Models With link function g( ) and inverse link h( ) = g 1 ( ): the marginal model is or g { E ( Y i,j )} = X i,j β, E ( Y i,j ) = h ( X i,j β ) ; the mixed effects model is or g { E ( Y i,j b i )} = X i,j β + Z i,j b i E ( Y i,j b i ) = h ( X i,j β + Z i,j b i). 6

In the mixed effects case, E ( ) { ( )} Y i,j = E E Yi,j b i = and in general = E { h ( X i,j β + Z i,j b i E ( ) ( Y i,j h X i,j β ) for any β. )} h ( X i,j β + Z i,j b i) fb (b i )db i, 7

Logistic regression with random intercept: logit { E ( Y i,j b i )} = X i,j β + b i so and E ( Y i,j ) = E = E ( Y i,j b i ) = e 1 + e e 1 + e ) (X i,j β +b i ) (X i,j β +b i ) (X i,j β +b i e 1 + e ( (X i,j β +b i ) (X i,j β +b i ) ) 1 X i,j β +b i 2πσb 2 e 1 2 b2 i /σ2 b db i. 8

No closed-form solution, but logit { E ( Y i,j )} X i,j β 1 + k 2 σ 2 b where k = 16 3 15π = 0.588 and k2 = 0.346, so β β 1 + 0.346σ 2 b. If σ 2 b = 8, then β β /2 9

Example: one covariate, sample of size 13 with β1 = 1.5, β2 = 0.75, g 1,1 = σb 2 = 4; population average is shown in red, and the approximation in blue. y 0.0 0.2 0.4 0.6 0.8 1.0 4 2 0 2 4 6 8 x 10

Case study: abnormal ECG in a drug trial. options linesize = 80 pagesize = 21 nodate; data ecg; retain id 0; infile ecg.txt firstobs = 32; input seq r1 r2 count; if seq = 1 then /* Placebo followed by drug */ do i = 1 to count; id = id + 1; trt = 0; y = r1; period = 0; output; trt = 1; y = r2; period = 1; output; end; else /* Drug followed by placebo */ do i = 1 to count; id = id + 1; trt = 1; y = r1; period = 0; output; trt = 0; y = r2; period = 1; output; end; run; 11

title1 Marginal Logistic Regression Model ; title2 ; proc genmod descending; class id; model y = trt period / d=bin; repeated subject=id / logor=fullclust; run; title1 Mixed Effects Logistic Regression Model (Random Intercept) ; title2 ; proc nlmixed qpoints=100; /* Initial values from GEE output: */ parms beta1=-1.2433 beta2=.5689 beta3=.2951 g11=1; eta=beta1 + beta2*trt + beta3*period + b; p=exp(eta)/(1 + exp(eta)); model y ~ binary(p); random b ~ normal(0,g11) subject=id; run;

SAS output Marginal Logistic Regression Model 1 The GENMOD Procedure Model Information Data Set Distribution Link Function Dependent Variable WORK.ECG Binomial Logit y Number of Observations Read 134 Number of Observations Used 134 Number of Events 42 Number of Trials 134 12

Marginal Logistic Regression Model 2 Class Levels Values The GENMOD Procedure Class Level Information id 67 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 13

Marginal Logistic Regression Model 3 The GENMOD Procedure Response Profile Ordered Total Value y Frequency 1 1 42 2 0 92 PROC GENMOD is modeling the probability that y= 1. 14

Marginal Logistic Regression Model 4 The GENMOD Procedure Parameter Information Parameter Prm1 Prm2 Prm3 Effect Intercept trt period Criteria For Assessing Goodness Of Fit Criterion DF Value Value/DF Deviance 131 163.8863 1.2510 Scaled Deviance 131 163.8863 1.2510 Pearson Chi-Square 131 133.5123 1.0192 15

Marginal Logistic Regression Model 5 The GENMOD Procedure Criteria For Assessing Goodness Of Fit Criterion DF Value Value/DF Scaled Pearson X2 131 133.5123 1.0192 Log Likelihood -81.9432 Algorithm converged. 16

Marginal Logistic Regression Model 6 The GENMOD Procedure Analysis Of Initial Parameter Estimates Standard Wald 95% Chi- Parameter DF Estimate Error Confidence Limits Square Pr > ChiSq Intercept 1-1.2186 0.3441-1.8931-0.5440 12.54 0.0004 trt 1 0.5582 0.3784-0.1835 1.2998 2.18 0.1402 period 1 0.2743 0.3768-0.4642 1.0129 0.53 0.4666 Scale 0 1.0000 0.0000 1.0000 1.0000 NOTE: The scale parameter was held fixed. 17

Marginal Logistic Regression Model 7 The GENMOD Procedure GEE Model Information Log Odds Ratio Structure Fully Parameterized Clusters Subject Effect id (67 levels) Number of Clusters 67 Correlation Matrix Dimension 2 Maximum Cluster Size 2 Minimum Cluster Size 2 18

Marginal Logistic Regression Model 8 The GENMOD Procedure Log Odds Ratio Parameter Information Parameter Group Alpha1 (1, 2) Algorithm converged. 19

Marginal Logistic Regression Model 9 The GENMOD Procedure Analysis Of GEE Parameter Estimates Empirical Standard Error Estimates Standard 95% Confidence Parameter Estimate Error Limits Z Pr > Z Intercept -1.2433 0.2999-1.8311-0.6556-4.15 <.0001 trt 0.5689 0.2335 0.1112 1.0266 2.44 0.0148 period 0.2951 0.2319-0.1593 0.7496 1.27 0.2030 Alpha1 3.5617 0.8148 1.9647 5.1587 4.37 <.0001 20

Mixed Effects Logistic Regression Model (Random Intercept) 10 The NLMIXED Procedure Specifications Data Set Dependent Variable Distribution for Dependent Variable Random Effects Distribution for Random Effects Subject Variable Optimization Technique Integration Method WORK.ECG y Binary b Normal id Dual Quasi-Newton Adaptive Gaussian Quadrature 21

Mixed Effects Logistic Regression Model (Random Intercept) 11 The NLMIXED Procedure Dimensions Observations Used 134 Observations Not Used 0 Total Observations 134 Subjects 67 Max Obs Per Subject 2 Parameters 4 Quadrature Points 100 22

Mixed Effects Logistic Regression Model (Random Intercept) 12 The NLMIXED Procedure Parameters beta1 beta2 beta3 g11 NegLogLike -1.2433 0.5689 0.2951 1 76.6705695 Iteration History Iter Calls NegLogLike Diff MaxGrad Slope 1 3 75.877218 0.793351 3.143617-31.3385 2 5 71.2225189 4.654699 0.769971-36.6519 3 7 70.9616922 0.260827 0.959227-1.3592 4 9 69.3279449 1.633747 0.692069-1.90957 23

Mixed Effects Logistic Regression Model (Random Intercept) 13 The NLMIXED Procedure Iteration History Iter Calls NegLogLike Diff MaxGrad Slope 5 10 69.0008522 0.327093 0.679054-0.73402 6 11 68.5033645 0.497488 0.630885-0.57466 7 12 68.2882471 0.215117 0.611136-0.17656 8 15 68.1688967 0.11935 0.296866-0.0794 9 17 68.1357412 0.033156 0.085458-0.01492 10 19 68.1309946 0.004747 0.021189-0.00549 11 21 68.1302223 0.000772 0.013186-0.00148 12 23 68.1301696 0.000053 0.000147-0.0001 13 25 68.1301694 2.006E-7 0.000046-3.62E-7 24

Mixed Effects Logistic Regression Model (Random Intercept) 14 The NLMIXED Procedure NOTE: GCONV convergence criterion satisfied. Fit Statistics -2 Log Likelihood 136.3 AIC (smaller is better) 144.3 AICC (smaller is better) 144.6 BIC (smaller is better) 153.1 25

Mixed Effects Logistic Regression Model (Random Intercept) 15 The NLMIXED Procedure Parameter Estimates Standard Parameter Estimate Error DF t Value Pr > t Alpha Lower beta1-4.0816 1.6711 66-2.44 0.0173 0.05-7.4180 beta2 1.8631 0.9269 66 2.01 0.0485 0.05 0.01241 beta3 1.0375 0.8189 66 1.27 0.2096 0.05-0.5974 g11 24.4355 18.8489 66 1.30 0.1994 0.05-13.1974 26

Mixed Effects Logistic Regression Model (Random Intercept) 16 The NLMIXED Procedure Parameter Estimates Parameter Upper Gradient beta1-0.7452-8.31e-6 beta2 3.7137 0.000028 beta3 2.6725-0.00005 g11 62.0685-1.34E-7 27

Notes The treatment effect is significant in both analyses, but much larger in the mixed effects model than in the marginal model. The proc nlmixed fit has a very large ĝ 1,1 of 24.4, but with a large standard error and a non-significant t-statistic. But the proc nlmixed output gives AIC = 144.3 and the independence part of the proc genmod output gives Log Likelihood = -81.9432, whence AIC = ( 2) ( 81.9432) + 2 3 = 169.9, showing that including the variance parameter leads to a much better fit. 28