Biostatistics 301A. Repeated measurement analysis (mixed models)

Similar documents
Step 2: Select Analyze, Mixed Models, and Linear.

Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED. Maribeth Johnson Medical College of Georgia Augusta, GA

Analysis of Longitudinal Data: Comparison between PROC GLM and PROC MIXED.

Mixed Effects Models

Introduction to Within-Person Analysis and RM ANOVA

POWER ANALYSIS TO DETERMINE THE IMPORTANCE OF COVARIANCE STRUCTURE CHOICE IN MIXED MODEL REPEATED MEASURES ANOVA

Repeated-Measures ANOVA in SPSS Correct data formatting for a repeated-measures ANOVA in SPSS involves having a single line of data for each

Statistical Practice. Selecting the Best Linear Mixed Model Under REML. Matthew J. GURKA

Daniel J. Bauer & Patrick J. Curran

Model Estimation Example

MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010

Prepared by: Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti

Describing Within-Person Fluctuation over Time using Alternative Covariance Structures

Dynamic Determination of Mixed Model Covariance Structures. in Double-blind Clinical Trials. Matthew Davis - Omnicare Clinical Research

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

ANOVA Longitudinal Models for the Practice Effects Data: via GLM

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

Repeated Measures Data

Chapter 9. Multivariate and Within-cases Analysis. 9.1 Multivariate Analysis of Variance

One-Way Repeated Measures Contrasts

RANDOM and REPEATED statements - How to Use Them to Model the Covariance Structure in Proc Mixed. Charlie Liu, Dachuang Cao, Peiqi Chen, Tony Zagar

More Accurately Analyze Complex Relationships

Graphical Procedures, SAS' PROC MIXED, and Tests of Repeated Measures Effects. H.J. Keselman University of Manitoba

1 DV is normally distributed in the population for each level of the within-subjects factor 2 The population variances of the difference scores

Statistics 262: Intermediate Biostatistics Model selection

An Introduction to Mplus and Path Analysis

Review of CLDP 944: Multilevel Models for Longitudinal Data

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective

A Comparison of Two Approaches For Selecting Covariance Structures in The Analysis of Repeated Measurements. H.J. Keselman University of Manitoba

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

Three Factor Completely Randomized Design with One Continuous Factor: Using SPSS GLM UNIVARIATE R. C. Gardner Department of Psychology

Appendix A Summary of Tasks. Appendix Table of Contents

MLMED. User Guide. Nicholas J. Rockwood The Ohio State University Beta Version May, 2017

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011

Time-Invariant Predictors in Longitudinal Models

An Introduction to Path Analysis

Confidence Intervals for One-Way Repeated Measures Contrasts

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

ANOVA approaches to Repeated Measures. repeated measures MANOVA (chapter 3)

Time-Invariant Predictors in Longitudinal Models

Time-Invariant Predictors in Longitudinal Models

Multiple Imputation for Missing Data in Repeated Measurements Using MCMC and Copulas

1 A Review of Correlation and Regression

Review of Multilevel Models for Longitudinal Data

MULTIVARIATE ANALYSIS OF VARIANCE

A measure for the reliability of a rating scale based on longitudinal clinical trial data Link Peer-reviewed author version

Chapter 4 Multi-factor Treatment Designs with Multiple Error Terms 93

WU Weiterbildung. Linear Mixed Models

Repeated Measures ANOVA Multivariate ANOVA and Their Relationship to Linear Mixed Models

Review of Unconditional Multilevel Models for Longitudinal Data

Introduction to SAS proc mixed

Tutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis. Acknowledgements:

Describing Change over Time: Adding Linear Trends

Repeated Measures Modeling With PROC MIXED E. Barry Moser, Louisiana State University, Baton Rouge, LA

Modeling the Covariance

Chapter 14: Repeated-measures designs

Mixed- Model Analysis of Variance. Sohad Murrar & Markus Brauer. University of Wisconsin- Madison. Target Word Count: Actual Word Count: 2755

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study

Modelling the Covariance

Impact of serial correlation structures on random effect misspecification with the linear mixed model.

Analyzing the Behavior of Rats by Repeated Measurements

CS Homework 3. October 15, 2009

TIME SERIES DATA ANALYSIS USING EVIEWS

Time Invariant Predictors in Longitudinal Models

Changes Report 2: Examples from the Australian Longitudinal Study on Women s Health for Analysing Longitudinal Data

Using SPSS for One Way Analysis of Variance

Mixed Models No Repeated Measures

Regression I: Mean Squared Error and Measuring Quality of Fit

Models for longitudinal data

Longitudinal Invariance CFA (using MLR) Example in Mplus v. 7.4 (N = 151; 6 items over 3 occasions)

Introduction to SAS proc mixed

M M Cross-Over Designs

TIME SERIES ANALYSIS AND FORECASTING USING THE STATISTICAL MODEL ARIMA

Generalized Models: Part 1

Answer to exercise: Blood pressure lowering drugs

WITHIN-PARTICIPANT EXPERIMENTAL DESIGNS

FORECASTING SUGARCANE PRODUCTION IN INDIA WITH ARIMA MODEL

SAS Syntax and Output for Data Manipulation:

Experimental Design and Data Analysis for Biologists

Richard D Riley was supported by funding from a multivariate meta-analysis grant from

Covariance Structure Approach to Within-Cases

Tutorial 4: Power and Sample Size for the Two-sample t-test with Unequal Variances

Time-Invariant Predictors in Longitudinal Models

Bootstrap Simulation Procedure Applied to the Selection of the Multiple Linear Regressions

STATISTICS 174: APPLIED STATISTICS FINAL EXAM DECEMBER 10, 2002

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data

STAT 5200 Handout #23. Repeated Measures Example (Ch. 16)

16.400/453J Human Factors Engineering. Design of Experiments II

Texas Transportation Institute The Texas A&M University System College Station, Texas

FACTOR ANALYSIS AND MULTIDIMENSIONAL SCALING

Analysis of Repeated Measures Data of Iraqi Awassi Lambs Using Mixed Model

BAYESIAN MODEL FOR SPATIAL DEPENDANCE AND PREDICTION OF TUBERCULOSIS

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables

Alfredo A. Romero * College of William and Mary

Repeated Measures Design. Advertising Sales Example

Dose-response modeling with bivariate binary data under model uncertainty

ANCOVA. Psy 420 Andrew Ainsworth

Measures of Fit from AR(p)

Transcription:

B a s i c S t a t i s t i c s F o r D o c t o r s Singapore Med J 2004 Vol 45(10) : 456 CME Article Biostatistics 301A. Repeated measurement analysis (mixed models) Y H Chan Faculty of Medicine National University of Singapore Block MD11 Clinical Research Centre #02-02 10 Medical Drive Singapore 117597 Y H Chan, PhD Head Biostatistics Unit Correspondence to: Dr Y H Chan Tel: (65) 6874 3698 Fax: (65) 6778 5743 Email: medcyh@ nus.edu.sg In our last article, I discussed the use of the general linear model (GLM) (1) to analyse repeated measurement data and mentioned two major disadvantages: 1. Lost of subjects due to missing data in any of the time points (Table I). 2. The limitation of the availability of variancecovariance structure (only have two choices). Table I. Subjects 2 and 3 are lost to analysis. Subject Time 1 Time 2 Time 3 1 xxxx Xxxx xxxx 2 xxxx missing xxxx 3 xxxx Xxxx missing To overcome the above two disadvantages, the Mixed Model technique can be used. We have to transform the usual longitudinal data form for repeated measurement (Table I) to the relational form (Table II) by using the SPSS Restructure option discussed in the last article (1). Table II. Relational form of Table I. Subject Time Score 1 1 Xxxx 1 2 Xxxx 1 3 Xxxx 2 1 Xxxx 2 2 missing 2 3 Xxxx 3 1 Xxxx 3 2 Xxxx 3 3 missing In this case, only two data points are lost, and the other information for subjects 2 and 3 are still included in the analysis. Table III. Relational form of anxiety data set. Subject Anxiety Trial Score 1 Low 1 18 1 Low 2 14 1 Low 3 12 1 Low 4 6 2 Low 1 19 2 Low 2 12 2 Low 3 8 2 Low 4 4 Etc Table III shows the relational data form for the first two of the 12 subjects from our last article s anxiety example (1). VARIANCE-COVARIANCE/CORRELATION STRUCTURES For the GLM Univariate approach, the assumption for the within-subject variance-covariance is a Type H structure (or circular in form correlation between any two levels of within-subject factor has the same constant value). The Compound symmetry (CS)/ Exchangeable structure would be appropriate. Table IV shows the structure for a 4 time-point study. Table IV. Compound symmetry/exchangeable structure. Variance-covariance Correlation This structure is overly simplistic: the variance at all time points are the same and the correlation between any two measurements is the same i.e. only need to estimate two parameters (σ 2 & ρ).

Singapore Med J 2004 Vol 45(10) : 457 For the GLM Multivariate approach, the assumption that the correlation for each level of within-subject factor is different is modeled by an Unstructured covariance structure, see Table V. Table V. Unstructured correlation structure. This structure is overly complex: the variance at all time points and the correlation between any two measurements are all different i.e. need to estimate 4 variances and 6 covariances = 10 parameters! General form for the number of parameters to be estimated is given by [n + n(n-1)/2], where n = number of repeated trials. Does the variance-covariance/correlation structure of our anxiety data satisfies any of the above 2 structures? Table VI shows the correlation structure of the anxiety data by using the Analyze, Correlate, Bivariate option. We observe that the correlation between two time-points are not really similar (which accounts for the p=0.053 value for the sphericity s test shown in our last article, near rejection of sphericity assumption), thus the compound symmetry assumption may not be appropriate. That leaves us with the unstructured option only - but we need to estimate ten unknown parameters with 12 subjects! There would be concern that with such a small sample size (worse still, if we have missing data!), the variancecovariance structure assumed may not be very appropriate and the results would be based on these could-be unstable estimates. What other choices do we have? None if we use the GLM technique! Using the Mixed Model technique, we have more variance-covariance choices. Taking a closer analysis on Table VI, the correlation between two adjacent time-points (Trial1 and Trial2, for example) is always higher than that of those between two time-points that are further apart (Trial1 and Trial3, for example). In such a situation, an appropriate structure could be the 1 st Order Autoregressive, AR(1), which assumes that the correlation between adjacent timepoints is the same and the correlation decreases by the power of the number of time intervals between the measures (Table VII). Table VI. Correlation structure of anxiety data. Correlations Trial 1 Trial 2 Trial 3 Trial 4 Trial 1 Pearson Correlation 1.488.246.223 Sig. (2-tailed)..107.442.487 Trial 2 Pearson Correlation.488 1.812*.803* Sig. (2-tailed).107..001.002 Trial 3 Pearson Correlation.246.812* 1.785* Sig. (2-tailed).442.001..003 Trial 4 Pearson Correlation.223.803*.785* 1 Sig. (2-tailed).487.002.003. **. Correlation is significant at the 0.01 level (2-tailed).

Singapore Med J 2004 Vol 45(10) : 458 Table VII. 1 st Order Autoregressive, AR(1) structure. In Template I, click continue to get Template II. Template II. Defining the variables. We shall discuss the analysis of the Anxiety data using the Mixed Model technique with the above three structures (Compound symmetry, Unstructured and 1 st Order Autoregressive). To perform the Mixed Model analysis, go to Analyze, Mixed Models, Linear to get Template I. Template I. Specifying subjects and repeated measurements. Put score in the Dependent Variable option and anxiety and trial in the Factor option. Click on the Fixed folder to get Template III. Template III. Defining the Fixed effects. Put the variable subject into the Subject option and trial into the Repeated option. Choose Compound Symmetry for the Repeated Covariance Type option. Table VIII shows all the variancecovariance structures available in SPSS. A brief description for each structure could be obtained from the Help button. Table VIII. Available variance-covariance structures. Ante-dependence: first order AR(1) AR(1): \heterogeneous ARMA(1,1) Compound symmetry Compound symmetry: correlation metric Compound symmetry: heterogeneous Diagonal Factor analytic: first order Factor analytic: first order, heterogeneous Huynh-Feldt Scaled identity Toeplitz Toeplitz: heterogeneous Unstructured Unstructured: correlation metric Highlight both anxiety(f) and trial(f), the Add button becomes visible. Leave the selection as Factorial and click on the Add button to define the Model (anxiety, trial, anxiety*trial). Click on Continue to return to Template II and click OK. Table IXa shows the model defined and the covariance structure used compound symmetry.

Singapore Med J 2004 Vol 45(10) : 459 Table IXa. Model and covariance structure definition. Model Dimension a Number Covariance Number of Subject Number of of Levels Structure Parameters Variables Subjects Fixed Effects Intercept 1 1 anxiety 2 1 trial 4 3 anxiety * trial 8 3 Repeated Effects trial 4 Compound 2 Subject 12 Symmetry Total 19 10 a Dependent Variable: Score. Table IXb. Covariance structure. Estimates of Covariance Parameters a Parameter Estimate Std. Error Repeated CS diagonal offset 2.5694444.6634277 Measures CS covariance 3.6305556 1.9180907 a Dependent variable: Score. Table IXb gives the variance (= 2.57) within each time-point, and the covariance between any two time-points is 3.63. The interest in our model building is not in the variance-covariance structure but in the treatment effects. But it is important to get the appropriate structure to obtain the appropriate standard errors for the inferences of the treatment effects. Question: How do we know which covariance structure is the most appropriate? Table IXc. Model selection measures. Information Criteria a -2 Restricted Log Likelihood 184.546 Akaike s Information Criterion (AIC) 188.546 Hurvich and Tsai s Criterion (AICC) 188.870 Bozdogan s Criterion (CAIC) 193.924 Schwarz s Bayesian Criterion (BIC) 191.924 The information criteria are displayed in smaller-is-better forms. a Dependent Variable: Score Table IXc shows some basic measure for model selection which has to be used in comparison with the measures when other covariance structures are being used. The -2 Restricted Log Likelihood (-2RLL) value is valid for simple models and modifications of this value for more complicated models are given by Akaike s Information Criterion (AIC) and Schwarz s Bayesian Criterion (BIC). The BIC measurement is most severely adjusted and is the recommended measure used for comparison. Hurvich and Tsai s Criterion (AAIC) and Bozdogan s Criterion (CAIC) are the adjustments of AIC for small sample sizes. We want the smaller is better comparisons amongst the covariance structures. Table IXd gives the model selection measurements for the three covariance structures (Note: Unstructured and Unstructured correlation metric, see Table VIII, have the same model selection measurements but because of the small sample size, no estimates were obtained for the within-subject effects, trial and trial*anxiety, when the unstructured covariance structure was used!) The appropriate covariance structure for this anxiety data is AR(1) as it has the smallest BIC among the 3 structures. We can also try the other various covariance structures (Table VIII) to compare their model selection measurements. Since the AR(1) Table IXd. Model selection measures. Information Criteria Compound Symmetry (CS) Unstructured: correlation metric 1 st Order autoregressive, AR(1) -2 RLL 184.546 168.924 176.828 AIC 188.546 188.924 180.828 AICC 188.870 196.510 181.153 CAIC 193.924 215.813 186.206 BIC 191.924 205.813 184.206

Singapore Med J 2004 Vol 45(10) : 460 Table IXe. Results for the between and within subjects effects (p-values). Compound symmetry (CS) Unstructured correlation metric 1 st order autoregressive, AR(1) Anxiety 0.460 0.460 0.465 Trial <0.001 <0.001 <0.001 Trial*anxiety 0.368 0.067 0.150 structure is chosen, then we should only use the between and within subjects results from this model. For discussion purposes, Table IXe shows the results for all three structures. Using the compound symmetry structure, the results obtained are identical to those given by GLM Univariate analysis provided there is no missing data. GLM and Mixed Model will have different results if there were missing data. The between-subject effect (anxiety) of the Mixed Model is identical to GLM but though both models used the unstructured covariance structured, different results are obtained for the Trial*anxiety (p=0.138 for GLM). This is because both techniques used different estimation methods to derive the results will not bore you with the details (those interested could refer to any standard statistical text on mixed model for further reading). From Table IXe, we could see that the p-values are similar in terms of significance (not worrying about the exact values), the issue of using the right covariance structure arises when we have a difference of opinions in terms of significance for the between and within subjects effects for the different models. We have only analyzed the Fixed effects aspects of the anxiety data in the above discussions, which means that the anxiety levels selected represented all levels of this factor or the researcher is only specifically interested in these two levels. In Template II, we have a Random folder which allows us to define the Random effects for the model. Factor effects are random if the levels of the factor that are used in the study represent a random sample of a larger set of potential levels. For the extension of the fixed effects to a mixed effect model (having both fixed and random effects), it would be most appropriate to seek the assistance of a biostatistician! Finally, the above analyses could be performed using other statistical software (SAS, S-plus and STATA) which offers more choices of covariance structures and greater flexibility in the modeling aspects for random effects. Our next article, Biostatistics 302. Principal component and factor analysis, will discuss the approach to summarising and uncovering any patterns in a set of variables (for example, a questionnaire). REFERENCE 1. YH Chan. Biostatistics 301. Repeated measurement analysis. Singapore Med J 2004; 45:354-69.

Singapore Med J 2004 Vol 45(10) : 461 SINGAPORE MEDICAL COUNCIL CATEGORY 3B CME PROGRAMME Multiple Choice Questions (Code SMJ 200410A) True False Question 1. The results from the GLM Univariate procedure of repeated measurement analysis is identical to the Mixed Model procedure when: (a) The covariance structure is compound symmetry with no missing data. (b) The covariance structure is compound symmetry with missing data. (c) Unstructured covariance structure with no missing data. (d) Unstructured covariance structure with missing data. (e) As long as there is no missing data. Question 2. We compare the appropriate covariance structure used for a model by comparing: (a) The p-values of the between-subject effects. (b) The p-values of the within-subjects effects. (c) The model selection measures between different covariance structures. (d) The model selection measures within each covariance structure. (e) All of the above. Question 3. The Mixed Model technique has the following advantages over the GLM: (a) Allows random effects in the model. (b) Gives faster results - shorter computing time. (c) More likely to get a significant p-value. (d) Can select the appropriate variance-covariance structure. (e) Makes use of data from subjects with incomplete data. Question 4. The following statements are true: (a) The Mixed Model procedure allows us to plot the data. (b) The smaller-the-better criterion is used to compare the model selection measures for the different covariance structures. (c) The most severely corrected measurement for the -2RLL is the AIC. (d) The longitudinal data structure could be used for a Mixed Model analysis. (e) The unstructured covariance structure gives the best results. Doctor s particulars: Name in full: MCR number: Specialty: Email address: Submission instructions: A. Using this answer form 1. Photocopy this answer form. 2. Indicate your responses by marking the True or False box 3. Fill in your professional particulars. 4. Either post the answer form to the SMJ at 2 College Road, Singapore 169850 OR fax to SMJ at (65) 6224 7827. B. Electronic submission 1. Log on at the SMJ website: URL http://www.sma.org.sg/cme/smj 2. Either download the answer form and submit to smj.cme@sma.org.sg OR download and print out the answer form for this article and follow steps A. 2-4 (above) OR complete and submit the answer form online. Deadline for submission: (October 2004 SMJ 3B CME programme): 25 November 2004 Results: 1. Answers will be published in the SMJ December 2004 issue. 2. The MCR numbers of successful candidates will be posted online at http://www.sma.org.sg/cme/smj by 20 December 2004. 3. Passing mark is 60%. No mark will be deducted for incorrect answers. 4. The SMJ editorial office will submit the list of successful candidates to the Singapore Medical Council.