A SAS Macro to Find the Best Fitted Model for Repeated Measures Using PROC MIXED

Similar documents
Repeated Measures Data

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */

SAS Syntax and Output for Data Manipulation:

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

Testing Indirect Effects for Lower Level Mediation Models in SAS PROC MIXED

Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED. Maribeth Johnson Medical College of Georgia Augusta, GA

Introduction to SAS proc mixed

Covariance Structure Approach to Within-Cases

Dynamic Determination of Mixed Model Covariance Structures. in Double-blind Clinical Trials. Matthew Davis - Omnicare Clinical Research

Analysis of Longitudinal Data: Comparison between PROC GLM and PROC MIXED.

dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" = -/\<>*"; ODS LISTING;

Power Analysis for One-Way ANOVA

Repeated Measures Design. Advertising Sales Example

ANOVA Longitudinal Models for the Practice Effects Data: via GLM

Answer to exercise: Blood pressure lowering drugs

Odor attraction CRD Page 1

Introduction to SAS proc mixed

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3

MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1

Models for longitudinal data

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3

Describing Within-Person Fluctuation over Time using Alternative Covariance Structures

Comparison of a Population Means

Chapter 11. Analysis of Variance (One-Way)

Topic 25 - One-Way Random Effects Models. Outline. Random Effects vs Fixed Effects. Data for One-way Random Effects Model. One-way Random effects

Chapter 2. Review of Mathematics. 2.1 Exponents

Descriptions of post-hoc tests

STAT 5200 Handout #23. Repeated Measures Example (Ch. 16)

Practice with Interactions among Continuous Predictors in General Linear Models (as estimated using restricted maximum likelihood in SAS MIXED)

Variable-Specific Entropy Contribution

Qinlei Huang, St. Jude Children s Research Hospital, Memphis, TN Liang Zhu, St. Jude Children s Research Hospital, Memphis, TN

POWER ANALYSIS TO DETERMINE THE IMPORTANCE OF COVARIANCE STRUCTURE CHOICE IN MIXED MODEL REPEATED MEASURES ANOVA

Serial Correlation. Edps/Psych/Stat 587. Carolyn J. Anderson. Fall Department of Educational Psychology

Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models:

Example 7b: Generalized Models for Ordinal Longitudinal Data using SAS GLIMMIX, STATA MEOLOGIT, and MPLUS (last proportional odds model only)

Appendix B Microsoft Office Specialist exam objectives maps

Step 2: Select Analyze, Mixed Models, and Linear.

Linear Mixed Models with Repeated Effects

Extensions of Cox Model for Non-Proportional Hazards Purpose

The ARIMA Procedure: The ARIMA Procedure

Chapter 18 The STATESPACE Procedure

Repeated Measures Modeling With PROC MIXED E. Barry Moser, Louisiana State University, Baton Rouge, LA

Introduction to ArcMap

CS Homework 3. October 15, 2009

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

PLS205 Lab 6 February 13, Laboratory Topic 9

Topic 23: Diagnostics and Remedies

International Journal of Current Research in Biosciences and Plant Biology ISSN: Volume 2 Number 5 (May-2015) pp

Outline. Topic 20 - Diagnostics and Remedies. Residuals. Overview. Diagnostics Plots Residual checks Formal Tests. STAT Fall 2013

17. Example SAS Commands for Analysis of a Classic Split-Plot Experiment 17. 1

Metabolite Identification and Characterization by Mining Mass Spectrometry Data with SAS and Python

Analysis of Repeated Measures Data of Iraqi Awassi Lambs Using Mixed Model

Supplemental Materials. In the main text, we recommend graphing physiological values for individual dyad

Lecture 4. Random Effects in Completely Randomized Design

Topic 20: Single Factor Analysis of Variance

STA6938-Logistic Regression Model

MLMED. User Guide. Nicholas J. Rockwood The Ohio State University Beta Version May, 2017

This is a Split-plot Design with a fixed single factor treatment arrangement in the main plot and a 2 by 3 factorial subplot.

EXST7015: Estimating tree weights from other morphometric variables Raw data print

Implementation of Pairwise Fitting Technique for Analyzing Multivariate Longitudinal Data in SAS

Single Factor Experiments

Spatial Linear Geostatistical Models

Biostatistics 301A. Repeated measurement analysis (mixed models)

Random Intercept Models

Seasonal Adjustment using X-13ARIMA-SEATS

Walkthrough for Illustrations. Illustration 1

Chapter 9. Multivariate and Within-cases Analysis. 9.1 Multivariate Analysis of Variance

This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed.

Package EMMIX. November 8, 2013

Package HGLMMM for Hierarchical Generalized Linear Models

Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

A SAS/AF Application For Sample Size And Power Determination

The Simplex Method: An Example

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

Some general observations.

Companion. Jeffrey E. Jones

Chapter 11: Robust & Quantile regression

Introduction to Within-Person Analysis and RM ANOVA

Fitting PK Models with SAS NLMIXED Procedure Halimu Haridona, PPD Inc., Beijing

9 Estimating the Underlying Survival Distribution for a

36-402/608 Homework #10 Solutions 4/1

RANDOM and REPEATED statements - How to Use Them to Model the Covariance Structure in Proc Mixed. Charlie Liu, Dachuang Cao, Peiqi Chen, Tony Zagar

a = 4 levels of treatment A = Poison b = 3 levels of treatment B = Pretreatment n = 4 replicates for each treatment combination

Econometrics of financial markets, -solutions to seminar 1. Problem 1

Topic 18: Model Selection and Diagnostics

Topic 12. The Split-plot Design and its Relatives (continued) Repeated Measures

Rigorous Evaluation R.I.T. Analysis and Reporting. Structure is from A Practical Guide to Usability Testing by J. Dumas, J. Redish

CityGML XFM Application Template Documentation. Bentley Map V8i (SELECTseries 2)

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

Integration of SAS and NONMEM for Automation of Population Pharmacokinetic/Pharmacodynamic Modeling on UNIX systems

Review of CLDP 944: Multilevel Models for Longitudinal Data

Multi-group analyses for measurement invariance parameter estimates and model fit (ML)

Daniel J. Bauer & Patrick J. Curran

Nonlinear Regression. Summary. Sample StatFolio: nonlinear reg.sgp

Swabs, revisited. The families were subdivided into 3 groups according to the factor crowding, which describes the space available for the household.

Mixed Model: Split plot with two whole-plot factors, one split-plot factor, and CRD at the whole-plot level (e.g. fancier split-plot p.

Mapping Participants to the Closest Medical Center

Longitudinal Studies. Cross-Sectional Studies. Example: Y i =(Y it1,y it2,...y i,tni ) T. Repeatedly measure individuals followed over time

Transcription:

Paper AD18 A SAS Macro to Find the Best Fitted Model for Repeated Measures Using PROC MIXED Jingjing Wang, U.S. Army Institute of Surgical Research, Ft. Sam Houston, TX ABSTRACT Specifying an appropriate covariance structure is essential for a valid and powerful repeated measure analysis using PROC MIXED. However, assessing the covariance structure is not a trivial task. To facilitate this task, the SAS macro %RMMS was developed to find the best fitted model accurately and efficiently. The macro produces a table of various valid candidate covariance structures sorted in ascending order by AIC, AICC and BIC, respectively, with the covariance having the smallest Information Criteria (IC) specified, and a plot of estimated covariance versus duration between time points to examine the covariance structure. In addition, an optional likelihood ratio test between two user-selected nested covariance structures is provided. Moreover, once the final decision is made about the covariance structure, output of the final analysis of the best fitted model is automatically exported to a user-specified rich text format (RTF) file. KEYWORDS: MIXED, covariance structure, repeated measures, likelihood ratio test INTRODUCTION Mixed model methodology in a two-stage approach using the MIXED procedure has been widely used as the modern approach for repeated measures. The first stage attempts to estimate the covariance structure and the second stage substitutes the covariance estimates into the mixed model and uses generalized least square methodology to assess the effects. This approach is appealing because 1) it directly addresses the covariance structure and attempts to accommodate it in the model; 2) it can accommodate data that are missing at random. Specifying an appropriate covariance structure is essential to draw a valid and powerful conclusion. Ignoring important correlations by using a model that is too simple will increase the risk of the type I error of the fixed effects test. On the other hand, using a model which is too complex will reduce the power and efficiency for the fixed effects tests. However, assessing and estimating the covariance structure requires a series of steps. It involves using IC such as AICC and BIC to compare the fit of various covariance structures by the rule smaller is better, and/or using likelihood ratio tests to test whether a more general model is significantly better than its special case (i.e., nested), and/or using graphics to examine the patterns of covariates over levels of repeated variables (e.g., time). Therefore, it is desirable to have access to a macro that automatically generates summaries by applying the above statistical tools so that we can find the most appropriate covariance structure quickly. It is also desirable that the macro can export the final analysis output to a specified RTF file automatically. %RMMS was developed to meet the above desirability, and to make the analysis of repeated measures less burdensome. FEATURES OF THE MACRO The macro automatically generates the following information: 1. A table of various valid candidate covariance structures sorted in ascending order by AIC, AICC and BIC, respectively, with the covariance having the smallest IC specified. 2. A plot of estimated covariance among time points within a subject versus duration in successive time points to examine the covariance structure. 3. A table of the likelihood ratio test between two nested covariance structures if you specify a more general structure and its special case in CovG = and CovS =, respectively, in the macro. 4. Output of the final analysis of the best fitted model to a specified RTF file once the final covariance structure is specified FinalCov = in the macro. MACRO STRUCTURE The macro syntax is: %RMMS (Dt =, Y =, Xs =, ClassXs=, 1

RepeatedVar =, Sub =, CovStructures =, NumRept =, CovG =, CovS =, FinalCov =, Style =, OutFile =); There are several macro variables used in %RMMS that need to be defined in the call. An explanation of these macro variables is provided below. MACRO VARIABLE DESCRIPTION Dt Input data set in the univariate format Y Dependent variable in the MIXED model Xs Fixed effects in the MIXED model, separated by blank ClassXs Class variables in the CLASS Statement in the MIXED model separated by blank RepeatedVar Repeated variable in the REPEATED statement in the MIXED model, for example, time Sub The variable for SUBJECT= option in the MIXED model CovStructures Covariance structures separated by and ended with. Default are TOEP AR(1) ARH(1) CS CSH HF ARMA(1,1) UN. Before using statistical tools to select a covariance structure, you may rule out covariance structures that clearly make no sense based on knowledge of the subject matter, information from the data collection process and previous studies. NumRept Total number of levels in repeated variable CovG* A more general covariance structure compared to CovS as below followed by, default is CovS* A special (nested) case of CovG followed by, default is FinalCov Final selected covariance structure followed by, default is OutFile File directory for the output of final model in RTF, for example, D:\stat\y.rtf * CovG and CovS are for an optional likelihood ratio test between two user-selected nested covariance structures, they must be listed in the macro variable CovStructures in the call. DETAIL OF PROGRAM The code for the macro is available in the Appendix. The following is a description of the logic used in the macro. Step 1. Create a series of data sets named Fitdt&&p (p=1,2,3,, p<= # of candidate structures), FitRF0&&p containing fit statistics and ParmRF&&p containing both fit statistics and dimensions from the PROC MIXED model under each valid covariance structure using DO-WHILE loop. Some candidate covariance structures may be invalid for the PROC MIXED because the procedures under them have non-converged iterations. These invalid candidate covariance structures must be excluded from the model selection first. To do this, the macro variable named Convs containing converge status of a mixed model under each candidate structure is created. Step 2.Create data set S by combining Fitdt&&p after the DO-WHILE loop and sorted by names of fit statistics and their values. Step 3. Create data set Covparms containing estimated covariance by PROC MIXED under UN structure and data set named Time containing duration between time points. Then generate data set named Covplot by merging these two data sets. Step 4. Create data set named User by merging ParmRF&&p and itrf0&&p and calculate likelihood ratio test for a user defined more general structure and its special structure. Step 5. Automatically export the final model using the specified most appropriate structure to a specified RTF file. 2

EXAMPLE The usage of the macro is illustrated through an example in this section. Assume data were from a study with one standard drug treatment (1) and one experimental drug treatment (2). The two treatments were randomly assigned to 30 subjects and pulse was measured at equally spaced time points 1, 2, 3 and 4 for each subject. The data set example includes a dependent variable (pulse), two independent variables (drug and time). A partial listing of the data is as follows: id drug time pulse 1 1 1 85 1 1 2 85 1 1 3 88 2 1 1 90 2 1 2 92 2 1 3 93 3 1 1 97 3 1 2 97 3 1 3 94 The following macro will automatically generate fit statistics for all valid candidate covariance structures listed in ascending order (Table 1) and the plot of covariance vs. duration between time points (Figure 1). % RMMS (Dt= example, Y= pulse, Xs=drug time drug*time, ClassXs=drug time, RepeatedVar = time, Sub = id, CovStructures =TOEP AR(1) ARH(1) CS CSH HF ARMA(1,1) UN, NumRept =3) Table 1. Fit Statistics for All Valid Candidate Covariance Structures Listed in Ascending Order Descr Value type Note AIC (smaller is better) 586.6 HF Smallest AIC (smaller is better) 587.8 ARH(1) AIC (smaller is better) 589.7 UN AIC (smaller is better) 594.1 AR(1) AIC (smaller is better) 594.8 CS AIC (smaller is better) 595.7 TOEP AIC (smaller is better) 595.7 ARMA(1,1) AICC (smaller is better) 587.1 HF Smallest AICC (smaller is better) 588.3 ARH(1) AICC (smaller is better) 590.9 UN AICC (smaller is better) 594.3 AR(1) AICC (smaller is better) 595.0 CS AICC (smaller is better) 596.0 TOEP AICC (smaller is better) 596.0 ARMA(1,1) BIC (smaller is better) 592.2 HF Smallest BIC (smaller is better) 593.4 ARH(1) BIC (smaller is better) 596.9 AR(1) BIC (smaller is better) 597.6 CS BIC (smaller is better) 598.1 UN BIC (smaller is better) 599.9 TOEP BIC (smaller is better) 599.9 ARMA(1,1) 3

Figure 1. Estimated Covariance vs. Duration between Time Points The most promising structure is HF since it has the smallest value for AIC, AICC and BIC, respectively (Table1). The plot further confirms the conclusion. Since HF is a special case of UN, we can use a likelihood ratio test to compare the two structures by using the following macro. Similarly, we can test the likelihood ratio between CS and HF. %RMMS (Dt =example, Y = pulse, Xs = drug time drug*time, ClassXs =drug time, RepeatedVar = time, Sub= id, CovR= HF, CovF= UN ) Table 2 and 3 show that HF is the better fitted model compared to UN (p=0.65) and CS (p=.0022). Finally, we can specify FinalCov=HF and outfile=d: \Pulse.rtf in the macro to export the output of the best fitted model to a specified RTF file. Table2 Likelihood Ratio Test for UN and HF H0: The Simpler Structure is Better Type _2LLR NumPara DF Chisq p_value HF 578.613 4 2.. UN 577.736 6 2 0.877 0.6452 Table 3 Likelihood Ratio Test for CS and HF H0: The Simpler Structure is Better Type _2LLR NumPara DF Chisq p_value CS 590.832 2 2.. HF 578.613 4 2 12.218.0022 4

CONCLUSION %RMMS is designed to quickly find the best fitted repeated measure mixed model. If the mixed procedure fails to create a converge status data set, the macro will stop. For example, duplicate time points in the data set could cause this problem. %RMMS is frequently used in the analysis of repeated measures at this research institute, saving enormous time and resources. REFERENCES Tao, Jill, (2004), Mixed Models Analyses Using the SAS System Course Notes, 4-15 to 4-45 Walker, Glen, Common Statistical Methods for Clinical Research With SAS Example, 111-148 CONTACT INFORMATION Your comments and questions are valued and encouraged. Contact the author at: Jingjing Wang 3400 Rawley E. Chambers Avenue Fort Sam Houston, Texas 78234 210.916.0345 Email: JingJing.Wang@amedd.army.mil APPENDIX: MACRO CODE %macro RMMS (Dt =, Y =, Xs =, ClassXs =, RepeatedVar =, Sub =, CovStructures = TOEP AR(1) ARH(1) CS CSH HF ARMA(1,1) UN, NumRept=, CovG=, CovS=, FinalCov=, Style=, OutFile=); %Let FinalCov=%scan(&FinalCov,1, ); %Let CovG=%scan(&CovG,1, ); %Let CovS=%scan(&CovS,1, ); /*I. Covariance structure selection before the final structure specified*/ %if ("&FinalCov" = "" ) %then %do; /*1. Selection using IC */ /*1.1 Assign 1 to p and k and assign 1st candidate structure to CovStru*/ %let p=1; %let k=1; %let CovStru=%scan(&CovStructures,&k, ); /*1.2 DO-WHILE loop for PROC MIXED under each candidate covariance structure*/ %do %while ("&CovStru" NE ""); data temp; a=symget("covstru");/*assign value of CovStru to a*/ call symput("covrname",compress(a,"(),")); 5

/*1.2.1 PROC MIXED procedure*/ proc mixed data=&dt ; class &ClassXs; model &Y=&Xs; repeated &repeatedvar/subject=&sub type=&covstru; ods output convergencestatus=cstatus&&k FitStatistics = Fit&&p FitStatistics = FitRF0&&p(rename=(value=&CovRname)) Dimensions = ParmRF0&&p(rename=(value=Num&&CovRname)); ods exclude ModelInfo Dimensions IterHistory ConvergenceStatus CovParms FitStatistics LRT ClassLevels Tests3; /*1.2.2 Check the converge status of the model under the structure*/ data cstatus&&k; set cstatus&&k; call symput("convs",status); /*1.2.3 To eliminate non-exist data due to non-converged PROC MIXED*/ %if &convs=1 %then %goto RetainP; /*1.2.4 Prepare the data Fitdt&&p contain fit statistics for valid structures*/ data Fitdt&&p; set Fit&&p; length type $16.0; type="&covstru"; FitStatName=trim(%scan(descr,1,"" )); if FitStatName ne "-2"; /*1.3 Increase k by 1, remain p or increase p by 1 and assign the next candidate structure to CovStru*/ %let p=%eval(&p+1); %RetainP: %let p=%eval(&p); %let k=%eval(&k+1); %let CovStru=%scan(&CovStructures,&k, ); /*1.2 Create data set S containing fit statistics for all valid candidate structures and pint out the results */ data S0; set %do i=1 %to &p-1; Fitdt&&i ; proc sort data=s0; by FitStatName value; Data S(drop=FitStatName); set S0; by FitStatName value ; if Descr ne "-2 Res Log Likelihood"; if first.fitstatname then Note='Smallest'; proc print data=s noobs; Title1 "Fit Statistics for All Valid Candidate Covariance Structures Listed in Ascending Order"; 6

/*2. Selection by visualizing the covariance structure*/ proc mixed data=&dt ; class &ClassXs; model &Y=&Xs; repeated &repeatedvar/subject=&sub type=un r rcorr;; ods output covparms=cov rcorr=corr; ods exclude ModelInfo Dimensions IterHistory ConvergenceStatus CovParms FitStatistics LRT ClassLevels Tests3 Rcorr CovParms R; data time; %do time1=1 %to &NumRept; Time=&time1; %do time2=1 %to &time1; duration=&time1-&time2; output; data covplot; merge time cov(rename=(estimate=covariance)); goptions reset=all ftext=swiss ftitle=swissb htitle=1.5 htext=1.5; proc gplot data=covplot; label duration='duration Between Time Points'; plot Covariance*Duration=Time / haxis=axis1; symbol1 v=circle color=black i=none ; %do s=2 %to &NumRept; %let s1=%eval(&s-1); symbol&&s v=circle i=j l=%eval(&s-1); axis1 offset=(2,2) minor=none; title1 'Covariance vs. Duration Between Time Points'; quit; /*3. Compare user defined nested structures using likelihood ratio test;*/ %if "&CovG" ne "" or "&CovS" ne "" %then %do; data _null_; call symput("covgv",compress("&covg","(),")); call symput("covsv",compress("&covs","(),")); data User; merge %do i=1 %to &p-1; FitRF0&&i ParmRF0&&i ; if _n_ = 1 then do; Chisq=(&CovSV-&CovGV); DF=(Num&&CovGV-Num&&CovSV); p_value=1-probchi(chisq,df); output; stop; 7

end; data UserP(rename=(Chisqr=Chisq p_valuer=p_value)); length type $16 _2LLR 4.3 NumPara 4.0 DF 4.0 Chisq 4.3 p_value 4.3; set User; Type = "&CovS"; _2LLR = &CovSV; NumPara = Num&&CovSV; Chisqr =.; p_valuer =.; output; Type="&CovG"; _2LLR = &CovGV; NumPara = Num&&CovGV; Chisqr = Chisq; p_valuer = p_value; output; keep Type _2LLR NumPara DF Chisqr p_valuer ; proc print data= UserP noobs; title1 "Likelihood Ratio Test for &CovS and &CovG"; title2 "H0: The Simpler Structure is Better"; /*II. Selected final model: PROC MIXED should be modified as needed*/ %if "&FinalCov" ne "" %then %do; ods rtf body="&outfile" style=&style startpage=yes; ods noproctitle; proc mixed data=&dt ; class &ClassXs; model &Y=&Xs; repeated /subject=&sub type=&finalcov; title1 "Repeated Measure Mixed Model Using &FinalCov"; ods rtf close; %mend RMMS; 8