Introduction to SAS proc mixed

Faculty of Health Sciences
Analysis of repeated measurements, 2017
Julie Forman, Department of Biostatistics, University of Copenhagen

Outline

- Data in wide and long format
- Descriptive statistics
- Analysis of response profiles (FLW section 5.8)
- Reading the output from proc mixed
- Baseline adjustment

Preparing data for analysis

Most often raw data is stored in the wide format (e.g. in Excel):

- one row per subject
- several columns with the outcomes for different occasions

Example:

    id  sex  age  group  aix0  aix1  aix2
     1    1   57      0  10.5  17.5  25.0
     2    1   48      0  -2.5   8.0   8.5
     3    2   54      1  18.0  24.0  23.5
    ...

To fit a linear mixed model with any statistical software, data must be in the so-called long format.

The long format

- Each row contains only one observation of the outcome.
- A time-variable identifies the time of measurement.
- An id-variable identifies measurements from the same subject.

    Obs  id  sex  age  group  week   aix
      1   1    1   57      0     0  10.5
      2   1    1   57      0    12  17.5
      3   1    1   57      0    24  25.0
      4   2    1   48      0     0  -2.5
      5   2    1   48      0    12   8.0
      6   2    1   48      0    24   8.5
      7   3    2   54      1     0  18.0
      8   3    2   54      1    12  24.0
      9   3    2   54      1    24  23.5
     10   4    2   46      1     0  26.0
    ...

From wide to long format

Data is transformed from the wide to the long format with:

    DATA ckd (DROP = aix1-aix2);
      SET ckd_wide;
      week = 0;  aix = aix0; OUTPUT;
      week = 12; aix = aix1; OUTPUT;
      week = 24; aix = aix2; OUTPUT;
    RUN;

Note: We keep the baseline variable aix0 for the ANCOVA.
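The same reshaping can also be written with arrays, which scales better when there are many measurement occasions. The following is only a sketch: the three data rows are the example subjects shown on the earlier slide, and the array names `outcomes` and `weeks` and the loop counter `i` are illustrative choices, not part of the lecture code.

```sas
/* Small wide-format example: the three subjects shown earlier */
DATA ckd_wide;
  INPUT id sex age group aix0 aix1 aix2;
  DATALINES;
1 1 57 0 10.5 17.5 25.0
2 1 48 0 -2.5 8.0 8.5
3 2 54 1 18.0 24.0 23.5
;
RUN;

/* Wide to long with arrays; aix0 is kept as the baseline variable */
DATA ckd (DROP = aix1-aix2 i);
  SET ckd_wide;
  ARRAY outcomes{3} aix0 aix1 aix2;           /* outcome columns       */
  ARRAY weeks{3} _TEMPORARY_ (0 12 24);       /* matching time points  */
  DO i = 1 TO 3;
    week = weeks{i};
    aix  = outcomes{i};
    OUTPUT;                                   /* one row per occasion  */
  END;
RUN;

PROC PRINT DATA=ckd; RUN;
```

With more occasions, only the two ARRAY statements and the loop bound need to change.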


Spaghetti plots

The spaghetti plots from the lecture were made with:

    PROC SGPANEL DATA=ckd;
      PANELBY group;
      SERIES x = week y = aix / GROUP=id;
    RUN;

Note: Applies to data in the long format.

Summary statistics and pairwise scatterplots

    PROC SORT DATA=ckd_wide;
      BY group;
    RUN;

    ODS GRAPHICS ON;
    PROC CORR DATA=ckd_wide PLOTS=MATRIX(HISTOGRAM) NOPROB;
      BY group;
      VAR aix0-aix2;
    RUN;

Note: Applies to data in the wide format.

Plotting averages over time

The plot of group-time averages was made with:

    PROC MEANS DATA=ckd NWAY;
      CLASS group week;
      VAR aix;
      OUTPUT OUT=ckdmeans MEAN=average;
    RUN;

    PROC SGPLOT DATA=ckdmeans;
      SERIES x = week y = average / GROUP=group MARKERS;
    RUN;

Note: Applies to data in the long format.


Syntax: Analysis of response profiles

    PROC MIXED DATA=ckd PLOTS=all;
      CLASS id week (ref='0') group (ref='0');
      MODEL aix = week group group*week / SOLUTION CL DDFM=KR OUTPM=ckdfit;
      REPEATED week / SUBJECT=id TYPE=UN R RCORR;
    RUN;

- Syntax is similar to PROC GLM, with a MODEL statement specifying the (linear) relation between outcome and covariates.
- Categorical variables must be declared with CLASS.
- The model for the covariance (UN = unstructured) is specified in a separate REPEATED statement.
- Fitted values and residuals are saved in a dataset ckdfit.
- Use the PLOTS option to get some residual plots.

The option DDFM=KENWARDROGER (aka KR) (or DDFM=SATTERTHWAITE)

- A technical option intended to improve the statistical performance of the t-tests and F-tests.
- It has no effect on balanced data.
- In unbalanced situations (i.e. for almost all observational studies and in case of missing observations), degrees of freedom are computed by more complicated formulae.
- The computations may require a little more time, but in most cases this will not be noticeable.

When in doubt, use it!

Estimated response profiles

Use the output data (ckdfit) to plot the estimated profiles:

    PROC SORT DATA=ckdfit;
      BY group week id;
    RUN;

    PROC SGPLOT DATA=ckdfit;
      SERIES x = week y = pred / GROUP=group MARKERS;
    RUN;

Alternative model specifications

The same model can be phrased differently to highlight differences between groups at specific time points or changes over time.

To compare change over time between groups: include both main effects and the interaction term.

    MODEL aix = week group week*group / SOLUTION CL;

To get mean differences between groups at each time point: omit the main effect of group and the intercept.

    MODEL aix = week week*group / NOINT SOLUTION CL;

To get the means for all combinations of group and time: include only the interaction term and omit the intercept.

    MODEL aix = week*group / NOINT SOLUTION CL;

Note: Usually combined with LSMEANS (on the next slide).

LSMEANS

Estimates the means for all time and group combinations, and all possible differences between them (DIFF option).

    PROC MIXED DATA=ckd;
      CLASS id week group;
      MODEL aix = group*week / NOINT DDFM=KR;
      LSMEANS group*week / DIFF SLICE=week CL;
      REPEATED week / SUBJECT=id TYPE=UN R RCORR;
    RUN;

- NOINT means that the model does not include an intercept (so there is no need to specify reference groups).
- Use SLICE=week to test for overall differences between multiple groups at each time separately (like one-way ANOVA).


Output (analysis of response profiles)

First we get a summary of what data and methods proc mixed has used (some we have specified, others are SAS defaults).

    The Mixed Procedure

    Model Information
    Data Set                     WORK.CKD
    Dependent Variable           aix
    Covariance Structure         Unstructured
    Subject Effect               id
    Estimation Method            REML
    Residual Variance Method     None
    Fixed Effects SE Method      Kenward-Roger
    Degrees of Freedom Method    Kenward-Roger

    Class Level Information
    Class    Levels    Values
    id           51    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
                       21 22 23 24 25 26 28 29 30 31 32 33 34 35 36 37 38
                       39 40 41 42 43 45 46 47 48 49 51 52 53 54
    week          3    12 24 0
    group         2    1 0

    Dimensions
    Covariance Parameters      6
    Columns in X              12
    Columns in Z               0
    Subjects                  51
    Max Obs Per Subject        3

This is a summary of the mathematical model specification, which is explained in lecture 4.

    Number of Observations
    Number of Observations Read        153
    Number of Observations Used        144
    Number of Observations Not Used      9

Note: Missing data are due to drop out and failed measurements.
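As a rough check, the numbers in the Dimensions and Number of Observations tables can be reproduced by hand for this model (3 time points, 2 groups, 51 subjects, unstructured covariance). The count of columns in X assumes the usual SAS parameterization, in which every level of a CLASS effect gets a column, including the reference levels whose estimates are set to zero.

```latex
\begin{align*}
\text{Covariance parameters (unstructured, } 3 \times 3\text{)}
  &= \frac{3\,(3+1)}{2} = 6 \\
\text{Columns in } X
  &= \underbrace{1}_{\text{intercept}}
   + \underbrace{3}_{\text{week}}
   + \underbrace{2}_{\text{group}}
   + \underbrace{3 \cdot 2}_{\text{week*group}} = 12 \\
\text{Observations read}
  &= 51 \times 3 = 153, \qquad 153 - 9 = 144 \text{ used}
\end{align*}
```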

In contrast to ordinary linear models, no explicit formulae for the maximum likelihood estimates exist for linear mixed models in general. Therefore SAS uses numerical optimisation to compute estimates of the mean and covariance parameters.

    Iteration History
    Iteration    Evaluations    -2 Res Log Like     Criterion
            0              1      1070.85454941
            1              2       982.86560047    0.00144735
            2              1       982.26253864    0.00009905
            3              1       982.22468047    0.00000061
            4              1       982.22445749    0.00000000

    Convergence criteria met.

Always check that the numerical optimisation has converged!

Options R and RCORR make SAS print the estimated covariance and correlation matrices.

    Estimated R Matrix for id 1
    Row       Col1       Col2       Col3
      1     106.23    96.3802    80.1893
      2    96.3802     159.64     106.48
      3    80.1893     106.48     106.38

    Estimated R Correlation Matrix for id 1
    Row       Col1       Col2       Col3
      1     1.0000     0.7401     0.7544
      2     0.7401     1.0000     0.8171
      3     0.7544     0.8171     1.0000
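The correlation matrix is simply the covariance matrix rescaled by the standard deviations. As an illustration, two of the printed correlations can be recovered from the R matrix (the symbols $\sigma_{jk}$ and $\rho_{jk}$ are notation introduced here for the entries of the two matrices):

```latex
\begin{align*}
\rho_{jk} &= \frac{\sigma_{jk}}{\sqrt{\sigma_{jj}\,\sigma_{kk}}} \\[4pt]
\rho_{12} &= \frac{96.3802}{\sqrt{106.23 \times 159.64}} \approx 0.7401 \\[4pt]
\rho_{23} &= \frac{106.48}{\sqrt{159.64 \times 106.38}} \approx 0.8171
\end{align*}
```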

Fit statistics can be used for comparison of different models.

    Fit Statistics
    -2 Res Log Likelihood         982.2
    AIC (smaller is better)       994.2
    AICC (smaller is better)      994.9
    BIC (smaller is better)      1005.8

    Null Model Likelihood Ratio Test
    DF    Chi-Square    Pr > ChiSq
     5         88.63        <.0001

The test of "all means are the same" is hardly ever of interest.

Make sure to use the METHOD=ML option in the PROC MIXED statement if you want to use the likelihood to test nested models for the mean structure (lecture 2).
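A likelihood ratio test of nested mean structures could be set up along the following lines. This is a sketch only: it assumes the long-format dataset ckd from the earlier slides, the dataset names fit_full, fit_reduced and lrt are made up for illustration, and the reduced model (a common mean profile for both groups) is just one possible choice of nested model. Both models are fitted with METHOD=ML and the same covariance structure.

```sas
/* Full model: separate mean profile per group (6 mean parameters) */
PROC MIXED DATA=ckd METHOD=ML;
  CLASS id week group;
  MODEL aix = week group group*week;
  REPEATED week / SUBJECT=id TYPE=UN;
  ODS OUTPUT FitStatistics=fit_full;
RUN;

/* Reduced model: one common mean profile (3 mean parameters) */
PROC MIXED DATA=ckd METHOD=ML;
  CLASS id week group;
  MODEL aix = week;
  REPEATED week / SUBJECT=id TYPE=UN;
  ODS OUTPUT FitStatistics=fit_reduced;
RUN;

/* Likelihood ratio test: difference in -2 log likelihood */
DATA lrt;
  MERGE fit_full    (WHERE=(Descr = '-2 Log Likelihood')
                     RENAME=(Value=m2ll_full))
        fit_reduced (WHERE=(Descr = '-2 Log Likelihood')
                     RENAME=(Value=m2ll_reduced));
  chisq   = m2ll_reduced - m2ll_full;
  df      = 3;                          /* 6 - 3 mean parameters */
  p_value = 1 - PROBCHI(chisq, df);
RUN;

PROC PRINT DATA=lrt; RUN;
```

The degrees of freedom equal the difference in the number of mean parameters between the two models, so df must be adjusted if other nested models are compared.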

Finally, the most interesting part: estimates and tests.

    Solution for Fixed Effects
    Effect        week   group   Estimate   Std Error     DF   t Value   Pr > |t|
    Intercept                     24.3431      2.0793   49.4     11.71     <.0001
    week          12               1.0887      1.7694   46.2      0.62     0.5414
    week          24               3.0895      1.4995   44.5      2.06     0.0452
    week          0                     0           .      .         .          .
    group                1        -2.0547      2.8999   48.9     -0.71     0.4820
    group                0              0           .      .         .          .
    week*group    12     1        -1.9493      2.4871   45.8     -0.78     0.4372
    week*group    12     0              0           .      .         .          .
    week*group    24     1        -3.6078      2.1298   45.3     -1.69     0.0971
    week*group    24     0              0           .      .         .          .
    week*group    0      1              0           .      .         .          .
    week*group    0      0              0           .      .         .          .

(confidence intervals omitted due to lack of space)

    Type 3 Tests of Fixed Effects
    Effect        Num DF   Den DF   F Value   Pr > F
    week               2     44.5      0.99   0.3794
    group              1       47      1.84   0.1817
    week*group         2     44.5      1.43   0.2490

Standardized (aka Studentized) residuals: Normal distribution?

(Other residuals and boxplots of residuals vs time and group omitted)


Which model should I choose?

Results from the constrained linear mixed model (clmm) and the ANCOVA model are usually very similar.

We recommend the clmm:

- Programming and interpretation is easier.
- It is slightly better at handling missing data.

Exception: If randomization was performed conditionally on baseline measurements, then the ANCOVA is a valid model while the clmm is not.

The constrained linear mixed model (clmm)

To fit the constrained model:

1. Define a new treatment variable by joining groups at baseline.
2. Leave out the main term treat in the model statement.

    DATA ckd;
      SET ckd;
      treat = group;
      IF week = 0 THEN treat = 0;
    RUN;

    PROC MIXED DATA=ckd;
      CLASS id week (ref='0') treat (ref='0');
      MODEL aix = week treat*week / SOLUTION CL DDFM=KR;
      REPEATED week / SUBJECT=id TYPE=UN R RCORR;
    RUN;

ANCOVA

To prepare for the analysis:

- Baseline must be included as a covariate in the data.
- Only follow-up times are used when running the analysis.
- For ease of interpretation and numerical stability we center the baseline variable around its mean.
- For ease of quantification we use change-since-baseline as outcome.

    DATA followup;
      SET ckd;
      IF week > 0;
      baseline = aix0 - xxxx;    /* xxxx: the mean of the baseline variable */
      aixchange = aix - aix0;
    RUN;
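Rather than typing the baseline mean by hand, it could be computed and stored in a macro variable. The sketch below is one possible way to do this, not the lecture's own code: the small ckd dataset holds only the three example subjects shown on the earlier slides, the macro variable name aix0_mean is hypothetical, and taking the mean over all subjects (one baseline row each) is an assumption about what the centering value should be.

```sas
/* Small long-format example: the three subjects shown earlier */
DATA ckd;
  INPUT id group week aix0 aix;
  DATALINES;
1 0 0 10.5 10.5
1 0 12 10.5 17.5
1 0 24 10.5 25.0
2 0 0 -2.5 -2.5
2 0 12 -2.5 8.0
2 0 24 -2.5 8.5
3 1 0 18.0 18.0
3 1 12 18.0 24.0
3 1 24 18.0 23.5
;
RUN;

/* Mean baseline, using one row per subject (week 0) */
PROC SQL NOPRINT;
  SELECT MEAN(aix0) INTO :aix0_mean TRIMMED
  FROM ckd
  WHERE week = 0;
QUIT;

/* Follow-up data with centered baseline and change score */
DATA followup;
  SET ckd;
  IF week > 0;
  baseline  = aix0 - &aix0_mean;
  aixchange = aix - aix0;
RUN;

PROC PRINT DATA=followup; RUN;
```

Restricting the query to week 0 means each subject contributes exactly once to the mean, regardless of how many follow-up rows they have.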

To run the analysis with proc mixed:

- Include the baseline*time interaction in the model.
- Since the analysis is based on follow-up data, the most natural reference point for time is now the last follow-up.
- The treatment effect (at last follow-up) is estimated by the group effect.

    PROC MIXED DATA=followup;
      CLASS id week (ref='24') group (ref='0');
      MODEL aixchange = group week group*week baseline*week / SOLUTION CL DDFM=KR;
      REPEATED week / SUBJECT=id TYPE=UN R RCORR;
    RUN;