A Comparison of Two Approaches For Selecting Covariance Structures in The Analysis of Repeated Measurements. H.J. Keselman University of Manitoba


A Comparison of Two Approaches For Selecting Covariance Structures in The Analysis of Repeated Measurements

by

H.J. Keselman, University of Manitoba
James Algina, University of Florida
Rhonda K. Kowalchuk, University of Manitoba
and
Russell D. Wolfinger, SAS Institute

Abstract

The mixed model approach to the analysis of repeated measurements allows users to model the covariance structure of their data. That is, rather than using a univariate or a multivariate test statistic for analyzing effects, tests that assume a particular form for the covariance structure, the mixed model approach allows the data to determine the appropriate structure. According to advocates of the mixed model approach, using the appropriate covariance structure should result in more powerful tests of the repeated measures effects. SAS's (1996) mixed model program, PROC MIXED, provides users with two information criteria for selecting the "best" covariance structure: Akaike (1974) and Schwarz (1978). Our study compared these log-likelihood-based criteria to see how effective they would be for detecting various population covariance structures. In particular, the criteria were compared in unbalanced (across groups) nonspherical repeated measures designs having equal/unequal group sizes and covariance matrices when data were both normally and nonnormally distributed. The results indicate that neither criterion was effective in finding the correct structure. On average, for the 26 investigated distributions, the Akaike criterion resulted in the correct structure being selected only 47 percent of the time, while the Schwarz criterion resulted in the correct structure being selected just 35 percent of the time. Not surprisingly, PROC MIXED default F-tests based on either of these selection criteria performed poorly according to results reported by the authors elsewhere.

Key Words: Covariance Structures, Akaike Criterion, Schwarz Criterion, Repeated Measurements

1. Introduction

The traditional analysis of variance F-tests for repeated measures designs containing between-subjects variables require that the data conform to the multisample sphericity assumption in order to be valid (see Huynh & Feldt, 1970; Keselman & Algina, 1996; Keselman & Keselman, 1993; Rogan, Keselman & Mendoza, 1979; Rouanet & Lepine, 1970). That is, for a J (between-subjects) × K (within-subjects) design, $F_K = MS_K / MS_{K \times S/J}$ and $F_{JK} = MS_{JK} / MS_{K \times S/J}$ will only be distributed as F variables when, in addition to multivariate normality and independence assumptions, the variances for all possible differences between the levels of the repeated measures variable are equal (i.e., sphericity) and this constant variance holds for each level of the between-subjects grouping variable (i.e., multisample sphericity; see Mendoza, 1980). Equivalently, multisample sphericity can be stated as

$$C^{T} \Sigma_j C = \lambda I_{(K-1)} \quad (j = 1, \ldots, J),$$

where $C$ is an orthonormalized contrast matrix representing a comparison among the levels of the repeated measures factor, $\Sigma_j$ is the variance-covariance matrix of the K repeated measurements associated with a particular level j of treatment, $\lambda$ is a scalar > 0, $I$ is a (K − 1) × (K − 1) identity matrix, and $T$ is the transpose operator.

Since the data obtained in applied settings, particularly in the behavioral sciences, will rarely conform to the multisample sphericity pattern, many authors have recommended that applied researchers adopt a corrected degrees of freedom (df) test, such as Greenhouse and Geisser (1959) or Huynh and Feldt (1976), or a multivariate test statistic, in order to obtain valid tests of repeated measures effects. A corrected df univariate test attempts to circumvent the multisample sphericity assumption by altering the df of the traditional test statistics based on a sample estimate of the unknown sphericity parameter.
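As a rough illustration of the sphericity parameter that the corrected df tests estimate, the sketch below computes Box's population epsilon directly from a covariance matrix. The function and helper names are hypothetical, the matrices are made-up examples rather than values from the study, and the closed-form expression is the standard one in terms of the diagonal, row, and grand means of the matrix. Substituting a sample covariance matrix gives the Greenhouse-Geisser type estimate.

```python
def box_epsilon(cov):
    """Box's epsilon for a symmetric K x K covariance matrix (list of lists).

    Epsilon equals 1 when sphericity holds and has lower bound 1 / (K - 1).
    """
    k = len(cov)
    diag_mean = sum(cov[i][i] for i in range(k)) / k
    grand_mean = sum(sum(row) for row in cov) / (k * k)
    row_means = [sum(row) / k for row in cov]
    sum_sq = sum(v * v for row in cov for v in row)
    numerator = (k * (diag_mean - grand_mean)) ** 2
    denominator = (k - 1) * (
        sum_sq - 2 * k * sum(m * m for m in row_means) + (k * grand_mean) ** 2
    )
    return numerator / denominator


def compound_symmetric(k, variance, rho):
    """Hypothetical helper: equal variances, equal covariances."""
    return [[variance if i == j else variance * rho for j in range(k)]
            for i in range(k)]


def first_order_autoregressive(k, variance, rho):
    """Hypothetical helper: correlation decays as rho ** |i - j|."""
    return [[variance * rho ** abs(i - j) for j in range(k)]
            for i in range(k)]


# Illustrative matrices only (K = 4, unit variance, rho = 0.5).
eps_cs = box_epsilon(compound_symmetric(4, 1.0, 0.5))
eps_ar = box_epsilon(first_order_autoregressive(4, 1.0, 0.5))
print("CS epsilon:", eps_cs)
print("AR(1) epsilon:", eps_ar)
```

A compound symmetric matrix satisfies sphericity, so its epsilon is 1 up to rounding, whereas the autoregressive matrix falls between the lower bound 1/(K − 1) and 1.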
The multivariate test, on the other hand, does not require sphericity, only covariance homogeneity and multivariate normality. The empirical evidence regarding the validity of these two approaches indicates that if the design is balanced they effectively control the probability of committing a Type I error at the desired level of significance. However, for unbalanced (unequal group sizes) repeated measures designs, neither approach effectively maintains control over Type I errors when covariance heterogeneity exists, a likely outcome with applied data (e.g., see Keselman & Keselman, 1990; Keselman, Lix & Keselman, 1996). That is, current recommendations do not generally hold in unbalanced (across groups) repeated measures designs when covariance matrices are heterogeneous. Unfortunately, according to a survey conducted by Kowalchuk, Lix, and Keselman (1996), repeated measures designs in the applied sciences are typically unbalanced.

One of the newer approaches to the analysis of repeated measurements is based on a mixed model approach (see Jennrich & Schluchter, 1986; Laird & Ware, 1982; Liang & Zeger, 1986; Little, Milliken, Stroup & Wolfinger, 1996). The potential benefit of this approach is that it allows a user to model the covariance structure of the data rather than presuming a certain type of structure, as is the case with the traditional univariate and multivariate test statistics. Parsimoniously modeling the covariance structure of the data should result in more efficient estimates of the fixed-effects

parameters of the model and consequently more powerful tests of the repeated measures effects. The mixed model approach, and specifically SAS's (SAS, 1996) PROC MIXED, allows users to fit various covariance structures to the data. For example, some of the structures that can be fit with PROC MIXED are: (a) compound symmetric (CS), (b) unstructured (UN), (c) spherical/Huynh and Feldt (1970) (HF), (d) first-order autoregressive (AR), and (e) random coefficients (RC). The CS structure is assumed by the traditional univariate F-tests in SAS's GLM program (SAS Institute, 1990), while the UN structure is assumed by GLM's multivariate tests of the repeated measures effects. AR and RC structures more appropriately reflect that measurement occasions that are closer in time are more highly correlated than those farther apart in time. In addition, PROC MIXED allows users to specify, separately and jointly, between-subjects and within-subjects heterogeneity. After a covariance structure is selected, SAS computes, by default, F-tests that are Wald-type statistics; these are asymptotically valid, and their sampling distribution is approximated by an F in small samples (see McLean & Sanders, 1988; Wolfinger, 1993).

It is suggested that users first determine the appropriate covariance structure prior to conducting tests of significance for the repeated measures effects (see Little et al., 1996; Wolfinger, 1993; 1996). Specifically, a covariance structure can be selected by comparing the Akaike Information Criterion (AIC) (Akaike, 1974) and/or Schwarz Bayesian Criterion (BIC) (Schwarz, 1978) values for various potential covariance structures (see Bozdogan, 1987; Little et al., 1996; Wolfinger, 1993; 1996). According to the authors of SAS manuals, these two criteria are likely to give basically equivalent results. However, to date, we know of no reported study that has examined this proposition. Thus, the purpose of our investigation was to compare the two criteria.

2. Methods

The simplest of the higher-order repeated measures designs involves a single between-subjects factor and a single within-subjects factor, in which subjects ($i = 1, \ldots, n_j$; $\sum_j n_j = N$) are selected randomly for each level of the between-subjects factor ($j = 1, \ldots, J$) and observed and measured under all levels of the within-subjects factor ($k = 1, \ldots, K$). In this design, the repeated measures data are modeled by assuming that the observations $Y_{ijk}$ are normal, independent and identically distributed within each level j, with common mean vector $\mu_j$ and covariance matrix $\Sigma_j$.

2.1 Test Statistics

The AIC and BIC information criteria, in larger-is-better form, can be specified as

$$\mathrm{AIC} = \ell_R(\hat{\theta}) - q \quad \text{and} \quad \mathrm{BIC} = \ell_R(\hat{\theta}) - \frac{q}{2} \log(n - p),$$

where $\ell_R(\hat{\theta})$ is the restricted log likelihood evaluated at $\hat{\theta}$, the restricted/residual maximum likelihood estimate of the unknown variance-covariance parameters $\theta$; $q$ is the number of elements in $\theta$; and, for the model given previously, $p = JK$ and $n = NK$ (see Wolfinger, 1996; see also Footnote 1).
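To make the selection rule concrete, the following minimal sketch evaluates both criteria for a few candidate structures and picks the structure with the largest value. It assumes the larger-is-better forms AIC = ℓ − q and BIC = ℓ − (q/2) log(n − p); the function names are hypothetical and the log-likelihood values are invented purely for illustration, not taken from the study. The parameter counts for K = 4 (CS: 2, ARH: 5, UN: 10) follow from the definitions of those structures.

```python
import math


def aic(loglik, q):
    """Larger-is-better AIC: restricted log likelihood minus parameter count."""
    return loglik - q


def bic(loglik, q, n, p):
    """Larger-is-better BIC with penalty (q / 2) * log(n - p)."""
    return loglik - 0.5 * q * math.log(n - p)


def select_structure(fits, n, p, criterion):
    """Return the structure name with the largest criterion value.

    fits maps a structure name to (restricted log likelihood, q).
    """
    def score(item):
        loglik, q = item[1]
        return aic(loglik, q) if criterion == "aic" else bic(loglik, q, n, p)
    return max(fits.items(), key=score)[0]


# Design constants matching one studied case: J = 3 groups, K = 4 occasions, N = 30.
J, K, N = 3, 4, 30
n_obs, p_fixed = N * K, J * K

# Hypothetical restricted log likelihoods (made up for illustration).
fits = {
    "CS": (-250.0, 2),    # one variance, one covariance
    "ARH": (-240.0, 5),   # K variances plus one correlation
    "UN": (-234.0, 10),   # K * (K + 1) / 2 free elements
}

best_by_aic = select_structure(fits, n_obs, p_fixed, "aic")
best_by_bic = select_structure(fits, n_obs, p_fixed, "bic")
print("AIC picks:", best_by_aic)
print("BIC picks:", best_by_bic)
```

With these invented values the two criteria disagree: the heavier BIC penalty steers selection toward the more parsimonious structure, which is one way the criteria can fail to give equivalent results.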

2.2 Study Variables

The conditions of our study were those used by Keselman, Algina, Kowalchuk and Wolfinger (1997) in their study, which compared various approaches, including the mixed model approach, to the analysis of repeated measurements. Thus, the conditions (e.g., covariance heterogeneity, nonnormality, unequal group sizes) were chosen primarily for their known effects on tests for mean equality. Nonetheless, we believe that the conditions they manipulated are relevant to our investigation of the log-likelihood-based criteria as well, since they were chosen to mirror conditions likely to be encountered by researchers working in applied settings.

The two criteria (AIC and BIC) for selecting a covariance structure prior to computing tests of repeated measures effects were examined for balanced and unbalanced designs containing one between-subjects and one within-subjects factor; there were three and four levels of these factors, respectively. Selected combinations of six factors were investigated.

The six covariance structures investigated were: (a) UN, (b) ARH, (c) RCH, (d) UN_j, (e) ARH_j, and (f) RCH_j, where H in the structure designation refers to within-subject heterogeneity and the subscript j denotes between-groups heterogeneity (i.e., allowing for heterogeneity of a structure across groups) (see Little et al., 1996; SAS, 1995; Wolfinger, 1993; 1996). The authors will provide upon request an enumeration of the element values used in the simulation study. For each of the preceding structures, equal (excluding RCH) as well as unequal between-subjects covariance matrices were investigated. When unequal, the elements of the matrices were in the ratio of 1:3:5.

Based on the belief that applied researchers work with data that are characterized by both within and between heterogeneity, eleven covariance structures were fit with PROC MIXED for the Akaike (1974) and Schwarz (1978) criteria.
These structures were: (a) CS, (b) UN, (c) AR, (d) HF, (e) CSH, (f) ARH, (g) RCH, (h) UN_j, (i) HF_j, (j) ARH_j, and (k) RCH_j. Thus, we allowed PROC MIXED to select from among homogeneous, within heterogeneous, and within and between heterogeneous structures.

The criteria were investigated when the numbers of observations in each group were equal or unequal. Based on the findings reported by Keselman et al. (1993) and Wright (1995) we considered, like Keselman et al. (1997), a number of cases of total sample size: N = 30, N = 45, and N = 60. For each value of N, both a moderate and a substantial degree of group size inequality were typically investigated. The moderately unbalanced group sizes had a coefficient of sample size variation (C) approximately equal to .16, while for the more disparate cases C ≈ .33, where C is defined as $C = \left[ \sum_j (n_j - \bar{n})^2 / J \right]^{1/2} / \bar{n}$, and $\bar{n}$ is the average group size. The C ≈ .16 and C ≈ .33 unequal group sizes cases were, respectively: (a) 8, 10, 12 and 6, 10, 14 (N = 30), (b) 12, 15, 18 and 9, 15, 21 (N = 45), and (c) 16, 20, 24 and 12, 20, 28 (N = 60). Positive and/or negative pairings of these unequal group sizes and unequal covariance matrices were investigated. A positive pairing referred to the case in which the largest n_j was associated with the covariance matrix containing the largest element

values; a negative pairing referred to the case in which the largest n_j was associated with the covariance matrix with the smallest element values. In short, for each value of N, four pairings of unequal covariance matrices and unequal group sizes were investigated: moderately and very unequal n_j s, each both positively and negatively paired with the unequal Σ_j s.

The sphericity index was set at ε = 0.75. The 0.75 value characterizes data found in educational and behavioral science research (Huynh & Feldt, 1976). When ε = 1.0, sphericity is satisfied, and for the J × K design the lower bound of ε is 1/(K − 1).

The criteria were compared when the simulated data were obtained from multivariate normal or multivariate nonnormal distributions. The nonnormal distribution was a multivariate lognormal distribution with marginal distributions based on $Y_{ijk} = \exp(X_{ijk})$ ($i = 1, \ldots, n_j$), where $X_{ijk}$ is distributed as N(0, .25); this distribution has skewness ($\gamma_1$) and kurtosis ($\gamma_2$) values of 1.75 and 5.90, respectively. The algorithms for generating the two investigated distributions can be found in Keselman et al. (1993) and Algina and Oshima (1995).

An enumeration of the conditions investigated can be found in Table 1. One thousand replications of each condition were performed.

---------------------------------------------
Insert Table 1 About Here
---------------------------------------------

3. Results

In the tables that follow we present results for a subset of the selected combinations investigated; the subset adequately demonstrates differences that exist between the criteria. Tables 2 and 3 contain percentages, for each condition investigated, of the three most frequently selected covariance structures by the AIC and BIC criteria, respectively. Shaded cells of the table indicate the true covariance structure of the data.
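Two of the design quantities from Section 2.2 can be checked with a few lines of arithmetic. The sketch below (function names are hypothetical) computes the coefficient of sample size variation C for the listed group sizes, and the skewness and kurtosis of a lognormal variable, assuming that N(0, .25) denotes a variance of .25 and that the reported kurtosis is excess kurtosis. Both assumptions are consistent with the values given in the text.

```python
import math


def sample_size_variation(group_sizes):
    """C = sqrt(sum_j (n_j - mean_n)^2 / J) / mean_n."""
    j = len(group_sizes)
    mean_n = sum(group_sizes) / j
    variance = sum((n - mean_n) ** 2 for n in group_sizes) / j
    return math.sqrt(variance) / mean_n


def lognormal_skew_kurtosis(sigma2):
    """Skewness and excess kurtosis of exp(X) for X ~ N(0, sigma2)."""
    w = math.exp(sigma2)
    skewness = (w + 2.0) * math.sqrt(w - 1.0)
    excess_kurtosis = w ** 4 + 2.0 * w ** 3 + 3.0 * w ** 2 - 6.0
    return skewness, excess_kurtosis


# Group sizes listed in the text for N = 30, 45, and 60.
moderate = [[8, 10, 12], [12, 15, 18], [16, 20, 24]]
substantial = [[6, 10, 14], [9, 15, 21], [12, 20, 28]]

c_moderate = [round(sample_size_variation(g), 2) for g in moderate]
c_substantial = [round(sample_size_variation(g), 2) for g in substantial]
skew, kurt = lognormal_skew_kurtosis(0.25)

print("C (moderate):", c_moderate)        # 0.16 for each N
print("C (substantial):", c_substantial)  # 0.33 for each N
print("skewness:", round(skew, 2), "kurtosis:", round(kurt, 2))  # 1.75, 5.9
```

Because each set of group sizes is a rescaling of the same pattern, C is identical across the three values of N.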
---------------------------------------------
Insert Tables 2 and 3 About Here
---------------------------------------------

The results in Table 2 indicate that, for the AIC criterion: (a) on average, across the 26 investigated conditions, the correct covariance structure is selected 47 percent of the time, (b) when the true structure is UN/UN_j-Normal (A-F), another structure (ARH_j) is selected more often than the correct structure, (c) the correct structure is selected more frequently than any of the other structures when the true structure is ARH_j-Normal (H-L), RCH_j-Normal (M-Q), UN_j-Lognormal (R-S), and ARH_j-Lognormal (T-U), and (d) the correct structure (RCH_j) is selected approximately as frequently as an incorrect structure (UN_j) for Lognormal data (V-Z).

The picture is quite different when applying the Schwarz criterion. In particular, (a) in 14 of the 26 conditions investigated the correct covariance structure was never selected, (b) averaged over the cases in which the correct covariance structure was selected some percentage of the time, the percent of correct selections was 35, and (c) for

all structures, except RCH_j-Lognormal (V-Z) and distribution Q, an incorrect structure was selected more frequently than the correct structure.

4. Discussion

The newest approach to the analysis of repeated measurements is a mixed model analysis. Advocates of this approach suggest that it provides the "best" approach to the analysis of repeated measurements since it can, among other considerations, handle missing data and allow users to model the covariance structure of the data. The first of these advantages is typically not a pertinent issue to those involved in controlled experiments since data in these contexts are rarely missing. The second consideration, however, could be most relevant to experimenters since, according to the developers of mixed model analyses, modeling the correct covariance structure of the data should result in more powerful tests of the fixed-effects parameters. That is, users should obtain more powerful tests by using test statistics that more accurately model the correct covariance structure rather than adopting the usual univariate or multivariate tests, which presume specific types of covariance structures for the data.

The mixed model program in SAS (1996) allows users to model many potentially applicable covariance structures. Additionally, the program gives the user even greater flexibility by allowing him/her to model covariance structures that have within-subjects and/or between-subjects heterogeneity. In order to select an appropriate structure for one's data, PROC MIXED users can use either the Akaike (1974) or the Schwarz (1978) information criterion. These log-likelihood-based criteria, it is believed, should provide equivalent results. Nonetheless, within the context of unbalanced (across groups) nonspherical heterogeneous repeated measures designs, comparisons of the criteria have not to date been made.
Accordingly, we compared these criteria for various between- by within-subjects repeated measures designs in which we varied the true covariance structure of the data, the distributional form of the data, as well as group size and covariance balance/imbalance.

Our data indicate that neither approach uniformly selected the correct covariance structure. Indeed, for most of the structures investigated, the Akaike (1974), and in particular the Schwarz (1978), criterion more frequently picked the wrong covariance structure. Averaging over the conditions in which a correct structure was selected some percentage of the time, the correct structure was picked 47 percent of the time with the Akaike criterion and only 35 percent of the time with the Schwarz criterion. Thus, though the mixed model approach allows users to model the covariance structure, two popular criteria for selecting the "best" structure perform poorly. Not surprisingly, Keselman et al. (1997) found that the default F-tests that PROC MIXED computes based on either of these two criteria were prone to inflated rates of Type I error. Accordingly, potential presumed power benefits must be discounted when the procedure is prone to excessive rates of Type I error.

The authors are currently examining whether PROC MIXED tests of repeated measures effects can be improved by using its Satterthwaite test option. In the meantime we continue to recommend that for the analysis of repeated measures effects users adopt the nonpooled multivariate Welch-type statistic presented by Keselman, Carriere and Lix (1993). For most unbalanced nonspherical heterogeneous repeated measures designs it typically will be robust even when data are nonnormal (see Keselman et al., 1993). Lix

and Keselman (1995) show how to apply this statistic to conduct omnibus and subeffect tests in most independent and correlated groups designs and present a SAS/IML (SAS, 1989) program to obtain numerical results.

FOOTNOTES

1. Release 7.01 of PROC MIXED will compute the Schwarz (1978) criterion with a less stringent penalty. Specifically, based on Carlin and Louis (1996), n will be equated with the number of subjects rather than the number of observations, as is currently the case in PROC MIXED releases other than 7.01.

REFERENCES

Akaike, H. (1974), "A new look at the statistical model identification," IEEE Transactions on Automatic Control, AC-19, 716-723.

Algina, J. (in press), "Generalization of Improved General Approximation tests to split-plot designs with multiple between-subjects factors and/or multiple within-subjects factors," British Journal of Mathematical and Statistical Psychology.

Algina, J. (1994), "Some alternative approximate tests for a split plot design," Multivariate Behavioral Research, 29, 365-384.

Bozdogan, H. (1987), "Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions," Psychometrika, 52, 345-370.

Carlin, B.P., and Louis, T.A. (1996), Bayes and empirical Bayes methods for data analysis, Chapman and Hall.

Greenhouse, S.W., and Geisser, S. (1959), "On methods in the analysis of profile data," Psychometrika, 24, 95-112.

Hotelling, H. (1931), "The generalization of Student's ratio," Annals of Mathematical Statistics, 2, 360-378.

Huynh, H., and Feldt, L.S. (1970), "Conditions under which mean square ratios in repeated measurements designs have exact F distributions," Journal of the American Statistical Association, 65, 1582-1589.

Huynh, H., and Feldt, L.S. (1976), "Estimation of the Box correction for degrees of freedom from sample data in randomized block and split-plot designs," Journal of Educational Statistics, 1, 69-82.

Jennrich, R.I., and Schluchter, M.D. (1986), "Unbalanced repeated-measures models with structured covariance matrices," Biometrics, 42, 805-820.

Keselman, H.J., Algina, J., Kowalchuk, R.K., and Wolfinger, R.D. (1997), "A comparison of recent approaches to the analysis of repeated measurements," Unpublished manuscript.

Keselman, H.J., Carriere, K.C., and Lix, L.M. (1993), "Testing repeated measures hypotheses when covariance matrices are heterogeneous," Journal of Educational Statistics, 18, 305-319.

Kowalchuk, R.K., Lix, L.M., and Keselman, H.J. (1996), "The analysis of repeated measures designs," paper presented at the Annual Meeting of the Psychometric Society, 1996, Banff, Alberta.

Laird, N., and Ware, J.H. (1982), "Random-effects models for longitudinal data," Biometrics, 38, 963-974.

Liang, K.Y., and Zeger, S.L. (1986), "Longitudinal data analysis using generalized linear models," Biometrika, 73, 13-22.

Little, R.C., Milliken, G.A., Stroup, W.W., and Wolfinger, R.D. (1996), SAS system for mixed models, Cary, NC: SAS Institute.

Lix, L.M., and Keselman, H.J. (1995), "Approximate degrees of freedom tests: A unified perspective on testing for mean equality," Psychological Bulletin, 117, 547-560.

McLean, R.A., and Sanders, W.L. (1988), "Approximating degrees of freedom for standard errors in mixed linear models," Proceedings of the Statistical Computing Section, American Statistical Association, New Orleans, pp. 50-59.

Rouanet, H., and Lepine, D. (1970), "Comparison between treatments in a repeated measures design: ANOVA and multivariate methods," British Journal of Mathematical and Statistical Psychology, 23, 147-163.

SAS Institute. (1989), SAS/IML software: Usage and reference, Version 6. Cary, NC: Author.

SAS Institute. (1990), SAS/STAT User's Guide Vol. 2, GLM-VARCOMP, Version 6, 4th Ed. Cary, NC: Author.

SAS Institute. (1995), Introduction to the MIXED procedure: Course Notes, Cary, NC: Author.

SAS Institute. (1996), SAS/STAT Software: Changes and Enhancements through Release 6.11, Cary, NC: Author.

Schwarz, G. (1978), "Estimating the dimension of a model," Annals of Statistics, 6, 461-464.

Wolfinger, R. (1993), "Covariance structure selection in general mixed models," Communications in Statistics - Simulation and Computation, 22, 1079-1106.

Wolfinger, R.D. (1996), "Heterogeneous variance-covariance structures for repeated measurements," Journal of Agricultural, Biological, and Environmental Statistics, 1, 205-230.

Wright, S.P. (1995), "Adjusted F tests for repeated measures with the MIXED procedure," Proceedings of the Twentieth Annual SAS Users Group International Conference, Cary, NC.

Wright, S.P., and Wolfinger, R.D. (1996), "Repeated measures analysis using mixed models: Some simulation results," Paper presented at the Conference on Modelling Longitudinal and Spatially Correlated Data: Methods, Applications, and Future Directions, Nantucket, MA (October).

Author's Note

This research was supported by a Social Sciences and Humanities Research Council grant (#410-95-0006) to the first author.