Repeated Measurement ANOVA. Sungho Won, Ph. D. Graduate School of Public Health Seoul National University



Analysis of Variance (ANOVA)
Main idea: evaluate the effect of a treatment by analyzing the amount of variation among the subgroup sample means.

Analysis of Variance (ANOVA)
ANOVA tests mean differences between groups defined by categorical variables.
- One-way ANOVA: one factor with 2 or more levels (e.g. fertirrigation: 4 levels).
- Multi-way ANOVA: 2 or more factors, each with 2 or more levels (e.g. gender and drug: 4 doses).
Assumptions
- Independence of cases: this is a requirement of the design.
- Normality: the distribution in each cell is normal.
- Homogeneity of variances: the variance of the data should be the same in every group.

Preliminary Model for ANOVA: Nonnormality
- The F-test is very robust against non-normal data, especially in a fixed-effects model.
- With a large sample, the sample means are approximately normal by the Central Limit Theorem (recommended sample size > 50).
- Simulations have shown that unequal sample sizes between treatment groups magnify any departure from normality.
- A large deviation from normality leads to hypothesis test conclusions that are too liberal and to a decrease in power and efficiency.
Remedies for nonnormality
- Data transformation
- Modified F-tests: adjust the degrees of freedom; rank F-test; permutation (also called randomization) test on the F-ratio

Preliminary Model for ANOVA: Transformation
Example transformations: $f(x) = x^{1.5}$, $f(x) = \log(x)$, $f(x) = \arctan(x)$.

Permutation test
- The distribution of the test statistic is estimated empirically from the permuted samples under the null hypothesis.
- The p-value is obtained from this null distribution.
- It can be applied to any statistic.
Example
Group I: 10, 15, 20 vs Group II: 3, 4, 5. We are interested in the p-value for $H_0: \mu_1 = \mu_2$.
Number of possible permuted samples: $\binom{6}{3} = 20$.

Permutation test
Label   Group I      Group II     Test statistic
1       3  4  5      10 15 20     -3.737
2       3  4  10     5  15 20     -1.558
3       3  5  10     4  15 20     -1.356
4       4  5  10     3  15 20     -1.178
5       3  4  15     5  10 20     -0.741
6       3  5  15     4  10 20     -0.615
7       4  5  15     3  10 20     -0.495
8       3  4  20     5  10 15     -0.161
9       3  5  20     4  10 15     -0.053
10      3  10 15     4  5  20     -0.053
11      4  5  20     3  10 15      0.053
12      4  10 15     3  5  20      0.053
13      5  10 15     3  4  20      0.161
14      3  10 20     4  5  15      0.495
15      4  10 20     3  5  15      0.615
16      5  10 20     3  4  15      0.741
17      3  15 20     4  5  10      1.178
18      4  15 20     3  5  10      1.356
19      5  15 20     3  4  10      1.558
20      10 15 20     3  4  5       3.737
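The enumeration in the table can be reproduced with a short sketch. It assumes the test statistic is the pooled-variance two-sample t statistic (this matches the tabulated values, e.g. 3.737 for the observed grouping); the function and variable names are illustrative only.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic (mean(x) - mean(y)) with pooled variance."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))


group1 = [10, 15, 20]
group2 = [3, 4, 5]
pooled = group1 + group2

observed = pooled_t(group1, group2)

# Enumerate every way of choosing which 3 of the 6 values form Group I.
perm_stats = []
for idx in combinations(range(len(pooled)), len(group1)):
    g1 = [pooled[i] for i in idx]
    g2 = [pooled[i] for i in range(len(pooled)) if i not in idx]
    perm_stats.append(pooled_t(g1, g2))

# Two-sided p-value: share of splits at least as extreme as the observed one.
p_two_sided = sum(abs(t) >= abs(observed) - 1e-12 for t in perm_stats) / len(perm_stats)

print(len(perm_stats), round(observed, 3), p_two_sided)
```

Only the observed grouping and its mirror image are this extreme, so with three observations per group the two-sided permutation p-value cannot go below 2/20.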

One-way ANOVA
1. Test normality.
2. Test homogeneity of variance within each group.
3. Run the ANOVA.
4. Reject or fail to reject $H_0$ (all the means are equal).
Two approaches for the follow-up:
5A. Multiple comparisons to test differences between the levels of the factor.
5B. Model simplification, working with contrasts.

One-way ANOVA
Maize: 4 varieties (k = 4)
y: productivity (normal, continuous)
x: variety (categorical, four levels: $x_1, x_2, x_3, x_4$)
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_a$: at least two means differ
One-way ANOVA is used to test for differences among two or more independent groups.
ANOVA model, with variety 1 as the reference level and $x_2, x_3, x_4$ as 0/1 indicators:
$y_i = a + b x_2 + c x_3 + d x_4 + \varepsilon_i$, where $a = \mu_1$, $b = \mu_2 - \mu_1$, $c = \mu_3 - \mu_1$, $d = \mu_4 - \mu_1$.

One-way ANOVA
         Var 1   Var 2   Var 3   Var 4
         6.08    6.87    10.26   8.79
         5.70    6.77    10.21   8.42
         6.50    7.40    10.02   8.31
         5.86    6.63    9.65    8.57
         6.17    6.98    X       9.03
n_i      5       5       4       5
mean_i   6.06    6.93    10.03   8.62
(X: missing observation)
Number of groups: k = 4. Number of observations: N = 19. Grand mean = 7.80.
Sums of squares (SS, deviance), with $y_{ij}$ the jth observation in group $i$, $\bar{y}_i$ the group mean and $\bar{y}$ the grand mean:
$SS_{Total} = \sum_i \sum_j (y_{ij} - \bar{y})^2$
$SS_{Factor} = \sum_i n_i (\bar{y}_i - \bar{y})^2$
$SS_{Error}$ (within group) $= \sum_i \sum_j (y_{ij} - \bar{y}_i)^2$
Degrees of freedom (df): Total: $N - 1$; Group: $k - 1$; Error: $N - k$.

One-way ANOVA
[Figure: observations plotted around the grand mean and the four group means (mean 1 to mean 4), illustrating SS Total, SS Factor, and SS Error.]
SS Total = SS Factor + SS Error

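One-way ANOVA
[Figure: two panels showing the group means $\mu_1$ to $\mu_4$ around the grand mean; in one panel SS Total = SS Factor, in the other SS Total = SS Factor + SS Error.]
SS can be divided by the respective df to get a variance: MS = SS / df (mean squared deviation).
$F = MS_{Factor} / MS_{Error}$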

One-way ANOVA: F test (variance)
$F = \text{Factor MS} / \text{Error MS}$
If the k means are not equal, then the Factor MS in the population will be greater than the population's Error MS.
Defining the correct F test can be difficult with a complex design: this is a limitation of ANOVA.
If the calculated F is large (e.g. P < 0.05), then we can reject $H_0$.
All we can conclude is that at least two means are different. Follow-up options: a posteriori multiple comparisons, or working with contrasts.
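As a worked illustration of the sums of squares, mean squares, and F ratio, the following sketch recomputes them for the maize yields listed on the earlier One-way ANOVA slide (variety 3 has one missing observation). It uses only the standard library; the variable names are illustrative, and no p-value is computed because that would need an F distribution routine.

```python
from statistics import mean

# Maize yields by variety, as listed on the data slide.
yields = {
    "Var 1": [6.08, 5.70, 6.50, 5.86, 6.17],
    "Var 2": [6.87, 6.77, 7.40, 6.63, 6.98],
    "Var 3": [10.26, 10.21, 10.02, 9.65],  # one missing observation
    "Var 4": [8.79, 8.42, 8.31, 8.57, 9.03],
}

all_obs = [y for ys in yields.values() for y in ys]
N, k = len(all_obs), len(yields)
grand_mean = mean(all_obs)

# Sums of squares: total, between groups (factor), within groups (error).
ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
ss_factor = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in yields.values())
ss_error = sum((y - mean(ys)) ** 2 for ys in yields.values() for y in ys)

# Mean squares and the F ratio on (k - 1, N - k) degrees of freedom.
ms_factor = ss_factor / (k - 1)
ms_error = ss_error / (N - k)
f_stat = ms_factor / ms_error

print(f"N={N}, k={k}, grand mean={grand_mean:.2f}")
print(f"SS total={ss_total:.3f}, SS factor={ss_factor:.3f}, SS error={ss_error:.3f}")
print(f"F={f_stat:.1f} on ({k - 1}, {N - k}) df")
```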

One-way ANOVA: F test (variance)
Contrasts are the essence of hypothesis testing and model simplification in analysis of variance and analysis of covariance. They are used to compare means or groups of means with other means or groups of means.
- We use contrasts to carry out t tests after a significant effect has been found with the F test.
- We can use contrasts in model simplification (merging similar factor levels).
- Often we can avoid post-hoc multiple comparisons.

Multi-way ANOVA: F test (variance)
Multi-way ANOVA is used when the experimenter wants to study the effects of two or more treatment variables.
Assumptions
- Independence of cases: this is a requirement of the design.
- Normality: the distribution in each of the groups is normal.
- Homogeneity of variances: the variance of the data should be the same in every group.
- Equal replication (balanced and orthogonal design: this is required to estimate the variance).
Example of an unbalanced design (X marks the problem cells):
            Dose 1    Dose 2    Dose 3
Low temp    - (X)     10 obs    10 obs
High temp   10 obs    10 obs    8 obs (X)
With traditional general linear models, even one missing observation can strongly affect the results.

Fixed vs Random Effects
If we consider more than one factor, we have to distinguish two kinds of effects:
- Fixed effects: factors are specifically chosen and under control; they are informative (e.g. sex, treatments, wet vs. dry, doses, sprayed or not sprayed).
- Random effects: factors are chosen randomly within a large population; they are normally not informative (e.g. fields within a site, blocks within a field, split-plots within a plot, family, parent, brood, individuals within repeated measures).
Random effects mainly occur in two contrasting kinds of circumstances:
1. Observational studies with hierarchical structure
2. Designed experiments with different spatial or temporal dependence

Fixed vs Random Effects
Random effect
Imagine that we randomly select $a$ of the possible levels of the factor of interest. In this case, we say that the factor is random if its effects can be assumed to follow the normal distribution.
Typically random factors are categorical. While continuous covariates may be measured at random levels, we usually think of their effects as being systematic (such as linear, quadratic or even exponential). Random effects are not systematic.
If we have only random effects, then we are working with a random effects model:
$y_{ij} = \mu + r_i + \varepsilon_{ij}$, with $r_i \sim N(0, \sigma_1^2)$ (random) and $\varepsilon_{ij} \sim N(0, \sigma_2^2)$.

Fixed vs Random Effects
Example
The text discusses a single-random-factor case concerning differences among looms in a textile weaving company. Four looms were chosen randomly from the population of looms within a weaving shed, and four observations of fabric strength were made on each loom. The data obtained from the experiment are below.
Loom   Obs 1   Obs 2   Obs 3   Obs 4   Row sum
1      98      97      99      96      390
2      91      90      93      92      366
3      96      95      97      95      383
4      95      96      99      98      388
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, with $\alpha_i \sim N(0, \sigma_\alpha^2)$ and $\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2)$.
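One common way to estimate the two variance components in this balanced one-way random effects model is the ANOVA (method of moments) estimator, which equates the between-loom and within-loom mean squares to their expectations. The slides do not say which estimator is intended, so the sketch below is an assumption; names such as `sigma2_alpha` are illustrative.

```python
from statistics import mean

# Fabric strength: 4 randomly chosen looms, 4 observations each.
strength = {
    1: [98, 97, 99, 96],
    2: [91, 90, 93, 92],
    3: [96, 95, 97, 95],
    4: [95, 96, 99, 98],
}
a = len(strength)  # number of looms
n = 4              # observations per loom (balanced)

grand_mean = mean(y for ys in strength.values() for y in ys)

# Mean squares between looms (a - 1 df) and within looms (a * (n - 1) df).
ms_between = n * sum((mean(ys) - grand_mean) ** 2 for ys in strength.values()) / (a - 1)
ms_within = sum((y - mean(ys)) ** 2 for ys in strength.values() for y in ys) / (a * (n - 1))

# Method of moments: E[MS_within] = sigma_eps^2, E[MS_between] = sigma_eps^2 + n * sigma_alpha^2.
sigma2_eps = ms_within
sigma2_alpha = (ms_between - ms_within) / n

print(f"MS between={ms_between:.3f}, MS within={ms_within:.3f}")
print(f"sigma_alpha^2={sigma2_alpha:.3f}, sigma_eps^2={sigma2_eps:.3f}")
```

Under this estimator most of the variability in strength is attributed to differences between looms rather than to variation within a loom.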

Fixed vs Random Effects
Why is it so important to identify fixed vs. random effects?
- They affect the way the F-test is constructed in a multifactorial ANOVA. Misspecifying them leads to wrong conclusions.
- You can find how to construct your F-test for different combinations of random and fixed effects and for different hierarchical structures (choose a well-known sampling design).
If we have both fixed and random effects, then we are working with mixed models:
$y_{ij} = \mu + \alpha_i + r_i + \varepsilon_{ij}$, with $\alpha_i$ fixed, $r_i \sim N(0, \sigma_1^2)$ random, and $\varepsilon_{ij} \sim N(0, \sigma_2^2)$.


Factorial ANOVA: two or more factors
Factorial design: two or more factors are crossed. Each combination of the factors is equally replicated, and each level of every factor occurs in combination with every level of the other factors.
Example: 3 levels of irrigation x 4 fertilizers, with 10 observations in each of the 12 cells (orthogonal sampling).

Factorial ANOVA: why?
Why use a factorial ANOVA? Why not just use multiple one-way ANOVAs?
- With n factors, you'd need to run n one-way ANOVAs, which would inflate your α-level. However, this could be corrected with a Bonferroni correction.
- The best reason is that a factorial ANOVA can detect interactions, something that multiple one-way ANOVAs cannot do.
Note that orthogonality is often not preserved in observational studies. If the factors are not orthogonal, the true effect of factor A can be estimated only if both A and B are included as covariates. Furthermore, if A and B are highly correlated, their effects are not distinguishable because of multicollinearity.

Factorial ANOVA: Interactions
Interaction: the effects of one independent variable differ according to the levels of another independent variable.
E.g. we are testing two factors, Gender (male and female) and Age (young, medium, and old), and their effect on performance. If males' performance differed as a function of age, i.e. males performed better or worse with age, but females' performance was the same across ages, we would say that Age and Gender interact, or that we have an Age x Gender interaction.
[Figure: performance against age (young to old), one line each for males and females.]
For an interaction, the slopes must differ from one another.

Factorial ANOVA: Main Effects
Main effects: the effect of a factor, independent of any other factors.
This is what we were looking at with one-way ANOVAs: if we have a significant main effect of our factor, then we can say that the mean of at least one of the groups/levels of that factor is different from at least one of the other groups/levels.
[Figure: performance for young and old, one line each for males and females.]
For a main effect, the intercepts must differ.

Factorial ANOVA: Two-crossed fixed factor design
Two-crossed factor design: Clone (A, B) x Treatment (3 levels).
[Figure: five panels of mean y against treatment, one line per clone.]
Examples of good ANOVA results:
Panel   Clone     Treat     C x T
1       < 0.05    < 0.05    < 0.05
2       < 0.05    < 0.05    n.s.
3       n.s.      < 0.05    n.s.
4       < 0.05    n.s.      n.s.
Worst case:
5       n.s.      n.s.      n.s.

Factorial ANOVA: Two-crossed fixed factor design
Two crossed fixed effects: every level of each factor occurs in combination with every level of the other factors.
- Model 1: two fixed effects
- Model 2: two random effects (uncommon situation)
- Model 3: one random and one fixed effect
We can test main effects and the interaction:
1. The main effect of each factor is the effect of that factor independent of (pooling over) the other factors.
2. The interaction between factors (Factor 1 x Factor 2) measures how the effect of one factor depends on the levels of one or more additional factors (synergistic and antagonistic effects of the factors).
We can only measure interaction effects in factorial (crossed) designs.

Factorial ANOVA: Two-crossed fixed factor design
Two crossed fixed effects:
Response variable: weight gain in six weeks
Factor A: DIET (3 levels: barley, oats, wheat)
Factor B: SUPPLEMENT (4 levels: S1, S2, S3, S4)
DIET x SUPPLEMENT = 3 x 4 = 12 combinations
We have 48 horses to test our two factors: 4 replicates per combination.
Combinations: barley+S1, barley+S2, barley+S3, barley+S4, oats+S1, oats+S2, oats+S3, oats+S4, wheat+S1, wheat+S2, wheat+S3, wheat+S4.

Factorial ANOVA: Two-crossed fixed factor design
The 48 horses must be independent units to be replicates.
F tests for main effects and interaction:
DIET: $F_D = MS_D / MS_{error}$
SUPPLEMENT: $F_S = MS_S / MS_{error}$
DIET x SUPPLEMENT: $F_{D \times S} = MS_{D \times S} / MS_{error}$
Cell means:
          S1      S2      S3      S4
Barley    26.34   23.29   22.46   25.57
Oats      23.29   20.49   19.66   21.86
Wheat     19.63   17.40   17.01   19.66
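The distinction between main effects and interaction can be explored from the cell means alone. The sketch below assumes the four columns of the cell-means table correspond to S1 to S4 in order and that the design is balanced (4 horses per cell), so marginal means are plain averages of cell means. It does not compute the F tests, which need the raw data for the error mean square; all names are illustrative.

```python
from statistics import mean

supplements = ["S1", "S2", "S3", "S4"]
cell_means = {
    "barley": [26.34, 23.29, 22.46, 25.57],
    "oats":   [23.29, 20.49, 19.66, 21.86],
    "wheat":  [19.63, 17.40, 17.01, 19.66],
}

grand = mean(v for row in cell_means.values() for v in row)

# Marginal means: the basis of the two main effects.
diet_means = {d: mean(row) for d, row in cell_means.items()}
supp_means = {
    s: mean(cell_means[d][j] for d in cell_means)
    for j, s in enumerate(supplements)
}

# Interaction residuals: what remains of each cell mean once both
# main effects are removed. Values near zero suggest additive effects.
interaction = {
    (d, s): cell_means[d][j] - diet_means[d] - supp_means[s] + grand
    for d in cell_means
    for j, s in enumerate(supplements)
}

for d, m in diet_means.items():
    print(f"{d:7s} mean = {m:.3f}")
for s, m in supp_means.items():
    print(f"{s:7s} mean = {m:.3f}")
print("largest |interaction residual| =",
      round(max(abs(v) for v in interaction.values()), 3))
```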

Mixed Models: Fixed + Random Effects
Pure fixed-effect models require independence. What if we have dependence?
Mixed models can deal with spatial or temporal dependence.
Mixed models capture the dependence in the data through appropriate random effects.

Mixed Models: Split-Plot
We can include random factors to account for the variability related to the environment in which we carry out the experiment.
Mixed models can deal with spatial or temporal dependence.
The split-plot design is one of the most useful designs in agricultural and ecological research.
The different treatments are applied to plots of different sizes organized in a hierarchical structure.

Factorial ANOVA: Split-Plot (Mixed)
[Figure: field layout of four blocks (A to D), each divided into irrigation plots, density sub-plots, and fertilizer sub-sub-plots (N, P, NP).]
Response variable: crop production in each cell
Fixed effects:
- Irrigation (yes or no)
- Seed-sowing density sub-plots (low, med, high)
- Fertilizer (N, P or NP)
Random effects:
- 4 blocks
- Irrigation within block
- Density within irrigation plots

Factorial ANOVA: Mixed Model
The split-plot design: hierarchical structure
i) 4 blocks
ii) Irrigation plots (yes or no: water / no water)
iii) Seed-sowing density sub-plots (low, med, high)
iv) 3 fertilizer sub-sub-plots (N, P, or NP)
Response: crop production

Factorial ANOVA: Mixed Model
Model formulation
Y ~ fixed effects + error terms
y ~ a*b*c + Error(a/b/c)
The Error() term is where you specify your sampling hierarchy.
Variables: Yield (response); Block (uninformative); Irrigation, Density, Fertilizer (informative).
Yield ~ irrigation*density*fertilizer + Error(block/irrigation/density)

Factorial ANOVA: Mixed Model
Mixed models fitted with traditional ANOVA require a perfectly orthogonal and balanced design (they work well with proper sampling).
Avoid working with F-tests for multi-way ANOVA in non-orthogonal sampling designs.
If something has gone wrong with the sampling, you can use linear mixed models with REML estimation.

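ANOVA with Repeated Measures
Statistical model
[Figure: plot of means by time for each of two treatments.]
Data layout: one row per subject (1, ..., n) and one column per time point (1, ..., t), with entries $y_{11}, \dots, y_{nt}$.
A standard ANOVA model:
$y_{hij} = \mu_{hj} + \varepsilon_{hij}$
- $\mu_{hj}$: the expected response of a subject on treatment $h$ at time $t_j$
- $\varepsilon_{hij}$: the independent errors, $N(0, \sigma^2)$
- $h = 1, \dots, g$ (number of groups); $i = 1, \dots, n_h$ (number of subjects in the hth group); $j = 1, \dots, k$ (number of measurements)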

ANOVA with Repeated Measures
Mixed effects models
The random effect for an individual is a random displacement term that applies to all measurements for that individual.
Data layout: within each treatment (Trt = 1, Trt = 2), subjects $i = 1, \dots, n$ form the whole plots; each subject's row of measurements $y_{i1}, \dots, y_{ij}, \dots, y_{it}$ over the time points is a split plot with random effect $U_{hi}$ ($U_{h1}$: split plot 1, ..., $U_{hi}$: split plot i, ..., $U_{hn}$: split plot n).
Subjects are not nested for treatment

ANOVA with Repeated Measures
A mixed effects model
The random/mixed effects model is alternatively referred to as a split-plot, hierarchical, two-stage, or variance components model, or the random intercept model:
$y_{hij} = \mu_{hj} + U_{hi} + \varepsilon_{hij}$
- $\mu_{hj}$: the expected response of a subject on treatment $h$ at time $t_j$
- $U_{hi}$: the unobserved random effect for subject $i$ in group $h$; $h = 1, \dots, g$; $j = 1, \dots, k$; $i = 1, \dots, n_h$
- $U_{hi} \sim N(0, \nu^2)$ for all $h, i$; $\varepsilon_{hij} \sim N(0, \sigma^2)$ are residual errors; $U_{hi}$ and $\varepsilon_{hij}$ are independent

ANOVA with Repeated Measures
A mixed effects model
Decomposition of variance and covariance
$\operatorname{Var}(y_{hij}) = \nu^2 + \sigma^2$
- $\nu^2$: the population variance of the subject effect
- $\sigma^2$: the measurement variance
This assumes equal covariance between $y_{hij}$ and $y_{hij'}$ for every pair of times. However, this is not always true and should be confirmed with statistical inference.
The hierarchical or two-stage formulation is applicable to the mixed-effects model:
1. $E(y_{hij} \mid U_{hi}) = \mu_{hj} + U_{hi} = U_{hi}^*$, $V(y_{hij} \mid U_{hi}) = \sigma^2$
2. $U_{hi} \sim N(0, \nu^2)$
What is the covariance/correlation between measurements within a subject?
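The question on this slide can be answered directly from the random intercept model. Writing $\nu^2$ for the variance of the subject effect $U_{hi}$ and $\sigma^2$ for the variance of the residual error $\varepsilon_{hij}$, and using only the stated independence of the two, a short derivation for two different times $j \ne j'$ on the same subject is:

```latex
\begin{aligned}
\operatorname{Cov}(y_{hij}, y_{hij'})
  &= \operatorname{Cov}(U_{hi} + \varepsilon_{hij},\; U_{hi} + \varepsilon_{hij'})
   = \operatorname{Var}(U_{hi}) = \nu^2, \\
\operatorname{Corr}(y_{hij}, y_{hij'})
  &= \frac{\nu^2}{\nu^2 + \sigma^2}.
\end{aligned}
```

The correlation is the same for every pair of times, which is the equal-covariance (compound symmetry) structure referred to above.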

ANOVA with Repeated Measures
A mixed effects model
Why not use subject as a fixed effect?
- It may require a lot of additional parameters (depending on the number of subjects).
- Subject effects are small and can be assumed to follow the normal distribution.
- A fixed effect implies interest in estimating a separate effect for each person.
Comments
- Model fitting does not require estimation of subject effects.
- Instead of using a separate parameter (fixed effect) for each subject, we assume subject effects are random realizations from a population.
- The only additional parameter is the random effect variance, $\nu^2$.

ANOVA with Repeated Measures
Repeated Measures ANOVA
It is a univariate approach to analyzing repeated measures data, with particular focus on hypothesis testing.
It can be thought of as an extension of the paired-sample t-test to comparisons among more than two repeated measures.

ANOVA with Repeated Measures
Between-subject and within-subject effects
Between-subject effects
- Each subject is exposed to one of the treatments being tested.
- Estimate $E(Y \mid E = 1) - E(Y \mid E = 0)$.
- Example: one-way ANOVA
Within-subject effects
- Each subject is exposed to more than one of the treatments being tested.
- Estimate $E(Y_i^{E=1} - Y_i^{E=0})$, $i$: subject.
- Example: paired t-test
Mixed (within- and between-subjects) design
- There is at least one within-subjects factor and at least one between-subjects factor in the same experiment.

ANOVA with Repeated Measures
Between-subject and within-subject effects
Remarks
- Within-subject designs allow observation of change over time. The main disadvantage is possible confounding, which is generated by learning effects and can often be overcome by counterbalancing.
- For between-subject designs, as long as group assignment is random, causal estimates are obtained by comparing the behavior of those in one experimental condition with the behavior of those in another. However, random assignment is hardly ever achieved in observational studies.

ANOVA with Repeated Measures
Partition of SS
Design A x (B x S): A = between-subjects factor, B = within-subjects factor, S = subjects.
Each level of factor A contains a different group of (e.g. randomly assigned) subjects. On the other hand, each level of factor B at any given level of factor A contains the same subjects.
SS total
  SS between subjects
    SS A
    SS subjects within groups
  SS within subjects
    SS B
    SS error

Sources of Variance
Partition of SS
- SS total: deviation of each individual score from the grand mean.
- SS between subjects: deviation of subjects' individual means (across treatments) from the grand mean. This is largely uninteresting, as we can pretty much assume that subjects differ.
- SS within subjects: how each subject's scores vary about that subject's own mean. It breaks down into:
  - SS B: (1) as in between-subjects ANOVA, it is the comparison of treatment means to each other (by examining their deviations from the grand mean); (2) however, this is now a partition of the within-subjects variation.
  - SS error: variability of individuals' scores about their treatment mean.

ANOVA with Repeated Measures
Example: the Sitka spruce data, 1989
Goal: to investigate the effect of ozone on tree growth.
79 trees are grown in four controlled-environment chambers.
Observation times: 103, 130, 162, 190, 213, 247, 273, 308 (days)
Chambers: 27 trees, ozone 70 ppb; 27 trees, ozone 70 ppb; 12 trees, no ozone; 12 trees, no ozone

ANOVA with Repeated Measures
Repeated Measures ANOVA
Linear model: $y_{hij} = \mu + \alpha_h + \tau_j + \gamma_{hj} + U_{hi} + \varepsilon_{hij}$
- $y_{hij}$: the jth measurement (response) ($j = 1, \dots, n$) for the ith subject ($i = 1, \dots, m_h$) in group $h$ ($h = 1, \dots, g$)
- $\mu$: the unknown intercept
- $\alpha_h$: the effect of treatment $h$, $\sum_h m_h \alpha_h = 0$
- $\tau_j$: the effect of time $j$, $\sum_j \tau_j = 0$
- $\gamma_{hj}$: the effect of treatment $h$ at time $j$; $\sum_j \gamma_{hj} = 0$ for all $h$, and $\sum_h m_h \gamma_{hj} = 0$ for $j = 1, \dots, n$
- $U_{hi}$: the random effect of the ith subject in the hth group, $U_{hi} \sim IN(0, \nu^2)$
- $\varepsilon_{hij} \sim IN(0, \sigma^2)$ are residual errors; $U_{hi}$ and $\varepsilon_{hij}$ are independent
$F_1$: to test for a difference in expected response between treatments, $H_0: \alpha_1 = \alpha_2 = \dots = \alpha_g$ vs $H_1$: not $H_0$.
$F_2$: to test for lack of parallel group mean profiles, $H_0: \gamma_{11} = \gamma_{12} = \dots = \gamma_{gn}$ vs $H_1$: not $H_0$.

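ANOVA with Repeated Measures
ANOVA Table
Notation: $n$ = number of times, $m_h$ = number of subjects in group $h$, $m = \sum_h m_h$; a dot replaces a subscript that has been averaged over.
Whole plot
- Between treatments: $BTSS_1 = n \sum_{h=1}^{g} m_h (\bar{y}_{h..} - \bar{y}_{...})^2$, df $g - 1$
- Whole plot residual: $RSS_1 = TSS_1 - BTSS_1$, df $m - g$
- Whole plot total: $TSS_1 = n \sum_{h=1}^{g} \sum_{i=1}^{m_h} (\bar{y}_{hi.} - \bar{y}_{...})^2$, df $m - 1$
Split plot
- Between times: $BTSS_2 = m \sum_{j=1}^{n} (\bar{y}_{..j} - \bar{y}_{...})^2$, df $n - 1$
- Treatment by time interaction: $ISS_2 = \sum_{j=1}^{n} \sum_{h=1}^{g} m_h (\bar{y}_{h.j} - \bar{y}_{...})^2 - BTSS_1 - BTSS_2$, df $(g - 1)(n - 1)$
- Split-plot residual: $RSS_2 = TSS_2 - ISS_2 - BTSS_2 - TSS_1$, df $(m - g)(n - 1)$
- Split-plot total: $TSS_2 = \sum_{h=1}^{g} \sum_{i=1}^{m_h} \sum_{j=1}^{n} (y_{hij} - \bar{y}_{...})^2$, df $nm - 1$
$F_1$: to test for a difference in expected response between treatments: $F_1 = \{BTSS_1 / (g - 1)\} / \{RSS_1 / (m - g)\}$
$F_2$: to test for lack of parallel group mean profiles: $F_2 = \{ISS_2 / [(g - 1)(n - 1)]\} / \{RSS_2 / [(m - g)(n - 1)]\}$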
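The split-plot sums of squares can be sketched in a few lines. The data below are a small hypothetical example (2 groups with 3 and 2 subjects, 3 time points), not taken from the slides, and the notation is an assumption matching the table above: `n` is the number of times, `m_h` the number of subjects in group h, and `m` their total.

```python
from statistics import mean

# Hypothetical toy data: data[h][i][j] = response of subject i in group h at time j.
data = [
    [[4.0, 5.0, 7.0], [3.0, 5.0, 6.0], [5.0, 6.0, 9.0]],  # group 1, m_1 = 3 subjects
    [[2.0, 2.0, 3.0], [3.0, 4.0, 4.0]],                   # group 2, m_2 = 2 subjects
]
g = len(data)
n = len(data[0][0])                  # measurements per subject
m_h = [len(group) for group in data]
m = sum(m_h)

# Grand, group, subject, time, and group-by-time means.
grand = mean(y for group in data for subj in group for y in subj)
group_mean = [mean(y for subj in group for y in subj) for group in data]
subj_mean = [[mean(subj) for subj in group] for group in data]
time_mean = [mean(subj[j] for group in data for subj in group) for j in range(n)]
group_time_mean = [[mean(subj[j] for subj in group) for j in range(n)] for group in data]

# Whole-plot stratum.
btss1 = n * sum(m_h[h] * (group_mean[h] - grand) ** 2 for h in range(g))
tss1 = n * sum((sm - grand) ** 2 for sms in subj_mean for sm in sms)
rss1 = tss1 - btss1

# Split-plot stratum.
btss2 = m * sum((tm - grand) ** 2 for tm in time_mean)
iss2 = sum(
    m_h[h] * (group_time_mean[h][j] - grand) ** 2
    for h in range(g) for j in range(n)
) - btss1 - btss2
tss2 = sum((y - grand) ** 2 for group in data for subj in group for y in subj)
rss2 = tss2 - iss2 - btss2 - tss1

f1 = (btss1 / (g - 1)) / (rss1 / (m - g))
f2 = (iss2 / ((g - 1) * (n - 1))) / (rss2 / ((m - g) * (n - 1)))
print(f"F1={f1:.3f} on ({g - 1}, {m - g}) df; F2={f2:.3f} on "
      f"({(g - 1) * (n - 1)}, {(m - g) * (n - 1)}) df")
```

Note that each F ratio uses the residual from its own stratum: the treatment effect is tested against between-subject variation, while the interaction is tested against within-subject variation.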

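ANOVA with Repeated Measures
ANOVA Table
Each entry lists the sum of squares, the df, and the expected mean square (EMS), with $n$ = number of times, $m_h$ = number of subjects in group $h$, and $m = \sum_h m_h$.
- Between treatments: $BTSS_1 = n \sum_h m_h (\bar{y}_{h..} - \bar{y}_{...})^2$; df $g - 1$; EMS $n \sum_h m_h \alpha_h^2 / (g - 1) + n\nu^2 + \sigma^2$
- Whole plot residual: $RSS_1 = TSS_1 - BTSS_1$; df $m - g$; EMS $n\nu^2 + \sigma^2$
- Whole plot total: $TSS_1 = n \sum_h \sum_i (\bar{y}_{hi.} - \bar{y}_{...})^2$; df $m - 1$
- Between times: $BTSS_2 = m \sum_j (\bar{y}_{..j} - \bar{y}_{...})^2$; df $n - 1$; EMS $m \sum_j \tau_j^2 / (n - 1) + \sigma^2$
- Treatment x time interaction: $ISS_2 = \sum_j \sum_h m_h (\bar{y}_{h.j} - \bar{y}_{...})^2 - BTSS_1 - BTSS_2$; df $(g - 1)(n - 1)$; EMS $\sum_j \sum_h m_h \gamma_{hj}^2 / ((g - 1)(n - 1)) + \sigma^2$
- Error: $RSS_2 = TSS_2 - ISS_2 - BTSS_2 - TSS_1$; df $(m - g)(n - 1)$; EMS $\sigma^2$
- Total: $TSS_2 = \sum_h \sum_i \sum_j (y_{hij} - \bar{y}_{...})^2$; df $nm - 1$
Under the null hypotheses, $\alpha_h = 0$, $\tau_j = 0$, and $\gamma_{hj} = 0$.
$\nu^2$ can be 0, which means no covariance between repeated measurements on the same subject.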

ANOVA with Repeated Measures
Results
Growth of Sitka spruce with and without ozone (continued)
The following table presents a split-plot ANOVA for the 1989 Sitka spruce data. We ignore possible chamber effects to ease comparison with the results from the robust analysis.
Source of variation             Sum of squares   df    Mean square
Between treatments              17.106           1     17.106
Whole plot residuals            249.108          77    3.235
Whole plot total                266.214          78    3.413
Between times                   41.521           7     5.932
Treatment by time interaction   0.045            7     0.0064
Split-plot residual             5.091            539   0.0094
Split-plot total                312.871          631
What are the test statistics $F_1$ and $F_2$?
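The two test statistics follow from the table by dividing each effect's mean square by the residual mean square of its own stratum. A minimal sketch, using only the sums of squares and df printed above (the dictionary keys are illustrative names):

```python
# (sum of squares, degrees of freedom) taken from the split-plot ANOVA table.
table = {
    "between_treatments": (17.106, 1),
    "whole_plot_residual": (249.108, 77),
    "interaction": (0.045, 7),
    "split_plot_residual": (5.091, 539),
}


def mean_square(source):
    """Mean square = sum of squares / degrees of freedom."""
    ss, df = table[source]
    return ss / df


# F1: treatment effect, tested against the whole-plot (between-subject) residual.
f1 = mean_square("between_treatments") / mean_square("whole_plot_residual")

# F2: treatment-by-time interaction, tested against the split-plot residual.
f2 = mean_square("interaction") / mean_square("split_plot_residual")

print(f"F1 = {f1:.2f} on (1, 77) df")
print(f"F2 = {f2:.2f} on (7, 539) df")
```

Each ratio would then be referred to an F distribution with the degrees of freedom shown.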

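ANOVA with Repeated Measures
Assumptions in the Sitka Spruce Example
$U_{hi} \sim IN(0, \nu^2)$, $\varepsilon_{hij} \sim IN(0, \sigma^2)$
$\operatorname{cov}(y_{hij}, y_{h'i'j'}) = \begin{cases} \nu^2 + \sigma^2 & \text{if } i = i',\ j = j' \\ \nu^2 & \text{if } i = i',\ j \ne j' \\ 0 & \text{if } i \ne i' \end{cases}$
Assumptions: homogeneity of variances, sphericity, independence.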

ANOVA with Repeated Measures
Assumptions in the Sitka Spruce Example
Between subjects
- When analysing data from different participants, you have to assume homogeneity of variances, i.e. the same variance in all groups.
- Homogeneity of variances is tested with Levene's test.
Within subjects
- Since your subjects are always the same, there will necessarily be a relation between the measurements in the various conditions of a within-subject design.
- While the measurements cannot be independent, the relation should be the same between all pairs of conditions: variance(A - B) ≈ variance(A - C) ≈ variance(B - C).

ANOVA with Repeated Measures
Sphericity
The validity of (some of) the F-tests rests on the assumptions of the split-plot model, in particular a type H covariance matrix, which includes compound symmetry (equal variances and equal correlations) as a special case.
Type H matrix: $\Sigma_{jj} = \sigma_j^2$ and $\Sigma_{jk} = \dfrac{\sigma_j^2 + \sigma_k^2}{2} - \lambda$ for $j \ne k$.
Sphericity assumes that the variances of the differences between conditions are equal; equal variances with equal correlations between treatment levels (compound symmetry) are sufficient for this.
Sphericity test: a statistical test for the type H structure (a type H matrix satisfies sphericity).

ANOVA with Repeated Measures
Sphericity
Start   1 Month   2 Months   Start - 1 Month   Start - 2 Months   1 Month - 2 Months
64      62        74         2                 -10                -12
63      63        62         0                 1                  1
66      65        71         1                 -5                 -6
106     100       84         6                 22                 16
67      66        66         1                 1                  0
120     112       107        8                 13                 5
62      63        61         -1                1                  2
72      71        48         1                 24                 23
83      73        65         10                18                 8
77      65        62         12                15                 3
Variance                     21.33             140.67             100.89
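The variances in the last row can be checked from the raw scores. The sketch below recomputes the three pairwise difference scores and their sample variances (n - 1 denominator) with the standard library; the dictionary labels are illustrative.

```python
from statistics import variance

# (Start, 1 Month, 2 Months) for each of the 10 subjects.
scores = [
    (64, 62, 74), (63, 63, 62), (66, 65, 71), (106, 100, 84), (67, 66, 66),
    (120, 112, 107), (62, 63, 61), (72, 71, 48), (83, 73, 65), (77, 65, 62),
]

# Column index pairs for each difference score.
pairs = {
    "Start - 1 Month": (0, 1),
    "Start - 2 Months": (0, 2),
    "1 Month - 2 Months": (1, 2),
}

# Sample variance of each set of difference scores.
diff_var = {
    name: variance([row[a] - row[b] for row in scores])
    for name, (a, b) in pairs.items()
}

for name, v in diff_var.items():
    print(f"{name:20s} variance = {v:.2f}")
```

Sphericity asks whether these three variances are equal in the population; here the first is much smaller than the other two.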

ANOVA with Repeated Measures
Mauchly Sphericity Test
Likelihood ratio test: $H_0$: the type H matrix is preserved vs $H_1$: not $H_0$
$\Lambda = \dfrac{\sup_{H_0} L(\theta)}{\sup_{H_0 \cup H_1} L(\theta)}, \qquad -2 \log \Lambda \sim \chi^2(df)$
When the p-value of Mauchly's test is < 0.05, sphericity cannot be assumed.
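For three repeated measures, Mauchly's test can be sketched by hand. The example below is an illustration under stated assumptions, not the slides' own procedure: it uses the three-occasion scores (Start, 1 Month, 2 Months) from the sphericity table, the usual form of Mauchly's criterion $W = \det(T) / (\operatorname{tr}(T)/p)^p$ with $T = C S C'$ for orthonormal contrasts $C$ and $p = k - 1$, and the commonly quoted chi-square approximation $-\{(N - 1) - (2p^2 + p + 2)/(6p)\} \log W$ on $p(p+1)/2 - 1$ df. The closed-form p-value `exp(-x/2)` is valid only because df = 2 here.

```python
from math import exp, log, sqrt
from statistics import mean

# (Start, 1 Month, 2 Months) for each of the 10 subjects.
scores = [
    (64, 62, 74), (63, 63, 62), (66, 65, 71), (106, 100, 84), (67, 66, 66),
    (120, 112, 107), (62, 63, 61), (72, 71, 48), (83, 73, 65), (77, 65, 62),
]
n_subj, k = len(scores), 3
p = k - 1

# Sample covariance matrix S of the k repeated measures.
col_mean = [mean(row[j] for row in scores) for j in range(k)]
S = [
    [
        sum((row[a] - col_mean[a]) * (row[b] - col_mean[b]) for row in scores) / (n_subj - 1)
        for b in range(k)
    ]
    for a in range(k)
]

# Orthonormal contrasts among the three occasions (rows of C).
C = [
    [1 / sqrt(2), -1 / sqrt(2), 0.0],
    [1 / sqrt(6), 1 / sqrt(6), -2 / sqrt(6)],
]

# T = C S C' is the covariance matrix of the contrast scores (2 x 2 here).
T = [
    [sum(C[r][a] * S[a][b] * C[c][b] for a in range(k) for b in range(k)) for c in range(p)]
    for r in range(p)
]

det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
trace_T = T[0][0] + T[1][1]
W = det_T / (trace_T / p) ** p          # Mauchly's criterion, between 0 and 1

correction = (2 * p * p + p + 2) / (6 * p)
chi2 = -((n_subj - 1) - correction) * log(W)
df = p * (p + 1) // 2 - 1
p_value = exp(-chi2 / 2)                # chi-square upper tail, valid for df == 2 only

print(f"W = {W:.3f}, chi-square = {chi2:.2f}, df = {df}, p = {p_value:.3f}")
```

A value of W close to 1 is consistent with sphericity; small values of W (large chi-square) are evidence against it.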

Examples
Latin square / split-plot design
Recall the repeated-measures set-up:
Treatment:   A1   A2   A3
             S1   S1   S1
             S2   S2   S2
             S3   S3   S3
             S4   S4   S4
S1, S2, S3, and S4 indicate different subjects.

Examples
Four assumptions need to be tested/satisfied:
Compound symmetry
- Homogeneity of variance in each column: $\sigma^2_{A1} = \sigma^2_{A2} = \sigma^2_{A3}$
- Homogeneity of covariance between columns: $\sigma_{A1A2} = \sigma_{A2A3} = \sigma_{A3A1}$
No A x S interaction (additivity)
Sphericity
- Variances of the difference scores between pairs are equal: $\sigma^2_{Y_{A1} - Y_{A2}} = \sigma^2_{Y_{A1} - Y_{A3}} = \sigma^2_{Y_{A2} - Y_{A3}}$
The variance of each cell cannot be computed, because each cell holds a single observation.

Examples
Usually, testing sphericity will suffice. Sphericity can be tested using the Mauchly test in SAS.
SAS code:
    proc glm data=temp;
      class sub;
      model a1 a2 a3 = sub / nouni;
      repeated as 3 (1 2 3) polynomial / summary printe;
    run;
    quit;
Results (Sphericity Tests):
Variables               DF   Mauchly's Criterion   Chi-Square   Pr > ChiSq
Transformed Variates    2    Det = 0               6.01         .056
Orthogonal Components   2    Det = 0               6.03         .062
In addition, the PRINTE option provides sphericity tests for each set of transformed variables. If the requested transformations are not orthogonal, the PRINTE option also provides a sphericity test for a set of orthogonal contrasts.

ANOVA with Repeated Measures
Remarks
- Repeated measures ANOVA is a somewhat ad hoc approach to analysis. For each model we need to find the appropriate sums of squares and F-tests.
- If sphericity is violated: use modified tests (Huynh-Feldt or Greenhouse-Geisser), or use MANOVA.
- However, for longitudinal data this covariance model typically has limited scientific appeal.
- The method assumes balanced data (the same number and timing of measurements across subjects). F-test validity is not guaranteed for unbalanced data.

ANOVA with Repeated Measures
Limitations
- ANOVA/MANOVA assume categorical predictors.
- ANOVA/MANOVA do not handle time-dependent covariates (predictors measured over time).
- They assume everyone is measured at the same times (time is categorical) and at equally spaced time intervals.
- You don't get parameter estimates (just p-values).
- Missing data must be imputed.
- They require restrictive assumptions about the correlation structure.

Analysis Procedure
Procedure with PROC GLM

Questions?