Postgraduate course in ANOVA and Repeated Measurements
Day 2: Repeated measurements (part 2)
Mogens Erlandsen, Department of Biostatistics, Aarhus University

[Figure: CVP (mean and sd) at measurement times 1-11]

The within-subject variation is the relevant variation when analyzing changes over time. So: how can we estimate the within-subject variation and the between-subject variation?

Univariate Repeated Measurements ANOVA using the anova command

In order to use ANOVA we need stronger assumptions:

3) The standard deviation σ_T is the same for all measurements, and the correlations between any two (different) measurements on the same subject are equal, i.e.

   σ_T² = σ_B² + σ_W²     and the correlation = σ_B² / σ_T²

Note: The default behaviour of Stata's anova command is to test effects against the within-subject standard deviation. This is wrong if the effect is a between-subjects effect; in that case Stata should be told. See next slide.

Example EVF, continued (data in long format)

Test 1: Hypothesis H2: parallel curves. This test can be performed by a 3-way ANOVA with id (subject identification), group, time, and group#time (interaction) in the model; group is a between-subjects effect.

Stata 11:

  anova evf group / id|group time time#group, repeated(time)

The command wsanova (must be downloaded) might be easier:

  wsanova evf time, id(id) between(group) epsilon

(almost the same output!)

Remember: set matsize 800, permanently before using the anova commands.
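The decomposition σ_T² = σ_B² + σ_W² can be estimated from balanced long-format data using one-way random-effects mean squares. A minimal sketch in Python (the function name and interface are illustrative, not a Stata command):

```python
import numpy as np

def variance_components(y, subject):
    """Estimate between- and within-subject variance from balanced
    repeated measurements via one-way ANOVA mean squares.
    y: 1-D array of measurements; subject: matching array of subject ids.
    Returns (var_between, var_within)."""
    y = np.asarray(y, dtype=float)
    subject = np.asarray(subject)
    ids = np.unique(subject)
    k = len(y) // len(ids)                 # measurements per subject (balanced)
    grand = y.mean()
    subj_means = np.array([y[subject == s].mean() for s in ids])
    # between-subject mean square (subject means around the grand mean)
    ms_between = k * np.sum((subj_means - grand) ** 2) / (len(ids) - 1)
    # within-subject mean square (residuals around each subject's own mean)
    ss_within = sum(np.sum((y[subject == s] - y[subject == s].mean()) ** 2)
                    for s in ids)
    ms_within = ss_within / (len(ids) * (k - 1))
    var_within = ms_within
    var_between = max((ms_between - ms_within) / k, 0.0)  # method-of-moments
    return var_between, var_within
```

With these two components, σ_T² is their sum and the within-subject correlation is var_between / (var_between + var_within).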
The Univariate Repeated Measurements ANOVA:

  anova evf group / id|group time time#group, repeated(time)

Output (continued):

  Between-subjects error term:  id|group
                       Levels:  30       (28 df)
       Lowest b.s.e. variable:  id
       Covariance pooled over:  group    (for repeated variable)

                Number of obs =     180     R-squared     = 0.6878
                Root MSE      = .027676     Adj R-squared = 0.6008

  Source       | Partial SS    df   MS           F       Prob > F
  -------------+---------------------------------------------------
  Model        | .236228898    39   .006057151     7.91   0.0000
  group        | .00008         1   .00008         0.01   0.9085   (between subjects)
  id|group     | .166484455    28   .005945873
  time         | .031271113     5   .006254223     8.17   0.0000   (within subjects)
  time#group   | .038393331     5   .007678666    10.02   0.0000   (Test 1)
  Residual     | .107235557   140   .000765968
  -------------+---------------------------------------------------
  Total        | .343464455   179   .001918796

  Repeated variable: time   (same ANOVA table as the previous slide)

                      Huynh-Feldt epsilon        =  0.885
                      Greenhouse-Geisser epsilon =  0.797
                      Box's conservative epsilon =  0.2000

                                  ------------ Prob > F ------------
  Source       df     F       Regular    H-F       G-G       Box
  time          5     8.17    0.0000     0.0000    0.0000    0.0080
  time#group    5    10.02    0.0000     0.0000    0.0000    0.0037
  Residual    140

Some corrections of the p-value have been proposed for when the assumptions (mainly assumption 3) are violated. The corrected p-values will normally be larger than the regular one.

How can we use the four/three p-values above? The following has been proposed:

  If the regular/uncorrected p-value is not significant (>0.05), then stop and accept (fail to reject) the hypothesis;
  else if the G-G p-value is significant (<0.05), then stop and reject the hypothesis;
  else if the Box p-value is significant (<0.05), then stop and reject the hypothesis;
  else stop and accept (fail to reject) the hypothesis.
wsanova evf time, id(id) between(group) epsilon

                Number of obs =     180     R-squared     = 0.6878
                Root MSE      = .027676     Adj R-squared = 0.6008

  Source            | Partial SS    df   MS           F       Prob > F
  ------------------+---------------------------------------------------
  Between subjects: | .00008         1   .00008         0.01   0.9085
    group           | .00008         1   .00008         0.01   0.9085
    id*group        | .166484455    28   .005945873
  Within subjects:  | .069664444    10   .006966444     9.09   0.0000
    time            | .031271113     5   .006254223     8.17   0.0000
    time*group      | .038393331     5   .007678666    10.02   0.0000
  Residual          | .107235557   140   .000765968
  ------------------+---------------------------------------------------
  Total             | .343464455   179   .001918796

  Note: Within-subjects F-test(s) above assume sphericity of residuals;
  p-values corrected for lack of sphericity appear below.

  Greenhouse-Geisser (G-G) epsilon: 0.797
  Huynh-Feldt (H-F) epsilon:        0.885

                              Sphericity   G-G        H-F
  Source        df    F       Prob > F     Prob > F   Prob > F
  time           5    8.17    0.0000       0.0000     0.0000
  time*group     5   10.02    0.0000       0.0000     0.0000

  Kirk (1982)
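The epsilon values reported above can in principle be computed from the within-subject covariance matrix. A sketch of the Greenhouse-Geisser epsilon (assuming a balanced design; the function name is illustrative, not part of wsanova):

```python
import numpy as np

def gg_epsilon(S):
    """Greenhouse-Geisser epsilon for a k x k within-subject covariance
    matrix S.  epsilon = 1 under sphericity (compound symmetry), and its
    lower bound is 1/(k-1), which is Box's conservative epsilon."""
    S = np.asarray(S, dtype=float)
    k = S.shape[0]
    P = np.eye(k) - np.ones((k, k)) / k        # double-centering projector
    Sc = P @ S @ P                             # covariance of the contrasts
    # epsilon = trace(Sc)^2 / ((k-1) * trace(Sc @ Sc)); Sc is symmetric,
    # so trace(Sc @ Sc) equals the sum of its squared elements
    return np.trace(Sc) ** 2 / ((k - 1) * np.sum(Sc * Sc))
```

With 6 timepoints (k = 6), the lower bound 1/(k-1) = 0.2 matches Box's conservative epsilon in the anova output.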
Test 3 (for each group): H4: no changes over time.

wsanova evf time if group==1, id(id) epsilon

                Number of obs =      90     R-squared     = 0.6439
                Root MSE      =  .03559     Adj R-squared = 0.5472

  Source     | Partial SS    df   MS           F       Prob > F
  -----------+---------------------------------------------------
  id         | .092695557    14   .006621111
  time       | .067618889     5   .013523778    10.68   0.0000
  Residual   | .088664443    70   .001266635
  -----------+---------------------------------------------------
  Total      | .248978889    89   .002797516

  Note: Within-subjects F-test(s) above assume sphericity of residuals;
  p-values corrected for lack of sphericity appear below.

  Greenhouse-Geisser (G-G) epsilon: 0.6795
  Huynh-Feldt (H-F) epsilon:        0.936

                              Sphericity   G-G        H-F
  Source        df    F       Prob > F     Prob > F   Prob > F
  time           5   10.68    0.0000       0.0000     0.0000

wsanova evf time if group==2, id(id) epsilon

                Number of obs =      90     R-squared     = 0.8033
                Root MSE      = .016288     Adj R-squared = 0.7499

  Source     | Partial SS    df   MS           F       Prob > F
  -----------+---------------------------------------------------
  id         | .073788898    14   .005270636
  time       | .002045555     5   .000409111     1.54   0.1881
  Residual   | .018571114    70   .000265302
  -----------+---------------------------------------------------
  Total      | .094405567    89   .001060737

  Greenhouse-Geisser (G-G) epsilon: 0.6453
  Huynh-Feldt (H-F) epsilon:        0.8613

                              Sphericity   G-G        H-F
  Source        df    F       Prob > F     Prob > F   Prob > F
  time           5    1.54    0.1881       0.2143     0.1980

(The uncorrected "Sphericity" p-value is a lower bound for the corrected p-values.)

If we want to estimate the within- and between-subject standard deviations (σ_T² = σ_B² + σ_W²), one can use the xtmixed command. We have four variables: evf, id, time, group:

xi: xtmixed evf i.time*i.group || ///
id:, nofetable noheader nogroup nostderr nolrtest

Part of the output:

  ------------------------------------------------------------
    Random-effects Parameters  |  Estimate
  -----------------------------+------------------------------
  id: Identity                 |
                     sd(_cons) |  .029383    <- between-subject sd
  -----------------------------+------------------------------
                  sd(Residual) |  .0276761   <- within-subject sd
  ------------------------------------------------------------

From xtmixed we have sd_W = .0276761 and sd_B = .029383, and then we can calculate

  s²_W = 0.000765967
  s²_B = 0.00086336
  s²_T = s²_B + s²_W = 0.0016293,  so  s_T = 0.040364

The (estimated) correlation between two measurements on the same subject: s²_B / s²_T = 0.53.
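The arithmetic from the sd estimates to the correlation can be checked directly. A minimal sketch; the sd values below are the between- and within-subject sds consistent with the variances and the correlation (0.53) quoted in the text:

```python
sd_between = 0.029383     # sd(_cons): between-subject sd from xtmixed
sd_within = 0.0276761     # sd(Residual): within-subject sd from xtmixed

var_b = sd_between ** 2           # between-subject variance
var_w = sd_within ** 2            # within-subject variance
var_t = var_b + var_w             # total variance: s_T^2 = s_B^2 + s_W^2
icc = var_b / var_t               # correlation between two measurements
                                  # on the same subject
print(round(icc, 2))              # 0.53
```

This intraclass-correlation form of the within-subject correlation is exactly the quantity assumed equal for all pairs of timepoints in assumption 3.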
We can look at each group separately:

xi: xtmixed evf i.time || id: if group==1 ///
    , nofetable noheader nogroup nostderr nolrtest

    Random-effects Parameters  |  Estimate   Std. Err.   [95% Conf. Interval]
  -----------------------------+---------------------------------------------
  id: Identity                 |
                     sd(_cons) |  .0298733   ...
  -----------------------------+---------------------------------------------
                  sd(Residual) |  .0355898   ...

xi: xtmixed evf i.time || id: if group==2 ///
    , nofetable noheader nogroup nostderr nolrtest

    Random-effects Parameters  |  Estimate   Std. Err.   [95% Conf. Interval]
  -----------------------------+---------------------------------------------
  id: Identity                 |
                     sd(_cons) |  .028889    ...
  -----------------------------+---------------------------------------------
                  sd(Residual) |  .0162881   ...

Estimates in each group:

                 grp 1     grp 2
  s²_W          0.00127   0.00027
  s²_B          0.00089   0.00083
  s²_T          0.00216   0.00110
  s_W           0.0356    0.0163
  s_B           0.0299    0.0289
  s_T           0.0465    0.0332
  Correlation   0.4133    0.7587

We can see that the estimates of the between-subject variation (s²_B) are almost equal, but the within-subject variations (s²_W) are different, and hence so are the total variation and the correlation.

Remarks: The correlations are expected to be positive (why?), but in special cases one might get negative correlations (e.g. weights of mice with a limited amount of food). We can compare the estimates above with the standard deviations and correlation calculated from the 6 variables evf1, evf2, ..., evf6.

Conclusion:
We found a significant difference between the groups with respect to changes over time (p<0.004).
We found statistically significant changes over time in the CPB-group (p<0.006) but not in the Sham-group (p>0.19).

Checking the model: An important, but often suppressed, part of the analysis is to check whether the assumptions for the analysis are fulfilled sufficiently (a weak statement), or whether a transformation (ln-transformation?)
of the data is better, or whether we need to look for an analysis with (maybe) weaker assumptions.
Checking the model: The tests are normally F-tests, and:

- The result of the F-test is not affected by moderate departures from normality, especially for large numbers of observations in each group.
- The F-test is more sensitive to the assumption of equal variances/standard deviations, unless the sample sizes in the groups are almost equal. (One can reduce the degrees of freedom, as in the t-test with unequal variances.)

Assumptions:

Test 1 (parallel curves):
1) All the differences between two timepoints are multivariate normally distributed within groups.
2) The sds and the correlations between the differences should be the same in the two groups (mvtest).

Test 3 (no change over time, within a group):
1) All the differences between two timepoints are multivariate normally distributed.

Example (evf):
[Figure: normal probability plots of the differences (dif 1-2, 2-3, 3-4, 4-5, 5-6, 1-6) for group 1 and group 2]
[Figure: scatter plots for (some of) the differences, e.g. d1_2 against d2_3]

The variation within the groups should be equal for each set of differences (and all equal if we use the ANOVA).

We can also use the figures from the paired analysis (see Basic Biostatistics): difference (or change) versus average (or sum), i.e. a Bland-Altman plot. Look for increasing (or decreasing) changes as the average increases, and/or increasing variation as the average increases. If so, then a ln-transformation of the data may be appropriate.

[Figure: dif-ave plots of the differences in each group]

mvtest can also test for normality:

mvtest norm d1_2 d2_3 d3_4 d4_5 d5_6 if group==1, stats(all)

  Test for multivariate normality
    Mardia mskewness = 13.03587    chi2(35) = 41.715    Prob>chi2 = 0.2019
    Mardia mkurtosis = 31.6106     chi2(1)  =  0.61     Prob>chi2 = 0.434
    Henze-Zirkler    = .7465485    chi2(1)  =  0.000    Prob>chi2 = 0.991
    Doornik-Hansen                 chi2(10) =  8.935    Prob>chi2 = 0.5383
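The quantities behind a Bland-Altman style check are easy to compute from two timepoints. An illustrative helper (not a Stata command):

```python
import numpy as np

def diff_vs_average(x1, x2):
    """Paired-data quantities for a Bland-Altman style check:
    per-subject difference and average between two timepoints.
    A clearly nonzero correlation between difference and average
    suggests changes grow with the level (try a log transform)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    diff = x2 - x1                    # change between the two timepoints
    avg = (x1 + x2) / 2               # subject's average level
    r = np.corrcoef(diff, avg)[0, 1]  # trend of change with level
    return diff, avg, r
```

Plotting diff against avg gives the dif-ave plot described above; the returned correlation is a crude numerical summary of the trend one looks for in the figure.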
mvtest can also test for normality (bivariate):

mvtest norm d1_2 d2_3 d3_4 d4_5 d5_6 if group==1, biv

  Doornik-Hansen test for bivariate normality
  --------------------------------------------------------
    Pair of variables         chi2    df   Prob>chi2
  ---------------------------+----------------------------
    d1_2   d2_3              1.92      4   0.7499
    d1_2   d3_4              1.66      4   0.7986
    d1_2   d4_5              2.38      4   0.6663
    d1_2   d5_6              6.63      4   0.1567
    d2_3   d3_4              3.50      4   0.477
    d2_3   d4_5              3.64      4   0.4571
    d2_3   d5_6              8.7       4   0.0684
    d3_4   d4_5              2.01      4   0.7330
    d3_4   d5_6              7.25      4   0.1230
    d4_5   d5_6              7.03      4   0.134
  --------------------------------------------------------

mvtest can also test for normality (univariate):

mvtest norm d1_2 d2_3 d3_4 d4_5 d5_6 if group==1, uni

  Test for univariate normality
                                                  ------ joint ------
    Variable   Pr(Skewness)   Pr(Kurtosis)   adj chi2(2)   Prob>chi2
    d1_2       0.658          0.9303          0.25          0.8845
    d2_3       0.1871         0.5130          2.51          0.2854
    d3_4       0.307          0.8933          1.18          0.5553
    d4_5       0.496          0.818           1.56          0.4595
    d5_6       0.1604         0.0740          5.08          0.0788

Conclusion: The assumptions (normality) seem to be OK; similar results for group 2.
Remark: be careful; a lot of tests.

Assumption (the univariate (ANOVA) approach):
3) The standard deviation σ_T is the same for all measurements, and the correlations between any two (different) measurements on the same subject are equal.

. mvtest cov evf1 evf2 evf3 evf4 evf5 evf6 if group==1, compound

  Test that covariance matrix is compound symmetric
    Adjusted LR chi2(19) = 27.88    Prob > chi2 = 0.0858

. mvtest cov evf1 evf2 evf3 evf4 evf5 evf6 if group==2, compound

  Test that covariance matrix is compound symmetric
    Adjusted LR chi2(19) = 27.7     Prob > chi2 = 0.0891

Conclusion: We accept the hypothesis (compound symmetry) for each group.

Checking the assumptions for the ANOVA approach:
[Figure: residuals vs. linear prediction and residual probability plots, group 1 and group 2]
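Assumption 3 says the within-subject covariance matrix is compound symmetric. The matrix implied by the two variance components can be written down directly (a sketch; the function name is illustrative):

```python
import numpy as np

def compound_symmetric(k, var_between, var_within):
    """Covariance matrix implied by assumption 3 for k timepoints:
    every diagonal entry is var_between + var_within (equal total
    variance) and every off-diagonal entry is var_between (equal
    covariance between any two timepoints)."""
    return var_between * np.ones((k, k)) + var_within * np.eye(k)
```

The compound-symmetry test above (mvtest cov ..., compound) asks whether the observed 6 x 6 covariance matrix of evf1, ..., evf6 is compatible with this structure; the common correlation is var_between / (var_between + var_within).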
Conclusion: The evf-measurements seem to fulfil the assumptions about the normal distribution, but there are problems with the standard deviations/correlations between the groups. The Univariate Repeated Measurements ANOVA may be appropriate (for each group separately), and one can state the two/three standard deviations for the two groups (e.g. in a figure showing the mean curves). (An analysis of ln-transformed data gives almost the same result.)

Example (distance):

[Figure: mean distance vs. time (ages 8, 10, 12, 14) for boys and girls]

anova dist sex / id|sex time sex#time, repeated(time)

                Number of obs =     108     R-squared     = 0.8386
                Root MSE      = 1.40536     Adj R-squared = 0.7697

  Between-subjects error term:  id|sex
                       Levels:  27       (25 df)
       Lowest b.s.e. variable:  id
       Covariance pooled over:  sex      (for repeated variable)

  Source     | Partial SS    df   MS            F       Prob > F
  -----------+----------------------------------------------------
  Model      | 769.564289    32   24.048884      12.18   0.0000
  sex        | 140.464857     1   140.464857      9.29   0.0054
  id|sex     | 377.914773    25   15.1165909
  time       | 209.436974     3   69.812325      35.35   0.0000
  sex#time   | 13.9925295     3   4.66417649      2.36   0.0781
  Residual   | 148.127841    75   1.97503788
  -----------+----------------------------------------------------
  Total      | 917.69213    107   8.57656196

  Repeated variable: time

                      Huynh-Feldt epsilon        =  1.0156
                      *Huynh-Feldt epsilon reset to 1.0000
                      Greenhouse-Geisser epsilon =  0.867
                      Box's conservative epsilon =  0.3333

                                  ------------ Prob > F ------------
  Source       df     F       Regular    H-F       G-G       Box
  time          3    35.35    0.0000     0.0000    0.0000    0.0000
  sex#time      3     2.36    0.0781     0.0781    0.0878    0.1369
  Residual     75

Conclusion: We found no significant difference between the groups (sexes) with respect to changes over time (p>0.078).
anova dist sex / id|sex time sex#time, repeated(time)

(same output as the previous slide)

If we accept H2 (parallel curves), we can test whether the two mean curves are equal. It is exactly the same test as day 2, part 1, i.e. equal to a t-test on the average of the 4 measurements of distance. All three assumptions should be fulfilled.

If we accept H2 (parallel curves), we can also test H4 (no changes over time) for both groups in one test. If we instead perform a test for each of the groups, we can get two different answers, or we can accept H4 for both groups separately due to low power.

If there are problems with the assumptions, we can use a permutation test:

permute sex r(F), reps(10000): mvtest mean d8_10 d10_12 d12_14, by(sex) het

  Monte Carlo permutation results            Number of obs = 27

    command: mvtest mean d8_10 d10_12 d12_14, by(sex) het
      _pm_1: r(F)
    permute var: sex

  T       T(obs)      c      n       p=c/n    SE(p)    [95% Conf. Interval]
  ------------------------------------------------------------------------
  _pm_1   3.140787    296    10000   0.0296   0.0017   .0263658   .0331118

  Note: confidence interval is with respect to p=c/n.
  Note: c = #{T >= T(obs)}

Remarks: We now have more than one way to analyze the data. Which one (if any) shall we choose? How can we describe the analysis? How can we describe the results?
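The Monte Carlo idea used by permute can be sketched generically: permute the group labels, recompute the statistic, and report the proportion of permuted statistics at least as large as the observed one. An illustrative Python sketch (not the Stata implementation):

```python
import numpy as np

def permutation_test(stat, data, labels, reps=10000, seed=1):
    """Monte Carlo permutation test.
    stat(data, labels) -> scalar test statistic (large = more extreme).
    Returns p = c/n, where c counts permuted statistics >= the observed
    one, mirroring Note: c = #{T >= T(obs)} in the permute output."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    t_obs = stat(data, labels)           # statistic on the real labels
    count = 0
    for _ in range(reps):
        perm = rng.permutation(labels)   # shuffle group membership
        if stat(data, perm) >= t_obs:
            count += 1
    return count / reps
```

Under the null hypothesis the group labels are exchangeable, so the permutation distribution of the statistic is a valid reference distribution even when the normality assumptions fail.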
Depending on what we can assume, we can try to answer these questions (Day 4).

Conclusion: We reject H2 (p=0.030); the difference in changes over time between the two groups is statistically significant.