Analysis of variance. April 16, 2009

Transcription:

Analysis of variance
April 16, 2009

Marc Andersen, StatGroup ApS
e-mail: mja@statgroup.dk
http://staff.pubhealth.ku.dk/~pd/v+r/html/

Contents

Comparison of several groups
  One-way ANOVA
  Two-way ANOVA
  Interaction
  Model checking

Acknowledgement for use of presentation: Julie Lyng Forman, Dept. of Biostatistics (2008); Lene Theil Skovgaard, Dept. of Biostatistics (2007, 2006)

ANOVA, April 2009

Comparison of 2 or more groups

  Number of groups   Different individuals          Same individual
  2                  unpaired t-test                paired t-test
  ≥ 2                one-way analysis of variance   two-way analysis of variance

One-way analysis of variance:
  Do the distributions differ between the groups?
  Do the levels differ between the groups?

Example: 22 bypass patients, randomized to 3 different kinds of ventilation during anaesthesia

  Group I:   50% N2O, 50% O2 for 24 hours
  Group II:  50% N2O, 50% O2 during operation
  Group III: 30-50% O2 (no N2O) for 24 hours

          Gr.I    Gr.II   Gr.III
  n       8       9       5
  Mean    316.6   256.4   278.0
  SD      58.7    37.1    33.8

One-way ANOVA

one-way: because we only have one criterion for classifying the observations, here ventilation method
ANalysis Of VAriance: because we compare the variance between groups with the variance within groups

Model:

  $Y_{ij} = \mu_i + \varepsilon_{ij}$

where $Y_{ij}$ is the $j$th observation in group no. $i$, $\mu_i$ is the mean of group no. $i$, and $\varepsilon_{ij}$ is the individual deviation.

Observations are assumed to be independent and to follow a normal distribution (within each group) with the same variance:

  $\varepsilon_{ij} \sim N(0, \sigma^2)$, or equivalently $Y_{ij} \sim N(\mu_i, \sigma^2)$

Model assumptions must be checked!

Hypothesis testing

Usual approach:
  Null hypothesis: group means are equal, $H_0: \mu_i = \mu$
  Alternative hypothesis: group means are not equal
We show that the means are not equal by rejecting the null hypothesis of equality (ref DGA, 8.5 Hypothesis Testing).

ANOVA math: Sums of squares

Decomposition of the deviation from the grand mean:

  $y_{ij} - \bar{y}_{\cdot\cdot} = (y_{ij} - \bar{y}_{i\cdot}) + (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})$

where $y_{ij}$ is the $j$th observation in the $i$th group, $\bar{y}_{i\cdot}$ is the average in the $i$th group, and $\bar{y}_{\cdot\cdot}$ is the total average.

Decomposition of variation (sums of squares):

  $\sum_{i,j} (y_{ij} - \bar{y}_{\cdot\cdot})^2 = \sum_{i,j} (y_{ij} - \bar{y}_{i\cdot})^2 + \sum_{i,j} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$

  total variation = within groups + between groups

Decomposition of variation: total = between + within

  $SS_{total} = SS_{between} + SS_{within}$
  $(n - 1) = (k - 1) + (n - k)$

F-test statistic:

  $F = \frac{MS_{between}}{MS_{within}} = \frac{SS_{between}/(k-1)}{SS_{within}/(n-k)}$

Reject the null hypothesis if F is large, i.e. if the variation between groups is too large compared to the variation within groups.

Usually the analysis is summarized in an analysis of variance table:

  Variation   df      SS         MS            F             P
  Between     k - 1   SS_b       SS_b / df_b   MS_b / MS_w   P(F(df_b, df_w) > F_obs)
  Within      n - k   SS_w       SS_w / df_w
  Total       n - 1   SS_tot
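The decomposition can be tried out on the group summaries (n, mean, SD) reported for the three ventilation groups. The following is a minimal sketch in Python rather than SAS; the function name is an illustrative assumption, and because the inputs are rounded summaries the sums of squares only approximate what a raw-data analysis would give.

```python
# Group summaries (n, mean, SD) as reported for the three ventilation groups.
groups = [(8, 316.6, 58.7), (9, 256.4, 37.1), (5, 278.0, 33.8)]


def oneway_anova_from_summary(groups):
    """Return (ss_between, ss_within, df_between, df_within, F)."""
    n_total = sum(n for n, _, _ in groups)
    k = len(groups)
    # Grand mean is the size-weighted average of the group means.
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    # Between groups: squared distance of each group mean from the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within groups: (n_i - 1) * SD_i^2 recovers each group's sum of squares.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat


ss_b, ss_w, df_b, df_w, f_stat = oneway_anova_from_summary(groups)
print(f"SS_between={ss_b:.1f}  SS_within={ss_w:.1f}  F({df_b},{df_w})={f_stat:.2f}")
```

With these rounded inputs the F statistic comes out at about 3.71 on (2, 19) degrees of freedom.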

Analysis of variance table - anaesthesia example

            df   SS         MS       F      P
  Between    2   15515.88   7757.9   3.71   0.04
  Within    19   39716.09   2090.3
  Total     21   55231.97

$F = 3.71 \sim F(2, 19)$, $P = 0.04$

Weak evidence of non-equality of the three means.

Analysis of variance in SAS

To define the anaesthesia data in SAS, we write

  data ex_redcell;
    input grp redcell;
    cards;
  1 243
  1 251
  1 275
  ...
  3 293
  3 328
  ;
  run;

The variable redcell contains all the measurements of the outcome, and grp contains the method of ventilation for each individual.

Analysis of variance program (the option solution outputs parameter estimates):

  proc glm data=ex_redcell;
    class grp;
    model redcell=grp / solution;
  run;

General Linear Models Procedure
Dependent Variable: REDCELL

  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model              2   15515.7664       7757.8832     3.71      0.0436
  Error             19   39716.0972       2090.3209
  Corrected Total   21   55231.8636

  R-Square   C.V.       Root MSE   REDCELL Mean
  0.280921   16.14252   45.7200    283.227

  Source   DF   Type I SS    Mean Square   F Value   Pr > F
  GRP       2   15515.7664   7757.8832     3.71      0.0436

  Source   DF   Type III SS  Mean Square   F Value   Pr > F
  GRP       2   15515.7664   7757.8832     3.71      0.0436

  Parameter    Estimate         T for H0: Parameter=0   Pr > |T|   Std Error of Estimate
  INTERCEPT    278.0000000 B    13.60                   0.0001     20.44661784
  GRP   1       38.6250000 B     1.48                   0.1548     26.06442584
        2      -21.5555556 B    -0.85                   0.4085     25.50141290
        3        0.0000000 B      .                      .          .

NOTE: The X'X matrix has been found to be singular and a generalized inverse was used to solve the normal equations. Estimates followed by the letter B are biased, and are not unique estimators of the parameters.

Group 3 (the last group) is the reference group.
The estimates for the other groups are differences from this reference group.
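To see how the reference-group parametrisation maps back to group means, the estimates from the solution output can simply be added up. This is a small sketch in Python rather than SAS; the variable names are illustrative assumptions, and the numbers are copied from the output above.

```python
# Estimates copied from the PROC GLM "solution" output (group 3 is the reference).
intercept = 278.0
offsets = {1: 38.625, 2: -21.5555556, 3: 0.0}

# Each group mean is the intercept plus that group's offset.
group_means = {g: intercept + d for g, d in offsets.items()}

for g, m in sorted(group_means.items()):
    print(f"group {g}: estimated mean = {m:.1f}")
```

The three values agree with the group means 316.6, 256.4 and 278.0 in the summary table of the example.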

Interpreting the estimates

Some issues:
  Clinical significance
  Statistical significance
  Provide confidence interval
  Does it make sense?

Multiple comparisons

The F-test shows that there is a difference, but where?

Pairwise t-tests are not suitable because of the risk of mass significance.

Recall that a significance level of $\alpha = 0.05$ means a 5% chance of wrongfully rejecting a true hypothesis (type I error).

The chance of at least one type I error goes up with the number of tests. For $k$ groups we have $m = k(k-1)/2$ possible tests, and the actual significance level can be as bad as $1 - (1 - \alpha)^m$, e.g. 0.40 for $k = 5$.

There is no completely satisfactory solution.

Approximate solutions:
  1. Select a (small) number of relevant comparisons in the planning stage.
  2. Make a graph of the averages ± 2 SEM and judge visually (!), perhaps supplemented with F-tests on subsets of groups.
  3. Modify the t-tests by multiplying the P-values by the number of tests, the so-called Bonferroni correction (conservative).
  4. Use a correction for multiple testing (Dunnett, Tukey) or a (prespecified) multiple testing procedure.

Tukey multiple comparisons in SAS:

  proc glm data=ex_redcell;
    class grp;
    model redcell=grp / solution;
    lsmeans grp / adjust=tukey pdiff cl;
  run;

The GLM Procedure
Least Squares Means
Adjustment for Multiple Comparisons: Tukey-Kramer

Least Squares Means for effect grp
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: redcell

  i/j   1        2        3
  1              0.0355   0.3215
  2     0.0355            0.6802
  3     0.3215   0.6802

Least Squares Means for Effect grp

  i   j   Difference Between Means   Simultaneous 95% Confidence Limits for LSMean(i)-LSMean(j)
  1   2    60.180556                   3.742064   116.619047
  1   3    38.625000                 -27.590379   104.840379
  2   3   -21.555556                 -86.340628    43.229517
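The mass-significance formula quoted in the multiple-comparisons discussion above is easy to tabulate. The sketch below is Python rather than SAS, and the function name is an illustrative assumption. It evaluates the worst-case bound $1 - (1 - \alpha)^m$, which corresponds to $m$ independent tests, so it overstates the risk when the pairwise tests are correlated.

```python
from math import comb


def familywise_error(k, alpha=0.05):
    """Number of pairwise tests among k groups and the bound 1 - (1 - alpha)^m."""
    m = comb(k, 2)  # m = k(k-1)/2 pairwise comparisons
    return m, 1 - (1 - alpha) ** m


for k in range(2, 7):
    m, fwer = familywise_error(k)
    print(f"k={k}: m={m:2d} tests, chance of at least one type I error <= {fwer:.2f}")
```

For k = 5 this reproduces the value 0.40 quoted in the text.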

Visual assessment: the bars represent confidence intervals for the means.

  proc gplot data=ex_redcell;
    plot redcell*grp / haxis=axis1 vaxis=axis2 frame;
    axis1 order=(1 to 3 by 1) offset=(8,8) label=(h=3 'gruppe nr.') value=(h=2) minor=none;
    axis2 offset=(1,1) value=(h=2) minor=none label=(a=90 r=0 h=3 'red cell foliate');
    symbol1 v=circle i=std2mjt l=1 h=2 w=2;
  run;

Model checking

Check if the assumptions are reasonable. (If not, the analysis is unreliable!)

  Variance homogeneity may be checked by performing Levene's test (or Bartlett's test). In case of variance inhomogeneity, we may also perform a weighted analysis (Welch's test), just as in the t-test.
  Normality may be checked through probability plots (or histograms) of residuals, or by a numerical test on the residuals. In case of non-normality, we may use the nonparametric Kruskal-Wallis test.
  Transformation (often logarithms) may help to achieve variance homogeneity as well as normality.

Check of variance homogeneity and normality in SAS

  proc glm data=ex_redcell;
    class grp;
    model redcell=grp;
    means grp / hovtest=levene welch;
    output out=model p=predicted r=residual;
  run;

The output statement stores the residuals in a dataset for further model checking:

  proc univariate normal data=model;
    var residual;
  run;

Output from proc glm: test for variance homogeneity

Levene's Test for Homogeneity of redcell Variance
ANOVA of Squared Deviations from Group Means

  Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
  grp       2   18765720         9382860       4.14      0.0321
  Error    19   43019786         2264199

and weighted ANOVA in case of variance heterogeneity:

Welch's ANOVA for redcell

  Source   DF        F Value   Pr > F
  grp       2.0000   2.97      0.0928
  Error    11.0646

So we are not too sure concerning the group differences...
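The weighted analysis can be approximated from the group summaries (n, mean, SD) of the ventilation example. The following Python sketch (not SAS) uses the standard Welch formula, in which each group is weighted by $n_i / s_i^2$; the function name is an illustrative assumption, and since the inputs are rounded summaries the result only approximates the PROC GLM output.

```python
# Group summaries (n, mean, SD) as reported for the three ventilation groups.
groups = [(8, 316.6, 58.7), (9, 256.4, 37.1), (5, 278.0, 33.8)]


def welch_anova_from_summary(groups):
    """Welch's heteroscedastic one-way ANOVA: returns (F, df1, df2)."""
    k = len(groups)
    weights = [n / sd ** 2 for n, _, sd in groups]  # w_i = n_i / s_i^2
    w_sum = sum(weights)
    weighted_mean = sum(w * m for w, (_, m, _) in zip(weights, groups)) / w_sum
    numerator = sum(
        w * (m - weighted_mean) ** 2 for w, (_, m, _) in zip(weights, groups)
    ) / (k - 1)
    # lam measures how unevenly the weights are spread over the groups.
    lam = sum(
        (1 - w / w_sum) ** 2 / (n - 1) for w, (n, _, _) in zip(weights, groups)
    )
    f_stat = numerator / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


f_welch, df1, df2 = welch_anova_from_summary(groups)
print(f"Welch F = {f_welch:.2f} on ({df1}, {df2:.2f}) degrees of freedom")
```

The result is close to the F value of 2.97 and the denominator degrees of freedom of about 11.06 in the SAS output; small differences are expected from the rounding of the summaries.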

Output from proc univariate: test for normality

Tests for Normality

  Test                 Statistic          p Value
  Shapiro-Wilk         W      0.965996    Pr < W      0.6188
  Kolmogorov-Smirnov   D      0.107925    Pr > D      >0.1500
  Cramer-von Mises     W-Sq   0.043461    Pr > W-Sq   >0.2500
  Anderson-Darling     A-Sq   0.263301    Pr > A-Sq   >0.2500

The 4 tests focus on different aspects of non-normality.
  For small data sets, we rarely get significance.
  For large data sets, we almost always get significance.
  Could look at a probability plot instead.

Non-parametric ANOVA, the Kruskal-Wallis test:

  proc npar1way wilcoxon;
    exact;
    class grp;
    var redcell;
  run;

Wilcoxon Scores (Rank Sums) for Variable redcell
Classified by Variable grp

  grp   N   Sum of Scores   Expected Under H0   Std Dev Under H0   Mean Score
  1     8   120.0            92.00              14.651507          15.000000
  2     9    77.0           103.50              14.974979           8.555556
  3     5    56.0            57.50              12.763881          11.200000

Kruskal-Wallis Test

  Chi-Square                     4.1852
  DF                             2
  Asymptotic Pr > Chi-Square     0.1234
  Exact Pr >= Chi-Square         0.1233

Again, we have lost the significance...

Two-way analysis of variance

Two criteria for subdividing observations, A and B.

Data in two-way layout (not for analysis!!):

            B
  A     1   2   ...   c
  1
  2
  ...
  r

  Effect of both factors
  Perhaps even interaction (effect modification)
  One factor may be individuals or experimental units (e.g. different treatments tried on the same person)
  Repeated measurements

Example: Short-term effect of enalaprilate on heart rate

            Time
  Subject   0        30       60       120      average
  1          96       92       86       92       91.50
  2         110      106      108      114      109.50
  3          89       86       85       83       85.75
  4          95       78       78       83       83.50
  5         128      124      118      118      122.00
  6         100       98      100       94       98.00
  7          72       68       67       71       69.50
  8          79       75       74       74       75.50
  9         100      106      104      102      103.00
  average    96.56    92.56    91.11    92.33    93.14
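Looking back at the Kruskal-Wallis output for the ventilation example earlier in this section, the chi-square statistic can be recomputed from the rank sums alone. This is a minimal Python sketch (not SAS); the function name is an illustrative assumption, and it uses the standard formula without a correction for ties.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H from per-group rank sums (no tie correction)."""
    n_total = sum(sizes)
    # Sanity check: all ranks 1..N must add up to N(N+1)/2.
    if abs(sum(rank_sums) - n_total * (n_total + 1) / 2) > 1e-9:
        raise ValueError("rank sums do not add up to N(N+1)/2")
    spread = sum(r ** 2 / n for r, n in zip(rank_sums, sizes))
    return 12 / (n_total * (n_total + 1)) * spread - 3 * (n_total + 1)


# Rank sums and group sizes copied from the Wilcoxon scores table.
h = kruskal_wallis_h([120.0, 77.0, 56.0], [8, 9, 5])
print(f"Kruskal-Wallis H = {h:.4f}")
```

The value agrees with the chi-square of 4.1852 on 2 degrees of freedom reported by proc npar1way.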

[Figure: line plot ("Spaghettiogram")]

Ideally the time courses are parallel.

Additive model:

  $Y_{st} = \mu + \alpha_s + \beta_t + \varepsilon_{st}$

The two effects ($s$ and $t$) work in an additive way.

The $\varepsilon_{st}$'s are assumed to be independent and normally distributed with mean 0 and identical variances, $\varepsilon_{st} \sim N(0, \sigma^2)$ (check this!)

Variational decomposition:

  $SS_{total} = SS_{subject} + SS_{time} + SS_{residual}$

Analysis of variance table - enalaprilate example

             df   SS       MS       F       P
  Subjects    8   8966.6   1120.8   90.60   <0.0001
  Times       3    151.0     50.3    4.07   0.0180
  Residual   24    296.8     12.4
  Total      35   9414.3

Highly significant difference between subjects (not very interesting).
Significant time differences.

Two-way ANOVA in SAS:

  proc glm data=ex_pulse;
    class subject times;
    model hrate=subject times / solution;
  run;

General Linear Models Procedure
Class Level Information

  Class     Levels   Values
  SUBJECT   9        1 2 3 4 5 6 7 8 9
  TIMES     4        0 30 60 120

Number of observations in data set = 36
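The variational decomposition can be reproduced directly from the heart rate table of the enalaprilate example. The sketch below is Python rather than SAS; variable names are illustrative assumptions, and the data are the 9 x 4 table of the example (subjects in rows, times 0, 30, 60, 120 in columns).

```python
# Heart rate by subject (rows) and time 0, 30, 60, 120 (columns).
hrate = [
    [96, 92, 86, 92],
    [110, 106, 108, 114],
    [89, 86, 85, 83],
    [95, 78, 78, 83],
    [128, 124, 118, 118],
    [100, 98, 100, 94],
    [72, 68, 67, 71],
    [79, 75, 74, 74],
    [100, 106, 104, 102],
]

r, c = len(hrate), len(hrate[0])
grand = sum(map(sum, hrate)) / (r * c)
row_means = [sum(row) / c for row in hrate]
col_means = [sum(row[j] for row in hrate) / r for j in range(c)]

# SS_total = SS_subject + SS_time + SS_residual
ss_total = sum((y - grand) ** 2 for row in hrate for y in row)
ss_subject = c * sum((m - grand) ** 2 for m in row_means)
ss_time = r * sum((m - grand) ** 2 for m in col_means)
ss_resid = ss_total - ss_subject - ss_time

ms_resid = ss_resid / ((r - 1) * (c - 1))
f_subject = (ss_subject / (r - 1)) / ms_resid
f_time = (ss_time / (c - 1)) / ms_resid

print(f"SS: subject={ss_subject:.2f} time={ss_time:.2f} residual={ss_resid:.2f}")
print(f"F:  subject={f_subject:.2f} time={f_time:.2f}")
```

Because the layout is balanced with one observation per cell, the residual sum of squares is what remains after removing the row and column effects, with (9 - 1)(4 - 1) = 24 degrees of freedom.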

Analysis of variance table from output:

General Linear Models Procedure
Dependent Variable: HRATE

  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model             11   9117.52778       828.86616     67.03     0.0001
  Error             24    296.77778        12.36574
  Corrected Total   35   9414.30556

  R-Square   C.V.       Root MSE   HRATE Mean
  0.968476   3.775539   3.51650    93.1389

  Source    DF   Type I SS     Mean Square   F Value   Pr > F
  SUBJECT    8   8966.55556    1120.81944    90.64     0.0001
  TIMES      3    150.97222      50.32407     4.07     0.0180

  Source    DF   Type III SS   Mean Square   F Value   Pr > F
  SUBJECT    8   8966.55556    1120.81944    90.64     0.0001
  TIMES      3    150.97222      50.32407     4.07     0.0180

Parameter estimates from output:

  Parameter       Estimate         T for H0: Parameter=0   Pr > |T|   Std Error of Estimate
  INTERCEPT       102.1944444 B     50.34                  0.0001     2.03024963
  SUBJECT   1     -11.5000000 B     -4.62                  0.0001     2.48653783
            2       6.5000000 B      2.61                  0.0152     2.48653783
            3     -17.2500000 B     -6.94                  0.0001     2.48653783
            4     -19.5000000 B     -7.84                  0.0001     2.48653783
            5      19.0000000 B      7.64                  0.0001     2.48653783
            6      -5.0000000 B     -2.01                  0.0557     2.48653783
            7     -33.5000000 B    -13.47                  0.0001     2.48653783
            8     -27.5000000 B    -11.06                  0.0001     2.48653783
            9       0.0000000 B       .                     .          .
  TIMES     0       4.2222222 B      2.55                  0.0177     1.65769189
            30      0.2222222 B      0.13                  0.8945     1.65769189
            60     -1.2222222 B     -0.74                  0.4681     1.65769189
            120     0.0000000 B       .                     .          .

NOTE: The X'X matrix has been found to be singular and a generalized inverse was used to solve the normal equations. Estimates followed by the letter B are biased, and are not unique estimators of the parameters.

Subject 9 at time 120 minutes is the reference.

Expected values

For subject=3, times=30:

  $\hat{y}_{st} = \hat{\mu} + \hat{\alpha}_s + \hat{\beta}_t = 102.19 - 17.25 + 0.22 = 85.16$

Residuals

  $r_{st} = \text{observed} - \text{expected} = y_{st} - \hat{y}_{st} \approx \varepsilon_{st}$

Residual for subject 3, time 30:

  $r_{32} = 86 - 85.16 = 0.84$

Model checking

Look for:
  Differences in variances (systematic?)
  Non-normality.
  Lack of additivity (interaction). Can only be tested if there is more than one observation for each combination.
  Serial correlation? (Neighboring observations look more alike.)
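The expected value and residual for subject 3 at time 30 can be retraced in a few lines. This is a small Python sketch (not SAS) using the rounded estimates from the worked example; the variable names are illustrative assumptions.

```python
# Rounded estimates from the solution output (reference: subject 9, time 120).
mu_hat = 102.19               # intercept
alpha_hat_subject3 = -17.25   # offset for subject 3
beta_hat_time30 = 0.22        # offset for time 30

observed = 86                 # heart rate of subject 3 at time 30

# Additive model: expected value is the sum of the three estimates.
fitted = mu_hat + alpha_hat_subject3 + beta_hat_time30
residual = observed - fitted

print(f"expected = {fitted:.2f}, residual = {residual:.2f}")
```

Using the unrounded estimates instead would shift both numbers slightly in the second decimal.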

Use the residuals for model checking:
  Probability plot of residuals.
  Plot residuals vs expected values.
  Plot residuals vs group.
  Look for outliers (a large residual means that observed and expected values deviate a lot).
No systematic patterns should be present.

Enalaprilate example:

[Figures: residual plots for the enalaprilate example]

Interaction

Example of two criteria for subdividing individuals: sex and smoking habits
Outcome: FEV1

[Figure: FEV1 by sex and smoking habits]

Here, we see an interaction between sex and smoking.

Possible explanations for interaction:
  biologically different effects of smoking on males and females
  perhaps the women do not smoke as much as the men
  perhaps the effect is relative (to be expressed in %)

Example: The effect of smoking on birth weight

[Figure: birth weight by amount and duration of smoking]

Interaction:
  There is an effect of smoking, but only for those who have been smoking for a long time.
  There is an effect of duration, and this effect increases with the amount of smoking.
  The effect of duration depends upon the amount of smoking, and the effect of amount depends upon the duration of smoking.

Example: Fibrinogen after spleen operation

34 rats are randomized in 2 ways:
  17 have their spleen removed (splenectomy=yes/no)
  8/17 in each group are kept in high altitude (place=altitude/control)

Outcome: Fibrinogen level in mg at day 21

The usual additive model:

  $Y_{spr} = \mu + \alpha_s + \beta_p + \varepsilon_{spr}$, $\varepsilon_{spr} \sim N(0, \sigma^2)$

Splenectomy ($s$ = yes/no) and place ($p$ = altitude/control) have an additive effect.

Model with interaction:

  $Y_{spr} = \mu + \alpha_s + \beta_p + \gamma_{sp} + \varepsilon_{spr}$, $\varepsilon_{spr} \sim N(0, \sigma^2)$

Here, we specify an interaction between splenectomy and place, i.e. the effect of living at high altitude may be thought to depend upon whether or not you have an intact spleen, and vice versa.

Two-way ANOVA with interaction in SAS:

  proc glm data=ex_fibrinogen;
    class splenectomy place;
    model fibrinogen=place splenectomy place*splenectomy / solution;
    output out=model p=predicted r=residual;
  run;

The GLM Procedure
Class Level Information

  Class         Levels   Values
  splenectomy   2        no yes
  place         2        altitude control

Number of observations 34

Dependent Variable: fibrinogen

  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model              3   138402.2949      46134.0983    7.51      0.0007
  Error             30   184321.2639       6144.0421
  Corrected Total   33   322723.5588

  R-Square   Coeff Var   Root MSE   fibrinogen Mean
  0.428857   22.21804    78.38394   352.7941

  Source              DF   Type I SS     Mean Square   F Value   Pr > F
  place                1   57895.84355   57895.84355    9.42     0.0045
  splenectomy          1   79976.50000   79976.50000   13.02     0.0011
  splenectomy*place    1     529.95139     529.95139    0.09     0.7710

  Source              DF   Type III SS   Mean Square   F Value   Pr > F
  place                1   57895.84355   57895.84355    9.42     0.0045
  splenectomy          1   78937.01021   78937.01021   12.85     0.0012
  splenectomy*place    1     529.95139     529.95139    0.09     0.7710

  Parameter                          Estimate         Standard Error   t Value   Pr > |t|
  Intercept                          261.6666667 B    26.12798017      10.01     <.0001
  place altitude                      90.5833333 B    38.08774887       2.38     0.0240
  place control                        0.0000000 B      .                .        .
  splenectomy no                     104.4444444 B    36.95054391       2.83     0.0083
  splenectomy yes                      0.0000000 B      .                .        .
  splenectomy*place no altitude      -15.8194444 B    53.86421101      -0.29     0.7710
  splenectomy*place no control         0.0000000 B      .                .        .
  splenectomy*place yes altitude       0.0000000 B      .                .        .
  splenectomy*place yes control        0.0000000 B      .                .        .

NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter B are not uniquely estimable.

The reference levels are place=control, splenectomy=yes (they come last in the alphabet), so the expected fibrinogen level for these animals is the intercept, 261.67.

For all other groups, we have to add one or more extra estimates, as shown in the table below:

                 place
  splenectomy    control                      altitude
  yes            261.67                       261.67 + 90.58 = 352.25
  no             261.67 + 104.44 = 366.11     261.67 + 104.44 + 90.58 - 15.82 = 440.87

Model checking

Variance homogeneity may be judged from a one-way ANOVA:

The GLM Procedure
Class Level Information

  Class   Levels   Values
  group   4        no_altitude no_control yes_altitude yes_control

Number of observations 34

Levene's Test for Homogeneity of fibrinogen Variance
ANOVA of Squared Deviations from Group Means

  Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
  group     3   2.3669E8         78896856      1.55      0.2211
  Error    30   1.5232E9         50773012

No reason to suspect inhomogeneity.
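The table of expected fibrinogen levels in the interaction model can be rebuilt from the parameter estimates. This is a minimal Python sketch (not SAS); the helper name `expected` and the dictionary layout are illustrative assumptions, and the numbers are copied from the solution output.

```python
# Estimates from the interaction model (reference: place=control, splenectomy=yes).
intercept = 261.6666667
place_effect = {"altitude": 90.5833333, "control": 0.0}
splenectomy_effect = {"no": 104.4444444, "yes": 0.0}
# Only one interaction cell is non-zero; all others are set to 0 by SAS.
interaction = {("no", "altitude"): -15.8194444}


def expected(splenectomy, place):
    """Expected fibrinogen level for one combination of the two factors."""
    return (
        intercept
        + place_effect[place]
        + splenectomy_effect[splenectomy]
        + interaction.get((splenectomy, place), 0.0)
    )


for s in ("yes", "no"):
    for p in ("control", "altitude"):
        print(f"splenectomy={s:3s} place={p:8s} expected={expected(s, p):.2f}")
```

In a model with full interaction the four expected values are simply the four observed cell means, which is why one intercept, two main-effect estimates and one interaction estimate are enough to describe them.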

Normality assumption for residuals (proc univariate normal)

Tests for Normality

  Test                 Statistic          p Value
  Shapiro-Wilk         W      0.964119    Pr < W      0.3193
  Kolmogorov-Smirnov   D      0.124882    Pr > D      >0.1500
  Cramer-von Mises     W-Sq   0.094165    Pr > W-Sq   0.1325
  Anderson-Darling     A-Sq   0.494715    Pr > A-Sq   0.2098

No reason to suspect non-normality.

In the two-way ANOVA, the interaction was not significant (P=0.77), so we omit it from the model:

  proc glm data=ex_fibrinogen;
    class splenectomy place;
    model fibrinogen=place splenectomy / solution clparm;
  run;

Dependent Variable: fibrinogen

  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model              2   137872.3435      68936.1718    11.56     0.0002
  Error             31   184851.2153       5962.9424
  Corrected Total   33   322723.5588

  R-Square   Coeff Var   Root MSE   fibrinogen Mean
  0.427215   21.88815    77.22009   352.7941

  Source        DF   Type III SS   Mean Square   F Value   Pr > F
  place          1   57895.84355   57895.84355    9.71     0.0039
  splenectomy    1   79976.50000   79976.50000   13.41     0.0009

  Parameter         Estimate         Standard Error   t Value   Pr > |t|
  Intercept         265.3888889 B    22.50900351      11.79     <.0001
  place altitude     82.6736111 B    26.53221591       3.12     0.0039
  place control       0.0000000 B      .                .        .
  splenectomy no     97.0000000 B    26.48627265       3.66     0.0009
  splenectomy yes     0.0000000 B      .                .        .

Removal of the spleen leads to a decrease in fibrinogen of approx. 97.00 mg at day 21.
Placing in altitude leads to an increase in fibrinogen of approx. 82.67 mg at day 21.

Residual plots

[Figures: residual plots for checking normality and variance homogeneity]

More complicated analyses of variance

  Three-way or higher-order analysis of variance.
  Latin squares

          1   2   3
    I     A   B   C
    II    B   C   A
    III   C   A   B

    (Cochran & Cox (1957): Experimental Designs, 2nd ed., Wiley)
  Cross-over designs
  Variance component models

Example of a Latin square: A rabbit experiment

  6 rabbits
  Vaccination at 6 different spots on the back
  6 different orders of vaccination
  Swelling is the area of the blister (cm^2)

  spot   rabbit   order   swelling
  1      1        3       7.9
  1      2        5       8.7
  1      3        4       7.4
  1      4        1       7.4
  ...
  6      4        4       5.8
  6      5        1       6.4
  6      6        3       7.7

Some illustrations:

[Figures: plots of the rabbit data, with points labelled 1 to 6]

Fit 3-way analysis of variance, with additive effects

  proc glm;
    class rabbit spot order;
    model swelling=rabbit spot order;
  run;

The GLM Procedure
Class Level Information

  Class    Levels   Values
  rabbit   6        1 2 3 4 5 6
  spot     6        a b c d e f
  order    6        1 2 3 4 5 6

Number of observations 36

Dependent Variable: swelling

  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model             15   17.23000000      1.14866667    1.75      0.1205
  Error             20   13.13000000      0.65650000
  Corrected Total   35   30.36000000

  R-Square   Coeff Var   Root MSE   swelling Mean
  0.567523   10.99883    0.810247   7.366667

  Source   DF   Type III SS   Mean Square   F Value   Pr > F
  rabbit    5   12.83333333   2.56666667    3.91      0.0124
  spot      5    3.83333333   0.76666667    1.17      0.3592
  order     5    0.56333333   0.11266667    0.17      0.9701

The design is balanced, so the test of the effect of one variable (covariate) does not depend on which of the others are still in the model.

How about possible interactions?

  proc glm;
    class rabbit spot order;
    model swelling=rabbit spot order spot*order;
  run;

Dependent Variable: swelling

  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model             35   30.36000000      0.86742857     .         .
  Error              0    0.00000000       .
  Corrected Total   35   30.36000000

  Source       DF   Type I SS     Mean Square   F Value   Pr > F
  rabbit        5   12.83333333   2.56666667     .         .
  spot          5    3.83333333   0.76666667     .         .
  order         5    0.56333333   0.11266667     .         .
  spot*order   20   13.13000000   0.65650000     .         .

There is no room for interaction, since there is only one observation for each combination of spot and order!
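The degrees-of-freedom bookkeeping behind this conclusion, and the F ratios of the additive model, can be retraced from the numbers in the two outputs. The sketch below is Python rather than SAS; variable names are illustrative assumptions, and the sums of squares and the 20 interaction degrees of freedom are copied from the PROC GLM output.

```python
n_obs = 36

# Degrees of freedom and mean squares of the additive Latin square model.
df = {"rabbit": 5, "spot": 5, "order": 5}
mean_square = {"rabbit": 2.56666667, "spot": 0.76666667, "order": 0.11266667}
ss_error = 13.13

# Error df: what is left of the n - 1 total after the three main effects.
df_error = (n_obs - 1) - sum(df.values())
ms_error = ss_error / df_error

# Each F ratio compares an effect's mean square with the error mean square.
f_ratio = {name: ms / ms_error for name, ms in mean_square.items()}
for name, f in f_ratio.items():
    print(f"{name:6s} F({df[name]},{df_error}) = {f:.2f}")

# Adding spot*order absorbs all remaining df (20, as reported by PROC GLM).
df_interaction = 20
df_error_with_interaction = df_error - df_interaction
print(f"error df with spot*order in the model: {df_error_with_interaction}")
```

With zero error degrees of freedom there is no residual mean square left to divide by, which is why the second output shows missing F values.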