Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means

4.1 The Need for Analytical Comparisons

...the between-groups sum of squares averages the differences between the different pairs of means. As such, it cannot provide unambiguous information about the exact nature of treatment effects. Because SS_A is a composite, an F ratio based on more than two treatment levels is known as an omnibus or overall test. That is, with two levels we can say which condition is larger, but with more than two levels we don't know which conditions are actually different (based solely on the F ratio). In order to know which particular means differ, we have to compute specific comparisons between means.

When computing SS_A, we're actually computing how far each condition mean is from every other condition mean. Thus, as seen in Formula 4.1, SS_A is built from the squared distances among all pairs of group means. To illustrate for K&W51 (what else!):

SS_A = [A] - [T] = 3314.25

SS_A = n·Σ(Ȳ_A - Ȳ_T)² = 4·Σ(Ȳ_A - 45.875)² = 4(375.39 + 66.02 + 135.14 + 252.02) = 3314.25

SS_A = (n/a)·Σ over pairs (Ȳ_Aj - Ȳ_Ak)²   (4.1)
     = (4/4)·[(26.5 - 61.75)² + (37.75 - 61.75)² + (57.5 - 61.75)² + (26.5 - 57.5)² + (37.75 - 57.5)² + (26.5 - 37.75)²] = 3314.25

If our ANOVA tells us that we have insufficient evidence to reject H0 (which may reflect a Type II error), then we would consider ways to make our study more powerful, or we might move along to a different study entirely. However, in the presence of a significant F ratio (and more than two conditions), we would need to conduct additional analyses to disambiguate the results. For that purpose, we would need to compute post hoc comparisons. Ideally, we might avoid the omnibus test entirely and conduct a series of specific planned comparisons. The reality we must face, however, is that some people (e.g., journal editors) may not trust that our comparisons were truly planned. Regardless of the type of comparison (planned or post hoc) you may be considering, the actual computation of the comparison is the same.
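To see the equivalence of these two routes to SS_A numerically, here is a minimal Python sketch. It assumes only the four K&W51 condition means and n = 4 per group reported above:

import numpy as np
from itertools import combinations

means = np.array([26.50, 37.75, 57.50, 61.75])   # K&W51 condition means (n = 4 per group)
n, a = 4, len(means)
grand_mean = means.mean()                        # 45.875

ss_a_usual = n * np.sum((means - grand_mean) ** 2)        # n times the squared deviations from the grand mean
ss_a_pairs = (n / a) * sum((y1 - y2) ** 2                 # Formula 4.1: (n/a) times the sum over all pairs
                           for y1, y2 in combinations(means, 2))
print(ss_a_usual, ss_a_pairs)                    # both 3314.25

Either route yields 3314.25, which is exactly why the omnibus SS_A, taken by itself, cannot tell us which individual pairs of means differ.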

4.2 An Example of Planned Comparisons

K&W provide a clear explanation of the value of planned comparisons using McGovern (1964). Her experiment included five treatment conditions in which a second task (after learning a vocabulary list) might interfere with memory for the originally learned vocabulary items. Particular treatments were thought to interfere with particular components of memory (which K&W generically label A, B, and C). One treatment condition (a1) was a control condition that should not interfere with any memory component. Another treatment condition (a2) was thought to interfere with Component A, but not B and C. These hypotheses could actually be translated into a series of planned comparisons. The McGovern study demonstrates the analytical possibilities of experimentation, that is, creating an experimental design that yields several focused comparisons that are central to the purposes of the experiment. Quite obviously, these are planned comparisons that would be tested directly.

4.3 Comparisons Between Treatment Means

First, for terminology, K&W use comparisons, contrasts, and single-df comparisons interchangeably. (Because you're always comparing two means, df_Comp = 1.) A typical comparison is between two means, but that doesn't imply that you can only compare two groups at a time (a simple pairwise comparison). Your null hypothesis for such a pairwise comparison would look like this:

H0: µ1 = µ2   OR   H0: µ1 - µ2 = 0

It is perfectly reasonable to compare combined groups (a complex comparison). Thus, the null hypothesis for a comparison of one group with the combined scores from two other groups would be:

H0: µ1 = (µ2 + µ3)/2   OR   H0: µ1 - (µ2 + µ3)/2 = 0

Because it's not always clear that a complex comparison is reasonable, K&W admonish us to scrutinize any complex comparison. Exactly why are the groups being combined? The symbol ψ represents the comparison, as in:

ψ1 = µ1 - (µ2 + µ3)/2   OR   ψ1 = (+1)(µ1) + (-.5)(µ2) + (-.5)(µ3)

Note the equivalence of the two statements above. The second expression, however, is useful because it introduces the notion of coefficients. Each of the means is multiplied by a coefficient prior to adding the products to get the value of ψ. In general:

ψ = Σ(c_j)(µ_j)   and   Σ(c_j) = 0   (4.2 and 4.3)
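As a quick illustration of Formulas 4.2 and 4.3, the sketch below builds coefficient vectors for the two null hypotheses above and checks that each sums to zero. The population means at the end are invented purely for illustration, not taken from any study:

import numpy as np

c_pairwise = np.array([1.0, -1.0, 0.0])      # H0: mu1 = mu2
c_complex  = np.array([1.0, -0.5, -0.5])     # H0: mu1 = (mu2 + mu3)/2

for c in (c_pairwise, c_complex):
    assert np.isclose(c.sum(), 0.0)          # Formula 4.3: coefficients must sum to zero

mu = np.array([10.0, 12.0, 14.0])            # hypothetical population means
psi = np.dot(c_complex, mu)                  # Formula 4.2: psi = sum of c_j * mu_j
print(psi)                                   # -3.0, so group 1 falls below the average of groups 2 and 3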

For example, in an experiment with 5 levels (e.g., see Table 4.1) you might want to compare groups 2 and 4 with groups 1, 3, and 5, so your complex comparison might look like:

ψ = (+2)(µ1) + (-3)(µ2) + (+2)(µ3) + (-3)(µ4) + (+2)(µ5)

Note that I've avoided fractions by using whole-number weights. Doing so seems easier to me, but note what K&W say about the standard form of coefficients (p. 68). Because we've been dealing with Greek symbols, you should recognize that we're going to have to do some estimation. Thus:

ψ̂ = Σ(c_j)(Ȳ_j)   (4.4)

OK, let's take a step back and see where we're heading. Ultimately, we're going to compute an F ratio for a comparison. To create the F_Comparison, we're going to need an MS_Comparison (MS_ψ) and an MS_Error. What we're doing now is trying to get the MS_Comparison by first computing the SS_Comparison. The formulas that will give us what we want are:

SS_Comp = n(ψ̂)² / Σc_j²   OR, by substitution,   SS_Comp = n[Σ(c_j)(Ȳ_j)]² / Σc_j²   OR   SS_Comp = [Σ(c_j)(A_j)]² / (n·Σc_j²)

The MS_Comp is identical to the SS_Comp, because the df = 1, right? So, all that's left to do to compute the F_Comp is to determine the appropriate MS_Error for the denominator of the F ratio. The appropriate error term is more difficult to determine when you have some concerns about heterogeneity of variance. For the time being, let's assume that we have no concerns about heterogeneity of variance, which means that we could comfortably use the estimate of σ² from the overall ANOVA in our comparisons (MS_S/A).

Let's use K&W51 to compute examples of a pairwise comparison and a complex comparison. First, let's compute a simple pairwise comparison of the 4-Hour Deprivation group and the 20-Hour Deprivation group. The coefficients would be {+1, 0, -1, 0}. Using the formulas, we'd get:

SS_Comp = 4(26.50 - 57.50)² / 2 = 1922   OR   SS_Comp = (106 - 230)² / (4·2) = 1922

Therefore, F_Comp = 1922 / 150.46 = 12.77

[Note that MS_Comp = 1922, because df_Comp = 1. MS_Error is from the overall ANOVA.]
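Here is the same computation as a small Python sketch. The helper function is my own, not something from K&W; it assumes equal group sizes and takes the MS_S/A reported in these notes as the error term:

import numpy as np

def contrast_test(coeffs, means, n, ms_error):
    # Returns SS_Comp and F_Comp for a single-df contrast with equal n per group.
    c = np.asarray(coeffs, dtype=float)
    psi_hat = np.dot(c, means)                     # estimated contrast value (Formula 4.4)
    ss_comp = n * psi_hat ** 2 / np.sum(c ** 2)    # SS_Comp = n * psi_hat^2 / sum(c_j^2)
    return ss_comp, ss_comp / ms_error             # df_Comp = 1, so MS_Comp = SS_Comp

kw51_means = np.array([26.50, 37.75, 57.50, 61.75])   # 4-, 12-, 20-, 28-hour deprivation groups
ss, F = contrast_test([1, 0, -1, 0], kw51_means, n=4, ms_error=150.46)
print(ss, F)                                       # about 1922 and 12.77, matching the hand computation

Rescaling the coefficients (say, to {+2, 0, -2, 0}) changes ψ̂ and Σc² but leaves SS_Comp and F_Comp unchanged, which is why the choice between fractional and whole-number weights is purely a matter of convenience.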

Now, let's compute a complex comparison of the 4-Hour and the 12-Hour Deprivation groups together against the 20-Hour and the 28-Hour Deprivation groups together. The coefficients could be {+1, +1, -1, -1} or {-.5, -.5, +.5, +.5}. The SS_Comp and F_Comp would be:

SS_Comp = [(106 + 151) - (230 + 247)]² / (4·4) = (-220)² / 16 = 3025

F_Comp = 3025 / 150.46 = 20.11

As planned comparisons (among a group of a - 1 or fewer comparisons), you'd evaluate both of these comparisons relative to F_Crit(1, 12) = 4.75.

Note, however, that computing SS_Comp is identical to computing SS_Treatment (SS_A), except that only two groups are involved. That is, you can use the exact same formulas that you've already learned, but you simply apply them to the two groups involved in your comparison. Thus,

SS_Comp.1vs3 = [A] - [T] = (106² + 230²)/4 - (106 + 230)²/8 = 16034 - 14112 = 1922

SS_Comp.1+2vs3+4 = [(106 + 151)² + (230 + 247)²]/8 - (106 + 151 + 230 + 247)²/16 = 36697.25 - 734²/16 = 3025

In essence, then, you could avoid learning any new formulas for comparisons, as long as you recognize that all that's involved is a computation of a SS_Treatment. A further implication is that you can simply have a statistical package compute an ANOVA on the two groups involved in the comparison, and the SS_Treatment that gets printed out is the SS_Comparison. And, of course, because the df for a comparison is 1, SS_Comparison = MS_Comparison. If you weren't concerned about heterogeneity of variance, then you'd compute the F_Comp by dividing the MS_Comp by the MS_S/A from the overall ANOVA (as was shown above).

4.4 Evaluating Contrasts with a t Test

When SPSS, for example, computes a test of a contrast, it reports the results in terms of a t statistic and not F. That said, of course, for two means there is no difference in the interpretation of a t or an F. The interpretations are the same because t² = F. So, by squaring t and tidying up a bit, note what happens to the numerator of the t test. Because a = 2 (for a comparison), you'll see in the formula below that the numerator of the squared t test is the same as SS_A, and because df = 1, MS_A would be the same. The denominator of the t test is the pooled variance estimate (which is the same as MS_S/A). So, the squared t test is really identical to the ANOVA. Neat, huh?

t² = [(Ȳ_A1 - Ȳ_A2) / √(s²_p/n1 + s²_p/n2)]² = (Ȳ_A1 - Ȳ_A2)² / [s²_p(1/n1 + 1/n2)] = [n(Ȳ_A1 - Ȳ_A2)²/2] / s²_p
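Here is a quick numerical check of the t² = F identity using scipy. The two groups of scores are made-up placeholders for illustration only; they are not the K&W51 data:

import numpy as np
from scipy import stats

g1 = np.array([23.0, 28.0, 31.0, 24.0])     # hypothetical scores, group 1
g2 = np.array([55.0, 60.0, 58.0, 57.0])     # hypothetical scores, group 2

t, _ = stats.ttest_ind(g1, g2)              # pooled-variance (equal-variance) t test
F, _ = stats.f_oneway(g1, g2)               # one-way ANOVA on the same two groups
print(t ** 2, F)                            # identical, apart from rounding error

# The same relationship holds for the critical values discussed in the next paragraph:
print(stats.t.ppf(0.975, 10) ** 2)          # two-tailed t_Crit for df = 10, squared: about 4.96
print(stats.f.ppf(0.95, 1, 10))             # F_Crit(1, 10) at alpha = .05: about 4.96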

Another way to convince yourself of the relationship between t and F is to look at the tables of critical values for the two statistics. In Table A.1 (for F), look up F_Crit(1, 10) for α = .05. Then, in Table A.2 (for t), look up t_Crit for df = 10. You should note that t² = F.

K&W discuss the possibility of conducting directional comparisons (pp. 73-75). However, my advice is to eschew directional tests (as well as obfuscation). K&W mention some reasons that directional tests might be problematic, to which I'd add the cynical editor who might well wonder if you really had a directional hypothesis in the first place. K&W illustrate how you can compute confidence intervals for a contrast when you compute t.

4.5 Orthogonal Contrasts

The valuable property of orthogonal comparisons is that they reflect independent or nonoverlapping pieces of information. Thus, knowing the outcome of one comparison gives no indication whatsoever about the outcome of another comparison, if the two are orthogonal. Two comparisons are orthogonal if the sum of the products of the two sets of coefficients is zero. Thus, a test for orthogonality of two comparisons is:

Σ(c_1j)(c_2j) = 0   (4.20)

If all contrasts in a set are orthogonal to one another, they are said to be mutually orthogonal. There can be no more mutually orthogonal contrasts than the degrees of freedom associated with the omnibus effect. That is, with a treatment means, there are only a - 1 comparisons that are orthogonal to each other and to Ȳ_T. Because they represent nonoverlapping information, the sum of the SS_Comp for the a - 1 orthogonal comparisons is SS_A. In this sense, the sum of squares obtained from a set of a treatment means is a composite of the sums of squares associated with a - 1 mutually orthogonal contrasts.

For K&W51, one set of orthogonal comparisons would be:

Comp1:  1   0  -1   0
Comp2:  0   1   0  -1
Comp3:  1  -1   1  -1

Test for Orthogonality:
Comp1 vs. Comp2: (1)(0) + (0)(1) + (-1)(0) + (0)(-1) = 0
Comp1 vs. Comp3: (1)(1) + (0)(-1) + (-1)(1) + (0)(-1) = 0
Comp2 vs. Comp3: (0)(1) + (1)(-1) + (0)(1) + (-1)(-1) = 0
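The same check takes only a couple of lines in Python. This is a minimal sketch using the coefficient sets above; with equal group sizes, the dot product of two coefficient vectors is exactly the sum of products in the orthogonality test:

import numpy as np
from itertools import combinations

contrasts = {
    "Comp1": np.array([1, 0, -1, 0]),
    "Comp2": np.array([0, 1, 0, -1]),
    "Comp3": np.array([1, -1, 1, -1]),
}

for (name1, c1), (name2, c2) in combinations(contrasts.items(), 2):
    print(name1, "vs.", name2, "->", int(np.dot(c1, c2)))   # all zero, so the set is mutually orthogonal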

Furthermore, note that SS_Comp1 = 1922, SS_Comp2 = 1152, and SS_Comp3 = 240.25. If we add all three of these SS, we get 3314.25, which is SS_A from the overall ANOVA. Note that a fourth comparison would, by definition, not be orthogonal (and would lead to a sum of the SS that would necessarily be greater than SS_A).

Orthogonality is an important property of comparisons. However, your comparisons should be driven by theoretical issues rather than by considerations of orthogonality. (Note K&W's comments on the McGovern study, including "You should strive to look at orthogonal comparisons whenever you can, but not to the extent of ignoring substantively important comparisons.")

4.6 Composite Contrasts Derived from Theory

The contrast most sensitive to a pattern of means is the one whose coefficients reflect the same pattern. That is, suppose that you are conducting a study in which you have some reason (e.g., based on past research) to think that the means will fall in a certain pattern. You could then construct a set of coefficients as follows:

1. Obtain your set of predicted means and use them as coefficients. Suppose, for example, that for K&W51 we had reason to believe that the pattern of means would be: 26.5, 37.75, 57.5, and 61.75. (What an amazing prediction, eh?) These coefficients would clearly not sum to zero: {26.5, 37.75, 57.5, 61.75}.

2. Subtract the grand mean (45.875) from each of the predicted means. That will produce a set of coefficients that does sum to zero: {-19.375, -8.125, 11.625, 15.875}.

3. Those coefficients are fairly messy, so to simplify them I could round a bit: {-19.4, -8.1, 11.6, 15.9}. I could also work to turn these coefficients into integers.

The point is that if your predictions about the means are accurate, you will find that the SS_Comp for this one comparison (which involves all 4 means) will be very close to the SS_A from the overall ANOVA. To the extent that your prediction is not accurate, your SS_Comp will not capture the same variability as that found in the SS_A. For the K&W51 data set, you'd obtain:

ψ̂ = (-19.4 × 26.5) + (-8.1 × 37.75) + (11.6 × 57.5) + (15.9 × 61.75) = 828.95

SS_ψ = 4(828.95)² / [(-19.4)² + (-8.1)² + (11.6)² + (15.9)²] = 2748632.41 / 829.34 = 3314.24

The SS_A from the original analysis was 3314.25, so you can see that you've captured virtually all of the variability in the SS_Treatment with just this one contrast.
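The three steps above map directly onto a few lines of Python. This is a minimal sketch using the rounded coefficients from step 3 and the K&W51 means reported in these notes:

import numpy as np

means = np.array([26.50, 37.75, 57.50, 61.75])    # observed K&W51 condition means
c = np.array([-19.4, -8.1, 11.6, 15.9])           # predicted means minus the grand mean, rounded
n = 4

psi_hat = np.dot(c, means)                        # 828.95
ss_psi = n * psi_hat ** 2 / np.sum(c ** 2)        # about 3314.24
ss_a = n * np.sum((means - means.mean()) ** 2)    # 3314.25
print(psi_hat, ss_psi, ss_a)

Had the coefficients been built from the exact (unrounded) deviations, SS_ψ would reproduce SS_A exactly; the tiny discrepancy is just the rounding in step 3.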

As a means of testing theoretical predictions, K&W build on the work of others (e.g., Levin & Neumann, 1999; Rosnow & Rosenthal, 1995). They suggest the construction of a contrast set that assesses the fit of a theoretical prediction with the obtained data, followed by a test of the residual variability (called SS_Failure). This lack-of-fit test should make some sense to you. That is, you may first establish that you have a reasonable theoretical prediction (akin to finding a significant correlation between two variables). However, it may be that the residual variability (akin to 1 - r²) is still quite high. For the example above, obviously, there would be virtually no SS_Failure.

4.7 Comparing Three or More Means

For experiments with several levels of the IV, it may make sense to conduct analyses of subsets of the conditions. K&W provide an example of an experiment with a control condition and several treatment conditions. For that experiment, it may make sense to first test all the treatment conditions to see if they differ. If they don't, then you could possibly combine them in a comparison with the control condition. On the other hand, if any of the treatment conditions differed, you could then compare them individually with the control condition.
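For the kind of subset question raised in section 4.7, one standard approach (consistent with the description above, though the particular numbers below are my own illustration built from the K&W51 means, not an example from K&W) is to compute the SS among just the selected means and test it against the error term from the overall ANOVA:

import numpy as np
from scipy import stats

subset = np.array([37.75, 57.50, 61.75])   # e.g., do the 12-, 20-, and 28-hour groups differ among themselves?
n, ms_error, df_error = 4, 150.46, 12      # n per group and error term from the overall K&W51 ANOVA

ss_subset = n * np.sum((subset - subset.mean()) ** 2)      # computed just like SS_A, but over the subset
df_subset = len(subset) - 1                                # more than two means, so df > 1 here
F = (ss_subset / df_subset) / ms_error
print(ss_subset, F, stats.f.sf(F, df_subset, df_error))    # about 1312.2, F = 4.36, p = .04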

StatView

Simple Pairwise Comparison Using StatView

You can compute simple pairwise comparisons in a number of ways using StatView. If you're considering post hoc comparisons, StatView has a number of options built in, including Scheffé's, Tukey-Kramer (HSD), and Fisher's PLSD. If you're considering planned comparisons, you're on your own.

Let's use the K&W51 data set to conduct a simple pairwise comparison of the 4-Hr and the 20-Hr deprivation groups. That is, a set of coefficients like 1, 0, -1, 0. How would you set up such an analysis in StatView? Probably the least painful way is to use the Exclude Row command (⌘-E). To exclude a row from the analysis, click on the left side of the row (or click-and-drag to highlight a bunch of rows) so that it becomes highlighted, then hit ⌘-E. You'll notice that the row number is gray when the row has been excluded. To later include a row that has been excluded, use the Include Row command (⌘-I). Alternatively, you could simply delete rows from your data set, but then carefully NOT save the changes when you exit, so that your data remain intact. Then proceed with a regular ANOVA, which would produce the output below:

ANOVA Table for Number of Errors (Row exclusion: K57.SV5)
                DF    Sum of Squares    Mean Square    F-Value    P-Value    Lambda    Power
Hrs No Sleep     1          1922.000       1922.000     13.195      .0109    13.195     .863
Residual         6           874.000        145.667

As is often the case, you'd want to use the MS_Comp from this ANOVA, but not the MS_Error or the F-Value. To actually compute the F_Comp, you'd divide 1922 by the MS_Error from the overall ANOVA (150.46), which would give you an F_Comp = 12.77. You'd then test your F against the appropriate F_Critical, which depends on the nature of the comparison (planned, post hoc, etc.).

Complex Comparison Using StatView

I don't know of any easy way to compute complex comparisons in StatView. The brute force method always works, however (don't ya love it?). OK, so suppose that you want to compare the 4-Hr and the 12-Hr Dep groups with the 20-Hr and the 28-Hr Dep groups. That is, you're interested in a comparison like 1, 1, -1, -1. One way that works for sure is to go into your data file and change the values of your IV so that they take on the appropriate levels. That is, all of the people in the 4-Hr or 12-Hr groups must have the same value (e.g., 1) and all of the people in the 20-Hr or 28-Hr groups must have the same value (e.g., 3). When you make those changes, just go ahead with a regular ANOVA, which will produce an output like that seen below:

ANOVA Table for Number of Errors
                DF    Sum of Squares    Mean Square    F-Value    P-Value    Lambda    Power
Hrs No Sleep     1          3025.000       3025.000     20.217      .0005    20.217     .99
Residual        14          2094.750        149.625

Once again, you'd only use the MS_Comp from this output. Dividing 3025 by the MS_Error from the overall ANOVA (150.46) produces an F_Comp = 20.11, which you'd compare with the appropriate F_Critical.

Don't forget to leave the data unchanged when you exit (DON'T save), or else you'll have to re-enter grouping data!

Using Coefficients to Compute Simple or Complex Comparisons in StatView

Sorry...you can't. StatView doesn't support the use of coefficients. SuperANOVA does, though.

SuperANOVA

SuperANOVA does support the use of contrasts (coefficients). First, click on Contrasts at the top of the analysis window (with the IV selected). You'll then see a window like that below (left). Choose the Means comparisons, then click on OK. That will produce the window seen below (right). This window allows you to name the contrast. It also indicates that your comparisons have to balance. That is, the means that you Add Plus are made equivalent to the means that you Add Minus. Let's compare the first two groups with the last group. I would do that by clicking on 1, then Add Plus, 2, then Add Plus, 4, then Add Minus. That comparison would change the window to the window seen below left (except that you can't see the neat colors here). Note that SA automatically assigns the weights to the means.

Comparison 1
Effect: Hrs No Sleep   Dependent: Errors

Cell    Weight
1         .500
2         .500
4       -1.000

df                       1
Sum of Squares    2340.375
Mean Square       2340.375
F-Value             15.555
P-Value              .0019

The figure on the right above shows what you'd see in the analysis window as a result of your contrast analysis. Of course, all this makes perfect sense as a planned comparison when making fewer than a - 1 planned comparisons. If you were using contrasts to produce a post hoc comparison, you'd have to compare your F-Value with an F-Critical using Tukey's procedure. For the most part, other than the use of contrast coefficients, the procedures for simple and complex comparisons in SuperANOVA parallel those found in StatView.
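Finally, for readers without StatView or SuperANOVA, the brute-force recoding idea from the Complex Comparison Using StatView section can be reproduced in a few lines of Python with pandas. This is a hedged sketch: the raw scores are hypothetical placeholders chosen so that the group means (and sums) match those reported for K&W51, although the within-group spread, and hence the two-group error term, is invented:

import pandas as pd

# Hypothetical raw scores whose group means match the reported K&W51 means
# (26.50, 37.75, 57.50, 61.75); the within-group spread is made up.
df = pd.DataFrame({
    "hours":  [4] * 4 + [12] * 4 + [20] * 4 + [28] * 4,
    "errors": [20, 26, 30, 30,  34, 38, 40, 39,  56, 58, 57, 59,  60, 62, 63, 62],
})

# Recode 4 & 12 hr as one group and 20 & 28 hr as the other; the between-groups SS
# of the recoded two-group ANOVA is the SS_Comp for the {+1, +1, -1, -1} contrast.
df["recode"] = df["hours"].map({4: "low", 12: "low", 20: "high", 28: "high"})
grand = df["errors"].mean()
by_group = df.groupby("recode")["errors"]
ss_comp = float((by_group.count() * (by_group.mean() - grand) ** 2).sum())
print(ss_comp, ss_comp / 150.46)    # 3025 and about 20.11, as in the hand computation

As the notes stress for StatView, you would carry forward only the SS_Comp (equivalently, MS_Comp) from such a recoded run and divide it by the MS_S/A from the full four-group ANOVA, rather than use the two-group error term that the recoded analysis itself prints.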