PSYC 331 STATISTICS FOR PSYCHOLOGISTS Session 4 A PARAMETRIC STATISTICAL TEST FOR MORE THAN TWO POPULATIONS Lecturer: Dr. Paul Narh Doku, Dept of Psychology, UG Contact Information: pndoku@ug.edu.gh College of Education School of Continuing and Distance Education 2014/2015 2016/2017 godsonug.wordpress.com/blog
Session Overview This session builds upon previous sessions and provides further insight into some parametric statistical concepts that will help in the testing of hypotheses. The goal of this session is to equip students with the ability to explain the terminology of analysis of variance (ANOVA). Dr. P. N. Doku, Slide 2
Session Outline The key topics to be covered in the session are as follows: The analysis of variance (ANOVA) procedure The general logic of ANOVA Computational procedures Post-hoc analysis: Multiple comparisons following the ANOVA test Worked example and exercises based on the One-Way ANOVA test Introduction to Two-Way analysis of variance (Two-Way ANOVA) Slide 3
Reading List Opoku, J. Y. (2007). Tutorials in Inferential Social Statistics. (2nd Ed.). Accra: Ghana Universities Press. Pages 85-109 Slide 4
Analysis of Variance The analysis of variance is the parametric procedure for determining whether significant differences occur in an experiment with three or more sample means. However, in a research study or experiment involving only two conditions of the independent variable (two sample means), you may use either a t-test or the ANOVA, and the outcome of the analysis will be the same. Slide 5
Experiment-Wise Error The probability of making a Type I error over a series of individual statistical tests or comparisons in an experiment is called the experiment-wise error rate. When we use a t-test to compare only two means in an experiment, the experiment-wise error rate equals the α we have selected. Slide 6
Experiment-Wise Error When there are more than two means in an experiment, the multiple t-tests result in an experiment-wise error rate much larger than the α we have selected. Using the ANOVA allows us to make all our decisions and keep the experiment-wise error rate equal to α. Slide 7
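To see how quickly the error rate inflates, here is a minimal Python sketch (not from the original slides). It treats the pairwise t-tests as if they were independent, which is only an approximation, but it shows that with three means and therefore three pairwise comparisons at α = .05 the experiment-wise error rate is already about .14 rather than .05.

```python
# Approximate experiment-wise error rate for c pairwise t-tests.
# Assumes the comparisons are independent, so this is only an illustration
# of how the error rate grows beyond the chosen alpha.
from math import comb

alpha = 0.05
k = 3                               # number of group means in the experiment
c = comb(k, 2)                      # number of pairwise t-tests (3 when k = 3)
experiment_wise = 1 - (1 - alpha) ** c
print(c, round(experiment_wise, 3))  # 3 comparisons -> about 0.143
```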
An Overview of One Way ANOVA ANalysis Of VAriance is abbreviated as ANOVA. The test statistic for the ANOVA is called the F ratio. There is a single independent variable, hence the name One-Way. An independent variable is also called a factor. Each condition of the independent variable is called a level or treatment. Differences produced by the independent variable are called a treatment effect. Slide 8
Requirements for using the F ratio 1) Must be a comparison between three or more means. 2) Must be working with interval data. 3) Our sample must have been collected randomly from the research population. 4) We must assume that the sample characteristics are normally distributed. 5) We must assume that the variances of the populations the samples come from are equal (homogeneity of variance).
Between-Subjects A one-way ANOVA is performed when one independent variable is tested in the experiment When an independent variable is studied using independent samples in all conditions, it is called a between-subjects factor A between-subjects factor involves using the formulas for a between-subjects ANOVA 10
Within-Subjects Factor When a factor is studied using related (dependent) samples in all levels, it is called a within-subjects factor This involves a set of formulas called a within-subjects ANOVA 11
Diagram of a Study Having Three Levels of One Factor 12
Null and Alternate Hypotheses Null hypothesis: H0: μ1 = μ2 = … = μk. Alternate hypothesis: states that at least two of the population means differ. Ha: not all μk are equal. 13
The ANOVA (F) Test The statistic for the ANOVA is F When F obs is significant, it indicates only that somewhere among the means at least two of them differ significantly It does NOT indicate which specific means differ significantly When the F-test is significant, we perform post hoc comparisons to determine which specific means differ significantly
Computation of the ANOVA (F) Test The Analysis of Variance is a multi-step process. 1. Sum of Squares 2. Mean Square 3. F Ratio Slide 15
Sum of Squares The computations for the ANOVA require the use of several sums of squared deviations. The sum of squares is simply the squared deviations of a set of scores around the mean of those scores, added up. It is symbolized as SS.
Sum of Squares Comparing Groups: When groups are compared, there is more than one type of sum of squares: the Total Sum of Squares (SS total), the Between Groups Sum of Squares (SS between), and the Within Groups Sum of Squares (SS within). Each type represents the sum of squared deviations around a different mean.
Computational Formulae for SS
SS T = ΣX² − (ΣX)²/N
SS B = (ΣX1)²/n1 + (ΣX2)²/n2 + … + (ΣXk)²/nk − (ΣX)²/N
SS W = ΣX1² + ΣX2² + … + ΣXk² − [ (ΣX1)²/n1 + (ΣX2)²/n2 + … + (ΣXk)²/nk ]
Slide 18
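As a quick check of these computational formulas, here is a minimal Python sketch (not part of the original slides) using small made-up groups; it also shows that SS T equals SS B plus SS W.

```python
# Computational formulas for the sums of squares, applied to made-up data
# (these scores are illustrative only, not the slides' Table 8.2 data).
groups = [
    [3, 4, 5, 6],      # group 1 scores
    [5, 6, 7, 8],      # group 2 scores
    [8, 9, 10, 11],    # group 3 scores
]

all_scores = [x for g in groups for x in g]
N = len(all_scores)

ss_total   = sum(x**2 for x in all_scores) - sum(all_scores)**2 / N
ss_between = sum(sum(g)**2 / len(g) for g in groups) - sum(all_scores)**2 / N
ss_within  = sum(x**2 for x in all_scores) - sum(sum(g)**2 / len(g) for g in groups)

print(ss_total, ss_between, ss_within)   # SS_T = SS_B + SS_W
```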
The Computational Formulas for Sum of Squares: worked example
The Computational Formulas for Sum of Squares: worked example
The Computational Formulas for Sum of Squares: Summary
Mean Squares NOTE: The value of the sum of squares becomes larger as variation increases. The sum of squares also increases with sample size. Because of this, the SS cannot be viewed as a true measure of variation. Another measure of variation that we can use is the Mean Square. The mean square between groups describes the differences between the means of the conditions in a factor. It is symbolized by MS B. The mean square within groups describes the variability in scores within the conditions of an experiment. It is symbolized by MS W.
Computation of Mean Squares
MS between = SS between / df between
MS within = SS within / df within
where MS between = between-groups mean square; SS between = between-groups sum of squares; df between = between-groups degrees of freedom; MS within = within-groups mean square; SS within = within-groups sum of squares; df within = within-groups degrees of freedom.
Degrees of Freedom Use the following equations to obtain the correct degrees of freedom:
df between = k − 1
df within = N total − k
where k = number of groups.
Critical Value of F (F critical) The Critical value of F (F crit) depends on: the degrees of freedom (both the df bn = k − 1 and the df wn = N − k) and the selected α. To obtain the F crit from the F statistical table: use the df B (the numerator) across the top of the table; use the df W (the denominator) along the side of the table.
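If statistical software is available, F crit can also be obtained without the printed table. The sketch below is an optional convenience (scipy is not referenced in the slides); the df values shown are example values only.

```python
# Obtaining F_crit from software instead of the printed F table.
from scipy import stats

alpha = 0.05
df_b, df_w = 2, 12                          # numerator and denominator df (example values)
f_crit = stats.f.ppf(1 - alpha, df_b, df_w)
print(round(f_crit, 3))                     # about 3.885; printed tables list 3.88 or 3.89
```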
Worked example of Mean Square: calculating the Mean Square computation using the Table 8.2 data in the previous example
Computing F obs The analysis of variance yields an F ratio. The F ratio compares the variance between groups with the variance within groups:
F obs = MS between (bn) / MS within (wn)
The larger our calculated F ratio, the more likely it is that the result is statistically significant.
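The short Python sketch below puts the steps together (SS, then MS, then F) for a one-way between-subjects ANOVA and cross-checks the result against scipy's built-in routine. The data are made up for illustration; scipy is an assumption of this sketch, not part of the slides.

```python
# SS -> MS -> F for a one-way between-subjects ANOVA, plus a scipy cross-check.
from scipy import stats

groups = [[3, 4, 5, 6], [5, 6, 7, 8], [8, 9, 10, 11]]   # made-up scores
all_scores = [x for g in groups for x in g]
N, k = len(all_scores), len(groups)

ss_b = sum(sum(g)**2 / len(g) for g in groups) - sum(all_scores)**2 / N
ss_w = sum(x**2 for x in all_scores) - sum(sum(g)**2 / len(g) for g in groups)
ms_b, ms_w = ss_b / (k - 1), ss_w / (N - k)              # mean squares
f_obs = ms_b / ms_w

print(round(f_obs, 3))
print(stats.f_oneway(*groups))    # same F statistic, together with its p-value
```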
Illustration of another way of computing the Sum of Squares and Mean Squares using the mean method Dr. Richard Boateng, UGBS Slide 28
Example: does family size vary by religious affiliation?
Step 1: Find the mean for each sample
Step 2: Calculate (1) the sum of scores, (2) the sum of squared scores, (3) the number of subjects, and (4) the mean for each sample
Computations
DECISION To reject the null hypothesis at the .05 significance level with 2 and 12 degrees of freedom, our calculated F ratio must exceed 3.88. From the computation, our obtained F ratio of 8.24 is clearly greater than the F critical, hence we must reject the null hypothesis. Interpretation: At the 0.05 significance level, family size does indeed vary by religious affiliation.
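For readers following along with software, this short sketch (an optional check, assuming scipy is available) reproduces the decision from the worked example using the F statistic and degrees of freedom stated above.

```python
# Reproducing the decision for the worked example: F_obs = 8.24 with df = 2 and 12.
from scipy import stats

f_obs, df_b, df_w = 8.24, 2, 12
f_crit = stats.f.ppf(0.95, df_b, df_w)     # about 3.885, reported as 3.88 in the slide
p_value = stats.f.sf(f_obs, df_b, df_w)    # roughly .006, well below .05
print(f_obs > f_crit, round(p_value, 4))   # True -> reject the null hypothesis
```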
Post Hoc Comparisons When the F-test is significant, we perform post hoc comparisons. Post hoc comparisons are like t-tests: we compare all possible pairs of level means from a factor, one pair at a time, to determine which means differ significantly from each other. Examples: the protected t test method and the Fisher Least Significant Difference (LSD) method
The Protected t Test method The null hypothesis for comparing any pair of means is tested with the formula:
t = (X̄1 − X̄2) / √( MS error/n1 + MS error/n2 ) = (X̄1 − X̄2) / √( MS error (1/n1 + 1/n2) )
where MS error = MS W is simply taken from the ANOVA results, and n1 and n2 are the sizes of the two samples whose means we are comparing. The computed value of t is referred to the t tables at α = 0.05 for a two-tailed test with the degrees of freedom (df) associated with the MS W (= N − k), and a decision is taken as to whether or not H0 should be rejected.
Fisher LSD (Least Significant Difference) method Used when all the groups have equal sample sizes, i.e. n1 = n2 = n3. Then the denominator of the protected t test becomes a constant for all pairwise comparisons. In such a situation, it becomes possible to determine what least significant difference (LSD) between means is needed to reject H0 at any given level of significance:
LSD = t × √( MS error (1/n1 + 1/n2) ) = t × √( 2 MS error / n )
Note that t here refers to the critical value of t with N − k df in a two-tailed test.
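The sketch below computes the LSD under the equal-n assumption; any pair of means that differs by more than this value is declared significant. As before, the MS W, df, and n are illustrative placeholders.

```python
# Fisher's LSD when all groups share the same sample size n (illustrative numbers).
from math import sqrt
from scipy import stats

ms_w, df_w = 2.5, 12             # MS within and its df from the ANOVA
n = 5                            # common sample size per group
alpha = 0.05

t_crit = stats.t.ppf(1 - alpha / 2, df_w)    # two-tailed critical t
lsd = t_crit * sqrt(2 * ms_w / n)
print(round(lsd, 3))             # compare each |mean_i - mean_j| against this value
```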
Two-way ANOVA: overview We have learned how to test for the effects of independent variables considered one at a time. However, much of human behavior is determined by the influence of several variables operating at the same time. Sometimes these variables combine to influence performance.
Two-way ANOVA We need to test for the independent and combined effects of multiple variables on performance. We do this with a Two-way ANOVA that asks: (i) How different from each other are the means for the levels of Variable A? (ii) How different from each other are the means for the levels of Variable B? (iii) How different from each other are the means for the treatment combinations produced by A and B together?
Two-way ANOVA The first two of those questions are questions about main effects of the respective independent variables. The third question is about the interaction effect, the effect of the two variables considered simultaneously.
MAIN vs INTERACTION EFFECTS Main effect A main effect is the effect on performance of one treatment variable considered in isolation (ignoring other variables in the study) Interaction Effect an interaction effect occurs when the effect of one variable is different across levels of one or more other variables Slide 42
Illustration In order to detect interaction effects, we must use factorial designs. In a factorial design each variable is tested at every level of all of the other variables. The table below represents two variables A and B, both with two levels (A1, A2 and B1, B2 respectively), giving four cells (i to iv):
      A1    A2
B1     i    ii
B2    iii   iv
Illustration
i vs iii: the effect of B at level A1 of variable A
ii vs iv: the effect of B at level A2
If these are different, then we say that A and B interact.
ALTERNATIVELY
i vs ii: the effect of A at B1
iii vs iv: the effect of A at B2
If these are different, then we say that A and B interact.
Illustration [Two line graphs: mean scores plotted at A1 and A2 on the horizontal axis, with separate lines for B1 and B2 in each panel.] In the graphs above, the effect of A varies at levels of B, and the effect of B varies at levels of A. How you say it is a matter of preference (and your theory). In each case, the interaction is the whole pattern. No part of the graph shows the interaction. It can only be seen in the entire pattern (here, all 4 data points).
Computation of F ratios in Two-Way ANOVA In a Two-Way ANOVA, three F ratios are computed: one F ratio for the factor represented along the rows; a second F ratio for the factor represented along the columns; and a third F ratio for the interaction between the row and column factors. Each F ratio is referred to the F tables with its appropriate degrees of freedom under a specified decision rule, and a decision is taken as to whether or not H0 should be rejected in each case. Slide 46
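For completeness, here is a minimal two-way ANOVA sketch using statsmodels, which is not referenced in the slides; the factor names A and B and all the scores are made up. The output table contains exactly the three F ratios described above: the main effect of A, the main effect of B, and the A × B interaction.

```python
# Two-way between-subjects ANOVA on a made-up 2 x 2 design with 3 scores per cell.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

data = pd.DataFrame({
    "A":     ["A1"] * 6 + ["A2"] * 6,
    "B":     (["B1"] * 3 + ["B2"] * 3) * 2,
    "score": [3, 4, 5, 6, 7, 8, 4, 5, 6, 10, 11, 12],
})

model = smf.ols("score ~ C(A) * C(B)", data=data).fit()
print(anova_lm(model, typ=2))    # rows: C(A), C(B), C(A):C(B), Residual
```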