1 Introduction to One-way ANOVA
- Augustine Willis
- 6 years ago
Review Source: Chapter 10 - Analysis of Variance (ANOVA). Example Data Source: Example problem 10.1 (dataset: exp10-1.mtw). Link to Data: CH10/ Link to Notes: 01.ppt 10_02.ppt 10_03.ppt

1 Introduction to One-way ANOVA

Suppose we wish to examine the differences between I different normal populations with possibly different means \mu_1, \mu_2, ..., \mu_I, but with all variances equal to \sigma^2 (a generalization of the two-sample t-test with equal population variances in Chapter 9). In One-Way Analysis of Variance (ANOVA), we begin with the following null hypothesis,

H_0 : \mu_1 = \mu_2 = \cdots = \mu_I

with the alternative hypothesis

H_a : \mu_l \ne \mu_m, for some l \ne m,

so, if the alternative is true, we say at least two means are different. Figure 1 is a plot of three (I = 3) normal distributions, all with variance equal to one (\sigma^2 = 1) but with means 100, 110, and 120.

Figure 1: Plot of three different normal densities with means 100, 110 and 120, with common variance equal to 1. This is an illustration of the null hypothesis, H_0 : \mu_1 = \mu_2 = \mu_3, being false.

To develop a test, we would draw random samples from each population in question, then use these data to draw inferences about the true state of nature in the underlying distributions. Two such
designed experiments are referred to as balanced and unbalanced single-factor designs associated with ANOVA.

Balanced Design Single Factor ANOVA: In a balanced design, we would draw independent random samples of the same size, say J, from each of the I populations. If the sample sizes were not all equal, the design would be said to be unbalanced (note: there is nothing inherently wrong with an unbalanced design). Table 1 (a) illustrates a balanced design with I treatments/groups and J measurements/observations, and Table 1 (b) presents an unbalanced design.

Table 1 (a): Illustration of a balanced single-factor/one-way design. \bar{X}_{i.} = \sum_{j=1}^{J} X_{ij}/J, i = 1, 2, ..., I, are the individual sample means for each treatment/group and S_1^2, S_2^2, ..., S_I^2 are the individual sample variances from each treatment/group.

Group/treatment   Random sample             Sample size   Mean           Var     Assumed distribution
1                 X_11, X_12, ..., X_1J     J             \bar{X}_{1.}   S_1^2   N(\mu_1, \sigma^2)
2                 X_21, X_22, ..., X_2J     J             \bar{X}_{2.}   S_2^2   N(\mu_2, \sigma^2)
...
I                 X_I1, X_I2, ..., X_IJ     J             \bar{X}_{I.}   S_I^2   N(\mu_I, \sigma^2)

where

\bar{X}_{..} = \frac{\bar{X}_{1.} + \bar{X}_{2.} + \cdots + \bar{X}_{I.}}{I} = \frac{\sum_{j=1}^{J} X_{1j} + \sum_{j=1}^{J} X_{2j} + \cdots + \sum_{j=1}^{J} X_{Ij}}{IJ} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} X_{ij}}{IJ}

is the grand mean.

Table 1 (b): Illustration of an unbalanced single-factor/one-way design. \bar{X}_{i.} = \sum_{j=1}^{J_i} X_{ij}/J_i, i = 1, 2, ..., I, are the individual sample means for each treatment/group and S_1^2, S_2^2, ..., S_I^2 are the individual sample variances from each treatment/group.

Group/treatment   Random sample              Sample size   Mean           Var     Assumed distribution
1                 X_11, X_12, ..., X_1J_1    J_1           \bar{X}_{1.}   S_1^2   N(\mu_1, \sigma^2)
2                 X_21, X_22, ..., X_2J_2    J_2           \bar{X}_{2.}   S_2^2   N(\mu_2, \sigma^2)
...
I                 X_I1, X_I2, ..., X_IJ_I    J_I           \bar{X}_{I.}   S_I^2   N(\mu_I, \sigma^2)

where

\bar{X}_{..} = \frac{J_1 \bar{X}_{1.} + J_2 \bar{X}_{2.} + \cdots + J_I \bar{X}_{I.}}{J_1 + J_2 + \cdots + J_I} = \frac{\sum_{j=1}^{J_1} X_{1j} + \sum_{j=1}^{J_2} X_{2j} + \cdots + \sum_{j=1}^{J_I} X_{Ij}}{J_1 + J_2 + \cdots + J_I} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J_i} X_{ij}}{J_1 + J_2 + \cdots + J_I}
is the grand mean.

Example 1: The article Compression of Single-Wall Corrugated Shipping Containers Using Fixed and Floating Test Platens (J. Testing and Evaluation, 1992) describes an experiment in which several different types of boxes were compared with respect to compression strength (lb). Table 2 displays the data for each box type. The data represent independent random samples from each box type population. Since the sample sizes are all equal, this one-way ANOVA is considered a balanced design. If the sample sizes were not all equal, the design would be said to be unbalanced (note: there is nothing inherently wrong with an unbalanced design). Note: for this example there are I = 4 groups/samples and each sample has J = 6 observations. There are a total of I x J observations in a balanced design experiment, so in this example there are a total of 24 = 4 x 6 observations.

Table 2: Data associated with the one-way designed experiment for box type strength (columns: type of box, compression strength (lb), sample mean, sample SD, with the grand mean at the bottom).

We see that box type 2 has the largest sample mean strength, \bar{x}_{2.}, and box type 4 has the smallest sample mean strength, \bar{x}_{4.}. Box types 1 and 3 seem to be close in strength, on average, with sample mean strengths \bar{x}_{1.} and \bar{x}_{3.}, respectively. The question is: are the differences in the sample means large enough to conclude that the true means are different? Is the sample mean for box type 2 significantly greater than the rest? Is the sample mean for box type 4 significantly smaller than the others? What about the small mean difference between box types 1 and 3?

Data Structure for ANOVA: Before we begin this example, let's examine the typical (but not exclusive) data structure for conducting a one-way ANOVA using statistical software like MINITAB. The data in Table 2 are entered into MINITAB with one column (variable) indicating which group/sample the data are from, while the other column contains all of the measurements/observations.
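The balanced and unbalanced grand means above can be sketched numerically. In this minimal stdlib-Python illustration (the numbers are invented, not the Table 2 data), the balanced grand mean equals the plain average of the group means, while the unbalanced grand mean is the J_i-weighted average of the group means:

```python
def group_means(groups):
    """Individual sample means X-bar_i. for each treatment/group."""
    return [sum(g) / len(g) for g in groups]

def grand_mean(groups):
    """Grand mean X-bar.. : all observations pooled, regardless of group."""
    n = sum(len(g) for g in groups)
    return sum(sum(g) for g in groups) / n

balanced = [[10, 12, 14], [20, 22, 24], [30, 32, 34]]      # I = 3 groups, J = 3 each
unbalanced = [[10, 12], [20, 22, 24], [30, 32, 34, 36]]    # J_1 = 2, J_2 = 3, J_3 = 4

means_b = group_means(balanced)
print(grand_mean(balanced), sum(means_b) / len(means_b))   # equal in the balanced case

means_u = group_means(unbalanced)
weights = [len(g) for g in unbalanced]
weighted = sum(w * m for w, m in zip(weights, means_u)) / sum(weights)
print(grand_mean(unbalanced), weighted)                    # equal: J_i-weighted average
```

In the unbalanced case the plain average of the group means would be (11 + 22 + 33)/3 = 22, which differs from the grand mean; weighting by the sample sizes is what makes the two formulas agree.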
Figure 2: Screenshot of the data set for Example 10.1 illustrating the typical data structure for ANOVA.

The order of appearance of the columns doesn't matter; the variable containing the observations can come first, or the variable containing the group/treatment labels can come first.

Basic/Descriptive Statistics: Before doing ANOVA directly in MINITAB, let's compute the basic statistics and box-plots. First, go to Stat, Basic Statistics, Display Descriptive Statistics and complete the dialogue box displayed in Figure 3(a). After you select the correct options (statistics displayed), you will get the basic statistics displayed in Figure 3(b). The boxplot displayed in Figure 3(c) was obtained by Graph, Boxplot, selecting the with groups option, then Scale, and transposing the value and category scales. We can see from Figure 3(c) that the sample distribution for Box Type 4 doesn't seem to overlap with the other three sample distributions. Also, while there is quite a bit of overlap for the other three samples (Box Types 1, 2 and 3), Box Type 2 tends to have greater strength than the other two. Whatever overall test we develop, we would expect the null hypothesis to be rejected, and follow-up multiple comparisons should at least lead us to the conclusion that Box Type 4 has significantly less strength, on average, compared to the other three.
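The stacked layout described above (one column of group labels, one column of measurements) is easy to mimic in code. This sketch, with invented labels and values, groups a stacked list of (label, value) records and computes per-group descriptive statistics, roughly what MINITAB's Display Descriptive Statistics produces:

```python
from collections import defaultdict
from statistics import mean, stdev

# Stacked/long format: one record per observation, as in the MINITAB worksheet.
# Labels and values here are invented for illustration only.
stacked = [
    ("g1", 5.0), ("g1", 7.0), ("g1", 6.0),
    ("g2", 9.0), ("g2", 11.0), ("g2", 10.0),
    ("g3", 4.0), ("g3", 6.0), ("g3", 5.0),
]

# Unstack: collect the observations belonging to each group label.
by_group = defaultdict(list)
for label, value in stacked:
    by_group[label].append(value)

for label, values in sorted(by_group.items()):
    print(label, "n =", len(values),
          "mean =", round(mean(values), 2),
          "sd =", round(stdev(values), 2))
```

Because each row carries its own group label, the number of observations per group is free to vary, which is why the same layout serves both balanced and unbalanced designs.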
Figures 3 (a), (b) and (c):
Analysis of Means (ANOM): From the MINITAB description, ANOM is a graphical analog to ANOVA that tests the equality of population means. The graph displays each factor level mean, the overall mean, and the decision limits. If a point falls outside the decision limits, then evidence exists that the factor level mean represented by that point is significantly different from the overall mean. Figure 4 contains the ANOM for Example 1. Since the second box type mean is above the decision limits and the fourth box type mean is below the decision limits, this suggests the second is significantly above the rest and the fourth is significantly below. The first and third means are not significantly different from each other. We will show later that these conclusions are confirmed by ANOVA and multiple comparisons using Tukey's method.

Figure 4: Analysis of Means (ANOM) from Example 1 data.
1.1 Sum of Squares (balanced design)

Total Sum of Squares: The Total Sum of Squares (SST) would be the numerator of the sample variance if you were to compute the sample variance of all n = IJ observations without regard to group/treatment,

SST = \sum_{i=1}^{I} \sum_{j=1}^{J} (X_{ij} - \bar{X}_{..})^2.

This is why SST is referred to as the total amount of variability in the response/measurement variable. The degrees of freedom associated with SST are IJ - 1.

Error Sum of Squares: The Error Sum of Squares (SSE) is the numerator for pooled sample variances,

SSE = \sum_{i=1}^{I} \sum_{j=1}^{J} (X_{ij} - \bar{X}_{i.})^2.

SSE is considered the within treatment/group variability. Note, we can also express SSE as

SSE = (J - 1)S_1^2 + (J - 1)S_2^2 + \cdots + (J - 1)S_I^2.

SSE is the unexplained variability, due to uncertainty or random variability. The degrees of freedom associated with SSE are I(J - 1).

Treatment Sum of Squares: The Treatment Sum of Squares (SSTr) represents the variability between treatments/groups,

SSTr = J \sum_{i=1}^{I} (\bar{X}_{i.} - \bar{X}_{..})^2.

If all the population means were equal, SSTr would tend to be small. The bigger the differences between the means, the larger SSTr would tend to be. SSTr is the amount of variability explained by differences between groups/treatments. The degrees of freedom associated with SSTr are I - 1.

Decomposition of Sum of Squares: It can be shown that the total variability (SST) can be decomposed into the sum of SSTr and SSE. Also, the degrees of freedom decompose additively. That is,

SST = SSTr + SSE and df(Total) = df(Error) + df(Treatments).

So, the overall variability in the response/measurement variable is the sum of the between group/treatment variability and the within group (random) variability. In other words, it is the sum of explained variability and unexplained variability.

Coefficient of Determination (R^2): The coefficient of determination, denoted R^2, is the proportion of the total variability (SST) which is explained by the between treatment/group (SSTr) variability.
Since SST = SSTr + SSE, the coefficient of determination is defined to be

R^2 = \frac{SSTr}{SST} = 1 - \frac{SSE}{SST}.
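The decomposition SST = SSTr + SSE, and the R^2 built from it, can be checked numerically. The following stdlib-Python sketch (invented balanced data, not the box-strength measurements) computes each sum of squares directly from its definition:

```python
def anova_ss(groups):
    """Return (SST, SSTr, SSE), each computed from its definitional formula."""
    all_obs = [x for g in groups for x in g]
    grand = sum(all_obs) / len(all_obs)                      # grand mean X-bar..
    sst = sum((x - grand) ** 2 for x in all_obs)             # total variability
    sstr = sum(len(g) * ((sum(g) / len(g)) - grand) ** 2     # between-group
               for g in groups)
    sse = sum((x - sum(g) / len(g)) ** 2                     # within-group
              for g in groups for x in g)
    return sst, sstr, sse

groups = [[10.0, 12.0, 14.0], [20.0, 22.0, 24.0], [31.0, 29.0, 33.0]]
sst, sstr, sse = anova_ss(groups)
r_sq = sstr / sst
print(sst, sstr + sse)   # the two totals agree (up to floating-point rounding)
print(r_sq)              # proportion of total variability that is explained
```

Since SSTr and SSE are each sums of squared terms, both are nonnegative, and the identity SST = SSTr + SSE then forces 0 <= R^2 <= 1, as stated below.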
Note that 0 <= R^2 <= 1. An R^2 = 1 would indicate a perfect fit, in that 100% of the total variability is explained by the differences between treatments/groups and, therefore, there is no random variability.

Example 1 (cont): ANOVA: strength versus box-type

Factor    Type   Levels  Values
box-type  fixed  4       1, 2, 3, 4

Analysis of Variance for strength

Source    DF  SS  MS  F  P
box-type
Error
Total

S =   R-Sq = 79.01%   R-Sq(adj) = 75.86%

Calculator or algebraic simplification for Sum of Squares:

SST = \sum_{i=1}^{I} \sum_{j=1}^{J} (x_{ij} - \bar{x}_{..})^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} x_{ij}^2 - \frac{1}{IJ} x_{..}^2

SSTr = J \sum_{i=1}^{I} (\bar{x}_{i.} - \bar{x}_{..})^2 = \frac{1}{J} \sum_{i=1}^{I} x_{i.}^2 - \frac{1}{IJ} x_{..}^2

SSE = \sum_{i=1}^{I} \sum_{j=1}^{J} (x_{ij} - \bar{x}_{i.})^2, where x_{i.} = \sum_{j=1}^{J} x_{ij} (the group totals) and x_{..} = \sum_{i=1}^{I} x_{i.} (the grand total).

1.2 Sum of Squares (unbalanced)

The results for the unbalanced design are exactly the same as for the balanced design, except for the interim algebraic computations, which reflect the differing sample sizes. Once SST, SSE and SSTr are computed, the analysis for unbalanced designs is exactly the same as for balanced designs. The total sample size is n = \sum_{i=1}^{I} J_i = J_1 + J_2 + \cdots + J_I.

Total Sum of Squares: The Total Sum of Squares (SST) would be the numerator of the sample variance if you were to compute the sample variance of all J_1 + J_2 + \cdots + J_I observations without regard to group/treatment,

SST = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (X_{ij} - \bar{X}_{..})^2.
This is why SST is referred to as the total amount of variability in the response/measurement variable. The total degrees of freedom are df = n - 1, where n = J_1 + J_2 + \cdots + J_I.

Error Sum of Squares: The Error Sum of Squares (SSE) is the numerator for pooled sample variances,

SSE = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (X_{ij} - \bar{X}_{i.})^2.

SSE is considered the within treatment/group variability. Note, we can also express SSE as

SSE = (J_1 - 1)S_1^2 + (J_2 - 1)S_2^2 + \cdots + (J_I - 1)S_I^2.

SSE is the unexplained variability, due to uncertainty or random variability. The error degrees of freedom are n - I.

Treatment Sum of Squares: The Treatment Sum of Squares (SSTr) represents the variability between treatments/groups,

SSTr = \sum_{i=1}^{I} J_i (\bar{X}_{i.} - \bar{X}_{..})^2.

If all the population means were equal, SSTr would tend to be small. The bigger the differences between the means, the larger SSTr would tend to be. SSTr is the amount of variability explained by differences between groups/treatments. The treatment degrees of freedom are df = I - 1.

Decomposition of Sum of Squares: It can be shown that the total variability (SST) can be decomposed into the sum of SSTr and SSE. That is,

SST = SSTr + SSE and df(Total) = df(Error) + df(Treatment).

So, the overall variability in the response/measurement variable is the sum of the between group/treatment variability and the within group (random) variability. In other words, it is the sum of explained variability and unexplained variability.

Coefficient of Determination (R^2): The coefficient of determination, denoted R^2, is the proportion of the total variability (SST) which is explained by the between treatment/group (SSTr) variability. Since SST = SSTr + SSE, the coefficient of determination is defined to be

R^2 = \frac{SSTr}{SST} = 1 - \frac{SSE}{SST}.

Note that 0 <= R^2 <= 1. An R^2 = 1 would indicate a perfect fit, in that 100% of the total variability is explained by the differences between treatments/groups and, therefore, there is no random variability.
Calculator or algebraic simplification for Sum of Squares:

SST = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (x_{ij} - \bar{x}_{..})^2 = \sum_{i=1}^{I} \sum_{j=1}^{J_i} x_{ij}^2 - \frac{1}{n} x_{..}^2

SSTr = \sum_{i=1}^{I} J_i (\bar{x}_{i.} - \bar{x}_{..})^2 = \sum_{i=1}^{I} \frac{1}{J_i} x_{i.}^2 - \frac{1}{n} x_{..}^2

SSE = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (x_{ij} - \bar{x}_{i.})^2, where x_{i.} = \sum_{j=1}^{J_i} x_{ij}, x_{..} = \sum_{i=1}^{I} x_{i.}, and n = J_1 + J_2 + \cdots + J_I.
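The computational shortcuts above can be verified against the definitional forms. This stdlib-Python sketch (invented unbalanced data) checks that the total-and-squares form reproduces SST and that the group-totals form reproduces SSTr:

```python
def ss_definitional(groups):
    """SST and SSTr computed directly from their definitions."""
    all_obs = [x for g in groups for x in g]
    grand = sum(all_obs) / len(all_obs)
    sst = sum((x - grand) ** 2 for x in all_obs)
    sstr = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    return sst, sstr

def ss_shortcut(groups):
    """SST and SSTr via the calculator formulas, using group and grand totals."""
    n = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)                       # x..
    sst = sum(x * x for g in groups for x in g) - grand_total ** 2 / n
    sstr = sum(sum(g) ** 2 / len(g) for g in groups) - grand_total ** 2 / n
    return sst, sstr

groups = [[4.0, 6.0], [7.0, 9.0, 11.0], [1.0, 2.0, 3.0, 6.0]]  # J_i = 2, 3, 4
print(ss_definitional(groups))
print(ss_shortcut(groups))   # matches, up to floating-point rounding
```

SSE then follows from the decomposition as SST - SSTr, which is the usual hand-calculation route since it avoids computing each group mean's deviations.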
1.3 Mean Square Error

Mean Square Error (balanced design): The Mean Squared Error (MSE) is

MSE = \frac{S_1^2 + S_2^2 + \cdots + S_I^2}{I} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} (X_{ij} - \bar{X}_{i.})^2}{I(J - 1)} = \frac{SSE}{I(J - 1)}.

Notice that MSE is an unbiased estimator of \sigma^2, since E(S_i^2) = \sigma^2, i = 1, 2, ..., I.

Mean Square Error (unbalanced design): The Mean Squared Error (MSE) is

MSE = \frac{(J_1 - 1)S_1^2 + (J_2 - 1)S_2^2 + \cdots + (J_I - 1)S_I^2}{n - I} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J_i} (X_{ij} - \bar{X}_{i.})^2}{n - I} = \frac{SSE}{n - I}.

Notice that MSE is an unbiased estimator of \sigma^2, since E(S_i^2) = \sigma^2, i = 1, 2, ..., I.

Mean Square for Treatments (both balanced and unbalanced designs): The mean square for treatments (MSTr) is

MSTr = \frac{SSTr}{I - 1}.

Note: if the null hypothesis is true, \mu_1 = \mu_2 = \cdots = \mu_I, then MSTr is also an unbiased estimator of \sigma^2. However, if the null hypothesis were false, then E(MSTr) > E(MSE) = \sigma^2.

F Ratio: If all the normality assumptions hold and the null hypothesis is true, then the ratio of the mean squares is distributed as an F distribution with numerator degrees of freedom equal to I - 1 and denominator degrees of freedom I(J - 1),

F = \frac{MSTr}{MSE} \sim F_{I-1, I(J-1)}.

ANOVA Table (balanced one-factor design)

Source of Variation   df       Sum of squares   Mean Square           f
Treatments            I-1      SSTr             MSTr = SSTr/(I-1)     MSTr/MSE
Error                 I(J-1)   SSE              MSE = SSE/[I(J-1)]
Total                 IJ-1     SST

ANOVA Table (unbalanced one-factor design)

Source of Variation   df       Sum of squares   Mean Square           f
Treatments            I-1      SSTr             MSTr = SSTr/(I-1)     MSTr/MSE
Error                 n-I      SSE              MSE = SSE/(n-I)
Total                 n-1      SST
where n = J_1 + J_2 + \cdots + J_I.

Example 1 (continued): Let \mu_1, \mu_2, \mu_3, and \mu_4 represent the true mean compression strength for each of box types 1, 2, 3 and 4, respectively. Assuming the populations are normally distributed with common variance, \sigma^2, use the data provided to test the null hypothesis that all population means are equal. That is, the null hypothesis is

H_0 : \mu_1 = \mu_2 = \cdots = \mu_I

with the alternative hypothesis

H_a : \mu_l \ne \mu_m, for some l \ne m.

The completed ANOVA table (produced by MINITAB) is given below. The instructions for MINITAB are given on the next page in Figure 5.

ANOVA: strength versus box-type

Factor    Type   Levels  Values
box-type  fixed  4       1, 2, 3, 4

Analysis of Variance for strength

Source    DF  SS  MS  F  P
box-type
Error
Total

S =   R-Sq = 79.01%   R-Sq(adj) = 75.86%

Coefficient of determination (R^2): The coefficient of determination is R^2 = 79.01%. This means that 79.01% of the total variability in compression strength is explained by the mean differences between box types.

p-value and the test: The p-value associated with the global hypothesis that all the population means are equal (the null hypothesis) is near zero, which would lead us to reject the null hypothesis at any reasonable significance level \alpha. Therefore, we reject the null hypothesis and conclude that the mean strengths are significantly different between the box types.

p-value interpretation: If the null hypothesis were true (all the population mean strengths were equal), there would be a near zero chance of observing sample mean differences as large as or larger than we did in this experiment/sample. So, since we concluded that the differences in sample mean strengths were statistically significant between the box types, and 79.01% of the variability is explained by the differences in strengths between the box types, we have a great deal of evidence that at least two of the box types are different, on average. We would like to investigate this further using multiple comparisons.
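The quantities in the ANOVA table are straightforward to assemble from the sums of squares. This stdlib-Python sketch (invented data, not the box-strength measurements) builds MSTr, MSE and the F ratio for a one-way layout; it handles the unbalanced case, and the balanced case is simply all J_i equal:

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table quantities as a small dict."""
    I = len(groups)
    n = sum(len(g) for g in groups)
    all_obs = [x for g in groups for x in g]
    grand = sum(all_obs) / n
    sstr = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    mstr = sstr / (I - 1)          # treatment mean square, df = I - 1
    mse = sse / (n - I)            # error mean square, df = n - I
    return {"df_tr": I - 1, "df_err": n - I,
            "SSTr": sstr, "SSE": sse,
            "MSTr": mstr, "MSE": mse, "F": mstr / mse}

# Well-separated group means with small within-group spread -> a large F.
groups = [[10.0, 11.0, 9.0], [20.0, 21.0, 19.0], [30.0, 31.0, 29.0]]
table = one_way_anova(groups)
print(table["F"])   # between-group variability dwarfs within-group variability
```

The F value would then be compared to the F_{I-1, n-I} reference distribution (MINITAB reports the p-value directly; computing it by hand requires F tables, which is why only the statistic is formed here).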
Figure 5: To produce this in MINITAB, go to Stat, ANOVA, One-Way. In the dialogue box, make sure to select the Response data in one column for all factor levels option, then in the box labeled Response put the strength variable (name or column number) and in the box labeled Factor put the box-type variable (name or column number). See the dialogue box below.
1.4 Multiple comparisons

So, as noted, if the null hypothesis is rejected via ANOVA, our conclusion is that at least two means are different. To examine the nature of these differences, we could do, as suggested previously, comparison box plots or some other graphical methods to determine which means are different from each other. However, we would want to follow that up with significance tests for a more formal analysis. To compare all means to each other, we would have to do I(I - 1)/2 comparisons. Recall, in Example 1 there were I = 4 treatment groups, so for all pairwise comparisons you would have to do 4(3)/2 = 6 paired tests, \mu_1 to \mu_2, \mu_1 to \mu_3, \mu_1 to \mu_4, \mu_2 to \mu_3, \mu_2 to \mu_4, and \mu_3 to \mu_4, to cover all possibilities. To control for the fact that we must do many different individual tests, we make adjustments at the individual test level to ensure that the family-wise or experiment-wise type I error level is fixed at \alpha. Tukey's studentized range test is one such method.

Tukey's Method: The process works similarly to hypothesis testing using confidence intervals, as in Chapter 9. To ensure the family-wise level is \alpha, we test the hypothesis for any two means \mu_i and \mu_j, for some i < j,

H_0 : \mu_i = \mu_j versus H_a : \mu_i \ne \mu_j,

by computing the adjusted confidence intervals

\bar{X}_{i.} - \bar{X}_{j.} \pm Q_{\alpha, I, I(J-1)} \sqrt{MSE/J},

where Q_{\alpha, I, I(J-1)} is found in Table A.10 and is an inflated version of the t-distribution quantile used in the Chapter 9 methods. For each pair, if the interval contains zero, then we fail to reject the null hypothesis and conclude that the two sample means are not significantly different. If the interval does not contain zero, then we reject the null hypothesis and conclude that the sample means are significantly different from each other (and, therefore, we conclude \mu_i \ne \mu_j).
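Given a studentized-range critical value Q (which these notes take from Table A.10; the value is supplied by the caller here, not computed), the Tukey intervals for all pairs follow directly from the formula above. A hypothetical sketch for the balanced case, with invented means, MSE and Q:

```python
from itertools import combinations
from math import sqrt

def tukey_intervals(means, mse, J, q):
    """Tukey simultaneous CIs for all pairwise mean differences (balanced design).

    means: group sample means X-bar_i.
    mse:   mean square error from the ANOVA table
    J:     common per-group sample size
    q:     Q_{alpha, I, I(J-1)} looked up in a studentized-range table
    """
    half_width = q * sqrt(mse / J)       # same margin for every pair when balanced
    out = {}
    for i, j in combinations(range(len(means)), 2):
        diff = means[i] - means[j]
        lo, hi = diff - half_width, diff + half_width
        significant = not (lo <= 0.0 <= hi)   # reject H0 iff 0 is outside the CI
        out[(i + 1, j + 1)] = (lo, hi, significant)
    return out

# Invented illustration: 4 groups, J = 6; the fourth mean sits well below the rest.
for pair, (lo, hi, sig) in sorted(tukey_intervals([700, 750, 690, 560],
                                                  2500.0, 6, 3.96).items()):
    print(pair, (round(lo, 1), round(hi, 1)),
          "significant" if sig else "not significant")
```

With these invented numbers, only the pairs involving the fourth group exclude zero, mirroring the kind of conclusion reached for box type 4 in Example 1.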
Example 1 (continued): Recall, previously, we rejected the global null hypothesis that all the population means are equal and concluded that at least two population means are different. Summary statistics, box plots and ANOM all suggested that the sample mean for box type 4 was significantly less than the other three. Here, we follow up with formal multiple comparisons using Tukey's method and MINITAB; see Figure 6 for the MINITAB instructions. Below is the output produced.

Tukey Pairwise Comparisons

Grouping Information Using the Tukey Method and 95% Confidence

box_type  N  Mean  Grouping
2         6        A
1         6        A
3         6        A
4         6        B

Means that do not share a letter are significantly different.
Since the means for box types 2, 1 and 3 all share the same letter, those means are not considered significantly different. However, box type 4 has a different letter than the other three, so we say the sample mean for box type 4 is significantly different from the other three. This confirms our visual inspection (graphical analysis). MINITAB also supplied the output for the Tukey simultaneous tests for differences between the means, upon which the above summary was based. This output is provided below:

Tukey Simultaneous Tests for Differences of Means

Difference  Difference  SE of                         Adjusted
of Levels   of Means    Difference  95% CI            T-Value  P-Value
                                    ( -22.6, 110.4)
                                    ( -81.4,  51.6)
                                    (-217.5, -84.5)
                                    (-125.4,   7.6)
                                    (-261.4,      )
                                    (-202.5, -69.5)

Individual confidence level = 98.89%

Figure 6: In MINITAB, follow all the steps to produce the ANOVA (as demonstrated previously), but click on the Comparisons button and fill in the dialogue box as indicated below.
Figure 7 (a): Figure 7 (b):
1.5 Checking the Assumptions

In class, I went through an example where we checked the assumptions. Recall, for the F-test in an ANOVA to be valid, we assume that the underlying populations are normally distributed with equal variances (the common variance assumption). To examine the normality and common variance assumptions, we produce a histogram, a normal probability plot and a residual plot, all based on the residuals. To get the plots in Figure 8, below, you fill in the Graphs dialogue box as indicated in Figure 9, on the next page.

Figure 8: Residual plots based on ANOVA for Example 1.

Based on the normal probability plot (upper left corner), the data do not indicate any significant deviations from normality. In the residual plot (upper right corner), we would expect random scatter about zero, and the spread of the data points should be constant. Based on this graph, it looks like the constant variance assumption holds.
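The residuals these plots are built from are simply e_ij = X_ij - X-bar_i., so they sum to zero within each group by construction. A rough stdlib-Python sketch (invented data) computes them, along with a per-group spread that can be eyeballed for the common variance assumption:

```python
from statistics import pstdev

def residuals(groups):
    """Residuals e_ij = X_ij - X-bar_i. for each treatment/group."""
    out = []
    for g in groups:
        m = sum(g) / len(g)
        out.append([x - m for x in g])
    return out

groups = [[10.0, 12.0, 14.0], [20.0, 25.0, 30.0], [7.0, 9.0, 11.0]]
res = residuals(groups)
for i, r in enumerate(res, start=1):
    # Within each group the residuals sum to zero; roughly similar spreads
    # across groups would support the common variance assumption (formal
    # checks, e.g. a normal probability plot of all residuals, go further).
    print("group", i, "residuals:", r, "spread:", round(pstdev(r), 3))
```

The normal probability plot in Figure 8 is just a plot of these pooled residuals against normal quantiles, and the residual plot is these values against the fitted group means.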
Figure 9: How to get the residual plots based on ANOVA for Example 1.
More information1 Introduction to Minitab
1 Introduction to Minitab Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at my.calpoly.edu. When you
More informationChapter 16. Simple Linear Regression and Correlation
Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationSTATS Analysis of variance: ANOVA
STATS 1060 Analysis of variance: ANOVA READINGS: Chapters 28 of your text book (DeVeaux, Vellman and Bock); on-line notes for ANOVA; on-line practice problems for ANOVA NOTICE: You should print a copy
More informationCHAPTER 4 Analysis of Variance. One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication
CHAPTER 4 Analysis of Variance One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication 1 Introduction In this chapter, expand the idea of hypothesis tests. We
More informationChapter 16. Simple Linear Regression and dcorrelation
Chapter 16 Simple Linear Regression and dcorrelation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationSTA2601. Tutorial letter 203/2/2017. Applied Statistics II. Semester 2. Department of Statistics STA2601/203/2/2017. Solutions to Assignment 03
STA60/03//07 Tutorial letter 03//07 Applied Statistics II STA60 Semester Department of Statistics Solutions to Assignment 03 Define tomorrow. university of south africa QUESTION (a) (i) The normal quantile
More informationBIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES
BIOL 458 - Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES PART 1: INTRODUCTION TO ANOVA Purpose of ANOVA Analysis of Variance (ANOVA) is an extremely useful statistical method
More informationStat 529 (Winter 2011) Experimental Design for the Two-Sample Problem. Motivation: Designing a new silver coins experiment
Stat 529 (Winter 2011) Experimental Design for the Two-Sample Problem Reading: 2.4 2.6. Motivation: Designing a new silver coins experiment Sample size calculations Margin of error for the pooled two sample
More informationIn ANOVA the response variable is numerical and the explanatory variables are categorical.
1 ANOVA ANOVA means ANalysis Of VAriance. The ANOVA is a tool for studying the influence of one or more qualitative variables on the mean of a numerical variable in a population. In ANOVA the response
More informationOne-Way Analysis of Variance. With regression, we related two quantitative, typically continuous variables.
One-Way Analysis of Variance With regression, we related two quantitative, typically continuous variables. Often we wish to relate a quantitative response variable with a qualitative (or simply discrete)
More informationCHAPTER 13: F PROBABILITY DISTRIBUTION
CHAPTER 13: F PROBABILITY DISTRIBUTION continuous probability distribution skewed to the right variable values on horizontal axis are 0 area under the curve represents probability horizontal asymptote
More informationMuch of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest.
Experimental Design: Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest We wish to use our subjects in the best
More informationMathematics for Economics MA course
Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between
More informationMathematical Notation Math Introduction to Applied Statistics
Mathematical Notation Math 113 - Introduction to Applied Statistics Name : Use Word or WordPerfect to recreate the following documents. Each article is worth 10 points and should be emailed to the instructor
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationEstimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.
Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.
More informationANOVA CIVL 7012/8012
ANOVA CIVL 7012/8012 ANOVA ANOVA = Analysis of Variance A statistical method used to compare means among various datasets (2 or more samples) Can provide summary of any regression analysis in a table called
More informationTable of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).
Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z). For example P(X.04) =.8508. For z < 0 subtract the value from,
More informationMultiple comparisons - subsequent inferences for two-way ANOVA
1 Multiple comparisons - subsequent inferences for two-way ANOVA the kinds of inferences to be made after the F tests of a two-way ANOVA depend on the results if none of the F tests lead to rejection of
More informationCh. 1: Data and Distributions
Ch. 1: Data and Distributions Populations vs. Samples How to graphically display data Histograms, dot plots, stem plots, etc Helps to show how samples are distributed Distributions of both continuous and
More informationDESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Genap 2017/2018 Jurusan Teknik Industri Universitas Brawijaya
DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Jurusan Teknik Industri Universitas Brawijaya Outline Introduction The Analysis of Variance Models for the Data Post-ANOVA Comparison of Means Sample
More informationKeller: Stats for Mgmt & Econ, 7th Ed July 17, 2006
Chapter 17 Simple Linear Regression and Correlation 17.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will
More informationConfidence Interval for the mean response
Week 3: Prediction and Confidence Intervals at specified x. Testing lack of fit with replicates at some x's. Inference for the correlation. Introduction to regression with several explanatory variables.
More informationRegression used to predict or estimate the value of one variable corresponding to a given value of another variable.
CHAPTER 9 Simple Linear Regression and Correlation Regression used to predict or estimate the value of one variable corresponding to a given value of another variable. X = independent variable. Y = dependent
More informationReview of Statistics 101
Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods
More informationVariance Decomposition and Goodness of Fit
Variance Decomposition and Goodness of Fit 1. Example: Monthly Earnings and Years of Education In this tutorial, we will focus on an example that explores the relationship between total monthly earnings
More informationINTRODUCTION TO DESIGN AND ANALYSIS OF EXPERIMENTS
GEORGE W. COBB Mount Holyoke College INTRODUCTION TO DESIGN AND ANALYSIS OF EXPERIMENTS Springer CONTENTS To the Instructor Sample Exam Questions To the Student Acknowledgments xv xxi xxvii xxix 1. INTRODUCTION
More informationThis module focuses on the logic of ANOVA with special attention given to variance components and the relationship between ANOVA and regression.
WISE ANOVA and Regression Lab Introduction to the WISE Correlation/Regression and ANOVA Applet This module focuses on the logic of ANOVA with special attention given to variance components and the relationship
More informationLAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION
LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION In this lab you will learn how to use Excel to display the relationship between two quantitative variables, measure the strength and direction of the
More informationDesign of Engineering Experiments Part 2 Basic Statistical Concepts Simple comparative experiments
Design of Engineering Experiments Part 2 Basic Statistical Concepts Simple comparative experiments The hypothesis testing framework The two-sample t-test Checking assumptions, validity Comparing more that
More informationMultiple Regression Methods
Chapter 1: Multiple Regression Methods Hildebrand, Ott and Gray Basic Statistical Ideas for Managers Second Edition 1 Learning Objectives for Ch. 1 The Multiple Linear Regression Model How to interpret
More informationStatistics and Quantitative Analysis U4320
Statistics and Quantitative Analysis U3 Lecture 13: Explaining Variation Prof. Sharyn O Halloran Explaining Variation: Adjusted R (cont) Definition of Adjusted R So we'd like a measure like R, but one
More informationReview for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling
Review for Final For a detailed review of Chapters 1 7, please see the review sheets for exam 1 and. The following only briefly covers these sections. The final exam could contain problems that are included
More informationKeppel, G. & Wickens, T.D. Design and Analysis Chapter 2: Sources of Variability and Sums of Squares
Keppel, G. & Wickens, T.D. Design and Analysis Chapter 2: Sources of Variability and Sums of Squares K&W introduce the notion of a simple experiment with two conditions. Note that the raw data (p. 16)
More informationBattery Life. Factory
Statistics 354 (Fall 2018) Analysis of Variance: Comparing Several Means Remark. These notes are from an elementary statistics class and introduce the Analysis of Variance technique for comparing several
More informationIndependent Samples ANOVA
Independent Samples ANOVA In this example students were randomly assigned to one of three mnemonics (techniques for improving memory) rehearsal (the control group; simply repeat the words), visual imagery
More informationMultiple Regression Examples
Multiple Regression Examples Example: Tree data. we have seen that a simple linear regression of usable volume on diameter at chest height is not suitable, but that a quadratic model y = β 0 + β 1 x +
More informationMultiple Regression. Inference for Multiple Regression and A Case Study. IPS Chapters 11.1 and W.H. Freeman and Company
Multiple Regression Inference for Multiple Regression and A Case Study IPS Chapters 11.1 and 11.2 2009 W.H. Freeman and Company Objectives (IPS Chapters 11.1 and 11.2) Multiple regression Data for multiple
More informationUsing SPSS for One Way Analysis of Variance
Using SPSS for One Way Analysis of Variance This tutorial will show you how to use SPSS version 12 to perform a one-way, between- subjects analysis of variance and related post-hoc tests. This tutorial
More informationFormal Statement of Simple Linear Regression Model
Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor
More informationAnalysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.
Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a
More informationAnalysing qpcr outcomes. Lecture Analysis of Variance by Dr Maartje Klapwijk
Analysing qpcr outcomes Lecture Analysis of Variance by Dr Maartje Klapwijk 22 October 2014 Personal Background Since 2009 Insect Ecologist at SLU Climate Change and other anthropogenic effects on interaction
More informationOne-Way ANOVA. Some examples of when ANOVA would be appropriate include:
One-Way ANOVA 1. Purpose Analysis of variance (ANOVA) is used when one wishes to determine whether two or more groups (e.g., classes A, B, and C) differ on some outcome of interest (e.g., an achievement
More informationThe entire data set consists of n = 32 widgets, 8 of which were made from each of q = 4 different materials.
One-Way ANOVA Summary The One-Way ANOVA procedure is designed to construct a statistical model describing the impact of a single categorical factor X on a dependent variable Y. Tests are run to determine
More informationIn a one-way ANOVA, the total sums of squares among observations is partitioned into two components: Sums of squares represent:
Activity #10: AxS ANOVA (Repeated subjects design) Resources: optimism.sav So far in MATH 300 and 301, we have studied the following hypothesis testing procedures: 1) Binomial test, sign-test, Fisher s
More informationCorrelation and the Analysis of Variance Approach to Simple Linear Regression
Correlation and the Analysis of Variance Approach to Simple Linear Regression Biometry 755 Spring 2009 Correlation and the Analysis of Variance Approach to Simple Linear Regression p. 1/35 Correlation
More informationSimple Linear Regression: One Quantitative IV
Simple Linear Regression: One Quantitative IV Linear regression is frequently used to explain variation observed in a dependent variable (DV) with theoretically linked independent variables (IV). For example,
More informationAnnouncements. Unit 4: Inference for numerical variables Lecture 4: ANOVA. Data. Statistics 104
Announcements Announcements Unit 4: Inference for numerical variables Lecture 4: Statistics 104 Go to Sakai s to pick a time for a one-on-one meeting. Mine Çetinkaya-Rundel June 6, 2013 Statistics 104
More informationMSc / PhD Course Advanced Biostatistics. dr. P. Nazarov
MSc / PhD Course Advanced Biostatistics dr. P. Nazarov petr.nazarov@crp-sante.lu 04-1-013 L4. Linear models edu.sablab.net/abs013 1 Outline ANOVA (L3.4) 1-factor ANOVA Multifactor ANOVA Experimental design
More informationChapte The McGraw-Hill Companies, Inc. All rights reserved.
12er12 Chapte Bivariate i Regression (Part 1) Bivariate Regression Visual Displays Begin the analysis of bivariate data (i.e., two variables) with a scatter plot. A scatter plot - displays each observed
More informationResearch Methods II MICHAEL BERNSTEIN CS 376
Research Methods II MICHAEL BERNSTEIN CS 376 Goal Understand and use statistical techniques common to HCI research 2 Last time How to plan an evaluation What is a statistical test? Chi-square t-test Paired
More informationTopic 22 Analysis of Variance
Topic 22 Analysis of Variance Comparing Multiple Populations 1 / 14 Outline Overview One Way Analysis of Variance Sample Means Sums of Squares The F Statistic Confidence Intervals 2 / 14 Overview Two-sample
More informationStat 6640 Solution to Midterm #2
Stat 6640 Solution to Midterm #2 1. A study was conducted to examine how three statistical software packages used in a statistical course affect the statistical competence a student achieves. At the end
More information