Biological Applications of ANOVA - Examples and Readings

BIO 575 Biological Applications of ANOVA - Winter Quarter 2010
ANOVA Pac: Biological Applications of ANOVA - Examples and Readings

One-factor Model I (Fixed Effects)

This is the same One-factor ANOVA example used by Dr. M. in Biometrics class (it's in the BIO 211 Test Pac). The data are birth weights (g) for 36 babies. Each baby is categorized on the basis of the smoking habits of the mother during pregnancy. The value of the SMOKING variable below indicates group membership (i.e. the level). The three levels are: 1 = Nonsmoking, 2 = up to 1 pack/day, 3 = 1+ pack/day. The WEIGHT variable contains (you guessed it!) the birth weights.

The program below does a "complete" analysis: it tests the assumptions of normality and homoscedasticity; does the ANOVA; does a planned contrast of nonsmoking babies vs. the combination of the two smoking groups; and also does 11 different multiple comparison tests.

SAS does four different tests for normality. The Shapiro-Wilk test is the most widely used. The null hypothesis is Ho: Distribution is Normal. So, when we fail to reject Ho (p > 0.05), we have no evidence against normality and treat the distribution as normal. Note that for each of the three smoking groups, all four normality tests lead to that conclusion.

The test for homoscedasticity is Brown and Forsythe's Test for Homogeneity of WEIGHT Variance. This is a form of Levene's Test, and is an ANOVA done on the absolute deviation of each weight from the group median.

The contrast tests the nonsmoking babies (n = 12, mean = …) against the smoking babies (n = 24, mean = …). Note that "the smoking babies" means all 24 babies in the 1 pack/day and 1+ pack/day groups combined. We can calculate the Contrast SS as a simple Groups SS with two groups:

$$\text{Groups SS} = \sum_j n_j (\bar{X}_j - \bar{X})^2 = 12(\ldots)^2 + 24(\ldots)^2 = \ldots = \text{Contrast SS}$$

Note that all 11 multiple comparison tests give the same result, i.e. that the 1+ pack/day group is different from the other two. Since all the tests agree, this is a robust conclusion.
DATA BABYWT;
  INPUT SMOKING WEIGHT @@;   * @@ allows multiple observations per line;
  SELECT (SMOKING);
    WHEN (1) SMOKE='Nonsmoke';
    WHEN (2) SMOKE='1 Pack';
    WHEN (3) SMOKE='1+ Pack';
  END;
CARDS;
;
PROC UNIVARIATE NORMAL;
  CLASS SMOKE;
  VAR WEIGHT;
PROC GLM;
  CLASS SMOKE;
  MODEL WEIGHT = SMOKE / SS3;
  CONTRAST 'Nonsmoking vs Smoking' SMOKE 1 1 -2 / E;
  MEANS SMOKE / HOVTEST=BF Tukey SNK Bon LSD REGWQ Scheffe Duncan Sidak Gabriel SMM Waller lines;
RUN;
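The claim that the Contrast SS equals a two-group Groups SS can be checked numerically. The sketch below is only an illustration under stated assumptions: the birth weights are not listed in this transcript, so it uses the TIME values from the CLINIC data step later in this pack as stand-in data, grouped by smoking level, and all variable names are made up for the example. The equality relies on the two smoking groups having equal sample sizes, as they do in both data sets.

```python
# Stand-in data (assumption): TIME in minutes from the CLINIC data step,
# pooled over DEVICE and grouped by SMOKING. Not the birth-weight data.
groups = {
    "NON":   [12.8, 13.5, 11.2, 17.8, 18.1, 16.2],
    "MOD":   [10.9, 11.1, 9.8, 15.5, 13.8, 16.2],
    "HEAVY": [8.7, 9.2, 9.5, 14.7, 13.2, 10.1],
}
# Contrast coefficients: nonsmokers vs. the two smoking groups combined.
coef = {"NON": -2, "MOD": 1, "HEAVY": 1}


def mean(values):
    return sum(values) / len(values)


# Contrast SS = (sum of c_j * mean_j)^2 / sum of (c_j^2 / n_j)
estimate = sum(coef[g] * mean(v) for g, v in groups.items())
contrast_ss = estimate ** 2 / sum(coef[g] ** 2 / len(v) for g, v in groups.items())

# The same quantity as a Groups SS with two groups: nonsmokers vs. all smokers.
non = groups["NON"]
smokers = groups["MOD"] + groups["HEAVY"]
grand = mean(non + smokers)
groups_ss = sum(len(v) * (mean(v) - grand) ** 2 for v in (non, smokers))

print(round(contrast_ss, 4), round(groups_ss, 4))
```

Both numbers printed should agree, which is the point made in the text: a contrast with 1 degree of freedom is a Groups SS for the two pooled groups it compares.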

Example 1 - One-factor Model I (Fixed Effects): SAS output

The UNIVARIATE Procedure
Variable: WEIGHT

The same set of tables is printed for each level of SMOKE (1 Pack, 1+ Pack, Nonsmoke): Moments, Basic Statistical Measures, Tests for Location (Mu0=0), Tests for Normality, Quantiles (Definition 5), and Extreme Observations. Values that can still be read from the listing:

SMOKE      N    Sum Weights   Range   Student's t Pr > |t|   Sign M   Signed Rank S
1 Pack     12   12            1134    <.0001                 6        39
1+ Pack    12   12            1870    <.0001                 6        39
Nonsmoke   12   12                    <.0001                 6        39

Tests for Normality (reported for each group; statistics and p values not shown):
  Shapiro-Wilk         W      Pr < W
  Kolmogorov-Smirnov   D      Pr > D
  Cramer-von Mises     W-Sq   Pr > W-Sq
  Anderson-Darling     A-Sq   Pr > A-Sq

The GLM Procedure

Class Level Information
Class   Levels   Values
SMOKE   3        1 Pack, 1+ Pack, Nonsmoke

Number of Observations Read   36
Number of Observations Used   36

Coefficients for Contrast Nonsmoking vs Smoking
                   Row 1
Intercept           0
SMOKE 1 Pack        1
SMOKE 1+ Pack       1
SMOKE Nonsmoke     -2

The GLM Procedure
Dependent Variable: WEIGHT

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2
Error             33
Corrected Total   35

[Also printed, values not shown: R-Square, Coeff Var, Root MSE, WEIGHT Mean; Type III SS for SMOKE; Contrast SS for Nonsmoking vs Smoking]

Brown and Forsythe's Test for Homogeneity of WEIGHT Variance
ANOVA of Absolute Deviations from Group Medians
[Sources SMOKE and Error; values not shown]

Multiple comparison tests for WEIGHT

Every test below uses Error Degrees of Freedom 33. All except Waller-Duncan use Alpha 0.05; Waller-Duncan uses Kratio 100 and F Value 9.18. The SAS note printed with each test is:

  Waller-Duncan K-ratio t Test: This test minimizes the Bayes risk under additive loss and certain other assumptions.
  t Tests (LSD): This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
  Duncan's Multiple Range Test: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
  Student-Newman-Keuls Test: This test controls the Type I experimentwise error rate under the complete null hypothesis but not under partial null hypotheses.
  Ryan-Einot-Gabriel-Welsch Multiple Range Test (REGWQ): This test controls the Type I experimentwise error rate.
  Tukey's Studentized Range (HSD) Test: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.
  Studentized Maximum Modulus (GT2) Test (SMM): same note as Tukey.
  Sidak t Tests: same note as Tukey.
  Bonferroni (Dunn) t Tests: same note as Tukey.
  Scheffe's Test: This test controls the Type I experimentwise error rate.

Every test reports the same grouping. Means with the same letter are not significantly different.

Grouping   SMOKE      N
A          Nonsmoke   12
A          1 Pack     12
B          1+ Pack    12
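The Brown and Forsythe test in the GLM output is described as an ANOVA on the absolute deviation of each observation from its group median. The sketch below follows that description step by step. It is an illustration, not SAS's implementation: the data are the CLINIC TIME values from later in this pack used as a stand-in (an assumption, since the birth weights are not listed), and `one_way_f` is a helper name invented here.

```python
from statistics import mean, median

# Stand-in data (assumption): CLINIC TIME values grouped by SMOKING.
groups = {
    "NON":   [12.8, 13.5, 11.2, 17.8, 18.1, 16.2],
    "MOD":   [10.9, 11.1, 9.8, 15.5, 13.8, 16.2],
    "HEAVY": [8.7, 9.2, 9.5, 14.7, 13.2, 10.1],
}


def one_way_f(samples):
    """Return (F, df_groups, df_error) for a one-factor ANOVA."""
    everything = [x for s in samples for x in s]
    grand = mean(everything)
    ss_groups = sum(len(s) * (mean(s) - grand) ** 2 for s in samples)
    ss_error = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_groups = len(samples) - 1
    df_error = len(everything) - len(samples)
    return (ss_groups / df_groups) / (ss_error / df_error), df_groups, df_error


# Step 1: replace each observation by |value - group median|.
abs_dev = [[abs(x - median(v)) for x in v] for v in groups.values()]

# Step 2: run an ordinary one-factor ANOVA on those absolute deviations.
bf_f, df_groups, df_error = one_way_f(abs_dev)
print(f"Brown-Forsythe F = {bf_f:.3f} on {df_groups} and {df_error} df")
```

A small F here means the groups have similar spread around their medians, which is what the homoscedasticity assumption asks for.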

ANOVA by Regression

In this example, we'll do the same ANOVA we did in Example 1 (i.e. babies categorized by smoking status of mom during pregnancy). However, this time we will have Excel do the ANOVA by using regression procedures. This approach is not just for fun, or to see if we can fool Excel into doing "stupid ANOVA tricks". When we deal with unbalanced designs, it will be very important to understand that ANOVA problems can be solved by using regression. Also, SAS and other major statistical packages use this approach.

Below are the data and "summary output" from the Excel Regression Data Analysis tool. You can see that the birth weights are in the third variable (column). The first two variables are "dummy variables": they are codes that indicate smoking status.

X1   X2   Smoking group
0    0    Nonsmoking
1    0    1 pack/day
0    1    1+ pack/day

First, check out the ANOVA table, and compare it to the ANOVA table prepared by SAS (or from the TestPac). Notice that Total SS is the same in both. Regression SS below is the same as Groups SS in the TestPac (called SMOKE SS in the SAS output). Residual SS below is the same as Error SS. The DF, MS, and F values also are the same.

SUMMARY OUTPUT
[Excel listing, values not shown. Regression Statistics: Multiple R, R Square, Adjusted R Square, Standard Error, Observations. ANOVA: df, SS, MS, F, Significance F for Regression, Residual, Total. Coefficient table: Coefficients, Standard Error, t Stat, P-value, Lower 95% for Intercept, X Variable 1, X Variable 2.]

ANOVA by Regression: Calculations and Sources of Variation

The regression done above is a "multiple linear regression". In BIO 211, we did "simple linear regression", which means there was one dependent and one independent variable. Our regression "model" in BIO 211 was $Y = a + bX$. In multiple linear regression, there is one dependent variable and two or more independent variables. Our multiple regression model is

$$Y = a + b_1 X_1 + b_2 X_2$$

In our data, Y is the baby weights, $X_1$ is the first dummy variable, and $X_2$ the second dummy variable. The $b_1$ and $b_2$ values are called "partial regression coefficients". They are like the slope of the line in simple regression, and they are parameter estimates whose values are determined from the data. The interpretations are:

  $b_1$ shows the effect of $X_1$ on Y while holding $X_2$ constant
  $b_2$ shows the effect of $X_2$ on Y while holding $X_1$ constant

From the Excel output above, you can read off our equation: a is the Intercept coefficient, $b_1$ is the X Variable 1 coefficient, and $b_2$ is the X Variable 2 coefficient. Let's see the predicted values of Y (calculated by putting in the values of the dummy variables):

[Table of X1, X2, observed baby weight, and predicted weight; values not shown]

Does anything here look familiar? Do you see how this works? Note that the predicted value for each baby is the mean for that group.

  When $X_1 = 0$ and $X_2 = 0$, then $Y = a$, the mean of the nonsmoking group.
  When $X_1 = 1$ and $X_2 = 0$, then $Y = a + b_1(1)$. In other words, the mean of the 1 Pack/day group is $|b_1|$ g less than the mean of the nonsmoking group.
  When $X_1 = 0$ and $X_2 = 1$, then $Y = a + b_2(1)$. In other words, the mean of the 1+ Pack/day group is $|b_2|$ g less than the mean of the nonsmoking group.

This should make sense to you. All the dummy variables tell us is what smoking group a baby belongs to. If you're trying to estimate (predict) the birth weight of a baby, and all you know is what smoking group it is in, your "best guess" is the mean of the group. For example, let's say a baby has just been born, and it is classified into the 1 Pack/day group. Now, pretend you have to guess the birth weight, and for every gram you are off, you have to pay Dr. M. $1.00! What do you do?? Your "best guess" is the mean of the 1 Pack/day group, because that is right in the middle of the group. That should minimize how much you have to pay to the evil Dr. M.
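The way dummy variables turn a regression into a comparison of group means can be sketched in a few lines. This is a minimal illustration under stated assumptions: the birth weights are not listed in this transcript, so the CLINIC TIME values from later in this pack stand in for them; the 0/1 coding mirrors the Excel layout (reference group coded 0 0); and the small `solve` helper is written here for the example rather than taken from any package.

```python
# Stand-in data (assumption): CLINIC TIME values grouped by SMOKING.
groups = {
    "NON":   [12.8, 13.5, 11.2, 17.8, 18.1, 16.2],
    "MOD":   [10.9, 11.1, 9.8, 15.5, 13.8, 16.2],
    "HEAVY": [8.7, 9.2, 9.5, 14.7, 13.2, 10.1],
}
# Dummy coding as in the Excel example: the reference level is coded (0, 0).
codes = {"NON": (0, 0), "MOD": (1, 0), "HEAVY": (0, 1)}

# One row per observation: intercept, X1, X2.
X = [[1.0, *codes[g]] for g, v in groups.items() for _ in v]
y = [t for v in groups.values() for t in v]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * w for u, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Least squares via the normal equations X'X b = X'Y.
p = len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
a, b1, b2 = solve(xtx, xty)

print("intercept:", round(a, 4), "= mean of reference group")
print("b1:", round(b1, 4), "= MOD mean minus NON mean")
print("b2:", round(b2, 4), "= HEAVY mean minus NON mean")
```

The intercept comes out as the reference-group mean and each slope as a difference between group means, so plugging in the dummy values reproduces each group's mean as the predicted value.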

Now, for the all-important sources of variation and sums of squares. First, in multiple regression, the sources of variation are exactly as they were in simple linear regression. Namely:

  Total is the variation of the observed value of the dependent variable.
  Regression is the variation of the predicted value of the dependent variable.
  Residual (Error) is the variation of the difference between observed and predicted.

$$\text{Total SS} = \sum_i (Y_i - \bar{Y})^2$$
$$\text{Regression SS} = \sum_i (\hat{Y}_i - \bar{Y})^2$$
$$\text{Residual (Error) SS} = \sum_i (Y_i - \hat{Y}_i)^2$$

Now, compare and contrast SS in one-factor ANOVA with SS in regression.

Total SS
Notice that Total SS is exactly the same in one-factor ANOVA and regression. Look at the formula above, and then the ANOVA formula from the TestPac. In both cases, Total SS is the sum of the squared deviation of each baby weight from the grand mean of the baby weights.

Groups SS = Regression SS
Groups SS in ANOVA is

$$\text{Groups SS} = \sum_j n_j (\bar{X}_j - \bar{X})^2$$

At first, you're thinking this is totally different from Regression SS. But, it's the same! First of all, don't be thrown off by the use of X and Y. The X and Y both refer to the same variable in this case, i.e. baby birth weight. Groups SS says you take the (group mean - grand mean)^2 and multiply by the number of data points in the group. Look at the calculation of Groups SS in the TestPac. Now, think about how you would evaluate Regression SS. Remember, the predicted value of Y is the mean of that group. And what about the grand mean? It's the same: the mean of all 36 babies. So, in each group, you're taking the (group mean - grand mean)^2, and you do this once for each baby in the group. This is just like multiplying by the number of babies in the group.

Error SS = Residual (Error) SS
Check the TestPac pages for the calculation of Error SS in ANOVA. You calculate a SS for each group (each baby from their group mean), and then add them together. In Residual (Error) SS in regression, you're taking each baby minus the predicted baby and squaring, and then adding them all up. But remember, the predicted value for each baby is its group mean! So, you're doing the same thing as in ANOVA.

It is important that you understand the relationship between the sources of variation in ANOVA and regression. See Dr. M. if this is causing you problems. It's really not hard - you will get it if you think about it for a bit. If you don't remember the definition of "important" from BIO 211, ask Dr. M. in class!
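The correspondence between the ANOVA and regression sums of squares can be verified directly once the predicted value of every observation is set to its group mean. The sketch below does that. As an assumption, it uses the CLINIC TIME values from later in this pack as stand-in data, because the birth weights are not listed here; the variable names are illustrative.

```python
# Stand-in data (assumption): CLINIC TIME values grouped by SMOKING.
groups = {
    "NON":   [12.8, 13.5, 11.2, 17.8, 18.1, 16.2],
    "MOD":   [10.9, 11.1, 9.8, 15.5, 13.8, 16.2],
    "HEAVY": [8.7, 9.2, 9.5, 14.7, 13.2, 10.1],
}


def mean(values):
    return sum(values) / len(values)


y = [t for v in groups.values() for t in v]
grand = mean(y)

# Regression view: the predicted value for each observation is its group mean.
y_hat = [mean(v) for v in groups.values() for _ in v]
total_ss = sum((yi - grand) ** 2 for yi in y)
regression_ss = sum((yh - grand) ** 2 for yh in y_hat)
residual_ss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

# ANOVA view: n_j * (group mean - grand mean)^2, summed over groups.
groups_ss = sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())

print("Total SS      ", round(total_ss, 4))
print("Regression SS ", round(regression_ss, 4), "Groups SS", round(groups_ss, 4))
print("Residual SS   ", round(residual_ss, 4))
```

Regression SS and Groups SS print the same value, and Regression SS plus Residual SS adds up to Total SS.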

ANOVA as Done by Statistics Programs

Now that we've seen how ANOVA is done by regression, we can expand a bit and take a peek at how a statistics program (e.g. SAS) actually does these procedures. First, a couple of important points:

1. This is not comprehensive. We're not going to look at every detail of the calculations used by computer programs, but we will look at some of them.
2. Don't worry about these details for the exam. You're not expected to recreate all the matrices and other details. The general concepts of ANOVA by Regression in the previous section are important, but the details here are "value added" (which means "not on the test").

Our example will be the baby birth weight example (again). You tell the program what the response variable is (birth weight), and what the factor is (Smoking). The program then looks at your data and figures out that N (the total sample size) is 36 and that the factor has three levels. The program then knows it's working with the following model:

$$Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \varepsilon$$

where

  Y is the dependent (response) variable (birth weight).
  $b_0$ is the intercept. The intercept is included by default, but you can request it not be included in the model. Don't do this unless you really know what you are doing.
  $b_0$, $b_1$, $b_2$, and $b_3$ are parameter estimates. These are the unknowns. The program has to estimate these parameters to do the analysis.
  $X_1$, $X_2$, and $X_3$ are dummy variables that indicate to what smoking level the baby belongs. The values 1 0 0 indicate a nonsmoking baby; 0 1 0 is a 1 pack/day baby; and 0 0 1 is a 1+ pack/day baby.
  $\varepsilon$ is the error (residual).

The program then writes the model in matrix terms:

$$Y = X\beta + \varepsilon$$

where Y is a vector containing the birth weights, and $\beta$ is a vector containing the $b_i$ symbols:

$$\beta = \begin{pmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{pmatrix}$$

X is a matrix called the "design matrix" that has the dummy variables. There are 4 columns in the design matrix. All the values in the first column will be 1. This first column refers to the intercept. The next 3 columns are the dummy variables $X_1$, $X_2$, and $X_3$, in that order.

The OLS (ordinary least squares) solution is to solve for $\beta$:

$$X'Y = X'X\beta$$
$$(X'X)^{-}X'X\beta = (X'X)^{-}X'Y$$
$$I\beta = (X'X)^{-}X'Y$$
$$\beta = (X'X)^{-}X'Y$$

where X' is the transpose of X, $(X'X)^{-}$ is the inverse of X'X, and I is the identity matrix. (Strictly, the step to $I\beta$ holds when X'X has an ordinary inverse; the singular case is taken up when we reach $(X'X)^{-}$.)

Although we certainly will calculate the error term ($\varepsilon$) along the way, we don't include it here in our matrix approach. Let's begin by looking at the elements of Y and X.

The Y vector is a single column holding the 36 birth weights. [Listing of Y not shown]

The design matrix X has 36 rows and 4 columns: each row is a 1 for the intercept followed by that baby's values of $X_1$, $X_2$, and $X_3$. [Listing of X not shown]

The program needs the transpose of Y (written Y') and the transpose of X (written X'). Y' has all the weights listed as a single row (a row vector). There's not enough room here to get all 36 birth weights on a single line, so you have to use your imagination. X' is the 4x36 matrix whose rows are the columns of X. [Listing of X' not shown]

Next, the program calculates X'X. Since X' is a 4x36 matrix and X is a 36x4 matrix, X'X must be a 4x4 matrix. That is: (4x36) x (36x4) = 4x4. Notice that the principal diagonal has the sample sizes: for all data in the 1,1 position, then for each level as you go down the diagonal. With N = 36 and 12 babies per level, X'X is:

$$X'X = \begin{pmatrix} 36 & 12 & 12 & 12 \\ 12 & 12 & 0 & 0 \\ 12 & 0 & 12 & 0 \\ 12 & 0 & 0 & 12 \end{pmatrix}$$

Reality check: Programs may not actually construct the design matrix. They may read a line of data, construct the appropriate line for the design matrix, transpose that line, and then multiply the transpose by the design matrix line. Thus, the X'X matrix is being accumulated, one line at a time.
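The "reality check" above, accumulating X'X one data line at a time without ever storing X, can be sketched as follows. This is an illustration only, not how any particular program is written. As an assumption, it uses the level counts of the CLINIC data from later in this pack (6 observations per smoking level) as a stand-in for the 12-per-level birth-weight data; `design_row` is a helper name invented here.

```python
# Stand-in data (assumption): CLINIC TIME values grouped by SMOKING.
groups = {
    "NON":   [12.8, 13.5, 11.2, 17.8, 18.1, 16.2],
    "MOD":   [10.9, 11.1, 9.8, 15.5, 13.8, 16.2],
    "HEAVY": [8.7, 9.2, 9.5, 14.7, 13.2, 10.1],
}
levels = list(groups)  # column order of the dummy variables X1, X2, X3


def design_row(level):
    """One line of the design matrix: intercept, then one indicator per level."""
    return [1] + [1 if level == name else 0 for name in levels]


# Accumulate X'X one observation at a time (row' times row, added to the total).
p = 1 + len(levels)
xtx = [[0] * p for _ in range(p)]
for level, values in groups.items():
    for _ in values:
        row = design_row(level)
        for i in range(p):
            for j in range(p):
                xtx[i][j] += row[i] * row[j]

for line in xtx:
    print(line)
```

The printed matrix has the total sample size in the 1,1 position and the level sample sizes down the rest of the diagonal, the same pattern described for the birth-weight example.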

X'Y
(4x36) x (36x1) = 4x1. The resulting vector contains $\sum Y_i$ (sum of the birth weights) values. The first element is for all the data; subsequent elements are for the levels of the smoking factor. Since sample sizes are in the principal diagonal of the X'X matrix, the grand mean and level means may now be calculated:

  element 1 / 36 = the grand mean
  element 2 / 12 = the nonsmoking mean
  element 3 / 12 = the 1 pack/day mean
  element 4 / 12 = the 1+ pack/day mean

[Values of X'Y not shown]

Y'X
(1x36) x (36x4) = 1x4. This vector has the same elements as X'Y, but as a row vector. It is useful for later calculations. [Values of Y'X not shown]

Y'Y
(1x36) x (36x1) = 1x1. This is $\sum Y_i^2$, i.e. the sum of the squared birth weights. This quantity is sometimes called the "Uncorrected SS". [Value of Y'Y not shown]

Since we have $\sum Y_i$ for all the data as the first element of X'Y, and the total sample size (36) from the X'X matrix, the program may now calculate Total SS by the "machine formula":

$$SS_{Total} = \sum_{i=1}^{N} Y_i^2 - \frac{\left(\sum_{i=1}^{N} Y_i\right)^2}{N}, \qquad N = 36$$

$(X'X)^{-}$
In order to do several more calculations, the program now needs to calculate the inverse of the X'X matrix, which is symbolized by $(X'X)^{-}$. But a problem is that the X'X matrix is singular (determinant = 0), and therefore has no inverse. Mathematicians have developed a method called "generalized inverse" to deal with this situation. A frequently used generalized inverse is the $g_2$-inverse, also called a reflexive generalized inverse.

Let A represent a square matrix of order p, and let G also be a square matrix of order p. G is a $g_2$-inverse of A, and A is a $g_2$-inverse of G (that's why it's called reflexive), if both of the following conditions are met:

1. AGA = A
2. GAG = G

The generalized inverse of X'X is found by a matrix operation called "sweeping", which involves working on the matrix one row at a time. The $g_2$-inverse found by the sweeping algorithm is not unique; different solutions can be obtained depending on how the matrix is swept. Fortunately, we just need an inverse, and don't need to worry about the details of how the sweeping operator functions. We just let the computer tell us the $g_2$-inverse it found. [Values of $(X'X)^{-}$ not shown]
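The two defining conditions of a $g_2$-inverse can be checked exactly for an X'X matrix of the kind used here. The sketch below is an illustration under assumptions: the sample sizes are 6 per level (the CLINIC data from later in this pack, standing in for 12 per level), and G is one particular $g_2$-inverse, built by inverting the block for the intercept and the first two levels and padding the last level with zeros. That choice matches a fitted model whose last coefficient is 0, but whether it is exactly the matrix a given program's sweep returns is not confirmed by the source.

```python
from fractions import Fraction

# Level sample sizes (assumption: the balanced CLINIC data, 6 per level).
n1, n2, n3 = 6, 6, 6
N = n1 + n2 + n3

# A = X'X for an intercept plus one indicator column per level.
A = [
    [N,  n1, n2, n3],
    [n1, n1, 0,  0],
    [n2, 0,  n2, 0],
    [n3, 0,  0,  n3],
]

# One g2-inverse: inverse of the leading 3x3 block, zero row/column for level 3.
G = [
    [Fraction(1, n3),  Fraction(-1, n3),                  Fraction(-1, n3),                  0],
    [Fraction(-1, n3), Fraction(1, n1) + Fraction(1, n3), Fraction(1, n3),                   0],
    [Fraction(-1, n3), Fraction(1, n3),                   Fraction(1, n2) + Fraction(1, n3), 0],
    [0,                0,                                 0,                                 0],
]


def matmul(p, q):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]


aga = matmul(matmul(A, G), A)
gag = matmul(matmul(G, A), G)
print("AGA == A:", aga == A)
print("GAG == G:", gag == G)
```

Using `Fraction` keeps the arithmetic exact, so the two comparisons are true equalities rather than approximate ones.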

Y'X(X'X)⁻X'Y
We're multiplying 3 matrices here. Y'X is 1x4; (X'X)⁻ is 4x4; X'Y is 4x1, so: (1x4) x (4x4) = 1x4, then (1x4) x (4x1) is 1x1. [Value of Y'X(X'X)⁻X'Y not shown]

This is just an intermediate value. What we want to do is subtract this from the Y'Y value:

$$\text{Error SS} = Y'Y - Y'X(X'X)^{-}X'Y$$

This is the Error SS. Notice we now have Total SS, Error SS, and all the sample sizes. We would now be able to complete the ANOVA table and do the F test.

(X'X)⁻X'Y
(4x4) x (4x1) = 4x1. This is the calculation of the $b_i$ values. [Values of (X'X)⁻X'Y not shown]

So, our model is $Y = b_0 + b_1 X_1 + b_2 X_2 + 0X_3$, with the first three values taken from (X'X)⁻X'Y. We can now plug in the values for dummy variables $X_1$, $X_2$, and $X_3$ and calculate the predicted birth weights:

[Table of Intercept, X1, X2, X3, Predicted Y; values not shown]

Again, just as we saw in ANOVA by Regression, the key thing to note here is that the predicted weight for each baby is the mean of its smoking group.

The method we've looked at here is more general than using Excel, and this method is how most real statistics programs approach ANOVA models. Of course, as the ANOVA model gets more complicated, so do all of these matrices. But the general principles remain the same.
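The chain of matrix products described in this section (X'Y, Y'Y, the machine formula, the b values, and Error SS) can be followed on a small data set. The sketch below is only an illustration. Its assumptions: the CLINIC TIME values from later in this pack stand in for the birth weights, the dummy columns are ordered NON, MOD, HEAVY, and G is a $g_2$-inverse that zeroes the last level (consistent with a model ending in $0X_3$, though not confirmed to be the matrix SAS produces).

```python
# Stand-in data (assumption): CLINIC TIME values grouped by SMOKING.
groups = {
    "NON":   [12.8, 13.5, 11.2, 17.8, 18.1, 16.2],
    "MOD":   [10.9, 11.1, 9.8, 15.5, 13.8, 16.2],
    "HEAVY": [8.7, 9.2, 9.5, 14.7, 13.2, 10.1],
}
y = [t for v in groups.values() for t in v]
n1, n2, n3 = (len(v) for v in groups.values())

# X'Y: grand total first, then one total per level.
xty = [sum(y)] + [sum(v) for v in groups.values()]
# Y'Y: the uncorrected sum of squares.
yty = sum(t * t for t in y)
# Machine formula for Total SS.
total_ss = yty - xty[0] ** 2 / len(y)

# A g2-inverse of X'X with the last level set to zero (assumed choice).
G = [
    [1 / n3,  -1 / n3,         -1 / n3,         0.0],
    [-1 / n3, 1 / n1 + 1 / n3, 1 / n3,          0.0],
    [-1 / n3, 1 / n3,          1 / n2 + 1 / n3, 0.0],
    [0.0,     0.0,             0.0,             0.0],
]

# b = (X'X)^- X'Y
b = [sum(g * t for g, t in zip(row, xty)) for row in G]
# Error SS = Y'Y - Y'X (X'X)^- X'Y
error_ss = yty - sum(bi * ti for bi, ti in zip(b, xty))
# Predicted value per level: intercept plus that level's coefficient.
predicted = {name: b[0] + b[k + 1] for k, name in enumerate(groups)}

print("b =", [round(v, 4) for v in b])
print("Total SS =", round(total_ss, 4), " Error SS =", round(error_ss, 4))
print("predicted =", {k: round(v, 4) for k, v in predicted.items()})
```

The last coefficient is 0, the intercept is the mean of the last level, and each predicted value equals its group mean, which is the pattern the text describes.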

Randomized Block (no replications)

TITLE 'Randomized Block (no replications)';
* This is the Randomized Block example found in Zar (4th ed). The response variable is weight gain (g) in guinea pigs. There is a Diet factor with 4 levels, and a Block factor with 5 levels. The blocks represent different rooms that have slightly different conditions (noise level, light/dark cycle). Each of the five rooms houses four guinea pigs, one on each of the four diets.

If you had BIO 211 with Dr. M., this should sound familiar, as this example was also used in class. HOWEVER, if you took BIO 211 before Fall 1999, the data were different from what you see below. The problem setup (response variable, diets, blocks (rooms)) was exactly the same, only the numbers have changed! Before Fall 1999, Dr. M. used the data from the 1st and 2nd editions of the Zar text. When the 3rd edition of Zar came out (late 1996), the data were changed - but Dr. M. didn't change the class example until Fall 1999. You may wonder why Zar kept the same problem setup, but changed the numbers - well, join the club! Dr. M. would like to hear the answer to that question! Also, if you took BIO 211 before Fall 1999, why haven't you graduated yet?

We will do three ANOVAs here: (1) a One-factor grouping by diets, (2) a One-factor grouping by Blocks (rooms), and (3) a Two-factor grouping by both diets and blocks. What you should do is examine the SS, DF and MS due to Diets and Blocks in the One-factor and the Two-factor ANOVAs. See if you can detect the pattern, and explain it! Also look at what happens to the Error (unexplained) source in the ANOVAs.
;
DATA G_PIGS;
  INPUT WT_GAIN DIET BLOCK;
CARDS;
;
PROC GLM;
  CLASS DIET;
  MODEL WT_GAIN = DIET;
PROC GLM;
  CLASS BLOCK;
  MODEL WT_GAIN = BLOCK;
PROC GLM;
  CLASS DIET BLOCK;
  MODEL WT_GAIN = DIET BLOCK;
  MEANS DIET BLOCK;
RUN;
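The comparison the program comment asks for (Diet SS, Block SS, and Error SS across the three analyses) can be tried on a toy layout before looking at the SAS output. The sketch below is an illustration only: the weight gains are HYPOTHETICAL numbers invented for this example, not Zar's data, arranged as 5 blocks by 4 diets with one animal per cell, and the variable names are made up.

```python
# HYPOTHETICAL weight gains (not Zar's data): rows = 5 blocks (rooms),
# columns = 4 diets, one guinea pig per cell.
gain = [
    [10, 12, 14, 16],
    [11, 14, 15, 18],
    [9,  11, 15, 17],
    [12, 13, 16, 19],
    [8,  12, 13, 15],
]


def mean(values):
    return sum(values) / len(values)


values = [x for row in gain for x in row]
grand = mean(values)
block_means = [mean(row) for row in gain]
diet_means = [mean(col) for col in zip(*gain)]

total_ss = sum((x - grand) ** 2 for x in values)
diet_ss = sum(len(gain) * (m - grand) ** 2 for m in diet_means)
block_ss = sum(len(gain[0]) * (m - grand) ** 2 for m in block_means)

# One-factor ANOVA by diet: error = variation of each value about its diet mean.
error_diet_only = sum(
    (gain[i][j] - diet_means[j]) ** 2 for i in range(5) for j in range(4)
)
# Two-factor ANOVA: error = what is left after removing both diet and block.
error_two_factor = sum(
    (gain[i][j] - block_means[i] - diet_means[j] + grand) ** 2
    for i in range(5) for j in range(4)
)

print("Diet SS", round(diet_ss, 3), " Block SS", round(block_ss, 3))
print("Error SS, diet-only model ", round(error_diet_only, 3))
print("Error SS, two-factor model", round(error_two_factor, 3))
```

In a balanced layout like this one, Diet SS and Block SS do not depend on which other factor is in the model; only the Error SS changes, because the two-factor model moves the Block SS out of the error term.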

Randomized Block (no replications): SAS output
14:27 Thursday, June 17, 1999
General Linear Models Procedure

Number of observations in data set = 20 (all three analyses)
Dependent Variable: WT_GAIN

Each analysis prints an ANOVA table (Source, DF, Sum of Squares, Mean Square, F Value, Pr > F for Model, Error, and Corrected Total), then R-Square, C.V., Root MSE, and WT_GAIN Mean, then the Type I SS and Type III SS tables. The numeric values are not shown here; the sources in each analysis are:

Analysis                      Class variables   Sources in the Type I and Type III SS tables
(1) One-factor by diet        DIET              DIET
(2) One-factor by block       BLOCK             BLOCK
(3) Two-factor, diet x block  DIET, BLOCK       DIET, BLOCK

After the third analysis, the MEANS statement prints N, Mean, and SD of WT_GAIN for each level of DIET and for each level of BLOCK. [Values not shown]

Two-factor ANOVA with replications (balanced design)

TITLE 'Two-factor ANOVA with replications (balanced design)';
* This is the same example used in Dr. M's Biometrics class. A clinic that does health evaluations is studying the effect of smoking. The clinic evaluates people using one of two devices: a stationary bicycle and a treadmill. While the subject is on the bike or treadmill, their oxygen consumption is measured, and the time (in minutes) required for the subject to reach their maximum oxygen consumption is noted.

The data below are for 18 people: 6 nonsmokers, 6 moderate, and 6 heavy smokers. From each smoking group, 3 individuals were randomly chosen to ride the bike, and the other 3 walked the treadmill. It is important to note here that every individual was measured on only one device, either the bike or the treadmill. If every individual had been measured on each device, that would be a repeated measures design - we'll deal with that later in the quarter.
;
DATA CLINIC;
  INPUT SMOKING $ DEVICE $ TIME;
CARDS;
NON BIKE 12.8
NON BIKE 13.5
NON BIKE 11.2
NON TREAD 17.8
NON TREAD 18.1
NON TREAD 16.2
MOD BIKE 10.9
MOD BIKE 11.1
MOD BIKE 9.8
MOD TREAD 15.5
MOD TREAD 13.8
MOD TREAD 16.2
HEAVY BIKE 8.7
HEAVY BIKE 9.2
HEAVY BIKE 9.5
HEAVY TREAD 14.7
HEAVY TREAD 13.2
HEAVY TREAD 10.1
;
PROC GLM;
  CLASS SMOKING DEVICE;
  MODEL TIME = SMOKING DEVICE SMOKING*DEVICE;
  MEANS SMOKING / TUKEY;
  MEANS DEVICE SMOKING*DEVICE;
RUN;

Two-factor ANOVA with replications (balanced design): SAS output
16:00 Thursday, June 17, 1999
General Linear Models Procedure

Class Level Information
Class     Levels   Values
SMOKING   3        HEAVY MOD NON
DEVICE    2        BIKE TREAD

Number of observations in data set = 18

Dependent Variable: TIME

The ANOVA table lists Source, DF, Sum of Squares, Mean Square, F Value, and Pr > F for Model, Error, and Corrected Total, followed by R-Square, C.V., Root MSE, and TIME Mean. The Type I SS and Type III SS tables each list SMOKING, DEVICE, and SMOKING*DEVICE. [Values not shown]

Tukey's Studentized Range (HSD) Test for variable: TIME
NOTE: This test controls the type I experimentwise error rate, but generally has a higher type II error rate than REGWQ.
Alpha= 0.05  df= 12  [MSE, Critical Value of Studentized Range, and Minimum Significant Difference not shown]
Means with the same letter are not significantly different.

Tukey Grouping   SMOKING
A                NON
B                MOD
B                HEAVY

The second MEANS statement prints N, Mean, and SD of TIME for each level of DEVICE (BIKE, TREAD) and for each SMOKING by DEVICE cell (HEAVY BIKE, HEAVY TREAD, MOD BIKE, MOD TREAD, NON BIKE, NON TREAD). [Values not shown]
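The sums of squares for this balanced two-factor design can be recomputed by hand from the 18 TIME values in the CLINIC data step. The sketch below does so from cell, row, and column means. It is an illustration of the standard balanced-design formulas, not SAS's matrix method, and the helper `pooled` and the other names are invented for this example.

```python
# TIME values from the CLINIC data step, keyed by (SMOKING, DEVICE).
cells = {
    ("NON", "BIKE"):    [12.8, 13.5, 11.2],
    ("NON", "TREAD"):   [17.8, 18.1, 16.2],
    ("MOD", "BIKE"):    [10.9, 11.1, 9.8],
    ("MOD", "TREAD"):   [15.5, 13.8, 16.2],
    ("HEAVY", "BIKE"):  [8.7, 9.2, 9.5],
    ("HEAVY", "TREAD"): [14.7, 13.2, 10.1],
}


def mean(values):
    return sum(values) / len(values)


def pooled(position, level):
    """All observations whose key has `level` at `position` (0=SMOKING, 1=DEVICE)."""
    return [x for key, v in cells.items() if key[position] == level for x in v]


y = [x for v in cells.values() for x in v]
grand = mean(y)

ss_smoking = sum(
    len(pooled(0, s)) * (mean(pooled(0, s)) - grand) ** 2 for s in ("NON", "MOD", "HEAVY")
)
ss_device = sum(
    len(pooled(1, d)) * (mean(pooled(1, d)) - grand) ** 2 for d in ("BIKE", "TREAD")
)
ss_cells = sum(len(v) * (mean(v) - grand) ** 2 for v in cells.values())
ss_interaction = ss_cells - ss_smoking - ss_device
ss_error = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

df_error = len(y) - len(cells)
mse = ss_error / df_error
f_smoking = (ss_smoking / 2) / mse
f_device = (ss_device / 1) / mse
f_interaction = (ss_interaction / 2) / mse

print("SMOKING        SS", round(ss_smoking, 4), " F", round(f_smoking, 2))
print("DEVICE         SS", round(ss_device, 4), " F", round(f_device, 2))
print("SMOKING*DEVICE SS", round(ss_interaction, 4), " F", round(f_interaction, 2))
print("Error          SS", round(ss_error, 4), " df", df_error)
```

The error degrees of freedom come out as 12, matching the df reported with the Tukey test in the SAS listing.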

27 BIO 575 Biological Applications of ANOVA - Winter Quarter 2010 Page 27 Analysis of Covariance (ANCOVA) TITLE 'Analysis of Covariance (ANCOVA)'; * This is the example that was done in Dr. M's Biometrics class. It uses the same data as in the One-factor ANOVA we did at the beginning of the class: i.e. birth weights of babies grouped by smoking status of the mother during pregnancy. The new variable here is the prepregnancy body weight of the mom (in kg). The first variable indicates smoking group: 1 = none, 2 = 1 pack/day, 3 = 1+ pack/day. The second variable is the birthweight (g), the third variable is the mom weight (kg) ; DATA MOM_BABY; INPUT SMOKING BABY_WT MOM_WT; CARDS; ; PROC Reg; *This PROC Reg does a regression on all 36 data points. If you had Dr. M for BIO 211, this is the example that was used in class for regression; Model BABY_WT = MOM_WT; PROC Reg; *Next, SAS does a regression on each of the smoking groups separately. This is accomplished by the BY SMOKING command; Model BABY_WT = MOM_WT; BY SMOKING; PROC GLM; *The first PROC GLM is used to test if the slopes of the regression lines are the same for each of the smoking groups. This is the interaction term (SMOKING*MOM_WT). The slopes are equal (p = ).; CLASS SMOKING; MODEL BABY_WT = SMOKING MOM_WT SMOKING*MOM_WT;

/* The second PROC GLM does the ANCOVA. The SOLUTION option prints the
   pooled regression coefficient (slope). The value is about [...].
   LSMEANS prints the adjusted means; MEANS prints the means prior to
   adjustment. */
PROC GLM;
  CLASS SMOKING;
  MODEL BABY_WT = SMOKING MOM_WT / SOLUTION;
  MEANS SMOKING;
  LSMEANS SMOKING;
RUN;

Analysis of Covariance (ANCOVA)    1
16:22 Tuesday, December 11, 2001

The REG Procedure
Model: MODEL1
Dependent Variable: BABY_WT

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model
Error
Corrected Total

Root MSE          R-Square
Dependent Mean    Adj R-Sq
Coeff Var

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept
MOM_WT

Analysis of Covariance (ANCOVA)    2

SMOKING=

The REG Procedure
Model: MODEL1
Dependent Variable: BABY_WT

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model
Error
Corrected Total

Root MSE          R-Square
Dependent Mean    Adj R-Sq
Coeff Var

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept
MOM_WT

Example 5 - Analysis of Covariance (ANCOVA)    3

SMOKING=

The REG Procedure
Model: MODEL1
Dependent Variable: BABY_WT

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model
Error
Corrected Total

Root MSE          R-Square
Dependent Mean    Adj R-Sq
Coeff Var

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept
MOM_WT

Analysis of Covariance (ANCOVA)    4

SMOKING=

The REG Procedure
Model: MODEL1
Dependent Variable: BABY_WT

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model
Error
Corrected Total

Root MSE          R-Square
Dependent Mean    Adj R-Sq
Coeff Var

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept
MOM_WT

Analysis of Covariance (ANCOVA)    5
16:22 Tuesday, December 11, 2001

The GLM Procedure

Class Level Information

Class      Levels    Values
SMOKING

Number of observations    36

Analysis of Covariance (ANCOVA)    6
16:22 Tuesday, December 11, 2001

The GLM Procedure

Dependent Variable: BABY_WT

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                                                                <.0001
Error
Corrected Total

R-Square    Coeff Var    Root MSE    BABY_WT Mean

Source             DF    Type I SS    Mean Square    F Value    Pr > F
SMOKING                                                         <.0001
MOM_WT                                                          <.0001
MOM_WT*SMOKING

Source             DF    Type III SS    Mean Square    F Value    Pr > F
SMOKING
MOM_WT
MOM_WT*SMOKING
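The MOM_WT*SMOKING line in this output is the test of equal slopes. It can be written out as a model, as sketched below. The symbols (mu, tau, beta, gamma, epsilon) are notation introduced here for illustration and do not appear in the SAS output. The degrees of freedom follow from the 36 observations and 3 smoking groups stated in the handout.

```latex
\[
  Y_{ij} \;=\; \mu + \tau_i + \beta X_{ij} + \gamma_i X_{ij} + \varepsilon_{ij},
  \qquad i = 1, 2, 3
\]
\[
  H_0:\ \gamma_1 = \gamma_2 = \gamma_3 = 0
  \quad\text{(one common slope } \beta \text{ for all smoking groups)}
\]
\[
  F \;=\; \frac{MS_{\text{MOM\_WT*SMOKING}}}{MS_{\text{Error}}},
  \qquad \text{df} = (3 - 1,\; 36 - 6) = (2,\; 30)
\]
```

A non-significant F here supports dropping the interaction term and fitting the common-slope ANCOVA model, which is what the second PROC GLM does.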

Analysis of Covariance (ANCOVA)    7
16:22 Tuesday, December 11, 2001

The GLM Procedure

Class Level Information

Class      Levels    Values
SMOKING

Number of observations    36

Analysis of Covariance (ANCOVA)    8
16:22 Tuesday, December 11, 2001

The GLM Procedure

Dependent Variable: BABY_WT

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model                                                                <.0001
Error
Corrected Total

R-Square    Coeff Var    Root MSE    BABY_WT Mean

Source       DF    Type I SS    Mean Square    F Value    Pr > F
SMOKING                                                   <.0001
MOM_WT                                                    <.0001

Source       DF    Type III SS    Mean Square    F Value    Pr > F
SMOKING                                                     <.0001
MOM_WT                                                      <.0001

Parameter    Estimate    Standard Error    t Value    Pr > |t|
Intercept           B
SMOKING             B                                 <.0001
SMOKING             B
SMOKING             B    .                 .          .
MOM_WT                                                <.0001

NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

Analysis of Covariance (ANCOVA)    9
16:22 Tuesday, December 11, 2001

The GLM Procedure

Level of          BABY_WT             MOM_WT
SMOKING    N      Mean    Std Dev     Mean    Std Dev

Example 5 - Analysis of Covariance (ANCOVA)    10
16:22 Tuesday, December 11, 2001

The GLM Procedure
Least Squares Means

SMOKING    BABY_WT LSMEAN

Calculation of ANCOVA quantities

Let's define the following symbols:

$\bar{Y}_i$ = mean birth weight for smoking group $i$
$\mathrm{adj}\,\bar{Y}_i$ = adjusted mean birth weight for smoking group $i$
$\bar{X}_i$ = mean prepregnancy weight for moms in smoking group $i$
$\bar{X}$ = grand mean of prepregnancy weight for moms (kg)
$b_p$ = pooled regression coefficient
$x = X - \bar{X}_i$ and $y = Y - \bar{Y}_i$ = deviations from the group means
$\sum xy_i = \sum (X - \bar{X}_i)(Y - \bar{Y}_i)$ = sum of crossproducts for group $i$
$\sum x_i^2 = \sum (X - \bar{X}_i)^2$ = sum of squares of the moms' weights for group $i$

Calculation of the pooled regression coefficient ($b_p$):

Calculate $\sum xy_i$ and $\sum x_i^2$ for each of the smoking groups, and pool (add) them:

$\sum xy_p = \sum xy_1 + \sum xy_2 + \sum xy_3$

This is the pooled sum of crossproducts.

$\sum x_p^2 = \sum x_1^2 + \sum x_2^2 + \sum x_3^2$

This is the pooled sum of squares for the independent variable (the moms' weights).

$b_p = \dfrac{\sum xy_p}{\sum x_p^2}$

Calculation of the adjusted means:

The adjustment to the birth weight means depends on: (1) how far the mean of the moms in that group is from the grand mean of all moms; and (2) the relationship between mom's weight and baby's weight ($b_p$).

$\mathrm{adj}\,\bar{Y}_i = \bar{Y}_i - b_p\,(\bar{X}_i - \bar{X})$

Use this formula to calculate adjusted birth weight means for each of the smoking groups:

Nonsmoking: $\mathrm{adj}\,\bar{Y}_1 = \bar{Y}_1 - b_p\,(\bar{X}_1 - \bar{X}) = \bar{Y}_1 - b_p\,(2.3302)$
1 pack/day: $\mathrm{adj}\,\bar{Y}_2 = \bar{Y}_2 - b_p\,(\bar{X}_2 - \bar{X})$
1+ pack/day: $\mathrm{adj}\,\bar{Y}_3 = \bar{Y}_3 - b_p\,(\bar{X}_3 - \bar{X}) = \bar{Y}_3 - b_p\,(-0.111)$
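The pooling and adjustment steps above can be sketched in a few lines of Python. This is an illustration only: the data are made-up (mom weight in kg, baby weight in g) pairs, not the course data set, and the function names are hypothetical. It assumes the pooled slope is built from within-group deviations, as in the formulas above.

```python
from statistics import mean

# Hypothetical (mom_wt_kg, baby_wt_g) pairs per smoking group; NOT the course data.
groups = {
    "nonsmoking": [(50.0, 3000.0), (60.0, 3400.0), (70.0, 3500.0)],
    "1 pack/day": [(48.0, 2900.0), (58.0, 3100.0), (68.0, 3300.0)],
    "1+ pack/day": [(52.0, 2600.0), (62.0, 2900.0), (72.0, 3000.0)],
}


def within_group_sums(pairs):
    """Return (sum of crossproducts, sum of squares of X) about the group means."""
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sum_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    return sum_xy, sum_xx


def pooled_slope(groups):
    """Pooled regression coefficient: pooled crossproducts / pooled X sum of squares."""
    sums = [within_group_sums(pairs) for pairs in groups.values()]
    return sum(s[0] for s in sums) / sum(s[1] for s in sums)


def adjusted_means(groups):
    """Adjusted mean for each group: group mean of Y minus b_p * (group mean of X - grand mean of X)."""
    b_p = pooled_slope(groups)
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    adjusted = {}
    for name, pairs in groups.items():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        adjusted[name] = y_bar - b_p * (x_bar - grand_x)
    return adjusted


if __name__ == "__main__":
    print("pooled slope:", pooled_slope(groups))
    for name, value in adjusted_means(groups).items():
        print(name, round(value, 2))
```

A group whose moms are heavier than the grand mean has its birth weight mean adjusted downward when the slope is positive, and a group with lighter moms is adjusted upward, which is the behavior described in points (1) and (2) above.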

ANCOVA - Calculation of Sums of Squares (SS)

This is a bit complicated, and it is not necessary to memorize all of this detail, but it may help you understand how the ANCOVA works. When you compare the SS values calculated below with the SAS output, you will note differences. These are due to rounding error in the hand calculations. SAS carries far more precision than what you see below, but working out exactly how SAS does the calculations would not help us understand the method.

Another (and perhaps more accurate) way to look at the ANCOVA is that the analysis tests whether all of the groups can be described by a common (pooled) regression line. If the regression line for each group is the same (same slope and same intercept), then the adjusted means are not significantly different. We can also test whether the slope of that pooled line is significantly different from zero.

So, our first task is to do a regression for each smoking group, except that we use the pooled regression coefficient ($b_p$) in each regression. This requires us to calculate an intercept for each group (using the means for moms and babies in that group). Then we use the regression equation to calculate predicted values for each group. We can then calculate Regression SS and Error SS. We'll do this step by step for each group so you can see what's happening.

Nonsmoking Group

From the SAS output, we take the mean baby weight ($\bar{Y}_1$) and the mean mom weight ($\bar{X}_1$) for this group. We use these means and $b_p$ to calculate an intercept term ($a_1$). This is done just as we did it in BIO 211:

$a_1 = \bar{Y}_1 - b_p\,\bar{X}_1$

So, the equation for this group is $\hat{Y} = a_1 + b_p X$.

Next, we calculate a predicted $\hat{Y}$ by putting each nonsmoking mom's weight in for $X$. Then we calculate Regression SS (use the mean of the predicted values, not of the observed values; with rounded coefficients the two differ slightly) and Error SS.

Mom    Baby    Predicted Baby

Regression SS = $\sum (\hat{Y} - \bar{\hat{Y}})^2$, where $\bar{\hat{Y}}$ is the mean of the predicted values
Error SS = $\sum (Y - \hat{Y})^2$
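The per-group step described above (fixed pooled slope, intercept from the group means, then Regression SS and Error SS) can be sketched as follows. The numbers are hypothetical, including the slope value passed as `b_p`, and the function name `pooled_line_ss` is made up for this illustration. It is not the SAS algorithm.

```python
from statistics import mean


def pooled_line_ss(mom, baby, b_p):
    """Fit a line with a fixed (pooled) slope through the group means.

    Returns (intercept, regression SS, error SS) for one group.
    """
    x_bar, y_bar = mean(mom), mean(baby)
    a = y_bar - b_p * x_bar                      # intercept from the group means
    predicted = [a + b_p * x for x in mom]       # predicted baby weights
    pred_bar = mean(predicted)                   # mean of the predicted values
    reg_ss = sum((p - pred_bar) ** 2 for p in predicted)
    err_ss = sum((y - p) ** 2 for y, p in zip(baby, predicted))
    return a, reg_ss, err_ss


# Hypothetical values for a single group; b_p = 20.0 is also made up.
mom = [50.0, 60.0, 70.0]
baby = [3000.0, 3400.0, 3500.0]
a, reg_ss, err_ss = pooled_line_ss(mom, baby, b_p=20.0)

if __name__ == "__main__":
    print("intercept:", a)
    print("Regression SS:", reg_ss)
    print("Error SS:", err_ss)
```

Repeating this for each smoking group and adding the results gives pooled Regression SS and Error SS values that can be compared with the SAS output, subject to the rounding differences noted above.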

unadjusted model for baseline cholesterol 22:31 Monday, April 19,

unadjusted model for baseline cholesterol 22:31 Monday, April 19, unadjusted model for baseline cholesterol 22:31 Monday, April 19, 2004 1 Class Level Information Class Levels Values TRETGRP 3 3 4 5 SEX 2 0 1 Number of observations 916 unadjusted model for baseline cholesterol

More information

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing

More information

Chapter 8 (More on Assumptions for the Simple Linear Regression)

Chapter 8 (More on Assumptions for the Simple Linear Regression) EXST3201 Chapter 8b Geaghan Fall 2005: Page 1 Chapter 8 (More on Assumptions for the Simple Linear Regression) Your textbook considers the following assumptions: Linearity This is not something I usually

More information

Laboratory Topics 4 & 5

Laboratory Topics 4 & 5 PLS205 Lab 3 January 23, 2014 Orthogonal contrasts Class comparisons in SAS Trend analysis in SAS Multiple mean comparisons Laboratory Topics 4 & 5 Orthogonal contrasts Planned, single degree-of-freedom

More information

This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed.

This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed. EXST3201 Chapter 13c Geaghan Fall 2005: Page 1 Linear Models Y ij = µ + βi + τ j + βτij + εijk This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed.

More information

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum T-test: means of Spock's judge versus all other judges 1 The TTEST Procedure Variable: pcwomen judge1 N Mean Std Dev Std Err Minimum Maximum OTHER 37 29.4919 7.4308 1.2216 16.5000 48.9000 SPOCKS 9 14.6222

More information

1) Answer the following questions as true (T) or false (F) by circling the appropriate letter.

1) Answer the following questions as true (T) or false (F) by circling the appropriate letter. 1) Answer the following questions as true (T) or false (F) by circling the appropriate letter. T F T F T F a) Variance estimates should always be positive, but covariance estimates can be either positive

More information

Handout 1: Predicting GPA from SAT

Handout 1: Predicting GPA from SAT Handout 1: Predicting GPA from SAT appsrv01.srv.cquest.utoronto.ca> appsrv01.srv.cquest.utoronto.ca> ls Desktop grades.data grades.sas oldstuff sasuser.800 appsrv01.srv.cquest.utoronto.ca> cat grades.data

More information

4.8 Alternate Analysis as a Oneway ANOVA

4.8 Alternate Analysis as a Oneway ANOVA 4.8 Alternate Analysis as a Oneway ANOVA Suppose we have data from a two-factor factorial design. The following method can be used to perform a multiple comparison test to compare treatment means as well

More information

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c Inference About the Slope ffl As with all estimates, ^fi1 subject to sampling var ffl Because Y jx _ Normal, the estimate ^fi1 _ Normal A linear combination of indep Normals is Normal Simple Linear Regression

More information

Linear Combinations of Group Means

Linear Combinations of Group Means Linear Combinations of Group Means Look at the handicap example on p. 150 of the text. proc means data=mth567.disability; class handicap; var score; proc sort data=mth567.disability; by handicap; proc

More information

Lecture 5: Comparing Treatment Means Montgomery: Section 3-5

Lecture 5: Comparing Treatment Means Montgomery: Section 3-5 Lecture 5: Comparing Treatment Means Montgomery: Section 3-5 Page 1 Linear Combination of Means ANOVA: y ij = µ + τ i + ɛ ij = µ i + ɛ ij Linear combination: L = c 1 µ 1 + c 1 µ 2 +...+ c a µ a = a i=1

More information

COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 26, 2005, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTION

COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 26, 2005, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTION COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 26, 2005, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTION Answer all parts. Closed book, calculators allowed. It is important to show all working,

More information

5.3 Three-Stage Nested Design Example

5.3 Three-Stage Nested Design Example 5.3 Three-Stage Nested Design Example A researcher designs an experiment to study the of a metal alloy. A three-stage nested design was conducted that included Two alloy chemistry compositions. Three ovens

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Answer Keys to Homework#10

Answer Keys to Homework#10 Answer Keys to Homework#10 Problem 1 Use either restricted or unrestricted mixed models. Problem 2 (a) First, the respective means for the 8 level combinations are listed in the following table A B C Mean

More information

Assignment 9 Answer Keys

Assignment 9 Answer Keys Assignment 9 Answer Keys Problem 1 (a) First, the respective means for the 8 level combinations are listed in the following table A B C Mean 26.00 + 34.67 + 39.67 + + 49.33 + 42.33 + + 37.67 + + 54.67

More information

Lecture notes on Regression & SAS example demonstration

Lecture notes on Regression & SAS example demonstration Regression & Correlation (p. 215) When two variables are measured on a single experimental unit, the resulting data are called bivariate data. You can describe each variable individually, and you can also

More information

Linear Combinations. Comparison of treatment means. Bruce A Craig. Department of Statistics Purdue University. STAT 514 Topic 6 1

Linear Combinations. Comparison of treatment means. Bruce A Craig. Department of Statistics Purdue University. STAT 514 Topic 6 1 Linear Combinations Comparison of treatment means Bruce A Craig Department of Statistics Purdue University STAT 514 Topic 6 1 Linear Combinations of Means y ij = µ + τ i + ǫ ij = µ i + ǫ ij Often study

More information

EXST7015: Estimating tree weights from other morphometric variables Raw data print

EXST7015: Estimating tree weights from other morphometric variables Raw data print Simple Linear Regression SAS example Page 1 1 ********************************************; 2 *** Data from Freund & Wilson (1993) ***; 3 *** TABLE 8.24 : ESTIMATING TREE WEIGHTS ***; 4 ********************************************;

More information

Booklet of Code and Output for STAC32 Final Exam

Booklet of Code and Output for STAC32 Final Exam Booklet of Code and Output for STAC32 Final Exam December 8, 2014 List of Figures in this document by page: List of Figures 1 Popcorn data............................. 2 2 MDs by city, with normal quantile

More information

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is

More information

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013 Topic 19 - Inference - Fall 2013 Outline Inference for Means Differences in cell means Contrasts Multiplicity Topic 19 2 The Cell Means Model Expressed numerically Y ij = µ i + ε ij where µ i is the theoretical

More information

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3 Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3 Page 1 Tensile Strength Experiment Investigate the tensile strength of a new synthetic fiber. The factor is the weight percent

More information

Lec 1: An Introduction to ANOVA

Lec 1: An Introduction to ANOVA Ying Li Stockholm University October 31, 2011 Three end-aisle displays Which is the best? Design of the Experiment Identify the stores of the similar size and type. The displays are randomly assigned to

More information

Simple, Marginal, and Interaction Effects in General Linear Models

Simple, Marginal, and Interaction Effects in General Linear Models Simple, Marginal, and Interaction Effects in General Linear Models PRE 905: Multivariate Analysis Lecture 3 Today s Class Centering and Coding Predictors Interpreting Parameters in the Model for the Means

More information

Topic 28: Unequal Replication in Two-Way ANOVA

Topic 28: Unequal Replication in Two-Way ANOVA Topic 28: Unequal Replication in Two-Way ANOVA Outline Two-way ANOVA with unequal numbers of observations in the cells Data and model Regression approach Parameter estimates Previous analyses with constant

More information

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3 Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3 Fall, 2013 Page 1 Tensile Strength Experiment Investigate the tensile strength of a new synthetic fiber. The factor is the

More information

Pairwise multiple comparisons are easy to compute using SAS Proc GLM. The basic statement is:

Pairwise multiple comparisons are easy to compute using SAS Proc GLM. The basic statement is: Pairwise Multiple Comparisons in SAS Pairwise multiple comparisons are easy to compute using SAS Proc GLM. The basic statement is: means effects / options Here, means is the statement initiat, effects

More information

Comparison of a Population Means

Comparison of a Population Means Analysis of Variance Interested in comparing Several treatments Several levels of one treatment Comparison of a Population Means Could do numerous two-sample t-tests but... ANOVA provides method of joint

More information

PLS205 Lab 6 February 13, Laboratory Topic 9

PLS205 Lab 6 February 13, Laboratory Topic 9 PLS205 Lab 6 February 13, 2014 Laboratory Topic 9 A word about factorials Specifying interactions among factorial effects in SAS The relationship between factors and treatment Interpreting results of an

More information

Single Factor Experiments

Single Factor Experiments Single Factor Experiments Bruce A Craig Department of Statistics Purdue University STAT 514 Topic 4 1 Analysis of Variance Suppose you are interested in comparing either a different treatments a levels

More information

BE640 Intermediate Biostatistics 2. Regression and Correlation. Simple Linear Regression Software: SAS. Emergency Calls to the New York Auto Club

BE640 Intermediate Biostatistics 2. Regression and Correlation. Simple Linear Regression Software: SAS. Emergency Calls to the New York Auto Club BE640 Intermediate Biostatistics 2. Regression and Correlation Simple Linear Regression Software: SAS Emergency Calls to the New York Auto Club Source: Chatterjee, S; Handcock MS and Simonoff JS A Casebook

More information

9 One-Way Analysis of Variance

9 One-Way Analysis of Variance 9 One-Way Analysis of Variance SW Chapter 11 - all sections except 6. The one-way analysis of variance (ANOVA) is a generalization of the two sample t test to k 2 groups. Assume that the populations of

More information

Least Squares Analyses of Variance and Covariance

Least Squares Analyses of Variance and Covariance Least Squares Analyses of Variance and Covariance One-Way ANOVA Read Sections 1 and 2 in Chapter 16 of Howell. Run the program ANOVA1- LS.sas, which can be found on my SAS programs page. The data here

More information

PLS205!! Lab 9!! March 6, Topic 13: Covariance Analysis

PLS205!! Lab 9!! March 6, Topic 13: Covariance Analysis PLS205!! Lab 9!! March 6, 2014 Topic 13: Covariance Analysis Covariable as a tool for increasing precision Carrying out a full ANCOVA Testing ANOVA assumptions Happiness! Covariable as a Tool for Increasing

More information

Week 7.1--IES 612-STA STA doc

Week 7.1--IES 612-STA STA doc Week 7.1--IES 612-STA 4-573-STA 4-576.doc IES 612/STA 4-576 Winter 2009 ANOVA MODELS model adequacy aka RESIDUAL ANALYSIS Numeric data samples from t populations obtained Assume Y ij ~ independent N(μ

More information

COMPLETELY RANDOM DESIGN (CRD) -Design can be used when experimental units are essentially homogeneous.

COMPLETELY RANDOM DESIGN (CRD) -Design can be used when experimental units are essentially homogeneous. COMPLETELY RANDOM DESIGN (CRD) Description of the Design -Simplest design to use. -Design can be used when experimental units are essentially homogeneous. -Because of the homogeneity requirement, it may

More information

STAT 115:Experimental Designs

STAT 115:Experimental Designs STAT 115:Experimental Designs Josefina V. Almeda 2013 Multisample inference: Analysis of Variance 1 Learning Objectives 1. Describe Analysis of Variance (ANOVA) 2. Explain the Rationale of ANOVA 3. Compare

More information

Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS

Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS Stat 135, Fall 2006 A. Adhikari HOMEWORK 10 SOLUTIONS 1a) The model is cw i = β 0 + β 1 el i + ɛ i, where cw i is the weight of the ith chick, el i the length of the egg from which it hatched, and ɛ i

More information

Topic 20: Single Factor Analysis of Variance

Topic 20: Single Factor Analysis of Variance Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory

More information

The entire data set consists of n = 32 widgets, 8 of which were made from each of q = 4 different materials.

The entire data set consists of n = 32 widgets, 8 of which were made from each of q = 4 different materials. One-Way ANOVA Summary The One-Way ANOVA procedure is designed to construct a statistical model describing the impact of a single categorical factor X on a dependent variable Y. Tests are run to determine

More information

Regression Analysis. Table Relationship between muscle contractile force (mj) and stimulus intensity (mv).

Regression Analysis. Table Relationship between muscle contractile force (mj) and stimulus intensity (mv). Regression Analysis Two variables may be related in such a way that the magnitude of one, the dependent variable, is assumed to be a function of the magnitude of the second, the independent variable; however,

More information

Analysis of variance and regression. April 17, Contents Comparison of several groups One-way ANOVA. Two-way ANOVA Interaction Model checking

Analysis of variance and regression. April 17, Contents Comparison of several groups One-way ANOVA. Two-way ANOVA Interaction Model checking Analysis of variance and regression Contents Comparison of several groups One-way ANOVA April 7, 008 Two-way ANOVA Interaction Model checking ANOVA, April 008 Comparison of or more groups Julie Lyng Forman,

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes)

Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes) Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes) Asheber Abebe Discrete and Statistical Sciences Auburn University Contents 1 Completely Randomized Design

More information

Descriptions of post-hoc tests

Descriptions of post-hoc tests Experimental Statistics II Page 81 Descriptions of post-hoc tests Post-hoc or Post-ANOVA tests! Once you have found out some treatment(s) are different, how do you determine which one(s) are different?

More information

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. FINAL EXAM ** Two different ways to submit your answer sheet (i) Use MS-Word and place it in a drop-box. (ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box. Deadline: December

More information

One-Way Analysis of Variance (ANOVA) There are two key differences regarding the explanatory variable X.

One-Way Analysis of Variance (ANOVA) There are two key differences regarding the explanatory variable X. One-Way Analysis of Variance (ANOVA) Also called single factor ANOVA. The response variable Y is continuous (same as in regression). There are two key differences regarding the explanatory variable X.

More information

Odor attraction CRD Page 1

Odor attraction CRD Page 1 Odor attraction CRD Page 1 dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" ---- + ---+= -/\*"; ODS LISTING; *** Table 23.2 ********************************************;

More information

A Re-Introduction to General Linear Models (GLM)

A Re-Introduction to General Linear Models (GLM) A Re-Introduction to General Linear Models (GLM) Today s Class: You do know the GLM Estimation (where the numbers in the output come from): From least squares to restricted maximum likelihood (REML) Reviewing

More information

SAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model

SAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model Topic 23 - Unequal Replication Data Model Outline - Fall 2013 Parameter Estimates Inference Topic 23 2 Example Page 954 Data for Two Factor ANOVA Y is the response variable Factor A has levels i = 1, 2,...,

More information

11 Factors, ANOVA, and Regression: SAS versus Splus

11 Factors, ANOVA, and Regression: SAS versus Splus Adapted from P. Smith, and expanded 11 Factors, ANOVA, and Regression: SAS versus Splus Factors. A factor is a variable with finitely many values or levels which is treated as a predictor within regression-type

More information

Analysis of Variance

Analysis of Variance Statistical Techniques II EXST7015 Analysis of Variance 15a_ANOVA_Introduction 1 Design The simplest model for Analysis of Variance (ANOVA) is the CRD, the Completely Randomized Design This model is also

More information

Simple, Marginal, and Interaction Effects in General Linear Models: Part 1

Simple, Marginal, and Interaction Effects in General Linear Models: Part 1 Simple, Marginal, and Interaction Effects in General Linear Models: Part 1 PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 2: August 24, 2012 PSYC 943: Lecture 2 Today s Class Centering and

More information

using the beginning of all regression models

using the beginning of all regression models Estimating using the beginning of all regression models 3 examples Note about shorthand Cavendish's 29 measurements of the earth's density Heights (inches) of 14 11 year-old males from Alberta study Half-life

More information

Chap The McGraw-Hill Companies, Inc. All rights reserved.

Chap The McGraw-Hill Companies, Inc. All rights reserved. 11 pter11 Chap Analysis of Variance Overview of ANOVA Multiple Comparisons Tests for Homogeneity of Variances Two-Factor ANOVA Without Replication General Linear Model Experimental Design: An Overview

More information

dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" = -/\<>*"; ODS LISTING;

dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR= = -/\<>*; ODS LISTING; dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" ---- + ---+= -/\*"; ODS LISTING; *** Table 23.2 ********************************************; *** Moore, David

More information

N J SS W /df W N - 1

N J SS W /df W N - 1 One-Way ANOVA Source Table ANOVA MODEL: ij = µ* + α j + ε ij H 0 : µ = µ =... = µ j or H 0 : Σα j = 0 Source Sum of Squares df Mean Squares F J Between Groups nj( j * ) J - SS B /(J ) MS B /MS W = ( N

More information

Orthogonal contrasts and multiple comparisons

Orthogonal contrasts and multiple comparisons BIOL 933 Lab 4 Fall 2017 Orthogonal contrasts Class comparisons in R Trend analysis in R Multiple mean comparisons Orthogonal contrasts and multiple comparisons Orthogonal contrasts Planned, single degree-of-freedom

More information

Descriptive Statistics

Descriptive Statistics *following creates z scores for the ydacl statedp traitdp and rads vars. *specifically adding the /SAVE subcommand to descriptives will create z. *scores for whatever variables are in the command. DESCRIPTIVES

More information

Analysis of variance. April 16, Contents Comparison of several groups

Analysis of variance. April 16, Contents Comparison of several groups Contents Comparison of several groups Analysis of variance April 16, 2009 One-way ANOVA Two-way ANOVA Interaction Model checking Acknowledgement for use of presentation Julie Lyng Forman, Dept. of Biostatistics

More information

UNIVERSITY EXAMINATIONS NJORO CAMPUS SECOND SEMESTER 2011/2012

UNIVERSITY EXAMINATIONS NJORO CAMPUS SECOND SEMESTER 2011/2012 UNIVERSITY EXAMINATIONS NJORO CAMPUS SECOND SEMESTER 2011/2012 THIRD YEAR EXAMINATION FOR THE AWARD BACHELOR OF SCIENCE IN AGRICULTURE AND BACHELOR OF SCIENCE IN FOOD TECHNOLOGY AGRO 391 AGRICULTURAL EXPERIMENTATION

More information

Statistical Techniques II EXST7015 Simple Linear Regression
