MAT3378 (Winter 2016)


Assignment 2 - SOLUTIONS

Total number of points for Assignment 2: 12

The following questions will be marked: Q1, Q2, Q4

Q1. (4 points) Assume that $Z_1, \ldots, Z_n$ are i.i.d. normal random variables with mean $\mu$ and variance $\sigma^2$. Let $\bar{Z}$ be the sample mean and $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Z_i - \bar{Z})^2$ be the sample variance. Define the following random variables:
(1) $V_1 = \frac{\bar{Z} - \mu}{\sigma/\sqrt{n}}$;
(2) $V_2 = \frac{\bar{Z} - \mu}{S/\sqrt{n}}$;
(3) $V_3 = (n-1)S^2/\sigma^2$;
(4) $V_4 = (Z_1 - \mu)^2/\sigma^2$.
For $n = 15$ use R to calculate the following probabilities:
(1) $P(V_1 < -1.456)$;
(2) $P(V_2 > 2.155)$;
(3) $P(V_3 > 11.98)$;
(4) $P(V_4 > 1.801)$.
Hint: This question is about understanding the distributions of different statistics. You will need the following R commands: pnorm; pt; pchisq. Type help(pchisq) to see how these functions are used.

Solution to Q1:
(1) $V_1$ is standard normal, hence the probability is pnorm(-1.456,0,1).
(2) $V_2$ has a t-distribution with $n - 1 = 14$ degrees of freedom, hence the probability is 1-pt(2.155,14).
(3) $V_3$ has a $\chi^2$ distribution with $n - 1 = 14$ degrees of freedom (one degree of freedom is lost because $\bar{Z}$ replaces $\mu$ in $S^2$), hence the probability is 1-pchisq(11.98,14).
(4) $V_4$ is the square of the standard normal variable $(Z_1 - \mu)/\sigma$, so it has a $\chi^2$ distribution with 1 degree of freedom, hence the probability is 1-pchisq(1.801,1).

Marking scheme for Q1: 1 point for each correct answer. Total: 4 points.

Q2. (2 points) Consider three independent populations with means $\mu_1$, $\mu_2$ and $\mu_3$, respectively. Suppose that we would like to compare the average of $\mu_1$ and $\mu_2$ with $\mu_3$. To do so, we would like to estimate $L = (\mu_1 + \mu_2)/2 - \mu_3$. Suppose that we have random samples from each of these populations and that the respective sample means are $\bar{Y}_1$, $\bar{Y}_2$ and $\bar{Y}_3$. Consider the following estimator for $L$: $\hat{L} = (\bar{Y}_1 + \bar{Y}_2)/2 - \bar{Y}_3$.
(1) Recall that an estimator $\hat{\theta}$ of a parameter $\theta$ is unbiased whenever $E[\hat{\theta}] = \theta$. Is $\hat{L}$ an unbiased estimator for $L$?
(2) Suppose that each population has the same population standard deviation $\sigma = 5$. Furthermore, each sample is of size 10. Compute the variance of the estimator $\hat{L}$.
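A fact that is useful for part (2), and again for the contrast in Q4, is the variance of a linear combination of independent sample means. The display below is a sketch of that rule; the coefficients $c_i$, the number of populations $r$, and the per-population variances $\sigma_i^2$ and sample sizes $n_i$ are notation introduced here for illustration and do not appear in the assignment. For $\hat{L}$ in Q2 the coefficients would be $c = (1/2, 1/2, -1)$.

```latex
\operatorname{Var}\Big[\sum_{i=1}^{r} c_i \bar{Y}_i\Big]
  = \sum_{i=1}^{r} c_i^2 \operatorname{Var}[\bar{Y}_i]
  = \sum_{i=1}^{r} c_i^2 \, \frac{\sigma_i^2}{n_i}
```

The first equality uses independence of the samples (all covariance terms vanish); the second uses $\operatorname{Var}[\bar{Y}_i] = \sigma_i^2/n_i$ for a random sample of size $n_i$.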

Solution to Q2:
(1) We have $E[\hat{L}] = E[(\bar{Y}_1 + \bar{Y}_2)/2 - \bar{Y}_3] = (\mu_1 + \mu_2)/2 - \mu_3 = L$, hence $\hat{L}$ is an unbiased estimator of $L$.
(2) Since the samples are independent, we have
$$\mathrm{Var}[\hat{L}] = \mathrm{Var}[(\bar{Y}_1 + \bar{Y}_2)/2] + \mathrm{Var}[\bar{Y}_3] = \frac{1}{4}\left(\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}\right) + \frac{\sigma^2}{n_3} = \frac{1}{4}\left(\frac{25}{10} + \frac{25}{10}\right) + \frac{25}{10} = 3.75.$$

Marking scheme for Q2: 1 point for each correct answer. Total: 2 points.

Q3. (R question) Consider the data in the text file cowsdata.txt available on the course webpage. It is a tab-delimited file. There are two columns that represent the protein content in the cow's milk and its diet, respectively. The categories for the diet are: 1 = barley, 2 = barley and lupine, and 3 = lupine. The data contain 25 cows on the barley diet and 27 cows on each of the other two diets. The investigators want to analyze the effects of the three diets on the protein content of cow's milk.
(1) Produce side-by-side boxplots (that is, 3 boxplots on the same graph) to compare the protein content of each diet.
(2) Run the ANOVA test. What is the conclusion?
(3) Calculate 95% confidence intervals for factor level means and produce side-by-side plots of confidence intervals.
Hint: When typing Mydata<-read.table(file.choose(),header=TRUE) you will get a data.frame in R directly.

Solution to Q3:
(1) Type

    Mydata<-read.table(file.choose(),header=TRUE);
    names(Mydata);
    y<-Mydata$protein;
    x<-factor(Mydata$diet);
    boxplot(y~x)

It seems that all diets yield similar mean protein content.
(2) By typing summary(aov(y~x)) we get

    [ANOVA table output: rows x and Residuals; columns Df, Sum Sq, Mean Sq, F value, Pr(>F)]

The hypothesis of the equality of means is not rejected. There is no evidence that the diet influences the mean protein content.
(3) Type

    MSE=0.1627;
    means=tapply(y,x,mean);
    n=tapply(y,x,length);
    df=sum(n)-3;
    alpha=0.05;
    l1=means[1]-qt(1-alpha/2,df)*sqrt(MSE/n[1]);
    u1=means[1]+qt(1-alpha/2,df)*sqrt(MSE/n[1]);
    l2=means[2]-qt(1-alpha/2,df)*sqrt(MSE/n[2]);
    u2=means[2]+qt(1-alpha/2,df)*sqrt(MSE/n[2]);
    l3=means[3]-qt(1-alpha/2,df)*sqrt(MSE/n[3]);
    u3=means[3]+qt(1-alpha/2,df)*sqrt(MSE/n[3]);
    plot(c(1,1),c(l1,u1),type="o",xlim=c(1,4),
         ylim=c(min(l1,l2,l3),max(u1,u2,u3)),
         xlab="Factor Levels",ylab="CI")
    lines(c(2,2),c(l2,u2),type="o")
    lines(c(3,3),c(l3,u3),type="o")
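The three pairs of limits in part (3) repeat one formula, mean plus or minus a t quantile times sqrt(MSE/n). A minimal sketch of the same computation for all factor levels at once is given below. The helper name `level_cis` is hypothetical and not part of the course code, and the sketch computes the MSE from the data instead of using a typed-in rounded value.

```r
# Sketch: t-based confidence intervals for all factor level means at once.
# `level_cis` is a hypothetical helper, not part of the course code.
level_cis <- function(y, x, alpha = 0.05) {
  x <- factor(x)
  means <- tapply(y, x, mean)            # factor level means
  n <- tapply(y, x, length)              # factor level sample sizes
  df <- length(y) - nlevels(x)           # error degrees of freedom
  mse <- sum((y - ave(y, x))^2) / df     # pooled error variance (MSE)
  half <- qt(1 - alpha / 2, df) * sqrt(mse / n)  # half-width per level
  data.frame(
    level = names(means),
    mean  = as.numeric(means),
    lower = as.numeric(means - half),
    upper = as.numeric(means + half)
  )
}
```

With the Q3 variables, `level_cis(y, x)` would return one row per diet, and `level_cis(y, x, alpha = 0.01)` would give 99% intervals.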

[Figure: confidence intervals for the three factor level means; x-axis: Factor Levels, y-axis: CI]

Again, the confidence intervals overlap, which is another indication that the mean protein level does not differ between diets.

Marking scheme for Q3: This question will not be marked.

Q4. (6 points) In Q3, Assignment 1, we performed ANOVA for Cash offers. The hypothesis of equality of factor level means was rejected. We proceed with the analysis of factor level means.
(1) Estimate the mean cash offer for young owners: use a 99 percent confidence interval. You can solve this part using R or by hand.
(2) Construct a 99 percent confidence interval for $\mu_3 - \mu_1$. Interpret your interval estimate. You can solve this part using R or by hand.
(3) Test whether or not $\mu_2 - \mu_1 = \mu_3 - \mu_2$; control the $\alpha$ risk at $\alpha = 0.01$. You can solve this part using R or by hand. Hint: contrast. Note that you will need to write the code on your own, since no code is provided on the course webpage.
(4) Obtain confidence intervals for all pairwise comparisons between the treatment means; use the Tukey procedure and a 90 percent family confidence coefficient. Interpret your results and provide a graphic summary by preparing a paired comparison plot. Use R.
(5) Perform the Bonferroni procedure. Interpret your results. Use R.

Solution to Q4: First, we repeat the code from Assignment 1 that creates the data.

    # Getting data
    Young=c(23, 25, 21, 22, 21, 22, 20, 23, 19, 22, 19, 21)
    Middle=c(28, 27, 27, 29, 26, 29, 27, 30, 28, 27, 26, 29)
    Elderly=c(23, 20, 25, 21, 22, 23, 21, 20, 19, 20, 22, 21)
    # Creating data frame
    FactorLevels=c(1,2,3)
    n1=length(Young); n2=length(Middle); n3=length(Elderly);
    MyData=data.frame(
      Values=c(Young,Middle,Elderly),

      Treatment=c(rep(1,n1),rep(2,n2),rep(3,n3)));
    y=MyData$Values;
    x=factor(MyData$Treatment);

Recall also that running ANOVA in Assignment 1 we got MSE=2.49. We also calculate the means and sample sizes first. The error degrees of freedom are the total sample size minus the number of factor levels, that is, sum(n)-3.

    means=tapply(y,x,mean);
    n=tapply(y,x,length);
    df=sum(n)-3;

(1) Typing

    MSE=2.49;
    alpha=0.01;
    l1=means[1]-qt(1-alpha/2,df)*sqrt(MSE/n[1]);
    u1=means[1]+qt(1-alpha/2,df)*sqrt(MSE/n[1]);
    print(c(l1,u1))

prints the limits of the 99% confidence interval for the mean cash offer for young owners.

(2) Typing

    MSE=2.49;
    alpha=0.01;
    l31=means[3]-means[1]-qt(1-alpha/2,df)*sqrt(MSE/n[1]+MSE/n[3]);
    u31=means[3]-means[1]+qt(1-alpha/2,df)*sqrt(MSE/n[1]+MSE/n[3]);
    print(c(l31,u31))

prints the limits of the 99% confidence interval for $\mu_3 - \mu_1$. Since the confidence interval contains 0, the hypothesis $H_0: \mu_1 = \mu_3$ is not rejected. There is no evidence that the mean cash offers for Young and Elderly differ.

(3) The contrast is
$$L = \mu_2 - \mu_1 - (\mu_3 - \mu_2) = -\mu_1 + 2\mu_2 - \mu_3.$$
We estimate the contrast by
$$\hat{L} = -\bar{Y}_1 + 2\bar{Y}_2 - \bar{Y}_3.$$
Note that the variance of $\hat{L}$ is
$$\mathrm{Var}[\hat{L}] = \sigma^2/n_1 + 4\sigma^2/n_2 + \sigma^2/n_3 = \sigma^2 (1/n_1 + 4/n_2 + 1/n_3).$$
Hence the 99% CI for the contrast is obtained by:

    MSE=2.49;
    alpha=0.01;
    lcontrast=-means[1]+2*means[2]-means[3]-qt(1-alpha/2,df)*sqrt(MSE/n[1]+4*MSE/n[2]+MSE/n[3]);
    ucontrast=-means[1]+2*means[2]-means[3]+qt(1-alpha/2,df)*sqrt(MSE/n[1]+4*MSE/n[2]+MSE/n[3]);
    print(c(lcontrast,ucontrast))

The hypothesis $H_0: \mu_2 - \mu_1 = \mu_3 - \mu_2$ is rejected since 0 is not in the confidence interval for the contrast.

Alternative solution, by computing the test statistic and the two-sided p-value:

    t.stat=(-means[1]+2*means[2]-means[3])/sqrt(MSE/n[1]+4*MSE/n[2]+MSE/n[3]);
    2*(1-pt(abs(t.stat),df))

The p-value is of order 1e-13, hence reject $H_0$ for $\alpha = 0.01$. Interpretation: the change in the mean cash offers from Young to Middle is significantly larger than the change in the mean cash offers from Middle to Elderly.

(4) Type

    results<-aov(y~x);
    TukeyHSD(results);

    Tukey multiple comparisons of means
    95% family-wise confidence level
    Fit: aov(formula = y ~ x)
    $x

    [Tukey table output: columns diff, lwr, upr, p adj]

There is a significant difference between Young and Middle as well as between Elderly and Middle. There is no significant difference between Young and Elderly.

(5) Type

    pairwise.t.test(y,x,p.adjust="bonferroni")

    Pairwise comparisons using t tests with pooled SD
    data: y and x
    [table of Bonferroni-adjusted p-values]
    P value adjustment method: bonferroni

The Bonferroni procedure confirms the findings of the Tukey procedure.

Marking scheme for Q4: 1 point for the correct CI in 1); 1 point for the correct CI in 2); 1 point for the correct variance in 3); 1 point for the correct conclusion in 3); 1 point for the correct conclusion in 4); 1 point for the correct conclusion in 5). Total: 6 points.

Q5. The data set cancer.txt contains breast cancer rates in different countries. There are 7 factor levels that represent different continents (last column). Perform the ANOVA test. If the test rejects $H_0$, analyse the factor level means using the tools in R-2.html. The data set can be loaded in R by Mydata<-read.table(file.choose(),header=TRUE). Try to load both cancer.txt and cancer-bad.txt. Think about why there are problems with importing the second one.

Solution to Q5: Type

    y<-Mydata$breastcancer;
    x<-factor(Mydata$continent);
    results=aov(y~x);
    summary(results)

    [ANOVA table output: rows x and Residuals; columns Df, Sum Sq, Mean Sq, F value, Pr(>F); the row for x reports Pr(>F) <2e-16 ***]

The ANOVA test rejects $H_0$: there are continents whose means are significantly different. The Tukey and Bonferroni procedures yield, respectively,

    TukeyHSD(results);
    plot(TukeyHSD(results));
    pairwise.t.test(y,x,p.adjust="bonferroni")

    Tukey multiple comparisons of means
    95% family-wise confidence level

    [Tukey table output: columns diff, lwr, upr, p adj; rows AS-AF, EE-AF, LATAM-AF, NORAM-AF, OC-AF, WE-AF, EE-AS, LATAM-AS, NORAM-AS, OC-AS, WE-AS, LATAM-EE, NORAM-EE, OC-EE, WE-EE, NORAM-LATAM, OC-LATAM, WE-LATAM, OC-NORAM, WE-NORAM, WE-OC]

    Pairwise comparisons using t tests with pooled SD
    data: y and x
    [table of Bonferroni-adjusted p-values; columns AF, AS, EE, LATAM, NORAM, OC; rows AS, EE, LATAM, NORAM, OC, WE]
    P value adjustment method: bonferroni

[Figure: Tukey paired comparison plot, 95% family-wise confidence level; x-axis: Differences in mean levels of x]

Note that the mean breast cancer rate in ASIA and AFRICA is significantly lower than in the other continents.
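The two-stage pattern used in Q4 and Q5 (overall F-test first, multiple comparisons only if $H_0$ is rejected) can be sketched as one small function. The helper name `posthoc` and its arguments are hypothetical and not part of the course code; the example reuses the cash offer data from Q4. Note that TukeyHSD uses a 95% family confidence level by default, so the 90 percent level requested in Q4(4) has to be passed explicitly through `conf.level`.

```r
# Sketch: overall ANOVA F-test, followed by Tukey and Bonferroni
# comparisons only when the F-test rejects. `posthoc` is a hypothetical helper.
posthoc <- function(y, x, alpha = 0.05, conf.level = 0.95) {
  x <- factor(x)
  fit <- aov(y ~ x)
  p_value <- summary(fit)[[1]][["Pr(>F)"]][1]   # p-value of the F-test
  if (p_value >= alpha) {
    # H0 not rejected: no pairwise comparisons are carried out
    return(list(p_value = p_value, tukey = NULL, bonferroni = NULL))
  }
  list(
    p_value = p_value,
    tukey = TukeyHSD(fit, conf.level = conf.level),
    bonferroni = pairwise.t.test(y, x, p.adjust.method = "bonferroni")
  )
}

# Cash offer data from Q4
Young   <- c(23, 25, 21, 22, 21, 22, 20, 23, 19, 22, 19, 21)
Middle  <- c(28, 27, 27, 29, 26, 29, 27, 30, 28, 27, 26, 29)
Elderly <- c(23, 20, 25, 21, 22, 23, 21, 20, 19, 20, 22, 21)
y <- c(Young, Middle, Elderly)
x <- factor(rep(c("Young", "Middle", "Elderly"), each = 12),
            levels = c("Young", "Middle", "Elderly"))

res <- posthoc(y, x, alpha = 0.01, conf.level = 0.90)
print(res$tukey)        # 90% family-wise Tukey intervals
print(res$bonferroni)   # Bonferroni-adjusted pairwise p-values
```

Calling `plot(res$tukey)` would draw the corresponding paired comparison plot.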

Math 141. Lecture 16: More than one group. Albyn Jones 1. jones/courses/ Library 304. Albyn Jones Math 141

Math 141. Lecture 16: More than one group. Albyn Jones 1.   jones/courses/ Library 304. Albyn Jones Math 141 Math 141 Lecture 16: More than one group Albyn Jones 1 1 Library 304 jones/courses/141 Comparing two population means If two distributions have the same shape and spread,

More information

ANOVA: Analysis of Variation

ANOVA: Analysis of Variation ANOVA: Analysis of Variation The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative variables depend on which group (given by categorical

More information

Statistics for EES Factorial analysis of variance

Statistics for EES Factorial analysis of variance Statistics for EES Factorial analysis of variance Dirk Metzler June 12, 2015 Contents 1 ANOVA and F -Test 1 2 Pairwise comparisons and multiple testing 6 3 Non-parametric: The Kruskal-Wallis Test 9 1 ANOVA

More information

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative

More information

Statistics - Lecture 05

Statistics - Lecture 05 Statistics - Lecture 05 Nicodème Paul Faculté de médecine, Université de Strasbourg 1/47 Descriptive statistics and probability Data description and graphical

More information

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College An example ANOVA situation Example (Treating Blisters) Subjects: 25 patients with blisters Treatments: Treatment A, Treatment

More information

Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model

Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model Biostatistics 250 ANOVA Multiple Comparisons 1 ORIGIN 1 Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model When the omnibus F-Test for ANOVA rejects the null hypothesis that

More information

Two (or more) factors, say A and B, with a and b levels, respectively.

Two (or more) factors, say A and B, with a and b levels, respectively. Factorial Designs ST 516 Two (or more) factors, say A and B, with a and b levels, respectively. A factorial design uses all ab combinations of levels of A and B, for a total of ab treatments. When both

More information

STAT22200 Spring 2014 Chapter 5

STAT22200 Spring 2014 Chapter 5 STAT22200 Spring 2014 Chapter 5 Yibi Huang April 29, 2014 Chapter 5 Multiple Comparisons Chapter 5-1 Chapter 5 Multiple Comparisons Note the t-tests and C.I. s are constructed assuming we only do one test,

More information

Analysis of Variance II Bios 662

Analysis of Variance II Bios 662 Analysis of Variance II Bios 662 Michael G. Hudgens, Ph.D. mhudgens 2008-10-24 17:21 BIOS 662 1 ANOVA II Outline Multiple Comparisons Scheffe Tukey Bonferroni

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D.

More information


FACTORIAL DESIGNS and NESTED DESIGNS Experimental Design and Statistical Methods Workshop FACTORIAL DESIGNS and NESTED DESIGNS Jesús Piedrafita Arilla Departament de Ciència Animal i dels Aliments Items Factorial

More information

More about Single Factor Experiments

More about Single Factor Experiments More about Single Factor Experiments 1 2 3 0 / 23 1 2 3 1 / 23 Parameter estimation Effect Model (1): Y ij = µ + A i + ɛ ij, Ji A i = 0 Estimation: µ + A i = y i. ˆµ = y..  i = y i. y.. Effect Modell

More information

Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes

Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes Analysis of Variance (ANOVA) Cancer Research UK 10 th of May 2018 D.-L. Couturier / R. Nicholls / M. Fernandes 2 Quick review: Normal distribution Y N(µ, σ 2 ), f Y (y) = 1 2πσ 2 (y µ)2 e 2σ 2 E[Y ] =

More information

Table 1: Fish Biomass data set on 26 streams

Table 1: Fish Biomass data set on 26 streams Math 221: Multiple Regression S. K. Hyde Chapter 27 (Moore, 5th Ed.) The following data set contains observations on the fish biomass of 26 streams. The potential regressors from which we wish to explain

More information

Lecture 6 Multiple Linear Regression, cont.

Lecture 6 Multiple Linear Regression, cont. Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression

More information

Booklet of Code and Output for STAC32 Final Exam

Booklet of Code and Output for STAC32 Final Exam Booklet of Code and Output for STAC32 Final Exam December 7, 2017 Figure captions are below the Figures they refer to. LowCalorie LowFat LowCarbo Control 8 2 3 2 9 4 5 2 6 3 4-1 7 5 2 0 3 1 3 3 Figure

More information

Review: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses.

Review: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses. 1 Review: Let X 1, X,..., X n denote n independent random variables sampled from some distribution might not be normal!) with mean µ) and standard deviation σ). Then X µ σ n In other words, X is approximately

More information

Straw Example: Randomized Block ANOVA

Straw Example: Randomized Block ANOVA Math 3080 1. Treibergs Straw Example: Randomized Block ANOVA Name: Example Jan. 23, 2014 Today s example was motivated from problem 13.11.9 of Walpole, Myers, Myers and Ye, Probability and Statistics for

More information

One-Way Analysis of Variance: ANOVA

One-Way Analysis of Variance: ANOVA One-Way Analysis of Variance: ANOVA Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background to ANOVA Recall from

More information

Biostatistics for physicists fall Correlation Linear regression Analysis of variance

Biostatistics for physicists fall Correlation Linear regression Analysis of variance Biostatistics for physicists fall 2015 Correlation Linear regression Analysis of variance Correlation Example: Antibody level on 38 newborns and their mothers There is a positive correlation in antibody

More information

Week 14 Comparing k(> 2) Populations

Week 14 Comparing k(> 2) Populations Week 14 Comparing k(> 2) Populations Week 14 Objectives Methods associated with testing for the equality of k(> 2) means or proportions are presented. Post-testing concepts and analysis are introduced.

More information

Garvan Ins)tute Biosta)s)cal Workshop 16/7/2015. Tuan V. Nguyen. Garvan Ins)tute of Medical Research Sydney, Australia

Garvan Ins)tute Biosta)s)cal Workshop 16/7/2015. Tuan V. Nguyen. Garvan Ins)tute of Medical Research Sydney, Australia Garvan Ins)tute Biosta)s)cal Workshop 16/7/2015 Tuan V. Nguyen Tuan V. Nguyen Garvan Ins)tute of Medical Research Sydney, Australia Analysis of variance Between- group and within- group varia)on explained

More information

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL - MAY 2005 EXAMINATIONS STA 248 H1S. Duration - 3 hours. Aids Allowed: Calculator

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL - MAY 2005 EXAMINATIONS STA 248 H1S. Duration - 3 hours. Aids Allowed: Calculator UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL - MAY 2005 EXAMINATIONS STA 248 H1S Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 17 pages including

More information

ANOVA: Analysis of Variance

ANOVA: Analysis of Variance ANOVA: Analysis of Variance Marc H. Mehlman University of New Haven The analysis of variance is (not a mathematical theorem but) a simple method of arranging arithmetical facts so

More information

Lecture 5: Comparing Treatment Means Montgomery: Section 3-5

Lecture 5: Comparing Treatment Means Montgomery: Section 3-5 Lecture 5: Comparing Treatment Means Montgomery: Section 3-5 Page 1 Linear Combination of Means ANOVA: y ij = µ + τ i + ɛ ij = µ i + ɛ ij Linear combination: L = c 1 µ 1 + c 1 µ 2 +...+ c a µ a = a i=1

More information

Booklet of Code and Output for STAC32 Final Exam

Booklet of Code and Output for STAC32 Final Exam Booklet of Code and Output for STAC32 Final Exam December 12, 2015 List of Figures in this document by page: List of Figures 1 Time in days for students of different majors to find full-time employment..............................

More information

Part 5 Introduction to Factorials

Part 5 Introduction to Factorials More Statistics tutorial at Lecture notes on Experiment Design & Data Analysis Design of Engineering Experiments Part 5 Introduction to Factorials Text reference, Chapter 5 General

More information

1 Introduction to One-way ANOVA

1 Introduction to One-way ANOVA Review Source: Chapter 10 - Analysis of Variance (ANOVA). Example Data Source: Example problem 10.1 (dataset: exp10-1.mtw) Link to Data:

More information

Ch. 5 Two-way ANOVA: Fixed effect model Equal sample sizes

Ch. 5 Two-way ANOVA: Fixed effect model Equal sample sizes Ch. 5 Two-way ANOVA: Fixed effect model Equal sample sizes 1 Assumptions and models There are two factors, factors A and B, that are of interest. Factor A is studied at a levels, and factor B at b levels;

More information


COMPARISON OF MEANS OF SEVERAL RANDOM SAMPLES. ANOVA Experimental Design and Statistical Methods Workshop COMPARISON OF MEANS OF SEVERAL RANDOM SAMPLES. ANOVA Jesús Piedrafita Arilla Departament de Ciència Animal i dels Aliments

More information

Chapter 16: Understanding Relationships Numerical Data

Chapter 16: Understanding Relationships Numerical Data Chapter 16: Understanding Relationships Numerical Data These notes reflect material from our text, Statistics, Learning from Data, First Edition, by Roxy Peck, published by CENGAGE Learning, 2015. Linear

More information

Factorial and Unbalanced Analysis of Variance

Factorial and Unbalanced Analysis of Variance Factorial and Unbalanced Analysis of Variance Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota)

More information

ANOVA (Analysis of Variance) output RLS 11/20/2016

ANOVA (Analysis of Variance) output RLS 11/20/2016 ANOVA (Analysis of Variance) output RLS 11/20/2016 1. Analysis of Variance (ANOVA) The goal of ANOVA is to see if the variation in the data can explain enough to see if there are differences in the means.

More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

ANOVA: Comparing More Than Two Means

ANOVA: Comparing More Than Two Means ANOVA: Comparing More Than Two Means Chapter 11 Cathy Poliak, Ph.D. Office Fleming 11c Department of Mathematics University of Houston Lecture 25-3339 Cathy Poliak, Ph.D.

More information

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum T-test: means of Spock's judge versus all other judges 1 The TTEST Procedure Variable: pcwomen judge1 N Mean Std Dev Std Err Minimum Maximum OTHER 37 29.4919 7.4308 1.2216 16.5000 48.9000 SPOCKS 9 14.6222

More information

16.3 One-Way ANOVA: The Procedure

16.3 One-Way ANOVA: The Procedure 16.3 One-Way ANOVA: The Procedure Tom Lewis Fall Term 2009 Tom Lewis () 16.3 One-Way ANOVA: The Procedure Fall Term 2009 1 / 10 Outline 1 The background 2 Computing formulas 3 The ANOVA Identity 4 Tom

More information

MBA 605, Business Analytics Donald D. Conant, Ph.D. Master of Business Administration

MBA 605, Business Analytics Donald D. Conant, Ph.D. Master of Business Administration t-distribution Summary MBA 605, Business Analytics Donald D. Conant, Ph.D. Types of t-tests There are several types of t-test. In this course we discuss three. The single-sample t-test The two-sample t-test

More information

Analysis of Variance

Analysis of Variance Analysis of Variance Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 22 November 29, 2011 ANOVA 1 / 59 Cuckoo Birds Case Study Cuckoo birds have a behavior

More information

22s:152 Applied Linear Regression. Take random samples from each of m populations.

22s:152 Applied Linear Regression. Take random samples from each of m populations. 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

Stat 311: HW 9, due Th 5/27/10 in your Quiz Section

Stat 311: HW 9, due Th 5/27/10 in your Quiz Section Stat 311: HW 9, due Th 5/27/10 in your Quiz Section Fritz Scholz Your returned assignment should show your name and student ID number. It should be printed or written clearly. 1. The data set ReactionTime

More information

Lecture 5: ANOVA and Correlation

Lecture 5: ANOVA and Correlation Lecture 5: ANOVA and Correlation Ani Manichaikul 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions

More information

Lecture 11 Analysis of variance

Lecture 11 Analysis of variance Lecture 11 Analysis of variance Dr. Wim P. Krijnen Lecturer Statistics University of Groningen Faculty of Mathematics and Natural Sciences Johann Bernoulli Institute for Mathematics and Computer Science

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

STAT 401A - Statistical Methods for Research Workers

STAT 401A - Statistical Methods for Research Workers STAT 401A - Statistical Methods for Research Workers One-way ANOVA Jarad Niemi (Dr. J) Iowa State University last updated: October 10, 2014 Jarad Niemi (Iowa State) One-way ANOVA October 10, 2014 1 / 39

More information

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t = 2. The distribution of t values that would be obtained if a value of t were calculated for each sample mean for all possible random of a given size from a population _ t ratio: (X - µ hyp ) t s x The result

More information

3. Design Experiments and Variance Analysis

3. Design Experiments and Variance Analysis 3. Design Experiments and Variance Analysis Isabel M. Rodrigues 1 / 46 3.1. Completely randomized experiment. Experimentation allows an investigator to find out what happens to the output variables when

More information

Chapter 12. Analysis of variance

Chapter 12. Analysis of variance Serik Sagitov, Chalmers and GU, January 9, 016 Chapter 1. Analysis of variance Chapter 11: I = samples independent samples paired samples Chapter 1: I 3 samples of equal size J one-way layout two-way layout

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 Suhasini Subba Rao Motivations for the ANOVA We defined the F-distribution, this is mainly used in

More information

Design of Engineering Experiments Part 2 Basic Statistical Concepts Simple comparative experiments

Design of Engineering Experiments Part 2 Basic Statistical Concepts Simple comparative experiments Design of Engineering Experiments Part 2 Basic Statistical Concepts Simple comparative experiments The hypothesis testing framework The two-sample t-test Checking assumptions, validity Comparing more that

More information

I i=1 1 I(J 1) j=1 (Y ij Ȳi ) 2. j=1 (Y j Ȳ )2 ] = 2n( is the two-sample t-test statistic.

I i=1 1 I(J 1) j=1 (Y ij Ȳi ) 2. j=1 (Y j Ȳ )2 ] = 2n( is the two-sample t-test statistic. Serik Sagitov, Chalmers and GU, February, 08 Solutions chapter Matlab commands: x = data matrix boxplot(x) anova(x) anova(x) Problem.3 Consider one-way ANOVA test statistic For I = and = n, put F = MS

More information

Section 4.6 Simple Linear Regression

Section 4.6 Simple Linear Regression Section 4.6 Simple Linear Regression Objectives ˆ Basic philosophy of SLR and the regression assumptions ˆ Point & interval estimation of the model parameters, and how to make predictions ˆ Point and interval

More information

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013 Topic 19 - Inference - Fall 2013 Outline Inference for Means Differences in cell means Contrasts Multiplicity Topic 19 2 The Cell Means Model Expressed numerically Y ij = µ i + ε ij where µ i is the theoretical

More information

Regression. Marc H. Mehlman University of New Haven

Regression. Marc H. Mehlman University of New Haven Regression Marc H. Mehlman University of New Haven the statistician knows that in nature there never was a normal distribution, there never was a straight line, yet with normal and

More information



More information

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2 Fall, 2013 Page 1 Random Variable and Probability Distribution Discrete random variable Y : Finite possible values {y

More information

The Distribution of F

The Distribution of F The Distribution of F It can be shown that F = SS Treat/(t 1) SS E /(N t) F t 1,N t,λ a noncentral F-distribution with t 1 and N t degrees of freedom and noncentrality parameter λ = t i=1 n i(µ i µ) 2

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments. Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a

More information

STAT 5200 Handout #7a Contrasts & Post hoc Means Comparisons (Ch. 4-5)

STAT 5200 Handout #7a Contrasts & Post hoc Means Comparisons (Ch. 4-5) STAT 5200 Handout #7a Contrasts & Post hoc Means Comparisons Ch. 4-5) Recall CRD means and effects models: Y ij = µ i + ϵ ij = µ + α i + ϵ ij i = 1,..., g ; j = 1,..., n ; ϵ ij s iid N0, σ 2 ) If we reject

More information

Stat 427/527: Advanced Data Analysis I

Stat 427/527: Advanced Data Analysis I Stat 427/527: Advanced Data Analysis I Review of Chapters 1-4 Sep, 2017 1 / 18 Concepts you need to know/interpret Numerical summaries: measures of center (mean, median, mode) measures of spread (sample

More information

On Assumptions. On Assumptions

On Assumptions. On Assumptions On Assumptions An overview Normality Independence Detection Stem-and-leaf plot Study design Normal scores plot Correction Transformation More complex models Nonparametric procedure e.g. time series Robustness

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the

More information

Comparing Several Means

Comparing Several Means Comparing Several Means Some slides from R. Pruim STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy The Dating World of Swordtail Fish In some species of swordtail fish, males develop

More information

One-Way ANOVA Calculations: In-Class Exercise Psychology 311 Spring, 2013

1. You are planning an experiment that will involve 4 equally sized groups, including 3 experimental groups and a control. Each

ECO220Y Simple Regression: Testing the Slope

Readings: Chapter 18 (Sections 18.3-18.5). Winter 2012, Lecture 19. Simple Regression Model: $y_i = \beta_0 + \beta_1 x$

2 Hand-out 2. Dr. M. P. M. M. McLoughlin Revised 2018

Math 403 - P. & S. III - Dr. McLoughlin 3. Fundamentals 3.1. Preliminaries. Suppose we can produce a random sample of weights of 10 year-olds

ANOVA: Analysis of Variance

Marc H. Mehlman University of New Haven The analysis of variance is (not a mathematical theorem but) a simple method of arranging arithmetical facts so

Institute of Actuaries of India

Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the

STATISTICS 479 Exam II (100 points)

1. A SAS data set was created using the following input statement: Answer parts (a) to (e) below. input State $ City $ Pop199 Income Housing Electric; (a) Give the

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

INFERENCE FOR MULTIPLE LINEAR REGRESSION Recall Terminology: p predictors $x_1, x_2, \ldots, x_p$ (some might be indicator variables for categorical variables); k-1 non-constant terms $u_1, u_2, \ldots, u_{k-1}$. Each u

Statistical inference (estimation, hypothesis tests, confidence intervals) Oct 2018

Sampling A trait is measured on each member of a population. f(y) = propn of individuals in the popn with measurement

ST505/S697R: Fall 2012. Homework 2 Solution.

1. 1a; problem 1.22 Below is the summary information (edited) from the regression (using R output); the code is at the end of the solution, as are the code and output for SAS. a)

STAT 3022 Spring 2007

Simple Linear Regression Example These commands reproduce what we did in class. You should enter these in R and see what they do. Start by typing > set.seed(42) to reset the random number generator so

Chapter 11: Analysis of variance

Note made by: Timothy Hanson Instructor: Peijie Hou Department of Statistics, University of South Carolina Stat 205: Elementary Statistics for the Biological and Life Sciences

Example: Four levels of herbicide strength in an experiment on dry weight of treated plants.

The idea of ANOVA Reminders: A factor is a variable that can take one of several levels used to differentiate one group from another. An experiment has a one-way, or completely randomized, design if several

Notes on Maxwell & Delaney

PSY710 12 higher-order within-subject designs Chapter 11 discussed the analysis of data collected in experiments that had a single, within-subject factor. Here we extend those

Part II – Oneway Anova, Simple Linear Regression and ANCOVA with R

Gilles Lamothe, February 21, 2017. Contents: 1 Anova with one factor; 1.1 The data; 1.2 A visual

Statistics 512: Solution to Homework#11. Problems 1-3 refer to the soybean sausage dataset of Problem 20.8 (ch21pr08.dat).

1. Perform the two-way ANOVA without interaction for this model. Use the results


SCHOOL OF MATHEMATICS AND STATISTICS RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling

For a detailed review of Chapters 1-7, please see the review sheets for exam 1 and. The following only briefly covers these sections. The final exam could contain problems that are included

Comparisons among means (or, the analysis of factor effects)

In carrying out our usual test that $\mu_1 = \cdots = \mu_r$, we might be content to just reject this omnibus hypothesis but typically more is required:

Lec 1: An Introduction to ANOVA

Ying Li Stockholm University October 31, 2011 Three end-aisle displays Which is the best? Design of the Experiment Identify the stores of the similar size and type. The displays are randomly assigned to

Linear models and their mathematical foundations: Simple linear regression

Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 Introduction

STAT 3A03 Applied Regression With SAS Fall 2017

Assignment 2 Solution Set Q. 1 I will add subscripts relating to the question part to the parameters and their estimates as well as the errors and residuals.

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS

1a. Under the null hypothesis X has the binomial (100, .5) distribution with E(X) = 50 and SE(X) = 5. So $P(|X - 50| > 10)$ is (approximately) two tails

Ch 3: Multiple Linear Regression

1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

Statistics Lab #6 Factorial ANOVA

PSYCH 710 Initialize R Initialize R by entering the following commands at the prompt. You must type the commands exactly as shown. options(contrasts=c("contr.sum","contr.poly")

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

General Linear Model (Chapter 4)

Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

1 Introduction to Minitab

Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at When you

Central Limit Theorem ( 5.3)

Let $X_1, X_2, \ldots$ be a sequence of independent random variables, each having mean $\mu$ and variance $\sigma^2$. Then the distribution of the partial sum $S_n = \sum_{i=1}^{n} X_i$ becomes approximately

Y i. is the sample mean basal area and µ is the population mean. The difference between them is Y µ. We know the sampling distribution

7.. In this problem, we envision the sample Y, Y,..., Y 9, where Y i basal area of ith tree measured in sq inches, i,,..., 9. We assume the population distribution is N µ, 6, and µ is the population mean

Simple Linear Regression

MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro eariasca/math282a.html MATH 282A University

Hotelling's One-Sample T²

Chapter 405 Introduction The one-sample Hotelling's T² is the multivariate extension of the common one-sample or paired Student's t-test. In a one-sample t-test, the mean response

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

We now consider an analysis with only categorical predictors (i.e. all predictors are

Handout 4: Simple Linear Regression

By: Brandon Berman The following problem comes from Kokoska's Introductory Statistics: A Problem-Solving Approach. The data can be read into R using the following code:

Linear Combinations of Group Means

Look at the handicap example on p. 150 of the text. proc means data=mth567.disability; class handicap; var score; proc sort data=mth567.disability; by handicap; proc