Lecture 11 Analysis of variance
1 Lecture 11: Analysis of variance
Dr. Wim P. Krijnen, Lecturer Statistics
University of Groningen, Faculty of Mathematics and Natural Sciences
Johann Bernoulli Institute for Mathematics and Computer Science
October 18, 2010
2 Lecture overview
- one-way and two-way analysis of variance
- fixed effects; random effects are skipped
- parametric as well as nonparametric inference
- paired group comparisons
- correction for multiple testing
- purpose: finding evidence for valid statistical inferences
- distinguish: assumptions, hypotheses, conclusions
- order of approach: first concentrate on methods, then on data
- data preparation in the book is useful, but initially confusing
Why not repeated t-testing?
1. ANOVA has more power!
2. It avoids multiple testing
3 Example: consumed gasoline by pickup trucks
Gas consumption in gallons of 5 pickup trucks driving 500 miles, from the manufacturers Chevrolet, Dodge, and Ford:

    Chevy  Dodge  Ford
    15.2   14.8   15.1
    15.4   14.4   14.3
    14.8   14.3   14.6
    14.4   14.1   13.9
    14.7   14.4   14.6

Question: Are the means significantly different?
$H_0: \mu = \mu_1 = \mu_2 = \mu_3$ versus $H_A$: not all $\mu_i$ equal

    Population                 Sample        Description
    $y_{ij}$                   $Y_{ij}$      observation j in group i
    $\mu$                      $\bar{Y}$     total mean
    $\mu_i = \mu + \alpha_i$   $\bar{Y}_i$   mean for group i (identification: $\sum_i \alpha_i = 0$)

$H_0: \mu = \mu_1 = \mu_2 = \mu_3 \iff H_0: \alpha_1 = \alpha_2 = \alpha_3 = 0$
Equal means are equivalent to no effects.
4 Computing means and variances
Organize the data in a matrix and use apply:
    Chevy <- c(15.2,15.4,14.8,14.4,14.7)
    Dodge <- c(14.8,14.4,14.3,14.1,14.4)
    Ford <- c(15.1,14.3,14.6,13.9,14.6)
    truck.ma <- cbind(Chevy,Dodge,Ford)
    > apply(truck.ma,2,mean)
    Chevy Dodge  Ford
     14.9  14.4  14.5
    > apply(truck.ma,2,var)
    Chevy Dodge  Ford
    0.160 0.065 0.195
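5 Decomposition of sums of squares
Assume the model $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where the $\varepsilon_{ij}$ are iid $N(0, \sigma^2)$: normally distributed for all i, j, with equal group variances.
$Y_{ij}$: observation j in group i; $n_i$: number of observations in group i; $k$: number of groups.

Each deviation from the total mean splits into a within-group and a between-group deviation:
$$Y_{ij} - \bar{Y} = \underbrace{(Y_{ij} - \bar{Y}_i)}_{\text{within group deviation}} + \underbrace{(\bar{Y}_i - \bar{Y})}_{\text{between group deviation}}$$
Squaring and summing over all observations (the cross-product terms sum to zero within each group) gives
$$\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y})^2}_{\text{Total SS}} = \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2}_{\text{Within SS}} + \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (\bar{Y}_i - \bar{Y})^2}_{\text{Between SS}}$$
Total SS = Within SS + Between SS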
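The decomposition can be checked numerically. The following is a minimal sketch on the gasoline data of the lecture; the names `group`, `grand.mean` and `group.means` are chosen here for illustration and do not come from the slides.

```r
# Gasoline data from the lecture (gallons per 500 miles)
Chevy <- c(15.2, 15.4, 14.8, 14.4, 14.7)
Dodge <- c(14.8, 14.4, 14.3, 14.1, 14.4)
Ford  <- c(15.1, 14.3, 14.6, 13.9, 14.6)

y     <- c(Chevy, Dodge, Ford)
group <- gl(3, 5, labels = c("Chevy", "Dodge", "Ford"))

grand.mean  <- mean(y)
group.means <- ave(y, group)   # group mean repeated for every observation

Total.SS   <- sum((y - grand.mean)^2)            # deviations from the total mean
Within.SS  <- sum((y - group.means)^2)           # deviations from the group means
Between.SS <- sum((group.means - grand.mean)^2)  # group means around the total mean

# Both sides of the decomposition agree up to floating-point rounding
all.equal(Total.SS, Within.SS + Between.SS)
```

Summing the between-group term over all observations is the same as weighting each squared group deviation by $n_i$, which is how the slides compute it in the loop version.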
6 F-test of ANOVA
$$\text{Within MS} = \frac{1}{n-k}\sum_{i=1}^{k}\sum_{j=1}^{n_i}(Y_{ij}-\bar{Y}_i)^2, \qquad \text{Between MS} = \frac{1}{k-1}\sum_{i=1}^{k}\sum_{j=1}^{n_i}(\bar{Y}_i-\bar{Y})^2$$
$$F = \frac{\text{Between MS}}{\text{Within MS}}$$
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$; equal group means in the population
reject $H_0$ if p-value $= P(F_{k-1,n-k} > F)$ = 1-pf(F,k-1,n-k) $< \alpha$
large population differences in means generate differences in sample means with high probability
large sample differences in means ⇒ large Between MS ⇒ large F-value ⇒ small p-value
general idea: make inferences about means by analysis of variance (in the F!)
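As a small illustration of the decision rule, the sketch below uses the mean squares reported for the gasoline data (Between MS = 0.35, Within MS = 0.14) and shows that comparing the p-value with α is equivalent to comparing F with a critical value. The variable names and the choice α = 0.05 are assumptions made for this example.

```r
k <- 3; n <- 15          # number of groups and of observations (gasoline data)
alpha <- 0.05            # significance level assumed for this example

Between.MS <- 0.35
Within.MS  <- 0.14
F.value <- Between.MS / Within.MS

# Upper tail probability of the F distribution with (k-1, n-k) degrees of freedom
p.value <- pf(F.value, k - 1, n - k, lower.tail = FALSE)

# Equivalent decision rule: reject H0 when F exceeds the (1 - alpha) quantile
F.crit <- qf(1 - alpha, k - 1, n - k)

c(F = F.value, p.value = p.value, F.crit = F.crit)
(p.value < alpha) == (F.value > F.crit)   # both rules give the same decision
```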
7 Elementary computation of sums of squares
    Chevy <- c(15.2,15.4,14.8,14.4,14.7)
    Dodge <- c(14.8,14.4,14.3,14.1,14.4)
    Ford <- c(15.1,14.3,14.6,13.9,14.6)
    k <- 3; ni <- 5; n <- 15
    truck.ma <- cbind(Chevy,Dodge,Ford)
    Total.mean <- sum(truck.ma)/n
    Within.SS <- 0; Between.SS <- 0
    for (i in 1:k) Within.SS <- Within.SS + sum((truck.ma[,i] - mean(truck.ma[,i]))^2)
    for (i in 1:k) Between.SS <- Between.SS + ni * (mean(truck.ma[,i]) - Total.mean)^2
    > (Within.MS <- Within.SS/(n-k))
    [1] 0.14
    > (Between.MS <- Between.SS/(k-1))
    [1] 0.35
    > (F <- Between.MS/Within.MS) # gives 2.5
    > 1-pf(F,k-1,n-k) # gives 0.1237; do not reject H0
8 Example: F-test for gasoline consumption data
    > Chevy <- c(15.2,15.4,14.8,14.4,14.7)
    > Dodge <- c(14.8,14.4,14.3,14.1,14.4)
    > Ford <- c(15.1,14.3,14.6,13.9,14.6)
    > truck.dat <- c(Chevy,Dodge,Ford)
    > truck.fac <- gl(3,5)
    > summary(aov(truck.dat ~ truck.fac)) # or
    > anova(lm(truck.dat ~ truck.fac))
    Analysis of Variance Table
    Response: truck.dat
              Df Sum Sq Mean Sq F value Pr(>F)
    truck.fac  2   0.70    0.35     2.5 0.1237
    Residuals 12   1.68    0.14
k = 3, n = 15, Between SS = 0.70, Between MS = 0.70/2 = 0.35, Within SS = 1.68, Within MS = 1.68/12 = 0.14, F = 0.35/0.14 = 2.5, p-value = 0.1237 > α ⇒ do not reject $H_0$
9 Extract predictor matrix by model.matrix
Illustrate by an example: extract the predictor matrix
    > model <- lm(truck.dat ~ truck.fac)
    > (X <- model.matrix(model))
    (Intercept) truck.fac2 truck.fac3
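To make the structure of the predictor matrix concrete, here is a minimal sketch that rebuilds it by hand, assuming R's default treatment contrasts (first level as reference). `X.hand` is a name introduced for this illustration only.

```r
truck.fac <- gl(3, 5)                 # 3 groups of 5 observations, levels "1","2","3"

# Predictor matrix as R builds it from the factor
X <- model.matrix(~ truck.fac)

# The same matrix by hand: a column of ones plus 0/1 indicators
# for the second and the third group (the first group is the reference)
X.hand <- cbind("(Intercept)" = 1,
                truck.fac2    = as.numeric(truck.fac == "2"),
                truck.fac3    = as.numeric(truck.fac == "3"))

all(X == X.hand)   # the two constructions coincide
colnames(X)
```

Rows 1 to 5 (first group) have zeros in both indicator columns, rows 6 to 10 have a one in `truck.fac2`, and rows 11 to 15 have a one in `truck.fac3`.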
10 Matrix computation of predicted values
Perform regression analysis with respect to X and compute $\hat{y}_i$ by two methods:
    X <- model.matrix(model)
    y <- c(Chevy,Dodge,Ford)
    betahat <- solve(t(X) %*% X) %*% t(X) %*% y
    yhat <- X %*% betahat
    > sqrt(sum((fitted(model)-yhat)^2)) # of order 1e-14
The Euclidean distance between the $\hat{y}_i$ from lm and from the matrix regression is numerically 0.
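The least squares coefficients have a direct interpretation in terms of group means. The sketch below, which assumes R's default treatment contrasts, solves the normal equations with `crossprod` (a design choice made here to avoid forming the explicit inverse) and compares the result with the group means.

```r
Chevy <- c(15.2, 15.4, 14.8, 14.4, 14.7)
Dodge <- c(14.8, 14.4, 14.3, 14.1, 14.4)
Ford  <- c(15.1, 14.3, 14.6, 13.9, 14.6)
y         <- c(Chevy, Dodge, Ford)
truck.fac <- gl(3, 5)

X <- model.matrix(~ truck.fac)

# Solve the normal equations (X'X) b = X'y
betahat <- solve(crossprod(X), crossprod(X, y))

# Interpretation: intercept = mean of the reference group (Chevy),
# other coefficients = differences of group means to the reference group
group.means <- tapply(y, truck.fac, mean)
cbind(betahat = as.vector(betahat),
      from.means = c(group.means[1], group.means[2:3] - group.means[1]))
```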
11 Summary of model estimation
    > summary(lm(truck.dat ~ truck.fac))
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)                               <2e-16 ***
    truck.fac2
    truck.fac3
    Residual standard error: on 12 degrees of freedom
    Multiple R-squared: , Adjusted R-squared:
    F-statistic: 2.5 on 2 and 12 DF, p-value:
estimated effects: $\hat{\alpha}_1$ = .9 (not printed), $\hat{\alpha}_2 = -.5$, $\hat{\alpha}_3 = -.4$
standard error follows from least squares
t-value: estimate divided by standard error
p-value from the R function pt
Conclusion: do not reject $H_0: \mu_1 = \mu_2 = \mu_3$, that is, $\alpha_1 = \alpha_2 = \alpha_3 = 0$
12 Testing validity of assumptions
The model implies the error $y_{ij} - (\mu + \alpha_i) = \varepsilon_{ij} \sim$ iid $N(0, \sigma^2)$
homoscedasticity assumption: equal group variances
Bartlett's procedure tests $H_0: \sigma^2 = \sigma_1^2 = \sigma_2^2 = \sigma_3^2$ versus $H_A$: not all group variances equal
The Shapiro-Wilk procedure tests for normality of the $\varepsilon_{ij}$
    > model <- lm(truck.dat ~ truck.fac)
    > shapiro.test(residuals(model))
    Shapiro-Wilk normality test
    data: residuals(model)
    W = , p-value =
    > bartlett.test(truck.dat ~ truck.fac)
    Bartlett test of homogeneity of variances
    data: truck.dat by truck.fac
    Bartlett's K-squared = , df = 2, p-value =
Conclusion: Do not reject normality, homoscedasticity
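13 Paired Comparisons
Test the difference between experimental effects: $H_0: \alpha_i = \alpha_j$ versus $H_A: \alpha_i \neq \alpha_j$
Under $H_0$, $\bar{Y}_i - \bar{Y}_j$ has the normal density $\phi\left(0, \sigma^2\left(\frac{1}{n_i} + \frac{1}{n_j}\right)\right)$
Substitute the estimator $S^2$ for $\sigma^2$ in the test statistic
$$T_{ij} = \frac{\bar{Y}_i - \bar{Y}_j}{SE(\bar{Y}_i - \bar{Y}_j)} = \frac{\bar{Y}_i - \bar{Y}_j}{\sqrt{S^2\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$$
reject $H_0$ if p-value $= P(t_{n-k} > |T_{ij}|)$ = 1 - pt(abs(Tij),n-k) $< \alpha/2$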
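The paired comparison can be sketched by hand for one pair of groups. The example below uses the gasoline data (Chevy versus Dodge) and checks the result against `pairwise.t.test` from base R without p-value adjustment; note that the built-in function reports two-sided p-values, so the one-sided tail probability is doubled for the comparison. The variable names are assumptions of this illustration.

```r
Chevy <- c(15.2, 15.4, 14.8, 14.4, 14.7)
Dodge <- c(14.8, 14.4, 14.3, 14.1, 14.4)
Ford  <- c(15.1, 14.3, 14.6, 13.9, 14.6)
y     <- c(Chevy, Dodge, Ford)
group <- gl(3, 5, labels = c("Chevy", "Dodge", "Ford"))

n     <- length(y)
k     <- nlevels(group)
ni    <- tapply(y, group, length)
means <- tapply(y, group, mean)

# Pooled variance estimate S^2 (the Within MS of the ANOVA)
S2 <- sum((y - ave(y, group))^2) / (n - k)

# Test statistic for Chevy versus Dodge
T12 <- unname((means["Chevy"] - means["Dodge"]) /
              sqrt(S2 * (1 / ni["Chevy"] + 1 / ni["Dodge"])))

p.one.sided <- 1 - pt(abs(T12), n - k)   # compare with alpha/2
p.two.sided <- 2 * p.one.sided           # compare with alpha

# Built-in check: unadjusted pairwise t-tests with pooled standard deviation
pw <- pairwise.t.test(y, group, p.adjust.method = "none", pool.sd = TRUE)$p.value
c(by.hand = p.two.sided, built.in = pw["Dodge", "Chevy"])
```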
14 Example: blood coagulation times
Blood coagulation times from 24 animals receiving 4 different diets (Box, Hunter & Hunter, 1978)
    > library(faraway)
    > data(coagulation)
    > anova(lm(coag ~ diet, data=coagulation))
    Analysis of Variance Table
    Response: coag
              Df Sum Sq Mean Sq F value Pr(>F)
    diet                                  e-05 ***
    Residuals
Conclusion: There are differences in means
Question: Which?
15 Answer: Detect Least Significant Differences
    library(agricolae); library(faraway)
    model <- aov(coag ~ diet, data=coagulation)
    df <- df.residual(model)
    MS.error <- deviance(model)/df
    > with(coagulation, LSD.test(coag,diet,df,MS.error,group=FALSE))
          Difference pvalue sig LCL UCL
    B - A                   **
    C - A                   ***
    A - D
    C - B
    B - D                   ***
    C - D                   ***
Conclusion: starred differences are significant (equality rejected)
16 Correction for multiple testing
We just made $\binom{4}{2} = 6$ paired comparisons
If all null hypotheses hold and the tests are independent, the number X of false positives is binomially distributed with π = 0.05, n = 6:
P(X ≥ 1) = sum(dbinom(1:6, 6, 0.05)) ≈ 0.265 > 0.05
Solution 1: adjust alpha: $\alpha^* = \alpha / \binom{4}{2}$
Solution 2: multiply raw p-values by $\binom{4}{2}$; call them adjusted
    > with(coagulation, LSD.test(coag,diet,df,MS.error,group=FALSE, p.adj="bonferroni"))
          Difference pvalue sig LCL UCL
    B - A                   *
    C - A                   **
    A - D
    C - B
    B - D                   **
    C - D                   ***
    > * 6
    [1]
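Both solutions lead to the same decisions, which the following sketch illustrates with base R only. The raw p-values in `p.raw` are hypothetical numbers chosen for the example, not the values of the coagulation analysis.

```r
alpha <- 0.05
m     <- choose(4, 2)                 # 6 paired comparisons among 4 diets

# Probability of at least one false positive among m independent tests
fwer <- 1 - (1 - alpha)^m             # same as sum(dbinom(1:m, m, alpha))

# Solution 1: compare raw p-values with the adjusted level alpha / m
alpha.adj <- alpha / m

# Solution 2: multiply raw p-values by m (capped at 1) and compare with alpha
p.raw  <- c(0.004, 0.020, 0.030, 0.200, 0.500, 0.0001)   # hypothetical values
p.bonf <- p.adjust(p.raw, method = "bonferroni")

fwer
rbind(reject.solution1 = p.raw  < alpha.adj,
      reject.solution2 = p.bonf < alpha)
```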
17 Tukey's honest significant differences (HSD)
Example:
    > TukeyHSD(aov(coag ~ diet, data=coagulation))
    Tukey multiple comparisons of means
    95% family-wise confidence level
    Fit: aov(formula = coag ~ diet, data = coagulation)
    $diet
        diff lwr upr p adj
    B-A
    C-A
    D-A
    C-B
    D-B
    D-C
18 Conclusion: 95% confidence intervals not containing zero indicate significant differences ($H_0$ rejected)
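How the Tukey intervals and adjusted p-values arise can be sketched with the studentized range functions `ptukey` and `qtukey` of base R. The example uses the gasoline data of the lecture rather than the coagulation data, so that it runs without extra packages; the variable names are assumptions of this illustration.

```r
Chevy <- c(15.2, 15.4, 14.8, 14.4, 14.7)
Dodge <- c(14.8, 14.4, 14.3, 14.1, 14.4)
Ford  <- c(15.1, 14.3, 14.6, 13.9, 14.6)
y     <- c(Chevy, Dodge, Ford)
group <- gl(3, 5, labels = c("Chevy", "Dodge", "Ford"))

fit   <- aov(y ~ group)
k     <- nlevels(group)
df    <- df.residual(fit)
MSE   <- deviance(fit) / df
ni    <- tapply(y, group, length)
means <- tapply(y, group, mean)

# Dodge minus Chevy, with the standard error used for the studentized range
d  <- unname(means["Dodge"] - means["Chevy"])
se <- unname(sqrt(MSE / 2 * (1 / ni["Dodge"] + 1 / ni["Chevy"])))

p.adj <- ptukey(abs(d) / se, nmeans = k, df = df, lower.tail = FALSE)
half  <- qtukey(0.95, nmeans = k, df = df) * se   # half width of the 95% interval

by.hand <- c(diff = d, lwr = d - half, upr = d + half, "p adj" = p.adj)
tk      <- TukeyHSD(fit)$group
rbind(by.hand, TukeyHSD = tk["Dodge-Chevy", ])
```

For these data the interval contains zero, in line with the non-significant F-test reported earlier.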
19 Kruskal-Wallis test
Compute the ranks of all observations and $R_i$, the sum of the ranks per treatment
If all observations are different (no ties), then
$$\chi^2 = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)$$
reject $H_0$: mean ranks are equal among groups, if p-value = 1-pchisq(chi,k-1) $< \alpha$
good power properties (Lehmann, 1998, Nonparametrics)
    > kruskal.test(truck.dat ~ truck.fac)
    Kruskal-Wallis rank sum test
    data: truck.dat by truck.fac
    Kruskal-Wallis chi-squared = , df = 2, p-value =
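The rank-based statistic can be computed step by step. Because the formula above holds only without ties, and the gasoline data contain tied values, the sketch below uses hypothetical tie-free measurements (they are not lecture data) and compares the result with `kruskal.test`.

```r
# Hypothetical tie-free measurements for three groups of five (not lecture data)
y <- c(15.2, 15.4, 14.8, 14.5, 14.7,
       14.9, 14.4, 14.3, 14.1, 14.0,
       15.1, 14.2, 14.6, 13.9, 13.8)
g <- gl(3, 5)

n <- length(y)
k <- nlevels(g)

r  <- rank(y)                  # ranks of all observations together
Ri <- tapply(r, g, sum)        # rank sum per group
ni <- tapply(r, g, length)     # group sizes

# Kruskal-Wallis statistic (valid as written when there are no ties)
chi <- 12 / (n * (n + 1)) * sum(Ri^2 / ni) - 3 * (n + 1)

# p-value from the chi-squared distribution with k - 1 degrees of freedom
p.value <- 1 - pchisq(chi, k - 1)

kt <- kruskal.test(y ~ g)
c(by.hand = chi, built.in = unname(kt$statistic))
c(by.hand = p.value, built.in = kt$p.value)
```

With ties, `kruskal.test` divides the statistic by a correction factor, so the hand computation and the built-in result would then differ slightly.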
20 Air Quality in New York
Daily air quality measurements in New York, May to September
Ozone = mean ozone in parts per billion from 1300 to 1500 hours at Roosevelt Island
    > model <- lm(Ozone ~ Month, data = airquality)
    > shapiro.test(residuals(model))
    Shapiro-Wilk normality test
    data: residuals(model)
    W = , p-value = 1.022e-07 # reject normality
    > bartlett.test(Ozone ~ Month, data = airquality)
    Bartlett test of homogeneity of variances
    data: Ozone by Month
    Bartlett's K-squared = , df = 4, p-value = # reject homoscedasticity
    > kruskal.test(Ozone ~ Month, data = airquality)
    Kruskal-Wallis rank sum test
    data: Ozone by Month
    Kruskal-Wallis chi-squared = , df = 4, p-value = 6.901e-06 # reject equal distribution
21 Two-way analysis of variance, fixed effects
Two factors α and β:
$$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$$
$y_{ijk}$: measurement of case k, level i of factor α, level j of factor β
$\mu$: overall population mean
$\alpha_i$: effect of level i of factor α ($\sum_i \alpha_i = 0$)
$\beta_j$: effect of level j of factor β ($\sum_j \beta_j = 0$)
$\gamma_{ij}$: interaction effect between level i of α and level j of β ($\sum_i \sum_j \gamma_{ij} = 0$)
$\varepsilon_{ijk}$: error (iid normal with mean 0 and variance $\sigma^2$)

    Model                                                                      Formula in R
    $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$                               y ~ a
    $y_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk}$                   y ~ a + b
    $y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$     y ~ a + b + a:b
    $y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$     y ~ a * b

$H_0: \alpha_1 = \alpha_2 = \alpha_3 = 0$; $H_0: \beta_1 = \beta_2 = \beta_3 = 0$; $H_0: \gamma_{ij} = 0$ for all i, j
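That the last two formulas describe the same model can be checked on a small example. The sketch below uses a hypothetical balanced 2 x 3 layout with simulated responses (neither the factors `a`, `b` nor the data come from the lecture).

```r
# Hypothetical balanced layout: 2 levels of a, 3 levels of b, 2 cases per cell
set.seed(1)
a <- gl(2, 6, labels = c("a1", "a2"))
b <- gl(3, 2, length = 12, labels = c("b1", "b2", "b3"))
y <- rnorm(12, mean = 10)            # simulated responses, no real effects

fit.long  <- lm(y ~ a + b + a:b)     # main effects plus interaction, written out
fit.short <- lm(y ~ a * b)           # shorthand for the same model

# Same coefficients, hence the same fitted model
all.equal(coef(fit.long), coef(fit.short))

# One row per hypothesis: main effect a, main effect b, interaction a:b
anova(fit.short)
```

The ANOVA table has one F-test for each of the three null hypotheses, with the residual row providing the Within MS in every denominator.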
22 Weight gain due to diet
Weight gain (grams) in rats due to different types of diets
Two factors investigated (Hand et al., 1993):
- source of protein: beef, cereal
- type of diet: low, high amount of protein

    Beef        Cereal
    Low  High   Low  High
23
    > library(HSAUR)
    > anova(lm(weightgain ~ type * source, data=weightgain))
    Analysis of Variance Table
    Response: weightgain
                Df Sum Sq Mean Sq F value Pr(>F)
    type                                         *
    source
    type:source
    Residuals
Model parts: population mean, main effect of type, main effect of source, interaction effect
Residuals SS = Within SS = , Within MS = /36 = , Between type SS = , df = number of levels - 1 = 1, Between MS = /1, F = / = , p-value = < α
Conclusion: reject $H_0$ for type of diet; do not reject $H_0$ for source
More informationCHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)
FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter
More informationOutline. Topic 20 - Diagnostics and Remedies. Residuals. Overview. Diagnostics Plots Residual checks Formal Tests. STAT Fall 2013
Topic 20 - Diagnostics and Remedies - Fall 2013 Diagnostics Plots Residual checks Formal Tests Remedial Measures Outline Topic 20 2 General assumptions Overview Normally distributed error terms Independent
More informationEconometrics. 4) Statistical inference
30C00200 Econometrics 4) Statistical inference Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Confidence intervals of parameter estimates Student s t-distribution
More informationGlossary. The ISI glossary of statistical terms provides definitions in a number of different languages:
Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the
More informationMultiple Predictor Variables: ANOVA
Multiple Predictor Variables: ANOVA 1/32 Linear Models with Many Predictors Multiple regression has many predictors BUT - so did 1-way ANOVA if treatments had 2 levels What if there are multiple treatment
More information610 - R1A "Make friends" with your data Psychology 610, University of Wisconsin-Madison
610 - R1A "Make friends" with your data Psychology 610, University of Wisconsin-Madison Prof Colleen F. Moore Note: The metaphor of making friends with your data was used by Tukey in some of his writings.
More informationNon-parametric tests, part A:
Two types of statistical test: Non-parametric tests, part A: Parametric tests: Based on assumption that the data have certain characteristics or "parameters": Results are only valid if (a) the data are
More informationThe Distribution of F
The Distribution of F It can be shown that F = SS Treat/(t 1) SS E /(N t) F t 1,N t,λ a noncentral F-distribution with t 1 and N t degrees of freedom and noncentrality parameter λ = t i=1 n i(µ i µ) 2
More informationSTATISTICS 141 Final Review
STATISTICS 141 Final Review Bin Zou bzou@ualberta.ca Department of Mathematical & Statistical Sciences University of Alberta Winter 2015 Bin Zou (bzou@ualberta.ca) STAT 141 Final Review Winter 2015 1 /
More informationMcGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination
McGill University Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II Final Examination Date: 20th April 2009 Time: 9am-2pm Examiner: Dr David A Stephens Associate Examiner: Dr Russell Steele Please
More informationGeneral Linear Model (Chapter 4)
General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients
More informationSTAT763: Applied Regression Analysis. Multiple linear regression. 4.4 Hypothesis testing
STAT763: Applied Regression Analysis Multiple linear regression 4.4 Hypothesis testing Chunsheng Ma E-mail: cma@math.wichita.edu 4.4.1 Significance of regression Null hypothesis (Test whether all β j =
More informationANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS
ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS Ravinder Malhotra and Vipul Sharma National Dairy Research Institute, Karnal-132001 The most common use of statistics in dairy science is testing
More informationEXST Regression Techniques Page 1. We can also test the hypothesis H :" œ 0 versus H :"
EXST704 - Regression Techniques Page 1 Using F tests instead of t-tests We can also test the hypothesis H :" œ 0 versus H :" Á 0 with an F test.! " " " F œ MSRegression MSError This test is mathematically
More information