Reduced slides. Introduction to Analysis of Variance (ANOVA) Part 1. Single factor



The logic of Analysis of Variance
Is the variance explained by the model >> than the residual variance?
In regression models: variance explained by regression model vs unexplained variance
In ANOVA models: variance explained by factors >> than unexplained variance
In common language: is the variability among treatments greater than the variability within treatments?

ANOVA vs regression
One-factor ANOVA: 1 continuous response variable and 1 categorical predictor variable (factor)
Compare with regression: 1 continuous response variable and 1 continuous predictor variable

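Aims
Measure the relative contribution of different sources of variation (factors or combinations of factors) to the total variation in the response variable
Test hypotheses about group (treatment) population means for the response variable

Data layout
Factor level (group)    1                2                ...   i
Replicates              $y_{11}$         $y_{21}$         ...   $y_{i1}$
                        ...              ...              ...   ...
                        $y_{1j}$         $y_{2j}$         ...   $y_{ij}$
                        ...              ...              ...   ...
                        $y_{1n}$         $y_{2n}$         ...   $y_{in}$
Sample means            $\bar{y}_1$      $\bar{y}_2$      ...   $\bar{y}_i$
Population means        $\mu_1$          $\mu_2$          ...   $\mu_i$
Grand mean $\bar{y}$ estimates $\mu$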

Types of predictors (factors)
Fixed factor: all levels or groups of interest are used in the study; conclusions are restricted to those groups
Random factor: a random sample of all groups of interest is used in the study; typically the individual groups are not of interest; conclusions extrapolate to all possible groups

Linear model
Linear model for 1 factor ANOVA: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
where
$\mu$ = overall population mean
$\alpha_i$ = effect of the ith treatment or group ($\mu_i - \mu$)
$\varepsilon_{ij}$ = random or unexplained error (variation not explained by treatment effects)

Diatoms & heavy metals
Effect of heavy metals on species diversity of diatoms in streams in Colorado
Response variable: species diversity of diatoms
Predictor variable: heavy metal level, categorical with 4 groups (background, low, medium, high)
Replicates are stations

Null hypothesis
$H_0: \mu_1 = \mu_2 = \dots = \mu_i = \mu$
No difference between population group (treatment) means
Mean species diversity of diatoms is the same for the 4 heavy metal levels

H0 - fixed factor
No effects of specific groups (treatments)
$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_i = 0$, where $\alpha_i = \mu_i - \mu$
No effect of the 4 heavy metal levels on diatom species diversity
Inference is only to these 4 heavy metal levels

Streams and diatoms
Does diatom diversity vary by stream?

H0 - random factor
No variation among means of all possible groups (treatments)
$H_0: \sigma_A^2 = 0$, i.e. $H_0: \sum_{i=1}^{N} (\mu_i - \mu)^2 / (N - 1) = 0$, where groups $i = 1$ to $N$ (streams) are chosen randomly
Test: no variation in diatom species diversity between randomly chosen streams
Inference is to all streams (within ??? Region) sampled by N number of streams

Partitioning variation
Variation in the response variable is partitioned into:
variation explained by differences among groups (or treatments)
variation not explained (residual variation, within group)

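Regression: analysis of variance in Y
$\sum (y_i - \bar{y})^2$ = total variation (sum of squares) in Y
$\sum (\hat{y}_i - \bar{y})^2$ = variation in Y explained by regression (SS Regression)
$\sum (y_i - \hat{y}_i)^2$ = variation in Y unexplained by regression (SS Residual)
[Figure: least squares regression line of Y against X, marking the deviations $(y_i - \bar{y})$, $(\hat{y}_i - \bar{y})$ and $(y_i - \hat{y}_i)$ for one observation.]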

Partitioning the variance
[Figure: individual observations $y_{ij}$ plotted by Group (1, 2, 3).]

Partitioning the variance
[Figure: the same observations plotted by Group (1, 2, 3).]

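Partitioning the variance: sum of squares
[Figure: deviations within groups, $(y_{ij} - \bar{y}_i)$, and between groups, $(\bar{y}_i - \bar{y})$, plotted by Group (1, 2, 3).]

Partitioning the variance: sum of squares
$\sum_i \sum_j (y_{ij} - \bar{y})^2 = n \sum_i (\bar{y}_i - \bar{y})^2 + \sum_i \sum_j (y_{ij} - \bar{y}_i)^2$
[Figure: within-group (unexplained) deviations $(y_{ij} - \bar{y}_i)$, plotted by Group (1, 2, 3).]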

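Partitioning the variance: sum of squares
$\sum_i \sum_j (y_{ij} - \bar{y})^2 = n \sum_i (\bar{y}_i - \bar{y})^2 + \sum_i \sum_j (y_{ij} - \bar{y}_i)^2$
[Figure: between-group (explained) deviations $(\bar{y}_i - \bar{y})$, plotted by Group (1, 2, 3); n = 4 in this example.]

ANOVA
SS Total = SS Between groups + SS Within groups (Residual)
$\sum_i \sum_j (y_{ij} - \bar{y})^2 = n \sum_i (\bar{y}_i - \bar{y})^2 + \sum_i \sum_j (y_{ij} - \bar{y}_i)^2$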
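The identity above can be checked with a short derivation. This is a sketch added for clarity, assuming a balanced design with $p$ groups and $n$ replicates per group, and writing $\bar{y}_i$ for the group means and $\bar{y}$ for the grand mean:

```latex
\begin{align*}
\sum_{i=1}^{p}\sum_{j=1}^{n} (y_{ij}-\bar{y})^2
  &= \sum_{i}\sum_{j} \bigl[(\bar{y}_i-\bar{y}) + (y_{ij}-\bar{y}_i)\bigr]^2 \\
  &= n\sum_{i}(\bar{y}_i-\bar{y})^2
     + \sum_{i}\sum_{j}(y_{ij}-\bar{y}_i)^2
     + 2\sum_{i}(\bar{y}_i-\bar{y})\sum_{j}(y_{ij}-\bar{y}_i) \\
  &= n\sum_{i}(\bar{y}_i-\bar{y})^2 + \sum_{i}\sum_{j}(y_{ij}-\bar{y}_i)^2 .
\end{align*}
```

The cross term vanishes because the deviations from a group's own mean sum to zero within every group, $\sum_j (y_{ij}-\bar{y}_i) = 0$.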

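Mean squares
Average sum-of-squared deviations
Degrees of freedom: number of components minus 1
df total [pn-1] = df groups [p-1] + df residual [p(n-1)]
Mean square is a variance: SS divided by df

ANOVA table
Source      SS                                          df        MS
Groups      $n \sum_i (\bar{y}_i - \bar{y})^2$          p-1       $n \sum_i (\bar{y}_i - \bar{y})^2 / (p - 1)$
Residual    $\sum_i \sum_j (y_{ij} - \bar{y}_i)^2$      p(n-1)    $\sum_i \sum_j (y_{ij} - \bar{y}_i)^2 / (p(n - 1))$
Total       $\sum_i \sum_j (y_{ij} - \bar{y})^2$        pn-1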

Treatments (= groups) explain nothing, i.e. SS Groups equals zero
Replicate   Group1   Group2   Group3   Group4
1           16.0     15.0     16.0     17.0
2           15.0     17.0     16.0     16.0
3           17.0     16.0     17.0     15.0
4           16.0     16.0     15.0     16.0
Mean        16.0     16.0     16.0     16.0
Grand mean = 16.0

Treatments (= groups) explain everything, i.e. SS Residual equals zero
Replicate   Group1   Group2   Group3   Group4
1           19.5     15.0     16.5     13.0
2           19.5     15.0     16.5     13.0
3           19.5     15.0     16.5     13.0
4           19.5     15.0     16.5     13.0
Mean        19.5     15.0     16.5     13.0
Grand mean = 16.0
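The two extreme cases can be verified numerically. The sketch below is not part of the original slides; the helper name `partition_ss` is an illustrative choice, and it assumes a balanced design (equal replicates per group). It computes the three sums of squares for each table, entered column by column.

```python
from statistics import mean


def partition_ss(groups):
    """Return (SS_total, SS_groups, SS_residual) for a balanced layout.

    `groups` is a list of equal-length lists, one list of replicates per group.
    """
    n = len(groups[0])                              # replicates per group
    grand = mean(y for g in groups for y in g)      # grand mean
    group_means = [mean(g) for g in groups]
    ss_total = sum((y - grand) ** 2 for g in groups for y in g)
    ss_groups = n * sum((m - grand) ** 2 for m in group_means)
    ss_residual = sum((y - m) ** 2
                      for g, m in zip(groups, group_means) for y in g)
    return ss_total, ss_groups, ss_residual


# First table: each inner list is one group (one column of the table).
explain_nothing = [[16.0, 15.0, 17.0, 16.0], [15.0, 17.0, 16.0, 16.0],
                   [16.0, 16.0, 17.0, 15.0], [17.0, 16.0, 15.0, 16.0]]
# Second table: every replicate equals its group mean.
explain_everything = [[19.5] * 4, [15.0] * 4, [16.5] * 4, [13.0] * 4]

print(partition_ss(explain_nothing))     # (8.0, 0.0, 8.0)
print(partition_ss(explain_everything))  # (90.0, 90.0, 0.0)
```

In the first case all of SS Total sits in the residual; in the second, all of it sits between groups.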

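Testing ANOVA H0
Remember: linear model for 1 factor ANOVA: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$ and $\mu_i = \mu + \alpha_i$, where [...] can be [...] or [...]
All population group means the same: $\mu_1 = \mu_2 = \dots = \mu_i = \dots = \mu_a = \mu$
Fixed factor: $H_0: \alpha_1 = \alpha_2 = \dots = \alpha_i = 0$
Means that there is no variability across a fixed set of group means (limited inference)
Random factor (A): $H_0: \sigma_A^2 = 0$
Means that there is no variability across all possible group means (broad inference)

ANOVA table
Source      SS                                          df        MS                                                    F
Groups      $n \sum_i (\bar{y}_i - \bar{y})^2$          p-1       $n \sum_i (\bar{y}_i - \bar{y})^2 / (p - 1)$          MS Groups / MS Residual
Residual    $\sum_i \sum_j (y_{ij} - \bar{y}_i)^2$      p(n-1)    $\sum_i \sum_j (y_{ij} - \bar{y}_i)^2 / (p(n - 1))$
Total       $\sum_i \sum_j (y_{ij} - \bar{y})^2$        pn-1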
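The ANOVA table above can be assembled in a few lines. This is a minimal sketch, assuming a balanced design; the function name `one_factor_anova` and the small data set are hypothetical and do not come from the slides.

```python
from statistics import mean


def one_factor_anova(groups):
    """Build the one-factor ANOVA table for a balanced design.

    `groups` holds p lists of n replicates each (equal n is assumed).
    """
    p, n = len(groups), len(groups[0])
    grand = mean(y for g in groups for y in g)
    group_means = [mean(g) for g in groups]

    ss_groups = n * sum((m - grand) ** 2 for m in group_means)
    ss_residual = sum((y - m) ** 2
                      for g, m in zip(groups, group_means) for y in g)
    df_groups, df_residual = p - 1, p * (n - 1)
    ms_groups = ss_groups / df_groups          # mean square = SS / df
    ms_residual = ss_residual / df_residual
    return {
        "Groups": {"SS": ss_groups, "df": df_groups, "MS": ms_groups,
                   "F": ms_groups / ms_residual},
        "Residual": {"SS": ss_residual, "df": df_residual,
                     "MS": ms_residual},
        "Total": {"SS": ss_groups + ss_residual, "df": p * n - 1},
    }


# Hypothetical data: 3 groups with 3 replicates each.
table = one_factor_anova([[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]])
print(table["Groups"]["F"])  # 12.0
```

Here SS Groups = 24 on 2 df and SS Residual = 6 on 6 df, so F = 12 / 1 = 12. The sketch divides by MS Residual without a guard, so it fails when SS Residual is exactly zero.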

F-ratio statistic
F-ratio statistic is a ratio of sample variances (i.e. mean squares)
Probability distribution of the F-ratio is known: different distributions depending on the df of the variances
If homogeneity of variances holds, the F-ratio follows an F distribution

F distribution: a null distribution
[Figure: P(F) plotted against F (0 to 5) for 3, 4 df.]
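The idea of a null distribution can be illustrated by simulation. The sketch below is an added illustration, not the slides' method: it draws every group from the same normal population (so H0 is true) and collects the resulting F-ratios. The settings p = 4, n = 4 (3 and 12 df), the mean of 16, the standard deviation of 1 and the seed are all assumptions chosen for the example.

```python
import random


def f_ratio(groups):
    """F-ratio (MS Groups / MS Residual) for a balanced one-factor layout."""
    p, n = len(groups), len(groups[0])
    grand = sum(sum(g) for g in groups) / (p * n)
    means = [sum(g) / n for g in groups]
    ms_groups = n * sum((m - grand) ** 2 for m in means) / (p - 1)
    ms_residual = sum((y - m) ** 2
                      for g, m in zip(groups, means) for y in g) / (p * (n - 1))
    return ms_groups / ms_residual


def simulate_null_f(p=4, n=4, mu=16.0, sigma=1.0, n_sim=5000, seed=1):
    """Simulate F-ratios when all p groups share the same population mean."""
    rng = random.Random(seed)
    return [f_ratio([[rng.gauss(mu, sigma) for _ in range(n)]
                     for _ in range(p)])
            for _ in range(n_sim)]


null_f = simulate_null_f()
mean_f = sum(null_f) / len(null_f)
print(round(mean_f, 2))  # close to 1.2, the theoretical mean for 3 and 12 df
```

A histogram of `null_f` would approximate the F distribution with 3 and 12 df; its theoretical mean is df residual / (df residual - 2) = 12 / 10 = 1.2.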

Expected mean squares
If factor is fixed and homogeneity of variance assumption holds:
MS Groups estimates $\sigma^2 + n \sum_i (\alpha_i)^2 / (p - 1)$
MS Residual estimates $\sigma^2$
F-ratio = MS Groups / MS Residual

Testing H0 - fixed factor
If $H_0$ is true: all $\alpha_i$'s = 0; MS Groups and MS Residual both estimate $\sigma^2$, so F-ratio is approximately 1
If $H_0$ is false: at least one $\alpha_i \neq 0$; MS Groups estimates $\sigma^2$ + treatment effects, so F-ratio > 1
MS Groups: $\sigma^2 + n \sum_i (\alpha_i)^2 / (p - 1)$; MS Residual: $\sigma^2$
F-ratio = MS Groups / MS Residual
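The fixed-factor expected mean squares can be checked by simulation. In this sketch (an added illustration) the treatment effects, the error standard deviation, the number of replicates and the seed are all assumed values; the effects are chosen to sum to zero. Data are generated from $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$ and the two mean squares are averaged over many simulated experiments.

```python
import random


def mean_squares(groups):
    """Return (MS Groups, MS Residual) for a balanced one-factor layout."""
    p, n = len(groups), len(groups[0])
    grand = sum(sum(g) for g in groups) / (p * n)
    means = [sum(g) / n for g in groups]
    ms_groups = n * sum((m - grand) ** 2 for m in means) / (p - 1)
    ms_residual = sum((y - m) ** 2
                      for g, m in zip(groups, means) for y in g) / (p * (n - 1))
    return ms_groups, ms_residual


def simulate_fixed(effects, n=4, mu=16.0, sigma=1.0, n_sim=4000, seed=2):
    """Average MS Groups and MS Residual over n_sim simulated experiments."""
    rng = random.Random(seed)
    total_g = total_r = 0.0
    for _ in range(n_sim):
        # The same fixed effects are reused in every simulated experiment.
        groups = [[mu + a + rng.gauss(0.0, sigma) for _ in range(n)]
                  for a in effects]
        ms_g, ms_r = mean_squares(groups)
        total_g += ms_g
        total_r += ms_r
    return total_g / n_sim, total_r / n_sim


effects = [3.5, -1.0, 0.5, -3.0]   # assumed alpha_i values, summing to zero
n, sigma = 4, 1.0
# sigma^2 + n * sum(alpha_i^2) / (p - 1) = 1 + 4 * 22.5 / 3 = 31.0
expected_ms_groups = sigma ** 2 + n * sum(a ** 2 for a in effects) / (len(effects) - 1)
avg_ms_groups, avg_ms_residual = simulate_fixed(effects, n=n, sigma=sigma)
```

The averaged MS Groups should land near 31 and the averaged MS Residual near $\sigma^2 = 1$, which is why a false H0 pushes the F-ratio above 1.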

If factor is fixed and homogeneity of variance assumption holds:
              MS Groups                                          MS Residual
Expected      $\sigma^2 + n \sum_i (\alpha_i)^2 / (p - 1)$       $\sigma^2$
Calculated    $n \sum_i (\bar{y}_i - \bar{y})^2 / (p - 1)$       $\sum_i \sum_j (y_{ij} - \bar{y}_i)^2 / (p(n - 1))$
F-ratio = MS Groups / MS Residual

Expected mean squares (random factor)
If factor is random and homogeneity of variance assumption holds:
MS Groups estimates $\sigma^2 + n \sigma_A^2$
MS Residual estimates $\sigma^2$
F-ratio = MS Groups / MS Residual

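Testing H0 - random factor
If $H_0$ is true: $\sigma_A^2 = 0$; MS Groups and MS Residual both estimate $\sigma^2$, so F-ratio is approximately 1
If $H_0$ is false: $\sigma_A^2 > 0$; MS Groups estimates $\sigma^2$ plus added variance due to groups or treatments, so F-ratio > 1
MS Groups: $\sigma^2 + n \sigma_A^2$; MS Residual: $\sigma^2$
F-ratio = MS Groups / MS Residual

If factor is random and homogeneity of variance assumption holds:
              MS Groups                                          MS Residual
Expected      $\sigma^2 + n \sigma_A^2$                          $\sigma^2$
Calculated    $n \sum_i (\bar{y}_i - \bar{y})^2 / (p - 1)$       $\sum_i \sum_j (y_{ij} - \bar{y}_i)^2 / (p(n - 1))$
F-ratio = MS Groups / MS Residual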
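The random-factor case differs only in that the group effects are themselves drawn at random in every experiment. The sketch below is an added illustration; the among-group standard deviation of 2 (so $\sigma_A^2 = 4$), the error standard deviation of 1, p = 4, n = 4 and the seed are assumptions. It checks that the average MS Groups approaches $\sigma^2 + n \sigma_A^2$.

```python
import random


def mean_squares(groups):
    """Return (MS Groups, MS Residual) for a balanced one-factor layout."""
    p, n = len(groups), len(groups[0])
    grand = sum(sum(g) for g in groups) / (p * n)
    means = [sum(g) / n for g in groups]
    ms_groups = n * sum((m - grand) ** 2 for m in means) / (p - 1)
    ms_residual = sum((y - m) ** 2
                      for g, m in zip(groups, means) for y in g) / (p * (n - 1))
    return ms_groups, ms_residual


def simulate_random(sigma_a=2.0, p=4, n=4, mu=16.0, sigma=1.0,
                    n_sim=8000, seed=3):
    """Average the mean squares when groups are a random sample of groups."""
    rng = random.Random(seed)
    total_g = total_r = 0.0
    for _ in range(n_sim):
        # New group effects are sampled for every simulated experiment.
        effects = [rng.gauss(0.0, sigma_a) for _ in range(p)]
        groups = [[mu + a + rng.gauss(0.0, sigma) for _ in range(n)]
                  for a in effects]
        ms_g, ms_r = mean_squares(groups)
        total_g += ms_g
        total_r += ms_r
    return total_g / n_sim, total_r / n_sim


# sigma^2 + n * sigma_A^2 = 1 + 4 * 4 = 17
expected_ms_groups = 1.0 ** 2 + 4 * 2.0 ** 2
avg_ms_groups, avg_ms_residual = simulate_random()
```

With these assumed values the averaged MS Groups should be near 17 and the averaged MS Residual near 1. Setting `sigma_a=0.0` reproduces the H0 case, where both mean squares estimate $\sigma^2$.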

Full set of slides

Introduction to Analysis of Variance (ANOVA) Part 1: Single factor

The logic of Analysis of Variance
Is the variance explained by the model >> than the residual variance?
In regression models: variance explained by regression model vs unexplained variance
In ANOVA models: variance explained by factors >> than unexplained variance
In common language: is the variability among treatments greater than the variability within treatments?

ANOVA vs regression
One-factor ANOVA: 1 continuous response variable and 1 categorical predictor variable (factor)
Compare with regression: 1 continuous response variable and 1 continuous predictor variable

Aims
Measure the relative contribution of different sources of variation (factors or combinations of factors) to the total variation in the response variable
Test hypotheses about group (treatment) population means for the response variable

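Terminology
Factor (predictor variable): usually designated factor A
Number of levels/groups/treatments = p
Number of replicates within each group = n
Each observation: $y_{ij}$

Data layout
Factor level (group)    1                2                ...   i
Replicates              $y_{11}$         $y_{21}$         ...   $y_{i1}$
                        ...              ...              ...   ...
                        $y_{1j}$         $y_{2j}$         ...   $y_{ij}$
                        ...              ...              ...   ...
                        $y_{1n}$         $y_{2n}$         ...   $y_{in}$
Sample means            $\bar{y}_1$      $\bar{y}_2$      ...   $\bar{y}_i$
Population means        $\mu_1$          $\mu_2$          ...   $\mu_i$
Grand mean $\bar{y}$ estimates $\mu$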


Compare with regression model
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
intercept $\beta_0$ is replaced by $\mu$
slope $\beta_1$ is replaced by $\alpha_i$ (treatment effect):
predictor variable is categorical rather than continuous
still measures the effect of the predictor variable

Diatoms & heavy metals
Effect of heavy metals on species diversity of diatoms in streams in Colorado
Response variable: species diversity of diatoms
Predictor variable: heavy metal level, categorical with 4 groups (background, low, medium, high)
Replicates are stations

Null hypothesis
$H_0: \mu_1 = \mu_2 = \dots = \mu_i = \mu$
No difference between population group (treatment) means
Mean species diversity of diatoms is the same for the 4 heavy metal levels

H0 - fixed factor
No effects of specific groups (treatments)
$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_i = 0$, where $\alpha_i = \mu_i - \mu$
No effect of the 4 heavy metal levels on diatom species diversity
Inference is only to these 4 heavy metal levels

Streams and diatoms
Does diatom diversity vary by stream?

H0 - random factor
No variation among means of all possible groups (treatments)
$H_0: \sigma_A^2 = 0$, i.e. $H_0: \sum_{i=1}^{N} (\mu_i - \mu)^2 / (N - 1) = 0$, where groups $i = 1$ to $N$ (streams) are chosen randomly
Test: no variation in diatom species diversity between randomly chosen streams
Inference is to all streams (within ??? Region) sampled by N number of streams

Basic assumption of ANOVA (single factor)
$\sigma_1^2 = \sigma_2^2 = \dots = \sigma_i^2 = \dots = \sigma^2$
where $\sigma_i^2$ = population variance of the dependent variable ($y$) in each group (this is the within-group variation)
Each group (or treatment) population has similar variance: the homogeneity of variance assumption

Partitioning variation
Variation in the response variable is partitioned into:
variation explained by differences among groups (or treatments)
variation not explained (residual variation, within group)


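ANOVA
SS Total = SS Between groups + SS Within groups (Residual)
$\sum_i \sum_j (y_{ij} - \bar{y})^2 = n \sum_i (\bar{y}_i - \bar{y})^2 + \sum_i \sum_j (y_{ij} - \bar{y}_i)^2$

Partitioning the variance
[Figure: individual observations $y_{ij}$ plotted by Group (1, 2, 3).]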

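Partitioning the variance
[Figure: observations plotted by Group (1, 2, 3).]

Partitioning the variance
[Figure: deviations within groups, $(y_{ij} - \bar{y}_i)$, and between groups, $(\bar{y}_i - \bar{y})$, plotted by Group (1, 2, 3).]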

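Partitioning the variance
$\sum_i \sum_j (y_{ij} - \bar{y})^2 = n \sum_i (\bar{y}_i - \bar{y})^2 + \sum_i \sum_j (y_{ij} - \bar{y}_i)^2$
[Figure: within-group deviations $(y_{ij} - \bar{y}_i)$, plotted by Group (1, 2, 3).]

Partitioning the variance
$\sum_i \sum_j (y_{ij} - \bar{y})^2 = n \sum_i (\bar{y}_i - \bar{y})^2 + \sum_i \sum_j (y_{ij} - \bar{y}_i)^2$
[Figure: between-group deviations $(\bar{y}_i - \bar{y})$, plotted by Group (1, 2, 3); n = 4 in this example.]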
