STAT 705 Chapter 16: One-way ANOVA


Timothy Hanson
Department of Statistics, University of South Carolina
Stat 705: Data Analysis II

What is ANOVA?
- Analysis of variance (ANOVA) models are regression models with qualitative predictors, called factors or treatments.
- Factors have different levels. For example, the factor education may have the levels high school, undergraduate, graduate. The factor gender has two levels: female, male.
- We may have several factors as predictors, e.g. race and gender may be used to predict annual salary in $.
- There are two types of factors:
  - Classification (investigator cannot control)
  - Experimental (investigator can control)

ANOVA
- A control treatment (or control factor level) is sometimes used to measure the effects of (new or experimental) treatments under investigation, relative to the status quo.
- E.g. ibuprofen, aspirin, and placebo: we have 3 factor levels. Without placebo, we do not know how ibuprofen or aspirin does relative to no pain killer, only relative to each other.
- Uses of ANOVA models: find the best/worst treatment, measure the effectiveness of a new treatment, compare treatments.
- Often we are interested in determining whether there is a difference among treatments.
- Read Sections in the text

16.3 Cell means model
- Have $r$ different treatments or factor levels.
- At each level $i$, have $n_i$ observations from group $i$.
- Total number of observations is $n_T = n_1 + n_2 + \cdots + n_r$.
- Response is $Y_{ij}$, where $i = 1, \dots, r$ indexes the factor level and $j = 1, \dots, n_i$ indexes the observation within factor level $i$.
- Example: two factor levels, MS and PhD; $Y_{ij}$ is age in years. In Spring of 2014 we observe
  $Y_{11} = 28$, $Y_{12} = 24$, $Y_{13} = 24$, $Y_{14} = 22$, $Y_{15} = 26$, $Y_{16} = 23$,
  $Y_{21} = 29$, $Y_{22} = 23$, $Y_{23} = 26$, $Y_{24} = 25$, $Y_{25} = 22$, $Y_{26} = 23$, $Y_{27} = 38$, $Y_{28} = 33$, $Y_{29} = 30$, $Y_{2,10} = 27$.

One-way ANOVA model
$$Y_{ij} = \mu_i + \epsilon_{ij}, \qquad \epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$$
- Can rewrite as $Y_{ij} \overset{ind}{\sim} N(\mu_i, \sigma^2)$.
- Data are normal, data are independent, variance is constant across groups.
- $\mu_i$ is allowed to be different for each group; $\mu_1, \dots, \mu_r$ are the $r$ population means of the response.
- A picture helps.
- Questions: what is $E\{Y_{ij}\}$? What is $\sigma^2\{Y_{ij}\}$?

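Matrix formulation (pp , )
For $r = 3$ we have
$$
\begin{bmatrix} Y_{11} \\ Y_{12} \\ \vdots \\ Y_{1n_1} \\ Y_{21} \\ Y_{22} \\ \vdots \\ Y_{2n_2} \\ Y_{31} \\ Y_{32} \\ \vdots \\ Y_{3n_3} \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ \vdots & \vdots & \vdots \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ \vdots & \vdots & \vdots \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \\ \vdots & \vdots & \vdots \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{bmatrix}
+
\begin{bmatrix} \epsilon_{11} \\ \epsilon_{12} \\ \vdots \\ \epsilon_{1n_1} \\ \epsilon_{21} \\ \epsilon_{22} \\ \vdots \\ \epsilon_{2n_2} \\ \epsilon_{31} \\ \epsilon_{32} \\ \vdots \\ \epsilon_{3n_3} \end{bmatrix}
$$
or $Y = X\beta + \epsilon$.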

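16.4 Fitting the model
For $r = 3$, let
$$Q(\mu_1, \mu_2, \mu_3) = \sum_{i=1}^{3} \sum_{j=1}^{n_i} (Y_{ij} - \mu_i)^2.$$
Need to minimize this over all possible $(\mu_1, \mu_2, \mu_3)$ to find the least-squares (LS) solution. Can easily show that $Q(\mu_1, \mu_2, \mu_3)$ has its minimum at
$$\hat{\beta} = \begin{bmatrix} \hat{\mu}_1 \\ \hat{\mu}_2 \\ \hat{\mu}_3 \end{bmatrix} = \begin{bmatrix} \bar{Y}_{1\cdot} \\ \bar{Y}_{2\cdot} \\ \bar{Y}_{3\cdot} \end{bmatrix},$$
where $\bar{Y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij}$ is the sample mean from the $i$th group (pp ). These $\hat{\beta}$ are also the maximum likelihood estimates.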

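Matrix formula of least-squares estimators ($r = 3$)
$$
X'X = \begin{bmatrix} n_1 & 0 & 0 \\ 0 & n_2 & 0 \\ 0 & 0 & n_3 \end{bmatrix}, \qquad
(X'X)^{-1} = \begin{bmatrix} \frac{1}{n_1} & 0 & 0 \\ 0 & \frac{1}{n_2} & 0 \\ 0 & 0 & \frac{1}{n_3} \end{bmatrix}, \qquad
X'Y = \begin{bmatrix} Y_{1\cdot} \\ Y_{2\cdot} \\ Y_{3\cdot} \end{bmatrix},
$$
$$
\hat{\beta} = (X'X)^{-1} X'Y = \begin{bmatrix} \bar{Y}_{1\cdot} \\ \bar{Y}_{2\cdot} \\ \bar{Y}_{3\cdot} \end{bmatrix}.
$$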
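The matrix formula can be checked numerically. The following is a minimal sketch in plain Python (not SAS), using the MS/PhD ages from the cell means example, so $r = 2$ here rather than the $r = 3$ of the slide. The variable names are illustrative, and treating level 1 as MS and level 2 as PhD is an assumption based on the order in which the levels are listed.

```python
# Ages from the MS/PhD example (assumed: level 1 = MS, level 2 = PhD).
groups = [
    [28, 24, 24, 22, 26, 23],
    [29, 23, 26, 25, 22, 23, 38, 33, 30, 27],
]
r = len(groups)

# Stack the responses and build the cell means design matrix:
# the row for Y_ij has a 1 in column i and 0 elsewhere.
Y = [y for g in groups for y in g]
X = [[1 if col == i else 0 for col in range(r)]
     for i, g in enumerate(groups) for _ in g]

# X'X and X'Y computed directly from their definitions.
XtX = [[sum(row[a] * row[b] for row in X) for b in range(r)] for a in range(r)]
XtY = [sum(row[a] * y for row, y in zip(X, Y)) for a in range(r)]

# X'X is diagonal with entries n_i, so its inverse is diag(1 / n_i).
beta_hat = [XtY[a] / XtX[a][a] for a in range(r)]
group_means = [sum(g) / len(g) for g in groups]

print(XtX)       # diagonal matrix of group sizes
print(beta_hat)  # least-squares estimates of (mu_1, mu_2)
```

Because $X'X$ is diagonal, no general matrix inversion is needed; each estimate is simply a group sum divided by a group size.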

Residuals
As in regression (STAT 704),
$$e_{ij} = Y_{ij} - \hat{Y}_{ij} = Y_{ij} - \hat{\mu}_i = Y_{ij} - \bar{Y}_{i\cdot}.$$
As usual, $\hat{Y}_{ij}$ is the estimated mean response under the model.
Note that $\sum_{j=1}^{n_i} e_{ij} = 0$ [check this!].
In matrix terms, $e = Y - X\hat{\beta} = Y - \hat{Y}$.
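The "check this!" claim is easy to verify on data. Here is a small sketch in plain Python using the MS/PhD ages from the cell means example; the dictionary keys are illustrative labels, and the MS/PhD assignment of the two groups is assumed from the order in which the levels are listed.

```python
# Ages from the MS/PhD example (labels assumed from the listed order).
groups = {
    "MS": [28, 24, 24, 22, 26, 23],
    "PhD": [29, 23, 26, 25, 22, 23, 38, 33, 30, 27],
}

residuals = {}
for level, ys in groups.items():
    mean = sum(ys) / len(ys)                    # fitted value for every obs in the group
    residuals[level] = [y - mean for y in ys]   # e_ij = Y_ij - Ybar_i.

# Residuals sum to zero within each group (up to floating-point error).
residual_sums = {level: sum(e) for level, e in residuals.items()}
print(residual_sums)
```

The sums are zero because each group mean is, by construction, the value that balances the deviations within its own group.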

Kenton Food Company example
- $r = 4$ box designs for a new breakfast cereal.
- 20 stores with roughly equal sales volumes were picked to participate; $n_i = 5$ is planned for each design.
- A fire occurred at one store that had design 3, so we ended up with $n_T = 19$ instead of 20, with $n_1 = n_2 = n_4 = 5$ and $n_3 = 4$.

Kenton Foods example
data kenton;
  input sales design;
  datalines;
;
proc sgscatter; plot sales*design; run;
proc glm plots=all;
  * zero/one dummy variables, but recover cell means via lsmeans;
  class design; model sales=design; lsmeans design;
run;

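16.5 ANOVA table (pp )
Define the following:
$$Y_{i\cdot} = \sum_{j=1}^{n_i} Y_{ij} = i\text{th group sum}$$
$$\bar{Y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij} = i\text{th group mean}$$
$$Y_{\cdot\cdot} = \sum_{i=1}^{r} \sum_{j=1}^{n_i} Y_{ij} = \sum_{i=1}^{r} Y_{i\cdot} = \text{sum of all obs}$$
$$\bar{Y}_{\cdot\cdot} = \frac{1}{n_T} \sum_{i=1}^{r} \sum_{j=1}^{n_i} Y_{ij} = \frac{1}{n_T} \sum_{i=1}^{r} Y_{i\cdot} = \text{mean of all obs}$$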

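Sums of squares for treatments, error, and total
$$SSTO = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = \text{variability in the } Y_{ij}\text{'s}$$
$$SSTR = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (\hat{Y}_{ij} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (\hat{\mu}_i - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{r} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 = \text{variability explained by the ANOVA model}$$
$$SSE = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \hat{Y}_{ij})^2 = \sum_{i=1}^{r} \sum_{j=1}^{n_i} e_{ij}^2 = \text{variability NOT explained by the ANOVA model}$$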

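Comments
As before in regression,
$$\underbrace{SSTO}_{\text{total}} = \underbrace{SSTR}_{\text{treatment effects}} + \underbrace{SSE}_{\text{leftover randomness}}$$
- $SSE = 0 \Leftrightarrow Y_{ij} = Y_{ik}$ for all $j \neq k$ (within every level $i$).
- $SSTR = 0 \Leftrightarrow \bar{Y}_{i\cdot} = \bar{Y}_{\cdot\cdot}$ for $i = 1, \dots, r$.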
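The decomposition $SSTO = SSTR + SSE$ can be confirmed numerically. The following is a minimal sketch in plain Python on the MS/PhD ages from the cell means example; the variable names are illustrative and the block only evaluates the three sums of squares as defined above.

```python
# Ages from the MS/PhD example, one list per factor level.
groups = [
    [28, 24, 24, 22, 26, 23],
    [29, 23, 26, 25, 22, 23, 38, 33, 30, 27],
]

n_T = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n_T
group_means = [sum(g) / len(g) for g in groups]

# Total, treatment, and error sums of squares.
ssto = sum((y - grand_mean) ** 2 for g in groups for y in g)
sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
sse = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)

print(f"SSTO = {ssto:.4f}, SSTR = {sstr:.4f}, SSE = {sse:.4f}")
print(f"SSTR + SSE = {sstr + sse:.4f}")
```

For these data most of the total variability sits in SSE, meaning the two group means are close relative to the spread of ages within each group.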

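ANOVA table (p. 694)
$$
\begin{array}{llll}
\text{Source} & \text{SS} & \text{df} & \text{MS} \\ \hline
SSTR & \sum_{i=1}^{r} \sum_{j=1}^{n_i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 & r - 1 & SSTR/(r - 1) \\
SSE & \sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 & n_T - r & SSE/(n_T - r) \\
SSTO & \sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 & n_T - 1 &
\end{array}
$$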

Degrees of freedom
- SSTO has $n_T - 1$ df because there are $n_T$ terms $Y_{ij} - \bar{Y}_{\cdot\cdot}$ in the sum, but they add up to zero (1 constraint).
- SSE has $n_T - r$ df because there are $n_T$ terms $Y_{ij} - \bar{Y}_{i\cdot}$ in the sum, but there are $r$ constraints of the form $\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot}) = 0$.
- SSTR has $r - 1$ df because there are $r$ terms $n_i(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})$ in the sum, but they sum to zero (1 constraint).

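Expected mean squares
- $E\{MSE\} = \sigma^2$; MSE is an unbiased estimate of $\sigma^2$.
- $$E\{MSTR\} = \sigma^2 + \frac{\sum_{i=1}^{r} n_i (\mu_i - \mu_\cdot)^2}{r - 1}, \qquad \text{where } \mu_\cdot = \frac{\sum_{i=1}^{r} n_i \mu_i}{n_T}$$ is the weighted average of $\mu_1, \dots, \mu_r$ (pp ).
- If $\mu_i = \mu_j$ for all $i, j \in \{1, \dots, r\}$ then $E\{MSTR\} = \sigma^2$; otherwise $E\{MSTR\} > \sigma^2$.
- Hence, if any group means are different then $\frac{E\{MSTR\}}{E\{MSE\}} > 1$.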
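As a worked illustration (not from the slides), consider $r = 2$ with the group sizes of the MS/PhD example, $n_1 = 6$ and $n_2 = 10$, so $n_T = 16$. The weighted average and the deviations from it are

```latex
\mu_\cdot = \frac{6\mu_1 + 10\mu_2}{16}, \qquad
\mu_1 - \mu_\cdot = \frac{10}{16}(\mu_1 - \mu_2), \qquad
\mu_2 - \mu_\cdot = -\frac{6}{16}(\mu_1 - \mu_2),
```

so that, with $r - 1 = 1$,

```latex
E\{MSTR\} = \sigma^2 + 6\left(\tfrac{10}{16}\right)^2 (\mu_1 - \mu_2)^2
                     + 10\left(\tfrac{6}{16}\right)^2 (\mu_1 - \mu_2)^2
          = \sigma^2 + \frac{n_1 n_2}{n_1 + n_2}(\mu_1 - \mu_2)^2
          = \sigma^2 + 3.75\,(\mu_1 - \mu_2)^2 .
```

The excess over $\sigma^2$ vanishes exactly when $\mu_1 = \mu_2$ and grows with the squared difference in means, which is what makes the ratio of MSTR to MSE a sensible test statistic.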

16.6 F test of $H_0: \mu_1 = \cdots = \mu_r$
Fact: if $\mu_1 = \cdots = \mu_r$ then
$$F^* = \frac{MSTR}{MSE} \sim F(r - 1, n_T - r).$$
To perform an $\alpha$-level test of $H_0: \mu_1 = \cdots = \mu_r$ vs. $H_a$: some $\mu_i \neq \mu_j$ for $i \neq j$,
- Accept if $F^* \leq F(1 - \alpha; r - 1, n_T - r)$ or p-value $\geq \alpha$.
- Reject if $F^* > F(1 - \alpha; r - 1, n_T - r)$ or p-value $< \alpha$.
- p-value $= P\{F(r - 1, n_T - r) \geq F^*\}$.
Example: Kenton Foods
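The test statistic can be assembled from the sums of squares in a few lines. The following is a sketch in plain Python (not the SAS output the slides refer to), applied to the MS/PhD ages from the cell means example because the Kenton data lines are not available here. The function name `one_way_anova` and the dictionary keys are hypothetical.

```python
def one_way_anova(groups):
    """Return one-way ANOVA quantities for a list of groups of responses."""
    r = len(groups)
    n_T = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_T
    means = [sum(g) / len(g) for g in groups]

    # Treatment and error sums of squares.
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    # Degrees of freedom, mean squares, and the F statistic.
    df_tr, df_e = r - 1, n_T - r
    mstr, mse = sstr / df_tr, sse / df_e
    return {"SSTR": sstr, "SSE": sse, "df_TR": df_tr, "df_E": df_e,
            "MSTR": mstr, "MSE": mse, "F": mstr / mse}


ages = [
    [28, 24, 24, 22, 26, 23],
    [29, 23, 26, 25, 22, 23, 38, 33, 30, 27],
]
table = one_way_anova(ages)
for name, value in table.items():
    print(f"{name:>5}: {value:.4f}")
```

The standard library has no F distribution, so the p-value is not computed here. If SciPy is installed, `scipy.stats.f.sf(table["F"], table["df_TR"], table["df_E"])` evaluates $P\{F(r - 1, n_T - r) \geq F^*\}$.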

Comments
- If $r = 2$ then $F^* = (t^*)^2$, where $t^*$ is the t-statistic from the 2-sample pooled-variance t-test.
- The F-test may be obtained from the general nested linear hypotheses approach (big model / little model). Here the full model is $Y_{ij} = \mu_i + \epsilon_{ij}$ and the reduced model is $Y_{ij} = \mu + \epsilon_{ij}$:
$$F^* = \frac{\left[ \dfrac{SSE(R) - SSE(F)}{dfE_R - dfE_F} \right]}{\dfrac{SSE(F)}{dfE_F}} = \frac{MSTR}{MSE}.$$
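The identity $F^* = (t^*)^2$ for $r = 2$ can be checked on the MS/PhD ages from the cell means example. This is a minimal sketch in plain Python; the variable names are illustrative, and the pooled variance is computed as $SSE/(n_1 + n_2 - 2)$, which coincides with MSE when $r = 2$.

```python
import math

ms = [28, 24, 24, 22, 26, 23]
phd = [29, 23, 26, 25, 22, 23, 38, 33, 30, 27]
n1, n2 = len(ms), len(phd)
mean1, mean2 = sum(ms) / n1, sum(phd) / n2

# Pooled variance from the two-sample t-test; identical to MSE when r = 2.
sse = sum((y - mean1) ** 2 for y in ms) + sum((y - mean2) ** 2 for y in phd)
s2_pooled = sse / (n1 + n2 - 2)
t_star = (mean1 - mean2) / math.sqrt(s2_pooled * (1 / n1 + 1 / n2))

# One-way ANOVA F statistic with r - 1 = 1 numerator degree of freedom.
grand_mean = (sum(ms) + sum(phd)) / (n1 + n2)
mstr = (n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2) / (2 - 1)
f_star = mstr / s2_pooled

print(t_star ** 2, f_star)  # the two values agree up to rounding error
```

The sign of $t^*$ carries the direction of the difference, which the F statistic discards; that is the only information lost in moving from the t-test to the F-test when $r = 2$.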

16.7 Alternative formulations
- SAS will fit the cell means model (discussed so far) with a noint option in the model statement; however, the F-test will not be correct.
- Your textbook discusses an alternative parameterization that is not easy to get out of the SAS procedures we will use.
- By default, SAS fits the model
$$Y_{ij} = \mu + \alpha_i + \epsilon_{ij}, \qquad \text{where } \alpha_r = 0.$$
- $E\{Y_{rj}\} = \mu$; $\mu$ is the cell mean for the $r$th level.
- For $i < r$, $E\{Y_{ij}\} = \mu + \alpha_i$; $\alpha_i$ is level $i$'s offset to group $r$'s mean $\mu$.
- Note that SAS's default corresponds to a regression model where categorical predictors are modeled using the usual zero-one dummy variables.
- In class, let's find the design $X$ for SAS's model for $r = 3$ and $n_1 = n_2 = n_3 = 2$.
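One way to write down the design matrix for the baseline parameterization is sketched below in plain Python. The function name `baseline_design` is hypothetical, and the column order (intercept first, then one zero-one indicator for each of levels $1, \dots, r - 1$) is an assumption; the slides do not fix a column order.

```python
def baseline_design(group_sizes):
    """Design matrix with an intercept and indicators for levels 1..r-1.

    The last level is the baseline, so its rows have all indicators equal to 0.
    """
    r = len(group_sizes)
    rows = []
    for level, n_i in enumerate(group_sizes, start=1):
        indicators = [1 if level == k else 0 for k in range(1, r)]
        rows.extend([[1] + indicators for _ in range(n_i)])
    return rows


X = baseline_design([2, 2, 2])  # r = 3 levels, two observations each
for row in X:
    print(row)
```

With this coding, $\beta = (\mu, \alpha_1, \alpha_2)'$, and the rows for level 3 pick out $\mu$ alone, matching $\alpha_3 = 0$.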

SAS's baseline & offset model
- Even though SAS parameterizes the model differently, with the $r$th level as baseline, the ANOVA table and F-test are the same as for the cell means model.
- Also, $\hat{\mu} = \bar{Y}_{r\cdot}$ and $\hat{\alpha}_i = \bar{Y}_{i\cdot} - \bar{Y}_{r\cdot}$ are the OLS and ML estimators. These are reported in SAS. Use, e.g., model sales=design / solution;
- The cell means $\hat{\mu}_i$ are obtained in SAS by adding lsmeans to glm or glimmix.
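The relationship between the two parameterizations can be illustrated on the MS/PhD ages from the cell means example. This is a minimal sketch in plain Python, not SAS output; it assumes the second listed level (PhD) plays the role of the baseline level $r$, and the variable names are illustrative.

```python
groups = [
    [28, 24, 24, 22, 26, 23],                  # level 1 (assumed MS)
    [29, 23, 26, 25, 22, 23, 38, 33, 30, 27],  # level 2 (assumed PhD), baseline
]
means = [sum(g) / len(g) for g in groups]

# Baseline and offset estimates: mu_hat is the last group's mean,
# alpha_hat_i is group i's mean minus the baseline mean (so alpha_hat_r = 0).
mu_hat = means[-1]
alpha_hat = [m - mu_hat for m in means]

# Adding the offsets back to the baseline recovers the cell means.
cell_means = [mu_hat + a for a in alpha_hat]

print(mu_hat, alpha_hat)
print(cell_means)
```

Both parameterizations give the same fitted values, which is why the ANOVA table and F-test do not depend on which one is used.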


More information

6. Multiple Linear Regression

6. Multiple Linear Regression 6. Multiple Linear Regression SLR: 1 predictor X, MLR: more than 1 predictor Example data set: Y i = #points scored by UF football team in game i X i1 = #games won by opponent in their last 10 games X

More information

3. Diagnostics and Remedial Measures

3. Diagnostics and Remedial Measures 3. Diagnostics and Remedial Measures So far, we took data (X i, Y i ) and we assumed where ɛ i iid N(0, σ 2 ), Y i = β 0 + β 1 X i + ɛ i i = 1, 2,..., n, β 0, β 1 and σ 2 are unknown parameters, X i s

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth

More information

The Multiple Regression Model

The Multiple Regression Model Multiple Regression The Multiple Regression Model Idea: Examine the linear relationship between 1 dependent (Y) & or more independent variables (X i ) Multiple Regression Model with k Independent Variables:

More information

where x and ȳ are the sample means of x 1,, x n

where x and ȳ are the sample means of x 1,, x n y y Animal Studies of Side Effects Simple Linear Regression Basic Ideas In simple linear regression there is an approximately linear relation between two variables say y = pressure in the pancreas x =

More information

STAT763: Applied Regression Analysis. Multiple linear regression. 4.4 Hypothesis testing

STAT763: Applied Regression Analysis. Multiple linear regression. 4.4 Hypothesis testing STAT763: Applied Regression Analysis Multiple linear regression 4.4 Hypothesis testing Chunsheng Ma E-mail: cma@math.wichita.edu 4.4.1 Significance of regression Null hypothesis (Test whether all β j =

More information

Linear Models and Estimation by Least Squares

Linear Models and Estimation by Least Squares Linear Models and Estimation by Least Squares Jin-Lung Lin 1 Introduction Causal relation investigation lies in the heart of economics. Effect (Dependent variable) cause (Independent variable) Example:

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

Matrices and vectors A matrix is a rectangular array of numbers. Here s an example: A =

Matrices and vectors A matrix is a rectangular array of numbers. Here s an example: A = Matrices and vectors A matrix is a rectangular array of numbers Here s an example: 23 14 17 A = 225 0 2 This matrix has dimensions 2 3 The number of rows is first, then the number of columns We can write

More information

Regression With a Categorical Independent Variable

Regression With a Categorical Independent Variable Regression With a Independent Variable Lecture 10 November 5, 2008 ERSH 8320 Lecture #10-11/5/2008 Slide 1 of 54 Today s Lecture Today s Lecture Chapter 11: Regression with a single categorical independent

More information

Outline Topic 21 - Two Factor ANOVA

Outline Topic 21 - Two Factor ANOVA Outline Topic 21 - Two Factor ANOVA Data Model Parameter Estimates - Fall 2013 Equal Sample Size One replicate per cell Unequal Sample size Topic 21 2 Overview Now have two factors (A and B) Suppose each

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information

Lecture 4 Multiple linear regression

Lecture 4 Multiple linear regression Lecture 4 Multiple linear regression BIOST 515 January 15, 2004 Outline 1 Motivation for the multiple regression model Multiple regression in matrix notation Least squares estimation of model parameters

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject

More information

df=degrees of freedom = n - 1

df=degrees of freedom = n - 1 One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:

More information

STA 4210 Practise set 2a

STA 4210 Practise set 2a STA 410 Practise set a For all significance tests, use = 0.05 significance level. S.1. A multiple linear regression model is fit, relating household weekly food expenditures (Y, in $100s) to weekly income

More information

Chapter 6 Multiple Regression

Chapter 6 Multiple Regression STAT 525 FALL 2018 Chapter 6 Multiple Regression Professor Min Zhang The Data and Model Still have single response variable Y Now have multiple explanatory variables Examples: Blood Pressure vs Age, Weight,

More information

Inference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58

Inference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Inference ME104: Linear Regression Analysis Kenneth Benoit August 15, 2012 August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Stata output resvisited. reg votes1st spend_total incumb minister

More information

Lecture 6: Linear Regression

Lecture 6: Linear Regression Lecture 6: Linear Regression Reading: Sections 3.1-3 STATS 202: Data mining and analysis Jonathan Taylor, 10/5 Slide credits: Sergio Bacallado 1 / 30 Simple linear regression Model: y i = β 0 + β 1 x i

More information

Theorem A: Expectations of Sums of Squares Under the two-way ANOVA model, E(X i X) 2 = (µ i µ) 2 + n 1 n σ2

Theorem A: Expectations of Sums of Squares Under the two-way ANOVA model, E(X i X) 2 = (µ i µ) 2 + n 1 n σ2 identity Y ijk Ȳ = (Y ijk Ȳij ) + (Ȳi Ȳ ) + (Ȳ j Ȳ ) + (Ȳij Ȳi Ȳ j + Ȳ ) Theorem A: Expectations of Sums of Squares Under the two-way ANOVA model, (1) E(MSE) = E(SSE/[IJ(K 1)]) = (2) E(MSA) = E(SSA/(I

More information

Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood

Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood Regression Estimation - Least Squares and Maximum Likelihood Dr. Frank Wood Least Squares Max(min)imization Function to minimize w.r.t. β 0, β 1 Q = n (Y i (β 0 + β 1 X i )) 2 i=1 Minimize this by maximizing

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 4 4- Basic Business Statistics th Edition Chapter 4 Introduction to Multiple Regression Basic Business Statistics, e 9 Prentice-Hall, Inc. Chap 4- Learning Objectives In this chapter, you learn:

More information