STAT5044: Regression and Anova

Inyoung Kim

Outline
1. How to check assumptions

Assumptions
- Linearity: scatter plot, residual plot
- Randomness: runs test; Durbin-Watson test when the data can be arranged in time order
- Constant variance: scatter plot, residual plot (absolute-residual plot); Brown-Forsythe test, Breusch-Pagan test
- Normality of errors: box plot, histogram, normal probability plot; Shapiro-Wilk test, Kolmogorov-Smirnov test, Anderson-Darling test
Remark: the normal probability plot provides no information if the assumption of linearity and/or constant variance is violated.

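Influential point
- A combination of a large absolute residual and high leverage ($h_{ii}$)
- Leverage: a diagonal value of the hat matrix $H$
$$H = \begin{pmatrix} h_{11} & h_{12} & \cdots & h_{1n} \\ h_{21} & h_{22} & \cdots & h_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ h_{n1} & h_{n2} & \cdots & h_{nn} \end{pmatrix}$$
- High leverage means large $h_{ii}$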

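Residual
Three types:
- Ordinary: $r_i = y_i - \hat{y}_i$, where $E(r_i) = 0$ and $\mathrm{var}(r_i) = (1 - h_{ii})\sigma^2$
- Standardized: $\dfrac{r_i}{\hat{\sigma}\sqrt{1 - h_{ii}}}$
- Studentized (or jackknife): $\dfrac{r_i}{\hat{\sigma}_{(i)}\sqrt{1 - h_{ii}}} \sim t_{n-p-1}$
where
$$\hat{\sigma}^2_{(i)} = \frac{\sum_{j \neq i} r_{j(i)}^2}{n - p - 1},$$
$p$ is the number of regression parameters ($p = 2$ in simple linear regression), $h_{ii}$ is the leverage, i.e. the diagonal value of the hat matrix, and
$$r_{j(i)} = y_j - \hat{y}_{j(i)} = y_j - (\hat{\beta}_{0(i)} + \hat{\beta}_{1(i)} x_j).$$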

Properties of residuals
- Sum to zero: $\sum_i r_i = 0$
- Are not independent

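Residual
Jackknife:
$$r_{i(i)} = y_i - \hat{y}_{i(i)} \sim N\!\left(0, \frac{\sigma^2}{1 - h_{ii}}\right)$$
where the subindex $(i)$ indicates that the estimate is computed without point $i$: the residual for $y_i$ is computed using the regression fitted without $y_i$, and then scaled.
Studentized residual:
$$\frac{r_{i(i)}}{\sqrt{\widehat{\mathrm{var}}(r_{i(i)})}}, \qquad r_{i(i)} = y_i - \hat{y}_{i(i)} = y_i - [\hat{\beta}_{0(i)} + \hat{\beta}_{1(i)} x_i]$$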

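Studentized residual
$$\frac{r_{i(i)}}{\sqrt{\widehat{\mathrm{var}}(r_{i(i)})}} = \frac{r_i}{\hat{\sigma}_{(i)}\sqrt{1 - h_{ii}}} \quad \text{by Facts 1 and 2}$$
Fact 1: $r_{i(i)} = \dfrac{r_i}{1 - h_{ii}}$
Fact 2: $\sum_{j \neq i} r_{j(i)}^2 = (n - p)\hat{\sigma}^2 - \dfrac{r_i^2}{1 - h_{ii}}$, so that
$$\hat{\sigma}^2_{(i)} = \frac{(n - p)\hat{\sigma}^2 - \dfrac{r_i^2}{1 - h_{ii}}}{n - p - 1}$$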

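Residual
Using Fact 1, $r_{i(i)} = \dfrac{r_i}{1 - h_{ii}}$, we have
$$\mathrm{Var}(r_{i(i)}) = \frac{\mathrm{Var}(r_i)}{(1 - h_{ii})^2} = \frac{\sigma^2}{1 - h_{ii}}$$
But $\sigma^2$ is unknown, so we use $\hat{\sigma}^2_{(i)}$.
Recall: $r_{i(i)} = Y_i - \hat{Y}_{i(i)}$ and $r_{i(i)} = \dfrac{r_i}{1 - h_{ii}}$.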

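Residual
Studentized residual:
$$\frac{r_i / (1 - h_{ii})}{\sqrt{\hat{\sigma}^2_{(i)} / (1 - h_{ii})}} = \frac{r_i}{\sqrt{\hat{\sigma}^2_{(i)}(1 - h_{ii})}}$$
where
$$\hat{\sigma}^2_{(i)} = \frac{\sum_{j \neq i} r_{j(i)}^2}{n - p - 1}, \qquad \sum_{j \neq i} r_{j(i)}^2 = (n - p)\hat{\sigma}^2 - \frac{r_i^2}{1 - h_{ii}}$$
NOTE: a residual is flagged as large if $|r_{j(i)}| > 3$.
An expression for the distribution of the standardized residuals was obtained (Weisberg, 1985).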

Studentized residual
$$\frac{r_{i(i)}}{\sqrt{\widehat{\mathrm{var}}(r_{i(i)})}} = \frac{r_i}{\hat{\sigma}_{(i)}\sqrt{1 - h_{ii}}} \sim t_{n-p-1}$$
You don't need to know how to prove this in our class (it is beyond the scope of the class).
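
The residual formulas above can be checked numerically. The following is a minimal sketch in base R, assuming the small data set from the runs-test example later in these slides and assuming that p counts the regression parameters (p = 2 here); the variable names are illustrative only.

```r
# Data borrowed from the runs-test example in these slides
d <- data.frame(x = 0:9,
                y = c(98, 135, 162, 178, 221, 232, 283, 300, 374, 395))
fit <- lm(y ~ x, data = d)

n <- nrow(d)
p <- length(coef(fit))            # number of regression parameters (assumed convention)
r <- residuals(fit)               # ordinary residuals r_i
h <- hatvalues(fit)               # leverages h_ii
sigma2_hat <- sum(r^2) / (n - p)  # usual estimate of sigma^2

# Fact 2: leave-one-out variance estimate, no refitting needed
sigma2_del <- ((n - p) * sigma2_hat - r^2 / (1 - h)) / (n - p - 1)

standardized <- r / sqrt(sigma2_hat * (1 - h))
studentized  <- r / sqrt(sigma2_del * (1 - h))

# Fact 1 checked by brute force: refit without point i, predict y_i
deleted_resid <- sapply(seq_len(n), function(i) {
  fit_i <- lm(y ~ x, data = d[-i, ])
  d$y[i] - predict(fit_i, newdata = d[i, ])
})

print(all.equal(unname(deleted_resid), unname(r / (1 - h))))
print(all.equal(standardized, rstandard(fit)))
print(all.equal(studentized, rstudent(fit)))
```

In practice `rstandard()` and `rstudent()` from the stats package would be used directly; the manual versions are only there to connect the code to Facts 1 and 2.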

Comparison with standardized residual
Standardized residual:
$$\frac{r_i - 0}{\sqrt{\mathrm{var}(r_i)}} = \frac{r_i - 0}{\sqrt{\sigma^2(1 - h_{ii})}} \approx \frac{r_i}{\sqrt{(1 - h_{ii})\hat{\sigma}^2}}$$
- If there are outliers with large absolute residuals, then $\hat{\sigma}^2$ may not be a good measure
- Residuals are not independent and have different variances
- The distribution of the standardized residual is not a t distribution
- People usually ignore these problems

Residual plots in R
> lmfit <- lm(y ~ x)
> plot(fitted(lmfit), residuals(lmfit), xlab = "Fitted", ylab = "Residuals")
> abline(h = 0)
> plot(fitted(lmfit), abs(residuals(lmfit)), xlab = "Fitted", ylab = "|Residuals|")

Residual plots
[Figure: two panels of residual plots, Residuals against Fitted]

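Leverage
$$H = X(X^t X)^{-1} X^t$$
Let $\mathbf{x}_i^t = (1, x_i)$, i.e. $\mathbf{x}_i = \begin{pmatrix} 1 \\ x_i \end{pmatrix}$, $X = \begin{pmatrix} \mathbf{x}_1^t \\ \vdots \\ \mathbf{x}_n^t \end{pmatrix}$, and $A = (X^t X)^{-1}$. Then
$$H_{n \times n} = X A X^t = \begin{pmatrix} \mathbf{x}_1^t \\ \vdots \\ \mathbf{x}_n^t \end{pmatrix} A \begin{pmatrix} \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_n \end{pmatrix}$$
The $(i,j)$th element of $H$ is $\mathbf{x}_i^t A \mathbf{x}_j$.
NOTE:
$$A = (X^t X)^{-1} = \begin{pmatrix} \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} & -\dfrac{\bar{x}}{S_{xx}} \\ -\dfrac{\bar{x}}{S_{xx}} & \dfrac{1}{S_{xx}} \end{pmatrix}$$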

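Leverage
The $(i,j)$th element of $H$ is $(1 \;\; x_i)(X^t X)^{-1}\begin{pmatrix} 1 \\ x_j \end{pmatrix}$, and $h_{ii}$ is the leverage:
$$h_{ii} = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}} \quad \text{(check)}$$
- High-leverage point: $h_{ii}$ is large, that is, $(x_i - \bar{x})^2$ is large
- $\dfrac{1}{n} \le h_{ii} \le 1$
- Idea: if the design is regular and $n$ is large ($n \to \infty$),
$$h_{ii} = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \approx O\!\left(\frac{1}{n}\right) \to 0$$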

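Why is leverage in this range?
Because $H$ is symmetric and idempotent, $\sum_j h_{ji}^2 = h_{ii}$, so
$$0 \le \sum_{j \neq i} h_{ji}^2 = h_{ii} - h_{ii}^2$$
Hence $0 \le h_{ii}(1 - h_{ii})$. Since $h_{ii} > 0$, it follows that $1 - h_{ii} \ge 0$, i.e. $h_{ii} \le 1$.
We also know that $h_{ii} \ge \dfrac{1}{n}$ because
$$h_{ii} = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{S_{xx}} \ge \frac{1}{n}$$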
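
As a quick numerical check of the leverage results above, here is a small base R sketch. It assumes the simple linear regression setting of these slides and reuses the x and y values from the runs-test example; the object names are illustrative.

```r
x <- 0:9
y <- c(98, 135, 162, 178, 221, 232, 283, 300, 374, 395)
fit <- lm(y ~ x)
n <- length(x)

# Closed-form leverage for simple linear regression
Sxx <- sum((x - mean(x))^2)
h_formula <- 1 / n + (x - mean(x))^2 / Sxx

# Leverage as the diagonal of the hat matrix H = X (X'X)^{-1} X'
X <- cbind(1, x)
H <- X %*% solve(t(X) %*% X) %*% t(X)
h_matrix <- diag(H)

print(all.equal(h_formula, h_matrix))
print(all.equal(h_formula, unname(hatvalues(fit))))

# H is idempotent, so the sum of squares of column i equals h_ii
print(all.equal(colSums(H^2), h_matrix))

# Range check: 1/n <= h_ii <= 1, and the leverages sum to the number of parameters
print(range(h_formula))
print(sum(h_formula))
```

The largest leverages belong to the x values farthest from the mean, here the two end points of the design.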

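Cook's distance
Measure influential points using $\hat{y}_i - \hat{y}_{i(j)}$, where $j$ is a fixed point.
$$\hat{y}_{(i)} = \begin{pmatrix} \hat{y}_{1(i)} \\ \hat{y}_{2(i)} \\ \vdots \\ \hat{y}_{n(i)} \end{pmatrix}$$
where the subindex $(i)$ indicates that the fitted values are obtained using all observations except the $i$th observation.
The $i$th Cook's distance:
$$D_i = \frac{\{\hat{y} - \hat{y}_{(i)}\}^t \{\hat{y} - \hat{y}_{(i)}\}}{p\,\hat{\sigma}^2}$$
where $\hat{y} = X\hat{\beta}$ and $\hat{y}_{(i)} = X\hat{\beta}_{(i)}$.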

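Cook's distance
$$D_i = \frac{\{\hat{\beta} - \hat{\beta}_{(i)}\}^t X^t X \{\hat{\beta} - \hat{\beta}_{(i)}\}}{p\,\hat{\sigma}^2}$$
$D_i$ is compared with the $F_{p,\,n-p}$ distribution. Identify the points which have a relatively large Cook's distance.
By Fact 3,
$$\hat{\beta} - \hat{\beta}_{(i)} = \frac{r_i}{1 - h_{ii}} (X^t X)^{-1} \mathbf{x}_i,$$
so that
$$D_i = \left(\frac{r_i}{1 - h_{ii}}\right)^2 \frac{\mathbf{x}_i^t (X^t X)^{-1} (X^t X) (X^t X)^{-1} \mathbf{x}_i}{p\,\hat{\sigma}^2}$$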

Cook's distance
$D_i$ depends on two factors:
$$D_i = \left(\frac{r_i}{1 - h_{ii}}\right)^2 \frac{\mathbf{x}_i^t (X^t X)^{-1} (X^t X) (X^t X)^{-1} \mathbf{x}_i}{p\,\hat{\sigma}^2} = \frac{r_i^2\, h_{ii}}{(1 - h_{ii})^2\, p\,\hat{\sigma}^2},$$
since $\mathbf{x}_i^t (X^t X)^{-1} \mathbf{x}_i = h_{ii}$.
- The size of the residual $r_i$
- The leverage value $h_{ii}$
The larger either $|r_i|$ or $h_{ii}$ is, the larger $D_i$ is.
The $i$th case can be influential:
(1) by having a large residual and only a moderate leverage value $h_{ii}$, or
(2) by having a large leverage value $h_{ii}$ with only a moderately sized residual, or
(3) by having both a large residual and a large leverage value.
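
The two expressions for Cook's distance can be compared numerically. The sketch below, in base R, assumes the example data used elsewhere in these slides and takes p to be the number of regression parameters; it computes D_i from the residual-and-leverage form, from the definition with an explicit refit for each deleted point, and from `cooks.distance()`.

```r
d <- data.frame(x = 0:9,
                y = c(98, 135, 162, 178, 221, 232, 283, 300, 374, 395))
fit <- lm(y ~ x, data = d)

n <- nrow(d)
p <- length(coef(fit))             # number of regression parameters (assumed convention)
r <- residuals(fit)
h <- hatvalues(fit)
sigma2_hat <- sum(r^2) / (n - p)

# Closed form: D_i = r_i^2 h_ii / ((1 - h_ii)^2 p sigma^2)
D_formula <- r^2 * h / ((1 - h)^2 * p * sigma2_hat)

# Definition: squared distance between fitted vectors with and without point i
D_refit <- sapply(seq_len(n), function(i) {
  fit_i <- lm(y ~ x, data = d[-i, ])
  y_hat_i <- predict(fit_i, newdata = d)   # fitted values at all n points
  sum((fitted(fit) - y_hat_i)^2) / (p * sigma2_hat)
})

print(all.equal(unname(D_formula), D_refit))
print(all.equal(D_formula, cooks.distance(fit)))

# Index of the most influential case according to D_i
print(which.max(D_formula))
```

The closed form avoids n separate refits, which is why it is the version used in practice.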

Cook's distance in R
library(stats)    # for cooks.distance
library(faraway)  # for halfnorm
lmfit <- lm(y ~ x)
cook <- cooks.distance(lmfit)
par(mfcol = c(1, 2))
halfnorm(cook, 3, ylab = "Cook's dist")
boxplot(cook)

Cook's distance in R
[Figure: half-normal plot of Cook's distance (Cook's dist against Half-normal quantiles) and box plot of Cook's distance]

Randomness: runs test and Durbin-Watson test
Runs test
- Order the residuals
- Count the number of runs ($r$) and the numbers of positive and negative residuals, say $n_1$ and $n_2$
- If $n_1 \le 20$ and $n_2 \le 20$, reject the hypothesis of randomness if $r < r_L$ or $r > r_U$, where $r_L$ and $r_U$ are the lower and upper critical values given in Table A30 (handout)
- For large sample sizes, reject the hypothesis of randomness if $|z| > z_{\alpha/2}$, where
$$z = \frac{r - \mu \pm 0.5}{\sigma}$$
(the 0.5 is a continuity correction applied toward $\mu$), with
$$\mu = 1 + \frac{2 n_1 n_2}{n_1 + n_2}, \qquad \sigma^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$$

Example of randomness: runs test
> x <- c(0:9)
> y <- c(98, 135, 162, 178, 221, 232, 283, 300, 374, 395)
> lmfit <- lm(y ~ x)
> residuals(lmfit)

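Example of randomness: runs test
How to do the runs test?
- Signs of the residuals: (+ + +) (− − − − −) (+ +)
- Number of runs = 3
- Number of positive residuals = 5
- Number of negative residuals = 5
- Using Table A30, $r_L = 2$ and $r_U = 10$
- If $r < r_L$ or $r > r_U$, reject the hypothesis of randomness
Here $r = 3$ lies between $r_L$ and $r_U$, so the hypothesis of randomness is not rejected.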
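
The run counting in this example can be reproduced in a few lines of base R. This is a sketch rather than the course's own code; the large-sample z value at the end uses the continuity correction toward the mean, which is an assumption about the sign convention, and with only ten observations the table-based rule above is the relevant one.

```r
x <- 0:9
y <- c(98, 135, 162, 178, 221, 232, 283, 300, 374, 395)
fit <- lm(y ~ x)

# Signs of the residuals in observation order
signs <- unname(sign(residuals(fit)))

# A run is a maximal block of equal signs; rle() finds those blocks
runs <- length(rle(signs)$lengths)
n1 <- sum(signs > 0)   # number of positive residuals
n2 <- sum(signs < 0)   # number of negative residuals
print(c(runs = runs, n1 = n1, n2 = n2))

# Large-sample approximation (shown for illustration only at n = 10)
mu <- 1 + 2 * n1 * n2 / (n1 + n2)
sigma2 <- 2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) /
  ((n1 + n2)^2 * (n1 + n2 - 1))
# Continuity correction moves r half a unit toward mu (assumed convention)
z <- (runs - mu + ifelse(runs < mu, 0.5, -0.5)) / sqrt(sigma2)
print(z)
```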

Runs test in R
library(lawstat)
lmfit <- lm(y ~ x)
> runs.test(residuals(lmfit))
Runs Test - Two sided
data: residuals(lmfit)
Standardized Runs Statistic = , p-value =

Randomness: Durbin-Watson test
Durbin-Watson test: tests whether the error terms $\varepsilon_i$ are independent ($H_0: \rho = 0$).
The test statistic is
$$D = \frac{\sum_{t=2}^{n} (r_t - r_{t-1})^2}{\sum_{t=1}^{n} r_t^2}, \qquad r_t = Y_t - \hat{Y}_t$$
- If $D > d_U$, conclude $H_0$
- If $D < d_L$, conclude $H_a$
- If $d_L < D < d_U$, the test is inconclusive
$d_L$ and $d_U$ are selected based on the level of the test, the number of X variables ($p - 1$), and the sample size ($n$).
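
The Durbin-Watson statistic is simple enough to compute directly from the residuals. The following base R sketch assumes the example data from the runs-test slides and that the observations are already in time order; it also shows the usual approximate link between D and the lag-1 autocorrelation of the residuals.

```r
x <- 0:9
y <- c(98, 135, 162, 178, 221, 232, 283, 300, 374, 395)
fit <- lm(y ~ x)

r <- unname(residuals(fit))   # residuals in time order
n <- length(r)

# D = sum of squared successive differences over sum of squared residuals
D <- sum(diff(r)^2) / sum(r^2)
print(D)

# Lag-1 autocorrelation of the residuals; D is roughly 2 * (1 - rho1)
rho1 <- sum(r[-1] * r[-n]) / sum(r^2)
print(2 * (1 - rho1))

# The exact relation has an end-point correction term
print(all.equal(D, 2 * (1 - rho1) - (r[1]^2 + r[n]^2) / sum(r^2)))
```

Values of D near 2 are consistent with uncorrelated errors, while values well below 2 point toward positive autocorrelation.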

DW test in R
> library(lmtest)
> lmfit <- lm(y ~ x)
> dwtest(lmfit)
Durbin-Watson test
data: lmfit
DW = 1.875, p-value =
alternative hypothesis: true autocorrelation is greater than 0

Constant variance: Brown-Forsythe and Breusch-Pagan tests
Brown-Forsythe (Levene test)
- $r_{i1}, r_{i2}$: the $i$th residual for group 1 and group 2
- $n_1, n_2$: the sample size of each group
- $\tilde{r}_1, \tilde{r}_2$: the median of the residuals in each group
- $d_{i1} = |r_{i1} - \tilde{r}_1|$, $d_{i2} = |r_{i2} - \tilde{r}_2|$
- The two-sample t test statistic is
$$t_{BF} = \frac{\bar{d}_1 - \bar{d}_2}{s\sqrt{1/n_1 + 1/n_2}}, \qquad s^2 = \frac{\sum (d_{i1} - \bar{d}_1)^2 + \sum (d_{i2} - \bar{d}_2)^2}{n - 2}$$
Breusch-Pagan
- Tests $H_0: \gamma_1 = 0$ in the model $\log_e \sigma_i^2 = \gamma_0 + \gamma_1 X_i$
- The test statistic is
$$X^2_{BP} = \frac{SSR^*/2}{(SSE/n)^2}$$
where $SSR^*$ is the regression sum of squares when regressing $r^2$ on $X$, and $SSE$ is the error sum of squares when regressing $Y$ on $X$.
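
Both statistics above can be computed by hand in base R. The sketch below assumes the example data from the runs-test slides and splits the observations into a low-X and a high-X half; that split and all object names are illustrative choices. The chi-square reference distribution with 1 degree of freedom for the Breusch-Pagan statistic is the standard large-sample result for one predictor and is not stated on the slide.

```r
x <- 0:9
y <- c(98, 135, 162, 178, 221, 232, 283, 300, 374, 395)
fit <- lm(y ~ x)
r <- unname(residuals(fit))
n <- length(y)

## Brown-Forsythe: absolute deviations from the group medians
low <- x <= median(x)            # group 1: small X, group 2: large X
d1 <- abs(r[low]  - median(r[low]))
d2 <- abs(r[!low] - median(r[!low]))
n1 <- length(d1)
n2 <- length(d2)
s2 <- (sum((d1 - mean(d1))^2) + sum((d2 - mean(d2))^2)) / (n - 2)
t_bf <- (mean(d1) - mean(d2)) / sqrt(s2 * (1 / n1 + 1 / n2))

# The pooled-variance t test gives the same statistic;
# t.test() defaults to the Welch version, so var.equal = TRUE is needed
t_pooled <- unname(t.test(d1, d2, var.equal = TRUE)$statistic)
print(all.equal(t_bf, t_pooled))

## Breusch-Pagan: regress squared residuals on X
r2 <- r^2
aux <- lm(r2 ~ x)
SSR_star <- sum((fitted(aux) - mean(r2))^2)
SSE <- sum(r2)
X2_bp <- (SSR_star / 2) / (SSE / n)^2
p_bp <- pchisq(X2_bp, df = 1, lower.tail = FALSE)  # assumed reference: chi-square(1)
print(c(X2_bp = X2_bp, p_value = p_bp))
```

Note that `bptest()` in the lmtest package reports the studentized variant by default, so its value generally differs from the statistic computed here.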

BF tests in R
# the best way to split into two groups is to put low values of X in one group and large values of X in the other
g1 <- c( , , , , )
g2 <- c( , , , , )
d1 <- abs(g1 - median(g1))
d2 <- abs(g2 - median(g2))
t.test(d1, d2)
Welch Two Sample t-test
data: d1 and d2
t = , df = 7.11, p-value =
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
sample estimates:
mean of x mean of y

BF tests in R
library(lawstat)
lmfit <- lm(y ~ x)
> levene.test(residuals(lmfit), group = c(rep(1, 5), rep(0, 5)))
Classical Levene's test based on the absolute deviations from the mean
data: residuals(lmfit)
Test Statistic = 0.0708, p-value =

BP tests in R
library(lmtest)
lmfit <- lm(y ~ x)
> bptest(lmfit)
studentized Breusch-Pagan test
data: lmfit
BP = 3.0628, df = 1, p-value =

Test of normality
Shapiro-Wilk test: $H_0$: a sample $y_1, \ldots, y_n$ comes from a normally distributed population.
The test statistic is
$$W = \frac{\left(\sum_{i=1}^{n} a_i y_{(i)}\right)^2}{\sum_{i=1}^{n} (y_{(i)} - \bar{y})^2}$$
where $y_{(i)}$ is the $i$th order statistic and the constants $a_i$ are given by
$$(a_1, \ldots, a_n) = \frac{m^t V^{-1}}{(m^t V^{-1} V^{-1} m)^{1/2}}, \qquad m = (m_1, \ldots, m_n)^t,$$
where the $m_i$ are the expected values of the order statistics of iid random variables from the standard normal distribution and $V$ is the covariance matrix of those order statistics.
If $W$ is too small, reject the null hypothesis.

Shapiro-Wilk in R
library(stats)
lmfit <- lm(y ~ x)
> shapiro.test(residuals(lmfit))
Shapiro-Wilk normality test
data: residuals(lmfit)
W = 0.9073, p-value =

Test of normality
Kolmogorov-Smirnov: the empirical distribution function $F_n$ for $n$ iid observations $Y_i$ is defined as
$$F_n(y) = \frac{1}{n} \sum_{i=1}^{n} I(Y_i \le y)$$
where $I(Y_i \le y)$ is the indicator function. The Kolmogorov-Smirnov statistic is
$$D_n = \sup_y |F_n(y) - F(y)|$$
If $D_n$ is big, reject the null hypothesis.
Correlation test: the idea is to compute the correlation between the expected normal quantiles and the observed order statistics.
Anderson-Darling test: a distance (empirical distribution) test, used with small sample sizes.
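
The supremum in the Kolmogorov-Smirnov statistic is attained at one of the jump points of the empirical distribution function, which makes it easy to compute by hand. The base R sketch below assumes the residuals from the example regression in these slides, scaled by the estimated error standard deviation, and compares the result with `ks.test()`. Because the scale is estimated from the same data, the p-value reported by `ks.test()` should be read as approximate only.

```r
x <- 0:9
y <- c(98, 135, 162, 178, 221, 232, 283, 300, 374, 395)
fit <- lm(y ~ x)

r <- unname(residuals(fit))
n <- length(r)
s <- sqrt(sum(r^2) / (n - 2))     # estimated error standard deviation
z <- sort(r / s)                  # ordered, scaled residuals

# F_n jumps from (i - 1)/n to i/n at the i-th ordered value,
# so the largest gap to F occurs just before or at a jump
F0 <- pnorm(z)
i <- seq_len(n)
D_plus  <- max(i / n - F0)
D_minus <- max(F0 - (i - 1) / n)
D_n <- max(D_plus, D_minus)
print(D_n)

# Built-in one-sample test against the standard normal
ks <- ks.test(r / s, "pnorm")
print(all.equal(D_n, unname(ks$statistic)))
```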

Anderson-Darling test in R
library(nortest)
> ad.test(residuals(lmfit))
Anderson-Darling normality test
data: residuals(lmfit)
A = 0.4495, p-value =

PP plot and QQ plot
Plots for comparing two probability distributions. There are two basic types, the probability-probability plot and the quantile-quantile plot.
- A plot of points whose coordinates are the cumulative probabilities $\{p_x(q), p_y(q)\}$ for different values of $q$ is a probability-probability plot.
- A plot of the points whose coordinates are the quantiles $\{q_x(p), q_y(p)\}$ for different values of $p$ is a quantile-quantile plot.
The latter is the more frequently used of the two types, and it is used to investigate the assumption that a set of data is from a normal distribution. For example, one plots the ordered sample values $y_{(1)}, \ldots, y_{(n)}$ against the quantiles of a standard normal distribution, $\Phi^{-1}[p_{(i)}]$, where
$$p_i = \frac{i - \frac{1}{2}}{n}, \qquad \Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}u^2}\, du$$
This is usually known as a normal probability plot.
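
The coordinates of a normal probability plot can be built directly from the definition above. This base R sketch assumes the residuals of the example regression from these slides; it also computes the correlation between the theoretical quantiles and the ordered residuals, which is the quantity behind the correlation test mentioned earlier.

```r
x <- 0:9
y <- c(98, 135, 162, 178, 221, 232, 283, 300, 374, 395)
fit <- lm(y ~ x)

r <- unname(residuals(fit))
n <- length(r)

# Plotting positions p_i = (i - 1/2) / n and the matching normal quantiles
p_i <- (seq_len(n) - 0.5) / n
theoretical <- qnorm(p_i)
observed <- sort(r)

# The same positions are available through ppoints() with a = 0.5
print(all.equal(p_i, ppoints(n, a = 0.5)))

# Points close to a straight line support normality; the correlation
# summarizes how straight the plot is
qq_cor <- cor(theoretical, observed)
print(qq_cor)

# To draw the plot: plot(theoretical, observed)
```

`qqnorm()` uses `ppoints(n)` with its default offset, which for n of 10 or fewer is 3/8 rather than 1/2, so its horizontal coordinates differ slightly from the ones computed here.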

Normal QQ plot in R
library(faraway)
qqnorm(residuals(lmfit), ylab = "Residuals")
qqline(residuals(lmfit))

Normal QQ plot in R
[Figure: Normal Q-Q Plot (Residuals against Theoretical Quantiles) and Histogram of residuals(lmfit)]

Lack of fit test
- Idea: if you have multiple observations of y at the same x values (replicates), you can use these to test for lack of fit
- Basis: if the fit is good, the fitted line should go through the mean of the y's at each x
- If the fit is bad, the fitted values should differ from these means

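Linear lack of fit test
- This test assumes variance homogeneity
- Goal: check the linearity of the conditional mean of Y given X
- Requirements: one has to have replicates in X
Data:
$$\begin{array}{cccc} x_1 & x_2 & \cdots & x_k \\ \hline y_{11} & y_{21} & \cdots & y_{k1} \\ y_{12} & y_{22} & \cdots & y_{k2} \\ \vdots & \vdots & & \vdots \\ y_{1 n_1} & y_{2 n_2} & \cdots & y_{k n_k} \end{array}$$
Some of the $n_1, n_2, \ldots, n_k$ have to be $> 1$.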

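Linear lack of fit test
Model: $y_{ij} = \beta_0 + \beta_1 x_i + \varepsilon_{ij}$, $i = 1, \ldots, k$, $j = 1, 2, \ldots, n_i$, where $\varepsilon_{ij} \sim [0, \sigma^2]$
Model: $y_{ij} = \beta_0 + \beta_1 x_i + \sigma\varepsilon_{ij}$, $i = 1, \ldots, k$, $j = 1, 2, \ldots, n_i$, where $\varepsilon_{ij} \sim [0, 1]$
These are the same model.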

Linear lack of fit test
Model: $y_{ij} = \beta_0 + \beta_1 x_i + \sigma\varepsilon_{ij}$, $i = 1, \ldots, k$, $j = 1, 2, \ldots, n_i$, where $\varepsilon_{ij} \sim [0, 1]$
How many observations in total? $n_1 + n_2 + \cdots + n_k = n$
Remark 1: independent, normally distributed errors with a constant variance.

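Linear lack of fit test
$$y = \begin{pmatrix} y_{11} \\ \vdots \\ y_{1 n_1} \\ y_{21} \\ \vdots \\ y_{2 n_2} \\ \vdots \\ y_{k1} \\ \vdots \\ y_{k n_k} \end{pmatrix} = \begin{pmatrix} 1_{n_1} & x_1 1_{n_1} \\ 1_{n_2} & x_2 1_{n_2} \\ \vdots & \vdots \\ 1_{n_k} & x_k 1_{n_k} \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} + \varepsilon$$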

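ANOVA table for lack of fit test
ANOVA:
- Model: $\sum (\hat{Y}_{ij} - \bar{Y})^2$
- Residual: $\sum (Y_{ij} - \hat{Y}_{ij})^2$
- Total: $\sum (Y_{ij} - \hat{Y}_{ij})^2 + \sum (\hat{Y}_{ij} - \bar{Y})^2$
SSE = SSPE + SSLOF, from the decomposition
$$Y_{ij} - \hat{Y}_{ij} = (Y_{ij} - \bar{Y}_i) + (\bar{Y}_i - \hat{Y}_{ij})$$
- SSPE: sum of squared pure errors $= \sum (Y_{ij} - \bar{Y}_i)^2$
- SSLOF: sum of squares for lack of fit $= \sum (\hat{Y}_{ij} - \bar{Y}_i)^2$
$H_0$: the linear model fits the data well
$H_1$: the linear model does not fit the data
If SSLOF is large, there is a lack of fit.
$$F = \frac{\sum (\hat{Y}_{ij} - \bar{Y}_i)^2 / df_1}{\sum (Y_{ij} - \bar{Y}_i)^2 / df_2} = \frac{SSLOF / df_1}{SSPE / df_2} \sim F_{df_1, df_2}$$
Reject $H_0$ if $F$ exceeds the $1 - \alpha$ quantile of $F_{df_1, df_2}$, for a level-$\alpha$ test.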

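Degrees of freedom in ANOVA
Find $df_1$ and $df_2$.
Think about an example of two populations (we used the pooled sample variance):
$$S_p^2 = \frac{\sum_j (Y_{1j} - \bar{y}_1)^2 + \sum_j (Y_{2j} - \bar{y}_2)^2}{(n_1 - 1) + (n_2 - 1)} = \frac{\sum_j (Y_{1j} - \bar{y}_1)^2 + \sum_j (Y_{2j} - \bar{y}_2)^2}{n_1 + n_2 - 2}$$
Now we have $k$ groups:
$$S_p^2 = \frac{\sum_i \sum_j (y_{ij} - \bar{y}_i)^2}{(n_1 - 1) + (n_2 - 1) + \cdots + (n_k - 1)} = \frac{SSPE}{n - k}$$
Hence $df_2 = n - k$ and $df_1 = df(\text{res}) - df_2 = (n - 2) - (n - k) = k - 2$.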

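ANOVA
$$\begin{array}{lll} \text{ANOVA} & \text{SS} & \text{df} \\ \hline \text{Regression} & \sum (\hat{Y}_{ij} - \bar{Y})^2 & 1 \\ \text{Residual} & \sum (Y_{ij} - \hat{Y}_{ij})^2 & n - 2 \\ \quad\text{LoF} & \sum (\hat{y}_{ij} - \bar{y}_i)^2 & k - 2 \\ \quad\text{PE} & \sum (Y_{ij} - \bar{Y}_i)^2 & n - k \end{array}$$
$$F_{LOF} = \frac{SSLOF / (k - 2)}{SSPE / (n - k)} \sim F_{k-2,\, n-k}$$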
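
The lack of fit F test can be carried out by hand and cross-checked against a model comparison. The base R sketch below uses a small made-up data set with replicates at each x value (the numbers are hypothetical and serve only as an illustration); the comparison model treats x as a factor, so that its fitted values are the group means.

```r
# Hypothetical data with replicates at each of k = 5 distinct x values
x <- c(1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5)
y <- c(2.1, 1.9, 3.2, 2.8, 3.0, 4.5, 4.1, 4.4, 4.9, 4.6, 5.0, 5.3)

fit <- lm(y ~ x)
n <- length(y)
k <- length(unique(x))

group_mean <- ave(y, x)                      # mean of y at each distinct x
SSE   <- sum(residuals(fit)^2)               # residual SS of the linear fit
SSPE  <- sum((y - group_mean)^2)             # pure error
SSLOF <- sum((fitted(fit) - group_mean)^2)   # lack of fit

# Decomposition SSE = SSPE + SSLOF
print(all.equal(SSE, SSPE + SSLOF))

F_lof <- (SSLOF / (k - 2)) / (SSPE / (n - k))
p_value <- pf(F_lof, k - 2, n - k, lower.tail = FALSE)
print(c(F = F_lof, p_value = p_value))

# Same test as a comparison of the line against the group-means model
full <- lm(y ~ factor(x))
tab <- anova(fit, full)
print(all.equal(F_lof, tab$F[2]))
```

A small p-value would indicate that the straight line does not pass close enough to the group means, relative to the pure error.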

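SSLOF and SSPE
$$SSLOF = y^t A_1 y = y^t(-H + J)y, \qquad SSPE = y^t A_2 y = y^t(I - J)y$$
where $J$ is the block-diagonal matrix
$$J = \begin{pmatrix} \frac{1}{n_1} J_{n_1 \times n_1} & & 0 \\ & \ddots & \\ 0 & & \frac{1}{n_k} J_{n_k \times n_k} \end{pmatrix}$$
and $J_{n_i \times n_i}$ is the $n_i \times n_i$ matrix of ones.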
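
The quadratic-form expressions can be verified numerically. In the base R sketch below, the data are hypothetical, and the averaging matrix is built so that its (i, j) entry is 1/n_g when observations i and j share the same x value and 0 otherwise, which is the block-diagonal matrix described above when the data are sorted by group.

```r
# Hypothetical replicated data, sorted by x
x <- c(1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5)
y <- c(2.1, 1.9, 3.2, 2.8, 3.0, 4.5, 4.1, 4.4, 4.9, 4.6, 5.0, 5.3)
n <- length(y)

# Hat matrix of the straight-line fit
X <- cbind(1, x)
H <- X %*% solve(t(X) %*% X) %*% t(X)

# Group-averaging matrix: row i has 1/n_g in the columns of its own group
n_g <- ave(rep(1, n), x, FUN = sum)      # size of the group each row belongs to
J <- outer(x, x, "==") / n_g

# Quadratic forms
SSLOF_q <- drop(t(y) %*% (J - H) %*% y)
SSPE_q  <- drop(t(y) %*% (diag(n) - J) %*% y)

# Direct sums of squares for comparison
fit <- lm(y ~ x)
group_mean <- ave(y, x)
print(all.equal(SSLOF_q, sum((fitted(fit) - group_mean)^2)))
print(all.equal(SSPE_q, sum((y - group_mean)^2)))
```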

Remedial actions
- Change the model if there appears to be nonlinearity but homogeneity of variance
- Transform if there is heterogeneity of variance and nonlinearity
- Consider weighted least squares if there is just heterogeneity of variance
- Delete outliers
- Fit a robust model (loess, etc.)

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Diagnostics and Remedial Measures

Diagnostics and Remedial Measures Diagnostics and Remedial Measures Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Diagnostics and Remedial Measures 1 / 72 Remedial Measures How do we know that the regression

More information

Contents. 1 Review of Residuals. 2 Detecting Outliers. 3 Influential Observations. 4 Multicollinearity and its Effects

Contents. 1 Review of Residuals. 2 Detecting Outliers. 3 Influential Observations. 4 Multicollinearity and its Effects Contents 1 Review of Residuals 2 Detecting Outliers 3 Influential Observations 4 Multicollinearity and its Effects W. Zhou (Colorado State University) STAT 540 July 6th, 2015 1 / 32 Model Diagnostics:

More information

Lecture 1: Linear Models and Applications

Lecture 1: Linear Models and Applications Lecture 1: Linear Models and Applications Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction to linear models Exploratory data analysis (EDA) Estimation

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

Lectures on Simple Linear Regression Stat 431, Summer 2012

Lectures on Simple Linear Regression Stat 431, Summer 2012 Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population

More information

Ch 3: Multiple Linear Regression

Ch 3: Multiple Linear Regression Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

More information

STAT 4385 Topic 06: Model Diagnostics

STAT 4385 Topic 06: Model Diagnostics STAT 4385 Topic 06: Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2016 1/ 40 Outline Several Types of Residuals Raw, Standardized, Studentized

More information

Remedial Measures, Brown-Forsythe test, F test

Remedial Measures, Brown-Forsythe test, F test Remedial Measures, Brown-Forsythe test, F test Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 7, Slide 1 Remedial Measures How do we know that the regression function

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 13 Simple Linear Regression 13-1 Learning Objectives In this chapter, you learn: How to use regression analysis to predict the value of

More information

Diagnostics and Remedial Measures: An Overview

Diagnostics and Remedial Measures: An Overview Diagnostics and Remedial Measures: An Overview Residuals Model diagnostics Graphical techniques Hypothesis testing Remedial measures Transformation Later: more about all this for multiple regression W.

More information

Unit 10: Simple Linear Regression and Correlation

Unit 10: Simple Linear Regression and Correlation Unit 10: Simple Linear Regression and Correlation Statistics 571: Statistical Methods Ramón V. León 6/28/2004 Unit 10 - Stat 571 - Ramón V. León 1 Introductory Remarks Regression analysis is a method for

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

The Model Building Process Part I: Checking Model Assumptions Best Practice

The Model Building Process Part I: Checking Model Assumptions Best Practice The Model Building Process Part I: Checking Model Assumptions Best Practice Authored by: Sarah Burke, PhD 31 July 2017 The goal of the STAT T&E COE is to assist in developing rigorous, defensible test

More information

Homework 2: Simple Linear Regression

Homework 2: Simple Linear Regression STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA

More information

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1)

The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) Authored by: Sarah Burke, PhD Version 1: 31 July 2017 Version 1.1: 24 October 2017 The goal of the STAT T&E COE

More information

holding all other predictors constant

holding all other predictors constant Multiple Regression Numeric Response variable (y) p Numeric predictor variables (p < n) Model: Y = b 0 + b 1 x 1 + + b p x p + e Partial Regression Coefficients: b i effect (on the mean response) of increasing

More information

One-way ANOVA Model Assumptions

One-way ANOVA Model Assumptions One-way ANOVA Model Assumptions STAT:5201 Week 4: Lecture 1 1 / 31 One-way ANOVA: Model Assumptions Consider the single factor model: Y ij = µ + α }{{} i ij iid with ɛ ij N(0, σ 2 ) mean structure random

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c Inference About the Slope ffl As with all estimates, ^fi1 subject to sampling var ffl Because Y jx _ Normal, the estimate ^fi1 _ Normal A linear combination of indep Normals is Normal Simple Linear Regression

More information

Stat 427/527: Advanced Data Analysis I

Stat 427/527: Advanced Data Analysis I Stat 427/527: Advanced Data Analysis I Review of Chapters 1-4 Sep, 2017 1 / 18 Concepts you need to know/interpret Numerical summaries: measures of center (mean, median, mode) measures of spread (sample

More information

MLR Model Checking. Author: Nicholas G Reich, Jeff Goldsmith. This material is part of the statsteachr project

MLR Model Checking. Author: Nicholas G Reich, Jeff Goldsmith. This material is part of the statsteachr project MLR Model Checking Author: Nicholas G Reich, Jeff Goldsmith This material is part of the statsteachr project Made available under the Creative Commons Attribution-ShareAlike 3.0 Unported License: http://creativecommons.org/licenses/by-sa/3.0/deed.en

More information

STAT5044: Regression and Anova. Inyoung Kim

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 51 Outline 1 Matrix Expression 2 Linear and quadratic forms 3 Properties of quadratic form 4 Properties of estimates 5 Distributional properties 3 / 51 Matrix

More information

Heteroskedasticity and Autocorrelation

Heteroskedasticity and Autocorrelation Lesson 7 Heteroskedasticity and Autocorrelation Pilar González and Susan Orbe Dpt. Applied Economics III (Econometrics and Statistics) Pilar González and Susan Orbe OCW 2014 Lesson 7. Heteroskedasticity

More information

Regression Review. Statistics 149. Spring Copyright c 2006 by Mark E. Irwin

Regression Review. Statistics 149. Spring Copyright c 2006 by Mark E. Irwin Regression Review Statistics 149 Spring 2006 Copyright c 2006 by Mark E. Irwin Matrix Approach to Regression Linear Model: Y i = β 0 + β 1 X i1 +... + β p X ip + ɛ i ; ɛ i iid N(0, σ 2 ), i = 1,..., n

More information

4.1. Introduction: Comparing Means

4.1. Introduction: Comparing Means 4. Analysis of Variance (ANOVA) 4.1. Introduction: Comparing Means Consider the problem of testing H 0 : µ 1 = µ 2 against H 1 : µ 1 µ 2 in two independent samples of two different populations of possibly

More information

Simple Linear Regression for the Advertising Data

Simple Linear Regression for the Advertising Data Revenue 0 10 20 30 40 50 5 10 15 20 25 Pages of Advertising Simple Linear Regression for the Advertising Data What do we do with the data? y i = Revenue of i th Issue x i = Pages of Advertisement in i

More information

Math 5305 Notes. Diagnostics and Remedial Measures. Jesse Crawford. Department of Mathematics Tarleton State University

Math 5305 Notes. Diagnostics and Remedial Measures. Jesse Crawford. Department of Mathematics Tarleton State University Math 5305 Notes Diagnostics and Remedial Measures Jesse Crawford Department of Mathematics Tarleton State University (Tarleton State University) Diagnostics and Remedial Measures 1 / 44 Model Assumptions

More information

Heteroskedasticity. Part VII. Heteroskedasticity

Heteroskedasticity. Part VII. Heteroskedasticity Part VII Heteroskedasticity As of Oct 15, 2015 1 Heteroskedasticity Consequences Heteroskedasticity-robust inference Testing for Heteroskedasticity Weighted Least Squares (WLS) Feasible generalized Least

More information

Figure 1: The fitted line using the shipment route-number of ampules data. STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim

Figure 1: The fitted line using the shipment route-number of ampules data. STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim 0.0 1.0 1.5 2.0 2.5 3.0 8 10 12 14 16 18 20 22 y x Figure 1: The fitted line using the shipment route-number of ampules data STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim Problem#

More information

Math 3330: Solution to midterm Exam

Math 3330: Solution to midterm Exam Math 3330: Solution to midterm Exam Question 1: (14 marks) Suppose the regression model is y i = β 0 + β 1 x i + ε i, i = 1,, n, where ε i are iid Normal distribution N(0, σ 2 ). a. (2 marks) Compute the

More information

Outline. Topic 20 - Diagnostics and Remedies. Residuals. Overview. Diagnostics Plots Residual checks Formal Tests. STAT Fall 2013

Outline. Topic 20 - Diagnostics and Remedies. Residuals. Overview. Diagnostics Plots Residual checks Formal Tests. STAT Fall 2013 Topic 20 - Diagnostics and Remedies - Fall 2013 Diagnostics Plots Residual checks Formal Tests Remedial Measures Outline Topic 20 2 General assumptions Overview Normally distributed error terms Independent

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

Diagnostics of Linear Regression

Diagnostics of Linear Regression Diagnostics of Linear Regression Junhui Qian October 7, 14 The Objectives After estimating a model, we should always perform diagnostics on the model. In particular, we should check whether the assumptions

More information

Regression Analysis. Regression: Methodology for studying the relationship among two or more variables

Regression Analysis. Regression: Methodology for studying the relationship among two or more variables Regression Analysis Regression: Methodology for studying the relationship among two or more variables Two major aims: Determine an appropriate model for the relationship between the variables Predict the

More information

ANOVA: Analysis of Variation

ANOVA: Analysis of Variation ANOVA: Analysis of Variation The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative variables depend on which group (given by categorical

More information

Econometrics of Panel Data

Econometrics of Panel Data Econometrics of Panel Data Jakub Mućk Meeting # 2 Jakub Mućk Econometrics of Panel Data Meeting # 2 1 / 26 Outline 1 Fixed effects model The Least Squares Dummy Variable Estimator The Fixed Effect (Within

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression September 24, 2008 Reading HH 8, GIll 4 Simple Linear Regression p.1/20 Problem Data: Observe pairs (Y i,x i ),i = 1,...n Response or dependent variable Y Predictor or independent

More information

unadjusted model for baseline cholesterol 22:31 Monday, April 19,

unadjusted model for baseline cholesterol 22:31 Monday, April 19, unadjusted model for baseline cholesterol 22:31 Monday, April 19, 2004 1 Class Level Information Class Levels Values TRETGRP 3 3 4 5 SEX 2 0 1 Number of observations 916 unadjusted model for baseline cholesterol

More information

Regression diagnostics

Regression diagnostics Regression diagnostics Leiden University Leiden, 30 April 2018 Outline 1 Error assumptions Introduction Variance Normality 2 Residual vs error Outliers Influential observations Introduction Errors and

More information

Tentative solutions TMA4255 Applied Statistics 16 May, 2015

Tentative solutions TMA4255 Applied Statistics 16 May, 2015 Norwegian University of Science and Technology Department of Mathematical Sciences Page of 9 Tentative solutions TMA455 Applied Statistics 6 May, 05 Problem Manufacturer of fertilizers a) Are these independent

More information

CHAPTER 2 SIMPLE LINEAR REGRESSION

CHAPTER 2 SIMPLE LINEAR REGRESSION CHAPTER 2 SIMPLE LINEAR REGRESSION 1 Examples: 1. Amherst, MA, annual mean temperatures, 1836 1997 2. Summer mean temperatures in Mount Airy (NC) and Charleston (SC), 1948 1996 Scatterplots outliers? influential

More information

Multiple Regression Analysis: Heteroskedasticity

Multiple Regression Analysis: Heteroskedasticity Multiple Regression Analysis: Heteroskedasticity y = β 0 + β 1 x 1 + β x +... β k x k + u Read chapter 8. EE45 -Chaiyuth Punyasavatsut 1 topics 8.1 Heteroskedasticity and OLS 8. Robust estimation 8.3 Testing

More information

22s:152 Applied Linear Regression. Take random samples from each of m populations.

22s:152 Applied Linear Regression. Take random samples from each of m populations. 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html 1 / 42 Passenger car mileage Consider the carmpg dataset taken from

More information

Dr. Maddah ENMG 617 EM Statistics 11/28/12. Multiple Regression (3) (Chapter 15, Hines)

Dr. Maddah ENMG 617 EM Statistics 11/28/12. Multiple Regression (3) (Chapter 15, Hines) Dr. Maddah ENMG 617 EM Statistics 11/28/12 Multiple Regression (3) (Chapter 15, Hines) Problems in multiple regression: Multicollinearity This arises when the independent variables x 1, x 2,, x k, are

More information

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information
