The Classical Linear Regression Model

The Classical Linear Regression Model
ME104: Linear Regression Analysis
Kenneth Benoit
August 14, 2012

CLRM: Basic Assumptions

1. Specification:
   - The relationship between X and Y in the population is linear: E(Y) = Xβ
   - No extraneous variables in X
   - No omitted independent variables
   - Parameters (β) are constant
2. E(ε) = 0
3. Error terms:
   - Var(ε) = σ², or homoskedastic errors
   - E(εi εj) = 0 for i ≠ j, or no autocorrelation

CLRM: Basic Assumptions (cont.)

4. X is non-stochastic, meaning observations on the independent variables are fixed in repeated samples
   - implies no measurement error in X
   - implies no serial correlation where a lagged value of Y would be used as an independent variable
   - no simultaneity or endogenous X variables
5. N > k, or the number of observations is greater than the number of independent variables (in matrix terms: rank(X) = k), and no exact linear relationships exist in X
6. Normally distributed errors: ε|X ~ N(0, σ²). Technically, however, this is a convenience rather than a strict assumption

Normally distributed errors

Ordinary Least Squares (OLS)

Objective: minimize Σei² = Σ(Yi − Ŷi)², where
- Ŷi = b0 + b1 Xi
- the error is ei = (Yi − Ŷi)

The slope estimate is:
b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)² = Σ xi yi / Σ xi²   (xi, yi in deviations from their means)

The intercept is: b0 = Ȳ − b1 X̄
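
A minimal sketch in R of these formulas, using small made-up x and y vectors (not the lecture's data) so the hand-computed slope and intercept can be checked against lm():

x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8)

b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)   # slope formula
b0 <- mean(y) - b1 * mean(x)                                      # intercept formula

c(b0, b1)         # manual estimates
coef(lm(y ~ x))   # should agree with lm()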

OLS rationale
- Formulas are very simple
- Closely related to ANOVA (sums of squares decomposition)
- Predicted Y is the sample mean when Pr(Y|X) = Pr(Y)
- In the special case where Y has no relation to X, b1 = 0, and the OLS fit is simply Ŷ = b0. Why? Because b0 = Ȳ − b1 X̄, so Ŷ = Ȳ. The prediction is then the sample mean when X is unrelated to Y
- Since OLS is then an extension of the sample mean, it has the same attractive properties (efficiency and lack of bias)
- Alternatives exist, but OLS generally has the best properties when the assumptions are met

OLS in matrix notation

Formula for the coefficient β:
Y = Xβ + ε
X′Y = X′Xβ + X′ε
X′Y = X′Xβ + 0
(X′X)⁻¹X′Y = β + 0
β = (X′X)⁻¹X′Y

Formula for the variance-covariance matrix: σ²(X′X)⁻¹
- In the simple case where y = β0 + β1 x, this gives σ² / Σ(xi − x̄)² for the variance of β1
- Note how increasing the variation in X will reduce the variance of β1
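
A minimal sketch of these matrix formulas in R, on small simulated data (hypothetical, not the lecture's data set), checked against lm():

set.seed(1)
x <- rnorm(20)
y <- 1 + 2 * x + rnorm(20)
X <- cbind(1, x)                            # design matrix with an intercept column

beta <- solve(t(X) %*% X) %*% t(X) %*% y    # (X'X)^{-1} X'y
e <- y - X %*% beta                         # residuals
sigma2 <- sum(e^2) / (nrow(X) - ncol(X))    # estimate of sigma^2
vcov_beta <- sigma2 * solve(t(X) %*% X)     # variance-covariance matrix

cbind(beta, coef(lm(y ~ x)))                # coefficients agree with lm()
sqrt(diag(vcov_beta))                       # standard errors, as reported by summary(lm(y ~ x))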

The hat matrix

The hat matrix H is defined through:
β̂ = (X′X)⁻¹X′y
Xβ̂ = X(X′X)⁻¹X′y
ŷ = Hy

H = X(X′X)⁻¹X′ is called the hat matrix
- Other important quantities, such as ŷ and Σei² (RSS), can be expressed as functions of H
- Corrections for heteroskedastic errors ("robust" standard errors) involve manipulating H
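
A minimal sketch of the hat matrix in R on simulated data (hypothetical), verifying that ŷ = Hy and that the diagonal of H matches R's leverage values:

set.seed(2)
x <- rnorm(15)
y <- 3 - x + rnorm(15)
X <- cbind(1, x)

H <- X %*% solve(t(X) %*% X) %*% t(X)             # the hat matrix
yhat <- H %*% y                                   # H "puts the hat on" y

fit <- lm(y ~ x)
all.equal(as.vector(yhat), unname(fitted(fit)))   # TRUE: Hy gives the fitted values
all.equal(diag(H), unname(hatvalues(fit)))        # TRUE: diag(H) gives the leverages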

Three critical quantities
- Yi: the observed value of the dependent variable for unit i
- Ȳ: the mean of the dependent variable Y
- Ŷi: the value of the outcome for unit i that is predicted from the model

Sums of squares (ANOVA)
- TSS: Total sum of squares, Σ(Yi − Ȳ)²
- SSM: Model (or Regression) sum of squares, Σ(Ŷi − Ȳ)²
- SSE: Error (or Residual) sum of squares, Σei² = Σ(Ŷi − Yi)²

The key to remember is that TSS = SSM + SSE

R²
- Solid arrow: variation in y when X is unknown (TSS, Total Sum of Squares, Σ(yi − ȳ)²)
- Dashed arrow: variation in y when X is known (SSM, Model Sum of Squares, Σ(ŷi − ȳ)²)

R² decomposed

y = ŷ + ε
Var(y) = Var(ŷ) + Var(ε) + 2Cov(ŷ, ε)
Var(y) = Var(ŷ) + Var(ε) + 0
Σ(yi − ȳ)²/N = Σ(ŷi − ȳ)²/N + Σ(ei − ē)²/N
Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σ(ei − ē)²
Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σei²
TSS = SSM + SSE
TSS/TSS = SSM/TSS + SSE/TSS
1 = R² + unexplained variance
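
A minimal numerical check of this decomposition in R on simulated data (hypothetical): the covariance between ŷ and the residuals is zero, so the variances add up:

set.seed(3)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100)
fit <- lm(y ~ x)

cov(fitted(fit), resid(fit))                      # ~ 0
var(y) - (var(fitted(fit)) + var(resid(fit)))     # ~ 0: Var(y) = Var(yhat) + Var(e)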

R²
- A much over-used statistic: it may not be what we are interested in at all
- Interpretation: the proportion of the variation in y that is explained linearly by the independent variables
- R² = SSM/TSS = 1 − SSE/TSS = 1 − Σ(yi − ŷi)² / Σ(yi − ȳ)²
- Alternatively, R² is the squared correlation coefficient between y and ŷ
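
A minimal sketch verifying in R that R² equals the squared correlation between y and ŷ (simulated data, hypothetical):

set.seed(4)
x <- rnorm(50)
y <- 2 + 0.5 * x + rnorm(50)
fit <- lm(y ~ x)

summary(fit)$r.squared    # R^2 reported by lm()
cor(y, fitted(fit))^2     # squared correlation between y and yhat: the same value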

R² continued
- When a model has no intercept, it is possible for R² to lie outside the interval (0, 1)
- R² rises with the addition of more explanatory variables. For this reason we often report the adjusted R²:
  1 − (1 − R²)(n − 1)/(n − k − 1)
  where k is the total number of regressors in the linear model (excluding the constant)
- Whether R² is high or not depends a lot on the overall variance in Y
- So R² values from different Y samples cannot be compared

R² continued
- Solid arrow: variation in y when X is unknown (SST, Total Sum of Squares)
- Dashed arrow: variation in y when X is known (SSR, Regression Sum of Squares)

R² decomposed

y = ŷ + e
Var(y) = Var(ŷ) + Var(e) + 2Cov(ŷ, e)
Var(y) = Var(ŷ) + Var(e) + 0
Σ(yi − ȳ)²/N = Σ(ŷi − ȳ)²/N + Σ(ei − ē)²/N
Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σ(ei − ē)²
Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σei²
SST = SSR + SSE
SST/SST = SSR/SST + SSE/SST
1 = R² + unexplained variance

Regression terminology
- y is the dependent variable, referred to also (by Greene) as the regressand
- X are the independent variables, also known as explanatory variables or regressors
- y is regressed on X
- The error term ε is sometimes called a disturbance

Some important OLS properties to understand

These apply to y = α + βx + ε
- If β = 0 and the only regressor is the intercept, then this is the same as regressing y on a column of ones, and hence α = ȳ, the mean of the observations
- If α = 0 so that there is no intercept and one explanatory variable x, then β = Σxy / Σx²
- If there is an intercept and one explanatory variable, then
  β = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = Σ(xi − x̄)yi / Σ(xi − x̄)²

Some important OLS properties (cont.)
- If the observations are expressed as deviations from their means, y* = y − ȳ and x* = x − x̄, then β = Σx*y* / Σx*²
- The intercept can be estimated as ȳ − βx̄. This implies that the intercept is estimated by the value that causes the sum of the OLS residuals to equal zero
- The mean of the ŷ values equals the mean of the y values; together with the previous properties, this implies that the OLS regression line passes through the overall mean of the data points
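
A minimal sketch in R of these properties on simulated data (hypothetical): the slope from deviations matches lm(), the residuals sum to zero, and the fitted line passes through (x̄, ȳ):

set.seed(5)
x <- rnorm(30)
y <- 2 + 3 * x + rnorm(30)

xd <- x - mean(x); yd <- y - mean(y)
beta  <- sum(xd * yd) / sum(xd^2)   # slope from deviations from the means
alpha <- mean(y) - beta * mean(x)   # intercept

fit <- lm(y ~ x)
c(alpha, beta); coef(fit)           # same estimates
sum(resid(fit))                     # ~ 0: OLS residuals sum to zero
alpha + beta * mean(x) - mean(y)    # 0: the fitted line passes through (xbar, ybar)
mean(fitted(fit)) - mean(y)         # ~ 0: the mean of yhat equals the mean of y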

OLS in Stata

. use dail2002
(Ireland 2002 Dail Election - Candidate Spending Data)

. gen spendxinc = spend_total * incumb
(2 missing values generated)

. reg votes1st spend_total incumb minister spendxinc

      Source |       SS       df       MS              Number of obs =     462
-------------+------------------------------           F(  4,   457) =  229.05
       Model |  2.9549e+09     4   738728297           Prob > F      =  0.0000
    Residual |  1.4739e+09   457  3225201.58           R-squared     =  0.6672
-------------+------------------------------           Adj R-squared =  0.6643
       Total |  4.4288e+09   461  9607007.17           Root MSE      =  1795.9

------------------------------------------------------------------------------
    votes1st |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 spend_total |   .2033637   .0114807    17.71   0.000     .1808021    .2259252
      incumb |   5150.758   536.3686     9.60   0.000     4096.704    6204.813
    minister |   1260.001   474.9661     2.65   0.008      326.613     2193.39
   spendxinc |  -.1490399   .0274584    -5.43   0.000    -.2030003   -.0950794
       _cons |   469.3744   161.5464     2.91   0.004     151.9086    786.8402
------------------------------------------------------------------------------

OLS in R

> dail <- read.dta("dail2002.dta")
> mdl <- lm(votes1st ~ spend_total*incumb + minister, data=dail)
> summary(mdl)

Call:
lm(formula = votes1st ~ spend_total * incumb + minister, data = dail)

Residuals:
    Min      1Q  Median      3Q     Max
-5555.8  -979.2  -262.4   877.2  6816.5

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         469.37438  161.54635   2.906  0.00384 **
spend_total           0.20336    0.01148  17.713  < 2e-16 ***
incumb             5150.75818  536.36856   9.603  < 2e-16 ***
minister           1260.00137  474.96610   2.653  0.00826 **
spend_total:incumb   -0.14904    0.02746  -5.428 9.28e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1796 on 457 degrees of freedom
  (2 observations deleted due to missingness)
Multiple R-squared:  0.6672,    Adjusted R-squared:  0.6643
F-statistic:   229 on 4 and 457 DF,  p-value: < 2.2e-16

Examining the sums of squares

> yhat <- mdl$fitted.values   # uses the lm object mdl from previous
> ybar <- mean(mdl$model[,1])
> y <- mdl$model[,1]          # can't use dail$votes1st since diff N
> SST <- sum((y-ybar)^2)
> SSR <- sum((yhat-ybar)^2)
> SSE <- sum((yhat-y)^2)
> SSE
[1] 1473917120
> sum(mdl$residuals^2)
[1] 1473917120
> (r2 <- SSR/SST)
[1] 0.6671995
> (adjr2 <- (1 - (1-r2)*(462-1)/(462-4-1)))
[1] 0.6642865
> summary(mdl)$r.squared      # note the call to summary()
[1] 0.6671995
> SSE/457
[1] 3225202
> sqrt(SSE/457)
[1] 1795.885
> summary(mdl)$sigma
[1] 1795.885