The Simple Regression Model


Simple regression model: objectives

Given the model y = β0 + β1x + u, where, for example:
- y is earnings and x is years of education, or
- y is sales and x is spending on advertising, or
- y is the market value of an apartment and x is its area.

Our objectives for the moment will be:
- Estimate the unknown parameters β0 and β1
- Assess how much x explains y

The Simple Linear Regression (SLR) Model with Cross-Sectional Data

Assumption SLR.1 (Linearity in parameters): y = β0 + β1x + u
- y is the dependent variable (or explained variable, response variable, or regressand)
- x is the independent variable (or explanatory variable, control variable, or regressor)
- u is the error term or disturbance (y is allowed to vary for the same level of x)

This is a population equation. It is linear in the (unknown) parameters β0 and β1.

Further assumptions

Assumption SLR.2 (Random Sampling): we have a random sample {(x_i, y_i): i = 1, 2, ..., n} from the population, so that y_i = β0 + β1x_i + u_i for each i.

Assumption SLR.3 (Variation in the Regressor): the x_i, i = 1, 2, ..., n, do not all have the same value.

Further assumptions

Assumption SLR.4 (Zero Conditional Mean): E(u|x) = 0. Crucial assumption!

Remember the previous examples:
Earnings = β0 + β1·Education + u
CrimeRate = β0 + β1·Policemen + u
In these cases SLR.4 most likely fails (for example, unobserved ability is part of u and is likely correlated with education).

Assumption SLR.4' (Zero Mean): E(u) = 0. Not really necessary given that SLR.4 implies SLR.4'. This is not a restrictive assumption, since we can always adjust β0 so that SLR.4' holds.

Implication of Zero Conditional Mean

[Figure: conditional densities f(y|x) at x1 and x2; for any x, the distribution of y is centered at E(y|x) = β0 + β1x.]

Population regression line, sample data points and the associated error terms

[Figure: population regression line E(y|x) = β0 + β1x, data points (x1, y1), ..., (x4, y4), and the error terms u1, ..., u4 shown as vertical deviations from the line.]

Estimation of unknown parameters by the method of moments

We want to estimate the population parameters from a sample. Let {(x_i, y_i): i = 1, ..., n} denote a random sample of size n from the population, with y_i = β0 + β1x_i + u_i.

SLR.4 gives two population moment conditions:
(i) E(u) = 0, i.e. E(y − β0 − β1x) = 0, since E(u|x) = 0 implies E(u) = 0
(ii) E(xu) = 0, i.e. E[x(y − β0 − β1x)] = 0, since E(u|x) = 0 implies Cov(x, u) = 0

Estimation of unknown parameters by the method of moments (intuition)

The clever idea is to pick estimates β̂0, β̂1 so that the sample moments match the population moments. The sample counterparts are:

(1/n) Σ_{i=1}^n (y_i − β̂0 − β̂1x_i) = 0
(1/n) Σ_{i=1}^n x_i(y_i − β̂0 − β̂1x_i) = 0

These are two equations in the two unknowns β̂0 and β̂1.

Estimation of unknown parameters by the method of moments (solution)

The first equation implies: β̂0 = ȳ − β̂1x̄

Plugging this into the second equation gives Σ x_i(y_i − ȳ + β̂1x̄ − β̂1x_i) = 0, which simplifies to:

β̂1 = Σ_{i=1}^n (x_i − x̄)(y_i − ȳ) / Σ_{i=1}^n (x_i − x̄)²

We need sample variation in the x variable (SLR.3). Otherwise the slope parameter is not well defined!
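The closed-form solution above can be sketched in a few lines of code. This is a minimal illustration with made-up numbers, not data from the text:

```python
# Method-of-moments / OLS estimates for the simple regression model.
# The sample below is made up purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 6.0]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Slope: sample covariance of (x, y) over sample variance of x.
# The denominator must be positive, which is exactly Assumption SLR.3.
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)
beta1_hat = sxy / sxx
# Intercept from the first moment condition.
beta0_hat = ybar - beta1_hat * xbar
print(beta1_hat, beta0_hat)
```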

Slope estimator

β̂1 is an estimator if we treat the (x_i, y_i) as random variables; it is an estimate once we substitute the actual observations for (x_i, y_i). The slope estimate is the sample covariance between x and y divided by the sample variance of x:
- If x and y are positively correlated, the slope will be positive
- If x and y are negatively correlated, the slope will be negative
We need x to vary in our sample.

Example and interpretation

n = 11064; data from the Employment Survey (INE), 2003. wage is the net monthly wage and educ measures the number of years of schooling completed.

We estimate a positive relation between these two variables: an additional year of schooling leads to an estimated expected increase of 52.513 euros in the wage. Again, we must be cautious about the interpretation of these estimates.

Definitions

Fitted value: ŷ_i = β̂0 + β̂1x_i
Residual: û_i = y_i − ŷ_i
Sum of Squared Residuals (SSR): Σ_{i=1}^n û_i²

Sample regression line, sample data points and the associated estimated error terms (called residuals); fitted values are on the line

[Figure: sample regression line ŷ = β̂0 + β̂1x, data points (x1, y1), ..., (x4, y4), and residuals û1, ..., û4 shown as vertical deviations from the fitted line.]

Alternative approach to the derivation: OLS (Ordinary Least Squares)

We can fit a line so as to minimise the sum of squared residuals:
min over (b0, b1) of Σ_{i=1}^n (y_i − b0 − b1x_i)²
The first-order conditions are equivalent to the moment equations we had before: same solution!

Review: OLS estimators

Given a random sample {(x_i, y_i): i = 1, ..., n}, the OLS estimates are obtained by minimising Σ_{i=1}^n (y_i − b0 − b1x_i)² with respect to (b0, b1).

Alternatively, we can exploit the fact that:
β̂1 = Σ (x_i − x̄)(y_i − ȳ) / Σ (x_i − x̄)² and β̂0 = ȳ − β̂1x̄

Fitted value: ŷ_i = β̂0 + β̂1x_i. Residual: û_i = y_i − ŷ_i.
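A short sketch (with a made-up sample) of fitted values and residuals, checking that the OLS residuals satisfy the two sample moment conditions Σ û_i = 0 and Σ x_i û_i = 0:

```python
# Fitted values and residuals from OLS estimates; the residuals must
# satisfy sum(u_hat) = 0 and sum(x * u_hat) = 0 by construction.
# The sample is made up purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 6.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

fitted = [b0 + b1 * xi for xi in x]              # y_hat_i
resid = [yi - fi for yi, fi in zip(y, fitted)]   # u_hat_i
ssr = sum(u ** 2 for u in resid)                 # sum of squared residuals

# Both sums are zero up to floating-point error.
print(sum(resid))
print(sum(xi * u for xi, u in zip(x, resid)))
```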

Today
- Assess how much x explains y
- Assess how close our estimators are to the true (population) values

Looking for a goodness-of-fit measure: decomposition of the Total Sum of Squares (SST)

We know y_i = ŷ_i + û_i. This implies y_i − ȳ = (ŷ_i − ȳ) + û_i. Also, Σ û_i = 0 and Σ ŷ_i û_i = 0, to be used later.

Now define:
Total Sum of Squares: SST = Σ_{i=1}^n (y_i − ȳ)²
Explained Sum of Squares: SSE = Σ_{i=1}^n (ŷ_i − ȳ)²
Sum of Squared Residuals: SSR = Σ_{i=1}^n û_i²
Then SST = SSE + SSR.

Goodness-of-fit

How well does our sample regression line fit the data? If SSR is very small relative to SST (equivalently, SSE is very large relative to SST), then a lot of the variation in the dependent variable y is explained by the model.

We can compute the fraction of the total sum of squares (SST) that is explained by the model; denote this the R-squared of the regression:
R² = SSE/SST = 1 − SSR/SST
The lower SSR, the higher R². OLS maximises R² (it minimises SSR). We always have 0 ≤ R² ≤ 1.
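The decomposition and R² can be verified numerically. A minimal sketch with a made-up sample:

```python
# Verify SST = SSE + SSR and compute R^2 for a tiny made-up sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 6.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)                  # total
sse = sum((fi - ybar) ** 2 for fi in fitted)             # explained
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual

r2 = sse / sst   # equivalently 1 - ssr / sst
print(r2)
```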

Example (continued)

n = 11064, data from the Employment Survey (INE), 2003. A lot of the variation in the dependent variable (wage) is left unexplained by the model. This is not related to the fact that educ is most probably correlated with u: we can have a low R² even if all the assumptions hold true.

Assumptions again...

Assumption SLR.1 (Linearity in parameters): y = β0 + β1x + u
Assumption SLR.2 (Random Sampling): we have a random sample from the population
Assumption SLR.3 (Variation in the Regressor): the x_i, i = 1, 2, ..., n, do not all have the same value
Assumption SLR.4 (Zero Conditional Mean): E(u|x) = 0, which implies E(u) = 0 and Cov(x, u) = 0

Properties of the OLS estimators

Thought experiment: draw a sample of size n from the population many times (infinitely many times) and compute the OLS estimates each time. Is the average of these estimates equal to the true parameter values?

Under the previous assumptions (SLR.1 to SLR.4) the answer turns out to be yes. These are estimators if we treat the (x_i, y_i) as random variables. It will be the case that:
E(β̂0) = β0 and E(β̂1) = β1

Unbiasedness of OLS (proof)

Put SST_x = Σ_{i=1}^n (x_i − x̄)².

Now note that Σ_{i=1}^n (x_i − x̄) = 0 and Σ_{i=1}^n (x_i − x̄)x_i = Σ_{i=1}^n (x_i − x̄)².

Let us rewrite our estimator as:
β̂1 = Σ (x_i − x̄)y_i / SST_x = β1 + (1/SST_x) Σ (x_i − x̄)u_i

Unbiasedness of OLS (cont.)

Now take expectations conditional on the values of the x's, that is, treat the x's as constants (to avoid heavy notation, I don't make this explicit):
E(β̂1) = β1 + (1/SST_x) Σ (x_i − x̄) E(u_i) = β1
And if the conditional expectation is a constant, the unconditional expectation is also that constant.

Now it's really easy to prove unbiasedness of β̂0. Try it yourself!

Unbiasedness: summary

The OLS estimators of β1 and β0 are unbiased. However, in a given sample we may be close to or far from the true parameter; unbiasedness is a minimal requirement. Unbiasedness depends on the four assumptions: if any assumption fails, then OLS is not necessarily unbiased.
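The thought experiment behind unbiasedness can be mimicked by simulation. A sketch, assuming made-up population values β0 = 1, β1 = 2, σ = 1: under SLR.1 to SLR.4 the average of the slope estimates across many samples should be close to the true β1.

```python
import random

# Monte Carlo illustration of unbiasedness of the OLS slope.
# Population values (beta0, beta1, sigma) are assumptions for the demo.
random.seed(0)
beta0, beta1, sigma = 1.0, 2.0, 1.0
n, reps = 50, 2000
x = [i / n for i in range(n)]          # regressor values, held fixed
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

estimates = []
for _ in range(reps):
    # Draw a fresh sample of errors, build y, re-estimate the slope.
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    ybar = sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    estimates.append(b1)

mean_b1 = sum(estimates) / reps
print(mean_b1)   # close to the true beta1 = 2
```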

Variance of the OLS estimators

We already know that the sampling distribution of our estimators is centered around the true parameters. But is the distribution concentrated around the true parameter? To answer this question, we add an additional assumption.

Assumption SLR.5 (Homoskedasticity): Var(u|x) = σ²

Variance of OLS (cont.)

Assumption SLR.5 (Homoskedasticity): Var(u|x) = σ²
Var(u|x) = E(u²|x) − [E(u|x)]², and E(u|x) = 0, so σ² = E(u²|x) = E(u²) = Var(u).
So σ² is also the unconditional variance, called the error variance; σ, the square root of the error variance, is called the standard deviation of the error.
We can write: E(y|x) = β0 + β1x and Var(y|x) = σ².

Homoskedastic case

[Figure: conditional densities f(y|x) at x1 and x2 with identical spread, centered on the line E(y|x) = β0 + β1x.]

Heteroskedastic case

[Figure: conditional densities f(y|x) at x1, x2 and x3 whose spread varies with x, centered on the line E(y|x) = β0 + β1x.]

Variance of OLS (cont.)

Recall SST_x = Σ_{i=1}^n (x_i − x̄)², and note that:
- Var(a + bX) = Var(bX) = b² Var(X), where a and b are constants and X is a random variable
- Var(Y + X) = Var(Y) + Var(X) if X and Y are independent

Then, conditional on the x's:
Var(β̂1) = Var( β1 + (1/SST_x) Σ (x_i − x̄)u_i )
= (1/SST_x)² Var( Σ (x_i − x̄)u_i )
= (1/SST_x)² Σ (x_i − x̄)² Var(u_i)
= (1/SST_x)² Σ (x_i − x̄)² σ²
= (1/SST_x)² σ² SST_x = σ²/SST_x

This is a conditional variance!

Variance of OLS: summary

Var(β̂1) = σ²/SST_x, where SST_x = Σ_{i=1}^n (x_i − x̄)²

- The larger the error variance σ², the larger the variance of the slope estimator
- The larger the variability in the x_i, the smaller the variance of the slope estimator
- SST_x tends to increase with the sample size n, so the variance tends to decrease with the sample size

One can also show: Var(β̂0) = σ² (n⁻¹ Σ x_i²) / SST_x
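The formula Var(β̂1) = σ²/SST_x can also be checked by simulation. A sketch under made-up population values: with homoskedastic errors (SLR.5) the empirical variance of the slope estimates across samples should approach σ²/SST_x.

```python
import random

# Monte Carlo check of Var(beta1_hat) = sigma^2 / SST_x.
# Population values are assumptions made up for the demo.
random.seed(1)
beta0, beta1, sigma = 1.0, 2.0, 1.0
n, reps = 50, 5000
x = [i / n for i in range(n)]           # fixed regressor values
xbar = sum(x) / n
sst_x = sum((xi - xbar) ** 2 for xi in x)

estimates = []
for _ in range(reps):
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    ybar = sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
    estimates.append(b1)

mean_b1 = sum(estimates) / reps
emp_var = sum((b - mean_b1) ** 2 for b in estimates) / reps
theo_var = sigma ** 2 / sst_x           # the conditional variance formula
print(emp_var, theo_var)
```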

Estimating the error variance

In practice the error variance σ² is unknown, so we must estimate it:
σ̂² = (1/(n−2)) Σ_{i=1}^n û_i² = SSR/(n−2)
One can show that this is an unbiased estimator of the error variance. The so-called standard error of the regression is σ̂, the square root of σ̂².

Estimated standard error (se) of the parameters

se(β̂1) = σ̂ / √SST_x, and similarly for β̂0.

In our leading example:
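Putting the last two slides together: a sketch (with the same kind of made-up sample as the earlier illustrations) of σ̂² = SSR/(n−2), the standard error of the regression, and the estimated standard error of the slope.

```python
import math

# Error-variance estimate and standard errors for a tiny made-up sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 6.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sst_x = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
b0 = ybar - b1 * xbar
ssr = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

sigma2_hat = ssr / (n - 2)               # unbiased estimator of sigma^2
sigma_hat = math.sqrt(sigma2_hat)        # standard error of the regression
se_b1 = sigma_hat / math.sqrt(sst_x)     # estimated se of the slope
print(sigma2_hat, se_b1)
```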