Chapter 2 - The Simple Linear Regression Model

The linear regression equation is:

$y_i = \beta_1 + \beta_2 x_i + e_i$ for $i = 1, \ldots, N$

where $y_i$ and $x_i$ are observable variables and $e_i$ is a random error.

How can an estimation rule be constructed for the unknown parameters $\beta_1$ and $\beta_2$?

The least squares principle, also known as the method of ordinary least squares (OLS), is to find a solution that minimizes the sum of squared errors:

$S = \sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N} (y_i - \beta_1 - \beta_2 x_i)^2$

Note: negative and positive errors get equal weight.

This is a minimization problem. The solution is a calculus exercise. The minimizing points $b_1$ and $b_2$ are obtained as the solution to the first order conditions:

$\dfrac{\partial S}{\partial \beta_1} = 0$ and $\dfrac{\partial S}{\partial \beta_2} = 0$

Differentiation gives:

$\dfrac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{N} (y_i - \beta_1 - \beta_2 x_i)$

and

$\dfrac{\partial S}{\partial \beta_2} = -2 \sum_{i=1}^{N} x_i (y_i - \beta_1 - \beta_2 x_i)$
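As a quick numerical illustration of this minimization problem, the sketch below sets up the sum of squared errors and its two partial derivatives in Python. The simulated sample and the parameter values (intercept 80, slope 10, error standard deviation 8) are assumptions made here for illustration only, not part of the course notes.

```python
import numpy as np

# Hypothetical simulated sample (for illustration only)
rng = np.random.default_rng(0)
N = 40
x = rng.uniform(5, 30, N)          # explanatory variable
e = rng.normal(0, 8, N)            # random error with zero mean
y = 80 + 10 * x + e                # generated with beta1 = 80, beta2 = 10

def sse(b1, b2):
    """Sum of squared errors S = sum (y_i - b1 - b2*x_i)^2."""
    return np.sum((y - b1 - b2 * x) ** 2)

def gradient(b1, b2):
    """First order conditions: dS/dbeta1 = -2*sum(res), dS/dbeta2 = -2*sum(x*res)."""
    res = y - b1 - b2 * x
    return np.array([-2.0 * np.sum(res), -2.0 * np.sum(x * res)])

# At an arbitrary candidate point the derivatives are far from zero;
# at the least squares solution (computed in the next sketch) they are ~0.
print(sse(0.0, 0.0), gradient(0.0, 0.0))
```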

To obtain the minimizing point, set the above equations to zero and evaluate at $b_1$ and $b_2$. This gives:

$-2 \sum_{i=1}^{N} (y_i - b_1 - b_2 x_i) = 0$ and $-2 \sum_{i=1}^{N} x_i (y_i - b_1 - b_2 x_i) = 0$

Now divide through by $-2$ and rearrange terms to get the normal equations:

$\sum_{i=1}^{N} y_i = N b_1 + b_2 \sum_{i=1}^{N} x_i$   (1)

$\sum_{i=1}^{N} x_i y_i = b_1 \sum_{i=1}^{N} x_i + b_2 \sum_{i=1}^{N} x_i^2$   (2)

The normal equations can be solved. From equation (1) rearrange terms to get:

$b_1 = \bar{y} - b_2 \bar{x}$   (3)

where $\bar{y}$ and $\bar{x}$ are the sample means. That is,

$\bar{y} = \dfrac{1}{N} \sum_{i=1}^{N} y_i$ and $\bar{x} = \dfrac{1}{N} \sum_{i=1}^{N} x_i$

Equation (3) is the intercept estimator.

Substitute (3) into (2) to get:

$\sum_{i=1}^{N} x_i y_i = (\bar{y} - b_2 \bar{x}) \sum_{i=1}^{N} x_i + b_2 \sum_{i=1}^{N} x_i^2$

Rearrange to get:

$\sum_{i=1}^{N} x_i y_i - N \bar{x} \bar{y} = b_2 \left( \sum_{i=1}^{N} x_i^2 - N \bar{x}^2 \right)$

Note, the above used the result $\sum_{i=1}^{N} x_i = N \bar{x}$.

Now solve to get the slope estimator:

$b_2 = \dfrac{\sum_{i=1}^{N} x_i y_i - N \bar{x} \bar{y}}{\sum_{i=1}^{N} x_i^2 - N \bar{x}^2}$   (4)

Equations (3) and (4) are called the ordinary least squares (OLS) estimators for $\beta_1$ and $\beta_2$.
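Continuing the simulated-data sketch above (data and names are illustrative assumptions), equations (3) and (4) translate directly into code:

```python
# Slope estimator, equation (4)
xbar, ybar = x.mean(), y.mean()
b2 = (np.sum(x * y) - N * xbar * ybar) / (np.sum(x ** 2) - N * xbar ** 2)

# Intercept estimator, equation (3)
b1 = ybar - b2 * xbar

print(b1, b2)            # should be close to the true values 80 and 10
print(gradient(b1, b2))  # first order conditions: both derivatives approximately zero
```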

The estimators $b_1$ and $b_2$ give an estimation rule. How are these results used? Collect a numeric data set for an application of interest. The $y_i$ and $x_i$ are now numeric observations. Do the calculations in equations (3) and (4). This gives the numeric estimates denoted by $b_1$ and $b_2$. These numbers are called ordinary least squares (OLS) estimates.

Statistical Note

A numeric data set can be viewed as one sample from the population. Suppose repeated sampling from the population was possible. A different sample will have a different set of numeric data. This means that different samples will yield different least squares point estimates.

It is useful to note that the slope estimator can be expressed in a number of equivalent ways. Equation (4) can be written as:

$b_2 = \dfrac{\sum_{i=1}^{N} (x_i - \bar{x}) y_i}{\sum_{i=1}^{N} (x_i - \bar{x})^2}$   (4a)

Another equivalent formula is:

$b_2 = \dfrac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}$   (4b)

Equation (4b) can be used to get Equation (4) by using the results:

$\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{N} x_i y_i - \bar{y} \sum_{i=1}^{N} x_i - \bar{x} \sum_{i=1}^{N} y_i + N \bar{x}\bar{y} = \sum_{i=1}^{N} x_i y_i - N \bar{x}\bar{y}$

and

$\sum_{i=1}^{N} (x_i - \bar{x})^2 = \sum_{i=1}^{N} x_i^2 - 2\bar{x} \sum_{i=1}^{N} x_i + N \bar{x}^2 = \sum_{i=1}^{N} x_i^2 - N \bar{x}^2$

Another statement for the slope estimator is:

$b_2 = \dfrac{\mathrm{cov}(x, y)}{\mathrm{var}(x)}$

where the sample variance is:

$\mathrm{var}(x) = \dfrac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$

and the sample covariance is:

$\mathrm{cov}(x, y) = \dfrac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})$

Note the divisor is $N - 1$.
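The equivalent forms of the slope estimator can be checked numerically. A small sketch, continuing the simulated sample above (`ddof=1` gives the $N-1$ divisor used in the sample variance and covariance):

```python
# Deviation-from-means form (4b)
b2_dev = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)

# Sample covariance / sample variance form
b2_cov = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

print(np.allclose(b2, b2_dev), np.allclose(b2, b2_cov))  # True True
```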

The least squares fitted values or predicted values are:

$\hat{y}_i = b_1 + b_2 x_i$ for $i = 1, \ldots, N$

The least squares residuals are:

$\hat{e}_i = y_i - \hat{y}_i = y_i - b_1 - b_2 x_i$ for $i = 1, \ldots, N$

Properties of least squares estimation

- The least squares fitted line passes through the sample means $(\bar{x}, \bar{y})$.
- The average value of $\hat{y}_i$ is the sample mean $\bar{y}$. That is, $\frac{1}{N} \sum_{i=1}^{N} \hat{y}_i = \bar{y}$.
- The sum of the residuals is zero. That is, $\sum_{i=1}^{N} \hat{e}_i = 0$.

Note: the above properties require that an equation intercept $\beta_1$ is included in the linear regression equation.
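These properties can be confirmed on the simulated sample from the earlier sketches (illustrative data, not the food expenditure data discussed below):

```python
# Fitted values and residuals
y_hat = b1 + b2 * x
e_hat = y - y_hat

# Properties that hold when an intercept is included
print(np.isclose(e_hat.sum(), 0.0))        # residuals sum to zero
print(np.isclose(y_hat.mean(), ybar))      # average fitted value equals ybar
print(np.isclose(b1 + b2 * xbar, ybar))    # fitted line passes through (xbar, ybar)
```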

Interpreting the Estimates

The intercept estimate $b_1$

The intercept estimate may not have a meaningful economic interpretation if the sample observations do not have values around $x = 0$.

Example
For the household expenditure function, $y$ is weekly household expenditure on food (in dollars) and $x$ is weekly income (in $100). The estimated linear regression equation is:

$\hat{y}_i = 83.42 + 10.21\, x_i$ for $i = 1, \ldots, 40$

As a first guess, you may say that a household with zero income ($x = 0$) will spend about $83.42 each week on food. Think again. An inspection of the data set shows that, for the sample of 40 households, the minimum weekly household income is 3.69 ($369) and the maximum income is 33.40 ($3,340). The fitted regression line may not be useful for predicting food expenditure at levels of income below the minimum observed value or exceeding the maximum level in the data set.

The slope estimate $b_2$

This gives a marginal impact: the estimated increase in the mean of the dependent variable $y$ for a one unit increase in the explanatory variable $x$.

Example
For the estimated food expenditure equation the slope estimate is 10.21. The economic interpretation is: for a typical household, weekly expenditure on food increases by about $10.21 for an additional $100 in income.

For reporting purposes, values can be rescaled. For example, an equivalent statement is: a $20 increase in weekly household income leads to an increase in weekly expenditure on food of about $2.04.

Elasticity

Economists are familiar with the measure of elasticity, defined as the percentage increase in $y$ for a one percent increase in $x$. An elasticity can be obtained as:

$\varepsilon = \dfrac{d\ln(y)}{d\ln(x)} = \dfrac{dy}{dx} \cdot \dfrac{x}{y} = \beta_2 \dfrac{x}{y}$

This will vary at every sample observation. How can a summary measure be found? A fast method is to evaluate the elasticity at the sample means $(\bar{x}, \bar{y})$. This gives an elasticity estimate calculated as:

$\hat{\varepsilon} = b_2 \dfrac{\bar{x}}{\bar{y}}$

For the household food expenditure example:

$\hat{\varepsilon} = b_2 \dfrac{\bar{x}}{\bar{y}} = 10.21 \times \dfrac{19.60}{283.57} = 0.71$

This says that a 1% increase in household income will lead, on average, to a 0.71% increase in weekly food expenditure. An estimated income elasticity less than one suggests that food is a necessity rather than a luxury good.
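A short sketch of this calculation, using only the summary numbers reported above for the food expenditure example:

```python
# Elasticity evaluated at the sample means of the food expenditure example
b2_food, xbar_food, ybar_food = 10.21, 19.60, 283.57
elasticity = b2_food * xbar_food / ybar_food
print(round(elasticity, 2))   # 0.71
```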

There can be more than one method for getting results. Here is another method for estimating an elasticity from the least squares estimation results.

Define the observations:

$z_i = \dfrac{x_i}{y_i}$ for $i = 1, \ldots, N$

The sample mean is:

$\bar{z} = \dfrac{1}{N} \sum_{i=1}^{N} z_i$

An elasticity estimate is:

$\hat{\varepsilon} = \bar{z}\, b_2$

For the food expenditure estimation results:

$\hat{\varepsilon} = \bar{z}\, b_2 = (0.074)(10.21) = 0.76$

This method gives a different numerical answer compared to the previous calculation.
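The same comparison can be made on the simulated sample from the earlier sketches (an illustration only; the food expenditure data itself is not reproduced here):

```python
# Alternative elasticity estimate: mean of z_i = x_i / y_i times the slope
z = x / y
elasticity_alt = z.mean() * b2

# Elasticity at the sample means, for comparison
elasticity_at_means = b2 * xbar / ybar
print(elasticity_alt, elasticity_at_means)   # close, but generally not identical
```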

Assessing the Least Squares Estimators

The model of economic behaviour is expressed as the linear regression equation:

$y_i = \beta_1 + \beta_2 x_i + e_i$ for $i = 1, 2, \ldots, N$

where
- $y_i$ and $x_i$ are observable variables,
- $y_i$ is the dependent variable,
- $x_i$ is the explanatory variable,
- $\beta_1$ and $\beta_2$ are unknown parameters (coefficients),
- $\beta_1$ is the intercept coefficient,
- $\beta_2$ is the slope coefficient,
- $e_i$ is a random error.

The method of ordinary least squares (OLS) finds an estimation rule for $\beta_1$ and $\beta_2$ to minimize the sum of squared errors:

$S = \sum_{i=1}^{N} e_i^2 = \sum_{i=1}^{N} (y_i - \beta_1 - \beta_2 x_i)^2$

The solution gives the least squares (OLS) estimators $b_1$ and $b_2$.

The predicted or fitted values are $\hat{y}_i = b_1 + b_2 x_i$ for $i = 1, \ldots, N$.

The residuals are $\hat{e}_i = y_i - \hat{y}_i = y_i - b_1 - b_2 x_i$ for $i = 1, \ldots, N$.

The estimators $b_1$ and $b_2$ are functions of the $y_i$ and $x_i$. $e_i$ is viewed as a random variable. Therefore $y_i$ is a random variable, and $b_1$ and $b_2$ are also random variables whose statistical properties can be analyzed.

To establish some statistical results a number of assumptions are required. The standard assumptions are:

(1) The linear regression equation is correctly specified as:

$y_i = \beta_1 + \beta_2 x_i + e_i$

(2) $E(e_i) = 0$ for all $i$ ($i = 1, 2, \ldots, N$)

This says the random errors have zero mean. That is, any omitted variables that are captured in $e_i$ do not systematically affect the mean value of $y$.

(3) $\mathrm{var}(e_i) = \sigma^2$ (sigma-squared) for all $i$

This says equal error variance for all observations. This is called homoskedasticity (equal spread). Note that assumption (2) implies

$\mathrm{var}(e_i) = E\!\left[ (e_i - E(e_i))^2 \right] = E(e_i^2)$

(4) $\mathrm{cov}(e_i, e_j) = 0$ for all $i \neq j$

This says the covariance between any two errors is zero. Note that assumption (2) implies

$\mathrm{cov}(e_i, e_j) = E\!\left[ (e_i - E(e_i))(e_j - E(e_j)) \right] = E(e_i e_j)$

The correlation between two errors is defined as:

$\mathrm{corr}(e_i, e_j) = \dfrac{\mathrm{cov}(e_i, e_j)}{\sqrt{\mathrm{var}(e_i)\,\mathrm{var}(e_j)}}$

This shows zero covariance is equivalent to uncorrelated errors.

(5) $x$ must have at least two different values. That is, $\mathrm{var}(x) > 0$.

(5*) $x$ is non-random or non-stochastic. That is, the $x_i$ values are fixed in repeated sampling. This means

$\mathrm{cov}(e_i, x_i) = E\!\left[ (e_i - E(e_i))(x_i - E(x_i)) \right] = (x_i - E(x_i))\, E(e_i) = 0$

This says the error is uncorrelated with the explanatory variable.

The above assumptions can now be used to establish the statistical properties of the least squares (OLS) estimator. The focus of the presentation will be the slope estimator $b_2$. Similar results can be obtained for the intercept estimator $b_1$.

If the standard assumptions are satisfied then $b_2$ is an unbiased estimator of $\beta_2$. That is,

$E(b_2) = \beta_2$

This result can be shown. Introduce the weights

$w_i = \dfrac{x_i - \bar{x}}{\sum_{i=1}^{N} (x_i - \bar{x})^2}$ for $i = 1, 2, \ldots, N$

The $w_i$ have the properties

$\sum_{i=1}^{N} w_i = 0$ and $\sum_{i=1}^{N} w_i x_i = 1$

A result is $\sum_{i=1}^{N} (x_i - \bar{x}) = 0$. Therefore $\sum_{i=1}^{N} w_i = 0$.

Also $\sum_{i=1}^{N} w_i x_i = \sum_{i=1}^{N} w_i (x_i - \bar{x}) = \dfrac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2} = 1$.

Now state the slope estimator as:

$b_2 = \dfrac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2} = \sum_{i=1}^{N} w_i (y_i - \bar{y}) = \sum_{i=1}^{N} w_i y_i - \bar{y} \sum_{i=1}^{N} w_i = \sum_{i=1}^{N} w_i y_i$

This shows that $b_2$ is a linear function of the $y_i$: a weighted average of the $y_i$ with the $w_i$ as weights.

Use assumption (1) to substitute for the $y_i$ to get

$b_2 = \sum_{i=1}^{N} w_i (\beta_1 + \beta_2 x_i + e_i) = \beta_1 \sum_{i=1}^{N} w_i + \beta_2 \sum_{i=1}^{N} w_i x_i + \sum_{i=1}^{N} w_i e_i = \beta_2 + \sum_{i=1}^{N} w_i e_i$
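The properties of the weights are easy to check numerically. Continuing the simulated sample from the earlier sketches (names are illustrative):

```python
# Weights used in the unbiasedness argument
w = (x - xbar) / np.sum((x - xbar) ** 2)

print(np.isclose(w.sum(), 0.0))          # sum of w_i equals zero
print(np.isclose(np.sum(w * x), 1.0))    # sum of w_i * x_i equals one
print(np.isclose(np.sum(w * y), b2))     # b2 is the linear combination sum of w_i * y_i
```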

Take expectations to find

$E(b_2) = E\!\left( \beta_2 + \sum_{i=1}^{N} w_i e_i \right) = \beta_2 + \sum_{i=1}^{N} w_i E(e_i)$   [use assumption (5*): the $w_i$ are non-random; $\beta_2$ is a constant]

$= \beta_2$   [use assumption (2): $E(e_i) = 0$ for all $i$]

This says the slope estimator $b_2$ is an unbiased estimator of $\beta_2$.

What does this mean? With a sample of numeric data a slope estimate can be calculated. This estimate will be smaller or larger than the true unknown population value $\beta_2$. Another sample of observations will yield a different slope estimate that again will be smaller or larger than the true parameter. In repeated sampling, the average of all the calculated slope estimates will equal $\beta_2$.

The variance of the slope estimator can be found as follows:

$\mathrm{var}(b_2) = \mathrm{var}\!\left( \beta_2 + \sum_{i=1}^{N} w_i e_i \right) = \mathrm{var}\!\left( \sum_{i=1}^{N} w_i e_i \right)$   [$\beta_2$ is a constant]

$= \sum_{i=1}^{N} w_i^2\, \mathrm{var}(e_i)$   [use assumptions (5*) and (4)]

$= \sigma^2 \sum_{i=1}^{N} w_i^2$   [use assumption (3): $\mathrm{var}(e_i) = \sigma^2$]

$= \dfrac{\sigma^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2}$   [substitute for the $w_i$]
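Unbiasedness and the variance formula can both be illustrated by repeated sampling with the x values held fixed (assumption 5*). This is a simulation sketch under the assumed parameter values from the earlier blocks, not a result from the notes:

```python
# Repeated sampling: redraw the errors many times with x held fixed
beta1, beta2, sigma = 80.0, 10.0, 8.0
reps = 10_000
slopes = np.empty(reps)
for r in range(reps):
    y_r = beta1 + beta2 * x + rng.normal(0, sigma, N)
    slopes[r] = np.sum((x - xbar) * (y_r - y_r.mean())) / np.sum((x - xbar) ** 2)

print(slopes.mean())                                        # close to beta2 = 10 (unbiased)
print(slopes.var(), sigma ** 2 / np.sum((x - xbar) ** 2))   # simulated vs. theoretical variance
```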

The variance gives a measure of the precision of the estimator. Inspection of the variance formula shows the following:

- An increase in sample size $N$ generally leads to lower variance. This holds since $\sum_{i=1}^{N} (x_i - \bar{x})^2$ increases as $N$ increases. This gives increased precision of the estimator.
- The greater the variability in $x$ the more precise is the estimator. This holds since the variance of the slope estimator can be expressed as $\mathrm{var}(b_2) = \dfrac{\sigma^2}{(N-1)\,\mathrm{var}(x)}$.
- The smaller the variability in $y$ (as reflected in $\sigma^2$) the more precise is the estimator.

The Gauss-Markov Theorem

It has been shown that the least squares (OLS) estimator $b_2$ is a linear unbiased estimator of $\beta_2$. Here, linear means that $b_2$ is a weighted average of the $y_i$.

The Gauss-Markov theorem says: if the standard set of assumptions is satisfied, then the least squares estimator has minimum variance in the class of linear unbiased estimators. That is, the least squares estimator is BLUE (Best Linear Unbiased Estimator). Best means minimum variance.

What does this mean? Suppose $b_2^*$ is another linear unbiased estimator of $\beta_2$. That is,

$b_2^* = \sum_{i=1}^{N} k_i y_i$

where the $k_i$ are some weights that are different from the $w_i$. Also $E(b_2^*) = \beta_2$.

The Gauss-Markov theorem says

$\mathrm{var}(b_2^*) > \mathrm{var}(b_2)$

If any of the standard assumptions are violated then the least squares method may not be the best. There may be an estimator with lower variance, but it is either biased or not linear in $y$.
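One way to see the theorem at work is to simulate a competing linear unbiased estimator. The sketch below uses a simple grouping (Wald-type) estimator that splits the fixed x values at their median; this competitor is introduced here purely for illustration and is not an estimator discussed in the notes:

```python
# A competing linear unbiased estimator b2*: difference of group means of y
# over difference of group means of x, with groups defined by sorting x
order = np.argsort(x)
low, high = order[:N // 2], order[N // 2:]
denom = x[high].mean() - x[low].mean()

slopes_alt = np.empty(reps)
for r in range(reps):
    y_r = beta1 + beta2 * x + rng.normal(0, sigma, N)
    slopes_alt[r] = (y_r[high].mean() - y_r[low].mean()) / denom

print(slopes_alt.mean())                 # also close to beta2 = 10: unbiased
print(slopes_alt.var() > slopes.var())   # True: larger variance than least squares
```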

Estimating the Variance of the Error Term

Now take another look at the variance of the slope estimator:

$\mathrm{var}(b_2) = \dfrac{\sigma^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2}$

The error variance $\sigma^2$ is unknown. An estimator for $\sigma^2$ is needed.

The sum of squared residuals is:

$SSE = \sum_{i=1}^{N} \hat{e}_i^2 = \sum_{i=1}^{N} (y_i - b_1 - b_2 x_i)^2$

An unbiased estimator for $\sigma^2$ is:

$\hat{\sigma}^2 = \dfrac{SSE}{N - 2} = \dfrac{\sum_{i=1}^{N} \hat{e}_i^2}{N - 2}$

The divisor $N - 2$ is the number of degrees of freedom in the sum of squares. The degrees of freedom (df) is the number of independent pieces of information used to compile the sum of squares from $N$ observations. For this application, two degrees of freedom are lost for the two estimated parameters (the intercept and the slope).

Now replace the unknown error variance $\sigma^2$ with $\hat{\sigma}^2$ to get an estimator for $\mathrm{var}(b_2)$:

$\widehat{\mathrm{var}}(b_2) = \dfrac{\hat{\sigma}^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2}$

The standard error of the slope estimator $b_2$ is:

$\mathrm{se}(b_2) = \sqrt{\widehat{\mathrm{var}}(b_2)}$
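Finally, the error variance estimator and the standard error of the slope can be computed for the single simulated sample used in the earlier sketches (illustrative values; with the assumed sigma = 8, the estimate of sigma-squared should be near 64):

```python
# Error variance estimate and standard error of the slope estimator
sse_resid = np.sum(e_hat ** 2)                     # sum of squared residuals
sigma2_hat = sse_resid / (N - 2)                   # unbiased estimator, divisor N - 2
var_b2_hat = sigma2_hat / np.sum((x - xbar) ** 2)
se_b2 = np.sqrt(var_b2_hat)

print(sigma2_hat)   # estimate of sigma^2
print(se_b2)        # standard error of b2
```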