Lesson 11: Simple Linear Regression

ECON1003 Lesson 11: Simple Linear Regression
Ka-fu WONG
December 2, 2004

In previous lessons, we mainly covered the estimation of the population mean (or expected value) and inference about it. Sometimes we are interested in allowing the expected value to vary with other variables. For instance, in the discussion of mean income, we may want to know how income is related to the level of education. Knowing this relationship allows us to better predict a person's income given his or her educational background. In addition, Human Capital Theory in Economics tells us there should be a positive relation between income and education, which is viewed as human capital. [1]

[1] See Becker, Gary (1964): Human Capital, 1st edition (NBER).

Recall that if we have several random variables described by a multivariate distribution, we can talk about conditional expectations. Recall the definition of conditional expectation for the discrete case with two random variables X and Y.

Definition 1 (Conditional Expectation): For two discrete random variables that are jointly distributed with a bivariate probability distribution, the conditional expectation or conditional mean $E(X \mid Y = y_j)$ is computed by the formula

$$E(X \mid Y = y_j) = \sum_x x P_{X \mid Y}(x \mid y_j) = x_1 P_{X \mid Y}(x_1 \mid y_j) + x_2 P_{X \mid Y}(x_2 \mid y_j) + \dots + x_N P_{X \mid Y}(x_N \mid y_j)$$

Sometimes we write $\mu_{X \mid Y = y_j} = E(X \mid Y = y_j)$.

The unconditional expectation or mean of X is related to the conditional mean through the marginal distribution of Y:

$$E(X) = \sum_y E(X \mid Y = y) P_Y(y) = E[E(X \mid Y)]$$
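The relation between the unconditional and the conditional mean can be checked numerically. The sketch below uses a small joint distribution whose probabilities are made up for illustration (they are not from the notes); the helper names `marginal_y` and `cond_mean_x` are likewise hypothetical.

```python
# Hypothetical joint distribution P(X = x, Y = y); the probabilities are
# made up for illustration and sum to one.
joint = {
    (1, 0): 0.10, (2, 0): 0.30,
    (1, 1): 0.40, (2, 1): 0.20,
}

def marginal_y(joint):
    """Marginal distribution P(Y = y), obtained by summing over x."""
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    return p_y

def cond_mean_x(joint, y):
    """Conditional mean E(X | Y = y) = sum over x of x * P(x | y)."""
    p_y = marginal_y(joint)[y]
    return sum(x * p / p_y for (x, yy), p in joint.items() if yy == y)

p_y = marginal_y(joint)
# Average the conditional means using the marginal distribution of Y.
iterated = sum(cond_mean_x(joint, y) * p for y, p in p_y.items())
# Unconditional mean computed directly from the joint distribution.
direct = sum(x * p for (x, _), p in joint.items())

print("E(X | Y = 0), E(X | Y = 1):", cond_mean_x(joint, 0), cond_mean_x(joint, 1))
print("E[E(X | Y)] and E(X):", iterated, direct)
```

The two printed means agree up to floating-point rounding, which is what the formula says should happen for any valid joint distribution.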

For continuous random variables, the conditional expectation or conditional mean $E(X \mid Y = y)$ is computed by the formula

$$E(X \mid Y = y) = \int_x x f(x \mid y)\,dx$$

In the human capital example, Y will be income and X will be years of schooling.

When our interest is the expected value (also called the population mean) of a random variable, we use the sample average as an estimator. When our interest is the conditional expected value (also called the conditional population mean), we can use the conditional sample average as an estimator.

Example 1 (conditional sample average I): Suppose we have the following sample of 10 observations with two variables, monthly earnings (Y, in dollars) and gender (X, male = 1, female = 2).

Obs #   X   Y        Obs #   X   Y
1       1   3000     6       1   5000
2       1   4000     7       2   4000
3       1   4500     8       2   2500
4       1   6000     9       2   5000
5       1   3500     10      2   4000

The sample average income conditional on male is

$$(y_1 + y_2 + \dots + y_6)/6 = (3000 + 4000 + 4500 + 6000 + 3500 + 5000)/6 = 4333.33$$

The sample average income conditional on female is

$$(y_7 + y_8 + \dots + y_{10})/4 = (4000 + 2500 + 5000 + 4000)/4 = 3875.00$$

Based on these sample averages, we conclude that a typical male earns 4333.33 dollars per month while a typical female earns 3875.00 dollars.

Gender (X)   Conditional mean E(Y | X = x)   Conditional sample mean Ê(Y | X = x)
1            E(Y | X = 1)                    4333.33
2            E(Y | X = 2)                    3875.00

The general formula to compute the conditional sample average is

$$\hat{E}(Y \mid X = x) = \frac{\sum_{i:\, x_i = x} y_i}{\#(X = x)}$$

where $\#(X = x)$ is the number of observations with $x_i = x$.
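The general formula can be sketched in a few lines of Python. The data are the ten observations of Example 1; the function name `conditional_sample_means` is an illustrative choice, not something defined in the notes.

```python
from collections import defaultdict

# Data from Example 1: gender code X (male = 1, female = 2) and monthly earnings Y.
x = [1, 1, 1, 1, 1, 1, 2, 2, 2, 2]
y = [3000, 4000, 4500, 6000, 3500, 5000, 4000, 2500, 5000, 4000]

def conditional_sample_means(x, y):
    """Return {value of X: average of the y_i whose x_i equals that value}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for xi, yi in zip(x, y):
        totals[xi] += yi
        counts[xi] += 1
    return {value: totals[value] / counts[value] for value in totals}

means = conditional_sample_means(x, y)
print({value: round(mean, 2) for value, mean in means.items()})
# prints {1: 4333.33, 2: 3875.0}
```

Each group average uses only the observations in that group, which is why groups with few observations give imprecise estimates.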

Example 2 (conditional sample average II): Suppose we have the following sample of 40 observations with two variables, monthly earnings (Y, in dollars) and years of schooling (X, from 0 to 20).

Obs #  X   Y       Obs #  X   Y       Obs #  X   Y       Obs #  X   Y
1      0   3044    11     5   4106    21     12  5505    31     15  6066
2      0   3009    12     6   4334    22     12  5418    32     16  6327
3      1   3336    13     8   4602    23     12  5531    33     17  6551
4      1   3336    14     8   4677    24     13  5683    34     18  6798
5      2   3587    15     8   4613    25     13  5740    35     18  6673
6      2   3477    16     9   4883    26     14  5982    36     18  6649
7      2   3504    17     9   4937    27     14  5952    37     18  6797
8      3   3766    18     10  5118    28     15  6052    38     19  6945
9      4   3807    19     11  5386    29     15  6009    39     20  7151
10     4   3811    20     12  5569    30     15  6147    40     20  7130

The sample average income conditional on 0 years of schooling is

$$(y_1 + y_2)/2 = (3044 + 3009)/2 = 3026.50$$

Similarly, we can compute the sample average income conditional on different years of schooling.

X    Ê(Y | X = x)   #obs     X    Ê(Y | X = x)   #obs
0    3026.50        2        11   5386.00        1
1    3336.00        2        12   5505.75        4
2    3522.67        3        13   5711.50        2
3    3766.00        1        14   5967.00        2
4    3809.00        2        15   6068.50        4
5    4106.00        1        16   6327.00        1
6    4334.00        1        17   6551.00        1
8    4630.67        3        18   6729.25        4
9    4910.00        2        19   6945.00        1
10   5118.00        1        20   7140.50        2

We make several observations about the above example.

1. Some of the conditional sample averages are based on only one observation. Using such an average as an estimate of the conditional expectation is very imprecise.
2. Some of the conditional sample averages are missing because of the unavailability of data. For instance, we have no observation with X = 7.

What can we do to improve our estimation of the conditional mean? It turns out that the estimation may be improved if we are willing to assume some relationship between X and Y. A linear relationship is commonly assumed between two variables:

$$E(Y \mid X) = \beta_0 + \beta_1 X \tag{1}$$

In Example 2, we used 40 observations to produce 20 conditional means. On average, we have two observations to estimate each conditional mean, and we could not estimate the population mean conditional on X = 7 because there is no observation with X = 7. If we are willing to assume a linear relationship between $E(Y \mid X)$ and X as in equation (1), we only need to estimate two parameters, $\beta_0$ and $\beta_1$. On average, we will be using 20 observations to estimate one parameter. Once we have the estimates of $\beta_0$ and $\beta_1$, we can produce the conditional mean of Y for each X we are interested in. In addition, we would be able to estimate the population mean conditional on X = 7 even though we have no observation with X = 7. [2]

[2] Following this logic, the estimation of the conditional mean will be improved if we are willing to assume any relationship between X and Y such that the number of parameters is greatly reduced. For instance, we may assume $E(Y \mid X) = \beta_0 + \beta_1 X + \beta_2 X^2$. However, note that if the true conditional expectation is not related to X as assumed, using the assumed relationship to estimate $E(Y \mid X)$ will be wrong. Thus, the choice of the functional form of $E(Y \mid X)$ is extremely important. That is why we often check the linearity assumption by doing a scatter plot of Y against X. When $E(Y \mid X)$ is assumed to have a specific functional form with a set of parameters, the regression is called parametric. When $E(Y \mid X)$ is not assumed to have any specific functional form (and hence no assumed parameters), the regression is called nonparametric.

Simulation 1 (Linear expectation): We simulate 10 observations for each X, with different variances of the error term.

$$E(Y \mid X) = 3 + 2X, \qquad \epsilon \sim N(0, \sigma^2), \qquad Y = E(Y \mid X) + \epsilon$$

X is assumed to take the discrete values 1, 2, ..., 9. The observations so generated are plotted in Figure 1. Note that the expected values lie on a straight line.
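A minimal sketch of how data like those in Simulation 1 could be generated, using only Python's standard library. The seed, the function names, and the idea of passing the mean function as an argument are assumptions made for illustration; the notes do not say how the original simulation was produced.

```python
import random

def linear_mean(x):
    """Conditional mean of Simulation 1: E(Y | X = x) = 3 + 2x."""
    return 3 + 2 * x

def simulate(mean_fn, sigma2, n_per_x=10, seed=0):
    """Draw n_per_x values of Y = E(Y | X) + eps for each X in 1, ..., 9."""
    rng = random.Random(seed)      # fixed seed (an assumption) for reproducibility
    sigma = sigma2 ** 0.5          # gauss() takes a standard deviation
    return [(x, mean_fn(x) + rng.gauss(0.0, sigma))
            for x in range(1, 10) for _ in range(n_per_x)]

def conditional_means(data):
    """Conditional sample average of Y for each observed value of X."""
    groups = {}
    for x, y in data:
        groups.setdefault(x, []).append(y)
    return {x: sum(ys) / len(ys) for x, ys in groups.items()}

for sigma2 in (1, 4):
    sample = simulate(linear_mean, sigma2)
    print("sigma^2 =", sigma2)
    for x, avg in conditional_means(sample).items():
        print("  X =", x, " sample mean =", round(avg, 2), " E(Y | X) =", linear_mean(x))
```

With the larger error variance the conditional sample averages scatter more widely around the line 3 + 2X, which is the pattern the two panels of the figure are meant to show.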

[Figure 1: Distribution of data from a linear regression model. Panels: (a) $\sigma^2 = 1$, (b) $\sigma^2 = 4$. Horizontal axis: X from 1 to 9.]

Simulation 2 (Non-linear expectation): We simulate 10 observations for each X, with different variances of the error term.

$$E(Y \mid X) = 3 + 2X + 2.5X^2, \qquad \epsilon \sim N(0, \sigma^2), \qquad Y = E(Y \mid X) + \epsilon$$

X is assumed to take the discrete values 1, 2, ..., 9. The observations so generated are plotted in Figure 2. Note that the expected values lie on the curve.

[Figure 2: Distribution of data from a non-linear regression model. Panels: (a) $\sigma^2 = 1$, (b) $\sigma^2 = 100$. Horizontal axis: X from 1 to 9.]

Example 3 (Which datasets are from a linear expectation model?): Guess which datasets are likely to be from a linear expectation model.

[Figure 3: Datasets from four different models. Panels: (a) Dataset #1, (b) Dataset #2, (c) Dataset #3, (d) Dataset #4. Horizontal axis: X from 1 to 9.]

It turns out that the datasets are drawn from the simulations reported above. Dataset #3 is unlikely to be from a linear model. However, one could easily conclude that Dataset #4 is from a linear model, because the nonlinearity is mild relative to the dispersion of the data.

Given that we believe that the underlying model is linear, how do we estimate $\beta_0$ and $\beta_1$?

1 Estimation of the simple linear model

There are at least two approaches to estimating the linear model:

1. The method of moments
2. Ordinary least squares

It turns out that the two approaches yield the same estimator for the parameters $\beta_0$ and $\beta_1$.

1.1 The method of moments

Suppose we have observations of (X, Y) pairs. We can imagine that the observations of Y are random draws from a normal distribution with mean $E(Y \mid X)$ and some variance $\sigma^2$, i.e.,

$$Y \sim N(E(Y \mid X), \sigma^2)$$

Let $\epsilon = Y - E(Y \mid X)$. We have

$$\epsilon \sim N(0, \sigma^2)$$

Thus, a random draw of Y is like a random draw of $\epsilon$ plus $E(Y \mid X)$, and the assumed linear model (1) means

$$Y = \beta_0 + \beta_1 X + \epsilon \tag{2}$$

Note that $\epsilon$ has zero mean, i.e., $E(\epsilon) = 0$. Thus, one can use the condition $E(\epsilon) = 0$ as one criterion to estimate the parameters. The problem is that one equation can be used to solve for only one coefficient (either $\beta_0$ or $\beta_1$). To solve for (get estimates of) two coefficients, we need another condition. One possibility is to assume that $\epsilon$ is drawn independently of X, so that $E(\epsilon \mid X) = 0$. $E(\epsilon \mid X) = 0$ implies $E(X\epsilon \mid X) = 0$ and hence $E(X\epsilon) = 0$. [3] Thus, we have two conditions, $E(\epsilon) = 0$ and $E(X\epsilon) = 0$:

$$E(\epsilon) = E(Y - \beta_0 - \beta_1 X) = E(Y) - \beta_0 - \beta_1 E(X) = 0$$

$$E(\epsilon X) = E[(Y - \beta_0 - \beta_1 X)X] = E(YX) - \beta_0 E(X) - \beta_1 E(X^2) = 0$$

[3] Note that $E(X\epsilon) = 0$, together with $E(\epsilon) = 0$, implies $Cov(X, \epsilon) = 0$.

Two equations are just enough to solve for the two coefficients $\beta_0$ and $\beta_1$.

If we have a data sample of n observations, how do we estimate $\beta_0$ and $\beta_1$? Note that $E(\cdot)$ is really the population average. In our estimation, we do not observe $\epsilon$, $\beta_0$ and $\beta_1$. What we have are only the observations $(x_i, y_i)$, $i = 1, \dots, n$. We can use the sample analog (i.e., replace the population average with the sample average) to estimate the parameters. That is, we define $e_i = y_i - b_0 - b_1 x_i$, compute the corresponding sample averages, and set them equal to zero. $b_0$, $b_1$ and $e_i$ are the sample analogs of $\beta_0$, $\beta_1$ and $\epsilon$ in the model. Our objective is to find $b_0$ and $b_1$, and hence $e_i$.

$$\hat{E}(e) = \frac{1}{n}\sum_{i=1}^{n} e_i = 0, \qquad \hat{E}(Xe) = \frac{1}{n}\sum_{i=1}^{n} x_i e_i = 0$$

Let's verify this method with something we are familiar with: the estimation of the population mean.

Example 4 (Estimation of the population mean): Suppose $Y \sim N(\beta_0, \sigma^2)$. Thus, $\beta_0$ is the population mean of Y. We have observations $y_i$, $i = 1, 2, \dots, n$. We want to estimate the population mean of Y. Fitting this into the linear model framework, we write

$$Y = \beta_0 + \epsilon \tag{3}$$

Thus, we have only one parameter to estimate, i.e., $\beta_0$. Let $b_0$ be an estimate of $\beta_0$. First, we write $e_i = y_i - b_0$. Second, we compute the sample average of $e_i$ and set it to zero.

$$\frac{1}{n}\sum_{i=1}^{n} e_i = 0 \;\Rightarrow\; \frac{1}{n}\sum_{i=1}^{n} (y_i - b_0) = 0 \;\Rightarrow\; \frac{1}{n}\sum_{i=1}^{n} y_i - b_0 = 0 \;\Rightarrow\; b_0 = \frac{1}{n}\sum_{i=1}^{n} y_i$$

Thus, the method yields the sample average as an estimator of $\beta_0$.

Example 5 (Estimation of the linear model): Suppose $Y \sim N(\beta_0 + \beta_1 X, \sigma^2)$. Thus, $\beta_0 + \beta_1 X$ is the population mean of Y conditional on X. We have observations $(x_i, y_i)$, $i = 1, \dots, n$. We want to estimate the linear relationship of the conditional population mean. This is exactly the linear model framework in equation (1). Thus, we have two parameters to estimate, i.e., $\beta_0$ and $\beta_1$. Let $b_0$ and $b_1$ be estimates of $\beta_0$ and $\beta_1$. First, we write $e_i = y_i - b_0 - b_1 x_i$. Second, we compute the sample averages of $e_i$ and $x_i e_i$ and set them to zero.

$$\frac{1}{n}\sum_{i=1}^{n} e_i = 0 \;\Rightarrow\; \frac{1}{n}\sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0 \;\Rightarrow\; \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0 \tag{4}$$

$$\frac{1}{n}\sum_{i=1}^{n} e_i x_i = 0 \;\Rightarrow\; \frac{1}{n}\sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)x_i = 0 \;\Rightarrow\; \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)x_i = 0 \tag{5}$$

Thus, the two equations (4) and (5) may be used to solve for the two unknowns $b_0$ and $b_1$.

This approach is called the method of moments because the estimation is based on matching the sample moments (sample averages) with the population moments ($E(\cdot)$).

1.2 The method of ordinary least squares

Another view is to find the line that best fits the data. In the linear model (2), we would like to choose $b_0$ and $b_1$ so that the error e is minimized.

$$e = Y - b_0 - b_1 X$$

When we have n observations of $(x_i, y_i)$ (and hence $e_i$), we will naturally have some positive $e_i$ and some negative $e_i$. An operational procedure to minimize e is to choose $b_0$ and $b_1$ such that the sum of squared errors is minimized.

$$S(b_0, b_1) = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$$

Minimizing $S(b_0, b_1)$ with respect to $b_0$ and $b_1$ yields the following two first-order conditions:

$$\frac{\partial S(b_0, b_1)}{\partial b_0} = \sum_{i=1}^{n} 2(y_i - b_0 - b_1 x_i)(-1) = 0 \;\Rightarrow\; \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0 \tag{6}$$

$$\frac{\partial S(b_0, b_1)}{\partial b_1} = \sum_{i=1}^{n} 2(y_i - b_0 - b_1 x_i)(-x_i) = 0 \;\Rightarrow\; \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)x_i = 0 \tag{7}$$

Note that these two conditions (6 and 7) are the same as the two conditions obtained with the method of moments approach (4 and 5).

2 The convenience of matrix notation

The use of matrices greatly simplifies our analysis. Our model

$$Y = \beta_0 + \beta_1 X + \epsilon$$

may be rewritten in matrix notation as

$$Y = \begin{pmatrix} 1 & X \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} + \epsilon = Z\beta + \epsilon$$

Premultiplying by $Z'$, we have

$$Z'Y = Z'Z\beta + Z'\epsilon$$

The condition used to estimate $\beta$ is $E(Z'\epsilon) = 0$. Hence

$$E(Z'Y) = E(Z'Z)\beta + E(Z'\epsilon) \;\Rightarrow\; E(Z'Y) = E(Z'Z)\beta \;\Rightarrow\; \beta = [E(Z'Z)]^{-1}E(Z'Y)$$
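The two conditions (4) and (5), equivalently (6) and (7), are linear in $b_0$ and $b_1$ and can be solved directly. The sketch below applies the standard closed-form solution, $b_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and $b_0 = \bar{y} - b_1 \bar{x}$, which the notes do not write out, to the 40 observations of Example 2. It then checks both conditions on the residuals and estimates the conditional mean at X = 7, a value with no observations. The variable names are illustrative.

```python
# Years of schooling (x) and monthly earnings (y) from Example 2, in observation order.
x = [0, 0, 1, 1, 2, 2, 2, 3, 4, 4,
     5, 6, 8, 8, 8, 9, 9, 10, 11, 12,
     12, 12, 12, 13, 13, 14, 14, 15, 15, 15,
     15, 16, 17, 18, 18, 18, 18, 19, 20, 20]
y = [3044, 3009, 3336, 3336, 3587, 3477, 3504, 3766, 3807, 3811,
     4106, 4334, 4602, 4677, 4613, 4883, 4937, 5118, 5386, 5569,
     5505, 5418, 5531, 5683, 5740, 5982, 5952, 6052, 6009, 6147,
     6066, 6327, 6551, 6798, 6673, 6649, 6797, 6945, 7151, 7130]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form solution of the two conditions for b0 and b1.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residuals e_i = y_i - b0 - b1 * x_i should satisfy both conditions.
residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
sum_e = sum(residuals)
sum_xe = sum(xi * ei for xi, ei in zip(x, residuals))

# Estimated conditional mean at X = 7, where the sample has no observation.
pred_7 = b0 + b1 * 7

print("b0 =", round(b0, 2), " b1 =", round(b1, 2))
print("sum of e_i =", sum_e, " sum of x_i * e_i =", sum_xe)
print("estimated E(Y | X = 7) =", round(pred_7, 2))
```

Both sums print as zero up to rounding error, and the estimate at X = 7 falls between the conditional sample averages for X = 6 and X = 8.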

Suppose we have a sample of n observations $(y_i, x_i)$, $i = 1, \dots, n$. We have

$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} + \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{pmatrix}$$

or, in compact form,

$$Y = Zb + e$$

Premultiplying by $Z'$, we have

$$Z'Y = Z'Zb + Z'e \;\Rightarrow\; Z'Y = Z'Zb \;\Rightarrow\; b = (Z'Z)^{-1}Z'Y \tag{8}$$

where $Z'e = 0$ because $Z'e$ is the sample analog of $E(Z'\epsilon)$, which is assumed to equal zero in the model.

3 Properties of the OLS estimator

For convenience, we use matrix notation to discuss the properties of the OLS estimator.

3.1 Unbiasedness

Recall the definition of unbiasedness.

Definition 2 (Unbiasedness): An estimator $\hat{\theta} = \hat{\theta}(x_1, x_2, \dots, x_n)$ for a population parameter $\beta$ is called unbiased if

$$E(\hat{\theta}) = \beta$$

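Thus, b is an unbiased estimator of $\beta$ if $E(b) = \beta$. In the following discussion, it is convenient to assume that the $x_i$ are fixed and known.

$$\begin{aligned}
E(b) &= E((Z'Z)^{-1}Z'Y) \\
&= E((Z'Z)^{-1}Z'(Z\beta + \epsilon)) \\
&= E((Z'Z)^{-1}Z'Z\beta) + E((Z'Z)^{-1}Z'\epsilon) \\
&= E((Z'Z)^{-1}Z'Z)\beta + E[(Z'Z)^{-1}Z'E(\epsilon \mid Z)] \\
&= \beta
\end{aligned}$$

Thus, b is unbiased if $E(\epsilon \mid Z) = 0$.

3.2 The estimators are normally distributed

Note that b is a ratio of sample means. If we assume that the $x_i$ are fixed and known, then b is a weighted average of the $y_i$. Thus, for samples with more than 30 observations, the Central Limit Theorem may be applied to conclude that b is approximately normally distributed. In showing unbiasedness, we have computed the mean of b. It remains to find the variance of b. Because $\beta$ is a constant, $V(\beta) = 0$ and its covariance with any random vector is zero.

$$\begin{aligned}
V(b) &= V((Z'Z)^{-1}Z'Y) \\
&= V((Z'Z)^{-1}Z'(Z\beta + \epsilon)) \\
&= V((Z'Z)^{-1}Z'Z\beta) + V((Z'Z)^{-1}Z'\epsilon) + 2\,Cov((Z'Z)^{-1}Z'Z\beta,\ (Z'Z)^{-1}Z'\epsilon) \\
&= V(\beta) + V((Z'Z)^{-1}Z'\epsilon) + 2\,Cov(\beta,\ (Z'Z)^{-1}Z'\epsilon) \\
&= V((Z'Z)^{-1}Z'\epsilon) \\
&= E[((Z'Z)^{-1}Z'\epsilon)((Z'Z)^{-1}Z'\epsilon)'] \\
&= E[(Z'Z)^{-1}Z'\epsilon\epsilon'Z(Z'Z)^{-1}] \\
&= (Z'Z)^{-1}Z'E(\epsilon\epsilon')Z(Z'Z)^{-1} \\
&= (Z'Z)^{-1}Z'(\sigma^2 I)Z(Z'Z)^{-1} \\
&= \sigma^2 (Z'Z)^{-1}Z'Z(Z'Z)^{-1} \\
&= \sigma^2 (Z'Z)^{-1}
\end{aligned}$$

The step $E(\epsilon\epsilon') = \sigma^2 I$ uses the assumption that the errors have the same variance $\sigma^2$ and are uncorrelated across observations.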
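The two results above, $E(b) = \beta$ and $V(b) = \sigma^2 (Z'Z)^{-1}$, can be checked with a small Monte Carlo experiment. This is a sketch under assumed settings, not part of the notes: X takes the values 1, ..., 9 with 10 observations each and is held fixed, $\beta = (3, 2)$, $\sigma^2 = 4$, and the number of replications and the seed are arbitrary choices.

```python
import random

def ols(x, y):
    """Return (b0, b1) solving the two least-squares normal equations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

def theoretical_cov(x, sigma2):
    """sigma^2 (Z'Z)^{-1} for Z = [1, x], using the 2x2 inverse formula."""
    n, s_x, s_xx = len(x), sum(x), sum(xi * xi for xi in x)
    det = n * s_xx - s_x * s_x
    return [[sigma2 * s_xx / det, -sigma2 * s_x / det],
            [-sigma2 * s_x / det, sigma2 * n / det]]

def monte_carlo(x, beta0, beta1, sigma2, reps=2000, seed=0):
    """Redraw the errors reps times with x held fixed; summarize the estimates."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    draws = []
    for _ in range(reps):
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        draws.append(ols(x, y))
    mean_b0 = sum(b0 for b0, _ in draws) / reps
    mean_b1 = sum(b1 for _, b1 in draws) / reps
    var_b1 = sum((b1 - mean_b1) ** 2 for _, b1 in draws) / (reps - 1)
    return mean_b0, mean_b1, var_b1

# Assumed design: X = 1..9 with 10 observations each, beta = (3, 2), sigma^2 = 4.
x = [v for v in range(1, 10) for _ in range(10)]
mean_b0, mean_b1, var_b1 = monte_carlo(x, 3.0, 2.0, 4.0)
print("average b0 and b1 over replications:", mean_b0, mean_b1)
print("simulated V(b1):", var_b1, " theoretical:", theoretical_cov(x, 4.0)[1][1])
```

The averages of the estimates should be close to 3 and 2, and the simulated variance of $b_1$ close to the lower-right element of $\sigma^2 (Z'Z)^{-1}$; the agreement is only approximate because the number of replications is finite.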

Thus, b is approximately normally distributed:

$$b \overset{A}{\sim} N(\beta, \sigma^2 (Z'Z)^{-1})$$

Usually $\sigma^2$ is unknown and has to be estimated based on $e_i = y_i - b_0 - b_1 x_i$.

$$S^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 2}$$

The square root of $S^2$ is called the standard error of estimate. Why do we have a denominator of $(n - 2)$ instead of $(n - 1)$ as in the usual estimate of the population variance? It is because $b_0$ and $b_1$ have to be estimated from the data. In the estimation of the population variance $\sigma^2$, these two numbers are assumed fixed. Hence, $(n - 2)$ reflects the loss of two degrees of freedom.

3.3 BLUE

The OLS estimator is also known to be the Best Linear Unbiased Estimator (BLUE).

Best, because V(b) is the smallest among all linear unbiased estimators of $\beta$.
Linear, because b is a linear function of the observations $y_i$ in the assumed linear model.
Unbiased, because $b_0$ and $b_1$ are unbiased estimators of $\beta_0$ and $\beta_1$.

4 Inference

Knowledge of the distribution of b allows us to do various kinds of inference. Constructing confidence intervals for $\beta$ and testing hypotheses about $\beta$ are straightforward. Let

$$b = \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} \sim N\left( \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \begin{pmatrix} V(b_0) & C(b_0, b_1) \\ C(b_0, b_1) & V(b_1) \end{pmatrix} \right) \tag{9}$$

4.1 Testing individual parameters

Often, we are interested in testing whether an individual population parameter is different from zero at the 5% level of significance. That is, $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$. The joint distribution of $b_0$ and $b_1$ shown in (9) suggests that $b_1 \sim N(\beta_1, V(b_1))$. Hence we reject the null if

$$\frac{b_1 - 0}{\sqrt{V(b_1)}} > 1.96 \quad \text{or} \quad \frac{b_1 - 0}{\sqrt{V(b_1)}} < -1.96$$

Sometimes, we would like to test whether an individual population parameter is different from one at the 5% level of significance. That is, $H_0: \beta_1 = 1$ versus $H_1: \beta_1 \neq 1$. The joint distribution of $b_0$ and $b_1$ shown in (9) suggests that $b_1 \sim N(\beta_1, V(b_1))$. Hence we reject the null if

$$\frac{b_1 - 1}{\sqrt{V(b_1)}} > 1.96 \quad \text{or} \quad \frac{b_1 - 1}{\sqrt{V(b_1)}} < -1.96$$

Testing hypotheses about $\beta_0$ is similar.

4.2 Testing a set of parameters

Suppose we are interested in testing whether the two population parameters are equal at the 5% level of significance. That is, $H_0: \beta_1 - \beta_0 = 0$ versus $H_1: \beta_1 - \beta_0 \neq 0$. The joint distribution of $b_0$ and $b_1$ shown in (9) suggests that

$$b_1 - b_0 \sim N(\beta_1 - \beta_0,\ V(b_1) + V(b_0) - 2C(b_0, b_1))$$

Hence we reject the null if

$$\frac{(b_1 - b_0) - 0}{\sqrt{V(b_1) + V(b_0) - 2C(b_0, b_1)}} > 1.96 \quad \text{or} \quad \frac{(b_1 - b_0) - 0}{\sqrt{V(b_1) + V(b_0) - 2C(b_0, b_1)}} < -1.96$$

5 How good is the model?

5.1 Goodness of fit

How well does the model fit the data? A model is a better fit to the data when the implied $e_i$ are small. Because the OLS estimators result from minimizing the sum of squared errors given $x_i$ and $y_i$, the estimator b gives the best fit for a given explanatory variable. However, there are alternative models using different x as explanatory variables. One would want a common measure to tell which explanatory variable yields the best fit.

Note that our aim is to predict y given x. Without x, we would use the sample mean of y as a prediction of y. In this case we have a sum of squared errors

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

With x, we use the sample conditional mean of y (i.e., $b_0 + b_1 x$) as a prediction of y. We have a sum of squared errors

$$SSE = \sum_{i=1}^{n} (y_i - (b_0 + b_1 x_i))^2$$

It can be shown that

$$SST = SSE + SSR$$

where SSR is the regression sum of squares

$$SSR = \sum_{i=1}^{n} ((b_0 + b_1 x_i) - \bar{y})^2$$

A natural measure of goodness of fit is

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

$R^2$ measures the percentage of the total sum of squared errors that can be explained by the explanatory variable(s) in a regression framework. Note that $R^2$ lies between 0 and 1. A higher $R^2$ means the explanatory variable (x) is better at predicting y.

$R^2 = 0$: the explanatory variable x is useless in predicting y.
$R^2 = 1$: the explanatory variable x predicts y perfectly.

$R^2$ is also known as the coefficient of determination. A relative of $R^2$ is the correlation coefficient r.

Definition 3 (Correlation coefficient): Suppose we have a sample of n observations $(x_i, y_i)$, $i = 1, 2, \dots, n$. The correlation coefficient (r) is a measure of the strength of the linear relationship between two variables x and y.

$$r = \frac{\dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}}{\sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}\ \sqrt{\dfrac{\sum (y_i - \bar{y})^2}{n - 1}}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2\ \sum (y_i - \bar{y})^2}}$$
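The quantities used for inference and goodness of fit can be computed together for the Example 2 data. The sketch below is illustrative rather than the author's procedure: it estimates $S^2$, takes the estimated variance of $b_1$ as $S^2 / \sum (x_i - \bar{x})^2$ (the lower-right element of $S^2 (Z'Z)^{-1}$ in the two-variable case), forms the test statistic for $H_0: \beta_1 = 0$ with the 1.96 cutoff used in the notes, and compares $R^2$ with the squared correlation coefficient.

```python
import math

# Years of schooling (x) and monthly earnings (y) from Example 2, in observation order.
x = [0, 0, 1, 1, 2, 2, 2, 3, 4, 4,
     5, 6, 8, 8, 8, 9, 9, 10, 11, 12,
     12, 12, 12, 13, 13, 14, 14, 15, 15, 15,
     15, 16, 17, 18, 18, 18, 18, 19, 20, 20]
y = [3044, 3009, 3336, 3336, 3587, 3477, 3504, 3766, 3807, 3811,
     4106, 4334, 4602, 4677, 4613, 4883, 4937, 5118, 5386, 5569,
     5505, 5418, 5531, 5683, 5740, 5982, 5952, 6052, 6009, 6147,
     6066, 6327, 6551, 6798, 6673, 6649, 6797, 6945, 7151, 7130]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates and fitted values.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

# Sums of squares: total, error, and regression.
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ssr = sum((fi - y_bar) ** 2 for fi in fitted)

# Estimated error variance (n - 2 degrees of freedom) and variance of b1.
s2 = sse / (n - 2)
var_b1 = s2 / s_xx

# Test H0: beta1 = 0 against H1: beta1 != 0 at the 5% level.
z_stat = (b1 - 0) / math.sqrt(var_b1)
reject_h0 = abs(z_stat) > 1.96

# Goodness of fit and the correlation coefficient.
r_squared = ssr / sst
r = s_xy / math.sqrt(s_xx * sst)

print("b1 =", round(b1, 2), " test statistic =", round(z_stat, 2), " reject H0:", reject_h0)
print("SST =", round(sst, 2), " SSE + SSR =", round(sse + ssr, 2))
print("R^2 =", round(r_squared, 4), " r^2 =", round(r * r, 4))
```

The decomposition SST = SSE + SSR holds up to rounding error, and in a regression with one explanatory variable $R^2$ coincides with $r^2$.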

The correlation coefficient r can range from -1.00 to 1.00. Values of -1.00 or 1.00 indicate perfect correlation. Values close to 0.0 indicate weak correlation. Negative values indicate an inverse relationship and positive values indicate a direct relationship. It can be shown that the coefficient of determination ($R^2$) is the square of the correlation coefficient ($r^2$). Note that $R^2$ is more general and is valid for models with more than one explanatory variable, but the correlation coefficient applies only to two variables.

5.2 Validity of assumptions

5.2.1 The linearity assumption

We have assumed that $Y = \beta_0 + \beta_1 X + \epsilon$. Sometimes, economic theory or data suggest that the model may not be linear. For example, in the human capital example, it is often assumed that $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \epsilon$ and that Y is log monthly earnings instead of monthly earnings. How do we know whether linearity is a satisfactory assumption? We often check by doing a scatter plot of the data y against x. If the plot suggests non-linearity, we will have to revise our model.

5.2.2 Same variance for all observations: homoskedasticity

The observations $(x_i, y_i)$, $i = 1, \dots, n$, are assumed to be drawn from the same population

$$Y \sim N(E(Y \mid X), \sigma^2)$$

When this assumption is not correct, we will need to make some adjustment to our estimation and inference. The assumption may generally be checked by plotting the residuals ($e_i$) against x. If the residuals exhibit some pattern, we will try to transform the model or the data. To transform the data, we can define $y^* = \log(y)$ or $y^* = y^2$, etc. To adjust the model, one may add higher order terms (i.e., squared terms, cubic terms) to allow for nonlinearity.
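A numerical stand-in for the residual plot: the sketch below fits a straight line to noise-free data that lie on a curve and lists the residuals against x. The quadratic used to generate the data is a hypothetical choice for illustration, and the helper `fit_line` is not from the notes.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a straight-line fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

# Hypothetical noise-free data on a curve (the coefficients are assumptions).
x = list(range(1, 10))
y = [3 + 2 * xi + 2.5 * xi ** 2 for xi in x]

b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# A systematic sign pattern in the residuals signals a misspecified line.
for xi, ei in zip(x, residuals):
    print("x =", xi, " residual =", round(ei, 2))
```

The residuals are positive at both ends of the range of x and negative in the middle. A pattern of this kind, rather than residuals scattered evenly around zero, is what suggests adding a squared term or transforming the data.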