Statistical Properties of OLS Estimators


Linear Model: $Y_i = \beta_0 + \beta_1 X_i + u_i$

OLS estimators:

$\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$, $\hat\beta_1 = \dfrac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sum (X_i - \bar X)^2}$

Best Linear Unbiased Estimator (BLUE)

Linear Estimator: $\hat\beta_0$ and $\hat\beta_1$ are linear functions of the $Y_i$'s.

Consider the numerator term in $\hat\beta_1$. Note that $\sum (X_i - \bar X) = 0$, so $\sum (X_i - \bar X)\bar Y = \bar Y \sum (X_i - \bar X) = 0$. Therefore

$\hat\beta_1 = \dfrac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sum (X_i - \bar X)^2} = \dfrac{\sum (X_i - \bar X) Y_i}{\sum (X_i - \bar X)^2} = \sum w_i Y_i$, where $w_i = \dfrac{X_i - \bar X}{\sum_j (X_j - \bar X)^2}$

$\hat\beta_0 = \bar Y - \hat\beta_1 \bar X = \sum \tfrac{1}{n} Y_i - \sum w_i \bar X Y_i = \sum \left( \tfrac{1}{n} - w_i \bar X \right) Y_i$

Unbiased Estimator: $E(\hat\beta_0) = \beta_0$ and $E(\hat\beta_1) = \beta_1$.

Assumption 1. $E(u_i \mid X) = 0$

Then $E(Y_i \mid X) = \beta_0 + \beta_1 X_i$ and $E(u_i) = E[E(u_i \mid X)] = E[0] = 0$, where $X$ includes all $X_i$'s: $X = (X_1, X_2, \dots, X_n)$.

Note: $E(u_i \mid X) = 0$ implies that $\mathrm{cov}(u_i, X_j) = 0$; that is, the errors $u_i$ are not correlated with any $X_j$. This is based on the following general result.

Theorem. $E(Y \mid X) = E(Y)$ implies $\mathrm{cov}(X, Y) = 0$.

Proof. We use the law of iterated expectations. $\mathrm{cov}(X, Y) = E(XY) - E(X)E(Y)$, and

$E(XY) = E[E(XY \mid X)] = E[X E(Y \mid X)] = E[X E(Y)] = E(X)E(Y)$

The first equality is due to the law of iterated expectations; the second is due to the fact that, conditional on $X$, $X$ becomes a constant inside the expectation; the third is due to the proposed relationship $E(Y \mid X) = E(Y)$; the last is due to the fact that $E(Y)$ is a constant in the expectation with respect to $X$. q.e.d.
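The linearity property is easy to verify numerically. The following sketch (not part of the original notes; it assumes NumPy and simulated data, with illustrative parameter values) checks that the weight form $\sum w_i Y_i$ reproduces the ratio form of $\hat\beta_1$, and likewise for $\hat\beta_0$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = rng.normal(10.0, 2.0, n)
Y = 1.0 + 0.5 * X + rng.normal(0.0, 1.0, n)  # illustrative beta_0 = 1, beta_1 = 0.5

# Ratio form of the OLS slope
b1_ratio = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)

# Linear-in-Y form: beta1_hat = sum(w_i * Y_i)
w = (X - X.mean()) / np.sum((X - X.mean()) ** 2)
b1_linear = np.sum(w * Y)

# The intercept is also linear in Y: sum((1/n - w_i * Xbar) * Y_i)
b0_linear = np.sum((1.0 / n - w * X.mean()) * Y)

print(np.isclose(b1_ratio, b1_linear))                          # True
print(np.isclose(b0_linear, Y.mean() - b1_linear * X.mean()))   # True
```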

Example (wage rate of individuals): Individuals with 12 years of education ($X_i = 12$) get $\beta_0 + \beta_1 \cdot 12$ on the average. Some individuals get a higher and some a lower wage rate than the average rate, and the average of the deviations from the mean is zero.

Proposition: Under Assumption 1, the OLS estimators are unbiased.

Proof. The OLS estimators can be written as

(1.a) $\hat\beta_1 = \beta_1 + \sum w_i u_i$, where $w_i = \dfrac{X_i - \bar X}{\sum_j (X_j - \bar X)^2}$

(1.b) $\hat\beta_0 = \beta_0 - (\hat\beta_1 - \beta_1)\bar X + \bar u$

Taking conditional expectations,

$E(\hat\beta_1 \mid X) = \beta_1 + \sum w_i E(u_i \mid X) = \beta_1 + 0$

$E(\hat\beta_0 \mid X) = \beta_0 - (E(\hat\beta_1 \mid X) - \beta_1)\bar X + E(\bar u \mid X) = \beta_0 + 0 + 0$  q.e.d.

Proof of (1.b): Averaging $Y_i = \beta_0 + \beta_1 X_i + u_i$ over $i$ gives $\bar Y = \beta_0 + \beta_1 \bar X + \bar u$. Plug this into the equation for $\hat\beta_0$ and rearrange terms to get

$\hat\beta_0 = \bar Y - \hat\beta_1 \bar X = (\beta_0 + \beta_1 \bar X + \bar u) - \hat\beta_1 \bar X = \beta_0 - (\hat\beta_1 - \beta_1)\bar X + \bar u$

Proof of (1.a): Consider the numerator term of the $\hat\beta_1$ equation:

$\sum (X_i - \bar X)(Y_i - \bar Y) = \sum (X_i - \bar X) Y_i = \sum (X_i - \bar X)(\beta_0 + \beta_1 X_i + u_i) = \beta_1 \sum (X_i - \bar X) X_i + \sum (X_i - \bar X) u_i = \beta_1 \sum (X_i - \bar X)^2 + \sum (X_i - \bar X) u_i$

where we used the relationships

$\sum (X_i - \bar X)\bar Y = \sum (X_i - \bar X)\beta_0 = 0$

$\sum (X_i - \bar X) X_i = \sum (X_i - \bar X)(X_i - \bar X) = \sum (X_i - \bar X)^2$

Therefore

$\hat\beta_1 = \dfrac{\beta_1 \sum (X_i - \bar X)^2 + \sum (X_i - \bar X) u_i}{\sum (X_i - \bar X)^2} = \beta_1 + \sum w_i u_i$

Violation of Assumption 1
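As a numerical illustration of the proposition, a small Monte Carlo sketch (assuming NumPy; the sample size, parameter values, and distributions are illustrative choices, not from the notes) draws repeated samples satisfying $E(u_i \mid X) = 0$ and checks that the $\hat\beta_1$ draws average out to $\beta_1$:

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1 = 1.0, 0.5
n, reps = 50, 20_000

X = rng.uniform(0.0, 10.0, n)      # keep X fixed across replications
w = (X - X.mean()) / np.sum((X - X.mean()) ** 2)

estimates = np.empty(reps)
for r in range(reps):
    u = rng.normal(0.0, 1.0, n)    # E(u | X) = 0 by construction
    Y = beta0 + beta1 * X + u
    estimates[r] = np.sum(w * Y)   # OLS slope as a linear function of Y

print(estimates.mean())            # close to 0.5: the estimator is unbiased
```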

Omitted variable. An omitted variable may cause correlation between the error term and the explanatory variable. Consider a regression of the effect of the percentage of children who are eligible for the free lunch program (lunch) on the percentage (math) of 10th graders who pass the math exam. The regression result is

math = 32.14 - 0.319 lunch,  n = 408,  R^2 = 0.171

This indicates that a 10-point increase in the percentage of students who are eligible for free lunch will reduce the passing percentage by 3.19 percentage points. A policy implication is that the government must tighten the eligibility criteria to increase the passing percentage. This regression result doesn't seem to be right. It is likely that the explanatory variable (the percentage of eligible students) is correlated with the poverty level, school quality, and resources of the school, which are contained in the error term. This makes the OLS estimator biased.

Assumption 2. $(X_i, Y_i)$ are independently and identically distributed (i.i.d.).

Assumption 3. Large outliers are unlikely.

Assumption 2 means that samples are drawn randomly. This is possible in experiments; with observed data, we can only hope that the observations are reasonably independent. Independence of $Y_i$ and $Y_j$ means that $u_i$ and $u_j$ are independent. That is, the $u_i$'s are i.i.d. random variables, with

$E(u_i \mid X) = E(u_i) = 0$

Homoskedasticity: $\mathrm{var}(u_i \mid X) = \mathrm{var}(u_i) = \sigma_u^2$ (the same for all $i$)

Heteroskedasticity: $\mathrm{var}(u_i \mid X) = \mathrm{var}(u_i) = \sigma_i^2$ (varies with $i$)

Theorem (variance and covariance of the coefficient estimators under homoskedasticity). Under the assumptions listed above, the variances of the OLS estimators are given by

$\sigma_{\hat\beta_0}^2 = \sigma_u^2 Q_0$, $\sigma_{\hat\beta_1}^2 = \sigma_u^2 Q_1$, $\sigma_{\hat\beta_0,\hat\beta_1} = \mathrm{cov}(\hat\beta_0, \hat\beta_1) = -\bar X \, \mathrm{var}(\hat\beta_1) = -\dfrac{\sigma_u^2 \bar X}{\sum (X_i - \bar X)^2}$

where

$Q_0 = \dfrac{\sum X_i^2 / n}{\sum (X_i - \bar X)^2}$, $Q_1 = \dfrac{1}{\sum (X_i - \bar X)^2}$
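These variance formulas can also be checked by simulation. The following sketch (illustrative values, not from the notes; assumes NumPy) holds the design $X$ fixed, draws homoskedastic errors, and compares the Monte Carlo variances and covariance of $(\hat\beta_0, \hat\beta_1)$ with $\sigma_u^2 Q_0$, $\sigma_u^2 Q_1$, and $-\bar X \sigma_u^2 Q_1$:

```python
import numpy as np

rng = np.random.default_rng(2)
beta0, beta1, sigma_u = 1.0, 0.5, 2.0
n, reps = 40, 50_000

X = rng.uniform(0.0, 10.0, n)          # fixed design across replications
Q1 = 1.0 / np.sum((X - X.mean()) ** 2)
Q0 = (np.sum(X ** 2) / n) * Q1

draws = np.empty((reps, 2))
for r in range(reps):
    Y = beta0 + beta1 * X + rng.normal(0.0, sigma_u, n)  # homoskedastic errors
    b1 = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    draws[r] = (b0, b1)

print(draws[:, 0].var(), sigma_u**2 * Q0)                   # var(beta0_hat) vs sigma_u^2 Q0
print(draws[:, 1].var(), sigma_u**2 * Q1)                   # var(beta1_hat) vs sigma_u^2 Q1
print(np.cov(draws.T)[0, 1], -X.mean() * sigma_u**2 * Q1)   # covariance vs -Xbar sigma_u^2 Q1
```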

Remarks:

1. Estimators become less precise (i.e., higher variances) when there is more uncertainty in the error term (i.e., a higher value of $\sigma_u^2$).
2. Estimators become more precise (i.e., lower variances) when the regressor $X$ is more widely dispersed around its mean, i.e., when the denominator term $\sum (X_i - \bar X)^2$ is larger.
3. Estimators become more precise (i.e., lower variances) as the sample size $n$ increases.
4. The variance of the intercept term increases when the values of the regressor $X$ are far away from the origin 0.
5. $\hat\beta_0$ and $\hat\beta_1$ are negatively (positively) correlated if $\bar X$ is positive (negative), because the regression line must pass through the point of sample means $(\bar X, \bar Y)$.
6. The variances of the coefficient estimators are unknown because they involve $\sigma_u^2$, which is unknown. To compute the variances, we need an estimator of $\sigma_u^2$.

Least squares estimator of $\sigma_u^2$

The variances of the least squares estimators of the coefficients and their covariance cannot be computed from the given data because $\sigma_u^2$ is unknown. How do we estimate it?

- $\sigma_u^2$ is the variance of the error term: $\sigma_u^2 = \mathrm{var}(u_i) = E[u_i - E(u_i)]^2 = E[u_i^2]$.
- The expected value is the theoretical counterpart of the sample mean.
- If we had observed data on $u_i$, we could estimate the variance by the sample mean $\sum u_i^2 / n$.
- Since we do not observe the error terms, we use their estimates, the residuals $\hat u_i$.
- A problem is that not all $\hat u_i$ can take independent values: for example, $\sum \hat u_i = 0$. This means that if we have the values of the first $n-1$ estimated error terms, the last one is automatically determined. This is called the loss of degrees of freedom.
- The number of degrees of freedom lost is equal to the number of parameters we estimate.
- In the simple linear regression model, it is two.
- When we add all estimated residuals, we are actually adding $n-2$ independent values.
- The degrees of freedom is therefore $n-2$.
- We estimate the variance by the average of the $n-2$ independent residuals:

$\hat\sigma_u^2 = \dfrac{\sum \hat u_i^2}{n-2}$

The estimated variances and covariance of the coefficient estimators are obtained by replacing the unknown $\sigma_u^2$ with its unbiased estimate $\hat\sigma_u^2$ (a numerical sketch follows below):

$\hat\sigma_{\hat\beta_0}^2 = \hat\sigma_u^2 Q_0$, $\hat\sigma_{\hat\beta_1}^2 = \hat\sigma_u^2 Q_1$, $\hat\sigma_{\hat\beta_0,\hat\beta_1} = -\bar X \hat\sigma_{\hat\beta_1}^2$
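Here is a minimal sketch (simulated data, illustrative values) of computing $\hat\sigma_u^2$ with the $n-2$ degrees-of-freedom correction and the resulting standard errors, cross-checked (assuming SciPy is available) against scipy.stats.linregress, which uses the same $n-2$ convention:

```python
import numpy as np
from scipy import stats  # optional cross-check

rng = np.random.default_rng(3)
n = 60
X = rng.uniform(0.0, 10.0, n)
Y = 1.0 + 0.5 * X + rng.normal(0.0, 2.0, n)

b1 = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
resid = Y - b0 - b1 * X

# Divide by n - 2: two degrees of freedom are lost estimating beta_0 and beta_1
sigma2_hat = np.sum(resid ** 2) / (n - 2)

Q1 = 1.0 / np.sum((X - X.mean()) ** 2)
Q0 = (np.sum(X ** 2) / n) * Q1
se_b1 = np.sqrt(sigma2_hat * Q1)
se_b0 = np.sqrt(sigma2_hat * Q0)

res = stats.linregress(X, Y)
print(se_b1, res.stderr)            # should agree
print(se_b0, res.intercept_stderr)  # should agree
```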

Remark: Goodness of Fit $R^2$

The idea of the $R^2$ measure of goodness of fit was to compare the SSR in models with and without information about the regressor and to measure the fraction of the reduction in the SSR. Writing $SSR_u = \sum \hat u_i^2$ for the unrestricted model and $SSR_r = \sum (Y_i - \bar Y)^2$ for the restricted model (regression on a constant only),

$R^2 = \dfrac{SSR_r - SSR_u}{SSR_r} = 1 - \dfrac{SSR_u}{SSR_r}$

Another idea is to see how close the predicted values $\hat Y_i$ are to the observed $Y_i$. The closeness is measured by the sample correlation coefficient or its squared value:

$\mathrm{corr}(Y_i, \hat Y_i) = \dfrac{\mathrm{cov}(Y_i, \hat Y_i)}{SD(Y_i)\,SD(\hat Y_i)}$, so $R^2 = [\mathrm{corr}(Y_i, \hat Y_i)]^2 = \dfrac{[\mathrm{cov}(Y_i, \hat Y_i)]^2}{\mathrm{var}(Y_i)\,\mathrm{var}(\hat Y_i)}$

To show that the last expression is the same as the previous expression for $R^2$, we first show

$\sum \hat u_i = \sum (Y_i - \hat\beta_0 - \hat\beta_1 X_i) = \sum (Y_i - \bar Y) - \hat\beta_1 \sum (X_i - \bar X) = 0 - 0 = 0$

and, from the normal equations, $\sum X_i \hat u_i = 0$. Noting

$Y_i = \hat Y_i + \hat u_i$, so that $\bar Y = \bar{\hat Y} + \bar{\hat u} = \bar{\hat Y}$,

it is easy to show

$\mathrm{cov}(Y_i, \hat Y_i) = \tfrac{1}{n} \sum (Y_i - \bar Y)(\hat Y_i - \bar Y) = \tfrac{1}{n} \sum (\hat Y_i + \hat u_i - \bar Y)(\hat Y_i - \bar Y) = \tfrac{1}{n} \sum (\hat Y_i - \bar Y)^2$

where the last equality is due to $\sum \hat u_i = 0$ and $\sum X_i \hat u_i = 0$. This shows

$\mathrm{cov}(Y_i, \hat Y_i) = \mathrm{var}(\hat Y_i)$, so $R^2 = \dfrac{[\mathrm{cov}(Y_i, \hat Y_i)]^2}{\mathrm{var}(Y_i)\,\mathrm{var}(\hat Y_i)} = \dfrac{\mathrm{var}(\hat Y_i)}{\mathrm{var}(Y_i)}$

Note that $\mathrm{var}(Y_i) = \tfrac{1}{n} \sum (Y_i - \bar Y)^2 = \tfrac{SSR_r}{n}$. Now we will show that $\mathrm{var}(\hat Y_i) = \tfrac{SSR_r - SSR_u}{n}$:

$\sum (Y_i - \bar Y)^2 = \sum (Y_i - \hat Y_i + \hat Y_i - \bar Y)^2 = \sum (Y_i - \hat Y_i)^2 + \sum (\hat Y_i - \bar Y)^2 + 2\sum (Y_i - \hat Y_i)(\hat Y_i - \bar Y) = \sum \hat u_i^2 + \sum (\hat Y_i - \bar Y)^2 + 2\sum \hat u_i (\hat Y_i - \bar Y)$

The last term is zero:

$\sum \hat u_i (\hat Y_i - \bar Y) = \sum \hat u_i (\hat\beta_0 + \hat\beta_1 X_i - \bar Y) = 0$

where the last equality is due to $\sum \hat u_i (\hat\beta_0 - \bar Y) = (\hat\beta_0 - \bar Y) \sum \hat u_i = 0$ and $\sum \hat u_i X_i = 0$, shown before. Hence $SSR_r = SSR_u + \sum (\hat Y_i - \bar Y)^2$, so $\sum (\hat Y_i - \bar Y)^2 = SSR_r - SSR_u$ and $\mathrm{var}(\hat Y_i) = \tfrac{SSR_r - SSR_u}{n}$.

Putting all these results together, we have

$R^2 = \dfrac{[\mathrm{cov}(Y_i, \hat Y_i)]^2}{\mathrm{var}(Y_i)\,\mathrm{var}(\hat Y_i)} = \dfrac{\mathrm{var}(\hat Y_i)}{\mathrm{var}(Y_i)} = \dfrac{SSR_r - SSR_u}{SSR_r} = 1 - \dfrac{SSR_u}{SSR_r}$
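A quick numerical check (simulated data, assuming NumPy; values are illustrative) that the two definitions of $R^2$ coincide:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 80
X = rng.uniform(0.0, 10.0, n)
Y = 1.0 + 0.5 * X + rng.normal(0.0, 2.0, n)

b1 = np.sum((X - X.mean()) * Y) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
Y_hat = b0 + b1 * X
u_hat = Y - Y_hat

SSR_u = np.sum(u_hat ** 2)              # unrestricted sum of squared residuals
SSR_r = np.sum((Y - Y.mean()) ** 2)     # restricted model: constant only

r2_ssr = 1.0 - SSR_u / SSR_r
r2_corr = np.corrcoef(Y, Y_hat)[0, 1] ** 2

print(np.isclose(r2_ssr, r2_corr))      # True: both definitions agree
```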