MFin Econometrics I Session 4: t-distribution, Simple Linear Regression, OLS assumptions and properties of OLS estimators

Thilo Klein, University of Cambridge Judge Business School

t-distribution
What if σ_Y is unknown? (almost always)

Recall: $\bar{Y} \sim N(\mu_Y, \sigma_Y^2/n)$ and $\frac{\bar{Y} - \mu_Y}{\sigma_Y/\sqrt{n}} \sim N(0, 1)$.

If (Y_1, ..., Y_n) are i.i.d. (and 0 < σ_Y² < ∞), then, when n is large, the distribution of Ȳ is well approximated by a normal distribution. Why? The CLT.

If (Y_1, ..., Y_n) are independently and identically drawn from a normal distribution, then for any value of n the distribution of Ȳ is normal. Why? Sums of normal r.v.s are normal.

But we almost never know σ_Y, so we use the sample standard deviation s_Y to estimate it. The consequence of using the sample s.d. in place of the population s.d. is an increase in uncertainty. Why?

s_Y/√n is the "standard error of the mean": an estimate, from the sample, of the standard deviation of the sample mean over all possible samples of size n drawn from the population.
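
A minimal simulation sketch of this idea (assuming Python with NumPy; the population, sample size, and seed are made up for illustration): the standard error s_Y/√n computed from a single sample approximates the actual standard deviation of Ȳ across repeated samples.

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 25, 10_000

# Draw many samples of size n from a (non-normal) population with sd = 2
samples = rng.exponential(scale=2.0, size=(reps, n))
means = samples.mean(axis=1)

# Standard error estimated from one sample vs. the actual spread of Ybar
one_sample = samples[0]
print("SE from one sample, s_Y/sqrt(n):", one_sample.std(ddof=1) / np.sqrt(n))
print("sd of sample means over all reps:", means.std(ddof=1))
print("theoretical sigma/sqrt(n):      ", 2.0 / np.sqrt(n))
```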

t-distribution
Using s_Y: estimator of σ_Y

$s_Y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \bar{Y})^2$

Why n − 1 in this estimator? Degrees of freedom (d.f.): the number of independent observations available for estimation, i.e., the sample size less the number of (linear) parameters estimated from the sample.

Use of an estimator (i.e., a random variable) for σ_Y (and thus of an estimator for σ_Ȳ) motivates use of the fatter-tailed t-distribution in place of N(0, 1).

The law of large numbers (LLN) applies to s_Y²: s_Y² is, in fact, a sample average (How?). LLN: if sampling is such that (Y_1, ..., Y_n) are i.i.d. (and if E(Y⁴) < ∞), then s_Y² converges in probability to σ_Y². s_Y² is an average, not of Y_i, but of its square (hence we need E(Y⁴) < ∞).

t-distribution
Aside: t-distribution probability density

$f(t) = \frac{\Gamma\left(\frac{\text{d.f.}+1}{2}\right)}{\sqrt{\text{d.f.}\,\pi}\,\Gamma\left(\frac{\text{d.f.}}{2}\right)} \left(1 + \frac{t^2}{\text{d.f.}}\right)^{-\frac{\text{d.f.}+1}{2}}$

d.f.: degrees of freedom, the only parameter of the t-distribution.

Γ() (Gamma function): $\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt$

Mean: E(t) = 0

Variance: V(t) = d.f./(d.f. − 2) for d.f. > 2; this exceeds 1 and converges to 1 as d.f. increases. (Compare: N(0, 1).)
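
To make the density formula concrete, here is a sketch (assuming Python with SciPy; the evaluation points are arbitrary) that computes it directly from the Gamma function and checks it against scipy.stats.t.pdf:

```python
from math import gamma, sqrt, pi
from scipy import stats

def t_pdf(t, df):
    """Student-t density from the Gamma-function formula above."""
    return (gamma((df + 1) / 2)
            / (sqrt(df * pi) * gamma(df / 2))
            * (1 + t**2 / df) ** (-(df + 1) / 2))

for df in (3, 10, 30):
    print(df, t_pdf(1.5, df), stats.t.pdf(1.5, df))  # columns should match

# Variance d.f./(d.f. - 2): exceeds 1, approaches 1 as d.f. grows
for df in (3, 10, 30, 100):
    print(df, df / (df - 2))
```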

t-distribution
t-distribution pdf and CDF: d.f. = 3 (figure)

t-distribution
The t-statistic

$t_{\text{d.f.}} = \frac{\bar{Y} - \mu}{s_Y/\sqrt{n}}, \qquad \text{d.f.} = n - 1$

This is the test statistic for the sample average. It has the same form as the z-statistic, which has mean 0 and variance 1.

If the r.v. Y is normally distributed in the population, the test statistic above for Ȳ is t-distributed with d.f. = n − 1 degrees of freedom.
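
A small sketch (Python with SciPy; the data and hypothesised mean are invented for illustration) computing the t-statistic by hand and confirming it against scipy.stats.ttest_1samp:

```python
import numpy as np
from scipy import stats

y = np.array([2.1, 2.5, 1.9, 2.8, 2.3, 2.6, 2.0, 2.4])  # hypothetical sample
mu0 = 2.0                                                # hypothesised mean

n = len(y)
t_stat = (y.mean() - mu0) / (y.std(ddof=1) / np.sqrt(n))  # (Ybar - mu)/(s_Y/sqrt(n))
p_val = 2 * stats.t.sf(abs(t_stat), df=n - 1)             # two-sided p-value

print(t_stat, p_val)
print(stats.ttest_1samp(y, popmean=mu0))  # should agree
```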

t-distribution
t-distribution: table and use (figure)

t-distribution
t → standard normal distribution

If d.f. is moderate or large (> 30, say), the differences between the t-distribution and N(0, 1) critical values are negligible. Some 5% critical values for 2-sided tests:

    degrees of freedom    5% t-distribution critical value
    10                    2.23
    20                    2.09
    30                    2.04
    60                    2.00
    ∞                     1.96
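
The table can be reproduced from the t quantile function (a sketch assuming SciPy); for a 5% two-sided test the critical value is the 97.5% quantile, and a very large d.f. stands in for the ∞ row:

```python
from scipy import stats

for df in (10, 20, 30, 60, 1_000_000):  # the last row approximates df = infinity
    print(df, round(stats.t.ppf(0.975, df), 2))

print("N(0,1):", round(stats.norm.ppf(0.975), 2))  # 1.96
```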

t-distribution
Comments on the t distribution, contd.

The t distribution is only relevant when the sample size is small. But then, for the t distribution to be correct, the population distribution of Y must be normal. Why?

The sampling distribution of Ȳ must be normal for the test statistic (with σ_Y estimated by s_Y) to be t-distributed. If the sample size is small, the CLT does not apply; the population distribution of Y must then be normal for the distribution of Ȳ to be normal (a sum of normal r.v.s is normal). So, with a small sample and σ_Y estimated by s_Y, the test statistic is t-distributed only if the population is normally distributed.

If the sample size is large (e.g., > 30), the distribution of Ȳ is normal irrespective of the distribution of Y, by the CLT (and we saw that as d.f. increases, the distribution of t_d.f. converges to N(0, 1)).

Finance / management data: the normality assumption is dubious, e.g., for earnings, firm sizes, etc.

t-distribution
Comments on the t distribution, contd.

So, with large samples, or with small samples but σ known and a normal population distribution:

$P\left(-z_{\alpha/2} \le \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha$

$P\left(\bar{Y} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{Y} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$

$P\left(\bar{Y} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{Y} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$

t-distribution
Comments on the t distribution, contd.

With small samples (< 30 d.f.), drawn from an approximately normal population distribution, with the unknown σ estimated by s_Y, the test statistic is $\frac{\bar{Y} - \mu}{s_Y/\sqrt{n}}$ and:

$P\left(\bar{Y} - t_{\text{d.f.},\alpha/2}\frac{s_Y}{\sqrt{n}} \le \mu \le \bar{Y} + t_{\text{d.f.},\alpha/2}\frac{s_Y}{\sqrt{n}}\right) = 1 - \alpha$

Q: What is t_d.f.,α/2?
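
t_d.f.,α/2 is the value that a t-distributed variable with the given d.f. exceeds with probability α/2. A short sketch (Python with SciPy; the sample is hypothetical) of the resulting small-sample confidence interval:

```python
import numpy as np
from scipy import stats

y = np.array([5.2, 4.8, 5.5, 5.1, 4.9, 5.3, 5.0])  # hypothetical small sample
n, alpha = len(y), 0.05

se = y.std(ddof=1) / np.sqrt(n)                # s_Y / sqrt(n)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)  # t_{d.f., alpha/2}

print("95% CI for mu:", (y.mean() - t_crit * se, y.mean() + t_crit * se))
```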

Simple linear regression model
Population linear regression model: bivariate

Y = β₀ + β₁X + u

We are interested in the conditional probability distribution of Y given X. Theory / model: the conditional mean of Y given the value of X is a linear function of X.

X: the independent variable or regressor
Y: the dependent variable
β₀: intercept
β₁: slope
u: regression disturbance, which consists of the effects of factors other than X that influence Y, as well as measurement errors in Y
n: sample size; Yᵢ = β₀ + β₁Xᵢ + uᵢ, i = 1, ..., n

Simple linear regression model
Linear regression model: issues

The issues in estimation and inference for linear regression are, at a general level, the same as the issues for the sample mean: a regression coefficient is just a glorified mean.

Estimation questions: How can we estimate our model, i.e., draw a line/plane/surface through the data? What are the advantages and disadvantages of different methods?

Hypothesis-testing questions: How do we test the hypothesis that the population parameters are zero? Why test whether they are zero? How can we construct confidence intervals for the parameters?

Simple linear regression model
Regression coefficients of the simple linear regression model

How can we estimate β₀ and β₁ from data?

Ȳ is the ordinary least squares (OLS) estimator of µ_Y: one can show that Ȳ solves

$\min_{m} \sum_{i=1}^{n} (Y_i - m)^2$

By analogy, the OLS estimators of the unknown parameters β₀ and β₁ solve

$\min_{\hat{\beta}_0, \hat{\beta}_1} \sum_{i=1}^{n} \left(Y_i - (\hat{\beta}_0 + \hat{\beta}_1 X_i)\right)^2$

Simple linear regression model
OLS in pictures (1) (figure)

Simple linear regression model
OLS in pictures (2) (figure)

Simple linear regression model
OLS in pictures (3) (figure)

Simple linear regression model
OLS in pictures (4) (figure)

Simple linear regression model
OLS method of obtaining regression coefficients

The sum e₁² + ... + e_n² is the residual sum of squares (RSS), a measure of total error. RSS is a function of both β̂₀ and β̂₁ (How?):

$RSS = (Y_1 - \hat{\beta}_0 - \hat{\beta}_1 X_1)^2 + \ldots + (Y_n - \hat{\beta}_0 - \hat{\beta}_1 X_n)^2 = \sum_{i=1}^{n} (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2$

Idea: find the values of β̂₀ and β̂₁ that minimise RSS:

$\frac{\partial RSS}{\partial \hat{\beta}_0} = -2 \sum (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0$

$\frac{\partial RSS}{\partial \hat{\beta}_1} = -2 \sum X_i (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i) = 0$

Simple linear regression model
Ordinary least squares, contd.

Setting ∂RSS/∂β̂₀ = 0 and ∂RSS/∂β̂₁ = 0 and solving these two equations together:

$\hat{\beta}_1 = \frac{\frac{1}{n}\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\frac{1}{n}\sum (X_i - \bar{X})^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{Cov(X, Y)}{Var(X)}$

$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$
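
A sketch (Python with NumPy; the data are simulated with made-up true parameters) computing β̂₁ and β̂₀ from these formulas and cross-checking against a library fit:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)  # true beta0 = 1, beta1 = 2

beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
beta0_hat = y.mean() - beta1_hat * x.mean()
print(beta0_hat, beta1_hat)

# np.polyfit with deg=1 returns [slope, intercept]; should agree
print(np.polyfit(x, y, deg=1))
```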

Simple linear regression model
Linear regression model: interpretation exercise

Linear regression model: Evaluation
Measures of fit (i): Standard error of the regression (SER) and root mean squared error (RMSE)

The standard error of the regression (SER) is an estimate of the dispersion (standard deviation) of the distribution of the disturbance term u; equivalently, of Y conditional on X. How close are the Y values to the Ŷ values? With the SER we can develop confidence intervals around any prediction.

$SER = \sqrt{\frac{1}{n-2}\sum_{i=1}^{n}(e_i - \bar{e})^2} = \sqrt{\frac{1}{n-2}\sum_{i=1}^{n} e_i^2}$

(the two expressions are equal because the OLS residuals have mean zero, ē = 0)

SER converges to the root mean squared error (RMSE):

$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(e_i - \bar{e})^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} e_i^2}$

The RMSE denominator has n; the SER has (n − 2). Why?
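
A tiny sketch (Python with NumPy; the residual vector is hypothetical) showing that SER and RMSE differ only in the degrees-of-freedom correction:

```python
import numpy as np

e = np.array([0.3, -0.5, 0.1, 0.4, -0.2, -0.1, 0.0, 0.2])  # hypothetical OLS residuals
n = len(e)

ser = np.sqrt(np.sum(e**2) / (n - 2))  # n - 2: two parameters were estimated
rmse = np.sqrt(np.sum(e**2) / n)
print(ser, rmse)  # SER > RMSE; the gap vanishes as n grows
```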

Linear regression model: Evaluation
Measures of fit (ii): R²

How much of the variance in Y can we explain with our model?

Without the model, the best estimate of Yᵢ is the sample mean Ȳ. With the model, the best estimate of Yᵢ is conditional on Xᵢ and is the fitted value Ŷᵢ = β̂₀ + β̂₁Xᵢ.

How much does the error in the estimate of Y fall when we use the model?

Linear regression model: Evaluation
Goodness of fit

The model: Yᵢ = Ŷᵢ + eᵢ

$Var(Y) = Var(\hat{Y} + e) = Var(\hat{Y}) + Var(e) + 2Cov(\hat{Y}, e)$ (Why?) $= Var(\hat{Y}) + Var(e)$

$\frac{1}{n}\sum (Y - \bar{Y})^2 = \frac{1}{n}\sum (\hat{Y} - \bar{\hat{Y}})^2 + \frac{1}{n}\sum (e - \bar{e})^2$

$\sum (Y - \bar{Y})^2 = \sum (\hat{Y} - \bar{\hat{Y}})^2 + \sum (e - \bar{e})^2$

Total sum of squares (TSS) = explained sum of squares (ESS) + residual sum of squares (RSS)

$R^2 = \frac{ESS}{TSS} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}$

Linear regression model: Evaluation
Goodness of fit

TSS = ESS + RSS

$R^2 = \frac{ESS}{TSS} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2} = 1 - \frac{\sum e_i^2}{\sum (Y_i - \bar{Y})^2}$

R² also equals the squared sample correlation between Y and Ŷ, where $r_{Y,\hat{Y}} = \frac{Cov(Y, \hat{Y})}{\text{st.dev.}(Y)\,\text{st.dev.}(\hat{Y})}$
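
A sketch (Python with NumPy; simulated data) verifying the three equivalent expressions numerically: ESS/TSS, 1 − RSS/TSS, and the squared correlation between Y and Ŷ:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.5 + 1.5 * x + rng.normal(size=100)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
e = y - y_hat

tss = np.sum((y - y.mean())**2)
ess = np.sum((y_hat - y.mean())**2)
rss = np.sum(e**2)

print(ess / tss, 1 - rss / tss, np.corrcoef(y, y_hat)[0, 1]**2)  # all equal
```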

Properties of OLS Estimators
After regression estimates

In practical terms, we wish to:

quantify the sampling uncertainty associated with β̂₁
use β̂₁ to test hypotheses such as β₁ = 0
construct confidence intervals for β₁

All of these require knowledge of the sampling distribution of the OLS estimators (based on the probability framework of regression).

Properties of OLS Estimators
Properties of estimators and the least squares assumptions

What kinds of estimators would we like? Unbiased, efficient, consistent.

Under what conditions can these properties be guaranteed?

We focus on the sampling distribution of β̂₁ (Why not β̂₀?). The results below hold for the sampling distribution of β̂₀ too.

Properties of OLS Estimators
Assumption 1: E(u | X = x) = 0 (figure)

Properties of OLS Estimators
Assumption 1: E(u | X = x) = 0

Conditional on X, u does not tend to influence Y either positively or negatively.

Implication: either X is not random, or, if X is random, it is distributed independently of the disturbance term u, so that Cov(X, u) = 0.

This will be true if there are no relevant omitted variables in the regression model (i.e., omitted variables that are correlated with X).

Properties of OLS Estimators
Assumption 1: Residual plots that pass and that fail (figure)

Properties of OLS Estimators
Aside: include a constant in the regression

Suppose E(uᵢ) = µ_u ≠ 0. Write uᵢ = µ_u + vᵢ, where vᵢ ~ N(0, σ_v²) so that E(vᵢ) = 0. Then

Yᵢ = β₀ + β₁Xᵢ + µ_u + vᵢ = (β₀ + µ_u) + β₁Xᵢ + vᵢ

so a nonzero disturbance mean is absorbed into the intercept, and the remaining disturbance vᵢ has mean zero.
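
A quick simulation sketch of this point (Python with NumPy; the parameter values are invented): a nonzero disturbance mean µ_u shows up in the estimated intercept as β₀ + µ_u, while β̂₁ is unaffected.

```python
import numpy as np

rng = np.random.default_rng(7)
n, beta0, beta1, mu_u = 100_000, 1.0, 2.0, 3.0

x = rng.normal(size=n)
u = mu_u + rng.normal(size=n)        # E(u) = mu_u != 0
y = beta0 + beta1 * x + u

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
print(b0, b1)  # b0 close to beta0 + mu_u = 4, b1 close to beta1 = 2
```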

Properties of OLS Estimators
Assumption 2: (Xᵢ, Yᵢ), i = 1, ..., n are i.i.d.

This arises naturally with a simple random sampling procedure. Because most estimators are linear functions of the observations, independence between observations helps in obtaining the sampling distributions of the estimators.

Properties of OLS Estimators
Assumption 3: Large outliers are rare

A large outlier is an extreme value of X or Y. Technically, E(X⁴) < ∞ and E(Y⁴) < ∞.

Note: if X and Y are bounded, then they have finite fourth moments (income, etc.).

Rationale: a large outlier can influence the results significantly.

Properties of OLS Estimators
Assumption 3: OLS is sensitive to outliers (figure)

Properties of OLS Estimators
Sampling distribution of β̂₁

If the three least squares assumptions (mean-zero disturbances, i.i.d. sampling, no large outliers) hold, then the exact (finite-sample) sampling distribution of β̂₁ is such that:

β̂₁ is unbiased, that is, E(β̂₁) = β₁
Var(β̂₁) can be determined
Other than its mean and variance, the exact distribution of β̂₁ is complicated and depends on the distribution of (X, u)
β̂₁ is consistent: β̂₁ →p β₁, i.e., plim(β̂₁) = β₁

So when n is large, by the CLT,

$\frac{\hat{\beta}_1 - \beta_1}{\sqrt{Var(\hat{\beta}_1)}} \sim N(0, 1)$ (approximately)

This parallels the sampling distribution of Ȳ.
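
A Monte Carlo sketch (Python with NumPy; the design and parameters are made up) of the sampling distribution of β̂₁: across repeated samples it centres on β₁ (unbiasedness) and its standardised version behaves approximately like N(0, 1):

```python
import numpy as np

rng = np.random.default_rng(123)
n, reps, beta0, beta1 = 100, 5_000, 1.0, 2.0

b1_draws = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    u = rng.normal(size=n)           # mean-zero, i.i.d. disturbances
    y = beta0 + beta1 * x + u
    b1_draws[r] = (np.sum((x - x.mean()) * (y - y.mean()))
                   / np.sum((x - x.mean())**2))

print("mean of b1 draws (unbiasedness):", b1_draws.mean())  # close to 2
z = (b1_draws - beta1) / b1_draws.std(ddof=1)
print("share within 1.96 sd (approx. normality):", np.mean(np.abs(z) < 1.96))
```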

Properties of OLS Estimators
Mean of the sampling distribution of β̂₁

Y = β₀ + β₁X + u. For unbiasedness:

$\hat{\beta}_1 = \frac{Cov(X, Y)}{Var(X)} = \frac{Cov(X, \beta_0 + \beta_1 X + u)}{Var(X)} = \frac{Cov(X, \beta_0) + Cov(X, \beta_1 X) + Cov(X, u)}{Var(X)}$

$= \frac{0 + \beta_1 Cov(X, X) + Cov(X, u)}{Var(X)} = \beta_1 + \frac{Cov(X, u)}{Var(X)}$

Properties of OLS Estimators
Unbiasedness of β̂₁

$\hat{\beta}_1 = \frac{Cov(X, Y)}{Var(X)} = \beta_1 + \frac{Cov(X, u)}{Var(X)}$

To investigate unbiasedness, take expectations:

$E(\hat{\beta}_1) = \beta_1 + \frac{1}{Var(X)} E(Cov(X, u)) = \beta_1$

The expected value of Cov(X, u) is zero (Why?), so β̂₁ is an unbiased estimator of β₁.

Linear regression model: further assumptions
Homoscedasticity and normality of residuals

u is homoscedastic
u ~ N(0, σ_u²)

These assumptions are more restrictive. However, if they hold, then other desirable properties obtain.

Linear regression model: further assumptions
Heteroscedasticity: one example (figure)

Linear regression model: further assumptions
Normality of disturbances: histograms of residuals that pass and fail (figure)

Linear regression model: further assumptions
Normality of disturbances (2): q-q plots of residuals that pass and fail (figure)