The Ordinary Least Squares (OLS) Estimator


Regression Analysis

Regression analysis: a statistical technique for investigating and modeling the relationship between variables. Applications: engineering, the physical and chemical sciences, economics, management, the life and biological sciences, and the social sciences. Regression analysis may be the most widely used statistical technique.

Example 1: delivery time vs. delivery volume. We suspect that the time required by a route deliveryman to load and service a machine is related to the number of cases of product delivered. For 25 randomly chosen retail outlets, the in-outlet delivery time and the volume of product delivered were recorded. Scatter diagram: displays the relationship between delivery time and delivery volume.

[Figures: delivery-time data and scatter diagram of delivery time vs. delivery volume]

Y: delivery time, x: delivery volume.

Y = β₀ + β₁x + ε

Error ε: the difference between Y and β₀ + β₁x. It is a statistical error, i.e. a random variable, capturing the effects of the other variables on delivery time, measurement errors, and so on.

Simple linear regression model:

Y = β₀ + β₁x + ε

x: independent (predictor, regressor) variable; Y: dependent (response) variable; ε: error. If x is fixed, Y is determined by ε. Suppose that E(ε) = 0 and Var(ε) = σ². Then

E(Y|x) = E(β₀ + β₁x + ε) = β₀ + β₁x
Var(Y|x) = Var(β₀ + β₁x + ε) = σ²

The true regression line is a line of mean values: the height of the regression line at any x is the expected value of Y for that x. The slope β₁ is the change in the mean of Y for a unit change in x. The variability of Y at a given x is determined by the variance σ² of the error.

Example: E(Y|x) = 3.5 + 2x and Var(Y|x) = σ², so Y|x ~ N(β₀ + β₁x, σ²) with β₀ = 3.5 and β₁ = 2. If σ² is small, the observed values will fall close to the line; if σ² is large, the observed values may deviate considerably from the line.
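To see the role of σ² concretely, here is a minimal simulation sketch in Python/NumPy (the sample size, x-grid, and the two σ values are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
mean_y = 3.5 + 2 * x                 # the true regression line from the example

for sigma in (0.5, 5.0):             # small vs. large error std. deviation (assumed)
    y = mean_y + rng.normal(0, sigma, size=x.size)
    print(f"sigma = {sigma}: typical deviation from the line = {np.std(y - mean_y):.2f}")
```

With σ = 0.5 the points hug the line; with σ = 5 they scatter widely around it.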


The regression equation is only an approximation to the true functional relationship between the variables. Regression model: an empirical model.


It is valid only over the region of the regressor variables contained in the observed data!

Multiple linear regression model:

Y = β₀ + β₁x₁ + … + βₖxₖ + ε

Linear: the model is linear in the parameters β₀, β₁, …, βₖ, not because Y is a linear function of the x's.
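"Linear in the parameters" means, for instance, that a model such as Y = β₀ + β₁x + β₂x² is still a linear regression model. A minimal sketch (made-up data; NumPy's generic least-squares solver stands in for the formulas derived below):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 5, 100)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(0, 0.3, 100)

# Design matrix with columns 1, x, x^2: nonlinear in x, linear in the betas
X = np.column_stack([np.ones_like(x), x, x**2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # approximately [1.0, 2.0, -0.5]
```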

Two important objectives:
- Estimate the unknown parameters (fitting the model to the data): the method of least squares.
- Model adequacy checking: an iterative procedure to choose an appropriate regression model to describe the data.

Remarks: Regression does not by itself imply a cause-and-effect relationship between the variables. It can aid in confirming a cause-and-effect relationship, but it is not the sole basis! Regression is part of a broader data-analysis approach.

The Least Squares Estimator

Y = β₀ + β₁x + ε

x: regressor variable; Y: response variable; β₀: the intercept, unknown; β₁: the slope, unknown; ε: error with E(ε) = 0 and Var(ε) = σ² (unknown). The errors are uncorrelated.

Given x,

E(Y|x) = E(β₀ + β₁x + ε) = β₀ + β₁x
Var(Y|x) = Var(β₀ + β₁x + ε) = σ²

The responses are also uncorrelated. Regression coefficients: β₀, β₁. β₁ is the change in E(Y|x) for a unit change in x; β₀ is E(Y|x = 0).

Least-Squares Estimation of the Parameters

Estimation of β₀ and β₁. Data: n pairs (yᵢ, xᵢ), i = 1, …, n. Method of least squares: minimize

S(β₀, β₁) = Σᵢ₌₁ⁿ [yᵢ − (β₀ + β₁xᵢ)]²
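Since S(β₀, β₁) is just a function of two numbers, it can also be minimized numerically; the sketch below (made-up data, SciPy's general-purpose optimizer) is a sanity check, not the derivation used on the slides:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 30)
y = 4.0 + 1.5 * x + rng.normal(0, 1.0, 30)   # assumed true line: 4 + 1.5x

def S(b):
    # Sum of squared deviations from the candidate line b[0] + b[1]*x
    return np.sum((y - (b[0] + b[1] * x)) ** 2)

res = minimize(S, x0=[0.0, 0.0])
print(res.x)   # numerical minimizer; matches the closed-form OLS estimates
```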

Least-squares normal equations:

n·β̂₀ + β̂₁ Σᵢ xᵢ = Σᵢ yᵢ
β̂₀ Σᵢ xᵢ + β̂₁ Σᵢ xᵢ² = Σᵢ xᵢyᵢ
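The normal equations form a 2×2 linear system, which can be solved directly; a sketch with the same made-up data as above:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 30)
y = 4.0 + 1.5 * x + rng.normal(0, 1.0, 30)

n = x.size
# [ n       sum(x)   ] [b0]   [ sum(y)  ]
# [ sum(x)  sum(x^2) ] [b1] = [ sum(xy) ]
A = np.array([[n, x.sum()], [x.sum(), (x**2).sum()]])
b = np.array([y.sum(), (x * y).sum()])
print(np.linalg.solve(A, b))   # [b0_hat, b1_hat], same as the optimizer above
```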

The least-squares estimators:

β̂₁ = Sxy / Sxx = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ (xᵢ − x̄)²
β̂₀ = ȳ − β̂₁x̄
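A direct transcription of these formulas as a minimal sketch (the helper name ols_simple is mine; np.polyfit is used only as a cross-check):

```python
import numpy as np

def ols_simple(x, y):
    """Closed-form least-squares estimates for Y = b0 + b1*x + e."""
    Sxx = np.sum((x - x.mean()) ** 2)
    Sxy = np.sum((x - x.mean()) * (y - y.mean()))
    b1 = Sxy / Sxx
    b0 = y.mean() - b1 * x.mean()
    return b0, b1

# Cross-check against NumPy's polynomial fit on made-up data
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 30)
y = 4.0 + 1.5 * x + rng.normal(0, 1.0, 30)
print(ols_simple(x, y))            # (b0_hat, b1_hat)
print(np.polyfit(x, y, 1)[::-1])   # same numbers (polyfit lists slope first)
```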

The fitted simple regression model: ŷ = β̂₀ + β̂₁x, a point estimate of the mean of Y for a particular x. Residual: eᵢ = yᵢ − ŷᵢ. Residuals play an important role in investigating the adequacy of the fitted regression model and in detecting departures from the underlying assumptions!
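Two algebraic facts about the residuals follow from the normal equations and are worth checking numerically (a sketch, reusing the made-up data from above):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 30)
y = 4.0 + 1.5 * x + rng.normal(0, 1.0, 30)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x    # fitted values: point estimates of E(Y|x)
e = y - y_hat          # residuals
print(round(e.sum(), 8))         # ~0: residuals sum to zero
print(round((x * e).sum(), 8))   # ~0: residuals are orthogonal to x
```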

Example 2: The Rocket Propellant Data. Shear strength is related to the age in weeks of the batch of sustainer propellant. 20 observations. From the scatter diagram, there is a strong relationship between shear strength (Y) and propellant age (x). Assumption: Y = β₀ + β₁x + ε.

[Scatter diagram of shear strength vs. propellant age]

Sxx = Σᵢ xᵢ² − n·x̄² = 1106.56
Sxy = Σᵢ xᵢyᵢ − n·x̄·ȳ = −41112.65
β̂₁ = Sxy / Sxx = −37.15
β̂₀ = ȳ − β̂₁x̄ = 2627.82

The least-squares fit: ŷ = 2627.82 − 37.15x
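The slope arithmetic can be checked from the quantities reported on the slide (the prediction age of 10 weeks below is an illustrative choice of mine):

```python
Sxx = 1106.56           # from the slide
Sxy = -41112.65         # from the slide

beta1_hat = Sxy / Sxx   # -37.15, as reported
beta0_hat = 2627.82     # from the slide (ybar - beta1_hat * xbar)

age = 10                # weeks, illustrative
print(beta1_hat, beta0_hat + beta1_hat * age)   # ~ -37.15, ~2256.3
```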

How well does this equation fit the data? Is the model likely to be useful as a predictor? Are any of the basic assumptions violated, and if so, how serious is this?

Properties of the Least-Squares Estimators and the Fitted Regression Model

β̂₀ and β̂₁ are linear combinations of the yᵢ:

β̂₁ = Σᵢ cᵢyᵢ, where cᵢ = (xᵢ − x̄) / Sxx
β̂₀ = ȳ − β̂₁x̄

β̂₀ and β̂₁ are unbiased estimators.

E(β̂₁) = E(Σᵢ cᵢyᵢ) = Σᵢ cᵢ E(yᵢ) = Σᵢ cᵢ(β₀ + β₁xᵢ) = β₁, and likewise E(β̂₀) = β₀.

Var(β̂₁) = Var(Σᵢ cᵢyᵢ) = Σᵢ cᵢ² Var(yᵢ) = σ² Σᵢ cᵢ² = σ² / Sxx
Var(β̂₀) = σ² (1/n + x̄² / Sxx)
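Unbiasedness and the two variance formulas can be checked by Monte Carlo; a minimal sketch under assumed true values β₀ = 2, β₁ = 3, σ = 1 (all illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
beta0, beta1, sigma = 2.0, 3.0, 1.0
x = np.linspace(0, 5, 25)                  # fixed design
Sxx = np.sum((x - x.mean()) ** 2)

b0s, b1s = [], []
for _ in range(20000):
    y = beta0 + beta1 * x + rng.normal(0, sigma, x.size)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
    b1s.append(b1)
    b0s.append(y.mean() - b1 * x.mean())

print(np.mean(b1s), np.var(b1s), sigma**2 / Sxx)    # mean ~ 3, variances agree
print(np.mean(b0s), np.var(b0s),
      sigma**2 * (1 / x.size + x.mean()**2 / Sxx))  # mean ~ 2, variances agree
```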

Classical Linear Regression Assumptions

1. Regression is linear in parameters.
2. The error term has zero population mean.
3. The error term is not correlated with the X's.
4. No serial correlation.
5. No heteroskedasticity.
6. No perfect multicollinearity.

And we usually add:

7. The error term is normally distributed. (*We did not use this in deriving the OLS, for it is a non-parametric estimator. A good property.)

Gauss-Markov Theorem

Given OLS Assumptions 1 through 6, the OLS estimator of βₖ is the minimum-variance estimator from the set of all linear unbiased estimators of βₖ, for k = 0, 1, 2, …, K. That is, OLS is BLUE (the Best Linear Unbiased Estimator).

* Furthermore, by adding Assumption 7 (normality), one can show that OLS = MLE and is the BUE (Best Unbiased Estimator), also called the UMVUE.
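The theorem can be illustrated (not proved) by simulation: pit OLS against another linear unbiased estimator of the slope, such as the "endpoint" estimator (yₙ − y₁)/(xₙ − x₁). All numbers below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(0, 5, 25)          # fixed, sorted design
Sxx = np.sum((x - x.mean()) ** 2)

ols, endpoint = [], []
for _ in range(20000):
    y = 2.0 + 3.0 * x + rng.normal(0, 1.0, x.size)
    ols.append(np.sum((x - x.mean()) * (y - y.mean())) / Sxx)
    endpoint.append((y[-1] - y[0]) / (x[-1] - x[0]))   # also linear & unbiased

print(np.var(ols), np.var(endpoint))   # OLS has the smaller variance, as BLUE predicts
```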

Gauss-Markov Theorem

Can you prove this theorem? This is your Quiz 2.

Last but not least, we thank the colleagues who have uploaded their lecture notes on the internet!