Chapter 6: The Simple Regression Model

Statistics and Introduction to Econometrics
M. Angeles Carnero (UA)
Departamento de Fundamentos del Análisis Económico
Chapter 6: SRM, Year 2014-15

Introduction
Econometrics is a branch or subdiscipline of Economics that uses and develops statistical methods to estimate relationships between economic variables, to test economic theories, and to evaluate government and firm policies.
Examples of econometric applications:
Effects on employment of a training programme for unemployed people.
Counselling on different investment strategies.
Effects on sales of an advertising campaign.
Econometric applications in many economic disciplines:
Macroeconomics: prediction of variables such as GNP and inflation, or quantifying the relationship between interest rates and inflation.
Microeconomics: quantifying the relationship between education and wages, production and inputs, R&D investment and firms' profits.
Finance: volatility analysis of assets, asset pricing models.

Stages of empirical economic analysis
The first stage of econometric analysis is to formulate clearly and precisely the question to be studied (a test of an economic theory, an analysis of the effect of a public policy, etc.). In many cases a formal economic model is built.
Example
To describe the consumption decisions of individuals subject to budget constraints, we assume that individuals make their choices in order to maximise their utility. This model implies a set of demand equations in which the quantity demanded of each good depends on its own price, the prices of substitute and complementary goods, consumer income, and the individual characteristics that affect preferences. These equations model individual consumption decisions and are the basis for the econometric analysis of consumer demand.

Economic model of crime (Gary Becker, 1968)
This model describes individual participation in crime and is based on utility maximisation. Crimes carry economic rewards and costs. The decision to participate in criminal activities is a problem of allocating resources in order to maximise utility, where the costs and benefits of the alternative decisions must be taken into account.
Costs:
Costs linked to the possibility of being arrested and convicted.
The opportunity cost of not participating in other activities, such as legal jobs.

Economic model of crime (cont.)
Equation describing the time invested in criminal activities:
$$y = f(x_1, x_2, x_3, x_4, x_5, x_6, x_7)$$
$y$: hours devoted to criminal activities.
$x_1$: hourly "wage" of criminal activities.
$x_2$: hourly wage of legal work.
$x_3$: other income that does not arise from criminal activities or paid work.
$x_4$: probability of being arrested.
$x_5$: probability of being convicted in case of being arrested.
$x_6$: expected sentence in case of being arrested.
$x_7$: age.
The function $f$ depends on the underlying utility function, which is rarely known. However, we can use economic theory, and sometimes common sense, to predict the effect of each variable on criminal activity.

Economic model of crime (cont.)
Once the economic model has been established, we must transform it into an econometric model. Following the previous example, to construct the econometric model we should:
Specify the functional form of the function $f$.
Determine which variables can be observed, which can be approximated and which are not observed, and decide how to take into account the many other factors affecting criminal behaviour.

Economic model of crime (cont.)
Consider the following particular econometric model for the economic model of criminal behaviour:
$$\mathit{crime} = \beta_0 + \beta_1 w + \beta_2\,\mathit{othinc} + \beta_3\,\mathit{farr} + \beta_4\,\mathit{fconv} + \beta_5\,\mathit{avgsen} + \beta_6\,\mathit{age} + u$$
$\mathit{crime}$: frequency of criminal activity.
$w$: wage that could be obtained in a legal job.
$\mathit{othinc}$: other income.
$\mathit{farr}$: frequency of arrests due to previous infractions.
$\mathit{fconv}$: frequency of sentences.
$\mathit{avgsen}$: average duration of sentences.
$\mathit{age}$: age.
$u$: the error term, reflecting all the unobserved factors affecting criminal activity, such as the wage of criminal activities, the family environment of the individual, etc. It also captures measurement errors in the variables included in the model.
$\beta_0, \beta_1, \ldots, \beta_6$: parameters of the econometric model, describing the relationship between crime ($\mathit{crime}$) and the factors used to determine crime in the model.

Once the econometric model is specified, hypotheses of interest can be formulated in terms of the unknown parameters of the model. For example, we can ask whether the wage obtainable in a legal job ($w$) has no effect on criminal activity. This hypothesis is equivalent to $\beta_1 = 0$.
Once the econometric model has been established, we have to collect data on the variables that appear in it.
Finally, we use appropriate statistical techniques to estimate the unknown parameters and to test the hypotheses of interest about these parameters.

The structure of economic data
Cross-section data
They arise from surveys of families, individuals or firms at a given point in time.
In many cases we can assume that this is a random sample, that is, that the observations are independent and identically distributed (iid).
Examples: Encuesta de Presupuestos Familiares (EPF), Encuesta de Población Activa (EPA).
Time series
We observe one or more variables over time.
The observations are usually dependent.
Annual, quarterly, monthly or daily frequency, etc.
Examples: monthly series of price indices, annual GNP series, daily IBEX-35 series.

Panel data
This is a time series for each member of a cross-section, which is not the same as repeated cross-sections.
Examples: Encuesta Continua de Presupuestos Familiares, Survey of Income and Living Conditions (EU-SILC).

Causality and the concept of ceteris paribus in econometric analysis
In most applications, we are interested in analysing whether one variable has a causal effect on another variable.
Examples:
Would an increase in the price of a good cause a decrease in its demand?
If sentences become tougher, would this have a causal effect on crime?
Does education have a causal effect on the productivity of workers?
Does participation in a certain training programme cause an increase in the wages of the workers who attend?

The fact that two variables are correlated does not imply that a causal relationship can be inferred. For example, observing that workers who participated in a certain training programme have higher wages than those who did not participate is not enough to establish a causal relationship.
Inferring causality is difficult because in Economics we usually do not have experimental data.
For causality, the concept of ceteris paribus (all other relevant factors are held fixed) is very important. For example, to analyse consumer demand, we are interested in quantifying the effect of a change in the price of the good on the quantity demanded, holding fixed the other factors such as income, the prices of other goods, consumer preferences, etc.
Econometric methods are used to estimate ceteris paribus effects and therefore to infer causality between variables.

Definition of the simple regression model
The simple regression model is used to analyse the relationship between two variables. Although the simple regression model has many limitations, it is useful to learn to estimate and interpret it before starting with the multiple regression model.
In the simple regression model, we consider two random variables $y$ and $x$ that represent a population, and we are interested in explaining $y$ in terms of $x$. For example, $y$ can be the hourly wage and $x$ the years of education.

We need to establish an equation relating $y$ and $x$, and the simplest choice is to assume a linear relationship
$$y = \beta_0 + \beta_1 x + u \qquad (1)$$
This equation defines the simple regression model, and it is assumed to hold in the population of interest.
$y$: dependent variable, explained variable or response variable.
$x$: independent variable, explanatory variable, control variable or regressor.
$u$: error term or random shock that captures the effect of other factors affecting $y$. In simple regression analysis, all these other factors affecting $y$ are treated as unobserved.
$\beta_1$ is the slope parameter and $\beta_0$ is the intercept. Both are unknown parameters that we want to estimate using a random sample of $(x, y)$.

$\beta_1$ reflects the change in $y$ given a one-unit increase in $x$, holding fixed the other factors that affect $y$ and are included in $u$. Note that the linearity assumption implies that a one-unit increase in $x$ has the same effect on $y$ regardless of the initial value of $x$. This assumption is not very realistic in some cases, and we will relax it later.
Example 1
Consider a simple regression model relating the wage of an individual to their level of education:
$$\mathit{wage} = \beta_0 + \beta_1\,\mathit{educ} + u$$
If $\mathit{wage}$ is measured in dollars per hour and $\mathit{educ}$ in years of education, $\beta_1$ reflects the change in the hourly wage given one more year of education, holding the other factors fixed. The error term $u$ contains all the other factors affecting the wage, such as work experience, innate ability, tenure in the current job, etc.

Example 2
Assume that soya production is determined by the model
$$\mathit{yield} = \beta_0 + \beta_1\,\mathit{fertilizer} + u$$
where $\mathit{yield}$ is the soya production and $\mathit{fertilizer}$ is the quantity of fertilizer. The error term $u$ contains other factors affecting soya production, such as the quality of the land, the amount of rain, etc.

Obtaining a good estimate of the parameter $\beta_1$ in model (1) depends on the relationship between the error term $u$ and the variable $x$. Formally, the assumption we need to impose on the relationship between $x$ and $u$ in order to obtain a credible estimate of $\beta_1$ is that the mean of $u$ conditional on $x$ is zero for any value of $x$:
$$E(u \mid x) = E(u) = 0 \qquad (2)$$
Recall that the mean of $u$ conditional on $x$ is just the mean of the distribution of $u$ conditional on $x$.
Note that, as long as the model has an intercept, the assumption $E(u) = 0$ is not very restrictive, since it is just a normalisation obtained by defining
$$\beta_0 = E(y) - \beta_1 E(x)$$
The real assumption is that the mean of the distribution of $u$ conditional on $x$ is constant.

How should assumption (2) be interpreted in the context of the previous examples?
Example 1 (cont.)
To simplify, we assume that the error term $u$ represents only innate ability. Assumption (2) then implies that the mean level of ability does not depend on the years of education. Under this assumption, the mean ability of individuals with 10 years of education is the same as that of individuals with 16 years of education. However, if individuals with higher innate ability choose to acquire more education, the average innate ability of individuals with 16 years of education will be higher than that of individuals with 10 years of education, and assumption (2) is not satisfied.
Since innate ability is unobserved, it is very difficult to know whether its mean depends on the level of education. Nevertheless, this is a question we should think about before starting the empirical work.

Example 2 (cont.)
To simplify, assume that in this example the error term $u$ is only the quality of the land. In this case, if the quantity of fertilizer applied to the different plots is random and does not depend on the quality of the land, then assumption (2) holds: the average quality of the land does not depend on the quantity of fertilizer. On the other hand, if the best plots of land receive more fertilizer, the mean value of $u$ depends on the quantity of fertilizer and assumption (2) does not hold.
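The role of assumption (2) in Example 1 can be illustrated with a small constructed dataset. The sketch below is only an illustration under stated assumptions: the population parameters, the education levels and the "ability" values are all hypothetical, and the slope is computed with the OLS formula that appears later in the chapter as equation (10). In case A the unobserved term averages to zero at every education level. In case B its mean rises with education, and the estimated slope absorbs that dependence.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x (equation (10))."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx

# Hypothetical population parameters (not taken from the slides).
BETA0, BETA1 = 1.0, 0.5

# Two individuals at each education level.
educ = [8, 8, 10, 10, 12, 12, 14, 14, 16, 16]
noise = [1.0, -1.0] * 5  # averages to zero at every education level

# Case A: u does not depend on educ, so E(u | educ) = 0 holds in this sample.
u_a = noise
# Case B: "ability" rises with education, so the mean of u depends on educ.
u_b = [0.3 * (e - 12) + v for e, v in zip(educ, noise)]

wage_a = [BETA0 + BETA1 * e + u for e, u in zip(educ, u_a)]
wage_b = [BETA0 + BETA1 * e + u for e, u in zip(educ, u_b)]

slope_a = ols_slope(educ, wage_a)  # recovers BETA1
slope_b = ols_slope(educ, wage_b)  # BETA1 plus the 0.3 dependence of u on educ
print(slope_a, slope_b)
```

In case B the overall mean of $u$ is still zero, so the problem is not $E(u) = 0$ but the fact that the conditional mean of $u$ changes with education.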

We now obtain the expression for the mean of $y$ conditional on $x$ under assumption (2). Taking expectations conditional on $x$ in equation (1), we have
$$E(y \mid x) = E(\beta_0 + \beta_1 x + u \mid x) = \beta_0 + \beta_1 x + E(u \mid x)$$
and, under assumption (2),
$$E(y \mid x) = \beta_0 + \beta_1 x \qquad (3)$$
This equation shows that, under assumption (2), the population regression function, $E(y \mid x)$, is a linear function of $x$. From equation (3) it follows that:
$\beta_0$ is the mean of $y$ when $x$ is equal to zero.
$\beta_1$ is the change in the mean of $y$ given a one-unit increase in $x$.

The Ordinary Least Squares (OLS) estimator. Interpretation.
In this section we first review how to estimate the parameters $\beta_0$ and $\beta_1$ of the simple regression model using a random sample from the population. Later on, we will see how to interpret the results of the estimation for a given sample.
Let $\{(x_i, y_i) : i = 1, 2, \ldots, n\}$ be a random sample from the population. Given that these data arise from a population described by the simple regression model, for each observation $i$ we can write
$$y_i = \beta_0 + \beta_1 x_i + u_i \qquad (4)$$
where $u_i$ is the error term of observation $i$, containing all the factors other than $x_i$ that affect $y_i$.

We use assumption (2) to obtain the estimators of the parameters $\beta_0$ and $\beta_1$. Since $E(u) = 0$, using equation (1) to write $u$ as a function of the observed variables, we have
$$E(y - \beta_0 - \beta_1 x) = 0 \qquad (5)$$
On the other hand, it can be shown that
$$E(u \mid x) = 0 \;\Rightarrow\; E(xu) = 0$$
and, again using equation (1) to write $u$ as a function of the observed variables, we have
$$E\big(x(y - \beta_0 - \beta_1 x)\big) = 0 \qquad (6)$$
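The implication $E(u \mid x) = 0 \Rightarrow E(xu) = 0$ is stated without proof. One standard way to see it, assuming the expectations involved exist, uses the law of iterated expectations:

```latex
\begin{aligned}
E(xu) &= E\big[\,E(xu \mid x)\,\big] && \text{(law of iterated expectations)} \\
      &= E\big[\,x\,E(u \mid x)\,\big] && \text{($x$ is fixed once we condition on $x$)} \\
      &= E[\,x \cdot 0\,] = 0          && \text{(assumption (2))}
\end{aligned}
```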

Equations (5) and (6) allow us to obtain good estimators of the parameters $\beta_0$ and $\beta_1$. Replacing the population expectations in equations (5) and (6) by sample means, the estimators $\hat{\beta}_0$ and $\hat{\beta}_1$ are obtained as the solutions to the equations
$$\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \qquad (7)$$
$$\frac{1}{n} \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0 \qquad (8)$$
Note that equations (7) and (8) are the sample counterparts of equations (5) and (6). Estimates obtained as the sample counterparts of population moments are called method of moments estimates.

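After some algebra, we can isolate $\hat{\beta}_0$ and $\hat{\beta}_1$ in equations (7) and (8), obtaining:
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \qquad (9)$$
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{S_{xy}}{S_x^2} \qquad (10)$$
where $S_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ is the sample covariance between $x$ and $y$, and $S_x^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$ is the sample variance of $x$.
Note that, for the OLS estimators to be defined, we need $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$.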
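As a minimal sketch of how equations (9) and (10) translate into a computation, the following uses only the Python standard library. The function name `ols_simple` and the six data points are hypothetical and serve only as an illustration.

```python
def ols_simple(x, y):
    """Return (b0_hat, b1_hat) computed from equations (9) and (10)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of (10); the 1/(n-1) factors cancel.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        # The condition sum (x_i - x_bar)^2 > 0 fails: all x_i are equal.
        raise ValueError("all x values are equal; the slope is not defined")
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical sample: years of education and hourly wage in dollars.
educ = [8, 10, 12, 12, 14, 16]
wage = [3.0, 4.5, 5.0, 6.5, 7.0, 9.0]

b0_hat, b1_hat = ols_simple(educ, wage)
print(round(b0_hat, 4), round(b1_hat, 4))
```

The explicit check on `s_xx` mirrors the only condition needed to compute the estimates: the $x_i$ in the sample must not all be equal.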

The estimates defined by equations (9) and (10) are called the Ordinary Least Squares (OLS) estimates of the constant term and the slope of the simple regression model.
The OLS estimates are computed for a particular sample, and therefore, for a given sample, $\hat{\beta}_0$ and $\hat{\beta}_1$ are two real numbers.
If the OLS estimates are computed with a different sample, one would obtain different values for $\hat{\beta}_0$ and $\hat{\beta}_1$. Therefore, since $\hat{\beta}_0$ and $\hat{\beta}_1$ are functions of the sample, we can also think of $\hat{\beta}_0$ and $\hat{\beta}_1$ as random variables, that is, as estimators of the population parameters $\beta_0$ and $\beta_1$.
In this section and in sections 4 and 5 we analyse the properties of the OLS estimates for a given sample. In section 6 we study the statistical properties of the random variables $\hat{\beta}_0$ and $\hat{\beta}_1$, that is, their properties as estimators of the population parameters.

Although we have derived the expressions for the OLS estimates from assumption (2), this assumption is not required in order to compute the estimates. The only condition needed to compute the OLS estimates for a given sample is that $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$.
In fact, $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$ is hardly an assumption, since all it requires is that the $x_i$ in the sample are not all equal.

We now give a graphical interpretation of the OLS estimates of the simple regression model that justifies the name "least squares". To do so, we draw the cloud of points associated with a given sample of size $n$ and an arbitrary line
$$y = b_0 + b_1 x$$
We show that the OLS estimates defined in equations (9) and (10) are the "best" choice of the values $b_0$ and $b_1$ if the objective is for the line to be as "close" as possible to this cloud of points under a given proximity criterion. In particular, the proximity criterion that delivers the OLS estimates is to minimise the sum of the squared vertical distances from the points in the cloud to the regression line.

[Figure: cloud of sample points $(x_i, y_i)$ and a line $y = b_0 + b_1 x$; above $x_i$, the point on the line has height $b_0 + b_1 x_i$.]

Graphically, we can see that the (signed) vertical distance from the point $(x_i, y_i)$ to the line $y = b_0 + b_1 x$ is given by $y_i - b_0 - b_1 x_i$, and therefore the objective function to be minimised is
$$s(b_0, b_1) = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \qquad (11)$$
The partial derivatives are:
$$\frac{\partial s(b_0, b_1)}{\partial b_0} = -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)$$
$$\frac{\partial s(b_0, b_1)}{\partial b_1} = -2 \sum_{i=1}^{n} x_i (y_i - b_0 - b_1 x_i)$$

The estimated coefficients are obtained by setting the partial derivatives of the objective function equal to zero:
$$\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$
$$\sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$
These two equations are called the first order conditions of the OLS estimates and are equivalent to equations (7) and (8), from which they differ only by the factor $1/n$. Therefore, the estimates obtained by minimising the objective function (11) are the OLS estimates defined in equations (9) and (10).
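The claim that the estimates from (9) and (10) minimise the objective function (11) can be checked numerically. The sketch below, on hypothetical data, evaluates both partial derivatives at the OLS estimates and compares the objective there with its value at a few nearby lines. It is a sanity check on one sample, not a proof.

```python
def ssr(b0, b1, x, y):
    """Objective function (11): sum of squared vertical distances."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

def gradient(b0, b1, x, y):
    """Partial derivatives of (11) with respect to b0 and b1."""
    d_b0 = -2 * sum(yi - b0 - b1 * xi for xi, yi in zip(x, y))
    d_b1 = -2 * sum(xi * (yi - b0 - b1 * xi) for xi, yi in zip(x, y))
    return d_b0, d_b1

# Hypothetical sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# OLS estimates from equations (9) and (10).
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
b0_hat = y_bar - b1_hat * x_bar

# Both first order conditions should hold up to rounding error.
g0, g1 = gradient(b0_hat, b1_hat, x, y)

# Any nearby line should give a larger sum of squares.
best = ssr(b0_hat, b1_hat, x, y)
steps = (-0.1, 0.0, 0.1)
nearby = [ssr(b0_hat + d0, b1_hat + d1, x, y)
          for d0 in steps for d1 in steps if (d0, d1) != (0.0, 0.0)]
print(g0, g1, best, min(nearby))
```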

We define the fitted value of $y$ when $x = x_i$ as
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$
This is the predicted value of $y$ when $x = x_i$. Note that there is a fitted value for each observation in the sample.
We define the residual of each observation in the sample as the difference between the observed value $y_i$ and the fitted value $\hat{y}_i$:
$$\hat{u}_i = y_i - \hat{y}_i$$
There is a residual for each observation in the sample.
Note that the residual of each observation is the vertical distance (with its corresponding sign) from the point to the regression line $y = \hat{\beta}_0 + \hat{\beta}_1 x$, and therefore the OLS criterion is to minimise the sum of squared residuals.
If a point is above the regression line, the residual is positive, and if the point is below the regression line, the residual is negative.

Why is the criterion of minimising the sum of squared residuals used?
The answer is that it is a simple criterion and it delivers estimators with good properties under certain assumptions.
Note that a criterion consisting of minimising the sum of the residuals would not be appropriate, since the residuals can be positive or negative.
We could consider alternative criteria, such as minimising the sum of the absolute values of the residuals:
$$\min_{b_0, b_1} \sum_{i=1}^{n} \lvert y_i - b_0 - b_1 x_i \rvert$$
The problem with this criterion is that the objective function is not differentiable, and therefore it is more complicated to compute the minimum.
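To see concretely why the plain sum of residuals cannot serve as a criterion, note that every line passing through the point of sample means has residuals that add up to zero, whatever its slope. The following sketch uses hypothetical data lying exactly on $y = 2x$ and compares three such lines. Only the sum of squares (and the sum of absolute values) tells them apart.

```python
# Hypothetical points lying exactly on the line y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)

def residuals(b0, b1):
    """Signed vertical distances from the points to the line y = b0 + b1*x."""
    return [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

def line_through_means(slope):
    """Intercept and slope of the line with this slope through (x_bar, y_bar)."""
    return y_bar - slope * x_bar, slope

results = {}
for slope in (2.0, 0.0, -5.0):
    b0, b1 = line_through_means(slope)
    u = residuals(b0, b1)
    results[slope] = {
        "sum": sum(u),                          # zero for every slope
        "sum_sq": sum(ui ** 2 for ui in u),     # zero only for the true line
        "sum_abs": sum(abs(ui) for ui in u),
    }

for slope, r in results.items():
    print(slope, r)
```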

Interpretation of the regression results
The regression line or sample regression function is defined as
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$
and it is the estimated version of the population regression function $E(y \mid x) = \beta_0 + \beta_1 x$.
The constant term or intercept, $\hat{\beta}_0$, is the predicted value of $y$ when $x = 0$. In many cases it does not make sense to consider $x = 0$, and in these cases $\hat{\beta}_0$ is of no interest in itself. However, it is important not to forget to include $\hat{\beta}_0$ when predicting $y$ for any value of $x$. $\hat{\beta}_0$ is also the estimated mean of $y$ when $x = 0$.
The slope, $\hat{\beta}_1$, measures the change in $\hat{y}$ when $x$ increases by one unit. In fact, if $x$ changes by $\Delta x$ units, the predicted change in $y$ is $\Delta\hat{y} = \hat{\beta}_1 \Delta x$ units. $\hat{\beta}_1$ also measures the estimated change in the mean of $y$ when $x$ increases by one unit.

Example 1 (cont.)
Given a sample of $n = 526$ individuals (file WAGE1 from Wooldridge) for whom the hourly wage in dollars, $\mathit{wage}$, and the years of education, $\mathit{educ}$, are observed, the following OLS regression line has been obtained:
$$\widehat{\mathit{wage}} = -0.90 + 0.54\,\mathit{educ}$$
The estimated intercept of $-0.90$ literally means that the predicted wage for individuals with 0 years of education is $-90$ cents ($-0.90$ dollars) per hour, which does not make sense. The prediction is poor at low levels of education because there are very few individuals in the sample with few years of education.
The estimated slope indicates that one more year of education implies an increase of 54 cents (0.54 dollars) in the predicted hourly wage. If the number of years of education increases by 3, the predicted wage increases by $3 \times 0.54 = 1.62$ dollars.
Regarding predictions for different values of $\mathit{educ}$, the predicted hourly wage for individuals with 10 years of education is $\widehat{\mathit{wage}} = -0.90 + 0.54 \times 10 = 4.5$ dollars per hour.
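The arithmetic of this example can be reproduced with a few lines of Python. The sketch takes the intercept to be $-0.90$, the value consistent with the predicted wage of 4.5 dollars at 10 years of education. The function names are hypothetical.

```python
B0_HAT = -0.90  # estimated intercept, dollars per hour
B1_HAT = 0.54   # estimated slope, dollars per hour per year of education

def predicted_wage(educ):
    """Predicted hourly wage (dollars) for a given number of years of education."""
    return B0_HAT + B1_HAT * educ

def predicted_change(delta_educ):
    """Predicted change in hourly wage when education changes by delta_educ years."""
    return B1_HAT * delta_educ

print(round(predicted_wage(0), 2))    # the nonsensical negative prediction
print(round(predicted_wage(10), 2))   # prediction at 10 years of education
print(round(predicted_change(3), 2))  # effect of 3 more years of education
```

Note that the intercept must be included when predicting a level, but it cancels out when computing a predicted change.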

Fitted values and residuals. Goodness of fit.
Algebraic properties of the OLS regression
1. The sum of the residuals is zero,
$$\sum_{i=1}^{n} \hat{u}_i = 0 \qquad (12)$$
and therefore the sample mean of the residuals is zero.
2. The sum of the products of the observed values of $x$ and the residuals is zero,
$$\sum_{i=1}^{n} x_i \hat{u}_i = 0 \qquad (13)$$
and therefore, since the mean of the residuals is zero by property 1, the sample covariance between the observed values of $x$ and the residuals is zero.

3. The point $(\bar{x}, \bar{y})$ lies on the sample regression line.
4. The mean of the fitted values coincides with the mean of the observed values: $\bar{y} = \bar{\hat{y}}$.
5. The sample covariance between the fitted values and the residuals is zero.
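The five algebraic properties hold for any sample, so they can be verified numerically. The sketch below does this on a hypothetical six-observation sample. Each quantity should be zero, or equal to its counterpart, up to floating-point rounding.

```python
# Hypothetical sample: years of education and hourly wage in dollars.
x = [8, 10, 12, 12, 14, 16]
y = [3.0, 4.5, 5.0, 6.5, 7.0, 9.0]
n = len(x)

def mean(values):
    return sum(values) / len(values)

# OLS estimates from equations (9) and (10).
x_bar, y_bar = mean(x), mean(y)
b1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
b0_hat = y_bar - b1_hat * x_bar

fitted = [b0_hat + b1_hat * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

sum_resid = sum(resid)                                   # property 1, eq. (12)
sum_x_resid = sum(xi * ui for xi, ui in zip(x, resid))   # property 2, eq. (13)
line_at_x_bar = b0_hat + b1_hat * x_bar                  # property 3
mean_fitted = mean(fitted)                               # property 4
mean_resid = mean(resid)
cov_fitted_resid = sum((fi - mean_fitted) * (ui - mean_resid)
                       for fi, ui in zip(fitted, resid)) / (n - 1)  # property 5

print(sum_resid, sum_x_resid, line_at_x_bar - y_bar,
      mean_fitted - y_bar, cov_fitted_resid)
```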

Goodness of fit
In what follows we look at a measure of the capacity of the explanatory variable to explain the variability of the dependent variable. This measure reflects the quality of the fit, that is, whether the OLS regression line fits the data well.
Definitions:
Total Sum of Squares (SST): $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$
Explained Sum of Squares (SSE): $SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$, since $\bar{y} = \bar{\hat{y}}$
Sum of Squared Residuals (SSR): $SSR = \sum_{i=1}^{n} \hat{u}_i^2$

The three values we have just defined are non-negative, since they are sums of squares.
SST, SSE and SSR measure the variability of the dependent variable, of the fitted values and of the residuals, respectively, since they are the numerators of the sample variances of these variables.
The three measures are related, since it can be shown that
$$SST = SSE + SSR$$
Assuming that SST is not zero, which is equivalent to saying that the observations of the dependent variable are not all equal, and dividing the three terms above by SST, we have
$$1 = \frac{SSE}{SST} + \frac{SSR}{SST}$$

We define the coefficient of determination of the model as
$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$
The R-squared represents the proportion of the variability of the dependent variable that is explained by the model.
$R^2$ satisfies the following condition:
$$0 \le R^2 \le 1$$
It is non-negative because SSE and SST are non-negative.
It is smaller than or equal to 1 because SSR is non-negative.
Sometimes $R^2$ is also expressed as a percentage, multiplying its value by 100.

To better understand the role of the coefficient of determination, it is useful to consider the two extreme cases:
The coefficient of determination is 1 if and only if $SSR = 0$. In this case all the residuals must be exactly equal to 0, so $y_i = \hat{y}_i$ for all observations, and therefore all the observations lie on the OLS regression line: there is a perfect fit.
The coefficient of determination is 0 if and only if $SSE = 0$. In this case all the fitted values must be exactly equal to $\bar{y}$, that is, the fitted values do not depend on the value of the independent variable, so the OLS regression line is the horizontal line $y = \bar{y}$. In this case, knowing the value of the independent variable does not provide any information on the dependent variable.
In practice, we almost always obtain intermediate values of $R^2$. The closer $R^2$ is to 1, the better the fit.
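The decomposition $SST = SSE + SSR$ and the two extreme cases of $R^2$ can be illustrated numerically. The following sketch uses three hypothetical samples: a general one, one in which the points lie exactly on a line, and one in which the sample covariance between $x$ and $y$ is zero. The function name `fit_and_decompose` is an assumption made for this illustration.

```python
def fit_and_decompose(x, y):
    """Return (sst, sse, ssr, r2) for the OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
    sse = sum((fi - y_bar) ** 2 for fi in fitted)            # explained
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual
    return sst, sse, ssr, sse / sst

# General case: an intermediate R-squared.
general = fit_and_decompose([8, 10, 12, 12, 14, 16],
                            [3.0, 4.5, 5.0, 6.5, 7.0, 9.0])
# Perfect fit: the points lie exactly on y = 1 + 2x, so SSR = 0.
perfect = fit_and_decompose([1, 2, 3, 4], [3.0, 5.0, 7.0, 9.0])
# Zero sample covariance: the fitted line is horizontal, so SSE = 0.
flat = fit_and_decompose([1, 2, 3], [1.0, 2.0, 1.0])

for name, (sst, sse, ssr, r2) in (("general", general),
                                  ("perfect", perfect),
                                  ("flat", flat)):
    print(name, round(sst, 4), round(sse, 4), round(ssr, 4), round(r2, 4))
```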

It is important to point out that in the social sciences a low $R^2$ is often found, especially when, as in this course, we work with cross-sections.
The fact that $R^2$ is low does not mean that the OLS estimate is not useful. The OLS estimate can still provide a good estimate of the effect of $x$ on $y$ even if $R^2$ is low.
Example 1 (cont.)
In the regression of wage on years of education we have
$$\widehat{\mathit{wage}} = -0.90 + 0.54\,\mathit{educ}, \qquad n = 526, \quad R^2 = 0.165$$
Years of education explain 16.5% of the variation in wages.

Measurement units and functional form

Measurement units

It is very important to take into account the measurement units when interpreting the results of a regression. The estimated values of the parameters of a regression model depend on the measurement units of the dependent variable and the explanatory variable. If we have already estimated the parameters of the model using certain units for the variables, the estimated values of these parameters under different measurement units can be obtained easily, without re-estimating the model.

If we change the measurement units of the dependent variable and measure it in different units $y^* = cy$, substituting in the estimated model we have

$$\hat{y}^* = c\hat\beta_0 + c\hat\beta_1 x = \hat\beta_0^* + \hat\beta_1^* x$$

where $\hat\beta_0^* = c\hat\beta_0$ and $\hat\beta_1^* = c\hat\beta_1$. Therefore, the new estimated coefficients are equal to the previously estimated coefficients multiplied by c.

If we change the measurement units of the explanatory variable and measure this variable in different units $x^* = cx$, substituting $x = x^*/c$ in the estimated model we have

$$\hat{y} = \hat\beta_0 + \frac{\hat\beta_1}{c} x^* = \hat\beta_0 + \hat\beta_1^* x^*$$

where $\hat\beta_1^* = \hat\beta_1 / c$. Therefore, the estimated constant does not change and the new estimated slope is equal to the previously estimated slope divided by c.

Example 1 (cont.)
In the regression of wage on the years of education, with the variable wage measured in dollars per hour and the variable educ measured in years, we obtained the following regression line:

$$\widehat{wage} = -0.90 + 0.54\, educ, \qquad n = 526, \quad R^2 = 0.165$$

Which values would be obtained for the constant and the slope of the regression line if wage is measured in cents per hour? Let wagec be the wage in cents. The relationship between wage and wagec is

$$wagec = 100 \cdot wage$$

so the estimated model using wage in cents per hour is obtained by multiplying by 100 the estimated coefficients we obtained when wage is measured in dollars per hour:

$$\widehat{wagec} = -90 + 54\, educ$$

Example 1 (cont.)
In this way, we see that the interpretation of the regression results does not change when the measurement units are changed, since an increase of one year of education implies an increase of 54 cents per hour in the predicted wage.

Regarding $R^2$, intuition tells us that, since it provides information on the goodness of fit, it should not depend on the measurement units of the variables. In fact, it can be shown, using the definition, that $R^2$ does not depend on the measurement units. In this example, $R^2$ when wage is measured in cents per hour is also 0.165.

Example 3
Using a sample (file CEOSAL1 from Wooldridge) of n = 209 executive directors for whom we observe their annual wage in thousands of dollars, salary, and the average return (as a percentage) on the shares of their company, roe, the following OLS regression line has been obtained:

$$\widehat{salary} = 963.19 + 18.50\, roe, \qquad n = 209, \quad R^2 = 0.013$$

From this model, an increase of one percentage point in the share return increases the predicted wage of the executive director by 18,500 dollars (18.5 thousand dollars).

If we change the measurement units of the explanatory variable, for example, if the return is expressed as a decimal instead of as a percentage, what would the new estimated coefficients be?

Example 3 (cont.)
Let roe1 be the share return expressed as a decimal. The relationship between roe and roe1 is

$$roe1 = \frac{1}{100}\, roe$$

so the estimated model using the share return in decimals is obtained by multiplying by 100 the estimated slope we obtained when the return is measured as a percentage:

$$\widehat{salary} = 963.19 + 1850\, roe1, \qquad n = 209, \quad R^2 = 0.013$$

In this way, we see again that the interpretation of the regression results does not change when the measurement units change, since, as before, an increase of one percentage point in the company's share return implies an increase in the predicted wage of the executive director of $1850 \times 0.01 = 18.5$ thousand dollars.

$R^2$ does not change when we change the measurement units of the independent variable. In this example $R^2$ is still equal to 0.013.

Example 3 (cont.)
If we now change the measurement units of both the dependent and the explanatory variable, for example, if we express the return in decimals and the wage in hundreds of dollars, what would the new estimated coefficients be?

On the one hand, we have just seen that the change of units in the share return implies that we need to multiply the estimated slope by 100.

On the other hand, if salary100 denotes the wage in hundreds of dollars,

$$salary100 = 10 \cdot salary$$

This change of units implies that we need to multiply both the constant and the slope of the regression line by 10.

If we make both unit changes, the regression line is

$$\widehat{salary100} = 9631.9 + 18500\, roe1, \qquad n = 209, \quad R^2 = 0.013$$
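The rescaling rules used in these examples can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are made up (they are not the CEOSAL1 or wage samples), and the helper names `ols_fit` and `r_squared` are introduced here for convenience.

```python
def ols_fit(x, y):
    """Return the OLS intercept and slope of a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def r_squared(x, y):
    """Return R2 = 1 - SSR/SST for the simple regression of y on x."""
    b0, b1 = ols_fit(x, y)
    y_bar = sum(y) / len(y)
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ssr / sst


# Hypothetical data: x plays the role of a return in percent,
# y the role of a salary in thousands of dollars.
x = [5.0, 8.0, 12.0, 15.0, 20.0, 26.0]
y = [900.0, 1010.0, 1100.0, 1250.0, 1300.0, 1500.0]

b0, b1 = ols_fit(x, y)

# Change the units of the dependent variable: y* = c * y.
c = 10.0
y_star = [c * yi for yi in y]
b0_y, b1_y = ols_fit(x, y_star)

# Change the units of the explanatory variable: x* = x / 100.
x_star = [xi / 100.0 for xi in x]
b0_x, b1_x = ols_fit(x_star, y)

print("original:   ", round(b0, 3), round(b1, 3))
print("y rescaled: ", round(b0_y, 3), round(b1_y, 3))
print("x rescaled: ", round(b0_x, 3), round(b1_x, 3))
print("R2 values:  ", r_squared(x, y), r_squared(x, y_star), r_squared(x_star, y))
```

Rescaling y multiplies both coefficients by c, rescaling x only changes the slope, and the three $R^2$ values coincide up to floating-point rounding.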

Functional form

So far we have considered linear relationships between two variables. As seen above, when we establish a linear relationship between y and x we are assuming that the effect on y of a one-unit change in x does not depend on the initial level of x. This assumption is not very realistic in some applications. For example, in Example 1, where wage is a function of the years of education, the estimated model predicts that an additional year of education increases wage by 54 cents whether it is the first year of education, the fifth, the sixteenth, etc., and this is not quite reasonable.

Assume that each additional year of education implies a constant percentage increase in wage. Can this effect be taken into account in the context of the simple regression model? The answer is yes, and it is enough to consider the logarithm of wage as the dependent variable of the model.

Assume that the regression model relating wage and years of education is:

$$\log(wage) = \beta_0 + \beta_1\, educ + u \qquad (14)$$

In this model, if we hold fixed all the other factors affecting wage and captured by the error term u, an additional year of education implies an increase of $\beta_1$ in the logarithm of wage.

Therefore, since a percentage increase is approximately equal to the difference of logs multiplied by 100, this model implies that, holding fixed all the factors affecting wage and captured in the error term u, an additional year of education implies an increase in wage of $100\,\beta_1\,\%$.

Note that equation (14) implies a nonlinear relationship between wage and years of education. An additional year of education implies a higher increase in wage (in absolute terms) the higher the initial number of years of education is.

The model where the dependent variable is in logarithms and the explanatory variable is in levels is called the log-level model.

Model (14) can be estimated by OLS using the logarithm of wage as the dependent variable. Using the data in Example 1, the following results have been obtained:

$$\widehat{\log(wage)} = 0.584 + 0.083\, educ, \qquad n = 526, \quad R^2 = 0.186$$

Therefore, this estimated model implies that for each additional year of education the hourly wage increases by approximately 8.3%. Economists call this effect the return to an additional year of education.

There is another important nonlinearity not included in this application. This nonlinearity would reflect a "certification" effect. It could be the case that year 12, that is, finishing secondary education, has a much larger impact on wage than finishing year 11, since the latter does not imply the degree. In chapter 5 we will see how to take into account this type of nonlinearity.
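The reading "100 times the slope" relies on the approximation between log differences and percentage changes. A short Python sketch can compare it with the exact percentage change implied by a log difference, $100\,(e^{\hat\beta_1} - 1)$. This exact formula is a standard result added here as a complement, not something stated in the slides; only the slope 0.083 comes from the text.

```python
import math

beta1_hat = 0.083  # estimated slope of the log-level wage equation in the text

# Approximate percentage effect of one more year of education.
approx_pct = 100 * beta1_hat

# Exact percentage change in predicted wage implied by a log difference of beta1_hat.
exact_pct = 100 * (math.exp(beta1_hat) - 1)

print(f"approximate: {approx_pct:.2f}%, exact: {exact_pct:.2f}%")
```

For slopes this small the two numbers are close (about 8.3% against roughly 8.65%); the gap widens as the slope grows.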

We now analyse how to use the logarithmic transformation in order to obtain a model with constant elasticity.

Example 4
Using the same data as in Example 3, we can estimate a model with constant elasticity that relates the wage of executive directors to the sales of the firm. The population model we have to estimate is

$$\log(salary) = \beta_0 + \beta_1 \log(sales) + u$$

where sales are the annual sales of the firm in millions of dollars and salary is the annual wage of the executive director of the firm in thousands of dollars.

In this model, $\beta_1$ is the elasticity of the wage of executive directors with respect to the sales of the firm. This model can be estimated by OLS using the log of wage as the dependent variable and the log of sales as the explanatory variable.

Example 4 (cont.)
The estimated regression model is

$$\widehat{\log(salary)} = 4.822 + 0.257 \log(sales), \qquad n = 209, \quad R^2 = 0.211$$

The estimated elasticity is 0.257, which means that an increase of 1% in sales implies an increase of 0.257% in the wage of the executive director (this is the usual interpretation of an elasticity).

The model where both the dependent variable and the explanatory variable are in logarithms is called the log-log model.

We now see how a change in the units of a variable that is expressed in logs affects the constant and the slope of the model. Consider the log-level model

$$\log(y) = \beta_0 + \beta_1 x + u \qquad (15)$$

If we change the measurement units of y and define $y^* = cy$, taking logarithms we have $\log(y^*) = \log(c) + \log(y)$. Substituting in (15) we have

$$\log(y^*) = \beta_0 + \log(c) + \beta_1 x + u = \beta_0^* + \beta_1 x + u$$

where $\beta_0^* = \beta_0 + \log(c)$. Therefore, this change of units does not affect the slope, only the constant of the model.

Similarly, if the explanatory variable is in logarithms and we change its measurement units, this change does not affect the slope of the model, only the constant term.
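The effect of a change of units on a log-level model can be checked with a few lines of Python. This is a minimal sketch with made-up data; the variable names and the helper `ols_fit` are assumptions introduced for illustration.

```python
import math


def ols_fit(x, y):
    """Return the OLS intercept and slope of a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical data: years of education and hourly wage in dollars.
educ = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0, 18.0]
wage = [4.5, 5.2, 6.0, 7.1, 8.3, 10.4, 12.9]

c = 100.0  # dollars -> cents

b0, b1 = ols_fit(educ, [math.log(w) for w in wage])
b0_c, b1_c = ols_fit(educ, [math.log(c * w) for w in wage])

print("slope (dollars, cents):   ", b1, b1_c)
print("constant (dollars, cents):", b0, b0_c)
print("difference in constants:  ", b0_c - b0, "log(c) =", math.log(c))
```

The slope is the same in both regressions, while the constant shifts by $\log(c)$.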

Finally, we can also consider a model where the dependent variable is in levels and the explanatory variable is in logs. This model is called the level-log model:

$$y = \beta_0 + \beta_1 \log(x) + u$$

In this model, $\beta_1 / 100$ is approximately the change in y, in units of y, given an increase of 1% in x.

The model we studied in this chapter is called the simple linear regression model, although we have seen that this model also allows one to establish some nonlinear relationships between variables. The adjective "linear" refers to the linearity of the model in the parameters $\beta_0$ and $\beta_1$. The variables y and x can be any type of transformation of other variables.

We studied the logarithmic transformations in detail since they are the most interesting ones in Economics, but in the context of the simple regression model the following transformations could also have been considered:

$$y = \beta_0 + \beta_1 x^2 + u$$

$$y = \beta_0 + \beta_1 \sqrt{x} + u$$

It is important to take into account that the fact that the variables are transformations of other variables does not affect the estimation method, but it does affect the interpretation of the parameters, as seen above for the logarithmic transformations.

Statistical Properties of the OLS estimators

The algebraic properties of the OLS estimates have been studied so far. In this section, we go back to the population model in order to study the statistical properties of the OLS estimators. We now consider $\hat\beta_0$ and $\hat\beta_1$ as random variables, that is, as estimators of the population parameters $\beta_0$ and $\beta_1$, and we study some of the properties of their distributions.

Unbiasedness of the OLS estimators

We study under which assumptions the OLS estimators are unbiased.

Assumption RLS.1 (linearity in parameters)
The dependent variable y is related in the population to the explanatory variable x and the error term u through the population model

$$y = \beta_0 + \beta_1 x + u \qquad (16)$$

Assumption RLS.2 (random sample)
The data arise from a random sample of size n, $\{(x_i, y_i) : i = 1, 2, \ldots, n\}$, from the population model.

Assumption RLS.3 (zero conditional mean)

$$E(u \mid x) = 0$$

Assumption RLS.4 (sample variation of the independent variable)
The values of $x_i$, $i = 1, 2, \ldots, n$, in the sample are not all the same.

Assumptions RLS.1 and RLS.2 imply that we can write (16) in terms of the random sample as

$$y_i = \beta_0 + \beta_1 x_i + u_i, \qquad i = 1, 2, \ldots, n \qquad (17)$$

where $u_i$ is the error term of observation i and it contains the unobservables affecting $y_i$. Note that the error term $u_i$ is not the same as the residual $\hat{u}_i$.

Assumptions RLS.2 and RLS.3 imply that for each observation i

$$E(u_i \mid x_i) = 0, \qquad i = 1, 2, \ldots, n$$

and

$$E(u_i \mid x_1, x_2, \ldots, x_n) = 0, \qquad i = 1, 2, \ldots, n \qquad (18)$$

Note that if assumption RLS.4 does not hold, the OLS estimator cannot be computed.

Before showing the statistical properties of the OLS estimators, it is useful to write $\hat\beta_1$ as a function of the errors of the model.

Expression for $\hat\beta_1$ as a function of the error terms:
Using the definition of $\hat\beta_1$ in equation (10),

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

and, using (17),

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

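Expression for $\hat\beta_1$ as a function of the error terms (cont.):

$$\hat\beta_1 = \beta_0 \frac{\sum_{i=1}^{n} (x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} + \beta_1 \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, x_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} + \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Since

$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0 \quad \text{and} \quad \sum_{i=1}^{n} (x_i - \bar{x})\, x_i = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

we have that

$$\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad (19)$$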
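Expression (19) is an algebraic identity, so it can be verified on simulated data where the errors are known. The following Python sketch does this under stated assumptions: the parameter values, sample size and distributions are hypothetical choices made only for the check.

```python
import random

random.seed(0)

# Hypothetical population parameters and sample size.
beta0, beta1 = 1.0, 0.5
n = 30

x = [random.uniform(0, 10) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]        # errors (observable only in a simulation)
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Usual OLS slope, computed from the observed data (x, y).
b1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx

# Right-hand side of (19): true slope plus a weighted sum of the errors.
b1_from_errors = beta1 + sum((xi - x_bar) * ui for xi, ui in zip(x, u)) / s_xx

print(b1_hat, b1_from_errors)
```

Both numbers agree up to floating-point rounding, which shows that the sampling error of the slope is entirely driven by the term in the errors.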

Under assumptions RLS.1 to RLS.4, $\hat\beta_0$ and $\hat\beta_1$ are unbiased estimators of the parameters $\beta_0$ and $\beta_1$, that is,

$$E(\hat\beta_0) = \beta_0 \quad \text{and} \quad E(\hat\beta_1) = \beta_1$$

Proof
We are going to show that $\hat\beta_1$ is an unbiased estimator of $\beta_1$, that is, $E(\hat\beta_1) = \beta_1$. In this proof, the expectations are conditional on the observed values of the explanatory variable in the sample, that is, they are expectations conditional on $x_1, x_2, \ldots, x_n$. Therefore, conditioning on the observed values of x, all the terms that are a function of $x_1, x_2, \ldots, x_n$ are not random. Using (19),

$$E(\hat\beta_1) = \beta_1 + E\left( \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right) = \beta_1 + \frac{1}{\sum_{i=1}^{n} (x_i - \bar{x})^2}\, E\left( \sum_{i=1}^{n} (x_i - \bar{x})\, u_i \right)$$

$$= \beta_1 + \frac{1}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sum_{i=1}^{n} (x_i - \bar{x})\, E(u_i) = \beta_1$$

where the last equality uses (18).

Some Comments on Assumptions RLS.1 to RLS.4

Generally, if one of the four assumptions we consider does not hold, then the estimator is not unbiased.

As mentioned before, if assumption RLS.4 fails it is not possible to obtain the OLS estimates.

Assumption RLS.1 requires that the relationship between y and x is linear with an additive error; we have already discussed that we mean linear in parameters, since the variables x and y can be nonlinear transformations of the variables of interest. If assumption RLS.1 fails and the model is nonlinear in parameters, the estimation is more complicated and it is beyond the scope of this course.

Regarding assumption RLS.2, it is suitable for many applications (although not for all of them) when we work with cross-sectional data.

Finally, assumption RLS.3 is crucial for the unbiasedness of the OLS estimator. If this assumption fails, the estimators are generally biased. In chapter 3, we will see that we can determine the direction and the size of the bias.

In the analysis of simple regression with nonexperimental data, there is always the possibility that x is correlated with u. When u contains factors that affect y and are correlated with x, the result of the OLS estimation can reflect the effect that those factors have on y and not the ceteris paribus relationship between x and y.

Example 5
Suppose we are interested in analysing the effect of a public school lunch programme on school performance. This programme is expected to have a positive ceteris paribus effect on school performance, since if a student without the economic resources to pay for the meal benefits from this programme, his or her performance in school should improve.

We have data on 408 secondary schools in the state of Michigan (file MEAP93 from Wooldridge), and for each school we observe the percentage of students that pass a standardised math exam (math10) and the percentage of students that benefit from the school lunch programme (lnchprg).

Example 5 (cont.)
Given these data, the following results have been obtained:

$$\widehat{math10} = 32.14 - 0.319\, lnchprg, \qquad n = 408, \quad R^2 = 0.171$$

The estimated model predicts that if access to the programme increases by 10 percentage points, the percentage of students passing the exam decreases by approximately 3.2 percentage points.

Is this result credible? The answer is NO. It is more likely that this result is due to the error term being correlated with lnchprg. The error term contains other factors (different from access to the school lunch programme) that affect the result of the exam. Among these factors is the socioeconomic level of the students' families, which affects school performance and is obviously correlated with participation in the lunch programme.
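A small simulation can illustrate how a failure of assumption RLS.3 may even reverse the sign of the estimated slope. The sketch below is purely hypothetical: it does not use the MEAP93 data, and all the numbers (a true effect of 0.1, the way the unobserved factor `z` enters x and u) are assumptions chosen to make the mechanism visible.

```python
import random

random.seed(1)


def ols_fit(x, y):
    """Return the OLS intercept and slope of a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


n = 5000
true_effect = 0.1  # hypothetical positive ceteris paribus effect of x on y

x, y = [], []
for _ in range(n):
    z = random.gauss(0, 1)                # unobserved factor, e.g. a poverty index
    xi = 2.0 * z + random.gauss(0, 1)     # participation is higher where z is higher
    ui = -1.0 * z + random.gauss(0, 1)    # the error term contains z, so E(u | x) != 0
    x.append(xi)
    y.append(1.0 + true_effect * xi + ui)

b0_hat, b1_hat = ols_fit(x, y)
print(f"true effect: {true_effect}, OLS slope: {b1_hat:.3f}")
```

In this design the OLS slope settles near $0.1 + \mathrm{Cov}(x, u) / \mathrm{Var}(x) = 0.1 - 2/5 = -0.3$, so the estimate is negative even though the true effect is positive.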

Interpretation of the concept of unbiasedness of an estimator

Recall that the fact that an estimator is unbiased does not mean that, for our particular sample, the value of the estimate is close to the true value of the parameter.

The fact that an estimator is unbiased implies that if we had access to many random samples from the population and computed the value of the estimator for each of them, then, if the number of samples is very large, the sample mean of the estimates would be very close to the true value of the parameter we want to estimate.

Since in practice we only have access to one sample, the unbiasedness property is not very useful unless some other property guarantees that the dispersion of the distribution of the OLS estimator is small.

In addition, a measure of the dispersion of the distribution of the estimators allows us to choose the best estimator as the one with the lowest dispersion. As a measure of dispersion we use the variance, or its square root, the standard deviation.
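The repeated-sampling interpretation can be sketched with a small Monte Carlo experiment in Python. The design below (parameter values, fixed regressor values, normal errors, 2000 replications) is an assumption chosen for illustration; it is not part of the course material.

```python
import random

random.seed(42)

# Hypothetical population model: y = 1 + 0.5 x + u, with u ~ N(0, 1).
beta0, beta1, sigma = 1.0, 0.5, 1.0

# The values of x are held fixed across samples (we condition on x).
x = [i / 5 for i in range(50)]
x_bar = sum(x) / len(x)
s_xx = sum((xi - x_bar) ** 2 for xi in x)


def slope_from_new_sample():
    """Draw a new sample of y given x and return the OLS slope estimate."""
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    y_bar = sum(y) / len(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx


estimates = [slope_from_new_sample() for _ in range(2000)]
mean_estimate = sum(estimates) / len(estimates)

print(f"true slope:              {beta1}")
print(f"mean of the estimates:   {mean_estimate:.4f}")
print(f"smallest / largest draw: {min(estimates):.3f} / {max(estimates):.3f}")
```

The average of the estimates is very close to the true slope, while individual estimates can be noticeably above or below it, which is why a measure of dispersion is needed in addition to unbiasedness.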

Variances of the OLS estimators

In this chapter we are going to compute the variance of the OLS estimators under an additional assumption known as the homoskedasticity assumption. This assumption establishes that the variance of the error term u conditional on x is constant, that is, it does not depend on x.

The variance of the OLS estimator can be computed without any additional assumption, that is, using only assumptions RLS.1 to RLS.4. However, the expressions for the variances in the general case are more complicated and they are beyond the scope of this course.

Assumption RLS.5 (homoskedasticity)

$$Var(u \mid x) = \sigma^2$$

When $Var(u \mid x)$ depends on x we say that the errors are heteroskedastic.

It is important to point out that assumption RLS.5 does not play any role in the unbiasedness of $\hat\beta_0$ and $\hat\beta_1$. We add assumption RLS.5 to simplify the computation of the variance of the OLS estimators. Additionally, as we will see in Chapter 7, under the additional assumption of homoskedasticity the OLS estimators have some efficiency properties.

Since assumption RLS.3 establishes that $E(u \mid x) = 0$ and since $Var(u \mid x) = E(u^2 \mid x) - (E(u \mid x))^2$, we can write assumption RLS.5 as

$$E(u^2 \mid x) = \sigma^2$$

Assumption RLS.5 can also be written as

$$Var(y \mid x) = \sigma^2$$

Example 1
Let us consider again the simple regression model relating the wage of a person to his or her level of education:

$$wage = \beta_0 + \beta_1\, educ + u$$

In this model the assumption of homoskedasticity is

$$Var(wage \mid educ) = \sigma^2$$

i.e., the variance of wage does not depend on the number of years of education. This assumption may not be very realistic, since it is likely that individuals with higher levels of education have a wider range of job opportunities, which can lead to a higher variability of wages for high education levels. On the contrary, individuals with low levels of education have fewer job opportunities and many of them work for the minimum wage, which implies that the variability of wage is small for low levels of education.

Variance of the sampling distribution of the OLS estimators

Under assumptions RLS.1 to RLS.5,

$$Var(\hat\beta_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sigma^2}{(n-1) S_x^2}$$

$$Var(\hat\beta_0) = \frac{\sigma^2\, n^{-1} \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sigma^2\, \overline{x^2}}{(n-1) S_x^2}$$

where the variances are conditional on the observed values of the explanatory variable in the sample, i.e., they are variances conditional on $x_1, x_2, \ldots, x_n$.

Proof
We derive the formula for $Var(\hat\beta_1)$.

Recall the expression of $\hat\beta_1$ as a function of the errors of the model in equation (19):

$$\hat\beta_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

The variance we have to compute is conditional on the $x_i$; therefore $(x_i - \bar{x})$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} (x_i - \bar{x})^2$ are not random, and $\beta_1$ is not random either.

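Proof (cont.)
Additionally, by assumption RLS.2 the errors $u_i$ are independent. Therefore, we can use the following properties of the variance: (i) the variance of a sum of independent random variables is the sum of the variances; (ii) the variance of a constant times a random variable equals the squared constant times the variance of the random variable; (iii) the variance of the sum of a random variable and a constant is the variance of the random variable. We have that

$$Var(\hat\beta_1) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2\, Var(u_i)}{\left( \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^2}$$

and, using RLS.5,

$$Var(\hat\beta_1) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2\, \sigma^2}{\left( \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^2} = \sigma^2\, \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\left( \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^2} = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sigma^2}{(n-1) S_x^2}$$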

According to the expression that we have obtained for the variance of $\hat\beta_1$, we have that:

The higher the variance of the error term, $\sigma^2$, the higher the variance of $\hat\beta_1$. If the variance of the unobservables affecting y is very large, it is very difficult to estimate $\beta_1$ precisely.

The higher the variance of $x_i$, the smaller the variance of $\hat\beta_1$. If $x_i$ has a low dispersion, it is very difficult to estimate $\beta_1$ precisely.

The higher the sample size, the smaller the variance of $\hat\beta_1$.
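These three statements, and the variance formula itself, can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the helper name `slope_variance`, the fixed values of x, the normal errors and the number of replications.

```python
import random

random.seed(7)


def slope_variance(x, sigma2):
    """Theoretical Var(b1 | x) = sigma^2 / sum((x_i - x_bar)^2) under RLS.1 to RLS.5."""
    x_bar = sum(x) / len(x)
    return sigma2 / sum((xi - x_bar) ** 2 for xi in x)


# Hypothetical design: fixed x values and homoskedastic N(0, sigma^2) errors.
beta0, beta1, sigma = 1.0, 0.5, 1.0
x = [i / 5 for i in range(50)]
x_bar = sum(x) / len(x)
s_xx = sum((xi - x_bar) ** 2 for xi in x)

theoretical = slope_variance(x, sigma ** 2)

# Monte Carlo check: empirical variance of the slope across repeated samples.
estimates = []
for _ in range(2000):
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    y_bar = sum(y) / len(y)
    estimates.append(sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx)
mean_b1 = sum(estimates) / len(estimates)
empirical = sum((b - mean_b1) ** 2 for b in estimates) / (len(estimates) - 1)

# Comparative statics implied by the formula.
var_more_noise = slope_variance(x, 4 * sigma ** 2)               # larger error variance
var_more_spread = slope_variance([2 * xi for xi in x], sigma ** 2)  # more dispersed x
var_more_data = slope_variance(x + x, sigma ** 2)                # twice as many observations

print(f"theoretical: {theoretical:.5f}, empirical: {empirical:.5f}")
print(var_more_noise, var_more_spread, var_more_data)
```

The empirical variance of the simulated slopes is close to the theoretical value, and the formula moves in the directions described above when the error variance, the dispersion of x or the sample size change.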

Estimation of the variance of the error term

The variances of $\hat\beta_0$ and $\hat\beta_1$ depend on the sample values of $x_i$, which are observable, and on the variance of the error term, $\sigma^2$, which is an unknown parameter. Therefore, in order to estimate the variances of $\hat\beta_0$ and $\hat\beta_1$ we have to obtain an estimator of $\sigma^2$.

Since $\sigma^2$ is the variance of the error term u, which as we saw above equals the expectation of $u^2$ (given that the mean of u is zero by assumption RLS.3), we could think of using the sample mean of the squared errors

$$w = \frac{1}{n} \sum_{i=1}^{n} u_i^2$$

as an estimator of $\sigma^2$.

If we could compute w as a function of the sample, w would be an unbiased estimator of $\sigma^2$, since

$$E\left( \frac{1}{n} \sum_{i=1}^{n} u_i^2 \right) = \frac{1}{n} \sum_{i=1}^{n} E(u_i^2) = \sigma^2$$

The problem is that w is not an estimator, since it cannot be computed as a function of the sample because the errors are not observable.

What we can compute as a function of the sample is the residuals $\hat{u}_i$. In what follows, we see that the residuals are estimates of the errors and how to obtain an unbiased estimator of $\sigma^2$ as a function of the squared residuals.

Recall that the residual of observation i is defined as

$$\hat{u}_i = y_i - \hat{y}_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$$

and since the error of observation i is

$$u_i = y_i - \beta_0 - \beta_1 x_i$$

we can think of the residuals as estimates of the errors. In this way, we can define the following estimator of $\sigma^2$:

$$\hat{w} = \frac{1}{n} \sum_{i=1}^{n} \hat{u}_i^2$$

$\hat{w}$ is an estimator of $\sigma^2$ but it is not unbiased. The reason why this estimator is not unbiased is that, as opposed to the errors, which are independent, the residuals are not independent, since they satisfy the two linear restrictions seen in Section 4 (equations (12) and (13)).

Therefore, since the residuals satisfy two linear restrictions, they have only n - 2 degrees of freedom and the unbiased estimator of $\sigma^2$ is

$$\hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2$$

(proof on page 62 of Wooldridge).

Using this estimator of $\sigma^2$, the estimated variances of $\hat\beta_1$ and $\hat\beta_0$ are defined as follows:

$$\widehat{Var}(\hat\beta_1) = \frac{\hat\sigma^2}{(n-1) S_x^2} \quad \text{and} \quad \widehat{Var}(\hat\beta_0) = \frac{\hat\sigma^2\, \overline{x^2}}{(n-1) S_x^2}$$
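The role of the degrees-of-freedom correction can be seen in a simulation with a small sample, where dividing by n instead of n - 2 makes a visible difference. The Python sketch below is only illustrative: the sample size of 10, the parameter values and the normal errors are assumptions, not part of the course examples.

```python
import random

random.seed(3)

# Hypothetical design: n = 10 fixed x values, errors N(0, sigma^2) with sigma^2 = 4.
beta0, beta1, sigma = 1.0, 0.5, 2.0
x = [float(i) for i in range(1, 11)]
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)


def sum_squared_residuals():
    """Draw a new sample of y given x, fit OLS, and return the sum of squared residuals."""
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


reps = 5000
ssr_draws = [sum_squared_residuals() for _ in range(reps)]

mean_w_hat = sum(s / n for s in ssr_draws) / reps             # divides by n
mean_sigma2_hat = sum(s / (n - 2) for s in ssr_draws) / reps  # divides by n - 2

print(f"true sigma^2:              {sigma ** 2}")
print(f"average of SSR / n:        {mean_w_hat:.3f}")
print(f"average of SSR / (n - 2):  {mean_sigma2_hat:.3f}")
```

Across replications the estimator that divides by n averages about $\sigma^2 (n-2)/n = 3.2$, below the true value of 4, whereas the estimator that divides by n - 2 averages close to 4.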

The Standard Error of the Regression (SER) is defined as

$$\hat\sigma = \sqrt{\hat\sigma^2}$$

$\hat\sigma$ is an estimator of the standard deviation of the error term, $\sigma$. Although $\hat\sigma$ is not an unbiased estimator of $\sigma$, we see below that it has other good properties when the sample is large.

The standard error of $\hat\beta_1$, denoted by $se(\hat\beta_1)$, is defined as

$$se(\hat\beta_1) = \frac{\hat\sigma}{\sqrt{(n-1) S_x^2}}$$

$se(\hat\beta_1)$ is an estimator of the standard deviation of $\hat\beta_1$ and therefore a measure of the precision of $\hat\beta_1$.

Analogously, the standard error of $\hat\beta_0$, denoted by $se(\hat\beta_0)$, is defined as

$$se(\hat\beta_0) = \frac{\hat\sigma \sqrt{\overline{x^2}}}{\sqrt{(n-1) S_x^2}}$$

$se(\hat\beta_0)$ is an estimator of the standard deviation of $\hat\beta_0$ and therefore a measure of the dispersion of $\hat\beta_0$.

$se(\hat\beta_1)$ is a random variable since, given the values of $x_i$, it takes different values for different samples of y. For a given sample, the standard error $se(\hat\beta_1)$ is a number, just as $\hat\beta_1$ is when we compute it with a particular sample. The same happens with $se(\hat\beta_0)$.

The standard errors play a very important role in inference, that is, when testing restrictions on the parameters of the model or when computing confidence intervals.
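Putting the formulas of this section together, the following Python sketch computes the OLS estimates, the standard error of the regression and the two standard errors for a tiny made-up data set. The function name `ols_with_standard_errors` and the data are hypothetical and serve only to show the order of the computations.

```python
import math


def ols_with_standard_errors(x, y):
    """Return (b0, b1, sigma_hat, se_b0, se_b1) for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)   # equals (n - 1) * S_x^2
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar

    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(ssr / (n - 2))        # standard error of the regression

    mean_x2 = sum(xi ** 2 for xi in x) / n      # sample mean of x squared
    se_b1 = sigma_hat / math.sqrt(s_xx)
    se_b0 = sigma_hat * math.sqrt(mean_x2) / math.sqrt(s_xx)
    return b0, b1, sigma_hat, se_b0, se_b1


# Hypothetical data, small enough to verify by hand.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1, sigma_hat, se_b0, se_b1 = ols_with_standard_errors(x, y)
print(f"b0 = {b0:.3f} (se {se_b0:.3f}), b1 = {b1:.3f} (se {se_b1:.3f}), SER = {sigma_hat:.3f}")
```

For these data the sum of squared residuals is 2.4, so $\hat\sigma^2 = 2.4 / 3 = 0.8$; with $\sum (x_i - \bar{x})^2 = 10$ and $\overline{x^2} = 11$, the standard errors are $\sqrt{0.08}$ for the slope and $\sqrt{0.88}$ for the constant.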