ECON 351 - The Simple Regression Model Maggie Jones 1 / 41
The Simple Regression Model

Our starting point is the simple regression model, where we look at the relationship between two variables. In general, more complicated econometric models are used for empirical analysis, but this provides a good starting point.

Suppose we have two variables, x and y, and we are interested in the relationship between the two. Specifically, we care about the question: how does x affect y? Typically, we don't observe the full population of y or the full population of x, so we can think of y and x as random samples.
The Simple Regression Model

In determining the relationship between x and y, we should keep three questions in mind:
1. How do we allow for factors other than x that might affect y?
2. What is the functional relationship between x and y?
3. How can we be certain we are capturing the ceteris paribus relationship between x and y?

We resolve these questions by writing down an equation relating y to x.
The Simple Regression Model

y = β0 + β1 x + u    (1)

We call equation (1) the simple linear regression model:
- y is called the dependent variable
- x is called the independent variable
- u is called the error term; it represents everything else that helps to explain y but is not contained in x
The Simple Regression Model

Equation (1) assumes a linear functional form, i.e. it assumes that the relationship between x and y is linear:
- β0 is the intercept term/parameter
- β1 is the slope parameter; it measures the effect of x on y, holding all other factors constant

y = β0 + β1 x + u

Note: in what instances would a linear functional form be a poor choice?
More on the Error Term

As long as β0 is included in the equation, we can assume that the average value of u in the population is zero:

E(u) = 0    (2)

A crucial assumption is that the average value of u does not depend on x; this is known as mean independence:

E(u|x) = E(u)    (3)

Combining equations (2) and (3) yields one of the most important assumptions in regression analysis, the zero conditional mean assumption:

E(u|x) = 0    (4)
The Simple Regression Model

The zero conditional mean assumption gives β1 another interpretation. Taking conditional expectations of equation (1) yields:

E(y|x) = β0 + β1 x    (5)

which is known as the population regression function. We interpret β1 as: a 1-unit increase in x increases the expected value of y by β1 units.
The Simple Regression Model

We can now re-consider equation (1):

y = (β0 + β1 x) + u

y can be decomposed into:
- the explained part, β0 + β1 x: the part of y explained by x
- the unexplained part, u: the part of y that can't be explained by x
Ordinary Least Squares

Now we can begin to discuss how to estimate β0 and β1 given a random sample of y and x. Let {(xi, yi) : i = 1, ..., n} be a random sample of size n drawn from the population (x, y):

yi = β0 + β1 xi + ui    (6)

How do we use the data to obtain parameter estimates of the population intercept and slope?
Ordinary Least Squares

We begin with the zero conditional mean assumption of equation (4), which implies:

Cov(x, u) = E(ux) = 0    (7)

And the zero mean assumption of equation (2):

E(u) = 0    (8)

These two equations are known as moment conditions.
Ordinary Least Squares

We then define u in terms of the simple regression equation, and our moment conditions become:

E(ux) = E[(y − β0 − β1 x) x] = 0    (9)

and, for the zero mean assumption of equation (2):

E(u) = E(y − β0 − β1 x) = 0    (10)
Ordinary Least Squares

Given our sample of x and y, using the method of moments, we choose our parameter estimates β̂0 and β̂1 to solve the sample analogues of the moment conditions:

(1/n) Σ_{i=1}^n (yi − β̂0 − β̂1 xi) xi = 0    (11)

(1/n) Σ_{i=1}^n (yi − β̂0 − β̂1 xi) = 0    (12)
Ordinary Least Squares

Solving yields the parameter estimate for β0:

β̂0 = ȳ − β̂1 x̄    (13)

And the estimate for β1:

β̂1 = Σ_{i=1}^n (xi − x̄)(yi − ȳ) / Σ_{i=1}^n (xi − x̄)²    (14)

Equation (14) is just the sample covariance between x and y divided by the sample variance of x.
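As a quick numerical check, equations 13 and 14 can be computed directly with numpy. This is a sketch on simulated data; the data-generating values (β0 = 2.0, β1 = 0.5, the sample size, and the seed) are assumptions for illustration, not part of the notes.

```python
import numpy as np

# Simulated data: true beta0 = 2.0 and beta1 = 0.5 are assumed values
rng = np.random.default_rng(0)
n = 200
x = rng.normal(5.0, 2.0, n)
u = rng.normal(0.0, 1.0, n)      # unobserved error with E(u) = 0
y = 2.0 + 0.5 * x + u

x_bar, y_bar = x.mean(), y.mean()
# Equation 14: sample covariance of x and y over sample variance of x
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
# Equation 13
beta0_hat = y_bar - beta1_hat * x_bar

print(beta0_hat, beta1_hat)
```

Because the numerator and denominator of equation 14 share the same 1/(n−1) factor, `beta1_hat` equals `np.cov(x, y)[0, 1] / np.var(x, ddof=1)` exactly.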
Ordinary Least Squares

The method of moments is not the only way to arrive at these equations for the parameter estimates of β0 and β1. The focus of Econ 351 will be on the method of Ordinary Least Squares. Our estimates β̂0 and β̂1 are also called the ordinary least squares (OLS) estimates.
Ordinary Least Squares

To see why, define a fitted value as the value of yi that we obtain from combining the sample xi with our parameter estimates β̂0 and β̂1:

ŷi = β̂0 + β̂1 xi

Define the residual as the difference between the actual value yi and the fitted value ŷi:

ûi = yi − ŷi = yi − β̂0 − β̂1 xi
[Figure 2.4: Fitted values and residuals. For each observation xi, the residual ûi is the vertical distance between yi and the fitted regression line ŷ = β̂0 + β̂1 x.]
Ordinary Least Squares

It seems reasonable to want parameter values that minimize the difference between the true yi and the fitted value ŷi. Sometimes ûi will be positive and sometimes it will be negative, so in theory the residuals could sum to zero even when the fit is poor. If we instead square the residuals, we have a more accurate summary of the total error in the regression.
Ordinary Least Squares

Choosing parameter values for β0 and β1 that minimize the sum of squared residuals is the basic principle behind ordinary least squares:

Σ_{i=1}^n ûi² = Σ_{i=1}^n (yi − β̂0 − β̂1 xi)²    (15)

To minimize equation (15), we set the first order conditions with respect to each of the β̂s equal to zero.
Ordinary Least Squares

The fitted values and parameter values form the OLS regression line:

ŷ = β̂0 + β̂1 x    (16)

The slope estimate tells us the amount by which ŷ changes when x changes by one unit:

β̂1 = Δŷ / Δx
Useful Properties of OLS Estimates

1. The sum of the OLS residuals is zero:
   Σ_{i=1}^n ûi = 0
2. The sample covariance between x and û is zero:
   Σ_{i=1}^n xi ûi = 0
3. The point (x̄, ȳ) is always on the OLS regression line.
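All three properties can be verified numerically on any sample. A minimal sketch with numpy (the data-generating values here are assumed purely for illustration):

```python
import numpy as np

# Simulated sample; intercept 1.0 and slope 2.0 are illustrative assumptions
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 100)
y = 1.0 + 2.0 * x + rng.normal(0.0, 3.0, 100)

# OLS estimates via the closed-form solutions
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)        # OLS residuals

print(u_hat.sum())                        # property 1: zero up to rounding
print((x * u_hat).sum())                  # property 2: zero up to rounding
print(b0 + b1 * x.mean() - y.mean())      # property 3: line passes through (x_bar, y_bar)
```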
Useful Properties of OLS Estimates

Re-writing yi in terms of its fitted value and its residual is useful:

yi = ŷi + ûi

From here we see that:
- If (1/n) Σ_{i=1}^n ûi = 0, then the sample average of the fitted values ŷi equals ȳ
- The sample covariance of ŷi and ûi is zero

OLS decomposes yi into two parts, a fitted value and a residual, which are uncorrelated with each other.
Sum of Squares

1. Total Sum of Squares: SST = Σ_{i=1}^n (yi − ȳ)²
2. Explained Sum of Squares: SSE = Σ_{i=1}^n (ŷi − ȳ)²
3. Residual Sum of Squares: SSR = Σ_{i=1}^n (yi − ŷi)²
Sum of Squares

1. Total Sum of Squares: measures the total sample variation in the yi (how spread out the yi are in the sample)
2. Explained Sum of Squares: measures the sample variation in the fitted values, ŷi
3. Residual Sum of Squares: measures the sample variation in the residuals, ûi

Note that the total variation can be expressed as the sum of the explained and unexplained variation: SST = SSE + SSR.
Goodness of Fit

One of the most common ways to measure how well a regression fits the data is the R-squared:

R² = SSE/SST = 1 − SSR/SST    (17)

It tells us the ratio of the explained variation to the total variation. So if the majority of the variation in y is explained by unobserved factors, the R² tends to be very low. R² is always between 0 and 1.
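The decomposition SST = SSE + SSR, and the two equivalent expressions for R² in equation 17, can be checked on simulated data. A sketch with numpy (the true parameters below are assumed values):

```python
import numpy as np

# Simulated data; true intercept 3.0, slope -1.5, error sd 2.0 are assumptions
rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = 3.0 - 1.5 * x + rng.normal(scale=2.0, size=500)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
ssr = np.sum((y - y_hat) ** 2)         # residual sum of squares

r2 = sse / sst
print(r2)     # matches 1 - ssr/sst, as in equation 17
```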
Notes on the R²

A low R² does not necessarily mean that the regression is bad and shouldn't be used. It simply means that the variable x does not explain much of the variation in the variable y, i.e. there are other variables that might help to explain y. The regression may still provide an accurate summary of the relationship between x and y.
Functional Form

Level-Level: dependent and independent variables are in levels and related linearly:
  y = β0 + β1 x + u
Log-Level: dependent variable is in log form, independent variable in levels:
  log(y) = β0 + β1 x + u
Log-Log: dependent and independent variables are in log form; β1 can be interpreted as an elasticity:
  log(y) = β0 + β1 log(x) + u
Level-Log: dependent variable is in levels and independent variable in log form:
  y = β0 + β1 log(x) + u
Functional Form

Model     Equation                      Y        X        Interpretation of β1
Lev-Lev   y = β0 + β1 x + u             y        x        Δy = β1 Δx
Log-Lev   log(y) = β0 + β1 x + u        log(y)   x        %Δy = (100 β1) Δx
Log-Log   log(y) = β0 + β1 log(x) + u   log(y)   log(x)   %Δy = β1 %Δx
Lev-Log   y = β0 + β1 log(x) + u        y        log(x)   Δy = (β1/100) %Δx
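To illustrate the log-log interpretation, a simulated constant-elasticity relationship recovers β1 as the elasticity. This is a sketch; the elasticity 1.25 and the other data-generating values are assumed for the demonstration:

```python
import numpy as np

# Constant-elasticity data: log(y) = 0.7 + 1.25 log(x) + error (assumed values)
rng = np.random.default_rng(3)
x = np.exp(rng.normal(size=1000))          # strictly positive regressor
log_y = 0.7 + 1.25 * np.log(x) + rng.normal(scale=0.1, size=1000)

# OLS of log(y) on log(x): the slope is the estimated elasticity
lx = np.log(x)
b1 = np.sum((lx - lx.mean()) * (log_y - log_y.mean())) / np.sum((lx - lx.mean()) ** 2)
print(b1)    # close to the true elasticity of 1.25
```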
Unbiasedness of OLS

Unbiasedness is a statistical property that we will examine in the context of our simple linear regression model. We require four assumptions to establish the unbiasedness of the OLS parameters:

SLR.1 - Linear in Parameters: the model is of the form y = β0 + β1 x + u
SLR.2 - Random Sampling: {(xi, yi) : i = 1, ..., n} is drawn from a random sample
SLR.3 - Variation in x: the sample outcomes on x are not all the same value
SLR.4 - Zero Conditional Mean: our previous assumption E(u|x) = 0 holds
Unbiasedness of OLS

Now consider rewriting β̂1 as:

β̂1 = Σ_{i=1}^n (xi − x̄) yi / Σ_{i=1}^n (xi − x̄)²

Recall from the review that an estimator is unbiased if its expectation equals the true parameter value. Substituting the regression equation for yi yields:

β̂1 = Σ_{i=1}^n (xi − x̄)(β0 + β1 xi + ui) / Σ_{i=1}^n (xi − x̄)²
Unbiasedness of OLS

Which, cancelling terms that equal zero, is:

β̂1 = β1 + Σ_{i=1}^n (xi − x̄) ui / Σ_{i=1}^n (xi − x̄)²

Checking unbiasedness (treating the xi as fixed, so the denominator can be pulled outside the expectation):

E(β̂1) = β1 + E[ Σ_{i=1}^n (xi − x̄) ui / Σ_{i=1}^n (xi − x̄)² ]
       = β1 + (1 / Σ_{i=1}^n (xi − x̄)²) Σ_{i=1}^n (xi − x̄) E(ui)

And since E(ui) = 0, we have:

E(β̂1) = β1
Unbiasedness of OLS

Now to verify the unbiasedness of β̂0:

β̂0 = ȳ − β̂1 x̄ = β0 + β1 x̄ + ū − β̂1 x̄

Taking expectations:

E(β̂0) = β0 + β1 x̄ − E(β̂1) x̄ + E(ū)

Since E(β̂1) = β1 and E(ū) = 0:

E(β̂0) = β0

So β̂0 is also unbiased under SLR.1 - SLR.4.
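Unbiasedness is a statement about the average of β̂1 across repeated samples, which a small Monte Carlo experiment makes concrete. A sketch, assuming illustrative true parameters and a design that satisfies SLR.1-SLR.4 by construction:

```python
import numpy as np

# Monte Carlo check of unbiasedness: draw many random samples from a model
# satisfying SLR.1-SLR.4 and average the OLS slope estimates.
# True values beta0 = 1.0, beta1 = 0.8 are assumptions of this demo.
rng = np.random.default_rng(4)
beta0, beta1 = 1.0, 0.8
estimates = []
for _ in range(2000):
    x = rng.normal(0.0, 1.0, 50)
    u = rng.normal(0.0, 1.0, 50)   # E(u|x) = 0 holds by construction
    y = beta0 + beta1 * x + u
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    estimates.append(b1)

print(np.mean(estimates))          # centers on the true slope beta1 = 0.8
```

Any single β̂1 misses the truth, but the distribution of estimates is centered on β1, which is exactly what E(β̂1) = β1 says.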
Variance of the OLS Estimate

We also wish to know how far we can expect β̂1 to be from β1 on average. We can compute the variance of the OLS estimators under assumptions SLR.1 - SLR.4, plus one additional assumption:

SLR.5 - Homoskedasticity: the error term has the same variance given any value of the explanatory variable:

Var(u|x) = σ²
Variance of the OLS Estimate

Under SLR.1 - SLR.5, the variances of the OLS estimators are:

Var(β̂1) = σ² / Σ_{i=1}^n (xi − x̄)²

and

Var(β̂0) = σ² (n⁻¹ Σ_{i=1}^n xi²) / Σ_{i=1}^n (xi − x̄)²
Estimating the Error Variance

Typically, we don't know the true value of σ², so we need to obtain an estimate of it. The errors are never observed, but the regression residuals are. Note that E(u²) = σ², since Var(u) = E(u²) − [E(u)]² and E(u) = 0. Thus, an unbiased estimator of σ² would be (1/n) Σ_{i=1}^n ui². However, we do not observe the ui; we observe the ûi.
Estimating the Error Variance

Replacing ui with ûi yields the estimator:

σ̂² = (1/n) Σ_{i=1}^n ûi²

However, this estimator is biased. Recall the two restrictions from the first order conditions: Σ_{i=1}^n ûi = 0 and Σ_{i=1}^n xi ûi = 0. If we observed n − 2 of the residuals, we could always use these two conditions to back out the remaining two, so the residuals have only n − 2 degrees of freedom.
Estimating the Error Variance

Our estimate of the error variance makes an adjustment for the degrees of freedom:

σ̂² = (1/(n − 2)) Σ_{i=1}^n ûi²    (18)

Is σ̂² unbiased? Yes!
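The effect of the degrees-of-freedom adjustment is easy to see by simulation: dividing by n understates σ² on average, while dividing by n − 2 as in equation 18 does not. A sketch assuming illustrative true values (σ² = 4.0, small n to make the bias visible):

```python
import numpy as np

# Monte Carlo comparison of dividing by n versus n - 2 (equation 18).
# sigma2 = 4.0 and the betas are assumed values for this sketch.
rng = np.random.default_rng(5)
n, sigma2 = 10, 4.0
naive, adjusted = [], []
for _ in range(5000):
    x = rng.normal(size=n)
    y = 1.0 + 2.0 * x + rng.normal(scale=np.sqrt(sigma2), size=n)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    u_hat = y - b0 - b1 * x
    naive.append(np.sum(u_hat ** 2) / n)           # biased toward zero
    adjusted.append(np.sum(u_hat ** 2) / (n - 2))  # degrees-of-freedom adjusted

print(np.mean(naive), np.mean(adjusted))   # the adjusted mean sits near sigma2 = 4.0
```

With n = 10 the naive estimator averages about (n − 2)/n = 80% of the true σ², which is exactly the bias that the n − 2 divisor removes.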
Estimators of the OLS Parameter Variances

We can use equation (18) in Var(β̂0) and Var(β̂1) to obtain estimates of the variances of β̂0 and β̂1:

Estimated Var(β̂1) = σ̂² / Σ_{i=1}^n (xi − x̄)²

Estimated Var(β̂0) = σ̂² (n⁻¹ Σ_{i=1}^n xi²) / Σ_{i=1}^n (xi − x̄)²
Additional Notes on Variance Estimates

We call the square root of the estimate of the variance of the errors the standard error of the regression:

σ̂ = √σ̂²

σ̂ is used to compute the standard error of β̂1:

se(β̂1) = σ̂ / √( Σ_{i=1}^n (xi − x̄)² )
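Putting the pieces together, the standard error of β̂1 can be computed from a single sample in a few lines. A sketch on simulated data (the true parameters 0.5 and 1.0 are illustrative assumptions):

```python
import numpy as np

# Standard error of the OLS slope from one simulated sample
rng = np.random.default_rng(6)
n = 100
x = rng.normal(size=n)
y = 0.5 + 1.0 * x + rng.normal(size=n)

sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x

sigma_hat = np.sqrt(np.sum(u_hat ** 2) / (n - 2))  # standard error of the regression
se_b1 = sigma_hat / np.sqrt(sxx)                   # se(beta1_hat)
print(se_b1)
```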
Regression Through the Origin

In some instances it makes sense to exclude the constant term from the model. The resulting regression equation is called a regression through the origin, since we are imposing that the intercept equals 0:

y = β1 x + u    (19)

Minimizing the sum of squared residuals for this regression yields the following estimate for β1 (written β̃1 to distinguish it from the estimate with an intercept):

β̃1 = Σ_{i=1}^n xi yi / Σ_{i=1}^n xi²
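The closed-form estimate for regression through the origin is one line of numpy. A sketch on simulated data whose true intercept is zero (the slope 3.0 is an assumed value for illustration):

```python
import numpy as np

# Regression through the origin on data generated without an intercept
rng = np.random.default_rng(7)
x = rng.uniform(1.0, 10.0, 200)
y = 3.0 * x + rng.normal(scale=2.0, size=200)

b1_origin = np.sum(x * y) / np.sum(x ** 2)   # closed-form through-origin estimate
print(b1_origin)                             # close to the true slope 3.0
```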