Econ7 Applied Econometrics
Topic 3: Classical Model (Studenmund, Chapter 4)

We have defined OLS and studied some algebraic properties of OLS. In this topic we will study the statistical properties of OLS estimators, which help justify the use of OLS and also help us make statistical inferences.

I. Classical Assumptions

To study the statistical properties of OLS estimators, we need to impose a set of assumptions, most of them on the error term. When the first 5 assumptions hold, we call the error term a classical error term. If Assumption 7 is added, it is called a classical normal error term.

Assumption 1: No specification error in the model. That is, the regression model is linear in the coefficients, is correctly specified (has the correct variables and functional forms), has no measurement error, and has an additive error term.

$$Y_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_K X_{Ki} + \varepsilon_i$$

Assumption 2: Disturbances have zero mean. The expected value of the disturbance term is zero.

$$E(\varepsilon_i) = 0$$

Assumption 3: All the independent variables are uncorrelated with the error term (we say the independent variables are exogenous). That is, for all $i = 1, \ldots, n$ and $j = 1, \ldots, K$,

$$\mathrm{Cov}(\varepsilon_i, X_{ji}) = E[(\varepsilon_i - E(\varepsilon_i))(X_{ji} - E(X_{ji}))] = E[\varepsilon_i (X_{ji} - E(X_{ji}))] = E(\varepsilon_i X_{ji}) - E(\varepsilon_i) E(X_{ji}) = E(\varepsilon_i X_{ji}) = 0$$

Assumption 4: No autocorrelation between the errors. Given any values of $i$ and $j$ (where $i \neq j$), the correlation between $\varepsilon_i$ and $\varepsilon_j$ is zero.

$$\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = E(\varepsilon_i \varepsilon_j) = 0, \quad i \neq j$$

If this is true for time series data, we say the errors are serially uncorrelated. Cross-sectional data are less prone to correlated errors, but there are exceptions.

Assumption 5: Errors are homoskedastic. Given the value of $X$, the variance of the disturbance is the same for all observations.

$$\mathrm{Var}(\varepsilon_i) = E[\varepsilon_i - E(\varepsilon_i)]^2 = E(\varepsilon_i^2) = \sigma^2$$

When this assumption does not hold, the error is said to be heteroskedastic. Heteroskedasticity is traditionally believed to be an issue for cross-sectional data. However, it may well be a problem in the time series context.

Assumption 6: No 'perfect multicollinearity' between independent variables. That is, no explanatory variable can be written as a linear function of other explanatory variables, e.g., the following equation cannot hold:

$$X_{1i} = \alpha_0 + \alpha_2 X_{2i} + \cdots + \alpha_K X_{Ki}$$

If the above equation holds, there is perfect multicollinearity among the explanatory variables. When $K = 2$, we say $X_1$ and $X_2$ are perfectly collinear.

Suppose we have a 3-variable regression model,

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$$

If there is an exact linear relationship between the independent variables,

$$X_{2i} = \alpha X_{1i} \quad \text{for some parameter } \alpha,$$

substitute it into the original regression:

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 \alpha X_{1i} + \varepsilon_i = \beta_0 + (\beta_1 + \alpha \beta_2) X_{1i} + \varepsilon_i = \beta_0 + \gamma X_{1i} + \varepsilon_i$$

This reduces to a 2-variable regression with a single slope coefficient $\gamma = \beta_1 + \alpha \beta_2$. We can't identify the separate effects of the independent variables.

Assumption 7: Errors are normally distributed. Combined with Assumptions 2, 4 & 5, we have $\varepsilon_i \sim N(0, \sigma^2)$. This assumption matters mainly when the sample is small; in large samples it is much less important.
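
The perfect-collinearity example under Assumption 6 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the sample size, the value of alpha and the "true" coefficients are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed; simulated data only
n = 50
alpha = 2.0                              # hypothetical value of alpha
x1 = rng.normal(size=n)
x2 = alpha * x1                          # exact linear relationship X2 = alpha * X1

# Design matrix with a constant, X1 and X2 (three columns).
X = np.column_stack([np.ones(n), x1, x2])

# The columns are linearly dependent, so the rank is 2 rather than 3
# and X'X is singular: beta1 and beta2 cannot be estimated separately.
rank_X = np.linalg.matrix_rank(X)
print("rank of X:", rank_X)

# Only the combined slope gamma = beta1 + alpha * beta2 is identified.
beta0, beta1, beta2 = 1.0, 0.5, 0.25     # hypothetical true coefficients
y = beta0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)
X_reduced = np.column_stack([np.ones(n), x1])
coef, *_ = np.linalg.lstsq(X_reduced, y, rcond=None)
gamma_hat = coef[1]
print("estimated gamma:", gamma_hat, "true gamma:", beta1 + alpha * beta2)
```

Regressing Y on X1 alone recovers an estimate of gamma, but no amount of data generated this way can separate beta1 from beta2.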
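
II. Further Details about Assumptions

Assumption 1: No specification error.

1. Linearity may be in disguise. Consider the following model

$$Y_i = e^{\beta_0} X_i^{\beta_1} e^{\varepsilon_i}$$

where $e$ is the exponential. This model looks nonlinear, but it can be transformed into a linear form by taking logs of both sides:

$$\ln(Y_i) = \beta_0 + \beta_1 \ln(X_i) + \varepsilon_i$$

which is linear in the $\beta$s, with $\ln(Y)$ and $\ln(X)$ as the dependent and independent variables, respectively. Note that the following models are all linear in the coefficients:

$$\ln(Y_i) = \beta_0 + \beta_1 X_i + \varepsilon_i$$
$$Y_i = \beta_0 + \beta_1 \ln(X_i) + \varepsilon_i$$
$$Y_i = \beta_0 + \beta_1 X_i^2 + \varepsilon_i$$
$$Y_i = \beta_0 + \beta_1 \frac{1}{X_i} + \varepsilon_i$$

2. Correct specification also requires all relevant explanatory variables to be taken into account. If the true model is

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$$

but you estimate the model

$$Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$$

then your model is misspecified. We call this the missing relevant variable problem. However, if you estimate the model

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$$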
then your model is also misspecified. We call this the including irrelevant variable problem.

3. In the case where you estimate the model with a different functional form, e.g.,

$$Y_i = \beta_0 + \beta_1 X_{1i}^2 + \beta_2 X_{2i} + \varepsilon_i$$

you also commit a specification error.

Assumption 2: Error has zero mean.

This is a rather weak assumption. What would happen if you estimate the following model?

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad E(\varepsilon_i) = \delta \neq 0$$

We can introduce another error term $\varepsilon_i^*$ so that

$$\varepsilon_i^* = \varepsilon_i - \delta$$

As a result, the new error term has zero mean and the model becomes

$$Y_i = (\beta_0 + \delta) + \beta_1 X_i + \varepsilon_i^*$$

This implies that the estimated parameters are $\beta_0 + \delta$ and $\beta_1$. In other words, if the parameter of interest is $\beta_1$ and a constant intercept is included in the model, Assumption 2 is automatically satisfied.

Assumption 3: Explanatory variables are exogenous.

What might cause a violation of the assumption $E(\varepsilon_i X_{ki}) = 0$ for some $k$? We call a violation of exogeneity the endogeneity (or simultaneity) problem. Exogeneity holds when the explanatory variable is determined independently of the error term, that is, outside of the model. This assumption is automatically satisfied if $X$ is non-stochastic. However, if both the independent and dependent variables are simultaneously determined in the model, we have the endogeneity problem.

Let's use the following example to illustrate how the exogeneity assumption is violated. (Note that the example used in the textbook is not very clear.)

$$Q_d = \alpha P + \varepsilon_1$$

where $Q_d$ is the quantity demanded of a good and $P$ is its price. This model is a demand function, and in this model $P$ cannot be exogenous. This is because $P$ cannot be determined outside of the model, i.e., $Q_d$ and $P$ are simultaneously determined within the model. To see this, we have to examine how the price is determined:

$$Q_s = \beta P + \varepsilon_2$$
$$Q_d = Q_s$$

The first of these two equations is the supply function. The second is the equilibrium condition. Solving all three equations for $P$, we have

$$P = \frac{\varepsilon_2 - \varepsilon_1}{\alpha - \beta}$$

Since $P$ is a function of $\varepsilon_1$, $P$ must be correlated with $\varepsilon_1$. Hence the exogeneity assumption is violated.

Assumptions 4, 5, and 6 (among others) will be discussed in subsequent topics.

Assumption 7: Normality.

This assumption is often justified by the Central Limit Theorem.

Central Limit Theorem: The mean (or sum) of a large number of independent and identically distributed (iid) random variables will tend to be normally distributed, regardless of their underlying distribution, if the number of such variables is large enough.

III. Sampling Distribution of $\hat{\beta}$

We need to know something about the precision of our least-squares estimators. The idea is that the coefficient estimates are a function of the sample data, so they will vary from sample to sample. We care about the reliability of our estimates of the coefficients.

For the simple linear regression (SLR) model, under Assumptions 1-6, the OLS estimators are unbiased, that is,
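
$$E(\hat{\beta}_0) = \beta_0, \quad E(\hat{\beta}_1) = \beta_1.$$

We also have formulae for the variances and standard deviations of the OLS estimators (proofs are available in more advanced econometrics textbooks). Here $x_i = X_i - \bar{X}$ denotes the deviation of $X_i$ from its sample mean.

For the estimated intercept:

$$\mathrm{Var}(\hat{\beta}_0) = \frac{\sum X_i^2}{n \sum x_i^2} \sigma^2, \quad \sigma(\hat{\beta}_0) = \sqrt{\frac{\sum X_i^2}{n \sum x_i^2}} \, \sigma$$

For the estimated slope coefficient:

$$\mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2}{\sum x_i^2}, \quad \sigma(\hat{\beta}_1) = \frac{\sigma}{\sqrt{\sum x_i^2}}$$

The variance of the error term can be estimated with:

$$\hat{\sigma}^2 = \frac{\sum e_i^2}{n - 2}$$

where $e_i$ are the OLS residuals and $n - 2$ is the number of degrees of freedom (df). The square root,

$$\hat{\sigma} = \sqrt{\frac{\sum e_i^2}{n - 2}},$$

is known as the Standard Error of the Estimate (or of the Regression). It is a common summary statistic from regression analysis.

Finally, note the non-zero covariance between the two coefficient estimates. Since the variance is always positive, the sign of the covariance depends on the mean of $X$.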

$$\mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1) = -\bar{X} \, \mathrm{Var}(\hat{\beta}_1) = -\bar{X} \frac{\sigma^2}{\sum x_i^2}$$

If, in addition, Assumption 7 is also satisfied, we have the sampling distribution for $\hat{\beta}$:

$$\hat{\beta}_0 \sim N(\beta_0, \mathrm{Var}(\hat{\beta}_0)), \quad \hat{\beta}_1 \sim N(\beta_1, \mathrm{Var}(\hat{\beta}_1))$$

The same ideas apply to multiple linear regression (MLR) models. However, the expressions are more complicated when written in non-matrix notation.

IV. Statistical Properties of OLS Estimators: Gauss-Markov Theorem

Gauss-Markov Theorem: Under Assumptions 1-6, the OLS estimators, in the class of unbiased linear estimators, have minimum variance (i.e., they are the best linear unbiased estimators (BLUE)).

Won't prove this proposition.

The estimated coefficients and the estimated variance of the disturbances are unbiased:

$$E(\hat{\beta}_0) = \beta_0, \quad E(\hat{\beta}_1) = \beta_1, \quad E(\hat{\sigma}^2) = \sigma^2$$

On average, we'll get the estimated coefficients and the variance of the disturbances right in repeated sampling.

The variances of $\hat{\beta}_0$ and $\hat{\beta}_1$ are smaller than those of any other linear unbiased estimator. They have minimum variance.

Remarks:
1) The Gauss-Markov theorem does not say that OLS estimates are normally distributed (and it does not depend on Assumption 7).
2) But if Assumption 7 is met, the OLS estimates are normally distributed.
3) If the errors are not normal, the OLS estimates are approximately normally distributed provided the sample size is large enough. This is due to the Central Limit Theorem.

Consistency: The OLS estimator is also consistent. That is, as $n$ goes to infinity, $\hat{\beta}_0 \to \beta_0$ and $\hat{\beta}_1 \to \beta_1$ in probability. Why?
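
The consistency claim can be illustrated numerically. The sketch below is not a proof; it assumes NumPy, uses hypothetical true values for the intercept, slope and error standard deviation, and deliberately draws non-normal (uniform) errors with zero mean. The helper name `ols_slope` is introduced here for illustration only.

```python
import numpy as np

def ols_slope(x, y):
    """OLS slope from a simple regression of y on x with an intercept."""
    xd = x - x.mean()                       # deviations from the sample mean
    return np.sum(xd * (y - y.mean())) / np.sum(xd ** 2)

rng = np.random.default_rng(1)              # fixed seed; simulated data only
beta0, beta1, sigma = 1.0, 2.0, 3.0         # hypothetical true values
reps = 500                                  # number of simulated samples per n
half_width = np.sqrt(3.0) * sigma           # uniform(-a, a) has variance a^2 / 3

spread = {}
for n in (25, 100, 400, 1600):
    slopes = np.empty(reps)
    for r in range(reps):
        x = rng.uniform(0.0, 10.0, size=n)
        eps = rng.uniform(-half_width, half_width, size=n)  # mean 0, sd sigma
        y = beta0 + beta1 * x + eps
        slopes[r] = ols_slope(x, y)
    spread[n] = slopes.std()
    print(f"n={n:5d}  mean slope={slopes.mean():.4f}  sd of slope={spread[n]:.4f}")
```

As n grows, the slope estimates stay centred on the true value while their spread shrinks, in line with the variance formula for the slope, whose denominator grows with the sample size.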

V. Discussion Questions: Q4.3, Q4.4, Q4.7

VI. Learning the Notations in Table 4.1

VII. Computing Exercises: Monte Carlo
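
One possible starting point for a Monte Carlo exercise is sketched below. It is not the assigned exercise; it assumes NumPy, a regressor held fixed across samples, normal errors, and hypothetical true parameter values. It compares simulated moments of the OLS estimates with the SLR formulas from Section III.

```python
import numpy as np

rng = np.random.default_rng(42)             # fixed seed; simulated data only
n, reps = 30, 20000
beta0, beta1, sigma = 1.0, 0.5, 2.0         # hypothetical true values

X = np.linspace(1.0, 10.0, n)               # regressor fixed in repeated samples
x = X - X.mean()                            # deviations from the mean
Sxx = np.sum(x ** 2)

# Draw all samples at once: each row of Y is one simulated sample.
eps = rng.normal(0.0, sigma, size=(reps, n))
Y = beta0 + beta1 * X + eps

# OLS estimates for every sample.
b1 = (Y - Y.mean(axis=1, keepdims=True)) @ x / Sxx
b0 = Y.mean(axis=1) - b1 * X.mean()
resid = Y - b0[:, None] - b1[:, None] * X
s2 = np.sum(resid ** 2, axis=1) / (n - 2)   # estimated error variance

# Theoretical values from the SLR formulas.
var_b1 = sigma ** 2 / Sxx
var_b0 = sigma ** 2 * np.sum(X ** 2) / (n * Sxx)
cov_b0_b1 = -X.mean() * var_b1

print("mean b0, b1:", b0.mean(), b1.mean())
print("var b1 (simulated, theory):", b1.var(), var_b1)
print("var b0 (simulated, theory):", b0.var(), var_b0)
print("cov (simulated, theory):", np.cov(b0, b1)[0, 1], cov_b0_b1)
print("mean of s2 vs sigma^2:", s2.mean(), sigma ** 2)
```

Changing the error distribution, the sample size, or making the errors depend on X lets you see which classical assumptions each result relies on.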