Chapter 5 The Multiple Regression Model

Consider an economic model where the dependent variable is a function of K - 1 explanatory variables. The economic model has the form:

\[ y = f(x_2, x_3, \ldots, x_K) \]

Approximate this by the linear regression equation:

\[ y_i = \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3} + \cdots + \beta_K x_{iK} + e_i \quad \text{for } i = 1, 2, \ldots, N \]

Note: The intercept parameter \(\beta_1\) is attached to the variable \(x_{i1} = 1\) for all \(i\) (the constant).

This is called a multiple regression model where:

\(y_i\) is the dependent variable
\(x_{i2}, x_{i3}, \ldots, x_{iK}\) are explanatory variables
\(\beta_1\) is the intercept coefficient
\(\beta_2, \ldots, \beta_K\) are slope coefficients
\(e_i\) is a random error that captures omitted variables, measurement errors, etc.

Econ 6 - Chapter 5
Interpretation of the slope coefficients: \(\beta_k\) (for \(k = 2, \ldots, K\)) measures the change in the mean value of \(y\) for a unit change in \(x_k\), holding all other variables constant.

The least squares principle, or the method of ordinary least squares (OLS), finds an estimation rule for \(\beta_1, \beta_2, \ldots, \beta_K\) to minimize the sum of squared errors:

\[ S = \sum_{i=1}^{N} (y_i - \beta_1 - \beta_2 x_{i2} - \cdots - \beta_K x_{iK})^2 = \sum_{i=1}^{N} e_i^2 \]

The solution gives the least squares (OLS) estimators \(b_1, b_2, \ldots, b_K\).

The predicted or fitted values are:

\[ \hat{y}_i = b_1 + b_2 x_{i2} + \cdots + b_K x_{iK} \quad \text{for } i = 1, \ldots, N \]

The residuals are:

\[ \hat{e}_i = y_i - \hat{y}_i = y_i - b_1 - b_2 x_{i2} - \cdots - b_K x_{iK} \quad \text{for } i = 1, \ldots, N \]
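As a rough illustration of the least squares principle, the sketch below minimizes the sum of squared errors by solving the normal equations \((X'X)b = X'y\), which is one standard way to obtain the OLS estimates. The data values and the helper names `solve_linear_system` and `ols` are made up for this illustration; they are not part of the course material, and in practice econometric software does this calculation.

```python
def solve_linear_system(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular (perfect collinearity?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(y, xs):
    """OLS estimates, fitted values and residuals.

    xs is a list of explanatory-variable columns; the constant
    x_i1 = 1 is added here, so b[0] is the intercept estimate.
    """
    rows = [[1.0] + [col[i] for col in xs] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    b = solve_linear_system(xtx, xty)
    fitted = [sum(bj * rj for bj, rj in zip(b, r)) for r in rows]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return b, fitted, resid


# Hypothetical data: N = 6 observations, two explanatory variables.
y = [4.1, 5.9, 8.2, 9.8, 12.1, 13.9]
x2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

b, fitted, resid = ols(y, [x2, x3])
print("b1, b2, b3 =", [round(v, 4) for v in b])
print("sum of residuals =", round(sum(resid), 10))
```

With an intercept in the equation, the residuals sum to zero (up to rounding) and are uncorrelated with each explanatory variable, which is a quick check that the estimates solve the minimization problem.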
To establish the statistical properties of the least squares estimators, a set of standard assumptions is introduced as follows.

(1) The true model is:

\[ y_i = \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3} + \cdots + \beta_K x_{iK} + e_i \]

This says no important variables are excluded and the correct functional form is used.

(2) \(E(e_i) = 0\) for all \(i\)

(3) \(\mathrm{var}(e_i) = E(e_i^2) = \sigma^2\) for all \(i\)
Homoskedastic errors (equal error variance for all observations).

(4) \(\mathrm{cov}(e_i, e_j) = E(e_i e_j) = 0\) for all \(i \neq j\)
Uncorrelated errors.

(5a) The explanatory variables are treated as non-random.

(5b) No explanatory variable can be formed as a linear combination of the remaining \(x\) variables.
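What does (5b) mean? This can be shown with an example.

Suppose \(x_{i2} = -4x_{i3}\) holds for all \(i\). That is, \(x_{i2} + 4x_{i3} = 0\).

The result for the linear regression equation can be seen as follows:

\[
\begin{aligned}
y_i &= \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3} + e_i \\
    &= \beta_1 - 4\beta_2 x_{i3} + \beta_3 x_{i3} + e_i \\
    &= \beta_1 + (\beta_3 - 4\beta_2) x_{i3} + e_i \\
    &= \beta_1 + \alpha x_{i3} + e_i
\end{aligned}
\]

where \(\alpha = \beta_3 - 4\beta_2\).

The parameter \(\alpha\) can be estimated. However, it is not possible to individually estimate \(\beta_2\) and \(\beta_3\). This is called perfect collinearity.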
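One way to see why estimation breaks down under perfect collinearity is that the matrix \(X'X\) used in the least squares calculation becomes singular, so the system of equations that defines the estimates has no unique solution. The sketch below checks this with made-up integer data; the helper names `gram_matrix` and `det3` and all data values are assumptions for illustration only.

```python
def gram_matrix(columns):
    """Return X'X for a list of equal-length data columns."""
    k = len(columns)
    return [[sum(a * b for a, b in zip(columns[i], columns[j]))
             for j in range(k)] for i in range(k)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Hypothetical data (integers, so the arithmetic is exact).
x3 = [1, 2, 3, 4, 5]
x2_collinear = [-4 * v for v in x3]   # x2 = -4 * x3 for every observation
x2_free = [2, 1, 4, 3, 6]             # not a linear combination of x3 and the constant
const = [1] * len(x3)                 # the constant regressor

det_collinear = det3(gram_matrix([const, x2_collinear, x3]))
det_free = det3(gram_matrix([const, x2_free, x3]))

print("det(X'X) with perfect collinearity:", det_collinear)
print("det(X'X) without it:", det_free)
```

A zero determinant in the collinear case means the individual coefficients cannot be separated, while the non-zero determinant in the second case means a unique set of estimates exists.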
A special case of perfect collinearity is when an explanatory variable has the same numerical value for all observations. For example, \(x_{i2} = c\) for all \(i\), where \(c\) is some constant value. In this case, it is not possible to obtain individual estimates of both \(\beta_2\) and the intercept \(\beta_1\).

This suggests that all explanatory variables must take at least two different values (non-zero variance).
For the multiple regression equation, the least squares (OLS) estimators of the parameters are linear functions of \(y\).

With the standard assumptions, it can be shown that the estimators are unbiased estimation rules.

Expressions for the variances of the least squares (OLS) estimators can be obtained.

The important result of the Gauss-Markov Theorem is:

Given the standard assumptions, the least squares estimators have minimum variance in the class of all linear unbiased estimators.

That is, the least squares method is BLUE (Best Linear Unbiased Estimator).
Interval Estimation

The variances and covariances of the least squares estimators are denoted by:

\(\mathrm{var}(b_k)\) for \(k = 1, 2, \ldots, K\)
\(\mathrm{cov}(b_k, b_m)\) for \(k \neq m\)

The formulas include the error variance \(\sigma^2\). An unbiased estimator for the error variance is constructed from the least squares residuals \(\hat{e}_i\) as:

\[ \hat{\sigma}^2 = \frac{1}{N - K} \sum_{i=1}^{N} \hat{e}_i^2 = \frac{SSE}{N - K} \]

Note that the degrees of freedom (df) for the sum of squares is \(N - K\) (\(K\) is the number of estimated parameters in the multiple regression equation).

When the unknown error variance \(\sigma^2\) is replaced with \(\hat{\sigma}^2\), this gives estimators of the variances and covariances:

\(\widehat{\mathrm{var}}(b_k)\) and \(\widehat{\mathrm{cov}}(b_k, b_m)\)

The standard errors are defined as:

\[ se(b_k) = \sqrt{\widehat{\mathrm{var}}(b_k)} \quad \text{for } k = 1, 2, \ldots, K \]
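The error variance estimator and the standard error definition translate directly into a few lines of code. The sketch below uses made-up residuals and a hypothetical value of K; the function names `error_variance` and `standard_error` are illustrative only.

```python
import math


def error_variance(residuals, k):
    """Unbiased estimator of the error variance: SSE / (N - K)."""
    n = len(residuals)
    if n <= k:
        raise ValueError("need more observations than estimated parameters")
    sse = sum(e * e for e in residuals)  # sum of squared residuals
    return sse / (n - k)


def standard_error(estimated_variance):
    """Standard error is the square root of the estimated variance."""
    return math.sqrt(estimated_variance)


# Hypothetical residuals from a regression with K = 3 parameters.
residuals = [1.0, -1.0, 2.0, -2.0, 0.0]
sigma2_hat = error_variance(residuals, k=3)
print("estimated error variance:", sigma2_hat)
print("se for an estimated variance of 0.25:", standard_error(0.25))
```

Dividing by N - K rather than N is what makes the estimator unbiased: K degrees of freedom are used up in estimating the coefficients.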
Assume the errors are normally and independently distributed. The error assumptions can be stated as:

\[ e_i \sim N(0, \sigma^2) \quad \text{for all } i \]

It follows that the least squares estimators have a normal distribution with:

\[ b_k \sim N(\beta_k, \mathrm{var}(b_k)) \quad \text{for } k = 1, 2, \ldots, K \]

From statistical theory, the random variable:

\[ \frac{b_k - \beta_k}{se(b_k)} \sim t_{(N-K)} \]

has a t-distribution with \(N - K\) df.

A \(100(1 - \alpha)\%\) confidence interval estimator for \(\beta_k\) is constructed as:

\[ [\, b_k - t_c\, se(b_k),\; b_k + t_c\, se(b_k) \,] \]

where \(t_c\) is the critical value from the t-distribution with \(N - K\) degrees of freedom and upper tail area equal to \(\alpha/2\).

The lower and upper limits of the interval estimator can be expressed as:

\[ b_k \pm t_c\, se(b_k) \quad \text{for } k = 1, 2, \ldots, K \]
The above presentation works with random variables and estimation rules. The methods can be applied to a numeric data set. Calculation with computer software such as Stata gives numeric results for parameter estimates and interval estimates. Different samples will give different numeric estimation results.

Example: A cross-section data set has been compiled from a survey of fast food stores. The variables in the data set are monthly sales revenue, sales (in thousands of dollars), average product price, p (in dollars), and advertising expenditure, a (in thousands of dollars).

The number of observations in the data set is 75. The first 50 observations are used for the lecture notes examples.

An economic model is: sales = f(p, a)

A regression equation that assumes sales is linearly related to price and advertising is:

\[ sales_i = \beta_1 + \beta_2 p_i + \beta_3 a_i + e_i \quad \text{for } i = 1, 2, \ldots, 50 \]
The fitted regression equation with standard errors reported in parentheses is:

\[ \widehat{sales}_i = .6 - 7.6\,p_i + .6\,a_i \]
\[ \quad\;\; (8.) \quad (.4) \quad (0.89) \]

The economic interpretation of the slope coefficients can be discussed.

The negative coefficient on p suggests that, with advertising held constant, a $1 increase in product price will lead to an average decrease in sales revenue of 7.6 thousands of dollars. An equivalent statement is: a $1 reduction in the price of a fast food meal will increase total sales revenue for the fast food operation by about $7,60.

With price held at a fixed level, an additional $1 of advertising expenditure will lead to an increase in sales revenue of $.6. An equivalent statement is: sales revenue will increase by an estimated $,60 in response to a $1000 increase in advertising expenditure.
A 95% confidence interval estimate for the coefficient on the price variable is calculated as:

\[ -7.60 \pm 2.01(.475) = [\,-0.0,\; -4.\,] \]

The t-distribution critical value of \(t_c = 2.01\) was obtained with Microsoft Excel with the function:

T.INV.2T(0.05, 47)

where 0.05 is the two-tail \(\alpha\) and 47 is the degrees of freedom, \(N - K = 50 - 3 = 47\).

With advertising expenditure held at a fixed level, the information in the data set suggests that an increase in product price by $1 will lead to a decline in sales revenue in the range 4. to 0.0 thousands of dollars ($4,0 to $0,00).

General Note: A relatively wide interval estimate suggests a lack of precision in the point estimate for a coefficient.
To illustrate an application of alternative functional forms, for the fast food sales model, consider a regression equation with log-transformed variables. Estimation results with standard errors reported in parentheses are:

\[ \widehat{\ln(sales_i)} = 5. - 0.5\,\ln(p_i) + 0.05\,\ln(a_i) \]
\[ \quad\;\; (0.8) \quad (0.04) \quad (0.07) \]

The slope coefficients are interpreted as elasticities. The results show that a 1% increase in price is associated with a 0.5% decrease in total sales, assuming advertising expenditure is kept at the same level. An equivalent statement is: a price reduction of 10% will lead to an increase in total sales for the fast food operation of about 5.%.
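The elasticity reading of a log-log coefficient is a first-order approximation: the percentage change in sales is roughly the elasticity times the percentage change in price. For larger changes, the log-log form implies the ratio relationship \(sales_{new}/sales_{old} = (p_{new}/p_{old})^{\beta_2}\) with advertising held fixed. The sketch below compares the two, using a hypothetical price elasticity of -0.5 that is chosen for illustration and is not the estimate from the lecture example.

```python
def approx_pct_change(elasticity, pct_change_x):
    """First-order approximation: elasticity times the % change in x."""
    return elasticity * pct_change_x


def exact_pct_change(elasticity, pct_change_x):
    """Change implied by the log-log form: y ratio = (x ratio) ** elasticity."""
    return ((1 + pct_change_x / 100) ** elasticity - 1) * 100


elasticity = -0.5      # hypothetical price elasticity of sales
price_change = -10.0   # a 10% price reduction

approx = approx_pct_change(elasticity, price_change)
exact = exact_pct_change(elasticity, price_change)
print("approximate %% change in sales: %.2f" % approx)
print("change implied by the log-log form: %.2f" % exact)
```

The two figures are close for small changes and drift apart as the percentage change grows, so the elasticity statement is best read as a local approximation.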
Hypothesis Testing for a Single Coefficient

The linear regression equation is:

\[ y_i = \beta_1 + \beta_2 x_{i2} + \beta_3 x_{i3} + \cdots + \beta_K x_{iK} + e_i \quad \text{for } i = 1, 2, \ldots, N \]

A data set must be collected for the dependent variable \(y\) and all the explanatory variables \(x_2, \ldots, x_K\).

Computer programs can then be used to apply least squares (OLS) estimation to get parameter estimates:

\(b_1, b_2, \ldots, b_K\)

and estimated standard errors:

\(se(b_1), se(b_2), \ldots, se(b_K)\)

For an explanatory variable of interest, say \(x_k\), does \(x_k\) have any influence on \(y\)?

To answer this question, test the null hypothesis \(H_0: \beta_k = 0\) against the alternative hypothesis \(H_1: \beta_k \neq 0\).
The test statistic of interest is the t-statistic:

\[ t = \frac{b_k}{se(b_k)} \quad \text{where } k = 1, 2, \ldots, K \]

A p-value for this two-tail test can be calculated as:

\[ p = P\left( |t_{(N-K)}| > |t| \right) \]

Note that the t-distribution degrees of freedom is \(N - K\).

The usual approach to a decision rule is to reject the null hypothesis in favour of the alternative if the p-value is smaller than some chosen significance level (say \(\alpha = 0.05\)).

Rejecting the null hypothesis implies that there is a statistically significant relationship between \(y\) and \(x_k\).

The standard least squares (OLS) estimation output from econometrics computer programs reports estimated coefficients and their estimated standard errors along with the t-statistic for a test of significance with a p-value for a two-tail test.
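To make the p-value calculation concrete, the sketch below computes the two-tail probability by numerically integrating the t-distribution density with Simpson's rule, using only the Python standard library. Statistical software uses more refined routines; this is only an illustrative approximation, and the coefficient estimate, standard error, and degrees of freedom in the example are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tail_p_value(t_stat, df, steps=2000):
    """P(|T| > |t_stat|), with the central area found by Simpson's rule."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * (h / 3) * total  # P(-a < T < a), using symmetry
    return 1 - central


# Hypothetical estimate, standard error and degrees of freedom.
b_k, se_b_k, df = 1.5, 0.6, 47
t_stat = b_k / se_b_k
p_value = two_tail_p_value(t_stat, df)
print("t = %.3f, two-tail p-value = %.4f" % (t_stat, p_value))
print("reject H0 at the 5% level:", p_value < 0.05)
```

Comparing the p-value with the chosen significance level gives the same decision as comparing |t| with the critical value for that level.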
Failure to reject \(H_0: \beta_k = 0\) can mean:

(1) The null hypothesis is true. That is, \(x_k\) has no influence on \(y\).

(2) The data are not sufficiently good to reject the null hypothesis even though it may be false.

If the situation is (1), then this suggests that it may be sensible to drop the variable \(x_k\) from the equation. This may improve the precision of the other coefficient estimates.

But if (2) describes the situation, then dropping \(x_k\) means that an important variable has been excluded from the equation. The result may be that the least squares method will give a biased estimation rule for the parameters, since assumption (1) of the standard assumptions will now be violated.

The conclusion from this is:

It is important to rely on the underlying economic theory as a guide to which variables to include. Do not exclude variables that have a role in the economic theory even though the estimation results may appear to show that they are not statistically significant. These results are still interesting to report and discuss.
Hypothesis Testing for a Linear Combination of Coefficients

Hypotheses that involve linear combinations of the coefficients can be tested.

Example: A data set has information for N companies. The Cobb-Douglas production function is:

\[ Q_i = \gamma L_i^{\beta_2} K_i^{\beta_3} \exp(e_i) \quad \text{for } i = 1, 2, \ldots, N \]

where \(Q_i\) is the output measure for company \(i\), \(L_i\) is labour input, \(K_i\) is capital stock, and \(e_i\) is a random error.

Log transformation gives the linear regression equation:

\[ \ln(Q_i) = \beta_1 + \beta_2 \ln(L_i) + \beta_3 \ln(K_i) + e_i \]

where \(\beta_1 = \ln(\gamma)\).

\(\beta_2\) is the elasticity of output with respect to the labour input, that is, it measures the percentage change in output for a one percent change in the labour input, holding the capital input constant.

\(\beta_3\) is the elasticity of output with respect to the capital input, holding the labour input constant.

Constant returns to scale implies the restriction:

\[ \beta_2 + \beta_3 = 1 \]
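To test the constant returns to scale hypothesis consider testing:

\(H_0: \beta_2 + \beta_3 = 1\) against \(H_1: \beta_2 + \beta_3 \neq 1\)

To proceed, a standard assumption is that the errors are independently and identically normally distributed. From statistical theory, the random variable:

\[ \frac{(b_2 + b_3) - (\beta_2 + \beta_3)}{se(b_2 + b_3)} \sim t_{(N-K)} \]

From the rules of variance:

\[ \widehat{\mathrm{var}}(b_2 + b_3) = \widehat{\mathrm{var}}(b_2) + \widehat{\mathrm{var}}(b_3) + 2\,\widehat{\mathrm{cov}}(b_2, b_3) \]

and

\[ se(b_2 + b_3) = \sqrt{\widehat{\mathrm{var}}(b_2 + b_3)} \]

When the constant returns to scale hypothesis is true, \(\beta_2 + \beta_3 = 1\).

From the estimation results a test statistic and p-value for a two-tail test are calculated as:

\[ t = \frac{b_2 + b_3 - 1}{\sqrt{\widehat{\mathrm{var}}(b_2) + \widehat{\mathrm{var}}(b_3) + 2\,\widehat{\mathrm{cov}}(b_2, b_3)}} \]

\[ p = P\left( |t_{(N-K)}| > |t| \right) \]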
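The test statistic for a sum of two coefficients can be sketched in code as follows. The function name `linear_combination_t` and the numeric inputs are hypothetical values chosen for illustration; they are not the estimates from the production function example.

```python
import math


def linear_combination_t(b2, b3, var_b2, var_b3, cov_b2_b3, hypothesized_sum=1.0):
    """t-statistic and standard error for H0: beta2 + beta3 = hypothesized_sum."""
    # var(b2 + b3) = var(b2) + var(b3) + 2 cov(b2, b3)
    var_sum = var_b2 + var_b3 + 2 * cov_b2_b3
    se_sum = math.sqrt(var_sum)
    t_stat = (b2 + b3 - hypothesized_sum) / se_sum
    return t_stat, se_sum


# Hypothetical estimates, variances and covariance.
t_stat, se_sum = linear_combination_t(
    b2=0.6, b3=0.5, var_b2=0.04, var_b3=0.09, cov_b2_b3=-0.04
)
print("se(b2 + b3) = %.4f" % se_sum)
print("t = %.4f" % t_stat)
```

A negative covariance between the two estimators, as in this made-up case, makes the standard error of the sum smaller than it would be if the estimators were uncorrelated.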
A section of Stata output for this example is shown below. The data set has N = 33 observations.

. * Create log-transformed variables
. generate LQ = log(q)
. generate LL = log(l)
. generate LK = log(k)
. regress LQ LL LK

------------------------------------------------------
          LQ |     Coef.   Std. Err.       t    P>|t|
-------------+----------------------------------------
          LL |   .558996     .86484     0.68    0.499
          LK |     .4877     .70877     0.69    0.494
       _cons |    -.8679      .5464    -0.4     0.85
------------------------------------------------------

. estat vce

Covariance matrix of coefficients of regress model

        e(V) |        LL          LK       _cons
-------------+------------------------------------
          LL |  .6665764
          LK | -.5664746     .495467
       _cons |  -.405686    .4779844     .986058

Using the estimated variances and covariances of the parameter estimators, the t-statistic for testing the null hypothesis \(\beta_2 + \beta_3 = 1\) is calculated as:

\[ t = \frac{(0.55899 + 0.4877) - 1}{\sqrt{0.66657 + 0.49544 + 2(-0.56644)}} = \frac{0.0467}{0.1707} = 0.2737 \]
The degrees of freedom is \(N - K = 33 - 3 = 30\).

With Microsoft Excel, the p-value for the test can be found by selecting Insert Function T.DIST.2T(0.2737, 30).

The answer for the p-value is 0.786. This high p-value gives no evidence to reject the null hypothesis. Therefore, the data are consistent with the claim that production is described by constant returns to scale.

For testing a linear combination of coefficients, a t-statistic and p-value can be computed automatically by using the Stata lincom command. The test of the constant returns to scale hypothesis calculated above is shown in the Stata results:

. lincom _b[LL] + _b[LK] - 1

 ( 1)  LL + LK = 1

------------------------------------------------------
          LQ |     Coef.   Std. Err.       t    P>|t|
-------------+----------------------------------------
         (1) |    .04677     .70685     0.7     0.786
------------------------------------------------------
Measuring Goodness of Fit

The \(R^2\) goodness-of-fit measure is calculated as:

\[ R^2 = 1 - \frac{\sum_{i=1}^{N} \hat{e}_i^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} = 1 - \frac{SSE}{SST} \]

The \(R^2\) gives the proportion of the variation in the dependent variable explained by all the explanatory variables in the model.

\(R^2\) measures serve as a guide only. A more interesting focus may be the economic interpretation of the parameter estimates and analysis of the economic theory that motivates the regression equation specification.
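The goodness-of-fit calculation can be sketched directly from the definition. The observed and fitted values below are made up for illustration, and the function name `r_squared` is hypothetical.

```python
def r_squared(y, fitted):
    """R^2 = 1 - SSE / SST for observed values y and fitted values."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # squared residuals
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation in y
    return 1 - sse / sst


# Hypothetical observed and fitted values.
y = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 1.5, 3.5, 3.5]
print("R^2 = %.3f" % r_squared(y, fitted))
```

The interpretation as a proportion of explained variation relies on the equation containing an intercept and being estimated by least squares, as in the model of this chapter.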