REGRESSION ANALYSIS II- MULTICOLLINEARITY


Question 1

Departments A and B of the Open University of Cyprus consist of $n_A = 35$ and $n_B = 30$ students respectively. The students of department A achieved an average test score of $\hat{\mu}_A = 7.5$, while the students of department B achieved an average test score of $\hat{\mu}_B = 6$.

(a) If the standard deviation of department A is known and equals 2.5, test the null hypothesis that the average test score of the students of A equals 8.5 against the alternative hypothesis that it is less than 8.5.

(b) If the standard deviation of department B is unknown and its estimate equals 1.5, test the null hypothesis that the average test score of the students of B equals 5 against the alternative hypothesis that it is greater than 5.

Question 1 (a): the standard deviation is known

We state the null hypothesis $H_0$ and the alternative hypothesis $H_1$:

$H_0: \mu_A = 8.5 \qquad H_1: \mu_A < 8.5$

We choose the suitable test statistic:

$Z = \frac{\hat{\mu}_A - 8.5}{\sigma / \sqrt{n_A}} \sim N(0, 1)$

and calculate its value:

$Z^* = \frac{7.5 - 8.5}{2.5 / \sqrt{35}} = \frac{-1}{0.423} = -2.366$

We determine the acceptance region $C_0$ and the rejection region $C_1$:

$C_0 = \{Z : Z \ge c\} \qquad C_1 = \{Z : Z < c\}$

Question 1 (a): decision

We choose the significance level $\alpha$, which determines the probability of committing a Type I error: $P(Z < c;\ H_0 \text{ is valid}) = \alpha$. From this relationship we find the critical value $c$ in the standard normal distribution table:

Level of significance | Critical value
1% | -2.33
5% | -1.645
10% | -1.28

Decision: we have found that $Z^* = -2.366$.

At the 1% level of significance the critical value is $c = -2.33$. We have $Z^* < c$, so we reject $H_0$.
At the 5% level of significance the critical value is $c = -1.645$. We have $Z^* < c$, so we reject $H_0$.
At the 10% level of significance the critical value is $c = -1.28$. We have $Z^* < c$, so we reject $H_0$.
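The left-tailed test above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `left_tailed_z_test` is illustrative and not part of the original slides, and the critical values come from the exact normal quantile rather than the rounded table values.

```python
import math
from statistics import NormalDist


def left_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Test H0: mu = mu0 against H1: mu < mu0 when sigma is known.

    Returns the z statistic, the lower-tail critical value and the decision.
    """
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    c = NormalDist().inv_cdf(alpha)  # lower-tail quantile, a negative number
    return z, c, z < c


# Department A: mean 7.5, hypothesized mean 8.5, sigma 2.5, n 35.
for alpha in (0.01, 0.05, 0.10):
    z, c, reject = left_tailed_z_test(7.5, 8.5, 2.5, 35, alpha)
    print(f"alpha={alpha:.2f}  z={z:.3f}  c={c:.3f}  reject H0: {reject}")
```

The same function can be reused for any one-sided test with known variance by changing the arguments.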

Question 1 (b): the standard deviation σ is unknown

We state the null hypothesis $H_0$ and the alternative hypothesis $H_1$:

$H_0: \mu_B = 5 \qquad H_1: \mu_B > 5$

The suitable test statistic is the t-statistic:

$t = \frac{\hat{\mu}_B - 5}{s_B / \sqrt{n_B}} \sim St(n_B - 1)$

We calculate its value, dividing by the standard error $s_B / \sqrt{n_B} = 1.5 / \sqrt{30} = 0.274$:

$t^* = \frac{6 - 5}{1.5 / \sqrt{30}} = \frac{1}{0.274} = 3.651$

We determine the acceptance region $C_0$ and the rejection region $C_1$:

$C_0 = \{t : t \le c\} \qquad C_1 = \{t : t > c\}$

Question 1 (b): decision

We select the significance level $\alpha$, which is the probability of committing a Type I error: $P(t > c;\ H_0 \text{ is valid}) = \alpha$. From this equation we find the critical value $c$ in the Student's t distribution table with $30 - 1 = 29$ degrees of freedom:

Level of significance | Critical value
1% | 2.462
5% | 1.699
10% | 1.311

Decision: we have found that $t^* = 3.651$.

At the 5% level of significance the critical value is $c = 1.699$. We have $t^* > c$, so we reject $H_0$.
At the 1% level of significance the critical value is $c = 2.462$. We have $t^* > c$, so we reject $H_0$.
At the 10% level of significance the critical value is $c = 1.311$. We have $t^* > c$, so we reject $H_0$.
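As a check on the arithmetic, the t-statistic can be recomputed directly from the stated values (mean 6, hypothesized mean 5, $s_B = 1.5$, $n_B = 30$). This is a minimal sketch; the helper name `one_sample_t` is illustrative, and the critical values are simply copied from the t table for 29 degrees of freedom quoted in the slides.

```python
import math


def one_sample_t(sample_mean, mu0, s, n):
    """t statistic for H0: mu = mu0 with an estimated standard deviation s."""
    standard_error = s / math.sqrt(n)  # s divided by sqrt(n), not multiplied
    return (sample_mean - mu0) / standard_error


# Upper-tail critical values for 29 degrees of freedom (from the t table).
T_CRIT_29 = {0.01: 2.462, 0.05: 1.699, 0.10: 1.311}

t_stat = one_sample_t(6.0, 5.0, 1.5, 30)
decisions = {alpha: t_stat > c for alpha, c in T_CRIT_29.items()}

print(f"t = {t_stat:.3f}")
for alpha, reject in decisions.items():
    print(f"alpha={alpha:.2f}  reject H0: {reject}")
```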

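Question 2

An economist evaluates the relationship between five economic variables. He wants to estimate the multiple regression

$y_t = \beta_1 + \beta_2 x_t + \beta_3 z_t + \beta_4 r_t + \beta_5 m_t + u_t, \qquad t = 1, 2, \dots, 105.$

The estimation output, with standard errors in parentheses, is:

$\hat{y}_t = 0.172 + 0.264\, x_t + 0.623\, z_t - 0.195\, r_t + 0.222\, m_t$
standard errors: (2.604) (0.205) (0.190) (0.097) (0.131)
$R^2 = 0.219, \quad s = 22.740, \quad RSS = 51196.28$

(a) Examine the following hypotheses at the 5% significance level:

(i) $H_0: \beta_2 = 0$
(ii) $H_0: \beta_4 = 0$ against $H_1: \beta_4 < 0$
(iii) $H_0: \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$ against $H_1$: at least one of $\beta_2, \beta_3, \beta_4, \beta_5 \ne 0$
(iv) $H_0: \beta_1 = \beta_2 = \beta_4 = 0$ against $H_1$: at least one of $\beta_1, \beta_2, \beta_4 \ne 0$

We are given the following two models (standard errors in parentheses):

(1) $\hat{y}_t = 0.184\, z_t + 0.250\, m_t$, standard errors (0.098) (0.122), $R^2 = 0.189$, $s = 22.957$, $RSS = 53232.45$

(2) $\hat{y}_t = 0.163\, z_t + 0.210\, x_t$, standard errors (0.075) (0.114), $R^2 = 0.116$, $s = 25.834$, $RSS = 52144.12$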

(i) We state the null hypothesis $H_0: \beta_2 = 0$.

The suitable test statistic is the t-statistic:

$t = \frac{\hat{\beta}_2 - \beta_2}{SE(\hat{\beta}_2)} \sim St(n - k)$

Here $n$ is the number of observations, so $n = 105$, while $k$ is the number of parameters in the regression ($k = 5$: $\beta_1, \beta_2, \beta_3, \beta_4, \beta_5$). $SE(\hat{\beta}_2)$ is the standard error of the estimate $\hat{\beta}_2$.

The value of the t-statistic is:

$t^* = \frac{0.264}{0.205} = 1.290$

We calculate the p-value, which measures the degree of support for $H_0$:

$p\text{-value} = P(|t| > |t^*|;\ H_0 \text{ is valid}) = 2\, P(t > 1.290;\ H_0 \text{ is valid}) = 2 \times 0.10 = 0.20$

The p-value of 0.20 is large (larger than 10%), so the data do not contradict the null hypothesis $H_0: \beta_2 = 0$. Thus the coefficient $\beta_2$ is statistically insignificant (not statistically different from zero).

(ii) We state the null hypothesis $H_0$ and the alternative hypothesis $H_1$:

$H_0: \beta_4 = 0 \qquad H_1: \beta_4 < 0$

The suitable test statistic is the t-statistic:

$t = \frac{\hat{\beta}_4 - \beta_4}{SE(\hat{\beta}_4)} \sim St(n - k)$

where $SE(\hat{\beta}_4)$ is the standard error of $\hat{\beta}_4$ and $k$ is the number of parameters of the regression. The test statistic is calculated as

$t^* = \frac{-0.195}{0.097} = -2.01$

We determine the acceptance region $C_0$ and the rejection region $C_1$:

$C_0 = \{t : t \ge c\} \qquad C_1 = \{t : t < c\}$

We select the significance level $\alpha$, which represents the probability of committing a Type I error:

$P(t < c;\ H_0 \text{ is valid}) = \alpha \;\Rightarrow\; 1 - P(t \ge c;\ H_0 \text{ is valid}) = \alpha \;\Rightarrow\; P(t \ge c;\ H_0 \text{ is valid}) = 1 - \alpha$

Based on the last equation we read the critical value $c$ from the t distribution table with $n - k = 105 - 5 = 100$ degrees of freedom:

Significance level | Critical value
5% | -1.660

Decision: we accept $H_0$ when the value of the t-statistic, $t^*$, falls into the acceptance region $C_0$, and we reject $H_0$ and accept $H_1$ when $t^*$ falls into the rejection region $C_1$. At the 5% significance level the critical value is $c = -1.660$. We find that $t^* = -2.01 < c$, so we reject $H_0$ and accept $H_1$.
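Both t-tests reduce to dividing an estimate by its standard error and comparing with a table value. A minimal sketch follows, assuming the coefficient on $r_t$ is negative (as the statistic of -2.01 implies); the two-sided critical value 1.984 for 100 degrees of freedom is a standard table value that the slides do not quote, and all names are illustrative.

```python
def t_ratio(estimate, std_error, hypothesized=0.0):
    """t statistic for H0: beta = hypothesized."""
    return (estimate - hypothesized) / std_error


# (estimate, standard error) pairs taken from the estimation output.
beta2 = (0.264, 0.205)   # coefficient on x_t
beta4 = (-0.195, 0.097)  # coefficient on r_t, assumed negative

t_x = t_ratio(*beta2)
t_r = t_ratio(*beta4)

# Two-sided test for beta2 at 5%: reject if |t| exceeds t_{0.025, 100}.
reject_beta2 = abs(t_x) > 1.984
# Left-tailed test for beta4 at 5%: reject if t is below -t_{0.05, 100}.
reject_beta4 = t_r < -1.660

print(f"t(beta2) = {t_x:.3f}, reject H0: {reject_beta2}")
print(f"t(beta4) = {t_r:.3f}, reject H0: {reject_beta4}")
```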

(iii) We want to examine whether all coefficients, except the constant, are simultaneously equal to zero. We state the null hypothesis $H_0$ and the alternative hypothesis $H_1$:

$H_0: \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$
$H_1$: at least one of $\beta_2, \beta_3, \beta_4, \beta_5 \ne 0$

The suitable test statistic is the F-statistic:

$F = \frac{R^2 / (k - 1)}{(1 - R^2) / (n - k)} \sim F(k - 1,\ n - k)$

The value of the F-statistic is calculated as

$F^* = \frac{0.219 \times (105 - 5)}{(1 - 0.219) \times (5 - 1)} = \frac{21.9}{3.124} = 7.01$

We determine the acceptance region $C_0$ and the rejection region $C_1$:

$C_0 = \{F : F \le c\} \qquad C_1 = \{F : F > c\}$

We select the significance level $\alpha$, which represents the probability of committing a Type I error: $P(F > c;\ H_0 \text{ is valid}) = \alpha$. From this equation we find the critical value $c$ in the F distribution table, $F(k - 1,\ n - k) = F(5 - 1,\ 105 - 5) = F(4, 100)$. (We used $F(4, 120)$ because $F(4, 100)$ does not exist in the tables.)

Level of significance | Critical value
5% | 2.4472

Decision: we accept $H_0$ when the value of the F-statistic, $F^*$, falls into the acceptance region $C_0$, and we reject $H_0$ and accept $H_1$ when $F^*$ falls into the rejection region $C_1$. At the 5% significance level the critical value is $c = 2.4472$. We find that $F^* = 7.01 > c$, so we reject $H_0$ (and accept $H_1$).
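The overall significance test depends only on $R^2$, $n$ and $k$, so it is easy to script. A minimal sketch, with the illustrative function name `overall_f` and the critical value 2.4472 copied from the F table used in the slides:

```python
def overall_f(r2, n, k):
    """F statistic for H0: all slope coefficients are zero.

    r2 is the R-squared of the regression, n the number of observations,
    k the number of estimated parameters including the constant.
    """
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


f_q2 = overall_f(0.219, n=105, k=5)
critical_value = 2.4472  # tabulated F(4, 120) at the 5% level

print(f"F = {f_q2:.2f}, reject H0: {f_q2 > critical_value}")
```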

(iv) We want to examine whether the coefficients $\beta_1$, $\beta_2$ and $\beta_4$ are simultaneously equal to zero. Imposing these restrictions simultaneously gives the following restricted version of our basic regression model:

$y_t = \beta_3 z_t + \beta_5 m_t + u_t$

Thus, between models (1) and (2), we choose model (1) because it corresponds to the restricted specification. Estimation of the restricted model yields:

$\hat{y}_t = 0.184\, z_t + 0.250\, m_t$, standard errors (0.098) (0.122), $R^2 = 0.189$, $s = 22.957$, $RSS = 53232.45$

Since this coefficient of determination has been calculated for the model under the restrictions, we have $R_R^2 = 0.189$.

We state the null hypothesis $H_0$ and the alternative hypothesis $H_1$:

$H_0: \beta_1 = \beta_2 = \beta_4 = 0$
$H_1$: at least one of $\beta_1, \beta_2, \beta_4 \ne 0$

The suitable test statistic is the F-statistic:

$F = \frac{(R_U^2 - R_R^2) / m}{(1 - R_U^2) / (n - k)} \sim F(m,\ n - k)$

where $R_U^2$ and $R_R^2$ are the coefficients of determination of the unrestricted and restricted models and $m$ is the number of restrictions. The value of the F-statistic is calculated as:

$F^* = \frac{(0.219 - 0.189) / 3}{(1 - 0.219) / (105 - 5)} = 1.28$

We determine the acceptance region $C_0$ and the rejection region $C_1$:

$C_0 = \{F : F \le c\} \qquad C_1 = \{F : F > c\}$

We choose the level of significance $\alpha$, which represents the probability of committing a Type I error: $P(F > c;\ H_0 \text{ is valid}) = \alpha$. Based on this equation we find the critical value $c$ in the F distribution table, $F(m,\ n - k) = F(3,\ 105 - 5)$.

Level of significance | Critical value
5% | 2.6802

Decision: we accept $H_0$ when the value of the F-statistic, $F^*$, falls into the acceptance region $C_0$, and we reject $H_0$ and accept the alternative hypothesis $H_1$ when $F^*$ falls into the rejection region $C_1$. At the 5% significance level the critical value is $c = 2.6802$. We find that $F^* = 1.28 < c$, so we accept $H_0$ (and reject $H_1$).
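The restriction test compares the fit of the unrestricted and restricted models. A minimal sketch of the $R^2$ form of the statistic, using the values from this question; the function name `restricted_f` is illustrative.

```python
def restricted_f(r2_unrestricted, r2_restricted, m, n, k):
    """F statistic for m linear restrictions, R-squared form.

    n is the number of observations and k the number of parameters
    in the unrestricted model.
    """
    numerator = (r2_unrestricted - r2_restricted) / m
    denominator = (1.0 - r2_unrestricted) / (n - k)
    return numerator / denominator


f_stat = restricted_f(0.219, 0.189, m=3, n=105, k=5)
critical_value = 2.6802  # 5% table value used in the slides

print(f"F = {f_stat:.2f}, reject H0: {f_stat > critical_value}")
```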

Question 3

A random sample of 1,562 persons was asked to respond on a scale from one (strongly disagree) to seven (strongly agree) to the question: "Will the new government economic policy lower unemployment?" The sample mean response was 4.27 and the population standard deviation was 1.32. Test whether the mean response is equal to 4 against the alternative that it is different from 4. Perform the hypothesis test at the 5% level.

We state the null hypothesis $H_0$ and the alternative hypothesis $H_1$:

$H_0: \mu = 4 \qquad H_1: \mu \ne 4$

The suitable test statistic is the z-statistic, because we have a large sample size:

$Z = \frac{\hat{\mu} - 4}{\sigma / \sqrt{n}} \sim N(0, 1)$

We calculate its value:

$Z^* = \frac{4.27 - 4}{1.32 / \sqrt{1562}} = \frac{0.27}{0.0334} = 8.08$

We determine the acceptance region $C_0$ and the rejection region $C_1$:

$C_0 = \{Z : |Z| < c\} \qquad C_1 = \{Z : |Z| \ge c\}$

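We choose the significance level $\alpha$, which determines the probability of committing a Type I error:

$P(|Z| \ge c;\ H_0 \text{ is valid}) = \alpha \;\Rightarrow\; 2\, P(Z \ge c;\ H_0 \text{ is valid}) = \alpha \;\Rightarrow\; P(Z \ge c;\ H_0 \text{ is valid}) = \alpha / 2 \;\Rightarrow\; c = z_{\alpha/2}$

Based on the last relationship, we find the critical value $c$ in the standard normal distribution table:

Level of significance | Critical value
5% | 1.96

Decision: with $\hat{\mu} = 4.27$, $\sigma = 1.32$ and $n = 1562$, the statistic is $Z^* = 8.08$. At the 5% level of significance the critical value is $c = 1.96$. We have $|Z^*| > c$, so we reject $H_0$.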
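The two-sided test can be reproduced with the standard library. This is a minimal sketch; the function name `two_sided_z_test` is illustrative, and the p-value is an addition that the slides do not compute.

```python
import math
from statistics import NormalDist


def two_sided_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Test H0: mu = mu0 against H1: mu != mu0 with known sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    c = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, c, p_value, abs(z) >= c


z, c, p_value, reject = two_sided_z_test(4.27, 4.0, 1.32, 1562)
print(f"z = {z:.2f}, critical value = {c:.2f}, reject H0: {reject}")
```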

Question 4

Much research in applied economics focuses on the pricing of goods and services. One common approach involves building a model in which the price of a good depends on specific characteristics of that good. A real estate agent in Canada is interested in building a pricing model for house prices. One approach is to estimate a multiple regression model in which the sale price of the house in Canadian dollars is the dependent variable Y, while various determinants of house prices are used as independent variables. The factors which affect house prices are the following:

X1 = the lot size of the property (in square feet)
X2 = the number of bedrooms
X3 = the number of bathrooms
X4 = the number of storeys (excluding the basement)
X5 = basement (whether the house has a basement)
X6 = air conditioning system (whether the house includes an air conditioner)
X7 = garage (number of rooms used for storage of vehicles)

Data on the housing market of Windsor, Canada

[Data table with columns: sale price, lot size, bedroom, bath, storeys, basement, air cond, garage.]

Data taken from Gary Koop's book Analysis of Economic Data.

Question 4

Fit the regression model:

$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_4 x_{4i} + \beta_5 x_{5i} + \beta_6 x_{6i} + \beta_7 x_{7i} + u_i, \qquad i = 1, 2, \dots, 39$

- Write the fitted regression equation.
- If we consider comparable houses, how much would an extra bathroom add to the value of the house?
- Examine whether the coefficient $\beta_3$ is statistically significant at the 5% level.
- Test whether the coefficients of all the determinants of house prices are simultaneously equal to zero against the alternative hypothesis that at least one of them is different from zero (at the 5% level).
- Test whether the coefficients of the variables X2, X6 and X7 are simultaneously equal to zero against the alternative hypothesis that at least one of them is different from zero (at the 5% level).

Question 4: write the fitted regression equation

We estimate the regression model by using the Excel function Regression. The estimation output is given below:

[Excel SUMMARY OUTPUT for the full model: regression statistics (Multiple R, R Square, Adjusted R Square, Standard Error, Observations = 39), the ANOVA table, and the coefficient table (Coefficients, Standard Error, t Stat, P-value, 95% bounds) for the intercept and X Variables 1 to 7.]

The fitted regression equation has the form $\hat{y}_i = \hat{b}_0 + \hat{b}_1 x_{1i} + \dots + \hat{b}_7 x_{7i}$, with the estimates taken from the Coefficients column of the output.

Question 4: if we consider comparable houses, how much would an extra bathroom add to the value of the house?

Houses with an extra bathroom will be worth $\hat{b}_3 = 14{,}787.69$ Canadian dollars more than those without an extra bathroom, if we consider houses with the same lot size, number of bedrooms, storeys, basement, etc.

The coefficient estimate $\hat{b}_3$ of variable X3 measures how much Y will change when X3 changes by one unit, given that all the other explanatory variables remain the same.

In the case of simple regression we can say that β measures the influence of X on Y; in multiple regression we say that $\beta_j$ measures the influence of $X_j$ on Y, all other explanatory variables being equal.

Economic interpretation of the regression estimates

Some ways of verbally stating what the value of $\beta_1$ means:

- An extra square foot of lot size will tend to add another $6.15 onto the price of a house, ceteris paribus.
- If we consider houses with the same number of bedrooms, bathrooms, storeys, etc., an extra square foot of lot size will tend to add another $6.15 onto the price of the house.
- If we compare houses with the same number of bedrooms, bathrooms, storeys, etc., those with larger lots tend to be worth more. In particular, an extra square foot of lot size is associated with an increased price of $6.15.

We cannot simply say that houses with bigger lots are worth more, since this is not the case (e.g. some nice houses on small lots will be worth more than poor houses on large lots). However, we can say that if we consider houses that vary in lot size but are comparable in other respects, those with larger lots tend to be worth more.

Examine whether the coefficient $\beta_3$ is statistically significant at the 5% level.

The null hypothesis is $H_0: \beta_3 = 0$.

The suitable test statistic is the t-statistic:

$t = \frac{\hat{\beta}_3 - \beta_3}{SE(\hat{\beta}_3)} \sim St(n - k)$

Here $n$ is the number of observations, so $n = 39$, while $k$ is the number of parameters in the regression ($k = 8$: $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4, \beta_5, \beta_6, \beta_7$). $SE(\hat{\beta}_3)$ is the standard error of the estimate $\hat{\beta}_3$.

The value of the t-statistic is computed as:

$t^* = \frac{14787.69}{SE(\hat{\beta}_3)} = 2.169$

We calculate the p-value, which measures the degree of support for $H_0$:

$p\text{-value} = P(|t| > |t^*|;\ H_0 \text{ is valid}) = 2\, P(t > 2.169;\ H_0 \text{ is valid}) = 0.038$

The p-value of 0.038 is small (smaller than 5%), so the data do not support the null hypothesis $H_0: \beta_3 = 0$. Thus the coefficient $\beta_3$ is statistically significant (statistically different from zero).

Test whether the coefficients of all the determinants of house prices are simultaneously equal to zero against the alternative hypothesis that at least one of them is different from zero (at the 5% level).

We state the null hypothesis $H_0$ and the alternative hypothesis $H_1$:

$H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta_6 = \beta_7 = 0$
$H_1$: at least one of $\beta_1, \beta_2, \dots, \beta_7 \ne 0$

The suitable test statistic is the F-statistic:

$F = \frac{R^2 / (k - 1)}{(1 - R^2) / (n - k)} \sim F(k - 1,\ n - k)$

The value of the F-statistic is calculated as

$F^* = \frac{0.59 \times (39 - 8)}{(1 - 0.59) \times (8 - 1)} = 6.38$

We determine the acceptance region $C_0$ and the rejection region $C_1$:

$C_0 = \{F : F \le c\} \qquad C_1 = \{F : F > c\}$

We select the significance level $\alpha$, which represents the probability of committing a Type I error: $P(F > c;\ H_0 \text{ is valid}) = \alpha$. From this equation we find the critical value $c$ in the F distribution table, $F(k - 1,\ n - k) = F(8 - 1,\ 39 - 8) = F(7, 31)$. (We used $F(7, 30)$ because $F(7, 31)$ does not exist in the tables.)

Level of significance | Critical value
5% | 2.3343

Decision: we accept $H_0$ when the value of the F-statistic, $F^*$, falls into the acceptance region $C_0$, and we reject $H_0$ and accept $H_1$ when $F^*$ falls into the rejection region $C_1$. At the 5% significance level the critical value is $c = 2.3343$. We find that $F^* = 6.38 > c$, so we reject $H_0$ (and accept $H_1$).

Test whether the coefficients of the variables X2, X6 and X7 are simultaneously equal to zero against the alternative hypothesis that at least one of them is different from zero (at the 5% level).

Imposing these coefficient restrictions simultaneously gives the following restricted version of the regression model:

$y_i = \beta_0 + \beta_1 x_{1i} + \beta_3 x_{3i} + \beta_4 x_{4i} + \beta_5 x_{5i} + u_i, \qquad i = 1, 2, \dots, 39$

We estimate the new regression model and get the following results:

[Excel SUMMARY OUTPUT for the restricted model: regression statistics (Observations = 39), the ANOVA table, and the coefficient table for the intercept and the four remaining X variables.]

The coefficient of determination of this model is the restricted coefficient of determination, $R_R^2$.

We state the null hypothesis $H_0$ and the alternative hypothesis $H_1$:

$H_0: \beta_2 = \beta_6 = \beta_7 = 0$
$H_1$: at least one of $\beta_2, \beta_6, \beta_7 \ne 0$

The suitable test statistic is the F-statistic:

$F = \frac{(R_U^2 - R_R^2) / m}{(1 - R_U^2) / (n - k)} \sim F(m,\ n - k)$

The value of the F-statistic is calculated with $R_U^2 = 0.5904$, $m = 3$, $n = 39$ and $k = 8$:

$F^* = \frac{(0.5904 - R_R^2) / 3}{(1 - 0.5904) / (39 - 8)}$

We determine the acceptance region $C_0$ and the rejection region $C_1$:

$C_0 = \{F : F \le c\} \qquad C_1 = \{F : F > c\}$

We choose the level of significance $\alpha$, which represents the probability of committing a Type I error: $P(F > c;\ H_0 \text{ is valid}) = \alpha$. Based on this equation we find the critical value $c$ in the F distribution table, $F(m,\ n - k) = F(3,\ 39 - 8) = F(3, 31)$. (We used $F(3, 30)$ because $F(3, 31)$ does not exist in the tables.)

Level of significance | Critical value
5% | 2.9223

Decision: we accept $H_0$ when the value of the F-statistic, $F^*$, falls into the acceptance region $C_0$, and we reject $H_0$ and accept the alternative hypothesis $H_1$ when $F^*$ falls into the rejection region $C_1$. At the 5% significance level the critical value is $c = 2.9223$. We find that $F^* < c$, so we accept $H_0$ (and reject $H_1$).

Pitfalls of using multiple regression analysis

In multiple regression analysis we usually face two types of problems:

- the effect of including a variable that ought not to be included
- omitted variables bias

The effect of including a variable that ought not to be included

If we include explanatory variables that should not be present in the regression, then the estimated coefficients on the variables will not be accurate.

In the previous example, will adding an extra bedroom to the house raise its price by $14,787.69? Probably not! The reason is that there are many factors other than the number of bedrooms that potentially influence house prices (for example, bathrooms or lot size are more important determinants of house prices than bedrooms). Furthermore, these factors may be highly correlated (i.e. houses with more bathrooms tend to have more bedrooms). To investigate this possibility, let us examine the correlation matrix of all the variables in the model.

Correlation matrix of the variables

We calculate the correlation coefficient between each pair of variables, and then we present the results in a matrix. For example, if we have three variables, X, Y and Z, then there are three possible correlations (i.e. $\rho_{XY}$, $\rho_{XZ}$ and $\rho_{YZ}$). We put these correlations in a matrix:

  | X | Y | Z
X | 1 | |
Y | $\rho_{XY}$ | 1 |
Z | $\rho_{XZ}$ | $\rho_{YZ}$ | 1

You can use the Excel function Correlation in the Data Analysis Toolbox to compute the correlation matrix of the variables.
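Outside Excel, the same matrix can be built from the definition of the Pearson correlation coefficient. The sketch below uses only the standard library; the data values are made up for illustration and are not the Windsor housing data, and the function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(columns):
    """Return a nested dict: matrix[row][col] = correlation of row and col."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}


# Made-up illustrative data, one list per variable.
data = {
    "bedrooms": [2, 3, 3, 4, 4, 5],
    "bathrooms": [1, 1, 2, 2, 3, 3],
    "storeys": [1, 2, 1, 2, 2, 3],
}
corr = correlation_matrix(data)
for name, row in corr.items():
    print(name, [round(value, 3) for value in row.values()])
```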

Correlation matrix of the variables

In the house pricing regression model, the correlation matrix of all variables is the following:

[Lower-triangular correlation matrix of Y, X1, X2, X3, X4, X5, X6 and X7.]

Since all the elements of the correlation matrix are positive, each pair of variables is positively correlated.

The correlation between the number of bathrooms and the number of bedrooms is 0.335, indicating that houses with more bathrooms also tend to have more bedrooms. The correlation between the number of storeys and the number of bedrooms is 0.536, indicating that houses with more storeys also tend to have more bedrooms.

Since these factors are correlated with the bedroom factor, and the bedroom factor is found to be insignificant, we must exclude it from the regression model.

Multicollinearity

When the explanatory variables are very highly correlated with each other (correlation coefficients either very close to 1 or to -1), the problem of multicollinearity occurs.

Perfect multicollinearity: under perfect multicollinearity, the OLS estimators simply do not exist.

Imperfect multicollinearity (or near multicollinearity) exists when the explanatory variables in an equation are correlated, but this correlation is less than perfect. In cases of imperfect multicollinearity the OLS estimators can be obtained. However, the OLS variances are often larger than those obtained in the absence of multicollinearity.

Detecting multicollinearity: auxiliary regressions

We can determine the relationship between any one regressor and the other regressors by using each regressor in turn as the dependent variable, obtaining the $R^2$ value of each such regression, and using a test to assess the relationship between that regressor and the set of other explanatory variables.

For example, we can run the following auxiliary regression for the variable $x_2$:

$x_{2i} = \alpha_0 + \alpha_1 x_{1i} + \alpha_2 x_{3i} + \alpha_3 x_{4i} + \alpha_4 x_{5i} + \alpha_5 x_{6i} + \alpha_6 x_{7i} + u_i, \qquad i = 1, 2, \dots, 39$

We set up an F test to determine whether there is a high level of multicollinearity. We test the null hypothesis

$H_0: \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = \alpha_6 = 0$

against the alternative hypothesis that at least one of these coefficients is different from zero.

Detecting multicollinearity: auxiliary regressions

The test statistic is an F statistic, which follows an F distribution with $k - 2$ and $n - k + 1$ degrees of freedom, where $k$ is the number of explanatory variables in the original model, including the intercept.

If F is significant, it is taken to mean that the particular X is collinear with the other X's; if the F value is not significant, the X (as dependent variable) is not considered to be collinear with the other explanatory variables.

If F is significant, you may wish to exclude that variable from the model, since the part of the dependent variable that it explains is already being explained by the other explanatory variables. You will have to decide whether it is wise to use a more parsimonious model or not.
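The auxiliary F statistic can be computed from the auxiliary $R^2$ alone. A minimal sketch, assuming the usual form of the statistic with the degrees of freedom stated above; the function name `auxiliary_f` is illustrative, and the input $R^2 = 0.18$ is used purely as an example value.

```python
def auxiliary_f(r2_aux, n, k):
    """F statistic of an auxiliary regression of one regressor on the others.

    n is the number of observations and k the number of explanatory
    variables in the original model, including the intercept. The statistic
    has k - 2 and n - k + 1 degrees of freedom.
    """
    numerator = r2_aux / (k - 2)
    denominator = (1.0 - r2_aux) / (n - k + 1)
    return numerator / denominator


# Housing example dimensions: n = 39 observations, k = 8 parameters.
f_aux = auxiliary_f(0.18, n=39, k=8)
print(f"F = {f_aux:.3f} with {8 - 2} and {39 - 8 + 1} degrees of freedom")
```

A larger auxiliary $R^2$ produces a larger F value, so the regressor is more likely to be flagged as collinear with the others.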

Detecting multicollinearity: auxiliary regressions

Back to our example. We run an auxiliary regression for the variable $x_1$: we use $x_1$ as the dependent variable and the other regressors as independent variables. The estimation output of the auxiliary regression is presented below:

[Estimation output of the auxiliary regression for $x_1$.]

Detecting multicollinearity: auxiliary regressions

The F statistic is not significant (its p-value is much larger than 0.05), therefore the variable $x_1$ is not considered to be collinear with the other explanatory variables.


Detecting multicollinearity: auxiliary regressions

We estimated six additional auxiliary regressions, using each of the remaining variables in turn as the dependent variable. The F statistic is found to be significant for the variables bedroom, bathroom, storeys and garage, therefore these variables are collinear with the other explanatory variables. We may wish to exclude these variables from the model.

Detecting multicollinearity: Klein's rule of thumb

Klein's rule of thumb suggests that multicollinearity may be a problem only if the $R^2$ obtained from an auxiliary regression is greater than the overall $R^2$ (of the regression with y as the dependent variable).

In this example the overall $R^2$ is equal to 0.59, while the $R^2$ values obtained from the auxiliary regressions start at 0.18 and all lie below 0.59. Therefore, according to Klein's rule of thumb, there is no problem of multicollinearity, since the auxiliary $R^2$ values are smaller than the overall $R^2$.

Detecting multicollinearity: eigenvalues and condition index

We can use the eigenvalues and the condition index to estimate the level of collinearity in the explanatory variables.

Most multivariate statistical approaches involve decomposing a correlation matrix into linear combinations of variables. The linear combinations are chosen so that the first combination has the largest possible variance (subject to some restrictions we won't discuss), the second combination has the next largest variance, subject to being uncorrelated with the first, the third has the largest possible variance, subject to being uncorrelated with the first and second, and so forth. The variance of each of these linear combinations is called an eigenvalue.

Detecting multicollinearity: eigenvalues and condition index

In the collinearity diagnostics table, "Number" stands for a linear combination of the X variables and "Eigenval(ue)" stands for the variance of that combination. The condition index is a simple function of the eigenvalues, namely

$CI_i = \sqrt{\lambda_{\max} / \lambda_i}$

where λ is the symbol for an eigenvalue.
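Given a list of eigenvalues, the condition indices follow directly from the definition $CI_i = \sqrt{\lambda_{\max} / \lambda_i}$. The sketch below uses hypothetical eigenvalues chosen only for illustration (they are not the values from the housing example), and the function name is an assumption.

```python
import math


def condition_indices(eigenvalues):
    """Condition index of each eigenvalue: sqrt(largest eigenvalue / eigenvalue)."""
    lam_max = max(eigenvalues)
    return [math.sqrt(lam_max / lam) for lam in eigenvalues]


# Hypothetical eigenvalues, sorted from largest to smallest.
eigs = [3.6, 0.3, 0.079, 0.021]
ci = condition_indices(eigs)

for lam, index in zip(eigs, ci):
    flag = "check variance proportions" if index >= 10 else ""
    print(f"eigenvalue={lam:<6} condition index={index:6.2f} {flag}")
```

The largest eigenvalue always has a condition index of 1, and small eigenvalues produce large indices.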

Detecting multicollinearity: eigenvalues and condition index

The fourth part of the table is the variance proportions. This is the regression coefficient variance-decomposition matrix, which shows the proportion of variance of each regression coefficient (and its associated variable) attributable to each condition index.

Detecting multicollinearity: eigenvalues and condition index

To use the table, you first look at the variance proportions. For X1, for example, most of the variance (about 75 percent) is associated with Number 3, which has an eigenvalue of 0.079 and a correspondingly large condition index. Most of the rest of X1 is associated with Number 4. Variable X2 is associated with three different numbers (2, 3 and 4), and X3 is mostly associated with Number 2.

Look for variance proportions of about 0.50 and larger. Collinearity is spotted by finding two or more variables that have large proportions of variance (0.50 or more) corresponding to large condition indices (between 10 and 30). There is no evident problem with collinearity in the above example.

Detecting multicollinearity: eigenvalues and condition index

Gretl: click on Analysis in the estimated model window and then select Collinearity.

A threshold of 10 for the condition index selects three condition indexes, the smallest being 10.446 and the largest 18.667.

- The first condition index is associated with only one large variance proportion (the lotsize variable); thus no collinearity is shown for this index.
- The second condition index is associated with two large variance proportions (those of bedroom, 0.38, and bath); thus there is collinearity between these two variables.
- The last condition index (18.667) is only moderately associated with the variables lotsize and bedroom (the bedroom variance proportion is 0.436); no collinearity exists.

Detecting multicollinearity: variance inflation factors (VIF)

A variance inflation factor (VIF) quantifies how much the variance of an estimated coefficient is inflated. The standard errors, and hence the variances, of the estimated coefficients are inflated (i.e. increased) when multicollinearity exists. So the variance inflation factor for the estimated coefficient $b_k$, denoted $VIF_k$, is just the factor by which the variance is inflated.

The VIF for the estimated coefficient $b_k$ is calculated as

$VIF_k = \frac{1}{1 - R_k^2}$

where $R_k^2$ is the $R^2$ value obtained by regressing the k-th predictor on the remaining predictors.

A VIF of 1 means that there is no correlation between the k-th predictor and the remaining predictor variables, and hence the variance of $b_k$ is not inflated at all. The general rule of thumb is that VIFs exceeding 4 warrant further investigation, while VIFs exceeding 10 are signs of serious multicollinearity requiring correction.
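The VIF formula and the rule of thumb translate into a few lines of code. A minimal sketch with hypothetical function names; the input is the $R_k^2$ of the auxiliary regression for predictor k.

```python
def vif(r2_k):
    """Variance inflation factor from the auxiliary R-squared of predictor k."""
    if not 0.0 <= r2_k < 1.0:
        raise ValueError("auxiliary R-squared must lie in [0, 1)")
    return 1.0 / (1.0 - r2_k)


def classify(vif_value):
    """Apply the rule of thumb: above 4 investigate, above 10 serious."""
    if vif_value > 10:
        return "serious"
    if vif_value > 4:
        return "investigate"
    return "ok"


for r2 in (0.0, 0.5, 0.8, 0.95):
    value = vif(r2)
    print(f"R2_k={r2:.2f}  VIF={value:6.2f}  {classify(value)}")
```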

Detecting multicollinearity: VIFs in Gretl

Gretl: click on Analysis in the estimated model window and then select Collinearity.

In this example there are no signs of serious multicollinearity because all VIFs are smaller than 10.

Resolving multicollinearity

The easiest ways to cure the problem are:

- remove one of the collinear variables
- transform the highly correlated variables into a ratio
- go out and collect more data
- switch to a higher data frequency

To reduce the multicollinearity that exists, it is not sufficient to go out and just collect more data observations. The data have to be collected in such a way as to ensure that the correlations among the violating predictors are actually reduced. That is, collecting more of the same kind of data won't help to reduce the multicollinearity.

Resolving multicollinearity: orthogonal auxiliary variables

Suppose we have three explanatory variables, X1, X2 and X3, and we have found that X1 is collinear with X2 and X3. One way to resolve the problem is to drop X1 from the regression. If we wish to keep X1 in the model together with X2 and X3, we have to transform X1 so that it is no longer collinear with these variables. One way is to make X1 orthogonal to X2 and X3.

How do we do it? First, we run the following auxiliary regression by least squares:

$X_1 = \alpha_0 + \alpha_1 X_2 + \alpha_2 X_3 + u$

Then we keep the residuals $\hat{u}$ from this model and make the following transformation:

$\tilde{X}_1 = \hat{\alpha}_0 + \hat{u}$

where $\tilde{X}_1$ denotes the orthogonal X1. We are now able to use $\tilde{X}_1$ as a regressor in our basic model.

Resolving multicollinearity: orthogonal auxiliary variables

In the previous example we found that the variable garage is collinear with the other variables. If we want to keep the variable in our model, we can make it orthogonal to the other factors.

We run the regression in which garage is the dependent variable, while the remaining predictors are used as independent variables:

[Estimation output of the auxiliary regression for garage.]

Note the estimated constant of this model; it is needed to construct the orthogonal variable.

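Resolving multicollinearity: orthogonal auxiliary variables

Based on this model, we can generate the residuals. Remember that

$X_7 = \hat{\alpha}_0 + \hat{\alpha}_1 X_1 + \hat{\alpha}_2 X_2 + \dots + \hat{\alpha}_6 X_6 + \hat{u}$
$\hat{X}_7 = \hat{\alpha}_0 + \hat{\alpha}_1 X_1 + \hat{\alpha}_2 X_2 + \dots + \hat{\alpha}_6 X_6$
$\hat{u} = X_7 - \hat{X}_7$

We can calculate the residuals of the model in Excel.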

Resolving multicollinearity: orthogonal auxiliary variables

Alternatively, we can calculate the residuals $\hat{u} = X_7 - \hat{X}_7$ of the model in Gretl. Select Save, and then Residuals. The series of residuals will appear as a new variable (in this case it appears as uhat1).

Resolving multicollinearity: orthogonal auxiliary variables

The last step is to calculate the orthogonal variable by using the formula

$\tilde{X}_7 = \hat{\alpha}_0 + \hat{u}$

That is, we add the estimated intercept of the auxiliary model to each residual.

Excel: use a new column in which you add the estimated intercept to the column of residuals.

Gretl: select Add, then Define new variable, and type in the box an expression of the form garage_orth = $\hat{\alpha}_0$ + uhat1, with $\hat{\alpha}_0$ replaced by the numerical value of the estimated intercept. The new variable will appear in the worksheet.
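The orthogonalization steps can be sketched in a few lines. To stay within the standard library, the example below simplifies the auxiliary regression to a single regressor (the slides regress garage on all remaining predictors); the data are made up and the names `ols_simple`, `garage` and `bedrooms` are illustrative assumptions.

```python
def ols_simple(y, x):
    """Least-squares intercept and slope of y on a single regressor x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up illustrative data.
garage = [0, 1, 0, 2, 1, 2]
bedrooms = [2, 3, 3, 4, 4, 5]

# Step 1: auxiliary regression of garage on the other regressor.
a0, a1 = ols_simple(garage, bedrooms)
# Step 2: residuals u_hat = garage - fitted garage.
residuals = [g - (a0 + a1 * b) for g, b in zip(garage, bedrooms)]
# Step 3: orthogonal variable = estimated intercept + residual.
garage_orth = [a0 + u for u in residuals]

# The new variable has zero sample covariance with the regressor.
n = len(bedrooms)
mean_b = sum(bedrooms) / n
mean_g = sum(garage_orth) / n
cov = sum((b - mean_b) * (g - mean_g) for b, g in zip(bedrooms, garage_orth)) / (n - 1)
print(f"covariance(garage_orth, bedrooms) = {cov:.2e}")
```

With several regressors the same three steps apply, but the auxiliary regression has to be estimated with a multiple regression routine.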

The omitted variables bias

If we omit explanatory variables that should be present in the regression, then the estimated coefficients on the included variables will not be accurate.

The intuition behind why the omission of variables causes bias is provided by the previous example: lot size is an important factor for house prices, and thus "wants" to enter into the regression. If we omit it from the regression, it will try to enter in the only way it can: through its positive correlation with the included explanatory variable, the number of bedrooms.

One practical consequence of omitted variables bias is that you should always try to include all those explanatory variables that could affect the dependent variable. Unfortunately, in practice, this is rarely possible. House prices, for instance, depend on many explanatory variables other than those found in the data set (e.g. the state of repair of the house, how pleasant the neighbors are, closet and storage space, whether the house has hardwood floors, the quality of the garden, etc.). Many of the omitted factors are subjective (e.g. how do you measure the pleasantness of the neighbors?).


More information

Chapter 15 - Multiple Regression

Chapter 15 - Multiple Regression Chapter - Multple Regresson Chapter - Multple Regresson Multple Regresson Model The equaton that descrbes how the dependent varable y s related to the ndependent varables x, x,... x p and an error term

More information

Basic Business Statistics, 10/e

Basic Business Statistics, 10/e Chapter 13 13-1 Basc Busness Statstcs 11 th Edton Chapter 13 Smple Lnear Regresson Basc Busness Statstcs, 11e 009 Prentce-Hall, Inc. Chap 13-1 Learnng Objectves In ths chapter, you learn: How to use regresson

More information

Chapter 15 Student Lecture Notes 15-1

Chapter 15 Student Lecture Notes 15-1 Chapter 15 Student Lecture Notes 15-1 Basc Busness Statstcs (9 th Edton) Chapter 15 Multple Regresson Model Buldng 004 Prentce-Hall, Inc. Chap 15-1 Chapter Topcs The Quadratc Regresson Model Usng Transformatons

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. Introductory Econometrics 1 hour 30 minutes

DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. Introductory Econometrics 1 hour 30 minutes 25/6 Canddates Only January Examnatons 26 Student Number: Desk Number:...... DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR Department Module Code Module Ttle Exam Duraton

More information

Scatter Plot x

Scatter Plot x Construct a scatter plot usng excel for the gven data. Determne whether there s a postve lnear correlaton, negatve lnear correlaton, or no lnear correlaton. Complete the table and fnd the correlaton coeffcent

More information

[The following data appear in Wooldridge Q2.3.] The table below contains the ACT score and college GPA for eight college students.

[The following data appear in Wooldridge Q2.3.] The table below contains the ACT score and college GPA for eight college students. PPOL 59-3 Problem Set Exercses n Smple Regresson Due n class /8/7 In ths problem set, you are asked to compute varous statstcs by hand to gve you a better sense of the mechancs of the Pearson correlaton

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics ECOOMICS 35*-A Md-Term Exam -- Fall Term 000 Page of 3 pages QUEE'S UIVERSITY AT KIGSTO Department of Economcs ECOOMICS 35* - Secton A Introductory Econometrcs Fall Term 000 MID-TERM EAM ASWERS MG Abbott

More information

STAT 3008 Applied Regression Analysis

STAT 3008 Applied Regression Analysis STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Tests of Single Linear Coefficient Restrictions: t-tests and F-tests. 1. Basic Rules. 2. Testing Single Linear Coefficient Restrictions

Tests of Single Linear Coefficient Restrictions: t-tests and F-tests. 1. Basic Rules. 2. Testing Single Linear Coefficient Restrictions ECONOMICS 35* -- NOTE ECON 35* -- NOTE Tests of Sngle Lnear Coeffcent Restrctons: t-tests and -tests Basc Rules Tests of a sngle lnear coeffcent restrcton can be performed usng ether a two-taled t-test

More information

Statistics II Final Exam 26/6/18

Statistics II Final Exam 26/6/18 Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the

More information

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9

Correlation and Regression. Correlation 9.1. Correlation. Chapter 9 Chapter 9 Correlaton and Regresson 9. Correlaton Correlaton A correlaton s a relatonshp between two varables. The data can be represented b the ordered pars (, ) where s the ndependent (or eplanator) varable,

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Economics 130. Lecture 4 Simple Linear Regression Continued

Economics 130. Lecture 4 Simple Linear Regression Continued Economcs 130 Lecture 4 Contnued Readngs for Week 4 Text, Chapter and 3. We contnue wth addressng our second ssue + add n how we evaluate these relatonshps: Where do we get data to do ths analyss? How do

More information

Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 212. Chapters 14, 15 & 16. Professor Ahmadi, Ph.D. Department of Management

Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 212. Chapters 14, 15 & 16. Professor Ahmadi, Ph.D. Department of Management Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 1 Chapters 14, 15 & 16 Professor Ahmad, Ph.D. Department of Management Revsed August 005 Chapter 14 Formulas Smple Lnear Regresson Model: y =

More information

Y = β 0 + β 1 X 1 + β 2 X β k X k + ε

Y = β 0 + β 1 X 1 + β 2 X β k X k + ε Chapter 3 Secton 3.1 Model Assumptons: Multple Regresson Model Predcton Equaton Std. Devaton of Error Correlaton Matrx Smple Lnear Regresson: 1.) Lnearty.) Constant Varance 3.) Independent Errors 4.) Normalty

More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

Module Contact: Dr Susan Long, ECO Copyright of the University of East Anglia Version 1

Module Contact: Dr Susan Long, ECO Copyright of the University of East Anglia Version 1 UNIVERSITY OF EAST ANGLIA School of Economcs Man Seres PG Examnaton 016-17 ECONOMETRIC METHODS ECO-7000A Tme allowed: hours Answer ALL FOUR Questons. Queston 1 carres a weght of 5%; Queston carres 0%;

More information

Chapter 14 Simple Linear Regression

Chapter 14 Simple Linear Regression Chapter 4 Smple Lnear Regresson Chapter 4 - Smple Lnear Regresson Manageral decsons often are based on the relatonshp between two or more varables. Regresson analss can be used to develop an equaton showng

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable

More information

STATISTICS QUESTIONS. Step by Step Solutions.

STATISTICS QUESTIONS. Step by Step Solutions. STATISTICS QUESTIONS Step by Step Solutons www.mathcracker.com 9//016 Problem 1: A researcher s nterested n the effects of famly sze on delnquency for a group of offenders and examnes famles wth one to

More information

a. (All your answers should be in the letter!

a. (All your answers should be in the letter! Econ 301 Blkent Unversty Taskn Econometrcs Department of Economcs Md Term Exam I November 8, 015 Name For each hypothess testng n the exam complete the followng steps: Indcate the test statstc, ts crtcal

More information

Introduction to Regression

Introduction to Regression Introducton to Regresson Dr Tom Ilvento Department of Food and Resource Economcs Overvew The last part of the course wll focus on Regresson Analyss Ths s one of the more powerful statstcal technques Provdes

More information

Chapter 3. Two-Variable Regression Model: The Problem of Estimation

Chapter 3. Two-Variable Regression Model: The Problem of Estimation Chapter 3. Two-Varable Regresson Model: The Problem of Estmaton Ordnary Least Squares Method (OLS) Recall that, PRF: Y = β 1 + β X + u Thus, snce PRF s not drectly observable, t s estmated by SRF; that

More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

Outline. Zero Conditional mean. I. Motivation. 3. Multiple Regression Analysis: Estimation. Read Wooldridge (2013), Chapter 3.

Outline. Zero Conditional mean. I. Motivation. 3. Multiple Regression Analysis: Estimation. Read Wooldridge (2013), Chapter 3. Outlne 3. Multple Regresson Analyss: Estmaton I. Motvaton II. Mechancs and Interpretaton of OLS Read Wooldrdge (013), Chapter 3. III. Expected Values of the OLS IV. Varances of the OLS V. The Gauss Markov

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 14 Multiple Regression Models

Statistics for Managers Using Microsoft Excel/SPSS Chapter 14 Multiple Regression Models Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 14 Multple Regresson Models 1999 Prentce-Hall, Inc. Chap. 14-1 Chapter Topcs The Multple Regresson Model Contrbuton of Indvdual Independent Varables

More information

Correlation and Regression

Correlation and Regression Correlaton and Regresson otes prepared by Pamela Peterson Drake Index Basc terms and concepts... Smple regresson...5 Multple Regresson...3 Regresson termnology...0 Regresson formulas... Basc terms and

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

ECONOMETRICS - FINAL EXAM, 3rd YEAR (GECO & GADE)

ECONOMETRICS - FINAL EXAM, 3rd YEAR (GECO & GADE) ECONOMETRICS - FINAL EXAM, 3rd YEAR (GECO & GADE) June 7, 016 15:30 Frst famly name: Name: DNI/ID: Moble: Second famly Name: GECO/GADE: Instructor: E-mal: Queston 1 A B C Blank Queston A B C Blank Queston

More information

18. SIMPLE LINEAR REGRESSION III

18. SIMPLE LINEAR REGRESSION III 8. SIMPLE LINEAR REGRESSION III US Domestc Beers: Calores vs. % Alcohol Ftted Values and Resduals To each observed x, there corresponds a y-value on the ftted lne, y ˆ ˆ = α + x. The are called ftted values.

More information

Polynomial Regression Models

Polynomial Regression Models LINEAR REGRESSION ANALYSIS MODULE XII Lecture - 6 Polynomal Regresson Models Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Test of sgnfcance To test the sgnfcance

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

The Ordinary Least Squares (OLS) Estimator

The Ordinary Least Squares (OLS) Estimator The Ordnary Least Squares (OLS) Estmator 1 Regresson Analyss Regresson Analyss: a statstcal technque for nvestgatng and modelng the relatonshp between varables. Applcatons: Engneerng, the physcal and chemcal

More information

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION 014-015 MTH35/MH3510 Regresson Analyss December 014 TIME ALLOWED: HOURS INSTRUCTIONS TO CANDIDATES 1. Ths examnaton paper contans FOUR (4) questons

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours UNIVERSITY OF TORONTO Faculty of Arts and Scence December 005 Examnatons STA47HF/STA005HF Duraton - hours AIDS ALLOWED: (to be suppled by the student) Non-programmable calculator One handwrtten 8.5'' x

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva Econ 39 - Statstcal Propertes of the OLS estmator Sanjaya DeSlva September, 008 1 Overvew Recall that the true regresson model s Y = β 0 + β 1 X + u (1) Applyng the OLS method to a sample of data, we estmate

More information

Question 1 carries a weight of 25%; question 2 carries 20%; question 3 carries 25%; and question 4 carries 30%.

Question 1 carries a weight of 25%; question 2 carries 20%; question 3 carries 25%; and question 4 carries 30%. UNIVERSITY OF EAST ANGLIA School of Economcs Man Seres PGT Examnaton 017-18 FINANCIAL ECONOMETRICS ECO-7009A Tme allowed: HOURS Answer ALL FOUR questons. Queston 1 carres a weght of 5%; queston carres

More information

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction

Interval Estimation in the Classical Normal Linear Regression Model. 1. Introduction ECONOMICS 35* -- NOTE 7 ECON 35* -- NOTE 7 Interval Estmaton n the Classcal Normal Lnear Regresson Model Ths note outlnes the basc elements of nterval estmaton n the Classcal Normal Lnear Regresson Model

More information

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 008 Recall: man dea of lnear regresson Lnear regresson can be used to study

More information

Professor Chris Murray. Midterm Exam

Professor Chris Murray. Midterm Exam Econ 7 Econometrcs Sprng 4 Professor Chrs Murray McElhnney D cjmurray@uh.edu Mdterm Exam Wrte your answers on one sde of the blank whte paper that I have gven you.. Do not wrte your answers on ths exam.

More information

ECON 351* -- Note 23: Tests for Coefficient Differences: Examples Introduction. Sample data: A random sample of 534 paid employees.

ECON 351* -- Note 23: Tests for Coefficient Differences: Examples Introduction. Sample data: A random sample of 534 paid employees. Model and Data ECON 35* -- NOTE 3 Tests for Coeffcent Dfferences: Examples. Introducton Sample data: A random sample of 534 pad employees. Varable defntons: W hourly wage rate of employee ; lnw the natural

More information

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding Recall: man dea of lnear regresson Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 8 Lnear regresson can be used to study an

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

Now we relax this assumption and allow that the error variance depends on the independent variables, i.e., heteroskedasticity

Now we relax this assumption and allow that the error variance depends on the independent variables, i.e., heteroskedasticity ECON 48 / WH Hong Heteroskedastcty. Consequences of Heteroskedastcty for OLS Assumpton MLR. 5: Homoskedastcty var ( u x ) = σ Now we relax ths assumpton and allow that the error varance depends on the

More information

28. SIMPLE LINEAR REGRESSION III

28. SIMPLE LINEAR REGRESSION III 8. SIMPLE LINEAR REGRESSION III Ftted Values and Resduals US Domestc Beers: Calores vs. % Alcohol To each observed x, there corresponds a y-value on the ftted lne, y ˆ = βˆ + βˆ x. The are called ftted

More information

Lecture 6 More on Complete Randomized Block Design (RBD)

Lecture 6 More on Complete Randomized Block Design (RBD) Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For

More information

Learning Objectives for Chapter 11

Learning Objectives for Chapter 11 Chapter : Lnear Regresson and Correlaton Methods Hldebrand, Ott and Gray Basc Statstcal Ideas for Managers Second Edton Learnng Objectves for Chapter Usng the scatterplot n regresson analyss Usng the method

More information

Soc 3811 Basic Social Statistics Third Midterm Exam Spring 2010

Soc 3811 Basic Social Statistics Third Midterm Exam Spring 2010 Soc 3811 Basc Socal Statstcs Thrd Mdterm Exam Sprng 2010 Your Name [50 ponts]: ID #: Your TA: Kyungmn Baek Meghan Zacher Frank Zhang INSTRUCTIONS: (A) Wrte your name on the lne at top front of every sheet.

More information

Linear regression. Regression Models. Chapter 11 Student Lecture Notes Regression Analysis is the

Linear regression. Regression Models. Chapter 11 Student Lecture Notes Regression Analysis is the Chapter 11 Student Lecture Notes 11-1 Lnear regresson Wenl lu Dept. Health statstcs School of publc health Tanjn medcal unversty 1 Regresson Models 1. Answer What Is the Relatonshp Between the Varables?.

More information

Econ107 Applied Econometrics Topic 9: Heteroskedasticity (Studenmund, Chapter 10)

Econ107 Applied Econometrics Topic 9: Heteroskedasticity (Studenmund, Chapter 10) I. Defnton and Problems Econ7 Appled Econometrcs Topc 9: Heteroskedastcty (Studenmund, Chapter ) We now relax another classcal assumpton. Ths s a problem that arses often wth cross sectons of ndvduals,

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 31 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 6. Rdge regresson The OLSE s the best lnear unbased

More information

Properties of Least Squares

Properties of Least Squares Week 3 3.1 Smple Lnear Regresson Model 3. Propertes of Least Squares Estmators Y Y β 1 + β X + u weekly famly expendtures X weekly famly ncome For a gven level of x, the expected level of food expendtures

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

Answers Problem Set 2 Chem 314A Williamsen Spring 2000

Answers Problem Set 2 Chem 314A Williamsen Spring 2000 Answers Problem Set Chem 314A Wllamsen Sprng 000 1) Gve me the followng crtcal values from the statstcal tables. a) z-statstc,-sded test, 99.7% confdence lmt ±3 b) t-statstc (Case I), 1-sded test, 95%

More information

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise. Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the

More information

Lecture 4 Hypothesis Testing

Lecture 4 Hypothesis Testing Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to

More information

STAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression

STAT 405 BIOSTATISTICS (Fall 2016) Handout 15 Introduction to Logistic Regression STAT 45 BIOSTATISTICS (Fall 26) Handout 5 Introducton to Logstc Regresson Ths handout covers materal found n Secton 3.7 of your text. You may also want to revew regresson technques n Chapter. In ths handout,

More information

Chapter 8 Multivariate Regression Analysis

Chapter 8 Multivariate Regression Analysis Chapter 8 Multvarate Regresson Analyss 8.3 Multple Regresson wth K Independent Varables 8.4 Sgnfcance tests of Parameters Populaton Regresson Model For K ndependent varables, the populaton regresson and

More information

Chapter 14 Simple Linear Regression Page 1. Introduction to regression analysis 14-2

Chapter 14 Simple Linear Regression Page 1. Introduction to regression analysis 14-2 Chapter 4 Smple Lnear Regresson Page. Introducton to regresson analyss 4- The Regresson Equaton. Lnear Functons 4-4 3. Estmaton and nterpretaton of model parameters 4-6 4. Inference on the model parameters

More information

Testing for seasonal unit roots in heterogeneous panels

Testing for seasonal unit roots in heterogeneous panels Testng for seasonal unt roots n heterogeneous panels Jesus Otero * Facultad de Economía Unversdad del Rosaro, Colomba Jeremy Smth Department of Economcs Unversty of arwck Monca Gulett Aston Busness School

More information

where I = (n x n) diagonal identity matrix with diagonal elements = 1 and off-diagonal elements = 0; and σ 2 e = variance of (Y X).

where I = (n x n) diagonal identity matrix with diagonal elements = 1 and off-diagonal elements = 0; and σ 2 e = variance of (Y X). 11.4.1 Estmaton of Multple Regresson Coeffcents In multple lnear regresson, we essentally solve n equatons for the p unnown parameters. hus n must e equal to or greater than p and n practce n should e

More information

Chapter 12 Analysis of Covariance

Chapter 12 Analysis of Covariance Chapter Analyss of Covarance Any scentfc experment s performed to know somethng that s unknown about a group of treatments and to test certan hypothess about the correspondng treatment effect When varablty

More information

Continuous vs. Discrete Goods

Continuous vs. Discrete Goods CE 651 Transportaton Economcs Charsma Choudhury Lecture 3-4 Analyss of Demand Contnuous vs. Dscrete Goods Contnuous Goods Dscrete Goods x auto 1 Indfference u curves 3 u u 1 x 1 0 1 bus Outlne Data Modelng

More information

β0 + β1xi. You are interested in estimating the unknown parameters β

β0 + β1xi. You are interested in estimating the unknown parameters β Ordnary Least Squares (OLS): Smple Lnear Regresson (SLR) Analytcs The SLR Setup Sample Statstcs Ordnary Least Squares (OLS): FOCs and SOCs Back to OLS and Sample Statstcs Predctons (and Resduals) wth OLS

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

e i is a random error

e i is a random error Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where + β + β e for,..., and are observable varables e s a random error How can an estmaton rule be constructed for the unknown

More information

CHAPTER 8 SOLUTIONS TO PROBLEMS

CHAPTER 8 SOLUTIONS TO PROBLEMS CHAPTER 8 SOLUTIONS TO PROBLEMS 8.1 Parts () and (). The homoskedastcty assumpton played no role n Chapter 5 n showng that OLS s consstent. But we know that heteroskedastcty causes statstcal nference based

More information

Topic 7: Analysis of Variance

Topic 7: Analysis of Variance Topc 7: Analyss of Varance Outlne Parttonng sums of squares Breakdown the degrees of freedom Expected mean squares (EMS) F test ANOVA table General lnear test Pearson Correlaton / R 2 Analyss of Varance

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition) Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes

More information
