Exercises (in progress)
Applied Econometrics 2016-2017
Part 1

1. Define the concept of an unbiased estimator.

2. Explain what a classical linear regression model is and what its distinctive features are.

3. Consider the following linear regression model:

   $$y_i = \beta_1 + \beta_2 D_i + \beta_3 x_i + e_i, \quad i = 1, 2, \ldots, n$$

   where $n$ is the number of countries considered, $y_i$ is the logged annual temperature in country $i$ (average across the year), $x_i$ is the logged volume of CO$_2$ emissions registered in country $i$ (average across the year), and $D_i$ is a dummy (deterministic) variable which takes value 1 if country $i$ is a South country and value 0 if country $i$ is a North country. Finally, $e_i$ is a disturbance term which we assume to be homoskedastic and uncorrelated.

   (a) Explain the meaning of this regression model, and in particular the interpretation of the parameter $\beta_3$.

   (b) What is the constant associated with South countries? What is the constant associated with North countries? What is the interpretation of the null hypothesis $H_0: \beta_2 = 0$?

   (c) Consider the matrix form representation of the regression model above. In particular, show the form of the matrix $\Sigma = E(ee')$ and the columns of the matrix $X$.

   (d) Suppose that the OLS estimation of the linear regression model above on a set of $n = 25$ countries has produced the following estimation results:

       $R^2 = 0.65$
       $\hat{\beta}_1 = 0.5$     $s.e.(\hat{\beta}_1) = 0.05$
       $\hat{\beta}_2 = 0.15$    $s.e.(\hat{\beta}_2) = 0.01$
       $\hat{\beta}_3 = 0.4$     $s.e.(\hat{\beta}_3) = 0.45$

       Which conclusion can we draw about the relationship between average temperature and CO$_2$ emissions?

4. Define what a consistent estimator is.

5. Explain the differences between the classical linear regression model and the generalized linear regression model.
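As a rough sketch of the reasoning behind exercise 3(d), the t-ratios can be computed directly from the reported estimates and standard errors. The 5% significance level is an assumption (the exercise does not fix one), the critical value 2.074 is the two-sided 5% value of a Student t with $n - k = 22$ degrees of freedom taken from standard tables, and all variable names are illustrative.

```python
# Reported OLS results from exercise 3(d): (estimate, standard error).
estimates = {
    "beta1": (0.5, 0.05),
    "beta2": (0.15, 0.01),
    "beta3": (0.4, 0.45),
}

n, k = 25, 3          # observations and number of regressors (incl. constant)
dof = n - k           # degrees of freedom: 22
T_CRIT = 2.074        # assumed: two-sided 5% critical value of t(22), from tables


def t_ratio(estimate, std_error, null_value=0.0):
    """t-statistic for H0: coefficient = null_value."""
    return (estimate - null_value) / std_error


results = {}
for name, (b, se) in estimates.items():
    t = t_ratio(b, se)
    significant = abs(t) > T_CRIT
    results[name] = (t, significant)
    print(f"{name}: t = {t:.2f}, significant at 5%: {significant}")
```

Under these assumptions only the t-ratio of $\hat{\beta}_3$ falls below the critical value, which is the piece of evidence that question (d) asks you to interpret.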
6. Consider the linear regression model for cross-section data:

   $$y_i = \beta_0 + \beta_1 z_i + e_i, \quad i = 1, 2, \ldots, N$$

   where it is known that

   $$E(e_i) = 0$$
   $$E(e_i^2) = [\ldots]\, z_i^2, \quad i = 1, 2, \ldots, N, \quad h > 0$$
   $$\mathrm{Cov}(e_i, e_j) = E(e_i e_j) = [\ldots] \quad \text{for each } i \text{ and each } j.$$

   What is the optimal estimator of $\beta = (\beta_0, \beta_1)'$?

7. Consider the linear regression model based on time series data ($y_0$ is given):

   $$y_t = \beta_0 + \beta_1 y_{t-1} + e_t, \quad e_t \sim WN(0, \sigma_e^2), \quad t = 1, 2, 3, 4 = T$$

   Describe the form of the matrix $X$ in this specific case, using the compact matrix representation $y = X\beta + e$. Discuss also the form of the matrix $\Sigma = E(ee' \mid X)$. Is it a classical or a generalized linear regression model?

8. Consider the linear regression model based on cross-section data:

   $$y_i = \beta_0 + \beta_1 z_i + e_i, \quad i = 1, 2, \ldots, N$$

   where it is known that

   $$E(e_i) = 0$$
   $$E(e_i^2) = \sigma_e^2$$
   $$\mathrm{Cov}(e_i, e_j) = E(e_i e_j) = 0.5 \quad \text{for each } i \neq j.$$

   A. Find the compact matrix representation (i.e. describe the vectors and matrices $y$, $X$, $\beta$, $e$ in this case).

   B. Discuss the structure of the matrix $\Sigma = E(ee')$ and say whether we are in the presence of a classical or a generalized linear regression model.

   C. What is the "optimal estimator" of $\beta$? Can we compute it?

9. What are the small-sample properties of the OLS estimator in the linear regression model?

10. What is the meaning of BLUE when referred to the OLS estimator of a linear regression model?
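To make the estimator discussed in exercise 8 concrete, the following is a minimal NumPy sketch of the generalized least squares formula $(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y$ on simulated data. Everything numerical here is an assumption made for illustration: the sample size, the regressor $z_i$, the true coefficients, and in particular the value $\sigma_e^2 = 1$, which the exercise leaves unspecified. The function names `gls` and `ols` are hypothetical helpers, not part of any library.

```python
import numpy as np


def gls(X, y, Sigma):
    """GLS estimator (X' Sigma^-1 X)^-1 X' Sigma^-1 y, for a known Sigma."""
    Sigma_inv_X = np.linalg.solve(Sigma, X)   # Sigma^-1 X without explicit inverse
    Sigma_inv_y = np.linalg.solve(Sigma, y)   # Sigma^-1 y
    return np.linalg.solve(X.T @ Sigma_inv_X, X.T @ Sigma_inv_y)


def ols(X, y):
    """OLS estimator (X'X)^-1 X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)


rng = np.random.default_rng(0)
N = 50                                   # hypothetical sample size
z = rng.normal(size=N)                   # hypothetical regressor
X = np.column_stack([np.ones(N), z])     # columns: constant, z_i

sigma2_e = 1.0                           # assumed value; not given in the exercise
cov = 0.5                                # Cov(e_i, e_j) for i != j, as in exercise 8
# Sigma has sigma2_e on the diagonal and 0.5 everywhere else.
Sigma = np.full((N, N), cov) + (sigma2_e - cov) * np.eye(N)

beta_true = np.array([1.0, 2.0])         # hypothetical true coefficients
e = rng.multivariate_normal(np.zeros(N), Sigma)
y = X @ beta_true + e

beta_gls = gls(X, y, Sigma)
beta_ols = ols(X, y)
print("GLS:", beta_gls)
print("OLS:", beta_ols)
```

The sketch computes the GLS estimator only because $\Sigma$ is fully specified in the simulation; whether that is possible with the information given in the exercise is exactly what question C asks.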
11. Consider the following linear regression model:

    $$c_i = \beta_1 + \beta_2\, gdp_i + \beta_3 w_i + \varepsilon_i, \quad i = 1, 2, \ldots, n$$

    where $c_i$ is per-capita consumption in country $i$, $gdp_i$ is per-capita real GDP in country $i$, $w_i$ is real wealth in country $i$, $n$ is the total number of countries considered and $\varepsilon_i$ is a disturbance term assumed homoskedastic with variance $\sigma^2$, uncorrelated across countries and with Gaussian distribution.

    (A) Provide the economic interpretation of the coefficients $\beta_1$, $\beta_2$ and $\beta_3$ in this model.

    (B) State the form of the matrix $\Sigma = E(\varepsilon\varepsilon')$ in this model and the "best" estimator of the parameters $\beta = (\beta_1, \beta_2, \beta_3)$ and $\sigma^2$.

    (C) Imagine that the OLS estimation of the linear regression model above on a set of $n = 15$ countries has produced the following estimation results:

        $\hat{\sigma}^2 = 0.5$    $R^2 = 0.80$
        $\hat{\beta}_1 = 0.5$     $s.e.(\hat{\beta}_1) = 0.05$
        $\hat{\beta}_2 = 0.98$    $s.e.(\hat{\beta}_2) = 0.10$
        $\hat{\beta}_3 = 0.4$     $s.e.(\hat{\beta}_3) = 0.05$

        Explain why the standard errors are useful. Then suppose we are interested in the testing problem

        $$H_0: \beta_2 = 1 \qquad H_1: \beta_2 \neq 1.$$

        Explain the meaning of the null hypothesis and discuss a test for it.

12. Provide an example of a linear regression model with heteroskedastic disturbances.

13. When is the assumption of normal disturbances important in a linear regression model? Why do we need it?

14. What is the meaning of the F-test associated with the estimation output of a linear regression model?

15. Read the following link, http://www.sjsu.edu/faculty/watkins/gwstat2.htm, understand its contents, and try to summarize the main results.

16. What is a white noise disturbance?

17. Let $y_t$ be the mean annual surface air temperature in year $t$ in India for the period $t = 1881, \ldots, 1997$ (for a total of $n = 117$ years). Consider the following simple regression model

    $$y_t = \beta_1 + \beta_2 t + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2), \quad t = 1881, \ldots, 1997$$
    which is built with the objective of understanding whether there is a (linear) warming trend in India.

    (A) Consider the matrix form representation and say whether it is a classical or a generalized linear regression model.

    (B) What is the interpretation of the coefficient $\beta_2$ in this model?

    (C) Suppose that OLS estimation has produced the following results:

        $\hat{\beta}_1 = 10$      $s.e.(\hat{\beta}_1) = 0.05$
        $\hat{\beta}_2 = 0.47$    $s.e.(\hat{\beta}_2) = 0.39$
        $\hat{\sigma} = 0.6$      $R^2 = 0.69$

        Is it possible to conclude that there is a strong warming trend of about half a degree?

18. Let $q_t$ be aggregate output per hour worked at time $t$ in the U.S., $k_t$ the ratio between aggregate capital and aggregate labor at time $t$ and, finally, let $A_t$ be a technology index. Consider the following regression model:

    $$q_t = \beta_1 + \beta_2 k_t + \beta_3 A_t + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

    which is estimated on time series data from $t = 1909$ until $t = 1949$ (for a total of $T = 41$ annual observations). Estimation output is reported in the table that follows.

    (A) Interpret the results.

    (B) What is the meaning of the F-test?

    (C) Test the hypothesis that the response of aggregate output per hour worked to technology is exactly 1.
gretl output for installazioni 2017-03-21 12:30, page 1

Model 1: OLS, using observations 1909-1949 (T = 41)
Dependent variable: q

             coefficient   std. error   t-ratio    p-value
  ---------------------------------------------------------
  const      -0.295445     0.0111377    -26.53    4.00e-026 ***
  k           0.0956250    0.00398550    23.99    1.47e-024 ***
  A           0.717229     0.00491408   146.0     7.42e-054 ***

Mean dependent var   0.905976   S.D. dependent var   0.201284
Sum squared resid    0.002500   S.E. of regression   0.008112
R-squared            0.998457   Adjusted R-squared   0.998376
F(2, 38)             12295.91   P-value(F)           3.79e-54
Log-likelihood       140.7739   Akaike criterion    -275.5477
Schwarz criterion   -270.4070   Hannan-Quinn        -273.6758
rho                  0.483540   Durbin-Watson        1.022892

White's test for heteroskedasticity -
  Null hypothesis: heteroskedasticity not present
  Test statistic: LM = 7.31772
  with p-value = P(Chi-square(5) > 7.31772) = 0.198063
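As an illustration of how the numbers in the gretl output above can be used for exercise 18, the sketch below computes the t-statistic for the hypothesis that the coefficient on A equals 1 and checks that the reported F-statistic is consistent with the reported R-squared. The 5% significance level is an assumption, the critical value 2.024 is the two-sided 5% value of a Student t with $T - k = 38$ degrees of freedom taken from standard tables, and the variable names are illustrative.

```python
# Values copied from the gretl output (Model 1).
coef_A = 0.717229        # estimated coefficient on A
se_A = 0.00491408        # its standard error
r_squared = 0.998457

T, k = 41, 3             # observations and regressors (incl. constant)
dof = T - k              # residual degrees of freedom: 38

# t-statistic for H0: beta_3 = 1 against H1: beta_3 != 1.
t_stat = (coef_A - 1.0) / se_A
T_CRIT = 2.024           # assumed: two-sided 5% critical value of t(38), from tables
reject_h0 = abs(t_stat) > T_CRIT

# Overall F-statistic for H0: beta_2 = beta_3 = 0, rebuilt from R-squared.
q = k - 1                # number of restrictions: 2
f_from_r2 = (r_squared / q) / ((1.0 - r_squared) / dof)

print(f"t-statistic for H0: beta_3 = 1: {t_stat:.2f}")
print(f"Reject H0 at the 5% level: {reject_h0}")
print(f"F({q}, {dof}) rebuilt from R-squared: {f_from_r2:.1f}")
```

The rebuilt F value differs slightly from the one printed by gretl because R-squared is reported with only six decimals. The t-test itself relies on the usual assumptions on the disturbances, which the diagnostics in the output (for example the Durbin-Watson statistic) help to assess.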