Econometrics Midterm Examination Answers


March 4, 204

Question 1 (35 points) Answer the following short questions.

(i) Define what an unbiased estimator is. Show that $\bar{X}$ is an unbiased estimator for $E(X_i) = \mu$ under the usual assumptions. (5 points)

A: $\hat{\theta}$ is an unbiased estimator for $\theta$ if $E(\hat{\theta}) = \theta$. Here $\bar{X} = \sum_{i=1}^{n} X_i / n$. Under the usual assumption of a random sample (observations are independent draws from an identical distribution), $E(\bar{X}) = \sum_{i} E(X_i)/n = n\mu/n = \mu$.

(ii) If $E(X_i) = \mu$ and $Var(X_i) = \sigma^2$, and observations are independent of each other, what do the Law of Large Numbers and the Central Limit Theorem state about the sample mean $\bar{X} = (\sum_{i=1}^{n} X_i)/n$? (6 points)

A: LLN: $\bar{X}_n$ converges in probability to $E(X_i) = \mu$. (An intuitive explanation is also acceptable: the distribution of $\bar{X}_n$ gets narrower and narrower as $n$ increases, and as $n \to \infty$, $\bar{X}_n$ collapses to the true value.) CLT: $\sqrt{n}(\bar{X}_n - \mu) \to_d N(0, \sigma^2)$, or approximately $\bar{X} \sim N(\mu, \sigma^2/n)$. (Intuitive explanation: the distribution of $\bar{X}$ gets closer and closer to normal as $n$ becomes larger.)

(iii) Is the following statement true or false? Explain. "If I always provide an estimate of 37 for whatever sample I obtain, this estimator is the most efficient because its variance is zero." (5 points)

A: False. It is only meaningful to compare variances if the estimators are unbiased (or consistent). We do not know whether the true value is 37, so such an estimator is likely to be biased and inconsistent.

(iv) Are the following valid null hypotheses in statistical testing? Explain. (a) $\mu_X = 100$; (b) $\bar{X} = 100$. (5 points)

A: Only (a) is valid, because we test hypotheses about parameters (properties) of the population, not about a number computed from the sample.

(v) What do the no perfect multicollinearity and zero conditional mean assumptions mean among the basic assumptions of the linear regression model? What are the consequences if they are violated? (8 points)

A: Perfect multicollinearity means there exists an exact linear relationship between some regressors. In that case the OLS estimator is undefined. Zero conditional mean means $E(u_i | x_i) = 0$, that is, $u_i$ is not predictable by $x_i$. (Any information related to $x_i$ is captured by the linear function.) (This also implies that the regressors and the error are uncorrelated.) When this is violated, the estimator becomes biased.

(vi) What are the four factors that affect the variance of individual slope coefficient estimators under OLS for a multiple regression model? How do they affect the variance? (6 points)

A: The four factors are the variance of the error term $\sigma^2$ (positive), the sample size $n$ (negative), the variance of the regressor (negative), and the $R_k^2$ between this regressor and the other regressors (positive). These effects can be read off the homoskedastic variance formula $Var(\hat{\beta}_k) = \sigma^2 / [SST_k (1 - R_k^2)]$, where $SST_k = \sum_i (x_{ik} - \bar{x}_k)^2$ grows with both $n$ and the variance of the regressor.

Question 2 (12 points) Let $z$ be a random column vector of size $3 \times 1$:

$$z = \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}$$

(a) Write out $z'z$ and $zz'$ in terms of $z_1$, $z_2$ and $z_3$. (3 points)
(b) If $z \sim N(0, I_3)$, what is the distribution of $z'z$? (2 points)
(c) If $z \sim N(0, I_3)$, what is $E(zz')$? What is $E(z'z)$? (4 points)
(d) If $a = (1, 1, 1)'$ (a column vector of ones), what is $a'z$? Calculate $E(a'z)$ and $Var(a'z)$. (3 points)

A: (a) $z'z = z_1^2 + z_2^2 + z_3^2$ and

$$zz' = \begin{pmatrix} z_1^2 & z_1 z_2 & z_1 z_3 \\ z_2 z_1 & z_2^2 & z_2 z_3 \\ z_3 z_1 & z_3 z_2 & z_3^2 \end{pmatrix}$$

(b) Chi-square with 3 degrees of freedom, since each term is the square of a standard normal and each term is independent of all the others.

(c)

$$E(zz') = \begin{pmatrix} E(z_1^2) & E(z_1 z_2) & E(z_1 z_3) \\ E(z_2 z_1) & E(z_2^2) & E(z_2 z_3) \\ E(z_3 z_1) & E(z_3 z_2) & E(z_3^2) \end{pmatrix} = Var(z) = I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

$E(z'z) = E(z_1^2) + E(z_2^2) + E(z_3^2) = Var(z_1) + Var(z_2) + Var(z_3) = 1 + 1 + 1 = 3$. (Since $E(z_i) = 0$.)

(d) $a'z = z_1 + z_2 + z_3$. $E(a'z) = 0 + 0 + 0 = 0$. $Var(a'z) = Var(z_1) + Var(z_2) + Var(z_3) = 3$, using independence. (In matrix terms, $Var(a'z) = a'Ia = a'a = 3$.)
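The matrix algebra in Question 2 can be checked numerically. The following is a minimal sketch in plain Python; the realisation `z = [1, 2, 3]` is a hypothetical example chosen for illustration and is not part of the exam.

```python
# Hypothetical realisation of the 3x1 vector z (illustration only).
z = [1.0, 2.0, 3.0]

# Inner product z'z is a scalar: z1^2 + z2^2 + z3^2.
inner = sum(zi * zi for zi in z)

# Outer product zz' is a 3x3 matrix with (i, j) entry z_i * z_j.
outer = [[zi * zj for zj in z] for zi in z]

# The trace of zz' equals z'z, which is why E(z'z) = trace(E(zz')).
trace_outer = sum(outer[i][i] for i in range(3))

# With Var(z) = I_3: E(z'z) = trace(I_3) and Var(a'z) = a' I_3 a.
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
a = [1.0, 1.0, 1.0]
expected_inner = sum(identity[i][i] for i in range(3))
var_az = sum(a[i] * identity[i][j] * a[j] for i in range(3) for j in range(3))

print(inner, trace_outer, expected_inner, var_az)
```

The trace identity is a compact way to see why $E(z'z)$ equals the sum of the diagonal entries of $E(zz')$.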

Question 3 (13 points) We want to know about the monthly expenditure on food for students in this university. In particular, we want to estimate the mean of their food expenditure. We randomly sample $n = 225$ students. The sample mean food expenditure obtained is $\bar{Y} = 962.5$, with sample standard deviation $s = 88.8$. We want to test the hypothesis $H_0: \mu = 1000$ against $H_1: \mu \neq 1000$.

(a) What test statistic do we use? What is its distribution if the null is true? (3 points)
(b) Calculate the statistic and carry out the test at the 5% significance level. (Critical value is 1.96.) (5 points)
(c) Calculate the 95% two-sided confidence interval for the population mean food expenditure. (3 points)
(d) Is the 99% two-sided confidence interval longer or shorter than the one at 95%? Why? (2 points)

A: (a) We use the t test, with the t statistic $t = (\bar{Y} - \mu_0)/(s/\sqrt{n})$. If the null hypothesis is true, its distribution is $t(224)$, or approximately standard normal. (3 points)

(b) Here, $t = (962.5 - 1000)/(88.8/\sqrt{225}) = -6.3345 < -1.96$. Therefore we can reject the null hypothesis at the 5% significance level. (5 points)

(c) The 95% confidence interval is given by $962.5 \pm 1.96(88.8/\sqrt{225}) = 962.5 \pm 11.6 = (950.9, 974.1)$. (3 points)

(d) The 99% confidence interval is wider because, given the same information, to increase the probability of covering the true value we have to allow a longer interval. (We use a larger critical value.) (2 points)
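The arithmetic in Question 3 can be reproduced with a few lines of Python. This is a minimal sketch using the numbers stated in the question and the normal critical value 1.96 given there; the variable names are illustrative.

```python
import math

n = 225          # sample size
y_bar = 962.5    # sample mean of food expenditure
s = 88.8         # sample standard deviation
mu_0 = 1000.0    # value of the mean under the null hypothesis
crit = 1.96      # two-sided 5% critical value given in the question

# Standard error of the sample mean.
se = s / math.sqrt(n)

# t statistic for H0: mu = mu_0.
t_stat = (y_bar - mu_0) / se

# Two-sided test: reject when |t| exceeds the critical value.
reject = abs(t_stat) > crit

# 95% confidence interval for the population mean.
ci = (y_bar - crit * se, y_bar + crit * se)

print(round(se, 2), round(t_stat, 4), reject, [round(b, 1) for b in ci])
```

Replacing `crit` with the 1% critical value (about 2.58 under the normal approximation) widens the interval, which is the point of part (d).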

Question 4 (30 points) Consider one example we have gone through in class. We want to see what determines students' test scores in school. The dependent variable testscr is the average test score of a school, str is the student-teacher ratio of the school (number of students per teacher), and avginc is the average income of families (per $1000) in the school district. The student-teacher ratio captures how class size can affect students' learning, while average income indirectly captures the intensity of human capital investment from the family. Here we estimate

$$testscr_i = \beta_1 + \beta_2 str_i + \beta_3 avginc_i + \beta_4 avginc_i^2 + \beta_5 avginc_i^3 + u_i$$

where the square and cube of average income are also included. The following shows the regression output from Stata:

    . reg testscr str avginc avginc2 avginc3

          Source |       SS       df       MS              Number of obs =     420
    -------------+------------------------------           F(  4,   415) =  135.49
           Model |  86144.9747     4  21536.2437           Prob > F      =  0.0000
        Residual |  65964.6189   415  158.950889           R-squared     =  0.5663
    -------------+------------------------------           Adj R-squared =  0.5622
           Total |  152109.594   419  363.030056           Root MSE      =  12.608

    ------------------------------------------------------------------------------
         testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             str |  -.9277523   .3369433    -2.75   0.006     -1.59008   -.2654239
          avginc |   5.124736   .8536044     6.00   0.000     3.446809    6.802664
         avginc2 |  -.1011073   .0371171    -2.72   0.007    -.1740683   -.0281462
         avginc3 |   .0007293   .0004685     1.56   0.120    -.0001917    .0016503
           _cons |   617.8974   8.679455    71.19   0.000     600.8362    634.9586
    ------------------------------------------------------------------------------

(a) Interpret the coefficient on the student-teacher ratio (str). Is it statistically significant at the 5% level? (Critical value for a 2-sided test under the normal distribution is 1.96.) (5 points)
(b) If instead we want to test whether the coefficient on str is -2.0, what is the test statistic and can we reject the null at the 5% significance level? (3 points)
(c) Write down the formula of $R^2$ in terms of the various sums of squares and verify that the number shown in the right column is the same as calculated from the sums of squares shown on the left. Do the same for $\bar{R}^2$ (adjusted $R^2$). (4 points)
(d) What is the F test in the right column testing? Write down the null and alternative hypotheses. How can we calculate this statistic with the sums of squares available? Can we reject the null hypothesis at the 5% level? (6 points)
(e) Verify the confidence interval shown for the coefficient on str using the formula introduced in class and the numbers provided in the results. (4 points)

(Continued on the next page. If you do not have enough space, you can write on the next page.)

Now we would like to test the joint significance of the coefficients on average income and its square and cube terms. That is, $H_0: \beta_3 = \beta_4 = \beta_5 = 0$ using an F test. (Note: In my notation, $\beta_1$ is the intercept term.)

(f) What is the alternative hypothesis? (2 points)

The results for the restricted regression are shown below.

    . reg testscr str

          Source |       SS       df       MS              Number of obs =     420
    -------------+------------------------------           F(  1,   418) =   22.58
           Model |  7794.11004     1  7794.11004           Prob > F      =  0.0000
        Residual |  144315.484   418  345.252353           R-squared     =  0.0512
    -------------+------------------------------           Adj R-squared =  0.0490
           Total |  152109.594   419  363.030056           Root MSE      =  18.581

    ------------------------------------------------------------------------------
         testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             str |  -2.279808   .4798256    -4.75   0.000     -3.22298   -1.336637
           _cons |    698.933    9.46749    73.82   0.000      680.323    717.5428
    ------------------------------------------------------------------------------

(g) Perform the test. What is the F statistic in this sample? What are the distribution and the degrees of freedom for the distribution under the null hypothesis? Can we reject the null hypothesis at the 5% significance level? (Possible critical values: $F_{1,\infty,0.05} = 3.84$; $F_{3,\infty,0.05} = 2.60$; $F_{5,\infty,0.05} = 2.21$, where the subscripts are the numerator degrees of freedom, the denominator degrees of freedom and the significance level respectively.) (6 points)

Answers:

(a) For one more student per teacher, the average score of the school falls by about 0.93 points, holding average income constant. As the p-value (0.006) is smaller than 0.05 (or $|t| = 2.75 > 1.96$), the coefficient is significant. (5 points)

(b) $t = (-0.9278 + 2.0)/0.3369 = 3.1825 > 1.96$. So we can also reject the null that the coefficient is -2.0. (3 points)

(c) With SSE the explained (model) sum of squares, SSR the sum of squared residuals and SST the total sum of squares,

$R^2 = SSE/SST = 1 - SSR/SST = 86144.9747/152109.594 = 0.5663$.

$\bar{R}^2 = 1 - \dfrac{SSR/(n-K)}{SST/(n-1)} = 1 - \dfrac{65964.6189/415}{152109.594/419} = 0.5622$. (4 points)

(d) The F test reported on the right tests that all population coefficients besides the constant (intercept) term are zero: $H_0: \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$ against $H_1: \beta_2 \neq 0$ or $\beta_3 \neq 0$ or $\beta_4 \neq 0$ or $\beta_5 \neq 0$. It can be calculated as

$F = \dfrac{(152109.594 - 65964.6189)/4}{65964.6189/415} = 135.49$

The p-value stated there is smaller than 0.00005, so we can reject the null at the 5% significance level. (6 points)

(e) The 95% confidence interval is $-0.9278 \pm 1.96(0.33694) = -0.9278 \pm 0.6604 = (-0.9278 - 0.6604, -0.9278 + 0.6604) = (-1.5882, -0.2674)$. (There is some discrepancy due to the use of the normal critical value. Stata may have used a more accurate critical value from $t(415)$.) (4 points)

(f) $H_1: \beta_3 \neq 0$ or $\beta_4 \neq 0$ or $\beta_5 \neq 0$. (2 points)

(g) $F = \dfrac{(SSR_R - SSR_U)/3}{SSR_U/(n-K)} = \dfrac{(144315.484 - 65964.6189)/3}{65964.6189/415} = 164.31 > 2.60$. Under the null the statistic follows $F(3, 415)$, approximately $F(3, \infty)$, so 2.60 is the relevant critical value. We can therefore reject the null hypothesis that all three coefficients are zero. (6 points)
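The calculations in parts (c), (d) and (g) can be reproduced from the sums of squares alone. The sketch below assumes the values as read from the two Stata outputs (total 152109.594, unrestricted residual 65964.6189, restricted residual 144315.484); where the transcription is ambiguous these readings are an assumption, checked only by their agreement with the reported $R^2$ and F values.

```python
n = 420            # number of observations
k = 5              # parameters in the unrestricted model, including the intercept
q = 3              # restrictions tested in part (g): beta_3 = beta_4 = beta_5 = 0

sst = 152109.594       # total sum of squares (assumed reading of the output)
ssr_u = 65964.6189     # residual sum of squares, unrestricted model
ssr_r = 144315.484     # residual sum of squares, restricted model (str only)

# Part (c): R-squared and adjusted R-squared.
r2 = 1 - ssr_u / sst
adj_r2 = 1 - (ssr_u / (n - k)) / (sst / (n - 1))

# Part (d): overall F test that all slope coefficients are zero.
f_overall = ((sst - ssr_u) / (k - 1)) / (ssr_u / (n - k))

# Part (g): F test for the joint significance of the income terms.
f_income = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))

print(round(r2, 4), round(adj_r2, 4), round(f_overall, 2), round(f_income, 2))
```

The overall F test is the same formula as the restricted-versus-unrestricted test, with the intercept-only model as the restricted model (its residual sum of squares is SST).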

Question 5 (10 points) Consider the case in which the regression function does not have an intercept. Suppose we know that the population regression function is

$$y_i = \beta_1 x_i + u_i$$

where $\beta_0 = 0$.

(i) What is $E(y_i | x_i)$? What is $E(y_i | x_i = 0)$? (2 points)
(ii) Derive the OLS estimator for $\beta_1$. That means we minimize the sum of squared residuals, $\min_{\hat{\beta}_1} \sum_{i=1}^{n} (y_i - \hat{\beta}_1 x_i)^2$. (5 points)
(iii) Show that the same estimator can be obtained by using the moment condition $E(u_i x_i) = 0$. (3 points)

A: (i) $E(y_i | x_i) = E(\beta_1 x_i | x_i) + E(u_i | x_i) = \beta_1 x_i$, since $E(u_i | x_i) = 0$ by the basic assumptions. $E(y_i | x_i = 0) = \beta_1 (0) = 0$. Thus, in a regression model without an intercept term, the mean of $y$ is zero when $x = 0$. (The line passes through the origin.)

(ii) The first order condition is

$$\sum_{i=1}^{n} 2(y_i - \hat{\beta}_1 x_i)(-x_i) = 0$$
$$-\sum_{i=1}^{n} x_i y_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = 0$$
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$$

(iii) By the moment condition, $E(u_i x_i) = E((y_i - \beta_1 x_i) x_i) = E(y_i x_i) - \beta_1 E(x_i^2) = 0$. Replacing the population moments with the sample moments, we have

$$\frac{1}{n} \sum_{i=1}^{n} y_i x_i = \hat{\beta}_1 \left( \frac{1}{n} \sum_{i=1}^{n} x_i^2 \right)$$

Thus we have the same estimator.
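As a numerical illustration of Question 5, the sketch below computes the no-intercept OLS estimator on a small made-up data set and checks that the sample moment condition holds at the estimate. The data values are hypothetical and chosen only for illustration.

```python
# Hypothetical data (illustration only, not from the exam).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
n = len(x)

def ssr(b):
    """Sum of squared residuals for the no-intercept line y = b * x."""
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))

# OLS estimator without an intercept: sum(x*y) / sum(x^2).
beta_hat = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# Sample analogue of E(u_i x_i) = 0, evaluated at beta_hat.
residuals = [yi - beta_hat * xi for xi, yi in zip(x, y)]
sample_moment = sum(ui * xi for ui, xi in zip(residuals, x)) / n

# The estimate should give a smaller SSR than nearby slope values.
is_minimum = ssr(beta_hat) < ssr(beta_hat - 0.01) and ssr(beta_hat) < ssr(beta_hat + 0.01)

print(beta_hat, sample_moment, is_minimum)
```

The first order condition and the sample moment condition are the same equation up to a factor of $-2/n$, which is why both routes give the same estimator.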