Exam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h.

Similar documents
Econometrics Homework 4 Solutions

Problem Set 10: Panel Data

Fixed and Random Effects Models: Vartanian, SW 683

University of California at Berkeley Fall Introductory Applied Econometrics Final examination. Scores add up to 125 points

Simultaneous Equations with Error Components. Mike Bronner Marko Ledic Anja Breitwieser

ECON Introductory Econometrics. Lecture 16: Instrumental variables

Please discuss each of the 3 problems on a separate sheet of paper, not just on a separate page!

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests

Quantitative Methods Final Exam (2017/1)

Empirical Application of Panel Data Regression

1 The basics of panel data

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor

Lecture 3 Linear random intercept models

ECON Introductory Econometrics. Lecture 11: Binary dependent variables

Applied Statistics and Econometrics

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests

Linear Regression with Multiple Regressors

4 Instrumental Variables Single endogenous variable One continuous instrument. 2

ECON Introductory Econometrics. Lecture 17: Experiments

CRE METHODS FOR UNBALANCED PANELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M.

ECON 497 Final Exam Page 1 of 12

4 Instrumental Variables Single endogenous variable One continuous instrument. 2

Lecture 8: Instrumental Variables Estimation

****Lab 4, Feb 4: EDA and OLS and WLS

Exercices for Applied Econometrics A

Jeffrey M. Wooldridge Michigan State University

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors

Applied Statistics and Econometrics

Binary Dependent Variables

Applied Econometrics. Lecture 3: Introduction to Linear Panel Data Models

Fortin Econ Econometric Review 1. 1 Panel Data Methods Fixed Effects Dummy Variables Regression... 7

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Description Quick start Menu Syntax Options Remarks and examples Stored results Methods and formulas Acknowledgment References Also see

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)

Longitudinal Data Analysis. RatSWD Nachwuchsworkshop Vorlesung von Josef Brüderl 25. August, 2009

Chapter 6: Linear Regression With Multiple Regressors


Econometrics Review questions for exam

Econometrics Midterm Examination Answers

ECONOMETRICS HONOR S EXAM REVIEW SESSION

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1

Introduction to Econometrics. Multiple Regression (2016/2017)

Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics

multilevel modeling: concepts, applications and interpretations

Lecture 12: Effect modification, and confounding in logistic regression

Chapter 11. Regression with a Binary Dependent Variable

Problem Set 5 ANSWERS

Essential of Simple regression

Handout 12. Endogeneity & Simultaneous Equation Models

ECO220Y Simple Regression: Testing the Slope

Applied Statistics and Econometrics

Lab 07 Introduction to Econometrics

Linear Regression with Multiple Regressors

Multiple Regression Analysis: Estimation. Simple linear regression model: an intercept and one explanatory variable (regressor)

Lecture#12. Instrumental variables regression Causal parameters III

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Practice exam questions

STATISTICS 110/201 PRACTICE FINAL EXAM

Introduction to Econometrics

Monday 7 th Febraury 2005

FTE Employment before FTE Employment after

ECON3150/4150 Spring 2015

Econ 836 Final Exam. 2 w N 2 u N 2. 2 v N

Question 1 [17 points]: (ch 11)

1

INTRODUCTION TO BASIC LINEAR REGRESSION MODEL

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b.

Introduction to Econometrics

Nonlinear Regression Functions

Introduction to Econometrics. Multiple Regression

Empirical Application of Simple Regression (Chapter 2)

1 Linear Regression Analysis The Mincer Wage Equation Data Econometric Model Estimation... 11

ERSA Training Workshop Lecture 5: Estimation of Binary Choice Models with Panel Data

Introduction to Econometrics. Regression with Panel Data

Practice 2SLS with Artificial Data Part 1

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is

Final Exam. 1. Definitions: Briefly Define each of the following terms as they relate to the material covered in class.

ECON Introductory Econometrics. Lecture 13: Internal and external validity

Multiple Regression: Inference

Econ 371 Problem Set #6 Answer Sheet. deaths per 10,000. The 90% confidence interval for the change in death rate is 1.81 ±

point estimates, standard errors, testing, and inference for nonlinear combinations

ECON Introductory Econometrics. Lecture 2: Review of Statistics

Lecture #8 & #9 Multiple regression

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Statistical Inference with Regression Analysis

Write your identification number on each paper and cover sheet (the number stated in the upper right hand corner on your exam cover).

Review of Econometrics

Econometrics. 8) Instrumental variables

Problem set - Selection and Diff-in-Diff

The Simple Linear Regression Model

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Nonlinear Econometric Analysis (ECO 722) : Homework 2 Answers. (1 θ) if y i = 0. which can be written in an analytically more convenient way as

ECON3150/4150 Spring 2016

Problem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval]

Graduate Econometrics Lecture 4: Heteroskedasticity

ECON3150/4150 Spring 2016

Control Function and Related Methods: Nonlinear Models

Lecture 4: Multivariate Regression, Part 2

1. The shoe size of five randomly selected men in the class is 7, 7.5, 6, 6.5 the shoe size of 4 randomly selected women is 6, 5.

Binary Dependent Variable. Regression with a

Transcription:

Exam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h. This is an open book examination where all printed and written resources, in addition to a calculator, are allowed. If you are asked to derive something, give all intermediate steps. Do not answer questions with a "yes" or "no" only, but carefully motivate your answer. Question 1 The head of department wants to investigate whether a mandatory term paper increases the probability that a student passes the exam. He has asked the econometrics teacher to set up an experiment where students are randomly assigned to a treatment group that has to write a mandatory term paper and to a control group that does not have to write a mandatory term paper. At the end of the course the teacher has a data set with information on 150 students about whether they passed the exam (P assed i ) and about whether they had written a mandatory term paper (T ermpaper i = 1) or not (T ermpaper i = 0). The teacher decides to estimate the following regression model by OLS Thursday April 21 14:58:47 2016 Page 1 and obtains the following estimation results 1. regress Passed Termpaper, robust P assed i = β 0 + β 1 T ermpaper i + u i (1) Linear regression Number of obs = 150 F(1, 148) = 4.23 Prob > F = 0.0415 R-squared = 0.0278 Root MSE =.39706 Robust Passed Coef. Std. Err. t P> t [95% Conf. Interval] Termpaper.1333333.0648398 2.06 0.042 _cons.7333333.0514066 14.27 0.000.6317475.8349192 a) Give an interpretation, in words, of the two estimated coefficients. Solution (5 points): β0 = 0.73 is the fraction of students in the control group (that did not write a term paper) that passed the exam and β 1 = 0.13 is the difference in the fraction of students that passed the exam between those that wrote a term paper, and those that did not write a term paper. The fraction of students in the treatment group (that wrote a term paper) that passed the exam is equal to β 0 + β 1 = 0.867. 1

b) Compute a 95 percent confidence interval for β 1. Solution (5 points): 95% confidence interval for β 1 : β 1 ± 1.96 SE( β 1 ) filling in the numbers from the regression output gives 0.133 ± 1.96 0.065 (0.005, 0.261) c) The teacher wants to analyze whether the effect of the mandatory term paper depends on the age of the student. Describe in detail how you would extend model (1), such that you can test the null hypothesis that the effect of the mandatory term paper does not depend on the age of the student. Solution (10 points): The regression should be augmented to include an interaction term as follows: P assed i = β 0 + β 1 T ermp aper i + β 2 Age i + β 3 (T ermp aper i Age i ) + ɛ i whereby Age i is the age of the student. The hypothesis can be tested by using a t or F test testing H 0 : β 3 = 0. Thursday April 21 15:05:59 2016 Page 1 d) Because the dependent variable is binary the teacher decides to estimate a logit model and obtains the following estimation results. 1. logit Passed Termpaper, robust Iteration 0: log pseudolikelihood = -75.060364 Iteration 1: log pseudolikelihood = -72.978111 Iteration 2: log pseudolikelihood = -72.944236 Iteration 3: log pseudolikelihood = -72.944223 Iteration 4: log pseudolikelihood = -72.944223 Logistic regression Number of obs = 150 Wald chi2( 1) = 4.00 Prob > chi2 = 0.0454 Log pseudolikelihood = -72.944223 Pseudo R2 = 0.0282 Robust Passed Coef. Std. Err. z P> z [95% Conf. Interval] Termpaper.8602013.4298819 2.00 0.045.0176483 1.702754 _cons 1.011601.2619912 3.86 0.000.4981075 1.525094 2

What is effect of writing a term paper on the predicted probability of passing the exam? Solution (10 points): T he effect of writing a term paper on the predicted probability of passing the exam equals: P r(p assed i = 1) = Pr(P assed i = 1 T ermpaper i = 1) Pr(P assed i = 1 T ermpaper i = 0) P r(p assed i = 1) = ( 1/ ( 1 + e (1.01+0.86 1))) ( 1/ ( 1 + e (1.01+0.86 0))) = 0.866 0.733 = 0.133 e) The teacher finds out that 180 students enrolled in the course and 90 were randomly assigned to the treatment group and the other 90 to the control group. The teacher doesn t observe exam results for all 180 students but only for 150 students. She suspects that some of the students in the treatment group didn t want to write a term paper and dropped out of the course. Explain whether in this case the OLS estimator of β 1 (from part a)) is a consistent estimator of the causal effect of writing a mandatory term paper on the probability of passing the exam. Solution (10 points): This is an example of attrition. Since the attrition is related to the treatment it can lead to sample selection problems. If the less motivated students are the ones that decide to drop out of the course this implies that students in the treatment group that take the exam are on average more motivated than those in the control group that take the exam. This will result in an inconsistent OLS estimator of the causal effect of writing a mandatory term paper on the probability of passing the exam. f) The administration informs the teacher that 30 students got the flu and therefore did not take the exam. Explain whether in this case the OLS estimator of β 1 (from part a)) is a consistent estimator of the causal effect of writing a mandatory term paper on the probability of passing the exam. Solution (10 points): This is an example of attrition. Since the attrition is due to students getting the flu it is unlikely to be related to the treatment and it will therefore not lead to sample selection problems. This type of attrition will therefore not (or very unlikely) result in an inconsistent OLS estimator of the causal effect of writing a mandatory term paper on the probability of passing the exam. 3

Question 2 The federal government of the U.S. wants to know whether increasing seat belt usage reduces traffic fatalities. A government employee has data on the number of traffic fatalities per million of traffic miles (fatalityrate it ) and on the seat belt useage rate (sb useage it ) for 51 U.S. States, for the years 1990-1997. He estimates the following regression model by OLS Thursday April 21 10:19:57 2016 Page 1 and obtains the following estimation results 1. regress fatalityrate sb_useage, robust fatalityrate it = β 0 + β 1 sb usage it + u it (2) Linear regression Number of obs = 408 F(1, 406) = Prob > F = R-squared = 0.0309 Root MSE =.00448 Robust fatalityrate Coef. Std. Err. t P> t [95% Conf. Interval] sb_useage -.0060436.0018963 _cons.0220271.0012039 18.30 0.000.0196605.0243937 a) Is the relation between seat belt usage and the traffic fatality rate significantly different from zero at a 5 percent significance level? Solution (5 points):h 0 : β 1 = 0 vs H 1 : β 1 0. Construct the t-statistic: t = 0.006 0 0.0018 = 3.19 The absolute value of the t-statistic is bigger than 1.96 so we reject H 0. The relation between seat belt usage and the traffic fatality rate is significantly different from zero at a 5 percent significance level. 4

Thursday April 21 10:22:55 2016 Page 1 b) The government employee decides to include state fixed / effects / / / and obtains / / / the following results. 1. xtreg fatalityrate sb_useage, fe i(state) robust Fixed-effects (within) regression Number of obs = 408 Group variable: State Number of groups = 51 R-sq: Obs per group: within = 0.3010 min = 8 between = 0.0033 avg = 8.0 overall = 0.0309 max = 8 F( 1,50) = 65.43 corr(u_i, Xb) = -0.2176 Prob > F = 0.0000 (Std. Err. adjusted for 51 clusters in State) Robust fatalityrate Coef. Std. Err. t P> t [95% Conf. Interval] sb_useage -.0131094.0016207-8.09 0.000 -.0163646 -.0098542 _cons.0261786.0009522 27.49 0.000.024266.0280912 sigma_u.00433259 sigma_e.00167444 rho.87004661 (fraction of variance due to u_i) Are the results different from the results without fixed effects? Explain why or why not. Solution (10 points): Yes the estimated coefficient on seat belt usage in the regression with fixed effects is more negative and more than twice as large in absolute value. This indicates that there are time-invariant state characteristics that caused omitted variable bias in the regression without fixed effects. These characteristics are correlated with seat belt usage and affect the traffic fatality rate. c) Explain an alternative way to estimate the effect of seat belt usage on the traffic fatality rate while including state fixed effects. Solution (10 points): Instead of within estimation we can also include dummy variables for 50 states and include a constant term. Exclude 1 of the dummy variables to avoid perfect multicollinearity: fatalityrate it = β 0 + β 1 sb usage it + γ 2 D2 i +... + γ 51 D51 i + u it Di i = 1 for state i and 0 otherwise Alternatively we can include 51 dummies and exclude the constant term: fatalityrate it = β 1 sb usage it + γ 1 D1 i + γ 2 D2 i +... + γ 51 D51 i + u it Di i = 1 for state i and 0 otherwise 5

d) The government employee is worried that there are omitted variables that vary within states over time that cause omitted variable bias. He decides to use an instrumental variable approach and to use the presence of a mandatory seat belt law as instrument. The variable primary it is a binary variable that equals 1 if in State i in year t there is a law which makes it possible for a police officer to stop a car and ticket the driver if the officer observes that Thursday April 21 10:56:55 2016 Page 1 someone in the car is not wearing a seat belt. He obtains the following first stage estimation results 1. xtreg sb_useage primary, fe i(state) Fixed-effects (within) regression Number of obs = 408 Group variable: State Number of groups = 51 R-sq: Obs per group: within = 0.0306 min = 8 between = 0.3273 avg = 8.0 overall = 0.2212 max = 8 F( 1,356) = corr(u_i, Xb) = -0.4552 Prob > F = 0.0009 sb_useage Coef. Std. Err. t P> t [95% Conf. Interval] primary.2957143.0882233 0.001.12221.4692186 _cons.5418801.0142222 38.10 0.000.5139101.5698502 sigma_u.0992207 sigma_e.08252533 rho.5910923 (fraction of variance due to u_i) Is primary it a weak instrument? Solution (10 points): Compute the first stage F-statistic. Since there is only one instrument ( ) 2 ( the first stage F-statistic is the square of the the t-statistic:f = t 2 π = 1 SE( π 1 ) = 0.2957 2 0.0882) = 11.2. Since 11.2 is bigger than the rule of thumb value of 10, primary it is not a weak instrument. 6

e) The government employee decides to use two instruments. Next to the variable primary it the data set contains the variable secondary it. The binary variable secondary it equals 1 if in State i in year t there is a law that makes it possible that a police officer can write a ticket if an occupant is not wearing a seat belt, but only if the police officer has another Thursday April 21 11:01:32 2016 Page 1 reason to stop the car. The government official estimates a first stage regression using both instruments and obtains the following first stage estimation results 1. xtreg sb_useage primary secondary, fe i(state) Fixed-effects (within) regression Number of obs = 408 Group variable: State Number of groups = 51 R-sq: Obs per group: within = 0.0372 min = 8 between = 0.3319 avg = 8.0 overall = 0.2259 max = 8 F( 2,355) = 6.86 corr(u_i, Xb) = -0.4735 Prob > F = 0.0012 sb_useage Coef. Std. Err. t P> t [95% Conf. Interval] primary.3055089.0882704 3.46 0.001.1319103.4791074 secondary.0035196.0022574 1.56 0.120 -.00092.0079591 _cons.5380134.0144087 37.34 0.000.5096761.5663506 sigma_u.10001657 sigma_e.08235998 rho.59591494 (fraction of variance due to u_i) Are there weak instrument problems when using these two instrument? Explain whether it is better to use both primary it and secondary it as instruments or whether it is better to use only primary it as instrument for sb usage it. Solution (10 points): Compute the first stage F-statistic. Since there are two instruments and no control variables we can use the overall F statistic:f = 6.86. Since 6.86 is smaller than the rule of thumb value of 10, there are weak instrument problems. It is better to use only primary it as instrument because in this case the first stage F-statistic is higher (higher than 10), while using both instruments results in a lower F-statistic and weak instruments problems. 7

Question 3 Discuss whether each of the following statements is correct or not. a) If we perform a J-test and we don t reject the null hypothesis we know that the instrument relevance condition holds. Solution (5 points) Incorrect. The J-test test the null hypothesis of instrument exogeneity, not instrument relevance. If we do not reject the null hypothesis of instrument exogeneity this does not imply that we know that instrument exogeneity holds, we can never accept the null hypothesis. We can use the first stage regression (regression of the endogenous regressor on the instrument(s)) to test for instrument relevance. b) If the R 2 = 0.5 the total sum of squares is twice as large as the sum of squared residuals. Solution (5 points) Correct. R 2 = 1 SSR T SS SSR so T SS = 1 0.5 = T SS = SSR 0.5 = 2 SSR c) When the sample size is small we can estimate the Root Mean Squared Forecast Error by the Standard Error of the Regression. Solution (5 points) Incorrect. The RM SF E = error: E [ ( Y T +1 ŶT +1 T ) 2 ] has two sources of 1. The error arising because future values of u t are unknown 2. The error in estimating the coefficients If the sample size is large the first source of error will be (much) larger than the second and RMSF E V ar (u t ) and V ar (u t ) can be estimated by the SER = 1 T 2 T t=1 û2 t. However if the sample size is small the second type of error can be large and it is not a good idea to estimate the RMSFE by the SER. 8

Question 4 A researcher wants to estimate the effect of a job training program (X i ) on wages in NOK (Y i ). Consider the following population regression model Y i = β 0 + β 1 X i + u i with E [u i X i ] = 0. The researcher observes X i but by accident wages are measured in Euro s instead of NOK. Let α be NOK/Euro exchange rate. The researcher performs an OLS regression of Yi (the observed wage rate in Euro s) on X i. The researcher wants to know the causal effect of the job training program on wages in NOK, show whether the OLS estimator is a consistent or an inconsistent estimator of this causal effect. Solution (10 points): α = Y i Y i = Y i = 1 α Y i β ols = s Y X s 2 X p Cov(Y i,x i) = Cov( 1 α Y i,x i ) = Cov( 1 α (β 0+β 1 X i +u i ),X i ) = Cov( 1 α β 0,X i)+cov( 1 α β 1X i,x i)+cov( 1 α u i,x i) = 0+ 1 α β 1Cov(X i,x i )+0 = 1 α β 1 If α 1 the OLS estimator of β 1 is an inconsistent estimator of the causal effect of the job training program on wages in NOK. 9