STOCKHOLM UNIVERSITY
Department of Economics

Course name: Empirical Methods
Course code: EC40
Examiner: Lena Nekby
Number of credits: 7,5 credits
Date of exam: Saturday, May 9, 2008
Examination time: 3 hours

Write your name, Swedish personal number and the number of the question on every cover sheet. Do not write answers for more than one question on the same cover sheet. Explain notions/concepts and symbols. Only legible exams will be marked. No aids are allowed with the exception of calculators provided by exam administrators.

----------------------------------------

The exam consists of two parts.

Part 1 consists of 20 multiple choice questions worth 40 points in total (2 points each). All students must answer this part of the exam.

Part 2 consists of two discussion questions worth 60 points in total (30 points each). Discussion question 1 is worth 30 points for those that successfully acquired credit on the first credit assignment. Discussion question 2 is worth 30 points for those that successfully acquired credit on the second credit assignment. If you have received credit you do not need to answer the respective discussion question on the exam.

The exam is worth 100 points in total. For the grade E 40 points are required, for D 50 points, C 60 points, B 75 points and A 90 points.

----------------------------------------

If you think that a question is vaguely formulated: specify the conditions used for solving it.

----------------------------------------

Results will be posted on the notice board, House A, floor 3, June 14, 2010 at the latest.

----------------------------------------

Good luck!

Part 1: Multiple Choice Questions (40 points)

Circle the right answer. Only one answer per question. No credit will be given for multiple answers or additional explanations. Two points per question for correct answers.

1) The following are all least squares assumptions with the exception of:
a. The conditional distribution of $u_i$ given $X_i$ has a mean of zero.
b. The explanatory variable in the regression model is normally distributed.
c. $(X_i, Y_i)$, $i = 1, \ldots, n$, are independently and identically distributed.
d. Large outliers are unlikely.

2) The reason why estimators have a sampling distribution is that
a. economics is not a precise science.
b. individuals respond differently to incentives.
c. in real life you typically get to sample many times.
d. the values of the explanatory variable and the error term differ across samples.

3) The sample average of the OLS residuals is
a. some positive number since OLS uses squares.
b. zero.
c. unobservable since the population regression function is unknown.
d. dependent on whether the explanatory variable is mostly positive or negative.

4) To obtain the slope estimator using the least squares principle, you divide the
a. sample variance of X by the sample variance of Y.
b. sample covariance of X and Y by the sample variance of Y.
c. sample covariance of X and Y by the sample variance of X.
d. sample variance of X by the sample covariance of X and Y.

5) With heteroskedastic errors, the weighted least squares estimator is BLUE. You should use OLS with heteroskedasticity-robust standard errors because
a. this method is simpler.
b. the exact form of the conditional variance is rarely known.
c. the Gauss-Markov theorem holds.
d. your spreadsheet program does not have a command for weighted least squares.

6) The t-statistic is calculated by dividing
a. the OLS estimator by its standard error.
b. the slope by the standard deviation of the explanatory variable.
c. the estimator minus its hypothesized value by the standard error of the estimator.
d. the slope by 1.96.

7) Finding a small value of the p-value (e.g. less than 5%)
a. indicates evidence in favor of the null hypothesis.
b. implies that the t-statistic is less than 1.96.
c. indicates evidence against the null hypothesis.
d. will only happen roughly one in twenty samples.

8) When there are omitted variables in the regression, which are determinants of the dependent variable, then
a. you cannot measure the effect of the omitted variable, but the estimator of your included variable(s) is (are) unaffected.
b. this has no effect on the estimator of your included variable because the other variable is not included.
c. this will always bias the OLS estimator of the included variable.
d. the OLS estimator is biased if the omitted variable is correlated with the included variable.

9) When you have an omitted variable problem, the assumption that $E(u_i \mid X_i) = 0$ is violated. This implies that
a. the sum of the residuals is no longer zero.
b. there is another estimator called weighted least squares, which is BLUE.
c. the sum of the residuals times any of the explanatory variables is no longer zero.
d. the OLS estimator is no longer consistent.

10) All of the following are true, with the exception of one condition:
a. a high $R^2$ or $\bar{R}^2$ does not mean that the regressors are a true cause of the dependent variable.
b. a high $R^2$ or $\bar{R}^2$ does not mean that there is no omitted variable bias.
c. a high $R^2$ or $\bar{R}^2$ always means that an added variable is statistically significant.
d. a high $R^2$ or $\bar{R}^2$ does not necessarily mean that you have the most appropriate set of regressors.

11) The interpretation of the slope coefficient in the model $Y_i = \beta_0 + \beta_1 \ln(X_i) + u_i$ is as follows:
a. a 1% change in X is associated with a $\beta_1$% change in Y.
b. a 1% change in X is associated with a change in Y of $0.01\,\beta_1$.
c. a change in X by one unit is associated with a $100\,\beta_1$% change in Y.
d. a change in X by one unit is associated with a $\beta_1$ change in Y.

12) A nonlinear function
a. makes little sense, because variables in the real world are related linearly.
b. can be adequately described by a straight line between the dependent variable and one of the explanatory variables.
c. is a concept that only applies to the case of a single or two explanatory variables since you cannot draw a line in four dimensions.
d. is a function with a slope that is not constant.

13) A statistical analysis is internally valid if
a. its inferences and conclusions can be generalized from the population and setting studied to other populations and settings.
b. statistical inference is conducted inside the sample period.
c. the hypothesized parameter value is inside the confidence interval.
d. the statistical inferences about causal effects are valid for the population being studied.

14) Comparing the California test scores to test scores in Massachusetts is appropriate for external validity if
a. Massachusetts also allowed beach walking to be an appropriate P.E. activity.
b. the two income distributions were very similar.
c. the student-to-teacher ratio did not differ by more than five on average.
d. the institutional settings in California and Massachusetts, such as organization in classroom instruction and curriculum, were similar in the two states.

15) You try to explain the number of IBM shares traded in the stock market per day in 2005. As an independent variable you choose the closing price of the share. This is an example of
a. simultaneous causality.
b. invalid inference due to a small sample size.
c. sample selection bias since you should analyze more than one stock.
d. a situation where homoskedasticity-only standard errors should be used since you only analyze one company.

16) Consider a panel regression of unemployment rates for the G7 countries (United States, Canada, France, Germany, Italy, United Kingdom, Japan) on a set of explanatory variables for the time period 1980-2000 (annual data). If you included entity and time fixed effects, you would need to specify the following number of binary variables:
a. 21.
b. 6.
c. 28.
d. 26.

17) A pattern in the coefficients of the time fixed effects binary variables may reveal the following in a study of the determinants of state unemployment rates using panel data:
a. macroeconomic effects, which affect all states equally in a given year.
b. attitude differences towards unemployment between states.
c. there is no economic information that can be retrieved from these coefficients.
d. regional effects, which affect all states equally, as long as they are a member of that region.

18) If the instruments are not exogenous,
a. you cannot perform the first stage of TSLS.
b. then, in order to conduct proper inference, it is essential that you use heteroskedasticity-robust standard errors.
c. your model becomes overidentified.
d. then TSLS is inconsistent.

19) Consider a model with one endogenous regressor and two instruments. Then the J-statistic will be large
a. if the number of observations is very large.
b. if the coefficients are very different when estimating the coefficients using one instrument at a time.
c. if the TSLS estimates are very different from the OLS estimates.
d. when you use homoskedasticity-only standard errors.

20) Causal effects that depend on the value of an observable variable, say $W_i$,
a. cannot be estimated.
b. can be estimated by interacting the treatment variable with $W_i$.
c. result in the OLS estimator being inefficient.
d. require use of homoskedasticity-only standard errors.
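The least squares quantities that questions 3, 4 and 9 refer to can be checked numerically. The following is a minimal sketch on simulated data, not part of the exam itself: the helper name `ols_fit`, the sample size and the data-generating values (intercept 1.0, slope 0.5) are all assumptions chosen for illustration.

```python
import random

def ols_fit(x, y):
    """Return (intercept, slope) for a regression of y on a single regressor x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sample covariance of X and Y, and sample variance of X (both divide by n - 1).
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data: Y = 1.0 + 0.5 X + u, with u drawn independently of X.
rng = random.Random(0)
x = [rng.gauss(10.0, 2.0) for _ in range(200)]
y = [1.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

b0, b1 = ols_fit(x, y)
residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Both quantities are zero by construction of OLS (up to floating-point rounding).
mean_resid = sum(residuals) / len(residuals)
sum_resid_x = sum(r * xi for r, xi in zip(residuals, x))

print(f"slope = {b1:.3f}, intercept = {b0:.3f}")
print(f"mean residual = {mean_resid:.2e}, sum(residual * x) = {sum_resid_x:.2e}")
```

The two residual checks hold in any sample, whether or not the regression suffers from omitted variable bias, because they follow from the first-order conditions of the minimization rather than from the least squares assumptions.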

Part 2: Discussion Questions (60 points)

On separate sheets of paper, answer the following discussion questions. Write your name, personal number (personnummer) and the question number on each sheet. Answer each question clearly and concisely. Only legible answers will be considered, others will be disregarded. If you think that a question is vaguely formulated, specify the conditions used for solving it. Each question is worth 30 points.

Discussion Question 1:

NOTE: Those with credit on credit assignment 1 receive 30 points for this question and do not have to answer discussion question 1.

Sir Francis Galton, a cousin of Charles Darwin, examined the relationship between the height of children and their parents towards the end of the 19th century. It is from this study that the name "regression" originated. You decide to update his findings by collecting data from 110 college students, and estimate the following relationship:

$$\widehat{Studenth} = \underset{(7.2)}{19.6} + \underset{(0.10)}{0.73} \times Midparh, \quad R^2 = 0.45, \quad SER = 2.0$$

where Studenth is the height of students in inches, and Midparh is the average of the parental heights. Values in parentheses are heteroskedasticity-robust standard errors. (Following Galton's methodology, both variables were adjusted so that the average female height was equal to the average male height.)

(a) Interpret the estimated coefficients.

(b) What is the meaning of the regression $R^2$?

(c) What is the prediction for the height of a child whose parents have an average height of 70.06 inches?

(d) What is the interpretation of the SER here?

(e) Given the positive intercept and the fact that the slope lies between zero and one, what can you say about the height of students who have quite tall parents? Who have quite short parents?

(f) Test for the statistical significance of the slope coefficient.

(g) If children, on average, were expected to be of the same height as their parents, then this would imply two hypotheses, one for the slope and one for the intercept.
(i) What should the null hypothesis be for the intercept? Calculate the relevant t-statistic and carry out the hypothesis test at the 1% level.
(ii) What should the null hypothesis be for the slope? Calculate the relevant t-statistic and carry out the hypothesis test at the 5% level.

(h) Can you reject the null hypothesis that the regression $R^2$ is zero?

(i) Construct a 95% confidence interval for a one inch increase in the average of parental height.

(j) Galton was concerned about the height of the English aristocracy and referred to the above result as "regression towards mediocrity." Can you figure out what his concern was? Why do you think that we refer to this result today as "Galton's Fallacy"?
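The arithmetic behind parts (c), (f), (g) and (i) can be laid out as a short calculation. This is only a sketch under stated assumptions: the intercept standard error is taken to be 7.2, the nulls in (g) are read as an intercept of zero and a slope of one, the large-sample critical value 1.96 is used for the 95% interval, and the helper `t_stat` is a name introduced here for illustration.

```python
# Point estimates and robust standard errors from the estimated height regression.
# ASSUMPTION: the intercept standard error is read as 7.2.
b0, se_b0 = 19.6, 7.2
b1, se_b1 = 0.73, 0.10

def t_stat(estimate, hypothesized, std_error):
    """t-statistic: (estimator - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error

# (c) Predicted student height when the midparent height is 70.06 inches.
predicted_height = b0 + b1 * 70.06

# (f) Significance of the slope: null hypothesis slope = 0.
t_slope_zero = t_stat(b1, 0.0, se_b1)

# (g) ASSUMED nulls for "children are as tall as their parents":
#     intercept = 0 and slope = 1.
t_intercept_zero = t_stat(b0, 0.0, se_b0)
t_slope_one = t_stat(b1, 1.0, se_b1)

# (i) 95% confidence interval for the slope, using the normal critical value.
z_95 = 1.96
ci_slope = (b1 - z_95 * se_b1, b1 + z_95 * se_b1)

print(f"predicted height: {predicted_height:.2f} inches")
print(f"t (slope = 0): {t_slope_zero:.2f}")
print(f"t (intercept = 0): {t_intercept_zero:.2f}")
print(f"t (slope = 1): {t_slope_one:.2f}")
print(f"95% CI for slope: ({ci_slope[0]:.3f}, {ci_slope[1]:.3f})")
```

Each t-statistic is then compared with the critical value for the level named in the question (about 2.58 for a two-sided test at 1%, 1.96 at 5%).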

Discussion Question 2:

NOTE: Those with credit on credit assignment 2 receive 30 points for this question and do not have to answer discussion question 2.

To analyze the year-to-year variation in temperature data for a given city, you regress the daily high temperature (Temp) for 100 randomly selected days in two consecutive years (1997 and 1998) for Phoenix. The results are (heteroskedasticity-robust standard errors in parentheses):

$$\widehat{Temp}^{PHX}_{1998} = 15.63 + \underset{(0.10)}{0.80} \times Temp^{PHX}_{1997}, \quad R^2 = 0.65, \quad SER = 9.63$$

(a) Calculate the predicted temperature for the current year if the temperature in the previous year was 40°F, 78°F, and 100°F. How does this compare with your prior expectation? Sketch the regression line and compare it to the 45 degree line. What are the implications?

(b) You recall having studied errors-in-variables before. Although the web site you received your data from seems quite reliable in measuring data accurately, what if the temperature contained measurement error in the following sense: for any given day, say January 8, there is a true underlying seasonal temperature ($X$), but each year there are different temporary weather patterns ($v$, $w$) which result in a temperature $\tilde{X}$ different from $X$. For the two years in your data set, the situation can be described as follows:

$$\tilde{X}_{1997} = X + v_{1997} \quad \text{and} \quad \tilde{X}_{1998} = X + w_{1998}$$

Subtracting $\tilde{X}_{1997}$ from $\tilde{X}_{1998}$, you get

$$\tilde{X}_{1998} = \tilde{X}_{1997} + w_{1998} - v_{1997}.$$

Hence the population parameters for the intercept and slope are zero and one, as expected. It is not difficult to show that the OLS estimator for the slope is inconsistent, where

$$\hat{\beta}_1 \xrightarrow{p} 1 - \frac{\sigma_v^2}{\sigma_x^2 + \sigma_v^2}.$$

As a result you consider estimating the slope and intercept by TSLS. You think about an instrument and consider the temperature one month ahead of the observation in the previous year. Discuss instrument validity for this case.

(c) The TSLS estimation result is as follows:

$$\widehat{Temp}^{PHX}_{1998} = -6.4 + \underset{(0.06)}{1.07} \times Temp^{PHX}_{1997}$$

Perform a t-test on whether or not the slope is now significantly different from one.

(d) Write a short essay about the Overidentifying Restrictions Test. What is meant exactly by "overidentification"? State the null hypothesis. Describe how to calculate the J-statistic and what its distribution is. Can the test be used in the above example? Why or why not? Use an example of two instruments and one endogenous variable to explain under what situation the J-test will be likely to reject the null hypothesis. If your variables pass the test, is this sufficient for these variables to be good instruments?

(e) What are the two conditions for instrument validity? The reason for the inconsistency of OLS is that $\mathrm{corr}(X_i, u_i) \neq 0$. But if X and Z are correlated, and X and u are also correlated, then how can Z and u not be correlated? Explain.
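The errors-in-variables setup of part (b) and the t-test of part (c) can be illustrated with a small simulation. This is a sketch under strong assumptions, not a statement about the Phoenix data: the seasonal component and the weather shocks are drawn as independent normals, the values of `sigma_x` and `sigma_v` are hypothetical, and the instrument is assumed to be valid (it shares the seasonal component but has its own independent shock). With one endogenous regressor and one instrument, the TSLS slope equals the ratio of sample covariances used below.

```python
import random

def sample_cov(a, b):
    """Sample covariance of two equally long lists (divides by n - 1)."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (n - 1)

rng = random.Random(1)
n = 20000
sigma_x, sigma_v = 15.0, 8.0  # hypothetical standard deviations

# True seasonal temperature X and two noisy yearly measurements of it.
x_true = [rng.gauss(75.0, sigma_x) for _ in range(n)]
temp_1997 = [x + rng.gauss(0.0, sigma_v) for x in x_true]  # X + v
temp_1998 = [x + rng.gauss(0.0, sigma_v) for x in x_true]  # X + w
# ASSUMED valid instrument: correlated with X, independent of v and w.
instrument = [x + rng.gauss(0.0, sigma_v) for x in x_true]

# OLS slope of Temp1998 on Temp1997: attenuated towards zero.
ols_slope = sample_cov(temp_1997, temp_1998) / sample_cov(temp_1997, temp_1997)
# Probability limit of the OLS slope under this setup.
plim_ols = 1.0 - sigma_v**2 / (sigma_x**2 + sigma_v**2)
# IV (just-identified TSLS) slope: cov(Z, Y) / cov(Z, X).
iv_slope = sample_cov(instrument, temp_1998) / sample_cov(instrument, temp_1997)

# Part (c): t-statistic for the null that the TSLS slope equals one,
# using the reported estimate 1.07 and standard error 0.06.
t_tsls_slope_one = (1.07 - 1.0) / 0.06

print(f"OLS slope: {ols_slope:.3f} (probability limit {plim_ols:.3f})")
print(f"IV slope:  {iv_slope:.3f}")
print(f"t (TSLS slope = 1): {t_tsls_slope_one:.2f}")
```

If the instrument's shock were correlated with the 1997 or 1998 shocks, for example because weather patterns persist across the dates involved, the IV slope in this sketch would no longer be centered on one, which is the issue part (b) asks you to discuss.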