Econometrics Problem Set 3


Conceptual Questions

1. This question refers to the estimated regressions in Table 1, computed using data for 1988 from the U.S. Current Population Survey. The data set consists of information on 4000 full-time, full-year workers. The highest educational achievement for each worker was either a high school diploma or a bachelor's degree. The workers' ages ranged from 25 to 34 years. The data set also contained information on the region of the country where the person lived, marital status, and number of children. For the purposes of these exercises let

   AHE = average hourly earnings (in 1998 dollars)
   College = binary variable (1 if college, 0 if high school)
   Female = binary variable (1 if female, 0 if male)
   Age = age (in years)
   Northeast = binary variable (1 if Region = Northeast, 0 otherwise)
   Midwest = binary variable (1 if Region = Midwest, 0 otherwise)
   South = binary variable (1 if Region = South, 0 otherwise)
   West = binary variable (1 if Region = West, 0 otherwise)

Dependent variable: average hourly earnings (AHE).

| Regressor           | (1)   | (2)   | (3)   |
|---------------------|-------|-------|-------|
| College ($X_1$)     | 5.46  | 5.48  | 5.44  |
| Female ($X_2$)      | -2.64 | -2.62 | -2.62 |
| Age ($X_3$)         |       | 0.29  | 0.29  |
| Northeast ($X_4$)   |       |       | 0.69  |
| Midwest ($X_5$)     |       |       | 0.60  |
| South ($X_6$)       |       |       | -0.27 |
| Intercept           | 12.69 | 4.40  | 3.75  |
| Summary statistics  |       |       |       |
| SER                 | 6.27  | 6.22  | 6.21  |
| $R^2$               | 0.176 | 0.190 | 0.194 |
| $\bar{R}^2$         |       |       |       |
| $n$                 | 4000  | 4000  | 4000  |

Table 1: Results of Regressions of Average Hourly Earnings on Gender and Education Binary Variables and Other Characteristics Using 1988 Data from the Current Population Survey

   (a) Compute $\bar{R}^2$ for each of the regressions.
   (b) Using the regression results in column (1):
      i. Do workers with college degrees earn more, on average, than workers with only high school degrees? How much more?
      ii. Do men earn more than women on average? How much more?
   (c) Using the regression results in column (2):
      i. Is age an important determinant of earnings? Explain.
      ii. Sally is a 29-year-old female college graduate. Betsy is a 34-year-old female college graduate. Predict Sally's and Betsy's earnings.
   (d) Using the regression results in column (3):
      i. Do there appear to be important regional differences?
      ii. Why is the regressor West omitted from the regression? What would happen if it were included?
      iii. Juanita is a 28-year-old female college graduate from the South. Jennifer is a 28-year-old female college graduate from the Midwest. Calculate the expected difference in earnings between Juanita and Jennifer.

2. (SW 6.10) $(Y_i, X_{1,i}, X_{2,i})$ satisfy the four least squares assumptions of the multiple regression model; in addition, $\operatorname{var}(u_i \mid X_{1,i}, X_{2,i}) = 4$ and $\operatorname{var}(X_{1,i}) = 6$. A random sample of size $n = 400$ is drawn from the population.
   (a) Assume that $X_1$ and $X_2$ are uncorrelated. Compute the variance of $\hat{\beta}_1$. [Hint: The variance of $\hat{\beta}_1$ is
   $$\sigma^2_{\hat{\beta}_1} = \frac{1}{n} \left[ \frac{1}{1 - \rho^2_{X_1, X_2}} \right] \frac{\sigma^2_u}{\sigma^2_{X_1}}.]$$
   (b) Assume that $\operatorname{corr}(X_1, X_2) = 0.5$. Compute the variance of $\hat{\beta}_1$.

   (c) Comment on the following statements: "When $X_1$ and $X_2$ are correlated, the variance of $\hat{\beta}_1$ is larger than it would be if $X_1$ and $X_2$ were uncorrelated. Thus, if you are interested in $\beta_1$, it is best to leave $X_2$ out of the regression if it is correlated with $X_1$."

3. (SW 6.11) Consider the regression model $Y_i = \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$ for $i = 1, \ldots, n$. (Notice that there is no constant term in the regression.)
   (a) Specify the least squares function that is minimized by OLS.
   (b) Compute the partial derivatives of the objective function with respect to $b_1$ and $b_2$.
   (c) Suppose that $\sum_{i=1}^{n} X_{1i} X_{2i} = 0$. Show that $\hat{\beta}_1 = \sum_{i=1}^{n} X_{1i} Y_i \big/ \sum_{i=1}^{n} X_{1i}^2$.
   (d) Suppose that $\sum_{i=1}^{n} X_{1i} X_{2i} \neq 0$. Derive an expression for $\hat{\beta}_1$ as a function of the data $(Y_i, X_{1i}, X_{2i})$, $i = 1, \ldots, n$.
   (e) Suppose that the model includes an intercept: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$. Show that the least squares estimators satisfy $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}_1 - \hat{\beta}_2 \bar{X}_2$.

4. (SW 7.7) Data were collected from a random sample of 220 home sales from a community in 2003. Let P denote the selling price (in thousands of dollars), BDR denote the number of bedrooms, Bath denote the number of bathrooms, Hsize denote the size of the house (in square feet), Lsize denote the lot size (in square feet), Age denote the age of the house (in years), and Pr denote a binary variable that is equal to 1 if the condition of the house is reported as poor. An estimated regression yields
   $$\hat{P} = \underset{(23.9)}{119.2} + \underset{(2.61)}{0.485}\,\mathit{BDR} + \underset{(8.94)}{23.4}\,\mathit{Bath} + \underset{(0.011)}{0.156}\,\mathit{Hsize} + \underset{(0.00048)}{0.002}\,\mathit{Lsize} + \underset{(0.311)}{0.090}\,\mathit{Age} - \underset{(10.5)}{48.8}\,\mathit{Pr},$$
   with SER = 41.5 and $R^2 = 0.72$. Standard errors are in parentheses below the coefficients.
   (a) Is the coefficient on BDR statistically significantly different from zero?
   (b) Typically five-bedroom houses sell for much more than two-bedroom houses. Is this consistent with your answer to (a) and with the regression more generally?
   (c) A homeowner purchases 2000 square feet from an adjacent lot. Construct a 99% confidence interval for the change in the value of her house.
   (d) Lot size is measured in square feet. Do you think that another scale might be more appropriate? Why or why not?
   (e) The F-statistic for omitting BDR and Age from the regression is F = 0.08. Are the coefficients on BDR and Age statistically different from zero at the 10% level?
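The arithmetic behind parts (a) and (c) of Question 4 can be sketched as follows. This is only an illustrative sketch: the helper names `t_statistic` and `confidence_interval` are hypothetical, and the 99% critical value of 2.58 assumes the large-sample normal approximation is adequate for n = 220.

```python
def t_statistic(coef, se, null_value=0.0):
    """t-statistic for the null hypothesis that the coefficient equals null_value."""
    return (coef - null_value) / se


def confidence_interval(coef, se, critical_value, scale=1.0):
    """Interval for scale * coefficient; the standard error scales by |scale|."""
    center = scale * coef
    half_width = critical_value * abs(scale) * se
    return center - half_width, center + half_width


# Coefficient and standard error on BDR, as reported in the estimated regression.
t_bdr = t_statistic(0.485, 2.61)

# Change in predicted price (in $1000) from 2000 extra square feet of lot,
# using the Lsize coefficient and the assumed large-sample 99% critical value.
low, high = confidence_interval(0.002, 0.00048, critical_value=2.58, scale=2000)

print(f"t-statistic on BDR: {t_bdr:.3f}")
print(f"99% interval for the price change: ({low:.2f}, {high:.2f}) thousand dollars")
```

Comparing the absolute value of `t_bdr` with a critical value such as 1.96 gives the answer to (a); the interval in (c) is simply the Lsize interval multiplied by 2000.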

5. A study was conducted to determine whether certain features could be used to explain variability in the price of furnaces. For a sample of 19 furnaces the following regression was estimated:
   $$\hat{Y} = -68.23 + \underset{(0.005)}{0.0023}\,X_1 + \underset{(8.99)}{19.73}\,X_2 + \underset{(3.082)}{7.65}\,X_3,$$
   with SER = 41.5 and $R^2 = 0.72$, where
   Y = Price, in dollars,
   $X_1$ = Rating of furnace, in BTU per hour,
   $X_2$ = Energy efficiency ratio,
   $X_3$ = Number of settings.
   The standard errors reported here (in parentheses) assume homoskedasticity of the error term.
   (a) What assumptions are required to be able to use this regression analysis for statistical inference?
   (b) Under the required assumptions for statistical inference, find a 95% confidence interval for the expected increase in price resulting from an additional setting when the values of the rating and the energy efficiency ratio remain fixed.
   (c) Under the required assumptions for statistical inference, test the null hypothesis that, all else being equal, the energy efficiency ratio of furnaces does not affect their price against the alternative that the higher the energy efficiency ratio, the higher the price.
   (d) Under the required assumptions for statistical inference, test the null hypothesis that, taken together, the three independent variables do not linearly influence the price of the furnaces.

6. (SW 7.9) Consider the regression model $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$. Use the "transform the regression" approach discussed in class to transform the regression so that you can use a t-statistic to test
   (a) $\beta_1 = \beta_2$;
   (b) $\beta_1 + a\beta_2 = 0$, where $a$ is a constant;
   (c) $\beta_1 + \beta_2 = 1$ (Hint: You must redefine the dependent variable in the regression.);
   (d) $\beta_1 + \beta_2 = a$, where $a$ is a constant.
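As an illustration of the transformation idea in Question 6, one possible rearrangement for part (a) is sketched below. The symbols $\gamma$ and $W_i$ are introduced here purely for illustration and are not part of the problem statement; the step adds and subtracts $\beta_2 X_{1i}$.

```latex
\begin{align*}
Y_i &= \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i \\
    &= \beta_0 + (\beta_1 - \beta_2) X_{1i} + \beta_2 (X_{1i} + X_{2i}) + u_i \\
    &= \beta_0 + \gamma X_{1i} + \beta_2 W_i + u_i,
\qquad \gamma = \beta_1 - \beta_2, \quad W_i = X_{1i} + X_{2i}.
\end{align*}
```

Under this rewriting, the restriction $\beta_1 = \beta_2$ is equivalent to $\gamma = 0$, so it can be tested with the usual t-statistic on $X_{1i}$ in a regression of $Y_i$ on $X_{1i}$ and $W_i$. The other parts follow a similar pattern, with a different constructed regressor or a redefined dependent variable.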

Empirical Questions

For these empirical exercises, the required datasets and a detailed description of them can be found at www.xmueconometrics.weebly.com.

7. (SW E7.3) The data set used in this empirical exercise (CollegeDistance) contains data from a random sample of high school seniors interviewed in 1980 and re-interviewed in 1986. In this exercise you will use these data to investigate the relationship between the number of completed years of education for young adults and the distance from each student's high school to the nearest four-year college. (Proximity to college lowers the cost of education, so that students who live closer to a four-year college should, on average, complete more years of higher education.)
   (a) An education advocacy group argues that, on average, a person's educational attainment would increase by approximately 0.15 year if distance to the nearest college is decreased by 20 miles. Run a regression of years of completed education (ED) on distance to the nearest college (Dist). Is the advocacy group's claim consistent with the estimated regression? Explain.
   (b) Other factors also affect how much college a person completes. Does controlling for these other factors change the estimated effect of distance on college years completed? To answer this question, construct a table like Table 7.1 in the textbook. Include a simple specification [constructed in (a)], a base specification (that includes a set of important control variables), and several modifications of the base specification. Discuss how the estimated effect of Dist on ED changes across specifications.
   (c) It has been argued that, controlling for other factors, blacks and Hispanics complete more college than whites. Is this result consistent with the regressions that you constructed in part (b)?
   (d) Graph a 95% joint confidence interval for the coefficients on blacks and Hispanics.
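For part (a) of Question 7, one way to compare the advocacy group's claim with a regression estimate is sketched below. The function names are hypothetical, `miles_per_unit=10` is an assumption about how Dist is coded (check the data description before relying on it), and the estimate and standard error are placeholders rather than results from CollegeDistance.

```python
def implied_slope(change_in_ed, change_in_dist_miles, miles_per_unit):
    """Slope on Dist implied by a claimed change in ED for a given change in distance."""
    change_in_dist_units = change_in_dist_miles / miles_per_unit
    return change_in_ed / change_in_dist_units


def claim_within_interval(beta_hat, se, claimed_slope, critical_value=1.96):
    """True if the claimed slope lies inside the confidence interval for the Dist coefficient."""
    half_width = critical_value * se
    return beta_hat - half_width <= claimed_slope <= beta_hat + half_width


# The claim: +0.15 years of education for a 20-mile decrease in distance.
# miles_per_unit=10 is an ASSUMPTION about the units of Dist.
claimed = implied_slope(0.15, -20, miles_per_unit=10)

# PLACEHOLDER estimate and standard error, not results from CollegeDistance.
# Replace them with the values from your own regression of ED on Dist.
beta_hat_placeholder, se_placeholder = -0.05, 0.02
print(claimed, claim_within_interval(beta_hat_placeholder, se_placeholder, claimed))
```

The same check can be phrased as a t-test of the null hypothesis that the Dist coefficient equals the implied slope; the interval version is used here only because it is short.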