Problem Set 1 ANSWERS


Economics 20, Prof. Patricia M. Anderson

Part I. Multiple Choice Problems

1. If X and Z are two random variables, then E[X - Z] is
d. E[X] - E[Z]
This is just a simple application of one of the properties of expected value.

2. Any linear combination of normally distributed random variables:
a. is also normally distributed
This is also just a simple application of probability theory. (These first 2 were just checking that you reviewed the Econ 10 material!)

3. What can be said about the estimated slope coefficient for a regression of Y on X, versus the estimated slope coefficient for a regression of X on Y?
b. the slopes are not reciprocals
The estimate of β1 from the population model Y = β0 + β1 X + u can be written as Cov(X,Y)/Var(X), so the estimate of β1 from the population model X = β0 + β1 Y + u can be written as Cov(X,Y)/Var(Y). These two estimates have the same sign, since both take the sign of Cov(X,Y). Their product is the squared sample correlation between X and Y, which is below 1 unless the fit is perfect, so in general they are neither identical nor reciprocals.

4. Suppose the following is the true population model of the effect of smoking on birth weight in ounces: birth weight = β0 + β1 daily cigarettes + u, and that the average birth weight is 125 oz and the average mother smoked 2 cigarettes per day. If your estimate of β1 is -2.5, what must your estimate of β0 be?
c. 130
We know that an OLS regression line passes exactly through the sample means. Thus, based on the information given, we have 125 = estimated β0 - 2.5*2, so our estimate of β0 must be 125 + 5 = 130.

5. Suppose a new sample is chosen and you estimate the following: predicted birth weight in oz = 136 - 2.1 daily cigarettes. What would your estimate of β0 be if birth weight were recorded in pounds?
b. 8.5
There are 16 ounces in a pound, so we need to divide our coefficients by 16 to obtain the same relationship: 136/16 = 8.5.

6. Using the same sample as in question 5, you now estimate the following: birth weight in oz = β0 + β1 weekly cigarettes + u. Your estimate of β1 would be:
c. -0.3
From question 5 we know that each daily cigarette reduces birth weight by 2.1 ounces. Since a daily cigarette implies 7 weekly cigarettes, 1 weekly cigarette has an effect 1/7 the size, reducing weight by 0.3 ounces.
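The arithmetic behind questions 3 to 6 can be checked with a short script. This is only a sketch in Python (not the Stata used for Part II); the five data points for question 3 are made up for illustration, and the helper names are not from the problem set. The remaining numbers are copied from the questions.

```python
# Question 3: compare the slope of Y on X with the slope of X on Y.
# The data below are invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 7.0]


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance; cov(a, a) is the sample variance."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


slope_y_on_x = cov(x, y) / cov(x, x)  # Cov(X,Y)/Var(X)
slope_x_on_y = cov(x, y) / cov(y, y)  # Cov(X,Y)/Var(Y)
r_squared = cov(x, y) ** 2 / (cov(x, x) * cov(y, y))

# Same sign, but the product is r^2 rather than 1, so they are not reciprocals.
print(slope_y_on_x, slope_x_on_y, slope_y_on_x * slope_x_on_y, r_squared)

# Question 4: the OLS line passes through the sample means.
mean_weight_oz, mean_cigs, b1_hat = 125.0, 2.0, -2.5
b0_hat = mean_weight_oz - b1_hat * mean_cigs
print(b0_hat)

# Question 5: rescale the dependent variable from ounces to pounds.
b0_oz, b1_oz = 136.0, -2.1
b0_lb, b1_lb = b0_oz / 16, b1_oz / 16
print(b0_lb, round(b1_lb, 3))

# Question 6: rescale the regressor from daily to weekly cigarettes.
b1_weekly = b1_oz / 7
print(round(b1_weekly, 2))
```

Rescaling the dependent variable divides every coefficient by the same factor, while rescaling a regressor changes only that regressor's slope.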

Part II. Stata Problems

1. a)

. sum, detail

(test score)
-------------------------------------------------------------
      Percentiles      Smallest
 1%       -2.804       -3.2238
 5%      -1.9984        -2.804
10%      -1.2417       -2.7727       Obs                 127
25%      -.69525       -2.5743       Sum of Wgt.         127

50%      .016168                     Mean           .0839172
                        Largest      Std. Dev.      1.266603
75%       .72278         2.669
90%       1.8124        3.1895       Variance       1.604283
95%       2.2796        3.2187       Skewness       .3415457
99%       3.2187        4.7221       Kurtosis       4.177176

(per capita income)
-------------------------------------------------------------
      Percentiles      Smallest
 1%         6552          6306
 5%       7844.8          6552
10%         8164          6848       Obs                 127
25%         9234          6879       Sum of Wgt.         127

50%        10127                     Mean           10790.68
                        Largest      Std. Dev.       3613.81
75%        11542         17582
90%        12806         21961       Variance       1.31e+07
95%        14374         25940       Skewness       4.879524
99%        25940         39610       Kurtosis       35.87324

Given the above, the mean of the test score is 0.084, and the median is 0.016 (note that the 50th percentile is the median!). The mean of per capita income is $10,791, and the median is $10,127.

b) I chose bin sizes of 30 and 50, respectively. You may have made a different choice.

. graph, bin(30)

[Figure: histogram of test scores, 30 bins. Vertical axis: Fraction, 0 to .125984. Horizontal axis: -3.2238 to 4.7221.]

. graph, bin(50)

[Figure: histogram of per capita income, 50 bins. Vertical axis: Fraction, 0 to .181102. Horizontal axis: 6306 to 39610.]

c) From part a) we know that median per capita income is 10127, so

. sum if >= 10127

    Variable |     Obs        Mean   Std. Dev.        Min        Max
-------------+-----------------------------------------------------
(test score) |      64    .6253451   1.216266    -1.5618     4.7221
 (pc income) |      64    12585.69   4317.745      10127      39610

. sum if <10127

    Variable |     Obs        Mean   Std. Dev.        Min        Max
-------------+-----------------------------------------------------
(test score) |      63   -.4661048   1.071052    -3.2238     2.0024
 (pc income) |      63    8967.171    944.517       6306      10110

The average test score in districts with above median per capita income is 0.625, and in districts with below median per capita income it is -0.466. This is a difference of 1.091 standard deviation units. The average per capita income in the high-income districts is $12,586, while in the low-income districts it is $8,967. This is a difference of $3,619.

d) You may have graphed this the other way. While it doesn't really matter, it is more natural to think of the test score as the outcome, and thus as the Y variable.

. graph

[Figure: scatterplot of test score (vertical axis, -3.2238 to 4.7221) against per capita income (horizontal axis, 6306 to 39610).]

e)

. reg

  Source |       SS       df       MS            Number of obs =     127
---------+------------------------------         F(  1,   125) =   90.49
   Model |  84.8861871     1  84.8861871         Prob > F      =  0.0000
Residual |  117.253472   125  .938027778         R-squared     =  0.4199
---------+------------------------------         Adj R-squared =  0.4153
   Total |  202.139659   126  1.60428301         Root MSE      =  .96852

             |     Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
 (pc income) |  .0002271   .0000239    9.513    0.000    .0001799    .0002744
       _cons | -2.366932   .2715919   -8.715    0.000   -2.904446   -1.829418
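As a rough check on parts c) and e), the group gaps and the summary statistics in the regression header can be recomputed from the printed numbers. This is a Python sketch rather than Stata; every value is copied from the output above, and the variable names are only illustrative.

```python
import math

# Part c): group means copied from the two summary tables.
score_high, score_low = 0.6253451, -0.4661048
income_high, income_low = 12585.69, 8967.171
score_gap = score_high - score_low      # gap in test scores (SD units)
income_gap = income_high - income_low   # gap in per capita income (dollars)

# Part e): sums of squares and degrees of freedom from the regression header.
ss_model, ss_resid, ss_total = 84.8861871, 117.253472, 202.139659
df_model, df_resid = 1, 125

r_squared = ss_model / ss_total
ms_resid = ss_resid / df_resid
f_stat = (ss_model / df_model) / ms_resid
root_mse = math.sqrt(ms_resid)
adj_r_squared = 1 - (1 - r_squared) * (df_model + df_resid) / df_resid

# What the fitted slope implies for an income gap of the size seen in part c).
slope = 0.0002271
predicted_score_gap = slope * income_gap

print(round(score_gap, 3), round(income_gap))
print(round(r_squared, 4), round(adj_r_squared, 4), round(f_stat, 2), round(root_mse, 5))
print(round(predicted_score_gap, 3))
```

The regression-implied gap comes out smaller than the raw gap in group means, which is expected, since the fitted line explains only about 42 percent of the variation in test scores.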

The above regression shows that there is a positive relationship between per capita income and test scores. An extra $1000 in per capita income in a district is associated with test scores that are higher by 0.227 standard deviation units. This should not surprise us, since the graph in part d) appeared to reflect a positive relationship. Also, in part c) we saw a difference of $3619 in per capita income between high- and low-income districts, and a difference of 1.09 in test scores. Our regression would predict a difference of about 0.822 in test scores for a $3619 difference in per capita income. This is in the same ballpark. Of course, this simple regression does not control for the many other things that will affect test scores that may be correlated with per capita income.

2. a)

. reg TDs attempts

      Source |       SS       df       MS              Number of obs =      60
-------------+------------------------------           F(  1,    58) =  247.12
       Model |  2634.62634     1  2634.62634           Prob > F      =  0.0000
    Residual |  618.356991    58  10.6613274           R-squared     =  0.8099
-------------+------------------------------           Adj R-squared =  0.8066
       Total |  3252.98333    59  55.1353107           Root MSE      =  3.2652

         TDs |     Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    attempts |  .0373786   .0023778    15.72    0.000    .0326189    .0421382
       _cons | -.2807747   .6655973    -0.42    0.675   -1.613112    1.051563

Based on this regression, 100 more attempts would imply 3.7 more touchdowns in the season.

b)

. reg TDs attempts completions

      Source |       SS       df       MS              Number of obs =      60
-------------+------------------------------           F(  2,    57) =  166.11
       Model |  2776.59711     2  1388.29856           Prob > F      =  0.0000
    Residual |  476.386219    57  8.35765297           R-squared     =  0.8536
-------------+------------------------------           Adj R-squared =  0.8484
       Total |  3252.98333    59  55.1353107           Root MSE      =   2.891

         TDs |     Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
    attempts | -.0314791    .016839    -1.87    0.067   -.0651986    .0022404
 completions |  .1160537    .028158     4.12    0.000    .0596683     .172439
       _cons |  .0249384   .5939655     0.04    0.967   -1.164457    1.214334

Now, all else equal, another 100 attempts would imply about 3 fewer touchdowns! What is going on? The key here is the ceteris paribus interpretation of the coefficient on attempts. What does it mean to increase attempts, holding completions constant? It means the QB is throwing a bunch of incomplete passes. Presumably, QBs with more incompletions are worse QBs, so they have fewer touchdowns.

c) At first glance, we might just say that with 100 more completions, 11.6 more touchdowns will be thrown. This is true, all else equal, i.e. holding attempts constant. So, it's like comparing 2 QBs that throw the same number of passes, one of whom has 100 more completions. Thus, it's the answer to what happens if you increase the completion rate for a QB. Another interpretation might be to think about just adding 100 completions to what was already being done. In this case, since every completion is also an attempt, we would gain 11.6 touchdowns, but lose 3.1 from the additional attempts. Overall, then, we would expect 8.5 more touchdowns after this change (adding 100 attempts that were all completions).
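The touchdown predictions in question 2 follow directly from the reported coefficients. The short Python sketch below (not Stata; variable names are illustrative, coefficients copied from the two regressions) separates the two readings of "100 more completions" discussed in part c).

```python
# Coefficients copied from the regression output in parts a) and b).
b_attempts_simple = 0.0373786   # TDs on attempts only
b_attempts = -0.0314791         # TDs on attempts, holding completions fixed
b_completions = 0.1160537       # TDs on completions, holding attempts fixed

extra = 100

# Part a): 100 more attempts in the simple regression.
td_simple = extra * b_attempts_simple

# Part b): 100 more attempts with completions held constant (all incomplete).
td_incompletions = extra * b_attempts

# Part c), first reading: 100 more completions, attempts held constant.
td_completion_rate = extra * b_completions

# Part c), second reading: 100 new passes, all completed, so both
# attempts and completions rise by 100.
td_new_completed_passes = extra * (b_attempts + b_completions)

print(round(td_simple, 1), round(td_incompletions, 1))
print(round(td_completion_rate, 1), round(td_new_completed_passes, 1))
```

The second reading adds the two partial effects because attempts and completions both change by the same amount.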