Name: Answer Key
Economics 170, Spring 2003
Final Exam
John F. Stewart

Honor pledge: I have neither given nor received aid on this exam, including the preparation of my one-page formula list and the preparation of the Stata assignment for the exam.

Instructions: Complete each part of this exam in the space provided. If you need additional space, use the backs of pages, but clearly indicate where your answers are. Neatness and clarity of exposition count. You must attach your formula list and the output from the Stata assignment to this exam.

Question   Points
1          20
2          25
3          30
4          15
5          40
Bonus      10
Total      120

1. Definitions: Briefly define each of the following terms as they relate to the material covered in class.

1.1. Under identification
In a system of simultaneous equations, the situation in which there does not exist a solution for the structural parameters in terms of the parameters of the reduced form.

1.2. Consistency
$a_n$ is a consistent estimator of $a$ if $\lim_{n \to \infty} \Pr(|a_n - a| > \epsilon) = 0$ for any $\epsilon > 0$, i.e., $\operatorname{plim}(a_n) = a$.

1.3. Simultaneous equation bias
If one of the independent variables in a regression equation is endogenous (it is systematically related to other variables, so that the covariance between the variable and the error term is not 0), then OLS estimation of the equation will result in biased parameter estimates.

1.4. Durbin-Watson statistic
The DW statistic is used to test for first-order autocorrelation of errors. It is based on an estimate of the correlation coefficient between errors at t and t-1.
Model: $Y_t = \beta_1 + \beta_2 X_{t2} + \dots + \beta_k X_{tk} + \mu_t$
Error specification: $\mu_t = \rho \mu_{t-1} + \varepsilon_t$ with $-1 < \rho < 1$
Step 1: Estimate with OLS and compute the residuals $\hat{\mu}_t = Y_t - \hat{\beta}_1 - \hat{\beta}_2 X_{t2} - \dots - \hat{\beta}_k X_{tk}$
Step 2: Compute $d = \sum_{t=2}^{n} (\hat{\mu}_t - \hat{\mu}_{t-1})^2 \big/ \sum_{t=1}^{n} \hat{\mu}_t^2$
Step 3: Test $H_0: \rho = 0$ against $\rho > 0$, and $H_0: \rho = 0$ against $\rho < 0$

1.5. Heteroskedasticity
Failure of one of the classical assumptions needed for OLS estimates to be BLUE, namely that the errors have constant variance.

Econ 170 Final Page 1 of 18

Failure of the assumption: Given $X_t$, $\mu_t$ has constant variance, $\operatorname{Var}(\mu_t \mid X_t) = \sigma^2$.

2. Short Answer, multiple choice (5 points each)

Use the following list of Ramanathan's assumptions for the linear regression model.
1. The regression model is linear in the unknown coefficients $\alpha$ and $\beta_i$: $Y_t = \alpha + \beta_1 X_{1t} + \dots + \beta_k X_{kt} + \mu_t$ for t = 1, 2, ..., n
2. Not all observations on X are the same, i.e., Var(X) > 0
3. The error term $\mu_t$ is a random variable with $E(\mu_t \mid X_t) = 0$
4. $X_t$ is given and nonrandom, implying that it is uncorrelated with $\mu_s$, that is, $\operatorname{Cov}(X_t, \mu_s) = 0$
5. Given $X_t$, $\mu_t$ has constant variance: $\operatorname{Var}(\mu_t \mid X_t) = \sigma^2$
6. Given $X_t$, $\mu_t$ and $\mu_s$ are independently distributed for all $t \neq s$, so $\operatorname{Cov}(\mu_t, \mu_s \mid X_t) = 0$
7. The number of observations (n) must be greater than the number of regression coefficients estimated
8. For a given X, $\mu_t$ is normally distributed: $\mu_t \mid X_t \sim N(0, \sigma^2)$

2.1. A failure of assumption 6 is called
a) multicollinearity  b) serial correlation  c) heteroscedasticity  d) endogeneity

2.2. A failure of assumption 4 is called
a) multicollinearity  b) serial correlation  c) heteroscedasticity  d) endogeneity

2.3. If one of the X variables is endogenous, OLS (ordinary least squares) estimates of the parameters of the regression equation will be (check all that apply)
a) biased  b) unbiased  c) efficient  d) not efficient  e) consistent

2.4. Assuming that all 8 assumptions hold and that the estimates $\hat{\beta}_i$ of the parameters $\beta_i$ are made using OLS, then $\hat{\beta}_i / \hat{\sigma}_{\hat{\beta}_i}$ is distributed
a) t under $H_0: \hat{\beta}_i = \beta_i$
b) t under $H_0: \hat{\beta}_i = 0$
c) $\chi^2$ under $H_0: \hat{\beta}_i = \beta_i$
d) $\chi^2$ under $H_0: \hat{\beta}_i = 0$

2.5. Assuming that all 8 assumptions hold, $nR^2$ (where $R^2 = 1 - ESS/TSS$ from the OLS regression) is distributed
a) F under $H_0: \hat{\beta}_1 = \hat{\beta}_2 = \dots = \hat{\beta}_k = 0$
b) F under $H_0: \hat{\beta}_1 = \beta_1, \dots, \hat{\beta}_k = \beta_k$
c) $\chi^2$ under $H_0: \hat{\beta}_1 = \hat{\beta}_2 = \dots = \hat{\beta}_k = 0$
d) $\chi^2$ under $H_0: \hat{\beta}_1 = \beta_1, \dots, \hat{\beta}_k = \beta_k$


Problems: (Points as indicated)

3. (30 points total) Consider the following regression model to be estimated on a sample of Y and X values:
$Y_i = \alpha + \beta X_i + \mu_i$, with $\mu_i \sim N(0, \sigma_i^2)$ and $\sigma_i = \sigma + \gamma X_i$
$\alpha$, $\beta$, $\gamma$, and $\sigma$ are parameters; assume $\sigma + \gamma X_i > 0$ for all i.

3.1. (6 points) True or False: $\hat{\beta}$ (the OLS estimate of $\beta$) is
unbiased if $\gamma = 0$
unbiased if $\gamma \neq 0$
efficient if $\gamma = 0$
efficient if $\gamma \neq 0$

3.2. (12 points) Suppose that you knew the true values of $\sigma$ and $\gamma$ but do not know the true values of $\alpha$ and $\beta$. Describe how you might use the available data to generate unbiased, efficient estimators for $\alpha$ and $\beta$. Explain your answer clearly.

The problem here is that the error terms do not have constant variance (heteroskedasticity). Though the OLS estimates will be unbiased, they will not be efficient. If we knew the values of $\sigma$ and $\gamma$, then we would know the value of $\sigma_i = \sigma + \gamma X_i$. If we then transform the original regression equation by dividing both sides of the equation by $\sigma_i$, we get
$\frac{Y_i}{\sigma_i} = \alpha \frac{1}{\sigma_i} + \beta \frac{X_i}{\sigma_i} + \frac{\mu_i}{\sigma_i}$
This transformed equation now has an error term that is standard normal and so fits the classical assumptions for OLS regression to be BLUE. The procedure would be to regress the transformed Y against the transformed X and $1/\sigma_i$ with no constant term. The resulting OLS estimates of $\alpha$ (the coefficient on $1/\sigma_i$) and $\beta$ would be BLUE.

3.3. (12 points) Now suppose that you don't know the true values of $\sigma$ and $\gamma$ but only have a data set on Y and X. Describe how you might use the available data to generate estimators for $\alpha$ and $\beta$, and describe what the properties of these estimates would be. Explain your answer clearly.

If you did not know the true values of $\sigma$ and $\gamma$, then you would have to estimate them. One could do an OLS regression on the original equation $Y_i = \alpha + \beta X_i + \mu_i$ and then predict the residuals $\hat{\mu}_i$. The squared residuals could then be regressed on X (or on X and powers of X) to obtain estimates of the variance parameters, which depend on $\sigma$ and $\gamma$.
If you take the specification of $\sigma_i$ and convert it to a variance, you get
$\sigma_i^2 = \sigma^2 + 2\sigma\gamma X_i + \gamma^2 X_i^2 = \delta_0 + \delta_1 X_i + \delta_2 X_i^2$
You could think of this as an estimating equation where we proxy for $\sigma_i^2$ with $\hat{\mu}_i^2$. Taking the square roots of the predicted values of the estimated variance equation will provide the weights to use for transforming the variables as in 3.2. Because the weights are estimated rather than known, the resulting estimators are consistent and asymptotically efficient, but they are not guaranteed to be BLUE in finite samples.
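The transformation in 3.2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the exam: the function name `wls_known_sd`, the simulated data, and the parameter values (alpha = 1, beta = 2, sigma = 0.5, gamma = 0.3) are all hypothetical choices made only to show the mechanics of regressing $Y_i/\sigma_i$ on $1/\sigma_i$ and $X_i/\sigma_i$ with no constant.

```python
import random


def wls_known_sd(y, x, sd):
    """Regress y/sd on 1/sd and x/sd with no constant.

    Returns (alpha_hat, beta_hat), the coefficients on 1/sd and x/sd.
    """
    z0 = [1.0 / s for s in sd]               # transformed constant, 1/sigma_i
    z1 = [xi / s for xi, s in zip(x, sd)]    # transformed regressor, X_i/sigma_i
    ys = [yi / s for yi, s in zip(y, sd)]    # transformed dependent variable

    # Normal equations for a two-regressor least squares fit without intercept.
    a = sum(v * v for v in z0)
    b = sum(u * v for u, v in zip(z0, z1))
    c = sum(v * v for v in z1)
    d0 = sum(u * v for u, v in zip(z0, ys))
    d1 = sum(u * v for u, v in zip(z1, ys))
    det = a * c - b * b
    alpha_hat = (c * d0 - b * d1) / det
    beta_hat = (a * d1 - b * d0) / det
    return alpha_hat, beta_hat


# Hypothetical simulated sample (values chosen only for illustration).
random.seed(0)
alpha, beta, sigma, gamma = 1.0, 2.0, 0.5, 0.3
x = [random.uniform(1.0, 10.0) for _ in range(2000)]
sd = [sigma + gamma * xi for xi in x]        # sigma_i = sigma + gamma * X_i
y = [alpha + beta * xi + random.gauss(0.0, s) for xi, s in zip(x, sd)]

alpha_hat, beta_hat = wls_known_sd(y, x, sd)
print(round(alpha_hat, 3), round(beta_hat, 3))
```

For the feasible version in 3.3, the same function could be called with `sd` replaced by the square roots of fitted values from the auxiliary variance regression, provided those fitted values are positive.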

4. Consider the following time series model: $Y_t = \alpha + \beta X_t + \mu_t$. We suspect that the model is characterized by a first-order autoregressive process, $\mu_t = \rho_1 \mu_{t-1} + \varepsilon_t$, where $\varepsilon_t$ is white noise, i.e., it is iid.

4.1. (5 points) Describe step by step how you could test the hypothesis $H_0: \mu_t = \varepsilon_t$ against the alternative $H_1: \mu_t = \rho_1 \mu_{t-1} + \varepsilon_t$.
1. Estimate the model $Y_t = \alpha + \beta X_t + \mu_t$ with OLS and predict the residuals $e_t$.
2. Run the auxiliary regression $e_t = \alpha + \beta X_t + \rho_1 e_{t-1}$.
3. Calculate $L = (n - m)R^2$, where n is the number of observations, m is the number of lags in the auxiliary regression, and $R^2$ is its unadjusted $R^2$.
4. L is distributed $\chi^2_m$ under the null hypothesis that all lagged-residual coefficients in the auxiliary regression are 0. If we reject the null, we have serial correlation.
Because this is a first-order autoregressive process, you could also use the Durbin-Watson test.

4.2. (5 points) Suppose that after doing the test proposed in 4.1 you reject the null hypothesis of no autocorrelation. What are the consequences of estimating the original model with OLS?
Estimates from the OLS model will still be unbiased, but they will not be efficient.

4.3. (5 points) Again assuming that we reject the null hypothesis of no autocorrelation, describe step by step the procedure that would result in consistent and asymptotically efficient estimators of the parameters of this model.
1. Estimate the model $Y_t = \alpha + \beta X_t + \mu_t$ with OLS and predict the residuals $e_t$.
2. Run the auxiliary regression $e_t = \rho_1 e_{t-1} + v_t$.
3. Use r, the OLS estimate of $\rho_1$, to transform the original data as follows: $Y^*_t = Y_t - rY_{t-1}$, $X^*_t = X_t - rX_{t-1}$.
4. Estimate the original model on the transformed data and repeat until the estimate of $\rho_1$ no longer changes (iterative procedure), or search over $\rho_1$ for the value producing the best fit (search procedure).
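The Durbin-Watson statistic and the iterative procedure in 4.3 can be sketched as follows. This is a hedged illustration for a single regressor only: the helper names (`ols`, `durbin_watson`, `cochrane_orcutt`), the simulated series, and the parameter values (alpha = 1, beta = 2, rho = 0.6) are assumptions made for the example and do not come from the exam or its data.

```python
import random


def ols(y, x):
    """OLS of y on a constant and one regressor; returns (intercept, slope)."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def durbin_watson(e):
    """d = sum_{t=2}^{n} (e_t - e_{t-1})^2 / sum_{t=1}^{n} e_t^2."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(v * v for v in e)


def cochrane_orcutt(y, x, tol=1e-6, max_iter=50):
    """Iterate steps 1-4 of 4.3 until the estimate of rho stops changing."""
    a, b = ols(y, x)
    r = 0.0
    for _ in range(max_iter):
        # Residuals of the original model at the current coefficient values.
        e = [yi - a - b * xi for yi, xi in zip(y, x)]
        # Regress e_t on e_{t-1} without a constant to estimate rho.
        r_new = (sum(e[t] * e[t - 1] for t in range(1, len(e)))
                 / sum(e[t - 1] ** 2 for t in range(1, len(e))))
        # Quasi-difference the data: Y*_t = Y_t - r Y_{t-1}, X*_t = X_t - r X_{t-1}.
        ys = [y[t] - r_new * y[t - 1] for t in range(1, len(y))]
        xs = [x[t] - r_new * x[t - 1] for t in range(1, len(x))]
        a_star, b = ols(ys, xs)
        a = a_star / (1.0 - r_new)   # transformed intercept equals alpha * (1 - rho)
        converged = abs(r_new - r) < tol
        r = r_new
        if converged:
            break
    return a, b, r


# Hypothetical simulated series with AR(1) errors.
random.seed(1)
n, alpha, beta, rho = 500, 1.0, 2.0, 0.6
x = [random.uniform(0.0, 10.0) for _ in range(n)]
u = [0.0] * n
for t in range(1, n):
    u[t] = rho * u[t - 1] + random.gauss(0.0, 1.0)
y = [alpha + beta * x[t] + u[t] for t in range(n)]

a0, b0 = ols(y, x)
dw_ols = durbin_watson([yi - a0 - b0 * xi for yi, xi in zip(y, x)])
a_hat, b_hat, r_hat = cochrane_orcutt(y, x)
print(round(dw_ols, 2), round(b_hat, 3), round(r_hat, 3))
```

A value of d well below 2 points toward positive first-order autocorrelation, since d is approximately 2(1 - r).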

5. (40 points) This question uses Part A of the Final Exam Homework Stata assignment. Note: I am trying to cover a lot of ground with one data set, so please treat the questions as sequential and only use the information that is specifically requested for each part.

5.1.a. (5 points) First consider your OLS estimation results for Model 1 and Model 2 from the Final Exam Homework assignment. The economic model is that the cross-state variation in average performance on the SAT depends on how much the state spends per pupil on education and possibly other factors. Compare the results you obtained from these two models (particularly concentrating on the differences). What explanation would you offer as to why the two models differ?

If you look at the two sets of results, you find that Model 1 has a negative, insignificant coefficient on spend01 and a very low $R^2$. However, when pr_02 is added, the sign on spend01 changes and the coefficient becomes significant. These are classic symptoms of omitted variable bias.

Model 1:
. reg sat_tot spend01

  Source |       SS       df       MS
---------+------------------------------      F(  1,    48) =    0.14
   Model |  595.509669     1  595.509669      Prob > F      =  0.7114
Residual |   206388.27    48  4299.75563      R-squared     =  0.0029
---------+------------------------------      Adj R-squared = -0.0179
   Total |   206983.78    49  4224.15878      Root MSE      =  65.573

 sat_tot |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
 spend01 |    -.003175   .0085314    -0.372    0.711    -.0203284    .0139785
   _cons |    1092.407   62.56692    17.460    0.000     966.6081    1218.207

Model 2:
. reg sat_tot spend01 pr_02

  Source |       SS       df       MS
---------+------------------------------      F(  2,    47) =   95.67
   Model |  166165.534     2  83082.7668      Prob > F      =  0.0000
Residual |  40818.2463    47  868.473326      R-squared     =  0.8028
---------+------------------------------      Adj R-squared =  0.7944
   Total |   206983.78    49  4224.15878      Root MSE      =   29.47

 sat_tot |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
 spend01 |     .012361   .0039959     3.093    0.003     .0043223    .0203997
   pr_02 |   -2.146003   .1554239   -13.807    0.000    -2.458676   -1.833331
   _cons |    1059.989   28.21694    37.566    0.000     1003.224    1116.754

5.1.b. (5 points) Though you were not asked to do it on the assignment, if, after running Model 2, you had run White's general test, you would have gotten the following output.

. whitetst
White's general test statistic :  10.81928  Chi-sq( 5)  P-value =  .0551

What have we tested for with this test? How was the test actually done? And how do you interpret the above results? How do these test results change your interpretation of the OLS estimators you obtained for Model 2, if at all?

The White test is a test for heteroskedasticity (error terms whose variance is not constant). The test that is performed is as follows.
Model: $Y_t = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \mu_t$
Assumed error structure: $\sigma_t^2 = \alpha_1 + \alpha_2 X_2 + \alpha_3 X_3 + \alpha_4 X_2^2 + \alpha_5 X_3^2 + \alpha_6 X_2 X_3$
1. Estimate the model and calculate $\hat{\mu}_t^2$.
2. Estimate the auxiliary regression of $\hat{\mu}_t^2$ on the variables in the assumed error structure.
3. Compute $nR^2$ for the auxiliary regression.
4. $nR^2 \sim \chi^2_5$ under the hypothesis $\alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = \alpha_6 = 0$.
The test statistic in the output above is $nR^2$ (where n = 50 and $R^2$ is from the auxiliary regression on the squared residuals from Model 2). A Chi-sq of 10.82 has a P-value of .0551, so we can reject the hypothesis of no heteroskedasticity at the 10% level but fall just short of rejecting it at the 5% level. If the errors are heteroskedastic, then the OLS coefficient estimates, while still unbiased, are inefficient, and the usual standard errors are biased (so hypothesis tests are invalid).

5.2. Now consider Model 2, Model 3, and Model 4. For this section we have added the state's poverty rate as another determinant of SAT scores. In Model 3 it enters as a separate linear variable; in Model 4 the poverty rate enters both linearly and interacted with spending.
Model 2: sat_tot = α + β1 spend01 + β2 pr_02 + µ
Model 3: sat_tot = α + β1 spend01 + β2 pr_02 + β3 pov_rate + µ
Model 4: sat_tot = α + β1 spend01 + β2 pr_02 + β3 pov_rate + β4 (pov_rate × spend01) + µ

5.2.a. (5 points) Using your estimated results for Model 4: State A has a poverty rate of 5% and State B has a poverty rate of 15%. An additional $1 of spending per pupil will result in how many additional points on the SAT scores in State A? In State B? (Show your work.)

Note that if we take Model 4, ∂sat_tot/∂spend01 = β1 + β4 pov_rate. Using the coefficients from the regression, we get .0304041 - (.0018886 × 5) = .0209611 points in State A and .0304041 - (.0018886 × 15) = .0020751 points in State B. Also note from the summary statistics that you produced that rates are measured in whole numbers, e.g. 15% = 15 (not .15), in the actual data.
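The marginal-effect calculation for 5.2.a can be written as a small Python function. This is only a sketch: the function name `spending_effect` is hypothetical, and the two coefficients are the Model 4 estimates for spend01 and pr_spend taken from the Stata log.

```python
def spending_effect(beta_spend, beta_interact, pov_rate):
    """Marginal effect of spend01 on sat_tot in Model 4: beta1 + beta4 * pov_rate."""
    return beta_spend + beta_interact * pov_rate


B1 = 0.0304041     # coefficient on spend01 in Model 4
B4 = -0.0018886    # coefficient on pr_spend (pov_rate x spend01) in Model 4

# pov_rate is entered in whole percentage points (5 means 5%).
effect_a = spending_effect(B1, B4, 5)
effect_b = spending_effect(B1, B4, 15)
print(f"{effect_a:.7f} {effect_b:.7f}")   # prints "0.0209611 0.0020751"
```

Because the interaction coefficient is negative, the estimated payoff to an extra dollar of spending shrinks as the poverty rate rises.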

5.2.b. (5 points) Using the Wald test from your STATA assignment, can you reject the specification in Model 2 (as H0) when the alternative hypothesis (H1) is Model 4? Explain.

The restriction between Model 4 and Model 2 is that β3 = β4 = 0. The output of the test is

. test pov_rate pr_spend
 ( 1)  pov_rate = 0.0
 ( 2)  pr_spend = 0.0
       F(  2,    45) =    9.82
            Prob > F =    0.0003

With 99%+ confidence we can reject the hypothesis that β3 = β4 = 0; that is, we can reject Model 2 in favor of Model 4.

5.2.c. (5 points) Suppose our interest was in rejecting Model 2 (as H0) when the alternative hypothesis (H1) is Model 3. Do you have enough information on your printouts to do this test? Explain.

Here the implied restriction is that β3 = 0. We could rely on the t statistic on β3 or, better, we could remember that we can construct the F for Wald's test using the ESS (equivalently, the $R^2$) of the restricted and unrestricted regressions. For a linear restriction on the coefficients, the following test statistic is distributed as F under the null hypothesis that the restriction is valid:
$\frac{(R_U^2 - R_R^2)/(k - m)}{(1 - R_U^2)/(n - k)} \sim F_{k-m,\,n-k}$
where $R_R^2$ and $R_U^2$ are the $R^2$ from the restricted and unrestricted regressions, k - m is the number of restrictions, and k is the number of parameters estimated in the unrestricted model. From the two regressions run on Models 2 and 3 in the exercise, you can get the two $R^2$ measures.

5.2.d. (5 points) Using the information you generated in part 5 of the STATA assignment, where does North Carolina rank compared to the other states in average SAT scores?
Rank
Where do you predict North Carolina would rank in average SAT scores if all states had participation in the exams at the same level?
Rank

5.3. Now consider a model of SAT scores in which we also consider some additional factors, and we consider the determinants of the participation rate pr_02.
Model 5:
1) sat_tot = α + β1 spend01 + β2 pr_02 + β3 pov_rate + β4 col_grad + µ
2) pr_02 = γ + γ1 spend01 + γ2 sat_tot + γ3 fam_y + γ4 col_grad + υ

5.3.a. (5 points) In theory, using OLS to estimate equation 1) of Model 5 will result in parameter estimates that are (check those that apply)
biased
unbiased
efficient
inefficient
asymptotically efficient
not asymptotically efficient
consistent
not consistent

5.3.b. (5 points) Using your STATA output, compare the estimates you obtained from the OLS and 2SLS procedures. How do they differ?

The simple answer here (see log material) is that they don't differ much at all: coefficient signs, magnitudes, and implied significance are virtually identical. This is a case where doing it right gets you essentially the same answer as doing it wrong.

5.4. Bonus Question (10 points) Along with your answer to 5.3, consider the additional output that was obtained from an OLS regression
1') sat_tot = α + β1 spend01 + β2 pr_02 + γ0 er_pr + β3 pov_rate + β4 col_grad + µ
where er_pr are the predicted residuals from an OLS estimation of the reduced form equation for pr_02.

. reg sat_tot pr_02 er_pr spend01 pov_rate col_grad

  Source |       SS       df       MS
---------+------------------------------      F(  5,    44) =   64.70
   Model |  182202.063     5  36440.4126      Prob > F      =  0.0000
Residual |   24781.717    44  563.220842      R-squared     =  0.8803
---------+------------------------------      Adj R-squared =  0.8667
   Total |   206983.78    49  4224.15878      Root MSE      =  23.732

 sat_tot |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
   pr_02 |    -2.67249   .4787374    -5.582    0.000    -3.637321   -1.707658
   er_pr |    .2941339   .4993292     0.589    0.559    -.7121979    1.300466
 spend01 |    .0108647    .004135     2.628    0.012     .0025313    .0191982
pov_rate |   -2.539621   1.437782    -1.766    0.084     -5.43728    .3580377
col_grad |    3.747338   1.557108     2.407    0.020     .6091938    6.885482
   _cons |    1028.728    55.9383    18.390    0.000     915.9915    1141.464

What can you add to your answer to 5.3 with this additional information?

This is the Durbin-Wu-Hausman exogeneity test. A reduced form equation for pr_02 is run and the residuals are calculated. These residuals are then included in the estimation of structural equation 1) as an additional variable. If the coefficient on the residuals does not differ significantly from zero (as in this case, where t = 0.589 and P = 0.559), then we cannot reject exogeneity. The reason we get the same results on equation 1) with OLS and 2SLS is that pr_02 is statistically exogenous.

Full Log of STATA Exercise:

. use "D:\work\courses\econ170\Econ 170 Exams\final_hw_s03_a.dta", clear
. do "D:\work\courses\econ170\Econ 170 Exams\final_s03_a.do"
. /* Key for final Exam Homework*/
. /*Part A: Using final_hw_s03_a.dta*/
. pause on
. sum

Variable |     Obs        Mean   Std. Dev.        Min        Max
---------+------------------------------------------------------
   state |       0
pov_rate |      50      11.942    3.12489        6.5       19.9
   pr_02 |      50        37.4   28.22938          4         83
  ver_02 |      50      532.42   32.52132        488        597
 math_02 |      50      536.96   32.97863        491        610
 sat_tot |      50     1069.38   64.99353        980       1207
 spend01 |      50     7252.76   1098.009       4579       9362
pr_spend |      50    85615.21   22263.82    43042.6   151147.6
   fam_y |      50    49241.52   7108.452      36484      65521
col_grad |      50      24.586   4.499842       14.1       34.8
pr_02bak |      50        37.4   28.22938          4         83

. corr pov_rate pr_02 ver_02 math_02 sat_tot spend01 pr_spend fam_y col_grad
(obs=50)

         | pov_rate    pr_02   ver_02  math_02  sat_tot  spend01 pr_spend
---------+---------------------------------------------------------------
pov_rate |   1.0000
   pr_02 |  -0.2784   1.0000
  ver_02 |   0.0241  -0.8814   1.0000
 math_02 |  -0.0556  -0.8519   0.9692   1.0000
 sat_tot |  -0.0161  -0.8733   0.9922   0.9924   1.0000
 spend01 |  -0.2966   0.2816  -0.0704  -0.0363  -0.0536   1.0000
pr_spend |   0.8341  -0.1233  -0.0242  -0.0849  -0.0552   0.2638   1.0000
   fam_y |  -0.7995   0.4708  -0.2329  -0.1367  -0.1859   0.2834  -0.6576
col_grad |  -0.6274   0.4251  -0.1514  -0.0864  -0.1196   0.2060  -0.5219

         |    fam_y col_grad
---------+------------------
   fam_y |   1.0000
col_grad |   0.8133   1.0000

. reg sat_tot spend01

  Source |       SS       df       MS
---------+------------------------------      F(  1,    48) =    0.14
   Model |  595.509669     1  595.509669      Prob > F      =  0.7114
Residual |   206388.27    48  4299.75563      R-squared     =  0.0029
---------+------------------------------      Adj R-squared = -0.0179
   Total |   206983.78    49  4224.15878      Root MSE      =  65.573

 sat_tot |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
 spend01 |    -.003175   .0085314    -0.372    0.711    -.0203284    .0139785
   _cons |    1092.407   62.56692    17.460    0.000     966.6081    1218.207

. reg sat_tot spend01 pr_02


  Source |       SS       df       MS
---------+------------------------------      F(  2,    47) =   95.67
   Model |  166165.534     2  83082.7668      Prob > F      =  0.0000
Residual |  40818.2463    47  868.473326      R-squared     =  0.8028
---------+------------------------------      Adj R-squared =  0.7944
   Total |   206983.78    49  4224.15878      Root MSE      =   29.47

 sat_tot |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
 spend01 |     .012361   .0039959     3.093    0.003     .0043223    .0203997
   pr_02 |   -2.146003   .1554239   -13.807    0.000    -2.458676   -1.833331
   _cons |    1059.989   28.21694    37.566    0.000     1003.224    1116.754

. whitetst
White's general test statistic :  10.81928  Chi-sq( 5)  P-value =  .0551

. reg sat_tot spend01 pr_02 pov_rate

  Source |       SS       df       MS
---------+------------------------------      F(  3,    46) =   90.59
   Model |  177022.319     3  59007.4396      Prob > F      =  0.0000
Residual |  29961.4612    46  651.336113      R-squared     =  0.8552
---------+------------------------------      Adj R-squared =  0.8458
   Total |   206983.78    49  4224.15878      Root MSE      =  25.521

 sat_tot |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
 spend01 |    .0089185   .0035617     2.504    0.016     .0017491    .0160879
   pr_02 |   -2.265635   .1377517   -16.447    0.000    -2.542914   -1.988355
pov_rate |   -5.104642   1.250309    -4.083    0.000    -7.621383   -2.587901
   _cons |     1150.39   32.97606    34.886    0.000     1084.013    1216.768

. reg sat_tot spend01 pr_02 pov_rate pr_spend

  Source |       SS       df       MS
---------+------------------------------      F(  4,    45) =   70.71
   Model |  178571.217     4  44642.8042      Prob > F      =  0.0000
Residual |  28412.5631    45  631.390292      R-squared     =  0.8627
---------+------------------------------      Adj R-squared =  0.8505
   Total |   206983.78    49  4224.15878      Root MSE      =  25.127

 sat_tot |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
 spend01 |    .0304041   .0141589     2.147    0.037     .0018866    .0589216
   pr_02 |   -2.270024    .135655   -16.734    0.000    -2.543247   -1.996801
pov_rate |    8.346319   8.675756     0.962    0.341     -9.12755    25.82019
pr_spend |   -.0018886   .0012058    -1.566    0.124    -.0043172      .00054
   _cons |    995.7873   103.9111     9.583    0.000     786.4996    1205.075

. /*Wald Test, two ways to the same answer*/
. test pov_rate pr_spend
 ( 1)  pov_rate = 0.0
 ( 2)  pr_spend = 0.0
       F(  2,    45) =    9.82
            Prob > F =    0.0003

. test pov_rate=0
 ( 1)  pov_rate = 0.0
       F(  1,    45) =    0.93
            Prob > F =    0.3412

. test pr_spend=0, accum
 ( 1)  pov_rate = 0.0
 ( 2)  pr_spend = 0.0
       F(  2,    45) =    9.82
            Prob > F =    0.0003

. display .0304041-(.0018886*5)
.0209611

. display .0304041-(.0018886*15)
.0020751

. /*Making predictions if all states had participation at same level*/
. replace pr_02=37.4
(50 real changes made)

. /* Setting all observation at average participation*/
. predict sat_hat
(option xb assumed; fitted values)

. sort sat_tot
. list state sat_tot

       state  sat_tot

  1.      GA      980
  2.      SC      981
  3.      TX      991
  4.      FL      995
  6.      PA      998
  7.      NY     1000
  8.      IN     1001
  9.      DE     1002
 10.      ME     1005
 11.      RI     1007
 12.      HI     1008
 13.      NJ     1011
 14.      CA     1013
 15.      VA     1016
 16.      CT     1018
 17.      MD     1020
 18.      VT     1022
 19.      NV     1027
 20.      MA     1028
 21.      AL     1035
 22.      NH     1038
 23.      WV     1040
 24.      AR     1043
 25.      OR     1052
 26.      WA     1054
 27.      WY     1068
 28.      OH     1073
 29.      ID     1080
 30.      MT     1088
 31.      CO     1091
 32.      NM     1094
 33.      KY     1102
 34.      MS     1106
 35.      AZ     1116
 36.      TN     1117
 37.      AK     1119
 38.      LA     1120
 39.      UT     1122
 40.      OK     1127
 41.      MI     1130
 42.      NE     1131
 43.      MO     1154
 44.      KS     1158
 45.      SD     1162
 46.      MN     1172
 47.      IL     1174
 48.      WI     1182
 49.      IA     1193
 50.      ND     1207

. sort sat_hat
. list state sat_tot sat_hat

       state  sat_tot    sat_hat
  1.      LA     1120   1030.204
  2.      WV     1040   1031.561
  3.      MS     1106    1033.46
  4.      NM     1094   1034.228
  5.      AL     1035   1045.247
  6.      AR     1043   1046.171
  7.      KY     1102   1047.069
  8.      UT     1122   1047.273
  9.      AZ     1116   1047.689
 10.      TX      991   1048.357
 11.      CA     1013   1049.498
 12.      OK     1127    1050.99
 13.      MT     1088   1052.648
 14.      TN     1117   1054.395
 15.      SC      981   1056.032
 16.      ID     1080   1056.893
 17.      FL      995   1057.702
 18.      NY     1000   1057.816
 19.      GA      980   1060.092
 20.      SD     1162    1060.24
 22.      NV     1027   1061.026
 23.      MO     1154   1061.068
 24.      WA     1054   1064.327
 25.      HI     1008    1065.54
 26.      CO     1091   1067.771
 27.      OH     1073   1070.911
 28.      IL     1174    1071.24
 29.      OR     1052   1072.396
 30.      VA     1016   1076.498
 31.      ME     1005   1078.465
 32.      RI     1007   1078.642
 33.      AK     1119   1079.534
 34.      PA      998    1080.86
 35.      ND     1207   1081.442
 36.      MI     1130    1082.29
 37.      KS     1158   1082.384
 38.      WY     1068   1082.858
 39.      IA     1193   1087.334
 40.      NE     1131   1088.053
 41.      MA     1028   1089.136
 42.      MD     1020   1091.129
 43.      NH     1038   1091.438
 44.      IN     1001   1093.566
 45.      VT     1022   1098.422
 46.      DE     1002   1099.098
 47.      WI     1182   1105.683
 48.      MN     1172   1110.313
 49.      CT     1018   1113.146
 50.      NJ     1011   1116.185

. /*Restoring original values to pr_02*/
. replace pr_02=pr_02bak
(50 real changes made)

. /*Model 5*/
. /* OLS estimation of the SAT equation of model 5*/
. reg sat_tot pr_02 spend01 pov_rate col_grad

  Source |       SS       df       MS
---------+------------------------------      F(  4,    45) =   81.98
   Model |  182006.631     4  45501.6578      Prob > F      =  0.0000
Residual |  24977.1488    45   555.04775      R-squared     =  0.8793
---------+------------------------------      Adj R-squared =  0.8686
   Total |   206983.78    49  4224.15878      Root MSE      =  23.559

 sat_tot |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
   pr_02 |   -2.402115   .1350726   -17.784    0.000    -2.674165   -2.130065
 spend01 |    .0094098    .003292     2.858    0.006     .0027793    .0160403
pov_rate |   -2.633522   1.418512    -1.857    0.070    -5.490552    .2235072
col_grad |    3.058477   1.020629     2.997    0.004     1.002825    5.114128
   _cons |    1047.226   45.95481    22.788    0.000     954.6683    1139.784

. /* 2sls estimation of the SAT equation of model 5*/
. ivreg sat_tot (pr_02=fam_y) spend01 pov_rate col_grad

Instrumental variables (2SLS) regression

  Source |       SS       df       MS
---------+------------------------------      F(  4,    45) =    9.93
   Model |  179782.665     4  44945.6663      Prob > F      =  0.0000
Residual |  27201.1148    45  604.469217      R-squared     =  0.8686
---------+------------------------------      Adj R-squared =  0.8569
   Total |   206983.78    49  4224.15878      Root MSE      =  24.586

 sat_tot |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
   pr_02 |    -2.67249   .4959583    -5.389    0.000    -3.671401   -1.673578
 spend01 |    .0108647   .0042837     2.536    0.015     .0022369    .0194926
pov_rate |   -2.539621   1.489501    -1.705    0.095    -5.539629    .4603874
col_grad |    3.747338   1.613119     2.323    0.025       .49835    6.996326
   _cons |    1028.728   57.95048    17.752    0.000     912.0095    1145.446

Instrumented:  pr_02
Instruments:   spend01 pov_rate col_grad fam_y

. /* Bonus question endogeneity test for model 5*/
. reg pr_02 spend01 col_grad pov_rate fam_y

  Source |       SS       df       MS
---------+------------------------------      F(  4,    45) =    4.46
   Model |  11082.8636     4   2770.7159      Prob > F      =  0.0040
Residual |  27965.1362    45  621.447471      R-squared     =  0.2838
---------+------------------------------      Adj R-squared =  0.2202
   Total |  39047.9998    49  796.897955      Root MSE      =  24.929

   pr_02 |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
 spend01 |      .00481   .0034095     1.411    0.165    -.0020571    .0116772
col_grad |    .7397285   1.363968     0.542    0.590    -2.007444    3.486901
pov_rate |    2.714306   1.915009     1.417    0.163    -1.142721    6.571333
   fam_y |    .0022324   .0011226     1.989    0.053    -.0000287    .0044934
   _cons |   -158.0135   65.56593    -2.410    0.020      -290.07   -25.95689

. predict pr_hat
(option xb assumed; fitted values)

. predict er_pr,resid

. reg sat_tot pr_02 er_pr spend01 pov_rate col_grad

  Source |       SS       df       MS
---------+------------------------------      F(  5,    44) =   64.70
   Model |  182202.063     5  36440.4126      Prob > F      =  0.0000
Residual |   24781.717    44  563.220842      R-squared     =  0.8803
---------+------------------------------      Adj R-squared =  0.8667
   Total |   206983.78    49  4224.15878      Root MSE      =  23.732

 sat_tot |       Coef.  Std. Err.         t    P>|t|     [95% Conf. Interval]
---------+-------------------------------------------------------------------
   pr_02 |    -2.67249   .4787374    -5.582    0.000    -3.637321   -1.707658
   er_pr |    .2941339   .4993292     0.589    0.559    -.7121979    1.300466
 spend01 |    .0108647    .004135     2.628    0.012     .0025313    .0191982
pov_rate |   -2.539621   1.437782    -1.766    0.084     -5.43728    .3580377
col_grad |    3.747338   1.557108     2.407    0.020     .6091938    6.885482
   _cons |    1028.728    55.9383    18.390    0.000     915.9915    1141.464

.
. end of do-file
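
The $R^2$ form of the F statistic described in 5.2.c can be checked against the log with a short Python sketch. The function name `f_from_r2` is hypothetical. The inputs are the R-squared values reported in the log for Models 2, 3, and 4, which are rounded to four decimals, so the results agree with the Stata output only approximately.

```python
def f_from_r2(r2_unrestricted, r2_restricted, n_restrictions, resid_df):
    """F = [(R2_U - R2_R) / (k - m)] / [(1 - R2_U) / (n - k)].

    n_restrictions is k - m; resid_df is n - k for the unrestricted model.
    """
    numerator = (r2_unrestricted - r2_restricted) / n_restrictions
    denominator = (1.0 - r2_unrestricted) / resid_df
    return numerator / denominator


# Model 2 (restricted) against Model 4 (unrestricted): 2 restrictions, 45 residual df.
f_2_vs_4 = f_from_r2(0.8627, 0.8028, 2, 45)

# Model 2 (restricted) against Model 3 (unrestricted): 1 restriction, 46 residual df.
f_2_vs_3 = f_from_r2(0.8552, 0.8028, 1, 46)

print(round(f_2_vs_4, 2), round(f_2_vs_3, 2))
```

The first value should land close to the F(2, 45) = 9.82 reported by the accumulated test, and the second close to the square of the t statistic on pov_rate in Model 3 (-4.083), since a single-restriction F test and the corresponding t test are equivalent.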