

Problem Set 4, ECON 3033
(Due at the start of class, Wednesday, February 4, 04)
(Questions marked with a * are old test questions)
Bill Evans
Spring 08

1. Consider a multivariate regression model of the form $y_i = \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2 + \varepsilon_i$. Write the 1st order conditions for the optimization problem where one is interested in minimizing the sum of squared errors $SSE = \sum_i \hat\varepsilon_i^2$. Suppose in a sample of 5 observations, the following facts are presented about the model above.

x x x x x x y 40 80 0 0 0 0 x y 0 x y 60

Using the first order conditions (or normal equations) and these facts, provide the estimates for $\hat\beta_0$, $\hat\beta_1$ and $\hat\beta_2$. HINT: Solve for $\hat\beta_0$ first.

2. Download the data cps87.dta. Generate two new variables. The first is the natural log of weekly earnings. The second is age squared. Next, run a regression of the natural log of weekly earnings on age, age squared and years of education. We can write this model as

$\ln(\text{weekly earn})_i = \beta_0 + \text{age}_i\beta_1 + \text{age}_i^2\beta_2 + \text{educ}_i\beta_3 + \varepsilon_i$

Provide a mathematical expression that defines $\partial \ln(\text{weekly earn})/\partial \text{age}$. Using the results from the regression, what is $\partial \ln(\text{weekly earn})/\partial \text{age}$ at age? Age 35? Age 50?

3. Consider a multivariate regression model of the form $y_i = \beta_0 + x_{1i}\beta_1 + \varepsilon_i$. Suppose the $R^2$ from this model is $R_a^2$. True, False, or Uncertain and explain. The $R^2$ can never fall below $R_a^2$ when additional variables are added to the model? (Think of a special case where someone adds completely irrelevant variables to the model: what will happen to the $R^2$?)

4. On the class web page is a STATA data set called house_price.dta. It has data on 4 homes sold in 1998 in a small town in New England. The data set contains information on the sales price of the house (measured in thousands of dollars), the number of bedrooms, bathrooms, other rooms, square feet of living space and age of the home. Download the data and initially estimate a regression with house prices as the outcome of interest and four covariates: age in years, number of bedrooms, # of bath rooms, # of other rooms. Call this model 1.

a. Interpret the coefficient on age in years and # of bedrooms by providing a numeric example.

Now, estimate a second model and add to the original regression the square feet of living space. Call this model 2.

b. What happens to the coefficient on # of rooms, # of bedrooms and # of other rooms in this new model compared to the previous one? Why have the coefficients on these three variables changed so dramatically?

c. Interpret the coefficient on square feet of living space.

Now estimate a third model with the same dependent variable but include only two covariates: age in years and square feet.

d. Compare the $R^2$ from this model and that in Model #2. Provide an intuitive explanation for why the difference is so small.

5. On the class web page is a data set named senior_medical_exp.dta which has information on age, the number of chronic conditions and the total medical expenses for a sample of senior citizens aged 65 to 84. I select seniors for this example because all of them have health insurance through the Medicare program. The three variables in the data set are

Variable    Label
totalexp    total expenditures on medical care, 00
chronic     number of chronic conditions (0-5)
age         age in years

Load the data set into STATA, then construct two new variables: Regress totalexp on age and chronic. (reg totalexp age chronic)

a) Interpret the coefficient on age -- provide a numeric example of the magnitude of the coefficient on this variable.

b) Interpret the coefficient on chronic -- provide a numeric example of the magnitude of the coefficient on this variable.

c) Now regress totalexp on age (reg totalexp age). What has happened to the coefficient on age compared to the results in part a? Does this make sense? Why or why not?

d) Regress age on chronic. What is the coefficient on chronic? Does this make sense?

e) After this regression, output the residuals from the regression (predict res_age, residual). Next, regress totalexp on res_age. How does this number compare to the estimates in a)?

6. *Return to problem 5 on problem set 3. A pharmaceutical company is investigating the cholesterol lowering benefits of a new drug. In a sample of subjects the company randomly assigns milligrams of active ingredients (label this as $x_{1i}$) and the outcome of interest, labeled as $y_i$, is the change in cholesterol from the start until the end of the trial. Initially, the researchers estimate a model of the form $y_i = \beta_0 + x_{1i}\beta_1 + \varepsilon_i$. However, a colleague mentions that as part of the experiment, they also collected detailed data on characteristics of survey participants that predict $y$, like their weight at the start of the trial, age, sex, ethnicity/race, plus other variables. The colleague asks whether one should include these covariates (label them as $x_{2i}, x_{3i}, \ldots, x_{ki}$) in the basic regression.

a) By estimating a model of $y_i = \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2 + \cdots + x_{ki}\beta_k + \varepsilon_i$, do you anticipate that the estimate $\hat\beta_1$ will change?

b) In a multivariate model, the estimated variance of $\hat\beta_1$ is given as

$$\hat V(\hat\beta_1) = \frac{\hat\sigma^2}{(1 - R_1^2)\sum_i (x_{1i} - \bar x_1)^2}$$

What is the likely consequence of adding these additional covariates ($x_{2i}, x_{3i}, \ldots, x_{ki}$) to the estimated variance of $\hat\beta_1$? Explain your answer.

7. *On the next page are the results from two regression models. In model (1), I regress Y on X1, and note that the standard error on the coefficient on X1 is very small and the t-statistic on $\hat\beta_1$ is over 3. Note that in model (2), when I add X2 to the model, the standard error on $\hat\beta_1$ increases by a factor of 3 and the t-statistic on this parameter falls to .39. Using the information given, provide an intuitive explanation for why the standard error on $\hat\beta_1$ increases so much when X2 is added to the model. To get full credit, you must provide the proper equation.

8. *Many people get their health insurance through their job and, because of high health insurance costs, many employers are considering offering free on-site exercise classes as a way of encouraging healthy behaviors and hopefully reducing medical care costs. The evidence for subsidized exercise classes comes primarily from research in the field of public health. In these models the authors collect data from an employer and estimate a regression of the form $y_i = \beta_0 + x_{1i}\beta_1 + \varepsilon_i$, where $y_i$ is annual spending on health care for employee $i$ and $x_{1i}$ is a dummy variable that equals 1 if the person uses the on-site health care services. Call this model (1). Let $\hat\beta_1$ be the estimate for $\beta_1$ from model (1). In this case, the author gets the expected result that $\hat\beta_1 < 0$ -- people that use on-site exercise classes have lower health care spending. Model (1) has been criticized because it does not control for the fact that the least healthy employees are the ones least likely to enroll in these classes. Consider a simple extension to the model where the author has detailed data on the health of employees prior to the exercise classes opening. Let $x_{2i}$ be a simple index that equals the number of chronic health conditions a person has (e.g., a person with high blood pressure, obesity, and diabetes has a count of three whereas a healthy person has a count of zero). Now consider estimating model (2), which is of the form $y_i = \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2 + \varepsilon_i$. If Model (2) is the true model, do you anticipate that $\hat\beta_1$, the estimate from model (1), is biased up or down? Explain your answer and, to get full credit, you must provide an appropriate equation.
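For reference, the textbook omitted-variable relationship that a question of this kind usually points to can be sketched as follows. The notation is an assumption rather than the problem set's own: $\hat\beta_1$ is the estimate from the short regression, model (1), and $\delta_1$ is a symbol introduced here for the slope from an auxiliary regression of $x_{2i}$ on $x_{1i}$. The expectation is taken conditional on the regressors.

```latex
E[\hat\beta_1] = \beta_1 + \beta_2\,\delta_1,
\qquad
\delta_1 = \frac{\sum_i (x_{1i} - \bar x_1)(x_{2i} - \bar x_2)}{\sum_i (x_{1i} - \bar x_1)^2}
```

The direction of any bias then depends on the signs of $\beta_2$ and $\delta_1$, which have to be argued from the setting.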

Results for Question 7

Correlation between X1 and X2

. corr x1 x2
(obs=489)

             |       x1       x2
-------------+------------------
          x1 |   1.0000
          x2 |   0.9994   1.0000

Model 1: Regression of Y on X1

. reg y x1

      Source |         SS     df          MS         Number of obs =     489
-------------+------------------------------         F(, 487)      =   56.63
       Model |     .04473             .04473         Prob > F      =  0.0000
    Residual | 535.054756    487     .540634         R-squared     =   0.845
-------------+------------------------------         Adj R-squared =    0.84
       Total |  656.09899    488   .63705357         Root MSE      =  .46383

           y |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .0765488      .0037      3.7    0.000      .07005      .08877
       _cons |   5.059357    .043554      6.6    0.000     4.97395     5.44763

Model 2: Regression of Y on X1 and X2

. reg y x1 x2

      Source |         SS     df          MS         Number of obs =     489
-------------+------------------------------         F(, 486)      =     8.4
       Model |        .58         60.5579054         Prob > F      =  0.0000
    Residual |   534.9838    486     .598358         R-squared     =   0.846
-------------+------------------------------         Adj R-squared =   0.839
       Total |  656.09899    488   .63705357         Root MSE      =  .46389

           y |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |     .30486   .0935397      .39     0.63    -.059375     .339098
          x2 |  -.0539557    .093559    -0.58    0.564     -.37338        .943
       _cons |   5.059738   .0435649      6.4    0.000      4.9743      5.4566
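The effect of the near-perfect correlation reported above can be illustrated with a short calculation. The sketch below assumes the two-regressor case, where the $R^2$ from regressing X1 on X2 equals their squared correlation, so the variance of the coefficient on X1 is scaled by $1/(1 - r^2)$. The function names are hypothetical, and the calculation treats the estimated error variance as unchanged between the two models, which is only an approximation.

```python
import math


def variance_inflation(r):
    """Factor 1 / (1 - r^2) by which the coefficient variance is scaled.

    r is the sample correlation between the two regressors (assumed input).
    """
    if not -1.0 < r < 1.0:
        raise ValueError("correlation must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


def standard_error_ratio(r):
    """Approximate ratio of standard errors (two-regressor model over
    one-regressor model), holding the estimated error variance fixed."""
    return math.sqrt(variance_inflation(r))


# Correlation between X1 and X2 as reported by the corr command.
r_x1_x2 = 0.9994

vif = variance_inflation(r_x1_x2)
se_ratio = standard_error_ratio(r_x1_x2)

print(f"variance inflation factor: {vif:.1f}")
print(f"approximate standard error ratio: {se_ratio:.1f}")
```

With a correlation this close to one, $1 - r^2$ is tiny, so even a modest change in $r$ moves the factor a great deal. This is the same $(1 - R_1^2)$ term that appears in the variance formula in question 6.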

9. *Research has shown that students attending higher quality colleges and universities tend to have higher wages after graduation than those attending less selective institutions. Using a nationally representative sample of college graduates aged 30-39, researchers regress the natural log of annual earnings ($y_i$) on the average SAT score from the college the respondent attended ($x_{1i}$) using the simple bivariate regression model $y_i = \beta_0 + x_{1i}\beta_1 + \varepsilon_i$. Call this model (1). Let $\hat\beta_1$ be the estimate for $\beta_1$ from model (1). In this case, the author gets the expected result that $\hat\beta_1 > 0$ -- students that graduated from higher quality schools tend to have higher earnings. Someone criticizes model (1) because it does not control for differences in other characteristics of the students that are likely to be correlated with earnings. For example, the author does not have a measure of academic ability for the student, like an SAT score, which they argue should be included in the model. Suppose the author considers estimating model (2), which is of the form $y_i = \beta_0 + x_{1i}\beta_1 + x_{2i}\beta_2 + \varepsilon_i$, where $x_{2i}$ is the student's own SAT score. If Model (2) is the true model, do you anticipate that $\hat\beta_1$, the estimate from model (1), is biased up or down? Explain your answer and, to get full credit, you must provide an appropriate equation.

10. A researcher regresses y on x1 and produces the results below. A colleague argues that the model should also include the covariates x2, x3, and x4, which the colleague argues are strong predictors of y. Below is a matrix that provides the correlation coefficients for the variables x1, x2, x3 and x4. Given these results, do you expect that adding x2, x3 and x4 to the model will change the results much? Assume your colleague is correct that x2, x3 and x4 are strong predictors of y.

Results for Problem 10

. reg y x1

      Source |         SS     df          MS         Number of obs =     398
-------------+------------------------------         F(, 3979)     =   795.9
       Model |  74.739778          74.739778         Prob > F      =  0.0000
    Residual |  874.36848   3979    .9745785         R-squared     =   0.666
-------------+------------------------------         Adj R-squared =   0.664
       Total |    049.086   3980    .6359504         Root MSE      =  .46877

           y |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |     .07398     .00636      8.0    0.000    .0688386      .07959
       _cons |     5.0543   .0353055    44.60    0.000     5.03595      5.7436

. corr x1 x2 x3 x4
(obs=398)

             |       x1       x2       x3       x4
-------------+------------------------------------
          x1 |   1.0000
          x2 |     0.08   1.0000
          x3 |    0.000    0.006   1.0000
          x4 |    0.005   0.0075     -0.0   1.0000
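How much a coefficient moves when covariates are added depends on how strongly those covariates are correlated with the regressor already in the model. The sketch below uses made-up data, not the data behind the output above, and hypothetical function names. It checks the in-sample identity that the slope from the short regression equals the long-regression slope on x1 plus the long-regression slope on x2 times the slope from regressing x2 on x1.

```python
def _demean(values):
    """Subtract the sample mean from every element."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _cross(a, b):
    """Sum of element-wise products of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def short_and_long_slopes(y, x1, x2):
    """Return (short_b1, long_b1, long_b2, delta).

    short_b1: slope from regressing y on x1 only
    long_b1, long_b2: slopes from regressing y on x1 and x2
    delta: slope from regressing x2 on x1
    """
    yd, x1d, x2d = _demean(y), _demean(x1), _demean(x2)
    s11, s22, s12 = _cross(x1d, x1d), _cross(x2d, x2d), _cross(x1d, x2d)
    s1y, s2y = _cross(x1d, yd), _cross(x2d, yd)
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are collinear
    long_b1 = (s22 * s1y - s12 * s2y) / det
    long_b2 = (s11 * s2y - s12 * s1y) / det
    return s1y / s11, long_b1, long_b2, s12 / s11


def make_y(x1, x2):
    """Hypothetical outcome: y = 1 + 2*x1 + 3*x2 with no noise."""
    return [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]


# Made-up regressors: one x2 uncorrelated with x1, one correlated with it.
x1 = [1.0, 2.0, 3.0, 4.0]
cases = {
    "uncorrelated": [1.0, -1.0, -1.0, 1.0],
    "correlated": [1.0, 1.0, 2.0, 3.0],
}

results = {}
for label, x2 in cases.items():
    results[label] = short_and_long_slopes(make_y(x1, x2), x1, x2)
    short_b1, long_b1, long_b2, delta = results[label]
    print(f"{label}: short b1 = {short_b1:.3f}, long b1 = {long_b1:.3f}, "
          f"long b2 = {long_b2:.3f}, delta = {delta:.3f}")
```

In the uncorrelated case the short and long slopes on x1 coincide even though x2 strongly predicts y. In the correlated case they differ by exactly the product of the long slope on x2 and delta.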