Multiple Regression: Inference
1 Multiple Regression: Inference
2 The t-test: is β̂_j big and precise enough? We test the null hypothesis H_0: β_j = 0, i.e. we test that x_j has no effect on y once the other explanatory variables are controlled for. β̂_j will never be exactly 0, but how far is it from 0? We need to weigh the size of the estimate against its sampling error. We define the t-statistic as: t_β̂_j = β̂_j / se(β̂_j). Reject H_0: β_j = 0 if |t| is sufficiently large: the threshold depends on the chosen significance level. Note: we test the hypothesis β_j = 0 about the population parameter, never β̂_j = 0 about the estimate.
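The ratio above can be sketched in a couple of lines of Python (the numbers are hypothetical, not from the slides):

```python
def t_stat(beta_hat: float, se: float) -> float:
    """t statistic for H0: beta_j = 0 -- the estimate weighed against its sampling error."""
    return beta_hat / se

# A hypothetical estimate of 0.50 with standard error 0.25 gives t = 2.0,
# large enough to reject H0 at the 5% level in a large sample (|t| > 1.96).
print(t_stat(0.50, 0.25))  # 2.0
```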
3 Normality provides a benchmark for the t-test. Assumption 6 (Normality): the population error u is independent of the explanatory variables x_1, x_2, …, x_k and is normally distributed with zero mean and variance σ²: u ~ Normal(0, σ²). 4 assumptions => OLS gives us an unbiased estimate of the coefficient. 4+1 assumptions => OLS gives us an unbiased estimate of the variance of the coefficient estimate, and OLS is efficient (BLUE). 4+2 assumptions (= Classical Linear Model assumptions) => the coefficient estimates have a normal distribution, and the OLS estimators are the best estimators: smallest variance among ALL unbiased estimators, not only the linear ones.
4 What if the normality assumption fails? Example: crime data, variable narr86 (use: hist narr86, discrete). [Figure: histogram of narr86; the distribution is far from normal.] Non-normality of the errors will not be a problem if: - large sample size; - log transformation of the dependent variable; - dropping outliers.
5 Distribution of OLS estimators. Gauss-Markov assumptions + normality assumption => the OLS estimators are normally distributed: β̂_j ~ Normal(β_j, Var(β̂_j)), or equivalently (β̂_j − β_j)/sd(β̂_j) ~ Normal(0, 1). So under the CLM assumptions, (β̂_j − β_j)/se(β̂_j) ~ t_{n−k−1}. Careful: this is different from the previous result, which involved the constant σ in sd(β̂_j), while the t-statistic uses the random variable σ̂ through se(β̂_j). Note: normality of the OLS estimators is still approximately true in large samples even without normality of the errors.
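The normality of the slope estimator can be illustrated with a small Monte Carlo sketch in Python (my own illustration; the simple-regression setup and all parameter values are assumptions): with normal errors, the standardized slope (β̂_1 − β_1)/sd(β̂_1) behaves like a standard normal, so roughly 95% of draws fall within ±1.96.

```python
import math
import random

random.seed(42)
beta0, beta1, sigma = 1.0, 0.5, 2.0
x = [i / 10 for i in range(100)]           # fixed regressor values
xbar = sum(x) / len(x)
sxx = sum((xi - xbar) ** 2 for xi in x)    # sum of squared deviations of x
sd_b1 = sigma / math.sqrt(sxx)             # true sd of the slope estimator

inside = 0
reps = 2000
for _ in range(reps):
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    ybar = sum(y) / len(y)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    if abs((b1 - beta1) / sd_b1) <= 1.96:  # standardized estimate in +/- 1.96?
        inside += 1

coverage = inside / reps
print(coverage)  # close to 0.95
```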
6 Testing against two-sided alternatives: null hypothesis H_0: β_j = 0 against H_1: β_j ≠ 0. We need to decide on a significance level, i.e. the probability of rejecting H_0 when it is true. Common choice for the significance level: 5%. When the alternative is two-sided, we are interested in the absolute value of the t-statistic => rejection rule: |t_β̂_j| > c, where the critical value c depends on the significance level and the degrees of freedom (df = n−k−1): when df < 120, see table G2; when df > 120, use the standard normal critical value.
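The two-sided rejection rule is easy to sketch in Python; 1.96 and 2.576 below are the standard normal critical values (5% and 1%, two-tailed) used when df > 120:

```python
def reject_two_sided(t: float, c: float) -> bool:
    """Reject H0: beta_j = 0 against a two-sided alternative when |t| > c."""
    return abs(t) > c

# With df > 120, use standard normal critical values:
print(reject_two_sided(2.30, 1.96))   # True: reject at the 5% level
print(reject_two_sided(2.30, 2.576))  # False: fail to reject at the 1% level
```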
7
8 Example: Correlates of education. Source: WAGE2.dta, Wooldridge. Population model to be estimated: educ = β_0 + β_1 sibs + β_2 feduc + β_3 meduc + β_4 brthord + u

. reg educ sibs meduc feduc brthord

[Stata output not preserved: Number of obs = 663, F(4, 658); Coef., Std. Err., t, P>|t| and 95% Conf. Interval columns for sibs, meduc, feduc, brthord, _cons.]
9 Test significance of one variable: H_0: β_1 = 0. Is the relationship between education and the number of siblings significant? Is β̂_1 significantly different from 0? We can reject the null hypothesis (β_1 = 0) at the 5% significance level: the absolute value of the t-statistic, |t_β̂_1| = |β̂_1 / se(β̂_1)|, exceeds 1.96, the critical value for a two-tailed test at 5% when we have more than 120 degrees of freedom (here n−k−1 = 658).
10 What about the other variables? The coefficients of mother's and father's education are statistically significant at the 1% level and below (the t-stats are larger than 2.576). However, we fail to reject the null hypothesis that birth order has no effect on education at the 1, 5, and 10% significance levels. Note that this last finding suggests that brthord is an irrelevant variable in the regression: taking it out of the regression does not have a strong effect on any of the coefficient estimates (see the regression below).
11 What happens if you drop one irrelevant variable?

. reg educ sibs meduc feduc if brthord != .

[Stata output not preserved: Number of obs = 663, F(3, 659); coefficient table for sibs, meduc, feduc, _cons.]

Note: once the irrelevant variable brthord is not included, the standard error of sibs decreases and the absolute value of its t-stat increases: sibs is now statistically significant at the 1% level.
12 Practical guidance. One can NEVER accept the null hypothesis: with a low t-stat (or a high p-value), we fail to reject the null hypothesis. Statistical significance ≠ (economic) importance: significance goes with the t-statistic t = β̂_j / se(β̂_j), importance with the magnitude of the coefficient β̂_j. If a coefficient is insignificant (low t value), there is no meaningful interpretation of its sign and magnitude => just ignore it. Practical advice: with bigger samples, standard errors decrease, which results in more statistical significance => decrease the significance level to be safe.
13 Testing against one-sided alternatives: H_0: β_j = 0 against H_1: β_j < 0. Here we only care about the alternative H_1: β_j < 0. Why? Introspection, economic theory. We are looking for a sufficiently large negative value of t_β̂_j in order to reject H_0 in favor of H_1 => rejection rule: H_0 is rejected in favor of H_1 if t < −c (equivalently, |t| > c with t negative). Remember: to reject H_0 against the negative alternative, we must get a negative t-statistic.
14 A few things about the critical value. The critical value c is smaller than for the 2-sided test at the same significance level (see table G2). As the significance level falls, the critical value increases; so if H_0 is rejected at the 1% level, it is also rejected at the 5 and 10% levels. Testing H_0 against the alternative hypothesis H_1: β_j > 0 leads to the rejection rule: H_0 is rejected in favor of H_1 if t > c.
15 Testing other hypotheses about β_j: H_0: β_j = a against H_1: β_j ≠ a, where a is the hypothesized ceteris paribus effect of x_j on y. The t-stat can be written as: t = (β̂_j − a) / se(β̂_j). We reject H_0 if |t| > c. Alternatively: reject H_0 if a is not in the 95% confidence interval: β̂_j is then statistically different from a at the 5% significance level. If H_1: β_j > a, reject if t > c. Note: depending on whether the alternative is one-sided or two-sided, c will not be the same, see table G2.
16 Confidence intervals for β_j. Using the fact that (β̂_j − β_j)/se(β̂_j) ~ t_{n−k−1}, a 95% confidence interval for the population parameter β_j is given by: [β̂_j − c·se(β̂_j), β̂_j + c·se(β̂_j)], where the constant c is the 97.5th percentile of a t_{n−k−1} distribution (as before). Ex. (see appendix G2): for df = n−k−1 = 25, a 95% CI is β̂_j ± 2.06·se(β̂_j). When df > 50, we can take c ≈ 2. Application: H_0: β_j = a is rejected at the 5% level if a is not in the 95% CI (the same applies if a is 0).
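A minimal Python sketch of the interval and of the CI-based test (the estimate and standard error below are made up for illustration; 2.060 is the table G2 value for df = 25):

```python
def conf_interval(beta_hat: float, se: float, c: float) -> tuple:
    """95% CI: beta_hat +/- c * se, with c the 97.5th percentile of t_{n-k-1}."""
    return (beta_hat - c * se, beta_hat + c * se)

def reject_at_5pct(a: float, ci: tuple) -> bool:
    """H0: beta_j = a is rejected at the 5% level iff a lies outside the 95% CI."""
    lo, hi = ci
    return not (lo <= a <= hi)

ci = conf_interval(1.04, 0.15, 2.060)  # hypothetical estimate, se; df = 25
print(ci)                              # (0.731, 1.349)
print(reject_at_5pct(0.0, ci))         # True: 0 outside the CI -> reject beta = 0
print(reject_at_5pct(1.0, ci))         # False: 1 inside the CI -> cannot reject beta = 1
```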
17 Example: Rationality of house assessments. Source: HPRICE1.dta, Wooldridge.

. regress lprice lassess

[Stata output not preserved: Number of obs = 88, F(1, 86); coefficient table for lassess and _cons.]

Test whether the elasticity of actual price w.r.t. assessed price is zero, i.e. the assessment has no impact on actual price: H_0: β_1 = 0 against H_1: β_1 ≠ 0.
18 H_0: β_1 = 0 against H_1: β_1 ≠ 0. Compute the t-stat to test H_0: β_1 = 0: t = β̂_1 / se(β̂_1). It is greater than the critical values at any standard significance level => we reject the null hypothesis and conclude that β_1 ≠ 0, i.e. the assessed price does impact the actual price of the house. Alternatively, we can see that 0 is not included in the 95% confidence interval given by Stata for β_1, so we can reject H_0 at the 5% level (at least).
19 H_0: β_1 = 1 against H_1: β_1 ≠ 1. Compute the t-stat to test H_0: β_1 = 1: t = (β̂_1 − 1) / se(β̂_1) = 0.22 => smaller than the critical values at the 1, 5, or even 10% levels (c is approximately 1.7 for a two-tailed test at 10%) => we cannot reject the null hypothesis. Alternatively, we could have looked at the 95% confidence interval in the Stata output: given that 1 lies inside the interval, we cannot reject H_0 at the 5% significance level.
20 Is the price assessment rational? What if we include other characteristics? We estimate the model: lprice = β_0 + β_1 lassess + β_2 llotsize + β_3 lsqrft + β_4 bdrms + u. We think that once the assessed price is controlled for, the other characteristics should not impact the actual price => test 3 null hypotheses: H_0: β_2 = 0; H_0: β_3 = 0; H_0: β_4 = 0. Stata commands:

eststo clear
eststo: reg lprice lassess
eststo: reg lprice lassess llotsize lsqrft bdrms
eststo: reg lprice llotsize lsqrft bdrms
esttab, r2
21 [esttab output for the three regressions above, not preserved.]
22 Interpretation of the results. Is the house assessment rational? I.e. do other house characteristics impact the actual sales price of the house when the assessed price is controlled for? The results in column 2 do not allow us to reject the 3 null hypotheses, and hence provide support for the rational-assessment interpretation. Not surprisingly, the R-squared does not increase much: house characteristics do not explain much more of the variation in sales prices once assessed prices are controlled for. Moreover, looking at column 3, we do find, as one would expect, significant effects of some of the house characteristics on the sales price when the assessed price is not controlled for. Note also that the coefficient of log(sqrft) is negative in column 2 but positive in column 3. We don't have to worry about this, because we only interpret the sign if the coefficient is significant; given that it is insignificant in column 2, the counterintuitive sign there doesn't matter.
23 P-values for t-tests. Given an observed t-statistic, what is the smallest significance level at which H_0 would be rejected? p = P(|T| > |t|): the "p-value for testing H_0: β_j = 0 against the two-sided alternative", with T a t-distributed random variable with n−k−1 df and t the numerical value of the test statistic. It is the probability of observing a t-statistic as large (in absolute value) as we did if the null hypothesis is true => think of it as the probability of rejecting H_0 while H_0 is true. It is the lowest significance level at which you can reject H_0. Note: to obtain the one-sided p-value, just divide the two-sided p-value by 2.
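For large df the t distribution is close to the standard normal, so the two-sided p-value can be approximated in Python with the normal CDF (an approximation for large samples, not what Stata computes exactly for small df):

```python
import math

def two_sided_p_normal(t: float) -> float:
    """Two-sided p-value using the standard-normal approximation,
    appropriate when df = n-k-1 is large (t and normal tails then agree)."""
    phi = 0.5 * (1 + math.erf(abs(t) / math.sqrt(2)))  # standard normal CDF at |t|
    return 2 * (1 - phi)

print(round(two_sided_p_normal(1.96), 3))   # ~0.050: borderline at the 5% level
print(round(two_sided_p_normal(2.576), 3))  # ~0.010
# The one-sided p-value is the two-sided one divided by 2.
```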
24 Testing multiple/joint linear restrictions: the F-test. Testing exclusion restrictions in y = β_0 + β_1 x_1 + β_2 x_2 + β_3 x_3 + β_4 x_4 + β_5 x_5 + u. H_0: β_3 = 0, β_4 = 0, β_5 = 0. H_1: H_0 does not hold, i.e. x_3, x_4 and x_5 combined have an effect on y. We need to test the restrictions jointly. F-test: estimate the model with (= unrestricted) and without (= restricted) x_3, x_4 and x_5, and compare the sums of squared residuals: how much does the SSR increase when we drop these variables? If this increase is big enough, we reject the joint null hypothesis.
25 The F-test. F-statistic: F = [(SSR_r − SSR_ur)/q] / [SSR_ur/(n−k−1)], where q = number of restrictions. Under H_0, and assuming the CLM assumptions hold, F ~ F_{q, n−k−1} => reject H_0 if SSR_r is relatively large compared to SSR_ur, more specifically if F > c, where the critical value c depends on the chosen significance level, the number of restrictions q, and the degrees of freedom (n−k−1) (see table G3). Terminology: if H_0 is rejected, x_3, x_4 and x_5 are jointly statistically significant.
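The F statistic can be computed directly from the two SSRs; a Python sketch, checked against the affairs example discussed later in these slides (SSR_r = 5921, SSR_ur = 5846, q = 3, df = 595):

```python
def f_stat(ssr_r: float, ssr_ur: float, q: int, df_ur: int) -> float:
    """F statistic for q exclusion restrictions; df_ur = n - k - 1 of the
    unrestricted model. Large values mean dropping the variables hurts the fit."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

# With the rounded SSRs this gives ~2.54; the slides report 2.55 from unrounded values.
print(round(f_stat(5921, 5846, 3, 595), 2))
```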
26 Notes on the F-test. The F-stat is always ≥ 0. Even if the t-tests on each coefficient conclude that x_3, x_4 and x_5 are individually statistically insignificant, it may be that x_3, x_4 and x_5 are jointly statistically significant (e.g. due to multicollinearity). Be careful when comparing two models: the same observations should be used => watch out for missing values!
27 Ex: Effect of personal characteristics and marriage characteristics on the number of extramarital affairs. Source: affairs.dta (Wooldridge), data originally used in R.C. Fair (1978), "A Theory of Extramarital Affairs," Journal of Political Economy 86, 45-61.

. desc naffairs age educ occup yrsmarr ratemarr

[describe output; variable labels:]
naffairs: number of affairs within last year
age: in years
educ: years schooling
occup: occupation, reverse Hollingshead scale
yrsmarr: years married
ratemarr: 5 = vry hap marr, 4 = hap than avg, 3 = avg, 2 = smewht unhap, 1 = vry unhap

. sum naffairs age educ occup yrsmarr ratemarr

[summarize output (Obs, Mean, Std. Dev., Min, Max) not preserved.]
28
. tab naffairs
. tab ratemarr

[tabulation output (Freq., Percent, Cum.) not preserved.]
29
. regress naffairs age educ occup yrsmarr ratemarr

[Stata output not preserved: Number of obs = 601, F(5, 595); coefficient table for age, educ, occup, yrsmarr, ratemarr, _cons.]

. regress naffairs yrsmarr ratemarr

[Stata output not preserved: Number of obs = 601, F(2, 598); coefficient table for yrsmarr, ratemarr, _cons.]
30
. esttab, r2 scalars(rss df_r)

                     (1)            (2)
                naffairs       naffairs
age                    *
                 (-2.46)
educ
                 (-0.01)
occup
                  (1.50)
yrsmarr         0.144***             **
                  (3.87)         (3.15)
ratemarr             ***            ***
                 (-6.26)        (-6.20)
_cons           4.529***       3.769***
                  (4.25)         (6.65)
N
R-sq
rss
df_r
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

[Negative coefficient values and the N, R-sq, rss and df_r entries were not preserved.]
31 Interpretation: these two regressions allow you to test whether individual characteristics jointly have an effect on the number of affairs a person has in a year. In other words, are affairs mainly explained by characteristics of the marriage itself, or do individual characteristics play a role? H_0: β_1 = 0, β_2 = 0, β_3 = 0. To figure this out, we estimate a regression including both individual and marriage characteristics (the unrestricted model), and one with only marriage characteristics (the restricted model). We use an F-test to test the exclusion restrictions. We obtain SSR_ur = 5846, SSR_r = 5921, q = 3 (number of restrictions), n−k−1 = 595 (degrees of freedom): F = [(SSR_r − SSR_ur)/q] / [SSR_ur/(n−k−1)] = 2.55. The critical value for q = 3 and n−k−1 = 595 at the 5% level is 2.60, hence we cannot reject the null hypothesis at the 5% significance level. We can however reject it at the 10% level (critical value 2.08).
32 Notes on the F-test. R-squared form of the F-stat: because SSR = SST(1 − R²), F = [(R²_ur − R²_r)/q] / [(1 − R²_ur)/(n−k−1)]. P-values for the F-test: the probability of observing a value of F as large as we did, given that H_0 is true => to reject H_0, the p-value has to be low. F-stat for the overall significance of a regression: H_0: β_1 = β_2 = … = β_k = 0. Non-zero hypotheses can be incorporated in an F-test. An F-test for 1 restriction (β_j = 0) is equivalent to a two-sided t-test (F = t²).
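Because SSR = SST(1 − R²), the SSR form and the R² form give the same F. A quick consistency check in Python on made-up numbers:

```python
def f_from_ssr(ssr_r: float, ssr_ur: float, q: int, df_ur: int) -> float:
    """SSR form of the F statistic."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)

def f_from_r2(r2_ur: float, r2_r: float, q: int, df_ur: int) -> float:
    """R-squared form: valid because SSR = SST * (1 - R^2) with the same SST."""
    return ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df_ur)

# Made-up example: SST = 100, R^2 falls from 0.40 to 0.35 after q = 2 restrictions.
sst, r2_ur, r2_r = 100.0, 0.40, 0.35
ssr_ur, ssr_r = sst * (1 - r2_ur), sst * (1 - r2_r)
print(round(f_from_ssr(ssr_r, ssr_ur, 2, 90), 6))  # 3.75
print(round(f_from_r2(r2_ur, r2_r, 2, 90), 6))     # 3.75: the two forms agree
```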
33 Ex. 2: Effect of mother's and father's education on wage.

. eststo clear
. eststo: regress wage educ IQ meduc feduc

[Stata output not preserved: Number of obs = 722, F(4, 717); coefficient table for educ, IQ, meduc, feduc, _cons.]

. test (meduc=0) (feduc=0)
 ( 1)  meduc = 0
 ( 2)  feduc = 0
       F(  2,   717) =    4.26
            Prob > F =  0.0145
34 Interpretation: we want to test whether mother's and father's education have a jointly significant effect on the wage of the child. H_0: β_meduc = 0, β_feduc = 0. We estimate the unrestricted model and ask Stata to do the F-test. Stata indicates there are 2 restrictions and 717 degrees of freedom, and the calculated F-value is 4.26. It also reports the p-value of the F-test, 0.0145. Hence we can reject the null hypothesis at the 5% level but not at the 1% level. To be more precise, the probability of observing an F-value of 4.26 when the null hypothesis holds is 1.45%. Note that the significance of this test is much higher than for the individual t-tests of these parameters; this can be explained by multicollinearity. We could have obtained the same result by estimating the restricted model and using the R-squared form of the F-statistic. Note that we need to be careful to estimate the restricted model on the same observations (i.e. excluding those for which meduc or feduc have missing values, see below).
35
. eststo: regress wage educ IQ if e(sample)

[Stata output not preserved: Number of obs = 722, F(2, 719).]

. esttab, r2 scalars(rss df_r)

                     (1)            (2)
                    wage           wage
educ            32.32***       39.66***
                  (4.09)         (5.27)
IQ              4.717***       5.338***
                  (4.08)         (4.68)
meduc
                  (1.25)
feduc
                  (1.71)
_cons
                 (-1.17)        (-1.03)
N                    722            722
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

[Some coefficient values and the R-sq, rss and df_r rows were not preserved.]

Using the R-squared form, F = [(R²_ur − R²_r)/2] / [(1 − R²_ur)/717] = 4.25, which is larger than 3.00, the critical value at 5%. The regression output also shows that the F-value for the test of the overall significance of the regression is very high; indeed its p-value shows significance at very low levels.
36 Test H_0: a linear combination of the parameters = 0. H_0: β_1 = β_2 against H_1: β_1 ≠ β_2. Method 1: test H_0: β_1 − β_2 = 0. We need the t-stat: t = (β̂_1 − β̂_2) / se(β̂_1 − β̂_2) => we need to compute the denominator: se(β̂_1 − β̂_2) = [se(β̂_1)² + se(β̂_2)² − 2 s_12]^(1/2), with s_12 the estimate of the covariance between β̂_1 and β̂_2. Note: se(β̂_1 − β̂_2) ≠ se(β̂_1) − se(β̂_2).
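A Python sketch of the standard error of the difference and the resulting t statistic (all numbers hypothetical, for illustration only):

```python
import math

def se_diff(se1: float, se2: float, s12: float) -> float:
    """Standard error of (beta1_hat - beta2_hat); s12 estimates Cov(beta1_hat, beta2_hat).
    Note it is NOT se1 - se2."""
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2 * s12)

def t_diff(b1: float, b2: float, se1: float, se2: float, s12: float) -> float:
    """t statistic for H0: beta1 = beta2."""
    return (b1 - b2) / se_diff(se1, se2, s12)

# Hypothetical estimates with positively correlated sampling errors:
print(round(t_diff(0.30, 0.10, 0.08, 0.07, 0.002), 2))  # 2.34
```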
37 Ex 2b: Effect of mother's and father's education on wage.

. regress wage educ IQ meduc feduc

[Stata output not preserved: Number of obs = 722, F(4, 717); coefficient table for educ, IQ, meduc, feduc, _cons.]

. test meduc = feduc
 ( 1)  meduc - feduc = 0
       F(  1,   717) =    0.02
            Prob > F =
38 Ex 3: Effect of years of tenure in a company and years of experience on wage: wage = β_0 + β_1 exper + β_2 tenure + β_3 educ + β_4 IQ + u

. regress wage exper tenure educ IQ

[Stata output not preserved: Number of obs = 935, F(4, 930); coefficient table for exper, tenure, educ, IQ, _cons.]

. test exper == tenure
 ( 1)  exper - tenure = 0
       F(  1,   930) =    2.84
            Prob > F =
39 Test H_0: a linear combination of parameters = 0 (2). H_0: β_1 = β_2 against H_1: β_1 ≠ β_2. Method 2: define θ_1 = β_1 − β_2 and test H_0: θ_1 = 0 versus H_1: θ_1 ≠ 0 => use a standard t-test. We need to redefine the variables: given that β_1 = θ_1 + β_2, the model y = β_0 + β_1 x_1 + β_2 x_2 + … + β_k x_k + u becomes y = β_0 + θ_1 x_1 + β_2 (x_1 + x_2) + … + β_k x_k + u. Estimating this model allows us to test H_0: θ_1 = 0 with a simple t-test on one variable (here x_1).
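The reparameterization can be verified numerically. The sketch below (my own illustration, with made-up noise-free data so the fit is exact) runs OLS via the normal equations and shows that the coefficient on x_1 in the reparameterized regression equals β_1 − β_2:

```python
def ols(X, y):
    """OLS via the normal equations (X'X) b = X'y, solved by Gaussian elimination.
    Minimal sketch for illustration; fine here because X'X is positive definite."""
    n, k = len(X), len(X[0])
    A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
    b = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    for p in range(k):                      # forward elimination
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for q in range(p, k):
                A[r][q] -= f * A[p][q]
            b[r] -= f * b[p]
    coef = [0.0] * k
    for p in range(k - 1, -1, -1):          # back substitution
        coef[p] = (b[p] - sum(A[p][q] * coef[q] for q in range(p + 1, k))) / A[p][p]
    return coef

# Exact data with beta1 = 2, beta2 = 5, so theta1 = beta1 - beta2 = -3.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [3, 1, 4, 1, 5, 9]
y = [10 + 2 * a + 5 * b for a, b in zip(x1, x2)]

X_orig = [[1, a, b] for a, b in zip(x1, x2)]        # y on (1, x1, x2)
X_repar = [[1, a, a + b] for a, b in zip(x1, x2)]   # y on (1, x1, x1 + x2)
theta1 = ols(X_repar, y)[1]
print(round(theta1, 6))                             # -3.0: coefficient on x1 is beta1 - beta2
b_orig = ols(X_orig, y)
print(round(b_orig[1] - b_orig[2], 6))              # -3.0: same difference from the original model
```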
40 Ex 3b: Effect of years of tenure in a company and years of experience on wage. The new model estimated, using method 2, is: wage = β_0 + θ_1 exper + β_2 (exper + tenure) + β_3 educ + β_4 IQ + u

. gen sum = exper + tenure
. regress wage exper sum educ IQ

[Stata output not preserved: Number of obs = 935, F(4, 930); coefficient table for exper, sum, educ, IQ, _cons.]
41 Interpretation: we have reformulated the problem so that we can look at the t-test on the first variable (exper) and use it to test the null hypothesis. The t-test shows we can reject the null hypothesis at the 10% level but not at the 5% level. Note that the squared t-statistic equals the F-statistic shown earlier: t² ≈ 2.84, so t ≈ 1.69. Of course, we should be careful when interpreting the coefficient estimates: to obtain the total effect of experience, we add up the coefficients of exper and sum. This gives the same effect as previously.
42 Can we use regression outputs to test joint hypotheses?
43 Can we test whether log(lotsize), log(sqrft) and bdrms jointly have a significant effect, once the assessed price is controlled for? Yes: given that we have the R-squared of the restricted and the unrestricted models, and the number of observations, we can calculate the R-squared form of the F-test. Can we test whether price assessments are rational, when rationality is defined as β_1 = 1 together with no effect of the other characteristics? The rationality hypothesis can be restated in these terms: a 1% change in assess would be associated with a 1% change in price, that is β_1 = 1; in addition, lotsize, sqrft, and bdrms should not help to explain log(price) once the assessed value has been controlled for. Answer: no, we would need access to the data, because one of the coefficients is hypothesized not to equal zero.
44 How can we test the latter joint hypothesis? There are four restrictions to be tested; three are exclusion restrictions, but β_1 = 1 is not. How can we test this hypothesis using the F-stat? => estimate the unrestricted and restricted models. Unrestricted model: y = β_0 + β_1 x_1 + β_2 x_2 + β_3 x_3 + β_4 x_4 + u, versus the restricted model under H_0: y = β_0 + x_1 + u, i.e. we regress y − x_1 on a constant only. The F-stat is simply: F = [(SSR_r − SSR_ur)/4] / [SSR_ur/(n−5)], with n − 5 = 83. The 5% critical value of an F distribution with (4, 83) df is about 2.50, so we fail to reject H_0: there is no evidence that the assessed values are not rational.
More informationEconometrics Midterm Examination Answers
Econometrics Midterm Examination Answers March 4, 204. Question (35 points) Answer the following short questions. (i) De ne what is an unbiased estimator. Show that X is an unbiased estimator for E(X i
More informationNonlinear Regression Functions
Nonlinear Regression Functions (SW Chapter 8) Outline 1. Nonlinear regression functions general comments 2. Nonlinear functions of one variable 3. Nonlinear functions of two variables: interactions 4.
More informationGeneral Linear Model (Chapter 4)
General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients
More informationMultiple Regression Analysis: Heteroskedasticity
Multiple Regression Analysis: Heteroskedasticity y = β 0 + β 1 x 1 + β x +... β k x k + u Read chapter 8. EE45 -Chaiyuth Punyasavatsut 1 topics 8.1 Heteroskedasticity and OLS 8. Robust estimation 8.3 Testing
More informationEssential of Simple regression
Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship
More informationComputer Exercise 3 Answers Hypothesis Testing
Computer Exercise 3 Answers Hypothesis Testing. reg lnhpay xper yearsed tenure ---------+------------------------------ F( 3, 6221) = 512.58 Model 457.732594 3 152.577531 Residual 1851.79026 6221.297667619
More informationLecture 8: Functional Form
Lecture 8: Functional Form What we know now OLS - fitting a straight line y = b 0 + b 1 X through the data using the principle of choosing the straight line that minimises the sum of squared residuals
More informationCourse Econometrics I
Course Econometrics I 3. Multiple Regression Analysis: Binary Variables Martin Halla Johannes Kepler University of Linz Department of Economics Last update: April 29, 2014 Martin Halla CS Econometrics
More informationSimultaneous Equations with Error Components. Mike Bronner Marko Ledic Anja Breitwieser
Simultaneous Equations with Error Components Mike Bronner Marko Ledic Anja Breitwieser PRESENTATION OUTLINE Part I: - Simultaneous equation models: overview - Empirical example Part II: - Hausman and Taylor
More informationTests of Linear Restrictions
Tests of Linear Restrictions 1. Linear Restricted in Regression Models In this tutorial, we consider tests on general linear restrictions on regression coefficients. In other tutorials, we examine some
More information1 A Non-technical Introduction to Regression
1 A Non-technical Introduction to Regression Chapters 1 and Chapter 2 of the textbook are reviews of material you should know from your previous study (e.g. in your second year course). They cover, in
More information(a) Briefly discuss the advantage of using panel data in this situation rather than pure crosssections
Answer Key Fixed Effect and First Difference Models 1. See discussion in class.. David Neumark and William Wascher published a study in 199 of the effect of minimum wages on teenage employment using a
More information1 The basics of panel data
Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Related materials: Steven Buck Notes to accompany fixed effects material 4-16-14 ˆ Wooldridge 5e, Ch. 1.3: The Structure of Economic Data ˆ Wooldridge
More informationEconomics 345: Applied Econometrics Section A01 University of Victoria Midterm Examination #2 Version 1 SOLUTIONS Fall 2016 Instructor: Martin Farnham
Economics 345: Applied Econometrics Section A01 University of Victoria Midterm Examination #2 Version 1 SOLUTIONS Fall 2016 Instructor: Martin Farnham Last name (family name): First name (given name):
More informationECON3150/4150 Spring 2016
ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression
More informationECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests
ECON4150 - Introductory Econometrics Lecture 7: OLS with Multiple Regressors Hypotheses tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 7 Lecture outline 2 Hypothesis test for single
More informationIntroductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1
Introductory Econometrics Lecture 13: Hypothesis testing in the multiple regression model, Part 1 Jun Ma School of Economics Renmin University of China October 19, 2016 The model I We consider the classical
More informationProblem Set 10: Panel Data
Problem Set 10: Panel Data 1. Read in the data set, e11panel1.dta from the course website. This contains data on a sample or 1252 men and women who were asked about their hourly wage in two years, 2005
More informationOrdinary Least Squares (OLS): Multiple Linear Regression (MLR) Analytics What s New? Not Much!
Ordinary Least Squares (OLS): Multiple Linear Regression (MLR) Analytics What s New? Not Much! OLS: Comparison of SLR and MLR Analysis Interpreting Coefficients I (SRF): Marginal effects ceteris paribus
More informationUniversity of California at Berkeley Fall Introductory Applied Econometrics Final examination. Scores add up to 125 points
EEP 118 / IAS 118 Elisabeth Sadoulet and Kelly Jones University of California at Berkeley Fall 2008 Introductory Applied Econometrics Final examination Scores add up to 125 points Your name: SID: 1 1.
More informationRecent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data
Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)
More informationTable 1: Fish Biomass data set on 26 streams
Math 221: Multiple Regression S. K. Hyde Chapter 27 (Moore, 5th Ed.) The following data set contains observations on the fish biomass of 26 streams. The potential regressors from which we wish to explain
More informationLab 07 Introduction to Econometrics
Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple
More informationRegression with a Single Regressor: Hypothesis Tests and Confidence Intervals
Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression
More informationHypothesis Tests and Confidence Intervals. in Multiple Regression
ECON4135, LN6 Hypothesis Tests and Confidence Intervals Outline 1. Why multipple regression? in Multiple Regression (SW Chapter 7) 2. Simpson s paradox (omitted variables bias) 3. Hypothesis tests and
More informationECON3150/4150 Spring 2016
ECON3150/4150 Spring 2016 Lecture 6 Multiple regression model Siv-Elisabeth Skjelbred University of Oslo February 5th Last updated: February 3, 2016 1 / 49 Outline Multiple linear regression model and
More informationSpecification Error: Omitted and Extraneous Variables
Specification Error: Omitted and Extraneous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 5, 05 Omitted variable bias. Suppose that the correct
More informationIntroduction to Econometrics. Review of Probability & Statistics
1 Introduction to Econometrics Review of Probability & Statistics Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com Introduction 2 What is Econometrics? Econometrics consists of the application of mathematical
More informationLecture 3: Multivariate Regression
Lecture 3: Multivariate Regression Rates, cont. Two weeks ago, we modeled state homicide rates as being dependent on one variable: poverty. In reality, we know that state homicide rates depend on numerous
More informationBrief Suggested Solutions
DEPARTMENT OF ECONOMICS UNIVERSITY OF VICTORIA ECONOMICS 366: ECONOMETRICS II SPRING TERM 5: ASSIGNMENT TWO Brief Suggested Solutions Question One: Consider the classical T-observation, K-regressor linear
More informationMultiple Regression Analysis: Estimation. Simple linear regression model: an intercept and one explanatory variable (regressor)
1 Multiple Regression Analysis: Estimation Simple linear regression model: an intercept and one explanatory variable (regressor) Y i = β 0 + β 1 X i + u i, i = 1,2,, n Multiple linear regression model:
More informationLab 6 - Simple Regression
Lab 6 - Simple Regression Spring 2017 Contents 1 Thinking About Regression 2 2 Regression Output 3 3 Fitted Values 5 4 Residuals 6 5 Functional Forms 8 Updated from Stata tutorials provided by Prof. Cichello
More informationThe simple linear regression model discussed in Chapter 13 was written as
1519T_c14 03/27/2006 07:28 AM Page 614 Chapter Jose Luis Pelaez Inc/Blend Images/Getty Images, Inc./Getty Images, Inc. 14 Multiple Regression 14.1 Multiple Regression Analysis 14.2 Assumptions of the Multiple
More informationProblem C7.10. points = exper.072 exper guard forward (1.18) (.33) (.024) (1.00) (1.00)
BOSTON COLLEGE Department of Economics EC 228 02 Econometric Methods Fall 2009, Prof. Baum, Ms. Phillips (TA), Ms. Pumphrey (grader) Problem Set 5 Due Tuesday 10 November 2009 Total Points Possible: 160
More informationMultiple Regression Analysis. Part III. Multiple Regression Analysis
Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant
More informationAcknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression
INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical
More informationBinary Dependent Variables
Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome
More informationLab 11 - Heteroskedasticity
Lab 11 - Heteroskedasticity Spring 2017 Contents 1 Introduction 2 2 Heteroskedasticity 2 3 Addressing heteroskedasticity in Stata 3 4 Testing for heteroskedasticity 4 5 A simple example 5 1 1 Introduction
More informationECONOMICS AND ECONOMIC METHODS PRELIM EXAM Statistics and Econometrics August 2013
ECONOMICS AND ECONOMIC METHODS PRELIM EXAM Statistics and Econometrics August 2013 Instructions: Answer all six (6) questions. Point totals for each question are given in parentheses. The parts within
More informationQuestion 1 carries a weight of 25%; Question 2 carries 20%; Question 3 carries 20%; Question 4 carries 35%.
UNIVERSITY OF EAST ANGLIA School of Economics Main Series PGT Examination 017-18 ECONOMETRIC METHODS ECO-7000A Time allowed: hours Answer ALL FOUR Questions. Question 1 carries a weight of 5%; Question
More information1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e
Economics 102: Analysis of Economic Data Cameron Spring 2016 Department of Economics, U.C.-Davis Final Exam (A) Tuesday June 7 Compulsory. Closed book. Total of 58 points and worth 45% of course grade.
More informationLecture#12. Instrumental variables regression Causal parameters III
Lecture#12 Instrumental variables regression Causal parameters III 1 Demand experiment, market data analysis & simultaneous causality 2 Simultaneous causality Your task is to estimate the demand function
More informationInference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58
Inference ME104: Linear Regression Analysis Kenneth Benoit August 15, 2012 August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Stata output resvisited. reg votes1st spend_total incumb minister
More informationApplied Quantitative Methods II
Applied Quantitative Methods II Lecture 4: OLS and Statistics revision Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 4 VŠE, SS 2016/17 1 / 68 Outline 1 Econometric analysis Properties of an estimator
More informationLECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit
LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define
More informationExam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h.
Exam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h. This is an open book examination where all printed and written resources, in addition to a calculator, are allowed. If you are
More informationAt this point, if you ve done everything correctly, you should have data that looks something like:
This homework is due on July 19 th. Economics 375: Introduction to Econometrics Homework #4 1. One tool to aid in understanding econometrics is the Monte Carlo experiment. A Monte Carlo experiment allows
More informationECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47
ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 7 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 68 Outline of Lecture 7 1 Empirical example: Italian labor force
More informationSolutions to Problem Set 5 (Due November 22) Maximum number of points for Problem set 5 is: 220. Problem 7.3
Solutions to Problem Set 5 (Due November 22) EC 228 02, Fall 2010 Prof. Baum, Ms Hristakeva Maximum number of points for Problem set 5 is: 220 Problem 7.3 (i) (5 points) The t statistic on hsize 2 is over
More information