Introduction to Econometrics. Multiple Regression


1 Introduction to Econometrics: The statistical analysis of economic (and related) data. STATS301, Multiple Regression. Titulaire: Christopher Bruffaerts. Assistant: Lorenzo Ricci

2 OLS estimate of the TestScore/STR relation: TestScore-hat = 698.9 − 2.28×STR, R² = 0.05, SER = 18.6, with standard errors (10.4) and (0.52); the sample standard deviation of TestScore is 19.1. Is this a credible estimate of the causal effect on test scores of a change in the student-teacher ratio? No: there are omitted confounding factors, such as family income and the share of native English speakers, that bias the OLS estimator. STR could be picking up the effect of these confounding factors.

3 Omitted Variable Bias. The bias in the OLS estimator that occurs as a result of an omitted factor is called omitted variable bias. For omitted variable bias to occur, the omitted factor Z must be: 1. a determinant of Y; and 2. correlated with the regressor X. Both conditions must hold for the omission of Z to result in omitted variable bias.

4 Omitted Variable Bias. In the test score example: 1. English language ability (whether the student has English as a second language) plausibly affects test scores, so Z is a determinant of Y. 2. Immigrant communities tend to be less affluent and thus have smaller school budgets and higher STR, so Z is correlated with X. Accordingly, β̂1 is biased. What is the direction of this bias? What does common sense suggest? If common sense fails you, there is a formula...

5 A formula for omitted variable bias:

β̂1 − β1 = [Σᵢ (Xi − X̄)ui] / [Σᵢ (Xi − X̄)²] = [(1/n) Σᵢ vi] / [((n−1)/n) s²X]

where vi = (Xi − X̄)ui ≈ (Xi − μX)ui. Under Least Squares Assumption #1, E[(Xi − μX)ui] = cov(Xi, ui) = 0, and hence E[β̂1 − β1] = 0. But what if E[(Xi − μX)ui] = cov(Xi, ui) = σXu ≠ 0?

6 A formula for omitted variable bias (continued):

β̂1 − β1 = [(1/n) Σᵢ (Xi − X̄)ui] / [((n−1)/n) s²X]

so E[β̂1] − β1 ≈ σXu/σ²X = (σu/σX)×(σXu/(σXσu)), where the approximation holds with equality when n is large; specifically,

β̂1 →p β1 + (σu/σX)×ρXu, where ρXu = corr(X, u)

7 A formula for omitted variable bias:

β̂1 →p β1 + (σu/σX)×ρXu

If an omitted factor Z is both (1) a determinant of Y (that is, it is contained in u) and (2) correlated with X, then ρXu ≠ 0 and the OLS estimator β̂1 is biased. The math makes precise the idea that districts with few ESL students (1) do better on standardized tests and (2) have smaller classes (bigger budgets), so ignoring the ESL factor results in overstating the class size effect. Is this actually what is going on in the CA data?
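The omitted-variable-bias formula can be illustrated with a short simulation. This is a sketch on made-up data (not the California dataset): Z is a determinant of Y and is correlated with X, and regressing Y on X alone yields a badly biased slope, consistent with β̂1 →p β1 + (σu/σX)ρXu.

```python
import numpy as np

# Illustrative simulation of omitted variable bias (invented numbers):
# Y depends on X and on an omitted factor Z that is correlated with X.
rng = np.random.default_rng(0)
n = 100_000
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)                    # X correlated with Z
y = 2.0 - 1.0 * x + 3.0 * z + rng.normal(size=n)    # true beta1 = -1.0

# OLS of Y on X alone: slope = cov(X, Y) / var(X).
# The composite error is u = 3Z + e, so cov(X, u) = 3*cov(X, Z) = 2.4
# and the probability limit of the slope is -1.0 + 2.4/var(X).
beta1_hat = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
print(beta1_hat)
```

Here the bias is large enough to flip the sign of the estimated slope, just as ignoring the ESL share overstates the class size effect.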

8 Districts with fewer English Learners have higher test scores. Districts with lower percent EL (PctEL) have smaller classes. Among districts with comparable PctEL, the effect of class size is small (recall the overall test score gap = 7.4).

9 Three ways to overcome omitted variable bias. 1. Run a randomized controlled experiment in which treatment (STR) is randomly assigned: then PctEL is still a determinant of TestScore, but PctEL is uncorrelated with STR. (But this is unrealistic in practice.) 2. Adopt the cross tabulation approach, with finer gradations of STR and PctEL. (But soon we will run out of data, and what about other determinants like family income and parental education?) 3. Use a method in which the omitted variable (PctEL) is no longer omitted: include PctEL as an additional regressor in a multiple regression.

10 Multiple Regression

11 The Population Multiple Regression Model. Consider the case of two regressors:

Yi = β0 + β1X1i + β2X2i + ui, i = 1, ..., n

X1, X2 are the two independent variables (regressors); (Yi, X1i, X2i) denote the ith observation on Y, X1, and X2. β0 = unknown population intercept; β1 = effect on Y of a change in X1, holding X2 constant; β2 = effect on Y of a change in X2, holding X1 constant; ui = error term (omitted factors).

12 Interpretation of multiple regression coefficients. Yi = β0 + β1X1i + β2X2i + ui, i = 1, ..., n. Consider changing X1 by ΔX1 while holding X2 constant. Before: Y = β0 + β1X1 + β2X2. After: Y + ΔY = β0 + β1(X1 + ΔX1) + β2X2. Difference: ΔY = β1ΔX1. That is: β1 = ΔY/ΔX1, holding X2 constant; β2 = ΔY/ΔX2, holding X1 constant; β0 = predicted value of Y when X1 = X2 = 0.
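The partial-effect interpretation can be checked with two predicted values. A tiny sketch with hypothetical coefficient values (β0 = 5, β1 = −2, β2 = 0.5 are invented for illustration):

```python
# Numeric check of the "holding X2 constant" interpretation.
beta0, beta1, beta2 = 5.0, -2.0, 0.5    # hypothetical population coefficients

def predict(x1, x2):
    return beta0 + beta1 * x1 + beta2 * x2

x1, x2, dx1 = 3.0, 10.0, 1.5
dy = predict(x1 + dx1, x2) - predict(x1, x2)   # change X1, hold X2 fixed
print(dy, beta1 * dx1)                         # both equal -3.0
```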

13 The OLS Estimator in Multiple Regression. With two regressors, the OLS estimator solves:

min over b0, b1, b2 of Σᵢ [Yi − (b0 + b1X1i + b2X2i)]²

The OLS estimator minimizes the average squared difference between the actual values of Yi and the prediction (predicted value) based on the estimated line. This minimization problem is solved using calculus. The result is the OLS estimators of β0, β1, and β2.
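The calculus solution reduces to the normal equations (X′X)b = X′y. A minimal numpy sketch on simulated data (the coefficients and sample size are invented):

```python
import numpy as np

# Minimal OLS with two regressors via the normal equations (X'X)b = X'y.
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])   # include the intercept column
b = np.linalg.solve(X.T @ X, X.T @ y)       # OLS estimates (b0, b1, b2)
print(b)                                    # close to (1.0, 2.0, -3.0)
```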

14 Example: the California test score data. Regression of TestScore against STR: TestScore-hat = 698.9 − 2.28×STR. Now include percent English Learners in the district (PctEL): TestScore-hat = 686.0 − 1.10×STR − 0.65×PctEL. What happens to the coefficient on STR? Why? Note: corr(STR, PctEL) = 0.19.

15 Multiple regression in STATA

reg testscr str pctel, robust

Regression with robust standard errors    Number of obs = 420
                                          F(2, 417)     = 223.82
                                          Prob > F      = 0.0000
                                          R-squared     = 0.4264
                                          Root MSE      = 14.46

           |              Robust
   testscr |      Coef.   Std. Err.
-----------+------------------------
       str |      -1.10        0.43
     pctel |     -0.650       0.031
     _cons |      686.0         8.7

TestScore-hat = 686.0 − 1.10×STR − 0.650×PctEL

What are the sampling distributions of β̂1 and β̂2?
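What `reg ..., robust` reports can be approximated outside Stata. Below is a hedged numpy sketch of heteroskedasticity-robust (HC1) sandwich standard errors on simulated data; it illustrates the formula, it does not replicate the California regression:

```python
import numpy as np

# Heteroskedasticity-robust (HC1) standard errors via the sandwich formula,
# on invented data with deliberately heteroskedastic errors.
rng = np.random.default_rng(2)
n = 400
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n) * (1 + np.abs(x1))      # error variance depends on x1
y = 1.0 + 2.0 * x1 + 0.5 * x2 + u

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ b
k = X.shape[1]

XtX_inv = np.linalg.inv(X.T @ X)
meat = (X * resid[:, None] ** 2).T @ X          # sum_i u_i^2 * x_i x_i'
V = n / (n - k) * XtX_inv @ meat @ XtX_inv      # HC1 covariance matrix
se_robust = np.sqrt(np.diag(V))
print(b, se_robust)
```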

16 Least Squares Assumptions. Yi = β0 + β1X1i + β2X2i + ... + βkXki + ui, i = 1, ..., n.
1. The conditional distribution of u given the X's has mean zero, that is, E(u | X1 = x1, ..., Xk = xk) = 0.
2. (X1i, ..., Xki, Yi), i = 1, ..., n, are i.i.d.
3. X1, ..., Xk, and u have finite fourth moments: E(X1i⁴) < ∞, ..., E(Xki⁴) < ∞, E(ui⁴) < ∞.
4. There is no perfect multicollinearity.

17 Assumption #1: E(u | X1 = x1, ..., Xk = xk) = 0. This has the same interpretation as in regression with a single regressor. If an omitted variable (1) belongs in the equation (so is in u) and (2) is correlated with an included X, then this condition fails. Failure of this condition leads to omitted variable bias. The solution, if possible, is to include the omitted variable in the regression.

18 Assumption #2: (X1i, ..., Xki, Yi), i = 1, ..., n, are i.i.d. This is satisfied automatically if the data are collected by simple random sampling.

19 Assumption #3: finite fourth moments. This technical assumption is satisfied automatically by variables with a bounded domain (test scores, PctEL, etc.).

20 Assumption #4: no perfect multicollinearity. Perfect multicollinearity is when one of the regressors is an exact linear function of the other regressors. Perfect multicollinearity usually reflects a mistake in the definitions of the regressors, or an oddity in the data.

21 Example: suppose you accidentally include STR twice:

regress testscr str str, robust

Regression with robust standard errors    Number of obs = 420
                                          F(1, 418)     = 19.26
                                          Prob > F      = 0.0000
                                          R-squared     = 0.0512
                                          Root MSE      = 18.58

           |              Robust
   testscr |      Coef.   Std. Err.
-----------+------------------------
       str |      -2.28        0.52
       str |  (dropped)
     _cons |      698.9        10.4

22 Perfect multicollinearity. In the previous regression, β1 is the effect on TestScore of a unit change in STR, holding STR constant (???). Second example: regress TestScore on a constant, D, and B, where Di = 1 if STR ≤ 20, = 0 otherwise, and Bi = 1 if STR > 20, = 0 otherwise, so Bi = 1 − Di and there is perfect multicollinearity. Would there be perfect multicollinearity if the intercept (constant) were somehow dropped (that is, omitted or suppressed) in the regression?
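Perfect multicollinearity shows up as rank deficiency of the regressor matrix, which is why a statistics package must drop a column. A small numpy check with a simulated stand-in for STR (the data here are invented):

```python
import numpy as np

# Including STR twice makes the regressor matrix rank-deficient,
# so X'X is singular and the normal equations have no unique solution.
rng = np.random.default_rng(3)
str_ = rng.uniform(15.0, 25.0, size=100)
X = np.column_stack([np.ones(100), str_, str_])   # constant, STR, STR again

rank = np.linalg.matrix_rank(X)
print(rank)   # 2, not 3: one column is an exact linear function of another
```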

23 The Sampling Distribution of the OLS Estimator. Under the four Least Squares Assumptions:
1. The exact (finite sample) distribution of β̂1 has mean β1, and var(β̂1) is inversely proportional to n; so too for β̂2.
2. Other than its mean and variance, the exact distribution of β̂1 is very complicated.
3. β̂1 is consistent: β̂1 →p β1 (law of large numbers).
4. (β̂1 − E(β̂1))/√var(β̂1) is approximately distributed N(0, 1) (CLT).
5. The same holds for β̂2, ..., β̂k.

24 Hypothesis Testing in Multiple Regression

25 Hypothesis Tests and Confidence Intervals for a Single Coefficient in Multiple Regression. (β̂1 − E(β̂1))/√var(β̂1) is approximately distributed N(0, 1) (CLT). Thus hypotheses on β1 can be tested using the usual t-statistic, and confidence intervals are constructed as CI95%(β1) = [β̂1 ± 1.96×SE(β̂1)]. The same holds for β2, ..., βk. Note that β̂1 and β̂2 are generally not independently distributed, so neither are their t-statistics (more on this later).

26 Example: the California class size data.
(1) TestScore-hat = 698.9 − 2.28×STR, standard errors (10.4) and (0.52)
(2) TestScore-hat = 686.0 − 1.10×STR − 0.650×PctEL, standard errors (8.7), (0.43), and (0.031)
The coefficient on STR in (2) is the effect on TestScore of a unit change in STR, holding constant the percentage of English Learners in the district. The coefficient on STR falls by one-half. The 95% confidence interval for the coefficient on STR in (2) is [−1.10 ± 1.96×0.43] = [−1.95, −0.26].

27 Tests of Joint Hypotheses. Let Expn = expenditures per pupil and consider the population regression model: TestScoreᵢ = β0 + β1STRᵢ + β2Expnᵢ + β3PctELᵢ + uᵢ. The null hypothesis that school resources do not matter, and the alternative that they do, corresponds to: H0: β1 = 0 and β2 = 0 vs. H1: either β1 ≠ 0 or β2 ≠ 0 or both.

28 Tests of Joint Hypotheses. A joint hypothesis specifies a value for two or more coefficients, that is, it imposes a restriction on two or more coefficients. A "common sense" test is to reject if either of the individual t-statistics exceeds 1.96 in absolute value. But this common sense approach doesn't work! The resulting test does not have the right significance level!

29 Tests of Joint Hypotheses. Here's why: calculate the probability of incorrectly rejecting the null using the common sense test based on the two individual t-statistics. Suppose that β̂1 and β̂2 are independently distributed. Let t1 and t2 be the t-statistics: t1 = (β̂1 − 0)/SE(β̂1) and t2 = (β̂2 − 0)/SE(β̂2). The common sense test is: reject H0: β1 = β2 = 0 if |t1| > 1.96 and/or |t2| > 1.96. What is the probability that this common sense test rejects H0 when H0 is actually true? (It should be 5%.)

30 Tests of Joint Hypotheses. Probability of incorrectly rejecting the null
= Pr_H0[|t1| > 1.96 and/or |t2| > 1.96]
= Pr_H0[|t1| > 1.96, |t2| > 1.96] + Pr_H0[|t1| > 1.96, |t2| ≤ 1.96] + Pr_H0[|t1| ≤ 1.96, |t2| > 1.96]   (disjoint events)
= Pr_H0[|t1| > 1.96]×Pr_H0[|t2| > 1.96] + Pr_H0[|t1| > 1.96]×Pr_H0[|t2| ≤ 1.96] + Pr_H0[|t1| ≤ 1.96]×Pr_H0[|t2| > 1.96]   (t1, t2 are independent by assumption)
= 0.05×0.05 + 0.05×0.95 + 0.95×0.05
= 0.0975 = 9.75%, which is not the desired 5%!
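The 9.75% size is easy to verify by Monte Carlo in the independent case. A small simulation sketch (the number of replications is arbitrary):

```python
import numpy as np

# Size of the "one t-statistic at a time" test when t1 and t2 are
# independent standard normals under H0.
rng = np.random.default_rng(4)
reps = 200_000
t1 = rng.normal(size=reps)
t2 = rng.normal(size=reps)

reject = (np.abs(t1) > 1.96) | (np.abs(t2) > 1.96)
print(reject.mean())   # close to 1 - 0.95**2 = 0.0975, not 0.05
```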

31 Tests of Joint Hypotheses. The size of a test is the actual rejection rate under the null hypothesis. The size of the common sense test is not 5%! Its size actually depends on the correlation between t1 and t2 (and thus on the correlation between β̂1 and β̂2). Two solutions: use a different critical value in this procedure, not 1.96 (this is the "Bonferroni method"); or use a different test statistic that tests both β1 and β2 at once: the F-statistic.

32 The F-statistic. The F-statistic tests all parts of a joint hypothesis at once. In a regression with two regressors, when the joint null has the two restrictions β1 = β1,0 and β2 = β2,0:

F = (1/2) × (t1² + t2² − 2ρ̂t1,t2 t1t2) / (1 − ρ̂²t1,t2)

where ρ̂t1,t2 estimates the correlation between t1 and t2. Reject when F is large.
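For concreteness, here is the two-restriction formula evaluated at some illustrative (invented) values of t1, t2, and ρ̂; with ρ̂ = 0 it reduces to the average of the squared t-statistics:

```python
# Two-restriction F-statistic from t1, t2, and their estimated correlation.
# The numbers below are illustrative, not from any actual regression.
t1, t2, rho = 2.1, -1.5, 0.3

F = 0.5 * (t1**2 + t2**2 - 2 * rho * t1 * t2) / (1 - rho**2)
F0 = 0.5 * (t1**2 + t2**2)   # special case rho = 0: average of squared t's
print(F, F0)
```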

33 The F-statistic. The F-statistic testing β1 and β2 (special case):

F = (1/2) × (t1² + t2² − 2ρ̂t1,t2 t1t2) / (1 − ρ̂²t1,t2)

The F-statistic is large when t1 and/or t2 are large. The F-statistic corrects (in just the right way) for the correlation between t1 and t2. The formula for more than two β's is really nasty unless you use matrix algebra. This gives the F-statistic a nice large-sample approximate distribution, which is...

34 Large-sample distribution of the F-statistic. Consider the special case that t1 and t2 are independent, so ρ̂t1,t2 →p 0; in large samples the formula becomes

F = (1/2) × (t1² + t2² − 2ρ̂t1,t2 t1t2) / (1 − ρ̂²t1,t2) ≈ (1/2)(t1² + t2²)

Under the null, t1 and t2 have standard normal distributions that, in this special case, are independent. The large-sample distribution of the F-statistic is the distribution of the average of two independently distributed squared standard normal random variables.

35 Large-sample distribution of the F-statistic. In large samples, F is distributed as χ²q/q. The chi-squared distribution with q degrees of freedom is defined to be the distribution of the sum of q independent squared standard normal random variables. Selected large-sample 5% critical values of χ²q/q:
q = 1: 3.84 (why?)
q = 2: 3.00 (the case q = 2 above)
q = 3: 2.60
q = 4: 2.37
q = 5: 2.21
p-value using the F-statistic: p-value = tail probability of the χ²q/q distribution beyond the F-statistic actually computed.
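The q = 2 critical value of 3.00 even has a closed form: χ²2 is an Exponential distribution with mean 2, so P(χ²2/2 > c) = e^(−c), giving c = −ln(0.05) ≈ 3.00. A quick check, plus a simulation of the average of two squared standard normals:

```python
import math
import numpy as np

# Closed form for the q = 2 critical value of chi2_q / q.
c = -math.log(0.05)
print(c)   # 2.9957... ~ 3.00

# Simulation check: 95th percentile of the average of two squared
# independent standard normals (simulation size chosen arbitrarily).
rng = np.random.default_rng(5)
f = (rng.normal(size=500_000) ** 2 + rng.normal(size=500_000) ** 2) / 2
print(np.quantile(f, 0.95))   # also ~ 3.00
```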

36 Implementation in STATA. Use the test command after the regression. Example: test the joint hypothesis that the population coefficients on STR and expenditures per pupil (expn_stu) are both zero, against the alternative that at least one of the population coefficients is nonzero.

37 F-test example, California class size data:

reg testscr str expn_stu pctel, r

Regression with robust standard errors    Number of obs = 420
                                          F(3, 416)     = 147.20
                                          Prob > F      = 0.0000
                                          R-squared     = 0.4366
                                          Root MSE      = 14.35

           |              Robust
   testscr |      Coef.   Std. Err.
-----------+------------------------
       str |      -0.29        0.48
  expn_stu |     0.0039      0.0016
     pctel |     -0.656       0.032
     _cons |      649.6        15.5

NOTE: the test command follows the regression.

test str expn_stu

( 1)  str = 0.0                There are q = 2 restrictions being tested
( 2)  expn_stu = 0.0

F(2, 416) = 5.43               The 5% critical value for q = 2 is 3.00
Prob > F = 0.0047              Stata computes the p-value for you

38 Two (related) loose ends: 1. Homoskedasticity-only versions of the F-statistic. 2. The F distribution.

39 The homoskedasticity-only F-statistic. To compute the homoskedasticity-only F-statistic, either use the previous formulas but with homoskedasticity-only standard errors, or run two regressions: one under the null hypothesis (the "restricted" regression) and one under the alternative hypothesis (the "unrestricted" regression). The second method gives a simple formula.

40 The restricted and unrestricted regressions. Example: are the coefficients on STR and Expn zero? Restricted population regression (that is, under H0): TestScoreᵢ = β0 + β3PctELᵢ + uᵢ (why?). Unrestricted population regression (under H1): TestScoreᵢ = β0 + β1STRᵢ + β2Expnᵢ + β3PctELᵢ + uᵢ. The number of restrictions under H0 is q = 2. The fit will be better (R² will be higher) in the unrestricted regression (why?). By how much must the R² increase for the coefficients on STR and Expn to be judged statistically significant?

41 Simple formula for the homoskedasticity-only F-statistic:

F = [(R²unrestricted − R²restricted)/q] / [(1 − R²unrestricted)/(n − kunrestricted − 1)]

where: R²restricted = the R² for the restricted regression; R²unrestricted = the R² for the unrestricted regression; q = the number of restrictions under the null; kunrestricted = the number of regressors in the unrestricted regression.

42 Example:
Restricted regression: TestScore-hat = 664.7 − 0.671×PctEL, R²restricted = 0.4149, standard errors (1.0) and (0.032).
Unrestricted regression: TestScore-hat = 649.6 − 0.29×STR + 3.87×Expn − 0.656×PctEL, R²unrestricted = 0.4366, standard errors (15.5), (0.48), (1.59), and (0.032), so kunrestricted = 3 and q = 2:

F = [(0.4366 − 0.4149)/2] / [(1 − 0.4366)/(420 − 3 − 1)] = 8.01
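The arithmetic of this example is a one-liner. The sketch below plugs the example's values (R²unrestricted = 0.4366, R²restricted = 0.4149, q = 2, n = 420, kunrestricted = 3) into the R² form of the F-statistic:

```python
# Homoskedasticity-only F-statistic from the two R-squared values.
r2_u, r2_r = 0.4366, 0.4149   # unrestricted and restricted R-squared
q, n, k_u = 2, 420, 3         # restrictions, observations, regressors

F = ((r2_u - r2_r) / q) / ((1 - r2_u) / (n - k_u - 1))
print(round(F, 2))   # 8.01
```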

43 The homoskedasticity-only F-statistic.

F = [(R²unrestricted − R²restricted)/q] / [(1 − R²unrestricted)/(n − kunrestricted − 1)]

The homoskedasticity-only F-statistic rejects when adding the two variables increases the R² by "enough", that is, when adding the two variables improves the fit of the regression by "enough". If the errors are homoskedastic, then the homoskedasticity-only F-statistic has a large-sample distribution that is χ²q/q. But if the errors are heteroskedastic, the large-sample distribution is a mess and is not χ²q/q.

44 The F distribution. If: 1. u1, ..., un are normally distributed; and 2. Xi is distributed independently of ui (so in particular ui is homoskedastic), then the homoskedasticity-only F-statistic has the Fq,n−k−1 distribution, where q = the number of restrictions and k = the number of regressors under the alternative (the unrestricted model).

45 The Fq,n−k−1 distribution. The F distribution is tabulated in many places. When n gets large, the Fq,n−k−1 distribution converges to the χ²q/q distribution: Fq,∞ is another name for χ²q/q. For q not too big and n ≥ 100, the Fq,n−k−1 distribution and the χ²q/q distribution are essentially identical. Many regression packages compute p-values of F-statistics using the F distribution. You will encounter the F distribution in published empirical work.

46 Digression: a little history of statistics. The theory of the homoskedasticity-only F-statistic and the Fq,n−k−1 distribution rests on implausibly strong assumptions (are earnings normally distributed?). These statistics date to the early 20th century, when "computer" was a job description and observations numbered in the dozens. The F-statistic and Fq,n−k−1 distribution were major breakthroughs: an easily computed formula; a single set of tables that could be published once, then applied in many settings; and a precise, mathematically elegant justification.

47 Digression: a little history of statistics. The strong assumptions seemed a minor price for this breakthrough. But with modern computers and large samples we can use the heteroskedasticity-robust F-statistic and the Fq,∞ distribution, which only require the four least squares assumptions. This historical legacy persists in modern software, in which homoskedasticity-only standard errors (and F-statistics) are the default, and in which p-values are computed using the Fq,n−k−1 distribution.

48 Homoskedasticity-only F-statistic and the F distribution: summary. These are justified only under very strong conditions, stronger than are realistic in practice. Yet they are widely used. You should use the heteroskedasticity-robust F-statistic, with χ²q/q (that is, Fq,∞) critical values. For n ≥ 100, the F distribution essentially is the χ²q/q distribution. For small n, the F distribution is not necessarily a better approximation to the sampling distribution of the F-statistic; it is better only if the strong conditions are true.

49 Summary: testing joint hypotheses. The "common-sense" approach of rejecting if either of the t-statistics exceeds 1.96 rejects more than 5% of the time under the null (the size exceeds the desired significance level). The heteroskedasticity-robust F-statistic is built into STATA (the test command); this tests all q restrictions at once. For n large, F is distributed as χ²q/q (= Fq,∞). The homoskedasticity-only F-statistic is important historically (and thus in practice), and is intuitively appealing, but it is invalid when there is heteroskedasticity.

50 Testing Single Restrictions on Multiple Coefficients. Yi = β0 + β1X1i + β2X2i + ui, i = 1, ..., n. Consider the null and alternative hypotheses: H0: β1 = β2 vs. H1: β1 ≠ β2. This null imposes a single restriction (q = 1) on multiple coefficients; it is not a joint hypothesis with multiple restrictions (compare with β1 = 0 and β2 = 0).

51 Testing Single Restrictions on Multiple Coefficients. Two methods: 1) Rearrange ("transform") the regression: rearrange the regressors so that the restriction becomes a restriction on a single coefficient in an equivalent regression. 2) Perform the test directly: some software, including STATA, lets you test restrictions involving multiple coefficients directly.

52 Method 1: Rearrange ("transform") the regression.
(a) Original system: Yi = β0 + β1X1i + β2X2i + ui, with H0: β1 = β2 vs. H1: β1 ≠ β2.
Add and subtract β2X1i: Yi = β0 + (β1 − β2)X1i + β2(X1i + X2i) + ui, or Yi = β0 + γ1X1i + β2Wi + ui, where γ1 = β1 − β2 and Wi = X1i + X2i.
(b) Rearranged ("transformed") system: Yi = β0 + γ1X1i + β2Wi + ui, with H0: γ1 = 0 vs. H1: γ1 ≠ 0.
The testing problem is now a simple one: test whether γ1 = 0 in specification (b).
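The transformation can be verified numerically: because the columns [1, X1, X1 + X2] span the same space as [1, X1, X2], the fitted values agree and γ̂1 = β̂1 − β̂2 exactly. A numpy sketch on simulated data (all numbers invented):

```python
import numpy as np

# Check that the coefficient on X1 in the transformed regression
# equals beta1_hat - beta2_hat from the original regression.
rng = np.random.default_rng(6)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 2.5 * x2 + rng.normal(size=n)

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

b = ols(np.column_stack([np.ones(n), x1, x2]), y)        # original system
g = ols(np.column_stack([np.ones(n), x1, x1 + x2]), y)   # W = X1 + X2
print(g[1], b[1] - b[2])   # gamma1_hat equals beta1_hat - beta2_hat
```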

53 Method 2: Perform the test directly. Yi = β0 + β1X1i + β2X2i + ui, with H0: β1 = β2 vs. H1: β1 ≠ β2. Example: TestScoreᵢ = β0 + β1STRᵢ + β2Expnᵢ + β3PctELᵢ + uᵢ. To test, using STATA, whether β1 = β2:
regress testscore str expn pctel, r
test str=expn

54 Confidence Intervals in Multiple Regression

55 Confidence Sets for Multiple Coefficients. Yi = β0 + β1X1i + β2X2i + ... + βkXki + ui, i = 1, ..., n. What is a joint confidence set for β1 and β2? A 95% confidence set is: a set-valued function of the data that contains the true parameter(s) in 95% of hypothetical repeated samples; equivalently, the set of parameter values that cannot be rejected at the 5% significance level when taken as the null hypothesis. The coverage rate of a confidence set is the probability that the confidence set contains the true parameter values.

56 Confidence Sets for Multiple Coefficients. A "common sense" confidence set is the rectangle formed by the two 95% confidence intervals for β1 and β2: [β̂1 ± 1.96×SE(β̂1)] × [β̂2 ± 1.96×SE(β̂2)]. What is the coverage rate of this confidence set? Does its coverage rate equal the desired confidence level of 95%?

57 Coverage rate of the common sense confidence set:
Pr[(β1, β2) ∈ {β̂1 ± 1.96×SE(β̂1)} × {β̂2 ± 1.96×SE(β̂2)}]
= Pr[β̂1 − 1.96×SE(β̂1) ≤ β1 ≤ β̂1 + 1.96×SE(β̂1), β̂2 − 1.96×SE(β̂2) ≤ β2 ≤ β̂2 + 1.96×SE(β̂2)]
= Pr[−1.96 ≤ (β̂1 − β1)/SE(β̂1) ≤ 1.96, −1.96 ≤ (β̂2 − β2)/SE(β̂2) ≤ 1.96]
= Pr[|t1| ≤ 1.96 and |t2| ≤ 1.96]
= 1 − Pr[|t1| > 1.96 and/or |t2| > 1.96]
= 1 − 0.0975 = 90.25% ≠ 95% (when t1 and t2 are independent)!
Why? This confidence set "inverts" a test for which the size doesn't equal the significance level!

58 Recall: the probability of incorrectly rejecting the null
= Pr_H0[|t1| > 1.96 and/or |t2| > 1.96]
= Pr_H0[|t1| > 1.96, |t2| > 1.96] + Pr_H0[|t1| > 1.96, |t2| ≤ 1.96] + Pr_H0[|t1| ≤ 1.96, |t2| > 1.96]   (disjoint events)
= Pr_H0[|t1| > 1.96]×Pr_H0[|t2| > 1.96] + Pr_H0[|t1| > 1.96]×Pr_H0[|t2| ≤ 1.96] + Pr_H0[|t1| ≤ 1.96]×Pr_H0[|t2| > 1.96]   (t1, t2 are independent by assumption)
= 0.05×0.05 + 0.05×0.95 + 0.95×0.05
= 0.0975 = 9.75%, which is not the desired 5%!

59 The confidence set based on the F-statistic. Instead, use the acceptance region of a test that has size equal to its significance level ("invert" a valid test). Let F(β1,0, β2,0) be the (heteroskedasticity-robust) F-statistic testing the hypothesis that β1 = β1,0 and β2 = β2,0: 95% confidence set = {β1,0, β2,0 : F(β1,0, β2,0) < 3.00}. 3.00 is the 5% critical value of the F2,∞ distribution. This set has coverage rate 95% because the test on which it is based (the test it "inverts") has a size of 5%.

60 The confidence set based on the F-statistic is an ellipse:

{β1,0, β2,0 : F = (1/2)(t1² + t2² − 2ρ̂t1,t2 t1t2)/(1 − ρ̂²t1,t2) ≤ 3.00}

where t1 = (β̂1 − β1,0)/SE(β̂1) and t2 = (β̂2 − β2,0)/SE(β̂2). This is a quadratic form in β1,0 and β2,0, so the boundary of the set F = 3.00 is an ellipse.

61 Confidence set based on inverting the F-statistic [figure]

62 Other Regression Statistics in Multiple Regression

63 The R², SER, and R̄² for Multiple Regression. Actual = predicted + residual: Yi = Ŷi + ûi. As in regression with a single regressor, the SER (and the RMSE) is a measure of the spread of the Y's around the regression line:

SER = √[ (1/(n − k − 1)) Σᵢ ûi² ]

64 The R², SER, and R̄² for Multiple Regression. The R² is the fraction of the variance explained:

R² = ESS/TSS = 1 − SSR/TSS

where ESS = Σᵢ(Ŷi − Ȳ)², SSR = Σᵢ ûi², and TSS = Σᵢ(Yi − Ȳ)², just as for regression with one regressor.

65 The R², SER, and R̄² for Multiple Regression. The R² always increases when you add another regressor, which is a bit of a problem for a measure of fit. The R̄² (adjusted R²) corrects this problem by penalizing you for including another regressor:

R̄² = 1 − [(n − 1)/(n − k − 1)] × SSR/TSS

so R̄² < R².
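The three fit measures can be computed directly from residuals. A numpy sketch on simulated data (sample size and coefficients are invented):

```python
import numpy as np

# SER, R-squared, and adjusted R-squared from an OLS fit with k = 2 regressors.
rng = np.random.default_rng(7)
n, k = 200, 2
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ b

SSR = np.sum(resid ** 2)
TSS = np.sum((y - y.mean()) ** 2)
SER = np.sqrt(SSR / (n - k - 1))
R2 = 1 - SSR / TSS
R2_adj = 1 - (n - 1) / (n - k - 1) * (SSR / TSS)   # penalizes extra regressors
print(SER, R2, R2_adj)                             # R2_adj is below R2
```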

66 How to interpret the R² and R̄²? A high R² (or R̄²) means that the regressors explain the variation in Y. It does not mean that you have eliminated omitted variable bias; it does not mean that you have an unbiased estimator of a causal effect (β1); and it does not mean that the included variables are statisticallyically significant, which must be determined using hypothesis tests.

67 Example: A Closer Look at the Test Score Data. A general approach to variable selection and model specification: specify a base or benchmark model; specify a range of plausible alternative models, which include additional candidate variables; ask whether a candidate variable changes the coefficient of interest (β1); ask whether a candidate variable is statistically significant; use judgment, not a mechanical recipe.

68 Example: A Closer Look at the Test Score Data.
Variables we would like to see in the California data set. School characteristics: student-teacher ratio; teacher quality; computers (non-teaching resources) per student. Student characteristics: English proficiency; availability of extracurricular enrichment; home learning environment; parents' education level.
Variables actually in the California class size data set: student-teacher ratio (STR); percent English learners in the district (PctEL); percent eligible for subsidized/free lunch; percent on public income assistance; average district income.

69 Example: A Closer Look at the Test Score Data [figure]

70 Digression: presentation of regression results in a table. Listing regressions in equation form can be cumbersome with many regressors and many regressions; tables of regression results can present the key information compactly. Information to include: variables in the regression (dependent and independent); estimated coefficients; standard errors; results of F-tests of pertinent joint hypotheses; some measure of fit; number of observations.


72 Summary: Multiple Regression. Multiple regression allows you to estimate the effect on Y of a change in X1, holding X2 constant. If you can measure a variable, you can avoid omitted variable bias from that variable by including it. There is no simple recipe for deciding which variables belong in a regression; you must exercise judgment. One approach is to specify a base model, relying on a priori reasoning, then explore the sensitivity of the key estimate(s) in alternative specifications.


More information

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests ECON4150 - Introductory Econometrics Lecture 5: OLS with One Regressor: Hypothesis Tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 5 Lecture outline 2 Testing Hypotheses about one

More information

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons Linear Regression with 1 Regressor Introduction to Econometrics Spring 2012 Ken Simons Linear Regression with 1 Regressor 1. The regression equation 2. Estimating the equation 3. Assumptions required for

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Introduction to Econometrics. Regression with Panel Data

Introduction to Econometrics. Regression with Panel Data Introduction to Econometrics The statistical analysis of economic (and related) data STATS301 Regression with Panel Data Titulaire: Christopher Bruffaerts Assistant: Lorenzo Ricci 1 Regression with Panel

More information

Review of Econometrics

Review of Econometrics Review of Econometrics Zheng Tian June 5th, 2017 1 The Essence of the OLS Estimation Multiple regression model involves the models as follows Y i = β 0 + β 1 X 1i + β 2 X 2i + + β k X ki + u i, i = 1,...,

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution

More information

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2.

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2. Updated: November 17, 2011 Lecturer: Thilo Klein Contact: tk375@cam.ac.uk Contest Quiz 3 Question Sheet In this quiz we will review concepts of linear regression covered in lecture 2. NOTE: Please round

More information

ECO321: Economic Statistics II

ECO321: Economic Statistics II ECO321: Economic Statistics II Chapter 6: Linear Regression a Hiroshi Morita hmorita@hunter.cuny.edu Department of Economics Hunter College, The City University of New York a c 2010 by Hiroshi Morita.

More information

Assessing Studies Based on Multiple Regression

Assessing Studies Based on Multiple Regression Assessing Studies Based on Multiple Regression Outline 1. Internal and External Validity 2. Threats to Internal Validity a. Omitted variable bias b. Functional form misspecification c. Errors-in-variables

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics STAT-S-301 Panel Data (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Regression with Panel Data A panel dataset contains observations on multiple entities

More information

Essential of Simple regression

Essential of Simple regression Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship

More information

Econometrics 1. Lecture 8: Linear Regression (2) 黄嘉平

Econometrics 1. Lecture 8: Linear Regression (2) 黄嘉平 Econometrics 1 Lecture 8: Linear Regression (2) 黄嘉平 中国经济特区研究中 心讲师 办公室 : 文科楼 1726 E-mail: huangjp@szu.edu.cn Tel: (0755) 2695 0548 Office hour: Mon./Tue. 13:00-14:00 The linear regression model The linear

More information

ECO220Y Simple Regression: Testing the Slope

ECO220Y Simple Regression: Testing the Slope ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators

More information

Chapter 11. Regression with a Binary Dependent Variable

Chapter 11. Regression with a Binary Dependent Variable Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0 Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct

More information

ECON Introductory Econometrics. Lecture 16: Instrumental variables

ECON Introductory Econometrics. Lecture 16: Instrumental variables ECON4150 - Introductory Econometrics Lecture 16: Instrumental variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 12 Lecture outline 2 OLS assumptions and when they are violated Instrumental

More information

Multivariate Regression: Part I

Multivariate Regression: Part I Topic 1 Multivariate Regression: Part I ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà Outline of this topic Statement of the objective: we want to explain the behavior of one variable as a

More information

Econometrics -- Final Exam (Sample)

Econometrics -- Final Exam (Sample) Econometrics -- Final Exam (Sample) 1) The sample regression line estimated by OLS A) has an intercept that is equal to zero. B) is the same as the population regression line. C) cannot have negative and

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression

More information

THE MULTIVARIATE LINEAR REGRESSION MODEL

THE MULTIVARIATE LINEAR REGRESSION MODEL THE MULTIVARIATE LINEAR REGRESSION MODEL Why multiple regression analysis? Model with more than 1 independent variable: y 0 1x1 2x2 u It allows : -Controlling for other factors, and get a ceteris paribus

More information

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

Lecture #8 & #9 Multiple regression

Lecture #8 & #9 Multiple regression Lecture #8 & #9 Multiple regression Starting point: Y = f(x 1, X 2,, X k, u) Outcome variable of interest (movie ticket price) a function of several variables. Observables and unobservables. One or more

More information

Motivation for multiple regression

Motivation for multiple regression Motivation for multiple regression 1. Simple regression puts all factors other than X in u, and treats them as unobserved. Effectively the simple regression does not account for other factors. 2. The slope

More information

6. Assessing studies based on multiple regression

6. Assessing studies based on multiple regression 6. Assessing studies based on multiple regression Questions of this section: What makes a study using multiple regression (un)reliable? When does multiple regression provide a useful estimate of the causal

More information

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

Statistical Inference with Regression Analysis

Statistical Inference with Regression Analysis Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing

More information

8. Instrumental variables regression

8. Instrumental variables regression 8. Instrumental variables regression Recall: In Section 5 we analyzed five sources of estimation bias arising because the regressor is correlated with the error term Violation of the first OLS assumption

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Empirical Application of Simple Regression (Chapter 2)

Empirical Application of Simple Regression (Chapter 2) Empirical Application of Simple Regression (Chapter 2) 1. The data file is House Data, which can be downloaded from my webpage. 2. Use stata menu File Import Excel Spreadsheet to read the data. Don t forget

More information

Lab 11 - Heteroskedasticity

Lab 11 - Heteroskedasticity Lab 11 - Heteroskedasticity Spring 2017 Contents 1 Introduction 2 2 Heteroskedasticity 2 3 Addressing heteroskedasticity in Stata 3 4 Testing for heteroskedasticity 4 5 A simple example 5 1 1 Introduction

More information

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e Economics 102: Analysis of Economic Data Cameron Spring 2016 Department of Economics, U.C.-Davis Final Exam (A) Tuesday June 7 Compulsory. Closed book. Total of 58 points and worth 45% of course grade.

More information

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor ECON4150 - Introductory Econometrics Lecture 4: Linear Regression with One Regressor Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 4 Lecture outline 2 The OLS estimators The effect of

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics STAT-S-301 Introduction to Time Series Regression and Forecasting (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Introduction to Time Series Regression

More information

Chapter 9: Assessing Studies Based on Multiple Regression. Copyright 2011 Pearson Addison-Wesley. All rights reserved.

Chapter 9: Assessing Studies Based on Multiple Regression. Copyright 2011 Pearson Addison-Wesley. All rights reserved. Chapter 9: Assessing Studies Based on Multiple Regression 1-1 9-1 Outline 1. Internal and External Validity 2. Threats to Internal Validity a) Omitted variable bias b) Functional form misspecification

More information

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10) Name Economics 170 Spring 2004 Honor pledge: I have neither given nor received aid on this exam including the preparation of my one page formula list and the preparation of the Stata assignment for the

More information

Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics

Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics C1.1 Use the data set Wage1.dta to answer the following questions. Estimate regression equation wage =

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

P1.T2. Stock & Watson Chapters 4 & 5. Bionic Turtle FRM Video Tutorials. By: David Harper CFA, FRM, CIPM

P1.T2. Stock & Watson Chapters 4 & 5. Bionic Turtle FRM Video Tutorials. By: David Harper CFA, FRM, CIPM P1.T2. Stock & Watson Chapters 4 & 5 Bionic Turtle FRM Video Tutorials By: David Harper CFA, FRM, CIPM Note: This tutorial is for paid members only. You know who you are. Anybody else is using an illegal

More information

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C =

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C = Economics 130 Lecture 6 Midterm Review Next Steps for the Class Multiple Regression Review & Issues Model Specification Issues Launching the Projects!!!!! Midterm results: AVG = 26.5 (88%) A = 27+ B =

More information

(a) Briefly discuss the advantage of using panel data in this situation rather than pure crosssections

(a) Briefly discuss the advantage of using panel data in this situation rather than pure crosssections Answer Key Fixed Effect and First Difference Models 1. See discussion in class.. David Neumark and William Wascher published a study in 199 of the effect of minimum wages on teenage employment using a

More information

Handout 12. Endogeneity & Simultaneous Equation Models

Handout 12. Endogeneity & Simultaneous Equation Models Handout 12. Endogeneity & Simultaneous Equation Models In which you learn about another potential source of endogeneity caused by the simultaneous determination of economic variables, and learn how to

More information

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Economics 326 Methods of Empirical Research in Economics Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Vadim Marmer University of British Columbia May 5, 2010 Multiple restrictions

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 6 Multiple regression model Siv-Elisabeth Skjelbred University of Oslo February 5th Last updated: February 3, 2016 1 / 49 Outline Multiple linear regression model and

More information

Multiple Linear Regression CIVL 7012/8012

Multiple Linear Regression CIVL 7012/8012 Multiple Linear Regression CIVL 7012/8012 2 Multiple Regression Analysis (MLR) Allows us to explicitly control for many factors those simultaneously affect the dependent variable This is important for

More information

Econ 1123: Section 5. Review. Internal Validity. Panel Data. Clustered SE. STATA help for Problem Set 5. Econ 1123: Section 5.

Econ 1123: Section 5. Review. Internal Validity. Panel Data. Clustered SE. STATA help for Problem Set 5. Econ 1123: Section 5. Outline 1 Elena Llaudet 2 3 4 October 6, 2010 5 based on Common Mistakes on P. Set 4 lnftmpop = -.72-2.84 higdppc -.25 lackpf +.65 higdppc * lackpf 2 lnftmpop = β 0 + β 1 higdppc + β 2 lackpf + β 3 lackpf

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

Business Statistics. Lecture 10: Course Review

Business Statistics. Lecture 10: Course Review Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics STAT-S-301 Experiments and Quasi-Experiments (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Why study experiments? Ideal randomized controlled experiments

More information

MGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu

MGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu Last Name (Print): Solution First Name (Print): Student Number: MGECHY L Introduction to Regression Analysis Term Test Friday July, PM Instructor: Victor Yu Aids allowed: Time allowed: Calculator and one

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

More information

4. Nonlinear regression functions

4. Nonlinear regression functions 4. Nonlinear regression functions Up to now: Population regression function was assumed to be linear The slope(s) of the population regression function is (are) constant The effect on Y of a unit-change

More information

Exam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h.

Exam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h. Exam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h. This is an open book examination where all printed and written resources, in addition to a calculator, are allowed. If you are

More information

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35 Rewrap ECON 4135 November 18, 2011 () Rewrap ECON 4135 November 18, 2011 1 / 35 What should you now know? 1 What is econometrics? 2 Fundamental regression analysis 1 Bivariate regression 2 Multivariate

More information

Lab 07 Introduction to Econometrics

Lab 07 Introduction to Econometrics Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand

More information

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018 Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate

More information

Homoskedasticity. Var (u X) = σ 2. (23)

Homoskedasticity. Var (u X) = σ 2. (23) Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This

More information

Chapter 2: simple regression model

Chapter 2: simple regression model Chapter 2: simple regression model Goal: understand how to estimate and more importantly interpret the simple regression Reading: chapter 2 of the textbook Advice: this chapter is foundation of econometrics.

More information

Linear Regression with one Regressor

Linear Regression with one Regressor 1 Linear Regression with one Regressor Covering Chapters 4.1 and 4.2. We ve seen the California test score data before. Now we will try to estimate the marginal effect of STR on SCORE. To motivate these

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Chapter 8 Heteroskedasticity

Chapter 8 Heteroskedasticity Chapter 8 Walter R. Paczkowski Rutgers University Page 1 Chapter Contents 8.1 The Nature of 8. Detecting 8.3 -Consistent Standard Errors 8.4 Generalized Least Squares: Known Form of Variance 8.5 Generalized

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

8. Nonstandard standard error issues 8.1. The bias of robust standard errors

8. Nonstandard standard error issues 8.1. The bias of robust standard errors 8.1. The bias of robust standard errors Bias Robust standard errors are now easily obtained using e.g. Stata option robust Robust standard errors are preferable to normal standard errors when residuals

More information

Econ 836 Final Exam. 2 w N 2 u N 2. 2 v N

Econ 836 Final Exam. 2 w N 2 u N 2. 2 v N 1) [4 points] Let Econ 836 Final Exam Y Xβ+ ε, X w+ u, w N w~ N(, σi ), u N u~ N(, σi ), ε N ε~ Nu ( γσ, I ), where X is a just one column. Let denote the OLS estimator, and define residuals e as e Y X.

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Economics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama

Economics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama Economics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama Course Packet The purpose of this packet is to show you one particular dataset and how it is used in

More information

Multiple Regression: Inference

Multiple Regression: Inference Multiple Regression: Inference The t-test: is ˆ j big and precise enough? We test the null hypothesis: H 0 : β j =0; i.e. test that x j has no effect on y once the other explanatory variables are controlled

More information

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Michele Aquaro University of Warwick This version: July 21, 2016 1 / 31 Reading material Textbook: Introductory

More information

Final Exam - Solutions

Final Exam - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis March 19, 2010 Instructor: John Parman Final Exam - Solutions You have until 5:30pm to complete this exam. Please remember to put your

More information

ECON2228 Notes 7. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 41

ECON2228 Notes 7. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 41 ECON2228 Notes 7 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 6 2014 2015 1 / 41 Chapter 8: Heteroskedasticity In laying out the standard regression model, we made

More information

Homework Set 2, ECO 311, Fall 2014

Homework Set 2, ECO 311, Fall 2014 Homework Set 2, ECO 311, Fall 2014 Due Date: At the beginning of class on October 21, 2014 Instruction: There are twelve questions. Each question is worth 2 points. You need to submit the answers of only

More information