Chapter 6: Linear Regression With Multiple Regressors
- Heather Stone
- 5 years ago
Transcription
1 Chapter 6: Linear Regression With Multiple Regressors
2 Outline
1. Omitted variable bias
2. Causality and regression analysis
3. Multiple regression and OLS
4. Measures of fit
5. Sampling distribution of the OLS estimator
3 Omitted Variable Bias
The error u arises because of:
- Mismeasurement
- Randomness
- Omitted variables: they influence Y but are left out of the regression function.
We now turn to when omitted variables bias the OLS estimator.
4 Omitted variable bias - definition
Suppose the variable Z is omitted from the regression. By definition, there is omitted variable bias if BOTH:
1. Z is a determinant of Y (Z is part of u);
2. Z is correlated with the regressor X ($\mathrm{corr}(Z, X) \neq 0$).
If either condition fails, we do not claim omitted variable bias. We will quantify the bias of the OLS estimator with a formula.
5 Role of the two conditions
The following examples illustrate why, if either condition fails, there is no bias.
1. Time of day of the test affects test scores Y, but is not correlated with the student-teacher ratio STR (regressor X): no bias.
2. Parking space per pupil is correlated with STR (regressor X), but does not affect test scores Y: no bias.
6 Example: English learners & test scores
In CA school districts, let Z := percent of English learners.
1. English language ability (e.g., among English learners, immigrants) plausibly affects test scores (English and Math are tested): Z is a determinant of Y.
2. Communities with more English learners tend to be poorer and have higher STR: Z is correlated with X.
Accordingly, there is omitted variable bias, which implies that the OLS estimator is biased. What does common sense suggest about the direction of the bias? What about quantifying the bias?
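7 Formula for omitted variable bias
Under single-regressor OLS assumptions 2 and 3 (even if assumption 1 fails):
$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar X)\,u_i}{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar X)^2} \xrightarrow{p} \frac{\sigma_{Xu}}{\sigma_X^2} = \frac{\sigma_u}{\sigma_X}\cdot\frac{\sigma_{Xu}}{\sigma_X \sigma_u} = \frac{\sigma_u}{\sigma_X}\,\rho_{Xu}$$
Here $\rho_{Xu} := \mathrm{corr}(X, u) = \sigma_{Xu}/(\sigma_X \sigma_u)$ is shorthand for the correlation between X and u.
- $\rho_{Xu} = 0$ under OLS assumption 1
- $\rho_{Xu} \neq 0$ under omitted variable bias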
8 Reading both conditions in the formula
$$\hat\beta_1 \xrightarrow{p} \beta_1 + \frac{\sigma_u}{\sigma_X}\,\rho_{Xu}$$
Indeed, if an omitted variable Z is both
1. a determinant of Y (contained in u), and
2. correlated with X,
then $\rho_{Xu} \neq 0$ and the OLS estimator is biased and inconsistent (large samples do not mitigate the bias).
In the example, the omitted PctEL should affect test scores negatively. In the data, PctEL is positively correlated with STR. So STR should be negatively correlated with the error term. By the formula, the OLS estimate of $\beta_1$ is biased negatively: it overstates the effect of STR.
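The sign argument above can be checked numerically. The following is a minimal sketch on simulated data (not the California data): the coefficient values, the variable names, and the helper `simulate_ovb` are illustrative assumptions. It regresses Y on X alone while a variable Z that lowers Y and is positively correlated with X sits in the error term, then compares the OLS slope with $\beta_1 + (\sigma_u/\sigma_X)\rho_{Xu}$ computed from sample moments.

```python
import numpy as np

def simulate_ovb(n=200_000, seed=0):
    """Short regression of y on x when z is omitted (simulated data)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                # omitted variable (plays the role of PctEL)
    x = 0.5 * z + rng.normal(size=n)      # regressor, positively correlated with z
    beta1, gamma = -1.0, -2.0             # assumed true effects of x and z on y
    u = gamma * z + rng.normal(size=n)    # error of the short regression contains z
    y = 5.0 + beta1 * x + u

    # OLS slope from regressing y on x only
    b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

    # Value implied by the bias formula, using sample moments
    rho_xu = np.corrcoef(x, u)[0, 1]
    predicted = beta1 + (np.std(u, ddof=1) / np.std(x, ddof=1)) * rho_xu
    return b1, predicted, rho_xu

b1, predicted, rho_xu = simulate_ovb()
print(f"OLS slope: {b1:.3f}, formula: {predicted:.3f}, corr(X,u): {rho_xu:.3f}")
```

With these assumed values the population bias is cov(X,u)/var(X) = -1/1.25 = -0.8, so the slope settles near -1.8 rather than -1 however large n is: the effect of X is overstated, as in the STR example.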
9 Districts with fewer English learners have higher test scores
- Districts with a lower percent of English learners (PctEL) have smaller classes.
- Among districts with comparable PctEL, the effect of class size is small (recall that the overall test score gap = 7.4).
10 Causality and regression analysis
The test score / STR / fraction of English learners example illustrates that, if an omitted variable satisfies the two conditions for omitted variable bias, then the OLS estimator in the regression omitting that variable is biased and inconsistent. So, even if n is large, $\hat\beta_1$ will not be close to $\beta_1$.
Suppose the school board decided to cut class size by 2 students per class. What would be the effect on test scores? Given this bias, we now have doubts about the prediction that test scores would rise by an expected 4.6 points. We have already guessed that omitting PctEL causes the OLS estimate to overstate the effect of STR on test scores.
11 Overcoming omitted variable bias
Three ways:
1. Run a randomized controlled experiment in which treatment (STR) is randomly assigned: then PctEL is still a determinant of TestScore, but PctEL is uncorrelated with STR. (Rarely feasible.)
2. Adopt the cross-tabulation approach, with finer gradations of STR and PctEL: within each group, all classes have the same PctEL, so we control for PctEL. (But this soon exhausts the data, and it cannot also handle other determinants: family income, parental education, ...)
3. Use a regression in which the omitted variable (PctEL) is no longer omitted: include PctEL as an additional regressor in a multiple regression!
12 The Population Multiple Regression Model
Consider the case of two regressors:
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i, \quad i = 1, \dots, n$$
- Y is the dependent variable
- $X_1$, $X_2$ are the two independent variables (regressors)
- $(Y_i, X_{1i}, X_{2i})$ denote the i-th observation on Y, $X_1$, and $X_2$
- $\beta_0$ = unknown population intercept
- $\beta_1$ = effect on Y of a change in $X_1$, holding $X_2$ constant
- $\beta_2$ = effect on Y of a change in $X_2$, holding $X_1$ constant
- $u_i$ = the regression error (omitted factors)
13 Interpretation of coefficients
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i, \quad i = 1, \dots, n$$
Consider changing $X_1$ by $\Delta X_1$ while holding $X_2$ constant.
Population regression line before the change: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
Population regression line after the change: $Y + \Delta Y = \beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 X_2$
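14 Interpretation of coefficients, ctd.
Before: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
After: $Y + \Delta Y = \beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 X_2$
Difference: $\Delta Y = \beta_1 \Delta X_1$
So:
- $\beta_1 = \Delta Y / \Delta X_1$, holding $X_2$ constant
- $\beta_2 = \Delta Y / \Delta X_2$, holding $X_1$ constant
- $\beta_0$ = the value that makes the sample means fit the line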
15 The OLS Estimator in Multiple Regression
With two regressors, by definition, the OLS estimator minimizes the sum of squared residuals:
$$\min_{b_0, b_1, b_2} \sum_{i=1}^{n} \left[ Y_i - (b_0 + b_1 X_{1i} + b_2 X_{2i}) \right]^2$$
The residuals are the differences between the observed $Y_i$ (data) and the predicted/fitted values given by the b's. The minimizing values of $b_0$, $b_1$, $b_2$ are the OLS estimators of $\beta_0$, $\beta_1$, and $\beta_2$.
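The minimization can be carried out with any least-squares routine. The sketch below uses NumPy's `np.linalg.lstsq` on simulated data; the data-generating values and the helper name `ols` are assumptions made for illustration, not part of the slides.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients (intercept first) and residuals."""
    A = np.column_stack([np.ones(len(y)), X])   # prepend the constant regressor
    b, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimizes the sum of squared residuals
    resid = y - A @ b
    return b, resid

# Simulated data with two correlated regressors (illustrative values)
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.3 * x1 + rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.7 * x2 + rng.normal(size=n)

b, resid = ols(y, np.column_stack([x1, x2]))
ssr = float(resid @ resid)
print("b0, b1, b2 =", np.round(b, 3), " SSR =", round(ssr, 2))
```

Moving any coefficient away from the returned values can only raise the sum of squared residuals, and the residuals are orthogonal to the constant and to each regressor.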
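16 Example: the California test score data
Regression of TestScore against STR:
$\widehat{\mathit{TestScore}} = \hat\beta_0 + \hat\beta_1 \times \mathit{STR}$
Now include percent English learners in the district (PctEL):
$\widehat{\mathit{TestScore}} = \hat\beta_0 + \hat\beta_1 \times \mathit{STR} - 0.65 \times \mathit{PctEL}$ (with new values of $\hat\beta_0$ and $\hat\beta_1$)
Wow, the estimated effect of STR seems to have halved! This is the extent of the overstatement bias due to omitting PctEL.
Advantages of OLS over tabulation: (1) data-economical, (2) quantifiable, (3) extends to multiple regressors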
17 Multiple regression in STATA
reg testscr str pctel, robust;

Regression with robust standard errors
Number of obs = 420
F( 2, 417) =
Prob > F =
R-squared =
Root MSE =

testscr | Coef.   Robust Std. Err.   t   P>|t|   [95% Conf. Interval]
str     |
pctel   |
_cons   |

$\widehat{\mathit{TestScore}} = \hat\beta_0 + \hat\beta_1 \times \mathit{STR} - 0.65 \times \mathit{PctEL}$
18 Measures of Fit of Regression
Predicted value: $\hat Y_i := b_0 + b_1 X_{1i} + \dots + b_k X_{ki}$; residual: $\hat u_i := Y_i - \hat Y_i$.
Then, tautologically, decompose data = prediction + residual: $Y_i = \hat Y_i + \hat u_i$
- SER = std. deviation of $\hat u_i$ (with d.f. correction)
- RMSE = std. deviation of $\hat u_i$ (without d.f. correction)
- $R^2$ = fraction of the variance of Y explained by X
- $\bar R^2$ = adjusted $R^2$ = $R^2$ with a degrees-of-freedom correction that adjusts for estimation uncertainty; $\bar R^2 < R^2$
19 SER and RMSE
The SER and the RMSE measure how much the Y's spread around the regression line:
$$SER = \sqrt{\frac{1}{n-k-1}\sum_{i=1}^{n}\hat u_i^2} \qquad RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\hat u_i^2}$$
20 $R^2$
The $R^2$ is the fraction of the variance predicted/explained.
Exercises:
- OLS goes through the sample means: $\bar{\hat Y} = \bar Y$ and $\bar{\hat u} = 0$.
- Also, OLS makes prediction and residual uncorrelated: $\mathrm{cov}(\hat Y_i, \hat u_i) = 0$.
Sample variances of the predicted Y's, the actual Y's, and the residuals are, ignoring a common scaling factor,
$$ESS := \sum_i (\hat Y_i - \bar Y)^2, \qquad TSS := \sum_i (Y_i - \bar Y)^2, \qquad SSR := \sum_i \hat u_i^2$$
Now, expand TSS:
$$TSS = \sum_i (\hat Y_i + \hat u_i - \bar Y)^2 = \sum_i (\hat Y_i - \bar Y)^2 + \sum_i \hat u_i^2 + 2\sum_i (\hat Y_i - \bar Y)\hat u_i = ESS + SSR + \text{factor} \times \mathrm{cov}(\hat Y_i, \hat u_i) = ESS + SSR$$
since the covariance is 0. Dividing this equation by TSS and rearranging, we get
$$\frac{ESS}{TSS} = 1 - \frac{SSR}{TSS}$$
The left side is the $R^2$, by definition. The equation shows that the $R^2$ is higher the lower the SSR (which is what OLS minimizes).
21 $R^2$ & $\bar R^2$
The $R^2$ never decreases when one adds a regressor, e.g. number of bathrooms, weight of nearby ants, ... This is because one can always set the coefficients on the new regressors to 0 and keep the other coefficients as before, which leaves the SSR the same (and minimizing may lower it further).
A high $R^2$ says the fit is good; it says nothing about uncovering causality.
The $\bar R^2$ (the adjusted $R^2$) corrects this weakness by penalizing the inclusion of another regressor. It is
$$\bar R^2 = 1 - \frac{n-1}{n-k-1}\cdot\frac{SSR}{TSS}$$
It may decrease when one adds a regressor. Always, $\bar R^2 < R^2$, but the two are close when the sample size n is large.
22 Example: Test scores
Test score example:
(1) TestScore regressed on STR: $R^2$ = .05, SER = 18.6
(2) TestScore regressed on STR and PctEL (PctEL coefficient $-0.65$): $R^2$ = .426, $\bar R^2$ = .424, SER = 14.5
Including PctEL vastly improves the fit.
Note: $R^2$ and $\bar R^2$ are close because n = 420 is large.
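The fit measures defined above are straightforward to compute from the residuals. The following is a minimal sketch on simulated data; the function name `fit_measures`, the sample values, and the irrelevant regressor `junk` are assumptions for illustration. It also shows that adding an irrelevant regressor cannot lower $R^2$.

```python
import numpy as np

def fit_measures(y, X):
    """OLS fit of y on a constant and the columns of X, plus fit statistics."""
    n, k = X.shape                                   # k regressors, excluding the constant
    A = np.column_stack([np.ones(n), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ b
    resid = y - y_hat
    ssr = float(resid @ resid)                       # sum of squared residuals
    tss = float(((y - y.mean()) ** 2).sum())         # total sum of squares
    ess = float(((y_hat - y.mean()) ** 2).sum())     # explained sum of squares
    return {
        "ESS": ess, "SSR": ssr, "TSS": tss,
        "R2": 1 - ssr / tss,
        "adj_R2": 1 - (n - 1) / (n - k - 1) * ssr / tss,
        "SER": (ssr / (n - k - 1)) ** 0.5,           # with d.f. correction
        "RMSE": (ssr / n) ** 0.5,                    # without d.f. correction
    }

rng = np.random.default_rng(3)
n = 420
x1 = rng.normal(size=n)
junk = rng.normal(size=n)                            # regressor unrelated to y
y = 1.0 - 2.0 * x1 + rng.normal(size=n)

small = fit_measures(y, x1.reshape(-1, 1))
big = fit_measures(y, np.column_stack([x1, junk]))
print({name: round(value, 4) for name, value in big.items()})
```

In both fits ESS + SSR equals TSS up to rounding error, and the adjusted $R^2$ sits just below $R^2$ because n is large relative to k.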
23 The OLS Assumptions
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i, \quad i = 1, \dots, n$$
1. The error term u given the X's has mean 0: $E(u_i \mid X_{1i} = x_1, \dots, X_{ki} = x_k) = 0$
2. $(X_{1i}, \dots, X_{ki}, Y_i)$, $i = 1, \dots, n$, are i.i.d.
3. Large outliers are rare: $E(X_{1i}^4), \dots, E(X_{ki}^4), E(Y_i^4)$ are finite
4. There is no perfect multicollinearity (NEW)
Note: (4) is true in the case of a sole regressor.
24 Assumption 1: Mean of u given the included X's is 0
$$E(u \mid X_1 = x_1, \dots, X_k = x_k) = 0$$
Same interpretation as in regression with a single regressor.
Again, omitted variable bias occurs if both:
1. the omitted variable (OV) influences Y (so it is in u);
2. the OV is correlated with an included X.
If possible, the solution is to include the OV in the regression. Another solution is to include a third variable that controls for the OV (discussed in ch. 7).
25 Assumption 2: $(X_{1i}, \dots, X_{ki}, Y_i)$, $i = 1, \dots, n$, are i.i.d.
Automatic if the data are collected by simple random sampling.
Assumption 3: Large outliers are rare
Recall that OLS is sensitive to recurring large outliers, so check the data for outliers due to typos or coding errors.
26 Assumption 4: No perfect multicollinearity
Perfect multicollinearity: a regressor is an exact linear function of the other regressors.
Example: Suppose you accidentally include STR twice:
regress testscr str str, robust

Regression with robust standard errors
Number of obs = 420
F( 1, 418) =
Prob > F =
R-squared =
Root MSE =

testscr | Coef.   Robust Std. Err.   t   P>|t|   [95% Conf. Interval]
str     |
str     | (dropped)
_cons   |
27 In the previous regression, $\beta_1$ is the effect on TestScore of a unit change in STR, holding STR constant (absurd).
We will return to perfect (and imperfect) multicollinearity shortly, with more examples.
These OLS assumptions imply a sampling distribution of $\hat\beta_1, \hat\beta_2, \dots, \hat\beta_k$.
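28 Sampling Distribution of OLS Estimator
Under OLS assumptions 1-4, for every regressor $i = 1, \dots, k$:
- The sampling distribution of $\hat\beta_i$ has mean $\beta_i$ (unbiased).
- $\mathrm{var}(\hat\beta_i)$ is inversely proportional to n.
- Consistency: $\hat\beta_i \xrightarrow{p} \beta_i$
- In large samples, $\hat\beta_i$ is approximately normally distributed:
$$\frac{\hat\beta_i - E(\hat\beta_i)}{\sqrt{\mathrm{var}(\hat\beta_i)}} \sim N(0, 1) \text{ (approximately)}$$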
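These properties can be illustrated by Monte Carlo. The sketch below, which uses an assumed data-generating process with a true coefficient of 1.5 on the first regressor and a hypothetical helper `slope_draws`, re-estimates the regression on many simulated samples at two sample sizes and compares the mean and variance of the estimates.

```python
import numpy as np

def slope_draws(n, reps=2000, seed=4):
    """OLS estimates of the coefficient on x1 across `reps` simulated samples of size n."""
    rng = np.random.default_rng(seed)
    draws = np.empty(reps)
    for r in range(reps):
        x1 = rng.normal(size=n)
        x2 = 0.5 * x1 + rng.normal(size=n)               # correlated second regressor
        y = 1.0 + 1.5 * x1 - 0.5 * x2 + rng.normal(size=n)
        A = np.column_stack([np.ones(n), x1, x2])
        b, *_ = np.linalg.lstsq(A, y, rcond=None)
        draws[r] = b[1]
    return draws

small, large = slope_draws(100), slope_draws(400)
var_ratio = small.var() / large.var()
print(f"mean (n=100): {small.mean():.3f}, mean (n=400): {large.mean():.3f}")
print(f"variance ratio (n=100 vs n=400): {var_ratio:.2f}")
```

The means stay close to the assumed true value of 1.5 at both sample sizes, and quadrupling n cuts the variance of the estimates by roughly a factor of four, in line with a variance that is inversely proportional to n.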
29 Multicollinearity, Perfect and Imperfect
Perfect multicollinearity: some regressor is an exact linear function of the other regressors.
Some more examples of perfect multicollinearity:
1. The example from before: include the regressor STR twice.
2. Regress TestScore on a constant, D, and B, where $D_i = 1$ if $STR \le 20$, $= 0$ otherwise; $B_i = 1$ if $STR > 20$, $= 0$ otherwise, so $B_i = 1 - D_i$.
30 The dummy variable trap
Suppose there are multiple binary/dummy variables that are mutually exclusive and exhaustive: every observation falls in exactly one dummy category (e.g., Freshmen, Sophomores, Juniors, Seniors, Other).
Including all dummies together with the intercept $\beta_0$ leads to perfect multicollinearity ("the dummy variable trap"). How so? Exhaustive and exclusive dummies add up to 1, which is the intercept's regressor.
To avoid the dummy variable trap:
1. Omit one group (e.g. Senior), or
2. Omit the intercept.
How do (1) or (2) affect the interpretation of the coefficients?
1. Coefficients are differences relative to the omitted dummy.
2. Coefficients are absolute levels for the included dummies.
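The trap and its two fixes can be seen directly in the rank of the regressor matrix. The sketch below uses a made-up design with four groups and assumed group means; none of these numbers come from the slides.

```python
import numpy as np

groups = np.arange(12) % 4                    # 12 observations in 4 exhaustive, exclusive groups
D = np.eye(4)[groups]                         # one dummy column per group
ones = np.ones(len(groups))                   # the intercept's regressor

X_trap = np.column_stack([ones, D])           # intercept + all dummies: the trap
X_drop = np.column_stack([ones, D[:, :3]])    # fix 1: omit the last group
X_noint = D                                   # fix 2: omit the intercept

rank_trap = np.linalg.matrix_rank(X_trap)     # 4, although there are 5 columns
rank_drop = np.linalg.matrix_rank(X_drop)     # 4 columns, full rank
rank_noint = np.linalg.matrix_rank(X_noint)   # 4 columns, full rank

# Outcome equal to assumed group means, to read off the coefficients
group_means = np.array([10.0, 12.0, 15.0, 11.0])
y = group_means[groups]
b_drop, *_ = np.linalg.lstsq(X_drop, y, rcond=None)    # intercept, then differences
b_noint, *_ = np.linalg.lstsq(X_noint, y, rcond=None)  # absolute group levels
print("omit one group:", np.round(b_drop, 6))
print("omit intercept:", np.round(b_noint, 6))
```

With one group omitted, the intercept is the omitted group's level (11) and the dummy coefficients are differences from it (-1, 1, 4); with the intercept omitted, the coefficients are the group levels themselves.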
31 Perfect multicollinearity, ctd.
Perfect multicollinearity usually reflects a mistake in the definitions of the regressors, or an oddity in the data.
Software will detect perfect multicollinearity and signal it, either by crashing or by issuing an error message.
Solution: modify the list of regressors.
32 Imperfect multicollinearity
Imperfect and perfect multicollinearity are different.
Imperfect multicollinearity: some regressors are highly correlated.
Why the term "multicollinearity"? If two regressors are highly correlated, their scatterplot is close to a straight line: they are nearly co-linear.
33 Imperfect multicollinearity, ctd.
Imperfect multicollinearity implies that some coefficients will be imprecisely estimated.
The idea: the coefficient on $X_1$ is the effect of $X_1$ holding $X_2$ constant; but if $X_1$ and $X_2$ are highly correlated, there is very little variation in $X_1$ once $X_2$ is held constant, so the data don't contain much information about what happens when $X_1$ changes but $X_2$ doesn't. So the OLS estimator of the coefficient on $X_1$ is unreliable (high variance).
Imperfect multicollinearity (correctly) results in large standard errors for one or more of the OLS coefficients.
A special case clarifies (two regressors, homoskedasticity):
$$\sigma^2_{\hat\beta_1} = \mathrm{var}(\hat\beta_1) = \frac{1}{n}\cdot\frac{1}{1 - \rho^2_{X_1, X_2}}\cdot\frac{\sigma^2_u}{\sigma^2_{X_1}}$$
which is increasing as $|\rho_{X_1, X_2}| \to 1$ (as imperfect collinearity worsens).
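To get a feel for the special-case formula, the sketch below simply evaluates it for a few correlations. The function name `slope_variance` and the input values are illustrative assumptions.

```python
def slope_variance(n, sigma_u2, sigma_x1_2, rho):
    """Large-sample variance of the OLS slope on X1 (two regressors, homoskedastic errors)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1 (no perfect collinearity)")
    return (1.0 / n) * (1.0 / (1.0 - rho ** 2)) * (sigma_u2 / sigma_x1_2)

# Same n and variances, increasingly collinear regressors
for rho in (0.0, 0.5, 0.9, 0.99):
    v = slope_variance(n=100, sigma_u2=1.0, sigma_x1_2=1.0, rho=rho)
    print(f"rho = {rho:4.2f}  var = {v:.4f}  inflation = {v / 0.01:.1f}x")
```

The factor $1/(1 - \rho^2)$ is the only place the correlation enters, so the variance grows without bound as the regressors approach perfect collinearity, while a larger n shrinks it proportionally.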
More information4. Nonlinear regression functions
4. Nonlinear regression functions Up to now: Population regression function was assumed to be linear The slope(s) of the population regression function is (are) constant The effect on Y of a unit-change
More informationMotivation for multiple regression
Motivation for multiple regression 1. Simple regression puts all factors other than X in u, and treats them as unobserved. Effectively the simple regression does not account for other factors. 2. The slope
More informationLecture (chapter 13): Association between variables measured at the interval-ratio level
Lecture (chapter 13): Association between variables measured at the interval-ratio level Ernesto F. L. Amaral April 9 11, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015.
More informationRegression #8: Loose Ends
Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch
More information1 The basics of panel data
Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Related materials: Steven Buck Notes to accompany fixed effects material 4-16-14 ˆ Wooldridge 5e, Ch. 1.3: The Structure of Economic Data ˆ Wooldridge
More informationEconometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018
Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate
More informationLecture 8: Instrumental Variables Estimation
Lecture Notes on Advanced Econometrics Lecture 8: Instrumental Variables Estimation Endogenous Variables Consider a population model: y α y + β + β x + β x +... + β x + u i i i i k ik i Takashi Yamano
More informationLecture 7: OLS with qualitative information
Lecture 7: OLS with qualitative information Dummy variables Dummy variable: an indicator that says whether a particular observation is in a category or not Like a light switch: on or off Most useful values:
More informationThe Simple Regression Model. Part II. The Simple Regression Model
Part II The Simple Regression Model As of Sep 22, 2015 Definition 1 The Simple Regression Model Definition Estimation of the model, OLS OLS Statistics Algebraic properties Goodness-of-Fit, the R-square
More information2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0
Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct
More informationMultiple Regression Analysis. Part III. Multiple Regression Analysis
Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant
More informationSection I. Define or explain the following terms (3 points each) 1. centered vs. uncentered 2 R - 2. Frisch theorem -
First Exam: Economics 388, Econometrics Spring 006 in R. Butler s class YOUR NAME: Section I (30 points) Questions 1-10 (3 points each) Section II (40 points) Questions 11-15 (10 points each) Section III
More informationFixed and Random Effects Models: Vartanian, SW 683
: Vartanian, SW 683 Fixed and random effects models See: http://teaching.sociology.ul.ie/dcw/confront/node45.html When you have repeated observations per individual this is a problem and an advantage:
More informationProblem Set 1 ANSWERS
Economics 20 Prof. Patricia M. Anderson Problem Set 1 ANSWERS Part I. Multiple Choice Problems 1. If X and Z are two random variables, then E[X-Z] is d. E[X] E[Z] This is just a simple application of one
More informationHandout 11: Measurement Error
Handout 11: Measurement Error In which you learn to recognise the consequences for OLS estimation whenever some of the variables you use are not measured as accurately as you might expect. A (potential)
More informationLab 6 - Simple Regression
Lab 6 - Simple Regression Spring 2017 Contents 1 Thinking About Regression 2 2 Regression Output 3 3 Fitted Values 5 4 Residuals 6 5 Functional Forms 8 Updated from Stata tutorials provided by Prof. Cichello
More information4 Instrumental Variables Single endogenous variable One continuous instrument. 2
Econ 495 - Econometric Review 1 Contents 4 Instrumental Variables 2 4.1 Single endogenous variable One continuous instrument. 2 4.2 Single endogenous variable more than one continuous instrument..........................
More informationSimple Linear Regression for the Climate Data
Prediction Prediction Interval Temperature 0.2 0.0 0.2 0.4 0.6 0.8 320 340 360 380 CO 2 Simple Linear Regression for the Climate Data What do we do with the data? y i = Temperature of i th Year x i =CO
More informationP1.T2. Stock & Watson Chapters 4 & 5. Bionic Turtle FRM Video Tutorials. By: David Harper CFA, FRM, CIPM
P1.T2. Stock & Watson Chapters 4 & 5 Bionic Turtle FRM Video Tutorials By: David Harper CFA, FRM, CIPM Note: This tutorial is for paid members only. You know who you are. Anybody else is using an illegal
More informationLongitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois
Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control
More informationFinal Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)
Name Economics 170 Spring 2004 Honor pledge: I have neither given nor received aid on this exam including the preparation of my one page formula list and the preparation of the Stata assignment for the
More information1 Independent Practice: Hypothesis tests for one parameter:
1 Independent Practice: Hypothesis tests for one parameter: Data from the Indian DHS survey from 2006 includes a measure of autonomy of the women surveyed (a scale from 0-10, 10 being the most autonomous)
More information8. Instrumental variables regression
8. Instrumental variables regression Recall: In Section 5 we analyzed five sources of estimation bias arising because the regressor is correlated with the error term Violation of the first OLS assumption
More informationIntroductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1
Introductory Econometrics Lecture 13: Hypothesis testing in the multiple regression model, Part 1 Jun Ma School of Economics Renmin University of China October 19, 2016 The model I We consider the classical
More information