Motivation for multiple regression

1. Simple regression puts all factors other than X into u and treats them as unobserved. Effectively, the simple regression does not account for other factors.

2. The slope coefficient β1 in the simple regression has a causal interpretation only if the independent variable X is exogenous, i.e., cov(X, u) = 0. This is an assumption that we need to check.

3. The assumption cov(X, u) = 0 can be too strong to hold in reality. If any variable in u is correlated with X, the independent variable becomes endogenous, and the slope coefficient no longer has a causal interpretation (in this case β1 just measures association).

4. We can show that

$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} = \frac{\sum_i (x_i - \bar{x})\, y_i}{\sum_i (x_i - \bar{x})^2} \qquad (1)$$

$$= \frac{\sum_i (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{\sum_i (x_i - \bar{x})^2} = \beta_1 + \frac{\sum_i (x_i - \bar{x})\, u_i}{\sum_i (x_i - \bar{x})^2} \qquad (2)$$

5. The last result is important. It implies that, as n goes to infinity,

$$\hat{\beta}_1 \to \beta_1 + \frac{\operatorname{cov}(x, u)}{\sigma_x^2} \qquad (3)$$

$$\hat{\beta}_1 \to \begin{cases} \beta_1, & \text{if } \operatorname{cov}(x, u) = 0 \\ \beta_1 + \text{bias}, & \text{if } \operatorname{cov}(x, u) \neq 0 \end{cases}, \qquad \text{bias} = \frac{\operatorname{cov}(x, u)}{\sigma_x^2} \qquad (4)$$

In words, the estimated slope coefficient converges to the true value (so it is consistent) if X is exogenous. When X is endogenous, the estimate is biased, and the bias is given by cov(x, u)/σ²x.

6. One extreme case is β1 = 0, so X has no causal effect on Y at all. But because X is correlated with some variables in the error term, the regression can still produce a statistically significant estimate of β1. In other words, the regression indicates spurious causality that does not exist.
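To make the bias in (3)-(4) concrete, here is a minimal simulation sketch in Stata. The variable names, the true coefficients (1, 2, 3), and the degree of correlation between x1 and x2 are all made up for illustration; they are not part of the course data.

* simulate data in which x2 affects y and is correlated with x1, then omit it
clear
set seed 12345
set obs 1000
gen x2 = rnormal()
gen x1 = 0.5*x2 + rnormal()              // x1 and x2 are positively correlated
gen y  = 1 + 2*x1 + 3*x2 + rnormal()     // the true effect of x1 is 2
reg y x1                                 // omits x2: slope is biased upward, roughly 2 + 3*cov(x1,x2)/var(x1)
reg y x1 x2                              // controls for x2: slope is close to 2

The short regression overstates the effect of x1, which is exactly the pattern described by formula (4).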

7. For example, we may regress salary on education and put ability in the error term. Because ability and education are correlated, the result of the simple regression may be biased. What is captured by the simple regression may be the effect of ability on salary, not the effect of education on salary.

8. A multiple regression can explicitly account for many, if not all, other factors by taking them out of the error term. Consider the simplest multiple regression

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + e, \qquad (5)$$

where e is the error term, for which we assume

$$E(e \mid X_1, X_2) = 0. \qquad (6)$$

9. This multiple regression becomes the simple regression if we define the error term in the simple regression as u = β2 X2 + e. If we run the simple regression of Y on X1, we can show

$$\hat{\beta}_1 \to \beta_1 + \text{bias}, \qquad \text{bias} = \frac{\operatorname{cov}(X_1, u)}{\sigma_{x_1}^2} = \beta_2 \, \frac{\operatorname{cov}(X_1, X_2)}{\sigma_{x_1}^2} \qquad (7)$$

If β2 > 0 and cov(X1, X2) > 0, then bias > 0 and the estimate exceeds β1, so the simple regression overestimates the effect of X1 on Y. See Table 3.2 of the textbook for the other possibilities.

10. We call X2 an omitted variable if (i) it has a causal effect on Y (so β2 ≠ 0), (ii) it is correlated with the key regressor (so cov(X1, X2) ≠ 0), and (iii) it is excluded from the regression (being put in the error term). The bias caused by an omitted variable is called omitted variable bias.

11. If we use non-experimental data and the goal is to prove causality, omitted variable bias is the top issue we need to address. We have to ask whether any variable in the error term is an omitted variable.

12. Consider another example of omitted variable bias. Earlier we ran the simple regression of house price on the number of bathrooms X1. In that simple regression the error term contains the size of the house X2, which is the omitted variable: X2 affects the house price and is correlated with X1. Therefore the slope of the simple regression is a biased estimate of the true causal effect of X1 on Y. (The sample analogue of formula (7) is checked numerically in the sketch below.)
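The following Stata sketch checks the sample analogue of formula (7) with the house data (it assumes the data set with rprice, baths, and area used later in these notes is in memory): the slope of the short regression equals the slope of the long regression plus the area coefficient times the slope from regressing the omitted variable on the key regressor.

* long regression: coefficients of the key regressor and of the omitted variable
qui reg rprice baths area
sca b1 = _b[baths]
sca b2 = _b[area]
* auxiliary slope of the omitted variable on the key regressor
qui reg area baths
sca delta = _b[baths]
sca implied = b1 + b2*delta
* short regression that omits area
qui reg rprice baths
dis "slope in the short regression: " _b[baths]
dis "b1 + b2*delta:                 " implied

The two displayed numbers coincide, which is the exact finite-sample counterpart of the bias formula in (7).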

Estimating multiple regression

1. Consider a multiple regression given by

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_k X_k + u \qquad (1)$$

Note there are k + 1 regressors, and one of them is the constant (the intercept term).

2. The unknown parameters are βj, j = 0, 1, ..., k, and σ² = var(u).

3. The key assumption needed for a causal interpretation in the multiple regression is

$$E(u \mid X_1, X_2, \ldots, X_k) = E(u) = 0 \qquad (2)$$

Assumption (2) is more likely to hold in reality than the corresponding assumption in the simple regression because u now contains fewer variables. That means the multiple regression is more suitable than the simple regression for proving causality.

4. Notice that Assumption (2) does NOT require cov(Xi, Xj) = 0. It is OK for the regressors in the multiple regression to be correlated with each other. Actually that is the whole point of multiple regression: it explicitly controls for X2, ..., Xk, and any one of them can be correlated with the key regressor X1. Assumption (2) only requires that the regressors be uncorrelated with the error term.

5. The OLS estimators of the coefficients, denoted by β̂j (j = 0, 1, ..., k), are obtained by solving k + 1 equations, which are the first order conditions (FOC) for minimizing the residual sum of squares:

$$\sum_i \hat{u}_i = 0 \;\; (\text{FOC 1}), \qquad \sum_i x_{1i}\hat{u}_i = 0 \;\; (\text{FOC 2}), \qquad \ldots, \qquad \sum_i x_{ki}\hat{u}_i = 0 \;\; (\text{FOC } k+1) \qquad (3)$$

The general formula for β̂j is complicated; matrix algebra (not required) is needed to write it compactly.
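A quick numerical check of the first order conditions (3), assuming the house data with rprice, baths, and area (introduced in the example later in these notes) is loaded: the OLS residuals sum to zero and have zero sample covariance with every regressor.

qui reg rprice baths area
predict uhat, resid                     // OLS residuals
qui sum uhat
dis "sum of residuals (FOC 1):        " r(sum)
qui corr baths uhat, cov
dis "sample cov(baths, uhat) (FOC 2): " r(cov_12)
qui corr area uhat, cov
dis "sample cov(area, uhat) (FOC 3):  " r(cov_12)

All three displayed numbers are zero up to rounding error.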

6. However, there is a simple formula for β̂1 if we follow a two-step procedure.

Theorem 1 (Frisch-Waugh Theorem) Let r̂ be the residual of the auxiliary regression of X1 on X2, ..., Xk. Then the OLS estimator for β1 in multiple regression (1) is

$$\hat{\beta}_1 = \frac{\sum_i \hat{r}_i y_i}{\sum_i \hat{r}_i^2} \qquad (4)$$

7. The Frisch-Waugh Theorem indicates that we can obtain β̂1 in two steps:

(a) Step 1: regress X1 on X2, ..., Xk and keep the residual r̂
(b) Step 2: regress Y on r̂ without an intercept term

8. The residual r̂ measures the part of X1 that cannot be explained by X2, ..., Xk. Put differently, r̂ captures the part of X1 that remains after the effect of the other factors has been netted out. This is why multiple regression is better than simple regression for proving causality.

9. Proof of the Frisch-Waugh Theorem: Because r̂ is the residual of the auxiliary regression, it satisfies the FOC of that regression. That is,

$$\sum_i \hat{r}_i = 0, \qquad \sum_i x_{2i}\hat{r}_i = 0, \qquad \ldots, \qquad \sum_i x_{ki}\hat{r}_i = 0 \qquad (5)$$

In addition, because x_{1i} equals its auxiliary fitted value plus r̂_i, and because the residual û of regression (1) is orthogonal to all regressors (and r̂ is a linear combination of them),

$$\sum_i \hat{r}_i x_{1i} = \sum_i \hat{r}_i^2, \qquad \sum_i \hat{r}_i \hat{u}_i = 0 \qquad (6)$$

where û_i is the residual for (1). Writing y_i as fitted value plus residual, the above equations imply that

$$\sum_i \hat{r}_i y_i = \sum_i \hat{r}_i \left( \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \ldots + \hat{\beta}_k x_{ki} + \hat{u}_i \right) = \hat{\beta}_1 \sum_i \hat{r}_i^2, \qquad \text{hence} \qquad \frac{\sum_i \hat{r}_i y_i}{\sum_i \hat{r}_i^2} = \hat{\beta}_1 \qquad (7)$$

10. The OLS estimate and the true coefficient are related via

$$\hat{\beta}_1 = \beta_1 + \frac{\sum_i \hat{r}_i u_i}{\sum_i \hat{r}_i^2} \qquad (8)$$

from which we can derive the statistical properties of β̂1:

(a) E(β̂1 | X) = β1, so β̂1 is unbiased.

(b) The (conditional) variance of β̂1 (assuming homoskedasticity) is

$$\operatorname{var}(\hat{\beta}_1 \mid X) = \frac{\sigma^2}{\sum_i \hat{r}_i^2} = \frac{\sigma^2}{SST_{X_1}\,(1 - R^2_{X_1})} \qquad (9)$$

where $SST_{X_1} = \sum_i (x_{1i} - \bar{x}_1)^2$ measures the total variation in X1, and $R^2_{X_1}$ denotes the R-squared of the auxiliary regression of X1 on X2, ..., Xk. Everything else equal, the variance is large (and the OLS estimate is imprecise) if X1 is highly correlated with the other independent variables ($R^2_{X_1}$ is big). The phenomenon of high correlation among regressors is called multicollinearity. The consequence of multicollinearity is an insignificant estimate: intuitively, when regressors are highly correlated, the regression cannot tell them apart, so the estimates are imprecise.

11. Now we face a trade-off. The chance of multicollinearity is zero when we run a simple regression, but the simple regression has a high chance of suffering from omitted variable bias. Multiple regression has a higher chance of multicollinearity, but a lower chance of omitted variable bias. Econometrics puts more weight on omitted variable bias than on multicollinearity.

12. After obtaining the coefficient estimates, we can compute the residuals

$$\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \ldots - \hat{\beta}_k X_{ki} \qquad (10)$$

Then the variance of the error term is estimated as

$$\hat{\sigma}^2 = \frac{\sum_i \hat{u}_i^2}{n - k - 1} \qquad (11)$$

The square root of σ̂² is called the standard error of the regression (SER). Both (9) and (11) can be computed by hand, as in the sketch below.
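A sketch that reproduces the SER in (11) and the standard error implied by formula (9) by hand in Stata, assuming the house data with rprice, baths, and area (used in the example below) is in memory; sigma2, r2aux, sst1, and se_formula are just scratch scalar names for the illustration.

qui reg rprice baths area
sca se_reported = _se[baths]            // standard error reported by -regress-
sca sigma2 = e(rmse)^2                  // e(rmse) is the SER, so its square is sigma-hat squared
qui reg baths area                      // auxiliary regression of X1 on the other regressor
sca r2aux = e(r2)
qui sum baths
sca sst1 = r(Var)*(r(N)-1)              // total variation in baths
sca se_formula = sqrt(sigma2/(sst1*(1 - r2aux)))
dis "se(b1) reported by regress: " se_reported
dis "se(b1) from formula (9):    " se_formula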

13. Because a multiple regression has several independent variables, we can test hypotheses that involve several coefficients. The test is called the F test, and it is computed as

$$F = \frac{(RSS_r - RSS_u)/q}{RSS_u/(n - k_u - 1)} \qquad (12)$$

where RSS_r is the RSS of the restricted regression that imposes the null hypothesis, RSS_u is the RSS of the unrestricted regression, q is the number of restrictions, and k_u is the number of regressors in the unrestricted regression.

(a) The F test follows the F distribution with degrees of freedom (q, n − k_u − 1) under the null hypothesis. The t test is a special case of the F test. The null hypothesis is rejected if the p-value is less than the chosen significance level.

(b) The intuition is that the null hypothesis is false (so can be rejected) if imposing the null hypothesis significantly increases the RSS.

(c) For example, consider an unrestricted multiple regression

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u$$

The null hypothesis is

$$H_0: \beta_1 = \beta_2$$

By imposing the restriction in the null hypothesis, we get the restricted regression

$$Y = \beta_0 + \beta_1 (X_1 + X_2) + u$$

so the restricted regression uses X1 + X2 as the regressor. For this example q = 1.

The F test can be used whenever the null hypothesis involves several coefficients.
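A small sketch of the claim in (a) that the t test is a special case of the F test, again assuming the house data is loaded: for a single restriction, the F statistic reported by Stata's test command equals the squared t statistic.

qui reg rprice baths area
dis "squared t statistic for baths: " (_b[baths]/_se[baths])^2
test baths = 0                          // reports F(1, n-k-1), which equals the squared t statistic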

Example: Multiple Regression

1. We still use the house data.

2. First we run the simple regression of rprice on baths. This simple regression puts the variable area (which measures the size of the house) into the error term. Because the number of bathrooms and house size must be correlated, baths is endogenous in the simple regression. As a result, the estimated coefficient of baths has NO causal interpretation (it is a biased estimate of the true causal effect); it just measures the linear association or correlation between baths and rprice. We can only conclude that having one more bathroom is associated with an increase in real house price equal to the estimated coefficient. The OLS fitted line, however, is the best linear predictor of rprice (if we only use baths as a predictor), no matter whether baths is exogenous or not.

3. Next we run the multiple regression of rprice on baths and area. The Stata command is reg rprice baths area. Now area is taken out of the error term, but other factors may still be in there. That means it is still unlikely that we obtain the causal effect by running this multiple regression with just two regressors. The estimated coefficient of baths is now smaller than in the simple regression. We can conclude that having one more bathroom, while holding house size fixed, is associated with an increase in real house price equal to this new estimate. Put differently, if we have two houses of the same size, but one house has one more bathroom than the other, then the rprice of the former is higher than that of the latter by this amount. Another way to interpret the coefficient: it measures the association between baths and rprice after the effect of area has been netted out.

4. So one benefit of multiple regression is that it can explicitly control for other factors, and therefore lowers the chance of omitted variable bias. Comparing the simple and multiple regressions, it is safe to say that the simple regression (in this example) overestimates the effect of baths on rprice. The simple regression coefficient may capture not just the effect of baths but also that of area. In short, there is omitted variable bias in the simple regression; the omitted variable is area. The omitted variable bias is positive since baths and area are positively correlated and we expect area to have a positive effect on rprice; see Table 3.2 in the textbook for detail.

5. Another benefit of multiple regression is a bigger R-squared: the R-squared of the multiple regression is larger than that of the simple regression. The multiple regression fits the data better simply because it uses more regressors (more information). The two fitted models can be compared side by side, as in the sketch below.
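A convenient way to make the comparison in items 2-5 is to store both sets of estimates and tabulate them together; this sketch assumes the house data is loaded, and the stored names simple and multiple are arbitrary labels.

qui reg rprice baths
estimates store simple
qui reg rprice baths area
estimates store multiple
* coefficients, standard errors, R-squared, and sample size side by side
estimates table simple multiple, b se stats(r2 N)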

6. Multiple regression has a cost. Note that in the multiple regression the standard error of the baths coefficient is higher than in the simple regression. This finding implies that the estimate in the multiple regression is less precise than in the simple regression. It is the correlation between baths and area that causes the variance to rise; see the formula

$$\operatorname{var}(\hat{\beta}_1 \mid X) = \frac{\sigma^2}{\sum_i \hat{r}_i^2} = \frac{\sigma^2}{SST_{X_1}\,(1 - R^2_{X_1})}.$$

For this problem we are lucky because baths is still significant after area is included. In practice, it is not uncommon that the key regressor becomes insignificant after other regressors are added.

7. Next we apply the Frisch-Waugh Theorem to show how the multiple-regression coefficient of baths can be obtained in two steps. The Stata commands are

* step 1: auxiliary regression
qui reg baths area
predict rhat, re
* step 2: regress y onto rhat
reg rprice rhat

So first we (quietly) regress baths (X1) onto area (X2) and save the residual rhat (r̂) using the command predict with the option re (residuals). In step 2 we regress rprice (Y) onto rhat. We get the same estimate as that reported by the command reg rprice baths area, so the Frisch-Waugh Theorem is verified.

8. We also report the F test for the hypothesis that baths and area have the same effect on rprice. The null hypothesis is H0: β1 = β2. You can use the command test baths = area, or you can construct the F test manually by running the unrestricted and restricted regressions. See the do file for details.
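As a complement to the two-step commands in item 7, formula (4) can also be evaluated directly from rhat. This sketch assumes step 1 above has already been run, so the variable rhat exists; rhat_y, rhat_sq, num, den, and b1_fw are scratch names for the illustration.

gen rhat_y  = rhat*rprice
gen rhat_sq = rhat^2
qui sum rhat_y
sca num = r(sum)
qui sum rhat_sq
sca den = r(sum)
sca b1_fw = num/den                     // equals the baths coefficient from reg rprice baths area
dis "beta1 from formula (4): " b1_fw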


Do File

* Do file for multiple regression (chapter 3 and 4)
clear
capture log close
*************************************
cd "I:\311"
log using 311log.txt, text replace
use 311_house.dta, clear
* simple regression
reg rprice baths
* multiple regression
reg rprice baths area
* save rss
sca rssu = e(rss)
* example of f test, H0: beta1 = beta2
test baths = area
* restricted regression
gen x = baths + area
qui reg rprice x
sca rssr = e(rss)
* F test and p value (n = 321, k_u = 2, so df = 321 - 3 = 318)
sca f = ((rssr-rssu)/1)/(rssu/(321-3))
sca pvalue = Ftail(1, 318, f)
dis "f test is " f
dis "pvalue is " pvalue
* verify Frisch-Waugh Theorem
* step 1: auxiliary regression
qui reg baths area
predict rhat, re
* step 2: regress y onto rhat
reg rprice rhat
*****************************************
log close
