# ECON3150/4150 Spring 2016


## Transcription

1 ECON3150/4150 Spring 2016. Lecture 6: Multiple regression model. Siv-Elisabeth Skjelbred, University of Oslo, February 5th. Last updated: February 3, 2016. 1 / 49

2 Outline
- Multiple linear regression model and OLS
- Estimation
- Properties
- Measures of fit
- Data scaling
- Dummy variables in the MLRM

3 Estimation of the MLRM

4 OLS estimation of the MLRM. The procedure for obtaining the estimates is the same as with one regressor: choose the estimates that minimize the sum of squared residuals, min Σ_{i=1}^n û_i². The estimates β̂_0, β̂_1 and β̂_2 are chosen simultaneously to make the sum of squared residuals as small as possible. The first subscript i is the observation number; the second subscript is the variable number, so β_j is the coefficient on variable number j. In the general form we have k independent variables, and thus k+1 first-order conditions.

5 OLS estimation of the MLRM. If k = 2, minimize:

S(β_0, β_1, β_2) = Σ_{i=1}^n (Y_i − β_0 − β_1 X_{1i} − β_2 X_{2i})²

The solution to the FOCs gives you:
- The ordinary least squares estimators (β̂_0, β̂_1, β̂_2) of the true population coefficients (β_0, β_1, β_2).
- The predicted value Ŷ_i of Y_i given X_{1i} and X_{2i}.
- The OLS residuals û_i = Y_i − Ŷ_i.
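The minimization above has a closed-form solution. As a sketch (not from the slides, which use Stata), one can verify in Python that the least-squares solution recovers known coefficients exactly when there is no error term; the coefficient values and sample size are illustrative assumptions:

```python
import numpy as np

# Hypothetical data with k = 2 regressors; coefficients chosen for illustration
rng = np.random.default_rng(0)
n = 50
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 1.0 + 2.0 * X1 - 3.0 * X2          # no error term, so OLS is exact

# Stack an intercept column and solve min sum(u_hat^2) by least squares
X = np.column_stack([np.ones(n), X1, X2])
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat)  # recovers [1.0, 2.0, -3.0]
```

With a nonzero error term the estimates would only be close to, not equal to, the population coefficients.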

6 OLS estimation of the MLRM. The OLS fitted values and residuals have the same important properties as in the simple linear regression:
- The sample average of the residuals is zero, and so Ȳ equals the mean of the fitted values Ŷ_i.
- The sample covariance between each independent variable and the OLS residuals is zero. Consequently, the sample covariance between the OLS fitted values and the OLS residuals is zero.
- The point (X̄_1, X̄_2, ..., X̄_k, Ȳ) is always on the OLS regression line.

7 Properties of the MLRM OLS estimator. Under the OLS assumptions the OLS estimators of the MLRM are unbiased and consistent estimators of the unknown population coefficients: E(β̂_j) = β_j, j = 0, 1, 2, ..., k. In large samples the joint sampling distribution of β̂_0, β̂_1, ..., β̂_k is well approximated by a multivariate normal distribution. Under the OLS assumptions, including homoskedasticity, the OLS estimators β̂_j are the best linear unbiased estimators of the population parameters β_j. Under heteroskedasticity the OLS estimators are not necessarily the ones with the smallest variance.

8 Consistency. Clive W. J. Granger (Nobel Prize winner) once said: "If you can't get it right as n goes to infinity you shouldn't be in this business." Consistency involves a thought experiment about what would happen as the sample size gets large. If obtaining more and more data does not generally get us closer to the parameter of interest, then we are using a poor estimation procedure. The OLS estimators are inconsistent if the error is correlated with any of the independent variables.

9 Variance of the OLS estimator. Under the OLS assumptions, conditional on the sample values of the independent variables:

var(β̂_j) = σ² / [Σ_{i=1}^n (X_{ij} − X̄_j)² · (1 − R_j²)],  j = 0, 1, 2, ..., k

where R_j² is the R-squared from regressing X_j on all the other independent variables. As in the SLRM, the OLS variance of β̂_j depends on the variance of the error term and the sample variance of the independent variable. In addition it depends on the linear relationship among the independent variables through R_j².
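The role of R_j² can be checked numerically. The following Python sketch (not from the slides; the data-generating process and σ² are illustrative assumptions) verifies that the formula above equals the corresponding diagonal entry of σ²(X'X)⁻¹, the usual matrix expression for the OLS variance:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)     # deliberately correlated regressors

X = np.column_stack([np.ones(n), x1, x2])
sigma2 = 4.0                           # assumed error variance

# Matrix form: var(beta_hat) = sigma^2 * (X'X)^{-1}
V = sigma2 * np.linalg.inv(X.T @ X)

# Slide formula for var(beta1_hat): sigma^2 / (SST_1 * (1 - R_1^2)),
# where R_1^2 comes from regressing x1 on the other regressors (constant and x2)
Z = np.column_stack([np.ones(n), x2])
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
resid = x1 - Z @ g
SST1 = ((x1 - x1.mean()) ** 2).sum()
R1sq = 1 - (resid @ resid) / SST1
var_b1 = sigma2 / (SST1 * (1 - R1sq))

print(np.isclose(var_b1, V[1, 1]))  # True
```

The closer x1 and x2 are to collinear, the closer R_1² is to 1 and the larger var(β̂_1) becomes.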

10 Variance of the OLS estimator (figure).

11 Variance of the OLS estimator. Blue: error distributed normal with mean 0 and standard deviation 10. Red: error distributed normal with mean 0 and standard deviation … (figure).

12 Estimating the variance. If σ² is not known we need to estimate it:

σ̂² = (1 / (n − k − 1)) Σ_{i=1}^n û_i²

Higher variance gives higher standard errors, i.e. lower precision due to more noise.

13 Added OLS assumption: normality. The population error u is independent of the explanatory variables and is normally distributed with zero mean and variance σ²: u ~ Normal(0, σ²). If the other OLS assumptions plus this one hold, the OLS estimator has an exact normal sampling distribution and the homoskedasticity-only t-statistic has an exact Student t distribution. The OLS estimators are jointly normally distributed; each β̂_j is distributed N(β_j, σ²_{β̂_j}).

14 Simulation. A simulation is a fictitious computer representation of reality. Steps in a simulation:
1. Choose the sample size n.
2. Choose the parameter values and the functional form of the population regression function.
3. Generate n values of x randomly in Stata.
4. Choose the probability distribution of the error term and generate n values of u.
5. Estimate the model.
6. Repeat steps 1 through 5 multiple times and look at the summary statistics over the repetitions.
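The steps above can be sketched in a few lines. The slides do this in Stata; here is an equivalent Python sketch, where the population parameters (β_0 = 2, β_1 = 0.5), the x-distribution, and the number of replications are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_once(n=100, b0=2.0, b1=0.5):
    # Assumed population: Y = b0 + b1*x + u, u ~ N(0, 1)
    x = rng.uniform(0, 10, size=n)       # step 3: generate x
    u = rng.normal(0, 1, size=n)         # step 4: generate u
    y = b0 + b1 * x + u
    X = np.column_stack([np.ones(n), x])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # step 5: estimate
    return beta_hat

# Step 6: repeat and summarize across replications
draws = np.array([simulate_once() for _ in range(1000)])
print(draws.mean(axis=0))  # averages close to (2.0, 0.5)
```

Averaging the 1000 estimates recovers the population parameters, which is the unbiasedness property the later slides illustrate.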

15 Monte Carlo simulation. Example: a random realization of X and u for 100 observations with a true population function of the form Y = β_0 + β_1·x + u. How does OLS perform in estimating the underlying population function?

16 Simulation. Stata output from `reg y x` (Number of obs = 100): the table reports SS, df, MS, F, R-squared, adjusted R-squared, Root MSE, and the coefficient estimates with standard errors, t-statistics and 95% confidence intervals for x and the constant (values omitted). The coefficients are close to the true population coefficients.

17 Simulation. Running one simulation got us close to the true values; what if we simulate 1000 times? That gives us 1000 estimates of β_0 and β_1.

18 Simulation. Stata output from `sum` over the 1000 replications, reporting Obs, Mean, Std. Dev., Min and Max for `_b_x` and `_b_cons` (values omitted). The estimated OLS coefficients are on average close to the true population coefficients: OLS gives an unbiased estimate of the slope coefficient and the constant term.

19 Simulation: bias. Another example. Suppose that the population is characterized by y = 3 − 2x_1 + u, so β_0 = 3 and β_1 = −2. u is distributed normal with mean 0 and standard deviation 3; the x's are between 0.01 and 10, spaced evenly; n = 1000. Estimate using y = β_0 + β_1 x_1 + u and plot y on x.

20 Simulation: bias. Scatterplot of the simulated data (figure).

21 Simulation: bias. How to make the scatter plot in Stata:

    . * drop any observations that are currently in memory
    . clear
    . * in the runs of the regression we want there to be 1000 observations
    . set obs 1000
    . gen u = rnormal(0,3)
    . * generate x, evenly spaced between 0.01 and 10
    . range x 0.01 10
    . * generate y
    . gen y = 3 - 2*x + u
    . twoway (scatter y x, mfcolor(white) mlcolor(black)), graphregion(color(white))

22 Simulation: bias. Suppose that we sample 30 people from the population and estimate β_1 via OLS. First sample: β̂_1 = …. Second sample: β̂_1 = …. Third sample: β̂_1 = …. None of them equals the population parameter of −2. Is this a problem?

23 Simulation: bias. Keep sampling! If we sample 1000 times we get a histogram of the estimates of β_1 (figure).

24 Variance. Stata `sum` output for the model from example 2, first with 10 observations repeated on 1000 samples, then with 1000 observations repeated on 1000 samples. Each table reports Obs, Mean, Std. Dev., Min and Max for `_b_x` and `_b_cons` (values omitted).

25 Simulation. Now the errors are not distributed with mean 0 but with mean 3, so u ~ N(3, 1). Stata `sum` output over the replications (values omitted). As long as X and u are uncorrelated, β̂_1 is unbiased. The constant term and the error term are correlated in this situation, so β̂_0 is biased.
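This can be replicated outside Stata. The Python sketch below (parameters taken from the example on slide 19; the replication count is an assumption) shows that with u ~ N(3, 1) the slope estimate stays centered on β_1 = −2, while the intercept estimate is centered on β_0 + 3 = 6, absorbing the nonzero error mean:

```python
import numpy as np

rng = np.random.default_rng(2)
b0, b1 = 3.0, -2.0                       # population parameters from slide 19

est = []
for _ in range(1000):
    x = rng.uniform(0.01, 10, size=100)
    u = rng.normal(3, 1, size=100)       # error mean is 3, not 0
    y = b0 + b1 * x + u
    X = np.column_stack([np.ones(100), x])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    est.append(beta_hat)
est = np.array(est)

print(est.mean(axis=0))  # intercept near 6.0 (= b0 + 3), slope near -2.0
```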

26 Simulation: heteroskedasticity. By assumption the variance of the errors is constant across x (the homoskedasticity assumption). Which graph illustrates a violation of this assumption? (figure)

27 Measures of fit

28 Goodness of fit. SST, SSE and SSR are defined exactly as in the simple regression case, which means that R² is defined the same as in the regression with one regressor. However, R² never decreases and typically increases when you add another regressor, since you explain at least as much as with one regressor. An increased R² therefore does not necessarily mean that the added variable improves the fit of the model.

29 The adjusted R-squared. The adjusted R-squared is introduced in the MLRM to compensate for the mechanically increasing R-squared. It includes a penalty for including another regressor, so R̄² does not necessarily increase when you add another regressor:

R̄² = 1 − [(n − 1) / (n − k − 1)] · (SSR / TSS)    (1)

30 Properties of R̄².
- Since (n − 1)/(n − k − 1) > 1, we have R̄² < R².
- Adding a variable may decrease or increase R̄², depending on whether the increase in explanation is large enough to make up for the penalty.
- R̄² can be negative.
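The penalty in equation (1) is easy to compute directly. A Python sketch with made-up SSR, TSS, n and k values (illustrative assumptions, not from the slides):

```python
def adjusted_r2(ssr, tss, n, k):
    """R-bar-squared: plain R^2 with a penalty for the number of regressors k."""
    return 1 - (n - 1) / (n - k - 1) * (ssr / tss)

# Hypothetical fit: SSR = 20, TSS = 100, n = 30 observations, k = 3 regressors
r2 = 1 - 20 / 100                          # plain R^2 = 0.8
adj = adjusted_r2(20, 100, n=30, k=3)
print(adj)  # about 0.777, below the plain R^2 of 0.8
```

Because (n − 1)/(n − k − 1) > 1 whenever k ≥ 1, the adjusted value is always below the plain R² here.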

31 A note of caution about R² / R̄². The goal of regression is not to maximize R² (or R̄²) but to estimate the causal effect. R² is simply an estimate of how much of the variation in y is explained by the independent variables in the population. Although a low R² means that we have not accounted for several factors that affect Y, this does not mean that the factors in u are correlated with the independent variables. Whether to include a variable should thus be based on whether it improves the estimate, rather than on whether it increases the fraction of variance we can explain. A low R² does imply that the error variance is large relative to the variance of Y, which means we may have a hard time precisely estimating the β_j. But a large error variance can be offset by a large sample size: with enough data one can precisely estimate the partial effects even when there are many unobserved factors.

32 The standard error of the regression. Remember that the standard error of the regression (SER) estimates the standard deviation of the error term u_i:

SER = s_û = √(s²_û), where s²_û = (1 / (n − k − 1)) Σ_{i=1}^n û_i² = SSR / (n − k − 1)    (2)

The only difference from the SLRM is that the number of regressors k enters the formula.
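Equation (2) in code, as a small sketch with a toy residual vector (the residual values and k are illustrative assumptions):

```python
import numpy as np

def ser(residuals, k):
    """Standard error of the regression for a model with k regressors."""
    n = len(residuals)
    return np.sqrt((residuals ** 2).sum() / (n - k - 1))

u_hat = np.array([1.0, -1.0, 2.0, -2.0])   # toy residuals, n = 4, k = 1
print(ser(u_hat, 1))  # sqrt(10 / 2) = sqrt(5), about 2.236
```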

33 Heteroskedasticity and OVB. Heteroskedasticity is likely to occur in data sets in which there is a wide disparity between the largest and smallest observed values. Pure heteroskedasticity is caused by the error term of a correctly specified equation. Impure heteroskedasticity is caused by an error in specification, such as an omitted variable.

34 Overspecification. The OVB problem may lead you to think that you should include every variable you have in your regression. If an explanatory variable in a regression model has a zero population parameter, we call that variable irrelevant: it has no partial effect on y. A model that includes irrelevant variables is called an overspecified model. An overspecified model gives unbiased estimates, but it can have undesirable effects on the variances of the OLS estimators.

35 Controlling for too many factors. In a similar way we can over-control for factors. In some cases it makes no sense to hold some factors fixed, precisely because they should be allowed to change. If you are interested in the effect of beer taxes on traffic fatalities, it makes no sense to estimate fatalities = β_0 + β_1·tax + β_2·beercons + ..., as you would then measure the effect of the tax holding beer consumption fixed, which is not particularly interesting unless you want to test for some indirect effect of beer taxes.

36 Measurement of variables

37 Effects of data scaling on OLS. Consider an example, using bwght.dta, where bwght̂ = β̂_0 + β̂_1·cigs + β̂_2·faminc, with:
- bwght: child birth weight, in ounces
- cigs: number of cigarettes smoked by the mother per day while pregnant
- faminc: annual family income, in thousands of dollars

38 Effects of data scaling on OLS. Stata output from `reg bwght cigs faminc` (Number of obs = 1388): coefficient estimates, standard errors, t-statistics and 95% confidence intervals for cigs, faminc and the constant, plus the fit statistics (values omitted).

39 Effects of data scaling on OLS. Alternatively you can specify the model in other units, e.g. grams, where bwghtgram = bwght · 28.35. Suppose instead we measure birth weight in pounds: bwght̂/16 = β̂_0/16 + (β̂_1/16)·cigs + (β̂_2/16)·faminc. It follows from previous lectures that each new coefficient is the corresponding old coefficient divided by 16. If you wanted grams, each coefficient would instead be multiplied by 28.35. Once the effects are transformed into the same units we get exactly the same answer, regardless of how the dependent variable is measured.

40 Effects of data scaling on OLS. Alternatively one could measure cigs in cigarette packs instead. Then:

bwght̂ = β̂_0 + β̂_1(20·cigs/20) + β̂_2·faminc = β̂_0 + (20β̂_1)(packs) + β̂_2·faminc    (3)

The only effect is that the coefficient on packs is 20 times the coefficient on cigarettes, and so is its standard error.
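The scaling result in equation (3) can be confirmed numerically. The Python sketch below uses simulated data standing in for bwght.dta (the coefficients and noise level are illustrative assumptions, not the real estimates):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
cigs = rng.integers(0, 40, size=n).astype(float)
y = 120 - 0.5 * cigs + rng.normal(0, 5, size=n)   # hypothetical birth-weight model

def ols(x, y):
    X = np.column_stack([np.ones(len(x)), x])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_hat

b_cigs = ols(cigs, y)        # regressor measured in cigarettes
b_packs = ols(cigs / 20, y)  # same regressor measured in packs

print(b_packs[1] / b_cigs[1])  # 20: the packs coefficient is 20x the cigs coefficient
```

The intercept is unchanged, since rescaling a regressor only rescales its own coefficient.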

41 Effects of data scaling on OLS. The figure below shows the three regressions, including the goodness-of-fit measures (table omitted). The R² from the three regressions are the same, as they should be. The SSR and SER differ in the second specification from the other two: remember that SSR is measured in squared units of the dependent variable, while SER is measured in units of the dependent variable.

42 Standardizing variables. Sometimes a key variable is measured on a scale that is difficult to interpret. An example is test scores: tests can be scored arbitrarily. Then it can make sense to ask what happens if the test score is one standard deviation higher. A variable is standardized by subtracting off its mean and dividing by its standard deviation. You can make a regression in which the scale of the regressors is irrelevant by standardizing all the variables in the regression.

43 Standardizing variables. Suppose we start with the model:

Y = β_0 + β_1 X + u    (4)

Take the mean of the entire equation:

μ̂_y = β_0 + β_1 μ̂_x    (5)

Subtract (5) from (4):

(Y − μ̂_y) = β_1 (X − μ̂_x) + u

Divide both sides by σ̂_y, multiply β_1 by σ̂_x/σ̂_x, and rearrange:

(Y − μ̂_y)/σ̂_y = (σ̂_x/σ̂_y) β_1 · (X − μ̂_x)/σ̂_x + u/σ̂_y

y^s = β̃_1 x^s + ũ
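The derivation implies that the standardized slope equals β̂_1 · σ̂_x/σ̂_y. A Python sketch confirming this on simulated data (the population parameters and sample size are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=300)
y = 1 + 2 * x + rng.normal(size=300)   # assumed population: beta_1 = 2

def slope(x, y):
    X = np.column_stack([np.ones(len(x)), x])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta_hat[1]

b1 = slope(x, y)                       # slope on the original scale

# Standardize both variables: subtract the mean, divide by the std. dev.
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()
b_std = slope(zx, zy)

print(np.isclose(b_std, b1 * x.std() / y.std()))  # True
```

b_std is read as "a one-standard-deviation increase in x is associated with b_std standard deviations of y".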

44 Standardizing variables. Note: the coefficients and standard errors change, but the t-statistics, p-values and R² do not. Standardizing variables allows a comparison of the size of coefficients across variables.

45 Dummy variables in the MLRM. The multiple regression model allows several dummy independent variables in the same equation. In the multiple regression model a dummy variable gives an intercept shift between the groups. If the regression model is to have different intercepts for, say, g groups or categories, we need to include g − 1 dummy variables in the model along with an intercept. The intercept of the base group is the overall intercept in the model, and the dummy-variable coefficient for a particular group represents the estimated difference in intercepts between that group and the base group. An alternative is to suppress the intercept, but that makes it more cumbersome to test for differences relative to a base group.
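As a sketch of the g − 1 rule (not from the slides; group means and sample size are illustrative assumptions), here g = 3 groups are encoded with 2 dummies plus an intercept, and the estimates recover the base-group mean and the differences from it:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
groups = rng.integers(0, 3, size=n)      # categories 0 (base group), 1, 2
means = np.array([10.0, 12.0, 9.0])      # assumed group means
y = means[groups] + rng.normal(0, 0.1, size=n)

# g = 3 groups -> g - 1 = 2 dummy variables plus an intercept
d1 = (groups == 1).astype(float)
d2 = (groups == 2).astype(float)
X = np.column_stack([np.ones(n), d1, d2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b)  # intercept near base mean 10; dummies near +2 and -1
```

Including a third dummy for group 0 together with the intercept would make the columns perfectly collinear (the dummy-variable trap).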

46 Dummy variables in the MLRM. price = β_0 + β_1·weight + β_2·Foreign: both groups share the slope β_1; the intercept is β_0 + β_2 for Foreign = 1 and β_0 for Foreign = 0 (figure).

47 Dummy variables in the MLRM. Ordinal variables can either be entered into the equation in their original form, or you can create a dummy variable for each of the values. Creating a dummy variable for each value allows the movement between each level to be different, so it is more flexible than simply putting the variable in the model. For example, with a credit rating ranked between 0 and 4 you can include 4 dummy variables in your regression.

48 Alternative specification of OLS. The OLS minimization problem

S(β_0, β_1) = Σ_{i=1}^n (Y_i − β_0 − β_1 X_i)²

can alternatively be written as

S(α, β_1) = Σ_{i=1}^n (Y_i − α − β_1 (X_i − X̄))²

where the intercept parameter is redefined as α = β_0 + β_1 X̄.

49 Alternative specification of OLS. The first-order conditions are

∂S(α, β_1)/∂α = −2 Σ_{i=1}^n [Y_i − α − β_1 (X_i − X̄)]

∂S(α, β_1)/∂β_1 = −2 Σ_{i=1}^n [Y_i − α − β_1 (X_i − X̄)] (X_i − X̄)

α̂ and β̂_1 are the values of α and β_1 for which the FOCs equal zero. Solution: α̂ = Ȳ = β̂_0 + β̂_1 X̄, and β̂_1 is the same as before.
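The solution α̂ = Ȳ is easy to verify: with the regressor demeaned, the intercept column is orthogonal to it, so the fitted intercept is exactly the sample mean of Y. A Python sketch (the population parameters are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100
x = rng.uniform(0, 5, size=n)
y = 1.5 + 0.7 * x + rng.normal(size=n)   # assumed population function

# Regress y on an intercept and the demeaned regressor (X_i - X_bar)
Xc = np.column_stack([np.ones(n), x - x.mean()])
(a_hat, b1_hat), *_ = np.linalg.lstsq(Xc, y, rcond=None)

print(np.isclose(a_hat, y.mean()))  # True: the intercept equals Y-bar
```

The slope b1_hat is identical to the slope from the original, non-demeaned specification; only the intercept is reparameterized.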


### Econ 510 B. Brown Spring 2014 Final Exam Answers

Econ 510 B. Brown Spring 2014 Final Exam Answers Answer five of the following questions. You must answer question 7. The question are weighted equally. You have 2.5 hours. You may use a calculator. Brevity

### The Multiple Linear Regression Model

Short Guides to Microeconometrics Fall 2016 Kurt Schmidheiny Unversität Basel The Multiple Linear Regression Model 1 Introduction The multiple linear regression model and its estimation using ordinary

### 1 Motivation for Instrumental Variable (IV) Regression

ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

### Chapter 4. Regression Models. Learning Objectives

Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing

### Econ 423 Lecture Notes

Econ 423 Lecture Notes (hese notes are modified versions of lecture notes provided by Stock and Watson, 2007. hey are for instructional purposes only and are not to be distributed outside of the classroom.)

### ECO375 Tutorial 8 Instrumental Variables

ECO375 Tutorial 8 Instrumental Variables Matt Tudball University of Toronto Mississauga November 16, 2017 Matt Tudball (University of Toronto) ECO375H5 November 16, 2017 1 / 22 Review: Endogeneity Instrumental

### Introduction to Econometrics Third Edition James H. Stock Mark W. Watson The statistical analysis of economic (and related) data

Introduction to Econometrics Third Edition James H. Stock Mark W. Watson The statistical analysis of economic (and related) data 1/2/3-1 1/2/3-2 Brief Overview of the Course Economics suggests important

### Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

### Control Function and Related Methods: Nonlinear Models

Control Function and Related Methods: Nonlinear Models Jeff Wooldridge Michigan State University Programme Evaluation for Policy Analysis Institute for Fiscal Studies June 2012 1. General Approach 2. Nonlinear

### We begin by thinking about population relationships.

Conditional Expectation Function (CEF) We begin by thinking about population relationships. CEF Decomposition Theorem: Given some outcome Y i and some covariates X i there is always a decomposition where

### . regress lchnimp lchempi lgas lrtwex befile6 affile6 afdec6 t

BOSTON COLLEGE Department of Economics EC 228 Econometrics, Prof. Baum, Ms. Yu, Fall 2003 Problem Set 7 Solutions Problem sets should be your own work. You may work together with classmates, but if you

### Lecture 3: Multiple Regression. Prof. Sharyn O Halloran Sustainable Development U9611 Econometrics II

Lecture 3: Multiple Regression Prof. Sharyn O Halloran Sustainable Development Econometrics II Outline Basics of Multiple Regression Dummy Variables Interactive terms Curvilinear models Review Strategies

### STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6

STA 8 Applied Linear Models: Regression Analysis Spring 011 Solution for Homework #6 6. a) = 11 1 31 41 51 1 3 4 5 11 1 31 41 51 β = β1 β β 3 b) = 1 1 1 1 1 11 1 31 41 51 1 3 4 5 β = β 0 β1 β 6.15 a) Stem-and-leaf

### Answer Key: Problem Set 6

: Problem Set 6 1. Consider a linear model to explain monthly beer consumption: beer = + inc + price + educ + female + u 0 1 3 4 E ( u inc, price, educ, female ) = 0 ( u inc price educ female) σ inc var,,,

### Chapter 14 Student Lecture Notes 14-1

Chapter 14 Student Lecture Notes 14-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter 14 Multiple Regression Analysis and Model Building Chap 14-1 Chapter Goals After completing this

### Lecture 6 Multiple Linear Regression, cont.

Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression

### The Simple Regression Model. Simple Regression Model 1

The Simple Regression Model Simple Regression Model 1 Simple regression model: Objectives Given the model: - where y is earnings and x years of education - Or y is sales and x is spending in advertising

### Econometrics. 9) Heteroscedasticity and autocorrelation

30C00200 Econometrics 9) Heteroscedasticity and autocorrelation Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Heteroscedasticity Possible causes Testing for

### Chapter 14. Simultaneous Equations Models Introduction

Chapter 14 Simultaneous Equations Models 14.1 Introduction Simultaneous equations models differ from those we have considered in previous chapters because in each model there are two or more dependent

### Fixed and Random Effects Models: Vartanian, SW 683

: Vartanian, SW 683 Fixed and random effects models See: http://teaching.sociology.ul.ie/dcw/confront/node45.html When you have repeated observations per individual this is a problem and an advantage:

### Lecture 4: Heteroskedasticity

Lecture 4: Heteroskedasticity Econometric Methods Warsaw School of Economics (4) Heteroskedasticity 1 / 24 Outline 1 What is heteroskedasticity? 2 Testing for heteroskedasticity White Goldfeld-Quandt Breusch-Pagan

### Simultaneous Equations with Error Components. Mike Bronner Marko Ledic Anja Breitwieser

Simultaneous Equations with Error Components Mike Bronner Marko Ledic Anja Breitwieser PRESENTATION OUTLINE Part I: - Simultaneous equation models: overview - Empirical example Part II: - Hausman and Taylor

### Hypothesis Tests and Confidence Intervals in Multiple Regression

Hypothesis Tests and Confidence Intervals in Multiple Regression (SW Chapter 7) Outline 1. Hypothesis tests and confidence intervals for one coefficient. Joint hypothesis tests on multiple coefficients

### Applied Economics. Panel Data. Department of Economics Universidad Carlos III de Madrid

Applied Economics Panel Data Department of Economics Universidad Carlos III de Madrid See also Wooldridge (chapter 13), and Stock and Watson (chapter 10) 1 / 38 Panel Data vs Repeated Cross-sections In

### PBAF 528 Week 8. B. Regression Residuals These properties have implications for the residuals of the regression.

PBAF 528 Week 8 What are some problems with our model? Regression models are used to represent relationships between a dependent variable and one or more predictors. In order to make inference from the

### Sociology Exam 2 Answer Key March 30, 2012

Sociology 63993 Exam 2 Answer Key March 30, 2012 I. True-False. (20 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. A researcher has constructed scales

### AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation

Scatterplots and Correlation Name Hr A scatterplot shows the relationship between two quantitative variables measured on the same individuals. variable (y) measures an outcome of a study variable (x) may

### Econometrics II Censoring & Truncation. May 5, 2011

Econometrics II Censoring & Truncation Måns Söderbom May 5, 2011 1 Censored and Truncated Models Recall that a corner solution is an actual economic outcome, e.g. zero expenditure on health by a household

### Econometrics. 8) Instrumental variables

30C00200 Econometrics 8) Instrumental variables Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Thery of IV regression Overidentification Two-stage least squates

### ECON 3150/4150 spring term 2014: Exercise set for the first seminar and DIY exercises for the first few weeks of the course

ECON 3150/4150 spring term 2014: Exercise set for the first seminar and DIY exercises for the first few weeks of the course Ragnar Nymoen 13 January 2014. Exercise set to seminar 1 (week 6, 3-7 Feb) This

Chapter 4 4- Basic Business Statistics th Edition Chapter 4 Introduction to Multiple Regression Basic Business Statistics, e 9 Prentice-Hall, Inc. Chap 4- Learning Objectives In this chapter, you learn:

### Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

Econometrics Honor s Exam Review Session Spring 2012 Eunice Han Topics 1. OLS The Assumptions Omitted Variable Bias Conditional Mean Independence Hypothesis Testing and Confidence Intervals Homoskedasticity

### Nonrecursive Models Highlights Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015

Nonrecursive Models Highlights Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015 This lecture borrows heavily from Duncan s Introduction to Structural