

ECON 497: Lecture 6

Metropolitan State University
ECON 497: Research and Forecasting
Lecture Notes 6
Specification: Choosing the Independent Variables
Studenmund Chapter 6

Before we start, let's get one thing straight. You should use theories about the relationship you are estimating to determine which explanatory or independent variables to include. Everything else in this lecture is of less importance than this basic rule.

Specification Error

There are three things to be done in specifying an equation to be estimated:

1. Choosing the correct independent variables (which we'll discuss here)
2. Choosing the correct functional form (which is in chapter 7)
3. Choosing the correct form of the error term (which is discussed in later chapters)

Getting any of these wrong will result in a specification error, which can seriously affect the characteristics of the estimators of the regression coefficients. A particularly nasty aspect of this is that it is often difficult or impossible to tell if a model fails to satisfy these steps.

Now, some particular problems and their implications.

Omitted Variables

If there is a variable that should be included in the equation but is left out, there is an omitted variable problem. More specifically, the estimators of the regression coefficients may be biased, meaning that their expected values may not be equal to the actual values. To put this more clearly, consider the relationship:

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$

If this is the model which is estimated and all the classical assumptions are satisfied, we can say that

$E[\hat{\beta}_0] = \beta_0, \quad E[\hat{\beta}_1] = \beta_1, \quad E[\hat{\beta}_2] = \beta_2$

That is, the expected value of each estimator is equal to the value it is meant to estimate.

If, however, you don't think to include one of these explanatory variables, or can't get data on it, the equation you wind up estimating might be:

$Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$

In this case, you will attempt to estimate the coefficient $\beta_1$, but unfortunately the estimator will be biased. That is:

$E[\hat{\beta}_1] \neq \beta_1$

Specifically, as presented in Studenmund, the expected value of $\hat{\beta}_1$ is equal to $\beta_1$ plus $\beta_2$ times a function of the correlation between $X_1$ and $X_2$ ($\rho_{12}$):

$E[\hat{\beta}_1] = \beta_1 + \beta_2 \cdot f(\rho_{12})$    (6.4)

What does this mean?

1. If there is no correlation between the included and the excluded variable, then the estimator of the coefficient of the included variable will be unbiased. That is, if $\rho_{12} = 0$, $E[\hat{\beta}_1] = \beta_1$.
2. If there is some correlation between $X_1$ and $X_2$, the sign of the bias on $\hat{\beta}_1$ will depend on the sign of the coefficient $\beta_2$ (which you can make a guess about from your theory about the relationship between the dependent and independent variables) and the sign of the correlation coefficient.

If a relevant explanatory variable is excluded, what will be the impact on the estimated coefficients of the included variables?

Effect of Excluded Variables on Estimated Coefficients of Included Variables

                                           Positive correlation between                       Negative correlation between
                                           included and excluded variable                     included and excluded variable
Excluded variable has a positive effect    Estimated coefficient will be positively biased    Estimated coefficient will be negatively biased
Excluded variable has a negative effect    Estimated coefficient will be negatively biased    Estimated coefficient will be positively biased

Some examples provided by students.

A further problem with omitting explanatory variables is that omissions will make the reported standard errors of the estimated coefficients (the standard errors of $\hat{\beta}_1$, $\hat{\beta}_2$) smaller than they should be. Now, we only care about the standard errors of the estimated coefficients because they are used to calculate the t-statistics for the estimated coefficients. To state this clearly:

$t\text{-stat} = \hat{\beta}_1 / se(\hat{\beta}_1)$

If a variable (such as $X_2$) which should be included is omitted, the numerator of the t-stat will be biased and the denominator will be too small. The net effect on the t-statistic is ambiguous.

EX: Consider the example from Studenmund:

Full equation: $\hat{Y}_t$ regressed on $PC_t$, $PB_t$, $LYD_t$, with standard errors (0.13), (0.08), (2.06)
Equation with PB omitted: $\hat{Y}_t$ regressed on $PC_t$, $LYD_t$, with standard errors (0.12), (0.60)
[Coefficient estimates and t-values are not recoverable from the source.]

When the variable PB is improperly excluded from the equation, the standard errors of the estimated coefficients on PC and LYD become smaller. What effect does this have on the t-statistics? The t-stat on the estimated coefficient for PC gets smaller (in absolute terms) because the magnitude of the estimate of the coefficient gets smaller. The t-stat on the estimated coefficient for LYD gets larger because the estimate gets larger and the standard error is smaller.

So, when a variable is omitted, the t-stats become unreliable. However, if you know whether the bias is positive or negative, you can make some predictions about how the correct estimates compare to yours.

Return to student examples.

Correcting for Omitted Variables

The easiest answer is to say that if you know something is omitted, you should simply include it. If that were so easy, though, we wouldn't be discussing this.

If there is an independent variable you know should be included but you can't get data for it, you can at least estimate the sign of the bias on the estimates of the other coefficients. You can then acknowledge that the estimates are biased while still saying whether they are likely to be too large or too small.

Irrelevant Variables

If an independent variable which shouldn't be included in an equation is included, there may be problems with the estimated equation. Fortunately, bias is not among them. Saying that an explanatory variable doesn't belong in an equation is roughly equivalent to saying that the coefficient on that variable is zero. That is, the variable has no effect on the dependent variable.

As Studenmund puts this (p. 180): If the true model is:

$Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$

but instead, you include $X_2$ and estimate:

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i^{**}$

Then the error term from the second equation (the one you incorrectly estimate) will be

$\varepsilon_i^{**} = \varepsilon_i - \beta_2 X_{2i}$

However, if $X_2$ doesn't belong in the equation, then $\beta_2 = 0$, so

$\varepsilon_i^{**} = \varepsilon_i$

The problem is that if the irrelevant variable ($X_2$) is correlated with the relevant variable ($X_1$), then the standard error of the estimated coefficient on $X_1$ will be larger than it should be, leading to a smaller t-statistic (in absolute value). As a result, there may be some estimated coefficients that are significant but, because of the inclusion of the irrelevant variable, will appear insignificant.

Again, including an irrelevant variable can make significant coefficients appear insignificant. This is similar to the problem resulting from multicollinearity.

These problems are nicely summarized by Table 6.2 in Studenmund:

Effect on Coefficient Estimate          Omitted Variable                             Irrelevant Variable
Bias                                    Yes, unless $\rho_{12} = 0$                  No
Variance of Estimated Coefficients      Decreases, unless $\rho_{12} = 0$            Increases, unless $\rho_{12} = 0$
t-stat Effect                           Ambiguous change, unless $\rho_{12} = 0$     Decreases, unless $\rho_{12} = 0$

Four Important Specification Criteria

Studenmund offers four rules for determining if a variable should be added to a regression equation. The first is the most important; the others are supplemental. If you're considering adding a theoretically justifiable variable to your equation, do a regression without it in the equation and then with it in the equation. If these rules are satisfied, the variable should be included.

1. Theory: Is the variable's place in the equation unambiguous and theoretically sound?
2. t-test: Is the variable's estimated coefficient significant in the expected direction?
3. Adjusted $R^2$: Does the overall fit of the equation (adjusted for the degrees of freedom) improve when the variable is added to the equation?
4. Bias: Do other variables' coefficients change significantly when the variable is added to the equation?

Studenmund offers two examples of the application of these rules and you should read these examples.

One Additional Consideration

As you will all discover, data are not perfect. That is, some observations will likely be missing some values. This may be because people taking a survey decline to answer a particular question ("Is that your nose, or are you eating a banana?"), because governments don't collect information on a variable needed for an international data set ("Generalissimo, how many political prisoners are we currently holding?") or because the people answering the question simply don't know the information.

Because it is impossible to do a regression using observations missing even one included variable, you may find yourself making a tradeoff between including all the explanatory variables you want and having very few observations, or sacrificing one or two variables and having plenty of observations. Under these conditions, doing the regression both ways and comparing the results may be the best solution, but missing values for some observations may be one reason to include, or more likely exclude, a variable from a regression.

Some Practices to Avoid and/or Understand

1. Data Mining: Data mining is the estimation of many (or all) possible regression equations with no regard for theoretical justification in a blind attempt to get the desired results. Remember, a level of significance of 5% in a hypothesis test means that even if an explanatory variable has no influence on a dependent variable, 5% of the time it will appear to have influence. If you try twenty different models which have no real explanatory power, the expected number of models which will appear to have significant explanatory power (at a 5% level of significance) is one.

2. Stepwise Regressions: This is the process of allowing a computer package to choose the explanatory variable which has the greatest explanatory power and then, having chosen the first, to choose a second explanatory variable which adds the most explanatory power from those remaining, and so on. There are actually procedures written into some statistical packages which do this automatically if asked. Computers are dumb machines and know not what they do, but the software packages have this feature because there are equally dumb researchers out there who want to use this procedure. If someone seriously presents results from a stepwise regression, you should taunt them about their lack of an underlying theory until they cry.

3. Sequential Specification Searches: This is the process of starting with the variables you know should be included and then trying others about which you are less sure. This isn't necessarily a bad idea, but there is always a concern about what reasons the researchers may have had for reporting the results that they did. If a number of specifications are tried, all of their results should be either presented or, at least, mentioned in the final report. If, for example, a large number of the secondary explanatory variables had little or no effect on the estimated coefficients and explanatory power, this could be discussed in a footnote or an appendix.

4. Relying on t-test Results: Problems with multicollinearity and omitted variable bias can make t-tests unreliable indicators of which variables should be included or excluded.

5. Scanning: As far as I can tell, this refers to data mining one data set to find a good specification and then estimating that model using a different data set. Please note that this requires two distinct data sets.

6. Sensitivity Analysis: This is the practice of estimating your preferred specification, determining which results are important, and then estimating some slight variations of the specification to see if the important results are preserved or if they disappear. If the important results persist across slight changes in the model, these results are said to be robust to slight changes in the model. If these important results disappear when the model is changed slightly, they may be artificial products of a particular specification and may not accurately reflect a relationship in the data. Presentation of the results from several different specifications can clarify the robustness of important regression results. A fun question to ask someone presenting results is, "Are your important results robust to changes in model specification?"

An Example: Automobile Acceleration Times

One of the data sets from Studenmund deals with acceleration times (S) from 0 to 62 mph for various automobiles. Two of the explanatory variables are weight in pounds (P) and engine horsepower (H). As you can see below, there is a positive correlation between these two variables.

Correlations

[Correlation table for T, E, P and H, reporting the Pearson correlation, two-tailed significance and N for each pair; the numeric entries are not recoverable from the source. Entries marked ** are significant at the 0.01 level (2-tailed).]

What would be the impact of removing one of them from the regression? More specifically, what would be the impact on the coefficient of H if P were excluded from the regression?

H should have a negative coefficient while P has a positive coefficient. The two variables are positively correlated. So, if P is excluded, the coefficient on H picks up part of the positive effect of P and is biased upward. The magnitude of the negative coefficient on H should therefore be reduced. That is, the negative estimated coefficient on H should move closer to zero.
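The sign argument above can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: it uses made-up data rather than Studenmund's, it keeps only H and P as regressors (the actual model also includes T and E), and every number in it, along with the function names `ols_slope` and `simulate_omitted_weight`, is hypothetical. It relies on the standard result that, with one included and one omitted variable, the bias term equals the omitted variable's coefficient times the slope from regressing the omitted variable on the included one.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (intercept included)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_omitted_weight(n=5000, beta_h=-0.05, beta_p=0.002, seed=0):
    """Return (slope on H with P omitted, beta_h plus the implied bias).

    All numbers are invented for illustration; they are not Studenmund's data.
    """
    rng = random.Random(seed)
    h, p, s = [], [], []
    for _ in range(n):
        h_i = rng.gauss(150, 40)                    # horsepower
        p_i = 1500 + 10 * h_i + rng.gauss(0, 300)   # weight rises with horsepower
        s_i = 15 + beta_h * h_i + beta_p * p_i + rng.gauss(0, 1)
        h.append(h_i)
        p.append(p_i)
        s.append(s_i)
    short_slope = ols_slope(h, s)   # S regressed on H only (P omitted)
    aux_slope = ols_slope(h, p)     # omitted P regressed on included H
    predicted = beta_h + beta_p * aux_slope
    return short_slope, predicted


short_slope, predicted = simulate_omitted_weight()
print(f"true coefficient on H:       {-0.05:.4f}")
print(f"estimate with P omitted:     {short_slope:.4f}")
print(f"true coefficient plus bias:  {predicted:.4f}")
```

With these assumed values the estimate with P omitted should stay negative but sit closer to zero than the true coefficient, which is the positive bias predicted for a positive-effect omitted variable that is positively correlated with the included one.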

Regression (Model 1, P included)

Model Summary: R = .842. Predictors: (Constant), H, T, E, P. Dependent Variable: S.
[Coefficients table for (Constant), T, E, P and H, reporting unstandardized B and Std. Error, standardized Beta, t and Sig.; the numeric entries are not recoverable from the source.]

Regression (Model 1, P excluded)

Model Summary: R = .839. Predictors: (Constant), H, T, E. Dependent Variable: S.
[Coefficients table for (Constant), T, E and H, reporting unstandardized B and Std. Error, standardized Beta, t and Sig.; the numeric entries are not recoverable from the source.]
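Returning to the data-mining warning earlier in these notes, the claim that twenty models with no real explanatory power yield about one "significant" result at the 5% level can also be illustrated by simulation. This is a rough sketch, not a procedure from Studenmund: each "model" is simplified to a single randomly generated regressor that is unrelated to the dependent variable by construction, and the function names, sample size, number of trials and seed are all assumptions chosen for the example.

```python
import math
import random

# Approximate two-sided 5% critical value of the t distribution with
# 198 degrees of freedom (200 observations minus 2 estimated coefficients).
T_CRITICAL = 1.972


def slope_t_stat(x, y):
    """t-statistic on the slope from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    sse = syy - slope * sxy                      # residual sum of squares
    std_error = math.sqrt(sse / (n - 2) / sxx)   # standard error of the slope
    return slope / std_error


def count_false_positives(rng, n_models=20, n_obs=200):
    """Count how many of n_models unrelated regressors look significant."""
    y = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_models):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]  # unrelated to y by construction
        if abs(slope_t_stat(x, y)) > T_CRITICAL:
            hits += 1
    return hits


def average_false_positives(trials=200, seed=1):
    """Average number of spurious 'significant' models across repeated searches."""
    rng = random.Random(seed)
    return sum(count_false_positives(rng) for _ in range(trials)) / trials


print(f"average spurious 'significant' models out of 20: {average_false_positives():.2f}")
```

Across repeated searches the average should land near one, so a single significant result found after trying twenty unmotivated specifications is roughly what chance alone would produce.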


More information

MORE ON SIMPLE REGRESSION: OVERVIEW

MORE ON SIMPLE REGRESSION: OVERVIEW FI=NOT0106 NOTICE. Unless otherwise indicated, all materials on this page and linked pages at the blue.temple.edu address and at the astro.temple.edu address are the sole property of Ralph B. Taylor and

More information

Lecture 12: Quality Control I: Control of Location

Lecture 12: Quality Control I: Control of Location Lecture 12: Quality Control I: Control of Location 10 October 2005 This lecture and the next will be about quality control methods. There are two reasons for this. First, it s intrinsically important for

More information

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Regression Models. Chapter 4. Introduction. Introduction. Introduction Chapter 4 Regression Models Quantitative Analysis for Management, Tenth Edition, by Render, Stair, and Hanna 008 Prentice-Hall, Inc. Introduction Regression analysis is a very valuable tool for a manager

More information

Chapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania

Chapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania Chapter 10 Regression Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania Scatter Diagrams A graph in which pairs of points, (x, y), are

More information

Multivariate Correlational Analysis: An Introduction

Multivariate Correlational Analysis: An Introduction Assignment. Multivariate Correlational Analysis: An Introduction Mertler & Vanetta, Chapter 7 Kachigan, Chapter 4, pps 180-193 Terms you should know. Multiple Regression Linear Equations Least Squares

More information

FinQuiz Notes

FinQuiz Notes Reading 10 Multiple Regression and Issues in Regression Analysis 2. MULTIPLE LINEAR REGRESSION Multiple linear regression is a method used to model the linear relationship between a dependent variable

More information

MITOCW MIT18_02SCF10Rec_61_300k

MITOCW MIT18_02SCF10Rec_61_300k MITOCW MIT18_02SCF10Rec_61_300k JOEL LEWIS: Hi. Welcome back to recitation. In lecture, you've been learning about the divergence theorem, also known as Gauss's theorem, and flux, and all that good stuff.

More information

- measures the center of our distribution. In the case of a sample, it s given by: y i. y = where n = sample size.

- measures the center of our distribution. In the case of a sample, it s given by: y i. y = where n = sample size. Descriptive Statistics: One of the most important things we can do is to describe our data. Some of this can be done graphically (you should be familiar with histograms, boxplots, scatter plots and so

More information

Math Review -- Conceptual Solutions

Math Review -- Conceptual Solutions Math Review Math Review -- Conceptual Solutions 1.) Is three plus four always equal to seven? Explain. Solution: If the numbers are scalars written in base 10, the answer is yes (if the numbers are in

More information

Proof: If (a, a, b) is a Pythagorean triple, 2a 2 = b 2 b / a = 2, which is impossible.

Proof: If (a, a, b) is a Pythagorean triple, 2a 2 = b 2 b / a = 2, which is impossible. CS103 Handout 07 Fall 2013 October 2, 2013 Guide to Proofs Thanks to Michael Kim for writing some of the proofs used in this handout. What makes a proof a good proof? It's hard to answer this question

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Chapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression

Chapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression Chapter 7 Hypothesis Tests and Confidence Intervals in Multiple Regression Outline 1. Hypothesis tests and confidence intervals for a single coefficie. Joint hypothesis tests on multiple coefficients 3.

More information

Multiple linear regression S6

Multiple linear regression S6 Basic medical statistics for clinical and experimental research Multiple linear regression S6 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/42 Introduction Two main motivations for doing multiple

More information

MITOCW ocw feb k

MITOCW ocw feb k MITOCW ocw-18-086-13feb2006-220k INTRODUCTION: The following content is provided by MIT OpenCourseWare under a Creative Commons license. Additional information about our license and MIT OpenCourseWare

More information

Econometrics Part Three

Econometrics Part Three !1 I. Heteroskedasticity A. Definition 1. The variance of the error term is correlated with one of the explanatory variables 2. Example -- the variance of actual spending around the consumption line increases

More information

download instant at

download instant at Answers to Odd-Numbered Exercises Chapter One: An Overview of Regression Analysis 1-3. (a) Positive, (b) negative, (c) positive, (d) negative, (e) ambiguous, (f) negative. 1-5. (a) The coefficients in

More information

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength

More information

FAQ: Linear and Multiple Regression Analysis: Coefficients

FAQ: Linear and Multiple Regression Analysis: Coefficients Question 1: How do I calculate a least squares regression line? Answer 1: Regression analysis is a statistical tool that utilizes the relation between two or more quantitative variables so that one variable

More information

Hi, I'm Jocelyn, and we're going to go over Fall 2009, Exam 1, problem number 2.

Hi, I'm Jocelyn, and we're going to go over Fall 2009, Exam 1, problem number 2. MIT OpenCourseWare http://ocw.mit.edu 3.091SC Introduction to Solid State Chemistry, Fall 2010 Transcript Exam 1 Problem 2 The following content is provided under a Creative Commons license. Your support

More information

Friday, March 15, 13. Mul$ple Regression

Friday, March 15, 13. Mul$ple Regression Mul$ple Regression Mul$ple Regression I have a hypothesis about the effect of X on Y. Why might we need addi$onal variables? Confounding variables Condi$onal independence Reduce/eliminate bias in es$mates

More information

MITOCW ocw-18_02-f07-lec02_220k

MITOCW ocw-18_02-f07-lec02_220k MITOCW ocw-18_02-f07-lec02_220k The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free.

More information

Topic 4: Model Specifications

Topic 4: Model Specifications Topic 4: Model Specifications Advanced Econometrics (I) Dong Chen School of Economics, Peking University 1 Functional Forms 1.1 Redefining Variables Change the unit of measurement of the variables will

More information

Physics 509: Bootstrap and Robust Parameter Estimation

Physics 509: Bootstrap and Robust Parameter Estimation Physics 509: Bootstrap and Robust Parameter Estimation Scott Oser Lecture #20 Physics 509 1 Nonparametric parameter estimation Question: what error estimate should you assign to the slope and intercept

More information

Sociology 593 Exam 1 Answer Key February 17, 1995

Sociology 593 Exam 1 Answer Key February 17, 1995 Sociology 593 Exam 1 Answer Key February 17, 1995 I. True-False. (5 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. A researcher regressed Y on. When

More information

Solving with Absolute Value

Solving with Absolute Value Solving with Absolute Value Who knew two little lines could cause so much trouble? Ask someone to solve the equation 3x 2 = 7 and they ll say No problem! Add just two little lines, and ask them to solve

More information

Econ 836 Final Exam. 2 w N 2 u N 2. 2 v N

Econ 836 Final Exam. 2 w N 2 u N 2. 2 v N 1) [4 points] Let Econ 836 Final Exam Y Xβ+ ε, X w+ u, w N w~ N(, σi ), u N u~ N(, σi ), ε N ε~ Nu ( γσ, I ), where X is a just one column. Let denote the OLS estimator, and define residuals e as e Y X.

More information

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0 Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct

More information

But, there is always a certain amount of mystery that hangs around it. People scratch their heads and can't figure

But, there is always a certain amount of mystery that hangs around it. People scratch their heads and can't figure MITOCW 18-03_L19 Today, and for the next two weeks, we are going to be studying what, for many engineers and a few scientists is the most popular method of solving any differential equation of the kind

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define

More information

Univariate analysis. Simple and Multiple Regression. Univariate analysis. Simple Regression How best to summarise the data?

Univariate analysis. Simple and Multiple Regression. Univariate analysis. Simple Regression How best to summarise the data? Univariate analysis Example - linear regression equation: y = ax + c Least squares criteria ( yobs ycalc ) = yobs ( ax + c) = minimum Simple and + = xa xc xy xa + nc = y Solve for a and c Univariate analysis

More information

Handout 12. Endogeneity & Simultaneous Equation Models

Handout 12. Endogeneity & Simultaneous Equation Models Handout 12. Endogeneity & Simultaneous Equation Models In which you learn about another potential source of endogeneity caused by the simultaneous determination of economic variables, and learn how to

More information

CS 147: Computer Systems Performance Analysis

CS 147: Computer Systems Performance Analysis CS 147: Computer Systems Performance Analysis Advanced Regression Techniques CS 147: Computer Systems Performance Analysis Advanced Regression Techniques 1 / 31 Overview Overview Overview Common Transformations

More information

12.12 MODEL BUILDING, AND THE EFFECTS OF MULTICOLLINEARITY (OPTIONAL)

12.12 MODEL BUILDING, AND THE EFFECTS OF MULTICOLLINEARITY (OPTIONAL) 12.12 Model Building, and the Effects of Multicollinearity (Optional) 1 Although Excel and MegaStat are emphasized in Business Statistics in Practice, Second Canadian Edition, some examples in the additional

More information

Unit 6 - Introduction to linear regression

Unit 6 - Introduction to linear regression Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,

More information

Explanation of R 2, and Other Stories

Explanation of R 2, and Other Stories Explanation of R 2, and Other Stories I accidentally gave an incorrect formula for R 2 in class. This summary was initially just going to be me correcting my error, but I've taken the opportunity to clarify

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

An overview of applied econometrics

An overview of applied econometrics An overview of applied econometrics Jo Thori Lind September 4, 2011 1 Introduction This note is intended as a brief overview of what is necessary to read and understand journal articles with empirical

More information

Multiple Regression: Inference

Multiple Regression: Inference Multiple Regression: Inference The t-test: is ˆ j big and precise enough? We test the null hypothesis: H 0 : β j =0; i.e. test that x j has no effect on y once the other explanatory variables are controlled

More information

, (1) e i = ˆσ 1 h ii. c 2016, Jeffrey S. Simonoff 1

, (1) e i = ˆσ 1 h ii. c 2016, Jeffrey S. Simonoff 1 Regression diagnostics As is true of all statistical methodologies, linear regression analysis can be a very effective way to model data, as along as the assumptions being made are true. For the regression

More information

Ordinary Least Squares Regression Explained: Vartanian

Ordinary Least Squares Regression Explained: Vartanian Ordinary Least Squares Regression Eplained: Vartanian When to Use Ordinary Least Squares Regression Analysis A. Variable types. When you have an interval/ratio scale dependent variable.. When your independent

More information