Multivariate Regression: Part I


1 Topic 1 Multivariate Regression: Part I ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà

2 Outline of this topic
Statement of the objective: we want to explain the behavior of one variable as a function of other variables.
Typical assumptions and why they are needed.
Three approaches to pursue the objective given the assumptions:
  Method of Moments
  Ordinary Least Squares
  Maximum Likelihood Estimation

3 Objective
We continue our evaluation of how to improve schools using the California data. We began by asking whether class size affects test scores, and we have data for both. However, there may be other explanatory factors and/or policy variables (e.g., an increase in overall expenditures per student). Here is a summary of the data.

4 Policy Evaluation: Test Scores and Class Size
The California Test Score Data Set (caschool.dta, a STATA data file)
All K-6 and K-8 California school districts (n = 420)
Variables:
  5th grade test scores (Stanford-9 achievement test, combined math and reading), district average (testscr)
  Student-teacher ratio = number of students in the district divided by the number of full-time equivalent teachers (str)
  District average parental income in thousands of dollars (avginc)
  Average expenditures per student in dollars (expn_stu)

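5 A look at the data
Basic statistics:
    . sum testscr str expn_stu avginc
The output table has columns Variable, Obs, Mean, Std. Dev., Min and Max, with rows testscr, str, expn_stu and avginc. The numeric values are not preserved in this transcription.
Correlation matrix:
    . correlate testscr str expn_stu avginc
    (obs=420)
The output is a 4 x 4 correlation matrix for testscr, str, expn_stu and avginc. Its entries are not preserved in this transcription.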

6 Statement of the Population Problem
It is natural to postulate that:
$$\text{testscr}_i = \text{constant} + f(\text{str}_i, \text{expn\_stu}_i, \text{avginc}_i) + \text{error}_i \quad \text{for } i = 1, \ldots, 420.$$
A good place to begin is to assume this relation is linear. Using the more general notation we will use in the course:
$$y_i = x_{i1}\beta_1 + \cdots + x_{iK}\beta_K + \varepsilon_i \quad \text{for } i = 1, \ldots, n.$$
The left-hand side is the endogenous or dependent variable, the $x$'s are the regressors or explanatory variables, and the $\varepsilon$'s are the residuals or error terms.

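7 Some Features of the Linear Regression Model
To investigate the problem we collect a random sample of data $\{y_i, x_{i1}, \ldots, x_{iK}\}_{i=1}^{n}$.
Some vector and matrix notation:
$$\underset{n \times 1}{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}, \qquad \underset{n \times 1}{x_j} = \begin{bmatrix} x_{1j} \\ \vdots \\ x_{nj} \end{bmatrix} \text{ for } j = 1, \ldots, K,$$
$$\underset{1 \times K}{X_i} = \begin{bmatrix} x_{i1} & \cdots & x_{iK} \end{bmatrix}, \qquad \underset{n \times K}{X} = \begin{bmatrix} x_1 & \cdots & x_K \end{bmatrix}.$$
We use the convention that $x_1$ contains the constant term.
So, in matrix notation, the linear regression model is $y = X\beta + \varepsilon$.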

8 What we want using statistical concepts
Assuming $y$ and $X$ have a joint distribution, we want to make statements about the conditional mean of $y$ given $X$. Notice that we can compare the mean of testscr for low-income and high-income districts:
    . sum testscr if avginc < 15
    . sum testscr if avginc > 40
Each command reports Obs, Mean, Std. Dev., Min and Max for testscr. The numeric output is not preserved in this transcription.
Mathematically:
$$m(X) = E(y \mid X) = \int_{-\infty}^{\infty} y\, f(y \mid X)\, dy$$

9 The Regression Error
Given the previous definition: $\varepsilon = y - m(X)$.
This implies the following properties for the regression error:
1. $E(\varepsilon \mid X) = 0$
2. $E(\varepsilon) = 0$
3. $E(h(X)'\varepsilon) = 0$ for any function $h(\cdot)$
4. $E(X'\varepsilon) = 0$
For example, to prove the first property:
$$E(\varepsilon \mid X) = E((y - m(X)) \mid X) = E(y \mid X) - E(m(X) \mid X) = m(X) - m(X) = 0$$

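10 Prediction: min MSE
The conditional mean has the property that it minimizes the mean squared error (MSE) among all functions $g(\cdot)$:
$$E(y - g(X))^2 = E(\varepsilon + m(X) - g(X))^2$$
$$= E(\varepsilon^2) + 2E(\varepsilon(m(X) - g(X))) + E(m(X) - g(X))^2$$
$$= E(\varepsilon^2) + E(m(X) - g(X))^2 > E(\varepsilon^2) \quad \text{if } m(X) \neq g(X)$$
The cross term drops out because $m(X) - g(X)$ is a function of $X$ and $E(h(X)'\varepsilon) = 0$ for any function $h(\cdot)$.
Here I abuse notation to indicate that, e.g., $E(\varepsilon^2) = E(\varepsilon\varepsilon')$.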
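The minimum-MSE property can be checked numerically. The sketch below is only an illustration on simulated data: the population $m(x) = x^2$, the noise level, and the two competing predictors are all assumptions made for the example, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the illustration is reproducible

# Hypothetical population: y = m(x) + eps with m(x) = x**2 and E(eps | x) = 0.
n = 100_000
x = rng.uniform(-1.0, 1.0, size=n)
eps = rng.normal(0.0, 0.5, size=n)
y = x**2 + eps


def mse(prediction):
    """Sample analogue of E(y - g(x))^2 for a vector of predictions g(x)."""
    return float(np.mean((y - prediction) ** 2))


mse_cond_mean = mse(x**2)                  # g(x) = m(x), the conditional mean
mse_linear = mse(0.5 * x + 0.3)            # an arbitrary competing g(x)
mse_constant = mse(np.full(n, y.mean()))   # the best constant predictor

print(mse_cond_mean, mse_linear, mse_constant)
```

In this setup the first number should sit close to the error variance (0.25), and both competitors should be visibly larger, in line with the decomposition $E(\varepsilon^2) + E(m(X) - g(X))^2$.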

11 Conditional Variance
Just as we consider the conditional mean, we may explore how the variance of $y$ varies with $X$:
$$\Omega(X) = V(y \mid X) = E(\varepsilon\varepsilon' \mid X)$$
When the variance is constant, so that
$$\Omega(X) = E(\varepsilon\varepsilon' \mid X) = \sigma^2 I_n,$$
we say the error term is homoscedastic; otherwise we say it is heteroscedastic.

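12 Normality
If we assume $y$ and $X$ are jointly normally distributed, life gets easy (clearly a strong assumption).
That is because we can use the projection formula for the joint normal to obtain the conditional mean of $y_i$ given $X_i$. Here is how: if
$$\begin{bmatrix} y_i \\ X_i' \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_y \\ \mu_X \end{bmatrix}, \begin{bmatrix} \Sigma_{yy} & \Sigma_{yX} \\ \Sigma_{Xy} & \Sigma_{XX} \end{bmatrix} \right)$$
then
$$E(y_i \mid X_i) = m(X_i') = \mu_y + \Sigma_{yX}\Sigma_{XX}^{-1}(X_i' - \mu_X)$$
$$V(y_i \mid X_i) = \Sigma_{yy} - \Sigma_{yX}\Sigma_{XX}^{-1}\Sigma_{Xy}$$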
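As a hedged illustration of the projection formula, consider the special case of a single scalar regressor $x_i$. This restriction is an assumption made here only to keep the algebra short, and $\sigma_y^2$, $\sigma_x^2$, $\sigma_{xy}$ and $\rho$ are notation introduced for this sketch.

```latex
\begin{aligned}
E(y_i \mid x_i) &= \mu_y + \frac{\sigma_{xy}}{\sigma_x^2}\,(x_i - \mu_x), \\
V(y_i \mid x_i) &= \sigma_y^2 - \frac{\sigma_{xy}^2}{\sigma_x^2}
                 = \sigma_y^2\,(1 - \rho^2),
\qquad \rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}.
\end{aligned}
```

The conditional mean is linear in $x_i$ with slope $\sigma_{xy}/\sigma_x^2$, the population counterpart of a regression slope, and the conditional variance does not depend on $x_i$. Joint normality therefore delivers linearity and homoscedasticity together.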

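13 Let's make some assumptions
1. Linearity: $y = X\beta + \varepsilon$
2. Full rank: $X$ is an $n \times K$ matrix with rank $K$
3. Conditional mean-zero errors: $E(\varepsilon \mid X) = \begin{bmatrix} E[\varepsilon_1 \mid X] \\ \vdots \\ E[\varepsilon_n \mid X] \end{bmatrix} = 0$; hence $E[\varepsilon] = 0$ and $E[y \mid X] = X\beta$
4. Homoscedasticity: $V(\varepsilon \mid X) = \sigma^2 I_n$; hence $V(\varepsilon_i \mid X) = \sigma^2$ and $Cov(\varepsilon_i, \varepsilon_j \mid X) = 0$ for all $i \neq j$
5. Normality: $\varepsilon \mid X \sim N(0, \sigma^2 I_n)$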

14 Checking the assumptions in the data
1. Linearity: could be problematic theoretically.
    . sum testscr if str <= 17
    . sum testscr if str > 17 & str <= 20
    . sum testscr if str >= 22.8
    . sum testscr if str < 22.8 & str >= 19.8
Each command reports Obs, Mean, Std. Dev., Min and Max for testscr. The numeric output is not preserved in this transcription.
but testscr(14-17) - testscr(17-20) = 3.8 and testscr(20-23) - testscr(23-26) = (value missing in the source)

15 More Checking
2. X is full rank: this means that no regressor can be an exact linear combination of the others. The easiest check is to look at the correlation matrix of X:
    . correlate str expn_stu avginc
    (obs=420)
The output is a 3 x 3 correlation matrix for str, expn_stu and avginc. Its entries are not preserved in this transcription.
Later we will discuss slightly more sophisticated ways of checking this.

16 Final Checks
Assumptions 3 (errors have zero conditional mean) and 4 (homoscedasticity) we cannot check just yet.
Assumption 5 is normality. This we can check with the Jarque-Bera statistic and also by looking at some histogram/density plots.
[Figure: histogram/density plot. Vertical axis: Density. Horizontal axis: Average Test Score (= (read_scr+math_scr)/2).]
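For readers who want to see what the Jarque-Bera check computes, here is a minimal sketch. It uses the textbook formula $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$ with sample skewness $S$ and kurtosis $K$, and it is applied to simulated samples rather than to the California data, which is not reproduced here. The function name `jarque_bera` is chosen for this illustration.

```python
import numpy as np


def jarque_bera(sample):
    """Return (JB, skewness, kurtosis) using moment estimators with divisor n."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    d = x - x.mean()
    s2 = np.mean(d**2)
    skew = np.mean(d**3) / s2**1.5
    kurt = np.mean(d**4) / s2**2
    jb = n / 6.0 * (skew**2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, skew, kurt


rng = np.random.default_rng(1)
jb_normal, _, _ = jarque_bera(rng.normal(size=5000))       # Gaussian draws
jb_skewed, _, _ = jarque_bera(rng.exponential(size=5000))  # clearly non-Gaussian draws
print(jb_normal, jb_skewed)
```

Under normality the statistic is asymptotically chi-squared with 2 degrees of freedom, so large values (the 5% critical value is about 5.99) point away from normality. The exponential sample should produce a far larger statistic than the Gaussian one.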

17 Why do we make these assumptions?
Linearity: not as strict as it sounds. The usual example is a Cobb-Douglas production function:
$$Y = A L^{\beta_l} K^{\beta_k} \;\rightarrow\; \log(Y) = \log(A) + \beta_l \log(L) + \beta_k \log(K)$$
$$y_i = \beta_1 + x_{i2}\beta_2 + x_{i3}\beta_3 + \varepsilon_i$$
Beyond that, we will discuss what to do with truly nonlinear specifications later.
For now, linearity makes derivations very convenient by using projection arguments.
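The log-linearization can be tried out on simulated data. In this sketch the "true" values of $A$ and the two exponents, the input ranges, and the multiplicative error are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
A, beta_l, beta_k = 2.0, 0.6, 0.3            # hypothetical "true" parameters
L = rng.uniform(1.0, 10.0, size=n)           # labor input
K = rng.uniform(1.0, 10.0, size=n)           # capital input
eps = rng.normal(0.0, 0.1, size=n)
Y = A * L**beta_l * K**beta_k * np.exp(eps)  # nonlinear in levels

# Taking logs gives a model that is linear in the parameters.
X = np.column_stack([np.ones(n), np.log(L), np.log(K)])
y = np.log(Y)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # roughly [log(A), beta_l, beta_k]
```

The point of the design is that the regression is run on `log(Y)`, `log(L)` and `log(K)`, so "linearity" refers to linearity in the parameters, not in the original variables.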

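18 Multicollinearity
X is a full rank matrix: easy, we cannot really identify parameters otherwise. As an example, suppose there are 3 regressors such that $x_1 = x_2 + x_3$:
$$y = x_1\beta_1 + x_2\beta_2 + x_3\beta_3 + \varepsilon$$
$$y = (x_2 + x_3)\beta_1 + x_2\beta_2 + x_3\beta_3 + \varepsilon$$
$$y = x_2(\beta_1 + \beta_2) + x_3(\beta_1 + \beta_3) + \varepsilon$$
$$y = x_2\gamma_2 + x_3\gamma_3 + \varepsilon$$
with $\gamma_2 = \beta_1 + \beta_2$ and $\gamma_3 = \beta_1 + \beta_3$, which means that $\beta_1$, $\beta_2$ and $\beta_3$ cannot be separately identified. Mechanically, we run into numerical problems.
Exact collinearity is easy to detect, but approximate collinearity can affect regression results as well.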
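The identification failure in this example can be seen numerically. The sketch below builds simulated regressors with $x_1 = x_2 + x_3$ (the data and the two coefficient vectors are made up for illustration) and shows that the design matrix loses rank and that different coefficient vectors give identical fitted values.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x1 = x2 + x3                        # exact linear combination of x2 and x3
X = np.column_stack([x1, x2, x3])

rank = np.linalg.matrix_rank(X)     # 2, not 3
cond = np.linalg.cond(X.T @ X)      # enormous: X'X is numerically singular

# Two different coefficient vectors with the same beta1+beta2 and beta1+beta3.
b_a = np.array([1.0, 2.0, 3.0])
b_b = np.array([0.0, 3.0, 4.0])
same_fit = np.allclose(X @ b_a, X @ b_b)
print(rank, cond, same_fit)
```

Because both vectors imply the same sums $\beta_1 + \beta_2 = 3$ and $\beta_1 + \beta_3 = 4$, the data cannot tell them apart, which is exactly the sense in which the individual coefficients are not identified.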

19 Conditional mean-zero errors
This is a critical assumption. As we will see, it ensures that the model is properly specified and that the parameter estimates tend to their true values.
Reasons why this assumption may not hold in practice have to do with misspecification problems: e.g., omitted variable bias, errors-in-variables and endogeneity (which only really applies when we want to emphasize analysis of causal relations as opposed to simple correlations).

20 Homoscedasticity
This assumption is often violated. However, it is easy to relax. It will not affect parameter estimates but it will affect how their standard errors are calculated (i.e., the efficiency of the estimator).
    . sum testscr if str <= 17
    . sum testscr if str > 17 & str <= 20
    . sum testscr if str >= 22.8
    . sum testscr if str < 22.8 & str >= 19.8
Each command reports Obs, Mean, Std. Dev., Min and Max for testscr. The numeric output is not preserved in this transcription.

21 Normality/Gaussianity
Assuming the data are Gaussian allows us to use well-known projection formulas and to derive finite-sample statistics.
However, the data are often not Gaussian. It turns out that the thought experiment of increasing the sample size to infinity will allow us to use some probability limit theory under which the estimators will have a Normal distribution.
Hence the importance of having a random sample.

22 Random Sample
Let $\{w_i\}_{i=1}^{n} = \{y_i, X_i\}_{i=1}^{n}$ be i.i.d. Then
$$f(w_1, \ldots, w_n) = f_1(w_1; \theta_1) \cdots f_i(w_i \mid w_{i-1}, \ldots, w_1; \theta_i) \cdots f_n(w_n \mid w_{n-1}, \ldots, w_1; \theta_n)$$
$$= f(w_1; \theta) \cdots f(w_n; \theta)$$
i.e., independence lets us drop the conditioning on earlier observations in the first line, and identical distribution lets us drop the subscripts on $f$ and $\theta$ in the second.
In time series, as long as the amount of dependence is limited, one can relax the independence assumption.

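23 Where are we so far?
We have postulated a population model of how $y$ relates to $X$:
$$y_i = \beta_1 + x_{i2}\beta_2 + \cdots + x_{iK}\beta_K + \varepsilon_i, \quad i = 1, \ldots, n$$
We have a random sample: $\{y_i, X_i\}_{i=1}^{n}$.
Now we want to obtain the distribution of the parameter estimates. The mean of the distribution is the parameter estimate, and knowing the distribution is vital for inference:
$$\hat{\beta} \sim D(\beta, \Sigma_{\hat{\beta}})$$
where $D$ is a distribution centered at $\beta$ with covariance matrix $\Sigma_{\hat{\beta}}$.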

24 Method of Moments
Let's try to figure out how to estimate $\beta$.
We will use the method of moments approach first. It relies on the analogy principle: translate a population moment condition into its equivalent sample moment condition (think LLN). For example:
$$\frac{1}{n}\sum_{i=1}^{n} y_i \to \mu_y \quad \text{as } n \to \infty$$
$$E(y - \mu_y) = 0 \;\rightarrow\; \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{\mu}_y) = 0 \;\Rightarrow\; \hat{\mu}_y = \frac{1}{n}\sum_{i=1}^{n} y_i = \bar{y}$$

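25 Deriving the MM estimator for linear regression
Recall, one of the key assumptions in the linear regression model is:
$$E(\varepsilon \mid X) = 0 \;\rightarrow\; E(X'\varepsilon) = 0 \;\rightarrow\; \frac{\sum_{i=1}^{n} X_i'\varepsilon_i}{n} = \frac{X'\varepsilon}{n} = 0$$
with $y = X\beta + \varepsilon$. Hence:
$$E(X'\varepsilon) = E(X'(y - X\beta)) = 0 \;\rightarrow\; \frac{X'y}{n} - \frac{X'X}{n}\hat{\beta} = 0$$
$$\hat{\beta} = (X'X)^{-1}X'y$$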
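The sample moment condition translates almost line by line into code. This is a minimal sketch on simulated data: the sample size, the "true" coefficients and the regressors are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
beta_true = np.array([1.0, -2.0, 0.5])                       # hypothetical coefficients
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # first column is the constant
y = X @ beta_true + rng.normal(size=n)

# Sample moment condition: X'(y - X b)/n = 0, i.e. (X'X) b = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# At the solution the sample moments are zero up to rounding error.
moments = X.T @ (y - X @ beta_hat) / n
print(beta_hat, moments)
```

Solving the linear system with `np.linalg.solve` is numerically preferable to forming $(X'X)^{-1}$ explicitly, although both express the same estimator.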

26 Least Squares
[Figure: scatter plot titled "Linear Regression: Test Scores and Student-to-Teacher Ratio". Horizontal axis: Student to Teacher Ratio. Vertical axis: Test Score. Series: Average Test Score (= (read_scr+math_scr)/2) and fitted values.]

27 Deriving the OLS estimator
Consider the problem of minimizing the distance of the observations from the regression line.
Since we care about the distance but not the sign of the error, we could use absolute values. This gives rise to the LAD estimator, but it is not convenient because its objective function is not differentiable everywhere.
Instead, by squaring the distance, the objective function can be optimized using derivative methods.

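28 Derivation of OLS
Objective:
$$\min_{\beta} S(\beta) = E(\varepsilon_i^2) \;\rightarrow\; \min_{\beta} \frac{1}{n}\sum_{i=1}^{n}\varepsilon_i^2 = \frac{1}{n}\sum_{i=1}^{n}(y_i - X_i\beta)^2$$
In matrix algebra:
$$\min_{\beta} S(\beta) = \frac{\varepsilon'\varepsilon}{n} = \frac{(y - X\beta)'(y - X\beta)}{n}$$
General result: suppose $f(\beta)$ is a real-valued scalar function of $\beta$. A necessary condition for a local optimum at $\beta = \hat{\beta}$ is
$$\left.\frac{\partial f(\beta)}{\partial\beta}\right|_{\beta = \hat{\beta}} = 0.$$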

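29 Derivation of OLS (cont.)
If the Hessian is positive definite at $\hat{\beta}$, then $\hat{\beta}$ is a local minimum.
Rules of matrix differentiation, with the gradient and Hessian defined as
$$\frac{\partial f(\beta)}{\partial\beta} = \begin{bmatrix} \dfrac{\partial f}{\partial\beta_1} \\ \vdots \\ \dfrac{\partial f}{\partial\beta_K} \end{bmatrix}, \qquad \frac{\partial^2 f(\beta)}{\partial\beta\,\partial\beta'} = \begin{bmatrix} \dfrac{\partial^2 f}{\partial\beta_1\partial\beta_1} & \cdots & \dfrac{\partial^2 f}{\partial\beta_1\partial\beta_K} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial\beta_K\partial\beta_1} & \cdots & \dfrac{\partial^2 f}{\partial\beta_K\partial\beta_K} \end{bmatrix}$$
$$\frac{\partial(\beta'A)}{\partial\beta} = A, \qquad \frac{\partial(A\beta)}{\partial\beta} = A', \qquad \frac{\partial(\beta'A\beta)}{\partial\beta} = (A + A')\beta, \qquad \frac{\partial^2(\beta'A\beta)}{\partial\beta\,\partial\beta'} = A + A'$$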

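30 Derivation of OLS (cont.)
Recall:
$$\min_{\beta} S(\beta) = \frac{\varepsilon'\varepsilon}{n} = \frac{(y - X\beta)'(y - X\beta)}{n} = \frac{y'y}{n} - \frac{\beta'X'y}{n} - \frac{y'X\beta}{n} + \frac{\beta'X'X\beta}{n}$$
Applying the rules of matrix differentiation:
$$\frac{\partial S(\beta)}{\partial\beta} = 0 - \frac{X'y}{n} - \left(\frac{y'X}{n}\right)' + \left(\frac{X'X}{n} + \left(\frac{X'X}{n}\right)'\right)\beta = -2\frac{X'y}{n} + 2\frac{X'X}{n}\beta = 0$$
$$\hat{\beta} = (X'X)^{-1}X'y$$
$$\frac{\partial^2 S(\beta)}{\partial\beta\,\partial\beta'} = 2\frac{X'X}{n}, \text{ which is positive definite when } X \text{ has full rank.}$$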
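To see the first- and second-order conditions without matrix calculus, here is the same derivation for the special case of a single regressor and no constant ($K = 1$). This scalar case is an illustration added here, with $x_i$ denoting the lone regressor.

```latex
\begin{aligned}
S(\beta) &= \frac{1}{n}\sum_{i=1}^{n}(y_i - x_i\beta)^2, \\
\frac{dS(\beta)}{d\beta} &= -\frac{2}{n}\sum_{i=1}^{n}x_i\,(y_i - x_i\beta) = 0
\;\Rightarrow\; \hat{\beta} = \frac{\sum_{i=1}^{n}x_i y_i}{\sum_{i=1}^{n}x_i^2}, \\
\frac{d^2S(\beta)}{d\beta^2} &= \frac{2}{n}\sum_{i=1}^{n}x_i^2 > 0
\quad \text{if some } x_i \neq 0.
\end{aligned}
```

Here $\sum x_i y_i$ plays the role of $X'y$ and $\sum x_i^2$ the role of $X'X$. The requirement that some $x_i \neq 0$ is the scalar version of the full rank assumption.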

31 Remarks
The no-multicollinearity assumption ensures $X'X$ is invertible.
$$\hat{\varepsilon} = y - X\hat{\beta} = y - X(X'X)^{-1}X'y = My$$
$M$ is both symmetric and idempotent ($M = M'$ and $M = M^2$), and $MX = 0$.
$$\hat{y} = y - \hat{\varepsilon} = (I - M)y = X(X'X)^{-1}X'y = Py$$
where $P$ is called the projection matrix.
$X'\hat{\varepsilon} = X'My = 0$: by construction, the residuals are uncorrelated with the regressors.
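These algebraic properties are easy to confirm numerically. The sketch below uses a small simulated design matrix (dimensions and data are arbitrary choices for illustration) and checks each property up to floating-point tolerance.

```python
import numpy as np

rng = np.random.default_rng(5)
n, K = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = rng.normal(size=n)

P = X @ np.linalg.solve(X.T @ X, X.T)   # P = X (X'X)^{-1} X'
M = np.eye(n) - P                       # residual-maker matrix

checks = {
    "M symmetric": np.allclose(M, M.T),
    "M idempotent": np.allclose(M @ M, M),
    "MX = 0": np.allclose(M @ X, 0.0),
    "y = Py + My": np.allclose(P @ y + M @ y, y),
    "X'e = 0": np.allclose(X.T @ (M @ y), 0.0),
}
print(checks)
```

Forming the $n \times n$ matrices $P$ and $M$ is fine for a demonstration but wasteful for large $n$. In practice one computes $\hat{y} = X\hat{\beta}$ and $\hat{\varepsilon} = y - \hat{y}$ directly.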

32 Maximum Likelihood Estimator
Assuming the random sample $\{y_i, X_i\}_{i=1}^{n}$ is normally distributed, and since the $\hat{\beta}$ are a linear combination of these, they will have a multivariate Gaussian distribution.
Further, we know that the residuals are mean zero, and under the assumption of homoscedasticity their covariance matrix is $\Omega = \sigma^2 I$.
The multivariate normal density is
$$f(\varepsilon; \Omega) = (2\pi)^{-n/2}|\Omega|^{-1/2}\exp\left\{-\tfrac{1}{2}(\varepsilon - \mu)'\Omega^{-1}(\varepsilon - \mu)\right\}$$

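33 MLE
Taking the log (to construct the log-likelihood function) and using the assumptions of the linear regression model:
$$L(\varepsilon; \Omega) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\varepsilon'\varepsilon$$
$$= -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma^2 - \frac{1}{2\sigma^2}(y - X\beta)'(y - X\beta)$$
Take derivatives with respect to $\beta$ and $\sigma^2$:
$$\frac{\partial L}{\partial\beta} = -\frac{1}{2\sigma^2}\left(-2X'y + 2X'X\beta\right) = 0 \;\rightarrow\; \hat{\beta} = (X'X)^{-1}X'y$$
$$\frac{\partial L}{\partial\sigma^2} = -\frac{n}{2\sigma^2} + \frac{\hat{\varepsilon}'\hat{\varepsilon}}{2\sigma^4} = 0 \;\rightarrow\; \hat{\sigma}^2 = \frac{\hat{\varepsilon}'\hat{\varepsilon}}{n}$$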
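As a rough numerical check of these first-order conditions, the sketch below evaluates the Gaussian log-likelihood on simulated data (the design, coefficients and error scale are assumptions for the example) and compares its value at the closed-form solution with its value at nearby parameter choices. The function name `log_likelihood` is introduced for this illustration.

```python
import numpy as np


def log_likelihood(beta, sigma2, y, X):
    """Gaussian log-likelihood of the linear model with variance sigma2 * I."""
    resid = y - X @ beta
    n = y.size
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * n * np.log(sigma2)
            - resid @ resid / (2 * sigma2))


rng = np.random.default_rng(6)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=1.5, size=n)

beta_ml = np.linalg.solve(X.T @ X, X.T @ y)          # same formula as OLS
resid = y - X @ beta_ml
sigma2_ml = resid @ resid / n                        # ML estimator divides by n
sigma2_unbiased = resid @ resid / (n - X.shape[1])   # degrees-of-freedom version

ll_max = log_likelihood(beta_ml, sigma2_ml, y, X)
ll_shifted = log_likelihood(beta_ml + 0.1, sigma2_ml * 1.2, y, X)
ll_unbiased = log_likelihood(beta_ml, sigma2_unbiased, y, X)
print(ll_max, ll_shifted, ll_unbiased)
```

The ML variance estimator divides by $n$ rather than $n - K$, so it is slightly smaller than the degrees-of-freedom-corrected version, and the log-likelihood is highest at $(\hat{\beta}, \hat{\sigma}^2)$.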

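34 Let's revisit Joint Normality and Linear Regression
Recall: if $y$ and $X$ are jointly normal,
$$\begin{bmatrix} y_i \\ X_i' \end{bmatrix} \sim N\left( \begin{bmatrix} \mu_y \\ \mu_X \end{bmatrix}, \begin{bmatrix} \Sigma_{yy} & \Sigma_{yX} \\ \Sigma_{Xy} & \Sigma_{XX} \end{bmatrix} \right)$$
then
$$E(y_i \mid X_i) = m(X_i') = \mu_y + \Sigma_{yX}\Sigma_{XX}^{-1}(X_i' - \mu_X)$$
$$V(y_i \mid X_i) = \Sigma_{yy} - \Sigma_{yX}\Sigma_{XX}^{-1}\Sigma_{Xy}$$
Compare to OLS:
$$E(y \mid X) \;\rightarrow\; \hat{y} = X\hat{\beta} = X(X'X)^{-1}X'y$$
$$\hat{y}_i = \left(\frac{1}{n}\sum_{j=1}^{n} y_j X_j\right)\left(\frac{1}{n}\sum_{j=1}^{n} X_j'X_j\right)^{-1} X_i'$$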

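35 An example of GAUSS code for OLS
Here is the basic code (a more complete file labeled topic1.prg does more things):
    load z[] = topic1.csv;                 /* read the data as one long vector */
    vars = 4;                              /* number of variables (columns) */
    z = reshape(z, rows(z)/vars, vars);    /* n x vars data matrix */
    n = rows(z);
    y = z[.,1];                            /* dependent variable: first column */
    x = ones(rows(z),1)~z[.,2:cols(z)];    /* constant plus the regressors */
    beta = inv(x'x)*x'y;                   /* OLS estimator (X'X)^{-1} X'y */
    beta;                                  /* print the estimates */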
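For readers without GAUSS, the same steps can be sketched in Python with NumPy. This is a hedged translation, not the course's code: the function name `ols_from_flat` is invented here, and because topic1.csv is not available the example runs on a tiny made-up data set instead.

```python
import numpy as np


def ols_from_flat(z_flat, n_vars):
    """Mirror the GAUSS steps: reshape, split y / x, add a constant, solve."""
    z = np.asarray(z_flat, dtype=float).reshape(-1, n_vars)  # row-major, like GAUSS reshape
    y = z[:, 0]                                              # first column is y
    x = np.column_stack([np.ones(z.shape[0]), z[:, 1:]])     # constant ~ regressors
    # Solving the normal equations avoids forming inv(x'x) explicitly.
    return np.linalg.solve(x.T @ x, x.T @ y)


# Made-up data satisfying y = 1 + 2*x1 - x2 exactly, stored row by row.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 3)]
rows = [[1 + 2 * a - b, a, b] for a, b in points]
flat = [value for row in rows for value in row]

beta = ols_from_flat(flat, n_vars=3)
print(beta)  # approximately [1, 2, -1]
```

Because the made-up data fit the linear relation exactly, the estimates reproduce the coefficients used to generate it, which makes the example easy to verify by hand.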

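36 Some Regression output From STATA
    . use "C:\Docs\teaching\140\STATA\caschool.dta"
    . reg testscr str expn_stu avginc
The output header reports Number of obs = 420 and F(3, 416), together with Prob > F, R-squared, Adj R-squared, Root MSE and the Model, Residual and Total sums of squares (SS, df, MS). The coefficient table has columns Coef., Std. Err., t, P>|t| and [95% Conf. Interval], with rows str, expn_stu, avginc and _cons. The numeric values are not preserved in this transcription.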

37 Regression Output from GAUSS
TOPIC 1 OLS EXAMPLE USING GAUSS' BUILT IN OLS ROUTINE
Valid cases: 420        Dependent variable: Y
Missing cases: 0        Deletion method: None
Degrees of freedom: 416
The output also reports Total SS, Residual SS, R-squared, Rbar-squared, Std error of est, F(3,416) and Probability of F. The coefficient table has columns Variable, Estimate, Standard Error, t-value, Prob > |t|, Standardized Estimate and Cor with Dep Var, with rows CONSTANT and three regressors labeled X. The numeric values are not preserved in this transcription.

38 Measuring Goodness of Fit
Intuition: if the regression is really good, then the residuals will be very close to zero and the predictions of the dependent variable will be close to $y$ most of the time.
R-squared is the standard measure of fit. It is based on comparing the residual variance, or the prediction variance, with the variance of the dependent variable.

39 R-squared
Recall $\hat{y} = X\hat{\beta} = X(X'X)^{-1}X'y = Py$ and $y = Py + (I - P)y = Py + My$.
Definition:
$$R^2 = \frac{y'Py}{y'y} = 1 - \frac{y'My}{y'y} \in [0, 1]$$
where I use the properties $P = P'$ and $PP = P$, and $M = M'$ and $MM = M$.

40 Adjusted R-squared
Takes advantage of the different degrees of freedom adjustments in computing sample variances:
$$\bar{R}^2 = 1 - \frac{\left(\sum_{i=1}^{n}\hat{\varepsilon}_i^2\right)/(n - k)}{\left(\sum_{i=1}^{n}(y_i - \bar{y})^2\right)/(n - 1)}$$
Unlike $R^2$, $\bar{R}^2$ is bounded above by 1 but can be negative when the fit is poor.
Generally superior, but most programs still report both.
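The fit measures can be computed side by side in a few lines. The sketch below is an illustration with an invented helper `r_squared`. It returns the uncentered $R^2 = 1 - y'My/y'y$, the more commonly reported centered version (which uses deviations of $y$ from its mean), and the adjusted version. The four-observation data set is made up so that the regressor has no explanatory power.

```python
import numpy as np


def r_squared(y, X):
    """Return (uncentered R2, centered R2, adjusted R2) for an OLS fit of y on X."""
    n, k = X.shape
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_hat
    ssr = resid @ resid                       # residual sum of squares, y'My
    sst = np.sum((y - y.mean()) ** 2)         # total sum of squares around the mean
    r2_uncentered = 1.0 - ssr / (y @ y)
    r2_centered = 1.0 - ssr / sst
    r2_adjusted = 1.0 - (ssr / (n - k)) / (sst / (n - 1))
    return r2_uncentered, r2_centered, r2_adjusted


# Made-up data: the regressor is orthogonal to both the constant and demeaned y.
y_demo = np.array([1.0, 2.0, 3.0, 4.0])
X_demo = np.column_stack([np.ones(4), [1.0, -1.0, -1.0, 1.0]])
r2_u, r2_c, r2_adj = r_squared(y_demo, X_demo)
print(r2_u, r2_c, r2_adj)
```

In this example the centered $R^2$ is zero while the adjusted $R^2$ is negative, because the degrees-of-freedom penalty is paid for a regressor that explains nothing. The uncentered $R^2$ is still large because the constant alone accounts for most of $y'y$.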

Econometrics Midterm Examination Answers

Econometrics Midterm Examination Answers Econometrics Midterm Examination Answers March 4, 204. Question (35 points) Answer the following short questions. (i) De ne what is an unbiased estimator. Show that X is an unbiased estimator for E(X i

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals (SW Chapter 5) Outline. The standard error of ˆ. Hypothesis tests concerning β 3. Confidence intervals for β 4. Regression

More information

Introduction to Econometrics. Review of Probability & Statistics

Introduction to Econometrics. Review of Probability & Statistics 1 Introduction to Econometrics Review of Probability & Statistics Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com Introduction 2 What is Econometrics? Econometrics consists of the application of mathematical

More information

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e Economics 102: Analysis of Economic Data Cameron Spring 2016 Department of Economics, U.C.-Davis Final Exam (A) Tuesday June 7 Compulsory. Closed book. Total of 58 points and worth 45% of course grade.

More information

Nonlinear Regression Functions

Nonlinear Regression Functions Nonlinear Regression Functions (SW Chapter 8) Outline 1. Nonlinear regression functions general comments 2. Nonlinear functions of one variable 3. Nonlinear functions of two variables: interactions 4.

More information

Econometrics. 8) Instrumental variables

Econometrics. 8) Instrumental variables 30C00200 Econometrics 8) Instrumental variables Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Thery of IV regression Overidentification Two-stage least squates

More information

Extensions to the Basic Framework II

Extensions to the Basic Framework II Topic 7 Extensions to the Basic Framework II ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà Outline of this topic Nonlinear regression Limited Dependent Variable regression Applications of

More information

Hypothesis Tests and Confidence Intervals. in Multiple Regression

Hypothesis Tests and Confidence Intervals. in Multiple Regression ECON4135, LN6 Hypothesis Tests and Confidence Intervals Outline 1. Why multipple regression? in Multiple Regression (SW Chapter 7) 2. Simpson s paradox (omitted variables bias) 3. Hypothesis tests and

More information

Lab 07 Introduction to Econometrics

Lab 07 Introduction to Econometrics Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand

More information

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests

ECON Introductory Econometrics. Lecture 5: OLS with One Regressor: Hypothesis Tests ECON4150 - Introductory Econometrics Lecture 5: OLS with One Regressor: Hypothesis Tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 5 Lecture outline 2 Testing Hypotheses about one

More information

Introduction to Econometrics. Multiple Regression (2016/2017)

Introduction to Econometrics. Multiple Regression (2016/2017) Introduction to Econometrics STAT-S-301 Multiple Regression (016/017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 OLS estimate of the TS/STR relation: OLS estimate of the Test Score/STR relation:

More information

Lecture notes to Stock and Watson chapter 8

Lecture notes to Stock and Watson chapter 8 Lecture notes to Stock and Watson chapter 8 Nonlinear regression Tore Schweder September 29 TS () LN7 9/9 1 / 2 Example: TestScore Income relation, linear or nonlinear? TS () LN7 9/9 2 / 2 General problem

More information

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors ECON4150 - Introductory Econometrics Lecture 6: OLS with Multiple Regressors Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 6 Lecture outline 2 Violation of first Least Squares assumption

More information

Chapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression

Chapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression Chapter 7 Hypothesis Tests and Confidence Intervals in Multiple Regression Outline 1. Hypothesis tests and confidence intervals for a single coefficie. Joint hypothesis tests on multiple coefficients 3.

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 6 Multiple regression model Siv-Elisabeth Skjelbred University of Oslo February 5th Last updated: February 3, 2016 1 / 49 Outline Multiple linear regression model and

More information

Extensions to the Basic Framework I

Extensions to the Basic Framework I Topic 5 Extensions to the Basic Framework I ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà Outline of this topic Heteroskedasticity: reminder of OLS results and White (1980) corrected standard

More information

Essential of Simple regression

Essential of Simple regression Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship

More information

Greene, Econometric Analysis (7th ed, 2012)

Greene, Econometric Analysis (7th ed, 2012) EC771: Econometrics, Spring 2012 Greene, Econometric Analysis (7th ed, 2012) Chapters 2 3: Classical Linear Regression The classical linear regression model is the single most useful tool in econometrics.

More information

THE MULTIVARIATE LINEAR REGRESSION MODEL

THE MULTIVARIATE LINEAR REGRESSION MODEL THE MULTIVARIATE LINEAR REGRESSION MODEL Why multiple regression analysis? Model with more than 1 independent variable: y 0 1x1 2x2 u It allows : -Controlling for other factors, and get a ceteris paribus

More information

Hypothesis Tests and Confidence Intervals in Multiple Regression

Hypothesis Tests and Confidence Intervals in Multiple Regression Hypothesis Tests and Confidence Intervals in Multiple Regression (SW Chapter 7) Outline 1. Hypothesis tests and confidence intervals for one coefficient. Joint hypothesis tests on multiple coefficients

More information

Regression #8: Loose Ends

Regression #8: Loose Ends Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch

More information

ECO220Y Simple Regression: Testing the Slope

ECO220Y Simple Regression: Testing the Slope ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x

More information

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018 Econometrics I KS Module 2: Multivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: April 16, 2018 Alexander Ahammer (JKU) Module 2: Multivariate

More information

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018 Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 7 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 68 Outline of Lecture 7 1 Empirical example: Italian labor force

More information

Introduction to Econometrics. Multiple Regression

Introduction to Econometrics. Multiple Regression Introduction to Econometrics The statistical analysis of economic (and related) data STATS301 Multiple Regression Titulaire: Christopher Bruffaerts Assistant: Lorenzo Ricci 1 OLS estimate of the TS/STR

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 13 Nonlinearities Saul Lach October 2018 Saul Lach () Applied Statistics and Econometrics October 2018 1 / 91 Outline of Lecture 13 1 Nonlinear regression functions

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

Lecture 5. In the last lecture, we covered. This lecture introduces you to

Lecture 5. In the last lecture, we covered. This lecture introduces you to Lecture 5 In the last lecture, we covered. homework 2. The linear regression model (4.) 3. Estimating the coefficients (4.2) This lecture introduces you to. Measures of Fit (4.3) 2. The Least Square Assumptions

More information

Multiple Regression Analysis: Estimation. Simple linear regression model: an intercept and one explanatory variable (regressor)

Multiple Regression Analysis: Estimation. Simple linear regression model: an intercept and one explanatory variable (regressor) 1 Multiple Regression Analysis: Estimation Simple linear regression model: an intercept and one explanatory variable (regressor) Y i = β 0 + β 1 X i + u i, i = 1,2,, n Multiple linear regression model:

More information

Question 1a 1b 1c 1d 1e 2a 2b 2c 2d 2e 2f 3a 3b 3c 3d 3e 3f M ult: choice Points

Question 1a 1b 1c 1d 1e 2a 2b 2c 2d 2e 2f 3a 3b 3c 3d 3e 3f M ult: choice Points Economics 102: Analysis of Economic Data Cameron Spring 2016 May 12 Department of Economics, U.C.-Davis Second Midterm Exam (Version A) Compulsory. Closed book. Total of 30 points and worth 22.5% of course

More information

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor ECON4150 - Introductory Econometrics Lecture 4: Linear Regression with One Regressor Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 4 Lecture outline 2 The OLS estimators The effect of

More information

Econ 2120: Section 2

Econ 2120: Section 2 Econ 2120: Section 2 Part I - Linear Predictor Loose Ends Ashesh Rambachan Fall 2018 Outline Big Picture Matrix Version of the Linear Predictor and Least Squares Fit Linear Predictor Least Squares Omitted

More information

Chapter 6: Linear Regression With Multiple Regressors

Chapter 6: Linear Regression With Multiple Regressors Chapter 6: Linear Regression With Multiple Regressors 1-1 Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Section I. Define or explain the following terms (3 points each) 1. centered vs. uncentered 2 R - 2. Frisch theorem -

Section I. Define or explain the following terms (3 points each) 1. centered vs. uncentered 2 R - 2. Frisch theorem - First Exam: Economics 388, Econometrics Spring 006 in R. Butler s class YOUR NAME: Section I (30 points) Questions 1-10 (3 points each) Section II (40 points) Questions 11-15 (10 points each) Section III

More information

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests ECON4150 - Introductory Econometrics Lecture 7: OLS with Multiple Regressors Hypotheses tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 7 Lecture outline 2 Hypothesis test for single

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Econometrics Homework 1

Econometrics Homework 1 Econometrics Homework Due Date: March, 24. by This problem set includes questions for Lecture -4 covered before midterm exam. Question Let z be a random column vector of size 3 : z = @ (a) Write out z

More information

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b.

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. B203: Quantitative Methods Answer all questions from part I. Answer two question from part II.a, and one question from part II.b. Part I: Compulsory Questions. Answer all questions. Each question carries

More information

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1 Introductory Econometrics Lecture 13: Hypothesis testing in the multiple regression model, Part 1 Jun Ma School of Economics Renmin University of China October 19, 2016 The model I We consider the classical

More information

Inference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58

Inference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Inference ME104: Linear Regression Analysis Kenneth Benoit August 15, 2012 August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Stata output resvisited. reg votes1st spend_total incumb minister

More information

Week 3: Simple Linear Regression

Week 3: Simple Linear Regression Week 3: Simple Linear Regression Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED 1 Outline

More information

Problem Set 1 ANSWERS

Problem Set 1 ANSWERS Economics 20 Prof. Patricia M. Anderson Problem Set 1 ANSWERS Part I. Multiple Choice Problems 1. If X and Z are two random variables, then E[X-Z] is d. E[X] E[Z] This is just a simple application of one

More information

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is Practice Final Exam Last Name:, First Name:. Please write LEGIBLY. Answer all questions on this exam in the space provided (you may use the back of any page if you need more space). Show all work but do

More information

Linear Regression. Junhui Qian. October 27, 2014

Linear Regression. Junhui Qian. October 27, 2014 Linear Regression Junhui Qian October 27, 2014 Outline The Model Estimation Ordinary Least Square Method of Moments Maximum Likelihood Estimation Properties of OLS Estimator Unbiasedness Consistency Efficiency

More information

Introduction to Econometrics Third Edition James H. Stock Mark W. Watson The statistical analysis of economic (and related) data

Introduction to Econometrics Third Edition James H. Stock Mark W. Watson The statistical analysis of economic (and related) data Introduction to Econometrics Third Edition James H. Stock Mark W. Watson The statistical analysis of economic (and related) data 1/2/3-1 1/2/3-2 Brief Overview of the Course Economics suggests important

More information

Question 1 carries a weight of 25%; Question 2 carries 20%; Question 3 carries 20%; Question 4 carries 35%.

Question 1 carries a weight of 25%; Question 2 carries 20%; Question 3 carries 20%; Question 4 carries 35%. UNIVERSITY OF EAST ANGLIA School of Economics Main Series PGT Examination 017-18 ECONOMETRIC METHODS ECO-7000A Time allowed: hours Answer ALL FOUR Questions. Question 1 carries a weight of 5%; Question

More information

Measurement Error. Often a data set will contain imperfect measures of the data we would ideally like.

Measurement Error. Often a data set will contain imperfect measures of the data we would ideally like. Measurement Error Often a data set will contain imperfect measures of the data we would ideally like. Aggregate Data: (GDP, Consumption, Investment are only best guesses of theoretical counterparts and

More information

Multiple Linear Regression CIVL 7012/8012

Multiple Linear Regression CIVL 7012/8012 Multiple Linear Regression CIVL 7012/8012 2 Multiple Regression Analysis (MLR) Allows us to explicitly control for many factors those simultaneously affect the dependent variable This is important for

Instrumental Variable Regression

Instrumental Variable Regression Topic 6 Instrumental Variable Regression ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà Outline of this topic Randomized Experiments, natural experiments and causation Instrumental variables:

At this point, if you've done everything correctly, you should have data that looks something like:

At this point, if you've done everything correctly, you should have data that looks something like: This homework is due on July 19th. Economics 375: Introduction to Econometrics Homework #4 1. One tool to aid in understanding econometrics is the Monte Carlo experiment. A Monte Carlo experiment allows

Handout 12. Endogeneity & Simultaneous Equation Models

Handout 12. Endogeneity & Simultaneous Equation Models Handout 12. Endogeneity & Simultaneous Equation Models In which you learn about another potential source of endogeneity caused by the simultaneous determination of economic variables, and learn how to

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides cover: The linear regression

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

Lecture 3: Multivariate Regression

Lecture 3: Multivariate Regression Lecture 3: Multivariate Regression Rates, cont. Two weeks ago, we modeled state homicide rates as being dependent on one variable: poverty. In reality, we know that state homicide rates depend on numerous

Lab 6 - Simple Regression

Lab 6 - Simple Regression Lab 6 - Simple Regression Spring 2017 Contents 1 Thinking About Regression 2 2 Regression Output 3 3 Fitted Values 5 4 Residuals 6 5 Functional Forms 8 Updated from Stata tutorials provided by Prof. Cichello

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics

ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics ESTIMATING AVERAGE TREATMENT EFFECTS: REGRESSION DISCONTINUITY DESIGNS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. The Sharp RD Design 3.

4 Instrumental Variables Single endogenous variable One continuous instrument. 2

4 Instrumental Variables Single endogenous variable One continuous instrument. Econ 495 - Econometric Review 1. Contents: 4 Instrumental Variables; 4.1 Single endogenous variable, one continuous instrument; 4.2 Single endogenous variable, more than one continuous instrument

Practice exam questions

Practice exam questions Practice exam questions Nathaniel Higgins nhiggins@jhu.edu, nhiggins@ers.usda.gov 1. The following question is based on the model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$. Discuss the following two hypotheses.

Control Function and Related Methods: Nonlinear Models

Control Function and Related Methods: Nonlinear Models Control Function and Related Methods: Nonlinear Models Jeff Wooldridge Michigan State University Programme Evaluation for Policy Analysis Institute for Fiscal Studies June 2012 1. General Approach 2. Nonlinear

Instrumental Variables, Simultaneous and Systems of Equations

Instrumental Variables, Simultaneous and Systems of Equations Chapter 6 Instrumental Variables, Simultaneous and Systems of Equations 6.1 Instrumental variables In the linear regression model $y_i = x_i \beta + \varepsilon_i$ (6.1) we have been assuming that $x_i$ and $\varepsilon_i$ are uncorrelated

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Michele Aquaro University of Warwick This version: July 21, 2016 1 / 31 Reading material Textbook: Introductory

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

LECTURE 2 LINEAR REGRESSION MODEL AND OLS SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another

Nonrecursive Models Highlights Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015

Nonrecursive Models Highlights Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015 Nonrecursive Models Highlights Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised April 6, 2015 This lecture borrows heavily from Duncan's Introduction to Structural

2. (3.5) (iii) Simply drop one of the independent variables, say leisure: $\text{GPA} = \beta_0 + \beta_1 \text{study} + \beta_2 \text{sleep} + \beta_3 \text{work} + u$.

2. (3.5) (iii) Simply drop one of the independent variables, say leisure: $\text{GPA} = \beta_0 + \beta_1 \text{study} + \beta_2 \text{sleep} + \beta_3 \text{work} + u$. BOSTON COLLEGE Department of Economics EC 228 Econometrics, Prof. Baum, Ms. Yu, Fall 2003 Problem Set 3 Solutions Problem sets should be your own work. You may work together with classmates, but if you

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria

ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria ECONOMETRICS II (ECO 2401S) University of Toronto. Department of Economics. Spring 2013 Instructor: Victor Aguirregabiria SOLUTION TO FINAL EXAM Friday, April 12, 2013. From 9:00-12:00 (3 hours) INSTRUCTIONS:

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10) Name Economics 170 Spring 2004 Honor pledge: I have neither given nor received aid on this exam including the preparation of my one page formula list and the preparation of the Stata assignment for the

Econometrics II Censoring & Truncation. May 5, 2011

Econometrics II Censoring & Truncation. May 5, 2011 Econometrics II Censoring & Truncation Måns Söderbom May 5, 2011 1 Censored and Truncated Models Recall that a corner solution is an actual economic outcome, e.g. zero expenditure on health by a household

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

1 Linear Regression Analysis The Mincer Wage Equation Data Econometric Model Estimation... 11

1 Linear Regression Analysis The Mincer Wage Equation Data Econometric Model Estimation. Econ 495 - Econometric Review 1. Contents: 1 Linear Regression Analysis; 1.1 The Mincer Wage Equation; 1.2 Data; 1.3 Econometric Model

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

Chapter 11. Regression with a Binary Dependent Variable

Chapter 11. Regression with a Binary Dependent Variable Chapter 11 Regression with a Binary Dependent Variable 2 Regression with a Binary Dependent Variable (SW Chapter 11) So far the dependent variable (Y) has been continuous: district-wide average test score

The Classical Linear Regression Model

The Classical Linear Regression Model The Classical Linear Regression Model ME104: Linear Regression Analysis Kenneth Benoit August 14, 2012 CLRM: Basic Assumptions 1. Specification: Relationship between X and Y in the population is linear:

Handout 11: Measurement Error

Handout 11: Measurement Error Handout 11: Measurement Error In which you learn to recognise the consequences for OLS estimation whenever some of the variables you use are not measured as accurately as you might expect. A (potential)

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 5 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 44 Outline of Lecture 5 Now that we know the sampling distribution

2.1. Consider the following production function, known in the literature as the transcendental production function (TPF).

2.1. Consider the following production function, known in the literature as the transcendental production function (TPF). CHAPTER 2 Functional Forms of Regression Models 2.1. Consider the following production function, known in the literature as the transcendental production function (TPF). $Q_i = B_1 L_i^{B_2} K_i^{B_3} e^{B_4 L_i + B_5 K_i}$

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler Basic econometrics Tutorial 3 Dipl.Kfm. Introduction Some of you were asking about material to revise/prepare econometrics fundamentals. First of all, be aware that I will not be too technical, only as

Problem Set 10: Panel Data

Problem Set 10: Panel Data Problem Set 10: Panel Data 1. Read in the data set, e11panel1.dta from the course website. This contains data on a sample of 1252 men and women who were asked about their hourly wage in two years, 2005

Immigration attitudes (opposes immigration or supports it) it may seriously misestimate the magnitude of the effects of IVs

Immigration attitudes (opposes immigration or supports it) it may seriously misestimate the magnitude of the effects of IVs Logistic Regression, Part I: Problems with the Linear Probability Model (LPM) Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 22, 2015 This handout steals

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix

Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics

Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics Problem Set #5-Key Sonoma State University Dr. Cuellar Economics 317- Introduction to Econometrics C1.1 Use the data set Wage1.dta to answer the following questions. Estimate regression equation wage =

Statistical Inference with Regression Analysis

Statistical Inference with Regression Analysis Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing

ECONOMICS AND ECONOMIC METHODS PRELIM EXAM Statistics and Econometrics August 2013

ECONOMICS AND ECONOMIC METHODS PRELIM EXAM Statistics and Econometrics August 2013 ECONOMICS AND ECONOMIC METHODS PRELIM EXAM Statistics and Econometrics August 2013 Instructions: Answer all six (6) questions. Point totals for each question are given in parentheses. The parts within

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does

8. Nonstandard standard error issues 8.1. The bias of robust standard errors

8. Nonstandard standard error issues 8.1. The bias of robust standard errors 8.1. The bias of robust standard errors Bias Robust standard errors are now easily obtained using e.g. Stata option robust Robust standard errors are preferable to normal standard errors when residuals

MS&E 226: Small Data. Lecture 6: Bias and variance (v2) Ramesh Johari

MS&E 226: Small Data. Lecture 6: Bias and variance (v2) Ramesh Johari MS&E 226: Small Data Lecture 6: Bias and variance (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 47 Our plan today We saw in last lecture that model scoring methods seem to be trading off two different

Lecture 8: Instrumental Variables Estimation

Lecture 8: Instrumental Variables Estimation Lecture Notes on Advanced Econometrics Lecture 8: Instrumental Variables Estimation Takashi Yamano Endogenous Variables Consider a population model: $y_{1i} = \alpha y_{2i} + \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + u_i$

ECON 594: Lecture #6

ECON 594: Lecture #6 ECON 594: Lecture #6 Thomas Lemieux Vancouver School of Economics, UBC May 2018 1 Limited dependent variables: introduction Up to now, we have been implicitly assuming that the dependent variable, y, was

Specification Error: Omitted and Extraneous Variables

Specification Error: Omitted and Extraneous Variables Specification Error: Omitted and Extraneous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 5, 05 Omitted variable bias. Suppose that the correct

Quantitative Methods Final Exam (2017/1)

Quantitative Methods Final Exam (2017/1) Quantitative Methods Final Exam (2017/1) 1. Please write down your name and student ID number. 2. Calculator is allowed during the exam, but DO NOT use a smartphone. 3. List your answers (together with

Lecture #8 & #9 Multiple regression

Lecture #8 & #9 Multiple regression Lecture #8 & #9 Multiple regression Starting point: Y = f(x 1, X 2,, X k, u) Outcome variable of interest (movie ticket price) a function of several variables. Observables and unobservables. One or more

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors:

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors: Wooldridge, Introductory Econometrics, d ed. Chapter 3: Multiple regression analysis: Estimation In multiple regression analysis, we extend the simple (two-variable) regression model to consider the possibility

Graduate Econometrics Lecture 4: Heteroskedasticity

Graduate Econometrics Lecture 4: Heteroskedasticity Graduate Econometrics Lecture 4: Heteroskedasticity Department of Economics University of Gothenburg November 30, 2014 1/43 and Autocorrelation Consequences for OLS Estimator Begin from the linear model

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

Linear Regression with one Regressor

Linear Regression with one Regressor 1 Linear Regression with one Regressor Covering Chapters 4.1 and 4.2. We've seen the California test score data before. Now we will try to estimate the marginal effect of STR on SCORE. To motivate these
