Econometrics I. Professor William Greene Stern School of Business Department of Economics 5-1/36. Part 5: Regression Algebra and Fit

Size: px
Start display at page:

Download "Econometrics I. Professor William Greene Stern School of Business Department of Economics 5-1/36. Part 5: Regression Algebra and Fit"

Transcription

1 Econometrics I Professor William Greene Stern School of Business Department of Economics 5-1/36

2 Econometrics I Part 5 Regression Algebra and Fit; Restricted Least Squares 5-2/36

3 Minimizing ee b minimizes ee = (y - Xb)(y - Xb). Any other coefficient vector has a larger sum of squared residuals. Proof: d = the vector, not equal to b; u = Xd u = y Xd = y Xb + Xb Xd = e - X(d - b). Then, uu = (y - Xd)(y-Xd) = sum of squares using d = [(y Xb) - X(d - b)][(y Xb) - X(d - b)] = [e - X(d - b)][e - X(d - b)] Expand to find uu = ee + (d-b)xx(d-b) (The cross product term is 2eX(d-b)=0 as X e = 0.) = e e + v v > ee 5-3/36

4 Dropping a Variable An important special case. Suppose b X,z = [b,c] = the regression coefficients in a regression of y on [X,z] b X = [d,0] = is the same, but computed to force the coefficient on z to equal 0. This removes z from the regression. We are comparing the results that we get with and without the variable z in the equation. Results which we can show: Dropping a variable(s) cannot improve the fit - that is, it cannot reduce the sum of squared residuals. Adding a variable(s) cannot degrade the fit - that is, it cannot increase the sum of squared residuals. 5-4/36

5 Adding a Variable Never Increases the Sum of Squares Theorem 3.5 on text page 40. u = the residual in the regression of y on [X,z] e = the residual in the regression of y on X alone, uu = ee c 2 (z*z*) ee where z* = M X z and c is the coefficient on z in the regression of y on [X,z]. 5-5/36

6 The Fit of the Regression Variation: In the context of the model we speak of covariation of a variable as movement of the variable, usually associated with (not necessarily caused by) movement of another variable. Total variation = (y = ym 0 i - y) y. M 0 = I i(i i) -1 i = the M matrix for X = a column of ones. n 2 i=1 5-6/36

7 Decomposing the Variation y i xb + e y y xb - xb + e i i i i = ( x - x) b+e i i i n 2 n 2 n 2 (y i y) [( x i 1 i1 i - x) b] e i1 i (Sum of cross products is zero.) Total variation = regression variation + residual variation Recall the decomposition: Var[y] = Var [E[y x]] + E[Var [ y x ]] = Variation of the conditional mean around the overall mean + Variation around the conditional mean function. 5-7/36

8 Decomposing the Variation of Vector y Decomposition: (This all assumes the model contains a constant term. one of the columns in X is i.) y = Xb + e so M 0 y = M 0 Xb + M 0 e = M 0 Xb + e. (Deviations from means.) ym 0 y = b(x M 0 )(M 0 X)b + ee = bxm 0 Xb + ee. (M 0 is idempotent and e M 0 X = e X = 0.) Total sum of squares = Regression Sum of Squares (SSR)+ Residual Sum of Squares (SSE) 5-8/36

9 A Fit Measure R 2 = bxm 0 Xb/yM 0 y e'e 1 (y y) n 2 i1 i Regression Variation Total Variation (Very Important Result.) R 2 is bounded by zero and one only if: (a) There is a constant term in X and (b) The line is computed by linear least squares. 5-9/36

10 Adding Variables R 2 never falls when a variable z is added to the regression. A useful general result for adding a variable 2 R with both X and variable z equals Xz 2 R X with only *2 Xz X X yz X X plus the increase in fit due to after X is accounted for: R R (1 R )r z Useful practical wisdom: It is not possible meaningfully to accumulate R 2 by adding variables in sequence. The incremental fit added by each variable depends on the order. The increase in R 2 that occurs from x 3 in (x 1 then x 2 then x 3 ) is different from that in (x 1 then x 3 then x 2 ). 5-10/36

11 Adding Variables to a Model What is the effect of adding PN, PD, PS? 5-11/36

12 A Useful Result Squared partial correlation of an x in X with y is squared t - ratio squared t - ratio + degrees of freedom We will define the 't-ratio' and 'degrees of freedom' later. Note how it enters: R R (1 R )r r R R / 1 R *2 2* Xz X X yz yz Xz X X 5-12/36

13 Partial Correlation Partial correlation is a difference in R 2 s. For PS in the example above, R 2 without PS =.9861, R 2 with PS =.9907 ( ) / ( ) = / ( (36-5)) = /36

14 Comparing fits of regressions Make sure the denominator in R 2 is the same - i.e., same left hand side variable. Example, linear vs. loglinear. Loglinear will almost always appear to fit better because taking logs reduces variation. 5-14/36

15 5-15/36

16 (Linearly) Transformed Data How does linear transformation affect the results of least squares? Z = XP for KK nonsingular P Based on X, b = (XX) -1 X y. Based on Z, c = (ZZ) -1 Z y = (P XXP) -1 P X y = P -1 (X X) -1 P -1 P X y = P -1 b Fitted value is Zc = (XP)(P -1 b) = Xb. The same!! Residuals from using Z are y - Zc = y - Xb (we just proved this.). The same!! Sum of squared residuals must be identical, as y-xb = e = y-zc. R 2 must also be identical, as R 2 = 1 - ee/y M 0 y (!!). 5-16/36

17 Linear Transformation What are the practical implications of this result? (1) Transformation does not affect the fit of a model to a body of data. (2) Transformation does affect the estimates. If b is an estimate of something (), then c cannot be an estimate of - it must be an estimate of P -1, which might have no meaning at all. Xb is the projection of y into the column space of X. Zc is the projection of y into the column space of Z. But, since the columns of Z are just linear combinations of those of X, the column space of Z must be identical to that of X. Therefore, the projection of y into the former must be the same as the latter, which now produces the other results.) 5-17/36

18 Principal Components Z = XC Fewer columns than X Includes as much variation of X as possible Columns of Z are orthogonal Why do we do this? Collinearity Combine variables of ambiguous identity such as test scores as measures of ability 5-18/36

19 What is a Principal Component? X = a data matrix (deviations from means) z = Xp = a linear combination of the columns of X. Choose p to maximize the variation of z. How? p = eigenvector that corresponds to the largest eigenvalue of X X. (Notes 7:41-44.) 5-19/36

20 5-20/36

21 Movie Regression. Opening Week Box for 62 Films Ordinary least squares regression LHS=LOGBOX Mean = Standard deviation = Number of observs. = 62 Residuals Sum of squares = Standard error of e = Fit R-squared = Adjusted R-squared = Variable Coefficient Standard Error t-ratio P[ T >t] Mean of X Constant *** LOGBUDGT STARPOWR SEQUEL MPRATING * ACTION *** COMEDY ANIMATED ** HORROR INTERNET BUZZ VARIABLES LOGADCT.29451** LOGCMSON LOGFNDGO CNTWAIT *** /36

22 The fit goes down when the 4 buzz variables are reduced to a single linear combination of the Ordinary least squares regression p LHS=LOGBOX Mean = p Standard deviation = P= p Number of observs. = Residuals Sum of squares = p Standard error of e = Fit R-squared = Adjusted R-squared = Variable Coefficient Standard Error t-ratio P[ T >t] Mean of X Constant *** LOGBUDGT.38159** STARPOWR SEQUEL MPRATING ACTION ** COMEDY ANIMATED * HORROR PCBUZZ.39704*** /36

23 5-23/36

24 Adjusted R Squared Adjusted R 2 (adjusted for degrees of freedom) 2 n1 2 R 1 1 R n K Degrees of freedom adjustment. R 2 includes a penalty for variables that don t add much fit. Can fall when a variable is added to the equation. 5-24/36

25 (1.34) /36

26 What is being adjusted? Adjusted R 2 The penalty for using up degrees of freedom. 2 R = 1 - [ee/(n K)]/[yM 0 y/(n-1)] = 1 [(n-1)/(n-k)(1 R 2 )] R 2 Will rise when a variable is added to the regression? 2 R is higher with z than without z if and only if the t ratio on z is in the regression when it is added is larger than one in absolute value. (See p. 46 in text.) 5-26/36

27 Full Regression (Without PD) Ordinary least squares regression... LHS=G Mean = Standard deviation = Number of observs. = 36 Model size Parameters = 9 Degrees of freedom = 27 Residuals Sum of squares = Standard error of e = Fit R-squared = <********** Adjusted R-squared = <********** Info criter. LogAmemiya Prd. Crt. = <********** Akaike Info. Criter. = <********** Model test F[ 8, 27] (prob) = 503.3(.0000) Variable Coefficient Standard Error t-ratio P[ T >t] Mean of X Constant ** PG *** Y.02214*** PNC PUC PPT PN * PS *** YEAR ** /36

28 PD added to the model. R 2 rises, Adjusted R 2 falls Ordinary least squares regression... LHS=G Mean = Standard deviation = Number of observs. = 36 Model size Parameters = 10 Degrees of freedom = 26 Residuals Sum of squares = Standard error of e = Fit R-squared = Was Adjusted R-squared = Was Variable Coefficient Standard Error t-ratio P[ T >t] Mean of X Constant ** PG *** Y.02231*** PNC PUC PPT PD (NOTE LOW t ratio) PN PS ** YEAR * /36

29 Linear Least Squares Subject to Restrictions Restrictions: Theory imposes certain restrictions on parameters. Some common applications Dropping variables from the equation = certain coefficients in b forced to equal 0. (Probably the most common testing situation. Is a certain variable significant? ) Adding up conditions: Sums of certain coefficients must equal fixed values. Adding up conditions in demand systems. Constant returns to scale in production functions. Equality restrictions: Certain coefficients must equal other coefficients. Using real vs. nominal variables in equations. General formulation for linear restrictions: Minimize the sum of squares, ee, subject to the linear constraint Rb = q. 5-29/36

30 Restricted Least Squares In practice, restrictions can usually be imposed by solving them out. 1. Force a coefficient to equal zero. Drop the variable from the equation n i 1 i1 2xi2 3x 2 i1 i3 3 Problem: Minimize for,, (y x n i1 i 1 i1 2 i2 Solution: Minimize for, (y x x ) 2. Adding up restriction. Impose + + = 1. Strategy: =1. Solution: Minimize for, ( 3. Equality restriction. Impose n yi 1xi1 2x i2 (1 i 1 1 2)x i3) ) subject to 0 n 2 i i3 1 i1 i1 i3 2 i2 i3 = [(y x ) (x x ) (x x )] 3 2 n i 1 i1 2 i2 3x i3) subject to i1 3 2 Minimize for,, (y x x n i1 i 1 i1 2 i2 i3 Solution: Minimize for, [y x (x x )] In each case, least squares using transformations of the data. 5-30/36

31 Restricted Least Squares Solution General Approach: Programming Problem Minimize for L = (y - X)(y - X) subject to R = q Each row of R is the K coefficients in a restriction. There are J restrictions: J rows 3 = 0: R = [0,0,1,0, ] q = (0). J=1 2 = 3 : R = [0,1,-1,0, ] q = (0). J=1 2 = 0, 3 = 0: R = 0,1,0,0, q = 0 J=2 0,0,1,0, /36

32 Solution Strategy Quadratic program: Minimize quadratic criterion subject to linear restrictions All restrictions are binding Solve using Lagrangean formulation Minimize over (,) L* = (y - X)(y - X) + 2(R-q) (The 2 is for convenience see below.) 5-32/36

33 Necessary Conditions L* Restricted LS Solution 2 X( y X) 2R 0 L* 2( R q) 0 Divide everything by 2. Collect in a matrix form XX R Xy or = Solution ˆ = R 0 q Does not rely on full rank of X. Relies on column rank of A = K J. 1 A w. A w 5-33/36

34 Restricted Least Squares If X has full rank, there is a partitioned solution for * and * β* b XX R R XX R Rb q 1 1 = - ( ) [ ( ) ]( ) 1 * [ R( X X) R ]( Rb q) 1 where b the simple least squares coefficients, b = ( X X) X y. Note β* = b and * 0 if Rb q = /36

35 Aspects of Restricted LS 1. b* = b - Cm where m = the discrepancy vector Rb - q. Note what happens if m = 0. What does m = 0 mean? 2. =[R(XX) -1 R] -1 (Rb - q) = [R(XX) -1 R] -1 m. When does = 0. What does this mean? 3. Combining results: b* = b - (XX) -1 R. How could b* = b? 5-35/36

36 Restrictions and the Criterion Function Assume full rank X case. (The usual case.) b XX Xy y X X 1 = ( ) uniquely minimizes ( - ) (y- ) =. ( y- Xb) ( y- Xb) < ( y- Xb*) ( y-xb*) for any b* b. Imposing restrictions cannot improve the criterion value. 2 2 It follows that R * < R. Restrictions must degrade the fit. 5-36/36

Econometrics I. Professor William Greene Stern School of Business Department of Economics 3-1/29. Part 3: Least Squares Algebra

Econometrics I. Professor William Greene Stern School of Business Department of Economics 3-1/29. Part 3: Least Squares Algebra Econometrics I Professor William Greene Stern School of Business Department of Economics 3-1/29 Econometrics I Part 3 Least Squares Algebra 3-2/29 Vocabulary Some terms to be used in the discussion. Population

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

Inverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1

Inverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1 Inverse of a Square Matrix For an N N square matrix A, the inverse of A, 1 A, exists if and only if A is of full rank, i.e., if and only if no column of A is a linear combination 1 of the others. A is

More information

Linear Models Review

Linear Models Review Linear Models Review Vectors in IR n will be written as ordered n-tuples which are understood to be column vectors, or n 1 matrices. A vector variable will be indicted with bold face, and the prime sign

More information

Linear Algebra Review

Linear Algebra Review Linear Algebra Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Linear Algebra Review 1 / 45 Definition of Matrix Rectangular array of elements arranged in rows and

More information

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors:

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors: Wooldridge, Introductory Econometrics, d ed. Chapter 3: Multiple regression analysis: Estimation In multiple regression analysis, we extend the simple (two-variable) regression model to consider the possibility

More information

Lecture 9 SLR in Matrix Form

Lecture 9 SLR in Matrix Form Lecture 9 SLR in Matrix Form STAT 51 Spring 011 Background Reading KNNL: Chapter 5 9-1 Topic Overview Matrix Equations for SLR Don t focus so much on the matrix arithmetic as on the form of the equations.

More information

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix

More information

Bare minimum on matrix algebra. Psychology 588: Covariance structure and factor models

Bare minimum on matrix algebra. Psychology 588: Covariance structure and factor models Bare minimum on matrix algebra Psychology 588: Covariance structure and factor models Matrix multiplication 2 Consider three notations for linear combinations y11 y1 m x11 x 1p b11 b 1m y y x x b b n1

More information

Next is material on matrix rank. Please see the handout

Next is material on matrix rank. Please see the handout B90.330 / C.005 NOTES for Wednesday 0.APR.7 Suppose that the model is β + ε, but ε does not have the desired variance matrix. Say that ε is normal, but Var(ε) σ W. The form of W is W w 0 0 0 0 0 0 w 0

More information

Theorems. Least squares regression

Theorems. Least squares regression Theorems In this assignment we are trying to classify AML and ALL samples by use of penalized logistic regression. Before we indulge on the adventure of classification we should first explain the most

More information

Econometric Analysis of Panel Data. Assignment 3

Econometric Analysis of Panel Data. Assignment 3 Department of Economics Econometric Analysis of Panel Data Professor William Greene Phone: 22.998.0876 Office: KMC 7-78 Home page:www.stern.nyu.edu/~wgreene Email: wgreene@stern.nyu.edu URL for course

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics Lecture 3 : Regression: CEF and Simple OLS Zhaopeng Qu Business School,Nanjing University Oct 9th, 2017 Zhaopeng Qu (Nanjing University) Introduction to Econometrics Oct 9th,

More information

Economics 620, Lecture 5: exp

Economics 620, Lecture 5: exp 1 Economics 620, Lecture 5: The K-Variable Linear Model II Third assumption (Normality): y; q(x; 2 I N ) 1 ) p(y) = (2 2 ) exp (N=2) 1 2 2(y X)0 (y X) where N is the sample size. The log likelihood function

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Greene, Econometric Analysis (7th ed, 2012)

Greene, Econometric Analysis (7th ed, 2012) EC771: Econometrics, Spring 2012 Greene, Econometric Analysis (7th ed, 2012) Chapters 2 3: Classical Linear Regression The classical linear regression model is the single most useful tool in econometrics.

More information

REVIEW (MULTIVARIATE LINEAR REGRESSION) Explain/Obtain the LS estimator () of the vector of coe cients (b)

REVIEW (MULTIVARIATE LINEAR REGRESSION) Explain/Obtain the LS estimator () of the vector of coe cients (b) REVIEW (MULTIVARIATE LINEAR REGRESSION) Explain/Obtain the LS estimator () of the vector of coe cients (b) Explain/Obtain the variance-covariance matrix of Both in the bivariate case (two regressors) and

More information

Correlation 1. December 4, HMS, 2017, v1.1

Correlation 1. December 4, HMS, 2017, v1.1 Correlation 1 December 4, 2017 1 HMS, 2017, v1.1 Chapter References Diez: Chapter 7 Navidi, Chapter 7 I don t expect you to learn the proofs what will follow. Chapter References 2 Correlation The sample

More information

Linear Regression. y» F; Ey = + x Vary = ¾ 2. ) y = + x + u. Eu = 0 Varu = ¾ 2 Exu = 0:

Linear Regression. y» F; Ey = + x Vary = ¾ 2. ) y = + x + u. Eu = 0 Varu = ¾ 2 Exu = 0: Linear Regression 1 Single Explanatory Variable Assume (y is not necessarily normal) where Examples: y» F; Ey = + x Vary = ¾ 2 ) y = + x + u Eu = 0 Varu = ¾ 2 Exu = 0: 1. School performance as a function

More information

Applied Econometrics (QEM)

Applied Econometrics (QEM) Applied Econometrics (QEM) based on Prinicples of Econometrics Jakub Mućk Department of Quantitative Economics Jakub Mućk Applied Econometrics (QEM) Meeting #3 1 / 42 Outline 1 2 3 t-test P-value Linear

More information

Brief Suggested Solutions

Brief Suggested Solutions DEPARTMENT OF ECONOMICS UNIVERSITY OF VICTORIA ECONOMICS 366: ECONOMETRICS II SPRING TERM 5: ASSIGNMENT TWO Brief Suggested Solutions Question One: Consider the classical T-observation, K-regressor linear

More information

Multiple Linear Regression CIVL 7012/8012

Multiple Linear Regression CIVL 7012/8012 Multiple Linear Regression CIVL 7012/8012 2 Multiple Regression Analysis (MLR) Allows us to explicitly control for many factors those simultaneously affect the dependent variable This is important for

More information

1. The OLS Estimator. 1.1 Population model and notation

1. The OLS Estimator. 1.1 Population model and notation 1. The OLS Estimator OLS stands for Ordinary Least Squares. There are 6 assumptions ordinarily made, and the method of fitting a line through data is by least-squares. OLS is a common estimation methodology

More information

Random Vectors, Random Matrices, and Matrix Expected Value

Random Vectors, Random Matrices, and Matrix Expected Value Random Vectors, Random Matrices, and Matrix Expected Value James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 16 Random Vectors,

More information

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

More information

Statistics II. Management Degree Management Statistics IIDegree. Statistics II. 2 nd Sem. 2013/2014. Management Degree. Simple Linear Regression

Statistics II. Management Degree Management Statistics IIDegree. Statistics II. 2 nd Sem. 2013/2014. Management Degree. Simple Linear Regression Model 1 2 Ordinary Least Squares 3 4 Non-linearities 5 of the coefficients and their to the model We saw that econometrics studies E (Y x). More generally, we shall study regression analysis. : The regression

More information

EIGENVALUES AND SINGULAR VALUE DECOMPOSITION

EIGENVALUES AND SINGULAR VALUE DECOMPOSITION APPENDIX B EIGENVALUES AND SINGULAR VALUE DECOMPOSITION B.1 LINEAR EQUATIONS AND INVERSES Problems of linear estimation can be written in terms of a linear matrix equation whose solution provides the required

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

Advanced Econometrics I

Advanced Econometrics I Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics

More information

The general linear regression with k explanatory variables is just an extension of the simple regression as follows

The general linear regression with k explanatory variables is just an extension of the simple regression as follows 3. Multiple Regression Analysis The general linear regression with k explanatory variables is just an extension of the simple regression as follows (1) y i = β 0 + β 1 x i1 + + β k x ik + u i. Because

More information

Multivariate Regression (Chapter 10)

Multivariate Regression (Chapter 10) Multivariate Regression (Chapter 10) This week we ll cover multivariate regression and maybe a bit of canonical correlation. Today we ll mostly review univariate multivariate regression. With multivariate

More information

Vectors and Matrices Statistics with Vectors and Matrices

Vectors and Matrices Statistics with Vectors and Matrices Vectors and Matrices Statistics with Vectors and Matrices Lecture 3 September 7, 005 Analysis Lecture #3-9/7/005 Slide 1 of 55 Today s Lecture Vectors and Matrices (Supplement A - augmented with SAS proc

More information

4. Nonlinear regression functions

4. Nonlinear regression functions 4. Nonlinear regression functions Up to now: Population regression function was assumed to be linear The slope(s) of the population regression function is (are) constant The effect on Y of a unit-change

More information

Economics 620, Lecture 2: Regression Mechanics (Simple Regression)

Economics 620, Lecture 2: Regression Mechanics (Simple Regression) 1 Economics 620, Lecture 2: Regression Mechanics (Simple Regression) Observed variables: y i ; x i i = 1; :::; n Hypothesized (model): Ey i = + x i or y i = + x i + (y i Ey i ) ; renaming we get: y i =

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley

Review of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate

More information

18.S096 Problem Set 3 Fall 2013 Regression Analysis Due Date: 10/8/2013

18.S096 Problem Set 3 Fall 2013 Regression Analysis Due Date: 10/8/2013 18.S096 Problem Set 3 Fall 013 Regression Analysis Due Date: 10/8/013 he Projection( Hat ) Matrix and Case Influence/Leverage Recall the setup for a linear regression model y = Xβ + ɛ where y and ɛ are

More information

Chapter 4: Factor Analysis

Chapter 4: Factor Analysis Chapter 4: Factor Analysis In many studies, we may not be able to measure directly the variables of interest. We can merely collect data on other variables which may be related to the variables of interest.

More information

MS&E 226: Small Data. Lecture 6: Bias and variance (v2) Ramesh Johari

MS&E 226: Small Data. Lecture 6: Bias and variance (v2) Ramesh Johari MS&E 226: Small Data Lecture 6: Bias and variance (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 47 Our plan today We saw in last lecture that model scoring methods seem to be trading o two di erent

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

Interpreting Regression Results

Interpreting Regression Results Interpreting Regression Results Carlo Favero Favero () Interpreting Regression Results 1 / 42 Interpreting Regression Results Interpreting regression results is not a simple exercise. We propose to split

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

The Simple Regression Model. Part II. The Simple Regression Model

The Simple Regression Model. Part II. The Simple Regression Model Part II The Simple Regression Model As of Sep 22, 2015 Definition 1 The Simple Regression Model Definition Estimation of the model, OLS OLS Statistics Algebraic properties Goodness-of-Fit, the R-square

More information

Chapter 2 The Simple Linear Regression Model: Specification and Estimation

Chapter 2 The Simple Linear Regression Model: Specification and Estimation Chapter The Simple Linear Regression Model: Specification and Estimation Page 1 Chapter Contents.1 An Economic Model. An Econometric Model.3 Estimating the Regression Parameters.4 Assessing the Least Squares

More information

Singular Value Decomposition Compared to cross Product Matrix in an ill Conditioned Regression Model

Singular Value Decomposition Compared to cross Product Matrix in an ill Conditioned Regression Model International Journal of Statistics and Applications 04, 4(): 4-33 DOI: 0.593/j.statistics.04040.07 Singular Value Decomposition Compared to cross Product Matrix in an ill Conditioned Regression Model

More information

Statistical and Econometric Methods for Transportation Data Analysis

Statistical and Econometric Methods for Transportation Data Analysis Statistical and Econometric Methods for Transportation Data Analysis Chapter 5 Simultaneous-Equation Models Example 5.2 (with 3SLS Extensions) Seemingly Unrelated Regression Estimation and 3SLS A survey

More information

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed

More information

Chapter 5 Matrix Approach to Simple Linear Regression

Chapter 5 Matrix Approach to Simple Linear Regression STAT 525 SPRING 2018 Chapter 5 Matrix Approach to Simple Linear Regression Professor Min Zhang Matrix Collection of elements arranged in rows and columns Elements will be numbers or symbols For example:

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Maximum Likelihood Estimation Merlise Clyde STA721 Linear Models Duke University August 31, 2017 Outline Topics Likelihood Function Projections Maximum Likelihood Estimates Readings: Christensen Chapter

More information

Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is

Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Lecture 15 Multiple regression I Chapter 6 Set 2 Least Square Estimation The quadratic form to be minimized is Q = (Y i β 0 β 1 X i1 β 2 X i2 β p 1 X i.p 1 ) 2, which in matrix notation is Q = (Y Xβ) (Y

More information

STAT 540: Data Analysis and Regression

STAT 540: Data Analysis and Regression STAT 540: Data Analysis and Regression Wen Zhou http://www.stat.colostate.edu/~riczw/ Email: riczw@stat.colostate.edu Department of Statistics Colorado State University Fall 205 W. Zhou (Colorado State

More information

Econ 427, Spring Problem Set 3 suggested answers (with minor corrections) Ch 6. Problems and Complements:

Econ 427, Spring Problem Set 3 suggested answers (with minor corrections) Ch 6. Problems and Complements: Econ 427, Spring 2010 Problem Set 3 suggested answers (with minor corrections) Ch 6. Problems and Complements: 1. (page 132) In each case, the idea is to write these out in general form (without the lag

More information

CS4495/6495 Introduction to Computer Vision. 8B-L2 Principle Component Analysis (and its use in Computer Vision)

CS4495/6495 Introduction to Computer Vision. 8B-L2 Principle Component Analysis (and its use in Computer Vision) CS4495/6495 Introduction to Computer Vision 8B-L2 Principle Component Analysis (and its use in Computer Vision) Wavelength 2 Wavelength 2 Principal Components Principal components are all about the directions

More information

Matrix Approach to Simple Linear Regression: An Overview

Matrix Approach to Simple Linear Regression: An Overview Matrix Approach to Simple Linear Regression: An Overview Aspects of matrices that you should know: Definition of a matrix Addition/subtraction/multiplication of matrices Symmetric/diagonal/identity matrix

More information

Multiple Regression Model: I

Multiple Regression Model: I Multiple Regression Model: I Suppose the data are generated according to y i 1 x i1 2 x i2 K x ik u i i 1...n Define y 1 x 11 x 1K 1 u 1 y y n X x n1 x nk K u u n So y n, X nxk, K, u n Rks: In many applications,

More information

9. Linear Regression and Correlation

9. Linear Regression and Correlation 9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,

More information

MS&E 226. In-Class Midterm Examination Solutions Small Data October 20, 2015

MS&E 226. In-Class Midterm Examination Solutions Small Data October 20, 2015 MS&E 226 In-Class Midterm Examination Solutions Small Data October 20, 2015 PROBLEM 1. Alice uses ordinary least squares to fit a linear regression model on a dataset containing outcome data Y and covariates

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

The Multiple Regression Model Estimation

The Multiple Regression Model Estimation Lesson 5 The Multiple Regression Model Estimation Pilar González and Susan Orbe Dpt Applied Econometrics III (Econometrics and Statistics) Pilar González and Susan Orbe OCW 2014 Lesson 5 Regression model:

More information

Chapter 14 Simple Linear Regression (A)

Chapter 14 Simple Linear Regression (A) Chapter 14 Simple Linear Regression (A) 1. Characteristics Managerial decisions often are based on the relationship between two or more variables. can be used to develop an equation showing how the variables

More information

SSR = The sum of squared errors measures how much Y varies around the regression line n. It happily turns out that SSR + SSE = SSTO.

SSR = The sum of squared errors measures how much Y varies around the regression line n. It happily turns out that SSR + SSE = SSTO. Analysis of variance approach to regression If x is useless, i.e. β 1 = 0, then E(Y i ) = β 0. In this case β 0 is estimated by Ȳ. The ith deviation about this grand mean can be written: deviation about

More information

Can you tell the relationship between students SAT scores and their college grades?

Can you tell the relationship between students SAT scores and their college grades? Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower

More information

Ma 3/103: Lecture 24 Linear Regression I: Estimation

Ma 3/103: Lecture 24 Linear Regression I: Estimation Ma 3/103: Lecture 24 Linear Regression I: Estimation March 3, 2017 KC Border Linear Regression I March 3, 2017 1 / 32 Regression analysis Regression analysis Estimate and test E(Y X) = f (X). f is the

More information

Properties of Matrices and Operations on Matrices

Properties of Matrices and Operations on Matrices Properties of Matrices and Operations on Matrices A common data structure for statistical analysis is a rectangular array or matris. Rows represent individual observational units, or just observations,

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression ST 430/514 Recall: A regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates)

More information

Hypothesis Testing hypothesis testing approach

Hypothesis Testing hypothesis testing approach Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we

More information

Statistical and Econometric Methods for Transportation Data Analysis

Statistical and Econometric Methods for Transportation Data Analysis Statistical and Econometric Methods for Transportation Data Analysis Chapter 5 Simultaneous-Equation Models Example 5.2 Seemingly Unrelated Regression Estimation A survey of 206 people was conducted on

More information

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit LECTURE 6 Introduction to Econometrics Hypothesis testing & Goodness of fit October 25, 2016 1 / 23 ON TODAY S LECTURE We will explain how multiple hypotheses are tested in a regression model We will define

More information

LECTURE 2 LINEAR REGRESSION MODEL AND OLS

LECTURE 2 LINEAR REGRESSION MODEL AND OLS SEPTEMBER 29, 2014 LECTURE 2 LINEAR REGRESSION MODEL AND OLS Definitions A common question in econometrics is to study the effect of one group of variables X i, usually called the regressors, on another

More information

Lecture 3: Multiple Regression

Lecture 3: Multiple Regression Lecture 3: Multiple Regression R.G. Pierse 1 The General Linear Model Suppose that we have k explanatory variables Y i = β 1 + β X i + β 3 X 3i + + β k X ki + u i, i = 1,, n (1.1) or Y i = β j X ji + u

More information

determine whether or not this relationship is.

determine whether or not this relationship is. Section 9-1 Correlation A correlation is a between two. The data can be represented by ordered pairs (x,y) where x is the (or ) variable and y is the (or ) variable. There are several types of correlations

More information

Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 17

Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 17 Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 17 Outline Filters and Rotations Generating co-varying random fields Translating co-varying fields into

More information

Lecture 10 Multiple Linear Regression

Lecture 10 Multiple Linear Regression Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable

More information

Multivariate Time Series: VAR(p) Processes and Models

Multivariate Time Series: VAR(p) Processes and Models Multivariate Time Series: VAR(p) Processes and Models A VAR(p) model, for p > 0 is X t = φ 0 + Φ 1 X t 1 + + Φ p X t p + A t, where X t, φ 0, and X t i are k-vectors, Φ 1,..., Φ p are k k matrices, with

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

7.3 Ridge Analysis of the Response Surface

7.3 Ridge Analysis of the Response Surface 7.3 Ridge Analysis of the Response Surface When analyzing a fitted response surface, the researcher may find that the stationary point is outside of the experimental design region, but the researcher wants

More information

Cheat Sheet for MATH461

Cheat Sheet for MATH461 Cheat Sheet for MATH46 Here is the stuff you really need to remember for the exams Linear systems Ax = b Problem: We consider a linear system of m equations for n unknowns x,,x n : For a given matrix A

More information

MANY BILLS OF CONCERN TO PUBLIC

MANY BILLS OF CONCERN TO PUBLIC - 6 8 9-6 8 9 6 9 XXX 4 > -? - 8 9 x 4 z ) - -! x - x - - X - - - - - x 00 - - - - - x z - - - x x - x - - - - - ) x - - - - - - 0 > - 000-90 - - 4 0 x 00 - -? z 8 & x - - 8? > 9 - - - - 64 49 9 x - -

More information

Topic 7 - Matrix Approach to Simple Linear Regression. Outline. Matrix. Matrix. Review of Matrices. Regression model in matrix form

Topic 7 - Matrix Approach to Simple Linear Regression. Outline. Matrix. Matrix. Review of Matrices. Regression model in matrix form Topic 7 - Matrix Approach to Simple Linear Regression Review of Matrices Outline Regression model in matrix form - Fall 03 Calculations using matrices Topic 7 Matrix Collection of elements arranged in

More information

Name Solutions Linear Algebra; Test 3. Throughout the test simplify all answers except where stated otherwise.

Name Solutions Linear Algebra; Test 3. Throughout the test simplify all answers except where stated otherwise. Name Solutions Linear Algebra; Test 3 Throughout the test simplify all answers except where stated otherwise. 1) Find the following: (10 points) ( ) Or note that so the rows are linearly independent, so

More information

1 Last time: least-squares problems

1 Last time: least-squares problems MATH Linear algebra (Fall 07) Lecture Last time: least-squares problems Definition. If A is an m n matrix and b R m, then a least-squares solution to the linear system Ax = b is a vector x R n such that

More information

Linear Regression Linear Regression with Shrinkage

Linear Regression Linear Regression with Shrinkage Linear Regression Linear Regression ith Shrinkage Introduction Regression means predicting a continuous (usually scalar) output y from a vector of continuous inputs (features) x. Example: Predicting vehicle

More information

ECONOMETRICS I Take Home Final Examination

ECONOMETRICS I Take Home Final Examination Department of Economics ECONOMETRICS I Take Home Final Examination Fall 2016 Professor William Greene Phone: 212.998.0876 Office: KMC 7-90 URL: people.stern.nyu.edu/wgreene e-mail: wgreene@stern.nyu.edu

More information

2.2 Classical Regression in the Time Series Context

2.2 Classical Regression in the Time Series Context 48 2 Time Series Regression and Exploratory Data Analysis context, and therefore we include some material on transformations and other techniques useful in exploratory data analysis. 2.2 Classical Regression

More information

Econ 3790: Statistics Business and Economics. Instructor: Yogesh Uppal

Econ 3790: Statistics Business and Economics. Instructor: Yogesh Uppal Econ 3790: Statistics Business and Economics Instructor: Yogesh Uppal Email: yuppal@ysu.edu Chapter 14 Covariance and Simple Correlation Coefficient Simple Linear Regression Covariance Covariance between

More information

Singular Value Decomposition. 1 Singular Value Decomposition and the Four Fundamental Subspaces

Singular Value Decomposition. 1 Singular Value Decomposition and the Four Fundamental Subspaces Singular Value Decomposition This handout is a review of some basic concepts in linear algebra For a detailed introduction, consult a linear algebra text Linear lgebra and its pplications by Gilbert Strang

More information

F-tests and Nested Models

F-tests and Nested Models F-tests and Nested Models Nested Models: A core concept in statistics is comparing nested s. Consider the Y = β 0 + β 1 x 1 + β 2 x 2 + ǫ. (1) The following reduced s are special cases (nested within)

More information

Intermediate Econometrics

Intermediate Econometrics Intermediate Econometrics Markus Haas LMU München Summer term 2011 15. Mai 2011 The Simple Linear Regression Model Considering variables x and y in a specific population (e.g., years of education and wage

More information

Linear Systems. Carlo Tomasi

Linear Systems. Carlo Tomasi Linear Systems Carlo Tomasi Section 1 characterizes the existence and multiplicity of the solutions of a linear system in terms of the four fundamental spaces associated with the system s matrix and of

More information

Projection. Ping Yu. School of Economics and Finance The University of Hong Kong. Ping Yu (HKU) Projection 1 / 42

Projection. Ping Yu. School of Economics and Finance The University of Hong Kong. Ping Yu (HKU) Projection 1 / 42 Projection Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Projection 1 / 42 1 Hilbert Space and Projection Theorem 2 Projection in the L 2 Space 3 Projection in R n Projection

More information

ANALYSIS OF VARIANCE AND QUADRATIC FORMS

ANALYSIS OF VARIANCE AND QUADRATIC FORMS 4 ANALYSIS OF VARIANCE AND QUADRATIC FORMS The previous chapter developed the regression results involving linear functions of the dependent variable, β, Ŷ, and e. All were shown to be normally distributed

More information

Notes on Random Variables, Expectations, Probability Densities, and Martingales

Notes on Random Variables, Expectations, Probability Densities, and Martingales Eco 315.2 Spring 2006 C.Sims Notes on Random Variables, Expectations, Probability Densities, and Martingales Includes Exercise Due Tuesday, April 4. For many or most of you, parts of these notes will be

More information

Scatter plot of data from the study. Linear Regression

Scatter plot of data from the study. Linear Regression 1 2 Linear Regression Scatter plot of data from the study. Consider a study to relate birthweight to the estriol level of pregnant women. The data is below. i Weight (g / 100) i Weight (g / 100) 1 7 25

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

Economics 620, Lecture 4: The K-Variable Linear Model I. y 1 = + x 1 + " 1 y 2 = + x 2 + " 2 :::::::: :::::::: y N = + x N + " N

Economics 620, Lecture 4: The K-Variable Linear Model I. y 1 = + x 1 +  1 y 2 = + x 2 +  2 :::::::: :::::::: y N = + x N +  N 1 Economics 620, Lecture 4: The K-Variable Linear Model I Consider the system y 1 = + x 1 + " 1 y 2 = + x 2 + " 2 :::::::: :::::::: y N = + x N + " N or in matrix form y = X + " where y is N 1, X is N

More information

ECON 450 Development Economics

ECON 450 Development Economics ECON 450 Development Economics Statistics Background University of Illinois at Urbana-Champaign Summer 2017 Outline 1 Introduction 2 3 4 5 Introduction Regression analysis is one of the most important

More information

Multivariate Statistical Analysis

Multivariate Statistical Analysis Multivariate Statistical Analysis Fall 2011 C. L. Williams, Ph.D. Lecture 4 for Applied Multivariate Analysis Outline 1 Eigen values and eigen vectors Characteristic equation Some properties of eigendecompositions

More information

The Simple Regression Model. Simple Regression Model 1

The Simple Regression Model. Simple Regression Model 1 The Simple Regression Model Simple Regression Model 1 Simple regression model: Objectives Given the model: - where y is earnings and x years of education - Or y is sales and x is spending in advertising

More information

Linear Regression Linear Regression with Shrinkage

Linear Regression Linear Regression with Shrinkage Linear Regression Linear Regression ith Shrinkage Introduction Regression means predicting a continuous (usually scalar) output y from a vector of continuous inputs (features) x. Example: Predicting vehicle

More information