The Linear Regression Model


1 The Linear Regression Model

Carlo Favero

2 OLS

To illustrate how estimation can be performed to derive conditional expectations, consider the following general representation of the model of interest:

$$
y = X\beta + \varepsilon, \qquad
y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}, \quad
X = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1k} \\ \vdots & & & \vdots \\ x_{N1} & x_{N2} & \dots & x_{Nk} \end{pmatrix}, \quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_N \end{pmatrix}.
$$

3 OLS

The simplest way to derive estimates of the parameters of interest is the ordinary least squares (OLS) method. This method chooses values for the unknown parameters so as to minimize the magnitude of the non-observable components. In our simple bivariate case this amounts to choosing the line through the scatterplot of excess returns on each asset against the market excess returns that provides the best fit, where the best fit is obtained by minimizing the sum of squared vertical deviations of the data points from the fitted line. Define the following quantity:

$$
e(\beta) = y - X\beta,
$$

where e(β) is an (n × 1) vector. If we treat Xβ as a (conditional) prediction for y, then we can consider e(β) as a forecasting error. The sum of the squared errors is then

$$
S(\beta) = e(\beta)'e(\beta).
$$

4 OLS

The OLS method produces an estimator of β, denoted β̂, defined as follows:

$$
S(\hat{\beta}) = \min_{\beta}\; e(\beta)'e(\beta).
$$

Given β̂, we can define the associated vector of residuals ε̂ as

$$
\hat{\varepsilon} = y - X\hat{\beta}.
$$

The OLS estimator is derived by considering the necessary and sufficient conditions for β̂ to be a unique minimum of S:

1. X'ε̂ = 0;
2. rank(X) = k.

Condition 1 imposes orthogonality between the right-hand-side variables and the OLS residuals, and ensures that the residuals have an average of zero when a constant is included among the regressors. Condition 2 requires that the columns of the X matrix are linearly independent: no variable in X can be expressed as a linear combination of the other variables in X.

5 OLS

From condition 1 we derive an expression for the OLS estimates:

$$
X'\hat{\varepsilon} = X'\bigl(y - X\hat{\beta}\bigr) = X'y - X'X\hat{\beta} = 0,
$$

$$
\hat{\beta} = (X'X)^{-1}X'y.
$$
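
As a minimal numerical sketch of the formula above (assuming simulated data and the numpy library; all variable names are illustrative), the OLS estimate can be computed directly from the normal equations and the orthogonality condition X'ε̂ = 0 can be checked numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: N observations, k regressors (first column is a constant).
N, k = 200, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, k - 1))])
beta_true = np.array([1.0, 0.5, -0.3])
y = X @ beta_true + rng.normal(scale=0.8, size=N)

# OLS estimate: beta_hat = (X'X)^{-1} X'y (solve() is preferred to an explicit inverse).
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Residuals are orthogonal to the regressors: X' e_hat = 0 (condition 1 in the text).
e_hat = y - X @ beta_hat
print(beta_hat)
print(X.T @ e_hat)   # numerically zero
```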

6 Properties of the OLS estimates

We have derived the OLS estimator without any assumption on the statistical structure of the data. However, the statistical structure of the data is needed to define the properties of the estimator. To illustrate them, we refer to the basic concepts of mean and variance of vector variables. Given a generic vector of variables x and its mean vector E(x):

$$
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad
E(x) = \begin{pmatrix} E(x_1) \\ \vdots \\ E(x_n) \end{pmatrix}.
$$

7 Properties of the OLS estimates

We define the mean matrix of outer products E(xx') as:

$$
E(xx') = \begin{pmatrix}
E(x_1^2) & E(x_1 x_2) & \dots & E(x_1 x_n) \\
E(x_2 x_1) & E(x_2^2) & \dots & E(x_2 x_n) \\
\vdots & \vdots & & \vdots \\
E(x_n x_1) & E(x_n x_2) & \dots & E(x_n^2)
\end{pmatrix}.
$$

The variance-covariance matrix of x is then defined as:

$$
\mathrm{var}(x) = E\bigl[(x - E(x))(x - E(x))'\bigr] = E(xx') - E(x)E(x)'.
$$

The variance-covariance matrix is symmetric and positive definite, by construction. Given an arbitrary vector A of dimension n, we have:

$$
\mathrm{var}(A'x) = A'\,\mathrm{var}(x)\,A.
$$

8 The first hypothesis

The first relevant hypothesis for the derivation of the statistical properties of OLS concerns the relationship between disturbances and regressors in the estimated equation. This hypothesis is constructed in two parts: first we assume that E(y_i | x_i) = x_i'β, ruling out contemporaneous correlation between residuals and regressors (note that assuming the validity of this hypothesis implies that there are no omitted variables correlated with the regressors); second we assume that the components of the available sample are independently drawn. The second assumption guarantees the equivalence between E(y_i | x_i) = x_i'β and E(y_i | x_1, ..., x_i, ..., x_n) = x_i'β. Using vector notation, we have:

$$
E(y \mid X) = X\beta, \qquad \text{which is equivalent to} \qquad E(\varepsilon \mid X) = 0. \tag{1}
$$

9 The first hypothesis

Note that hypothesis (1) is very demanding. It implies that E(ε_i | x_1, ..., x_i, ..., x_n) = 0 (i = 1, ..., n). The conditional mean is, in general, a non-linear function of (x_1, ..., x_i, ..., x_n), and (1) requires that such a function be constant at zero. Note that (1) requires each regressor to be orthogonal not only to the error term associated with the same observation (E(x_ik ε_i) = 0 for all k), but also to the error term associated with every other observation (E(x_jk ε_i) = 0 for all j ≠ i). This statement is proved by using the properties of conditional expectations.

10 The first hypothesis

Since E(ε | X) = 0 implies, from the law of iterated expectations, that E(ε) = 0, we have

$$
E(\varepsilon_i \mid x_{jk}) = E\bigl[E(\varepsilon_i \mid X) \mid x_{jk}\bigr] = 0. \tag{2}
$$

Then

$$
E(\varepsilon_i x_{jk}) = E\bigl[E(\varepsilon_i x_{jk} \mid x_{jk})\bigr] = E\bigl[x_{jk}\,E(\varepsilon_i \mid x_{jk})\bigr] = 0.
$$

11 The second and third hypothesis

The second hypothesis defines the constancy of the conditional variance of the shocks:

$$
E(\varepsilon\varepsilon' \mid X) = \sigma^2 I, \tag{3}
$$

where σ² is a constant independent of X. In the case of our data, this is a strong assumption unlikely to be met in practice. The third hypothesis is the one already introduced, which guarantees that the OLS estimator can be derived:

$$
\mathrm{rank}(X) = k. \tag{4}
$$

Under hypotheses (1)-(4) we can derive the properties of the OLS estimator.

12 Property 1: unbiasedness

The conditional expectation (with respect to X) of the OLS estimator is the vector of unknown parameters β:

$$
\hat{\beta} = (X'X)^{-1}X'(X\beta + \varepsilon) = \beta + (X'X)^{-1}X'\varepsilon,
$$

$$
E(\hat{\beta} \mid X) = \beta + (X'X)^{-1}X'E(\varepsilon \mid X) = \beta,
$$

by hypothesis (1).

13 Property 2: variance of OLS

The conditional variance of the OLS estimator is σ²(X'X)⁻¹:

$$
\begin{aligned}
\mathrm{var}(\hat{\beta} \mid X) &= E\bigl[(\hat{\beta} - \beta)(\hat{\beta} - \beta)' \mid X\bigr] \\
&= E\bigl[(X'X)^{-1}X'\varepsilon\varepsilon'X(X'X)^{-1} \mid X\bigr] \\
&= (X'X)^{-1}X'E(\varepsilon\varepsilon' \mid X)X(X'X)^{-1} \\
&= (X'X)^{-1}X'\sigma^2 I\,X(X'X)^{-1} = \sigma^2(X'X)^{-1}.
\end{aligned}
$$

14 Property 3: Gauss-Markov theorem

The OLS estimator is the most efficient in the class of linear unbiased estimators. Consider the class of linear estimators:

$$
\beta_L = Ly.
$$

This class is defined by the set of (k × n) matrices L, which are fixed when conditioning upon X: L does not depend on y. Therefore we have:

$$
E(\beta_L \mid X) = E(LX\beta + L\varepsilon \mid X) = LX\beta,
$$

and LXβ = β only if LX = I_k. Such a condition is obviously satisfied by the OLS estimator, which is obtained by setting L = (X'X)⁻¹X'. The variance of the generic estimator in the class of linear unbiased estimators is readily obtained as:

$$
\mathrm{var}(\beta_L \mid X) = E(L\varepsilon\varepsilon'L' \mid X) = \sigma^2 LL'.
$$

15 Property 3: Gauss-Markov theorem

To show that the OLS estimator is the most efficient within this class we have to show that the variance of the generic estimator in the class differs from the variance of the OLS estimator by a positive semidefinite matrix. To this aim define D = L − (X'X)⁻¹X'; LX = I requires DX = 0. Then

$$
\begin{aligned}
LL' &= \bigl[(X'X)^{-1}X' + D\bigr]\bigl[X(X'X)^{-1} + D'\bigr] \\
&= (X'X)^{-1}X'X(X'X)^{-1} + (X'X)^{-1}X'D' + DX(X'X)^{-1} + DD' \\
&= (X'X)^{-1} + DD',
\end{aligned}
$$

from which we have that

$$
\mathrm{var}(\beta_L \mid X) = \mathrm{var}(\hat{\beta} \mid X) + \sigma^2 DD',
$$

which proves the point: for any given matrix D (not necessarily square), the symmetric matrix DD' is positive semidefinite.

16 Residual Analysis

Consider the following representation:

$$
\hat{\varepsilon} = y - X\hat{\beta} = y - X(X'X)^{-1}X'y = My,
$$

where M = I_n − Q and Q = X(X'X)⁻¹X'. The (n × n) matrices M and Q have the following properties:

1. they are symmetric: M' = M, Q' = Q;
2. they are idempotent: QQ = Q, MM = M;
3. MX = 0, MQ = 0, QX = X.

17 Residual Analysis

Note that the OLS projection for y can be written as ŷ = Xβ̂ = Qy and that ε̂ = My, from which we have the known result of orthogonality between the OLS residuals and the regressors. We also have

$$
My = MX\beta + M\varepsilon = M\varepsilon,
$$

given that MX = 0. Therefore we have a very well-specified relation between the OLS residuals and the errors in the model, ε̂ = Mε, which cannot be used to derive the errors given the residuals, since the matrix M is not invertible. We can rewrite the sum of squared residuals as:

$$
S(\hat{\beta}) = \hat{\varepsilon}'\hat{\varepsilon} = \varepsilon'M'M\varepsilon = \varepsilon'M\varepsilon.
$$

S(β̂) is an obvious candidate for the construction of an estimate of σ².
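
The following sketch (simulated data, numpy; names illustrative) checks numerically the algebraic properties of M and Q listed above and the relation ε̂ = Mε:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
eps = rng.normal(size=n)
y = X @ np.array([1.0, 2.0, -1.0]) + eps

Q = X @ np.linalg.solve(X.T @ X, X.T)   # projection onto the column space of X
M = np.eye(n) - Q                       # "residual maker"

# Symmetry, idempotency and annihilation of X
assert np.allclose(M, M.T) and np.allclose(Q, Q.T)
assert np.allclose(M @ M, M) and np.allclose(Q @ Q, Q)
assert np.allclose(M @ X, 0) and np.allclose(Q @ X, X)

# Fitted values and residuals as projections, and e_hat = M eps
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(X @ beta_hat, Q @ y)        # fitted values = Qy
assert np.allclose(y - X @ beta_hat, M @ y)    # residuals = My
assert np.allclose(M @ y, M @ eps)             # since MX = 0
```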

18 Residual Analysis

To derive an estimate of σ² from S(β̂), we introduce the concept of trace. The trace of a square matrix is the sum of all elements on its principal diagonal. The following properties are relevant:

1. given any two square matrices A and B, tr(A + B) = tr A + tr B;
2. given any two matrices A and B, tr(AB) = tr(BA);
3. the rank of an idempotent matrix is equal to its trace.

19 Residual Analysis

Using property 2 together with the fact that a scalar coincides with its trace, we have:

$$
\varepsilon'M\varepsilon = \mathrm{tr}(\varepsilon'M\varepsilon) = \mathrm{tr}(M\varepsilon\varepsilon').
$$

Now we analyse the expected value of S(β̂), conditional upon X:

$$
E\bigl[S(\hat{\beta}) \mid X\bigr] = E\bigl[\mathrm{tr}(M\varepsilon\varepsilon') \mid X\bigr]
= \mathrm{tr}\,E\bigl[M\varepsilon\varepsilon' \mid X\bigr]
= \mathrm{tr}\bigl[M\,E(\varepsilon\varepsilon' \mid X)\bigr] = \sigma^2\,\mathrm{tr}\,M.
$$

20 Residual Analysis

From properties 1 and 2 we have:

$$
\mathrm{tr}\,M = \mathrm{tr}\,I_n - \mathrm{tr}\bigl[X(X'X)^{-1}X'\bigr] = n - \mathrm{tr}\bigl[X'X(X'X)^{-1}\bigr] = n - k.
$$

Therefore, an unbiased estimate of σ² is given by s² = S(β̂)/(n − k).
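
A small Monte Carlo sketch (assumed simulated design, numpy; all values illustrative) of the unbiasedness of s² = S(β̂)/(n − k) as an estimator of σ²:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, sigma2 = 40, 4, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, 0.5, -0.5, 0.2])

s2_draws = []
for _ in range(5000):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    s2_draws.append(e @ e / (n - k))   # s^2 = S(beta_hat)/(n - k)

print(np.mean(s2_draws))   # close to sigma2 = 2.0, illustrating unbiasedness
```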

21 Residual Analysis: the R-squared

Using the result of orthogonality between the OLS projections and residuals, we can write:

$$
\mathrm{var}(y) = \mathrm{var}(\hat{y}) + \mathrm{var}(\hat{\varepsilon}),
$$

from which we can derive the following residual-based indicator of the goodness of fit:

$$
R^2 = \frac{\mathrm{var}(\hat{y})}{\mathrm{var}(y)} = 1 - \frac{\mathrm{var}(\hat{\varepsilon})}{\mathrm{var}(y)}.
$$

The information contained in R² is associated with the information contained in the standard error of the regression, which is the square root of the estimated variance of the OLS residuals.

22 Interpreting Regression Results

Interpreting regression results is not a simple exercise. We propose to split this procedure into three steps. First, introduce a measure of sampling variability and re-evaluate what you know, taking into account that the parameters are estimated and there is uncertainty surrounding your point estimates. Second, understand the relevance of the regression independently of inference on the parameters. There is an easy way to do this: suppose all parameters in the model are known and identical to the estimated values, and learn how to read them. Third, remember that each regression is run after a reduction process has been, explicitly or implicitly, implemented. The relevant question is: what happens if something went wrong in the reduction process? What are the consequences of omitting relevant information, or of including irrelevant information, in your specification?

23 Statistical Significance and Relevance

The relevance of a regression is different from the statistical significance of the estimated parameters. In fact, confusing the statistical significance of the estimated parameter describing the effect of a regressor on the dependent variable with the practical relevance of that effect is a rather common mistake in the use of the linear model. Statistical inference is a tool for estimating parameters in a probability model and assessing the amount of sampling variability. Statistics gives us an indication of what we can say about the values of the parameters in the model on the basis of our sample. The relevance of a regression is determined by the share of the unconditional variance of y that is explained by the variance of E(y | X). Measuring the size of this share is the fundamental role of R².

24 Statistical Significance of regression coefficients

Estimate the coefficients in a regression and specify a null hypothesis of interest (for example, four factors are needed to explain team performance). Derive a statistic (i.e. a quantity that is a function of the regression coefficients) whose distribution is known under the null hypothesis, and compute the observed value of the statistic. Compute p, the probability (under the null) of obtaining a value at least as extreme as the one you have observed for the statistic: p is called the p-value. Adopt a decision rule about p, call it p*, and reject the null if the observed p-value is smaller than p*. For example, if you take p* = 0.05 you reject the null every time your observed p-value is smaller than 0.05. In this case you make the call that observing an event that has very low probability under the null is an indication that the null should be rejected.

25 Statistical Significance of regression coefficients

Of course, by using this criterion you run the risk of rejecting a hypothesis when that hypothesis is true. This is called the probability of Type I error, or the size of your test. There is another risk that you run: the probability of Type II error, that is, the probability of not rejecting a null when it is false. Think about an alternative hypothesis on the coefficients: you can compute the probability with which your statistic will fall short of the cutoff point to which you associate the probability p*. That is the probability of Type II error. The power of the test is 1 − Pr(Type II error). Note that the p-value can be computed in two ways: (i) by deriving the relevant distribution under the null analytically; (ii) by simulating, via Monte Carlo or the bootstrap, the relevant distribution under the null. Using simulation makes it easy to calculate the power of your test against given alternatives.
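
The sketch below (numpy and scipy.stats; the design, sample size and the alternative value 0.3 are illustrative assumptions) estimates by simulation the size and the power of the usual t-test on a single coefficient:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, reps, alpha = 100, 2000, 0.05

def t_stat(beta1):
    """Simulate y = beta1*x + e and return the t-ratio for H0: beta1 = 0."""
    x = rng.normal(size=n)
    y = beta1 * x + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    s2 = e @ e / (n - 2)
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return b[1] / se

crit = stats.t.ppf(1 - alpha / 2, df=n - 2)   # two-sided critical value

size  = np.mean([abs(t_stat(0.0)) > crit for _ in range(reps)])   # Type I error rate
power = np.mean([abs(t_stat(0.3)) > crit for _ in range(reps)])   # 1 - Pr(Type II error)
print(size, power)   # size close to 0.05; power depends on the alternative
```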

26 Relevance of regression coefficients

Estimate the coefficients in a regression and keep them fixed at their point estimates. Run an experiment by changing the conditional mean of the dependent variable via a shock to the regressors. Assess how relevant the shock to the regressor(s) (say, one of the four factors) is in determining the dependent variable (say, team performance).

27 Inference in the Linear Regression Model

Inference in the Linear Regression Model is about designing the appropriate statistics to test hypotheses of interest on the coefficients of a linear model. We shall address this process in two steps: how to formalize the relevant hypothesis, and how to build the statistics.

28 How to formalize the relevant hypothesis

Given the general representation of the linear regression model:

$$
y = X\beta + \varepsilon,
$$

our general case of interest is that of r restrictions on the vector of parameters, with r < k. If we limit our interest to the class of linear restrictions on the coefficients, we can express them as

$$
H_0 : R\beta = r,
$$

where R is an (r × k) matrix of parameters with rank r and r is an (r × 1) vector of parameters.

29 How to formalize the relevant hypothesis

To illustrate how R and r are constructed, we consider the baseline case of the CAPM model; we want to impose the restriction β₀,ᵢ = 0 on the following specification:

$$
r_t^i - r_t^{rf} = \beta_{0,i} + \beta_{1,i}\bigl(r_t^m - r_t^{rf}\bigr) + u_{i,t}, \tag{5}
$$

$$
R\beta = r: \qquad
\begin{pmatrix} 1 & 0 \end{pmatrix}
\begin{pmatrix} \beta_{0,i} \\ \beta_{1,i} \end{pmatrix} = (0).
$$

30 How to build the statistics

To perform inference in the linear regression model, we need a further hypothesis to specify the distribution of ε conditional upon X:

$$
\varepsilon \mid X \sim N\bigl(0, \sigma^2 I\bigr), \tag{6}
$$

or, equivalently,

$$
y \mid X \sim N\bigl(X\beta, \sigma^2 I\bigr). \tag{7}
$$

Given (6) we can immediately derive the distribution of β̂ | X which, being a linear combination of a normal distribution, is also normal:

$$
\hat{\beta} \mid X \sim N\bigl(\beta, \sigma^2(X'X)^{-1}\bigr). \tag{8}
$$

31 How to build the statistics

If β̂ | X ~ N(β, σ²(X'X)⁻¹), then:

$$
\bigl(R\hat{\beta} - r\bigr) \mid X \sim N\bigl(R\beta - r,\; \sigma^2 R(X'X)^{-1}R'\bigr). \tag{9}
$$

The relevant test can be constructed by deriving the distribution of (9) under the null Rβ − r = 0. Unfortunately, using the normal distribution would require knowledge of σ², which in general is not known. Fortunately, a statistic can be built based on the OLS estimate of σ².

32 How to build the statistics

Fortunately, a statistic can be built based on the OLS estimate of σ². In fact, it can be shown that

$$
\frac{\bigl(R\hat{\beta} - r\bigr)'\bigl[R(X'X)^{-1}R'\bigr]^{-1}\bigl(R\hat{\beta} - r\bigr)}{r\,s^2} \sim F(r, T - k)
$$

under H₀, which can be used to test the relevant hypothesis. Notice that, since in the case r = 1 we have t_{T−k} = √F(1, T − k), if we are interested in testing a hypothesis on a single coefficient (say β₁) we can use the following statistic:

$$
\frac{\hat{\beta}_1 - \beta_1}{\bigl[\widehat{\mathrm{Var}}(\hat{\beta}_1)\bigr]^{1/2}} \sim t(T - k) \quad \text{under } H_0.
$$

Therefore, an immediate test of significance of a coefficient can be performed by taking the ratio of each estimated coefficient to the associated standard error.
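
A sketch of the F statistic above for a set of linear restrictions Rβ = r, with the p-value taken from the F(r, T − k) distribution (simulated data, numpy/scipy; the particular restrictions tested here, two zero restrictions, are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
T, k = 120, 4
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
beta = np.array([0.5, 0.0, 0.4, -0.2])
y = X @ beta + rng.normal(size=T)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (T - k)

# Restrictions R beta = r: here beta_1 = 0 and beta_2 = 0 (q = 2 restrictions).
R = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
r = np.zeros(2)
q = R.shape[0]

diff = R @ b - r
F = diff @ np.linalg.solve(R @ XtX_inv @ R.T, diff) / (q * s2)
p_value = stats.f.sf(F, q, T - k)
print(F, p_value)
```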

33 The Partitioned Regression Model

Given the linear model:

$$
y = X\beta + \varepsilon,
$$

partition X into two blocks of dimension (T × r) and (T × (k − r)), and partition β correspondingly into (β₁, β₂). The partitioned regression model can then be written as follows:

$$
y = X_1\beta_1 + X_2\beta_2 + \varepsilon.
$$

34 The Partitioned Regression Model

It is useful to derive the formula for the OLS estimator in the partitioned regression model. To obtain this result we partition the normal equations X'Xβ̂ = X'y as:

$$
\begin{pmatrix} X_1' \\ X_2' \end{pmatrix}
\begin{pmatrix} X_1 & X_2 \end{pmatrix}
\begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{pmatrix}
= \begin{pmatrix} X_1' \\ X_2' \end{pmatrix} y,
$$

or, equivalently,

$$
\begin{pmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{pmatrix}
\begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{pmatrix}
= \begin{pmatrix} X_1'y \\ X_2'y \end{pmatrix}. \tag{10}
$$

35 The Partitioned Regression Model

System (10) can be solved in two stages, by first deriving an expression for β̂₂:

$$
\hat{\beta}_2 = \bigl(X_2'X_2\bigr)^{-1}X_2'\bigl(y - X_1\hat{\beta}_1\bigr),
$$

and then substituting it into the first equation of (10) to obtain

$$
X_1'X_1\hat{\beta}_1 + X_1'X_2\bigl(X_2'X_2\bigr)^{-1}X_2'\bigl(y - X_1\hat{\beta}_1\bigr) = X_1'y,
$$

from which:

$$
\hat{\beta}_1 = \bigl(X_1'M_2X_1\bigr)^{-1}X_1'M_2\,y, \qquad
M_2 = I - X_2\bigl(X_2'X_2\bigr)^{-1}X_2'.
$$

36 The Partitioned Regression Model

Note that, as M₂ is symmetric and idempotent, we can also write:

$$
\hat{\beta}_1 = \bigl(X_1'M_2'M_2X_1\bigr)^{-1}X_1'M_2'M_2\,y,
$$

and β̂₁ can be interpreted as the vector of OLS coefficients of the regression of y on the matrix of residuals of the regression of X₁ on X₂. Thus, an OLS regression on two blocks of regressors is equivalent to two OLS regressions, each on a single block of regressors (Frisch-Waugh theorem).
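
The following sketch (simulated data, numpy; names illustrative) verifies the Frisch-Waugh result numerically: the coefficient on X₁ from the full regression coincides with the coefficient from regressing y on the residuals of the regression of X₁ on X₂:

```python
import numpy as np

rng = np.random.default_rng(5)
T = 150
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])   # block X2 (includes the constant)
X1 = 0.6 * X2[:, 1:2] + rng.normal(size=(T, 1))          # block X1, correlated with X2
y = 1.0 + 2.0 * X1[:, 0] - 0.5 * X2[:, 1] + rng.normal(size=T)

def ols(Z, y):
    return np.linalg.solve(Z.T @ Z, Z.T @ y)

# Full regression of y on [X1, X2]
b_full = ols(np.column_stack([X1, X2]), y)

# Frisch-Waugh: regress X1 on X2, take residuals, then regress y on those residuals
M2 = np.eye(T) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)
b_fw = ols(M2 @ X1, y)

print(b_full[0], b_fw[0])   # the coefficient on X1 coincides in the two computations
```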

37 The Partitioned Regression Model

Finally, consider the residuals of the partitioned model:

$$
\begin{aligned}
\hat{\varepsilon} &= y - X_1\hat{\beta}_1 - X_2\hat{\beta}_2 \\
&= y - X_1\hat{\beta}_1 - X_2\bigl(X_2'X_2\bigr)^{-1}X_2'\bigl(y - X_1\hat{\beta}_1\bigr) \\
&= M_2 y - M_2 X_1\hat{\beta}_1 \\
&= M_2 y - M_2 X_1\bigl(X_1'M_2X_1\bigr)^{-1}X_1'M_2 y \\
&= \bigl[M_2 - M_2 X_1\bigl(X_1'M_2X_1\bigr)^{-1}X_1'M_2\bigr]y.
\end{aligned}
$$

However, we already know that ε̂ = My; therefore,

$$
M = M_2 - M_2 X_1\bigl(X_1'M_2X_1\bigr)^{-1}X_1'M_2. \tag{11}
$$

38 Testing restrictions on a subset of coefficients

In the general framework to test linear restrictions we set r = 0 and R = (I_r  0), and partition β correspondingly into (β₁, β₂). In this case the restriction Rβ − r = 0 is equivalent to β₁ = 0 in the partitioned regression model. Under H₀, X₁ has no additional explanatory power for y with respect to X₂; therefore:

$$
H_0 : \; y = X_2\beta_2 + \varepsilon, \qquad \varepsilon \mid X_1, X_2 \sim N\bigl(0, \sigma^2 I\bigr).
$$

Note that the statement

$$
y = X_2\gamma_2 + \varepsilon, \qquad \varepsilon \mid X_2 \sim N\bigl(0, \sigma^2 I\bigr),
$$

is always true under our maintained hypotheses. However, in general γ₂ ≠ β₂.

39 Testing restrictions on a subset of coefficients

To derive a statistic to test H₀, remember that the general matrix R(X'X)⁻¹R' now becomes the upper-left block of (X'X)⁻¹, which we can write as (X₁'M₂X₁)⁻¹. The statistic then takes the form

$$
\frac{\hat{\beta}_1'\bigl(X_1'M_2X_1\bigr)\hat{\beta}_1}{r\,s^2}
= \frac{y'M_2X_1\bigl(X_1'M_2X_1\bigr)^{-1}X_1'M_2\,y}{y'My}\,\frac{T-k}{r}
\sim F(r, T - k).
$$

Given (11), this statistic can be rewritten as:

$$
\frac{y'M_2y - y'My}{y'My}\,\frac{T-k}{r} \sim F(r, T - k), \tag{12}
$$

where the denominator is the sum of squared residuals of the unconstrained model, while the numerator is the difference between the sum of squared residuals of the constrained model and the sum of squared residuals of the unconstrained model.
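
A sketch of statistic (12): the F test computed from the residual sums of squares of the constrained and unconstrained regressions (simulated data with H₀ true by construction, numpy/scipy; names illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
T = 200
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])   # maintained regressors
X1 = rng.normal(size=(T, 2))                              # regressors under test (r = 2)
y = X2 @ np.array([1.0, 0.5]) + rng.normal(size=T)        # DGP satisfies H0: beta_1 = 0

X = np.column_stack([X1, X2])
k, q = X.shape[1], X1.shape[1]

def rss(Z, y):
    b = np.linalg.solve(Z.T @ Z, Z.T @ y)
    e = y - Z @ b
    return e @ e

rss_u = rss(X, y)     # unconstrained: y on X1 and X2   -> y'My
rss_r = rss(X2, y)    # constrained (beta_1 = 0): y on X2 -> y'M2 y

F = ((rss_r - rss_u) / rss_u) * (T - k) / q
p_value = stats.f.sf(F, q, T - k)
print(F, p_value)
```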

40 Testing restrictions on a subset of coefficients

Consider the limit case in which r = 1 and β₁ is a scalar. The F-statistic takes the form

$$
\frac{\hat{\beta}_1^2\,\bigl(X_1'M_2X_1\bigr)}{s^2} \sim F(1, T - k) \quad \text{under } H_0,
$$

where (X₁'M₂X₁)⁻¹ is element (1, 1) of the matrix (X'X)⁻¹. Using the result on the relation between the F and the Student's t distribution:

$$
\frac{\hat{\beta}_1}{s\,\bigl(X_1'M_2X_1\bigr)^{-1/2}} \sim t(T - k) \quad \text{under } H_0.
$$

Therefore, an immediate test of significance of the coefficient can be performed by taking the ratio of each estimated coefficient to the associated standard error.

41 The Relevance of a Regression

The relevance of a regression is determined by the share of the unconditional variance of y that is explained by the variance of E(y | X). Measuring the size of this share is the fundamental role of R². A variable can be very significant in explaining the variance of E(y | X), while little of the unconditional variance of y is explained by the variance of E(y | X): statistical significance does not imply relevance.

42 The partial regression theorem

The Frisch-Waugh theorem described above is worth more consideration. The theorem tells us that any given regression coefficient in the model E(y | X) = Xβ can be computed in two different but exactly equivalent ways: 1) by regressing y on all the columns of X; 2) by first regressing the j-th column of X on all the other columns of X, computing the residuals of this regression, and then regressing y on these residuals. This result is relevant in that it clarifies that the relationships pinned down by the estimated parameters in a linear model do not describe the connection between the regressand and each regressor, but the connection between the regressand and the part of each regressor that is not explained by the other regressors.

43 What if analysis

The relevant question in this case becomes: how much will y change if I change X_i? The estimation of a single-equation linear model does not allow us to answer that question, for a number of reasons. First, estimated parameters in a linear model can only answer the question: how much will E(y | X) change if I change X? We have seen that the two questions are very different if the R² of the regression is low; in this case a change in E(y | X) may not produce any visible and relevant effect on y. Second, a regression model is a conditional expected value GIVEN X. In this sense there is no scope for changing the value of any element of X.

44 What if analysis

Any statement involving such a change requires some assumption on how the conditional expectation of y changes if X changes, and a correct analysis of this requires an assumption on the joint distribution of y and X. Simulation might require the use of the multivariate joint model even when valid estimation can be performed by concentrating only on the conditional model: strong exogeneity is a stronger requirement than weak exogeneity, which is all that is needed for the estimation of the parameters of interest.

45 What if analysis

Think of a linear model with known parameters:

$$
y = \beta_1 x_1 + \beta_2 x_2.
$$

What is, in this model, the effect on y of changing x₁ by one unit while keeping x₂ constant? Easy: β₁. Now think of the estimated linear model:

$$
y = \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{u}.
$$

Now y is different from E(y | X), and the question "what is, in this model, the effect on E(y | X) of changing x₁ by one unit while keeping x₂ constant?" does not in general make sense.

46 What if analysis

Changing x₁ while keeping x₂ unaltered implies that there is zero correlation between these variables. But the estimates β̂₁ and β̂₂ are obtained using data in which, in general, there is some correlation between x₁ and x₂. Data in which fluctuations in x₁ do not have any effect on x₂ would most likely have generated estimates different from those obtained in the estimation sample. The only valid question that can be answered using the coefficients of a linear regression is: "What is the effect on E(y | X) of changing the part of each regressor that is orthogonal to the other ones?" "What if" analysis requires simulation and, in most cases, a lower level of reduction than that used for regression analysis.

47 The semi-partial R-squared

When the columns of X are orthogonal to each other, the total R² can be exactly decomposed into the sum of the partial R² due to each regressor x_i (the partial R² of a regressor i is defined as the R² of the regression of y on x_i). This is in general not the case in applications with non-experimental data: the columns of X are correlated, and an (often large) part of the overall R² depends on the joint behaviour of the columns of X. However, it is always possible to compute the marginal contribution to the overall R² due to each regressor x_i, defined as the difference between the overall R² and the R² of the regression that includes all columns of X except x_i. This is called the semi-partial R².

48 The semi-partial R-squared

Interestingly, the semi-partial R² is a simple transformation of the t-ratio:

$$
spR^2_i = \frac{t^2_{\beta_i}\bigl(1 - R^2\bigr)}{T - k}.
$$

This result has two interesting implications. First, a quantity which we considered as just a measure of statistical reliability can lead to a measure of relevance when combined with the overall R² of the regression. Second, we can re-iterate the difference between statistical significance and relevance. Suppose you have a sample size of 10,000, you have 10 columns in X, and the t-ratio on a coefficient β_i is about 4, with an associated p-value of the order of 0.0001: very statistically significant! The derivation of the semi-partial R² tells us that the contribution of this variable to the overall R² is at most approximately 16/(10,000 − 10), that is, less than two thousandths.
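
The sketch below (simulated data, numpy; the regressor index chosen is arbitrary) computes the semi-partial R² of one regressor both directly, by dropping its column and comparing the two R², and through the t-ratio formula above; the two numbers coincide up to floating-point error:

```python
import numpy as np

rng = np.random.default_rng(7)
T, k = 300, 4
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, 0.6, 0.3, -0.4]) + rng.normal(scale=2.0, size=T)

def fit(Z, y):
    b = np.linalg.solve(Z.T @ Z, Z.T @ y)
    e = y - Z @ b
    r2 = 1 - e @ e / np.sum((y - y.mean()) ** 2)
    return b, e, r2

b, e, R2 = fit(X, y)
s2 = e @ e / (T - k)
i = 2                                              # regressor whose contribution we measure
t_i = b[i] / np.sqrt(s2 * np.linalg.inv(X.T @ X)[i, i])

# Direct computation: drop column i and look at the fall in R^2
_, _, R2_without_i = fit(np.delete(X, i, axis=1), y)
sp_r2_direct = R2 - R2_without_i

# Formula from the slide: spR2_i = t_i^2 (1 - R^2) / (T - k)
sp_r2_formula = t_i ** 2 * (1 - R2) / (T - k)
print(sp_r2_direct, sp_r2_formula)   # the two numbers coincide
```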

49 Model Mis-specification

Each specification can be interpreted as the result of a reduction process; what happens if the reduction process that has generated E(y | X) omits some relevant information? There are three general cases of mis-specification:

1. mis-specification related to the choice of variables included in the regression;
2. mis-specification related to ignoring the existence of constraints on the estimated parameters;
3. mis-specification related to wrong assumptions on the properties of the error terms.

50 Choice of variables

Under-parameterization: the estimated model omits variables included in the DGP.
Over-parameterization: the estimated model includes more variables than the DGP.

51 Under-parameterization

Given the DGP:

$$
y = X_1\beta_1 + X_2\beta_2 + \varepsilon, \tag{13}
$$

for which hypotheses (1)-(4) hold, the following model is estimated:

$$
y = X_1\beta_1 + \nu. \tag{14}
$$

The OLS estimates are given by the following expression:

$$
\hat{\beta}_1^{up} = \bigl(X_1'X_1\bigr)^{-1}X_1'y. \tag{15}
$$

52 Under-parameterization

The OLS estimates obtained by estimating the DGP are instead:

$$
\hat{\beta}_1 = \bigl(X_1'M_2X_1\bigr)^{-1}X_1'M_2\,y. \tag{16}
$$

The estimates in (16) are best linear unbiased estimators (BLUE) by construction, while the estimates in (15) are biased unless X₁ and X₂ are uncorrelated. To show this, consider:

$$
\hat{\beta}_1 = \bigl(X_1'X_1\bigr)^{-1}X_1'\bigl(y - X_2\hat{\beta}_2\bigr) \tag{17}
$$
$$
= \hat{\beta}_1^{up} - \hat{D}\hat{\beta}_2, \tag{18}
$$

where D̂ is the matrix of coefficients in the regression of X₂ on X₁ and β̂₂ is the OLS estimator obtained by fitting the DGP.

53 Illustration

Given the DGP:

$$
y = 0.5\,X_1 + 0.5\,X_2 + \varepsilon_1, \tag{19}
$$
$$
X_2 = 0.8\,X_1 + \varepsilon_2, \tag{20}
$$

the following model is estimated by OLS:

$$
y = X_1\beta_1 + \nu. \tag{21}
$$

(a) The OLS estimate of β₁ will be 0.5
(b) The OLS estimate of β₁ will be 0
(c) The OLS estimate of β₁ will be 0.9
(d) The OLS estimate of β₁ will have a mean of 0.9
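
A Monte Carlo sketch of the illustration, assuming the reconstructed DGP coefficients 0.5 and 0.5 (an assumption of this example; numpy, illustrative sample size): the under-parameterized OLS estimate of β₁ centres on 0.5 + 0.8 × 0.5 = 0.9, i.e. answer (d):

```python
import numpy as np

rng = np.random.default_rng(8)
T, reps = 200, 2000
estimates = []
for _ in range(reps):
    x1 = rng.normal(size=T)
    x2 = 0.8 * x1 + rng.normal(size=T)              # X2 = 0.8 X1 + eps2   (eq. 20)
    y = 0.5 * x1 + 0.5 * x2 + rng.normal(size=T)    # assumed DGP          (eq. 19)
    # Under-parameterized regression of y on x1 only
    estimates.append((x1 @ y) / (x1 @ x1))

print(np.mean(estimates))   # approximately 0.9 = 0.5 + 0.8*0.5
```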

54 Under-parameterization

Note that if E(y | X₁, X₂) = X₁β₁ + X₂β₂ and E(X₂ | X₁) = X₁D, then

$$
E(y \mid X_1) = X_1\beta_1 + X_1 D\beta_2 = X_1\alpha.
$$

Therefore the OLS estimator in the under-parameterized model is a biased estimator of β₁, but an unbiased estimator of α. Hence, if the objective of the model is forecasting and X₁ is more easily observed than X₂, the under-parameterized model can be safely used. On the other hand, if the objective of the model is to test specific predictions on the parameters, the use of the under-parameterized model delivers biased results.

55 Over-parameterization

Given the DGP:

$$
y = X_1\beta_1 + \varepsilon, \tag{22}
$$

for which hypotheses (1)-(4) hold, the following model is estimated:

$$
y = X_1\beta_1 + X_2\beta_2 + v. \tag{23}
$$

The OLS estimator of β₁ in the over-parameterized model is

$$
\hat{\beta}_1^{op} = \bigl(X_1'M_2X_1\bigr)^{-1}X_1'M_2\,y. \tag{24}
$$

56 Over-parameterization

By estimating the DGP, we obtain instead:

$$
\hat{\beta}_1 = \bigl(X_1'X_1\bigr)^{-1}X_1'y. \tag{25}
$$

By substituting y from the DGP, one finds that both estimators are unbiased, and the difference is now made by the variance. In fact we have:

$$
\mathrm{var}\bigl(\hat{\beta}_1^{op} \mid X_1, X_2\bigr) = \sigma^2\bigl(X_1'M_2X_1\bigr)^{-1}, \tag{26}
$$
$$
\mathrm{var}\bigl(\hat{\beta}_1 \mid X_1, X_2\bigr) = \sigma^2\bigl(X_1'X_1\bigr)^{-1}. \tag{27}
$$

57 Over-parameterization

Remember that if two matrices A and B are positive definite and A − B is positive semidefinite, then the matrix B⁻¹ − A⁻¹ is also positive semidefinite. We therefore have to show that X₁'X₁ − X₁'M₂X₁ is a positive semidefinite matrix. Such a result is almost immediate:

$$
X_1'X_1 - X_1'M_2X_1 = X_1'\bigl(I - M_2\bigr)X_1 = X_1'Q_2X_1 = X_1'Q_2'Q_2X_1,
$$

where Q₂ = X₂(X₂'X₂)⁻¹X₂'. We conclude that over-parameterization impacts the efficiency of the estimators and the power of tests of hypotheses.
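
A numerical sketch (one simulated design, numpy; names and values illustrative) of the efficiency loss: the difference between the variance formulas (26) and (27) is positive semidefinite:

```python
import numpy as np

rng = np.random.default_rng(9)
T, sigma2 = 100, 1.0
X1 = rng.normal(size=(T, 2))
X2 = 0.5 * X1 @ np.array([[1.0], [1.0]]) + rng.normal(size=(T, 1))  # irrelevant but correlated with X1

M2 = np.eye(T) - X2 @ np.linalg.solve(X2.T @ X2, X2.T)

var_overparam = sigma2 * np.linalg.inv(X1.T @ M2 @ X1)   # eq. (26)
var_true      = sigma2 * np.linalg.inv(X1.T @ X1)        # eq. (27)

diff = var_overparam - var_true
print(np.linalg.eigvalsh(diff))   # all eigenvalues >= 0: the difference is psd
```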

58 Estimation under linear constraints

The estimated model is the linear model analysed up to now:

$$
y = X\beta + \varepsilon,
$$

while the DGP is instead:

$$
y = X\beta + \varepsilon, \qquad \text{subject to } R\beta - r = 0,
$$

where the constraints are expressed using the so-called implicit form.

59 Estimation under linear constraints

A useful alternative way of expressing constraints, known as the explicit form, is due to Sargan (1988):

$$
\beta = S\theta + s,
$$

where S is a (k × (k − r)) matrix of rank k − r and s is a (k × 1) vector. To show how constraints are specified in the two alternative forms, let us consider the case of the constraint β₁ = β₂ on the following specification:

$$
\ln y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i. \tag{28}
$$

60 Estimation under linear constraints

Using Rβ − r = 0:

$$
\begin{pmatrix} 0 & 1 & -1 \end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = (0),
$$

while using β = Sθ + s:

$$
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}.
$$

In practice the constraints in the explicit form are written by considering θ as the vector of free parameters.

61 Estimation under linear constraints

Note that there is no unique way of expressing constraints in the explicit form; in our case the same constraint can also be imposed as

$$
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_2 \end{pmatrix}.
$$

As the two forms must be equivalent, Rβ − r = 0 and RSθ + Rs − r = 0 must hold for any θ, which implies:

1. RS = 0;
2. Rs − r = 0.

62 The restricted least squares (RLS) estimator

To construct the RLS estimator, substitute the constraint into the original model to obtain:

$$
y - Xs = XS\theta + \varepsilon. \tag{29}
$$

Equation (29) is equivalent to:

$$
y^{*} = X^{*}\theta + \varepsilon, \tag{30}
$$

where y* = y − Xs and X* = XS. Note that the transformed model features the same residuals as the original model; therefore, if the standard hypotheses hold for the original model, they also hold for the transformed one.

63 Application

Given the DGP:

$$
y = X_1\beta_1 + X_2\beta_1 + \varepsilon, \tag{31}
$$

the RLS estimator will be obtained by regressing y on

(a) (X₁ + X₂)
(b) (X₁ − X₂)
(c) (X₁ / X₂)
(d) X₁ and X₂

64 The restricted least squares (RLS) estimator

We apply OLS to the transformed model to obtain:

$$
\hat{\theta} = \bigl(X^{*\prime}X^{*}\bigr)^{-1}X^{*\prime}y^{*}
= \bigl(S'X'XS\bigr)^{-1}S'X'\bigl(y - Xs\bigr). \tag{32}
$$

From (32) the RLS estimator is easily obtained by applying the transformation β̂_rls = Sθ̂ + s. Similarly, the variance of the RLS estimator is easily obtained as:

$$
\mathrm{var}\bigl(\hat{\theta} \mid X\bigr) = \sigma^2\bigl(X^{*\prime}X^{*}\bigr)^{-1} = \sigma^2\bigl(S'X'XS\bigr)^{-1},
$$
$$
\mathrm{var}\bigl(\hat{\beta}_{rls} \mid X\bigr) = \mathrm{var}\bigl(S\hat{\theta} + s \mid X\bigr)
= S\,\mathrm{var}\bigl(\hat{\theta} \mid X\bigr)S' = \sigma^2 S\bigl(S'X'XS\bigr)^{-1}S'.
$$
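
A sketch of RLS through the explicit form β = Sθ + s for the constraint β₁ = β₂ in the three-parameter model (28) (simulated data, numpy; names illustrative). For this constraint the transformed regression amounts to regressing y on the constant and (x₁ + x₂), which is answer (a) in the application above:

```python
import numpy as np

rng = np.random.default_rng(10)
T = 150
x1, x2 = rng.normal(size=T), rng.normal(size=T)
X = np.column_stack([np.ones(T), x1, x2])
y = X @ np.array([1.0, 0.7, 0.7]) + rng.normal(size=T)   # DGP satisfies beta_1 = beta_2

# Explicit form of the constraint beta_1 = beta_2: beta = S theta + s
S = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])
s = np.zeros(3)

X_star = X @ S                     # transformed regressors: constant and (x1 + x2)
y_star = y - X @ s                 # transformed dependent variable (unchanged since s = 0)
theta_hat = np.linalg.solve(X_star.T @ X_star, X_star.T @ y_star)   # eq. (32)
beta_rls = S @ theta_hat + s

print(beta_rls)   # second and third elements are identical by construction
```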

65 The restricted least squares (RLS) estimator

We can now discuss the properties of OLS and RLS in the case of a DGP with constraints.

Unbiasedness. Under the assumed DGP, both estimators are unbiased, since this property depends on the validity of hypotheses (1)-(4), which is not affected by the imposition of constraints on the parameters.

Efficiency. Obviously, if we interpret RLS as the OLS estimator of the transformed model (32), we immediately derive the result that RLS is the most efficient estimator, as the hypotheses for the validity of the Gauss-Markov theorem are satisfied when OLS is applied to (32). Note that by setting L = (X'X)⁻¹X' in the context of the transformed model, we do not generally obtain OLS but an estimator whose conditional variance with respect to X coincides with the conditional variance of the OLS estimator.

66 The restricted least squares (RLS) estimator

We support this intuition with a formal argument:

$$
\mathrm{var}\bigl(\hat{\beta} \mid X\bigr) - \mathrm{var}\bigl(\hat{\beta}_{rls} \mid X\bigr)
= \sigma^2\bigl(X'X\bigr)^{-1} - \sigma^2 S\bigl(S'X'XS\bigr)^{-1}S'.
$$

Define A as:

$$
A = \bigl(X'X\bigr)^{-1} - S\bigl(S'X'XS\bigr)^{-1}S'.
$$

Given that

$$
\begin{aligned}
A\,X'X\,A &= \bigl(X'X\bigr)^{-1} - 2S\bigl(S'X'XS\bigr)^{-1}S'
+ S\bigl(S'X'XS\bigr)^{-1}S'X'XS\bigl(S'X'XS\bigr)^{-1}S' \\
&= \bigl(X'X\bigr)^{-1} - S\bigl(S'X'XS\bigr)^{-1}S' = A,
\end{aligned}
$$

A is positive semidefinite: since A = A X'X A = (XA)'(XA), it is the product of a matrix and its transpose.

67 Heteroscedasticity, Autocorrelation, and the GLS estimator

Let us reconsider the single-equation model and generalize it to the case in which the hypotheses of diagonality and constancy of the conditional variance-covariance matrix of the residuals do not hold:

$$
y = X\beta + \varepsilon, \qquad \varepsilon \sim \text{n.d.}\bigl(0, \sigma^2\Omega\bigr), \tag{33}
$$

where Ω is a (T × T) symmetric and positive definite matrix. When the OLS method is applied to model (33), it delivers estimators which are consistent but not efficient; moreover, the traditional formula for the variance-covariance matrix of the OLS estimator, σ²(X'X)⁻¹, is wrong and leads to incorrect inference.
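
As a sketch of the estimator this setting leads to (the standard GLS formula β̂_GLS = (X'Ω⁻¹X)⁻¹X'Ω⁻¹y, which is not derived on this slide), the code below compares OLS and GLS on an illustrative heteroscedastic design with a known Ω (numpy; all names and values are assumptions):

```python
import numpy as np

rng = np.random.default_rng(11)
T = 200
x = rng.uniform(1.0, 3.0, size=T)
X = np.column_stack([np.ones(T), x])
Omega = np.diag(x ** 2)               # assumed known heteroscedasticity: Var(eps_t) proportional to x_t^2
eps = rng.normal(size=T) * x
y = X @ np.array([1.0, 0.5]) + eps

# OLS: consistent but inefficient; sigma^2 (X'X)^{-1} is the wrong variance formula here
b_ols = np.linalg.solve(X.T @ X, X.T @ y)

# GLS: beta_gls = (X' Omega^{-1} X)^{-1} X' Omega^{-1} y
Omega_inv = np.linalg.inv(Omega)
b_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

print(b_ols, b_gls)
```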


More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

Statistical inference

Statistical inference Statistical inference Contents 1. Main definitions 2. Estimation 3. Testing L. Trapani MSc Induction - Statistical inference 1 1 Introduction: definition and preliminary theory In this chapter, we shall

More information

Instrumental Variables

Instrumental Variables Università di Pavia 2010 Instrumental Variables Eduardo Rossi Exogeneity Exogeneity Assumption: the explanatory variables which form the columns of X are exogenous. It implies that any randomness in the

More information

3. For a given dataset and linear model, what do you think is true about least squares estimates? Is Ŷ always unique? Yes. Is ˆβ always unique? No.

3. For a given dataset and linear model, what do you think is true about least squares estimates? Is Ŷ always unique? Yes. Is ˆβ always unique? No. 7. LEAST SQUARES ESTIMATION 1 EXERCISE: Least-Squares Estimation and Uniqueness of Estimates 1. For n real numbers a 1,...,a n, what value of a minimizes the sum of squared distances from a to each of

More information

Lecture 13: Simple Linear Regression in Matrix Format. 1 Expectations and Variances with Vectors and Matrices

Lecture 13: Simple Linear Regression in Matrix Format. 1 Expectations and Variances with Vectors and Matrices Lecture 3: Simple Linear Regression in Matrix Format To move beyond simple regression we need to use matrix algebra We ll start by re-expressing simple linear regression in matrix form Linear algebra is

More information

Föreläsning /31

Föreläsning /31 1/31 Föreläsning 10 090420 Chapter 13 Econometric Modeling: Model Speci cation and Diagnostic testing 2/31 Types of speci cation errors Consider the following models: Y i = β 1 + β 2 X i + β 3 X 2 i +

More information

Answers to Problem Set #4

Answers to Problem Set #4 Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2

More information

The Finite Sample Properties of the Least Squares Estimator / Basic Hypothesis Testing

The Finite Sample Properties of the Least Squares Estimator / Basic Hypothesis Testing 1 The Finite Sample Properties of the Least Squares Estimator / Basic Hypothesis Testing Greene Ch 4, Kennedy Ch. R script mod1s3 To assess the quality and appropriateness of econometric estimators, we

More information

STAT 540: Data Analysis and Regression

STAT 540: Data Analysis and Regression STAT 540: Data Analysis and Regression Wen Zhou http://www.stat.colostate.edu/~riczw/ Email: riczw@stat.colostate.edu Department of Statistics Colorado State University Fall 205 W. Zhou (Colorado State

More information

Quick Review on Linear Multiple Regression

Quick Review on Linear Multiple Regression Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,

More information

Multivariate Regression

Multivariate Regression Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the

More information

Introductory Econometrics

Introductory Econometrics Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna November 23, 2013 Outline Introduction

More information

Spatial Econometrics

Spatial Econometrics Spatial Econometrics Lecture 5: Single-source model of spatial regression. Combining GIS and regional analysis (5) Spatial Econometrics 1 / 47 Outline 1 Linear model vs SAR/SLM (Spatial Lag) Linear model

More information

Lecture 24: Weighted and Generalized Least Squares

Lecture 24: Weighted and Generalized Least Squares Lecture 24: Weighted and Generalized Least Squares 1 Weighted Least Squares When we use ordinary least squares to estimate linear regression, we minimize the mean squared error: MSE(b) = 1 n (Y i X i β)

More information

Regression. ECO 312 Fall 2013 Chris Sims. January 12, 2014

Regression. ECO 312 Fall 2013 Chris Sims. January 12, 2014 ECO 312 Fall 2013 Chris Sims Regression January 12, 2014 c 2014 by Christopher A. Sims. This document is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License What

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix

More information

Regression and Statistical Inference

Regression and Statistical Inference Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

14 Multiple Linear Regression

14 Multiple Linear Regression B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 14 Multiple Linear Regression 14.1 The multiple linear regression model In simple linear regression, the response variable y is expressed in

More information

The outline for Unit 3

The outline for Unit 3 The outline for Unit 3 Unit 1. Introduction: The regression model. Unit 2. Estimation principles. Unit 3: Hypothesis testing principles. 3.1 Wald test. 3.2 Lagrange Multiplier. 3.3 Likelihood Ratio Test.

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

Chapter 11 Specification Error Analysis

Chapter 11 Specification Error Analysis Chapter Specification Error Analsis The specification of a linear regression model consists of a formulation of the regression relationships and of statements or assumptions concerning the explanator variables

More information

SEM with observed variables: parameterization and identification. Psychology 588: Covariance structure and factor models

SEM with observed variables: parameterization and identification. Psychology 588: Covariance structure and factor models SEM with observed variables: parameterization and identification Psychology 588: Covariance structure and factor models Limitations of SEM as a causal modeling 2 If an SEM model reflects the reality, the

More information