The Statistical Property of Ordinary Least Squares
- Andra Hubbard
1 The Statistical Property of Ordinary Least Squares

The linear model to which we apply OLS is

    y_t = X_t \beta + u_t

As we have derived, the OLS estimator is

    \hat\beta = (X^T X)^{-1} X^T y

Substituting the linear model for y, we get

    \hat\beta = (X^T X)^{-1} X^T (X\beta + u) = \beta + (X^T X)^{-1} X^T u
2 Thus, the statistical properties of OLS are essentially the statistical properties of (X^T X)^{-1} X^T u.

Definition: The OLS estimator is said to be unbiased if

    E[\hat\beta] = \beta

This means that, given the way the data are generated, before you start the estimation procedure you would expect that, on average, the OLS estimate does not deviate from the true value. Because

    \hat\beta - \beta = (X^T X)^{-1} X^T u

OLS is unbiased if

    E[(X^T X)^{-1} X^T u] = 0
3 Now, let us use the Law of Iterated Expectations. That is,

    E[(X^T X)^{-1} X^T u] = E[ E[(X^T X)^{-1} X^T u | X] ] = E[ (X^T X)^{-1} X^T E[u | X] ]

Therefore, the OLS estimator is unbiased if E[u | X] = 0. In this case E[\hat\beta | X] = \beta as well. This assumption states that the regressors X are exogenous to the error term, which can be a strong assumption. In a time series setting, it means that the mean of the current u_t, conditional on the past, present, and future values of X_s, has to be zero. Roughly, the current u_t has to be orthogonal (uncorrelated) to X_s of all periods.
4 In some cases, we can make the assumption weaker. That is,

    E[u_t | X_t] = 0

If the data are a time series, i.e. the sample is observed over time and t indexes time periods, then the assumption is said to be that the regressors X_t are predetermined with respect to the error term.
5 An example where the OLS estimator is biased:

    y_t = \beta_1 + \beta_2 y_{t-1} + u_t,   u_t ~ IID(0, \sigma^2)

Using the FWL theorem, we first demean y_t and y_{t-1} to derive the OLS coefficient. That is,

    \hat\beta_2 = (y_{-1}^T M_\iota y_{-1})^{-1} y_{-1}^T M_\iota y
                = (y_{-1}^T M_\iota y_{-1})^{-1} y_{-1}^T M_\iota (y_{-1} \beta_2 + u)
                = \beta_2 + (y_{-1}^T M_\iota y_{-1})^{-1} y_{-1}^T M_\iota u

Notice that \beta_2 is a scalar, so \beta_2 y_{-1} = y_{-1} \beta_2. Because

    E[y_{-1}^T M_\iota u] \neq 0

in general (the lagged dependent variable is correlated with past errors), the OLS estimator is not unbiased.
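To make this bias concrete, the following is a minimal numpy simulation under hypothetical parameter values (\beta_1 = 1, \beta_2 = 0.5, \sigma = 1, chosen only for illustration). With short samples, the average OLS estimate of \beta_2 falls noticeably below the true value, even though the estimator is consistent.

```python
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2, sigma = 1.0, 0.5, 1.0   # hypothetical true parameters
n, reps = 30, 5000                    # short samples make the bias visible

estimates = []
for _ in range(reps):
    y = np.zeros(n)
    y[0] = beta1 / (1 - beta2)        # start near the stationary mean
    for t in range(1, n):
        y[t] = beta1 + beta2 * y[t - 1] + sigma * rng.standard_normal()
    # regress y_t on a constant and y_{t-1}
    X = np.column_stack([np.ones(n - 1), y[:-1]])
    b = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    estimates.append(b[1])

# In finite samples E[beta2_hat] < beta2: the lagged regressor is
# correlated with past errors, so OLS is biased (though consistent).
mean_b2 = np.mean(estimates)
print(mean_b2)
```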
6 Consistency of the Least Squares Estimator

In this chapter, we will argue that as the sample size increases, the least squares estimator \hat\beta converges in probability to the true value \beta.

Law of Large Numbers

Before showing the consistency of the OLS estimator, we discuss a very important theorem, the Law of Large Numbers. Let u_t, t = 1, ..., n, be random variables that are independently and identically distributed with finite mean E[u_t] = \mu and finite variance \sigma_u^2. Then the sample average is

    \bar{u} = (1/n) \sum_{t=1}^n u_t
7 which has mean

    E[\bar{u}] = E[(1/n) \sum_{t=1}^n u_t] = (1/n) \sum_{t=1}^n E[u_t] = (1/n) n \mu = \mu

Next, we derive the variance. For n = 2,

    Var(\bar{u}) = E[ ((u_1 + u_2)/2 - \mu)^2 ]
                 = (1/4) E[ (u_1 - \mu + u_2 - \mu)^2 ]
                 = (1/4) E[ (u_1 - \mu)^2 + (u_2 - \mu)^2 + 2(u_1 - \mu)(u_2 - \mu) ]
                 = (1/4) [ Var(u_1) + Var(u_2) + 2 E(u_1 - \mu)(u_2 - \mu) ]
                 = (1/4) [ Var(u_1) + Var(u_2) ]

This is because u_1, u_2 are independent, so

    Cov(u_1, u_2) = E(u_1 - \mu)(u_2 - \mu) = E(u_1 - \mu) E(u_2 - \mu) = 0
8 Similarly, for general n,

    Var(\bar{u}) = Var((1/n) \sum_{t=1}^n u_t) = E[ ((1/n) \sum_{t=1}^n (u_t - \mu))^2 ]
                 = (1/n^2) [ \sum_{t=1}^n Var(u_t) + \sum_{i \neq j} Cov(u_i, u_j) ]
                 = (1/n^2) \sum_{t=1}^n \sigma_u^2 = \sigma_u^2 / n
9 The mean of the sample average is the true value:

    E[\bar{u}] = \mu

The variance of the sample average converges to zero as the sample size goes to infinity:

    Var(\bar{u}) -> 0 as n -> \infty,   or   lim_{n -> \infty} Var(\bar{u}) = 0

This means that as the sample size increases, the sample average is distributed more and more tightly around the true value. A deviation of the sample mean from the true value, no matter how small, then becomes less and less likely. More formally, for any \epsilon > 0,

    Pr(|\bar{u} - \mu| > \epsilon) -> 0 as n -> \infty
10 or

    lim_{n -> \infty} Pr(|\bar{u} - \mu| > \epsilon) = 0

or

    plim_{n -> \infty} \bar{u} = \mu

In words, the sample average converges to the true mean in probability.

Theorem: (Weak) Law of Large Numbers. Let u_t, t = 1, ..., n, be independently and identically distributed with finite mean \mu and variance \sigma_u^2. Then the sample average \bar{u} converges to the true value \mu in probability.
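A quick numerical illustration of E[\bar{u}] = \mu and Var(\bar{u}) = \sigma^2/n, using hypothetical values \mu = 2, \sigma = 3, n = 50 (any values would do):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 2.0, 3.0, 50, 20000   # hypothetical values

# Draw many samples of size n and compute each sample's average.
means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

emp_mean = means.mean()   # should be close to mu
emp_var = means.var()     # should be close to sigma**2 / n = 0.18
print(emp_mean, emp_var)
```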
11 Consistency of the OLS estimator. Suppose, to simplify the discussion, that X_t is random but independent of u_t. The OLS estimator is

    \hat\beta = (X^T X)^{-1} X^T y = \beta + (X^T X)^{-1} X^T u

Now divide both the "denominator" and the "numerator" by the sample size n:

    \hat\beta = \beta + [(1/n) X^T X]^{-1} (1/n) X^T u

Now, let us assume that

    plim_{n -> \infty} (1/n) X^T X = S_{X^T X}

where S_{X^T X} is invertible.
12 Then, it is known that we can write

    plim_{n -> \infty} \hat\beta = \beta + [plim_{n -> \infty} (1/n) X^T X]^{-1} plim_{n -> \infty} (1/n) X^T u
                                 = \beta + [S_{X^T X}]^{-1} plim_{n -> \infty} (1/n) X^T u

What is left for us to derive is plim_{n -> \infty} (1/n) X^T u. Now,

    (1/n) X^T u = (1/n) \sum_{t=1}^n x_t u_t

The products x_t u_t, t = 1, ..., n, are independently and identically distributed random variables with finite mean E[x_t u_t] = 0 and (we assume) finite variance Var(x_t u_t) = \sigma_{xu}^2. Therefore the Law of Large Numbers holds, and
13

    plim_{n -> \infty} (1/n) \sum_{t=1}^n x_t u_t = \mu_{xu} = 0

Together, we have shown that

    plim_{n -> \infty} \hat\beta = \beta + [plim_{n -> \infty} (1/n) X^T X]^{-1} plim_{n -> \infty} (1/n) X^T u
                                 = \beta + [S_{X^T X}]^{-1} \cdot 0 = \beta

Thus, the OLS estimator converges to the true parameter value in probability as the sample size increases. We also say that the OLS estimator is consistent.
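A minimal sketch of this convergence, with hypothetical coefficients and an exogenous regressor: the estimation error shrinks as n grows.

```python
import numpy as np

rng = np.random.default_rng(2)
beta = np.array([1.0, -2.0])   # hypothetical true coefficients
sigma = 1.0

errors = {}
for n in (50, 50000):
    # regressor independent of the error term, as assumed above
    X = np.column_stack([np.ones(n), rng.standard_normal(n)])
    y = X @ beta + sigma * rng.standard_normal(n)
    b = np.linalg.solve(X.T @ X, X.T @ y)
    errors[n] = np.max(np.abs(b - beta))   # worst coefficient error

print(errors)   # the error at n = 50000 is far smaller
```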
14 The (Variance-)Covariance Matrix of the OLS Estimator

The (variance-)covariance matrix of the OLS estimator given X is

    Var[\hat\beta | X] = E[ (\hat\beta - E(\hat\beta | X)) (\hat\beta - E(\hat\beta | X))^T | X ]

which is a k by k matrix if X has k variables. The often-reported standard error of an OLS parameter estimate is

    std.error(\hat\beta_i) = ( [Var(\hat\beta | X)]_{ii} )^{1/2}
15 Derivation of the Covariance Matrix

Remember that the OLS estimator is

    \hat\beta = (X^T X)^{-1} X^T y = \beta + (X^T X)^{-1} X^T u

and because OLS is unbiased, E[\hat\beta] = \beta. Therefore,

    \hat\beta - E[\hat\beta] = (X^T X)^{-1} X^T u

and

    (\hat\beta - E[\hat\beta]) (\hat\beta - E[\hat\beta])^T = (X^T X)^{-1} X^T u u^T X (X^T X)^{-1}
16 Now, assume that the error term of the linear model satisfies

    Var(u | X) = E[u u^T | X] = \sigma^2 I

That is, because E[u_t | X] = 0, for any sample t and for s \neq t,

    Var(u_t | X) = [E[u u^T | X]]_{tt} = \sigma^2
    Cov(u_s, u_t | X) = [E[u u^T | X]]_{st} = 0

That is, the variance of the error term is the same for each observation, and the error terms of two different observations are uncorrelated.
17 Then,

    Var(\hat\beta | X) = (X^T X)^{-1} X^T E(u u^T | X) X (X^T X)^{-1}
                       = (X^T X)^{-1} X^T \sigma^2 I X (X^T X)^{-1}
                       = \sigma^2 (X^T X)^{-1} X^T X (X^T X)^{-1}
                       = \sigma^2 (X^T X)^{-1}
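The formula \sigma^2 (X^T X)^{-1} can be checked by Monte Carlo: hold a hypothetical design X fixed, redraw the errors many times, and compare the empirical covariance of \hat\beta with the theoretical one.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps, sigma = 100, 20000, 2.0
X = np.column_stack([np.ones(n), rng.standard_normal(n)])  # fixed design
beta = np.array([0.5, 1.5])                                # hypothetical truth
theory = sigma**2 * np.linalg.inv(X.T @ X)                 # sigma^2 (X'X)^{-1}

bs = np.empty((reps, 2))
for r in range(reps):
    y = X @ beta + sigma * rng.standard_normal(n)
    bs[r] = np.linalg.solve(X.T @ X, X.T @ y)

empirical = np.cov(bs.T)   # sample covariance of the estimates
print(theory)
print(empirical)           # entry-by-entry close to theory
```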
18 Precision of Least Squares Estimators

The smaller the variance of an OLS parameter estimate \hat\beta, the higher we say the precision of the parameter estimate is. Therefore, we define the precision matrix as the inverse of the covariance matrix:

    Prec(\hat\beta) = Var(\hat\beta)^{-1} = \sigma^{-2} (X^T X)

First, we can see that the smaller the variance of the error term, the smaller the variance of the estimator, i.e. the larger the precision. Second, because

    X^T X = \sum_{t=1}^n x_t^T x_t

usually the larger the sample size, the smaller the variance, i.e. the larger the precision.
19 Now, we look at the variance of a single OLS parameter \hat\beta_1. As we have seen from the FWL theorem, if we regress both y and x_1 on X_2 and take residuals, i.e. premultiply them by

    M_2 = I - X_2 (X_2^T X_2)^{-1} X_2^T

then

    \hat\beta_1 = (x_1^T M_2 x_1)^{-1} x_1^T M_2 y

and the variance of \hat\beta_1 is

    Var(\hat\beta_1) = \sigma^2 (x_1^T M_2 x_1)^{-1} = \sigma^2 / (x_1^T M_2 x_1)

If x_1 can be perfectly explained by the rest of X, i.e. X_2, then M_2 x_1 = 0 and therefore the OLS estimator \hat\beta_1 is not defined; equivalently, its variance is infinite and its precision zero. This is the multicollinearity problem. The smaller the sum of squares of the residual M_2 x_1, the larger the variance, and thus the smaller the precision.
20 Linear Functions of Parameter Estimates

Suppose that the objective of the OLS regression was to obtain parameter estimates so that we can form the predictor \hat{y}_p given the parameter estimate \hat\beta and a specific value x_p. Then,

    \hat{y}_p = x_p \hat\beta

Notice that because the OLS estimator is unbiased and the mean of the error term is zero, the predictor is also unbiased. That is,

    E[\hat{y}_p | x_p] = x_p E[\hat\beta] = x_p \beta = E[x_p \beta + u_p | x_p] = E[y_p | x_p]

and the variance of the predictor is

    Var[\hat{y}_p | x_p] = E[ (x_p \hat\beta - x_p \beta)(x_p \hat\beta - x_p \beta)^T | x_p ]
                         = x_p E[ (\hat\beta - \beta)(\hat\beta - \beta)^T ] x_p^T
                         = x_p Var(\hat\beta) x_p^T
21 The forecast error is

    y_p - \hat{y}_p = x_p \beta + u_p - x_p \hat\beta = u_p + x_p (\beta - \hat\beta)

Because the error term u_p is assumed to be uncorrelated with X and x_p, and if we assume that u_p is uncorrelated with u_t, t = 1, ..., n, then u_p and \hat\beta are also uncorrelated. Therefore,

    Var(y_p - \hat{y}_p) = Var(u_p) + x_p Var(\hat\beta) x_p^T - 2 Cov(u_p, x_p \hat\beta)
                         = Var(u_p) + x_p Var(\hat\beta) x_p^T
                         = \sigma^2 + \sigma^2 x_p (X^T X)^{-1} x_p^T
22 In the same way, we can derive standard errors of any linear combination \omega^T \hat\beta of the OLS estimate, as follows:

    Var(\omega^T \hat\beta) = E[ (\omega^T \hat\beta - \omega^T \beta)(\omega^T \hat\beta - \omega^T \beta)^T ]
                            = \omega^T E[ (\hat\beta - \beta)(\hat\beta - \beta)^T ] \omega
                            = \omega^T Var(\hat\beta) \omega
                            = \sigma^2 \omega^T (X^T X)^{-1} \omega
23 Efficiency of the OLS Estimator

In this section, we will conclude that OLS is the most efficient estimator among all linear unbiased estimators. That is, roughly speaking, OLS has the smallest variance. But how do we define one variance-covariance matrix A to be smaller than another variance-covariance matrix B? A convenient definition is the following: a symmetric matrix A is smaller than a symmetric matrix B if, for any vector \omega,

    \omega^T (B - A) \omega >= 0

This is equivalent to saying that the symmetric matrix B - A is positive semidefinite. (A symmetric matrix A is positive definite if for any nonzero vector \omega,

    \omega^T A \omega = \sum_{i=1}^k \sum_{j=1}^k \omega_i \omega_j A_{ij} > 0.)
24 Notice that a variance-covariance matrix is always symmetric and positive semidefinite. Suppose y_t is a vector of random variables. Then (y_t - E[y_t])(y_t - E[y_t])^T is positive semidefinite, because for any vector \omega

    \omega^T (y_t - E[y_t])(y_t - E[y_t])^T \omega = ( \omega^T (y_t - E[y_t]) )^2 >= 0

By taking expectations,

    E[ \omega^T (y_t - E[y_t])(y_t - E[y_t])^T \omega ] = \omega^T Var(y_t) \omega >= 0
25 Gauss-Markov Theorem

Assume that E[u | X] = 0 and E[u u^T | X] = \sigma^2 I in the linear regression model. Then the OLS estimator is more efficient than any other linear unbiased estimator \tilde\beta, i.e.

    Var(\tilde\beta) - Var(\hat\beta)

is a positive semidefinite matrix.
26 Proof

Consider an arbitrary linear estimator

    \tilde\beta = A y

Now, decompose it as \tilde\beta = \hat\beta + (\tilde\beta - \hat\beta). Then

    Var(\tilde\beta | X) = Var(\hat\beta | X) + Var(\tilde\beta - \hat\beta | X) + 2 Cov(\hat\beta, \tilde\beta - \hat\beta | X)

What we will show below is that indeed Cov(\hat\beta, \tilde\beta - \hat\beta | X) = 0, and therefore

    Var(\tilde\beta | X) = Var(\hat\beta | X) + Var(\tilde\beta - \hat\beta | X) >= Var(\hat\beta | X)
27 The OLS estimator is itself a linear estimator, with

    A = (X^T X)^{-1} X^T

Because Ay = A(X\beta + u),

    E[Ay | X] = A X \beta + A E[u | X] = A X \beta
28 In order for unbiasedness to hold for any true value \beta, i.e. E[Ay | X] = \beta,

    A X = I = (X^T X)^{-1} X^T X

has to hold. So,

    [A - (X^T X)^{-1} X^T] X = 0

Now,

    \tilde\beta - \hat\beta = A(X\beta + u) - \beta - (X^T X)^{-1} X^T u = [A - (X^T X)^{-1} X^T] u
29 Because it is assumed that Var(u | X) = \sigma^2 I,

    Cov(\tilde\beta - \hat\beta, \hat\beta | X) = E[ (A - (X^T X)^{-1} X^T) u u^T X (X^T X)^{-1} | X ]
                                                = (A - (X^T X)^{-1} X^T) \sigma^2 I X (X^T X)^{-1}
                                                = \sigma^2 [A - (X^T X)^{-1} X^T] X (X^T X)^{-1} = 0
30 Residuals and Error Terms

Now, consider the residuals of the OLS regression:

    \hat{u} = y - X\hat\beta = (I - X (X^T X)^{-1} X^T) y = M_X y = M_X (X\beta + u) = M_X u

Then, because E[M_X u | X] = 0,

    Var(\hat{u} | X) = E[\hat{u}\hat{u}^T | X] = M_X E[\sigma^2 I] M_X = \sigma^2 M_X

which is different from Var(u) = \sigma^2 I. In fact,

    Var(u) - Var(\hat{u}) = \sigma^2 (I - M_X) = \sigma^2 P_X
31 Notice that for any vector y,

    y^T P_X y = y^T P_X P_X y = (X\hat\beta)^T (X\hat\beta) >= 0

So P_X is positive semidefinite. Therefore, in the matrix sense, Var(\hat{u}) is smaller than Var(u). That is, OLS overfits the data, i.e. makes the residuals have smaller variance than the error terms. As we have seen, the variance matrix of the OLS estimator is

    Var(\hat\beta) = \sigma^2 (X^T X)^{-1}

We need to derive an estimate of \sigma^2 = Var(u_t). A potential estimator is the sample variance of the residuals,

    (1/n) \sum_{t=1}^n (\hat{u}_t - \bar{\hat{u}})^2

As long as the constant term \iota is included in X, which is usually the case,

    \bar{\hat{u}} = (1/n) \iota^T \hat{u} = 0
32 Therefore,

    Var(\hat{u}_t) = [\sigma^2 M_X]_{tt} = [diag(\sigma^2 M_X)]_t

where diag(A) is the vector containing the diagonal elements of the n by n matrix A. Hence,

    \sum_{t=1}^n Var(\hat{u}_t) = \sum_{t=1}^n [\sigma^2 M_X]_{tt} = trace(\sigma^2 M_X)

where trace(A) is the sum of all the diagonal elements, i.e. trace(A) = \sum_{i=1}^n A_{ii}.
33 Notice that

    trace(A + B) = trace(A) + trace(B)

This is because [A + B]_{ii} = A_{ii} + B_{ii}. Also,

    trace(AB) = trace(BA)

This is because

    trace(AB) = \sum_{i=1}^n [AB]_{ii} = \sum_{i=1}^n \sum_{j=1}^n A_{ij} B_{ji}
              = \sum_{j=1}^n \sum_{i=1}^n B_{ji} A_{ij} = \sum_{j=1}^n [BA]_{jj} = trace(BA)
34 Therefore,

    trace(\sigma^2 M_X) = trace( \sigma^2 (I - X (X^T X)^{-1} X^T) )
                        = \sigma^2 trace(I) - \sigma^2 trace( X (X^T X)^{-1} X^T )
                        = \sigma^2 n - \sigma^2 trace( (X^T X)^{-1} X^T X )
                        = \sigma^2 (n - k)

Therefore,

    E[ \sum_{t=1}^n \hat{u}_t^2 ] = \sum_{t=1}^n Var(\hat{u}_t) = \sigma^2 (n - k) < \sigma^2 n = \sum_{t=1}^n Var(u_t)
35 So, the unbiased estimate of the variance \sigma^2 is

    s^2 = SSR / (n - k) = (1/(n - k)) \sum_{t=1}^n \hat{u}_t^2

Together, the estimate of the variance matrix of the OLS coefficients is

    \widehat{Var}(\hat\beta) = s^2 (X^T X)^{-1}
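The degrees-of-freedom correction can be checked by simulation with a hypothetical small-sample design: dividing the SSR by n - k gives an estimator whose average is close to \sigma^2, while dividing by n is biased downward by the factor (n - k)/n, exactly as derived above.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k, reps, sigma = 20, 3, 30000, 1.5   # hypothetical small-sample design
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
beta = np.ones(k)

s2_vals, naive_vals = [], []
for _ in range(reps):
    y = X @ beta + sigma * rng.standard_normal(n)
    u_hat = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
    ssr = u_hat @ u_hat
    s2_vals.append(ssr / (n - k))   # unbiased: divides by n - k
    naive_vals.append(ssr / n)      # biased downward: divides by n

print(np.mean(s2_vals), np.mean(naive_vals), sigma**2)
```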
36 Misspecification of Linear Regression Models

In most situations, we do not know a priori what the true regression model is, i.e. which variables belong in the regression equation.

Overspecification

Suppose we put more variables on the RHS than needed:

    y = X\beta + Z\gamma + u,   u ~ IID(0, \sigma^2 I)

where E(u | X, Z) = 0 but Z is redundant, i.e. \gamma = 0. Then,

    \hat\beta = (X^T M_Z X)^{-1} X^T M_Z y = (X^T M_Z X)^{-1} X^T M_Z (X\beta + u)
              = \beta + (X^T M_Z X)^{-1} X^T M_Z u
37 Because E[u | X, Z] = 0,

    E[ (X^T M_Z X)^{-1} X^T M_Z u | X, Z ] = 0

Therefore,

    E[\hat\beta | X, Z] = \beta

and thus the coefficient \hat\beta is unbiased.
38 Variance:

    Var(\hat\beta | X, Z) = \sigma^2 (X^T M_Z X)^{-1}

Now, we know that

    X^T M_Z X = X^T X - X^T P_Z X

Hence,

    X^T X - X^T M_Z X = X^T P_Z X

which is positive semidefinite. So X^T M_Z X <= X^T X, and therefore

    \sigma^2 (X^T X)^{-1} <= \sigma^2 (X^T M_Z X)^{-1}

That is, if we include unnecessary variables, as long as the assumption E[u | X, Z] = 0 is satisfied we keep unbiasedness, but we get an estimator with larger variance, i.e. we lose efficiency.
39 Underspecification

What if the true specification is

    y = X\beta + Z\gamma + u

but we leave out Z? As we have discussed, the OLS estimate is

    \hat\beta = (X^T X)^{-1} X^T (X\beta + Z\gamma + u) = \beta + (X^T X)^{-1} X^T (Z\gamma + u)

Given the assumption E[u | X, Z] = 0,

    E[\hat\beta | X, Z] = \beta + (X^T X)^{-1} X^T Z \gamma

As long as X^T Z \neq 0 and \gamma \neq 0, we have omitted variable bias.
40 Now,

    \hat\beta - \beta = (X^T X)^{-1} X^T Z \gamma + (X^T X)^{-1} X^T u

and

    (\hat\beta - \beta)(\hat\beta - \beta)^T = (X^T X)^{-1} X^T Z \gamma \gamma^T Z^T X (X^T X)^{-1}
                                            + (X^T X)^{-1} X^T u u^T X (X^T X)^{-1}
                                            + (X^T X)^{-1} X^T Z \gamma u^T X (X^T X)^{-1}
                                            + (X^T X)^{-1} X^T u \gamma^T Z^T X (X^T X)^{-1}

where E[u u^T | X, Z] = \sigma^2 I and E[u | X, Z] = 0.
41 It is not clear which OLS estimator has less variance. If \gamma is small, it could be that omitting an unimportant variable would result in bias but an improvement in MSE. But with omitted variables, the variance-covariance matrix gives the wrong message about the accuracy and reliability of the OLS estimate. With larger sample size the bias dominates (the variance shrinks while the bias does not), so the problem with underspecification becomes more severe. If we take conditional expectations given X, Z, we obtain

    MSE(\hat\beta_o) = E[ (\hat\beta_o - \beta)(\hat\beta_o - \beta)^T | X, Z ]
                     = (X^T X)^{-1} X^T Z \gamma \gamma^T Z^T X (X^T X)^{-1} + \sigma^2 (X^T X)^{-1}

Mean squared error is the variation of an estimator around the true value. What if you included the variables Z in the regression? Then \hat\beta is unbiased, and thus

    MSE(\hat\beta) = Var(\hat\beta) = \sigma^2 (X^T M_Z X)^{-1}
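A minimal numpy sketch of omitted variable bias, with hypothetical coefficients and a hypothetical correlation \rho between the included regressor x and the omitted z: the "short" regression slope converges not to \beta but to \beta + \rho\gamma.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 200, 4000
beta, gamma = 1.0, 0.8   # hypothetical coefficients on x and the omitted z
rho = 0.6                # hypothetical correlation between x and z

short_est = []
for _ in range(reps):
    x = rng.standard_normal(n)
    z = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    y = beta * x + gamma * z + rng.standard_normal(n)
    # "short" regression omitting z: the slope picks up rho * gamma
    short_est.append((x @ y) / (x @ x))

print(np.mean(short_est))   # close to beta + rho * gamma = 1.48
```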
42 Measures of Goodness of Fit

The simple R^2 is

    R^2 = 1 - SSR/TSS = 1 - \sum_{t=1}^n \hat{u}_t^2 / \sum_{t=1}^n (y_t - \bar{y})^2

If we look at the numerator: the more independent variables we have, the smaller the SSR becomes, because

    \sum_{t=1}^n \hat{u}_{k,t}^2 = min_{(\beta_1, ..., \beta_k)} \sum_{t=1}^n ( y_t - \sum_{j=1}^k x_{jt} \beta_j )^2
                               >= min_{(\beta_1, ..., \beta_{k+1})} \sum_{t=1}^n ( y_t - \sum_{j=1}^{k+1} x_{jt} \beta_j )^2
                                = \sum_{t=1}^n \hat{u}_{k+1,t}^2
43 The more independent variables you include (higher k), the smaller the SSR (sum of squared residuals) becomes. If k = n, the SSR becomes zero. So you can increase the R^2 simply by putting more and more stuff in as independent variables. Hence, R^2 is not a good indication of the appropriateness of the linear model. One needs to find a measure of goodness of fit that penalizes having many independent variables.
44 For the numerator, instead of SSR, use s^2, the unbiased estimator of Var(u_t) = \sigma^2. For the denominator, instead of TSS, use the unbiased estimator of Var(y):

    \bar{R}^2 = 1 - [ (1/(n-k)) \sum_{t=1}^n \hat{u}_t^2 ] / [ (1/(n-1)) \sum_{t=1}^n (y_t - \bar{y})^2 ]
              = 1 - (n-1) \sum_{t=1}^n \hat{u}_t^2 / [ (n-k) \sum_{t=1}^n (y_t - \bar{y})^2 ]

This is called the adjusted R^2. Given the same SSR, having more independent variables (higher k) reduces the adjusted R^2, so it puts a penalty on a high number of regressors. Notice that for very poorly fitting models, the adjusted R^2 can be negative.
45 Hypothesis Testing in Linear Regression Models

Suppose you have the following linear model:

    log(wage) = \beta_0 + \beta_1 educ + \beta_2 exp + \beta_3 ten + u

educ: education, exp: experience, ten: tenure on the job. Suppose you obtained a regression table reporting, for each variable (constant, education, experience, tenure), its coefficient estimate and standard error, along with the sample size (526) and the R^2.
46 Suppose you want to know whether the return to education is zero or not. That is, you want to test the hypothesis that the return to education is 0, given the estimated return and its standard error. If the estimated value is very different from zero, then you should reject. But what is the proper way to measure whether the estimate is very different from zero or not? Here the variance of the OLS estimate becomes important as well: if the OLS coefficient is accurately estimated, then even an estimate close to zero can lead one to reject the hypothesis of zero returns to education. In this case the standard error is 0.007, fairly accurate.
47 A potential candidate, for the hypothesis \beta_j = \beta^0, would be

    (\hat\beta_j - \beta^0) / st.dev(\hat\beta_j)

If, relative to the accuracy of the OLS estimate, \hat\beta_j is too far away from the hypothesized \beta^0, then we reject the hypothesis. Suppose the OLS estimator is normally distributed with mean \beta^0:

    \hat\beta_j ~ N(\beta^0, \sigma^2_{\beta_j})

Then

    z = (\hat\beta_j - \beta^0) / Var(\hat\beta_j)^{1/2} ~ N(0, 1)

One can then set up a rejection region with a critical value R_\beta > 0 such that we reject if

    | (\hat\beta_j - \beta^0) / Var(\hat\beta_j)^{1/2} | >= R_\beta
48 Then, we can choose R_\beta such that

    P(|z| >= R_\beta) = 0.05

Then, even if the hypothesis is true, i.e. \beta = \beta^0, the hypothesis will be rejected, i.e.

    | (\hat\beta_j - \beta^0) / Var(\hat\beta_j)^{1/2} | >= R_\beta

with probability 0.05. In this case, the probability of a Type I error (rejecting a true hypothesis) is 0.05. In other words, this test has a significance level of 0.05.
49 Type II error: suppose the null hypothesis \beta = \beta^0 is not true. Then mistakenly not rejecting the false null hypothesis is a Type II error, whose probability equals 1 minus the power of the test. R_\beta is called the critical value, and is often denoted c_\alpha, where \alpha is the significance level of the test. For example, if z ~ N(0, 1), the critical value c_\alpha for \alpha = 0.05 is

    c_{0.05} = 1.96

i.e.

    P(z <= -1.96 or z >= 1.96) = 0.05,   or   P(-1.96 <= z <= 1.96) = 1 - 0.05 = 0.95

Notice that because this is a two-tailed test,

    P(z >= 1.96) = 0.025,   P(z <= -1.96) = 0.025

That is, if \Phi() is the (cumulative) distribution function of N(0, 1),

    \Phi(c_\alpha) = 1 - \alpha/2
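The relation \Phi(c_\alpha) = 1 - \alpha/2 can be evaluated directly with Python's standard-library `statistics.NormalDist` (a minimal sketch; the 1.96 and 1.645 values are the standard two-tailed normal critical values):

```python
from statistics import NormalDist

def critical_value(alpha: float) -> float:
    """Two-tailed critical value: solves Phi(c_alpha) = 1 - alpha/2."""
    return NormalDist().inv_cdf(1 - alpha / 2)

c05 = critical_value(0.05)   # about 1.96
c10 = critical_value(0.10)   # about 1.645
print(round(c05, 3), round(c10, 3))
```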
51 P values

Suppose in the above example

    z = (\hat\beta_j - \beta^0) / Var(\hat\beta_j)^{1/2} = 1.96

Then the P-value is 0.05; you are exactly at the boundary of rejecting the hypothesis at significance level 0.05. If the P-value is 0.05, can you reject the hypothesis at a significance level of 0.1? The critical value for 0.1 is c_{0.1} = 1.645. Because 1.96 > 1.645, the hypothesis is rejected. What about a significance level of 0.025? c_{0.025} = 2.24 > 1.96, hence the hypothesis cannot be rejected. P-value (marginal significance): the smallest significance level at which the hypothesis can be rejected.
52 In general, the P-value for a two-tailed test with realized statistic \hat{z} is

    Prob(|z| > |\hat{z}|)

Then, if the distribution of z is standard normal,

    p(\hat{z}) = 2 (1 - \Phi(|\hat{z}|))

so that at significance level \alpha = p(\hat{z}), \Phi(|\hat{z}|) = 1 - \alpha/2. P-values preserve all the information from the estimation.
53 Normal Distribution

The density function of the standard normal distribution (with mean 0, variance 1) of u is

    f(u) = (1/\sqrt{2\pi}) exp(-u^2/2)

Consider now x, a normal random variable with mean \mu and variance \sigma^2. Then

    u = (x - \mu)/\sigma

Therefore, the density of x, g(x), satisfies

    g(x) dx = f(u) |du/dx| dx = (1/\sqrt{2\pi}) exp( -((x - \mu)/\sigma)^2 / 2 ) (1/\sigma) dx
            = (1/(\sigma\sqrt{2\pi})) exp( -(x - \mu)^2 / (2\sigma^2) ) dx
54 Joint Distribution

The joint distribution of independent standard normal random variables x_1 ~ N(0, 1), x_2 ~ N(0, 1) is

    f(x) = f(x_1) f(x_2) = (1/(\sqrt{2\pi})^2) exp( -(x_1^2 + x_2^2)/2 )
         = (1/(det(I)^{1/2} (\sqrt{2\pi})^2)) exp( -x^T x / 2 )

where x = (x_1, x_2)^T.
55 Now, consider a vector y which is normally distributed with mean \mu and variance matrix \Sigma. Let A be a matrix that satisfies A A^T = \Sigma. Then y can be expressed as

    y = \mu + A x

because then

    E(y) = \mu,   Var(y) = E[ (y - \mu)(y - \mu)^T ] = A I A^T = \Sigma

Also,

    x = A^{-1} (y - \mu),   dx = |det(A^{-1})| dy = det(\Sigma)^{-1/2} dy
56 Together, we obtain the joint normal distribution of y as

    g(y) dy = f( A^{-1}(y - \mu) ) |dx/dy| dy
            = (1/(det(\Sigma)^{1/2} (\sqrt{2\pi})^2)) exp( -(1/2)(y - \mu)^T \Sigma^{-1} (y - \mu) ) dy

Independence: suppose y_1, y_2 are jointly normally distributed but not correlated, i.e. \Sigma_{12} = \Sigma_{21} = 0. Then y_1, y_2 are independent. The converse is also true. This is because in this case

    (y - \mu)^T \Sigma^{-1} (y - \mu) = (y_1 - \mu_1)^2 / \Sigma_{11} + (y_2 - \mu_2)^2 / \Sigma_{22}

and det(\Sigma) = \Sigma_{11} \Sigma_{22}.
57 Therefore,

    g(y) = (1/(\sqrt{\Sigma_{11}} \sqrt{2\pi})) exp( -(y_1 - \mu_1)^2 / (2\Sigma_{11}) )
           \cdot (1/(\sqrt{\Sigma_{22}} \sqrt{2\pi})) exp( -(y_2 - \mu_2)^2 / (2\Sigma_{22}) )
         = g_1(y_1) g_2(y_2)

Therefore y_1, y_2 are independent of each other. We next show that the sum of two normal random variables is also normal. Consider the sum of two independent standard normally distributed random variables, y = x_1 + x_2. Then we can express x_2 = y - x_1. Therefore,
58

    f(x) dx = g(x_1, y) |det( d(x_1, x_2)/d(x_1, y) )| d(x_1, y)
            = (1/(\sqrt{2\pi})^2) exp( -(x_1^2 + (y - x_1)^2)/2 ) d(x_1, y)
            = (1/(\sqrt{2\pi})^2) exp( -(2(x_1 - y/2)^2 + y^2/2)/2 ) d(x_1, y)
            = (1/\sqrt{4\pi}) exp( -y^2/4 ) \cdot (1/\sqrt{\pi}) exp( -(x_1 - y/2)^2 ) d(x_1, y)

Hence we can see that, conditional on y, x_1 has mean y/2 and variance 1/2, and integrating over x_1 we get the distribution of y with the following functional form:

    (1/\sqrt{4\pi}) exp( -y^2/4 )

That is, y is normally distributed with mean 0 and variance 2.
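The conclusion y = x_1 + x_2 ~ N(0, 2) can be checked by a quick simulation (a minimal sketch; the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
x1 = rng.standard_normal(200000)
x2 = rng.standard_normal(200000)
y = x1 + x2

# y should behave like N(0, 2): mean near 0, variance near 2
print(y.mean(), y.var())
```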
59 Chi-Squared Distribution

Suppose z is a vector of m independently and identically distributed standard normal random variables. Then

    y = z^T z = \sum_{t=1}^m z_t^2

is chi-squared distributed with m degrees of freedom, i.e. y ~ \chi^2(m). Because of this, we can see that if y_1 ~ \chi^2(m_1) and y_2 ~ \chi^2(m_2), and y_1 and y_2 are independently distributed, then y_1 + y_2 is just like a sum of squares of m_1 + m_2 independently distributed standard normal random variables, so

    y_1 + y_2 ~ \chi^2(m_1 + m_2)
60 Quadratic Forms and the Chi-Squared Distribution

1. If an m-vector x ~ N(0, \Omega), then x^T \Omega^{-1} x ~ \chi^2(m). Let A be such that A A^T = \Omega. Then

    Var(A^{-1} x) = E[ A^{-1} x x^T (A^{-1})^T ] = A^{-1} \Omega (A^{-1})^T = A^{-1} A A^T (A^{-1})^T = I_m

Therefore

    A^{-1} x ~ N(0, I_m)

and therefore

    (A^{-1} x)^T A^{-1} x = x^T \Omega^{-1} x ~ \chi^2(m)
61 2. If P is a projection matrix with rank r and z ~ N(0, I_n), then z^T P z ~ \chi^2(r). Because P is a projection matrix, there exists an n by r matrix X such that

    P = X (X^T X)^{-1} X^T

Now, X^T z is an r by 1 vector, and

    X^T z ~ N(0, X^T X)

Therefore, from 1,

    z^T X (X^T X)^{-1} X^T z ~ \chi^2(r)
62 Student's t Distribution

If z ~ N(0, 1) and y ~ \chi^2(m), and z and y are independent, then

    t = z / (y/m)^{1/2}

has a Student's t distribution with m degrees of freedom.

F Distribution

If y_1 ~ \chi^2(m_1) and y_2 ~ \chi^2(m_2) and they are independent, then

    F = (y_1/m_1) / (y_2/m_2)

has an F distribution F(m_1, m_2) with degrees of freedom m_1, m_2.
63 Exact Tests in the Classical Normal Linear Model

    y = X\beta + u

Additional assumption: u is statistically independent of X and normally distributed, u ~ N(0, \sigma^2 I).

Test of a Single Restriction

    y = X_1 \beta_1 + X_2 \beta_2 + u,   u ~ N(0, \sigma^2 I)

Test of the hypothesis \beta_2 = \beta_2^0. OLS estimation:

    M_{X_1} y = M_{X_1} X_2 \beta_2 + M_{X_1} u
    \hat\beta_2 = (X_2^T M_{X_1} X_2)^{-1} X_2^T M_{X_1} y
    Var(\hat\beta_2) = \sigma^2 (X_2^T M_{X_1} X_2)^{-1}
64 Then, if the null hypothesis is true, the true parameter is \beta_2 = \beta_2^0, and the OLS coefficient is normally distributed with mean \beta_2^0 and variance

    Var(\hat\beta_2) = \sigma^2 (X_2^T M_{X_1} X_2)^{-1}

Then we know that

    (\hat\beta_2 - E[\hat\beta_2]) / Var(\hat\beta_2)^{1/2} ~ N(0, 1)

Therefore,

    (\hat\beta_2 - \beta_2^0) / ( \sigma (X_2^T M_{X_1} X_2)^{-1/2} ) ~ N(0, 1)
65 The problem is that we do not know \sigma^2. To deal with this, we use s^2, the unbiased estimate of \sigma^2, i.e.

    s^2 = (1/(n-k)) \sum_{t=1}^n \hat{u}_t^2 = y^T M_X y / (n - k)

But if you just substitute s^2 in the place where \sigma^2 used to be, the resulting distribution is no longer normal, because s^2 is not fixed; it is random. Next, we deal with this additional randomness.
66 Now, given X a rank-k, n by k matrix, consider an n by (n-k) matrix Z of rank n-k such that

    X^T Z = 0

Then, because [X, Z] is an n by n matrix with full rank,

    y = [X, Z] (\hat\beta^T, \hat\gamma^T)^T = X\hat\beta + Z\hat\gamma

has a solution. Notice that because of orthogonality, X^T Z\hat\gamma = 0.
67 Therefore, Z\hat\gamma is the residual \hat{u} of the OLS regression with X as the independent variables, i.e.

    \hat{u} = Z\hat\gamma = Z (Z^T Z)^{-1} Z^T y = Z (Z^T Z)^{-1} Z^T (X\beta + u) = Z (Z^T Z)^{-1} Z^T u

Then, because u/\sigma ~ N(0, I),

    SSR/\sigma^2 = u^T Z (Z^T Z)^{-1} Z^T u / \sigma^2 ~ \chi^2(n-k)

Then, because

    z = (\hat\beta_2 - \beta_2^0) / ( \sigma (X_2^T M_{X_1} X_2)^{-1/2} ) ~ N(0, 1)

and SSR/\sigma^2 ~ \chi^2(n-k), if z and SSR are independent,

    z / \sqrt{ SSR / (\sigma^2 (n-k)) } ~ t(n-k)
68 We next show that z and SSR are independent. First, we show that \hat\beta and \hat{u} are uncorrelated. This is because

    \hat{u} (\hat\beta - \beta)^T = ( I - X (X^T X)^{-1} X^T ) u u^T X (X^T X)^{-1}

Hence,

    E[ \hat{u} (\hat\beta - \beta)^T | X ] = \sigma^2 ( I - X (X^T X)^{-1} X^T ) X (X^T X)^{-1}
                                          = \sigma^2 [ X (X^T X)^{-1} - X (X^T X)^{-1} ] = 0

Because both z and \hat{u} are normally distributed, and they are uncorrelated with each other, they are independently distributed.
69 Therefore,

    (\hat\beta_2 - \beta_2^0) / ( s (X_2^T M_{X_1} X_2)^{-1/2} )
        = (\hat\beta_2 - \beta_2^0) / ( s^2 (X_2^T M_{X_1} X_2)^{-1} )^{1/2} ~ t(n-k)

where s^2 (X_2^T M_{X_1} X_2)^{-1} is the estimated variance of \hat\beta_2.
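The full-sample version of this t statistic, \hat\beta_j / se(\hat\beta_j) for the hypothesis \beta_j = 0, can be computed from the formulas above. A minimal sketch with a hypothetical design where the last coefficient is truly zero:

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
beta = np.array([1.0, 0.5, 0.0])   # hypothetical; last coefficient truly 0
y = X @ beta + rng.standard_normal(n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
u_hat = y - X @ b
s2 = (u_hat @ u_hat) / (n - k)            # unbiased variance estimate
se = np.sqrt(s2 * np.diag(XtX_inv))       # standard errors
t_stats = b / se                          # t statistics for H0: beta_j = 0

print(t_stats)   # large for the nonzero coefficients, moderate for the zero one
```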
70 Tests of Several Restrictions

Suppose that X_1 is an n by k_1 matrix and X_2 is an n by k_2 matrix, \beta_1 is a k_1 by 1 vector and \beta_2 is a k_2 by 1 vector with k_2 > 1. If we want to test \beta_2 = 0, the above t test does not work. So, instead, we use the F-distribution-based F-test. The null hypothesis is

    H_0: y = X_1 \beta_1 + u,   u ~ N(0, \sigma^2 I)

and the alternative hypothesis is

    H_1: y = X_1 \beta_1 + X_2 \beta_2 + u,   u ~ N(0, \sigma^2 I)
71 Instead, use the chi-squared distributions of the residuals. The residuals if the null hypothesis were true:

    \hat{u}_r^T \hat{u}_r / \sigma^2 ~ \chi^2(n - k_1)

The residuals under the alternative hypothesis:

    \hat{u}_u^T \hat{u}_u / \sigma^2 ~ \chi^2(n - k_1 - k_2)

Because of this, we are considering using the F-test, which involves two chi-squared distributed quantities. But for the F-test those two quantities need to be independent, and \hat{u}_r and \hat{u}_u are not independent of each other; thus neither are \hat{u}_r^T \hat{u}_r and \hat{u}_u^T \hat{u}_u. So we cannot use the above sums of squared residuals directly.
72 But we will show below that \hat{u}_u and \hat{u}_r - \hat{u}_u are actually uncorrelated, and thus independent because they are normally distributed. Therefore, we can do the F-test using

    \hat{u}_u^T \hat{u}_u / \sigma^2 ~ \chi^2(n - k_1 - k_2)

and

    (\hat{u}_r - \hat{u}_u)^T (\hat{u}_r - \hat{u}_u) / \sigma^2 ~ \chi^2(k_2)

and

    F = [ (\hat{u}_r - \hat{u}_u)^T (\hat{u}_r - \hat{u}_u) / k_2 ] / [ \hat{u}_u^T \hat{u}_u / (n - k_1 - k_2) ] ~ F(k_2, n - k_1 - k_2)
73 Then, the restricted sum of squared residuals (RSSR), i.e. imposing \beta_2 = 0, is

    RSSR = y^T M_{X_1} y

and without the restriction,

    \hat{u}_r = M_{X_1} y = M_{X_1} X_2 \hat\beta_2 + M_{X_1} \hat{u}_u = M_{X_1} X_2 \hat\beta_2 + \hat{u}_u

Next, we show that M_{X_1} X_2 \hat\beta_2 and \hat{u}_u are uncorrelated:

    E[ M_{X_1} X_2 (\hat\beta_2 - \beta_2) \hat{u}_u^T ]
        = E[ M_{X_1} X_2 (X_2^T M_{X_1} X_2)^{-1} X_2^T M_{X_1} u u^T M_X ]
        = \sigma^2 M_{X_1} X_2 (X_2^T M_{X_1} X_2)^{-1} X_2^T M_{X_1} M_X = 0
74 Therefore, \hat{u}_r - \hat{u}_u = M_{X_1} X_2 \hat\beta_2 and \hat{u}_u are uncorrelated, and thus independent of each other. Furthermore,

    \hat{u}_r - \hat{u}_u = M_{X_1} X_2 (X_2^T M_{X_1} X_2)^{-1} X_2^T M_{X_1} y = P_{M_{X_1} X_2} y

Under the hypothesis \beta_2 = 0 this equals P_{M_{X_1} X_2} u, and because P_{M_{X_1} X_2} is a projection matrix,

    (\hat{u}_r - \hat{u}_u)^T (\hat{u}_r - \hat{u}_u) / \sigma^2 = u^T P_{M_{X_1} X_2} u / \sigma^2 ~ \chi^2(k_2)

Finally,

    (\hat{u}_r - \hat{u}_u)^T \hat{u}_u = \hat\beta_2^T X_2^T M_{X_1} \hat{u}_u = 0

Therefore,

    (\hat{u}_r - \hat{u}_u)^T (\hat{u}_r - \hat{u}_u) = (\hat{u}_r - \hat{u}_u)^T \hat{u}_r
        = \hat{u}_r^T \hat{u}_r - \hat{u}_u^T ( \hat{u}_u + (\hat{u}_r - \hat{u}_u) )
        = \hat{u}_r^T \hat{u}_r - \hat{u}_u^T \hat{u}_u
75 Together, we have shown that

    F = [ (\hat{u}_r^T \hat{u}_r - \hat{u}_u^T \hat{u}_u) / k_2 ] / [ \hat{u}_u^T \hat{u}_u / (n - k_1 - k_2) ]
      = [ (RSSR - USSR) / k_2 ] / [ USSR / (n - k_1 - k_2) ] ~ F(k_2, n - k_1 - k_2)
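The (RSSR - USSR) form of the F statistic is easy to compute directly. A minimal sketch with hypothetical dimensions, simulated so that the null \beta_2 = 0 holds (so F should be an unremarkable draw from F(k_2, n - k_1 - k_2)):

```python
import numpy as np

rng = np.random.default_rng(8)
n, k1, k2 = 120, 2, 3
X1 = np.column_stack([np.ones(n), rng.standard_normal(n)])
X2 = rng.standard_normal((n, k2))
y = X1 @ np.array([1.0, 0.5]) + rng.standard_normal(n)   # beta_2 = 0 holds

def ssr(X, y):
    """Sum of squared residuals from regressing y on X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ b
    return u @ u

rssr = ssr(X1, y)                     # restricted: beta_2 = 0 imposed
ussr = ssr(np.hstack([X1, X2]), y)    # unrestricted
F = ((rssr - ussr) / k2) / (ussr / (n - k1 - k2))
print(F)
```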
76 Test of General Linear Restrictions

All linear restrictions on the parameters can be expressed as

    R\beta = 0

where \beta is a k by 1 vector, R is an r by k matrix of rank r (consisting of r linearly independent rows), and r is the number of restrictions. For example, the restriction \beta = 0 is

    R\beta = 0 with R = I_k

The restriction \beta_1 = 0, ..., \beta_{k_1} = 0 can be expressed similarly with R = [I_{k_1}, 0].
77 Then, as we have seen before, if we assume that u ~ N(0, \sigma^2 I), then

    \hat\beta ~ N( \beta, \sigma^2 (X^T X)^{-1} )

Therefore,

    E[ R\hat\beta | X ] = R\beta
    Var( R\hat\beta | X ) = R Var(\hat\beta) R^T = \sigma^2 R (X^T X)^{-1} R^T

Therefore, as we have seen before,

    (R\hat\beta - R\beta)^T [ R (X^T X)^{-1} R^T ]^{-1} (R\hat\beta - R\beta) / \sigma^2 ~ \chi^2(r)

The remaining thing to do is to substitute s^2 for \sigma^2, after which the statistic will be F-distributed.
78 As before, we need to show that s^2 and R\hat\beta are independent of each other. We first show that \hat{u} and \hat\beta are independent. They are both normally distributed, and

    E[ \hat{u} (\hat\beta - \beta)^T | X ] = \sigma^2 ( I - X (X^T X)^{-1} X^T ) X (X^T X)^{-1} = 0

Since they are both normally distributed and uncorrelated, they are independent. Therefore s^2 and \hat\beta are also independent, and thus s^2 and R\hat\beta are independent. Therefore,

    [ (R\hat\beta - R\beta)^T [ R (X^T X)^{-1} R^T ]^{-1} (R\hat\beta - R\beta) / r ] / s^2 ~ F(r, n - k)
More informationMultiple Regression Analysis. Part III. Multiple Regression Analysis
Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant
More informationSo far our focus has been on estimation of the parameter vector β in the. y = Xβ + u
Interval estimation and hypothesis tests So far our focus has been on estimation of the parameter vector β in the linear model y i = β 1 x 1i + β 2 x 2i +... + β K x Ki + u i = x iβ + u i for i = 1, 2,...,
More informationInstrumental Variables
Università di Pavia 2010 Instrumental Variables Eduardo Rossi Exogeneity Exogeneity Assumption: the explanatory variables which form the columns of X are exogenous. It implies that any randomness in the
More informationINTRODUCTORY ECONOMETRICS
INTRODUCTORY ECONOMETRICS Lesson 2b Dr Javier Fernández etpfemaj@ehu.es Dpt. of Econometrics & Statistics UPV EHU c J Fernández (EA3-UPV/EHU), February 21, 2009 Introductory Econometrics - p. 1/192 GLRM:
More informationSummer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.
Summer School in Statistics for Astronomers V June 1 - June 6, 2009 Regression Mosuk Chow Statistics Department Penn State University. Adapted from notes prepared by RL Karandikar Mean and variance Recall
More informationEconometric Methods. Prediction / Violation of A-Assumptions. Burcu Erdogan. Universität Trier WS 2011/2012
Econometric Methods Prediction / Violation of A-Assumptions Burcu Erdogan Universität Trier WS 2011/2012 (Universität Trier) Econometric Methods 30.11.2011 1 / 42 Moving on to... 1 Prediction 2 Violation
More informationIntroductory Econometrics
Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna November 23, 2013 Outline Introduction
More information9. Model Selection. statistical models. overview of model selection. information criteria. goodness-of-fit measures
FE661 - Statistical Methods for Financial Engineering 9. Model Selection Jitkomut Songsiri statistical models overview of model selection information criteria goodness-of-fit measures 9-1 Statistical models
More information[y i α βx i ] 2 (2) Q = i=1
Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation
More informationHeteroskedasticity and Autocorrelation
Lesson 7 Heteroskedasticity and Autocorrelation Pilar González and Susan Orbe Dpt. Applied Economics III (Econometrics and Statistics) Pilar González and Susan Orbe OCW 2014 Lesson 7. Heteroskedasticity
More informationRecent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data
Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)
More informationFinal Exam. Economics 835: Econometrics. Fall 2010
Final Exam Economics 835: Econometrics Fall 2010 Please answer the question I ask - no more and no less - and remember that the correct answer is often short and simple. 1 Some short questions a) For each
More informationRegression Analysis. y t = β 1 x t1 + β 2 x t2 + β k x tk + ϵ t, t = 1,..., T,
Regression Analysis The multiple linear regression model with k explanatory variables assumes that the tth observation of the dependent or endogenous variable y t is described by the linear relationship
More informationInterpreting Regression Results
Interpreting Regression Results Carlo Favero Favero () Interpreting Regression Results 1 / 42 Interpreting Regression Results Interpreting regression results is not a simple exercise. We propose to split
More informationPart IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015
Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationLecture 3: Multiple Regression
Lecture 3: Multiple Regression R.G. Pierse 1 The General Linear Model Suppose that we have k explanatory variables Y i = β 1 + β X i + β 3 X 3i + + β k X ki + u i, i = 1,, n (1.1) or Y i = β j X ji + u
More information3. Linear Regression With a Single Regressor
3. Linear Regression With a Single Regressor Econometrics: (I) Application of statistical methods in empirical research Testing economic theory with real-world data (data analysis) 56 Econometrics: (II)
More informationMa 3/103: Lecture 24 Linear Regression I: Estimation
Ma 3/103: Lecture 24 Linear Regression I: Estimation March 3, 2017 KC Border Linear Regression I March 3, 2017 1 / 32 Regression analysis Regression analysis Estimate and test E(Y X) = f (X). f is the
More informationANALYSIS OF VARIANCE AND QUADRATIC FORMS
4 ANALYSIS OF VARIANCE AND QUADRATIC FORMS The previous chapter developed the regression results involving linear functions of the dependent variable, β, Ŷ, and e. All were shown to be normally distributed
More informationEconometrics Summary Algebraic and Statistical Preliminaries
Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L
More informationAdvanced Econometrics I
Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics
More informationMultiple Linear Regression
Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there
More informationLinear Regression. September 27, Chapter 3. Chapter 3 September 27, / 77
Linear Regression Chapter 3 September 27, 2016 Chapter 3 September 27, 2016 1 / 77 1 3.1. Simple linear regression 2 3.2 Multiple linear regression 3 3.3. The least squares estimation 4 3.4. The statistical
More informationEC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)
1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For
More informationMultiple Linear Regression
Multiple Linear Regression Asymptotics Asymptotics Multiple Linear Regression: Assumptions Assumption MLR. (Linearity in parameters) Assumption MLR. (Random Sampling from the population) We have a random
More informationThe general linear regression with k explanatory variables is just an extension of the simple regression as follows
3. Multiple Regression Analysis The general linear regression with k explanatory variables is just an extension of the simple regression as follows (1) y i = β 0 + β 1 x i1 + + β k x ik + u i. Because
More informationTwo-Variable Regression Model: The Problem of Estimation
Two-Variable Regression Model: The Problem of Estimation Introducing the Ordinary Least Squares Estimator Jamie Monogan University of Georgia Intermediate Political Methodology Jamie Monogan (UGA) Two-Variable
More informationThe Multiple Regression Model Estimation
Lesson 5 The Multiple Regression Model Estimation Pilar González and Susan Orbe Dpt Applied Econometrics III (Econometrics and Statistics) Pilar González and Susan Orbe OCW 2014 Lesson 5 Regression model:
More informationWISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A
WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This
More informationEcon 620. Matrix Differentiation. Let a and x are (k 1) vectors and A is an (k k) matrix. ) x. (a x) = a. x = a (x Ax) =(A + A (x Ax) x x =(A + A )
Econ 60 Matrix Differentiation Let a and x are k vectors and A is an k k matrix. a x a x = a = a x Ax =A + A x Ax x =A + A x Ax = xx A We don t want to prove the claim rigorously. But a x = k a i x i i=
More informationMultiple Regression Analysis
Multiple Regression Analysis y = 0 + 1 x 1 + x +... k x k + u 6. Heteroskedasticity What is Heteroskedasticity?! Recall the assumption of homoskedasticity implied that conditional on the explanatory variables,
More informationEcon 836 Final Exam. 2 w N 2 u N 2. 2 v N
1) [4 points] Let Econ 836 Final Exam Y Xβ+ ε, X w+ u, w N w~ N(, σi ), u N u~ N(, σi ), ε N ε~ Nu ( γσ, I ), where X is a just one column. Let denote the OLS estimator, and define residuals e as e Y X.
More information. a m1 a mn. a 1 a 2 a = a n
Biostat 140655, 2008: Matrix Algebra Review 1 Definition: An m n matrix, A m n, is a rectangular array of real numbers with m rows and n columns Element in the i th row and the j th column is denoted by
More informationApplied Statistics and Econometrics
Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple
More informationReview of Classical Least Squares. James L. Powell Department of Economics University of California, Berkeley
Review of Classical Least Squares James L. Powell Department of Economics University of California, Berkeley The Classical Linear Model The object of least squares regression methods is to model and estimate
More information3 Multiple Linear Regression
3 Multiple Linear Regression 3.1 The Model Essentially, all models are wrong, but some are useful. Quote by George E.P. Box. Models are supposed to be exact descriptions of the population, but that is
More informationECON Introductory Econometrics. Lecture 16: Instrumental variables
ECON4150 - Introductory Econometrics Lecture 16: Instrumental variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 12 Lecture outline 2 OLS assumptions and when they are violated Instrumental
More informationQuantitative Analysis of Financial Markets. Summary of Part II. Key Concepts & Formulas. Christopher Ting. November 11, 2017
Summary of Part II Key Concepts & Formulas Christopher Ting November 11, 2017 christopherting@smu.edu.sg http://www.mysmu.edu/faculty/christophert/ Christopher Ting 1 of 16 Why Regression Analysis? Understand
More informationcoefficients n 2 are the residuals obtained when we estimate the regression on y equals the (simple regression) estimated effect of the part of x 1
Review - Interpreting the Regression If we estimate: It can be shown that: where ˆ1 r i coefficients β ˆ+ βˆ x+ βˆ ˆ= 0 1 1 2x2 y ˆβ n n 2 1 = rˆ i1yi rˆ i1 i= 1 i= 1 xˆ are the residuals obtained when
More information2. Linear regression with multiple regressors
2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions
More informationAppendix A: Review of the General Linear Model
Appendix A: Review of the General Linear Model The generallinear modelis an important toolin many fmri data analyses. As the name general suggests, this model can be used for many different types of analyses,
More informationMultiple Linear Regression
Multiple Linear Regression University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html 1 / 42 Passenger car mileage Consider the carmpg dataset taken from
More informationBusiness Statistics. Tommaso Proietti. Linear Regression. DEF - Università di Roma 'Tor Vergata'
Business Statistics Tommaso Proietti DEF - Università di Roma 'Tor Vergata' Linear Regression Specication Let Y be a univariate quantitative response variable. We model Y as follows: Y = f(x) + ε where
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationEconometrics - 30C00200
Econometrics - 30C00200 Lecture 11: Heteroskedasticity Antti Saastamoinen VATT Institute for Economic Research Fall 2015 30C00200 Lecture 11: Heteroskedasticity 12.10.2015 Aalto University School of Business
More informationSTAT 100C: Linear models
STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix
More informationEmpirical Economic Research, Part II
Based on the text book by Ramanathan: Introductory Econometrics Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna December 7, 2011 Outline Introduction
More informationSTAT 540: Data Analysis and Regression
STAT 540: Data Analysis and Regression Wen Zhou http://www.stat.colostate.edu/~riczw/ Email: riczw@stat.colostate.edu Department of Statistics Colorado State University Fall 205 W. Zhou (Colorado State
More informationAnswers to Problem Set #4
Answers to Problem Set #4 Problems. Suppose that, from a sample of 63 observations, the least squares estimates and the corresponding estimated variance covariance matrix are given by: bβ bβ 2 bβ 3 = 2
More information1 Appendix A: Matrix Algebra
Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix
More informationLecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)
Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16) 1 2 Model Consider a system of two regressions y 1 = β 1 y 2 + u 1 (1) y 2 = β 2 y 1 + u 2 (2) This is a simultaneous equation model
More informationHeteroskedasticity. Part VII. Heteroskedasticity
Part VII Heteroskedasticity As of Oct 15, 2015 1 Heteroskedasticity Consequences Heteroskedasticity-robust inference Testing for Heteroskedasticity Weighted Least Squares (WLS) Feasible generalized Least
More informationEstimating Estimable Functions of β. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 17
Estimating Estimable Functions of β Copyright c 202 Dan Nettleton (Iowa State University) Statistics 5 / 7 The Response Depends on β Only through Xβ In the Gauss-Markov or Normal Theory Gauss-Markov Linear
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7
MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is
More informationMIT Spring 2015
Regression Analysis MIT 18.472 Dr. Kempthorne Spring 2015 1 Outline Regression Analysis 1 Regression Analysis 2 Multiple Linear Regression: Setup Data Set n cases i = 1, 2,..., n 1 Response (dependent)
More information11 Hypothesis Testing
28 11 Hypothesis Testing 111 Introduction Suppose we want to test the hypothesis: H : A q p β p 1 q 1 In terms of the rows of A this can be written as a 1 a q β, ie a i β for each row of A (here a i denotes
More informationEconometrics Multiple Regression Analysis: Heteroskedasticity
Econometrics Multiple Regression Analysis: João Valle e Azevedo Faculdade de Economia Universidade Nova de Lisboa Spring Semester João Valle e Azevedo (FEUNL) Econometrics Lisbon, April 2011 1 / 19 Properties
More informationIntroductory Econometrics
Based on the textbook by Wooldridge: : A Modern Approach Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna December 11, 2012 Outline Heteroskedasticity
More informationEcon 510 B. Brown Spring 2014 Final Exam Answers
Econ 510 B. Brown Spring 2014 Final Exam Answers Answer five of the following questions. You must answer question 7. The question are weighted equally. You have 2.5 hours. You may use a calculator. Brevity
More informationLecture 11: Regression Methods I (Linear Regression)
Lecture 11: Regression Methods I (Linear Regression) Fall, 2017 1 / 40 Outline Linear Model Introduction 1 Regression: Supervised Learning with Continuous Responses 2 Linear Models and Multiple Linear
More informationCh 3: Multiple Linear Regression
Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery
More informationChapter 2: simple regression model
Chapter 2: simple regression model Goal: understand how to estimate and more importantly interpret the simple regression Reading: chapter 2 of the textbook Advice: this chapter is foundation of econometrics.
More information5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.
88 Chapter 5 Distribution Theory In this chapter, we summarize the distributions related to the normal distribution that occur in linear models. Before turning to this general problem that assumes normal
More informationLecture 11: Regression Methods I (Linear Regression)
Lecture 11: Regression Methods I (Linear Regression) 1 / 43 Outline 1 Regression: Supervised Learning with Continuous Responses 2 Linear Models and Multiple Linear Regression Ordinary Least Squares Statistical
More informationSample Problems. Note: If you find the following statements true, you should briefly prove them. If you find them false, you should correct them.
Sample Problems 1. True or False Note: If you find the following statements true, you should briefly prove them. If you find them false, you should correct them. (a) The sample average of estimated residuals
More informationWISE International Masters
WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are
More informationEssential of Simple regression
Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship
More information401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.
401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis
More informationIntroduction to Estimation Methods for Time Series models. Lecture 1
Introduction to Estimation Methods for Time Series models Lecture 1 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 1 SNS Pisa 1 / 19 Estimation
More information1. The Multivariate Classical Linear Regression Model
Business School, Brunel University MSc. EC550/5509 Modelling Financial Decisions and Markets/Introduction to Quantitative Methods Prof. Menelaos Karanasos (Room SS69, Tel. 08956584) Lecture Notes 5. The
More informationStatement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.
MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss
More informationLecture 6 Multiple Linear Regression, cont.
Lecture 6 Multiple Linear Regression, cont. BIOST 515 January 22, 2004 BIOST 515, Lecture 6 Testing general linear hypotheses Suppose we are interested in testing linear combinations of the regression
More informationRegression #3: Properties of OLS Estimator
Regression #3: Properties of OLS Estimator Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #3 1 / 20 Introduction In this lecture, we establish some desirable properties associated with
More informationMotivation for multiple regression
Motivation for multiple regression 1. Simple regression puts all factors other than X in u, and treats them as unobserved. Effectively the simple regression does not account for other factors. 2. The slope
More informationAdvanced Quantitative Methods: ordinary least squares
Advanced Quantitative Methods: Ordinary Least Squares University College Dublin 31 January 2012 1 2 3 4 5 Terminology y is the dependent variable referred to also (by Greene) as a regressand X are the
More informationMultiple Regression Analysis
Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators
More informationFinancial Econometrics
Material : solution Class : Teacher(s) : zacharias psaradakis, marian vavra Example 1.1: Consider the linear regression model y Xβ + u, (1) where y is a (n 1) vector of observations on the dependent variable,
More informationLecture 13: Simple Linear Regression in Matrix Format. 1 Expectations and Variances with Vectors and Matrices
Lecture 3: Simple Linear Regression in Matrix Format To move beyond simple regression we need to use matrix algebra We ll start by re-expressing simple linear regression in matrix form Linear algebra is
More informationEconometrics Honor s Exam Review Session. Spring 2012 Eunice Han
Econometrics Honor s Exam Review Session Spring 2012 Eunice Han Topics 1. OLS The Assumptions Omitted Variable Bias Conditional Mean Independence Hypothesis Testing and Confidence Intervals Homoskedasticity
More informationRegression and Statistical Inference
Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF
More informationQuick Review on Linear Multiple Regression
Quick Review on Linear Multiple Regression Mei-Yuan Chen Department of Finance National Chung Hsing University March 6, 2007 Introduction for Conditional Mean Modeling Suppose random variables Y, X 1,
More information