Supplementary Material. Dynamic Linear Panel Regression Models with Interactive Fixed Effects


Supplementary Material for "Dynamic Linear Panel Regression Models with Interactive Fixed Effects"

Hyungsik Roger Moon* and Martin Weidner**

August 3, 2015

* Department of Economics and USC Dornsife INET, University of Southern California, Los Angeles, CA, moonr@usc.edu, and Department of Economics, Yonsei University, Seoul, Korea.
** Department of Economics, University College London, Gower Street, London WC1E 6BT, U.K., and CeMMaP, m.weidner@ucl.ac.uk.

S.1 Proof of the Identification Theorem

Proof of the identification theorem in the main text. Let
$$Q(\beta, \lambda, f) \equiv \mathbb{E}\left[ \big\| Y - \beta \cdot X - \lambda f' \big\|_F^2 \,\Big|\, \lambda^0, f^0, w \right],$$
where $\beta \in \mathbb{R}^K$, $\lambda \in \mathbb{R}^{N \times R}$ and $f \in \mathbb{R}^{T \times R}$. We have
$$
\begin{aligned}
Q(\beta, \lambda, f)
&= \mathbb{E}\left\{ \operatorname{Tr}\left[ \big( Y - \beta \cdot X - \lambda f' \big)' \big( Y - \beta \cdot X - \lambda f' \big) \right] \Big| \lambda^0, f^0, w \right\} \\
&= \mathbb{E}\left\{ \operatorname{Tr}\left[ \big( \lambda^0 f^{0\prime} - \lambda f' - (\beta - \beta^0) \cdot X + e \big)' \big( \lambda^0 f^{0\prime} - \lambda f' - (\beta - \beta^0) \cdot X + e \big) \right] \Big| \lambda^0, f^0, w \right\} \\
&= \mathbb{E}\left[ \operatorname{Tr}\big( e' e \big) \Big| \lambda^0, f^0, w \right]
  + \underbrace{\mathbb{E}\left\{ \operatorname{Tr}\left[ \big( \lambda^0 f^{0\prime} - \lambda f' - (\beta - \beta^0) \cdot X \big)' \big( \lambda^0 f^{0\prime} - \lambda f' - (\beta - \beta^0) \cdot X \big) \right] \Big| \lambda^0, f^0, w \right\}}_{\equiv\, Q^*(\beta, \lambda, f)} .
\end{aligned}
$$

In the last step we used Assumption ID(ii). Because $\mathbb{E}\left[ \operatorname{Tr}(e'e) \mid \lambda^0, f^0, w \right]$ is independent of $(\beta, \lambda, f)$, we find that minimizing $Q(\beta,\lambda,f)$ is equivalent to minimizing $Q^*(\beta,\lambda,f)$. We decompose $Q^*(\beta,\lambda,f)$ as follows:
$$
\begin{aligned}
Q^*(\beta,\lambda,f)
&= \mathbb{E}\left\{ \operatorname{Tr}\left[ \big( \lambda^0 f^{0\prime} - \lambda f' - (\beta-\beta^0)\cdot X \big)' M_{(\lambda,\lambda^0,w)} \big( \lambda^0 f^{0\prime} - \lambda f' - (\beta-\beta^0)\cdot X \big) \right] \Big| \lambda^0,f^0,w \right\} \\
&\quad + \mathbb{E}\left\{ \operatorname{Tr}\left[ \big( \lambda^0 f^{0\prime} - \lambda f' - (\beta-\beta^0)\cdot X \big)' P_{(\lambda,\lambda^0,w)} \big( \lambda^0 f^{0\prime} - \lambda f' - (\beta-\beta^0)\cdot X \big) \right] \Big| \lambda^0,f^0,w \right\} \\
&= \underbrace{\mathbb{E}\left\{ \operatorname{Tr}\left[ \big( (\beta_{\mathrm{high}}-\beta^0_{\mathrm{high}})\cdot X_{\mathrm{high}} \big)' M_{(\lambda,\lambda^0,w)} \big( (\beta_{\mathrm{high}}-\beta^0_{\mathrm{high}})\cdot X_{\mathrm{high}} \big) \right] \Big| \lambda^0,f^0,w \right\}}_{\equiv\, Q^*_{\mathrm{high}}(\beta_{\mathrm{high}},\lambda)} \\
&\quad + \underbrace{\mathbb{E}\left\{ \operatorname{Tr}\left[ \big( \lambda^0 f^{0\prime} - \lambda f' - (\beta-\beta^0)\cdot X \big)' P_{(\lambda,\lambda^0,w)} \big( \lambda^0 f^{0\prime} - \lambda f' - (\beta-\beta^0)\cdot X \big) \right] \Big| \lambda^0,f^0,w \right\}}_{\equiv\, Q^*_{\mathrm{low}}(\beta,\lambda,f)} ,
\end{aligned}
$$
where $(\beta_{\mathrm{high}}-\beta^0_{\mathrm{high}})\cdot X_{\mathrm{high}} = \sum_{m=K_1+1}^{K} (\beta_m - \beta^0_m) X_m$. (The projector $M_{(\lambda,\lambda^0,w)}$ annihilates $\lambda f'$, $\lambda^0 f^{0\prime}$ and the low-rank regressors, whose loadings lie in the span of $w$, which is why only the high-rank regressors remain in the first term.) A lower bound on $Q^*_{\mathrm{high}}(\beta_{\mathrm{high}},\lambda)$ is given by
$$
\begin{aligned}
Q^*_{\mathrm{high}}(\beta_{\mathrm{high}},\lambda)
&\geq \min_{\widetilde\lambda \in \mathbb{R}^{N \times (2R + \operatorname{rank}(w))}} \mathbb{E}\left\{ \operatorname{Tr}\left[ \big( (\beta_{\mathrm{high}}-\beta^0_{\mathrm{high}})\cdot X_{\mathrm{high}} \big)' M_{\widetilde\lambda} \big( (\beta_{\mathrm{high}}-\beta^0_{\mathrm{high}})\cdot X_{\mathrm{high}} \big) \right] \Big| \lambda^0,f^0,w \right\} \\
&= \sum_{r = 2R + \operatorname{rank}(w) + 1}^{\min(N,T)} \mu_r\!\left\{ \mathbb{E}\left[ \big( (\beta_{\mathrm{high}}-\beta^0_{\mathrm{high}})\cdot X_{\mathrm{high}} \big) \big( (\beta_{\mathrm{high}}-\beta^0_{\mathrm{high}})\cdot X_{\mathrm{high}} \big)' \Big| \lambda^0,f^0,w \right] \right\} .
\end{aligned} \tag{S.1.1}
$$
Because $Q^*(\beta,\lambda,f)$, $Q^*_{\mathrm{high}}(\beta_{\mathrm{high}},\lambda)$, and $Q^*_{\mathrm{low}}(\beta,\lambda,f)$ are expectations of traces of positive semi-definite matrices, we have $Q^*(\beta,\lambda,f) \geq 0$, $Q^*_{\mathrm{high}}(\beta_{\mathrm{high}},\lambda) \geq 0$, and $Q^*_{\mathrm{low}}(\beta,\lambda,f) \geq 0$ for all $\beta, \lambda, f$. Let $\widehat\beta$, $\widehat\lambda$ and $\widehat f$ be the parameter values that minimize $Q(\beta,\lambda,f)$, and thus also $Q^*(\beta,\lambda,f)$. Because $Q^*(\beta^0,\lambda^0,f^0) = 0$ we have $Q^*(\widehat\beta,\widehat\lambda,\widehat f) = \min_{\beta,\lambda,f} Q^*(\beta,\lambda,f) = 0$. This implies $Q^*_{\mathrm{high}}(\widehat\beta_{\mathrm{high}},\widehat\lambda) = 0$ and $Q^*_{\mathrm{low}}(\widehat\beta,\widehat\lambda,\widehat f) = 0$.
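For completeness, the eigenvalue identity used in the last step of (S.1.1) is the following standard principal-components fact; it corresponds to the equality $S_3(Z) = S_6(Z)$ established in the proof of Lemma A.1 in Section S.4 below. For any $N \times T$ matrix $A$ and any integer $\widetilde R$,
$$\min_{\widetilde\lambda \in \mathbb{R}^{N \times \widetilde R}} \operatorname{Tr}\big( A' M_{\widetilde\lambda} A \big) = \sum_{r = \widetilde R + 1}^{\min(N,T)} \mu_r\big( A A' \big) ,$$
and the minimum is attained when the columns of $\widetilde\lambda$ span the space of the $\widetilde R$ leading eigenvectors of $A A'$.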

Assumption ID(v), the lower bound (S.1.1), and $Q^*_{\mathrm{high}}(\widehat\beta_{\mathrm{high}},\widehat\lambda) = 0$ imply $\widehat\beta_{\mathrm{high}} = \beta^0_{\mathrm{high}}$. Using this (with $\widehat\beta_{\mathrm{high}} = \beta^0_{\mathrm{high}}$ the term $Q^*_{\mathrm{high}}$ vanishes identically, so $Q^*_{\mathrm{low}}$ equals $Q^*$ at the minimizer), we find
$$
\begin{aligned}
Q^*_{\mathrm{low}}(\widehat\beta,\widehat\lambda,\widehat f)
&= \mathbb{E}\left\{ \operatorname{Tr}\left[ \big( \lambda^0 f^{0\prime} - \widehat\lambda \widehat f' - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big)' \big( \lambda^0 f^{0\prime} - \widehat\lambda \widehat f' - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) \right] \Big| \lambda^0,f^0,w \right\} \\
&\geq \min_{f} \mathbb{E}\left\{ \operatorname{Tr}\left[ \big( \lambda^0 f^{0\prime} - \widehat\lambda f' - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big)' \big( \lambda^0 f^{0\prime} - \widehat\lambda f' - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) \right] \Big| \lambda^0,f^0,w \right\} \\
&= \mathbb{E}\left\{ \operatorname{Tr}\left[ \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big)' M_{\widehat\lambda} \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) \right] \Big| \lambda^0,f^0,w \right\} ,
\end{aligned} \tag{S.1.2}
$$
where $(\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} = \sum_{l=1}^{K_1} (\beta_l - \beta^0_l) X_l$. Because $Q^*_{\mathrm{low}}(\widehat\beta,\widehat\lambda,\widehat f) = 0$ and the last expression in (S.1.2) is non-negative, we must have
$$\mathbb{E}\left\{ \operatorname{Tr}\left[ \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big)' M_{\widehat\lambda} \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) \right] \Big| \lambda^0,f^0,w \right\} = 0 .$$
Using $M_{\widehat\lambda} = M_{\widehat\lambda} M_{\widehat\lambda}$ and the cyclicality of the trace we obtain from the last equality
$$\operatorname{Tr}\big( M_{\widehat\lambda} A M_{\widehat\lambda} \big) = 0 , \qquad \text{where} \quad A = \mathbb{E}\left[ \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big)' \Big| \lambda^0,f^0,w \right] .$$
The trace of a positive semi-definite matrix is equal to zero only if the matrix itself is equal to zero, so we find $M_{\widehat\lambda} A M_{\widehat\lambda} = 0$. Together with the fact that $A$ itself is positive semi-definite this implies $A = P_{\widehat\lambda} A P_{\widehat\lambda}$ (note that $A$ positive semi-definite implies $A = CC'$ for some matrix $C$, and $M_{\widehat\lambda} A M_{\widehat\lambda} = 0$ then implies $M_{\widehat\lambda} C = 0$, i.e., $C = P_{\widehat\lambda} C$), and therefore $\operatorname{rank}(A) \leq \operatorname{rank}(P_{\widehat\lambda}) \leq R$. We have thus shown
$$\operatorname{rank}\left\{ \mathbb{E}\left[ \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big)' \Big| \lambda^0,f^0,w \right] \right\} \leq R .$$

We furthermore find, writing $(\cdots)$ for a repetition of the preceding factor,
$$
\begin{aligned}
R &\geq \operatorname{rank}\left\{ \mathbb{E}\left[ \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) \big(\cdots\big)' \Big| \lambda^0,f^0,w \right] \right\} \\
&\geq \operatorname{rank}\left\{ M_w\, \mathbb{E}\left[ \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) P_{f^0} \big(\cdots\big)' \Big| \lambda^0,f^0,w \right] M_w \right\}
 + \operatorname{rank}\left\{ P_w\, \mathbb{E}\left[ \big( \lambda^0 f^{0\prime} - (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) M_{f^0} \big(\cdots\big)' \Big| \lambda^0,f^0,w \right] P_w \right\} \\
&= \operatorname{rank}\left[ M_w\, \lambda^0 f^{0\prime} f^0 \lambda^{0\prime}\, M_w \right]
 + \operatorname{rank}\left\{ \mathbb{E}\left[ \big( (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) M_{f^0} \big( (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big)' \Big| \lambda^0,f^0,w \right] \right\} .
\end{aligned}
$$
Assumption ID(iv) guarantees $\operatorname{rank}\left( M_w \lambda^0 f^{0\prime} f^0 \lambda^{0\prime} M_w \right) = \operatorname{rank}\left( \lambda^0 f^{0\prime} f^0 \lambda^{0\prime} \right) = R$, that is, we must have
$$\mathbb{E}\left[ \big( (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big) M_{f^0} \big( (\widehat\beta_{\mathrm{low}}-\beta^0_{\mathrm{low}})\cdot X_{\mathrm{low}} \big)' \Big| \lambda^0,f^0,w \right] = 0 .$$
According to Assumption ID(iii) this implies $\widehat\beta_{\mathrm{low}} = \beta^0_{\mathrm{low}}$, i.e., we have $\widehat\beta = \beta^0$. This also implies $Q^*(\widehat\beta,\widehat\lambda,\widehat f) = \big\| \lambda^0 f^{0\prime} - \widehat\lambda \widehat f' \big\|_F^2 = 0$, and therefore $\widehat\lambda \widehat f' = \lambda^0 f^{0\prime}$.

S.2 Examples of Error Distributions

The following lemma provides examples of error distributions that satisfy $\|e\| = O_p\big(\sqrt{\max(N,T)}\big)$ as $N, T \to \infty$. Example (i) is particularly relevant for us, because those assumptions on $e_{it}$ are imposed in Assumption 5 in the main text, i.e., under those main text assumptions we indeed have $\|e\| = O_p\big(\sqrt{\max(N,T)}\big)$.

Lemma S.2.1. For each of the following distributional assumptions on the errors $e_{it}$, $i = 1,\dots,N$, $t = 1,\dots,T$, we have $\|e\| = O_p\big(\sqrt{\max(N,T)}\big)$.

(i) The $e_{it}$ are independent across $i$ and $t$, conditional on $\mathcal{C}$, and satisfy $\mathbb{E}(e_{it} \mid \mathcal{C}) = 0$, and $\mathbb{E}(e_{it}^4 \mid \mathcal{C})$ is bounded uniformly by a non-random constant, uniformly over $i, t$ and $N, T$. Here $\mathcal{C}$ can be any conditioning sigma-field, including the empty one (corresponding to unconditional expectations).

(ii) The $e_{it}$ follow different MA($\infty$) processes for each $i$, namely
$$e_{it} = \sum_{\tau=0}^{\infty} \psi_{i\tau}\, u_{i,t-\tau}, \qquad \text{for } i = 1,\dots,N,\; t = 1,\dots,T, \tag{S.2.1}$$
where the $u_{it}$ are independent random variables with $\mathbb{E} u_{it} = 0$ and $\mathbb{E} u_{it}^4$ uniformly bounded across $i, t$ and $N, T$. The coefficients $\psi_{i\tau}$ satisfy
$$\sum_{\tau=0}^{\infty} \tau\, \max_{i=1,\dots,N} |\psi_{i\tau}| < B, \qquad \max_{i=1,\dots,N} \sum_{\tau=0}^{\infty} |\psi_{i\tau}| < B, \tag{S.2.2}$$
for a finite constant $B$ which is independent of $N$ and $T$.

(iii) The error matrix $e$ is generated as $e = \sigma^{1/2}\, u\, \Sigma^{1/2}$, where $u$ is an $N \times T$ matrix with independently distributed entries $u_{it}$ and $\mathbb{E} u_{it} = 0$, $\mathbb{E} u_{it}^2 = 1$, and $\mathbb{E} u_{it}^4$ is bounded uniformly across $i, t$ and $N, T$. Here $\sigma$ is the $N \times N$ cross-sectional covariance matrix, $\Sigma$ is the $T \times T$ time-serial covariance matrix, and they satisfy
$$\max_{j=1,\dots,N} \sum_{i=1}^{N} |\sigma_{ij}| < B, \qquad \max_{\tau=1,\dots,T} \sum_{t=1}^{T} |\Sigma_{t\tau}| < B, \tag{S.2.3}$$
for some finite constant $B$ which is independent of $N$ and $T$. In this example we have $\mathbb{E}(e_{it} e_{j\tau}) = \sigma_{ij} \Sigma_{t\tau}$.

Proof of Lemma S.2.1, Example (i). Latała (2005) showed that for an $N \times T$ matrix $e$ with independent entries, conditional on $\mathcal{C}$, we have
$$\mathbb{E}\big( \|e\| \,\big|\, \mathcal{C} \big) \leq c \left[ \max_i \Big( \sum_t \mathbb{E}\big( e_{it}^2 \,\big|\, \mathcal{C} \big) \Big)^{1/2} + \max_t \Big( \sum_i \mathbb{E}\big( e_{it}^2 \,\big|\, \mathcal{C} \big) \Big)^{1/2} + \Big( \sum_{i,t} \mathbb{E}\big( e_{it}^4 \,\big|\, \mathcal{C} \big) \Big)^{1/4} \right],$$
where $c$ is some universal constant. Because we assumed uniformly bounded 4th conditional moments for $e_{it}$ we thus have $\|e\| = O_p(\sqrt{T}) + O_p(\sqrt{N}) + O_p\big( (TN)^{1/4} \big) = O_p\big( \sqrt{\max(N,T)} \big)$.

Example (ii). Let $\psi_j = (\psi_{1j}, \dots, \psi_{Nj})'$ be an $N$-vector for each $j \geq 0$. Let $U_j$ be the $N \times T$ sub-matrix of $(u_{it})$ consisting of $u_{it}$, $i = 1,\dots,N$, $t = 1-j, \dots, T-j$. We can then write equation (S.2.1) in matrix notation as
$$e = \sum_{j=0}^{\infty} \operatorname{diag}(\psi_j)\, U_j = \sum_{j=0}^{T} \operatorname{diag}(\psi_j)\, U_j + r ,$$

where we cut the sum at $T$, which results in the remainder $r = \sum_{j=T+1}^{\infty} \operatorname{diag}(\psi_j)\, U_j$. When approximating an MA($\infty$) by a finite MA($T$) process we have for the remainder
$$
\mathbb{E}\, \|r\|_F^2
= \sum_{i=1}^{N} \sum_{t=1}^{T} \mathbb{E}\big( r_{it}^2 \big)
= \sigma_u^2 \sum_{i=1}^{N} \sum_{t=1}^{T} \sum_{j=T+1}^{\infty} \psi_{ij}^2
\leq \sigma_u^2\, N\, T \Big( \max_i \sum_{j=T+1}^{\infty} |\psi_{ij}| \Big)^2
\leq \frac{\sigma_u^2\, N}{T} \Big( \sum_{j=T+1}^{\infty} j\, \max_i |\psi_{ij}| \Big)^2 ,
$$
where $\sigma_u^2$ is the variance of $u_{it}$. Therefore, for $T \to \infty$ we have $\mathbb{E}\big( \|r\|_F^2 \big)/N \to 0$, which implies $\|r\|_F^2 = o_p(N)$, and therefore $\|r\|^2 \leq \|r\|_F^2 = o_p(N)$.

Let $V$ be the $N \times 2T$ matrix consisting of $u_{it}$, $i = 1,\dots,N$, $t = 1-T, \dots, T$. For $j = 0, \dots, T$ the matrices $U_j$ are sub-matrices of $V$, and therefore $\|U_j\| \leq \|V\|$. From example (i) we know $\|V\| = O_p\big( \sqrt{\max(N,T)} \big)$. Furthermore, we know $\|\operatorname{diag}(\psi_j)\| \leq \max_i |\psi_{ij}|$. Combining these results we find
$$
\|e\| \leq \Big\| \sum_{j=0}^{T} \operatorname{diag}(\psi_j)\, U_j \Big\| + \|r\|
\leq \Big[ \sum_{j=0}^{T} \max_i |\psi_{ij}| \Big] \|V\| + o_p(\sqrt{N})
\leq O_p\big( \sqrt{\max(N,T)} \big) + o_p(\sqrt{N})
= O_p\big( \sqrt{\max(N,T)} \big),
$$
where the term in brackets is bounded by a constant under (S.2.2). This is as required for the proof.

Example (iii). Because $\sigma$ and $\Sigma$ are positive definite, there exist a symmetric $N \times N$ matrix $\phi$ and a symmetric $T \times T$ matrix $\psi$ such that $\sigma = \phi^2$ and $\Sigma = \psi^2$. The error term can then be generated as $e = \phi\, u\, \psi$, where $u$ is an $N \times T$ matrix with independent entries $u_{it}$ such that $\mathbb{E} u_{it} = 0$ and $\mathbb{E} u_{it}^4 < \infty$. Given this definition of $e$ we immediately have $\mathbb{E}(e_{it}) = 0$ and $\mathbb{E}(e_{it} e_{j\tau}) = \sigma_{ij} \Sigma_{t\tau}$. What is left to show is $\|e\| = O_p\big( \sqrt{\max(N,T)} \big)$. From example (i) we know $\|u\| = O_p\big( \sqrt{\max(N,T)} \big)$. Using the inequality $\|\sigma\|^2 \leq \|\sigma\|_1 \|\sigma\|_\infty = \|\sigma\|_1^2$ (where $\|\sigma\|_\infty = \|\sigma\|_1$ because $\sigma$ is symmetric) we find
$$\|\sigma\| \leq \|\sigma\|_1 = \max_{j=1,\dots,N} \sum_{i=1}^{N} |\sigma_{ij}| < B ,$$
and analogously $\|\Sigma\| < B$. Because $\sigma = \phi^2$ and $\Sigma = \psi^2$, we thus find $\|e\| \leq \|\phi\|\, \|u\|\, \|\psi\| \leq B\, O_p\big( \sqrt{\max(N,T)} \big)$, i.e., $\|e\| = O_p\big( \sqrt{\max(N,T)} \big)$.
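The conclusion of Lemma S.2.1 is easy to see in a small simulation: for i.i.d. errors as in example (i), the operator norm grows like $\sqrt{\max(N,T)}$. The following sketch is illustrative only; it assumes numpy is available and is not part of the original material.

```python
import numpy as np

# Numerical illustration of Lemma S.2.1(i): for an N x T matrix e with
# independent mean-zero entries and bounded fourth moments, the operator
# (spectral) norm satisfies ||e|| = O_p(sqrt(max(N, T))).
rng = np.random.default_rng(0)

for n, t in [(100, 50), (200, 100), (400, 200), (800, 400)]:
    e = rng.standard_normal((n, t))          # i.i.d. N(0,1) errors
    op_norm = np.linalg.norm(e, ord=2)       # largest singular value
    ratio = op_norm / np.sqrt(max(n, t))
    print(f"N={n:4d}, T={t:4d}: ||e|| = {op_norm:7.2f}, "
          f"||e||/sqrt(max(N,T)) = {ratio:5.2f}")
# The ratio stabilises (around 1 + sqrt(T/N) here) instead of growing,
# consistent with ||e|| = O_p(sqrt(max(N,T))).
```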

S.3 Comments on Assumption 4 on the Regressors

Consistency of the LS estimator $\widehat\beta$ requires that the regressors not only satisfy the standard non-collinearity condition in Assumption 4(i), but also the additional conditions on high- and low-rank regressors in Assumption 4(ii). Bai (2009) considers the special cases of only high-rank and only low-rank regressors. As low-rank regressors he considers only cross-sectionally invariant and time-invariant regressors, and he shows that if only these two types of regressors are present, one can show consistency under the assumption $\operatorname{plim}_{N,T\to\infty} W > 0$ on the regressors instead of Assumption 4, where $W$ is the $K \times K$ matrix defined by
$$W_{k_1 k_2} = \frac{1}{NT} \operatorname{Tr}\big( M_{f^0}\, X_{k_1}'\, M_{\lambda^0}\, X_{k_2} \big).$$
This matrix appears as the approximate Hessian in the profile objective expansion theorem of the main text, i.e., the condition $\operatorname{plim}_{N,T\to\infty} W > 0$ is very natural in the context of interactive fixed effect models, and one may wonder whether also in the general case one can replace Assumption 4 with this weaker condition and still obtain consistency of the LS estimator. Unfortunately, this is not the case, and below we present two simple counter-examples that show this.

(i) Let there only be one factor ($R = 1$), $f_t^0$, with corresponding factor loadings $\lambda_i^0$. Let there only be one regressor ($K = 1$) of the form $X_{it} = w_i v_t + \lambda_i^0 f_t^0$. Assume that the $N$-vector $w = (w_1, \dots, w_N)'$ and the $T$-vector $v = (v_1, \dots, v_T)'$ are such that the $N \times 2$ matrix $\Lambda = (\lambda^0, w)$ and the $T \times 2$ matrix $F = (f^0, v)$ satisfy $\operatorname{plim}_{N,T\to\infty} \Lambda'\Lambda/N > 0$ and $\operatorname{plim}_{N,T\to\infty} F'F/T > 0$. In this case we have $W = \frac{1}{NT} \operatorname{Tr}\big( M_{f^0}\, v w'\, M_{\lambda^0}\, w v' \big)$, and therefore $\operatorname{plim}_{N,T\to\infty} W > 0$. However, $\beta$ is not identified, because $\beta^0 X + \lambda^0 f^{0\prime} = (\beta^0 + 1) X - w v'$, i.e., it is not possible to distinguish $(\beta, \lambda, f) = (\beta^0, \lambda^0, f^0)$ and $(\beta, \lambda, f) = (\beta^0 + 1, -w, v)$. This implies that the LS estimator is not consistent (both $\beta^0$ and $\beta^0 + 1$ could be the true parameter, but the LS estimator cannot be consistent for both); see also the numerical sketch at the end of this section.

(ii) Let there only be one factor ($R = 1$), $f_t^0$, with corresponding factor loadings $\lambda_i^0$. Let the $N$-vectors $\lambda^0$, $w_1$ and $w_2$ be such that $\Lambda = (\lambda^0, w_1, w_2)$ satisfies $\operatorname{plim}_{N,T\to\infty} \Lambda'\Lambda/N > 0$. Let the $T$-vectors $f^0$, $v_1$ and $v_2$ be such that $F = (f^0, v_1, v_2)$ satisfies $\operatorname{plim}_{N,T\to\infty} F'F/T > 0$. Let there be four regressors ($K = 4$) defined by $X_1 = w_1 v_2'$, $X_2 = w_2 v_1'$, $X_3 = (w_1 + \lambda^0)(v_1 + f^0)'$, and $X_4 = (w_2 + \lambda^0)(v_2 + f^0)'$. In this case one can easily check that $\operatorname{plim}_{N,T\to\infty} W > 0$. However, $\beta_k$ is again not identified, because
$$\sum_{k=1}^{4} \beta_k^0 X_k + \lambda^0 f^{0\prime} = \sum_{k=1}^{4} \big( \beta_k^0 + 1 \big) X_k - \big( \lambda^0 + w_1 + w_2 \big) \big( f^0 + v_1 + v_2 \big)' ,$$
i.e., we cannot distinguish between the true parameters and $(\beta, \lambda, f) = \big( \beta^0 + 1, \, -(\lambda^0 + w_1 + w_2), \, f^0 + v_1 + v_2 \big)$. Again, as a consequence, the LS estimator is not consistent in this case.

In example (ii) there are only low-rank regressors, with $\operatorname{rank}(X_l) = 1$. One can easily check that Assumption 4 is not satisfied in this example. In example (i) the regressor is a low-rank regressor with $\operatorname{rank}(X) = 2$. In our present version of Assumption 4 we only consider low-rank regressors with $\operatorname{rank}(X_l) = 1$, but as already noted in a footnote in the main paper it is straightforward to extend the assumption and the consistency proof to low-rank regressors with rank larger than one. Independent of whether we extend the assumption or not, the regressor $X$ of example (i) fails to satisfy Assumption 4. This justifies our formulation of Assumption 4, because it shows that in general the assumption cannot be replaced by the weaker condition $\operatorname{plim}_{N,T\to\infty} W > 0$.
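Both counter-examples can be checked numerically. The sketch below (assuming numpy; the particular draws, dimensions and coefficient values are illustrative only) verifies for example (i) that $W > 0$ while two distinct parameter configurations generate identical regression functions, and verifies the corresponding identity for example (ii).

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 300, 200

def proj_orth(a):
    """Projector onto the orthogonal complement of the span of the vector a."""
    return np.eye(a.size) - np.outer(a, a) / (a @ a)

# --- Counter-example (i): X = w v' + lam0 f0' -------------------------------
lam0, w = rng.standard_normal(N), rng.standard_normal(N)
f0, v = rng.standard_normal(T), rng.standard_normal(T)
beta0 = 0.5

X = np.outer(w, v) + np.outer(lam0, f0)
W = np.trace(proj_orth(f0) @ X.T @ proj_orth(lam0) @ X) / (N * T)
print("example (i): W =", W)                     # strictly positive

lhs = beta0 * X + np.outer(lam0, f0)
rhs = (beta0 + 1.0) * X + np.outer(-w, v)        # same fit with beta0 + 1
print("example (i): max |lhs - rhs| =", np.max(np.abs(lhs - rhs)))

# --- Counter-example (ii): four rank-one regressors --------------------------
lam0, w1, w2 = rng.standard_normal((3, N))
f0, v1, v2 = rng.standard_normal((3, T))
b = np.array([0.3, -0.7, 1.1, 0.4])

Xs = [np.outer(w1, v2), np.outer(w2, v1),
      np.outer(w1 + lam0, v1 + f0), np.outer(w2 + lam0, v2 + f0)]

lhs = sum(bk * Xk for bk, Xk in zip(b, Xs)) + np.outer(lam0, f0)
rhs = sum((bk + 1.0) * Xk for bk, Xk in zip(b, Xs)) \
      - np.outer(lam0 + w1 + w2, f0 + v1 + v2)
print("example (ii): max |lhs - rhs| =", np.max(np.abs(lhs - rhs)))
```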

S.4 Some Matrix Algebra, including the Proof of Lemma A.1

The following statements are true for real matrices (throughout the whole paper and the supplementary material we never use complex numbers anywhere). Let $A$ be an arbitrary $n \times m$ matrix. In addition to the operator (or spectral) norm $\|A\|$ and to the Frobenius (or Hilbert-Schmidt) norm $\|A\|_F$, it is also convenient to define the $1$-norm, the $\infty$-norm, and the max-norm by
$$\|A\|_1 = \max_{j=1,\dots,m} \sum_{i=1}^{n} |A_{ij}| , \qquad \|A\|_\infty = \max_{i=1,\dots,n} \sum_{j=1}^{m} |A_{ij}| , \qquad \|A\|_{\max} = \max_{i=1,\dots,n}\, \max_{j=1,\dots,m} |A_{ij}| .$$

Lemma S.4.1 (Some useful inequalities). Let $A$ be an $n \times m$ matrix, $B$ be an $m \times p$ matrix, and $C$ and $D$ be $n \times n$ matrices. Then we have:

(i) $\|A\| \leq \|A\|_F \leq \|A\| \operatorname{rank}(A)^{1/2}$,
(ii) $\|AB\| \leq \|A\|\, \|B\|$,
(iii) $\|AB\|_F \leq \|A\|_F\, \|B\| \leq \|A\|_F\, \|B\|_F$,
(iv) $|\operatorname{Tr}(AB)| \leq \|A\|_F\, \|B\|_F$, for $n = p$,
(v) $|\operatorname{Tr}(C)| \leq \|C\| \operatorname{rank}(C)$,
(vi) $\|C\| \leq \operatorname{Tr}(C)$, for $C$ symmetric and $C \geq 0$,
(vii) $\|A\|^2 \leq \|A\|_1\, \|A\|_\infty$,
(viii) $\|A\|_{\max} \leq \|A\| \leq \sqrt{nm}\, \|A\|_{\max}$,
(ix) $\|A' C A\| \leq \|A' D A\|$, for $C$ symmetric and $C \leq D$.

For $C$, $D$ symmetric, and $i = 1, \dots, n$, we have (here $\mu_i(\cdot)$ denotes the $i$-th largest eigenvalue):

(x) $\mu_i(C) + \mu_n(D) \leq \mu_i(C + D) \leq \mu_i(C) + \mu_1(D)$,
(xi) $\mu_i(C) \leq \mu_i(C + D)$, for $D \geq 0$,
(xii) $\mu_i(C) - \|D\| \leq \mu_i(C + D) \leq \mu_i(C) + \|D\|$.

Proof. Here we use the notation $s_i(A)$ for the $i$-th largest singular value of a matrix $A$.

(i) We have $\|A\| = s_1(A)$, and $\|A\|_F^2 = \sum_{i=1}^{\operatorname{rank}(A)} s_i^2(A)$. The inequalities follow directly from this representation.

(ii) This inequality is true for all unitarily invariant norms, see, e.g., Bhatia (1997).

(iii) can be shown as follows:
$$\|AB\|_F^2 = \operatorname{Tr}\big( A B B' A' \big) = \|B\|^2 \operatorname{Tr}\big( A A' \big) - \operatorname{Tr}\big[ A \big( \|B\|^2 I_m - B B' \big) A' \big] \leq \|B\|^2 \operatorname{Tr}\big( A A' \big) = \|B\|^2\, \|A\|_F^2 ,$$
where we used that $A \big( \|B\|^2 I_m - B B' \big) A'$ is positive semi-definite. Relation (iv) is just the Cauchy-Schwarz inequality. To show (v) we decompose $C = U D O'$ (singular value decomposition), where $U$ and

10 O are n rankc that satisfy U U = O O = I and D is a rankc rankc diagonal matrix with entries s i C. We then have O = U = and D = C and therefore TrC = TrUDO = TrDO U rankc = η ido Uη i i= rankc i= D O U = rankc C. For vi let e be a vector that satisfies e = and C = e Ce. Because C is symmetric such an e has to exist. Now choose e i, i =,..., n, such that e i, i =,..., n, becomes a orthonormal basis of the vector space of n vectors. Because C is positive semi definite we then have Tr C = i e ice i e Ce = C, which is what we wanted to show. For vii we refer to Golub and van Loan 996, p.5. For viii let e be the vector that satisfies e = and A CA = e A CAe. Because A CA is symmetric such an e has to exist. Because C D we then have C = e A CAe e A DAe A DA. This is what we wanted to show. For inequality ix let e be a vector that satisfied e = and A CA = e A CAe. Then we have A CA = e A DAe e A D CAe e A DAe A DA. Statement x is a special case of Weyl s inequality, see, e.g., Bhatia 997. The inequalities xi and xii follow directly from ix because µ n D 0 for D 0, and because D µ i D D for i =,..., n. Definition S.4.. Let A be an n r matrix and B be an n r matrix with ranka = r and rankb = r. The smallest principal angle θ A,B [0, π/] between the linear subspaces spana = {Aa a R r } and spanb = {Bb b B r } of R n is defined by cosθ A,B = max 0 a R r max 0 b R r a A Bb Aa Bb. Lemma S.4.3. Let A be an n r matrix and B be an n r matrix with ranka = r and rankb = r. Then we have the following alternative characterizations of the smallest principal angle between spana and spanb sinθ A,B = min 0 a R r = min 0 b R r 0 M B A a A a M A B b. B b
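Before turning to the proofs that follow, here is a small numerical check of the characterizations in Definition S.4.2 and Lemma S.4.3 (a sketch assuming numpy; the dimensions are illustrative and not part of the original material): the largest singular value of $Q_A' Q_B$ gives $\cos\theta_{A,B}$, while the smallest singular value of $M_B Q_A$ realizes $\min_a \|M_B A a\| / \|A a\|$ and should equal $\sin\theta_{A,B}$.

```python
import numpy as np

# Check of Lemma S.4.3 / Definition S.4.2: for full-rank A (n x r1) and
# B (n x r2), the smallest principal angle theta between span(A) and span(B)
# satisfies
#   cos(theta) = max_{a,b} a'A'Bb / (||Aa|| ||Bb||)   and
#   sin(theta) = min_a ||M_B A a|| / ||A a||,
# where M_B projects onto the orthogonal complement of span(B).
rng = np.random.default_rng(2)
n, r1, r2 = 50, 3, 4
A = rng.standard_normal((n, r1))
B = rng.standard_normal((n, r2))

Q_A, _ = np.linalg.qr(A)          # orthonormal basis of span(A)
Q_B, _ = np.linalg.qr(B)          # orthonormal basis of span(B)
M_B = np.eye(n) - Q_B @ Q_B.T     # projector onto span(B)'s complement

# cos(theta): largest singular value of Q_A' Q_B.
cos_theta = np.linalg.svd(Q_A.T @ Q_B, compute_uv=False)[0]

# sin(theta): smallest singular value of M_B Q_A, which realises
# min_a ||M_B A a|| / ||A a||.
sin_theta = np.linalg.svd(M_B @ Q_A, compute_uv=False)[-1]

print("cos(theta) =", cos_theta)
print("sin(theta) =", sin_theta)
print("cos^2 + sin^2 =", cos_theta**2 + sin_theta**2)   # should be ~ 1.0
```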

11 Proof. Because M B A a + P B A a = A a and sinθ A,B + cosθ A,B =, we find proving the theorem is equivalent to proving cosθ A,B = min 0 a R r P B A a A a = min 0 b R r P A B b A b. This last statement is theorem 8 in Galantai and Hegedus 006, and the proof can be found there. Proof of Lemma A.. Let The theorem claims S Z = min f,λ Tr [Z λf Z fλ ], S Z = min f TrZ M f Z, S 3 Z = min λ TrZ M λ Z, S 4 Z = min λ, f TrM λ Z M f Z, T S 5 Z = µ i Z Z, S 6 Z = i=r+ N i=r+ µ i ZZ. S Z = S Z = S 3 Z = S 4 Z = S 5 Z = S 6 Z. We find: i The non-zero eigenvalues of Z Z and ZZ are identical, so in the sums in S 5 Z and in S 6 Z we are summing over identical values, which shows S 5 Z = S 6 Z. ii Starting with S Z and minimizing with respect to f we obtain the first-order condition λ Z = λ λ f.

12 Putting this into the objective function we can integrate out f, namely Tr [ Z λf Z λf ] = Tr Z Z Z λf = Tr Z Z Z λλ λ λ λf = Tr Z Z Z λλ λ λ λλ Z = Tr Z M λ Z. This shows S Z = S 3 Z. Analogously, we can integrate out λ to obtain S Z = S Z. iii Let M λ be the projector on the N R eigenspaces corresponding to the N R smallest eigenvalues of ZZ, let P λ = I N M λ, and let ω R be the R th largest eigenvalue of ZZ. We then know the matrix P λ[zz ω R I N ]P λ M λ[zz ω R I N ]M λ is positive semi-definite. Thus, for an arbitrary N R matrix λ with corresponding projector M λ we have { 0 Tr P λ [ZZ ω R I N ]P λ } M λ [ZZ ω R I N ]M λ Mλ M λ = Tr { P λ[zz ω R I N ]P λ + M λ[zz } ω R I N ]M λ Mλ M λ = Tr [Z M λ Z] Tr [ Z M λ Z ] [ + ω R rankmλ rankm λ ], and because rankm λ = N R and rankm λ N R we have Tr [ Z M λ Z ] Tr [Z M λ Z]. This shows M λ is the optimal choice in the minimization problem of S 3 Z, i.e., the optimal λ = λ is chosen such that the span of the N-dimensional vectors λ r r =... R equals to the span of the R eigenvectors that correspond to the R largest eigenvalues of ZZ. This shows S 3 Z = S 6 Z. Analogously one can show S Z = S 5 Z. iv In the minimization problem in S 4 Z we can choose λ such that the span of the N- dimensional vectors λ r r =... R is equal to the span of the R eigenvectors that correspond to the R largest eigenvalues of ZZ. In addition, we can choose f such that the span of the T -dimensional vectors f r r =... R is equal to the span of the R eigenvectors that correspond to the R + -largest up to the R-largest eigenvalue of Z Z. With this choice of λ and f we actually project out all the R largest eigenvalues of Z Z

13 and ZZ. This shows that S 4 Z S 5 Z. This result is actually best understood by using the singular value decomposition of Z. We can write M λ Z M f = Z Z, where Z = P λ Z M f + Z P f. Because rankz rankp λ Z M f + rankz P f = R + R = R, we can always write Z = λf for some appropriate N R and T R matrices λ and f. This shows that S 4 Z = min λ, f TrM λ Z M f Z min TrZ ZZ Z { Z : rank Z R} = min f,λ Tr [Z λf Z fλ ] = S Z. Thus we have shown here S Z S 4 Z S 5 Z, and this holds with equality because S Z = S 5 Z was already shown above. S.5 Supplement to the Consistency Proof Appendix A Lemma S.5.. Under assumptions and 4 there exists a constant B 0 > 0 such that for the matrices w and v introduced in assumption 4 we have w M λ 0 w B 0 w w 0, v M f 0 v B 0 v v 0, wpa, wpa. Proof. We can decompose w = w w, where w is an N rankw matrix and w is a rankw K matrix. Note w has full rank, and M w = M w. By assumption i we know λ 0 λ 0 /N has a probability limit, i.e., there exists some B > 0 such that λ 0 λ 0 /N < B I R wpa. Using this and assumption 4 we find for any R vector a 0 we have M v λ 0 a λ 0 a = a λ 0 M v λ 0 a a λ 0 λ 0 a > B B, wpa. 3

14 Applying Lemma S.4.3 we find b w M min λ0 w b 0 b R rankw b w w b = min 0 a R R a λ 0 M w λ 0 a a λ 0 λ 0 a Therefore we find for every rankw vector b that b w M λ 0 > B B, wpa. w B/B w w b > 0, wpa. Thus w M λ 0 w B/B w w > 0, wpa. Multiplying from the left with w and from the right with w we obtain w M λ 0 w B/B w w 0, wpa. This is what we wanted to show. Analogously we can show the statement for v. As a consequence of the this lemma we obtain some properties of the low-rank regressors summarized in the following lemma. Lemma S.5.. Let the assumptions and 4 be satisfied and let X low,α = K l= α lx l be a linear combination of the low-rank regressors. Then there exists some constant B > 0 such that Xlow,α M f 0 X low,α min > B, wpa, {α R K, α =} Mλ 0 X low,α M f 0 X low,α min M λ 0 > B, wpa. {α R K, α =} Proof. Note M λ 0 X low,α M f 0 X low,α M λ 0 X low,α M f 0 X low,α, because M λ 0 =, i.e., if we can show the second inequality of the lemma we have also shown the first inequality. We can write X low,α = w diagα v. Using Lemma S.5. and part v, vi and ix of Lemma S.4. we find M λ 0 X low,α M f 0 X low,α M λ 0 = Mλ 0 w diagα v M f 0 v diagα w M λ 0 B 0 M λ 0 w diagα v v diagα w M λ 0 B 0 K Tr [M λ 0 w diagα v v diagα w M λ 0] = B 0 K Tr [v diagα w M λ 0w diagα v ] B 0 K v diagα w M λ 0w diagα v B 0 K v diagα w w diagα v B 0 K Tr [v diagα w w diagα v ] = B 0 Tr [ X K low,α X low,α]. 4

15 Thus we have M λ 0 X low,α M f 0 X low,α M λ 0 / B 0 /K α W low α, where the K K matrix W low W and thus W low above. is defined by W low,l l = Tr X l X l, i.e., it is a submatrix of W. Because converges to a positive definite matrix the lemma is proven by the inequality Using the above lemmas we can now prove the lower bound on the consistency proof. Remember [ S β, f = Tr λ 0 f 0 + K β 0 k β k X k M f λ 0 f 0 + k= k= S β, f that was used in ] K β 0 k β k X k P λ 0,w. We want to show under the assumptions of theorem 3. there exist finite positive constants a 0, a, a, a 3 and a 4 such that S β, f a 0 β low β 0,low β low β 0,low + a β low β 0,low + a a 3 β high β 0,high a 4 β high β 0,high β low β 0,low, wpa. Proof of the lower bound on S β, f. Applying Lemma A. and part xi of Lemma S.4. 5

16 we find [ S β, f µ R+ λ 0 f 0 + [ = µ R+ λ 0 f 0 + K β 0 k β k X k P λ 0,w λ 0 f 0 + k= K l= ] K β 0 k β k X k k= K β 0 l β l w l v l λ 0 f 0 + β 0 l β l w l v l K K + λ 0 f 0 + β 0 l β l w l v l P λ 0,w β 0 m β m X m l= m=k K K + β 0 m β m X mp λ 0,w λ 0 f 0 + β 0 l β l w l v l m=k l= ] K K + β 0 m β m X mp λ 0,w β 0 m β m X m m=k m=k [ K µ K R+ λ 0 f 0 + β 0 l β l w l v l λ 0 f 0 + β 0 l β l w l v l l= l= K K + λ 0 f 0 + β 0 l β l w l v l P λ 0,w β 0 m β m X m l= m=k ] K K + β 0 m β m X mp λ 0,w λ 0 f 0 + β 0 l β l w l v l m=k l= K l µ K R+ λ 0 f 0 + β 0 l β l w l v l λ 0 f 0 + β 0 l β l w l v l= l= l= a 3 β high β 0,high a4 β high β 0,high β low β 0,low, wpa, where a 3 > 0 and a 4 > 0 are appropriate constants. For the last step we used part xii of Lemma S.4. and the fact that K K β 0 m β m X mp λ 0,w λ 0 f 0 + β 0 l β l w l v l m=k l= K β high β 0,high max X m λ 0 f 0 m + K β low β 0,low max w l v l. Our assumptions guarantee the operator norms of λ 0 f 0 / and X m / are bounded from above as N, T, which results in finite constants a 3 and a 4. We write the above result as S β, f µ R+A A/ + terms containing β high, where we defined A = λ 0 f 0 + K l= β0 l β l w l v l. We also write A = A + A + A 3, with A = 6 l

17 M w A P f 0 = M w λ 0 f 0, A = P w A M f 0 = K l= β0 l β l w l v l M f 0, A 3 = P w A P f 0 = P w λ 0 f 0 + K l= β0 l β l w l v l P f. We then find A A = A A + A + A 3A + A 3 and A A A A a / A 3 + a / A a / A 3 + a / A = [A A a A 3A 3 ] + a A A, where for matrices refers to the difference being positive definite, and a is a positive number. We choose a = + µ R A A / A 3. The reason for this choice becomes clear below. Note [A A a A 3A 3 ] has at most rank R asymptotically it has exactly rank R. The non-zero eigenvalues of A A are therefore given by the at most R non-zero eigenvalues of [A A a A 3A 3 ] and the non-zero eigenvalues of a A A, the largest one of the latter being given given by the operator norm a A. We therefore find µ R+ A A µ [ ] R+ A A a A 3A 3 + a A A min { a A, µ R [A A a A 3A 3 ] }. Using Lemma S.4.xii and our particular choice of a we find Therefore µ R [A A a A 3A 3 ] µ R A A a A 3A 3 = µ RA A. µ R+A A { µ RA A min, where we used A A 3 and A A. A µ R A A A + µ R A A, } A A 3 + µ R A A Our assumptions guarantee there exist positive constants c 0, c, c, and c 3 such that A λ0 f 0 + µ R A A A = µ K l= β 0 l β l w l v l c 0 + c β low β 0,low, wpa, = µ R f 0 λ 0 M w λ 0 f 0 c, wpa, [ K ] K β 0 l β l w l v l M f 0 β 0 l β l v l w l l = c 3 β low β 0,low, wpa, l = 7

18 were for the last inequality we used Lemma S.5.. We thus have µ R+ A A Defining a 0 = c c 3, a c = c 0 c and a = c c µ R+ A A c 3 β low β 0,low + c c0 + c β low β 0,low, wpa. we thus obtain a 0 β low β 0,low β low β 0,low + a β low β 0,low, wpa, + a i.e., we have shown the desired bound on S β, f. S.6 Regarding the Proof of Corollary 4. As discussed in the main text, the proof of Corollary 4. is provided in Moon and Weidner 05. All that is left to show here is the matrix W = W λ 0, f 0, X k does not become singular as N, T under our assumptions. Proof. Remember W = TrM f 0 X k M λ 0 X k. The smallest eigenvalue of the symmetric matrix W λ 0, f 0, X k is given by a W a µ K W = min {a R K, a 0} a = min {a R K, a 0} a Tr = min {α R K, ϕ R K α 0, ϕ 0} [ Tr [ M f 0 M f 0 K k = a k X k M λ 0 K k = a k X k ] X low,ϕ + X high,α Mλ 0 X low,ϕ + X high,α ] α + ϕ, where we decomposed a = ϕ, α, with ϕ and α being vectors of length K and K, respectively, and we defined linear combinations of high- and low-rank regressors: X low,ϕ = K l= ϕ l X l, X high,α = 8 K m=k + α m X m.

19 Here, as in assumption 4 the components of α are denoted α K +,..., α K to simplify notation. We have M λ 0 = M λ 0,w + P Mλ 0 w, where w is the N K matrix defined in assumption 4, i.e., λ 0, w is an N R + K matrix, whereas M λ 0w is also an N K matrix. Using this we obtain µ K W = min {ϕ R K, α R K ϕ 0, α 0} = min {ϕ R K, α R K ϕ 0, α 0} { Tr [ M ϕ + α f 0 + Tr [ M f 0 X low,ϕ + X high,α Mλ 0,w X low,ϕ + X high,α ] X low,ϕ + X high,α PMλ0 w X low,ϕ + X high,α ] } { Tr [ ] M ϕ + α f 0 X high,α M λ 0,w X high,α + Tr [ M f 0 X low,ϕ + X high,α PMλ0 w X low,ϕ + X high,α ] }. S.6. We note there exists finite positive constants c, c, and c 3 such that Tr [ M f 0 Tr [ ] M f 0 X high,α M λ 0,w X high,α c α, X low,ϕ + X high,α PMλ0 w X low,ϕ + X high,α ] 0, wpa, Tr [ ] M f 0 X low,ϕ P Mλ 0 w X low,ϕ c ϕ, wpa, Tr [ ] M f 0 X low,ϕ c 3 P Mλ 0 w X high,α ϕ α, wpa, Tr [ ] M f 0 X high,α P Mλ 0 w X high,α 0, S.6. and we want to justify these inequalities now. The second and the last equation in S.6. are true because, e.g., Tr [ M f 0 X high,α P M λ 0 w X high,α ] = Tr [ Mf 0 X high,α P M λ 0 w X high,α M f 0], and the trace of a symmetric positive semi-definite matrix is non-negative. The first inequality in S.6. is true because rankf 0 + rankλ 0, w = R + K and using Lemma A. and assumption 4 we have α Tr [ ] M f 0 X high,α M λ 0,w X high,α α µ [ ] R+K + Xhigh,α X high,α > b, wpa, i.e., we can set c = b. The third inequality in S.6. is true because according Lemma S.4.v 9

20 we have Tr [ M f 0 X low,ϕ P Mλ 0 w X high,α ] K X low,ϕ X high,α K X low,ϕ F X high,α F K K K ϕ α max k =...K X k F max k =K +...K X k F c 3 ϕ α, where we used that assumption 4 implies X k / < C holds wpa for some constant C F as, and we set c 3 = K K K C. Finally, we have to argue that the third inequality in S.6. holds. Note X low,ϕ P M λ 0 w X low,ϕ = X low,ϕ M λ 0 X low,ϕ, i.e., we need to show Using part vi of Lemma S.4. we find Tr [ M f 0 X low,ϕ M λ 0 X low,ϕ ] c ϕ. Tr [ ] M f 0 X low,ϕ M λ 0 X low,ϕ = Tr [ ] M λ 0 X low,ϕ M f 0 X low,ϕ M λ 0 Mλ 0 X low,ϕ M f 0 X low,ϕ M λ 0, and according to Lemma S.5. this expression is bounded by some positive constant times ϕ in the lemma we have ϕ =, but all expressions are homogeneous in ϕ. Using the inequalities S.6. in equation S.6. we obtain µ K W min {ϕ R K, α R K min ϕ 0, α 0} c, c c c + c 3 { c α + max [ 0, c ϕ + α ϕ c 3 ϕ α ]}, wpa. Thus, the smallest eigenvalue of W is bounded from below by a positive constant as N, T, i.e., W is non-degenerate and invertible. S.7 Proof of Examples for Assumption 5 Proof of Example. We want to show the conditions of Assumption 5 are satisfied. Conditions i-iii are satisfied by the assumptions of the example. 0

21 For condition iv, notice Cov X it, X is C = E U it U is. Because β 0 < and sup it Ee it <, it follows N T i= t,s= Cov X it, X is C = = N i= t,s= N T E U it U is T i= t,s= p,q=0 β 0 p+q E e it p e is q <. For condition v, notice by the independence between the sigma field C and the error terms {e it } that we have for some finite constant M, = = N T i= t,s,u,v= N T i= t,s,u,v= N M T T = M T T = M T T i= t,s,u,v= p,q=0 t,s,u,v= p,q=0 s t,u,s,v= k= l= T s,v= min{s,v} k= Cov e it Xis, e iu Xiv C Cov e it U is, e iu U iv β 0 p+q E eit e is p e iu e iv q β 0 p E eit e is p β 0 q E eiu e iv q β 0 p+q [I {t = u} I {s p = v q} + I {t = v q} I {s p = u}] v β 0 s k+v l I {t = u} I {k = l} + M β 0 s+v k + M T T s,u= s u 0 β 0 s u T T T v,t= v t 0 T s,u= s u 0 β 0 s u T β 0 v t. T v,t= v t 0 β 0 v t

22 Notice and T T = T = T = = T s,v= T min{s,v} k= s v s= v= k= T s= v= β 0 T β 0 = O, T s,u= s u 0 β 0 s u = T s β 0 s v Therefore, we have the desired result N β 0 s+v k β 0 s v+v k + T l=0 s= v= T β 0 l + T T s= T s s= k= β 0 l l=0 T s β 0 s v + β 0 β 0 l l + T β 0 T i= t,s,u,v= l= s= u= T β 0 s k s β 0 s u T = β 0 l l = O. T l=0 Cov e it Xis, e iu Xiv C = Op. Preliminaries for Proof of Example Although we observe X it for t T, here we treat Z it = e it, X it as having an infinite past and future. Define Gτ t i = C σ {X is : τ s t} and Hτ t i = C σ {Z it : τ s t}. Then, by definition, we have Gτ t i, Hτ t i Fτ t i for all τ, t, i. By Assumption iv of Example, the time series of {X it : < t < } and {Z it : < t < } are conditional α-mixing conditioning on C uniformly in i. Mixing inequality: The following inequality is a conditional version of the α-mixing inequality of Hall and Heyde 980, p. 78. Suppose X it is a F t -measurable random

23 variable with E X it max{p,q} C X it C,p = E X it p C /p. Then, for each i, we have <, where p, q > with /p + /q <. Denote Cov X it, X it+m C 8 X it C,p X it+m C,q α p q m i. S.7. Proof of Example. Again, we want to show the conditions of Assumption 5 are satisfied. Conditions i-iii are satisfied by the assumptions of the example. For condition iv, we apply the mixing inequality S.7. with p = q > 4. Then, we have N i= t,s= N i= T Cov X it, X is C T T t t= m=0 = 6 N T T m i= m=0 t= 6 sup X it C,p i,t O p, Cov X it, X it+m C = X it C,p X it+m C,p α m i p P m=0 α p P m N T T m i= m=0 t= Cov X it, X it+m C where the last line holds because sup i,t X it C,p = O p for some p > 4 as assumed in the example, and m=0 α p P m = p m=0 m ζ P For condition v, we need to show N T i= t,s,u,v= = O because of ζ > 3 4p 4p Cov e it Xis, e iu Xiv C = Op. and p > 4. Notice = N T i= t,s,u,v= N T i= t,s,u,v= N T i= t,s,u,v= = I + II, say. Cov e it Xis, e iu Xiv C E e it Xis e iu Xiv C E e it Xis C E e iu Xiv C E e it Xis e iu Xiv C + N N T i= T t,s= E e it Xis C 3

24 First, for term I, there are a finite number of different orderings among the indices t, s, u, v. We consider the case t s u v and establish the desired result. The other cases can be shown analogously. Note N + N N i= N i= T N i= T T t T k t= T T t= k=0 T l l=0 m=0 0 l,m k 0 k+l+m T t T t= 0 k,m l 0 k+l+m T t E e it Xit+k e it+k+l Xit+k+l+m C E e it Xit+k e it+k+l Xit+k+l+m C [ E e it Xit+k e it+k+l Xit+k+l+m ] C E e it Xit+k C E e it+k+l Xit+k+l+m C + N + N N i= N i= T T T t= T t= = I + I + I 3 + I 4, say. 0 k,m l 0 k+l+m T t 0 p,l m 0 k+l+m T t E e it Xit+k C E e it+k+l Xit+k+l+m C E By applying the mixing inequality S.7. to X it+k e it+k+l Xit+k+l+m, we have E e it Xit+k e it+k+l Xit+k+l+m C [ e it Xit+k e it+k+l Xit+k+l+m C] E e it Xit+k e it+k+l Xit+k+l+m C with eit and 8 e it C,p Xit+k e it+k+l Xit+k+l+m C,q α p q k 8 e it C,p Xit+k C,3q e it+k+l C,3q Xit+k+l+m C,3q α p q k i, i where the last inequality follows by the generalized Holder s inequality. Choose p = 3q > 4. 4

25 Then, I 8 N 8 N i= T T t= sup e it C,p i,t 8 sup e it C,p i,t O p, 0 l,m k 0 k+l+m T t sup i,t sup i,t X it+k C,p C,p e it C,p Xit+k e it+k+l C,p Xit+k+l+m α 4p k i C,p X it+k C,p T k=0 T t= α 4p k 0 l,m k 0 k+l+m T t k α 4p k where the last line holds because we assume in example that sup i,t e it C,p sup i,t X it+k O p for some p > 4,, and p > 4. m=0 m α 4p m = 4p m=0 m ζ 4p By applying similar arguments, we can also show I, I 3, I 4 = O p. C,p = O because of ζ > 3 4p 4p and = S.8 Supplement to the Proof of Theorem 4.3 Notation E C and Var C and Cov C : In the remainder of this supplementary file we write E C, Var C and Cov C for the expectation, variance and covariance operators conditional on C, i.e., E C A = EA C, Var C A = VarA C and Cov C A, B = CovA, B C. What is left to show to complete the proof of Theorem 4.3 is that Lemma B. and Lemma B. in the main text appendix hold. Before showing this, we first present two further intermediate lemmas. 5

26 Lemma S.8.. Under the assumptions of Theorem 4.3 we have for k =,..., K, a P λ 0 Xk = o p, b X k P f 0 = o p, c P λ 0eX k = o p N 3/, d P λ 0eP f 0 = O p. Proof of Lemma S.8.. # Part a: We have P λ 0 Xk = λ 0 λ 0 λ 0 λ 0 Xk λ 0 λ 0 λ 0 λ 0 Xk λ 0 λ 0 λ 0 λ 0 Xk F = O p N / λ 0 Xk F, where we used part i and ii of Lemma S.4. and Assumption. We have [ ]} R T N E {E C λ 0 Xk F = E E C λ 0 X ir k,it r= t= i= { R T N } = E λ 0 ir E C X k,it = R r= r= T t= = O p, t= i= N E [ λ 0 ir Var C X k,it ] where we used X k,it is mean zero and independent across i, conditional on C, and our bounds on the moments of λ 0 ir and X k,it. We therefore have λ 0 Xk F = O p and the above inequality thus gives P λ 0 Xk = O p T = o p. i= # The proof for part b is similar. As above we first obtain X k P f 0 = P f 0 X k 6

27 O p T / f 0 X k F. Next, we have ] E C [ f 0 X k F = = R r= R r= N i= N [ R r= E C i= t,s= T ftr 0 X k,it t= T ftrf 0 sre 0 C Xk,it Xk,is max t f 0 tr ] N i= t,s= = O p T /4+ɛ O p = o p, T Cov C X k,it, X k,is where we used that uniformly bounded E f 0 t 4+ɛ implies max t f 0 tr = O p T /4+ɛ. We thus have f 0 X k F = o pt N and therefore X k P f 0 = o p. # Next, we show part c. First, we have [ λ 0 E {E C ex k ]} R F = E E C = E = { R R r= j= N r= i,j,l= t,s= N r= i,j= t= N N T λ 0 ire it X k,jt i= t= } T λ 0 irλ 0 lre C e it e ls X k,jt X k,js T E [ λ 0 ir E C e it Xk,jt] = ON T, where we used that E C e it e ls X k,jt X k,js is only non-zero if i = l because of cross-sectional independence conditional on C and t = s because regressors are pre-determined. We can thus conclude λ 0 ex k F = O p N T. Using this we find P λ 0eX k = λ 0 λ 0 λ 0 λ 0 ex k λ 0 λ 0 λ 0 λ 0 ex k λ 0 λ 0 λ 0 λ 0 ex k F = O p N / O p N T = O p. This is what we wanted to show. 7

28 # For part d, we first find f 0 eλ 0 F = O p, because E E C f 0 eλ 0 F = E E N T C i= t= { N N T T = E = = O, N i= i= T t= j= t= s= e it f 0 t λ 0 i E C e it e js f 0 t λ 0 i λ 0 j f 0 s E [ ] E C e it f 0 t λ 0 i λ 0 i ft 0 where we used e it is independent across i and over t, conditional on C. Thus we obtain P λ 0eP f 0 = λ 0 λ 0 λ 0 λ 0 ef 0 f 0 f 0 f 0 λ 0 λ 0 λ 0 λ 0 ef 0 f 0 f 0 f 0 where we used part i and ii of Lemma S.4.. O p N / O p N λ 0 ef 0 F O p T O p T / = O p, Lemma S.8.. Suppose A and B are T T and N N matrices that are independent of e, conditional on C, such that E C A F = Op and E C B F = Op, and let Assumption 5 be satisfied. Then there exists a finite non-random constant c 0 such that a E C {Tr [e e E C e e A]} c 0 N E C A F, b E C {Tr [ee E C ee B]} c 0 T E C B F. Proof. # Part a: Denote A ts to be the t, s th element of A. We have } Therefore, Tr {e e E C e e A} = = T T e e E C e e ts A ts t= T s= T N e it e is E C e it e is A ts. t= s= i= E C Tr {e e E C e e A} [ T T T T N N ] = E C e it e is E C e it e is e jp e jq E C e jp e jq E C A ts A pq. t= s= p= q= i= j= 8

29 Let Σ it = E C e it. Then we find { N N } E C e it e is E C e it e is e jp e jq E C e jp e jq i= i= = Therefore, j= N N = {E C e it e is e jp e jq E C e it e is E C e jp e jq } j= Σ it Σ is if t = p s = q and i = j Σ it Σ is if t = q s = p and i = j E C e 4 it Σ it if t = s = p = q and i = j 0 otherwise. E C Tr {e e E C e e A} T T N Σ it Σ is EC A ts + EC A ts A st + t= s= i= T N EC e 4 it Σ it EC A tt. t= i= Define Σ i = diag Σ i,..., Σ it. Then, we have Also, T t= T s= N N Σ it Σ is EC A ts = EC Tr A Σ i AΣ i i= i= N E C AΣ i N F Σ i E C A F i= i= N sup Σ it E C A F. it S.8. Finally, T t= T s= [ N N Σ it Σ is E C A ts A st = E C Tr Σ i AAΣ i] i= T t= i= i= N E C Σ i A N AΣ i F F Σ i E C A F i= i= N sup Σ it E C A F. S.8. it N EC e 4 it Σ it EC A tt N sup E C e 4 it E C A F, it 9 S.8.3

30 and sup it E C e 4 it is assumed bounded by Assumption 5vi. # Part b: The proof is analogous to the proof of part a. Proof of Lemma B.. # For part a we have Tr P f 0 e P λ 0 Xk = Tr P f 0 e P λ 0P λ 0 Xk P f 0 R P λ 0 e P f 0 P λ 0 Xk Pf 0 = O p o p O p = o p, where the second-last equality follows by Lemma S.8. a and d. # To show statement b we define ζ k,ijt = e it Xk,jt. We then have Tr P λ 0 e X k = [ R λ 0 λ 0 ] r,q= N rq T N N λ 0 irλ 0 jqζ k,ijt. t= i,j= }{{} A k,rq We only have E C ζk,ijt ζ k,lms 0 if t = s because regressors are pre-determined and i = l and j = m because of cross-sectional independence. Therefore E { { } E C A T N } k,rq = E λ N 3 ir λ jq λ lr λ mq E C ζk,ijt ζ T k,lms = N 3 T T t,s= i,j,l,m= N t= i,j= E [ λ irλ jq E C ζ k,ijt] = O/N = op. We thus have A k,rq = o p and therefore also Tr P λ 0 e X k = o p. # The proof for statement c is similar to the proof of statement b. Define ξ k,its = e it Xk,is E C e it Xk,is. We then have { Tr P f 0 ]} [e Xk E C e Xk = [ R f ] f T r,q= rq N T T f tr f sq ξ k,its. i= t,s= }{{} B k,rq 30

31 Therefore E C B N T k,rq = T 3 N i,j= t,s,u,v= 4 max f t r t, r T 3 N 4 = max f t r t, r T 3 N = O p T 4/4+ɛ O p /T = o p, f tr f sq f ur f vq E C ξk,its ξ k,juv N T i,j= t,s,u,v= N T i= t,s,u,v= Cov C e it Xk,is, e ju Xk,jv Cov C e it Xk,is, e iu Xk,iv where we used uniformly bounded E f 0 t 4+ɛ implies max t f 0 tr = O p T /4+ɛ. # Part d and e: We have λ 0 λ 0 λ 0 f 0 f 0 f 0 = O p /, e = O p N /, X k = O p and P λ 0eP f 0 = O p, which was shown in Lemma S.8.. Therefore: Tr ep f 0 e M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 = Tr P λ 0eP f 0 e M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 R P λ 0eP f 0 e X k f 0 f 0 f 0 λ 0 λ 0 λ 0 = Op N / = o p. which shows statement d. The proof for part e is analogous. # To prove statement f we need to use in addition P λ 0 e X k = o pn 3/, which was also shown in Lemma S.8.. We find Tr e M λ 0 X k M f 0 e λ 0 λ 0 λ 0 f 0 f 0 f 0 = Tr e M λ 0 X k e P λ 0 λ 0 λ 0 λ 0 f 0 f 0 f 0 Tr e M λ 0 X k P f 0 e P λ 0 λ 0 λ 0 λ 0 f 0 f 0 f 0 R e P λ 0 e X k λ 0 λ 0 λ 0 f 0 f 0 f 0 = o p. R e X k P λ 0 e P f 0 λ 0 λ 0 λ 0 f 0 f 0 f 0 3

32 # Now we want to prove part g and h of the present lemma. For part g we have Tr { [ee E C ee ] M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 } = Tr { [ee E C ee ] M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 } + } Tr {[ee E C ee ] M λ 0 Xk P f 0 f 0 f 0 f 0 λ 0 λ 0 λ 0 = Tr { [ee E C ee ] M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 } + ee E C ee X k P f 0 f 0 f 0 f 0 λ 0 λ 0 λ 0 = Tr { [ee E C ee ] M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 } + o p. Thus, what is left to prove is Tr { [ee E C ee ] M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 } = o p. For this we define Using part i and ii of Lemma S.4. we find B k = M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0. B k F R / B k R / X k f 0 f 0 f 0 λ 0 λ 0 λ 0 R / X k F f 0 f 0 f 0 λ 0 λ 0 λ 0. and therefore E C Bk F R f 0 f 0 f 0 λ 0 λ 0 λ 0 E C Xk F = O, where we used E C Xk F = O, which is true because we assumed uniformly bounded moments of X k,it. Applying Lemma S.8. we therefore find E C Tr {[ee E C ee T ] B k } c 0 E C Bk F = o, and thus Tr {[ee E C ee ] B k } = o p, 3

33 which is what we wanted to show. The proof for part h is analogous. # Part i: Conditional on C the expression e itx it X it E C e it X it X it is mean zero, and it is also uncorrelated across i. This together with the bounded moments that we assume implies { N T [ ] } Var C e it X it X it E C e it X it X it = O p /N = o p, i= t= which shows the required result. # Part j: Define the K K matrix A = we have N i= T t= N i= e it X it X it X it X it = A + A. T t= e it X it + X it X it X it. Then Let B k be the N T matrix with elements B k,it = e it X k,it + X k,it. We have B k B k F = O p, because the moments of B k,it are uniformly bounded. The components of A can be written as A lk = Tr[B lx k X k ]. We therefore have We have X k X k = X k P f 0 + P λ 0 A lk R B l R B l A lk rankx k X k B l X k X k. Xk P f 0 Xk P f 0 + Xk M f 0. Therefore rankx k X k R and where we used Lemma S.8.. This shows the desired result. + P λ 0 Xk M f 0 P λ 0 Xk = R O p o p = o p, Proof of Lemma B.. Let c be a K-vector such that c =. The required result follows by the Cramer-Wold device, if we show N i= t= T e it X itc N 0, c Ωc. For this, define ξ it = e it X itc. Furthermore define ξ m = ξ M,m = ξ,it, with M = and m = T i + t {,..., M}. We then have the following: i Under Assumption 5i, ii, iii the sequence {ξ m, m =,..., M} is a martingale difference sequence under the filtration F m = C σ{ξ n : n < m}. 33

34 ii Eξ 4 it is uniformly bounded, because by Assumption 5vi E C e 8 it and E C X it 8+ɛ are uniformly bounded by a non-random constant applying Cauchy-Schwarz and the law of iterated expectations. iii M M m= ξ m = c Ωc + o p. ξ m E C ξ m ] } = { [ This is true, because firstly under our assumptions we have E M C M m= { E M C M m= ξ m E C ξ m } = O P /M = o P, implying we have M M m= ξ m = M M m= E Cξ m + o p. We furthermore have M M m= E Cξ m = Var C M / M m= ξ m, and using the result in equation 4 of the main text we find Var C M / M m= ξ m = Var C / N i= T t= ξ it = c Ωc + o p. These three properties of {ξ m, m =,..., M} allow us to apply Corollary 5.6 in White 00, which is based on Theorem.3 in Mcleish 974, to obtain M M m= ξ m d N 0, c Ωc. This concludes the proof, because M M m= ξ m = N i= T t= e itx itc. S.9 Expansions of Projectors and Residuals The incidental parameter estimators f and λ as well as the residuals ê enter into the asymptotic bias and variance estimators for the LS estimator β. To describe the properties of f, λ and ê, it is convenient to have asymptotic expansions of the projectors M λβ and M fβ that correspond to the minimizing parameters λβ and fβ in equation 4. Note the minimizing λβ and fβ can be defined for all values of β, not only for the optimal value β = β. The corresponding residuals are êβ = Y β X λβ f β. Theorem S.9.. Under Assumptions, 3, and 4i we have the following expansions K M λβ = M λ 0 + M λ,e + M λ,e k= K M fβ = M f 0 + M f,e + M f,e êβ = M λ 0 e M f 0 + ê e K k= k= βk β 0 k M βk β 0 k M λ,k + M rem λ f,k + M rem f βk β 0 k ê k + ê rem β, β, β, 34

35 where the spectral norms of the remainders satisfy for any series η 0: M rem β λ sup {β: β β 0 η } β β 0 + / e β β 0 + 3/ e = O 3 p, M rem β f sup {β: β β 0 η } β β 0 + / e β β 0 + 3/ e = O 3 p, ê rem β sup {β: β β 0 η } / β β 0 + e β β 0 + e = O 3 p, and we have rankê rem β 7R, and the expansion coefficients are given by M λ,e = M λ 0 e f 0 f 0 f 0 λ 0 λ 0 λ 0 λ 0 λ 0 λ 0 f 0 f 0 f 0 e M λ 0, M λ,k = M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 λ 0 λ 0 λ 0 f 0 f 0 f 0 X k M λ 0, M λ,e = M λ 0 e f 0 f 0 f 0 λ 0 λ 0 λ 0 e f 0 f 0 f 0 λ 0 λ 0 λ 0 + λ 0 λ 0 λ 0 f 0 f 0 f 0 e λ 0 λ 0 λ 0 f 0 f 0 f 0 e M λ 0 M λ 0 e M f 0 e λ 0 λ 0 λ 0 f 0 f 0 λ 0 λ 0 λ 0 λ 0 λ 0 λ 0 f 0 f 0 λ 0 λ 0 λ 0 e M f 0 e M λ 0 M λ 0 e f 0 f 0 f 0 λ 0 λ 0 f 0 f 0 f 0 e M λ 0 + λ 0 λ 0 λ 0 f 0 f 0 f 0 e M λ 0 e f 0 f 0 f 0 λ 0 λ 0 λ 0, analogously M f,e = M f 0 e λ 0 λ 0 λ 0 f 0 f 0 f 0 f 0 f 0 f 0 λ 0 λ 0 λ 0 e M f 0, M f,k = M f 0 X k λ 0 λ 0 λ 0 f 0 f 0 f 0 f 0 f 0 f 0 λ 0 λ 0 λ 0 X k M f 0, M f,e = M f 0 e λ 0 λ 0 λ 0 f 0 f 0 f 0 e λ 0 λ 0 λ 0 f 0 f 0 f 0 + f 0 f 0 f 0 λ 0 λ 0 λ 0 e f 0 f 0 f 0 λ 0 λ 0 λ 0 e M f 0 M f 0 e M λ 0 e f 0 f 0 f 0 λ 0 λ 0 f 0 f 0 f 0 f 0 f 0 f 0 λ 0 λ 0 f 0 f 0 f 0 e M λ 0 e M f 0 M f 0 e λ 0 λ 0 λ 0 f 0 f 0 λ 0 λ 0 λ 0 e M f 0 + f 0 f 0 f 0 λ 0 λ 0 λ 0 e M f 0 e λ 0 λ 0 λ 0 f 0 f 0 f 0, 35

36 and finally ê k = M λ 0 X k M f 0, ê e = M λ 0 e M f 0 e λ 0 λ 0 λ 0 f 0 f 0 f 0 λ 0 λ 0 λ 0 f 0 f 0 f 0 e M λ 0 e M f 0 M λ 0 e f 0 f 0 f 0 λ 0 λ 0 λ 0 e M f 0. Proof. The general expansion of M λβ is given in Moon and Weidner 05, and in the theorem we just make this expansion explicit up to a particular order. The result for M fβ is just obtained by symmetry N T, λ f, e e, X k X k. For the residuals ê we have ê = M λ Y [ ] β k X k = M λ e β β 0 X + λ 0 f 0, k= and plugging in the expansion of M λ gives the expansion of ê. We have êβ = A 0 + λ 0 f 0 λβ f β, where A 0 = e k β k β 0 kx k. Therefore ê rem β = A + A + A 3 with A = A 0 M λ 0 A 0 M f 0, A = λ 0 f 0 λβ f β, and A 3 = ê e. We find ranka R, ranka R, ranka 3 3R, and thus rankê rem β 7R, as stated in the theorem. Having expansions for M λβ and M fβ, we also have expansions for P λβ = I N M λβ and P fβ = I T M fβ. The reason why we give expansions of the projectors and not expansions of λβ and fβ directly is for the latter we would need to specify a normalization, whereas the projectors are independent of any normalization choice. An expansion for λβ can, for example, be defined by λβ = P λβλ 0, in which case the normalization of λβ is implicitly defined by the normalization of λ 0. S.0 Consistency Proof for Bias and Variance Estimators Proof of Theorem 4.4 It is convenient to introduce some alternative notation for Definition in section 4.3 of the main text. 36

37 Definition Let Γ : R R be the truncation kernel defined by Γx = for x, and Γx = 0 otherwise. Let M be a bandwidth parameter that depends on N and T. For an N N matrix A with elements A ij and a T T matrix B with elements B ts we define i the diagonal truncations A truncd = diag[a ii i=,...,n ] and B truncd = diag[b tt t=,...,t ]. ii the right-sided Kernel truncation of B, which is a T T matrix B truncr with elements Bts truncr = Γ s t Bts for t < s, and B truncr M ts = 0 otherwise. Here, we suppress the dependence of B truncr on the bandwidth parameter M. Using this notation we can represent the estimators for the bias in Definition as follows: B,k = N Tr [P f ê X k truncr], B,k = [ê T Tr ê truncd M λ X k f f f λ λ ] λ, B 3,k = [ê N Tr ê truncd M f X k λ λ λ f ] f f. Before proving Theorem 4.4 we establish some preliminary results. Corollary S.0.. Under the Assumptions of Theorem 4.3 we have β β 0 = O p. This corollary directly follows from Theorem 4.3. Corollary S.0.. Under the Assumptions of Theorem 4.4 we have P λ P λ 0 = M λ M λ 0 = Op N /, P f P f 0 = M f M f 0 = O p T /. Proof. Using e = O p N / and X k = O p N we find the expansion terms in Theorem S.9. satisfy M λ,e = O p N /, M λ,e = O p N, M λ,k = O p. Together with corollary S.0. the result for M λ M λ 0 immediately follows. In addition we have P λ P λ 0 = M λ + M λ 0. The proof for M f and P f is analogous. 37
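The diagonal truncation $[\cdot]^{\mathrm{truncD}}$ and the right-sided kernel truncation $[\cdot]^{\mathrm{truncR}}$ defined at the beginning of this section are straightforward to compute. A minimal sketch, assuming numpy and with the bandwidth $M$ and all matrix names purely illustrative:

```python
import numpy as np

def trunc_diag(A):
    """Diagonal truncation A^truncD: keep only the diagonal of a square matrix."""
    return np.diag(np.diag(A))

def trunc_right(B, M):
    """Right-sided kernel truncation B^truncR with truncation kernel
    Gamma(x) = 1{|x| <= 1}: keep B[t, s] only for t < s <= t + M, and set
    all other entries to zero."""
    T = B.shape[0]
    out = np.zeros_like(B)
    for t in range(T):
        for s in range(t + 1, min(t + M, T - 1) + 1):
            out[t, s] = B[t, s]
    return out

# Example with a small matrix and bandwidth M = 2; in the bias estimators of
# this section these truncations are applied to ehat' X_k (right-sided) and
# to ehat ehat' or ehat' ehat (diagonal) before taking traces.
rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
print(trunc_right(B, 2))
print(trunc_diag(B))
```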

38 Lemma S.0.3. Under the Assumptions of Theorem 4.4 we have A A N i= N i= T t= T t= e it X it X it X it X it = o p, e it ê it Xit X it = o p. Lemma S.0.4. Let f and f 0 be normalized as f f/t = IR and f 0 f 0 /T = I R. Then, under the assumptions of Theorem 4.4, there exists an R R matrix H = H such that f f 0 H = O p, λ λ 0 H = Op. Furthermore λ λ λ f f f λ 0 λ 0 λ 0 f 0 f 0 f 0 = Op N 3/. Here, the matrix H depends on N, T, but we write H instead of H simple. to keep notation Lemma S.0.5. Under the Assumptions of Theorem 4.4 we have i N EC e X k ê X k truncr = op, ii N EC e e ê ê truncd = op, iii T EC ee ê ê truncd = op. Lemma S.0.6. Under the Assumptions of Theorem 4.4 we have i N ê X k truncr = Op MT /8, ii N ê ê truncd = Op, iii T ê ê truncd = Op. The proof of the above lemmas is given section S. below. Using these lemmas we can now prove Theorem

39 Proof of Theorem 4.4, Part I: show Ŵ = W + o p. Using Tr C C rank C and corollary S.0. we find: Ŵk k W,k k [ = Tr M λ M ] ] λ 0 Xk M f X k + Tr [M λ 0 X k M f M f 0 X k R M λ M λ 0 Xk X k R M f M f 0 = R O pn O p + R O pt O p = o p. Thus we have Ŵ = W + o p = W + o p. X k X k Proof of Theorem 4.4, Part II: show Ω = Ω + o p. Let Ω N T i= t= e it X it X it. We have Ω = Ω + o P = Ω + A + A + o p = Ω + o P, where A and A are defined in Lemma S.0.3, and the lemma states A and A are o p. Proof of Theorem 4.4, Part III: show B = B + o p. Let B,k, = N Tr [P f 0 E C e X k ]. According to Assumption 6 we have B,k = B,k, +o p. What is left to show is B,k, = B,k + o p. Using Tr C C rank C we find B,k, B [ ] = E C N TrP f 0 e X k N [ Tr P f ê X k truncr] [P N Tr f 0 P f ê X k truncr] + { [ N Tr P f 0 E C e X k ê X k truncr]} R P f 0 P N f ê X k truncr + R N P f 0 EC e X k ê X k truncr. We have P f 0 =. We now apply Lemmas S.0.5, S.0. and S.0.6 to find B,k, B = N O p N / O p M /8 + o p N = o p. This is what we wanted to show. 39

40 Proof of Theorem 4.4, final part: show B = B + o p and B 3 = B 3 + o p. Define B,k, = T Tr [ E C ee M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 ]. According to Assumption 6 we have B,k = B,k, + o p. What is left to show is B,k, = B,k + o p. We have B,k B,k = T Tr [ E C ee M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 ] T Tr [ ê ê truncd M λ X k f f f λ λ λ ] = T Tr [ ê ê truncd M λ X k f 0 f 0 f 0 λ 0 λ 0 λ 0 f f f λ λ λ ] + T Tr [ê ê truncd M λ 0 M λ Xk f 0 f 0 f 0 λ 0 λ 0 λ 0 ] + T Tr {[ E C ee ê ê truncd] M λ 0 X k f 0 f 0 f 0 λ 0 λ 0 λ 0 } Using Tr C C rank C which is true for every square matrix C we find B,k B R,k ê ê truncd Xk f 0 f 0 f 0 λ 0 λ 0 λ 0 T f f f λ λ λ + R ê ê truncd Mλ 0 M λ Xk f 0 f 0 f 0 λ 0 λ 0 λ 0 T + R E C ee ê ê truncd Xk f 0 f 0 f 0 λ 0 λ 0 λ 0. T Here we used M f 0 = =. Using X k = O p, and applying Lemmas S.0., M f S.0.4, S.0.5 and S.0.6, we now find B,k B [,k = T O p T O p / O p N 3/ + O p T O p N / O p / O p / ] + o p T O p / O p / = o p. This is what we wanted to show. The proof of B 3 = B 3 + o p is analogous.. S. Proof of Intermediate Lemma Here we provide the proof of some intermediate lemmas that were stated and used in section S.0. The following lemma gives a useful bound on the maximum of correlated random variables 40

41 Lemma S... Let Z i, i =,,..., n, be n real valued random variables, and let γ and B > 0 be finite constants independent of n. Assume max i E C Z i γ B, i.e., the γ th moment of the Z i are finite and uniformly bounded. For n we then have max i Z i = O p n /γ. S.. Proof. Using Jensen s inequality one obtains E C max i Z i E C max i Z i γ /γ E C n i= Z i γ /γ n max i E C Z i γ /γ n /γ B /γ. Markov s inequality then gives equation S... Lemma S... Let Z k,tτ = N / Z t = N / Z 3 i = T / N [e it X k,iτ E C e it X k,iτ ], i= N [ ] e it E C e it, i= T [ ] e it E C e it. t= Under assumption 5 we have E Z C k,tτ 4 B, E Z C tτ 4 B, E Z3 C i 4 B, for some B > 0, i.e., the conditional expectations Z k,tτ, Z tτ, and Z 3 i are uniformly bounded over t, τ, or i, respectively. Proof. # We start with the proof for Z k,tτ. Define Z k,tτ,i = e itx k,iτ E C e it X k,iτ. By assumption we have finite 8th moments for e it and X k,iτ uniformly across k, i, t, τ, and thus using Cauchy Schwarz inequality we have finite 4th moment of Z k,tτ,i uniformly across k, i, t, τ. For ease of notation we now fix k, t, τ and write Z i = Z k,tτ,i. We have E CZ i = 0 and E C Z i Z j Z k Z l = 0 if i / {j, k, l} and the same holds for permutations of i, j, k, l. Using 4

42 this we compute N 4 N E C Z i = E C Z i Z j Z k Z l i= i,j,k,l= = 3 E C Z i Zj + E C Z 4 i i j i N N [ ] } = 3 E C Z i EC Z j + {E C Z 4i 3 EC Z i i,j= i=, Because we argued E C Z 4 i is bounded uniformly, the last equation shows is bounded uniformly across k, t, τ. This is what we wanted to show. # The proofs for Z t and Z 3 i are analogous. Z k,tτ = N / N i= Z k,tτ,i Lemma S..3. For a T T matrix A we have A truncr M A truncr max M max max A tτ, t t<τ t+m Here, for the bounds on τ we could write max, t M instead of t M, and mint, t + M instead of t + M, to guarantee τ T. Since this would complicate notation, we prefer the convention A tτ = 0 for t < or τ < of t > T or τ > T. Proof. For the -norm of A truncr we find A truncr = max t=...t M t+m τ=t+ A tτ max A tτ = M A truncr t<τ t+m max, and analogously we find the same bound for the -norm A truncr. Applying part vii of Lemma S.4. we therefore also get this bound for the operator norm A truncr. Proof of Lemma S.0.3. # We first show A N T i= t= e it X it X it X it X it = o p. Let B,it = X it X it, B,it = e itx it, and B 3,it = e it X it. Note B, B, and B 3 can either be viewed as K-vectors for each pair i, t, or equivalently as N T matrices B,k, B,k, and B 3,k for each k =,..., K. We have A = i t B,it B,it + B 3,it B,it, or equivalently A,k k = Tr B,k B 3,k + B,k B,k. 4


More information

Lecture 4 Orthonormal vectors and QR factorization

Lecture 4 Orthonormal vectors and QR factorization Orthonormal vectors and QR factorization 4 1 Lecture 4 Orthonormal vectors and QR factorization EE263 Autumn 2004 orthonormal vectors Gram-Schmidt procedure, QR factorization orthogonal decomposition induced

More information

Lecture 20: Linear model, the LSE, and UMVUE

Lecture 20: Linear model, the LSE, and UMVUE Lecture 20: Linear model, the LSE, and UMVUE Linear Models One of the most useful statistical models is X i = β τ Z i + ε i, i = 1,...,n, where X i is the ith observation and is often called the ith response;

More information

The University of Texas at Austin Department of Electrical and Computer Engineering. EE381V: Large Scale Learning Spring 2013.

The University of Texas at Austin Department of Electrical and Computer Engineering. EE381V: Large Scale Learning Spring 2013. The University of Texas at Austin Department of Electrical and Computer Engineering EE381V: Large Scale Learning Spring 2013 Assignment Two Caramanis/Sanghavi Due: Tuesday, Feb. 19, 2013. Computational

More information

STAT 100C: Linear models

STAT 100C: Linear models STAT 100C: Linear models Arash A. Amini June 9, 2018 1 / 56 Table of Contents Multiple linear regression Linear model setup Estimation of β Geometric interpretation Estimation of σ 2 Hat matrix Gram matrix

More information

SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS

SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS SPECTRAL THEOREM FOR COMPACT SELF-ADJOINT OPERATORS G. RAMESH Contents Introduction 1 1. Bounded Operators 1 1.3. Examples 3 2. Compact Operators 5 2.1. Properties 6 3. The Spectral Theorem 9 3.3. Self-adjoint

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

Convergence of Eigenspaces in Kernel Principal Component Analysis

Convergence of Eigenspaces in Kernel Principal Component Analysis Convergence of Eigenspaces in Kernel Principal Component Analysis Shixin Wang Advanced machine learning April 19, 2016 Shixin Wang Convergence of Eigenspaces April 19, 2016 1 / 18 Outline 1 Motivation

More information

NORMS ON SPACE OF MATRICES

NORMS ON SPACE OF MATRICES NORMS ON SPACE OF MATRICES. Operator Norms on Space of linear maps Let A be an n n real matrix and x 0 be a vector in R n. We would like to use the Picard iteration method to solve for the following system

More information

Chapter 3. Matrices. 3.1 Matrices

Chapter 3. Matrices. 3.1 Matrices 40 Chapter 3 Matrices 3.1 Matrices Definition 3.1 Matrix) A matrix A is a rectangular array of m n real numbers {a ij } written as a 11 a 12 a 1n a 21 a 22 a 2n A =.... a m1 a m2 a mn The array has m rows

More information

Lecture 7 Spectral methods

Lecture 7 Spectral methods CSE 291: Unsupervised learning Spring 2008 Lecture 7 Spectral methods 7.1 Linear algebra review 7.1.1 Eigenvalues and eigenvectors Definition 1. A d d matrix M has eigenvalue λ if there is a d-dimensional

More information

ELEMENTS OF MATRIX ALGEBRA

ELEMENTS OF MATRIX ALGEBRA ELEMENTS OF MATRIX ALGEBRA CHUNG-MING KUAN Department of Finance National Taiwan University September 09, 2009 c Chung-Ming Kuan, 1996, 2001, 2009 E-mail: ckuan@ntuedutw; URL: homepagentuedutw/ ckuan CONTENTS

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

The Singular Value Decomposition

The Singular Value Decomposition CHAPTER 6 The Singular Value Decomposition Exercise 67: SVD examples (a) For A =[, 4] T we find a matrixa T A =5,whichhastheeigenvalue =5 Thisprovidesuswiththesingularvalue =+ p =5forA Hence the matrix

More information

arxiv: v5 [math.na] 16 Nov 2017

arxiv: v5 [math.na] 16 Nov 2017 RANDOM PERTURBATION OF LOW RANK MATRICES: IMPROVING CLASSICAL BOUNDS arxiv:3.657v5 [math.na] 6 Nov 07 SEAN O ROURKE, VAN VU, AND KE WANG Abstract. Matrix perturbation inequalities, such as Weyl s theorem

More information

08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms

08a. Operators on Hilbert spaces. 1. Boundedness, continuity, operator norms (February 24, 2017) 08a. Operators on Hilbert spaces Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/real/notes 2016-17/08a-ops

More information

GI07/COMPM012: Mathematical Programming and Research Methods (Part 2) 2. Least Squares and Principal Components Analysis. Massimiliano Pontil

GI07/COMPM012: Mathematical Programming and Research Methods (Part 2) 2. Least Squares and Principal Components Analysis. Massimiliano Pontil GI07/COMPM012: Mathematical Programming and Research Methods (Part 2) 2. Least Squares and Principal Components Analysis Massimiliano Pontil 1 Today s plan SVD and principal component analysis (PCA) Connection

More information

Review of similarity transformation and Singular Value Decomposition

Review of similarity transformation and Singular Value Decomposition Review of similarity transformation and Singular Value Decomposition Nasser M Abbasi Applied Mathematics Department, California State University, Fullerton July 8 7 page compiled on June 9, 5 at 9:5pm

More information

Zhaoxing Gao and Ruey S Tsay Booth School of Business, University of Chicago. August 23, 2018

Zhaoxing Gao and Ruey S Tsay Booth School of Business, University of Chicago. August 23, 2018 Supplementary Material for Structural-Factor Modeling of High-Dimensional Time Series: Another Look at Approximate Factor Models with Diverging Eigenvalues Zhaoxing Gao and Ruey S Tsay Booth School of

More information

The goal of this chapter is to study linear systems of ordinary differential equations: dt,..., dx ) T

The goal of this chapter is to study linear systems of ordinary differential equations: dt,..., dx ) T 1 1 Linear Systems The goal of this chapter is to study linear systems of ordinary differential equations: ẋ = Ax, x(0) = x 0, (1) where x R n, A is an n n matrix and ẋ = dx ( dt = dx1 dt,..., dx ) T n.

More information

Estimation of Random Coefficients Logit Demand Models with Interactive Fixed Effects

Estimation of Random Coefficients Logit Demand Models with Interactive Fixed Effects Estimation of Random Coefficients Logit Demand Models with Interactive Fixed Effects Hyungsik Roger Moon Matthew Shum Martin Weidner First draft: October 2009; This draft: October 4, 205 Abstract We extend

More information

Singular Value Decomposition and Principal Component Analysis (PCA) I

Singular Value Decomposition and Principal Component Analysis (PCA) I Singular Value Decomposition and Principal Component Analysis (PCA) I Prof Ned Wingreen MOL 40/50 Microarray review Data per array: 0000 genes, I (green) i,i (red) i 000 000+ data points! The expression

More information

Estimation of random coefficients logit demand models with interactive fixed effects

Estimation of random coefficients logit demand models with interactive fixed effects Estimation of random coefficients logit demand models with interactive fixed effects Hyungsik Roger Moon Matthew Shum Martin Weidner The Institute for Fiscal Studies Department of Economics, UCL cemmap

More information

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis.

Vector spaces. DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis. Vector spaces DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Vector space Consists of: A set V A scalar

More information

1 9/5 Matrices, vectors, and their applications

1 9/5 Matrices, vectors, and their applications 1 9/5 Matrices, vectors, and their applications Algebra: study of objects and operations on them. Linear algebra: object: matrices and vectors. operations: addition, multiplication etc. Algorithms/Geometric

More information

MTH 309 Supplemental Lecture Notes Based on Robert Messer, Linear Algebra Gateway to Mathematics

MTH 309 Supplemental Lecture Notes Based on Robert Messer, Linear Algebra Gateway to Mathematics MTH 309 Supplemental Lecture Notes Based on Robert Messer, Linear Algebra Gateway to Mathematics Ulrich Meierfrankenfeld Department of Mathematics Michigan State University East Lansing MI 48824 meier@math.msu.edu

More information

Economics 204 Summer/Fall 2010 Lecture 10 Friday August 6, 2010

Economics 204 Summer/Fall 2010 Lecture 10 Friday August 6, 2010 Economics 204 Summer/Fall 2010 Lecture 10 Friday August 6, 2010 Diagonalization of Symmetric Real Matrices (from Handout Definition 1 Let δ ij = { 1 if i = j 0 if i j A basis V = {v 1,..., v n } of R n

More information

Specification Test for Instrumental Variables Regression with Many Instruments

Specification Test for Instrumental Variables Regression with Many Instruments Specification Test for Instrumental Variables Regression with Many Instruments Yoonseok Lee and Ryo Okui April 009 Preliminary; comments are welcome Abstract This paper considers specification testing

More information

Stein s Method and Characteristic Functions

Stein s Method and Characteristic Functions Stein s Method and Characteristic Functions Alexander Tikhomirov Komi Science Center of Ural Division of RAS, Syktyvkar, Russia; Singapore, NUS, 18-29 May 2015 Workshop New Directions in Stein s method

More information

Eigenvalues, random walks and Ramanujan graphs

Eigenvalues, random walks and Ramanujan graphs Eigenvalues, random walks and Ramanujan graphs David Ellis 1 The Expander Mixing lemma We have seen that a bounded-degree graph is a good edge-expander if and only if if has large spectral gap If G = (V,

More information

Stochastic Design Criteria in Linear Models

Stochastic Design Criteria in Linear Models AUSTRIAN JOURNAL OF STATISTICS Volume 34 (2005), Number 2, 211 223 Stochastic Design Criteria in Linear Models Alexander Zaigraev N. Copernicus University, Toruń, Poland Abstract: Within the framework

More information

Linear Algebra 1. M.T.Nair Department of Mathematics, IIT Madras. and in that case x is called an eigenvector of T corresponding to the eigenvalue λ.

Linear Algebra 1. M.T.Nair Department of Mathematics, IIT Madras. and in that case x is called an eigenvector of T corresponding to the eigenvalue λ. Linear Algebra 1 M.T.Nair Department of Mathematics, IIT Madras 1 Eigenvalues and Eigenvectors 1.1 Definition and Examples Definition 1.1. Let V be a vector space (over a field F) and T : V V be a linear

More information

Sample ECE275A Midterm Exam Questions

Sample ECE275A Midterm Exam Questions Sample ECE275A Midterm Exam Questions The questions given below are actual problems taken from exams given in in the past few years. Solutions to these problems will NOT be provided. These problems and

More information

Chapter 3 Transformations

Chapter 3 Transformations Chapter 3 Transformations An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Linear Transformations A function is called a linear transformation if 1. for every and 2. for every If we fix the bases

More information

Lecture 8: Linear Algebra Background

Lecture 8: Linear Algebra Background CSE 521: Design and Analysis of Algorithms I Winter 2017 Lecture 8: Linear Algebra Background Lecturer: Shayan Oveis Gharan 2/1/2017 Scribe: Swati Padmanabhan Disclaimer: These notes have not been subjected

More information

7 Principal Component Analysis

7 Principal Component Analysis 7 Principal Component Analysis This topic will build a series of techniques to deal with high-dimensional data. Unlike regression problems, our goal is not to predict a value (the y-coordinate), it is

More information

Estimation of Random Coefficients Logit Demand Models with Interactive Fixed Effects

Estimation of Random Coefficients Logit Demand Models with Interactive Fixed Effects Estimation of Random Coefficients Logit Demand Models with Interactive Fixed Effects Hyungsik Roger Moon Matthew Shum Martin Weidner First draft: October 2009; This draft: September 5, 204 Abstract We

More information

5 Compact linear operators

5 Compact linear operators 5 Compact linear operators One of the most important results of Linear Algebra is that for every selfadjoint linear map A on a finite-dimensional space, there exists a basis consisting of eigenvectors.

More information

Random matrices: Distribution of the least singular value (via Property Testing)

Random matrices: Distribution of the least singular value (via Property Testing) Random matrices: Distribution of the least singular value (via Property Testing) Van H. Vu Department of Mathematics Rutgers vanvu@math.rutgers.edu (joint work with T. Tao, UCLA) 1 Let ξ be a real or complex-valued

More information

Practice Exam. 2x 1 + 4x 2 + 2x 3 = 4 x 1 + 2x 2 + 3x 3 = 1 2x 1 + 3x 2 + 4x 3 = 5

Practice Exam. 2x 1 + 4x 2 + 2x 3 = 4 x 1 + 2x 2 + 3x 3 = 1 2x 1 + 3x 2 + 4x 3 = 5 Practice Exam. Solve the linear system using an augmented matrix. State whether the solution is unique, there are no solutions or whether there are infinitely many solutions. If the solution is unique,

More information

Classes of Linear Operators Vol. I

Classes of Linear Operators Vol. I Classes of Linear Operators Vol. I Israel Gohberg Seymour Goldberg Marinus A. Kaashoek Birkhäuser Verlag Basel Boston Berlin TABLE OF CONTENTS VOLUME I Preface Table of Contents of Volume I Table of Contents

More information

1. General Vector Spaces

1. General Vector Spaces 1.1. Vector space axioms. 1. General Vector Spaces Definition 1.1. Let V be a nonempty set of objects on which the operations of addition and scalar multiplication are defined. By addition we mean a rule

More information

Jordan normal form notes (version date: 11/21/07)

Jordan normal form notes (version date: 11/21/07) Jordan normal form notes (version date: /2/7) If A has an eigenbasis {u,, u n }, ie a basis made up of eigenvectors, so that Au j = λ j u j, then A is diagonal with respect to that basis To see this, let

More information

Introduction to Quantitative Techniques for MSc Programmes SCHOOL OF ECONOMICS, MATHEMATICS AND STATISTICS MALET STREET LONDON WC1E 7HX

Introduction to Quantitative Techniques for MSc Programmes SCHOOL OF ECONOMICS, MATHEMATICS AND STATISTICS MALET STREET LONDON WC1E 7HX Introduction to Quantitative Techniques for MSc Programmes SCHOOL OF ECONOMICS, MATHEMATICS AND STATISTICS MALET STREET LONDON WC1E 7HX September 2007 MSc Sep Intro QT 1 Who are these course for? The September

More information

Total Least Squares Approach in Regression Methods

Total Least Squares Approach in Regression Methods WDS'08 Proceedings of Contributed Papers, Part I, 88 93, 2008. ISBN 978-80-7378-065-4 MATFYZPRESS Total Least Squares Approach in Regression Methods M. Pešta Charles University, Faculty of Mathematics

More information

Matrix Factorizations

Matrix Factorizations 1 Stat 540, Matrix Factorizations Matrix Factorizations LU Factorization Definition... Given a square k k matrix S, the LU factorization (or decomposition) represents S as the product of two triangular

More information

ECE 275A Homework 6 Solutions

ECE 275A Homework 6 Solutions ECE 275A Homework 6 Solutions. The notation used in the solutions for the concentration (hyper) ellipsoid problems is defined in the lecture supplement on concentration ellipsoids. Note that θ T Σ θ =

More information

11 a 12 a 21 a 11 a 22 a 12 a 21. (C.11) A = The determinant of a product of two matrices is given by AB = A B 1 1 = (C.13) and similarly.

11 a 12 a 21 a 11 a 22 a 12 a 21. (C.11) A = The determinant of a product of two matrices is given by AB = A B 1 1 = (C.13) and similarly. C PROPERTIES OF MATRICES 697 to whether the permutation i 1 i 2 i N is even or odd, respectively Note that I =1 Thus, for a 2 2 matrix, the determinant takes the form A = a 11 a 12 = a a 21 a 11 a 22 a

More information

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 Instructions Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 The exam consists of four problems, each having multiple parts. You should attempt to solve all four problems. 1.

More information

EECS 275 Matrix Computation

EECS 275 Matrix Computation EECS 275 Matrix Computation Ming-Hsuan Yang Electrical Engineering and Computer Science University of California at Merced Merced, CA 95344 http://faculty.ucmerced.edu/mhyang Lecture 6 1 / 22 Overview

More information

MATH 106 LINEAR ALGEBRA LECTURE NOTES

MATH 106 LINEAR ALGEBRA LECTURE NOTES MATH 6 LINEAR ALGEBRA LECTURE NOTES FALL - These Lecture Notes are not in a final form being still subject of improvement Contents Systems of linear equations and matrices 5 Introduction to systems of

More information

Real Variables # 10 : Hilbert Spaces II

Real Variables # 10 : Hilbert Spaces II randon ehring Real Variables # 0 : Hilbert Spaces II Exercise 20 For any sequence {f n } in H with f n = for all n, there exists f H and a subsequence {f nk } such that for all g H, one has lim (f n k,

More information

Linear Algebra 2 Spectral Notes

Linear Algebra 2 Spectral Notes Linear Algebra 2 Spectral Notes In what follows, V is an inner product vector space over F, where F = R or C. We will use results seen so far; in particular that every linear operator T L(V ) has a complex

More information

Stat 159/259: Linear Algebra Notes

Stat 159/259: Linear Algebra Notes Stat 159/259: Linear Algebra Notes Jarrod Millman November 16, 2015 Abstract These notes assume you ve taken a semester of undergraduate linear algebra. In particular, I assume you are familiar with the

More information

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 2

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 2 EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 2 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory April 5, 2012 Andre Tkacenko

More information

The following definition is fundamental.

The following definition is fundamental. 1. Some Basics from Linear Algebra With these notes, I will try and clarify certain topics that I only quickly mention in class. First and foremost, I will assume that you are familiar with many basic

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information

Linear Algebra. Session 12

Linear Algebra. Session 12 Linear Algebra. Session 12 Dr. Marco A Roque Sol 08/01/2017 Example 12.1 Find the constant function that is the least squares fit to the following data x 0 1 2 3 f(x) 1 0 1 2 Solution c = 1 c = 0 f (x)

More information

Math 429/581 (Advanced) Group Theory. Summary of Definitions, Examples, and Theorems by Stefan Gille

Math 429/581 (Advanced) Group Theory. Summary of Definitions, Examples, and Theorems by Stefan Gille Math 429/581 (Advanced) Group Theory Summary of Definitions, Examples, and Theorems by Stefan Gille 1 2 0. Group Operations 0.1. Definition. Let G be a group and X a set. A (left) operation of G on X is

More information

Linear Algebra and Dirac Notation, Pt. 2

Linear Algebra and Dirac Notation, Pt. 2 Linear Algebra and Dirac Notation, Pt. 2 PHYS 500 - Southern Illinois University February 1, 2017 PHYS 500 - Southern Illinois University Linear Algebra and Dirac Notation, Pt. 2 February 1, 2017 1 / 14

More information

arxiv: v1 [math.na] 1 Sep 2018

arxiv: v1 [math.na] 1 Sep 2018 On the perturbation of an L -orthogonal projection Xuefeng Xu arxiv:18090000v1 [mathna] 1 Sep 018 September 5 018 Abstract The L -orthogonal projection is an important mathematical tool in scientific computing

More information

STAT200C: Review of Linear Algebra

STAT200C: Review of Linear Algebra Stat200C Instructor: Zhaoxia Yu STAT200C: Review of Linear Algebra 1 Review of Linear Algebra 1.1 Vector Spaces, Rank, Trace, and Linear Equations 1.1.1 Rank and Vector Spaces Definition A vector whose

More information

Kernel Method: Data Analysis with Positive Definite Kernels

Kernel Method: Data Analysis with Positive Definite Kernels Kernel Method: Data Analysis with Positive Definite Kernels 2. Positive Definite Kernel and Reproducing Kernel Hilbert Space Kenji Fukumizu The Institute of Statistical Mathematics. Graduate University

More information

Banach Journal of Mathematical Analysis ISSN: (electronic)

Banach Journal of Mathematical Analysis ISSN: (electronic) Banach J. Math. Anal. 6 (2012), no. 1, 139 146 Banach Journal of Mathematical Analysis ISSN: 1735-8787 (electronic) www.emis.de/journals/bjma/ AN EXTENSION OF KY FAN S DOMINANCE THEOREM RAHIM ALIZADEH

More information

Linear Algebra- Final Exam Review

Linear Algebra- Final Exam Review Linear Algebra- Final Exam Review. Let A be invertible. Show that, if v, v, v 3 are linearly independent vectors, so are Av, Av, Av 3. NOTE: It should be clear from your answer that you know the definition.

More information

LINEAR ALGEBRA BOOT CAMP WEEK 2: LINEAR OPERATORS

LINEAR ALGEBRA BOOT CAMP WEEK 2: LINEAR OPERATORS LINEAR ALGEBRA BOOT CAMP WEEK 2: LINEAR OPERATORS Unless otherwise stated, all vector spaces in this worksheet are finite dimensional and the scalar field F has characteristic zero. The following are facts

More information

MATH 5640: Functions of Diagonalizable Matrices

MATH 5640: Functions of Diagonalizable Matrices MATH 5640: Functions of Diagonalizable Matrices Hung Phan, UMass Lowell November 27, 208 Spectral theorem for diagonalizable matrices Definition Let V = X Y Every v V is uniquely decomposed as u = x +

More information

Estimation of random coefficients logit demand models with interactive fixed effects

Estimation of random coefficients logit demand models with interactive fixed effects Estimation of random coefficients logit demand models with interactive fixed effects Hyungsik Roger Moon Matthew Shum Martin Weidner The Institute for Fiscal Studies Department of Economics, UCL cemmap

More information

Lecture notes: Applied linear algebra Part 1. Version 2

Lecture notes: Applied linear algebra Part 1. Version 2 Lecture notes: Applied linear algebra Part 1. Version 2 Michael Karow Berlin University of Technology karow@math.tu-berlin.de October 2, 2008 1 Notation, basic notions and facts 1.1 Subspaces, range and

More information

A Note on Hilbertian Elliptically Contoured Distributions

A Note on Hilbertian Elliptically Contoured Distributions A Note on Hilbertian Elliptically Contoured Distributions Yehua Li Department of Statistics, University of Georgia, Athens, GA 30602, USA Abstract. In this paper, we discuss elliptically contoured distribution

More information

Matrix Theory, Math6304 Lecture Notes from October 25, 2012

Matrix Theory, Math6304 Lecture Notes from October 25, 2012 Matrix Theory, Math6304 Lecture Notes from October 25, 2012 taken by John Haas Last Time (10/23/12) Example of Low Rank Perturbation Relationship Between Eigenvalues and Principal Submatrices: We started

More information

2. Matrix Algebra and Random Vectors

2. Matrix Algebra and Random Vectors 2. Matrix Algebra and Random Vectors 2.1 Introduction Multivariate data can be conveniently display as array of numbers. In general, a rectangular array of numbers with, for instance, n rows and p columns

More information

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian FE661 - Statistical Methods for Financial Engineering 2. Linear algebra Jitkomut Songsiri matrices and vectors linear equations range and nullspace of matrices function of vectors, gradient and Hessian

More information

1. Addition: To every pair of vectors x, y X corresponds an element x + y X such that the commutative and associative properties hold

1. Addition: To every pair of vectors x, y X corresponds an element x + y X such that the commutative and associative properties hold Appendix B Y Mathematical Refresher This appendix presents mathematical concepts we use in developing our main arguments in the text of this book. This appendix can be read in the order in which it appears,

More information