Assessing the dependence of high-dimensional time series via sample autocovariances and correlations
Assessing the dependence of high-dimensional time series via sample autocovariances and correlations

Johannes Heiny, University of Aarhus

Joint work with Thomas Mikosch (Copenhagen), Richard Davis (Columbia), Alex Aue and Debashis Paul (UC Davis), and Jianfeng Yao (HKU). Copenhagen.

J. Heiny Large correlation and covariance matrices 1 / 36
Motivation: S&P 500 Index

Figure: Estimated lower and upper tail indices of log-returns of 478 time series in the S&P 500 index.
Setup

Data matrix $X = X_p$: a $p \times n$ matrix with iid centered columns, $X = (X_{it})_{i=1,\dots,p;\, t=1,\dots,n}$.

Sample covariance matrix $S = \frac{1}{n} X X'$.

Ordered eigenvalues of $S$: $\lambda_1(S) \ge \lambda_2(S) \ge \dots \ge \lambda_p(S)$.

Applications: principal component analysis, linear regression, ...
Sample Correlation Matrix

Sample correlation matrix $R$ with entries
$$R_{ij} = \frac{S_{ij}}{\sqrt{S_{ii} S_{jj}}}, \qquad i, j = 1, \dots, p,$$
and eigenvalues $\lambda_1(R) \ge \lambda_2(R) \ge \dots \ge \lambda_p(R)$.
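Restated numerically: $R$ is obtained from $S$ by rescaling with the inverse square roots of its diagonal. A minimal NumPy sketch (the dimensions $p$, $n$ and the Gaussian entries are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 5, 100
X = rng.standard_normal((p, n))      # p x n data matrix with iid centered entries

S = X @ X.T / n                      # sample covariance matrix S = (1/n) X X'
d = 1.0 / np.sqrt(np.diag(S))        # 1 / sqrt(S_ii)
R = S * np.outer(d, d)               # R_ij = S_ij / sqrt(S_ii * S_jj)

eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]   # lambda_1(R) >= ... >= lambda_p(R)
```

Since $R$ has unit diagonal, its eigenvalues always sum to $p$, which is a quick sanity check on any implementation.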
The Model

Data structure: $X_p = A_p Z_p$, where $A_p$ is a deterministic $p \times p$ matrix such that $(\|A_p\|)$ is bounded, and $Z_p = (Z_{it})_{i=1,\dots,p;\, t=1,\dots,n}$ has iid centered entries with unit variance (if finite).

Population covariance matrix $\Sigma = A A'$.

Population correlation matrix $\Gamma = (\operatorname{diag}(\Sigma))^{-1/2}\, \Sigma\, (\operatorname{diag}(\Sigma))^{-1/2}$.

Note: $\mathbb{E}[S] = \Sigma$, but $\mathbb{E}[R_{ij}] = \Gamma_{ij} + O(n^{-1})$.
The Model

Sample vs. population:
- Covariance matrix: sample $S$, population $\Sigma$.
- Correlation matrix: sample $R$, population $\Gamma$.

Growth regime: $n = n_p$ and $p/n_p \to \gamma \in [0, \infty)$ as $p \to \infty$.
- High dimension: $\lim_{p\to\infty} p/n \in (0, \infty)$.
- Moderate dimension: $\lim_{p\to\infty} p/n = 0$.
Main Result

Approximation under finite fourth moment: assume $X = AZ$ and $\mathbb{E}[Z_{11}^4] < \infty$. Then, as $p \to \infty$,
$$\sqrt{n/p}\, \bigl\| \operatorname{diag}(S) - \operatorname{diag}(\Sigma) \bigr\| \xrightarrow{a.s.} 0.$$

Approximation under infinite fourth moment: assume $X = Z$ and $\mathbb{E}[Z_{11}^4] = \infty$. Then, as $p \to \infty$,
$$c_{np}\, \bigl\| S - \operatorname{diag}(S) \bigr\| \xrightarrow{a.s.} 0$$
for a suitable normalizing sequence $(c_{np})$.
Main result

Assume $X = AZ$ and $\mathbb{E}[Z_{11}^4] < \infty$. Then, as $p \to \infty$,
$$\sqrt{n/p}\, \bigl\| \operatorname{diag}(S) - \operatorname{diag}(\Sigma) \bigr\| \xrightarrow{a.s.} 0
\quad\text{and}\quad
\sqrt{n/p}\, \bigl\| (\operatorname{diag}(S))^{-1/2} - (\operatorname{diag}(\Sigma))^{-1/2} \bigr\| \xrightarrow{a.s.} 0.$$

Relevance: note that $R = (\operatorname{diag}(S))^{-1/2}\, S\, (\operatorname{diag}(S))^{-1/2}$. Moreover $S = \frac{1}{n} X X'$ and $R = Y Y'$, where
$$Y = (Y_{ij})_{p \times n} = \Bigl( \frac{X_{ij}}{\sqrt{\sum_{t=1}^n X_{it}^2}} \Bigr)_{p \times n}.$$
In general, any two entries of $Y$ are dependent.
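The identity $R = YY'$ for the row-normalized matrix $Y$ can be verified directly; a small sketch with arbitrary dimensions and Gaussian entries (illustrative choices only):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 4, 50
X = rng.standard_normal((p, n))

# Row-normalized matrix Y with Y_ij = X_ij / sqrt(sum_t X_it^2)
Y = X / np.sqrt((X ** 2).sum(axis=1, keepdims=True))

# The sample correlation matrix computed two ways:
S = X @ X.T / n
Dinv = np.diag(1.0 / np.sqrt(np.diag(S)))
R_from_S = Dinv @ S @ Dinv           # (diag S)^{-1/2} S (diag S)^{-1/2}
R_from_Y = Y @ Y.T                   # Y Y'
```

The factor $1/n$ cancels in the rescaling, which is why $R$ depends on $X$ only through the self-normalized rows $Y$.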
A Comparison Under Finite Fourth Moment

Approximation of the sample correlation matrix: assume $X = AZ$ and $\mathbb{E}[Z_{11}^4] < \infty$. Then
$$\sqrt{n/p}\, \Bigl\| R - \underbrace{(\operatorname{diag}(\Sigma))^{-1/2}\, S\, (\operatorname{diag}(\Sigma))^{-1/2}}_{=:\, S^Q} \Bigr\| \xrightarrow{a.s.} 0.$$

Spectrum comparison: an application of Weyl's inequality yields
$$\sqrt{n/p}\, \max_{i=1,\dots,p} \bigl| \lambda_i(R) - \lambda_i(S^Q) \bigr| \le \sqrt{n/p}\, \| R - S^Q \| \xrightarrow{a.s.} 0.$$

Operator norm consistent estimation: $\| R - \Gamma \| = O(\sqrt{p/n})$ a.s.
Notation

Empirical spectral distribution of a $p \times p$ matrix $C$ with real eigenvalues $\lambda_1(C), \dots, \lambda_p(C)$:
$$F_C(x) = \frac{1}{p} \sum_{i=1}^p \mathbf{1}_{\{\lambda_i(C) \le x\}}, \qquad x \in \mathbb{R}.$$

Stieltjes transform:
$$s_C(z) = \int_{\mathbb{R}} \frac{1}{x - z}\, dF_C(x) = \frac{1}{p} \operatorname{tr}\bigl( (C - zI)^{-1} \bigr), \qquad z \in \mathbb{C}^+.$$

Limiting spectral distribution: weak convergence of $(F_{C_p})$ to a distribution function $F$, a.s.
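Both objects are one-liners once the eigenvalues are available; a minimal sketch of the two definitions above (function names are my own):

```python
import numpy as np

def esd(C, x):
    """Empirical spectral distribution F_C(x) = (1/p) #{i : lambda_i(C) <= x}."""
    lam = np.linalg.eigvalsh(C)          # real eigenvalues of symmetric C
    return np.mean(lam <= x)

def stieltjes(C, z):
    """Stieltjes transform s_C(z) = (1/p) tr (C - z I)^{-1}, for Im(z) > 0."""
    p = C.shape[0]
    return np.trace(np.linalg.inv(C - z * np.eye(p))) / p
```

For $C = I$ both have closed forms ($F_C$ jumps at 1, $s_C(z) = 1/(1-z)$), which makes the implementation easy to check.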
Limiting Spectral Distribution of R

Assume $X = AZ$, $\mathbb{E}[Z_{11}^4] < \infty$, and that $F_\Gamma$ converges to a probability distribution $H$.

1. If $p/n \to \gamma \in (0, \infty)$, then $F_R$ converges weakly to a distribution function $F_{\gamma,H}$ whose Stieltjes transform $s$ satisfies
$$s(z) = \int \frac{dH(t)}{t(1 - \gamma - \gamma z\, s(z)) - z}, \qquad z \in \mathbb{C}^+.$$

2. If $p/n \to 0$, then $F_{\sqrt{n/p}\,(R - \Gamma)}$ converges weakly to a distribution function $F$ whose Stieltjes transform $s$ satisfies
$$s(z) = -\int \frac{dH(t)}{z + t\, \underline{s}(z)}, \qquad z \in \mathbb{C}^+,$$
where $\underline{s}$ is the unique solution to $\underline{s}(z) = -\int \frac{t\, dH(t)}{z + t\, \underline{s}(z)}$, $z \in \mathbb{C}^+$.
Special Case A = I

Simplified assumptions:
1. iid symmetric entries $X_{it} \stackrel{d}{=} X$.
2. Growth regime: $\lim_{p\to\infty} p/n = \gamma \in [0, 1]$.
Marčenko-Pastur and Semicircle Law

Marčenko-Pastur law $F_\gamma$ has density
$$f_\gamma(x) = \begin{cases} \dfrac{1}{2\pi x \gamma} \sqrt{(b - x)(x - a)}, & \text{if } x \in [a, b], \\[4pt] 0, & \text{otherwise}, \end{cases}$$
where $a = (1 - \sqrt{\gamma})^2$ and $b = (1 + \sqrt{\gamma})^2$.

Semicircle law SC.
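The density above is straightforward to code for later comparison with eigenvalue histograms; a sketch (the function name is my own):

```python
import numpy as np

def mp_density(x, gamma):
    """Marchenko-Pastur density f_gamma, supported on [a, b] with
    a = (1 - sqrt(gamma))^2 and b = (1 + sqrt(gamma))^2."""
    a = (1.0 - np.sqrt(gamma)) ** 2
    b = (1.0 + np.sqrt(gamma)) ** 2
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    inside = (x > a) & (x < b)
    out[inside] = np.sqrt((b - x[inside]) * (x[inside] - a)) / (2.0 * np.pi * gamma * x[inside])
    return out
```

For $\gamma \le 1$ the density integrates to 1, which gives a simple numerical check.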
Extreme Eigenvalues

Largest and smallest eigenvalues of $R$: if $p/n \to \gamma \in [0, 1]$ and $\mathbb{E}[X^4] < \infty$, then
$$\sqrt{n/p}\, \bigl( \lambda_1(R) - 1 \bigr) \xrightarrow{a.s.} 2 + \sqrt{\gamma}
\quad\text{and}\quad
\sqrt{n/p}\, \bigl( \lambda_p(R) - 1 \bigr) \xrightarrow{a.s.} -2 + \sqrt{\gamma}.$$

Earlier: $\| R - \Gamma \| = O(\sqrt{p/n})$ a.s. In this case:
$$\sqrt{n/p}\, \| R - \Gamma \| \xrightarrow{a.s.} 2 + \sqrt{\gamma}.$$
Limiting Spectral Distribution

Marčenko-Pastur theorem: assume $\mathbb{E}[X^2] = 1$. Then $(F_S)$ converges weakly to $F_\gamma$. If $\mathbb{E}[X^4] < \infty$ and $p/n \to 0$, then $(F_{\sqrt{n/p}\,(S - I)})$ converges weakly to SC.

JH (2018+): under the domain-of-attraction-type condition for the Gaussian law,
$$\lim_{p \to \infty} \frac{n}{p}\, n\, \mathbb{E}\bigl[ Y_{11}^4 \bigr] = 0,$$
the sequence $(F_R)$ converges weakly to $F_\gamma$. If in addition $p/n \to 0$, then $(F_{\sqrt{n/p}\,(R - I)})$ converges weakly to SC. Here $Y_{ij} = X_{ij} / \sqrt{\sum_{t=1}^n X_{it}^2}$.
Simulation Study

Regular variation with index $\alpha > 0$: $P(|X| > x) = x^{-\alpha} L(x)$, where $L$ is a slowly varying function. This implies $\mathbb{E}[|X|^{\alpha + \varepsilon}] = \infty$ for any $\varepsilon > 0$.

Procedure:
1. Simulate $X$.
2. Plot histograms of $(\lambda_i(R))$ and $(\lambda_i(S))$.
3. Compare with the Marčenko-Pastur density.
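The three steps above can be sketched as follows, using symmetrized Pareto($\alpha$) entries as one concrete regularly varying choice (the specific $\alpha$, $p$, $n$ here are illustrative, not the slide's values):

```python
import numpy as np

rng = np.random.default_rng(42)
alpha, p, n = 6.0, 200, 400

# Step 1: simulate X with P(|X| > x) = x^{-alpha}, x >= 1, random signs
U = rng.uniform(size=(p, n))
X = U ** (-1.0 / alpha) * rng.choice([-1.0, 1.0], size=(p, n))

S = X @ X.T / n
d = 1.0 / np.sqrt(np.diag(S))
R = S * np.outer(d, d)

# Step 2: spectra whose histograms would be plotted
eig_S = np.linalg.eigvalsh(S)
eig_R = np.linalg.eigvalsh(R)
# Step 3 would overlay the Marchenko-Pastur density with gamma = p / n.
```

Both $S$ and $R$ are positive semidefinite by construction, and $\operatorname{tr} R = p$, so the spectra can be sanity-checked before plotting.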
α = 6

Figure: simulation with α = 6, n = 2000, p = 1000.
Infinite Fourth Moment

Regular variation with index $\alpha \in (0, 4)$. Normalizing sequence $(a_{np}^2)$ such that
$$np\, P\bigl( X^2 > a_{np}^2\, x \bigr) \to x^{-\alpha/2}, \qquad n \to \infty,\ x > 0.$$
Then $a_{np} = (np)^{1/\alpha}\, \ell(np)$ for a slowly varying function $\ell$.
Reduction to Diagonal

Diagonal: $X$ with iid regularly varying entries, $\alpha \in (0, 4)$, and $p = n^\beta \ell(n)$ with $\beta \in [0, 1]$. We have
$$a_{np}^{-2}\, \bigl\| X X' - \operatorname{diag}(X X') \bigr\| \xrightarrow{P} 0,$$
where $\|\cdot\|$ denotes the spectral norm and $(X X')_{ij} = \sum_{t=1}^n X_{it} X_{jt}$.
Eigenvalues

Weyl's inequality: $\max_{i=1,\dots,p} \bigl| \lambda_i(A + B) - \lambda_i(A) \bigr| \le \| B \|$.

Choose $A + B = X X'$ and $A = \operatorname{diag}(X X')$ to obtain
$$a_{np}^{-2}\, \max_{i=1,\dots,p} \bigl| \lambda_i(X X') - \lambda_i(\operatorname{diag}(X X')) \bigr| \xrightarrow{P} 0, \qquad n \to \infty.$$

Note: limit theory for $(\lambda_i(S))$ is reduced to $(S_{ii})$.
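This reduction is easy to see numerically: Weyl's bound holds deterministically, and for heavy tails the off-diagonal part is small relative to the diagonal. A sketch with illustrative parameters (my own choices of $\alpha$, $p$, $n$):

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, p, n = 1.5, 100, 1000

# Symmetrized Pareto(alpha) entries: regularly varying with index alpha < 4
U = rng.uniform(size=(p, n))
X = U ** (-1.0 / alpha) * rng.choice([-1.0, 1.0], size=(p, n))

XXt = X @ X.T
lam = np.sort(np.linalg.eigvalsh(XXt))[::-1]          # eigenvalues of XX'
diag_sorted = np.sort(np.diag(XXt))[::-1]             # eigenvalues of diag(XX')

# Weyl: max_i |lambda_i(XX') - lambda_i(diag(XX'))| <= ||XX' - diag(XX')||
max_gap = np.max(np.abs(lam - diag_sorted))
off_norm = np.linalg.norm(XXt - np.diag(np.diag(XXt)), 2)   # spectral norm
```

In runs like this, `max_gap` is bounded by `off_norm` exactly as the inequality predicts, and both are small on the scale $a_{np}^2 \approx (np)^{2/\alpha}$.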
Example: Eigenvalues

Figure: Smoothed histogram, based on simulations, of the approximation error for the normalized eigenvalue $a_{np}^{-2} \lambda_1(S)$, for entries $X_{it}$ with $\alpha = 1.6$, $\beta = 1$, $n = 1000$ and $p = 200$.
Eigenvectors

$v_k$: unit eigenvector of $S$ associated with $\lambda_k(S)$. The unit eigenvectors of $\operatorname{diag}(S)$ are the canonical basis vectors $e_j$.

Eigenvectors: $X$ with iid regularly varying entries with index $\alpha \in (0, 4)$ and $p_n = n^\beta \ell(n)$ with $\beta \in [0, 1]$. Then, for any fixed $k \ge 1$,
$$\| v_k - e_{L_k} \|_{\ell^2} \xrightarrow{P} 0, \qquad n \to \infty,$$
where $L_k$ denotes the index of the $k$-th largest diagonal entry of $S$.
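The localization statement can be probed directly: compute the top eigenvector and compare it (up to sign) with the basis vector at the largest diagonal entry. A sketch with illustrative parameters mirroring the Pareto(0.8) example on the next slide:

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, p, n = 0.8, 200, 1000

U = rng.uniform(size=(p, n))
X = U ** (-1.0 / alpha)                  # Pareto(0.8)-type entries

S = X @ X.T / n
lam, V = np.linalg.eigh(S)               # eigenvalues in ascending order
v1 = V[:, -1]                            # unit eigenvector for lambda_1(S)

L1 = int(np.argmax(np.diag(S)))          # index of the largest diagonal entry
e = np.zeros(p)
e[L1] = 1.0
dist = min(np.linalg.norm(v1 - e), np.linalg.norm(v1 + e))  # eigenvectors are defined up to sign
```

For $\alpha \in (0, 2)$ and large $n$, `dist` is typically close to 0 (localization); for light-tailed entries it stays bounded away from 0 (delocalization).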
Localization vs. Delocalization

Figure: Indices vs. sizes of the components of eigenvector $v_1$ for Pareto data, $X \sim \text{Pareto}(0.8)$, and for normal data, $X \sim N(0, 1)$; $p = 200$.
Point Process of Normalized Eigenvalues

Point process convergence:
$$N_n = \sum_{i=1}^p \delta_{a_{np}^{-2} \lambda_i(X X')} \xrightarrow{d} N = \sum_{i=1}^\infty \delta_{\Gamma_i^{-2/\alpha}}.$$
The limit is a PRM on $(0, \infty)$ with mean measure $\mu(x, \infty) = x^{-\alpha/2}$, $x > 0$, and $\Gamma_i = E_1 + \dots + E_i$, with $(E_i)$ iid standard exponential.
Point Process of Normalized Eigenvalues

Limiting distribution: for $k \ge 1$,
$$\lim_{n\to\infty} P\bigl( a_{np}^{-2} \lambda_k \le x \bigr) = \lim_{n\to\infty} P\bigl( N_n(x, \infty) < k \bigr) = P\bigl( N(x, \infty) < k \bigr) = \sum_{s=0}^{k-1} \frac{\bigl( x^{-\alpha/2} \bigr)^s}{s!}\, e^{-x^{-\alpha/2}}, \qquad x > 0.$$

Largest eigenvalue:
$$a_{np}^{-2}\, \lambda_1(S) \xrightarrow{d} \Gamma_1^{-2/\alpha},$$
where the limit has a Fréchet distribution with parameter $\alpha/2$.

Soshnikov (2006), Auffinger et al. (2009), Auffinger and Tang (2016), Davis et al. (2014, ), JH and Mikosch (2016).
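The limiting distribution function above is a truncated Poisson sum; a small sketch (the function name is my own) that reduces to the Fréchet($\alpha/2$) cdf for $k = 1$:

```python
import math

def eigenvalue_limit_cdf(x, alpha, k=1):
    """Limiting P(a_np^{-2} lambda_k <= x) = sum_{s=0}^{k-1}
    (x^{-alpha/2})^s * exp(-x^{-alpha/2}) / s!, for x > 0."""
    t = x ** (-alpha / 2.0)
    return sum(t ** s * math.exp(-t) / math.factorial(s) for s in range(k))
```

Since $\lambda_k \le \lambda_{k-1}$, the limit is monotone increasing in $k$ for fixed $x$, which the formula reproduces (each added Poisson term is nonnegative).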
α = 3.99

Figure: simulation with α = 3.99, n = 2000, p = 1000.
α = 3

Figure: simulation with α = 3, n = 2000, p = 1000.
α = 2.1

Figure: simulation with α = 2.1, n = 10000, p = 1000.
Heavy Tails and Dependence

$(Z_{it})$: iid field of regularly varying random variables.

Stochastic volatility model: $X = \bigl( \sigma_{it}^{(n)} Z_{it} \bigr)_{p \times n}$.

Generate a deterministic covariance structure $A$ [Davis et al. (2014)]: $X = A^{1/2} Z$.
Heavy Tails and Dependence

$(Z_{it})$: iid field of regularly varying random variables.

Dependence among rows and columns:
$$X_{it} = \sum_{l=0}^\infty \sum_{k=0}^\infty h_{kl}\, Z_{i-k, t-l}$$
with some constants $h_{kl}$ [Davis et al. (2016)].

Relation to the iid case:
$$X X' = \sum_{l_1, l_2 = 0}^\infty \sum_{k_1, k_2 = 0}^\infty h_{k_1 l_1} h_{k_2 l_2}\, Z(k_1, l_1)\, Z'(k_2, l_2),$$
where $Z(k, l) = (Z_{i-k, t-l})_{i=1,\dots,p;\, t=1,\dots,n}$, $k, l \in \mathbb{Z}$.

Location of squares: $M_{ij} = \sum_{l \in \mathbb{Z}} h_{il} h_{jl}$, $i, j \in \mathbb{Z}$.
Autocovariance Matrices

For $s \ge 0$, $X_n(s) = (X_{i, t+s})_{i=1,\dots,p;\, t=1,\dots,n}$, $n \ge 1$. Then $X_n = X_n(0)$.

Autocovariance matrix for lag $s$: $X_n(0)\, X_n(s)'$.

Limit theory for singular values of such matrices.
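Constructing the lag-$s$ autocovariance matrix amounts to pairing two shifted windows of the same panel; a sketch with iid Gaussian noise and illustrative $p$, $n$, $s$ (my own choices):

```python
import numpy as np

rng = np.random.default_rng(5)
p, n, s = 50, 500, 2

Z = rng.standard_normal((p, n + s))   # p x (n + s) data panel
X0 = Z[:, :n]                         # X_n(0) = (X_{i,t})
Xs = Z[:, s:s + n]                    # X_n(s) = (X_{i,t+s})

C_s = X0 @ Xs.T / n                   # lag-s autocovariance matrix (not symmetric for s > 0)
singular_values = np.linalg.svd(C_s, compute_uv=False)
```

Because $X_n(0) X_n(s)'$ is not symmetric for $s > 0$, its singular values (not eigenvalues) are the natural spectral objects, which is exactly the point of the slide.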
Autocovariance Eigenvectors

Figure: Coordinates of the eigenvectors of K(0,0): the 1st, 2nd, 3rd, 4th and 5th eigenvectors of P(0,0), plotted against the coordinate number.
Future work

Distribution-free measures of dependence:
- Spearman's rank correlation matrices
- Kendall's τ correlation matrices (a U-statistic!)
Thank you!
Comparison Method in Random Matrix Theory Jun Yin UW-Madison Valparaíso, Chile, July - 2015 Joint work with A. Knowles. 1 Some random matrices Wigner Matrix: H is N N square matrix, H : H ij = H ji, EH
More informationDependence. MFM Practitioner Module: Risk & Asset Allocation. John Dodson. September 11, Dependence. John Dodson. Outline.
MFM Practitioner Module: Risk & Asset Allocation September 11, 2013 Before we define dependence, it is useful to define Random variables X and Y are independent iff For all x, y. In particular, F (X,Y
More information10-704: Information Processing and Learning Fall Lecture 9: Sept 28
10-704: Information Processing and Learning Fall 2016 Lecturer: Siheng Chen Lecture 9: Sept 28 Note: These notes are based on scribed notes from Spring15 offering of this course. LaTeX template courtesy
More informationReview. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda
Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with
More informationRegular Variation and Extreme Events for Stochastic Processes
1 Regular Variation and Extreme Events for Stochastic Processes FILIP LINDSKOG Royal Institute of Technology, Stockholm 2005 based on joint work with Henrik Hult www.math.kth.se/ lindskog 2 Extremes for
More informationInvertibility of random matrices
University of Michigan February 2011, Princeton University Origins of Random Matrix Theory Statistics (Wishart matrices) PCA of a multivariate Gaussian distribution. [Gaël Varoquaux s blog gael-varoquaux.info]
More informationLinear Algebra review Powers of a diagonalizable matrix Spectral decomposition
Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Prof. Tesler Math 283 Fall 2018 Also see the separate version of this with Matlab and R commands. Prof. Tesler Diagonalizing
More informationECE 598: Representation Learning: Algorithms and Models Fall 2017
ECE 598: Representation Learning: Algorithms and Models Fall 2017 Lecture 1: Tensor Methods in Machine Learning Lecturer: Pramod Viswanathan Scribe: Bharath V Raghavan, Oct 3, 2017 11 Introduction Tensors
More informationHigh Dimensional Covariance and Precision Matrix Estimation
High Dimensional Covariance and Precision Matrix Estimation Wei Wang Washington University in St. Louis Thursday 23 rd February, 2017 Wei Wang (Washington University in St. Louis) High Dimensional Covariance
More information7. MULTIVARATE STATIONARY PROCESSES
7. MULTIVARATE STATIONARY PROCESSES 1 1 Some Preliminary Definitions and Concepts Random Vector: A vector X = (X 1,..., X n ) whose components are scalar-valued random variables on the same probability
More informationMarkov chains and the number of occurrences of a word in a sequence ( , 11.1,2,4,6)
Markov chains and the number of occurrences of a word in a sequence (4.5 4.9,.,2,4,6) Prof. Tesler Math 283 Fall 208 Prof. Tesler Markov Chains Math 283 / Fall 208 / 44 Locating overlapping occurrences
More informationProbabilistic Latent Semantic Analysis
Probabilistic Latent Semantic Analysis Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationNotes on Time Series Modeling
Notes on Time Series Modeling Garey Ramey University of California, San Diego January 17 1 Stationary processes De nition A stochastic process is any set of random variables y t indexed by t T : fy t g
More information. a m1 a mn. a 1 a 2 a = a n
Biostat 140655, 2008: Matrix Algebra Review 1 Definition: An m n matrix, A m n, is a rectangular array of real numbers with m rows and n columns Element in the i th row and the j th column is denoted by
More informationInference for High Dimensional Robust Regression
Department of Statistics UC Berkeley Stanford-Berkeley Joint Colloquium, 2015 Table of Contents 1 Background 2 Main Results 3 OLS: A Motivating Example Table of Contents 1 Background 2 Main Results 3 OLS:
More informationLinear Algebra in Computer Vision. Lecture2: Basic Linear Algebra & Probability. Vector. Vector Operations
Linear Algebra in Computer Vision CSED441:Introduction to Computer Vision (2017F Lecture2: Basic Linear Algebra & Probability Bohyung Han CSE, POSTECH bhhan@postech.ac.kr Mathematics in vector space Linear
More informationUniversality of local spectral statistics of random matrices
Universality of local spectral statistics of random matrices László Erdős Ludwig-Maximilians-Universität, Munich, Germany CRM, Montreal, Mar 19, 2012 Joint with P. Bourgade, B. Schlein, H.T. Yau, and J.
More informationReview (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology
Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna
More informationChapter 7. Canonical Forms. 7.1 Eigenvalues and Eigenvectors
Chapter 7 Canonical Forms 7.1 Eigenvalues and Eigenvectors Definition 7.1.1. Let V be a vector space over the field F and let T be a linear operator on V. An eigenvalue of T is a scalar λ F such that there
More informationDeterminantal point processes and random matrix theory in a nutshell
Determinantal point processes and random matrix theory in a nutshell part II Manuela Girotti based on M. Girotti s PhD thesis, A. Kuijlaars notes from Les Houches Winter School 202 and B. Eynard s notes
More informationMIT Spring 2015
Regression Analysis MIT 18.472 Dr. Kempthorne Spring 2015 1 Outline Regression Analysis 1 Regression Analysis 2 Multiple Linear Regression: Setup Data Set n cases i = 1, 2,..., n 1 Response (dependent)
More informationDoes k-th Moment Exist?
Does k-th Moment Exist? Hitomi, K. 1 and Y. Nishiyama 2 1 Kyoto Institute of Technology, Japan 2 Institute of Economic Research, Kyoto University, Japan Email: hitomi@kit.ac.jp Keywords: Existence of moments,
More informationBasic concepts of probability theory
Basic concepts of probability theory Random variable discrete/continuous random variable Transform Z transform, Laplace transform Distribution Geometric, mixed-geometric, Binomial, Poisson, exponential,
More informationPractical conditions on Markov chains for weak convergence of tail empirical processes
Practical conditions on Markov chains for weak convergence of tail empirical processes Olivier Wintenberger University of Copenhagen and Paris VI Joint work with Rafa l Kulik and Philippe Soulier Toronto,
More informationEigenvalue spectra of time-lagged covariance matrices: Possibilities for arbitrage?
Eigenvalue spectra of time-lagged covariance matrices: Possibilities for arbitrage? Stefan Thurner www.complex-systems.meduniwien.ac.at www.santafe.edu London July 28 Foundations of theory of financial
More informationThe Multivariate Normal Distribution. In this case according to our theorem
The Multivariate Normal Distribution Defn: Z R 1 N(0, 1) iff f Z (z) = 1 2π e z2 /2. Defn: Z R p MV N p (0, I) if and only if Z = (Z 1,..., Z p ) T with the Z i independent and each Z i N(0, 1). In this
More informationNotes on Random Vectors and Multivariate Normal
MATH 590 Spring 06 Notes on Random Vectors and Multivariate Normal Properties of Random Vectors If X,, X n are random variables, then X = X,, X n ) is a random vector, with the cumulative distribution
More information