On corrections of classical multivariate tests for high-dimensional data


1 On corrections of classical multivariate tests for high-dimensional data
Jian-feng Yao, with Zhidong Bai, Dandan Jiang and Shurong Zheng

2 Overview
Introduction: high-dimensional data and new challenges in statistics; a two-sample problem
Sample covariance matrices: sample vs. population covariance matrices; Marčenko-Pastur distributions; Bai and Silverstein's CLT for linear spectral statistics
Random Fisher matrices
Testing covariance matrices I; simulation study I
Testing covariance matrices II; simulation study II
Multivariate regressions
Conclusions


4 High-dimensional data and new challenges in statistics
High-dimensional data vs. high-dimensional models. Nonparametric regression is a very high-dimensional model (indeed an infinite-dimensional one), but with one-dimensional data:
y_i = f(x_i) + ε_i, f : R → R, i = 1, ..., n.
High-dimensional data: observation vectors y_i ∈ R^p, with p relatively high with respect to the sample size n.

5 High-dimensional data and new challenges in statistics
[Table: typical values of the data dimension p, the sample size n and the data ratio n/p for portfolio data, climate surveys, speech analysis, the ORL face database and micro-arrays.]
Important: the data ratio n/p is not always large; it can even be < 1.
Note the use of the inverse data ratio y = p/n.

6 A two sample problem: high-dimensional effects by an example
The two-sample problem: two independent samples x_1, ..., x_{n_1} ~ (μ_1, Σ) and y_1, ..., y_{n_2} ~ (μ_2, Σ); we want to test H_0: μ_1 = μ_2 against H_1: μ_1 ≠ μ_2.
Classical approach: Hotelling's T² test,
T² = (n_1 n_2 / n) (x̄ − ȳ)' S_n⁻¹ (x̄ − ȳ),
where
x̄ = (1/n_1) Σ_{i=1}^{n_1} x_i,  ȳ = (1/n_2) Σ_{j=1}^{n_2} y_j,  n = n_1 + n_2,
S_n = (1/(n − 2)) [ Σ_{i=1}^{n_1} (x_i − x̄)(x_i − x̄)' + Σ_{j=1}^{n_2} (y_j − ȳ)(y_j − ȳ)' ].
S_n is a (pooled) sample covariance matrix.
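In code, the statistic above reads as follows (a minimal numpy sketch; the function name is ours):

```python
import numpy as np

def hotelling_t2(x, y):
    """Two-sample Hotelling T^2 statistic.

    x: (n1, p) array, y: (n2, p) array.
    Returns T^2 = (n1*n2/n) (xbar - ybar)' S_n^{-1} (xbar - ybar),
    with S_n the pooled sample covariance on n - 2 degrees of freedom.
    """
    n1, p = x.shape
    n2, _ = y.shape
    n = n1 + n2
    xbar, ybar = x.mean(axis=0), y.mean(axis=0)
    # Pooled covariance: sums of centered outer products divided by n - 2.
    sn = ((x - xbar).T @ (x - xbar) + (y - ybar).T @ (y - ybar)) / (n - 2)
    d = xbar - ybar
    return (n1 * n2 / n) * d @ np.linalg.solve(sn, d)
```

Under H_0 with Gaussian data and p fixed, ((n − p − 1)/((n − 2)p)) T² follows an F(p, n − p − 1) distribution, which is how the classical test is calibrated.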

7 A two sample problem: the two-sample problem
Hotelling's T² test, nice properties:
- invariance under linear transformations;
- finite-sample optimality in the Gaussian case;
- asymptotic optimality otherwise.
Hotelling's T² test, bad news:
- low power even for moderate data dimensions;
- high numerical instability in computing S_n⁻¹, even for p = 40;
- very little is known in the non-Gaussian case;
- a fatal deficiency: when p > n − 2, S_n is not invertible.
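The fatal deficiency is easy to exhibit numerically: the pooled matrix S_n is built from n − 2 linearly independent centered observations, so its rank is at most n − 2 (a sketch; the dimensions chosen are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n1, n2 = 60, 25, 20                 # p = 60 > n - 2 = 43
x = rng.standard_normal((n1, p))
y = rng.standard_normal((n2, p))
xbar, ybar = x.mean(axis=0), y.mean(axis=0)
sn = ((x - xbar).T @ (x - xbar) + (y - ybar).T @ (y - ybar)) / (n1 + n2 - 2)
# S_n is p x p but has rank <= (n1 - 1) + (n2 - 1) = n - 2 < p,
# so it is singular and T^2 cannot be formed.
rank = np.linalg.matrix_rank(sn)
print(rank, "<", p)
```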

8 A two sample problem: Dempster's non-exact test (NET), Dempster A.P., 58, 60
A reasonable test must be based on x̄ − ȳ, even when p > n − 2. Choose a new orthonormal basis of R^n and project the data so that:
1. axis 1 carries the grand mean (n_1 μ_1 + n_2 μ_2)/n;
2. axis 2 carries (x̄ − ȳ).
Let the n × p data matrix be X = (x_1, ..., x_{n_1}, y_1, ..., y_{n_2})', and let H_n = (h_1, ..., h_n)' be the orthonormal change of basis, with
h_1 = (1/√n) 1_n,  h_2 = ( √(n_2/(n n_1)) 1_{n_1}', −√(n_1/(n n_2)) 1_{n_2}' )'.
Set Z = H_n X = (z_1, ..., z_n)'. Under normality, the z_i are n independent N_p(·, Σ) vectors with
E z_1 = (1/√n)(n_1 μ_1 + n_2 μ_2),  E z_2 = √(n_1 n_2 / n) (μ_1 − μ_2),  E z_i = 0, i = 3, ..., n.

9 A two sample problem: Dempster's non-exact test (NET)
Test statistic:
F_D = (n − 2) ‖z_2‖² / ( ‖z_3‖² + ··· + ‖z_n‖² ).
Under H_0, each ‖z_j‖² (j ≥ 2) is distributed as
Q = Σ_{k=1}^{r} α_k χ²_{1,k},
where α_1 ≥ ··· ≥ α_r > 0 are the non-null eigenvalues of Σ and the χ²_{1,k} are independent chi-squares with one degree of freedom.
The distribution of F_D is complicated. Approximations (hence the name non-exact test), treating Σ as if it were proportional to I_p:
1. approximate Q by m χ²_r;
2. then estimate r by some r̂.
Finally, under H_0, F_D ≈ F(r̂, (n − 2) r̂).

10 A two sample problem: Dempster's non-exact test (NET)
Problems with the NET test:
- it is difficult to construct the orthogonal transformation H_n = {h_j} for large n;
- even under Gaussianity, the exact power function depends on H_n.

11 A two sample problem: Bai and Saranadasa's test (ANT), Bai & Saranadasa, 96
Consider directly the statistic
M_n = ‖x̄ − ȳ‖² − (n/(n_1 n_2)) tr S_n.
Under very mild conditions (here RMT comes in!),
M_n / σ_n ⇒ N(0, 1),  σ_n² = Var(M_n) = ( 2 n² (n − 1) / (n_1² n_2² (n − 2)) ) tr Σ².
A ratio-consistent estimator:
σ̂_n² = ( 2 n (n − 1)(n − 2) / (n_1² n_2² (n − 3)) ) [ tr S_n² − (tr S_n)²/(n − 2) ],  σ̂_n²/σ_n² → 1 in probability.
Finally, under H_0,
Z_n = M_n / σ̂_n ⇒ N(0, 1).
This is Bai and Saranadasa's asymptotic normal test (ANT).
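A sketch of the ANT statistic Z_n (the function name is ours; the variance estimator follows the Gaussian-case formulas as reconstructed above):

```python
import numpy as np

def ant_statistic(x, y):
    """Bai-Saranadasa statistic Z_n, approximately N(0, 1) under H0.

    x: (n1, p), y: (n2, p). Uses the pooled covariance S_n on n - 2
    degrees of freedom, as in the two-sample setting above.
    """
    n1, p = x.shape
    n2, _ = y.shape
    n = n1 + n2
    xbar, ybar = x.mean(axis=0), y.mean(axis=0)
    sn = ((x - xbar).T @ (x - xbar) + (y - ybar).T @ (y - ybar)) / (n - 2)
    mn = np.sum((xbar - ybar) ** 2) - n / (n1 * n2) * np.trace(sn)
    tr_sn = np.trace(sn)
    tr_sn2 = np.sum(sn * sn)          # tr(S_n^2), since S_n is symmetric
    var_hat = (2 * n * (n - 1) * (n - 2) / (n1**2 * n2**2 * (n - 3))
               * (tr_sn2 - tr_sn**2 / (n - 2)))
    return mn / np.sqrt(var_hat)
```

Note that Z_n needs no matrix inversion, so it remains well defined when p > n − 2.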

12 A two sample problem: comparison between T², NET and ANT
Power functions. Assume p → ∞, n → ∞, p/n → y ∈ (0, 1) and n_1/n → κ. Then for Hotelling's T², Dempster's NET and Bai-Saranadasa's ANT:
β_H(μ) = Φ( −ξ_α + √( n(1 − y)/(2y) ) κ(1 − κ) ‖Σ^{−1/2} μ‖² ) + o(1),
β_D(μ) = Φ( −ξ_α + ( n/√(2 tr Σ²) ) κ(1 − κ) ‖μ‖² ) + o(1) = β_BS(μ),
where α is the test size, μ = μ_1 − μ_2 and ξ_α = Φ^{−1}(1 − α).
Important: because of the factor (1 − y), T² loses power when y increases, i.e. when p increases relative to n.

13 A two sample problem: comparison between T², NET and ANT
Simulation results 1, Gaussian case. Choice of covariance: Σ = (1 − ρ) I_p + ρ J_p, with J_p = 1_p 1_p'. Non-centrality parameter: η = ‖μ_1 − μ_2‖² / √(tr Σ²). Sample sizes: (n_1, n_2) = (25, 20), n = 45.
[Figure: empirical powers of the three tests in this Gaussian setting.]

14 A two sample problem: a summary of the introduction
- High-dimensional effects need to be taken into account.
- Surprisingly, asymptotic methods based on RMT perform well even for small p (as low as p = 4).
- Many classical multivariate analysis methods have to be re-examined with respect to high-dimensional effects.


16 Marčenko-Pastur distributions: the Marčenko-Pastur distribution
Theorem (Marčenko & Pastur, 1967). Assume:
- X is a p × n matrix of i.i.d. variables with mean 0 and variance 1 (Σ = I_p), not necessarily Gaussian but with finite 4-th moment;
- p → ∞, n → ∞, p/n → y ∈ (0, 1].
Then the empirical distribution of the eigenvalues of S_n = (1/n) X X' converges to the distribution with density function
f(x) = (1/(2π y x)) √( (x − a)(b − x) ),  a ≤ x ≤ b,
where a = (1 − √y)² and b = (1 + √y)².
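The theorem is easy to check by simulation (a sketch; the dimensions chosen are arbitrary):

```python
import numpy as np

def mp_density(x, y):
    """Marchenko-Pastur density of index y = p/n (unit variance)."""
    a, b = (1 - np.sqrt(y)) ** 2, (1 + np.sqrt(y)) ** 2
    out = np.zeros_like(x)
    inside = (x > a) & (x < b)
    out[inside] = (np.sqrt((x[inside] - a) * (b - x[inside]))
                   / (2 * np.pi * y * x[inside]))
    return out

rng = np.random.default_rng(2)
p, n = 400, 1600                       # y = 1/4, so a = 0.25, b = 2.25
x_mat = rng.standard_normal((p, n))
eigs = np.linalg.eigvalsh(x_mat @ x_mat.T / n)
# The eigenvalue histogram of S_n concentrates on [a, b], and the
# eigenvalue average (= tr S_n / p) is close to the MP mean, 1.
print(eigs.min(), eigs.max(), eigs.mean())
```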

17 Marčenko-Pastur distributions: the Marčenko-Pastur distribution
f(x) = (1/(2π y x)) √( (x − a)(b − x) ),  (1 − √y)² = a ≤ x ≤ b = (1 + √y)².
[Figure: densities of the Marčenko-Pastur law for y = 1/8, 1/4 and 1/2, with the corresponding edges a and b.]

18 Marčenko-Pastur distributions: an explanation of the power deficiency of Hotelling's T²
- When p increases, even in the Gaussian case, S_n is very different from its population counterpart Σ.
- When y = p/n → 1, the left edge a → 0: small eigenvalues of S_n yield an instability of the T² statistic
T² = (n_1 n_2 / n) (x̄ − ȳ)' S_n⁻¹ (x̄ − ȳ).

19 Bai and Silverstein's CLT for linear spectral statistics of S_n
Set the empirical spectral distribution
F_n = (1/p) Σ_{j=1}^{p} δ_{λ_j},
where the λ_j are the p eigenvalues of S_n, and let y_n = p/n. Let U ⊂ C be open with [a, b] ⊂ U. For any g analytic on U, define
G_n(g) = p [ F_n(g) − μ_{y_n}(g) ],
where μ_α denotes the MP distribution of index α ∈ (0, 1).

20 Bai and Silverstein's CLT for linear spectral statistics: a CLT for linear spectral statistics
Theorem (Bai and Silverstein, 04). Assume that:
- g_1, ..., g_k are k analytic functions on U;
- the matrix entries x_ij are i.i.d. real-valued random variables such that E x_ij = 0, E x_ij² = 1 and E x_ij⁴ = 3;
- n, p → ∞ with y_n = p/n → y ∈ (0, 1).
Then
(G_n(g_1), ..., G_n(g_k)) ⇒ N_k(m, V),
with a given mean vector m = m(g_1, ..., g_k) and asymptotic covariance matrix V = V(g_1, ..., g_k).
Other versions exist: Lytova & Pastur 09; Bai & Wang 09.
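The content of the CLT can be previewed numerically: for g(x) = x² the MP integral is μ_y(g) = 1 + y (a standard MP moment), and the centered statistic G_n(g) stays of order 1 as p grows, instead of scaling like √p as a naive CLT would suggest (a sketch; the sizes chosen are arbitrary):

```python
import numpy as np

def lss_centered(p, n, rng):
    """G_n(g) = p * [F_n(g) - mu_{y_n}(g)] for g(x) = x^2, where F_n is
    the ESD of S_n = XX'/n and the second MP moment is 1 + y_n."""
    y_n = p / n
    x = rng.standard_normal((p, n))
    eigs = np.linalg.eigvalsh(x @ x.T / n)
    return p * (np.mean(eigs ** 2) - (1 + y_n))

rng = np.random.default_rng(3)
vals = [lss_centered(p, 4 * p, rng) for p in (50, 100, 200)]
print(vals)   # remains O(1) as p grows
```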


22 Random Fisher matrices
Two independent samples x_1, ..., x_{n_1} ~ (0, I_p) and y_1, ..., y_{n_2} ~ (0, I_p), with i.i.d. coordinates of mean 0 and variance 1. Associated sample covariance matrices:
S_1 = (1/n_1) Σ_{i=1}^{n_1} x_i x_i',  S_2 = (1/n_2) Σ_{j=1}^{n_2} y_j y_j'.
Fisher matrix: V_n = S_1 S_2⁻¹, where n_2 > p (so that S_2 is invertible).

23 Random Fisher matrices
Assume y_{n1} = p/n_1 → y_1 ∈ (0, 1) and y_{n2} = p/n_2 → y_2 ∈ (0, 1). Under mild moment conditions, the ESD F_n^{V_n} of V_n has a LSD F_{y_1,y_2} with density
l(x) = ( (1 − y_2) / (2π x (y_1 + y_2 x)) ) √( (b − x)(x − a) ) for a ≤ x ≤ b, and l(x) = 0 otherwise,
where
a = (1 − √(y_1 + y_2 − y_1 y_2))² / (1 − y_2)²,  b = (1 + √(y_1 + y_2 − y_1 y_2))² / (1 − y_2)².
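A simulated Fisher matrix indeed fills this support (a sketch; the symmetrized form L⁻¹ S_1 L⁻ᵀ, with L the Cholesky factor of S_2, has the same eigenvalues as S_1 S_2⁻¹):

```python
import numpy as np

def fisher_lsd_density(x, y1, y2):
    """Density of the limiting spectral distribution of the Fisher
    matrix S1 S2^{-1}, with indices y1 = p/n1 and y2 = p/n2."""
    h = np.sqrt(y1 + y2 - y1 * y2)
    a = (1 - h) ** 2 / (1 - y2) ** 2
    b = (1 + h) ** 2 / (1 - y2) ** 2
    out = np.zeros_like(x)
    m = (x > a) & (x < b)
    out[m] = ((1 - y2) * np.sqrt((b - x[m]) * (x[m] - a))
              / (2 * np.pi * x[m] * (y1 + y2 * x[m])))
    return out

rng = np.random.default_rng(4)
p, n1, n2 = 200, 800, 1000             # y1 = 0.25, y2 = 0.2
x1 = rng.standard_normal((p, n1))
x2 = rng.standard_normal((p, n2))
s1 = x1 @ x1.T / n1
s2 = x2 @ x2.T / n2
li = np.linalg.inv(np.linalg.cholesky(s2))
eigs = np.linalg.eigvalsh(li @ s1 @ li.T)   # eigenvalues of S1 S2^{-1}
print(eigs.min(), eigs.max())               # close to the edges a, b
```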

24 Random Fisher matrices: CLT for LSS of random Fisher matrices
For y_1 ∈ (0, 1), let Ũ ⊂ C be an open set containing the interval
[ (1 − √y_1)² / (1 + √y_2)² , (1 + √y_1)² / (1 − √y_2)² ].
For an analytic function f on Ũ, define
G̃_n(f) = p ∫ f(x) [ F_n^{V_n} − F_{y_{n1},y_{n2}} ](dx),
where F_{y_{n1},y_{n2}} is the LSD with indices y_{nk}, k = 1, 2.

25 Random Fisher matrices: CLT for LSS of random Fisher matrices
Theorem (Zheng, 08). Assume E x_11⁴ < ∞ and E y_11⁴ < ∞, and let β_x = E|x_11|⁴ − 3 and β_y = E|y_11|⁴ − 3. Then, for any analytic functions f_1, ..., f_k defined on Ũ,
( G̃_n(f_1), ..., G̃_n(f_k) ) ⇒ N_k(m, υ),
with suitable asymptotic mean and covariance functions m and υ.

26 Random Fisher matrices: CLT for LSS of random Fisher matrices
Limiting mean function m (Zheng, 08), with h = √(y_1 + y_2 − y_1 y_2) and z(ζ) = (1/(1 − y_2)²)(1 + h² + 2h Re ζ):
m(f_j) = lim_{r ↓ 1} [ (1) + (2) + (3) ], where
(1) = (1/(4πi)) ∮_{|ζ|=1} f_j(z(ζ)) [ 1/(ζ − r⁻¹) + 1/(ζ + r⁻¹) − 2/(ζ + y_2/(h r)) ] dζ,
(2) = ( β_x y_1 (1 − y_2)² / (2πi h²) ) ∮_{|ζ|=1} f_j(z(ζ)) ( 1/(ζ + y_2/(h r))³ ) dζ,
(3) = ( β_y y_2 (1 − y_2) / (2πi h) ) ∮_{|ζ|=1} f_j(z(ζ)) ( (ζ + 1/(h r)) / (ζ + y_2/(h r))³ ) dζ.

27 Random Fisher matrices: CLT for LSS of random Fisher matrices
Limiting covariance function υ (Zheng, 08):
υ(f_j, f_l) = lim_{1 < r_1 < r_2 → 1} [ (4) + (5) ], j, l ∈ {1, ..., k}, where
(4) = −( r_1 r_2 / (2π²) ) ∮_{|ζ_2|=1} ∮_{|ζ_1|=1} ( f_j(z(r_1 ζ_1)) f_l(z(r_2 ζ_2)) / (r_2 ζ_2 − r_1 ζ_1)² ) dζ_1 dζ_2,
(5) = −( (β_x y_1 + β_y y_2)(1 − y_2)² / (4π² h²) ) [ ∮_{|ζ_1|=1} f_j(z(ζ_1)) / (ζ_1 + y_2/(h r_1))² dζ_1 ] [ ∮_{|ζ_2|=1} f_l(z(ζ_2)) / (ζ_2 + y_2/(h r_2))² dζ_2 ].


29 One-sample test on covariance matrices
A sample x_1, ..., x_n ~ N_p(μ, Σ); we want to test H_0: Σ = I_p. In the high-dimensional case, several previous works exist: Ledoit & Wolf 02; Schott 07; Srivastava. We focus on the LR statistic
T_n = n [ tr S_n − log|S_n| − p ],  S_n = (1/n) Σ_{i=1}^{n} (x_i − x̄)(x_i − x̄)'.
Classical LRT: for fixed data dimension p, when n → ∞,
T_n ⇒ χ²_{p(p+1)/2}.
As we will see, this approximation rapidly becomes deficient when p is not small.

30 RMT-corrected LRT
Theorem (Bai, Jiang, Yao and Zheng, 09). Assume p/n → y ∈ (0, 1) and let g(x) = x − log x − 1. Then, under H_0 and as n → ∞,
T_n/n − p F_{y_n}(g) ⇒ N( m(g), υ(g) ),
where F_{y_n} is the Marčenko-Pastur law of index y_n = p/n, and
m(g) = −log(1 − y)/2,  υ(g) = −2 log(1 − y) − 2y.
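The corrected test is only a few lines of numpy (a sketch; the function name is ours). The MP integral of g has the standard closed form F_y(g) = 1 + ((1 − y)/y) log(1 − y), and since mean-centering costs one degree of freedom we apply the theorem with N = n − 1 observations:

```python
import numpy as np
from math import erfc, log, sqrt

def clrt_one_sample(x):
    """RMT-corrected LR test of H0: Sigma = I_p for an (n, p) sample.

    Returns (z, p-value), where z is approximately N(0, 1) under H0.
    """
    n, p = x.shape
    nn = n - 1                  # degrees of freedom after mean-centering
    y = p / nn
    xc = x - x.mean(axis=0)
    s = xc.T @ xc / nn
    _, logdet = np.linalg.slogdet(s)
    stat = np.trace(s) - logdet - p               # T_N / N
    # Closed-form MP integral of g(t) = t - log t - 1 at index y.
    center = p * (1 + (1 - y) / y * log(1 - y))
    m = -log(1 - y) / 2
    v = -2 * log(1 - y) - 2 * y
    z = (stat - center - m) / sqrt(v)
    return z, erfc(abs(z) / sqrt(2))              # two-sided p-value
```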

31 Simulation study I: comparison of LRT and CLRT by simulation
Nominal test level α = 0.05; for each (p, n), 10,000 independent replications with real Gaussian variables. Powers are estimated under the alternative H_1: Σ = diag(1, 0.05, 0.05, 0.05, ..., 0.05).
[Table: empirical sizes and powers of CLRT and LRT, and the difference of the CLRT size from 5%, for (p, n) = (5, 500), (10, 500), (50, 500), (100, 500) and (300, 500).]

32 Simulation study I: on a plot
[Figure: type I error of CLRT and LRT as a function of the dimension p, for n = 500.]


34 Two-sample test on covariance matrices
Two samples x_1, ..., x_{n_1} ~ N_p(μ_1, Σ_1) and y_1, ..., y_{n_2} ~ N_p(μ_2, Σ_2); we want to test H_0: Σ_1 = Σ_2. The associated sample covariance matrices are
S_1 = (1/n_1) Σ_{i=1}^{n_1} (x_i − x̄)(x_i − x̄)',  S_2 = (1/n_2) Σ_{i=1}^{n_2} (y_i − ȳ)(y_i − ȳ)'.
The LR statistic is
L_1 = |S_1 S_2⁻¹|^{n_1/2} / |c_1 S_1 S_2⁻¹ + c_2 I_p|^{n/2},
where n = n_1 + n_2 and c_k = n_k/n, k = 1, 2.

35 Two-sample test on covariance matrices
Classical LRT: for fixed data dimension p, when n_1, n_2 → ∞ and under H_0,
T_n = −2 log L_1 ⇒ χ²_{p(p+1)/2}.
As we will see, this approximation rapidly becomes deficient when p is not small.

36 RMT-corrected LRT
Theorem (Bai, Jiang, Yao and Zheng, 09). Assume that the conditions of the CLT for LSS of Fisher matrices hold, and let
f(x) = log(y_1 + y_2 x) − ( y_2/(y_1 + y_2) ) log x − log(y_1 + y_2).
Then, under H_0 and as n_1, n_2 → ∞,
−(2/n) log L_1 − p F_{y_{n1},y_{n2}}(f) ⇒ N( m(f), υ(f) ),
with
m(f) = (1/2) [ log( (y_1 + y_2 − y_1 y_2)/(y_1 + y_2) ) − ( y_1/(y_1 + y_2) ) log(1 − y_2) − ( y_2/(y_1 + y_2) ) log(1 − y_1) ],
υ(f) = −( 2 y_2²/(y_1 + y_2)² ) log(1 − y_1) − ( 2 y_1²/(y_1 + y_2)² ) log(1 − y_2) − 2 log( (y_1 + y_2)/(y_1 + y_2 − y_1 y_2) ).
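A sketch of the corrected two-sample statistic, for mean-zero data with covariances estimated without centering; the centering p·F(f) is obtained by numerically integrating f against the Fisher LSD of slide 23, and m(f), υ(f) are taken as stated above (the helper names are ours):

```python
import numpy as np

def fisher_lsd_expectation(f, y1, y2, num=100_000):
    """Integrate f against the Fisher LSD F_{y1,y2}.
    The substitution x(t) = (a+b)/2 + ((b-a)/2) cos t turns the
    square-root edge factor into sin(t)^2, so a plain Riemann sum
    over t in [0, pi] is accurate."""
    h = np.sqrt(y1 + y2 - y1 * y2)
    a = (1 - h) ** 2 / (1 - y2) ** 2
    b = (1 + h) ** 2 / (1 - y2) ** 2
    t = np.linspace(0.0, np.pi, num)
    x = (a + b) / 2 + (b - a) / 2 * np.cos(t)
    w = ((1 - y2) * ((b - a) / 2 * np.sin(t)) ** 2
         / (2 * np.pi * x * (y1 + y2 * x)))
    return float(np.sum(f(x) * w) * (t[1] - t[0]))

def clrt_two_sample(x, y):
    """Corrected two-sample LRT statistic, approximately N(0, 1) under H0."""
    n1, p = x.shape
    n2, _ = y.shape
    yn1, yn2 = p / n1, p / n2
    s1, s2 = x.T @ x / n1, y.T @ y / n2
    li = np.linalg.inv(np.linalg.cholesky(s2))
    lam = np.linalg.eigvalsh(li @ s1 @ li.T)   # eigenvalues of S1 S2^{-1}

    def f(t):
        return (np.log(yn1 + yn2 * t) - yn2 / (yn1 + yn2) * np.log(t)
                - np.log(yn1 + yn2))

    stat = float(np.sum(f(lam)))               # equals -(2/n) log L_1
    center = p * fisher_lsd_expectation(f, yn1, yn2)
    s = yn1 + yn2
    m = 0.5 * (np.log((s - yn1 * yn2) / s)
               - yn1 / s * np.log(1 - yn2) - yn2 / s * np.log(1 - yn1))
    v = (-2 * yn2**2 / s**2 * np.log(1 - yn1)
         - 2 * yn1**2 / s**2 * np.log(1 - yn2)
         - 2 * np.log(s / (s - yn1 * yn2)))
    return (stat - center - m) / np.sqrt(v)
```

Here f(λ) = log(c_2 + c_1 λ) − c_1 log λ with c_1 = n_1/n, so summing f over the eigenvalues of S_1 S_2⁻¹ reproduces −(2/n) log L_1 exactly.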

37 Simulation study II: comparison of LRT and CLRT by simulation
Nominal test level α = 0.05; for each (p, n_1, n_2), 10,000 independent replications with real Gaussian variables. Powers are estimated under the alternative H_1: Σ_1 Σ_2⁻¹ = diag(3, 1, 1, ..., 1).

38 Simulation study II: comparison of LRT and CLRT by simulation, with (y_1, y_2) = (0.05, 0.05)
[Table: empirical sizes and powers of CLRT and LRT, and the difference of the CLRT size from 5%, for (p, n_1, n_2) = (5, 100, 100), (10, 200, 200), (20, 400, 400), (40, 800, 800), (80, 1600, 1600), (160, 3200, 3200) and (320, 6400, 6400).]

39 Simulation study II: comparison of LRT and CLRT by simulation, with (y_1, y_2) = (0.05, 0.1)
[Table: empirical sizes and powers of CLRT and LRT, and the difference of the CLRT size from 5%, for (p, n_1, n_2) = (5, 100, 50), (10, 200, 100), (20, 400, 200), (40, 800, 400), (80, 1600, 800), (160, 3200, 1600) and (320, 6400, 3200).]

40 Simulation study II: comparisons of LRT and CLRT
[Figure: type I error of CLRT and LRT as a function of the dimension, for (y_1, y_2) = (0.05, 0.05) (left panel) and (y_1, y_2) = (0.05, 0.1) (right panel).]


42 A general linear hypothesis in a multivariate regression
A p-dimensional regression model:
x_i = B z_i + ε_i, i = 1, ..., n,
where ε_i ~ N_p(0, Σ), x_i ∈ R^p, z_i ∈ R^q and n ≥ p + q.
A general linear hypothesis: write the block decomposition B = (B_1, B_2) with q_1 and q_2 columns, and test
H_0: B_1 = B_1^*, for a given matrix B_1^*.

43 Wilks' Λ
Let Σ̂_0 and Σ̂_1 be the maximum likelihood estimators of Σ under H_0 and under the alternative, respectively. The LRT statistic equals L_0/L_1 = (Λ_n)^{n/2}, where
Λ_n = |Σ̂_1| / |Σ̂_0|
is the celebrated Wilks' Λ (Wilks 32, 34; Bartlett 34).
Classical (low-dimensional) approximation of the LRT: for fixed p and q, as n → ∞ and under H_0,
U_n = −n log Λ_n ⇒ χ²_{p q_1}.
Less biased is Bartlett's correction:
Ũ_n = −k log Λ_n,  k = n − q − (p − q_1 + 1)/2.
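Wilks' Λ and both classical approximations can be computed directly (a sketch; the function names are ours, and the MLEs are residual covariance matrices divided by n):

```python
import numpy as np

def wilks_lambda(x, z, q1, b1_star):
    """Wilks' Lambda for H0: B1 = b1_star in the model x_i = B z_i + eps_i.

    x: (n, p) responses, z: (n, q) regressors, B = (B1, B2) with q1 and
    q - q1 columns, b1_star: (p, q1) hypothesized value of B1.
    Returns (Lambda_n, U_n, Bartlett-corrected U_n).
    """
    n, p = x.shape
    q = z.shape[1]
    z1, z2 = z[:, :q1], z[:, q1:]

    def mle_cov(resp, design):
        # Residual covariance (divided by n) after a least-squares fit.
        if design.shape[1] == 0:
            resid = resp
        else:
            coef, *_ = np.linalg.lstsq(design, resp, rcond=None)
            resid = resp - design @ coef
        return resid.T @ resid / n

    sigma1 = mle_cov(x, z)                      # unrestricted MLE
    sigma0 = mle_cov(x - z1 @ b1_star.T, z2)    # MLE under H0
    lam = np.exp(np.linalg.slogdet(sigma1)[1] - np.linalg.slogdet(sigma0)[1])
    u_n = -n * np.log(lam)
    k = n - q - (p - q1 + 1) / 2                # Bartlett's multiplier
    return lam, u_n, -k * np.log(lam)
```

Since the restricted residuals are never smaller (in determinant) than the unrestricted ones, Λ_n always lies in (0, 1].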

44 High-dimensional correction of Wilks' Λ
Theorem (Bai, Jiang, Yao and Zheng, 10). Let p, q_1 and n − q tend to infinity with
y_{n1} = p/q_1 → y_1 ∈ (0, 1),  y_{n2} = p/(n − q) → y_2 ∈ (0, 1).
Then, under H_0,
T_n = υ(f)^{−1/2} [ −log Λ_n − p F_{y_{n1},y_{n2}}(f) − m(f) ] ⇒ N(0, 1),
where m(f), υ(f) and F_{y_{n1},y_{n2}}(f) are suitable constants computed from f(x) = log(1 + (y_{n2}/y_{n1}) x).

45 The centering term
F_{y_{n1},y_{n2}}(f) = log c_n + ( (1 − y_{n1})/y_{n1} ) log(c_n − d_n h_n) + ( (y_{n1} + y_{n2})/(y_{n1} y_{n2}) ) log( (c_n h_n − d_n y_{n2}) / h_n ),
where
h_n = √( y_{n1} + y_{n2} − y_{n1} y_{n2} ),
a_n, b_n = (1 ∓ h_n)² / (1 − y_{n2})²,
c_n, d_n = (1/2) [ √( 1 + (y_{n2}/y_{n1}) b_n ) ± √( 1 + (y_{n2}/y_{n1}) a_n ) ],  c_n > d_n.

46 The limiting parameters
m(f) = (1/2) log( (c² − d²) h² / (c h − y_2 d)² ),
υ(f) = 2 log( c² / (c² − d²) ),
where
h = √( y_1 + y_2 − y_1 y_2 ),
a_0, b_0 = (1 ∓ h)² / (1 − y_2)²,
c, d = (1/2) [ √( 1 + (y_2/y_1) b_0 ) ± √( 1 + (y_2/y_1) a_0 ) ],  c > d.

47 A simulation experiment
[Figure: empirical powers of LRT, CLRT, ST1 and ST2 as functions of the non-centrality parameter c_0, for (p, n, q, q_1) = (10, 100, 50, 30) (left panel) and (20, 100, 60, 50) (right panel).]
Gaussian entries; the non-centrality parameter c_0 measures the distance from H_0.


49 Some conclusions
- High-dimensional effects should be taken into account.
- RMT for sample covariance matrices is a powerful tool to correct classical multivariate procedures.
- Each time some Σ is to be estimated, one should be careful with the natural estimator S_n: for high-dimensional data, S_n is far from Σ.
- Yet RMT is not sufficiently developed for statistics:
1. dependent observations (time series);
2. not identically distributed variables.

50 Some references
Bai, Z. D. and Saranadasa, H. (1996). Effect of high dimension: comparison of significance tests for a high dimensional two sample problem. Statistica Sinica 6.
Bai, Z. D. and Silverstein, J. W. (2004). CLT for linear spectral statistics of large-dimensional sample covariance matrices. Ann. Probab. 32.
Bai, Z. D., Jiang, D., Yao, J. and Zheng, S. (2009). Corrections to LRT on large dimensional covariance matrix by RMT. Annals of Statistics 37.
Bai, Z. D., Jiang, D., Yao, J. and Zheng, S. Large regression analysis. Submitted.
Dempster, A. P. (1958). A high dimensional two sample significance test. Ann. Math. Statist. 29.
Zheng, S. (2008). Central limit theorem for linear spectral statistics of large dimensional F-matrix. Preprint, Northeast Normal University.


Jackknife Empirical Likelihood Test for Equality of Two High Dimensional Means Jackknife Empirical Likelihood est for Equality of wo High Dimensional Means Ruodu Wang, Liang Peng and Yongcheng Qi 2 Abstract It has been a long history to test the equality of two multivariate means.

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

STAT 730 Chapter 4: Estimation

STAT 730 Chapter 4: Estimation STAT 730 Chapter 4: Estimation Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 23 The likelihood We have iid data, at least initially. Each datum

More information

1 Intro to RMT (Gene)

1 Intro to RMT (Gene) M705 Spring 2013 Summary for Week 2 1 Intro to RMT (Gene) (Also see the Anderson - Guionnet - Zeitouni book, pp.6-11(?) ) We start with two independent families of R.V.s, {Z i,j } 1 i

More information

Multivariate Regression

Multivariate Regression Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the

More information

Statistical Inference On the High-dimensional Gaussian Covarianc

Statistical Inference On the High-dimensional Gaussian Covarianc Statistical Inference On the High-dimensional Gaussian Covariance Matrix Department of Mathematical Sciences, Clemson University June 6, 2011 Outline Introduction Problem Setup Statistical Inference High-Dimensional

More information

Non white sample covariance matrices.

Non white sample covariance matrices. Non white sample covariance matrices. S. Péché, Université Grenoble 1, joint work with O. Ledoit, Uni. Zurich 17-21/05/2010, Université Marne la Vallée Workshop Probability and Geometry in High Dimensions

More information

Analysis of variance, multivariate (MANOVA)

Analysis of variance, multivariate (MANOVA) Analysis of variance, multivariate (MANOVA) Abstract: A designed experiment is set up in which the system studied is under the control of an investigator. The individuals, the treatments, the variables

More information

Multivariate Regression (Chapter 10)

Multivariate Regression (Chapter 10) Multivariate Regression (Chapter 10) This week we ll cover multivariate regression and maybe a bit of canonical correlation. Today we ll mostly review univariate multivariate regression. With multivariate

More information

Thomas J. Fisher. Research Statement. Preliminary Results

Thomas J. Fisher. Research Statement. Preliminary Results Thomas J. Fisher Research Statement Preliminary Results Many applications of modern statistics involve a large number of measurements and can be considered in a linear algebra framework. In many of these

More information

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices

Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Testing linear hypotheses of mean vectors for high-dimension data with unequal covariance matrices Takahiro Nishiyama a,, Masashi Hyodo b, Takashi Seo a, Tatjana Pavlenko c a Department of Mathematical

More information

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data

Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations of High-Dimension, Low-Sample-Size Data Sri Lankan Journal of Applied Statistics (Special Issue) Modern Statistical Methodologies in the Cutting Edge of Science Asymptotic Distribution of the Largest Eigenvalue via Geometric Representations

More information

Random Matrix Theory and its Applications to Econometrics

Random Matrix Theory and its Applications to Econometrics Random Matrix Theory and its Applications to Econometrics Hyungsik Roger Moon University of Southern California Conference to Celebrate Peter Phillips 40 Years at Yale, October 2018 Spectral Analysis of

More information

1 Data Arrays and Decompositions

1 Data Arrays and Decompositions 1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is

More information

Comparison Method in Random Matrix Theory

Comparison Method in Random Matrix Theory Comparison Method in Random Matrix Theory Jun Yin UW-Madison Valparaíso, Chile, July - 2015 Joint work with A. Knowles. 1 Some random matrices Wigner Matrix: H is N N square matrix, H : H ij = H ji, EH

More information

Chapter 9. Hotelling s T 2 Test. 9.1 One Sample. The one sample Hotelling s T 2 test is used to test H 0 : µ = µ 0 versus

Chapter 9. Hotelling s T 2 Test. 9.1 One Sample. The one sample Hotelling s T 2 test is used to test H 0 : µ = µ 0 versus Chapter 9 Hotelling s T 2 Test 9.1 One Sample The one sample Hotelling s T 2 test is used to test H 0 : µ = µ 0 versus H A : µ µ 0. The test rejects H 0 if T 2 H = n(x µ 0 ) T S 1 (x µ 0 ) > n p F p,n

More information

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error

Journal of Multivariate Analysis. Sphericity test in a GMANOVA MANOVA model with normal error Journal of Multivariate Analysis 00 (009) 305 3 Contents lists available at ScienceDirect Journal of Multivariate Analysis journal homepage: www.elsevier.com/locate/jmva Sphericity test in a GMANOVA MANOVA

More information

Applications of Random Matrix Theory in Wireless Underwater Communication

Applications of Random Matrix Theory in Wireless Underwater Communication Applications of Random Matrix Theory in Wireless Underwater Communication Why Signal Processing and Wireless Communication Need Random Matrix Theory Atulya Yellepeddi May 13, 2013 18.338- Eigenvalues of

More information

Sliced Inverse Regression

Sliced Inverse Regression Sliced Inverse Regression Ge Zhao gzz13@psu.edu Department of Statistics The Pennsylvania State University Outline Background of Sliced Inverse Regression (SIR) Dimension Reduction Definition of SIR Inversed

More information

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:. MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss

More information

Department of Statistics

Department of Statistics Research Report Department of Statistics Research Report Department of Statistics No. 05: Testing in multivariate normal models with block circular covariance structures Yuli Liang Dietrich von Rosen Tatjana

More information

Effects of data dimension on empirical likelihood

Effects of data dimension on empirical likelihood Biometrika Advance Access published August 8, 9 Biometrika (9), pp. C 9 Biometrika Trust Printed in Great Britain doi:.9/biomet/asp7 Effects of data dimension on empirical likelihood BY SONG XI CHEN Department

More information

Testing Block-Diagonal Covariance Structure for High-Dimensional Data

Testing Block-Diagonal Covariance Structure for High-Dimensional Data Testing Block-Diagonal Covariance Structure for High-Dimensional Data MASASHI HYODO 1, NOBUMICHI SHUTOH 2, TAKAHIRO NISHIYAMA 3, AND TATJANA PAVLENKO 4 1 Department of Mathematical Sciences, Graduate School

More information

MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA

MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA J. Japan Statist. Soc. Vol. 37 No. 1 2007 53 86 MULTIVARIATE THEORY FOR ANALYZING HIGH DIMENSIONAL DATA M. S. Srivastava* In this article, we develop a multivariate theory for analyzing multivariate datasets

More information

TWO-SAMPLE BEHRENS-FISHER PROBLEM FOR HIGH-DIMENSIONAL DATA

TWO-SAMPLE BEHRENS-FISHER PROBLEM FOR HIGH-DIMENSIONAL DATA Statistica Sinica (014): Preprint 1 TWO-SAMPLE BEHRENS-FISHER PROBLEM FOR HIGH-DIMENSIONAL DATA Long Feng 1, Changliang Zou 1, Zhaojun Wang 1, Lixing Zhu 1 Nankai University and Hong Kong Baptist University

More information

Gaussian vectors and central limit theorem

Gaussian vectors and central limit theorem Gaussian vectors and central limit theorem Samy Tindel Purdue University Probability Theory 2 - MA 539 Samy T. Gaussian vectors & CLT Probability Theory 1 / 86 Outline 1 Real Gaussian random variables

More information

EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME. Xavier Mestre 1, Pascal Vallet 2

EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME. Xavier Mestre 1, Pascal Vallet 2 EXTENDED GLRT DETECTORS OF CORRELATION AND SPHERICITY: THE UNDERSAMPLED REGIME Xavier Mestre, Pascal Vallet 2 Centre Tecnològic de Telecomunicacions de Catalunya, Castelldefels, Barcelona (Spain) 2 Institut

More information

STAT 730 Chapter 5: Hypothesis Testing

STAT 730 Chapter 5: Hypothesis Testing STAT 730 Chapter 5: Hypothesis Testing Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 28 Likelihood ratio test def n: Data X depend on θ. The

More information

High Dimensional Covariance and Precision Matrix Estimation

High Dimensional Covariance and Precision Matrix Estimation High Dimensional Covariance and Precision Matrix Estimation Wei Wang Washington University in St. Louis Thursday 23 rd February, 2017 Wei Wang (Washington University in St. Louis) High Dimensional Covariance

More information

Cointegrated VAR s. Eduardo Rossi University of Pavia. November Rossi Cointegrated VAR s Financial Econometrics / 56

Cointegrated VAR s. Eduardo Rossi University of Pavia. November Rossi Cointegrated VAR s Financial Econometrics / 56 Cointegrated VAR s Eduardo Rossi University of Pavia November 2013 Rossi Cointegrated VAR s Financial Econometrics - 2013 1 / 56 VAR y t = (y 1t,..., y nt ) is (n 1) vector. y t VAR(p): Φ(L)y t = ɛ t The

More information

Estimation of large dimensional sparse covariance matrices

Estimation of large dimensional sparse covariance matrices Estimation of large dimensional sparse covariance matrices Department of Statistics UC, Berkeley May 5, 2009 Sample covariance matrix and its eigenvalues Data: n p matrix X n (independent identically distributed)

More information

Applications of random matrix theory to principal component analysis(pca)

Applications of random matrix theory to principal component analysis(pca) Applications of random matrix theory to principal component analysis(pca) Jun Yin IAS, UW-Madison IAS, April-2014 Joint work with A. Knowles and H. T Yau. 1 Basic picture: Let H be a Wigner (symmetric)

More information

CENTRAL LIMIT THEOREMS FOR CLASSICAL LIKELIHOOD RATIO TESTS FOR HIGH-DIMENSIONAL NORMAL DISTRIBUTIONS

CENTRAL LIMIT THEOREMS FOR CLASSICAL LIKELIHOOD RATIO TESTS FOR HIGH-DIMENSIONAL NORMAL DISTRIBUTIONS The Annals of Statistics 013, Vol. 41, No. 4, 09 074 DOI: 10.114/13-AOS1134 Institute of Mathematical Statistics, 013 CENTRAL LIMIT THEOREMS FOR CLASSICAL LIKELIHOOD RATIO TESTS FOR HIGH-DIMENSIONAL NORMAL

More information

On singular values distribution of a matrix large auto-covariance in the ultra-dimensional regime. Title

On singular values distribution of a matrix large auto-covariance in the ultra-dimensional regime. Title itle On singular values distribution of a matrix large auto-covariance in the ultra-dimensional regime Authors Wang, Q; Yao, JJ Citation Random Matrices: heory and Applications, 205, v. 4, p. article no.

More information

Uses of Asymptotic Distributions: In order to get distribution theory, we need to norm the random variable; we usually look at n 1=2 ( X n ).

Uses of Asymptotic Distributions: In order to get distribution theory, we need to norm the random variable; we usually look at n 1=2 ( X n ). 1 Economics 620, Lecture 8a: Asymptotics II Uses of Asymptotic Distributions: Suppose X n! 0 in probability. (What can be said about the distribution of X n?) In order to get distribution theory, we need

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

simple if it completely specifies the density of x

simple if it completely specifies the density of x 3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely

More information

An introduction to G-estimation with sample covariance matrices

An introduction to G-estimation with sample covariance matrices An introduction to G-estimation with sample covariance matrices Xavier estre xavier.mestre@cttc.cat Centre Tecnològic de Telecomunicacions de Catalunya (CTTC) "atrices aleatoires: applications aux communications

More information

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data

High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data High Dimensional Empirical Likelihood for Generalized Estimating Equations with Dependent Data Song Xi CHEN Guanghua School of Management and Center for Statistical Science, Peking University Department

More information

A Note on the Central Limit Theorem for the Eigenvalue Counting Function of Wigner and Covariance Matrices

A Note on the Central Limit Theorem for the Eigenvalue Counting Function of Wigner and Covariance Matrices A Note on the Central Limit Theorem for the Eigenvalue Counting Function of Wigner and Covariance Matrices S. Dallaporta University of Toulouse, France Abstract. This note presents some central limit theorems

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

Statistical signal processing

Statistical signal processing Statistical signal processing Short overview of the fundamentals Outline Random variables Random processes Stationarity Ergodicity Spectral analysis Random variable and processes Intuition: A random variable

More information

Learning gradients: prescriptive models

Learning gradients: prescriptive models Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan

More information

Regularized Estimation of High Dimensional Covariance Matrices. Peter Bickel. January, 2008

Regularized Estimation of High Dimensional Covariance Matrices. Peter Bickel. January, 2008 Regularized Estimation of High Dimensional Covariance Matrices Peter Bickel Cambridge January, 2008 With Thanks to E. Levina (Joint collaboration, slides) I. M. Johnstone (Slides) Choongsoon Bae (Slides)

More information

Lecture 32: Asymptotic confidence sets and likelihoods

Lecture 32: Asymptotic confidence sets and likelihoods Lecture 32: Asymptotic confidence sets and likelihoods Asymptotic criterion In some problems, especially in nonparametric problems, it is difficult to find a reasonable confidence set with a given confidence

More information

Lecture Stat Information Criterion

Lecture Stat Information Criterion Lecture Stat 461-561 Information Criterion Arnaud Doucet February 2008 Arnaud Doucet () February 2008 1 / 34 Review of Maximum Likelihood Approach We have data X i i.i.d. g (x). We model the distribution

More information

Lecture 5: Hypothesis tests for more than one sample

Lecture 5: Hypothesis tests for more than one sample 1/23 Lecture 5: Hypothesis tests for more than one sample Måns Thulin Department of Mathematics, Uppsala University thulin@math.uu.se Multivariate Methods 8/4 2011 2/23 Outline Paired comparisons Repeated

More information

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 1998 Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ Lawrence D. Brown University

More information

An Introduction to Spectral Learning

An Introduction to Spectral Learning An Introduction to Spectral Learning Hanxiao Liu November 8, 2013 Outline 1 Method of Moments 2 Learning topic models using spectral properties 3 Anchor words Preliminaries X 1,, X n p (x; θ), θ = (θ 1,

More information

Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices

Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices by Jack W. Silverstein* Department of Mathematics Box 8205 North Carolina State University Raleigh,

More information

arxiv: v1 [math.pr] 13 Aug 2007

arxiv: v1 [math.pr] 13 Aug 2007 The Annals of Probability 2007, Vol. 35, No. 4, 532 572 DOI: 0.24/009790600000079 c Institute of Mathematical Statistics, 2007 arxiv:0708.720v [math.pr] 3 Aug 2007 ON ASYMPTOTICS OF EIGENVECTORS OF LARGE

More information

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,

Linear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,

More information

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality.

5.1 Consistency of least squares estimates. We begin with a few consistency results that stand on their own and do not depend on normality. 88 Chapter 5 Distribution Theory In this chapter, we summarize the distributions related to the normal distribution that occur in linear models. Before turning to this general problem that assumes normal

More information

Unit roots in vector time series. Scalar autoregression True model: y t 1 y t1 2 y t2 p y tp t Estimated model: y t c y t1 1 y t1 2 y t2

Unit roots in vector time series. Scalar autoregression True model: y t 1 y t1 2 y t2 p y tp t Estimated model: y t c y t1 1 y t1 2 y t2 Unit roots in vector time series A. Vector autoregressions with unit roots Scalar autoregression True model: y t y t y t p y tp t Estimated model: y t c y t y t y t p y tp t Results: T j j is asymptotically

More information

9.1 Orthogonal factor model.

9.1 Orthogonal factor model. 36 Chapter 9 Factor Analysis Factor analysis may be viewed as a refinement of the principal component analysis The objective is, like the PC analysis, to describe the relevant variables in study in terms

More information

Multivariate Gaussian Distribution. Auxiliary notes for Time Series Analysis SF2943. Spring 2013

Multivariate Gaussian Distribution. Auxiliary notes for Time Series Analysis SF2943. Spring 2013 Multivariate Gaussian Distribution Auxiliary notes for Time Series Analysis SF2943 Spring 203 Timo Koski Department of Mathematics KTH Royal Institute of Technology, Stockholm 2 Chapter Gaussian Vectors.

More information