Future Random Matrix Tools for Large Dimensional Signal Processing


1 Future Random Matrix Tools for Large Dimensional Signal Processing (1/142)
EUSIPCO 2014, Lisbon, Portugal.
Abla KAMMOUN (1) and Romain COUILLET (2)
(1) King Abdullah University of Science and Technology, Saudi Arabia
(2) Supélec, France
September 1st, 2014

2 Introduction/ 2/142 High-dimensional data
Consider n observations x_1, ..., x_n of size N, independent and identically distributed with zero mean and covariance C_N, i.e., E[x_1 x_1^H] = C_N. Let X_N = [x_1, ..., x_n]. The sample covariance estimate Ŝ_N of C_N is given by
  Ŝ_N = (1/n) X_N X_N^H = (1/n) Σ_{i=1}^n x_i x_i^H.
From the law of large numbers, as n → ∞, Ŝ_N → C_N almost surely (convergence in the operator norm). In practice, it might be difficult to afford n → ∞:
- if n ≫ N, Ŝ_N can be sufficiently accurate;
- if N/n = O(1), we model this scenario by the assumption N → ∞ and n → ∞ with N/n → c.
Under this assumption, we still have pointwise convergence of each entry, i.e., (Ŝ_N)_{i,j} - (C_N)_{i,j} → 0 a.s., but ||Ŝ_N - C_N|| does not converge to zero: convergence in the operator norm does not hold.

3 Introduction/ 3/142 Illustration
Consider C_N = I_N: the spectrum of Ŝ_N is different from that of C_N.
Figure: Histogram of the eigenvalues of Ŝ_N against the Marchenko-Pastur law, N = 400 and n = 2000.
The asymptotic spectrum can be characterized by the Marchenko-Pastur law.
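A minimal numerical sketch of this illustration (not part of the slides), assuming NumPy and Matplotlib; the sizes N, n and bin count are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

# Eigenvalues of the sample covariance matrix when C_N = I_N,
# compared with the Marchenko-Pastur density (here c = N/n < 1).
N, n = 400, 2000
c = N / n

X = np.random.randn(N, n) / np.sqrt(n)          # i.i.d. entries of variance 1/n
S = X @ X.T                                     # sample covariance estimate
eigs = np.linalg.eigvalsh(S)

a, b = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
x = np.linspace(a, b, 500)
mp_density = np.sqrt(np.maximum((x - a) * (b - x), 0)) / (2 * np.pi * c * x)

plt.hist(eigs, bins=50, density=True, alpha=0.5, label="Eigenvalues of $\\hat S_N$")
plt.plot(x, mp_density, "r", label="Marchenko-Pastur law")
plt.legend(); plt.show()
```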

4 Introduction/ 4/142 Reasons of interest for signal processing
- Scale similarity in array processing applications: large antenna arrays vs. a limited number of observations.
- Need for detection and estimation based on large dimensional random inputs: subspace methods in array processing.
- The assumption "number of observations ≫ dimension of observations" is no longer valid: large arrays, systems with fast dynamics.
Example: MUSIC with few samples (or with large arrays). Call A(Θ) = [a(θ_1), ..., a(θ_K)] ∈ C^{N×K}, N large, K small, the steering vectors to identify, and X = [x_1, ..., x_n] ∈ C^{N×n} the n samples, taken from
  x_t = Σ_{k=1}^K a(θ_k) √p_k s_{k,t} + σ w_t.
The MUSIC localization function reads
  γ(θ) = a(θ)^H Û_W Û_W^H a(θ)
in the signal vs. noise spectral decomposition XX^H = Û_S Λ̂_S Û_S^H + Û_W Λ̂_W Û_W^H. Writing equivalently A(Θ) P A(Θ)^H + σ² I_N = U_S Λ_S U_S^H + σ² U_W U_W^H, as n, N → ∞ with n/N → c, from our previous remarks, Û_W Û_W^H is not a consistent estimate of U_W U_W^H.
MUSIC is NOT consistent in the large N, n regime! We need improved RMT-based solutions.

5 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 7/142 Stieltjes Transform
Definition. Let F be a real probability distribution function. The Stieltjes transform m_F of F is the function defined, for z ∈ C⁺, as
  m_F(z) = ∫ dF(λ) / (λ - z).
For a < b continuity points of F, denoting z = x + iy, we have the inverse formula
  F(b) - F(a) = lim_{y→0} (1/π) ∫_a^b Im[m_F(x + iy)] dx.
If F has a density f at x, then
  f(x) = lim_{y→0} (1/π) Im[m_F(x + iy)].
The Stieltjes transform is to the Cauchy transform as the characteristic function is to the Fourier transform.
Equivalence F ↔ m_F: similar to the Fourier transform, knowing m_F is the same as knowing F.

6 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 8/142 Stieltjes transform of a Hermitian matrix
Let X be an N×N Hermitian random matrix. Denote by dF^X the empirical measure of its eigenvalues λ_1, ..., λ_N, i.e., dF^X = (1/N) Σ_{i=1}^N δ_{λ_i}.
The Stieltjes transform of X, denoted m_X = m_{F^X}, is the Stieltjes transform of its empirical measure:
  m_X(z) = ∫ dF^X(λ)/(λ - z) = (1/N) Σ_{i=1}^N 1/(λ_i - z) = (1/N) tr (X - zI_N)^{-1}.
The Stieltjes transform of a random matrix is the normalized trace of the resolvent matrix Q(z) = (X - zI_N)^{-1}. The resolvent matrix plays a key role in the derivation of many of the results of random matrix theory.
For compactly supported F, m_F(z) is linked to the moments M_k = E[(1/N) tr X^k]:
  m_F(z) = - Σ_{k=0}^{+∞} M_k / z^{k+1}.
m_F is defined in general on C⁺ but exists everywhere outside the support of F.
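A minimal sketch (not from the slides) of the two equivalent evaluations of the Stieltjes transform of a Hermitian matrix; the matrix and the evaluation point z are illustrative.

```python
import numpy as np

# The Stieltjes transform evaluated both as (1/N) tr(X - zI)^{-1}
# and through the empirical eigenvalue measure: the two agree.
N = 200
A = np.random.randn(N, N)
X = (A + A.T) / np.sqrt(2 * N)             # a Hermitian (real symmetric) matrix

z = 0.5 + 0.1j                              # a point in the upper half plane C+
m_resolvent = np.trace(np.linalg.inv(X - z * np.eye(N))) / N
eigs = np.linalg.eigvalsh(X)
m_empirical = np.mean(1.0 / (eigs - z))

print(m_resolvent, m_empirical)             # identical up to numerical precision
```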

7 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 9/142 Side remark: the Shannon transform
A. M. Tulino, S. Verdú, Random Matrix Theory and Wireless Communications, Now Publishers Inc., 2004.
Definition. Let F be a probability distribution and m_F its Stieltjes transform. The Shannon transform V_F of F is defined as
  V_F(x) ≜ ∫_0^∞ log(1 + xλ) dF(λ) = ∫_0^x ( 1/t - (1/t²) m_F(-1/t) ) dt.
This quantity is fundamental for wireless communication purposes! Note that m_F itself is of interest, not F!

8 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 10/142 Proof of the Marčenko-Pastur law
V. A. Marčenko, L. A. Pastur, "Distribution of eigenvalues for some sets of random matrices", Math. USSR-Sbornik, vol. 1, no. 4, 1967.
The theorem to be proven is the following.
Theorem. Let X_N ∈ C^{N×n} have i.i.d. zero mean, variance 1/n entries with finite eighth order moment. As n, N → ∞ with N/n → c ∈ (0, ∞), the e.s.d. of X_N X_N^H converges almost surely to a nonrandom distribution function F_c with density f_c given by
  f_c(x) = (1 - c^{-1})⁺ δ(x) + (1/(2πcx)) √( (x - a)⁺ (b - x)⁺ ),
where a = (1 - √c)² and b = (1 + √c)².

9 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 11/142 The Marčenko-Pastur density
Figure: The Marčenko-Pastur density f_c for several limit ratios c = lim_N N/n (e.g., c = 0.1 and c = 0.2).

10 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 12/142 Diagonal entries of the resolvent
Since we want an expression for m_F, we start by identifying the diagonal entries of the resolvent (X_N X_N^H - zI_N)^{-1} of X_N X_N^H. Write
  X_N = [ y^H ; Y ]   (y^H the first row of X_N, Y the remaining (N-1)×n block).
Now, for z ∈ C⁺,
  X_N X_N^H - zI_N = [ y^H y - z , y^H Y^H ; Y y , Y Y^H - zI_{N-1} ].
Consider the first diagonal element of (X_N X_N^H - zI_N)^{-1}. From the block matrix inversion lemma,
  [A B; C D]^{-1} = [ (A - BD^{-1}C)^{-1} , -A^{-1}B(D - CA^{-1}B)^{-1} ; -D^{-1}C(A - BD^{-1}C)^{-1} , (D - CA^{-1}B)^{-1} ],
which here gives
  [ (X_N X_N^H - zI_N)^{-1} ]_{11} = 1 / ( -z - z y^H (Y^H Y - zI_n)^{-1} y ).

11 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 13/142 Trace Lemma
Z. Bai, J. Silverstein, Spectral Analysis of Large Dimensional Random Matrices, Springer Series in Statistics, 2010.
To go further, we need the following result.
Theorem. Let {A_N} ⊂ C^{N×N} have bounded spectral norm. Let {x_N} ⊂ C^N be a random vector of i.i.d. entries with zero mean, variance 1/N and finite 8th order moment, independent of A_N. Then
  x_N^H A_N x_N - (1/N) tr A_N → 0 almost surely.
For large N, we therefore have approximately
  [ (X_N X_N^H - zI_N)^{-1} ]_{11} ≈ 1 / ( -z - z (1/n) tr (Y^H Y - zI_n)^{-1} ).

12 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 14/142 Rank-1 perturbation lemma
J. W. Silverstein, Z. D. Bai, "On the empirical distribution of eigenvalues of a class of large dimensional random matrices", Journal of Multivariate Analysis, vol. 54, no. 2, 1995.
It is somewhat intuitive that adding a single column to Y won't affect the trace in the limit.
Theorem. Let A and B be N×N with B Hermitian nonnegative definite, and v ∈ C^N. For z ∈ C \ R⁺,
  | (1/N) tr ( (B - zI_N)^{-1} - (B + vv^H - zI_N)^{-1} ) A | ≤ ||A|| / (N dist(z, R⁺)),
with ||A|| the spectral norm of A and dist(z, A) = inf_{y∈A} |y - z|.
Therefore, for large N, we have approximately
  [ (X_N X_N^H - zI_N)^{-1} ]_{11} ≈ 1/( -z - z (1/n) tr (Y^H Y - zI_n)^{-1} ) ≈ 1/( -z - z (1/n) tr (X_N^H X_N - zI_n)^{-1} ) = 1/( -z - z m_F̲(z) ),
in which we recognize the Stieltjes transform m_F̲ of the l.s.d. F̲ of the companion matrix X_N^H X_N.

13 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 15/142 End of the proof
We have the relation m_F̲(z) = (N/n) m_F(z) + ((N - n)/n)(1/z), hence
  [ (X_N X_N^H - zI_N)^{-1} ]_{11} ≈ 1 / ( 1 - N/n - z - z (N/n) m_F(z) ).
Note that the choice of the entry (1,1) is irrelevant here, so the expression is valid for every pair (i,i). Summing over the N diagonal terms and averaging, we finally have
  m_F(z) = (1/N) tr (X_N X_N^H - zI_N)^{-1} ≈ 1 / ( 1 - c - z - z c m_F(z) ),
which solves a polynomial equation of second order. Finally,
  m_F(z) = ( 1 - c - z + √((1 - c - z)² - 4cz) ) / (2cz),
with the branch of the square root chosen such that m_F(z) ∈ C⁺ for z ∈ C⁺. From the inverse Stieltjes transform formula, we then verify that m_F is the Stieltjes transform of the Marčenko-Pastur law.
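A minimal numerical check of the closed-form expression against the empirical resolvent trace (a sketch, not from the slides; sizes and the test point z are illustrative).

```python
import numpy as np

# Compare the closed-form Marchenko-Pastur Stieltjes transform with
# the empirical resolvent trace (1/N) tr(XX^H - zI)^{-1}.
N, n = 400, 1000
c = N / n
X = (np.random.randn(N, n) + 1j * np.random.randn(N, n)) / np.sqrt(2 * n)

z = 2.0 + 0.05j
eigs = np.linalg.eigvalsh(X @ X.conj().T)
m_empirical = np.mean(1.0 / (eigs - z))

# branch of the square root chosen so that Im[m_F(z)] > 0 for z in C+
sqrt_term = np.sqrt((1 - c - z) ** 2 - 4 * c * z)
m_theory = (1 - c - z + sqrt_term) / (2 * c * z)
if m_theory.imag < 0:
    m_theory = (1 - c - z - sqrt_term) / (2 * c * z)

print(m_empirical, m_theory)   # should agree to a few decimal places
```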

14 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 16/142 Related bibliography
- V. A. Marčenko, L. A. Pastur, "Distribution of eigenvalues for some sets of random matrices", Math. USSR-Sbornik, vol. 1, no. 4, 1967.
- J. W. Silverstein, Z. D. Bai, "On the empirical distribution of eigenvalues of a class of large dimensional random matrices", Journal of Multivariate Analysis, vol. 54, no. 2, 1995.
- Z. D. Bai, J. W. Silverstein, Spectral Analysis of Large Dimensional Random Matrices, 2nd edition, Springer Series in Statistics, 2010.
- R. B. Dozier, J. W. Silverstein, "On the empirical distribution of eigenvalues of large dimensional information-plus-noise-type matrices", Journal of Multivariate Analysis, vol. 98, no. 4, 2007.
- V. L. Girko, Theory of Random Determinants, Kluwer, Dordrecht, 1990.
- A. M. Tulino, S. Verdú, Random Matrix Theory and Wireless Communications, Now Publishers Inc., 2004.

15 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 17/142 Asymptotic results involving the Stieltjes transform
J. W. Silverstein, Z. D. Bai, "On the empirical distribution of eigenvalues of a class of large dimensional random matrices", Journal of Multivariate Analysis, vol. 54, no. 2, 1995.
Theorem. Let Y_N = (1/√n) C_N^{1/2} X_N, where X_N ∈ C^{N×n} has i.i.d. entries of mean 0 and variance 1. Consider the regime n, N → ∞ with N/n → c. Let m̂_N be the Stieltjes transform of the e.s.d. of Y_N^H Y_N = (1/n) X_N^H C_N X_N. Then m̂_N - m̲_N → 0 almost surely for all z ∈ C\R⁺, where m̲_N(z) is the unique solution in the set {z ∈ C⁺, m̲_N(z) ∈ C⁺} to
  m̲_N(z) = ( c ∫ t dF^{C_N}(t) / (1 + t m̲_N(z)) - z )^{-1}.
- In general, there is no explicit expression for F_N, the distribution whose Stieltjes transform is m̲_N(z).
- The theorem also characterizes the Stieltjes transform m_N of B_N = Y_N Y_N^H = (1/n) C_N^{1/2} X_N X_N^H C_N^{1/2}, through m̲_N = c m_N + (c - 1)/z.
- This gives access to the spectrum of the sample covariance matrix model: y_i = C_N^{1/2} x_i with x_i i.i.d., C_N = E[y y^H].

16 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 18/142 Getting F from m_F
Remember that the density f of F at x is obtained as
  f(x) = lim_{y↓0} (1/π) Im[m_F(x + iy)],
where m_F is (up to now) only defined on C⁺. To plot the density f:
- first approach: span z = x + iy on the line {x ∈ R, y = ε} parallel but close to the real axis, solve for m_F(z) at each z, and plot (1/π) Im[m_F(z)];
- refined approach: spectrum analysis, to come next.
Example (sample covariance matrix). For N a multiple of 3, let
  F^C(x) = (1/3) 1_{x ≥ 1} + (1/3) 1_{x ≥ 3} + (1/3) 1_{x ≥ K},
and let B_N = (1/n) C_N^{1/2} Z_N^H Z_N C_N^{1/2} with F^{B_N} ⇒ F; then
  m_F̲ = c m_F + (c - 1)/z,  m_F̲(z) = ( c ∫ t dF^C(t)/(1 + t m_F̲(z)) - z )^{-1}.
We take c = 1/10 and alternatively K = 7 and K = 4.
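A minimal sketch of the first approach for this three-mass example (not from the slides): solve the fixed-point equation for m_F̲ slightly above the real axis and plot the density. The plain fixed-point iteration, the offset eps and the iteration count are illustrative choices; convergence can be slow very close to the real axis.

```python
import numpy as np
import matplotlib.pyplot as plt

c = 1 / 10
masses = np.array([1.0, 3.0, 7.0])          # population eigenvalues (K = 7 here)
weights = np.ones(3) / 3

def m_under(z, iters=1000):
    m = -1 / z                               # initialization
    for _ in range(iters):                   # fixed-point iteration of the canonical equation
        m = 1 / (c * np.sum(weights * masses / (1 + masses * m)) - z)
    return m

xs = np.linspace(0.01, 12, 600)
eps = 1e-3
density = []
for x in xs:
    z = x + 1j * eps
    mu = m_under(z)
    m_F = (mu - (c - 1) / z) / c             # recover m_F of B_N from the companion transform
    density.append(m_F.imag / np.pi)

plt.plot(xs, density)
plt.xlabel("x"); plt.ylabel("density"); plt.show()
```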

17 Part 1: Fundamentals of Random Matrix Theory/1.1 The Stieltjes Transform Method 19/142 Spectrum of the sample covariance matrix
Figure: Histogram of the eigenvalues of B_N = (1/n) C_N^{1/2} Z_N^H Z_N C_N^{1/2} against the limit law, N = 300, n = 3000, with C_N diagonal composed of three evenly weighted masses at (i) 1, 3 and 7 (top), (ii) 1, 3 and 4 (bottom).

18 Part 1: Fundamentals of Random Matrix Theory/1.2 Extreme eigenvalues 21/142 Support of a distribution
The support of a density f is the closure of the set {x : f(x) > 0}. For instance, the support of the Marčenko-Pastur law (for c ≤ 1) is [(1 - √c)², (1 + √c)²].
Figure: The Marčenko-Pastur law and its support, for the limit ratio c = 0.5.

19 Part 1: Fundamentals of Random Matrix Theory/1.2 Extreme eigenvalues 22/142 Extreme eigenvalues
Limiting spectral results are insufficient to infer the location of the extreme eigenvalues.
Example: consider dF_N(x) = (1/N) Σ_{k=1}^N δ_{a_k}. Then dF_N and
  dF_N^0 = ((N-1)/N) dF_N + (1/N) δ_{A_N}(x),
with A_N ≫ a_k, satisfy F_N - F_N^0 ⇒ 0. However, the supports of F_N and F_N^0 differ by the mass at A_N.
Question: how do the extreme eigenvalues of random covariance matrices behave?

20 Part 1: Fundamentals of Random Matrix Theory/1.2 Extreme eigenvalues 23/142 No eigenvalue outside the support of sample covariance matrices
Z. D. Bai, J. W. Silverstein, "No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices", The Annals of Probability, vol. 26, no. 1, 1998.
Theorem. Let X_N ∈ C^{N×n} have i.i.d. entries with zero mean, unit variance and finite fourth order moment. Let C_N ∈ C^{N×N} be nonrandom and bounded in spectral norm. Let m̲_N be the unique solution in C⁺ of
  m̲_N(z) = -( z - (N/n) ∫ τ/(1 + τ m̲_N(z)) dF^{C_N}(τ) )^{-1},  and set  m_N(z) = (n/N) m̲_N(z) + ((n - N)/N)(1/z),  z ∈ C⁺.
Let F_N be the distribution associated with the Stieltjes transform m_N(z). Consider B_N = (1/n) C_N^{1/2} X_N X_N^H C_N^{1/2}. We know that F^{B_N} - F_N converges weakly to zero. Choose N_0 ∈ N and an interval [a, b], a > 0, outside the support of F_N for all N ≥ N_0. Denote L_N the set of eigenvalues of B_N. Then
  P( L_N ∩ [a, b] ≠ ∅ i.o. ) = 0.

21 Part 1: Fundamentals of Random Matrix Theory/1.2 Extreme eigenvalues 24/142 No eigenvalue outside the support: which models?
D. Paul, J. W. Silverstein, "No eigenvalues outside the support of the limiting empirical spectral distribution of a separable covariance matrix", Journal of Multivariate Analysis, vol. 100, no. 1, 2009.
It has already been shown that (for all large N) there is no eigenvalue outside the support for:
- the Marčenko-Pastur law: XX^H, X with i.i.d. entries of zero mean, variance 1/N, finite 4th order moment;
- the sample covariance matrix: C^{1/2} X X^H C^{1/2} and X^H C X, X with i.i.d. entries of zero mean, variance 1/N, finite 4th order moment;
- the doubly-correlated matrix: R^{1/2} X C X^H R^{1/2}, X with i.i.d. entries of zero mean, variance 1/N, finite 4th order moment.
J. W. Silverstein, Z. D. Bai, Y. Q. Yin, "A note on the largest eigenvalue of a large dimensional sample covariance matrix", Journal of Multivariate Analysis, vol. 26, no. 2, 1988.
- If the 4th order moment is infinite, lim sup_N λ_max(XX^H) = ∞.
Z. D. Bai, J. W. Silverstein, "No eigenvalues outside the support of the limiting spectral distribution of information-plus-noise type matrices", to appear in Random Matrices: Theory and Applications.
- Only recently, information-plus-noise models (X + A)(X + A)^H, X with i.i.d. entries of zero mean, variance 1/N, finite 4th order moment, and the generally correlated model where each column of X has correlation R_i.

22 Part 1: Fundamentals of Random Matrix Theory/1.2 Extreme eigenvalues 25/142 Extreme eigenvalues: deeper into the spectrum
In order to derive statistical detection tests, we need more information about the extreme eigenvalues.
- We will study the fluctuations of the extreme eigenvalues (second order statistics).
- However, the Stieltjes transform method is not adapted here!

23 Part 1: Fundamentals of Random Matrix Theory/1.2 Extreme eigenvalues 26/142 Distribution of the largest eigenvalue of XX^H
C. A. Tracy, H. Widom, "On orthogonal and symplectic matrix ensembles", Communications in Mathematical Physics, vol. 177, no. 3, 1996.
K. Johansson, "Shape Fluctuations and Random Matrices", Communications in Mathematical Physics, vol. 209, 2000.
Theorem. Let X ∈ C^{N×n} have i.i.d. Gaussian entries of zero mean and variance 1/n. Denoting λ⁺_N the largest eigenvalue of XX^H, then
  N^{2/3} ( λ⁺_N - (1 + √c)² ) / ( (1 + √c)^{4/3} c^{1/2} ) ⇒ X⁺ ~ F⁺,
with c = lim_N N/n and F⁺ the Tracy-Widom distribution given by
  F⁺(t) = exp( -∫_t^∞ (x - t) q²(x) dx ),
with q the Painlevé II function solving the differential equation
  q''(x) = x q(x) + 2 q³(x),  q(x) ~ Ai(x) as x → ∞,
in which Ai(x) is the Airy function.
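A minimal Monte-Carlo sketch (not from the slides) of the centered and scaled largest eigenvalue whose histogram approaches the Tracy-Widom law; sizes and the number of trials are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

N, n, trials = 100, 300, 2000
c = N / n
samples = np.empty(trials)
for t in range(trials):
    X = (np.random.randn(N, n) + 1j * np.random.randn(N, n)) / np.sqrt(2 * n)
    lam_max = np.linalg.eigvalsh(X @ X.conj().T)[-1]
    # centered and scaled largest eigenvalue, as in the theorem above
    samples[t] = N ** (2 / 3) * (lam_max - (1 + np.sqrt(c)) ** 2) \
                 / ((1 + np.sqrt(c)) ** (4 / 3) * np.sqrt(c))

plt.hist(samples, bins=60, density=True)
plt.xlabel("centered-scaled largest eigenvalue"); plt.show()
```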

24 Part 1: Fundamentals of Random Matrix Theory/1.2 Extreme eigenvalues 27/142 The law of Tracy-Widom
Figure: Distribution of N^{2/3} c^{-1/2} (1 + √c)^{-4/3} [λ⁺_N - (1 + √c)²] against the distribution of X⁺ (Tracy-Widom law F⁺) for N = 500, n = 1500, c = 1/3, for the covariance matrix model XX^H. Empirical distribution taken over 10,000 Monte-Carlo simulations.

25 Part 1: Fundamentals of Random Matrix Theory/1.2 Extreme eigenvalues 28/142 Techniques of proof
The method of proof requires very different tools:
- orthogonal (Laguerre) polynomials: write the joint unordered eigenvalue distribution as a kernel determinant,
  ρ_N(λ_1, ..., λ_p) = det [ K_N(λ_i, λ_j) ]_{i,j=1}^p,
  with K_N(x, y) the Laguerre kernel;
- Fredholm determinants: write the hole probability as a Fredholm determinant,
  P( N^{2/3}(λ_i - (1 + √c)²) ∈ A^c, i = 1, ..., N ) = 1 + Σ_{k≥1} ((-1)^k / k!) ∫_{A^c} ⋯ ∫_{A^c} det [ K_N(x_i, x_j) ]_{i,j=1}^k dx_1 ⋯ dx_k ≜ det(I - K_N);
- kernel theory: show that K_N converges to the Airy kernel,
  K_N(x, y) → K_Airy(x, y) = ( Ai(x)Ai'(y) - Ai'(x)Ai(y) ) / (x - y);
- differential equation tricks: the hole probability on [t, ∞) gives the right-most eigenvalue distribution, which is simplified as the solution of a Painlevé II differential equation: the Tracy-Widom distribution,
  F⁺(t) = exp( -∫_t^∞ (x - t) q(x)² dx ),  q''(x) = x q(x) + 2 q³(x),  q(x) ~ Ai(x) as x → ∞.

26 Part 1: Fundamentals of Random Matrix Theory/1.2 Extreme eigenvalues 29/142 Comments on the Tracy-Widom law
- A deeper result than the limiting eigenvalue result.
- Gives a hint on the convergence speed.
- Fairly biased to the left: even fewer eigenvalues outside the support.
- Can be shown to hold for other distributions than Gaussian under mild assumptions.

27 Part 1: Fundamentals of Random Matrix Theory/1.3 Extreme eigenvalues: the spiked models 31/142 Spiked models
- We consider n independent observations x_1, ..., x_n of size N.
- The correlation structure is in general "white + low rank": E[x_1 x_1^H] = I_N + P, where P is of low rank.
- Objective: infer the eigenvalues and/or the eigenvectors of P.

28 Part 1: Fundamentals of Random Matrix Theory/1.3 Extreme eigenvalues: the spiked models 32/142 The first result
J. Baik, J. W. Silverstein, "Eigenvalues of large sample covariance matrices of spiked population models", Journal of Multivariate Analysis, vol. 97, no. 6, 2006.
Theorem. Let B_N = (1/n)(I + P)^{1/2} X_N X_N^H (I + P)^{1/2}, where X_N ∈ C^{N×n} has i.i.d. zero mean, unit variance entries, and P ∈ R^{N×N} with eigenvalues
  eig(P) = diag(ω_1, ..., ω_K, 0, ..., 0)   (N - K zeros),
with ω_1 > ... > ω_K > -1 and c = lim_N N/n. Let λ_1, ..., λ_N be the ordered eigenvalues of B_N. We then have:
- if ω_j > √c, λ_j → 1 + ω_j + c(1 + ω_j)/ω_j a.s. (i.e., beyond the Marčenko-Pastur bulk!);
- if ω_j ∈ (0, √c], λ_j → (1 + √c)² a.s. (i.e., right edge of the Marčenko-Pastur bulk!);
- if ω_j ∈ [-√c, 0), λ_j → (1 - √c)² a.s. (i.e., left edge of the Marčenko-Pastur bulk!);
- for the other eigenvalues, we discriminate over c:
  - if ω_j < -√c and c < 1, λ_j → 1 + ω_j + c(1 + ω_j)/ω_j a.s. (i.e., below the Marčenko-Pastur bulk!);
  - if ω_j < -√c and c > 1, λ_j → (1 - √c)² a.s. (i.e., left edge of the Marčenko-Pastur bulk!).
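A minimal numerical check of the isolated-eigenvalue limits (a sketch, not from the slides; the spike values and dimensions are illustrative).

```python
import numpy as np

# Population covariance I_N + P with spikes omega; the isolated sample eigenvalues
# should be near 1 + omega + c*(1 + omega)/omega when omega > sqrt(c).
N, n = 500, 1500
c = N / n
omegas = np.array([2.0, 1.0])                          # spikes, both larger than sqrt(c)

pop = np.ones(N); pop[:len(omegas)] += omegas           # eigenvalues of I + P
X = (np.random.randn(N, n) + 1j * np.random.randn(N, n)) / np.sqrt(2 * n)
B = (np.sqrt(pop)[:, None] * X) @ (np.sqrt(pop)[:, None] * X).conj().T
eigs = np.sort(np.linalg.eigvalsh(B))[::-1]

predicted = 1 + omegas + c * (1 + omegas) / omegas
print("largest sample eigenvalues:", np.round(eigs[:2], 3))
print("predicted limits:          ", np.round(predicted, 3))
print("bulk right edge:           ", round((1 + np.sqrt(c)) ** 2, 3))
```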

29 Part 1: Fundamentals of Random Matrix Theory/1.3 Extreme eigenvalues: the spiked models 33/142 Illustration of spiked models
Figure: Eigenvalues of B_N = (1/n)(P + I)^{1/2} X_N X_N^H (P + I)^{1/2} against the Marčenko-Pastur law (c = 1/3), where ω_1 = ω_2 = 1 and ω_3 = ω_4 = 2; the isolated eigenvalues concentrate around 1 + ω_1 + c(1 + ω_1)/ω_1 and 1 + ω_3 + c(1 + ω_3)/ω_3. Dimensions: N = 500, n = 1500.

30 Part 1: Fundamentals of Random Matrix Theory/1.3 Extreme eigenvalues: the spiked models 34/142 Interpretation of the result
- If c is large or, alternatively, if some population spikes are small, part or all of the population spikes are attracted into the bulk of the support!
- If so, there is no way to decide on the existence of the spikes from looking at the largest eigenvalues.
- In signal processing words, signals might be missed when using largest-eigenvalue methods.
- As a consequence, the more sensors (N) for a given number of samples, the larger c = lim N/n, and the more probable it is that we miss a spike.

31 Part 1: Fundamentals of Random Matrix Theory/1.3 Extreme eigenvalues: the spiked models 35/142 Sketch of the proof
We start with a study of the limiting extreme eigenvalues. Let x > 0; then (writing XX^H for (1/n) X_N X_N^H)
  det(B_N - xI_N) = det(I_N + P) det( XX^H - xI_N + x[I_N - (I_N + P)^{-1}] )
                  = det(I_N + P) det(XX^H - xI_N) det( I_N + x P (I_N + P)^{-1} (XX^H - xI_N)^{-1} ).
- If x is an eigenvalue of B_N but not of XX^H, then for n large, x > (1 + √c)² (edge of the MP law support) and
  det( I_r + x Ω (I_r + Ω)^{-1} U^H (XX^H - xI_N)^{-1} U ) = 0,
  with P = UΩU^H, U ∈ C^{N×r}.
- Due to the unitary invariance of X,
  U^H (XX^H - xI_N)^{-1} U → ( ∫ (t - x)^{-1} dF_MP(t) ) I_r = m(x) I_r a.s.,
  with F_MP the MP law and m(x) the Stieltjes transform of the MP law (for r = 1 this is the trace lemma).
- Finally, the limiting solutions x_k satisfy x_k m(x_k) + (1 + ω_k)/ω_k = 0.
- Replacing m(x), this finally gives
  λ_k → x_k = 1 + ω_k + c(1 + ω_k)/ω_k a.s.,  if ω_k > √c.

32 Part 1: Fundamentals of Random Matrix Theory/1.3 Extreme eigenvalues: the spiked models 36/142 Comments on the result
- There exists a phase transition when the largest population eigenvalues move from inside to outside the interval (0, 1 + √c).
- More importantly, for a population spike t < 1 + √c, we still have the same Tracy-Widom fluctuations at the edge: no way to see the spike even when zooming in.
- In fact, simulations suggest that the convergence rate to the Tracy-Widom law is slower with spikes.

33 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 38/142 Stieltjes transform inversion for covariance matrix models
J. W. Silverstein, S. Choi, "Analysis of the limiting spectral distribution of large dimensional random matrices", Journal of Multivariate Analysis, vol. 54, no. 2, 1995.
We know for the model C_N^{1/2} X_N, X_N ∈ C^{N×n}, that if F^{C_N} ⇒ F^C, the Stieltjes transform of the e.s.d. of B̲_N = (1/n) X_N^H C_N X_N satisfies m_{B̲_N}(z) → m_F̲(z) a.s., with
  m_F̲(z) = -( z - c ∫ t/(1 + t m_F̲(z)) dF^C(t) )^{-1},
which is the unique such solution on the set {z ∈ C⁺, m_F̲(z) ∈ C⁺}. This can be inverted into
  z_F̲(m) = -1/m + c ∫ t/(1 + tm) dF^C(t),
for m ∈ C⁺.

34 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 39/142 Stieltjes transform inversion and spectrum characterization
Remember that we can evaluate the spectral density by taking a complex line close to R and evaluating Im[m_F(z)] along this line. Now we can do better.
- It is shown that lim_{z ∈ C⁺ → x ∈ R} m_F(z) ≜ m_0(x) exists.
- We also have, for x_0 inside the support, that the density f(x_0) of F at x_0 is (1/π) Im[m_0(x_0)], with m_0(x_0) the unique solution m ∈ C⁺ of
  [ z_F̲(m) = ]  x_0 = -1/m + c ∫ t/(1 + tm) dF^C(t).
- Let m ∈ R, m ≠ 0, and let x_F̲ be the restriction of z_F̲ to the real line. Then x_0 outside the support of F is equivalent to: x_F̲'(m_F̲(x_0)) > 0, m_F̲(x_0) ≠ 0, and -1/m_F̲(x_0) outside the support of F^C.
This provides another way to determine the support! For m ∈ (-∞, 0), evaluate x_F̲(m). Wherever x_F̲ is increasing, its image is outside the support. The rest is inside.
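A minimal sketch of this support determination for the three-mass example (not from the slides): scan the real-line inverse function x(m) = -1/m + c Σ_k w_k t_k/(1 + t_k m), locate the zeros of its derivative and read the support edges. Grid sizes and the root-finding by sign changes are illustrative choices.

```python
import numpy as np

c = 1 / 10
t = np.array([1.0, 3.0, 7.0])     # population eigenvalues (three evenly weighted masses)
w = np.ones(3) / 3

def x_of_m(m):
    return -1.0 / m + c * np.sum(w * t / (1 + t * m))

def xprime(m):
    # analytic derivative; its zeros are the local extrema of x(m), i.e. the support edges
    return 1.0 / m ** 2 - c * np.sum(w * t ** 2 / (1 + t * m) ** 2)

# scan m < 0 around and between the poles at m = -1/t_k
ms = -np.exp(np.linspace(np.log(1e-4), np.log(1e4), 200000))
vals = np.array([xprime(m) for m in ms])
sign_change = np.where(np.sign(vals[1:]) != np.sign(vals[:-1]))[0]
edges = sorted(x_of_m(ms[i]) for i in sign_change)
print("estimated support edges:", np.round(edges, 3))   # expect 3 clusters, i.e. 6 edges
```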

35 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 40/142 Another way to determine the spectrum: spectrum to analyze
Figure: Histogram of the eigenvalues of B_N = (1/n) C_N^{1/2} X_N X_N^H C_N^{1/2} against the limit law, N = 300, n = 3000, with C_N diagonal composed of three evenly weighted masses at 1, 3 and 7.

36 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 41/142 Another way to determine the spectrum: inverse function method
Figure: The function x_F̲(m) for B_N = (1/n) C_N^{1/2} X_N X_N^H C_N^{1/2}, N = 300, n = 3000, with C_N diagonal composed of three evenly weighted masses at 1, 3 and 7. The support of F is read on the vertical axis, on the intervals where x_F̲ is decreasing.

37 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 42/142 Cluster boundaries in sample covariance matrix models
X. Mestre, "Improved estimation of eigenvalues and eigenvectors of covariance matrices using their sample estimates", IEEE Transactions on Information Theory, vol. 54, no. 11, Nov. 2008.
Theorem. Let X_N ∈ C^{N×n} have i.i.d. entries of zero mean, unit variance, and let C_N be diagonal such that F^{C_N} ⇒ F^C as n, N → ∞, N/n → c, where F^C has K masses at t_1, ..., t_K with multiplicities n_1, ..., n_K respectively. Then the l.s.d. of B_N = (1/n) C_N^{1/2} X_N X_N^H C_N^{1/2} has support S given by
  S = [x_1^-, x_1^+] ∪ [x_2^-, x_2^+] ∪ ... ∪ [x_Q^-, x_Q^+],
with x_q^- = x_F̲(m_q^-), x_q^+ = x_F̲(m_q^+), and
  x_F̲(m) = -1/m + c Σ_{k=1}^K (n_k/N) t_k / (1 + t_k m),
where 2Q is the number of real-valued solutions, counting multiplicities, of x_F̲'(m) = 0, denoted in order m_1^- < m_1^+ ≤ m_2^- < m_2^+ ≤ ... ≤ m_Q^- < m_Q^+.

38 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 43/142 Comments on spectrum characterization
The previous results allow us to determine:
- the spectrum boundaries,
- the number Q of clusters,
- as a consequence, the total separation (Q = K) or not (Q < K) of the spectrum into K clusters.
Mestre goes further: to determine local separability of the spectrum,
- identify the K inflexion points, i.e., the K solutions m_1, ..., m_K of x_F̲''(m) = 0,
- check whether x_F̲'(m_i) > 0 and x_F̲'(m_{i+1}) > 0,
- if so, the cluster in between corresponds to a single population eigenvalue.

39 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 44/142 Exact eigenvalue separation
Z. D. Bai, J. W. Silverstein, "Exact Separation of Eigenvalues of Large Dimensional Sample Covariance Matrices", The Annals of Probability, vol. 27, no. 3, 1999.
Recall that the result on "no eigenvalue outside the support":
- says where eigenvalues are not to be found,
- does not say, as we might expect, that (in case of cluster separation) cluster k contains exactly n_k eigenvalues.
This is in fact the case.
Figure: Empirical eigenvalue distribution against the limit law; the three clusters contain exactly n_1, n_2 and n_3 eigenvalues, respectively.

40 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 45/142 Eigen-inference: introduction of the problem
Reminder: for a sequence x_1, ..., x_n ∈ C^N of independent random variables,
  Ĉ_N = (1/n) Σ_{k=1}^n x_k x_k^H
is an n-consistent estimator of C_N = E[x_1 x_1^H].
- If n, N have comparable sizes, this no longer holds.
- Typically, in the (n, N)-regime, estimators of the full C_N matrix perform very badly.
- If only the eigenvalues of C_N are of interest, things can be done.
- The process of retrieving information about the eigenvalues, eigenspace projections, or functionals of these is called eigen-inference.

41 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 46/142 Girko and the G-estimators
V. Girko, "Ten years of general statistical analysis".
Girko has come up with more than 50 (N, n)-consistent estimators, called G-estimators (Generalized estimators). Among those, we find the G_1-estimator of generalized variance. For
  G_1(Ĉ_N) = (1/α_n) [ log det(Ĉ_N) + N log n - Σ_{k=1}^N log(n - k) ],
with α_n any sequence such that α_n^{-2} log(n/(n - N)) → 0, we have
  G_1(Ĉ_N) - (1/α_n) log det(C_N) → 0
in probability. However, Girko's proofs are rarely readable, if existent.

42 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 47/142 A long standing problem
X. Mestre, "Improved estimation of eigenvalues and eigenvectors of covariance matrices using their sample estimates", IEEE Transactions on Information Theory, vol. 54, no. 11, 2008.
- Consider the model B_N = (1/n) C_N^{1/2} X_N X_N^H C_N^{1/2}, where F^{C_N} is formed of a finite number of masses t_1, ..., t_K.
- It had long been thought that the inverse problem of estimating t_1, ..., t_K through the Stieltjes transform method was not possible. The only attempts were iterative convex optimization methods.
- The problem was partially solved by Mestre in 2008! His technique uses elegant complex analysis tools.
- The description of this technique is the subject of this course.

43 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 48/142 Reminders
Consider the sample covariance matrix model B_N = (1/n) C_N^{1/2} X_N X_N^H C_N^{1/2}. Up to now, we saw:
- that there is no eigenvalue outside the support with probability 1, for all large N;
- that for all large N, when the spectrum is divided into clusters, the number of empirical eigenvalues in each cluster is exactly as we expect.
These results are of crucial importance for the following.

44 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 49/142 Eigen-inference for the sample covariance matrix model
X. Mestre, "Improved estimation of eigenvalues and eigenvectors of covariance matrices using their sample estimates", IEEE Transactions on Information Theory, vol. 54, no. 11, 2008.
Theorem. Consider the model B_N = (1/n) C_N^{1/2} X_N X_N^H C_N^{1/2}, with X_N ∈ C^{N×n} with i.i.d. entries of zero mean, unit variance, and C_N ∈ R^{N×N} diagonal with K distinct entries t_1, ..., t_K of multiplicities N_1, ..., N_K, all of the same order as n. Let k ∈ {1, ..., K}. Then, if the cluster associated with t_k is separated from the clusters associated with t_{k-1} and t_{k+1}, as N, n → ∞ with N/n → c,
  t̂_k = (n/N_k) Σ_{m ∈ N_k} (λ_m - µ_m)
is an (N, n)-consistent estimator of t_k, where N_k = {N - Σ_{i=k}^K N_i + 1, ..., N - Σ_{i=k+1}^K N_i}, λ_1 ≤ ... ≤ λ_N are the ordered eigenvalues of B_N, and µ_1 ≤ ... ≤ µ_N are the N solutions of m_{B̲_N}(µ) = 0, or equivalently the eigenvalues of diag(λ) - (1/N) √λ √λ^T, with λ = (λ_1, ..., λ_N)^T.
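A minimal sketch of Mestre's estimator for three equal-multiplicity masses (not from the slides; sizes, population values and cluster separation are illustrative assumptions).

```python
import numpy as np

N, n = 300, 3000
t_true = np.array([1.0, 3.0, 7.0])
mult = np.array([N // 3, N // 3, N // 3])

C_sqrt = np.sqrt(np.repeat(t_true, mult))
X = (np.random.randn(N, n) + 1j * np.random.randn(N, n)) / np.sqrt(2)
B = (C_sqrt[:, None] * X) @ (C_sqrt[:, None] * X).conj().T / n

lam = np.sort(np.linalg.eigvalsh(B))                    # eigenvalues of B_N (ascending)
sq = np.sqrt(lam)
mu = np.sort(np.linalg.eigvalsh(np.diag(lam) - np.outer(sq, sq) / N))

# sum lambda_m - mu_m over the indices of each cluster, scaled by n/N_k
edges = np.concatenate(([0], np.cumsum(mult)))
t_hat = [n / mult[k] * np.sum(lam[edges[k]:edges[k + 1]] - mu[edges[k]:edges[k + 1]])
         for k in range(len(t_true))]
print("estimated:", np.round(t_hat, 3), " true:", t_true)
```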

45 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 50/142 Remarks on Mestre's result
Assuming cluster separation, the result consists in:
- taking the empirical ordered λ_i's inside cluster k (note that exact separation ensures there are exactly N_k of these!),
- getting the ordered eigenvalues µ_1, ..., µ_N of diag(λ) - (1/N) √λ √λ^T, with λ = (λ_1, ..., λ_N)^T, and keeping only those of index inside N_k,
- taking the difference and scaling.

46 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 51/142 How to obtain this result?
The major trick requires tools from complex analysis.
Silverstein's Stieltjes transform identity: for the conjugate (companion) model B̲_N = (1/n) X_N^H C_N X_N,
  m̲_N(z) = -( z - c ∫ t/(1 + t m̲_N(z)) dF^{C_N}(t) )^{-1},
with m̲_N the deterministic equivalent of m_{B̲_N}. This is the only random matrix result we need.
Before going further, we need some reminders from complex analysis.

47 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 52/142 Limiting spectrum of the sample covariance matrix
J. W. Silverstein, Z. D. Bai, "On the empirical distribution of eigenvalues of a class of large dimensional random matrices", Journal of Multivariate Analysis, vol. 54, no. 2, 1995.
Reminder: if F^{C_N} ⇒ F^C, then m_{B̲_N}(z) → m_F̲(z) a.s., such that
  m_F̲(z) = -( z - c ∫ t/(1 + t m_F̲(z)) dF^C(t) )^{-1},
or equivalently
  m_{F^C}( -1/m_F̲(z) ) = -z m_F(z) m_F̲(z),
with m_F̲(z) = c m_F(z) + (c - 1)/z and N/n → c.

48 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 53/142 Reminders of complex analysis
Cauchy integration formula.
Theorem. Let U ⊂ C be an open set and f : U → C be holomorphic on U. Let γ ⊂ U be a continuous closed contour. Then, for a inside the region enclosed by γ,
  (1/(2πi)) ∮_γ f(z)/(z - a) dz = f(a),
while for a outside the region enclosed by γ,
  (1/(2πi)) ∮_γ f(z)/(z - a) dz = 0.

49 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 54/142 Complex integration
From the Cauchy integral formula, denoting C_k a (counterclockwise) contour enclosing only t_k,
  t_k = (1/(2πi)) ∮_{C_k} ω/(ω - t_k) dω = (N/(N_k 2πi)) ∮_{C_k} ω Σ_{j=1}^K (N_j/N)/(ω - t_j) dω = -(N/(N_k 2πi)) ∮_{C_k} ω m_{F^C}(ω) dω.
After the variable change ω = -1/m_F̲(z),
  t_k = -(N/(N_k 2πi)) ∮_{C_{F̲,k}} z m_F(z) m_F̲'(z)/m_F̲(z)² dz.
When the system dimensions are large,
  m_F(z) ≈ m_{B_N}(z) = (1/N) Σ_{k=1}^N 1/(λ_k - z),  with (λ_1, ..., λ_N) = eig(B_N) = eig(YY^H).
Dominated convergence arguments then show t_k - t̂_k → 0 a.s. with
  t̂_k = -(N/(N_k 2πi)) ∮_{C_{F̲,k}} z m_{B_N}(z) m_{B̲_N}'(z)/m_{B̲_N}(z)² dz.

50 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 55/142 Understanding the contour change
Figure: The function x_F̲(m), with the support of F on the vertical axis (same three-mass setting as before).
IF C_{F̲,k} encloses cluster k with real crossing points m_1 < m_2, THEN x_1 = -1/m_1 < t_k < x_2 = -1/m_2 and the image contour C_k encloses t_k.

51 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 56/142 Poles and residues
We find two sets of poles (besides zeros) for the integrand:
- λ_1, ..., λ_N, the eigenvalues of B_N,
- the solutions µ_1, ..., µ_N of m_{B̲_N}(z) = 0.
Remember that m_{B̲_N}(w) = (N/n) m_{B_N}(w) + ((N - n)/n)(1/w).
Residue calculus: denote f(w) = w m_{B_N}(w) m_{B̲_N}'(w)/m_{B̲_N}(w)².
- The λ_k's are poles of order 1 and lim_{z→λ_k} (z - λ_k) f(z) = -(n/N) λ_k.
- The µ_k's are also poles and, using m_{B_N}(z) = (n/N) m_{B̲_N}(z) + ((n - N)/N)(1/z) and L'Hospital-type arguments, their residue equals (n/N) µ_k.
So, finally,
  t̂_k = -(N/N_k) Σ (residues inside the contour) = (n/N_k) Σ_{m ∈ contour} (λ_m - µ_m).

52 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 57/142 Which poles in the contour?
We now need to determine which poles are in the contour of interest.
- Since the µ_i are rank-1 perturbations of the λ_i, they have the interlacing property λ_1 < µ_2 < λ_2 < ... < µ_N < λ_N. What about µ_1?
- The trick is to use the fact that (1/(2πi)) ∮_{C_k} dω/ω = 0 (0 lies outside C_k), which after the variable change reads (1/(2πi)) ∮_{Γ_k} m_F̲'(w)/m_F̲(w) dw = 0; the empirical version of this integral is
  #{i : λ_i ∈ Γ_k} - #{i : µ_i ∈ Γ_k}.
- Since this (integer) difference tends to 0, there are as many λ_i's as µ_i's in the contour; hence µ_1 asymptotically lies inside the integration contour.

53 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 58/142 Related bibliography
- C. A. Tracy, H. Widom, "On orthogonal and symplectic matrix ensembles", Communications in Mathematical Physics, vol. 177, no. 3, 1996.
- G. W. Anderson, A. Guionnet, O. Zeitouni, An Introduction to Random Matrices, Cambridge Studies in Advanced Mathematics, vol. 118, 2010.
- F. Bornemann, "On the numerical evaluation of distributions in random matrix theory: A review", Markov Processes and Related Fields, vol. 16, 2010.
- Y. Q. Yin, Z. D. Bai, P. R. Krishnaiah, "On the limit of the largest eigenvalue of the large dimensional sample covariance matrix", Probability Theory and Related Fields, vol. 78, no. 4, 1988.
- J. W. Silverstein, Z. D. Bai, Y. Q. Yin, "A note on the largest eigenvalue of a large dimensional sample covariance matrix", Journal of Multivariate Analysis, vol. 26, no. 2, 1988.
- Z. D. Bai, J. W. Silverstein, "No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices", The Annals of Probability, vol. 26, no. 1, 1998.
- Z. D. Bai, J. W. Silverstein, "Exact Separation of Eigenvalues of Large Dimensional Sample Covariance Matrices", The Annals of Probability, vol. 27, no. 3, 1999.
- D. Paul, J. W. Silverstein, "No eigenvalues outside the support of the limiting empirical spectral distribution of a separable covariance matrix", Journal of Multivariate Analysis, vol. 100, no. 1, 2009.
- J. Baik, J. W. Silverstein, "Eigenvalues of large sample covariance matrices of spiked population models", Journal of Multivariate Analysis, vol. 97, no. 6, 2006.
- I. M. Johnstone, "On the distribution of the largest eigenvalue in principal components analysis", Annals of Statistics, vol. 29, no. 2, 2001.
- K. Johansson, "Shape Fluctuations and Random Matrices", Communications in Mathematical Physics, vol. 209, 2000.
- J. Baik, G. Ben Arous, S. Péché, "Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices", The Annals of Probability, vol. 33, no. 5, 2005.

54 Part 1: Fundamentals of Random Matrix Theory/1.4 Spectrum Analysis and G-estimation 59/142 Related bibliography (2)
- J. W. Silverstein, S. Choi, "Analysis of the limiting spectral distribution of large dimensional random matrices", Journal of Multivariate Analysis, vol. 54, no. 2, 1995.
- W. Hachem, P. Loubaton, X. Mestre, J. Najim, P. Vallet, "A Subspace Estimator for Fixed Rank Perturbations of Large Random Matrices", arXiv preprint, 2011.
- R. Couillet, W. Hachem, "Local failure detection and diagnosis in large sensor networks", (submitted to) IEEE Transactions on Information Theory, arXiv preprint.
- F. Benaych-Georges, R. R. Nadakuditi, "The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices", Advances in Mathematics, vol. 227, no. 1, 2011.
- X. Mestre, "On the asymptotic behavior of the sample estimates of eigenvalues and eigenvectors of covariance matrices", IEEE Transactions on Signal Processing, vol. 56, no. 11, 2008.
- X. Mestre, "Improved estimation of eigenvalues and eigenvectors of covariance matrices using their sample estimates", IEEE Transactions on Information Theory, vol. 54, no. 11, 2008.
- R. Couillet, J. W. Silverstein, Z. Bai, M. Debbah, "Eigen-Inference for Energy Estimation of Multiple Sources", IEEE Transactions on Information Theory, vol. 57, no. 4, 2011.
- P. Vallet, P. Loubaton, X. Mestre, "Improved subspace estimation for multivariate observations of high dimension: the deterministic signals case", arXiv preprint, 2010.

55 Application to Signal Sensing and Array Processing/2.1 Eigenvalue-based detection 62/142 Problem formulation
We want to test the hypothesis H_0 against H_1:
  Y = h x^T + σW ∈ C^{N×n},  information plus noise,  hypothesis H_1,
  Y = σW,                     pure noise,              hypothesis H_0,
with h ∈ C^N, x ∈ C^n, W ∈ C^{N×n}. We assume no knowledge whatsoever, except that W has i.i.d. (not necessarily Gaussian) entries.

56 Application to Signal Sensing and Array Processing/2.1 Eigenvalue-based detection 63/142 Exploiting the conditioning number
L. S. Cardoso, M. Debbah, P. Bianchi, J. Najim, "Cooperative spectrum sensing using random matrix theory", International Symposium on Wireless Pervasive Computing, 2008.
Under either hypothesis:
- if H_0, for N large, we expect F^{(1/n)YY^H} to be close to the Marčenko-Pastur law, with support [σ²(1 - √c)², σ²(1 + √c)²];
- if H_1, if the population spike exceeds σ²(1 + √(N/n)), the largest eigenvalue lies further away, outside the bulk.
The conditioning number of (1/n)YY^H is therefore asymptotically, as N, n → ∞, N/n → c:
- if H_0, cond = λ_max/λ_min → (1 + √c)²/(1 - √c)²;
- if H_1, cond → [ t + c t σ²/(t - σ²) ] / [ σ²(1 - √c)² ] > (1 + √c)²/(1 - √c)², with t = Σ_{k=1}^N |h_k|² + σ².
Under H_0, the conditioning number is independent of σ. We then have the decision criterion, whether or not σ is known:
  decide H_0 if cond((1/n)YY^H) ≤ (1 + √(N/n))²/(1 - √(N/n))² + ε;  H_1 otherwise,
for some security margin ε.
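A minimal sketch of this test (not from the slides); the function name, the toy data and the margin eps are illustrative assumptions.

```python
import numpy as np

def cond_number_test(Y, eps=0.1):
    # condition-number detector: compare to the Marchenko-Pastur edge ratio plus a margin
    N, n = Y.shape
    c = N / n
    eigs = np.linalg.eigvalsh(Y @ Y.conj().T / n)
    threshold = ((1 + np.sqrt(c)) / (1 - np.sqrt(c))) ** 2 + eps
    return eigs[-1] / eigs[0] > threshold        # True -> decide H1, False -> decide H0

# toy experiment: pure noise vs. a rank-one signal buried in noise (sigma unknown to the test)
N, n, sigma = 20, 200, 1.0
W = (np.random.randn(N, n) + 1j * np.random.randn(N, n)) / np.sqrt(2)
h = np.random.randn(N) + 1j * np.random.randn(N)
x = (np.random.randn(n) + 1j * np.random.randn(n)) / np.sqrt(2)
print("H0 sample decides H1?", cond_number_test(sigma * W))
print("H1 sample decides H1?", cond_number_test(np.outer(h, x) + sigma * W))
```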

57 Application to Signal Sensing and Array Processing/2.1 Eigenvalue-based detection 64/142 Comments on the method
Advantages:
- much simpler than a finite-size analysis,
- the ratio is independent of σ, so σ need not be known.
Drawbacks:
- only valid for very large N (and the dimension N at which the asymptotic results kick in is itself a function of σ!),
- ad-hoc method, does not rely on a performance criterion.

58 Application to Signal Sensing and Array Processing/2.1 Eigenvalue-based detection 65/142 Generalized likelihood ratio test
P. Bianchi, M. Debbah, M. Maida, J. Najim, "Performance of Statistical Tests for Source Detection using Random Matrix Theory", IEEE Transactions on Information Theory, vol. 57, no. 4, 2011.
Alternative: the generalized likelihood ratio test (GLRT) decision criterion, i.e.,
  C(Y) = sup_{σ², h} P_{Y|h,σ²}(Y | h, σ²) / sup_{σ²} P_{Y|σ²}(Y | σ²).
Denote
  T_N = λ_max(YY^H) / ( (1/N) tr YY^H ).
The GLR is an increasing function of T_N; to guarantee a maximum false alarm rate of α, decide
  H_1 if  T_N^{-n} ( (N - T_N)/(N - 1) )^{-(N-1)n} > ξ_N,  H_0 otherwise,
for some threshold ξ_N that can be given explicitly as a function of α.
- Optimal test with respect to the GLR.
- Performs better than the conditioning number test.
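A minimal sketch of the GLRT statistic T_N (not from the slides). Instead of the explicit threshold ξ_N(α) of the slides, the threshold here is calibrated by Monte Carlo under H_0; all sizes are illustrative.

```python
import numpy as np

def T_stat(Y):
    # T_N = largest eigenvalue of YY^H over its normalized trace
    S = Y @ Y.conj().T
    return np.linalg.eigvalsh(S)[-1] / (np.trace(S).real / Y.shape[0])

N, n, alpha = 20, 200, 0.05
h0_stats = [T_stat((np.random.randn(N, n) + 1j * np.random.randn(N, n)) / np.sqrt(2))
            for _ in range(500)]
threshold = np.quantile(h0_stats, 1 - alpha)      # empirical (1-alpha) quantile under H0

# one H1 sample: rank-one signal plus unit-variance noise
h = np.random.randn(N) + 1j * np.random.randn(N)
x = (np.random.randn(n) + 1j * np.random.randn(n)) / np.sqrt(2)
W = (np.random.randn(N, n) + 1j * np.random.randn(N, n)) / np.sqrt(2)
print("T_N =", T_stat(np.outer(h, x) / np.sqrt(N) + W), " threshold =", threshold)
```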

59 Application to Signal Sensing and Array Processing/2.1 Eigenvalue-based detection 66/142 Performance comparison
Figure: ROC curves (correct detection rate vs. false alarm rate), for a priori unknown σ², of the Neyman-Pearson test, the conditioning number method and the GLRT; K = 1, N = 4, M = 8, SNR = 0 dB. For the Neyman-Pearson test, both the uniform and the Jeffreys prior (with exponent β = 1) are provided.

60 Application to Signal Sensing and Array Processing/2.1 Eigenvalue-based detection 67/142 Related bibliography
- R. Couillet, M. Debbah, "A Bayesian Framework for Collaborative Multi-Source Signal Sensing", IEEE Transactions on Signal Processing, vol. 58, no. 10, 2010.
- T. Ratnarajah, R. Vaillancourt, M. Alvo, "Eigenvalues and condition numbers of complex random matrices", SIAM Journal on Matrix Analysis and Applications, vol. 26, no. 2.
- M. Matthaiou, M. R. McKay, P. J. Smith, J. A. Nossek, "On the condition number distribution of complex Wishart matrices", IEEE Transactions on Communications, vol. 58, no. 6, 2010.
- C. Zhong, M. R. McKay, T. Ratnarajah, K. Wong, "Distribution of the Demmel condition number of Wishart matrices", IEEE Transactions on Communications, vol. 59, no. 5, 2011.
- L. S. Cardoso, M. Debbah, P. Bianchi, J. Najim, "Cooperative spectrum sensing using random matrix theory", International Symposium on Wireless Pervasive Computing, 2008.
- P. Bianchi, M. Debbah, M. Maida, J. Najim, "Performance of Statistical Tests for Source Detection using Random Matrix Theory", IEEE Transactions on Information Theory, vol. 57, no. 4, 2011.

61 Application to Signal Sensing and Array Processing/2.2 The spiked G-MUSIC algorithm 69/142 Source localization
A uniform array of M antennas receives signals from K radio sources during n signal snapshots.
Objective: estimate the arrival angles θ_1, ..., θ_K.

62 Application to Signal Sensing and Array Processing/2.2 The spiked G-MUSIC algorithm 70/142 Source localization using the MUSIC algorithm
We consider the scenario of K sources and an N-antenna array capturing n observations:
  x_t = Σ_{k=1}^K a(θ_k) s_{k,t} + σ w_t,  t = 1, ..., n,
  A_N = [a_N(θ_1), ..., a_N(θ_K)]  with  a_N(θ) = (1/√N) [1, e^{ıπ sin θ}, ..., e^{ı(N-1)π sin θ}]^T.
σ² is the noise variance and is set to 1 for simplicity.
Objective: infer θ_1, ..., θ_K from the n observations.
Let X_N = [x_1, ..., x_n]; then
  X = AS + W = [A  I_N] [S ; W].
If K is finite while n, N → ∞, the model corresponds to the spiked covariance model.
MUSIC algorithm: let Π be the orthogonal projection matrix onto the column span of A and Π^⊥ = I_N - Π (orthogonal projector onto the noise subspace). The angles θ_1, ..., θ_K are the unique ones verifying
  η(θ) ≜ a_N(θ)^H Π^⊥ a_N(θ) = 0.

63 Application to Signal Sensing and Array Processing/2.2 The spiked G-MUSIC algorithm 71/142 Traditional MUSIC algorithm
Traditional MUSIC: the angles are estimated as the local minima of
  η̂(θ) = a_N(θ)^H Π̂^⊥ a_N(θ),
where Π̂^⊥ = I_N - Π̂ and Π̂ is the orthogonal projection matrix onto the eigenspace associated with the K largest eigenvalues of (1/n) X_N X_N^H.
- It is well known that this estimator is consistent when n → ∞ with K, N fixed.
- We consider the case of K finite: the spiked covariance model.
- What happens when n, N → ∞?

64 Application to Signal Sensing and Array Processing/2.2 The spiked G-MUSIC algorithm 72/142 Asymptotic behaviour of the traditional MUSIC (1)
- We first need to understand the spectrum of (1/n) XX^H.
- We know that the weak limit of the spectrum is the MP law.
- Up to K eigenvalues can leave the support: we identify these eigenvalues here.
Denote P = AA^H = U_S Ω U_S^H, Ω = diag(ω_1, ..., ω_K), and Z = [S^T W^T]^T, to recover (up to one row) the generic spiked model X = (I_N + P)^{1/2} Z.
Reminder: if x is an eigenvalue of (1/n) XX^H with x > (1 + √c)² (edge of the MP law), then for all large n and some k,
  x ≈ λ_k → ρ_k ≜ 1 + ω_k + c(1 + ω_k)/ω_k a.s.,  if ω_k > √c.

65 Application to Signal Sensing and Array Processing/2.2 The spiked G-MUSIC algorithm 73/142 Asymptotic behaviour of the traditional MUSIC (2)
Recall the MUSIC approach: we want to estimate η(θ) = a(θ)^H U_W U_W^H a(θ) (with U_W ∈ C^{N×(N-K)} such that U_W^H U_S = 0).
Instead of this quantity, we start with the study of a(θ)^H û_i û_i^H a(θ), i = 1, ..., K, with û_1, ..., û_N the eigenvectors associated with λ_1 ≥ ... ≥ λ_N.
To fall back on known RMT quantities, we use the Cauchy integral
  a(θ)^H û_i û_i^H a(θ) = -(1/(2πı)) ∮_{C_i} a(θ)^H ( (1/n) XX^H - zI_N )^{-1} a(θ) dz,
with C_i a contour enclosing λ_i only. Woodbury's identity, (A + UCV)^{-1} = A^{-1} - A^{-1} U (C^{-1} + V A^{-1} U)^{-1} V A^{-1}, gives
  a^H û_i û_i^H a = -(1/(2πı)) ∮_{C_i} a^H (I_N + P)^{-1/2} ( (1/n) ZZ^H - zI_N )^{-1} (I_N + P)^{-1/2} a dz + (1/(2πı)) ∮_{C_i} â_1^H Ĥ^{-1} â_2 dz,
where P = U_S Ω U_S^H and
  Ĥ = I_K + z Ω (I_K + Ω)^{-1} U_S^H ( (1/n) ZZ^H - zI_N )^{-1} U_S,
  â_1^H = z a(θ)^H (I_N + P)^{-1/2} ( (1/n) ZZ^H - zI_N )^{-1} U_S,
  â_2 = Ω (I_K + Ω)^{-1} U_S^H ( (1/n) ZZ^H - zI_N )^{-1} (I_N + P)^{-1/2} a(θ).

66 Application to Signal Sensing and Array Processing/2.2 The spiked G-MUSIC algorithm 74/142 Asymptotic behaviour of the traditional MUSIC (3)
For large n, the first term has no pole inside C_i, while the second converges to
  T_i = (1/(2πı)) ∮_{C_i} ā_1^H H̄^{-1} ā_2 dz,  with
  H̄ = I_K + z m(z) Ω (I_K + Ω)^{-1},
  ā_1^H = z m(z) a^H (I_N + P)^{-1/2} U_S,
  ā_2 = m(z) Ω (I_K + Ω)^{-1} U_S^H (I_N + P)^{-1/2} a,
which after development is
  T_i = Σ_{l=1}^K (ω_l/(1 + ω_l)²) a(θ)^H u_l u_l^H a(θ) (1/(2πı)) ∮_{C_i} z m²(z) / ( 1 + z m(z) ω_l/(1 + ω_l) ) dz.
Using residue calculus, the sole pole inside C_i is at ρ_i, and we find
  a(θ)^H û_i û_i^H a(θ) → ( (1 - c ω_i^{-2}) / (1 + c ω_i^{-1}) ) a(θ)^H u_i u_i^H a(θ) a.s.
Therefore,
  η̂(θ) = a(θ)^H Π̂^⊥ a(θ) → a(θ)^H a(θ) - Σ_{i=1}^K ( (1 - c ω_i^{-2}) / (1 + c ω_i^{-1}) ) a(θ)^H u_i u_i^H a(θ) a.s.

67 Application to Signal Sensing and Array Processing/2.2 The spiked G-MUSIC algorithm 75/142 Improved G-MUSIC
Recall that
  a(θ)^H u_k u_k^H a(θ) - ( (1 + c ω_k^{-1}) / (1 - c ω_k^{-2}) ) a(θ)^H û_k û_k^H a(θ) → 0 a.s.
The ω_k are however unknown. But they can be estimated from λ_k → ρ_k = 1 + ω_k + c(1 + ω_k)/ω_k a.s.
This finally gives
  η̂_G(θ) ≜ a(θ)^H a(θ) - Σ_{k=1}^K ( (1 + c ω̂_k^{-1}) / (1 - c ω̂_k^{-2}) ) a(θ)^H û_k û_k^H a(θ),
with
  ω̂_k = ( λ̂_k - (c + 1) + √((c + 1 - λ̂_k)² - 4c) ) / 2.
We then obtain another (N, n)-consistent MUSIC estimator, valid only for K finite!
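A minimal sketch of the spiked G-MUSIC pseudo-spectrum (not from the slides); the array geometry, angles, source power and sizes are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

def steering(theta, N):
    # unit-norm ULA steering vector, half-wavelength spacing assumed
    return np.exp(1j * np.pi * np.arange(N) * np.sin(theta)) / np.sqrt(N)

N, n, K = 20, 150, 3
c = N / n
angles_true = np.deg2rad([10.0, 35.0, 37.0])
A = np.column_stack([steering(t, N) for t in angles_true])

S = np.sqrt(5) * (np.random.randn(K, n) + 1j * np.random.randn(K, n)) / np.sqrt(2)  # sources
W = (np.random.randn(N, n) + 1j * np.random.randn(N, n)) / np.sqrt(2)               # unit noise
X = A @ S + W

lam, U = np.linalg.eigh(X @ X.conj().T / n)          # ascending order
lam_K, U_K = lam[-K:], U[:, -K:]                      # K largest eigenvalues / eigenvectors

# spike estimates from the isolated eigenvalues (valid when lambda_k > (1+sqrt(c))^2)
omega = (lam_K - (c + 1) + np.sqrt(np.maximum((c + 1 - lam_K) ** 2 - 4 * c, 0))) / 2
weights = (1 + c / omega) / (1 - c / omega ** 2)

thetas = np.deg2rad(np.linspace(-90, 90, 1441))
eta_G = np.empty(len(thetas))
for i, th in enumerate(thetas):
    a = steering(th, N)
    proj = np.abs(U_K.conj().T @ a) ** 2
    eta_G[i] = np.real(a.conj() @ a) - np.sum(weights * proj)

plt.plot(np.rad2deg(thetas), eta_G)                   # minima near the true angles
plt.xlabel("angle [deg]"); plt.ylabel("G-MUSIC cost"); plt.show()
```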

68 Application to Signal Sensing and Array Processing/2.2 The spiked G-MUSIC algorithm 76/142 Simulation results
Figure: MUSIC against G-MUSIC (cost function in dB vs. angle) for DoA detection of K = 3 signal sources, N = 20 sensors, M = 50 samples, SNR of 10 dB. Angles of arrival of 10°, 35°, and 37°.

69 Application to Signal Sensing and Array Processing/2.2 The spiked G-MUSIC algorithm 77/142 Outline of the tutorial
Part 1: Basics of Random Matrix Theory for Sample Covariance Matrices
1.1. Introduction to the Stieltjes transform method, Marčenko-Pastur law, advanced models
1.2. Extreme eigenvalues: no eigenvalue outside the support, exact separation, Tracy-Widom law
1.3. Extreme eigenvalues: the spiked models
1.4. Spectrum analysis and G-estimation
Part 2: Application to Signal Sensing and Array Processing
2.1. Eigenvalue-based detection
2.2. The (spiked) G-MUSIC algorithm
Part 3: Advanced Random Matrix Models for Robust Estimation
3.1. Robust estimation of scatter
3.2. Robust G-MUSIC
3.3. Robust shrinkage in finance
3.4. Second order robust statistics: GLRT detectors
Part 4: Future Directions
4.1. Kernel random matrices and kernel methods
4.2. Neural network applications

70 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 80/142 Covariance estimation and sample covariance matrices
P. J. Huber, Robust Statistics, 1981.
Many statistical inference techniques rely on the sample covariance matrix (SCM) taken from i.i.d. observations x_1, ..., x_n of a r.v. x ∈ C^N. The main reasons are:
- Assuming E[x] = 0, E[xx^H] = C_N, with X = [x_1, ..., x_n], by the LLN, Ŝ_N ≜ (1/n) XX^H → C_N a.s. as n → ∞. Hence, if θ = f(C_N), we often use the n-consistent estimate θ̂ = f(Ŝ_N).
- The SCM Ŝ_N is the ML estimate of C_N for Gaussian x.
- One therefore expects θ̂ to closely approximate θ for all finite n.
This approach however has two limitations:
- if N, n are of the same order of magnitude, ||Ŝ_N - C_N|| does not converge to 0 as N, n → ∞ with N/n → c > 0, so that in general θ̂ - θ does not converge to 0 either. This motivated the introduction of G-estimators;
- if x is not Gaussian but has heavier tails, Ŝ_N is a poor estimator of C_N. This motivated the introduction of robust estimators.

71 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 81/142 Reminders on robust estimation
J. T. Kent, D. E. Tyler, "Redescending M-estimates of multivariate location and scatter", 1991.
R. A. Maronna, "Robust M-estimators of multivariate location and scatter", 1976.
Y. Chitour, F. Pascal, "Exact maximum likelihood estimates for SIRV covariance matrix: Existence and algorithm analysis", 2008.
The objectives of robust estimators:
- Replace the SCM Ŝ_N by another estimate Ĉ_N of C_N which rejects (or downscales) observations deterministically, or rejects observations inconsistent with the full set of observations.
  Example: Huber's estimator, Ĉ_N defined as the solution of
    Ĉ_N = (1/n) Σ_{i=1}^n β_i x_i x_i^H,  with  β_i = α min{ 1, k² / ((1/N) x_i^H Ĉ_N^{-1} x_i) },
  for some α > 1 and k² a function of Ĉ_N.
- Provide scale-free estimators of C_N.
  Example: Tyler's estimator: if one observes x_i = τ_i z_i for unknown scalars τ_i,
    Ĉ_N = (1/n) Σ_{i=1}^n x_i x_i^H / ( (1/N) x_i^H Ĉ_N^{-1} x_i ),
  - existence and uniqueness of Ĉ_N, defined up to a constant,
  - few constraints on x_1, ..., x_n (N + 1 of them must be linearly independent).

72 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 82/142 Reminders on robust estimation (continued)
The objectives of robust estimators:
- Replace the SCM Ŝ_N by the ML estimate of C_N.
  Example: Maronna's estimator for elliptical x,
    Ĉ_N = (1/n) Σ_{i=1}^n u( (1/N) x_i^H Ĉ_N^{-1} x_i ) x_i x_i^H,
  with u(s) such that
  (i) u(s) is continuous and non-increasing on [0, ∞),
  (ii) φ(s) = s u(s) is non-decreasing and bounded by φ_∞ > 1; moreover, φ(s) is increasing wherever φ(s) < φ_∞.
  (Note that Huber's estimator is compliant with Maronna's conditions.)
- Existence is not too demanding.
- Uniqueness imposes a strictly increasing φ(s) (inconsistent with Huber's estimate).
- Consistency result: Ĉ_N → C_N if u(s) corresponds to the ML estimator for C_N.

73 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 83/142 Robust estimation and RMT
So far, RMT has mostly focused on the SCM Ŝ_N for models such as:
- x = A_N w, with w having i.i.d. zero-mean unit-variance entries,
- x satisfying concentration inequalities,
- e.g., elliptically distributed x.
Robust RMT estimation: can we study the performance of estimators based on Ĉ_N?
- What are the spectral properties of Ĉ_N?
- Can we derive RMT-based estimators relying on Ĉ_N?

74 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 84/142 Setting and assumptions
Assumptions: take x_1, ..., x_n ∈ C^N elliptical-like random vectors, i.e., x_i = √τ_i C_N^{1/2} w_i, where
- τ_1, ..., τ_n ∈ R⁺ are random or deterministic with (1/n) Σ_{i=1}^n τ_i → 1 a.s.,
- w_1, ..., w_n ∈ C^N are random, independent, with w_i/√N uniformly distributed over the unit sphere,
- C_N ∈ C^{N×N} is deterministic, with C_N ≻ 0 and lim sup_N ||C_N|| < ∞.
We denote c_N ≜ N/n and consider the growth regime c_N → c ∈ (0, 1).
Maronna's estimator of scatter: the (almost surely) unique solution to
  Ĉ_N = (1/n) Σ_{i=1}^n u( (1/N) x_i^H Ĉ_N^{-1} x_i ) x_i x_i^H,
where u satisfies
(i) u : [0, ∞) → (0, ∞) nonnegative, continuous and non-increasing,
(ii) φ : x ↦ x u(x) increasing and bounded with lim_{x→∞} φ(x) ≜ φ_∞ > 1,
(iii) φ_∞ < c⁻¹.
Additional technical assumption: let ν_n ≜ (1/n) Σ_{i=1}^n δ_{τ_i}. For each a > b > 0,
  lim sup_{t→∞} lim sup_n  ν_n((t, ∞)) / (φ(at) - φ(bt)) = 0  a.s.
This controls the relative speed of the tail of ν_n versus the flattening speed of φ(x) as x → ∞. Examples:
- τ_i < M for each i: in this case ν_n((t, ∞)) = 0 a.s. for t > M;
- for u(t) = (1 + α)/(α + t), α > 0, and τ_i i.i.d., it is sufficient to have E[τ_1^{1+ε}] < ∞.
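A minimal sketch of Maronna's estimator computed by fixed-point iteration with u(t) = (1+α)/(α+t) on elliptical-like data (not from the slides; the population covariance, the τ distribution and the stopping rule are illustrative assumptions).

```python
import numpy as np

N, n, alpha = 100, 500, 0.2
C = np.diag(np.concatenate([np.ones(N // 2), 3 * np.ones(N - N // 2)]))
tau = np.random.gamma(shape=0.5, scale=2.0, size=n)                # mean-1, heavy-ish tails
W = np.random.randn(N, n) + 1j * np.random.randn(N, n)
W = np.sqrt(N) * W / np.linalg.norm(W, axis=0)                      # w_i/sqrt(N) on the unit sphere
X = np.sqrt(tau) * (np.linalg.cholesky(C) @ W)                      # x_i = sqrt(tau_i) C^{1/2} w_i

def u(t):
    return (1 + alpha) / (alpha + t)

C_hat = np.eye(N, dtype=complex)
for _ in range(50):                                                 # fixed-point iteration
    Q = np.linalg.inv(C_hat)
    d = np.real(np.einsum("ij,jk,ki->i", X.conj().T, Q, X)) / N     # (1/N) x_i^H C_hat^{-1} x_i
    C_new = (u(d) * X) @ X.conj().T / n
    if np.linalg.norm(C_new - C_hat) / np.linalg.norm(C_hat) < 1e-8:
        C_hat = C_new
        break
    C_hat = C_new

print("largest eigenvalues of C_hat:", np.round(np.linalg.eigvalsh(C_hat)[-3:].real, 2))
```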

75 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 85/142 Heuristic approach
Major issues with Ĉ_N:
- it is defined implicitly,
- it is a sum of non-independent rank-one matrices built from the vectors u((1/N) x_i^H Ĉ_N^{-1} x_i) x_i (Ĉ_N depends on all the x_j's).
But there is some hope. First remark: we can work with C_N = I_N without loss of generality! Denote
  Ĉ_(j) = (1/n) Σ_{i≠j} u( (1/N) x_i^H Ĉ_N^{-1} x_i ) x_i x_i^H.
Then, intuitively, Ĉ_(j) and x_j are only weakly dependent. We expect in particular (highly non-rigorous but intuitive!!)
  (1/N) x_i^H Ĉ_(i)^{-1} x_i ≈ τ_i (1/N) tr Ĉ_(i)^{-1} ≈ τ_i (1/N) tr Ĉ_N^{-1}.
Our heuristic approach:
- rewrite (1/N) x_i^H Ĉ_N^{-1} x_i as f( (1/N) x_i^H Ĉ_(i)^{-1} x_i ) for some function f (later called g^{-1}),
- deduce that Ĉ_N = (1/n) Σ_{i=1}^n (u ∘ f)( (1/N) x_i^H Ĉ_(i)^{-1} x_i ) x_i x_i^H,
- use (1/N) x_i^H Ĉ_(i)^{-1} x_i ≈ τ_i (1/N) tr Ĉ_N^{-1} to get Ĉ_N ≈ (1/n) Σ_{i=1}^n (u ∘ f)( τ_i (1/N) tr Ĉ_N^{-1} ) x_i x_i^H,
- use random matrix results to find a limiting value γ for (1/N) tr Ĉ_N^{-1}, and conclude Ĉ_N ≈ (1/n) Σ_{i=1}^n (u ∘ f)(τ_i γ) x_i x_i^H.

76 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 86/142 Heuristic approach in detail: f and γ
Determination of f: recall the identity (A + tvv^H)^{-1} v = A^{-1} v / (1 + t v^H A^{-1} v). Then
  (1/N) x_i^H Ĉ_N^{-1} x_i = (1/N) x_i^H Ĉ_(i)^{-1} x_i / ( 1 + c_N u((1/N) x_i^H Ĉ_N^{-1} x_i) (1/N) x_i^H Ĉ_(i)^{-1} x_i ),
so that
  (1/N) x_i^H Ĉ_(i)^{-1} x_i = (1/N) x_i^H Ĉ_N^{-1} x_i / ( 1 - c_N φ((1/N) x_i^H Ĉ_N^{-1} x_i) ).
Now the function g : x ↦ x/(1 - c_N φ(x)) is monotonically increasing (we use the assumption φ_∞ < c⁻¹!), hence, with f = g^{-1},
  (1/N) x_i^H Ĉ_N^{-1} x_i = g^{-1}( (1/N) x_i^H Ĉ_(i)^{-1} x_i ).

77 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 87/142 Heuristic approach in detail: f and γ (continued)
Determination of γ: from the previous calculus, we expect
  Ĉ_N ≈ (1/n) Σ_{i=1}^n (u ∘ g^{-1})( τ_i (1/N) tr Ĉ_N^{-1} ) x_i x_i^H ≈ (1/n) Σ_{i=1}^n (u ∘ g^{-1})(τ_i γ) x_i x_i^H.
Hence
  γ ≈ (1/N) tr Ĉ_N^{-1} ≈ (1/N) tr [ (1/n) Σ_{i=1}^n (u ∘ g^{-1})(τ_i γ) τ_i w_i w_i^H ]^{-1}.
Since the τ_i are independent of the w_i and γ is deterministic, this is a Bai-Silverstein model (1/n) WDW^H, with W = [w_1, ..., w_n] and D = diag(d_ii), d_ii = τ_i (u ∘ g^{-1})(τ_i γ). And we have
  (1/N) tr [ (1/n) WDW^H ]^{-1} = m_{(1/n)WDW^H}(0) ≈ ( ∫ t (u ∘ g^{-1})(tγ) / ( 1 + c t (u ∘ g^{-1})(tγ) m_{(1/n)WDW^H}(0) ) ν_n(dt) )^{-1}
                                = ( (1/n) Σ_{i=1}^n τ_i (u ∘ g^{-1})(τ_i γ) / ( 1 + c τ_i (u ∘ g^{-1})(τ_i γ) m_{(1/n)WDW^H}(0) ) )^{-1}.
Since γ ≈ m_{(1/n)WDW^H}(0), this defines γ as the solution of a fixed-point equation:
  γ = ( (1/n) Σ_{i=1}^n τ_i (u ∘ g^{-1})(τ_i γ) / ( 1 + c τ_i (u ∘ g^{-1})(τ_i γ) γ ) )^{-1}.

78 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 88/142 Main result
R. Couillet, F. Pascal, J. W. Silverstein, "The Random Matrix Regime of Maronna's M-estimator with elliptically distributed samples", (submitted to) Elsevier Journal of Multivariate Analysis.
Theorem (Asymptotic Equivalence). Under the assumptions defined earlier,
  ||Ĉ_N - Ŝ_N|| → 0 a.s.,  where  Ŝ_N ≜ (1/n) Σ_{i=1}^n v(τ_i γ) x_i x_i^H,
v(x) = (u ∘ g^{-1})(x), ψ(x) = x v(x), g(x) = x/(1 - cφ(x)), and γ > 0 is the unique solution of
  1 = (1/n) Σ_{i=1}^n ψ(τ_i γ) / (1 + c ψ(τ_i γ)).
Remarks:
- The theorem says: first order substitution of Ĉ_N by Ŝ_N is allowed for large N, n.
- It turns out that v resembles u and ψ resembles φ in their general behavior.
Corollaries:
  max_{1≤i≤N} | λ_i(Ŝ_N) - λ_i(Ĉ_N) | → 0 a.s.,
  (1/N) tr (Ĉ_N - zI_N)^{-1} - (1/N) tr (Ŝ_N - zI_N)^{-1} → 0 a.s.
Important features for detection and estimation.
Proof: so far in the tutorial, we do not have a rigorous proof!

79 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 89/142 Proof
Fundamental idea: show that all the quantities (1/(τ_i N)) x_i^H Ĉ_(i)^{-1} x_i converge to the same limit γ.
Technical trick: denote
  e_i ≜ v( (1/N) x_i^H Ĉ_(i)^{-1} x_i ) / v(τ_i γ)
and relabel the terms such that e_1 ≤ ... ≤ e_n. We shall prove that, for each l > 0,
  e_1 > 1 - l  and  e_n < 1 + l  for all large n a.s.
Some basic inequalities: denoting d_i ≜ (1/(τ_i N)) x_i^H Ĉ_(i)^{-1} x_i = (1/N) w_i^H Ĉ_(i)^{-1} w_i, we have
  e_j = v( τ_j (1/N) w_j^H [ (1/n) Σ_{i≠j} τ_i v(τ_i d_i) w_i w_i^H ]^{-1} w_j ) / v(τ_j γ)
      = v( τ_j (1/N) w_j^H [ (1/n) Σ_{i≠j} τ_i v(τ_i γ) e_i w_i w_i^H ]^{-1} w_j ) / v(τ_j γ)
      ≤ v( (τ_j/e_n) (1/N) w_j^H [ (1/n) Σ_{i≠j} τ_i v(τ_i γ) w_i w_i^H ]^{-1} w_j ) / v(τ_j γ).

80 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 90/142 Proof (continued)
Specialization to e_n:
  e_n ≤ v( (τ_n/e_n) (1/N) w_n^H [ (1/n) Σ_{i≠n} τ_i v(τ_i γ) w_i w_i^H ]^{-1} w_n ) / v(τ_n γ),
or equivalently, recalling ψ(x) = x v(x),
  (1/N) w_n^H [ (1/n) Σ_{i≠n} τ_i v(τ_i γ) w_i w_i^H ]^{-1} w_n / γ  ≤  ψ( (τ_n/e_n) (1/N) w_n^H [ (1/n) Σ_{i≠n} τ_i v(τ_i γ) w_i w_i^H ]^{-1} w_n ) / ψ(τ_n γ).
Random matrix results: by the trace lemma, we should have
  (1/N) w_n^H [ (1/n) Σ_{i≠n} τ_i v(τ_i γ) w_i w_i^H ]^{-1} w_n ≈ (1/N) tr [ (1/n) Σ_{i≠n} τ_i v(τ_i γ) w_i w_i^H ]^{-1} ≈ γ
(by the definition of γ as in the previous slides)...
DANGER: after relabeling, w_n is no longer independent of w_1, ..., w_{n-1}! The trace lemma is broken!
Solution: a uniform convergence result. By (higher order) moment bounds, Markov's inequality, and Borel-Cantelli, for all large n a.s.,
  max_{j ≤ n} | (1/N) w_j^H [ (1/n) Σ_{i≠j} τ_i v(τ_i γ) w_i w_i^H ]^{-1} w_j - γ | < ε.

81 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 91/142 Proof (continued)
Back to the original problem: for all large n a.s., we then have (using the growth of ψ)
  (γ - ε)/γ ≤ ψ( (τ_n/e_n)(γ + ε) ) / ψ(τ_n γ).
Proof by contradiction: assume e_n > 1 + l i.o.; then on a subsequence e_n > 1 + l always and
  (γ - ε)/γ ≤ ψ( (τ_n/(1 + l))(γ + ε) ) / ψ(τ_n γ).
Bounded support for the τ_i: if 0 < τ_- < τ_i < τ_+ < ∞ for all i, n, then on a subsequence where τ_n → τ_0,
  (γ - ε)/γ ≤ ψ( (τ_0/(1 + l))(γ + ε) ) / ψ(τ_0 γ),
where the left-hand side → 1 as ε → 0 while the right-hand side → ψ( τ_0 γ/(1 + l) ) / ψ(τ_0 γ) < 1 as ε → 0. CONTRADICTION!
Unbounded support for the τ_i: importance of the relative growth of τ_n versus the convergence of ψ to ψ_∞. The proof consists in dividing {τ_i} into two groups: a few large ones versus all the others.
Sufficient condition:
  lim sup_{t→∞} lim sup_n  ν_n((t, ∞)) / (φ(at) - φ(bt)) = 0.

82 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 92/142 Simulations
Figure: Histogram of the eigenvalues of (1/n) Σ_{i=1}^n x_i x_i^H (the SCM) against the limiting density, for n = 2500, N = 500, C_N = diag(I_125, 3I_125, 10I_250), τ_i with Γ(0.5, 2) distribution.

83 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 93/142 Simulations
Figure: Histogram of the eigenvalues of Ĉ_N (left) and Ŝ_N (right) against the limiting density, for n = 2500, N = 500, C_N = diag(I_125, 3I_125, 10I_250), τ_i with Γ(0.5, 2) distribution.
Remark/Corollary: the spectrum of Ĉ_N is a.s. bounded, uniformly in n.

84 Advanced Random Matrix Models for Robust Estimation/3.1 Robust Estimation of Scatter 94/142 Hint on potential applications
Spectrum boundedness: in impulsive noise scenarios,
- the SCM spectrum grows unbounded,
- the robust scatter estimator spectrum remains bounded,
- robust estimators improve spectrum separability (important for many of the statistical inference techniques seen previously).
Spiked model generalization: we may expect a generalization to spiked models,
- spikes are swallowed by the bulk in the SCM context,
- we expect spikes to re-emerge in the robust scatter context,
- we shall see that we get even better than this...
Application scenarios:
- radar detection in impulsive noise (non-Gaussian noise, possibly clutter),
- financial data analytics,
- any application where Gaussianity is too strong an assumption...

85 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 96/142 System setting
Signal model:
  y_i = Σ_{l=1}^L √p_l a_l s_{li} + √τ_i w_i = Ā_i w̄_i,
  Ā_i ≜ [ √p_1 a_1, ..., √p_L a_L, √τ_i I_N ],  w̄_i ≜ [s_{1i}, ..., s_{Li}, w_i^T]^T,
with y_1, ..., y_n ∈ C^N satisfying:
1. τ_1, ..., τ_n > 0 random, such that ν_n ≜ (1/n) Σ_{i=1}^n δ_{τ_i} → ν weakly and ∫ t ν(dt) = 1;
2. w_1, ..., w_n ∈ C^N random, independent, unitarily invariant, of norm √N;
3. L ∈ N and p_1 ≥ ... ≥ p_L ≥ 0 deterministic;
4. a_1, ..., a_L ∈ C^N deterministic or random with A^H A → diag(p_1, ..., p_L) a.s. as N → ∞, where A ≜ [√p_1 a_1, ..., √p_L a_L] ∈ C^{N×L};
5. s_{11}, ..., s_{Ln} ∈ C independent with zero mean and unit variance.
Relation to the previous model:
- if L = 0, y_i = √τ_i w_i,
- elliptical model with covariance a low-rank (L) perturbation of I_N,
- we expect a spiked version of the previous results.
Application contexts:
- wireless communications: signals s_{li} from L transmitters, N-antenna receiver; a_l random i.i.d. channels (a_l^H a_{l'} → δ_{l-l'}, e.g., a_l ~ CN(0, I_N/N));
- array processing: L sources emit signals s_{li} at steering vectors a_l = a(θ_l). For a ULA, [a(θ)]_j = N^{-1/2} exp(2πı d j sin θ).

86 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 97/142 Some intuition
Signal detection/estimation in impulsive environments: two scenarios: heavy-tailed noise (elliptical, Gaussian mixtures, etc.); Gaussian noise with spurious impulses.
Problems expected with the SCM: respectively, an unbounded limiting spectrum with no source separation (invalidates G-MUSIC), and isolated eigenvalues due to spikes in the time direction (false alarms induced by noise impulses!).
Our results: in a spiked model with noise impulses, whatever the impulse type, the spectrum of Ĉ_N remains bounded; isolated largest eigenvalues may appear, of two classes: isolated eigenvalues due to noise impulses CANNOT exceed a threshold, while all isolated eigenvalues beyond this threshold are due to signal.
Detection criterion: everything above the threshold is signal.

87 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 98/142 Theoretical results
Theorem (Extension to the spiked robust model). Under the same assumptions as in the previous section, ||Ĉ_N - Ŝ_N|| → 0 a.s., where
Ŝ_N ≜ (1/n) Σ_{i=1}^n v(τ_i γ) A_i w̃_i w̃_i^* A_i^*
with γ the unique solution to 1 = ∫ ψ(tγ) / (1 + c ψ(tγ)) ν(dt), and we recall A_i ≜ [√p_1 a_1, ..., √p_L a_L, √τ_i I_N], w̃_i = [s_{1,i}, ..., s_{L,i}, w_i^T]^T.
Remark: for L = 0, A_i = √τ_i I_N and A_i w̃_i becomes √τ_i w_i: we recover the previous result.

88 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 99/142 Localization of eigenvalues
Theorem (Eigenvalue localization). Denote u_k the eigenvector of the k-th largest eigenvalue of AA^* = Σ_{i=1}^L p_i a_i a_i^*, and û_k the eigenvector of the k-th largest eigenvalue of Ĉ_N. Also define δ(x) as the unique positive solution to
δ(x) = c ( x - ∫ t v_c(tγ) / (1 + δ(x) t v_c(tγ)) ν(dt) )^{-1}.
Further denote
p̄ ≜ lim_{x↓S_+} ( c ∫ δ(x) v_c(tγ) / (1 + δ(x) t v_c(tγ)) ν(dt) )^{-1},  S_+ ≜ φ_∞ (1 + √c)^2 / ( γ (1 - c φ_∞) ).
Then, if p_j > p̄, λ̂_j → Λ_j > S_+ a.s.; otherwise lim sup_n λ̂_j <= S_+ a.s., with Λ_j the unique positive solution to
c ∫ v_c(τγ) / ( δ(Λ_j) (1 + δ(Λ_j) τ v_c(τγ)) ) ν(dτ) = p_j.

89 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 100/142 Simulation
Figure: Histogram of the eigenvalues of (1/n) Σ_i y_i y_i^* against the limiting spectral measure, L = 2, p_1 = p_2 = 1, N = 200, n = 1000, Student-t impulses.

90 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 101/142 Simulation
Figure: Histogram of the eigenvalues of Ĉ_N against the limiting spectral measure (right edge of the support, S_+ and the Λ_j marked), for u(x) = (1 + α)/(α + x) with α = 0.2, L = 2, p_1 = p_2 = 1, N = 200, n = 1000, Student-t impulses.

91 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 102/142 Comments
SCM vs robust: spikes invisible in the SCM in impulsive noise are reborn in the robust estimate of scatter.
Largest eigenvalues: λ_i(Ĉ_N) > S_+ implies the presence of a source; λ_i(Ĉ_N) ∈ (sup(support), S_+) may be due to a source or to a noise impulse; for λ_i(Ĉ_N) < sup(support), as usual, nothing can be said.
This induces a natural source detection algorithm.

92 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 103/142 Eigenvalue and eigenvector projection estimates
Two scenarios: known ν = lim_n (1/n) Σ_{i=1}^n δ_{τ_i}, or unknown ν.
Theorem (Estimation under known ν).
1. Power estimation. For each p_j > p̄, c ∫ v_c(τγ) / ( δ(λ̂_j) (1 + δ(λ̂_j) τ v_c(τγ)) ) ν(dτ) → p_j a.s.
2. Bilinear form estimation. For each a, b ∈ C^N with ||a|| = ||b|| = 1, and p_j > p̄,
Σ_{k, p_k = p_j} a^* u_k u_k^* b - Σ_{k, p_k = p_j} w_k a^* û_k û_k^* b → 0 a.s.,
where
w_k = [ ∫ v_c(tγ) / (1 + δ(λ̂_k) t v_c(tγ))^2 ν(dt) ] / [ ∫ v_c(tγ) / (1 + δ(λ̂_k) t v_c(tγ)) ν(dt) - c δ(λ̂_k)^2 ∫ t^2 v_c(tγ)^2 / (1 + δ(λ̂_k) t v_c(tγ))^2 ν(dt) ].

93 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 104/142 Eigenvalue and eigenvector projection estimates
Theorem (Estimation under unknown ν).
1. Purely empirical power estimation. For each p_j > p̄, (1/N) Σ_{i=1}^n v(τ̂_i γ̂_n) / ( δ̂(λ̂_j) (1 + δ̂(λ̂_j) τ̂_i v(τ̂_i γ̂_n)) ) → p_j a.s.
2. Purely empirical bilinear form estimation. For each a, b ∈ C^N with ||a|| = ||b|| = 1, and each p_j > p̄,
Σ_{k, p_k = p_j} a^* u_k u_k^* b - Σ_{k, p_k = p_j} ŵ_k a^* û_k û_k^* b → 0 a.s.,
where
ŵ_k = [ (1/n) Σ_{i=1}^n v(τ̂_i γ̂) / (1 + δ̂(λ̂_k) τ̂_i v(τ̂_i γ̂))^2 ] / [ (1/n) Σ_{i=1}^n v(τ̂_i γ̂) / (1 + δ̂(λ̂_k) τ̂_i v(τ̂_i γ̂)) - δ̂(λ̂_k)^2 (1/N) Σ_{i=1}^n τ̂_i^2 v(τ̂_i γ̂)^2 / (1 + δ̂(λ̂_k) τ̂_i v(τ̂_i γ̂))^2 ],
γ̂ ≜ (1/n) Σ_{i=1}^n (1/N) y_i^* Ĉ_{(i)}^{-1} y_i,  τ̂_i ≜ (1/(γ̂ N)) y_i^* Ĉ_{(i)}^{-1} y_i,  and δ̂(x) defined as δ(x) but with (τ_i, γ) replaced by (τ̂_i, γ̂).

94 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 105/142 Application to G-MUSIC
Corollary (Robust G-MUSIC). Assume the model a_i = a(θ_i) with a(θ) = N^{-1/2} [exp(2πi d j sin(θ))]_{j=0}^{N-1}. Define η̂_RG(θ) and η̂_RG^emp(θ) as
η̂_RG(θ) = 1 - Σ_{k=1}^{|{j, p_j > p̄}|} w_k a(θ)^* û_k û_k^* a(θ),  η̂_RG^emp(θ) = 1 - Σ_{k=1}^{|{j, p_j > p̄}|} ŵ_k a(θ)^* û_k û_k^* a(θ).
Then, for each p_j > p̄, θ̂_j → θ_j a.s. and θ̂_j^emp → θ_j a.s., where
θ̂_j ≜ argmin_{θ ∈ R_j^κ} { η̂_RG(θ) },  θ̂_j^emp ≜ argmin_{θ ∈ R_j^κ} { η̂_RG^emp(θ) }.
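A sketch of how the localization function above can be evaluated numerically, assuming the isolated-eigenvalue eigenvectors û_k of Ĉ_N and the weights ŵ_k of the previous theorem are already available (both are taken as given inputs here), and assuming a half-wavelength ULA:

```python
import numpy as np

def robust_gmusic_spectrum(theta_grid_deg, U_hat, w_hat, d=0.5):
    """Evaluate eta(theta) = 1 - sum_k w_k |a(theta)^* u_hat_k|^2 on a grid of angles.
    U_hat: N x K matrix of spike eigenvectors of the robust scatter estimate;
    w_hat: length-K weights from the (empirical) bilinear-form theorem.
    Source angles are then estimated as the deepest minima of eta."""
    N, K = U_hat.shape
    j = np.arange(N)
    eta = np.empty(len(theta_grid_deg))
    for idx, t in enumerate(theta_grid_deg):
        a = np.exp(2j * np.pi * d * j * np.sin(np.deg2rad(t))) / np.sqrt(N)
        proj = np.abs(a.conj() @ U_hat) ** 2              # |a(theta)^* u_hat_k|^2, k = 1..K
        eta[idx] = 1.0 - float(np.real(w_hat @ proj))
    return eta

# theta_grid = np.linspace(-90, 90, 1801)
# eta = robust_gmusic_spectrum(theta_grid, U_hat, w_hat)
# doa_estimates = theta_grid[np.argsort(eta)[:2]]         # e.g. the two deepest values
```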

95 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 106/142 Simulations: single shot in elliptical noise
Figure: Random realization of the localization functions η̂_X(θ) for the various MUSIC estimators (robust G-MUSIC, empirical robust G-MUSIC, G-MUSIC, empirical G-MUSIC, robust MUSIC, MUSIC), with N = 20, n = 100, two sources at 10° and 12°, Student-t impulses with parameter β = 100, u(x) = (1 + α)/(α + x) with α = 0.2, powers p_1 = p_2 = -5 dB.

96 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 107/142 Simulations: elliptical noise
Figure: Mean square error E[|θ̂_1 - θ_1|^2] for the estimation of θ_1 = 10°, as a function of the powers p_1 = p_2 [dB], with N = 20, n = 100, two sources at 10° and 12°, Student-t impulses with parameter β = 10, u(x) = (1 + α)/(α + x) with α = 0.2.

97 Advanced Random Matrix Models for Robust Estimation/3.2 Spiked model extension and robust G-MUSIC 108/142 Simulations: spurious impulses
Figure: Mean square error E[|θ̂_1 - θ_1|^2] for the estimation of θ_1 = 10°, as a function of the powers p_1 = p_2 [dB], with N = 20, n = 100, two sources at 10° and 12°, sample outlier scenario τ_i = 1 for i < n, τ_n = 100, u(x) = (1 + α)/(α + x) with α = 0.2.

98 Advanced Random Matrix Models for Robust Estimation/3.3 Robust shrinkage and application to mathematical finance 110/142 Context
Ledoit and Wolf, 2004. A well-conditioned estimator for large-dimensional covariance matrices.
Pascal, Chitour, Quek, 2013. Generalized robust shrinkage estimator and application to STAP data.
Chen, Wiesel, Hero, 2011. Robust shrinkage estimation of high-dimensional covariance matrices.
Shrinkage covariance estimation: for N > n or N ≈ n, the shrinkage estimator (1 - ρ) (1/n) Σ_{i=1}^n x_i x_i^* + ρ I_N, for some ρ ∈ [0, 1], allows for invertibility and better conditioning; ρ may be chosen to minimize an expected error metric.
Limitation of Maronna's estimator: Maronna and Tyler estimators are limited to N < n, otherwise they do not exist; introducing shrinkage in the robust estimator cannot do much harm anyhow...
Introducing the robust-shrinkage estimator: the literature proposes two such estimators:
Ĉ_N(ρ) = (1 - ρ) (1/n) Σ_{i=1}^n x_i x_i^* / ( (1/N) x_i^* Ĉ_N(ρ)^{-1} x_i ) + ρ I_N,  ρ ∈ (max{0, 1 - n/N}, 1]  (Pascal)
Č_N(ρ) = B̌_N(ρ) / ( (1/N) tr B̌_N(ρ) ),  B̌_N(ρ) = (1 - ρ) (1/n) Σ_{i=1}^n x_i x_i^* / ( (1/N) x_i^* Č_N(ρ)^{-1} x_i ) + ρ I_N,  ρ ∈ (0, 1]  (Chen)
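A minimal numerical sketch of the two estimators, obtained by iterating their defining fixed-point equations; the fixed iteration count is an illustrative choice:

```python
import numpy as np

def pascal_shrinkage(X, rho, n_iter=100):
    """Fixed point of C = (1-rho)/n sum_i x_i x_i^* / ((1/N) x_i^* C^{-1} x_i) + rho I."""
    N, n = X.shape
    C = np.eye(N, dtype=X.dtype)
    for _ in range(n_iter):
        d = np.einsum('in,in->n', X.conj(), np.linalg.solve(C, X)).real / N
        C = (1 - rho) * (X / d) @ X.conj().T / n + rho * np.eye(N)
    return C

def chen_shrinkage(X, rho, n_iter=100):
    """Fixed point of B = (1-rho)/n sum_i x_i x_i^* / ((1/N) x_i^* C^{-1} x_i) + rho I,
    with C = B / ((1/N) tr B)."""
    N, n = X.shape
    C = np.eye(N, dtype=X.dtype)
    for _ in range(n_iter):
        d = np.einsum('in,in->n', X.conj(), np.linalg.solve(C, X)).real / N
        B = (1 - rho) * (X / d) @ X.conj().T / n + rho * np.eye(N)
        C = B / (np.trace(B).real / N)
    return C
```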

99 Advanced Random Matrix Models for Robust Estimation/3.3 Robust shrinkage and application to mathematical finance 111/142 Main theoretical result
Which estimator is better? Having asked the authors of both papers, each claimed their estimator was much better than the other's, but the arguments we received were quite vague...
Our result: in the random matrix regime, both estimators tend to be one and the same!
Assumptions: as before, an elliptical-like model x_i = √τ_i C_N^{1/2} w_i. This time, C_N cannot be taken equal to I_N (due to the +ρ I_N term)! Maronna-based shrinkage is possible but more involved...

100 Advanced Random Matrix Models for Robust Estimation/3.3 Robust shrinkage and application to mathematical finance 112/142 Pascal's estimator
Theorem (Pascal's estimator). For ε ∈ (0, min{1, c^{-1}}), define R̂_ε ≜ [ε + max{0, 1 - c^{-1}}, 1]. Then, as N, n → ∞ with N/n → c ∈ (0, ∞), sup_{ρ ∈ R̂_ε} ||Ĉ_N(ρ) - Ŝ_N(ρ)|| → 0 a.s., where
Ĉ_N(ρ) = (1 - ρ) (1/n) Σ_{i=1}^n x_i x_i^* / ( (1/N) x_i^* Ĉ_N(ρ)^{-1} x_i ) + ρ I_N,
Ŝ_N(ρ) = (1 - ρ) / ( γ̂(ρ) (1 - (1 - ρ) c) ) · (1/n) Σ_{i=1}^n C_N^{1/2} w_i w_i^* C_N^{1/2} + ρ I_N,
and γ̂(ρ) is the unique positive solution to the equation in γ̂: 1 = (1/N) Σ_{i=1}^N λ_i(C_N) / ( γ̂ ρ + (1 - ρ) λ_i(C_N) ).
Moreover, ρ ↦ γ̂(ρ) is continuous on (0, 1].

101 Advanced Random Matrix Models for Robust Estimation/3.3 Robust shrinkage and application to mathematical finance 113/142 Chen's estimator
Theorem (Chen's estimator). For ε ∈ (0, 1), define Ř_ε ≜ [ε, 1]. Then, as N, n → ∞ with N/n → c ∈ (0, ∞), sup_{ρ ∈ Ř_ε} ||Č_N(ρ) - Š_N(ρ)|| → 0 a.s., where
Č_N(ρ) = B̌_N(ρ) / ( (1/N) tr B̌_N(ρ) ),  B̌_N(ρ) = (1 - ρ) (1/n) Σ_{i=1}^n x_i x_i^* / ( (1/N) x_i^* Č_N(ρ)^{-1} x_i ) + ρ I_N,
Š_N(ρ) = (1 - ρ) / (ρ + T_ρ) · (1/n) Σ_{i=1}^n C_N^{1/2} w_i w_i^* C_N^{1/2} + T_ρ / (ρ + T_ρ) · I_N,
in which T_ρ = ρ γ̌(ρ) F(γ̌(ρ); ρ) with, for all x > 0,
F(x; ρ) = (1/2) (ρ - c(1 - ρ)) + √( (1/4) (ρ - c(1 - ρ))^2 + (1 - ρ)/x ),
and γ̌(ρ) is the unique positive solution to the equation in γ̌: 1 = (1/N) Σ_{i=1}^N λ_i(C_N) / ( γ̌ ρ + (1 - ρ) λ_i(C_N) / ((1 - ρ) c + F(γ̌; ρ)) ).
Moreover, ρ ↦ γ̌(ρ) is continuous on (0, 1].

102 Advanced Random Matrix Models for Robust Estimation/3.3 Robust shrinkage and application to mathematical finance 114/142 Asymptotic model equivalence
Theorem (Model Equivalence). For each ρ ∈ (0, 1], there exist unique ρ̂ ∈ (max{0, 1 - c^{-1}}, 1] and ρ̌ ∈ (0, 1] such that
Ŝ_N(ρ̂) / ( (1 - ρ̂) / ( γ̂(ρ̂) (1 - (1 - ρ̂) c) ) + ρ̂ ) = Š_N(ρ̌) = (1 - ρ) (1/n) Σ_{i=1}^n C_N^{1/2} w_i w_i^* C_N^{1/2} + ρ I_N.
Besides, the maps (0, 1] → (max{0, 1 - c^{-1}}, 1], ρ ↦ ρ̂ and (0, 1] → (0, 1], ρ ↦ ρ̌ are increasing and onto.
Up to normalization, both estimators behave the same! Both estimators behave the same as an impulse-free Ledoit-Wolf estimator.
About uniformity: uniformity over ρ in the theorems is essential to find optimal values of ρ.

103 Advanced Random Matrix Models for Robust Estimation/3.3 Robust shrinkage and application to mathematical finance 115/142 Optimal shrinkage parameter
Chen et al. sought a Frobenius-norm-minimizing ρ but got stuck on the implicit nature of Č_N(ρ). Our results allow for a simplification of the problem for large N, n! Model equivalence says that only one problem needs to be solved.
Theorem (Optimal Shrinkage). For each ρ ∈ (0, 1], define
D̂_N(ρ) ≜ (1/N) tr[ ( Ĉ_N(ρ) / ( (1/N) tr Ĉ_N(ρ) ) - C_N )^2 ],  Ď_N(ρ) ≜ (1/N) tr[ ( Č_N(ρ) - C_N )^2 ].
Denote D* ≜ c (M_2 - 1) / (c + M_2 - 1), ρ* ≜ c / (c + M_2 - 1), with M_2 ≜ lim_N (1/N) Σ_{i=1}^N λ_i^2(C_N), and let ρ̂*, ρ̌* be the unique solutions to
ρ̂* / ( (1 - ρ̂*) / ( γ̂(ρ̂*) (1 - (1 - ρ̂*) c) ) + ρ̂* ) = T_{ρ̌*} / ( ρ̌* + T_{ρ̌*} ) = ρ*.
Then, for ε small enough,
inf_{ρ ∈ R̂_ε} D̂_N(ρ) → D* a.s.,  D̂_N(ρ̂*) → D* a.s.,  inf_{ρ ∈ Ř_ε} Ď_N(ρ) → D* a.s.,  Ď_N(ρ̌*) → D* a.s.
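A small numerical check of these limiting quantities, using the closed forms for D* and ρ* as written above (these expressions are reconstructed from the slide and assume (1/N) tr C_N = 1; treat them as illustrative):

```python
import numpy as np

def limiting_optimal_shrinkage(C, c):
    """D* and rho* from the eigenvalues of C_N, normalized so that (1/N) tr C_N = 1,
    following the limiting expressions of the Optimal Shrinkage theorem as stated above."""
    lam = np.linalg.eigvalsh(C)
    lam = lam / lam.mean()                    # enforce (1/N) tr C = 1
    M2 = np.mean(lam ** 2)
    rho_star = c / (c + M2 - 1)
    D_star = c * (M2 - 1) / (c + M2 - 1)
    return D_star, rho_star

# Example: [C]_ij = 0.7^{|i-j|} as in the simulations below, c = N/n = 1/2
N = 32
C = 0.7 ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
print(limiting_optimal_shrinkage(C, c=0.5))
```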

104 Advanced Random Matrix Models for Robust Estimation/3.3 Robust shrinkage and application to mathematical finance 116/142 Estimating ρ̂* and ρ̌*
The theorem is only useful if ρ̂* and ρ̌* can be estimated! A careful control of the proofs provides many ways to estimate these; the proposition below provides one example.
Optimal Shrinkage Estimate. Let ρ̂_N ∈ (max{0, 1 - c_N^{-1}}, 1] and ρ̌_N ∈ (0, 1] be solutions (not necessarily unique) to
( (ρ̌_N/n) Σ_{i=1}^n x_i^* Č_N(ρ̌_N)^{-1} x_i / ||x_i||^2 ) / ( 1 - ρ̌_N + (ρ̌_N/n) Σ_{i=1}^n x_i^* Č_N(ρ̌_N)^{-1} x_i / ||x_i||^2 ) = c_N / ( (1/N) tr[ ( (1/n) Σ_{i=1}^n x_i x_i^* / ( (1/N) ||x_i||^2 ) )^2 ] - 1 ),
ρ̂_N / ( (1/N) tr Ĉ_N(ρ̂_N) ) = c_N / ( (1/N) tr[ ( (1/n) Σ_{i=1}^n x_i x_i^* / ( (1/N) ||x_i||^2 ) )^2 ] - 1 ),
defined arbitrarily when no such solutions exist. Then
ρ̂_N → ρ̂* a.s.,  ρ̌_N → ρ̌* a.s.,  D̂_N(ρ̂_N) → D* a.s.,  Ď_N(ρ̌_N) → D* a.s.

105 Advanced Random Matrix Models for Robust Estimation/3.3 Robust shrinkage and application to mathematical finance 117/142 Simulations
Figure: Performance of optimal shrinkage (normalized Frobenius norm versus n, log_2 scale), averaged over Monte Carlo simulations, for N = 32 and various values of n, [C_N]_{ij} = r^{|i-j|} with r = 0.7; curves: inf_{ρ∈(0,1]} Ď_N(ρ), Ď_N(ρ̌_N), D*, Ď_N(ρ̌_O), with ρ̌_N as above and ρ̌_O the clairvoyant estimator proposed in (Chen et al., 2011).

106 Advanced Random Matrix Models for Robust Estimation/3.3 Robust shrinkage and application to mathematical finance 118/142 Simulations
Figure: Shrinkage parameter ρ (versus n, log_2 scale) averaged over Monte Carlo simulations, for N = 32 and various values of n, [C_N]_{ij} = r^{|i-j|} with r = 0.7; ρ̂_N and ρ̌_N as above; ρ̌_O the clairvoyant estimator proposed in (Chen et al., 2011); also shown, ρ̂*, ρ̌* and the finite-N minimizers argmin_{ρ ∈ (max{0, 1 - c_N^{-1}}, 1]} D̂_N(ρ) and argmin_{ρ ∈ (0, 1]} Ď_N(ρ).

107 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 120/142 Context
Hypothesis testing problem: two sets of data. Initial pure-noise data x_1, ..., x_n, with x_i = √τ_i C_N^{1/2} w_i as before. New incoming datum y given by
y = x under H_0,  y = α p + x under H_1,
with x = √τ C_N^{1/2} w, p ∈ C^N deterministic and known, α unknown.
GLRT detection test: T_N(ρ) ≷_{H_0}^{H_1} Γ for some detection threshold Γ, where
T_N(ρ) ≜ |y^* Ĉ_N(ρ)^{-1} p| / √( (y^* Ĉ_N(ρ)^{-1} y) (p^* Ĉ_N(ρ)^{-1} p) )
and Ĉ_N(ρ) is defined in the previous section. In fact, the test was originally derived with Ĉ_N(0), which is only valid for N < n; introducing ρ may bring improvements for arbitrary N/n ratios.
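A direct numerical transcription of this statistic; Ĉ_N(ρ) can be obtained with, e.g., the Pascal fixed point sketched earlier:

```python
import numpy as np

def glrt_statistic(y, p, C_hat):
    """T_N(rho) = |y^* C^{-1} p| / sqrt((y^* C^{-1} y)(p^* C^{-1} p)),
    with C the regularized robust scatter estimate C_hat_N(rho)."""
    Ci_p = np.linalg.solve(C_hat, p)
    Ci_y = np.linalg.solve(C_hat, y)
    num = np.abs(y.conj() @ Ci_p)
    den = np.sqrt((y.conj() @ Ci_y).real * (p.conj() @ Ci_p).real)
    return float(num / den)

# p = np.ones(N) / np.sqrt(N); C_hat = pascal_shrinkage(X, rho=0.2); T = glrt_statistic(y, p, C_hat)
```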

108 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 121/142 Objectives and main results
Initial observation: as N, n → ∞ with N/n → c > 0, under H_0, T_N(ρ) → 0 a.s. A trivial result of little interest!
Natural question: for finite N, n and a given Γ, find ρ such that P(T_N(ρ) > Γ) is minimized. It turns out the correct non-trivial object is, for fixed γ > 0, P(√N T_N(ρ) > γ) = min.
Objectives: for each ρ, develop a central limit theorem to evaluate lim_{N,n→∞, N/n→c} P(√N T_N(ρ) > γ); determine the limiting minimizing ρ; empirically estimate the minimizing ρ.

109 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 122/142 What do we need? A CLT on Ĉ_N statistics
We know that ||Ĉ_N(ρ) - Ŝ_N(ρ)|| → 0 a.s. (the key result so far!). What about √N (Ĉ_N(ρ) - Ŝ_N(ρ))? It does not converge to zero!!! But there is hope:
√N ( a^* Ĉ_N(ρ) b - a^* Ŝ_N(ρ) b ) → 0 a.s.
This is our main result! It requires a much more delicate treatment, not discussed in this tutorial.

110 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 123/142 Main results
Theorem (Fluctuation of bilinear forms). Let a, b ∈ C^N with ||a|| = ||b|| = 1. Then, as N, n → ∞ with N/n → c > 0, for any ε > 0 and every k ∈ Z,
sup_{ρ ∈ R_κ} N^{1-ε} | a^* Ĉ_N^k(ρ) b - a^* Ŝ_N^k(ρ) b | → 0 a.s.,
where R_κ ≜ [κ + max{0, 1 - 1/c}, 1].

111 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 124/142 False alarm performance
Theorem (Asymptotic detector performance). As N, n → ∞ with N/n → c ∈ (0, ∞),
sup_{ρ ∈ R_κ} | P( √N T_N(ρ) > γ ) - exp( -γ^2 / (2 σ_N^2(ρ̂)) ) | → 0,
where ρ ↦ ρ̂ is the aforementioned mapping and
σ_N^2(ρ̂) ≜ (1/2) p^* C_N Q_N^2(ρ̂) p / [ p^* Q_N(ρ̂) p · (1/N) tr(C_N Q_N(ρ̂)) · ( 1 - c (1 - ρ̂)^2 m(-ρ̂)^2 (1/N) tr(C_N^2 Q_N^2(ρ̂)) ) ],
with Q_N(ρ̂) ≜ ( I_N + (1 - ρ̂) m(-ρ̂) C_N )^{-1}.
Limiting Rayleigh distribution: weak convergence of √N T_N(ρ) to a Rayleigh variable R_N(ρ̂).
Remark: σ_N and ρ̂ are not functions of γ, so there exists a uniformly optimal ρ!

112 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 125/142 Simulation
Figure: Histogram (left) and cumulative distribution function (right) of √N T_N(ρ) versus the density and distribution of R_N(ρ̂), for N = 20, p = N^{-1/2}[1, ..., 1]^T, C_N Toeplitz from an AR process of coefficient 0.7, c_N = 1/2, ρ = 0.2.

113 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 126/142 Simulation
Figure: Histogram (left) and cumulative distribution function (right) of √N T_N(ρ) versus the density and distribution of R_N(ρ̂), for N = 100, p = N^{-1/2}[1, ..., 1]^T, C_N Toeplitz from an AR process of coefficient 0.7, c_N = 1/2, ρ = 0.2.

114 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 127/142 Empirical estimation of the optimal ρ
The optimal ρ can be found by line search... but C_N is unknown! We shall successively: empirically estimate σ_N(ρ̂); minimize the estimate; prove by uniformity the asymptotic optimality of the estimate.
Theorem (Empirical performance estimation). For ρ ∈ (max{0, 1 - c_N^{-1}}, 1), let
σ̂_N^2(ρ̂) ≜ (1/2) · (1/N) p^* Ĉ_N^{-2}(ρ) p / [ (1 - c + c ρ̂) · p^* Ĉ_N^{-1}(ρ) p · (1/N) tr Ĉ_N^{-1}(ρ) · ( 1 - ρ̂ (1/N) tr Ĉ_N(ρ) / ( (1/N) tr Ĉ_N^{-1}(ρ) ) ) ].
Also let σ̂_N^2(1) ≜ lim_{ρ̂→1} σ̂_N^2(ρ̂). Then sup_{ρ ∈ R_κ} | σ_N^2(ρ̂) - σ̂_N^2(ρ̂) | → 0 a.s.

115 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 128/142 Final result
Theorem (Optimality of the empirical estimator). Define ρ̂_N ≜ argmin_{ρ ∈ R_κ} { σ̂_N^2(ρ̂) }. Then, for every γ > 0,
P( √N T_N(ρ̂_N) > γ ) - inf_{ρ ∈ R_κ} { P( √N T_N(ρ) > γ ) } → 0.
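A sketch of the resulting selection procedure as a plain grid search over ρ; the empirical variance estimate is passed in as a callable (named sigma2_hat here purely for illustration, standing for the σ̂²_N(ρ̂) of the previous slide):

```python
import numpy as np

def optimal_rho_by_line_search(sigma2_hat, rho_grid):
    """Grid search for rho_N = argmin_rho sigma2_hat(rho) over a discretization of R_kappa.
    sigma2_hat: callable rho -> empirical variance estimate (hypothetical helper standing
    for the expression of the previous slide)."""
    values = np.array([sigma2_hat(rho) for rho in rho_grid])
    best = int(np.argmin(values))
    return rho_grid[best], values

# Illustrative usage (sigma2_hat to be implemented from the previous slide's formula):
# rho_grid = np.linspace(0.05, 1.0, 96)
# rho_N, vals = optimal_rho_by_line_search(sigma2_hat, rho_grid)
```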

116 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 129/142 Simulations
Figure: False alarm rate P(√N T_N(ρ) > γ) as a function of ρ, for two values of γ (one of them γ = 2), comparing the limiting theory, the empirical estimator and the detector; N = 20, p = N^{-1/2}[1, ..., 1]^T, C_N Toeplitz from an AR process of coefficient 0.7, c_N = 1/2.

117 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 130/142 Simulations
Figure: False alarm rate P(√N T_N(ρ) > γ) as a function of ρ, for two values of γ (one of them γ = 2), comparing the limiting theory, the empirical estimator and the detector; N = 100, p = N^{-1/2}[1, ..., 1]^T, C_N Toeplitz from an AR process of coefficient 0.7, c_N = 1/2.

118 Advanced Random Matrix Models for Robust Estimation/3.4 Optimal robust GLRT detectors 131/142 Simulations
Figure: False alarm rate P(T_N(ρ) > Γ) as a function of Γ (limiting theory versus detector), for N = 20 and N = 100, p = N^{-1/2}[1, ..., 1]^T, [C_N]_{ij} = 0.7^{|i-j|}, c_N = 1/2.

119 Future Directions/4.1 Kernel matrices and kernel methods 134/142 Motivation: Spectral Clustering
N. El Karoui, The spectrum of kernel random matrices, The Annals of Statistics, 38(1):1-50, 2010.
Objective: clustering data x_1, ..., x_n ∈ C^N into k similarity classes, a classical machine learning problem brought here to big data! It assumes a similarity function, e.g. the Gaussian kernel f(x_i, x_j) = exp( -||x_i - x_j||^2 / (2σ^2) ), and naturally brings the kernel matrix W = [W_ij]_{1<=i,j<=n} = [f(x_i, x_j)]_{1<=i,j<=n}.
Letting x_1, ..., x_n be random leads naturally to studying kernel random matrices. Little is known on such random matrices, but for x_i i.i.d. with zero mean and covariance I_N,
|| W - α 1_n 1_n^T - (β/n) X^* X || → 0 a.s.
for some α, β depending on f and its derivatives, where X = [x_1, ..., x_n]. To first order, W is thus equivalent to a rank-one matrix.
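For concreteness, the kernel matrix construction in code (the bandwidth σ is a free parameter):

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma):
    """W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)) for data columns x_1, ..., x_n of X (N x n)."""
    sq_norms = np.sum(np.abs(X) ** 2, axis=0)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2 * np.real(X.conj().T @ X)
    return np.exp(-np.maximum(d2, 0.0) / (2 * sigma ** 2))
```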

120 Future Directions/4.1 Kernel matrices and kernel methods 135/142 Motivation: Spectral Clustering
Clustering x_1, ..., x_n into k classes is often written as
(RatioCut)  min_{S_1 ∪ ... ∪ S_k = S, S_i ∩ S_j = ∅} Σ_{i=1}^k (1/|S_i|) Σ_{j ∈ S_i, j' ∈ S_i^c} f(x_j, x_{j'}),
but this is difficult to solve, NP-hard! It can be equivalently rewritten as
(RatioCut)  min_{M ∈ ℳ, M^T M = I_k} tr( M^T L M ),
where ℳ = { M = [m_ij]_{i<=n, j<=k}, m_ij = |S_j|^{-1/2} δ_{x_i ∈ S_j} } and
L = [L_ij]_{1<=i,j<=n} = [ -W + diag(W 1_n) ]_{1<=i,j<=n} = [ -f(x_i, x_j) + δ_{ij} Σ_{l=1}^n f(x_i, x_l) ]_{1<=i,j<=n}.
Relaxing M to have orthonormal columns leads to a simple eigenvalue/eigenvector problem: spectral clustering.
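A compact sketch of the relaxed procedure (unnormalized Laplacian, k smallest eigenvectors, then a basic k-means on the rows; the k-means details are illustrative choices):

```python
import numpy as np

def spectral_clustering(W, k, n_kmeans_iter=50, seed=0):
    """Relaxed RatioCut: take the k eigenvectors of L = diag(W 1) - W with smallest
    eigenvalues, then cluster their rows with a basic Lloyd (k-means) iteration."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    _, V = np.linalg.eigh(L)                  # eigenvalues in ascending order
    M = V[:, :k]                              # relaxed indicator matrix, rows = embedded points
    rng = np.random.default_rng(seed)
    centers = M[rng.choice(n, size=k, replace=False)]
    for _ in range(n_kmeans_iter):
        labels = np.argmin(((M[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = M[labels == j].mean(axis=0)
    return labels

# labels = spectral_clustering(gaussian_kernel_matrix(X, sigma=1.0), k=3)
```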

121 Future Directions/4.1 Kernel matrices and kernel methods 136/142 Objectives
Generalization to k distributions for x_1, ..., x_n should lead to asymptotically rank-k W matrices. If established, specific choices of known good kernels would be better understood. Eventually, find optimal kernel choices.

122 Future Directions/4.2 Neural networks 138/142 Echo-state neural networks
Neural network: input neuron signal s_t ∈ R (could be multivariate); output neuron signal y_t ∈ R (could be multivariate); N neurons with state x_t ∈ R^N at time t; connectivity matrix W ∈ R^{N×N}; connectivity vector to the input w_I ∈ R^N; connectivity vector to the output w_O ∈ R^N.
State evolution: x_0 = 0 (say) and x_{t+1} = S(W x_t + w_I s_t), with S an entry-wise sigmoid function. Output observation: y_t = w_O^T x_t.
Classical neural networks: learning phase: input-output data (s_t, y_t) used to learn W, w_O, w_I (via e.g. LS); interpolation phase: W, w_O, w_I fixed, we observe the output y_t from new data s_t. This poses overlearning problems, is difficult to set up, and demands lots of learning data.
Echo-state neural networks: to solve these problems, W and w_I are set to be random and are no longer learned; only w_O is learned. This reduces the amount of data needed for learning and shows striking performance in some scenarios.
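A toy implementation of the state recursion, taking tanh as the entry-wise sigmoid and an i.i.d. Gaussian reservoir scaled to a spectral radius of roughly 0.9 (all illustrative choices):

```python
import numpy as np

def esn_states(s, W, w_in, nonlinearity=np.tanh):
    """Run x_{t+1} = S(W x_t + w_in s_t) from x_0 = 0 and return the state trajectory."""
    N, T = W.shape[0], len(s)
    X = np.zeros((N, T))
    x = np.zeros(N)
    for t in range(T):
        x = nonlinearity(W @ x + w_in * s[t])
        X[:, t] = x
    return X

rng = np.random.default_rng(0)
N, T = 200, 1000
W = rng.standard_normal((N, N)) / np.sqrt(N) * 0.9     # i.i.d. entries, spectral radius ~ 0.9
w_in = rng.standard_normal(N)
s = np.sin(np.arange(T) / 10.0)                        # toy scalar input
X = esn_states(s, W, w_in)
```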

123 Future Directions/4.2 Neural networks 139/142 ESN and random matrices
W and w_I being random, the performance study involves random matrices. Stability, chaos regime, etc., involve the extreme eigenvalues of W; the main difficulty is the non-linearity caused by S.
Performance measures: MSE on training data; MSE on interpolated data. Optimization is to be performed on the regression method, e.g.
w_O = ( X_train X_train^T + γ I_N )^{-1} X_train y_train,
with X_train = [x_1, ..., x_T], y_train = [y_1, ..., y_T]^T, T the training period.
In a first approximation, S = Id. The MSE performance with stationary inputs then leads to studying
Σ_{j>=0} W^j w_I w_I^T (W^T)^j,
a new random matrix model, which can however be analyzed with the usual tools.
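A sketch of the readout training and of a truncated evaluation of the series above (the truncation length is an illustrative choice; the series converges when the spectral radius of W is below one):

```python
import numpy as np

def train_readout(X_train, y_train, gamma):
    """Ridge readout w_O = (X X^T + gamma I)^{-1} X y, as in the slide."""
    N = X_train.shape[0]
    return np.linalg.solve(X_train @ X_train.T + gamma * np.eye(N), X_train @ y_train)

def input_memory_matrix(W, w_in, n_terms=200):
    """Truncation of sum_{j>=0} W^j w_in w_in^T (W^T)^j, the matrix arising in the
    linearized (S = Id) stationary-input MSE analysis."""
    N = W.shape[0]
    M = np.zeros((N, N))
    v = w_in.copy()
    for _ in range(n_terms):
        M += np.outer(v, v)
        v = W @ v
    return M

# w_O = train_readout(X[:, :800], y[:800], gamma=1e-2)   # X, y from a training record
# y_hat = w_O @ X[:, 800:]                                # interpolation on new states
```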

124 Future Directions/4.2 Neural networks 140/142 Related bibliography
J. T. Kent, D. E. Tyler, Redescending M-estimates of multivariate location and scatter, 1991.
R. A. Maronna, Robust M-estimators of multivariate location and scatter, 1976.
Y. Chitour, F. Pascal, Exact maximum likelihood estimates for SIRV covariance matrix: Existence and algorithm analysis, 2008.
N. El Karoui, Concentration of measure and spectra of random matrices: applications to correlation matrices, elliptical distributions and beyond, 2009.
R. Couillet, F. Pascal, J. W. Silverstein, Robust M-Estimation for Array Processing: A Random Matrix Approach, 2012.
J. Vinogradova, R. Couillet, W. Hachem, Statistical Inference in Large Antenna Arrays under Unknown Noise Pattern, (submitted to) IEEE Transactions on Signal Processing, 2012.
F. Chapon, R. Couillet, W. Hachem, X. Mestre, On the isolated eigenvalues of large Gram random matrices with a fixed rank deformation, (submitted to) Electronic Journal of Probability, 2012, arXiv preprint.
R. Couillet, M. Debbah, Signal Processing in Large Systems: a New Paradigm, IEEE Signal Processing Magazine, vol. 30, no. 1, 2013.
P. Loubaton, P. Vallet, Almost sure localization of the eigenvalues in a Gaussian information plus noise model. Application to the spiked models, Electronic Journal of Probability, 2011.
P. Vallet, W. Hachem, P. Loubaton, X. Mestre, J. Najim, On the consistency of the G-MUSIC DOA estimator, IEEE Statistical Signal Processing Workshop (SSP), 2011.

125 Future Directions/4.2 Neural networks 141/142 To know more about all this
Our webpages:

126 Future Directions/4.2 Neural networks 142/142 Spraed it!
To download this presentation (PDF format): log in to your Spraed account and scan this QR code.
