Free Probability Theory and Random Matrices
Roland Speicher, Queen's University, Kingston, Canada

What is Operator-Valued Free Probability and Why Should Engineers Care About it?
Many approximations for calculating the eigenvalue distribution of random matrices consist in replacing independent Gaussian random variables by free (semi)circular variables. Reasons for doing so: in the limit $N \to \infty$ we have this transition asymptotically; even for finite $N$, the approximation is usually quite close to the original problem; and the approximation is usually exactly calculable for each $N$.
Example: a selfadjoint Gaussian $N \times N$ random matrix $X_N = (x_{ij})_{i,j=1}^N$, where the $x_{ij}$ ($1 \le i \le j \le N$) are independent complex ($i \ne j$) or real ($i = j$) Gaussian random variables with
$$x_{ij} = \overline{x_{ji}}, \qquad \varphi(x_{ij}) = 0, \qquad \varphi(x_{ij}\overline{x_{ij}}) = 1/N.$$
Replacing the independent Gaussian variables by free (semi)circular variables gives the selfadjoint noncommutative $N \times N$ random matrix $S_N = (c_{ij})_{i,j=1}^N$, where the $c_{ij}$ ($1 \le i \le j \le N$) are free circular ($i \ne j$) or semicircular ($i = j$) random variables with
$$c_{ij} = c_{ji}^*, \qquad \varphi(c_{ij}) = 0, \qquad \varphi(c_{ij}c_{ij}^*) = 1/N.$$
$X_N = (x_{ij})_{i,j=1}^N$ versus $S_N = (c_{ij})_{i,j=1}^N$: $X_N$ has a complicated averaged eigenvalue distribution (i.e., with respect to $\mathrm{tr} \otimes \varphi$), whereas $S_N$ has a very simple distribution with respect to $\mathrm{tr} \otimes \varphi$: for each $N$, $S_N$ is a semicircular variable.
Moral: $S_N$ is an approximation for $X_N$; the approximation gets better for large $N$, and $\mathrm{distr}(S_N)$ can be calculated exactly.
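A minimal numerical sketch of this approximation (an illustration added here, not from the slides): sample a selfadjoint Gaussian matrix with entry variance $1/N$ and check that its empirical spectrum already matches the semicircular distribution of $S_N$, whose even moments are the Catalan numbers.

```python
import numpy as np

def sample_gue(n, rng):
    """Selfadjoint Gaussian random matrix with entry (co)variance 1/n."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    x = (a + a.conj().T) / 2          # make it selfadjoint
    return x / np.sqrt(n)             # normalize so phi(x_ij x_ij*) = 1/n

rng = np.random.default_rng(0)
n = 1000
evals = np.linalg.eigvalsh(sample_gue(n, rng))

# The semicircle density sqrt(4 - t^2)/(2 pi) lives on [-2, 2] and has
# second and fourth moments 1 and 2 (Catalan numbers C_1 and C_2).
m2 = np.mean(evals**2)
m4 = np.mean(evals**4)
print(m2, m4)   # close to 1 and 2 for large n
```

The eigenvalues stay essentially inside $[-2, 2]$, which is what "the approximation gets better for large $N$" looks like in practice.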
Why is $S_N$ more tractable than $X_N$? Taking matrices does not fit well with independence and Gaussianity; freeness and (semi)circular variables, on the other hand, go very nicely with matrices.
Freeness and (semi)circular variables go very nicely with matrices. Caveat: this is true for free (semi)circulars which are centered and all have the same variance. However, we might be interested in more general situations: the $x_{ij}$ might have different variances for different $(i,j)$; the $x_{ij}$ might not be centered (Ricean model); there might even be correlations between different $x_{ij}$ and $x_{kl}$.
Example: non-zero mean, independent Gaussian variables with constant variance: $Y_N = A_N + X_N$, where $A_N$ is a deterministic matrix of means and $X_N$ consists of independent centered Gaussians of constant variance.
We replace this by $A_N + S_N$: the same deterministic matrix as before plus free centered (semi)circulars of constant variance. We have: $A_N$ is free from $S_N$, thus $\mathrm{distr}(A_N + S_N) = \mathrm{distr}(A_N) \boxplus \mathrm{distr}(S_N)$.
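As a concrete sketch (illustration, not from the slides; the diagonal $A_N$ with atoms $\pm 1$ is a hypothetical choice): for a standard semicircular $S$, the free convolution is determined by the scalar subordination equation $G_{A+S}(z) = G_A\big(z - G_{A+S}(z)\big)$, which we solve by fixed-point iteration and compare with the resolvent of a simulated $A_N + X_N$.

```python
import numpy as np

def g_a(w):
    """Cauchy transform of the symmetric Bernoulli distribution (atoms +-1)."""
    return 0.5 / (w - 1) + 0.5 / (w + 1)

def g_free(z, iters=300):
    """Solve g = g_a(z - g); R_S(b) = b for a standard semicircular."""
    g = 1 / z
    for _ in range(iters):
        g = 0.5 * g + 0.5 * g_a(z - g)   # damped subordination fixed point
    return g

rng = np.random.default_rng(1)
n = 1000
a = np.diag(np.where(np.arange(n) < n // 2, 1.0, -1.0))
h = rng.normal(size=(n, n))
x = (h + h.T) / np.sqrt(2 * n)           # GOE-type, entry variance ~ 1/n
evals = np.linalg.eigvalsh(a + x)

z = 0.5 + 1.0j
g_theory = g_free(z)
g_mc = np.mean(1 / (z - evals))          # empirical Cauchy transform
print(g_theory, g_mc)                    # should agree closely
```

The second moment of $A_N + X_N$ is $\approx 2$, as freeness predicts: variances of free summands add.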
Example: non-zero mean, independent Gaussian variables with varying variance: $Y_N = A_N + X_N$, where $A_N$ is a deterministic matrix of means and $X_N$ consists of independent centered Gaussians with $\varphi(x_{ij}\overline{x_{ij}}) = \sigma_{ij}/N$.
We replace this by $A_N + S_N$: the same deterministic matrix as before plus free centered (semi)circulars with $\varphi(c_{ij}c_{ij}^*) = \sigma_{ij}/N$. Now we have a problem: $S_N$ is not semicircular in general, and $A_N$ and $S_N$ are not free in general.
So what do we gain by replacing independent Gaussians by free (semi)circulars in such a case?
$X = (x_{ij})_{i,j=1}^N$, $Y = (y_{kl})_{k,l=1}^N$ with $\{x_{ij}\}$ and $\{y_{kl}\}$ independent: $X$ and $Y$ are not independent; actually, the relation between $X$ and $Y$ is untreatable.

$X = (x_{ij})_{i,j=1}^N$, $Y = (y_{kl})_{k,l=1}^N$ with $\{x_{ij}\}$ and $\{y_{kl}\}$ free: $X$ and $Y$ are not free; however, the relation between $X$ and $Y$ is more complicated, but still treatable, in terms of operator-valued freeness.
Let $(\mathcal{C}, \varphi)$ be a non-commutative probability space. Consider $N \times N$ matrices over $\mathcal{C}$: $M_N(\mathcal{C}) := \{(a_{ij})_{i,j=1}^N \mid a_{ij} \in \mathcal{C}\}$. Then $M_N(\mathcal{C}) = M_N(\mathbb{C}) \otimes \mathcal{C}$ is a non-commutative probability space with respect to $\mathrm{tr} \otimes \varphi$, but there is also an intermediate level.
Instead of mapping $M_N(\mathcal{C})$ directly to $\mathbb{C}$ via $\mathrm{tr} \otimes \varphi$, consider the intermediate level
$$M_N(\mathcal{C}) = M_N(\mathbb{C}) \otimes \mathcal{C} =: \mathcal{A} \;\xrightarrow{\;E := \mathrm{id} \otimes \varphi\;}\; M_N(\mathbb{C}) =: \mathcal{B} \;\xrightarrow{\;\mathrm{tr}\;}\; \mathbb{C},$$
so that $\mathrm{tr} \otimes \varphi = \mathrm{tr} \circ E$.
Let $\mathcal{B} \subseteq \mathcal{A}$. A linear map $E : \mathcal{A} \to \mathcal{B}$ is a conditional expectation if $E[b] = b$ for all $b \in \mathcal{B}$ and $E[b_1 a b_2] = b_1 E[a] b_2$ for all $a \in \mathcal{A}$ and $b_1, b_2 \in \mathcal{B}$. An operator-valued probability space consists of $\mathcal{B} \subseteq \mathcal{A}$ and a conditional expectation $E : \mathcal{A} \to \mathcal{B}$.
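A small numerical illustration of the bimodule property (added here as an assumption-free sanity check, not from the slides): take $\mathcal{A}$ to be $2 \times 2$ matrices of random variables, $\mathcal{B}$ the deterministic $2 \times 2$ matrices, and $E = \mathrm{id} \otimes \varphi$ the entrywise expectation.

```python
import numpy as np

rng = np.random.default_rng(2)
# 5000 samples of a 2x2 matrix of (non-centered) random variables
samples = rng.normal(size=(5000, 2, 2)) + np.array([[1.0, 0.0], [2.0, -1.0]])

def E(mats):
    """E = id (x) phi: apply the expectation entrywise (Monte Carlo mean)."""
    return mats.mean(axis=0)

b1 = np.array([[0.0, 1.0], [1.0, 2.0]])   # deterministic, i.e. in B
b2 = np.array([[3.0, 0.0], [1.0, 1.0]])

lhs = E(b1 @ samples @ b2)   # E[b1 a b2]
rhs = b1 @ E(samples) @ b2   # b1 E[a] b2
print(np.max(np.abs(lhs - rhs)))   # ~ 0 up to floating point
```

The equality is exact (up to floating point) because $E$ is linear and $b_1, b_2$ are scalar matrices: this is precisely the bimodule property $E[b_1 a b_2] = b_1 E[a] b_2$.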
Consider an operator-valued probability space $(\mathcal{A}, E : \mathcal{A} \to \mathcal{B})$. The operator-valued distribution of $a \in \mathcal{A}$ is given by all operator-valued moments $E[a b_1 a b_2 \cdots b_{n-1} a] \in \mathcal{B}$ ($n \in \mathbb{N}$, $b_1, \ldots, b_{n-1} \in \mathcal{B}$). Random variables $x, y \in \mathcal{A}$ are free with respect to $E$ (or free with amalgamation over $\mathcal{B}$) if $E[p_1(x) q_1(y) p_2(x) q_2(y) \cdots] = 0$ whenever the $p_i, q_j$ are polynomials with coefficients from $\mathcal{B}$ such that $E[p_i(x)] = 0$ for all $i$ and $E[q_j(y)] = 0$ for all $j$.
Note: polynomials in $x$ with coefficients from $\mathcal{B}$ are of the form $x^2 b_0$, $x^2 b_1 x b_2 x b_3$, $b_1 x b_2 x b_3 + b_4 x b_5 x b_6$, etc.; the $b$'s and $x$ do not commute in general!
Operator-valued freeness works mostly like ordinary freeness; one only has to take care of the order of the variables: in all expressions they have to appear in their original order! Example: if $x$ and $\{y_1, y_2\}$ are free, then one has $E[y_1 x y_2] = E\big[y_1 E[x] y_2\big]$; and, more generally, $E[y_1 b_1 x b_2 y_2] = E\big[y_1 b_1 E[x] b_2 y_2\big]$.
Consider $E : \mathcal{A} \to \mathcal{B}$. Define the operator-valued free cumulants $\kappa_n^{\mathcal{B}} : \mathcal{A}^n \to \mathcal{B}$ by
$$E[a_1 \cdots a_n] = \sum_{\pi \in NC(n)} \kappa_\pi^{\mathcal{B}}[a_1, \ldots, a_n].$$
The arguments of $\kappa_\pi^{\mathcal{B}}$ are distributed according to the blocks of $\pi$; but now the cumulants are nested inside each other according to the nesting of the blocks of $\pi$.
Example: for $\pi = \{\{1,10\}, \{2,5,9\}, \{3,4\}, \{6\}, \{7,8\}\} \in NC(10)$,
$$\kappa_\pi^{\mathcal{B}}[a_1, \ldots, a_{10}] = \kappa_2^{\mathcal{B}}\Big(a_1 \cdot \kappa_3^{\mathcal{B}}\big(a_2 \cdot \kappa_2^{\mathcal{B}}(a_3, a_4),\; a_5 \cdot \kappa_1^{\mathcal{B}}(a_6) \cdot \kappa_2^{\mathcal{B}}(a_7, a_8),\; a_9\big),\; a_{10}\Big).$$
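A scalar sketch of the moment-cumulant formula (illustration, not from the slides): for a standard semicircular element only $\kappa_2 = 1$ is nonzero, so the sum over $NC(n)$ collapses to counting noncrossing pairings, and $E[s^{2n}]$ equals the Catalan number $C_n$.

```python
def catalan(n):
    """Catalan numbers via the recursion C_m = sum_i C_i * C_{m-1-i}."""
    c = [1]
    for m in range(1, n + 1):
        c.append(sum(c[i] * c[m - 1 - i] for i in range(m)))
    return c[n]

def nc_pairings(k):
    """Count noncrossing pairings of {1,...,k} (0 if k is odd)."""
    if k == 0:
        return 1
    if k % 2 == 1:
        return 0
    # Element 1 pairs with an even position 2j+2, splitting the rest into
    # an inside and an outside interval, each noncrossing on its own.
    return sum(nc_pairings(2 * j) * nc_pairings(k - 2 * j - 2)
               for j in range(k // 2))

moments = [nc_pairings(2 * n) for n in range(6)]
print(moments)   # [1, 1, 2, 5, 14, 42], the Catalan numbers
```

The same recursive inside/outside splitting is what produces the nested cumulants in the operator-valued formula above; over $\mathcal{B}$ the blocks no longer just count, because their values are multiplied in their original positions.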
For $a \in \mathcal{A}$ define its operator-valued Cauchy transform
$$G_a(b) := E[(b - a)^{-1}] = \sum_{n \ge 0} E[b^{-1}(a b^{-1})^n]$$
and its operator-valued $R$-transform
$$R_a(b) := \sum_{n \ge 0} \kappa_{n+1}^{\mathcal{B}}(ab, ab, \ldots, ab, a) = \kappa_1^{\mathcal{B}}(a) + \kappa_2^{\mathcal{B}}(ab, a) + \kappa_3^{\mathcal{B}}(ab, ab, a) + \cdots.$$
Then $bG(b) = 1 + R(G(b)) \cdot G(b)$, or $G(b) = \big(b - R(G(b))\big)^{-1}$.
If $x$ and $y$ are free over $\mathcal{B}$, then mixed $\mathcal{B}$-valued cumulants in $x$ and $y$ vanish, so
$$R_{x+y}(b) = R_x(b) + R_y(b) \qquad \text{and} \qquad G_{x+y}(b) = G_x\big[b - R_y\big(G_{x+y}(b)\big)\big] \quad \text{(subordination)}.$$
If $s$ is a semicircular element over $\mathcal{B}$, then $R_s(b) = \eta(b)$, where $\eta : \mathcal{B} \to \mathcal{B}$ is the linear map given by $\eta(b) = E[sbs]$.
Back to random matrices. What can we say about $A_N + S_N$, a deterministic matrix of means plus free centered (semi)circulars with $\varphi(c_{ij}c_{ij}^*) = \sigma_{ij}/N$?
Consider $T_N := A_N + S_N$. We want the Cauchy transform $g_T(z) = \mathrm{tr} \otimes \varphi\big[(z - T)^{-1}\big]$. Using the intermediate level $M_N(\mathbb{C})$, we get
$$g_T(z) = \mathrm{tr} \otimes \varphi\big[(z - T)^{-1}\big] = \mathrm{tr}\Big[\underbrace{\mathrm{id} \otimes \varphi\big((z - T)^{-1}\big)}_{G_T(z)}\Big] = \mathrm{tr}[G_T(z)],$$
where $G_T(z)$ is the $M_N(\mathbb{C})$-valued Cauchy transform of $T$.
We have nice behavior of $A_N$ and $S_N$ as $M_N(\mathbb{C})$-valued objects: $A_N$ and $S_N$ are free over $M_N(\mathbb{C})$, i.e., $G_T(z) = G_A\big[z - R_S\big(G_T(z)\big)\big]$; and $S_N$ is semicircular over $M_N(\mathbb{C})$, i.e., $R_S(B) = \eta(B)$, with $\eta : M_N(\mathbb{C}) \to M_N(\mathbb{C})$ linear, given by $\eta(B) = E[SBS]$.
Thus the distribution of $T_N = A_N + S_N$ is determined via its Cauchy transform $g_T$ according to $g_T(z) = \mathrm{tr}[G_T(z)]$ and
$$G_T(z) = G_A\big[z - \eta\big(G_T(z)\big)\big] = E\Big[\big(z - \eta(G_T(z)) - A\big)^{-1}\Big].$$
Note: by [Helton, Far, Speicher, IMRN 2007], there exists exactly one solution of the above fixed point equation with the right positivity property!
Note: the more symmetries we have in the entries of $S_N$, the better the freeness between $A_N$ and $S_N$! For the considered situation, where the different $c_{ij}$ are free, we have for $\eta : M_N(\mathbb{C}) \to M_N(\mathbb{C})$ that actually
$$[\eta(B)]_{ij} = E[SBS]_{ij} = \sum_{k,l} \varphi(c_{ik} c_{lj})\, b_{kl} = \sum_{k,l} \delta_{ij}\,\delta_{kl}\,\frac{\sigma_{ik}}{N}\, b_{kl} = \delta_{ij} \sum_k \frac{\sigma_{ik}}{N}\, b_{kk};$$
thus $\eta$ is effectively a mapping on diagonal matrices.
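The fixed point equation can be iterated numerically. The following sketch (an illustration with an assumed variance profile and assumed diagonal $A_N$, not code from the references) solves $G(z) = \big(z - A - \eta(G(z))\big)^{-1}$ with $[\eta(B)]_{ii} = \tfrac{1}{N}\sum_k \sigma_{ik} b_{kk}$ and compares $\mathrm{tr}\,G(z)$ with the averaged resolvent trace of simulated Gaussian matrices $A_N + X_N$.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100
u = np.linspace(0.0, 1.0, N)
sig = 0.5 + (u[:, None] + u[None, :]) / 2      # symmetric variance profile
A = np.diag(np.linspace(-1.0, 1.0, N))

def eta(G):
    """[eta(B)]_ij = delta_ij * sum_k sigma_ik B_kk / N (diagonal output)."""
    return np.diag(sig @ np.diagonal(G)) / N

def solve_G(z, iters=500):
    """Damped fixed-point iteration for G = (z - A - eta(G))^{-1}."""
    G = np.linalg.inv(z * np.eye(N) - A)
    for _ in range(iters):
        G = 0.5 * G + 0.5 * np.linalg.inv(z * np.eye(N) - A - eta(G))
    return G

def sample_X():
    """Selfadjoint Gaussian matrix with E|x_ij|^2 = sigma_ij / N."""
    zmat = (rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))) / np.sqrt(2)
    w = (zmat + zmat.conj().T) / np.sqrt(2)    # hermitian, unit entry variance
    return np.sqrt(sig / N) * w                # impose the variance profile

z = 0.3 + 1.0j
g_theory = np.trace(solve_G(z)) / N
g_mc = np.mean([np.trace(np.linalg.inv(z * np.eye(N) - A - sample_X())) / N
                for _ in range(20)])
print(g_theory, g_mc)   # agree up to O(1/N) corrections
```

The density of $T_N$ would then be recovered from $-\tfrac{1}{\pi}\,\mathrm{Im}\, g_T(x + i\varepsilon)$ for small $\varepsilon > 0$.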
Consider, in addition to
$$E_M : M_N(\mathcal{C}) \to M_N(\mathbb{C}), \qquad (a_{ij})_{i,j=1}^N \mapsto \big(\varphi(a_{ij})\big)_{i,j=1}^N,$$
also $D_N(\mathbb{C}) := \{\text{diagonal matrices}\} \subseteq M_N(\mathbb{C})$ and the corresponding conditional expectation
$$E_D : M_N(\mathcal{C}) \to D_N(\mathbb{C}), \qquad (a_{ij})_{i,j=1}^N \mapsto \mathrm{diag}\big(\varphi(a_{11}), \ldots, \varphi(a_{NN})\big).$$
Then we have in our situation $A_N + S_N$ (deterministic matrix of means plus free centered (semi)circulars with $\varphi(c_{ij}c_{ij}^*) = \sigma_{ij}/N$) that actually, by [Nica, Shlyakhtenko, Speicher, JFA 2002]: $A_N$ and $S_N$ are free over $D_N(\mathbb{C})$, and $S_N$ is semicircular over $D_N(\mathbb{C})$.
The hierarchy of levels
$$M_N(\mathcal{C}) \xrightarrow{\;E_M\;} M_N(\mathbb{C}), \qquad M_N(\mathcal{C}) \xrightarrow{\;E_D\;} D_N(\mathbb{C}), \qquad M_N(\mathcal{C}) \xrightarrow{\;\mathrm{tr} \otimes \varphi\;} \mathbb{C}$$
corresponds to the generality of entries that can be treated: over $M_N(\mathbb{C})$, entries with correlation; over $D_N(\mathbb{C})$, free entries with varying variance; over $\mathbb{C}$, free entries with constant variance.
Let us now treat the more relevant non-selfadjoint situation $H_N = A_N + C_N$: a deterministic matrix of means plus free circulars with no symmetry condition and $\varphi(c_{ij}c_{ij}^*) = \sigma_{ij}/N$. Calculate the distribution of $HH^*$!
$HH^*$ has the same distribution as the square of
$$T := \begin{pmatrix} 0 & H \\ H^* & 0 \end{pmatrix} = \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix} + \begin{pmatrix} 0 & C \\ C^* & 0 \end{pmatrix}.$$
These are $2N \times 2N$ selfadjoint matrices of the type considered before.
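A quick numerical check of this hermitization trick (illustration, not from the slides): the selfadjoint $2N \times 2N$ matrix $T$ has eigenvalues $\pm$ the singular values of $H$, so $T^2 = \mathrm{diag}(HH^*, H^*H)$ indeed carries the distribution of $HH^*$.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 50
H = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))

# T = [[0, H], [H*, 0]] is selfadjoint
T = np.block([[np.zeros((N, N)), H], [H.conj().T, np.zeros((N, N))]])
evals = np.sort(np.linalg.eigvalsh(T))
svals = np.linalg.svd(H, compute_uv=False)

expected = np.sort(np.concatenate([svals, -svals]))
print(np.max(np.abs(evals - expected)))   # ~ 0
```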
We have: $\begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix}$ and $\begin{pmatrix} 0 & C \\ C^* & 0 \end{pmatrix}$ are free over $D_{2N}(\mathbb{C})$, and $\begin{pmatrix} 0 & C \\ C^* & 0 \end{pmatrix}$ is semicircular over $D_{2N}(\mathbb{C})$ with
$$\eta\begin{pmatrix} B_1 & 0 \\ 0 & B_2 \end{pmatrix} = \begin{pmatrix} \eta_1(B_2) & 0 \\ 0 & \eta_2(B_1) \end{pmatrix}, \qquad \eta_1(B_2) = E_{D_N}[C B_2 C^*], \quad \eta_2(B_1) = E_{D_N}[C^* B_1 C].$$
We have $G_T(z) = z\, G_{T^2}(z^2)$ and $G_{T^2}(z) = \begin{pmatrix} G_1(z) & 0 \\ 0 & G_2(z) \end{pmatrix}$. Thus
$$z \begin{pmatrix} G_1(z^2) & 0 \\ 0 & G_2(z^2) \end{pmatrix} = G_T(z) = E_{D_{2N}}\left[\left(z - \eta\big(G_T(z)\big) - \begin{pmatrix} 0 & A \\ A^* & 0 \end{pmatrix}\right)^{-1}\right] = E_{D_{2N}}\left[\begin{pmatrix} z - z\,\eta_1\big(G_2(z^2)\big) & -A \\ -A^* & z - z\,\eta_2\big(G_1(z^2)\big) \end{pmatrix}^{-1}\right].$$
This yields
$$z\,G_1(z) = E_{D_N}\left[\Big(1 - \eta_1\big(G_2(z)\big) - A_N \big(z - z\,\eta_2(G_1(z))\big)^{-1} A_N^*\Big)^{-1}\right]$$
and
$$z\,G_2(z) = E_{D_N}\left[\Big(1 - \eta_2\big(G_1(z)\big) - A_N^* \big(z - z\,\eta_1(G_2(z))\big)^{-1} A_N\Big)^{-1}\right].$$
These are actually the fixed point equations of [Hachem, Loubaton, Najim, Ann. Appl. Prob. 2007] for a deterministic equivalent (à la Girko) of the square of a random matrix with non-centered, independent Gaussians with non-constant variance as entries.
Conclusion:
many approximations (like deterministic equivalents à la Girko) consist in replacing independent Gaussians by free (semi)circulars;
operator-valued free probability allows a conceptual and streamlined treatment of those;
convergence questions might also be treated more uniformly by relying on asymptotic freeness results;
this approach also allows one to treat classes of random matrices with correlations between entries.