Operator norm convergence for sequence of matrices and application to QIT

Benoît Collins, University of Ottawa & AIMR, Tohoku University. Cambridge, INI, October 15, 2013

Overview

Plan:
1. Norm convergence for simple random matrices.
2. Application: threshold for the APPT property.
3. Norm convergence for multiple random matrices.
4. Application: converse threshold for PPT.
5. Application: convergence of the collection of output states.

Random matrices

Random matrix theory: probability theory with matrix-valued random variables, and the study of their properties as $d \to \infty$.

Notation: let $X_d \in M_d(\mathbb{C})$ be a (self-adjoint or normal) (random) matrix. Let $\lambda_i^{(d)}$ be its eigenvalues, and let $\mu_d = d^{-1} \sum_i \delta_{\lambda_i^{(d)}}$ be the eigenvalue counting measure (histogram of eigenvalues).

Classical RMT question: when does the (random) probability measure $\mu_d$ have an interesting behaviour as $d \to \infty$?

More recent RMT question: how about the (random) set $\operatorname{supp}(\mu_d)$? (largest eigenvalue)

Single random matrix: Wishart matrices

Let $G \in M_{d \times s}(\mathbb{C})$ be a Ginibre random matrix (the entries $\{G_{ij}\}$ are i.i.d. standard complex Gaussian random variables). Let $W = W_d = GG^*$ be the corresponding Wishart matrix of parameters $(d, s)$ ($G^*$ denotes the Hermitian adjoint of $G$).

The Marchenko-Pastur (or free Poisson) probability distribution $\pi_c$ ($c > 0$) is defined as follows:

$$\pi_c = \max(1 - c, 0)\,\delta_0 + \frac{\sqrt{(x-a)(b-x)}}{2\pi x}\,1_{[a,b]}(x)\,dx, \qquad (1)$$

where $a = (\sqrt{c} - 1)^2$ and $b = (\sqrt{c} + 1)^2$.
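A quick numerical sanity check of the density in (1) can be sketched as follows. The function names and the midpoint quadrature are illustrative choices, not part of the talk. The absolutely continuous part should carry mass $\min(1, c)$ (the atom at 0 carries the rest), and the mean of $\pi_c$ should equal $c$.

```python
import numpy as np

def mp_density(x, c):
    """Absolutely continuous part of the Marchenko-Pastur law pi_c at points x > 0."""
    a = (np.sqrt(c) - 1.0) ** 2
    b = (np.sqrt(c) + 1.0) ** 2
    inside = np.clip((x - a) * (b - x), 0.0, None)  # zero outside [a, b]
    return np.sqrt(inside) / (2.0 * np.pi * x)

def mp_mass_and_mean(c, n=200_000):
    """Midpoint-rule estimates of the mass and the mean of the continuous part."""
    a = (np.sqrt(c) - 1.0) ** 2
    b = (np.sqrt(c) + 1.0) ** 2
    h = (b - a) / n
    x = a + h * (np.arange(n) + 0.5)  # midpoints of the grid cells
    f = mp_density(x, c)
    return h * f.sum(), h * (x * f).sum()

for c in (0.5, 2.0):
    mass, mean = mp_mass_and_mean(c)
    print(c, mass, mean)
```

For $c < 1$ the missing mass $1 - c$ is exactly the weight of $\delta_0$ in (1).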

Ginibre and Wishart ensembles

Notation: let $\gamma$ be the full cycle $\gamma = (1\,2\,\cdots\,k)$. For any permutation $\alpha$, let $|\alpha|$ be its length on the Cayley graph of $S_k$ generated by transpositions. The geodesic inequality $|\alpha| + |\alpha^{-1}\gamma| \ge |\gamma| = k - 1$ holds. We define $g(\alpha)$, the genus of a permutation $\alpha \in S_k$, as

$$g(\alpha) = \frac{|\alpha| + |\alpha^{-1}\gamma| - |\gamma|}{2}. \qquad (2)$$
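The genus can be computed by brute force for small $k$. The sketch below assumes permutations are stored as 0-indexed tuples and uses the standard fact that the transposition length of a permutation is $k$ minus its number of cycles; the helper names are hypothetical. It also counts the genus-zero permutations of $S_4$, which are in bijection with non-crossing partitions, so the count should be the Catalan number 14.

```python
from itertools import permutations

def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple (perm[i] = image of i)."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = perm[j]
    return cycles

def length(perm):
    """|alpha|: minimal number of transpositions, equal to k minus the cycle count."""
    return len(perm) - cycle_count(perm)

def genus(alpha):
    """g(alpha) = (|alpha| + |alpha^{-1} gamma| - |gamma|) / 2 for the full cycle gamma."""
    k = len(alpha)
    gamma = tuple((i + 1) % k for i in range(k))  # full cycle, 0-indexed
    inv = [0] * k
    for i, image in enumerate(alpha):
        inv[image] = i
    alpha_inv_gamma = tuple(inv[gamma[i]] for i in range(k))  # i -> alpha^{-1}(gamma(i))
    twice_g = length(alpha) + length(alpha_inv_gamma) - length(gamma)
    assert twice_g >= 0 and twice_g % 2 == 0  # geodesic inequality and parity
    return twice_g // 2

genus_zero_count = sum(1 for a in permutations(range(4)) if genus(a) == 0)
print(genus_zero_count)
```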

Let $W_d$ be a $d \times d$ Wishart matrix of parameters $(d, s)$ and let $Z_d = \sqrt{ds}\left(\frac{W_d}{ds} - \frac{I_d}{d}\right)$ be its centered and renormalized version.

Theorem. The moments of $Z_d$ are given by

$$\mathbb{E}\left[\frac{1}{d}\operatorname{Tr}(Z_d^p)\right] = \sum_{\alpha \in S_p^o} \left(\frac{d}{s}\right)^{|\alpha| - p/2} d^{-2g(\alpha)}. \qquad (3)$$
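As a sanity check of the moment formula, one can work out the case $p = 2$ by hand. This sketch reads $Z_d$ as $\sqrt{ds}\,(W_d/(ds) - I_d/d)$ and assumes that $S_p^o$ denotes the permutations of $S_p$ without fixed points (the slide does not define it). Under that assumption the only term is the transposition $\alpha = (1\,2)$, with $|\alpha| = 1$, $\alpha^{-1}\gamma = \mathrm{id}$ and $g(\alpha) = 0$, so the right-hand side equals $(d/s)^0 d^0 = 1$. The left-hand side follows from the standard identities $\mathbb{E}\operatorname{Tr}(W_d) = ds$ and $\mathbb{E}\operatorname{Tr}(W_d^2) = ds^2 + d^2 s$:

```latex
\begin{aligned}
Z_d^2 &= \frac{W_d^2}{ds} - \frac{2\,W_d}{d} + \frac{s}{d}\, I_d, \\
\mathbb{E}\operatorname{Tr}(Z_d^2)
  &= \frac{ds^2 + d^2 s}{ds} - \frac{2\,ds}{d} + s
   = (s + d) - 2s + s = d, \\
\mathbb{E}\left[\tfrac{1}{d}\operatorname{Tr}(Z_d^2)\right] &= 1 .
\end{aligned}
```

Both sides agree exactly for every $d$ and $s$, which is consistent with the fixed-point-free reading of $S_p^o$.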

The previous theorem implies that $\int x^k\,\mu_d(dx) \to \int x^k\,\pi_c(dx)$ (only genus zero terms survive):

Theorem. Assuming $s/d \to c$, with the renormalization $W_d/\sqrt{sd}$, $\mu_d \to \pi_c$.

However, the only thing we can say about $\operatorname{supp}(\mu_d)$ at this point is

$$\lim_{d \to \infty} \overline{\bigcup_{d' \ge d} \operatorname{supp}(\mu_{d'})} \supseteq \operatorname{supp}(\pi_c).$$

The converse inclusion is not clear.

In general, convergence of $\mu_d$ to something is ensured by convergence of moments (under a reasonable boundedness assumption). In order to ensure the convergence of $\operatorname{supp}(\mu_d)$, we need to look at moments of order $k$ that grows as $d$ grows. [For example, $d^{-1}\operatorname{Tr}\operatorname{diag}(1+\varepsilon, 1, \ldots, 1)^k$ behaves like $d^{-1}\operatorname{Tr}\operatorname{diag}(1, 1, \ldots, 1)^k = 1$ at $k$ fixed, $d \to \infty$. The situation starts to change when $k \gg \log d$: this is the minimal growth of $k$ needed to understand $\operatorname{supp}(\mu_d)$.]
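The diagonal example can be made concrete with a few lines of arithmetic. The sketch below uses the closed form $d^{-1}\operatorname{Tr}\operatorname{diag}(1+\varepsilon, 1, \ldots, 1)^k = ((1+\varepsilon)^k + d - 1)/d$; the specific values of $d$, $\varepsilon$ and the constant 3 in front of $\log d$ are arbitrary illustrative choices.

```python
import math

def normalized_trace_power(d, eps, k):
    """d^{-1} Tr diag(1+eps, 1, ..., 1)^k, computed in closed form."""
    return ((1.0 + eps) ** k + (d - 1)) / d

d, eps = 10**6, 0.1

# Fixed k: the single outlier eigenvalue 1+eps is invisible in the moment.
fixed_k = normalized_trace_power(d, eps, 10)

# k of order log d: the outlier now dominates the moment.
k_large = math.ceil(3 * math.log(d) / math.log(1 + eps))
growing_k = normalized_trace_power(d, eps, k_large)

print(fixed_k, k_large, growing_k)
```

With $k$ fixed the moment is indistinguishable from 1, while for $k$ a large enough multiple of $\log d$ the term $(1+\varepsilon)^k/d$ blows up, which is how high moments detect the edge of the spectrum.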

Theorem. In the situation where $s/d$ converges and $d \to \infty$, the extremal eigenvalues of $W_d/d$ converge almost surely to $a, b$. And almost surely, when $1 \ll d \ll s$, the extremal eigenvalues of $\sqrt{ds}\left(W_d/(ds) - I_d/d\right)$ converge to $\pm 2$.
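The first statement is easy to observe numerically. The following is a minimal simulation sketch, assuming the normalization $\mathbb{E}|G_{ij}|^2 = 1$ for a standard complex Gaussian; the dimensions, the seed and the helper name `wishart` are illustrative choices. It compares the extreme eigenvalues of $W_d/d$ with $a = (\sqrt{c}-1)^2$ and $b = (\sqrt{c}+1)^2$ for $c = s/d$.

```python
import numpy as np

def wishart(d, s, rng):
    """W = G G^* for a d x s Ginibre matrix with standard complex Gaussian entries."""
    g = (rng.standard_normal((d, s)) + 1j * rng.standard_normal((d, s))) / np.sqrt(2.0)
    return g @ g.conj().T

rng = np.random.default_rng(0)
d, s = 200, 400
c = s / d

eigs = np.linalg.eigvalsh(wishart(d, s, rng) / d)  # sorted in ascending order
a, b = (np.sqrt(c) - 1) ** 2, (np.sqrt(c) + 1) ** 2

print("smallest eigenvalue:", eigs[0], "vs a =", a)
print("largest eigenvalue: ", eigs[-1], "vs b =", b)
```

At finite $d$ the extreme eigenvalues sit slightly inside $[a, b]$; the gap shrinks as $d$ grows.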

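Application: APPT

A quantum state $\rho \in D(\mathbb{C}^{d_1} \otimes \mathbb{C}^{d_2})$ is absolutely PPT (or APPT) if for any unitary matrix $U \in U(d)$, $U\rho U^* \in \mathrm{PPT}$. Equivalently,

$$\mathrm{APPT} = \bigcap_{U \in U(d)} U(\mathrm{PPT})U^* \subset \mathrm{PPT}.$$

APPT is a convex body, i.e. a convex compact set with non-empty interior. Known fact: $\varepsilon\rho + (1 - \varepsilon)\frac{I}{d} \in S \subset \mathrm{PPT}$ for some $\varepsilon < \frac{1}{d-1}$.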
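The PPT condition itself is simple to test numerically. The sketch below is illustrative only: the function names and the tolerance are assumptions, and it checks PPT for a given state, not APPT (which would require a condition on the spectrum alone). It transposes the second tensor factor and looks at the smallest eigenvalue.

```python
import numpy as np

def partial_transpose(rho, d1, d2):
    """Transpose the second tensor factor of a (d1*d2) x (d1*d2) matrix."""
    r = rho.reshape(d1, d2, d1, d2)  # indices (i, j, i', j')
    return r.transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)  # swap j and j'

def is_ppt(rho, d1, d2, tol=1e-12):
    """True if the partial transpose of rho is positive semidefinite (up to tol)."""
    return np.linalg.eigvalsh(partial_transpose(rho, d1, d2)).min() >= -tol

# Maximally entangled two-qubit state: its partial transpose has eigenvalue -1/2.
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
bell = np.outer(psi, psi.conj())

# Maximally mixed state: invariant under every unitary conjugation, hence APPT.
mixed = np.eye(4) / 4

print(is_ppt(bell, 2, 2), is_ppt(mixed, 2, 2))
```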

Setup: $d = d_1 d_2$; we pick a Wishart matrix of parameters $(d, s)$ and try to check the APPT property as $s, d \to \infty$. Let $p = \min(d_1, d_2)$.

When $p = \min(d_1, d_2) \to \infty$, we have the following (almost sharp) threshold estimate.

Theorem (C, Nechita, Ye). Let $\rho$ be a random state with parameters $(d, s)$.
(i) Almost surely, when $d \to \infty$ and $s > (4 + \varepsilon)p^2 d$, the quantum state $\rho$ is APPT;
(ii) when $1 \ll p^2 \ll d$ and $s < (4 - \varepsilon)p^2 d$, $\rho$ is not APPT almost surely;
(iii) when $p^2 \ge \tau d$ for a constant $\tau \in (0, 1]$, there exists a constant $C_\tau$ such that whenever $s < 4(C_\tau - \varepsilon)p^2 d$, $\rho$ is not APPT almost surely.

When $p = \min(d_1, d_2)$ is fixed and $s/d \to c$ for a constant $c > 0$ as $d \to \infty$, we obtain a sharp estimate on the threshold for APPT.

Theorem (C, Nechita, Ye). Let $\rho$ be a random induced state distributed according to the measure $\mu_{d,s}$. Almost surely, when $d \to \infty$ and $s \sim cd$, one has:
(i) $\rho \in \mathrm{APPT}$ if $c > \left(p + \sqrt{p^2 - 1}\right)^2$;
(ii) $\rho \notin \mathrm{APPT}$ if $c < \left(p + \sqrt{p^2 - 1}\right)^2$.
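To get a feeling for the sharp constant, the short sketch below (the function name is hypothetical) tabulates $(p + \sqrt{p^2 - 1})^2$ next to $4p^2$, the constant appearing in the threshold $s \approx 4p^2 d$ of the regime where $p$ grows. Expanding the square gives $(p + \sqrt{p^2-1})^2 = 4p^2 - 2 - O(p^{-2})$, so the two constants agree to leading order.

```python
import math

def appt_threshold(p):
    """(p + sqrt(p^2 - 1))^2, the critical value of c = lim s/d for fixed p."""
    return (p + math.sqrt(p * p - 1)) ** 2

for p in (2, 3, 10, 100):
    print(p, appt_threshold(p), 4 * p * p)
```

For $p = 2$ the threshold is $(2 + \sqrt{3})^2 = 7 + 4\sqrt{3} \approx 13.93$.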

Multimatrices

So far we only dealt with one matrix. What if we take a bunch of i.i.d. random matrices and take non-commutative (NC) polynomials in them? [E.g. understand the behaviour of $W_d^{(1)} W_d^{(2)} + W_d^{(2)} W_d^{(1)}$ if $W_d^{(1)}, W_d^{(2)}$ are independent copies.] In principle, the method of moments allows us to understand the limiting behaviour of $\mu_d$. But the limiting behaviour of $\operatorname{supp}(\mu_d)$ is much more difficult to understand.

Linearization trick

Roughly speaking: understanding $\operatorname{supp}(\mu_d)$ for all non-commutative polynomials in $W_d^{(1)}, W_d^{(2)}$ [$W_d^{(1)} W_d^{(2)} + W_d^{(2)} W_d^{(1)}$ and all the others] is equivalent to understanding $\operatorname{supp}(\mu_d)$ for all $a_0 \otimes 1_d + a_1 \otimes W_d^{(1)} + a_2 \otimes W_d^{(2)}$, for all self-adjoint matrices $a_0, a_1, a_2 \in M_k(\mathbb{C})$ and all $k$.

Advantage: no non-commutative multiplication ("linearization").

Price to pay: (1) allow matrix coefficients; (2) give up speed of convergence (global equivalence).

Extra bonus: we obtain for free the understanding of non-commutative polynomials with matrix coefficients [e.g. $a_1 \otimes W_d^{(1)} W_d^{(2)} + a_2 \otimes W_d^{(2)} W_d^{(1)}$].
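The objects on the "linear" side are easy to build numerically with Kronecker products. The sketch below only constructs one operator of the form $a_0 \otimes 1_d + a_1 \otimes W^{(1)} + a_2 \otimes W^{(2)}$ and computes its spectrum; it does not carry out the linearization of a particular polynomial. The coefficient matrices, dimensions and helper names are arbitrary illustrative choices.

```python
import numpy as np

def wishart(d, s, rng):
    """W = G G^* for a d x s Ginibre matrix with standard complex Gaussian entries."""
    g = (rng.standard_normal((d, s)) + 1j * rng.standard_normal((d, s))) / np.sqrt(2.0)
    return g @ g.conj().T

def linear_pencil(a0, a1, a2, w1, w2):
    """a0 (x) 1_d + a1 (x) W1 + a2 (x) W2 as a (k*d) x (k*d) matrix."""
    d = w1.shape[0]
    return np.kron(a0, np.eye(d)) + np.kron(a1, w1) + np.kron(a2, w2)

rng = np.random.default_rng(1)
d, s, k = 50, 100, 2
w1 = wishart(d, s, rng) / d
w2 = wishart(d, s, rng) / d

# Self-adjoint k x k coefficients (hypothetical example values).
a0 = np.zeros((k, k), dtype=complex)
a1 = np.array([[0, 1], [1, 0]], dtype=complex)
a2 = np.array([[1, 0], [0, -1]], dtype=complex)

pencil = linear_pencil(a0, a1, a2, w1, w2)
spectrum = np.linalg.eigvalsh(pencil)  # valid because the pencil is self-adjoint
print(spectrum[0], spectrum[-1])
```

Because the coefficients and the Wishart matrices are self-adjoint, so is the pencil, and its extreme eigenvalues give the operator norm that the strong convergence results control.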

Free probability and random matrices

A non-commutative probability space: a unital algebra $A$ with a tracial state $\varphi$ (its elements are non-commutative random variables, NCRV). E.g. random matrices $(M_d(L^{\infty-}(\Omega, P)), \mathbb{E}[d^{-1}\operatorname{Tr}(\cdot)])$.

Let $A_1, \ldots, A_k$ be subalgebras of $A$. They are free if for all $a_i \in A_{j_i}$ ($i = 1, \ldots, p$) such that $\varphi(a_i) = 0$, one has $\varphi(a_1 \cdots a_p) = 0$ as soon as $j_1 \ne j_2, j_2 \ne j_3, \ldots, j_{p-1} \ne j_p$.

Convergence in distribution = pointwise convergence of moments.

Sequences of random variables $(a_1^{(d)})_d, \ldots, (a_k^{(d)})_d$ are called asymptotically free as $d \to \infty$ iff the $k$-tuple $(a_1^{(d)}, \ldots, a_k^{(d)})_d$ converges in distribution towards a family of free random variables.

Theorem (Voiculescu). Let $U_1, \ldots, U_k, \ldots$ be a collection of independent Haar distributed random unitary matrices of $M_d(\mathbb{C})$ and $(W_i^d)_{i \in I}$ be a set of constant matrices of $M_d(\mathbb{C})$ admitting a joint limit distribution for large $d$ with respect to the state $d^{-1}\operatorname{Tr}$. Then the family $((U_1, U_1^*), \ldots, (U_k, U_k^*), \ldots, (W_i))$ admits a limit distribution, and is asymptotically free with respect to $\mathbb{E}(d^{-1}\operatorname{Tr})$.

Strong convergence = in addition, convergence of the set of singular values towards the limiting set in the Hausdorff distance sense.

Theorem (C, Male). Let $U_1, \ldots, U_k, \ldots$ be a collection of independent Haar distributed random unitary matrices of $M_d(\mathbb{C})$ and $(W_i^d)_{i \in I}$ be a set of constant matrices of $M_d(\mathbb{C})$ admitting a STRONG joint limit distribution for large $d$ with respect to the state $d^{-1}\operatorname{Tr}$. Then the family $((U_1, U_1^*), \ldots, (U_k, U_k^*), \ldots, (W_i))$ admits a STRONG limit distribution, and is STRONGLY asymptotically free with respect to $\mathbb{E}(d^{-1}\operatorname{Tr})$.
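Asymptotic freeness can be observed in a small simulation. The sketch below makes several illustrative assumptions: the Haar unitary is sampled by the usual QR-with-phase-correction recipe, the deterministic matrix is a traceless symmetry $a$, and the dimension and seed are arbitrary. Freeness of $a$ and $b = UaU^*$ predicts that the mixed moments $\varphi(ab)$ and $\varphi(abab)$ of these centered elements vanish in the limit, whereas for commuting copies ($b = a$) one would get $\varphi(abab) = 1$.

```python
import numpy as np

def haar_unitary(d, rng):
    """Haar-distributed unitary via QR of a Ginibre matrix, with phase correction."""
    g = (rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))) / np.sqrt(2.0)
    q, r = np.linalg.qr(g)
    phases = np.diagonal(r) / np.abs(np.diagonal(r))
    return q * phases  # multiplies column j of q by phases[j]

def normalized_trace(m):
    """d^{-1} Tr(m), real part (the imaginary part is numerical noise here)."""
    return np.trace(m).real / m.shape[0]

rng = np.random.default_rng(2)
d = 400
a = np.diag(np.where(np.arange(d) < d // 2, 1.0, -1.0))  # tr(a) = 0 and a^2 = 1
u = haar_unitary(d, rng)
b = u @ a @ u.conj().T  # randomly rotated copy of a

mixed_2 = normalized_trace(a @ b)
mixed_4 = normalized_trace(a @ b @ a @ b)
print(mixed_2, mixed_4)
```

Both printed values are small at this dimension and shrink further as $d$ grows.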

Application: threshold for asymmetric PPT

In $M_d \otimes M_n$, Banica and Nechita (arXiv:1105.2556) consider the partial transpose of Wishart matrices of parameters $(dn, dm)$ and prove that as $d \to \infty$ ($m, n$ fixed), $\mu_d$ converges to a free difference of Marchenko-Pastur distributions of parameters $m(n \pm 1)/2$. They prove that the spectrum of this distribution is positive iff $n \le m/4 + 1/m$ and $m \ge 2$. This proves that PPT does not hold with high probability if one of these conditions is violated.

Application: simultaneous behaviour of quantum channels

Let $\pi_d$ be a (random) projection in $M_k \otimes M_d$ having the following property: $\forall A \in M_k$, the pair of matrices $(\pi_d, A \otimes 1_d)$ converges strongly. Let $\Phi_d$ be the channel $\operatorname{End}(\operatorname{Im}(\pi_d)) \to M_k$ obtained by taking the partial trace over $M_d$. Let $\chi$ be any other quantum channel $M_p \to M_q$.

Theorem (C, Fukuda, Nechita). The collection $\Phi_d(\text{pure states})$ converges strongly to a convex body (the dual of $\{A : \lim_d \|\pi_d (A \otimes 1_d) \pi_d\| \le 1\}$).

Example 1: $\pi_d$ is a random projection of rank $tkd$, $t \in [0, 1]$ (allows us to obtain violation of MOE additivity as large as $\log 2$).

Example 2: $\pi_d = k^{-1}\bigl(U_i^{(d)} U_j^{(d)*}\bigr)_{i,j}$ (complementary channel of $\Phi_d : X \mapsto k^{-1}\sum_i U_i X U_i^*$). This allows us to prove that $\|\Phi_d\|_{S_1 \to S_\infty} = 4(k-1)/k^2$, $k \ge 2$.
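Example 2 can be illustrated with a short construction. The sketch below assumes the block form $\pi_d = k^{-1}(U_i U_j^*)_{i,j}$, with the adjoint on the second factor; this placement is a reading of the slide, chosen because it is the range projection $VV^*$ of the isometry $Vx = k^{-1/2}\sum_i e_i \otimes U_i x$ associated with the random unitary channel. The helper names, sizes and seed are illustrative. The code checks that the resulting matrix in $M_k \otimes M_d$ is a self-adjoint projection of rank $d$.

```python
import numpy as np

def haar_unitary(d, rng):
    """Haar-distributed unitary via QR of a Ginibre matrix, with phase correction."""
    g = (rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))) / np.sqrt(2.0)
    q, r = np.linalg.qr(g)
    phases = np.diagonal(r) / np.abs(np.diagonal(r))
    return q * phases  # multiplies column j of q by phases[j]

def block_projection(unitaries):
    """k^{-1} (U_i U_j^*)_{i,j}, built as V V^* with V the stacked isometry."""
    k = len(unitaries)
    v = np.vstack(unitaries) / np.sqrt(k)  # (k*d) x d matrix with V^* V = 1_d
    return v @ v.conj().T

rng = np.random.default_rng(3)
k, d = 3, 20
pi_d = block_projection([haar_unitary(d, rng) for _ in range(k)])
rank = round(np.trace(pi_d).real)
print(pi_d.shape, rank)
```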

Selected References

(1) The strong asymptotic freeness of Haar and deterministic matrices, arXiv:1105.4345. With C. Male. To appear in Annales Scientifiques de l'ENS.
(2) Laws of large numbers for eigenvectors and eigenvalues associated to random subspaces in a tensor product, arXiv:1008.3099. With S. Belinschi and I. Nechita. Inventiones Mathematicae, December 2012, Volume 190, Issue 3, pp. 647-697.
(3) Almost one bit violation for the additivity of the minimum output entropy, arXiv:1305.1567. With S. Belinschi and I. Nechita.
(4) The absolute positive partial transpose property for random induced states, arXiv:1108.1935. With I. Nechita and D. Ye. To appear in RMTA.
(5) In preparation, with M. Fukuda and I. Nechita.
(6) In preparation, with P. Hayden and I. Nechita.