From nonnegative matrices to nonnegative tensors Shmuel Friedland Univ. Illinois at Chicago 7 October, 2011 Hamilton Institute, National University of Ireland
Overview 1 Perron-Frobenius theorem for irreducible nonnegative matrices 2 Irreducibility and weak irreducibility for tensors 3 Perron-Frobenius for irreducible tensors 4 Kingman, Karlin-Ost and Friedland inequalities 5 Maxplus eigenvalues of tensors 6 Friedland-Karlin characterization of spectral radius 7 Diagonal scaling 8 Rank one approximation of tensors 9 Nonnegative multilinear forms
Nonnegative irreducible and primitive matrices A = [a ij ] R n n + induces digraph DG(A) = DG = (V, E) V = [n] := {1,..., n}, E [n] [n], (i, j) E a ij > 0 DG strongly connected, SC, if for each pair i j there exists a dipath from i to j Claim: DG SC iff for each I [n] j [n] \ I and i I s.t. (i, j) E A -primitive if A N > 0 for some N > 0 A N (R n + \ {0}) int R n + A primitive A irreducible and g.c.d of all cycles in DG(A) is one
Perron-Frobenius theorem PF: A R n + irreducible. Then 0 < ρ(a) spec (A) algebraically simple x, y > 0 Ax = ρ(a)x, A y = ρ(a)y. A R n n + primitive iff in addition to above λ < ρ(a) for λ spec (A) \ {ρ(a)} Collatz-Wielandt: ρ(a) = min x>0 max i [n] (Ax) i x i = max x>0 min i [n] (Ax) i x i
Irreducibility and weak irreducibility of nonnegative tensors F := [f i1,...,i d ] n i 1 =...=i d (C n ) d is called d-cube tensor, (d 3) F 0 if all entries are nonnegative F irreducible: for each I [n], there exists i I, j 2,..., j d J := [n] \ I s.t. f i,j2,...,j d > 0. D(F) digraph ([n], A): (i, j) A if there exists j 2,... j d [n] s.t. f i,j2,...,j d > 0 and j {j 2,..., j d }. F weakly irreducible if D(F) is strongly connected. Claim: irreducible implies weak irreducible For d = 2 irreducible and weak irreducible are equivalent Example of weak irreducible and not irreducible n = 2, d = 3, f 1,1,2, f 1,2,1, f 2,1,2, f 2,2,1 > 0 and all other entries of F are zero
Perron-Frobenius theorem for nonnegative tensors I F = [f i1,...,i d ] (C n ) d maps C n to itself (Fx) i = f i, x := i 2,...,i d [n] f i,i 2,...,i d x i2... x id, i [n] Note we can assume f i,i2,...,i d is symmetric in i 2,..., i d. F has eigenvector x = (x 1,..., x n ) C n with eigenvalue λ: (Fx) i = λx d 1 i for all i [n] Assume: F 0, (FR n + \ {0}) R n + \ {0} 1 F 1 : Π n Π n, x 1 (Fx) 1 d 1 n i=1 (Fx) d 1 i Brouwer fixed point: x 0 eigenvector with λ > 0 eigenvalue Problem When there is a unique positive eigenvector with maximal eigenvalue?
Perron-Frobenius theorem for nonnegative tensors II Theorem Chang-Pearson-Zhang 2009 [2] Assume F ((R n ) d ) + is irreducible. Then there exists a unique nonnegative eigenvector which is positive with the corresponding maximum eigenvalue λ. Furthermore the Collatz-Wielandt characterization holds λ = min x>0 max i [n] (Fx) i x d 1 i = max x>0 min i [n] (Fx) i x d 1 i Theorem Friedland-Gaubert-Han 2011 [5] Assume F ((R n ) d ) + is weakly irreducible. Then there exists a unique positive eigenvector with the corresponding maximum eigenvalue λ. Furthermore the Collatz-Wielandt characterization holds
Generalization of Kingman inequality: Friedland-Gaubert Kingman s inequality: D R m convex, A : D R n n +, A(t) = [a ij(t)], each log a ij (t) [, ) is convex, (entrywise logconvex) then log ρ(a) : D [, ) convex, (ρ(a( )) logconvex) Generalization: F : D ((R n ) d ) + entrywise logconvex then ρ(t ( )) is logconvex (L. Qi & collaborators) Proof Outline: F s = [f s i 1,...,i d ], (0 0 = 0), F G = [f i1,...,i d g i1,...,i d ] GKI: ρ(f α G β ) (ρ(f)) α (ρ(g)) β, α, β 0, α + β = 1 ( ) Assume F, G > 0, Fx = ρ(f)x (d 1), Gx = ρ(g)y (d 1) Hölder s inequality for p = α 1, q = β 1 yields ((F α G β )(x α y β )) i (Fx) α i (Gx) β i = (ρ(f)) α (ρ(g)) β (xi α y β i ) d 1 Collatz-Wielandt implies ( )
Karlin-Ost and Friedland inequalities-fg ρ(f s ) 1 s non-increasing on (0, ) ( ) Assume F > 0, s > 1 use y s non-increasing (F s x s ) 1 s i (Fx) i = ρ(f)x d 1 i use Collatz-Wielandt ρ trop (F) = lim s ρ(f s ) 1 s - the tropical eigenvalue of F. if F weakly irreducible then F has positive tropical eigenvector max i2,...,i d f i,i2,...,i d x i2... x id = ρ trop (F)x d 1 i, i [n], x > 0 Cor: ρ(f G) ρ(f 1 2 G 1 2 ) 2 ρ(f)ρ(g) ρ(f G) ρ(f p ) 1 p ρ(g q ) 1 q 1, p + 1 q = 1 p = 1, q = ρ(f G) ρ(f)ρ trop (G) pat(g) pattern of G, tensor with 0/1 entries obtained by replacing every non-zero entry of G by 1. F = pat(g) ρ(g) ρ(pat(g))ρ trop (G)
Characterization of ρ trop (F) I Friedland 1986: ρ trop (A) is the maximum geometric average of cycle products of A R n n +. D(F) := ([n], Arc), (i, j) Arc iff j 2,...,j d f i,j2,...,j d x j2... x jd contains x j. d 1 cycle on [m] vertices is d 1 outregular strongly connected subdigraph D = ([m], Arc) of D(F), i.e. the digraph adjacency matrix A(D) = [a ij ]) Z+ m m of subgraph is irreducible with each row sum d 1. A(D)1 = (d 1)1, v A(D) = (d 1)v, v = (v 1,..., v m ) > 0 probability vector Assume for simplicity d 1 cycle on [m] weighted-geometric average: m i=1 (f i,j 2 (i),...,j d (i)) v i Friedland-Gaubert: ρ trop (F) is the maximum weighted-geometric average of d 1 cycle products of F ((R n ) d ) + Cor. ρ trop (F) is logconvex in entries of T.
Characterization of ρ trop (F) II More general results Akian-Gaubert [1] Z = (z i1,...,i d ) ((R n ) d ) + occupation measure: i 1,...,i d z i1,...,i d = 1 and for all k [n] i,{j 2,...,j d } k z i,j 2,...,i d = (d 1) m 2,...,m d z k,m2,...,m d first sum is over i [n] and all j 2,..., j d [n] s. t. k {j 2,..., j d } Def: Z n,d all occupation measures Thm: log ρ trop (F) = max Z Zn,d j 1,...,j d [n] z j 1,...,j d log f i1,...,i d Proof: The extreme points of occupational measures correspond to geometric average
Diagonal similarity of nonnegative tensors F = [f i1,...,i d ] ((R n ) d ) + is diagonally similar to G = [g i1,...,i d ] ((R n ) d ) + if g i1,...,i d = e (d 1)t i 1 + d j=2 t i j f i1,...,i d for some t = (t 1,..., t n ) R n Diagonally similar tensors have the same eigenvalues and spectral radius generalization of Engel-Schneider [3], (Collatz-Wielandt) ρ trop (F) = inf (t1,...,t n) R n max i 1,...,i d e (d 1)t i 1 + d j=2 t i j f i1,...,i d
Generalized Friedland-Karlin inequality I Friedland-Karlin 1975: A R n n + irreducible, Au = ρ(a)u, A v = ρ(a)v, u v = (u 1 v 1,..., u n v n ) > 0 probability vector: log ρ(diag(e t )A) log ρ(a) + n i=1 u iv i t i (graph of convex function above its supporting hyperplane) (e t F) i1,...,i d = e t i 1 f i1,...,i d GFKI: F is weakly irreducible. A := D(u) (d 2) (Fx)(u), Au = ρ(a)u, A v = ρ(a)v and u v > 0 probability vector log ρ(diag(e t )F) log ρ(f) + n i=1 u iv i t i F super-symmetric: Fx = φ(x), φ homog. pol. degree d log ρ(diag(e t )F) log ρ(f) + n i=1 ud i t i, n i=1 ud i = 1
Generalized Friedland-Karlin inequality II min n x>0 i=1 u iv i log (Fx) i = log ρ(f) ( ) x d 1 i equality iff x the positive eigenvector of F. Gen. Donsker-Varadhan: ρ(f) = max p Πn inf n x>0 i=1 p i (Fx) i x d 1 i Prf: For x = u RHS ( ) ρ(t ). For p = u v ( ) RHS ( ) = ρ(f). Gen. Cohen: ρ(f) convex in (f 1,...,1,..., f n,...,n ): ρ(f + D) = max p Πn ( n i=1 p id i,...,i + inf n x>0 i=1 p i (Fx) i x d 1 i ) ( ) GFK: F weakly irreducible, positive diagonal, u, v > 0, u v Π n, t, s R n s.t. e t i 1 f i1,...,i d e s i 2 +...+s id with eigenvector u and v left eigenvector of D(u) (d 2) Fx(u) PRF: Strict convex function g(z) = n i=1 u iv i (log Fe z (d 1)z i ) achieves unique minimum for some z = log x, as g( (R n + \ {0}) = F super-symmetric and v = u d 1 then t = s
Scaling of nonnegative tensors to tensors with given row, column and depth sums 0 T = [t i,j,k ] R m n l has given row, column and depth sums: r = (r 1,..., r m ), c = (c 1,..., c n ), d = (d 1,..., d l ) > 0: j,k t i,j,k = r i > 0, i,k t i,j,k = c j > 0, i,j t i,j,k = d k > 0 m i=1 r i = n j=1 c j = lk=1 d k Find nec. and suf. conditions for scaling: T = [t i,j,k e x i +y j +z k ], x, y, z such that T has given row, column and depth sum Solution: Convert to the minimal problem: min r x=c y=d z=0 f T (x, y, z), f T (x, y, z) = i,j,k t i,j,ke x i +y j +z k Any critical point of f T on S := {r x = c y = d z = 0} gives rise to a solution of the scaling problem (Lagrange multipliers) f T is convex f T is strictly convex implies T is not decomposable: T T 1 T 2. For matrices indecomposability is equivalent to strict convexity
Scaling of nonnegative tensors II if f T is strictly convex and is on S, f T achieves its unique minimum Equivalent to: the inequalities x i + y j + z k 0 if t i,j,k > 0 and equalities r x = c y = d z = 0 imply x = 0 m, y = 0 n, z = 0 l. Fact: For r = 1 m, c = 1 n, d = 1 l Sinkhorn scaling algorithm works. Newton method works, since the scaling problem is equivalent finding the unique minimum of strict convex function Hence Newton method has a quadratic convergence versus linear convergence of Sinkhorn algorithm True for matrices too
Rank one approximations R m n l IPS: A, B = m,n,l i=j=k a i,j,kb i,j,k, T = T, T x y z, u v w = (u x)(v y)(w z) X subspace of R m n l, X 1,..., X d an orthonormal basis of X P X (T ) = d i=1 T, X i X i, P X (T ) 2 = d i=1 T, X i 2 T 2 = P X (T ) 2 + T P X (T ) 2 Best rank one approximation of T : min x,y,z T x y z = min x = y = z =1,a T a x y z Equivalent: max x = y = z =1 m,n,l i=j=k t i,j,kx i y j z k Lagrange multipliers: T y z := j=k=1 t i,j,ky j z k = λx T x z = λy, T x y = λz λ singular value, x, y, z singular vectors How many distinct singular values are for a generic tensor?
l p maximal problem and Perron-Frobenius (x 1,..., x n ) p := ( n i=1 x i p ) 1 p Problem: max m,n,l x p= y p= z p=1 i=j=k t i,j,kx i y j z k Lagrange multipliers: T y z := j=k=1 t i,j,ky j z k = λx p 1 T x z = λy p 1, T x y = λz p 1 (p = p = 3 is most natural in view of homogeneity Assume that T 0. Then x, y, z 0 2t 2s 1, t, s N) For which values of p we have an analog of Perron-Frobenius theorem? Yes, for p 3, No, for p < 3, Friedland-Gauber-Han [5]
Nonnegative multilinear forms Associate with T = [t i1,...,i d ] R m 1... m d + a multilinear form f (x 1,... x d ) : R m 1... m d R f (x 1,..., x d ) = i j [m j ],j [d] t i 1,...,i d x i1,1... x id,d, x i = (x 1,i,..., x mi,i R m i For u R m, p (0, ] let u p := ( m i=1 u i p ) 1 p and S m 1 p,+ := {0 u Rm, u p = 1} For p 1,..., p d (1, ) critical point (ξ 1,..., ξ d ) S m 1 1 p 1,+ of f S m 1 1 p 1,+... Sm d 1 p d,+ satisfies Lim [4]: ti1,...,i d x i1,1... x ij 1,j 1x ij+1,j+1... x id,d = λx p j 1 i j,j, i j [m j ], x j S p j 1 m j,+, j [d]... Sm d 1 p d,+
Perron-Frobenius theorem for nonnegative multilinear forms Theorem- Friedland-Gauber-Han [5] f : R m 1... R m d R, a nonnegative multilinear form, T weakly irreducible and p j d for j [d]. Then f has unique positive critical point on S m 1 1 +... S m d 1 +. If F is irreducible then f has a unique nonnegative critical point which is necessarily positive
Eigenvectors of homogeneous monotone maps on R n + Hilbert metric on PR n >0 : for x = (x 1,..., x n ), y = (y 1,..., y n ) > 0. Then dist(x, y) = max i [n] log y i x i min i [n] log y i x i. F = (F 1,..., F n ) : R n >0 Rn >0 homogeneous: F(tx) = tf(x) for t > 0, x > 0, and monotone F(y) F(x) if y x > 0. F induces ˆF : PR n >0 PRn >0 F nonexpansive with respect to Hilbert metric dist(f(x), F(y)) dist(x, y). α max x y β min x α max F(x) = F(α max x) F(y) F(β min x) = β min F(x) dist(f(x), F(y)) log β min α max = dist(x, y) x > 0 eigenvector of F if F(x) = λf(x). So x PR n + fixed point of F PR n +.
Existence of positive eigenvectors of F 1. If F contraction: dist(f(x), F(y)) K dist(x, y) for K < 1, then F has unique fixed point in PR n + and power iterations converge to the fixed point 2. Use Brouwer fixed and irreducibility to deduce existence of positive eigenvector 3. Gaubert-Gunawardena 2004: for u (0, ), J [n] let u J = (u 1,..., u n ) > 0: u i = u if i J and u i = 1 if i J. F i (u J ) nondecreasing in u. di-graph G(F) [n] [n]: (i, j) G(F) iff lim u F i (u {j} ) =. Thm: F : R n >0 Rn >0 homogeneous and monotone. If G(F) strongly connected then F has positive eigenvector
Uniqueness and convergence of power method for F Thm 2.5, Nussbaum 88: F : R n >0 Rn >0 homogeneous and monotone. Assume; u > 0 eigenvector F with the eigenvalue λ > 0, F is C 1 in some open neighborhood of u, A = DF(u) R+ n n ρ(a)(= λ) a simple root of det(xi A). Then u is a unique eigenvector of F in R n >0. Cor 2.5, Nus88: In the above theorem assume A = DF(u) is primitive. Let ψ 0, ψ u = 1. Define G : R n >0 Rn >0 G(x) = 1 ψ F(x) F(x) Then lim m G m (x) = u for each x R n >0.
Outline of the uniqueness of pos. crit. point of f Define: F : R m + R n + R l + R m + R n + R l +: ( F((x, y, z)) i,1 = x p 3 p ( F((x, y, z)) j,2 = y p 3 p ( F((x, y, z)) k,3 = z p 3 p ) 1 n,l j=k=1 t p 1 i,j,ky j z k, i = 1,..., m ) 1 m,l i=k=1 t p 1 i,j,kx i z k, j = 1,..., n ) 1 m,n i=j=1 t p 1 i,j,kx i y j, k = 1,..., l Assume n,l j=k=1 t i,j,k > 0, i = 1,..., m, m,l i=k=1 t i,j,k > 0, j = 1,..., n, m,n i=j=1 t i,j,k > 0, k = 1,..., l F 1-homogeneous monotone, maps open positive cone R m + R n + R l + to itself. T = [t i,j,k ] induces tri-partite graph on m, n, l : i m connected to j n and k l iff t i,j,k > 0, sim. for j, k If tri-partite graph is connected then F has unique positive eigenvector If F completely irreducible, i.e. F N maps nonzero nonnegative vectors to positive, nonnegative eigenvector is unique and positive
References I M. Akian and S. Gaubert, Spectral theorem for convex monotone homogeneous maps, and ergodic control, Nonlinear Analysis 52 (2003), 637-679. K.C. Chang, K. Pearson, and T. Zhang, Perron-Frobenius theorem for nonnegative tensors, Commun. Math. Sci. 6, (2008), 507-520. G.M. Engel and H. Schneider, Diagonal similarity and equivalence for matrices over groups with 0, Czechoslovak Mathematical Journal 25 (1975), 389 403. S. Friedland, Limit eigenvalues of nonnegative matrices, Linear Algebra Appl. 74 (1986), 173-178. S. Friedland, S. Gaubert and L. Han, Perron-Frobenius theorem for nonnegative multilinear forms, arxiv:0905.1626, to appear in LAA.
References II S. Friedland and S. Karlin, Some inequalities for the spectral radius of nonnegative matrices and applications, Duke Math. J. 42 (1975), 459-490. S. Gaubert and J. Gunawardena, The Perron-Frobenius theorem for homogeneous, monotone functions, Trans. Amer. Math. Soc. 356 (2004), 4931-4950. S. Karlin and F. Ost, Some monotonicity properties of Schur powers of matrice and related inequalities, Linear Algebra Appl. 68 (1985), 47 65. L.H. Lim, Singular values and eigenvalues of tensors: a variational approach, CAMSAP 05, 1 (2005), 129-132. M. Ng, L. Qi and G. Zhou, Finding the largest eigenvalue of a nonnegative tensor, SIAM J. Matrix Anal. Appl. 31 (2009), 1090 1099.
References III R. Nussbaum, Hilbert s projective metric and iterated nonlinear maps, Memoirs Amer. Math. Soc., 1988, vol. 75. L. Qi, Eigenvalues of a real supersymmetric tensor, J. Symbolic Comput. 40 (2005), 1302-1324.