University of Kentucky, Lexington
Department of Mathematics
Technical Report 2000-23

Pinchings and Norms of Scaled Triangular Matrices^1

Rajendra Bhatia^2    William Kahan^3    Ren-Cang Li^4

May 2000

ABSTRACT

Suppose $U$ is an upper-triangular matrix, and $D$ a nonsingular diagonal matrix whose diagonal entries appear in nondescending order of magnitude down the diagonal. It is proved that $\|D^{-1}UD\| \ge \|U\|$ for any matrix norm that is reduced by a pinching. In addition to known examples -- weakly unitarily invariant norms -- we show that any matrix norm defined by
$$\|A\| \stackrel{\rm def}{=} \max_{x\ne0,\,y\ne0} \frac{{\rm Re}\,(x^*Ay)}{\alpha(x)\,\beta(y)},$$
where $\alpha(\cdot)$ and $\beta(\cdot)$ are two absolute vector norms, has this property. This includes $\ell_p$ operator norms as a special case.

^1 This report is available on the web at http://www.ms.uky.edu/~rcli/.
^2 Indian Statistical Institute, New Delhi 110 016, India. email: rbh@isid.ernet.in
^3 Computer Science Division and Department of Mathematics, University of California at Berkeley, Berkeley, CA 94720. email: wkahan@cs.berkeley.edu
^4 Department of Mathematics, University of Kentucky, Lexington, KY 40506. email: rcli@ms.uky.edu. This work was supported in part by the National Science Foundation under Grant No. ACI-9721388 and by the National Science Foundation CAREER award under Grant No. CCR-9875201.
1 Introduction

Suppose $U = (u_{ij})$ is an upper-triangular matrix, i.e., $u_{ij} = 0$ if $i > j$, and $D = {\rm diag}(\delta_1, \delta_2, \ldots, \delta_n)$ a nonsingular diagonal matrix whose diagonal entries appear in ascending order of magnitude down the diagonal, i.e., $0 < |\delta_i| \le |\delta_j|$ if $i < j$. The matrix $D^{-1}UD$ is still upper-triangular and its nonzero entries satisfy
$$|\delta_i^{-1} u_{ij}\, \delta_j| = |\delta_j/\delta_i|\,|u_{ij}| \ge |u_{ij}| \eqno(1.1)$$
since $|\delta_i| \le |\delta_j|$. As a consequence, we would expect that for certain matrix norms $\|\cdot\|$
$$\|D^{-1}UD\| \ge \|U\|. \eqno(1.2)$$
Inequality (1.2) is clearly equivalent to $\|U\| \ge \|DUD^{-1}\|$, and if the norm $\|\cdot\|$ is invariant under either transpose or conjugate transpose, then (1.2) is equivalent to $\|DLD^{-1}\| \ge \|L\|$ and to $\|L\| \ge \|D^{-1}LD\|$, which are also equivalent to each other, where $L$ is a lower-triangular matrix.

Inequality (1.2) may be used to bound the componentwise relative condition number $\|\,|U^{-1}|\,|U|\,\|$ [4, p.37], [5] for solving a linear triangular system, where $|\cdot|$ of a matrix is understood entrywise. The inequality implies
$$\|\,|U^{-1}|\,|U|\,\| = \min\bigl\{ \|\,|D^{-1}U^{-1}|\,|UD|\,\| : D = {\rm diag}(\delta_i) \hbox{ with } 0 < |\delta_i| \le |\delta_j| \hbox{ for } i < j \bigr\}.$$

Inequality (1.2) is obviously true for any absolute matrix norm, since absolute matrix norms are increasing functions of the absolute values of the matrix's entries [6, Definition 5.5.9 and Theorem 5.5.10, p.285]. Frequently used absolute matrix norms are
$$|A|_p \stackrel{\rm def}{=} |(a_{ij})|_p = \Bigl(\sum_{i,j} |a_{ij}|^p\Bigr)^{1/p} \quad\hbox{for } p \ge 1,$$
including the Frobenius norm ($p = 2$) as a special case.

What about other matrix norms, especially the often used $\ell_p$ operator norms and the unitarily invariant norms? These norms are not, in general, increasing functions of the absolute values of the matrix entries. However, we shall prove that Inequality (1.2) is valid for them. This is true because they all satisfy the pinching inequality (see Section 2 below). More generally, we shall consider matrix norms defined by
$$\|A\| \stackrel{\rm def}{=} \max_{x\ne0,\,y\ne0} \frac{{\rm Re}\,(x^*Ay)}{\alpha(x)\,\beta(y)}, \eqno(1.3)$$
where $\alpha(\cdot)$ and $\beta(\cdot)$ are two absolute vector norms.
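The entrywise comparison (1.1) is easy to check numerically for an absolute operator norm. The following sketch (the matrices are arbitrary illustrative choices, not taken from the paper) verifies $\|D^{-1}UD\|_1 \ge \|U\|_1$ for the $\ell_1$ operator norm, i.e., the maximum absolute column sum:

```python
# Sketch: verify ||D^{-1} U D||_1 >= ||U||_1 for the l1 operator norm
# (max absolute column sum).  U and d are arbitrary illustrative choices.

def l1_op_norm(a):
    """l1 operator norm of a square matrix: max absolute column sum."""
    n = len(a)
    return max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))

def scale(u, d):
    """Return D^{-1} U D for D = diag(d); entry (i,j) picks up d[j]/d[i]."""
    n = len(u)
    return [[u[i][j] * d[j] / d[i] for j in range(n)] for i in range(n)]

U = [[1.0, -2.0,  0.5],
     [0.0,  3.0, -1.0],
     [0.0,  0.0,  2.0]]          # upper triangular
d = [1.0, 2.0, 5.0]              # ascending magnitudes down the diagonal

assert l1_op_norm(scale(U, d)) >= l1_op_norm(U)
```

Since every superdiagonal entry of $D^{-1}UD$ is at least as large in magnitude as the corresponding entry of $U$, any absolute norm can only grow under this scaling.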
Here ${\rm Re}\,[\cdot]$ denotes the real part of a complex number. These include the $\ell_p$ operator norms, defined by
$$\|A\|_p \stackrel{\rm def}{=} \max_{y\ne0} \frac{\|Ay\|_p}{\|y\|_p} = \max_{x\ne0,\,y\ne0} \frac{{\rm Re}\,(x^*Ay)}{\|x\|_q\,\|y\|_p} \quad\hbox{for } p \ge 1,$$
where $q$ is $p$'s dual, defined by $1/p + 1/q = 1$, and
$$\|x\|_p = \Bigl(\sum_j |\xi_j|^p\Bigr)^{1/p}$$
for any vector $x = (\xi_1, \xi_2, \ldots, \xi_n)^*$.

The assumption that both $\alpha(\cdot)$ and $\beta(\cdot)$ are absolute is essential. Here is a counterexample: $n = 2$, $\alpha(\cdot) = \|\cdot\|_2$, $\beta(\cdot) = \|B\,\cdot\|_2$, and
$$U = \begin{pmatrix} -3 & -1 \\ 0 & 1 \end{pmatrix}, \quad D = \begin{pmatrix} 1 & \\ & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 8 & 5 \\ -2 & -1 \end{pmatrix}.$$
Then $\alpha(\cdot)$ is absolute while $\beta(\cdot)$ is not. Calculations lead to
$$\|U\| = \|UB^{-1}\|_2 = 5.42\ldots > 4.16\ldots = \|D^{-1}UDB^{-1}\|_2 = \|D^{-1}UD\|,$$
so (1.2) fails. With this example, counterexamples in which neither $\alpha(\cdot)$ nor $\beta(\cdot)$ is absolute can be constructed: one way is to take $\beta(\cdot)$ as above and $\alpha_\epsilon(\cdot) = \|I_\epsilon\,\cdot\|_2$ for
$$I_\epsilon = \begin{pmatrix} 1 & \epsilon \\ & 1 \end{pmatrix}$$
with small $|\epsilon|$. It is easy to verify that $\alpha_\epsilon(\cdot)$ is not absolute for any $\epsilon \ne 0$, while $\alpha_0(\cdot) = \|\cdot\|_2$, the same as the $\alpha(\cdot)$ above. Since both $\|U\|$ and $\|D^{-1}UD\|$ are continuous in $\epsilon$ at $\epsilon = 0$, for sufficiently small $\epsilon \ne 0$ we still have
$$\|U\| > \|D^{-1}UD\|.$$

2 Pinchings and Norms

Let $A$ be an $n \times n$ matrix partitioned conformally into blocks as
$$A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2N} \\ \vdots & \vdots & & \vdots \\ A_{N1} & A_{N2} & \cdots & A_{NN} \end{pmatrix}, \eqno(2.1)$$
where the diagonal blocks are square. The block diagonal matrix
$$\mathcal{C}(A) \stackrel{\rm def}{=} \begin{pmatrix} A_{11} & & & \\ & A_{22} & & \\ & & \ddots & \\ & & & A_{NN} \end{pmatrix} \eqno(2.2)$$
is called a pinching of $A$. It is known that pinching reduces $A$'s weakly unitarily invariant norms^1 (and thus unitarily invariant norms), i.e.,

The Pinching Inequality:
$$\|\mathcal{C}(A)\| \le \|A\| \eqno(2.3)$$
holds for every weakly unitarily invariant norm. The following lemma enlarges the domain of this inequality:

^1 A matrix norm $\|\cdot\|_{\rm wu}$ is weakly unitarily invariant if it satisfies, besides the usual properties of any norm, also $\|XAX^*\|_{\rm wu} = \|A\|_{\rm wu}$ for any unitary matrix $X$.
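The numbers in the counterexample above are easy to reproduce: with $\alpha = \|\cdot\|_2$ and $\beta(y) = \|By\|_2$, the norm (1.3) works out to $\|AB^{-1}\|_2$, so both sides are spectral norms of $2\times 2$ matrices, computable in closed form. A sketch:

```python
# Sketch verifying the counterexample's numbers.  The spectral norm of a
# 2x2 matrix M is the square root of the larger eigenvalue of M^T M,
# which has a closed form.
import math

def spectral_norm_2x2(m):
    (a, b), (c, d) = m
    p, q, r = a*a + c*c, a*b + c*d, b*b + d*d   # entries of M^T M
    lam_max = (p + r + math.sqrt((p - r)**2 + 4*q*q)) / 2
    return math.sqrt(lam_max)

def mul(x, y):
    return [[sum(x[i][k]*y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(m):
    (a, b), (c, d) = m
    det = a*d - b*c
    return [[d/det, -b/det], [-c/det, a/det]]

U = [[-3.0, -1.0], [0.0, 1.0]]
B = [[ 8.0,  5.0], [-2.0, -1.0]]
DinvUD = [[-3.0, -2.0], [0.0, 1.0]]     # D^{-1} U D for D = diag(1, 2)

norm_U   = spectral_norm_2x2(mul(U, inv(B)))        # ~5.42
norm_DUD = spectral_norm_2x2(mul(DinvUD, inv(B)))   # ~4.17
assert norm_DUD < norm_U        # (1.2) fails for this non-absolute beta
```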
Lemma 2.1 Every norm $\|\cdot\|$ defined as in (1.3) satisfies the pinching inequality.

Proof: We follow the idea used in [2]. Let $\omega$ be a primitive $N$th root of unity, and let $X$ be the diagonal matrix that in a partitioning conformal with (2.1) has the form
$$X = {\rm diag}(I, \omega I, \omega^2 I, \ldots, \omega^{N-1} I),$$
where the $I$'s are the identity matrices of appropriate sizes. It can be seen that
$$\mathcal{C}(A) = \frac{1}{N} \sum_{j=0}^{N-1} X^{-j} A X^{j}. \eqno(2.4)$$
Let $x$, $y$ be vectors such that $\alpha(x) = 1 = \beta(y)$ and $\|\mathcal{C}(A)\| = {\rm Re}\,[x^* \mathcal{C}(A) y]$. Then by (2.4) we have
$$\|\mathcal{C}(A)\| = \frac{1}{N} \sum_{j=0}^{N-1} {\rm Re}\,[x^* X^{-j} A X^{j} y] = \frac{1}{N} \sum_{j=0}^{N-1} {\rm Re}\,[(X^{j} x)^* A (X^{j} y)]. \eqno(2.5)$$
Now note that each $X^j$ is a diagonal matrix with unimodular diagonal entries. Since $\alpha$ and $\beta$ are absolute norms, $\alpha(X^j x) = 1 = \beta(X^j y)$ for all $j$. Hence each of the summands on the right-hand side of (2.5) is bounded by $\|A\|$. Consequently $\|\mathcal{C}(A)\| \le \|A\|$.

In the next section, we shall need a special corollary of the pinching inequality:

Lemma 2.2 Let $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$ be any block matrix in which the diagonal blocks are square. Let $0 \le \epsilon \le 1$. Then
$$\Bigl\| \begin{pmatrix} A_{11} & \epsilon A_{12} \\ \epsilon A_{21} & A_{22} \end{pmatrix} \Bigr\| \le \|A\|$$
for every norm that satisfies the pinching inequality (2.3).^2

Proof: Write
$$\begin{pmatrix} A_{11} & \epsilon A_{12} \\ \epsilon A_{21} & A_{22} \end{pmatrix} = \epsilon A + (1-\epsilon)\,\mathcal{C}(A)$$
and then use the triangle inequality and the pinching inequality.

^2 See [1, pp.97-107] or [3]. A matrix norm $|||\cdot|||$ is unitarily invariant if it satisfies, besides the usual properties of any norm, also (1) $|||XAY||| = |||A|||$ for any unitary matrices $X$ and $Y$; (2) $|||A||| = \|A\|_2$ for any $A$ having rank one. See Bhatia [1, p.91] or Stewart and Sun [8, pp.74-88]. We denote by $|||\cdot|||$ a general unitarily invariant norm. The most frequently used unitarily invariant norms are the spectral norm $\|\cdot\|_2$ (also called the $\ell_2$ operator norm) and the Frobenius norm $\|\cdot\|_F$.
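The averaging identity (2.4) at the heart of the proof can be checked directly: off-block entries are killed because the geometric sum of a nontrivial $N$th root of unity vanishes. A small numeric sketch (the matrix and blocking are arbitrary illustrative choices):

```python
# Sketch verifying (2.4): C(A) = (1/N) * sum_j X^{-j} A X^j, with X built
# from a primitive N-th root of unity.  Illustrated for a 4x4 matrix
# partitioned into N = 2 diagonal blocks of size 2.
import cmath

N, sizes = 2, [2, 2]
n = sum(sizes)
omega = cmath.exp(2j * cmath.pi / N)    # primitive N-th root of unity

# diagonal of X: omega^k on the k-th block
xdiag = [omega**k for k, s in enumerate(sizes) for _ in range(s)]
block = [k for k, s in enumerate(sizes) for _ in range(s)]

A = [[complex(1 + i + 4*j) for j in range(n)] for i in range(n)]

# entry (i,k) of X^{-j} A X^j is xdiag[i]^{-j} * A[i][k] * xdiag[k]^j
avg = [[sum((xdiag[i] ** -j) * A[i][k] * (xdiag[k] ** j)
            for j in range(N)) / N
        for k in range(n)]
       for i in range(n)]

# C(A): keep (i,k) only when i and k lie in the same diagonal block
pinch = [[A[i][k] if block[i] == block[k] else 0 for k in range(n)]
         for i in range(n)]

assert all(abs(avg[i][k] - pinch[i][k]) < 1e-9
           for i in range(n) for k in range(n))
```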
3 Norms of Scaled Triangular Matrices

Theorem 3.1 Suppose $U$ is an upper-triangular matrix, and $D$ a nonsingular diagonal matrix whose diagonal entries appear in ascending order of magnitude down the diagonal. Then the inequality
$$\|D^{-1}UD\| \ge \|U\| \eqno(1.2)$$
holds for any matrix norm that satisfies the pinching inequality (2.3).

Proof: Without loss of any generality, we may assume $D = {\rm diag}(\delta_1, \delta_2, \ldots, \delta_n)$ with
$$1 = \delta_1 \le \delta_2 \le \cdots \le \delta_n.$$
Decompose $D = F_2 F_3 \cdots F_n$, where
$$F_i = \begin{pmatrix} I_{i-1} & \\ & \mu_i I_{n-i+1} \end{pmatrix}, \quad \mu_i = \delta_i/\delta_{i-1} \ge 1,$$
for $i = 2, 3, \ldots, n$, and $I_k$ is the $k \times k$ identity matrix. Inequality (1.2) is proved if it is proved for every $D = F_i$. This is what we shall do.

Let
$$F = \begin{pmatrix} I & \\ & \mu I \end{pmatrix},$$
where $\mu \ge 1$ and the $I$'s generally have different dimensions. Conformally partition
$$U = \begin{pmatrix} R & S \\ & T \end{pmatrix}.$$
To prove the inequality (1.2) for $D = F$, we have to show
$$\Bigl\| \begin{pmatrix} R & S \\ & T \end{pmatrix} \Bigr\| \le \Bigl\| \begin{pmatrix} R & \mu S \\ & T \end{pmatrix} \Bigr\| = \|F^{-1}UF\|$$
for every norm that satisfies the pinching inequality. This follows from Lemma 2.2 with $\epsilon = 1/\mu$.

4 Norms and Wiping out of Entries

The idea that we have used in proving Lemma 2.1 depends on expressing $\mathcal{C}(A)$ as an average of special diagonal similarities of $A$. This can be done for some other parts of a matrix, too. Let $S$ be an equivalence relation on $\{1, 2, \ldots, n\}$ with equivalence classes $I_1, I_2, \ldots, I_N$. Let $\mathcal{F}$ be the map on the space of $n \times n$ matrices that sets to zero all those $(i,j)$ entries with $(i,j) \notin S$ and keeps the others unchanged. It is shown in [2] that such a map has a representation
$$\mathcal{F}(A) = \frac{1}{N} \sum_{j=0}^{N-1} X^{-j} A X^{j}, \eqno(4.1)$$
where $X$ is a diagonal matrix whose $i$th diagonal entry is $\omega^{k-1}$ if $i \in I_k$, and $\omega$ is a primitive $N$th root of unity. Using the argument in the proof of Lemma 2.1 we see that
$$\|\mathcal{F}(A)\| \le \|A\| \eqno(4.2)$$
for every norm defined in (1.3).

A particularly interesting example of such a map $\mathcal{F}$ is the one that wipes out all entries of a matrix except those on the dexter and the sinister diagonals, i.e., $\mathcal{F}$ sets all of $A$'s entries to zero except those $(i,j)$ entries for which either $i = j$ or $i + j = n + 1$, which are left unchanged.

For $A$ and $\mathcal{C}(A)$ as in (2.1) and (2.2), let $\mathcal{O}(A) = A - \mathcal{C}(A)$ be the off-diagonal part of $A$. A straightforward application of the pinching inequality shows that $\|\mathcal{O}(A)\| \le 2\|A\|$. However, using the representation (2.4) we can show, as in [2], that
$$\|\mathcal{O}(A)\| \le 2(1 - 1/N)\,\|A\|. \eqno(4.3)$$
This inequality is sharp for the norm $\|\cdot\|_2$ [2]. The same idea can be used to estimate the norms $\|A - \mathcal{F}(A)\|$ using the representation (4.1).

Finally we remark that for the $\ell_p$ operator norms ($1 \le p \le \infty$), we can obtain better bounds than (4.3) by an interpolation argument. To see this, we notice that the operator norms $\|A\|_1$ and $\|A\|_\infty$ are diminished if any entry of $A$ is wiped out. Hence, we have
$$\|\mathcal{O}(A)\|_p \le \|A\|_p \quad\hbox{for } p = 1, \infty. \eqno(4.4)$$
Since $A \mapsto \mathcal{O}(A)$ is a linear map on the space of matrices, we get, by interpolating between the values $p = 1, 2, \infty$, the inequality
$$\|\mathcal{O}(A)\|_p \le (2 - 2/N)^{2/\max\{p,q\}}\,\|A\|_p, \eqno(4.5)$$
where $q$ is $p$'s dual. For details of the interpolation method, the reader is referred to Reed and Simon [7, pp.27-45]. (The reader should be aware that $\|A\|_p$ denotes the Schatten $p$-norm in [2], but here it denotes the $\ell_p$ operator norm.)
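The $p = 1$ case of (4.4) is easy to check numerically, since $\|A\|_1$ is the maximum absolute column sum and zeroing entries can only shrink column sums. A small sketch (the matrix and blocking are arbitrary illustrative choices):

```python
# Sketch checking (4.4) for p = 1: removing the block-diagonal part cannot
# increase the l1 operator norm (max absolute column sum), since wiping
# entries only shrinks column sums.

def l1_op_norm(a):
    n = len(a)
    return max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))

A = [[ 4.0, -1.0,  2.0,  0.5],
     [ 3.0,  1.0, -2.0,  1.0],
     [-1.0,  2.0,  5.0, -3.0],
     [ 2.0, -4.0,  1.0,  2.0]]

sizes = [2, 2]                       # two diagonal blocks, each 2x2
block = [k for k, s in enumerate(sizes) for _ in range(s)]

# O(A) = A - C(A): zero out the entries inside the diagonal blocks
O = [[0.0 if block[i] == block[j] else A[i][j] for j in range(4)]
     for i in range(4)]

assert l1_op_norm(O) <= l1_op_norm(A)      # inequality (4.4), p = 1
```

The $p = \infty$ case is symmetric (maximum absolute row sum), and (4.5) then follows for intermediate $p$ by interpolation as described above.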
References

[1] R. Bhatia. Matrix Analysis. Graduate Texts in Mathematics, vol. 169. Springer, New York, 1996.
[2] R. Bhatia, M.-D. Choi, and C. Davis. Comparing a matrix to its off-diagonal part. Operator Theory: Advances and Applications, 40:151-164, 1989.
[3] R. Bhatia and J. A. R. Holbrook. Unitary invariance and spectral variation. Linear Algebra and Its Applications, 95:43-68, 1987.
[4] J. Demmel. Applied Numerical Linear Algebra. SIAM, Philadelphia, 1997.
[5] N. J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, 1996.
[6] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, 1985.
[7] M. Reed and B. Simon. Methods of Modern Mathematical Physics, II: Fourier Analysis, Self-Adjointness. Academic Press, New York, 1975.
[8] G. W. Stewart and J.-G. Sun. Matrix Perturbation Theory. Academic Press, Boston, 1990.