Norms and Perturbation theory for linear systems

CHAPTER 7 Norms and Perturbation theory for linear systems Exercise 7.7: Consistency of sum norm? Observe that the sum norm is a matrix norm. This follows since it is equal to the l -norm of the vector v =vec(a) obtained by stacking the columns of a matrix A on top of each other. Let A =(a ij ) ij and B =(b ij ) ij be matrices for which the product AB is defined. Then kabk S = X X a ik b kj X a ik b kj i,j k i,j,k X a ik b lj = X a ik X b lj = kak S kbk S, i,j,k,l i,k l,j where the first inequality follows from the triangle inequality and multiplicative property of the absolute value. Since A and B where arbitrary, this proves that the sum norm is consistent. Exercise 7.8: Consistency of max norm? Observe that the max norm is a matrix norm. This follows since it is equal to the l -norm of the vector v =vec(a) obtainedbystackingthecolumnsofamatrixaon top of each other. To show that the max norm is not consistent we use a counter example. Let A = B =.Then = => =, contradicting kab M kak M kbk M. M M Exercise 7.9: Consistency of modified max norm? Exercise 7.8 shows that the max norm is not consistent. In this Exercise we show that the max norm can be modified so as to define a consistent matrix norm. (a) Let A C m,n and define kak := p mnkak M as in the Exercise. To show that k kdefines a consistent matrix norm we have to show that it fulfills the three matrix norm properties and that it is submultiplicative. Let A, B C m,n be any matrices and any scalar. () Positivity. Clearly kak = p mnkak M 0. Moreover, kak =0 () a i,j =08i, j () A =0. () Homogeneity. k Ak = p mnk Ak M = p mnkak M = kak. 39 M M

(3) Subadditivity. One has ka + Bk = p nmka + Bk M p nm kak M + kbk M = kak + kbk. () Submultiplicativity. One has kabk = p mn max im jn p mn max im jn p mn max im qx a i,k b k,j k= qx a i,k b k,j k= max b k,j kq jn q p mn max a i,k im kq = kakkbk. (b) For any A C m,n,let qx a i,k k= max b k,j kq jn kak () := mkak M and kak () := nkak M. Comparing with the solution of part (a) we see, that the points of positivity, homogeneity and subadditivity are fulfilled here as well, making kak () and kak () valid matrix norms. Furthermore, for any A C m,q, B C q,n, kabk () = m max im jn qx k= = kak () kbk (), a i,k b k,j m max a i,k im kq q max b k,j kq jn kabk () = n max im jn qx a i,k b k,j q k= = kak () kbk (), max a i,k im kq n max b k,j kq jn which proves the submultiplicativity of both norms. Exercise 7.: The sum norm is subordinate to? For any matrix A =(a ij ) ij C m,n and column vector x =(x j ) j C n,onehas kaxk = a ij x j a ij x j a ij x k = kak S kxk, k= which shows that the matrix norm k k S is subordinate to the vector norm k k. 0

Exercise 7.: The max norm is subordinate to? Let A =(a ij ) ij C m,n be a matrix and x =(x j ) j C n acolumnvector. (a) One has kaxk = max a ij x j max a ij x j max a ij,...,m,...,m = kak M kxk.,...,m,...,n x j (b) Assume that the maximum in the definition of kak M is attained in column l, implying that kak M = a k,l for some k. Lete l be the lth standard basis vector. Then ke l k =and kae l k = max,...,m a i,l = a k,l = a k,l =kak M ke l k, which is what needed to be shown. (c) By (a), kak M kaxk /kxk for all nonzero vectors x, implyingthat kak M max kaxk kxk. By (b), equality is attained for any standard basis vector e l for which there exists a k such that kak M = a k,l.weconcludethat kak M =max kaxk kxk, which means that k k M is the (, )-operator norm (see Definition 7.3). Exercise 7.9: Spectral norm Let A = U V be a singular value decomposition of A, andwrite := kak for the biggest singular value of A. Since the orthogonal matrices U and V leave the Euclidean norm invariant, max y Ax = kxk ==kyk = max kxk ==kyk max y U V x = kxk ==kyk ix i y i max kxk ==kyk max y x kxk ==kyk i x i y i = max kxk ==kyk y x max kxk ==kyk kyk kxk =. Moreover, this maximum is achieved for x = v and y = u,andweconclude kak = = max kxk ==kyk y Ax. Since A is nonsingular we find ka k =max Exercise 7.0: Spectral norm of the inverse ka xk kxk =max kxk kaxk. This proves the second claim. For the first claim, let n be the singular values of A. Again since A is nonsingular, n must be nonzero, and Equation(7.7)

states that = ka k n.fromthisandwhatwejustprovedwehavethat n for any x so that kxk =,sothatalsokaxk n for such x. kaxk We have A = Exercise 7.: p-norm example, A = 3. Using Theorem 7.5, one finds kak = kak = 3 and ka k = ka k =. The singular values of A are the square roots of the zeros of 0=det(A T A I) =(5 ) 6 = 0 +9=( 9)( ). Using Theorem 7.7, we find kak = =3andkA k = =. Alternatively, since A is symmetric positive definite, we know from (7.8) that kak = and ka k = /, where = 3 is the biggest eigenvalue of A and = is the smallest. Exercise 7.: Unitary invariance of the spectral norm Suppose V is a rectangular matrix satisfying V V = I. Then kvak = max kxk = kvaxk = max kxk = x A V VAx = max kxk = x A Ax = max kxk = kaxk = kak. The result follows by taking square roots. Exercise 7.5: kauk rectangular A Let U =[u,u ] T be any matrixsatisfying=u T U.ThenAU is a -matrix, and clearly the operator -norm of a -matrix equals its euclidean norm (when viewed as a vector): a a x = a x a x = x a. a In order for kauk < kak to hold, we need to find a vector v with kvk =sothat kauk < kavk.inotherwords,weneedtopickamatrixa that scales more in the direction v than in the direction U. Forinstance,if 0 0 A = 0, U =, v = 0, then kak = max kaxk kavk => =kauk. kxk =

Exercise 7.6: p-norm of diagonal matrix The eigenpairs of the matrix A =diag(,..., n)are(, e ),...,( n, e n ). For (A) = max{,..., n }, onehas kak p = ( x p + + nx n p ) /p max (x,...,x n)6=0 ( x p + + x n p ) /p ( (A) p x p + + (A) p x n p ) /p max (x,...,x n)6=0 ( x p + + x n p ) /p On the other hand, for j such that (A) = j, onefinds kak p =max kaxk p kxk p kae j k p ke j k p = (A). = (A). Together, the above two statements imply that kak p = (A) foranydiagonalmatrix A and any p satisfying p. Exercise 7.7: Spectral norm of a column vector We write A C m, for the matrix corresponding to the column vector a C m.write kak p for the operator p-norm of A and kak p for the vector p-norm of a. Inparticular kak is the spectral norm of A and kak is the Euclidean norm of a. Then kak p =max kaxk p x =max x kak p x = kak p, proving (b). Note that (a) follows as the special case p =. Exercise 7.8: Norm of absolute value matrix (a) One finds +i A = i p = p. (b) Let b i,j denote the entries of A. Observe that b i,j = a i,j = b i,j. Together with Theorem 7.5, these relations yield kak F = kak =max jn kak = max im a i,j a i,j a i,j = =max jn = max im b i,j = k A k F, b i,j = k A k, b i,j = k A k, which is what needed to be shown. (c) To show this relation between the -norms of A and A, wefirstexamine the connection between the l -norms of Ax and A x, wherex =(x,...,x n )and x =( x,..., x n ). We find kaxk = a i,j x j 3 a i,j x j = k A x k.

Now let x with kx k =beavectorforwhichkak = kax k. That is, let x be aunitvectorforwhichthemaximuminthedefinitionof-normisattained.observe that x is then a unit vector as well, k x k =. Then,bytheaboveestimateof l -norms and definition of the -norm, kak = kax k k A x k k A k. (d) By Theorem 7.5, we can solve this exercise by finding a matrix A for which A and A have di erent largest singular values. As A is real and symmetric, there exist a, b, c R such that a b a b A = b c, A =, b c a A T A = + b ab + bc ab + bc b + c, A T A = a + b ab + bc ab + bc b + c. To simplify these equations we first try the case a + c =0. Eliminatingc we get a A T A = + b 0 a 0 a + b, A T A = + b ab ab a + b. To get di erent norms we have to choose a, b in such a way that the maximal eigenvalues of A T A and A T A are di erent. Clearly A T A has a unique eigenvalue := a + b and putting the characteristic polynomial (µ) =(a + b µ) ab of A T A to zero yields eigenvalues µ ± := a + b ± ab. Hence A T A has maximal eigenvalue µ + = a + b + ab = + ab. The spectral norms of A and A therefore di er whenever both a and b are nonzero. For example, when a = b = c =wefind A =, kak = p, k A k =. Exercise 7.35: Sharpness of perturbation bounds Suppose Ax = b and Ay = b + e. Let K = K(A) =kakka k be the condition number of A. Lety A and y A be unit vectors for which the maxima in the definition of the operator norms of A and A are attained. That is, ky A k ==ky A k, kak = kay A k,andka k = ka y A k. Ifb = Ay A and e = y A,then ky xk kxk = ka ek ka bk = ka y A k = ka k = kakka k ky A k ky A k kay A k = K kek kbk, showing that the upper bound is sharp. If b = y A ky xk kxk = ka ek ka bk = showing that the lower bound is sharp. and e = Ay A,then ky A k ka y A k = ka k = kay A kakka k ky A k = K Exercise 7.36: Condition number of nd derivative matrix Recall that T =tridiag(,, ) and, by Exercise.6, T is given by T ij = T ji =( ih)j >0, j i m, h = m +. kek kbk,

From Theorems 7.5 and 7.7, we have the following explicit expressions for the -, - and -norms kak =max a i,j, kak =, ka k =, kak = max a i,j jn m im for any matrix A C m,n,where is the largest singular value of A, m the smallest singular value of A, andweassumedato be nonsingular in the third equation. a) For the matrix T this gives ktk = ktk = m +form =, andktk = ktk =form 3. For the inverse we get kt k = kt k = = h for m = 8 and kt k = 3 for m =. Form>, one obtains == 3 j X T = ( jh)i + ij ( ih)j i=j = kt k It is easy to commit an error when simplifying this sum. It can be computed symbolically using Symbolic Math Toolbox using the following code: syms i j m simplify(symsum((-j/(m+))*i,i,,j-) + symsum( (-i/(m+))*j,i,j,m)) Here we have substituted h =/(m +). Thisproducestheresult j (m + arrive at this ourselves, we can first rewrite the expression as j X ( jh)i + ( ih)j j X ( ih)j. (j )j The first sum here equals ( jh).thesecondsumequals m(m +) j jh i = mj jh = mj mj/ =mj/. The third sum equals j X j j X jh i = j(j ) hj j Combining the three sums we get = j(j )( hj)/. j ((j )( jh)+m (j )( hj)) = j (m + j), j). To which we also arrived at above. This can also be written as j h j which is a quadratic function in j that attains its maximum at j = = m+. For odd m>, h this function takes its maximum at integral j, yielding kt k = h. For even 8 m>, on the other hand, the maximum over all integral j is attained at j = m = h h or j = m+ = +h,whichbothgivekt k h = (h ). 8 5

Since T is symmetric, the row sums equal the column sums, so that kt k = kt k.weconcludethatthe-and-condition numbers of T are 8 m =; cond (T) =cond (T) = >< 6 m =; h >: m odd, m>; h m even, m>. b) Since the matrix T is symmetric, T T T = T and the eigenvalues of T T T are the squares of the eigenvalues,..., n of T. As all eigenvalues of T are positive, each singular value of T is equal to an eigenvalue. Using that i = cos(i h), we find = m = cos(m h) =+cos( h), m = = It follows that cond (T) = m cos( h). = +cos( h) cos( h) =cot h. c) From tan x>xwe obtain cot x = <. Using this and cot x>x tan x x 3 we find h 3 < cond (T) < h. (d) For p =,substituteh =/(m+) in c) and use that / < /. For p =, we need to show due to a) that h /3 < h h. when m is odd, and that h /3 < (h ) h. when m is even. The right hand sides in these equations are obvious. The left equation for m odd is also obvious since / < /. The left equation for m even is also obvious since /3 < /. Exercise 7.7: When is a complex norm an inner product norm? As in the Exercise, we let hx, yi = s(x, y)+is(x,iy), s(x, y) = kx + yk kx yk. We need to verify the three properties that define an inner product. Let x, y, z be arbitrary vectors in C m and a C be an arbitrary scalar. () Positive-definiteness. One has s(x, x) =kxk 0and s(x,ix) = kx + ixk kx ixk = k( + i)xk k( i)xk ( +i i )kxk = =0, so that hx, xi = kxk 0, with equality holding precisely when x = 0. 6

() Conjugate symmetry. Since s(x, y) isreal,s(x, y) = s(y, x), s(ax, ay) = a s(x, y), and s(x, y) = s(x, y), hy, xi = s(y, x) is(y, ix) = s(x, y) is(ix, y) = s(x, y) is(x, iy) = hx, yi. (3) Linearity in the first argument. Assuming the parallelogram identity, s(x, z)+s(y, z) = kx + zk kz xk + ky + zk kz yk = z + x + y + x y z x + y x y z + x + y = z + x + y = z + x + y x + y =s, z, x y + x y z x + y z x + y + x y z x + y implying that s(x + y, z) =s(x, z)+s(y, z). It follows that hx + y, zi = s(x + y, z)+is(x + y,iz) = s(x, z)+ s(y, z)+ is(x, iz)+ is(y, iz) = s(x, z)+ is(x, iz)+ s(y, z)+ is(y, iz) = hx, zi + hy, zi. + x y That hax, yi = ahx, yi follows, mutatis mutandis, fromtheproofoftheorem 7.5. Exercise 7.8: p-norm for p =and p = We need to verify the three properties that define a norm. Consider arbitrary vectors x =[x,...,x n ] T and y =[y,...,y n ]inr n and a scalar a R. First we verify that k k is a norm. () Positivity. Clearly kxk = x + + x n 0, with equality holding precisely when x = = x n =0,whichhappensifandonlyifx is the zero vector. () Homogeneity. One has kaxk = ax + + ax n = a ( x + + x n )= a kxk. (3) Subadditivity. Using the triangle inequality for the absolute value, kx+yk = x +y + + x n +y n x + y + + x n + y n = kxk +kyk. Next we verify that k k is a norm. () Positivity. Clearly kxk =max{ x,..., x n } 0, with equality holding precisely when x = = x n =0,whichhappensifandonlyifx is the zero vector. () Homogeneity. One has kaxk =max{ a x,..., a x n } = a max{ x,..., x n } = a kxk. 7

(3) Subadditivity. Using the triangle inequality for the absolute value, kx + yk =max{ x + y,..., x n + y n } max{ x + y,..., x n + y n } max{ x,..., x n } +max{ y,..., y n } = kxk + kyk Exercise 7.9: The p-norm unit sphere In the plane, unit spheres for the -norm, -norm, and -norm are Exercise 7.50: Sharpness of p-norm inequality Let p.thevectorx l =[, 0,...,0] T R n satisfies kx l k p =( p + 0 p + + 0 p ) /p ==max{, 0,..., 0 } = kx l k, and the vector x u =[,,...,] T R n satisfies kx u k p =( p + + p ) /p = n /p = n /p max{,..., } = n /p kx u k. Exercise 7.5: p-norm inequalities for arbitrary p Let p and q be integers satisfying q p, andletx =[x,...,x n ] T C n. Since p/q, the function f(z) =z p/q is convex on [0, ). For any z,...,z n [0, ) and,..., n 0satisfying + + n =, Jensen s inequality gives iz i p/q = f iz i if(z i )= iz p/q i. In particular for z i = x i q and = = n =/n, p/q p/q n p/q x i q = n x i q n x i q p/q = n Since the function x 7 x /p is monotone, we obtain n /q kxk q = n /q /q x i q n /p x i p. /p x i p = n /p kxk p, from which the right inequality in the exercise follows. The left inequality clearly holds for x = 0, soassumex 6= 0. Without loss of generality we can then assume kxk =,sincekaxk p kaxk q if and only if kxk p kxk q for any nonzero scalar a. Then, for any i =,...,n,onehas x i, implying 8

that x i p x i q. Moreover, since x i =forsomei, onehas x q + + x n q, so that /p /p /q kxk p = x i p x i q x i q = kxk q. Finally we consider the case p =. Thestatementisobviousforq = p, soassume that q is an integer. Then /q /q kxk q = x i q kxk q = n /q kxk, 7 proving the right inequality. Using that the map x x /q is monotone, the left inequality follows from kxk q =(max x i ) q i x i q = kxk q q. 9