ON A VARIATIONAL FORMULATION OF THE GENERALIZED SINGULAR VALUE DECOMPOSITION

MOODY T. CHU∗, ROBERT E. FUNDERLIC†, AND GENE H. GOLUB‡

Abstract. A variational formulation for the generalized singular value decomposition (GSVD) of a pair of matrices A ∈ R^{m×n} and B ∈ R^{p×n} is presented. In particular, a duality theory analogous to that of the SVD provides new understanding of left and right generalized singular vectors. It is shown that the intersection of the row spaces of A and B plays a key role in the GSVD duality theory. The main result that characterizes left GSVD vectors involves a generalized singular value deflation process.

Key words. Generalized Eigenvalue and Eigenvector, Generalized Singular Value and Singular Vector, Stationary Value and Stationary Point, Deflation, Duality.

AMS(MOS) subject classifications. 65F15, 65H15.

1. Introduction. The singular value decomposition (SVD) of a given matrix A ∈ R^{m×n} is

(1) U^T A V = S = diag{σ_1, …, σ_q}, q = min{m, n},

where U ∈ R^{m×m} and V ∈ R^{n×n} are orthogonal matrices and S ∈ R^{m×n} is zero except for the real nonnegative elements σ_1 ≥ ⋯ ≥ σ_r > σ_{r+1} = ⋯ = σ_q = 0 on the leading diagonal, with r = rank(A). The σ_i, i = 1, …, q, are the singular values of A. Of the many ways to characterize the singular values of A, the following variational property is of particular interest [4].

Theorem 1.1. Consider the optimization problem

(2) max_{x ≠ 0} ‖Ax‖_2 / ‖x‖_2,

where ‖·‖_2 denotes the 2-norm of a vector. Then the singular values of A are precisely the stationary values, i.e., the functional evaluations at the stationary points, of the objective function ‖Ax‖_2/‖x‖_2 with respect to x ≠ 0.

We note that the stationary points x ∈ R^n of the problem (2) are the right singular vectors of A. At each of such points, it follows from the usual duality theory that there exists a vector y ∈ R^m of unit Euclidean length such that y^T A x is equal to the corresponding stationary value. This y is the corresponding left singular vector of A.
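As a concrete illustration of Theorem 1.1 and the duality remark above, the following sketch (Python with NumPy; an added illustration, not part of the original text, with arbitrarily chosen dimensions) checks that at each unit right singular vector v the gradient of f(x) = ⟨Ax, Ax⟩/⟨x, x⟩ vanishes, that ‖Av‖_2/‖v‖_2 equals the singular value, and that the corresponding left singular vector y attains y^T A v = σ.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))

# Right/left singular vectors and singular values of A
U, svals, Vt = np.linalg.svd(A, full_matrices=False)

for sigma, v, y in zip(svals, Vt, U.T):
    # stationary value: the objective ||Av||_2 / ||v||_2 equals sigma
    assert np.isclose(np.linalg.norm(A @ v), sigma)
    # stationary point: gradient of f(x) = <Ax,Ax>/<x,x> vanishes at unit v
    grad = 2 * (A.T @ (A @ v) - sigma**2 * v)
    assert np.allclose(grad, 0, atol=1e-10)
    # duality: the unit vector y satisfies y^T A v = sigma
    assert np.isclose(y @ A @ v, sigma)
```

The gradient formula used in the check follows from differentiating the Rayleigh-quotient-like function f and substituting A^T A v = σ^2 v at a right singular vector.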
The main purpose of this paper is to delineate a similar variational principle that leads to the generalized singular value decomposition (GSVD) of a pair of matrices A ∈ R^{m×n} and B ∈ R^{p×n}. While the variational formula analogous to (2) for the GSVD

∗ Department of Mathematics, North Carolina State University, Raleigh, NC 27695-8205. This research was supported in part by the National Science Foundation under grants DMS-91448 and DMS-9480.
† Department of Computer Science, North Carolina State University, Raleigh, NC 27695-8206.
‡ Department of Computer Science, Stanford University, Stanford, CA 94305. This research was supported in part by the National Science Foundation under grant CCR-95059.
is well known, the corresponding duality theory has apparently not been developed. (See [1] for a related treatise.) The purpose of this note is to fill the duality theory gap for the GSVD problem.

Let R(M) and N(M) denote, respectively, the range space and the null space of any given matrix M. We will see that the intersection of the row spaces of A and B,

R(A^T) ∩ R(B^T) = { z ∈ R^n | z = A^T x = B^T y for some x ∈ R^m and y ∈ R^p },

plays a fundamental role in the duality theory of the associated GSVD. With ξ := (x; −y) ∈ R^{m+p}, the equivalence

Cξ = [A^T, B^T] ξ = 0 ⟺ A^T x = B^T y

suggests that the null space of the matrix C := [A^T, B^T],

(3) N(C) = { ξ ∈ R^{m+p} | Cξ = 0 },

may be interpreted as a "representation" of R(A^T) ∩ R(B^T). But this representation is not unique in that different ξ ∈ N(C) may give rise to the same z ∈ R(A^T) ∩ R(B^T). In particular, all points in the subspace

(4) S := { (g; −h) ∈ R^{m+p} | A^T g = B^T h = 0 }

collapse into the zero vector in R(A^T) ∩ R(B^T). For a reason to be discussed in the sequel (see (16) and the argument thereafter), the subspace S should be taken out of consideration. More precisely, define the homomorphism H : N(C) → R(A^T) ∩ R(B^T) by

z = H(ξ) ⟺ z = A^T x = B^T y,

and define, for every ξ ∈ N(C), the quotient map π(ξ) to be the coset of S containing ξ, i.e.,

(5) π(ξ) := ξ + S.

Then the first homomorphism theorem for vector spaces (see, for example, [5, Theorem 4.a]) states that R(A^T) ∩ R(B^T) is isomorphic to the quotient space N(C)/S, where

(6) N(C)/S := { π(ξ) | ξ ∈ N(C) }.
It is in this quotient space that we establish the duality theory. Recall that linearly independent vectors in N(C) that are not in S generate, through the quotient map, linearly independent vectors in the quotient space N(C)/S. Thus the simplest way to represent N(C)/S is through the orthogonal complement S⊥ of S in N(C). Define N(A^T) and N(B^T) to be matrices whose columns span, respectively, the null spaces N(A^T) and N(B^T). Define

(7) Z := [ A^T, B^T ; N(A^T)^T, 0 ; 0, N(B^T)^T ].

Then N(C)/S can be uniquely represented by the subspace

(8) N(Z) = { ξ ∈ R^{m+p} | Zξ = 0 }.

We shall count this dimension carefully in Section 2.

Our discussion is based upon the following formulation of the GSVD for A and B by Paige and Saunders [6] (or the QSVD in [3]) that generalizes the original concept in [9]:

Definition 1.1. Assume rank(C) = k. Then there exist orthogonal matrices U ∈ R^{m×m} and V ∈ R^{p×p} and an invertible X ∈ R^{n×n} such that

(9) [ U^T, 0 ; 0, V^T ] [ A ; B ] X = [ Σ_A, 0 ; Σ_B, 0 ],

with Σ_A ∈ R^{m×k} and Σ_B ∈ R^{p×k} given by the block diagonal forms

Σ_A = diag{ I_A, S_A, O_A },  Σ_B = diag{ O_B, S_B, I_B },

whose blocks have column sizes r, s, and k−r−s; the blocks of Σ_A have row sizes r, s, and m−r−s, and the blocks of Σ_B have row sizes p−k+r, s, and k−r−s. Here I_A and I_B are identity matrices, O_A and O_B are zero matrices with possibly no rows or no columns, and S_A = diag{ω_A^{(1)}, …, ω_A^{(s)}} and S_B = diag{ω_B^{(1)}, …, ω_B^{(s)}} satisfy

1 > ω_A^{(1)} ≥ ⋯ ≥ ω_A^{(s)} > 0,  0 < ω_B^{(1)} ≤ ⋯ ≤ ω_B^{(s)} < 1,  (ω_A^{(i)})^2 + (ω_B^{(i)})^2 = 1.

The quotients

σ_i := ω_A^{(i)} / ω_B^{(i)},  i = 1, …, s,
are called the generalized singular values of (A, B), for which we make use of the notation Σ := diag{σ_1, …, σ_s}. The values of r and s are defined internally by the matrices A and B.

Suppose we partition X into four blocks of columns, X = [X_1, X_2, X_3, X_4], with column sizes r, s, k−r−s, and n−k, respectively. Correspondingly, suppose we partition U into U = [U_1, U_2, U_3] with column sizes r, s, and m−r−s, and V into V = [V_1, V_2, V_3] with column sizes p−k+r, s, and k−r−s, respectively. Observe that

X^T A^T A X = diag{ I_A, S_A^2, O_A^T O_A, 0, …, 0 },
X^T B^T B X = diag{ O_B^T O_B, S_B^2, I_B, 0, …, 0 },

where, for simplicity, we have used "0" to denote various zero matrices with appropriate sizes. Upon examining the second column block, we notice that

A^T A X_2 = B^T B X_2 Σ^2.

That is, {σ_i | i = 1, …, s} is a subset of the eigenvalues of the symmetric pencil

(10) A^T A − λ^2 B^T B.

Similarly, we point out the following remarks to include all other cases [2, 6]:

1. If k < n, then A^T A X_4 = B^T B X_4 = 0 implies that every complex number is an eigenvalue of (10). This is the case that is considered of little interest. We will refer to eigenvalues of this type as defective.
2. Since A^T A X_3 = 0 and B^T B X_3 ≠ 0, the pencil (10) has 0 as an eigenvalue with multiplicity k − r − s.
3. Since B^T B X_1 = 0 and A^T A X_1 ≠ 0, we may regard the pencil (10) as having ∞ as an eigenvalue with multiplicity r.

We view the relationships

U^T A X = [Σ_A, 0],  V^T B X = [Σ_B, 0]

as the fundamental and most important components of (9). We refer to the corresponding columns of U and V as the left generalized singular vectors of A and B, respectively. Note that there are two such left vectors for each generalized singular value, one for A and one for B. Similar to Theorem 1.1, we have the following variational formulation.
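The claim that the σ_i appear among the eigenvalues of the pencil (10) can be checked numerically. In the sketch below (NumPy/SciPy; an added illustration with arbitrarily chosen dimensions), B is square and generically invertible, so k = n and there are no defective, zero, or infinite eigenvalues; in that special case the generalized singular values coincide with the singular values of A B^{-1}.

```python
import numpy as np
from scipy.linalg import eig

rng = np.random.default_rng(1)
m, n = 6, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, n))   # p = n: B square and (generically) invertible

# Eigenvalues mu = lambda^2 of the symmetric pencil (10): A^T A x = lambda^2 B^T B x
mu = eig(A.T @ A, B.T @ B, right=False)
sigma = np.sort(np.sqrt(np.abs(mu.real)))

# With B invertible, the sigma_i are exactly the singular values of A B^{-1}
ref = np.sort(np.linalg.svd(A @ np.linalg.inv(B), compute_uv=False))
assert np.allclose(sigma, ref)
```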
Theorem 1.2. Consider the optimization problem

(11) max_{Bx ≠ 0} ‖Ax‖_2 / ‖Bx‖_2.

Then the generalized singular values σ_1, …, σ_s of (A, B) are precisely the nonzero finite stationary values of the objective function in (11).

Proof. The stationary values of ‖Ax‖_2/‖Bx‖_2 are the square roots of those of the function

f(x) := ⟨Ax, Ax⟩ / ⟨Bx, Bx⟩,

where ⟨·, ·⟩ is the standard Euclidean inner product. It is not difficult to see that the gradient of f at x, where Bx ≠ 0, is given by

∇f(x) = 2 ( A^T A x − f(x) B^T B x ) / ⟨Bx, Bx⟩.

The theorem follows from comparing the first order condition ∇f(x) = 0 with (10).

Obviously, the corresponding stationary points x ∈ R^n for the problem (11) are related to the columns of the matrix X (up to scalar multiplications), which are also eigenvectors of the pencil (10). What is not clear are the roles that U and V play in terms of the variational formula (11). In this note we present some new insights in this regard.

In the usual SVD duality theory the left singular vectors can be obtained from the optimization problem

(12) max_{y ≠ 0} ‖y^T A‖_2 / ‖y‖_2,

a formula similar to (2). Thus one might first guess that the duality theory analogous to (11) would be the problem

max_{y^T B ≠ 0} ‖y^T A‖_2 / ‖y^T B‖_2.

However, this is certainly not a correct form, as a single row vector y^T is not compatible for left multiplication on both A and B. We will see correct dual forms for the GSVD in (18) and (23).

2. Duality Theory. For convenience, we denote

U_2 = [u_1^{(2)}, …, u_s^{(2)}],  V_2 = [v_1^{(2)}, …, v_s^{(2)}].

It follows from

(13) U_2^T A X = S_A S_B^{-1} V_2^T B X = Σ V_2^T B X
that U_2^T A = Σ V_2^T B, or equivalently,

(14) C [ U_2 ; −V_2 Σ ] = 0.

Note that the columns of both U_2 and V_2 are unit vectors. Given any ξ = (x; −y) in the null space of C with ‖x‖ ≠ 0 and ‖y‖ ≠ 0, we observe that

(15) C [ x/‖x‖ ; −(‖y‖/‖x‖)(y/‖y‖) ] = (1/‖x‖) C ξ = 0,

where x/‖x‖ and y/‖y‖ are also unit vectors. Comparing (15) with the relationship (14), we are motivated to consider the role that each generalized singular value σ_i plays in the optimization problem:

(16) max_{Cξ = 0, x ≠ 0} ‖y‖ / ‖x‖.

However, we need to hastily point out a subtle flaw in the formulation of (16). Consider a given point (x; −y) ∈ S with ‖x‖ ≠ 0 and ‖y‖ ≠ 0. Then (x; −τy) ∈ S for arbitrary τ ∈ R. In this case, the optimization subproblem

(17) max_{A^T x = B^T y = 0, x ≠ 0} ‖y‖ / ‖x‖

becomes the problem

max_{τ ∈ R, τ ≠ 0} |τ| ‖y‖ / ‖x‖,

which obviously has no stationary point at all and has maximum infinity. The trouble persists so long as ξ contains components from S. It is for this reason that the subspace S should be taken out of consideration. We should, instead of (16), consider the modified optimization problem (see (7) and (8))

(18) max_{ξ ∈ N(Z), x ≠ 0} ‖y‖ / ‖x‖.

We will prove that each σ_i corresponds to a stationary value for the problem (18). But first it is worthy to point out some interesting remarks:
1. The optimization problem (18) is consistent with the ordinary singular value problem where B = I. In this case,

Z = [ A^T, I ; N(A^T)^T, 0 ],

and thus ξ = (x; −y) ∈ N(Z) implies that y = A^T x ≠ 0. The forbidden situation A^T x = y = 0 in (18) is not a concern in this case because of the homogeneity in x, and only implies that 0 is a stationary value (or, equivalently, that A has a zero singular value). Thus in the case of the ordinary SVD the problem (18) reduces to (12).

2. It is clear that dim(N(C)) = m + p − k, since we assume rank(C) = k. The structure involved in (9) implies that for S defined in (4) it must be that

S = R(U_3) ⊕ R(V_1).

That is, the sizes of N(A^T) and N(B^T) should be m × (m−r−s) and p × (p−k+r), respectively. It follows that dim(S) = m + p − k − s.

3. The space we are interested in is the quotient space N(C)/S. It is known from the homomorphism theorem that dim(N(C)/S) = dim(N(C)) − dim(S) [5, Lemma 4.8]. Thus dim(N(C)/S) = s. We will see below that this dimension count agrees with our assumption that there are s generalized singular values.

The following theorem is critical to the study of stationary values and stationary points of the optimization problem (18).

Theorem 2.1. Let the columns of the matrix [ Φ ; −Ψ ], with Φ ∈ R^{m×s} and Ψ ∈ R^{p×s}, be a basis for the subspace N(Z). Then the non-defective finite nonzero eigenvalues of the symmetric pencil (10), A^T A − λ^2 B^T B, are the same as those of the pencil

(19) Ψ^T Ψ − λ^2 Φ^T Φ.

Proof. Suppose A^T A ζ = λ^2 B^T B ζ. Since λ is non-defective and nonzero, A^T (Aζ) = B^T (λ^2 Bζ) ≠ 0. That is, (Aζ; −λ^2 Bζ) represents a nonzero element in N(C)/S. Thus there exist vectors v ∈ R^s, v ≠ 0, η_A ∈ R^{m−r−s}, and η_B ∈ R^{p−k+r} such that

Aζ = Φ v + N(A^T) η_A,
λ^2 Bζ = Ψ v + N(B^T) η_B.

It follows that

(Ψ^T Ψ − λ^2 Φ^T Φ) v = λ^2 ( Ψ^T Bζ − Φ^T Aζ ) = 0.

In the above, we have used the fact that Z [ Φ ; −Ψ ] = 0, so that N(A^T)^T Φ = 0, N(B^T)^T Ψ = 0, and Φ^T A = Ψ^T B. This shows that λ is an eigenvalue of (19) with v as the corresponding eigenvector.
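Theorem 2.1, together with the dimension count dim(N(C)/S) = s, can be illustrated numerically. The sketch below (NumPy/SciPy; an added illustration with arbitrarily chosen, generically full-rank data, so that k = n and s = n) builds Z from (7), extracts a basis [Φ; −Ψ] of N(Z), and compares the eigenvalues of the pencil (19) with those of the pencil (10).

```python
import numpy as np
from scipy.linalg import null_space, eig

rng = np.random.default_rng(3)
m, p, n = 6, 5, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, n))

NA = null_space(A.T)              # columns span N(A^T)
NB = null_space(B.T)              # columns span N(B^T)

# The matrix Z of (7)
Z = np.block([
    [A.T,                          B.T],
    [NA.T,                         np.zeros((NA.shape[1], p))],
    [np.zeros((NB.shape[1], m)),   NB.T],
])

basis = null_space(Z)             # columns form a basis [Phi; -Psi] of N(Z)
Phi, Psi = basis[:m], -basis[m:]
s = basis.shape[1]                # = dim(N(C)/S); here s = n generically

# Eigenvalues lambda^2 of the pencil (19) versus those of the pencil (10)
lam2 = np.sort(eig(Psi.T @ Psi, Phi.T @ Phi, right=False).real)
ref = np.sort(eig(A.T @ A, B.T @ B, right=False).real)
assert s == n
assert np.allclose(lam2, ref)
```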
To complete the eigenvalue (generalized singular value) set equality, suppose now that (Ψ^T Ψ − λ^2 Φ^T Φ) v = 0 with λ ≠ 0 and v ≠ 0. We want to show that the equation

(20) [ A ; −B ] ζ = [ Φ v ; −(1/λ^2) Ψ v ]

has a solution. If this can be done, then since C [ Φ ; −Ψ ] = 0, i.e., A^T Φ = B^T Ψ, it follows that A^T A ζ = A^T Φ v = B^T Ψ v = λ^2 B^T B ζ; that is, ζ is an eigenvector of the pencil A^T A − λ^2 B^T B with eigenvalue λ.

To show (20) means to show that the vector [ Φ v ; −(1/λ^2) Ψ v ] is in the column space of the matrix [ A ; −B ]. It suffices to show that

(21) [ Φ v ; −(1/λ^2) Ψ v ] ⊥ [ y ; z ]

whenever

(22) [ A^T, −B^T ] [ y ; z ] = 0.

Rewrite (22) as [ A^T, B^T ] [ y ; −z ] = 0, showing that (y; −z) ∈ N(C). We therefore must have

y = Φ w + N(A^T) ξ_A,
z = Ψ w + N(B^T) ξ_B,

for some vectors w, ξ_A, and ξ_B of appropriate sizes. Substituting y and z into (21) implies

[ y ; z ]^T [ Φ v ; −(1/λ^2) Ψ v ] = w^T ( Φ^T Φ v − (1/λ^2) Ψ^T Ψ v ) + ( ξ_A^T N(A^T)^T Φ v − (1/λ^2) ξ_B^T N(B^T)^T Ψ v ) = 0.

The assertion is therefore proved.

Corollary 2.2. The generalized singular values σ_i, i = 1, …, s, are the stationary values associated with the optimization problem (18).

Proof. We have already seen in Theorem 1.2 how the generalized singular values of (A, B) are related to the pencil A^T A − λ^2 B^T B, which are now related to the pencil Ψ^T Ψ − λ^2 Φ^T Φ. By Theorem 1.2 again, we conclude that the generalized singular values of (A, B) can be found from the stationary values associated with the optimization problem

(23) max_{v ≠ 0} ‖Ψ v‖_2 / ‖Φ v‖_2,

which is equivalent to (18).
We now characterize the stationary points of (18). In particular, we prove the following result, which completes our duality theory. Aside from the fundamental connection between the GSVD and its duality theory, the eigenvalue deflation in the proof should be of special interest in its own right.

Theorem 2.3. Suppose (x_1; −y_1), …, (x_s; −y_s) are stationary points for the problem (18) with corresponding stationary values σ_1, …, σ_s. Define

(24) u_i := x_i / ‖x_i‖,
(25) v_i := y_i / ‖y_i‖.

Then the columns of the matrices Ũ := [u_1, …, u_s] and Ṽ := [v_1, …, v_s] are the left generalized singular vectors of A and B, respectively.

Proof. Suppose (x_1; −y_1) is an associated stationary point of (18) with the stationary value σ_1. (The ordering of which stationary value is found first is immaterial in the following discussion. We assume σ_1 is found first.) Taking this vector to be the first basis vector in N(Z), we may write

[ Φ ; −Ψ ] = [ x_1, Φ̃ ; −y_1, −Ψ̃ ],

where Φ̃ ∈ R^{m×(s−1)} and Ψ̃ ∈ R^{p×(s−1)} are to be defined below. Consider the stacked matrix

Ẑ := [ A^T, B^T ; N(A^T)^T, 0 ; 0, N(B^T)^T ; x_1^T, 0 ].

Note that, due to the last row in Ẑ, the null space of Ẑ is a proper subspace of that of Z with one less dimension. We may therefore use a basis of the null space of Ẑ to form the columns of the matrix [ Φ̃ ; −Ψ̃ ]. In this way, we attain the additional property that

Φ̃^T x_1 = 0.

Note that the eigenvector of (19) corresponding to the eigenvalue σ_1 is the same as the stationary point for the problem (23) with stationary value σ_1. Since (23) is simply a coordinate representation of (18) and we already assume that (x_1; −y_1) is a stationary point associated with (18), the eigenvector of (19) corresponding to σ_1 must be the unit vector e_1 ∈ R^s. It follows that

Ψ̃^T y_1 = 0,
and hence

Ψ^T Ψ − λ^2 Φ^T Φ = [ y_1^T y_1 − λ^2 x_1^T x_1, 0 ; 0, Ψ̃^T Ψ̃ − λ^2 Φ̃^T Φ̃ ].

We have thus shown that the eigenvalues of the pencil Ψ̃^T Ψ̃ − λ^2 Φ̃^T Φ̃ are exactly those of the pencil Ψ^T Ψ − λ^2 Φ^T Φ with σ_1 excluded. Note that the submatrix [ Φ̃ ; −Ψ̃ ] spans a null subspace of Z that is complementary to the vector (x_1; −y_1). After the first stationary point is found, we may therefore deflate (18) to the problem

(26) max_{Ẑξ = 0, x ≠ 0} ‖y‖ / ‖x‖.

A stationary point of (26) will also be a stationary point of (18), since it gives the same stationary value in both problems. This deflation procedure may be continued until all nonzero stationary values are found. Then, by construction, Ũ^T Ũ = I and Ṽ^T Ṽ = I. Furthermore, we have

Ũ^T A = Σ Ṽ^T B,

which completes the proof. That is, we have derived two matrices Ũ and Ṽ that play the same roles as those of U_2 and V_2 in (9), respectively.

3. Summary. We have discussed a variational formulation for the GSVD of a pair of matrices. In particular, we characterize the role of the left generalized singular vectors in this formulation. We summarize the analogies between the SVD and the GSVD in the following table. The stationary values in any of the variational formulations below give rise to the corresponding singular values. There is a close correspondence between the (generalized) eigenvalue problem and the (generalized) singular value problem, as is indicated in Theorems 1.1 and 1.2. The results in Theorems 2.1 and 2.3 apparently are new and shed light on the understanding of the left singular vectors.

Some of the available numerical methods and approaches for computing the GSVD can be found in [2, 7, 8, 10]. The deflation process used in the characterization of the left singular vectors can be carried out effectively by updating techniques [4]. We anticipate that the discussion here might lead to a new numerical algorithm, especially when only a few singular values are required and the matrix C is sparse.

Acknowledgment. We want to thank an anonymous referee for the many valuable suggestions that significantly improved this paper.
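The construction of Theorem 2.3 can be reproduced numerically without carrying out the explicit deflation, because a symmetric-definite eigensolver already returns eigenvectors that are Φ^TΦ-orthonormal. The following sketch (NumPy/SciPy; an added illustration with arbitrarily chosen dimensions) forms the stationary points x_i = Φv_i, y_i = Ψv_i, normalizes them as in (24) and (25), and confirms the duality relation Ũ^T A = Σ Ṽ^T B together with the orthonormality of Ũ and Ṽ.

```python
import numpy as np
from scipy.linalg import null_space, eigh

rng = np.random.default_rng(4)
m, p, n = 6, 5, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((p, n))

# Basis [Phi; -Psi] of N(Z), with Z as in (7)
NA, NB = null_space(A.T), null_space(B.T)
Z = np.block([[A.T, B.T],
              [NA.T, np.zeros((NA.shape[1], p))],
              [np.zeros((NB.shape[1], m)), NB.T]])
basis = null_space(Z)
Phi, Psi = basis[:m], -basis[m:]

# Stationary points of (18) via the pencil (19); eigh returns
# Phi^T Phi - orthonormal eigenvectors, so no explicit deflation is needed.
lam2, V = eigh(Psi.T @ Psi, Phi.T @ Phi)
sigma = np.sqrt(lam2)
X_, Y_ = Phi @ V, Psi @ V                      # x_i = Phi v_i, y_i = Psi v_i
U_t = X_ / np.linalg.norm(X_, axis=0)          # u_i of (24)
V_t = Y_ / np.linalg.norm(Y_, axis=0)          # v_i of (25)

# Duality relation and orthonormality of the left generalized singular vectors
assert np.allclose(U_t.T @ A, np.diag(sigma) @ (V_t.T @ B))
assert np.allclose(U_t.T @ U_t, np.eye(U_t.shape[1]))
assert np.allclose(V_t.T @ V_t, np.eye(V_t.shape[1]))
```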
                            Regular Problem                        Generalized Problem

Decomposition               U^T A = S V^T                          U_2^T A X = Σ V_2^T B X
                            See formula (1)                        See formula (13)

Right Singular Vector       V                                      X

Variational Formula         max_{x ≠ 0} ‖Ax‖_2 / ‖x‖_2            max_{Bx ≠ 0} ‖Ax‖_2 / ‖Bx‖_2
                                                                   (including zero σ_i)
                            See formula (2)                        See formula (11)

Left Singular Vector        U                                      [U_2, V_2]

Variational Formula         max ‖y‖/‖x‖ over x ≠ 0 with           max ‖y‖/‖x‖ over x ≠ 0 with
                            [A^T, I; N(A^T)^T, 0](x; −y) = 0      Z (x; −y) = 0
                            (= max_{x ≠ 0} ‖A^T x‖_2 / ‖x‖_2)     (only positive σ_i)
                            See formula (12)                       See formula (18)

Table 1
Comparison of variational formulations between the SVD and the GSVD.
REFERENCES

[1] B. Kågström, The generalized singular value decomposition and the (A−λB)-problem, BIT, 24(1984), 568-583.
[2] Z. Bai and J. W. Demmel, Computing the generalized singular value decomposition, SIAM J. Sci. Comput., 14(1993), 1464-1486.
[3] B. De Moor and G. H. Golub, Generalized singular value decompositions: A proposal for a standardized nomenclature, Manuscript NA-89-05, Stanford University, 1989.
[4] G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd ed., The Johns Hopkins University Press, Baltimore, MD, 1989.
[5] I. N. Herstein, Topics in Algebra, 2nd ed., Xerox College Publishing, Lexington, MA, 1975.
[6] C. C. Paige and M. A. Saunders, Towards a generalized singular value decomposition, SIAM J. Numer. Anal., 18(1981), 398-405.
[7] C. C. Paige, Computing the generalized singular value decomposition, SIAM J. Sci. Stat. Comput., 7(1986), 1126-1146.
[8] G. W. Stewart, Computing the CS-decomposition of a partitioned orthonormal matrix, Numer. Math., 40(1982), 297-306.
[9] C. F. Van Loan, Generalizing the singular value decomposition, SIAM J. Numer. Anal., 13(1976), 76-83.
[10] C. F. Van Loan, Computing the CS and the generalized singular value decomposition, Numer. Math., 46(1985), 479-491.