An. Şt. Univ. Ovidius Constanţa, Vol. 18(2), 2010, 55–62

ON THE SINGULAR DECOMPOSITION OF MATRICES

Alina PETRESCU-NIŢǍ

Abstract

This paper is an original presentation of the algorithm of the singular decomposition and, implicitly, of the computation of the pseudoinverse of any matrix with real coefficients. At the same time, we give a geometric interpretation of this decomposition. In the last section, we present an application of the singular decomposition to a problem of pattern recognition.

1 Introduction

Let $A \in M_{m,n}(\mathbb{R})$ be a matrix; for any column vector $X \in \mathbb{R}^n \simeq M_{n,1}(\mathbb{R})$, define $f_A : \mathbb{R}^n \to \mathbb{R}^m$, $f_A(X) = AX$. One may associate to $A$ four remarkable vector spaces: $I(A) = \operatorname{Im} f_A$, $N(A) = \operatorname{Ker} f_A$, $I(A^T)$ and $N(A^T) = \{\, Y \mid A^T Y = 0 \,\}$. $I(A)$ is the vector subspace of $\mathbb{R}^m$ generated by the columns of the matrix $A$, whereas $I(A^T)$ is the subspace of $\mathbb{R}^n$ generated by the rows of $A$. If the rank of the matrix $A$ is $\rho(A) = r$, then $\dim I(A) = r$, $\dim N(A) = n-r$, $\dim I(A^T) = r$, $\dim N(A^T) = m-r$.

By convention, any vector $x \in \mathbb{R}^n$, $x = (x_1, x_2, \ldots, x_n)$, can be identified with the corresponding column vector $X$; for any $x, y \in \mathbb{R}^n$, the Euclidean scalar product is defined by $\langle x, y \rangle = X^T Y$.

The following properties are well known:

1. The subspace $N(A)$ is orthogonal to $I(A^T)$, and $N(A^T)$ is orthogonal to $I(A)$.

Key Words: pseudoinverse of a matrix; algorithm of singular decomposition; geometric interpretation; pattern recognition of plane images.
Mathematics Subject Classification: 15A09
Received: August, 2009
Accepted: January, 2010
2. The following orthogonal decompositions hold: $\mathbb{R}^n = N(A) \oplus I(A^T)$, $\mathbb{R}^m = N(A^T) \oplus I(A)$.

3. The square matrix $A^T A$ of order $n$ is symmetric and non-negatively defined, i.e. $X^T A^T A X \ge 0$ for any column vector $X$.

4. For any matrix $A \in M_{m,n}(\mathbb{R})$ of rank $r$, the number of positive eigenvalues of the matrix $A^T A$ is equal to $r$.

5. We suppose that $m \ge n$. Let $\lambda_1, \lambda_2, \ldots, \lambda_r$ be the positive eigenvalues of the matrix $A^T A$ and $\sigma_k = \sqrt{\lambda_k}$, $1 \le k \le r$; these are called the singular numbers of the matrix $A$. There exists an orthonormal basis $\{v_1, v_2, \ldots, v_n\}$ of $\mathbb{R}^n$ formed by unitary eigenvectors of $A^T A$, such that $A^T A v_i = \sigma_i^2 v_i$, $1 \le i \le r$, and $A^T A v_j = 0$, $r+1 \le j \le n$. If we denote $u_i = \frac{1}{\sigma_i} A v_i$, $1 \le i \le r$, it follows that the vectors $v_1, v_2, \ldots, v_r$ form an orthonormal basis for $I(A^T)$ and $v_{r+1}, \ldots, v_n$ form an orthonormal basis for $N(A)$. Further, $u_1, u_2, \ldots, u_r$ form an orthonormal basis for $I(A)$, which can be extended to an orthonormal basis $u_1, \ldots, u_r, u_{r+1}, \ldots, u_m$ of $\mathbb{R}^m$; the added vectors $u_{r+1}, \ldots, u_m$ form an orthonormal basis for $N(A^T)$.

6. The orthogonal matrices $U = (u_1\ u_2\ \ldots\ u_m)$ and $V = (v_1\ v_2\ \ldots\ v_n)$ are invertible (the inverse is just the transpose). We have $AV = (Av_1\ Av_2\ \ldots\ Av_n) = (\sigma_1 u_1\ \sigma_2 u_2\ \ldots\ \sigma_r u_r\ 0\ \ldots\ 0)$, with $\sigma_i > 0$, $i = 1, 2, \ldots, r$, and if we denote $S = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r)$ and $\Sigma = \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}$, the latter being an $m \times n$ matrix, it follows that $AV = U\Sigma$, hence the following relation holds:
$$A = U \Sigma V^T. \qquad (1)$$
This is the singular decomposition of the matrix $A$. If $A = 0$, then $\Sigma = 0$.

Until now we have supposed that $m \ge n$. If $A \in M_{m,n}(\mathbb{R})$ and $m < n$, then we may consider the $n \times m$ matrix $B = A^T$. According to properties 5 and 6, we have the singular decomposition $B = U_1 \Sigma_1 V_1^T$, and it follows that
$$A = B^T = V_1 \Sigma_1^T U_1^T. \qquad (2)$$
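The construction in properties 5 and 6 translates directly into a computation. The following is a minimal NumPy sketch of it, assuming $m \ge n$; the helper name `svd_via_gram`, the rank tolerance, the QR-based completion of the basis, and the sample matrix are choices of this illustration, not prescribed by the text:

```python
import numpy as np

def svd_via_gram(A, tol=1e-10):
    """Singular decomposition A = U @ Sigma @ V.T built, as in properties
    5 and 6, from the eigendecomposition of A.T @ A (assumes m >= n)."""
    m, n = A.shape
    lam, V = np.linalg.eigh(A.T @ A)          # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]             # reorder them descending
    lam, V = lam[order], V[:, order]
    lam = np.clip(lam, 0.0, None)             # clear tiny negative round-off
    r = int(np.sum(lam > tol))                # r = rank(A)
    sigma = np.sqrt(lam[:r])                  # singular numbers sigma_i
    U_r = (A @ V[:, :r]) / sigma              # u_i = (1/sigma_i) A v_i
    # complete u_1, ..., u_r to an orthonormal basis of R^m
    Q, _ = np.linalg.qr(np.hstack([U_r, np.eye(m)]))
    U = np.hstack([U_r, Q[:, r:]])
    Sigma = np.zeros((m, n))
    Sigma[:r, :r] = np.diag(sigma)
    return U, Sigma, V

# a 3 x 2 sample of rank 2
A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
U, S, V = svd_via_gram(A)
assert np.allclose(U @ S @ V.T, A)            # relation (1)
assert np.allclose(U.T @ U, np.eye(3))        # U is orthogonal
```

For this sample, $A^T A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ has eigenvalues $3$ and $1$, so the computed singular numbers are $\sigma_1 = \sqrt{3}$, $\sigma_2 = 1$.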
The main application of the singular decomposition is an explicit way to compute the pseudoinverse $A^+$ of an arbitrary non-null matrix $A \in M_{m,n}(\mathbb{R})$: if $A = U \Sigma V^T$ according to (1), then
$$A^+ = V \Sigma^+ U^T, \quad \text{where } \Sigma^+ = \begin{pmatrix} S^{-1} & 0 \\ 0 & 0 \end{pmatrix} \in M_{n,m}(\mathbb{R}) \text{ and } S^{-1} = \operatorname{diag}\Big(\frac{1}{\sigma_1}, \frac{1}{\sigma_2}, \ldots, \frac{1}{\sigma_r}\Big).$$

NOTE. One knows that for any $A \in M_{m,n}(\mathbb{R})$ there is $A^+ \in M_{n,m}(\mathbb{R})$, called the pseudoinverse of $A$, such that for any $y \in \mathbb{R}^m$ the minimum of the Euclidean norm $\|Ax - y\|$ is attained at $x = A^+ y$.

2 The Singular Decomposition Algorithm

One knows that any rectangular matrix (or square matrix, invertible or not) admits a singular decomposition given by (1) or (2). Suppose now that the matrix $A \in M_{m,n}(\mathbb{R})$ is given and $m \ge n$. If $m < n$, then the algorithm applies to the matrix $A^T$.

Step 1. Compute the symmetric matrix $A^T A \in M_n(\mathbb{R})$ and determine its nonzero eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_r$, as well as the singular numbers $\sigma_1 = \sqrt{\lambda_1}, \sigma_2 = \sqrt{\lambda_2}, \ldots, \sigma_r = \sqrt{\lambda_r}$, where $\rho(A) = r$.

Step 2. Determine an orthonormal basis $\{v_1, v_2, \ldots, v_n\}$ of $\mathbb{R}^n$, formed by unitary eigenvectors of $A^T A$, and denote by $V \in M_n(\mathbb{R})$ the orthogonal matrix whose columns are the vectors $v_1, v_2, \ldots, v_n$.

Step 3. Compute the unitary column vectors $u_i = \frac{1}{\sigma_i} A v_i$ for $1 \le i \le r$ and complete them to an orthonormal basis $\{u_1, \ldots, u_r, u_{r+1}, \ldots, u_m\}$ of $\mathbb{R}^m$. Denote by $U \in M_m(\mathbb{R})$ the orthogonal matrix formed by the column vectors $u_1, \ldots, u_r, u_{r+1}, \ldots, u_m$.

Step 4. Taking $S = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r)$ and defining the $m \times n$ matrix $\Sigma = \begin{pmatrix} S & 0 \\ 0 & 0 \end{pmatrix}$, with $S$ in the upper left corner, one obtains the singular decomposition (1).

Example 1. Take $A = \begin{pmatrix} 2 & 1 \\ 2 & 1 \end{pmatrix}$, hence $m = 2$, $n = 2$, $r = 1$. The matrix $A^T A$ has the eigenvalues $\lambda_1 = 10$, $\lambda_2 = 0$, with the unitary eigenvectors $v_1 = \frac{1}{\sqrt{5}}(2, 1)^T$, $v_2 = \frac{1}{\sqrt{5}}(-1, 2)^T$. Then $u_1 = \frac{1}{\sqrt{10}} A v_1 = \frac{1}{\sqrt{2}}(1, 1)^T$, and take $u_2 = \frac{1}{\sqrt{2}}(-1, 1)^T$. So
$$U = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \quad V = \frac{1}{\sqrt{5}}\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \sqrt{10} & 0 \\ 0 & 0 \end{pmatrix},$$
and finally, $A = U \Sigma V^T$.
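The formula $A^+ = V \Sigma^+ U^T$ can be checked numerically. Below is a small sketch; the helper name `pinv_via_svd`, the rank tolerance, and the rank-one sample matrix are choices of this illustration (NumPy's `numpy.linalg.svd` returns the factors of (1) directly):

```python
import numpy as np

def pinv_via_svd(A, tol=1e-10):
    """Pseudoinverse A+ = V @ Sigma+ @ U.T, where Sigma+ carries the block
    S^{-1} = diag(1/sigma_1, ..., 1/sigma_r) in its upper left corner."""
    m, n = A.shape
    U, s, Vt = np.linalg.svd(A)               # library SVD: A = U diag(s) Vt
    r = int(np.sum(s > tol))                  # r = rank(A)
    Sigma_plus = np.zeros((n, m))
    Sigma_plus[:r, :r] = np.diag(1.0 / s[:r])
    return Vt.T @ Sigma_plus @ U.T

A = np.array([[2.0, 1.0], [2.0, 1.0]])        # rank-one sample matrix
Ap = pinv_via_svd(A)
# the four Moore-Penrose conditions characterizing A+
assert np.allclose(A @ Ap @ A, A)
assert np.allclose(Ap @ A @ Ap, Ap)
assert np.allclose((A @ Ap).T, A @ Ap)
assert np.allclose((Ap @ A).T, Ap @ A)
# x = A+ y attains the minimum of ||Ax - y|| (cf. the NOTE above)
y = np.array([1.0, 0.0])
x = Ap @ y
x_ls = np.linalg.lstsq(A, y, rcond=None)[0]   # least-squares reference
assert np.isclose(np.linalg.norm(A @ x - y), np.linalg.norm(A @ x_ls - y))
```

For a rank-one matrix $A = \sigma_1 u_1 v_1^T$ the formula collapses to $A^+ = \frac{1}{\sigma_1} v_1 u_1^T = \frac{1}{\lambda_1} A^T$, which gives a quick independent check of the computed result.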
Example 2. For $B = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$ and $A = B^T$, we get $A^T A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$; $\lambda_1 = 3$, $\lambda_2 = 1$, and the singular numbers of $A$ are $\sigma_1 = \sqrt{3}$, $\sigma_2 = 1$, $r = 2$. Hence the unitary eigenvectors of $A^T A$ are $v_1 = \frac{1}{\sqrt{2}}(1, 1)^T$ and $v_2 = \frac{1}{\sqrt{2}}(1, -1)^T$. Take $u_1 = \frac{1}{\sqrt{3}} A v_1 = \big(\frac{1}{\sqrt{6}}, \frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}}\big)^T$, $u_2 = A v_2 = \big(\frac{1}{\sqrt{2}}, 0, -\frac{1}{\sqrt{2}}\big)^T$, and we complete $u_1, u_2$ to an orthonormal basis of $\mathbb{R}^3$. We take $u_3 = (a, b, c)^T$ with unknown components and impose the conditions $u_3 \perp u_1$, $u_3 \perp u_2$ and $a^2 + b^2 + c^2 = 1$. It follows that $u_3 = \big(\frac{1}{\sqrt{3}}, -\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\big)^T$, and we denote
$$U = \begin{pmatrix} \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \\ \frac{2}{\sqrt{6}} & 0 & -\frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{6}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{3}} \end{pmatrix}, \quad V = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \sqrt{3} & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix};$$
finally, the singular decomposition is $A = U \Sigma V^T$ and $B = V \Sigma^T U^T$.

3 Geometric Interpretation of the Singular Decomposition

From a geometrical point of view, any orthogonal matrix $U \in M_m(\mathbb{R})$ corresponds to a rotation of the space $\mathbb{R}^m$. For $m = 2$, an orthogonal matrix $U \in M_2(\mathbb{R})$ has the form
$$U = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \quad \theta \in \mathbb{R},$$
and the application $f_U : \mathbb{R}^2 \to \mathbb{R}^2$, $f_U(x, y) = (x', y')$, becomes $x' = x\cos\theta - y\sin\theta$, $y' = x\sin\theta + y\cos\theta$. These are the formulae of the plane rotation with angle $\theta$ around the origin. This fact generalizes to higher dimensions.

Any matrix of type $S$ or $\Sigma$ corresponds, from a geometrical point of view, to a scale change. For instance, if $n = 2$ and $S = \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix}$, the application $f_S : \mathbb{R}^2 \to \mathbb{R}^2$, $f_S(x, y) = (x', y')$, becomes $x' = \sigma_1 x$ and $y' = \sigma_2 y$, and we get the plane scale change formulae.
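The two kinds of factors can be watched acting on the unit circle: applying $V^T$ leaves the circle unchanged, $\Sigma$ scales the axes, and $U$ rotates the result, so the image of the unit circle under $f_A$ is an ellipse with semi-axes $\sigma_1, \sigma_2$. A minimal NumPy sketch, with an arbitrary invertible sample matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])        # arbitrary invertible sample
U, s, Vt = np.linalg.svd(A)                   # A = U @ diag(s) @ Vt

# U and Vt are orthogonal, hence length-preserving ("rotations",
# possibly composed with a reflection)
assert np.allclose(U.T @ U, np.eye(2))
assert np.allclose(Vt @ Vt.T, np.eye(2))

# sample points of the unit circle
t = np.linspace(0.0, 2.0 * np.pi, 400)
circle = np.vstack([np.cos(t), np.sin(t)])    # shape (2, 400)

# f_A = f_U o f_Sigma o f_{V^T}, applied factor by factor
step1 = Vt @ circle                           # rotation: still the unit circle
assert np.allclose(np.linalg.norm(step1, axis=0), 1.0)
step2 = np.diag(s) @ step1                    # scale change by sigma_1, sigma_2
ellipse = U @ step2                           # final rotation
assert np.allclose(ellipse, A @ circle)

# the semi-axes of the image ellipse are the singular numbers
radii = np.linalg.norm(ellipse, axis=0)
assert np.isclose(radii.max(), s[0], atol=1e-3)
assert np.isclose(radii.min(), s[1], atol=1e-3)
```

The intermediate `step1` still lies on the unit circle, which makes the role of each factor visible; the last two assertions anticipate the description of $E_n = f_A(S^{n-1})$ in the next paragraphs.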
Proposition. Any linear application $f : \mathbb{R}^n \to \mathbb{R}^n$ is a composition of a rotation with a scale change, followed by another rotation.

Proof. Let $A$ be the matrix associated to the linear application $f$ with respect to the canonical basis. According to (1), the singular decomposition of the matrix $A$ has the form $A = U \Sigma V^T$, with $U$ and $V$ orthogonal matrices and $\Sigma$ a diagonal matrix. Then $f = f_A = f_U \circ f_\Sigma \circ f_{V^T}$. The maps $f_U$ and $f_{V^T}$ correspond to orthogonal matrices and they represent rotations, whereas $f_\Sigma$ is a scale change, namely $f_\Sigma(x_1, x_2, \ldots, x_n) = (\sigma_1 x_1, \ldots, \sigma_r x_r, 0, \ldots, 0)$.

Figure 1. The image of the unit sphere $S^{n-1}$ under $f_A$.

Relation (1) thus corresponds to the statement of the proposition.

Let $A \in M_n(\mathbb{R})$ be a nonsingular square matrix (hence $m = n = r$) and let $f_A : \mathbb{R}^n \to \mathbb{R}^n$ be the linear application associated to $A$ in the canonical basis of $\mathbb{R}^n$. Through the application $f_A$, the unit sphere $S^{n-1} = \{x \in \mathbb{R}^n \mid \|x\| = 1\}$ is transformed into an $n$-dimensional ellipsoid $E_n = f_A(S^{n-1})$. Indeed, if $y = f_A(x)$, it follows that $Y = AX$, $X = A^{-1}Y$, hence
$$1 = \|X\|^2 = \|A^{-1}Y\|^2 = \langle A^{-1}Y, A^{-1}Y \rangle = Y^T (A^{-1})^T A^{-1} Y,$$
and the matrix $C = (A^{-1})^T A^{-1}$ is positively defined; hence the set $E_n = \{\, Y = AX \mid X \in S^{n-1} \,\}$ defines an ellipsoid.

The recursive construction of the orthonormal bases $v_1, v_2, \ldots, v_n$ and $u_1, \ldots, u_n$ also has a geometric interpretation, which is presented in the sequel. Let $w$ be
60 Alina PETRESCU-NIŢǍ a radial vector of maximal length in the ellipsoid and v = A w. If we denote by H the tangent hyperplane at v to the unitary sphere S n and H = f A H, then it follows that the hyperplane H is tangent at w to the ellipsoid E n see the figure. Indeed, we have w H and H has just one common point with the ellipsoid E n otherwise, since the application f A is bijective, it would follow that H is not tangent to the sphere.. Considering the restriction g of the linear application f A to H, we obtain a linear application g : H H, for which we can do the previous construction. This geometric interpretation leads to the singular decomposition without appealing the study of matrix A T A. Moreover, w H. We take v = v,u = w w 4 An application of the Singular Decomposition to the Classification of D Images Let A M m,n R be the gray levels matrix of a D black-white image for example: a photography, a map, a black-white TV image, etc.. Such a matrix can be obtained by splitting the image with a rectangular network, and associate to each node i,j, i m, j n, the gray level expressed as an integer number in the range between 0 and 63, where, for example, 0 stands for absolute white and 63 stands for absolute black. Let us consider the singular decomposition A = UΣV T = r λ i u i vi T, r = ρa, i= where λ > λ >... > λ r > 0 and λ,λ,...,λ r are the nonzero eigenvalues of matrix AA T. If A = a ij, i m, j n, then the Frobenius norm A F = i,j a ij / can be called the energy of the considered image. If the small eigenvalues are eliminated, we obtain and approximation A = k r λ i u i vi T,k << r and A A F = λi/. i= i=k+ If B M m,n R is another matrix, then matrix B = U ΣV T = r can be called the projective image of B on A, B = U.Σ.V T = λ r i= i u i vi T i= λ iu i vi T, where Σ = λ,...,λ r,0,...,0 and λ i = u T i Bv i, i ρ. We have: B B F A B F + r / λi λ i. For similar D images of the same class, the distance between the associated matrices i.e. 
the Frobenius norm of their difference, is also
small; passing to the projective images, the distances become even smaller, because $\|B' - C'\|_F \le \|B - C\|_F$ for any two matrices $B, C \in M_{m,n}(\mathbb{R})$.

Let us suppose that, for an image class $\omega$, we have the learning sample $A_1, A_2, \ldots, A_N \in M_{m,n}(\mathbb{R})$. The average is given by $\mu = \frac{1}{N}(A_1 + \ldots + A_N)$ and, as above, the singular decomposition of the average is $\mu = U \Sigma V^T = \sum_{i=1}^{r} \sqrt{\lambda_i}\, u_i v_i^T$. Similarly, we get the projective images $A_1', \ldots, A_N'$ on $\mu$, hence $A_i' = \sum_{j=1}^{r} x_j^{(i)} u_j v_j^T$, $1 \le i \le N$, where $x_j^{(i)} = u_j^T A_i v_j$, $1 \le j \le r$. The vector $X^{(i)} = (x_1^{(i)}, \ldots, x_r^{(i)})$ can be interpreted as the coordinates vector of the projective image on $\mu$ of the matrix $A_i$, $1 \le i \le N$.

The algorithm of supervised classification of images

Let $\omega_1, \omega_2, \ldots, \omega_M$ be $M$ classes of images (already existing classes). We suppose that each class $\omega_i$ is represented by $N_i$ learning-sample matrices $A_1^{(i)}, A_2^{(i)}, \ldots, A_{N_i}^{(i)}$ belonging to $M_{m,n}(\mathbb{R})$.

Step 1. Compute the average $\mu_i = \frac{1}{N_i}\big(A_1^{(i)} + \ldots + A_{N_i}^{(i)}\big)$ and the singular decomposition of this matrix, hence the set of pairs $\big(u_j^{(i)}, v_j^{(i)}\big)$, $1 \le j \le k$, $k \le \min(m, n)$, $1 \le i \le M$.

Step 2. Compute the vectors of the coordinates of the projective images on $\mu_i$, i.e. $X_j^{(i)} = \big(x_{j,1}^{(i)}, \ldots, x_{j,k}^{(i)}\big)$, where $x_{j,p}^{(i)} = \big(u_p^{(i)}\big)^T A_j^{(i)} v_p^{(i)}$, $1 \le p \le k$, $1 \le j \le N_i$, $1 \le i \le M$.

Step 3. Compute the centers of the classes $\omega_i$ by $X_C^{(i)} = \frac{1}{N_i} \sum_{j=1}^{N_i} X_j^{(i)}$.

Step 4 (the recognition step). For any unclassified new 2D image $F \in M_{m,n}(\mathbb{R})$, compute the projective images of $F$ on $\mu_i$, $1 \le i \le M$, and the corresponding coordinate vectors $Y_1, \ldots, Y_M$. If we denote $z_p^{(i)} = \big(u_p^{(i)}\big)^T F v_p^{(i)}$, $1 \le p \le k$, we have $Y_i = \big(z_1^{(i)}, \ldots, z_k^{(i)}\big)$, $1 \le i \le M$. If $\min_{1 \le i \le M} \big\|Y_i - X_C^{(i)}\big\|$ is reached for an index $i = i_0$ (not necessarily unique), then the image $F$ is placed in the class $\omega_{i_0}$.

References

[1] Ben-Israel, A., Greville, T.N.E., Generalized Inverses: Theory and Applications, Springer-Verlag, New York, 2003.

[2] Niţǎ, A., Generalized inverse of a matrix with applications to optimization of some systems, Ph.D. Thesis (in Romanian), University of Bucharest, Faculty of Mathematics and Computer Science, 2004.
[3] Strang, G., Linear Algebra and Its Applications, Academic Press, 1976.
[4] Wang, J., A recurrent neural network for real-time matrix inversion, Appl. Math. Comput., 55 (1993), 89–100.

[5] Wang, J., Recurrent neural networks for computing pseudoinverses of rank-deficient matrices, SIAM J. Sci. Comput., 18 (1997), 1479–1493.

University POLITEHNICA of Bucharest
Department of Mathematics
Splaiul Independenţei 313, RO-Bucharest, Romania
e-mail: anita@euler.math.pub.ro