MATH : Computational Methods of Linear Algebra

Lecture Note 9: Orthogonal Reduction

Xinyi Zeng
Department of Mathematical Sciences, UTEP

1 The Row Echelon Form

Our target is to solve the normal equation:
\[
A^t A x = A^t b, \tag{1.1}
\]
where $A \in \mathbb{R}^{m \times n}$ is arbitrary; we have shown previously that this is equivalent to the least squares problem:
\[
\min_{x \in \mathbb{R}^n} \|Ax - b\|. \tag{1.2}
\]

A first observation we can make is that (1.1) seems familiar! As $A^t A \in \mathbb{R}^{n \times n}$ is symmetric positive semi-definite, we can try to compute the Cholesky decomposition $A^t A = L L^t$ for some lower-triangular matrix $L \in \mathbb{R}^{n \times n}$. One problem with this approach is that we are not fully exploiting our information: in the Cholesky decomposition we treat $A^t A$ as a single entity and ignore the information about $A$ itself.

In particular, the structure of $A^t A$ motivates us to study a factorization $A = QE$, where $Q \in \mathbb{R}^{m \times m}$ is orthogonal and $E \in \mathbb{R}^{m \times n}$ is to be determined. Then we may transform the normal equation to:
\[
E^t E x = E^t Q^t b, \tag{1.3}
\]
where the identity $Q^t Q = I_m$ (the identity matrix in $\mathbb{R}^{m \times m}$) is used. This normal equation is equivalent to the least squares problem with $E$:
\[
\min_{x \in \mathbb{R}^n} \|Ex - Q^t b\|. \tag{1.4}
\]

Because an orthogonal transformation preserves the $L^2$-norm, (1.2) and (1.4) are equivalent to each other. Indeed, for any $x \in \mathbb{R}^n$:
\[
\begin{aligned}
\|Ax - b\|^2 &= (b - Ax)^t (b - Ax) = (b - QEx)^t (b - QEx) = [Q(Q^t b - Ex)]^t [Q(Q^t b - Ex)] \\
&= (Q^t b - Ex)^t Q^t Q (Q^t b - Ex) = (Q^t b - Ex)^t (Q^t b - Ex) = \|Ex - Q^t b\|^2.
\end{aligned}
\]

Hence the target is to find an $E$ such that (1.4) is easier to solve. Motivated by the Cholesky decomposition, we would like to find an $E$ with a structure similar to that of upper-triangular matrices. To this end, we say that $E \in \mathbb{R}^{m \times n}$ is of the row echelon form defined below.

Definition 1. Let $E = [e_{ij}] \in \mathbb{R}^{m \times n}$ be arbitrary. For each row number $1 \le i \le m$ we define a positive number $n_i$ such that $e_{i n_i} \ne 0$ and $e_{ij} = 0$ for all $j < n_i$. If the entire $i$-th row is zero, we set $n_i = n + 1$. Then the matrix $E$ is said to have the row echelon form if and only if the sequence $\{n_1, n_2, \dots, n_m\}$ is strictly increasing until it reaches and stays at the value $n + 1$.
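The row echelon definition can be checked mechanically. The following is a minimal sketch in plain Python, assuming a matrix stored as a list of rows; the function names `pivot_columns` and `is_row_echelon` and the tolerance parameter `tol` are illustrative choices, not part of the notes.

```python
def pivot_columns(E, tol=0.0):
    """Return [n_1, ..., n_m] with 1-based column indices.

    n_i is the column of the first entry in row i whose magnitude exceeds
    tol; a row with no such entry gets n_i = n + 1, as in the definition.
    """
    n = len(E[0])
    pivots = []
    for row in E:
        n_i = n + 1
        for j, value in enumerate(row, start=1):
            if abs(value) > tol:
                n_i = j
                break
        pivots.append(n_i)
    return pivots


def is_row_echelon(E, tol=0.0):
    """True if n_1, n_2, ... is strictly increasing until it stays at n + 1."""
    n = len(E[0])
    pivots = pivot_columns(E, tol)
    for prev, cur in zip(pivots, pivots[1:]):
        # Either a strict increase, or both rows are already zero rows.
        if not (prev < cur or prev == cur == n + 1):
            return False
    return True


E = [[0.0, 2.0, 1.0, 3.0],
     [0.0, 0.0, 0.0, 5.0],
     [0.0, 0.0, 0.0, 0.0]]
print(pivot_columns(E))   # [2, 4, 5]
print(is_row_echelon(E))  # True
```

In floating-point work a small positive `tol` would usually replace the exact zero test; the default of zero simply mirrors the definition as stated.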
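Graphically, such a matrix has a staircase pattern: with $l$ denoting the last non-zero row of $E$,
\[
E = \begin{bmatrix}
0 \,\cdots\, 0 & e_{1n_1} & \cdots & \cdots & \cdots & \cdots & e_{1n} \\
0 \,\cdots\, 0 & 0 & \cdots\ 0 & e_{2n_2} & \cdots & \cdots & e_{2n} \\
\vdots & & & & \ddots & & \vdots \\
0 \,\cdots\, 0 & 0 & \cdots & \cdots & 0 & e_{ln_l} \ \cdots & e_{ln} \\
0 \,\cdots\, 0 & 0 & \cdots & \cdots & \cdots & \cdots & 0 \\
\vdots & & & & & & \vdots \\
0 \,\cdots\, 0 & 0 & \cdots & \cdots & \cdots & \cdots & 0
\end{bmatrix}. \tag{1.5}
\]

We will see that for a matrix of row echelon form, the least squares problem (1.4), $\min_x \|Ex - Q^t b\|$, is easy to solve. Let $d = Q^t b$; then the residual vector is given by:
\[
Ex - d = \begin{bmatrix}
e_{1n_1} x_{n_1} + e_{1,n_1+1} x_{n_1+1} + \cdots + e_{1n} x_n - d_1 \\
e_{2n_2} x_{n_2} + e_{2,n_2+1} x_{n_2+1} + \cdots + e_{2n} x_n - d_2 \\
\vdots \\
e_{ln_l} x_{n_l} + e_{l,n_l+1} x_{n_l+1} + \cdots + e_{ln} x_n - d_l \\
-d_{l+1} \\
\vdots \\
-d_m
\end{bmatrix},
\]
where $l$ is the last non-zero row of $E$. Note that except for the first component, all other components of the residual are independent of $x_{n_1}$; hence we must have:
\[
x_{n_1} = \frac{1}{e_{1n_1}} \Bigl( d_1 - \sum_{j=n_1+1}^{n} e_{1j} x_j \Bigr). \tag{1.6}
\]
Similarly, if $l \ge 2$ we have $e_{2n_2} \ne 0$ and we deduce:
\[
x_{n_2} = \frac{1}{e_{2n_2}} \Bigl( d_2 - \sum_{j=n_2+1}^{n} e_{2j} x_j \Bigr). \tag{1.7}
\]
We can continue on, and eventually reach for all $1 \le k \le l$:
\[
x_{n_k} = \frac{1}{e_{kn_k}} \Bigl( d_k - \sum_{j=n_k+1}^{n} e_{kj} x_j \Bigr). \tag{1.8}
\]

Hence the solution to the least squares problem (1.4) can be computed as follows:

1. Choose $x_i$, $i \notin \{n_1, \dots, n_l\}$, arbitrarily (for example, zero).
2. Use (1.8) to compute $x_{n_l}, x_{n_{l-1}}, \dots, x_{n_1}$ recursively.

In this way, we have reduced the problem to finding a factorization $A = QE$ such that $Q$ is orthogonal and $E$ is of the row echelon form.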
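The two-step procedure above can be sketched in a few lines of Python. This is a minimal illustration, assuming $E$ is already in row echelon form and stored as a list of rows; the function name `solve_echelon_least_squares` and the choice of zero for the free variables are assumptions made for the example.

```python
def solve_echelon_least_squares(E, d, tol=0.0):
    """Minimize ||E x - d|| for a row echelon E by back substitution (1.8).

    Free variables (columns that are not pivot columns) are set to zero.
    Indices are 0-based here, unlike the 1-based indices in the text.
    """
    m, n = len(E), len(E[0])
    x = [0.0] * n

    # Locate the pivot column n_k of each non-zero row k = 1, ..., l.
    pivots = []
    for k in range(m):
        n_k = next((j for j in range(n) if abs(E[k][j]) > tol), None)
        if n_k is None:
            break  # in row echelon form, all later rows are zero as well
        pivots.append((k, n_k))

    # Work upward from the last non-zero row: x_{n_l}, ..., x_{n_1}.
    for k, n_k in reversed(pivots):
        tail = sum(E[k][j] * x[j] for j in range(n_k + 1, n))
        x[n_k] = (d[k] - tail) / E[k][n_k]
    return x


E = [[2.0, 1.0, 1.0],
     [0.0, 0.0, 4.0],
     [0.0, 0.0, 0.0]]
d = [5.0, 8.0, 3.0]
x = solve_echelon_least_squares(E, d)
residual = [sum(E[r][j] * x[j] for j in range(3)) - d[r] for r in range(3)]
print(x, residual)
```

In this example the first two residual components vanish and the third equals $-d_3$, which no choice of $x$ can change.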
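2 Givens Rotation

A basic tool to find the factorization $A = QE$ is the Givens rotation. Let us consider a simple example in $\mathbb{R}^2$.

[Figure 1: Rotation by $\theta$ in $\mathbb{R}^2$, showing the vector $\mathbf{x}$ and its image $G\mathbf{x}$.]

In particular, we want to rotate a vector $\mathbf{x} = [x, y]^t$ by an angle $\theta$ counter-clockwise to a new vector $G\mathbf{x} = [x', y']^t$. According to Figure 1, we assume:
\[
x = r\cos\varphi, \qquad y = r\sin\varphi,
\]
where $r = \|\mathbf{x}\| = \|G\mathbf{x}\|$. Then the two coordinates of $G\mathbf{x}$ are given by:
\[
\begin{aligned}
x' &= r\cos(\varphi + \theta) = r(\cos\varphi\cos\theta - \sin\varphi\sin\theta) = \cos\theta\, x - \sin\theta\, y, \\
y' &= r\sin(\varphi + \theta) = r(\sin\varphi\cos\theta + \cos\varphi\sin\theta) = \cos\theta\, y + \sin\theta\, x.
\end{aligned}
\]
Thus we conclude that the rotation matrix $G$ is:
\[
G = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}. \tag{2.1}
\]

In multiple dimensions, we consider the rotations that keep all but two coordinates constant. In $\mathbb{R}^3$, these operations are the rotations about one of the three axes. In particular, let the indices of the two modified coordinates be $i$ and $j$; then the rotation by an angle $\theta$ is equivalent to pre-multiplication by the Givens matrix $G_{i,j}(\theta)$:
\[
G_{i,j}(\theta) = \begin{bmatrix}
1 & & & & & & \\
& \ddots & & & & & \\
& & \cos\theta & \cdots & -\sin\theta & & \\
& & \vdots & \ddots & \vdots & & \\
& & \sin\theta & \cdots & \cos\theta & & \\
& & & & & \ddots & \\
& & & & & & 1
\end{bmatrix}, \tag{2.2}
\]
where the entries $\cos\theta$ and $-\sin\theta$ are in row $i$, the entries $\sin\theta$ and $\cos\theta$ are in row $j$, and both pairs occupy columns $i$ and $j$; all other entries agree with the identity matrix. All Givens matrices are orthogonal.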
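As a quick numerical illustration of the Givens matrix and its orthogonality, here is a minimal sketch in plain Python. The helper names `givens_matrix`, `matmul` and `transpose` are hypothetical, and matrices are stored as lists of rows.

```python
import math


def givens_matrix(m, i, j, theta):
    """Build the m x m Givens matrix G_{i,j}(theta), with 1-based i < j."""
    G = [[1.0 if r == c else 0.0 for c in range(m)] for r in range(m)]
    c, s = math.cos(theta), math.sin(theta)
    G[i - 1][i - 1] = c
    G[i - 1][j - 1] = -s
    G[j - 1][i - 1] = s
    G[j - 1][j - 1] = c
    return G


def matmul(A, B):
    """Plain triple-loop matrix product of two list-of-rows matrices."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


# Orthogonality check: G^t G should be the identity up to rounding.
G = givens_matrix(4, 2, 4, 0.7)
GtG = matmul(transpose(G), G)
max_err = max(abs(GtG[r][c] - (1.0 if r == c else 0.0))
              for r in range(4) for c in range(4))
print(max_err)

# A counter-clockwise quarter turn in the plane sends [1, 0]^t to [0, 1]^t.
R = givens_matrix(2, 1, 2, math.pi / 2)
print(matmul(R, [[1.0], [0.0]]))
```

The printed error is of the order of machine precision rather than exactly zero, since $\cos^2\theta + \sin^2\theta$ is only evaluated approximately in floating point.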
3 Orthogonal Reduction by Givens Rotations

The idea here is to apply a sequence of Givens rotations to the left of $A$ so that the latter is transformed into the row echelon form. We have learned from the process of Gaussian elimination that left multiplication corresponds to row manipulations, and we see further similarities between the orthogonal reduction procedure here and Gaussian elimination. That is, the last elements of a column of $A$ are transformed to zeroes by row operations. The tool of choice is lower-triangular matrices for Gaussian elimination, whereas it is orthogonal matrices (or more specifically the product of a sequence of Givens matrices) in the current situation.

First we look at the product $G_{1,2}(\theta)A$, where $\theta$ is a number to be determined. Denote the $i$-th row of $A$ by $a_i^t$, $1 \le i \le m$, and denote the $i$-th column of a generic matrix $M$ by $[M]_i$; then:
\[
G_{1,2}(\theta) A = \begin{bmatrix}
\cos\theta\, a_1^t - \sin\theta\, a_2^t \\
\sin\theta\, a_1^t + \cos\theta\, a_2^t \\
a_3^t \\
\vdots \\
a_m^t
\end{bmatrix}
\quad\Longrightarrow\quad
[G_{1,2}(\theta) A]_1 = \begin{bmatrix}
\cos\theta\, a_{11} - \sin\theta\, a_{21} \\
\sin\theta\, a_{11} + \cos\theta\, a_{21} \\
a_{31} \\
\vdots \\
a_{m1}
\end{bmatrix}.
\]
Note that all the rows except for the first two are not changed at all. We may choose $\theta$ such that $\sin\theta\, a_{11} + \cos\theta\, a_{21} = 0$, or equivalently:
\[
\theta = -\arctan\frac{a_{21}}{a_{11}}, \tag{3.1}
\]
and the $(2,1)$-element of $G_{1,2}(\theta)A$ becomes zero. The advantage of the Givens transformation over Gaussian elimination is that (3.1) is well-defined even when $a_{11} = 0$, in which case $\theta = \pm\pi/2$ and $G_{1,2}(\theta)$ can still be computed. We shall denote this particular Givens matrix by $G^{(1)}_{1,2}$.

Another fact we notice after the rotation is that the $L^2$-norm of the first column of $A$ is not changed. In particular, note that if $a_{11}^2 + a_{21}^2 \ne 0$, there is:
\[
\sin\theta = \frac{-a_{21}}{\sqrt{a_{11}^2 + a_{21}^2}}, \qquad \cos\theta = \frac{a_{11}}{\sqrt{a_{11}^2 + a_{21}^2}};
\]
and $[A]_1 \to [G^{(1)}_{1,2} A]_1$ is given by:
\[
\begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \\ \vdots \\ a_{m1} \end{bmatrix}
\to
\begin{bmatrix} \sqrt{a_{11}^2 + a_{21}^2} \\ 0 \\ a_{31} \\ \vdots \\ a_{m1} \end{bmatrix}.
\]
It is easy to check that in the special situation $a_{11}^2 + a_{21}^2 = 0$, the previous statement remains true. Preservation of the $L^2$-norm of the first column vector is actually true for all $\theta$ (and can be derived from the $L^2$-norm preserving property of any orthogonal matrix); and in particular we see that the
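$L^2$-norms of all the column vectors of $A$ remain the same after $A \to G^{(1)}_{1,2} A$.

Next, we construct a Givens matrix $G^{(1)}_{1,3}$ that will make the $(3,1)$-element of $G^{(1)}_{1,2} A$ zero; $[A]_1 \to [G^{(1)}_{1,3} G^{(1)}_{1,2} A]_1$ is given by:
\[
\begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \\ a_{41} \\ \vdots \\ a_{m1} \end{bmatrix}
\to
\begin{bmatrix} \sqrt{a_{11}^2 + a_{21}^2 + a_{31}^2} \\ 0 \\ 0 \\ a_{41} \\ \vdots \\ a_{m1} \end{bmatrix}.
\]
The matrix $G^{(1)}_{1,3}$ is given by:
\[
G^{(1)}_{1,3} = G_{1,3}(\theta), \quad\text{where}\quad \theta = -\arctan\left( \frac{a_{31}}{\sqrt{a_{11}^2 + a_{21}^2}} \right).
\]
As we continue, all the remaining non-zeroes in the first column of $A$ can be eliminated. Eventually we obtain a sequence of Givens matrices and define their product as $G_1$:
\[
G_1 = G^{(1)}_{1,m} G^{(1)}_{1,m-1} \cdots G^{(1)}_{1,2}, \tag{3.2}
\]
so that $[A]_1 \to [G_1 A]_1$ is given by:
\[
\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
\to
\begin{bmatrix} \sqrt{a_{11}^2 + a_{21}^2 + \cdots + a_{m1}^2} \\ 0 \\ \vdots \\ 0 \end{bmatrix}.
\]
Let us denote $A^{(1)} = G_1 A$; then the first column of $A^{(1)}$ is exactly what we want for $E$, and if $a^{(1)}_{11} = \sqrt{a_{11}^2 + \cdots + a_{m1}^2} \ne 0$, we have $n_1 = 1$.

The next step is to use Givens rotations to eliminate as many non-zero elements of the second column of $A^{(1)}$ as possible. If $a^{(1)}_{11} = 0$, this process is the same as what we did before for the first column of $A$; but if $a^{(1)}_{11} \ne 0$, we want to leave the first row of $A^{(1)}$ untouched! In particular, we construct a sequence of Givens matrices and define $G_2$ as their product:
\[
G_2 = \begin{cases}
G^{(2)}_{2,m} G^{(2)}_{2,m-1} \cdots G^{(2)}_{2,3}, & \text{if } a^{(1)}_{11} \ne 0; \\[4pt]
G^{(2)}_{1,m} G^{(2)}_{1,m-1} \cdots G^{(2)}_{1,2}, & \text{if } a^{(1)}_{11} = 0,
\end{cases} \tag{3.3}
\]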
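such that $[A^{(1)}]_2 \to [G_2 A^{(1)}]_2$ is given by:
\[
\begin{bmatrix} a^{(1)}_{12} \\ a^{(1)}_{22} \\ a^{(1)}_{32} \\ \vdots \\ a^{(1)}_{m2} \end{bmatrix}
\to
\begin{bmatrix} a^{(1)}_{12} \\ \sqrt{(a^{(1)}_{22})^2 + (a^{(1)}_{32})^2 + \cdots + (a^{(1)}_{m2})^2} \\ 0 \\ \vdots \\ 0 \end{bmatrix},
\quad \text{if } a^{(1)}_{11} \ne 0;
\]
or
\[
\begin{bmatrix} a^{(1)}_{12} \\ a^{(1)}_{22} \\ \vdots \\ a^{(1)}_{m2} \end{bmatrix}
\to
\begin{bmatrix} \sqrt{(a^{(1)}_{12})^2 + (a^{(1)}_{22})^2 + \cdots + (a^{(1)}_{m2})^2} \\ 0 \\ \vdots \\ 0 \end{bmatrix},
\quad \text{if } a^{(1)}_{11} = 0.
\]
Continuing this process, we obtain orthogonal $m \times m$ matrices $G_1, G_2, \dots, G_n$ such that:
\[
G_n G_{n-1} \cdots G_1 A = E, \tag{3.4}
\]
where $E$ is of the row echelon form. Defining $Q = (G_n G_{n-1} \cdots G_1)^t$, we obtain the desired factorization $A = QE$.

Now we write down the algorithm rigorously in Algorithm 1. Here we use an integer $p$ to keep track of the row number below which the non-zero entries are transformed to zero.

Algorithm 1 Orthogonal Reduction by Givens Rotations
 1: Set $p = 1$ and $Q = I_m$
 2: for $i = 1, 2, \dots, n$ do
 3:   for $j = p+1, p+2, \dots, m$ do
 4:     if $a_{ji} = 0$ then
 5:       Continue
 6:     end if
 7:     Compute $\theta = -\arctan(a_{ji} / a_{pi})$
 8:     Compute $A \leftarrow G_{p,j}(\theta) A$
 9:     Compute $Q \leftarrow Q\, G_{p,j}(\theta)^t$
10:   end for
11:   if $a_{pi} \ne 0$ then
12:     Set $p \leftarrow p + 1$
13:   end if
14: end for

At the end of the algorithm, the matrix $A$ is transformed into the row echelon form $E$. Note that in line 8, we do not have to actually compute $\theta$ from line 7 and form the matrix $G_{p,j}(\theta)$, but can instead compute and store:
\[
c_{pj} = \frac{a_{pi}}{\sqrt{a_{ji}^2 + a_{pi}^2}}, \qquad s_{pj} = \frac{a_{ji}}{\sqrt{a_{ji}^2 + a_{pi}^2}};
\]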
and then compute for $A$, with both right-hand sides evaluated using the values before the update:
\[
a_{jk} \leftarrow c_{pj} a_{jk} - s_{pj} a_{pk}, \qquad
a_{pk} \leftarrow s_{pj} a_{jk} + c_{pj} a_{pk}, \qquad k = i, i+1, \dots, n. \tag{3.5}
\]
Similarly for line 9, if the matrix $Q$ is not explicitly needed immediately, all we need to do is to keep track of all pairs $c_{pj}$ and $s_{pj}$ so that $Q$ can be reconstructed later.

4 Analysis of Algorithm 1

The preceding factorization is more robust than Gaussian elimination because we can obtain an a priori estimate on all the components that may appear during the orthogonalization process. In particular, whenever we apply a Givens rotation, the $L^2$-norms of the column vectors are not changed; hence we have:
\[
\sum_{j=1}^{m} e_{ji}^2 = \sum_{j=1}^{m} a_{ji}^2
\quad\Longrightarrow\quad
|e_{ki}| \le \sqrt{\sum_{j=1}^{m} a_{ji}^2},
\]
for all $i = 1, \dots, n$ and $k = 1, \dots, m$.

Next, we study the complexity of Algorithm 1. Note that in the outer loop, no computation actually takes place if $p$ is not increased at all. Thus the maximum possible computational cost includes $r = \min(m, n)$ inner loops, which correspond to the values $p = 1, p = 2, \dots, p = r$, respectively. For a given such $p$, the inner loop has $m - p$ iterations. Each iteration requires (we take the approach without computing $\theta$ explicitly) five flops and one square root operation to compute $c_{pj}$ and $s_{pj}$. The operations (3.5) are then completed with $6(n - i + 1)$ flops. Since we always have $i \ge p$, the total number of flops is bounded as:
\[
\sum_{p=1}^{r} \sum_{j=p+1}^{m} \bigl( 5 + 6(n - i + 1) \bigr)
\le \sum_{p=1}^{r} \sum_{j=p+1}^{m} \bigl( 5 + 6(n - p + 1) \bigr)
= \sum_{p=1}^{r} \bigl[ 6(n - p)(m - p) + 11(m - p) \bigr]
\approx 3mr(n - r) + 3nr(m - r) + 2r^3.
\]

Finally, we improve Algorithm 1 from a computer science point of view. Looking at each outer loop, say the first one, we start to work on row 1 and row 2, then on row 1 and row 3, and finally move on to row 1 and row $m$. The objective is to rotate all the non-zero entries of the first column of $A$ into the first element. If we take memory storage into account, it is usual practice to store the elements of the matrix $A$ row by row (this can be true for both full matrices and sparse matrices); hence we are motivated to operate on adjacent rows as often as possible in order to improve the bandwidth usage and reduce cache misses. Such a consideration results in a roll-back algorithm to eliminate the non-zeros: we first work
on the last two rows and make the $m$-th element zero, then a Givens rotation is applied to the rows $m-2$ and $m-1$ to make the $(m-1)$-th element zero, and finally we reach the top of the column. This modification is reflected in the modified version of Algorithm 1. Note that we also incorporate the computation of the $c$'s and $s$'s instead of $\theta$ in this modified version.
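Working with the pair $(c, s)$ in place of $\theta$ can be sketched as follows. This is a minimal illustration in plain Python, assuming 0-based row indices and a matrix stored as a list of rows; the names `rotation_coefficients` and `rotate_rows` are hypothetical, and `math.hypot` is used for the square root as an implementation choice.

```python
import math


def rotation_coefficients(a_p, a_j):
    """Return (c, s) with c = a_p / r and s = a_j / r, r = sqrt(a_p^2 + a_j^2).

    If both entries are zero, no rotation is needed and (1, 0) is returned.
    """
    r = math.hypot(a_p, a_j)
    if r == 0.0:
        return 1.0, 0.0
    return a_p / r, a_j / r


def rotate_rows(A, p, j, c, s, start=0):
    """Apply the row update to rows p and j of A, for columns start, ..., n-1.

    Both new entries are computed from the old values of the two rows.
    """
    for k in range(start, len(A[0])):
        a_pk, a_jk = A[p][k], A[j][k]
        A[p][k] = s * a_jk + c * a_pk
        A[j][k] = c * a_jk - s * a_pk


A = [[3.0, 1.0],
     [4.0, 2.0]]
c, s = rotation_coefficients(A[0][0], A[1][0])
rotate_rows(A, 0, 1, c, s)
print(A)  # first column becomes approximately [5, 0]
```

Each column keeps its $L^2$-norm under the update: the first column has norm 5 before and after, and the second column has squared norm 5 in both cases.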
Algorithm 1 Orthogonal Reduction by Givens Rotations (Modified)
 1: Set $p = 1$ and $Q = I_m$
 2: for $i = 1, 2, \dots, n$ do
 3:   for $j = m, m-1, \dots, p+1$ do
 4:     if $a_{ji} = 0$ then
 5:       Continue
 6:     end if
 7:     Compute $c_{j-1,j} = a_{j-1,i} / \sqrt{a_{ji}^2 + a_{j-1,i}^2}$ and $s_{j-1,j} = a_{ji} / \sqrt{a_{ji}^2 + a_{j-1,i}^2}$
 8:     Compute $A \leftarrow G_{j-1,j}(\theta) A$
 9:     Compute $Q \leftarrow Q\, G_{j-1,j}(\theta)^t$
10:   end for
11:   if $a_{pi} \ne 0$ then
12:     Set $p \leftarrow p + 1$
13:   end if
14: end for
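A possible rendering of the modified algorithm in plain Python is sketched below. It is an illustration under several assumptions that are not in the listing: indices are 0-based, the eliminated entry is set to exactly zero after each rotation, the loop stops once `p` reaches `m` (so that row `p` always exists), and exact comparisons with zero are kept for simplicity even though floating-point code would normally use a tolerance. The function name `givens_reduce_adjacent` is hypothetical.

```python
import math


def givens_reduce_adjacent(A):
    """Return (Q, E) with A = Q E, rotating adjacent rows from the bottom up."""
    m, n = len(A), len(A[0])
    E = [[float(v) for v in row] for row in A]
    Q = [[1.0 if r == c else 0.0 for c in range(m)] for r in range(m)]
    p = 0  # 0-based index of the current pivot row
    for i in range(n):
        if p >= m:
            break  # added guard: every row already holds a pivot
        for j in range(m - 1, p, -1):
            if E[j][i] == 0.0:
                continue
            r = math.hypot(E[j - 1][i], E[j][i])
            c, s = E[j - 1][i] / r, E[j][i] / r
            # A <- G A: only rows j-1 and j change; columns before i are zero.
            for k in range(i, n):
                u, v = E[j - 1][k], E[j][k]
                E[j - 1][k] = c * u + s * v
                E[j][k] = c * v - s * u
            E[j][i] = 0.0  # remove rounding residue in the eliminated entry
            # Q <- Q G^t: only columns j-1 and j change.
            for k in range(m):
                u, v = Q[k][j - 1], Q[k][j]
                Q[k][j - 1] = c * u + s * v
                Q[k][j] = c * v - s * u
        if E[p][i] != 0.0:
            p += 1
    return Q, E


A = [[3.0, 1.0],
     [4.0, 2.0],
     [0.0, 5.0]]
Q, E = givens_reduce_adjacent(A)
QE = [[sum(Q[r][k] * E[k][c] for k in range(3)) for c in range(2)]
      for r in range(3)]
print(E)
print(QE)  # agrees with A up to rounding
```

Because each step multiplies $E$ by a rotation $G$ and $Q$ by $G^t$, the product $QE$ stays equal to the input matrix throughout, which gives a simple way to check the result.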