Analysis of Block LDL^T Factorizations for Symmetric Indefinite Matrices


Haw-ren Fang*

August 24, 2007

Abstract

We consider block LDL^T factorizations of symmetric indefinite matrices in the form $LBL^T$, where $L$ is unit lower triangular and $B$ is block diagonal with each diagonal block of dimension 1 or 2. The stability of this factorization and its application to solving linear systems have been well studied in the literature. In this paper we give a condition under which the $LBL^T$ factorization is guaranteed to run to completion in inexact arithmetic with inertia preserved. We also analyze the stability of rank estimation for symmetric indefinite matrices by $LBL^T$ factorization using the Bunch-Parlett (complete pivoting), fast Bunch-Parlett, or bounded Bunch-Kaufman (rook pivoting) strategy.

1 Introduction

A symmetric, possibly indefinite matrix $A \in \mathbb{R}^{n \times n}$ can be factored into $LBL^T$, where $L$ is unit lower triangular and $B$ is block diagonal with each block of order 1 or 2. This is a generalization of the Cholesky factorization, which requires positive semidefiniteness. The process is as follows. Assuming $A$ is nonzero, there exists a permutation matrix $\Pi$ such that
$$\Pi A \Pi^T = \begin{bmatrix} A_{11} & A_{21}^T \\ A_{21} & A_{22} \end{bmatrix},$$
where $A_{11}$ is nonsingular of order $s$, with $s = 1$ or $2$. The decomposition in outer product form is
$$\begin{bmatrix} A_{11} & A_{21}^T \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} I_s & 0 \\ A_{21}A_{11}^{-1} & I_{n-s} \end{bmatrix} \begin{bmatrix} A_{11} & 0 \\ 0 & \bar{A} \end{bmatrix} \begin{bmatrix} I_s & A_{11}^{-1}A_{21}^T \\ 0 & I_{n-s} \end{bmatrix}, \qquad (1)$$

*This work was supported by the National Science Foundation under Grant CCR 02-04084. Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455, USA.

where $\bar{A} = A_{22} - A_{21}A_{11}^{-1}A_{21}^T \in \mathbb{R}^{(n-s)\times(n-s)}$ is the Schur complement. Iteratively applying the reduction to the Schur complement, we obtain the factorization in the form $PAP^T = LBL^T$, where $P$ is a permutation matrix, $L$ is unit lower triangular, and $B$ is block diagonal with each block of order 1 or 2.

In $LBL^T$ factorization, choosing the permutation matrix $\Pi$ and the pivot size $s$ at each step is called diagonal pivoting. In the literature there are four major pivoting methods for general symmetric matrices: Bunch-Parlett (complete pivoting) [7], Bunch-Kaufman (partial pivoting) [4], and fast Bunch-Parlett and bounded Bunch-Kaufman (rook pivoting) [1]. In addition, there are two pivoting strategies specifically for symmetric tridiagonal matrices: one by Bunch [3] and the other by Bunch and Marcia [5]. Slapničar analyzed the Bunch-Parlett pivoting [15, Section 7]. Higham proved the stability of the Bunch-Kaufman pivoting strategy and of Bunch's method in [12] and [13], respectively. Bunch and Marcia [5] showed the normwise backward stability of their method. In this paper, we give a sufficient condition under which the $LBL^T$ factorization of a symmetric matrix is guaranteed to run to completion numerically and preserve inertia.

With complete pivoting, the Cholesky factorization can be applied to rank estimation; the stability was analyzed by Higham [11]. Given $A \in \mathbb{R}^{n\times n}$ of rank $r < n$, the LU factorization needs $2(3n^2r - 3nr^2 + r^3)/3$ flops, whereas the Cholesky factorization needs $(3n^2r - 3nr^2 + r^3)/3$ flops but requires symmetric positive semidefiniteness.^1 To estimate the rank of a symmetric indefinite matrix, we can use the $LBL^T$ factorization with the Bunch-Parlett (complete pivoting), fast Bunch-Parlett, or bounded Bunch-Kaufman (rook pivoting) strategy. The number of required flops is still $(3n^2r - 3nr^2 + r^3)/3$. We analyze the stability in this paper.

The organization of this paper is as follows.
Section 2 gives the basics of rounding error analysis. Section 3 discusses the conditions under which the $LBL^T$ factorization is guaranteed to run to completion with inertia preserved. Section 4 analyzes rank estimation for symmetric indefinite matrices by $LBL^T$ factorization with proper pivoting strategies. Concluding remarks are given in Section 5.

Throughout the paper, without loss of generality, we assume the interchanges required for any diagonal pivoting are done prior to the factorization, so that $A := PAP^T$, where $P$ is the permutation matrix for pivoting. We denote the column vector of ones by $e$, and the $2\times 2$ identity matrix $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ by $I_2$.

2 Basics of rounding error analysis

We use $fl(\cdot)$ to denote the computed value of a given expression, and follow the standard model
$$fl(x \mathbin{\mathrm{op}} y) = (x \mathbin{\mathrm{op}} y)(1+\delta), \quad |\delta| \le u, \quad \mathrm{op} = +, -, \times, /, \qquad (2)$$

^1 In this paper we consider only the highest order terms while counting the number of flops.

where $u$ is the unit roundoff. This model holds on most computers, including those using IEEE standard arithmetic. Lemma 1 gives a basic tool for rounding error analysis [14, Lemma 3.1].

Lemma 1 If $|\delta_i| \le u$ and $\sigma_i = \pm 1$ for $i = 1, \dots, k$, and $ku < 1$, then
$$\prod_{i=1}^{k} (1+\delta_i)^{\sigma_i} = 1 + \theta_k, \qquad |\theta_k| \le \frac{ku}{1-ku} =: \epsilon_k.$$

The function $\epsilon_k$ defined in Lemma 1 has two useful properties:^2
$$\epsilon_m + \epsilon_n + 2\epsilon_m\epsilon_n \le \epsilon_{m+n}, \quad m, n \ge 0, \qquad\qquad c\,\epsilon_n \le \epsilon_{cn}, \quad c \ge 1.$$
Since we assume $ku < 1$ for all practical $k$, $\epsilon_k = ku + ku\,\epsilon_k = ku + O(u^2)$. These properties are used frequently to derive inequalities in this paper. Lemma 2 is helpful for our backward error analysis of $LBL^T$ factorizations.

Lemma 2 Let $y = (s - \sum_{k=1}^{n-1} a_k b_k c_k)/d$. No matter what the order of evaluation, the computed $\hat{y}$ satisfies
$$\hat{y}d + \sum_{k=1}^{n-1} a_k b_k c_k = s + \Delta s, \qquad |\Delta s| \le \epsilon_n \Big( |\hat{y}d| + \sum_{k=1}^{n-1} |a_k b_k c_k| \Big).$$

Proof The proof is similar to the derivation leading to [14, Lemma 8.4]. Let $r = s - \sum_{k=1}^{n-1} a_k b_k c_k$ and assume that it is computed in the order $(((s - a_1b_1c_1) - a_2b_2c_2) - \cdots - a_{n-1}b_{n-1}c_{n-1})$. Following the model (2), the computed $\hat{r}$ satisfies
$$\hat{r} = s(1+\delta_1)\cdots(1+\delta_{n-1}) - \sum_{i=1}^{n-1} a_i b_i c_i (1+\xi_i)(1+\eta_i)(1+\delta_i)\cdots(1+\delta_{n-1}),$$
where $|\delta_i|, |\xi_i|, |\eta_i| \le u$. Then the computed $\hat{y} = fl(\hat{r}/d) = (\hat{r}/d)(1+\delta_n)$ with $|\delta_n| \le u$, and therefore $\hat{y}d = \hat{r}(1+\delta_n)$. Dividing both sides by $(1+\delta_1)\cdots(1+\delta_n)$, we obtain
$$\frac{\hat{y}d}{(1+\delta_1)\cdots(1+\delta_n)} = s - \sum_{i=1}^{n-1} a_i b_i c_i \frac{(1+\xi_i)(1+\eta_i)}{(1+\delta_1)\cdots(1+\delta_{i-1})}.$$
By Lemma 1,
$$\hat{y}d\,(1+\theta_n^{(0)}) = s - \sum_{k=1}^{n-1} a_k b_k c_k (1+\theta_n^{(k)}),$$
where $|\theta_n^{(k)}| \le \epsilon_n$ for $k = 0, 1, \dots, n-1$. With a little thought, the last equation still holds for any order of evaluation. The result follows immediately.

^2 The two properties were listed in [12] and [14, Lemma 3.3], but with $2\epsilon_m\epsilon_n$ replaced by $\epsilon_m\epsilon_n$. Here we give a slightly tighter inequality.
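The two properties of $\epsilon_k$ can be checked numerically. The following sketch (my own, not from the paper) verifies them in exact rational arithmetic for a few pairs $(m, n)$, taking $u = 2^{-53}$ as in IEEE double precision:

```python
from fractions import Fraction

# Unit roundoff of IEEE binary64 as an exact rational.
u = Fraction(1, 2**53)

def eps(k):
    """eps_k = k*u / (1 - k*u) from Lemma 1, computed exactly."""
    return k * u / (1 - k * u)

# First property: eps_m + eps_n + 2 eps_m eps_n <= eps_{m+n}.
for m, n in [(1, 1), (3, 7), (12, 180)]:
    assert eps(m) + eps(n) + 2 * eps(m) * eps(n) <= eps(m + n)

# Second property: c * eps_n <= eps_{c*n} for c >= 1.
for c, n in [(2, 5), (10, 12)]:
    assert c * eps(n) <= eps(c * n)
```

Exact `Fraction` arithmetic is used deliberately: the two sides differ by $O(u^2)$ terms that floating point cannot resolve reliably.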

3 Conditions for inertia preservation

We use $A$ to denote a symmetric matrix consisting of floating point numbers. Due to potential rounding errors in forming or storing $A$, it may not be exactly the matrix under consideration. Hence we write $A = \tilde{A} + \Delta\tilde{A}$, where $\tilde{A}$ is the matrix under consideration. If each element of $A$ is rounded to the closest floating point number, then $|a_{ij} - \tilde{a}_{ij}| \le u|\tilde{a}_{ij}|$ with $u$ the unit roundoff. Therefore,
$$|\Delta\tilde{A}| \le u|\tilde{A}| \le \frac{u}{1-u}|A| = \epsilon_1 |A|.$$
Here and throughout the paper, we use a hat to denote computed quantities; inequalities and absolute values of matrices are defined elementwise. The overall backward error of the $LBL^T$ factorization is
$$|\tilde{A} - \hat{L}\hat{B}\hat{L}^T| \le |\tilde{A} - A| + |A - \hat{L}\hat{B}\hat{L}^T| \le \epsilon_1|A| + |A - \hat{L}\hat{B}\hat{L}^T|.$$
In general, we assume $A$ is close to $\tilde{A}$, and therefore $\Delta\tilde{A}$ is relatively negligible. Hence we consider only $A - \hat{L}\hat{B}\hat{L}^T$ for simplicity.

3.1 Componentwise analysis

Assuming the $LBL^T$ factorization of $A \in \mathbb{R}^{n\times n}$ runs to completion, we denote
$$B = \begin{bmatrix} B_1 \\ & B_2 \\ & & \ddots \\ & & & B_m \end{bmatrix}, \qquad L = \begin{bmatrix} L_{11} \\ L_{21} & L_{22} \\ \vdots & \vdots & \ddots \\ L_{m1} & L_{m2} & \cdots & L_{mm} \end{bmatrix} \in \mathbb{R}^{n\times n}.$$
Each $B_i$ is a $1\times 1$ or $2\times 2$ block, with $L_{ii} = 1$ or $L_{ii} = I_2$, respectively. The rest of $L$ is partitioned accordingly. Algorithm 1 gives the $LBL^T$ factorization in inner product form.

Algorithm 1 $LBL^T$ factorization of $A \in \mathbb{R}^{n\times n}$ in inner product form.
for all i = 1, ..., m do
    for all j = 1, ..., i-1 do
        $L_{ij} = (A_{ij} - \sum_{k=1}^{j-1} L_{ik}B_kL_{jk}^T)\,B_j^{-1}$    (*)
    end for
    $B_i = A_{ii} - \sum_{k=1}^{i-1} L_{ik}B_kL_{ik}^T$    (**)
end for

Two remarks on Algorithm 1 are worth noting. First, in practice we store $L_{ik}B_k$ in an array for the computations in (*). Hence the factorization requires only $n^3/3$ flops, the same as the Cholesky factorization. Second, in outer product form, the Schur complement $\bar{A}$ in (1) can be computed by either $\bar{A} = A_{22} - A_{21}A_{11}^{-1}A_{21}^T$ or $\bar{A} = A_{22} - \bar{L}A_{11}\bar{L}^T$, where $\bar{L} = A_{21}A_{11}^{-1}$. The latter is computationally equivalent (including equivalence of rounding errors) to

the inner product formulation in Algorithm 1, on which our stability analysis is based. If the Schur complement is computed by $\bar{A} = A_{22} - A_{21}A_{11}^{-1}A_{21}^T$, it involves fewer arithmetic operations, and our analysis remains valid. These two remarks also apply to Algorithm 2.

In Algorithm 1, each multiplication by $B_j^{-1}$ in (*) with $B_j \in \mathbb{R}^{2\times 2}$ can be computed by solving a $2\times 2$ linear system, denoted by $Ey = z$ (i.e., $E$ is $B_j$). We assume the linear system is solved successfully with computed $\hat{y}$ satisfying
$$(E + \Delta E)\hat{y} = z, \qquad |\Delta E| \le \epsilon_c |E| \qquad (3)$$
for some constant $\epsilon_c$. Higham [12] proved that, using Bunch-Kaufman pivoting with pivoting argument $\alpha = \frac{1+\sqrt{17}}{8} \approx 0.640$, if the system is solved by GEPP, then $\epsilon_c = \epsilon_{12}$; if it is solved by the explicit inverse of $E$ with scaling (as implemented in both LAPACK and LINPACK), $\epsilon_c = \epsilon_{180}$. Since the Bunch-Parlett, fast Bunch-Parlett, and bounded Bunch-Kaufman pivoting strategies satisfy stronger conditions than the Bunch-Kaufman, condition (3) still holds. This remark remains valid with the pivoting argument $\alpha = \frac{\sqrt{5}-1}{2} \approx 0.618$ suggested for symmetric triadic matrices [10]. Here a matrix is called triadic if it has at most two nonzero off-diagonal elements in each column. In addition, Higham [13] showed that with Bunch's pivoting strategy for symmetric tridiagonal matrices [3], if a $2\times 2$ linear system is solved by GEPP, then $\epsilon_c = \epsilon_{6.5}$. For the $2\times 2$ linear systems solved by the explicit inverse with scaling, a constant $\epsilon_c$ can also be obtained. For the Bunch-Marcia algorithm, $\epsilon_c = \epsilon_{500}$ with the $2\times 2$ linear systems solved by the explicit inverse with scaling [5]. We conclude that all pivoting strategies in the literature [1, 3, 4, 5, 7, 10] satisfy condition (3).

Lemma 4 gives a componentwise backward error analysis of the $LBL^T$ factorization, with a proof invoking Lemma 3.

Lemma 3 Let $Y = (S - \sum_{k=1}^{m-1} A_kB_kC_k)E^{-1}$, where $E$ and each $B_k$ are either $1\times 1$ or $2\times 2$, such that the matrix operations are well-defined. If $E$ is a $2\times 2$ matrix, we assume condition (3) holds. Then the computed $\hat{Y}$ satisfies
$$\hat{Y}E + \sum_{k=1}^{m-1} A_kB_kC_k = S + \Delta S,$$
where
$$|\Delta S| \le \max\{\epsilon_c, \epsilon_{4m-3}\}\Big( |S| + |\hat{Y}||E| + \sum_{k=1}^{m-1} |A_k||B_k||C_k| \Big).$$
If $E$ is an identity, then $\max\{\epsilon_c, \epsilon_{4m-3}\}$ can be replaced by $\epsilon_{4m-3}$, since $\epsilon_c = 0$.

Proof Let $Z = S - \sum_{k=1}^{m-1} A_kB_kC_k$, so that $Y = ZE^{-1}$. If $B_k$ is a $2\times 2$ matrix, then each element of $A_kB_kC_k$ can be represented in the form $\sum_{i=1}^4 a_ib_ic_i$. Therefore, each element of $\sum_{k=1}^{m-1} A_kB_kC_k$ sums at most $4(m-1)$ terms. By

Lemma 2,
$$\hat{Z} = Z + \Delta Z, \qquad |\Delta Z| \le \epsilon_{4m-3}\Big( |S| + \sum_{k=1}^{m-1} |A_k||B_k||C_k| \Big). \qquad (4)$$
We use $X^{(i)}$ to denote the row vector formed by the $i$th row of matrix $X$. If $E$ is a $2\times 2$ matrix, then applying (3) with $y = \hat{Y}^{(i)}$ and $z = \hat{Z}^{(i)}$, we obtain
$$\hat{Y}^{(i)}(E + \Delta E_i) = \hat{Z}^{(i)}, \qquad |\Delta E_i| \le \epsilon_c|E| \qquad (5)$$
for $i = 1$ and $i = 2$ (if any). By (4) and (5),
$$\Delta S = \hat{Y}E - Z = \hat{Y}E - \hat{Z} + \hat{Z} - Z = \hat{Y}E - \hat{Z} + \Delta Z,$$
and then, rowwise,
$$|\Delta S^{(i)}| = |-\hat{Y}^{(i)}\Delta E_i + \Delta Z^{(i)}| \le \epsilon_c|\hat{Y}^{(i)}||E| + \epsilon_{4m-3}\Big( |S^{(i)}| + \sum_{k=1}^{m-1} |A_k^{(i)}||B_k||C_k| \Big) \le \max\{\epsilon_c, \epsilon_{4m-3}\}\Big( |S^{(i)}| + |\hat{Y}^{(i)}||E| + \sum_{k=1}^{m-1} |A_k^{(i)}||B_k||C_k| \Big).$$
Combining the rows $i = 1, 2$ (if any), we obtain the result for $E$ a $2\times 2$ matrix. If $E$ is $1\times 1$, then we apply Lemma 2 directly and obtain
$$|\Delta S| \le \epsilon_{4m-3}\Big( |\hat{Y}E| + \sum_{k=1}^{m-1} |A_k||B_k||C_k| \Big).$$

Lemma 4 If the $LBL^T$ factorization of a symmetric matrix $A \in \mathbb{R}^{n\times n}$ runs to completion, then the computed $\hat{L}$ and $\hat{B}$ satisfy
$$|A - \hat{L}\hat{B}\hat{L}^T| \le \max\{\epsilon_c, \epsilon_{4m-3}\}\big( |A| + |\hat{L}||\hat{B}||\hat{L}^T| \big),$$
where we assume condition (3) holds for all linear systems involving $2\times 2$ pivots, and $m \le n$ is the number of blocks in $B$.

Proof Applying Lemma 3 to (*) and (**) in Algorithm 1, we obtain
$$\Big| A_{ij} - \hat{L}_{ij}\hat{B}_j - \sum_{k=1}^{j-1} \hat{L}_{ik}\hat{B}_k\hat{L}_{jk}^T \Big| \le \max\{\epsilon_c, \epsilon_{4j-3}\}\Big( |A_{ij}| + |\hat{L}_{ij}||\hat{B}_j| + \sum_{k=1}^{j-1} |\hat{L}_{ik}||\hat{B}_k||\hat{L}_{jk}^T| \Big).$$
Therefore,
$$\Big| A_{ij} - \sum_{k=1}^{j} \hat{L}_{ik}\hat{B}_k\hat{L}_{jk}^T \Big| \le \max\{\epsilon_c, \epsilon_{4j-3}\}\Big( |A_{ij}| + \sum_{k=1}^{j} |\hat{L}_{ik}||\hat{B}_k||\hat{L}_{jk}^T| \Big) \qquad (6)$$

for $1 \le j \le i \le m$, where $\hat{L}_{jj} = 1$ or $I_2$, depending on whether $B_j$ is $1\times 1$ or $2\times 2$. The result is obtained by collecting the terms in (6) into one matrix representation.

Lemma 4 coincides with [12, Eqn. (4.9)] by Higham, but with $\max\{\epsilon_c, \epsilon_{4m-3}\}$ replaced by $p(n)u + O(u^2)$, where $p(n)$ is a linear polynomial. From [12, Eqn. (4.9)] Higham obtained [12, Theorem 4.1], a componentwise backward error bound for solving linear systems by $LBL^T$ factorizations. Similarly, from Lemma 4 we can obtain a corresponding bound, as presented in [9, Theorem 3.4].

3.2 Normwise analysis

Lemma 4 calls for a bound on $|\hat{L}||\hat{B}||\hat{L}^T|$ relative to $|A|$. For simplicity, we consider $|L||B||L^T|$ instead of $|\hat{L}||\hat{B}||\hat{L}^T|$, since
$$|\hat{L}||\hat{B}||\hat{L}^T| = |L||B||L^T| + O(u). \qquad (7)$$
Therefore, the objective is to find a modest $c(n)$ such that
$$\big\| \,|L||B||L^T|\, \big\| \le c(n)\|A\| \qquad (8)$$
in some proper norm. Higham [12] showed that using the Bunch-Kaufman pivoting strategy with pivoting argument $\alpha = \frac{1+\sqrt{17}}{8} \approx 0.640$, the $LBL^T$ factorization of a symmetric matrix $A \in \mathbb{R}^{n\times n}$ satisfies
$$\big\| \,|L||B||L^T|\, \big\|_M \le \max\Big\{ \frac{1}{1-\alpha},\; \frac{(3+\alpha^2)(3+\alpha)}{(1-\alpha^2)^2} \Big\}\, n\rho_n \|A\|_M \le 35.675\, n\rho_n \|A\|_M, \qquad (9)$$
where $\rho_n$ is the growth factor and $\|\cdot\|_M$ denotes the largest magnitude of the elements of the matrix. These bounds also hold for the suggested pivoting argument $\alpha = \frac{\sqrt{5}-1}{2} \approx 0.618$ for triadic matrices [10]. The Bunch-Parlett, fast Bunch-Parlett, and bounded Bunch-Kaufman pivoting strategies satisfy a stronger condition than the Bunch-Kaufman, so they also satisfy (9). Also, Higham [13] proved that using Bunch's pivoting strategy with pivoting argument $\alpha = \frac{\sqrt{5}-1}{2} \approx 0.618$,
$$\big\| \,|L||B||L^T|\, \big\|_M \le (7\alpha + 11)(\alpha + 2)\, \|A\|_M \le 42\, \|A\|_M, \qquad (10)$$
where $A$ is symmetric tridiagonal. Bunch and Marcia [5] also showed that the normwise backward error bound (10) is preserved with their pivoting method. We conclude that all pivoting strategies for $LBL^T$ factorization in the literature [1, 3, 4, 5, 7, 10] satisfy condition (8).
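For concreteness, the recurrences (*) and (**) of Algorithm 1 can be sketched in a few lines. The following is my own illustrative code, restricted to $1\times 1$ pivots and no interchanges (so it assumes every pivot is safely nonzero); the paper's algorithm additionally takes $2\times 2$ blocks chosen by diagonal pivoting:

```python
import numpy as np

def lbl_1x1(A):
    """Sketch of Algorithm 1 with 1x1 pivots only: A = L diag(B) L^T."""
    n = A.shape[0]
    L = np.eye(n)
    B = np.zeros(n)
    for i in range(n):
        # (**): B_i = A_ii - sum_{k<i} L_ik B_k L_ik
        B[i] = A[i, i] - np.sum(L[i, :i] ** 2 * B[:i])
        for j in range(i + 1, n):
            # (*): L_ji = (A_ji - sum_{k<i} L_jk B_k L_ik) / B_i
            L[j, i] = (A[j, i] - np.sum(L[j, :i] * L[i, :i] * B[:i])) / B[i]
    return L, B

# An indefinite example: the pivots come out as 4, -4, 5.
A = np.array([[ 4.0,  2.0, -2.0],
              [ 2.0, -3.0,  1.0],
              [-2.0,  1.0,  5.0]])
L, B = lbl_1x1(A)
assert np.allclose(L @ np.diag(B) @ L.T, A)
```

Note that the computed $B$ is indefinite (one negative pivot), which a Cholesky factorization could not produce.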
Wilkinson showed that the Cholesky factorization of a symmetric positive definite matrix $A \in \mathbb{R}^{n\times n}$ is guaranteed to run to completion if $20n^{3/2}\kappa_2(A)u \le 1$ [18]. We give a sufficient condition for the success of the $LBL^T$ factorization with inertia preserved in Theorem 2, with a proof invoking a theorem of Weyl [17, Corollary 4.9]. For symmetric tridiagonal matrices the corresponding condition is given in Theorem 3.
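The kind of backward error the analysis above controls is easy to observe. The following sketch (not the paper's code) uses SciPy's `scipy.linalg.ldl`, which implements an $LBL^T$ factorization with Bunch-Kaufman pivoting, and measures the scaled backward error $\xi = \|A - \hat{L}\hat{B}\hat{L}^T\|_F/(u\|A\|_F)$ for a random symmetric indefinite matrix:

```python
import numpy as np
from scipy.linalg import ldl

# Random symmetric indefinite test matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((30, 30))
A = M + M.T

# SciPy returns L with permuted rows; L[perm] is unit lower triangular,
# and B is block diagonal with 1x1 and 2x2 blocks.
L, B, perm = ldl(A, lower=True)

u = np.finfo(float).eps / 2
xi = np.linalg.norm(A - L @ B @ L.T, 'fro') / (u * np.linalg.norm(A, 'fro'))
assert np.allclose(L @ B @ L.T, A)   # factorization reproduces A
assert xi < 1e4                      # a modest multiple of u, far from instability
```

The threshold `1e4` is a loose sanity bound of my choosing; in practice $\xi$ is a small multiple of $n$, consistent with the normwise bounds of Section 3.2.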

Theorem 1 (Weyl) Let $A, B$ be two $n\times n$ Hermitian matrices, and let $\lambda_k(A)$, $\lambda_k(B)$, $\lambda_k(A+B)$ be the eigenvalues of $A$, $B$, and $A+B$ arranged in increasing order for $k = 1, \dots, n$. Then for $k = 1, \dots, n$, we have
$$\lambda_k(A) + \lambda_1(B) \le \lambda_k(A+B) \le \lambda_k(A) + \lambda_n(B).$$

Theorem 2 With the Bunch-Parlett, Bunch-Kaufman, fast Bunch-Parlett, or bounded Bunch-Kaufman pivoting strategy, the $LBL^T$ factorization of a symmetric matrix $A \in \mathbb{R}^{n\times n}$ succeeds with inertia preserved if
$$36n^2 \max\{c, 4n-3\}\,\rho_n\, \kappa_2(A)\,u < 1,$$
where the constant $c$ is from (3) and $\rho_n$ is the growth factor.

Proof The proof is by finite induction. Consider Algorithm 1. Let $A + \Delta A = \hat{L}\hat{B}\hat{L}^T$ and $\hat{A}_k = A_k + \Delta A_k$, where $A_k = A(1\!:\!k, 1\!:\!k)$ and $\Delta A_k = \Delta A(1\!:\!k, 1\!:\!k)$. The process of the $LBL^T$ factorization iteratively factors $A_k$, increasing $k$ from 1 to $n$. Obviously, the first stage succeeds for a nonsingular matrix. Suppose the factorization of $A_{k-s}$ is successfully completed with inertia preserved (i.e., the inertia of $\hat{A}_{k-s}$ is the same as that of $A_{k-s}$), where $s = 1$ or $2$ denotes whether the next pivot is $1\times 1$ or $2\times 2$. Since the inertia of $\hat{A}_{k-s}$ is preserved, all the pivots in $\hat{A}_{k-s}$ have full rank, so the factorization of $A_k$ succeeds (i.e., with no division by zero). The rest of the proof shows that the inertia is preserved in $\hat{A}_k$, so that the induction can be continued.

By Lemma 4, (7), and (9), the componentwise backward error satisfies
$$\|\Delta A_k\|_2 \le 35.925\,k^2 \max\{c, 4k-3\}\,\rho_k\, u\, \|A_k\|_2 + O(u^2)$$
for all possible $1 < k \le n$. Assuming the sum of the $O(u^2)$ terms is smaller than $0.075\,k^2\rho_k \max\{c, 4k-3\}\,u\,\|A_k\|_2$, we have
$$\|\Delta A_k\|_2 \le 36k^2 \max\{c, 4k-3\}\,\rho_k\, u\, \|A_k\|_2 =: f(k)\,\|A_k\|_2.$$
Now let
$$\lambda_{\min}(A_k) = \min_{1\le i\le k} |\lambda_i(A_k)|.$$
By Theorem 1, if $\lambda_i(A_k) > 0$,
$$\lambda_i(A_k + \Delta A_k) \ge \lambda_i(A_k) - \|\Delta A_k\|_2 \ge \lambda_{\min}(A_k) - f(k)\|A_k\|_2 = \lambda_{\min}(A_k)\big(1 - f(k)\kappa_2(A_k)\big) \ge \lambda_{\min}(A_n)\big(1 - f(n)\kappa_2(A)\big) > 0,$$
where we have assumed $f(n)\kappa_2(A) < 1$ for the last inequality. Under the same assumption, if $\lambda_i(A_k) < 0$,
$$\lambda_i(A_k + \Delta A_k) \le \lambda_i(A_k) + \|\Delta A_k\|_2 \le -\lambda_{\min}(A_k) + f(k)\|A_k\|_2 < 0.$$

We conclude that if $f(n)\kappa_2(A) < 1$, then $\lambda_i(A_k + \Delta A_k)$ and $\lambda_i(A_k)$ have the same sign for $i = 1, \dots, k$. In other words, the inertia of $\hat{A}_k$ is preserved. By induction, the factorization is guaranteed to run to completion with inertia preserved.

In Theorem 2, $c = 12$ if all $2\times 2$ systems are solved by GEPP, and $c = 180$ if all $2\times 2$ systems are solved by the explicit inverse with scaling, as shown in [12]. Table 1 displays bounds on the growth factor $\rho_n$ collected from the literature. The bounds for general symmetric matrices are from [2] and [4]; the bounds for symmetric triadic matrices are given in [10].

Table 1: Bounds on growth factor $\rho_n$.

    Pivoting strategy                                General ($\alpha = \frac{1+\sqrt{17}}{8}$)    Triadic ($\alpha = \frac{\sqrt{5}-1}{2}$)
    Bunch-Kaufman                                    $2.57^{n-1}$                                  $O(1.28^n)$
    Bounded Bunch-Kaufman / fast Bunch-Parlett       $n^{1+(\ln n)/4}$                             $O(n^{1.7})$
    Bunch-Parlett                                                                                  $5.4n$

Sorensen and Van Loan [8, Section 5.3.2] suggested a variant of the Bunch-Kaufman pivoting strategy. As discussed in [12], their variant preserves properties (3) and (9). Hence Theorem 2 remains valid for this variant.

Theorem 3 With Bunch's pivoting method or the Bunch-Marcia pivoting strategy, the $LBL^T$ factorization of a symmetric tridiagonal matrix $A \in \mathbb{R}^{n\times n}$ succeeds with inertia preserved if
$$42n \max\{c, 4n-3\}\,\kappa_2(A)\,u < 1,$$
where the constant $c$ is from (3).

Proof The proof is much the same as that of Theorem 2, but invokes (10) instead of (9).

In Theorem 3, using Bunch's pivoting method with the $2\times 2$ systems solved by GEPP, $c = 6.5$ [13]. Using the Bunch-Marcia pivoting method with the $2\times 2$ systems solved by the explicit inverse with scaling, $c = 500$ [5]. In [6], Bunch and Marcia suggested a variant of their method with no effect on properties (3) and (10). Hence Theorem 3 remains valid for this variant.

Theorems 2 and 3 say that the $LBL^T$ factorization is guaranteed to run to completion with inertia preserved if the matrix is not too ill-conditioned.
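The inertia-preservation claim of Theorems 2 and 3 is easy to check numerically for a well-conditioned matrix: by Sylvester's law of inertia, the block diagonal factor $B$ must have the same counts of negative and positive eigenvalues as $A$. A small sketch (my own, using SciPy's Bunch-Kaufman `ldl` as the factorization):

```python
import numpy as np
from scipy.linalg import ldl, eigh

# Random symmetric indefinite matrix; generically well-conditioned.
rng = np.random.default_rng(2)
M = rng.standard_normal((12, 12))
A = M + M.T

L, B, perm = ldl(A, lower=True)

def inertia(S):
    """(number of negative, number of positive) eigenvalues of symmetric S."""
    w = eigh(S, eigvals_only=True)
    return int(np.sum(w < 0)), int(np.sum(w > 0))

# The computed B carries the same inertia as A.
assert inertia(B) == inertia(A)
```

For a matrix violating the condition of Theorem 2 (very large $\kappa_2(A)$), the computed inertia can of course differ; the theorems quantify how much ill-conditioning is tolerable.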
4 Rank estimation

Cholesky factorization for symmetric positive semidefinite matrices with complete pivoting can be used for rank estimation. The stability is well studied

by Higham [11]. In this section, we discuss the stability of the Bunch-Parlett, fast Bunch-Parlett, or bounded Bunch-Kaufman pivoting strategy applied to the rank estimation of symmetric indefinite matrices.

4.1 Componentwise analysis

Recall that we use $A \in \mathbb{R}^{n\times n}$ to denote a symmetric matrix of floating point numbers. It is unrealistic to assume $A$ is singular because of the potential rounding errors. Let $\tilde{A}$ be the exact singular matrix of rank $r < n$ under consideration and $A = \tilde{A} + \Delta\tilde{A}$ be its stored matrix of floating point numbers. The overall backward error is $\tilde{A} - \hat{L}\hat{B}\hat{L}^T$, where $\hat{L} \in \mathbb{R}^{n\times r}$ and $\hat{B} \in \mathbb{R}^{r\times r}$. For simplicity of discussion we consider only $A - \hat{L}\hat{B}\hat{L}^T$ and say that $A$ has rank $r$, assuming $A$ is close to $\tilde{A}$. Without loss of generality, we assume that the necessary interchanges are done so that $A(1\!:\!r, 1\!:\!r)$ has rank $r$, as they would be with the Bunch-Parlett, fast Bunch-Parlett, or bounded Bunch-Kaufman pivoting. The factorization is denoted by $A = LBL^T$, where
$$B = \begin{bmatrix} B_1 \\ & B_2 \\ & & \ddots \\ & & & B_{m-1} \end{bmatrix} \in \mathbb{R}^{r\times r}, \qquad L = \begin{bmatrix} L_{11} \\ L_{21} & L_{22} \\ \vdots & \vdots & \ddots \\ L_{m-1,1} & L_{m-1,2} & \cdots & L_{m-1,m-1} \\ L_{m,1} & L_{m,2} & \cdots & L_{m,m-1} \end{bmatrix} \in \mathbb{R}^{n\times r}.$$
Each $B_i$ is either $1\times 1$ or $2\times 2$, with $L_{ii} = 1$ or $L_{ii} = I_2$, respectively. This factorization is guaranteed if $A(1\!:\!r, 1\!:\!r)$ satisfies the condition in Theorem 2. The pseudo-code is given in Algorithm 2.

Algorithm 2 $LBL^T$ factorization of $A \in \mathbb{R}^{n\times n}$ of rank $r < n$.
for all j = 1, ..., m-1 do
    $B_j = A_{jj} - \sum_{k=1}^{j-1} L_{jk}B_kL_{jk}^T$    (*)
    for all i = j+1, ..., m do
        $L_{ij} = (A_{ij} - \sum_{k=1}^{j-1} L_{ik}B_kL_{jk}^T)\,B_j^{-1}$    (**)
    end for
end for

Algorithm 2 can be regarded as the incomplete $LBL^T$ factorization after processing $r$ rows/columns. Theorem 4 gives a bound on its componentwise backward error. In rank estimation we are concerned with a symmetric matrix $A \in \mathbb{R}^{n\times n}$ of rank $r < n$.

Theorem 4 Consider the incomplete $LBL^T$ factorization of a symmetric matrix $A \in \mathbb{R}^{n\times n}$ after processing $r$ rows/columns ($r < n$), assume condition (3) holds for all linear systems involving $2\times 2$ pivots, and define the backward error $\Delta A$ by
$$A + \Delta A = \hat{L}\hat{B}\hat{L}^T + \hat{A}^{(r+1)}, \qquad (11)$$

where
$$\hat{A}^{(r+1)} = \begin{bmatrix} 0 & 0 \\ 0 & \hat{S}_r(A) \end{bmatrix},$$
with $\hat{S}_r(A) \in \mathbb{R}^{(n-r)\times(n-r)}$ the computed Schur complement. Then
$$|\Delta A| \le \max\{\epsilon_c, \epsilon_{4r+1}\}\big( |A| + |\hat{L}||\hat{B}||\hat{L}^T| + |\hat{A}^{(r+1)}| \big). \qquad (12)$$

Proof To simplify the notation, we let $\hat{L}_{ii} = 1$ or $I_2$ depending on whether $B_i$ is $1\times 1$ or $2\times 2$, for $i = 1, \dots, m-1$. By Lemma 3 with the assumption that condition (3) holds,
$$\Big| A_{ij} - \sum_{k=1}^{j} \hat{L}_{ik}\hat{B}_k\hat{L}_{jk}^T \Big| \le \max\{\epsilon_c, \epsilon_{4j-3}\}\Big( |A_{ij}| + \sum_{k=1}^{j} |\hat{L}_{ik}||\hat{B}_k||\hat{L}_{jk}^T| \Big), \qquad (13)$$
for $j = 1, \dots, m-1$ and $i = j, \dots, m$. Note that although $A_{mm}$ can be larger than $2\times 2$, the error analysis in Lemma 3 for $\hat{S}_r(A)$ is still valid. Therefore,
$$\Big| A_{mm} - \hat{S}_r(A) - \sum_{k=1}^{m-1} \hat{L}_{mk}\hat{B}_k\hat{L}_{mk}^T \Big| \le \epsilon_{4m-3}\Big( |A_{mm}| + |\hat{S}_r(A)| + \sum_{k=1}^{m-1} |\hat{L}_{mk}||\hat{B}_k||\hat{L}_{mk}^T| \Big). \qquad (14)$$
The result is obtained by collecting the terms in (13)-(14) into one matrix representation.

4.2 Normwise analysis

We now bound $\|A - \hat{L}\hat{B}\hat{L}^T\|_F$ to analyze the stability of the Bunch-Parlett, fast Bunch-Parlett, and bounded Bunch-Kaufman pivoting strategies applied to rank estimation for symmetric indefinite matrices. We begin with a perturbation theorem for the $LBL^T$ factorization.

Theorem 5 Let $S_k(A)$ be the Schur complement appearing in an $LBL^T$ factorization of a symmetric matrix $A \in \mathbb{R}^{n\times n}$ after processing the first $k$ rows/columns, $k < n$. Suppose there is a symmetric perturbation of $A$, denoted by $E$. Partition
$$A = \begin{bmatrix} A_{11} & A_{21}^T \\ A_{21} & A_{22} \end{bmatrix},$$
where $A_{11} \in \mathbb{R}^{k\times k}$, and partition $E$ accordingly. Assume that both $A_{11}$ and $A_{11} + E_{11}$ are nonsingular. Then
$$\|S_k(A+E) - S_k(A)\| \le \|E\|\,(1 + \|W\|)^2 + O(\|E\|^2),$$
where $W = A_{11}^{-1}A_{21}^T$ and $\|\cdot\|$ is a p-norm or the Frobenius norm.

Theorem 5 is from [10, Theorem 5.1], a generalization of a lemma by Higham [11, Lemma 2.1], [14, Lemma 10.10] for the Cholesky factorization.
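Theorem 5 can be spot-checked numerically. The sketch below (my own code) applies a small random symmetric perturbation $E$ and compares the change in the Schur complement against the first-order bound $\|E\|_F(1+\|W\|_F)^2$, with a 1% allowance for the $O(\|E\|^2)$ term:

```python
import numpy as np

def schur(A, k):
    """Schur complement of the leading k x k block of symmetric A."""
    return A[k:, k:] - A[k:, :k] @ np.linalg.solve(A[:k, :k], A[:k, k:])

rng = np.random.default_rng(3)
M = rng.standard_normal((6, 6))
A = M + M.T                         # symmetric, generically A11 nonsingular

E = rng.standard_normal((6, 6)) * 1e-8
E = E + E.T                         # small symmetric perturbation

k = 2
W = np.linalg.solve(A[:k, :k], A[k:, :k].T)      # W = A11^{-1} A21^T
lhs = np.linalg.norm(schur(A + E, k) - schur(A, k), 'fro')
bound = np.linalg.norm(E, 'fro') * (1 + np.linalg.norm(W, 'fro')) ** 2
assert lhs <= bound * 1.01          # slack covers the O(||E||^2) term
```

The bound is usually far from tight, since $\|E\|_F(1+\|W\|_F)^2$ replaces the exact first-order term $\|[-W^T\; I]\,E\,[-W^T\; I]^T\|_F$ by a product of norms.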

Theorem 6 With the Bunch-Parlett, fast Bunch-Parlett, or bounded Bunch-Kaufman pivoting applied to a symmetric indefinite matrix $A$ of rank $r$,
$$\|A - \hat{L}\hat{B}\hat{L}^T\|_F \le \max\{c, 4r+1\}\,(\tau(A) + 1)\big( (\|W\|_F + 1)^2 + 1 \big)\, u\, \|A\|_F + O(u^2),$$
where $c$ is from condition (3), $\tau(A) = \| |L||B||L^T| \|_F / \|A\|_F$, and $W = A(1\!:\!r, 1\!:\!r)^{-1}A(1\!:\!r, r\!+\!1\!:\!n)$. Note that we have assumed the interchanges of rows and columns by pivoting are done, so that $A(1\!:\!r, 1\!:\!r)$ has full rank.

Proof Since (3) holds, so does (12). The growth factor and $\tau(A)$ are well controlled, so that $\|\Delta A\| = O(u)$, where $\Delta A$ is defined in (11). Here and later in (16), we use property (7) for the derivation. Note that in exact arithmetic, $\hat{L}\hat{B}\hat{L}^T$ is the partial $LBL^T$ factorization of $A + \Delta A$ with $\hat{A}^{(r+1)}$ the Schur complement. By Theorem 5, we obtain the perturbation bound
$$\|\hat{A}^{(r+1)}\|_F \le (\|W\|_F + 1)^2\, \|\Delta A\|_F + O(u^2), \qquad (15)$$
where $W = A(1\!:\!r, 1\!:\!r)^{-1}A(1\!:\!r, r\!+\!1\!:\!n)$. By (11), (15), and $\|\Delta A\| = O(u)$, we have $\hat{L}\hat{B}\hat{L}^T = A + O(u)$. By Theorem 4,
$$\|\Delta A\|_F \le \max\{\epsilon_c, \epsilon_{4r+1}\}\big( \|\hat{L}\hat{B}\hat{L}^T\|_F + \|A\|_F + \|\hat{A}^{(r+1)}\|_F \big) = \max\{c, 4r+1\}\,(\tau(A) + 1)\, u\, \|A\|_F + O(u^2). \qquad (16)$$
Substituting (16) into (15), we obtain
$$\|\hat{A}^{(r+1)}\|_F \le \max\{c, 4r+1\}\,(\tau(A) + 1)(\|W\|_F + 1)^2\, u\, \|A\|_F + O(u^2). \qquad (17)$$
The result is concluded from (11), (16), and (17).

By Theorem 6, the bound on $\|A - \hat{L}\hat{B}\hat{L}^T\|_F / \|A\|_F$ is governed by $\|W\|_F$ and $\tau(A)$. A bound on $\|W\|_F$ is given in [10, Lemma 5.3]:
$$\|W\|_F \le \frac{\gamma}{\gamma+2}\sqrt{(n-r)\big((1+\gamma)^{2r} - 1\big)}, \qquad (18)$$
where $\gamma = \max\{\frac{1}{\alpha}, \frac{1}{1-\alpha}\}$ is the element bound of $L$. With the suggested pivoting argument $\alpha = \frac{1+\sqrt{17}}{8} \approx 0.640$, $\gamma \approx 2.562$. To bound $\tau(A)$, we apply the analysis of Higham for (9), and then obtain
$$\tau(A) \le 36n(r+1)\rho_{r+1} \qquad (19)$$
for symmetric $A \in \mathbb{R}^{n\times n}$ of rank $r < n$, where $\rho_{r+1}$ is the growth factor, whose bounds are displayed in Table 1. We conclude that with Bunch-Parlett, fast Bunch-Parlett, or bounded Bunch-Kaufman pivoting, the $LBL^T$ factorization is normwise backward stable for rank-deficient symmetric matrices.
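A simplified sketch of the incomplete factorization of this section follows (my own code, not the paper's implementation). It uses only $1\times 1$ pivots chosen by largest diagonal magnitude as a stand-in for the paper's rook/complete pivoting, which is adequate when large pivots stay on the diagonal; $r$ elimination steps yield $L \in \mathbb{R}^{n\times r}$, the pivots $B$, and the Schur complement $\hat{S}_r(A)$ appearing in (11):

```python
import numpy as np

def partial_lbl(A, r):
    """r steps of symmetric elimination with 1x1 diagonal pivoting.

    Returns L (n x r), pivots B (length r), the Schur complement S_r(A),
    and the accumulated permutation P, so that (P A P^T) ~ L diag(B) L^T
    plus the trailing Schur complement block.
    """
    A = A.astype(float).copy()
    n = A.shape[0]
    P = np.arange(n)
    L = np.zeros((n, r))
    B = np.zeros(r)
    for k in range(r):
        # Bring the largest-magnitude remaining diagonal entry to position k.
        p = k + np.argmax(np.abs(np.diag(A)[k:]))
        A[[k, p]] = A[[p, k]]; A[:, [k, p]] = A[:, [p, k]]
        P[[k, p]] = P[[p, k]]
        L[[k, p]] = L[[p, k]]
        # 1x1 pivot and multiplier column, then the Schur update.
        B[k] = A[k, k]
        L[k, k] = 1.0
        L[k+1:, k] = A[k+1:, k] / B[k]
        A[k+1:, k+1:] -= np.outer(L[k+1:, k], A[k, k+1:])
    return L, B, A[r:, r:], P

# An exactly rank-2 indefinite matrix: the Schur complement after r = 2
# steps vanishes, and L diag(B) L^T reconstructs the permuted matrix.
x = np.array([1.0, 2.0, -1.0, 3.0])
y = np.array([0.5, -1.0, 2.0, 1.0])
A = 3 * np.outer(x, x) - 2 * np.outer(y, y)
L, B, S, P = partial_lbl(A, 2)
assert np.allclose(S, 0)
assert np.allclose(L @ np.diag(B) @ L.T, A[np.ix_(P, P)])
```

This mirrors the structure of (11): the backward error splits into the factored part $\hat{L}\hat{B}\hat{L}^T$ and the residual Schur block $\hat{A}^{(r+1)}$.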

4.3 Numerical experiments Our experiments were on a laptop with machine epsilon 2 52 2.22 10 16, with implementation based on newmat library 3. For rank estimation, the important practical issue is when to stop the factorization. In (17), the bound on Â(r+1) F / A F is governed by W F and τ(a). However, both bounds (18) and (19) are pessimistic. To investigate the typical ranges of W F and τ(a) in practice, we used the random matrices described as follows. Each indefinite matrix was constructed as QΛQ T R n n, where Q, different for each matrix, is a random orthogonal matrix generated by the method of G. W. Stewart [16], and Λ = diag(λ i ) of rank r. The following three test sets were used in our experiments: λ 1 = λ 2 = = λ r 1 = 1, λ r = σ, λ 1 = λ 2 = = λ r 1 = σ, λ r = 1, λ i = β i, i = 1,...,r 1, λ r = 1, where 0 < σ 1 and β r 1 = σ. We assigned the sign of λ i randomly for i = 1,..., r 1. Let t denote the number of negative eigenvalues. For each test set, we experimented with all combinations of n = 10, 20,..., 100, r = 2, 3,..., n, t = 1, 2,..., r 1, and σ = 1, 10 3,..., 10 12, for a total of 94, 875 indefinite matrices in each set. In addition to W F and τ(a), we also measured the growth factor ρ r+1, and the relative backward error ξ r = A ˆL ˆB ˆL T F /(u A F ), and the relative Schur residual η r = Â(r+1) F /(u A F ). Their maximum values for the Bunch- Parlett, fast Bunch-Parlett, and bounded Bunch-Kaufman pivoting strategies are displayed in Table 2. All these numbers are modest, except that the bounded Bunch-Kaufman pivoting strategy introduced relatively large backward errors. The assessment showed that the rank estimation by LBL T factorization with Bunch-Parlett and fast Bunch-Parlett pivoting strategies is potentially stable in practice. Table 2: Experimental maximum ρ r+1, W F, τ(a), ξ r, η r for assessment of stability of rank estimation. 
Algorithm               ρ_{r+1}   ‖W‖_F     τ(A)     ξ_r       η_r
Bunch-Parlett           15.143    39.313    38.263   90.911    90.514
Fast Bunch-Parlett      18.506    37.934    41.919   172.412   172.301
Bounded Bunch-Kaufman   28.691    103.467   53.724   67720     67719.9

Note that the bounds (18) and (19) depend on $r$ more than on $n$, as does (17). After experimenting with several different stopping criteria, we suggest
$$\|\hat A^{(k+1)}\|_F \le (k+1)^{3/2}\, u\, \|A\|_F, \qquad (20)$$

³ newmat is a library of matrix operations written in C++ by Robert Davies.

or the less expensive
$$\|\hat B_i\|_F \le (k+1)^{3/2}\, u\, \|\hat B_1\|_F, \qquad (21)$$
where $\hat B_i$ is the block pivot chosen in $\hat A^{(k+1)}$. With the Bunch-Parlett pivoting strategy, $\|A\|_M = \|\hat B_1\|_M$ and $\|\hat A^{(k+1)}\|_M = \|\hat B_i\|_M$. Therefore, (20) and (21) are related.

One potential concern is that continuing the factorization on $\hat A^{(r+1)}$ could be unstable for rank estimation. However, the element growth is well controlled by pivoting. The dimensions of the Schur complements decrease, whereas the right-hand sides of (20) and (21) increase during the decomposition. These properties safeguard the stability of rank estimation.

We experimented on the three sets of matrices described above, a total of 284,625 matrices. The results are as follows. Using the Bunch-Parlett or fast Bunch-Parlett pivoting strategy with stopping criterion (20), the estimated ranks were all correct. Using the Bunch-Parlett and fast Bunch-Parlett pivoting strategies with stopping criterion (21), there were 26 (0.009%) and 53 (0.019%) errors, respectively. Using the bounded Bunch-Kaufman pivoting strategy with stopping criteria (20) and (21), there were 15 (0.005%) and 1,067 (0.375%) errors, respectively.

A further experiment with $\sigma = 10^{-13}$ and stopping criterion (21) exhibited minor instability, since the conditioning of the nonsingular part of a matrix affects the stability of rank estimation. Using the Bunch-Parlett, fast Bunch-Parlett, and bounded Bunch-Kaufman pivoting strategies, there were 503 (0.884%), 488 (0.857%), and 939 (1.651%) errors, respectively.

Forcing all the nonzero eigenvalues to be positive in the three test sets, we also experimented with rank estimation of positive semidefinite matrices by $LDL^T$ factorization. For these positive semidefinite matrices, the Bunch-Parlett and fast Bunch-Parlett pivoting strategies are equivalent to complete pivoting. With stopping criteria (20) and (21), all the estimated ranks were correct for $\sigma = 1, 10^{-3}, \dots, 10^{-12}$. Minor instability (fewer than 5% errors) occurred with $\sigma = 10^{-13}$.
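To make the procedure concrete, here is a minimal sketch (in Python/NumPy, not the newmat implementation used in the experiments) of rank estimation by Bunch-Parlett pivoting with stopping criterion (20). The function name `estimate_rank_bp` and the explicit formation of each Schur complement are our own illustrative choices; a production code would update a stored factorization in place rather than copying submatrices.

```python
import numpy as np

def estimate_rank_bp(A):
    """Estimate rank(A) via LBL^T elimination with Bunch-Parlett pivoting,
    stopping when the Schur complement satisfies criterion (20)."""
    alpha = (1 + np.sqrt(17)) / 8        # Bunch-Parlett pivoting argument
    u = np.finfo(float).eps
    norm_A = np.linalg.norm(A, 'fro')
    S = np.array(A, dtype=float)         # current Schur complement
    k = 0                                # number of columns eliminated so far
    n = S.shape[0]
    while k < n:
        # Stopping criterion (20): remaining Schur complement is negligible.
        if np.linalg.norm(S, 'fro') <= (k + 1) ** 1.5 * u * norm_A:
            return k
        m = S.shape[0]
        if m == 1:
            return k + 1                 # nonnegligible 1x1 trailing block
        diag = np.abs(np.diag(S))
        mu1, p = diag.max(), diag.argmax()       # largest diagonal entry
        off = np.abs(S - np.diag(np.diag(S)))
        mu0 = off.max()                          # largest off-diagonal entry
        if mu1 >= alpha * mu0:
            # 1x1 pivot: symmetric interchange brings row/col p to the front.
            idx = [p] + [i for i in range(m) if i != p]
            T = S[np.ix_(idx, idx)]
            S = T[1:, 1:] - np.outer(T[1:, 0], T[1:, 0]) / T[0, 0]
            k += 1
        else:
            # 2x2 pivot: bring the rows/cols of the largest off-diagonal front.
            i, j = np.unravel_index(off.argmax(), off.shape)
            idx = [i, j] + [t for t in range(m) if t not in (i, j)]
            T = S[np.ix_(idx, idx)]
            E, C = T[:2, :2], T[2:, :2]
            S = T[2:, 2:] - C @ np.linalg.solve(E, C.T)
            k += 2
    return k

# Example in the spirit of the first test set: a rank-12 indefinite matrix
# Q diag(lambda) Q^T with random signs and sigma = 1e-6.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((30, 30)))   # random orthogonal factor
lam = np.zeros(30)
lam[:11] = rng.choice([-1.0, 1.0], size=11)
lam[11] = 1e-6
A = (Q * lam) @ Q.T
A = (A + A.T) / 2
print(estimate_rank_bp(A))
```

The 2×2 pivot block chosen when $\mu_1 < \alpha \mu_0$ has determinant at most $\mu_1^2 - \mu_0^2 < 0$, so the `solve` call is always well defined.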
The stopping criteria suggested by Higham [11] for rank estimation of positive semidefinite matrices by Cholesky factorization have the same form as (20) and (21), but with $(k+1)^{3/2}$ replaced by $n$. Using his stopping criteria with complete pivoting, all the estimated ranks of the semidefinite matrices were correct for $\sigma = 1, 10^{-3}, \dots, 10^{-12}$ and also for $\sigma = 10^{-13}$. However, his stopping criteria were less accurate than (20) and (21) for indefinite matrices.

In summary, with $\sigma = 1, 10^{-3}, \dots, 10^{-12}$ (i.e., when the rank was not ambiguous), both stopping criteria (20) and (21) worked well with the Bunch-Parlett and fast Bunch-Parlett pivoting strategies. However, they may not be the best choice for all matrices. A priori information about the matrix, such as the size of $W$, the growth factor, and the distribution of the nonzero eigenvalues, may help in adjusting the stopping criterion.

5 Concluding remarks

We have studied the stability analysis of $LBL^T$ factorizations. Our concluding remarks are as follows.

1. We gave in Theorem 2 a sufficient condition under which an $LBL^T$ factorization of a symmetric matrix is guaranteed to run to completion with the inertia preserved. The corresponding result for symmetric tridiagonal matrices was given in Theorem 3.

2. We analyzed the numerical stability of rank estimation of symmetric indefinite matrices using $LBL^T$ factorization with the Bunch-Parlett, fast Bunch-Parlett, or bounded Bunch-Kaufman pivoting strategies. The results are presented in Theorems 4 and 6.

3. Based on our analysis and experiments, we suggest the stopping criteria (20) and (21) for rank estimation by $LBL^T$ factorization. In our experiments, both worked well with the Bunch-Parlett and fast Bunch-Parlett pivoting strategies when the rank was not ambiguous. We recommend (21) for its low cost, and the fast Bunch-Parlett pivoting strategy for its efficiency without much loss of accuracy.

Acknowledgment

This paper is drawn from Chapter 3 of the author's doctoral thesis [9]. Thanks to Dianne P. O'Leary (thesis advisor) for constructive discussions and her help in editing an early manuscript, to Howard Elman for his careful reading, and to Nick Higham for his valuable comments and for providing the Wilkinson reference [18]. It is also a pleasure to thank Roger Lee for discussions on eigenvalue problems.

References

[1] C. Ashcraft, R. G. Grimes, and J. G. Lewis. Accurate symmetric indefinite linear equation solvers. SIAM J. Matrix Anal. Appl., 20(2):513–561, 1998.

[2] J. R. Bunch. Analysis of the diagonal pivoting method. SIAM J. Numer. Anal., 8(4):656–680, 1971.

[3] J. R. Bunch. Partial pivoting strategies for symmetric matrices. SIAM J. Numer. Anal., 11(3):521–528, 1974.

[4] J. R. Bunch and L. Kaufman. Some stable methods for calculating inertia and solving symmetric linear systems. Math. Comp., 31:163–179, 1977.

[5] J. R. Bunch and R. F. Marcia.
A pivoting strategy for symmetric tridiagonal matrices. Numer. Linear Algebra Appl., 12(9):911–922, 2005.

[6] J. R. Bunch and R. F. Marcia. A simplified pivoting strategy for symmetric tridiagonal matrices. Numer. Linear Algebra Appl., 13(10):865–867, 2006.

[7] J. R. Bunch and B. N. Parlett. Direct methods for solving symmetric indefinite systems of linear equations. SIAM J. Numer. Anal., 8(4):639–655, 1971.

[8] J. J. Dongarra, I. S. Duff, D. C. Sorensen, and H. A. van der Vorst. Solving Linear Systems on Vector and Shared Memory Computers. SIAM, 1991.

[9] H.-r. Fang. Matrix factorizations, triadic matrices, and modified Cholesky factorizations for optimization. PhD thesis, University of Maryland, 2006.

[10] H.-r. Fang and D. P. O'Leary. Stable factorizations of symmetric tridiagonal and triadic matrices. SIAM J. Matrix Anal. Appl., 28(2):576–595, 2006.

[11] N. J. Higham. Analysis of the Cholesky decomposition of a semi-definite matrix. In M. G. Cox and S. J. Hammarling, editors, Reliable Numerical Computation, pages 161–185. Oxford University Press, New York, 1990.

[12] N. J. Higham. Stability of the diagonal pivoting method with partial pivoting. SIAM J. Matrix Anal. Appl., 18(1):52–65, 1997.

[13] N. J. Higham. Stability of block LDL^T factorization of a symmetric tridiagonal matrix. Linear Algebra Appl., 287:181–189, 1999.

[14] N. J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, second edition, 2002.

[15] I. Slapničar. Componentwise analysis of direct factorization of real symmetric and Hermitian matrices. Linear Algebra Appl., 272:227–275, 1998.

[16] G. W. Stewart. The efficient generation of random orthogonal matrices with an application to condition estimation. SIAM J. Numer. Anal., 17:403–409, 1980.

[17] G. W. Stewart and J.-g. Sun. Matrix Perturbation Theory. Academic Press, 1990.

[18] J. H. Wilkinson. A priori error analysis of algebraic processes. In I. G. Petrovsky, editor, Proc. International Congress of Mathematicians, Moscow 1966, pages 629–640. Mir Publishers, 1968.