© 2007 Society for Industrial and Applied Mathematics


SIAM J. MATRIX ANAL. APPL., Vol. 29. © 2007 Society for Industrial and Applied Mathematics.

A SUPERFAST ALGORITHM FOR TOEPLITZ SYSTEMS OF LINEAR EQUATIONS

S. CHANDRASEKARAN, M. GU, X. SUN, J. XIA, AND J. ZHU

Abstract. In this paper we develop a new superfast solver for Toeplitz systems of linear equations. To solve Toeplitz systems many people use displacement equation methods. With displacement structures, Toeplitz matrices can be transformed into Cauchy-like matrices using the FFT or other trigonometric transformations. These Cauchy-like matrices have a special property: their off-diagonal blocks have small numerical ranks. This low-rank property plays a central role in our superfast Toeplitz solver. It enables us to quickly approximate the Cauchy-like matrices by structured matrices called sequentially semiseparable (SSS) matrices. The major work of the construction of these SSS forms can be done in precomputations independent of the Toeplitz matrix entries. These SSS representations are compact because of the low-rank property. The SSS Cauchy-like systems can be solved in linear time with linear storage. Excluding precomputations, the main operations are the FFT and the SSS system solve, which are both very efficient. Our new Toeplitz solver is stable in practice. Numerical examples are presented to illustrate the efficiency and the practical stability.

Key words. displacement equation, SSS structure, superfast algorithm, Toeplitz matrix

AMS subject classifications. 15A06, 65F05, 65G05

DOI. 10.1137/…

1. Introduction. Toeplitz systems of linear equations arise in many applications, including PDE solving, signal processing, time series analysis, orthogonal polynomials, and many others. A Toeplitz system is a linear system

(1.1)  T x = b

whose coefficient matrix is a Toeplitz matrix

(1.2)  T = [ t_0      t_{-1}   t_{-2}   ...  t_{-(N-1)} ]
           [ t_1      t_0      t_{-1}   ...  t_{-(N-2)} ]
           [ t_2      t_1      t_0      ...  t_{-(N-3)} ]
           [ ...                        ...  ...        ]
           [ t_{N-1}  t_{N-2}  t_{N-3}  ...  t_0        ],

that is, its entries are constant along every diagonal (a matrix whose entries are constant along every antidiagonal is called a Hankel matrix). The vector
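For concreteness, the constant-diagonal structure in (1.2) is determined by the 2N − 1 entries t_{-(N-1)}, …, t_{N-1}. The following small NumPy sketch (illustrative, not from the paper) builds T from those entries and checks the diagonal property.

```python
import numpy as np

def toeplitz_from_vector(t, N):
    """Build an N x N Toeplitz matrix T with T[i, j] = t[(i - j) + (N - 1)],
    where t lists t_{-(N-1)}, ..., t_0, ..., t_{N-1}."""
    assert len(t) == 2 * N - 1
    T = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            T[i, j] = t[(i - j) + (N - 1)]
    return T

N = 4
t = np.arange(-(N - 1), N, dtype=float)  # t_k = k, just for illustration
T = toeplitz_from_vector(t, N)
# Entries are constant along every diagonal: T[i+1, j+1] == T[i, j].
assert np.allclose(T[1:, 1:], T[:-1, :-1])
```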
t = (t_{-(N-1)}, …, t_{-1}, t_0, t_1, …, t_{N-1})^T is called the Toeplitz vector that generates T. There are both direct and iterative methods for solving (1.1). Direct solvers are said to be fast if they cost O(N^2) operations; examples include Schur-type methods,

Received by the editors October 9, 2004; accepted for publication (in revised form) by D. A. Bini May 8, 2007; published electronically December 3, 2007.

Department of Electrical and Computer Engineering, University of California at Santa Barbara, Santa Barbara, CA 93106 (shiv@ece.ucsb.edu).

Department of Mathematics, University of California at Berkeley, Berkeley, CA 94720 (mgu@math.berkeley.edu, zhujiang.cal@gmail.com).

Department of Mechanical Engineering, University of California at Berkeley, Berkeley, CA 94720 (sunxt@cal.berkeley.edu).

Department of Mathematics, University of California at Los Angeles, Los Angeles, CA (jxia@math.ucla.edu).

Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

Levinson-type methods, and others [32]. An important type of direct solver is the displacement equation type fast solver based on Gaussian elimination. Some known displacement equation type methods are the Heinig [28], GKO [22], and Gu [26] methods; those methods have complexity O(N^2). Methods with complexity less than O(N^2) are called superfast. In this paper we will present a displacement equation type superfast algorithm.

1.1. Fast and superfast methods. Many fast and superfast methods that have been developed are numerically unstable [5, 5, 6, 7, 37]. References [5] and [5] showed that the Schur algorithm and the Levinson algorithm are weakly stable in some cases, but both may be highly unstable in the case of an indefinite and nonsymmetric matrix. Stable generalized Schur algorithms [32] and look-ahead algorithms were developed in [9, 10]. High-performance look-ahead Schur algorithms were presented in [20]. Many other solvers use the FFT or other trigonometric transforms to convert Toeplitz (or even Hankel or Toeplitz-plus-Hankel) matrices into generalized Cauchy or Vandermonde matrices, which can be done stably in O(N log N) operations. This is also the approach that we will use in this paper, with the aid of a displacement structure.

The concept of displacement structure was first introduced in [3]. The Sylvester-type displacement equation for a matrix Ĉ ∈ R^{N×N} [29] is

(1.3)  Ω Ĉ − Ĉ Λ = U V,

where Ω, Λ ∈ R^{N×N}, U ∈ R^{N×α}, V ∈ R^{α×N}, and α (≤ N) is the displacement rank with respect to Ω and Λ if rank(UV) = α. The matrix Ĉ is considered to possess a displacement structure with respect to Ω and Λ if α ≪ N. With displacement structures it was shown in [9, 2, 36, 22, 28] that Toeplitz and Hankel matrices can be transformed into Cauchy-like matrices of the following form:

Ĉ = [ u_i^T v_j / (η_i − λ_j) ]_{i,j=1}^{N},   u_i, v_j ∈ R^α,

where we assume that η_i ≠ λ_j for 1 ≤ i, j ≤ N. Equivalently, a Cauchy-like matrix is the unique solution to the displacement equation (1.3) with Ω = diag(η_1, …, η_N), Λ = diag(λ_1, …, λ_N), U = (u_1, …, u_N)^T, and V =
(v_1, …, v_N). In particular, Ĉ is a Cauchy matrix if u_i^T v_j = 1 for all i and j. The displacement rank of Ĉ is at most α.

To solve Toeplitz systems through Cauchy-like matrices, many people have utilized matrix factorizations. Gohberg and Olshevsky [23] presented a fast variation of the straightforward Gaussian elimination with partial pivoting (GEPP) procedure to solve a Cauchy-like linear system of equations in O(N^2) operations. Among other results, Gohberg, Kailath, and Olshevsky [22] developed algorithm GKO, an improved version of Heinig's algorithm [28], and demonstrated numerically that it is stable. In their algorithm, the Hankel matrix and the Toeplitz-plus-Hankel matrix are also transformed via fast trigonometric transforms into Cauchy-like matrices.
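The equivalence between the Cauchy-like form and the displacement equation (1.3) is easy to verify numerically. The following hedged sketch (illustrative node sets and random generators, not from the paper) builds a Cauchy-like matrix from its generators and checks that it satisfies Ω Ĉ − Ĉ Λ = UV with displacement rank at most α.

```python
import numpy as np

rng = np.random.default_rng(0)
N, alpha = 6, 2

# Distinct node sets eta_i != lambda_j and rank-alpha generators.
eta = np.linspace(1.0, 2.0, N)
lam = np.linspace(3.0, 4.0, N)
U = rng.standard_normal((N, alpha))
V = rng.standard_normal((alpha, N))

# Cauchy-like matrix: C[i, j] = u_i^T v_j / (eta_i - lambda_j).
C = (U @ V) / (eta[:, None] - lam[None, :])

# C solves the displacement equation
# Omega C - C Lambda = U V with Omega = diag(eta), Lambda = diag(lam).
residual = np.diag(eta) @ C - C @ np.diag(lam) - U @ V
assert np.allclose(residual, 0.0)
```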

Gu presented a modified algorithm in [26] to avoid extra error growth. This algorithm is numerically stable, provided that the element growth in the computed factorization is not large. The algorithm takes O(N^2) operations and is a fast stable method.

Superfast algorithms appeared in [3,, 6, 7, 35,, 2, 2] and many others. Superfast algorithms use divide-and-conquer strategies; Morf developed the first such idea in [3]. These methods are unstable for nonsymmetric systems, as they cannot deal with nearly singular leading principal submatrices. Van Barel and Kravanja presented a superfast method for rational interpolation at roots of unity [39]. A similar idea was then applied to Toeplitz systems [38]; it provided an explicit formula for the inverse of a Toeplitz matrix. Additional techniques such as iterative refinement and downdating were still required to stabilize their algorithm.

1.2. Main results. Our new Toeplitz solver is also of the displacement equation type. Given a Toeplitz linear system, we first use the FFT to transform the associated Toeplitz matrix into a Cauchy-like matrix. Then, instead of using matrix factorizations, which often cost O(N^2) or more, we exploit a special low-rank property of Cauchy-like matrices: every off-diagonal block of a Cauchy-like matrix has low numerical rank. Using this low-rank property, we approximate the Cauchy-like matrix by a low-rank matrix structure called the sequentially semiseparable (SSS) matrix, proposed by Chandrasekaran et al. [2, 3]. A system whose coefficient matrix is in compact SSS form can be solved with only O(p^2 N) operations, where N is the matrix dimension and p is the complexity of the semiseparable description. The SSS solver is practically stable in our numerical tests and those in [2, 3].

The SSS structure was developed to capture the low-rank property of the off-diagonal blocks of a matrix and, at the same time, to maintain stability or practical stability. It is a matrix
analog of semiseparable integral kernels in Kailath's paper [30]. Matrix operations with compact SSS representations are very efficient, provided that such compact representations exist or can be easily computed. This turns out to be true in our case, since the Cauchy-like matrices are transformed from Toeplitz matrices. We use a recursive compression scheme with a shifting strategy to construct compact SSS forms for those Cauchy-like matrices. The major work of the compressions can be precomputed on certain Cauchy matrices which are independent of the actual Toeplitz matrix entries. The overall algorithm thus has the following stages:
1. Precompute compressions of off-diagonal blocks of Cauchy matrices.
2. Transform the Toeplitz matrix into a Cauchy-like matrix in O(N log N) operations.
3. Construct a compact SSS representation from the precomputed compressions and solve the Cauchy-like matrix system in O(p^2 N) operations.
4. Recover the solution of the Toeplitz system in O(N log N) operations.
The stages above are either stable or practically stable. Our numerical results indicate that the overall algorithm is stable in practice. The Toeplitz matrix does not have to be symmetric or positive definite, and no extra stabilizing step is necessary. After the precomputations, the total cost of the algorithm is O(N log N) + O(p^2 N). This indicates that the entire algorithm is superfast.

We also point out that similar techniques are used in [33], where the low-rank property is exploited through the block columns without diagonals (called neutered block columns in [33]), in contrast with the off-diagonal blocks here. The compressions of either the neutered block columns or the off-diagonal blocks give data-sparse representations which enable fast factorizations of the Cauchy-like matrices. In fact, corresponding to neutered block rows or columns, there are also matrix representations called hierarchically semiseparable (HSS) matrices [], which are usually more complicated structures than SSS matrices.

1.3. Overview. We will discuss the displacement structure and the transformation from a Toeplitz problem to a Cauchy-like problem in section 2, where the low-rank property of this Cauchy-like problem is also exploited. Section 3 then gives a linear-complexity solver using the SSS structure. In section 4 we present an algorithm for fast construction of SSS structures. We then analyze the complexity in section 5 and use some numerical experiments to demonstrate the efficiency and the practical stability. All algorithms have been implemented in Fortran 90. Section 6 draws some conclusions.

2. Displacement structures and low-rank property.

2.1. Cauchy-like systems. Given a Toeplitz system, we can use a displacement structure to transform it into a Cauchy-like system. Define the δ-circulant shift matrix

Z_δ = [ 0           δ ]
      [ 1   0         ]
      [     …   …     ]
      [         1   0 ]  ∈ R^{N×N}

(ones on the subdiagonal, δ in the top right corner), and let Ω = Z_1 and Λ = Z_{−1} in (1.3). Kailath, Kung, and Morf [3] have shown that every Toeplitz matrix satisfies the displacement equation (1.3) with Ω = Z_1 and Λ = Z_{−1}, where the right-hand side UV, having nonzero entries only in its first row and last column, is a matrix of rank at most 2. Hence the displacement rank of a Toeplitz matrix is at most 2 with respect to Z_1 and Z_{−1}. The following result can be found in [28].

Proposition 2.1. Let Ĉ ∈ R^{N×N} be a matrix satisfying the displacement equation

(2.1)  Z_1 Ĉ − Ĉ Z_{−1} = U V,

where U ∈ R^{N×α} and V ∈ R^{α×N}. Then F Ĉ D_0 F^H is a Cauchy-like matrix satisfying

(2.2)  D_1 (F Ĉ D_0 F^H) − (F Ĉ D_0 F^H) D_{−1} = (F U)(V D_0 F^H),

where F = (1/√N)(ω^{2(k−1)(j−1)})_{k,j=1}^{N} is the normalized inverse discrete Fourier transform matrix, ω = e^{πi/N}, and

D_1 = diag(1, ω^2, …, ω^{2(N−1)}),  D_{−1} = diag(ω, ω^3, …, ω^{2N−1}),  D_0 = diag(1, ω, …, ω^{N−1}).

Here α = 2.

This proposition suggests that for a Toeplitz matrix T, one can convert it into the
Cauchy-like matrix in (2.2). Therefore the Toeplitz system can be readily transformed into a new system

(2.3)  C x̂ = b̂,
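The two displacement facts above can be checked numerically. The sketch below (illustrative; the scaling matrices are chosen with one consistent sign/conjugation convention, which may differ from the paper's by conjugation) verifies that Z_1 T − T Z_{−1} has rank at most 2 and that the FFT-transformed matrix inherits displacement rank at most 2 with respect to the diagonal matrices.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8
t = rng.standard_normal(2 * N - 1)
T = np.array([[t[(i - j) + (N - 1)] for j in range(N)] for i in range(N)])

# Z_delta: ones on the subdiagonal, delta in the top-right corner.
def Z(delta, N):
    M = np.eye(N, k=-1, dtype=complex)
    M[0, -1] = delta
    return M

# Toeplitz displacement: Z_1 T - T Z_{-1} is nonzero only in its
# first row and last column, hence has rank at most 2.
D_T = Z(1, N) @ T - T @ Z(-1, N)
assert np.linalg.matrix_rank(D_T, tol=1e-10) <= 2

# FFT transformation to a Cauchy-like matrix.
k = np.arange(N)
w = np.exp(1j * np.pi / N)
F = w ** (2 * np.outer(k, k)) / np.sqrt(N)        # inverse-DFT-style matrix
D1 = np.diag(w ** (2 * k))
Dm1 = np.diag(w ** (2 * k + 1))
D0 = np.diag(w ** (-k))                           # convention chosen for this sketch
C = F @ T @ D0 @ F.conj().T

# C inherits displacement rank <= 2 with respect to D1 and Dm1.
assert np.linalg.matrix_rank(D1 @ C - C @ Dm1, tol=1e-10) <= 2
```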

where C has the form

(2.4)  C = [ u_i^T v_j / (ω^{2i} − ω^{2j+1}) ]_{i,j=1}^{N},   u_i, v_j ∈ R^α.

In section 3 we will present a fast solver for (2.3). After obtaining x̂ we will then recover x with an FFT again. All the stages involving the FFT are stable and cost O(N log N) operations. The solver for (2.3) has linear complexity and turns out to be practically stable. Thus the total cost of our algorithm is bounded by O(N log N) + O(N p^2), where p is a parameter that will be described below. This indicates that our method is a superfast one with practical stability.

2.2. Low-rank property of Cauchy-like matrices. In this section, we will show a low-rank property of C, i.e., every off-diagonal block of C has a low numerical rank. This property is the basis of the superfast SSS solver in section 3.

First, a simple numerical experiment can give us an idea of the low-rank property of C. To find out the numerical ranks we can use one of the following tools:
1. τ-accurate SVD: singular values less than τ are dropped if τ is an absolute tolerance, or singular values less than τ times the largest singular value are dropped if τ is a relative tolerance.
2. τ-accurate QR: A ≈ QR, A: m × n, Q: m × k, R: k × n, k ≤ l = min(m, n), which is obtained in the following way. Compute the exact QR factorization A = Q̂R̂, where Q̂ is m × l and R̂ is l × n with diagonal entries satisfying |R̂_11| ≥ |R̂_22| ≥ ⋯ ≥ |R̂_ll|. Then obtain R by dropping all rows of R̂ with diagonal entries less than τ if τ is an absolute tolerance, or with diagonal entries less than τ|R̂_11| if τ is a relative tolerance. Drop the relevant columns of Q̂ accordingly to obtain Q.
Later, by ranks we mean numerical ranks.

Here we take some random Toeplitz matrices of different sizes. We then transform them into Cauchy-like matrices C and compute the numerical ranks of their off-diagonal blocks. For simplicity, we compute the ranks for the blocks C(1:d, d+1:N), C(1:2d, 2d+1:N), …, C(1:kd, kd+1:N), …, where d is a fixed integer, and these blocks are numbered as blocks 1, 2, …, k as shown in Figure 2.1.
Here, k = 8 off-diagonal blocks for each of three N × N Cauchy-like matrices are considered. See Table 2.1 for the numerical ranks, where we use the τ-accurate SVD with τ an absolute tolerance. We can see that the numerical ranks are relatively small as compared to the block sizes, and when we double the dimension of the matrix, the numerical ranks do not increase much. This is more significant when a larger τ is used.

Fig. 2.1. Off-diagonal blocks numbered as 1, 2, 3, … (upper triangular part only).
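The rank experiment is easy to reproduce. The following sketch (illustrative sizes; τ = 10^{-9} as an absolute tolerance, as in the experiment above) builds a Cauchy-like matrix on the unit-circle nodes with random rank-2 numerators and measures off-diagonal numerical ranks by τ-accurate SVD.

```python
import numpy as np

def tau_rank(A, tau):
    """Numerical rank: number of singular values >= tau (absolute tolerance)."""
    return int(np.sum(np.linalg.svd(A, compute_uv=False) >= tau))

rng = np.random.default_rng(2)
N, d, alpha = 128, 16, 2
idx = np.arange(1, N + 1)
w = np.exp(1j * np.pi / N)
lam = w ** (2 * idx)            # row nodes  omega^{2i}
eta = w ** (2 * idx + 1)        # column nodes omega^{2j+1}
U = rng.standard_normal((N, alpha))
V = rng.standard_normal((alpha, N))
C = (U @ V) / (lam[:, None] - eta[None, :])

# Ranks of the blocks C(1:kd, kd+1:N) for k = 1, 2, ...
ranks = [tau_rank(C[: k * d, k * d :], 1e-9) for k in range(1, N // d)]

# The large middle block is far from full rank.
r_mid = tau_rank(C[: N // 2, N // 2 :], 1e-9)
assert r_mid < N // 2
```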

Table 2.1. Off-diagonal numerical ranks of Cauchy-like matrices transformed from some Toeplitz matrices, with absolute tolerance τ = 10^{-9} (block sizes and ranks reported for N = 640, 1280, 2560; block # 1, 2, …).

The low-rank property can be verified theoretically in the following way. We first consider the special case of (2.4) where all u_i^T v_j = 1. We show that the following Cauchy matrix has low-rank off-diagonal blocks:

(2.5)  C_0 = [ 1/(ω^{2i} − ω^{2j+1}) ]_{i,j=1}^{N}.

The central idea is similar to that in the fast multipole method [25, 8], which implies that with well-separated points (see, e.g., Figure 2.2) an interaction matrix is numerically low-rank. Here we introduce two sets of points {λ_k}_{k=1}^{N} = {ω^{2k}}_{k=1}^{N} and {η_k}_{k=1}^{N} = {ω^{2k+1}}_{k=1}^{N} on the unit circle. When we consider an off-diagonal block of C_0 of the form

G = [ 1/(λ_i − η_j) ]_{1≤i≤p, p+1≤j≤N},

we can show that G is numerically low-rank. In fact, G corresponds to two well-separated sets {λ_k}_{k=1}^{p} and {η_j}_{j=p+1}^{N}; that is, there exists a point c ∈ C such that

(2.6)  |λ_i − c| > d + e,  i = 1, …, p,    |η_j − c| < d,  j = p+1, …, N,

where d, e are positive constants. Consider the expansion

(2.7)  1/(λ_i − η_j) = 1/((λ_i − c) − (η_j − c)) = Σ_{k=0}^{r} (η_j − c)^k / (λ_i − c)^{k+1} + O( |(η_j − c)/(λ_i − c)|^{r+1} )

(2.8)                = Σ_{k=0}^{r} (η_j − c)^k / (λ_i − c)^{k+1} + ε,

where ε = O(|(η_j − c)/(λ_i − c)|^{r+1}) and r is chosen so that ε is bounded by a given tolerance.

Fig. 2.2. Well-separated sets in the plane.
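The truncated expansion above yields an explicit low-rank factorization, which can be checked numerically. The sketch below (illustrative separation constants d = 0.4, e = 1.6 with c = 0) builds the rank-(r + 1) factors and verifies the error bound.

```python
import numpy as np

# Well-separated sets: lambda's outside a disk of radius d + e around c,
# eta's inside radius d (c = 0 here for simplicity; illustrative values).
rng = np.random.default_rng(3)
p, q, r = 40, 40, 12
c = 0.0
eta = 0.4 * np.exp(2j * np.pi * rng.random(q))   # |eta - c| < d = 0.4
lam = 2.0 * np.exp(2j * np.pi * rng.random(p))   # |lam - c| > d + e = 2

G = 1.0 / (lam[:, None] - eta[None, :])

# Truncated expansion 1/(lam - eta) ~ sum_{k=0}^{r} (eta-c)^k / (lam-c)^{k+1}
# gives an explicit rank-(r+1) factorization X @ Y.
X = np.column_stack([1.0 / (lam - c) ** (k + 1) for k in range(r + 1)])
Y = np.vstack([(eta - c) ** k for k in range(r + 1)])

# Error bounded by roughly (d / (d + e))^{r+1} = 0.2^{13}.
assert np.max(np.abs(G - X @ Y)) < 1e-8
```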

We have the estimate

|(η_j − c)/(λ_i − c)|^{r+1} < ( d/(d + e) )^{r+1} < 1,

which enables us to find an appropriate r according to the tolerance. Thus

(2.9)  G = [ Σ_{k=0}^{r} (η_j − c)^k / (λ_i − c)^{k+1} ]_{1≤i≤p, p+1≤j≤N} + ε̂

         = [ 1/(λ_1−c)  1/(λ_1−c)^2  ⋯  1/(λ_1−c)^{r+1} ]   [ 1              ⋯  1            ]
           [ 1/(λ_2−c)  1/(λ_2−c)^2  ⋯  1/(λ_2−c)^{r+1} ] · [ η_{p+1}−c      ⋯  η_N−c        ] + ε̂.
           [ ⋮                                           ]   [ ⋮                              ]
           [ 1/(λ_p−c)  1/(λ_p−c)^2  ⋯  1/(λ_p−c)^{r+1} ]   [ (η_{p+1}−c)^r  ⋯  (η_N−c)^r    ]

Therefore the numerical rank of G is at most r + 1, up to an error ε̂.

Now we can return to the Cauchy-like matrix (2.4). A similar argument shows that any off-diagonal block of C satisfies

Ĝ = [ u_i^T v_j / (λ_i − η_j) ]_{1≤i≤p, p+1≤j≤N} ≈ [ Σ_{k=0}^{r} u_i^T (η_j − c)^k / (λ_i − c)^{k+1} v_j ]_{1≤i≤p, p+1≤j≤N}

  = [ u_1^T/(λ_1−c)  u_1^T/(λ_1−c)^2  ⋯  u_1^T/(λ_1−c)^{r+1} ]   [ v_{p+1}                ⋯  v_N            ]
    [ ⋮                                                       ] · [ (η_{p+1}−c) v_{p+1}    ⋯  (η_N−c) v_N    ].
    [ u_p^T/(λ_p−c)  u_p^T/(λ_p−c)^2  ⋯  u_p^T/(λ_p−c)^{r+1} ]   [ (η_{p+1}−c)^r v_{p+1}  ⋯  (η_N−c)^r v_N  ]

That is, we replace the entries of the two matrix factors in (2.9) with appropriate vectors. Thus the numerical rank of Ĝ will be no larger than α(r + 1), which is relatively small as compared to N.

3. SSS structures and superfast SSS solver.

3.1. SSS representations. To take advantage of the low-rank property of the Cauchy-like matrix C, we can use the SSS structures introduced by Chandrasekaran et al. [2, 3]. The SSS structure nicely captures the ranks of off-diagonal blocks of a matrix such as shown in Figure 2.1. A matrix A ∈ C^{M×M̃} satisfies the SSS structure if there exist 2n positive integers m_1, …, m_n and m̃_1, …, m̃_n with M = m_1 + ⋯ + m_n and M̃ = m̃_1 + ⋯ + m̃_n to block-partition A as A = (A_{ij})_{i,j=1}^{n}, where A_{ij} ∈ C^{m_i×m̃_j} satisfies

(3.1)  A_{ij} = D_i                            if i = j,
               U_i W_{i+1} ⋯ W_{j−1} V_j^H     if i < j,
               P_i R_{i−1} ⋯ R_{j+1} Q_j^H     if i > j.

Here the superscript H denotes the Hermitian transpose, and empty products are defined to be identity matrices.

Table 3.1. Dimensions of matrices in (3.1).
Matrix      U_i         V_i              W_i              P_i         Q_i              R_i
Dimensions  m_i × k_i   m̃_i × k_{i−1}   k_{i−1} × k_i    m_i × l_i   m̃_i × l_{i+1}   l_{i+1} × l_i

The matrices {U_i}_{i=1}^{n−1}, {V_i}_{i=2}^{n}, {W_i}_{i=2}^{n−1}, {P_i}_{i=2}^{n}, {Q_i}_{i=1}^{n−1}, {R_i}_{i=2}^{n−1}, and {D_i}_{i=1}^{n} are called generators for the SSS structure, and their dimensions are defined in Table 3.1. As an example, the matrix A with n = 4 has the form

(3.2)  A = [ D_1              U_1V_2^H      U_1W_2V_3^H   U_1W_2W_3V_4^H ]
           [ P_2Q_1^H         D_2           U_2V_3^H      U_2W_3V_4^H    ]
           [ P_3R_2Q_1^H      P_3Q_2^H      D_3           U_3V_4^H       ]
           [ P_4R_3R_2Q_1^H   P_4R_3Q_2^H   P_4Q_3^H      D_4            ].

The SSS representation (3.2) is related to the off-diagonal blocks in Figure 2.1 in the way that the upper off-diagonal blocks numbered 1, 2, and 3 are

U_1 [ V_2^H  W_2V_3^H  W_2W_3V_4^H ],   [ U_1W_2 ] [ V_3^H  W_3V_4^H ],   [ U_1W_2W_3 ] V_4^H.
                                        [ U_2    ]                        [ U_2W_3    ]
                                                                          [ U_3       ]

Appropriate row and column bases of the off-diagonal blocks are clearly reflected. The SSS structure depends on the sequences {m_i} and {m̃_i} and on the SSS generation scheme. If A is a square M × M matrix, then we can have the simpler situation m̃_i = m_i, i = 1, …, n. SSS matrices are closed under addition, multiplication, inversion, etc., although the sizes of the generators may increase. While any matrix can be represented in this form for sufficiently large k_i's and l_i's (the column dimensions of the U_i's and P_i's, respectively), our main focus will be on SSS matrices which have low-rank off-diagonal blocks and have generators with k_i's and l_i's close to those ranks. We say these SSS matrices are compact. In particular, Cauchy-like matrices can have compact SSS forms.

Using SSS structures, we can take advantage of the superfast SSS system solver in [2, 3] to solve the Cauchy-like systems. The solver is efficient when the SSS form is compact, and is practically stable. The solver shares similar ideas with that for banded plus semiseparable systems in []. Here we briefly describe the main ideas of the solver in [2, 3]. We consider solving the linear system Ax = b, where A ∈ C^{N×N} satisfies (3.1) and b is unstructured. The solver computes an implicit ULV decomposition of A, where U and V are
orthogonal matrices. Before we present the formal algorithm, we demonstrate the key ideas on a 4 × 4 block matrix example.

3.2. SSS solver: example. Let the initial system Ax = b be partitioned as follows:

(3.3)  [ D_1              U_1V_2^H      U_1W_2V_3^H   U_1W_2W_3V_4^H ] [ x_1 ]   [ b_1 ]   [ 0         ]
       [ P_2Q_1^H         D_2           U_2V_3^H      U_2W_3V_4^H    ] [ x_2 ] = [ b_2 ] − [ P_2       ] ξ,
       [ P_3R_2Q_1^H      P_3Q_2^H      D_3           U_3V_4^H       ] [ x_3 ]   [ b_3 ]   [ P_3R_2    ]
       [ P_4R_3R_2Q_1^H   P_4R_3Q_2^H   P_4Q_3^H      D_4            ] [ x_4 ]   [ b_4 ]   [ P_4R_3R_2 ]

where the dimensions of the generators follow those in Table 3.1 (with m̃_i = m_i) and the vector ξ = 0. The extra zero vector ξ on the right-hand side of (3.3) has been added for the purpose of a general recursive pattern. The algorithm has two main stages, compression (or elimination) and merging, depending on the relationship between k_i (or l_i) and m_i in the intermediate procedure.

3.2.1. Compression. At the beginning, k_1 < m_1 because of the low-rank property described earlier. We apply a unitary transformation q_1^H to U_1 so that the first m_1 − k_1 rows of U_1 become zeros:

(3.4)  q_1^H U_1 = [ 0   ]  (m_1 − k_1 rows)
                   [ Û_1 ]  (k_1 rows).

Now we multiply diag(q_1^H, I) into the system:

diag(q_1^H, I) A x = diag(q_1^H, I) b − diag(q_1^H, I) [ 0 ; P_2 ; P_3R_2 ; P_4R_3R_2 ] ξ.

We pick another unitary transformation w_1 to lower-triangularize q_1^H D_1, the (1,1) diagonal block, i.e.,

q_1^H D_1 w_1 = [ D̄_11   0     ]  with D̄_11 lower triangular of order m_1 − k_1 and D̄_22 of order k_1.
                [ D̄_21   D̄_22 ]

Then system (3.3) becomes

[ D̄_11             0                0           0             0               ] [ z_1 ]   [ β_1 ]   [ 0         ]
[ D̄_21             D̄_22            Û_1V_2^H    Û_1W_2V_3^H   Û_1W_2W_3V_4^H  ] [ x̂_1 ] = [ γ_1 ] − [ 0        ] ξ,
[ P_2Q̄_1^H         P_2Q̂_1^H        D_2         U_2V_3^H      U_2W_3V_4^H     ] [ x_2 ]   [ b_2 ]   [ P_2       ]
[ P_3R_2Q̄_1^H      P_3R_2Q̂_1^H     P_3Q_2^H    D_3           U_3V_4^H        ] [ x_3 ]   [ b_3 ]   [ P_3R_2    ]
[ P_4R_3R_2Q̄_1^H   P_4R_3R_2Q̂_1^H  P_4R_3Q_2^H P_4Q_3^H      D_4             ] [ x_4 ]   [ b_4 ]   [ P_4R_3R_2 ]

where we have used the partitions

w_1^H x_1 = [ z_1 ; x̂_1 ],   q_1^H b_1 = [ β_1 ; γ_1 ],   w_1^H Q_1 = [ Q̄_1 ; Q̂_1 ],

with z_1, β_1, Q̄_1 having m_1 − k_1 rows and x̂_1, γ_1, Q̂_1 having k_1 rows.

At this point, we can solve for z_1 from the triangular system D̄_11 z_1 = β_1. We also subtract D̄_21 z_1 from the right-hand side to obtain b̂_1 = γ_1 − D̄_21 z_1. Then we can discard the first m_1 − k_1 rows and columns of the coefficient matrix of the system to obtain

[ D̄_22            Û_1V_2^H     U_1̂W_2V_3^H   Û_1W_2W_3V_4^H ] [ x̂_1 ]   [ b̂_1 ]   [ 0         ]
[ P_2Q̂_1^H        D_2          U_2V_3^H      U_2W_3V_4^H    ] [ x_2 ] = [ b_2 ] − [ P_2       ] ξ̂,
[ P_3R_2Q̂_1^H     P_3Q_2^H     D_3           U_3V_4^H       ] [ x_3 ]   [ b_3 ]   [ P_3R_2    ]
[ P_4R_3R_2Q̂_1^H  P_4R_3Q_2^H  P_4Q_3^H      D_4            ] [ x_4 ]   [ b_4 ]   [ P_4R_3R_2 ]

where ξ̂ = ξ + Q̄_1^H z_1. This new system has a similar structure to the original one but with smaller dimension. We can continue to solve it by recursion, if further compressions of the blocks such as (3.4) are possible. Note that the actual solution, say x_1, can be recovered by x_1 = w_1 [ z_1 ; x̂_1 ].

3.2.2. Merging. During the recursive eliminations there are situations when k_i is no longer smaller than m_i and no further compression is possible. We are then unable to introduce more zeros into the system. Now we proceed by merging appropriate block rows and columns of the matrix. As an example, we can merge the first two block rows and columns and rewrite the system of equations as follows. The system becomes

[ D̂_1          Û_1V_3^H    Û_1W_3V_4^H ] [ x̂_1 ]   [ b̂_1 ]   [ 0      ]
[ P_3Q̂_1^H     D_3         U_3V_4^H    ] [ x_3 ] = [ b_3 ] − [ P_3    ] ξ̄,
[ P_4R_3Q̂_1^H  P_4Q_3^H    D_4         ] [ x_4 ]   [ b_4 ]   [ P_4R_3 ]

where

D̂_1 = [ D_1       U_1V_2^H ],   x̂_1 = [ x_1 ],   b̂_1 = [ b_1          ],
      [ P_2Q_1^H  D_2      ]          [ x_2 ]          [ b_2 − P_2 ξ  ]

Û_1 = [ U_1W_2 ],   Q̂_1 = [ Q_1R_2^H ],   ξ̄ = R_2 ξ.
      [ U_2    ]          [ Q_2      ]

The number of block rows/columns is reduced by one. Further compressions become possible and we can proceed to solve the system recursively. In the case n = 1, we have the system D_1 x_1 = b_1 − 0·ξ, which is solved directly.
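The block structure used throughout this example, and the two unitary transformations of the compression stage, can be sanity-checked numerically. The sketch below (illustrative sizes and random generators, not the paper's Fortran implementation) builds a dense matrix from SSS generators, confirms the off-diagonal blocks have rank at most k, and reproduces the zero patterns produced by q_1 and w_1.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, k = 4, 5, 2   # n blocks of size m, generator rank k < m (square case)

D = [rng.standard_normal((m, m)) for _ in range(n)]
U = [rng.standard_normal((m, k)) for _ in range(n)]
V = [rng.standard_normal((m, k)) for _ in range(n)]
W = [rng.standard_normal((k, k)) for _ in range(n)]
P = [rng.standard_normal((m, k)) for _ in range(n)]
Q = [rng.standard_normal((m, k)) for _ in range(n)]
R = [rng.standard_normal((k, k)) for _ in range(n)]

def sss_block(i, j):
    """A_{ij} from the generators, 0-indexed; empty products are identities."""
    if i == j:
        return D[i]
    if i < j:
        M = U[i]
        for s in range(i + 1, j):        # W_{i+1} ... W_{j-1}
            M = M @ W[s]
        return M @ V[j].T                # real case: V^H = V^T
    M = P[i]
    for s in range(i - 1, j, -1):        # R_{i-1} ... R_{j+1}
        M = M @ R[s]
    return M @ Q[j].T

A = np.block([[sss_block(i, j) for j in range(n)] for i in range(n)])
assert np.linalg.matrix_rank(A[:m, m:]) <= k   # off-diagonal blocks have rank <= k

# Compression step: q1 zeroes the first m - k rows of U_1 ...
Qf, _ = np.linalg.qr(U[0], mode="complete")    # Qf^T U_1 = [R; 0]
q1 = Qf[:, ::-1]                               # reverse columns: zeros move on top
assert np.allclose((q1.T @ U[0])[: m - k], 0.0)

# ... and w1 lower-triangularizes q1^T D_1 (an LQ factorization via QR of the transpose).
B = q1.T @ D[0]
Q2, _ = np.linalg.qr(B.T, mode="complete")
L = B @ Q2
assert np.allclose(np.triu(L, 1), 0.0)         # L is lower triangular
# The small triangular leading block L[:m-k, :m-k] is what is solved for z_1.
```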

3.3. General solve algorithm. We now present a short description of the general algorithm. The procedure in the example can be directly extended to a general system. We assume that the matrix A is in compact SSS form, represented by the generators {U_i}_{i=1}^{n−1}, {V_i}_{i=2}^{n}, {W_i}_{i=2}^{n−1}, {P_i}_{i=2}^{n}, {Q_i}_{i=1}^{n−1}, {R_i}_{i=2}^{n−1}, and {D_i}_{i=1}^{n} as in (3.1). We also partition x = (x_i) and b = (b_i) such that x_i and b_i have m_i rows.

As in the example, there are two stages at each step of the recursion. In the compression stage, we perform orthogonal eliminations on both sides of A to create an (m_1 − k_1) × (m_1 − k_1) lower triangular submatrix at the top left corner of A. Then we solve a small triangular system and obtain the first few components of the solution vector. At this stage, we are left with a new system with fewer unknowns; hence we can carry out a recursion. In the merging stage, we merge the first two block rows and columns of A while still maintaining the SSS structure; the numbers of block rows and columns are reduced by one. Combining these two stages, we can proceed with recursion to solve the system. When n = 1 we can solve the linear system directly with standard solvers.

The SSS solver has a complexity of O(Np^2) [2, 3], where p is the maximum numerical rank of the off-diagonal blocks of A, as compared to the traditional O(N^3) cost for a general dense N × N matrix. We use only orthogonal transformations and a single substitution in the SSS solver. Although a formal proof of the backward stability is not yet available, the solver is shown to be practically stable. The reader is referred to [2, 3] for more discussions of the stability.

4. Fast construction of an SSS representation for C. According to section 3, a system in compact SSS form can be solved very efficiently. We can thus use that algorithm to solve the Cauchy-like system (2.3), provided that C can be quickly written in SSS form. Therefore we try to find an efficient construction scheme. Here we provide a divide-and-conquer SSS
construction scheme by using the fast merging and splitting strategy in [3]. If we know further that the matrix has low-rank off-diagonal blocks and it is easy to compress the off-diagonal blocks, then the construction can be superfast. Here we will concentrate on this situation, as Cauchy-like matrices have this low-rank property. We first present a general divide-and-conquer construction algorithm and then describe a fast shifting strategy to compress the off-diagonal blocks of C.

4.1. General divide-and-conquer construction algorithm. In this section, we discuss the construction of the SSS structure of a matrix A when the partition sequence {m_i}_{i=1}^{n} is given. The general construction methods can be applied to any unstructured matrix, thus proving that any matrix has an SSS structure. Of course, k_i and l_i will usually be large in this case, precluding any speedups. These methods can be viewed as specific ways to make the realization algorithm of [8] more efficient.

Suppose we are given an N × N matrix A and a partition sequence {m_i}_{i=1}^{n} with Σ_{i=1}^{n} m_i = N. Starting with an appropriate sequence {m̄_1, m̄_2}, where m̄_1 = Σ_{i=1}^{k} m_i and m̄_2 = Σ_{i=k+1}^{n} m_i, we can first partition A into a 2 × 2 block matrix and then construct a simple SSS form

(4.1)  A = [ D_1  B   ] = [ D_1       U_1V_2^H ]  (block sizes m̄_1, m̄_2),
           [ E    D_2 ]   [ P_2Q_1^H  D_2      ]

where U_1 V_2^H = B and P_2 Q_1^H = E

are the low-rank SVDs of the off-diagonal blocks B and E. Note that if the compressions are done by τ-accurate SVD approximations or rank-revealing QR decompositions, then the equality signs should be replaced by approximation signs.

We now split either the (1,1) block or the (2,2) block to obtain a 3 × 3 block SSS matrix. For instance, we can split the (1,1) block according to an appropriate new sequence {m̂_1, m̂_2, m̂_3}, where m̂_1 + m̂_2 = m̄_1 and m̂_3 = m̄_2, so that

[ D_1^new             U_1^new (V_2^new)^H ] = D_1,   [ U_1^new W_2^new ] = U_1,   V_3^new = V_2,
[ P_2^new (Q_1^new)^H  D_2^new            ]          [ U_2^new         ]

[ Q_1^new (R_2^new)^H ] = Q_1,   P_3^new = P_2,   D_3^new = D_2,
[ Q_2^new             ]

where the new generators, marked by the superscript "new," can be determined in the following way. First, we partition the matrices of the old first block conformally with the two new blocks:

D_1 = [ D_11  D_12 ],   U_1 = [ U_11 ],   Q_1 = [ Q_11 ].
      [ D_21  D_22 ]          [ U_12 ]          [ Q_12 ]

We can then identify from these and the previous equations that

D_1^new = D_11,  D_2^new = D_22,  U_2^new = U_12,  V_3^new = V_2,  Q_2^new = Q_12,  P_3^new = P_2,  D_3^new = D_2.

The remaining matrices satisfy

(4.3)  [ D_12  U_11 ] = U_1^new [ (V_2^new)^H  W_2^new ],   [ D_21   ] = [ P_2^new ] (Q_1^new)^H.
                                                            [ Q_11^H ]   [ R_2^new ]

By factorizing the left-hand side matrices using numerical tools such as the SVD and rank-revealing QR, these two equations allow us to compute the remaining matrices for the new blocks. A is thus in the new form

A = [ D_1^new                       U_1^new (V_2^new)^H   U_1^new W_2^new (V_3^new)^H ]  (block sizes m̂_1, m̂_2, m̂_3).
    [ P_2^new (Q_1^new)^H           D_2^new               U_2^new (V_3^new)^H         ]
    [ P_3^new R_2^new (Q_1^new)^H   P_3^new (Q_2^new)^H   D_3^new                     ]

We can use similar techniques if we want to split the second block row and column of (4.1). We can continue this by either splitting the first block row and block column or the last ones using the above techniques, or splitting any middle block row and block column similarly. Then we will be able to construct the desired SSS representation according to the given sequence {m_i, i = 1, 2, …, n}.

The general construction can be organized with bisection. The major cost is in compressions of off-diagonal blocks of the form [ D_12^{(i)}  X_i Y_i^H ], where X_i and Y_i are tall and thin and D_12^{(i)} is an
off-diagonal block of

D_i = [ D_11^{(i)}  D_12^{(i)} ].
      [ D_21^{(i)}  D_22^{(i)} ]

The compression can be achieved by a τ-accurate QR factorization. The construction is also practically stable in our implementation.
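The splitting step of section 4.1 can be sketched as follows (illustrative sizes and a hypothetical `low_rank` helper implementing τ-accurate SVD; this is not the paper's Fortran code). It performs the first 2 × 2 cut, splits the (1,1) block, and checks that the new generators reproduce the upper off-diagonal blocks of the 3 × 3 partition.

```python
import numpy as np

def low_rank(B, tau=1e-10):
    """tau-accurate SVD: B ~ X @ Y, keeping singular values >= tau * s_max."""
    u, s, vh = np.linalg.svd(B, full_matrices=False)
    r = int(np.sum(s >= tau * s[0]))
    return u[:, :r] * s[:r], vh[:r, :]         # X (tall), Y (flat)

rng = np.random.default_rng(5)
N, m1, mh1 = 12, 8, 4          # first cut at m1; then split block 1 at mh1
A = rng.standard_normal((N, 3)) @ rng.standard_normal((3, N))  # globally low rank,
np.fill_diagonal(A, 10.0)      # so all off-diagonal blocks compress well

# Level 1: A = [[D1, U1 @ V2h], [P2 @ Q1h, D2]].
U1, V2h = low_rank(A[:m1, m1:])
P2, Q1h = low_rank(A[m1:, :m1])
D1, D2 = A[:m1, :m1], A[m1:, m1:]

# Split block 1: the new generators satisfy [D1_12, U1_top] = U1new @ [V2new_h, W2new].
U1_top, U1_bot = U1[:mh1], U1[mh1:]
U1new, VW = low_rank(np.hstack([D1[:mh1, mh1:], U1_top]))
V2new_h, W2new = VW[:, : m1 - mh1], VW[:, m1 - mh1 :]

# Check: the three upper off-diagonal blocks of the 3x3 partition are reproduced.
assert np.allclose(U1new @ V2new_h, D1[:mh1, mh1:])      # new (1,2) block
assert np.allclose(U1new @ W2new @ V2h, A[:mh1, m1:])    # new (1,3) block
assert np.allclose(U1_bot @ V2h, A[mh1:m1, m1:])         # new (2,3) block
```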

4.2. Compression of off-diagonal blocks. For a general matrix with low-rank off-diagonal blocks, the SSS construction can cost O(N^2 p), as a compression such as those in (4.1)–(4.3) can take O(K^2 p) operations, where K is the dimension of the block being compressed. However, for the Cauchy-like matrix C in (2.3) the compressions can be precomputed.

Theorem 4.1. The compression

(4.5)  G = [ u_i^T v_j / (λ_i − η_j) ]_{1≤i≤p, q≤j≤N} = X Y^H

of the off-diagonal block G of C can be obtained from the compression of the corresponding off-diagonal block

G_0 = [ 1/(λ_i − η_j) ]_{1≤i≤p, q≤j≤N} = X_0 Y_0^H

of C_0, where the column dimension of X is no larger than twice the column dimension of X_0.

Proof. Assume the off-diagonal block G_0 = [ 1/(λ_i − η_j) ]_{1≤i≤p, q≤j≤N} is in compressed form X_0 Y_0^H, so that its (i, j) entry is 1/(λ_i − η_j) = X_{0;i,:} Y_{0;j,:}^H, where X_{0;i,:} (Y_{0;j,:}) denotes the ith (jth) row of X_0 (Y_0). As α ≤ 2, for simplicity we fix α = 2 in (2.4). Then the corresponding off-diagonal block of C is

G = [ u_i^T v_j / (λ_i − η_j) ]_{1≤i≤p, q≤j≤N} = [ (u_{i1} v_{j1} + u_{i2} v_{j2}) X_{0;i,:} Y_{0;j,:}^H ]_{1≤i≤p, q≤j≤N}

  = [ ( u_{i1} X_{0;i,:}   u_{i2} X_{0;i,:} ) ( v_{j1} Y_{0;j,:}   v_{j2} Y_{0;j,:} )^H ]_{1≤i≤p, q≤j≤N}

  = [ u_{11} X_{0;1,:}  u_{12} X_{0;1,:} ]   [ v_{q1} Y_{0;q,:}  v_{q2} Y_{0;q,:} ]^H
    [ ⋮                                  ] · [ ⋮                                  ]   =: X Y^H.
    [ u_{p1} X_{0;p,:}  u_{p2} X_{0;p,:} ]   [ v_{N1} Y_{0;N,:}  v_{N2} Y_{0;N,:} ]

That is, we get a compression of G. ∎

Theorem 4.1 indicates that we can convert the compressions of the off-diagonal blocks of C into compressions of the off-diagonal blocks of C_0, which is independent of the actual entries of the Toeplitz matrix T. This means the compressions of off-diagonal blocks of C_0 can be precomputed. The precomputation can be done in O(N^2 p) flops by a rank-revealing QR factorization such as in [27]. It is possible to reduce the cost to O(N log N), due to the fact that the compression of a large off-diagonal block can be obtained from those of small ones; this can be seen implicitly in the following subsection.

4.3. Compressions of off-diagonal blocks in precomputation. We further present a shifting strategy to reduce the cost of the compressions in the precomputation. The significance of this shifting strategy is to relate the
compressions of large off-diagonal blocks of C_0 to those of small ones. That is, in different splitting stages of the SSS construction of C, the compressions of off-diagonal blocks of different sizes can be related.
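The theorem above is constructive, and its proof translates directly into a few lines of NumPy (illustrative sizes and random generators): given a compression X_0 Y_0^H of the Cauchy block G_0, the factors for the Cauchy-like block G are obtained by row-scaling and concatenation, and the rank bound 2 × (column dimension of X_0) is visible in the shapes.

```python
import numpy as np

rng = np.random.default_rng(6)
p, q, alpha = 10, 12, 2
lam = 2.0 * np.exp(2j * np.pi * rng.random(p))
eta = 0.5 * np.exp(2j * np.pi * rng.random(q))
U = rng.standard_normal((p, alpha))
V = rng.standard_normal((q, alpha))

# Compression X0 @ Y0h of the Cauchy block G0 (here simply a truncated SVD).
G0 = 1.0 / (lam[:, None] - eta[None, :])
u, s, vh = np.linalg.svd(G0, full_matrices=False)
r0 = int(np.sum(s >= 1e-12 * s[0]))
X0, Y0h = u[:, :r0] * s[:r0], vh[:r0, :]

# Scale and concatenate the rows of X0 (and of Y0h) to compress
# G = [u_i^T v_j / (lam_i - eta_j)] with alpha = 2.
X = np.hstack([U[:, [k]] * X0 for k in range(alpha)])       # p x (alpha * r0)
Yh = np.vstack([V[:, [k]].T * Y0h for k in range(alpha)])   # (alpha * r0) x q

G = (U @ V.T) / (lam[:, None] - eta[None, :])
assert np.allclose(G, X @ Yh)
```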

For simplicity, we look at the example of partitioning C_0 into 4 × 4 blocks. Assume in the first cut that we can partition C_0 into 2 × 2 blocks with equal dimensions:

C_0 = [ 1/(ω^{2i} − ω^{2j+1}) ]_{i,j=1}^{N} = [ C_{0;1,1}  C_{0;1,2} ],
                                              [ C_{0;2,1}  C_{0;2,2} ]

where each block has dimension k = N/2; for instance, C_{0;1,2} = [ 1/(ω^{2i} − ω^{2j+1}) ]_{1≤i≤k, k+1≤j≤N}. Without loss of generality, we consider only the block upper triangular part. Assume that we have obtained a compression X_1 Y_1^H of C_{0;1,2}, so that

C_0 = [ C_{0;1,1}  X_1Y_1^H  ],
      [ C_{0;2,1}  C_{0;2,2} ]

where X_1 and Y_1 are tall and thin and can be computed with a τ-accurate SVD or rank-revealing QR factorization. Next, we split C_{0;1,1}, the (1,1) block of C_0, into 2 × 2 blocks and compress its off-diagonal block. Suppose the resulting off-diagonal block of C_{0;1,1} has size k_1 × (N/2 − l). We will compress it by shifting certain parts of X_1 and Y_1. That is, we partition X_1 and Y_1 conformally as

X_1 = [ X_{1,1} ],   Y_1 = [ Y_{1,1} ],
      [ X_{1,2} ]          [ Y_{1,2} ]

and also pick a k_1 × (N/2 − l) block from the lower left corner of C_{0;1,2} = X_1Y_1^H, that is, X_{1,2}Y_{1,1}^H:

C_0 = [ C_{0;1,1}  X_1Y_1^H  ],  with  C_{0;1,1} = [ ·   X_2Y_2^H ]  and  X_1Y_1^H = [ X_{1,1}Y_{1,1}^H  X_{1,1}Y_{1,2}^H ],
      [ C_{0;2,1}  C_{0;2,2} ]                     [ ·   ·        ]                  [ X_{1,2}Y_{1,1}^H  X_{1,2}Y_{1,2}^H ]

Here X_2 Y_2^T denotes the unknown compression of the upper right off-diagonal block of C_{0;1,1}. At this point we do not need another factorization to get X_2 Y_2^T; instead we can derive the compression X_2 Y_2^T directly from X_{1,2} and Y_{1,1}. Clearly, the two blocks have entries of the form 1/(ω^{2i} − ω^{2j+1}) whose row and column index sets differ by a common shift s, and

    1/(ω^{2(i+s)} − ω^{2(j+s)+1}) = ω^{−2s} · 1/(ω^{2i} − ω^{2j+1}),

so that

    X_2 Y_2^T = ω^{2s} X_{1,2} Y_{1,1}^T.

That means we can get X_2 Y_2^T by shifting (and scaling) a subblock of X_1 Y_1^T. A similar situation holds for the splitting of the (2,2) block of C_0 after the first splitting. For the successive compressions in later splittings, a similar shifting can be used. For splittings with general block sizes the shifting is also similar. The shifting scheme indicates that, at different levels of the divide-and-conquer SSS construction, the compressions of large blocks and small blocks are related. This can be used to further save the compression cost.

5. Performance and numerical experiments. It is well known that the FFT transformation of the Toeplitz matrix T to a Cauchy-like matrix C, and the recovery of the solution of the Toeplitz system from the solution of the Cauchy-like system, are stable. In addition, the fast divide-and-conquer construction and the SSS system solver in section 3 are both practically stable, as we will see in our numerical results. Thus our overall algorithm is stable in practice. Here no extra steps are needed to stabilize the algorithm, as required in some other superfast methods [38].

After the precomputations discussed in subsections 4.2 and 4.3 are finished, the cost of each SSS construction is only O(Np^2), where p is the maximum of the off-diagonal ranks of the Cauchy matrix C_0. The SSS solver also costs O(Np^2). The total cost is thus no more than O(N log N) + O(Np^2) flops. The total storage requirement is O(Np). For the convenience of coding, we use an O(N^2 p)-cost precomputation routine as discussed in subsection 4.2. A preliminary implementation
of our superfast solver in Fortran 90 is available at jxia/work/toep. We did experiments on an Itanium 2 GHz SGI Altix 350 server with a 64-bit Linux operating system and Intel MKL BLAS. Our method (denoted by NEW) is compared to Gaussian elimination with partial pivoting (denoted by GEPP) via the LAPACK linear system solver ZGESV [3].

We consider real N × N Toeplitz matrices T whose entries are random and uniformly distributed in [0, 1]. The right-hand sides b are obtained by b = Tx, where the exact solution x has random entries uniformly distributed in [−1, 1]. Matrices of size N = 2^k · 100, k = 2, 3, ..., are considered. These matrices are moderately ill-conditioned; for example, for N = 2^k · 100 with k = 2, ..., 7, the one-norm condition numbers of the matrices increase from the order of about 10^3 to 10^6.

For the NEW solver two important parameters are involved: the SSS block size d (Figure 2.1) and the tolerance τ in the compressions of off-diagonal blocks. Here we use the τ-accurate QR factorization with τ as a relative tolerance (subsection 2.2). Figure 5.1 shows the execution time in seconds for GEPP and our NEW solver with different d and τ. For the NEW solver only the time for solving the Cauchy-like

Fig. 5.1. Computation time in seconds versus N = 2^k · 100. For the NEW solver, the time for the precomputation is excluded.

Fig. 5.2. Time in seconds for the precomputations in the NEW solver.

system (2.3) is reported, as the compressions of off-diagonal blocks can be done in the precomputation, as described in subsection 4.3, and the SSS construction time is essentially the precomputation time. The precomputation time is shown in Figure 5.2, although the precomputation only needs to be done once for a particular matrix size N.

We also consider the time scaling factor, that is, the factor by which the time is multiplied when the matrix size doubles; see Figure 5.3. We observe that the time scaling factors for the NEW solver are near 2; that is to say, the NEW solver is close to being a linear time solver. Figures 5.4 and 5.5 present the errors

    ε_1 = ‖x̂ − x‖_2 / ‖x‖_2   and   ε_2 = ‖T x̂ − b‖_2 / (‖T x̂‖_2 + ‖b‖_2),

respectively, where x̂ denotes the numerical solution.

A significant point of the new solver is that we can use a relatively large tolerance τ in the precomputation and solving, and then use iterative refinement to improve the accuracy. A relatively large τ leads to relatively small off-diagonal ranks and thus

Fig. 5.3. Time scaling factors.

Fig. 5.4. ε_1 = ‖x̂ − x‖_2 / ‖x‖_2 versus N = 2^k · 100.

Fig. 5.5. ε_2 = ‖T x̂ − b‖_2 / (‖T x̂‖_2 + ‖b‖_2) versus N = 2^k · 100.
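As an aside, the O(N log N) ingredients of the algorithm are FFT-based; for instance, the products T x̂ appearing in ε_2 can themselves be formed in O(N log N) time by embedding T in a circulant matrix, which the FFT diagonalizes. A self-contained Python sketch (not the paper's code; the function names are ours, and a simple radix-2 FFT is used for brevity, so lengths must be powers of two):

```python
import cmath

def fft(a, inv=False):
    # Recursive radix-2 FFT; len(a) must be a power of two.
    n = len(a)
    if n == 1:
        return a[:]
    even, odd = fft(a[0::2], inv), fft(a[1::2], inv)
    sgn = 1j if inv else -1j
    tw = [cmath.exp(sgn * 2 * cmath.pi * k / n) for k in range(n // 2)]
    return ([even[k] + tw[k] * odd[k] for k in range(n // 2)] +
            [even[k] - tw[k] * odd[k] for k in range(n // 2)])

def toeplitz_matvec(col, row, x):
    # y = T x, where T has first column `col` and first row `row` (col[0] == row[0]).
    # T is embedded into a 2n x 2n circulant, which the FFT diagonalizes.
    n = len(x)
    c = list(col) + [0.0] + list(row[:0:-1])          # first column of the circulant
    fc = fft([complex(t) for t in c])
    fx = fft([complex(t) for t in x] + [0j] * n)
    y = fft([fc[k] * fx[k] for k in range(2 * n)], inv=True)
    return [y[k] / (2 * n) for k in range(n)]         # first n entries give T x

# Check against the direct O(n^2) product.
n = 8
col = [float(k + 1) for k in range(n)]
row = [col[0]] + [1.0 / (k + 2) for k in range(n - 1)]
x = [1.0, -2.0, 0.5, 3.0, -1.0, 2.0, 0.0, 1.5]
direct = [sum((col[i - j] if i >= j else row[j - i]) * x[j] for j in range(n))
          for i in range(n)]
fast = toeplitz_matvec(col, row, x)
assert max(abs(fast[i] - direct[i]) for i in range(n)) < 1e-9
```

This is the standard circulant-embedding trick; it is also what makes the residual computation in iterative refinement cheap for large N.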

small p in the operation counts above. Iterative refinements are very cheap due to the facts that no precomputation is needed and the solver itself is very fast (Figure 5.1). For the larger tolerance τ and d = 50, the accuracy results after 2 to 8 steps of iterative refinement are displayed in Figure 5.6. In fact, the number of iterative refinement steps required to reach small ε_2 is listed in Table 5.1. Thus it is also clear that our NEW solver can also perform well as a preconditioner.

Fig. 5.6. ε_2 = ‖T x̂ − b‖_2 / (‖T x̂‖_2 + ‖b‖_2): initial values before iterative refinement, and after some steps of iterative refinement.

Fig. 5.7. Computation time of the NEW solver with N = 12800, τ = 10^−9, and different block sizes d = 2^j · 25.

Table 5.1. Number of iterative refinement steps required to reach small ε_2 for different matrix dimensions N = 2^k · 100.

The block size d also affects the performance. Figure 5.7 indicates that, for a
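The refinement loop itself is the classical one: solve once with the cheap low-accuracy factorization, then repeatedly solve for a correction from the true residual. A toy Python sketch (an illustration only, not the authors' solver; dense Gaussian elimination applied to a rounded copy of T stands in for the τ-accurate SSS solver built with a large tolerance):

```python
import random

def solve_ge(a, b):
    # Gaussian elimination with partial pivoting (on copies of a and b).
    n = len(b)
    a = [r[:] for r in a]
    x = b[:]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[piv] = a[piv], a[k]
        x[k], x[piv] = x[piv], x[k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
            x[r] -= f * x[k]
    for k in range(n - 1, -1, -1):
        x[k] = (x[k] - sum(a[k][c] * x[c] for c in range(k + 1, n))) / a[k][k]
    return x

random.seed(1)
n = 60
t = [random.random() for _ in range(2 * n - 1)]
T = [[t[i - j + n - 1] for j in range(n)] for i in range(n)]   # random Toeplitz
x_true = [random.uniform(-1.0, 1.0) for _ in range(n)]
b = [sum(T[i][j] * x_true[j] for j in range(n)) for i in range(n)]

# The low-accuracy inner solver: exact elimination on a *rounded* copy of T,
# mimicking a solver built from a compressed (large-tolerance) representation.
T_lo = [[round(v, 6) for v in row] for row in T]

x = solve_ge(T_lo, b)
for _ in range(5):                       # iterative refinement steps
    r = [b[i] - sum(T[i][j] * x[j] for j in range(n)) for i in range(n)]
    d = solve_ge(T_lo, r)
    x = [xi + di for xi, di in zip(x, d)]

res = max(abs(b[i] - sum(T[i][j] * x[j] for j in range(n))) for i in range(n))
assert res < 1e-8
```

Each step reuses the same cheap factorization; only the residual, which can be formed with fast Toeplitz matrix-vector products, involves the exact T. This is why a larger τ plus a few refinement steps can beat a small τ used directly.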

given N, if d = 2^j · 25 is too large or too small, then both the solver time and the precomputation time can be relatively large. In fact, by counting the operations in the algorithm it is possible to determine an optimal d; see [12] for the derivation of the optimal choice of d in the SSS solver. We can do similar derivations for both the SSS construction and the SSS solver. We just point out that when τ gets larger, the off-diagonal ranks decrease and a smaller d should be chosen. In the previous experiments we used two sets of parameters: τ = 10^−9 with d = 100, and a larger τ with d = 50.

Finally, it turns out that our current preliminary implementation of the solver is slower than the implementation of the algorithm in [38], likely due to our inefficient data structures and nonoptimized codes for SSS matrices. The complicated nature of SSS matrices needs more careful coding and memory management. An improved software implementation is under construction.

6. Conclusions and future work. In this paper we have presented a superfast and practically stable solver for Toeplitz systems of linear equations. A Toeplitz matrix is first transformed into a Cauchy-like matrix, which has a nice low-rank property, and then the Cauchy-like system is solved by a superfast solver. This superfast solver utilizes the low-rank property and makes use of an SSS representation of the Cauchy-like matrix. A fast construction procedure for SSS structures is presented. After a one-time precomputation the solver is very efficient, with O(N log N) + O(Np^2) complexity. The algorithm is also efficient in that only linear storage is required. In future work we hope to further reduce the precomputation cost and to finish a better-designed version of the current Fortran 90 codes, in both the computations and the coding.

Acknowledgments. Many thanks to the referees for their valuable comments, which have greatly improved the development and presentation of this paper. The authors are also grateful to the editors and
the editorial assistant for their patience and thoughtful suggestions.

REFERENCES

[1] G. S. Ammar and W. B. Gragg, Superfast solution of real positive definite Toeplitz systems, SIAM J. Matrix Anal. Appl., 9 (1988), pp. 61–76.
[2] G. S. Ammar and W. B. Gragg, Numerical experience with a superfast real Toeplitz solver, Linear Algebra Appl., 121 (1989), pp. 185–206.
[3] E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov, and D. Sorensen, LAPACK Users' Guide, 2nd ed., SIAM, Philadelphia, PA, 1995.
[4] R. R. Bitmead and B. D. O. Anderson, Asymptotically fast solution of Toeplitz and related systems of linear equations, Linear Algebra Appl., 34 (1980), pp. 103–116.
[5] A. W. Bojanczyk, R. P. Brent, F. R. de Hoog, and D. R. Sweet, On the stability of the Bareiss and related Toeplitz factorization algorithms, SIAM J. Matrix Anal. Appl., 16 (1995), pp. 40–57.
[6] R. P. Brent, F. G. Gustavson, and D. Y. Y. Yun, Fast solution of Toeplitz systems of equations and computation of Padé approximants, J. Algorithms, 1 (1980), pp. 259–295.
[7] J. R. Bunch, Stability of methods for solving Toeplitz systems of equations, SIAM J. Sci. Statist. Comput., 6 (1985), pp. 349–364.
[8] J. Carrier, L. Greengard, and V. Rokhlin, A fast adaptive multipole algorithm for particle simulations, SIAM J. Sci. Statist. Comput., 9 (1988), pp. 669–686.
[9] T. F. Chan and P. C. Hansen, A look-ahead Levinson algorithm for general Toeplitz systems, IEEE Trans. Signal Process., 40 (1992).
[10] T. F. Chan and P. C. Hansen, A look-ahead Levinson algorithm for indefinite Toeplitz systems, SIAM J. Matrix Anal. Appl., 13 (1992).
[11] S. Chandrasekaran and M. Gu, Fast and stable algorithms for banded plus semiseparable systems of linear equations, SIAM J. Matrix Anal. Appl., 25 (2003), pp. 373–384.

[12] S. Chandrasekaran, P. Dewilde, M. Gu, T. Pals, X. Sun, A.-J. van der Veen, and D. White, Fast Stable Solvers for Sequentially Semi-Separable Linear Systems of Equations and Least Squares Problems, Technical report, University of California, Berkeley, CA, 2003.
[13] S. Chandrasekaran, P. Dewilde, M. Gu, T. Pals, X. Sun, A.-J. van der Veen, and D. White, Some fast algorithms for sequentially semiseparable representations, SIAM J. Matrix Anal. Appl., 27 (2005), pp. 341–364.
[14] S. Chandrasekaran, M. Gu, and W. Lyons, A Fast and Stable Adaptive Solver for Hierarchically Semi-Separable Representations, Technical report, University of California, Santa Barbara, CA, 2004.
[15] G. Cybenko, The numerical stability of the Levinson–Durbin algorithm for Toeplitz systems of equations, SIAM J. Sci. Statist. Comput., 1 (1980).
[16] G. Cybenko, Error Analysis of Some Signal Processing Algorithms, Ph.D. thesis, Princeton University, Princeton, NJ, 1978.
[17] F. R. de Hoog, On the solution of Toeplitz systems, Linear Algebra Appl., 88/89 (1987), pp. 123–138.
[18] P. Dewilde and A. van der Veen, Time-Varying Systems and Computations, Kluwer Academic Publishers, Boston, MA, 1998.
[19] M. Fiedler, Hankel and Loewner matrices, Linear Algebra Appl., 58 (1984), pp. 75–95.
[20] K. A. Gallivan, S. Thirumalai, P. Van Dooren, and V. Vermaut, High performance algorithms for Toeplitz and block Toeplitz matrices, Linear Algebra Appl., 241–243 (1996).
[21] L. Gemignani, Schur complements of Bezoutians and the inversion of block Hankel and block Toeplitz matrices, Linear Algebra Appl.
[22] I. Gohberg, T. Kailath, and V. Olshevsky, Fast Gaussian elimination with partial pivoting for matrices with displacement structure, Math. Comp., 64 (1995), pp. 1557–1576.
[23] I. Gohberg and V. Olshevsky, Fast state space algorithms for matrix Nehari and Nehari–Takagi interpolation problems, Integral Equations Operator Theory, 20 (1994), pp. 44–83.
[24] I. Gohberg and V. Olshevsky, Complexity of multiplication with vectors for structured matrices, Linear Algebra Appl., 202 (1994).
[25] L. Greengard and V. Rokhlin, A fast algorithm for particle simulations, J. Comput. Phys., 73 (1987), pp. 325–348.
[26] M. Gu, Stable and efficient algorithms for structured systems of linear equations, SIAM J. Matrix Anal. Appl., 19 (1998), pp. 279–306.
[27] M. Gu and S. C. Eisenstat, Efficient algorithms for computing a strong rank-revealing QR factorization, SIAM J. Sci. Comput., 17 (1996), pp. 848–869.
[28] G. Heinig, Inversion of generalized Cauchy matrices and other classes of structured matrices, in Linear Algebra for Signal Processing, IMA Vol. Math. Appl. 69, Springer, New York, 1995, pp. 63–81.
[29] G. Heinig and K. Rost, Algebraic Methods for Toeplitz-like Matrices and Operators, Oper. Theory Adv. Appl. 13, Birkhäuser Verlag, Basel, 1984.
[30] T. Kailath, Fredholm resolvents, Wiener–Hopf equations, and Riccati differential equations, IEEE Trans. Inform. Theory, 15 (1969).
[31] T. Kailath, S. Kung, and M. Morf, Displacement ranks of matrices and linear equations, J. Math. Anal. Appl., 68 (1979), pp. 395–407.
[32] T. Kailath and A. Sayed, eds., Fast Reliable Algorithms for Matrices with Structure, SIAM, Philadelphia, 1999.
[33] P. G. Martinsson, V. Rokhlin, and M. Tygert, A fast algorithm for the inversion of general Toeplitz matrices, Comput. Math. Appl., 50 (2005).
[34] M. Morf, Fast Algorithms for Multivariable Systems, Ph.D. thesis, Department of Electrical Engineering, Stanford University, Stanford, CA, 1974.
[35] B. R. Musicus, Levinson and Fast Choleski Algorithms for Toeplitz and Almost Toeplitz Matrices, Technical report, Res. Lab. of Electronics, MIT, Cambridge, MA, 1988.
[36] V. Pan, On computations with dense structured matrices, Math. Comp., 55 (1990), pp. 179–190.
[37] D. R. Sweet, The use of pivoting to improve the numerical performance of algorithms for Toeplitz matrices, SIAM J. Matrix Anal. Appl., 14 (1993), pp. 468–493.
[38] M. Van Barel, G. Heinig, and P. Kravanja, A stabilized superfast solver for nonsymmetric Toeplitz systems, SIAM J. Matrix Anal. Appl., 23 (2001), pp. 494–510.
[39] M. Van Barel and P. Kravanja, A stabilized superfast solver for indefinite Hankel systems, Linear Algebra Appl., 284 (1998), pp. 335–355.


More information

A Randomized Algorithm for the Approximation of Matrices

A Randomized Algorithm for the Approximation of Matrices A Randomized Algorithm for the Approximation of Matrices Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert Technical Report YALEU/DCS/TR-36 June 29, 2006 Abstract Given an m n matrix A and a positive

More information

Intel Math Kernel Library (Intel MKL) LAPACK

Intel Math Kernel Library (Intel MKL) LAPACK Intel Math Kernel Library (Intel MKL) LAPACK Linear equations Victor Kostin Intel MKL Dense Solvers team manager LAPACK http://www.netlib.org/lapack Systems of Linear Equations Linear Least Squares Eigenvalue

More information

Spectrum-Revealing Matrix Factorizations Theory and Algorithms

Spectrum-Revealing Matrix Factorizations Theory and Algorithms Spectrum-Revealing Matrix Factorizations Theory and Algorithms Ming Gu Department of Mathematics University of California, Berkeley April 5, 2016 Joint work with D. Anderson, J. Deursch, C. Melgaard, J.

More information

The restarted QR-algorithm for eigenvalue computation of structured matrices

The restarted QR-algorithm for eigenvalue computation of structured matrices Journal of Computational and Applied Mathematics 149 (2002) 415 422 www.elsevier.com/locate/cam The restarted QR-algorithm for eigenvalue computation of structured matrices Daniela Calvetti a; 1, Sun-Mi

More information

9. Numerical linear algebra background

9. Numerical linear algebra background Convex Optimization Boyd & Vandenberghe 9. Numerical linear algebra background matrix structure and algorithm complexity solving linear equations with factored matrices LU, Cholesky, LDL T factorization

More information

AM 205: lecture 8. Last time: Cholesky factorization, QR factorization Today: how to compute the QR factorization, the Singular Value Decomposition

AM 205: lecture 8. Last time: Cholesky factorization, QR factorization Today: how to compute the QR factorization, the Singular Value Decomposition AM 205: lecture 8 Last time: Cholesky factorization, QR factorization Today: how to compute the QR factorization, the Singular Value Decomposition QR Factorization A matrix A R m n, m n, can be factorized

More information

ETNA Kent State University

ETNA Kent State University C 8 Electronic Transactions on Numerical Analysis. Volume 17, pp. 76-2, 2004. Copyright 2004,. ISSN 1068-613. etnamcs.kent.edu STRONG RANK REVEALING CHOLESKY FACTORIZATION M. GU AND L. MIRANIAN Abstract.

More information

Key words. conjugate gradients, normwise backward error, incremental norm estimation.

Key words. conjugate gradients, normwise backward error, incremental norm estimation. Proceedings of ALGORITMY 2016 pp. 323 332 ON ERROR ESTIMATION IN THE CONJUGATE GRADIENT METHOD: NORMWISE BACKWARD ERROR PETR TICHÝ Abstract. Using an idea of Duff and Vömel [BIT, 42 (2002), pp. 300 322

More information

Randomized algorithms for the low-rank approximation of matrices

Randomized algorithms for the low-rank approximation of matrices Randomized algorithms for the low-rank approximation of matrices Yale Dept. of Computer Science Technical Report 1388 Edo Liberty, Franco Woolfe, Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra Direct Methods Philippe B. Laval KSU Fall 2017 Philippe B. Laval (KSU) Linear Systems: Direct Solution Methods Fall 2017 1 / 14 Introduction The solution of linear systems is one

More information

EE731 Lecture Notes: Matrix Computations for Signal Processing

EE731 Lecture Notes: Matrix Computations for Signal Processing EE731 Lecture Notes: Matrix Computations for Signal Processing James P. Reilly c Department of Electrical and Computer Engineering McMaster University September 22, 2005 0 Preface This collection of ten

More information

On the application of different numerical methods to obtain null-spaces of polynomial matrices. Part 1: block Toeplitz algorithms.

On the application of different numerical methods to obtain null-spaces of polynomial matrices. Part 1: block Toeplitz algorithms. On the application of different numerical methods to obtain null-spaces of polynomial matrices. Part 1: block Toeplitz algorithms. J.C. Zúñiga and D. Henrion Abstract Four different algorithms are designed

More information

A New Block Algorithm for Full-Rank Solution of the Sylvester-observer Equation.

A New Block Algorithm for Full-Rank Solution of the Sylvester-observer Equation. 1 A New Block Algorithm for Full-Rank Solution of the Sylvester-observer Equation João Carvalho, DMPA, Universidade Federal do RS, Brasil Karabi Datta, Dep MSc, Northern Illinois University, DeKalb, IL

More information

A Cholesky LR algorithm for the positive definite symmetric diagonal-plus-semiseparable eigenproblem

A Cholesky LR algorithm for the positive definite symmetric diagonal-plus-semiseparable eigenproblem A Cholesky LR algorithm for the positive definite symmetric diagonal-plus-semiseparable eigenproblem Bor Plestenjak Department of Mathematics University of Ljubljana Slovenia Ellen Van Camp and Marc Van

More information

Matrix Factorizations

Matrix Factorizations 1 Stat 540, Matrix Factorizations Matrix Factorizations LU Factorization Definition... Given a square k k matrix S, the LU factorization (or decomposition) represents S as the product of two triangular

More information

Generalized interval arithmetic on compact matrix Lie groups

Generalized interval arithmetic on compact matrix Lie groups myjournal manuscript No. (will be inserted by the editor) Generalized interval arithmetic on compact matrix Lie groups Hermann Schichl, Mihály Csaba Markót, Arnold Neumaier Faculty of Mathematics, University

More information

Fast Transforms: Banded Matrices with Banded Inverses

Fast Transforms: Banded Matrices with Banded Inverses Fast Transforms: Banded Matrices with Banded Inverses 1. Introduction Gilbert Strang, MIT An invertible transform y = Ax expresses the vector x in a new basis. The inverse transform x = A 1 y reconstructs

More information

0 Introduction Let the matrix A C nn be given by all its n entries The problem is to compute the product Ab of a matrix A by input vector b C n Using

0 Introduction Let the matrix A C nn be given by all its n entries The problem is to compute the product Ab of a matrix A by input vector b C n Using Fast algorithms with preprocessing for matrix{vector multiplication problems IGohberg and VOlshevsky School of Mathematical Sciences Raymond and Beverly Sackler Faculty of Exact Sciences Tel Aviv University,

More information

7. LU factorization. factor-solve method. LU factorization. solving Ax = b with A nonsingular. the inverse of a nonsingular matrix

7. LU factorization. factor-solve method. LU factorization. solving Ax = b with A nonsingular. the inverse of a nonsingular matrix EE507 - Computational Techniques for EE 7. LU factorization Jitkomut Songsiri factor-solve method LU factorization solving Ax = b with A nonsingular the inverse of a nonsingular matrix LU factorization

More information

Efficient and Accurate Rectangular Window Subspace Tracking

Efficient and Accurate Rectangular Window Subspace Tracking Efficient and Accurate Rectangular Window Subspace Tracking Timothy M. Toolan and Donald W. Tufts Dept. of Electrical Engineering, University of Rhode Island, Kingston, RI 88 USA toolan@ele.uri.edu, tufts@ele.uri.edu

More information

Effective matrix-free preconditioning for the augmented immersed interface method

Effective matrix-free preconditioning for the augmented immersed interface method Effective matrix-free preconditioning for the augmented immersed interface method Jianlin Xia a, Zhilin Li b, Xin Ye a a Department of Mathematics, Purdue University, West Lafayette, IN 47907, USA. E-mail:

More information

I-v k e k. (I-e k h kt ) = Stability of Gauss-Huard Elimination for Solving Linear Systems. 1 x 1 x x x x

I-v k e k. (I-e k h kt ) = Stability of Gauss-Huard Elimination for Solving Linear Systems. 1 x 1 x x x x Technical Report CS-93-08 Department of Computer Systems Faculty of Mathematics and Computer Science University of Amsterdam Stability of Gauss-Huard Elimination for Solving Linear Systems T. J. Dekker

More information

Linear Algebra and Eigenproblems

Linear Algebra and Eigenproblems Appendix A A Linear Algebra and Eigenproblems A working knowledge of linear algebra is key to understanding many of the issues raised in this work. In particular, many of the discussions of the details

More information

On the stability of the Bareiss and related Toeplitz factorization algorithms

On the stability of the Bareiss and related Toeplitz factorization algorithms On the stability of the Bareiss and related oeplitz factorization algorithms A. W. Bojanczyk School of Electrical Engineering Cornell University Ithaca, NY 14853-5401, USA adamb@toeplitz.ee.cornell.edu

More information

Sparse factorization using low rank submatrices. Cleve Ashcraft LSTC 2010 MUMPS User Group Meeting April 15-16, 2010 Toulouse, FRANCE

Sparse factorization using low rank submatrices. Cleve Ashcraft LSTC 2010 MUMPS User Group Meeting April 15-16, 2010 Toulouse, FRANCE Sparse factorization using low rank submatrices Cleve Ashcraft LSTC cleve@lstc.com 21 MUMPS User Group Meeting April 15-16, 21 Toulouse, FRANCE ftp.lstc.com:outgoing/cleve/mumps1 Ashcraft.pdf 1 LSTC Livermore

More information

ECE133A Applied Numerical Computing Additional Lecture Notes

ECE133A Applied Numerical Computing Additional Lecture Notes Winter Quarter 2018 ECE133A Applied Numerical Computing Additional Lecture Notes L. Vandenberghe ii Contents 1 LU factorization 1 1.1 Definition................................. 1 1.2 Nonsingular sets

More information

A note on eigenvalue computation for a tridiagonal matrix with real eigenvalues Akiko Fukuda

A note on eigenvalue computation for a tridiagonal matrix with real eigenvalues Akiko Fukuda Journal of Math-for-Industry Vol 3 (20A-4) pp 47 52 A note on eigenvalue computation for a tridiagonal matrix with real eigenvalues Aio Fuuda Received on October 6 200 / Revised on February 7 20 Abstract

More information

A sparse multifrontal solver using hierarchically semi-separable frontal matrices

A sparse multifrontal solver using hierarchically semi-separable frontal matrices A sparse multifrontal solver using hierarchically semi-separable frontal matrices Pieter Ghysels Lawrence Berkeley National Laboratory Joint work with: Xiaoye S. Li (LBNL), Artem Napov (ULB), François-Henry

More information

Numerical solution of the eigenvalue problem for Hermitian Toeplitz-like matrices

Numerical solution of the eigenvalue problem for Hermitian Toeplitz-like matrices TR-CS-9-1 Numerical solution of the eigenvalue problem for Hermitian Toeplitz-like matrices Michael K Ng and William F Trench July 199 Joint Computer Science Technical Report Series Department of Computer

More information

Eigen-solving via reduction to DPR1 matrices

Eigen-solving via reduction to DPR1 matrices Computers and Mathematics with Applications 56 (2008) 166 171 www.elsevier.com/locate/camwa Eigen-solving via reduction to DPR1 matrices V.Y. Pan a,, B. Murphy a, R.E. Rosholt a, Y. Tang b, X. Wang c,

More information

Katholieke Universiteit Leuven Department of Computer Science

Katholieke Universiteit Leuven Department of Computer Science Structures preserved by matrix inversion Steven Delvaux Marc Van Barel Report TW 414, December 24 Katholieke Universiteit Leuven Department of Computer Science Celestijnenlaan 2A B-31 Heverlee (Belgium)

More information

Index. book 2009/5/27 page 121. (Page numbers set in bold type indicate the definition of an entry.)

Index. book 2009/5/27 page 121. (Page numbers set in bold type indicate the definition of an entry.) page 121 Index (Page numbers set in bold type indicate the definition of an entry.) A absolute error...26 componentwise...31 in subtraction...27 normwise...31 angle in least squares problem...98,99 approximation

More information

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit VII Sparse Matrix

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit VII Sparse Matrix Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccswebpage Lecture Notes for Unit VII Sparse Matrix Computations Part 1: Direct Methods Dianne P. O Leary c 2008

More information

Sparsity-Preserving Difference of Positive Semidefinite Matrix Representation of Indefinite Matrices

Sparsity-Preserving Difference of Positive Semidefinite Matrix Representation of Indefinite Matrices Sparsity-Preserving Difference of Positive Semidefinite Matrix Representation of Indefinite Matrices Jaehyun Park June 1 2016 Abstract We consider the problem of writing an arbitrary symmetric matrix as

More information

Using semiseparable matrices to compute the SVD of a general matrix product/quotient

Using semiseparable matrices to compute the SVD of a general matrix product/quotient Using semiseparable matrices to compute the SVD of a general matrix product/quotient Marc Van Barel Yvette Vanberghen Paul Van Dooren Report TW 508, November 007 n Katholieke Universiteit Leuven Department

More information

Module 6.6: nag nsym gen eig Nonsymmetric Generalized Eigenvalue Problems. Contents

Module 6.6: nag nsym gen eig Nonsymmetric Generalized Eigenvalue Problems. Contents Eigenvalue and Least-squares Problems Module Contents Module 6.6: nag nsym gen eig Nonsymmetric Generalized Eigenvalue Problems nag nsym gen eig provides procedures for solving nonsymmetric generalized

More information

9. Numerical linear algebra background

9. Numerical linear algebra background Convex Optimization Boyd & Vandenberghe 9. Numerical linear algebra background matrix structure and algorithm complexity solving linear equations with factored matrices LU, Cholesky, LDL T factorization

More information

1. Introduction. Let µ(t) be a distribution function with infinitely many points of increase in the interval [ π, π] and let

1. Introduction. Let µ(t) be a distribution function with infinitely many points of increase in the interval [ π, π] and let SENSITIVITY ANALYSIS OR SZEGŐ POLYNOMIALS SUN-MI KIM AND LOTHAR REICHEL Abstract Szegő polynomials are orthogonal with respect to an inner product on the unit circle Numerical methods for weighted least-squares

More information

Homework 2 Foundations of Computational Math 2 Spring 2019

Homework 2 Foundations of Computational Math 2 Spring 2019 Homework 2 Foundations of Computational Math 2 Spring 2019 Problem 2.1 (2.1.a) Suppose (v 1,λ 1 )and(v 2,λ 2 ) are eigenpairs for a matrix A C n n. Show that if λ 1 λ 2 then v 1 and v 2 are linearly independent.

More information

Displacement Structure in Computing Approximate GCD of Univariate Polynomials

Displacement Structure in Computing Approximate GCD of Univariate Polynomials MM Research Preprints, 230 238 MMRC, AMSS, Academia, Sinica, Beijing No 21, December 2002 Displacement Structure in Computing Approximate GCD of Univariate Polynomials Lihong Zhi 1) Abstract We propose

More information

A fast randomized algorithm for overdetermined linear least-squares regression

A fast randomized algorithm for overdetermined linear least-squares regression A fast randomized algorithm for overdetermined linear least-squares regression Vladimir Rokhlin and Mark Tygert Technical Report YALEU/DCS/TR-1403 April 28, 2008 Abstract We introduce a randomized algorithm

More information

FAST STRUCTURED EIGENSOLVER FOR DISCRETIZED PARTIAL DIFFERENTIAL OPERATORS ON GENERAL MESHES

FAST STRUCTURED EIGENSOLVER FOR DISCRETIZED PARTIAL DIFFERENTIAL OPERATORS ON GENERAL MESHES Proceedings of the Project Review, Geo-Mathematical Imaging Group Purdue University, West Lafayette IN, Vol. 1 2012 pp. 123-132. FAST STRUCTURED EIGENSOLVER FOR DISCRETIZED PARTIAL DIFFERENTIAL OPERATORS

More information

Lecture 13 Stability of LU Factorization; Cholesky Factorization. Songting Luo. Department of Mathematics Iowa State University

Lecture 13 Stability of LU Factorization; Cholesky Factorization. Songting Luo. Department of Mathematics Iowa State University Lecture 13 Stability of LU Factorization; Cholesky Factorization Songting Luo Department of Mathematics Iowa State University MATH 562 Numerical Analysis II ongting Luo ( Department of Mathematics Iowa

More information

Bindel, Fall 2016 Matrix Computations (CS 6210) Notes for

Bindel, Fall 2016 Matrix Computations (CS 6210) Notes for 1 Algorithms Notes for 2016-10-31 There are several flavors of symmetric eigenvalue solvers for which there is no equivalent (stable) nonsymmetric solver. We discuss four algorithmic ideas: the workhorse

More information