DELFT UNIVERSITY OF TECHNOLOGY

REPORT 16-04

An MSSS-Preconditioned Matrix Equation Approach for the Time-Harmonic Elastic Wave Equation at Multiple Frequencies

M. Baumann, R. Astudillo, Y. Qiu, E. Ang, M.B. Van Gijzen, and R.-É. Plessix

ISSN 1389-6520

Reports of the Delft Institute of Applied Mathematics

Delft 2016

Copyright 2016 by Delft Institute of Applied Mathematics, Delft, The Netherlands.

No part of the Journal may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission from Delft Institute of Applied Mathematics, Delft University of Technology, The Netherlands.

Delft University of Technology
Technical Report 16-04

An MSSS-Preconditioned Matrix Equation Approach for the Time-Harmonic Elastic Wave Equation at Multiple Frequencies

M. Baumann · R. Astudillo · Y. Qiu · E. Ang · M.B. van Gijzen · R.-É. Plessix

September 19, 2016

Abstract In this work we present a new numerical framework for the efficient solution of the time-harmonic elastic wave equation at multiple frequencies. We show that multiple frequencies (and multiple right-hand sides) can be incorporated when the discretized problem is written as a matrix equation. This matrix equation can be solved efficiently using the preconditioned IDR(s) method. We present an efficient and robust way to apply a single preconditioner using MSSS matrix computations. For 3D problems, we present a memory-efficient implementation that exploits the solution of a sequence of 2D problems. Realistic examples in two and three spatial dimensions demonstrate the performance of the new algorithm.

Keywords time-harmonic elastic wave equation · multiple frequencies · Induced Dimension Reduction (IDR) method · preconditioned matrix equations · multilevel sequentially semiseparable matrices (MSSS)

Manuel Baumann (corresponding), Delft University of Technology, Delft Institute of Applied Mathematics, m.m.baumann@tudelft.nl · Reinaldo Astudillo, Delft University of Technology · Yue Qiu, Max Planck Institute for Dynamics of Complex Technical Systems · Elisa Ang, Nanyang Technological University · Martin B. van Gijzen, Delft University of Technology · René-Édouard Plessix, Shell Global Solutions International B.V.

1 Introduction

The understanding of the earth subsurface is a key task in geophysics, and seismic exploration is an approach that matches the intensity of reflected shock waves with simulation results in a least-squares sense; cf. [41] and the references therein for an overview of state-of-the-art Full-Waveform Inversion (FWI) algorithms. From a mathematical point of view, the problem of matching measurements with simulation results leads to a PDE-constrained optimization problem where the objective function is defined by the respective FWI approach, and the constraining partial differential equation (PDE) is the wave equation. Since the earth is an elastic medium, the elastic wave equation needs to be considered. In order to design an efficient optimization algorithm, a fast numerical solution of the elastic wave equation (forward problem) is required at every iteration of the optimization loop.

More recently, FWI has been considered for an equivalent problem formulated in the frequency domain [2, 26]. The frequency-domain formulation of wave propagation has shown specific modeling advantages for both acoustic and elastic media. For efficient FWI, notably waveform tomography [25, 41], a fast numerical solution of the respective time-harmonic forward problem is required. More precisely, the forward problem requires the fast numerical solution of the discretized time-harmonic elastic wave equation at multiple wave frequencies and for multiple source terms. In this context, many efficient numerical solution methods have been proposed, mostly for the (acoustic) Helmholtz equation [21, 23, 24, 31]. In this work, we present an efficient solver for the time-harmonic elastic wave equation that results from a finite element discretization, cf. [9, 13]. Especially for large 3D problems, the efficient numerical solution with respect to computation time and memory requirements is subject to current research.

When an iterative Krylov method is considered, the design of efficient preconditioners for the elastic wave equation is required. In [1] a damped preconditioner for the elastic wave equation is presented. The authors of [32] analyze a multi-grid approach for the damped problem. Both works are extensions of the work of Erlangga et al. [31] for the acoustic case. The recent low-rank approach of the MUMPS solver [2] makes use of the hierarchical structure of the discrete problem and can be used as a preconditioner, cf. [43]. When domain decomposition is considered, the sweeping preconditioner [39] is an attractive alternative.

In this work we propose a hybrid method that combines the iterative Induced Dimension Reduction (IDR) method with an efficient preconditioner that exploits the multilevel sequentially semiseparable (MSSS) matrix structure of the discretized elastic wave equation on a Cartesian grid. Moreover, we derive a matrix equation formulation that includes multiple frequencies and multiple right-hand sides, and present a version of IDR that solves linear matrix equations at a low memory requirement.

The paper is structured as follows: In Section 2, we derive a finite element discretization for the time-harmonic elastic wave equation with a special emphasis on the case when multiple frequencies are present. Section 3 presents the IDR(s) method for the efficient iterative solution of the resulting matrix equation. We discuss an efficient preconditioner in Section 4 based on the MSSS structure of the discrete problem. We present different versions of the MSSS preconditioner for 2D and 3D problems in Section 4.2 and Section 4.3, respectively. The paper concludes with extensive numerical tests in Section 5.

2 The time-harmonic elastic wave equation at multiple frequencies

In this section we present a finite element discretization of the time-harmonic elastic wave equation with a special emphasis on the mathematical and numerical treatment when multiple frequencies (and possibly multiple right-hand sides) are present.

2.1 Problem description

The time-harmonic elastic wave equation describes the displacement vector u: Ω → C^d in a computational domain Ω ⊂ R^d, d ∈ {2,3}, governed by the following partial differential equation (PDE),

  −ω_k² ρ(x) u_k − ∇·σ(u_k) = s,  x ∈ Ω ⊂ R^d,  k = 1, ..., N_ω.   (1)

Here, ρ(x) is the density of an elastic material in the considered domain Ω that can differ with x ∈ Ω (inhomogeneity), s is a source term, and {ω_1, ..., ω_{N_ω}} are multiple angular frequencies that define N_ω problems in (1). The stress and strain tensors follow from Hooke's law for isotropic elastic media,

  σ(u_k) := λ(x) (∇·u_k) I_d + 2μ(x) ε(u_k),   (2)
  ε(u_k) := ½ (∇u_k + (∇u_k)^T),   (3)

with λ, μ being the Lamé parameters. On the boundary ∂Ω of the domain Ω, we consider the following boundary conditions,

  iω_k ρ(x) B u_k + σ(u_k) n̂ = 0,  x ∈ ∂Ω_a,   (4)
  σ(u_k) n̂ = 0,  x ∈ ∂Ω_r,   (5)

where Sommerfeld radiation boundary conditions at ∂Ω_a model absorption, and we typically prescribe a free-surface boundary condition in the north of the computational domain, ∂Ω_r, with ∂Ω_a ∪ ∂Ω_r = ∂Ω. In (4), B is a d×d matrix that depends on c_p and c_s,

  B ≡ B(x) := c_p(x) n̂n̂^T + c_s(x) t̂t̂^T + c_s(x) ŝŝ^T,

with vectors {n̂, t̂, ŝ} being normal or tangential to the boundary, respectively; cf. [1] for more details. Note that the boundary conditions (4)-(5) can naturally be included in a finite element approach as explained in Section 2.2.

Fig. 1: Boundary conditions and source term for d = 2: absorbing boundaries ∂Ω_a on the left, right and bottom of the domain, free-surface boundary ∂Ω_r on top, and point source s = δ(L_x/2, 0). For d = 3, the source is for instance located at (L_x/2, L_y/2, 0)^T.

We assume the set of five parameters {ρ, c_p, c_s, λ, μ} in (1)-(5) to be space-dependent. The Lamé parameters λ and μ are directly related to the density ρ, the speed of P-waves c_p, and the speed of S-waves c_s via,

  μ = c_s² ρ,  λ = ρ (c_p² − 2c_s²).   (6)
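The conversion (6) is straightforward to evaluate pointwise on a grid. As a minimal Python sketch (the function name and the homogeneous test values below are illustrative, not taken from the paper):

```python
import numpy as np

def lame_parameters(rho, cp, cs):
    """Evaluate (6): mu = cs^2 * rho and lambda = rho * (cp^2 - 2 cs^2).
    Inputs may be scalars or NumPy arrays sampled on the grid, since all
    five parameters {rho, cp, cs, lambda, mu} are space-dependent."""
    mu = cs ** 2 * rho
    lam = rho * (cp ** 2 - 2.0 * cs ** 2)
    return lam, mu

# illustrative values for one homogeneous layer
lam, mu = lame_parameters(rho=2000.0, cp=3000.0, cs=1500.0)
```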
2.2 Finite element (FEM) discretization

For the discretization of (1)-(5) we follow a classical finite element approach using the following ansatz,

  u_k(x) ≈ Σ_{i=1}^{N} u_k^i φ_i(x),  x ∈ Ω ⊂ R^d,  u_k^i ∈ C.   (7)

In the numerical examples presented in Section 5 we restrict ourselves to Cartesian grids and basis functions φ_i that are B-splines of degree p as described, for instance, in [8, Chapter 2].

The number of degrees of freedom is, hence, given by

  N = d · ∏_{i ∈ {x,y,z}} (n_i − 1 + p),  d ∈ {2,3},  p ∈ N_+,   (8)

with n_i grid points in the respective spatial direction (in Figure 1 we illustrate the case where d = 2 and n_x = 7, n_y = 5).

Definition 1 (Tensor notation, [12]) The dot product between two vector-valued quantities u = (u_x, u_y), v = (v_x, v_y) is denoted as u · v := u_x v_x + u_y v_y. Similarly, we define the componentwise multiplication of two matrices U = [u_ij], V = [v_ij] as U : V := Σ_{i,j} u_ij v_ij.

A Galerkin finite element approach with d-dimensional test functions φ_j applied to (1) leads to,

  −ω_k² Σ_{i=1}^{N} u_k^i ∫_Ω ρ(x) φ_i · φ_j dΩ − Σ_{i=1}^{N} u_k^i ∫_Ω (∇·σ(φ_i)) · φ_j dΩ = ∫_Ω s · φ_j dΩ,  j = 1, ..., N,

where we exploit the boundary conditions (4)-(5) in the following way:

  ∫_Ω (∇·σ(φ_i)) · φ_j dΩ = ∫_∂Ω (σ(φ_i) n̂) · φ_j dΓ − ∫_Ω σ(φ_i) : ∇φ_j dΩ
                          = −iω_k ∫_{∂Ω_a} ρ(x) (Bφ_i) · φ_j dΓ − ∫_Ω σ(φ_i) : ∇φ_j dΩ.

Note that the stress-free boundary condition (5) can be included naturally in a finite element discretization by excluding ∂Ω_r from the above boundary integral. We summarize the finite element discretization of the time-harmonic, inhomogeneous elastic wave equation at multiple frequencies ω_k by,

  (K + iω_k C − ω_k² M) x_k = b,  k = 1, ..., N_ω,   (9)

with unknown vectors x_k := [u_k^1, ..., u_k^N]^T ∈ C^N consisting of the coefficients in (7), and mass matrix M, stiffness matrix K and boundary matrix C given by,

  K_ij = ∫_Ω σ(φ_i) : ∇φ_j dΩ,  M_ij = ∫_Ω ρ φ_i · φ_j dΩ,
  C_ij = ∫_{∂Ω_a} ρ (Bφ_i) · φ_j dΓ,  b_j = ∫_Ω s · φ_j dΩ.

In a 2D problem (see Figure 1), the unknown x_k contains the x-components and the y-components of the displacement vector. When lexicographic numbering is used, the matrices in (9) have the block structure

  K = [K_xx K_xy; K_yx K_yy],  C = [C_xx C_xy; C_yx C_yy],  M = [M_xx M_xy; M_yx M_yy],

as shown in Figure 3 (left) for d = 2, and Figure 2 (top left) for d = 3. When solving (9) with an iterative Krylov method, it is necessary to apply a preconditioner. Throughout this document, we consider a preconditioner of the form

  P(τ) = (K + iτC − τ²M),   (10)

where τ is a single seed frequency that needs to be chosen with care for the range of frequencies {ω_1, ..., ω_{N_ω}}, cf. the considerations in [5, 36]. The efficient application of the preconditioner (10) for problems of dimension d = 2 and d = 3 on a structured domain is presented in Section 4, and the choice of τ is discussed in Section 5.

Fig. 2: A spy plot of (10) for a 3D elastic problem when linear basis functions (p = 1) are used: In the top row we show the discretized problem for lexicographic (top left) and nodal-based ordering (top right). Appropriate zooming demonstrates the hierarchically repeating structure of the matrix on level 2 (bottom left) and level 1 (bottom right). For level 1, we indicate the SSS data structure (generators D_1, U_2, ...) used in Section 4.

2.3 Reformulation as a matrix equation

We next describe a new approach to solve (9) at multiple frequencies. Therefore, we define the block matrix X consisting of all unknown vectors, X := [x_1, ..., x_{N_ω}] ∈ C^{N×N_ω}, and note that (9) can be rewritten as,

  A(X) := KX + iCXΣ − MXΣ² = B,   (11)

where Σ := diag(ω_1, ..., ω_{N_ω}), and with block right-hand side B := [b, ..., b]. In (11), we also define the linear operator A(·), which yields the short-hand notation A(X) = B.
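Given assembled sparse matrices K, C, M, the operator A(·) in (11) acts column-wise through powers of the diagonal matrix Σ. A minimal NumPy/SciPy sketch (all function and variable names are ours, not the paper's implementation):

```python
import numpy as np
import scipy.sparse as sp

def apply_operator(K, C, M, X, omegas):
    """Evaluate A(X) = K X + i C X Sigma - M X Sigma^2 from (11).
    X has one column per frequency; right-multiplication by the
    diagonal matrix Sigma = diag(omegas) scales the columns of X."""
    w = np.asarray(omegas)                      # shape (N_omega,)
    return K @ X + 1j * (C @ X) * w - (M @ X) * w ** 2

def residual(K, C, M, X, omegas, b):
    """Residual B - A(X) for the block right-hand side B = [b, ..., b]."""
    B = np.tile(b[:, None], (1, len(omegas)))
    return B - apply_operator(K, C, M, X, omegas)
```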

This reformulation gives rise to use an extension of the IDR(s) method to solve linear matrix equations [3]. Note that an alternative approach to efficiently solve (9) at multiple frequencies (N_ω > 1) leads to the solution of shifted linear systems as presented in [5, Section 4.2] and the references therein. The memory-efficient approach followed by [5] relies on the shift-invariance property of the Krylov spaces belonging to different frequencies. Some restrictions of this approach, such as collinear right-hand sides in (9) and the difficulty of preconditioner design, are not present in the matrix equation setting (11).

3 The Induced Dimension Reduction (IDR) method

Krylov subspace methods are an efficient tool for the iterative numerical solution of large-scale linear systems of equations [18]. In particular, the matrices K, C, M that are typically obtained from a discretization of the time-harmonic elastic wave equation (9) are ill-conditioned and have very large dimensions, especially when high frequencies are considered. For these reasons, the numerical solution is computationally challenging, and factors like memory consumption and computational efficiency have to be taken into account when selecting a suitable Krylov method. The Generalized Minimal Residual (GMRES) method [35] is one of the most widely-used Krylov methods because of its rather simple implementation and optimal convergence property. Nevertheless, GMRES is a long-recurrence Krylov method, i.e., its requirements for memory and computation grow in each iteration, which is unfeasible when solving linear systems arising from the elastic wave equation. On the other hand, short-recurrence Krylov methods keep the computational cost constant per iteration; one of the most used methods of this class is the Bi-conjugate gradient stabilized (Bi-CGSTAB) method [42].

In this work we propose to apply an alternative short-recurrence Krylov method: the Induced Dimension Reduction (IDR) method [14, 38]. IDR(s) uses recursions of depth s+1, with s ∈ N_+ being typically small, to solve linear systems of equations of the form,

  Ax = b,  A ∈ C^{N×N},  {x, b} ∈ C^N,   (12)

where the coefficient matrix A is large, sparse, and in general non-Hermitian. We mention some important numerical properties of the IDR(s) method: First, finite termination of the algorithm is ensured, with IDR(s) computing the exact solution in N + N/s iterations in exact arithmetic. Second, Bi-CGSTAB and IDR(1) are mathematically equivalent [37]. Third, IDR(s) with s > 1 often outperforms Bi-CGSTAB for numerically difficult problems, for example, for convection-diffusion-reaction problems where the convection term is dominating, or problems with a large negative reaction term, cf. [38] and [14], respectively.

3.1 IDR(s) for linear systems

We present a brief introduction of the IDR(s) method that closely follows [38]. In Section 3.2, we explain how to use IDR(s) for solving (11) for multiple frequencies in a matrix equation setting. We introduce the basic concepts of the IDR(s) method. The IDR(s) algorithm is based on the following theorem.

Theorem 1 (The IDR(s) theorem) Let A be a matrix in C^{N×N}, let v_0 be any non-zero vector in C^N, and let G_0 be the full Krylov subspace, G_0 := K_N(A, v_0).
Let S be a (proper) subspace of C^N such that S and G_0 do not share a nontrivial invariant subspace of A, and define the sequence:

  G_j := (I − ξ_j A)(G_{j−1} ∩ S),  j = 1, 2, ...,   (13)

where the ξ_j are nonzero scalars. Then it holds:
1. G_{j+1} ⊂ G_j for j ≥ 0, and,
2. dim(G_{j+1}) < dim(G_j), unless G_j = {0}.

Proof Can be found in [38].

Exploiting the fact that the subspaces G_j are shrinking and G_j = {0} for some j, IDR(s) solves the problem (12) by constructing residuals r_{k+1} in the subspaces G_{j+1}, while in parallel, it extracts the approximate solutions x_{k+1}. In order to illustrate how to create a residual vector in the space G_{j+1}, let us assume that the space S is the left null space of a full rank matrix P := [p_1, p_2, ..., p_s], that {x_i}_{i=k−(s+1)}^{k} are s+1 approximations to (12), and that their corresponding residual vectors {r_i}_{i=k−(s+1)}^{k} are in G_j. IDR(s) creates a residual vector r_{k+1} in G_{j+1} and obtains the approximation x_{k+1} using the following (s+1)-term recursions,

  v_k = r_k − Σ_{j=1}^{s} γ_j Δr_{k−j},
  r_{k+1} = (I − ξ_{j+1} A) v_k,
  x_{k+1} = x_k + ξ_{j+1} v_k − Σ_{j=1}^{s} γ_j Δx_{k−j},

where Δ is the forward difference operator, Δy_k := y_{k+1} − y_k. The vector c = (γ_1, γ_2, ..., γ_s)^T can be obtained by imposing the condition r_{k+1} ∈ G_{j+1}, i.e., by solving the s×s linear system,

  P^H [Δr_{k−1}, Δr_{k−2}, ..., Δr_{k−s}] c = P^H r_k.

At this point, IDR(s) has created a new residual vector r_{k+1} in G_{j+1}. However, using the fact that G_{j+1} ⊂ G_j, r_{k+1} is also in G_j, and IDR(s) repeats the above computation in order to create {r_{k+1}, r_{k+2}, ..., r_{k+s+1}} in G_{j+1}. Once s+1 residuals are in G_{j+1}, IDR(s) is able to sequentially create new residuals in G_{j+2}.
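For illustration, one dimension-reduction step of the recursions above can be written in a few lines of Python. This is a sketch of the unpreconditioned recursion only (variable names are ours), not of the full Algorithm 1 presented later:

```python
import numpy as np

def idr_reduction_step(A, P, r_k, x_k, dR, dX, xi):
    """One (s+1)-term IDR recursion: dR and dX hold the last s forward
    differences Delta r_{k-1}, ..., Delta r_{k-s} and Delta x_{k-1}, ...,
    Delta x_{k-s} as columns.  Solving the s x s system P^H dR c = P^H r_k
    makes v_k = r_k - dR c orthogonal to P, i.e. v_k lies in the
    intersection of G_j and S = null(P^H)."""
    c = np.linalg.solve(P.conj().T @ dR, P.conj().T @ r_k)
    v = r_k - dR @ c
    r_next = v - xi * (A @ v)        # r_{k+1} = (I - xi A) v_k
    x_next = x_k + xi * v - dX @ c   # consistent with r = b - A x
    return r_next, x_next
```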

3.2 Preconditioned IDR(s) for linear matrix equations

The IDR(s) theorem 1 can be generalized to solve linear problems in any finite-dimensional vector space. In particular, IDR(s) has recently been adapted to solve linear matrix equations [3]. In this work, we use this generalization of the IDR(s) method to solve the time-harmonic elastic wave equation at multiple frequencies. Using the definition of the linear operator A(·) in (11) yields a matrix equation in short-hand notation, A(X) = B, which is close to (12). Here, the block right-hand side B equals B := b[1, 1, ..., 1] ∈ C^{N×N_ω} or B := [b_1, b_2, ..., b_{N_ω}], depending on whether we consider a constant source term for each frequency as in (1) or allow variations. IDR(s) for solving (11) uses the same recursions described in Section 3.1 acting on block matrices. The main differences with the original IDR(s) algorithm of [38] are the substitution of the matrix-vector product Ax by the application of the linear operator A(X), and the use of Frobenius inner products, see Definition 2. Note that two prominent long-recurrence Krylov methods have been generalized to the solution of linear matrix equations in [15] using a similar approach. In Algorithm 1, we present IDR(s) for solving the matrix equation (11) with biorthogonal residuals (see details in [3, 14]). The preconditioner used in Algorithm 1 is described in the following section.

Definition 2 (Frobenius inner product, [15]) The Frobenius inner product of two matrices A, B of the same size is defined as ⟨A, B⟩_F := tr(A^H B), where tr(·) denotes the trace of the matrix A^H B. The Frobenius norm is, thus, given by ‖A‖²_F := ⟨A, A⟩_F.

4 Multilevel Sequentially Semiseparable Preconditioning Techniques

Semiseparable matrices [4] and the more general concept of sequentially semiseparable (SSS) matrices [6, 7] are structured matrices represented by a set of generators. Matrices that arise from the discretization of 1D partial differential equations typically have an SSS structure [29], and submatrices taken from the strictly lower/upper-triangular part yield generators of low rank. Multiple applications from different areas can be found [1, 16, 3] that exploit this structure. Multilevel sequentially semiseparable (MSSS) matrices generalize SSS matrices to the case when d > 1.
Again, discretizations of higher-dimensional PDEs give rise to matrices that have an MSSS structure [27], and the multilevel paradigm yields a hierarchical matrix structure with MSSS generators that are themselves MSSS matrices of a lower hierarchical level. This way, at the lowest level, generators are SSS matrices. The advantage of Cartesian grids in higher dimensions and the resulting structure of the corresponding discretization matrices depicted in Figure 2 is directly exploited in MSSS matrix computations. For unstructured meshes we refer to [44], where hierarchically semiseparable (HSS) matrices are used. MSSS preconditioning techniques were first studied for PDE-constrained optimization problems in [27] and later extended to computational fluid dynamics problems [28]. In this work, we apply MSSS matrix computations to precondition the time-harmonic elastic wave equation. Appropriate splitting of the 3D elastic operator leads to a sequence of 2D problems in level-2 MSSS structure. An efficient preconditioner for 2D problems is based on model order reduction of level-1 SSS matrices.

Algorithm 1 Preconditioned IDR(s) for matrix equations [3]
1:  procedure PIDR(s)
2:    Input: A(·) as defined in (11), B ∈ C^{N×N_ω}, tol ∈ (0,1), s ∈ N_+, P = [P_1, ..., P_s] ∈ C^{N×(s·N_ω)}, initial guess X_0 ∈ C^{N×N_ω}, preconditioner P(·)
3:    Output: X such that ‖B − A(X)‖_F / ‖B‖_F ≤ tol
4:    G = 0 ∈ C^{N×(s·N_ω)}, U = 0 ∈ C^{N×(s·N_ω)}
5:    M = I_s ∈ C^{s×s}, ξ = 1
6:    R = B − A(X_0)
7:    while ‖R‖_F ≥ tol·‖B‖_F do
8:      Compute φ_i = [f]_i = ⟨P_i, R⟩_F for i = 1, ..., s
9:      for k = 1 to s do
10:       Solve c from Mc = f, (γ_1, ..., γ_s)^T = c
11:       V = R − Σ_{i=k}^{s} γ_i G_i
12:       V = P^{-1}(V)                    ▷ Apply preconditioner, see Section 4
13:       U_k = Uc + ξV
14:       G_k = A(U_k)
15:       for i = 1 to k−1 do
16:         α = ⟨P_i, G_k⟩_F / µ_{i,i}
17:         G_k = G_k − αG_i
18:         U_k = U_k − αU_i
19:       end for
20:       µ_{i,k} = ⟨P_i, G_k⟩_F, [M]_{i,k} = µ_{i,k}, for i = k, ..., s
21:       β = φ_k / µ_{k,k}
22:       R = R − βG_k
23:       X = X + βU_k
24:       if k+1 ≤ s then
25:         φ_i = 0 for i = 1, ..., k
26:         φ_i = φ_i − βµ_{i,k} for i = k+1, ..., s
27:       end if
28:       Overwrite k-th block of G and U by G_k and U_k
29:     end for
30:     V = P^{-1}(R)                      ▷ Apply preconditioner, see Section 4
31:     T = A(V)
32:     ξ = ⟨T, R⟩_F / ⟨T, T⟩_F
33:     ρ = ⟨T, R⟩_F / (‖T‖_F ‖R‖_F)
34:     if |ρ| < 0.7 then
35:       ξ = 0.7 ξ / |ρ|
36:     end if
37:     R = R − ξT
38:     X = X + ξV
39:   end while
40:   return X ∈ C^{N×N_ω}
41: end procedure
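The only nonstandard ingredients of Algorithm 1 compared to the vector case are the operator application A(·) and the Frobenius inner product of Definition 2, which in Python is a one-liner (a small sanity check is included; the helper name is ours):

```python
import numpy as np

def frob_inner(A, B):
    """Frobenius inner product <A, B>_F = tr(A^H B), computed
    entrywise without forming the product A^H B."""
    return np.vdot(A, B)   # vdot conjugates its first argument

A = np.arange(6.0).reshape(3, 2) + 1j
assert np.isclose(frob_inner(A, A).real, np.linalg.norm(A, 'fro') ** 2)
```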

4.1 Definitions and basic SSS operations

We present the formal definition of an SSS matrix used on the 1D level in Definition 3.

Definition 3 (SSS matrix structure, [6]) Let A be an n×n block matrix in SSS structure such that A can be written in the following block-partitioned form,

  A_ij = U_i W_{i+1} ··· W_{j−1} V_j^H,  if i < j,
  A_ij = D_i,                            if i = j,   (14)
  A_ij = P_i R_{i−1} ··· R_{j+1} Q_j^H,  if i > j.

Here, the superscript H denotes the conjugate transpose of a matrix. The matrices {U_s, W_s, V_s, D_s, P_s, R_s, Q_s}_{s=1}^{n} are called generators of the SSS matrix A, with their respective dimensions given in Table 1. As a short-hand notation for (14), we use A = SSS(P_s, R_s, Q_s, D_s, U_s, W_s, V_s). The special case of an SSS matrix when n = 4 is presented in the appendix.

Table 1: Generator sizes for the SSS matrix A in Definition 3. Note that m_1 + ··· + m_n equals the dimension of A.

  U_i: m_i × k_i      W_i: k_{i−1} × k_i    V_i: m_i × k_{i−1}    D_i: m_i × m_i
  P_i: m_i × l_{i−1}  R_i: l_i × l_{i−1}    Q_i: m_i × l_i

Basic operations such as addition, multiplication and inversion are closed under SSS structure and can be performed in linear computational complexity if the k_i and l_i in Table 1 are bounded by a constant (note that U_2 has rank 1 in Figure 2). The rank of the off-diagonal blocks, formally defined as the semiseparable order in Definition 4, plays an important role in the computational complexity analysis of SSS matrix computations.

Definition 4 (Semiseparable order, [11]) Let A be an n×n block matrix with SSS structure as in Definition 3. We use a MATLAB-style of notation, i.e., A(i:j, k:l) selects rows of blocks from i to j and columns of blocks from k to l of a matrix A. Let

  rank A(s+1:n, 1:s) = l_s,  s = 1, 2, ..., n−1,

and let further

  rank A(1:s, s+1:n) = u_s,  s = 1, 2, ..., n−1.

Setting r_l := max{l_s} and r_u := max{u_s}, we call r_l the lower semiseparable order and r_u the upper semiseparable order of A, respectively.

If the upper and lower semiseparable order are bounded by, say, r*, i.e., {r_l, r_u} ≤ r*, then the computational cost for the SSS matrix computations is of O((r*)³ n) complexity [6], where n is the number of blocks of the SSS matrix as introduced in Definition 3. We will refer to r* as the maximum off-diagonal rank. Matrix-matrix operations are closed under SSS structure, but performing SSS matrix computations will increase the semiseparable order, cf. [6]. We use model order reduction in the sense of Definition 5 in order to bound the semiseparable order. Using the aforementioned definition of semiseparable order, we next introduce the following lemma to compute the (exact) LU factorization of an SSS matrix.

Lemma 1 (LU factorization of an SSS matrix) Let A = SSS(P_s, R_s, Q_s, D_s, U_s, W_s, V_s) be given in generator form with semiseparable order (r_l, r_u). Then the factors of an LU factorization of A are given by the following generators representation,

  L = SSS(P_s, R_s, Q̂_s, D_s^L, 0, 0, 0),
  U = SSS(0, 0, 0, D_s^U, Û_s, W_s, V_s).

The generators of L and U are computed by Algorithm 2. Moreover, L has semiseparable order (r_l, 0), and U has semiseparable order (0, r_u).
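To make Definition 3 concrete, the following sketch assembles the dense matrix represented by a set of SSS generators (given as 0-based lists of NumPy blocks with the shapes of Table 1). It is meant for checking small examples only; the fast arithmetic of this section works on the generators directly and never forms the dense matrix:

```python
import numpy as np

def sss_to_dense(P, R, Q, D, U, W, V):
    """Assemble the block matrix (14) from SSS generators:
    upper part U_i W_{i+1} ... W_{j-1} V_j^H, diagonal D_i,
    lower part P_i R_{i-1} ... R_{j+1} Q_j^H."""
    n = len(D)
    blocks = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                blocks[i][j] = D[i]
            elif i < j:
                T = U[i]
                for k in range(i + 1, j):      # W_{i+1} ... W_{j-1}
                    T = T @ W[k]
                blocks[i][j] = T @ V[j].conj().T
            else:
                T = P[i]
                for k in range(i - 1, j, -1):  # R_{i-1} ... R_{j+1}
                    T = T @ R[k]
                blocks[i][j] = T @ Q[j].conj().T
    return np.block(blocks)
```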
Algorithm 2 LU factorization and inversion of an SSS matrix A [7, 4]
1:  procedure INV_SSS(A)
2:    Input: A = SSS(P_s, R_s, Q_s, D_s, U_s, W_s, V_s) in generator form
3:    // Perform LU factorization
4:    D_1 =: D_1^L D_1^U                        ▷ LU factorization on generator level
5:    Let Û_1 := (D_1^L)^{-1} U_1, and Q̂_1 := (D_1^U)^{-H} Q_1
6:    for i = 2 : n−1 do
7:      if i = 2 then
8:        M_i := Q̂_{i−1}^H Û_{i−1}
9:      else
10:       M_i := Q̂_{i−1}^H Û_{i−1} + R_{i−1} M_{i−1} W_{i−1}
11:     end if
12:     (D_i − P_i M_i V_i^H) =: D_i^L D_i^U    ▷ LU factorization of generators
13:     Let Û_i := (D_i^L)^{-1} (U_i − P_i M_i W_i), and
14:     let Q̂_i := (D_i^U)^{-H} (Q_i − V_i M_i^H R_i^H)
15:   end for
16:   M_n := Q̂_{n−1}^H Û_{n−1} + R_{n−1} M_{n−1} W_{n−1}
17:   (D_n − P_n M_n V_n^H) =: D_n^L D_n^U      ▷ LU factorization of generators
18:   // Perform inversion
19:   L := SSS(P_s, R_s, Q̂_s, D_s^L, 0, 0, 0)
20:   U := SSS(0, 0, 0, D_s^U, Û_s, W_s, V_s)
21:   A^{-1} = U^{-1} L^{-1}                    ▷ SSS inversion (App. A) & MatMul (App. B)
22: end procedure

Definition 5 (Model order reduction of an SSS matrix) Let A = SSS(P_s, R_s, Q_s, D_s, U_s, W_s, V_s) be an SSS matrix with lower order numbers l_s and upper order numbers u_s. The SSS matrix Ã = SSS(P̃_s, R̃_s, Q̃_s, D̃_s, Ũ_s, W̃_s, Ṽ_s) is called a reduced order approximation of A, if ‖A − Ã‖₂ is small, and for the lower and upper order numbers it holds, l̃_s < l_s, ũ_s < u_s, for all 1 ≤ s ≤ n.

4.2 Approximate block-LU decomposition using MSSS computations for 2D problems

Similar to Definition 3 for SSS matrices, the generators representation for MSSS matrices (level-k SSS matrices) is given in Definition 6.

Definition 6 (MSSS matrix structure, [27]) The matrix A is said to be a level-k SSS matrix if it has a form like (14) and all its generators are level-(k−1) SSS matrices. The level-1 SSS matrix is the SSS matrix that satisfies Definition 3. We call A to be in MSSS matrix structure if k > 1.

Most operations for SSS matrices can directly be extended to MSSS matrix computations. In order to perform a matrix-matrix multiplication of two MSSS matrices in linear computational complexity, model order reduction, which is studied in [6, 27, 28], is necessary to keep the computational complexity low. The preconditioner (10) for a 2D elastic problem is of level-2 MSSS structure. We present a block-LU factorization of a level-2 MSSS matrix in this section. Therefore, model order reduction is necessary, which results in an approximate block-LU factorization. This approximate factorization can be used as a preconditioner for IDR(s) in Algorithm 1. On a two-dimensional Cartesian grid, the preconditioner (10) has a 2×2 block structure as presented in Figure 3 (left).

Definition 7 (Permutation of an MSSS matrix, [27]) Let P(τ) be a 2×2 level-2 MSSS block matrix arising from the FEM discretization of (10) using linear B-splines (p = 1),

  P(τ) = [P_11 P_12; P_21 P_22] ∈ C^{2n_x n_y × 2n_x n_y},   (15)

with block entries being level-2 MSSS matrices in generator form,

  P_11 = MSSS(P_s^{11}, R_s^{11}, Q_s^{11}, D_s^{11}, U_s^{11}, W_s^{11}, V_s^{11}),   (17a)
  P_12 = MSSS(P_s^{12}, R_s^{12}, Q_s^{12}, D_s^{12}, U_s^{12}, W_s^{12}, V_s^{12}),   (17b)
  P_21 = MSSS(P_s^{21}, R_s^{21}, Q_s^{21}, D_s^{21}, U_s^{21}, W_s^{21}, V_s^{21}),   (17c)
  P_22 = MSSS(P_s^{22}, R_s^{22}, Q_s^{22}, D_s^{22}, U_s^{22}, W_s^{22}, V_s^{22}),   (17d)

where 1 ≤ s ≤ n_x. Note that all generators in (17a)-(17d) are SSS matrices of (fixed) dimension n_y. Let {m_s}_{s=1}^{n} be the dimensions of the diagonal generators of such an SSS matrix, cf. Table 1, with Σ_{s=1}^{n} m_s = n_y. Then there exists a permutation matrix Ψ, ΨΨ^T = Ψ^TΨ = I, given by

  Ψ = I_{n_x} ⊗ Ψ_{1D},  Ψ_{1D} = [ blkdiag_{s=1}^{n}([I_{m_s} 0]) ; blkdiag_{s=1}^{n}([0 I_{m_s}]) ],   (16)

such that P_2D(τ) = Ψ^T P(τ)Ψ is of global MSSS level-2 structure.

We illustrate the effect of the permutation matrix Ψ in Figure 3. For a matrix (10) that results from a discretization of the 2D time-harmonic elastic wave equation, P_2D is of block tri-diagonal MSSS structure.

Corollary 1 (Block tri-diagonal permutation) Consider in Definition 7 the special case that the block entries in (15) are given as,

  P_11 = MSSS(P_s^{11}, 0, Ĩ, D_s^{11}, U_s^{11}, 0, Ĩ),   (18a)
  P_12 = MSSS(P_s^{12}, 0, Ĩ, D_s^{12}, U_s^{12}, 0, Ĩ),   (18b)
  P_21 = MSSS(P_s^{21}, 0, Ĩ, D_s^{21}, U_s^{21}, 0, Ĩ),   (18c)
  P_22 = MSSS(P_s^{22}, 0, Ĩ, D_s^{22}, U_s^{22}, 0, Ĩ),   (18d)

with the rectangular matrix Ĩ = [I, 0]. Then the matrix Ψ^T P(τ)Ψ is of block tri-diagonal MSSS structure.

Proof This result follows from formula (2.13) of Lemma 2.4 in the original proof [27] when generators R_s^{ij} = W_s^{ij} = 0 for i, j ∈ {1, 2}.

Fig. 3: A spy plot of P(τ) for the wedge problem (left) and Ψ^T P(τ)Ψ (right) for d = p = 2, and nnz = 1,587 in both cases. Clearly, the permutation leads to a reduction in bandwidth, and the permuted matrix is block tri-diagonal.

If the matrix (15) is sparse, it is advisable to use a sparse data structure on generator-level for (18a)-(18d) as well. Because of Corollary 1, the permuted 2D preconditioner can be written as,

  P_2D = Ψ^T P(τ)Ψ = [ P_{1,1} P_{1,2}            ]
                     [ P_{2,1} P_{2,2} P_{2,3}    ]
                     [        ⋱      ⋱      ⋱    ]
                     [         P_{n_x,n_x−1} P_{n_x,n_x} ],   (19)
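The action of Ψ is easiest to see as an index permutation: along one grid line it interleaves the x- and y-displacement unknowns. A small sketch for the simplified case m_s = 1 (our own illustration of Definition 7, not the paper's code):

```python
import numpy as np

def interleave_indices(n):
    """Map [x_1, ..., x_n, y_1, ..., y_n] (lexicographic ordering, all
    x-components first) to [x_1, y_1, x_2, y_2, ...] (nodal ordering),
    i.e. the action of Psi_1D for m_s = 1."""
    idx = np.empty(2 * n, dtype=int)
    idx[0::2] = np.arange(n)           # x-components
    idx[1::2] = np.arange(n, 2 * n)    # y-components
    return idx

# symmetric permutation of a matrix, Psi^T A Psi, cf. Figure 3:
idx = interleave_indices(4)
A = np.arange(64.0).reshape(8, 8)
A_perm = A[np.ix_(idx, idx)]
```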

with block entries P_{i,j} in SSS format according to Definition 3, compare Figure 3 (right). We perform a block-LU factorization of the form P_2D = LSU, with

  L_{i,j} = { I                 if i = j,          U_{i,j} = { I                 if j = i,
            { P_{i,j} S_j^{-1}  if i = j+1,                  { S_i^{-1} P_{i,j}  if j = i+1,   (20)

and Schur complements given by

  S_i = { P_{i,i}                                     if i = 1,
        { P_{i,i} − P_{i,i−1} S_{i−1}^{-1} P_{i−1,i}  if 2 ≤ i ≤ n_x.   (21)

The Schur complements in (20)-(21) are SSS matrices, and inverses can be computed with Algorithm 2. From Lemma 1, we conclude that this does not increase the respective off-diagonal ranks. However, in (20)-(21), we also need to perform matrix-matrix multiplications and additions of SSS matrices which lead to an increase in rank, cf. [6] and Appendix B. Therefore, we apply model order reduction in the sense of Definition 5 at each step i of the recursion (21) in order to limit the off-diagonal rank. An algorithm that limits the off-diagonal ranks to a constant, say r*, can be found in [27]. This leads to approximate Schur complements and, hence, an inexact LU factorization. In Experiment 1, we show that for small off-diagonal ranks this approach results in a very good preconditioner for 2D elastic problems.

4.3 SSOR splitting using MSSS computations for 3D problems

For 3D problems, we consider a nodal-based FEM discretization of (1) with n_z being the outermost dimension, as shown in Figure 4 for different orders of B-splines. In order to derive a memory-efficient algorithm for 3D problems, we consider the matrix splitting,

  P_3D(τ) = L + Ŝ + U,  Ŝ = blkdiag(Ŝ_1, ..., Ŝ_{n_z}),   (22)

where L and U are the (sparse) strictly lower and strictly upper parts of P_3D(τ), and Ŝ is a block-diagonal matrix with blocks Ŝ_i being in level-2 MSSS structure. This data structure is illustrated in Figure 5a. According to [34, Section 4.1.2], the SSOR preconditioner based on the splitting (22) is given by,

  P_3D(τ) = 1/(η(2−η)) · (ηL + Ŝ) Ŝ^{-1} (ηU + Ŝ),

which for η = 1 equals,

  P_3D(τ) = (LŜ^{-1} + I) Ŝ (Ŝ^{-1}U + I).   (23)

Fig. 4: Nodal-based discretization of P_3D(τ) in 3D for different degrees of the FEM basis functions: (a) p = 1, (b) p = 2, (c) p = 3.

In (23) we note that this decomposition coincides with the 2D approach (20)-(21) when the term P_{i,i−1} S_{i−1}^{-1} P_{i−1,i} in the Schur complements (21) is neglected. This choice avoids a rank increase due to multiplication and addition, but yields a worse preconditioner than in 2D. The block entries Ŝ_i, i = 1, ..., n_z, are in level-2 MSSS structure and, hence, formulas (20)-(21) can be applied sequentially for the inverses that appear in (23). In order to invert the level-1 SSS matrices that recursively appear in (21), we use Algorithm 2. On the generator level, we use suitable LAPACK routines; cf. Table 2 for an overview of the different algorithms used at each level.

Table 2: Overview of algorithms applied at different levels for the (approximate) inversion of the preconditioner (23).

  level       algorithm for (·)^{-1}      datatype
  3D MSSS     SSOR decomposition (23)     sparse + level-2 SSS
  2D MSSS     Schur (20)-(21) & MOR       block tri-diagonal level-2 SSS
  1D SSS      Algorithm 2                 level-1 SSS (14)
  generator   LAPACK routines             set of sparse matrices

We illustrate the data structure of the preconditioner (23) in 3D for the case of linear B-splines (p = 1) in Figure 5. On level-3, we use a mixed data format that is most memory-efficient for the splitting (22). Since only the diagonal blocks need to be inverted, we convert those to level-2 MSSS format, and keep the off-diagonal blocks of L and U in sparse format.

Fig. 5: Nested data structure for the preconditioner (19) after permutation for d = 3 and p = 1: (a) level-3 mixed format with sparse (coo) off-diagonal blocks and level-2 MSSS diagonal blocks, (b) level-2 SSS structure, cf. Definition 6, (c) level-1 SSS structure with generators D_1, U_1 V_2^H, P_2 Q_1^H, ..., D_n, cf. Definition 3. With coo we abbreviate the coordinate-based sparse data structure as used, for instance, in [33].

For p > 1, we apply the permutation of Definition 7 to each diagonal block of Ŝ, cf. Figure 6. This way, the Schur decomposition described in Section 4.2 can be applied for inverting block tri-diagonal level-2 MSSS matrices.
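Applying the SSOR preconditioner (23) with η = 1 amounts to one block forward and one block backward substitution over the planes Ŝ_i. The sketch below uses dense stand-ins for the Ŝ_i where the actual implementation applies the level-2 MSSS machinery of Section 4.2; function and variable names are ours:

```python
import numpy as np

def ssor_apply(L_blk, S_blk, U_blk, r):
    """Solve (L + S) S^{-1} (U + S) z = r, the eta = 1 form of (23), by
    block substitution.  S_blk[i] is the diagonal block S_i (dense
    stand-in for a level-2 MSSS matrix); L_blk[i] couples plane i to
    plane i-1 (L_blk[0] unused); U_blk[i] couples plane i to plane i+1
    (U_blk[-1] unused)."""
    nz = len(S_blk)
    parts = np.split(r, np.cumsum([S.shape[0] for S in S_blk])[:-1])
    w = []
    for i in range(nz):                    # forward solve (L + S) w = r
        rhs = parts[i] - (L_blk[i] @ w[i - 1] if i > 0 else 0.0)
        w.append(np.linalg.solve(S_blk[i], rhs))
    z = [None] * nz
    for i in reversed(range(nz)):          # backward solve (U + S) z = S w
        rhs = S_blk[i] @ w[i] - (U_blk[i] @ z[i + 1] if i < nz - 1 else 0.0)
        z[i] = np.linalg.solve(S_blk[i], rhs)
    return np.concatenate(z)
```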

Fig. 6: Permutation on level-2 leads to a block tri-diagonal level-2 MSSS matrix for p > 1: (a) Ŝ_i, 1 ≤ i ≤ n_z, (b) Ψ^T Ŝ_i Ψ.

4.4 Memory analysis for 2D and 3D MSSS preconditioners

We finish our description of MSSS preconditioners with a memory analysis of the suggested algorithms described for 2D problems in Section 4.2, and for 3D problems in Section 4.3, respectively. The following Corollary 2 shows that in both cases we obtain linear memory requirements in terms of the problem size (8).

Fig. 7: Schematic illustration: The diagonal blocks Ŝ_1, Ŝ_2, ..., Ŝ_{n_z} of Ŝ in (22) correspond to a sequence of n_z 2D problems in the xy-plane.

Corollary 2 (Linear memory requirement) Consider p = 1 and a three-dimensional problem of size n_x × n_y × n_z. For simplicity, we assume on the generator-level m_i ≡ m, with m being a constant and d ∈ {2,3}, and the off-diagonal ranks of the inverse Schur complements S_i in (21) being limited by k_i = l_i ≤ r*. The grid size in y-direction on level-1 implies n generators via n = d·n_y·m^{-1}. The memory requirement of the preconditioners P_2D and P_3D presented in Section 4.2 and Section 4.3, respectively, is linear in the respective problem dimension (8).

Proof Consider the preconditioner P_2D = LSU given by (20)-(21). Besides blocks of the original operator, an additional storage of n_x inverse Schur complements S_i^{-1} in SSS format is required,

  mem(P_2D^{-1}, r*) = mem(P_2D) + Σ_{i=1}^{n_x} mem(S_i^{-1}, r*) ∈ O(n_x n_y).

The approximate Schur decomposition described in Section 4.2 allows dense, full-rank diagonal generators D_i, 1 ≤ i ≤ n, of size m×m, and limits the rank of all off-diagonal generators by r* using model order reduction techniques:

  mem(S_i^{-1}, r*) = n·m² + 4(n−1)·m·r* + 2(n−2)·(r*)² ∈ O(n_y),

where the three terms account for the generators D_i, {U_i, V_i, P_i, Q_i}, and {W_i, R_i}, respectively. Concerning the memory requirement for storing P_2D in MSSS format, we first note that the permutation described in Corollary 1 does not affect the memory consumption. Since we use sparse generators in (18a)-(18d), the memory requirement is of the same order as the original, sparse matrix (10) obtained from the FEM discretization. For 3D problems, we suggest the usage of P_3D as in (23) based on the splitting (22). For the data structure, we keep the strictly lower and upper diagonal parts in sparse format and convert the diagonal blocks to level-2 MSSS format, cf. Figure 7,

  mem(P_3D^{-1}, r*) = n_z · mem(P_2D^{-1}, r*) + nnz(L) + nnz(U) ∈ O(n_x n_y n_z).
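The count in the proof of Corollary 2 is easy to evaluate; the following small helper (our own sketch) reproduces mem(S_i^{-1}, r*) and the resulting linear scaling in n_y:

```python
def mem_schur_inverse(n, m, r):
    """Storage of one inverse Schur complement in SSS generator form:
    n dense m x m diagonal generators D_i, rank-r generators
    U_i, V_i, P_i, Q_i (size m x r) and W_i, R_i (size r x r),
    cf. the proof of Corollary 2."""
    return n * m**2 + 4 * (n - 1) * m * r + 2 * (n - 2) * r * r

# doubling n_y (hence n = d*n_y/m) roughly doubles the storage:
for n in (250, 500, 1000):
    print(n, mem_schur_inverse(n, m=4, r=15))
```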

5 Numerical experiments

We present numerical examples for the two-dimensional, elastic Marmousi-II model [19] as well as for a three-dimensional elastic wedge problem which has been inspired by the well-known acoustic test case introduced in [17, 24], for 2D and 3D, respectively. (All test cases are publicly available from the author's github repository [4].) In the examples, we restrict ourselves to Cartesian grids with fixed discretization size h ≡ h_x = h_y = h_z. Depending on the specific problem parameters, the maximum frequency we allow is restricted by,

  f_max < min_{x∈Ω} {c_p, c_s} / (ppw · h),  ppw = 20,

where in the following experiments a minimum of 20 points per wavelength (ppw) is guaranteed, and ω_k = 2π f_k.

All numerical examples presented in this section have been implemented in FORTRAN 90 using the GNU/gfortran compiler running over GNU/Debian Linux, and executed on a computer with four Intel i5 CPUs and 32 GB of RAM.

5.1 Parameter studies

We begin our numerical tests with a sequence of experiments performed on an academic two-dimensional wedge problem described in Figure 8. The aim of these first experiments is to prove the following concepts for the 2D algorithm introduced in Section 4.2:

- Demonstrate the dependency on the maximum off-diagonal rank, r* = max{r_l, r_u}. In Experiment 1 we show that a small value of r* leads to a very good preconditioner in terms of the number of Krylov iterations.
- Show that the 2D algorithm yields linear computational complexity when all problem parameters are unchanged and the grid size doubles (Experiment 2).
- In Experiments 3 and 4, we evaluate the frequency dependency of the MSSS preconditioner (10) when τ ≠ ω. This is in particular important when multiple frequencies in a matrix equation framework are considered in Section 5.2.

We perform parameter studies on a two-dimensional slice (xz-plane) of the wedge problem described in Figure 14. The values of ρ, c_p and c_s in the respective layers are given in Table 3, and the considered computational domain Ω = [0,600] × [0,1000] meters is shown in Figure 8.

Table 3: Parameter configuration of the elastic wedge problem: density ρ [kg/m³], P-wave speed c_p [m/s] and S-wave speed c_s [m/s] for layers #1-#3. The Lamé parameters can be computed via (6).

Fig. 8: 2D elastic wedge problem used for parameter study: Speed of S-waves in m/s (left) and real part of z-component of displacement vector at f = 16 Hz (right).

In the first set of experiments, we restrict ourselves to the single-frequency case, N_ω = 1. The discrete problem is, thus, given by,

  (K + iωC − ω²M) x = b,

with a preconditioner that approximates the original operator, P(τ) ≈ (K + iτC − τ²M), τ = ω, by taking low-rank approximations in the block-LU factorization.

Experiment 1 (Off-diagonal rank) This experiment evaluates the performance of the MSSS preconditioner (19) for 2D problems when the maximal off-diagonal rank r* is increased.

In Experiment 1, we apply the approximate block-LU decomposition (20)-(21) as described in Section 4.2 to the 2D wedge problem at frequencies f = 8 Hz and f = 16 Hz. The maximum off-diagonal rank r* = max{r_l, r_u} of the Schur complements (21) is restricted using model order reduction techniques, cf. [27]. The dimension of the diagonal generators has been chosen to be m_i = 4, cf. Table 1. Figure 9 shows the convergence behavior of preconditioned IDR(4) (Algorithm 1 with N_ω = 1) and preconditioned BiCGStab [42]. We note that even in the high-frequency case, an off-diagonal rank of r* = 10 leads to a very efficient preconditioner, and an (outer) Krylov method that converges within at most 40 iterations to a residual tolerance tol = 1e-8. Moreover, we observe that IDR(s) outperforms BiCGStab in the considered example when the same preconditioner is applied. For a rank r* > 15, we observe convergence within very few iterations.

Fig. 9: Quality of the preconditioner for the 2D wedge problem: Number of Krylov iterations of IDR(4) and BiCGStab at f = 8 Hz and f = 16 Hz when the maximum off-diagonal rank of the inverse Schur complements is restricted to r*.

Experiment 2 (Computational complexity in 2D) The inexact block-LU factorization yields linear computational complexity when applied as a preconditioner within MSSS-preconditioned IDR(s), demonstrated for the 2D wedge problem.

In our second numerical experiment, the maximum off-diagonal rank is fixed to r* = 15 such that very few IDR iterations are required, and the computational costs in Figure 10 are dominated by the MSSS preconditioner. We solve the 2D wedge problem at frequency 8 Hz for different mesh sizes and a finite element discretization with B-splines of degree p = {1, 2}. In Figure 10, the CPU time is recorded for different problem sizes: The mesh size h is halved in both spatial directions such that the number of unknowns quadruples according to (8). From our numerical experiments we see that the CPU time increases by a factor of 4 for both linear and quadratic splines. This gives numerical proof that the 2D MSSS computations are performed in linear computational complexity.

Fig. 10: Linear computational complexity of MSSS-preconditioned IDR(4) for the 2D wedge problem at f = 8 Hz: computation time t[s] over problem size for p = 1 and p = 2 at mesh sizes h = 10m, 5m, 2.5m, 1.25m.

Experiment 3 (Constant points per wavelength) Convergence behavior of MSSS-preconditioned IDR(s) when the problem size and wave frequency are increased simultaneously.

In the previous example, the wave frequency is kept constant while the problem size is increased, which is of little practical use due to oversampling. We next increase the wave frequency and the mesh size simultaneously such that a constant number of points per wavelength, ppw = 20, is guaranteed. In Table 4, we use the freedom in choosing the maximum off-diagonal rank parameter r* such that the overall preconditioned IDR(s) algorithm converges within a total number of iterations that grows linearly with the frequency. This particular choice of r* shows that the MSSS preconditioner has comparable performance to the multi-grid approaches in [22, 32], where the authors numerically prove O(n³) complexity for 2D problems of size n_x = n_y ≡ n.

Table 4: Performance of the MSSS preconditioner when problem size and frequency are increased simultaneously such that ppw = 20 and tol = 1e-8: O(n³) complexity.

  f        IDR(4)       total CPU time
  4 Hz     16 iter.     0.71 sec
  8 Hz     33 iter.     4.2 sec
  16 Hz    62 iter.
  32 Hz    110 iter.

The off-diagonal rank parameter r* can on the other hand be used to tune the preconditioner in such a way that the number of IDR iterations is kept constant for various problem sizes. In Table 5, we show that a constant number of roughly 30 IDR iterations can be achieved by a moderate increase of r*, which yields an algorithm that is nearly linear.

Table 5: Performance of the MSSS preconditioner when problem size and frequency are increased simultaneously such that ppw = 20 and tol = 1e-8: Constant number of iterations.

  f        IDR(4)       total CPU time
  4 Hz     29 iter.     0.83 sec
  8 Hz     33 iter.     4.2 sec
  16 Hz    27 iter.
  32 Hz    33 iter.

Experiment 4 (Quality of P_2D(τ) when τ ≠ ω) Single-frequency experiments when the seed frequency differs from the frequency of the original problem.

We perform the experiment for different frequencies, and keep a constant grid size h = 5m and residual tolerance tol = 1e-8.

Fig. 11: Number of iterations of preconditioned IDR(4) when τ ≠ ω in (19): the seed frequency τ is varied from 0.4ω to 1.6ω for single-frequency problems at f = 2, 3, 4, 5 Hz.

This experiment bridges to the multi-frequency case. We consider single-frequency problems at f ∈ {2, 3, 4, 5} Hz, and vary the parameter τ of the preconditioner (19). The off-diagonal rank r* is chosen sufficiently large such that fast convergence is obtained when τ = ω. From Figure 11 we conclude that the quality of the preconditioner heavily relies on the seed frequency, and a fast convergence of preconditioned IDR(4) is only guaranteed when τ is close to the original frequency.

5.2 The elastic Marmousi-II model

We now consider the case when N_ω > 1, and the matrix equation,

  KX + iCXΣ − MXΣ² = B,  X ∈ C^{N×N_ω},   (24)

is solved. Note that this way we can incorporate multiple wave frequencies in the diagonal matrix Σ = diag(ω_1, ..., ω_{N_ω}), and different source terms lead to a block right-hand side of the form B = [b_1, ..., b_{N_ω}]. When multiple frequencies are present, the choice of the seed frequency τ is crucial, as we demonstrate for the Marmousi-II problem in Experiment 6. We solve the matrix equation (24) arising from the realistic Marmousi-II problem [19]. We consider a subset of the computational domain, Ω = [0,4000] × [0,1850] m, as suggested in [32], cf. Figure 12.

Fig. 12: Speed of S-waves in m/s (top), and real part of the z-component of the displacement vector in frequency-domain at f = 4 Hz (middle) and f = 6 Hz (bottom) for the Marmousi-II model, cf. [19] for a complete parameter set. The source location is marked in the plots.

Experiment 5 (Marmousi-II at multiple right-hand sides) Performance of the MSSS-preconditioned IDR(s) method for the two-dimensional Marmousi-II problem when multiple source locations are present.

We consider the Marmousi-II problem depicted in Figure 12 at h = 5m and frequency f = 2 Hz. We present the performance of MSSS-preconditioned IDR(4) for N_ω equally-spaced source locations (right-hand sides) in Table 6. The CPU time required for the preconditioner as well as the iteration count is constant when N_ω > 1 because we consider a single frequency. The overall wall clock time, however, scales better than N_ω due to the efficient implementation of block matrix-vector products in the IDR algorithm. The experiment for N_ω = 2 shows that there is an optimal number of right-hand sides for a single-core algorithm.

Table 6: Numerical experiments for the Marmousi-II problem at f = 2 Hz using a maximum off-diagonal rank of r* = 15 (columns: # RHSs, MSSS factorization time [sec], IDR(4) time [sec]); IDR(4) requires 8 iterations in all four runs.

Experiment 6 (Marmousi-II at multiple frequencies) Performance of MSSS-preconditioned IDR(s) for the two-dimensional Marmousi-II problem at multiple frequencies.

Fig. 13: CPU time until convergence of MSSS-IDR(4) for N_ω frequencies equally-spaced within the interval f_k ∈ [2.4, 2.8] Hz, compared to a simple extrapolation of the smallest case.

In Experiment 6, we consider a single source term located at (L_x/2, 0)^T and N_ω frequencies equally-spaced in the interval f_k ∈ [2.4, 2.8] Hz. The seed frequency is chosen at τ = (1 − 0.5i)ω_max, for which we recorded optimal convergence behavior. When the number of frequencies is increased, we observe an improved performance compared to a simple extrapolation of the N_ω = 2 case. We also observed that the size of the interval in which the different frequencies range is crucial for the convergence behavior.

5.3 A three-dimensional elastic wedge problem

The wedge problem with parameters presented in Table 3 is extended to a third spatial dimension, resulting in Ω = [0,600] × [0,600] × [0,1000] ⊂ R³.

Fig. 14: Left: Parameter configuration of the elastic wedge problem for d = 3 according to Table 3. Right: Numerical solution of R(u_z) at f = 4 Hz.

Experiment 7 (A 3D elastic wedge problem) A three-dimensional, inhomogeneous elastic wedge problem with physical parameters specified in Table 3 is solved using the SSOR-MSSS preconditioner described in Section 4.3.

Similar to Experiment 3, we consider a constant number of 20 points per wavelength, and increase the wave frequency from 2 Hz to 4 Hz while doubling the number of grid points in each spatial direction. In Figure 15 we observe a factor of 4 in the number of iterations, which numerically indicates a complexity of O(n⁵) for 3D problems. Moreover, we note that IDR outperforms BiCGStab in terms of the number of iterations.

Fig. 15: Convergence history (relative residual norm over # iterations) of BiCGStab, IDR(4) and IDR(16) preconditioned with the SSOR-MSSS preconditioner (23) for the 3D wedge problem of Figure 14, at f = 2 Hz (h = 20m) and f = 4 Hz (h = 10m).

6 Conclusions

We present an efficient hybrid method for the numerical solution of the inhomogeneous time-harmonic elastic wave equation. We use an incomplete block-LU factorization based on MSSS matrix computations as a preconditioner for IDR(s). The presented framework further allows to incorporate multiple wave frequencies and multiple source locations in a matrix equation setting (11). The suggested MSSS preconditioner is conceptually different for 2D and 3D problems: We derive an MSSS permutation matrix (16) that transforms the 2D elastic operator into block tri-diagonal level-2 MSSS matrix structure. This allows the application of an approximate Schur factorization (20)-(21). In order to achieve linear computational complexity, the involved SSS operations (level-1) are approximated using model order reduction techniques that limit the off-diagonal rank. A generalization to 3D problems is not straight-forward because no model order reduction algorithms for level-2 MSSS matrices are currently available [27]. We therefore suggest the SSOR splitting (23) where off-diagonal blocks are treated as sparse matrices and diagonal blocks are inverted using the level-2 MSSS computations of Section 4.2.


More information

Numerical Methods in Matrix Computations

Numerical Methods in Matrix Computations Ake Bjorck Numerical Methods in Matrix Computations Springer Contents 1 Direct Methods for Linear Systems 1 1.1 Elements of Matrix Theory 1 1.1.1 Matrix Algebra 2 1.1.2 Vector Spaces 6 1.1.3 Submatrices

More information

Exploiting off-diagonal rank structures in the solution of linear matrix equations

Exploiting off-diagonal rank structures in the solution of linear matrix equations Stefano Massei Exploiting off-diagonal rank structures in the solution of linear matrix equations Based on joint works with D. Kressner (EPFL), M. Mazza (IPP of Munich), D. Palitta (IDCTS of Magdeburg)

More information

SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS. Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA

SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS. Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA 1 SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA 2 OUTLINE Sparse matrix storage format Basic factorization

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) Lecture 19: Computing the SVD; Sparse Linear Systems Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical

More information

A Robust Preconditioned Iterative Method for the Navier-Stokes Equations with High Reynolds Numbers

A Robust Preconditioned Iterative Method for the Navier-Stokes Equations with High Reynolds Numbers Applied and Computational Mathematics 2017; 6(4): 202-207 http://www.sciencepublishinggroup.com/j/acm doi: 10.11648/j.acm.20170604.18 ISSN: 2328-5605 (Print); ISSN: 2328-5613 (Online) A Robust Preconditioned

More information

A Hybrid Method for the Wave Equation. beilina

A Hybrid Method for the Wave Equation.   beilina A Hybrid Method for the Wave Equation http://www.math.unibas.ch/ beilina 1 The mathematical model The model problem is the wave equation 2 u t 2 = (a 2 u) + f, x Ω R 3, t > 0, (1) u(x, 0) = 0, x Ω, (2)

More information

On the Preconditioning of the Block Tridiagonal Linear System of Equations

On the Preconditioning of the Block Tridiagonal Linear System of Equations On the Preconditioning of the Block Tridiagonal Linear System of Equations Davod Khojasteh Salkuyeh Department of Mathematics, University of Mohaghegh Ardabili, PO Box 179, Ardabil, Iran E-mail: khojaste@umaacir

More information

S N. hochdimensionaler Lyapunov- und Sylvestergleichungen. Peter Benner. Mathematik in Industrie und Technik Fakultät für Mathematik TU Chemnitz

S N. hochdimensionaler Lyapunov- und Sylvestergleichungen. Peter Benner. Mathematik in Industrie und Technik Fakultät für Mathematik TU Chemnitz Ansätze zur numerischen Lösung hochdimensionaler Lyapunov- und Sylvestergleichungen Peter Benner Mathematik in Industrie und Technik Fakultät für Mathematik TU Chemnitz S N SIMULATION www.tu-chemnitz.de/~benner

More information

Synopsis of Numerical Linear Algebra

Synopsis of Numerical Linear Algebra Synopsis of Numerical Linear Algebra Eric de Sturler Department of Mathematics, Virginia Tech sturler@vt.edu http://www.math.vt.edu/people/sturler Iterative Methods for Linear Systems: Basics to Research

More information

In order to solve the linear system KL M N when K is nonsymmetric, we can solve the equivalent system

In order to solve the linear system KL M N when K is nonsymmetric, we can solve the equivalent system !"#$% "&!#' (%)!#" *# %)%(! #! %)!#" +, %"!"#$ %*&%! $#&*! *# %)%! -. -/ 0 -. 12 "**3! * $!#%+,!2!#% 44" #% &#33 # 4"!#" "%! "5"#!!#6 -. - #% " 7% "3#!#3! - + 87&2! * $!#% 44" ) 3( $! # % %#!!#%+ 9332!

More information

Multilevel Preconditioning of Graph-Laplacians: Polynomial Approximation of the Pivot Blocks Inverses

Multilevel Preconditioning of Graph-Laplacians: Polynomial Approximation of the Pivot Blocks Inverses Multilevel Preconditioning of Graph-Laplacians: Polynomial Approximation of the Pivot Blocks Inverses P. Boyanova 1, I. Georgiev 34, S. Margenov, L. Zikatanov 5 1 Uppsala University, Box 337, 751 05 Uppsala,

More information

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication.

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication. CME342 Parallel Methods in Numerical Analysis Matrix Computation: Iterative Methods II Outline: CG & its parallelization. Sparse Matrix-vector Multiplication. 1 Basic iterative methods: Ax = b r = b Ax

More information

Notes on PCG for Sparse Linear Systems

Notes on PCG for Sparse Linear Systems Notes on PCG for Sparse Linear Systems Luca Bergamaschi Department of Civil Environmental and Architectural Engineering University of Padova e-mail luca.bergamaschi@unipd.it webpage www.dmsa.unipd.it/

More information

Iterative methods for Linear System

Iterative methods for Linear System Iterative methods for Linear System JASS 2009 Student: Rishi Patil Advisor: Prof. Thomas Huckle Outline Basics: Matrices and their properties Eigenvalues, Condition Number Iterative Methods Direct and

More information

Migration with Implicit Solvers for the Time-harmonic Helmholtz

Migration with Implicit Solvers for the Time-harmonic Helmholtz Migration with Implicit Solvers for the Time-harmonic Helmholtz Yogi A. Erlangga, Felix J. Herrmann Seismic Laboratory for Imaging and Modeling, The University of British Columbia {yerlangga,fherrmann}@eos.ubc.ca

More information

The Conjugate Gradient Method

The Conjugate Gradient Method The Conjugate Gradient Method Classical Iterations We have a problem, We assume that the matrix comes from a discretization of a PDE. The best and most popular model problem is, The matrix will be as large

More information

Construction of a New Domain Decomposition Method for the Stokes Equations

Construction of a New Domain Decomposition Method for the Stokes Equations Construction of a New Domain Decomposition Method for the Stokes Equations Frédéric Nataf 1 and Gerd Rapin 2 1 CMAP, CNRS; UMR7641, Ecole Polytechnique, 91128 Palaiseau Cedex, France 2 Math. Dep., NAM,

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 18 Outline

More information

Linear Solvers. Andrew Hazel

Linear Solvers. Andrew Hazel Linear Solvers Andrew Hazel Introduction Thus far we have talked about the formulation and discretisation of physical problems...... and stopped when we got to a discrete linear system of equations. Introduction

More information

Iterative Methods and Multigrid

Iterative Methods and Multigrid Iterative Methods and Multigrid Part 3: Preconditioning 2 Eric de Sturler Preconditioning The general idea behind preconditioning is that convergence of some method for the linear system Ax = b can be

More information

CLASSICAL ITERATIVE METHODS

CLASSICAL ITERATIVE METHODS CLASSICAL ITERATIVE METHODS LONG CHEN In this notes we discuss classic iterative methods on solving the linear operator equation (1) Au = f, posed on a finite dimensional Hilbert space V = R N equipped

More information

Optimal multilevel preconditioning of strongly anisotropic problems.part II: non-conforming FEM. p. 1/36

Optimal multilevel preconditioning of strongly anisotropic problems.part II: non-conforming FEM. p. 1/36 Optimal multilevel preconditioning of strongly anisotropic problems. Part II: non-conforming FEM. Svetozar Margenov margenov@parallel.bas.bg Institute for Parallel Processing, Bulgarian Academy of Sciences,

More information

Numerical Methods I Solving Square Linear Systems: GEM and LU factorization

Numerical Methods I Solving Square Linear Systems: GEM and LU factorization Numerical Methods I Solving Square Linear Systems: GEM and LU factorization Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 September 18th,

More information

Concepts. 3.1 Numerical Analysis. Chapter Numerical Analysis Scheme

Concepts. 3.1 Numerical Analysis. Chapter Numerical Analysis Scheme Chapter 3 Concepts The objective of this work is to create a framework to implement multi-disciplinary finite element applications. Before starting, it is necessary to explain some basic concepts of the

More information

FEM and Sparse Linear System Solving

FEM and Sparse Linear System Solving FEM & sparse system solving, Lecture 7, Nov 3, 2017 1/46 Lecture 7, Nov 3, 2015: Introduction to Iterative Solvers: Stationary Methods http://people.inf.ethz.ch/arbenz/fem16 Peter Arbenz Computer Science

More information

Master Thesis Literature Review: Efficiency Improvement for Panel Codes. TU Delft MARIN

Master Thesis Literature Review: Efficiency Improvement for Panel Codes. TU Delft MARIN Master Thesis Literature Review: Efficiency Improvement for Panel Codes TU Delft MARIN Elisa Ang Yun Mei Student Number: 4420888 1-27-2015 CONTENTS 1 Introduction... 1 1.1 Problem Statement... 1 1.2 Current

More information

AN HELMHOLTZ ITERATIVE SOLVER FOR THE THREE-DIMENSIONAL SEISMIC IMAGING PROBLEMS?

AN HELMHOLTZ ITERATIVE SOLVER FOR THE THREE-DIMENSIONAL SEISMIC IMAGING PROBLEMS? European Conference on Computational Fluid Dynamics ECCOMAS CFD 2006 P. Wesseling, E. Oñate, J. Périaux (Eds) c TU Delft, Delft The Netherlands, 2006 AN HELMHOLTZ ITERATIVE SOLVER FOR THE THREE-DIMENSIONAL

More information

Lecture 8: Fast Linear Solvers (Part 7)

Lecture 8: Fast Linear Solvers (Part 7) Lecture 8: Fast Linear Solvers (Part 7) 1 Modified Gram-Schmidt Process with Reorthogonalization Test Reorthogonalization If Av k 2 + δ v k+1 2 = Av k 2 to working precision. δ = 10 3 2 Householder Arnoldi

More information

Preface to the Second Edition. Preface to the First Edition

Preface to the Second Edition. Preface to the First Edition n page v Preface to the Second Edition Preface to the First Edition xiii xvii 1 Background in Linear Algebra 1 1.1 Matrices................................. 1 1.2 Square Matrices and Eigenvalues....................

More information

Block-tridiagonal matrices

Block-tridiagonal matrices Block-tridiagonal matrices. p.1/31 Block-tridiagonal matrices - where do these arise? - as a result of a particular mesh-point ordering - as a part of a factorization procedure, for example when we compute

More information

Splitting Iteration Methods for Positive Definite Linear Systems

Splitting Iteration Methods for Positive Definite Linear Systems Splitting Iteration Methods for Positive Definite Linear Systems Zhong-Zhi Bai a State Key Lab. of Sci./Engrg. Computing Inst. of Comput. Math. & Sci./Engrg. Computing Academy of Mathematics and System

More information

Iterative Methods for Linear Systems

Iterative Methods for Linear Systems Iterative Methods for Linear Systems 1. Introduction: Direct solvers versus iterative solvers In many applications we have to solve a linear system Ax = b with A R n n and b R n given. If n is large the

More information

Basic Principles of Weak Galerkin Finite Element Methods for PDEs

Basic Principles of Weak Galerkin Finite Element Methods for PDEs Basic Principles of Weak Galerkin Finite Element Methods for PDEs Junping Wang Computational Mathematics Division of Mathematical Sciences National Science Foundation Arlington, VA 22230 Polytopal Element

More information

Numerical Methods for Large-Scale Nonlinear Systems

Numerical Methods for Large-Scale Nonlinear Systems Numerical Methods for Large-Scale Nonlinear Systems Handouts by Ronald H.W. Hoppe following the monograph P. Deuflhard Newton Methods for Nonlinear Problems Springer, Berlin-Heidelberg-New York, 2004 Num.

More information

ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH

ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH V. FABER, J. LIESEN, AND P. TICHÝ Abstract. Numerous algorithms in numerical linear algebra are based on the reduction of a given matrix

More information

Performance Comparison of Relaxation Methods with Singular and Nonsingular Preconditioners for Singular Saddle Point Problems

Performance Comparison of Relaxation Methods with Singular and Nonsingular Preconditioners for Singular Saddle Point Problems Applied Mathematical Sciences, Vol. 10, 2016, no. 30, 1477-1488 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2016.6269 Performance Comparison of Relaxation Methods with Singular and Nonsingular

More information

Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

More information

The flexible incomplete LU preconditioner for large nonsymmetric linear systems. Takatoshi Nakamura Takashi Nodera

The flexible incomplete LU preconditioner for large nonsymmetric linear systems. Takatoshi Nakamura Takashi Nodera Research Report KSTS/RR-15/006 The flexible incomplete LU preconditioner for large nonsymmetric linear systems by Takatoshi Nakamura Takashi Nodera Takatoshi Nakamura School of Fundamental Science and

More information

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A =

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = 30 MATHEMATICS REVIEW G A.1.1 Matrices and Vectors Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = a 11 a 12... a 1N a 21 a 22... a 2N...... a M1 a M2... a MN A matrix can

More information

1 Exercise: Linear, incompressible Stokes flow with FE

1 Exercise: Linear, incompressible Stokes flow with FE Figure 1: Pressure and velocity solution for a sinking, fluid slab impinging on viscosity contrast problem. 1 Exercise: Linear, incompressible Stokes flow with FE Reading Hughes (2000), sec. 4.2-4.4 Dabrowski

More information

ITERATIVE METHODS FOR SPARSE LINEAR SYSTEMS

ITERATIVE METHODS FOR SPARSE LINEAR SYSTEMS ITERATIVE METHODS FOR SPARSE LINEAR SYSTEMS YOUSEF SAAD University of Minnesota PWS PUBLISHING COMPANY I(T)P An International Thomson Publishing Company BOSTON ALBANY BONN CINCINNATI DETROIT LONDON MADRID

More information

1 Multiply Eq. E i by λ 0: (λe i ) (E i ) 2 Multiply Eq. E j by λ and add to Eq. E i : (E i + λe j ) (E i )

1 Multiply Eq. E i by λ 0: (λe i ) (E i ) 2 Multiply Eq. E j by λ and add to Eq. E i : (E i + λe j ) (E i ) Direct Methods for Linear Systems Chapter Direct Methods for Solving Linear Systems Per-Olof Persson persson@berkeleyedu Department of Mathematics University of California, Berkeley Math 18A Numerical

More information

Chapter 7 Iterative Techniques in Matrix Algebra

Chapter 7 Iterative Techniques in Matrix Algebra Chapter 7 Iterative Techniques in Matrix Algebra Per-Olof Persson persson@berkeley.edu Department of Mathematics University of California, Berkeley Math 128B Numerical Analysis Vector Norms Definition

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra The two principal problems in linear algebra are: Linear system Given an n n matrix A and an n-vector b, determine x IR n such that A x = b Eigenvalue problem Given an n n matrix

More information

Chapter 5 Eigenvalues and Eigenvectors

Chapter 5 Eigenvalues and Eigenvectors Chapter 5 Eigenvalues and Eigenvectors Outline 5.1 Eigenvalues and Eigenvectors 5.2 Diagonalization 5.3 Complex Vector Spaces 2 5.1 Eigenvalues and Eigenvectors Eigenvalue and Eigenvector If A is a n n

More information

J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1. March, 2009

J.I. Aliaga 1 M. Bollhöfer 2 A.F. Martín 1 E.S. Quintana-Ortí 1. March, 2009 Parallel Preconditioning of Linear Systems based on ILUPACK for Multithreaded Architectures J.I. Aliaga M. Bollhöfer 2 A.F. Martín E.S. Quintana-Ortí Deparment of Computer Science and Engineering, Univ.

More information

6. Iterative Methods for Linear Systems. The stepwise approach to the solution...

6. Iterative Methods for Linear Systems. The stepwise approach to the solution... 6 Iterative Methods for Linear Systems The stepwise approach to the solution Miriam Mehl: 6 Iterative Methods for Linear Systems The stepwise approach to the solution, January 18, 2013 1 61 Large Sparse

More information

Comparative Analysis of Mesh Generators and MIC(0) Preconditioning of FEM Elasticity Systems

Comparative Analysis of Mesh Generators and MIC(0) Preconditioning of FEM Elasticity Systems Comparative Analysis of Mesh Generators and MIC(0) Preconditioning of FEM Elasticity Systems Nikola Kosturski and Svetozar Margenov Institute for Parallel Processing, Bulgarian Academy of Sciences Abstract.

More information

Mathematics and Computer Science

Mathematics and Computer Science Technical Report TR-2007-002 Block preconditioning for saddle point systems with indefinite (1,1) block by Michele Benzi, Jia Liu Mathematics and Computer Science EMORY UNIVERSITY International Journal

More information

A High-Performance Parallel Hybrid Method for Large Sparse Linear Systems

A High-Performance Parallel Hybrid Method for Large Sparse Linear Systems Outline A High-Performance Parallel Hybrid Method for Large Sparse Linear Systems Azzam Haidar CERFACS, Toulouse joint work with Luc Giraud (N7-IRIT, France) and Layne Watson (Virginia Polytechnic Institute,

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra Decompositions, numerical aspects Gerard Sleijpen and Martin van Gijzen September 27, 2017 1 Delft University of Technology Program Lecture 2 LU-decomposition Basic algorithm Cost

More information

Program Lecture 2. Numerical Linear Algebra. Gaussian elimination (2) Gaussian elimination. Decompositions, numerical aspects

Program Lecture 2. Numerical Linear Algebra. Gaussian elimination (2) Gaussian elimination. Decompositions, numerical aspects Numerical Linear Algebra Decompositions, numerical aspects Program Lecture 2 LU-decomposition Basic algorithm Cost Stability Pivoting Cholesky decomposition Sparse matrices and reorderings Gerard Sleijpen

More information

Matrix Algebra for Engineers Jeffrey R. Chasnov

Matrix Algebra for Engineers Jeffrey R. Chasnov Matrix Algebra for Engineers Jeffrey R. Chasnov The Hong Kong University of Science and Technology The Hong Kong University of Science and Technology Department of Mathematics Clear Water Bay, Kowloon

More information

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 2

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 2 EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 2 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory April 5, 2012 Andre Tkacenko

More information

Notes on SU(3) and the Quark Model

Notes on SU(3) and the Quark Model Notes on SU() and the Quark Model Contents. SU() and the Quark Model. Raising and Lowering Operators: The Weight Diagram 4.. Triangular Weight Diagrams (I) 6.. Triangular Weight Diagrams (II) 8.. Hexagonal

More information

Performance Evaluation of GPBiCGSafe Method without Reverse-Ordered Recurrence for Realistic Problems

Performance Evaluation of GPBiCGSafe Method without Reverse-Ordered Recurrence for Realistic Problems Performance Evaluation of GPBiCGSafe Method without Reverse-Ordered Recurrence for Realistic Problems Seiji Fujino, Takashi Sekimoto Abstract GPBiCG method is an attractive iterative method for the solution

More information

5.1 Banded Storage. u = temperature. The five-point difference operator. uh (x, y + h) 2u h (x, y)+u h (x, y h) uh (x + h, y) 2u h (x, y)+u h (x h, y)

5.1 Banded Storage. u = temperature. The five-point difference operator. uh (x, y + h) 2u h (x, y)+u h (x, y h) uh (x + h, y) 2u h (x, y)+u h (x h, y) 5.1 Banded Storage u = temperature u= u h temperature at gridpoints u h = 1 u= Laplace s equation u= h u = u h = grid size u=1 The five-point difference operator 1 u h =1 uh (x + h, y) 2u h (x, y)+u h

More information

AMS Mathematics Subject Classification : 65F10,65F50. Key words and phrases: ILUS factorization, preconditioning, Schur complement, 1.

AMS Mathematics Subject Classification : 65F10,65F50. Key words and phrases: ILUS factorization, preconditioning, Schur complement, 1. J. Appl. Math. & Computing Vol. 15(2004), No. 1, pp. 299-312 BILUS: A BLOCK VERSION OF ILUS FACTORIZATION DAVOD KHOJASTEH SALKUYEH AND FAEZEH TOUTOUNIAN Abstract. ILUS factorization has many desirable

More information

Foundations of Matrix Analysis

Foundations of Matrix Analysis 1 Foundations of Matrix Analysis In this chapter we recall the basic elements of linear algebra which will be employed in the remainder of the text For most of the proofs as well as for the details, the

More information

Matrix Algebra: Summary

Matrix Algebra: Summary May, 27 Appendix E Matrix Algebra: Summary ontents E. Vectors and Matrtices.......................... 2 E.. Notation.................................. 2 E..2 Special Types of Vectors.........................

More information

Boundary Value Problems and Iterative Methods for Linear Systems

Boundary Value Problems and Iterative Methods for Linear Systems Boundary Value Problems and Iterative Methods for Linear Systems 1. Equilibrium Problems 1.1. Abstract setting We want to find a displacement u V. Here V is a complete vector space with a norm v V. In

More information

Domain Decomposition Preconditioners for Spectral Nédélec Elements in Two and Three Dimensions

Domain Decomposition Preconditioners for Spectral Nédélec Elements in Two and Three Dimensions Domain Decomposition Preconditioners for Spectral Nédélec Elements in Two and Three Dimensions Bernhard Hientzsch Courant Institute of Mathematical Sciences, New York University, 51 Mercer Street, New

More information

A Review of Matrix Analysis

A Review of Matrix Analysis Matrix Notation Part Matrix Operations Matrices are simply rectangular arrays of quantities Each quantity in the array is called an element of the matrix and an element can be either a numerical value

More information

Boundary Value Problems - Solving 3-D Finite-Difference problems Jacob White

Boundary Value Problems - Solving 3-D Finite-Difference problems Jacob White Introduction to Simulation - Lecture 2 Boundary Value Problems - Solving 3-D Finite-Difference problems Jacob White Thanks to Deepak Ramaswamy, Michal Rewienski, and Karen Veroy Outline Reminder about

More information

Residual iterative schemes for largescale linear systems

Residual iterative schemes for largescale linear systems Universidad Central de Venezuela Facultad de Ciencias Escuela de Computación Lecturas en Ciencias de la Computación ISSN 1316-6239 Residual iterative schemes for largescale linear systems William La Cruz

More information

Background. Background. C. T. Kelley NC State University tim C. T. Kelley Background NCSU, Spring / 58

Background. Background. C. T. Kelley NC State University tim C. T. Kelley Background NCSU, Spring / 58 Background C. T. Kelley NC State University tim kelley@ncsu.edu C. T. Kelley Background NCSU, Spring 2012 1 / 58 Notation vectors, matrices, norms l 1 : max col sum... spectral radius scaled integral norms

More information

Stabilization and Acceleration of Algebraic Multigrid Method

Stabilization and Acceleration of Algebraic Multigrid Method Stabilization and Acceleration of Algebraic Multigrid Method Recursive Projection Algorithm A. Jemcov J.P. Maruszewski Fluent Inc. October 24, 2006 Outline 1 Need for Algorithm Stabilization and Acceleration

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information