Block-tridiagonal matrices


Block-tridiagonal matrices. p.1/31

Block-tridiagonal matrices - where do these arise?
- as a result of a particular mesh-point ordering
- as a part of a factorization procedure, for example when we compute the eigenvalues of a matrix. p.2/31

Block-tridiagonal matrices

[Figure: a two-dimensional domain partitioned into strips Ω1, Ω2, Ω3]

Consider a two-dimensional domain partitioned in strips. Assume that points on the lines of intersection are coupled only to their nearest neighbors in the underlying mesh (and we do not have periodic boundary conditions). Hence, there is no coupling between subdomains except through the glue on the interfaces. p.3/31

Block-tridiagonal matrices

When the subdomains are ordered lexicographically from left to right, a domain Ω_i becomes coupled only to its predecessor and successor Ω_{i-1} and Ω_{i+1}, respectively, and the corresponding matrix takes the form of a block-tridiagonal matrix A = tridiag(A_{i,i-1}, A_{ii}, A_{i,i+1}), or

A = \begin{bmatrix} A_{11} & A_{12} & & 0 \\ A_{21} & A_{22} & A_{23} & \\ & \ddots & \ddots & \ddots \\ 0 & & A_{n,n-1} & A_{nn} \end{bmatrix}

For definiteness we let the boundary meshline Ω_i ∩ Ω_{i+1} belong to Ω_i. In order to preserve the sparsity pattern we shall factor A without the use of permutations. Naturally, the lines of intersection do not have to be straight. p.4/31
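As a concrete illustration (not part of the slides), the sketch below assembles the block-tridiagonal matrix of the standard 5-point Laplacian on an m-by-m grid ordered line by line: each mesh line gives one diagonal block T = tridiag(-1, 4, -1), and the coupling to the neighboring lines gives the off-diagonal blocks -I. The function name and the dense storage are illustrative choices only.

import numpy as np

def block_tridiag_laplacian(m):
    # Diagonal block: 1D three-point stencil along one mesh line.
    T = 4.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    I = np.eye(m)
    A = np.zeros((m * m, m * m))
    for i in range(m):
        A[i*m:(i+1)*m, i*m:(i+1)*m] = T
        if i > 0:
            A[i*m:(i+1)*m, (i-1)*m:i*m] = -I       # coupling to the previous line
        if i < m - 1:
            A[i*m:(i+1)*m, (i+1)*m:(i+2)*m] = -I   # coupling to the next line
    return A

A = block_tridiag_laplacian(4)   # a 16-by-16 matrix with 4 diagonal blocks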

Block-tridiagonal matrices How do we factorize a (block)-tridiagonal matrix?. p.5/31

Let A be block-tridiagonal, expressed as A = D_A - L_A - U_A.

Convenient: seek L̂, X, Û such that A = L̂ X^{-1} Û, where X is (block) diagonal, L̂ = X - L_A and Û = X - U_A.

Direct computation:

A = (X - L_A) X^{-1} (X - U_A) = X - L_A - U_A + L_A X^{-1} U_A = D_A - L_A - U_A,

i.e., D_A = X + L_A X^{-1} U_A.

Important: L_A and U_A are strictly lower and upper triangular. p.6/31

A = L̂ X^{-1} Û for pointwise tridiagonal matrices

The identity D_A = X + L_A X^{-1} U_A, written out,

\begin{bmatrix} a_{11} & & & \\ & a_{22} & & \\ & & \ddots & \\ & & & a_{nn} \end{bmatrix} =
\begin{bmatrix} x_1 & & & \\ & x_2 & & \\ & & \ddots & \\ & & & x_n \end{bmatrix} +
\begin{bmatrix} 0 & & & \\ -a_{21} & 0 & & \\ & \ddots & \ddots & \\ & & -a_{n,n-1} & 0 \end{bmatrix}
\begin{bmatrix} x_1 & & \\ & \ddots & \\ & & x_n \end{bmatrix}^{-1}
\begin{bmatrix} 0 & -a_{12} & & \\ & 0 & \ddots & \\ & & \ddots & -a_{n-1,n} \\ & & & 0 \end{bmatrix}

Factorization algorithm: x_1 = a_{11},  x_i = a_{ii} - a_{i,i-1} x_{i-1}^{-1} a_{i-1,i},  i = 2, 3, ..., n. p.7/31
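A minimal sketch of this recursion (assuming the notation above; the routine name and the three-diagonal storage are illustrative):

import numpy as np

def tridiag_factor(sub, diag, sup):
    # sub = (a_21, ..., a_{n,n-1}), diag = (a_11, ..., a_nn), sup = (a_12, ..., a_{n-1,n})
    n = len(diag)
    x = np.empty(n)
    x[0] = diag[0]                                      # x_1 = a_11
    for i in range(1, n):
        x[i] = diag[i] - sub[i-1] * sup[i-1] / x[i-1]   # x_i = a_ii - a_{i,i-1} x_{i-1}^{-1} a_{i-1,i}
    return x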

A = L̂ X^{-1} Û for pointwise tridiagonal matrices

Solution of systems with A = L̂ X^{-1} Û. p.8/31
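A minimal sketch of the corresponding solution step (assumptions as above): solve (X - L_A) z = b by a forward sweep and then (X - U_A) v = X z by a backward sweep.

def tridiag_solve(sub, sup, x, b):
    # Solve A v = b given the factorization A = (X - L_A) X^{-1} (X - U_A).
    n = len(b)
    z = [0.0] * n
    z[0] = b[0] / x[0]
    for i in range(1, n):                      # forward sweep: (X - L_A) z = b
        z[i] = (b[i] - sub[i-1] * z[i-1]) / x[i]
    v = [0.0] * n
    v[-1] = z[-1]
    for i in range(n - 2, -1, -1):             # backward sweep: (X - U_A) v = X z
        v[i] = z[i] - sup[i] * v[i+1] / x[i]
    return v

# usage: x = tridiag_factor(sub, diag, sup); v = tridiag_solve(sub, sup, x, b)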

Block-tridiagonal matrices

Let A be block-tridiagonal, expressed as A = D_A - L_A - U_A. One can envisage three major versions of the factorization algorithm:

(i)   A = (X - L_A) X^{-1} (X - U_A)
(ii)  A = (Y^{-1} - L_A) Y (Y^{-1} - U_A), where Y = X^{-1} is computed and stored explicitly
(iii) A = (I - L̃) X (I - Ũ)   (inverse-free substitutions),

where L̃ = L_A X^{-1} and Ũ = X^{-1} U_A. In all versions the block diagonal X = diag(X_1, ..., X_n) is computed from

X_1 = A_{11},  X_i = A_{ii} - A_{i,i-1} X_{i-1}^{-1} A_{i-1,i},  i = 2, ..., n.

Here (I - Ũ)^{-1} = I + Ũ + Ũ^2 + ... = (I + Ũ)(I + Ũ^2)(I + Ũ^4) ... (a finite product, since Ũ is strictly triangular), and similarly for (I - L̃)^{-1}. p.9/31

Existence of factorization for block-tridiagonal matrices

We assume that the matrices are real. It can be shown that the pivot block A^{(r)}_{rr} is always nonsingular for two important classes of matrices, namely for matrices which are

- positive definite, i.e., x^T A x > 0 for all x ≠ 0 in R^n (if A has order n);
- blockwise generalized diagonally dominant (also called block H-matrices), i.e., for which the diagonal matrices A_{ii} are nonsingular and
  ||A_{ii}^{-1}||^{-1} ≥ ||A_{i-1,i}|| + ||A_{i+1,i}||,  i = 1, 2, ..., n  (here A_{01} = 0, A_{n+1,n} = 0). p.10/31

A factorization passes through stages r = 1, 2, ..., n

For two important classes of matrices it holds that the successive top blocks, i.e., the pivot matrices which arise after every factorization stage, are nonsingular.

At every stage the current matrix A^{(r)} is partitioned in 2×2 blocks,

A^{(1)} = A = \begin{bmatrix} A_{11} & A_{12} & & 0 \\ A_{21} & A_{22} & A_{23} & \\ & \ddots & \ddots & \ddots \\ 0 & & A_{n,n-1} & A_{nn} \end{bmatrix} = \begin{bmatrix} A^{(1)}_{11} & A^{(1)}_{12} \\ A^{(1)}_{21} & A^{(1)}_{22} \end{bmatrix}

At the r-th stage we compute (A^{(r)}_{11})^{-1} and factor A^{(r)},

A^{(r)} = \begin{bmatrix} I & 0 \\ A^{(r)}_{21} (A^{(r)}_{11})^{-1} & I \end{bmatrix} \begin{bmatrix} A^{(r)}_{11} & A^{(r)}_{12} \\ 0 & A^{(r+1)} \end{bmatrix}

where A^{(r+1)} = A^{(r)}_{22} - A^{(r)}_{21} (A^{(r)}_{11})^{-1} A^{(r)}_{12} is the so-called Schur complement. p.11/31
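A minimal sketch of one such stage (the names and the dense 2×2 partition are illustrative assumptions, not the slides' notation); for a block-tridiagonal A only the top block of A^{(r)}_{22} is actually modified, as the next slide points out.

import numpy as np

def block_stage(A, k):
    # Partition A in 2x2 blocks with a leading k-by-k pivot block and return
    # the pivot block, the multiplier block and the Schur complement.
    A11, A12 = A[:k, :k], A[:k, k:]
    A21, A22 = A[k:, :k], A[k:, k:]
    A11_inv = np.linalg.inv(A11)           # the pivot block must be nonsingular
    S = A22 - A21 @ A11_inv @ A12          # Schur complement A^{(r+1)}
    return A11, A21 @ A11_inv, S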

Existence of factorization for block-tridiagonal matrices

The factorization of a block matrix is equivalent to block Gaussian elimination on it. Note then that the only block in A^{(r)}_{22} which will be affected by the elimination (of the block matrix A^{(r)}_{21}) is the top block of the block-tridiagonal decomposition of the matrix A^{(r)}_{22}, i.e., A^{(r+1)}_{11}, the new pivot matrix.

We show that for the above matrix classes the Schur complement A^{(r+1)} = A^{(r)}_{22} - A^{(r)}_{21} (A^{(r)}_{11})^{-1} A^{(r)}_{12} belongs to the same class as A^{(r)}, i.e., in particular that the pivot entries are nonsingular. p.12/31

Lemma 1  Let A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} be positive definite. Then A_{ii}, i = 1, 2, and the Schur complement S = A_{22} - A_{21} A_{11}^{-1} A_{12} are also positive definite.

Proof  There holds x_1^T A_{11} x_1 = x^T A x for all x = (x_1, 0). Hence x_1^T A_{11} x_1 > 0 for all x_1 ≠ 0, i.e., A_{11} is positive definite. Similarly, it can be shown that A_{22} is positive definite. Since A is nonsingular, x^T A x = y^T A^{-1} y for y = A x, so y^T A^{-1} y > 0 for all y ≠ 0, i.e., the inverse of A is also positive definite. Use now the explicit form of the inverse, computed by use of the factorization,

A^{-1} = \begin{bmatrix} I & -A_{11}^{-1} A_{12} \\ 0 & I \end{bmatrix} \begin{bmatrix} A_{11}^{-1} & 0 \\ 0 & S^{-1} \end{bmatrix} \begin{bmatrix} I & 0 \\ -A_{21} A_{11}^{-1} & I \end{bmatrix} = \begin{bmatrix} * & * \\ * & S^{-1} \end{bmatrix}

where * indicates entries not important for the present discussion. Hence, since A^{-1} is positive definite, so is its diagonal block S^{-1}. Hence, the inverse of S^{-1}, and therefore also S, is positive definite. p.13/31
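A small numerical illustration of the lemma (not part of the slides; the symmetric positive definite test matrix is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = B @ B.T + 6 * np.eye(6)              # a symmetric positive definite test matrix
k = 3
A11, A12, A21, A22 = A[:k, :k], A[:k, k:], A[k:, :k], A[k:, k:]
S = A22 - A21 @ np.linalg.inv(A11) @ A12
print(np.linalg.eigvalsh(S))             # all eigenvalues positive, so S is positive definite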

Corollary 1  When A^{(r)} is positive definite, A^{(r+1)} and in particular A^{(r+1)}_{11} are positive definite.

Proof  A^{(r+1)} is a Schur complement of A^{(r)}, so by Lemma 1, A^{(r+1)} is positive definite when A^{(r)} is. In particular, its top diagonal block is positive definite. p.14/31

Lemma 2  Let A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} be blockwise generalized diagonally dominant, where A is block-tridiagonal. Then the Schur complement S = A_{22} - A_{21} A_{11}^{-1} A_{12} is also generalized diagonally dominant.

Proof (hint)  Since the only matrix block in S which has been changed relative to A_{22} is its top block, which becomes A^{(r+1)}_{11}, it suffices to show that A^{(r+1)}_{11} is nonsingular and that the first block column is generalized diagonally dominant. p.15/31

Linear recursions

Consider the solution of the linear system of equations A x = b, where A has already been factorized as A = LU or A = LDU. The matrices L = (l_{ij}) and U = (u_{ij}) are lower- and upper-triangular, respectively (with unit diagonal, as assumed in the formulas below). To compute x, we must perform two steps:

forward substitution: L z = b, i.e.,
z_1 = b_1,  z_i = b_i - \sum_{j=1}^{i-1} l_{ij} z_j,  i = 2, 3, ..., n

backward substitution: U x = z, i.e.,
x_n = z_n,  x_i = z_i - \sum_{j=i+1}^{n} u_{ij} x_j,  i = n-1, n-2, ..., 1. p.16/31
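A minimal sketch of the two sweeps (assuming, as above, unit-diagonal triangular factors stored as dense NumPy arrays):

import numpy as np

def forward_substitution(L, b):
    n = len(b)
    z = np.zeros(n)
    for i in range(n):
        z[i] = b[i] - L[i, :i] @ z[:i]        # z_i = b_i - sum_{j<i} l_ij z_j
    return z

def backward_substitution(U, z):
    n = len(z)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = z[i] - U[i, i+1:] @ x[i+1:]    # x_i = z_i - sum_{j>i} u_ij x_j
    return x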

While the implementation of the forward and back substitution on a serial computer is trivial, implementing them on a vector or parallel computer system is problematic. The reason is that these relations are particular examples of a linear recursion, which is an inherently sequential process. A general m-level recurrence relation reads

x_i = a_1 x_{i-1} + a_2 x_{i-2} + ... + a_m x_{i-m} + b_i,

and the performance of its straightforward vector or parallel implementation is degraded due to the existing backward data dependencies. p.17/31

Block-tridiagonal matrices

Can we somehow speed up the solution of systems with bi- or tridiagonal matrices? p.18/31

Multifrontal solution methods

[Figure: (a) the two-way frontal method, with the unknowns numbered 1, 3, 5, 7, 9, ..., 8, 6, 4, 2 and x_{n0} the middle node; (b) the structure of the resulting matrix]

Any tridiagonal or block-tridiagonal matrix can be attacked in parallel from both ends, after a proper numbering of the unknowns. It can be seen that we can work independently on the odd-numbered and even-numbered points until we have eliminated all entries except the final corner one. p.19/31

Hence, the factorization and forward substitution can occur in parallel for the two fronts (the even and the odd). At the final point we can either continue in parallel with the back substitution to compute the solution at all the other interior points, or we can use the same type of two-way frontal method for each of the two structures which have been split by the already computed solution at the middle point. This method of recursively dividing the domain into smaller and smaller pieces, which can all be handled in parallel, can be continued for log_2 n steps, after which we have just one unknown per subinterval. p.20/31
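A minimal sketch of the two-way ("twisted") elimination for a scalar tridiagonal system (not the slides' notation; the choice of meeting node and the sequential loops are illustrative choices — the two elimination loops are independent of each other and could run concurrently):

import numpy as np

def twisted_solve(sub, diag, sup, b):
    # Equation i: sub[i-1] x[i-1] + diag[i] x[i] + sup[i] x[i+1] = b[i] (0-indexed).
    n = len(diag)
    m = n // 2                                # meeting node of the two fronts
    d = np.array(diag, dtype=float)
    r = np.array(b, dtype=float)
    for i in range(1, m + 1):                 # front 1: eliminate the subdiagonal, rows 1..m
        w = sub[i - 1] / d[i - 1]
        d[i] -= w * sup[i - 1]
        r[i] -= w * r[i - 1]
    for i in range(n - 2, m - 1, -1):         # front 2: eliminate the superdiagonal, rows n-2..m
        w = sup[i] / d[i + 1]
        d[i] -= w * sub[i]
        r[i] -= w * r[i + 1]
    x = np.empty(n)
    x[m] = r[m] / d[m]                        # the fronts meet at the middle node
    for i in range(m - 1, -1, -1):            # substitute outwards, upper half
        x[i] = (r[i] - sup[i] * x[i + 1]) / d[i]
    for i in range(m + 1, n):                 # substitute outwards, lower half
        x[i] = (r[i] - sub[i - 1] * x[i - 1]) / d[i]
    return x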

The idea of performing Gaussian elimination from both ends of a tridiagonal matrix, also called twisted factorization, was first proposed by Babuška in 1972. Note that in this method no back substitution is required. p.21/31

Odd-even elimination / cyclic reduction / divide-and-conquer

We sketch some parallel computation methods for recurrence relations. The methods are applicable to general (block-)banded matrices. For simplicity of presentation, the idea is illustrated on one-level or two-level scalar recursions:

x_1 = b_1,  x_i = a_i x_{i-1} + b_i,  i = 2, 3, ..., n

a_{i,i-1} x_{i-1} + a_{ii} x_i + a_{i,i+1} x_{i+1} = b_i,  i = 1, 2, ..., n,  a_{10} = a_{n,n+1} = 0

The corresponding matrix-vector equivalent of the above recursions is to solve a system A x = b, where A is lower bidiagonal and tridiagonal, respectively. p.22/31

An idea to gain some parallelism when solving linear recursions is to reduce the size of the corresponding linear system by eliminating the odd-indexed unknowns from the even-numbered equations (or vice versa). This elimination can be done in parallel for each of the equations, because the odd-numbered equations and the even-numbered equations are mutually uncoupled. The same procedure can then be applied to the reduced system for the even-numbered (or odd-numbered) unknowns, and so on. With every elimination step we reduce the order of the coupled equations to about half its previous order, and eventually we are left with a single equation or a system of uncoupled equations.

[Figure: one reduction step, unknowns 1 2 3 4 5 6 7 reduced to 2 4 6] p.23/31

In the odd-even elimination (or odd-even reduction) method we eliminate the odd numbered unknowns (i.e., numbers 1 (mod 2)) and we are left with a tridiagonal system for the even numbered (i.e., numbers 2 (mod 2)) unknowns. The method is repeated, i.e., we eliminate the unknowns 2 (mod 4) and are left with the unknowns numbered 4 (mod 4) and so on. Eventually we are left with just a single equation which we solve. At this point we can use back substitution to compute the remaining unknowns.. p.24/31

...the odd-even simultaneous...

There exists a second version of this method, called odd-even simultaneous elimination. In the odd-even simultaneous elimination method we eliminate the odd-numbered unknowns from the even-numbered equations and simultaneously the even-numbered unknowns from the odd-numbered equations. In this way we are left with two decoupled systems of equations, one for the even-numbered unknowns and one for the odd-numbered unknowns. The same method can be recursively applied to these two sets in parallel. Hence, in this method we do not reduce the size of the problem; rather, we successively decouple the problem into smaller and smaller subproblems. Eventually we arrive at a system in diagonal form, which we solve for all unknowns in parallel. Therefore, in this method there is no need to perform back substitution. p.25/31

...the odd-even...

[Figure: two elimination steps of the simultaneous elimination method] p.26/31

...the odd-even...

The computational complexity of the sequential LU factorization with forward and back substitution for tridiagonal matrices is 8n flops. When performing the odd-even simultaneous elimination we perform 9n log_2 n flops to transform the system and n flops to solve the final diagonal system. Hence, the redundancy of the odd-even simultaneous elimination method is about (9/8) log_2 n, which is the price we pay to get a fully parallel method. p.27/31

Algebraic description of the odd-even reduction

Consider the three-term recursion, which we rewrite as

b_{2i-1} u_{2i-1} + a_{2i} u_{2i} + b_{2i} u_{2i+1} = f_{2i}
b_{2i} u_{2i} + a_{2i+1} u_{2i+1} + b_{2i+1} u_{2i+2} = f_{2i+1}
b_{2i+1} u_{2i+1} + a_{2i+2} u_{2i+2} + b_{2i+2} u_{2i+3} = f_{2i+2}

We multiply the first equation by -b_{2i}/a_{2i}, the third by -b_{2i+1}/a_{2i+2}, and add the resulting equations to the second equation. The resulting equation is

b^{(1)}_{2i-1} u_{2i-1} + a^{(1)}_{2i+1} u_{2i+1} + b^{(1)}_{2i+1} u_{2i+3} = f^{(1)}_{2i+1},  i = 0, 1, ...,

where

b^{(1)}_{2i-1} = -b_{2i-1} b_{2i} / a_{2i}
a^{(1)}_{2i+1} = a_{2i+1} - b_{2i}^2 / a_{2i} - b_{2i+1}^2 / a_{2i+2}
b^{(1)}_{2i+1} = -b_{2i+1} b_{2i+2} / a_{2i+2}
f^{(1)}_{2i+1} = f_{2i+1} - b_{2i} f_{2i} / a_{2i} - b_{2i+1} f_{2i+2} / a_{2i+2}

Next, the odd-even reduction is repeated for all odd-numbered equations. The resulting system can be reduced in a similar way and eventually we are left with just one equation. p.28/31
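A minimal sketch of the whole reduction process (not from the slides; it uses the general three-diagonal form a_i x_{i-1} + b_i x_i + c_i x_{i+1} = f_i with 0-based indices, and at every level the reduced coefficients for the different equations could be computed in parallel):

import numpy as np

def cyclic_reduction_solve(a, b, c, f):
    # Equation i: a[i] x[i-1] + b[i] x[i] + c[i] x[i+1] = f[i], with a[0] = c[n-1] = 0.
    n = len(b)
    if n == 1:
        return np.array([f[0] / b[0]])
    ra, rb, rc, rf = [], [], [], []
    for i in range(1, n, 2):                            # reduced equations for the odd-indexed unknowns
        al = a[i] / b[i - 1]
        be = c[i] / b[i + 1] if i + 1 < n else 0.0
        ra.append(-al * a[i - 1])
        rb.append(b[i] - al * c[i - 1] - (be * a[i + 1] if i + 1 < n else 0.0))
        rc.append(-be * c[i + 1] if i + 1 < n else 0.0)
        rf.append(f[i] - al * f[i - 1] - (be * f[i + 1] if i + 1 < n else 0.0))
    y = cyclic_reduction_solve(np.array(ra), np.array(rb), np.array(rc), np.array(rf))
    x = np.zeros(n)
    x[1::2] = y
    for i in range(0, n, 2):                            # back substitution; these steps are mutually independent
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (f[i] - left - right) / b[i]
    return x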

Similarly, for the even points we get

b^{(1)}_{2i-2} u_{2i-2} + a^{(1)}_{2i} u_{2i} + b^{(1)}_{2i} u_{2i+2} = f^{(1)}_{2i},  i = 1, 2, ...,

where b^{(1)}_{2i-2}, a^{(1)}_{2i} and b^{(1)}_{2i} are defined accordingly.

It is interesting to note that for a sufficiently diagonally dominant matrix, the reduction can be terminated or truncated after fewer than O(log_2 n) steps, since the reduced system can then be considered numerically (i.e., up to machine precision) as a diagonal system. p.29/31

With the same indices, for a block-tridiagonal system A = blocktridiag(B_{i-1}, A_i, B_i) we get

B^{(1)}_{2i-1} = -B_{2i-1} A_{2i}^{-1} B_{2i}
A^{(1)}_{2i+1} = A_{2i+1} - B_{2i} A_{2i}^{-1} B_{2i} - B_{2i+1} A_{2i+2}^{-1} B_{2i+1}
B^{(1)}_{2i+1} = -B_{2i+1} A_{2i+2}^{-1} B_{2i+2}. p.30/31

Some keywords to discuss
- Load balancing for cyclic reduction methods
- Divide-and-conquer techniques
- Domain decomposition ordering. p.31/31