Contributions to Parallel Algorithms for Sylvester-type Matrix Equations and Periodic Eigenvalue Reordering in Cyclic Matrix Products


Contributions to Parallel Algorithms for Sylvester-type Matrix Equations and Periodic Eigenvalue Reordering in Cyclic Matrix Products

Robert Granat

Licentiate Thesis, May 2005
UMINF
Department of Computing Science
Umeå University
SE Umeå, Sweden

Print & Media, Umeå Universitet
UMINF
ISSN
ISBN X

Abstract

This Licentiate Thesis contains contributions in two different subfields of Computing Science: parallel ScaLAPACK-style algorithms for Sylvester-type matrix equations and periodic eigenvalue reordering in a cyclic product of matrices.

Sylvester-type matrix equations, like the continuous-time Sylvester equation AX − XB = C, where A of size m × m, B of size n × n and C of size m × n are general matrices with real entries, have applications in many areas. For example, the continuous-time Sylvester equation shows up in eigenvalue problems, in condition estimation of eigenvalue problems, e.g., sensitivity analysis of invariant subspaces corresponding to a specified spectrum, and in control and systems theory. This thesis contributes to the area of parallel ScaLAPACK-style library software for solving Sylvester-type matrix equations. The algorithms and library software presented are based on the well-known Bartels–Stewart method and extend earlier work on triangular Sylvester-type matrix equations to general Sylvester matrix equations. The developed methods will serve as the foundation for a future parallel software library for solving 42 sign and transpose variants of eight common Sylvester-type matrix equations.

Many real-world phenomena behave periodically, e.g., helicopter rotors and revolving satellites, and can be described in terms of periodic eigenvalue problems. Typically, eigenvalues and invariant subspaces (eigenvectors) of certain periodic matrix products are of interest and have direct physical interpretations. The eigenvalues of a cyclic matrix product can be computed via the periodic Schur decomposition. Our contribution in this area is a direct method for periodic eigenvalue reordering in the periodic real Schur form, which extends earlier work on the standard and the generalized eigenvalue problems.
Periodic eigenvalue reordering is vital in the computation of periodic eigenspaces corresponding to specified spectra and is utilized, e.g., in recently proposed methods for solving periodic differential matrix equations arising in the analysis of the observability/controllability of linear continuous-time periodic systems, and for solving discrete-time periodic Riccati equations arising in linear quadratic (LQ) optimal control problems. The proposed direct reordering method relies on orthogonal transformations only, i.e., it is backward stable, and can be generalized to more general periodic matrix products arising in generalizations of the periodic Schur form.


Preface

This licentiate thesis consists of the following four papers and an introduction including a summary of the papers.

I. Robert Granat, Bo Kågström and Peter Poromaa, Parallel ScaLAPACK-style Algorithms for Solving Continuous-Time Sylvester Equations. In H. Kosch et al. (Eds), Euro-Par 2003 Parallel Processing, Lecture Notes in Computer Science, Springer Verlag, Vol. 2790, pp. .

II. Robert Granat and Bo Kågström, Evaluating Parallel Algorithms for Solving Sylvester-Type Matrix Equations: Direct Transformation-Based versus Iterative Matrix-Sign-Function-Based Methods. To appear in PARA'04 State-of-the-Art in Scientific Computing Conference Proceedings, Lecture Notes in Computer Science, Springer Verlag.

III. Robert Granat, Isak Jonsson and Bo Kågström, Combining Explicit and Recursive Blocking for Solving Triangular Sylvester-Type Matrix Equations on Distributed Memory Platforms. In M. Danelutto, D. Laforenza, M. Vanneschi (Eds), Euro-Par, Lecture Notes in Computer Science, Springer Verlag, Vol. 3149, pp. .

IV. Robert Granat and Bo Kågström, Direct Eigenvalue Reordering in a Product of Matrices in Extended Periodic Real Schur Form. Report UMINF 05.05, submitted to SIAM Journal on Matrix Analysis and Applications, February .

The topic of Papers I-III is parallel ScaLAPACK-style algorithms for computing the solution of general Sylvester-type matrix equations. In Paper IV, a direct method for eigenvalue reordering in a cyclic product of matrices is developed and analyzed.


Acknowledgements

I wish to thank my supervisor Professor Bo Kågström, who is also co-author of all papers in this contribution. It has been great to work next to you and to take part of your deep knowledge and experience, and I look forward to the second part of my doctoral studies. Thank you!

Next, I want to send big thanks to Dr Isak ("the problem solver") Jonsson, my assistant supervisor, who is also co-author of one of the papers in this thesis. You have always been very kind and helpful. Thirdly, I wish to say thank you to Dr Peter Poromaa, who is co-author of the first paper. Thanks also to Dr Daniel Kressner and Dr Andras Varga for fruitful discussions on periodic eigenvalue problems and related topics.

Thanks to all friends and colleagues at the Department of Computing Science, especially the members of the Numerical Linear Algebra and Parallel and High-Performance Computing Groups. Thanks to the staff at HPC2N (High Performance Computing Center North) for providing a great computing environment and superior technical support.

Many thanks to Professor Jörgen Löfström, University of Gothenburg, whose teaching really woke up my interest in linear algebra. Also, thank you Dr Maya Neytcheva, Uppsala University, who introduced me to parallel computing. Success!

Eva, Elias and Anna, my family, you have made my life complete and I am proud to be husband and father in our family. Thank you very much! Also many thanks to my parents, my three brothers and my grandparents for all your support during the last years.

Finally, I want to express my deepest gratitude to my Lord and Savior Jesus Christ. You have been my Shepherd through life for so many years and never failed me. Thank you for all Your grace and mercy. Soli Deo Gloria!
Financial support has been provided jointly by the Faculty of Science and Technology, Umeå University, by the Swedish Research Council under grant VR and by the Swedish Foundation for Strategic Research under the frame program grant A3 02:128.

Umeå, May 2005
Robert Granat


Contents

1 Introduction
  1.1 Motivation for this work
  1.2 Matrix computations in CACSD
  1.3 High performance linear algebra software libraries
  1.4 Sylvester-type matrix equations
  1.5 Periodic eigenvalue problems
2 Contributions in this thesis
  2.1 Paper I
  2.2 Paper II
  2.3 Paper III
  2.4 Paper IV
3 Ongoing and future work
  3.1 Sylvester-type matrix equations
  3.2 Periodic eigenvalue problems
Paper I
Paper II
Paper III
Paper IV


Chapter 1

Introduction

This chapter motivates and introduces the work presented in this thesis and gives some background to the topics considered.

1.1 Motivation for this work

The growing demand for high-quality and high-performance library software is driven by the increasing need of industry, research facilities and communities to solve larger and more complex problems faster than ever before. Often the problems considered are so complex that they cannot be solved by ordinary desktop computers in a reasonable amount of time. Scientists and engineers are increasingly forced to utilize high-end computational devices, like specialized high performance computational units from vendor-constructed shared or distributed memory parallel computers, or self-made so-called cluster-based systems consisting of high-end commodity PC processors connected with high-speed networks. High performance clusters are getting more and more common due to their good scalability properties and high cost-effectiveness, i.e., lower cost per performance unit compared to the more old-fashioned supercomputer systems. It is also common among institutions with limited budgets to build cheaper and simpler clusters from ordinary PC workstations connected via simpler (and slower) networks. In fact, any local area network (LAN) connecting a number of workstations can be considered a cluster-based system. However, the latter systems are mainly used for high-throughput computing applications, while clusters with high-speed networks are used for challenging parallel computations.

To solve a problem in a reliable and efficient way, many out-of-application considerations must be made regarding solution method, discretization, data distribution and granularity, expected and achieved accuracy of computed results, and how to utilize the available computer power in the best way: should a parallel computer be used or not, does the algorithm match the memory hierarchy of the target computer system, etc. Typically, an appropriate and efficient usage of high performance computing (HPC) systems, like parallel computers, calls for non-trivial reformulations of the problem settings and algorithms. Therefore, a lot of time and effort can be saved by utilizing extensively tested high-quality software libraries as basic building blocks in the research. By this procedure, much more attention can be focused on the applications and related theory.

Most problems in the real world are non-linear, i.e., the output from a phenomenon or a process does not depend linearly on the input. Moreover, most problems are also continuous, i.e., the output depends continuously on the input. Roughly speaking, the graph of a continuous function can be drawn without ever lifting the pen from the paper. Since very few real-world problems can be solved analytically, that is, by finding an exact mathematical expression that fully describes the relation between the input and the output of the process or phenomenon, scientists are often forced to linearize and discretize their problems to make them solvable in a finite amount of time using a finite amount of computational resources (such as computing devices, data storage, network bandwidth, etc.). This means that the computed solution will always be a more or less valid approximation. The good thing is that, through linearization and discretization, many problems can be solved effectively by standard linear algebra methods.

In numerical linear algebra, systems of linear equations, eigenvalue problems and related solution methods, i.e., matrix computations, are studied. The focus is on reliable and efficient algorithms for large-scale matrix computational problems from the point of view of using finite-precision arithmetic. Developments in the area of numerical linear algebra often result in widely available public domain linear algebra software libraries, like LAPACK and ScaLAPACK (see Section 1.3).

1.2 Matrix computations in CACSD

Matrix computations are fundamental to many areas of science and engineering and occur frequently in a variety of applications, for example in Computer-Aided Control System Design (CACSD). In CACSD various linear control systems are considered, like the following linear continuous-time descriptor system

    E ẋ(t) = A x(t) + B u(t)
    y(t) = C x(t) + D u(t)                  (1.1)

or a similar discrete-time system of the form

    E x_{k+1} = A x_k + B u_k
    y_k = C x_k + D u_k,                    (1.2)

where x(t), x_k ∈ R^n are state vectors, u(t), u_k ∈ R^m are the vectors of inputs (or controls) and y(t), y_k ∈ R^r are the vectors of outputs. The systems are described by the state matrix pair (A, E) ∈ R^{n×n} × R^{n×n}, the input matrix B ∈ R^{n×m}, the output matrix C ∈ R^{r×n} and the feed-forward matrix D ∈ R^{r×m}. The matrix E is possibly singular. With E = I, where I is the identity matrix of order n, standard state-space systems are considered. Other subsystems, described by the tuples (E, A, B) and (E, A, C), are studied when investigating the controllability and observability characteristics of a system (see, e.g., [24, 38]).

Applications with periodic behavior, e.g., rotating helicopter blades and revolving satellites, can be described by discrete-time periodic descriptor systems of the form

    E_k x_{k+1} = A_k x_k + B_k u_k
    y_k = C_k x_k + D_k u_k,                (1.3)

where the matrices A_k, E_k ∈ R^{n×n}, B_k ∈ R^{n×m}, C_k ∈ R^{r×n} and D_k ∈ R^{r×m} are periodic with periodicity K ≥ 1. For example, this means that A_K = A_0, B_K = B_0, etc.

Important problems studied in CACSD include state-space realization, minimal realization, linear-quadratic (LQ) optimal control, pole assignment, distance to controllability and observability considerations, etc. For details see, e.g., [57]. The systems (1.1)-(1.3) can be studied by various matrix computational approaches, e.g., by solving related eigenvalue problems. In this area, improved algorithms and software are developed for computing and investigating different subspaces, e.g., condition estimation of invariant or deflating subspaces [46, 47], for solving various important matrix equations, like (periodic) Sylvester-type and (periodic) Riccati matrix equations, and for computing canonical structure information [38]. One common step in computing such structure information is the need to separate the stable and unstable eigenvalues by an eigenvalue reordering technique (see, e.g., [62, 35, 66, 65]).
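A discrete-time periodic system of the form (1.3) with E_k = I can be simulated in a few lines. The matrices below are made-up random data with period K = 3, purely to illustrate the cyclic indexing A_{k+K} = A_k; they do not come from any application in the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)
K, n, m, r = 3, 4, 2, 2  # period, states, inputs, outputs (illustrative sizes)
A = [0.5 * rng.standard_normal((n, n)) for _ in range(K)]
B = [rng.standard_normal((n, m)) for _ in range(K)]
C = [rng.standard_normal((r, n)) for _ in range(K)]
D = [rng.standard_normal((r, m)) for _ in range(K)]

x = np.zeros(n)
ys = []
for k in range(12):              # periodicity: index the matrices mod K
    u = np.ones(m)               # constant input, for illustration
    ys.append(C[k % K] @ x + D[k % K] @ u)
    x = A[k % K] @ x + B[k % K] @ u
```

Running the loop produces the output sequence y_0, ..., y_11 under the K-periodic system matrices.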
(Footnote: The definition of stable and unstable eigenvalues depends on the system considered. However, the common definitions of a stable eigenvalue λ for discrete-time and continuous-time systems are |λ| < 1 and Re(λ) < 0, respectively.)

1.3 High performance linear algebra software libraries

It is important that scientists and engineers are able to solve their discretized and linearized problems without having to rewrite all the necessary software from scratch. Fortunately, as discussed above, many problems can be formulated in terms of common matrix operations. These ideas have been driving forces behind serial library software packages such as BLAS [13], LAPACK

[51, 3], SLICOT [27, 61] and their parallel counterparts ScaLAPACK [11] and PSLICOT [56].

BLAS (Basic Linear Algebra Subprograms) is structured in three levels. Level 1 BLAS is concerned with vector-vector operations, e.g., scalar products, rotations, etc., and was developed during the seventies. Level 2 BLAS performs matrix-vector operations and was originally motivated by the increasing number of vector machines during the eighties. Level 3 BLAS concerns matrix-matrix operations, such as the well-known GEMM (GEneral Matrix Multiply and add) operation

    C ← αC + βAB,

where α and β are scalars, A is an m × k matrix, B is a k × n matrix and C is an m × n matrix. In general, the level 3 BLAS performs O(n^3) arithmetic operations while moving O(n^2) data elements through the memory hierarchy of the computer. If the level 3 BLAS is properly tuned for the cache memory hierarchy of the target computer system, and the computations in the actual program are organized into level 3 operations, the execution may run at close to peak performance. In fact, the whole level 3 BLAS may be organized in GEMM operations [44, 45], which means that the performance will depend only on how highly tuned the GEMM operation is. Computer vendors often supply their own high performance implementation of the BLAS, optimized for their specific architecture. Automatically tuned libraries also exist, see, e.g., ATLAS [4]. See also the GOTO-BLAS [31], which makes use of data streaming to efficiently utilize the memory hierarchy of the target computer.

LAPACK (Linear Algebra PACKage) is a combination of the libraries LINPACK and EISPACK and performs all kinds of linear algebra computations, from solving linear systems of equations to calculating all eigenvalues of a general matrix. The computations in LAPACK are organized to perform as much as possible in level 3 operations for optimal performance.
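The GEMM operation can be exercised directly through, e.g., SciPy's low-level BLAS wrappers. Note that the reference BLAS names the scalars the other way around, writing the operation as C ← αAB + βC; the sketch below follows that convention.

```python
import numpy as np
from scipy.linalg import blas

rng = np.random.default_rng(0)
m, k, n = 6, 5, 4
# Fortran (column-major) order avoids internal copies in the wrapper.
A = np.asfortranarray(rng.standard_normal((m, k)))
B = np.asfortranarray(rng.standard_normal((k, n)))
C = np.asfortranarray(rng.standard_normal((m, n)))

# One level 3 BLAS call: result = 2.0*A@B + 0.5*C.
result = blas.dgemm(2.0, A, B, beta=0.5, c=C)
```

A tuned GEMM performs this single call with O(mkn) flops on O(mk + kn + mn) data, which is the source of the level 3 performance advantage discussed above.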
The LAPACK project has been extremely successful [22] and now forms the underlying computational layer of the interactive MATLAB [53] environment, which is perhaps the most popular tool for solving computational problems in science and engineering and for educational purposes.

ScaLAPACK (Scalable LAPACK) implements a subset of the algorithms in LAPACK for distributed memory environments. Basic building blocks are two-dimensional (2D) block cyclic data distribution (see, e.g., [32]) over a logical rectangular processor mesh, in combination with a Fortran 77 object-oriented approach for handling the involved global distributed matrices. In connection with ScaLAPACK, a parallel version of BLAS exists, the PBLAS (Parallel BLAS) [58]. Explicit communication in ScaLAPACK is performed using the communication library BLACS (Basic Linear Algebra Communication Subprograms) [12], which provides processor mesh setup routines and basic point-to-point, collective and reduction communication routines. BLACS is usually implemented

using MPI (Message Passing Interface) [55].

SLICOT (Subroutine Library in Systems and Control Theory) provides Fortran 77 implementations of numerical algorithms for computations in systems and control applications. Based on numerical linear algebra routines from the BLAS and LAPACK libraries, SLICOT provides methods for the design and analysis of control systems. Similarly to LAPACK and ScaLAPACK, a parallel version of SLICOT, called PSLICOT, is under development. The goal is to include all functionality of SLICOT in a parallel version. PSLICOT also builds on the existing functionality of ScaLAPACK, PBLAS and BLACS. We remark that both LAPACK and ScaLAPACK are currently under revision for new releases [22].

1.4 Sylvester-type matrix equations

Matrix equations have been in the focus of the numerical community for quite some time. Applications include eigenvalue problems and condition estimation of eigenvalue problems (e.g., see [46, 36, 59]) and various control problems (e.g., see [24]). Already in 1972, R. H. Bartels and G. W. Stewart published the paper Algorithm 432: Solution of the Matrix Equation AX + XB = C [6], which outlines what came to be called the Bartels–Stewart method for solving the continuous-time Sylvester equation (SYCT)

    AX + XB = C,                            (1.4)

where A of size m × m, B of size n × n and C of size m × n are arbitrary matrices with real entries. Equation (1.4) has a unique solution if and only if A and −B have no eigenvalues in common. The solution method in [6] follows a general idea from mathematics of problem solving via reformulations and coordinate transformations: first transform the problem to a form where it is (more easily) solvable, then solve the transformed problem, and finally transform the solution back to the original coordinate system.
Examples include computing derivatives by a spectral method, using the forward and backward Fourier transforms as transformation method [25], and computing explicit inverses of general square matrices, using LU factorization and matrix multiplication as transformation method [37]. The Bartels–Stewart method for Equation (1.4) is as follows:

1. Transform the matrix A and the matrix B to real Schur form.
2. Update the matrix C with respect to the two Schur decompositions.
3. Solve the resulting reduced triangular matrix equation.
4. Transform the obtained solution back to the original coordinate system.
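The four steps of the Bartels–Stewart method can be illustrated with a small serial sketch in Python/SciPy. This is not the parallel ScaLAPACK-style implementation developed in the thesis; in particular, SciPy's `solve_sylvester` stands in for the triangular solve of step 3, and the shifted random test matrices are chosen only so that the spectra of A and −B are guaranteed to be disjoint.

```python
import numpy as np
from scipy.linalg import schur, solve_sylvester

rng = np.random.default_rng(2)
m, n = 5, 4
A = rng.standard_normal((m, m)) + 3 * np.eye(m)   # shifted so eigenvalue sums are nonzero
B = rng.standard_normal((n, n)) + 3 * np.eye(n)
C = rng.standard_normal((m, n))

# Step 1: real Schur forms A = Q_A T_A Q_A^T and B = Q_B T_B Q_B^T.
TA, QA = schur(A)
TB, QB = schur(B)
# Step 2: update the right-hand side, Ct = Q_A^T C Q_B.
Ct = QA.T @ C @ QB
# Step 3: solve the (quasi-)triangular equation T_A Y + Y T_B = Ct
#         (delegated here to LAPACK via solve_sylvester).
Y = solve_sylvester(TA, TB, Ct)
# Step 4: transform back to the original coordinates, X = Q_A Y Q_B^T.
X = QA @ Y @ QB.T
```

The computed X then satisfies AX + XB = C up to rounding errors.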

The first step, which is performed by reducing the matrices to Hessenberg form and applying the QR-algorithm to compute their real Schur forms, is also known to be the dominating part in terms of floating point operations [30] and execution time. By recent developments in obtaining close to level 3 performance in the bulge-chasing [16] and advanced deflation techniques for the QR-algorithm [17], this might change in the future.

The classic paper of Bartels and Stewart [6] has served as a foundation for later developments of direct solution methods for related problems, see, e.g., Hammarling's method [34] and the Hessenberg-Schur approach by Golub, Nash and Van Loan [29]. However, these methods were developed before matrix blocking became vital for handling the increasing performance gap between processors and memory modules. Level 3 BLAS LAPACK-style block algorithms for Sylvester-type matrix equations were developed in [46, 47]. In [8, 9, 10], fully iterative methods for solving matrix equations are considered. Those methods can be very fast and reliable, but they are limited to a certain range of problems and cannot be applied to all instances. Jonsson and Kågström presented fast recursive blocked algorithms for solving standard and generalized Sylvester-type matrix equations in [39, 40] and developed the software library RECSY [41]. Notice that their recursive blocking approach has a potential for automatic matching of the memory hierarchy and can be very effective in combination with a highly tuned level 3 BLAS. Recursive blocking can be applied to many problems in matrix computations [26]. ScaLAPACK-style algorithms for solving triangular standard and generalized coupled Sylvester equations were considered in [46, 59] and further developed for the standard Sylvester equation in [33]. In Chapter 3, we give some more examples of Sylvester-type matrix equations and related solution methods.
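The recursive blocking idea can be conveyed by a small sketch for the triangular SYCT case: split the largest dimension, solve the half-sized subproblems, and perform the updates as GEMM operations. This is an illustrative Python sketch of the general idea, assuming upper triangular A and B with real diagonals; it is not the RECSY algorithm itself, which also handles quasi-triangular matrices with 2 x 2 blocks. The function name `rtrsyct` is made up for this sketch.

```python
import numpy as np

def rtrsyct(A, B, C, blk=2):
    """Recursively solve the triangular Sylvester equation AX + XB = C,
    with A (m x m) and B (n x n) upper triangular with real diagonals.
    Illustrative sketch of recursive blocking, not the RECSY library code."""
    m, n = A.shape[0], B.shape[0]
    if m <= blk and n <= blk:
        # Small base case via the Kronecker-product formulation:
        # vec(AX + XB) = (I_n (x) A + B^T (x) I_m) vec(X).
        M = np.kron(np.eye(n), A) + np.kron(B.T, np.eye(m))
        x = np.linalg.solve(M, C.reshape(-1, order="F"))
        return x.reshape(m, n, order="F")
    if m >= n:                    # split A and the rows of X
        h = m // 2
        X2 = rtrsyct(A[h:, h:], B, C[h:], blk)
        # GEMM update of the remaining right-hand side rows.
        X1 = rtrsyct(A[:h, :h], B, C[:h] - A[:h, h:] @ X2, blk)
        return np.vstack([X1, X2])
    else:                         # split B and the columns of X
        h = n // 2
        X1 = rtrsyct(A, B[:h, :h], C[:, :h], blk)
        # GEMM update of the remaining right-hand side columns.
        X2 = rtrsyct(A, B[h:, h:], C[:, h:] - X1 @ B[:h, h:], blk)
        return np.hstack([X1, X2])

# Usage: triangular test matrices with positive diagonals, so that
# eigenvalue sums are nonzero and the equation is uniquely solvable.
rng = np.random.default_rng(3)
A = np.triu(rng.standard_normal((7, 7)), 1) + np.diag(rng.uniform(1.0, 2.0, 7))
B = np.triu(rng.standard_normal((5, 5)), 1) + np.diag(rng.uniform(1.0, 2.0, 5))
C = rng.standard_normal((7, 5))
X = rtrsyct(A, B, C)
```

At the leaves, the subproblems fit in cache; all remaining work consists of the two GEMM-type updates, which is what makes the approach combine well with a highly tuned level 3 BLAS.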
1.5 Periodic eigenvalue problems

Given a general matrix A ∈ R^{n×n}, the standard eigenvalue problem consists of finding n eigenvalue-eigenvector pairs (λ_i, x_i) ∈ C × C^n such that

    A x_i = λ_i x_i,  i = 1, ..., n         (1.5)

(see, e.g., [52]). Notice that Equation (1.5) only concerns right eigenvectors. Left eigenvectors are defined by y_i^T A = λ_i y_i^T [30], i.e., they are right eigenvectors of the transposed matrix A^T. The standard method for the general standard eigenvalue problem is the unsymmetric QR-algorithm (see, e.g., [28, 30, 50]), which is a backward stable algorithm belonging to a large family of bulge-chasing algorithms [68] that by

iteration reduces the matrix A to real Schur form via an orthogonal similarity transformation Q ∈ R^{n×n} such that

    Q^T A Q = T_A,                          (1.6)

where all eigenvalues of A appear as 1 × 1 and 2 × 2 blocks on the main diagonal of the quasi-triangular matrix T_A. The column vectors q_i, i = 1, 2, ..., n of Q are called the Schur vectors of the decomposition (1.6), where q_1 = x_1 is an eigenvector associated with the eigenvalue λ_1. More importantly, given k ≤ n such that no 2 × 2 block resides in T_A(k : k+1, k : k+1), the first k Schur vectors q_i, i = 1, 2, ..., k, form an orthonormal basis for an invariant subspace of A associated with the first k eigenvalues λ_1, λ_2, ..., λ_k.

In most practical applications, the information retrieved from the Schur decomposition (eigenvalues and invariant subspaces) is sufficient and the eigenvectors need not be computed explicitly. However, in case the matrix A is diagonalizable (see, e.g., [52]), the eigenvector x_i can be computed from the null space of A − λ_i I. The eigenvectors can also be computed by successively reordering each of the eigenvalues in the Schur form to the top-left corner of the matrix (see, e.g., [5, 23, 15]) and reading off the first Schur vector q_1. However, the latter approach is not utilized in practice for the standard eigenvalue problem, but the basic idea can be useful in other contexts (see below).

The periodic eigenvalue problem (see, e.g., [50, 68]) consists, in its simplest form, of computing eigenvalues and invariant subspaces of the matrix product

    A = A_{K−1} A_{K−2} ⋯ A_0,

where A_0, A_1, ..., A_{K−1}, with A_{K+i} = A_i, i = 0, 1, ..., is a K-cyclic matrix sequence. Such problems can arise from forming the monodromy matrix [67] of discrete-time periodic descriptor systems of the form (1.3) with E_k = I. For cost and accuracy reasons, it is necessary to work with the factors and not to form the product A explicitly [14, 68].
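For the standard (non-periodic) problem, the real Schur form (1.6) and an invariant subspace basis, with the eigenvalue reordering delegated to LAPACK's sorting facilities, can be illustrated with SciPy:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(4)
n = 6
A = rng.standard_normal((n, n))

# Real Schur form Q^T A Q = T, with the eigenvalues in the left half
# plane (the "stable" ones in the continuous-time sense) reordered to
# the top-left corner of T; k is the number of selected eigenvalues.
T, Q, k = schur(A, sort='lhp')

# The first k Schur vectors span the invariant subspace associated
# with the selected eigenvalues: A Q(:, 1:k) = Q(:, 1:k) T(1:k, 1:k).
V = Q[:, :k]
```

This is exactly the kind of reordering that Paper IV generalizes to the periodic case, where no explicit product matrix is available to apply `schur` to.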
Furthermore, with E_k ≠ I, such an approach would require the explicit calculation of the K inverses E_k^{−1}, which may not even exist. In general, the eigenvalues of a K-cyclic matrix product are obtained by computing the periodic real Schur form (PRSF) [14, 35]

    Z_{k+1}^T A_k Z_k = T_k,  k = 0, 1, ..., K − 1,        (1.7)

where Z_0, Z_1, ..., Z_{K−1}, with Z_K = Z_0, are orthogonal, and the sequence T_k consists of K − 1 upper triangular matrices and one upper quasi-triangular matrix. This form is computed by the periodic QR-algorithm (or the periodic QZ-algorithm [14, 48] in case E_k ≠ I). The periodic QR-algorithm is essentially analogous to the standard QR-algorithm applied to a (block) cyclic matrix [49]. The placement of

the quasi-triangular matrix may be specified to fit the actual application. Sometimes, for example in pole assignment, the resulting PRSF should be ordered [64], i.e., the eigenvalues should be ordered in a specified way.

Each formal cyclic matrix product is associated with a matrix tuple Ā = (A_{K−1}, A_{K−2}, ..., A_1, A_0) [7]. The vector tuple ū = (x_{K−1}, x_{K−2}, ..., x_1, x_0), with x_k ≠ 0, is called a right eigenvector of the tuple Ā corresponding to the eigenvalue λ if there exist scalars α_k, possibly complex, such that the relations

    A_k x_k = α_k x_{k+1},  k = 0, 1, ..., K − 1,  λ := ∏_{k=0}^{K−1} α_k        (1.8)

hold with x_K = x_0. A left eigenvector ȳ of the tuple Ā corresponding to λ is defined similarly. In this context, a direct eigenvalue reordering method may be utilized to compute eigenvectors corresponding to each eigenvalue in the periodic Schur form, by reordering each eigenvalue to the top left corner of the periodic Schur form, similarly to the non-periodic case. More generally, the direct method can be used to compute periodic eigenspaces corresponding to a specified set of eigenvalues (see, e.g., [47]) of the matrix product.

The extended periodic real Schur form (EPRSF) [63] generalizes PRSF for handling square products where the involved matrices are rectangular. EPRSF can be computed by a slightly modified periodic QR-algorithm. The generalized periodic real Schur form (GPRSF) generalizes PRSF to periodic matrix pairs (E_k, A_k), with E_k possibly singular, and is typically computed by the periodic QZ-algorithm [48]. Both extensions are important in solving various periodic matrix equations, like Riccati, Lyapunov or Sylvester equations in their standard or generalized forms.
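The relations (1.8) can be checked numerically on a small made-up example. Note that the product is formed explicitly below purely for illustration, which is exactly what one avoids in practice, and that the scalars α_k are one admissible choice (they are not unique; only their product λ is).

```python
import numpy as np

rng = np.random.default_rng(5)
K, n = 3, 4
As = [rng.standard_normal((n, n)) for _ in range(K)]

# Eigenpair of the explicit product A_2 A_1 A_0 (formed here only for
# this illustration; real algorithms work with the factors).
P = As[2] @ As[1] @ As[0]
lam, V = np.linalg.eig(P)
lam0, x0 = lam[0], V[:, 0]

# Build a vector tuple satisfying (1.8): choose alpha_k = 1 for
# k < K-1 and alpha_{K-1} = lambda, so that x_K = x_0.
alphas = [1.0] * (K - 1) + [lam0]
xs = [x0]
for k in range(K - 1):
    xs.append(As[k] @ xs[k])
```

The final relation A_{K−1} x_{K−1} = α_{K−1} x_0 closes the cycle, and the product of the α_k recovers the eigenvalue of the matrix product.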

Chapter 2

Contributions in this thesis

This chapter gives a brief summary of the papers in this thesis.

2.1 Paper I

In this paper, the work by Kågström and Poromaa [46] and Poromaa [59] on block algorithms and parallel algorithms for triangular Sylvester matrix equations is extended. The algorithms presented are complete ScaLAPACK-style implementations of the Bartels–Stewart method (see Section 1.4) for the four transpose variants of the general Sylvester equation

    op(A)X − Xop(B) = C,                    (2.1)

where op(A) denotes the matrix A or its transpose A^T. One of the shortcomings of the previous algorithms was the lack of support for handling quasi-triangular matrices where some 2 × 2 blocks, corresponding to complex conjugate pairs of eigenvalues, were split in the explicit blocking of the algorithms. This problem was resolved by proposing an implicit redistribution of the matrices in the initial stage of step 3 of the Bartels–Stewart method (see Section 1.4). This paper also introduces the on-demand communication scheme for step 3, which complements the original matrix shifting scheme [59] for the transpose variants where matrix shifting does not work. Paper I is mainly a refined and compressed version of my Master's Thesis [33].

2.2 Paper II

In this contribution, a comparison between two ScaLAPACK-style implementations of two different methods for solving Equation (1.4) is presented. The

first method is the implementation of the Bartels–Stewart method presented in Paper I, and the second is a Newton-iteration-style matrix-sign-function-based method [8, 9, 10]. The comparison concerns generality of use, execution time and accuracy of computed solutions. The matrices A and B are constructed to obtain differently conditioned test problems. The conditioning of the problems is also measured by computing lower bound estimates of sep(A, B)^{−1} [46]. Experimental results from two differently balanced parallel platforms are presented, showing that the method from Paper I can be substantially faster on well-balanced parallel platforms, and can deliver far more accuracy for ill-conditioned problems.

2.3 Paper III

Paper III introduces ScaLAPACK-style hybrid algorithms for solving the triangular continuous-time Sylvester equation by combining the explicit matrix blocking approach adopted in LAPACK [3] and ScaLAPACK [11] with the recursive matrix blocking provided by the HPC library RECSY [39, 40, 41]. The hybrid algorithms are obtained by replacing the LAPACK standard routines with the solvers provided by RECSY as node subsystem solvers in step 3 of the Bartels–Stewart method. Even though an overwhelming part of the computational work in step 3 is performed outside the node subsystem solver, the proposed approach gives a substantial improvement of the execution time of the algorithm, mainly for two reasons:

- It allows for using larger blocks in the explicit blocking without ruining the performance because of a slow kernel solver, thus decreasing the number of communication steps in the algorithm.
- It decreases the synchronization time for some idle processors during a certain step in the parallel solver, where a subsolution is broadcast along two different scopes.

Experimental results are presented and evaluated for both matrix shifting and on-demand communication.

2.4 Paper IV

The final contribution concerns periodic eigenvalue problems.
The paper presents the derivation and the analysis of a direct method (see, e.g., [5, 42]) for eigenvalue reordering in a K-cyclic matrix product A_{K−1} A_{K−2} ⋯ A_1 A_0 of the sequence A_0, A_1, ..., A_{K−2}, A_{K−1}, with A_K = A_0, without evaluating the matrix product.

The method relies on orthogonal transformations only, and the proposed algorithm performs the reordering tentatively to guarantee backward stability. One important step in the method is the numerical solution of an associated triangular periodic Sylvester equation (PSE)

    A_k X_k − X_{k+1} B_k = C_k,  k = 0, 1, ..., K − 1,        (2.2)

where X_K = X_0. Several methods for solving Equation (2.2) are discussed, including Gaussian elimination with partial or complete pivoting (GEPP/GECP) and iterative refinement (see, e.g., [37]). An error analysis of the direct reordering method is presented, which reveals that the accuracy of the reordered eigenvalues is connected to the accuracy of the computed solution to the associated PSE. The theoretical results are also connected back to the standard case K = 1. Some experimental results are presented that illustrate the reliability and robustness of the direct reordering method for a selected number of problems, including well- and ill-conditioned artificial problems with short and long periods, and an application with a long period from satellite control.
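For small problems, the PSE (2.2) can be written down via its Kronecker-product formulation and solved by Gaussian elimination with partial pivoting (here via NumPy's LU-based solver). This is an illustrative sketch with a made-up helper name `solve_pse`; its O((Kmn)^3) cost means it is only sensible for the small subsystems, not a substitute for the blocked algorithms discussed in the paper.

```python
import numpy as np

def solve_pse(As, Bs, Cs):
    """Solve A_k X_k - X_{k+1} B_k = C_k, k = 0..K-1, with X_K = X_0,
    by assembling one big linear system over vec(X_0), ..., vec(X_{K-1})
    and applying GEPP. Illustrative sketch for small problems."""
    K = len(As)
    m, n = Cs[0].shape
    N = m * n
    M = np.zeros((K * N, K * N))
    rhs = np.concatenate([C.reshape(-1, order="F") for C in Cs])
    I_m, I_n = np.eye(m), np.eye(n)
    for k in range(K):
        # vec(A_k X_k) = (I_n (x) A_k) vec(X_k)
        M[k*N:(k+1)*N, k*N:(k+1)*N] = np.kron(I_n, As[k])
        # vec(X_{k+1} B_k) = (B_k^T (x) I_m) vec(X_{k+1}), cyclic index
        j = (k + 1) % K
        M[k*N:(k+1)*N, j*N:(j+1)*N] -= np.kron(Bs[k].T, I_m)
    x = np.linalg.solve(M, rhs)   # GEPP via LAPACK
    return [x[k*N:(k+1)*N].reshape(m, n, order="F") for k in range(K)]

# Usage: triangular factors with well-separated diagonals, so the
# products of the A and B diagonals differ and the PSE is solvable.
rng = np.random.default_rng(6)
K, m, n = 3, 3, 2
As = [np.triu(rng.standard_normal((m, m)), 1) + np.diag(rng.uniform(2.0, 3.0, m))
      for _ in range(K)]
Bs = [np.triu(rng.standard_normal((n, n)), 1) + np.diag(rng.uniform(0.1, 0.5, n))
      for _ in range(K)]
Cs = [rng.standard_normal((m, n)) for _ in range(K)]
Xs = solve_pse(As, Bs, Cs)
```

The cyclic coupling X_K = X_0 shows up as the wrap-around column block j = (k + 1) mod K in the assembled matrix.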


Chapter 3

Ongoing and future work

The topics discussed in the thesis will be considered for further developments.

3.1 Sylvester-type matrix equations

The Bartels–Stewart method presented in Section 1.4 can be generalized to other matrix equations as well. Perhaps the simplest example (which was also mentioned by Bartels and Stewart) is the continuous-time Lyapunov equation (LYCT)

    AX + XA^T = C,                          (3.1)

where A and C = C^T of size n × n are general matrices with real entries. One may also consider the discrete-time counterparts of SYCT and LYCT: the discrete-time Sylvester equation (SYDT)

    AXB^T − X = C,                          (3.2)

where A, B and C are as in SYCT, and the discrete-time Lyapunov equation (LYDT)

    AXA^T − X = C,                          (3.3)

where A and C are as in LYCT. Furthermore, generalized variants of the Sylvester/Lyapunov matrix equations can be considered, for example, the generalized coupled Sylvester equation (GCSY)

    (AX − Y B, DX − Y E) = (C, F),          (3.4)

where A and D of size m × m, B and E of size n × n and C and F of size m × n are general matrices with real entries. A Bartels–Stewart-style method for GCSY can be formulated as follows:

1. Reduce the matrix pairs (A, D) and (B, E) to generalized Schur form.
2. Update the right-hand side matrix pair (C, F) with respect to the two generalized Schur decompositions.
3. Solve the resulting triangular generalized matrix equation.
4. Transform the computed solution back to the original coordinate system.

Notice that step 3 above has already been considered in [59]. Other generalized matrix equations to consider are the generalized Sylvester equation (GSYL)

    AXB^T - CXD^T = E,    (3.5)

where A and C of size m x m, B and D of size n x n, and E of size m x n are general matrices with real entries, the continuous-time generalized Lyapunov equation (GLYCT)

    AXE^T + EXA^T = C,    (3.6)

where A, E and C = C^T of size m x m are general matrices with real entries, and the discrete-time generalized Lyapunov equation (GLYDT)

    AXA^T + EXE^T = C,    (3.7)

where A, E and C = C^T of size m x m are general matrices with real entries.

    Name                            Matrix equation
    Standard Sylvester (CT)         op(A)X ± Xop(B) = C
    Standard Lyapunov (CT)          op(A)X + Xop(A^T) = C
    Standard Sylvester (DT)         op(A)Xop(B) ± X = C
    Standard Lyapunov (DT)          op(A)Xop(A^T) - X = C
    Generalized Coupled Sylvester   op1(A)X ± Y op2(B) = C,  op1(D)X ± Y op2(E) = F
    Generalized Sylvester           op1(A)Xop2(B) ± op1(C)Xop2(D) = E
    Generalized Lyapunov (CT)       op1(A)Xop2(E^T) + op2(E)Xop1(A^T) = C
    Generalized Lyapunov (DT)       op1(A)Xop1(A^T) + op2(E)Xop2(E^T) = C

Table 3.1: The sign and transpose variants of the Sylvester-type matrix equations to be considered in SCASY. CT and DT denote the continuous-time and discrete-time variants, respectively.

Parallel ScaLAPACK-style Bartels–Stewart-based algorithms for solving 42 sign and transpose variants of the standard and generalized Sylvester/Lyapunov matrix equations shown in Table 3.1 are under development (see also RECSY [41]). As the experienced reader might have noticed, the parallel Schur decomposition step for the generalized matrix equations requires a reliable and efficient ScaLAPACK-style implementation of the QZ-algorithm. To date, the most important contributions regarding block forms and parallelizations of the QZ-algorithm and related topics have been given by Dackland and Kågström [19, 20], Dackland [18], Adlerborn, Dackland and Kågström [1, 2], and Kågström and Kressner [43]. The objective is to develop algorithms and a software package, SCASY, that will include general and triangular solvers for the Sylvester-type matrix equations presented in this section, parallel condition estimators [46], and matrix generators for matrices and matrix pairs with specified eigenvalues. At the time of writing, most matrix equation algorithms have already been implemented, but not fully tested or documented [60].

3.2 Periodic eigenvalue problems

The direct method for eigenvalue reordering in a cyclic product of matrices has been further developed to support the computation of periodic eigenspaces with specified eigenvalues by bubble-sorting the selected eigenvalues to the top-left corner of the corresponding matrix product. This approach allows for condition estimation of selected clusters of eigenvalues by solving a few (say, five) associated PSEs (2.2). Level 3 BLAS algorithms for solving PSEs, based on Kronecker product representations of small PSEs, Gaussian elimination with partial pivoting (GEPP), and iterative refinement, have been developed. For the direct reordering method to be reliable, the associated periodic Schur decomposition must be stable. A number of implementations of the periodic QR-algorithm exist (see, e.g., [50]), but at the time of writing none is completely satisfactory. Some efforts to contribute to this area will be made.
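Underlying the periodic approach is the fact that the eigenvalues of a K-cyclic product do not depend on where the cycle is cut, which is what working on the whole cycle of factors in the periodic Schur form exploits. A plain NumPy check of this invariance (forming the products explicitly, which the periodic QR-algorithm deliberately avoids for accuracy reasons):

```python
import numpy as np

rng = np.random.default_rng(3)
K, n = 4, 5
A = [rng.standard_normal((n, n)) for _ in range(K)]

def cyclic_eigs(mats, shift):
    """Eigenvalues of the cyclic product whose rightmost factor is
    mats[shift], i.e. mats[shift+K-1 mod K] ... mats[shift+1 mod K] mats[shift]."""
    P = np.eye(mats[0].shape[0])
    for k in range(len(mats)):
        P = mats[(shift + k) % len(mats)] @ P
    return np.linalg.eigvals(P)

base = cyclic_eigs(A, 0)
for s in range(1, K):
    shifted = cyclic_eigs(A, s)
    # every eigenvalue of the unshifted product appears in each shifted one
    for lam in base:
        assert np.min(np.abs(shifted - lam)) < 1e-6 * max(1.0, abs(lam))
```

This is the familiar fact that AB and BA share eigenvalues, applied cyclically; the numerical point of the periodic Schur decomposition is to obtain these eigenvalues from the factors directly, without ever forming the possibly ill-conditioned explicit product.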
We remark that eigenvalue reordering techniques, like the one presented in this thesis, can be beneficial for implementing advanced aggressive early deflation techniques for any QR-style bulge-chasing algorithm [17]. Future work also includes considering more general periodic eigenvalue problems, like periodic matrix pairs (E_k, A_k), with E_k possibly singular, arising from periodic descriptor systems (see, e.g., [67] and Section 1.2), where K-cyclic matrix products of the form

    A = E_{K-1}^{-1} A_{K-1} E_{K-2}^{-1} A_{K-2} ... E_0^{-1} A_0    (3.8)

are considered. Such matrix products arise from the study of the monodromy matrix of the corresponding system. Notice that the matrix product is considered to be formal [7, 68], i.e., the inverses E_k^{-1} are not formed explicitly (see Section 1.5). Observe also that for K = 1, this problem is equivalent to the generalized eigenvalue problem

    Ax = λEx,    (3.9)

where the task is to compute generalized eigenvalue-eigenvector pairs of the matrix pair (A, E). One key ingredient here is the generalized Schur form, which can be computed by the QZ-algorithm without computing any explicit matrix inverses. For details we refer to [54, 21, 50] and the references therein. To perform direct eigenvalue reordering in the context of matrix products of the form (3.8), generalized periodic Sylvester-type matrix equations must be solved with high accuracy, in analogy with direct eigenvalue reordering in the generalized Schur form [42, 47]. The prototype software developed in Paper IV is being revised to reach library standard and quality (see, e.g., LAPACK [3] and SLICOT [61]). All developed software is designed to be integrated into state-of-the-art software libraries such as LAPACK and SLICOT.
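As a small illustration of the inverse-free QZ route for the K = 1 case (3.9), computed here with SciPy's dense QZ (an illustration only; the thesis targets ScaLAPACK-style implementations): the generalized Schur form yields eigenvalue pairs (alpha_i, beta_i) on the diagonals, and the generalized eigenvalues are the ratios alpha_i/beta_i, with beta_i = 0 signaling an infinite eigenvalue, so E is never inverted.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))   # nonsingular with probability one

# QZ / generalized Schur:  A = Q AA Z^H,  E = Q BB Z^H,
# with AA and BB upper triangular in the complex case.
AA, BB, Q, Z = linalg.qz(A, E, output="complex")
alpha, beta = np.diag(AA), np.diag(BB)
lam = alpha / beta                # generalized eigenvalues of (A, E)

# cross-check against eig(E^{-1} A); the inverse is formed here only
# because this toy E is far from singular
ref = np.linalg.eigvals(np.linalg.solve(E, A))
for mu in lam:
    assert np.min(np.abs(ref - mu)) < 1e-6 * max(1.0, abs(mu))
```

For a singular or ill-conditioned E the reference computation above breaks down, while the QZ ratios remain meaningful, which is precisely why the formal-product viewpoint in (3.8) avoids explicit inverses.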

References

[1] B. Adlerborn, K. Dackland, and B. Kågström. Parallel Two-Stage Reduction of a Regular Matrix Pair to Hessenberg-Triangular Form. In T. Sørvik et al., editors, Applied Parallel Computing: New Paradigms for HPC Industry and Academia, volume 1947 of Lecture Notes in Computer Science. Springer-Verlag.
[2] B. Adlerborn, K. Dackland, and B. Kågström. Parallel and blocked algorithms for reduction of a regular matrix pair to Hessenberg-triangular and generalized Schur forms. In J. Fagerholm et al., editors, PARA 2002, LNCS 2367. Springer-Verlag, 2002.
[3] E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. W. Demmel, J. J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. C. Sorensen. LAPACK Users' Guide. SIAM, Philadelphia, PA, third edition, 1999.
[4] ATLAS - Automatically Tuned Linear Algebra Software. See http://math-atlas.sourceforge.net/.
[5] Z. Bai and J. W. Demmel. On swapping diagonal blocks in real Schur form. Linear Algebra Appl., 186:73–95, 1993.
[6] R. H. Bartels and G. W. Stewart. Algorithm 432: The Solution of the Matrix Equation AX + XB = C. Communications of the ACM, 15(9):820–826, 1972.
[7] P. Benner, V. Mehrmann, and H. Xu. Perturbation analysis for the eigenvalue problem of a formal product of matrices. BIT, 42(1):1–43, 2002.
[8] P. Benner and E. S. Quintana-Ortí. Solving Stable Generalized Lyapunov Equations with the Matrix Sign Function. Numerical Algorithms, 20(1):75–100, 1999.
[9] P. Benner, E. S. Quintana-Ortí, and G. Quintana-Ortí. Numerical Solution of Discrete Stable Linear Matrix Equations on Multicomputers. Parallel Algorithms and Applications, 17(1), 2002.

[10] P. Benner, E. S. Quintana-Ortí, and G. Quintana-Ortí. Solving Stable Sylvester Equations via Rational Iterative Schemes. Preprint SFB393/04-08, TU Chemnitz, 2004.
[11] L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. W. Demmel, I. Dhillon, J. J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley. ScaLAPACK Users' Guide. SIAM, Philadelphia, PA, 1997.
[12] BLACS - Basic Linear Algebra Communication Subprograms. See http://.
[13] BLAS - Basic Linear Algebra Subprograms. See blas/index.html.
[14] A. Bojanczyk, G. H. Golub, and P. Van Dooren. The periodic Schur decomposition: algorithm and applications. In Proc. SPIE Conference, volume 1770, pages 31–42, 1992.
[15] A. Bojanczyk and P. Van Dooren. Reordering diagonal blocks in the real Schur form. In NATO ASI on Linear Algebra for Large Scale and Real-Time Applications.
[16] K. Braman, R. Byers, and R. Mathias. The multishift QR algorithm, I: Maintaining well-focused shifts and level 3 performance. SIAM J. Matrix Anal. Appl., 23(4):929–947, 2002.
[17] K. Braman, R. Byers, and R. Mathias. The multishift QR algorithm, II: Aggressive early deflation. SIAM J. Matrix Anal. Appl., 23(4):948–973, 2002.
[18] K. Dackland. Parallel Reduction of a Regular Matrix Pair to Block-Hessenberg-Triangular Form: Algorithm Design and Performance Modelling. Report UMINF-98.09, Department of Computing Science, Umeå University, Sweden, 1998.
[19] K. Dackland and B. Kågström. Reduction of a Regular Matrix Pair (A, B) to Block Hessenberg-Triangular Form. In J. Dongarra, K. Madsen, and J. Wasniewski, editors, Applied Parallel Computing: Computations in Physics, Chemistry and Engineering Science, volume 1041 of Lecture Notes in Computer Science. Springer.
[20] K. Dackland and B. Kågström. Blocked Algorithms and Software for Reduction of a Regular Matrix Pair to Generalized Schur Form. ACM Trans. Math. Software, 25(4):425–454, 1999.

[21] K. Dackland and B. Kågström. Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur form. ACM Trans. Math. Software, 25(4):425–454, 1999.
[22] J. W. Demmel and J. J. Dongarra. LAPACK 2005 Prospectus: Reliable and Scalable Software for Linear Algebra Computations on High End Computers. LAPACK Working Note 164, University of California, Berkeley, and University of Tennessee, Knoxville, 2005.
[23] J. J. Dongarra, S. Hammarling, and J. H. Wilkinson. Numerical considerations in computing invariant subspaces. SIAM J. Matrix Anal. Appl., 13(1):145–161, 1992.
[24] G. E. Dullerud and F. Paganini. A Course in Robust Control Theory: A Convex Approach. Springer-Verlag, New York.
[25] B. Eliasson. Numerical Vlasov-Maxwell Modelling of Space Plasma. PhD thesis, Uppsala University, Department of Information Technology, Scientific Computing.
[26] E. Elmroth, F. Gustavson, I. Jonsson, and B. Kågström. Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software. SIAM Review, 46(1):3–45, 2004.
[27] E. Elmroth, P. Johansson, B. Kågström, and D. Kressner. A web computing environment for the SLICOT library. In The Third NICONET Workshop on Numerical Control Software, pages 53–61.
[28] J. G. F. Francis. The QR Transformation, Parts I and II. Computer Journal, 4:265–271 and 332–345, 1961, 1962.
[29] G. H. Golub, S. Nash, and C. F. Van Loan. A Hessenberg-Schur method for the problem AX + XB = C. IEEE Trans. Automat. Control, 24(6):909–913, 1979.
[30] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, MD, third edition, 1996.
[31] GOTO-BLAS - High-Performance BLAS by Kazushige Goto. See http://.
[32] A. Grama, A. Gupta, G. Karypis, and V. Kumar. Introduction to Parallel Computing, Second Edition. Addison-Wesley, 2003.
[33] R. Granat. A Parallel ScaLAPACK-style Sylvester Solver. Master's thesis, UMNAD 435/03, Department of Computing Science, 2003.

[34] S. J. Hammarling. Numerical Solution of the Stable, Non-negative Definite Lyapunov Equation. IMA Journal of Numerical Analysis, 2:303–323, 1982.
[35] J. J. Hench and A. J. Laub. Numerical solution of the discrete-time periodic Riccati equation. IEEE Trans. Automat. Control, 39(6):1197–1210, 1994.
[36] N. J. Higham. Perturbation theory and backward error for AX - XB = C. BIT, 33(1):124–136, 1993.
[37] N. J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, PA, second edition, 2002.
[38] S. Johansson. Stratification of Matrix Pencils in Systems and Control: Theory and Algorithms. Licentiate Thesis, Report UMINF-05.17, Department of Computing Science, Umeå University, Sweden, 2005.
[39] I. Jonsson and B. Kågström. Recursive blocked algorithms for solving triangular systems. I. One-sided and coupled Sylvester-type matrix equations. ACM Trans. Math. Software, 28(4):392–415, 2002.
[40] I. Jonsson and B. Kågström. Recursive blocked algorithms for solving triangular systems. II. Two-sided and generalized Sylvester and Lyapunov matrix equations. ACM Trans. Math. Software, 28(4):416–435, 2002.
[41] I. Jonsson and B. Kågström. RECSY - A High Performance Library for Solving Sylvester-Type Matrix Equations. In H. Kosch et al., editors, Euro-Par 2003 Parallel Processing, volume 2790 of Lecture Notes in Computer Science. Springer-Verlag, 2003.
[42] B. Kågström. A direct method for reordering eigenvalues in the generalized real Schur form of a regular matrix pair (A, B). In Linear Algebra for Large Scale and Real-Time Applications (Leuven, 1992), volume 232 of NATO Adv. Sci. Inst. Ser. E Appl. Sci. Kluwer Acad. Publ., Dordrecht.
[43] B. Kågström and D. Kressner. Multishift Variants of the QZ Algorithm with Aggressive Early Deflation. Report UMINF-05.11, Department of Computing Science, Umeå University, Umeå, Sweden, 2005.
[44] B. Kågström, P. Ling, and C. Van Loan. GEMM-Based Level 3 BLAS: High-Performance Model Implementations and Performance Evaluation Benchmark. ACM Trans. Math. Software, 24(3):268–302, 1998.
[45] B. Kågström, P. Ling, and C. Van Loan. Algorithm 784: GEMM-Based Level 3 BLAS: Portability and Optimization Issues. ACM Trans. Math. Software, 24(3):303–316, 1998.

[46] B. Kågström and P. Poromaa. Distributed and shared memory block algorithms for the triangular Sylvester equation with sep^{-1} estimators. SIAM J. Matrix Anal. Appl., 13(1):90–101, 1992.
[47] B. Kågström and P. Poromaa. Computing eigenspaces with specified eigenvalues of a regular matrix pair (A, B) and condition estimation: theory, algorithms and software. Numer. Algorithms, 12(3-4):369–407, 1996.
[48] D. Kressner. An efficient and reliable implementation of the periodic QZ algorithm. In IFAC Workshop on Periodic Control Systems, 2001.
[49] D. Kressner. The periodic QR algorithm is a disguised QR algorithm. To appear in Linear Algebra Appl.
[50] D. Kressner. Numerical Methods and Software for General and Structured Eigenvalue Problems. PhD thesis, TU Berlin, Institut für Mathematik, Berlin, Germany, 2004.
[51] LAPACK - Linear Algebra Package. See lapack/.
[52] David C. Lay. Linear Algebra and its Applications, 2nd edition. Addison-Wesley.
[53] The MathWorks, Inc., Cochituate Place, 24 Prime Park Way, Natick, Mass. Matlab Version 6.5.
[54] C. B. Moler and G. W. Stewart. An algorithm for generalized matrix eigenvalue problems. SIAM J. Numer. Anal., 10:241–256, 1973.
[55] MPI - Message Passing Interface. See mpi/.
[56] Niconet Task II: Model Reduction. See niconet/nic2/nictask2.html.
[57] N. S. Nise. Control Systems Engineering. Wiley, Fourth International Edition.
[58] PBLAS - Parallel Basic Linear Algebra Subprograms. See netlib.org/scalapack/html/pblas_qref.html.
[59] P. Poromaa. Parallel Algorithms for Triangular Sylvester Equations: Design, Scheduling and Scalability Issues. In B. Kågström et al., editors, Applied Parallel Computing. Large Scale Scientific and Industrial Problems, volume 1541 of Lecture Notes in Computer Science. Springer-Verlag, 1998.

[60] SCASY - ScaLAPACK-style solvers for Sylvester-type matrix equations. See granat/scasy.html.
[61] SLICOT Library in the Numerics in Control Network (Niconet).
[62] J. Sreedhar and P. Van Dooren. Pole placement via the periodic Schur decomposition. In Proceedings Amer. Contr. Conf.
[63] A. Varga. Balancing related methods for minimal realization of periodic systems. Systems Control Lett., 36(5), 1999.
[64] A. Varga. Robust and minimum norm pole assignment with periodic state feedback. IEEE Trans. Automat. Control, 45(5), 2000.
[65] A. Varga. On solving discrete-time periodic Riccati equations. Accepted for IFAC.
[66] A. Varga. On solving periodic differential matrix equations with applications to periodic system norms computation. Submitted to CDC.
[67] A. Varga and P. Van Dooren. Computational methods for periodic systems - an overview. In Proc. of IFAC Workshop on Periodic Control Systems, Como, Italy, 2001.
[68] D. S. Watkins. Product Eigenvalue Problems. SIAM Review, 47:3–40, 2005.



Paper I

Parallel ScaLAPACK-style Algorithms for Solving Continuous-Time Sylvester Equations

Robert Granat, Bo Kågström and Peter Poromaa
Department of Computing Science and HPC2N, Umeå University, SE Umeå, Sweden.
granat@cs.umu.se, bokg@cs.umu.se, peterp@cs.umu.se

Abstract

An implementation of a parallel ScaLAPACK-style solver for the general Sylvester equation, op(A)X - Xop(B) = C, where op(A) denotes A or its transpose A^T, is presented. The parallel algorithm is based on explicit blocking of the Bartels–Stewart method. An initial transformation of the coefficient matrices A and B to Schur form leads to a reduced triangular matrix equation. We use different matrix traversing strategies to handle the transposes in the problem to solve, leading to different new parallel wave-front algorithms. We also present a strategy to handle the problem when 2 x 2 diagonal blocks of the matrices in Schur form, corresponding to complex conjugate pairs of eigenvalues, are split between several blocks in the block-partitioned matrices. Finally, the solution of the reduced matrix equation is transformed back to the original coordinate system. The implementation acts in a ScaLAPACK environment using 2-dimensional block-cyclic mapping of the matrices onto a rectangular grid of processes. Real performance results are presented which verify that our parallel algorithms are reliable and scalable.

Keywords: Sylvester matrix equation, continuous-time, Bartels–Stewart method, blocking, GEMM-based, level 3 BLAS, SLICOT, ScaLAPACK-style algorithms.

With kind permission of Springer Science and Business Media, Springer-Verlag, Lecture Notes in Computer Science, 2790 (2003). © 2003 Springer-Verlag, Berlin. All rights reserved.

This research was conducted using the resources of the High Performance Computer Center North (HPC2N). Financial support has been provided by the Swedish Research Council under grant VR and by the Swedish Foundation for Strategic Research under grant A3 02:
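As a serial illustration of the blocked triangular solve that the wave-front algorithms parallelize, the sketch below traverses row blocks bottom-up and column blocks left to right, updating the right-hand side as blocks of X become known; along each antidiagonal the (i, j) subproblems are independent, which is the wave-front. SciPy's solve_sylvester stands in for the node-level kernel, the block size nb and the helper name are our choices, and the 2 x 2-block splitting issue mentioned in the abstract is sidestepped by using strictly upper triangular test matrices:

```python
import numpy as np
from scipy.linalg import solve_sylvester

def triangular_sylvester_blocked(A, B, C, nb):
    """Blockwise solve of A X - X B = C for upper triangular A (m x m)
    and upper triangular B (n x n): row blocks bottom-up, column blocks
    left to right.  In the parallel algorithm the subproblems on each
    antidiagonal of the block grid form an independent wave-front."""
    m, n = C.shape
    X = np.zeros_like(C)
    rows = [(i, min(i + nb, m)) for i in range(0, m, nb)]
    cols = [(j, min(j + nb, n)) for j in range(0, n, nb)]
    for i0, i1 in reversed(rows):
        for j0, j1 in cols:
            # move already-computed parts of X to the right-hand side
            rhs = (C[i0:i1, j0:j1]
                   - A[i0:i1, i1:] @ X[i1:, j0:j1]
                   + X[i0:i1, :j0] @ B[:j0, j0:j1])
            # small Sylvester block:  A_ii X_ij + X_ij (-B_jj) = rhs
            X[i0:i1, j0:j1] = solve_sylvester(A[i0:i1, i0:i1],
                                              -B[j0:j1, j0:j1], rhs)
    return X

rng = np.random.default_rng(5)
A = np.triu(rng.standard_normal((6, 6))) + 2 * np.eye(6)   # spectra of A and B
B = np.triu(rng.standard_normal((4, 4))) - 2 * np.eye(4)   # kept well separated
C = rng.standard_normal((6, 4))
X = triangular_sylvester_blocked(A, B, C, nb=2)
assert np.linalg.norm(A @ X - X @ B - C) < 1e-8
```

The block updates are GEMM operations, which is where the level 3 BLAS performance of the real implementation comes from; the transpose variants of op(A) and op(B) change only the traversal order of the block grid.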

On aggressive early deflation in parallel variants of the QR algorithm

On aggressive early deflation in parallel variants of the QR algorithm On aggressive early deflation in parallel variants of the QR algorithm Bo Kågström 1, Daniel Kressner 2, and Meiyue Shao 1 1 Department of Computing Science and HPC2N Umeå University, S-901 87 Umeå, Sweden

More information

1. Introduction. Applying the QR algorithm to a real square matrix A yields a decomposition of the form

1. Introduction. Applying the QR algorithm to a real square matrix A yields a decomposition of the form BLOCK ALGORITHMS FOR REORDERING STANDARD AND GENERALIZED SCHUR FORMS LAPACK WORKING NOTE 171 DANIEL KRESSNER Abstract. Block algorithms for reordering a selected set of eigenvalues in a standard or generalized

More information

MATLAB TOOLS FOR SOLVING PERIODIC EIGENVALUE PROBLEMS 1

MATLAB TOOLS FOR SOLVING PERIODIC EIGENVALUE PROBLEMS 1 MATLAB TOOLS FOR SOLVING PERIODIC EIGENVALUE PROBLEMS 1 Robert Granat Bo Kågström Daniel Kressner,2 Department of Computing Science and HPC2N, Umeå University, SE-90187 Umeå, Sweden. {granat,bokg,kressner}@cs.umu.se

More information

Transactions on Mathematical Software

Transactions on Mathematical Software Parallel Solvers for Sylvester-Type Matrix Equations with Applications in Condition Estimation, Part I: Theory and Algorithms Journal: Manuscript ID: Manuscript Type: Date Submitted by the Author: Complete

More information

Parallel eigenvalue reordering in real Schur forms

Parallel eigenvalue reordering in real Schur forms CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2000; 00:1 7 [Version: 2002/09/19 v2.02] Parallel eigenvalue reordering in real Schur forms R. Granat 1, B. Kågström

More information

Parallel Variants and Library Software for the QR Algorithm and the Computation of the Matrix Exponential of Essentially Nonnegative Matrices

Parallel Variants and Library Software for the QR Algorithm and the Computation of the Matrix Exponential of Essentially Nonnegative Matrices Parallel Variants and Library Software for the QR Algorithm and the Computation of the Matrix Exponential of Essentially Nonnegative Matrices Meiyue Shao Ph Licentiate Thesis, April 2012 Department of

More information

Parallel Model Reduction of Large Linear Descriptor Systems via Balanced Truncation

Parallel Model Reduction of Large Linear Descriptor Systems via Balanced Truncation Parallel Model Reduction of Large Linear Descriptor Systems via Balanced Truncation Peter Benner 1, Enrique S. Quintana-Ortí 2, Gregorio Quintana-Ortí 2 1 Fakultät für Mathematik Technische Universität

More information

Computing least squares condition numbers on hybrid multicore/gpu systems

Computing least squares condition numbers on hybrid multicore/gpu systems Computing least squares condition numbers on hybrid multicore/gpu systems M. Baboulin and J. Dongarra and R. Lacroix Abstract This paper presents an efficient computation for least squares conditioning

More information

Exponentials of Symmetric Matrices through Tridiagonal Reductions

Exponentials of Symmetric Matrices through Tridiagonal Reductions Exponentials of Symmetric Matrices through Tridiagonal Reductions Ya Yan Lu Department of Mathematics City University of Hong Kong Kowloon, Hong Kong Abstract A simple and efficient numerical algorithm

More information

Computation of a canonical form for linear differential-algebraic equations

Computation of a canonical form for linear differential-algebraic equations Computation of a canonical form for linear differential-algebraic equations Markus Gerdin Division of Automatic Control Department of Electrical Engineering Linköpings universitet, SE-581 83 Linköping,

More information

LAPACK-Style Codes for Pivoted Cholesky and QR Updating. Hammarling, Sven and Higham, Nicholas J. and Lucas, Craig. MIMS EPrint: 2006.

LAPACK-Style Codes for Pivoted Cholesky and QR Updating. Hammarling, Sven and Higham, Nicholas J. and Lucas, Craig. MIMS EPrint: 2006. LAPACK-Style Codes for Pivoted Cholesky and QR Updating Hammarling, Sven and Higham, Nicholas J. and Lucas, Craig 2007 MIMS EPrint: 2006.385 Manchester Institute for Mathematical Sciences School of Mathematics

More information

Direct Methods for Matrix Sylvester and Lyapunov Equations

Direct Methods for Matrix Sylvester and Lyapunov Equations Direct Methods for Matrix Sylvester and Lyapunov Equations D. C. Sorensen and Y. Zhou Dept. of Computational and Applied Mathematics Rice University Houston, Texas, 77005-89. USA. e-mail: {sorensen,ykzhou}@caam.rice.edu

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra By: David McQuilling; Jesus Caban Deng Li Jan.,31,006 CS51 Solving Linear Equations u + v = 8 4u + 9v = 1 A x b 4 9 u v = 8 1 Gaussian Elimination Start with the matrix representation

More information

Porting a Sphere Optimization Program from lapack to scalapack

Porting a Sphere Optimization Program from lapack to scalapack Porting a Sphere Optimization Program from lapack to scalapack Paul C. Leopardi Robert S. Womersley 12 October 2008 Abstract The sphere optimization program sphopt was originally written as a sequential

More information

The Future of LAPACK and ScaLAPACK

The Future of LAPACK and ScaLAPACK The Future of LAPACK and ScaLAPACK Jason Riedy, Yozo Hida, James Demmel EECS Department University of California, Berkeley November 18, 2005 Outline Survey responses: What users want Improving LAPACK and

More information

The LINPACK Benchmark in Co-Array Fortran J. K. Reid Atlas Centre, Rutherford Appleton Laboratory, Chilton, Didcot, Oxon OX11 0QX, UK J. M. Rasmussen

The LINPACK Benchmark in Co-Array Fortran J. K. Reid Atlas Centre, Rutherford Appleton Laboratory, Chilton, Didcot, Oxon OX11 0QX, UK J. M. Rasmussen The LINPACK Benchmark in Co-Array Fortran J. K. Reid Atlas Centre, Rutherford Appleton Laboratory, Chilton, Didcot, Oxon OX11 0QX, UK J. M. Rasmussen and P. C. Hansen Department of Mathematical Modelling,

More information

S N. hochdimensionaler Lyapunov- und Sylvestergleichungen. Peter Benner. Mathematik in Industrie und Technik Fakultät für Mathematik TU Chemnitz

S N. hochdimensionaler Lyapunov- und Sylvestergleichungen. Peter Benner. Mathematik in Industrie und Technik Fakultät für Mathematik TU Chemnitz Ansätze zur numerischen Lösung hochdimensionaler Lyapunov- und Sylvestergleichungen Peter Benner Mathematik in Industrie und Technik Fakultät für Mathematik TU Chemnitz S N SIMULATION www.tu-chemnitz.de/~benner

More information

ON THE USE OF LARGER BULGES IN THE QR ALGORITHM

ON THE USE OF LARGER BULGES IN THE QR ALGORITHM ON THE USE OF LARGER BULGES IN THE QR ALGORITHM DANIEL KRESSNER Abstract. The role of larger bulges in the QR algorithm is controversial. Large bulges are infamous for having a strong, negative influence

More information

Algorithm 853: an Efficient Algorithm for Solving Rank-Deficient Least Squares Problems

Algorithm 853: an Efficient Algorithm for Solving Rank-Deficient Least Squares Problems Algorithm 853: an Efficient Algorithm for Solving Rank-Deficient Least Squares Problems LESLIE FOSTER and RAJESH KOMMU San Jose State University Existing routines, such as xgelsy or xgelsd in LAPACK, for

More information

Accelerating computation of eigenvectors in the nonsymmetric eigenvalue problem

Accelerating computation of eigenvectors in the nonsymmetric eigenvalue problem Accelerating computation of eigenvectors in the nonsymmetric eigenvalue problem Mark Gates 1, Azzam Haidar 1, and Jack Dongarra 1,2,3 1 University of Tennessee, Knoxville, TN, USA 2 Oak Ridge National

More information

A Continuation Approach to a Quadratic Matrix Equation

A Continuation Approach to a Quadratic Matrix Equation A Continuation Approach to a Quadratic Matrix Equation Nils Wagner nwagner@mecha.uni-stuttgart.de Institut A für Mechanik, Universität Stuttgart GAMM Workshop Applied and Numerical Linear Algebra September

More information

Arnoldi Methods in SLEPc

Arnoldi Methods in SLEPc Scalable Library for Eigenvalue Problem Computations SLEPc Technical Report STR-4 Available at http://slepc.upv.es Arnoldi Methods in SLEPc V. Hernández J. E. Román A. Tomás V. Vidal Last update: October,

More information

LAPACK-Style Codes for Pivoted Cholesky and QR Updating

LAPACK-Style Codes for Pivoted Cholesky and QR Updating LAPACK-Style Codes for Pivoted Cholesky and QR Updating Sven Hammarling 1, Nicholas J. Higham 2, and Craig Lucas 3 1 NAG Ltd.,Wilkinson House, Jordan Hill Road, Oxford, OX2 8DR, England, sven@nag.co.uk,

More information

Accelerating computation of eigenvectors in the dense nonsymmetric eigenvalue problem

Accelerating computation of eigenvectors in the dense nonsymmetric eigenvalue problem Accelerating computation of eigenvectors in the dense nonsymmetric eigenvalue problem Mark Gates 1, Azzam Haidar 1, and Jack Dongarra 1,2,3 1 University of Tennessee, Knoxville, TN, USA 2 Oak Ridge National

More information

Parallel eigenvalue reordering in real Schur forms

Parallel eigenvalue reordering in real Schur forms Parallel eigenvalue reordering in real Schur forms LAPACK Working Note #192 R. Granat 1, B. Kågström 1, D. Kressner 1,2 1 Department of Computing Science and HPC2N, Umeå University, SE-901 87 Umeå, Sweden.

More information

Solving projected generalized Lyapunov equations using SLICOT

Solving projected generalized Lyapunov equations using SLICOT Solving projected generalized Lyapunov equations using SLICOT Tatjana Styel Abstract We discuss the numerical solution of projected generalized Lyapunov equations. Such equations arise in many control

More information

Solving the Inverse Toeplitz Eigenproblem Using ScaLAPACK and MPI *

Solving the Inverse Toeplitz Eigenproblem Using ScaLAPACK and MPI * Solving the Inverse Toeplitz Eigenproblem Using ScaLAPACK and MPI * J.M. Badía and A.M. Vidal Dpto. Informática., Univ Jaume I. 07, Castellón, Spain. badia@inf.uji.es Dpto. Sistemas Informáticos y Computación.

More information

A hybrid Hermitian general eigenvalue solver

A hybrid Hermitian general eigenvalue solver Available online at www.prace-ri.eu Partnership for Advanced Computing in Europe A hybrid Hermitian general eigenvalue solver Raffaele Solcà *, Thomas C. Schulthess Institute fortheoretical Physics ETHZ,

More information

Index. for generalized eigenvalue problem, butterfly form, 211

Index. for generalized eigenvalue problem, butterfly form, 211 Index ad hoc shifts, 165 aggressive early deflation, 205 207 algebraic multiplicity, 35 algebraic Riccati equation, 100 Arnoldi process, 372 block, 418 Hamiltonian skew symmetric, 420 implicitly restarted,

More information

APPLIED NUMERICAL LINEAR ALGEBRA

APPLIED NUMERICAL LINEAR ALGEBRA APPLIED NUMERICAL LINEAR ALGEBRA James W. Demmel University of California Berkeley, California Society for Industrial and Applied Mathematics Philadelphia Contents Preface 1 Introduction 1 1.1 Basic Notation

More information

A NOVEL PARALLEL QR ALGORITHM FOR HYBRID DISTRIBUTED MEMORY HPC SYSTEMS

A NOVEL PARALLEL QR ALGORITHM FOR HYBRID DISTRIBUTED MEMORY HPC SYSTEMS A NOVEL PARALLEL QR ALGORITHM FOR HYBRID DISTRIBUTED MEMORY HPC SYSTEMS ROBERT GRANAT, BO KÅGSTRÖM, AND DANIEL KRESSNER Abstract A novel variant of the parallel QR algorithm for solving dense nonsymmetric

More information

On GPU Acceleration of Common Solvers for (Quasi-) Triangular Generalized Lyapunov Equations

On GPU Acceleration of Common Solvers for (Quasi-) Triangular Generalized Lyapunov Equations Max Planck Institute Magdeburg Preprints Martin Köhler Jens Saak On GPU Acceleration of Common Solvers for (Quasi-) Triangular Generalized Lyapunov Equations MAX PLANCK INSTITUT FÜR DYNAMIK KOMPLEXER TECHNISCHER

More information

MULTISHIFT VARIANTS OF THE QZ ALGORITHM WITH AGGRESSIVE EARLY DEFLATION LAPACK WORKING NOTE 173

MULTISHIFT VARIANTS OF THE QZ ALGORITHM WITH AGGRESSIVE EARLY DEFLATION LAPACK WORKING NOTE 173 MULTISHIFT VARIANTS OF THE QZ ALGORITHM WITH AGGRESSIVE EARLY DEFLATION LAPACK WORKING NOTE 173 BO KÅGSTRÖM AND DANIEL KRESSNER Abstract. New variants of the QZ algorithm for solving the generalized eigenvalue

More information

Computing Periodic Deflating Subspaces Associated with a Specified Set of Eigenvalues

Computing Periodic Deflating Subspaces Associated with a Specified Set of Eigenvalues BIT Numerical Mathematics 0006-3835/03/430-000 $6.00 2003 Vol. 43 No. pp. 00 08 c Kluwer Academic Publishers Computing Periodic Deflating Subspaces Associated with a Specified Set of Eigenvalues R. GRANAT

More information

A New Block Algorithm for Full-Rank Solution of the Sylvester-observer Equation.

A New Block Algorithm for Full-Rank Solution of the Sylvester-observer Equation. 1 A New Block Algorithm for Full-Rank Solution of the Sylvester-observer Equation João Carvalho, DMPA, Universidade Federal do RS, Brasil Karabi Datta, Dep MSc, Northern Illinois University, DeKalb, IL

More information

- Factorized Solution of Sylvester Equations with Applications in Control. Peter Benner.
- Accelerating Linear Algebra Computations with Hybrid GPU-Multicore Systems. Marc Baboulin (INRIA/Université Paris-Sud), joint work with Jack Dongarra (University of Tennessee and Oak Ridge National Laboratory).
- On a Quadratic Matrix Equation Associated with an M-Matrix. Chun-Hua Guo (University of Regina), submitted to IMA Journal of Numerical Analysis.
- Preconditioned Parallel Block Jacobi SVD Algorithm. Gabriel Okša and Marián Vajteršic, Parallel Numerics '05.
- Dense LU Factorization and Its Error Analysis. Laura Grigori (INRIA and LJLL, UPMC), February 2016.
- Module 6.6: nag_nsym_gen_eig, Nonsymmetric Generalized Eigenvalue Problems (NAG documentation).
- Symmetric Pivoting in ScaLAPACK. Craig Lucas (University of Manchester), Cray User Group, 8 May 2006, Lugano.
- Roundoff Error (lecture notes, August 29, 2011).
- Balancing-Related Model Reduction for Data-Sparse Systems. Peter Benner (TU Chemnitz), Computational Methods with Applications, Harrachov, 19-25 August 2007.
- CS 542G: Conditioning, BLAS, LU Factorization. Robert Bridson, September 22, 2008.
- Computing the Logarithm of a Symmetric Positive Definite Matrix. Ya Yan Lu (City University of Hong Kong).
- Lecture 8: Fast Linear Solvers (Part 7): modified Gram-Schmidt with reorthogonalization; Householder Arnoldi.
- Matrix Eigensystem Tutorial for Parallel Computation. High Performance Computing Center (HPC), 2003.
- Generalized Interval Arithmetic on Compact Matrix Lie Groups. Hermann Schichl, Mihály Csaba Markót, and Arnold Neumaier.
- On Computing Complex Square Roots of Real Matrices. Zhongyun Liu, Yulin Zhang, Jorge Santos, and Rui Ralha.
- Parallel Solution of Large-Scale and Sparse Generalized Algebraic Riccati Equations. José M. Badía, Peter Benner, Rafael Mayo, and Enrique S. Quintana-Ortí.
- MPI Implementations for Solving Dot-Product on Heterogeneous Platforms. Panagiotis D. Michailidis and Konstantinos G. Margaritis.
- Out-of-Core SVD and QR Decompositions. Eran Rabani and Sivan Toledo.
- Matrix Computations: Direct Methods II. Lecture 11, May 5, 2014.
- More Gaussian Elimination and Matrix Inversion (course notes, Week 7).
- Math 504 (Fall 2011), Study Guide for Weeks 11-14. Emre Mengi.
- Preconditioning in the Parallel Block-Jacobi SVD Algorithm. Gabriel Okša and Marián Vajteršic, Proceedings of ALGORITMY 2005.
- Contents, preface, and introduction of a text on parallel computing (Chapter 1: Computer Architectures; types of parallelism).
- Balanced Truncation Model Reduction of Large and Sparse Generalized Linear Systems. José M. Badía, Peter Benner, Rafael Mayo, Enrique S. Quintana-Ortí, Gregorio Quintana-Ortí, and A. Remón.
- The Relation Between the QR and LR Algorithms. Hongguo Xu, SIAM J. Matrix Anal. Appl. 19(2), pp. 551-555, April 1998.
- Accelerating Model Reduction of Large Linear Systems with Graphics Processors. P. Benner, P. Ezzatti, D. Kressner, E. S. Quintana-Ortí, and A. Remón.
- Solving Algebraic Riccati Equations with SLICOT. Peter Benner (TU Berlin).
- Stability of Gauss-Huard Elimination for Solving Linear Systems. T. J. Dekker et al., Technical Report CS-93-08, University of Amsterdam.
- Matrix and Linear Algebra Aided with MATLAB, Second Edition. Kanti Bhushan Datta.
- Specialized Parallel Algorithms for Solving Lyapunov and Stein Equations. Journal of Parallel and Distributed Computing 61, 1489-1504 (2001).
- Charles F. Van Loan: curriculum vitae (Department of Computer Science, Cornell University).
- Parallel QR Processing of Generalized... Theoretical Computer Science 412 (2011), 1484-1491.
- Low Rank Solution of Data-Sparse Sylvester Equations. U. Baur (TU Berlin).
- Direct Methods for Matrix Sylvester and Lyapunov Equations. Danny C. Sorensen and Yunkai Zhou (received 12 December 2002, revised 31 January 2003).
- Numerical Methods in Matrix Computations. Åke Björck, Springer.
- Matrix Computations (CS 6210), notes for 2016-10-31. D. Bindel, Fall 2016.
- Efficient Implementation of Large Scale Lyapunov and Riccati Equation Solvers. Jens Saak, joint work with Peter Benner (TU Chemnitz).
- NAG Toolbox for MATLAB: nag_lapack_dggev (f08wa).
- Initializing Newton's Method for Discrete-Time Algebraic Riccati Equations Using the Butterfly SZ Algorithm. Heike Faßbender (Zentrum für Technomathematik).
- Matrix Algorithms, Volume II: Eigensystems. G. W. Stewart, SIAM, Philadelphia.
- On the Application of Different Numerical Methods to Obtain Null-Spaces of Polynomial Matrices, Part 2: Block Displacement Structure Algorithms. J. C. Zúñiga and D. Henrion.
- Accelerating the Convergence of Blocked Jacobi Methods. D. Giménez, M. T. Cámara, and P. Montilla (Universidad de Murcia).
- Skew-Hamiltonian and Hamiltonian Eigenvalue Problems: Theory, Algorithms and Applications. Peter Benner (TU Chemnitz) and Daniel Kressner.
- Strong Rank Revealing Cholesky Factorization. M. Gu and L. Miranian, Electronic Transactions on Numerical Analysis 17 (2004).
- Notes on LU Factorization. Robert A. van de Geijn (The University of Texas at Austin), October 11, 2014.
- A Parallel Divide and Conquer Algorithm for the Symmetric Eigenvalue Problem on Distributed Memory Architectures. F. Tisseur and J. Dongarra, 1999 (MIMS EPrint 2007.225).
- A Computational Approach for Optimal Periodic Output Feedback Control. A. Varga and S. Pieters (DLR Oberpfaffenhofen).
- Eigenvalue Problems and Singular Value Decomposition. Sanzheng Qiao (McMaster University), August 2012.
- Math 411 Preliminaries (Newton's method, Taylor series expansions, eigenvalues and eigenvectors).
- Lecture 2: Numerical Linear Algebra (QR factorization, eigenvalue and singular value decompositions, conditioning, floating-point arithmetic and stability).
- The Stable Embedding Problem. R. Zavala Yoé, C. Praagman, and H. L. Trentelman (University of Groningen).
- Using Godunov's Two-Sided Sturm Sequences to Accurately Compute Singular Vectors of Bidiagonal Matrices. A. M. Matsekh and E. P. Shurina.
- Solving Stable Sylvester Equations via Rational... Peter Benner, Enrique S. Quintana-Ortí, and Gregorio Quintana-Ortí (Sonderforschungsbereich 393).
- AMS526: Numerical Analysis I, Lecture 19: Computing the SVD; Sparse Linear Systems. Xiangmin Jiao (Stony Brook University).
- Communication Avoiding Parallel Algorithms for Dense Matrix Factorizations. Edgar Solomonik (UC Berkeley), October 2013.
- Week 6: Gaussian Elimination (course notes, edX).
- CME 302: Numerical Linear Algebra, Fall 2005/06, Lecture 0. Gene H. Golub.
- Lecture 11 (CMSC 878R/AMSC 698R): Iterative Methods, An Introduction.
- A Backward Stable Hyperbolic QR Factorization Method for Solving Indefinite Least Squares Problems. Hongguo Xu.
- NAG Library Routine Document F08PNF (ZGEES).
- Numerical Methods I: Eigenvalue Problems. Aleksandar Donev (Courant Institute, NYU), October 2, 2014.
- Scientific Computing: Direct Solution Methods. Martin van Gijzen (Delft University of Technology), October 3, 2018.
- On Checking Null Rank Conditions of Rational Matrices. Andreas Varga, arXiv (cs.SY), 29 December 2018.
- Numerical Methods I: Non-Square and Sparse Linear Systems. Aleksandar Donev (Courant Institute, NYU), September 25, 2014.