Structure preserving Krylov-subspace methods for Lyapunov equations

Matthias Bollhöfer, André Eppler
Institute Computational Mathematics, TU Braunschweig
MoRePas Workshop, Münster, September 17, 2009
System Reduction for Nanoscale IC Design

Overview 1 Introduction 2 3 4


SyreNe project: goals
- Develop and compare methods for system reduction in the design of high-dimensional nanoelectronic integrated circuits (ICs).
- Test these methods in the practice of semiconductor development.
- Two complementary approaches:
  - reduction of the whole system by a global method
  - creation of reduced-order models for single devices and large linear sub-circuits

Generalized projected Lyapunov equations

$$E X A^T + A X E^T = -P_l B B^T P_l^T, \qquad X = P_r^T X P_r \tag{1}$$
$$E^T Y A + A^T Y E = -P_r^T C^T C P_r, \qquad Y = P_l Y P_l^T \tag{2}$$

- $E, A \in \mathbb{R}^{n \times n}$, $B, C^T \in \mathbb{R}^{n \times n_s}$
- equations arising from the work group of T. Stykel
- $E$ singular, $n_s \ll n$
- existence and uniqueness of the solution proved

Generalized Lyapunov equations

$$E X A^T + A X E^T = -\tilde B \tilde B^T \tag{1}$$
$$E^T Y A + A^T Y E = -\tilde C^T \tilde C \tag{2}$$

- $E, A \in \mathbb{R}^{n \times n}$, $\tilde B, \tilde C^T \in \mathbb{R}^{n \times n_s}$

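Definitions

Let $\mathcal{A} := E \otimes A + A \otimes E$ be the Lyapunov operator and let $\operatorname{vec}: \mathbb{R}^{n \times n} \to \mathbb{R}^{n^2}$ be the operator that stacks the columns of a matrix into one column vector, $\mathcal{X} = \operatorname{vec}(X)$.

Rewrite the Lyapunov equations (1), (2) as linear systems
$$\mathcal{A}\mathcal{X} = \mathcal{B}, \qquad \mathcal{A}\mathcal{Y} = \mathcal{C}$$
with $\mathcal{B} = \operatorname{vec}(-\tilde B \tilde B^T)$ and $\mathcal{C} = \operatorname{vec}(-\tilde C^T \tilde C)$.

- Problem: the dimension of these systems is $n^2$.
- Good news: $E$ and $A$ are usually sparse when dealing with circuit equations.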
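The vectorized form can be checked numerically on a tiny example. The following is a minimal sketch, assuming column-stacking `vec` and dense NumPy arrays; the helper names `lyapunov_operator` and `vec` are illustrative and not taken from the slides.

```python
import numpy as np

def vec(X):
    """Stack the columns of X into one long vector (column-major order)."""
    return X.reshape(-1, order="F")

def lyapunov_operator(E, A):
    """n^2 x n^2 matrix of the map X -> E X A^T + A X E^T under column stacking."""
    # vec(E X A^T) = (A kron E) vec(X) and vec(A X E^T) = (E kron A) vec(X)
    return np.kron(A, E) + np.kron(E, A)

rng = np.random.default_rng(0)
n = 4
E = rng.standard_normal((n, n))
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))

calA = lyapunov_operator(E, A)
kron_gap = np.linalg.norm(calA @ vec(X) - vec(E @ X @ A.T + A @ X @ E.T))
print(calA.shape, kron_gap)
```

Already for `n = 4` the operator is a 16 x 16 matrix, which illustrates why forming it explicitly is not an option for large `n`.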

Structure preservation principle
- The right-hand side is low rank and symmetric.
- These properties transfer to the solution $X$ of (1).
- An iterative solver has to keep that structure in each step.
- Possible: Krylov-subspace methods, which only need linear combinations of vectors and applications of the Lyapunov operator.
- Use the factorization $X = V Z V^T$.

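Structure preservation - linear combination

Let $X_1 = V_1 Z_1 V_1^T$ and $X_2 = V_2 Z_2 V_2^T$ be low-rank matrices. Then
$$X_3 = \alpha_1 X_1 + \alpha_2 X_2 = \alpha_1 V_1 Z_1 V_1^T + \alpha_2 V_2 Z_2 V_2^T = \underbrace{\begin{pmatrix} V_1 & V_2 \end{pmatrix}}_{V_3} \underbrace{\begin{pmatrix} \alpha_1 Z_1 & 0 \\ 0 & \alpha_2 Z_2 \end{pmatrix}}_{Z_3} \underbrace{\begin{pmatrix} V_1 & V_2 \end{pmatrix}^T}_{V_3^T}$$
again has a low-rank factorization.

Rank estimate: $\operatorname{rank}(X_3) \le \operatorname{rank}(X_1) + \operatorname{rank}(X_2)$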
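The block construction for a linear combination can be written in a few lines. This is a minimal sketch with dense NumPy arrays; the function name `lowrank_combine` is an assumption made for illustration.

```python
import numpy as np

def lowrank_combine(alpha1, V1, Z1, alpha2, V2, Z2):
    """Return (V3, Z3) with V3 Z3 V3^T = alpha1 V1 Z1 V1^T + alpha2 V2 Z2 V2^T."""
    r1, r2 = Z1.shape[0], Z2.shape[0]
    V3 = np.hstack([V1, V2])            # concatenate the two bases
    Z3 = np.zeros((r1 + r2, r1 + r2))   # block-diagonal core matrix
    Z3[:r1, :r1] = alpha1 * Z1
    Z3[r1:, r1:] = alpha2 * Z2
    return V3, Z3

rng = np.random.default_rng(1)
n = 8
V1, V2 = rng.standard_normal((n, 2)), rng.standard_normal((n, 3))
Z1, Z2 = np.diag([1.0, 2.0]), np.diag([3.0, -1.0, 0.5])

V3, Z3 = lowrank_combine(2.0, V1, Z1, -0.5, V2, Z2)
X3_dense = 2.0 * V1 @ Z1 @ V1.T - 0.5 * V2 @ Z2 @ V2.T
combine_gap = np.linalg.norm(V3 @ Z3 @ V3.T - X3_dense)
print(V3.shape, combine_gap)
```

The dense matrix `X3_dense` is formed only to verify the factors; in a solver only `V3` and `Z3` would be stored.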

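Structure preservation - apply Lyapunov operator

Let $X = V Z V^T$ be a low-rank matrix. Then
$$X_a = \mathcal{A}X = E V Z V^T A^T + A V Z V^T E^T = \underbrace{\begin{pmatrix} EV & AV \end{pmatrix}}_{V_a} \underbrace{\begin{pmatrix} 0 & Z \\ Z & 0 \end{pmatrix}}_{Z_a} \underbrace{\begin{pmatrix} EV & AV \end{pmatrix}^T}_{V_a^T},$$
so again a low-rank factorization exists.

Rank estimate: $\operatorname{rank}(X_a) \le 2 \operatorname{rank}(X)$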
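Applying the Lyapunov operator in factored form only needs the products $EV$ and $AV$. A minimal sketch, assuming dense NumPy arrays; the name `apply_lyapunov_lowrank` is illustrative.

```python
import numpy as np

def apply_lyapunov_lowrank(E, A, V, Z):
    """Return (Va, Za) with Va Za Va^T = E V Z V^T A^T + A V Z V^T E^T."""
    r = Z.shape[0]
    Va = np.hstack([E @ V, A @ V])   # the number of columns doubles
    Za = np.zeros((2 * r, 2 * r))    # anti-block-diagonal core matrix
    Za[:r, r:] = Z
    Za[r:, :r] = Z
    return Va, Za

rng = np.random.default_rng(2)
n, r = 7, 2
E = rng.standard_normal((n, n))
A = rng.standard_normal((n, n))
V = rng.standard_normal((n, r))
Z = np.diag([1.5, -0.5])

Va, Za = apply_lyapunov_lowrank(E, A, V, Z)
X = V @ Z @ V.T
apply_gap = np.linalg.norm(Va @ Za @ Va.T - (E @ X @ A.T + A @ X @ E.T))
print(Va.shape, apply_gap)
```

With sparse `E` and `A`, the two products cost only sparse matrix times tall matrix, and no `n` by `n` matrix is ever formed.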


ADI: general properties
- iterative method for solving $\mathcal{A}\mathcal{X} = \mathcal{B}$
- needs shift parameters $\tau_j$, which are essential for the convergence behavior and difficult to compute
- can be applied to solve Lyapunov equations
- initial iterate $X_0 = 0$



ADI for generalized Lyapunov equations

Standard ADI recursion:
$$(E + \tau_j A) X_{j-\frac12} = -\tilde B \tilde B^T - X_{j-1} (E - \tau_j A)^T \tag{3}$$
$$(E + \tau_j A) X_j = -\tilde B \tilde B^T - X_{j-\frac12}^T (E - \tau_j A)^T \tag{4}$$
Use (3) and (4) to obtain
$$X_j = -2\tau_j (E + \tau_j A)^{-1} \tilde B \tilde B^T (E + \tau_j A)^{-T} + (E + \tau_j A)^{-1} (E - \tau_j A) X_{j-1} (E - \tau_j A)^T (E + \tau_j A)^{-T}. \tag{5}$$
This symmetric sum can be factored as $X_j = Z_j Z_j^T$.
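Recursion (5) can be tried out on a small dense problem. The sketch below rests on several assumptions that are not from the slides: $E$ is a nonsingular diagonal matrix, $A$ is symmetric negative definite, a single real negative shift is reused, and the right-hand side is $-BB^T$. The explicit inverse is used only to keep the illustration short.

```python
import numpy as np

def adi_step(E, A, B, X_prev, tau):
    """One step of recursion (5) for E X A^T + A X E^T = -B B^T (small dense case)."""
    S = np.linalg.inv(E + tau * A)   # (E + tau A)^{-1}, explicit only for illustration
    T = E - tau * A
    return -2.0 * tau * S @ B @ B.T @ S.T + S @ T @ X_prev @ T.T @ S.T

rng = np.random.default_rng(3)
n, ns = 5, 2
E = np.diag(np.linspace(1.0, 2.0, n))            # assumption: nonsingular test matrix
M = rng.standard_normal((n, n))
A = -(2.0 * np.eye(n) + 0.05 * (M + M.T))        # symmetric negative definite
B = rng.standard_normal((n, ns))

# Reference solution from the n^2 x n^2 Kronecker system (feasible only for tiny n)
K = np.kron(A, E) + np.kron(E, A)
rhs = (-B @ B.T).reshape(-1, order="F")
X_ref = np.linalg.solve(K, rhs).reshape((n, n), order="F")

# The exact solution is a fixed point of (5) for any admissible shift
fixed_point_gap = np.linalg.norm(adi_step(E, A, B, X_ref, -1.0) - X_ref)

X = np.zeros((n, n))                             # X_0 = 0
for _ in range(30):
    X = adi_step(E, A, B, X, -1.0)
adi_rel_error = np.linalg.norm(X - X_ref) / np.linalg.norm(X_ref)
print(fixed_point_gap, adi_rel_error)
```

The fixed-point check follows from the identity $(E + \tau A) X (E + \tau A)^T - (E - \tau A) X (E - \tau A)^T = 2\tau (E X A^T + A X E^T)$.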

CF-ADI [Li, White 04]

The Cholesky Factor Alternating Direction Implicit iteration computes the Cholesky factor $Z$ of the solution $X = Z Z^T$.

Algorithm
1. Compute shift parameters $\tau_1, \dots, \tau_j$.
2. $z_1 = \sqrt{-2\tau_1}\,(E + \tau_1 A)^{-1} \tilde B$, $Z = [z_1]$
3. For $i = 2, \dots, j$: $z_i = P_{i-1} z_{i-1}$ with
$$P_i = \frac{\sqrt{-2\tau_{i+1}}}{\sqrt{-2\tau_i}} \left[ I - (\tau_{i+1} + \tau_i)(E + \tau_{i+1} A)^{-1} \right],$$
   and $Z = [Z \;\; z_i]$.
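A factored ADI iteration of this kind might be sketched as follows. This is not the authors' implementation: the column update is written in the form obtained by unrolling the dense recursion $X_j = -2\tau_j S_j B B^T S_j^T + S_j T_j X_{j-1} T_j^T S_j^T$ with $S_j = (E + \tau_j A)^{-1}$ and $T_j = E - \tau_j A$, namely $z_i = \sqrt{\tau_i / \tau_{i-1}}\, S_i T_{i-1} z_{i-1}$. The sketch assumes real negative shifts, a nonsingular $E$ in the test problem, and dense solves; the name `cf_adi` and the test data are hypothetical.

```python
import numpy as np

def cf_adi(E, A, B, shifts):
    """Low-rank factor Z with Z Z^T approximating X in E X A^T + A X E^T = -B B^T."""
    shifts = np.asarray(shifts, dtype=float)
    if np.any(shifts >= 0.0):
        raise ValueError("this sketch assumes real, negative shifts")
    z = np.sqrt(-2.0 * shifts[0]) * np.linalg.solve(E + shifts[0] * A, B)
    blocks = [z]
    for prev, cur in zip(shifts[:-1], shifts[1:]):
        # z_i = sqrt(tau_i / tau_{i-1}) (E + tau_i A)^{-1} (E - tau_{i-1} A) z_{i-1}
        z = np.sqrt(cur / prev) * np.linalg.solve(E + cur * A, (E - prev * A) @ z)
        blocks.append(z)
    return np.hstack(blocks)          # n x (ns * number of shifts)

rng = np.random.default_rng(4)
n, ns = 5, 2
E = np.diag(np.linspace(1.0, 2.0, n))            # assumption: nonsingular test matrix
M = rng.standard_normal((n, n))
A = -(2.0 * np.eye(n) + 0.05 * (M + M.T))        # symmetric negative definite
B = rng.standard_normal((n, ns))
shifts = [-0.5, -1.0, -2.0] * 5                  # hypothetical shifts, reused cyclically

Z = cf_adi(E, A, B, shifts)

# Dense reference: the same shifts applied through the unfactored recursion
X_dense = np.zeros((n, n))
for tau in shifts:
    S = np.linalg.inv(E + tau * A)
    T = E - tau * A
    X_dense = -2.0 * tau * S @ B @ B.T @ S.T + S @ T @ X_dense @ T.T @ S.T

X_lr = Z @ Z.T
cf_adi_gap = np.linalg.norm(X_lr - X_dense)
residual = E @ X_lr @ A.T + A @ X_lr @ E.T + B @ B.T
cf_adi_rel_res = np.linalg.norm(residual) / np.linalg.norm(B @ B.T)
print(Z.shape, cf_adi_gap, cf_adi_rel_res)
```

Each shift appends `ns` columns to `Z`, so the column count grows linearly with the number of shifts even when the numerical rank of the solution stays small.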


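Krylov-subspace method with flexible preconditioning
- iterative method for solving $\mathcal{A}\mathcal{X} = \mathcal{B}$
- allows flexible preconditioning in each step
- calculates an orthonormal basis of the $m$-dimensional Krylov space
$$\mathcal{K}_m(R_0, \mathcal{A}) = \operatorname{span}(R_0, \mathcal{A}R_0, \dots, \mathcal{A}^{m-1}R_0)$$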

Flexible GMRES algorithm [Saad 92]

Initialize: choose $X_0$ and the dimension $m$.

Arnoldi process:
1. Compute $R_0 = \mathcal{B} - \mathcal{A}X_0$, $h_{1,0} = \|R_0\|$, $V_1 = R_0 / h_{1,0}$.
2. For $j = 1, \dots, m$:
   a) compute $W_j = M^{-1} V_j$
   b) compute $V_{j+1} = \mathcal{A} W_j$
   c) for $i = 1, \dots, j$ (modified Gram-Schmidt): $h_{i,j} = (V_{j+1}, V_i)$, $V_{j+1} = V_{j+1} - h_{i,j} V_i$;
      then $h_{j+1,j} = \|V_{j+1}\|$, $V_{j+1} = V_{j+1} / h_{j+1,j}$
3. Let $\mathcal{W}_m := [W_1 \dots W_m]$. Compute the solution $X_m = X_0 + \mathcal{W}_m Y_m$, where $Y_m$ minimizes $\| h_{1,0} e_1 - \bar H_m Y \|$.

Restart: if not converged, set $X_0 = X_m$ and repeat.
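The steps above can be sketched for ordinary vectors with the Euclidean inner product; in the low-rank setting of these slides each basis vector would instead be a factored matrix. The callback `precond(v, j)`, standing in for a preconditioner that may change with the step index `j`, and all other names are assumptions made for illustration.

```python
import numpy as np

def fgmres(apply_A, b, x0, m, precond, tol=1e-10, max_restarts=50):
    """Restarted flexible GMRES sketch for vectors; precond(v, j) may vary with j."""
    x = x0.copy()
    n = b.shape[0]
    b_norm = np.linalg.norm(b)
    for _ in range(max_restarts):
        r = b - apply_A(x)
        beta = np.linalg.norm(r)
        if beta <= tol * b_norm:
            break
        V = np.zeros((n, m + 1))
        W = np.zeros((n, m))
        H = np.zeros((m + 1, m))
        V[:, 0] = r / beta
        k = m
        for j in range(m):
            W[:, j] = precond(V[:, j], j)      # store the preconditioned vector
            w = apply_A(W[:, j])
            for i in range(j + 1):             # modified Gram-Schmidt
                H[i, j] = w @ V[:, i]
                w = w - H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] < 1e-14:            # breakdown: stop this cycle early
                k = j + 1
                break
            V[:, j + 1] = w / H[j + 1, j]
        rhs = np.zeros(k + 1)
        rhs[0] = beta
        # small least-squares problem: minimize || beta e_1 - H y ||
        y = np.linalg.lstsq(H[:k + 1, :k], rhs, rcond=None)[0]
        x = x + W[:, :k] @ y                   # update uses W, not V
    return x

rng = np.random.default_rng(5)
n = 8
A = 4.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
b = rng.standard_normal(n)
d = np.diag(A).copy()

# hypothetical step-dependent preconditioner: scaled Jacobi
x = fgmres(lambda v: A @ v, b, np.zeros(n), m=3,
           precond=lambda v, j: v / (d * (1.0 + 0.1 * j)))
fgmres_rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(fgmres_rel_res)
```

The difference to standard GMRES is that the preconditioned vectors `W[:, j]` are stored and used for the update, which is what permits a different preconditioner in every step.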


Rank truncation strategy
1. Obtain a starting factorization $V Z V^T$.
2. Compute the QR factorization $V = QR$.
3. Compute the eigenvalue decomposition (EVD) $R Z R^T = U \Sigma U^T$.
4. Truncate the rank to a given relative tolerance ($tol_p$, $tol_r$) and keep only the first $r$ columns: $\hat U = U(r)$, $\hat\Sigma = \Sigma(r)$.

Result: $X \approx \hat V \hat\Sigma \hat V^T$ with $\hat V = Q \hat U$, where $\hat V$ has orthonormal columns and $\hat\Sigma$ is a diagonal matrix.
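The four steps can be sketched with NumPy as follows. The slides mention two tolerances, $tol_p$ and $tol_r$, without specifying their roles here, so this sketch uses a single hypothetical relative tolerance `tol` on the eigenvalue magnitudes; the function name is likewise an assumption.

```python
import numpy as np

def truncate_lowrank(V, Z, tol):
    """Compress V Z V^T to Vh diag(s) Vh^T with orthonormal Vh (Z symmetric)."""
    Q, R = np.linalg.qr(V)                # step 2: reduced QR, V = Q R
    S = R @ Z @ R.T                       # small symmetric core matrix
    S = 0.5 * (S + S.T)                   # remove rounding asymmetry
    sigma, U = np.linalg.eigh(S)          # step 3: EVD of R Z R^T
    order = np.argsort(-np.abs(sigma))    # sort by magnitude, largest first
    sigma, U = sigma[order], U[:, order]
    r = int(np.sum(np.abs(sigma) > tol * np.abs(sigma[0])))   # step 4
    return Q @ U[:, :r], sigma[:r]

rng = np.random.default_rng(6)
V1 = rng.standard_normal((10, 3))
V = np.hstack([V1, V1])                   # six columns, but only rank 3
Z = np.eye(6)
X_full = V @ Z @ V.T

Vh, s = truncate_lowrank(V, Z, tol=1e-12)
trunc_gap = np.linalg.norm(Vh @ np.diag(s) @ Vh.T - X_full)
print(Vh.shape, trunc_gap)
```

The cost is dominated by the QR factorization of the tall matrix `V`; the eigenvalue problem only involves a matrix whose size is the current number of columns.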

Preconditioning

Properties:
- must preserve the structure (important)
- may vary in each step
- should increase the convergence rate
- possible candidate: CF-ADI

Apply one cycle of CF-ADI to $E W_j A^T + A W_j E^T = V_j$.

Overview of section 4: Number of shifts, Perturbed shifts, Rank truncation

Number of shifts
- rank X = 18
- CF-ADI is more sensitive to the number of shift parameters.

Number of shifts
- A minimum number of shifts is necessary.
- Increasing the number beyond that does not yield further gains.
- Convenient range: 10 to 30 shifts.
- Optimal number? (problem dependent)
- The Krylov method is less sensitive to the specific number.

Conjecture: A lower number of shifts is sufficient compared to pure CF-ADI.

Perturbed shifts
- shifts perturbed as $\tau_i \cdot \delta$ with $\delta \in [0.99, 1.01]$ chosen randomly, 12 shifts
- CF-ADI is much more sensitive to perturbation of the shift parameters.

Rank truncation
- rank X = 18, $tol_r$ = 1e-12, 12 shifts
- A certain accuracy of $tol_p$ is needed for convergence (problem dependent).
- Similar results hold for $tol_r$.

Summary

Benefits:
- Structure preservation (low rank, symmetry) is fulfilled.
- The algorithm is less sensitive to perturbations and to the number of ADI shift parameters $\tau_j$.

Problems:
- Convergence strongly depends on CF-ADI.
- The number of columns of the CF-ADI iterates increases in each step, but the rank only increases slightly.
- Rank truncation is expensive.

Remedy: rank compression or stopping criteria within CF-ADI are needed.

Outlook

Possible improvements:
- compress ranks within ADI
- compute a faster low-rank factorization
- improve the rank truncation strategy (parameters $tol_p$ and $tol_r$)
- use different Krylov-subspace methods

BMBF joint project SyreNe

Project members: Dipl.-Math. techn. Thomas Mach, Prof. Dr. Michael Hinze, Dipl.-Technomath. Martin Kunkel, Prof. Dr. Heike Faßbender, Juan Amorocho M.Sc., Dr. Patrick Lang, Dipl.-Math. Oliver Schmidt, Dr. Tatjana Stykel, Dr. Andreas Steinbrecher, Dipl.-Math. techn. André Eppler, Prof. Dr. Matthias Bollhöfer, Dipl.-Math. techn. André Schneider, Prof. Dr. Peter Benner

Participating institutions: TU Berlin, TU Chemnitz, ITWM Kaiserslautern, TU Braunschweig, Universität Hamburg