Solving Large Scale Lyapunov Equations

Solving Large Scale Lyapunov Equations D.C. Sorensen Thanks: Mark Embree John Sabino Kai Sun NLA Seminar (CAAM 651) IWASEP VI, Penn State U., 22 May 2006

The Lyapunov Equation AP + PA T + BB T = 0 where A R n n, B R n p, p << n Assumptions: A is stable and A + A T 0 P = P T 0 Computation in Dense Case (Schur Decomposition ) Bartels and Stewart (72), Hammarling (82) Factored Form Soln. P = LL T D.C. Sorensen 2

Outline Model Reduction for Dynamical Systems Balanced Truncation and Lyapunov Equations APM and ADI for Large Scale Lyapunov Equations A Parameter Free Algorithm A Special Sylvester Equation Solver Implementation Issues Computational Results

LTI Systems and Model Reduction Time Domain ẋ = Ax + Bu y = Cx A R n n, B R n m, C R p n, n >> m, p Frequency Domain sx = Ax + Bu y = Cx Transfer Function H(s) C(sI A) 1 B, y(s) = H(s)u(s) D.C. Sorensen 4

Model Reduction Construct a new system {Â, ˆB, Ĉ} with LOW dimension k << n ˆx = Âˆx + ˆBu ŷ = Ĉˆx Goal: Preserve system response ŷ should approximate y Projection: x(t) = Vˆx(t) and V ˆx = AVˆx + Bu D.C. Sorensen 5

Model Reduction by Projection Krylov Projection Often Used Approximate x S V = Range(V) k-diml. subspace i.e. Put x = Vˆx, and then force W T [V ˆx (AVˆx + Bu)] = 0 ŷ = CVˆx If W T V = I k, then the k dimensional reduced model is ˆx = Âˆx + ˆBu ŷ = Ĉˆx where Â = W T AV, ˆB = W T B, Ĉ = CV. D.C. Sorensen 6

Balanced Reduction (Moore 81) Lyapunov Equations for system Gramians AP + PA T + BB T = 0 A T Q + QA + C T C = 0 With P = Q = S : Want Gramians Diagonal and Equal States Difficult to Reach are also Difficult to Observe Reduced Model A k = W T k AV k, B k = W T k B, C k = C k V k PVk = W k S k QW k = V k S k Reduced Model Gramians Pk = S k and Q k = S k. D.C. Sorensen 7

Hankel Norm Error estimate (Glover 84) Why Balanced Truncation? Hankel singular values = λ(pq) Model reduction H error (Glover) y ŷ 2 2 (sum neglected singular values) u 2 Extends to MIMO Preserves Stability Key Challenge Approximately solve large scale Lyapunov Equations in Low Rank Factored Form D.C. Sorensen 8

When are Low Rank Solutions Expected = Eigenvalue Decay 10 10 Evals of P, N=276 10 0 10 10 10 20 10 30 10 40 10 50 Evals σ 1 *1e 8 10 60 0 50 100 150 200 250 300

Eigenvalues of P Often Decay Rapidly Estimates from Analysis: Penzl (00), Symmetric A, must know κ(a) Zhou, Antoulas, S (02), Nonsymmetric A, must know σ(a) Beattie Sabino and Embree (in progress) D.C. Sorensen 10

Approximate Balancing AP + PA T + BB T = 0 A T Q + QA + C T C = 0 Sparse Case: Iteratively Solve in Low Rank Factored Form, P U k U T k, Q L kl T k [X, S, Y] = svd(u T k L k) W k = LY k S 1/2 k and V k = UX k S 1/2 k. Now: PW k V k S k and QV k W k S k D.C. Sorensen 11

Balanced Reduction via Projection Reduced model of order k: A k = W T k AV k, B k = W T k B, C k = CV k. 0 = W T k (AP + PAT + BB T )W k = A k S k + S k A T k + B kb T k 0 = V T k (AT Q + QA + C T C)V k = A T k S k + S k A k + C T k C k Reduced model is balanced and asymptotically stable for every k. D.C. Sorensen 12

Large Scale Lyapunov Equations Iterative - Wachpress Krylov - Saad; Hu and Reichel Quadrature - Gudmundsson and Laub Subspace Iteration - Hodel and Tennison, VanDooren, Y. Zhou and S. Balanced Reduction Large Scale: Lanczos - Golub and Boley Restarting Kasenally et. al. Low Rank Lyapunov Penzl,Li, White, Benner D.C. Sorensen 13

Low Rank Smith Method R.A. Smith (68) Variant of ADI Wachpress(88) Calvetti, Levenberg, Reichel (97) Penzl (99) (00) Li, White, et. al. (00) Gugercin, Antoulas, S (03) D.C. Sorensen 14

ADI for Lyapunov Equations From original equation Apply shift µ > 0 from left AP + PA T + BB T = 0 P = (A µi) 1 [ P(A + µi) T + BB T ] Apply shift µ from right (Alternate Direction) [ ] P = (A + µi)p + BB T (A µi) T Combine these into one step: Get a Stein Equation D.C. Sorensen 15

Low Rank Smith = ADI Convert to Stein Equation: AP + PA T + BB T = 0 P = A µ PA T µ + B µ B T µ, where A µ = (A + µi)(a µi) 1, B µ = 2 µ (A µi) 1 B. Solution: P = A j µb µ B T µ(a j µ) T = LL T, j=0 where L = [B µ, A µ B µ, A 2 µb µ,... ] Factored Form D.C. Sorensen 16

Convergence - Low Rank Smith P m = m A j µb µ B T µ(a j µ) T j=0 Easy: Note: ρ(a µ ) < 1. P m+1 = A µ P m A T µ + B µ B T µ E m+1 = A µ E m A T µ = A m+2 µ P(A m+2 µ ) T 0, where E m = P P m. D.C. Sorensen 17

Modified Low Rank Smith LR - Smith: Update Factored Form P m = L m L T m: (Penzl) L m+1 = [A µ L m, B µ ] = [A m+1 µ B µ, L m ] Modified LR - Smith: Update and Truncate SVD Re-Order and Aggregate Shift Applications Much Faster and Far Less Storage B A µ B; [V, S, Q] = svd([a µ B, L m ]); L m+1 V k S k ; (σ k+1 < tol σ 1 ) D.C. Sorensen 18

Modified Low Rank Smith - Multishift Bm = B; L = []; for j = 1:kshifts, mu = -u(j); rho = sqrt(2*mu); Bm = ((A - mu*eye(n))\bm); L = [L rho*bm]; for j = 1:ksteps, Bm = (A + mu*eye(n))*((a - mu*eye(n))\bm); L = [L rho*bm]; end Bm = (A + mu*eye(n))*bm; end [V,S,Q] = svd(l,0); L = V(:,1:k)*S(1:k,1:k); 10

Performance of Modified Low Rank Smith P S P MS P S < tol, Q S Q MS Q S < tol. tol = SVD drop tol, S = LR Smith, MS = Modified LR Smith Storage - Cols. U k, P U k U T k Prob. n LR Smith Modified CD120 120 70 25 ISS 270 210 106 Penzl 1006 300 19 D.C. Sorensen 20

ISS module comparison k = 26, n = 270 10 20 30 Sigma Plot of the ISS Example FOM Balanced LR Smith Mod. LR Smith 40 50 60 70 80 90 100 110 10 2 10 1 10 0 10 1 10 2 10 3

Approximate Power Method (Hodel) APU + PA T U + BB T U = 0 APU + PUU T A T U + BB T U + P(I UU T )A T U = 0 Thus APU + PUH T + BB T U 0 where H = U T AU Solving AZ + ZH T + BB T U = 0 gives approximation to Z PU Iterate Approximate Power Method Z j US with PU = US

A Parameter Free Synthesis (P US 2 U T ) Step 1: (APM step) Solve a projected Sylvester equation AZ + ZH T + BˆB T = 0, with H = U k T AU k, ˆB = Uk T B. Step 2: Solve the reduced order Lyapunov equation Solve H ˆP + ˆPH T + ˆBˆB T = 0. Step 3: Modify B Update B (I Z ˆP 1 U T )B. Step 4: (ADI step) Update factorization and basis U k Re-scale Z Z ˆP 1/2. Update (and truncate) [U, S] svd[us, Z]. U k U(:, 1 : k), basis for dominant subspace. D.C. Sorensen 23

Monotone Convergence of P j Recall P j+1 = P j + Z j ˆP 1 j Z T j P j. Key Lemma AP j + P j A T + BB T = B j B T j for j = 1, 2,... where B j+1 = (I Z j ˆP 1 j U T j )B j with B 1 = B, P 1 = 0 Montone Convergence! This implies P j P j+1 P for j = 1, 2,... lim P j = P o P, j monotonically D.C. Sorensen 24

Implementation Details Note: AP j + P j A T + BB T = B j B T j gives Residual Norm for Free! Stopping Rules: 1. 2. P j+1 P j 2 P j+1 2 = Z 1/2 j ˆP j 2 2 S j+1 2 2 B j 2 B 2 tol tol Must monitor and control cond( ˆP): Truncate SVD( ˆP j ) WŜ2 W T 1) Z j Z j WŜ 1 and 2) B j+1 = B j Z j Ŝ 1 U T j B j Convergence Theory still valid D.C. Sorensen 25

Solving the projected Sylvester equation: Must solve: W.L.O.G.... H T AZ + ZH T + BˆB T = 0, = R = (ρ ij ) is in Schur form. for j = 1:k, Solve (A ρ jj I)z j = BˆB T e j j 1 i=1 z iρ i,j ; end Main Cost: Sparse Direct Factorization LU = P(A ρ jj I)Q Alternative: Note Z = XY 1 where [ A M ] [ X 0 H T Y ] = [ X Y ] ˆR D.C. Sorensen 26

Low Rank Smith on Sylvester Convert to Stein Equation: AZ + ZH T + BˆB T = 0 Z = A µ ZH T µ + B µ ˆB T µ, where A µ = (A + µi)(a µi) 1, B µ = 2 µ (A µi) 1 B. H µ = (H + µi)(h µi) 1, ˆB µ = 2 µ (H µi) 1 ˆB. Solution: Z = A j µb µ ˆB T µ(h j µ) T j=0 D.C. Sorensen 27

Use Multi=Shift variant of: Avantages L.R. Smith Z = A j µb µ ˆB T µ(h j µ) T j=0 Note: Spectral Radius ρ(a µ ) < 1 regardless of µ. Choose shifts to minimize ρ(hµ ) We use Bagby ordering of eigenvalues of H (cf: Levenberg and Reichel,93). Monitor H j µ ˆB µ during iteration Big Avantage: Apply k shifts (roots of H) with few factorizations A µi. D.C. Sorensen 28

Problem: SUPG discretization Problem Description function [A,b]=supg(N, theta, nu, delta) SUPG: matrix and RHS from SUPG discretization of the advection-diffusion operator on square grid of bilinear finite elements. Mark Embree: Essentially distilled from the IFISS software of Elman and Silvester. Fischer, Ramage, Silvester, Wathen: Towards Parameter- Free Streamline Upwinding for Advection-Diffusion Problems Comput. Methods Appl. Mech. Eng., 179:185-202, 1999.

Automatic Shift Selection - Placement? k = 8, m = 59, n = 32*32, Thanks Embree Comparison Eigenvalues A vs H 0.03 0.02 0.01 0 0.01 0.02 0.03 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 0.01 eigenvalues A eigenvalues small H eigenvalues full H

Automatic Shift Selection - Placement? k = 16, m = 59, n = 32*32, Thanks Embree Comparison Eigenvalues A vs H 0.03 0.02 0.01 0 0.01 0.02 0.03 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 0.01 eigenvalues A eigenvalues small H eigenvalues full H

Automatic Shift Selection - Placement? k = 32, m = 59, n = 32*32, Thanks Embree Comparison Eigenvalues A vs H 0.03 0.02 0.01 0 0.01 0.02 0.03 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 0.01 eigenvalues A eigenvalues small H eigenvalues full H

Automatic Shift Selection - Placement? k = 20, m = 76, n = 32*32, 0.03 0.02 Comparison Eigenvalues A vs H comp e vals A e vals full H e vals H 0.01 0 0.01 0.02 0.03 0.06 0.05 0.04 0.03 0.02 0.01 0

ɛ-pseudospectra for A from SUPG, n=32*32 3 0.03 0.02 0.01 0 0.01 0.02 0.03 4 5 6 7 8 9 10 11 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0 0.01 12

Convergence History, Supg, n = 32, N = 1024 Laptop Iter P + P P + B j ˆB j 1 2.7e-1 1.6e+0 4.7e+0 2 7.2e-2 1.6e-1 1.5e+0 3 6.6e-3 1.1e-2 1.3e-1 4 3.7e-4 3.5e-7 1.3e-2 5 9.3e-9 1.4e-11 3.4e-7 P f is rank k = 59 Comptime(P f ) = 16.2 secs P P f P = 8.8e 9 Comptime(P) = 810 secs D.C. Sorensen 35

Convergence History, Supg, n = 800, N = 640,000 CaamPC Iter P + P P + B j ˆB j 6 1.3e-01 2.5e+00 2.4e+00 7 7.5e-02 1.1e+00 1.2e+00 8 3.5e-02 6.74e-01 5.0e-01 9 2.0e-02 1.2e-02 6.7e-01 10 2.0e-04 7.1e-07 1.2e-02 11 1.0e-08 2.3e-11 6.4e-07 P f is rank k = 120 Comptime(P f ) = 157 mins = 2.6 hrs D.C. Sorensen 36

Power Grid Delivery Network (thanks Kai Sun) Modified Nodal Analysis (MNA) Description Eẋ(t) = Ax(t) + Bu(t), x an N vector of node voltages and inductor currents, A is an N N conductance matrix, E (diagonal) represents capacitance and inductance terms, u(t) includes the loads and voltage sources. The size N can be an order of millions for normal power grids. D.C. Sorensen 37

RLC Circuit Diagram V DD

Power Grid Results, N = 1,048,340 CaamPC Pf is rank k = 17 No Inductance Computational time = 77.52 min = 1.29 hrs. Residual Norm 9.5e-06 Sabino Code: Optimal shifts, much faster!! T Symmetric problem A = A Modified Smith with Multi-shift strategy Optimal shifts D.C. Sorensen 39

Power Grid Results, N = 1821, With Inductance CaamPC Problem is Non-Symmetric In Sylvester Used 4 real shifts, 30 iters per shift P f is rank k = 121 Comptime(P f ) = 11.4 secs P P f P = 2.5e 6 Comptime(P) = 370 secs D.C. Sorensen 40

Automatic Shift Selection - Placement? Power Grid, k = 20, m = 121, N = 1821, 2 Kai Comp. Eigenvalues A vs H, k=20, n=1821 1.5 1 0.5 0 0.5 1 comp e vals A e vals full H e vals H 1.5 2 1.8 1.6 1.4 1.2 1 0.8 0.6 0.4 0.2 0

Convergence History Evals of P j Power Grid, N = 1821, 40 History Dominant Evals of P 38 36 34 32 30 28 s1 s2 s3 s4 s5 s6 s7 s8 26 24 22 20 1 2 3 4 5 6 7

Decay Rate for Evals of P Power Grid, N = 1821, 10 5 Eigenvalues P, N=1821 10 0 10 5 10 10 10 15 10 20 10 25 10 30 Eigenvalues P σ 1 *1e 8 σ 1 *1e 4 σ 1 *1e 2 10 35 10 40 0 200 400 600 800 1000 1200 1400 1600 1800 2000

Convergence History, Power Grid, N = 48,892 CaamPC Iter P + P P + B j ˆB j 4 2.3e-01 8.5e+00 8.2e+01 5 5.6e-02 2.2e+00 8.2e+00 6 8.0e-02 1.1e+00 2.2e+00 7 6.4e-02 2.4e-03 1.1e+00 8 7.4e-05 3.0e-06 2.3e-03 9 5.8e-08 1.9e-08 2.4e-06 P f is rank k = 307 Comptime(P f ) = 28 mins At N = 196K we run out of memory due to rank of P. D.C. Sorensen 44

Contact Info. Shift Selection and Decay Rate Estimation J. Sabino, Ph.D. Thesis in progress (Embree ) e-mail: sorensen@rice.edu Papers and Software available at: My web page: Model Reduction: www.caam.rice.edu/ sorensen/ www.caam.rice.edu/ modelreduction/ ARPACK: www.caam.rice.edu/software/arpack/