The Newton-ADI Method for Large-Scale Algebraic Riccati Equations Mathematik in Industrie und Technik Fakultät für Mathematik Peter Benner benner@mathematik.tu-chemnitz.de Sonderforschungsbereich 393 S N SIMULATION Workshop über Matrixgleichungen MPI Leipzig, 15. April 2005
Outline Outline Motivation Large-scale algebraic Riccati equations The Newton-ADI method Numerical results Conclusions and open problems SIMULATIONSN Peter Benner 2
Motivation Motivation Numerical solution of lq optimal control problem for parabolic systems via Galerkin approach, spatial FEM discretization finite-dim. LQR problem Minimize J(u) = 1 2 Z y T Qy + u T Ru dt for u L 2 (0, ; R m ), 0 where Mẋ = Lx + Bu, x(0) = x 0, y = Cx, with stiffness matrix L R n n, mass matrix M R n n, B R n m, C R p n. Solution of finite-dimensional LQR problem: u (t) = B T P x(t) =: K x(t), where P 0 is stabilizing solution of the algebraic Riccati equation (ARE) 0 = R(P) := C T C + A T P + PA PBB T P, with A := M 1 L, B := M 1 BR 1 2, C := CQ 1 2. SIMULATIONSN Peter Benner 3
Large-scale AREs Large-Scale Algebraic Riccati Equations General form for A,G = G T,Q = Q T R n n given and P R n n unknown: 0 = R(P) := Q + A T P + PA PGP Here, control-theoretic assumptions ensure existence of unique stabilizing solution P = P T 0, i.e., Λ (A GP ) C. In large scale applications from semi-discretized control problems for PDEs, n = 10 3 10 6 (= 10 6 10 12 unknowns!), A has sparse representation (A = M 1 L), G,Q low-rank with G = BB T, B R n m, m n, Q = C T C, C R p n, p n. Standard (eigenproblem-based) O(n 3 ) methods are not applicable! SIMULATIONSN Peter Benner 4
Large-scale AREs Low-Rank Approximation Consider spectrum of ARE solution. Example: Linear 1D heat equation with point control, Ω = [ 0, 1 ], FEM discretization using linear B-splines, h = 1/100 = n = 101. λ k 10 0 10 5 10 10 eigenvalues of P h for h=0.01 10 15 eps 10 20 0 20 40 60 80 100 Index k Idea: P = P T 0 = P = ZZ T = n λ k z k zk T Z (r) (Z (r) ) T = k=1 r λ k z k zk T. k=1 SIMULATIONSN Peter Benner 5
Large-scale AREs Low-Rank Krylov Subspace Methods Block-Arnoldi method [Jaimoukha/Kasenally 94] Consider 0 = R(P) = CC T + AP + PA T PBB T P. 1. Apply (block-)arnoldi process to A with start (block-)vector C to generate the Krylov space K l (A, C) = span{c, AC, A 2 C,..., A l 1 C} with orthogonal basis V l such that AV l = V l A l + W l+1 A l+1,l» 0 and A l = Vl TAV l is block upper-hessenberg. 2. Set B l := Vl TB, C l := Vl TC. 3. Find stabilizing solution of the ARE 4. Set P l := V l X l V T l. 0 = R l (X l ) = C l C T l + A lx l + X l A T l X lb l B T l X l. Ip SIMULATIONSN Peter Benner 6
Large-scale AREs Properties: + P l satisfies Galerkin-type condition V T l R(P l)v l = 0 + Computable residual error norm R(P l ) F = 2 A l+1,l [ 0 I p ] X l F. Block-Arnoldi, i.e., each step needs p matrix-vector products. Stabilizing X l may not exist as corresponding Hamiltonian matrix H l := [ A T l B l B T l C l C T l may have purely imaginary eigenvalues! No stabilization guarantee for P l! No convergence results for residuals. A l ] SIMULATIONSN Peter Benner 7
Large-scale AREs Hamiltonian Lanczos algorithm [Freund/Mehrmann 92, Ferng/Lin/Wang 95, B./Faßbender 95] Consider 0 = R(P) = Q + A T P + PA PGP. to gen- 1. Apply symplectic Lanczos method to Hamiltonian matrix erate Krylov space [ A Q ] G A T with symplectic basis K 2l (H, v 1 ) = span{v 1, Hv 1, H 2 v 1,..., H 2l 1 v 1 }, S l = [v 1, w 1,..., v l, w l ] R 2n,2l,»» 0 In S l = S T l In 0 0 I l I l 0 such that HS l = S l H l + ζ l+1 v l+1 e T 2l. 2. Compute ARE solution corresponding to H l, prolongate to R n n. SIMULATIONSN Peter Benner 8
Large-scale AREs Properties: + P l satisfies Galerkin-type condition P l P T l for l < n. V T l R(P l )V l = 0 + In general less matrix-vector products than for block-arnoldi. + Purely imaginary eigenvalues of small Hamiltonian matrix H l can be removed by cheap implicit restarts, i.e., can always get stable H l -invariant subspace. + Stabilization property for projected feedback matrix V T l (A GP l )V l for sufficiently small Lanczos residual (can be achieved by implicit restarts). No stabilization guarantee for P l. No convergence results for residuals. SIMULATIONSN Peter Benner 9
Large-scale AREs Related work: Two-sided structured Arnoldi method for Hamiltonian matrices based on symplectic URV decomposition/product eigenvalue problem. [Xu 97, Kreßner 04] Improved generation of low-rank approximate solution from r-dimensional (r n) H-invariant subspace range(w) via W 1 := W(1:n,:), W 2 = W(n+1:2n,:) [U,Σ, V ] = svd(w T 1 W 2 ), X = W 2 US 1 2. [B./Mehrmann/Sorensen 03, Amodei 03] Variants of Arnoldi using Frobenius inner product (global Arnoldi). [Jbilou 03] SIMULATIONSN Peter Benner 10
Newton-ADI for AREs Newton s Method for AREs Consider 0 = R(P) = C T C + A T P + PA PBB T P. Frechét derivative of R(P) at P: R P : Z (A BBT P) T Z+Z(A BB T P). Newton-Kantorovich method: P j+1 = P j ( R P j ) 1 R(Pj ), j = 0,1, 2,... = Newton s method (with line search) for AREs (for given P 0 = P0 T stabilizing): FOR j = 0, 1,... 1. A j A BB T P j =: A BK j. 2. Solve the Lyapunov equation A T j N j + N j A j = R(P j ). 3. P j+1 P j + t j N j. END FOR j [Kleinman 68, Mehrmann 91, Lancaster/Rodman 95, B./Byers 94/ 98, B. 97, Guo/Laub 99] SIMULATIONSN Peter Benner 11
Newton-ADI for AREs Properties and Implementation Convergence for Λ (A BK 0 ) stabilizing: A j = A BK j = A BB T P j is stable j 1. lim j R(P j ) F = 0 (monotonically). lim j P j = P 0 (locally quadratic). Need large-scale Lyapunov solver; here, ADI iteration [Wachspress 88]: P j = lim k Q (j) k, obtain Q(j) k by solving linear system coefficient matrix A j : A j = A B K j = sparse m m n = efficient inversion using Sherman-Morrison-Woodbury formula: (A BK j ) 1 = (I n + A 1 B(I m K j A 1 B) 1 K j )A 1. BUT: P = P T R n n = n(n + 1)/2 unknowns! SIMULATIONSN Peter Benner 12
Newton-ADI for AREs Low-rank Newton-ADI for AREs Re-write Newton s method for AREs [Kleinman 68] A T j N j + N j A j = R(P j ) A T j (P j + N j ) + (P {z } j + N j ) A {z } j = C T C P j BB T P {z } j =P j+1 =P j+1 =: W j W T j Set P j = Z j Z T j for rank (Z j ) n = A T j Z j+1 Z T j+1 + Z j+1 Z T j+1 A j = W j W T j Solve Lyapunov equations for Z j+1 directly by factored ADI iteration and use sparse + low-rank structure of A j. [B./Li/Penzl 99 ] SIMULATIONSN Peter Benner 13
Newton-ADI for AREs ADI Method for Lyapunov Equations For F R n n stable, W R n w (w n), consider Lyapunov equation F T X + XF = BB T. ADI Iteration: [Wachspress 88] (F T + p k I)X (j 1)/2 = BB T X k 1 (F p k I) (F T + p k I)X k T = BB T X (j 1)/2 (F p k I) with parameters p k C and p k+1 = p k if p k R. For X 0 = 0 and proper choice of p k : lim k X k = X superlinear. Re-formulation using X k = Y k Y T k yields iteration for Y k... SIMULATIONSN Peter Benner 14
Newton-ADI for AREs Set X k = Y k Y T k Factored ADI Iteration [Penzl 97, Li/Wang/White 99, B./Li/Penzl], some algebraic manipulations = V 1 p 2Re (p 1 )(F T + p 1 I) 1 B, Y 1 V 1 FOR j = 2, 3,... r V k `I (pk + p k 1 )(F T + p k I) 1 V k 1, Y k ˆ Y k 1 V k where Re (p k ) Re (p k 1 ) Y kmax = [ V 1... V kmax ] V k = C n w and Y kmax Yk T max X Note: Implementation in real arithmetic possible by combining two steps. SIMULATIONSN Peter Benner 15
Newton-ADI for AREs Direct Feedback Iteration LQR problem: compute feedback matrix directly! jth Newton iteration: K j = B T Z j Z T j = k max k=1 (B T V j,k )V T j,k j K = B T Z Z T K j can be updated in ADI iteration, no need to even form Z j, need only fixed workspace for K j R m n! Cost (# Newton iterations) mean (# ADI iterations) (cost for solving the elliptic problem) SIMULATIONSN Peter Benner 16
Numerical Results Numerical Results Linear 2D heat equation with homogeneous Dirichlet boundary and distributed control/observation. FD discretization on uniform 150 150 grid. n = 22.500, m = p = 1, 10 shifts for ADI iterations. 50 Inner (ADI) Iterations 10 0 Relative Change in Feedback Matrix ADI iterations 40 30 20 10 K j+1 K j F / K j F 10 5 10 10 0 1 2 3 4 5 6 7 8 9 10 11 12 13 Newton steps 10 15 0 2 4 6 8 10 12 14 Newton steps SIMULATIONSN Peter Benner 17
Conclusions and Open Problems Conclusions and Open Problems Solution of LQR problems for parabolic PDEs via low-rank factor Newton-ADI method is efficient and reliable. Riccati-approach applicable to many control problems for linear evolution equations. Open Problems: Efficient stopping criteria. Efficient implementation of line search strategy for large-sparse AREs. Newton-ADI method for H control and other problems with indefinite Hessian possible? SIMULATIONSN Peter Benner 18