Fast Hessenberg QR Iteration for Companion Matrices

David Bindel, Ming Gu, David Garmire, James Demmel, Shivkumar Chandrasekaran

Motivation: Polynomial root finding
- A standard algorithm: QR iteration on a companion matrix
  - Robust software exists
  - It's normwise backward stable
  - It's used in Matlab
- But it takes $O(n^3)$ time and $O(n^2)$ storage

Companion matrices
Given
$$p(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \dots + a_1\lambda + a_0,$$
define a companion matrix $C$:
$$C = \begin{bmatrix}
-a_{n-1} & -a_{n-2} & -a_{n-3} & \cdots & -a_1 & -a_0 \\
1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 0 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & 0
\end{bmatrix}$$
For matrix polynomials, replace $a_i$ by $A_i$ and 1 by $I$.

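A minimal sketch of the definition above, assuming NumPy and the sign convention in which the first row holds the negated coefficients. The helper name `companion` and the cubic example are illustrative choices, not part of the talk.

```python
import numpy as np

def companion(a):
    """Companion matrix of the monic polynomial
    p(x) = x^n + a[n-1] x^(n-1) + ... + a[1] x + a[0].
    First row: -a[n-1], ..., -a[0]; ones on the subdiagonal."""
    a = np.asarray(a, dtype=float)
    n = a.size
    C = np.zeros((n, n))
    C[0, :] = -a[::-1]
    C[np.arange(1, n), np.arange(n - 1)] = 1.0
    return C

# Example: p(x) = (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
a = [-6.0, 11.0, -6.0]          # a_0, a_1, a_2
C = companion(a)

# The eigenvalues of C are the roots of p (here approximately 1, 2, 3).
roots = np.sort(np.linalg.eigvals(C).real)
print(C)
print(roots)
```

Calling a dense eigenvalue solver on `C` is the standard $O(n^3)$ approach that the talk sets out to accelerate.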
Hessenberg QR iteration
[Figure: animation frames illustrating Hessenberg QR iteration on a matrix with rows and columns indexed 0 to 11]

Accelerating QR
Have examples of how to do QR faster:
- Naive implementation ⇒ Hessenberg QR
- Tridiagonal QR (symmetric and skew problems)
- Unitary Hessenberg QR
Keys to fast QR:
- Find a structure preserved by the iteration
- Find a low-complexity representation for that structure

Rank symmetry
Call $A$ rank-symmetric if $\operatorname{rank}(A_{12}) = \operatorname{rank}(A_{21})$ for any 2-by-2 blocking of $A$ with square $A_{11}$ and $A_{22}$.
Examples:
- Symmetric matrices
- Skew-symmetric matrices
- Orthogonal matrices (by the CS decomposition)
- J-orthogonal matrices (by hyperbolic CS decomposition)
Examples of rank-symmetric + low rank:
- Companion matrices
- Problems which lose self-adjointness at boundary
- Linear oscillators with low-rank damping

Rank symmetric + low rank
Theorem: For $A$ rank-symmetric and $L$ with rank $k$, both square and blocked 2-by-2:
$$\bigl|\operatorname{rank}(A_{12} + L_{12}) - \operatorname{rank}(A_{21} + L_{21})\bigr| \le 2k$$
Proof: $\operatorname{rank}(L_{ij}) \le k$, so $\bigl|\operatorname{rank}(A_{ij} + L_{ij}) - \operatorname{rank}(A_{ij})\bigr| \le k$. Note $\operatorname{rank}(A_{ij}) = \operatorname{rank}(A_{ji})$ and apply the triangle inequality.

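Written out, the triangle-inequality step of the proof is the following chain, using only the two facts stated in the proof (the perturbation bound for each block and rank symmetry of $A$):

```latex
\begin{aligned}
\bigl|\operatorname{rank}(A_{12}+L_{12}) - \operatorname{rank}(A_{21}+L_{21})\bigr|
 &\le \bigl|\operatorname{rank}(A_{12}+L_{12}) - \operatorname{rank}(A_{12})\bigr| \\
 &\quad + \bigl|\operatorname{rank}(A_{12}) - \operatorname{rank}(A_{21})\bigr| \\
 &\quad + \bigl|\operatorname{rank}(A_{21}) - \operatorname{rank}(A_{21}+L_{21})\bigr| \\
 &\le k + 0 + k = 2k .
\end{aligned}
```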
Off-diagonal rank
Any matrix unitarily similar to $C$ is orthogonal + rank 1. Therefore:
- Hessenberg forms similar to $C$ have rank(super-diagonal block) $\le 3$.
- Real Schur forms similar to $C$ have rank(super-diagonal block) $\le 2$.
(For polynomial matrices, make that $3d$ and $2d$.)

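The Hessenberg bound can be checked numerically. The sketch below assumes NumPy, a random degree-12 polynomial, and a few unshifted QR steps as a simple way to produce a Hessenberg matrix orthogonally similar to $C$. The helper `numrank` and its tolerance are illustrative choices. The count behind the bound: the sub-diagonal block of a Hessenberg matrix has rank at most 1, so the orthogonal part has off-diagonal rank at most 2 in both blocks, and adding back the rank-1 term gives at most 3.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 12
a = rng.standard_normal(n)              # a_0, ..., a_{n-1}

# Companion matrix: negated coefficients in row 0, ones on the subdiagonal.
C = np.zeros((n, n))
C[0, :] = -a[::-1]
C[np.arange(1, n), np.arange(n - 1)] = 1.0

# C = Z + (rank 1), where Z is the cyclic shift (a permutation, hence orthogonal).
Z = np.roll(np.eye(n), 1, axis=0)

# A few unshifted QR steps: H <- R Q = Q^T H Q stays Hessenberg and similar to C.
H = C.copy()
for _ in range(5):
    Q, R = np.linalg.qr(H)
    H = R @ Q

scale = np.linalg.norm(C, 2)

def numrank(M, tol=1e-10):
    """Numerical rank: singular values above tol * ||C||_2."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > tol * scale))

ranks_upper = [numrank(H[:p, p:]) for p in range(1, n)]   # super-diagonal blocks
ranks_lower = [numrank(H[p:, :p]) for p in range(1, n)]   # sub-diagonal blocks
print("rank(C - Z):", numrank(C - Z))
print("max super-diagonal block rank:", max(ranks_upper))
print("max sub-diagonal block rank:", max(ranks_lower))
```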
SSS structure
Write Hessenberg matrices with low off-diagonal rank as narrow band matrix + SSS matrix.
$$H = B + \begin{bmatrix}
0 & U_1 V_2^T & U_1 W_2 V_3^T & U_1 W_2 W_3 V_4^T \\
0 & 0 & U_2 V_3^T & U_2 W_3 V_4^T \\
0 & 0 & 0 & U_3 V_4^T \\
0 & 0 & 0 & 0
\end{bmatrix}$$
Write $H_{ij} = U_i W_{i+1} \cdots W_{j-1} V_j^T$ for $i < j$, where $U_i \in \mathbb{R}^{m_b \times (2k+1)}$, $V_i \in \mathbb{R}^{m_b \times (2k+1)}$, and $W_i \in \mathbb{R}^{(2k+1) \times (2k+1)}$.
Make the block size $m_b \approx 2k + 1$. Total storage cost: $O(nk)$.

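To make the generator formula concrete, here is a small sketch that expands the strictly block-upper-triangular SSS part into a dense matrix. It assumes NumPy, 0-based block indices, and random generators with illustrative sizes (4 blocks of size 3, generator rank 2). The function name `sss_upper_dense` is hypothetical, and the band part $B$ is left out.

```python
import numpy as np

def sss_upper_dense(U, W, V):
    """Dense form of the strictly block-upper SSS part:
    block (i, j) = U[i] @ W[i+1] @ ... @ W[j-1] @ V[j].T for i < j (0-based)."""
    nb = len(U)
    sizes = [u.shape[0] for u in U]
    offs = np.concatenate(([0], np.cumsum(sizes)))
    S = np.zeros((offs[-1], offs[-1]))
    for i in range(nb):
        acc = U[i]                       # running product U_i W_{i+1} ... W_{j-1}
        for j in range(i + 1, nb):
            S[offs[i]:offs[i + 1], offs[j]:offs[j + 1]] = acc @ V[j].T
            acc = acc @ W[j]
    return S

rng = np.random.default_rng(0)
nb, m, r = 4, 3, 2                       # blocks, block size, generator rank
U = [rng.standard_normal((m, r)) for _ in range(nb)]
V = [rng.standard_normal((m, r)) for _ in range(nb)]
W = [rng.standard_normal((r, r)) for _ in range(nb)]

S = sss_upper_dense(U, W, V)

# Generators hold O(n r) numbers in total; the dense matrix holds n^2.
n_generators = sum(x.size for x in U + V + W)
print(S.shape, n_generators, S.size)
```

Every block taken from the first few block rows and the remaining block columns, such as `S[:6, 6:]`, has rank at most `r`, which is what allows the compact storage.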
SSS structure transformed
$$\begin{bmatrix}
B_{11} & U_1 V_2^T & U_1 W_2 V_3^T & U_1 W_2 W_3 V_4^T \\
B_{21} & B_{22} & U_2 V_3^T & U_2 W_3 V_4^T \\
 & B_{32} & B_{33} & U_3 V_4^T \\
 & & B_{43} & B_{44}
\end{bmatrix}$$
- Start with an SSS representation
- Bulge chase in $B_{11}$: changes $B_{11}$ and $U_1$
- Merge first two blocks and chase bulge across boundary
- Split blocks once bulge is within $B_{22}$
- Continue until Hessenberg form is restored

Row and column bases
Define $C_j$ and $G_j$ by the recurrences
$$C_1 = U_1, \qquad C_j = \begin{bmatrix} C_{j-1} W_j \\ U_j \end{bmatrix} \text{ for } j > 1,$$
$$G_n = V_n, \qquad G_j = \begin{bmatrix} V_j \\ G_{j+1} W_j^T \end{bmatrix} \text{ for } j < n.$$
$C_j$ and $G_{j+1}$ are column and row bases for the off-diagonal submatrix in block rows $1, \dots, j$ and block columns $j+1, \dots, n$, which equals $C_j G_{j+1}^T$.
- All $C_j$ orthonormal: left proper form.
- All $G_j$ orthonormal: right proper form.

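The two recurrences can be checked against the generator formula $H_{ij} = U_i W_{i+1} \cdots W_{j-1} V_j^T$. The sketch below assumes NumPy, 0-based block indices, and random generators of illustrative size. The helper names `block` and `offdiag` are hypothetical. It confirms that each off-diagonal block factors as a product of a column basis and a transposed row basis.

```python
import numpy as np

rng = np.random.default_rng(1)
nb, m, r = 4, 3, 2
U = [rng.standard_normal((m, r)) for _ in range(nb)]
V = [rng.standard_normal((m, r)) for _ in range(nb)]
W = [rng.standard_normal((r, r)) for _ in range(nb)]

# Column bases: C_1 = U_1, C_j = [C_{j-1} W_j; U_j]  (0-based lists here).
C = [U[0]]
for j in range(1, nb):
    C.append(np.vstack([C[j - 1] @ W[j], U[j]]))

# Row bases: G_n = V_n, G_j = [V_j; G_{j+1} W_j^T].
G = [None] * nb
G[nb - 1] = V[nb - 1]
for j in range(nb - 2, -1, -1):
    G[j] = np.vstack([V[j], G[j + 1] @ W[j].T])

def block(i, j):
    """Block (i, j), i < j, from the generators: U_i W_{i+1} ... W_{j-1} V_j^T."""
    acc = U[i]
    for t in range(i + 1, j):
        acc = acc @ W[t]
    return acc @ V[j].T

def offdiag(s):
    """Dense submatrix in block rows 0..s and block columns s+1..nb-1."""
    return np.block([[block(i, j) for j in range(s + 1, nb)]
                     for i in range(s + 1)])

errors = [np.linalg.norm(offdiag(s) - C[s] @ G[s + 1].T) for s in range(nb - 1)]
print(errors)   # each entry should be at rounding level
```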
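Splitting blocks
Start with $C_{i-1}$ and $G_{i+1}$ orthonormal.
$$\begin{bmatrix} H_{1,i} & \cdots & H_{1,n} \\ \vdots & & \vdots \\ H_{i,i} & \cdots & H_{i,n} \end{bmatrix}
= \begin{bmatrix} C_{i-1} & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} V_i^T & W_i \\ H_{ii} & U_i \end{bmatrix}
\begin{bmatrix} I & 0 \\ 0 & G_{i+1}^T \end{bmatrix}$$
Splitting block $j$ into sub-blocks $\alpha$ and $\beta$:
$$\begin{bmatrix} V_j^T & W_j \\ B_{jj} & U_j \end{bmatrix}
\rightarrow
\begin{bmatrix}
(V_j^{\alpha})^T & W_j^{\alpha} (V_j^{\beta})^T & W_j^{\alpha} W_j^{\beta} \\
B_{jj}^{\alpha\alpha} & U_j^{\alpha} (V_j^{\beta})^T & U_j^{\alpha} W_j^{\beta} \\
B_{jj}^{\beta\alpha} & B_{jj}^{\beta\beta} & U_j^{\beta}
\end{bmatrix}$$
- Do pivoted QR or SVD decomposition to split $U$, $V$, $W$.
- Choose $\begin{bmatrix} W_j^{\alpha} \\ U_j^{\alpha} \end{bmatrix}$ with orthonormal columns.
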
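The SVD variant of the split amounts to a rank factorization whose left factor has orthonormal columns. The sketch below is not the talk's implementation. It assumes NumPy and a synthetic matrix `M` of rank `r` standing in for the upper-right 2-by-2 group of entries in the split matrix, which equals $\begin{bmatrix} W^{\alpha} \\ U^{\alpha} \end{bmatrix} \begin{bmatrix} (V^{\beta})^T & W^{\beta} \end{bmatrix}$. The sizes and the name `split_orthonormal` are hypothetical.

```python
import numpy as np

def split_orthonormal(M, r):
    """Factor a matrix of rank <= r as L @ R, with orthonormal columns in L (SVD-based)."""
    Q, s, Vt = np.linalg.svd(M, full_matrices=False)
    L = Q[:, :r]                         # orthonormal columns
    R = s[:r, None] * Vt[:r, :]          # singular values folded into the right factor
    return L, R

rng = np.random.default_rng(2)
r, m_alpha, m_beta = 3, 4, 5             # generator rank and the two sub-block sizes

# Synthetic rank-r stand-in for the block [W^a V^bT, W^a W^b; U^a V^bT, U^a W^b].
M = rng.standard_normal((r + m_alpha, r)) @ rng.standard_normal((r, m_beta + r))

L, R = split_orthonormal(M, r)
W_alpha, U_alpha = L[:r, :], L[r:, :]            # stacked [W^alpha; U^alpha]
Vt_beta, W_beta = R[:, :m_beta], R[:, m_beta:]   # [(V^beta)^T, W^beta]

print(np.linalg.norm(L.T @ L - np.eye(r)), np.linalg.norm(L @ R - M))
```

Keeping the stacked factor orthonormal is what preserves left proper form: if $C_{j-1}$ has orthonormal columns, then so does $\begin{bmatrix} C_{j-1} W_j^{\alpha} \\ U_j^{\alpha} \end{bmatrix}$.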
Balancing
- Want to balance matrix to improve accuracy.
- But general diagonal scaling destroys structure!
However: $|C|$ is a companion matrix with Perron vector $x$ (Perron root $\lambda$). Let
$$x_i = \lambda^{-i}, \qquad D = \operatorname{diag}(x_i^{-1}) = \operatorname{diag}(\lambda^{i}).$$
Then $D C D^{-1}$:
- Is optimally balanced in the $\infty$-norm.
- Is $\lambda \cdot$ (orthogonal + low rank).
- Is scaling the polynomial indeterminate by $\lambda$.

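A small numerical check of this scaling, assuming NumPy, an arbitrary quartic, and the reading that $\lambda$ is the Perron root (largest eigenvalue) of $|C|$ with $D = \operatorname{diag}(\lambda^i)$. The helper `companion` and the coefficients are illustrative. After scaling, every absolute row sum equals $\lambda$, and $DCD^{-1}/\lambda$ is the companion matrix of the polynomial with its indeterminate scaled by $\lambda$.

```python
import numpy as np

def companion(a):
    """Companion matrix with -a[n-1], ..., -a[0] in row 0 and a unit subdiagonal."""
    a = np.asarray(a, dtype=float)
    n = a.size
    C = np.zeros((n, n))
    C[0, :] = -a[::-1]
    C[np.arange(1, n), np.arange(n - 1)] = 1.0
    return C

a = np.array([2.0, -3.0, 0.5, -40.0])    # a_0, ..., a_3 (arbitrary example)
n = a.size
C = companion(a)

# Perron root of the nonnegative matrix |C|: its largest (real, positive) eigenvalue.
lam = np.max(np.linalg.eigvals(np.abs(C)).real)

# D = diag(lam^i), i = 1..n; form B = D C D^{-1} entrywise.
d = lam ** np.arange(1, n + 1)
B = (d[:, None] * C) / d[None, :]

row_sums_before = np.abs(C).sum(axis=1)
row_sums_after = np.abs(B).sum(axis=1)   # all equal to lam up to rounding

# Coefficients of p(lam * z) / lam^n: a_m / lam^(n - m).
a_scaled = a / lam ** (n - np.arange(n))

print(row_sums_before)
print(row_sums_after, lam)
```

Because $D$ is determined by the single scalar $\lambda$, the scaled matrix is still $\lambda$ times a companion matrix, so the orthogonal-plus-rank-one structure survives the balancing.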
Implementation
Modified DLAHQR (basic double-shift QR) to use the new data structure.
- Convert to right proper form.
- Choose shifts (logic from LAPACK).
- Bulge chase top to bottom / convert to left proper form.
- Check for convergence / deflate (logic from LAPACK).
Most of the code accesses data in the band part of the structure. This text is identical to the standard code; just change LDA.

Performance
- Compared against LAPACK DLAHQR and DHSEQR
- Compiled using g77/gcc with -O2 flag
- LAPACK with standard optimizations, ATLAS BLAS
- P4 processor at 1.5 GHz
- Polynomials of degree 100-2000.

Performance: QR-based
[Figure: "Polynomial roots", time (s) versus polynomial degree (0 to 2000), linear axes, for Fast DLAHQR (QR), DLAHQR, and DHSEQR]

Performance: QR-based
[Figure: "Polynomial roots", time (s) versus polynomial degree, log-log axes, for Fast DLAHQR (QR), DLAHQR, and DHSEQR]

Performance: SVD-based
[Figure: "Polynomial roots", time (s) versus polynomial degree (0 to 2000), linear axes, for Fast DLAHQR (SVD), DLAHQR, and DHSEQR]

Performance: SVD-based
[Figure: "Polynomial roots", time (s) versus polynomial degree, log-log axes, for Fast DLAHQR (SVD), DLAHQR, and DHSEQR]

Conclusions
- Hessenberg({symmetric, skew, unitary} + low rank) has low off-diagonal rank.
- Applies to (matrix) polynomial root finding, some damped vibration calculations.
- Can use low-rank structure for fast bulge-chasing.
- Can re-use most existing QR lore and code.
- Appears backward stable with SVD-based splitting.