Algorithms for exact (dense) linear algebra
Gilles Villard, CNRS, Laboratoire LIP, ENS Lyon
Montagnac-Montpezat, June 3, 2005
Introduction
Problem: study of complexity estimates for basic problems in exact linear algebra over K[x] and Z.
Deterministic, Monte Carlo (non-certified), and Las Vegas (certified) randomized algorithms.
Time complexity up to logarithmic factors, e.g. O~(n^c) = O(n^c log^α n). (Space complexity.) Introduction 1
Models
Algebraic complexity: K a commutative field, algebraic RAM with +, −, ×, /. Over K[x]: arithmetic operations in K.
Bit complexity: over Z or Q, bitwise computational cost. Introduction 2
Motivations Complexity estimates with concrete entry domain Better understanding of linear algebra in bit complexity Improved algorithms for exact (or accurate) results Introduction 3
The talk Introduction I - Known reductions between problems II - Divide-double and conquer III - Matrix polynomials IV - Integer matrices Conclusion
Algebraic complexity over K [survey and more in Bürgisser et al. 1997, Ch. 16]
Asymptotic equivalence to matrix multiplication.
Matrix multiplication of n × n matrices A · B; determinant, inversion, rank, characteristic polynomial, ... O(n^ω): O(n^3) or O(n^2.38). Algorithms in O~(n^ω).
LinSys ≡ MM ≡ Det Known reductions between problems 4
Example 1. Det_K = Inversion_K
Proof. Derivative inequality [Baur and Strassen, 1983].
det A = a_11 · det A_{2..n,2..n} + ..., hence ∂(det A)/∂a_11 = det A_{2..n,2..n}, and more generally
A^{-1} = (1/det A) · [∂(det A)/∂a_{j,i}]_{i,j} Known reductions between problems 5
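The Baur–Strassen identity above can be checked directly on a small example. A minimal Python sketch (not from the talk; the matrix `A` is an arbitrary example): the (i, j) entry of A^{-1} equals ∂(det A)/∂a_{j,i} divided by det A.

```python
from fractions import Fraction

def det(M):
    """Exact determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def partial(M, i, j):
    """d(det M)/d(m_ij) = the (i, j) cofactor of M."""
    minor = [row[:j] + row[j+1:] for k, row in enumerate(M) if k != i]
    return (-1) ** (i + j) * det(minor)

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
d = det(A)
# Baur-Strassen identity: (A^{-1})_{i,j} = (1/det A) * d(det A)/d(a_{j,i})
Ainv = [[Fraction(partial(A, j, i), d) for j in range(3)] for i in range(3)]
# check that A * Ainv is the identity
for i in range(3):
    for j in range(3):
        s = sum(A[i][k] * Ainv[k][j] for k in range(3))
        assert s == (1 if i == j else 0)
```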
Does not carry over to the polynomial or bit complexity case.
x, y two vectors with fixed-size entries, c an n-bit integer.
Compute φ = c · x^T y: O~(n) bit operations.
Compute [∂φ/∂x_i]_{1≤i≤n} = c · y: O(n^2) bit operations. Known reductions between problems
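A rough numeric illustration of the obstruction (a sketch, with arbitrary example values): the scalar φ fits in about n bits, while the n gradient entries occupy Θ(n^2) bits in total, so the derivative reduction cannot stay within O~(n) bit operations.

```python
# c: an n-bit integer; x, y: vectors with O(1)-bit entries.
n = 64
c = (1 << (n - 1)) + 7          # an n-bit integer
x = [3] * n
y = [5] * n

phi = c * sum(xi * yi for xi, yi in zip(x, y))   # scalar output: ~n bits
grad = [c * yi for yi in y]                      # gradient [d phi/d x_i] = c*y

assert phi.bit_length() <= n + 2 * n.bit_length() + 8   # ~n bits in total
total_grad_bits = sum(g.bit_length() for g in grad)
assert total_grad_bits >= n * n                         # ~n^2 bits in total
```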
Example 2. MM_K = CharPoly_K
Proof. Minimum polynomial [Keller-Gehrig, 1985]. A ∈ K^{n×n}, u ∈ K^n.
A · u → [u, Au]; A^2 · [u, Au] → [u, Au, A^2 u, A^3 u]; A^4 · [u, Au, A^2 u, A^3 u] → ... (repeated squaring) ... → [u, Au, A^2 u, A^3 u, ..., A^d u]
A^d u + c_{d−1} A^{d−1} u + ... + c_0 u = 0  ⟹  π(x) = x^d + c_{d−1} x^{d−1} + ... + c_0 Known reductions between problems
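The repeated-squaring construction of the Krylov matrix can be sketched as follows (illustrative Python, not the talk's implementation; the 3 × 3 matrix is an arbitrary example): the column block doubles at each step, so only about log n matrix products are needed.

```python
def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def krylov(A, u):
    """Keller-Gehrig: build [u, Au, A^2 u, ...] with log(n) squarings of A."""
    n = len(A)
    K = [[x] for x in u]                      # current block, starts as the column u
    P = A                                     # P = A^(2^t)
    while len(K[0]) < n:
        PK = mat_mul(P, K)                    # P times the whole current block
        K = [K[i] + PK[i] for i in range(n)]  # doubles the number of columns
        P = mat_mul(P, P)                     # square for the next step
    return [row[:n] for row in K]             # keep the first n Krylov vectors

A = [[0, 1, 0], [0, 0, 1], [6, -11, 6]]
u = [1, 0, 0]
K = krylov(A, u)

# check the columns against the naive iteration u, Au, A^2 u
v = u[:]
for j in range(3):
    assert [K[i][j] for i in range(3)] == v
    v = mat_vec(A, v)
```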
Does not carry over to the polynomial or bit complexity case.
A with entries of size log‖A‖: A^{n/2} has entries of size O(n log‖A‖).
The multiplication by A^{n/2} costs O~(n^ω · n log‖A‖) = O~(n^{ω+1} log‖A‖). Known reductions between problems 8
Impact of data size? Ex. determinant:
computation/output size: nd, or O~(n log‖A‖) bits;
evaluation/interpolation or homomorphic scheme: nd points, or O~(n log‖A‖) bits a priori, at n^ω each.
Complexity estimates: O~(n^ω · nd) = O~(n^{ω+1} d), or O~(n^{ω+1} log‖A‖). Known reductions between problems 9
MM(n, d) = O~(n^ω d): cost for multiplying n × n matrices of degree d.
MM(n, log‖A‖) = O~(n^ω log‖A‖): cost for multiplying n × n integer matrices.
(General case: consider generalized functions MM.)
The previous analysis shows that the determinant may be computed in O~(n · MM(n, d)) or O~(n · MM(n, log‖A‖)) operations, i.e. in n corresponding matrix products. Known reductions between problems 10
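A homomorphic-scheme sketch in Python (illustrative; the matrix and the three primes are arbitrary choices, assumed large enough to cover the determinant's size): compute det A modulo several primes and recombine by Chinese remaindering.

```python
def det_mod_p(M, p):
    """Determinant by Gaussian elimination over Z/pZ."""
    M = [[x % p for x in row] for row in M]
    n, d = len(M), 1
    for i in range(n):
        piv = next((r for r in range(i, n) if M[r][i]), None)
        if piv is None:
            return 0
        if piv != i:
            M[i], M[piv] = M[piv], M[i]
            d = -d                              # a row swap flips the sign
        inv = pow(M[i][i], -1, p)
        d = d * M[i][i] % p
        for r in range(i + 1, n):
            f = M[r][i] * inv % p
            M[r] = [(M[r][c] - f * M[i][c]) % p for c in range(n)]
    return d % p

def crt(residues, mods):
    """Combine residues via the Chinese Remainder Theorem (Garner-style)."""
    x, m = 0, 1
    for r, p in zip(residues, mods):
        x += m * ((r - x) * pow(m, -1, p) % p)
        m *= p
    return x, m

A = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
primes = [101, 103, 107]                        # product exceeds |det A| here
x, m = crt([det_mod_p(A, p) for p in primes], primes)
d = x if x <= m // 2 else x - m                 # lift to the symmetric range
assert d == -90
```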
Fundamentals of dense linear algebra over K[x] or Z (1967–2000):
Monte Carlo rank: O(n^ω + n^2 log‖A‖)
System solution (Hensel lifting) [Moenck & Carter 79, Dixon 82]: O~(n^3 log‖A‖)
Determinant, inversion, nullspace, rank, ... [Edmonds 67, Bareiss 68, Moenck & Carter 79]: O~(n · MM(n, log‖A‖)), deterministic
Frobenius form (minimum, characteristic polynomial) [Giesbrecht 93, Giesbrecht & Storjohann 02]: O~(n · MM(n, log‖A‖)), Las Vegas
Hermite and Smith forms (diophantine systems) [Kannan & Bachem 79, Domich 85, Giesbrecht 95, Storjohann 96-00]: O~(n · MM(n, log‖A‖)), deterministic Known reductions between problems 11
Bit complexity ≲ algebraic complexity × output size.
Is this bound pessimistic?
Clue. The output length may not be needed a priori, i.e. at the beginning of the computation, but only at its very end. Known reductions between problems 12
Reduction to matrix multiplication
Question: which dense linear algebra problems over K[x] or Z can be solved by algorithms using about the same number of operations as for multiplying two corresponding matrices, plus the input/output size? Known reductions between problems 13
The talk Introduction I - Known reductions between problems II - Divide-double and conquer III - Matrix polynomials IV - Integer matrices Conclusion
Triangularization in log_2(n) steps: L_{log n} ... L_2 L_1 A = T Divide-double and conquer 14
Divide and conquer [Strassen 1969, Schönhage 1973, Bunch & Hopcroft 1974]
[ I 0 ; −BA^{-1} I ] · [ A C ; B D ] = [ A C ; 0 D − BA^{-1}C ]
At the next step: dimension divided by two, entry size multiplied by n/2. Divide-double and conquer 15
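One elimination step can be checked on a toy 4 × 4 example (a Python sketch with arbitrary 2 × 2 blocks, exact rational arithmetic): multiplying on the left by [[I, 0], [−BA^{-1}, I]] zeroes the lower-left block and leaves the Schur complement D − BA^{-1}C.

```python
from fractions import Fraction

def mul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def inv2(M):
    """Exact inverse of a 2x2 matrix over Q."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# blocks of a 4x4 matrix [[A, C], [B, D]], as in the elimination step
A = [[2, 1], [1, 3]]
C = [[1, 0], [0, 1]]
B = [[4, 0], [0, 4]]
D = [[5, 1], [1, 5]]

BAinv = mul(B, inv2(A))
BAinvC = mul(BAinv, C)
schur = [[D[i][j] - BAinvC[i][j] for j in range(2)] for i in range(2)]
# the lower-left block of the product is B - (B A^{-1}) A = 0
BAinvA = mul(BAinv, A)
lower_left = [[B[i][j] - BAinvA[i][j] for j in range(2)] for i in range(2)]
assert lower_left == [[0, 0], [0, 0]]
```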
Divide-double and conquer [Jeannerod & Villard 2002, Storjohann 2002]
The dimension is divided by two while the entry size is at most doubled.
Cost: Σ_{i=1}^{log n} (n/2^i)^ω · 2^i d = O~(n^ω d) Divide-double and conquer 1
(Left) nullspace? N A = 0, for the 8 × 4 integer matrix
A = [ 85 55 3 35 ; 49 3 5 59 ; 43 2 50 12 ; 18 31 91 4 ; 1 41 94 83 ; 8 23 53 85 ; 49 8 8 30 ; 80 2 … ] Divide-double and conquer 1
Gaussian elimination (Schur complement):
N_g = [ 410 152550 3980 2955220 … ; 15181842 1389422 0 401840 … ; 280458 4081928 0 1881120 … ; 438828 4023028 0 3583510 … ]
However, one can choose instead (and compute over K[x]):
N = [ 25 32 1 38 1 30 32 33 ; 2 8 43 23 1 1 55 1 ; 4 10 43 28 95 50 30 53 ; 5 23 25 12 182 90 40 3 ] Divide-double and conquer 19
Example: for inversion and nullspace. A(x) ∈ K[x]^{2n×n} of degree d.
Divide and conquer (Gaussian elimination): N(x) A(x) = 0, with N(x) ∈ K[x]^{n×2n} of degree O(nd); total size O(n^3 d).
Divide-double and conquer (minimal module bases): M(x) A(x) = 0, with M(x) ∈ K[x]^{n×2n} of degree sum O(nd), e.g. generically degree d (Kronecker indices); total size O(n^2 d). Divide-double and conquer 20
(Example ctd) Stacking for the nullspace:
n/2 nullspace vectors of degree less than 2d, then n/4 nullspace vectors of degree less than 4d, ..., and in general p/2 nullspace vectors of degree less than 2nd/p; each stacked block, multiplied by the corresponding n × p block of A, gives 0. Divide-double and conquer 21
Theorem: The rank r of M ∈ K[x]^{n×n} (degree d) and n − r independent nullspace vectors can be computed in O~(MM(n, d)) = O~(n^ω d) operations in K by a randomized Las Vegas (certified) algorithm. [Storjohann & Villard, 2005] Divide-double and conquer 22
The talk Introduction I - Known reductions between problems II - Divide-double and conquer III - Matrix polynomials IV - Integer matrices Conclusion
Matrix polynomial inversion, degree d over K[x]
Over K: input/output size n^2; cost n^3 or n^ω; A^{-1} = B.
Over K(x): output degree nd, output size n^3 d; cost (e.g. Newton): n^4 d or n^{ω+1} d.
A^{-1}(x) = A*(x)/det A(x) = (B_0 + xB_1 + ... + x^{nd−1} B_{nd−1})/det A(x), with A*(x) the adjugate. Matrix polynomials 23
Using minimal bases
Theorem: The inverse of a generic polynomial matrix (degree d) can be computed in essentially optimal time O~(n^3 d). [Jeannerod & Villard, 2002-2005]
Corollary: For a generic A ∈ K^{n×n}, the matrix powers A, A^2, ..., A^n can be computed in essentially optimal time O~(n^3). Hint: (I − xA)^{-1} = Σ_{i≥0} x^i A^i. Matrix polynomials 24
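The hint can be made concrete with a truncated power-series computation (a Python sketch, arbitrary 2 × 2 example): Newton iteration X ← X(2I − MX) doubles the precision at each step, and the coefficient of x^i in (I − xA)^{-1} is exactly A^i.

```python
def zeros(n):
    return [[0] * n for _ in range(n)]

def madd(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def mmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def series_mul(F, G, k):
    """Product of matrix power series (lists of coefficient matrices), mod x^k."""
    n = len(F[0])
    H = [zeros(n) for _ in range(k)]
    for i in range(min(k, len(F))):
        for j in range(min(k - i, len(G))):
            H[i + j] = madd(H[i + j], mmul(F[i], G[j]))
    return H

A = [[1, 2], [3, 4]]
I2 = [[1, 0], [0, 1]]
M = [I2, [[-a for a in row] for row in A]]    # M(x) = I - xA

X = [I2]                                      # X = M^{-1} mod x^1
k = 1
while k < 8:
    k *= 2                                    # Newton step: X <- X(2I - MX) mod x^k
    MX = series_mul(M, X, k)
    T = [[[-a for a in row] for row in C] for C in MX]
    T[0] = madd(T[0], [[2, 0], [0, 2]])       # T = 2I - MX
    X = series_mul(X, T, k)

P = I2                                        # coefficient of x^i must be A^i
for i in range(8):
    assert X[i] == P
    P = mmul(A, P)
```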
Using high order lifting
A(x) → (unimodular transforms) → diag(s_1(x), s_2(x), ..., s_n(x))
Theorem: The Smith normal form, hence the determinant, of M ∈ K[x]^{n×n} (degree d) can be computed in O~(MM(n, d)) = O~(n^ω d) operations in K by a randomized Las Vegas (certified) algorithm. [Storjohann, 2002-2003] Matrix polynomials 25
Using high order lifting and minimal bases
[ 5x^3+5x^4+4x^2+x+5, 3x+5x^2+4, 4x^2+5x^3+2x+4 ; 3x^4+x^2+4+3x^3, x^2+1, 3x^2+3x^3+4x+4 ; 2x^3+x^4+2x^2+4, 4x+x^2+4, 3x^2+x^3+4+5x ] → [ 3+5x, 1+5x^2, 2 ; 2+3x, 3+x, 5 ; 2+x, 3, x^2 ]
Theorem: A basis (n × n, degree d) of a K[x]-module can be reduced in O~(MM(n, d)) = O~(n^ω d) operations in K by a randomized Las Vegas (certified) algorithm (cf. also the matrix Gcd and matrix Padé approximants). [Giorgi, Jeannerod & Villard, 2003] Matrix polynomials 2
The talk Introduction I - Known reductions between problems II - Divide-double and conquer III - Matrix polynomials IV - Integer matrices Conclusion
Fact : The known improvements in bit complexity come from corresponding improvements over the polynomials (or truncated power series). Integer matrices 2
Example 1. Correspondence between K[x] and Z
Multiplication: M(d) = O(d log d loglog d) = O~(d): product in K[x] (degree d); M(s) = O(s log s loglog s) = O~(s): product of s-bit integers (s = log a).
Gcd [Knuth/Schönhage 1970-1971]: Gcd in K[x]: O(M(d) log d); Gcd in Z: O(M(s) log s). Integer matrices 28
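The correspondence can be seen by running one Euclidean scheme over both domains (a classical-Euclid Python sketch, not the fast Knuth–Schönhage half-gcd; the examples are arbitrary): only the remainder operation changes between Z and Q[x].

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a by b; dense coefficient lists over Q, highest degree first."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b) and any(a):
        f = a[0] / b[0]
        a = [c - f * d for c, d in zip(a, b + [0] * (len(a) - len(b)))][1:]
    while a and a[0] == 0:      # strip leading zeros
        a = a[1:]
    return a

def gcd(u, v, rem):
    """The same Euclidean loop works for integers and for polynomials."""
    while v:
        u, v = v, rem(u, v)
    return u

# over Z:
assert gcd(420, 66, lambda x, y: x % y) == 6
# over Q[x]: (x^2 - 1) and (x^2 + 2x + 1) share the factor (x + 1)
g = gcd([1, 0, -1], [1, 2, 1], poly_rem)
g = [c / g[0] for c in g]       # normalize to a monic gcd
assert g == [1, 1]
```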
Example 2. Characteristic polynomial, correspondence between R and Z
Best known estimates for the determinant without divisions carry over to the integer case, and lead to the characteristic polynomial.
Algebraic, without divisions (over a ring R): one tries to limit the degree increase. Bit complexity (over Z): one tries to limit the size increase. Integer matrices 29
Strassen's transformation for eliminating divisions works over K[[x]]: A → I + x(A − I).
Reducing the cost over K[[x]] ⇒ reducing the bit complexity.
Determinant without divisions over R (Det_K → Det_R):
[Strassen 1973, Vermeidung von Divisionen], [(Le Verrier) Samuelson/Berkowitz, 1984], [Chistov, 1985]: O~(n^{ω+1})
[Kaltofen, 1992]: O~(n^{3+1/2}), O(n^{3.03})
[Kaltofen & Villard, 2002-2005]: O~(n^{3+1/5}), O(n^{2.7}) Integer matrices 30
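The Samuelson/Berkowitz method cited above can be sketched in a few lines of Python (illustrative; the example matrices are arbitrary): it computes the characteristic polynomial using additions and multiplications only, so no division ever appears and the computation stays inside the ring.

```python
def char_poly(A):
    """det(xI - A) by Berkowitz's division-free algorithm.
    Returns coefficients, highest degree first."""
    n = len(A)
    p = [1, -A[0][0]]                      # char poly of the 1x1 leading block
    for k in range(1, n):
        R, C = A[k][:k], [A[i][k] for i in range(k)]
        M = [row[:k] for row in A[:k]]     # leading k x k block
        c = [1, -A[k][k]]                  # first column of the Toeplitz factor
        v = C[:]
        for _ in range(k):
            c.append(-sum(r * x for r, x in zip(R, v)))   # -R M^t C
            v = [sum(M[i][j] * v[j] for j in range(k)) for i in range(k)]
        # multiply p by the lower-triangular Toeplitz matrix built from c
        p = [sum(c[i - j] * p[j] for j in range(min(i, len(p) - 1) + 1))
             for i in range(k + 2)]
    return p

# (x - 2)(x - 3)(x - 5) = x^3 - 10x^2 + 31x - 30, with no division performed
assert char_poly([[2, 0, 0], [0, 3, 0], [0, 0, 5]]) == [1, -10, 31, -30]
```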
Baby steps/giant steps and elimination of the divisions, over Z [Shanks, Kaltofen 1992, Kaltofen & Villard 2002-2005]
Goal: u^T A^k v, k = 0, ..., n − 1, for A ∈ Z^{n×n}, u, v ∈ Z^n.
Baby steps: v, Av, ..., A^{√n−1} v. Giant step: B = A^{√n}, then u^T, u^T B, ..., u^T B^{√n−1}.
Recombine: u^T B^i A^j v = u^T A^k v, k = i√n + j, for i = 0, ..., √n − 1, j = 0, ..., √n − 1.
Characteristic polynomial in time O~(n^{3+1/2} log‖A‖) or ... O~(n^{2.7} log‖A‖) Integer matrices 31
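The baby-steps/giant-steps recombination can be sketched as follows (illustrative Python, arbitrary tiny example): ~√n matrix-vector baby steps, one power B = A^r, ~√n giant steps, then all n projections u^T A^k v fall out as inner products.

```python
import math

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def bilinear_projections(A, u, v, n):
    """u^T A^k v for k = 0..n-1, via baby steps/giant steps with r = ceil(sqrt(n))."""
    r = math.isqrt(n - 1) + 1
    cols = [v]                                  # baby steps: v, Av, ..., A^{r-1} v
    for _ in range(r - 1):
        cols.append(mat_vec(A, cols[-1]))
    B = A                                       # giant step: B = A^r
    for _ in range(r - 1):
        B = mat_mul(B, A)
    rows = [u]                                  # u^T, u^T B, ..., u^T B^{r-1}
    for _ in range(r - 1):
        rows.append(mat_vec(list(zip(*B)), rows[-1]))   # u^T B = (B^T u)^T
    out = []
    for i in range(r):
        for j in range(r):
            k = i * r + j                       # u^T B^i A^j v = u^T A^{ir+j} v
            if k < n:
                out.append(sum(a * b for a, b in zip(rows[i], cols[j])))
    return out

A = [[1, 2], [3, 4]]
u, v = [1, 0], [0, 1]
seq = bilinear_projections(A, u, v, 4)

# check against the naive iteration
w, naive = v[:], []
for _ in range(4):
    naive.append(sum(a * b for a, b in zip(u, w)))
    w = mat_vec(A, w)
assert seq == naive
```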
Example 3. Correspondence between K[x] and Z: linear system solution
Over K[x]: A(x)Y(x) = b(x) — over Z: AY = b
B = A^{-1} mod x (i.e. A_0^{-1}) — B = A^{-1} mod p
y_{i+1} ≡ B r_i (mod x) — y_{i+1} ≡ B r_i (mod p)
r_{i+1} = (A(x) y_{i+1} − b(x))/x — r_{i+1} = (A y_{i+1} − b)/p
LinSys_{K[x]}(n, d) = O~(MM(n, d)) — LinSys_Z(n, log‖A‖) = O~(MM(n, log‖A‖)) [Storjohann 2002-2005] Integer matrices 32
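The integer side of the correspondence can be sketched in Python (illustrative Dixon/Hensel lifting, arbitrary example; `A` is assumed invertible mod p): one linear solve mod p per step yields the solution mod p^k.

```python
def solve_mod_p(A, b, p):
    """Solve A y = b over Z/pZ by Gaussian elimination (A invertible mod p)."""
    n = len(A)
    M = [[A[i][j] % p for j in range(n)] + [b[i] % p] for i in range(n)]
    for i in range(n):
        piv = next(r for r in range(i, n) if M[r][i])
        M[i], M[piv] = M[piv], M[i]
        inv = pow(M[i][i], -1, p)
        M[i] = [x * inv % p for x in M[i]]
        for r in range(n):
            if r != i and M[r][i]:
                f = M[r][i]
                M[r] = [(x - f * y) % p for x, y in zip(M[r], M[i])]
    return [M[i][n] for i in range(n)]

def lift_solution(A, b, p, k):
    """Dixon/Hensel lifting: y with A y = b (mod p^k), one solve mod p per step."""
    n = len(A)
    y, r, pk = [0] * n, b[:], 1
    for _ in range(k):
        z = solve_mod_p(A, r, p)                          # A z = r (mod p)
        y = [yi + pk * zi for yi, zi in zip(y, z)]
        Az = [sum(A[i][j] * z[j] for j in range(n)) for i in range(n)]
        r = [(ri - azi) // p for ri, azi in zip(r, Az)]   # exact division by p
        pk *= p
    return y, pk

A = [[2, 1], [1, 3]]
b = [1, 0]
y, pk = lift_solution(A, b, 7, 6)
for i in range(2):
    assert (sum(A[i][j] * y[j] for j in range(2)) - b[i]) % pk == 0
```

From y mod p^k, rational reconstruction would then recover the true rational solution (3/5, −1/5).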
Theorem: The Smith normal form of A ∈ Z^{n×n}, hence the determinant, can be computed in O~(MM(n, log‖A‖)) = O~(n^ω log‖A‖) bit operations by a randomized Las Vegas (certified) algorithm. [Storjohann, 2004-2005]
Proof. Fast system solution (correction of the residue / p-adic lifting); divide-double and conquer (based on the sizes of the invariant factors). Integer matrices 33
The talk Introduction I - Known reductions between problems II - Divide-double and conquer III - Matrix polynomials IV - Integer matrices Conclusion
Classical open problems, algebraic complexity:
Determinant without divisions in O~(n^ω)?
Reduction of matrix multiplication to system solution?
Open problems, K[x] or bit complexity:
Small nullspace bases in O~(n^ω log‖A‖)? (Z)
Essentially optimal inversion?
New reductions to matrix multiplication? Conclusion 34