Algorithms for exact (dense) linear algebra
Gilles Villard, CNRS, Laboratoire LIP, ENS Lyon
Montagnac-Montpezat, June 3, 2005
Introduction
Problem: study of complexity estimates for basic problems in exact linear algebra over K[x] and Z.
Deterministic, Monte Carlo (non-certified), and Las Vegas (certified) randomized algorithms.
Time complexity up to logarithmic factors, e.g. O~(n^c) = O(n^c log^α n). (Space complexity.) Introduction 1
Models
Algebraic complexity: K a commutative field, algebraic RAM with +, −, ×, /. Over K[x]: arithmetic operations in K.
Bit complexity: over Z or Q, bitwise computational cost. Introduction 2
Motivations Complexity estimates with concrete entry domain Better understanding of linear algebra in bit complexity Improved algorithms for exact (or accurate) results Introduction 3
The talk Introduction I - Known reductions between problems II - Divide-double and conquer III - Matrix polynomials IV - Integer matrices Conclusion
Algebraic complexity over K [survey and more in Bürgisser et al. 1997, Ch. 16]
Asymptotic equivalence to matrix multiplication.
Matrix multiplication of n × n matrices A · B; determinant, inversion, rank, characteristic polynomial, ... O(n^ω): O(n^3) or O(n^2.38). Algorithms in O~(n^ω).
LinSys ≡ MM ≡ Det Known reductions between problems 4
Example 1. Det_K = Inversion_K
Proof. Derivative inequality [Baur and Strassen, 1983].
det A = a_11 · det A_{2..n,2..n} + ..., hence ∂(det A)/∂a_11 = det A_{2..n,2..n}, and more generally
A^{-1} = (1/det A) · [∂(det A)/∂a_{j,i}]_{i,j} Known reductions between problems 5
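The Baur–Strassen identity above can be checked directly on a small example. A minimal Python sketch (not from the talk; the matrix `A` is an arbitrary example): the (i, j) entry of A^{-1} equals ∂(det A)/∂a_{j,i} divided by det A.

```python
from fractions import Fraction

def det(M):
    """Exact determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def partial(M, i, j):
    """d(det M)/d(m_ij) = the (i, j) cofactor of M."""
    minor = [row[:j] + row[j+1:] for k, row in enumerate(M) if k != i]
    return (-1) ** (i + j) * det(minor)

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
d = det(A)
# Baur-Strassen identity: (A^{-1})_{i,j} = (1/det A) * d(det A)/d(a_{j,i})
Ainv = [[Fraction(partial(A, j, i), d) for j in range(3)] for i in range(3)]
# check that A * Ainv is the identity
for i in range(3):
    for j in range(3):
        s = sum(A[i][k] * Ainv[k][j] for k in range(3))
        assert s == (1 if i == j else 0)
```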
Does not carry over to the polynomial or bit complexity case.
x, y two vectors with fixed-size entries, c an n-bit integer.
Compute φ = c · x^T y: O~(n) bit operations.
Compute [∂φ/∂x_i]_{1≤i≤n} = c · y: O(n^2) bit operations. Known reductions between problems
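A rough numeric illustration of the obstruction (a sketch, with arbitrary example values): the scalar φ fits in about n bits, while the n gradient entries occupy Θ(n^2) bits in total, so the derivative reduction cannot stay within O~(n) bit operations.

```python
# c: an n-bit integer; x, y: vectors with O(1)-bit entries.
n = 64
c = (1 << (n - 1)) + 7          # an n-bit integer
x = [3] * n
y = [5] * n

phi = c * sum(xi * yi for xi, yi in zip(x, y))   # scalar output: ~n bits
grad = [c * yi for yi in y]                      # gradient [d phi/d x_i] = c*y

assert phi.bit_length() <= n + 2 * n.bit_length() + 8   # ~n bits in total
total_grad_bits = sum(g.bit_length() for g in grad)
assert total_grad_bits >= n * n                         # ~n^2 bits in total
```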
Example 2. MM_K = CharPoly_K
Proof. Minimum polynomial [Keller-Gehrig, 1985]. A ∈ K^{n×n}, u ∈ K^n.
A · u → [u, Au]; A^2 · [u, Au] → [u, Au, A^2 u, A^3 u]; A^4 · [u, Au, A^2 u, A^3 u] → ... (repeated squaring) ... → [u, Au, A^2 u, A^3 u, ..., A^d u]
A^d u + c_{d−1} A^{d−1} u + ... + c_0 u = 0  ⟹  π(x) = x^d + c_{d−1} x^{d−1} + ... + c_0 Known reductions between problems
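The repeated-squaring construction of the Krylov matrix can be sketched as follows (illustrative Python, not the talk's implementation; the 3 × 3 matrix is an arbitrary example): the column block doubles at each step, so only about log n matrix products are needed.

```python
def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def krylov(A, u):
    """Keller-Gehrig: build [u, Au, A^2 u, ...] with log(n) squarings of A."""
    n = len(A)
    K = [[x] for x in u]                      # current block, starts as the column u
    P = A                                     # P = A^(2^t)
    while len(K[0]) < n:
        PK = mat_mul(P, K)                    # P times the whole current block
        K = [K[i] + PK[i] for i in range(n)]  # doubles the number of columns
        P = mat_mul(P, P)                     # square for the next step
    return [row[:n] for row in K]             # keep the first n Krylov vectors

A = [[0, 1, 0], [0, 0, 1], [6, -11, 6]]
u = [1, 0, 0]
K = krylov(A, u)

# check the columns against the naive iteration u, Au, A^2 u
v = u[:]
for j in range(3):
    assert [K[i][j] for i in range(3)] == v
    v = mat_vec(A, v)
```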
Does not carry over to the polynomial or bit complexity case.
A with entries of size log‖A‖: A^{n/2} has entries of size O(n log‖A‖).
The multiplication by A^{n/2} costs O~(n^ω · n log‖A‖) = O~(n^{ω+1} log‖A‖). Known reductions between problems 8
Impact of data size? Ex. determinant:
computation/output size: nd, or O~(n log‖A‖) bits;
evaluation/interpolation or homomorphic scheme: nd points, or O~(n log‖A‖) bits a priori, at n^ω each.
Complexity estimates: O~(n^ω · nd) = O~(n^{ω+1} d), or O~(n^{ω+1} log‖A‖). Known reductions between problems 9
MM(n, d) = O~(n^ω d): cost for multiplying n × n matrices of degree d.
MM(n, log‖A‖) = O~(n^ω log‖A‖): cost for multiplying n × n integer matrices.
(General case: consider generalized functions MM.)
The previous analysis shows that the determinant may be computed in O~(n · MM(n, d)) or O~(n · MM(n, log‖A‖)) operations, i.e. in n corresponding matrix products. Known reductions between problems 10
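A homomorphic-scheme sketch in Python (illustrative; the matrix and the three primes are arbitrary choices, assumed large enough to cover the determinant's size): compute det A modulo several primes and recombine by Chinese remaindering.

```python
def det_mod_p(M, p):
    """Determinant by Gaussian elimination over Z/pZ."""
    M = [[x % p for x in row] for row in M]
    n, d = len(M), 1
    for i in range(n):
        piv = next((r for r in range(i, n) if M[r][i]), None)
        if piv is None:
            return 0
        if piv != i:
            M[i], M[piv] = M[piv], M[i]
            d = -d                              # a row swap flips the sign
        inv = pow(M[i][i], -1, p)
        d = d * M[i][i] % p
        for r in range(i + 1, n):
            f = M[r][i] * inv % p
            M[r] = [(M[r][c] - f * M[i][c]) % p for c in range(n)]
    return d % p

def crt(residues, mods):
    """Combine residues via the Chinese Remainder Theorem (Garner-style)."""
    x, m = 0, 1
    for r, p in zip(residues, mods):
        x += m * ((r - x) * pow(m, -1, p) % p)
        m *= p
    return x, m

A = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
primes = [101, 103, 107]                        # product exceeds |det A| here
x, m = crt([det_mod_p(A, p) for p in primes], primes)
d = x if x <= m // 2 else x - m                 # lift to the symmetric range
assert d == -90
```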
Fundamentals of dense linear algebra over K[x] or Z (1967–2000):
Monte Carlo rank: O(n^ω + n^2 log‖A‖)
System solution (Hensel lifting) [Moenck & Carter 79, Dixon 82]: O~(n^3 log‖A‖)
Determinant, inversion, nullspace, rank, ... [Edmonds 67, Bareiss 68, Moenck & Carter 79]: O~(n · MM(n, log‖A‖)), deterministic
Frobenius form (minimum, characteristic polynomial) [Giesbrecht 93, Giesbrecht & Storjohann 02]: O~(n · MM(n, log‖A‖)), Las Vegas
Hermite and Smith forms (diophantine systems) [Kannan & Bachem 79, Domich 85, Giesbrecht 95, Storjohann 96-00]: O~(n · MM(n, log‖A‖)), deterministic Known reductions between problems 11
Bit complexity ≲ algebraic complexity × output size.
Is this bound pessimistic?
Clue. The output length may not be needed a priori, i.e. at the beginning of the computation, but only at its very end. Known reductions between problems 12
Reduction to matrix multiplication
Question: which dense linear algebra problems over K[x] or Z can be solved by algorithms using about the same number of operations as for multiplying two corresponding matrices, plus the input/output size? Known reductions between problems 13
The talk Introduction I - Known reductions between problems II - Divide-double and conquer III - Matrix polynomials IV - Integer matrices Conclusion
Triangularization in log_2(n) steps: L_{log n} ... L_2 L_1 A = T Divide-double and conquer 14
Divide and conquer [Strassen 1969, Schönhage 1973, Bunch & Hopcroft 1974]
[ I 0 ; −BA^{-1} I ] · [ A C ; B D ] = [ A C ; 0 D − BA^{-1}C ]
At the next step: dimension divided by two, entry size multiplied by n/2. Divide-double and conquer 15
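One elimination step can be checked on a toy 4 × 4 example (a Python sketch with arbitrary 2 × 2 blocks, exact rational arithmetic): multiplying on the left by [[I, 0], [−BA^{-1}, I]] zeroes the lower-left block and leaves the Schur complement D − BA^{-1}C.

```python
from fractions import Fraction

def mul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def inv2(M):
    """Exact inverse of a 2x2 matrix over Q."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# blocks of a 4x4 matrix [[A, C], [B, D]], as in the elimination step
A = [[2, 1], [1, 3]]
C = [[1, 0], [0, 1]]
B = [[4, 0], [0, 4]]
D = [[5, 1], [1, 5]]

BAinv = mul(B, inv2(A))
BAinvC = mul(BAinv, C)
schur = [[D[i][j] - BAinvC[i][j] for j in range(2)] for i in range(2)]
# the lower-left block of the product is B - (B A^{-1}) A = 0
BAinvA = mul(BAinv, A)
lower_left = [[B[i][j] - BAinvA[i][j] for j in range(2)] for i in range(2)]
assert lower_left == [[0, 0], [0, 0]]
```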
Divide-double and conquer [Jeannerod & Villard 2002, Storjohann 2002]
The dimension is divided by two while the entry size is at most doubled.
Cost: Σ_{i=1}^{log n} (n/2^i)^ω · 2^i d = O~(n^ω d) Divide-double and conquer 1
(Left) nullspace? N A = 0, for the 8 × 4 integer matrix
A = [ 85 55 3 35 ; 49 3 5 59 ; 43 2 50 12 ; 18 31 91 4 ; 1 41 94 83 ; 8 23 53 85 ; 49 8 8 30 ; 80 2 … ] Divide-double and conquer 1
Gaussian elimination (Schur complement):
N_g = [ 410 152550 3980 2955220 … ; 15181842 1389422 0 401840 … ; 280458 4081928 0 1881120 … ; 438828 4023028 0 3583510 … ]
However, one can choose instead (and compute over K[x]):
N = [ 25 32 1 38 1 30 32 33 ; 2 8 43 23 1 1 55 1 ; 4 10 43 28 95 50 30 53 ; 5 23 25 12 182 90 40 3 ] Divide-double and conquer 19
Example: for inversion and nullspace. A(x) ∈ K[x]^{2n×n} of degree d.
Divide and conquer (Gaussian elimination): N(x) A(x) = 0, with N(x) ∈ K[x]^{n×2n} of degree O(nd); total size O(n^3 d).
Divide-double and conquer (minimal module bases): M(x) A(x) = 0, with M(x) ∈ K[x]^{n×2n} of degree sum O(nd), e.g. generically degree d (Kronecker indices); total size O(n^2 d). Divide-double and conquer 20
(Example ctd) Stacking for the nullspace:
n/2 nullspace vectors of degree less than 2d, then n/4 nullspace vectors of degree less than 4d, ..., and in general p/2 nullspace vectors of degree less than 2nd/p; each stacked block, multiplied by the corresponding n × p block of A, gives 0. Divide-double and conquer 21
Theorem: The rank r of M ∈ K[x]^{n×n} (degree d) and n − r independent nullspace vectors can be computed in O~(MM(n, d)) = O~(n^ω d) operations in K by a randomized Las Vegas (certified) algorithm. [Storjohann & Villard, 2005] Divide-double and conquer 22
The talk Introduction I - Known reductions between problems II - Divide-double and conquer III - Matrix polynomials IV - Integer matrices Conclusion
Matrix polynomial inversion, degree d over K[x]
Over K: input/output size n^2; cost n^3 or n^ω; A^{-1} = B.
Over K(x): output degree nd, output size n^3 d; cost (e.g. Newton): n^4 d or n^{ω+1} d.
A^{-1}(x) = A*(x)/det A(x) = (B_0 + xB_1 + ... + x^{nd−1} B_{nd−1})/det A(x), with A*(x) the adjugate. Matrix polynomials 23
Using minimal bases
Theorem: The inverse of a generic polynomial matrix (degree d) can be computed in essentially optimal time O~(n^3 d). [Jeannerod & Villard, 2002-2005]
Corollary: For a generic A ∈ K^{n×n}, the matrix powers A, A^2, ..., A^n can be computed in essentially optimal time O~(n^3). Hint: (I − xA)^{-1} = Σ_{i≥0} x^i A^i. Matrix polynomials 24
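The hint can be made concrete with a truncated power-series computation (a Python sketch, arbitrary 2 × 2 example): Newton iteration X ← X(2I − MX) doubles the precision at each step, and the coefficient of x^i in (I − xA)^{-1} is exactly A^i.

```python
def zeros(n):
    return [[0] * n for _ in range(n)]

def madd(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def mmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def series_mul(F, G, k):
    """Product of matrix power series (lists of coefficient matrices), mod x^k."""
    n = len(F[0])
    H = [zeros(n) for _ in range(k)]
    for i in range(min(k, len(F))):
        for j in range(min(k - i, len(G))):
            H[i + j] = madd(H[i + j], mmul(F[i], G[j]))
    return H

A = [[1, 2], [3, 4]]
I2 = [[1, 0], [0, 1]]
M = [I2, [[-a for a in row] for row in A]]    # M(x) = I - xA

X = [I2]                                      # X = M^{-1} mod x^1
k = 1
while k < 8:
    k *= 2                                    # Newton step: X <- X(2I - MX) mod x^k
    MX = series_mul(M, X, k)
    T = [[[-a for a in row] for row in C] for C in MX]
    T[0] = madd(T[0], [[2, 0], [0, 2]])       # T = 2I - MX
    X = series_mul(X, T, k)

P = I2                                        # coefficient of x^i must be A^i
for i in range(8):
    assert X[i] == P
    P = mmul(A, P)
```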
Using high order lifting
A(x) → (unimodular transforms) → diag(s_1(x), s_2(x), ..., s_n(x))
Theorem: The Smith normal form, hence the determinant, of M ∈ K[x]^{n×n} (degree d) can be computed in O~(MM(n, d)) = O~(n^ω d) operations in K by a randomized Las Vegas (certified) algorithm. [Storjohann, 2002-2003] Matrix polynomials 25
Using high order lifting and minimal bases
[ 5x^3+5x^4+4x^2+x+5, 3x+5x^2+4, 4x^2+5x^3+2x+4 ; 3x^4+x^2+4+3x^3, x^2+1, 3x^2+3x^3+4x+4 ; 2x^3+x^4+2x^2+4, 4x+x^2+4, 3x^2+x^3+4+5x ] → [ 3+5x, 1+5x^2, 2 ; 2+3x, 3+x, 5 ; 2+x, 3, x^2 ]
Theorem: A basis (n × n, degree d) of a K[x]-module can be reduced in O~(MM(n, d)) = O~(n^ω d) operations in K by a randomized Las Vegas (certified) algorithm (cf. also the matrix Gcd and matrix Padé approximants). [Giorgi, Jeannerod & Villard, 2003] Matrix polynomials 2
The talk Introduction I - Known reductions between problems II - Divide-double and conquer III - Matrix polynomials IV - Integer matrices Conclusion
Fact : The known improvements in bit complexity come from corresponding improvements over the polynomials (or truncated power series). Integer matrices 2
Example 1. Correspondence between K[x] and Z
Multiplication: M(d) = O(d log d loglog d) = O~(d): product in K[x] (degree d); M(s) = O(s log s loglog s) = O~(s): product of s-bit integers (s = log a).
Gcd [Knuth/Schönhage 1970-1971]: Gcd in K[x]: O(M(d) log d); Gcd in Z: O(M(s) log s). Integer matrices 28
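The correspondence can be seen by running one Euclidean scheme over both domains (a classical-Euclid Python sketch, not the fast Knuth–Schönhage half-gcd; the examples are arbitrary): only the remainder operation changes between Z and Q[x].

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a by b; dense coefficient lists over Q, highest degree first."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b) and any(a):
        f = a[0] / b[0]
        a = [c - f * d for c, d in zip(a, b + [0] * (len(a) - len(b)))][1:]
    while a and a[0] == 0:      # strip leading zeros
        a = a[1:]
    return a

def gcd(u, v, rem):
    """The same Euclidean loop works for integers and for polynomials."""
    while v:
        u, v = v, rem(u, v)
    return u

# over Z:
assert gcd(420, 66, lambda x, y: x % y) == 6
# over Q[x]: (x^2 - 1) and (x^2 + 2x + 1) share the factor (x + 1)
g = gcd([1, 0, -1], [1, 2, 1], poly_rem)
g = [c / g[0] for c in g]       # normalize to a monic gcd
assert g == [1, 1]
```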
Example 2. Characteristic polynomial, correspondence between R and Z
Best known estimates for the determinant without divisions carry over to the integer case, and lead to the characteristic polynomial.
Algebraic, without divisions (over a ring R): one tries to limit the degree increase. Bit complexity (over Z): one tries to limit the size increase. Integer matrices 29
Strassen's transformation for eliminating divisions works over K[[x]]: A → I + x(A − I).
Reducing the cost over K[[x]] ⇒ reducing the bit complexity.
Determinant without divisions over R (Det_K → Det_R):
[Strassen 1973, Vermeidung von Divisionen], [(Le Verrier) Samuelson/Berkowitz, 1984], [Chistov, 1985]: O~(n^{ω+1})
[Kaltofen, 1992]: O~(n^{3+1/2}), O(n^{3.03})
[Kaltofen & Villard, 2002-2005]: O~(n^{3+1/5}), O(n^{2.7}) Integer matrices 30
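The Samuelson/Berkowitz method cited above can be sketched in a few lines of Python (illustrative; the example matrices are arbitrary): it computes the characteristic polynomial using additions and multiplications only, so no division ever appears and the computation stays inside the ring.

```python
def char_poly(A):
    """det(xI - A) by Berkowitz's division-free algorithm.
    Returns coefficients, highest degree first."""
    n = len(A)
    p = [1, -A[0][0]]                      # char poly of the 1x1 leading block
    for k in range(1, n):
        R, C = A[k][:k], [A[i][k] for i in range(k)]
        M = [row[:k] for row in A[:k]]     # leading k x k block
        c = [1, -A[k][k]]                  # first column of the Toeplitz factor
        v = C[:]
        for _ in range(k):
            c.append(-sum(r * x for r, x in zip(R, v)))   # -R M^t C
            v = [sum(M[i][j] * v[j] for j in range(k)) for i in range(k)]
        # multiply p by the lower-triangular Toeplitz matrix built from c
        p = [sum(c[i - j] * p[j] for j in range(min(i, len(p) - 1) + 1))
             for i in range(k + 2)]
    return p

# (x - 2)(x - 3)(x - 5) = x^3 - 10x^2 + 31x - 30, with no division performed
assert char_poly([[2, 0, 0], [0, 3, 0], [0, 0, 5]]) == [1, -10, 31, -30]
```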
Baby steps/giant steps and elimination of the divisions, over Z [Shanks, Kaltofen 1992, Kaltofen & Villard 2002-2005]
Goal: u^T A^k v, k = 0, ..., n − 1, for A ∈ Z^{n×n}, u, v ∈ Z^n.
Baby steps: v, Av, ..., A^{√n−1} v. Giant step: B = A^{√n}, then u^T, u^T B, ..., u^T B^{√n−1}.
Recombine: u^T B^i A^j v = u^T A^k v, k = i√n + j, for i = 0, ..., √n − 1, j = 0, ..., √n − 1.
Characteristic polynomial in time O~(n^{3+1/2} log‖A‖) or ... O~(n^{2.7} log‖A‖) Integer matrices 31
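The baby-steps/giant-steps recombination can be sketched as follows (illustrative Python, arbitrary tiny example): ~√n matrix-vector baby steps, one power B = A^r, ~√n giant steps, then all n projections u^T A^k v fall out as inner products.

```python
import math

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def bilinear_projections(A, u, v, n):
    """u^T A^k v for k = 0..n-1, via baby steps/giant steps with r = ceil(sqrt(n))."""
    r = math.isqrt(n - 1) + 1
    cols = [v]                                  # baby steps: v, Av, ..., A^{r-1} v
    for _ in range(r - 1):
        cols.append(mat_vec(A, cols[-1]))
    B = A                                       # giant step: B = A^r
    for _ in range(r - 1):
        B = mat_mul(B, A)
    rows = [u]                                  # u^T, u^T B, ..., u^T B^{r-1}
    for _ in range(r - 1):
        rows.append(mat_vec(list(zip(*B)), rows[-1]))   # u^T B = (B^T u)^T
    out = []
    for i in range(r):
        for j in range(r):
            k = i * r + j                       # u^T B^i A^j v = u^T A^{ir+j} v
            if k < n:
                out.append(sum(a * b for a, b in zip(rows[i], cols[j])))
    return out

A = [[1, 2], [3, 4]]
u, v = [1, 0], [0, 1]
seq = bilinear_projections(A, u, v, 4)

# check against the naive iteration
w, naive = v[:], []
for _ in range(4):
    naive.append(sum(a * b for a, b in zip(u, w)))
    w = mat_vec(A, w)
assert seq == naive
```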
Example 3. Correspondence between K[x] and Z: linear system solution
Over K[x]: A(x)Y(x) = b(x) — over Z: AY = b
B = A^{-1} mod x (i.e. A_0^{-1}) — B = A^{-1} mod p
y_{i+1} ≡ B r_i (mod x) — y_{i+1} ≡ B r_i (mod p)
r_{i+1} = (A(x) y_{i+1} − b(x))/x — r_{i+1} = (A y_{i+1} − b)/p
LinSys_{K[x]}(n, d) = O~(MM(n, d)) — LinSys_Z(n, log‖A‖) = O~(MM(n, log‖A‖)) [Storjohann 2002-2005] Integer matrices 32
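The integer side of the correspondence can be sketched in Python (illustrative Dixon/Hensel lifting, arbitrary example; `A` is assumed invertible mod p): one linear solve mod p per step yields the solution mod p^k.

```python
def solve_mod_p(A, b, p):
    """Solve A y = b over Z/pZ by Gaussian elimination (A invertible mod p)."""
    n = len(A)
    M = [[A[i][j] % p for j in range(n)] + [b[i] % p] for i in range(n)]
    for i in range(n):
        piv = next(r for r in range(i, n) if M[r][i])
        M[i], M[piv] = M[piv], M[i]
        inv = pow(M[i][i], -1, p)
        M[i] = [x * inv % p for x in M[i]]
        for r in range(n):
            if r != i and M[r][i]:
                f = M[r][i]
                M[r] = [(x - f * y) % p for x, y in zip(M[r], M[i])]
    return [M[i][n] for i in range(n)]

def lift_solution(A, b, p, k):
    """Dixon/Hensel lifting: y with A y = b (mod p^k), one solve mod p per step."""
    n = len(A)
    y, r, pk = [0] * n, b[:], 1
    for _ in range(k):
        z = solve_mod_p(A, r, p)                          # A z = r (mod p)
        y = [yi + pk * zi for yi, zi in zip(y, z)]
        Az = [sum(A[i][j] * z[j] for j in range(n)) for i in range(n)]
        r = [(ri - azi) // p for ri, azi in zip(r, Az)]   # exact division by p
        pk *= p
    return y, pk

A = [[2, 1], [1, 3]]
b = [1, 0]
y, pk = lift_solution(A, b, 7, 6)
for i in range(2):
    assert (sum(A[i][j] * y[j] for j in range(2)) - b[i]) % pk == 0
```

From y mod p^k, rational reconstruction would then recover the true rational solution (3/5, −1/5).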
Theorem: The Smith normal form of A ∈ Z^{n×n}, hence the determinant, can be computed in O~(MM(n, log‖A‖)) = O~(n^ω log‖A‖) bit operations by a randomized Las Vegas (certified) algorithm. [Storjohann, 2004-2005]
Proof. Fast system solution (correction of the residue / p-adic lifting); divide-double and conquer (based on the sizes of the invariant factors). Integer matrices 33
The talk Introduction I - Known reductions between problems II - Divide-double and conquer III - Matrix polynomials IV - Integer matrices Conclusion
Classical open problems, algebraic complexity:
Determinant without divisions in O~(n^ω)?
Reduction of matrix multiplication to system solution?
Open problems, K[x] or bit complexity:
Small nullspace bases in O~(n^ω log‖A‖)? (Z)
Essentially optimal inversion?
New reductions to matrix multiplication? Conclusion 34