High Performance and Reliable Algebraic Computing

Size: px

Start display at page:

Download "High Performance and Reliable Algebraic Computing"

Randolf Cox
6 years ago
Views:

1 High Performance and Reliable Algebraic Computing Soutenance d habilitation à diriger des recherches Clément Pernet Université Joseph Fourier (Grenoble 1) November 25, 2014 Rapporteurs : Daniel Augot, Examinateurs : Jean-Guillaume Dumas, Mark Giesbrecht, Jean-Charles Faugère, Laura Grigori, Erich L. Kaltofen, Brigitte Plateau.

2 Introduction Computer Algebra Computing exactly over Z, Q, Z/pZ, GF(q), K[X]. Symbolic manipulations. Applications where all digits matter: breaking Discrete Log Pb. in quasi-polynomial time [Barbulescu & al. 14], building modular form databases to test the BSD conjecture [Stein 12], formal verification of Hales proof of Kepler conjecture [Hales 05]. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

3 Introduction Computer Algebra Computing exactly over Z, Q, Z/pZ, GF(q), K[X]. Symbolic manipulations. Applications where all digits matter: breaking Discrete Log Pb. in quasi-polynomial time [Barbulescu & al. 14], building modular form databases to test the BSD conjecture [Stein 12], formal verification of Hales proof of Kepler conjecture [Hales 05]. Efficiency mostly rely on linear algebra over Z and Z/pZ. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

4 Introduction Coding theory Protecting information against alteration: deep space communication, data storage, fault tolerance of large scale computations. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

5 Introduction Coding theory Protecting information against alteration: deep space communication, data storage, fault tolerance of large scale computations. Numerical linear algebra Computing fast with approximations: delivering flops to most scientific computations for over 60 years, LinPack: benchmark for the top 500 supercomputers, impacts nowadays computer architectures. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

6 Interactions Parity check, RS codes [Berlekamp 68, Massey 69] [Giorgi, Jeannerod and Villard 03] Krylov methods C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

7 Interactions Parity check, RS codes [Berlekamp 68, Massey 69] [Giorgi, Jeannerod and Villard 03] Krylov methods [Wiedemann 86]: sparse linear system solving over F q C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

8 Interactions Parity check, RS codes [Berlekamp 68, Massey 69] [Giorgi, Jeannerod and Villard 03] Krylov methods [Wiedemann 86]: sparse linear system solving over F q [Chowdhury & al. 14]: fast list decoding of Reed-Solomon codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

9 Interactions Parity check, RS codes [Berlekamp 68, Massey 69] [Giorgi, Jeannerod and Villard 03] Krylov methods [Wiedemann 86]: sparse linear system solving over F q [Chowdhury & al. 14]: fast list decoding of Reed-Solomon codes [Huang and Abraham 84]: Algorithm Based Fault Tolerance (ABFT) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

10 Interactions RS and CRT codes Sparse codes BLAS, parallel algorithms Contributions: design of high performance linear algebra kernels, C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

11 Interactions RS and CRT codes Sparse codes BLAS, parallel algorithms Contributions: design of high performance linear algebra kernels, fault tolerant computer algebra. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

12 Outline 1 Design of High Performance Exact Linear Algebra Kernels Matrix multiplication Gaussian elimination Rank profiles Characteristic polynomial 2 Coding Theory for Fault Tolerant Computer Algebra Approximation problems Dense polynomial evaluation codes Rational function codes Sparse evaluation codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

13 Outline 1 Design of High Performance Exact Linear Algebra Kernels Matrix multiplication Gaussian elimination Rank profiles Characteristic polynomial 2 Coding Theory for Fault Tolerant Computer Algebra Approximation problems Dense polynomial evaluation codes Rational function codes Sparse evaluation codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

14 Reductions: linear algebra s arithmetic complexity < 1969: O(n 3 ) for everyone (Gauss, Householder, Danilevskĭi, etc) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

15 Reductions: linear algebra s arithmetic complexity < 1969: O(n 3 ) for everyone (Gauss, Householder, Danilevskĭi, etc) Matrix Product [Strassen 69]: O(n ). [Schönhage 81] O(n 2.52 ). [Coppersmith, Winograd 90] O(n ). [Le Gall 14] O(n ) MM(n) = O(n ω ) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

16 Reductions: linear algebra s arithmetic complexity < 1969: O(n 3 ) for everyone (Gauss, Householder, Danilevskĭi, etc) Matrix Product [Strassen 69]: O(n ). [Schönhage 81] O(n 2.52 ). [Coppersmith, Winograd 90] O(n ). [Le Gall 14] O(n ) Other operations [Strassen 69]: Inverse in O(n ω ) [Schönhage 72]: QR in O(n ω ) [Bunch, Hopcroft 74]: LU in O(n ω ) [Ibarra & al. 82]: Rank in O(n ω ) [Keller-Gehrig 85]: CharPoly in O(n ω log n) MM(n) = O(n ω ) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

17 Reductions HNF(Z) SNF(Z) Det(Z) [P. and Stein 10] [Abbott, Bronstein and Mulders 99] [Storjohann 05] LinSys(Z) [Storjohann 05] [Storjohann 05] MM(Z) [Dumas, P. and Saunders 09] CharPoly(Z) MinPoly(Z) LinSys(Z p) Det(Z p) MinPoly(Z p) CharPoly(Z p) Det(Z p) LinSys(Z p) [P. and Storjohann 07] [Dumas, P. Saunders 09] [Chen & al. 02] LU(Z p) CharPoly(Z p) [P. and Storjohann 07] Rank(Z p) [Chen & al. 02] [Wiedemann 86] [Kaltofen Saunders 91] [Ibarra, Moran and Hui 82] [Jeannerod, P. and Storjohann 13] [Dumas, P. and Sultan 13] TRSM(Z p) MM(Z p) [Wiedemann 86] MinPoly(Z p) MatVecProd(Z p) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

18 Reductions HNF(Z) SNF(Z) Det(Z) [P. and Stein 10] [Abbott, Bronstein and Mulders 99] [Storjohann 05] LinSys(Z) [Storjohann 05] [Storjohann 05] MM(Z) [Dumas, P. and Saunders 09] CharPoly(Z) MinPoly(Z) LinSys(Z p) Det(Z p) MinPoly(Z p) CharPoly(Z p) Det(Z p) LinSys(Z p) [P. and Storjohann 07] [Dumas, P. Saunders 09] [Chen & al. 02] LU(Z p) CharPoly(Z p) [P. and Storjohann 07] Rank(Z p) [Chen & al. 02] [Wiedemann 86] [Kaltofen Saunders 91] [Ibarra, Moran and Hui 82] [Jeannerod, P. and Storjohann 13] [Dumas, P. and Sultan 13] TRSM(Z p) MM(Z p) [Wiedemann 86] MinPoly(Z p) MatVecProd(Z p) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

19 Reductions HNF(Z) SNF(Z) Det(Z) [P. and Stein 10] [Abbott, Bronstein and Mulders 99] [Storjohann 05] LinSys(Z) [Storjohann 05] [Storjohann 05] MM(Z) [Dumas, P. and Saunders 09] CharPoly(Z) MinPoly(Z) LinSys(Z p) Det(Z p) MinPoly(Z p) CharPoly(Z p) Det(Z p) LinSys(Z p) [P. and Storjohann 07] [Dumas, P. Saunders 09] [Chen & al. 02] LU(Z p) CharPoly(Z p) [P. and Storjohann 07] Rank(Z p) [Chen & al. 02] [Wiedemann 86] [Kaltofen Saunders 91] [Ibarra, Moran and Hui 82] [Jeannerod, P. and Storjohann 13] [Dumas, P. and Sultan 13] TRSM(Z p) MM(Z p) [Wiedemann 86] MinPoly(Z p) MatVecProd(Z p) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

20 Making theoretical reductions effective C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

21 Making theoretical reductions effective Common mistrust Fast linear algebra is never faster numerically unstable C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

22 Making theoretical reductions effective Common mistrust Fast linear algebra is never faster numerically unstable Lucky coincidence building blocks in theory happen to be the most efficient routines in practice reduction trees are still relevant C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

23 Making theoretical reductions effective Common mistrust Fast linear algebra is never faster numerically unstable Lucky coincidence building blocks in theory happen to be the most efficient routines in practice reduction trees are still relevant Roadmap 1 Tune building blocks (MatMul) 2 Improve existing reductions (LU, Echelon) leading constants memory footprint 3 Produce new reduction schemes (CharPoly, Rank Profiles) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

24 Design of parallel exact linear algebra ANR HPAC project: 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

25 Design of parallel exact linear algebra ANR HPAC project: Ziad Sultan PhD. Thesis 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

26 Design of parallel exact linear algebra ANR HPAC project: Ziad Sultan PhD. Thesis 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

27 Design of parallel exact linear algebra ANR HPAC project: Ziad Sultan PhD. Thesis 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography Parallel numerical linear algebra cost invariant wrt. splitting O(n 3 ) fine grain block iterative algorithms regular task load Numerical stability constraints C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

28 Design of parallel exact linear algebra ANR HPAC project: Ziad Sultan PhD. Thesis 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography Parallel numerical linear algebra cost invariant wrt. splitting O(n 3 ) fine grain block iterative algorithms regular task load Numerical stability constraints Exact linear algebra specificities cost affected by the splitting O(n ω ) for w < 3 modular reductions coarse grain recursive algorithms rank deficiencies unbalanced task loads C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

29 Design of parallel exact linear algebra ANR HPAC project: Ziad Sultan PhD. Thesis 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography Parallel numerical linear algebra cost invariant wrt. splitting O(n 3 ) fine grain block iterative algorithms regular task load Numerical stability constraints Exact linear algebra specificities cost affected by the splitting O(n ω ) for w < 3 modular reductions coarse grain recursive algorithms rank deficiencies unbalanced task loads [Broquedis, Danjean and Gautier 12]: libkomp based on XKaapi C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

30 Matrix multiplication Matrix Multiplication over Z/pZ Ingedients [Dumas, Gautier and P. 02] Compute over Z and delay modular reductions k ( ) 2 p 1 2 < 2 mantissa C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

31 Matrix multiplication Matrix Multiplication over Z/pZ Ingedients [Dumas, Gautier and P. 02] Compute over Z and delay modular reductions k ( ) 2 p 1 2 < 2 mantissa Fastest integer arithmetic: double, float (SIMD and pipeline) Cache optimizations numerical BLAS C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

32 Matrix multiplication Matrix Multiplication over Z/pZ Ingedients [Dumas, Gautier and P. 02] Compute over Z and delay modular reductions 9 l ( ) 2 k p 1 2 l 2 < 2 mantissa Fastest integer arithmetic: double, float (SIMD and pipeline) Cache optimizations Strassen-Winograd 6n numerical BLAS C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

33 Matrix multiplication Matrix Multiplication over Z/pZ Ingedients [Dumas, Gautier and P. 02] Compute over Z and delay modular reductions 9 l ( ) 2 k p 1 2 l 2 < 2 mantissa Fastest integer arithmetic: double, float (SIMD and pipeline) Cache optimizations Strassen-Winograd 6n numerical BLAS with memory efficient schedules [Boyer, Dumas, P. and Zhou 09] Tradeoffs: Overwriting input Extra memory Leading constant Fully in-place in 7.2n C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

34 Matrix multiplication Sequential Matrix Multiplication i5 3320M at 2.6Ghz with AVX n 3 /time/10 9 (Gfops equiv.) OpenBLAS sgemm matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

35 Matrix multiplication Sequential Matrix Multiplication i5 3320M at 2.6Ghz with AVX n 3 /time/10 9 (Gfops equiv.) FFLAS fgemm over Z/83Z OpenBLAS sgemm matrix dimension p = 83, 1 mod / mul. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

36 Matrix multiplication Sequential Matrix Multiplication i5 3320M at 2.6Ghz with AVX n 3 /time/10 9 (Gfops equiv.) FFLAS fgemm over Z/83Z FFLAS fgemm over Z/821Z OpenBLAS sgemm matrix dimension p = 83, 1 mod / mul. p = 821, 1 mod / 100 mul. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

37 Matrix multiplication Sequential Matrix Multiplication i5 3320M at 2.6Ghz with AVX n 3 /time/10 9 (Gfops equiv.) FFLAS fgemm over Z/83Z FFLAS fgemm over Z/821Z OpenBLAS sgemm 10 FFLAS fgemm over Z/ Z FFLAS fgemm over Z/ Z OpenBLAS dgemm matrix dimension p = 83, 1 mod / mul. p = 821, 1 mod / 100 mul. p = , 1 mod / mul. p = , 1 mod / 100 mul. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

38 Matrix multiplication Parallel matrix multiplication Dumas, Gautier, P. and Sultan pfgemm over Z/131071Z on a Xeon E Ghz 32 cores 1st recursion cutting 2nd recursion cutting 500 B 1 B Gfops 300 A 1 C 11 C A 2 C 21 C MKL dgemm PLASMA-QUARK dgemm matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

39 Matrix multiplication Parallel matrix multiplication Dumas, Gautier, P. and Sultan pfgemm over Z/131071Z on a Xeon E Ghz 32 cores 1st recursion cutting 2nd recursion cutting 500 B 1 B Gfops 300 A 1 C 11 C A 2 C 21 C pfgemm double MKL dgemm PLASMA-QUARK dgemm matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

40 Matrix multiplication Parallel matrix multiplication Dumas, Gautier, P. and Sultan pfgemm over Z/131071Z on a Xeon E Ghz 32 cores 1st recursion cutting 2nd recursion cutting 500 B 1 B Gfops 300 A 1 C 11 C A 2 C 21 C 22 pfgemm double 100 pfgemm mod MKL dgemm PLASMA-QUARK dgemm matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

41 Gaussian elimination Gaussian elimination Slab iterative LAPACK Slab recursive FFLAS-FFPACK Tile iterative PLASMA Tile recursive FFLAS-FFPACK C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

42 Gaussian elimination Gaussian elimination Slab recursive FFLAS-FFPACK Tile recursive FFLAS-FFPACK Prefer recursive algorithms C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

43 Gaussian elimination Gaussian elimination Tile recursive FFLAS-FFPACK Prefer recursive algorithms Better data locality C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

44 Gaussian elimination Full rank Gaussian elimination Dumas, Gautier, P. and Sultan 14 Comparing numerical efficiency (no modulo) 400 parallel PLUQ over double on full rank matrices on 32 cores Gfops PLASMA-Quark dgetrf tiled storage (k=212) 0 PLASMA-Quark dgetrf (k=212) matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

45 Gaussian elimination Full rank Gaussian elimination Dumas, Gautier, P. and Sultan 14 Comparing numerical efficiency (no modulo) 400 parallel PLUQ over double on full rank matrices on 32 cores Gfops Intel MKL dgetrf PLASMA-Quark dgetrf tiled storage (k=212) PLASMA-Quark dgetrf (k=212) matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

46 Gaussian elimination Full rank Gaussian elimination Dumas, Gautier, P. and Sultan 14 Comparing numerical efficiency (no modulo) 400 parallel PLUQ over double on full rank matrices on 32 cores Gfops FFLAS-FFPACK<double> explicit synch FFLAS-FFPACK<double> dataflow synch 50 Intel MKL dgetrf PLASMA-Quark dgetrf tiled storage (k=212) 0 PLASMA-Quark dgetrf (k=212) matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

47 Gaussian elimination Full rank Gaussian elimination Dumas, Gautier, P. and Sultan 14 Over the finite field Z/131071Z 400 Parallel PLUQ over Z/131071Z with full rank matrices on 32 cores Gfops Tiled Rec explicit sync 0 Tiled Rec dataflow sync matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

48 Gaussian elimination Full rank Gaussian elimination Dumas, Gautier, P. and Sultan 14 Over the finite field Z/131071Z 400 Parallel PLUQ over Z/131071Z with full rank matrices on 32 cores Gfops Tiled Rec explicit sync 50 Tiled Rec dataflow sync Tiled Iter dataflow sync 0 Tiled Iter explicit sync matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

49 Rank profiles Rank profiles Definition (Row Rank Profile: RowRP) Given A K m n, r = rank(a). informally: first r linearly independent rows formally: lexico-minimal sub-sequence of (1,..., m) of r indices of linearly independant rows. Example C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

50 Rank profiles Rank profiles Definition (Row Rank Profile: RowRP) Given A K m n, r = rank(a). informally: first r linearly independent rows formally: lexico-minimal sub-sequence of (1,..., m) of r indices of linearly independant rows. Example Rank = 3 RowRP = {1,2,4} C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

51 Rank profiles Rank profiles Definition (Row Rank Profile: RowRP) Given A K m n, r = rank(a). informally: first r linearly independent rows formally: lexico-minimal sub-sequence of (1,..., m) of r indices of linearly independant rows. Example Rank = 3 RowRP = {1,2,4} ColRP = {1,2,3} C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

52 Rank profiles Rank profiles Definition (Row Rank Profile: RowRP) Given A K m n, r = rank(a). informally: first r linearly independent rows formally: lexico-minimal sub-sequence of (1,..., m) of r indices of linearly independant rows. Example Rank = 3 RowRP = {1,2,4} ColRP = {1,2,3} Generic ColRP. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

53 Rank profiles Rank profiles Definition (Row Rank Profile: RowRP) Given A K m n, r = rank(a). informally: first r linearly independent rows formally: lexico-minimal sub-sequence of (1,..., m) of r indices of linearly independant rows. Example Rank = 3 RowRP = {1,2,4} ColRP = {1,2,3} Generic ColRP. Major invariant of a matrix (echelon form) Gröbner basis computations (Macaulay matrix) [Faugère 99, 02] Krylov methods C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

54 Rank profiles Computing rank profiles Via Gaussian elimination revealing echelon forms: [Ibarra, Moran and Hui 82] A = L S P [Keller-Gehrig 85] X A = R [Storjohann 00] X A = R [Jeannerod, P. and Storjohann 13] A = P L E C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

55 Rank profiles Computing rank profiles Via Gaussian elimination revealing echelon forms: [Ibarra, Moran and Hui 82] A = L S P [Keller-Gehrig 85] X A = R [Storjohann 00] X A = R [Jeannerod, P. and Storjohann 13] A = P L E Lessons learned (or what we thought was necessary): treat rows in order exhaust all columns before considering the next row slab block splitting required (recursive or iterative) similar to partial pivoting C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

56 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row order: any non-zero on the first non-zero row Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

57 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Col. order C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

58 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex order: first non-zero on the first non-zero row Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Col. order Lexico. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

59 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Col. order Lexico. Rev. lex. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

60 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Col. order Lexico. Rev. lex. Product C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

61 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Permutation Transpositions Lex/RevLex order: first non-zero on the first non-zero row/col Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Transposition Transposition Col. order Transposition Transposition Lexico. Transposition Transposition Rev. lex. Transposition Transposition Product C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

62 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Permutation Transpositions Cylcic Rotations Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Transposition Transposition Col. order Transposition Transposition Lexico. Transposition Transposition Rev. lex. Transposition Transposition Product Rotation Transposition Product Transposition Rotation C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

63 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Permutation Transpositions Cylcic Rotations Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP All RPs Tiles possible Row order Transposition Transposition Col. order Transposition Transposition Lexico. Transposition Transposition Rev. lex. Transposition Transposition Product Rotation Transposition Product Transposition Rotation Product Rotation Rotation C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

64 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Permutation Transpositions Cylcic Rotations Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP All RPs Tiles possible Row order Transposition Transposition Col. order Transposition Transposition Lexico. Transposition Transposition Lexico. Transposition Rotation Rev. lex. Transposition Transposition Rev. lex. Rotation Transposition Product Rotation Transposition Product Transposition Rotation Product Rotation Rotation C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

65 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Permutation Transpositions Cylcic Rotations Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP All RPs Tiles possible Row order Transposition Transposition Col. order Transposition Transposition Lexico. Transposition Transposition Lexico. Transposition Rotation Rev. lex. Transposition Transposition Rev. lex. Rotation Transposition Product Rotation Transposition Product Transposition Rotation Product Rotation Rotation C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

66 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. A R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

67 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) Same holds for any (i, j)-leading submatrix. A RowRP = {1} ColRP = {1} R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

68 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) Same holds for any (i, j)-leading submatrix. A RowRP = {1,2} ColRP = {1,3} R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

69 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) Same holds for any (i, j)-leading submatrix. A RowRP = {1,4} ColRP = {1,2} R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

70 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) A Same holds for any (i, j)-leading submatrix. RowRP = {1,4} ColRP = {1,2} [ ] [ ] [ ] L 0 Ir U V A = P LUQ = P Q M 0 I m r I n r R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

71 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) A Same holds for any (i, j)-leading submatrix. RowRP = {1,4} ColRP = {1,2} [ ] [ ] [ ] L 0 A = P LUQ = P P T Ir P QQ T U V Q M I m r 0 I n r R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

72 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) L A Same holds for any (i, j)-leading submatrix. RowRP = {1,4} ColRP = {1,2} [ ] [ ] [ ] L 0 A = P LUQ = P P T Ir P Q Q T U V Q M I m r 0 I n r }{{}}{{}}{{} Π P,Q U R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

73 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) L A Same holds for any (i, j)-leading submatrix. RowRP = {1,4} ColRP = {1,2} [ ] [ ] [ ] L 0 A = P LUQ = P P T Ir P Q Q T U V Q M I m r 0 I n r }{{}}{{}}{{} Π P,Q With appropriate pivoting: Π P,Q = R(A) U R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

74 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan block splitting C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

75 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 Recursive call C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

76 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 TRSM: B BU 1 C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

77 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 TRSM: B L 1 B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

78 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

79 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

80 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

81 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 2 independent recursive calls C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

82 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 TRSM: B BU 1 C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

83 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 TRSM: B L 1 B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

84 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

85 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

86 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

87 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 Recursive call C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

88 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 Puzzle game (block cyclic rotations) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

89 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 O(mnr ω 2 ) (degenerating to 2/3n 3 ) computing col. and row rank profiles of all leading sub-matrices fewer modular reductions than slab algorithms rank deficiency introduces parallelism C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

90 Characteristic polynomial Computing the characteristic polynomial Motivation Connection with the Frobenius normal form Krylov methods at large Graph invariants Crucial step in modular form computations The last missing reduction [Danilevskĭi 37], [Hessenberg 42] CharPoly, deterministic O(n 3 ) [Keller-Gehrig 85] CharPoly, deterministic O(n ω log n) [Giesbrecht 93] Frobenius form, Las-Vegas probabilistic O(n ω log n) [Augot, Camion 94] Frobenius form, deterministic O(n 3 #inv factors) [Storjohann 00] Frobenius form, deterministic O(n 3 ) or O(n ω log n log log n) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

91 Characteristic polynomial P. and Storjohann ISSAC 07 k-shifted form: k k k C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

92 Characteristic polynomial P. and Storjohann ISSAC 07 k + 1-shifted form: k + 1 k + 1 d l C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

93 Characteristic polynomial P. and Storjohann ISSAC k + 1-shifted form: k + 1 k + 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

94 Characteristic polynomial 0 1 P. and Storjohann ISSAC 07 Hessenberg polycyclic: d 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) d 2 1 Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree Until the Hessenberg polycyclic form C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

95 Characteristic polynomial 0 1 P. and Storjohann ISSAC 07 Hessenberg polycyclic: d 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) d 2 1 Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree Until the Hessenberg polycyclic form n ( ) 1 ω 1 ζ(ω 1)n ω = O(n ω ) k n ω k= C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

96 Characteristic polynomial 0 1 P. and Storjohann ISSAC 07 Hessenberg polycyclic: d 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) d 2 1 Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree Until the Hessenberg polycyclic form n ( ) 1 ω 1 ζ(ω 1)n ω = O(n ω ) k n ω k= Generalized to the Frobenius form as well Transformation matrix in O(n ω log log n) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

97 Characteristic polynomial 0 1 P. and Storjohann ISSAC 07 Hessenberg polycyclic: d 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) d 2 1 Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree Until the Hessenberg polycyclic form n ( ) 1 ω 1 ζ(ω 1)n ω = O(n ω ) k n ω k= Generalized to the Frobenius form as well Transformation matrix in O(n ω log log n) n magma-v s 24.28s 332.7s 2497s fflas-ffpack 0.532s 2.936s 32.71s 219.2s C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

98 Characteristic polynomial 0 1 P. and Storjohann ISSAC 07 Hessenberg polycyclic: d 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) d Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree Until the Hessenberg polycyclic form n ( ) 1 ω 1 ζ(ω 1)n ω = O(n ω ) k n ω k=1 n magma-v s 24.28s 332.7s 2497s fflas-ffpack 0.532s 2.936s 32.71s 219.2s Generalized to the Frobenius form as well Transformation matrix in O(n ω log log n) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

99 Coding Theory for Fault Tolerant Computer Algebra Outline 1 Design of High Performance Exact Linear Algebra Kernels Matrix multiplication Gaussian elimination Rank profiles Characteristic polynomial 2 Coding Theory for Fault Tolerant Computer Algebra Approximation problems Dense polynomial evaluation codes Rational function codes Sparse evaluation codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

100 Coding Theory for Fault Tolerant Computer Algebra Fault Tolerance Reliability of large scale distributed computing Peak Mean Time To Error Mean Time To Failure Blue Waters 14 Pflops 15min 1/2 day Tsubame Pflops? 15.8h Disk crash, hardware/software failures hard errors Bitflip in main or cache memory soft/silent errors C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

101 Coding Theory for Fault Tolerant Computer Algebra Fault Tolerance Reliability of large scale distributed computing Peak Mean Time To Error Mean Time To Failure Blue Waters 14 Pflops 15min 1/2 day Tsubame Pflops? 15.8h Disk crash, hardware/software failures hard errors Bitflip in main or cache memory soft/silent errors C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

102 Coding Theory for Fault Tolerant Computer Algebra Fault Tolerance Reliability of large scale distributed computing Peak Mean Time To Error Mean Time To Failure Blue Waters 14 Pflops 15min 1/2 day Tsubame Pflops? 15.8h Disk crash, hardware/software failures hard errors Bitflip in main or cache memory soft/silent errors Trust in outsourced computations (P2P, Cloud, Volunteer, etc) Byzantine error model: a corrupted node is not always wrong black-listing is not an option C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

103 Coding Theory for Fault Tolerant Computer Algebra Fault Tolerance Reliability of large scale distributed computing Peak Mean Time To Error Mean Time To Failure Blue Waters 14 Pflops 15min 1/2 day Tsubame Pflops? 15.8h Disk crash, hardware/software failures hard errors Bitflip in main or cache memory soft/silent errors Trust in outsourced computations (P2P, Cloud, Volunteer, etc) Byzantine error model: a corrupted node is not always wrong black-listing is not an option Algorithm Based Fault Tolerance: exploit the algebraic specificity of the algorithm to embed redundancy. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

104 Coding Theory for Fault Tolerant Computer Algebra ABFT using error correcting codes Computations Communication C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

105 Coding Theory for Fault Tolerant Computer Algebra ABFT using error correcting codes Unsecure Computations Noisy Communication Choice of the parallelization algorithm determines the communication channel the error model C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

106 Coding Theory for Fault Tolerant Computer Algebra Evaluation-interpolation schemes Polynomial evaluation Ev (x0,...,x n 1 ) : K <n [X] K n f (f(x 0 ),..., f(x n 1 )) for x 0,..., x n 1 distinct. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

107 Coding Theory for Fault Tolerant Computer Algebra Evaluation-interpolation schemes Polynomial evaluation Ev (x0,...,x n 1 ) : K <n [X] K n f (f(x 0 ),..., f(x n 1 )) for x 0,..., x n 1 distinct. A K[X] m m A(x 0 ) A(x n 1 ) det(a) d 0 d n 1 C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

108 Coding Theory for Fault Tolerant Computer Algebra Evaluation-interpolation schemes Chinese Remainder Theorem Ev (p0,...,p n 1 ) : Z <p0 p n 1 Z p0 Z pn 1 m (m mod p 0,..., m mod p n 1 ) for p 1,..., p n pairwise co-prime. A Z m m A (mod p 0 ) A (mod p n 1 ) det(a) d 0 dn 1 C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

109 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f(x i ) f? Problem Recover an unknown function f, given as a black-box, from its evaluations. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

110 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f(x i ) f = d f? f i=0 c ix i Problem Recover an unknown function f, given as a black-box, from its evaluations. Additional knowledge on the model Dense polynomial: degree bound C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

111 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f? f(x i ) f = t i=1 c ix d i Problem Recover an unknown function f, given as a black-box, from its evaluations. Additional knowledge on the model Dense polynomial: degree bound Sparse polynomial: support unknown, bound on sparsity C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

112 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f g? f g (x i) f = d f i=0 f ix i, g = d g i=0 g ix i Problem Recover an unknown function f, given as a black-box, from its evaluations. Additional knowledge on the model Dense polynomial: degree bound Sparse polynomial: support unknown, bound on sparsity Dense rational function: degree bounds C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

113 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f? f(x i )+e i Problem Recover an unknown function f, given as a black-box, from its evaluations. Additional knowledge on the model Dense polynomial: degree bound Sparse polynomial: support unknown, bound on sparsity Dense rational function: degree bounds Trust in the evaluations errors (outliers) approximations: numerical noise C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

114 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f? f(x i )+e i Problem Recover an unknown function f, given as a black-box, from its evaluations. Additional knowledge on the model Dense polynomial: degree bound Sparse polynomial: support unknown, bound on sparsity Dense rational function: degree bounds Trust in the evaluations errors (outliers) approximations: numerical noise C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

115 Coding Theory for Fault Tolerant Computer Algebra Approximation problems Rational function reconstruction Problem (RFR: Rational Function Reconstruction) Given A, B K[X] with deg B < deg A = n, Find f K df [X], g K n df 1[X] such that f = gb mod A. Fact The Extended Euclidean Algo. run on (A, B) and terminated when deg f j d f < deg f j 1, produces f j = u j A + v j B s.t. 1 (f j, v j ) is a solution to the RFR problem. 2 it is minimal: any other solution (f, g) is of the form f = qf j, g = qv j for q K[X]. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

116 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Dense polynomial interpolation x i F f(x i ) f = d f? f i=0 c ix i without error: polynomial interpolation (Lagrange, Newton, etc). C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

117 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Dense polynomial interpolation with errors x i F f(x i )+e i = y i f = d f? f i=0 c ix i without error: polynomial interpolation (Lagrange, Newton, etc). with errors: Reed-Solomon decoding Erroneous interpolant: h = Interp((y i, x i )) Error locator polynomial: Λ = e i 0 (X x i) Λf = Λh n 1 mod (X x i ) i=0 C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

118 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Dense polynomial interpolation with errors x i F f(x i )+e i = y i f = d f? f i=0 c ix i without error: polynomial interpolation (Lagrange, Newton, etc). with errors: Reed-Solomon decoding Erroneous interpolant: h = Interp((y i, x i )) Error locator polynomial: Λ = e i 0 (X x i) Λf }{{} N = Λ }{{} D h n 1 mod (X x i ) i=0 Rational Reconstruction Problem: (Λf, Λ) is a minimal solution computed by Ext. Euclidean Algorithm f = f j /v j. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

119 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Correction capacity Unique decoding of t errors whenever: n deg f + 2E + 1 Bounding the degree deg f rarely known a priori; bound d f deg f often pessimistic Early termination: without errors: add evaluations until interpolant stabilizes with errors: no stabilization C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

120 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Correction capacity Unique decoding of t errors whenever: n deg f + 2E + 1 Bounding the degree deg f rarely known a priori; bound d f deg f often pessimistic Early termination: without errors: add evaluations until interpolant stabilizes with errors: no stabilization Parameter oblivious decoding Khonji, P., Roch, Roche and Stalinsky 10 Effective redundancy available how to use all available redundancy? f Upper bound on deg f redundancy used with RS codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

121 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Correction capacity Unique decoding of t errors whenever: n deg f + 2E + 1 Bounding the degree deg f rarely known a priori; bound d f deg f often pessimistic Early termination: without errors: add evaluations until interpolant stabilizes with errors: no stabilization Parameter oblivious decoding Khonji, P., Roch, Roche and Stalinsky 10 Effective redundancy available how to use all available redundancy? f Upper bound on deg f redundancy used with RS codes list decoder exploring all length n Reed-Solomon codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

Exact Linear Algebra Algorithmic: Theory and Practice

Exact Linear Algebra Algorithmic: Theory and Practice ISSAC 15 Tutorial Clément Pernet Université Grenoble Alpes, Inria, LIP-AriC July 6, 2015 C. Pernet Exact Linear Algebra Algorithmic July 6, 2015 1