High Performance and Reliable Algebraic Computing

Size: px
Start display at page:

Download "High Performance and Reliable Algebraic Computing"

Transcription

1 High Performance and Reliable Algebraic Computing Soutenance d habilitation à diriger des recherches Clément Pernet Université Joseph Fourier (Grenoble 1) November 25, 2014 Rapporteurs : Daniel Augot, Examinateurs : Jean-Guillaume Dumas, Mark Giesbrecht, Jean-Charles Faugère, Laura Grigori, Erich L. Kaltofen, Brigitte Plateau.

2 Introduction Computer Algebra Computing exactly over Z, Q, Z/pZ, GF(q), K[X]. Symbolic manipulations. Applications where all digits matter: breaking Discrete Log Pb. in quasi-polynomial time [Barbulescu & al. 14], building modular form databases to test the BSD conjecture [Stein 12], formal verification of Hales proof of Kepler conjecture [Hales 05]. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

3 Introduction Computer Algebra Computing exactly over Z, Q, Z/pZ, GF(q), K[X]. Symbolic manipulations. Applications where all digits matter: breaking Discrete Log Pb. in quasi-polynomial time [Barbulescu & al. 14], building modular form databases to test the BSD conjecture [Stein 12], formal verification of Hales proof of Kepler conjecture [Hales 05]. Efficiency mostly rely on linear algebra over Z and Z/pZ. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

4 Introduction Coding theory Protecting information against alteration: deep space communication, data storage, fault tolerance of large scale computations. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

5 Introduction Coding theory Protecting information against alteration: deep space communication, data storage, fault tolerance of large scale computations. Numerical linear algebra Computing fast with approximations: delivering flops to most scientific computations for over 60 years, LinPack: benchmark for the top 500 supercomputers, impacts nowadays computer architectures. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

6 Interactions Parity check, RS codes [Berlekamp 68, Massey 69] [Giorgi, Jeannerod and Villard 03] Krylov methods C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

7 Interactions Parity check, RS codes [Berlekamp 68, Massey 69] [Giorgi, Jeannerod and Villard 03] Krylov methods [Wiedemann 86]: sparse linear system solving over F q C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

8 Interactions Parity check, RS codes [Berlekamp 68, Massey 69] [Giorgi, Jeannerod and Villard 03] Krylov methods [Wiedemann 86]: sparse linear system solving over F q [Chowdhury & al. 14]: fast list decoding of Reed-Solomon codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

9 Interactions Parity check, RS codes [Berlekamp 68, Massey 69] [Giorgi, Jeannerod and Villard 03] Krylov methods [Wiedemann 86]: sparse linear system solving over F q [Chowdhury & al. 14]: fast list decoding of Reed-Solomon codes [Huang and Abraham 84]: Algorithm Based Fault Tolerance (ABFT) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

10 Interactions RS and CRT codes Sparse codes BLAS, parallel algorithms Contributions: design of high performance linear algebra kernels, C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

11 Interactions RS and CRT codes Sparse codes BLAS, parallel algorithms Contributions: design of high performance linear algebra kernels, fault tolerant computer algebra. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

12 Outline 1 Design of High Performance Exact Linear Algebra Kernels Matrix multiplication Gaussian elimination Rank profiles Characteristic polynomial 2 Coding Theory for Fault Tolerant Computer Algebra Approximation problems Dense polynomial evaluation codes Rational function codes Sparse evaluation codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

13 Outline 1 Design of High Performance Exact Linear Algebra Kernels Matrix multiplication Gaussian elimination Rank profiles Characteristic polynomial 2 Coding Theory for Fault Tolerant Computer Algebra Approximation problems Dense polynomial evaluation codes Rational function codes Sparse evaluation codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

14 Reductions: linear algebra s arithmetic complexity < 1969: O(n 3 ) for everyone (Gauss, Householder, Danilevskĭi, etc) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

15 Reductions: linear algebra s arithmetic complexity < 1969: O(n 3 ) for everyone (Gauss, Householder, Danilevskĭi, etc) Matrix Product [Strassen 69]: O(n ). [Schönhage 81] O(n 2.52 ). [Coppersmith, Winograd 90] O(n ). [Le Gall 14] O(n ) MM(n) = O(n ω ) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

16 Reductions: linear algebra s arithmetic complexity < 1969: O(n 3 ) for everyone (Gauss, Householder, Danilevskĭi, etc) Matrix Product [Strassen 69]: O(n ). [Schönhage 81] O(n 2.52 ). [Coppersmith, Winograd 90] O(n ). [Le Gall 14] O(n ) Other operations [Strassen 69]: Inverse in O(n ω ) [Schönhage 72]: QR in O(n ω ) [Bunch, Hopcroft 74]: LU in O(n ω ) [Ibarra & al. 82]: Rank in O(n ω ) [Keller-Gehrig 85]: CharPoly in O(n ω log n) MM(n) = O(n ω ) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

17 Reductions HNF(Z) SNF(Z) Det(Z) [P. and Stein 10] [Abbott, Bronstein and Mulders 99] [Storjohann 05] LinSys(Z) [Storjohann 05] [Storjohann 05] MM(Z) [Dumas, P. and Saunders 09] CharPoly(Z) MinPoly(Z) LinSys(Z p) Det(Z p) MinPoly(Z p) CharPoly(Z p) Det(Z p) LinSys(Z p) [P. and Storjohann 07] [Dumas, P. Saunders 09] [Chen & al. 02] LU(Z p) CharPoly(Z p) [P. and Storjohann 07] Rank(Z p) [Chen & al. 02] [Wiedemann 86] [Kaltofen Saunders 91] [Ibarra, Moran and Hui 82] [Jeannerod, P. and Storjohann 13] [Dumas, P. and Sultan 13] TRSM(Z p) MM(Z p) [Wiedemann 86] MinPoly(Z p) MatVecProd(Z p) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

18 Reductions HNF(Z) SNF(Z) Det(Z) [P. and Stein 10] [Abbott, Bronstein and Mulders 99] [Storjohann 05] LinSys(Z) [Storjohann 05] [Storjohann 05] MM(Z) [Dumas, P. and Saunders 09] CharPoly(Z) MinPoly(Z) LinSys(Z p) Det(Z p) MinPoly(Z p) CharPoly(Z p) Det(Z p) LinSys(Z p) [P. and Storjohann 07] [Dumas, P. Saunders 09] [Chen & al. 02] LU(Z p) CharPoly(Z p) [P. and Storjohann 07] Rank(Z p) [Chen & al. 02] [Wiedemann 86] [Kaltofen Saunders 91] [Ibarra, Moran and Hui 82] [Jeannerod, P. and Storjohann 13] [Dumas, P. and Sultan 13] TRSM(Z p) MM(Z p) [Wiedemann 86] MinPoly(Z p) MatVecProd(Z p) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

19 Reductions HNF(Z) SNF(Z) Det(Z) [P. and Stein 10] [Abbott, Bronstein and Mulders 99] [Storjohann 05] LinSys(Z) [Storjohann 05] [Storjohann 05] MM(Z) [Dumas, P. and Saunders 09] CharPoly(Z) MinPoly(Z) LinSys(Z p) Det(Z p) MinPoly(Z p) CharPoly(Z p) Det(Z p) LinSys(Z p) [P. and Storjohann 07] [Dumas, P. Saunders 09] [Chen & al. 02] LU(Z p) CharPoly(Z p) [P. and Storjohann 07] Rank(Z p) [Chen & al. 02] [Wiedemann 86] [Kaltofen Saunders 91] [Ibarra, Moran and Hui 82] [Jeannerod, P. and Storjohann 13] [Dumas, P. and Sultan 13] TRSM(Z p) MM(Z p) [Wiedemann 86] MinPoly(Z p) MatVecProd(Z p) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

20 Making theoretical reductions effective C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

21 Making theoretical reductions effective Common mistrust Fast linear algebra is never faster numerically unstable C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

22 Making theoretical reductions effective Common mistrust Fast linear algebra is never faster numerically unstable Lucky coincidence building blocks in theory happen to be the most efficient routines in practice reduction trees are still relevant C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

23 Making theoretical reductions effective Common mistrust Fast linear algebra is never faster numerically unstable Lucky coincidence building blocks in theory happen to be the most efficient routines in practice reduction trees are still relevant Roadmap 1 Tune building blocks (MatMul) 2 Improve existing reductions (LU, Echelon) leading constants memory footprint 3 Produce new reduction schemes (CharPoly, Rank Profiles) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

24 Design of parallel exact linear algebra ANR HPAC project: 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

25 Design of parallel exact linear algebra ANR HPAC project: Ziad Sultan PhD. Thesis 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

26 Design of parallel exact linear algebra ANR HPAC project: Ziad Sultan PhD. Thesis 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

27 Design of parallel exact linear algebra ANR HPAC project: Ziad Sultan PhD. Thesis 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography Parallel numerical linear algebra cost invariant wrt. splitting O(n 3 ) fine grain block iterative algorithms regular task load Numerical stability constraints C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

28 Design of parallel exact linear algebra ANR HPAC project: Ziad Sultan PhD. Thesis 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography Parallel numerical linear algebra cost invariant wrt. splitting O(n 3 ) fine grain block iterative algorithms regular task load Numerical stability constraints Exact linear algebra specificities cost affected by the splitting O(n ω ) for w < 3 modular reductions coarse grain recursive algorithms rank deficiencies unbalanced task loads C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

29 Design of parallel exact linear algebra ANR HPAC project: Ziad Sultan PhD. Thesis 1 efficient kernels for exact linear algebra on SMP 2 DSL, runtime as a plugin and composition 3 attacking large scale challenges from cryptography Parallel numerical linear algebra cost invariant wrt. splitting O(n 3 ) fine grain block iterative algorithms regular task load Numerical stability constraints Exact linear algebra specificities cost affected by the splitting O(n ω ) for w < 3 modular reductions coarse grain recursive algorithms rank deficiencies unbalanced task loads [Broquedis, Danjean and Gautier 12]: libkomp based on XKaapi C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

30 Matrix multiplication Matrix Multiplication over Z/pZ Ingedients [Dumas, Gautier and P. 02] Compute over Z and delay modular reductions k ( ) 2 p 1 2 < 2 mantissa C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

31 Matrix multiplication Matrix Multiplication over Z/pZ Ingedients [Dumas, Gautier and P. 02] Compute over Z and delay modular reductions k ( ) 2 p 1 2 < 2 mantissa Fastest integer arithmetic: double, float (SIMD and pipeline) Cache optimizations numerical BLAS C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

32 Matrix multiplication Matrix Multiplication over Z/pZ Ingedients [Dumas, Gautier and P. 02] Compute over Z and delay modular reductions 9 l ( ) 2 k p 1 2 l 2 < 2 mantissa Fastest integer arithmetic: double, float (SIMD and pipeline) Cache optimizations Strassen-Winograd 6n numerical BLAS C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

33 Matrix multiplication Matrix Multiplication over Z/pZ Ingedients [Dumas, Gautier and P. 02] Compute over Z and delay modular reductions 9 l ( ) 2 k p 1 2 l 2 < 2 mantissa Fastest integer arithmetic: double, float (SIMD and pipeline) Cache optimizations Strassen-Winograd 6n numerical BLAS with memory efficient schedules [Boyer, Dumas, P. and Zhou 09] Tradeoffs: Overwriting input Extra memory Leading constant Fully in-place in 7.2n C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

34 Matrix multiplication Sequential Matrix Multiplication i5 3320M at 2.6Ghz with AVX n 3 /time/10 9 (Gfops equiv.) OpenBLAS sgemm matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

35 Matrix multiplication Sequential Matrix Multiplication i5 3320M at 2.6Ghz with AVX n 3 /time/10 9 (Gfops equiv.) FFLAS fgemm over Z/83Z OpenBLAS sgemm matrix dimension p = 83, 1 mod / mul. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

36 Matrix multiplication Sequential Matrix Multiplication i5 3320M at 2.6Ghz with AVX n 3 /time/10 9 (Gfops equiv.) FFLAS fgemm over Z/83Z FFLAS fgemm over Z/821Z OpenBLAS sgemm matrix dimension p = 83, 1 mod / mul. p = 821, 1 mod / 100 mul. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

37 Matrix multiplication Sequential Matrix Multiplication i5 3320M at 2.6Ghz with AVX n 3 /time/10 9 (Gfops equiv.) FFLAS fgemm over Z/83Z FFLAS fgemm over Z/821Z OpenBLAS sgemm 10 FFLAS fgemm over Z/ Z FFLAS fgemm over Z/ Z OpenBLAS dgemm matrix dimension p = 83, 1 mod / mul. p = 821, 1 mod / 100 mul. p = , 1 mod / mul. p = , 1 mod / 100 mul. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

38 Matrix multiplication Parallel matrix multiplication Dumas, Gautier, P. and Sultan pfgemm over Z/131071Z on a Xeon E Ghz 32 cores 1st recursion cutting 2nd recursion cutting 500 B 1 B Gfops 300 A 1 C 11 C A 2 C 21 C MKL dgemm PLASMA-QUARK dgemm matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

39 Matrix multiplication Parallel matrix multiplication Dumas, Gautier, P. and Sultan pfgemm over Z/131071Z on a Xeon E Ghz 32 cores 1st recursion cutting 2nd recursion cutting 500 B 1 B Gfops 300 A 1 C 11 C A 2 C 21 C pfgemm double MKL dgemm PLASMA-QUARK dgemm matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

40 Matrix multiplication Parallel matrix multiplication Dumas, Gautier, P. and Sultan pfgemm over Z/131071Z on a Xeon E Ghz 32 cores 1st recursion cutting 2nd recursion cutting 500 B 1 B Gfops 300 A 1 C 11 C A 2 C 21 C 22 pfgemm double 100 pfgemm mod MKL dgemm PLASMA-QUARK dgemm matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

41 Gaussian elimination Gaussian elimination Slab iterative LAPACK Slab recursive FFLAS-FFPACK Tile iterative PLASMA Tile recursive FFLAS-FFPACK C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

42 Gaussian elimination Gaussian elimination Slab recursive FFLAS-FFPACK Tile recursive FFLAS-FFPACK Prefer recursive algorithms C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

43 Gaussian elimination Gaussian elimination Tile recursive FFLAS-FFPACK Prefer recursive algorithms Better data locality C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

44 Gaussian elimination Full rank Gaussian elimination Dumas, Gautier, P. and Sultan 14 Comparing numerical efficiency (no modulo) 400 parallel PLUQ over double on full rank matrices on 32 cores Gfops PLASMA-Quark dgetrf tiled storage (k=212) 0 PLASMA-Quark dgetrf (k=212) matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

45 Gaussian elimination Full rank Gaussian elimination Dumas, Gautier, P. and Sultan 14 Comparing numerical efficiency (no modulo) 400 parallel PLUQ over double on full rank matrices on 32 cores Gfops Intel MKL dgetrf PLASMA-Quark dgetrf tiled storage (k=212) PLASMA-Quark dgetrf (k=212) matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

46 Gaussian elimination Full rank Gaussian elimination Dumas, Gautier, P. and Sultan 14 Comparing numerical efficiency (no modulo) 400 parallel PLUQ over double on full rank matrices on 32 cores Gfops FFLAS-FFPACK<double> explicit synch FFLAS-FFPACK<double> dataflow synch 50 Intel MKL dgetrf PLASMA-Quark dgetrf tiled storage (k=212) 0 PLASMA-Quark dgetrf (k=212) matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

47 Gaussian elimination Full rank Gaussian elimination Dumas, Gautier, P. and Sultan 14 Over the finite field Z/131071Z 400 Parallel PLUQ over Z/131071Z with full rank matrices on 32 cores Gfops Tiled Rec explicit sync 0 Tiled Rec dataflow sync matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

48 Gaussian elimination Full rank Gaussian elimination Dumas, Gautier, P. and Sultan 14 Over the finite field Z/131071Z 400 Parallel PLUQ over Z/131071Z with full rank matrices on 32 cores Gfops Tiled Rec explicit sync 50 Tiled Rec dataflow sync Tiled Iter dataflow sync 0 Tiled Iter explicit sync matrix dimension C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

49 Rank profiles Rank profiles Definition (Row Rank Profile: RowRP) Given A K m n, r = rank(a). informally: first r linearly independent rows formally: lexico-minimal sub-sequence of (1,..., m) of r indices of linearly independant rows. Example C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

50 Rank profiles Rank profiles Definition (Row Rank Profile: RowRP) Given A K m n, r = rank(a). informally: first r linearly independent rows formally: lexico-minimal sub-sequence of (1,..., m) of r indices of linearly independant rows. Example Rank = 3 RowRP = {1,2,4} C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

51 Rank profiles Rank profiles Definition (Row Rank Profile: RowRP) Given A K m n, r = rank(a). informally: first r linearly independent rows formally: lexico-minimal sub-sequence of (1,..., m) of r indices of linearly independant rows. Example Rank = 3 RowRP = {1,2,4} ColRP = {1,2,3} C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

52 Rank profiles Rank profiles Definition (Row Rank Profile: RowRP) Given A K m n, r = rank(a). informally: first r linearly independent rows formally: lexico-minimal sub-sequence of (1,..., m) of r indices of linearly independant rows. Example Rank = 3 RowRP = {1,2,4} ColRP = {1,2,3} Generic ColRP. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

53 Rank profiles Rank profiles Definition (Row Rank Profile: RowRP) Given A K m n, r = rank(a). informally: first r linearly independent rows formally: lexico-minimal sub-sequence of (1,..., m) of r indices of linearly independant rows. Example Rank = 3 RowRP = {1,2,4} ColRP = {1,2,3} Generic ColRP. Major invariant of a matrix (echelon form) Gröbner basis computations (Macaulay matrix) [Faugère 99, 02] Krylov methods C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

54 Rank profiles Computing rank profiles Via Gaussian elimination revealing echelon forms: [Ibarra, Moran and Hui 82] A = L S P [Keller-Gehrig 85] X A = R [Storjohann 00] X A = R [Jeannerod, P. and Storjohann 13] A = P L E C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

55 Rank profiles Computing rank profiles Via Gaussian elimination revealing echelon forms: [Ibarra, Moran and Hui 82] A = L S P [Keller-Gehrig 85] X A = R [Storjohann 00] X A = R [Jeannerod, P. and Storjohann 13] A = P L E Lessons learned (or what we thought was necessary): treat rows in order exhaust all columns before considering the next row slab block splitting required (recursive or iterative) similar to partial pivoting C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

56 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row order: any non-zero on the first non-zero row Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

57 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Col. order C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

58 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex order: first non-zero on the first non-zero row Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Col. order Lexico. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

59 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Col. order Lexico. Rev. lex. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

60 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Col. order Lexico. Rev. lex. Product C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

61 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Permutation Transpositions Lex/RevLex order: first non-zero on the first non-zero row/col Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Transposition Transposition Col. order Transposition Transposition Lexico. Transposition Transposition Rev. lex. Transposition Transposition Product C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

62 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Permutation Transpositions Cylcic Rotations Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP Tiles possible Row order Transposition Transposition Col. order Transposition Transposition Lexico. Transposition Transposition Rev. lex. Transposition Transposition Product Rotation Transposition Product Transposition Rotation C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

63 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Permutation Transpositions Cylcic Rotations Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP All RPs Tiles possible Row order Transposition Transposition Col. order Transposition Transposition Lexico. Transposition Transposition Rev. lex. Transposition Transposition Product Rotation Transposition Product Transposition Rotation Product Rotation Rotation C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

64 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Permutation Transpositions Cylcic Rotations Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP All RPs Tiles possible Row order Transposition Transposition Col. order Transposition Transposition Lexico. Transposition Transposition Lexico. Transposition Rotation Rev. lex. Transposition Transposition Rev. lex. Rotation Transposition Product Rotation Transposition Product Transposition Rotation Product Rotation Rotation C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

65 Rank profiles Pivoting strategies revealing rank profiles Pivot Search Pivot s (i, j) position minimizes some pre-order: Row/Col order: any non-zero on the first non-zero row/col Lex/RevLex order: first non-zero on the first non-zero row/col Permutation Transpositions Cylcic Rotations Product order: first non-zero in the (i, j) leading sub-matrix Search Row perm. Col. perm. RowRP ColRP All RPs Tiles possible Row order Transposition Transposition Col. order Transposition Transposition Lexico. Transposition Transposition Lexico. Transposition Rotation Rev. lex. Transposition Transposition Rev. lex. Rotation Transposition Product Rotation Transposition Product Transposition Rotation Product Rotation Rotation C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

66 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. A R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

67 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) Same holds for any (i, j)-leading submatrix. A RowRP = {1} ColRP = {1} R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

68 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) Same holds for any (i, j)-leading submatrix. A RowRP = {1,2} ColRP = {1,3} R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

69 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) Same holds for any (i, j)-leading submatrix. A RowRP = {1,4} ColRP = {1,2} R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

70 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) A Same holds for any (i, j)-leading submatrix. RowRP = {1,4} ColRP = {1,2} [ ] [ ] [ ] L 0 Ir U V A = P LUQ = P Q M 0 I m r I n r R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

71 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) A Same holds for any (i, j)-leading submatrix. RowRP = {1,4} ColRP = {1,2} [ ] [ ] [ ] L 0 A = P LUQ = P P T Ir P QQ T U V Q M I m r 0 I n r R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

72 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) L A Same holds for any (i, j)-leading submatrix. RowRP = {1,4} ColRP = {1,2} [ ] [ ] [ ] L 0 A = P LUQ = P P T Ir P Q Q T U V Q M I m r 0 I n r }{{}}{{}}{{} Π P,Q U R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

73 Rank profiles Computing all rank profiles at once Dumas, P. and Sultan 13 Definition (Rank Profile matrix) The unique R A {0, 1} m n such that any pair of (i, j)-leading sub-matrix of R A and of A have the same rank. Theorem RowRP and ColRP read directly on R(A) L A Same holds for any (i, j)-leading submatrix. RowRP = {1,4} ColRP = {1,2} [ ] [ ] [ ] L 0 A = P LUQ = P P T Ir P Q Q T U V Q M I m r 0 I n r }{{}}{{}}{{} Π P,Q With appropriate pivoting: Π P,Q = R(A) U R C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

74 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan block splitting C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

75 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 Recursive call C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

76 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 TRSM: B BU 1 C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

77 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 TRSM: B L 1 B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

78 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

79 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

80 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

81 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 2 independent recursive calls C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

82 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 TRSM: B BU 1 C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

83 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 TRSM: B L 1 B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

84 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

85 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

86 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 MatMul: C C A B C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

87 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 Recursive call C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

88 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 Puzzle game (block cyclic rotations) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

89 Rank profiles A tiled recursive algorithm Dumas, P. and Sultan 13 O(mnr ω 2 ) (degenerating to 2/3n 3 ) computing col. and row rank profiles of all leading sub-matrices fewer modular reductions than slab algorithms rank deficiency introduces parallelism C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

90 Characteristic polynomial Computing the characteristic polynomial Motivation Connection with the Frobenius normal form Krylov methods at large Graph invariants Crucial step in modular form computations The last missing reduction [Danilevskĭi 37], [Hessenberg 42] CharPoly, deterministic O(n 3 ) [Keller-Gehrig 85] CharPoly, deterministic O(n ω log n) [Giesbrecht 93] Frobenius form, Las-Vegas probabilistic O(n ω log n) [Augot, Camion 94] Frobenius form, deterministic O(n 3 #inv factors) [Storjohann 00] Frobenius form, deterministic O(n 3 ) or O(n ω log n log log n) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

91 Characteristic polynomial P. and Storjohann ISSAC 07 k-shifted form: k k k C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

92 Characteristic polynomial P. and Storjohann ISSAC 07 k + 1-shifted form: k + 1 k + 1 d l C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

93 Characteristic polynomial P. and Storjohann ISSAC k + 1-shifted form: k + 1 k + 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

94 Characteristic polynomial 0 1 P. and Storjohann ISSAC 07 Hessenberg polycyclic: d 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) d 2 1 Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree Until the Hessenberg polycyclic form C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

95 Characteristic polynomial 0 1 P. and Storjohann ISSAC 07 Hessenberg polycyclic: d 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) d 2 1 Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree Until the Hessenberg polycyclic form n ( ) 1 ω 1 ζ(ω 1)n ω = O(n ω ) k n ω k= C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

96 Characteristic polynomial 0 1 P. and Storjohann ISSAC 07 Hessenberg polycyclic: d 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) d 2 1 Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree Until the Hessenberg polycyclic form n ( ) 1 ω 1 ζ(ω 1)n ω = O(n ω ) k n ω k= Generalized to the Frobenius form as well Transformation matrix in O(n ω log log n) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

97 Characteristic polynomial 0 1 P. and Storjohann ISSAC 07 Hessenberg polycyclic: d 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) d 2 1 Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree Until the Hessenberg polycyclic form n ( ) 1 ω 1 ζ(ω 1)n ω = O(n ω ) k n ω k= Generalized to the Frobenius form as well Transformation matrix in O(n ω log log n) n magma-v s 24.28s 332.7s 2497s fflas-ffpack 0.532s 2.936s 32.71s 219.2s C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

98 Characteristic polynomial 0 1 P. and Storjohann ISSAC 07 Hessenberg polycyclic: d 1 d l From k to k + 1-shifted in O(n( n k )ω 1 ) d Compute iteratively from a 1-shifted form Invariant factors appear by increasing degree Until the Hessenberg polycyclic form n ( ) 1 ω 1 ζ(ω 1)n ω = O(n ω ) k n ω k=1 n magma-v s 24.28s 332.7s 2497s fflas-ffpack 0.532s 2.936s 32.71s 219.2s Generalized to the Frobenius form as well Transformation matrix in O(n ω log log n) C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

99 Coding Theory for Fault Tolerant Computer Algebra Outline 1 Design of High Performance Exact Linear Algebra Kernels Matrix multiplication Gaussian elimination Rank profiles Characteristic polynomial 2 Coding Theory for Fault Tolerant Computer Algebra Approximation problems Dense polynomial evaluation codes Rational function codes Sparse evaluation codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

100 Coding Theory for Fault Tolerant Computer Algebra Fault Tolerance Reliability of large scale distributed computing Peak Mean Time To Error Mean Time To Failure Blue Waters 14 Pflops 15min 1/2 day Tsubame Pflops? 15.8h Disk crash, hardware/software failures hard errors Bitflip in main or cache memory soft/silent errors C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

101 Coding Theory for Fault Tolerant Computer Algebra Fault Tolerance Reliability of large scale distributed computing Peak Mean Time To Error Mean Time To Failure Blue Waters 14 Pflops 15min 1/2 day Tsubame Pflops? 15.8h Disk crash, hardware/software failures hard errors Bitflip in main or cache memory soft/silent errors C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

102 Coding Theory for Fault Tolerant Computer Algebra Fault Tolerance Reliability of large scale distributed computing Peak Mean Time To Error Mean Time To Failure Blue Waters 14 Pflops 15min 1/2 day Tsubame Pflops? 15.8h Disk crash, hardware/software failures hard errors Bitflip in main or cache memory soft/silent errors Trust in outsourced computations (P2P, Cloud, Volunteer, etc) Byzantine error model: a corrupted node is not always wrong black-listing is not an option C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

103 Coding Theory for Fault Tolerant Computer Algebra Fault Tolerance Reliability of large scale distributed computing Peak Mean Time To Error Mean Time To Failure Blue Waters 14 Pflops 15min 1/2 day Tsubame Pflops? 15.8h Disk crash, hardware/software failures hard errors Bitflip in main or cache memory soft/silent errors Trust in outsourced computations (P2P, Cloud, Volunteer, etc) Byzantine error model: a corrupted node is not always wrong black-listing is not an option Algorithm Based Fault Tolerance: exploit the algebraic specificity of the algorithm to embed redundancy. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

104 Coding Theory for Fault Tolerant Computer Algebra ABFT using error correcting codes Computations Communication C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

105 Coding Theory for Fault Tolerant Computer Algebra ABFT using error correcting codes Unsecure Computations Noisy Communication Choice of the parallelization algorithm determines the communication channel the error model C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

106 Coding Theory for Fault Tolerant Computer Algebra Evaluation-interpolation schemes Polynomial evaluation Ev (x0,...,x n 1 ) : K <n [X] K n f (f(x 0 ),..., f(x n 1 )) for x 0,..., x n 1 distinct. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

107 Coding Theory for Fault Tolerant Computer Algebra Evaluation-interpolation schemes Polynomial evaluation Ev (x0,...,x n 1 ) : K <n [X] K n f (f(x 0 ),..., f(x n 1 )) for x 0,..., x n 1 distinct. A K[X] m m A(x 0 ) A(x n 1 ) det(a) d 0 d n 1 C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

108 Coding Theory for Fault Tolerant Computer Algebra Evaluation-interpolation schemes Chinese Remainder Theorem Ev (p0,...,p n 1 ) : Z <p0 p n 1 Z p0 Z pn 1 m (m mod p 0,..., m mod p n 1 ) for p 1,..., p n pairwise co-prime. A Z m m A (mod p 0 ) A (mod p n 1 ) det(a) d 0 dn 1 C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

109 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f(x i ) f? Problem Recover an unknown function f, given as a black-box, from its evaluations. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

110 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f(x i ) f = d f? f i=0 c ix i Problem Recover an unknown function f, given as a black-box, from its evaluations. Additional knowledge on the model Dense polynomial: degree bound C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

111 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f? f(x i ) f = t i=1 c ix d i Problem Recover an unknown function f, given as a black-box, from its evaluations. Additional knowledge on the model Dense polynomial: degree bound Sparse polynomial: support unknown, bound on sparsity C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

112 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f g? f g (x i) f = d f i=0 f ix i, g = d g i=0 g ix i Problem Recover an unknown function f, given as a black-box, from its evaluations. Additional knowledge on the model Dense polynomial: degree bound Sparse polynomial: support unknown, bound on sparsity Dense rational function: degree bounds C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

113 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f? f(x i )+e i Problem Recover an unknown function f, given as a black-box, from its evaluations. Additional knowledge on the model Dense polynomial: degree bound Sparse polynomial: support unknown, bound on sparsity Dense rational function: degree bounds Trust in the evaluations errors (outliers) approximations: numerical noise C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

114 Coding Theory for Fault Tolerant Computer Algebra Making evaluation-interpolation schemes fault tolerant x i F f? f(x i )+e i Problem Recover an unknown function f, given as a black-box, from its evaluations. Additional knowledge on the model Dense polynomial: degree bound Sparse polynomial: support unknown, bound on sparsity Dense rational function: degree bounds Trust in the evaluations errors (outliers) approximations: numerical noise C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

115 Coding Theory for Fault Tolerant Computer Algebra Approximation problems Rational function reconstruction Problem (RFR: Rational Function Reconstruction) Given A, B K[X] with deg B < deg A = n, Find f K df [X], g K n df 1[X] such that f = gb mod A. Fact The Extended Euclidean Algo. run on (A, B) and terminated when deg f j d f < deg f j 1, produces f j = u j A + v j B s.t. 1 (f j, v j ) is a solution to the RFR problem. 2 it is minimal: any other solution (f, g) is of the form f = qf j, g = qv j for q K[X]. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

116 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Dense polynomial interpolation x i F f(x i ) f = d f? f i=0 c ix i without error: polynomial interpolation (Lagrange, Newton, etc). C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

117 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Dense polynomial interpolation with errors x i F f(x i )+e i = y i f = d f? f i=0 c ix i without error: polynomial interpolation (Lagrange, Newton, etc). with errors: Reed-Solomon decoding Erroneous interpolant: h = Interp((y i, x i )) Error locator polynomial: Λ = e i 0 (X x i) Λf = Λh n 1 mod (X x i ) i=0 C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

118 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Dense polynomial interpolation with errors x i F f(x i )+e i = y i f = d f? f i=0 c ix i without error: polynomial interpolation (Lagrange, Newton, etc). with errors: Reed-Solomon decoding Erroneous interpolant: h = Interp((y i, x i )) Error locator polynomial: Λ = e i 0 (X x i) Λf }{{} N = Λ }{{} D h n 1 mod (X x i ) i=0 Rational Reconstruction Problem: (Λf, Λ) is a minimal solution computed by Ext. Euclidean Algorithm f = f j /v j. C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

119 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Correction capacity Unique decoding of t errors whenever: n deg f + 2E + 1 Bounding the degree deg f rarely known a priori; bound d f deg f often pessimistic Early termination: without errors: add evaluations until interpolant stabilizes with errors: no stabilization C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

120 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Correction capacity Unique decoding of t errors whenever: n deg f + 2E + 1 Bounding the degree deg f rarely known a priori; bound d f deg f often pessimistic Early termination: without errors: add evaluations until interpolant stabilizes with errors: no stabilization Parameter oblivious decoding Khonji, P., Roch, Roche and Stalinsky 10 Effective redundancy available how to use all available redundancy? f Upper bound on deg f redundancy used with RS codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

121 Coding Theory for Fault Tolerant Computer Algebra Dense polynomial evaluation codes Correction capacity Unique decoding of t errors whenever: n deg f + 2E + 1 Bounding the degree deg f rarely known a priori; bound d f deg f often pessimistic Early termination: without errors: add evaluations until interpolant stabilizes with errors: no stabilization Parameter oblivious decoding Khonji, P., Roch, Roche and Stalinsky 10 Effective redundancy available how to use all available redundancy? f Upper bound on deg f redundancy used with RS codes list decoder exploring all length n Reed-Solomon codes C. Pernet (Habilitation defense) High Perf. and Reliable Algebraic Computing November 25, / 39

Exact Linear Algebra Algorithmic: Theory and Practice

Exact Linear Algebra Algorithmic: Theory and Practice Exact Linear Algebra Algorithmic: Theory and Practice ISSAC 15 Tutorial Clément Pernet Université Grenoble Alpes, Inria, LIP-AriC July 6, 2015 C. Pernet Exact Linear Algebra Algorithmic July 6, 2015 1

More information

Computing the Rank Profile Matrix

Computing the Rank Profile Matrix Computing the Rank Profile Matrix Jean-Guillaume Dumas, Clément Pernet, Ziad Sultan To cite this version: Jean-Guillaume Dumas, Clément Pernet, Ziad Sultan. Computing the Rank Profile Matrix. Kazuhiro

More information

Eliminations and echelon forms in exact linear algebra

Eliminations and echelon forms in exact linear algebra Eliminations and echelon forms in exact linear algebra Clément PERNET, INRIA-MOAIS, Grenoble Université, France East Coast Computer Algebra Day, University of Waterloo, ON, Canada, April 9, 2011 Clément

More information

Linear Algebra 1: Computing canonical forms in exact linear algebra

Linear Algebra 1: Computing canonical forms in exact linear algebra Linear Algebra : Computing canonical forms in exact linear algebra Clément PERNET, LIG/INRIA-MOAIS, Grenoble Université, France ECRYPT II: Summer School on Tools, Mykonos, Grece, June st, 202 Introduction

More information

Toward High Performance Matrix Multiplication for Exact Computation

Toward High Performance Matrix Multiplication for Exact Computation Toward High Performance Matrix Multiplication for Exact Computation Pascal Giorgi Joint work with Romain Lebreton (U. Waterloo) Funded by the French ANR project HPAC Séminaire CASYS - LJK, April 2014 Motivations

More information

Solving Sparse Rational Linear Systems. Pascal Giorgi. University of Waterloo (Canada) / University of Perpignan (France) joint work with

Solving Sparse Rational Linear Systems. Pascal Giorgi. University of Waterloo (Canada) / University of Perpignan (France) joint work with Solving Sparse Rational Linear Systems Pascal Giorgi University of Waterloo (Canada) / University of Perpignan (France) joint work with A. Storjohann, M. Giesbrecht (University of Waterloo), W. Eberly

More information

hal , version 1-29 Feb 2008

hal , version 1-29 Feb 2008 Compressed Modular Matrix Multiplication Jean-Guillaume Dumas Laurent Fousse Bruno Salvy February 29, 2008 Abstract We propose to store several integers modulo a small prime into a single machine word.

More information

arxiv: v2 [cs.sc] 14 May 2018

arxiv: v2 [cs.sc] 14 May 2018 Fast Computation of the Rank Profile Matrix and the Generalized Bruhat Decomposition arxiv:1601.01798v2 [cs.sc 14 May 2018 Jean-Guillaume Dumas Université Grenoble Alpes, Laboratoire LJK, umr CNRS, BP53X,

More information

Adaptive decoding for dense and sparse evaluation/interpolation codes

Adaptive decoding for dense and sparse evaluation/interpolation codes Adaptive decoding for dense and sparse evaluation/interpolation codes Clément PERNET INRIA/LIG-MOAIS, Grenoble Université joint work with M. Comer, E. Kaltofen, J-L. Roch, T. Roche Séminaire Calcul formel

More information

On the use of polynomial matrix approximant in the block Wiedemann algorithm

On the use of polynomial matrix approximant in the block Wiedemann algorithm On the use of polynomial matrix approximant in the block Wiedemann algorithm ECOLE NORMALE SUPERIEURE DE LYON Laboratoire LIP, ENS Lyon, France Pascal Giorgi Symbolic Computation Group, School of Computer

More information

Inverting integer and polynomial matrices. Jo60. Arne Storjohann University of Waterloo

Inverting integer and polynomial matrices. Jo60. Arne Storjohann University of Waterloo Inverting integer and polynomial matrices Jo60 Arne Storjohann University of Waterloo Integer Matrix Inverse Input: An n n matrix A filled with entries of size d digits. Output: The matrix inverse A 1.

More information

Algorithms for exact (dense) linear algebra

Algorithms for exact (dense) linear algebra Algorithms for exact (dense) linear algebra Gilles Villard CNRS, Laboratoire LIP ENS Lyon Montagnac-Montpezat, June 3, 2005 Introduction Problem: Study of complexity estimates for basic problems in exact

More information

The Seven Dwarfs of Symbolic Computation and the Discovery of Reduced Symbolic Models. Erich Kaltofen North Carolina State University google->kaltofen

The Seven Dwarfs of Symbolic Computation and the Discovery of Reduced Symbolic Models. Erich Kaltofen North Carolina State University google->kaltofen The Seven Dwarfs of Symbolic Computation and the Discovery of Reduced Symbolic Models Erich Kaltofen North Carolina State University google->kaltofen Outline 2 The 7 Dwarfs of Symbolic Computation Discovery

More information

Sparse Polynomial Interpolation and Berlekamp/Massey algorithms that correct Outlier Errors in Input Values

Sparse Polynomial Interpolation and Berlekamp/Massey algorithms that correct Outlier Errors in Input Values Sparse Polynomial Interpolation and Berlekamp/Massey algorithms that correct Outlier Errors in Input Values Clément PERNET joint work with Matthew T. COMER and Erich L. KALTOFEN : LIG/INRIA-MOAIS, Grenoble

More information

Matrix multiplication over word-size prime fields using Bini s approximate formula

Matrix multiplication over word-size prime fields using Bini s approximate formula Matrix multiplication over word-size prime fields using Bini s approximate formula Brice Boyer, Jean-Guillaume Dumas To cite this version: Brice Boyer, Jean-Guillaume Dumas. Matrix multiplication over

More information

Black Box Linear Algebra

Black Box Linear Algebra Black Box Linear Algebra An Introduction to Wiedemann s Approach William J. Turner Department of Mathematics & Computer Science Wabash College Symbolic Computation Sometimes called Computer Algebra Symbols

More information

Computing over Z, Q, K[X]

Computing over Z, Q, K[X] Computing over Z, Q, K[X] Clément PERNET M2-MIA Calcul Exact Outline Introduction Chinese Remainder Theorem Rational reconstruction Problem Statement Algorithms Applications Dense CRT codes Extension to

More information

Fast computation of normal forms of polynomial matrices

Fast computation of normal forms of polynomial matrices 1/25 Fast computation of normal forms of polynomial matrices Vincent Neiger Inria AriC, École Normale Supérieure de Lyon, France University of Waterloo, Ontario, Canada Partially supported by the mobility

More information

Computing minimal interpolation bases

Computing minimal interpolation bases Computing minimal interpolation bases Vincent Neiger,, Claude-Pierre Jeannerod Éric Schost Gilles Villard AriC, LIP, École Normale Supérieure de Lyon, France University of Waterloo, Ontario, Canada Supported

More information

Lattice reduction of polynomial matrices

Lattice reduction of polynomial matrices Lattice reduction of polynomial matrices Arne Storjohann David R. Cheriton School of Computer Science University of Waterloo Presented at the SIAM conference on Applied Algebraic Geometry at the Colorado

More information

BOUNDS ON THE COEFFICIENTS OF THE CHARACTERISTIC AND MINIMAL POLYNOMIALS

BOUNDS ON THE COEFFICIENTS OF THE CHARACTERISTIC AND MINIMAL POLYNOMIALS BOUNDS ON THE COEFFICIENTS OF THE CHARACTERISTIC AND MINIMAL POLYNOMIALS Received: 5 October, 006 Accepted: 3 April, 007 Communicated by: JEAN-GUILLAUME DUMAS Laboratoire Jean Kuntzmann Université Joseph

More information

Fast algorithms for multivariate interpolation problems

Fast algorithms for multivariate interpolation problems Fast algorithms for multivariate interpolation problems Vincent Neiger,, Claude-Pierre Jeannerod Éric Schost Gilles Villard AriC, LIP, École Normale Supérieure de Lyon, France ORCCA, Computer Science Department,

More information

Dense Arithmetic over Finite Fields with CUMODP

Dense Arithmetic over Finite Fields with CUMODP Dense Arithmetic over Finite Fields with CUMODP Sardar Anisul Haque 1 Xin Li 2 Farnam Mansouri 1 Marc Moreno Maza 1 Wei Pan 3 Ning Xie 1 1 University of Western Ontario, Canada 2 Universidad Carlos III,

More information

Solving Sparse Integer Linear Systems. Pascal Giorgi. Dali team - University of Perpignan (France)

Solving Sparse Integer Linear Systems. Pascal Giorgi. Dali team - University of Perpignan (France) Solving Sarse Integer Linear Systems Pascal Giorgi Dali team - University of Perignan (France) in collaboration with A. Storjohann, M. Giesbrecht (University of Waterloo), W. Eberly (University of Calgary),

More information

WORKING WITH MULTIVARIATE POLYNOMIALS IN MAPLE

WORKING WITH MULTIVARIATE POLYNOMIALS IN MAPLE WORKING WITH MULTIVARIATE POLYNOMIALS IN MAPLE JEFFREY B. FARR AND ROMAN PEARCE Abstract. We comment on the implementation of various algorithms in multivariate polynomial theory. Specifically, we describe

More information

Fast computation of normal forms of polynomial matrices

Fast computation of normal forms of polynomial matrices 1/25 Fast computation of normal forms of polynomial matrices Vincent Neiger Inria AriC, École Normale Supérieure de Lyon, France University of Waterloo, Ontario, Canada Partially supported by the mobility

More information

Why solve integer linear systems exactly? David Saunders - U of Delaware

Why solve integer linear systems exactly? David Saunders - U of Delaware NSF CDI, 2007Oct30 1 Why solve integer linear systems exactly?... an enabling tool for discovery and innovation? David Saunders - U of Delaware LinBox library at linalg.org NSF CDI, 2007Oct30 2 LinBoxers

More information

Accelerating linear algebra computations with hybrid GPU-multicore systems.

Accelerating linear algebra computations with hybrid GPU-multicore systems. Accelerating linear algebra computations with hybrid GPU-multicore systems. Marc Baboulin INRIA/Université Paris-Sud joint work with Jack Dongarra (University of Tennessee and Oak Ridge National Laboratory)

More information

Q-adic Transform revisited

Q-adic Transform revisited Q-adic Transform revisited Jean-Guillaume Dumas To cite this version: Jean-Guillaume Dumas. Q-adic Transform revisited. 2007. HAL Id: hal-00173894 https://hal.archives-ouvertes.fr/hal-00173894v3

More information

A variant of the F4 algorithm

A variant of the F4 algorithm A variant of the F4 algorithm Vanessa VITSE - Antoine JOUX Université de Versailles Saint-Quentin, Laboratoire PRISM CT-RSA, February 18, 2011 Motivation Motivation An example of algebraic cryptanalysis

More information

Giac and GeoGebra: improved Gröbner basis computations

Giac and GeoGebra: improved Gröbner basis computations Giac and GeoGebra: improved Gröbner basis computations Z. Kovács, B. Parisse JKU Linz, University of Grenoble I November 25, 2013 Two parts talk 1 GeoGebra (Z. Kovács) 2 (B. Parisse) History of used CAS

More information

Solving Sparse Integer Linear Systems

Solving Sparse Integer Linear Systems Solving Sparse Integer Linear Systems Wayne Eberly, Mark Giesbrecht, Giorgi Pascal, Arne Storjohann, Gilles Villard To cite this version: Wayne Eberly, Mark Giesbrecht, Giorgi Pascal, Arne Storjohann,

More information

Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers

Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers UT College of Engineering Tutorial Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and DPLASMA and StarPU Schedulers Stan Tomov 1, George Bosilca 1, and Cédric

More information

Black Box Linear Algebra with the LinBox Library. William J. Turner North Carolina State University wjturner

Black Box Linear Algebra with the LinBox Library. William J. Turner North Carolina State University   wjturner Black Box Linear Algebra with the LinBox Library William J. Turner North Carolina State University www.math.ncsu.edu/ wjturner Overview 1. Black Box Linear Algebra 2. Project LinBox 3. LinBox Objects 4.

More information

Modular Algorithms for Computing Minimal Associated Primes and Radicals of Polynomial Ideals. Masayuki Noro. Toru Aoyama

Modular Algorithms for Computing Minimal Associated Primes and Radicals of Polynomial Ideals. Masayuki Noro. Toru Aoyama Modular Algorithms for Computing Minimal Associated Primes and Radicals of Polynomial Ideals Toru Aoyama Kobe University Department of Mathematics Graduate school of Science Rikkyo University Department

More information

Exact Arithmetic on a Computer

Exact Arithmetic on a Computer Exact Arithmetic on a Computer Symbolic Computation and Computer Algebra William J. Turner Department of Mathematics & Computer Science Wabash College Crawfordsville, IN 47933 Tuesday 21 September 2010

More information

Fast, deterministic computation of the Hermite normal form and determinant of a polynomial matrix

Fast, deterministic computation of the Hermite normal form and determinant of a polynomial matrix Fast, deterministic computation of the Hermite normal form and determinant of a polynomial matrix George Labahn, Vincent Neiger, Wei Zhou To cite this version: George Labahn, Vincent Neiger, Wei Zhou.

More information

Maximal Matching via Gaussian Elimination

Maximal Matching via Gaussian Elimination Maximal Matching via Gaussian Elimination Piotr Sankowski Uniwersytet Warszawski - p. 1/77 Outline Maximum Matchings Lovásza s Idea, Maximum matchings, Rabin and Vazirani, Gaussian Elimination, Simple

More information

Black Box Linear Algebra with the LinBox Library. William J. Turner North Carolina State University wjturner

Black Box Linear Algebra with the LinBox Library. William J. Turner North Carolina State University   wjturner Black Box Linear Algebra with the LinBox Library William J. Turner North Carolina State University www.math.ncsu.edu/ wjturner Overview 1. Black Box Linear Algebra 2. Project LinBox 3. LinBox Objects 4.

More information

APPLIED NUMERICAL LINEAR ALGEBRA

APPLIED NUMERICAL LINEAR ALGEBRA APPLIED NUMERICAL LINEAR ALGEBRA James W. Demmel University of California Berkeley, California Society for Industrial and Applied Mathematics Philadelphia Contents Preface 1 Introduction 1 1.1 Basic Notation

More information

Binding Performance and Power of Dense Linear Algebra Operations

Binding Performance and Power of Dense Linear Algebra Operations 10th IEEE International Symposium on Parallel and Distributed Processing with Applications Binding Performance and Power of Dense Linear Algebra Operations Maria Barreda, Manuel F. Dolz, Rafael Mayo, Enrique

More information

Nearly Sparse Linear Algebra and application to Discrete Logarithms Computations

Nearly Sparse Linear Algebra and application to Discrete Logarithms Computations Nearly Sparse Linear Algebra and application to Discrete Logarithms Computations Antoine Joux 1,2,3 and Cécile Pierrot 1,4 1 Sorbonne Universités, UPMC Univ Paris 06, LIP6, 4 place Jussieu, 75005 PARIS,

More information

A Fast Las Vegas Algorithm for Computing the Smith Normal Form of a Polynomial Matrix

A Fast Las Vegas Algorithm for Computing the Smith Normal Form of a Polynomial Matrix A Fast Las Vegas Algorithm for Computing the Smith Normal Form of a Polynomial Matrix Arne Storjohann and George Labahn Department of Computer Science University of Waterloo Waterloo, Ontario, Canada N2L

More information

Computing Characteristic Polynomials of Matrices of Structured Polynomials

Computing Characteristic Polynomials of Matrices of Structured Polynomials Computing Characteristic Polynomials of Matrices of Structured Polynomials Marshall Law and Michael Monagan Department of Mathematics Simon Fraser University Burnaby, British Columbia, Canada mylaw@sfu.ca

More information

On Newton-Raphson iteration for multiplicative inverses modulo prime powers

On Newton-Raphson iteration for multiplicative inverses modulo prime powers On Newton-Raphson iteration for multiplicative inverses modulo prime powers Jean-Guillaume Dumas To cite this version: Jean-Guillaume Dumas. On Newton-Raphson iteration for multiplicative inverses modulo

More information

Algorithms for Fast Linear System Solving and Rank Profile Computation

Algorithms for Fast Linear System Solving and Rank Profile Computation Algorithms for Fast Linear System Solving and Rank Profile Computation by Shiyun Yang A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master

More information

Cryptographic Engineering

Cryptographic Engineering Cryptographic Engineering Clément PERNET M2 Cyber Security, UFR-IM 2 AG, Univ. Grenoble-Alpes ENSIMAG, Grenoble INP Outline Coding Theory Introduction Linear Codes Reed-Solomon codes Application: Mc Eliece

More information

Dense Linear Algebra over Finite Fields: the FFLAS and FFPACK packages

Dense Linear Algebra over Finite Fields: the FFLAS and FFPACK packages Dense Linear Algebra over Finite Fields: the FFLAS and FFPACK packages Jean-Guillaume Dumas, Thierry Gautier, Pascal Giorgi, Clément Pernet To cite this version: Jean-Guillaume Dumas, Thierry Gautier,

More information

Fast reversion of formal power series

Fast reversion of formal power series Fast reversion of formal power series Fredrik Johansson LFANT, INRIA Bordeaux RAIM, 2016-06-29, Banyuls-sur-mer 1 / 30 Reversion of power series F = exp(x) 1 = x + x 2 2! + x 3 3! + x 4 G = log(1 + x)

More information

Jacobi-Based Eigenvalue Solver on GPU. Lung-Sheng Chien, NVIDIA

Jacobi-Based Eigenvalue Solver on GPU. Lung-Sheng Chien, NVIDIA Jacobi-Based Eigenvalue Solver on GPU Lung-Sheng Chien, NVIDIA lchien@nvidia.com Outline Symmetric eigenvalue solver Experiment Applications Conclusions Symmetric eigenvalue solver The standard form is

More information

Fast Computation of Hermite Normal Forms of Random Integer Matrices

Fast Computation of Hermite Normal Forms of Random Integer Matrices Fast Computation of Hermite Normal Forms of Random Integer Matrices Clément Pernet 1 William Stein 2 Abstract This paper is about how to compute the Hermite normal form of a random integer matrix in practice.

More information

Efficient Algorithms for Order Bases Computation

Efficient Algorithms for Order Bases Computation Efficient Algorithms for Order Bases Computation Wei Zhou and George Labahn Cheriton School of Computer Science University of Waterloo, Waterloo, Ontario, Canada Abstract In this paper we present two algorithms

More information

Modeling and Tuning Parallel Performance in Dense Linear Algebra

Modeling and Tuning Parallel Performance in Dense Linear Algebra Modeling and Tuning Parallel Performance in Dense Linear Algebra Initial Experiences with the Tile QR Factorization on a Multi Core System CScADS Workshop on Automatic Tuning for Petascale Systems Snowbird,

More information

THE VECTOR RATIONAL FUNCTION RECONSTRUCTION PROBLEM

THE VECTOR RATIONAL FUNCTION RECONSTRUCTION PROBLEM 137 TH VCTOR RATIONAL FUNCTION RCONSTRUCTION PROBLM ZACH OLSH and ARN STORJOHANN David R. Cheriton School of Computer Science University of Waterloo, Ontario, Canada N2L 3G1 -mail: astorjoh@uwaterloo.ca

More information

CHAPMAN & HALL/CRC CRYPTOGRAPHY AND NETWORK SECURITY ALGORITHMIC CR YPTAN ALY51S. Ant nine J aux

CHAPMAN & HALL/CRC CRYPTOGRAPHY AND NETWORK SECURITY ALGORITHMIC CR YPTAN ALY51S. Ant nine J aux CHAPMAN & HALL/CRC CRYPTOGRAPHY AND NETWORK SECURITY ALGORITHMIC CR YPTAN ALY51S Ant nine J aux (g) CRC Press Taylor 8* Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor &

More information

1 Overview. 2 Adapting to computing system evolution. 11 th European LS-DYNA Conference 2017, Salzburg, Austria

1 Overview. 2 Adapting to computing system evolution. 11 th European LS-DYNA Conference 2017, Salzburg, Austria 1 Overview Improving LSTC s Multifrontal Linear Solver Roger Grimes 3, Robert Lucas 3, Nick Meng 2, Francois-Henry Rouet 3, Clement Weisbecker 3, and Ting-Ting Zhu 1 1 Cray Incorporated 2 Intel Corporation

More information

Output-sensitive algorithms for sumset and sparse polynomial multiplication

Output-sensitive algorithms for sumset and sparse polynomial multiplication Output-sensitive algorithms for sumset and sparse polynomial multiplication Andrew Arnold Cheriton School of Computer Science University of Waterloo Waterloo, Ontario, Canada Daniel S. Roche Computer Science

More information

Problème du logarithme discret sur courbes elliptiques

Problème du logarithme discret sur courbes elliptiques Problème du logarithme discret sur courbes elliptiques Vanessa VITSE Université de Versailles Saint-Quentin, Laboratoire PRISM Groupe de travail équipe ARITH LIRMM Vanessa VITSE (UVSQ) DLP over elliptic

More information

Lecture 3: Error Correcting Codes

Lecture 3: Error Correcting Codes CS 880: Pseudorandomness and Derandomization 1/30/2013 Lecture 3: Error Correcting Codes Instructors: Holger Dell and Dieter van Melkebeek Scribe: Xi Wu In this lecture we review some background on error

More information

Exactly Solve Integer Linear Systems Using Numerical Methods

Exactly Solve Integer Linear Systems Using Numerical Methods Exactly Solve Integer Linear Systems Using Numerical Methods Zhendong Wan Department of Computer Science, University of Delaware, Newark, DE, USA Abstract We present a new fast way to exactly solve non-singular

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 2 Systems of Linear Equations Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction

More information

Utilisation de la compression low-rank pour réduire la complexité du solveur PaStiX

Utilisation de la compression low-rank pour réduire la complexité du solveur PaStiX Utilisation de la compression low-rank pour réduire la complexité du solveur PaStiX 26 Septembre 2018 - JCAD 2018 - Lyon Grégoire Pichon, Mathieu Faverge, Pierre Ramet, Jean Roman Outline 1. Context 2.

More information

arxiv: v5 [cs.sc] 15 May 2018

arxiv: v5 [cs.sc] 15 May 2018 On Newton-Raphson iteration for multiplicative inverses modulo prime powers arxiv:109.666v5 [cs.sc] 15 May 018 Jean-Guillaume Dumas May 16, 018 Abstract We study algorithms for the fast computation of

More information

Parallel Numerical Algorithms

Parallel Numerical Algorithms Parallel Numerical Algorithms Chapter 6 Matrix Models Section 6.2 Low Rank Approximation Edgar Solomonik Department of Computer Science University of Illinois at Urbana-Champaign CS 554 / CSE 512 Edgar

More information

LINEAR ALGEBRA: NUMERICAL METHODS. Version: August 12,

LINEAR ALGEBRA: NUMERICAL METHODS. Version: August 12, LINEAR ALGEBRA: NUMERICAL METHODS. Version: August 12, 2000 74 6 Summary Here we summarize the most important information about theoretical and numerical linear algebra. MORALS OF THE STORY: I. Theoretically

More information

Information redundancy

Information redundancy Information redundancy Information redundancy add information to date to tolerate faults error detecting codes error correcting codes data applications communication memory p. 2 - Design of Fault Tolerant

More information

Introduction to communication avoiding algorithms for direct methods of factorization in Linear Algebra

Introduction to communication avoiding algorithms for direct methods of factorization in Linear Algebra Introduction to communication avoiding algorithms for direct methods of factorization in Linear Algebra Laura Grigori Abstract Modern, massively parallel computers play a fundamental role in a large and

More information

On the Complexity of Gröbner Basis Computation for Regular and Semi-Regular Systems

On the Complexity of Gröbner Basis Computation for Regular and Semi-Regular Systems On the Complexity of Gröbner Basis Computation for Regular and Semi-Regular Systems Bruno.Salvy@inria.fr Algorithms Project, Inria Joint work with Magali Bardet & Jean-Charles Faugère September 21st, 2006

More information

Coding for loss tolerant systems

Coding for loss tolerant systems Coding for loss tolerant systems Workshop APRETAF, 22 janvier 2009 Mathieu Cunche, Vincent Roca INRIA, équipe Planète INRIA Rhône-Alpes Mathieu Cunche, Vincent Roca The erasure channel Erasure codes Reed-Solomon

More information

Sparse BLAS-3 Reduction

Sparse BLAS-3 Reduction Sparse BLAS-3 Reduction to Banded Upper Triangular (Spar3Bnd) Gary Howell, HPC/OIT NC State University gary howell@ncsu.edu Sparse BLAS-3 Reduction p.1/27 Acknowledgements James Demmel, Gene Golub, Franc

More information

Communication-avoiding LU and QR factorizations for multicore architectures

Communication-avoiding LU and QR factorizations for multicore architectures Communication-avoiding LU and QR factorizations for multicore architectures DONFACK Simplice INRIA Saclay Joint work with Laura Grigori INRIA Saclay Alok Kumar Gupta BCCS,Norway-5075 16th April 2010 Communication-avoiding

More information

Iterative Encoding of Low-Density Parity-Check Codes

Iterative Encoding of Low-Density Parity-Check Codes Iterative Encoding of Low-Density Parity-Check Codes David Haley, Alex Grant and John Buetefuer Institute for Telecommunications Research University of South Australia Mawson Lakes Blvd Mawson Lakes SA

More information

Roundoff Error. Monday, August 29, 11

Roundoff Error. Monday, August 29, 11 Roundoff Error A round-off error (rounding error), is the difference between the calculated approximation of a number and its exact mathematical value. Numerical analysis specifically tries to estimate

More information

Hybrid static/dynamic scheduling for already optimized dense matrix factorization. Joint Laboratory for Petascale Computing, INRIA-UIUC

Hybrid static/dynamic scheduling for already optimized dense matrix factorization. Joint Laboratory for Petascale Computing, INRIA-UIUC Hybrid static/dynamic scheduling for already optimized dense matrix factorization Simplice Donfack, Laura Grigori, INRIA, France Bill Gropp, Vivek Kale UIUC, USA Joint Laboratory for Petascale Computing,

More information

Error Detection, Correction and Erasure Codes for Implementation in a Cluster File-system

Error Detection, Correction and Erasure Codes for Implementation in a Cluster File-system Error Detection, Correction and Erasure Codes for Implementation in a Cluster File-system Steve Baker December 6, 2011 Abstract. The evaluation of various error detection and correction algorithms and

More information

Polynomial interpolation over finite fields and applications to list decoding of Reed-Solomon codes

Polynomial interpolation over finite fields and applications to list decoding of Reed-Solomon codes Polynomial interpolation over finite fields and applications to list decoding of Reed-Solomon codes Roberta Barbi December 17, 2015 Roberta Barbi List decoding December 17, 2015 1 / 13 Codes Let F q be

More information

Decoding linear codes via systems solving: complexity issues and generalized Newton identities

Decoding linear codes via systems solving: complexity issues and generalized Newton identities Decoding linear codes via systems solving: complexity issues and generalized Newton identities Stanislav Bulygin (joint work with Ruud Pellikaan) University of Valladolid Valladolid, Spain March 14, 2008

More information

Computing Minimal Nullspace Bases

Computing Minimal Nullspace Bases Computing Minimal Nullspace ases Wei Zhou, George Labahn, and Arne Storjohann Cheriton School of Computer Science University of Waterloo, Waterloo, Ontario, Canada {w2zhou,glabahn,astorjoh}@uwaterloo.ca

More information

Integer multiplication with generalized Fermat primes

Integer multiplication with generalized Fermat primes Integer multiplication with generalized Fermat primes CARAMEL Team, LORIA, University of Lorraine Supervised by: Emmanuel Thomé and Jérémie Detrey Journées nationales du Calcul Formel 2015 (Cluny) November

More information

Numerical Methods in Matrix Computations

Numerical Methods in Matrix Computations Ake Bjorck Numerical Methods in Matrix Computations Springer Contents 1 Direct Methods for Linear Systems 1 1.1 Elements of Matrix Theory 1 1.1.1 Matrix Algebra 2 1.1.2 Vector Spaces 6 1.1.3 Submatrices

More information

The F 4 Algorithm. Dylan Peifer. 9 May Cornell University

The F 4 Algorithm. Dylan Peifer. 9 May Cornell University The F 4 Algorithm Dylan Peifer Cornell University 9 May 2017 Gröbner Bases History Gröbner bases were introduced in 1965 in the PhD thesis of Bruno Buchberger under Wolfgang Gröbner. Buchberger s algorithm

More information

Computing Minimal Polynomial of Matrices over Algebraic Extension Fields

Computing Minimal Polynomial of Matrices over Algebraic Extension Fields Bull. Math. Soc. Sci. Math. Roumanie Tome 56(104) No. 2, 2013, 217 228 Computing Minimal Polynomial of Matrices over Algebraic Extension Fields by Amir Hashemi and Benyamin M.-Alizadeh Abstract In this

More information

Algebraic Codes for Error Control

Algebraic Codes for Error Control little -at- mathcs -dot- holycross -dot- edu Department of Mathematics and Computer Science College of the Holy Cross SACNAS National Conference An Abstract Look at Algebra October 16, 2009 Outline Coding

More information

COMPUTING MINIMAL POLYNOMIALS OF MATRICES

COMPUTING MINIMAL POLYNOMIALS OF MATRICES London Mathematical Society ISSN 1461 1570 COMPUTING MINIMAL POLYNOMIALS OF MATRICES MAX NEUNHÖFFER and CHERYL E. PRAEGER Abstract We present and analyse a Monte-Carlo algorithm to compute the minimal

More information

Modern Computer Algebra

Modern Computer Algebra Modern Computer Algebra JOACHIM VON ZUR GATHEN and JURGEN GERHARD Universitat Paderborn CAMBRIDGE UNIVERSITY PRESS Contents Introduction 1 1 Cyclohexane, cryptography, codes, and computer algebra 9 1.1

More information

3x + 1 (mod 5) x + 2 (mod 5)

3x + 1 (mod 5) x + 2 (mod 5) Today. Secret Sharing. Polynomials Polynomials. Secret Sharing. Share secret among n people. Secrecy: Any k 1 knows nothing. Roubustness: Any k knows secret. Efficient: minimize storage. A polynomial P(x)

More information

Lecture Notes to Accompany. Scientific Computing An Introductory Survey. by Michael T. Heath. Chapter 2. Systems of Linear Equations

Lecture Notes to Accompany. Scientific Computing An Introductory Survey. by Michael T. Heath. Chapter 2. Systems of Linear Equations Lecture Notes to Accompany Scientific Computing An Introductory Survey Second Edition by Michael T. Heath Chapter 2 Systems of Linear Equations Copyright c 2001. Reproduction permitted only for noncommercial,

More information

Fast reversion of power series

Fast reversion of power series Fast reversion of power series Fredrik Johansson November 2011 Overview Fast power series arithmetic Fast composition and reversion (Brent and Kung, 1978) A new algorithm for reversion Implementation results

More information

Contents. Preface... xi. Introduction...

Contents. Preface... xi. Introduction... Contents Preface... xi Introduction... xv Chapter 1. Computer Architectures... 1 1.1. Different types of parallelism... 1 1.1.1. Overlap, concurrency and parallelism... 1 1.1.2. Temporal and spatial parallelism

More information

Popov Form Computation for Matrices of Ore Polynomials

Popov Form Computation for Matrices of Ore Polynomials Popov Form Computation for Matrices of Ore Polynomials Mohamed Khochtali University of Waterloo Canada mkhochta@uwaterlooca Johan Rosenkilde, né Nielsen Technical University of Denmark Denmark jsrn@jsrndk

More information

Review Questions REVIEW QUESTIONS 71

Review Questions REVIEW QUESTIONS 71 REVIEW QUESTIONS 71 MATLAB, is [42]. For a comprehensive treatment of error analysis and perturbation theory for linear systems and many other problems in linear algebra, see [126, 241]. An overview of

More information

CALU: A Communication Optimal LU Factorization Algorithm

CALU: A Communication Optimal LU Factorization Algorithm CALU: A Communication Optimal LU Factorization Algorithm James Demmel Laura Grigori Hua Xiang Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-010-9

More information

Course Notes: Week 1

Course Notes: Week 1 Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues

More information

Implementation of the DKSS Algorithm for Multiplication of Large Numbers

Implementation of the DKSS Algorithm for Multiplication of Large Numbers Implementation of the DKSS Algorithm for Multiplication of Large Numbers Christoph Lüders Universität Bonn The International Symposium on Symbolic and Algebraic Computation, July 6 9, 2015, Bath, United

More information

Minisymposia 9 and 34: Avoiding Communication in Linear Algebra. Jim Demmel UC Berkeley bebop.cs.berkeley.edu

Minisymposia 9 and 34: Avoiding Communication in Linear Algebra. Jim Demmel UC Berkeley bebop.cs.berkeley.edu Minisymposia 9 and 34: Avoiding Communication in Linear Algebra Jim Demmel UC Berkeley bebop.cs.berkeley.edu Motivation (1) Increasing parallelism to exploit From Top500 to multicores in your laptop Exponentially

More information

Algebraic Aspects of Symmetric-key Cryptography

Algebraic Aspects of Symmetric-key Cryptography Algebraic Aspects of Symmetric-key Cryptography Carlos Cid (carlos.cid@rhul.ac.uk) Information Security Group Royal Holloway, University of London 04.May.2007 ECRYPT Summer School 1 Algebraic Techniques

More information

Optimized LU-decomposition with Full Pivot for Small Batched Matrices S3069

Optimized LU-decomposition with Full Pivot for Small Batched Matrices S3069 Optimized LU-decomposition with Full Pivot for Small Batched Matrices S369 Ian Wainwright High Performance Consulting Sweden ian.wainwright@hpcsweden.se Based on work for GTC 212: 1x speed-up vs multi-threaded

More information

Coding Theory: Linear-Error Correcting Codes Anna Dovzhik Math 420: Advanced Linear Algebra Spring 2014

Coding Theory: Linear-Error Correcting Codes Anna Dovzhik Math 420: Advanced Linear Algebra Spring 2014 Anna Dovzhik 1 Coding Theory: Linear-Error Correcting Codes Anna Dovzhik Math 420: Advanced Linear Algebra Spring 2014 Sharing data across channels, such as satellite, television, or compact disc, often

More information

Algorithms and Methods for Fast Model Predictive Control

Algorithms and Methods for Fast Model Predictive Control Algorithms and Methods for Fast Model Predictive Control Technical University of Denmark Department of Applied Mathematics and Computer Science 13 April 2016 Background: Model Predictive Control Model

More information

On Syndrome Decoding of Chinese Remainder Codes

On Syndrome Decoding of Chinese Remainder Codes On Syndrome Decoding of Chinese Remainder Codes Wenhui Li Institute of, June 16, 2012 Thirteenth International Workshop on Algebraic and Combinatorial Coding Theory (ACCT 2012) Pomorie, Bulgaria Wenhui

More information