Balanced Dense Polynomial Multiplication on Multicores

1 Balanced Dense Polynomial Multiplication on Multicores Yuzhen Xie SuperTech Group, CSAIL MIT joint work with Marc Moreno Maza ORCCA, UWO ACA09, Montreal, June 26, 2009

3 Introduction
Motivation: multicore-enabling parallel polynomial arithmetic in computer algebra; fast dense polynomial multiplication via FFT; multivariate polynomials over finite fields.
Framework:
- Assume 1-D FFT (in particular 1-D TFT) as a black box.
- Rely on the modpn C library (shipped with Maple 13) for 1-D FFT and 1-D TFT computations and for modular integer arithmetic (Montgomery trick).
- Implement in Cilk++ targeting multicores: provably efficient work-stealing scheduling; easy-to-use, low-overhead parallel constructs (cilk_for, cilk_spawn, cilk_sync); Cilkscreen for data-race detection and parallelism analysis.
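
To make these constructs concrete, here is a minimal sketch (not from the talk) of how they are typically used; it assumes a Cilk-enabled C++ compiler (Cilk++/OpenCilk) and uses a stubbed 1-D FFT standing in for the modpn black box.

    #include <cilk/cilk.h>

    // Stub for the 1-D FFT black box (provided by modpn in the real code).
    void fft_1d(long *data, long n, long p) { (void)data; (void)n; (void)p; }

    // cilk_for: the s1 row transforms are independent, so the scheduler
    // may run them on any available cores.
    void fft_rows(long *a, long s1, long s2, long p) {
        cilk_for (long i = 0; i < s1; ++i)
            fft_1d(a + i * s2, s2, p);
    }

    // cilk_spawn / cilk_sync: fork-join parallelism over two halves.
    void fft_two_halves(long *a, long n, long p) {
        cilk_spawn fft_1d(a, n / 2, p);   // runs in parallel with the next call
        fft_1d(a + n / 2, n / 2, p);
        cilk_sync;                        // wait for the spawned call to finish
    }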

7 FFT-based Multivariate Multiplication (Algorithm Review)
Let k be a field and f, g ∈ k[x_1 < ... < x_n] be polynomials. Define d_i = deg(f, x_i) and d'_i = deg(g, x_i), for all i. Assume there exists a primitive s_i-th root of unity ω_i ∈ k for all i, where s_i is a power of 2 satisfying s_i ≥ d_i + d'_i + 1. Then fg can be computed as follows.
Step 1. Evaluate f and g at each point P (i.e. compute f(P), g(P)) of the n-dimensional grid ((ω_1^e1, ..., ω_n^en), 0 ≤ e_1 < s_1, ..., 0 ≤ e_n < s_n) via n-D FFT.
Step 2. Evaluate fg at each point P of the grid, simply by computing f(P) g(P).
Step 3. Interpolate fg (from its values on the grid) via n-D FFT.
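
A rough sketch (not the authors' code) of these three steps in the bivariate case n = 2: the 2-D transforms are built from the 1-D black boxes assumed above (declared but not implemented here; in the talk's setting they come from modpn), and modular-arithmetic details such as Montgomery multiplication and the scaling of the inverse transform are glossed over.

    #include <cilk/cilk.h>
    #include <vector>

    // 1-D black boxes: forward and inverse FFT/TFT of length n over k = Z/pZ.
    // (Declarations only; the inverse is assumed to include its scaling.)
    void fft_1d(long *data, long n, long p);
    void inv_fft_1d(long *data, long n, long p);

    // 2-D transform of an s1 x s2 row-major array: 1-D transforms on all rows,
    // then on all columns. Each cilk_for exposes s1 (resp. s2) independent
    // 1-D transforms, which is exactly the parallelism counted in this talk.
    void fft_2d(long *a, long s1, long s2, long p, bool inverse) {
        void (*t)(long *, long, long) = inverse ? inv_fft_1d : fft_1d;
        cilk_for (long i = 0; i < s1; ++i)
            t(a + i * s2, s2, p);
        cilk_for (long j = 0; j < s2; ++j) {         // columns, via a temp buffer
            std::vector<long> col(s1);
            for (long i = 0; i < s1; ++i) col[i] = a[i * s2 + j];
            t(col.data(), s1, p);
            for (long i = 0; i < s1; ++i) a[i * s2 + j] = col[i];
        }
    }

    // Steps 1-3 for n = 2: f and g are zero-padded s1 x s2 coefficient arrays;
    // the product fg is returned in place of f.
    void multiply_2d(long *f, long *g, long s1, long s2, long p) {
        fft_2d(f, s1, s2, p, false);                 // Step 1: evaluate f on the grid
        fft_2d(g, s1, s2, p, false);                 // Step 1: evaluate g on the grid
        cilk_for (long t = 0; t < s1 * s2; ++t)      // Step 2: pointwise products
            f[t] = (f[t] * g[t]) % p;                // (overflow handling omitted)
        fft_2d(f, s1, s2, p, true);                  // Step 3: interpolate fg
    }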

8 Performance of Bivariate Interpolation in Step 3 (d_1 = d_2)
[Figure: speedup of bivariate interpolation vs. number of cores.]
Thanks to Dr. Frigo for his cache-efficient code for matrix transposition!

9 Performance of Bivariate Multiplication (d_1 = d_2 = d'_1 = d'_2)
[Figure: speedup of bivariate multiplication vs. number of cores.]

10 Challenges: Irregular Input Data
[Figure: multiplication speedup vs. number of cores, comparing linear speedup with bivariate (32765, 63), 8-variate (all degrees 4), 4-variate (1023, 1, 1, 1023) and univariate inputs.]
These unbalanced data patterns are common in symbolic computation.

11 Performance Analysis by VTune
[Table: for each benchmark, the number of variables, the degrees of the two input polynomials, the size of the product, INST_RETIRED.ANY (x 10^9), clocks per instruction retired, L2 cache miss rate (x 10^-3), modified data sharing ratio (x 10^-3), and the time on 8 cores (s).]

14 Complexity Analysis (1/2)
Let s = s_1 ... s_n. The number of operations in k for computing fg via n-D FFT is
    (9/2) Σ_{i=1..n} (Π_{j≠i} s_j) s_i lg(s_i) + (n+1) s  =  (9/2) s lg(s) + (n+1) s.
Under our 1-D FFT black box assumption, the span of Step 1 is (9/2)(s_1 lg(s_1) + ... + s_n lg(s_n)), and the parallelism of Step 1 is lower bounded by s / max(s_1, ..., s_n).   (1)
Let L be the size of a cache line. For some constant c > 0, the number of cache misses of Step 1 is upper bounded by
    c s n / L + c s (1/s_1 + ... + 1/s_n).   (2)
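
To see what the parallelism bound (1) means in practice (an illustrative calculation, not from the slides): for a univariate product of size s = 2^26 done by a single 1-D FFT, the bound s / max(s_1) is 1, i.e. the black-box assumption exposes no parallelism at all; splitting the same size into a balanced bivariate grid s_1 = s_2 = 2^13 raises the bound to s / max(s_1, s_2) = 2^13.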

17 Complexity Analysis (2/2)
Let Q(s_1, ..., s_n) denote the total number of cache misses for the whole algorithm. For some constant c we obtain
    Q(s_1, ..., s_n) ≤ c s (n+1)/L + c s (1/s_1 + ... + 1/s_n).   (3)
Since n / s^(1/n) ≤ 1/s_1 + ... + 1/s_n, the bound in (3) is minimized for equal s_i, and we deduce, when s_i = s^(1/n) holds for all i,
    Q(s_1, ..., s_n) ≤ n c s (2/L + 1/s^(1/n)).   (4)
Remark 1: For n ≥ 2, Expr. (4) is minimized at n = 2 and s_1 = s_2 = √s. Moreover, when n = 2 and for a fixed s = s_1 s_2, Expr. (1) is maximized at s_1 = s_2 = √s.
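
As a numerical illustration of Remark 1 (not from the slides): with s = 2^26 and a cache line of L = 64 coefficients, Expr. (4) evaluates to about 2cs(2/64 + 2^-13) ≈ 0.063 cs for n = 2, but to about 4cs(2/64 + 2^-6.5) ≈ 0.17 cs for n = 4, so among grids with equal s_i the bivariate one incurs the fewest cache misses.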

18 Our Solutions (1) Contraction to bivariate from multivariate (2) Extension from univariate to bivariate (3) Balanced multiplication by extension and contraction

21 Solution 1: Contraction to Bivariate from Multivar.
Example. Let f ∈ k[x, y, z] where k = Z/41Z, with d_x = d_y = 1, d_z = 3, in recursive dense representation:
[Table: the coefficient array of f, with one block per z^0, z^1, z^2, z^3, each block holding the coefficients for y^0, y^1 (and, innermost, x^0, x^1).]
The coefficients (not the monomials) are stored in a contiguous array; the monomial x^e1 y^e2 z^e3 is indexed by e_1 + (d_x + 1) e_2 + (d_x + 1)(d_y + 1) e_3.
Contracting f(x, y, z) to p(u, v) by x^e1 y^e2 ↦ u^(e_1 + (d_x + 1) e_2), z^e3 ↦ v^e3:
[Table: the same coefficient array, now read as p(u, v), with one block per v^0, v^1, v^2, v^3.]
Remark 2: The coefficient array is essentially unchanged by contraction; this is a property of the recursive dense representation.
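
A minimal sketch (not from the talk) of the index bookkeeping behind this contraction: the same flat array is addressed either through the trivariate index or through the contracted bivariate index, and the two always agree, so no data is moved.

    // Flat index of x^e1 y^e2 z^e3 in the recursive dense representation of f,
    // with partial degrees dx, dy, dz (dx = dy = 1, dz = 3 in the example).
    long index3(long e1, long e2, long e3, long dx, long dy) {
        return e1 + (dx + 1) * e2 + (dx + 1) * (dy + 1) * e3;
    }

    // Flat index of u^a v^e3 after contracting {x, y} to u, where
    // a = e1 + (dx+1)*e2 and deg(p, u) = du = (dx+1)*(dy+1) - 1.
    long index2(long a, long e3, long du) {
        return a + (du + 1) * e3;
    }

    // Since du + 1 = (dx+1)*(dy+1), index2(e1 + (dx+1)*e2, e3, du) equals
    // index3(e1, e2, e3, dx, dy) for every monomial: the coefficient array
    // of p(u, v) is literally the coefficient array of f(x, y, z).

When two polynomials are multiplied through contraction, the strides have to accommodate the partial degrees of the product rather than of a single factor; this is why the benchmark below maps inputs of shape 2047 x 3 x 3 x 2047 to a 6141 x 6141 bivariate problem.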

22 Performance of Contraction (timing)
[Figure: time (seconds) vs. number of cores for 4-variate multiplication with degrees (1023, 1, 1, 1023), comparing the 4-D TFT method (size 2047 x 3 x 3 x 2047) with the balanced 2-D TFT method via contraction (size 6141 x 6141).]

23 Performance of Contraction (speedup)
[Figure: speedup vs. number of cores for 4-variate multiplication with degrees (1023, 1, 1, 1023): balanced 2-D TFT method (size 6141 x 6141), 4-D TFT method (size 2047 x 3 x 3 x 2047), and net speedup.]

24 Performance of Contraction for a Large Range of Problems (timing on 1 processor)
[Figure: time vs. d_1 = d'_1 and d_4 = d'_4 (with d_2 = d'_2 = d_3 = d'_3 = 1), comparing the 4-D TFT method on 1 core, Kronecker substitution of 4-D to 1-D TFT on 1 core, and contraction of 4-D to 2-D TFT on 1 core.]

25 Performance of Contraction for a Large Range of Problems (speedup)
[Figure: speedup vs. d_1 = d'_1 and d_4 = d'_4 (with d_2 = d'_2 = d_3 = d'_3 = 1), comparing contraction of 4-D to 2-D TFT on 16 cores and on 8 cores (speedups and net gains) with the 4-D TFT method on 16 cores.]

28 Solution 2: Extension from Univariate to Bivariate
Example: Consider f, g ∈ k[x] univariate, with deg(f) = 7 and deg(g) = 8; fg has dense size 16. We compute an integer b such that fg can be obtained from f_b g_b using nearly square 2-D FFTs, where f_b := Φ_b(f), g_b := Φ_b(g) and Φ_b : x^e ↦ u^(e rem b) v^(e quo b).
Here b = 3 works, since deg(f_b g_b, u) = deg(f_b g_b, v) = 4; moreover, the dense size of f_b g_b is 25.
Proposition: For any non-constant f, g ∈ k[x], one can always compute b such that |deg(f_b g_b, u) - deg(f_b g_b, v)| ≤ 2 and the dense size of f_b g_b is at most twice that of fg.
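
One simple way to choose b, consistent with this example though not necessarily the authors' exact rule, is to aim for deg(f_b g_b, u) ≈ deg(f_b g_b, v); a small sketch:

    #include <cmath>

    // Partial degrees of f_b = Phi_b(f) for univariate f of degree df:
    //   deg(f_b, u) = min(df, b - 1),   deg(f_b, v) = df / b (integer division).
    // Taking b near sqrt((df + dg) / 2) makes the u- and v-degrees of the
    // product f_b * g_b roughly equal.
    long choose_b(long df, long dg) {
        long b = (long)std::lround(std::sqrt((df + dg) / 2.0));
        return b < 1 ? 1 : b;
    }
    // For df = 7 and dg = 8 this returns b = 3; the product then has
    // deg_u = deg_v = 4 and dense size 25 <= 2 * 16, as stated above.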

30 Extension of f(x) to f_b(u, v) in Recursive Dense Representation
[Table: the coefficient array of f, indexed by x^0, ..., x^7.]
[Table: the same coefficients re-read as f_b, with one block per v^0, v^1, v^2, each block holding the coefficients of u^0, u^1, u^2; the last block is padded with a zero.]
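
The extension itself is only a re-indexing; a small sketch (hypothetical helper, not the modpn interface) building the array of f_b from the dense coefficient vector of f:

    #include <vector>

    // Recursive dense array of f_b = Phi_b(f): block e2 (the power of v) holds
    // the coefficients of u^0 .. u^(b-1), i.e. of x^(b*e2) .. x^(b*e2 + b - 1).
    std::vector<long> extend(const std::vector<long> &f, long b) {
        long df = (long)f.size() - 1;          // degree of f
        long rows = df / b + 1;                // deg(f_b, v) + 1
        std::vector<long> fb(rows * b, 0);     // zero padding for the last block
        for (long e = 0; e <= df; ++e)
            fb[(e / b) * b + (e % b)] = f[e];  // x^e -> u^(e rem b) v^(e quo b)
        return fb;
    }
    // Note that (e / b) * b + (e % b) == e: as with contraction, the array is
    // simply copied (and zero-padded); no coefficient changes its position.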

37 Conversion to Univariate from the Bivariate Product
The bivariate product: deg(f_b g_b, u) = 4, deg(f_b g_b, v) = 4.
[Table: the coefficient array of f_b g_b, one block of u^0, ..., u^4 per v^0, ..., v^4, with entries labelled c_0, c_1, ..., c_14 block by block.]
Convert to univariate: deg(fg, x) = 15. Each entry c_j standing for u^e1 v^e2 is added into the coefficient of x^(e1 + 3 e2); the coefficients of x^0, ..., x^8 are c_0, c_1, c_2, c_3 + c_5, c_4 + c_6, c_7, c_8 + c_10, c_9 + c_11, c_12, and similarly for x^9, ..., x^15.
Remark 4: Converting back to fg from f_b g_b requires only one traversal of the coefficient array and at most deg(fg, x) additions.
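
A sketch of this conversion (not the authors' code): every entry of the bivariate product array is added into position e1 + b*e2 of the univariate result, which reproduces the sums c_3 + c_5, c_4 + c_6, ... above in a single pass.

    #include <vector>

    // C holds the (du+1) x (dv+1) coefficients of f_b * g_b over Z/pZ, stored
    // block by block by increasing power of v; b is the extension parameter.
    std::vector<long> convert_back(const std::vector<long> &C,
                                   long du, long dv, long b, long p) {
        std::vector<long> fg(du + b * dv + 1, 0);
        for (long e2 = 0; e2 <= dv; ++e2)          // power of v
            for (long e1 = 0; e1 <= du; ++e1)      // power of u
                fg[e1 + b * e2] = (fg[e1 + b * e2] + C[e2 * (du + 1) + e1]) % p;
        return fg;
    }
    // In the example du = dv = 4 and b = 3, so fg is allocated up to x^16, but
    // the u^4 v^4 entry is zero and deg(fg, x) = 15 as stated on the slide.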

38 Remark 3
Our extension technique provides an alternative to the Schönhage-Strassen algorithm for handling problems whose sizes are too large for the available primitive roots of unity, a limitation of FFTs over finite fields.

39 Performance of Extension (timing)
[Figure: time (seconds) vs. number of cores for a univariate multiplication, comparing the 1-D TFT method with the balanced 2-D TFT method via extension (size 10085 x 10085).]

40 Performance of Extension (speedup)
[Figure: speedup vs. number of cores for the same univariate multiplication: 1-D TFT method, balanced 2-D TFT method (size 10085 x 10085), and net speedup.]

41 Performance of Extension for a Large Range of Problems (timing)
[Figure: time vs. the input degrees, comparing extension of 1-D to 2-D TFT on 1 core, the 1-D TFT method on 1 core, and extension of 1-D to 2-D TFT on 2 cores and 16 cores (speedups and net gains).]

44 Solution 3: Balanced Multiplication
Definition. A pair of bivariate polynomials p, q ∈ k[u, v] is balanced if deg(p, u) + deg(q, u) = deg(p, v) + deg(q, v).
Algorithm. Let f, g ∈ k[x_1 < ... < x_n]. W.l.o.g. one can assume d_1 >> d_i and d'_1 >> d'_i for 2 ≤ i ≤ n (up to variable re-ordering and contraction). Then we obtain fg by
Step 1. Extending x_1 to {u, v}.
Step 2. Contracting {v, x_2, ..., x_n} to v.
Remark 5: The above extension Φ_b can be determined such that f_b, g_b is a (nearly) balanced pair and f_b g_b has dense size at most twice that of fg.
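
A sketch of the degree bookkeeping behind this algorithm (illustrative only; the sizes below follow from the definitions of Φ_b and contraction, but the authors' implementation may choose b differently): extending x_1 with Φ_b gives a product of u-size 2b - 1, and contracting {v, x_2, ..., x_n} with product-sized strides gives a v-size of (d_1 quo b + d'_1 quo b + 1) · Π_{i≥2} (d_i + d'_i + 1); b is chosen to make the two close.

    #include <cstdio>
    #include <vector>

    // u- and v-sizes of the bivariate product obtained by extending x1 with
    // Phi_b and then contracting {v, x2, ..., xn}; d[i] and dp[i] are the
    // partial degrees of f and g in x_{i+1}.
    void product_sizes(const std::vector<long> &d, const std::vector<long> &dp,
                       long b, long &su, long &sv) {
        su = 2 * b - 1;
        sv = d[0] / b + dp[0] / b + 1;
        for (size_t i = 1; i < d.size(); ++i) sv *= d[i] + dp[i] + 1;
    }

    int main() {
        // A shape like the last benchmark, degrees (d1, 2, 2, 2) in both inputs;
        // d1 = 4095 is chosen here purely for illustration.
        std::vector<long> d = {4095, 2, 2, 2}, dp = d;
        for (long b = 64; b <= 1024; b *= 2) {   // scan a few candidate values of b
            long su, sv;
            product_sizes(d, dp, b, su, sv);
            std::printf("b = %4ld   u-size = %5ld   v-size = %5ld\n", b, su, sv);
        }
        return 0;   // the nearly balanced choices sit where u-size ~ v-size
    }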

45 Performance of Balanced Multiplication for a Large Range of Problems (timing)
[Figure: time vs. d_1 and d'_1 (with d_2 = d_3 = d_4 = 2 and d'_2 = d'_3 = d'_4 = 2), comparing extension + contraction of 4-D to 2-D TFT on 1 core, Kronecker substitution of 4-D to 1-D TFT on 1 core, extension + contraction on 2 cores (1.96x speedup, 1.75x net gain), and extension + contraction on 16 cores.]

46 Conclusion
Summary and future work: We have obtained efficient techniques and implementations for dense polynomial multiplication on multicores:
- use balanced bivariate multiplication as a kernel,
- contract multivariate to bivariate,
- extend univariate to bivariate,
- combine contraction and extension.
Our work in progress includes normal forms, GCD/resultant computations, and a polynomial solver via triangular decompositions.
Acknowledgements: This work was supported by NSERC and the MITACS NCE of Canada, and by NSF grants. We are very grateful for the help of Professor Charles E. Leiserson at MIT, Dr. Matteo Frigo and all other members of Cilk Arts. Thanks to all who have been supporting us!
