Table-Based Polynomials for Fast Hardware Function Evaluation

ASAP 05
Table-Based Polynomials for Fast Hardware Function Evaluation
Jérémie Detrey, Florent de Dinechin
Projet Arénaire, LIP (UMR CNRS / ENS Lyon / UCB Lyon / INRIA)
Centre National de la Recherche Scientifique, École Normale Supérieure de Lyon

Overview
Context
The HOTBM method
Results
Conclusion

Context

Context: function evaluation
Fixed-point elementary functions: $\sin(x)$, $\cos(x)$, $\log(x)$, $e^x$, ...
Signal or image processing
Neural networks
Dedicated computations
Logarithmic number system: $\log_2(1 + 2^x)$ and $\log_2(1 - 2^x)$, ...
Diagram: an operator, still to be designed, takes the $w_I$-bit input $X$ and returns $f(X)$ on $w_O$ bits.
Usually $w_I = w_O$ and $8 \le w_I, w_O \le 32$.

Order 0: direct look-up table
Tabulate all the possible values: $f(0), f(1), \ldots, f(2^{w_I} - 2), f(2^{w_I} - 1)$.
Diagram: the $w_I$-bit input $X$ addresses the table, which returns $f(X)$ on $w_O$ bits.
Very short critical path: only 1 table look-up.
Huge look-up table: $w_O \cdot 2^{w_I}$ bits.
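
The storage cost quoted on this slide can be checked with a minimal Python sketch. It assumes an unsigned fixed-point input in [0, 1) and round-to-nearest table entries; the names `build_direct_table` and `table_size_bits` are illustrative and do not come from the slides.

```python
import math

def build_direct_table(f, w_in, w_out):
    """Tabulate round(f(x) * 2**w_out) for every w_in-bit input x = i / 2**w_in.

    Hypothetical helper: the input format and the rounding mode are assumptions.
    """
    return [round(f(i / 2**w_in) * 2**w_out) for i in range(2**w_in)]

def table_size_bits(w_in, w_out):
    """Storage cost of the direct table: w_out bits for each of the 2**w_in entries."""
    return w_out * 2**w_in

# Example: log2(1 + x) with 8-bit input and output.
table = build_direct_table(lambda x: math.log2(1 + x), 8, 8)
```

For $w_I = w_O = 8$ the table needs 2048 bits, while for $w_I = w_O = 32$ it would need $32 \cdot 2^{32}$ bits, which is why the higher-order methods on the following slides trade table size for arithmetic.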

Order 1: lookup-multiply method
Piecewise linear approximation.
Diagram: $X$ ($w_I$ bits) is split into $A$ and $B$. $A$ addresses two tables, $K_0(A)$ and $K_1(A)$, each $w_O + g$ bits wide. $K_1(A)$ is multiplied by $B$, the product is added to $K_0(A)$, and the sum is rounded to $w_O$ bits to give $f(X)$.
Smaller tables.
Longer critical path: 1 table look-up, 1 mult and 1 add.
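
A small Python sketch of the lookup-multiply scheme follows. It is only an illustration: the coefficients are taken from the chord of $f$ on each sub-interval rather than from a minimax fit, everything is kept in floating point, and the names `build_linear_tables` and `eval_linear` are hypothetical.

```python
import math

def build_linear_tables(f, alpha):
    """One (K0, K1) pair per sub-interval, addressed by the alpha MSBs of X.

    Assumption: the coefficients come from the chord of f, not a minimax fit.
    """
    n_seg = 2**alpha
    k0, k1 = [], []
    for a in range(n_seg):
        left, right = a / n_seg, (a + 1) / n_seg
        k0.append(f(left))
        k1.append((f(right) - f(left)) / (right - left))
    return k0, k1

def eval_linear(x_int, k0, k1, alpha, beta):
    """Evaluate K0(A) + K1(A) * offset: 1 table look-up, 1 mult and 1 add."""
    a = x_int >> beta                    # alpha most significant bits
    b = x_int & ((1 << beta) - 1)        # beta least significant bits
    return k0[a] + k1[a] * (b / 2**(alpha + beta))

f = lambda x: math.log2(1 + x)
k0, k1 = build_linear_tables(f, 4)
max_err = max(abs(eval_linear(x, k0, k1, 4, 4) - f(x / 256)) for x in range(256))
```

With 4 address bits the two tables hold 16 entries each, against 256 entries for the direct table on the same 8-bit input.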

Order 1: bipartite table method [2]
Tabulate the product in a table of offsets (TO).
Diagram: $X$ ($w_I$ bits) is split into $A$ and $B$. $A$ addresses the table TIV($A$); a sub-word $A_0$ of $A$, together with $B$, addresses the table TO($A_0$, $B$). Both outputs are $w_O + g$ bits wide; they are added and rounded to $w_O$ bits to give $f(X)$.
Shorter critical path: 1 table look-up and 1 add.
Slightly larger tables.
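
The idea of tabulating the product can be sketched in Python as below. This is a simplified illustration under stated assumptions: $A_0$ is taken to be the most significant bits of $A$, the slope is sampled at the middle of the interval selected by $A_0$, values are kept in floating point, and `build_bipartite` and `eval_bipartite` are hypothetical names.

```python
import math

def build_bipartite(f, df, a0_bits, a1_bits, b_bits):
    """Build TIV(A) and TO(A0, B) for X = (A0, A1, B); df is the derivative of f.

    Assumption: the slope used in TO is df at the middle of the A0 interval.
    """
    alpha = a0_bits + a1_bits
    w = alpha + b_bits
    tiv = [f(a / 2**alpha) for a in range(2**alpha)]
    to = [[df((a0 + 0.5) / 2**a0_bits) * (b / 2**w) for b in range(2**b_bits)]
          for a0 in range(2**a0_bits)]
    return tiv, to

def eval_bipartite(x_int, tiv, to, a1_bits, b_bits):
    """Both tables are read in parallel, then added: 1 look-up and 1 add."""
    a = x_int >> b_bits
    a0 = a >> a1_bits
    b = x_int & ((1 << b_bits) - 1)
    return tiv[a] + to[a0][b]

f = lambda x: math.log2(1 + x)
df = lambda x: 1 / ((1 + x) * math.log(2))
tiv, to = build_bipartite(f, df, 4, 4, 4)
max_err = max(abs(eval_bipartite(x, tiv, to, 4, 4) - f(x / 4096))
              for x in range(4096))
```

In this 12-bit example the two tables hold 256 entries each, against 4096 entries for a direct table, and no multiplier is needed at evaluation time.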

Order 1: multipartite table method [14,11,4]
Split the linear offset (TO) as a sum of several offsets (the $TO_i$s).
Diagram: $X$ ($w_I$ bits) is split into $A$ and $B$, and $B$ into sub-words $B_0$, $B_1$, $B_2$. $A$ addresses the TIV; each pair $(A_i, B_i)$ addresses a table $TO_i$ placed between two XOR stages. All table outputs are $w_O + g$ bits wide; they are summed and rounded to $w_O$ bits to give $f(X)$.
Critical path: 2 XOR stages, 1 table look-up and $\log_2(n)$ adds.
Much smaller tables, but an adder tree.

Order 2: SMSO method [6]
Split the order-1 term as the sum of a small product and an offset.
Diagram: $X$ ($w_I$ bits) is split into $A$ and $B$. $A$ addresses the TIV ($w_O + g_0$ bits). $A_0$ addresses a table TS whose output ($w_O + g_1$ bits) is multiplied by $B_0$. The pairs $(A_1, B_1)$ and $(A_2, B_2)$ address the offset tables $TO_1$ and $TO_2$ through XOR stages. The terms ($w_O + g_0$ bits each) are summed and rounded to $w_O$ bits to give $f(X)$.
Critical path: 1 table look-up, 1 rectangular mult and 2 adds.
A multiplier, but smaller tables.

Higher order methods
Horner evaluation
Interleaved memory interpolators: Lewis
Partial product arrays: Hassler and Takagi
Specialized squaring unit: Piñero, Bruguera and Muller
This work

Objectives
Higher order approximation for larger precisions and smaller tables.
Accurate error analysis to help the optimization of the hardware cost.
Split large operators into smaller ones for architectural exploration.

The HOTBM method (Higher-Order Table-Based Method)

Polynomial approximation
Plot: $f(x)$ over $[0, 1]$, with the x-axis divided at multiples of $1/8$.
Input word decomposition: $X = A + 2^{-\alpha} B = .a_1 a_2 \cdots a_\alpha b_1 b_2 \cdots b_\beta$, where the $w_I$-bit input is cut into $A$ ($\alpha$ bits) and $B$ ($\beta$ bits).
Piecewise order-$n$ minimax polynomial approximation: $f(X) \approx P(A)(B) = \sum_{k=0}^{n} K_k(A) \cdot (2^{-\alpha} B)^k$
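
The decomposition and the piecewise polynomial can be illustrated with the Python sketch below. It is not the minimax construction of the slides: Taylor coefficients of $e^x$ at the left edge of each sub-interval stand in for the minimax coefficients $K_k(A)$, $B$ is read as an unsigned fraction in [0, 1), and the function names are hypothetical.

```python
import math

def split_input(x_int, alpha, beta):
    """Return (A, B) as fractions in [0, 1) such that X = A + 2**-alpha * B."""
    a_bits = x_int >> beta
    b_bits = x_int & ((1 << beta) - 1)
    return a_bits / 2**alpha, b_bits / 2**beta

def taylor_coeffs_exp(a, order):
    """Stand-in for K_k(A) with f = exp: Taylor coefficients at A (an assumption,
    the slides use minimax coefficients instead)."""
    return [math.exp(a) / math.factorial(k) for k in range(order + 1)]

def eval_piecewise(x_int, alpha, beta, order):
    """Evaluate sum_k K_k(A) * (2**-alpha * B)**k."""
    a, b = split_input(x_int, alpha, beta)
    coeffs = taylor_coeffs_exp(a, order)
    return sum(k_k * (2**-alpha * b) ** k for k, k_k in enumerate(coeffs))

max_err = max(abs(eval_piecewise(x, 4, 6, 2) - math.exp(x / 1024))
              for x in range(1024))
```

For example, the 8-bit word `0b10110100` with $\alpha = \beta = 4$ gives $A = 0.6875$ and $B = 0.25$, so that $A + 2^{-4} B = 0.703125 = 180/256$.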

Polynomial approximation: architecture
Diagram: $X$ ($w_I$ bits) is split into $A$ and $B$. $A$ addresses $K_0(A)$, and $A$ and $B$ feed one unit per term: $K_1(A) \cdot 2^{-\alpha} B$, $K_2(A) \cdot (2^{-\alpha} B)^2$, ..., $K_n(A) \cdot (2^{-\alpha} B)^n$. Each term is $w_O + g$ bits wide; the terms are summed and rounded to $w_O$ bits to give $f(X)$.
Architectural choices remain to implement each term $T_k(A, B) = K_k(A) \cdot (2^{-\alpha} B)^k$.

Computing the terms: exploiting symmetry
Each term $T_k(A, B)$ is symmetric with respect to the middle of each sub-interval:
  when $k$ is even, $T_k(A, -B) = T_k(A, B)$
  when $k$ is odd, $T_k(A, -B) = -T_k(A, B)$
Diagram (for each case): plot of the term over the halves $B < 0$ and $B > 0$ of a sub-interval, and a unit in which $A$, the bit $b_1$ and the remaining bits of $B$ are used to obtain $T_k(A, B)$.
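
The saving brought by symmetry can be sketched in Python as follows. The encoding is an assumption made for the example: the $\beta$-bit word is read as the centred value $(b + 1/2) / 2^\beta - 1/2$, the bit $b_1$ selects the half, and the remaining bits are complemented with XOR gates on the negative half. The names `fold`, `build_half_table` and `lookup_term` are hypothetical, and a true negation replaces whatever the hardware does for odd $k$.

```python
def centered_value(b_bits, beta):
    """Assumed encoding: B measured from the middle of the sub-interval."""
    return (b_bits + 0.5) / 2**beta - 0.5

def fold(b_bits, beta):
    """Return (negative, index): the half selected by b_1 and a (beta-1)-bit index."""
    half = 1 << (beta - 1)
    b1 = b_bits >> (beta - 1)
    low = b_bits & (half - 1)
    negative = (b1 == 0)
    # XOR with all ones (bitwise complement) mirrors the negative half.
    index = low ^ (half - 1) if negative else low
    return negative, index

def build_half_table(k_coeff, k, beta):
    """Tabulate K_k * B**k for the positive half only: 2**(beta-1) entries."""
    half = 1 << (beta - 1)
    return [k_coeff * ((i + 0.5) / 2**beta) ** k for i in range(half)]

def lookup_term(table, b_bits, k, beta):
    """Read the half table and restore the sign when k is odd and B < 0."""
    negative, index = fold(b_bits, beta)
    value = table[index]
    return -value if (negative and k % 2 == 1) else value
```

Each term table is halved in this way, at the cost of a row of XOR gates on the address (and a sign correction on the output for odd $k$).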

Computing the terms: simple look-up table
Tabulate all the possible values.
Diagram: $A$, the bit $b_1$ and the remaining bits of $B$ feed a single table that returns $T_k(A, B)$.

Computing the terms: power-and-multiply
Compute $S_k = B^k$ with a powering unit.
Split $S_k$ into several sub-words $S_{k,1}, \ldots, S_{k,m_k}$: the $k(\beta - 1)$ bits of $S_k$ are cut into sub-words of widths $\sigma_{k,1}, \sigma_{k,2}, \ldots, \sigma_{k,m_k}$.
Compute the product $K_k(A) \cdot S_k$ as the sum of all the sub-products $K_k(A) \cdot S_{k,j}$:
  the most significant ones implemented as actual multipliers
  the least significant ones implemented as look-up tables
Exploit symmetry for each of those sub-products.
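
The splitting step can be checked on integers with the Python sketch below. It only shows that the sub-products, once shifted back to their weights, add up to the full product; which sub-products become multipliers and which become tables is not modelled. The names `split_subwords` and `product_by_subwords` are hypothetical.

```python
def split_subwords(s, widths):
    """Cut the integer s (sum(widths) bits) into sub-words, most significant first.

    Returns a list of (value, shift) pairs, so that s == sum(value << shift).
    """
    parts = []
    shift = sum(widths)
    for width in widths:
        shift -= width
        parts.append(((s >> shift) & ((1 << width) - 1), shift))
    return parts

def product_by_subwords(k_coeff, s, widths):
    """Compute k_coeff * s as the sum of the weighted sub-products k_coeff * S_j."""
    return sum((k_coeff * part) << shift
               for part, shift in split_subwords(s, widths))
```

For instance, `split_subwords(0b101101110, [3, 3, 3])` yields the sub-words 5, 5 and 6 with shifts 6, 3 and 0. Each sub-product involves a narrow operand, which is what makes small multipliers or small tables sufficient.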

Computing the terms: power-and-multiply
Diagram: the bit $b_1$ and the remaining bits of $B$ pass through an XOR stage into a powering unit (still to be specified) that produces $S_k = B^k$, split into $S_{k,1}, S_{k,2}, \ldots, S_{k,m_k}$. $A$ addresses the units that compute the sub-products $K_k(A) \cdot S_{k,1}$, $K_k(A) \cdot S_{k,2}$, ..., $K_k(A) \cdot S_{k,m_k}$.

Computing the terms: powering unit
Implemented as a look-up table.
Implemented as a sum of partial products.
Diagram: the bits of $B$ generate partial products that are summed to give $S_k$.
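
For the case $k = 2$, the sum-of-partial-products option can be sketched in Python as below. This is a generic squarer identity, not the specific array of the slides: since $b_i \cdot b_i = b_i$ and $b_i b_j = b_j b_i$, each symmetric pair is generated once with doubled weight. The name `square_by_partial_products` is hypothetical.

```python
def square_by_partial_products(b, width):
    """Compute b**2 from the bits of b as a sum of AND-gate partial products."""
    bits = [(b >> i) & 1 for i in range(width)]
    total = 0
    for i in range(width):
        # Diagonal term: b_i * b_i = b_i, with weight 2**(2*i).
        total += bits[i] << (2 * i)
        for j in range(i + 1, width):
            # Off-diagonal pair counted once, with doubled weight 2**(i+j+1).
            total += (bits[i] & bits[j]) << (i + j + 1)
    return total
```

The array holds about half as many partial products as a general multiplier of the same width, which is one reason a dedicated powering unit can be cheaper than reusing a multiplier.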

Degrading accuracy
Diagram: bit-level alignment of the terms summed to obtain $f(X)$. $T_0(A) = K_0(A)$ spans the full width. $T_1(A, B) = K_1(A) \cdot 2^{-\alpha} B$ is shifted right by $\alpha$ bits and is the sum of the sub-products $K_1(A) \cdot S_{1,1}$, $K_1(A) \cdot S_{1,2}$ and $K_1(A) \cdot S_{1,3}$. $T_2(A, B) = K_2(A) \cdot (2^{-\alpha} B)^2$ is shifted right by $2\alpha$ bits and is the sum of the sub-products $K_2(A) \cdot S_{2,1}$ and $K_2(A) \cdot S_{2,2}$. The following terms continue in the same way.
Some of the terms are more accurate than others.
We can save area by using fewer bits to compute the most accurate tables.

Degrading accuracy: global architecture
Each term $T_k$ is computed using only:
  $A_k$, the $\alpha_k$ most significant bits of $A$
  $B_k$, the $\beta_k$ most significant bits of $B$
Diagram: within the $w_I$-bit input, $A_k$ ($\alpha_k$ bits) is the leading part of $A$ and $B_k$ ($\beta_k$ bits) is the leading part of $B$.
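
The effect of feeding a term with only the leading bits of $B$ can be illustrated with the Python sketch below, for the order-1 term only. The coefficient value, the widths and the name `keep_msbs` are assumptions chosen for the example; $B$ is read as an unsigned fraction.

```python
def keep_msbs(value, width, kept):
    """Keep the `kept` most significant bits of a `width`-bit word, zeroing the rest."""
    drop = width - kept
    return (value >> drop) << drop

# Hypothetical parameters for an order-1 term T_1 = K_1 * 2**-alpha * B.
K1, ALPHA, BETA, BETA_1 = 1.3, 4, 8, 5

def term1(b_bits):
    return K1 * 2**-ALPHA * (b_bits / 2**BETA)

# Largest error made when B is replaced by its BETA_1 leading bits.
worst = max(term1(b) - term1(keep_msbs(b, BETA, BETA_1)) for b in range(2**BETA))
bound = K1 * 2**-ALPHA * 2**-BETA_1
```

In this setting the error stays below $|K_1| \cdot 2^{-\alpha} \cdot 2^{-\beta_1}$, so the higher the order of a term (and the smaller its weight), the more input bits can be dropped for the same contribution to the total error.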

Degrading accuracy: global architecture
Diagram: $X$ ($w_I$ bits) is split into $A$ and $B$. The truncated words feed one unit per term: $T_0(A_0)$, $T_1(A_1, B_1)$, $T_2(A_2, B_2)$, ..., $T_n(A_n, B_n)$. Each term is $w_O + g$ bits wide; the terms are summed and rounded to $w_O$ bits to give $f(X)$.

Degrading accuracy: power-and-multiply terms
Only the $\lambda_k$ most significant bits of $B_k^k$ are used for $S_k$.
Each sub-product $K_k(A_k) \cdot S_{k,j}$ is computed using only $A_{k,j}$, the $\alpha_{k,j}$ most significant bits of $A_k$.

Degrading accuracy: power-and-multiply terms
Diagram: the bit $b_1$ and the remaining bits of $B_k$ pass through an XOR stage into the powering unit, which produces $S_k$, split into $S_{k,1}, S_{k,2}, \ldots, S_{k,m_k}$. $A_k$ is truncated to $A_{k,1}, A_{k,2}, \ldots, A_{k,m_k}$, which address the units computing $K_k(A_{k,1}) \cdot S_{k,1}$, $K_k(A_{k,2}) \cdot S_{k,2}$, ..., $K_k(A_{k,m_k}) \cdot S_{k,m_k}$.

Degrading accuracy: ad-hoc powering units
Each ad-hoc powering unit is truncated to $\mu_k$ bits.
Diagram: the partial-product array that computes $S_k$ from $B_k$, before and after truncation.

Error analysis
Every error entailed by the operator is accurately bounded:
  minimax error
  method errors
  rounding errors
We can easily compute $g$, the number of guard bits required to ensure faithful rounding (last-bit accuracy).
A trial-and-error method is then applied to decrease $g$.

Results

Results: area estimations for $\log_2(1 + x)$
Plot: operator area (in slices, with the corresponding FPGA area ratio) against input/output precision $w_I = w_O$ (in bits), for SMSO and for orders 2, 3 and 4.
As expected, exponential growth.
Order 2 up to 24 bits, order 3 up to 28 bits, order 4 up to 32 bits.

Results: area estimations for $\sin x$
Plot: operator area (in slices, with the corresponding FPGA area ratio) against input/output precision $w_I = w_O$ (in bits), for SMSO and for orders 2, 3 and 4.

Results: delay estimations for $\log_2(1 + x)$
Plot: operator delay (in ns) against input/output precision $w_I = w_O$ (in bits), for SMSO and for orders 2, 3 and 4.
Latency increases for higher orders.

Results: delay estimations for $\sin x$
Plot: operator delay (in ns) against input/output precision $w_I = w_O$ (in bits), for SMSO and for orders 2, 3 and 4.

Conclusion

Contribution
A novel function approximation method:
  arbitrary order: smaller tables
  optimized powering units
  small multipliers: shorter critical path, and can benefit from recent FPGA technologies (Virtex-II)
Highly parameterizable design, adaptable to various metrics.
Accurate approximation and rounding error analysis.
Targeted to precisions up to 32 bits.

Future work
Improve the parameter space exploration heuristic, following user-specified criteria.
Adapt this method to ASIC (different metric, architectural choices, ...).
Take advantage of the accurate error analysis method to finely tune the tables.
Work in progress: library of parameterizable floating-point operators for elementary functions:
  logarithm
  exponential

Thank you for your attention
More information:
CVS repository:
Questions?

Second Order Function Approximation Using a Single Multiplication on FPGAs

Second Order Function Approximation Using a Single Multiplication on FPGAs FPL 04 Second Order Function Approximation Using a Single Multiplication on FPGAs Jérémie Detrey Florent de Dinechin Projet Arénaire LIP UMR CNRS ENS Lyon UCB Lyon INRIA 5668 http://www.ens-lyon.fr/lip/arenaire/

More information

Table-based polynomials for fast hardware function evaluation

Table-based polynomials for fast hardware function evaluation Table-based polynomials for fast hardware function evaluation Jérémie Detrey, Florent de Dinechin LIP, École Normale Supérieure de Lyon 46 allée d Italie 69364 Lyon cedex 07, France E-mail: {Jeremie.Detrey,

More information

Table-based polynomials for fast hardware function evaluation

Table-based polynomials for fast hardware function evaluation Laboratoire de l Informatique du Parallélisme École Normale Supérieure de Lyon Unité Mixte de Recherche CNRS-INRIA-ENS LYON-UCBL n o 5668 Table-based polynomials for fast hardware function evaluation Jérémie

More information

Automatic generation of polynomial-based hardware architectures for function evaluation

Automatic generation of polynomial-based hardware architectures for function evaluation Automatic generation of polynomial-based hardware architectures for function evaluation Florent De Dinechin, Mioara Joldes, Bogdan Pasca To cite this version: Florent De Dinechin, Mioara Joldes, Bogdan

More information

Hardware Operator for Simultaneous Sine and Cosine Evaluation

Hardware Operator for Simultaneous Sine and Cosine Evaluation Hardware Operator for Simultaneous Sine and Cosine Evaluation Arnaud Tisserand To cite this version: Arnaud Tisserand. Hardware Operator for Simultaneous Sine and Cosine Evaluation. ICASSP 6: International

More information

Fixed-Point Trigonometric Functions on FPGAs

Fixed-Point Trigonometric Functions on FPGAs Fixed-Point Trigonometric Functions on FPGAs Florent de Dinechin Matei Iştoan Guillaume Sergent LIP, Université de Lyon (CNRS/ENS-Lyon/INRIA/UCBL) 46, allée d Italie, 69364 Lyon Cedex 07 June 14th, 2013

More information

Laboratoire de l Informatique du Parallélisme. École Normale Supérieure de Lyon Unité Mixte de Recherche CNRS-INRIA-ENS LYON n o 8512

Laboratoire de l Informatique du Parallélisme. École Normale Supérieure de Lyon Unité Mixte de Recherche CNRS-INRIA-ENS LYON n o 8512 Laboratoire de l Informatique du Parallélisme École Normale Supérieure de Lyon Unité Mixte de Recherche CNRS-INRIA-ENS LYON n o 8512 SPI A few results on table-based methods Jean-Michel Muller October

More information

Optimized Linear, Quadratic and Cubic Interpolators for Elementary Function Hardware Implementations

Optimized Linear, Quadratic and Cubic Interpolators for Elementary Function Hardware Implementations electronics Article Optimized Linear, Quadratic and Cubic Interpolators for Elementary Function Hardware Implementations Masoud Sadeghian 1,, James E. Stine 1, *, and E. George Walters III 2, 1 Oklahoma

More information

Efficient Function Approximation Using Truncated Multipliers and Squarers

Efficient Function Approximation Using Truncated Multipliers and Squarers Efficient Function Approximation Using Truncated Multipliers and Squarers E. George Walters III Lehigh University Bethlehem, PA, USA waltersg@ieee.org Michael J. Schulte University of Wisconsin Madison

More information

Automated design of floating-point logarithm functions on integer processors

Automated design of floating-point logarithm functions on integer processors 23rd IEEE Symposium on Computer Arithmetic Santa Clara, CA, USA, 10-13 July 2016 Automated design of floating-point logarithm functions on integer processors Guillaume Revy (presented by Florent de Dinechin)

More information

NUMERICAL FUNCTION GENERATORS USING BILINEAR INTERPOLATION

NUMERICAL FUNCTION GENERATORS USING BILINEAR INTERPOLATION NUMERICAL FUNCTION GENERATORS USING BILINEAR INTERPOLATION Shinobu Nagayama 1, Tsutomu Sasao 2, Jon T Butler 3 1 Department of Computer Engineering, Hiroshima City University, Japan 2 Department of Computer

More information

Hardware implementations of fixed-point Atan2

Hardware implementations of fixed-point Atan2 Hardware implementations of fixed-point Atan2 Florent De Dinechin, Matei Istoan To cite this version: Florent De Dinechin, Matei Istoan. Hardware implementations of fixed-point Atan2. 22nd IEEE Symposium

More information

A Parallel Method for the Computation of Matrix Exponential based on Truncated Neumann Series

A Parallel Method for the Computation of Matrix Exponential based on Truncated Neumann Series A Parallel Method for the Computation of Matrix Exponential based on Truncated Neumann Series V. S. Dimitrov 12, V. Ariyarathna 3, D. F. G. Coelho 1, L. Rakai 1, A. Madanayake 3, R. J. Cintra 4 1 ECE Department,

More information

On the number of segments needed in a piecewise linear approximation

On the number of segments needed in a piecewise linear approximation On the number of segments needed in a piecewise linear approximation Christopher L. Frenzen a, Tsutomu Sasao b and Jon T. Butler c. a Department of Applied Mathematics, Naval Postgraduate School, Monterey,

More information

Arithmetic Operators for Pairing-Based Cryptography

Arithmetic Operators for Pairing-Based Cryptography Arithmetic Operators for Pairing-Based Cryptography Jean-Luc Beuchat Laboratory of Cryptography and Information Security Graduate School of Systems and Information Engineering University of Tsukuba 1-1-1

More information

Reduced-Area Constant-Coefficient and Multiple-Constant Multipliers for Xilinx FPGAs with 6-Input LUTs

Reduced-Area Constant-Coefficient and Multiple-Constant Multipliers for Xilinx FPGAs with 6-Input LUTs Article Reduced-Area Constant-Coefficient and Multiple-Constant Multipliers for Xilinx FPGAs with 6-Input LUTs E. George Walters III Department of Electrical and Computer Engineering, Penn State Erie,

More information

Arithmetic operators for pairing-based cryptography

Arithmetic operators for pairing-based cryptography 7. Kryptotag November 9 th, 2007 Arithmetic operators for pairing-based cryptography Jérémie Detrey Cosec, B-IT, Bonn, Germany jdetrey@bit.uni-bonn.de Joint work with: Jean-Luc Beuchat Nicolas Brisebarre

More information

Karatsuba with Rectangular Multipliers for FPGAs

Karatsuba with Rectangular Multipliers for FPGAs Karatsuba with Rectangular Multipliers for FPGAs Martin Kumm, Oscar Gustafsson, Florent De Dinechin, Johannes Kappauf, Peter Zipf To cite this version: Martin Kumm, Oscar Gustafsson, Florent De Dinechin,

More information

A Hardware-Oriented Method for Evaluating Complex Polynomials

A Hardware-Oriented Method for Evaluating Complex Polynomials A Hardware-Oriented Method for Evaluating Complex Polynomials Miloš D Ercegovac Computer Science Department University of California at Los Angeles Los Angeles, CA 90095, USA milos@csuclaedu Jean-Michel

More information

Automated design of floating-point logarithm functions on integer processors

Automated design of floating-point logarithm functions on integer processors Automated design of floating-point logarithm functions on integer processors Guillaume Revy To cite this version: Guillaume Revy. Automated design of floating-point logarithm functions on integer processors.

More information

Efficient Polynomial Evaluation Algorithm and Implementation on FPGA

Efficient Polynomial Evaluation Algorithm and Implementation on FPGA Efficient Polynomial Evaluation Algorithm and Implementation on FPGA by Simin Xu School of Computer Engineering A thesis submitted to Nanyang Technological University in partial fullfillment of the requirements

More information

Computation of the error functions erf and erfc in arbitrary precision with correct rounding

Computation of the error functions erf and erfc in arbitrary precision with correct rounding Computation of the error functions erf and erfc in arbitrary precision with correct rounding Sylvain Chevillard Arenaire, LIP, ENS-Lyon, France Sylvain.Chevillard@ens-lyon.fr Nathalie Revol INRIA, Arenaire,

More information

Arithmetic Operators for Pairing-Based Cryptography

Arithmetic Operators for Pairing-Based Cryptography Arithmetic Operators for Pairing-Based Cryptography J.-L. Beuchat 1 N. Brisebarre 2 J. Detrey 3 E. Okamoto 1 1 University of Tsukuba, Japan 2 École Normale Supérieure de Lyon, France 3 Cosec, b-it, Bonn,

More information

Multivariate Gaussian Random Number Generator Targeting Specific Resource Utilization in an FPGA

Multivariate Gaussian Random Number Generator Targeting Specific Resource Utilization in an FPGA Multivariate Gaussian Random Number Generator Targeting Specific Resource Utilization in an FPGA Chalermpol Saiprasert, Christos-Savvas Bouganis and George A. Constantinides Department of Electrical &

More information

Return of the hardware floating-point elementary function

Return of the hardware floating-point elementary function Return of the hardware floating-point elementary function Jérémie Detrey, Florent De Dinechin, Xavier Pujol To cite this version: Jérémie Detrey, Florent De Dinechin, Xavier Pujol. Return of the hardware

More information

Computing Machine-Efficient Polynomial Approximations

Computing Machine-Efficient Polynomial Approximations Computing Machine-Efficient Polynomial Approximations N. Brisebarre, S. Chevillard, G. Hanrot, J.-M. Muller, D. Stehlé, A. Tisserand and S. Torres Arénaire, LIP, É.N.S. Lyon Journées du GDR et du réseau

More information

Design Method for Numerical Function Generators Based on Polynomial Approximation for FPGA Implementation

Design Method for Numerical Function Generators Based on Polynomial Approximation for FPGA Implementation Design Method for Numerical Function Generators Based on Polynomial Approximation for FPGA Implementation Shinobu Nagayama Tsutomu Sasao Jon T. Butler Dept. of Computer Engineering, Dept. of Computer Science

More information

Design and Implementation of a Radix-4 Complex Division Unit with Prescaling

Design and Implementation of a Radix-4 Complex Division Unit with Prescaling esign and Implementation of a Radix-4 Complex ivision Unit with Prescaling Pouya ormiani Computer Science epartment University of California at Los Angeles Los Angeles, CA 90024, USA Email: pouya@cs.ucla.edu

More information

Lecture 11. Advanced Dividers

Lecture 11. Advanced Dividers Lecture 11 Advanced Dividers Required Reading Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design Chapter 15 Variation in Dividers 15.3, Combinational and Array Dividers Chapter 16, Division

More information

Efficient Subquadratic Space Complexity Binary Polynomial Multipliers Based On Block Recombination

Efficient Subquadratic Space Complexity Binary Polynomial Multipliers Based On Block Recombination Efficient Subquadratic Space Complexity Binary Polynomial Multipliers Based On Block Recombination Murat Cenk, Anwar Hasan, Christophe Negre To cite this version: Murat Cenk, Anwar Hasan, Christophe Negre.

More information

9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017

9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 9. Datapath Design Jacob Abraham Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 October 2, 2017 ECE Department, University of Texas at Austin

More information

Design and FPGA Implementation of Radix-10 Algorithm for Division with Limited Precision Primitives

Design and FPGA Implementation of Radix-10 Algorithm for Division with Limited Precision Primitives Design and FPGA Implementation of Radix-10 Algorithm for Division with Limited Precision Primitives Miloš D. Ercegovac Computer Science Department Univ. of California at Los Angeles California Robert McIlhenny

More information

MATH 1231 MATHEMATICS 1B CALCULUS. Section 5: - Power Series and Taylor Series.

MATH 1231 MATHEMATICS 1B CALCULUS. Section 5: - Power Series and Taylor Series. MATH 1231 MATHEMATICS 1B CALCULUS. Section 5: - Power Series and Taylor Series. The objective of this section is to become familiar with the theory and application of power series and Taylor series. By

More information

Computer Problems for Fourier Series and Transforms

Computer Problems for Fourier Series and Transforms Computer Problems for Fourier Series and Transforms 1. Square waves are frequently used in electronics and signal processing. An example is shown below. 1 π < x < 0 1 0 < x < π y(x) = 1 π < x < 2π... and

More information

Rigorous Polynomial Approximations and Applications

Rigorous Polynomial Approximations and Applications Rigorous Polynomial Approximations and Applications Mioara Joldeș under the supervision of: Nicolas Brisebarre and Jean-Michel Muller École Normale Supérieure de Lyon, Arénaire Team, Laboratoire de l Informatique

More information

Part VI Function Evaluation

Part VI Function Evaluation Part VI Function Evaluation Parts Chapters I. Number Representation 1. 2. 3. 4. Numbers and Arithmetic Representing Signed Numbers Redundant Number Systems Residue Number Systems Elementary Operations

More information

Section 5.8. Taylor Series

Section 5.8. Taylor Series Difference Equations to Differential Equations Section 5.8 Taylor Series In this section we will put together much of the work of Sections 5.-5.7 in the context of a discussion of Taylor series. We begin

More information

Complex Logarithmic Number System Arithmetic Using High-Radix Redundant CORDIC Algorithms

Complex Logarithmic Number System Arithmetic Using High-Radix Redundant CORDIC Algorithms Complex Logarithmic Number System Arithmetic Using High-Radix Redundant CORDIC Algorithms David Lewis Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S

More information

ECE 645: Lecture 3. Conditional-Sum Adders and Parallel Prefix Network Adders. FPGA Optimized Adders

ECE 645: Lecture 3. Conditional-Sum Adders and Parallel Prefix Network Adders. FPGA Optimized Adders ECE 645: Lecture 3 Conditional-Sum Adders and Parallel Prefix Network Adders FPGA Optimized Adders Required Reading Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design Chapter 7.4, Conditional-Sum

More information

f (x) = k=0 f (0) = k=0 k=0 a k k(0) k 1 = a 1 a 1 = f (0). a k k(k 1)x k 2, k=2 a k k(k 1)(0) k 2 = 2a 2 a 2 = f (0) 2 a k k(k 1)(k 2)x k 3, k=3

f (x) = k=0 f (0) = k=0 k=0 a k k(0) k 1 = a 1 a 1 = f (0). a k k(k 1)x k 2, k=2 a k k(k 1)(0) k 2 = 2a 2 a 2 = f (0) 2 a k k(k 1)(k 2)x k 3, k=3 1 M 13-Lecture Contents: 1) Taylor Polynomials 2) Taylor Series Centered at x a 3) Applications of Taylor Polynomials Taylor Series The previous section served as motivation and gave some useful expansion.

More information
