Table-Based Polynomials for Fast Hardware Function Evaluation

ASAP 05
Table-Based Polynomials for Fast Hardware Function Evaluation
Jérémie Detrey, Florent de Dinechin
Projet Arénaire, LIP, UMR CNRS, ENS Lyon, UCB Lyon, INRIA 5668
http://www.ens-lyon.fr/lip/arenaire/
CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE, ECOLE NORMALE SUPERIEURE DE LYON

Overview
- Context
- The HOTBM method
- Results
- Conclusion

Jérémie Detrey, Florent de Dinechin, Table-Based Polynomials for Fast Hardware Function Evaluation, 1 / 34

Context

Context: function evaluation
- fixed-point elementary functions: $\sin(x)$, $\cos(x)$, $\log(x)$, $e^x$, ...
- signal or image processing
- neural networks
- dedicated computations
- logarithmic number system: $\log_2(1 + 2^x)$ and $\log_2(1 - 2^x)$
- ...
- [Diagram: an operator "?" takes the $w_I$-bit input X and outputs the $w_O$-bit value f(x)]
- usually $w_I = w_O$ and $8 \le w_I, w_O \le 32$

Order 0: direct look-up table
- tabulate all the possible values
- [Diagram: the $w_I$-bit input X addresses a table holding $f(0), f(1), \ldots, f(2^{w_I} - 2), f(2^{w_I} - 1)$, which outputs the $w_O$-bit value f(x)]
- very short critical path: only 1 table look-up
- huge look-up table: $w_O \cdot 2^{w_I}$ bits
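To make the storage cost concrete, here is a minimal Python sketch of the order-0 scheme. The function names, the choice of log2(1 + x) on [0, 1) and the 8-bit widths are assumptions made for illustration; they are not taken from the slides.

```python
import math

def build_direct_table(f, w_in, w_out):
    """Tabulate f, rounded to w_out fractional bits, for every w_in-bit input in [0, 1)."""
    return [round(f(i / (1 << w_in)) * (1 << w_out)) for i in range(1 << w_in)]

def table_size_bits(w_in, w_out):
    """Storage cost of the direct table: w_out * 2**w_in bits."""
    return w_out * (1 << w_in)

# Illustrative function (an assumption): log2(1 + x) on [0, 1).
table = build_direct_table(lambda x: math.log2(1.0 + x), w_in=8, w_out=8)

# Evaluation is a single look-up: the input word is the address.
y = table[0b10000000]          # f(0.5), scaled by 2**8
```

At 8 bits the table is small (2048 bits), but the same formula gives 2**37 bits for w_I = w_O = 32, which is why the higher-order methods trade the table for arithmetic.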

Order 1: lookup-multiply method
- piecewise linear approximation
- [Diagram: the $w_I$-bit input X is split into A and B; A addresses the tables $K_0(A)$ and $K_1(A)$, each $w_O + g$ bits wide; $K_1(A)$ is multiplied by B and added to $K_0(A)$, and the sum is rounded to the $w_O$-bit result f(x)]
- smaller tables
- longer critical path: 1 table look-up, 1 mult and 1 add
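The lookup-multiply datapath can be sketched in fixed point as follows. This is a simplified illustration: the coefficients come from the chord through the endpoints of each segment (a stand-in for a proper minimax fit), and the function names, log2(1 + x) and the parameter values are assumptions.

```python
import math

def build_linear_tables(f, alpha, w_out, g):
    """Per-segment constant K0 and slope K1, stored with w_out + g fractional bits."""
    frac = w_out + g
    k0, k1 = [], []
    for a in range(1 << alpha):
        x_lo = a / (1 << alpha)
        x_hi = (a + 1) / (1 << alpha)
        k0.append(round(f(x_lo) * (1 << frac)))
        # Rise over the whole segment; multiplied later by B / 2**beta.
        k1.append(round((f(x_hi) - f(x_lo)) * (1 << frac)))
    return k0, k1

def eval_linear(x, k0, k1, beta, g):
    """One table look-up (both tables share address A), one multiply, one add."""
    a = x >> beta                        # A: most significant bits
    b = x & ((1 << beta) - 1)            # B: least significant bits
    acc = k0[a] + ((k1[a] * b) >> beta)
    return (acc + (1 << (g - 1))) >> g   # round away the g guard bits

ALPHA, BETA, W_OUT, G = 6, 6, 12, 3      # assumed parameters

def f(x):
    return math.log2(1.0 + x)

k0, k1 = build_linear_tables(f, ALPHA, W_OUT, G)
max_err = max(
    abs(eval_linear(x, k0, k1, BETA, G) / (1 << W_OUT) - f(x / (1 << (ALPHA + BETA))))
    for x in range(1 << (ALPHA + BETA))
)
```

The two tables hold 2**6 entries each instead of the 2**12 entries of a direct table, at the price of a multiplier on the critical path.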

Order 1: bipartite table method [2]
- tabulate the product in a table of offsets (TO)
- [Diagram: the $w_I$-bit input X is split into A and B; A addresses the table TIV(A); $A_0$, taken from A, and B address the table $\mathrm{TO}(A_0, B)$; both outputs are $w_O + g$ bits wide and are added, then rounded to the $w_O$-bit result f(x)]
- shorter critical path: 1 table look-up and 1 add
- slightly larger tables
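A simplified sketch of the bipartite idea follows: the multiplier of the previous method disappears because the slope only depends on $A_0$, the leading bits of A, so the product can be tabulated. The helper names, the use of the average slope per coarse interval, the function log2(1 + x) and the bit widths are all assumptions for illustration; published bipartite schemes add refinements (such as centring the offsets) that are left out here.

```python
import math

def build_bipartite(f, alpha, alpha0, beta, w_out, g):
    """TIV indexed by A; TO indexed by (A0, B), where A0 is the top alpha0 bits of A."""
    frac = w_out + g
    w_in = alpha + beta
    tiv = [round(f(a / (1 << alpha)) * (1 << frac)) for a in range(1 << alpha)]
    to = []
    for a0 in range(1 << alpha0):
        lo = a0 / (1 << alpha0)
        hi = (a0 + 1) / (1 << alpha0)
        slope = (f(hi) - f(lo)) * (1 << alpha0)   # average slope on the coarse interval
        to.append([round(slope * b / (1 << w_in) * (1 << frac)) for b in range(1 << beta)])
    return tiv, to

def eval_bipartite(x, tiv, to, alpha, alpha0, beta, g):
    """Two parallel table look-ups followed by a single addition."""
    a = x >> beta
    b = x & ((1 << beta) - 1)
    a0 = a >> (alpha - alpha0)
    acc = tiv[a] + to[a0][b]
    return (acc + (1 << (g - 1))) >> g

ALPHA, ALPHA0, BETA, W_OUT, G = 8, 5, 4, 12, 2    # assumed parameters

def f(x):
    return math.log2(1.0 + x)

tiv, to = build_bipartite(f, ALPHA, ALPHA0, BETA, W_OUT, G)
entries = len(tiv) + sum(len(row) for row in to)
max_err = max(
    abs(eval_bipartite(x, tiv, to, ALPHA, ALPHA0, BETA, G) / (1 << W_OUT)
        - f(x / (1 << (ALPHA + BETA))))
    for x in range(1 << (ALPHA + BETA))
)
```

With these parameters the two tables hold 768 entries in total, against 4096 for a direct 12-bit table.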

Order 1: multipartite table method [14,11,4]
- split the linear offset (TO) as a sum of several offsets ($\mathrm{TO}_i$'s)
- [Diagram: the $w_I$-bit input X is split into A and the slices $B_0$, $B_1$, $B_2$ of B; A addresses the TIV; each pair $(A_i, B_i)$ addresses a table $\mathrm{TO}_i$ placed between two XOR stages; the $w_O + g$-bit outputs are summed and rounded to the $w_O$-bit result f(x)]
- critical path: 2 XOR stages, 1 table look-up and $\log_2(n)$ adds
- much smaller tables, but adder tree

Order 2: SMSO method [6]
- split the order-1 term as the sum of a small product and an offset
- [Diagram: the $w_I$-bit input X is split into A and B; A addresses the TIV ($w_O + g_0$ bits); $A_0$ addresses a table TS ($w_O + g_1$ bits) whose output is multiplied by $B_0$; $(A_1, B_1)$ and $(A_2, B_2)$ address the tables $\mathrm{TO}_1$ and $\mathrm{TO}_2$ ($w_O + g_0$ bits) through XOR stages; all outputs are summed and rounded to the $w_O$-bit result f(x)]
- critical path: 1 table look-up, 1 rectangular mult and 2 adds
- multiplier, but smaller tables

Higher order methods
- Horner evaluation
- interleaved memory interpolators: Lewis
- partial product arrays: Hassler and Takagi
- specialized squaring unit: Piñero, Bruguera and Muller
- this work

Objectives
- higher order approximation for larger precisions and smaller tables
- accurate error analysis to help the optimization of the hardware cost
- split large operators into smaller ones for architectural exploration

The HOTBM method (Higher-Order Table-Based Method)

Polynomial approximation
- [Plot: f(x) on the interval from 0 to 1, divided into 8 sub-intervals at 1/8, 2/8, ..., 7/8]
- input word decomposition: $X = A + 2^{-\alpha} B = .a_1 a_2 \cdots a_\alpha b_1 b_2 \cdots b_\beta$, where A is the $\alpha$ leading bits and B the $\beta$ trailing bits of the $w_I$-bit input
- piecewise order-n minimax polynomial approximation: $f(x) \approx P(A)(B) = \sum_{k=0}^{n} K_k(A) \cdot (2^{-\alpha} B)^k$
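The decomposition and the piecewise polynomial can be sketched in a few lines of Python. Two assumptions are made for brevity: the coefficients $K_k(A)$ are Taylor coefficients of $e^x$ at the left end of each sub-interval (the slides use minimax polynomials, which would be more accurate), and everything is evaluated in floating point rather than fixed point. All names are hypothetical.

```python
import math

def split_input(x, alpha, beta):
    """Split an (alpha + beta)-bit word X into A (top alpha bits) and B (low beta bits)."""
    return x >> beta, x & ((1 << beta) - 1)

def coeffs_exp(a, alpha, n):
    """K_k(A) for f = exp: Taylor coefficients at the left end of sub-interval A."""
    x0 = a / (1 << alpha)
    return [math.exp(x0) / math.factorial(k) for k in range(n + 1)]

def eval_piecewise(x, alpha, beta, n):
    """Evaluate sum_k K_k(A) * (2**-alpha * B)**k, with B read as a fraction in [0, 1)."""
    a, b = split_input(x, alpha, beta)
    offset = b / (1 << (alpha + beta))       # 2**-alpha * B
    return sum(c * offset ** k for k, c in enumerate(coeffs_exp(a, alpha, n)))

ALPHA, BETA, ORDER = 4, 8, 2                 # assumed parameters
max_err = max(
    abs(eval_piecewise(x, ALPHA, BETA, ORDER) - math.exp(x / (1 << (ALPHA + BETA))))
    for x in range(1 << (ALPHA + BETA))
)
```

With only 16 sub-intervals, the order-2 polynomial already keeps the error on $e^x$ below about $1.2 \times 10^{-4}$ over [0, 1), which illustrates why raising the order allows smaller tables.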

Polynomial approximation: architecture
- [Diagram: the $w_I$-bit input X is split into A and B; the terms $K_0(A)$, $K_1(A) \cdot 2^{-\alpha} B$, $K_2(A) \cdot (2^{-\alpha} B)^2$, ..., $K_n(A) \cdot (2^{-\alpha} B)^n$ are each produced on $w_O + g$ bits by a box marked "?", then summed and rounded to the $w_O$-bit result f(x)]
- architectural choices to implement each term $T_k(A, B) = K_k(A) \cdot (2^{-\alpha} B)^k$

Computing the terms: exploiting symmetry
- each term $T_k(A, B)$ is symmetric with respect to the middle of each sub-interval:
  - when k is even, $T_k(A, -B) = T_k(A, B)$
  - when k is odd, $T_k(A, -B) = -T_k(A, B)$
- [Diagrams, one per case: the halves B < 0 and B > 0 of a sub-interval, with inputs A, $b_1$ and B, and output $T_k(A, B)$]
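One way to exploit this symmetry, sketched below under stated assumptions, is to store only the half of the table where the offset is positive and to mirror the address with an XOR when the leading bit $b_1$ indicates the other half. The sketch assumes that B is measured from the middle of the sub-interval and sampled at cell midpoints, so that a bitwise complement of the remaining bits is an exact mirror; the function names and the floating-point table contents are illustrative only.

```python
def centered_offset(b, beta):
    """Signed offset of the beta-bit word B from the middle of its sub-interval."""
    return (b + 0.5) / (1 << beta) - 0.5

def build_half_table(coeff, k, beta):
    """Tabulate coeff * d**k for the positive half only: 2**(beta - 1) entries."""
    return [coeff * ((r + 0.5) / (1 << beta)) ** k for r in range(1 << (beta - 1))]

def lookup_term(b, half_table, k, beta):
    """Recover coeff * d**k for any B from the half-size table."""
    mask = (1 << (beta - 1)) - 1
    b1 = b >> (beta - 1)      # 1: upper half (d > 0), 0: lower half (d < 0)
    r = b & mask
    if b1 == 0:
        r ^= mask             # XOR stage: mirror the address
    value = half_table[r]
    if b1 == 0 and k % 2 == 1:
        value = -value        # odd powers change sign, even powers do not
    return value

BETA, COEFF = 6, 0.75         # assumed parameters
tables = {k: build_half_table(COEFF, k, BETA) for k in (1, 2, 3)}
worst = max(
    abs(lookup_term(b, tables[k], k, BETA) - COEFF * centered_offset(b, BETA) ** k)
    for k in (1, 2, 3)
    for b in range(1 << BETA)
)
```

Each table has 32 entries instead of 64, and the reconstruction is exact for both even and odd k.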

Computing the terms: simple look-up table
- tabulate all the possible values
- [Diagram: a single table with inputs A, $b_1$ and B outputs $T_k(A, B)$]

Computing the terms: power-and-multiply
- compute $S_k = B^k$ with a powering unit
- split $S_k$ into several sub-words $S_{k,1}, \ldots, S_{k,m_k}$ of widths $\sigma_{k,1}, \sigma_{k,2}, \ldots, \sigma_{k,m_k}$, covering the $k(\beta - 1)$ bits of $S_k$
- compute the product $K_k(A) \cdot S_k$ as the sum of all the sub-products $K_k(A) \cdot S_{k,j}$:
  - the most significant ones implemented as actual multipliers
  - the least significant ones implemented as look-up tables
- exploit symmetry for each of those sub-products
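The sub-word splitting above can be checked with plain integer arithmetic. The sketch below is only an arithmetic illustration: it uses exact integers, a made-up coefficient and made-up sub-word widths, and it does not model which sub-products would become multipliers and which would become tables.

```python
def split_subwords(s, widths):
    """Split integer s into sub-words, most significant first.

    Returns (value, shift) pairs such that s == sum(value << shift).
    """
    parts, shift = [], sum(widths)
    for w in widths:
        shift -= w
        parts.append(((s >> shift) & ((1 << w) - 1), shift))
    return parts

def power_and_multiply(k_coeff, b, k, widths):
    """Compute k_coeff * b**k as a sum of sub-products k_coeff * S_j, one per sub-word."""
    s = b ** k                                   # powering unit
    return sum((k_coeff * part) << shift for part, shift in split_subwords(s, widths))

# Assumed example: B on beta - 1 = 7 bits, k = 2, so S_2 has 14 bits, split as 5 + 5 + 4.
WIDTHS = [5, 5, 4]
ok = all(power_and_multiply(37, b, 2, WIDTHS) == 37 * b * b for b in range(1 << 7))
```

Each sub-product only involves a narrow operand $S_{k,j}$, so in hardware it could be a small multiplier or, for the least significant sub-words, a table addressed by A and $S_{k,j}$.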

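Computing the terms: power-and-multiply
- [Diagram: $b_1$ and B pass through an XOR stage into the powering unit, whose output $B^k$ is split into the sub-words $S_{k,1}, S_{k,2}, \ldots, S_{k,m_k}$; A addresses $K_k(A)$, and the sub-products $K_k(A) \cdot S_{k,1}$, $K_k(A) \cdot S_{k,2}$, ..., $K_k(A) \cdot S_{k,m_k}$ are computed separately]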

Computing the terms: powering unit
- implemented as a look-up table
- implemented as a sum of partial products
- [Diagram: the partial products of B are summed to give $S_k$]
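For k = 2, the sum-of-partial-products option can be illustrated with the classic squarer identity $B^2 = \sum_i b_i 2^{2i} + \sum_{i<j} b_i b_j 2^{i+j+1}$, which needs roughly half the partial products of a general multiplier. The sketch below is a bit-level illustration with a hypothetical function name; it is not the optimized unit described in the slides.

```python
def square_by_partial_products(b, width):
    """Square a width-bit integer by summing its partial products bit by bit."""
    bits = [(b >> i) & 1 for i in range(width)]
    total = 0
    for i in range(width):
        total += bits[i] << (2 * i)                       # b_i * b_i = b_i
        for j in range(i + 1, width):
            # b_i * b_j appears twice in the full array: fold into one term, weight doubled.
            total += (bits[i] & bits[j]) << (i + j + 1)
    return total

all_match = all(square_by_partial_products(b, 8) == b * b for b in range(256))
```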

Degrading accuracy
- [Diagram: the bit ranges of the terms contributing to f(x), with offsets $\alpha$ and $2\alpha$ marked: $T_0(A) = K_0(A)$; $T_1(A, B) = K_1(A) \cdot 2^{-\alpha} B$, made of the sub-products $K_1(A) \cdot S_{1,1}$, $K_1(A) \cdot S_{1,2}$, $K_1(A) \cdot S_{1,3}$; $T_2(A, B) = K_2(A) \cdot (2^{-\alpha} B)^2$, made of the sub-products $K_2(A) \cdot S_{2,1}$, $K_2(A) \cdot S_{2,2}$; ...]
- some of the terms are more accurate than others
- we can save area by using fewer bits to compute the most accurate tables

Degrading accuracy: global architecture
- each term $T_k$ is computed using only:
  - $A_k$, the $\alpha_k$ most significant bits of A
  - $B_k$, the $\beta_k$ most significant bits of B
- [Diagram: within the $w_I$-bit input, $A_k$ spans the first $\alpha_k$ bits of A and $B_k$ the first $\beta_k$ bits of B]
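Extracting $A_k$ and $B_k$ is a matter of dropping low-order bits, as in this small sketch (the function names and the example widths are assumptions):

```python
def msbs(value, width, keep):
    """Return the `keep` most significant bits of a `width`-bit word."""
    return value >> (width - keep)

def term_inputs(x, alpha, beta, alpha_k, beta_k):
    """Split X into A and B, then keep only alpha_k bits of A and beta_k bits of B."""
    a = x >> beta
    b = x & ((1 << beta) - 1)
    return msbs(a, alpha, alpha_k), msbs(b, beta, beta_k)

def address_bits_saved(alpha, beta, alpha_k, beta_k):
    """How many address bits a table for T_k no longer needs."""
    return (alpha - alpha_k) + (beta - beta_k)

a_k, b_k = term_inputs(0b10110110, alpha=4, beta=4, alpha_k=3, beta_k=2)
```

Every address bit removed halves the table that computes the term, which is where the area saving comes from.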

Degrading accuracy: global architecture
- [Diagram: the $w_I$-bit input X is split into A and B; the truncated inputs feed the terms $T_0(A_0)$, $T_1(A_1, B_1)$, $T_2(A_2, B_2)$, ..., $T_n(A_n, B_n)$, each $w_O + g$ bits wide; the terms are summed and rounded to the $w_O$-bit result f(x)]

Degrading accuracy: power-and-multiply terms
- only the $\lambda_k$ most significant bits of $B_k^k$ are used for $S_k$
- each sub-product $K_k(A_k) \cdot S_{k,j}$ is computed using only $A_{k,j}$, the $\alpha_{k,j}$ most significant bits of $A_k$

Degrading accuracy: power-and-multiply terms
- [Diagram: $A_k$ is truncated to $A_{k,1}, A_{k,2}, \ldots, A_{k,m_k}$, which address $K_k(A_{k,1}), K_k(A_{k,2}), \ldots, K_k(A_{k,m_k})$; $b_1$ and $B_k$ pass through an XOR stage into the powering unit, whose output $B_k^k$ is split into $S_{k,1}, S_{k,2}, \ldots, S_{k,m_k}$; each sub-product $K_k(A_{k,j}) \cdot S_{k,j}$ is computed separately]

Degrading accuracy: ad-hoc powering units
- each ad-hoc powering unit is truncated to $\mu_k$ bits
- [Diagram: the partial product array of $B_k$, truncated, produces $S_k$]

Error analysis
- every error entailed by the operator is accurately bounded:
  - minimax error
  - method errors
  - rounding errors
- we can easily compute g, the number of guard bits required to ensure faithful rounding (last bit accuracy)
- a trial-and-error method is then applied to decrease g
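The slides do not give the formula for g, so the following is only a sketch under a simplified error budget assumed for illustration: the approximation and method errors are lumped into one bound `approx_err`, each of the `n_terms` tables is rounded to nearest on $w_O + g$ bits (error at most $2^{-(w_O+g+1)}$ each), the final rounding costs at most $2^{-(w_O+1)}$, and faithful rounding requires the total to stay below $2^{-w_O}$.

```python
def guard_bits(n_terms, w_out, approx_err):
    """Smallest g such that
    approx_err + n_terms * 2**-(w_out + g + 1) + 2**-(w_out + 1) < 2**-w_out.
    (Assumed error model, not the analysis from the slides.)
    """
    budget = 2.0 ** -(w_out + 1) - approx_err    # what is left for table rounding
    if budget <= 0:
        raise ValueError("approximation error too large for faithful rounding")
    g = 0
    while n_terms * 2.0 ** -(w_out + g + 1) >= budget:
        g += 1
    return g

g_exact_poly = guard_bits(n_terms=3, w_out=16, approx_err=0.0)
g_with_error = guard_bits(n_terms=3, w_out=16, approx_err=2.0 ** -18)
```

Such a bound is a worst case; the trial-and-error step mentioned above would then test smaller values of g against the actual operator.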

Results

Results: area estimations for $\log_2(1 + x)$
- [Plot: operator area (in slices, 0 to 3000) and FPGA area ratio (10% to 50%) versus input/output precision $w_I = w_O$ (12 to 32 bits), with curves for SMSO, order 2, order 3 and order 4]
- as expected, exponential growth
- order 2 up to 24 bits, order 3 up to 28 bits, order 4 up to 32 bits

Results: area estimations for $\sin x$
- [Plot: operator area (in slices, 0 to 3000) and FPGA area ratio (10% to 50%) versus input/output precision $w_I = w_O$ (12 to 32 bits), with curves for SMSO, order 2, order 3 and order 4]

Results: delay estimations for $\log_2(1 + x)$
- [Plot: operator delay (in ns, 20 to 50) versus input/output precision $w_I = w_O$ (12 to 32 bits), with curves for SMSO, order 2, order 3 and order 4]
- latency increase for higher orders

Results: delay estimations for $\sin x$
- [Plot: operator delay (in ns, 20 to 50) versus input/output precision $w_I = w_O$ (12 to 32 bits), with curves for SMSO, order 2, order 3 and order 4]

Conclusion

Contribution
- a novel function approximation method:
  - arbitrary order: smaller tables
  - optimized powering units
  - small multipliers: shorter critical path, and can benefit from recent FPGA technologies (Virtex-II)
- highly parameterizable design, adaptable to various metrics
- accurate approximation and rounding error analysis
- targeted to precisions up to 32 bits

Future work
- improve parameter space exploration heuristic following user-specified criteria
- adapt this method to ASIC (different metric, architectural choices, ...)
- take advantage of accurate error analysis method to finely tune the tables
- work-in-progress: library of parameterizable floating-point operators for elementary functions:
  - logarithm
  - exponential

Thank you for your attention
- more information: http://www.ens-lyon.fr/lip/arenaire/
- CVS repository: http://lipforge.ens-lyon.fr/www/hotbm/

Questions?