Second Order Function Approximation Using a Single Multiplication on FPGAs

Size: px

Start display at page:

Download "Second Order Function Approximation Using a Single Multiplication on FPGAs"

Rolf Gilmore
5 years ago
Views:

1 FPL 04 Second Order Function Approximation Using a Single Multiplication on FPGAs Jérémie Detrey Florent de Dinechin Projet Arénaire LIP UMR CNRS ENS Lyon UCB Lyon INRIA CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE ECOLE NORMALE SUPERIEURE DE LYON

2 Overview 1 Context The SMSO method Optimization Results Conclusion Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 1 / 31

3 Context 2 Context Function evaluation State of the art Objectives The SMSO method Optimization Results Conclusion Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 2 / 31

4 Context: function evaluation 3 elementary functions sin(x), cos(x), log(x), e x,... signal or image processing neural networks... special functions: logarithmic number system: log 2 (1 + 2 x ) and log 2 (1 2 x ) log 2 (1 + 2 x ) log 2 (1 2 x ) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 3 / 31

5 Context: function evaluation 4 input: function f : [0; 1[ [0; 1[ input precision w I output precision w O (usually w O = w I ) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 4 / 31

6 Context: function evaluation 4 input: function f : [0; 1[ [0; 1[ input precision w I output precision w O (usually w O = w I ) output: X w I w O f(x) where X =.x 1 x 2 x wi and f(x) = Y =.y 1 y 2 y wo f(x) at the required precision Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 4 / 31

7 Context: function evaluation 4 input: function f : [0; 1[ [0; 1[ input precision w I output precision w O (usually w O = w I ) output: X w I? w O f(x) where X =.x 1 x 2 x wi and f(x) = Y =.y 1 y 2 y wo f(x) at the required precision Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 4 / 31

8 Order 0: direct look-up table 5 X w I f(0) f(1).. f(2 w I 2) f(2 w I 1) w O f(x) very short critical path: only 1 table look-up huge look-up table: w O 2 w I bits Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 5 / 31

9 Order 1: lookup-multiply method 6 Mencer, Boullis, Luk and Styles K 0 (A) w O + g X w I A K 1 (A) w O + g K 1 (A) B w O f(x) B smaller tables longer critical path: 1 table look-up, 1 mult and 1 add Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 6 / 31

10 Order 1: bipartite table method 7 Das Sarma and Matula, generalized by Schulte and Stine TIV(A) w O + g A w O f(x) X w I A 0 TO(A 0, B) w O + g B shorter critical path: 1 table look-up and 1 add slightly larger tables Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 7 / 31

11 Order 1: multipartite table method 8 Schulte and Stine, Muller, de Dinechin and Tisserand: generalization and extension of the idea of the bipartite table method X w I A TIV w O + g A 0 B B 0 A 1 O X R TO 0 O X R w O + g w O f(x) B 1 O X R TO 1 O X R w O + g B 2 A 2 O X R TO 2 O X R w O + g critical path: 2 XOR stages, 1 table look-up and log 2 (n) adds much smaller tables, but adder tree Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 8 / 31

12 Higher order methods 9 Hörner evaluation interleaved memory interpolators: Lewis partial product arrays: Hassler and Takagi specialized squaring unit: Piñero, Bruguera and Muller simplified order 5 Taylor approximation: Defour, de Dinechin and Muller... Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 9 / 31

13 Objectives 10 simplify, generalize and extend Defour s method use ideas from order 1 methods (bipartite and multipartite table) for an order 2 approximation Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 10 / 31

14 Objectives 10 simplify, generalize and extend Defour s method use ideas from order 1 methods (bipartite and multipartite table) for an order 2 approximation maintain short critical path while keeping tables as small as possible use only small multipliers (Virtex-II) accurate error analysis for a fine tuning of the operators Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 10 / 31

15 The SMSO method 11 Context The SMSO method General idea Architecture Optimization Results Conclusion Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 11 / 31

16 General idea: second order approximation 12 input word decomposition: X = A + 2 α B =.a 1 a 2 a α b 1 b 2 b β w I A α B β Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 12 / 31

17 General idea: second order approximation 12 input word decomposition: X = A + 2 α B =.a 1 a 2 a α b 1 b 2 b β w I A α B β order 2 approximation on each interval addressed by A: f(x) = K 0 (A) + K 1 (A) 2 α B + K 2 (A) 2 2α B 2 Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 12 / 31

18 General idea: second order approximation 12 input word decomposition: X = A + 2 α B =.a 1 a 2 a α b 1 b 2 b β w I A α B β order 2 approximation on each interval addressed by A: f(x) = K 0 (A) + K 1 (A) 2 α B + K 2 (A) 2 2α B 2 }{{} look-up table TIV(A) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 12 / 31

19 General idea: second order approximation 12 input word decomposition: X = A + 2 α B =.a 1 a 2 a α b 1 b 2 b β w I A α B β order 2 approximation on each interval addressed by A: f(x) = K 0 (A) + K 1 (A) 2 α B + K 2 (A) 2 2α B 2 }{{} } {{ } look-up table look-up table TIV(A) TO 2 (A, B) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 12 / 31

20 General idea: second order approximation 12 input word decomposition: X = A + 2 α B =.a 1 a 2 a α b 1 b 2 b β w I A α B β order 2 approximation on each interval addressed by A: f(x) = K 0 (A) + K 1 (A) 2 α B + K 2 (A) 2 2α B 2 }{{}}{{}}{{} look-up table multiplier? look-up table TIV(A) look-up table? TO 2 (A, B) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 12 / 31

21 General idea: second order approximation 12 input word decomposition: X = A + 2 α B =.a 1 a 2 a α b 1 b 2 b β w I A α B β order 2 approximation on each interval addressed by A: f(x) = K 0 (A) + K 1 (A) 2 α B + K 2 (A) 2 2α B 2 }{{}}{{}}{{} look-up table multiplier? look-up table TIV(A) look-up table? TO 2 (A, B) both Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 12 / 31

22 General idea: multiplication vs. look-up table tradeoff 13 second decomposition: B = B β 0B 1 =.b 1 b 2 b β0 b β0 +1 b β : w I A α B β B 0 B 1 β 0 β 1 Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 13 / 31

23 General idea: multiplication vs. look-up table tradeoff 13 second decomposition: B = B β 0B 1 =.b 1 b 2 b β0 b β0 +1 b β : w I A α B β B 0 B 1 β 0 β 1 we obtain: K 1 (A) 2 α B = K 1 (A) 2 α B 0 + K 1 (A) 2 α β 0B 1 Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 13 / 31

24 General idea: multiplication vs. look-up table tradeoff 13 second decomposition: B = B β 0B 1 =.b 1 b 2 b β0 b β0 +1 b β : w I A α B β B 0 B 1 β 0 β 1 we obtain: K 1 (A) 2 α B = K 1 (A) 2 α B 0 + K 1 (A) 2 α β 0B 1 }{{}}{{} multiplier look-up table TS(A) B 0 TO 1 (A, B 1 ) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 13 / 31

25 General idea 14 we have 4 tables: Table of Initial Values: Table of Slopes: Table of Offsets (order 1): Table of Offsets (order 2): TIV(A) = K 0 (A) TS(A) = 2 α K 1 (A) TO 1 (A, B 1 ) = 2 α β 0 K 1 (A) B 1 TO 2 (A, B) = 2 2α K 2 (A) B 2 Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 14 / 31

26 General idea 14 we have 4 tables: Table of Initial Values: Table of Slopes: Table of Offsets (order 1): Table of Offsets (order 2): TIV(A) = K 0 (A) TS(A) = 2 α K 1 (A) TO 1 (A, B 1 ) = 2 α β 0 K 1 (A) B 1 TO 2 (A, B) = 2 2α K 2 (A) B 2 we obtain: f(x) = TIV(A) + TS(A) B 0 + TO 1 (A, B 1 ) + TO 2 (A, B) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 14 / 31

27 General idea: degrading accuracy 15 TIV(A) = K 0 (A) α α + β 0 2α f(x) TS(A) B 0 TO 1 (A, B 1 ) TO 2 (A, B) = 2 α K 1 (A) B 0 = 2 α β0 K 1 (A) B 1 = 2 2α K 2 (A) B 2 some of the terms are more accurate than others Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 15 / 31

28 General idea: degrading accuracy 15 TIV(A) = K 0 (A) α α + β 0 2α f(x) TS(A) B 0 TO 1 (A, B 1 ) TO 2 (A, B) = 2 α K 1 (A) B 0 = 2 α β0 K 1 (A) B 1 = 2 2α K 2 (A) B 2 some of the terms are more accurate than others Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 15 / 31

29 General idea: degrading accuracy 15 TIV(A) = K 0 (A) α α + β 0 2α f(x) TS(A) B 0 TO 1 (A, B 1 ) TO 2 (A, B) = 2 α K 1 (A) B 0 = 2 α β0 K 1 (A) B 1 = 2 2α K 2 (A) B 2 some of the terms are more accurate than others we can save area by addressing the more accurate tables with less bits Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 15 / 31

30 General idea: degrading accuracy 16 input word decomposition: w I A α B β A 0 α 0 B 0 β 0 A 1 α 1 B 1 β 1 A 2 α 2 B 2 β 2 Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 16 / 31

31 General idea: degrading accuracy 16 input word decomposition: w I A α B β A 0 α 0 B 0 β 0 A 1 α 1 B 1 β 1 A 2 α 2 B 2 β 2 we obtain the final SMSO formula: f(x) = TIV(A) + TS(A 0 ) B 0 + TO 1 (A 1, B 1 ) + TO 2 (A 2, B 2 ) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 16 / 31

32 General idea: exploiting symmetry 17 both TO 1 and TO 2 have symmetry property: TO 1 (A 1, B 1 ) = TO 1 (A 1, B 1 ) TO 2 (A 2, B 2 ) = TO 2 (A 2, B 2 ) TO 1 (A 1, B 1 ) TO 2 (A 2, B 2 ) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 17 / 31

33 Architecture 18 X w I A TIV w O + g 0 B A 0 TS w O + g 1 w O + g 0 B 0 w O f(x) A 1 B 1 O X R TO 1 O X R w O + g 0 A 2 B 2 O X R TO 2 w O + g 0 Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 18 / 31

34 Optimization 19 Context The SMSO method Optimization Rounding considerations Parameter space exploration Results Conclusion Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 19 / 31

35 Rounding considerations 20 we want to achieve faithful rounding: f(x) f(x) < 2 w O Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 20 / 31

36 Rounding considerations 20 we want to achieve faithful rounding: f(x) f(x) < 2 w O the SMSO operator entails several errors: polynomial approximation: degrading table accuracy: rounding table values: final rounding: ɛ poly ɛ tab ɛ rt < 4 2 w O g w O g 1 1 ɛ rf < 2 w O 1 (1 2 g 0) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 20 / 31

37 Rounding considerations 21 error constraint: ɛ poly + ɛ tab + ɛ rt + ɛ rf < 2 w O Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 21 / 31

38 Rounding considerations 21 error constraint: ɛ poly + ɛ tab + ɛ rt + ɛ rf < 2 w O constraint on g 0 and g 1 : ɛ poly + ɛ tab w O g w O g w O 1 (1 2 g 0 ) < 2 w O Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 21 / 31

39 Rounding considerations 21 error constraint: ɛ poly + ɛ tab + ɛ rt + ɛ rf < 2 w O constraint on g 0 and g 1 : ɛ poly + ɛ tab w O g w O g w O 1 (1 2 g 0 ) < 2 w O assuming g 0 = g 1, we compute the bound: g 0 > log 2 ( ) 4 2 w O 1 ɛ poly ɛ tab w O 1 Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 21 / 31

40 Parameter space exploration 22 lots of parameters: α, β, α 0, β 0, α 1, β 1, α 2, β 2, g 0, g 1 to find the best decomposition: parameter space exploration Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 22 / 31

41 Parameter space exploration 22 lots of parameters: α, β, α 0, β 0, α 1, β 1, α 2, β 2, g 0, g 1 to find the best decomposition: parameter space exploration the parameter space is huge, so we need a heuristic: find all the acceptable decompositions such that ɛ poly + ɛ tab < 2 w O 1 for each candidate: compute the bounds for g 0 and g 1 fill the tables compute the width of the signals apply a user-defined score function (area, latency, multiplier size,...) the score determines the best decomposition trial-and-error method to decrease g 0 and g 1 Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 22 / 31

42 Results 23 Context The SMSO method Optimization Results Conclusion Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 23 / 31

43 Results: ROM size 24 ROM size (in bits) 120k 100k log exp sin 80k 60k 40k 20k Input / output precision w I = w O (in bits) 25 Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 24 / 31

44 Results: operator area 25 Operator area (in slices) lookup-multiply multipartite log exp sin Input / output precision w I = w O (in bits) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 25 / 31

45 Results: operator latency 26 Delay (in ns) multipartite log exp sin Input / output precision w I = w O (in bits) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 26 / 31

46 Results: using Virtex-II small multipliers 27 Function log Precision (w I = w O ) 16 bits 20 bits 24 bits Multiplier bit size not using area (slices) block multipliers delay (ns) using area (slices) block multipliers delay (ns) Function sin Precision (w I = w O ) (bits) Multiplier bit size not using area (slices) block multipliers delay (ns) using area (slices) block multipliers delay (ns) Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 27 / 31

47 Conclusion 28 Context The SMSO method Optimization Results Conclusion Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 28 / 31

48 Contribution 29 a novel function approximation method: second order: smaller tables only one small multiplier: shorter critical path, and can benefit from recent FPGA technologies (Virtex-II) accurate approximation and rounding error analysis automated exploration of the parameter space according to user-specified criteria Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 29 / 31

49 Future work 30 split the TO i s on several tables, as in the multipartite table method work on table compression techniques extend this method to higher order approximations Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 30 / 31

50 Thank you for your attention 31 Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 31 / 31

51 Thank you for your attention 31 Questions? Jérémie Detrey, Florent de Dinechin Second Order Function Approximation Using a Single Multiplication on FPGAs 31 / 31

Table-Based Polynomials for Fast Hardware Function Evaluation

ASAP 05 Table-Based Polynomials for Fast Hardware Function Evaluation Jérémie Detrey Florent de Dinechin Projet Arénaire LIP UMR CNRS ENS Lyon UCB Lyon INRIA 5668 http://www.ens-lyon.fr/lip/arenaire/ CENTRE