Computing correctly rounded logarithms with fixed-point operations. Julien Le Maire, Florent de Dinechin, Jean-Michel Muller and Nicolas Brunie

Size: px
Start display at page:

Download "Computing correctly rounded logarithms with fixed-point operations. Julien Le Maire, Florent de Dinechin, Jean-Michel Muller and Nicolas Brunie"

Transcription

1 Computing correctly rounded logarithms with fixed-point operations Julien Le Maire, Florent de Dinechin, Jean-Michel Muller and Nicolas Brunie

2 Outline Introduction and context Algorithm Results and comparisons Conclusions J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 2

3 Logarithm, the mathematical version y y = ln(x) x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 3

4 Logarithm, the mathematical version ln (a b) = ln (a) + ln (b) 2 y y = ln(x) x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 3

5 Logarithm, the mathematical version ln (a b) = ln (a) + ln (b) ln (b a ) = a ln(b) 2 y y = ln(x) x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 3

6 Logarithm, the mathematical version ln (a b) = ln (a) + ln (b) ln (b a ) = a ln(b) Taylor: for x small, ln(1 + x) x x 2 /2 + x 3 /3... y y = ln(x) x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 3

7 Logarithm, the floating-point version The floating point version of the natural logarithm is called log (you will also find log2 and log10 and a few others) 2 y x F 64 log(x) = (ln(x)) y = ln(x) x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 4

8 Logarithm, the floating-point version The floating point version of the natural logarithm is called log (you will also find log2 and log10 and a few others) 2 y x F 64 log(x) = (ln(x)) y = ln(x) x An experiment Implementing the floating-point logarithm function using only integer arithmetic for performance (previous work motivated by lack of FP hardware) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 4

9 Why using fixed-point arithmetic? bits IEEE-754 (64 bits) mainstream floating-point mainstream integer J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 5

10 Why using fixed-point arithmetic? bits IEEE-754 (64 bits) 64-bits mainstream floating-point mainstream integer 64-bit floating-point, but only 52-bit precision if you can predict the value of the exponent, exponent bits are wasted bits. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 5

11 Integer better than floating-point? modern 64-bit machines offer all sort of useful integer instructions addition multiplication 64x (mulq) count leading zeroes, shifts (lzcnt, bsr) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 6

12 Integer better than floating-point? modern 64-bit machines offer all sort of useful integer instructions addition multiplication 64x (mulq) count leading zeroes, shifts (lzcnt, bsr) most operations are faster on integers, especially addition (which more or less defines the processor cycle time) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 6

13 Integer better than floating-point? modern 64-bit machines offer all sort of useful integer instructions addition multiplication 64x (mulq) count leading zeroes, shifts (lzcnt, bsr) most operations are faster on integers, especially addition (which more or less defines the processor cycle time) multiprecision faster and simpler using integers J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 6

14 Integer better than floating-point? modern 64-bit machines offer all sort of useful integer instructions addition multiplication 64x (mulq) count leading zeroes, shifts (lzcnt, bsr) most operations are faster on integers, especially addition (which more or less defines the processor cycle time) multiprecision faster and simpler using integers small multiprecision out of the box: mainstream compilers (gcc, clang, icc) support int_128 addition 128x shift on two registers (add, adc) (shld, shrd) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 6

15 Integer better than floating-point? modern 64-bit machines offer all sort of useful integer instructions addition multiplication 64x (mulq) count leading zeroes, shifts (lzcnt, bsr) most operations are faster on integers, especially addition (which more or less defines the processor cycle time) multiprecision faster and simpler using integers small multiprecision out of the box: mainstream compilers (gcc, clang, icc) support int_128 addition 128x shift on two registers (add, adc) (shld, shrd) Caveat: integer SIMD/vector support still lagging behind FP (no vector multiplication) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 6

16 Outline Introduction and context Algorithm Results and comparisons Conclusions J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 7

17 On-demand accuracy Muller and Lefèvre solved the table maker dilema for ln Computing the log with an error enables correct rounding two consecutive floating-point numbers real numbers computed logarithm, with error margin J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 8

18 On-demand accuracy Muller and Lefèvre solved the table maker dilema for ln Computing the log with an error enables correct rounding two consecutive floating-point numbers real numbers computed logarithm, with error margin J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 8

19 On-demand accuracy Muller and Lefèvre solved the table maker dilema for ln Computing the log with an error enables correct rounding two consecutive floating-point numbers real numbers computed logarithm, with error margin CRLibm refinement of Ziv s technique: First step: quick-and-dirty evaluation of ln(x) (just accurate enough to ensure correct rounding in most cases) test if rounding can be decided if not (rarely), recompute ln(x) with the worst-case accuracy J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 8

20 On-demand accuracy Muller and Lefèvre solved the table maker dilema for ln Computing the log with an error enables correct rounding two consecutive floating-point numbers real numbers computed logarithm, with error margin CRLibm refinement of Ziv s technique: First step: quick-and-dirty evaluation of ln(x) (just accurate enough to ensure correct rounding in most cases) test if rounding can be decided if not (rarely), recompute ln(x) with the worst-case accuracy Trade-off between first and second steps: MeanTime = Time(1st step) + Pr[need 2nd step] Time(2nd step) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 8

21 On-demand accuracy Muller and Lefèvre solved the table maker dilema for ln Computing the log with an error enables correct rounding two consecutive floating-point numbers real numbers computed logarithm, with error margin CRLibm refinement of Ziv s technique: First step: quick-and-dirty evaluation of ln(x) (just accurate enough to ensure correct rounding in most cases) test if rounding can be decided if not (rarely), recompute ln(x) with the worst-case accuracy Trade-off between first and second steps: MeanTime = Time(1st step) + Pr[need 2nd step] Time(2nd step) Best so far: Time(2nd step) 10 Time(1st step) This work: Time(2nd step) 2 Time(1st step) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 8

22 The big picture 1. Filter special cases (negative numbers,,...) 2. Argument range reduction 3. Polynomial approximation 4. Solution reconstruction 5. Error evaluation and rounding test 6. If more accuracy needed: Rerun the steps 3 and 4 with the worst-case accuracy. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 9

23 First argument range reduction input = 2 E (1 + x) ln(input) = E ln (2) + ln (1 + x) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 10

24 First argument range reduction input = 2 E (1 + x) ln(input) = E ln (2) + ln (1 + x) Evaluation algorithm: approximate ln (1 + x) with a polynomial p(x) degree needed: at least 26 evaluate E ln (2) add both terms J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 10

25 Tang s range reduction 1 + x: 1. fractional part on 52 bits 1 + x 1 : 1. fractional part on 7 bits J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 11

26 Tang s range reduction 1 + x: 1. fractional part on 52 bits 1 + x 1 : 1. fractional part on 7 bits r x : 0. fractional part on 18 bits A table, addressed by x 1, the 7 most significand bits of x, stores r x 1 1 ( ) 1 and ln 1 + x x r x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 11

27 Tang s range reduction 1 + x: 1. fractional part on 52 bits 1 + x 1 : 1. fractional part on 7 bits r x : 0. fractional part on 18 bits 1 + y: fractional part on 64 bits plus 6 implicit zeros A table, addressed by x 1, the 7 most significand bits of x, stores r x 1 1 ( ) 1 and ln 1 + x x Define 1 + y = r x (1 + x) 1 Then ln(1 + x) = ln(1 + y) + ln( 1 r x ) r x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 11

28 Tang s range reduction algorithm 1 + x: 1. fractional part on 52 bits 1 + x 1 : 1. fractional part on 7 bits r x : 0. fractional part on 18 bits 1 + y: fractional part on 64 bits plus 6 implicit zeros Extract the index x 1 Read, from a table addressed by x 1, both r x and ln( 1 r x ) compute y = r x (1 + x) 1 (exactly) approximate ln (1 + y) with a polynomial p(y) Degree needed: 8 add it all: ln(input) E ln (2) + p (y) + ln( 1 ) r x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 12

29 Tang s range reduction 1 + x: 1. fractional part on 52 bits 1 + x 1 : 1. fractional part on 7 bits r x : 0. fractional part on 18 bits 1 + y: fractional part on 64 bits plus 6 implicit zeros y = r x (1 + x) 1 With 1 + x on 53 bits we can tabulate r x on 18 bits: the exact product would need 71 bits but we can predict the 7 leading bits... so we can let them overflow quietly and use a multiplication. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 13

30 Two levels of Tang reduction 1 + x: 1. fractional part on 52 bits 1 + x 1 : 0. fractional part on 7 bits r x : 0. fractional part on 9 bits 1 + y: y 1 : fractional part on 60 bits including 5 zeros. fractional part on 13 bits r y : 0. fractional part on 15 bits 1 + z: fractional part on 64 bits plus 12 implicit zeros x [0, 1) y [ 0, ) x 1 takes 64 different values z [ 0, ) y 1 takes 96 different values the whole reduction of x to z is computed exactly in 64-bit int. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 14

31 A few Pareto points in the design space Table size (bytes) degree 1st degree 2nd 39, , , , , J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 15

32 Why stop at two levels of reduction? Answer is: diminushing return. For a target accuracy of 2 60 : interval of x degree needed No reduction [ 1/2, 1/2] 29 1 level [ 2 7, 2 7 ] 8 2 levels [ 2 12, 2 12 ] 4 3 levels [ 2 18, 2 18 ] 3 Adding more levels will cost more operations than it saves... J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 16

33 Polynomial approximation (advertisement) We want to approximate log(1 + z) on an interval around 0. Use the (now standard) tool set to obtain it. Sollya: finds a machine-efficient polynomial P(z) computes a safe bound on the approximation error P(z) ln(1 + z) Gappa: bounds the accumulation of rounding errors when evaluating P(z) in C We obtain a Coq proof of the error: computed approximation of ln(1 + z), with relative error margin real numbers J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 17

34 Reconstructing the solution input = 2 e (1 + x) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18

35 Reconstructing the solution input = 2 e 1 r x (1 + y) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18

36 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18

37 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) ln(input) = e ln(2) + ln(r x 1 ) + ln(r y 1 ) + ln(1 + z) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18

38 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) ln(input) = e ln(2) + ln(r x 1 ) + ln(r y 1 ) + ln(1 + z) e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: If we can predict the exponents, exponent bits are wasted bits J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18

39 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) ln(input) = e ln(2) + ln(r x 1 ) + ln(r y 1 ) + ln(1 + z) e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: If we can predict the exponents, exponent bits are wasted bits J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18

40 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) ln(input) = e ln(2) + ln(r x 1 ) + ln(r y 1 ) + ln(1 + z) e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: If we can predict the exponents, exponent bits are wasted bits J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18

41 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) ln(input) = e ln(2) + ln(r x 1 ) + ln(r y 1 ) + ln(1 + z) e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: If we can predict the exponents, exponent bits are wasted bits J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18

42 Error evaluation e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 19

43 Error evaluation e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: ɛ < ( e ) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 19

44 Error evaluation e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: ɛ < ( e ) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 19

45 Error evaluation e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: ɛ < ( e P(z) 2 59) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 19

46 Rounding test Simple technique: compute the two bounds of the interval, and see if they round to the same mantissa (two additions, a xor and a shift) real numbers J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 20

47 Rounding test Simple technique: compute the two bounds of the interval, and see if they round to the same mantissa (two additions, a xor and a shift) real numbers J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 20

48 Rounding test Simple technique: compute the two bounds of the interval, and see if they round to the same mantissa (two additions, a xor and a shift) real numbers For comparison, the proof of the floating-point-based rounding test (invented by Ziv and used in CRLibm) is an 18-page paper that took 20 years to publish... J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 20

49 Second step Use 3 words instead of 2 for the precomputed ln(2) Use a much more accurate polynomial: with coefficients on 128 bits instead of 64 (but z is still only a 64-bit number) and using a higher degree polynomial J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 21

50 Outline Introduction and context Algorithm Results and comparisons Conclusions J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 22

51 Implementation parameters of correctly rounded implementations glibc crlibm-td crlibm-de cr-fixp degree pol. 1 3/ degree pol tables size 13 Kb 8192 bytes 6144 bytes 4032 bytes % accurate phase N/A J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 23

52 Average and max runing time (in processor cycles) Pentium timing (lower is better) cycles MKL glibc crlibm cr-de cr-fixp avg time max time 25 11, Timing breakdown on two processors cycles Core i5 Bostan System glibc newlib quick phase alone accurate phase alone both phases (avg time) both phases (max time) Slanted means: no correct rounding J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 24

53 Average and max runing time (in processor cycles) Pentium timing (lower is better) cycles MKL glibc crlibm cr-de cr-fixp avg time max time 25 11, Timing breakdown on two processors cycles Core i5 Bostan System glibc newlib quick phase alone accurate phase alone both phases (avg time) both phases (max time) Slanted means: no correct rounding J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 24

54 Average and max runing time (in processor cycles) Pentium timing (lower is better) cycles MKL glibc crlibm cr-de cr-fixp avg time max time 25 11, Timing breakdown on two processors cycles Core i5 Bostan System glibc newlib quick phase alone accurate phase alone both phases (avg time) both phases (max time) Slanted means: no correct rounding J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 24

55 Average and max runing time (in processor cycles) Pentium timing (lower is better) cycles MKL glibc crlibm cr-de cr-fixp avg time max time 25 11, Timing breakdown on two processors cycles Core i5 Bostan System glibc newlib quick phase alone accurate phase alone both phases (avg time) both phases (max time) Slanted means: no correct rounding J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 24

56 Outline Introduction and context Algorithm Results and comparisons Conclusions J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 25

57 Conclusions Better range reduction thanks to a wider format... leading to improvements in polynomial degree and table size Faster multiprecision due to higher precision Able to reuse some computations of the fast step in the accurate step Alternative rounding test for acurate step Probability to launch 2nd step is high, but this is acceptable since 2nd step is so cheap J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 26

58 Conclusions Better range reduction thanks to a wider format... leading to improvements in polynomial degree and table size Faster multiprecision due to higher precision Able to reuse some computations of the fast step in the accurate step Alternative rounding test for acurate step Probability to launch 2nd step is high, but this is acceptable since 2nd step is so cheap Competitive against state-of-the-art Worst case improved compared to others implementations Second step-only version is a viable alternative J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 26

59 Conclusions Better range reduction thanks to a wider format... leading to improvements in polynomial degree and table size Faster multiprecision due to higher precision Able to reuse some computations of the fast step in the accurate step Alternative rounding test for acurate step Probability to launch 2nd step is high, but this is acceptable since 2nd step is so cheap Competitive against state-of-the-art Worst case improved compared to others implementations Second step-only version is a viable alternative Limitations: Require 64 bits integer support No support for vectorization J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 26

60 Thanks for your attention Any question? J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 27

61 Outline Bonus: a floating-point in, fixed-point out variant Some code details on the solution reconstruction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 28

62 Floating-point in, fixed-point out output: fixed-point, 11 bits integer part, 53 bit fractional part integer part target absolute accuracy 2 52 fraction output absolute table Core i5 Bostan format accuracy size cycles cycles Fix Fix double (libm) Fix64 is the code of the first step only, without the conversion to float. tweak: poly degree 3 only for abs. accuracy 2 59 Fix128 is the code of the second step only, without the conversion to float. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 29

63 Motivation TKF91 : DNA sequence alignment algorithm dynamic programming algorithm: alignment as a path within a 2D array. borders of an array initialized with log-likelihoods then array filled using recurrence formulae that involve only max and + operations. All current implementations of this algorithm use a floating-point array, but int64 + and max are 1-cycle, vectorizable, and exact operations; absolute accuracy of initialization logs: up to 2 42 with FP log, 2 52 with FixP log. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 30

64 Only partial experiments Improvement in accuracy measured No noticeable improvement in performance J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 31

65 Outline Bonus: a floating-point in, fixed-point out variant Some code details on the solution reconstruction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 32

66 Special cases: businesss as usual / r e i n t e r p r e t x to m a n i p u l a t e i t s b i t s more e a s i l y / u i n t 6 4 _ t i n p u t b i t s = ( ( union { double d ; u i n t 6 4 _ t u ; }){ i n p u t } ). u ; i n t xe = i n p u t b i t s >> 5 2 ; / f i l t e r t h e s p e c i a l c a s e s :! ( x i s n o r m a l i z e d and 0 < x < +I n f ) i f (0 x7feu <= ( unsigned ) xe 1u ) { / x = + 0 : r a i s e a D i v i d e B y z e r o, r e t u r n I n f / i f ( ( x b i t s & ~(1 u l l << 6 3 ) ) == 0) r e t u r n 1.0/0.0; / x < 0. 0 : r a i s e a I n v a l i d O p e r a t i o n, r e t u r n a qnan / i f ( ( x b i t s & (1 u l l << 6 3 ) )!= 0) r e t u r n ( x x ) / 0 ; / x = qnan : r e t u r n a qnan x = snan : r a i s e a I n v a l i d O p e r a t i o n, r e t u r n a qnan x = +I n f : r e t u r n +I n f / i f ( xe!= 0) r e t u r n x+x ; / x subnormal : change x to a n o r m a l i z e d number / e l s e { i n t u = c l z 6 4 ( x b i t s ) 1 2 ; x b i t s <<= u + 1 ; xe = u ; } } xe = ; Only interesting line: the subnormal management J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 33

67 Argument reduction / i n p u t = 2^ xe (1 + X) / u i n t 6 4 _ t x = i n p u t b i t s & 0 xfffffffffffffull ; / 1 + X = ( 1/ Ri ) Y / u i n t 8 _ t x1 = x >> (52 ARG_REDUC_1_PREC ) ; u i n t 6 4 _ t y = argreduc1_inv [ x1 ] ( x + (1 u l l << 5 2 ) ) ; b u i l t i n _ p r e f e t c h (& argreduc1_log [ x1 ], 0, 0 ) ; / Y = (1/ S ) (1 + Z) / / w i t h dz = dz /2^(52 + ARG_REDUC_1_SIZE + ARG_REDUC_2_SIZE) and 1/S = argreduc2 [ s i ]. v a l /2^ARG_REDUC_2_SIZE / u i n t 8 _ t y1 = ( y >> ( 52 + ARG_REDUC_1_SIZE ARG_REDUC_2_PREC) ) ( 1 u << ARG_REDUC_2_PREC ) ; u i n t 6 4 _ t z = argreduc2_inv [ y1 ] y ; // +1 p a r t removed by o v e r f l o w b u i l t i n _ p r e f e t c h (& argreduc2_log [ y1 ], 0, 0 ) ; J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 34

68 Polynomial evaluation / P o l y n o m i a l a p p r o x i m a t i o n o f l o g (1+Z)/Z ~= P(Z) and Z P(Z) / s t a t i c c o n s t u i n t 6 4 _ t a4 = UINT64_C (0 x 3 f f c c b ) ; s t a t i c c o n s t u i n t 6 4 _ t a3 = UINT64_C (0 x d c 7 0 f 0 d f d ) ; s t a t i c c o n s t u i n t 6 4 _ t a2 = UINT64_C (0 x 7 f f f f f f f f f d f d ) ; s t a t i c c o n s t u i n t 6 4 _ t a1 = UINT64_C (0 x f f f f f f f f f f f f f f f a ) ; u i n t 6 4 _ t pz = a1 ( highmul ( z, a2 ( highmul ( z, a3 ( highmul ( z, a4 ) >> 12) ) >> 12) ) >> 1 2 ) ; u i n t _ t zpz = f u l l m u l ( z, pz ) ; / P o l y n o m i a l a p p r o x i m a t i o n w i t h o u t s h i f t s : r e p l a c e highmul ( z, a ) >> 12 w i t h highmul ( z, a >> 12) / s t a t i c c o n s t u i n t 6 4 _ t a4 = UINT64_C (0 x f f c ) ; s t a t i c c o n s t u i n t 6 4 _ t a3 = UINT64_C (0 x dc6 ) ; s t a t i c c o n s t u i n t 6 4 _ t a2 = UINT64_C (0 x f f f f f f f f f d 5 7 ) ; s t a t i c c o n s t u i n t 6 4 _ t a1 = UINT64_C (0 x f f f f f f f f f f f f f f f a ) ; u i n t 6 4 _ t pz = a1 highmul ( z, a2 highmul ( z, a3 highmul ( z, a4 ) ) ) ; u i n t _ t zpz = f u l l m u l ( z, pz ) ; J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 35

69 Reconstructing the solution / Compute p a r t o f t h e r e s u l t t h a t don t depend on Z ( xe l o g ( 2 ) + l o g (1/ Ri ) + l o g (1/ S i ) ) / u i n t _ t c s t p a r t = f u l l i m u l ( xe, log2fw_mid ) + UINT128 ( ( i n t 6 4 _ t ) xe log2fw_high, 0) // f u l l m u l not needed + UINT128 ( argreduc1 [ r i ]. l o g _ hi, argreduc1 [ r i ]. l o g _ l o ) + UINT128 ( argreduc2 [ s i ]. l o g_hi, argreduc2 [ s i ]. l o g _ l o ) ; / P o l y n o m i a l a p p r o x i m a t i o n o f l o g (1+Z)/Z ~= P(Z) and Z P(Z) /... / Assemble t h e two p a r t s, compute s i g n, m a n t i s s a and exponent / u i n t _ t l o n g r e s = c s t p a r t + ( zpz >> (11 + IMPLICIT_ZEROS ) ) ; u i n t 6 4 _ t s i g n = ( HI ( l o n g r e s ) >> 6 3 ) ; // 0 o r ~0 // i f s i g n!= 0, t h i s i s l o n g r e s = ~ l o n g r e s ( a = ~a + 1) // to a v o i d t h e +1 approx, do : // l o n g r e s = ( ( i n t 6 4 _ t ) s i g n + l o n g r e s ) ^ UINT128 ( s i g n, s i g n ) ; l o n g r e s ^= UINT128 ( s i g n, s i g n ) ; i n t u = c l z 6 4 ( HI ( l o n g r e s ) ) + 1 ; i n t exponent = 11 u ; u i n t 6 4 _ t m a n t i s s a = ( HI ( l o n g r e s ) << u ) (LO( l o n g r e s ) >> (64 u ) ) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 36

70 Rounding test and conversion / Compute t h e maximal a b s o l u t e e r r o r ( a l i g n e d w i t h l o n g r e s ) : abs ( xe ) f o r xe l o g ( 2 ), l o g (1/ Ri ) and l o g (1/ S i ) zpz >>(POLYNOMIAL_PREC+IMPLICIT_ZEROS+11) f o r t h e p o l y n o I f r e s u l t (1 + maxrelerr ) a r e not rounded to t h e same number, we u i n t 6 4 _ t maxabserr = 3 + abs ( xe ) + ( HI ( zpz ) >> (POLYNOMIAL_PREC + u i n t 6 4 _ t maxrelerr = ( maxabserr >> (64 u ) ) + 1 ; i f ( ( ( m a n t i s s a + maxrelerr ) ^ ( m a n t i s s a maxrelerr ) ) >> 11) { r e t u r n l o g _ r n _ a c c u r a t e ( c s t p a r t, z, xe ) ; } / Assemble t h e computed r e s u l t / u i n t 6 4 _ t r e s u l t b i t s = ( ( u i n t 6 4 _ t ) s i g n << 63) + ( ( u i n t 6 4 _ t ) ( exponent +1023) << 52) + ( m a n t i s s a >> 12) + ( ( m a n t i s s a >> 11) & 1 ) ; / round to n e a r e s t / r e t u r n ( union { u i n t 6 4 _ t u ; double d ; }){ r e s u l t b i t s }. d ; J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 37

71 Outline Bonus: a floating-point in, fixed-point out variant Some code details on the solution reconstruction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 38

72 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39

73 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: 1 floating-point fraction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39

74 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: 1 floating-point fraction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39

75 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: 1 floating-point fraction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39

76 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: 1 floating-point fraction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39

77 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: 1 floating-point fraction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39

Automated design of floating-point logarithm functions on integer processors

Automated design of floating-point logarithm functions on integer processors 23rd IEEE Symposium on Computer Arithmetic Santa Clara, CA, USA, 10-13 July 2016 Automated design of floating-point logarithm functions on integer processors Guillaume Revy (presented by Florent de Dinechin)

More information

Computing Machine-Efficient Polynomial Approximations

Computing Machine-Efficient Polynomial Approximations Computing Machine-Efficient Polynomial Approximations N. Brisebarre, S. Chevillard, G. Hanrot, J.-M. Muller, D. Stehlé, A. Tisserand and S. Torres Arénaire, LIP, É.N.S. Lyon Journées du GDR et du réseau

More information

Designing a Correct Numerical Algorithm

Designing a Correct Numerical Algorithm Intro Implem Errors Sollya Gappa Norm Conc Christoph Lauter Guillaume Melquiond March 27, 2013 Intro Implem Errors Sollya Gappa Norm Conc Outline 1 Introduction 2 Implementation theory 3 Error analysis

More information

Optimizing Scientific Libraries for the Itanium

Optimizing Scientific Libraries for the Itanium 0 Optimizing Scientific Libraries for the Itanium John Harrison Intel Corporation Gelato Federation Meeting, HP Cupertino May 25, 2005 1 Quick summary Intel supplies drop-in replacement versions of common

More information

How do computers represent numbers?

How do computers represent numbers? How do computers represent numbers? Tips & Tricks Week 1 Topics in Scientific Computing QMUL Semester A 2017/18 1/10 What does digital mean? The term DIGITAL refers to any device that operates on discrete

More information

ECS 231 Computer Arithmetic 1 / 27

ECS 231 Computer Arithmetic 1 / 27 ECS 231 Computer Arithmetic 1 / 27 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 2 / 27 Outline 1 Floating-point numbers

More information

On various ways to split a floating-point number

On various ways to split a floating-point number On various ways to split a floating-point number Claude-Pierre Jeannerod Jean-Michel Muller Paul Zimmermann Inria, CNRS, ENS Lyon, Université de Lyon, Université de Lorraine France ARITH-25 June 2018 -2-

More information

Rigorous Polynomial Approximations and Applications

Rigorous Polynomial Approximations and Applications Rigorous Polynomial Approximations and Applications Mioara Joldeș under the supervision of: Nicolas Brisebarre and Jean-Michel Muller École Normale Supérieure de Lyon, Arénaire Team, Laboratoire de l Informatique

More information

Numerical Methods - Preliminaries

Numerical Methods - Preliminaries Numerical Methods - Preliminaries Y. K. Goh Universiti Tunku Abdul Rahman 2013 Y. K. Goh (UTAR) Numerical Methods - Preliminaries 2013 1 / 58 Table of Contents 1 Introduction to Numerical Methods Numerical

More information

Continued fractions and number systems: applications to correctly-rounded implementations of elementary functions and modular arithmetic.

Continued fractions and number systems: applications to correctly-rounded implementations of elementary functions and modular arithmetic. Continued fractions and number systems: applications to correctly-rounded implementations of elementary functions and modular arithmetic. Mourad Gouicem PEQUAN Team, LIP6/UPMC Nancy, France May 28 th 2013

More information

Mathematical preliminaries and error analysis

Mathematical preliminaries and error analysis Mathematical preliminaries and error analysis Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan September 12, 2015 Outline 1 Round-off errors and computer arithmetic

More information

Faithful Horner Algorithm

Faithful Horner Algorithm ICIAM 07, Zurich, Switzerland, 16 20 July 2007 Faithful Horner Algorithm Philippe Langlois, Nicolas Louvet Université de Perpignan, France http://webdali.univ-perp.fr Ph. Langlois (Université de Perpignan)

More information

Number Representation and Waveform Quantization

Number Representation and Waveform Quantization 1 Number Representation and Waveform Quantization 1 Introduction This lab presents two important concepts for working with digital signals. The first section discusses how numbers are stored in memory.

More information

Automated design of floating-point logarithm functions on integer processors

Automated design of floating-point logarithm functions on integer processors Automated design of floating-point logarithm functions on integer processors Guillaume Revy To cite this version: Guillaume Revy. Automated design of floating-point logarithm functions on integer processors.

More information

8/13/16. Data analysis and modeling: the tools of the trade. Ø Set of numbers. Ø Binary representation of numbers. Ø Floating points.

8/13/16. Data analysis and modeling: the tools of the trade. Ø Set of numbers. Ø Binary representation of numbers. Ø Floating points. Data analysis and modeling: the tools of the trade Patrice Koehl Department of Biological Sciences National University of Singapore http://www.cs.ucdavis.edu/~koehl/teaching/bl5229 koehl@cs.ucdavis.edu

More information

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science EAD 115 Numerical Solution of Engineering and Scientific Problems David M. Rocke Department of Applied Science Computer Representation of Numbers Counting numbers (unsigned integers) are the numbers 0,

More information

Chapter 4 Number Representations

Chapter 4 Number Representations Chapter 4 Number Representations SKEE2263 Digital Systems Mun im/ismahani/izam {munim@utm.my,e-izam@utm.my,ismahani@fke.utm.my} February 9, 2016 Table of Contents 1 Fundamentals 2 Signed Numbers 3 Fixed-Point

More information

Computer Arithmetic. MATH 375 Numerical Analysis. J. Robert Buchanan. Fall Department of Mathematics. J. Robert Buchanan Computer Arithmetic

Computer Arithmetic. MATH 375 Numerical Analysis. J. Robert Buchanan. Fall Department of Mathematics. J. Robert Buchanan Computer Arithmetic Computer Arithmetic MATH 375 Numerical Analysis J. Robert Buchanan Department of Mathematics Fall 2013 Machine Numbers When performing arithmetic on a computer (laptop, desktop, mainframe, cell phone,

More information

Ex code

Ex code Ex. 8.4 7-4-2-1 code Codeconverter 7-4-2-1-code to BCD-code. When encoding the digits 0... 9 sometimes in the past a code having weights 7-4-2-1 instead of the binary code weights 8-4-2-1 was used. In

More information

9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017

9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 9. Datapath Design Jacob Abraham Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 October 2, 2017 ECE Department, University of Texas at Austin

More information

Chapter 1: Introduction and mathematical preliminaries

Chapter 1: Introduction and mathematical preliminaries Chapter 1: Introduction and mathematical preliminaries Evy Kersalé September 26, 2011 Motivation Most of the mathematical problems you have encountered so far can be solved analytically. However, in real-life,

More information

Lecture 7. Floating point arithmetic and stability

Lecture 7. Floating point arithmetic and stability Lecture 7 Floating point arithmetic and stability 2.5 Machine representation of numbers Scientific notation: 23 }{{} }{{} } 3.14159265 {{} }{{} 10 sign mantissa base exponent (significand) s m β e A floating

More information

Tunable Floating-Point for Energy Efficient Accelerators

Tunable Floating-Point for Energy Efficient Accelerators Tunable Floating-Point for Energy Efficient Accelerators Alberto Nannarelli DTU Compute, Technical University of Denmark 25 th IEEE Symposium on Computer Arithmetic A. Nannarelli (DTU Compute) Tunable

More information

Comparison between binary and decimal floating-point numbers

Comparison between binary and decimal floating-point numbers 1 Comparison between binary and decimal floating-point numbers Nicolas Brisebarre, Christoph Lauter, Marc Mezzarobba, and Jean-Michel Muller Abstract We introduce an algorithm to compare a binary floating-point

More information

Newton-Raphson Algorithms for Floating-Point Division Using an FMA

Newton-Raphson Algorithms for Floating-Point Division Using an FMA Newton-Raphson Algorithms for Floating-Point Division Using an FMA Nicolas Louvet, Jean-Michel Muller, Adrien Panhaleux Abstract Since the introduction of the Fused Multiply and Add (FMA) in the IEEE-754-2008

More information

PowerPoints organized by Dr. Michael R. Gustafson II, Duke University

PowerPoints organized by Dr. Michael R. Gustafson II, Duke University Part 1 Chapter 4 Roundoff and Truncation Errors PowerPoints organized by Dr. Michael R. Gustafson II, Duke University All images copyright The McGraw-Hill Companies, Inc. Permission required for reproduction

More information

Topic 17. Analysis of Algorithms

Topic 17. Analysis of Algorithms Topic 17 Analysis of Algorithms Analysis of Algorithms- Review Efficiency of an algorithm can be measured in terms of : Time complexity: a measure of the amount of time required to execute an algorithm

More information

Lecture 11. Advanced Dividers

Lecture 11. Advanced Dividers Lecture 11 Advanced Dividers Required Reading Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design Chapter 15 Variation in Dividers 15.3, Combinational and Array Dividers Chapter 16, Division

More information

The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor

The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor Vincent Lefèvre 11th July 2005 Abstract This paper is a complement of the research report The Euclidean division implemented

More information

Notes for Chapter 1 of. Scientific Computing with Case Studies

Notes for Chapter 1 of. Scientific Computing with Case Studies Notes for Chapter 1 of Scientific Computing with Case Studies Dianne P. O Leary SIAM Press, 2008 Mathematical modeling Computer arithmetic Errors 1999-2008 Dianne P. O'Leary 1 Arithmetic and Error What

More information

The Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal

The Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal -1- The Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal Claude-Pierre Jeannerod Jean-Michel Muller Antoine Plet Inria,

More information

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Catie Baker Spring 2015

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Catie Baker Spring 2015 CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis Catie Baker Spring 2015 Today Registration should be done. Homework 1 due 11:59pm next Wednesday, April 8 th. Review math

More information

Arithmetic and Error. How does error arise? How does error arise? Notes for Part 1 of CMSC 460

Arithmetic and Error. How does error arise? How does error arise? Notes for Part 1 of CMSC 460 Notes for Part 1 of CMSC 460 Dianne P. O Leary Preliminaries: Mathematical modeling Computer arithmetic Errors 1999-2006 Dianne P. O'Leary 1 Arithmetic and Error What we need to know about error: -- how

More information

CR-LIBM A library of correctly rounded elementary functions in double-precision

CR-LIBM A library of correctly rounded elementary functions in double-precision CR-LIBM A library of correctly rounded elementary functions in double-precision Catherine Daramy-Loirat, David Defour, Florent De Dinechin, Matthieu Gallet, Nicolas Gast, Christoph Lauter, Jean-Michel

More information

Thanks to: University of Illinois at Chicago NSF CCR Alfred P. Sloan Foundation

Thanks to: University of Illinois at Chicago NSF CCR Alfred P. Sloan Foundation The Poly1305-AES message-authentication code D. J. Bernstein Thanks to: University of Illinois at Chicago NSF CCR 9983950 Alfred P. Sloan Foundation The AES function ( Rijndael 1998 Daemen Rijmen; 2001

More information

Optimizing MPC for robust and scalable integer and floating-point arithmetic

Optimizing MPC for robust and scalable integer and floating-point arithmetic Optimizing MPC for robust and scalable integer and floating-point arithmetic Liisi Kerik * Peeter Laud * Jaak Randmets * * Cybernetica AS University of Tartu, Institute of Computer Science January 30,

More information

Tight and rigourous error bounds for basic building blocks of double-word arithmetic. Mioara Joldes, Jean-Michel Muller, Valentina Popescu

Tight and rigourous error bounds for basic building blocks of double-word arithmetic. Mioara Joldes, Jean-Michel Muller, Valentina Popescu Tight and rigourous error bounds for basic building blocks of double-word arithmetic Mioara Joldes, Jean-Michel Muller, Valentina Popescu SCAN - September 2016 AMPAR CudA Multiple Precision ARithmetic

More information

Getting tight error bounds in floating-point arithmetic: illustration with complex functions, and the real x n function

Getting tight error bounds in floating-point arithmetic: illustration with complex functions, and the real x n function Getting tight error bounds in floating-point arithmetic: illustration with complex functions, and the real x n function Jean-Michel Muller collaborators: S. Graillat, C.-P. Jeannerod, N. Louvet V. Lefèvre

More information

Introduction to Randomized Algorithms III

Introduction to Randomized Algorithms III Introduction to Randomized Algorithms III Joaquim Madeira Version 0.1 November 2017 U. Aveiro, November 2017 1 Overview Probabilistic counters Counting with probability 1 / 2 Counting with probability

More information

SOBER Cryptanalysis. Daniel Bleichenbacher and Sarvar Patel Bell Laboratories Lucent Technologies

SOBER Cryptanalysis. Daniel Bleichenbacher and Sarvar Patel Bell Laboratories Lucent Technologies SOBER Cryptanalysis Daniel Bleichenbacher and Sarvar Patel {bleichen,sarvar}@lucent.com Bell Laboratories Lucent Technologies Abstract. SOBER is a new stream cipher that has recently been developed by

More information

Floating Point Number Systems. Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le

Floating Point Number Systems. Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le Floating Point Number Systems Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le 1 Overview Real number system Examples Absolute and relative errors Floating point numbers Roundoff

More information

Toward Correctly Rounded Transcendentals

Toward Correctly Rounded Transcendentals IEEE TRANSACTIONS ON COMPUTERS, VOL. 47, NO. 11, NOVEMBER 1998 135 Toward Correctly Rounded Transcendentals Vincent Lefèvre, Jean-Michel Muller, Member, IEEE, and Arnaud Tisserand Abstract The Table Maker

More information

Optimizing the Representation of Intervals

Optimizing the Representation of Intervals Optimizing the Representation of Intervals Javier D. Bruguera University of Santiago de Compostela, Spain Numerical Sofware: Design, Analysis and Verification Santander, Spain, July 4-6 2012 Contents 1

More information

Innocuous Double Rounding of Basic Arithmetic Operations

Innocuous Double Rounding of Basic Arithmetic Operations Innocuous Double Rounding of Basic Arithmetic Operations Pierre Roux ISAE, ONERA Double rounding occurs when a floating-point value is first rounded to an intermediate precision before being rounded to

More information

Numerical Mathematical Analysis

Numerical Mathematical Analysis Numerical Mathematical Analysis Numerical Mathematical Analysis Catalin Trenchea Department of Mathematics University of Pittsburgh September 20, 2010 Numerical Mathematical Analysis Math 1070 Numerical

More information

Lecture 2: Number Representations (2)

Lecture 2: Number Representations (2) Lecture 2: Number Representations (2) ECE 645 Computer Arithmetic 1/29/08 ECE 645 Computer Arithmetic Lecture Roadmap Number systems (cont'd) Floating point number system representations Residue number

More information

UNIT V FINITE WORD LENGTH EFFECTS IN DIGITAL FILTERS PART A 1. Define 1 s complement form? In 1,s complement form the positive number is represented as in the sign magnitude form. To obtain the negative

More information

Applied Mathematics 205. Unit 0: Overview of Scientific Computing. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit 0: Overview of Scientific Computing. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit 0: Overview of Scientific Computing Lecturer: Dr. David Knezevic Scientific Computing Computation is now recognized as the third pillar of science (along with theory and experiment)

More information

Gal s Accurate Tables Method Revisited

Gal s Accurate Tables Method Revisited Gal s Accurate Tables Method Revisited Damien Stehlé UHP/LORIA 615 rue du jardin botanique F-5460 Villers-lès-Nancy Cedex stehle@loria.fr Paul Zimmermann INRIA Lorraine/LORIA 615 rue du jardin botanique

More information

EE 354 Fall 2013 Lecture 10 The Sampling Process and Evaluation of Difference Equations

EE 354 Fall 2013 Lecture 10 The Sampling Process and Evaluation of Difference Equations EE 354 Fall 203 Lecture 0 The Sampling Process and Evaluation of Difference Equations Digital Signal Processing (DSP) is centered around the idea that you can convert an analog signal to a digital signal

More information

Accurate polynomial evaluation in floating point arithmetic

Accurate polynomial evaluation in floating point arithmetic in floating point arithmetic Université de Perpignan Via Domitia Laboratoire LP2A Équipe de recherche en Informatique DALI MIMS Seminar, February, 10 th 2006 General motivation Provide numerical algorithms

More information

Software implementation of ECC

Software implementation of ECC Software implementation of ECC Radboud University, Nijmegen, The Netherlands June 4, 2015 Summer school on real-world crypto and privacy Šibenik, Croatia Software implementation of (H)ECC Radboud University,

More information

ALU (3) - Division Algorithms

ALU (3) - Division Algorithms HUMBOLDT-UNIVERSITÄT ZU BERLIN INSTITUT FÜR INFORMATIK Lecture 12 ALU (3) - Division Algorithms Sommersemester 2002 Leitung: Prof. Dr. Miroslaw Malek www.informatik.hu-berlin.de/rok/ca CA - XII - ALU(3)

More information

The Hash Function JH 1

The Hash Function JH 1 The Hash Function JH 1 16 January, 2011 Hongjun Wu 2,3 wuhongjun@gmail.com 1 The design of JH is tweaked in this report. The round number of JH is changed from 35.5 to 42. This new version may be referred

More information

z = log loglog

z = log loglog Name: Units do not have to be included. 2016 2017 Log1 Contest Round 2 Theta Logs and Exponents points each 1 Write in logarithmic form: 2 = 1 8 2 Evaluate: log 5 0 log 5 8 (log 2 log 6) Simplify the expression

More information

Computation of dot products in finite fields with floating-point arithmetic

Computation of dot products in finite fields with floating-point arithmetic Computation of dot products in finite fields with floating-point arithmetic Stef Graillat Joint work with Jérémy Jean LIP6/PEQUAN - Université Pierre et Marie Curie (Paris 6) - CNRS Computer-assisted proofs

More information

Lecture 1: Introduction to computation

Lecture 1: Introduction to computation UNIVERSITY OF WESTERN ONTARIO LONDON ONTARIO Paul Klein Office: SSC 4044 Extension: 85484 Email: pklein2@uwo.ca URL: http://paulklein.ca/newsite/teaching/619.php Economics 9619 Computational methods in

More information

Random Number Generation. Stephen Booth David Henty

Random Number Generation. Stephen Booth David Henty Random Number Generation Stephen Booth David Henty Introduction Random numbers are frequently used in many types of computer simulation Frequently as part of a sampling process: Generate a representative

More information

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2019

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2019 CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis Ruth Anderson Winter 2019 Today Algorithm Analysis What do we care about? How to compare two algorithms Analyzing Code Asymptotic Analysis

More information

Parallel Reproducible Summation

Parallel Reproducible Summation Parallel Reproducible Summation James Demmel Mathematics Department and CS Division University of California at Berkeley Berkeley, CA 94720 demmel@eecs.berkeley.edu Hong Diep Nguyen EECS Department University

More information

Algorithms and Their Complexity

Algorithms and Their Complexity CSCE 222 Discrete Structures for Computing David Kebo Houngninou Algorithms and Their Complexity Chapter 3 Algorithm An algorithm is a finite sequence of steps that solves a problem. Computational complexity

More information

Computing logarithms and other special functions

Computing logarithms and other special functions Computing logarithms and other special functions Mike Giles University of Oxford Mathematical Institute Napier 400 NAIS Symposium April 2, 2014 Mike Giles (Oxford) Computing special functions April 2,

More information

Binary floating point

Binary floating point Binary floating point Notes for 2017-02-03 Why do we study conditioning of problems? One reason is that we may have input data contaminated by noise, resulting in a bad solution even if the intermediate

More information

What Every Programmer Should Know About Floating-Point Arithmetic DRAFT. Last updated: November 3, Abstract

What Every Programmer Should Know About Floating-Point Arithmetic DRAFT. Last updated: November 3, Abstract What Every Programmer Should Know About Floating-Point Arithmetic Last updated: November 3, 2014 Abstract The article provides simple answers to the common recurring questions of novice programmers about

More information

Assignment 5 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Assignment 5 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Assignment 5 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. For the given functions f and g, find the requested composite function. 1) f(x)

More information

Fast and accurate Bessel function computation

Fast and accurate Bessel function computation 0 Fast and accurate Bessel function computation John Harrison, Intel Corporation ARITH-19 Portland, OR Tue 9th June 2009 (11:00 11:30) 1 Bessel functions and their computation Bessel functions are certain

More information

1. Write a program to calculate distance traveled by light

1. Write a program to calculate distance traveled by light G. H. R a i s o n i C o l l e g e O f E n g i n e e r i n g D i g d o h H i l l s, H i n g n a R o a d, N a g p u r D e p a r t m e n t O f C o m p u t e r S c i e n c e & E n g g P r a c t i c a l M a

More information

Equations with regular-singular points (Sect. 5.5).

Equations with regular-singular points (Sect. 5.5). Equations with regular-singular points (Sect. 5.5). Equations with regular-singular points. s: Equations with regular-singular points. Method to find solutions. : Method to find solutions. Recall: The

More information

Table-Based Polynomials for Fast Hardware Function Evaluation

Table-Based Polynomials for Fast Hardware Function Evaluation ASAP 05 Table-Based Polynomials for Fast Hardware Function Evaluation Jérémie Detrey Florent de Dinechin Projet Arénaire LIP UMR CNRS ENS Lyon UCB Lyon INRIA 5668 http://www.ens-lyon.fr/lip/arenaire/ CENTRE

More information

arxiv: v1 [cs.na] 8 Feb 2016

arxiv: v1 [cs.na] 8 Feb 2016 Toom-Coo Multiplication: Some Theoretical and Practical Aspects arxiv:1602.02740v1 [cs.na] 8 Feb 2016 M.J. Kronenburg Abstract Toom-Coo multiprecision multiplication is a well-nown multiprecision multiplication

More information

EECS150 - Digital Design Lecture 21 - Design Blocks

EECS150 - Digital Design Lecture 21 - Design Blocks EECS150 - Digital Design Lecture 21 - Design Blocks April 3, 2012 John Wawrzynek Spring 2012 EECS150 - Lec21-db3 Page 1 Fixed Shifters / Rotators fixed shifters hardwire the shift amount into the circuit.

More information

Abstract. Ariane 5. Ariane 5, arithmetic overflow, exact rounding and Diophantine approximation

Abstract. Ariane 5. Ariane 5, arithmetic overflow, exact rounding and Diophantine approximation May 11, 2019 Suda Neu-Tech Institute Sanmenxia, Henan, China EV Summit Forum, Sanmenxia Suda Energy Conservation &NewEnergyTechnologyResearchInstitute Ariane 5, arithmetic overflow, exact rounding and

More information

Math Practice Exam 3 - solutions

Math Practice Exam 3 - solutions Math 181 - Practice Exam 3 - solutions Problem 1 Consider the function h(x) = (9x 2 33x 25)e 3x+1. a) Find h (x). b) Find all values of x where h (x) is zero ( critical values ). c) Using the sign pattern

More information

Second Order Function Approximation Using a Single Multiplication on FPGAs

Second Order Function Approximation Using a Single Multiplication on FPGAs FPL 04 Second Order Function Approximation Using a Single Multiplication on FPGAs Jérémie Detrey Florent de Dinechin Projet Arénaire LIP UMR CNRS ENS Lyon UCB Lyon INRIA 5668 http://www.ens-lyon.fr/lip/arenaire/

More information

Chapter 1 Computer Arithmetic

Chapter 1 Computer Arithmetic Numerical Analysis (Math 9372) 2017-2016 Chapter 1 Computer Arithmetic 1.1 Introduction Numerical analysis is a way to solve mathematical problems by special procedures which use arithmetic operations

More information

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018 CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis Ruth Anderson Winter 2018 Today Algorithm Analysis What do we care about? How to compare two algorithms Analyzing Code Asymptotic Analysis

More information

ACM 106a: Lecture 1 Agenda

ACM 106a: Lecture 1 Agenda 1 ACM 106a: Lecture 1 Agenda Introduction to numerical linear algebra Common problems First examples Inexact computation What is this course about? 2 Typical numerical linear algebra problems Systems of

More information

Formal verification of IA-64 division algorithms

Formal verification of IA-64 division algorithms Formal verification of IA-64 division algorithms 1 Formal verification of IA-64 division algorithms John Harrison Intel Corporation IA-64 overview HOL Light overview IEEE correctness Division on IA-64

More information

Round-off Errors and Computer Arithmetic - (1.2)

Round-off Errors and Computer Arithmetic - (1.2) Round-off Errors and Comuter Arithmetic - (.). Round-off Errors: Round-off errors is roduced when a calculator or comuter is used to erform real number calculations. That is because the arithmetic erformed

More information

Remainders. We learned how to multiply and divide in elementary

Remainders. We learned how to multiply and divide in elementary Remainders We learned how to multiply and divide in elementary school. As adults we perform division mostly by pressing the key on a calculator. This key supplies the quotient. In numerical analysis and

More information

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018

CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018 CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis Ruth Anderson Winter 2018 Today Algorithm Analysis What do we care about? How to compare two algorithms Analyzing Code Asymptotic Analysis

More information

Proposal to Improve Data Format Conversions for a Hybrid Number System Processor

Proposal to Improve Data Format Conversions for a Hybrid Number System Processor Proposal to Improve Data Format Conversions for a Hybrid Number System Processor LUCIAN JURCA, DANIEL-IOAN CURIAC, AUREL GONTEAN, FLORIN ALEXA Department of Applied Electronics, Department of Automation

More information

MATH20602 Numerical Analysis 1

MATH20602 Numerical Analysis 1 M\cr NA Manchester Numerical Analysis MATH20602 Numerical Analysis 1 Martin Lotz School of Mathematics The University of Manchester Manchester, February 1, 2016 Outline General Course Information Introduction

More information

Chapter 1 Error Analysis

Chapter 1 Error Analysis Chapter 1 Error Analysis Several sources of errors are important for numerical data processing: Experimental uncertainty: Input data from an experiment have a limited precision. Instead of the vector of

More information

Polynomial multiplication and division using heap.

Polynomial multiplication and division using heap. Polynomial multiplication and division using heap. Michael Monagan and Roman Pearce Department of Mathematics, Simon Fraser University. Abstract We report on new code for sparse multivariate polynomial

More information

Introduction to Scientific Computing

Introduction to Scientific Computing (Lecture 2: Machine precision and condition number) B. Rosić, T.Moshagen Institute of Scientific Computing General information :) 13 homeworks (HW) Work in groups of 2 or 3 people Each HW brings maximally

More information

Math 128A: Homework 2 Solutions

Math 128A: Homework 2 Solutions Math 128A: Homework 2 Solutions Due: June 28 1. In problems where high precision is not needed, the IEEE standard provides a specification for single precision numbers, which occupy 32 bits of storage.

More information

Short Division of Long Integers. (joint work with David Harvey)

Short Division of Long Integers. (joint work with David Harvey) Short Division of Long Integers (joint work with David Harvey) Paul Zimmermann October 6, 2011 The problem to be solved Divide efficiently a p-bit floating-point number by another p-bit f-p number in the

More information

This Unit: Arithmetic. CIS 371 Computer Organization and Design. Pre-Class Exercise. Readings

This Unit: Arithmetic. CIS 371 Computer Organization and Design. Pre-Class Exercise. Readings This Unit: Arithmetic CI 371 Computer Organization and Design Unit 3: Arithmetic Based on slides by Prof. Amir Roth & Prof. Milo Martin App App App ystem software Mem CPU I/O A little review Binary + 2s

More information

Midterm Review. Igor Yanovsky (Math 151A TA)

Midterm Review. Igor Yanovsky (Math 151A TA) Midterm Review Igor Yanovsky (Math 5A TA) Root-Finding Methods Rootfinding methods are designed to find a zero of a function f, that is, to find a value of x such that f(x) =0 Bisection Method To apply

More information

System Data Bus (8-bit) Data Buffer. Internal Data Bus (8-bit) 8-bit register (R) 3-bit address 16-bit register pair (P) 2-bit address

System Data Bus (8-bit) Data Buffer. Internal Data Bus (8-bit) 8-bit register (R) 3-bit address 16-bit register pair (P) 2-bit address Intel 8080 CPU block diagram 8 System Data Bus (8-bit) Data Buffer Registry Array B 8 C Internal Data Bus (8-bit) F D E H L ALU SP A PC Address Buffer 16 System Address Bus (16-bit) Internal register addressing:

More information

The final is cumulative, but with more emphasis on chapters 3 and 4. There will be two parts.

The final is cumulative, but with more emphasis on chapters 3 and 4. There will be two parts. Math 141 Review for Final The final is cumulative, but with more emphasis on chapters 3 and 4. There will be two parts. Part 1 (no calculator) graphing (polynomial, rational, linear, exponential, and logarithmic

More information

Numerics and Error Analysis

Numerics and Error Analysis Numerics and Error Analysis CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Numerics and Error Analysis 1 / 30 A Puzzle What

More information

Effect of round-off errors on the. Accuracy of randomized algorithms

Effect of round-off errors on the. Accuracy of randomized algorithms Effect of round-off errors on the accuracy of randomized algorithms ÉLIAUS-PROMES (UPR 8521 CNRS) Perpignan France marc.daumas@univ-perp.fr Outline 1 2 3 4 5 Characterize the accuracy of the result of

More information

S. R. Tate. Stable Computation of the Complex Roots of Unity, IEEE Transactions on Signal Processing, Vol. 43, No. 7, 1995, pp

S. R. Tate. Stable Computation of the Complex Roots of Unity, IEEE Transactions on Signal Processing, Vol. 43, No. 7, 1995, pp Stable Computation of the Complex Roots of Unity By: Stephen R. Tate S. R. Tate. Stable Computation of the Complex Roots of Unity, IEEE Transactions on Signal Processing, Vol. 43, No. 7, 1995, pp. 1709

More information

You separate binary numbers into columns in a similar fashion. 2 5 = 32

You separate binary numbers into columns in a similar fashion. 2 5 = 32 RSA Encryption 2 At the end of Part I of this article, we stated that RSA encryption works because it s impractical to factor n, which determines P 1 and P 2, which determines our private key, d, which

More information

Introduction CSE 541

Introduction CSE 541 Introduction CSE 541 1 Numerical methods Solving scientific/engineering problems using computers. Root finding, Chapter 3 Polynomial Interpolation, Chapter 4 Differentiation, Chapter 4 Integration, Chapters

More information

Data Structures and Algorithms Running time and growth functions January 18, 2018

Data Structures and Algorithms Running time and growth functions January 18, 2018 Data Structures and Algorithms Running time and growth functions January 18, 2018 Measuring Running Time of Algorithms One way to measure the running time of an algorithm is to implement it and then study

More information

MATH ASSIGNMENT 03 SOLUTIONS

MATH ASSIGNMENT 03 SOLUTIONS MATH444.0 ASSIGNMENT 03 SOLUTIONS 4.3 Newton s method can be used to compute reciprocals, without division. To compute /R, let fx) = x R so that fx) = 0 when x = /R. Write down the Newton iteration for

More information