Computing correctly rounded logarithms with fixed-point operations. Julien Le Maire, Florent de Dinechin, Jean-Michel Muller and Nicolas Brunie
|
|
- Anis Doyle
- 5 years ago
- Views:
Transcription
1 Computing correctly rounded logarithms with fixed-point operations Julien Le Maire, Florent de Dinechin, Jean-Michel Muller and Nicolas Brunie
2 Outline Introduction and context Algorithm Results and comparisons Conclusions J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 2
3 Logarithm, the mathematical version y y = ln(x) x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 3
4 Logarithm, the mathematical version ln (a b) = ln (a) + ln (b) 2 y y = ln(x) x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 3
5 Logarithm, the mathematical version ln (a b) = ln (a) + ln (b) ln (b a ) = a ln(b) 2 y y = ln(x) x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 3
6 Logarithm, the mathematical version ln (a b) = ln (a) + ln (b) ln (b a ) = a ln(b) Taylor: for x small, ln(1 + x) x x 2 /2 + x 3 /3... y y = ln(x) x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 3
7 Logarithm, the floating-point version The floating point version of the natural logarithm is called log (you will also find log2 and log10 and a few others) 2 y x F 64 log(x) = (ln(x)) y = ln(x) x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 4
8 Logarithm, the floating-point version The floating point version of the natural logarithm is called log (you will also find log2 and log10 and a few others) 2 y x F 64 log(x) = (ln(x)) y = ln(x) x An experiment Implementing the floating-point logarithm function using only integer arithmetic for performance (previous work motivated by lack of FP hardware) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 4
9 Why using fixed-point arithmetic? bits IEEE-754 (64 bits) mainstream floating-point mainstream integer J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 5
10 Why using fixed-point arithmetic? bits IEEE-754 (64 bits) 64-bits mainstream floating-point mainstream integer 64-bit floating-point, but only 52-bit precision if you can predict the value of the exponent, exponent bits are wasted bits. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 5
11 Integer better than floating-point? modern 64-bit machines offer all sort of useful integer instructions addition multiplication 64x (mulq) count leading zeroes, shifts (lzcnt, bsr) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 6
12 Integer better than floating-point? modern 64-bit machines offer all sort of useful integer instructions addition multiplication 64x (mulq) count leading zeroes, shifts (lzcnt, bsr) most operations are faster on integers, especially addition (which more or less defines the processor cycle time) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 6
13 Integer better than floating-point? modern 64-bit machines offer all sort of useful integer instructions addition multiplication 64x (mulq) count leading zeroes, shifts (lzcnt, bsr) most operations are faster on integers, especially addition (which more or less defines the processor cycle time) multiprecision faster and simpler using integers J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 6
14 Integer better than floating-point? modern 64-bit machines offer all sort of useful integer instructions addition multiplication 64x (mulq) count leading zeroes, shifts (lzcnt, bsr) most operations are faster on integers, especially addition (which more or less defines the processor cycle time) multiprecision faster and simpler using integers small multiprecision out of the box: mainstream compilers (gcc, clang, icc) support int_128 addition 128x shift on two registers (add, adc) (shld, shrd) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 6
15 Integer better than floating-point? modern 64-bit machines offer all sort of useful integer instructions addition multiplication 64x (mulq) count leading zeroes, shifts (lzcnt, bsr) most operations are faster on integers, especially addition (which more or less defines the processor cycle time) multiprecision faster and simpler using integers small multiprecision out of the box: mainstream compilers (gcc, clang, icc) support int_128 addition 128x shift on two registers (add, adc) (shld, shrd) Caveat: integer SIMD/vector support still lagging behind FP (no vector multiplication) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 6
16 Outline Introduction and context Algorithm Results and comparisons Conclusions J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 7
17 On-demand accuracy Muller and Lefèvre solved the table maker dilema for ln Computing the log with an error enables correct rounding two consecutive floating-point numbers real numbers computed logarithm, with error margin J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 8
18 On-demand accuracy Muller and Lefèvre solved the table maker dilema for ln Computing the log with an error enables correct rounding two consecutive floating-point numbers real numbers computed logarithm, with error margin J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 8
19 On-demand accuracy Muller and Lefèvre solved the table maker dilema for ln Computing the log with an error enables correct rounding two consecutive floating-point numbers real numbers computed logarithm, with error margin CRLibm refinement of Ziv s technique: First step: quick-and-dirty evaluation of ln(x) (just accurate enough to ensure correct rounding in most cases) test if rounding can be decided if not (rarely), recompute ln(x) with the worst-case accuracy J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 8
20 On-demand accuracy Muller and Lefèvre solved the table maker dilema for ln Computing the log with an error enables correct rounding two consecutive floating-point numbers real numbers computed logarithm, with error margin CRLibm refinement of Ziv s technique: First step: quick-and-dirty evaluation of ln(x) (just accurate enough to ensure correct rounding in most cases) test if rounding can be decided if not (rarely), recompute ln(x) with the worst-case accuracy Trade-off between first and second steps: MeanTime = Time(1st step) + Pr[need 2nd step] Time(2nd step) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 8
21 On-demand accuracy Muller and Lefèvre solved the table maker dilema for ln Computing the log with an error enables correct rounding two consecutive floating-point numbers real numbers computed logarithm, with error margin CRLibm refinement of Ziv s technique: First step: quick-and-dirty evaluation of ln(x) (just accurate enough to ensure correct rounding in most cases) test if rounding can be decided if not (rarely), recompute ln(x) with the worst-case accuracy Trade-off between first and second steps: MeanTime = Time(1st step) + Pr[need 2nd step] Time(2nd step) Best so far: Time(2nd step) 10 Time(1st step) This work: Time(2nd step) 2 Time(1st step) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 8
22 The big picture 1. Filter special cases (negative numbers,,...) 2. Argument range reduction 3. Polynomial approximation 4. Solution reconstruction 5. Error evaluation and rounding test 6. If more accuracy needed: Rerun the steps 3 and 4 with the worst-case accuracy. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 9
23 First argument range reduction input = 2 E (1 + x) ln(input) = E ln (2) + ln (1 + x) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 10
24 First argument range reduction input = 2 E (1 + x) ln(input) = E ln (2) + ln (1 + x) Evaluation algorithm: approximate ln (1 + x) with a polynomial p(x) degree needed: at least 26 evaluate E ln (2) add both terms J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 10
25 Tang s range reduction 1 + x: 1. fractional part on 52 bits 1 + x 1 : 1. fractional part on 7 bits J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 11
26 Tang s range reduction 1 + x: 1. fractional part on 52 bits 1 + x 1 : 1. fractional part on 7 bits r x : 0. fractional part on 18 bits A table, addressed by x 1, the 7 most significand bits of x, stores r x 1 1 ( ) 1 and ln 1 + x x r x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 11
27 Tang s range reduction 1 + x: 1. fractional part on 52 bits 1 + x 1 : 1. fractional part on 7 bits r x : 0. fractional part on 18 bits 1 + y: fractional part on 64 bits plus 6 implicit zeros A table, addressed by x 1, the 7 most significand bits of x, stores r x 1 1 ( ) 1 and ln 1 + x x Define 1 + y = r x (1 + x) 1 Then ln(1 + x) = ln(1 + y) + ln( 1 r x ) r x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 11
28 Tang s range reduction algorithm 1 + x: 1. fractional part on 52 bits 1 + x 1 : 1. fractional part on 7 bits r x : 0. fractional part on 18 bits 1 + y: fractional part on 64 bits plus 6 implicit zeros Extract the index x 1 Read, from a table addressed by x 1, both r x and ln( 1 r x ) compute y = r x (1 + x) 1 (exactly) approximate ln (1 + y) with a polynomial p(y) Degree needed: 8 add it all: ln(input) E ln (2) + p (y) + ln( 1 ) r x J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 12
29 Tang s range reduction 1 + x: 1. fractional part on 52 bits 1 + x 1 : 1. fractional part on 7 bits r x : 0. fractional part on 18 bits 1 + y: fractional part on 64 bits plus 6 implicit zeros y = r x (1 + x) 1 With 1 + x on 53 bits we can tabulate r x on 18 bits: the exact product would need 71 bits but we can predict the 7 leading bits... so we can let them overflow quietly and use a multiplication. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 13
30 Two levels of Tang reduction 1 + x: 1. fractional part on 52 bits 1 + x 1 : 0. fractional part on 7 bits r x : 0. fractional part on 9 bits 1 + y: y 1 : fractional part on 60 bits including 5 zeros. fractional part on 13 bits r y : 0. fractional part on 15 bits 1 + z: fractional part on 64 bits plus 12 implicit zeros x [0, 1) y [ 0, ) x 1 takes 64 different values z [ 0, ) y 1 takes 96 different values the whole reduction of x to z is computed exactly in 64-bit int. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 14
31 A few Pareto points in the design space Table size (bytes) degree 1st degree 2nd 39, , , , , J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 15
32 Why stop at two levels of reduction? Answer is: diminushing return. For a target accuracy of 2 60 : interval of x degree needed No reduction [ 1/2, 1/2] 29 1 level [ 2 7, 2 7 ] 8 2 levels [ 2 12, 2 12 ] 4 3 levels [ 2 18, 2 18 ] 3 Adding more levels will cost more operations than it saves... J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 16
33 Polynomial approximation (advertisement) We want to approximate log(1 + z) on an interval around 0. Use the (now standard) tool set to obtain it. Sollya: finds a machine-efficient polynomial P(z) computes a safe bound on the approximation error P(z) ln(1 + z) Gappa: bounds the accumulation of rounding errors when evaluating P(z) in C We obtain a Coq proof of the error: computed approximation of ln(1 + z), with relative error margin real numbers J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 17
34 Reconstructing the solution input = 2 e (1 + x) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18
35 Reconstructing the solution input = 2 e 1 r x (1 + y) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18
36 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18
37 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) ln(input) = e ln(2) + ln(r x 1 ) + ln(r y 1 ) + ln(1 + z) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18
38 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) ln(input) = e ln(2) + ln(r x 1 ) + ln(r y 1 ) + ln(1 + z) e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: If we can predict the exponents, exponent bits are wasted bits J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18
39 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) ln(input) = e ln(2) + ln(r x 1 ) + ln(r y 1 ) + ln(1 + z) e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: If we can predict the exponents, exponent bits are wasted bits J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18
40 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) ln(input) = e ln(2) + ln(r x 1 ) + ln(r y 1 ) + ln(1 + z) e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: If we can predict the exponents, exponent bits are wasted bits J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18
41 Reconstructing the solution input = 2 e 1 r x 1 r y (1 + z) ln(input) = e ln(2) + ln(r x 1 ) + ln(r y 1 ) + ln(1 + z) e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: If we can predict the exponents, exponent bits are wasted bits J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 18
42 Error evaluation e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 19
43 Error evaluation e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: ɛ < ( e ) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 19
44 Error evaluation e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: ɛ < ( e ) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 19
45 Error evaluation e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: ɛ < ( e P(z) 2 59) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 19
46 Rounding test Simple technique: compute the two bounds of the interval, and see if they round to the same mantissa (two additions, a xor and a shift) real numbers J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 20
47 Rounding test Simple technique: compute the two bounds of the interval, and see if they round to the same mantissa (two additions, a xor and a shift) real numbers J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 20
48 Rounding test Simple technique: compute the two bounds of the interval, and see if they round to the same mantissa (two additions, a xor and a shift) real numbers For comparison, the proof of the floating-point-based rounding test (invented by Ziv and used in CRLibm) is an 18-page paper that took 20 years to publish... J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 20
49 Second step Use 3 words instead of 2 for the precomputed ln(2) Use a much more accurate polynomial: with coefficients on 128 bits instead of 64 (but z is still only a 64-bit number) and using a higher degree polynomial J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 21
50 Outline Introduction and context Algorithm Results and comparisons Conclusions J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 22
51 Implementation parameters of correctly rounded implementations glibc crlibm-td crlibm-de cr-fixp degree pol. 1 3/ degree pol tables size 13 Kb 8192 bytes 6144 bytes 4032 bytes % accurate phase N/A J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 23
52 Average and max runing time (in processor cycles) Pentium timing (lower is better) cycles MKL glibc crlibm cr-de cr-fixp avg time max time 25 11, Timing breakdown on two processors cycles Core i5 Bostan System glibc newlib quick phase alone accurate phase alone both phases (avg time) both phases (max time) Slanted means: no correct rounding J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 24
53 Average and max runing time (in processor cycles) Pentium timing (lower is better) cycles MKL glibc crlibm cr-de cr-fixp avg time max time 25 11, Timing breakdown on two processors cycles Core i5 Bostan System glibc newlib quick phase alone accurate phase alone both phases (avg time) both phases (max time) Slanted means: no correct rounding J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 24
54 Average and max runing time (in processor cycles) Pentium timing (lower is better) cycles MKL glibc crlibm cr-de cr-fixp avg time max time 25 11, Timing breakdown on two processors cycles Core i5 Bostan System glibc newlib quick phase alone accurate phase alone both phases (avg time) both phases (max time) Slanted means: no correct rounding J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 24
55 Average and max runing time (in processor cycles) Pentium timing (lower is better) cycles MKL glibc crlibm cr-de cr-fixp avg time max time 25 11, Timing breakdown on two processors cycles Core i5 Bostan System glibc newlib quick phase alone accurate phase alone both phases (avg time) both phases (max time) Slanted means: no correct rounding J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 24
56 Outline Introduction and context Algorithm Results and comparisons Conclusions J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 25
57 Conclusions Better range reduction thanks to a wider format... leading to improvements in polynomial degree and table size Faster multiprecision due to higher precision Able to reuse some computations of the fast step in the accurate step Alternative rounding test for acurate step Probability to launch 2nd step is high, but this is acceptable since 2nd step is so cheap J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 26
58 Conclusions Better range reduction thanks to a wider format... leading to improvements in polynomial degree and table size Faster multiprecision due to higher precision Able to reuse some computations of the fast step in the accurate step Alternative rounding test for acurate step Probability to launch 2nd step is high, but this is acceptable since 2nd step is so cheap Competitive against state-of-the-art Worst case improved compared to others implementations Second step-only version is a viable alternative J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 26
59 Conclusions Better range reduction thanks to a wider format... leading to improvements in polynomial degree and table size Faster multiprecision due to higher precision Able to reuse some computations of the fast step in the accurate step Alternative rounding test for acurate step Probability to launch 2nd step is high, but this is acceptable since 2nd step is so cheap Competitive against state-of-the-art Worst case improved compared to others implementations Second step-only version is a viable alternative Limitations: Require 64 bits integer support No support for vectorization J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 26
60 Thanks for your attention Any question? J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 27
61 Outline Bonus: a floating-point in, fixed-point out variant Some code details on the solution reconstruction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 28
62 Floating-point in, fixed-point out output: fixed-point, 11 bits integer part, 53 bit fractional part integer part target absolute accuracy 2 52 fraction output absolute table Core i5 Bostan format accuracy size cycles cycles Fix Fix double (libm) Fix64 is the code of the first step only, without the conversion to float. tweak: poly degree 3 only for abs. accuracy 2 59 Fix128 is the code of the second step only, without the conversion to float. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 29
63 Motivation TKF91 : DNA sequence alignment algorithm dynamic programming algorithm: alignment as a path within a 2D array. borders of an array initialized with log-likelihoods then array filled using recurrence formulae that involve only max and + operations. All current implementations of this algorithm use a floating-point array, but int64 + and max are 1-cycle, vectorizable, and exact operations; absolute accuracy of initialization logs: up to 2 42 with FP log, 2 52 with FixP log. J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 30
64 Only partial experiments Improvement in accuracy measured No noticeable improvement in performance J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 31
65 Outline Bonus: a floating-point in, fixed-point out variant Some code details on the solution reconstruction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 32
66 Special cases: businesss as usual / r e i n t e r p r e t x to m a n i p u l a t e i t s b i t s more e a s i l y / u i n t 6 4 _ t i n p u t b i t s = ( ( union { double d ; u i n t 6 4 _ t u ; }){ i n p u t } ). u ; i n t xe = i n p u t b i t s >> 5 2 ; / f i l t e r t h e s p e c i a l c a s e s :! ( x i s n o r m a l i z e d and 0 < x < +I n f ) i f (0 x7feu <= ( unsigned ) xe 1u ) { / x = + 0 : r a i s e a D i v i d e B y z e r o, r e t u r n I n f / i f ( ( x b i t s & ~(1 u l l << 6 3 ) ) == 0) r e t u r n 1.0/0.0; / x < 0. 0 : r a i s e a I n v a l i d O p e r a t i o n, r e t u r n a qnan / i f ( ( x b i t s & (1 u l l << 6 3 ) )!= 0) r e t u r n ( x x ) / 0 ; / x = qnan : r e t u r n a qnan x = snan : r a i s e a I n v a l i d O p e r a t i o n, r e t u r n a qnan x = +I n f : r e t u r n +I n f / i f ( xe!= 0) r e t u r n x+x ; / x subnormal : change x to a n o r m a l i z e d number / e l s e { i n t u = c l z 6 4 ( x b i t s ) 1 2 ; x b i t s <<= u + 1 ; xe = u ; } } xe = ; Only interesting line: the subnormal management J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 33
67 Argument reduction / i n p u t = 2^ xe (1 + X) / u i n t 6 4 _ t x = i n p u t b i t s & 0 xfffffffffffffull ; / 1 + X = ( 1/ Ri ) Y / u i n t 8 _ t x1 = x >> (52 ARG_REDUC_1_PREC ) ; u i n t 6 4 _ t y = argreduc1_inv [ x1 ] ( x + (1 u l l << 5 2 ) ) ; b u i l t i n _ p r e f e t c h (& argreduc1_log [ x1 ], 0, 0 ) ; / Y = (1/ S ) (1 + Z) / / w i t h dz = dz /2^(52 + ARG_REDUC_1_SIZE + ARG_REDUC_2_SIZE) and 1/S = argreduc2 [ s i ]. v a l /2^ARG_REDUC_2_SIZE / u i n t 8 _ t y1 = ( y >> ( 52 + ARG_REDUC_1_SIZE ARG_REDUC_2_PREC) ) ( 1 u << ARG_REDUC_2_PREC ) ; u i n t 6 4 _ t z = argreduc2_inv [ y1 ] y ; // +1 p a r t removed by o v e r f l o w b u i l t i n _ p r e f e t c h (& argreduc2_log [ y1 ], 0, 0 ) ; J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 34
68 Polynomial evaluation / P o l y n o m i a l a p p r o x i m a t i o n o f l o g (1+Z)/Z ~= P(Z) and Z P(Z) / s t a t i c c o n s t u i n t 6 4 _ t a4 = UINT64_C (0 x 3 f f c c b ) ; s t a t i c c o n s t u i n t 6 4 _ t a3 = UINT64_C (0 x d c 7 0 f 0 d f d ) ; s t a t i c c o n s t u i n t 6 4 _ t a2 = UINT64_C (0 x 7 f f f f f f f f f d f d ) ; s t a t i c c o n s t u i n t 6 4 _ t a1 = UINT64_C (0 x f f f f f f f f f f f f f f f a ) ; u i n t 6 4 _ t pz = a1 ( highmul ( z, a2 ( highmul ( z, a3 ( highmul ( z, a4 ) >> 12) ) >> 12) ) >> 1 2 ) ; u i n t _ t zpz = f u l l m u l ( z, pz ) ; / P o l y n o m i a l a p p r o x i m a t i o n w i t h o u t s h i f t s : r e p l a c e highmul ( z, a ) >> 12 w i t h highmul ( z, a >> 12) / s t a t i c c o n s t u i n t 6 4 _ t a4 = UINT64_C (0 x f f c ) ; s t a t i c c o n s t u i n t 6 4 _ t a3 = UINT64_C (0 x dc6 ) ; s t a t i c c o n s t u i n t 6 4 _ t a2 = UINT64_C (0 x f f f f f f f f f d 5 7 ) ; s t a t i c c o n s t u i n t 6 4 _ t a1 = UINT64_C (0 x f f f f f f f f f f f f f f f a ) ; u i n t 6 4 _ t pz = a1 highmul ( z, a2 highmul ( z, a3 highmul ( z, a4 ) ) ) ; u i n t _ t zpz = f u l l m u l ( z, pz ) ; J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 35
69 Reconstructing the solution / Compute p a r t o f t h e r e s u l t t h a t don t depend on Z ( xe l o g ( 2 ) + l o g (1/ Ri ) + l o g (1/ S i ) ) / u i n t _ t c s t p a r t = f u l l i m u l ( xe, log2fw_mid ) + UINT128 ( ( i n t 6 4 _ t ) xe log2fw_high, 0) // f u l l m u l not needed + UINT128 ( argreduc1 [ r i ]. l o g _ hi, argreduc1 [ r i ]. l o g _ l o ) + UINT128 ( argreduc2 [ s i ]. l o g_hi, argreduc2 [ s i ]. l o g _ l o ) ; / P o l y n o m i a l a p p r o x i m a t i o n o f l o g (1+Z)/Z ~= P(Z) and Z P(Z) /... / Assemble t h e two p a r t s, compute s i g n, m a n t i s s a and exponent / u i n t _ t l o n g r e s = c s t p a r t + ( zpz >> (11 + IMPLICIT_ZEROS ) ) ; u i n t 6 4 _ t s i g n = ( HI ( l o n g r e s ) >> 6 3 ) ; // 0 o r ~0 // i f s i g n!= 0, t h i s i s l o n g r e s = ~ l o n g r e s ( a = ~a + 1) // to a v o i d t h e +1 approx, do : // l o n g r e s = ( ( i n t 6 4 _ t ) s i g n + l o n g r e s ) ^ UINT128 ( s i g n, s i g n ) ; l o n g r e s ^= UINT128 ( s i g n, s i g n ) ; i n t u = c l z 6 4 ( HI ( l o n g r e s ) ) + 1 ; i n t exponent = 11 u ; u i n t 6 4 _ t m a n t i s s a = ( HI ( l o n g r e s ) << u ) (LO( l o n g r e s ) >> (64 u ) ) J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 36
70 Rounding test and conversion / Compute t h e maximal a b s o l u t e e r r o r ( a l i g n e d w i t h l o n g r e s ) : abs ( xe ) f o r xe l o g ( 2 ), l o g (1/ Ri ) and l o g (1/ S i ) zpz >>(POLYNOMIAL_PREC+IMPLICIT_ZEROS+11) f o r t h e p o l y n o I f r e s u l t (1 + maxrelerr ) a r e not rounded to t h e same number, we u i n t 6 4 _ t maxabserr = 3 + abs ( xe ) + ( HI ( zpz ) >> (POLYNOMIAL_PREC + u i n t 6 4 _ t maxrelerr = ( maxabserr >> (64 u ) ) + 1 ; i f ( ( ( m a n t i s s a + maxrelerr ) ^ ( m a n t i s s a maxrelerr ) ) >> 11) { r e t u r n l o g _ r n _ a c c u r a t e ( c s t p a r t, z, xe ) ; } / Assemble t h e computed r e s u l t / u i n t 6 4 _ t r e s u l t b i t s = ( ( u i n t 6 4 _ t ) s i g n << 63) + ( ( u i n t 6 4 _ t ) ( exponent +1023) << 52) + ( m a n t i s s a >> 12) + ( ( m a n t i s s a >> 11) & 1 ) ; / round to n e a r e s t / r e t u r n ( union { u i n t 6 4 _ t u ; double d ; }){ r e s u l t b i t s }. d ; J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 37
71 Outline Bonus: a floating-point in, fixed-point out variant Some code details on the solution reconstruction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 38
72 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39
73 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: 1 floating-point fraction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39
74 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: 1 floating-point fraction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39
75 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: 1 floating-point fraction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39
76 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: 1 floating-point fraction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39
77 Reconstructing the solution e ln(2): ln(r x 1 ): ln(r y 1 ): P(z) ln(1 + z): sum: 1 floating-point fraction J. Le Maire, F. de Dinechin, J.-M. Muller and N. Brunie Computing correctly rounded logarithm with fixed-point operations 39
Automated design of floating-point logarithm functions on integer processors
23rd IEEE Symposium on Computer Arithmetic Santa Clara, CA, USA, 10-13 July 2016 Automated design of floating-point logarithm functions on integer processors Guillaume Revy (presented by Florent de Dinechin)
More informationComputing Machine-Efficient Polynomial Approximations
Computing Machine-Efficient Polynomial Approximations N. Brisebarre, S. Chevillard, G. Hanrot, J.-M. Muller, D. Stehlé, A. Tisserand and S. Torres Arénaire, LIP, É.N.S. Lyon Journées du GDR et du réseau
More informationDesigning a Correct Numerical Algorithm
Intro Implem Errors Sollya Gappa Norm Conc Christoph Lauter Guillaume Melquiond March 27, 2013 Intro Implem Errors Sollya Gappa Norm Conc Outline 1 Introduction 2 Implementation theory 3 Error analysis
More informationOptimizing Scientific Libraries for the Itanium
0 Optimizing Scientific Libraries for the Itanium John Harrison Intel Corporation Gelato Federation Meeting, HP Cupertino May 25, 2005 1 Quick summary Intel supplies drop-in replacement versions of common
More informationHow do computers represent numbers?
How do computers represent numbers? Tips & Tricks Week 1 Topics in Scientific Computing QMUL Semester A 2017/18 1/10 What does digital mean? The term DIGITAL refers to any device that operates on discrete
More informationECS 231 Computer Arithmetic 1 / 27
ECS 231 Computer Arithmetic 1 / 27 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 2 / 27 Outline 1 Floating-point numbers
More informationOn various ways to split a floating-point number
On various ways to split a floating-point number Claude-Pierre Jeannerod Jean-Michel Muller Paul Zimmermann Inria, CNRS, ENS Lyon, Université de Lyon, Université de Lorraine France ARITH-25 June 2018 -2-
More informationRigorous Polynomial Approximations and Applications
Rigorous Polynomial Approximations and Applications Mioara Joldeș under the supervision of: Nicolas Brisebarre and Jean-Michel Muller École Normale Supérieure de Lyon, Arénaire Team, Laboratoire de l Informatique
More informationNumerical Methods - Preliminaries
Numerical Methods - Preliminaries Y. K. Goh Universiti Tunku Abdul Rahman 2013 Y. K. Goh (UTAR) Numerical Methods - Preliminaries 2013 1 / 58 Table of Contents 1 Introduction to Numerical Methods Numerical
More informationContinued fractions and number systems: applications to correctly-rounded implementations of elementary functions and modular arithmetic.
Continued fractions and number systems: applications to correctly-rounded implementations of elementary functions and modular arithmetic. Mourad Gouicem PEQUAN Team, LIP6/UPMC Nancy, France May 28 th 2013
More informationMathematical preliminaries and error analysis
Mathematical preliminaries and error analysis Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan September 12, 2015 Outline 1 Round-off errors and computer arithmetic
More informationFaithful Horner Algorithm
ICIAM 07, Zurich, Switzerland, 16 20 July 2007 Faithful Horner Algorithm Philippe Langlois, Nicolas Louvet Université de Perpignan, France http://webdali.univ-perp.fr Ph. Langlois (Université de Perpignan)
More informationNumber Representation and Waveform Quantization
1 Number Representation and Waveform Quantization 1 Introduction This lab presents two important concepts for working with digital signals. The first section discusses how numbers are stored in memory.
More informationAutomated design of floating-point logarithm functions on integer processors
Automated design of floating-point logarithm functions on integer processors Guillaume Revy To cite this version: Guillaume Revy. Automated design of floating-point logarithm functions on integer processors.
More information8/13/16. Data analysis and modeling: the tools of the trade. Ø Set of numbers. Ø Binary representation of numbers. Ø Floating points.
Data analysis and modeling: the tools of the trade Patrice Koehl Department of Biological Sciences National University of Singapore http://www.cs.ucdavis.edu/~koehl/teaching/bl5229 koehl@cs.ucdavis.edu
More informationEAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science
EAD 115 Numerical Solution of Engineering and Scientific Problems David M. Rocke Department of Applied Science Computer Representation of Numbers Counting numbers (unsigned integers) are the numbers 0,
More informationChapter 4 Number Representations
Chapter 4 Number Representations SKEE2263 Digital Systems Mun im/ismahani/izam {munim@utm.my,e-izam@utm.my,ismahani@fke.utm.my} February 9, 2016 Table of Contents 1 Fundamentals 2 Signed Numbers 3 Fixed-Point
More informationComputer Arithmetic. MATH 375 Numerical Analysis. J. Robert Buchanan. Fall Department of Mathematics. J. Robert Buchanan Computer Arithmetic
Computer Arithmetic MATH 375 Numerical Analysis J. Robert Buchanan Department of Mathematics Fall 2013 Machine Numbers When performing arithmetic on a computer (laptop, desktop, mainframe, cell phone,
More informationEx code
Ex. 8.4 7-4-2-1 code Codeconverter 7-4-2-1-code to BCD-code. When encoding the digits 0... 9 sometimes in the past a code having weights 7-4-2-1 instead of the binary code weights 8-4-2-1 was used. In
More information9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017
9. Datapath Design Jacob Abraham Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 October 2, 2017 ECE Department, University of Texas at Austin
More informationChapter 1: Introduction and mathematical preliminaries
Chapter 1: Introduction and mathematical preliminaries Evy Kersalé September 26, 2011 Motivation Most of the mathematical problems you have encountered so far can be solved analytically. However, in real-life,
More informationLecture 7. Floating point arithmetic and stability
Lecture 7 Floating point arithmetic and stability 2.5 Machine representation of numbers Scientific notation: 23 }{{} }{{} } 3.14159265 {{} }{{} 10 sign mantissa base exponent (significand) s m β e A floating
More informationTunable Floating-Point for Energy Efficient Accelerators
Tunable Floating-Point for Energy Efficient Accelerators Alberto Nannarelli DTU Compute, Technical University of Denmark 25 th IEEE Symposium on Computer Arithmetic A. Nannarelli (DTU Compute) Tunable
More informationComparison between binary and decimal floating-point numbers
1 Comparison between binary and decimal floating-point numbers Nicolas Brisebarre, Christoph Lauter, Marc Mezzarobba, and Jean-Michel Muller Abstract We introduce an algorithm to compare a binary floating-point
More informationNewton-Raphson Algorithms for Floating-Point Division Using an FMA
Newton-Raphson Algorithms for Floating-Point Division Using an FMA Nicolas Louvet, Jean-Michel Muller, Adrien Panhaleux Abstract Since the introduction of the Fused Multiply and Add (FMA) in the IEEE-754-2008
More informationPowerPoints organized by Dr. Michael R. Gustafson II, Duke University
Part 1 Chapter 4 Roundoff and Truncation Errors PowerPoints organized by Dr. Michael R. Gustafson II, Duke University All images copyright The McGraw-Hill Companies, Inc. Permission required for reproduction
More informationTopic 17. Analysis of Algorithms
Topic 17 Analysis of Algorithms Analysis of Algorithms- Review Efficiency of an algorithm can be measured in terms of : Time complexity: a measure of the amount of time required to execute an algorithm
More informationLecture 11. Advanced Dividers
Lecture 11 Advanced Dividers Required Reading Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design Chapter 15 Variation in Dividers 15.3, Combinational and Array Dividers Chapter 16, Division
More informationThe Euclidean Division Implemented with a Floating-Point Multiplication and a Floor
The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor Vincent Lefèvre 11th July 2005 Abstract This paper is a complement of the research report The Euclidean division implemented
More informationNotes for Chapter 1 of. Scientific Computing with Case Studies
Notes for Chapter 1 of Scientific Computing with Case Studies Dianne P. O Leary SIAM Press, 2008 Mathematical modeling Computer arithmetic Errors 1999-2008 Dianne P. O'Leary 1 Arithmetic and Error What
More informationThe Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal
-1- The Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal Claude-Pierre Jeannerod Jean-Michel Muller Antoine Plet Inria,
More informationCSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Catie Baker Spring 2015
CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis Catie Baker Spring 2015 Today Registration should be done. Homework 1 due 11:59pm next Wednesday, April 8 th. Review math
More informationArithmetic and Error. How does error arise? How does error arise? Notes for Part 1 of CMSC 460
Notes for Part 1 of CMSC 460 Dianne P. O Leary Preliminaries: Mathematical modeling Computer arithmetic Errors 1999-2006 Dianne P. O'Leary 1 Arithmetic and Error What we need to know about error: -- how
More informationCR-LIBM A library of correctly rounded elementary functions in double-precision
CR-LIBM A library of correctly rounded elementary functions in double-precision Catherine Daramy-Loirat, David Defour, Florent De Dinechin, Matthieu Gallet, Nicolas Gast, Christoph Lauter, Jean-Michel
More informationThanks to: University of Illinois at Chicago NSF CCR Alfred P. Sloan Foundation
The Poly1305-AES message-authentication code D. J. Bernstein Thanks to: University of Illinois at Chicago NSF CCR 9983950 Alfred P. Sloan Foundation The AES function ( Rijndael 1998 Daemen Rijmen; 2001
More informationOptimizing MPC for robust and scalable integer and floating-point arithmetic
Optimizing MPC for robust and scalable integer and floating-point arithmetic Liisi Kerik * Peeter Laud * Jaak Randmets * * Cybernetica AS University of Tartu, Institute of Computer Science January 30,
More informationTight and rigourous error bounds for basic building blocks of double-word arithmetic. Mioara Joldes, Jean-Michel Muller, Valentina Popescu
Tight and rigourous error bounds for basic building blocks of double-word arithmetic Mioara Joldes, Jean-Michel Muller, Valentina Popescu SCAN - September 2016 AMPAR CudA Multiple Precision ARithmetic
More informationGetting tight error bounds in floating-point arithmetic: illustration with complex functions, and the real x n function
Getting tight error bounds in floating-point arithmetic: illustration with complex functions, and the real x n function Jean-Michel Muller collaborators: S. Graillat, C.-P. Jeannerod, N. Louvet V. Lefèvre
More informationIntroduction to Randomized Algorithms III
Introduction to Randomized Algorithms III Joaquim Madeira Version 0.1 November 2017 U. Aveiro, November 2017 1 Overview Probabilistic counters Counting with probability 1 / 2 Counting with probability
More informationSOBER Cryptanalysis. Daniel Bleichenbacher and Sarvar Patel Bell Laboratories Lucent Technologies
SOBER Cryptanalysis Daniel Bleichenbacher and Sarvar Patel {bleichen,sarvar}@lucent.com Bell Laboratories Lucent Technologies Abstract. SOBER is a new stream cipher that has recently been developed by
More informationFloating Point Number Systems. Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le
Floating Point Number Systems Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le 1 Overview Real number system Examples Absolute and relative errors Floating point numbers Roundoff
More informationToward Correctly Rounded Transcendentals
IEEE TRANSACTIONS ON COMPUTERS, VOL. 47, NO. 11, NOVEMBER 1998 135 Toward Correctly Rounded Transcendentals Vincent Lefèvre, Jean-Michel Muller, Member, IEEE, and Arnaud Tisserand Abstract The Table Maker
More informationOptimizing the Representation of Intervals
Optimizing the Representation of Intervals Javier D. Bruguera University of Santiago de Compostela, Spain Numerical Sofware: Design, Analysis and Verification Santander, Spain, July 4-6 2012 Contents 1
More informationInnocuous Double Rounding of Basic Arithmetic Operations
Innocuous Double Rounding of Basic Arithmetic Operations Pierre Roux ISAE, ONERA Double rounding occurs when a floating-point value is first rounded to an intermediate precision before being rounded to
More informationNumerical Mathematical Analysis
Numerical Mathematical Analysis Numerical Mathematical Analysis Catalin Trenchea Department of Mathematics University of Pittsburgh September 20, 2010 Numerical Mathematical Analysis Math 1070 Numerical
More informationLecture 2: Number Representations (2)
Lecture 2: Number Representations (2) ECE 645 Computer Arithmetic 1/29/08 ECE 645 Computer Arithmetic Lecture Roadmap Number systems (cont'd) Floating point number system representations Residue number
More informationUNIT V FINITE WORD LENGTH EFFECTS IN DIGITAL FILTERS PART A 1. Define 1 s complement form? In 1,s complement form the positive number is represented as in the sign magnitude form. To obtain the negative
More informationApplied Mathematics 205. Unit 0: Overview of Scientific Computing. Lecturer: Dr. David Knezevic
Applied Mathematics 205 Unit 0: Overview of Scientific Computing Lecturer: Dr. David Knezevic Scientific Computing Computation is now recognized as the third pillar of science (along with theory and experiment)
More informationGal s Accurate Tables Method Revisited
Gal s Accurate Tables Method Revisited Damien Stehlé UHP/LORIA 615 rue du jardin botanique F-5460 Villers-lès-Nancy Cedex stehle@loria.fr Paul Zimmermann INRIA Lorraine/LORIA 615 rue du jardin botanique
More informationEE 354 Fall 2013 Lecture 10 The Sampling Process and Evaluation of Difference Equations
EE 354 Fall 203 Lecture 0 The Sampling Process and Evaluation of Difference Equations Digital Signal Processing (DSP) is centered around the idea that you can convert an analog signal to a digital signal
More informationAccurate polynomial evaluation in floating point arithmetic
in floating point arithmetic Université de Perpignan Via Domitia Laboratoire LP2A Équipe de recherche en Informatique DALI MIMS Seminar, February, 10 th 2006 General motivation Provide numerical algorithms
More informationSoftware implementation of ECC
Software implementation of ECC Radboud University, Nijmegen, The Netherlands June 4, 2015 Summer school on real-world crypto and privacy Šibenik, Croatia Software implementation of (H)ECC Radboud University,
More informationALU (3) - Division Algorithms
HUMBOLDT-UNIVERSITÄT ZU BERLIN INSTITUT FÜR INFORMATIK Lecture 12 ALU (3) - Division Algorithms Sommersemester 2002 Leitung: Prof. Dr. Miroslaw Malek www.informatik.hu-berlin.de/rok/ca CA - XII - ALU(3)
More informationThe Hash Function JH 1
The Hash Function JH 1 16 January, 2011 Hongjun Wu 2,3 wuhongjun@gmail.com 1 The design of JH is tweaked in this report. The round number of JH is changed from 35.5 to 42. This new version may be referred
More informationz = log loglog
Name: Units do not have to be included. 2016 2017 Log1 Contest Round 2 Theta Logs and Exponents points each 1 Write in logarithmic form: 2 = 1 8 2 Evaluate: log 5 0 log 5 8 (log 2 log 6) Simplify the expression
More informationComputation of dot products in finite fields with floating-point arithmetic
Computation of dot products in finite fields with floating-point arithmetic Stef Graillat Joint work with Jérémy Jean LIP6/PEQUAN - Université Pierre et Marie Curie (Paris 6) - CNRS Computer-assisted proofs
More informationLecture 1: Introduction to computation
UNIVERSITY OF WESTERN ONTARIO LONDON ONTARIO Paul Klein Office: SSC 4044 Extension: 85484 Email: pklein2@uwo.ca URL: http://paulklein.ca/newsite/teaching/619.php Economics 9619 Computational methods in
More informationRandom Number Generation. Stephen Booth David Henty
Random Number Generation Stephen Booth David Henty Introduction Random numbers are frequently used in many types of computer simulation Frequently as part of a sampling process: Generate a representative
More informationCSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2019
CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis Ruth Anderson Winter 2019 Today Algorithm Analysis What do we care about? How to compare two algorithms Analyzing Code Asymptotic Analysis
More informationParallel Reproducible Summation
Parallel Reproducible Summation James Demmel Mathematics Department and CS Division University of California at Berkeley Berkeley, CA 94720 demmel@eecs.berkeley.edu Hong Diep Nguyen EECS Department University
More informationAlgorithms and Their Complexity
CSCE 222 Discrete Structures for Computing David Kebo Houngninou Algorithms and Their Complexity Chapter 3 Algorithm An algorithm is a finite sequence of steps that solves a problem. Computational complexity
More informationComputing logarithms and other special functions
Computing logarithms and other special functions Mike Giles University of Oxford Mathematical Institute Napier 400 NAIS Symposium April 2, 2014 Mike Giles (Oxford) Computing special functions April 2,
More informationBinary floating point
Binary floating point Notes for 2017-02-03 Why do we study conditioning of problems? One reason is that we may have input data contaminated by noise, resulting in a bad solution even if the intermediate
More informationWhat Every Programmer Should Know About Floating-Point Arithmetic DRAFT. Last updated: November 3, Abstract
What Every Programmer Should Know About Floating-Point Arithmetic Last updated: November 3, 2014 Abstract The article provides simple answers to the common recurring questions of novice programmers about
More informationAssignment 5 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Assignment 5 Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. For the given functions f and g, find the requested composite function. 1) f(x)
More informationFast and accurate Bessel function computation
0 Fast and accurate Bessel function computation John Harrison, Intel Corporation ARITH-19 Portland, OR Tue 9th June 2009 (11:00 11:30) 1 Bessel functions and their computation Bessel functions are certain
More information1. Write a program to calculate distance traveled by light
G. H. R a i s o n i C o l l e g e O f E n g i n e e r i n g D i g d o h H i l l s, H i n g n a R o a d, N a g p u r D e p a r t m e n t O f C o m p u t e r S c i e n c e & E n g g P r a c t i c a l M a
More informationEquations with regular-singular points (Sect. 5.5).
Equations with regular-singular points (Sect. 5.5). Equations with regular-singular points. s: Equations with regular-singular points. Method to find solutions. : Method to find solutions. Recall: The
More informationTable-Based Polynomials for Fast Hardware Function Evaluation
ASAP 05 Table-Based Polynomials for Fast Hardware Function Evaluation Jérémie Detrey Florent de Dinechin Projet Arénaire LIP UMR CNRS ENS Lyon UCB Lyon INRIA 5668 http://www.ens-lyon.fr/lip/arenaire/ CENTRE
More informationarxiv: v1 [cs.na] 8 Feb 2016
Toom-Coo Multiplication: Some Theoretical and Practical Aspects arxiv:1602.02740v1 [cs.na] 8 Feb 2016 M.J. Kronenburg Abstract Toom-Coo multiprecision multiplication is a well-nown multiprecision multiplication
More informationEECS150 - Digital Design Lecture 21 - Design Blocks
EECS150 - Digital Design Lecture 21 - Design Blocks April 3, 2012 John Wawrzynek Spring 2012 EECS150 - Lec21-db3 Page 1 Fixed Shifters / Rotators fixed shifters hardwire the shift amount into the circuit.
More informationAbstract. Ariane 5. Ariane 5, arithmetic overflow, exact rounding and Diophantine approximation
May 11, 2019 Suda Neu-Tech Institute Sanmenxia, Henan, China EV Summit Forum, Sanmenxia Suda Energy Conservation &NewEnergyTechnologyResearchInstitute Ariane 5, arithmetic overflow, exact rounding and
More informationMath Practice Exam 3 - solutions
Math 181 - Practice Exam 3 - solutions Problem 1 Consider the function h(x) = (9x 2 33x 25)e 3x+1. a) Find h (x). b) Find all values of x where h (x) is zero ( critical values ). c) Using the sign pattern
More informationSecond Order Function Approximation Using a Single Multiplication on FPGAs
FPL 04 Second Order Function Approximation Using a Single Multiplication on FPGAs Jérémie Detrey Florent de Dinechin Projet Arénaire LIP UMR CNRS ENS Lyon UCB Lyon INRIA 5668 http://www.ens-lyon.fr/lip/arenaire/
More informationChapter 1 Computer Arithmetic
Numerical Analysis (Math 9372) 2017-2016 Chapter 1 Computer Arithmetic 1.1 Introduction Numerical analysis is a way to solve mathematical problems by special procedures which use arithmetic operations
More informationCSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018
CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis Ruth Anderson Winter 2018 Today Algorithm Analysis What do we care about? How to compare two algorithms Analyzing Code Asymptotic Analysis
More informationACM 106a: Lecture 1 Agenda
1 ACM 106a: Lecture 1 Agenda Introduction to numerical linear algebra Common problems First examples Inexact computation What is this course about? 2 Typical numerical linear algebra problems Systems of
More informationFormal verification of IA-64 division algorithms
Formal verification of IA-64 division algorithms 1 Formal verification of IA-64 division algorithms John Harrison Intel Corporation IA-64 overview HOL Light overview IEEE correctness Division on IA-64
More informationRound-off Errors and Computer Arithmetic - (1.2)
Round-off Errors and Comuter Arithmetic - (.). Round-off Errors: Round-off errors is roduced when a calculator or comuter is used to erform real number calculations. That is because the arithmetic erformed
More informationRemainders. We learned how to multiply and divide in elementary
Remainders We learned how to multiply and divide in elementary school. As adults we perform division mostly by pressing the key on a calculator. This key supplies the quotient. In numerical analysis and
More informationCSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis. Ruth Anderson Winter 2018
CSE332: Data Structures & Parallelism Lecture 2: Algorithm Analysis Ruth Anderson Winter 2018 Today Algorithm Analysis What do we care about? How to compare two algorithms Analyzing Code Asymptotic Analysis
More informationProposal to Improve Data Format Conversions for a Hybrid Number System Processor
Proposal to Improve Data Format Conversions for a Hybrid Number System Processor LUCIAN JURCA, DANIEL-IOAN CURIAC, AUREL GONTEAN, FLORIN ALEXA Department of Applied Electronics, Department of Automation
More informationMATH20602 Numerical Analysis 1
M\cr NA Manchester Numerical Analysis MATH20602 Numerical Analysis 1 Martin Lotz School of Mathematics The University of Manchester Manchester, February 1, 2016 Outline General Course Information Introduction
More informationChapter 1 Error Analysis
Chapter 1 Error Analysis Several sources of errors are important for numerical data processing: Experimental uncertainty: Input data from an experiment have a limited precision. Instead of the vector of
More informationPolynomial multiplication and division using heap.
Polynomial multiplication and division using heap. Michael Monagan and Roman Pearce Department of Mathematics, Simon Fraser University. Abstract We report on new code for sparse multivariate polynomial
More informationIntroduction to Scientific Computing
(Lecture 2: Machine precision and condition number) B. Rosić, T.Moshagen Institute of Scientific Computing General information :) 13 homeworks (HW) Work in groups of 2 or 3 people Each HW brings maximally
More informationMath 128A: Homework 2 Solutions
Math 128A: Homework 2 Solutions Due: June 28 1. In problems where high precision is not needed, the IEEE standard provides a specification for single precision numbers, which occupy 32 bits of storage.
More informationShort Division of Long Integers. (joint work with David Harvey)
Short Division of Long Integers (joint work with David Harvey) Paul Zimmermann October 6, 2011 The problem to be solved Divide efficiently a p-bit floating-point number by another p-bit f-p number in the
More informationThis Unit: Arithmetic. CIS 371 Computer Organization and Design. Pre-Class Exercise. Readings
This Unit: Arithmetic CI 371 Computer Organization and Design Unit 3: Arithmetic Based on slides by Prof. Amir Roth & Prof. Milo Martin App App App ystem software Mem CPU I/O A little review Binary + 2s
More informationMidterm Review. Igor Yanovsky (Math 151A TA)
Midterm Review Igor Yanovsky (Math 5A TA) Root-Finding Methods Rootfinding methods are designed to find a zero of a function f, that is, to find a value of x such that f(x) =0 Bisection Method To apply
More informationSystem Data Bus (8-bit) Data Buffer. Internal Data Bus (8-bit) 8-bit register (R) 3-bit address 16-bit register pair (P) 2-bit address
Intel 8080 CPU block diagram 8 System Data Bus (8-bit) Data Buffer Registry Array B 8 C Internal Data Bus (8-bit) F D E H L ALU SP A PC Address Buffer 16 System Address Bus (16-bit) Internal register addressing:
More informationThe final is cumulative, but with more emphasis on chapters 3 and 4. There will be two parts.
Math 141 Review for Final The final is cumulative, but with more emphasis on chapters 3 and 4. There will be two parts. Part 1 (no calculator) graphing (polynomial, rational, linear, exponential, and logarithmic
More informationNumerics and Error Analysis
Numerics and Error Analysis CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Numerics and Error Analysis 1 / 30 A Puzzle What
More informationEffect of round-off errors on the. Accuracy of randomized algorithms
Effect of round-off errors on the accuracy of randomized algorithms ÉLIAUS-PROMES (UPR 8521 CNRS) Perpignan France marc.daumas@univ-perp.fr Outline 1 2 3 4 5 Characterize the accuracy of the result of
More informationS. R. Tate. Stable Computation of the Complex Roots of Unity, IEEE Transactions on Signal Processing, Vol. 43, No. 7, 1995, pp
Stable Computation of the Complex Roots of Unity By: Stephen R. Tate S. R. Tate. Stable Computation of the Complex Roots of Unity, IEEE Transactions on Signal Processing, Vol. 43, No. 7, 1995, pp. 1709
More informationYou separate binary numbers into columns in a similar fashion. 2 5 = 32
RSA Encryption 2 At the end of Part I of this article, we stated that RSA encryption works because it s impractical to factor n, which determines P 1 and P 2, which determines our private key, d, which
More informationIntroduction CSE 541
Introduction CSE 541 1 Numerical methods Solving scientific/engineering problems using computers. Root finding, Chapter 3 Polynomial Interpolation, Chapter 4 Differentiation, Chapter 4 Integration, Chapters
More informationData Structures and Algorithms Running time and growth functions January 18, 2018
Data Structures and Algorithms Running time and growth functions January 18, 2018 Measuring Running Time of Algorithms One way to measure the running time of an algorithm is to implement it and then study
More informationMATH ASSIGNMENT 03 SOLUTIONS
MATH444.0 ASSIGNMENT 03 SOLUTIONS 4.3 Newton s method can be used to compute reciprocals, without division. To compute /R, let fx) = x R so that fx) = 0 when x = /R. Write down the Newton iteration for
More information