Automated design of floating-point logarithm functions on integer processors

Size: px
Start display at page:

Download "Automated design of floating-point logarithm functions on integer processors"

Transcription

1 23rd IEEE Symposium on Computer Arithmetic Santa Clara, CA, USA, July 2016 Automated design of floating-point logarithm functions on integer processors Guillaume Revy (presented by Florent de Dinechin) Univ. Perpignan Via Domitia, DALI project-team Univ. Montpellier 2, LIRMM, UMR 5506 CNRS, LIRMM, UMR 5506 G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 1/22

2 Summary Context and objectives Automated synthesis of floating-point function implementation particular case of logarithm functions log b (x) work done within the french ANR MetaLibm project ( FLIP software library low latency and correctly-rounded implementation VLIW integer processor of the ST200 family no exception handling Achievements 1. Unified range reduction for log b (x) implementation 2. Correctly-rounded log 2 (x), log(x), and log 10 (x) for the binary32 format 200 cycles on the ST231 G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 2/22

3 Problem statement b R > 1 (p,e min,e max ) {RN,RU,RD,RZ} G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 3/22

4 Problem statement x = m 2 e, x > 0 b R > 1 (p,e min,e max ) {RN,RU,RD,RZ} integer arithmetic r = (log b (x)) input/output: ±0, ±, NaN, subnormal and normal numbers G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 3/22

5 Problem statement x = m 2 e, x > 0 b R > 1 (p,e min,e max ) {RN,RU,RD,RZ} integer arithmetic r = (log b (x)) input/output: ±0, ±, NaN, subnormal and normal numbers Our focus: IEEE-754 binary32 implementations on the ST231 b {2,exp(1),10} 4-issue VLIW 32-bit integer processor (p,emin,e max ) = (24, 126,127) 1-cycle ALU / 3-cycle MUL = RN 64-bit arithmetic emulated in software no underflow nor overflow high latency memory accesses G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 3/22

6 Outline of the talk 1. Unified evaluation scheme for log b (x) 2. Error analysis 3. Implementation and results 4. Concluding remarks and future work G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 4/22

7 Unified evaluation scheme for log b (x) Outline of the talk 1. Unified evaluation scheme for log b (x) 2. Error analysis 3. Implementation and results 4. Concluding remarks and future work G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 5/22

8 Unified evaluation scheme for log b (x) Logarithm range reduction Let x be a non-negative normal floating-point number, with x 1 log b (x) = 2 d u RN(log b (x)) = ( 1) sign(u) 2 d RN( u ) with u [1,2) and d {e min,,e max } G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 6/22

9 Unified evaluation scheme for log b (x) Logarithm range reduction Let x be a non-negative normal floating-point number, with x 1 log b (x) = 2 d u RN(log b (x)) = ( 1) sign(u) 2 d RN( u ) with u [1,2) and d {e min,,e max } x = 2 e m log b (x) = log b (2) e + log b (m), with m [1,2] catastrophic cancellation when e = 1 G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 6/22

10 Unified evaluation scheme for log b (x) Logarithm range reduction Let x be a non-negative normal floating-point number, with x 1 log b (x) = 2 d u RN(log b (x)) = ( 1) sign(u) 2 d RN( u ) with u [1,2) and d {e min,,e max } x = 2 e m log b (x) = log b (2) e + log b (m), with m [1,2] catastrophic cancellation when e = 1 Range reduction: define τ = 1 if m 1.5, 0 otherwise log b (x) = log b (2) (e + τ) + log b (m/2 τ ), with m/2 τ [0.75,1.5] no branch instruction G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 6/22

11 Unified evaluation scheme for log b (x) Logarithm range reduction Let x be a non-negative normal floating-point number, with x 1 log b (x) = 2 d u RN(log b (x)) = ( 1) sign(u) 2 d RN( u ) with u [1,2) and d {e min,,e max } x = 2 e m log b (x) = log b (2) e + log b (m), with m [1,2] catastrophic cancellation when e = 1 Range reduction: define τ = 1 if m 1.5, 0 otherwise log b (x) = log b (2) (e + τ) + log b (m/2 τ ), with m/2 τ [0.75,1.5] }{{} u 2 d no branch instruction G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 6/22

12 Unified evaluation scheme for log b (x) Evaluation scheme: log b (2) (e + τ) Let t = m/2 τ 1 [ 0.25,0.5], the objective is to evaluate log b (x) = log b (2) (e + τ) + log b (1 + t) G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 7/22

13 Unified evaluation scheme for log b (x) Evaluation scheme: log b (2) (e + τ) Let t = m/2 τ 1 [ 0.25,0.5], the objective is to evaluate log b (x) = log b (2) (e + τ) + log b (1 + t) log b (2) known at compile time: for example log 10 (2) = log 10 (2) = G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 7/22

14 Unified evaluation scheme for log b (x) Evaluation scheme: log b (2) (e + τ) Let t = m/2 τ 1 [ 0.25,0.5], the objective is to evaluate log b (x) = log b (2) (e + τ) + log b (1 + t) log b (2) known at compile time: for example log 10 (2) = log 10 (2) = stored as a signed integer ( ) 2 on k bits, associated with an implicit scaling factor = 2 30 G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 7/22

15 Unified evaluation scheme for log b (x) Evaluation scheme: log b (2) (e + τ) Let t = m/2 τ 1 [ 0.25,0.5], the objective is to evaluate log b (x) = log b (2) (e + τ) + log b (1 + t) log b (2) known at compile time: for example log 10 (2) = log 10 (2) = stored as a signed integer ( ) 2 on k bits, associated with an implicit scaling factor = 2 30 Hence, rewrite log b (2) as log b (2) = 2 µ ϕ, with 1 ϕ < 2 scaling done statically at synthesis-time G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 7/22

16 Unified evaluation scheme for log b (x) Evaluation scheme: log b (2) (e + τ) Let t = m/2 τ 1 [ 0.25,0.5], the objective is to evaluate log b (x) = log b (2) (e + τ) + log b (1 + t) log b (2) known at compile time: for example log 10 (2) = log 10 (2) = stored as a signed integer ( ) 2 on k bits, associated with an implicit scaling factor = 2 30 Hence, rewrite log b (2) as log b (2) = 2 µ ϕ, with 1 ϕ < 2 scaling done statically at synthesis-time The formula becomes log b (x) = 2 µ (ϕ (e + τ) + log b (1 + t)/2 µ) G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 7/22

17 Unified evaluation scheme for log b (x) Evaluation scheme: log b (m/2 τ ) Table lookup-based method Tang (1990) and Gal (1991) tabulated values of log(x) combined with small degree polynomial evaluation requires storing tabulated values on memory Polynomial evaluation-based method Schulte and Swartzlander (1993) univariate polynomial to compte log 2 (x) our approach: particular bivariate polynomial G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 8/22

18 Unified evaluation scheme for log b (x) Polynomial approximation The formula so far log b (x) = 2 µ (ϕ (e + τ) + log b (1 + t)/2 µ) G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 9/22

19 Unified evaluation scheme for log b (x) Polynomial approximation The formula so far log b (x) = 2 µ (ϕ (e + τ) + log b (1 + t)/2 µ) For example, when (b,µ) = (10, 2) log 10 (1 + ( ))/2 µ = log 10 ( )/2 µ = G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 9/22

20 Unified evaluation scheme for log b (x) Polynomial approximation The formula so far log b (x) = 2 µ (ϕ (e + τ) + log b (1 + t)/2 µ) For example, when (b,µ) = (10, 2) log 10 (1 + ( ))/2 µ = log 10 ( )/2 µ = when e = 0, we need to compute log 10 (1 + t)/2 µ on a large number of bits to get sufficient accuracy after renormalization G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 9/22

21 Unified evaluation scheme for log b (x) Polynomial approximation The formula so far log b (x) = 2 µ (ϕ (e + τ) + log b (1 + t)/2 µ) For example, when (b,µ) = (10, 2) log 10 (1 + ( ))/2 µ = log 10 ( )/2 µ = when e = 0, we need to compute log 10 (1 + t)/2 µ on a large number of bits to get sufficient accuracy after renormalization Hence rewrite t = 2 δ h, where h 0.5 (δ is the leading zero count on t, and h is t shifted by δ bits ) log b (1 + t) 2 µ = 2 δ h log b(1 + t) 2 µ t with log b (1 + t) 2 µ (1,4) t scaling done dynamically at evaluation-time G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 9/22

22 Unified evaluation scheme for log b (x) Evaluation scheme Objective: compute RN(log b (x)), with ( log b (x) = 2 µ ϕ (e + τ) + 2 δ h log b(1 + t) ) 2 µ t G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 10/22

23 Unified evaluation scheme for log b (x) Evaluation scheme Objective: compute RN(log b (x)), with ( log b (x) = 2 µ ϕ (e + τ) + 2 δ h log b(1 + t) ) 2 µ t Main step and d = c + µ ϕ (e + τ) + 2 δ h log b(1 + t) 2 µ t }{{} u 2 c G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 10/22

24 Unified evaluation scheme for log b (x) Evaluation scheme Objective: compute RN(log b (x)), with ( log b (x) = 2 µ ϕ (e + τ) + 2 δ h log b(1 + t) ) 2 µ t Main step and d = c + µ ϕ (e + τ) + 2 δ h log b(1 + t) 2 µ t }{{} u 2 c The value u is approximated by a finite-precision value û, such that u û < ε. G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 10/22

25 Error analysis Outline of the talk 1. Unified evaluation scheme for log b (x) 2. Error analysis 3. Implementation and results 4. Concluding remarks and future work G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 11/22

26 Error analysis Three sources of error Objective: approximate u by a finite precision value û ( u = 2 c ϕ (e + τ) + 2 δ h log b(1 + t) ) 2 µ t G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 12/22

27 Error analysis Three sources of error Objective: approximate u by a finite precision value û ( u = 2 c ϕ (e + τ) + 2 δ h log b(1 + t) ) 2 µ t ϕ ˆϕ 2 2 k ( 2 c ˆϕ (e + τ) + 2 δ h log b(1 + t) ) 2 µ t G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 12/22

28 Error analysis Three sources of error Objective: approximate u by a finite precision value û ( u = 2 c ϕ (e + τ) + 2 δ h log b(1 + t) ) 2 µ t ϕ ˆϕ 2 2 k ( 2 c ˆϕ (e + τ) + 2 δ h log b(1 + t) ) 2 µ t α(a) ( ) 2 c ˆϕ (e + τ) + 2 δ h a(t) G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 12/22

29 Error analysis Three sources of error Objective: approximate u by a finite precision value û ( u = 2 c ϕ (e + τ) + 2 δ h log b(1 + t) ) 2 µ t ϕ ˆϕ 2 2 k ( 2 c ˆϕ (e + τ) + 2 δ h log b(1 + t) ) 2 µ t α(a) ( 2 c ( û = 2 c ) ˆϕ (e + τ) + 2 δ h a(t) ρ(p ) ˆϕ (e + τ) 2 δ h a(t) }{{} bivariate polynomial ) G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 12/22

30 Error analysis Sufficient error bounds By triangular inequality u û 2 6 k + (2 2 3 p ) α(a) + ρ(p ) To compute û such that u û < ε, we have to ensure 2 6 k + (2 2 3 p ) α(a) + ρ(p ) < ε G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 13/22

31 Error analysis Sufficient error bounds By triangular inequality u û 2 6 k + (2 2 3 p ) α(a) + ρ(p ) To compute û such that u û < ε, we have to ensure 2 6 k + (2 2 3 p ) α(a) + ρ(p ) < ε More particularly, we have to compute a polynomial approximant a(t) such that α(a) < ε 26 k p G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 13/22

32 Error analysis Sufficient error bounds By triangular inequality u û 2 6 k + (2 2 3 p ) α(a) + ρ(p ) To compute û such that u û < ε, we have to ensure 2 6 k + (2 2 3 p ) α(a) + ρ(p ) < ε More particularly, we have to compute a polynomial approximant a(t) such that α(a) θ with θ < ε 26 k p G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 13/22

33 Error analysis Sufficient error bounds By triangular inequality u û 2 6 k + (2 2 3 p ) α(a) + ρ(p ) To compute û such that u û < ε, we have to ensure 2 6 k + (2 2 3 p ) α(a) + ρ(p ) < ε More particularly, we have to compute a polynomial approximant a(t) such that α(a) θ with θ < ε 26 k p And finally, we have to find a finite-precision evaluation program P ρ(p ) < ε 2 6 k (2 2 3 p ) θ G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 13/22

34 Error analysis Computation of the required error bound Objective: compute û such that u û < ε to ensure correct rounding Table s Maker Dilemma exhaustive searching feasible for the binary32 floating-point format (with p = 24) G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 14/22

35 Error analysis Computation of the required error bound Objective: compute û such that u û < ε to ensure correct rounding Table s Maker Dilemma exhaustive searching feasible for the binary32 floating-point format (with p = 24) Exhaustive search smallest distance between log b (x ) and the middle of two floating-point numbers in precision p, for all floating-point inputs x breakpoints in precision p d ˆv v log b (x ) G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 14/22

36 Error analysis Computation of the required error bound Objective: compute û such that u û < ε to ensure correct rounding Table s Maker Dilemma exhaustive searching feasible for the binary32 floating-point format (with p = 24) Exhaustive search smallest distance between log b (x ) and the middle of two floating-point numbers in precision p, for all floating-point inputs x b 2 exp(1) 10 error bound: log 2 (ε) for example: x = log(x) = }{{} 5 58 bits G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 14/22

37 Implementation and results Outline of the talk 1. Unified evaluation scheme for log b (x) 2. Error analysis 3. Implementation and results 4. Concluding remarks and future work G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 15/22

38 Implementation and results Determining word-length k and polynomial degree Computation of a polynomial approximant a(t), such that α(a) < ε 26 k ε 26 k =, since p = p The word-length k must be chosen such that this bound remains non-negative b 2 exp(1) 10 error bound: log 2 (ε) value of k G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 16/22

39 Implementation and results Determining word-length k and polynomial degree Computation of a polynomial approximant a(t), such that α(a) < ε 26 k ε 26 k =, since p = p The word-length k must be chosen such that this bound remains non-negative b 2 exp(1) 10 error bound: log 2 (ε) value of k For log(x), only three inputs required more accuracy than 2 56 we first ignore these inputs at synthesis-time then we check if the implementation returns the correct answer for these inputs interest: reducing the word-length k to 64 G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 16/22

40 Implementation and results Design space exploration: polynomial evaluator Given ε, the synthesis process is the following 1. determine the smallest degree of a(t) so as the error bound is satisfied b 2 exp(1) 10 error bound: log 2 (ε) smallest degree of a (22) build polynomial approximant and compute the approximation error bound 3. build evaluation program (CGPE), and check its evaluation error (Gappa) explore Horner, Estrin, and other parallel schemes if no program can be found, increase the polynomial degree G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 17/22

41 Implementation and results Correctly-rounded binary32 implementations Latency (# cycles) Horner Horner-2 Horner log 2 (x) log(x) log 10 (x) G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 18/22

42 Implementation and results Faithful binary32/64 implementations Latency (# cycles) Horner Horner-2 Horner-4 Estrin Estrin rule exposes much more ILP than the other schemes register spilling in binary log 2 (x) log(x) log 10 (x) log 2 (x) log(x) log 10 (x) binary32 binary64 G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 19/22

43 Implementation and results Faithful binary32/64 implementations 5 4 IPC (# instructions per cycle) Horner Horner-2 Horner-4 Estrin Estrin rule exposes much more ILP than the other schemes register spilling in binary64 0 log 2 (x) log(x) log 10 (x) log 2 (x) log(x) log 10 (x) binary32 binary64 G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 19/22

44 Concluding remarks and future work Outline of the talk 1. Unified evaluation scheme for log b (x) 2. Error analysis 3. Implementation and results 4. Concluding remarks and future work G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 20/22

45 Concluding remarks and future work Conclusion remarks and future work Work done so far Automated design of floating-point implementation of logarithm functions unified range reduction polynomial evaluation-based algorithm Correctly-rounded binary32 implementation in 200 cycles on the ST231 FDLibm over SoftFloat was cycles FDLibm over FLIP was about 1000 cycles Extension to faithful binary32/64 implementations Future work is threefold Evaluate the performance of this approach on other architectures Extend this approach to other transcendental functions, like exp b (x) Study the impact on performances of the polynomial evaluation scheme G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 21/22

46 23rd IEEE Symposium on Computer Arithmetic Santa Clara, CA, USA, July 2016 Automated design of floating-point logarithm functions on integer processors Guillaume Revy (presented by Florent de Dinechin) Univ. Perpignan Via Domitia, DALI project-team Univ. Montpellier 2, LIRMM, UMR 5506 CNRS, LIRMM, UMR 5506 G. Revy (DALI UPVD/LIRMM, UM2, CNRS) Automated design of floating-point logarithm functions on integer processors 22/22

Automated design of floating-point logarithm functions on integer processors

Automated design of floating-point logarithm functions on integer processors Automated design of floating-point logarithm functions on integer processors Guillaume Revy To cite this version: Guillaume Revy. Automated design of floating-point logarithm functions on integer processors.

More information

Transformation of a PID Controller for Numerical Accuracy

Transformation of a PID Controller for Numerical Accuracy Replace this file with prentcsmacro.sty for your meeting, or with entcsmacro.sty for your meeting. Both can be found at the ENTCS Macro Home Page. Transformation of a PID Controller for Numerical Accuracy

More information

Computing correctly rounded logarithms with fixed-point operations. Julien Le Maire, Florent de Dinechin, Jean-Michel Muller and Nicolas Brunie

Computing correctly rounded logarithms with fixed-point operations. Julien Le Maire, Florent de Dinechin, Jean-Michel Muller and Nicolas Brunie Computing correctly rounded logarithms with fixed-point operations Julien Le Maire, Florent de Dinechin, Jean-Michel Muller and Nicolas Brunie Outline Introduction and context Algorithm Results and comparisons

More information

Table-Based Polynomials for Fast Hardware Function Evaluation

Table-Based Polynomials for Fast Hardware Function Evaluation ASAP 05 Table-Based Polynomials for Fast Hardware Function Evaluation Jérémie Detrey Florent de Dinechin Projet Arénaire LIP UMR CNRS ENS Lyon UCB Lyon INRIA 5668 http://www.ens-lyon.fr/lip/arenaire/ CENTRE

More information

Second Order Function Approximation Using a Single Multiplication on FPGAs

Second Order Function Approximation Using a Single Multiplication on FPGAs FPL 04 Second Order Function Approximation Using a Single Multiplication on FPGAs Jérémie Detrey Florent de Dinechin Projet Arénaire LIP UMR CNRS ENS Lyon UCB Lyon INRIA 5668 http://www.ens-lyon.fr/lip/arenaire/

More information

Notes for Chapter 1 of. Scientific Computing with Case Studies

Notes for Chapter 1 of. Scientific Computing with Case Studies Notes for Chapter 1 of Scientific Computing with Case Studies Dianne P. O Leary SIAM Press, 2008 Mathematical modeling Computer arithmetic Errors 1999-2008 Dianne P. O'Leary 1 Arithmetic and Error What

More information

Newton-Raphson Algorithms for Floating-Point Division Using an FMA

Newton-Raphson Algorithms for Floating-Point Division Using an FMA Newton-Raphson Algorithms for Floating-Point Division Using an FMA Nicolas Louvet, Jean-Michel Muller, Adrien Panhaleux Abstract Since the introduction of the Fused Multiply and Add (FMA) in the IEEE-754-2008

More information

Designing a Correct Numerical Algorithm

Designing a Correct Numerical Algorithm Intro Implem Errors Sollya Gappa Norm Conc Christoph Lauter Guillaume Melquiond March 27, 2013 Intro Implem Errors Sollya Gappa Norm Conc Outline 1 Introduction 2 Implementation theory 3 Error analysis

More information

Arithmetic and Error. How does error arise? How does error arise? Notes for Part 1 of CMSC 460

Arithmetic and Error. How does error arise? How does error arise? Notes for Part 1 of CMSC 460 Notes for Part 1 of CMSC 460 Dianne P. O Leary Preliminaries: Mathematical modeling Computer arithmetic Errors 1999-2006 Dianne P. O'Leary 1 Arithmetic and Error What we need to know about error: -- how

More information

Optimizing Scientific Libraries for the Itanium

Optimizing Scientific Libraries for the Itanium 0 Optimizing Scientific Libraries for the Itanium John Harrison Intel Corporation Gelato Federation Meeting, HP Cupertino May 25, 2005 1 Quick summary Intel supplies drop-in replacement versions of common

More information

ECS 231 Computer Arithmetic 1 / 27

ECS 231 Computer Arithmetic 1 / 27 ECS 231 Computer Arithmetic 1 / 27 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 2 / 27 Outline 1 Floating-point numbers

More information

Exploring the Potential of Instruction-Level Parallelism of Exposed Datapath Architectures with Buffered Processing Units

Exploring the Potential of Instruction-Level Parallelism of Exposed Datapath Architectures with Buffered Processing Units Exploring the Potential of Instruction-Level Parallelism of Exposed Datapath Architectures with Buffered Processing Units Anoop Bhagyanath and Klaus Schneider Embedded Systems Chair University of Kaiserslautern

More information

1 Floating point arithmetic

1 Floating point arithmetic Introduction to Floating Point Arithmetic Floating point arithmetic Floating point representation (scientific notation) of numbers, for example, takes the following form.346 0 sign fraction base exponent

More information

Accurate Solution of Triangular Linear System

Accurate Solution of Triangular Linear System SCAN 2008 Sep. 29 - Oct. 3, 2008, at El Paso, TX, USA Accurate Solution of Triangular Linear System Philippe Langlois DALI at University of Perpignan France Nicolas Louvet INRIA and LIP at ENS Lyon France

More information

Floating-point Computation

Floating-point Computation Chapter 2 Floating-point Computation 21 Positional Number System An integer N in a number system of base (or radix) β may be written as N = a n β n + a n 1 β n 1 + + a 1 β + a 0 = P n (β) where a i are

More information

MATH ASSIGNMENT 03 SOLUTIONS

MATH ASSIGNMENT 03 SOLUTIONS MATH444.0 ASSIGNMENT 03 SOLUTIONS 4.3 Newton s method can be used to compute reciprocals, without division. To compute /R, let fx) = x R so that fx) = 0 when x = /R. Write down the Newton iteration for

More information

4.2 Floating-Point Numbers

4.2 Floating-Point Numbers 101 Approximation 4.2 Floating-Point Numbers 4.2 Floating-Point Numbers The number 3.1416 in scientific notation is 0.31416 10 1 or (as computer output) -0.31416E01..31416 10 1 exponent sign mantissa base

More information

On various ways to split a floating-point number

On various ways to split a floating-point number On various ways to split a floating-point number Claude-Pierre Jeannerod Jean-Michel Muller Paul Zimmermann Inria, CNRS, ENS Lyon, Université de Lyon, Université de Lorraine France ARITH-25 June 2018 -2-

More information

Hardware Operator for Simultaneous Sine and Cosine Evaluation

Hardware Operator for Simultaneous Sine and Cosine Evaluation Hardware Operator for Simultaneous Sine and Cosine Evaluation Arnaud Tisserand To cite this version: Arnaud Tisserand. Hardware Operator for Simultaneous Sine and Cosine Evaluation. ICASSP 6: International

More information

Computation of the error functions erf and erfc in arbitrary precision with correct rounding

Computation of the error functions erf and erfc in arbitrary precision with correct rounding Computation of the error functions erf and erfc in arbitrary precision with correct rounding Sylvain Chevillard Arenaire, LIP, ENS-Lyon, France Sylvain.Chevillard@ens-lyon.fr Nathalie Revol INRIA, Arenaire,

More information

Elements of Floating-point Arithmetic

Elements of Floating-point Arithmetic Elements of Floating-point Arithmetic Sanzheng Qiao Department of Computing and Software McMaster University July, 2012 Outline 1 Floating-point Numbers Representations IEEE Floating-point Standards Underflow

More information

Homework 2 Foundations of Computational Math 1 Fall 2018

Homework 2 Foundations of Computational Math 1 Fall 2018 Homework 2 Foundations of Computational Math 1 Fall 2018 Note that Problems 2 and 8 have coding in them. Problem 2 is a simple task while Problem 8 is very involved (and has in fact been given as a programming

More information

Accurate polynomial evaluation in floating point arithmetic

Accurate polynomial evaluation in floating point arithmetic in floating point arithmetic Université de Perpignan Via Domitia Laboratoire LP2A Équipe de recherche en Informatique DALI MIMS Seminar, February, 10 th 2006 General motivation Provide numerical algorithms

More information

Elements of Floating-point Arithmetic

Elements of Floating-point Arithmetic Elements of Floating-point Arithmetic Sanzheng Qiao Department of Computing and Software McMaster University July, 2012 Outline 1 Floating-point Numbers Representations IEEE Floating-point Standards Underflow

More information

Mathematical preliminaries and error analysis

Mathematical preliminaries and error analysis Mathematical preliminaries and error analysis Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan September 12, 2015 Outline 1 Round-off errors and computer arithmetic

More information

Round-off Error Analysis of Explicit One-Step Numerical Integration Methods

Round-off Error Analysis of Explicit One-Step Numerical Integration Methods Round-off Error Analysis of Explicit One-Step Numerical Integration Methods 24th IEEE Symposium on Computer Arithmetic Sylvie Boldo 1 Florian Faissole 1 Alexandre Chapoutot 2 1 Inria - LRI, Univ. Paris-Sud

More information

ALU (3) - Division Algorithms

ALU (3) - Division Algorithms HUMBOLDT-UNIVERSITÄT ZU BERLIN INSTITUT FÜR INFORMATIK Lecture 12 ALU (3) - Division Algorithms Sommersemester 2002 Leitung: Prof. Dr. Miroslaw Malek www.informatik.hu-berlin.de/rok/ca CA - XII - ALU(3)

More information

Computer Arithmetic. MATH 375 Numerical Analysis. J. Robert Buchanan. Fall Department of Mathematics. J. Robert Buchanan Computer Arithmetic

Computer Arithmetic. MATH 375 Numerical Analysis. J. Robert Buchanan. Fall Department of Mathematics. J. Robert Buchanan Computer Arithmetic Computer Arithmetic MATH 375 Numerical Analysis J. Robert Buchanan Department of Mathematics Fall 2013 Machine Numbers When performing arithmetic on a computer (laptop, desktop, mainframe, cell phone,

More information

CR-LIBM A library of correctly rounded elementary functions in double-precision

CR-LIBM A library of correctly rounded elementary functions in double-precision CR-LIBM A library of correctly rounded elementary functions in double-precision Catherine Daramy-Loirat, David Defour, Florent De Dinechin, Matthieu Gallet, Nicolas Gast, Christoph Lauter, Jean-Michel

More information

Outline. Computer Arithmetic for Cryptography in the Arith Group. LIRMM Montpellier Laboratory of Computer Science, Robotics, and Microelectronics

Outline. Computer Arithmetic for Cryptography in the Arith Group. LIRMM Montpellier Laboratory of Computer Science, Robotics, and Microelectronics Outline Computer Arithmetic for Cryptography in the Arith Group Arnaud Tisserand LIRMM, CNRS Univ. Montpellier 2 Arith Group Crypto Puces Porquerolles, April 16 18, 2007 Introduction LIRMM Laboratory Arith

More information

Innocuous Double Rounding of Basic Arithmetic Operations

Innocuous Double Rounding of Basic Arithmetic Operations Innocuous Double Rounding of Basic Arithmetic Operations Pierre Roux ISAE, ONERA Double rounding occurs when a floating-point value is first rounded to an intermediate precision before being rounded to

More information

Automating the Verification of Floating-point Algorithms

Automating the Verification of Floating-point Algorithms Introduction Interval+Error Advanced Gappa Conclusion Automating the Verification of Floating-point Algorithms Guillaume Melquiond Inria Saclay Île-de-France LRI, Université Paris Sud, CNRS 2014-07-18

More information

The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor

The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor Vincent Lefèvre 11th July 2005 Abstract This paper is a complement of the research report The Euclidean division implemented

More information

Continued fractions and number systems: applications to correctly-rounded implementations of elementary functions and modular arithmetic.

Continued fractions and number systems: applications to correctly-rounded implementations of elementary functions and modular arithmetic. Continued fractions and number systems: applications to correctly-rounded implementations of elementary functions and modular arithmetic. Mourad Gouicem PEQUAN Team, LIP6/UPMC Nancy, France May 28 th 2013

More information

CSCI-564 Advanced Computer Architecture

CSCI-564 Advanced Computer Architecture CSCI-564 Advanced Computer Architecture Lecture 8: Handling Exceptions and Interrupts / Superscalar Bo Wu Colorado School of Mines Branch Delay Slots (expose control hazard to software) Change the ISA

More information

Automatic generation of polynomial-based hardware architectures for function evaluation

Automatic generation of polynomial-based hardware architectures for function evaluation Automatic generation of polynomial-based hardware architectures for function evaluation Florent De Dinechin, Mioara Joldes, Bogdan Pasca To cite this version: Florent De Dinechin, Mioara Joldes, Bogdan

More information

Gal s Accurate Tables Method Revisited

Gal s Accurate Tables Method Revisited Gal s Accurate Tables Method Revisited Damien Stehlé UHP/LORIA 615 rue du jardin botanique F-5460 Villers-lès-Nancy Cedex stehle@loria.fr Paul Zimmermann INRIA Lorraine/LORIA 615 rue du jardin botanique

More information

Faithful Horner Algorithm

Faithful Horner Algorithm ICIAM 07, Zurich, Switzerland, 16 20 July 2007 Faithful Horner Algorithm Philippe Langlois, Nicolas Louvet Université de Perpignan, France http://webdali.univ-perp.fr Ph. Langlois (Université de Perpignan)

More information

Binary floating point

Binary floating point Binary floating point Notes for 2017-02-03 Why do we study conditioning of problems? One reason is that we may have input data contaminated by noise, resulting in a bad solution even if the intermediate

More information

Compiling Techniques

Compiling Techniques Lecture 11: Introduction to 13 November 2015 Table of contents 1 Introduction Overview The Backend The Big Picture 2 Code Shape Overview Introduction Overview The Backend The Big Picture Source code FrontEnd

More information

Comparison between binary and decimal floating-point numbers

Comparison between binary and decimal floating-point numbers 1 Comparison between binary and decimal floating-point numbers Nicolas Brisebarre, Christoph Lauter, Marc Mezzarobba, and Jean-Michel Muller Abstract We introduce an algorithm to compare a binary floating-point

More information

4ccurate 4lgorithms in Floating Point 4rithmetic

4ccurate 4lgorithms in Floating Point 4rithmetic 4ccurate 4lgorithms in Floating Point 4rithmetic Philippe LANGLOIS 1 1 DALI Research Team, University of Perpignan, France SCAN 2006 at Duisburg, Germany, 26-29 September 2006 Philippe LANGLOIS (UPVD)

More information

A 32-bit Decimal Floating-Point Logarithmic Converter

A 32-bit Decimal Floating-Point Logarithmic Converter A 3-bit Decimal Floating-Point Logarithmic Converter Dongdong Chen 1, Yu Zhang 1, Younhee Choi 1, Moon Ho Lee, Seok-Bum Ko 1, Department of Electrical and Computer Engineering, University of Saskatchewan

More information

How to Ensure a Faithful Polynomial Evaluation with the Compensated Horner Algorithm

How to Ensure a Faithful Polynomial Evaluation with the Compensated Horner Algorithm How to Ensure a Faithful Polynomial Evaluation with the Compensated Horner Algorithm Philippe Langlois, Nicolas Louvet Université de Perpignan, DALI Research Team {langlois, nicolas.louvet}@univ-perp.fr

More information

Jim Lambers MAT 610 Summer Session Lecture 2 Notes

Jim Lambers MAT 610 Summer Session Lecture 2 Notes Jim Lambers MAT 610 Summer Session 2009-10 Lecture 2 Notes These notes correspond to Sections 2.2-2.4 in the text. Vector Norms Given vectors x and y of length one, which are simply scalars x and y, the

More information

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 Multi-processor vs. Multi-computer architecture µp vs. DSP RISC vs. DSP RISC Reduced-instruction-set Register-to-register operation Higher throughput by using

More information

Computation of dot products in finite fields with floating-point arithmetic

Computation of dot products in finite fields with floating-point arithmetic Computation of dot products in finite fields with floating-point arithmetic Stef Graillat Joint work with Jérémy Jean LIP6/PEQUAN - Université Pierre et Marie Curie (Paris 6) - CNRS Computer-assisted proofs

More information

Efficient Reproducible Floating Point Summation and BLAS

Efficient Reproducible Floating Point Summation and BLAS Efficient Reproducible Floating Point Summation and BLAS Peter Ahrens Hong Diep Nguyen James Demmel Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No.

More information

Introduction to Scientific Computing Languages

Introduction to Scientific Computing Languages 1 / 21 Introduction to Scientific Computing Languages Prof. Paolo Bientinesi pauldj@aices.rwth-aachen.de Numerical Representation 2 / 21 Numbers 123 = (first 40 digits) 29 4.241379310344827586206896551724137931034...

More information

Computing Time for Summation Algorithm: Less Hazard and More Scientific Research

Computing Time for Summation Algorithm: Less Hazard and More Scientific Research Computing Time for Summation Algorithm: Less Hazard and More Scientific Research Bernard Goossens, Philippe Langlois, David Parello, Kathy Porada To cite this version: Bernard Goossens, Philippe Langlois,

More information

Special Nodes for Interface

Special Nodes for Interface fi fi Special Nodes for Interface SW on processors Chip-level HW Board-level HW fi fi C code VHDL VHDL code retargetable compilation high-level synthesis SW costs HW costs partitioning (solve ILP) cluster

More information

What Every Programmer Should Know About Floating-Point Arithmetic DRAFT. Last updated: November 3, Abstract

What Every Programmer Should Know About Floating-Point Arithmetic DRAFT. Last updated: November 3, Abstract What Every Programmer Should Know About Floating-Point Arithmetic Last updated: November 3, 2014 Abstract The article provides simple answers to the common recurring questions of novice programmers about

More information

Computing logarithms and other special functions

Computing logarithms and other special functions Computing logarithms and other special functions Mike Giles University of Oxford Mathematical Institute Napier 400 NAIS Symposium April 2, 2014 Mike Giles (Oxford) Computing special functions April 2,

More information

Contents Experimental Perturbations Introduction to Interval Arithmetic Review Questions Problems...

Contents Experimental Perturbations Introduction to Interval Arithmetic Review Questions Problems... Contents 2 How to Obtain and Estimate Accuracy 1 2.1 Basic Concepts in Error Estimation................ 1 2.1.1 Sources of Error.................... 1 2.1.2 Absolute and Relative Errors............. 4

More information

Implementation of float float operators on graphic hardware

Implementation of float float operators on graphic hardware Implementation of float float operators on graphic hardware Guillaume Da Graça David Defour DALI, Université de Perpignan Outline Why GPU? Floating point arithmetic on GPU Float float format Problem &

More information

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science EAD 115 Numerical Solution of Engineering and Scientific Problems David M. Rocke Department of Applied Science Computer Representation of Numbers Counting numbers (unsigned integers) are the numbers 0,

More information

CMP 334: Seventh Class

CMP 334: Seventh Class CMP 334: Seventh Class Performance HW 5 solution Averages and weighted averages (review) Amdahl's law Ripple-carry adder circuits Binary addition Half-adder circuits Full-adder circuits Subtraction, negative

More information

Applied Mathematics 205. Unit 0: Overview of Scientific Computing. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit 0: Overview of Scientific Computing. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit 0: Overview of Scientific Computing Lecturer: Dr. David Knezevic Scientific Computing Computation is now recognized as the third pillar of science (along with theory and experiment)

More information

Sampling exactly from the normal distribution

Sampling exactly from the normal distribution 1 Sampling exactly from the normal distribution Charles Karney charles.karney@sri.com SRI International AofA 2017, Princeton, June 20, 2017 Background: In my day job, I ve needed normal (Gaussian) deviates

More information

Chapter 4 Number Representations

Chapter 4 Number Representations Chapter 4 Number Representations SKEE2263 Digital Systems Mun im/ismahani/izam {munim@utm.my,e-izam@utm.my,ismahani@fke.utm.my} February 9, 2016 Table of Contents 1 Fundamentals 2 Signed Numbers 3 Fixed-Point

More information

Abstract. Ariane 5. Ariane 5, arithmetic overflow, exact rounding and Diophantine approximation

Abstract. Ariane 5. Ariane 5, arithmetic overflow, exact rounding and Diophantine approximation May 11, 2019 Suda Neu-Tech Institute Sanmenxia, Henan, China EV Summit Forum, Sanmenxia Suda Energy Conservation &NewEnergyTechnologyResearchInstitute Ariane 5, arithmetic overflow, exact rounding and

More information

Introduction CSE 541

Introduction CSE 541 Introduction CSE 541 1 Numerical methods Solving scientific/engineering problems using computers. Root finding, Chapter 3 Polynomial Interpolation, Chapter 4 Differentiation, Chapter 4 Integration, Chapters

More information

GF(2 m ) arithmetic: summary

GF(2 m ) arithmetic: summary GF(2 m ) arithmetic: summary EE 387, Notes 18, Handout #32 Addition/subtraction: bitwise XOR (m gates/ops) Multiplication: bit serial (shift and add) bit parallel (combinational) subfield representation

More information

Fixed-Point Trigonometric Functions on FPGAs

Fixed-Point Trigonometric Functions on FPGAs Fixed-Point Trigonometric Functions on FPGAs Florent de Dinechin Matei Iştoan Guillaume Sergent LIP, Université de Lyon (CNRS/ENS-Lyon/INRIA/UCBL) 46, allée d Italie, 69364 Lyon Cedex 07 June 14th, 2013

More information

Round-off Errors and Computer Arithmetic - (1.2)

Round-off Errors and Computer Arithmetic - (1.2) Round-off Errors and Comuter Arithmetic - (.). Round-off Errors: Round-off errors is roduced when a calculator or comuter is used to erform real number calculations. That is because the arithmetic erformed

More information

Numerical Methods - Preliminaries

Numerical Methods - Preliminaries Numerical Methods - Preliminaries Y. K. Goh Universiti Tunku Abdul Rahman 2013 Y. K. Goh (UTAR) Numerical Methods - Preliminaries 2013 1 / 58 Table of Contents 1 Introduction to Numerical Methods Numerical

More information

Table-based polynomials for fast hardware function evaluation

Table-based polynomials for fast hardware function evaluation Table-based polynomials for fast hardware function evaluation Jérémie Detrey, Florent de Dinechin LIP, École Normale Supérieure de Lyon 46 allée d Italie 69364 Lyon cedex 07, France E-mail: {Jeremie.Detrey,

More information

Efficient Function Approximation Using Truncated Multipliers and Squarers

Efficient Function Approximation Using Truncated Multipliers and Squarers Efficient Function Approximation Using Truncated Multipliers and Squarers E. George Walters III Lehigh University Bethlehem, PA, USA waltersg@ieee.org Michael J. Schulte University of Wisconsin Madison

More information

Toward Correctly Rounded Transcendentals

Toward Correctly Rounded Transcendentals IEEE TRANSACTIONS ON COMPUTERS, VOL. 47, NO. 11, NOVEMBER 1998 135 Toward Correctly Rounded Transcendentals Vincent Lefèvre, Jean-Michel Muller, Member, IEEE, and Arnaud Tisserand Abstract The Table Maker

More information

McBits: Fast code-based cryptography

McBits: Fast code-based cryptography McBits: Fast code-based cryptography Peter Schwabe Radboud University Nijmegen, The Netherlands Joint work with Daniel Bernstein, Tung Chou December 17, 2013 IMA International Conference on Cryptography

More information

Lecture 4 Modeling, Analysis and Simulation in Logic Design. Dr. Yinong Chen

Lecture 4 Modeling, Analysis and Simulation in Logic Design. Dr. Yinong Chen Lecture 4 Modeling, Analysis and Simulation in Logic Design Dr. Yinong Chen The Engineering Design Process Define Problem and requirement Research Define Alternative solutions CAD Modeling Analysis Simulation

More information

Efficient Polynomial Evaluation Algorithm and Implementation on FPGA

Efficient Polynomial Evaluation Algorithm and Implementation on FPGA Efficient Polynomial Evaluation Algorithm and Implementation on FPGA by Simin Xu School of Computer Engineering A thesis submitted to Nanyang Technological University in partial fullfillment of the requirements

More information

An accurate algorithm for evaluating rational functions

An accurate algorithm for evaluating rational functions An accurate algorithm for evaluating rational functions Stef Graillat To cite this version: Stef Graillat. An accurate algorithm for evaluating rational functions. 2017. HAL Id: hal-01578486

More information

1 Finding and fixing floating point problems

1 Finding and fixing floating point problems Notes for 2016-09-09 1 Finding and fixing floating point problems Floating point arithmetic is not the same as real arithmetic. Even simple properties like associativity or distributivity of addition and

More information

Chapter 1 Error Analysis

Chapter 1 Error Analysis Chapter 1 Error Analysis Several sources of errors are important for numerical data processing: Experimental uncertainty: Input data from an experiment have a limited precision. Instead of the vector of

More information

Performance, Power & Energy. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So

Performance, Power & Energy. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So Performance, Power & Energy ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So Recall: Goal of this class Performance Reconfiguration Power/ Energy H. So, Sp10 Lecture 3 - ELEC8106/6102 2 PERFORMANCE EVALUATION

More information

Modular Multiplication in GF (p k ) using Lagrange Representation

Modular Multiplication in GF (p k ) using Lagrange Representation Modular Multiplication in GF (p k ) using Lagrange Representation Jean-Claude Bajard, Laurent Imbert, and Christophe Nègre Laboratoire d Informatique, de Robotique et de Microélectronique de Montpellier

More information

A technique for DDA seed shifting and scaling

A technique for DDA seed shifting and scaling A technique for DDA seed shifting and scaling John Kerl Feb 8, 2001 Abstract This paper describes a simple, unified technique for DDA seed shifting and scaling. As well, the terms DDA, seed, shifting and

More information

Rigorous Polynomial Approximations and Applications

Rigorous Polynomial Approximations and Applications Rigorous Polynomial Approximations and Applications Mioara Joldeș under the supervision of: Nicolas Brisebarre and Jean-Michel Muller École Normale Supérieure de Lyon, Arénaire Team, Laboratoire de l Informatique

More information

Proposal to Improve Data Format Conversions for a Hybrid Number System Processor

Proposal to Improve Data Format Conversions for a Hybrid Number System Processor Proceedings of the 11th WSEAS International Conference on COMPUTERS, Agios Nikolaos, Crete Island, Greece, July 6-8, 007 653 Proposal to Improve Data Format Conversions for a Hybrid Number System Processor

More information

Formal verification of IA-64 division algorithms

Formal verification of IA-64 division algorithms Formal verification of IA-64 division algorithms 1 Formal verification of IA-64 division algorithms John Harrison Intel Corporation IA-64 overview HOL Light overview IEEE correctness Division on IA-64

More information

QUADRATIC PROGRAMMING?

QUADRATIC PROGRAMMING? QUADRATIC PROGRAMMING? WILLIAM Y. SIT Department of Mathematics, The City College of The City University of New York, New York, NY 10031, USA E-mail: wyscc@cunyvm.cuny.edu This is a talk on how to program

More information

8/13/16. Data analysis and modeling: the tools of the trade. Ø Set of numbers. Ø Binary representation of numbers. Ø Floating points.

8/13/16. Data analysis and modeling: the tools of the trade. Ø Set of numbers. Ø Binary representation of numbers. Ø Floating points. Data analysis and modeling: the tools of the trade Patrice Koehl Department of Biological Sciences National University of Singapore http://www.cs.ucdavis.edu/~koehl/teaching/bl5229 koehl@cs.ucdavis.edu

More information

A new multiplication algorithm for extended precision using floating-point expansions. Valentina Popescu, Jean-Michel Muller,Ping Tak Peter Tang

A new multiplication algorithm for extended precision using floating-point expansions. Valentina Popescu, Jean-Michel Muller,Ping Tak Peter Tang A new multiplication algorithm for extended precision using floating-point expansions Valentina Popescu, Jean-Michel Muller,Ping Tak Peter Tang ARITH 23 July 2016 AMPAR CudA Multiple Precision ARithmetic

More information

Exploiting In-Memory Processing Capabilities for Density Functional Theory Applications

Exploiting In-Memory Processing Capabilities for Density Functional Theory Applications Exploiting In-Memory Processing Capabilities for Density Functional Theory Applications 2016 Aug 23 P. F. Baumeister, T. Hater, D. Pleiter H. Boettiger, T. Maurer, J. R. Brunheroto Contributors IBM R&D

More information

Curve41417: Karatsuba revisited

Curve41417: Karatsuba revisited Curve41417: Karatsuba revisited Chitchanok Chuengsatiansup Technische Universiteit Eindhoven September 25, 2014 Joint work with Daniel J. Bernstein and Tanja Lange Chitchanok Chuengsatiansup Curve41417:

More information

Arithmetic Operators for Pairing-Based Cryptography

Arithmetic Operators for Pairing-Based Cryptography Arithmetic Operators for Pairing-Based Cryptography J.-L. Beuchat 1 N. Brisebarre 2 J. Detrey 3 E. Okamoto 1 1 University of Tsukuba, Japan 2 École Normale Supérieure de Lyon, France 3 Cosec, b-it, Bonn,

More information

Isolating critical cases for reciprocals using integer factorization. John Harrison Intel Corporation ARITH-16 Santiago de Compostela 17th June 2003

Isolating critical cases for reciprocals using integer factorization. John Harrison Intel Corporation ARITH-16 Santiago de Compostela 17th June 2003 0 Isolating critical cases for reciprocals using integer factorization John Harrison Intel Corporation ARITH-16 Santiago de Compostela 17th June 2003 1 result before the final rounding Background Suppose

More information

Accelerated Shift-and-Add algorithms

Accelerated Shift-and-Add algorithms c,, 1 12 () Kluwer Academic Publishers, Boston. Manufactured in The Netherlands. Accelerated Shift-and-Add algorithms N. REVOL AND J.-C. YAKOUBSOHN nathalie.revol@univ-lille1.fr, yak@cict.fr Lab. ANO,

More information

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Catie Baker Spring 2015

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Catie Baker Spring 2015 CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis Catie Baker Spring 2015 Today Registration should be done. Homework 1 due 11:59pm next Wednesday, April 8 th. Review math

More information

International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research)

International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational

More information

[2] Predicting the direction of a branch is not enough. What else is necessary?

[2] Predicting the direction of a branch is not enough. What else is necessary? [2] What are the two main ways to define performance? [2] Predicting the direction of a branch is not enough. What else is necessary? [2] The power consumed by a chip has increased over time, but the clock

More information

CSCE 564, Fall 2001 Notes 6 Page 1 13 Random Numbers The great metaphysical truth in the generation of random numbers is this: If you want a function

CSCE 564, Fall 2001 Notes 6 Page 1 13 Random Numbers The great metaphysical truth in the generation of random numbers is this: If you want a function CSCE 564, Fall 2001 Notes 6 Page 1 13 Random Numbers The great metaphysical truth in the generation of random numbers is this: If you want a function that is reasonably random in behavior, then take any

More information

Hakim Weatherspoon CS 3410 Computer Science Cornell University

Hakim Weatherspoon CS 3410 Computer Science Cornell University Hakim Weatherspoon CS 3410 Computer Science Cornell University The slides are the product of many rounds of teaching CS 3410 by Professors Weatherspoon, Bala, Bracy, and Sirer. memory inst 32 register

More information

Lecture 3, Performance

Lecture 3, Performance Repeating some definitions: Lecture 3, Performance CPI MHz MIPS MOPS Clocks Per Instruction megahertz, millions of cycles per second Millions of Instructions Per Second = MHz / CPI Millions of Operations

More information

Numerics and Error Analysis

Numerics and Error Analysis Numerics and Error Analysis CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Numerics and Error Analysis 1 / 30 A Puzzle What

More information

iretilp : An efficient incremental algorithm for min-period retiming under general delay model

iretilp : An efficient incremental algorithm for min-period retiming under general delay model iretilp : An efficient incremental algorithm for min-period retiming under general delay model Debasish Das, Jia Wang and Hai Zhou EECS, Northwestern University, Evanston, IL 60201 Place and Route Group,

More information

How do computers represent numbers?

How do computers represent numbers? How do computers represent numbers? Tips & Tricks Week 1 Topics in Scientific Computing QMUL Semester A 2017/18 1/10 What does digital mean? The term DIGITAL refers to any device that operates on discrete

More information

Compute the behavior of reality even if it is impossible to observe the processes (for example a black hole in astrophysics).

Compute the behavior of reality even if it is impossible to observe the processes (for example a black hole in astrophysics). 1 Introduction Read sections 1.1, 1.2.1 1.2.4, 1.2.6, 1.3.8, 1.3.9, 1.4. Review questions 1.1 1.6, 1.12 1.21, 1.37. The subject of Scientific Computing is to simulate the reality. Simulation is the representation

More information

Formal Verification of Mathematical Algorithms

Formal Verification of Mathematical Algorithms Formal Verification of Mathematical Algorithms 1 Formal Verification of Mathematical Algorithms John Harrison Intel Corporation The cost of bugs Formal verification Levels of verification HOL Light Formalizing

More information