Floating Point Arithmetic and Rounding Error Analysis
Floating Point Arithmetic and Rounding Error Analysis
Claude-Pierre Jeannerod, Nathalie Revol
Inria, LIP, ENS de Lyon
Context

Starting point: how do numerical algorithms behave in finite precision arithmetic?
Typically, basic matrix computations (Ax = b, ...) in floating-point arithmetic as specified by IEEE 754.

Finite precision means each operation (+, −, ×, /, √) may commit a rounding error.
What is the effect of all such errors on the computed solution x̂?
Rounding error analysis

Old and nontrivial question [von Neumann, Turing, Wilkinson, ...]

In this lecture, two approaches:

A priori analysis (this lecture):
  Goal: a bound on |x̂ − x| / |x| for any input and format.
  Tool: the many nice properties of floating-point.
  Ideal: a readable, provably tight bound + a short proof.

A posteriori, automatic analysis (Nathalie's lecture):
  Goal: x̂ and an enclosure of x̂ − x for a given input and format.
  Tool: interval arithmetic based on floating-point.
  Ideal: a narrow interval, computed fast.
Outline
Context
Floating-point arithmetic
A priori analysis
Conclusion
Floating-point arithmetic

An approximation of arithmetic over R.
1940s: first implementations [Zuse's computers].
1985: full specification [IEEE 754 standard].
Today: IEEE arithmetic everywhere!

Although often considered as fuzzy, it is highly structured and has many nice mathematical properties.
How to exploit these properties for rigorous analyses?
Floating-point numbers

Rational numbers of the form M · β^E, where M, E ∈ Z, |M| < β^p, E + p − 1 ∈ [e_min, e_max]:
base β, precision p, exponent range defined by e_min and e_max.

Floats in C have β = 2, p = 24, and [e_min, e_max] = [−126, 127].
Floating-point numbers

We assume e_min = −∞ and e_max = +∞ (unbounded exponent range), β even, and p ≥ 2.

Definition: the set F of floating-point numbers in base β and precision p is
  F := {0} ∪ {M · β^E : M, E ∈ Z, β^(p−1) ≤ |M| < β^p}.
Floating-point numbers: other representations

x ∈ F \ {0}  ⇒  x = ±m · β^e, with significand m = (m_0 . m_1 ... m_(p−1))_β ∈ [1, β).

Three useful units:
  Unit in the first place: ufp(x) = β^e;
  Unit in the last place: ulp(x) = β^(e−p+1);
  Unit roundoff: u = (1/2) β^(1−p).

Alternative views, which display the structure of F very well:
  x ∈ ulp(x)·Z;
  x = ±(1 + 2ku) · ufp(x), k ∈ N;
  F ∩ [1, β) = {1, 1 + 2u, 1 + 4u, ...}.
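These units can be checked numerically for binary64 (β = 2, p = 53). A minimal sketch in Python; `ufp` and `u` are our own names for the quantities above, while `math.ulp` is the standard library's:

```python
import math

# Unit roundoff for binary64: u = (1/2) * beta^(1-p) with beta = 2, p = 53.
u = 0.5 * 2.0 ** (1 - 53)               # 2^-53, about 1.11e-16

def ufp(x: float) -> float:
    """Unit in the first place: beta^e where x = m * beta^e, 1 <= m < beta."""
    m, e = math.frexp(abs(x))           # x = m * 2^e with 0.5 <= m < 1
    return math.ldexp(1.0, e - 1)

x = 10.0
print(ufp(x))                           # 8.0, the largest power of two <= 10
print(math.ulp(x))                      # spacing between 10.0 and the next float
print(math.ulp(x) == 2 * u * ufp(x))    # ulp(x) = 2u * ufp(x) -> True
```

The last line checks the identity ulp(x) = 2u · ufp(x), which follows directly from the three definitions.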
Floating-point numbers: some properties

F can be seen as a structured grid with many nice properties:
  Symmetry: f ∈ F ⇒ −f ∈ F;
  Auto-similarity: f ∈ F, e ∈ Z ⇒ f·β^e ∈ F;
  F ∩ [1, β) = {1, 1 + 2u, 1 + 4u, ...};
  F ∩ [β^e, β^(e+1)) has (β − 1)β^(p−1) equally spaced elements, with spacing equal to 2uβ^e;
  Neighborhood of 1 ∈ F: ..., 1 − 4u/β, 1 − 2u/β, 1, 1 + 2u, 1 + 4u, ....
  Hence 1 is β times closer to its predecessor than to its successor.
Rounding functions

Round-to-nearest function RN : R → F such that, for all t ∈ R,
  |RN(t) − t| = min_{f ∈ F} |f − t|,
with a given tie-breaking rule.

Properties:
  t ∈ F ⇒ RN(t) = t;
  RN is nondecreasing;
  for any reasonable tie-breaking rule: RN(−t) = −RN(t) and RN(tβ^e) = RN(t)β^e, e ∈ Z.
Rounding functions

More generally, a rounding function is any map ○ from R to F such that
  t ∈ F ⇒ ○(t) = t;
  t ≤ t' ⇒ ○(t) ≤ ○(t').

Adding just one extra constraint gives the usual directed roundings:
  Rounding down: RD(t) ≤ t;
  Rounding up: t ≤ RU(t);
  Rounding to zero: |RZ(t)| ≤ |t|.

Key property for interval arithmetic: for all t ∈ R, t ∈ [RD(t), RU(t)].
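The enclosure t ∈ [RD(t), RU(t)] can be illustrated in Python. A sketch under the assumption that `float(Fraction)` is correctly rounded to nearest-even (it is, since Python's integer true division is correctly rounded); the helper `rd_ru` is ours, derived from RN rather than a native directed-rounding mode:

```python
import math
from fractions import Fraction

def rd_ru(t: Fraction) -> tuple[float, float]:
    """Directed roundings RD(t) and RU(t) of a real t, derived from
    round-to-nearest: float(Fraction) rounds correctly to nearest-even."""
    rn = float(t)
    if Fraction(rn) == t:                        # t is a float: all roundings agree
        return rn, rn
    if Fraction(rn) > t:                         # RN rounded up, so RU = RN
        return math.nextafter(rn, -math.inf), rn
    return rn, math.nextafter(rn, math.inf)      # RN rounded down, so RD = RN

t = Fraction(1, 10)                              # 0.1 is not a binary float
rd, ru = rd_ru(t)
assert Fraction(rd) < t < Fraction(ru)           # key property: t in [RD(t), RU(t)]
print(rd.hex(), ru.hex())                        # two consecutive binary64 values
```

Interval arithmetic (next lecture) rests exactly on this containment.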
Error bounds for real numbers

  E_1(t) := |RN(t) − t| / |t| ≤ u/(1 + u),    E_2(t) := |RN(t) − t| / |RN(t)| ≤ u.

Proof: assume 1 ≤ t < β, so that RN(t) ∈ {1, 1 + 2u, 1 + 4u, ..., β}.
Then |RN(t) − t| ≤ (1/2) · 2u = u.
Dividing by RN(t) ≥ 1 gives directly the bound on E_2.
If t ≥ 1 + u then E_1(t) ≤ u/t ≤ u/(1+u), so the bound follows.
Else 1 ≤ t < 1 + u ⇒ RN(t) = 1 ⇒ E_1(t) = (t − 1)/t < u/(1+u).

The bound u/(1+u) is sharp and well known [Dekker'71, Holm'80, Knuth'81-98], but the simpler bound u is almost always used in practice.
Error bounds for real numbers

Example for the usual binary formats, where u = 2^(−p) (u and u/(1+u) agree to the digits shown):
  p = 11 (half):   u ≈ 4.9 × 10^(−4);
  p = 24 (float):  u ≈ 6.0 × 10^(−8);
  p = 53 (double): u ≈ 1.1 × 10^(−16);
  p = 113 (quad):  u ≈ 9.6 × 10^(−35).

For directed roundings, replace these bounds by 2u.

Conclusion: in all cases, the relative errors due to rounding can be bounded by a tiny quantity which depends only on the format.
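The bound E_1(t) ≤ u/(1+u) can be spot-checked exactly with rational arithmetic. A sketch for binary64; `rel_err` is our helper, and we again rely on `float(Fraction)` being round-to-nearest:

```python
from fractions import Fraction

u = Fraction(1, 2**53)               # unit roundoff for binary64 (p = 53)

def rel_err(t: Fraction) -> Fraction:
    """Exact relative error |RN(t) - t| / |t|, computed with rationals."""
    rn = Fraction(float(t))          # float(Fraction) is round-to-nearest
    return abs(rn - t) / abs(t)

# The bound u/(1+u) holds for every real t; spot-check a few values.
for t in (Fraction(1, 3), Fraction(22, 7), Fraction(10**30 + 1, 10**15)):
    assert rel_err(t) <= u / (1 + u)
print("all within u/(1+u)")
```

Note that the check is exact: no floating-point error enters the verification itself.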
Correct rounding

The computed result is the composition of two functions: the basic operation performed exactly, and the exact result then rounded:
  x, y ∈ F, op ∈ {+, −, ×, /}  ⇒  return r̂ := RN(x op y).
This extends to square root and FMA (fused multiply-add: xy + z).

The error bounds on E_1 and E_2 yield two standard models:
  r̂ = (x op y)(1 + δ_1),     |δ_1| ≤ u/(1+u) =: u_1;
  r̂ = (x op y)/(1 + δ_2),    |δ_2| ≤ u.
Example

Let r = (x + y)/2 be evaluated naively as r̂ = RN(RN(x + y)/2).

High relative accuracy is ensured:
  r̂ = (RN(x + y)/2)(1 + δ_2) = ((x + y)/2)(1 + δ_1)(1 + δ_2) =: r(1 + ε),
with |δ_1|, |δ_2| ≤ u_1 and hence |ε| ≤ 2u.

We'd also like to have min(x, y) ≤ r̂ ≤ max(x, y)...
Example

Not always true: with β = 10, p = 3, taking x = 5.01 and y = 5.02 gives
  RN(RN(x + y)/2) = RN(10.0/2) = 5.00 < min(x, y).

True if β = 2 or sign(x) ≠ sign(y).
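A decimal counterexample of this kind can be reproduced with Python's `decimal` module, which simulates β = 10 at a chosen precision with round-to-nearest-even by default. A sketch (the concrete values 5.01 and 5.02 are one valid instance, not necessarily the one from the original slide):

```python
from decimal import Decimal, getcontext

# Simulate beta = 10, p = 3; the default rounding is ROUND_HALF_EVEN.
getcontext().prec = 3

x, y = Decimal("5.01"), Decimal("5.02")
avg = (x + y) / 2        # x + y = 10.03 rounds to 10.0; then 10.0 / 2 = 5.0
print(avg)               # strictly below min(x, y)!
assert avg < min(x, y)
```

In base 2 the halving is exact and this failure cannot happen, as proved on the next slide.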
Example

Proof for base two: r̂ := RN(RN(x + y)/2) = RN((x + y)/2), since halving is exact.
Assume x ≤ y. From x ≤ (x + y)/2 ≤ y and RN nondecreasing,
  RN(x) ≤ RN((x + y)/2) ≤ RN(y),   that is,   x ≤ r̂ ≤ y.

Repair the other cases using r̂ = x + (y − x)/2. [Sterbenz'74, Boldo'15]
Other typical floating-point surprises

β = 2, any precision p  ⇒  1/10 ∉ F.

Loss of algebraic properties: commutativity of +, × is preserved by correct rounding, but associativity and distributivity are lost: in general,
  RN(RN(x + y) + z) ≠ RN(x + RN(y + z)).

Catastrophic cancellation: 2 floating-point operations are enough to produce a result with relative error 1.
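The loss of associativity is visible in one line of binary64 arithmetic:

```python
a, b, c = 0.1, 0.2, 0.3              # none of these is exactly representable
left = (a + b) + c                   # 0.30000000000000004 + 0.3
right = a + (b + c)                  # 0.1 + 0.5
print(left == right)                 # False
print(left, right)                   # 0.6000000000000001 0.6
```

Each individual operation is correctly rounded; it is the composition of roundings that breaks the algebraic identity.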
Catastrophic cancellation

For example, if x = 1, y = u/β, and z = −1, then r̂ := RN(RN(x + y) + z) = 0 and, since r := x + y + z = y is nonzero, we obtain |r̂ − r| / |r| = 1.

Possible workarounds:
  Sorting the input (if possible);
  Rewriting: a² − b² = (a + b)(a − b);
  Compensation: compute the rounding errors, and use them later in the algorithm in order to compensate for their effect. [Kahan, Rump, ...]
Conditions for exact subtraction

Sterbenz's lemma [Sterbenz'74]: x, y ∈ F, y/2 ≤ x ≤ 2y  ⇒  x − y ∈ F.

Valid for any base β. Applications: Cody and Waite's range reduction, Kahan's accurate algorithms (discriminants, triangle area), ... [Hauser'96]

Proof: assume 0 < y ≤ x ≤ 2y. Then ulp(y) ≤ ulp(x), so x, y ∈ β^e·Z with β^e = ulp(y).
Hence (x − y)/β^e is an integer such that 0 ≤ (x − y)/β^e ≤ y/β^e = y/ulp(y) < β^p, so x − y ∈ F.
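Sterbenz's lemma is easy to test exhaustively at random: whenever y/2 ≤ x ≤ 2y, the computed x − y must coincide with the exact rational difference. A sketch (the sampling range near 1 avoids underflow, which our unbounded-exponent assumption ignores):

```python
import random
from fractions import Fraction

# Sterbenz: if y/2 <= x <= 2y for floats x, y > 0, then x - y is exact.
random.seed(0)
for _ in range(1000):
    y = random.uniform(1.0, 2.0)
    x = random.uniform(y / 2, 2 * y)     # y/2 and 2*y are exact in binary
    assert Fraction(x - y) == Fraction(x) - Fraction(y)
print("x - y exact in all trials")
```

The comparison through `Fraction` is exact, so a single failed trial would disprove the claim.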
Representable error terms

Addition and multiplication: x, y ∈ F, op ∈ {+, ×}  ⇒  x op y − RN(x op y) ∈ F.

Division and square root: x − y·RN(x/y) ∈ F, and x − RN(√x)² ∈ F.

Noted quite early. [Dekker'71, Pichat'76, Bohlender et al.'91]
RN required only for ADD and SQRT. [Boldo & Daumas'03]
FMA: its error is the sum of two floats. [Boldo & Muller'11]
Error-free transformations (EFT)

Floating-point algorithms for computing such error terms exactly:
  x + y − RN(x + y): in 6 additions [Møller'65, Knuth], and not less [Kornerup, Lefèvre, Louvet, Muller'12];
  xy − RN(xy): in 17 additions and multiplications [Dekker'71, Boldo'06], or in only 2 ops if an FMA is available: ẑ := RN(xy), then xy − ẑ = FMA(x, y, −ẑ).

Similar FMA-based EFT for DIV, SQRT, ... and FMA.

EFT are key for extended precision algorithms:
  error compensation [Kahan'65, ..., Higham'96, Ogita, Rump, Oishi'04+, Graillat, Langlois, Louvet'05+, ...];
  floating-point expansions [Priest'91, Shewchuk'97, Joldes, Muller, Popescu'14+].
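The 6-addition EFT for addition (Knuth/Møller's 2Sum) can be written directly; in base 2 with round-to-nearest it is valid for any inputs, with no precondition on their magnitudes. The exactness claim s + e = x + y is then verified with rationals:

```python
from fractions import Fraction

def two_sum(x: float, y: float) -> tuple[float, float]:
    """EFT for addition (Knuth/Moller 2Sum, 6 float additions):
    returns (s, e) with s = RN(x + y) and s + e = x + y exactly."""
    s = x + y
    xp = s - y          # approximates the part of s coming from x
    yp = s - xp         # approximates the part of s coming from y
    dx = x - xp         # both differences are exact
    dy = y - yp
    return s, dx + dy

s, e = two_sum(1e16, 3.14159)
assert Fraction(s) + Fraction(e) == Fraction(1e16) + Fraction(3.14159)
print(s, e)             # the rounded sum and its exact rounding error
```

Here e recovers the digits of 3.14159 that were lost when adding it to 1e16.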
Optimal relative error bounds

When t can be any real number, the bounds E_1(t) ≤ u/(1+u) and E_2(t) ≤ u are best possible:
for t := 1 + u, RN(t) is 1 or 1 + 2u, so |t − RN(t)| = u.
Hence E_1(t) = u/(1 + u) and, if rounding ties to even, RN(t) = 1 and thus E_2(t) = u.

These are examples of optimal bounds:
  valid for all (t, RN) with t of a certain type;
  attained for some (t, RN) with t parametrized by β and p.
Can we do better when t = x op y and x, y ∈ F?

This depends on op and, sometimes, on β and p. [J. & Rump'14]

  t      optimal bound on E_1(t)                   optimal bound on E_2(t)
  x ± y  u/(1+u)                                   u
  x·y    u/(1+u) (*)                               u (*)
  x/y    u/(1+u) if β > 2, u − 2u² if β = 2        u if β > 2, (u − 2u²)/(1 + u − 2u²) if β = 2
  √x     1 − 1/√(1+2u)                             √(1+2u) − 1

  (*) if β > 2 or 2^p + 1 is not a Fermat prime.

⇒ two standard models for each arithmetic operation.
Application: sharper bounds and/or much simpler proofs.
Outline
Context
Floating-point arithmetic
A priori analysis
Conclusion
Classical approach: Wilkinson's analysis

This is the most common way to guarantee a priori that the computed solution x̂ has some kind of numerical quality:
  the forward error x̂ − x is 'small';
  the backward error ΔA such that (A + ΔA)x̂ = b is 'small'.

Developed by Wilkinson in the 1950s and 1960s. Relies almost exclusively on the first standard model:
  r̂ = (x op y)(1 + δ),  |δ| ≤ u.

Eminently powerful: see Higham's book Accuracy and Stability of Numerical Algorithms (SIAM).
Example of analysis: a² − b²

Applying the standard model to each operation gives:
  r̂ = (RN(a²) − RN(b²))(1 + δ_3) = (a²(1 + δ_1) − b²(1 + δ_2))(1 + δ_3),  |δ_i| ≤ u_1.

Hence
  r̂ − r = a²(δ_1 + δ_3 + δ_1δ_3) − b²(δ_2 + δ_3 + δ_2δ_3),
so
  |r̂ − r| ≤ 2u (a² + b²)
and
  |r̂ − r| / |r| ≤ 2u·C,  with condition number C := (a² + b²) / |a² − b²|.

Bound easy to derive and to interpret:
  If C = O(1) then the relative error is O(u): highly accurate!
  If C ≳ 1/u then the relative error is only upper bounded by 1: catastrophic cancellation may occur.
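The two regimes of this bound can be observed directly. A sketch for binary64; `rel_err` is our own helper, and the input a = 1 + 2^-27, b = 1 is chosen so that a² − b² cancels almost completely (C is about 2^27):

```python
from fractions import Fraction

u = Fraction(1, 2**53)                     # unit roundoff, binary64

def rel_err(computed: float, a: float, b: float) -> Fraction:
    """Exact relative error of `computed` w.r.t. a^2 - b^2."""
    exact = Fraction(a) ** 2 - Fraction(b) ** 2
    return abs(Fraction(computed) - exact) / abs(exact)

# Ill conditioned: a close to b, so C = (a^2 + b^2)/|a^2 - b^2| is huge.
a, b = 1.0 + 2.0**-27, 1.0
e1 = rel_err(a * a - b * b, a, b)          # naive a^2 - b^2: cancellation
e2 = rel_err((a + b) * (a - b), a, b)      # rewritten form: O(u) accurate
print(float(e1), float(e2))                # e1 is around 4e-9, far above u
```

The naive form loses about half the digits here, while (a + b)(a − b) matches the next slide's 3u bound.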
Example of analysis: (a + b)(a − b)

  r̂ := RN(RN(a + b) · RN(a − b)) = (a + b)(a − b)(1 + δ_1)(1 + δ_2)(1 + δ_3),  |δ_i| ≤ u_1.

  |r̂ − r| / |r| ≤ (1 + u_1)³ − 1 ≤ 3u.

Always highly accurate!
Floating-point summation

Given x_1, ..., x_n ∈ F, evaluate their sum in any order.

Classical analysis [Wilkinson'60]: apply the standard model n − 1 times, and deduce that the computed value ŝ ∈ F satisfies
  |ŝ − Σ_{i=1..n} x_i| ≤ α Σ_{i=1..n} |x_i|,   α = (1 + u)^(n−1) − 1.

Easy to derive, valid for any order, asymptotically optimal: error / error bound → 1 as u → 0.

But, even with u replaced by u/(1+u), α = (n − 1)u + O(u²), which hides a constant. So α is classically bounded as
  α ≤ γ_(n−1),   γ_l := lu/(1 − lu),  lu < 1.   [Higham'96]
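Rump's O(u²)-free bound from the next slide, α = (n − 1)u for recursive summation, can be checked exactly on concrete data; the check itself runs in rational arithmetic, so no rounding error enters the verification (the harmonic-like input is just one arbitrary choice):

```python
from fractions import Fraction

def recursive_sum(xs):
    """Recursive (left-to-right) floating-point summation."""
    s = 0.0
    for x in xs:
        s += x
    return s

n = 10_000
xs = [1.0 / (i + 1) for i in range(n)]
u = Fraction(1, 2**53)

s_hat = recursive_sum(xs)
exact = sum(Fraction(x) for x in xs)
err = abs(Fraction(s_hat) - exact)
bound = (n - 1) * u * sum(abs(Fraction(x)) for x in xs)   # alpha = (n-1)u
assert err <= bound
print(float(err), float(bound))
```

For this well-behaved input the actual error is far below the bound; adversarial inputs are needed to approach it.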
A simpler, O(u²)-free bound

Theorem [Rump'12]: for recursive summation, one can take α = (n − 1)u.

To prove this:
  Don't use just the (refined) standard model
    |RN(x + y) − (x + y)| ≤ (u/(1+u)) |x + y|;    (1)
  But combine it with the lower-level property
    |RN(x + y) − (x + y)| ≤ |f − (x + y)| for all f ∈ F, hence ≤ min{|x|, |y|};    (2)
  Conclude by induction on n with a clever case distinction comparing |x_n| to u·Σ_{i<n} |x_i|, and using either (1) or (2).
Wilkinson's bounds revisited

  Problem                     Classical α             New α            Ref.
  summation                   (n−1)u + O(u²)          (n−1)u           [1]
  dot prod., mat. mul.        nu + O(u²)              nu               [1]
  Euclidean norm              (n/2 + 1)u + O(u²)      (n/2 + 1)u       [2]
  Tx = b, A = LU              nu + O(u²)              nu               [2]
  A = R^T R                   (n+1)u + O(u²)          (n+1)u           [2]
  x^n (recursive, β = 2)      (n−1)u + O(u²)          (n−1)u (*)       [3]
  product x_1 x_2 ... x_n     (n−1)u + O(u²)          (n−1)u (*)       [4]
  poly. eval. (Horner)        2nu + O(u²)             2nu (*)          [4]

  (*) if n < c·u^(−1/2).
  [1]: with Rump'13; [2]: with Rump'14; [3]: Graillat, Lefèvre, Muller'14; [4]: with Bünger and Rump'14.
Kahan's algorithm for ad − bc

Kahan's algorithm uses the FMA to evaluate det [[a, b], [c, d]] = ad − bc:
  ŵ := RN(bc);
  e := RN(ŵ − bc);
  f̂ := RN(ad − ŵ);
  r̂ := RN(f̂ + e);

The operation ad − bc is not in IEEE 754, but very common: complex arithmetic, discriminant of a quadratic equation, robust orientation predicates using tests like 'ad − bc > ε?'.

If evaluated naively, ad − bc can lead to highly inaccurate results: |r̂ − r| / |r| can be of the order of u^(−1) ≫ 1.
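Kahan's algorithm is easy to experiment with even without hardware FMA access: a correctly rounded FMA can be simulated with exact rationals. A sketch; `fma` below is our simulation (one rounding of x·y + z), not a real fused instruction, and the ill-conditioned input is our own test case:

```python
from fractions import Fraction

def rn(t: Fraction) -> float:
    """Round-to-nearest-even: float(Fraction) is correctly rounded."""
    return float(t)

def fma(x: float, y: float, z: float) -> float:
    """Simulated FMA: a single rounding of x*y + z, computed exactly."""
    return rn(Fraction(x) * Fraction(y) + Fraction(z))

def kahan_det(a, b, c, d):
    """Kahan's algorithm for ad - bc."""
    w = rn(Fraction(a := a) * 0 + Fraction(b) * Fraction(c))  # w = RN(bc)
    e = fma(-b, c, w)        # e = w - bc, exact (mult. error term is in F)
    f = fma(a, d, -w)        # f = RN(ad - w)
    return f + e             # r = RN(f + e)

a = d = 1.0 + 2.0**-27
b = c = 1.0
exact = Fraction(a) * Fraction(d) - Fraction(b) * Fraction(c)
naive = a * d - b * c
u = Fraction(1, 2**53)
err = abs(Fraction(kahan_det(a, b, c, d)) - exact) / exact
assert err <= 2 * u          # matches the 2u theorem below
print(naive, kahan_det(a, b, c, d))
```

On this input the naive evaluation loses about half the significand, while Kahan's algorithm stays within 2u, as the analysis that follows predicts.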
Kahan's algorithm for ad − bc

Analysis in the standard model [Higham'96]:
  |r̂ − r| / |r| ≤ 2u (1 + u |bc| / (2|r|)).

⇒ high relative accuracy as long as u |bc| ≲ 2|r|.

When u |bc| ≫ 2|r|, the error bound can be > 1 and does not even allow us to conclude that sign(r̂) = sign(r).

In fact, Kahan's algorithm is always highly accurate:
  the standard model alone fails to predict this;
  misinterpreting bounds ⇒ dismissing good algorithms.
Further analysis [J., Louvet, Muller'13]

The key is an ulp-analysis of the error terms ε_1 and ε_2 given by:
  ŵ := RN(bc);  e := RN(ŵ − bc);
  f̂ := RN(ad − ŵ) = ad − ŵ + ε_1;
  r̂ := RN(f̂ + e) = f̂ + e + ε_2.

Since e is exactly ŵ − bc, we have r̂ − r = ε_1 + ε_2.
Furthermore, we can prove that |ε_i| ≤ (β/2) ulp(r) for i = 1, 2.

Proposition: |r̂ − r| ≤ β ulp(r) ≤ 2βu |r|.

These bounds mean Kahan's algorithm is always highly accurate.
Further analysis [J., Louvet, Muller'13]

We can do better via a case analysis comparing |ε_2| to (1/2) ulp(r):

Theorem: the relative error satisfies |r̂ − r| / |r| ≤ 2u, and the leading constant 2 is best possible.
Certificate of optimality

This is an explicit input set, parametrized by β and p, such that error / error bound → 1 as u → 0.

Example, for Kahan's algorithm for r = ad − bc: explicit values of a = b, c, and d (simple expressions in β and p, see [J., Louvet, Muller'13]) give
  |r̂ − r| / |r| = 2u / (1 + 2u) = 2u − O(u²).

Optimality is asymptotic, but often OK in practice: for β = 2 and p = 11, the above example has relative error ≈ 2u.

The certificate consists of sparse, symbolic floating-point data, which we can handle automatically. [J., Louvet, Muller, Plet]
Outline
Context
Floating-point arithmetic
A priori analysis
Conclusion
Summary

Floating-point arithmetic is specified rigorously by IEEE 754, highly structured, and much richer than the standard model.

Exploiting this structure leads to enhanced a priori analysis:
  optimal standard models for basic arithmetic operations;
  simpler and sharper Wilkinson-like bounds;
  proofs of nice behavior of some numerical kernels.

On-going research:
  consider directed roundings as well;
  take underflow and overflow into account.
0 Fast and accurate Bessel function computation John Harrison, Intel Corporation ARITH-19 Portland, OR Tue 9th June 2009 (11:00 11:30) 1 Bessel functions and their computation Bessel functions are certain
More informationNotes for Chapter 1 of. Scientific Computing with Case Studies
Notes for Chapter 1 of Scientific Computing with Case Studies Dianne P. O Leary SIAM Press, 2008 Mathematical modeling Computer arithmetic Errors 1999-2008 Dianne P. O'Leary 1 Arithmetic and Error What
More informationIntroduction 5. 1 Floating-Point Arithmetic 5. 2 The Direct Solution of Linear Algebraic Systems 11
SCIENTIFIC COMPUTING BY NUMERICAL METHODS Christina C. Christara and Kenneth R. Jackson, Computer Science Dept., University of Toronto, Toronto, Ontario, Canada, M5S 1A4. (ccc@cs.toronto.edu and krj@cs.toronto.edu)
More informationTight and rigourous error bounds for basic building blocks of double-word arithmetic. Mioara Joldes, Jean-Michel Muller, Valentina Popescu
Tight and rigourous error bounds for basic building blocks of double-word arithmetic Mioara Joldes, Jean-Michel Muller, Valentina Popescu SCAN - September 2016 AMPAR CudA Multiple Precision ARithmetic
More informationRepresentable correcting terms for possibly underflowing floating point operations
Representable correcting terms for possibly underflowing floating point operations Sylvie Boldo & Marc Daumas E-mail: Sylvie.Boldo@ENS-Lyon.Fr & Marc.Daumas@ENS-Lyon.Fr Laboratoire de l Informatique du
More informationMath 411 Preliminaries
Math 411 Preliminaries Provide a list of preliminary vocabulary and concepts Preliminary Basic Netwon s method, Taylor series expansion (for single and multiple variables), Eigenvalue, Eigenvector, Vector
More informationDense LU factorization and its error analysis
Dense LU factorization and its error analysis Laura Grigori INRIA and LJLL, UPMC February 2016 Plan Basis of floating point arithmetic and stability analysis Notation, results, proofs taken from [N.J.Higham,
More informationContents. 2.1 Vectors in R n. Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v. 2.50) 2 Vector Spaces
Linear Algebra (part 2) : Vector Spaces (by Evan Dummit, 2017, v 250) Contents 2 Vector Spaces 1 21 Vectors in R n 1 22 The Formal Denition of a Vector Space 4 23 Subspaces 6 24 Linear Combinations and
More informationFLOATING POINT ARITHMETHIC - ERROR ANALYSIS
FLOATING POINT ARITHMETHIC - ERROR ANALYSIS Brief review of floating point arithmetic Model of floating point arithmetic Notation, backward and forward errors Roundoff errors and floating-point arithmetic
More informationFLOATING POINT ARITHMETHIC - ERROR ANALYSIS
FLOATING POINT ARITHMETHIC - ERROR ANALYSIS Brief review of floating point arithmetic Model of floating point arithmetic Notation, backward and forward errors 3-1 Roundoff errors and floating-point arithmetic
More informationComputation of the error functions erf and erfc in arbitrary precision with correct rounding
Computation of the error functions erf and erfc in arbitrary precision with correct rounding Sylvain Chevillard Arenaire, LIP, ENS-Lyon, France Sylvain.Chevillard@ens-lyon.fr Nathalie Revol INRIA, Arenaire,
More informationAutomating the Verification of Floating-point Algorithms
Introduction Interval+Error Advanced Gappa Conclusion Automating the Verification of Floating-point Algorithms Guillaume Melquiond Inria Saclay Île-de-France LRI, Université Paris Sud, CNRS 2014-07-18
More informationECS 231 Computer Arithmetic 1 / 27
ECS 231 Computer Arithmetic 1 / 27 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 2 / 27 Outline 1 Floating-point numbers
More informationChapter. Algebra techniques. Syllabus Content A Basic Mathematics 10% Basic algebraic techniques and the solution of equations.
Chapter 2 Algebra techniques Syllabus Content A Basic Mathematics 10% Basic algebraic techniques and the solution of equations. Page 1 2.1 What is algebra? In order to extend the usefulness of mathematical
More informationArithmetic and Error. How does error arise? How does error arise? Notes for Part 1 of CMSC 460
Notes for Part 1 of CMSC 460 Dianne P. O Leary Preliminaries: Mathematical modeling Computer arithmetic Errors 1999-2006 Dianne P. O'Leary 1 Arithmetic and Error What we need to know about error: -- how
More informationValidation for scientific computations Error analysis
Validation for scientific computations Error analysis Cours de recherche master informatique Nathalie Revol Nathalie.Revol@ens-lyon.fr 20 octobre 2006 References for today s lecture Condition number: Ph.
More informationCHAPTER 1 POLYNOMIALS
1 CHAPTER 1 POLYNOMIALS 1.1 Removing Nested Symbols of Grouping Simplify. 1. 4x + 3( x ) + 4( x + 1). ( ) 3x + 4 5 x 3 + x 3. 3 5( y 4) + 6 y ( y + 3) 4. 3 n ( n + 5) 4 ( n + 8) 5. ( x + 5) x + 3( x 6)
More informationCompensated Horner Scheme
Compensated Horner Scheme S. Graillat 1, Philippe Langlois 1, Nicolas Louvet 1 DALI-LP2A Laboratory. Université de Perpignan Via Domitia. firstname.lastname@univ-perp.fr, http://webdali.univ-perp.fr Abstract.
More informationa. Define a function called an inner product on pairs of points x = (x 1, x 2,..., x n ) and y = (y 1, y 2,..., y n ) in R n by
Real Analysis Homework 1 Solutions 1. Show that R n with the usual euclidean distance is a metric space. Items a-c will guide you through the proof. a. Define a function called an inner product on pairs
More informationImproving the Compensated Horner Scheme with a Fused Multiply and Add
Équipe de Recherche DALI Laboratoire LP2A, EA 3679 Université de Perpignan Via Domitia Improving the Compensated Horner Scheme with a Fused Multiply and Add S. Graillat, Ph. Langlois, N. Louvet DALI-LP2A
More informationAdaptive and Efficient Algorithm for 2D Orientation Problem
Japan J. Indust. Appl. Math., 26 (2009), 215 231 Area 2 Adaptive and Efficient Algorithm for 2D Orientation Problem Katsuhisa Ozaki, Takeshi Ogita, Siegfried M. Rump and Shin ichi Oishi Research Institute
More informationHandout #6 INTRODUCTION TO ALGEBRAIC STRUCTURES: Prof. Moseley AN ALGEBRAIC FIELD
Handout #6 INTRODUCTION TO ALGEBRAIC STRUCTURES: Prof. Moseley Chap. 2 AN ALGEBRAIC FIELD To introduce the notion of an abstract algebraic structure we consider (algebraic) fields. (These should not to
More informationTomáš Madaras Congruence classes
Congruence classes For given integer m 2, the congruence relation modulo m at the set Z is the equivalence relation, thus, it provides a corresponding partition of Z into mutually disjoint sets. Definition
More informationEssentials of Intermediate Algebra
Essentials of Intermediate Algebra BY Tom K. Kim, Ph.D. Peninsula College, WA Randy Anderson, M.S. Peninsula College, WA 9/24/2012 Contents 1 Review 1 2 Rules of Exponents 2 2.1 Multiplying Two Exponentials
More informationContents. 4 Arithmetic and Unique Factorization in Integral Domains. 4.1 Euclidean Domains and Principal Ideal Domains
Ring Theory (part 4): Arithmetic and Unique Factorization in Integral Domains (by Evan Dummit, 018, v. 1.00) Contents 4 Arithmetic and Unique Factorization in Integral Domains 1 4.1 Euclidean Domains and
More informationLecture 7. Floating point arithmetic and stability
Lecture 7 Floating point arithmetic and stability 2.5 Machine representation of numbers Scientific notation: 23 }{{} }{{} } 3.14159265 {{} }{{} 10 sign mantissa base exponent (significand) s m β e A floating
More informationContinued fractions and number systems: applications to correctly-rounded implementations of elementary functions and modular arithmetic.
Continued fractions and number systems: applications to correctly-rounded implementations of elementary functions and modular arithmetic. Mourad Gouicem PEQUAN Team, LIP6/UPMC Nancy, France May 28 th 2013
More information1 Finding and fixing floating point problems
Notes for 2016-09-09 1 Finding and fixing floating point problems Floating point arithmetic is not the same as real arithmetic. Even simple properties like associativity or distributivity of addition and
More informationComputing Time for Summation Algorithm: Less Hazard and More Scientific Research
Computing Time for Summation Algorithm: Less Hazard and More Scientific Research Bernard Goossens, Philippe Langlois, David Parello, Kathy Porada To cite this version: Bernard Goossens, Philippe Langlois,
More information1 Matrices and Systems of Linear Equations
Linear Algebra (part ) : Matrices and Systems of Linear Equations (by Evan Dummit, 207, v 260) Contents Matrices and Systems of Linear Equations Systems of Linear Equations Elimination, Matrix Formulation
More informationoutput H = 2*H+P H=2*(H-P)
Ecient Algorithms for Multiplication on Elliptic Curves by Volker Muller TI-9/97 22. April 997 Institut fur theoretische Informatik Ecient Algorithms for Multiplication on Elliptic Curves Volker Muller
More information1. Introduction to commutative rings and fields
1. Introduction to commutative rings and fields Very informally speaking, a commutative ring is a set in which we can add, subtract and multiply elements so that the usual laws hold. A field is a commutative
More informationMATH 117 LECTURE NOTES
MATH 117 LECTURE NOTES XIN ZHOU Abstract. This is the set of lecture notes for Math 117 during Fall quarter of 2017 at UC Santa Barbara. The lectures follow closely the textbook [1]. Contents 1. The set
More informationPartitions in the quintillions or Billions of congruences
Partitions in the quintillions or Billions of congruences Fredrik Johansson November 2011 The partition function p(n) counts the number of ways n can be written as the sum of positive integers without
More information1 Floating point arithmetic
Introduction to Floating Point Arithmetic Floating point arithmetic Floating point representation (scientific notation) of numbers, for example, takes the following form.346 0 sign fraction base exponent
More informationLECTURE NOTES IN CRYPTOGRAPHY
1 LECTURE NOTES IN CRYPTOGRAPHY Thomas Johansson 2005/2006 c Thomas Johansson 2006 2 Chapter 1 Abstract algebra and Number theory Before we start the treatment of cryptography we need to review some basic
More informationBinary floating point
Binary floating point Notes for 2017-02-03 Why do we study conditioning of problems? One reason is that we may have input data contaminated by noise, resulting in a bad solution even if the intermediate
More informationLinear Algebra (part 1) : Matrices and Systems of Linear Equations (by Evan Dummit, 2016, v. 2.02)
Linear Algebra (part ) : Matrices and Systems of Linear Equations (by Evan Dummit, 206, v 202) Contents 2 Matrices and Systems of Linear Equations 2 Systems of Linear Equations 2 Elimination, Matrix Formulation
More informationLinear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space
Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) Contents 1 Vector Spaces 1 1.1 The Formal Denition of a Vector Space.................................. 1 1.2 Subspaces...................................................
More informationOptimizing the Representation of Intervals
Optimizing the Representation of Intervals Javier D. Bruguera University of Santiago de Compostela, Spain Numerical Sofware: Design, Analysis and Verification Santander, Spain, July 4-6 2012 Contents 1
More informationOn the computation of the reciprocal of floating point expansions using an adapted Newton-Raphson iteration
On the computation of the reciprocal of floating point expansions using an adapted Newton-Raphson iteration Mioara Joldes, Valentina Popescu, Jean-Michel Muller ASAP 2014 1 / 10 Motivation Numerical problems
More informationMath Lecture 18 Notes
Math 1010 - Lecture 18 Notes Dylan Zwick Fall 2009 In our last lecture we talked about how we can add, subtract, and multiply polynomials, and we figured out that, basically, if you can add, subtract,
More informationA2 HW Imaginary Numbers
Name: A2 HW Imaginary Numbers Rewrite the following in terms of i and in simplest form: 1) 100 2) 289 3) 15 4) 4 81 5) 5 12 6) -8 72 Rewrite the following as a radical: 7) 12i 8) 20i Solve for x in simplest
More informationArithmetic Operations. The real numbers have the following properties: In particular, putting a 1 in the Distributive Law, we get
MCA AP Calculus AB Summer Assignment The following packet is a review of many of the skills needed as we begin the study of Calculus. There two major sections to this review. Pages 2-9 are review examples
More informationThe Ring Z of Integers
Chapter 2 The Ring Z of Integers The next step in constructing the rational numbers from N is the construction of Z, that is, of the (ring of) integers. 2.1 Equivalence Classes and Definition of Integers
More informationSemester Review Packet
MATH 110: College Algebra Instructor: Reyes Semester Review Packet Remarks: This semester we have made a very detailed study of four classes of functions: Polynomial functions Linear Quadratic Higher degree
More informationFast and Accurate Bessel Function Computation
Fast and Accurate Bessel Function Computation John Harrison Intel Corporation, JF-3 NE 5th Avenue Hillsboro OR 974, USA Email: johnh@ichips.intel.com Abstract The Bessel functions are considered relatively
More informationOptimizing Scientific Libraries for the Itanium
0 Optimizing Scientific Libraries for the Itanium John Harrison Intel Corporation Gelato Federation Meeting, HP Cupertino May 25, 2005 1 Quick summary Intel supplies drop-in replacement versions of common
More informationWhat Every Programmer Should Know About Floating-Point Arithmetic DRAFT. Last updated: November 3, Abstract
What Every Programmer Should Know About Floating-Point Arithmetic Last updated: November 3, 2014 Abstract The article provides simple answers to the common recurring questions of novice programmers about
More informationFoundations of Discrete Mathematics
Foundations of Discrete Mathematics Chapter 0 By Dr. Dalia M. Gil, Ph.D. Statement Statement is an ordinary English statement of fact. It has a subject, a verb, and a predicate. It can be assigned a true
More informationGROUPS. Chapter-1 EXAMPLES 1.1. INTRODUCTION 1.2. BINARY OPERATION
Chapter-1 GROUPS 1.1. INTRODUCTION The theory of groups arose from the theory of equations, during the nineteenth century. Originally, groups consisted only of transformations. The group of transformations
More informationLecture Notes: Mathematical/Discrete Structures
Lecture Notes: Mathematical/Discrete Structures Rashid Bin Muhammad, PhD. This booklet includes lecture notes, homework problems, and exam problems from discrete structures course I taught in Fall 2006
More informationarxiv: v1 [math.nt] 3 Jun 2016
Absolute real root separation arxiv:1606.01131v1 [math.nt] 3 Jun 2016 Yann Bugeaud, Andrej Dujella, Tomislav Pejković, and Bruno Salvy Abstract While the separation(the minimal nonzero distance) between
More informationAddition sequences and numerical evaluation of modular forms
Addition sequences and numerical evaluation of modular forms Fredrik Johansson (INRIA Bordeaux) Joint work with Andreas Enge (INRIA Bordeaux) William Hart (TU Kaiserslautern) DK Statusseminar in Strobl,
More informationERROR BOUNDS ON COMPLEX FLOATING-POINT MULTIPLICATION
MATHEMATICS OF COMPUTATION Volume 00, Number 0, Pages 000 000 S 005-5718(XX)0000-0 ERROR BOUNDS ON COMPLEX FLOATING-POINT MULTIPLICATION RICHARD BRENT, COLIN PERCIVAL, AND PAUL ZIMMERMANN In memory of
More informationA video College Algebra course & 6 Enrichment videos
A video College Algebra course & 6 Enrichment videos Recorded at the University of Missouri Kansas City in 1998. All times are approximate. About 43 hours total. Available on YouTube at http://www.youtube.com/user/umkc
More informationMAT 243 Test 2 SOLUTIONS, FORM A
MAT Test SOLUTIONS, FORM A 1. [10 points] Give a recursive definition for the set of all ordered pairs of integers (x, y) such that x < y. Solution: Let S be the set described above. Note that if (x, y)
More informationREVIEW Chapter 1 The Real Number System
REVIEW Chapter The Real Number System In class work: Complete all statements. Solve all exercises. (Section.4) A set is a collection of objects (elements). The Set of Natural Numbers N N = {,,, 4, 5, }
More informationACM 106a: Lecture 1 Agenda
1 ACM 106a: Lecture 1 Agenda Introduction to numerical linear algebra Common problems First examples Inexact computation What is this course about? 2 Typical numerical linear algebra problems Systems of
More informationIntroduction and mathematical preliminaries
Chapter Introduction and mathematical preliminaries Contents. Motivation..................................2 Finite-digit arithmetic.......................... 2.3 Errors in numerical calculations.....................
More informationConvergence of Rump s Method for Inverting Arbitrarily Ill-Conditioned Matrices
published in J. Comput. Appl. Math., 205(1):533 544, 2007 Convergence of Rump s Method for Inverting Arbitrarily Ill-Conditioned Matrices Shin ichi Oishi a,b Kunio Tanabe a Takeshi Ogita b,a Siegfried
More informationThe numerical stability of barycentric Lagrange interpolation
IMA Journal of Numerical Analysis (2004) 24, 547 556 The numerical stability of barycentric Lagrange interpolation NICHOLAS J. HIGHAM Department of Mathematics, University of Manchester, Manchester M13
More information2. Introduction to commutative rings (continued)
2. Introduction to commutative rings (continued) 2.1. New examples of commutative rings. Recall that in the first lecture we defined the notions of commutative rings and field and gave some examples of
More informationCONTINUED FRACTIONS, PELL S EQUATION, AND TRANSCENDENTAL NUMBERS
CONTINUED FRACTIONS, PELL S EQUATION, AND TRANSCENDENTAL NUMBERS JEREMY BOOHER Continued fractions usually get short-changed at PROMYS, but they are interesting in their own right and useful in other areas
More information1. Introduction to commutative rings and fields
1. Introduction to commutative rings and fields Very informally speaking, a commutative ring is a set in which we can add, subtract and multiply elements so that the usual laws hold. A field is a commutative
More informationCommutative Rings and Fields
Commutative Rings and Fields 1-22-2017 Different algebraic systems are used in linear algebra. The most important are commutative rings with identity and fields. Definition. A ring is a set R with two
More informationMathematics High School Algebra
Mathematics High School Algebra Expressions. An expression is a record of a computation with numbers, symbols that represent numbers, arithmetic operations, exponentiation, and, at more advanced levels,
More informationAlgebra 1. Unit 3: Quadratic Functions. Romeo High School
Algebra 1 Unit 3: Quadratic Functions Romeo High School Contributors: Jennifer Boggio Jennifer Burnham Jim Cali Danielle Hart Robert Leitzel Kelly McNamara Mary Tarnowski Josh Tebeau RHS Mathematics Department
More informationMathematical Reasoning & Proofs
Mathematical Reasoning & Proofs MAT 1362 Fall 2018 Alistair Savage Department of Mathematics and Statistics University of Ottawa This work is licensed under a Creative Commons Attribution-ShareAlike 4.0
More informationSequences. 1. Number sequences. 2. Arithmetic sequences. Consider the illustrated pattern of circles:
Sequences 1. Number sequences Consider the illustrated pattern of circles: The first layer has just one blue ball. The second layer has three pink balls. The third layer has five black balls. The fourth
More information27 Wyner Math 2 Spring 2019
27 Wyner Math 2 Spring 2019 CHAPTER SIX: POLYNOMIALS Review January 25 Test February 8 Thorough understanding and fluency of the concepts and methods in this chapter is a cornerstone to success in the
More information