FLOATING POINT ARITHMETIC - ERROR ANALYSIS

Outline: brief review of floating point arithmetic; model of floating point arithmetic; notation, backward and forward errors.

Roundoff errors and floating-point arithmetic

The basic problem: the set A of all numbers representable on a given machine is finite - but we would like to use this set to perform the standard arithmetic operations (+, *, -, /) on an infinite set. The usual algebra rules are no longer satisfied, since the results of operations are rounded: basic algebra breaks down in floating point arithmetic.

Example: in floating point arithmetic, in general

    a + (b + c) ≠ (a + b) + c

Matlab experiment: for 10,000 random triples a, b, c, count the number of instances in which the two sides above differ. Do the same thing for multiplication.
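A minimal Matlab sketch of this experiment (the count of 10,000 triples is from the slide; the use of uniform random numbers is an illustrative choice):

% Count how often floating point addition and multiplication fail to be
% associative on 10,000 random triples.
n = 10000;
a = rand(n,1); b = rand(n,1); c = rand(n,1);
add_mismatch = sum( (a + (b + c)) ~= ((a + b) + c) );
mul_mismatch = sum( (a .* (b .* c)) ~= ((a .* b) .* c) );
fprintf('addition not associative in %d of %d cases\n', add_mismatch, n);
fprintf('multiplication not associative in %d of %d cases\n', mul_mismatch, n);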

Floating point representation: real numbers are represented in two parts, a mantissa (significand) and an exponent. If the representation is in the base β then:

    x = ±(.d_1 d_2 ... d_t)_β × β^e

.d_1 d_2 ... d_t is a fraction in the base-β representation (generally the form is normalized in that d_1 ≠ 0), and e is an integer.

Often it is more convenient to rewrite the above as:

    x = ±(m/β^t) × β^e = ±m × β^(e-t)

The mantissa m is an integer with 0 ≤ m < β^t.
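As a quick illustration (not from the slides), Matlab's two-output log2 exposes this decomposition for β = 2: it returns F and E with x = F × 2^E and 0.5 ≤ |F| < 1, so m = F × 2^53 is the integer mantissa of a double precision number (t = 53).

% Decompose a double precision number into integer mantissa and exponent.
x = 0.1;
[F, E] = log2(x);        % x = F * 2^E with 0.5 <= |F| < 1
m = F * 2^53;            % integer mantissa, 0 <= m < 2^53
fprintf('x = %d * 2^%d\n', m, E - 53);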

Machine precision - machine epsilon

Notation: fl(x) = closest floating point representation of the real number x ("rounding").

When a number x is very small, there is a point when 1 + x == 1 in a machine sense: the computer no longer makes a difference between 1 and 1 + x.

Machine epsilon: the smallest number ε such that 1 + ε is a float that is different from one is called machine epsilon. Denoted by macheps or eps, it represents the distance from 1 to the next larger floating point number. With the previous representation, eps is equal to β^(-(t-1)).

Example: in IEEE standard double precision, β = 2 and t = 53 (this includes the "hidden bit"). Therefore eps = 2^(-52) ≈ 2.2 × 10^(-16).

Unit round-off: a real number x can be approximated by a floating point number fl(x) with relative error no larger than u = (1/2) β^(-(t-1)). u is called the unit round-off. In fact one can easily show:

    fl(x) = x(1 + δ)   with   |δ| < u

Matlab experiment: find the machine epsilon on your computer.

There have been many discussions on what conditions/rules should be satisfied by floating point arithmetic. The IEEE standard is a set of standards adopted by many CPU manufacturers.
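A minimal sketch of this experiment (repeated halving is one standard way to do it; Matlab's built-in eps is used only as a cross-check):

% Find machine epsilon: the smallest power of 2 whose sum with 1 differs from 1.
e = 1;
while 1 + e/2 > 1
    e = e/2;
end
fprintf('computed macheps = %g, built-in eps = %g\n', e, eps);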

Rule 1. fl(x) = x(1 + ε), where |ε| ≤ u.

Rule 2. For all operations ⊙ (one of +, -, *, /):

    fl(x ⊙ y) = (x ⊙ y)(1 + ε_⊙),   where |ε_⊙| ≤ u

Rule 3. For the +, * operations:

    fl(a ⊙ b) = fl(b ⊙ a)

Matlab experiment: verify Rule 3 experimentally with 10,000 randomly generated numbers a_i, b_i.
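A sketch of the Rule 3 check (uniform random pairs are an arbitrary choice; with IEEE arithmetic both counts should come out as zero):

% Verify commutativity of floating point + and * on 10,000 random pairs.
n = 10000;
a = rand(n,1); b = rand(n,1);
add_violations = sum( (a + b) ~= (b + a) );
mul_violations = sum( (a .* b) ~= (b .* a) );
fprintf('Rule 3 violations: %d for +, %d for *\n', add_violations, mul_violations);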

Example: consider the sum of 3 numbers: y = a + b + c, done as fl(fl(a + b) + c):

    η   = fl(a + b) = (a + b)(1 + ε_1)
    y_1 = fl(η + c) = (η + c)(1 + ε_2)
        = [(a + b)(1 + ε_1) + c](1 + ε_2)
        = [(a + b + c) + (a + b)ε_1](1 + ε_2)
        = (a + b + c)[1 + (a + b)/(a + b + c) ε_1(1 + ε_2) + ε_2]

So, disregarding the higher order term ε_1 ε_2:

    fl(fl(a + b) + c) = (a + b + c)(1 + ε_3),   ε_3 ≈ (a + b)/(a + b + c) ε_1 + ε_2
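A small numerical illustration of this amplification (the particular values of a, b, c are an arbitrary choice; single precision plays the role of fl and double precision serves as the "exact" reference):

% When a + b nearly cancels with c, the factor (a+b)/(a+b+c) amplifies the
% rounding error made in the first addition.
a = single(1.0); b = single(1e-8); c = single(-1.0);
y  = (a + b) + c;                          % computed in single precision
ye = (double(a) + double(b)) + double(c);  % reference in double precision
rel_err = abs(double(y) - ye) / abs(ye);
fprintf('relative error = %g, unit roundoff u = %g\n', rel_err, eps('single')/2);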

If we redid the computation as y_2 = fl(a + fl(b + c)) we would find

    fl(a + fl(b + c)) = (a + b + c)(1 + ε_4),   ε_4 ≈ (b + c)/(a + b + c) ε_1 + ε_2

The error is amplified by the factor (a + b)/y in the first case and (b + c)/y in the second case.

In order to sum n numbers accurately, it is better to start with the small numbers first. [However, sorting before adding is not worth it.] But watch out if the numbers have mixed signs!
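A sketch of an experiment that makes the ordering effect visible (the test data - one large value followed by many small ones of the same sign - is an assumption chosen so the effect shows up in single precision):

% Summation order: adding the small numbers first gives a more accurate
% single precision sum than adding the large number first.
x = single([1e8, ones(1, 100000)]);        % one large value, then many 1's
s_large_first = single(0);
for k = 1:numel(x), s_large_first = s_large_first + x(k); end
s_small_first = single(0);
for k = numel(x):-1:1, s_small_first = s_small_first + x(k); end
s_exact = sum(double(x));                  % reference computed in double
fprintf('large first: %.1f   small first: %.1f   exact: %.1f\n', ...
        s_large_first, s_small_first, s_exact);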

The absolute value notation

For a given vector x, |x| is the vector with components |x_i|, i.e., |x| is the component-wise absolute value of x. Similarly for matrices:

    |A| = { |a_ij| },  i = 1,...,m;  j = 1,...,n

An obvious result: the basic inequality |fl(a_ij) - a_ij| ≤ u |a_ij| translates into

    fl(A) = A + E   with   |E| ≤ u |A|

A ≤ B means a_ij ≤ b_ij for all 1 ≤ i ≤ m, 1 ≤ j ≤ n.

Error Analysis: Inner product

Inner products are in the innermost parts of many calculations. Their analysis is important.

Lemma: if |δ_i| ≤ u and nu < 1 then

    Π_{i=1}^n (1 + δ_i) = 1 + θ_n   where   |θ_n| ≤ nu / (1 - nu)

Common notation:

    γ_n ≡ nu / (1 - nu)

Prove the lemma [Hint: use induction].
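A quick numerical sanity check of the lemma's bound (the random choice of the δ_i and the value of n are arbitrary; this only illustrates the bound, it is not a proof):

% Check that |prod(1 + delta_i) - 1| stays below n*u/(1 - n*u) when |delta_i| <= u.
u = eps/2;                        % unit roundoff in double precision
n = 1000;
delta = u * (2*rand(n,1) - 1);    % random perturbations in [-u, u]
theta_n = prod(1 + delta) - 1;
gamma_n = n*u / (1 - n*u);
fprintf('|theta_n| = %.3e, bound gamma_n = %.3e\n', abs(theta_n), gamma_n);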

One can use the following simpler result:

Lemma: if |δ_i| ≤ u and nu < .01 then

    Π_{i=1}^n (1 + δ_i) = 1 + θ_n   where   |θ_n| ≤ 1.01 nu

Example: the previous sum of numbers can be written

    fl(a + b + c) = a(1 + ε_1)(1 + ε_2) + b(1 + ε_1)(1 + ε_2) + c(1 + ε_2)
                  = a(1 + θ_1) + b(1 + θ_2) + c(1 + θ_3)
                  = exact sum of slightly perturbed inputs,

where all the θ_i's satisfy |θ_i| ≤ 1.01 nu (here n = 2).

Alternatively, one can write the forward bound:

    |fl(a + b + c) - (a + b + c)| ≤ |a θ_1| + |b θ_2| + |c θ_3|

Backward and forward errors

Assume the approximation ŷ to y = alg(x) is computed by some algorithm with arithmetic precision ε. A possible analysis: find an upper bound for the forward error

    Δy = |y - ŷ|

This is not always easy. Alternative question: find an equivalent perturbation of the initial data x that produces the result ŷ. In other words, find Δx so that:

    alg(x + Δx) = ŷ

The value of |Δx| is called the backward error. An analysis that finds an upper bound for |Δx| is called backward error analysis.
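As a hedged illustration (not from the slides): for alg(x) = sqrt(x), the backward error of a computed ŷ is the Δx with sqrt(x + Δx) = ŷ exactly, i.e. Δx = ŷ² - x, which is easy to evaluate:

% Forward and backward error for sqrt computed in single precision,
% with double precision standing in for exact arithmetic.
x    = 2;
yhat = double(sqrt(single(x)));     % computed result
y    = sqrt(x);                     % reference ("exact") result
forward_err  = abs(y - yhat);
backward_err = abs(yhat^2 - x);     % |dx| such that sqrt(x + dx) = yhat exactly
fprintf('forward error  = %.3e\n', forward_err);
fprintf('backward error = %.3e\n', backward_err);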

Example: let

    A = [ a  b ]        B = [ d  e ]
        [ 0  c ]            [ 0  f ]

Consider the product:

    fl(A·B) = [ ad(1 + ε_1)    (ae(1 + ε_2) + bf(1 + ε_3))(1 + ε_4) ]
              [ 0              cf(1 + ε_5)                          ]

with |ε_i| ≤ u for i = 1, ..., 5. The result can be written as:

    [ a   b(1 + ε_3)(1 + ε_4) ]   [ d(1 + ε_1)   e(1 + ε_2)(1 + ε_4) ]
    [ 0   c(1 + ε_5)          ]   [ 0            f                   ]

So fl(A·B) = (A + E_A)(B + E_B). The backward errors E_A, E_B satisfy:

    |E_A| ≤ 2u |A| + O(u²) ;   |E_B| ≤ 2u |B| + O(u²)
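A small numerical check of the entrywise consequence |fl(A·B) - A·B| ≤ 2u |A|·|B| (to first order) for this example; the random upper triangular test matrices are an arbitrary choice, single precision plays fl and double precision plays exact arithmetic:

% Entrywise check of |fl(A*B) - A*B| <= 2u |A||B| for 2x2 upper triangular matrices.
A = triu(double(single(rand(2))));   % entries exactly representable in single
B = triu(double(single(rand(2))));
u = eps('single')/2;
err   = abs(double(single(A)*single(B)) - A*B);   % actual rounding error
bound = 2*u*abs(A)*abs(B);                        % first order bound
disp([err bound])                                 % each err entry should be below bound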

When solving Ax = b by Gaussian elimination, we will see that a bound on e_x such that

    A(x_computed + e_x) = b

holds exactly is much harder to find than bounds on E_A, e_b such that

    (A + E_A) x_computed = b + e_b

holds exactly.

Note: in many instances backward errors are more meaningful than forward errors. If the initial data is accurate to only 4 digits, say, then my algorithm for computing x need not guarantee a much smaller backward error; a backward error of order 10^(-4) is acceptable.

Main result on inner products:

Backward error expression:

    fl(x^T y) = [x .* (1 + d_x)]^T [y .* (1 + d_y)]

where |d_x| ≤ 1.01 nu and |d_y| ≤ 1.01 nu. One can show that the equality remains valid even if one of d_x, d_y is absent.

Forward error expression:

    |fl(x^T y) - x^T y| ≤ γ_n |x|^T |y|   with   0 ≤ γ_n ≤ 1.01 nu

Here |x| denotes the elementwise absolute value of x and .* elementwise multiplication.

The above assumes nu ≤ .01. For double precision, u ≈ 1.1 × 10^(-16), so this holds for n up to about 9 × 10^13.
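A sketch of a numerical check of the forward bound (vector length and data are arbitrary; single precision plays fl, double precision the exact value):

% Check |fl(x'*y) - x'*y| <= gamma_n * |x|'*|y| for a single precision dot product.
n = 1000;
x = double(single(randn(n,1)));            % data exactly representable in single
y = double(single(randn(n,1)));
u = eps('single')/2;
gamma_n = n*u / (1 - n*u);
dot_fl = double(single(x)' * single(y));   % dot product accumulated in single
dot_ex = x' * y;                           % reference in double
fprintf('error = %.3e, bound = %.3e\n', abs(dot_fl - dot_ex), gamma_n * (abs(x)'*abs(y)));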

Consequence of the lemma:

    |fl(A·B) - A·B| ≤ γ_n |A| · |B|

Another way to write the result (less precise) is

    |fl(x^T y) - x^T y| ≤ n u |x|^T |y| + O(u²)

Assume you use single precision, for which u = 2^(-24) ≈ 6 × 10^(-8). What is the largest n for which nu ≤ 0.01 holds? Any conclusions for the use of single precision arithmetic?

What does the main result on inner products imply for the case when y = x? [Contrast the relative accuracy you get in this case vs. the general case when y ≠ x.]

Show that for any x, y there exist Δx, Δy such that

    fl(x^T y) = (x + Δx)^T y,   with   |Δx| ≤ γ_n |x|
    fl(x^T y) = x^T (y + Δy),   with   |Δy| ≤ γ_n |y|

(Continuation) Let A be an m × n matrix, x an n-vector, and y = Ax. Show that there exists a matrix ΔA such that

    fl(y) = (A + ΔA) x,   with   |ΔA| ≤ γ_n |A|

(Continuation) From the above, derive a result about a column of the product of two matrices A and B. Does a similar result hold for the product AB as a whole?

Supplemental notes: Floating Point Arithmetic

In most computing systems, real numbers are represented in two parts: a mantissa and an exponent. If the representation is in the base β then:

    x = ±(.d_1 d_2 ... d_m)_β × β^e

.d_1 d_2 ... d_m is a fraction in the base-β representation, and e is an integer - it can be negative, positive or zero. Generally the form is normalized in that d_1 ≠ 0.

Example: in base 10 (for illustration), 1000.2 can be written as .10002e+04, and 1.07 can be written as .10700e+01.

Problem with floating point arithmetic: we have to live with limited precision.

Example: assume that we have only 5 digits of accuracy in the mantissa and 2 digits for the exponent (excluding signs):

    ±.d_1 d_2 d_3 d_4 d_5 × 10^(±e_1 e_2)

Try to add 1000.2 = .10002e+04 and 1.07 = .10700e+01.

First task: align decimal points. The number with the smallest exponent is (internally) rewritten so its exponent matches the largest one:

    1.07 = .000107e+04

Second task: add mantissas:

    .10002 + .000107 = .100127

Third task: round the result. The result has 6 digits - we can use only 5, so we can

    chop the result:  .10012e+04 ;
    round the result: .10013e+04 ;

Fourth task: normalize the result if needed (not needed here).

Result with rounding: .10013e+04, i.e. 1001.3.

Redo the same steps with other operands.
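A small Matlab sketch of the 5-digit decimal rounding step (the decomposition through log10 is just one way to simulate decimal arithmetic and is an illustrative choice):

% Simulate 5-digit decimal arithmetic for the example above: chop vs round.
s = 1000.2 + 1.07;                         % exact sum: 1001.27
e = floor(log10(s)) + 1;                   % s = m * 10^e with 0.1 <= m < 1
m = s / 10^e;
chopped = floor(m * 1e5) / 1e5 * 10^e;     % keep 5 digits by chopping
rounded = round(m * 1e5) / 1e5 * 10^e;     % keep 5 digits by rounding to nearest
fprintf('chop: %.5g   round: %.5g\n', chopped, rounded);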

Some More Examples

Each operation fl(x ⊙ y) proceeds in 4 steps:
1. Line up exponents (for addition & subtraction).
2. Compute a temporary exact answer.
3. Normalize the temporary result.
4. Round to the nearest representable number (round-to-even in case of a tie).

[The slide works through three numerical examples, listing for each one the temporary result, the normalized result, and the rounded result. The surviving annotations indicate: one ordinary round-to-nearest case (the result is closer to the upper of the two neighboring values); one case where the result is exactly halfway between two representable values and the round-to-even rule decides; and one case where the exponent is already at its minimum, so the result is too small to normalize and is left unnormalized.]

The IEEE standard

32 bit (single precision):

    [ sign (1 bit) | exponent (8 bits) | mantissa (23 bits) ]

In binary, the mantissa of a normalized number always starts with a leading one, so this leading one does not need to be represented: one bit is gained. This is the "hidden bit".

Largest exponent: 127; smallest: -126. [Bias of 127.]

64 bit (double precision):

    [ sign (1 bit) | exponent (11 bits) | mantissa (52 bits) ]

Bias of 1023: if c is the content of the exponent field, the actual exponent value is c - 1023, and the number represented is ±1.m × 2^(c-1023).

An exponent field of 2047 (all ones) is reserved for special use.

Largest exponent: 1023; smallest: -1022. Including the hidden bit, the mantissa has a total of 53 bits (52 bits represented, one hidden).
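As a quick way to look at these fields (not part of the slides), Matlab's num2hex prints the raw IEEE bit pattern of a double as 16 hexadecimal digits: 1 sign bit, 11 exponent bits, 52 mantissa bits:

% Inspect IEEE double precision bit patterns.
num2hex(1)      % 3ff0000000000000 : exponent field 1023, i.e. actual exponent 0
num2hex(2)      % 4000000000000000 : exponent field 1024, i.e. actual exponent 1
num2hex(-1)     % bff0000000000000 : same as 1 but with the sign bit set
num2hex(eps)    % 3cb0000000000000 : exponent field 971, i.e. eps = 2^(971-1023) = 2^-52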

Take the number 1.0 and see what happens if you add 1/2, 1/4, ..., 2^(-i). Do not forget the hidden bit!

[The slide lists the hidden bit, the exponent field and the 52 represented mantissa bits of 1 + 2^(-i) for several values of i: the added power of two sets a single mantissa bit further and further to the right, until it falls off the end of the 52 represented bits. (The exponent part shown has 12 bits and includes the sign.)]

Conclusion:

    fl(1 + 2^(-52)) ≠ 1   but:   fl(1 + 2^(-53)) == 1 !!
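This is easy to confirm directly in Matlab:

% 2^-52 is the last represented bit of the mantissa of 1.0; 2^-53 falls off the end.
1 + 2^-52 > 1     % true: fl(1 + 2^-52) is the next float after 1
1 + 2^-53 == 1    % true: fl(1 + 2^-53) rounds back to 1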

Special Values

Exponent field = all zeros (smallest possible value): no hidden bit. If in addition all mantissa bits are 0, the number is exactly zero. Otherwise this encodes unnormalized numbers, leading to gradual underflow.

Exponent field = all ones (largest possible value): the number represented is Inf, -Inf or NaN.
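A few Matlab one-liners that exercise these special values:

% Special values in IEEE double precision arithmetic.
1/0               % Inf
-1/0              % -Inf
0/0               % NaN
realmin           % smallest normalized double, 2^-1022
realmin/2^52      % smallest unnormalized (denormal) double, 2^-1074
realmin/2^53      % underflows to 0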

Appendix to set 3: Analysis of inner products

Consider

    s_n = fl(x_1 y_1 + x_2 y_2 + ... + x_n y_n)

In what follows the η_i's come from the multiplications and the ε_i's come from the additions. They satisfy |η_i| ≤ u and |ε_i| ≤ u. The inner product s_n is computed as:

1. s_1 = fl(x_1 y_1) = (x_1 y_1)(1 + η_1)

2. s_2 = fl(s_1 + fl(x_2 y_2)) = fl(s_1 + x_2 y_2 (1 + η_2))
       = (x_1 y_1 (1 + η_1) + x_2 y_2 (1 + η_2))(1 + ε_2)
       = x_1 y_1 (1 + η_1)(1 + ε_2) + x_2 y_2 (1 + η_2)(1 + ε_2)

3. s_3 = fl(s_2 + fl(x_3 y_3)) = fl(s_2 + x_3 y_3 (1 + η_3))
       = (s_2 + x_3 y_3 (1 + η_3))(1 + ε_3)

Expand:

    s_3 = x_1 y_1 (1 + η_1)(1 + ε_2)(1 + ε_3)
        + x_2 y_2 (1 + η_2)(1 + ε_2)(1 + ε_3)
        + x_3 y_3 (1 + η_3)(1 + ε_3)

Induction shows that [with the convention that ε_1 ≡ 0]

    s_n = Σ_{i=1}^n x_i y_i (1 + η_i) Π_{j=i}^n (1 + ε_j)

Q: How many factors appear in the coefficient of x_i y_i?
A: When i > 1: 1 + (n - i + 1) = n - i + 2.
   When i = 1: n (since ε_1 = 0 does not count).
Bottom line: always ≤ n.

For each of these products,

    (1 + η_i) Π_{j=i}^n (1 + ε_j) = 1 + θ_i,   with   |θ_i| ≤ γ_n

so:

    s_n = Σ_{i=1}^n x_i y_i (1 + θ_i)   with   |θ_i| ≤ γ_n

or:

    fl(Σ_{i=1}^n x_i y_i) = Σ_{i=1}^n x_i y_i + Σ_{i=1}^n x_i y_i θ_i   with   |θ_i| ≤ γ_n

This leads to the final result (forward form)

    |fl(Σ_{i=1}^n x_i y_i) - Σ_{i=1}^n x_i y_i| ≤ γ_n Σ_{i=1}^n |x_i y_i|

or (backward form)

    fl(Σ_{i=1}^n x_i y_i) = Σ_{i=1}^n x_i y_i (1 + θ_i)   with   |θ_i| ≤ γ_n
