FLOATING POINT ARITHMETHIC - ERROR ANALYSIS
|
|
- Brice Lang
- 5 years ago
- Views:
Transcription
1 FLOATING POINT ARITHMETHIC - ERROR ANALYSIS Brief review of floating point arithmetic Model of floating point arithmetic Notation, backward and forward errors 3-1
2 Roundoff errors and floating-point arithmetic The basic problem: The set A of all possible representable numbers on a given machine is finite - but we would like to use this set to perform standard arithmetic operations (+,*,-,/) on an infinite set. The usual algebra rules are no longer satisfied since results of operations are rounded. Basic algebra breaks down in floating point arithmetic. Example: In floating point arithmetic. a + (b + c)! = (a + b) + c Matlab experiment: For 10,000 random numbers find number of instances when the above is true. Same thing for the multiplication TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-2
3 Floating point representation: Real numbers are represented in two parts: A mantissa (significand) and an exponent. If the representation is in the base β then: x = ±(.d 1 d 2 d t )β e.d 1 d 2 d t is a fraction in the base-β representation (Generally the form is normalized in that d 1 0), and e is an integer Often, more convenient to rewrite the above as: x = ±(m/β t ) β e ±m β e t Mantissa m is an integer with 0 m β t TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-3
4 Machine precision - machine epsilon Notation : f l(x) = closest floating point representation of real number x ( rounding ) When a number x is very small, there is a point when 1+x == 1 in a machine sense. The computer no longer makes a difference between 1 and 1 + x. Machine epsilon: The smallest number ɛ such that 1 + ɛ is a float that is different from one, is called machine epsilon. Denoted by macheps or eps, it represents the distance from 1 to the next larger floating point number. With previous representation, eps is equal to β (t 1). 3-4 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-4
5 Example: In IEEE standard double precision, β = 2, and t = 53 (includes hidden bit ). Therefore eps = Unit Round-off A real number x can be approximated by a floating number fl(x) with relative error no larger than u = 1 2 β (t 1). u is called Unit Round-off. In fact can easily show: fl(x) = x(1 + δ) with δ < u Matlab experiment: find the machine epsilon on your computer. Many discussions on what conditions/ rules should be satisfied by floating point arithmetic. The IEEE standard is a set of standards adopted by many CPU manufacturers. 3-5 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-5
6 Rule 1. fl(x) = x(1 + ɛ), where ɛ u Rule 2. For all operations (one of +,,, /) fl(x y) = (x y)(1 + ɛ ), where ɛ u Rule 3. For +, operations fl(a b) = fl(b a) Matlab experiment: Verify experimentally Rule 3 with 10,000 randomly generated numbers a i, b i. 3-6 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-6
7 Example: Consider the sum of 3 numbers: y = a + b + c. Done as fl(fl(a + b) + c) η = fl(a + b) = (a + b)(1 + ɛ 1 ) y 1 = fl(η + c) = (η + c)(1 + ɛ 2 ) = [(a + b)(1 + ɛ 1 ) + c] (1 + ɛ 2 ) = [(a + b + c) [ + (a + b)ɛ 1 )] (1 + ɛ 2 ) = (a + b + c) 1 + a + b ] a + b + c ɛ 1(1 + ɛ 2 ) + ɛ 2 So disregarding the high order term ɛ 1 ɛ 2 fl(fl(a + b) + c) = (a + b + c)(1 + ɛ 3 ) ɛ 3 a + b a + b + c ɛ 1 + ɛ TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-7
8 If we redid the computation as y 2 = fl(a + fl(b + c)) we would find fl(a + fl(b + c)) = (a + b + c)(1 + ɛ 4 ) ɛ 4 b + c a + b + c ɛ 1 + ɛ 2 The error is amplified by the factor (a + b)/y in the first case and (b + c)/y in the second case. In order to sum n numbers accurately, it is better to start with small numbers first. [However, sorting before adding is not worth it.] But watch out if the numbers have mixed signs! 3-8 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-8
9 The absolute value notation For a given vector x, x is the vector with components x i, i.e., x is the component-wise absolute value of x. Similarly for matrices: A = { a ij } i=1,...,m; j=1,...,n An obvious result: The basic inequality fl(a ij ) a ij u a ij translates into fl(a) = A + E with E u A A B means a ij b ij for all 1 i m; 1 j n 3-9 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-9
10 Error Analysis: Inner product Inner products are in the innermost parts of many calculations. Their analysis is important. Lemma: If δ i u and nu < 1 then Π n i=1 (1 + δ i) = 1 + θ n where θ n nu 1 nu Common notation γ n nu 1 nu Prove the lemma [Hint: use induction] 3-10 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-10
11 Can use the following simpler result: Lemma: If δ i u and nu <.01 then Π n i=1 (1 + δ i) = 1 + θ n where θ n 1.01nu Example: Previous sum of numbers can be written fl(a + b + c) = a(1 + ɛ 1 )(1 + ɛ 2 ) + b(1 + ɛ 1 )(1 + ɛ 2 ) + c(1 + ɛ 2 ) = a(1 + θ 1 ) + b(1 + θ 2 ) + c(1 + θ 3 ) = exact sum of slightly perturbed inputs, where all θ i s satisfy θ i 1.01nu (here n = 2). Alternatively, can write forward bound: fl(a + b + c) (a + b + c) aθ 1 + bθ 2 + cθ TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-11
12 Backward and forward errors Assume the approximation ŷ to y = alg(x) is computed by some algorithm with arithmetic precision ɛ. Possible analysis: find an upper bound for the Forward error This is not always easy. Alternative question: y = y ŷ find equivalent perturbation on initial data (x) that produces the result ŷ. In other words, find x so that: alg(x + x) = ŷ The value of x is called the backward error. An analysis to find an upper bound for x is called Backward error analysis TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-12
13 Example: A = ( ) a b 0 c B = ( ) d e 0 f Consider the product: f l(a.b) = [ (ad)(1 + ɛ1 ) [ae(1 + ɛ 2 ) + bf(1 + ɛ 3 )] (1 + ɛ 4 ) 0 cf(1 + ɛ 5 ) with ɛ i u, for i = 1,..., 5. Result can be written as: [ ] [ a b(1 + ɛ3 )(1 + ɛ 4 ) d(1 + ɛ1 ) e(1 + ɛ 2 )(1 + ɛ 4 ) 0 c(1 + ɛ 5 ) 0 f ] ] So fl(a.b) = (A + E A )(B + E B ). Backward errors E A, E B satisfy: E A 2u A + O(u 2 ) ; E B 2u B + O(u 2 ) 3-13 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-13
14 When solving Ax = b by Gaussian Elimination, we will see that a bound on e x such that this holds exactly: A(x computed + e x ) = b is much harder to find than bounds on E A, e b such that this holds exactly: (A + E A )x computed = (b + e b ). Note: In many instances backward errors are more meaningful than forward errors: if initial data is accurate only to 4 digits say, then my algorithm for computing x need not guarantee a backward error of less then for example. A backward error of order 10 4 is acceptable TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-14
15 Main result on inner products: Backward error expression: fl(x T y) = [x. (1 + d x )] T [y. (1 + d y )] where d 1.01nu, = x, y. Can show equality valid even if one of the d x, d y absent. Forward error expression: fl(x T y) x T y γ n x T y with 0 γ n 1.01nu. Elementwise absolute value x and multiply. notation. Above assumes nu.01. For u = , this holds for n TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-15
16 Consequence of lemma: fl(a B) A B γ n A B Another way to write the result (less precise) is fl(x T y) x T y n u x T y + O(u 2 ) 3-16 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-16
17 Assume you use single precision for which you have u = What is the largest n for which nu 0.01 holds? Any conclusions for the use of single precision arithmetic? What does the main result on inner products imply for the case when y = x? [Contrast the relative accuracy you get in this case vs. the general case when y x] 3-17 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-17
18 Show for any x, y, there exist x, y such that fl(x T y) = (x + x) T y, with x γ n x fl(x T y) = x T (y + y), with y γ n y (Continuation) Let A an m n matrix, x an n-vector, and y = Ax. Show that there exist a matrix A such fl(y) = (A + A)x, with A γ n A (Continuation) From the above derive a result about a column of the product of two matrices A and B. Does a similar result hold for the product AB as a whole? 3-18 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float 3-18
19 Supplemental notes: Floating Point Arithmetic In most computing systems, real numbers are represented in two parts: A mantissa and an exponent. If the representation is in the base β then: x = ±(.d 1 d 2 d m ) β β e.d 1 d 2 d m is a fraction in the base-β representation e is an integer - can be negative, positive or zero. Generally the form is normalized in that d TB: 13; GvL 2.7; Ort 9.2; AB: FloatSuppl 3-19
20 Example: In base 10 (for illustration) can be written as can be written as Problem with floating point arithmetic: we have to live with limited precision. Example: Assume that we have only 5 digits of accuray in the mantissa and 2 digits for the exponent (excluding sign)..d 1 d 2 d 3 d 4 d 5 e 1 e TB: 13; GvL 2.7; Ort 9.2; AB: FloatSuppl 3-20
21 Try to add =.10002e+03 and 1.07 =.10700e+01: = ; 1.07 = First task: align decimal points. The one with smallest exponent will be (internally) rewritten so its exponent matches the largest one: 1.07 = Second task: add mantissas: = TB: 13; GvL 2.7; Ort 9.2; AB: FloatSuppl 3-21
22 Third task: round result. Result has 6 digits - can use only 5 so we can Chop result: ; Round result: ; Fourth task: Normalize result if needed (not needed here) result with rounding: ; Redo the same thing with or TB: 13; GvL 2.7; Ort 9.2; AB: FloatSuppl 3-22
23 Some More Examples Each operation f l(x y) proceeds in 4 steps: 1. Line up exponents (for addition & subtraction). 2. Compute temporary exact answer. 3. Normalize temporary result. 4. Round to nearest representable number (round-to-even in case of a tie) e e e e e e-98 temporary e e e-98 normalize e e e-99 round e e e-99 note: round to even round to nearest =not all 0 s too small: unnormalized exactly halfway between values closer to upper value exponent is at minimum 3-23 TB: 13; GvL 2.7; Ort 9.2; AB: FloatSuppl 3-23
24 The IEEE standard 32 bit (Single precision) : ± 8 bits 23 bits }{{} exponent } mantissa {{} sign In binary: The leading one in mantissa does not need to be represented. One bit gained. Hidden bit. Largest exponent: = 127; Smallest: = 126. [ bias of 127] 3-24 TB: 13; GvL 2.7; Ort 9.2; AB: FloatSuppl 3-24
25 64 bit (Double precision) : ± 11 bits 52 bits }{{} exponent } mantissa {{} sign Bias of 1023 so if c is the contents of exponent field actual exponent value is 2 c 1023 e + bias = 2047 (all ones) = special use Largest exponent: 1023; Smallest = Including the hidden bit, mantissa has total of 53 bits (52 bits represented, one hidden) TB: 13; GvL 2.7; Ort 9.2; AB: FloatSuppl 3-25
26 Take the number 1.0 and see what will happen if you add 1/2, 1/4,..., 2 i. Do not forget the hidden bit! Hidden bit (Not represented) Expon. 52 bits e e e e e (Note: The e part has 12 bits and includes the sign) Conclusion fl( ) 1 but: fl( ) == 1!! 3-26 TB: 13; GvL 2.7; Ort 9.2; AB: FloatSuppl 3-26
27 Special Values Exponent field = (smallest possible value) No hidden bit. All bits == 0 means exactly zero. Allow for unnormalized numbers, leading to gradual underflow. Exponent field = (largest possible value) Number represented is Inf -Inf or NaN TB: 13; GvL 2.7; Ort 9.2; AB: FloatSuppl 3-27
28 Appendix to set 3: Analysis of inner products Consider s n = fl(x 1 y 1 + x 2 y x n y n ) In what follows η i s comme from, ɛ i s comme from + They satisfy: η i u and ɛ i u. The inner product s n is computed as: 1. s 1 = fl(x 1 y 1 ) = (x 1 y 1 )(1 + η 1 ) 2. s 2 = fl(s 1 + fl(x 2 y 2 )) = fl(s 1 + x 2 y 2 (1 + η 2 )) = (x 1 y 1 (1 + η 1 ) + x 2 y 2 (1 + η 2 )) (1 + ɛ 2 ) = x 1 y 1 (1 + η 1 )(1 + ɛ 2 ) + x 2 y 2 (1 + η 2 )(1 + ɛ 2 ) 3. s 3 = fl(s 2 + fl(x 3 y 3 )) = fl(s 2 + x 3 y 3 (1 + η 3 )) = (s 2 + x 3 y 3 (1 + η 3 ))(1 + ɛ 3 ) 3-28 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float2 3-28
29 Expand: s 3 = x 1 y 1 (1 + η 1 )(1 + ɛ 2 )(1 + ɛ 3 ) +x 2 y 2 (1 + η 2 )(1 + ɛ 2 )(1 + ɛ 3 ) +x 3 y 3 (1 + η 3 )(1 + ɛ 3 ) Induction would show that [with convention that ɛ 1 0] s n = n x i y i (1 + η i ) n (1 + ɛ j ) i=1 j=i Q: How many terms in the coefficient of x i y i do we have? A: When i > 1 : 1 + (n i + 1) = n i + 2 When i = 1 : n (since ɛ 1 = 0 does not count) Bottom line: always n TB: 13-15; GvL 2.7; Ort 9.2; AB: Float2 3-29
30 For each of these products (1 + η i ) n j=i (1 + ɛ j) = 1 + θ i, with θ i γ n u so: s n = n i=1 x iy i (1 + θ i ) with θ i γ n or: fl ( n i=1 x iy i ) = n i=1 x iy i + n i=1 x iy i θ i with θ i γ n This leads to the final result (forward form) ( n ) n n fl x i y i x i y i γ n x i y i i=1 i=1 i=1 or (backward form) ( n ) fl x i y i = i=1 n i=1 x i y i (1 + θ i ) with θ i γ n 3-30 TB: 13-15; GvL 2.7; Ort 9.2; AB: Float2 3-30
FLOATING POINT ARITHMETHIC - ERROR ANALYSIS
FLOATING POINT ARITHMETHIC - ERROR ANALYSIS Brief review of floating point arithmetic Model of floating point arithmetic Notation, backward and forward errors Roundoff errors and floating-point arithmetic
More informationLecture 7. Floating point arithmetic and stability
Lecture 7 Floating point arithmetic and stability 2.5 Machine representation of numbers Scientific notation: 23 }{{} }{{} } 3.14159265 {{} }{{} 10 sign mantissa base exponent (significand) s m β e A floating
More informationCME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 5. Ax = b.
CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 5 GENE H GOLUB Suppose we want to solve We actually have an approximation ξ such that 1 Perturbation Theory Ax = b x = ξ + e The question is, how
More informationMathematical preliminaries and error analysis
Mathematical preliminaries and error analysis Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan September 12, 2015 Outline 1 Round-off errors and computer arithmetic
More informationJim Lambers MAT 610 Summer Session Lecture 2 Notes
Jim Lambers MAT 610 Summer Session 2009-10 Lecture 2 Notes These notes correspond to Sections 2.2-2.4 in the text. Vector Norms Given vectors x and y of length one, which are simply scalars x and y, the
More information1 Floating point arithmetic
Introduction to Floating Point Arithmetic Floating point arithmetic Floating point representation (scientific notation) of numbers, for example, takes the following form.346 0 sign fraction base exponent
More informationNotes for Chapter 1 of. Scientific Computing with Case Studies
Notes for Chapter 1 of Scientific Computing with Case Studies Dianne P. O Leary SIAM Press, 2008 Mathematical modeling Computer arithmetic Errors 1999-2008 Dianne P. O'Leary 1 Arithmetic and Error What
More informationArithmetic and Error. How does error arise? How does error arise? Notes for Part 1 of CMSC 460
Notes for Part 1 of CMSC 460 Dianne P. O Leary Preliminaries: Mathematical modeling Computer arithmetic Errors 1999-2006 Dianne P. O'Leary 1 Arithmetic and Error What we need to know about error: -- how
More information1.1 COMPUTER REPRESENTATION OF NUM- BERS, REPRESENTATION ERRORS
Chapter 1 NUMBER REPRESENTATION, ERROR ANALYSIS 1.1 COMPUTER REPRESENTATION OF NUM- BERS, REPRESENTATION ERRORS Floating-point representation x t,r of a number x: x t,r = m t P cr, where: P - base (the
More informationNotes on floating point number, numerical computations and pitfalls
Notes on floating point number, numerical computations and pitfalls November 6, 212 1 Floating point numbers An n-digit floating point number in base β has the form x = ±(.d 1 d 2 d n ) β β e where.d 1
More informationElements of Floating-point Arithmetic
Elements of Floating-point Arithmetic Sanzheng Qiao Department of Computing and Software McMaster University July, 2012 Outline 1 Floating-point Numbers Representations IEEE Floating-point Standards Underflow
More informationECS 231 Computer Arithmetic 1 / 27
ECS 231 Computer Arithmetic 1 / 27 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 2 / 27 Outline 1 Floating-point numbers
More informationChapter 1 Error Analysis
Chapter 1 Error Analysis Several sources of errors are important for numerical data processing: Experimental uncertainty: Input data from an experiment have a limited precision. Instead of the vector of
More informationElements of Floating-point Arithmetic
Elements of Floating-point Arithmetic Sanzheng Qiao Department of Computing and Software McMaster University July, 2012 Outline 1 Floating-point Numbers Representations IEEE Floating-point Standards Underflow
More informationFloating-point Computation
Chapter 2 Floating-point Computation 21 Positional Number System An integer N in a number system of base (or radix) β may be written as N = a n β n + a n 1 β n 1 + + a 1 β + a 0 = P n (β) where a i are
More informationACM 106a: Lecture 1 Agenda
1 ACM 106a: Lecture 1 Agenda Introduction to numerical linear algebra Common problems First examples Inexact computation What is this course about? 2 Typical numerical linear algebra problems Systems of
More informationBinary floating point
Binary floating point Notes for 2017-02-03 Why do we study conditioning of problems? One reason is that we may have input data contaminated by noise, resulting in a bad solution even if the intermediate
More informationIntroduction CSE 541
Introduction CSE 541 1 Numerical methods Solving scientific/engineering problems using computers. Root finding, Chapter 3 Polynomial Interpolation, Chapter 4 Differentiation, Chapter 4 Integration, Chapters
More informationLecture Notes 7, Math/Comp 128, Math 250
Lecture Notes 7, Math/Comp 128, Math 250 Misha Kilmer Tufts University October 23, 2005 Floating Point Arithmetic We talked last time about how the computer represents floating point numbers. In a floating
More informationComputer Arithmetic. MATH 375 Numerical Analysis. J. Robert Buchanan. Fall Department of Mathematics. J. Robert Buchanan Computer Arithmetic
Computer Arithmetic MATH 375 Numerical Analysis J. Robert Buchanan Department of Mathematics Fall 2013 Machine Numbers When performing arithmetic on a computer (laptop, desktop, mainframe, cell phone,
More informationHomework 2 Foundations of Computational Math 1 Fall 2018
Homework 2 Foundations of Computational Math 1 Fall 2018 Note that Problems 2 and 8 have coding in them. Problem 2 is a simple task while Problem 8 is very involved (and has in fact been given as a programming
More informationErrors. Intensive Computation. Annalisa Massini 2017/2018
Errors Intensive Computation Annalisa Massini 2017/2018 Intensive Computation - 2017/2018 2 References Scientific Computing: An Introductory Survey - Chapter 1 M.T. Heath http://heath.cs.illinois.edu/scicomp/notes/index.html
More informationApplied Mathematics 205. Unit 0: Overview of Scientific Computing. Lecturer: Dr. David Knezevic
Applied Mathematics 205 Unit 0: Overview of Scientific Computing Lecturer: Dr. David Knezevic Scientific Computing Computation is now recognized as the third pillar of science (along with theory and experiment)
More information1 Backward and Forward Error
Math 515 Fall, 2008 Brief Notes on Conditioning, Stability and Finite Precision Arithmetic Most books on numerical analysis, numerical linear algebra, and matrix computations have a lot of material covering
More informationMAT 460: Numerical Analysis I. James V. Lambers
MAT 460: Numerical Analysis I James V. Lambers January 31, 2013 2 Contents 1 Mathematical Preliminaries and Error Analysis 7 1.1 Introduction............................ 7 1.1.1 Error Analysis......................
More information8/13/16. Data analysis and modeling: the tools of the trade. Ø Set of numbers. Ø Binary representation of numbers. Ø Floating points.
Data analysis and modeling: the tools of the trade Patrice Koehl Department of Biological Sciences National University of Singapore http://www.cs.ucdavis.edu/~koehl/teaching/bl5229 koehl@cs.ucdavis.edu
More informationMath 128A: Homework 2 Solutions
Math 128A: Homework 2 Solutions Due: June 28 1. In problems where high precision is not needed, the IEEE standard provides a specification for single precision numbers, which occupy 32 bits of storage.
More informationEAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science
EAD 115 Numerical Solution of Engineering and Scientific Problems David M. Rocke Department of Applied Science Computer Representation of Numbers Counting numbers (unsigned integers) are the numbers 0,
More informationNumerical Algorithms. IE 496 Lecture 20
Numerical Algorithms IE 496 Lecture 20 Reading for This Lecture Primary Miller and Boxer, Pages 124-128 Forsythe and Mohler, Sections 1 and 2 Numerical Algorithms Numerical Analysis So far, we have looked
More informationFloating Point Number Systems. Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le
Floating Point Number Systems Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le 1 Overview Real number system Examples Absolute and relative errors Floating point numbers Roundoff
More informationPowerPoints organized by Dr. Michael R. Gustafson II, Duke University
Part 1 Chapter 4 Roundoff and Truncation Errors PowerPoints organized by Dr. Michael R. Gustafson II, Duke University All images copyright The McGraw-Hill Companies, Inc. Permission required for reproduction
More informationChapter 1 Mathematical Preliminaries and Error Analysis
Chapter 1 Mathematical Preliminaries and Error Analysis Per-Olof Persson persson@berkeley.edu Department of Mathematics University of California, Berkeley Math 128A Numerical Analysis Limits and Continuity
More informationNumerical Methods - Preliminaries
Numerical Methods - Preliminaries Y. K. Goh Universiti Tunku Abdul Rahman 2013 Y. K. Goh (UTAR) Numerical Methods - Preliminaries 2013 1 / 58 Table of Contents 1 Introduction to Numerical Methods Numerical
More informationLecture Note 2: The Gaussian Elimination and LU Decomposition
MATH 5330: Computational Methods of Linear Algebra Lecture Note 2: The Gaussian Elimination and LU Decomposition The Gaussian elimination Xianyi Zeng Department of Mathematical Sciences, UTEP The method
More informationHow do computers represent numbers?
How do computers represent numbers? Tips & Tricks Week 1 Topics in Scientific Computing QMUL Semester A 2017/18 1/10 What does digital mean? The term DIGITAL refers to any device that operates on discrete
More information4.2 Floating-Point Numbers
101 Approximation 4.2 Floating-Point Numbers 4.2 Floating-Point Numbers The number 3.1416 in scientific notation is 0.31416 10 1 or (as computer output) -0.31416E01..31416 10 1 exponent sign mantissa base
More informationAn Introduction to Numerical Analysis. James Brannick. The Pennsylvania State University
An Introduction to Numerical Analysis James Brannick The Pennsylvania State University Contents Chapter 1. Introduction 5 Chapter 2. Computer arithmetic and Error Analysis 7 Chapter 3. Approximation and
More informationRound-off Errors and Computer Arithmetic - (1.2)
Round-off Errors and Comuter Arithmetic - (.). Round-off Errors: Round-off errors is roduced when a calculator or comuter is used to erform real number calculations. That is because the arithmetic erformed
More informationCME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 9
CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 9 GENE H GOLUB 1 Error Analysis of Gaussian Elimination In this section, we will consider the case of Gaussian elimination and perform a detailed
More informationChapter 1: Introduction and mathematical preliminaries
Chapter 1: Introduction and mathematical preliminaries Evy Kersalé September 26, 2011 Motivation Most of the mathematical problems you have encountered so far can be solved analytically. However, in real-life,
More informationChapter 1 Computer Arithmetic
Numerical Analysis (Math 9372) 2017-2016 Chapter 1 Computer Arithmetic 1.1 Introduction Numerical analysis is a way to solve mathematical problems by special procedures which use arithmetic operations
More informationLet x be an approximate solution for Ax = b, e.g., obtained by Gaussian elimination. Let x denote the exact solution. Call. r := b A x.
ESTIMATION OF ERROR Let x be an approximate solution for Ax = b, e.g., obtained by Gaussian elimination. Let x denote the exact solution. Call the residual for x. Then r := b A x r = b A x = Ax A x = A
More informationMath 411 Preliminaries
Math 411 Preliminaries Provide a list of preliminary vocabulary and concepts Preliminary Basic Netwon s method, Taylor series expansion (for single and multiple variables), Eigenvalue, Eigenvector, Vector
More informationINTRODUCTION TO COMPUTATIONAL MATHEMATICS
INTRODUCTION TO COMPUTATIONAL MATHEMATICS Course Notes for CM 271 / AMATH 341 / CS 371 Fall 2007 Instructor: Prof. Justin Wan School of Computer Science University of Waterloo Course notes by Prof. Hans
More informationWhat Every Programmer Should Know About Floating-Point Arithmetic DRAFT. Last updated: November 3, Abstract
What Every Programmer Should Know About Floating-Point Arithmetic Last updated: November 3, 2014 Abstract The article provides simple answers to the common recurring questions of novice programmers about
More informationDense LU factorization and its error analysis
Dense LU factorization and its error analysis Laura Grigori INRIA and LJLL, UPMC February 2016 Plan Basis of floating point arithmetic and stability analysis Notation, results, proofs taken from [N.J.Higham,
More informationReview of matrices. Let m, n IN. A rectangle of numbers written like A =
Review of matrices Let m, n IN. A rectangle of numbers written like a 11 a 12... a 1n a 21 a 22... a 2n A =...... a m1 a m2... a mn where each a ij IR is called a matrix with m rows and n columns or an
More information1 ERROR ANALYSIS IN COMPUTATION
1 ERROR ANALYSIS IN COMPUTATION 1.2 Round-Off Errors & Computer Arithmetic (a) Computer Representation of Numbers Two types: integer mode (not used in MATLAB) floating-point mode x R ˆx F(β, t, l, u),
More informationGaussian Elimination for Linear Systems
Gaussian Elimination for Linear Systems Tsung-Ming Huang Department of Mathematics National Taiwan Normal University October 3, 2011 1/56 Outline 1 Elementary matrices 2 LR-factorization 3 Gaussian elimination
More information1 Finding and fixing floating point problems
Notes for 2016-09-09 1 Finding and fixing floating point problems Floating point arithmetic is not the same as real arithmetic. Even simple properties like associativity or distributivity of addition and
More informationIntroduction to Scientific Computing Languages
1 / 19 Introduction to Scientific Computing Languages Prof. Paolo Bientinesi pauldj@aices.rwth-aachen.de Numerical Representation 2 / 19 Numbers 123 = (first 40 digits) 29 4.241379310344827586206896551724137931034...
More informationNext topics: Solving systems of linear equations
Next topics: Solving systems of linear equations 1 Gaussian elimination (today) 2 Gaussian elimination with partial pivoting (Week 9) 3 The method of LU-decomposition (Week 10) 4 Iterative techniques:
More informationBinary Floating-Point Numbers
Binary Floating-Point Numbers S exponent E significand M F=(-1) s M β E Significand M pure fraction [0, 1-ulp] or [1, 2) for β=2 Normalized form significand has no leading zeros maximum # of significant
More informationTu: 9/3/13 Math 471, Fall 2013, Section 001 Lecture 1
Tu: 9/3/13 Math 71, Fall 2013, Section 001 Lecture 1 1 Course intro Notes : Take attendance. Instructor introduction. Handout : Course description. Note the exam days (and don t be absent). Bookmark the
More informationOutline. Math Numerical Analysis. Errors. Lecture Notes Linear Algebra: Part B. Joseph M. Mahaffy,
Math 54 - Numerical Analysis Lecture Notes Linear Algebra: Part B Outline Joseph M. Mahaffy, jmahaffy@mail.sdsu.edu Department of Mathematics and Statistics Dynamical Systems Group Computational Sciences
More informationAlgebra II. A2.1.1 Recognize and graph various types of functions, including polynomial, rational, and algebraic functions.
Standard 1: Relations and Functions Students graph relations and functions and find zeros. They use function notation and combine functions by composition. They interpret functions in given situations.
More informationAccurate polynomial evaluation in floating point arithmetic
in floating point arithmetic Université de Perpignan Via Domitia Laboratoire LP2A Équipe de recherche en Informatique DALI MIMS Seminar, February, 10 th 2006 General motivation Provide numerical algorithms
More information14.2 QR Factorization with Column Pivoting
page 531 Chapter 14 Special Topics Background Material Needed Vector and Matrix Norms (Section 25) Rounding Errors in Basic Floating Point Operations (Section 33 37) Forward Elimination and Back Substitution
More informationScientific Computing
Scientific Computing Direct solution methods Martin van Gijzen Delft University of Technology October 3, 2018 1 Program October 3 Matrix norms LU decomposition Basic algorithm Cost Stability Pivoting Pivoting
More informationSOLVING LINEAR SYSTEMS
SOLVING LINEAR SYSTEMS We want to solve the linear system a, x + + a,n x n = b a n, x + + a n,n x n = b n This will be done by the method used in beginning algebra, by successively eliminating unknowns
More informationROUNDOFF ERRORS; BACKWARD STABILITY
SECTION.5 ROUNDOFF ERRORS; BACKWARD STABILITY ROUNDOFF ERROR -- error due to the finite representation (usually in floatingpoint form) of real (and complex) numers in digital computers. FLOATING-POINT
More informationALU (3) - Division Algorithms
HUMBOLDT-UNIVERSITÄT ZU BERLIN INSTITUT FÜR INFORMATIK Lecture 12 ALU (3) - Division Algorithms Sommersemester 2002 Leitung: Prof. Dr. Miroslaw Malek www.informatik.hu-berlin.de/rok/ca CA - XII - ALU(3)
More informationIntroduction to Scientific Computing Languages
1 / 21 Introduction to Scientific Computing Languages Prof. Paolo Bientinesi pauldj@aices.rwth-aachen.de Numerical Representation 2 / 21 Numbers 123 = (first 40 digits) 29 4.241379310344827586206896551724137931034...
More informationDecimal Scientific Decimal Scientific
Experiment 00 - Numerical Review Name: 1. Scientific Notation Describing the universe requires some very big (and some very small) numbers. Such numbers are tough to write in long decimal notation, so
More informationMATH ASSIGNMENT 03 SOLUTIONS
MATH444.0 ASSIGNMENT 03 SOLUTIONS 4.3 Newton s method can be used to compute reciprocals, without division. To compute /R, let fx) = x R so that fx) = 0 when x = /R. Write down the Newton iteration for
More informationChapter 1 Mathematical Preliminaries and Error Analysis
Numerical Analysis (Math 3313) 2019-2018 Chapter 1 Mathematical Preliminaries and Error Analysis Intended learning outcomes: Upon successful completion of this chapter, a student will be able to (1) list
More informationCompute the behavior of reality even if it is impossible to observe the processes (for example a black hole in astrophysics).
1 Introduction Read sections 1.1, 1.2.1 1.2.4, 1.2.6, 1.3.8, 1.3.9, 1.4. Review questions 1.1 1.6, 1.12 1.21, 1.37. The subject of Scientific Computing is to simulate the reality. Simulation is the representation
More informationAn Introduction to Numerical Analysis
An Introduction to Numerical Analysis Department of Mathematical Sciences, NTNU 21st august 2012 Practical issues webpage: http://wiki.math.ntnu.no/tma4215/2012h/start Lecturer:, elenac@math.ntnu.no Lecures:
More informationReview Questions REVIEW QUESTIONS 71
REVIEW QUESTIONS 71 MATLAB, is [42]. For a comprehensive treatment of error analysis and perturbation theory for linear systems and many other problems in linear algebra, see [126, 241]. An overview of
More informationNumerical Analysis. Yutian LI. 2018/19 Term 1 CUHKSZ. Yutian LI (CUHKSZ) Numerical Analysis 2018/19 1 / 41
Numerical Analysis Yutian LI CUHKSZ 2018/19 Term 1 Yutian LI (CUHKSZ) Numerical Analysis 2018/19 1 / 41 Reference Books BF R. L. Burden and J. D. Faires, Numerical Analysis, 9th edition, Thomsom Brooks/Cole,
More informationLecture notes to the course. Numerical Methods I. Clemens Kirisits
Lecture notes to the course Numerical Methods I Clemens Kirisits November 8, 08 ii Preface These lecture notes are intended as a written companion to the course Numerical Methods I taught from 06 to 08
More informationThe Euclidean Division Implemented with a Floating-Point Multiplication and a Floor
The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor Vincent Lefèvre 11th July 2005 Abstract This paper is a complement of the research report The Euclidean division implemented
More informationWhat is it we are looking for in these algorithms? We want algorithms that are
Fundamentals. Preliminaries The first question we want to answer is: What is computational mathematics? One possible definition is: The study of algorithms for the solution of computational problems in
More informationLecture 1. Introduction to Numerical Methods. T. Gambill. Department of Computer Science University of Illinois at Urbana-Champaign.
Lecture 1 Introduction to Numerical Methods T. Gambill Department of Computer Science University of Illinois at Urbana-Champaign June 10, 2013 T. Gambill (UIUC) CS 357 June 10, 2013 1 / 52 Course Info
More informationError Analysis for Solving a Set of Linear Equations
Error Analysis for Solving a Set of Linear Equations We noted earlier that the solutions to an equation Ax = b depends significantly on the matrix A. In particular, a unique solution to Ax = b exists if
More informationThe numerical stability of barycentric Lagrange interpolation
IMA Journal of Numerical Analysis (2004) 24, 547 556 The numerical stability of barycentric Lagrange interpolation NICHOLAS J. HIGHAM Department of Mathematics, University of Manchester, Manchester M13
More informationMATH2071: LAB #5: Norms, Errors and Condition Numbers
MATH2071: LAB #5: Norms, Errors and Condition Numbers 1 Introduction Introduction Exercise 1 Vector Norms Exercise 2 Matrix Norms Exercise 3 Compatible Matrix Norms Exercise 4 More on the Spectral Radius
More informationGAUSSIAN ELIMINATION AND LU DECOMPOSITION (SUPPLEMENT FOR MA511)
GAUSSIAN ELIMINATION AND LU DECOMPOSITION (SUPPLEMENT FOR MA511) D. ARAPURA Gaussian elimination is the go to method for all basic linear classes including this one. We go summarize the main ideas. 1.
More informationLecture 12 (Tue, Mar 5) Gaussian elimination and LU factorization (II)
Math 59 Lecture 2 (Tue Mar 5) Gaussian elimination and LU factorization (II) 2 Gaussian elimination - LU factorization For a general n n matrix A the Gaussian elimination produces an LU factorization if
More informationLecture 2 MODES OF NUMERICAL COMPUTATION
1. Diversity of Numbers Lecture 2 Page 1 It is better to solve the right problem the wrong way than to solve the wrong problem the right way. The purpose of computing is insight, not numbers. Richard Wesley
More informationQUADRATIC PROGRAMMING?
QUADRATIC PROGRAMMING? WILLIAM Y. SIT Department of Mathematics, The City College of The City University of New York, New York, NY 10031, USA E-mail: wyscc@cunyvm.cuny.edu This is a talk on how to program
More informationNumerical Methods - Lecture 2. Numerical Methods. Lecture 2. Analysis of errors in numerical methods
Numerical Methods - Lecture 1 Numerical Methods Lecture. Analysis o errors in numerical methods Numerical Methods - Lecture Why represent numbers in loating point ormat? Eample 1. How a number 56.78 can
More informationNumber Representation and Waveform Quantization
1 Number Representation and Waveform Quantization 1 Introduction This lab presents two important concepts for working with digital signals. The first section discusses how numbers are stored in memory.
More informationIntroduction to Scientific Computing
(Lecture 2: Machine precision and condition number) B. Rosić, T.Moshagen Institute of Scientific Computing General information :) 13 homeworks (HW) Work in groups of 2 or 3 people Each HW brings maximally
More informationMA580: Numerical Analysis: I. Zhilin Li
MA580: Numerical Analysis: I Zhilin Li Chapter 1 Introduction Why is this course important (motivations)? What is the role of this class in the problem solving process using mathematics and computers?
More informationAMSC/CMSC 466 Problem set 3
AMSC/CMSC 466 Problem set 3 1. Problem 1 of KC, p180, parts (a), (b) and (c). Do part (a) by hand, with and without pivoting. Use MATLAB to check your answer. Use the command A\b to get the solution, and
More informationNumerical Linear Algebra
Numerical Linear Algebra Decompositions, numerical aspects Gerard Sleijpen and Martin van Gijzen September 27, 2017 1 Delft University of Technology Program Lecture 2 LU-decomposition Basic algorithm Cost
More informationProgram Lecture 2. Numerical Linear Algebra. Gaussian elimination (2) Gaussian elimination. Decompositions, numerical aspects
Numerical Linear Algebra Decompositions, numerical aspects Program Lecture 2 LU-decomposition Basic algorithm Cost Stability Pivoting Cholesky decomposition Sparse matrices and reorderings Gerard Sleijpen
More informationLeast-Squares Systems and The QR factorization
Least-Squares Systems and The QR factorization Orthogonality Least-squares systems. The Gram-Schmidt and Modified Gram-Schmidt processes. The Householder QR and the Givens QR. Orthogonality The Gram-Schmidt
More informationA Backward Stable Hyperbolic QR Factorization Method for Solving Indefinite Least Squares Problem
A Backward Stable Hyperbolic QR Factorization Method for Solving Indefinite Least Suares Problem Hongguo Xu Dedicated to Professor Erxiong Jiang on the occasion of his 7th birthday. Abstract We present
More informationOrder of convergence. MA3232 Numerical Analysis Week 3 Jack Carl Kiefer ( ) Question: How fast does x n
Week 3 Jack Carl Kiefer (94-98) Jack Kiefer was an American statistician. Much of his research was on the optimal design of eperiments. However, he also made significant contributions to other areas of
More informationNumerical methods for solving linear systems
Chapter 2 Numerical methods for solving linear systems Let A C n n be a nonsingular matrix We want to solve the linear system Ax = b by (a) Direct methods (finite steps); Iterative methods (convergence)
More informationCS227-Scientific Computing. Lecture 4: A Crash Course in Linear Algebra
CS227-Scientific Computing Lecture 4: A Crash Course in Linear Algebra Linear Transformation of Variables A common phenomenon: Two sets of quantities linearly related: y = 3x + x 2 4x 3 y 2 = 2.7x 2 x
More informationMATH 612 Computational methods for equation solving and function minimization Week # 6
MATH 612 Computational methods for equation solving and function minimization Week # 6 F.J.S. Spring 2014 University of Delaware FJS MATH 612 1 / 58 Plan for this week Discuss any problems you couldn t
More informationChapter 4 Number Representations
Chapter 4 Number Representations SKEE2263 Digital Systems Mun im/ismahani/izam {munim@utm.my,e-izam@utm.my,ismahani@fke.utm.my} February 9, 2016 Table of Contents 1 Fundamentals 2 Signed Numbers 3 Fixed-Point
More informationChapter 2. Error Correcting Codes. 2.1 Basic Notions
Chapter 2 Error Correcting Codes The identification number schemes we discussed in the previous chapter give us the ability to determine if an error has been made in recording or transmitting information.
More informationLinear Algebraic Equations
Linear Algebraic Equations 1 Fundamentals Consider the set of linear algebraic equations n a ij x i b i represented by Ax b j with [A b ] [A b] and (1a) r(a) rank of A (1b) Then Axb has a solution iff
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 09: Accuracy and Stability Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 12 Outline 1 Condition Number of Matrices
More informationThe Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal
-1- The Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal Claude-Pierre Jeannerod Jean-Michel Muller Antoine Plet Inria,
More informationNumerics and Error Analysis
Numerics and Error Analysis CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Numerics and Error Analysis 1 / 30 A Puzzle What
More information