Round-off Errors and Computer Arithmetic - (1.2)

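1. Round-off Errors:

Round-off error is produced when a calculator or computer is used to perform real-number calculations. The arithmetic performed in a machine involves numbers with only a finite number of digits, so the calculated results are only approximations of the actual numbers. Formats for single, double, and extended precision, and their standards, are given in the IEEE report Binary Floating Point Arithmetic Standard 754-1985. Let us look through the following examples to see how a calculator and a computer store and work with real numbers.

Example: Using a TI-89, compute the values of $(1 + 1/n)^n$ for $n = 10^{12}, 10^{13}, 10^{14}, 10^{15}$.

n            (1 + 1/n)^n
$10^{12}$    2.7182818284577
$10^{13}$    2.7182818284589
$10^{14}$    1.0
$10^{15}$    1.0

It is known that $\lim_{n \to \infty} (1 + 1/n)^n = e = 2.7182818284590452354\ldots$ Why are the last two values 1.0? What went wrong?

Let us check a few more values of $n$ between $10^{13}$ and $10^{14}$. For one such $n$ the calculator returns 2.7182818284589 again, the same value as for $n = 10^{13}$, because it rounds the input before computing. For a slightly larger $n$ it returns a value larger than $e$. Since $(1 + 1/n)^n$ increases toward $e$ from below, that value is no longer accurate. For $n = 10^{14}$, the calculator treats $1 + 10^{-14}$ as 1.0, and then $(1 + 10^{-14})^{10^{14}} = 1.0^{10^{14}} = 1.0$.

I think the precision of a TI-89 is 14 digits. So it either truncates the 15th digit if it is less than 5, or adds 1 to the 14th digit if the 15th digit is 5 or higher.

Example: Consider a PC that implements a 64-bit (binary digit) representation for a real number:

1 bit: sign $s$
11 bits: $e_1 e_2 \ldots e_{11}$, exponent $c$ (characteristic)
52 bits: $m_1 m_2 \ldots m_{52}$, mantissa $f$ (fraction)

where

$$c = e_1 2^{10} + e_2 2^{9} + \cdots + e_{10} 2^{1} + e_{11} 2^{0}, \qquad f = m_1 \left(\tfrac{1}{2}\right)^{1} + m_2 \left(\tfrac{1}{2}\right)^{2} + \cdots + m_{52} \left(\tfrac{1}{2}\right)^{52}.$$

The system gives a floating-point number of the form

$$(-1)^s \, 2^{c-1023} (1 + f).$$

Note that $0 \le c \le 2^{10} + 2^{9} + \cdots + 2^{0} = 2^{11} - 1 = 2047$ and $0 \le f \le \tfrac{1}{2} + \left(\tfrac{1}{2}\right)^2 + \cdots + \left(\tfrac{1}{2}\right)^{52} = 1 - \left(\tfrac{1}{2}\right)^{52}$. Let $c_{max} = 2047$ and $f_{max} = 1 - \left(\tfrac{1}{2}\right)^{52}$.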
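
As a minimal sketch of the format above, the following Python splits a double into $s$, $c$, and $f$ and rebuilds the value from $(-1)^s 2^{c-1023}(1+f)$. The helper names `decode_double` and `rebuild` and the sample value 180.0 are illustrative assumptions, not part of the notes. One difference from the simplified model should be kept in mind: in actual IEEE 754 doubles the patterns $c = 0$ and $c = 2047$ are reserved (zero and subnormals, infinities and NaN), so the formula is applied here only to normal numbers with $1 \le c \le 2046$.

```python
import struct

def decode_double(x: float):
    """Split a normal double into sign s, characteristic c, and fraction f."""
    # Reinterpret the 8 bytes of the double as one unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    s = bits >> 63                 # 1 sign bit
    c = (bits >> 52) & 0x7FF       # 11 exponent bits
    m = bits & ((1 << 52) - 1)     # 52 mantissa bits as an integer
    f = m / 2**52                  # fraction m_1/2 + m_2/4 + ... (exact)
    return s, c, f

def rebuild(s: int, c: int, f: float) -> float:
    """Evaluate (-1)^s * 2^(c - 1023) * (1 + f); valid for 1 <= c <= 2046."""
    return (-1) ** s * 2.0 ** (c - 1023) * (1 + f)

s, c, f = decode_double(180.0)
print(s, c, f)            # 0 1030 0.40625
print(rebuild(s, c, f))   # 180.0
```

Working on the raw bit pattern keeps every step exact, so rebuilding a normal number returns the original value with no additional rounding.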

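Since all machine numbers are of the form $(-1)^s 2^{c-1023}(1+f)$, the smallest positive number is $2^{0-1023}(1 + 0) \approx 1.112536929 \times 10^{-308}$ and the maximum number (in magnitude) is $2^{c_{max}-1023}(1 + f_{max}) = 2^{2047-1023}\left(2 - 2^{-52}\right) \approx 3.5953862697246 \times 10^{308}$. Any number $x$ occurring in a computation with $|x| < 2^{-1023}$ results in underflow and is represented by 0, and any $x$ with $|x| > 2^{1024}(1 + f_{max})$ results in overflow, and the computation is stopped.

Example: Consider the machine number $x$: 0 10000000110 0110100000...0. Find the interval $I$ which contains all real numbers whose machine number is $x$. We need to find a lower bound $a$ and an upper bound $b$ of $x$, and then $I = (a, b)$.

$$c = 2^{10} + 2^{2} + 2^{1} = 1030, \qquad f = \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^3 + \left(\tfrac{1}{2}\right)^5 = 0.40625,$$

$$x = (-1)^0 \, 2^{1030-1023} (1 + 0.40625) = 180.0.$$

The next machine number smaller than $x$ is 0 10000000110 0110011111...1, for which

$$f = \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^3 + \left(\tfrac{1}{2}\right)^6 + \left(\tfrac{1}{2}\right)^7 + \cdots + \left(\tfrac{1}{2}\right)^{52} = 0.40625 - \left(\tfrac{1}{2}\right)^{52},$$

since the geometric sum $\left(\tfrac{1}{2}\right)^6 + \cdots + \left(\tfrac{1}{2}\right)^{52}$ equals $\left(\tfrac{1}{2}\right)^5 \left(1 - \left(\tfrac{1}{2}\right)^{47}\right)$. Therefore

$$a = 2^{1030-1023}\left(1 + 0.40625 - \left(\tfrac{1}{2}\right)^{52}\right) = x - 2^{7} \left(\tfrac{1}{2}\right)^{52} = 180 - 2^{-45}.$$

The next machine number larger than $x$ is 0 10000000110 0110100000...01, for which $f = 0.40625 + \left(\tfrac{1}{2}\right)^{52}$, so

$$b = 2^{1030-1023}\left(1 + 0.40625 + \left(\tfrac{1}{2}\right)^{52}\right) = x + 2^{-45} = 180 + 2^{-45}.$$

Since $2^{-45} \approx 2.8421709430404 \times 10^{-14}$, $x$ represents all real numbers in

$$(a, b) = \left(180 - 2.8421709430404 \times 10^{-14},\; 180 + 2.8421709430404 \times 10^{-14}\right).$$

Note that if $x = -180$ (1 10000000110 0110100000...0), then the bounds are mirrored: $(a, b) = \left(-180 - 2.8421709430404 \times 10^{-14},\; -180 + 2.8421709430404 \times 10^{-14}\right)$.

Given $x = 180$, its binary representation can be computed as follows. $s = 0$. Since $2^7 < 180 < 2^8$ (that is, $128 < 180 < 256$), subtract the largest power of 2 that fits and repeat: $180 - 128 = 52$, $52 - 32 = 20$, $20 - 16 = 4$, $4 - 4 = 0$. So

$$180 = 2^7 + 2^5 + 2^4 + 2^2 = 2^7\left(1 + 2^{-2} + 2^{-3} + 2^{-5}\right), \qquad f = \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^3 + \left(\tfrac{1}{2}\right)^5.$$

Hence $c - 1023 = 7$, so $c = 1030$. By the same idea, $1030 - 1024 = 6$ and $6 = 2^2 + 2^1$, so $c = 2^{10} + 2^2 + 2^1$ and $x$ = 0 10000000110 0110100000...0.

Now let the machine numbers be represented in the normalized decimal floating-point form, as they are displayed on a screen, say as $k$-digit decimal machine numbers:

$$\pm 0.d_1 d_2 \ldots d_k \times 10^n, \quad \text{where } 1 \le d_1 \le 9 \text{ and } 0 \le d_i \le 9 \text{ for } i = 2, 3, \ldots, k.$$

Let $fl(x)$ be the floating-point form of $x$. Remember that $fl(x)$ is a machine approximation of the true value of $x$. Let $x = 0.d_1 d_2 \ldots d_k d_{k+1} d_{k+2} \ldots \times 10^n$.

Chopping method: $fl(x) = 0.d_1 d_2 \ldots d_k \times 10^n$.

Rounding method: $fl(x) = 0.d_1 d_2 \ldots d_k \times 10^n$ if $d_{k+1} < 5$, and $fl(x) = \left(0.d_1 d_2 \ldots d_k + 10^{-k}\right) \times 10^n$ if $d_{k+1} \ge 5$.

Example: Give the floating-point form of $\pi$ using (a) 5-digit chopping and (b) 5-digit rounding.

$\pi = 3.14159265358979\ldots = 0.314159265358979\ldots \times 10^1$

(a) $fl(\pi) = 0.31415 \times 10^1$
(b) $fl(\pi) = 0.31416 \times 10^1$

2. Absolute Error and Relative Error:

Let $p^*$ be an approximation to $p$. Then the absolute error is defined as $|p - p^*|$ and the relative error is defined as $\dfrac{|p - p^*|}{|p|}$, provided that $p \ne 0$.

Example: Let $p = \pi$ and $p^* = fl(\pi) = 0.31416 \times 10^1$. Find the absolute error and relative error of this approximation.

$$|\pi - 0.31416 \times 10^1| = 0.0000073464102\ldots, \qquad \frac{|\pi - 0.31416 \times 10^1|}{|\pi|} \approx 2.33843 \times 10^{-6}.$$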
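
The chopping and rounding rules and the two error measures can be sketched in Python with the standard `decimal` module. This is only an illustration under stated assumptions: the function names `fl_chop`, `fl_round`, `abs_error`, and `rel_error` are hypothetical, the input is passed as a decimal string so that no binary rounding occurs before the $k$-digit rule is applied, and `ROUND_DOWN` and `ROUND_HALF_UP` are used as stand-ins for chopping and rounding.

```python
import math
from decimal import Context, ROUND_DOWN, ROUND_HALF_UP

def fl_chop(x: str, k: int):
    """k-digit chopping: keep the first k significant digits, drop the rest."""
    return Context(prec=k, rounding=ROUND_DOWN).create_decimal(x)

def fl_round(x: str, k: int):
    """k-digit rounding: round up when digit k+1 is 5 or higher."""
    return Context(prec=k, rounding=ROUND_HALF_UP).create_decimal(x)

def abs_error(p: float, p_star: float) -> float:
    return abs(p - p_star)

def rel_error(p: float, p_star: float) -> float:
    return abs(p - p_star) / abs(p)   # requires p != 0

pi_digits = "3.14159265358979"
print(fl_chop(pi_digits, 5))    # 3.1415
print(fl_round(pi_digits, 5))   # 3.1416

p_star = float(fl_round(pi_digits, 5))
print(abs_error(math.pi, p_star))   # about 7.3464e-06
print(rel_error(math.pi, p_star))   # about 2.3384e-06
```

The errors themselves are computed in ordinary double precision, which is far more accurate than the 5-digit approximation being measured.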

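Example: Let $p_1 = 0.31 \times 10^{-4}$ and $p_1^* = 0.30 \times 10^{-4}$; and let $p_2 = 0.31 \times 10^{4}$ and $p_2^* = 0.30 \times 10^{4}$. Compute the absolute error and relative error for each approximation.

$$|p_1^* - p_1| = |0.30 \times 10^{-4} - 0.31 \times 10^{-4}| = 0.000001, \qquad \frac{|p_1^* - p_1|}{|p_1|} = \frac{0.000001}{0.31 \times 10^{-4}} = 0.0322580645\ldots$$

$$|p_2^* - p_2| = |0.30 \times 10^{4} - 0.31 \times 10^{4}| = 100.0, \qquad \frac{|p_2^* - p_2|}{|p_2|} = \frac{100.0}{0.31 \times 10^{4}} = 0.0322580645\ldots$$

The absolute error of $p_2^*$ is much larger than the one for $p_1^*$, but the relative errors for $p_1^*$ and $p_2^*$ are the same. From this example we see that the absolute error depends on the magnitude of $p$, whereas the relative error does not. So the relative error is usually used to evaluate the closeness of the approximation.

3. Significant Digits:

The number $p^*$ is said to approximate $p$ to $t$ significant digits if $t$ is the largest nonnegative integer for which

$$\frac{|p - p^*|}{|p|} \le 10^{-t}.$$

Example: Consider $p_1$, $p_1^*$, $p_2$, and $p_2^*$ in the previous example. In each case, find $t$. For $i = 1$ and $i = 2$,

$$\frac{|p_i - p_i^*|}{|p_i|} = 0.0322580645\ldots = 3.22580645\ldots \times 10^{-2} \le 10^{-1},$$

and the bound fails for $10^{-2}$, so $t = 1$. Hence $p_i^*$ approximates $p_i$ to 1 significant digit.

Usually, we do not know the exact value of $p$. If we know that the relative error when $p^*$ approximates $p$ is at most $10^{-t}$ and we know the value of $p^*$, we can find the largest interval containing $p$. For $p > 0$,

$$\frac{|p - p^*|}{|p|} \le 10^{-t} \;\Longleftrightarrow\; \left(1 - 10^{-t}\right) p \le p^* \le \left(1 + 10^{-t}\right) p \;\Longleftrightarrow\; \frac{p^*}{1 + 10^{-t}} \le p \le \frac{p^*}{1 - 10^{-t}}.$$

If we know $p$ and that the relative error when $p^*$ approximates $p$ is at most $10^{-t}$, how can we find the largest interval in which $p^*$ must lie? From the same inequality,

$$\left(1 - 10^{-t}\right) p \le p^* \le \left(1 + 10^{-t}\right) p.$$

Example: Find the largest interval containing $p$ if $p^* = 0.31416 \times 10^1$ is used to approximate $p$ to 5 significant digits.

$$\frac{0.31416 \times 10^1}{1 + 10^{-5}} \le p \le \frac{0.31416 \times 10^1}{1 - 10^{-5}}, \quad \text{that is,} \quad 0.31415685843 \times 10^1 \le p \le 0.31416314163 \times 10^1.$$

Example: Find the largest interval in which $p^*$ must lie to approximate $\pi$ to 5 significant digits.

$$\pi\left(1 - 10^{-5}\right) \le p^* \le \pi\left(1 + 10^{-5}\right), \quad \text{that is,} \quad 3.14156123766 \le p^* \le 3.14162406952.$$

The loss of accuracy due to round-off error can often be avoided by a reformulation of the problem.

Example: Solve the equation $x^2 + 62.10x + 1 = 0$ using 4-digit rounding arithmetic. We know that if $b^2 - 4ac > 0$, then the equation $ax^2 + bx + c = 0$ has two real solutions:

$$x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \qquad x_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}.$$

Now compute $x_1$ and $x_2$ step by step in 4-digit rounding arithmetic:

Step  Expression                          Value
1     $b^2$                               $62.10 \times 62.10 = 3856.41 \to 3856$
2     $b^2 - 4ac$                         $3856 - 4 = 3852$
3     $\sqrt{b^2 - 4ac}$                  $\sqrt{3852} = 62.06448\ldots \to 62.06$
4     $-b + \sqrt{b^2 - 4ac}$             $-62.10 + 62.06 = -0.04$
5     $2a$                                $2$
6     $(-b + \sqrt{b^2 - 4ac})/(2a)$      $-0.04 / 2 = -0.02 = x_1^*$
7     $-b - \sqrt{b^2 - 4ac}$             $-62.10 - 62.06 = -124.16 \to -124.2$
8     $(-b - \sqrt{b^2 - 4ac})/(2a)$      $-124.2 / 2 = -62.10 = x_2^*$

True solutions (or solutions computed using $k$-digit rounding arithmetic with $k > 4$): $\sqrt{b^2 - 4ac} = \sqrt{62.10 \times 62.10 - 4} = 62.0677855252\ldots$, so

$$x_1 = \frac{-62.10 + 62.0677855252}{2} = -0.0161072374\ldots, \qquad x_2 = \frac{-62.10 - 62.0677855252}{2} = -62.0838927626\ldots$$

Relative errors:

$$\frac{|x_1 - x_1^*|}{|x_1|} = \frac{|-0.0161072374 - (-0.02)|}{0.0161072374} \approx 0.24168, \qquad \frac{|x_2 - x_2^*|}{|x_2|} = \frac{|-62.0838927626 - (-62.10)|}{62.0838927626} = \frac{0.0161072374}{62.0838927626} \approx 2.594 \times 10^{-4}.$$

Why is the approximation of $x_1$ so poor? Note that Step 4 involves a subtraction of two nearly equal numbers. Check the relative error for this subtraction:

$$\frac{|\text{approximation} - \text{true difference}|}{|\text{true difference}|} = \frac{|-0.04 - (-62.10 + 62.0677855252)|}{|-62.10 + 62.0677855252|} \approx 0.24168.$$

If the subtraction of two numbers of close magnitude can be avoided, then the accuracy of the computation of $x_1$ can be improved. Rewrite the formula for $x_1$:

$$x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \cdot \frac{-b - \sqrt{b^2 - 4ac}}{-b - \sqrt{b^2 - 4ac}} = \frac{b^2 - (b^2 - 4ac)}{2a\left(-b - \sqrt{b^2 - 4ac}\right)} = \frac{-2c}{b + \sqrt{b^2 - 4ac}}.$$

In 4-digit rounding arithmetic this gives

$$x_1^* = \frac{-2(1)}{62.10 + 62.06} = \frac{-2}{124.2} = -0.01610, \qquad \frac{|x_1 - x_1^*|}{|x_1|} = \frac{|-0.0161072374 - (-0.01610)|}{0.0161072374} \approx 4.493 \times 10^{-4}.$$

Example: The nested method. Let $P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, where the $a_i$ are real numbers. How many multiplications and additions are needed to evaluate $P_n(x_0)$? Rewrite $P_n(x)$:

$$P_n(x) = \left(a_n x^{n-1} + a_{n-1} x^{n-2} + \cdots + a_1\right)x + a_0 = \left(\left(a_n x^{n-2} + a_{n-1} x^{n-3} + \cdots + a_2\right)x + a_1\right)x + a_0 = \cdots = \left(\cdots\left(\left(a_n x + a_{n-1}\right)x + a_{n-2}\right)x + \cdots + a_1\right)x + a_0.$$

Each step of the form $(\cdot)\,x + a_{i-1}$ requires 1 multiplication and 1 addition, and there are $n$ of them. So in total $n$ multiplications and $n$ additions are needed. By the way, each multiplication and each addition is counted as one flop (floating-point operation).

Example: For a cubic such as $P_3(x) = 4x^3 + a_2 x^2 + a_1 x + a_0$, the nested form is $\left((4x + a_2)x + a_1\right)x + a_0$, so evaluating $P_3(x_0)$ takes 3 multiplications and 3 additions.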
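
Both reformulations in this section, the rationalized root formula and the nested polynomial form, can be sketched in Python. This is a rough illustration under stated assumptions: 4-digit rounding arithmetic is emulated with a `decimal` context of precision 4 (so every single operation is rounded to 4 significant digits), the function names `small_root_4digit` and `horner` are hypothetical, and the cubic coefficients `[4, 3, 2, 1]` with $x_0 = 2$ are made-up sample values rather than the example from the notes.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

# Emulated 4-digit rounding arithmetic: each operation rounds to 4 digits.
ctx = Context(prec=4, rounding=ROUND_HALF_UP)

def small_root_4digit(a: str, b: str, c: str):
    """Return (naive, rationalized) 4-digit values of the root
    (-b + sqrt(b^2 - 4ac)) / (2a), assuming b > 0 and b^2 > 4ac."""
    a, b, c = (ctx.create_decimal(v) for v in (a, b, c))
    four_ac = ctx.multiply(ctx.multiply(Decimal(4), a), c)
    root = ctx.sqrt(ctx.subtract(ctx.multiply(b, b), four_ac))
    # Naive form: subtracts two nearly equal numbers (-b and root).
    naive = ctx.divide(ctx.add(ctx.minus(b), root), ctx.multiply(Decimal(2), a))
    # Rationalized form -2c / (b + root): no cancellation when b > 0.
    rationalized = ctx.divide(ctx.multiply(Decimal(-2), c), ctx.add(b, root))
    return naive, rationalized

def horner(coeffs, x):
    """Evaluate a_n x^n + ... + a_0 for coeffs = [a_n, ..., a_1, a_0]
    using n multiplications and n additions."""
    result = coeffs[0]
    for a in coeffs[1:]:
        result = result * x + a
    return result

naive, rationalized = small_root_4digit("1", "62.10", "1")
print(naive, rationalized)       # -0.02 -0.01610
print(horner([4, 3, 2, 1], 2))   # 49
```

The rationalized form is only the better choice for the root where $-b$ and the square root have opposite signs; for the other root the original formula adds two numbers of the same sign and is already accurate.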