Round-off Errors and Computer Arithmetic - (1.2)


1. Round-off Errors:

Round-off error is produced when a calculator or computer is used to perform real-number calculations. The arithmetic performed in a machine involves numbers with only a finite number of digits, so the calculated results are only approximations of the actual numbers. Formats for single, double and extended precision, and their standards, are given in the IEEE Standard for Binary Floating-Point Arithmetic, IEEE 754-1985.

Any real number can be represented by four values (sign, significand, base $b$ and exponent) as follows:

$$(-1)^{\text{sign}} \times \text{significand} \times b^{\text{exponent}}$$

The designers of the IEEE 754 binary floating-point standard selected a base of 2 ($b = 2$) because it avoids the sudden jumps in representable values suffered by larger bases. For the 32-bit format they set aside 1 bit for the sign (sign = 0 or 1), 8 bits for the exponent and 23 bits for the significand. This format represents magnitudes in the range from $2^{-128}$ (about $10^{-39}$) to $2^{127}$ (about $10^{38}$). It also provides a precision of about 7 decimal digits, since $2^{24}$ is about 16,000,000 (16,777,216). The 64-bit format sets aside three additional bits for the exponent (11 bits in total) while leaving 52 bits for the significand. This provides a considerably increased range of roughly $10^{-300}$ to $10^{300}$ and a rough doubling of precision to about 15 decimal digits.

Let us look through the following examples to see how a calculator and a computer store and work with real numbers.

Example. Using a TI-89, compute the values of $(1 + 1/n)^n$ for $n = 10^{12}, 10^{13}, 10^{14}, 10^{15}$.

    n          (1 + 1/n)^n
    10^12      2.7182818284...
    10^13      2.7182818284...
    10^14      1.0
    10^15      1.0

It is known that $\lim_{n \to \infty} (1 + 1/n)^n = e = 2.718281828459\ldots$. Why are $(1 + 1/10^{14})^{10^{14}} = (1 + 1/10^{15})^{10^{15}} = 1.0$? What went wrong? Let us check out a few more numbers.

[Table of further TI-89 values for $n$ near $10^{13}$ and $10^{14}$; the entries are not recoverable. Its remarks record that the calculator rounds such $n$ before evaluating, and that one computed value is larger than $e$ and so is no longer accurate.]

The calculator treats $1 + 1/10^{14}$ as 1.0, and then $1.0^{10^{14}} = 1.0$. The precision of a TI-89 is 14 digits: it either truncates the 15th digit if it is less than 5, or adds 1 to the 14th digit if the 15th digit is 5 or higher.

Example. Using MatLab 14, compute the values of $(1 + 1/n)^n$ for $n = 10^{12}, 10^{13}, 10^{14}, 10^{15}$.

    x = [10^12; 10^13; 10^14; 10^15]; y = (1 + 1./x).^x

    y =
       2.71852349603724
       2.71611003408690
       2.71611003408702
       3.03503520654926

    exp(1)
    ans = 2.71828182845905

From these two examples, we know that $e$ is not computed as $(1 + 1/n)^n$ for very large $n$, either in MatLab 14 or on the TI-89. How is the value of $e$ computed?

Example. Consider a PC that implements a 64-bit (binary digit) representation for a real number:

    1 bit    11 bits                 52 bits
    s        e_1 e_2 ... e_11        m_1 m_2 ... m_52
    sign     exponent                mantissa (significand)
    s        c (characteristic)      f (fraction)

$$c = e_1 2^{10} + e_2 2^{9} + \cdots + e_{10} 2^{1} + e_{11}, \qquad f = m_1 \left(\tfrac{1}{2}\right) + m_2 \left(\tfrac{1}{2}\right)^2 + \cdots + m_{52} \left(\tfrac{1}{2}\right)^{52}$$

The system gives a floating-point number of the form $(-1)^s \, 2^{c-1023} (1 + f)$. Note that

$$0 \le c \le 2^{10} + 2^{9} + \cdots + 2 + 1 = 2^{11} - 1 = 2047, \qquad 0 \le f \le \tfrac{1}{2} + \left(\tfrac{1}{2}\right)^2 + \cdots + \left(\tfrac{1}{2}\right)^{52} = 1 - \left(\tfrac{1}{2}\right)^{52}.$$

Let $c_{\max} = 2047$ and $f_{\max} = 1 - (1/2)^{52}$. Since all machine numbers are of the form $(-1)^s \, 2^{c-1023} (1 + f)$, the minimum positive number is $2^{-1023} \approx 1.1125369292536 \times 10^{-308}$ and the maximum number (in magnitude) is

$$2^{c_{\max} - 1023} (1 + f_{\max}) = 2^{2047 - 1023} \left(2 - 2^{-52}\right) = 2^{1024} \left(2 - 2^{-52}\right) \approx 3.5953862697246 \times 10^{308}.$$

(The IEEE 754 standard itself reserves $c = 0$ and $c = 2047$ for special values, so its normalized range is narrower: from $2^{-1022}$ to $2^{1023}(2 - 2^{-52}) \approx 1.797693 \times 10^{308}$.)

Any number $x$ occurring in a computation with $|x|$ below this minimum results in underflow and is represented by 0; any $x$ with $|x|$ above this maximum results in overflow, and the computation is stopped.

Example. Consider the machine number $x$: 0 10000000110 011010000...0 (52 mantissa digits). Find the interval $I$ which contains all real numbers whose machine number is $x$. We need to find a lower bound $a$ and an upper bound $b$ of $x$; then $I = (a, b)$.
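The decoding rule $(-1)^s \, 2^{c-1023} (1 + f)$ can be checked mechanically. The following is a minimal Python sketch, not part of the original notes: the function name `decode64` and the string-of-bits input are assumptions made for illustration. It applies the rule to the bit pattern of this example and makes no attempt to handle the exponent values that IEEE 754 reserves for special cases.

```python
def decode64(bits: str):
    """Decode a 64-character string of '0'/'1' with the rule
    (-1)^s * 2^(c - 1023) * (1 + f) used in the notes.
    Hypothetical helper; reserved exponents (c = 0, c = 2047) are not treated."""
    assert len(bits) == 64 and set(bits) <= {"0", "1"}
    s = int(bits[0])                 # sign bit
    c = int(bits[1:12], 2)           # characteristic: 11 exponent bits
    f = int(bits[12:], 2) / 2**52    # fraction: 52 mantissa bits, scaled into [0, 1)
    value = (-1) ** s * 2.0 ** (c - 1023) * (1 + f)
    return s, c, f, value


# Bit pattern from the example: sign 0, exponent 10000000110, mantissa 01101 followed by zeros
bits = "0" + "10000000110" + "01101" + "0" * 47
s, c, f, x = decode64(bits)
print(s, c, f, x)  # 0 1030 0.40625 180.0
```

Because the pattern has $c = 1030$ and $f = 0.40625$, the decoded value is $2^7 \times 1.40625 = 180.0$, which agrees with the hand calculation.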

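$$c = 2^{10} + 2^{2} + 2^{1} = 1030, \qquad f = \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^3 + \left(\tfrac{1}{2}\right)^5 = 0.40625,$$

$$x = (-1)^0 \, 2^{1030 - 1023} (1 + 0.40625) = 180.0.$$

The next machine number smaller than $x$ is 0 10000000110 011001111...1 (52 mantissa digits), for which

$$f_a = \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^3 + \left(\tfrac{1}{2}\right)^6 + \left(\tfrac{1}{2}\right)^7 + \cdots + \left(\tfrac{1}{2}\right)^{52} = f - \left(\tfrac{1}{2}\right)^{52},$$

$$a = 2^{1030 - 1023} (1 + f_a) = x - 2^{7} \cdot 2^{-52} = 180 - 2^{-45} \approx 180 - 2.8421709430404 \times 10^{-14}.$$

The next machine number larger than $x$ is 0 10000000110 011010000...01, for which $f_b = f + (1/2)^{52}$ and

$$b = 2^{1030 - 1023} (1 + f_b) = x + 2^{-45} \approx 180 + 2.8421709430404 \times 10^{-14}.$$

Hence every real number stored as $x$ lies in

$$(a, b) = \left(180 - 2.8421709430404 \times 10^{-14},\; 180 + 2.8421709430404 \times 10^{-14}\right).$$

Which of these reals are actually stored as $x$ depends on whether the machine chops or rounds.

Example. Given $x = 180$, find the binary representation of $x$.

We know $s = 0$. Since $2^7 = 128 \le 180 < 2^8 = 256$, we have $180 = 2^7 + 52$. Using the same idea, $52 = 2^5 + 20$, $20 = 2^4 + 4$ and $4 = 2^2$. So

$$180 = 2^7 + 2^5 + 2^4 + 2^2 = 2^7 \left(1 + 2^{-2} + 2^{-3} + 2^{-5}\right), \qquad f = 2^{-2} + 2^{-3} + 2^{-5}.$$

Hence $c - 1023 = 7$, so $c = 1030 = 2^{10} + 6 = 2^{10} + 2^{2} + 2^{1}$, and $x$: 0 10000000110 011010000...0.

2. Chopping and Rounding Methods:

Now let the machine numbers be represented in the normalized decimal floating-point form as they are displayed on a screen, say as $k$-digit decimal machine numbers:

$$\pm 0.d_1 d_2 \ldots d_k \times 10^{n}, \quad \text{where } 1 \le d_1 \le 9 \text{ and } 0 \le d_i \le 9 \text{ for } i = 2, 3, \ldots, k.$$

Let $fl(x)$ be the floating-point form of $x$. Remember that $fl(x)$ is a machine approximation of the true value of $x$. Let $x = \pm 0.d_1 d_2 \ldots d_k d_{k+1} d_{k+2} \ldots \times 10^{n}$.

Chopping method: $fl(x) = \pm 0.d_1 d_2 \ldots d_k \times 10^{n}$.

Rounding method:

$$fl(x) = \begin{cases} \pm 0.d_1 d_2 \ldots d_k \times 10^{n} & \text{if } d_{k+1} < 5, \\ \pm \left(0.d_1 d_2 \ldots d_k + 10^{-k}\right) \times 10^{n} & \text{if } d_{k+1} \ge 5. \end{cases}$$

Example. Give the floating-point form of $\pi$ using (a) 5-digit chopping and (b) 5-digit rounding.

$\pi = 3.14159265358979\ldots = 0.314159265358979\ldots \times 10^{1}$

(a) $fl(\pi) = 0.31415 \times 10^{1}$
(b) $fl(\pi) = 0.31416 \times 10^{1}$

3. Absolute Error and Relative Error:

Let $p^*$ be an approximation to $p$. Then the absolute error is defined as $|p - p^*|$ and the relative error is defined as $\dfrac{|p - p^*|}{|p|}$, provided that $p \ne 0$.

Example. Let $p = \pi$ and $p^* = fl(\pi) = 0.31416 \times 10^{1}$. Find the absolute error and the relative error of this approximation.

$$|p - p^*| = \left|\pi - 0.31416 \times 10^{1}\right| \approx 0.0000073464102, \qquad \frac{|p - p^*|}{|p|} \approx 2.3384350 \times 10^{-6}.$$

Example. Let $p_1 = 0.31 \times 10^{-4}$ and $p_1^* = 0.30 \times 10^{-4}$; and let $p_2 = 0.31 \times 10^{4}$ and $p_2^* = 0.30 \times 10^{4}$. Compute the absolute error and relative error for each approximation.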
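The chopping and rounding rules and the two error measures defined above can be tried out numerically. The following is a minimal Python sketch using the standard `decimal` module; the names `fl`, `abs_err` and `rel_err` are illustrative choices, not from the notes, and inputs are passed as decimal strings so that their digits are taken exactly.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def fl(x, k, mode="round"):
    """Return the k-digit decimal floating-point form of x.
    mode="chop" drops the digits after the k-th significant digit;
    mode="round" adds one unit to the k-th digit when the next digit is >= 5."""
    d = Decimal(x)
    if d == 0:
        return d
    # adjusted() is the power of ten of the leading digit, so the k-th
    # significant digit sits at 10**(adjusted - k + 1).
    quantum = Decimal((0, (1,), d.adjusted() - k + 1))
    rounding = ROUND_DOWN if mode == "chop" else ROUND_HALF_UP
    return d.quantize(quantum, rounding=rounding)


def abs_err(p, p_star):
    """Absolute error |p - p*|."""
    return abs(p - p_star)


def rel_err(p, p_star):
    """Relative error |p - p*| / |p|, defined only for p != 0."""
    return abs(p - p_star) / abs(p)


pi = Decimal("3.14159265358979")
print(fl(pi, 5, "chop"))               # 3.1415  (that is, 0.31415 x 10^1)
print(fl(pi, 5, "round"))              # 3.1416  (that is, 0.31416 x 10^1)
print(abs_err(pi, fl(pi, 5, "round"))) # 0.00000734641021
print(rel_err(pi, fl(pi, 5, "round")))
```

Working in `Decimal` rather than binary floats keeps the decimal digits exact, so the sketch follows the digit-by-digit rules without interference from the machine's own binary round-off.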

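$$|p_1 - p_1^*| = \left|0.31 \times 10^{-4} - 0.30 \times 10^{-4}\right| = 0.000001, \qquad \frac{|p_1 - p_1^*|}{|p_1|} = \frac{0.000001}{0.31 \times 10^{-4}} = 0.032258064516129$$

$$|p_2 - p_2^*| = \left|0.31 \times 10^{4} - 0.30 \times 10^{4}\right| = 100.0, \qquad \frac{|p_2 - p_2^*|}{|p_2|} = \frac{100.0}{0.31 \times 10^{4}} = 0.032258064516129$$

The absolute error of $p_2^*$ is much larger than that of $p_1^*$, yet the relative errors of $p_1^*$ and $p_2^*$ are the same. From this example we see that the absolute error depends on the magnitude of $p$, whereas the relative error does not. So the relative error is usually used to evaluate the closeness of an approximation.

4. Significant Digits:

The number $p^*$ is said to approximate $p$ to $t$ significant digits if $t$ is the largest nonnegative integer for which

$$\frac{|p - p^*|}{|p|} \le 5 \times 10^{-t}.$$

Example. Consider $p_1$, $p_1^*$, $p_2$ and $p_2^*$ in the previous example. In each case, find $t$.

For $i = 1, 2$,

$$\frac{|p_i - p_i^*|}{|p_i|} = 0.032258064516129 = 3.2258064516129 \times 10^{-2} \le 5 \times 10^{-2},$$

while the same value exceeds $5 \times 10^{-3}$. Hence $p_i^*$ approximates $p_i$ to 2 significant digits.

Usually, we do not know the exact value of $p$. If we know that the relative error when $p^*$ approximates $p$ is at most $10^{-t}$, and we know the value of $p^*$, we can find the largest interval containing $p$. Take $p > 0$. Since $\dfrac{|p - p^*|}{|p|} \le 10^{-t}$,

$$-10^{-t} p \le p - p^* \le 10^{-t} p \;\Longrightarrow\; \left(1 - 10^{-t}\right) p \le p^* \le \left(1 + 10^{-t}\right) p \;\Longrightarrow\; \frac{p^*}{1 + 10^{-t}} \le p \le \frac{p^*}{1 - 10^{-t}}.$$

If instead we know $p$ and that the relative error when $p^*$ approximates $p$ is at most $10^{-t}$, how can we find the largest interval in which $p^*$ must lie? The middle inequality above gives it directly:

$$p \left(1 - 10^{-t}\right) \le p^* \le p \left(1 + 10^{-t}\right).$$

Note that the bound $10^{-t}$ used here is stricter than the $5 \times 10^{-t}$ allowed by the definition of significant digits, so the intervals in the next two examples are narrower than that definition alone would require.

Example. Find the largest interval containing $\pi$ if $0.31416 \times 10$ is used to approximate $\pi$ to 5 significant digits (relative error at most $10^{-5}$).

$$\frac{0.31416 \times 10}{1 + 10^{-5}} \le \pi \le \frac{0.31416 \times 10}{1 - 10^{-5}}, \quad \text{that is,} \quad 0.314156858431416 \times 10 \le \pi \le 0.314163141631416 \times 10.$$

Example. Find the largest interval in which $p^*$ must lie to approximate $\pi$ to 5 significant digits (relative error at most $10^{-5}$).

$$\pi \left(1 - 10^{-5}\right) \le p^* \le \pi \left(1 + 10^{-5}\right), \quad \text{that is,} \quad 3.1415612376633 \le p^* \le 3.1416240695163.$$

The loss of accuracy due to round-off error can often be avoided by a reformulation of the problem.

Example. Solve the equation $x^2 + 62.10x + 1 = 0$ using 4-digit rounding arithmetic.

We know that if $b^2 - 4ac > 0$, then the equation $ax^2 + bx + c = 0$ has two real solutions, and they are

$$x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \qquad x_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}.$$

Now compute $x_1$ and $x_2$ step by step in 4-digit rounding arithmetic:

    Step   Expression                 Value
    1      b^2                        62.10 x 62.10 = 3856.41 -> 3856
    2      b^2 - 4ac                  3856 - 4(1)(1) = 3852
    3      sqrt(b^2 - 4ac)            sqrt(3852) = 62.06448... -> 62.06
    4      -b + sqrt(b^2 - 4ac)       -62.10 + 62.06 = -0.04
    5      2a                         2(1) = 2
    6      x_1                        -0.04 / 2 = -0.02
    7      -b - sqrt(b^2 - 4ac)       -62.10 - 62.06 = -124.16 -> -124.2
    8      x_2                        -124.2 / 2 = -62.10

True solutions (or solutions computed using $k$-digit rounding arithmetic where $k > 4$):

$$\sqrt{b^2 - 4ac} = \sqrt{62.10 \times 62.10 - 4(1)(1)} \approx 62.06778552,$$

$$x_1 = \frac{-62.10 + 62.06778552}{2} \approx -0.01610724, \qquad x_2 = \frac{-62.10 - 62.06778552}{2} \approx -62.08389276.$$

Relative errors:

$$\frac{|x_1 - x_1^*|}{|x_1|} = \frac{|-0.01610724 - (-0.02)|}{0.01610724} \approx 0.2417, \qquad \frac{|x_2 - x_2^*|}{|x_2|} = \frac{|-62.08389276 - (-62.10)|}{62.08389276} \approx 2.594 \times 10^{-4}.$$

Why is the approximation of $x_1$ so poor? Note that Step 4 involves a subtraction of two nearly equal numbers. Check the relative error for this subtraction:

$$\frac{|\text{approximation} - \text{true difference}|}{|\text{true difference}|} = \frac{|-0.04 - (-62.10 + 62.06778552)|}{|-62.10 + 62.06778552|} \approx 0.2417.$$

If the subtraction of two numbers of nearly equal magnitude can be avoided, then the accuracy of the computation of $x_1$ can be improved. Rewrite the formula for $x_1$:

$$x_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \cdot \frac{-b - \sqrt{b^2 - 4ac}}{-b - \sqrt{b^2 - 4ac}} = \frac{b^2 - \left(b^2 - 4ac\right)}{2a \left(-b - \sqrt{b^2 - 4ac}\right)} = \frac{-2c}{b + \sqrt{b^2 - 4ac}}.$$

In 4-digit rounding arithmetic, $b + \sqrt{b^2 - 4ac} = 62.10 + 62.06 = 124.16 \to 124.2$, so

$$x_1^* = \frac{-2}{124.2} = -0.0161030\ldots \to -0.01610, \qquad \frac{|x_1 - x_1^*|}{|x_1|} = \frac{|-0.01610724 - (-0.01610)|}{0.01610724} \approx 4.5 \times 10^{-4}.$$

Example. The nested method: Let $P_n(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, where the $a_i$ are real numbers. How many multiplications and additions are needed to evaluate $P_n(x_0)$? Rewrite $P_n(x)$:

$$\begin{aligned} P_n(x) &= \left(a_n x^{n-1} + a_{n-1} x^{n-2} + \cdots + a_2 x + a_1\right) x + a_0 \\ &= \left(\left(a_n x^{n-2} + a_{n-1} x^{n-3} + \cdots + a_2\right) x + a_1\right) x + a_0 \\ &\;\;\vdots \\ &= \left(\cdots\left(\left(a_n x + a_{n-1}\right) x + a_{n-2}\right) x + \cdots + a_1\right) x + a_0 \end{aligned}$$

Each step of the form $(\,\cdot\,) x + a_{i-1}$ requires 1 multiplication and 1 addition, and there are $n$ of them. So in total $n$ multiplications and $n$ additions are needed. By the way, 1 multiplication together with 1 addition is counted as 1 flop (floating-point operation).

Example. $P(x) = 4x^3 + \ldots$; evaluate $P$ at a given point by the nested method. [The remaining coefficients, the evaluation point and the intermediate values are not recoverable from the transcription.]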
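The nested method translates directly into a short loop. The following is a minimal Python sketch under stated assumptions: the function name `horner`, the coefficient ordering (highest degree first) and the sample cubic are illustrative choices and are not taken from the notes.

```python
def horner(coeffs, x0):
    """Evaluate a_n*x^n + ... + a_1*x + a_0 at x0 by the nested method.
    coeffs lists the coefficients from a_n down to a_0 (hypothetical convention).
    Returns the value and the number of flops (one multiply plus one add each)."""
    result = coeffs[0]
    flops = 0
    for a in coeffs[1:]:
        result = result * x0 + a  # one multiplication and one addition
        flops += 1
    return result, flops


# Hypothetical cubic 2x^3 - 6x^2 + 2x - 1 evaluated at x0 = 3
value, flops = horner([2, -6, 2, -1], 3)
print(value, flops)  # 5 3
```

For a polynomial of degree $n$ the loop body runs $n$ times, which matches the count of $n$ multiplications and $n$ additions derived above.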