ECS 231 Computer Arithmetic 1 / 27

Size: px
Start display at page:

Download "ECS 231 Computer Arithmetic 1 / 27"

Transcription

1 ECS 231 Computer Arithmetic 1 / 27

2 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 2 / 27

3 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 3 / 27

4 Floating-point numbers and representations 1. Floating-point (FP) representation of numbers (scientific notation): sign significand base exponent 2. FP representation of a nonzero binary number: x = ± b 0.b 1 b 2 b p 1 2 E. (1) It is normalized, i.e., b0 = 1 (the hidden bit) Precision (= p) is the number of bits in the significand (mantissa) (including the hidden bit). Machine epsilon ɛ = 2 (p 1), the gap between the number 1 and the smallest FP number that is greater than Special numbers: 0, 0,,, NaN(= Not a Number ). 4 / 27

5 IEEE standard All computers designed since 1985 use the IEEE Standard for Binary Floating-Point Arithmetic (ANSI/IEEE Std ), represent each number as a binary number and use binary arithmetic. Essentials of the IEEE standard: consistent representation of FP numbers correctly rounded FP operations (using various rounding modes) consistent treatment of exceptional situation such as division by zero. 5 / 27

6 IEEE single precision format Single format takes 32 bits (=4 bytes) long: 8 23 s E f sign exponent It represents the number binary point ( 1) s (1.f) 2 E 127 fraction The leading 1 in the fraction need not be stored explicitly since it is always 1 (hidden bit) E min = ( ) 2 = (1) 10, E max = ( ) 2 = (254) 10. E 127 in exponent avoids the need for storage of a sign bit. The range of positive normalized numbers: N min = Emin 127 = N max = Emax Special repsentations for 0, ± and NaN. 6 / 27

7 IEEE double pecision format Double format takes 64 bits (= 8 bytes) long: s E f sign exponent It represents the numer binary point ( 1) s (1.f) 2 E 1023 fraction The range of positive normalized numbers is from N min = N max = Special repsentations for 0, ± and NaN. 7 / 27

8 Summary I Precision and machine epsilon of IEEE single, double and extended formats Format Precision p Machine epsilon ɛ = 2 p 1 single 24 ɛ = double 53 ɛ = extended 64 ɛ = Extra: Higham s lecture for additional formats, such as half (16 bits) and quadruple (128 bits). 8 / 27

9 Rounding modes Let a positive real number x be in the normalized range, i.e., N min x N max, and write in the normalized form x = (1.b 1 b 2 b p 1 b p b p+1...) 2 E, Then the closest fp number less than or equal to x is i.e., x is obtained by truncating. x = 1.b 1 b 2 b p 1 2 E The next fp number bigger than x (also the next one that bigger than x) is x + = ((1.b 1 b 2 b p 1 ) + ( )) 2 E If x is negative, the situtation is reversed. 9 / 27

10 Correctly rounding modes: round down: round(x) = x round up: round(x) = x + round towards zero: round(x) = x if x 0 round(x) = x + if x 0 round to nearest: round(x) = x or x + whichever is nearer to x. 1 1 except that if x > N max, round(x) =, and if x < N max, round(x) =. In the case of tie, i.e., x and x + are the same distance from x, the one with its least significant bit equal to zero is chosen. 10 / 27

11 Rounding error When the round to nearest (IEEE default rounding mode) is in effect, Therefore, we have relerr = relerr(x) = round(x) x x 1 2 ɛ = , single , double. 11 / 27

12 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 12 / 27

13 Floating-point arithmetic IEEE rules for correctly rounded fp operations: if x and y are correctly rounded fp numbers, then fl(x + y) = round(x + y) = (x + y)(1 + δ) fl(x y) = round(x y) = (x y)(1 + δ) fl(x y) = round(x y) = (x y)(1 + δ) fl(x/y) = round(x/y) = (x/y)(1 + δ) where δ 1 2 ɛ IEEE standard also requires that correctly rounded remainder and square root operations be provided. 13 / 27

14 Floating-point arithmetic, cont d IEEE standard response to exceptions Event Example Set result to Invalid operation 0/0, 0 NaN Division by zero Finite nonzero/0 ± Overflow x > N max ± or ±N max underflow x 0, x < N min ±0, ±N min or subnormal Inexact whenever fl(x y) x y correctly rounded value 14 / 27

15 Floating-point arithmetic error Let x and ŷ be the fp numbers and that x = x(1 + τ 1 ) and ŷ = y(1 + τ 2 ), for τ i τ 1 where τ i could be the relative errors in the process of collecting/getting the data from the original source or the previous operations, and Question: how do the four basic arithmetic operations behave? 15 / 27

16 Floating-point arithmetic error: +, Addition and subtraction: fl( x + ŷ) = ( x + ŷ)(1 + δ) = x(1 + τ 1)(1 + δ) + y(1 + τ 2)(1 + δ) = x + y + x(τ 1 + δ + O(τɛ)) + y(τ 2 + δ + O(τɛ)) ( = (x + y) 1 + x (τ1 + δ + O(τɛ)) + y ) (τ2 + δ + O(τɛ)) x + y x + y (x + y)(1 + δ), where δ 1 2 ɛ, δ x + y x + y (τ + 12 ɛ + O(τɛ) ). 16 / 27

17 Floating-point arithmetic error: +, Three possible cases: 1. If x and y have the same sign, i.e., xy > 0, then x + y = x + y ; this implies δ τ + 1 ɛ + O(τɛ) 1. 2 Thus fl( x + ŷ) approximates x + y well. 2. If x y x + y 0, then ( x + y )/ x + y 1; this implies that δ could be nearly or much bigger than 1. This is so called catastrophic cancellation, it causes relative errors or uncertainties already presented in x and ŷ to be magnified. 3. In general, if ( x + y )/ x + y is not too big, fl( x + ŷ) provides a good approximation to x + y. 17 / 27

18 Catastrophic cancellation: example 1 Computing x + 1 x straightforward causes substantial loss of significant digits for large n x fl( x + 1) fl( x) fl(fl( x + 1) fl( x) 1.00e e e e e e e e e e e e e e e e e e e e e e e e e e e e+00 Catastrophic cancellation can sometimes be avoided if a formula is properly reformulated. In the present case, one can compute x + 1 x almost to full precision by using the equality x + 1 x = 1 x x. 18 / 27

19 Catastrophic cancellation: example 2 Consider the function Note that f(x) = 1 cos x x 2 0 f(x) < 1/2 for all x 0 Let x = , then the computed is completely wrong! fl(f(x)) = Alternatively, the function can be re-written as ( ) 2 sin(x/2) f(x) =. x/2 Consequently, for x = , then the computed fl(f(x)) = < 1/2 is fine! 19 / 27

20 Floating-point arithmetic error:, / Multiplication and Division: fl( x ŷ) = ( x ŷ)(1 + δ) = xy(1 + τ 1 )(1 + τ 2 )(1 + δ) xy(1 + δ ), fl( x/ŷ) = ( x/ŷ)(1 + δ) = (x/y)(1 + τ 1 )(1 + τ 2 ) 1 (1 + δ) xy(1 + δ ), where Thus δ = τ 1 + τ 2 + δ + O(τɛ), δ 2τ ɛ + O(τɛ), δ = τ 1 τ 2 + δ + O(τɛ). δ 2τ ɛ + O(τɛ) we can conclude that multiplication and division are very well-behaved! 20 / 27

21 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 21 / 27

22 Floating-point error analysis Illustrate the basic idea of error analysis through a simple example. Consider the inner product: x T y = x 1 y 1 + x 2 y 2 + x 3 y 3, assuming already x i s and y j s are fp numbers. fl(x T y) is computed in the following order: fl(x T y) = fl ( fl(fl(x 1 y 1 ) + fl(x 2 y 2 )) + fl(x 3 y 3 ) ). By the fp arithmetic model, we have fl(x T y) = fl ( fl(x 1 y 1 (1 + ɛ 1 ) + x 2 y 2 (1 + ɛ 2 )) + x 3 y 3 (1 + ɛ 3 ) ) = fl ( (x 1 y 1 (1 + ɛ 1 ) + x 2 y 2 (1 + ɛ 2 ))(1 + δ 1 ) + x 3 y 3 (1 + ɛ 3 ) ) = ( (x 1 y 1 (1 + ɛ 1 ) + x 2 y 2 (1 + ɛ 2 ))(1 + δ 1 ) + x 3 y 3 (1 + ɛ 3 ) ) (1 + δ 2 ) = x 1 y 1 (1 + ɛ 1 )(1 + δ 1 )(1 + δ 2 ) + x 2 y 2 (1 + ɛ 2 )(1 + δ 1 )(1 + δ 2 ) +x 3 y 3 (1 + ɛ 3 )(1 + δ 2 ), where ɛ i 1 2 ɛ and δ j 1 2 ɛ. 22 / 27

23 Floating-point error analysis, cont d There are two ways to interpret the errors in the computed fl(x T y): Forward error analysis Backward error analysis 23 / 27

24 Forward error analysis We have where fl(x T y) = x T y + E, It implies that E =x 1 y 1 (ɛ 1 + δ 1 + δ 2 ) + x 2 y 2 (ɛ 2 + δ 1 + δ 2 ) + x 3 y 3 (ɛ 3 + δ 2 ) + O(ɛ 2 ). E 1 2 ɛ(3 x 1y x 2 y x 3 y 3 ) + O(ɛ 2 ) 3 2 ɛ x T y + O(ɛ 2 ). This bound on E tells the worst-case difference between the exact x T y and its computed value. 24 / 27

25 Backward error analysis We can also write where fl(x T y) = x T ŷ = (x + x) T (y + y), x 1 = x 1 (1 + ɛ 1 ), ŷ 1 = y 1 (1 + δ 1 )(1 + δ 2 ) y 1 (1 + δ 1 ), x 2 = x 2 (1 + ɛ 2 ), ŷ 2 = y 2 (1 + δ 1 )(1 + δ 2 ) y 2 (1 + δ 2 ), x 3 = x 3 (1 + ɛ 3 ), ŷ 3 = y 3 (1 + δ 2 ) y 3 (1 + δ 3 ). and δ 1 = δ 2 ɛ + O(ɛ 2 ) and δ ɛ. This says the computed value fl(x T y) is the exact inner product of a slightly perturbed x and ŷ. 25 / 27

26 Outline 1 Floating-point numbers and representations 2 Floating-point arithmetic 3 Floating-point error analysis 4 Further reading 26 / 27

27 Further reading 1. D. Goldberg. What every computer scientist should know about floating-point arithmetic. ACM Computing Surveys, 18(1):5 48, Rencet lecture by N. Higham on the latest development on low precision and multiprecision arithmetic Discussion of numerical disasters: T. Huckle, Collection of software bugs huckle/bugse.html Bits and Bugs: A Scientific and Historical Review of Software Failures in Computational Science by T. Huckle and T. Neckel, SIAM, March / 27

1 Floating point arithmetic

1 Floating point arithmetic Introduction to Floating Point Arithmetic Floating point arithmetic Floating point representation (scientific notation) of numbers, for example, takes the following form.346 0 sign fraction base exponent

More information

Lecture 7. Floating point arithmetic and stability

Lecture 7. Floating point arithmetic and stability Lecture 7 Floating point arithmetic and stability 2.5 Machine representation of numbers Scientific notation: 23 }{{} }{{} } 3.14159265 {{} }{{} 10 sign mantissa base exponent (significand) s m β e A floating

More information

Notes for Chapter 1 of. Scientific Computing with Case Studies

Notes for Chapter 1 of. Scientific Computing with Case Studies Notes for Chapter 1 of Scientific Computing with Case Studies Dianne P. O Leary SIAM Press, 2008 Mathematical modeling Computer arithmetic Errors 1999-2008 Dianne P. O'Leary 1 Arithmetic and Error What

More information

Arithmetic and Error. How does error arise? How does error arise? Notes for Part 1 of CMSC 460

Arithmetic and Error. How does error arise? How does error arise? Notes for Part 1 of CMSC 460 Notes for Part 1 of CMSC 460 Dianne P. O Leary Preliminaries: Mathematical modeling Computer arithmetic Errors 1999-2006 Dianne P. O'Leary 1 Arithmetic and Error What we need to know about error: -- how

More information

Elements of Floating-point Arithmetic

Elements of Floating-point Arithmetic Elements of Floating-point Arithmetic Sanzheng Qiao Department of Computing and Software McMaster University July, 2012 Outline 1 Floating-point Numbers Representations IEEE Floating-point Standards Underflow

More information

Elements of Floating-point Arithmetic

Elements of Floating-point Arithmetic Elements of Floating-point Arithmetic Sanzheng Qiao Department of Computing and Software McMaster University July, 2012 Outline 1 Floating-point Numbers Representations IEEE Floating-point Standards Underflow

More information

Binary floating point

Binary floating point Binary floating point Notes for 2017-02-03 Why do we study conditioning of problems? One reason is that we may have input data contaminated by noise, resulting in a bad solution even if the intermediate

More information

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 5. Ax = b.

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 5. Ax = b. CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 5 GENE H GOLUB Suppose we want to solve We actually have an approximation ξ such that 1 Perturbation Theory Ax = b x = ξ + e The question is, how

More information

Introduction to Scientific Computing Languages

Introduction to Scientific Computing Languages 1 / 21 Introduction to Scientific Computing Languages Prof. Paolo Bientinesi pauldj@aices.rwth-aachen.de Numerical Representation 2 / 21 Numbers 123 = (first 40 digits) 29 4.241379310344827586206896551724137931034...

More information

Lecture Notes 7, Math/Comp 128, Math 250

Lecture Notes 7, Math/Comp 128, Math 250 Lecture Notes 7, Math/Comp 128, Math 250 Misha Kilmer Tufts University October 23, 2005 Floating Point Arithmetic We talked last time about how the computer represents floating point numbers. In a floating

More information

Introduction to Scientific Computing Languages

Introduction to Scientific Computing Languages 1 / 19 Introduction to Scientific Computing Languages Prof. Paolo Bientinesi pauldj@aices.rwth-aachen.de Numerical Representation 2 / 19 Numbers 123 = (first 40 digits) 29 4.241379310344827586206896551724137931034...

More information

Errors. Intensive Computation. Annalisa Massini 2017/2018

Errors. Intensive Computation. Annalisa Massini 2017/2018 Errors Intensive Computation Annalisa Massini 2017/2018 Intensive Computation - 2017/2018 2 References Scientific Computing: An Introductory Survey - Chapter 1 M.T. Heath http://heath.cs.illinois.edu/scicomp/notes/index.html

More information

Chapter 1 Error Analysis

Chapter 1 Error Analysis Chapter 1 Error Analysis Several sources of errors are important for numerical data processing: Experimental uncertainty: Input data from an experiment have a limited precision. Instead of the vector of

More information

ACM 106a: Lecture 1 Agenda

ACM 106a: Lecture 1 Agenda 1 ACM 106a: Lecture 1 Agenda Introduction to numerical linear algebra Common problems First examples Inexact computation What is this course about? 2 Typical numerical linear algebra problems Systems of

More information

PowerPoints organized by Dr. Michael R. Gustafson II, Duke University

PowerPoints organized by Dr. Michael R. Gustafson II, Duke University Part 1 Chapter 4 Roundoff and Truncation Errors PowerPoints organized by Dr. Michael R. Gustafson II, Duke University All images copyright The McGraw-Hill Companies, Inc. Permission required for reproduction

More information

FLOATING POINT ARITHMETHIC - ERROR ANALYSIS

FLOATING POINT ARITHMETHIC - ERROR ANALYSIS FLOATING POINT ARITHMETHIC - ERROR ANALYSIS Brief review of floating point arithmetic Model of floating point arithmetic Notation, backward and forward errors 3-1 Roundoff errors and floating-point arithmetic

More information

Jim Lambers MAT 610 Summer Session Lecture 2 Notes

Jim Lambers MAT 610 Summer Session Lecture 2 Notes Jim Lambers MAT 610 Summer Session 2009-10 Lecture 2 Notes These notes correspond to Sections 2.2-2.4 in the text. Vector Norms Given vectors x and y of length one, which are simply scalars x and y, the

More information

Floating Point Number Systems. Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le

Floating Point Number Systems. Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le Floating Point Number Systems Simon Fraser University Surrey Campus MACM 316 Spring 2005 Instructor: Ha Le 1 Overview Real number system Examples Absolute and relative errors Floating point numbers Roundoff

More information

Mathematical preliminaries and error analysis

Mathematical preliminaries and error analysis Mathematical preliminaries and error analysis Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan September 12, 2015 Outline 1 Round-off errors and computer arithmetic

More information

Notes on floating point number, numerical computations and pitfalls

Notes on floating point number, numerical computations and pitfalls Notes on floating point number, numerical computations and pitfalls November 6, 212 1 Floating point numbers An n-digit floating point number in base β has the form x = ±(.d 1 d 2 d n ) β β e where.d 1

More information

FLOATING POINT ARITHMETHIC - ERROR ANALYSIS

FLOATING POINT ARITHMETHIC - ERROR ANALYSIS FLOATING POINT ARITHMETHIC - ERROR ANALYSIS Brief review of floating point arithmetic Model of floating point arithmetic Notation, backward and forward errors Roundoff errors and floating-point arithmetic

More information

Tu: 9/3/13 Math 471, Fall 2013, Section 001 Lecture 1

Tu: 9/3/13 Math 471, Fall 2013, Section 001 Lecture 1 Tu: 9/3/13 Math 71, Fall 2013, Section 001 Lecture 1 1 Course intro Notes : Take attendance. Instructor introduction. Handout : Course description. Note the exam days (and don t be absent). Bookmark the

More information

Floating-point Computation

Floating-point Computation Chapter 2 Floating-point Computation 21 Positional Number System An integer N in a number system of base (or radix) β may be written as N = a n β n + a n 1 β n 1 + + a 1 β + a 0 = P n (β) where a i are

More information

1 ERROR ANALYSIS IN COMPUTATION

1 ERROR ANALYSIS IN COMPUTATION 1 ERROR ANALYSIS IN COMPUTATION 1.2 Round-Off Errors & Computer Arithmetic (a) Computer Representation of Numbers Two types: integer mode (not used in MATLAB) floating-point mode x R ˆx F(β, t, l, u),

More information

Chapter 1: Introduction and mathematical preliminaries

Chapter 1: Introduction and mathematical preliminaries Chapter 1: Introduction and mathematical preliminaries Evy Kersalé September 26, 2011 Motivation Most of the mathematical problems you have encountered so far can be solved analytically. However, in real-life,

More information

Chapter 1 Mathematical Preliminaries and Error Analysis

Chapter 1 Mathematical Preliminaries and Error Analysis Chapter 1 Mathematical Preliminaries and Error Analysis Per-Olof Persson persson@berkeley.edu Department of Mathematics University of California, Berkeley Math 128A Numerical Analysis Limits and Continuity

More information

What Every Programmer Should Know About Floating-Point Arithmetic DRAFT. Last updated: November 3, Abstract

What Every Programmer Should Know About Floating-Point Arithmetic DRAFT. Last updated: November 3, Abstract What Every Programmer Should Know About Floating-Point Arithmetic Last updated: November 3, 2014 Abstract The article provides simple answers to the common recurring questions of novice programmers about

More information

Introduction CSE 541

Introduction CSE 541 Introduction CSE 541 1 Numerical methods Solving scientific/engineering problems using computers. Root finding, Chapter 3 Polynomial Interpolation, Chapter 4 Differentiation, Chapter 4 Integration, Chapters

More information

Binary Floating-Point Numbers

Binary Floating-Point Numbers Binary Floating-Point Numbers S exponent E significand M F=(-1) s M β E Significand M pure fraction [0, 1-ulp] or [1, 2) for β=2 Normalized form significand has no leading zeros maximum # of significant

More information

Contents Experimental Perturbations Introduction to Interval Arithmetic Review Questions Problems...

Contents Experimental Perturbations Introduction to Interval Arithmetic Review Questions Problems... Contents 2 How to Obtain and Estimate Accuracy 1 2.1 Basic Concepts in Error Estimation................ 1 2.1.1 Sources of Error.................... 1 2.1.2 Absolute and Relative Errors............. 4

More information

Math 411 Preliminaries

Math 411 Preliminaries Math 411 Preliminaries Provide a list of preliminary vocabulary and concepts Preliminary Basic Netwon s method, Taylor series expansion (for single and multiple variables), Eigenvalue, Eigenvector, Vector

More information

Lecture 2 MODES OF NUMERICAL COMPUTATION

Lecture 2 MODES OF NUMERICAL COMPUTATION 1. Diversity of Numbers Lecture 2 Page 1 It is better to solve the right problem the wrong way than to solve the wrong problem the right way. The purpose of computing is insight, not numbers. Richard Wesley

More information

Applied Mathematics 205. Unit 0: Overview of Scientific Computing. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit 0: Overview of Scientific Computing. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit 0: Overview of Scientific Computing Lecturer: Dr. David Knezevic Scientific Computing Computation is now recognized as the third pillar of science (along with theory and experiment)

More information

Computer Arithmetic. MATH 375 Numerical Analysis. J. Robert Buchanan. Fall Department of Mathematics. J. Robert Buchanan Computer Arithmetic

Computer Arithmetic. MATH 375 Numerical Analysis. J. Robert Buchanan. Fall Department of Mathematics. J. Robert Buchanan Computer Arithmetic Computer Arithmetic MATH 375 Numerical Analysis J. Robert Buchanan Department of Mathematics Fall 2013 Machine Numbers When performing arithmetic on a computer (laptop, desktop, mainframe, cell phone,

More information

1 Backward and Forward Error

1 Backward and Forward Error Math 515 Fall, 2008 Brief Notes on Conditioning, Stability and Finite Precision Arithmetic Most books on numerical analysis, numerical linear algebra, and matrix computations have a lot of material covering

More information

Round-off Errors and Computer Arithmetic - (1.2)

Round-off Errors and Computer Arithmetic - (1.2) Round-off Errors and Comuter Arithmetic - (.). Round-off Errors: Round-off errors is roduced when a calculator or comuter is used to erform real number calculations. That is because the arithmetic erformed

More information

Numerical Methods - Preliminaries

Numerical Methods - Preliminaries Numerical Methods - Preliminaries Y. K. Goh Universiti Tunku Abdul Rahman 2013 Y. K. Goh (UTAR) Numerical Methods - Preliminaries 2013 1 / 58 Table of Contents 1 Introduction to Numerical Methods Numerical

More information

Numerical Mathematical Analysis

Numerical Mathematical Analysis Numerical Mathematical Analysis Numerical Mathematical Analysis Catalin Trenchea Department of Mathematics University of Pittsburgh September 20, 2010 Numerical Mathematical Analysis Math 1070 Numerical

More information

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science

EAD 115. Numerical Solution of Engineering and Scientific Problems. David M. Rocke Department of Applied Science EAD 115 Numerical Solution of Engineering and Scientific Problems David M. Rocke Department of Applied Science Computer Representation of Numbers Counting numbers (unsigned integers) are the numbers 0,

More information

Lecture 28 The Main Sources of Error

Lecture 28 The Main Sources of Error Lecture 28 The Main Sources of Error Truncation Error Truncation error is defined as the error caused directly by an approximation method For instance, all numerical integration methods are approximations

More information

4.2 Floating-Point Numbers

4.2 Floating-Point Numbers 101 Approximation 4.2 Floating-Point Numbers 4.2 Floating-Point Numbers The number 3.1416 in scientific notation is 0.31416 10 1 or (as computer output) -0.31416E01..31416 10 1 exponent sign mantissa base

More information

Chapter 1: Preliminaries and Error Analysis

Chapter 1: Preliminaries and Error Analysis Chapter 1: Error Analysis Peter W. White white@tarleton.edu Department of Tarleton State University Summer 2015 / Numerical Analysis Overview We All Remember Calculus Derivatives: limit definition, sum

More information

Chapter 1 Computer Arithmetic

Chapter 1 Computer Arithmetic Numerical Analysis (Math 9372) 2017-2016 Chapter 1 Computer Arithmetic 1.1 Introduction Numerical analysis is a way to solve mathematical problems by special procedures which use arithmetic operations

More information

Lecture notes to the course. Numerical Methods I. Clemens Kirisits

Lecture notes to the course. Numerical Methods I. Clemens Kirisits Lecture notes to the course Numerical Methods I Clemens Kirisits November 8, 08 ii Preface These lecture notes are intended as a written companion to the course Numerical Methods I taught from 06 to 08

More information

Numerics and Error Analysis

Numerics and Error Analysis Numerics and Error Analysis CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Numerics and Error Analysis 1 / 30 A Puzzle What

More information

MAT 460: Numerical Analysis I. James V. Lambers

MAT 460: Numerical Analysis I. James V. Lambers MAT 460: Numerical Analysis I James V. Lambers January 31, 2013 2 Contents 1 Mathematical Preliminaries and Error Analysis 7 1.1 Introduction............................ 7 1.1.1 Error Analysis......................

More information

How do computers represent numbers?

How do computers represent numbers? How do computers represent numbers? Tips & Tricks Week 1 Topics in Scientific Computing QMUL Semester A 2017/18 1/10 What does digital mean? The term DIGITAL refers to any device that operates on discrete

More information

1 Finding and fixing floating point problems

1 Finding and fixing floating point problems Notes for 2016-09-09 1 Finding and fixing floating point problems Floating point arithmetic is not the same as real arithmetic. Even simple properties like associativity or distributivity of addition and

More information

Homework 2 Foundations of Computational Math 1 Fall 2018

Homework 2 Foundations of Computational Math 1 Fall 2018 Homework 2 Foundations of Computational Math 1 Fall 2018 Note that Problems 2 and 8 have coding in them. Problem 2 is a simple task while Problem 8 is very involved (and has in fact been given as a programming

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 09: Accuracy and Stability Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 12 Outline 1 Condition Number of Matrices

More information

Lecture 1. Introduction to Numerical Methods. T. Gambill. Department of Computer Science University of Illinois at Urbana-Champaign.

Lecture 1. Introduction to Numerical Methods. T. Gambill. Department of Computer Science University of Illinois at Urbana-Champaign. Lecture 1 Introduction to Numerical Methods T. Gambill Department of Computer Science University of Illinois at Urbana-Champaign June 10, 2013 T. Gambill (UIUC) CS 357 June 10, 2013 1 / 52 Course Info

More information

Numerical Algorithms. IE 496 Lecture 20

Numerical Algorithms. IE 496 Lecture 20 Numerical Algorithms IE 496 Lecture 20 Reading for This Lecture Primary Miller and Boxer, Pages 124-128 Forsythe and Mohler, Sections 1 and 2 Numerical Algorithms Numerical Analysis So far, we have looked

More information

On various ways to split a floating-point number

On various ways to split a floating-point number On various ways to split a floating-point number Claude-Pierre Jeannerod Jean-Michel Muller Paul Zimmermann Inria, CNRS, ENS Lyon, Université de Lyon, Université de Lorraine France ARITH-25 June 2018 -2-

More information

Number Representation and Waveform Quantization

Number Representation and Waveform Quantization 1 Number Representation and Waveform Quantization 1 Introduction This lab presents two important concepts for working with digital signals. The first section discusses how numbers are stored in memory.

More information

Introduction 5. 1 Floating-Point Arithmetic 5. 2 The Direct Solution of Linear Algebraic Systems 11

Introduction 5. 1 Floating-Point Arithmetic 5. 2 The Direct Solution of Linear Algebraic Systems 11 SCIENTIFIC COMPUTING BY NUMERICAL METHODS Christina C. Christara and Kenneth R. Jackson, Computer Science Dept., University of Toronto, Toronto, Ontario, Canada, M5S 1A4. (ccc@cs.toronto.edu and krj@cs.toronto.edu)

More information

MATH ASSIGNMENT 03 SOLUTIONS

MATH ASSIGNMENT 03 SOLUTIONS MATH444.0 ASSIGNMENT 03 SOLUTIONS 4.3 Newton s method can be used to compute reciprocals, without division. To compute /R, let fx) = x R so that fx) = 0 when x = /R. Write down the Newton iteration for

More information

Parallel Reproducible Summation

Parallel Reproducible Summation Parallel Reproducible Summation James Demmel Mathematics Department and CS Division University of California at Berkeley Berkeley, CA 94720 demmel@eecs.berkeley.edu Hong Diep Nguyen EECS Department University

More information

The Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal

The Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal -1- The Classical Relative Error Bounds for Computing a2 + b 2 and c/ a 2 + b 2 in Binary Floating-Point Arithmetic are Asymptotically Optimal Claude-Pierre Jeannerod Jean-Michel Muller Antoine Plet Inria,

More information

Compute the behavior of reality even if it is impossible to observe the processes (for example a black hole in astrophysics).

Compute the behavior of reality even if it is impossible to observe the processes (for example a black hole in astrophysics). 1 Introduction Read sections 1.1, 1.2.1 1.2.4, 1.2.6, 1.3.8, 1.3.9, 1.4. Review questions 1.1 1.6, 1.12 1.21, 1.37. The subject of Scientific Computing is to simulate the reality. Simulation is the representation

More information

9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017

9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 9. Datapath Design Jacob Abraham Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017 October 2, 2017 ECE Department, University of Texas at Austin

More information

Introduction to Scientific Computing

Introduction to Scientific Computing (Lecture 2: Machine precision and condition number) B. Rosić, T.Moshagen Institute of Scientific Computing General information :) 13 homeworks (HW) Work in groups of 2 or 3 people Each HW brings maximally

More information

Accurate polynomial evaluation in floating point arithmetic

Accurate polynomial evaluation in floating point arithmetic in floating point arithmetic Université de Perpignan Via Domitia Laboratoire LP2A Équipe de recherche en Informatique DALI MIMS Seminar, February, 10 th 2006 General motivation Provide numerical algorithms

More information

8/13/16. Data analysis and modeling: the tools of the trade. Ø Set of numbers. Ø Binary representation of numbers. Ø Floating points.

8/13/16. Data analysis and modeling: the tools of the trade. Ø Set of numbers. Ø Binary representation of numbers. Ø Floating points. Data analysis and modeling: the tools of the trade Patrice Koehl Department of Biological Sciences National University of Singapore http://www.cs.ucdavis.edu/~koehl/teaching/bl5229 koehl@cs.ucdavis.edu

More information

ESO 208A: Computational Methods in Engineering. Saumyen Guha

ESO 208A: Computational Methods in Engineering. Saumyen Guha ESO 208A: Computational Methods in Engineering Introduction, Error Analysis Saumyen Guha Department of Civil Engineering IIT Kanpur What is Computational Methods or Numerical Methods in Engineering? Formulation

More information

ERROR BOUNDS ON COMPLEX FLOATING-POINT MULTIPLICATION

ERROR BOUNDS ON COMPLEX FLOATING-POINT MULTIPLICATION MATHEMATICS OF COMPUTATION Volume 00, Number 0, Pages 000 000 S 005-5718(XX)0000-0 ERROR BOUNDS ON COMPLEX FLOATING-POINT MULTIPLICATION RICHARD BRENT, COLIN PERCIVAL, AND PAUL ZIMMERMANN In memory of

More information

Numerical Methods - Lecture 2. Numerical Methods. Lecture 2. Analysis of errors in numerical methods

Numerical Methods - Lecture 2. Numerical Methods. Lecture 2. Analysis of errors in numerical methods Numerical Methods - Lecture 1 Numerical Methods Lecture. Analysis o errors in numerical methods Numerical Methods - Lecture Why represent numbers in loating point ormat? Eample 1. How a number 56.78 can

More information

CHAPTER 11. A Revision. 1. The Computers and Numbers therein

CHAPTER 11. A Revision. 1. The Computers and Numbers therein CHAPTER A Revision. The Computers and Numbers therein Traditional computer science begins with a finite alphabet. By stringing elements of the alphabet one after another, one obtains strings. A set of

More information

Chapter 4 No. 4.0 Answer True or False to the following. Give reasons for your answers.

Chapter 4 No. 4.0 Answer True or False to the following. Give reasons for your answers. MATH 434/534 Theoretical Assignment 3 Solution Chapter 4 No 40 Answer True or False to the following Give reasons for your answers If a backward stable algorithm is applied to a computational problem,

More information

An Introduction to Numerical Analysis. James Brannick. The Pennsylvania State University

An Introduction to Numerical Analysis. James Brannick. The Pennsylvania State University An Introduction to Numerical Analysis James Brannick The Pennsylvania State University Contents Chapter 1. Introduction 5 Chapter 2. Computer arithmetic and Error Analysis 7 Chapter 3. Approximation and

More information

Numerical Derivatives in Scilab

Numerical Derivatives in Scilab Numerical Derivatives in Scilab Michaël Baudin February 2017 Abstract This document present the use of numerical derivatives in Scilab. In the first part, we present a result which is surprising when we

More information

Computing Machine-Efficient Polynomial Approximations

Computing Machine-Efficient Polynomial Approximations Computing Machine-Efficient Polynomial Approximations N. Brisebarre, S. Chevillard, G. Hanrot, J.-M. Muller, D. Stehlé, A. Tisserand and S. Torres Arénaire, LIP, É.N.S. Lyon Journées du GDR et du réseau

More information

Chapter 4 Number Representations

Chapter 4 Number Representations Chapter 4 Number Representations SKEE2263 Digital Systems Mun im/ismahani/izam {munim@utm.my,e-izam@utm.my,ismahani@fke.utm.my} February 9, 2016 Table of Contents 1 Fundamentals 2 Signed Numbers 3 Fixed-Point

More information

Formal verification of IA-64 division algorithms

Formal verification of IA-64 division algorithms Formal verification of IA-64 division algorithms 1 Formal verification of IA-64 division algorithms John Harrison Intel Corporation IA-64 overview HOL Light overview IEEE correctness Division on IA-64

More information

BASIC COMPUTER ARITHMETIC

BASIC COMPUTER ARITHMETIC BASIC COMPUTER ARITHMETIC TSOGTGEREL GATUMUR Abstract. First, we consider how integers and fractional numbers are represented and manipulated internally on a computer. Then we develop a basic theoretical

More information

1.1 COMPUTER REPRESENTATION OF NUM- BERS, REPRESENTATION ERRORS

1.1 COMPUTER REPRESENTATION OF NUM- BERS, REPRESENTATION ERRORS Chapter 1 NUMBER REPRESENTATION, ERROR ANALYSIS 1.1 COMPUTER REPRESENTATION OF NUM- BERS, REPRESENTATION ERRORS Floating-point representation x t,r of a number x: x t,r = m t P cr, where: P - base (the

More information

Lecture 2: Number Representations (2)

Lecture 2: Number Representations (2) Lecture 2: Number Representations (2) ECE 645 Computer Arithmetic 1/29/08 ECE 645 Computer Arithmetic Lecture Roadmap Number systems (cont'd) Floating point number system representations Residue number

More information

2.29 Numerical Fluid Mechanics Fall 2011 Lecture 2

2.29 Numerical Fluid Mechanics Fall 2011 Lecture 2 2.29 Numerical Fluid Mechanics Fall 2011 Lecture 2 REVIEW Lecture 1 1. Syllabus, Goals and Objectives 2. Introduction to CFD 3. From mathematical models to numerical simulations (1D Sphere in 1D flow)

More information

Getting tight error bounds in floating-point arithmetic: illustration with complex functions, and the real x n function

Getting tight error bounds in floating-point arithmetic: illustration with complex functions, and the real x n function Getting tight error bounds in floating-point arithmetic: illustration with complex functions, and the real x n function Jean-Michel Muller collaborators: S. Graillat, C.-P. Jeannerod, N. Louvet V. Lefèvre

More information

The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor

The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor The Euclidean Division Implemented with a Floating-Point Multiplication and a Floor Vincent Lefèvre 11th July 2005 Abstract This paper is a complement of the research report The Euclidean division implemented

More information

Can You Count on Your Computer?

Can You Count on Your Computer? Can You Count on Your Computer? Professor Nick Higham School of Mathematics University of Manchester higham@ma.man.ac.uk http://www.ma.man.ac.uk/~higham/ p. 1/33 p. 2/33 Counting to Six I asked my computer

More information

Efficient Reproducible Floating Point Summation and BLAS

Efficient Reproducible Floating Point Summation and BLAS Efficient Reproducible Floating Point Summation and BLAS Peter Ahrens Hong Diep Nguyen James Demmel Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No.

More information

An Introduction to Numerical Analysis

An Introduction to Numerical Analysis An Introduction to Numerical Analysis Department of Mathematical Sciences, NTNU 21st august 2012 Practical issues webpage: http://wiki.math.ntnu.no/tma4215/2012h/start Lecturer:, elenac@math.ntnu.no Lecures:

More information

Numerical Analysis and Computing

Numerical Analysis and Computing Numerical Analysis and Computing Lecture Notes #02 Calculus Review; Computer Artihmetic and Finite Precision; and Convergence; Joe Mahaffy, mahaffy@math.sdsu.edu Department of Mathematics Dynamical Systems

More information

Mathematics for Engineers. Numerical mathematics

Mathematics for Engineers. Numerical mathematics Mathematics for Engineers Numerical mathematics Integers Determine the largest representable integer with the intmax command. intmax ans = int32 2147483647 2147483647+1 ans = 2.1475e+09 Remark The set

More information

Number Systems III MA1S1. Tristan McLoughlin. December 4, 2013

Number Systems III MA1S1. Tristan McLoughlin. December 4, 2013 Number Systems III MA1S1 Tristan McLoughlin December 4, 2013 http://en.wikipedia.org/wiki/binary numeral system http://accu.org/index.php/articles/1558 http://www.binaryconvert.com http://en.wikipedia.org/wiki/ascii

More information

Round-off error propagation and non-determinism in parallel applications

Round-off error propagation and non-determinism in parallel applications Round-off error propagation and non-determinism in parallel applications Vincent Baudoui (Argonne/Total SA) vincent.baudoui@gmail.com Franck Cappello (Argonne/INRIA/UIUC-NCSA) Georges Oppenheim (Paris-Sud

More information

Chapter 1 Mathematical Preliminaries and Error Analysis

Chapter 1 Mathematical Preliminaries and Error Analysis Numerical Analysis (Math 3313) 2019-2018 Chapter 1 Mathematical Preliminaries and Error Analysis Intended learning outcomes: Upon successful completion of this chapter, a student will be able to (1) list

More information

QUADRATIC PROGRAMMING?

QUADRATIC PROGRAMMING? QUADRATIC PROGRAMMING? WILLIAM Y. SIT Department of Mathematics, The City College of The City University of New York, New York, NY 10031, USA E-mail: wyscc@cunyvm.cuny.edu This is a talk on how to program

More information

Numbering Systems. Contents: Binary & Decimal. Converting From: B D, D B. Arithmetic operation on Binary.

Numbering Systems. Contents: Binary & Decimal. Converting From: B D, D B. Arithmetic operation on Binary. Numbering Systems Contents: Binary & Decimal. Converting From: B D, D B. Arithmetic operation on Binary. Addition & Subtraction using Octal & Hexadecimal 2 s Complement, Subtraction Using 2 s Complement.

More information

Negative Bit Representation Outline

Negative Bit Representation Outline Negative Bit Representation Outline 1. Negative Bit Representation Outline 2. Negative Integers 3. Representing Negativity 4. Which Bit for the Sign? 5. Sign-Value 6. Disadvantages of Sign-Value 7. One

More information

Ex code

Ex code Ex. 8.4 7-4-2-1 code Codeconverter 7-4-2-1-code to BCD-code. When encoding the digits 0... 9 sometimes in the past a code having weights 7-4-2-1 instead of the binary code weights 8-4-2-1 was used. In

More information

Math 348: Numerical Methods with application in finance and economics. Boualem Khouider University of Victoria

Math 348: Numerical Methods with application in finance and economics. Boualem Khouider University of Victoria Math 348: Numerical Methods with application in finance and economics Boualem Khouider University of Victoria Lecture notes: Last Updated September 2016 Contents 1 Basics of numerical analysis 10 1.1 Notion

More information

One-Sided Difference Formula for the First Derivative

One-Sided Difference Formula for the First Derivative POLYTECHNIC UNIVERSITY Department of Computer and Information Science One-Sided Difference Formula for the First Derivative K. Ming Leung Abstract: Derive a one-sided formula for the first derive of a

More information

Can ANSYS Mechanical Handle My Required Modeling Precision?

Can ANSYS Mechanical Handle My Required Modeling Precision? Can ANSYS Mechanical Handle My Required Modeling Precision? 2/8/2017 www.telescope-optics.net Recently, PADT received the following inquiry from a scientist interested in using ANSYS for telescope application

More information

Computation of the error functions erf and erfc in arbitrary precision with correct rounding

Computation of the error functions erf and erfc in arbitrary precision with correct rounding Computation of the error functions erf and erfc in arbitrary precision with correct rounding Sylvain Chevillard Arenaire, LIP, ENS-Lyon, France Sylvain.Chevillard@ens-lyon.fr Nathalie Revol INRIA, Arenaire,

More information

Numerical Analysis. Yutian LI. 2018/19 Term 1 CUHKSZ. Yutian LI (CUHKSZ) Numerical Analysis 2018/19 1 / 41

Numerical Analysis. Yutian LI. 2018/19 Term 1 CUHKSZ. Yutian LI (CUHKSZ) Numerical Analysis 2018/19 1 / 41 Numerical Analysis Yutian LI CUHKSZ 2018/19 Term 1 Yutian LI (CUHKSZ) Numerical Analysis 2018/19 1 / 41 Reference Books BF R. L. Burden and J. D. Faires, Numerical Analysis, 9th edition, Thomsom Brooks/Cole,

More information

Introduction and mathematical preliminaries

Introduction and mathematical preliminaries Chapter Introduction and mathematical preliminaries Contents. Motivation..................................2 Finite-digit arithmetic.......................... 2.3 Errors in numerical calculations.....................

More information

Computation of dot products in finite fields with floating-point arithmetic

Computation of dot products in finite fields with floating-point arithmetic Computation of dot products in finite fields with floating-point arithmetic Stef Graillat Joint work with Jérémy Jean LIP6/PEQUAN - Université Pierre et Marie Curie (Paris 6) - CNRS Computer-assisted proofs

More information

Some Functions Computable with a Fused-mac

Some Functions Computable with a Fused-mac Some Functions Computable with a Fused-mac Sylvie Boldo, Jean-Michel Muller To cite this version: Sylvie Boldo, Jean-Michel Muller. Some Functions Computable with a Fused-mac. Paolo Montuschi, Eric Schwarz.

More information