Binary Floating-Point Numbers

1 Binary Floating-Point Numbers
Format fields: S | exponent E | significand M
$F = (-1)^S \cdot M \cdot \beta^E$
Significand M: a pure fraction in $[0, 1 - \mathrm{ulp}]$, or a value in $[1, 2)$ for $\beta = 2$

2 Normalized form
The significand has no leading zeros
Maximum number of significant digits
Easy comparison
Normalization for $\beta = 2^k$: at least one nonzero bit in the first k positions
$M_{\min} = 1/\beta$
$M_{\max} = 1 - \mathrm{ulp}$

3 Biased exponent
$E = E_{\text{true}} + \text{bias}$
$-2^{e-1} \le E_{\text{true}} \le 2^{e-1} - 1$
If $\text{bias} = 2^{e-1}$, then $0 \le E \le 2^e - 1$
Excess-$2^{e-1}$ code: the two's complement code with the sign bit inverted
Zero: M = 0, E = 0
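The relation between the excess code and two's complement can be checked with a minimal sketch. The function names and the 4-bit exponent width are assumptions chosen for illustration; they do not come from the slides.

```python
def excess_code(e_true: int, e_bits: int) -> int:
    """Biased exponent E = E_true + bias, with bias = 2^(e-1)."""
    bias = 1 << (e_bits - 1)
    if not -bias <= e_true <= bias - 1:
        raise ValueError("true exponent out of range")
    return e_true + bias


def twos_complement(e_true: int, e_bits: int) -> int:
    """e-bit two's complement pattern of the true exponent."""
    return e_true & ((1 << e_bits) - 1)


def invert_sign_bit(code: int, e_bits: int) -> int:
    """Flip the most significant bit of an e-bit pattern."""
    return code ^ (1 << (e_bits - 1))


# For every representable exponent, the excess code equals the
# two's complement pattern with its sign bit inverted.
E_BITS = 4  # assumed width for the demonstration
for e_true in range(-(1 << (E_BITS - 1)), 1 << (E_BITS - 1)):
    biased = excess_code(e_true, E_BITS)
    assert biased == invert_sign_bit(twos_complement(e_true, E_BITS), E_BITS)
```

With 4 exponent bits the true exponents -8 to 7 map to the biased values 0 to 15, so the smallest exponent is the all-zeros pattern.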

4 Biased exponent: exponents can be compared as if they were unsigned numbers
Biased exponent + normalization: compare exponents, then compare significands
Floating-point numbers can be compared as if they were integers in signed-magnitude representation
Format fields: S | exponent E | significand M, where E and M together form the magnitude
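A minimal sketch of the integer comparison, assuming the IEEE single-precision layout (sign, 8-bit biased exponent, 23-bit fraction) as the concrete format. The helper names are hypothetical, and NaN patterns are deliberately left out because they do not order like ordinary numbers.

```python
import struct


def float_bits(x: float) -> int:
    """Raw 32-bit pattern of x stored as an IEEE single (assumed format)."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def sign_magnitude_key(x: float) -> int:
    """Interpret the bit pattern as a signed-magnitude integer."""
    bits = float_bits(x)
    magnitude = bits & 0x7FFFFFFF       # exponent and fraction fields
    return -magnitude if bits >> 31 else magnitude


values = [3.0e10, -0.75, 1.0, 0.125, -2.5, 0.0]
# Sorting by the integer key gives the same order as sorting the floats.
assert sorted(values, key=sign_magnitude_key) == sorted(values)
```

The comparison never decodes the exponent or the significand, which is the benefit the slide attributes to the biased, normalized format.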

5 Range
Positive floating-point numbers: $M_{\min} \beta^{E_{\min}} \le F^+ \le M_{\max} \beta^{E_{\max}}$
$E > E_{\max}$: exponent overflow
$E < E_{\min}$: exponent underflow
Range of $F^+$ = range of $-F^-$

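6 IBM
Format fields: S | exponent E | significand M
$\beta = 16$
$F = (-1)^S \cdot M \cdot 16^{E-64}$
$F^+_{\min} = 16^{-1} \times 16^{-64} = 16^{-65} \approx 5.4 \times 10^{-79}$
$F^+_{\max} = (1 - \mathrm{ulp}) \times 16^{63} \approx 7.2 \times 10^{75}$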

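7 DEC/VAX
Format fields: S | exponent E | significand f
$\beta = 2$
$F = (-1)^S \cdot 0.1f \cdot 2^{E-128}$
E = 0 is reserved for zero
$F^+_{\min} = 0.1_2 \times 2^{1-128} = 2^{-128}$
$F^+_{\max} = (1 - \mathrm{ulp}) \times 2^{255-128} = (1 - \mathrm{ulp}) \times 2^{127}$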

8 Floating-point operations
Multiplication ($F_3 = F_1 \times F_2$)
Sign bit: $S_3 = S_1 \oplus S_2$
Exponent: $E_3 = E_1 + E_2 - \text{bias}$
$E_3 > E_{\max}$: overflow
$E_3 < E_{\min}$: underflow
Significand: $1/\beta \le M_i < 1 \Rightarrow 1/\beta^2 \le M_1 M_2 < 1$
Postnormalization: if $M_3 = M_1 M_2 < 1/\beta$, shift $M_3$ left and decrease $E_3$ (may cause underflow)
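The multiplication steps can be sketched on a toy format. Everything concrete here is an assumption for illustration: $\beta = 2$, a bias of 128, an exponent range of 1 to 255, and exact rational significands, so no rounding takes place.

```python
from fractions import Fraction

BETA = 2                 # assumed radix
BIAS = 128               # assumed bias
E_MIN, E_MAX = 1, 255    # assumed range of the biased exponent


def fp_multiply(a, b):
    """Multiply two numbers given as (S, E, M) with 1/BETA <= M < 1."""
    s1, e1, m1 = a
    s2, e2, m2 = b
    s3 = s1 ^ s2                 # sign: exclusive OR of the sign bits
    e3 = e1 + e2 - BIAS          # remove the bias that was counted twice
    m3 = m1 * m2                 # 1/BETA^2 <= m3 < 1
    if m3 < Fraction(1, BETA):   # postnormalization: one left shift at most
        m3 *= BETA
        e3 -= 1
    if e3 > E_MAX:
        raise OverflowError("exponent overflow")
    if e3 < E_MIN:
        raise ArithmeticError("exponent underflow")
    return s3, e3, m3


def value(x):
    """Exact rational value of (S, E, M)."""
    s, e, m = x
    return (-1) ** s * m * Fraction(BETA) ** (e - BIAS)


a = (0, 129, Fraction(1, 2))     # +0.5  * 2^1 = 1
b = (1, 130, Fraction(3, 4))     # -0.75 * 2^2 = -3
product = fp_multiply(a, b)
assert value(product) == value(a) * value(b)
```

In this example the raw product significand is 3/8, which is below 1/2, so the postnormalization step shifts it left once and lowers the exponent from 131 to 130.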

9 Division ($F_3 = F_1 / F_2$)
Exponent: $E_3 = E_1 - E_2 + \text{bias}$
Significand: $1/\beta \le M_i < 1 \Rightarrow 1/\beta < M_1 / M_2 < \beta$
Postnormalization: if $M_3 = M_1 / M_2 \ge 1$, shift $M_3$ right and increase $E_3$ (may cause overflow)
$M_2 = 0$: division by zero
$M_1 = M_2 = 0$: undefined
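A matching sketch for division, under the same illustrative assumptions ($\beta = 2$, bias 128, maximum biased exponent 255, exact rational significands). The sign rule is not stated on the division slide; the exclusive OR used for multiplication is assumed here.

```python
from fractions import Fraction

BETA = 2       # assumed radix
BIAS = 128     # assumed bias
E_MAX = 255    # assumed largest biased exponent


def fp_divide(a, b):
    """Divide two numbers given as (S, E, M) with 1/BETA <= M < 1."""
    s1, e1, m1 = a
    s2, e2, m2 = b
    if m2 == 0:
        if m1 == 0:
            raise ArithmeticError("0 / 0 is undefined")
        raise ZeroDivisionError("division by zero")
    s3 = s1 ^ s2                 # assumed: same sign rule as multiplication
    e3 = e1 - e2 + BIAS          # restore the bias that cancelled out
    m3 = m1 / m2                 # 1/BETA < m3 < BETA
    if m3 >= 1:                  # postnormalization: one right shift at most
        m3 /= BETA
        e3 += 1
    if e3 > E_MAX:
        raise OverflowError("exponent overflow")
    return s3, e3, m3


a = (0, 130, Fraction(3, 4))     # 0.75 * 2^2 = 3
b = (0, 129, Fraction(1, 2))     # 0.5  * 2^1 = 1
quotient = fp_divide(a, b)
```

Here the raw quotient significand is 3/2, so it is shifted right once and the exponent rises from 129 to 130, giving 0.75 * 2^2 = 3.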

10 Addition/Subtraction
Shift the significand of the smaller operand right by $E_1 - E_2$ base-$\beta$ positions (taking $E_1 \ge E_2$)
Addition: $1/\beta \le M < 2$; if $M \ge 1$, postnormalization is needed
Subtraction: $0 \le M < 1$; if $M < 1/\beta$, postnormalization is needed

11 Example
$F_1 = (0.100000)_{16} \times 16^3$
$F_2 = (0.FFFFFF)_{16} \times 16^2$

Without a guard digit:
F1                 0.100000 x 16^3
F2 aligned         0.0FFFFF x 16^3
F1 - F2            0.000001 x 16^3
postnormalization  0.100000 x 16^-2

With a guard digit:
F1                 0.1000000 x 16^3
F2 aligned         0.0FFFFFF x 16^3
F1 - F2            0.0000001 x 16^3
postnormalization  0.100000 x 16^-3
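The effect of the guard digit can be reproduced with integer arithmetic on hexadecimal significands. This is a sketch under stated assumptions: the digits of F1 are garbled in the transcription, so $F_1 = (0.100000)_{16} \times 16^3$ is assumed because it is consistent with the exponents $16^{-2}$ and $16^{-3}$ that survive. The function name and its parameters are hypothetical.

```python
def subtract_hex(m1, e1, m2, e2, digits=6, guard=0):
    """Compute F1 - F2 for F = 0.m x 16^e, assuming e1 >= e2 and F1 >= F2.

    m1 and m2 are integers holding `digits` hexadecimal digits.
    Returns (significand, exponent) after postnormalization.
    """
    width = digits + guard
    a = m1 << (4 * guard)                        # F1 extended by guard digits
    b = (m2 << (4 * guard)) >> (4 * (e1 - e2))   # F2 aligned, excess digits dropped
    diff = a - b
    e = e1
    # Postnormalization: shift left until the leading hex digit is nonzero.
    while diff and diff < 16 ** (width - 1):
        diff <<= 4
        e -= 1
    return diff >> (4 * guard), e                # drop the guard digits again


no_guard = subtract_hex(0x100000, 3, 0xFFFFFF, 2, guard=0)
with_guard = subtract_hex(0x100000, 3, 0xFFFFFF, 2, guard=1)
```

The exact difference is $16^2 - (1 - 16^{-6}) \cdot 16^2 = 16^{-4} = (0.1)_{16} \times 16^{-3}$, so only the result computed with the guard digit is correct; without it the answer is too large by a factor of 16.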

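12 IEEE floating-point standard
Single precision
Format fields: S | exponent E | fraction f
$F = (-1)^S \times 1.f \times 2^{E-127}$, with $1 \le E \le 254$
$F^+_{\max} = (1.11\ldots1)_2 \times 2^{254-127} = (2 - 2^{-23}) \times 2^{127} \approx 3.4 \times 10^{38}$
$F^+_{\min} = 1.0 \times 2^{1-127} = 2^{-126} \approx 1.2 \times 10^{-38}$
Precision = 24 bits (23 stored fraction bits plus the hidden leading 1)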

13 E = 255 is reserved for $\pm\infty$ and NaN
E = 0 is reserved for zero and denormalized numbers
Zero: E = 0, f = 0
Denormalized numbers: E = 0, $f \ne 0$
$F = (-1)^S \times 0.f \times 2^{-126}$
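The single-precision rules, including the reserved exponent values, can be sketched as a decoder for 32-bit patterns. The function names are hypothetical. Only the standard library is used, and the result is returned as a Python float, which is a double and therefore holds every single-precision value exactly.

```python
import math
import struct


def decode_single(bits: int) -> float:
    """Decode a 32-bit IEEE single-precision pattern field by field."""
    s = bits >> 31
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    sign = -1.0 if s else 1.0
    if e == 255:                          # reserved: infinity and NaN
        return math.nan if f else sign * math.inf
    if e == 0:                            # zero and denormalized: 0.f x 2^-126
        return sign * math.ldexp(f, -23 - 126)
    # Normalized: 1.f x 2^(E-127), with the hidden 1 restored.
    return sign * math.ldexp((1 << 23) | f, e - 127 - 23)


def single_bits(x: float) -> int:
    """Bit pattern of x after conversion to single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


largest = decode_single(0x7F7FFFFF)       # E = 254, f all ones
smallest_normal = decode_single(0x00800000)   # E = 1, f = 0
smallest_denormal = decode_single(0x00000001)
```

The three sample patterns decode to $(2 - 2^{-23}) \times 2^{127}$, $2^{-126}$ and $2^{-149}$ respectively.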

14 Double precision
Format fields: S | exponent E | fraction f
$F = (-1)^S \times 1.f \times 2^{E-1023}$

15 Round-off schemes
Truncation
[Figure: Trunc(x) plotted against x]

16 Rounding errors for the truncation scheme with d = 2 (d: number of extra digits)

Number   Trunc(x)   Error
X.00     X          0
X.01     X          -1/4
X.10     X          -1/2
X.11     X          -3/4

bias = average error = $(0 - 1/4 - 1/2 - 3/4)/4 = -3/8$

17 Round-to-nearest
[Figure: Round-to-nearest(x) plotted against x]

18 Rounding errors for the round-to-nearest scheme with d = 2

Number   Round-to-nearest(x)   Error
X.00     X                     0
X.01     X                     -1/4
X.10     X + 1                 +1/2
X.11     X + 1                 +1/4

bias = $(0 - 1/4 + 1/2 + 1/4)/4 = 1/8$

19 Round-to-nearest-even
[Figure: Round-to-nearest-even(x) plotted against x]

20 Rounding errors for the round-to-nearest-even scheme with d = 2

Number   Round-to-nearest-even(x)   Error
X0.00    X0.                        0
X0.01    X0.                        -1/4
X0.10    X0.                        -1/2
X0.11    X1.                        +1/4
X1.00    X1.                        0
X1.01    X1.                        -1/4
X1.10    X1. + 1                    +1/2
X1.11    X1. + 1                    +1/4

bias = 0
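The three bias values can be recomputed by enumerating all patterns, as a minimal sketch. Numbers are held as integers scaled by $2^d$ with d = 2, and one integer bit is enumerated so that both the even (X0) and odd (X1) cases occur. All function names are hypothetical.

```python
from fractions import Fraction

D = 2  # number of extra bits to be rounded away


def truncate(n: int) -> int:
    """Drop the D extra bits."""
    return n >> D


def round_nearest(n: int) -> int:
    """Add half an ulp and truncate, so ties round up."""
    return (n + (1 << (D - 1))) >> D


def round_nearest_even(n: int) -> int:
    """Round to nearest; ties go to the even neighbour."""
    kept, extra = n >> D, n & ((1 << D) - 1)
    half = 1 << (D - 1)
    if extra > half or (extra == half and kept & 1):
        kept += 1
    return kept


def bias(round_fn, int_bits: int = 1) -> Fraction:
    """Average of rounded minus exact over all patterns."""
    count = 1 << (int_bits + D)
    total = Fraction(0)
    for n in range(count):
        total += round_fn(n) - Fraction(n, 1 << D)
    return total / count


biases = {fn.__name__: bias(fn) for fn in (truncate, round_nearest, round_nearest_even)}
```

The enumeration gives -3/8 for truncation, 1/8 for round-to-nearest and 0 for round-to-nearest-even, matching the three tables.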

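21 ROM rounding
Avoids the add operation
ROM address (l bits): the (l - 1) low-order bits of the significand together with the MSB of the extra d bits
ROM size: $2^l \times (l - 1)$ bits
ROM output: the (l - 1) rounded bits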

22 Round-to-nearest using ROM
[Figure: ROM(x) plotted against x]

23 Rounding errors for the ROM rounding scheme with l = 3 and d = 1

Number   Round-to-nearest(x)   Error
X00.0    X00.                  0
X00.1    X01.                  +1/2
X01.0    X01.                  0
X01.1    X10.                  +1/2
X10.0    X10.                  0
X10.1    X11.                  +1/2
X11.0    X11.                  0
X11.1    X11.                  -1/2

bias = $\frac{1}{2}\left[(1/2)^d - (1/2)^{l-1}\right]$
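A minimal sketch of ROM rounding that rebuilds the table and checks the bias formula. It assumes, as the table suggests, that the ROM rounds to nearest except for the all-ones input, which is truncated so that no carry leaves the ROM. The function names are hypothetical.

```python
from fractions import Fraction


def build_rom(l: int) -> list:
    """ROM with 2^l entries of (l - 1) bits each.

    The address is the (l - 1) low-order significand bits followed by the
    most significant extra bit.
    """
    all_ones = (1 << (l - 1)) - 1
    rom = []
    for address in range(1 << l):
        kept, round_bit = address >> 1, address & 1
        if round_bit and kept != all_ones:   # all-ones input is truncated
            kept += 1
        rom.append(kept)
    return rom


def rom_bias(l: int, d: int) -> Fraction:
    """Average rounding error over all (l - 1 + d)-bit patterns."""
    rom = build_rom(l)
    count = 1 << (l - 1 + d)
    total = Fraction(0)
    for n in range(count):
        kept = n >> d                          # (l - 1) retained bits
        msb_extra = (n >> (d - 1)) & 1         # MSB of the d extra bits
        total += rom[(kept << 1) | msb_extra] - Fraction(n, 1 << d)
    return total / count


def bias_formula(l: int, d: int) -> Fraction:
    """bias = 1/2 * [(1/2)^d - (1/2)^(l-1)]"""
    return Fraction(1, 2) * (Fraction(1, 2) ** d - Fraction(1, 2) ** (l - 1))
```

For l = 3 and d = 1 the ROM contents are 0, 1, 1, 2, 2, 3, 3, 3 and the bias is 1/8, in agreement with the table and the formula.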

24 Guard Digits
Multiplication
1.f x 1.f: at most one shift right
0.f x 0.f: at most one shift left
One guard digit (G) for postnormalization
Division
0.f / 0.f: at most one shift right
1.f / 1.f: at most one shift left
One guard digit (G) for postnormalization
Round-to-nearest needs one more digit, R (round)

25 Round-to-nearest-even needs an additional digit, S (sticky)
S indicates whether all the additional digits are zero

26 Addition/Subtraction
No shift for the alignment
[Example: A, B aligned, A - B, postnormalization]
One-bit shift for the alignment: one guard bit
[Example: A, B aligned, A - B, postnormalization]

27 Shift by two or more bits for the alignment
[Example: A, B aligned, A - B, postnormalization]
Postnormalization needs at most a one-bit shift, but one guard bit is not sufficient
[Example: A, B aligned, A - B, postnormalization]

28 Use a sticky bit
[Two examples with guard (G) and sticky (S) bits: A, B aligned, A - B, postnormalization]
Good for the truncation scheme

29 The round-to-nearest(-even) scheme requires an additional bit
[Two examples with G and S bits: A, B aligned, A - B, postnormalization, round-to-nearest]

30 Use a round bit
[Two examples with G, R and S bits: A, B aligned, A - B, postnormalization, round-to-nearest]
The round-to-nearest-even scheme can use the S bit

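31 If no postnormalization is required (no shift left), then the G bit serves as the R bit and
$S_{\text{new}} = R + S_{\text{old}}$ (logical OR)
Round-to-nearest-even: add ulp to the significand if
$R \cdot S + R \cdot \bar{S} \cdot L = R\,(S + L) = 1$
where L is the least significant bit of the significand
[Example with G, R and S bits: A, B aligned, A - B, with the R and S bits before rounding and the result after round-to-nearest]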
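The rounding condition $R\,(S + L) = 1$ can be sketched directly on bit fields. The function names are hypothetical. The sketch assumes that the round bit is the first discarded bit and that the sticky bit is the OR of all bits below it, and it does not handle a carry out of the significand, which would need a further normalization step.

```python
from fractions import Fraction


def split_extra_bits(value: int, extra: int):
    """Split value into the kept significand, round bit R and sticky bit S."""
    kept = value >> extra
    r = (value >> (extra - 1)) & 1                   # first discarded bit
    s = int(value & ((1 << (extra - 1)) - 1) != 0)   # OR of the remaining bits
    return kept, r, s


def round_nearest_even(significand: int, r: int, s: int) -> int:
    """Add one ulp exactly when R (S + L) = 1, L being the significand LSB."""
    l = significand & 1
    return significand + (r & (s | l))


# Cross-check against exact rational rounding, which also rounds ties to even.
EXTRA = 3  # assumed number of discarded bits for the check
for value in range(1 << 8):
    kept, r, s = split_extra_bits(value, EXTRA)
    assert round_nearest_even(kept, r, s) == round(Fraction(value, 1 << EXTRA))
```

A tie is the case R = 1 and S = 0; the term with L then rounds up only when the significand is odd, which is what makes the result even.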


More information

10/14/2009. Reading: Hambley Chapters

10/14/2009. Reading: Hambley Chapters EE40 Lec 14 Digital Signal and Boolean Algebra Prof. Nathan Cheung 10/14/2009 Reading: Hambley Chapters 7.1-7.4 7.4 Slide 1 Analog Signals Analog: signal amplitude is continuous with time. Amplitude Modulated

More information

CprE 281: Digital Logic

CprE 281: Digital Logic CprE 281: Digital Logic Instructor: Alexander Stoytchev http://www.ece.iastate.edu/~alexs/classes/ Signed Numbers CprE 281: Digital Logic Iowa State University, Ames, IA Copyright Alexander Stoytchev Administrative

More information

Numerical Methods - Lecture 2. Numerical Methods. Lecture 2. Analysis of errors in numerical methods

Numerical Methods - Lecture 2. Numerical Methods. Lecture 2. Analysis of errors in numerical methods Numerical Methods - Lecture 1 Numerical Methods Lecture. Analysis o errors in numerical methods Numerical Methods - Lecture Why represent numbers in loating point ormat? Eample 1. How a number 56.78 can

More information

Chapter 1: Preliminaries and Error Analysis

Chapter 1: Preliminaries and Error Analysis Chapter 1: Error Analysis Peter W. White white@tarleton.edu Department of Tarleton State University Summer 2015 / Numerical Analysis Overview We All Remember Calculus Derivatives: limit definition, sum

More information

Four Important Number Systems

Four Important Number Systems Four Important Number Systems System Why? Remarks Decimal Base 10: (10 fingers) Most used system Binary Base 2: On/Off systems 3-4 times more digits than decimal Octal Base 8: Shorthand notation for working

More information

Chapter 1: Solutions to Exercises

Chapter 1: Solutions to Exercises 1 DIGITAL ARITHMETIC Miloš D. Ercegovac and Tomás Lang Morgan Kaufmann Publishers, an imprint of Elsevier, c 2004 Exercise 1.1 (a) 1. 9 bits since 2 8 297 2 9 2. 3 radix-8 digits since 8 2 297 8 3 3. 3

More information

Numbers and Arithmetic

Numbers and Arithmetic Numbers and Arithmetic See: P&H Chapter 2.4 2.6, 3.2, C.5 C.6 Hakim Weatherspoon CS 3410, Spring 2013 Computer Science Cornell University Big Picture: Building a Processor memory inst register file alu

More information

Innocuous Double Rounding of Basic Arithmetic Operations

Innocuous Double Rounding of Basic Arithmetic Operations Innocuous Double Rounding of Basic Arithmetic Operations Pierre Roux ISAE, ONERA Double rounding occurs when a floating-point value is first rounded to an intermediate precision before being rounded to

More information

MATH ASSIGNMENT 03 SOLUTIONS

MATH ASSIGNMENT 03 SOLUTIONS MATH444.0 ASSIGNMENT 03 SOLUTIONS 4.3 Newton s method can be used to compute reciprocals, without division. To compute /R, let fx) = x R so that fx) = 0 when x = /R. Write down the Newton iteration for

More information

MATH Dr. Halimah Alshehri Dr. Halimah Alshehri

MATH Dr. Halimah Alshehri Dr. Halimah Alshehri MATH 1101 haalshehri@ksu.edu.sa 1 Introduction To Number Systems First Section: Binary System Second Section: Octal Number System Third Section: Hexadecimal System 2 Binary System 3 Binary System The binary

More information

Numerics and Error Analysis

Numerics and Error Analysis Numerics and Error Analysis CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Numerics and Error Analysis 1 / 30 A Puzzle What

More information

Logic and Computer Design Fundamentals. Chapter 5 Arithmetic Functions and Circuits

Logic and Computer Design Fundamentals. Chapter 5 Arithmetic Functions and Circuits Logic and Computer Design Fundamentals Chapter 5 Arithmetic Functions and Circuits Arithmetic functions Operate on binary vectors Use the same subfunction in each bit position Can design functional block

More information

Complex Logarithmic Number System Arithmetic Using High-Radix Redundant CORDIC Algorithms

Complex Logarithmic Number System Arithmetic Using High-Radix Redundant CORDIC Algorithms Complex Logarithmic Number System Arithmetic Using High-Radix Redundant CORDIC Algorithms David Lewis Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S

More information

CMP 334: Seventh Class

CMP 334: Seventh Class CMP 334: Seventh Class Performance HW 5 solution Averages and weighted averages (review) Amdahl's law Ripple-carry adder circuits Binary addition Half-adder circuits Full-adder circuits Subtraction, negative

More information

Representing signed numbers in Two s Complement notation

Representing signed numbers in Two s Complement notation EE457 Representing signed numbers in Two s Complement notation A way to view the signed number representation in 2 s complement notation as a positional weighted coefficient system. example ( 2) = (6 4

More information

Numerical Mathematical Analysis

Numerical Mathematical Analysis Numerical Mathematical Analysis Numerical Mathematical Analysis Catalin Trenchea Department of Mathematics University of Pittsburgh September 20, 2010 Numerical Mathematical Analysis Math 1070 Numerical

More information