Lecture 11. Advanced Dividers
1 Lecture 11 Advanced Dividers
2 Required Reading
Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design
Chapter 15, Variations in Dividers: 15.3, Combinational and Array Dividers
Chapter 16, Division by Convergence
3 Division versus Multiplication
Division is more complex than multiplication:
- Need for quotient digit selection or estimation
- Overflow possibility: the high-order k bits of z must be strictly less than d; this overflow check also detects the divide-by-zero condition.
Pentium III latencies
Instruction | Latency | Cycles/Issue
Load / Store | 3 | 1
Integer Multiply | 4 | 1
Integer Divide | (missing) | (missing)
Double/Single FP Multiply | 5 | 2
Double/Single FP Add | 3 | 1
Double/Single FP Divide | (missing) | (missing)
The ratios haven't changed much in later Pentiums, Atom, or AMD products*
*Source: T. Granlund, Instruction Latencies and Throughput for AMD and Intel x86 Processors, Feb. 2012
May 2012 Computer Arithmetic, Division Slide 3
4 Classification of Dividers
Sequential
  Radix-2: restoring; non-restoring
  High-radix: regular; SRT; using carry save adders; SRT using carry save adders
Array Dividers
Dividers by Convergence
5 Fractional Division
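6 Unsigned Fractional Division
$z_{frac}$  Dividend   $.z_{-1} z_{-2} \ldots z_{-(2k-1)} z_{-2k}$
$d_{frac}$  Divisor    $.d_{-1} d_{-2} \ldots d_{-(k-1)} d_{-k}$
$q_{frac}$  Quotient   $.q_{-1} q_{-2} \ldots q_{-(k-1)} q_{-k}$
$s_{frac}$  Remainder  $.000 \ldots 0\, s_{-(k+1)} \ldots s_{-(2k-1)} s_{-2k}$ (k leading zero bits)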
7 Integer vs. Fractional Division
For integers: $z = q\,d + s$
Multiplying both sides by $2^{-2k}$: $z\,2^{-2k} = (q\,2^{-k})(d\,2^{-k}) + s\,2^{-2k}$
For fractions: $z_{frac} = q_{frac}\,d_{frac} + s_{frac}$
where $z_{frac} = z\,2^{-2k}$, $d_{frac} = d\,2^{-k}$, $q_{frac} = q\,2^{-k}$, $s_{frac} = s\,2^{-2k}$
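The scaling between the integer and fractional views can be checked numerically. The sketch below is only an illustration: the helper name `as_fractions` and the operand values (z = 117, d = 10, k = 4) are assumptions chosen for the example, not taken from the slides.

```python
from fractions import Fraction

def as_fractions(z, d, k):
    """Map a 2k-bit integer dividend z and a k-bit integer divisor d to fractions."""
    q, s = divmod(z, d)                     # integer division: z = q*d + s
    z_frac = Fraction(z, 2 ** (2 * k))      # z_frac = z * 2^-2k
    d_frac = Fraction(d, 2 ** k)            # d_frac = d * 2^-k
    q_frac = Fraction(q, 2 ** k)            # q_frac = q * 2^-k
    s_frac = Fraction(s, 2 ** (2 * k))      # s_frac = s * 2^-2k
    return z_frac, d_frac, q_frac, s_frac

# Hypothetical operands: 8-bit dividend, 4-bit divisor
z_frac, d_frac, q_frac, s_frac = as_fractions(z=117, d=10, k=4)
print(z_frac == q_frac * d_frac + s_frac)   # fractional identity
print(z_frac < d_frac)                      # no-overflow condition
```

Exact rational arithmetic is used so that the identity can be compared with `==` rather than with a floating-point tolerance.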
8 Unsigned Fractional Division Overflow
Condition for no overflow: $z_{frac} < d_{frac}$
9 Sequential Fractional Division: Basic Equations
$s^{(0)} = z_{frac}$
$s^{(j)} = 2 s^{(j-1)} - q_{-j}\, d_{frac}$ for $j = 1, \ldots, k$
$2^{k} s_{frac} = s^{(k)}$, that is, $s_{frac} = 2^{-k} s^{(k)}$
10 Fig. Examples of sequential division with integer and fractional operands.
11 Array Dividers
12 Sequential Fractional Division: Basic Equations
$s^{(0)} = z_{frac}$
$s^{(j)} = 2 s^{(j-1)} - q_{-j}\, d_{frac}$
$s^{(k)} = 2^{k} s_{frac}$
13 Restoring Unsigned Fractional Division
s(0) = z
for j = 1 to k
    if 2 s(j-1) - d ≥ 0
        q_-j = 1
        s(j) = 2 s(j-1) - d
    else
        q_-j = 0
        s(j) = 2 s(j-1)
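A minimal Python sketch of this restoring recurrence follows. It assumes fractional operands with 0 ≤ z < d < 1 and accepts a trial difference of exactly zero (a non-negative test), so that an exact division yields a zero remainder. The function name and the sample operands are illustrative only.

```python
from fractions import Fraction

def restoring_divide(z, d, k):
    """Return (q, s_frac) with z = q*d + s_frac, for fractions 0 <= z < d < 1."""
    assert 0 <= z < d < 1, "overflow check: need z_frac < d_frac"
    s = Fraction(z)                      # s(0) = z
    q = Fraction(0)
    for j in range(1, k + 1):
        trial = 2 * s - d                # trial subtraction
        if trial >= 0:
            q += Fraction(1, 2 ** j)     # q_-j = 1
            s = trial
        else:
            s = 2 * s                    # q_-j = 0: keep the shifted value
    return q, s / 2 ** k                 # s_frac = 2^-k * s(k)

# Hypothetical operands: z = 117/256, d = 10/16, k = 4 quotient bits
q, s_frac = restoring_divide(Fraction(117, 256), Fraction(10, 16), k=4)
print(q, s_frac)
```

In hardware the "restore" step is usually a multiplexer that selects either the trial difference or the shifted partial remainder; the `else` branch above plays that role.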
14 Restoring Array Divider
[Figure: restoring array divider; each cell is built around a full subtractor (FS)]
Dividend $z = .z_1 z_2 z_3 z_4 z_5 z_6$
Divisor $d = .d_1 d_2 d_3$
Quotient $q = .q_1 q_2 q_3$
Remainder $s = .0\,0\,0\,s_4 s_5 s_6$
15 Non-Restoring Unsigned Fractional Division
s(-1) = z - d
for j = 0 to k-1
    if s(j-1) ≥ 0
        q_-j = 1
        s(j) = 2 s(j-1) - d
    else
        q_-j = 0
        s(j) = 2 s(j-1) + d
end for
if s(k-1) ≥ 0
    q_-k = 1
else
    q_-k = 0
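The non-restoring recurrence can be sketched in Python as follows. This is an illustration under stated assumptions: operands are fractions with 0 ≤ z < d, the sign test treats zero as non-negative, and no final remainder correction is performed (the slide does not show one). The names `nonrestoring_divide` and `bits_to_fraction` are hypothetical.

```python
from fractions import Fraction

def nonrestoring_divide(z, d, k):
    """Return quotient bits [q_0, q_-1, ..., q_-k] for fractions 0 <= z < d."""
    s = Fraction(z) - d                  # s(-1) = z - d
    bits = []
    for _ in range(k):                   # j = 0 .. k-1
        if s >= 0:
            bits.append(1)               # q_-j = 1, then subtract d
            s = 2 * s - d
        else:
            bits.append(0)               # q_-j = 0, then add d back
            s = 2 * s + d
    bits.append(1 if s >= 0 else 0)      # q_-k from the sign of s(k-1)
    return bits

def bits_to_fraction(bits):
    """Interpret [q_0, q_-1, ...] as the binary number q_0.q_-1 q_-2 ..."""
    return sum(Fraction(b, 2 ** i) for i, b in enumerate(bits))

# Hypothetical operands: z = 117/256, d = 10/16, k = 4
bits = nonrestoring_divide(Fraction(117, 256), Fraction(10, 16), k=4)
print(bits, bits_to_fraction(bits))
```

Each step performs exactly one addition or subtraction, which is why the array version needs only an XOR gate and a full adder per cell.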
16 Nonrestoring Array Divider
[Figure: nonrestoring array divider; each cell contains an XOR gate and a full adder (FA); the critical path is marked]
Similarity to array multiplier is deceiving
Dividend $z = z_0.z_1 z_2 z_3 z_4 z_5 z_6$
Divisor $d = d_0.d_1 d_2 d_3$
Quotient $q = q_0.q_1 q_2 q_3$
Remainder $s$: low-order bits $s_3 s_4 s_5 s_6$
17 Division by Convergence
18 Division by Convergence
Chapter Goals
Show how, by using multiplication as the basic operation in each division step, the number of iterations can be reduced
Chapter Highlights
- Digit-recurrence as convergence method
- Convergence by Newton-Raphson iteration
- Computing the reciprocal of a number
- Hardware implementation and fine tuning
19 16.1 General Convergence Methods
Sequential digit-at-a-time (binary or high-radix) division can be viewed as a convergence scheme
As each new digit of $q = z/d$ is determined, the quotient value is refined, until it reaches the final correct value
Convergence is from below in restoring division and oscillating in nonrestoring division
Meanwhile, the remainder $s = z - q\,d$ approaches 0; the scaled remainder is kept in a certain range, such as $[-d, d)$
20 Elaboration on Scaled Remainder in Division
The partial remainder $s^{(j)}$ in the division recurrence isn't the true remainder but a version scaled by $2^{j}$
Division with left shifts:
$s^{(j)} = 2 s^{(j-1)} - q_{k-j}\,(2^{k} d)$, with $s^{(0)} = z$ and $s^{(k)} = 2^{k} s$
(the term $2 s^{(j-1)}$ is the shift; the term $q_{k-j}\,(2^{k} d)$ is the subtraction)
Quotient digit selection keeps the scaled remainder bounded (say, in the range $-d$ to $d$) to ensure the convergence of the true remainder to 0
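The left-shift recurrence for integer operands can be traced with a short Python sketch. It assumes a 2k-bit dividend, a k-bit divisor, and restoring-style digit selection from {0, 1}; the function name and operands are illustrative and not part of the slides.

```python
def shift_subtract_divide(z, d, k):
    """Divide a 2k-bit integer z by a k-bit integer d; return (q, s)."""
    assert d > 0 and (z >> k) < d, "overflow: high-order k bits of z must be < d"
    s = z                                    # s(0) = z
    q = 0
    for _ in range(k):                       # j = 1 .. k
        s = 2 * s                            # shift
        digit = 1 if s >= (d << k) else 0    # quotient digit q_{k-j}
        s -= digit * (d << k)                # subtract q_{k-j} * 2^k * d
        q = (q << 1) | digit
    return q, s >> k                         # s(k) = 2^k * s, so unscale by 2^k

# Hypothetical operands: 8-bit dividend 117, 4-bit divisor 10
print(shift_subtract_divide(117, 10, k=4))
print(divmod(117, 10))
```

The partial remainder `s` inside the loop is the scaled quantity; only the final right shift by k recovers the true remainder.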
21 Recurrence Formulas for Convergence Methods
Two-variable form:
$u^{(i+1)} = f(u^{(i)}, v^{(i)})$ (converges to a constant)
$v^{(i+1)} = g(u^{(i)}, v^{(i)})$ (converges to the desired function)
Three-variable form:
$u^{(i+1)} = f(u^{(i)}, v^{(i)}, w^{(i)})$
$v^{(i+1)} = g(u^{(i)}, v^{(i)}, w^{(i)})$
$w^{(i+1)} = h(u^{(i)}, v^{(i)}, w^{(i)})$
Guide the iteration such that one of the values converges to a constant (usually 0 or 1)
The other value then converges to the desired function
The complexity of this method depends on two factors:
a. Ease of evaluating f and g (and h)
b. Rate of convergence (number of iterations needed)
22 16.2 Division by Repeated Multiplications
Motivation: Suppose add takes 1 clock and multiply 3 clocks
64-bit divide takes 64 clocks in radix 2, 32 in radix 4
Division via multiplications is therefore faster if 10 or fewer multiplications are needed
Idea: $q = \frac{z}{d} = \frac{z\,x^{(0)} x^{(1)} \cdots x^{(m-1)}}{d\,x^{(0)} x^{(1)} \cdots x^{(m-1)}}$, where the numerator converges to q and the denominator is forced to 1
Remainder often not needed, but can be obtained by another multiplication if desired: $s = z - q\,d$
To turn the identity into a division algorithm, we face three questions:
1. How to select the multipliers $x^{(i)}$?
2. How many iterations (pairs of multiplications)?
3. How to implement in hardware?
23 Formulation as a Convergence Computation
Idea: $q = \frac{z}{d} = \frac{z\,x^{(0)} x^{(1)} \cdots x^{(m-1)}}{d\,x^{(0)} x^{(1)} \cdots x^{(m-1)}}$, where the numerator converges to q and the denominator is forced to 1
$d^{(i+1)} = d^{(i)} x^{(i)}$   Set $d^{(0)} = d$; make $d^{(m)}$ converge to 1
$z^{(i+1)} = z^{(i)} x^{(i)}$   Set $z^{(0)} = z$; obtain $z/d = q \cong z^{(m)}$
Question 1: How to select the multipliers $x^{(i)}$?
$x^{(i)} = 2 - d^{(i)}$
This choice transforms the recurrence equations into:
$d^{(i+1)} = d^{(i)} (2 - d^{(i)})$   Set $d^{(0)} = d$; iterate until $d^{(m)} \cong 1$
$z^{(i+1)} = z^{(i)} (2 - d^{(i)})$   Set $z^{(0)} = z$; obtain $z/d = q \cong z^{(m)}$
This fits the general form $u^{(i+1)} = f(u^{(i)}, v^{(i)})$, $v^{(i+1)} = g(u^{(i)}, v^{(i)})$
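The pair of recurrences can be exercised with a small Python sketch. It assumes 1/2 ≤ d < 1 and uses exact rational arithmetic, so no truncation effects of a real multiplier are modeled; the function name and the sample operands are illustrative.

```python
from fractions import Fraction

def divide_by_repeated_multiplication(z, d, m):
    """Approximate z/d for 1/2 <= d < 1 using m multipliers x(i) = 2 - d(i)."""
    z_i, d_i = Fraction(z), Fraction(d)     # z(0) = z, d(0) = d
    for _ in range(m):
        x_i = 2 - d_i                       # 2's complementation of d(i)
        d_i = d_i * x_i                     # d(i+1) = d(i) * (2 - d(i))
        z_i = z_i * x_i                     # z(i+1) = z(i) * (2 - d(i))
    return z_i, d_i                         # z(m) approximates q, d(m) approaches 1

# Hypothetical operands: z = 117/256, d = 5/8, m = 4 iterations
z_m, d_m = divide_by_repeated_multiplication(Fraction(117, 256), Fraction(5, 8), m=4)
print(float(z_m), float(Fraction(117, 256) / Fraction(5, 8)))
print(float(1 - d_m))
```

The last update of `d_i` is computed here only to expose how close the denominator gets to 1; a hardware implementation can skip it.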
24 Determining the Rate of Convergence
$d^{(i+1)} = d^{(i)} x^{(i)}$   Set $d^{(0)} = d$; make $d^{(m)}$ converge to 1
$z^{(i+1)} = z^{(i)} x^{(i)}$   Set $z^{(0)} = z$; obtain $z/d = q \cong z^{(m)}$
Question 2: How quickly does $d^{(i)}$ converge to 1?
We can relate the error in step i + 1 to the error in step i:
$d^{(i+1)} = d^{(i)} (2 - d^{(i)}) = 1 - (1 - d^{(i)})^{2}$
$1 - d^{(i+1)} = (1 - d^{(i)})^{2}$
For $1 - d^{(i)} \le \varepsilon$, we get $1 - d^{(i+1)} \le \varepsilon^{2}$: quadratic convergence
In general, for k-bit operands, we need $2m - 1$ multiplications and $m$ 2's complementations, where $m = \lceil \log_2 k \rceil$
25 Quadratic Convergence
Table 16.1 Quadratic convergence in computing z/d by repeated multiplications, where $1/2 \le d = 1 - y < 1$
i | $d^{(i)} = d^{(i-1)} x^{(i-1)}$, with $d^{(0)} = d$ | $x^{(i)} = 2 - d^{(i)}$
0 | $1 - y = (.1xxx\ xxxx\ xxxx\ xxxx)_{two} \ge 1/2$ | $1 + y$
1 | $1 - y^{2} = (.11xx\ xxxx\ xxxx\ xxxx)_{two} \ge 3/4$ | $1 + y^{2}$
2 | $1 - y^{4} = (.1111\ xxxx\ xxxx\ xxxx)_{two} \ge 15/16$ | $1 + y^{4}$
3 | $1 - y^{8} = (.1111\ 1111\ xxxx\ xxxx)_{two} \ge 255/256$ | $1 + y^{8}$
4 | $1 - y^{16} = (.1111\ 1111\ 1111\ 1111)_{two} = 1 - ulp$ |
Each iteration doubles the number of guaranteed leading 1s (convergence to 1 is from below)
Beginning with a single 1 ($d \ge 1/2$), after $\log_2 k$ iterations we get as close to 1 as is possible in a fractional representation
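The doubling of guaranteed leading 1s can be observed directly. The sketch below is an illustration only: it assumes d = 5/8 (so y = 3/8), inspects the first 64 fraction bits of each $d^{(i)}$, and uses hypothetical helper names.

```python
from fractions import Fraction

def leading_ones(value, width=64):
    """Count leading 1 bits in the width-bit binary fraction of 0 <= value < 1."""
    bits = int(value * 2 ** width)               # truncate to `width` fraction bits
    text = format(bits, f"0{width}b")
    return len(text) - len(text.lstrip("1"))

def ones_per_iteration(d, steps):
    """Leading-1 counts of d(0), d(1), ..., d(steps-1) under d(i+1) = d(i)(2 - d(i))."""
    counts = []
    d_i = Fraction(d)
    for _ in range(steps):
        counts.append(leading_ones(d_i))
        d_i = d_i * (2 - d_i)
    return counts

counts = ones_per_iteration(Fraction(5, 8), steps=5)
print(counts)   # entry i is at least 2**i
```

The observed counts can exceed the guaranteed 2^i because y = 3/8 is strictly below the worst case y = 1/2.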
26 Graphical Depiction of Convergence to q
[Figure: over iterations i, $d^{(i)}$ rises from d toward $1 - ulp$ while $z^{(i)}$ rises from z toward $q - \varepsilon$]
Fig. Graphical representation of convergence in division by repeated multiplications.
27 16.5 Hardware Implementation
Repeated multiplications: each pair of operations involves the same multiplier
$d^{(i+1)} = d^{(i)} (2 - d^{(i)})$   Set $d^{(0)} = d$; iterate until $d^{(m)} \cong 1$
$z^{(i+1)} = z^{(i)} (2 - d^{(i)})$   Set $z^{(0)} = z$; obtain $z/d = q \cong z^{(m)}$
[Figure: timing of the products $d^{(i)} x^{(i)}$ and $z^{(i)} x^{(i)}$ in the multiplier, with a 2's-complement unit forming $x^{(i+1)}$ from $d^{(i+1)}$]
Fig. Two multiplications fully overlapped in a 2-stage pipelined multiplier.
28 16.3 Division by Reciprocation
The Newton-Raphson method can be used for finding a root of f(x) = 0
Start with an initial estimate $x^{(0)}$ for the root
Iteratively refine the estimate via the recurrence
$x^{(i+1)} = x^{(i)} - f(x^{(i)}) / f'(x^{(i)})$
Justification: $\tan \alpha^{(i)} = f'(x^{(i)}) = f(x^{(i)}) / (x^{(i)} - x^{(i+1)})$
[Figure: the tangent to f(x) at $x^{(i)}$, at angle $\alpha^{(i)}$, crosses the x axis at $x^{(i+1)}$; successive estimates $x^{(i+1)}$, $x^{(i+2)}$ approach the root]
Fig. Convergence to a root of f(x) = 0 in the Newton-Raphson method.
29 Computing 1/d by Convergence
1/d is the root of $f(x) = 1/x - d$, with $f'(x) = -1/x^{2}$
Substitute in the Newton-Raphson recurrence $x^{(i+1)} = x^{(i)} - f(x^{(i)}) / f'(x^{(i)})$ to get:
$x^{(i+1)} = x^{(i)} (2 - x^{(i)} d)$
One iteration = two multiplications + one 2's complementation
Error analysis: let $\delta^{(i)} = 1/d - x^{(i)}$ be the error at the ith iteration
$\delta^{(i+1)} = 1/d - x^{(i+1)} = 1/d - x^{(i)} (2 - x^{(i)} d) = d\,(1/d - x^{(i)})^{2} = d\,(\delta^{(i)})^{2}$
Because d < 1, we have $\delta^{(i+1)} < (\delta^{(i)})^{2}$
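The reciprocal iteration and its error relation can be checked with a short Python sketch. It assumes exact rational arithmetic and an arbitrary starting value inside the convergence range; the function name and operands are illustrative.

```python
from fractions import Fraction

def reciprocal_newton(d, x0, iterations):
    """Iterate x(i+1) = x(i) * (2 - x(i) * d); return the estimate and all errors."""
    x = Fraction(x0)
    errors = [1 / d - x]                    # delta(0) = 1/d - x(0)
    for _ in range(iterations):
        x = x * (2 - x * d)                 # two multiplications, one complementation
        errors.append(1 / d - x)            # delta(i+1)
    return x, errors

# Hypothetical operands: d = 5/8 with the simple initial estimate x(0) = 1.5
d = Fraction(5, 8)
x, errors = reciprocal_newton(d, Fraction(3, 2), iterations=4)
print([float(e) for e in errors])
```

Every error after the first is non-negative, since $\delta^{(i+1)} = d\,(\delta^{(i)})^{2}$: the estimates approach 1/d from below regardless of the sign of the initial error.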
30 Choosing the Initial Approximation to 1/d
With $x^{(0)}$ in the range $0 < x^{(0)} < 2/d$, convergence is guaranteed
Justification: $|\delta^{(0)}| = |x^{(0)} - 1/d| < 1/d$
$|\delta^{(1)}| = |x^{(1)} - 1/d| = d\,(\delta^{(0)})^{2} = (d\,|\delta^{(0)}|)\,|\delta^{(0)}| < |\delta^{(0)}|$
For d in [1/2, 1):
Simple choice: $x^{(0)} = 1.5$; max error = 0.5 < 1/d
Better approximation: $x^{(0)} = 4(\sqrt{3} - 1) - 2d = 2.9282 - 2d$; max error ≈ 0.1
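The two starting approximations can be compared numerically. The sketch below sweeps d over [1/2, 1) on a uniform grid and records the largest absolute error of each choice; the grid size and helper name are assumptions made for the illustration.

```python
import math

def max_error(x0_of_d, samples=10_000):
    """Largest |1/d - x0(d)| over a uniform grid of d values in [1/2, 1)."""
    worst = 0.0
    for n in range(samples):
        d = 0.5 + 0.5 * n / samples          # d sweeps [1/2, 1)
        worst = max(worst, abs(1.0 / d - x0_of_d(d)))
    return worst

constant_error = max_error(lambda d: 1.5)
linear_error = max_error(lambda d: 4 * (math.sqrt(3) - 1) - 2 * d)
print(constant_error, linear_error)
```

A smaller initial error translates into fewer iterations, because each Newton-Raphson step roughly squares the error.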
31 16.4 Speedup of Convergence Division
$q = \frac{z}{d} = \frac{z\,x^{(0)} x^{(1)} \cdots x^{(m-1)}}{d\,x^{(0)} x^{(1)} \cdots x^{(m-1)}}$
Compute y = 1/d; do the multiplication yz
Division can be performed via $2 \lceil \log_2 k \rceil - 1$ multiplications
This is not yet very impressive: 64-bit numbers, 3-ns multiplier, 33-ns division
Three types of speedup are possible:
- Fewer multiplications (reduce m)
- Narrower multiplications (reduce the width of some $x^{(i)}$s)
- Faster multiplications
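As a quick check of the 33-ns figure, using only the operand width and multiplier delay quoted on the slide:

```latex
\[
m = \lceil \log_2 64 \rceil = 6, \qquad
2m - 1 = 11 \ \text{multiplications}, \qquad
11 \times 3\ \text{ns} = 33\ \text{ns}
\]
```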
32 Initial Approximation via Table Lookup
Convergence is slow in the beginning: it takes 6 multiplications to get 8 bits of convergence and another 5 to go from 8 bits to 64 bits
$d\,x^{(0)} x^{(1)} x^{(2)} = (0.1111\ 1111 \ldots)_{two}$, where the product $x^{(0)} x^{(1)} x^{(2)}$ is denoted $x^{(0+)}$
Read this value, $x^{(0+)}$, directly from a table, thereby reducing 6 multiplications to 2
A $2^{w} \times w$ lookup table is necessary and sufficient for w bits of convergence after 2 multiplications
Example with 4-bit lookup: $d = (0.1011\ xxxx \ldots)_{two}$ ($11/16 \le d < 12/16$)
Inverses of the two extremes are $16/11 \cong (1.0111)_{two}$ and $16/12 \cong (1.0101)_{two}$
So, $(1.0110)_{two}$ is a good estimate for 1/d
$1.0110 \times 0.1011 = (11/8) \times (11/16) = 121/128 = (0.1111001)_{two}$
$1.0110 \times 0.1100 = (11/8) \times (3/4) = 33/32 = (1.00001)_{two}$
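A reciprocal table of this kind might be generated as sketched below. The construction rule is an assumption made for illustration (each entry is the reciprocal of the interval midpoint, rounded to w fraction bits); the slides do not say how the table contents are chosen, and the function name is hypothetical.

```python
from fractions import Fraction

def build_reciprocal_table(w):
    """Map the first w fraction bits of d (1/2 <= d < 1) to an estimate of 1/d."""
    table = {}
    for index in range(2 ** (w - 1), 2 ** w):            # leading bit of d is 1
        midpoint = Fraction(2 * index + 1, 2 ** (w + 1)) # middle of the d interval
        scaled = round(2 ** w / midpoint)                # round to w fraction bits
        table[index] = Fraction(scaled, 2 ** w)
    return table

table = build_reciprocal_table(4)
estimate = table[0b1011]                                 # d = 0.1011 xxxx...
print(estimate)
print(estimate * Fraction(11, 16), estimate * Fraction(12, 16))
```

For the 4-bit example on the slide, this rule reproduces the estimate 11/8, and the two products bracket 1 closely on both sides.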
33 Visualizing the Convergence with Table Lookup
[Figure: over the iterations, d approaches $1 - ulp$ and z approaches $q - \varepsilon$; one point is marked "After table lookup and 1st pair of multiplications, replacing several iterations" and the next "After the 2nd pair of multiplications"]
Fig. Convergence in division by repeated multiplications with initial table lookup.
34 Convergence Does Not Have to Be from Below
[Figure: over the iterations, d approaches $1 \pm ulp$ and z approaches $q \pm \varepsilon$]
Fig. Convergence in division by repeated multiplications with initial table lookup and the use of truncated multiplicative factors.
35 Sequential Dividers with Carry-Save Adders
36 Block diagram of a radix-2 SRT divider with partial remainder in stored-carry form
37 Pentium bug (1)
October 1994: Thomas Nicely, Lynchburg College, Virginia, finds an error in his computer calculations and traces it back to the Pentium processor
November 7, 1994: First press announcement, Electronic Engineering Times
Late 1994: Tim Coe, Vitesse Semiconductor, presents an example with the worst-case error: c = … / …, Pentium = …, Correct result = …
38 Pentium bug (2)
Intel admits subtle flaw
November 30, 1994: Intel's white paper about the bug and its possible consequences
Intel: average spreadsheet user affected once in 27,000 years
IBM: average spreadsheet user affected once every 24 days
Replacements based on customer needs
December 20, 1994: Announcement of no-questions-asked replacements
39 Pentium bug (3)
Error traced back to the look-up table used by the radix-4 SRT division algorithm
2048 cells, 1066 non-zero values {-2, -1, 1, 2}
5 non-zero values not downloaded correctly to the lookup table due to an error in the C script
41 Follow-up Courses
42 DIGITAL SYSTEMS DESIGN
1. ECE 681 VLSI Design for ASICs (Fall semesters): H. Homayoun, project/lab, front-end and back-end ASIC design with Synopsys tools
2. ECE 699 Digital Signal Processing Hardware Architectures (Spring semesters): A. Cohen, project, FPGA design for DSP
3. ECE 682 VLSI Test Concepts (Spring semesters): T. Storey, homework
43 NETWORK AND SYSTEM SECURITY
1. ECE 646 Cryptography and Computer Network Security (Fall semesters): K. Gaj, hardware, software, or analytical project
2. ECE 746 Advanced Applied Cryptography (Spring semesters): J.-P. Kaps, hardware, software, or analytical project
3. ECE 899 Cryptographic Engineering (Spring semesters): J.-P. Kaps, research-oriented project
More informationChapter 2 Boolean Algebra and Logic Gates
Chapter 2 Boolean Algebra and Logic Gates The most common postulates used to formulate various algebraic structures are: 1. Closure. N={1,2,3,4 }, for any a,b N we obtain a unique c N by the operation
More informationOn the computation of the reciprocal of floating point expansions using an adapted Newton-Raphson iteration
On the computation of the reciprocal of floating point expansions using an adapted Newton-Raphson iteration Mioara Joldes, Valentina Popescu, Jean-Michel Muller ASAP 2014 1 / 10 Motivation Numerical problems
More informationFaster arithmetic for number-theoretic transforms
University of New South Wales 7th October 2011, Macquarie University Plan for talk 1. Review number-theoretic transform (NTT) 2. Discuss typical butterfly algorithm 3. Improvements to butterfly algorithm
More informationCSE 241 Digital Systems Spring 2013
CSE 241 Digital Systems Spring 2013 Instructor: Prof. Kui Ren Department of Computer Science and Engineering Lecture slides modified from many online resources and used solely for the educational purpose.
More informationCMP 334: Seventh Class
CMP 334: Seventh Class Performance HW 5 solution Averages and weighted averages (review) Amdahl's law Ripple-carry adder circuits Binary addition Half-adder circuits Full-adder circuits Subtraction, negative
More informationDigital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.
Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEMORY INPUT-OUTPUT CONTROL DATAPATH
More informationSample Marking Scheme
Page 1 of 10 School of Computer Science 60-265-01 Computer Architecture and Digital Design Fall 2008 Midterm Examination # 1 B Wednesday, November 5, 2008 Sample Marking Scheme Duration of examination:
More informationSOLUTION OF ALGEBRAIC AND TRANSCENDENTAL EQUATIONS BISECTION METHOD
BISECTION METHOD If a function f(x) is continuous between a and b, and f(a) and f(b) are of opposite signs, then there exists at least one root between a and b. It is shown graphically as, Let f a be negative
More informationMultiplication of signed-operands
Multiplication of signed-operands Recall we discussed multiplication of unsigned numbers: Combinatorial array multiplier. Sequential multiplier. Need an approach that works uniformly with unsigned and
More informationVLSI Signal Processing
VLSI Signal Processing Lecture 1 Pipelining & Retiming ADSP Lecture1 - Pipelining & Retiming (cwliu@twins.ee.nctu.edu.tw) 1-1 Introduction DSP System Real time requirement Data driven synchronized by data
More informationDesign of Sequential Circuits
Design of Sequential Circuits Seven Steps: Construct a state diagram (showing contents of flip flop and inputs with next state) Assign letter variables to each flip flop and each input and output variable
More informationChapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1>
Chapter 5 Digital Design and Computer Architecture, 2 nd Edition David Money Harris and Sarah L. Harris Chapter 5 Chapter 5 :: Topics Introduction Arithmetic Circuits umber Systems Sequential Building
More informationUNSIGNED BINARY NUMBERS DIGITAL ELECTRONICS SYSTEM DESIGN WHAT ABOUT NEGATIVE NUMBERS? BINARY ADDITION 11/9/2018
DIGITAL ELECTRONICS SYSTEM DESIGN LL 2018 PROFS. IRIS BAHAR & ROD BERESFORD NOVEMBER 9, 2018 LECTURE 19: BINARY ADDITION, UNSIGNED BINARY NUMBERS For the binary number b n-1 b n-2 b 1 b 0. b -1 b -2 b
More informationINF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)
INF2270 Spring 2010 Philipp Häfliger Summary/Repetition (1/2) content From Scalar to Superscalar Lecture Summary and Brief Repetition Binary numbers Boolean Algebra Combinational Logic Circuits Encoder/Decoder
More informationOverview. Arithmetic circuits. Binary half adder. Binary full adder. Last lecture PLDs ROMs Tristates Design examples
Overview rithmetic circuits Last lecture PLDs ROMs Tristates Design examples Today dders Ripple-carry Carry-lookahead Carry-select The conclusion of combinational logic!!! General-purpose building blocks
More informationPanHomc'r I'rui;* :".>r '.a'' W"»' I'fltolt. 'j'l :. r... Jnfii<on. Kslaiaaac. <.T i.. %.. 1 >
5 28 (x / &» )»(»»» Q ( 3 Q» (» ( (3 5» ( q 2 5 q 2 5 5 8) 5 2 2 ) ~ ( / x {» /»»»»» (»»» ( 3 ) / & Q ) X ] Q & X X X x» 8 ( &» 2 & % X ) 8 x & X ( #»»q 3 ( ) & X 3 / Q X»»» %» ( z 22 (»» 2» }» / & 2 X