Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Similar documents
Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits

Digital Integrated Circuits A Design Perspective

Bit-Sliced Design. EECS 141 F01 Arithmetic Circuits. A Generic Digital Processor. Full-Adder. The Binary Adder

CMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design

Arithmetic Building Blocks

Hw 6 due Thursday, Nov 3, 5pm No lab this week

VLSI Design. [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1

EE141-Fall 2010 Digital Integrated Circuits. Announcements. An Intel Microprocessor. Bit-Sliced Design. Class Material. Last lecture.

EECS 427 Lecture 8: Adders Readings: EECS 427 F09 Lecture 8 1. Reminders. HW3 project initial proposal: due Wednesday 10/7

CSE477 VLSI Digital Circuits Fall Lecture 20: Adder Design

EE141- Spring 2004 Digital Integrated Circuits

Homework 4 due today Quiz #4 today In class (80min) final exam on April 29 Project reports due on May 4. Project presentations May 5, 1-4pm

CMPEN 411 VLSI Digital Circuits Spring Lecture 21: Shifters, Decoders, Muxes

VLSI Design I; A. Milenkovic 1

Arithmetic Circuits-2

L15: Custom and ASIC VLSI Integration

Arithmetic Circuits-2

Design of System Elements. Basics of VLSI

Arithmetic Circuits-2

Where are we? Data Path Design

Where are we? Data Path Design. Bit Slice Design. Bit Slice Design. Bit Slice Plan

The Inverter. Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1>

Chapter 5 Arithmetic Circuits

Hardware Design I Chap. 4 Representative combinational logic

ISSN (PRINT): , (ONLINE): , VOLUME-4, ISSUE-10,

CS 140 Lecture 14 Standard Combinational Modules

Lecture 4. Adders. Computer Systems Laboratory Stanford University

CSE140: Components and Design Techniques for Digital Systems. Decoders, adders, comparators, multipliers and other ALU elements. Tajana Simunic Rosing

EE141. Lecture 28 Multipliers. Lecture #20. Project Phase 2 Posted. Sign up for one of three project goals today

L8/9: Arithmetic Structures

Digital Integrated Circuits A Design Perspective

ARITHMETIC COMBINATIONAL MODULES AND NETWORKS

CSE140: Components and Design Techniques for Digital Systems. Logic minimization algorithm summary. Instructor: Mohsen Imani UC San Diego

EFFICIENT MULTIOUTPUT CARRY LOOK-AHEAD ADDERS

1 Short adders. t total_ripple8 = t first + 6*t middle + t last = 4t p + 6*2t p + 2t p = 18t p

ALUs and Data Paths. Subtitle: How to design the data path of a processor. 1/8/ L3 Data Path Design Copyright Joanne DeGroat, ECE, OSU 1

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Sciences

DIGITAL TECHNICS. Dr. Bálint Pődör. Óbuda University, Microelectronics and Technology Institute

Lecture 7: Logic design. Combinational logic circuits

ECEN 248: INTRODUCTION TO DIGITAL SYSTEMS DESIGN. Week 9 Dr. Srinivas Shakkottai Dept. of Electrical and Computer Engineering

CSE 140 Lecture 11 Standard Combinational Modules. CK Cheng and Diba Mirza CSE Dept. UC San Diego

Chapter 7. VLSI System Components

Lecture 18: Datapath Functional Units

Part II Addition / Subtraction

Adders, subtractors comparators, multipliers and other ALU elements

Part II Addition / Subtraction

Name: Answers. Mean: 83, Standard Deviation: 12 Q1 Q2 Q3 Q4 Q5 Q6 Total. ESE370 Fall 2015

Floating Point Representation and Digital Logic. Lecture 11 CS301

Menu. 7-Segment LED. Misc. 7-Segment LED MSI Components >MUX >Adders Memory Devices >D-FF, RAM, ROM Computer/Microprocessor >GCPU

Integrated Circuits & Systems

Lecture 12: Datapath Functional Units

EECS150 - Digital Design Lecture 24 - Arithmetic Blocks, Part 2 + Shifters

Adders, subtractors comparators, multipliers and other ALU elements

Digital Logic. CS211 Computer Architecture. l Topics. l Transistors (Design & Types) l Logic Gates. l Combinational Circuits.

CprE 281: Digital Logic

Timing Issues. Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolić. January 2003

CMOS logic gates. João Canas Ferreira. March University of Porto Faculty of Engineering

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING QUESTION BANK

9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017

Integer Multipliers 1

ECE 545 Digital System Design with VHDL Lecture 1. Digital Logic Refresher Part A Combinational Logic Building Blocks

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis

EE115C Digital Electronic Circuits Homework #6

Combinational Logic. By : Ali Mustafa

Combinational Logic. Mantıksal Tasarım BBM231. section instructor: Ufuk Çelikcan

CprE 281: Digital Logic

Digital Integrated Circuits A Design Perspective

VLSI Arithmetic. Lecture 9: Carry-Save and Multi-Operand Addition. Prof. Vojin G. Oklobdzija University of California

Review. EECS Components and Design Techniques for Digital Systems. Lec 18 Arithmetic II (Multiplication) Computer Number Systems

Area-Time Optimal Adder with Relative Placement Generator

Logarithmic Circuits

Digital Integrated Circuits A Design Perspective

EECS150 - Digital Design Lecture 22 - Arithmetic Blocks, Part 1

Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. November Digital Integrated Circuits 2nd Sequential Circuits

EE241 - Spring 2001 Advanced Digital Integrated Circuits

Digital Integrated Circuits A Design Perspective

Fundamentals of Computer Systems

Lecture 12: Datapath Functional Units

Full Adder Ripple Carry Adder Carry-Look-Ahead Adder Manchester Adders Carry Select Adder

EECS150 - Digital Design Lecture 25 Shifters and Counters. Recap

Lecture 34: Portable Systems Technology Background Professor Randy H. Katz Computer Science 252 Fall 1995

CSEE 3827: Fundamentals of Computer Systems. Combinational Circuits

CMOS Digital Integrated Circuits Lec 10 Combinational CMOS Logic Circuits

I. INTRODUCTION. CMOS Technology: An Introduction to QCA Technology As an. T. Srinivasa Padmaja, C. M. Sri Priya

ECE 645: Lecture 3. Conditional-Sum Adders and Parallel Prefix Network Adders. FPGA Optimized Adders

Computer Architecture 10. Fast Adders

Midterm Exam Two is scheduled on April 8 in class. On March 27 I will help you prepare Midterm Exam Two.

Topics. Dynamic CMOS Sequential Design Memory and Control. John A. Chandy Dept. of Electrical and Computer Engineering University of Connecticut

S No. Questions Bloom s Taxonomy Level UNIT-I

Very Large Scale Integration (VLSI)

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Digital Logic

Overview. Arithmetic circuits. Binary half adder. Binary full adder. Last lecture PLDs ROMs Tristates Design examples

Vectorized 128-bit Input FP16/FP32/ FP64 Floating-Point Multiplier

Semiconductor Memories

Xarxes de distribució del senyal de. interferència electromagnètica, consum, soroll de conmutació.

Design and Implementation of Carry Tree Adders using Low Power FPGAs

LOGIC CIRCUITS. Basic Experiment and Design of Electronics. Ho Kyung Kim, Ph.D.

Datapath Component Tradeoffs

Transcription:

Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1

A Generic Digital Processor MEM ORY INPUT-OUTPUT CONTROL DATAPATH 2

Building Blocks for Digital Architectures Arithmetic unit - Bit-sliced datapath (adder, multiplier, shifter, comparator, etc. Memory - RAM, ROM, Buffers, Shift registers Control - Finite state machine (PLA, random logic. - Counters Interconnect - Switches - Arbiters - Bus 3

An Intel Microprocessor 9-1 Mux 5-1 Mux a CARRYGEN g64 node1 ck1 SUMSEL REG sum sumb to Cache 9-1 Mux 2-1 Mux b SUMGEN + LU s0 s1 LU : Logical Unit 1000um Itanium has 6 integer execution units like this 4

Bit-Sliced Design Control Bit 3 Data-In Register Adder Shifter Multiplexer Bit 2 Bit 1 Bit 0 Data-Out Tile identical processing elements 5

Loopback Bus Loopback Bus Loopback Bus Bit slice 0 Bit slice 1 Bit slice 2 Bit slice 63 Bit-Sliced Datapath From register files / Cache / Bypass Multiplexers Shifter Adder stage 1 Wiring Adder stage 2 Wiring Adder stage 3 Sum Select To register files / Cache 6

Itanium Integer Datapath Fetzer, Orton, ISSCC 02 7

Adders 8

Full-Adder A B Cin Full adder Sum Cout 9

The Binary Adder A B Cin Full adder Sum Cout S = A B C i = ABC i + ABC i + ABC i + ABC i C o = AB + BC i + AC i 10

Express Sum and Carry as a function of P, G, D Define 3 new variable which ONLY depend on A, B Generate (G = AB Propagate (P = A B Delete = A B Can also derive expressions for S and C o based on D and P Note that we will be sometimes using an alternate definition for Propagate (P = A + B 11

The Ripple-Carry Adder A 0 B 0 A 1 B 1 A 2 B 2 A 3 B 3 C i,0 C o,0 C o,1 C o,2 C o,3 FA FA FA FA (= C i,1 S 0 S 1 S 2 S 3 Worst case delay linear with the number of bits t d = O(N t adder = (N-1t carry + t sum Goal: Make the fastest possible carry path circuit 12

Complimentary Static CMOS Full Adder V DD V DD C i A B A B A B A C i X B C i V DD C i A S C i A B B V DD A B C i A C o B 28 Transistors 13

Inversion Property A B A B C i FA C o C i FA C o S S SABC i = SABC i C o ABC i = C o ABC i 14

Minimize Critical Path by Reducing Inverting Stages Even cell Odd cell A 0 B 0 A 1 B 1 A 2 B 2 A 3 B 3 C i,0 C o,0 C o,1 C o,2 C o,3 FA FA FA FA S 0 S 1 S 2 S 3 Exploit Inversion Property 15

A Better Structure: The Mirror Adder V DD V DD V DD A "0"-Propagate A C i B B A Kill C o A B C i B C i S "1"-Propagate A Generate C i A B B A B C i A B 24 transistors 16

Mirror Adder Stick Diagram V DD A B C i B A C i C o C i A B C o S GND 17

The Mirror Adder The NMOS and PMOS chains are completely symmetrical. A maximum of two series transistors can be observed in the carrygeneration circuitry. When laying out the cell, the most critical issue is the minimization of the capacitance at node C o. The reduction of the diffusion capacitances is particularly important. The capacitance at node C o is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell. The transistors connected to C i are placed closest to the output. Only the transistors in the carry stage have to be optimized for optimal speed. All transistors in the sum stage can be minimal size. 18

Transmission Gate Full Adder P V DD A V DD A A P C i C i P S Sum Generation V DD B A P B A P P V DD C o Carry Generation C i C i Setup A C i P 19

Manchester Carry Chain V DD P i C i C o G i 20

Manchester Carry Chain V DD P 0 P 1 P 2 P 3 C 3 C i,0 G 0 G 1 G 2 G 3 C 0 C 1 C 2 C 3 21

Manchester Carry Chain Stick Diagram Propagate/Generate Row V DD P i G i P i + 1 G i + 1 C i C i - 1 C i + 1 GND Inverter/Sum Row 22

Carry-Bypass Adder C i,0 P 0 G 1 P 0 G 1 P 2 G 2 P 3 G 3 C o,0 C o,1 C o,2 FA FA FA FA C o,3 Also called Carry-Skip P 0 G 1 P 0 G 1 P 2 G 2 P 3 G 3 BP=P o P 1 P 2 P 3 C i,0 C o,0 C o,1 C o,2 FA FA FA FA Multiplexer C o,3 Idea: If (P0 and P1 and P2 and P3 = 1 then C o3 = C 0, else kill or generate. 23

Carry-Bypass Adder (cont. Bit 0 3 Bit 4 7 Bit 8 11 Bit 12 15 Setup t setup Setup t bypass Setup Setup Carry propagation Carry propagation Carry propagation Carry propagation Sum Sum Sum t sum Sum M bits t adder = t setup + M tcarry + (N/M-1t bypass + (M-1t carry + t sum 24

Carry Ripple versus Carry Bypass t p ripple adder bypass adder 4..8 N 25

Carry-Select Adder Setup P,G "0" "0" Carry Propagation "1" "1" Carry Propagation C o,k-1 Multiplexer C o,k+3 Sum Generation Carry Vector 26

Carry Select Adder: Critical Path Bit 0 3 Bit 4 7 Bit 8 11 Bit 12 15 Setup Setup Setup Setup 0 0-Carry 0 0-Carry 0 0-Carry 0 0-Carry 1 1-Carry 1 1-Carry 1 1-Carry 1 1-Carry Multiplexer Multiplexer Multiplexer Multiplexer C i,0 C o,3 C o,7 C o,11 C o,15 Sum Generation Sum Generation Sum Generation Sum Generation S 0 3 S 4 7 S 8 11 S 12 15 27

Linear Carry Select Bit 0-3 Bit 4-7 Bit 8-11 Bit 12-15 Setup Setup Setup Setup (1 "0" (1 "0" Carry "0" "0" Carry "0" "0" Carry "0" "0" Carry C i,0 "1" "1" Carry (5 (5 Multiplexer "1" "1" Carry (5 "1" "1" Carry (5 "1" "1" Carry (5 (6 (7 (8 Multiplexer Multiplexer Multiplexer (9 Sum Generation Sum Generation Sum Generation Sum Generation S 0-3 S 4-7 S 8-11 S 12-15 (10 28

Square Root Carry Select Bit 0-1 Bit 2-4 Bit 5-8 Bit 9-13 Bit 14-19 Setup Setup Setup Setup (1 "0" Carry "0" (1 "0" "0" Carry "0" "0" Carry "0" "0" Carry C i,0 "1" Carry "1" Carry "1" Carry "1" Carry "1" "1" "1" "1" (3 (3 (4 (5 (6 (4 (5 (6 (7 Multiplexer Multiplexer Multiplexer Multiplexer (7 Mux (8 Sum Generation Sum Generation Sum Generation Sum Generation Sum S 0-1 S 2-4 S 5-8 S 9-13 S 14-19 (9 29

t p (in unit delays Adder Delays - Comparison 50 40 Ripple adder 30 20 Linear select 10 Square root select 0 0 20 40 N 60 30

LookAhead - Basic Idea A 0, B 0 A 1, B 1 A N-1, B N-1 C i,0 P 0 C i,1 P 1 C i, N-1 P N-1 S 0 S 1 S N-1 C ok = fa k B k C G P ok 1 = + C k k o k 1 31

Look-Ahead: Topology Expanding Lookahead equations: V DD C ok = G k + P k G k 1 + P k 1 C o k 2 G 3 G 2 All the way: C ok = G k + P k G k 1 + P k 1 + P 1 G 0 + P 0 C i0 C i,0 G 1 G 0 C o,3 P 0 P 1 P 2 P 3 32

Logarithmic Look-Ahead Adder A 0 F A 1 A 2 A 3 A 4 A 5 A 6 A 7 33

Carry Lookahead Trees C o2 C o1 C o0 = G 0 + P 0 C i0 = G 1 + P 1 G 0 + P 1 P 0 C i0 = G 2 + P 2 G 1 + P 2 P 1 G 0 + P 2 P 1 P 0 C i0 = G 2 + P 2 G 1 + P 2 P 1 G 0 + P 0 C i0 = G 2:1 + P 2:1 C o0 Can continue building the tree hierarchically. 34

Tree Adders (A 0, B 0 (A 1, B 1 (A 2, B 2 (A 3, B 3 (A 4, B 4 (A 5, B 5 (A 6, B 6 (A 7, B 7 (A 8, B 8 (A 9, B 9 (A 10, B 10 (A 11, B 11 (A 12, B 12 (A 13, B 13 (A 14, B 14 (A 15, B 15 S 0 S 1 S 2 S 3 S 4 S 5 S 6 S 7 S 8 S 9 S 10 S 11 S 12 S 13 S 14 S 15 16-bit radix-2 Kogge-Stone tree 35

Tree Adders (a 0, b 0 (a 1, b 1 (a 2, b 2 (a 3, b 3 (a 4, b 4 (a 5, b 5 (a 6, b 6 (a 7, b 7 (a 8, b 8 (a 9, b 9 (a 10, b 10 (a 11, b 11 (a 12, b 12 (a 13, b 13 (a 14, b 14 (a 15, b 15 36

Sparse Trees (a 0, b 0 (a 1, b 1 (a 2, b 2 (a 3, b 3 (a 4, b 4 (a 5, b 5 (a 6, b 6 (a 7, b 7 (a 8, b 8 (a 9, b 9 (a 10, b 10 (a 11, b 11 (a 12, b 12 (a 13, b 13 (a 14, b 14 (a 15, b 15 S 0 S 1 S 2 S 3 S 4 S 5 S 6 S 7 S 8 S 9 S 10 S 11 S 12 S 13 S 14 S 15 16-bit radix-2 sparse tree with sparseness of 2 37

Tree Adders (A 0, B 0 (A 1, B 1 (A 2, B 2 (A 3, B 3 (A 4, B 4 (A 5, B 5 (A 6, B 6 (A 7, B 7 (A 8, B 8 (A 9, B 9 (A 10, B 10 (A 11, B 11 (A 12, B 12 (A 13, B 13 (A 14, B 14 (A 15, B 15 S 0 S 1 S 2 S 3 S 4 S 5 S 6 S 7 S 8 S 9 S 10 S 11 S 12 S 13 S 14 S 15 Brent-Kung Tree 38

Example: Domino Adder V DD V DD Clk G i = a i b i Clk P i = a i + b i a i a i b i b i Clk Clk Propagate Generate 39

Example: Domino Adder V DD V DD Clk k P i:i-2k+1 Clk k G i:i-2k+1 P i:i-k+1 P i:i-k+1 G i:i-k+1 P i-k:i-2k+1 G i-k:i-2k+1 Propagate Generate 40

Example: Domino Sum 41

Multipliers 42

The Binary Multiplication Z X Y = = = = M 1 i = 0 M 1 i = 0 X i 2 i N 1 j = 0 M + N 1 k = 0 N 1 j = 0 Z k 2 k Y j 2 j X i Y j 2 i+ j with X Y = = M 1 i = 0 N 1 j = 0 X i 2 i Y j 2 j 43

The Binary Multiplication + x 1 0 1 0 1 0 1 0 1 1 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 1 0 1 1 1 0 0 1 1 1 0 Multiplicand Multiplier Partial products Result 44

The Array Multiplier X 3 X 2 X 1 X 0 Y 0 X 3 X 2 X 1 X 0 Y 1 Z 0 HA FA FA HA X 3 X 2 X 1 X 0 Y 2 Z 1 FA FA FA HA X3 X 2 X 1 X 0 Y 3 Z 2 FA FA FA HA Z 7 Z 6 Z 5 Z 4 Z 3 45

The MxN Array Multiplier Critical Path HA FA FA HA FA FA FA HA Critical Path 1 Critical Path 2 FA FA FA HA Critical Path 1 & 2 46

Carry-Save Multiplier HA HA HA HA HA FA FA FA HA FA FA FA HA FA FA HA Vector Merging Adder 47

Multiplier Floorplan 48

Wallace-Tree Multiplier Partial products First stage 6 5 4 3 2 1 0 6 5 4 3 2 1 0 Bit position (a (b Second stage Final adder 6 5 4 3 2 1 0 6 5 4 3 2 1 0 FA (c HA (d 49

Wallace-Tree Multiplier 50

Wallace-Tree Multiplier y 0 y 1 y2 FA C i-1 y 0 y 1 y 2 y 3 y 4 y 5 y 3 FA FA C i FA C i-1 C i C i C i-1 C i-1 y 4 FA C i FA C i-1 C i C i-1 y 5 C i FA FA C S C S 51

Multipliers Summary Optimization Goals Different Vs Binary Adder Once Again: Identify Critical Path Other possible techniques - Logarithmic versus Linear (Wallace Tree Mult - Data encoding (Booth - Pipelining FIRST GLIMPSE AT SYSTEM LEVEL OPTIMIZATION 52

Shifters 53

The Binary Shifter Right nop Left A i B i A i-1 B i-1 Bit-Slice i... 54

The Barrel Shifter A 3 B 3 Sh1 A 2 B 2 Sh2 : Data Wire A 1 B 1 : Control Wire Sh3 A 0 B 0 Sh0 Sh1 Sh2 Sh3 Area Dominated by Wiring 55

4x4 barrel shifter A 3 A 2 A 1 A 0 Sh0 Sh1 Sh2 Sh3 Width barrel ~ 2 p m M Buffer 56

Logarithmic Shifter Sh1 Sh1 Sh2 Sh2 Sh4 Sh4 A 3 B 3 A 2 B 2 A 1 B 1 A 0 B 0 57

0-7 bit Logarithmic Shifter A 3 Out3 A 2 Out2 A 1 Out1 A 0 Out0 58