Computer Architecture. ESE 345 Computer Architecture. Design Process. CA: Design process

Similar documents
Computer Architecture. ECE 361 Lecture 5: The Design Process & ALU Design. 361 design.1

Arithmetic and Logic Unit First Part

Lecture 5: Arithmetic

Binary addition by hand. Adding two bits

Overview. Arithmetic circuits. Binary half adder. Binary full adder. Last lecture PLDs ROMs Tristates Design examples

CS 140 Lecture 14 Standard Combinational Modules

UNSIGNED BINARY NUMBERS DIGITAL ELECTRONICS SYSTEM DESIGN WHAT ABOUT NEGATIVE NUMBERS? BINARY ADDITION 11/9/2018

ARITHMETIC COMBINATIONAL MODULES AND NETWORKS

What s the Deal? MULTIPLICATION. Time to multiply

Logic and Computer Design Fundamentals. Chapter 5 Arithmetic Functions and Circuits

Review: Performance and Technology Trends. CS152 Computer Architecture and Engineering Lecture 4. Cost and Design

CS61C : Machine Structures

DIGITAL TECHNICS. Dr. Bálint Pődör. Óbuda University, Microelectronics and Technology Institute

ECE 545 Digital System Design with VHDL Lecture 1. Digital Logic Refresher Part A Combinational Logic Building Blocks

CS61C : Machine Structures

CprE 281: Digital Logic

ECE/CS 552: Introduction To Computer Architecture 1. Instructor:Mikko H Lipasti. Fall 2010 University i of Wisconsin-Madison

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Digital Logic

14:332:231 DIGITAL LOGIC DESIGN

Chapter 5 Arithmetic Circuits

CMPUT 329. Circuits for binary addition

Lecture 4. Adders. Computer Systems Laboratory Stanford University

EECS150. Arithmetic Circuits

Hardware Design I Chap. 4 Representative combinational logic

Combinational Logic Design Arithmetic Functions and Circuits

Design of Sequential Circuits

CSE140: Components and Design Techniques for Digital Systems. Decoders, adders, comparators, multipliers and other ALU elements. Tajana Simunic Rosing

Arithmetic Circuits Didn t I learn how to do addition in the second grade? UNC courses aren t what they used to be...

Midterm Exam Two is scheduled on April 8 in class. On March 27 I will help you prepare Midterm Exam Two.

CMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design

ECE 250 / CPS 250 Computer Architecture. Basics of Logic Design Boolean Algebra, Logic Gates

Computer Architecture ELEC2401 & ELEC3441

Digital Logic (2) Boolean Algebra

Hakim Weatherspoon CS 3410 Computer Science Cornell University

1 Short adders. t total_ripple8 = t first + 6*t middle + t last = 4t p + 6*2t p + 2t p = 18t p

Adders, subtractors comparators, multipliers and other ALU elements

Carry Look Ahead Adders

ECE/CS 250 Computer Architecture

Fundamentals of Digital Design

Combinational Logic. Mantıksal Tasarım BBM231. section instructor: Ufuk Çelikcan

Number representation

CSE370 HW3 Solutions (Winter 2010)

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Adders, subtractors comparators, multipliers and other ALU elements

Combinational Logic. Lan-Da Van ( 范倫達 ), Ph. D. Department of Computer Science National Chiao Tung University Taiwan, R.O.C.

Arithmetic Circuits How to add and subtract using combinational logic Setting flags Adding faster

Binary addition (1-bit) P Q Y = P + Q Comments Carry = Carry = Carry = Carry = 1 P Q

EC 413 Computer Organization

Class Website:

Combina-onal Logic Chapter 4. Topics. Combina-on Circuit 10/13/10. EECE 256 Dr. Sidney Fels Steven Oldridge

Number System. Decimal to binary Binary to Decimal Binary to octal Binary to hexadecimal Hexadecimal to binary Octal to binary

We are here. Assembly Language. Processors Arithmetic Logic Units. Finite State Machines. Circuits Gates. Transistors

The Design Procedure. Output Equation Determination - Derive output equations from the state table

Project Two RISC Processor Implementation ECE 485

Latches. October 13, 2003 Latches 1

L8/9: Arithmetic Structures

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Computer Architecture 10. Fast Adders

Proposal to Improve Data Format Conversions for a Hybrid Number System Processor

EECS150 - Digital Design Lecture 10 - Combinational Logic Circuits Part 1

Numbers and Arithmetic

Introduction to Digital Logic

convert a two s complement number back into a recognizable magnitude.

Combinational Logic. By : Ali Mustafa

Menu. Binary Adder EEL3701 EEL3701. Add, Subtract, Compare, ALU

Numbers & Arithmetic. Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University. See: P&H Chapter , 3.2, C.5 C.

Computer organization

Systems I: Computer Organization and Architecture

EECS 427 Lecture 8: Adders Readings: EECS 427 F09 Lecture 8 1. Reminders. HW3 project initial proposal: due Wednesday 10/7

Solutions for Appendix C Exercises

ECE 545 Digital System Design with VHDL Lecture 1A. Digital Logic Refresher Part A Combinational Logic Building Blocks

EE 660: Computer Architecture Out-of-Order Processors

BOOLEAN ALGEBRA. Introduction. 1854: Logical algebra was published by George Boole known today as Boolean Algebra

UNIT II COMBINATIONAL CIRCUITS:

ELCT201: DIGITAL LOGIC DESIGN

TEST 1 REVIEW. Lectures 1-5

Module 2. Basic Digital Building Blocks. Binary Arithmetic & Arithmetic Circuits Comparators, Decoders, Encoders, Multiplexors Flip-Flops

CMSC 313 Lecture 17. Focus Groups. Announcement: in-class lab Thu 10/30 Homework 3 Questions Circuits for Addition Midterm Exam returned

Lecture 12: Datapath Functional Units

CS/COE0447: Computer Organization and Assembly Language

Numbering Systems. Contents: Binary & Decimal. Converting From: B D, D B. Arithmetic operation on Binary.

Review for Test 1 : Ch1 5

Section 3: Combinational Logic Design. Department of Electrical Engineering, University of Waterloo. Combinational Logic

EECS150 - Digital Design Lecture 24 - Arithmetic Blocks, Part 2 + Shifters

Review: Additional Boolean operations

VLSI Design. [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1

ECE380 Digital Logic. Positional representation

GALOP : A Generalized VLSI Architecture for Ultrafast Carry Originate-Propagate adders

211: Computer Architecture Summer 2016

ISSN (PRINT): , (ONLINE): , VOLUME-4, ISSUE-10,

9. Datapath Design. Jacob Abraham. Department of Electrical and Computer Engineering The University of Texas at Austin VLSI Design Fall 2017

Bit-Sliced Design. EECS 141 F01 Arithmetic Circuits. A Generic Digital Processor. Full-Adder. The Binary Adder

CprE 281: Digital Logic

Review. EECS Components and Design Techniques for Digital Systems. Lec 18 Arithmetic II (Multiplication) Computer Number Systems

Datapath Component Tradeoffs

Introduction to Digital Logic Missouri S&T University CPE 2210 Subtractors

Lecture 8: Sequential Multipliers

ALUs and Data Paths. Subtitle: How to design the data path of a processor. 1/8/ L3 Data Path Design Copyright Joanne DeGroat, ECE, OSU 1

Proposal to Improve Data Format Conversions for a Hybrid Number System Processor

Cost/Performance Tradeoffs:

Transcription:

Computer Architecture ESE 345 Computer Architecture Design Process 1

The Design Process "To Design Is To Represent" Design activity yields description/representation of an object -- Traditional craftsman does not distinguish between the conceptualization and the artifact -- Separation comes about because of complexity -- The concept is captured in one or more representation languages -- This process IS design Design Begins With Requirements -- Functional Capabilities: what it will do -- Performance Characteristics: Speed, Power, Area, Cost,... 2

Design Process (cont.) Design Finishes As Assembly -- Design understood in terms of components and how they have been assembled Datapath CPU ALU Regs Shifter Control -- Top Down decomposition of complex functions (behaviors) into more primitive functions Nand Gate -- bottom-up composition of primitive building blocks into more complex assemblies Design is a "creative process," not a simple method 3

Design Refinement Informal System Requirement Initial Specification Intermediate Specification refinement increasing level of detail Final Architectural Description Intermediate Specification of Implementation Final Internal Specification Physical Implementation 4

Design as Search Problem A Strategy 1 Strategy 2 SubProb 1 SubProb2 SubProb3 BB1 BB2 BB3 BBn Design involves educated guesses and verification -- Given the goals, how should these be prioritized? -- Given alternative design pieces, which should be selected? -- Given design space of components & assemblies, which part will yield the best solution? Feasible (good) choices vs. Optimal choices 5

Measurement and Evaluation Design Architecture is an iterative process -- searching the space of possible designs -- at all levels of computer systems Analysis Creativity Cost / Performance Analysis Bad Ideas Good Ideas Mediocre Ideas 6

Problem: Design a fast ALU for the MIPS ISA Requirements? Must support the MIPS ISA Arithmetic / Logic operations Tradeoffs of cost and speed based on frequency of occurrence, hardware budget 7

MIPS ALU Requirements Add, AddU, Sub, SubU, AddI, AddIU => 2 s complement adder/sub with overflow detection And, Or, AndI, OrI, Xor, Xori, Nor => Logical AND, logical OR, XOR, nor SLTI, SLTIU (set less than) => 2 s complement adder with inverter, check sign bit of result 8

MIPS arithmetic instruction format 31 25 20 15 5 0 R-type: op Rs Rt Rd funct I-Type: op Rs Rt Immed 16 Type op funct ADDI 10 xx ADDIU 11 xx SLTI 12 xx SLTIU 13 xx ANDI 14 xx ORI 15 xx XORI 16 xx LUI 17 xx Type op funct ADD 00 40 ADDU 00 41 SUB 00 42 SUBU 00 43 AND 00 44 OR 00 45 XOR 00 46 NOR 00 47 Type op funct 00 50 00 51 SLT 00 52 SLTU 00 53 Signed arithmetic generate overflow, no carry 9

Design Trick: divide & conquer Trick #1: Break the problem into simpler problems, solve them and glue together the solution Example: assume the immediates have been taken care of before the ALU 10 operations (4 bits) 00 add 01 addu 02 sub 03 subu 04 and 05 or 06 xor 07 nor 12 slt 13 sltu 10

Refined Requirements (1) Functional Specification inputs: 2 x 32-bit operands A, B, 4-bit mode outputs: 32-bit result S, 1-bit carry, 1 bit overflow operations: add, addu, sub, subu, and, or, xor, nor, slt, sltu (2) Block Diagram (schematic symbol, Verilog description) 32 32 c ovf A ALU S 32 B m 4 11

Behavioral Representation: Verilog module ALU(A, B, m, S, c, ovf); input [0:31] A, B; input [0:3] m; output [0:31] S; output c, ovf; reg [0:31] S; reg c, ovf; always @(A, B, m) begin case (m) 0: S = A + B;... end endmodule 12

Design Decisions ALU bit slice 7-to-2 C/L 7 3-to-2 C/L PLD Gates CL0 CL6 mux Simple bit-slice big combinational problem many little combinational problems partition into 2-step problem Bit slice with carry look-ahead... 13

Refined Diagram: bit-slice ALU A 32 B 32 a31 b31 ALU31 m co cin s31 a0 ALU0 co s0 b0 m cin 4 M Ovflw S 32 14

7-to-2 Combinational Logic start turning the crank... Function Inputs Outputs K-Map 0 M0 M1 M2 M3 A B Cin S Cout add 0 0 0 0 0 0 0 0 0 127 15

Seven plus a MUX? Design trick 2: take pieces you know (or can imagine) and try to put them together Design trick 3: solve part of the problem and extend A B CarryIn 1-bit Full Adder and or add CarryOut S-select Mux Result Full Adder (3->2 element) A B C in C out S 0 0 0 0 0 0 0 1 0 1 0 1 0 0 1 0 1 1 1 0 1 0 0 0 1 1 0 1 1 0 1 1 0 1 0 1 1 1 1 1 16

Additional operations A - B = A + ( B) = A + B + 1 form two complement by invert and add one S-select invert CarryIn and A or Mux Result B 1-bit Full Adder add CarryOut Set-less-than? left as an exercise 17

Revised Diagram LSB and MSB need to do a little extra A 32 B 32? a31 b31 ALU0 co cin s31 Ovflw S 32 a0 ALU0 co s0 b0 cin 4 M C/L to produce select, comp, c-in 18

Overflow Decimal Binary Decimal 0 0000 0 1 0001-1 2 0010-2 3 0011-3 4 0100-4 5 0101-5 6 0110-6 7 0111-7 Examples: 7 + 3 = 10 but... - 4-5 = - 9 but... 0 1 1 1-8 1 2 s Complement 0000 1111 1110 1101 1100 1011 1010 1001 1000 + 0 1 1 1 0 0 1 1 7 3 + 1 1 0 0 1 0 1 1 4 5 1 0 1 0 6 0 1 1 1 7 19

Overflow Detection Overflow: the result is too large (or too small) to represent properly Example: - 8 4-bit binary number 7 When adding operands with different signs, overflow cannot occur! Overflow occurs when adding: 2 positive numbers and the sum is negative 2 negative numbers and the sum is positive On your own: Prove you can detect overflow by: Carry into MSB Carry out of MSB 0 1 1 1 1 0 0 1 1 1 7 1 1 0 0 4 + 0 0 1 1 3 + 1 0 1 1 5 1 0 1 0 6 0 1 1 1 7 20

Overflow Detection Logic Carry into MSB Carry out of MSB For a N-bit ALU: Overflow = CarryIn[N - 1] XOR CarryOut[N - 1] CarryIn0 A0 B0 A1 B1 A2 B2 A3 B3 1-bit Result0 ALU CarryIn1 CarryOut0 1-bit Result1 ALU CarryIn2 CarryOut1 1-bit Result2 ALU CarryIn3 1-bit Result3 ALU CarryOut3 X Y X XOR Y 0 0 0 0 1 1 1 0 1 1 1 0 Overflow 21

More Revised Diagram LSB and MSB need to do a little extra A 32 B 32 signed-arith and c-in xor c-out a31 b31 ALU0 co cin s31 Ovflw S 32 a0 ALU0 co s0 b0 cin 4 M C/L to produce select, comp, c-in 22

Which is good design advice? 1. Wait until you know everything before you start ( Be prepared ) 2. The best design is a one-pass, top down process ( Plan Ahead ) 3. Start simple, measure, then optimize ( Less is more ) 4. Don t be biased by the components you already know ( Start with a clean slate ) 23

But What about Performance? Critical Path of n-bit Rippled-carry adder is n*cp of 1-bit adder CarryIn0 A0 B0 A1 B1 A2 B2 A3 B3 1-bit ALU CarryIn1 CarryOut0 1-bit ALU CarryIn2 CarryOut1 1-bit ALU CarryIn3 CarryOut2 1-bit ALU CarryOut3 Result0 Result1 Result2 Result3 Design Trick: Throw hardware at it 24

Carry Look Ahead (Design trick: peek) A0 B0 A1 B1 G P G P C0 = Cin S S C1 = G0 + C0 P0 C2 = G1 + G0 P1 + C0 P0 P1 A B C-out 0 0 0 kill 0 1 C-in propagate 1 0 C-in propagate 1 1 1 generate G = A and B P = A xor B A2 B2 G P S C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2 A3 B3 G P S G P C4 =... 25

Plumbing as Carry Lookahead Analogy g0 p0 c0 c1 g0 p0 c0 g1 p1 g0 p0 c0 c2 g1 p1 g2 p2 g3 p3 c4 26

Cascaded Carry Look-ahead (16-bit): Abstraction C L A C0 G0 P0 C1 = G0 + C0 P0 4-bit Adder C2 = G1 + G0 P1 + C0 P0 P1 4-bit Adder 4-bit Adder C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2 G P C4 =... 27

2nd level Carry, Propagate as Plumbing p0 g0 p2 p1 g1 p1 P0 p3 g2 p2 g3 p3 G0 28

Design Trick: Guess (or Precompute ) CP(2n) = 2*CP(n) n-bit adder n-bit adder CP(2n) = CP(n) + CP(mux) n-bit adder 1 n-bit adder 0 n-bit adder Cout Carry-select adder 29

Carry Skip Adder: reduce worst case delay B A4 B A0 4-bit Ripple Adder 4-bit Ripple Adder P3 P2 P1 P0 S P3 P2 P1 P0 S Just speed up the slowest case for each block Exercise: optimal design uses variable block sizes 30

Additional MIPS ALU requirements Mult, MultU, Div, DivU => Need 32-bit multiply and divide, signed and unsigned Sll, Srl, Sra => Need left shift, right shift, right shift arithmetic by 0 to 31 bits Nor (leave as exercise) 31

Elements of the Design Process Divide and Conquer (e.g., ALU) Formulate a solution in terms of simpler components. Design each of the components (subproblems) Generate and Test (e.g., ALU) Given a collection of building blocks, look for ways of putting them together that meets requirement Successive Refinement (e.g., carry lookahead) Solve "most" of the problem (i.e., ignore some constraints or special cases), examine and correct shortcomings. Formulate High-Level Alternatives (e.g., carry select) Articulate many strategies to "keep in mind" while pursuing any one approach. Work on the Things you Know How to Do The unknown will become obvious as you make progress. 32

Summary of the Design Process Hierarchical Design to manage complexity Top Down vs. Bottom Up vs. Successive Refinement Importance of Design Representations: Block Diagrams Decomposition into Bit Slices Truth Tables, K-Maps Circuit Diagrams top down bottom up Other Descriptions: state diagrams, timing diagrams, reg xfer,... Optimization Criteria: Gate Count [Package Count] Area Logic Levels Fan-in/Fan-out Delay Power Pin Out Cost Design time 33

Match for Design Principle? I. Composition II. III. Divide and Conquer Start simple, then optimize critical paths Best match I. II. III. 1. A B C 2. A C B 3. B A C 4. B C A 5. C A B 6. C B A A. Design 1-bit ALU slice before 32-bit ALU B. Replace ripple carry with carry lookahead C. Use Mux to join AND, OR gates with Adder 34

Summary An Overview of the Design Process Design is an iterative process, multiple approaches to get started Do NOT wait until you know everything before you start Example: Instruction Set drives the ALU design Divide and Conquer Take pieces you know and put them together Start with a partial solution and extend Optimization: Start simple and analyze critical path For adder: the carry is the slowest element Logarithmic trees to flatten linear computation Precompute: Double hardware and postpone slow decision 35

Acknowledgements These slides contain material developed and copyright by: Morgan Kauffmann (Elsevier, Inc.) David Patterson (UCB) Arvind (MIT) Krste Asanovic (MIT/UCB) Joel Emer (Intel/MIT) James Hoe (CMU) John Kubiatowicz (UCB) 36