etup TA 60 c, vcc 3v Hold TA 30 c, vcc 5v Tkew TL TL TH FF FF 2 T cyc T H T L Clock TpdFF 2 TpdCL2Tetup FF Tcyc TL 2 2 TpdFF TpdCL Tetup FF2 TH 2 T FF T T FF T CL Hold L cd cd T FF T T FF T CL Hold L cd cd FF FF 2 FF FF 2 FF FF 2 Tkew TH TH TL FF 2 FF T cyc T H T L Clock 2 2 2 2 T FF T CL T FF2 T T 2 T FF T CL T FF T pd pd etup L T FF T T FF T CL Hold H cd cd pd pd etup cyc H T FF T T FF T CL Hold H cd cd FF FF 2 FF FF 2 FF FF 2 ( '!&$%%###!

FF 2 FF T L T H T cyc T H T L Clock T FF2 T CL2 T FF T FF 2 pd pd etup L T FF T CL T FF T pd pd etup H FF 2 T Hold THold Tpd FF 2 FF #! FF FF 2 $ #! &% T FF2 T CL2 T FF T T ( FF) T T ( FF2) T ( CL2) pd pd etup L hold L cd cd T FF2 T CL2 T FF T T ( FF 2) T ( CL2) T ( FF) pd pd etup L cd cd hold T, T T, T pd etup cd Hold ( '!&$%%###!

NAND( x,'') x ) t pd t cd t setup t hold 40 5 20 22 42 5 2 3 NAND In In 2 D Out CLK '' '' NAND T cd T 5 2 2 2 hold ( '!&$%%###!

RA CA W A0 A CA W D A0 A KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 Decoder 3 8 RA KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 0 2 3 4 5 6 7 A2 A3 A 4 D D D D D D D 0 ' KM4C6000C-5 Fanout DRAM' 2 3 4 5 6 D 7 DRAM RA RA DRAM Tpd Decoder inout Tpd t * t Tpd RRH RRH t * t Tpd RAH RAH tcrp* tcrptpd t * t Tpd RAL RAL t * t Tpd RAC RAC tcac* tcac Tpd ( '!&$%%###!

CA W D A0 A KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 KM4C6000C-5 Decoder 3 8 A2 A3 A 4 RA ( DRAM RA ) W CA RA t * t Tpd RA RAH RAH tpd 0 ( '!&$%%###!

t RCD t CA 20 3 33 CA RA trac 50 T t ns min RC 90 Tmin tra trp 50 30 80ns T max t, t max t, t t, t t 50 t min CRP AR RAC max RAD AAmax RCD min CAC max OFF min max 5, 0 max 50,5 25, 203 50 0 5 50 5005ns 75>tRA,tCH,tRDC+tRH T trp>30 min 05 ns $ trcd>20 55 tcrp=5 25 50 50-95ns & %&% 5ns 45ns T min 00 T max t, t max t, t t, t t 50 t min CRP AR RAC max RAD AAmax RCD min CAC max OFF min max 5, 0 max 50,5 25, 203 500 5 50 500 95ns Tmin 95ns ns ( '!&$%%###!

RAS-only refresh of all 2^12 rows:

\[
t = \max(t_{CRP}, t_{ASR}) + \text{rows} \cdot \max(t_{RAS,\min} + t_{RP,\min},\, t_{RC})
  = \max(5, 0)\,\text{ns} + 2^{12} \cdot \max(50 + 30,\, 90)\,\text{ns}
  = 368.645\,\mu\text{s} \approx 0.37\,\text{ms}
\]

CAS-before-RAS (CBR) refresh, with t_{RP}, t_{CSR}, t_{RPC}, t_{CP} in the lead-in term:

\[
t = 30\,\text{ns} + 2^{12} \cdot \max(50 + 30,\, 90)\,\text{ns} = 368.67\,\mu\text{s} \approx 0.37\,\text{ms}
\]

DRAM refresh trade-off, RAS-only vs. CAS-before-RAS (CBR): 64ms, 0.4ms, 5ms, 0.4ms, 9000ns.
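The RAS-only refresh figure above (368.645 µs, about 0.37 ms) can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 2^12 rows and the timing values 5, 0, 50, 30 and 90 ns for t_CRP, t_ASR, t_RAS, t_RP and t_RC; the function name `ras_only_refresh_ns` is hypothetical.

```python
def ras_only_refresh_ns(rows, t_crp, t_asr, t_ras, t_rp, t_rc):
    """Total time in ns to refresh every row once with RAS-only cycles.

    Assumed model: one lead-in term, then one refresh cycle per row whose
    length is bounded below by both t_RAS + t_RP and t_RC.
    """
    lead_in = max(t_crp, t_asr)
    per_row = max(t_ras + t_rp, t_rc)
    return lead_in + rows * per_row


# Assumed values, all in nanoseconds.
total_ns = ras_only_refresh_ns(rows=2**12, t_crp=5, t_asr=0,
                               t_ras=50, t_rp=30, t_rc=90)
total_us = total_ns / 1000.0
print(total_ns, "ns =", total_us, "us")
```

With these inputs the per-row term is limited by t_RC (90 ns) rather than by t_RAS + t_RP (80 ns), so the row count times t_RC dominates the total.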

+ Idle Output_Ready Compute Init0 Clr_Reg Init DR_Num LD_A el_cnt Init2 DR_Y LD_B LD_C el_cnt Z Z' Z CheckZ el_cnt Calc ALU_OP DR_ALU LD_C updnum el_cnt DR_ALU LD_A OutRdy Clr_Reg + + + + + + + + + + + +, Idle + +++ Init0 ++ Num Init ++ Y Init2 $ + ANum- updnum! ++ CB+C Calc - + Num-=0 CheckZ. + ALU_OP DR_Y DR_Num DR_ALU LD_C LD_B LD_A + + + + + + + + + + + + + + + + + + + + + + + + + + $ + + + + +! + + + + + - + + + + + + +. P Compute= Z=0 Z= +00 00 00 00 00 00 00 0 0 0 00 +00 00 0 0 0 0 0 0 00 +00 el_cnt Z' ( '!&$%%###!

Output Compute Z Preset tate, ROM Output_Ready Next tate Register Controls &% Output Compute Z Preset tate ROM Cominational Logic Next tate Output_Ready Controls ZCompute$-ROM /Output_Ready$ROM 5 5 ROM 2 3 46$ 2 &% ROM 5 2 3 96 ROM Address Data Com DR_ P OutRdy ClrReg elcnt ALUOP DR_Y pute Z Num DRALU LD_C LD_B LD_A N +++ + + + + + + + + + + +++ +++ + + + + + + + + + ++ ++ + + + + + + + + + 00 ++ + + + + + + + 0 + + + + + + + + 00 + + + + + + + +++ ++ + + + + + + + 0 + + + + + + + + 0 + + + + + + + + + + + 00 + + + + + + + + + + 00+ ( '!&$%%###! Register

ROM Addres Data Com P pute Z N +++ + +++ +++ ++ ++ 00 ++ 0 + + 00 + +++ ++ 0 + 0 + + 00 + 00+ ( 0 Dr_Num Output_Ready Clr_Reg Next tate 3 3 8 Decoder 2 3 4 5 OR OR el_cnt Dr_Y, Ld_B Dr_ALU ALU_OP 6 7 OR OR Ld_C Ld_A ( '!&$%%###!

Preset tate Test ROM True False Outputs Output_Ready Controls Compute Z 0 ele ctor elector Next tate Register &% Preset tate Test ROM True False Controls Output_Ready Compute Z 0 ele ctor elector Next tate Combinational Logic Register 3 2 +0False True $0Test Outputs 2 3 3 30 2 3 7 36ROM ROMtest, linkt, linkf 3 2 7 56 ( '!&$%%###!

Address +++ ++ ++ + ++ + + test + linktrue linkfalse Output ++ +++ +++++++++ 00 00 ++++++++++ + + +++++++++ +++ 00 +++++++ + + +++++++ + + +++++++ +++ 00 +++++++++ ROM Address test linktrue linkfalse +++ + ++ +++ ++ 00 00 ++ + + + +++ 00 ++ + + + + + + +++ 00 a d b e F J c f T 2 2 2 2 2 6 26ns ( '!&$%%###!

) T 2T s 2T c 42ns L pd pd o TP 42 c FA s o co FA s c FA s o FA FA co co s s Count FF 9 T max,0 T FF T K 4 TPmax FF 539ns cyc pd su CL T 3AdderP KT 49 76ns L cyc TP max 9 ( '!&$%%###!

ai bi tart FA ci c s o 2 el Controller T T FF T el T FA T FF tart el, Done ns 5 7 0 3 25 cyc pd pd pd su T nt 26n n L cyc TP ntcyc Done n ( '!&$%%###!

! $ D-2, G-, H-, K-, N-,- n 336 384 525 648 n 48, 48 ' 75,. 08 5 360 72 288 72! Latency - ' 2 2 # pipe $ $ $ $ 2 2 TP 4 4 5 6 6 6 T cyc!! -... T L n7 K 3 T 4 cyc KT 3 4 2 cyc D-2, G-, H-, K-, N-,- D-,G-,H-,I-,J-,K-,N-,- D-,G-,H-,M-,N-,O-,- D-,G-,H-,O-,P-,- I-,J-,K-,N-,- M-,N-,O-,- TP T cyc n!t cyc 4 6 ( '!&$%%###!

while (a OR b) do nothing;
c = 0;
while (a NAND b) do nothing;
c = 1;

[Figure: C, F (Fork) and J (Join) elements with signals a, b, c, d, x; blocks A, B with functions f, g, p]
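The two wait loops above describe an element whose output changes only when both inputs agree and otherwise holds its value. The sketch below models that reading in Python; it is an assumption that this is the intended behavior (such an element is often called a Muller C-element), and the class name `CElement` and its method `update` are hypothetical.

```python
class CElement:
    """State-holding element: output follows the inputs only when they agree."""

    def __init__(self, c=0):
        self.c = c  # current output value

    def update(self, a, b):
        """Apply new input values and return the (possibly unchanged) output."""
        if a == 0 and b == 0:
            # Leaves "while (a OR b) do nothing": both inputs low, so c = 0.
            self.c = 0
        elif a == 1 and b == 1:
            # Leaves "while (a NAND b) do nothing": both inputs high, so c = 1.
            self.c = 1
        # Inputs disagree: keep waiting, output is held.
        return self.c


elem = CElement()
outputs = [elem.update(a, b) for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]]
print(outputs)
```

The held output on disagreeing inputs is what lets such an element synchronise a join: the output rises only after both branches have raised their signals.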

* A x A, B f f p B p p F J B p 2 A g g ForkJoin W A x A, B f f p B p p F J W B p 2 A g g (! ( ( '!&$%%###!

beqz rs,label  ->  beq $rs,$zero,label

blt; branch; label=4

000000 01111 01111 01111 00000 100000    add $t7, $t7, $t7
000100 01111 01111                       beq $t7, $t7, -4
001000 01111 01111 000000000000000       addi $t7,$t7, $t7

[0x00004000]  add  $t0,$zero,$zero     $t0 = 0
[0x00004004]  addi $s0,$zero,0x4004    $s0 = 0x4004
[0x00004008]  addi $t0,$t0,4           $t0 += 4
[0x0000400C]  add  $s0,$s0,$t0         $s0 += $t0
[0x00004010]  jr   $s0                 Goto $s0

First pass: $s0 = 0x4004, $t0 = 4, so $s0 = 0x4008 and jr $s0 jumps to 0x4008.
Second pass: $t0 = 8, so $s0 = 0x4010 and jr $s0 jumps to 0x4010, which is the jr itself.
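The five-instruction listing starting at 0x00004000 can be traced mechanically. The following is a minimal Python sketch of such a trace, not a full MIPS simulator. It assumes the jr sits at 0x00004010, that there is no branch delay slot, and that only add, addi and jr need to be modeled; the function name `run` and the tuple format of `program` are hypothetical.

```python
def run(max_steps=20):
    """Execute the listing for max_steps instructions; return (pc trace, registers)."""
    regs = {"$zero": 0, "$t0": 0, "$s0": 0}
    program = {
        0x4000: ("add", "$t0", "$zero", "$zero"),
        0x4004: ("addi", "$s0", "$zero", 0x4004),
        0x4008: ("addi", "$t0", "$t0", 4),
        0x400C: ("add", "$s0", "$s0", "$t0"),
        0x4010: ("jr", "$s0"),  # assumed address of the jr
    }
    pc = 0x4000
    trace = []
    for _ in range(max_steps):
        trace.append(pc)
        op = program[pc]
        if op[0] == "add":
            regs[op[1]] = (regs[op[2]] + regs[op[3]]) & 0xFFFFFFFF
            pc += 4
        elif op[0] == "addi":
            regs[op[1]] = (regs[op[2]] + op[3]) & 0xFFFFFFFF
            pc += 4
        else:  # jr: jump to the address held in the register
            pc = regs[op[1]]
    return trace, regs


trace, regs = run()
print([hex(pc) for pc in trace[:9]], {k: hex(v) for k, v in regs.items()})
```

Under these assumptions the first jr lands on 0x4008, the second lands on the jr itself, and every later step stays there, so the step limit is what ends the run.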

NOP, s2=s2+0:  add $s2, $s2, $zero    000000 10010 00000 10010 00000 100000
NOP, s2=s2+0:  addi $s2, $s2, 0       001000 10010 10010 0000000000000000
NOP:           j 0x                   000010 0000000000000000000000000
NOP:           beq $s2, $s2 0x        000100 10010 10010 000000000000000

n 9 5 4.5n 5 n 2 4.5 n 8 n

Single Cycle MIPS

        .data 0x10008000            # Data Segment start (assembler directive)
A:      .word 0                     # array element A[0] of Fibonacci series array
        .word 0                     # A[1]
        .word 0
        .word 0
        .word 0
        .word 0
        .word 0
        .word 0
        .word 0
        .word 0                     # A[9]
n:      .word 9                     # n

        .text                       # Code Segment start (assembler directive)
        .globl main
main:   la   $s0, A                 # load address of A into register $s0 --> $s0 = &A[0]
        la   $a0, n                 # load address of n into $a0
        lw   $a0, 0($a0)            # load value of n into $a0 --> $a0 = n
        addi $t0, $zero, 0          # first element $t0 = a_1 = 0
        addi $t1, $zero, 1          # second element $t1 = a_2 = 1
loop:   sw   $t0, 0($s0)            # stores element a_n
        addi $a0, $a0, -1           # decrease index: n=n-1
        beq  $a0, $zero, done
        sw   $t1, 4($s0)            # stores element a_n+1
        add  $t0, $t0, $t1          # calculates $t0 = a_n+2
        add  $t1, $t0, $t1          # calculates $t1 = a_n+3
        addi $s0, $s0, 8            # moves to the next 2 elements in the array
        addi $a0, $a0, -1           # decrease index: n=n-1
        bne  $a0, $zero, loop
done:   addi $v0, $0, 10            # exit the program by calling syscall with parameter 10
        syscall
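The bit patterns of the add and addi NOP candidates listed before the Fibonacci program can be cross-checked by assembling the fields programmatically. This is a small sketch, assuming the standard MIPS32 R-type and I-type field layouts, register number 18 for $s2, funct 0x20 for add and opcode 0x08 for addi; the helper names `encode_r` and `encode_i` are hypothetical.

```python
REG = {"$zero": 0, "$s2": 18}  # register numbers used in this example only


def _join(fields):
    """Render (value, width) pairs as space-separated binary fields."""
    return " ".join(format(value, "0{}b".format(width)) for value, width in fields)


def encode_r(rs, rt, rd, shamt, funct):
    """R-type: opcode(6)=0, rs(5), rt(5), rd(5), shamt(5), funct(6)."""
    return _join([(0, 6), (REG[rs], 5), (REG[rt], 5), (REG[rd], 5), (shamt, 5), (funct, 6)])


def encode_i(opcode, rs, rt, imm):
    """I-type: opcode(6), rs(5), rt(5), immediate(16, two's complement)."""
    return _join([(opcode, 6), (REG[rs], 5), (REG[rt], 5), (imm & 0xFFFF, 16)])


# add $s2, $s2, $zero  -> rd = $s2, rs = $s2, rt = $zero
add_nop = encode_r(rs="$s2", rt="$zero", rd="$s2", shamt=0, funct=0x20)
# addi $s2, $s2, 0     -> rt = $s2, rs = $s2, imm = 0
addi_nop = encode_i(opcode=0x08, rs="$s2", rt="$s2", imm=0)
print(add_nop)
print(addi_nop)
```

The j and beq candidates are left out because their target and offset are not fully legible in the listing.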

        .data 0x10008000            # Data Segment start
n:      .word                       # n

        .text                       # Code Segment start
        .globl main
main:   la   $a0, n
        lw   $a0, 0($a0)            # load $a0=n
        addi $s0, $zero, 1          # $s0 = 1
        addi $s1, $zero, 0          # $s1 = 0
loop:   slt  $s2, $a0, $s1          # if $a0 < $s1 then $s2 = 1 else $s2 = 0
        bne  $s2, $zero, finish     # if $s2 == 1 goto finish
                                    # actually: if $a0 < $s1 goto finish
        add  $s0, $s0, $s0          # $s0 = 2 * $s0
        addi $s1, $s1, 1            # $s1 ++
        j    loop                   # goto loop
finish: add  $v0, $s0, $zero        # $v0 = $s0
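The loop above doubles $s0 once for every counter value from 0 up to and including n, so its result can be predicted without running it. Here is a minimal Python model, assuming $s0 starts at 1 and $s1 at 0 and that registers are 32 bits wide; the function name `doubling_loop` is hypothetical.

```python
def doubling_loop(n):
    """Mirror the slt/bne/add/addi/j loop and return the final value of $v0."""
    s0 = 1  # assumed initial value of $s0
    s1 = 0  # loop counter $s1
    while not (n < s1):               # slt + bne: leave the loop once n < s1
        s0 = (s0 + s0) & 0xFFFFFFFF   # add $s0, $s0, $s0 (32-bit wraparound)
        s1 += 1                       # addi $s1, $s1, 1
    return s0                         # add $v0, $s0, $zero


for n in (-1, 0, 3):
    print(n, doubling_loop(n))
```

Under these assumptions the loop body runs n + 1 times for non-negative n, which gives 2^(n+1) until the 32-bit register overflows.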

! jrr-type orii-type jj-type $s2 $s=, $s2=5, $s3=3 ALUmux +--Zero Extend ALUrc=2$ rsoralu OR PC+4 R-type lw/sw PC) (beq ) PC+4 beq j ( '!&$%%###!

Preserved on call ( Prologue Epilogue Preserved on call $a0-$a3 A $a0a $ B B jal B $a0c C jal C,$a0BC $$a0 BEpilogue Preseved on $v0,$v call B B$v0A post-callbpre-callframe Preserved on call &%)++3 2 Preserved on call&% ( ( '!&$%%###!

main:      addi $sp, $sp, -32       # Main - prologue
           sw   $ra, 20($sp)        # Main - prologue
           sw   $fp, 16($sp)        # Main - prologue
           addi $fp, $sp, 28        # Main - prologue
           li   $a0, 2
           li   $a1, 8
           jal  gcd
           lw   $fp, 16($sp)        # Main - epilogue
           lw   $ra, 20($sp)        # Main - epilogue
           addi $sp, $sp, 32        # Main - epilogue
           jr   $ra                 # Main - epilogue

gcd:       addi $sp, $sp, -32       # GCD - prologue
           sw   $ra, 20($sp)        # GCD - prologue
           sw   $fp, 16($sp)        # GCD - prologue
           sw   $a0, 12($sp)        # GCD - prologue
           sw   $a1, 8($sp)         # GCD - prologue
           sw   $s0, 4($sp)         # GCD - prologue
           addi $fp, $sp, 28        # GCD - prologue
           slt  $t0, $a0, $a1
           bne  $t0, $0, smaller
           slt  $s0, $a1, $a0
           bne  $s0, $0, greater
           addi $v0, $a0, 0
           j    return
smaller:   sub  $a1, $a1, $a0
           j    callagain
greater:   sub  $a0, $a0, $a1
callagain: jal  gcd
return:    lw   $s0, 4($sp)         # GCD - epilogue
           lw   $a1, 8($sp)         # GCD - epilogue
           lw   $a0, 12($sp)        # GCD - epilogue
           lw   $fp, 16($sp)        # GCD - epilogue
           lw   $ra, 20($sp)        # GCD - epilogue
           addi $sp, $sp, 32        # GCD - epilogue
           jr   $ra                 # GCD - epilogue

gcd instances, $fp, Frame, $v0, jal gcd, return; $v0 $sp $ra $fp $a0 $a1 $s0
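The control flow of the gcd routine above (compare, subtract the smaller argument from the larger, call again) can be mirrored in a few lines. The sketch below is a Python rendering of that recursion, assuming both arguments are positive integers; the stack frame handling of the assembly is not modeled, and the function name `gcd_by_subtraction` is hypothetical.

```python
def gcd_by_subtraction(a0, a1):
    """Recursive subtraction-based GCD following the smaller/greater branches."""
    if a0 < a1:                                   # slt $t0, $a0, $a1 -> smaller
        return gcd_by_subtraction(a0, a1 - a0)    # sub $a1, $a1, $a0; jal gcd
    if a1 < a0:                                   # slt $s0, $a1, $a0 -> greater
        return gcd_by_subtraction(a0 - a1, a1)    # sub $a0, $a0, $a1; jal gcd
    return a0                                     # equal: addi $v0, $a0, 0


result = gcd_by_subtraction(2, 8)
print(result)
```

Each recursive call corresponds to one new 32-byte frame in the assembly version, so the recursion depth grows with the number of subtractions rather than with the number of divisions as in the remainder-based algorithm.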

# MemRead (lw (!opcode Dispatch ROM cyclememread ALUOut rt [20-6]( Label dr-a dr-b ld-a ld-b dr-alu ALUop Done equencing Idle Dispatch LoadA eq LoadB eq CalcA-B 0 Dispatch2 A<-A-B 0 CalcA-B B<-B-A CalcA-B $ AddrCtrl tate Mux CalcA-B Dispatch ROM 2 Dispatch ROM CC tart 0, 0 equencing 3 2 8 ROM. 3 2 7 2 89 72bitROM ( '!&$%%###!

rt, rs: DataMem[-4($rs)], $rt; $rs, $rs, 4

MemRead, ALUSrcA=0, IorD=0, IRWrite, ALUSrcB=01, ALUOp=00, PCWrite, PCSource=00
ALUSrcA=0, ALUSrcB=11, ALUOp=00
beq (sub): ALUSrcA=1, ALUSrcB=00, ALUOp=01, PCWriteCond, PCSource=01
RegDest=0, RegWrite, MemtoReg=0