Logic Synthesis and Verification

Similar documents
Timing Analysis. Andreas Kuehlmann. A k = max{a 1 +D k1, A 2 +D k2,a 3 +D k3 } S k. S j. Required times: S ki. given required times on primary outputs

EECS 427 Lecture 8: Adders Readings: EECS 427 F09 Lecture 8 1. Reminders. HW3 project initial proposal: due Wednesday 10/7

STATIC TIMING ANALYSIS

UNIVERSITY OF CALIFORNIA, BERKELEY College of Engineering Department of Electrical Engineering and Computer Sciences

Chapter 5 CMOS Logic Gate Design

Issues on Timing and Clocking

For smaller NRE cost For faster time to market For smaller high-volume manufacturing cost For higher performance

VLSI Design Verification and Test Simulation CMPE 646. Specification. Design(netlist) True-value Simulator

ECE 407 Computer Aided Design for Electronic Systems. Simulation. Instructor: Maria K. Michael. Overview

Testability. Shaahin Hessabi. Sharif University of Technology. Adapted from the presentation prepared by book authors.

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Lecture 5. MOS Inverter: Switching Characteristics and Interconnection Effects

Lecture 4. Adders. Computer Systems Laboratory Stanford University

Logic Synthesis and Verification

CSE241 VLSI Digital Circuits Winter Lecture 07: Timing II

Hw 6 due Thursday, Nov 3, 5pm No lab this week

Floating Point Representation and Digital Logic. Lecture 11 CS301

VLSI Design. [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1

Fault Modeling. 李昆忠 Kuen-Jong Lee. Dept. of Electrical Engineering National Cheng-Kung University Tainan, Taiwan. VLSI Testing Class

GALOP : A Generalized VLSI Architecture for Ultrafast Carry Originate-Propagate adders

Latches. October 13, 2003 Latches 1

EE241 - Spring 2000 Advanced Digital Integrated Circuits. Announcements

Computer Science 324 Computer Architecture Mount Holyoke College Fall Topic Notes: Digital Logic

Chapter 2 Fault Modeling

Very Large Scale Integration (VLSI)

CPE/EE 422/522. Chapter 1 - Review of Logic Design Fundamentals. Dr. Rhonda Kay Gaede UAH. 1.1 Combinational Logic

Single Stuck-At Fault Model Other Fault Models Redundancy and Untestable Faults Fault Equivalence and Fault Dominance Method of Boolean Difference

Lecture 7: Logic design. Combinational logic circuits

Topics. Dynamic CMOS Sequential Design Memory and Control. John A. Chandy Dept. of Electrical and Computer Engineering University of Connecticut

Efficient Circuit Analysis under Multiple Input Switching (MIS) Anupama R. Subramaniam

Arithmetic Building Blocks

VLSI Design, Fall Logical Effort. Jacob Abraham

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Where are we? Data Path Design

Adders, subtractors comparators, multipliers and other ALU elements

TAU 2014 Contest Pessimism Removal of Timing Analysis v1.6 December 11 th,

CMPEN 411 VLSI Digital Circuits Spring Lecture 14: Designing for Low Power

C.K. Ken Yang UCLA Courtesy of MAH EE 215B

Hardware Design I Chap. 4 Representative combinational logic

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Sciences

ISSN (PRINT): , (ONLINE): , VOLUME-4, ISSUE-10,

CSE140: Components and Design Techniques for Digital Systems. Logic minimization algorithm summary. Instructor: Mohsen Imani UC San Diego

ECE429 Introduction to VLSI Design

Datapath Component Tradeoffs

Adders, subtractors comparators, multipliers and other ALU elements

EE M216A.:. Fall Lecture 5. Logical Effort. Prof. Dejan Marković

9/18/2008 GMU, ECE 680 Physical VLSI Design

EE141-Fall 2011 Digital Integrated Circuits

Digital Integrated Circuits Designing Combinational Logic Circuits. Fuyuzhuo

Optimum Prefix Adders in a Comprehensive Area, Timing and Power Design Space

Lecture 14: Circuit Families

Advanced VLSI Design Prof. A. N. Chandorkar Department of Electrical Engineering Indian Institute of Technology- Bombay

CSE 140 Midterm 3 version A Tajana Simunic Rosing Spring 2015

CPE/EE 427, CPE 527 VLSI Design I L18: Circuit Families. Outline

Where are we? Data Path Design. Bit Slice Design. Bit Slice Design. Bit Slice Plan

Area-Time Optimal Adder with Relative Placement Generator

Computation of Floating Mode Delay in Combinational Circuits: Theory and Algorithms. Kurt Keutzer. Synopsys. Abstract

Vidyalankar S.E. Sem. III [CMPN] Digital Logic Design and Analysis Prelim Question Paper Solution

EE213, Spr 2017 HW#3 Due: May 17 th, in class. Figure 1

University of Minnesota Department of Electrical and Computer Engineering

Computational Boolean Algebra. Pingqiang Zhou ShanghaiTech University

Lecture 16: Circuit Pitfalls

Lecture 5. Logical Effort Using LE on a Decoder

Fundamentals of Computer Systems

Boolean Algebra. Digital Logic Appendix A. Postulates, Identities in Boolean Algebra How can I manipulate expressions?

Skew Management of NBTI Impacted Gated Clock Trees

Logical Effort. Sizing Transistors for Speed. Estimating Delays

CMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING QUESTION BANK

EE40 Lec 15. Logic Synthesis and Sequential Logic Circuits

Dynamic Combinational Circuits. Dynamic Logic

Homework 4 due today Quiz #4 today In class (80min) final exam on April 29 Project reports due on May 4. Project presentations May 5, 1-4pm

S.Y. Diploma : Sem. III [DE/ED/EI/EJ/EN/ET/EV/EX/IC/IE/IS/IU/MU] Principles of Digital Techniques

Section 3: Combinational Logic Design. Department of Electrical Engineering, University of Waterloo. Combinational Logic

Sample Test Paper - I

Combinational Logic Design

Lecture 6: Circuit design part 1

Boolean Algebra. Digital Logic Appendix A. Boolean Algebra Other operations. Boolean Algebra. Postulates, Identities in Boolean Algebra

EE 447 VLSI Design. Lecture 5: Logical Effort

Clock signal in digital circuit is responsible for synchronizing the transfer to the data between processing elements.

EE141Microelettronica. CMOS Logic

Lecture 6: Logical Effort

Problem Set 9 Solutions

Logical Effort: Designing for Speed on the Back of an Envelope David Harris Harvey Mudd College Claremont, CA

TAU 2015 Contest Incremental Timing Analysis and Incremental Common Path Pessimism Removal (CPPR) Contest Education. v1.9 January 19 th, 2015

Digital Integrated Circuits A Design Perspective

Fundamentals of Computer Systems

EE115C Digital Electronic Circuits Homework #6

Homework #2 10/6/2016. C int = C g, where 1 t p = t p0 (1 + C ext / C g ) = t p0 (1 + f/ ) f = C ext /C g is the effective fanout

Chapter 7 Logic Circuits

Bit-Sliced Design. EECS 141 F01 Arithmetic Circuits. A Generic Digital Processor. Full-Adder. The Binary Adder

Lecture 11: Adders. Slides courtesy of Deming Chen. Slides based on the initial set from David Harris. 4th Ed.

CMOS Digital Integrated Circuits Lec 10 Combinational CMOS Logic Circuits

Logarithmic Circuits

Fault Modeling. Fault Modeling Outline

CARNEGIE MELLON UNIVERSITY DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING DIGITAL INTEGRATED CIRCUITS FALL 2002

ARITHMETIC COMBINATIONAL MODULES AND NETWORKS

Combinational Logic. Mantıksal Tasarım BBM231. section instructor: Ufuk Çelikcan

Review for B33DV2-Digital Design. Digital Design

Transcription:

Logic Synthesis and Verification Jie-Hong Roland Jiang 江介宏 Department of Electrical Engineering National Taiwan University Fall Timing Analysis & Optimization Reading: Logic Synthesis in a Nutshell Sections 5 & 6 part of the following slides are by courtesy of Andreas Kuehlmann

Delay Models Model Ak A k = arrival time = max{a,a,a 3 } + D k, where D k is the delay at node k, parameterized according to function f k and fanout node k Model A A A3 Can also have different times for rise time and fall time D k A A A3 Ak A Ak D k D k D k3 A A k = max{a +D k, A +D k,a 3 +D k3 } A3 3 Gate Delay The delay of a gate depends on its circuit context, and in particular:. Output Load Capacitive loading the charges that a gate must move to swing the output voltage Due to interconnect and logic fanout. Input Slew Slew = transition time Slower transistor switching longer delay, longer output slew V in C load An inverter e.g. output T slew = R eff C load R eff C load 4

Rising and Falling Edges Driving strengths of pull-up and pull-down networks may not be equivalent Rising and falling outputs may have different delays R down C load R up Cload Idea: maintain the latest/earliest arrival time of rising and falling transitions independently Unateness of each input/output pair is encoded in the library Positively unate inputs: only trigger output transitions in the same direction (e.g. an AND gate) Negatively unate inputs: only trigger output transitions in the opposite direction (e.g. a NOR gate) A transition on a binate input could trigger either direction on an output (e.g. an XOR gate) Only considers local functionality, but allows a less conservative analysis 5 Timing Library Timing library contains all relevant information about each standard cell E.g., pin direction, clock, pin capacitance, etc. A B Z Delay (fastest, slowest, and often typical) and output slew are encoded for each input-to-output path and each pair of transition directions Values typically represented as dimensional look-up tables (of output load and input slew) Interpolation is used Path( inputports(a), outputports(z), inputtransition(), outputtransition(), delay_table_, output_slew_table_ ); Input slew (ns) delay_table_ Output load (nf).. 4.....6 3.4 6..5.4.9 3.9 7...6 3.4 4. 8...8 3.7 4.9.3 6

Sequential Circuit Arrival times known at l, l, and l 5 (PIs and latch outputs) Required times known at l 3, l 4, and l 5 (Pos and latch inputs) Delay analysis gives arrival and required times (hence slacks) for C, C, C 3, C 4 l C C l4 l C3 l5 C4 l3 7 Arrival Time Calculation // level of PI nodes initialized to, // the others are set to -. // Invoke LEVEL from PO Algorithm LEVEL(k) { // levelize nodes if( k.level!= -) return(k.level) else k.level = +max{level(k i ) k i fanin(k)} return(k.level) } // Compute arrival times: // Given arrival times on PI s Algorithm ARRIVAL() { for L = to MAXLEVEL for {k k.level = L} A k = MAX{A ki } + D k } 8

Required Time Calculation Required time: Given required times on primary outputs Traverse in reverse topological order (i.e. from POs to PIs) If (k i, k) is an edge between k i and k, the required time of this edge is R ki,k = R k -D k The required time of output of node k is R k = min { R k,kj k j fanout(k) } R k k j R j // Compute required times: // Given required times on PO s Algorithm REQUIRED() { for L = MAXLEVEL- to for {k k.level = L} R k = MIN{R k,ki } } k i R ki k i 9 Slack Slack: Slack at the output of node k is S k = R k -A k Since R j,k = R k D k S j,k = R j,k A j S j,k + A j = R k -D k = S k + A k D k Since A k = max {A i, A j } + D k S j,k = S k + max {A fanin(k) } - A j S j = min{s j,fanout(j) } i S k S j k l S j,k S j,l S j j Note: Each edge of a circuit graph has a slack and required time Negative slack is bad

Static Timing Analysis A static critical path of a Boolean network is a path P = {n, n,, n p }, where S nk, nk+ < Note: If a node n is on a static critical path, then at least one of the fanin edges of n is critical. Hence, all critical paths reach from an input to an output. There may be several critical paths Timing optimization is a min-max problem: minimize max{-s i, } Static Timing Analysis Example A = 6 R = 5 A = 5 R = 5 3 4 4 8 A 8 = - 6 R =5 6 - - - 4 9 - - A 9 = R =5 5 7 slack arrival time 3 A = 5 node ID S = - R 3 = 3 S = R 7 = S 3, = - R 9 = - S 4, = - S 4, = S 5, = S 6,3 = S 7,3 = - S 7,4 = - S 7,5 = S 8,6 = S 9,7 = - critical path edges S ki, k = S k + max{a kj } - A ki, k j,k i fanin(k) S k = min{s k, kj }, k j fanout(k)

Static Timing Analysis Problems We want to determine the true critical paths of a circuit in order to: determine the minimum cycle time that a circuit will function correctly identify critical paths for performance optimization - don t want to try to optimize wrong (non-critical) paths Implications: Don t want false paths (produced by static delay analysis) Delay model is a worst case model Need to ensure correctness for case where i th gate delay D i Max 3 Functional Timing Analysis Functional timing analysis estimates when the output of a given circuit gets stable clock Combinational block T 4

Functional Timing Analysis Motivation Timing verification Verifies whether a design meets a given timing constraint E.g., cycle-time constraint Timing optimization Needs to identify critical portion of a design for further optimization Critical path identification In both applications, accurate analysis is desirable 5 Gate-Level Timing Analysis Naïve approach - Simulate all input vectors with SPICE Accurate, but too expensive Gate-level timing analysis (our focus) Less accurate than SPICE due to the level of abstraction, but much more efficient Scenario: Gate/wire delays are pre-characterized (accuracy loss) Perform timing analysis of a gate-level circuit assuming the gate/wire delays 6

Gate-Level Timing Analysis A naive approach is topological analysis Easy longest-path problem Linear in the size of a network Not all paths can propagate signal events False paths If all longest paths are false, topological analysis gives delay over-estimation Functional timing analysis = false-path-aware timing analysis Compute false-path-aware arrival time False path aware arr(z)? z x x arr(x )= arr(x )= 7 Gate-Level Timing Analysis Example -bit carry-skip adder c_in ripple carry adder s a b Length 5 Length s a b mux c_out 8

False Path Analysis Is a path responsible for circuit delay? If the answer is no, can ignore the path for delay computation Check the falsity of long paths until we find the longest true path How can we determine whether a path is a false path? Delay under-estimation is unacceptable Can lead to overlooking timing violation Delay over-estimation is acceptable, but not desirable Topological analysis may yield over-estimation, but never under-estimation 9 False Path Analysis Controlling and non-controlling values Controlled value of AND Controlling value of AND Non-controlling value of AND Controlled value of OR Controlling value of OR Non-controlling value of OR

False Path Analysis Static Sensitization Static sensitization A path is statically-sensitizable if there exists an input vector such that all the side-inputs to the path are of non-controlling values This is independent of gate delays Controlling value! t= t= t= These paths are not statically-sensitizable The longest true path is of length? False Path Analysis Static Sensitization Example The (dashed) path is responsible for delay! Delay under-estimation by static sensitization (delay = when true delay = 3) incorrect condition 3

False Path Analysis Static Sensitization Problem: The idea of forcing noncontrolling values to side inputs is okay, but timing was ignored The same signal can have a controlling value at one time and a non-controlling value at another time How about timing simulation as a correct method? 3 False Path Analysis Timing Simulation 4 3 4 Implies delay = under input vector (,) BUT! 4

False Path Analysis Timing Simulation 4 3 3 4 Implies delay = 4 under the same input (,) as before 5 False Path Analysis Timing Simulation Problem: If gate delays are reduced, delay estimates can increase Not acceptable since Gate delays are just upper-bounds, actual delay is in [,d] Delay uncertainty due to manufacturing We are implicitly analyzing a family of circuits where gate delays are within the upper-bounds 6

False Path Analysis Timing Simulation Definition: Given a circuit C and a delay estimation method delay_estimate, if C* is obtained from C by reducing some gate delays, and delay_estimate(c*) delay_estimate(c), then delay_estimate has monotone speedup property Timing simulation does not have such a property 7 False Path Analysis Timing Simulation 3 4 4 4 X-valued simulation 4 means that the rising signal occurs anywhere between t = - and t = 4. 8

False Path Analysis Timing Simulation Timed 3-valued (,,X) simulation called X-valued simulation monotone speedup property is satisfied Underlying model of floating mode condition Applies to simple gate networks only viability Applies to general Boolean networks 9 False Path Analysis Timing Simulation Checking the falsity of every path explicitly is too expensive due to the exponential number of paths Modern approach:. Start: L = L top = topological longest path delay L old =. Binary search: if (Delay(L)) ( * ) L d = L L old /, L old = L, L = L + L d else L d = L L old /, L old = L, L = L L d if (L > L top or L d < threshold) L = L old, done ( * ) Delay(L) = if there is an input vector under which an output gets stable only at time t where L t? Can be reduced to a SAT or timed-atpg problem 3

SAT-Based False Path Analysis Decision problem: Is there an input vector under which the output gets stable only after t = T? Idea:. Characterize the set of all input vectors S(T) that make the output stable no later than t = T. Check if S(T) contains S (all possible input vectors) Can be solved as a SAT problem: Is S \ S(T) empty? - set difference + emptiness checking Let F and F(T) be the characteristic functions of S and S(T) Is F F(T) satisfiable? 3 SAT-Based False Path Analysis Example d a g b c e f Assume all the PIs arrive at t =, all gate delays = Is the output stable time t >? 3

SAT-Based False Path Analysis Example a b c e d f g Onset: stabilized by t=? g(,t=) : the set of input vectors under which g gets stable to value = no later than t = g(,t=) = d(,t=) f(,t=) = (a(,t=) b(,t=)) (c(,t=) e(,t=)) = a b (c ) = a b c = S(t=) g(,t= ) = onset = a b c = g(,t=) = S 33 SAT-Based False Path Analysis Example a b e d f g c g(,t=) : the set of input vectors under which g gets stable to value = no later than t= g(,t=) = d(,t=) f(,t=) = (a(,t=) b(,t=)) (c(,t=) e(,t=)) = (a+b) + ( c ) = a+b = S(t=) g(,t= ) = offset = a+b+ c = S 34

SAT-Based False Path Analysis Example d g a b c e f Offset: NOT stabilized by t= under abc= g(,t=) : the set of input vectors under which g gets stable to no later than t= g(,t=) = a+b g(,t= ) = offset = a+b+ c g(,t= ) \ g(,t=) = (a+b+ c) (a+b) = a b c satisfiable 35 Timing Analysis Summary False-path-aware arrival time analysis is wellunderstood Practical algorithms exist Remaining problems Incremental analysis (make it so that a small change in the circuit does not make the analysis start all over) Integration with logic optimization Clock domain crossing issues 36

Timing Optimization Several factors affecting circuit delay: Technology Design type (e.g. domino, static CMOS, etc.) Gate type Gate size Circuit structure Length of computation paths False paths Buffering Electrical parasitics Wire loads Layout 37 Timing Optimization Problem statement Given: Initial circuit function description Library of primitive functions Performance constraints (arrival/required times) Generate: An implementation of the circuit using the primitive functions, such that: performance constraints are met circuit area is minimized 38

Timing Optimization Design flow Behavioral description Logic and latches Logic equations Gate netlist Layout Behavior optimization (scheduling) RTL synthesis Logic synthesis Technology independent Technology mapping Gate library Timing constraints Delay models Timing-driven place and route 39 Timing Optimization Buffered circuit structure function tree buffer tree 4

Timing Optimization Circuit restructuring Reschedule operations to reduce computation time Timing-driven technology mapping Selection of gates from library Minimum delay Similar to area minimization under the load independent model Minimize delay and area Implementation of buffer trees Gate/wire sizing Constant delay synthesis 4 Timing Optimization Circuit restructuring Global: Reduce depth of entire circuit Partial collapsing Boolean simplification Local: Mimic optimization techniques in adders Carry lookahead (THR tree-height reduction) Conditional sum (GST transformation) Carry bypass (GBX transformation) 4

Timing Optimization Circuit restructuring Performance measured by logic levels, sensitisable paths, technology-dependent delays Level-based optimization Tree height reduction Partial collapsing and simplification Generalized select transform Sensitisable path based optimization Generalized bypass transform 43 Timing Optimization Tree height reduction 6 5 n 5 l m 4 i j 3 h k critical region i collapsed critical region n j m 4 3 h 5 duplicated logic k a bc d e f g a b c d e f g 44

Timing Optimization Tree height reduction i Collapsed Critical region n j m 4 3 h 5 Duplicated logic k i 4 n 3 j New delay = 5 5 m 4 k 3 h a b c d e f g a b c d e f g 45 Timing Optimization Generalized bypass transform f m =f f m+ f n =g f m =f f m+ f n =g g Boolean difference dg df s-a- redundant Make critical path false to bypass critical logic and speed up circuit 46

Timing Optimization False path removal by logic duplication f m+k is the last node on the false path that fans out f m+ f m+ f m+k f n f m+k+ f m+ f m+ f m+k delay not increased f m+ f m+ f m+k f n f m+k+ controlling value of f m+ 47 Timing Optimization Generalized select transform a out b c d e f g a= b c d e f g out a= b c d e f g Late signal feeds multiplexor a 48

Timing Optimization Summary There are various methods for delay optimization at different synthesis stages No single technique dominates Most techniques (except false path removal by logic duplication) ignore false paths when assessing the delay and critical regions 49