On Potential Design Impacts of Electromigration Awareness

Similar documents
Problem Formulation for Arch Sim and EM Model

CSE241 VLSI Digital Circuits Winter Lecture 07: Timing II

Methodology From Chaos in IC Implementation

Reliability Breakdown Analysis of an MP-SoC platform due to Interconnect Wear-out

Skew Management of NBTI Impacted Gated Clock Trees

Interconnect s Role in Deep Submicron. Second class to first class

DKDT: A Performance Aware Dual Dielectric Assignment for Tunneling Current Reduction

PARADE: PARAmetric Delay Evaluation Under Process Variation *

Post-Routing Back-End-Of-Line Layout Optimization for Improved Time-Dependent Dielectric Breakdown Reliability

System-Level Modeling and Microprocessor Reliability Analysis for Backend Wearout Mechanisms

Smart Non-Default Routing for Clock Power Reduction

Interconnect Lifetime Prediction for Temperature-Aware Design

Variation-aware Clock Network Design Methodology for Ultra-Low Voltage (ULV) Circuits

Skew Management of NBTI Impacted Gated Clock Trees

Chapter 8. Low-Power VLSI Design Methodology

Lecture 2: CMOS technology. Energy-aware computing

SI for Free: Machine Learning of Interconnect Coupling Delay and Transition Effects

Very Large Scale Integration (VLSI)

Lecture 16: Circuit Pitfalls

Timing-Aware Decoupling Capacitance Allocation in Power Distribution Networks

Design for Variability and Signoff Tips

An Efficient Transient Thermal Simulation Methodology for Power Management IC Designs

CSE493/593. Designing for Low Power

Circuit Delay Variability Due to Wire Resistance Evolution Under AC Electromigration

VLSI Design I; A. Milenkovic 1

DC and Transient. Courtesy of Dr. Daehyun Dr. Dr. Shmuel and Dr.

Static Electromigration Analysis for Signal Interconnects

Reliability-aware Thermal Management for Hard Real-time Applications on Multi-core Processors

Lecture 5: DC & Transient Response

The Wire. Digital Integrated Circuits A Design Perspective. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. July 30, 2002

Lecture 5: DC & Transient Response

Buffered Clock Tree Sizing for Skew Minimization under Power and Thermal Budgets

Energy Delay Optimization

Optimum Prefix Adders in a Comprehensive Area, Timing and Power Design Space

TAU 2014 Contest Pessimism Removal of Timing Analysis v1.6 December 11 th,

INTEGRATED CIRCUITS. For a complete data sheet, please also download:

MODULE III PHYSICAL DESIGN ISSUES

Lecture 12 CMOS Delay & Transient Response

A Novel Flow for Reducing Clock Skew Considering NBTI Effect and Process Variations

PARADE: PARAmetric Delay Evaluation Under Process Variation * (Revised Version)

NTE4035B Integrated Circuit CMOS, 4 Bit Parallel In/Parallel Out Shift Register

Lecture 15: Scaling & Economics

T st Cost Reduction LG Electronics Lee Y, ong LG Electronics 2009

MM74C73 Dual J-K Flip-Flops with Clear and Preset

EE5780 Advanced VLSI CAD

EE 447 VLSI Design. Lecture 5: Logical Effort

Spiral 2 7. Capacitance, Delay and Sizing. Mark Redekopp

(a) (b) Fig. 1. Transition propagation in (a) GBA mode and (b) PBA mode.

Logic Synthesis and Verification

Logical Effort: Designing for Speed on the Back of an Envelope David Harris Harvey Mudd College Claremont, CA

Research Challenges and Opportunities. in 3D Integrated Circuits. Jan 30, 2009

EECS 427 Lecture 11: Power and Energy Reading: EECS 427 F09 Lecture Reminders

Reliability-Constrained Die Stacking Order in 3DICs Under Manufacturing Variability

Lecture 6: DC & Transient Response

Design for Manufacturability and Power Estimation. Physical issues verification (DSM)

INTEGRATED CIRCUITS. 74LVC138A 3-to-8 line decoder/demultiplexer; inverting. Product specification 1998 Apr 28

Website: ECE 260B CSE 241A Parasitic Estimation 1

Issues on Timing and Clocking

Role of Computer Experiment

A Method to Accelerate SoC Implementation Cycle by Automatically Generating CDC constraints

Chapter 2 Fault Modeling

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis

NTE40160B, NTE40161B NTE40162B, NTE40163B Integrated Circuit CMOS, Synchronous Programmable 4 Bit Counters

Errata of K Introduction to VLSI Systems: A Logic, Circuit, and System Perspective

Standard & Canonical Forms

Power Consumption in CMOS CONCORDIA VLSI DESIGN LAB

Lecture 8: Logic Effort and Combinational Circuit Design

CD4021BC 8-Stage Static Shift Register

Efficient Selection and Analysis of Critical-Reliability Paths and Gates

CD4027BC Dual J-K Master/Slave Flip-Flop with Set and Reset

Tradeoff between Reliability and Power Management

TAU 2015 Contest Incremental Timing Analysis and Incremental Common Path Pessimism Removal (CPPR) Contest Education. v1.9 January 19 th, 2015

Lecture 21: Packaging, Power, & Clock

CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues

Lecture 9: Clocking, Clock Skew, Clock Jitter, Clock Distribution and some FM

Low Power Design Methodologies and Techniques: An Overview

ECE429 Introduction to VLSI Design

74LV393 Dual 4-bit binary ripple counter

EECS 151/251A Spring 2018 Digital Design and Integrated Circuits. Instructors: Nick Weaver & John Wawrzynek. Lecture 10 EE141

Static Timing Analysis Considering Power Supply Variations

ECE 407 Computer Aided Design for Electronic Systems. Simulation. Instructor: Maria K. Michael. Overview

Static Electromigration Analysis for On-Chip Signal Interconnects

EE241 - Spring 2000 Advanced Digital Integrated Circuits. Announcements

Lecture 5 Fault Modeling

The complexity of VLSI power-delay optimization by interconnect resizing

Timing Analysis with Clock Skew

5.0 CMOS Inverter. W.Kucewicz VLSICirciuit Design 1

Performance and Variability Driven Guidelines for BEOL Layout Decomposition with LELE Double Patterning

Toward More Accurate Scaling Estimates of CMOS Circuits from 180 nm to 22 nm

Physical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006

MM74C90 MM74C93 4-Bit Decade Counter 4-Bit Binary Counter

74LVC823A 9-bit D-type flip-flop with 5-volt tolerant inputs/outputs; positive-edge trigger (3-State)

INTEGRATED CIRCUITS. For a complete data sheet, please also download:

Lecture 23. Dealing with Interconnect. Impact of Interconnect Parasitics

Optimal Reliability-Constrained Overdrive Frequency Selection In Multicore Systems

Quality and Reliability Report

3. Design a stick diagram for the PMOS logic shown below [16] Y = (A + B).C. 4. Design a layout diagram for the CMOS logic shown below [16]

Enhancing Multicore Reliability Through Wear Compensation in Online Assignment and Scheduling. Tam Chantem Electrical & Computer Engineering

NTE74HC165 Integrated Circuit TTL High Speed CMOS, 8 Bit Parallel In/Serial Out Shift Register

Effects of electrical, thermal and thermal gradient stress on reliability of metal interconnects

Transcription:

On Potential Design Impacts of Electromigration Awareness Andrew B. Kahng, Siddhartha Nath and Tajana S. Rosing VLSI CAD LABORATORY, UC San Diego UC San Diego / VLSI CAD Laboratory -1-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. F max Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions -2-

Electromigration in Interconnects Electromigration (EM) is the gradual displacement of metal atoms in an interconnect I avg causes DC EM and affects power delivery networks I rms causes AC EM and affects clock and logic signals -3-

EM Lifetime EM degrades interconnect lifetime Black s Equation calculates lifetime of interconnect segment due to EM degradation t 50 = A J n ee a kt t 50 median time to failure (= log e 2 x MTTF) A* geometry-dependent constant J current density in interconnect segment n constant ( = 2) E a activation energy of metal atoms k Boltzmann s constant T temperature of the interconnect -4-

Parameters Affecting EM MTTF W wire Driver size Fanout J rms V dd MTTF Freq α Temp A A B Inverse relation; if A increases then B decreases B Direct relation; if A increases then B increases Design parameters Runtime parameters -5-

Current Density (MA/cm 2 ) Current Density (MA/cm 2 ) Why Is EM Important Now? ITRS 2011 data shows that EM will be a significant reliability issue Physical 3 design teams trade off 2.5 performance and/or resources to meet EM MTTF 2 2.5 1.5 What values of MTTF do we really need? 2 1 In the US, people replace MTTF 1.5 0.5 10 Cell phones every 2 years 2004 2008 2012 2016 2020 0.5 Laptops every 3 5 years Year 0 Servers every 3 7 years 10 9 8 7 6 5 4 3 2 1 Devices can be designed with small EM lifetimes MTTF (years) J rms -6-

Examples of EM Guardband W wire Driver size Fanout J rms V dd MTTF Freq α Temp To meet EM MTTF margin at given wire width upper bound Reduce J rms reduce driver size slower circuit To meet EM MTTF margin at given performance requirement Increase W wire increase capacitance, dynamic power -7-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. F max Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions -8-

To Meet EM Lifetime Requirements Three major categories of prior work EM MTTF modeling Black69 (Black s Equation) Liew89 (AC lifetime models) Lu07 and Wu12 (Joule heating) Architecture changes to mitigate EM Srinivasan04 (RAMP) Romanescu08 (core cannibalization) Synthesis and physical design (PD) techniques to reduce current density violations Dasgupta96 (limit J rms violation at synthesis) Jerke04 (limit J rms violation at PD) Lienig03 (post-route J rms fixes) -9-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. F max Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions -10-

Key Idea We quantify impact of EM guardband on performance (F max ), area and power [Black s Equation] MTTF = A WH 2 e E a kt I rms 2 (I rms,limit ) 2 = (I rms,default ) 2 x MTTF default /MTTF reduced F max Area / Power Decrease MTTF reduced increase I rms,limit We study impacts on F max, area and power -11-

Approach We conduct two studies 1. MTTF vs. F max tradeoffs with fixed resource budget 2. MTTF vs. resources tradeoffs with fixed performance requirement Assumptions 10 years = example default EM MTTF Six testcases Report three representative (AES, DMA, JPEG) -12-

Key Contributions We are the first to quantify impacts of EM guardband on performance and resources by using PD flows We introduce EM slack as an accurate measure of potential performance improvements in different circuits at reduced MTTF requirements Black s Equation cannot accurately quantify the impacts of EM-awareness in circuits We study how tightness vs. looseness of timing constraints determine area and power trends at reduced MTTF Our study flow/methodology can potentially be used by architects and front-end designers to improve performance at no area cost physical designers whose levers are conventional SI and EM fixing methods -13-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. F max Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions -14-

EM Slack When EM violations occur I rms,net = C load V dd α F max 1 + 1 t rise t fall > I rms,limit Black s Equation MTTF = A WH 2 I rms 2 e E a kt Theoretical limit of I rms,net I rms,net I rms,limit MTTF default MTTFreduced Basic Concept: EM slack of a net (units: ma) EM slack,net = I rms,net I rms,limit I rms,limit MTTF default MTTF reduced 1-15-

Significance of EM Slack Positive EM slack potential for improved F max If EM slack > 0, a part of it can be used to increase I rms,limit by reducing MTTF (from Black s Equation), and improve F max by using SP&R knobs (e.g., gate sizing) without causing EM violations I rms,net = C load V dd α F max 1 + 1 t rise t fall > I rms,limit -16-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. F max Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions -17-

Study 1: MTTF vs. F max Study MTTF vs. F max tradeoffs given upper bounds on area, temperature and #EM violations Setup Three testcases: AES, DMA and JPEG Two technology libraries: TSMC 45GS and 65GPLUS Upper bounds temperature = 378 K area = 66% utilization #EM violations = 25 Synopsys DesignCompiler and Cadence SOC Encounter flows Thermal analysis using Hotspot -18-

Automated Flow to Determine F max RTL LIB SDC Synthesis (DC) NETLIST MTTF reduced AR Tech. LEF α PI LEF Place and Route (SOCE) Check constraints UTIL tar -- timing slack -- peak temperature -- UTIL eff -- #EM violations Met all constraints? Y Increase frequency N Decrease frequency Frequency tried before? N Y EXIT -19-

Automated Flow to Determine F max RTL LIB SDC Synthesis (DC) NETLIST MTTF reduced AR Tech. LEF α PI LEF Place and Route (SOCE) Check constraints UTIL tar -- timing slack -- peak temperature -- UTIL eff -- #EM violations Met all constraints? Y Increase frequency N Decrease frequency Frequency tried before? N Y EXIT -20-

Derating LEF We derate current density limits in technology Library Exchange Format (LEF) file Reduced lifetime (ratio > 1) increases J rms limit Reduced toggle rate (ratio > 1) increases J rms limit Increased maximum temperature increases J rms limit -21-

Automated Flow to Determine F max RTL LIB SDC Synthesis (DC) NETLIST MTTF reduced AR Tech. LEF α PI LEF Place and Route (SOCE) Check constraints UTIL tar -- timing slack -- peak temperature -- UTIL eff -- #EM violations Met all constraints? Y Increase frequency N Decrease frequency Frequency tried before? N Y EXIT -22-

Binary Search for F max Increase frequency by step until some constraint is violated Perform binary search between the current F and the last feasible F to find F max -23-

Automated Flow to Determine F max RTL LIB SDC Synthesis (DC) NETLIST MTTF reduced AR Tech. LEF α PI LEF Place and Route (SOCE) Check constraints UTIL tar -- timing slack -- peak temperature -- UTIL eff -- #EM violations Met all constraints? Y Increase frequency N Decrease frequency Frequency tried before? N Y EXIT -24-

Flow to Fix EM Violations Netlist from F max determination flow Group nets depending on the extent of I rms,limit violations Y return N N Timing met? Create nondefault rules (NDR) for each net group Perform timing analysis Decrease fanout Perform ECO route All EM violations fixed? N Downsize drivers Y return Y -25-

Automated Flow to Determine F max RTL LIB SDC Synthesis (DC) NETLIST MTTF reduced AR Tech. LEF α PI LEF Place and Route (SOCE) Check constraints UTIL tar -- timing slack -- peak temperature -- UTIL eff -- #EM violations Met all constraints? Y Increase frequency N Decrease frequency Frequency tried before? N Y EXIT -26-

% Increase in Performance Observation 1 90% 80% 70% 60% 50% 40% 30% 20% 10% 0% DMA AES JPEG 45nm 10 9 8 7 6 5 4 3 2 1 F max scaling is not uniform across designs and at reduced MTTF as suggested by Black s Equation F max scaling is determined by the EM slack in each design at each MTTF requirement Large F max improvements may be setup artifacts -27-

% of EM violations % EM slack Observation 2 90% 120% 80% 100% 70% 60% 80% 50% 60% 40% 30% 40% 20% 20% 10% DMA AES JPEG DMA AES JPEG EM slack (not timing slack) limits performance scaling 0% 10 10 9 9due 8 8 to 7 7AC 6 EM 6 5 54 43 23 12 1 EM slack determines F max at fixed resources % of positive EM slack is usable to improve F max by reducing MTTF requirement EM violations in critical paths lead to positive EM slack -28-

% of target utilization EM AREA TEMP Observation 3 105% DMA AES JPEG 100% 95% 90% 85% 80% 10 9 8 7 6 5 4 3 2 1 Area and temperature can be dominating constraints 75% at lower MTTF requirements 10 9 8 7 6 5 4 3 2 1 Area limits F max scaling for MTTF 7 years (DMA) Area upper bounds are violated for MTTF 6 years; Temperature upper bounds are violated for MTTF 3 years -29-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. F max Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions -30-

Study 2: MTTF vs. Area, Power Study MTTF vs. area and power tradeoffs at a fixed performance requirement Setup DMA at 2000 MHz (2ps slack after SP&R at 45nm) AES at 1100 MHz (1.6ps slack after SP&R at 45nm) JPEG at 850 MHz (93ps slack after SP&R at 45nm) Two technology libraries: TSMC 45GS and 65GPLUS -31-

Power Area (µm (mw) 2 ) Observation 4 48980 36.5 48960 36.4 36.3 48940 36.2 48920 36.1 48900 36 48880 35.9 35.8 48860 35.7 48840 35.6 48820 35.5 48800 35.4 0 0 1 1 22 33 44 5 6 7 8 9 10 Large positive timing slack at MTTF = 10 years can lead to smaller area when MTTF requirement is reduced Large positive timing slack at MTTF = 10 years can lead to smaller power when MTTF requirement is reduced -32-

Power Area (µm (mw) 2 ) Observation 5 16.8 13740 16.6 13720 16.4 13700 16.2 13680 16 15.8 13660 15.6 13640 15.4 13620 15.2 13600 15 0 0 1 1 2 2 3 3 44 55 6 7 8 9 10 10 Area and power can decrease as MTTF requirement is reduced for designs with loose timing constraints Small positive timing slack at MTTF = 10 years can lead to increase in area as MTTF requirement is reduced Small positive timing slack at MTTF = 10 years can lead to increase in power as MTTF requirement is reduced -33-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. F max Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions -34-

Conventional EM Fixes and MTTF Study how conventional SI and EM fixing methods affect area and performance at reduced MTTF requirements. Setup Sweep MTTF from 10 years down to 1 year Apply per-net NDRs, driver downsizing and fanout reduction fixes Study using AES, JPEG and DMA testcases Two technology libraries: TSMC 45GS and 65GPLUS Insights are very instance-, technology/libraryand flow-specific -35-

Observation 6 3% 2.5% 2% 1.5% 1% 0.5% Fmax Area 0% AES JPEG DMA Fixing EM violations using NDRs can be effective in improving F max only till MTTF = 7 years % increase in F max is less than 5% % increase in area is ~2% -36-

Observation 7 3% 3.5% 3% 2.5% 2.5% 2% 2% 1.5% 1.5% 1% 1% 0.5% 0.5% 0% 0% Fmax Area 8 11 14 17 20 23 26 29 Fanout 12 16 20 24 28 32 Driver size NDRs can be more effective knobs to increase F max with less increase in area Fanout reductions to fix EM can increase F max by 3% at the cost of 1.86% increase in area Drive downsizing to fix EM can increase F max by 2.5% at the cost of 2% increase in area -37-

Outline Motivation Previous Work Our Work Preliminaries Study 1: MTTF vs. F max Study 2: MTTF vs. Area, Power Insights on Conventional EM Fixes Conclusions -38-

Conclusions We study and quantify potential impacts of improved EM-awareness in designs through two basic studies Our key observations Study 1: Available performance scaling (up to 80%) from MTTF reduction is dependent on EM slack Study 2: Area and power can decrease when MTTF is reduced in designs with loose timing constraints Additional studies: NDRs can be more effective in increasing performance ~5% at the cost of 2% increase in area for MTTF up to 7 years Ongoing work EM reliability requirements in multiple operating modes Combined impacts of EM and other back end of the line reliability mechanisms on interconnect lifetime -39-

Acknowledgments Work supported by IMPACT, SRC, NSF, Qualcomm Inc. and NXP Semiconductors -40-

Thank You! -41-

Backup -42-

Hotspot Setup We use Hotspot5.0 calibrated with thermal package from Qualcomm Inc. We perform two kinds of modeling Without heat spread and heat sink when profiling single block of AES, JPEG or DMA (area in µm 2 ) With heat spreader and heat sink when profiling 50x50 blocks of AES, JPEG, or DMA in an area of ~5mm 2 We get same values of temperature for a single block from both these methods -43-