Dynamic Adaptation for Resilient Integrated Circuits and Systems

Krishnendu Chakrabarty
Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA
Department of Computer and Information Science and Engineering, National Cheng Kung University, Tainan, Taiwan

Acknowledgments
Fangming Ye, PhD student at Duke University
Farshad Firouzi, PhD student at Karlsruhe Institute of Technology, Germany
Prof. Mehdi Tahoori, Karlsruhe Institute of Technology, Germany
Sponsor: Semiconductor Research Corporation (SRC)

Motivation
Design-time solutions and guard-bands for resilience are no longer sufficient for nanoscale ICs
  Process variations: each chip is born with a unique personality ("nature")
  Operating conditions, environment, and workload: each chip grows uniquely ("nurture")
Need: guarantee that each system, despite different nature and nurture, has acceptable behavior ("resilience")
Resilience: persistence of a performance level that can justifiably be trusted in the presence of change

Motivation (Contd.)
Dynamic health monitoring of ICs and systems
Reuse additional on-chip sensors
  Voltage sensors, temperature sensors, etc.
Reduce the number of sensors needed for monitoring
Learn, predict, and adapt

Outline
Background
Selection of representative critical paths
Path encoding and feature identification
Dynamic RCP monitoring
Results
Conclusions

Circuit Degradation
Transistor aging: NBTI (PBTI)
  Degrades path delay
  Timing failures
  Increased soft errors
Degradation rate varies due to voltage, temperature, workload, and process variations
[Figure: failure rate versus time (bathtub curve): early-life failure (burn-in), normal lifetime, wearout (aging)]

Prior Work
Periodic delay testing [Baba VTS 09]: interrupts normal execution of the system
In-situ sensors [Agarwal VTS 07]: prohibitively large number of sensors
Replica circuits [Tschanz VTS 09]: fail to cover the entire circuit
Representative critical paths [Xie DAC 10, Wang ICCAD 12]: static, coarse-grained

Proposed Approach
Leverage additional sources (e.g., power, temperature, process variation) for path-delay estimation
  [Figure: chip monitoring with temperature (°C) and voltage (V) sensors around the core logic]
Infer the delays of a larger pool of paths based on a small set of paths (RCPs)
  [Figure: all paths, described by the gates, voltage grid, and temperature grid each path passes through and by its path delay, feed RCP selection; the RCP path delays are then used to infer the rest]
Dynamic delay-estimation model based on features
  [Figure: early stage: RCP model and adaptation action A; later stage: updated RCP model and updated adaptation action A]

Chip Features
Topological feature: gate types and locations of the gates of a path in the layout
Process-variation feature: captures die-to-die, intra-die, and random variations
Temperature feature: reflects the temperature grids through which the path passes
Voltage feature: reflects the voltage grids through which the path passes
BTI feature: reflects the BTI grids through which the path passes
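
A minimal sketch of how a grid-based feature (temperature, voltage, or BTI) could be encoded, assuming the chip is divided into a uniform grid and the feature is a binary vector with one entry per grid cell. The function name `grid_feature`, the coordinate convention, and the 10 x 10 grid are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def grid_feature(gate_xy, chip_size, grid_shape):
    """Binary vector with a 1 for every grid cell the path passes through.

    gate_xy    : (x, y) positions of the gates on the path (hypothetical input)
    chip_size  : (width, height) of the chip, same units as gate_xy
    grid_shape : (rows, cols) of the grid
    """
    rows, cols = grid_shape
    width, height = chip_size
    feat = np.zeros(rows * cols, dtype=int)
    for x, y in gate_xy:
        # Clamp so that gates on the right/top edge fall in the last cell.
        c = min(int(x / width * cols), cols - 1)
        r = min(int(y / height * rows), rows - 1)
        feat[r * cols + c] = 1
    return feat

# Three gates on a 10 x 10 chip with a 10 x 10 grid (100 features).
path = [(0.5, 0.5), (9.9, 9.9), (10.0, 0.0)]
f = grid_feature(path, chip_size=(10.0, 10.0), grid_shape=(10, 10))
print(np.flatnonzero(f))  # indices of the occupied grid cells
```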

Topological Features
Describe a path using the locations and types of the gates in the path
[Figure: layout grid in which each cell is labeled with the gate type on the path (AND, OR) or with 0]

Process Variation Feature

Temperature/Voltage/BTI Feature
Workload on each device is translated to:
  Voltage
  Temperature
  BTI model

Voltage Feature
All gates within the same grid cell load the same power

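Temperature Feature
[Figure: thermal equivalent circuit of the package stack (chip, solder, polymer, substrate, gel, heat sink) with thermal resistances R_solder, R_polymer, R_sub, R_gel, and R_sink between the power source and T_ambient]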
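
A minimal sketch of the steady-state temperature implied by such a thermal equivalent circuit, assuming the five resistances are connected in series between the chip and the ambient. The series topology, the function name, and all numeric values are assumptions made for illustration; the slide gives only the names of the layers and resistances.

```python
def chip_temperature(power_w, t_ambient_c, resistances_k_per_w):
    """Steady-state temperature of a series thermal chain.

    T_chip = T_ambient + P * (sum of thermal resistances)
    """
    return t_ambient_c + power_w * sum(resistances_k_per_w.values())

# Hypothetical values, chosen only to make the arithmetic easy to follow.
stack = {
    "R_solder": 0.5,
    "R_polymer": 1.0,
    "R_sub": 1.5,
    "R_gel": 0.5,
    "R_sink": 1.5,
}
print(chip_temperature(power_w=2.0, t_ambient_c=25.0, resistances_k_per_w=stack))  # 35.0
```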

Path Encoding
Information about a path (topological, process-variation, temperature, voltage, and BTI features) determines the path delay
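
A minimal sketch of path encoding, assuming each path is encoded by concatenating its feature blocks into one row of the matrix A. The function name `encode_path` and the block sizes in the usage example are placeholders for illustration.

```python
import numpy as np

def encode_path(topological, process_variation, temperature, voltage, bti):
    """Concatenate the per-path feature blocks into one row of the matrix A."""
    blocks = [topological, process_variation, temperature, voltage, bti]
    return np.concatenate([np.asarray(b, dtype=float) for b in blocks])

# Placeholder block sizes; the BTI block size in particular is hypothetical.
row = encode_path(np.zeros(54), np.zeros(400), np.zeros(100),
                  np.zeros(400), np.zeros(100))
print(row.shape)  # (1054,)

# Stacking one encoded row per critical path gives the matrix A (m paths x n features).
A = np.vstack([row, row])
print(A.shape)  # (2, 1054)
```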

RCP Selection
[Figure: the measured delays of the representative critical paths are mapped through a transformation matrix to the delays of all critical paths]

SVD-QRcp
Singular value decomposition (SVD)
  Purpose: estimate the number of RCPs
QR factorization with column pivoting (QRcp)
  Purpose: rank the CPs and provide the first few RCPs to be selected
[Figure: SVD followed by QRcp yields the representative critical paths]

Singular Value Decomposition (SVD)
Given: m x n matrix A
[Figure: each row of A encodes one path (topological, process-variation, temperature, voltage, and BTI features); D holds the corresponding path delays]

Singular Value Decomposition (SVD)
Given: m x n matrix A = U L V^T
  U: m x r column-orthogonal matrix
  L: r x r diagonal matrix of singular values (non-negative, in descending order)
  V: n x r column-orthogonal matrix

QR Factorization with Column Pivoting (QRcp)
Given: m x r matrix U = Q R P^T
  Q: orthogonal matrix
  R: upper-triangular matrix
  P: permutation matrix
Iteration: calculate the norms of columns 1 to n, then swap the max-norm column into the leading position (e.g., swap columns 1 and 3), giving the permutation P = [3 2 1 4 ..]
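
The two steps can be sketched as follows, under stated assumptions: rows of A are paths, the number of RCPs r is taken as the smallest count of singular values that carries a chosen share of the squared singular values (the 95% threshold is hypothetical), and the pivoting is applied to the transpose of the first r left singular vectors so that each pivot corresponds to a path. The greedy max-norm pivoting below is a plain NumPy stand-in for a library QRcp routine, and the function names are illustrative.

```python
import numpy as np

def estimate_num_rcps(A, energy=0.95):
    """Smallest r whose leading singular values carry `energy` of sum(s**2)."""
    s = np.linalg.svd(A, compute_uv=False)
    cum = np.cumsum(s**2) / np.sum(s**2)
    return min(int(np.searchsorted(cum, energy)) + 1, len(s))

def pivoted_columns(M, k):
    """Greedy column pivoting: repeatedly pick the max-norm residual column."""
    R = np.array(M, dtype=float)
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(R, axis=0)
        if chosen:
            norms[chosen] = -1.0          # never pick the same column twice
        j = int(np.argmax(norms))
        if norms[j] <= 1e-12:             # nothing independent is left
            break
        chosen.append(j)
        q = R[:, j] / norms[j]
        R -= np.outer(q, q @ R)           # remove the chosen direction
    return chosen

def select_rcps(A, energy=0.95):
    """Return the row indices of A (paths) chosen as RCPs."""
    r = estimate_num_rcps(A, energy)
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return pivoted_columns(U[:, :r].T, r)

# Usage on a synthetic rank-3 matrix with 10 paths and 6 features.
rng = np.random.default_rng(0)
A_demo = rng.normal(size=(10, 3)) @ rng.normal(size=(3, 6))
print(select_rcps(A_demo, energy=0.9999))
```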

Transformation Matrix
Use the delays of the RCP set and the transformation matrix to infer the delays of all paths
[Figure: the transformation matrix T_S maps the measured delays D of the RCP set to the estimated delays D' of the CP set A]
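
One way such a transformation matrix could be built is sketched below, assuming a linear delay model D = A x and a least-squares fit of every path's feature row from the RCP rows. The slides do not give the construction of T_S, so the pseudoinverse formula and the function names are assumptions.

```python
import numpy as np

def transformation_matrix(A, rcp_idx):
    """T such that A is approximately T @ A[rcp_idx] (least-squares fit)."""
    A_R = A[rcp_idx]
    return A @ np.linalg.pinv(A_R)        # shape: (number of paths, number of RCPs)

def infer_delays(T, d_rcp):
    """Estimate the delays of all paths from the measured RCP delays."""
    return T @ d_rcp

# Synthetic check: 10 paths, 6 features, rank 3, exact linear delay model.
rng = np.random.default_rng(1)
A_demo = rng.normal(size=(10, 3)) @ rng.normal(size=(3, 6))
d_true = A_demo @ rng.normal(size=6)
rcp_idx = [0, 1, 2]                       # hypothetical RCP choice
T = transformation_matrix(A_demo, rcp_idx)
d_est = infer_delays(T, d_true[rcp_idx])
print(np.max(np.abs(d_est - d_true)))     # close to zero when the RCP rows span A
```

When the RCP rows do not span the row space of A, the same code still runs but the estimate is only approximate, which is the gap the run-time mitigation step is meant to reduce.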

Example
Given a set of critical paths (matrix A) and the corresponding path delays (vector D):
        A                      D
  p1:   1 0 1 0 1 0 1 0 1      d1 = 11
  p2:   1 1 1 0 0 0 0 1 1      d2 = 8
  p3:   0 1 1 1 1 1 0 0 0      d3 = 9
  p4:   1 1 0 0 0 0 0 1 0      d4 = 4
  p5:   0 0 0 1 0 1 1 1 1      d5 = 8
  p6:   1 0 0 0 1 1 0 1 1      d6 = 9
  p7:   0 1 0 0 0 1 0 1 1      d7 = 5
  p8:   1 0 1 1 1 0 1 0 1      d8 = 13

Example (Contd.)
SVD-QRcp method (Step 1: SVD): A = U S V^T
S determines the number of RCPs
  Diagonal of S:   4.65, 2.48, 2.00, 1.80, 1.30, 0.94, 0.52
  Diagonal of S^2: 21.6, 6.15, 4.00, 3.24, 3.90, 0.88, 0.25, 0.29, 0.09
5 RCPs contain most of the information of A
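
A minimal sketch of Step 1 on the example, assuming the 8 x 9 matrix is read row by row from the slide (one row per path p1 to p8). It prints each singular value with the cumulative share of the squared singular values, which is one hedged way to judge how many RCPs "contain most of the information"; the slide does not state the exact criterion.

```python
import numpy as np

# Example matrix A, read row-wise (assumption about the slide layout).
A = np.array([
    [1, 0, 1, 0, 1, 0, 1, 0, 1],
    [1, 1, 1, 0, 0, 0, 0, 1, 1],
    [0, 1, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 1, 1],
    [1, 0, 1, 1, 1, 0, 1, 0, 1],
], dtype=float)

s = np.linalg.svd(A, compute_uv=False)        # singular values, descending
energy = np.cumsum(s**2) / np.sum(s**2)       # cumulative share of s**2

for k, (sv, e) in enumerate(zip(s, energy), start=1):
    print(f"{k}: singular value {sv:.2f}, cumulative share {e:.1%}")
```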

Example (Contd.)
SVD-QRcp method (Step 2: QRcp): QR = UE
E is the permutation matrix that we need
        0 0 1 0 0 0 0 0
        0 0 0 0 0 1 0 0
        0 0 0 0 0 0 0 1
  E =   0 1 0 0 0 0 0 0
        1 0 0 0 0 0 0 0
        0 0 0 0 1 0 0 0
        0 0 0 0 0 0 1 0
        0 0 0 1 0 0 0 0
Selected RCP set: P5, P4, P1, P8, P6 (the first five pivots: 5, 4, 1, 8, 6)

C-means Clustering
[Figure: the CP set A is partitioned into clusters (Cluster 1, Cluster 2, Cluster 3); the clustering yields the estimated delays D' for the CP set A]
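
A minimal sketch of fuzzy C-means clustering in plain NumPy, assuming the standard formulation with fuzzifier m = 2 and Euclidean distance between path feature vectors. The slides do not specify the variant, initialization, or stopping rule, so those choices and the function name are assumptions.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Cluster the rows of X into c fuzzy clusters.

    Returns (W, centers): W has shape (c, n) and each column sums to 1.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    W = rng.random((c, X.shape[0]))
    W /= W.sum(axis=0, keepdims=True)              # random valid memberships
    for _ in range(iters):
        Wm = W ** m
        centers = (Wm @ X) / Wm.sum(axis=1, keepdims=True)
        # Distance from every center to every point, shape (c, n).
        dist = np.linalg.norm(centers[:, None, :] - X[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)             # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        W_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(W_new - W).max() < tol
        W = W_new
        if converged:
            break
    return W, centers

# Two well-separated groups of points as a toy stand-in for path features.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
W_demo, centers_demo = fuzzy_c_means(X_demo, c=2)
print(np.round(W_demo, 2))
```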

Example of C-means Clustering
Set 5 clusters
Membership matrix W (one row per cluster, one column per path P1 to P8):
        0.32 0.11 0.00 0.00 0.98 0.20 0.01 0.05
        0.03 0.40 0.00 1.00 0.01 0.20 0.01 0.04
  W =   0.03 0.14 1.00 0.00 0.00 0.20 0.00 0.05
        0.02 0.18 0.00 0.00 0.01 0.30 0.97 0.03
        0.87 0.21 0.00 0.00 0.00 0.10 0.01 0.82
Selected RCP set: P1, P3, P4, P5, P7
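
A minimal sketch of how the RCP set could follow from the membership matrix, assuming W is read as 5 clusters by 8 paths and that the representative of each cluster is the path with the highest membership in it. The selection rule is an assumption; it happens to reproduce the set listed on the slide.

```python
import numpy as np

# Membership matrix W: rows are clusters, columns are paths P1..P8
# (row-wise reading of the slide, which is an assumption).
W = np.array([
    [0.32, 0.11, 0.00, 0.00, 0.98, 0.20, 0.01, 0.05],
    [0.03, 0.40, 0.00, 1.00, 0.01, 0.20, 0.01, 0.04],
    [0.03, 0.14, 1.00, 0.00, 0.00, 0.20, 0.00, 0.05],
    [0.02, 0.18, 0.00, 0.00, 0.01, 0.30, 0.97, 0.03],
    [0.87, 0.21, 0.00, 0.00, 0.00, 0.10, 0.01, 0.82],
])

# For each cluster, pick the path with the largest membership (1-based index).
representatives = sorted(int(j) + 1 for j in W.argmax(axis=1))
print(["P%d" % p for p in representatives])  # ['P1', 'P3', 'P4', 'P5', 'P7']
```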

Dynamic Monitoring (Design Time)
Determine the RCP set A_R from the CP set A with topological features only
  SVD-QRcp method: transformation matrix T_S
  C-means clustering method: transformation matrix T_C

Dynamic Monitoring (Design Time)
[Figure: from the critical paths p1 to p9, the SVD-QRcp method selects RCPs p1 and p3 with transformation matrix T_S, and the C-means clustering method selects RCPs p5 and p7 with transformation matrix T_C]

Dynamic Monitoring (Design Time)
[Figure: the delays d1 to d9 of the critical paths are related to the RCP delays d1 and d3 through transformation matrix T_S, and to the RCP delays d5 and d7 through transformation matrix T_C]

Dynamic Monitoring (Run Time)
CP set A with only topological features, plus additional features of the CPs (e.g., T, V, BTI): C-means clustering gives a new transformation matrix T_C
RCP subset A_R_S with transformation matrix T_S: estimated delays D' of A
RCP subset A_R_C: measured delays D_R_C and estimated delays D'_R_C
Estimation error mitigation for P_R_C: ΔD_R_C = D_R_C - D'_R_C
Estimation error mitigation ΔD for A, obtained from ΔD_R_C and the new T_C
Adjusted delay estimation: D'' = D' + ΔD
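
The run-time flow can be sketched in a few lines, assuming both transformation matrices act linearly: T_S maps the measured delays of the base RCPs to estimates for all paths, and T_C spreads the errors observed on the mitigation RCPs to all paths. The function name, argument names, and toy numbers are hypothetical.

```python
import numpy as np

def adjusted_delays(T_S, T_C, d_meas_rs, d_meas_rc, rc_idx):
    """Static estimate plus error mitigation.

    d_meas_rs : measured delays of the base RCP subset (A_R_S)
    d_meas_rc : measured delays of the mitigation RCP subset (A_R_C)
    rc_idx    : positions of the mitigation RCPs among all paths
    """
    d_est = T_S @ d_meas_rs                 # D'  : initial estimates for all paths
    delta_rc = d_meas_rc - d_est[rc_idx]    # error observed on the mitigation RCPs
    delta = T_C @ delta_rc                  # error mitigation for all paths
    return d_est + delta                    # D'' = D' + error mitigation

# Toy case: 3 paths, 1 base RCP (path 0), 1 mitigation RCP (path 2).
T_S = np.array([[1.0], [1.0], [1.0]])
T_C = np.array([[0.0], [0.5], [1.0]])
d_adj = adjusted_delays(T_S, T_C, np.array([10.0]), np.array([12.0]), rc_idx=[2])
print(d_adj)  # [10. 11. 12.]
```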

Dynamic Monitoring (Run Time)
1. Update the transformation matrix T_C of the C-means clustering method by using extra features
[Figure: the paths with extra features, [p1+f1] to [p9+f9], are clustered by the C-means clustering method, which selects [p5+f5] and [p7+f7] and yields the new transformation matrix T_C]

Dynamic Monitoring (Run Time)
[Figure: transformation matrix T_S maps the delays d1 and d3 of the base RCPs to the estimated delays of paths 1 to 9]

Dynamic Monitoring (Run Time)
[Figure: for the mitigation RCPs, the measured delays D_R_C (d5, d7) are compared with the estimated delays D'_R_C to obtain the errors ΔD_R_C (Δd5, Δd7)]

Dynamic Monitoring (Run Time)
[Figure: the new transformation matrix T_C maps ΔD_R_C (Δd5, Δd7) to the error mitigations Δd1 to Δd9 of all paths]

Dynamic Monitoring (Run Time)
5. Obtain the mitigated delays D'' by adding the error mitigation ΔD to the initial estimated delays D'
[Figure: for each of the paths 1 to 9, the initial estimate plus Δd gives the mitigated delay d'']

Demonstration of Dynamic Monitoring
Base RCP set (using SVD-QRcp): A_R_S = {P2, P5, P8}
Mitigation RCP set (using C-means): A_R_C = {P1, P6}
Actual delays:                    D   = [11, 8, 9, 4, 8, 9, 5, 13]
Use A_R_S to predict (static):    D'  = [10.9, 8, 7.2, 3.2, 8, 7.6, 4.3, 13]   Error is 5.8%
Use A_R_C to mitigate (dynamic):  D'' = [11, 8, 8.5, 4.1, 8, 9, 4.8, 13]       Error is 0.9%
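
A short check of the demonstration numbers, assuming the three vectors are read in path order P1 to P8. It reports per-path absolute errors only; it does not attempt to reproduce the 5.8% and 0.9% figures, because the error metric is not defined in the transcript.

```python
import numpy as np

d_actual = np.array([11, 8, 9, 4, 8, 9, 5, 13], dtype=float)
d_static = np.array([10.9, 8, 7.2, 3.2, 8, 7.6, 4.3, 13])   # predicted from A_R_S
d_dynamic = np.array([11, 8, 8.5, 4.1, 8, 9, 4.8, 13])      # after mitigation with A_R_C

err_static = np.abs(d_static - d_actual)
err_dynamic = np.abs(d_dynamic - d_actual)

print("static  per-path error:", np.round(err_static, 2))
print("dynamic per-path error:", np.round(err_dynamic, 2))

# The static estimate is exact on the base RCPs (P2, P5, P8), and the
# mitigated estimate is also exact on the mitigation RCPs (P1, P6).
base_rcps = [1, 4, 7]        # zero-based indices of P2, P5, P8
mitigation_rcps = [0, 5]     # zero-based indices of P1, P6
print(err_static[base_rcps], err_dynamic[mitigation_rcps])
```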

Experiments: Benchmarks
ITC 99 and IWLS 05 benchmarks
                                b17    b18    b19    b22    RISC   vga
  # gates                       27k    88k    165k   40k    61k    114k
  # gate-type features          54     54     54     54     54     54
  # temperature features        100    100    100    100    100    100
  # voltage features            400    400    400    400    400    400
  # process-variation features  400    400    400    400    400    400
  # critical paths              3021   1104   224    924    3662   461

Evaluation Metric

Estimation Accuracy with Extra Features Using C-means Method
Estimation accuracy improves with the number of RCPs (elbow curve)
Estimation accuracy improves with more features
[Figure: estimation accuracy versus number of RCPs for (a) b18 and (b) RISC]

Estimation Accuracy with Extra Features Using SVD-QRcp Method
Estimation accuracy improves with the number of RCPs
Estimation accuracy improves with more features
[Figure: estimation accuracy versus number of RCPs for (a) b18 and (b) RISC]

Average Estimation Accuracy
Proposed method outperforms the static SVD-QRcp method and the static C-means method
[Figure: rRMSE (0% to 4%) for (b) b18 and (e) RISC; legend: proposed method, static SVD-QRcp method, static C-means method]

Runtime Estimation Accuracy
[Figure: rRMSE (0% to 4%) versus system runtime (0 to 3 years) for (a) b18 and (b) RISC; legend: proposed method, static SVD-QRcp method, static C-means method]

Conclusions
Small set of RCPs to infer a large pool of CPs
Multiple features of a path considered: process variation, voltage, temperature, BTI
Dynamic monitoring on RCPs offsets aging-induced estimation errors
  Decrease in number of RCPs
  Increased delay-estimation accuracy