Dynamic Adaptation for Resilient Integrated Circuits and Systems
Krishnendu Chakrabarty
Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA
Department of Computer and Information Science and Engineering, National Cheng Kung University, Tainan, Taiwan
Acknowledgments
  Fangming Ye, PhD student at Duke University
  Farshad Firouzi, PhD student at Karlsruhe Institute of Technology, Germany
  Prof. Mehdi Tahoori, Karlsruhe Institute of Technology, Germany
  Sponsor: Semiconductor Research Corporation (SRC)
Motivation
  Design-time solutions and guard-bands for resilience are no longer sufficient for nanoscale ICs
    Process variations: each chip is born with a unique personality (nature)
    Operating conditions, environment, and workload: each chip grows uniquely (nurture)
  Need: guarantee that each system, despite its different nature and nurture, behaves acceptably (resilience)
  Resilience: persistence of a performance level that can justifiably be trusted in the presence of change
Motivation (Contd.)
  Dynamic health monitoring of ICs and systems
  Reuse additional on-chip sensors (voltage sensors, temperature sensors, etc.)
  Reduce the number of sensors needed for monitoring
  Learn, predict, and adapt
Outline
  Background
  Selection of representative critical paths (RCPs)
  Path encoding and feature identification
  Dynamic RCP monitoring
  Results
  Conclusions
Circuit Degradation
  Transistor aging
    NBTI (PBTI)
    Degrades path delay
  Degradation rate
    Varies due to voltage, temperature, workload, and process variations
  [Figure: bathtub curve of failure rate versus time. Regions: early-life failure, normal lifetime, wearout. Annotations: burn-in, increased soft errors, aging, timing failures]
Prior Work
  Periodic delay testing [Baba VTS 09]
    Interrupts normal execution of the system
  In-situ sensors [Agarwal VTS 07]
    Prohibitively large number of sensors
  Replica circuits [Tschanz VTS 09]
    Fail to cover the entire circuit
  Representative critical paths [Xie DAC 10, Wang ICCAD 12]
    Static, coarse-grained
Proposed Approach
  Leverage additional sources (e.g., power, temperature, process variation) for path-delay estimation
  Infer the delays of a larger pool of paths based on a small set of paths (RCPs)
  Dynamic delay-estimation model based on features
  [Figure: chip monitoring, with temperature (°C) and voltage (V) sensors placed around the core logic]
  [Figure: all paths (passed gates, passed voltage grid, passed temperature grid, path delay) -> RCP selection -> RCP path delays -> inferred path delays]
  [Figure: early stage: RCP model and adaptation action A; later stage: updated RCP model and updated adaptation action A]
Chip Features
  Topological feature
    Gate types and locations of the gates of a path in the layout
  Process-variation feature
    Captures die-to-die, intra-die, and random variations
  Temperature feature
    Reflects the temperature grids that the path passes through
  Voltage feature
    Reflects the voltage grids that the path passes through
  BTI feature
    Reflects the BTI grids that the path passes through
Topological Features
  Describe a path using the locations and types of the gates in the path
  [Figure: example encoding grid whose entries are AND, OR, or 0]
Process Variation Feature
Temperature/Voltage/BTI Feature
  Workload on each device is translated to:
    Voltage
    Temperature
    BTI model
Voltage Feature
  All gates within this grid load the same power
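Temperature Feature
  [Figure: thermal model of the package. Layers: chip (power source), solder, polymer, substrate, gel, heat sink. Thermal resistances: R_solder, R_polymer, R_sub, R_gel, R_sink. Reference temperature: T_ambient]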
Path Encoding
  Information about a path:
    Topological features
    Process-variation features
    Temperature feature
    Voltage feature
    BTI feature
  Together, these features determine the path delay
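One way to picture the encoding is as a concatenation of the five feature groups into a single row per path. The sketch below is only an illustration under that assumption; the function name `encode_path`, the group sizes, and the toy values are hypothetical and do not come from the slides.

```python
def encode_path(topological, process_variation, temperature, voltage, bti):
    """Concatenate the per-path feature groups into one flat feature row.

    Assumption: the encoded path is simply the feature groups placed side
    by side, in the order in which the slides list them.
    """
    row = []
    for group in (topological, process_variation, temperature, voltage, bti):
        row.extend(float(value) for value in group)
    return row


# Hypothetical toy path: 4 topological slots and 2 grid cells per other feature.
row = encode_path(
    topological=[1, 0, 1, 0],
    process_variation=[0.02, -0.01],
    temperature=[1, 0],
    voltage=[0, 1],
    bti=[1, 0],
)
print(len(row))  # 12
```

Stacking one such row per critical path would give a feature matrix with one row per path, which is the shape the later examples assume for A.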
RCP Selection
  [Figure: measured delays of representative critical paths -> transformation matrix -> delays of all critical paths]
SVD-QRcp
  Singular value decomposition (SVD)
    Purpose: estimate the number of RCPs
  QR factorization with column pivoting (QRcp)
    Purpose: rank CPs and provide the first few RCPs to be selected
  Flow: SVD -> QRcp -> representative critical paths
Singular Value Decomposition (SVD)
  Given: m x n matrix A
  [Figure: A collects the topological, process-variation, temperature, voltage, and BTI features of the paths; D collects the path delays]
Singular Value Decomposition (SVD)
  Given: m x n matrix A = U Λ V^T
    U: m x r column-orthogonal matrix
    Λ: r x r diagonal matrix of singular values (non-negative, in descending order)
    V: n x r column-orthogonal matrix
QR-Factorization with Column Pivoting (QRcp)
  Given: m x r matrix U = Q R P^T
    Q: orthogonal matrix
    R: upper-triangular matrix
    P: permutation matrix
  Iteration: calculate the norms of columns 1 to n, then swap the max-norm column into the leading position (e.g., swap columns 1 and 3)
  Resulting permutation: P = [3 2 1 4 ..]
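The pivoting loop can be sketched in plain Python as a greedy selection: pick the column with the largest remaining norm, then remove its direction from every other column. This is a minimal sketch using pivoted Gram-Schmidt orthogonalization, not necessarily the routine used in the original work; the function name and the toy matrix are hypothetical.

```python
def qr_column_pivot_order(matrix, k):
    """Return the (0-based) indices of the first k pivot columns.

    Greedy max-norm pivoting: at each step, choose the remaining column
    with the largest residual norm and project it out of the others.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[float(matrix[i][j]) for i in range(n_rows)] for j in range(n_cols)]
    remaining = list(range(n_cols))
    order = []
    for _ in range(min(k, n_cols)):
        pivot = max(remaining, key=lambda j: sum(v * v for v in columns[j]))
        norm = sum(v * v for v in columns[pivot]) ** 0.5
        if norm < 1e-12:  # remaining columns carry no new information
            break
        q = [v / norm for v in columns[pivot]]
        order.append(pivot)
        remaining.remove(pivot)
        for j in remaining:
            projection = sum(a * b for a, b in zip(q, columns[j]))
            columns[j] = [b - projection * a for a, b in zip(q, columns[j])]
    return order


# Hypothetical 3x3 matrix whose third column has the largest norm.
toy = [[1.0, 0.0, 3.0],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 0.1]]
order = qr_column_pivot_order(toy, 3)
print(order)  # [2, 1, 0]
```

In 1-based numbering the toy result reads [3 2 1], which is the same kind of column swap the slide describes.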
Transformation Matrix
  Use the delays of the RCP set and the transformation matrix to infer the delays of all paths
  [Figure: measured delays D of the RCP set -> transformation matrix T_S (from CP set A) -> estimated delays D' of the CP set]
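Applying the transformation matrix amounts to a matrix-vector product: each row of the matrix weights the measured RCP delays to produce one estimated path delay. The sketch below shows only this application step; how the matrix is derived is not shown in the slides, and the matrix values and the function name `estimate_delays` are hypothetical.

```python
def estimate_delays(transform, rcp_delays):
    """Estimate all path delays as transform (n_paths x n_rcps) times rcp_delays."""
    return [sum(t * d for t, d in zip(row, rcp_delays)) for row in transform]


# Hypothetical transformation matrix for 3 paths and 2 RCPs.
T_S = [[1.0, 0.0],   # path 1 is itself the first RCP
       [0.0, 1.0],   # path 2 is itself the second RCP
       [0.5, 0.5]]   # path 3 is inferred from both RCPs
measured_rcp_delays = [10.0, 8.0]
estimated = estimate_delays(T_S, measured_rcp_delays)
print(estimated)  # [10.0, 8.0, 9.0]
```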
Example
  Given a set of critical paths (matrix A) and the corresponding path delays (vector D)

    A = [ 1 0 1 0 1 0 1 0 1 ]  (p1)      D = [ 11 ]  (d1)
        [ 1 1 1 0 0 0 0 1 1 ]                [  8 ]
        [ 0 1 1 1 1 1 0 0 0 ]                [  9 ]
        [ 1 1 0 0 0 0 0 1 0 ]                [  4 ]
        [ 0 0 0 1 0 1 1 1 1 ]                [  8 ]
        [ 1 0 0 0 1 1 0 1 1 ]                [  9 ]
        [ 0 1 0 0 0 1 0 1 1 ]                [  5 ]
        [ 1 0 1 1 1 0 1 0 1 ]  (p8)          [ 13 ]  (d8)
Example (Contd.)
  SVD-QRcp method (Step 1: SVD)
  A = U S V^T, where the diagonal matrix S of singular values determines the number of RCPs

    S   = diag(4.65, 2.48, 2.00, 1.80, 1.30, 0.94, 0.52, 0.29)
    S^2 = diag(21.6, 6.15, 4.00, 3.24, 1.69, 0.88, 0.25, 0.09)

  5 RCPs contain most of the information of A
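A common way to turn singular values into a count is to keep the smallest number of components whose squared singular values reach a chosen share of the total. The sketch below applies that rule to the singular values listed on this slide; the 95% threshold and the function name are assumptions for illustration, since the slides do not state the cutoff that was used.

```python
def count_for_energy(singular_values, threshold):
    """Smallest number of leading singular values whose squared sum
    reaches the given fraction of the total squared sum."""
    energies = [s * s for s in singular_values]
    total = sum(energies)
    running = 0.0
    for count, energy in enumerate(energies, start=1):
        running += energy
        if running / total >= threshold:
            return count
    return len(singular_values)


# Singular values from the example slide.
singular_values = [4.65, 2.48, 2.00, 1.80, 1.30, 0.94, 0.52, 0.29]
rcp_count = count_for_energy(singular_values, threshold=0.95)  # assumed cutoff
print(rcp_count)  # 5
```

With this assumed cutoff the first four values cover about 92% of the total and the first five about 97%, which is consistent with the slide's choice of 5 RCPs.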
Example (Contd.)
  SVD-QRcp method (Step 2: QRcp)
  QR = UE, where E is the permutation matrix that we need

    E = [ 0 0 1 0 0 0 0 0 ]
        [ 0 0 0 0 0 1 0 0 ]
        [ 0 0 0 0 0 0 0 1 ]
        [ 0 1 0 0 0 0 0 0 ]
        [ 1 0 0 0 0 0 0 0 ]
        [ 0 0 0 0 1 0 0 0 ]
        [ 0 0 0 0 0 0 1 0 ]
        [ 0 0 0 1 0 0 0 0 ]

  Selected RCP set (rows 5, 4, 1, 8, 6 of A): P5, P4, P1, P8, P6
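The selected set can be read directly off E. The sketch below assumes that column j of E marks, with its single 1, the path that pivoting moved into position j, so the first five columns name the five RCPs; the function name `pivot_order` is hypothetical.

```python
# Permutation matrix E from the example slide, one list per row.
E = [
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
]


def pivot_order(permutation):
    """For each column, return the 1-based row index that holds the 1."""
    size = len(permutation)
    order = []
    for col in range(size):
        for row in range(size):
            if permutation[row][col] == 1:
                order.append(row + 1)
    return order


order = pivot_order(E)
print(order[:5])  # [5, 4, 1, 8, 6]
```

Under this reading, the first five entries reproduce the selected set P5, P4, P1, P8, P6 shown on the slide.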
C-means Clustering
  [Figure: CP set A partitioned into Cluster 1, Cluster 2, and Cluster 3; estimated delays D' for CP set A]
Example of C-means Clustering
  Set 5 clusters
  Membership matrix (one row per cluster, one column per path P1 to P8):

    W = [ 0.32 0.11 0.00 0.00 0.98 0.20 0.01 0.05 ]
        [ 0.03 0.40 0.00 1.00 0.01 0.20 0.01 0.04 ]
        [ 0.03 0.14 1.00 0.00 0.00 0.20 0.00 0.05 ]
        [ 0.02 0.18 0.00 0.00 0.01 0.30 0.97 0.03 ]
        [ 0.87 0.21 0.00 0.00 0.00 0.10 0.01 0.82 ]

  Selected RCP set: P1, P3, P4, P5, P7
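The selected set is consistent with taking, for each cluster, the path with the highest membership value. The sketch below checks this on the membership matrix from the slide; treating rows as clusters and columns as paths, and using the maximum membership as the selection rule, are assumptions rather than something the slides state.

```python
# Membership matrix W from the example slide: 5 clusters (rows) x 8 paths (columns).
W = [
    [0.32, 0.11, 0.00, 0.00, 0.98, 0.20, 0.01, 0.05],
    [0.03, 0.40, 0.00, 1.00, 0.01, 0.20, 0.01, 0.04],
    [0.03, 0.14, 1.00, 0.00, 0.00, 0.20, 0.00, 0.05],
    [0.02, 0.18, 0.00, 0.00, 0.01, 0.30, 0.97, 0.03],
    [0.87, 0.21, 0.00, 0.00, 0.00, 0.10, 0.01, 0.82],
]


def representatives(membership):
    """Pick, for each cluster, the 1-based index of its strongest member."""
    picks = []
    for cluster_row in membership:
        best = max(range(len(cluster_row)), key=lambda j: cluster_row[j])
        picks.append(best + 1)
    return picks


picks = representatives(W)
print(sorted(picks))  # [1, 3, 4, 5, 7]
```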
Dynamic Monitoring (Design Time)
  Determine RCP set A_R from CP set A, using topological features only
    SVD-QRcp method -> transformation matrix T_S
    C-means clustering method -> transformation matrix T_C
Dynamic Monitoring (Design Time)
  [Figure: paths p1 to p9 of the CP set; the SVD-QRcp method yields transformation matrix T_S and the C-means clustering method yields transformation matrix T_C; selected RCPs: p1, p3, p5, p7]
Dynamic Monitoring (Design Time)
  [Figure: delays d1 to d9 of the CP set; transformation matrices T_S and T_C; RCP delays: d1, d3, d5, d7]
Dynamic Monitoring (Run Time)
  Base estimation
    RCP subset A_R_S and transformation matrix T_S give the estimated delays D' of A
  Update of the mitigation model
    CP set A with only topological features, plus additional features of the CPs (e.g., T, V, BTI) -> C-means clustering -> new transformation matrix T_C
  Error mitigation
    RCP subset A_R_C: measured delays D_R_C and estimated delays D'_R_C
    Estimation error for A_R_C: ΔD_R_C = D_R_C - D'_R_C
    Estimation error mitigation ΔD for A
    Adjusted delay estimation: D'' = D' + ΔD
Dynamic Monitoring (Run Time)
  1. Update the transformation matrix T_C of the C-means clustering method by using extra features
  [Figure: each path p_i is augmented with its extra features f_i, giving [p1+f1] to [p9+f9]; C-means clustering on the augmented paths selects [p5+f5] and [p7+f7] and yields the new transformation matrix T_C]
Dynamic Monitoring (Run Time)
  [Figure: transformation matrix T_S relating the RCP delays d1 and d3 to the delays d1 to d9 of all paths]
Dynamic Monitoring (Run Time)
  [Figure: for the mitigation RCPs p5 and p7, the measured delays D_R_C (d5, d7) and the estimated delays D'_R_C give the errors ΔD_R_C (Δd5, Δd7)]
Dynamic Monitoring (Run Time)
  [Figure: the new transformation matrix T_C relates ΔD_R_C (Δd5, Δd7) to the errors Δd1 to Δd9 of all paths]
Dynamic Monitoring (Run Time)
  5. Obtain the mitigated delays D'' by adding the error mitigation ΔD to the initial estimated delays D'
  [Figure: estimated delays d'1 to d'9 plus errors Δd1 to Δd9 give the mitigated delays d''1 to d''9]
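The run-time correction can be sketched as three small steps: measure the mitigation RCPs, spread their estimation errors over all paths through T_C, and add the result to the initial estimate. The sketch below assumes that T_C is applied to the RCP errors as a matrix-vector product; the function name `mitigate` and all numeric values are hypothetical.

```python
def mitigate(estimated, transform_c, rcp_indices, measured_rcp):
    """Return the mitigated delays: initial estimate plus propagated error.

    estimated     : initial estimated delays of all paths (from the base RCP set)
    transform_c   : assumed n_paths x n_mitigation_rcps matrix T_C
    rcp_indices   : 0-based positions of the mitigation RCPs among all paths
    measured_rcp  : measured delays of those mitigation RCPs
    """
    # Error at the mitigation RCPs: measured minus estimated.
    delta_rcp = [m - estimated[i] for i, m in zip(rcp_indices, measured_rcp)]
    # Propagate the RCP errors to every path through T_C.
    delta_all = [sum(t * d for t, d in zip(row, delta_rcp)) for row in transform_c]
    # Mitigated delay = initial estimate + propagated error.
    return [e + d for e, d in zip(estimated, delta_all)]


# Hypothetical case: 3 paths, path 1 is the only mitigation RCP.
initial = [10.0, 7.0, 4.0]
T_C = [[1.0], [0.5], [0.0]]
mitigated = mitigate(initial, T_C, rcp_indices=[0], measured_rcp=[11.0])
print(mitigated)  # [11.0, 7.5, 4.0]
```

Keeping the base estimate and the correction as separate steps mirrors the slides, where the base RCP set gives the static prediction and the mitigation RCP set only adjusts it.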
Example
  Given a set of critical paths (matrix A) and the corresponding path delays (vector D)

    A = [ 1 0 1 0 1 0 1 0 1 ]  (p1)      D = [ 11 ]  (d1)
        [ 1 1 1 0 0 0 0 1 1 ]                [  8 ]
        [ 0 1 1 1 1 1 0 0 0 ]                [  9 ]
        [ 1 1 0 0 0 0 0 1 0 ]                [  4 ]
        [ 0 0 0 1 0 1 1 1 1 ]                [  8 ]
        [ 1 0 0 0 1 1 0 1 1 ]                [  9 ]
        [ 0 1 0 0 0 1 0 1 1 ]                [  5 ]
        [ 1 0 1 1 1 0 1 0 1 ]  (p8)          [ 13 ]  (d8)
Demonstration of Dynamic Monitoring
  Base RCP set (using SVD-QRcp): A_R_S = {P2, P5, P8}
  Mitigation RCP set (using C-means): A_R_C = {P1, P6}

    Path   Actual D   Use A_R_S to predict (static) D'   Use A_R_C to mitigate (dynamic) D''
    P1     11         10.9                               11
    P2     8          8                                  8
    P3     9          7.2                                8.5
    P4     4          3.2                                4.1
    P5     8          8                                  8
    P6     9          7.6                                9
    P7     5          4.3                                4.8
    P8     13         13                                 13

  Error of the static prediction: 5.8%
  Error after dynamic mitigation: 0.9%
Experiments: Benchmarks
  ITC 99 and IWLS 05 benchmarks

                                   b17    b18    b19    b22    RISC   vga
    # gates                        27k    88k    165k   40k    61k    114k
    # gate-type features           54     54     54     54     54     54
    # temperature features         100    100    100    100    100    100
    # voltage features             400    400    400    400    400    400
    # process-variation features   400    400    400    400    400    400
    # critical paths               3021   1104   224    924    3662   461
Evaluation Metric
Estimation Accuracy with Extra Features Using C-means Method
  Effect of the number of RCPs on estimation accuracy (elbow curve)
  Effect of more features on estimation accuracy
  [Figure: panels (a) b18 and (b) RISC]
Estimation Accuracy with Extra Features Using SVD-QRcp Method
  Effect of the number of RCPs on estimation accuracy
  Effect of more features on estimation accuracy
  [Figure: panels (a) b18 and (b) RISC]
Average Estimation Accuracy
  Proposed method outperforms the static SVD-QRcp method and the static C-means method
  [Figure: bar charts of rrmse (axis from 0% to 4%) for panels (b) b18 and (e) RISC, comparing the proposed method, the static SVD-QRcp method, and the static C-means method]
Runtime Estimation Accuracy
  [Figure: rrmse (axis from 0% to 4%) versus system runtime (0 to 3 years) for panels (a) b18 and (b) RISC, comparing the proposed method, the static SVD-QRcp method, and the static C-means method]
Conclusions
  A small set of RCPs is used to infer the delays of a large pool of CPs
  Multiple features of a path are considered: process variation, voltage, temperature, BTI
  Dynamic monitoring of RCPs offsets aging-induced estimation errors
    Decrease in the number of RCPs
    Increased delay-estimation accuracy