Stochastic Dynamic Thermal Management: A Markovian Decision-based Approach. Hwisung Jung, Massoud Pedram

Outline
- Introduction
- Background
- Thermal Management Framework
- Accuracy of Modeling
- Policy Representation
- Stochastic Dynamic Thermal Management
- SDTM Algorithm
- Multi-objective Design Optimization
- Experimental Results
- Conclusions

Introduction
Smaller and faster devices translate into higher power densities, higher operating temperatures, and lower circuit reliability. Local hot spots are becoming more prevalent in VLSI circuits. It is no longer sufficient to merely add a bigger fan as a downstream fix for thermal problems; thermal management is best accomplished when it is incorporated at the beginning of the design cycle. Applications that generate heat should engage a runtime thermal management technique, i.e., dynamic thermal management (DTM).

Prior Work
- K. Skadron, et al.: architectural-level thermal model, HotSpot (ISCA 2003)
- D. Brooks, et al.: trigger mechanisms (HPCA 2001)
- J. Srinivasan, et al.: predictive DTM (Int'l Conf. on Supercomputing 2003)
- S. Gurumurthi, et al.: performance optimization problem for disk drives (SIGARCH 2005)
- For a good survey, see P. Dadvar, et al. (Semi-Therm Symp. 2005)

Motivation
The above-mentioned DTM techniques have difficulty observing the peak power dissipation on the chip because they rely on a single temperature sensor, whereas the peak temperature may appear in a number of different places on the chip. This gives rise to uncertainty about the true temperature state of the chip. Improving the accuracy of decision making in thermal management, by modeling and assessing the uncertainty in temperature observation, is an important step toward guaranteeing the quality of electronics. We develop a stochastic thermal management framework based on:
- Partially observable Markov decision process (POMDP)
- Semi-Markov decision process (SMDP)
and combine DTM and DVFS to control the temperature of the system and its power dissipation.

Uncertainty
Critical problems in temperature profile characterization are:
- Placement restrictions for the on-chip temperature sensor
- Non-uniform temperature distribution across the chip
These cause uncertainty in the temperature observation.
[Figure: example of non-uniform temperature distribution (roughly 40-60 ºC) across chip blocks such as the register file, ALU, shifter, multiplier, and fetch/decode/execute units; axes x (mm) and y (mm).]

Stochastic Process Model
The uncertainty problem, in which a thermal manager cannot reliably identify the thermal state of the chip, can be addressed by modeling decision making as a stochastic process. A thermal manager observes the overall thermal state and issues commands (i.e., actions) to control the evolution of the thermal state of the system. These actions and thermal states determine the probability distribution over possible next states. The sequence of thermal states of the system can thus be modeled as a stochastic process.

Thermal Management Framework
We use an SMDP to model event-driven decision making, and a POMDP to consider the uncertainty in temperature observation. Note: the time spent in a particular state in the SMDP follows an arbitrary distribution, a more realistic assumption than an exponential distribution.

Definition 1: Partially Observable Semi-Markov Decision Process. A POSMDP is a tuple (S, A, O, T, R, Z) such that:
1) S is a finite set of states.
2) A is a finite set of actions.
3) O is a finite set of observations.
4) T is a transition probability function, T: S × A → Π(S).
5) R is a reward function, R: S × A → R.
6) Z is an observation function, Z: S × A → Π(O).
where Π(·) denotes the set of probability distributions.
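
As a concrete sketch (not from the paper), the tuple of Definition 1 can be encoded directly as arrays. The states, actions, and costs follow the example slides later in this deck; the transition and observation probabilities below are uniform placeholders, since the paper does not list them.

```python
import numpy as np

# Minimal encoding of the POSMDP tuple (S, A, O, T, R, Z) from Definition 1.
S = ["s1: 50-65C", "s2: 65-75C", "s3: 75-90C"]          # thermal states
A = ["a1: 1.65V/200MHz", "a2: 1.80V/350MHz", "a3: 1.95V/500MHz"]
O = ["o1", "o2", "o3"]                                   # sensor observations

# T[s, a, s'] = Prob(s' | s, a); placeholder uniform distributions.
T = np.full((3, 3, 3), 1.0 / 3.0)
# Z[s', a, o] = Prob(o | a, s'); placeholder sensor-noise model.
Z = np.full((3, 3, 3), 1.0 / 3.0)
# R[s, a] = immediate cost k(s, a), taken from the cost table in the
# experimental-results slide.
R = np.array([[1.5, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.5]])

assert np.allclose(T.sum(axis=2), 1.0) and np.allclose(Z.sum(axis=2), 1.0)
```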

POSMDP By Way of an Example
[Figure: POSMDP framework for dynamic thermal management. Thermal states: s1 = [50 ºC ≤ temp < 65 ºC], s2 = [65 ºC ≤ temp < 75 ºC], s3 = [75 ºC ≤ temp < 90 ºC]. Actions: a1 = [1.65 V / 200 MHz], a2 = [1.80 V / 350 MHz], a3 = [1.95 V / 500 MHz]. From (state_t, belief_t), the action taken at time t leads to one of several next states (e.g., with probabilities 0.7 and 0.3); the resulting observation is used to identify the updated belief state belief_{t+1}.]

Full History
In a partially observable environment, observations are probabilistically dependent on the underlying chip temperature. The transition probability function determines the probability of a transition from thermal state s to s' after executing action a:

T(s', a, s) = Prob(s_{t+1} = s' | a_t = a, s_t = s)

The observation function captures the relationship between the actual state and the observation:

Z(o', s', a) = Prob(o_{t+1} = o' | a_t = a, s_{t+1} = s')

Since the thermal manager cannot fully observe the thermal state of the system, it makes decisions based on the observable system history. Relying on the full history <s_0, a_0>, <s_1, a_1>, ..., <s_t, a_t> makes the decision process non-Markovian, which is not desirable.

How to Avoid Reliance on Full History
Although the observation gives the thermal manager some evidence about the current state s, s is not exactly known. We therefore maintain a probability distribution over states, called a belief state b; the belief for state s is denoted b(s). The belief probabilities sum to 1 over all states: b_1 + b_2 + b_3 = 1.
[Figure: current belief state [b_1 b_2 b_3] over states s_1, s_2, s_3, with observations o_1, o_2, o_3.]
Note: even though we know the action, the observation is not known in advance since the observations are probabilistic.

Markovian Property
By using the state space B, a properly updated probability distribution over the thermal states S, we can convert the original history-based decision making into a fully observable SMDP. After taking action a in belief state b and observing o, the updated belief is

b'(s') = [ Σ_{s∈S} Prob(s', o | s, a) b(s) ] / [ Σ_{s'∈S} Σ_{s∈S} Prob(s', o | s, a) b(s) ]

The process of maintaining the belief state is Markovian, and the belief-state SMDP problem can be solved by adapting value iteration algorithms.
[Figure: belief-state dynamics across time steps, relating Observe_t, Belief_t, Action_t, State_t, and Cost_t to their counterparts at t+1.]
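
The belief update above transcribes directly into code. Below is a minimal sketch, assuming the array layout of the earlier POSMDP encoding (T[s, a, s'] and Z[s', a, o]); the denominator is the observation probability Prob(o | b, a).

```python
import numpy as np

def belief_update(b, a, o, T, Z):
    """Return the updated belief b'(s') after taking action a in belief b
    and observing o (Bayes rule over the POSMDP model)."""
    pred = b @ T[:, a, :]        # predicted next-state distribution
    unnorm = Z[:, a, o] * pred   # weight by observation likelihood
    norm = unnorm.sum()          # Prob(o | b, a), the normalizing constant
    if norm == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return unnorm / norm

# Example: uniform belief, action a2 (index 1), observation o3 (index 2).
# b_next = belief_update(np.ones(3) / 3, 1, 2, T, Z)
```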

Thermal Management Framework
The thermal manager's goal is to choose a policy that minimizes a cost. Let π: B → A represent a stationary policy that maps probability distributions over states to actions. Incorporating the expectation over actions, the cost of a stationary policy satisfies the Bellman equation (with a = π(b)):

C^π(b) = Σ_{s∈S} b(s) k(s, a) + γ Σ_{o∈O} Σ_{s'∈S} Σ_{s∈S} Z(o, s', a) T(s', a, s) b(s) C^π(b')

The optimal action to take in b is obtained by

π*(b) = argmin_{a∈A} C^π(b)
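
For the tiny three-state model, the Bellman equation can be approximated by exhaustive finite-horizon lookahead: enumerate actions and observations and recurse on the updated belief. This is only an illustrative sketch (the discount factor and horizon are assumptions); the paper's solver adapts value iteration rather than brute-force search.

```python
import numpy as np

GAMMA = 0.95  # assumed discount factor

def bellman_backup(b, depth, T, Z, R):
    """Return (approximate C(b), best action) by finite-horizon lookahead."""
    nA, nO = R.shape[1], Z.shape[2]
    best_cost, best_a = np.inf, None
    for a in range(nA):
        cost = float(b @ R[:, a])               # sum_s b(s) k(s, a)
        if depth > 0:
            pred = b @ T[:, a, :]               # next-state distribution
            for o in range(nO):
                p_o = float(Z[:, a, o] @ pred)  # Prob(o | b, a)
                if p_o > 0.0:
                    b_next = (Z[:, a, o] * pred) / p_o
                    cost += GAMMA * p_o * bellman_backup(
                        b_next, depth - 1, T, Z, R)[0]
        if cost < best_cost:
            best_cost, best_a = cost, a
    return best_cost, best_a

# Usage: pi_star(b) for a uniform belief with a two-step horizon.
# cost, action = bellman_backup(np.ones(3) / 3, 2, T, Z, R)
```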

Stochastic DTM
A stochastic dynamic thermal management (SDTM) algorithm based on the POSMDP. The dynamic thermal manager calculates the belief state b and the cost C(a, ψ) over a t-stage policy tree, issuing an action at each stage based on the observations o_1, ..., o_n while the system moves among active, idle, and sleep modes.
[Figure: t-stage policy tree of actions and observations maintained by the dynamic thermal manager.]

Definition 2: Observation strategy. In a policy tree, an observation strategy is defined as ψ: O → Λ such that:
1) O is a finite set of observations.
2) Λ = {(a, ψ) | a ∈ A, ψ ∈ Λ_o}, where A is a finite set of actions, and Λ_o is a finite set of all policy trees.

Stochastic DTM
A multi-objective design optimization method is used to optimize the performance metrics by formulating a mathematical programming model. Let a sequence of belief states b_0, b_1, ..., b_n denote a processing path δ from b_0 to b_n of length n. For a policy π, the discounted cost C of a processing path δ of length n is defined as

C^π(δ) = Σ_{i=0}^{n} γ^{t_i} cost(b_i, a_i)

where t_i denotes the duration of time that the system spends in belief state b_i before action a_i causes a transition to state b_{i+1}, and

cost(b, a) = pow(b) + (1 / τ(b, a)) Σ_{b'∈B} Prob(b' | b, a) ene(b, b')

- pow(b) is the power consumption of the system in belief state b
- Prob(b' | b, a) is the probability of being in state b' after action a in state b
- ene(b, b') is the energy required by the system to transition from state b to b'
- τ(b, a) is the expected duration of time that the system spends in b when action a is issued
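
The cost model above can be written out directly; the sketch below assumes hypothetical lookup tables indexed by belief-state and action indices and is illustrative only.

```python
import numpy as np

def transition_cost(b, a, pow_, prob, ene, tau):
    """cost(b, a) = pow(b) + (1/tau(b, a)) * sum_b' Prob(b'|b, a) ene(b, b').

    pow_[b]: power in state b; prob[b, a, b']: Prob(b' | b, a);
    ene[b, b']: transition energy; tau[b, a]: expected sojourn time.
    """
    expected_ene = prob[b, a, :] @ ene[b, :]
    return pow_[b] + expected_ene / tau[b, a]

def path_cost(path, times, gamma, pow_, prob, ene, tau):
    """Discounted path cost C(delta) = sum_i gamma**t_i * cost(b_i, a_i)."""
    return sum(gamma ** t * transition_cost(b, a, pow_, prob, ene, tau)
               for (b, a), t in zip(path, times))
```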

Stochastic DTM
Taking the expectation with respect to the policy over the set of processing paths starting in belief state b, the expected cost of the system is

pow_avg^π(b) = EXP[C^π(δ)]

The design objective is a vector J of performance metrics we are trying to optimize (min energy/operation, min pow_avg^π(b), max throughput), where the design vector a contains the DVFS settings.
[Figure: system model and optimizer loop; the optimizer maps design vectors a to design objectives J_1, J_2, J_3 through performance analysis.]
The optimization is subject to

Σ_a χ(b', a) − Σ_b Σ_a χ(b, a) Prob(b' | b, a) = 0
Σ_b Σ_a χ(b, a) τ(b, a) = 1
χ(b, a) ≥ 0, for all b ∈ B, a ∈ A

Note: χ(b, a) is the frequency with which the system is in thermal state b and action a is issued.
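
The constraints above are the standard linear-programming form of an average-cost SMDP. As a sketch under assumed (randomly generated) model parameters, and using a single average-cost objective rather than the full multi-objective vector J, the χ(b, a) variables can be solved for with scipy and normalized into a randomized policy:

```python
import numpy as np
from scipy.optimize import linprog

nS, nA = 3, 3
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(nS), size=(nS, nA))  # P[b, a, b'] = Prob(b'|b, a)
cost = rng.uniform(0.5, 2.0, size=(nS, nA))    # cost(b, a), placeholder
tau = rng.uniform(0.1, 1.0, size=(nS, nA))     # expected sojourn times

c = cost.reshape(-1)                 # objective: sum_{b,a} chi(b,a) cost(b,a)
A_eq = np.zeros((nS + 1, nS * nA))
for bp in range(nS):                 # balance equation for each state b'
    for a in range(nA):
        A_eq[bp, bp * nA + a] += 1.0              # + sum_a chi(b', a)
    for b in range(nS):
        for a in range(nA):
            A_eq[bp, b * nA + a] -= P[b, a, bp]   # - sum chi(b,a) Prob(b'|b,a)
A_eq[nS] = tau.reshape(-1)           # sum_{b,a} chi(b,a) tau(b,a) = 1
b_eq = np.zeros(nS + 1)
b_eq[nS] = 1.0

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * (nS * nA))
chi = res.x.reshape(nS, nA)
policy = chi / chi.sum(axis=1, keepdims=True)  # Prob(issue a | state b)
print(np.round(policy, 3))
```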

Experimental Results
For the experimental setup, a 32-bit RISC processor was designed in 0.18 um technology, and the stochastic framework was implemented in Matlab. In the first experiment, we analyzed the power dissipation distribution of the RISC processor for the SPECint 2000 benchmarks gcc (6765 million instructions), gap (12726 million), and gzip (74942 million).
[Figure: per-block dynamic power breakdown (percentage of total dynamic power) for gcc, gap, and gzip, by instruction class: load, store, add/sub, and/or, branch, etc.]
[Figure: power characterization flow for SPECint 2000 using Synopsys tools: RTL design and simulation with back annotation (rtl2saif, saif2trace, vcd2saif, read_saif, SAIF forward/backward files), followed by compile and power report.]

Experimental Results
The second experiment demonstrates the effectiveness of our proposed SDTM algorithm:
- Randomly choose a sequence of 40 programs, e.g., gcc_1-gzip_2-gap_3-...-gcc_39-gcc_40.
- Execute the sequence of programs on the RISC processor.
- Calculate the belief state.

Action / description / cost [k(s_1, a) k(s_2, a) k(s_3, a)]:
a_1 [1.65 V / 200 MHz], cost [1.5 1 0]
a_2 [1.80 V / 350 MHz], cost [1 0 1]
a_3 [1.95 V / 500 MHz], cost [0 1 1.5]

State and observation descriptions:
s_1, o_1: [50 ºC ≤ temp < 65 ºC]
s_2, o_2: [65 ºC ≤ temp < 75 ºC]
s_3, o_3: [75 ºC ≤ temp < 90 ºC]

Experimental Results
Operating temperature of the RISC processor while running the test sequence.
[Figure: average temperature for the test scenario (0-100 ºC scale) under the fixed settings 1.95 V / 500 MHz, 1.80 V / 350 MHz, and 1.65 V / 200 MHz, and under SDTM.]
Note: the average temperature is estimated based on

T_chip = T_a + θ_ja (P_sw + P_sc + P_static + P_int)

where θ_ja is the junction-to-ambient thermal resistance and the power terms are the switching, short-circuit, static, and internal components.
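
Plugging illustrative numbers into this estimate (all values below are assumptions for the sketch, not the paper's measurements):

```python
# T_chip = T_a + theta_ja * (P_sw + P_sc + P_static + P_int)
T_a = 45.0                    # ambient temperature [C], assumed
theta_ja = 30.0               # junction-to-ambient thermal resistance [C/W]
P_sw, P_sc = 0.60, 0.10       # switching and short-circuit power [W]
P_static, P_int = 0.15, 0.05  # static (leakage) and internal power [W]

T_chip = T_a + theta_ja * (P_sw + P_sc + P_static + P_int)
print(f"T_chip = {T_chip:.1f} C")  # 45 + 30 * 0.90 = 72.0 C
```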

Experimental Results
A low value on the action axis means low supply voltage and low frequency. The values of the target actions are obtained by multi-objective optimization.

Metric (normalized)   1.95V/500MHz   1.80V/350MHz   1.65V/200MHz   SDTM
Average power         1.00           0.85           0.72           0.75
Throughput            1.00           0.69           0.40           0.69
Energy/operation      1.00           1.23           1.79           1.08

SDTM achieves low power consumption and operating temperature with little impact on the throughput and energy/operation metrics, staying near the optimal power-performance zone of the multi-objective optimization.

Conclusion
We proposed a stochastic dynamic thermal management technique, providing a stochastic management framework that improves the accuracy of decision making. The POSMDP-based thermal manager controls the thermal states of the system and makes decisions (DVFS settings) to reduce the operating temperature. Experimental results with the design optimization formulation demonstrate the effectiveness of our algorithm: low temperature with little impact on the performance metrics.

Thank You!!

Yet Another Example
A scenario where, starting with action a_2 = [1.8 V / 350 MHz] issued at time t, the next system state may be one of three possible ones, as observed at time t+1.
[Figure: three temperature-vs-time traces, one per case, each annotated with the action a_2 and the resulting observation.]
Case a: the system remains in the active mode with a steady workload, resulting in the same chip temperature.
Case b: the system remains in the active mode but with a heavy workload, resulting in a temperature increase.
Case c: the system enters the idle mode.

Backup Slide
Given the SPECint 2000 CPU benchmarks gcc, gap, and gzip with the following characteristics:

program   # of instructions (in millions)
gcc       6765
gap       12726
gzip      75942

calculate the CPI and the total execution time. For example, for gcc:

Operation   Freq.   Cycles   CPI contribution        % of CPI
ALU         40%     1        40 x 1 / 100 = 0.4      0.4/1.6 = 25%
Load        35%     2        35 x 2 / 100 = 0.7      0.7/1.6 = 44%
Store       10%     2        10 x 2 / 100 = 0.2      0.2/1.6 = 13%
Branch      15%     2        15 x 2 / 100 = 0.3      0.3/1.6 = 19%

Total CPI = 1.6
CPU time = instructions/program x cycles/instruction x seconds/cycle = I x CPI x C = 6765 x 10^6 x 1.6 x 2 ns (500 MHz clock) ≈ 21.6 s
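
The same arithmetic, reproduced as a short script (the instruction mix and cycle counts are those given above for gcc):

```python
# Operation class: (frequency, cycles per operation)
mix = {"ALU": (0.40, 1), "Load": (0.35, 2),
       "Store": (0.10, 2), "Branch": (0.15, 2)}

cpi = sum(f * c for f, c in mix.values())  # 0.4 + 0.7 + 0.2 + 0.3 = 1.6
insts = 6765e6                             # gcc: 6765 million instructions
cycle_time = 2e-9                          # 2 ns per cycle at 500 MHz

cpu_time = insts * cpi * cycle_time        # I x CPI x C
print(f"CPI = {cpi:.1f}, CPU time = {cpu_time:.1f} s")  # CPI = 1.6, ~21.6 s
```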