Thermal and Power Characterization of Real Computing Devices

Similar documents
INTEGRATION, the VLSI journal

Technical Report GIT-CERCS. Thermal Field Management for Many-core Processors

Thermal Resistance Measurement

Thermal Measurements & Characterizations of Real. Processors. Honors Thesis Submitted by. Shiqing, Poh. In partial fulfillment of the

Intel Stratix 10 Thermal Modeling and Management

Calibrating the Thermal Camera

Analysis of Thermal Diffusivity of Metals using Lock-in Thermography

Technical Notes. Introduction. PCB (printed circuit board) Design. Issue 1 January 2010

Lecture 9: Clocking, Clock Skew, Clock Jitter, Clock Distribution and some FM

Transient Thermal Measurement and Behavior of Integrated Circuits

Architecture-level Thermal Behavioral Models For Quad-Core Microprocessors

Thermal measurement a requirement for monolithic microwave integrated circuit design

Application of thermography for microelectronic design and modelling

CSE140L: Components and Design Techniques for Digital Systems Lab. Power Consumption in Digital Circuits. Pietro Mercati

Single Photon detectors

CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues

Reducing power in using different technologies using FSM architecture

Itanium TM Processor Clock Design

Through Silicon Via-Based Grid for Thermal Control in 3D Chips

Blind Identification of Power Sources in Processors

Thermal System Identification (TSI): A Methodology for Post-silicon Characterization and Prediction of the Transient Thermal Field in Multicore Chips

Supplementary Information for On-chip cooling by superlattice based thin-film thermoelectrics

Lecture 21: Packaging, Power, & Clock

MIL-STD-883E METHOD THERMAL CHARACTERISTICS

High-Speed Switching of IR-LEDs (Part I) Background and Datasheet Definition

Thermal Interface Materials (TIMs) for IC Cooling. Percy Chinoy

BETTER DESIGN AND NEW TECHNOLOGIES IMPROVE LASER POWER MEASUREMENT INSTRUMENTATION

CIS 371 Computer Organization and Design

Fig. 1. Schema of a chip and a LED

EECS150 - Digital Design Lecture 26 - Faults and Error Correction. Types of Faults in Digital Designs

Using FLOTHERM and the Command Center to Exploit the Principle of Superposition

Analytical Model for Sensor Placement on Microprocessors

Memory Thermal Management 101

Clock signal in digital circuit is responsible for synchronizing the transfer to the data between processing elements.

Enhancing Multicore Reliability Through Wear Compensation in Online Assignment and Scheduling. Tam Chantem Electrical & Computer Engineering

AN ANALYTICAL THERMAL MODEL FOR THREE-DIMENSIONAL INTEGRATED CIRCUITS WITH INTEGRATED MICRO-CHANNEL COOLING

Interconnect s Role in Deep Submicron. Second class to first class

Mitigating Semiconductor Hotspots

Granularity of Microprocessor Thermal Management: A Technical Report

Lednium Series Optimal X OVTL09LG3x Series

Thermomechanical Stress-Aware Management for 3-D IC Designs

Reliability Analysis of Moog Ultrasonic Air Bubble Detectors

The Linear-Feedback Shift Register

DKDT: A Performance Aware Dual Dielectric Assignment for Tunneling Current Reduction

by M. Kaluza*, I. Papagiannopoulos**, G. De Mey **, V. Chatziathanasiou***, A. Hatzopoulos*** and B. Wiecek*

Chamber Development Plan and Chamber Simulation Experiments

Lecture 16: Circuit Pitfalls

DEPFET sensors development for the Pixel Detector of BELLE II

Lecture 25. Dealing with Interconnect and Timing. Digital Integrated Circuits Interconnect

Submitted to IEEE Trans. on Components, Packaging, and Manufacturing Technology

Blind Identification of Thermal Models and Power Sources from Thermal Measurements

ECE321 Electronics I

Thermal Scheduling SImulator for Chip Multiprocessors

Spitzer Space Telescope

Laird Thermal Systems Application Note. Precise Temperature Control for Medical Analytical, Diagnostic and Instrumentation Equipment

/qirt The influence of air humidity on effectiveness of heat sink work. by M. Kopeć*, R. Olbrycht*

HT TRANSIENT THERMAL MEASUREMENTS USING THERMOGRAPHIC PHOSPHORS FOR TEMPERATURE RATE ESTIMATES

Two-Layer Network Equivalent for Electromagnetic Transients

Accurate Temperature Estimation for Efficient Thermal Management

Chapter 2 Process Variability. Overview. 2.1 Sources and Types of Variations

Design for Manufacturability and Power Estimation. Physical issues verification (DSM)

Lecture 16: Circuit Pitfalls

Accelerated Life Test Principles and Applications in Power Solutions

i.mx 6 Temperature Sensor Module

KATIHAL FİZİĞİ MNT-510

Impact of Scaling on The Effectiveness of Dynamic Power Reduction Schemes

Thermal Management In Microelectronic Circuits

An Efficient Transient Thermal Simulation Methodology for Power Management IC Designs

Study of Solder Ball Bump Bonded Hybrid Silicon Pixel Detectors at DESY

Administrative Stuff

Accurate Joule Loss Estimation for Rotating Machines: An Engineering Approach

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis

Word-length Optimization and Error Analysis of a Multivariate Gaussian Random Number Generator

Reduction of Detected Acceptable Faults for Yield Improvement via Error-Tolerance

A TES Bolometer for THz FT-Spectroscopy

Chip Level Thermal Profile Estimation Using On-chip Temperature Sensors

L16: Power Dissipation in Digital Systems. L16: Spring 2007 Introductory Digital Systems Laboratory

Device 3D. 3D Device Simulator. Nano Scale Devices. Fin FET

Tools for Thermal Analysis: Thermal Test Chips Thomas Tarter Package Science Services LLC

ESE 570: Digital Integrated Circuits and VLSI Fundamentals

Status. Embedded System Design and Synthesis. Power and temperature Definitions. Acoustic phonons. Optic phonons

Laser Diodes. Revised: 3/14/14 14: , Henry Zmuda Set 6a Laser Diodes 1

CCD OPERATION. The BBD was an analog delay line, made up of capacitors such that an analog signal was moving along one step at each clock cycle.

HIGH-PERFORMANCE circuits consume a considerable


DM74LS670 3-STATE 4-by-4 Register File

EE115C Winter 2017 Digital Electronic Circuits. Lecture 6: Power Consumption

Reliability-aware Thermal Management for Hard Real-time Applications on Multi-core Processors

Lecture 2: Metrics to Evaluate Systems

The molecules that will be studied with this device will have an overall charge of

Miniature Electronically Trimmable Capacitor V DD. Maxim Integrated Products 1

Thermal Image Resolution on Angular Emissivity Measurements using Infrared Thermography

Stack Sizing for Optimal Current Drivability in Subthreshold Circuits REFERENCES

STUDY OF THERMAL RESISTANCE MEASUREMENT TECHNIQUES

Introduction to Digital Signal Processing

System-Level Modeling and Microprocessor Reliability Analysis for Backend Wearout Mechanisms

FLOW VISUALIZATION OF THE BUOYANCY-INDUCED CONVECTIVE HEAT TRANSFER IN ELECTRONICS COOLING. Carmine Sapia

EE115C Winter 2017 Digital Electronic Circuits. Lecture 19: Timing Analysis

Today. ESE532: System-on-a-Chip Architecture. Energy. Message. Preclass Challenge: Power. Energy Today s bottleneck What drives Efficiency of

Electromagnetic-Thermal Analysis Study Based on HFSS-ANSYS Link

Transcription:

76 IEEE TRANSACTIONS ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 1, NO. 2, JUNE 2011 Thermal and Power Characterization of Real Computing Devices Sherief Reda, Member, IEEE (Invited Paper) Abstract Power and temperature are key design concerns in modern computing systems. Power minimization is essential for battery-operated devices and for large-scale data center facilities. The spatial and temporal allocation of within-die power consumption lead to thermal gradients and hot spots during operation. Temperature impacts key circuit metrics such as reliability, speed, and leakage power, and it is a major constraint towards improving the performance of high-end computing devices. Due to the enormous complexities and sheer number of modeling parameters of state-of-the-art designs, pre-silicon power and thermal models cannot be trusted blindly. It is necessary to complement pre-silicon analysis with post-silicon thermal and power characterization on the fabricated devices, and then to use the characterization results to improve the design during re-spins before ramp and production. In this paper, we describe new techniques for thermal and power characterization of real computing devices. We show how the measurements from infrared imaging, embedded thermal sensors, and current meters can be integrated to accurately characterize the temperatures and power of computing devices during operation. We describe the key algorithmic and experimental techniques required to overcome the challenges encountered when working with real devices. We present characterization results of a dual-core processor and a programmable logic device. Index Terms Characterization, hot spot, infrared imaging, power, sensors, thermal, tracking. I. INTRODUCTION I N RECENT years, power and temperature have emerged as major concerns in the design and operation of computing devices. Power is a major objective to minimize for low-power mobile devices that rely on batteries for operation, as reducing the power increases the operational time. Power is also a constraint towards improving the performance of these mobile devices [15]. For instance, the specifications of a typical cell-phone ARM-based processor give 0.6 mw per MHz of operation; thus, increasing the performance to cater for advanced applications implies running into power as a bottleneck. Power and energy are also constraints for large-scale computing data Manuscript received January 08, 2011; accepted April 22, 2011. Date of publication June 30, 2011; date of current version August 19, 2011. This work is supported in part by the U.S. Department of Defense (DoD) Army Research Office (ARO) under Grant W911NF-09-1-0320, in part by the National Science Foundation under Grant 1115424 and Grant 0952866, and by an equipment donation from Altera Corporation. This paper was recommended by Guest Editor E. Macii. The author is with the School of Engineering, Brown University, Providence, RI 02912 USA (e-mail: sherief_reda@brown.edu). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JETCAS.2011.2158275 centers. The average power consumption of a medium-size data center can easily exceed 20 MW [3]. Minimizing the instantaneous power could be a constraint imposed by electric grid capacity and the electric cable wiring to the facility. Thus, increasing the performance to cope with demand surges can run into power as a major hurdle. Furthermore, the annual electrical energy bill of a data center can reach tens of millions of dollars, making energy a major contributor to the center s total cost of ownership [2]. The magnitudes of on-die temperatures are determined by the spatial and temporal allocation of power consumption and the design of the heat removal system. Elevated temperatures directly impact all key circuit metrics including: lifetime and reliability, speed, power, and costs. Hot spots reduce the mean time to failure as most failure mechanisms (e.g., electromigration, time dependent dielectric breakdown, and negative bias temperature instability) have strong temperature dependencies [40]. Furthermore, different thermal expansion coefficients of chip materials cause mechanical stresses that can eventually crack the chip/package interface [6]. Elevated temperatures also slow down devices and interconnects leading to timing failures [8], [28]. The exponential dependency of leakage power on temperature further increases total power and could lead to thermal runaway [31]. Due to the nonideal scaling of power consumption and costs of heat removal systems, temperature has become a major hurdle towards improving the performance of high-end many-core processors [15]. Temperature and power are two related physical phenomena that need to be simultaneously modeled and managed. To illustrate this need, we use our experimental infrastructure to create a test chip, where we spatially allocate 2 W of power in different ways, as shown in Fig. 1. In Fig. 1(a), 2 W are spatially allocated at the top-left corner of the die, while in Fig. 1(b), 2 W are equally distributed among the four corners of the die. In both allocations, the total area occupied by 2 W is exactly the same. However, the two spatial allocations lead to entirely different temperature maps with different temperature histograms. The allocation in Fig. 1(a) leads to a maximum temperature of about 31 C, while the allocation in Fig. 1(b) leads to a maximum temperature of about 30 C. Furthermore, the histogram of Fig. 1(a) shows large range of thermal variations, while the histogram of Fig. 1(b) shows a tighter thermal distribution around the average temperature. A sole power modeling or management technique will not be able to account for these differences; thus, it is necessary to model both power and temperature. Fine-grain power and temperature modeling during design is a very complex process due to the sheer number of parameters 2156-3357/$26.00 2011 IEEE

REDA: THERMAL AND POWER CHARACTERIZATION OF REAL COMPUTING DEVICES 77 Fig. 1. Impact of spatial allocation of power on temperature. The histograms give the percentages of chip area operating at different temperatures. that need to be known and complex physical phenomena that need to be simulated. In a typical design flow a simplified flow is given in Fig. 2 there are a number of places where inaccuracies in power modeling can occur. One of the major problems in power modeling is the dependency of dynamic power consumption on input patterns [32]. While switch-level power simulators are accurate, it is computationally infeasible to conduct such simulation for an entire workload where trillions of instructions are simulated. To speed up the modeling process, it is necessary to abstract the pattern dependency by switching activity estimates that are driven from architectural simulation tools [49], [7]. These factors can be computed at desired timing granularities from architectural simulation tools. Such abstraction drastically reduces the simulation time but introduces inaccuracies in dynamic power estimation [7]. Leakage power, which has exponential temperature dependency, also contributes to power consumption and complicates its modeling [6]. Hence, it is necessary to chain power and thermal simulators in a feedback loop as shown in Fig. 2 to estimate leakage power [29], [30]. If the dynamic power estimates are inaccurate or only modeled for a subset of possible input patterns, then errors in dynamic power modeling will translate to errors in temperature and leakage power models. Furthermore, process variations, such as gate length variations and dopant fluctuations, lead to variations in the threshold voltage of transistors which impact leakage power and temperature [6]. Fig. 2. Modeling and analysis of power and temperature during design. Due to all the aforementioned complexities, modeling and analysis of power and temperature during design time might not return accurate results when compared to the real post-silicon characteristics. In many cases the mismatch between pre-silicon

78 IEEE TRANSACTIONS ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 1, NO. 2, JUNE 2011 Fig. 3. Framework for post-silicon thermal and power characterization. and post-silicon characteristics forces major changes in the design and implementation of integrated circuits. In a wide study from 2005, it was shown that 70% of new designs go at least one design re-spin to fix post-silicon problems and that 20% of these re-spins are due to power and thermal issues [47]. It is likely that these figures are much higher now as chips are more complex boasting larger number of transistors. Post-silicon power and thermal characterization of integrated circuits provide the true power and temperature characteristics under representative loading conditions, such as workloads and dynamic voltage and frequency (DVFS) settings. Fig. 3 gives an integrated framework, where measurements from infrared imaging equipment and embedded thermal sensors are used for thermal characterization, and measurements from the infrared imaging equipment together with electrical current measurements are used for power characterization. For thermal characterization, within-die embedded thermal sensors sample the junction temperature at strategic locations. Because digital thermal sensors consume valuable die area, they have be used judiciously. The measurements from the sensors are complemented with high-resolution thermal images acquired from an infrared camera. Together, these measurements can identify the true locations and magnitudes of hot spots and the extent of on-chip thermal gradients. For power characterization, electric current measurements can provide the total power consumption. If the chip supports multiple independent power supply networks, then it is possible to increase the pool of current measurements. The thermal emissions captured from the infrared imaging system, together with the lumped electrical current measurements, can be inverted to yield high-resolution spatial power maps for the individual circuit blocks. The results of post-silicon power and thermal characterization can improve the design process during re-spins or for future designs in the following ways. High-level power modeling tools rely on the use of parameters that are estimated from empirical data [7], [42]. The power characterization results can be used to calibrate and tune high-level power modeling tools. Thermal characterization results can drive heat sink design. Passive heat sinks remove heat indiscriminately from the die, and thus, their design is mainly driven by total power consumption. Active heat removal systems, such as thermoelectric Peltier coolers [46], can make use of the detailed thermal characterization results to maximize their heat removal capacity. If the thermal characterization results show discrepancy between hot spot temperatures as reported by the thermal sensors and the infrared system, then the thermal characterization results can be used to adjust the locations of the thermal sensors to reduce the thermal tracking error. To evaluate what if? design re-spin questions, the power characterization results can substitute the power simulator estimates and directly feed the thermal simulator. For example, if the thermal characterization results are unacceptable, then the layout can be changed to reduce the spatial power densities and hot spot temperatures. The power characterization results are fed together with the new layouts to the thermal simulator to evaluate the impact of the changes. The post-silicon power and thermal characterization results can also force a reevaluation of the computing device specifications (e.g., operating frequencies). In the next sections, we elucidate the key experimental and algorithmic techniques required for post-silicon power and thermal characterization of real computing devices. In Section II, we describe techniques for thermal characterization using embedded thermal sensors and infrared imaging. The outcomes of these techniques provide calibrated thermal images of the device together with an identification of the locations of hot spots and their magnitudes. Given the thermal calibrated images and current measurements as inputs, we describe in Section III techniques that provide the true power maps of a computing device during operation. We summarize our work and provide directions for future work in Section IV. II. THERMAL CHARACTERIZATION The physical relationship between power and temperature is described by the physics of heat transfer. Mainly, heat diffusion governs the relationship between power and temperature in the bulk of the die and the associated metal heat spreader, while heat convection governs the transfer of heat from the boundary of the integrated metal spreader/sink to the surrounding fluid medium which is either air or liquid. The heat diffusion equation is given by where is the temperature at location is the power density at location is the material density, is the specific heat density, and is the thermal conductivity [40]., and are all functions of location [40]. The power transferred at the boundary by convection is described by Fourier s law for heat transfer, and it is proportional to the temperature difference between the boundary of the heat sink and the ambient temperature. The constant of proportionality is the heat transfer coefficient which depends on the geometry of the heat sink, the fluid used for heat removal and its speed [19]. (1)

REDA: THERMAL AND POWER CHARACTERIZATION OF REAL COMPUTING DEVICES 79 A. Embedded Thermal Sensors The spatial and temporal variations in workload power consumption, together with large die areas that can accommodate potentially 10 s 100 s of cores, imply that there is a potential for large spatial and temporal thermal variations during runtime. To ensure reliable operation, it is necessary to monitor and track hot spots during runtime using thermal sensors. These sensors are embedded into the active layer of the integrated circuit to measure the junction temperatures at their locations. The temperature measurements by the sensors are used by dynamic thermal management (DTM) systems to adapt the performance depending on the available thermal slack [24], [23], [13], [50], [12], [13], [11]. In case of thermal emergencies, the management system can eventually shut down the processor. Thermal sensors and their support circuitry utilize die area real estate and increase design complexity. For example, commercial-grade digital thermal sensors listed at design-reuse.com consume a sizable die area. An 8-bit digital thermal sensor requires an area of about 0.25 mm in 180 nm technology process and an area of about 0.10 mm in a 65 nm technology process. In a digital thermal sensor, the voltage signal of a thermal diode is routed to and measured by an analog to digital converter (A/D) that stores its results in a register that is periodically checked by the DTM system. Discretization, thermal noise and cross talk all introduce errors in the measurements of thermal sensors. Digital thermal sensors consume a good portion of die area mainly due to the need to accommodate the A/D. Because die size is the main recurring cost during fabrication, there is an inherent trade-off in thermal monitoring: costs are reduced by using the fewest number of thermal sensors, but accurate thermal tracking requires increasing the number of sensors. A number of papers in the literature propose thermal sensor allocation algorithms to ensure effective tracking using a budget number of sensors [27], [36], [37], [45], [33], [20], [26], [10], [38], [52]. The thermal traces, that are used as inputs to sensor allocation algorithms, are initially computed using pre-silicon thermal simulations which might not be accurate or realistic as discussed in Section I. Furthermore, design constraints can force designers to perturb the locations of the thermal sensors. Embedded sensors and infrared imaging play the following synergistic roles in thermal characterization. First, independent sensor measurements acquired while using different heat sinks, but under identical loading conditions, can be used to match the thermal characteristics of infrared-transparent heat sinks to the characteristics of regular metal sinks. Second, measurements from infrared imaging can be used to evaluate the effectiveness of thermal sensors in thermal tracking. Third, the captured thermal traces can be used to identify the true locations of hot spots which are then used as inputs to the thermal sensor allocation algorithm. The aforementioned three synergistic roles will be elaborated in the following subsections. B. Infrared Imaging Any body above absolute zero Kelvin emits infrared thermal radiation with an intensity that depends on its temperature, its emissivity, and the radiation wavelength. Transistors and interconnects in integrated circuits operate at elevated temperatures Fig. 4. Oil-based Infrared transparent heat sink. due to resistive heating by charge carriers [41]. Modern computing integrated circuits use flip-chip packaging, where the die is flipped over and soldered to the package substrate. By removing the package heat spreader, one can obtain optical access to every device on the die through the silicon backside. Silicon is transparent to infrared emissions with photon energies that are less than its bandgap energy (1.12 ev), which corresponds to wavelengths larger than 1.1 m. This transparency is ideal from an infrared imaging perspective as it enables the capture of photonic emissions from the devices, which provide valuable information for thermal and power characterization of computing devices operating under realistic loading conditions. One of the key specifications of an infrared imaging system is its spectral response which determines the part of the infrared spectrum that the imaging equipment can detect. For the range of temperatures encountered during chip operation, the mid-wave infrared (MWIR) range, which stretches from 3 to 5 m, yields the most sensitive and accurate characterization of thermal emissions. Detecting emissions in the MWIR requires the use of InSb quantum detectors which have to be cooled to cryogenic temperatures to ensure sensitivity. As a consequence, high-resolution MWIR imaging systems tend to be fairly expensive. The MWIR imaging system used in our experiment is a FLIR SC5600 camera with a focal point array of 640 512 InSb detectors cooled to 77 K ( 196 C) to reduce the noise in temperature measurements to less than 15 mk. The imaging system can capture thermal traces at a rate of up to 380 Hz, and depending on the used microscopic lens, the system can resolve temperature down to 5 5 m. C. Infrared-Transparent Heat Sinks To capture the thermal emissions from an operational die, it is necessary to remove the obstruction introduced by traditional heat removal mechanisms (e.g., integrated heat spreader, metal heat sink and fan), and to substitute them with mechanisms that can remove the heat while being transparent to infrared emissions. The standard technique is through the use of infrared transparent oil-based heat removal systems [15], [35], [34], [38]. For our experiments, we machined a special sapphire oil-based infrared-transparent heat sink that precisely controls the oil flow into the computing device. Fig. 4 illustrates our design where a sapphire window sits on top of a processor die with a clearance of about 1 mm. Chilled oil is forced into the inlet valve of the sink which then flows on top of the die to sweep the heat and then exits through the outlet valve. The oil maintains its flow using an external DC pump, and the temperature of the oil is controlled using a thermoelectric cooler.

80 IEEE TRANSACTIONS ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 1, NO. 2, JUNE 2011 Fig. 5. Feedback control system to control the oil s temperature during thermal calibration. D. Thermal Calibration Measuring temperatures is not straightforward because an infrared camera does not measure temperatures directly but it rather measures the emitted photon intensities. Thus, it is necessary to convert the digital levels (which reflect photon intensities) recorded by the camera to temperatures. This conversion is complicated by the fact that radiation intensity is not constant among different materials even if they are at the same temperature. Black bodies are perfect emitters with an emissivity equal to one. The emissions of real materials are a fraction of the black-body level, and each material is characterized with an emissivity value, which is defined as the ratio of that material s thermal emission intensity to that of a perfect black-body at the same temperature [44]. As an integrated circuit layout is composed of different materials (e.g., copper, silicon and dielectrics), it is expected that the radiation intensities of various parts of a die will be different even if the die is forced at an isothermal temperature. Thus, thermal calibration has to take place on a pixel-by-pixel basis. The general idea of thermal calibration is to build a relationship between the digital levels and temperatures for each pixel. This relationship can be modeled by an exponential function, where and are coefficients that need to be estimated for each pixel. The exponential relationship arises from the physics of photon detectors where the current of a photo-sensitive diode depends exponentially on the incident radiation [17]. To compute the pixel-by-pixel relationships between temperatures and digital levels, we use the following calibration procedure. The processor is first turned off and forced to a known isothermal status, and then the digital levels of all pixels are recorded. The isothermal status is repeatedly adjusted to a number of different temperature settings (20 C to 60 C at 5 C steps), where the digital level data are recorded at each setting. Using the calibration data, the and of every pixel are computed using standard curve fitting techniques. During normal operation, the measured digital levels together with the pre-computed s and s are used to estimate temperatures on a per-pixel basis according to the exponential relationship. The key for accurate calibration is to be able to strictly force the die into a desired isothermal status. To achieve this objective, we precisely control the temperature of the oil flow. We develop a computer-based feedback control system that controls the thermoelectric device cooling or heating capacity through a programmable current supply as shown in Fig. 5. We also use a fluid monitor to track the oil s flow rate, temperature and pressure just before the oil s entry to the oil-based infrared-transparent sink. The measurements from the flow monitor are logged into a monitoring computer through an A/D acquisition device. According to the logged flow temperature, the computer-based PI controller adjusts the current supply of the thermoelectric device until the desired oil temperature is reached and stabilized. The transparency of silicon and the machined heat sink to infrared emissions enables the infrared system to directly measure the junction temperatures at the active layer of the integrated circuit. In this case the 3-D continuous temperature signal is represented by a vector that gives the temperatures at a discrete set of planar locations. The measured temperature is equal to, where the vector denotes the noise in measurements introduced during imaging. To illustrate the extent of thermal variations during operation, we use our infrared imaging system to capture a few thermal traces of a dual-core AMD Athlon II 240 processor while running various CPU SPEC06 workloads at 2.1 GHz. The calibrated thermal traces are given in Fig. 6. The traces show the extent of thermal gradients within the die with a size of 14 8.5 mm. The traces show that within-die thermal gradients could reach up to 16 C, and that differences in workloads could lead to strong variations in hot spot locations. E. Heat Sink Matching Without careful consideration, replacing the traditional metal heat sink with an infrared-transparent heat sink can change the thermal characteristics as it changes the boundary conditions [19], [34]. This replacement, however, does not change the underlying dynamic power characteristics which have very weak temperature dependency. Thus, it is relevant to contrast the thermal responses from these two different heat sinks, and perhaps introduce corrective measures in case large differences in characteristics are detected. To evaluate such possibility, we devise an experiment where we rely on the measurements of two embedded thermal sensors in the processor to contrast the thermal characteristics of the two different heat sinks. The infrared camera is irrelevant in this experiment. Using identical loadings, we measure independently the embedded thermal sensors of our test processor using the traditional metal heat sink and the infrared-transparent sink. Fig. 7 gives the thermal sensor measurements for the first 200 seconds of the and workloads from the CPU SPEC 2006 benchmark suite. Fig. 7(a) gives the measurements when the processor is coupled with the infrared-transparent oil-based heat

REDA: THERMAL AND POWER CHARACTERIZATION OF REAL COMPUTING DEVICES 81 Fig. 6. Calibrated thermal traces captured from a dual-core AMD Athlon II processor during operation while executing samples of the SPEC CPU 2006 workloads. Fig. 7. Thermal sensors measurements for the first 200 seconds for a variety of SPEC 06 workloads using two different heat sink setups. sink setup, while Fig. 7(b) gives the measurements when the processor is coupled with the regular metal heat sink and fan setup. The measurements lead to the following two key observations. 1) The measurements show that the two setups give spatially and temporally correlated results. That is, if sensor 1 gives a higher measurement than sensor 2 in one setup at some point in time, then it will also give higher temperature in the other setup and vice versa. This result is important because a hot spot is by definition the highest temperature on the die at a point in time, and thus if this result is generalized, we conclude that substituting the heat sink setups does not generally alter the locations of hot spots. We have repeated this experiment with different workloads to confirm this conclusion. 2) The measurements show that thermal gradients exist in both setups. The metal sink setup shows a gradient of 5 C, while the oil sink setup shows a gradient of about 10 C. There are two possible ways to make the measurements from the oil sink match the measurements obtained from the regular metal heat sink. One possibility is to adjust the design of the sapphire heat sink, the oil flow rate, and its temperature until the measurements match those of the regular heat sink [34]. Another possibility is to transform the measurements numerically to make them look alike. For example, Fig. 8 plots the sensor measurements using the regular metal sink against the sensor measurements using the infrared-transparent oil sink. The digital thermal sensors in the tested processor only report integer temperatures, and their errors tend to be around 1 C. The plot shows that a linear relationship can be used to transform the measurements from the infrared-transparent sink to match those of the regular metal sink. F. Identification of Hot Spots To characterize the locations of hot spots, we execute the 29 SPEC CPU workloads on the processor to collect tens of thousands of thermal traces during runtime operation of the processor. We execute the workloads in single and dual workloads configurations. Using a threshold of 37 C, we identify the hot spot location in each trace and then we plot all the identified hot spot locations in Fig. 9. The points in the figure give the set of potential locations where hot spots can occur during runtime. The hot spot locations are generally localized at and around the

82 IEEE TRANSACTIONS ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 1, NO. 2, JUNE 2011 III. POWER CHARACTERIZATION Fig. 8. Sensor measurements acquired using the two heat sink setups. The objective of post-silicon power characterization is to identify the true power consumption of different blocks of a computing integrated circuit under real loading conditions. While it is possible to isolate the power consumption of a circuit block during testing by using scan chains, such approach is not capable of characterizing the wide range of possible power consumption maps under true loading conditions, where workloads simultaneously exercise multiple circuit blocks in intricate ways. The most versatile approach is to compute the power map from the captured thermal emissions,, and the external current measurements, as discussed earlier in Section I. In its discretized version, the steady-state form of (1) is approximated by the following linear matrix formulation (2) Fig. 9. Hot spot locations of SPEC CPU 2006 workloads on evaluated dualcore processor. centers of the two cores and in the common memory controller area between the cores. These hot spot locations are plausible as designers most likely placed the frequently used units of a core towards the core s center to facilitate the interconnections to other functional units. The L2 caches are consistently the coolest areas of the processor. The large range of possible hot spot locations for just two cores demonstrates the need for more than one sensor to accurately track the hot spots during runtime. G. Back to Design The results from post-silicon thermal characterization can be used to improve the design of a computing device during design re-spins before ramp and production. For example, the thermal characterization results can be used to improve the design of the heat sink. These results will be more relevant for future active heat removal systems, such as thermoelectric Peltier and microfluidic coolers, which can make use of the detailed thermal characterization results to maximize their heat removal capacity and to spatially adapt depending on the underlying thermal spatial gradients. Another possibility is to use the true, identified hot spot locations to directly drive the thermal sensor allocation algorithms. The results of these algorithms are used to adjust the thermal sensor locations to reduce the hot spot tracking error. The thermal characterization results can also trigger changes in the design s layout to eliminate or reduce critical hot spots. As many state-of-the-art processors are thermally constrained, the thermal characterization results can be used to adjust the voltage and frequency settings of the computing device. where the matrix is the system s thermal resistance matrix, is the desired power map, is the noise in measurements, and is a calibrated thermal map as described in Section II. For post-silicon power characterization it is necessary to invert the thermal emission maps into power maps. For purpose of power characterization, it is not necessary to match the thermal characteristics,, obtained from the infrared-transparent heat sink to match those of the metal sink, because dynamic power has very weak temperature dependency and the matrix will be directly estimated while using the infrared-transparent heat sink. An inversion problem is well-posed if it satisfies three conditions: existence, uniqueness, and stability [4]. Existence implies that for every measured temperature map, there exists a power map that produces it; uniqueness means that there exists one and only power map that leads to the measured temperature map; and stability means that small perturbations in the temperature measurements lead to small perturbations in power estimates. A. Challenges in Power Characterization In our prior work [9], we have identified two challenges in thermal to power inversion. The first challenge comes from the noise inherent in the measurement process in both the current measurements and in the infrared measurements. There are a number of noise sources that can impact the measurements, including thermal noise, analog-to-digital discretization noise, flicker noise, and shot noise [44]. During inversion tiny amounts of noise in thermal measurements can be amplified into large errors in power characterization. The amplification of noise is controlled by the small singular values of the matrix [4]. Noise can violate the stability condition leading to ill-posed inversion. The second challenge is the spatial low-pass filtering associated with heat diffusion. While the power consumption can vary abruptly in space according to the device layout and application behavior, the temperature will vary smoothly in space [14], [18]. This low-pass filtering effect is inherent in the physical behavior of heat conduction as governed by the heat diffusion equation. This phenomenon is illustrated in Fig. 10 where we create checkerboard power patterns in a test chip and measure their emissions. The power patterns in Fig. 10 have the same amount of total power but they differ in their spatial frequencies. The thermal emissions given in the figure demonstrate that

REDA: THERMAL AND POWER CHARACTERIZATION OF REAL COMPUTING DEVICES 83 Fig. 11. Impact of regularization parameter on noise removal. Fig. 10. Measurements using our experimental setup to show the impact of spatial low-pass filtering on the intensity and variations of thermal emissions. the thermal variations are attenuated as the spatial frequency increases. For example, the standard deviation drops from 184 mk to 111 mk and 64 mk, when the spatial frequency is increased. We observe that if spatial filtering heavily attenuates high-frequency power patterns, then the resultant attenuated emissions could fall below the detection sensitivity of the infrared imaging system, leading to an irreversible loss of information. In a simulation-based world where double-precision floating point numbers are used, attenuation causes no problem, but in real systems with physical limitations on their detectors and their analog to digital converters, attenuation is a major problem as it degrades the signal-to-noise ratio. Spatial filtering can violate the uniqueness condition leading to ill-posed inversion. B. Theory and Practice in Power Characterization To reduce the detrimental impact of noise and spatial filtering on power characterization, we propose using techniques from regularization theory [48], [4], [16]. One promising approach is Tikhonov regularization which finds the power solution that minimizes the sum of two terms: 1) the total squared error between the temperatures computed from the estimated power and the actual thermal measurements and 2) the norm of the power solution. That is, where is the regularization parameter that controls the minimization emphasis between the two terms of the objective function [16]. Fundamentally, Tikhonov regularization filters out the singular components of the matrix that are small relative to the regularization parameter and passes the singular components that are large relative to the regularization parameter [48]. The filtration of small singular components reduces the amplification of noise during inversion. Fig. 11 gives a conceptual overview of Tikhonov regularization. When the regularization parameter is small in value (top-left part of the curve) then total square error minimization is more emphasized and (3) more small singular values are passed, which could lead to over fitting the noise. As the regularization parameter increases, solutions move monotonically towards the bottom right part of the curve, which leads to more noise removal and regularization. A reasonable value for the regularization parameter occurs at the corner of the L-shaped curve. To further reduce the occurrence of an ill-posed inversion, it is necessary to impose constraints on the solution. One possible constraint is that the sum of the elements of the power vector is equal to the total power consumption of the chip,, and that these elements are nonnegative; i.e., where is the norm and is the total power consumption of the chip which is externally measured using an electrical current sensing meter. The importance of the constraint given by (4) is that if spatial filtering leads to multiple solutions to objective (3), then only the solution that satisfies the total power constraint is the one selected by the optimization solver. In practice, any digital current meter has a tolerance, tol, in its measurements (the tolerance is typically listed in the multimeter s data sheet), and thus it is better to replace the constraint of (4) by two constraints: and. In chips with multiple, independent power networks, there will be one measurement for each network, and thus there will be multiple corresponding constraints. By constraining the norm and including the norm in the minimization objective, we essentially control the variance of the power solution. If the Tikhonov parameter is small, then the variance is not controlled which can lead to fitting the noise rather than the signal. As the Tikhonov parameter increases in value, the variance is controlled leading to more regularized solutions. C. Framework for Evaluating the Accuracy of Power Characterization Techniques In our power characterization experiments, we felt the need to validate our characterization results in a scientific way, where the estimated post-silicon power characterization maps are compared against well-trusted power maps. A generic processor chip does not offer an alternative, trusted way to evaluate its power characteristics. As a result, we constructed (4)

84 IEEE TRANSACTIONS ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 1, NO. 2, JUNE 2011 The scanning of the laser system can be automated by using a pair of galvo-directing mirrors [15]. Our method uses the programmable nature of our design to get the same results of the expensive laser scanning system but in a much easier way. Another approach is to use the actual design and layout of the chip to conduct finite-element heat convection and diffusion simulations to estimate the matrix, where power pulses are applied in simulation at the desired locations, and then the results from the thermal simulations of the finite element model are used to construct matrix [21], [15], [35]. Fig. 12. Structure of test circuit consisting of 10 10 micro-block heaters, which is embedded in a section of a FPGA. a test chip where the underlying switching activity can be precisely controlled. Our test chip consists of 10 10 micro-heater blocks organized in a grid fashion as shown in Fig. 12(a). Each block contains nine ring oscillators that can be simultaneously enabled or disabled through a programmable control signal that is stored in an associated flip-flop. When enabled, the total dynamic power consumption of a micro-heater block is equal to 25 mw. The size of each block is about 0.56 mm. The flip-flops of all blocks are connected in a scan chain of length 100. To create a desired spatial power pattern, we program the flip-flops with 100 bits that correspond to the desired power pattern; thus, the length of the vector is equal to 100. The design is embedded in a section of a 90 nm Altera Stratix II field-programmable gate array (FPGA) that is composed of 30 30 configurable logic blocks as shown in Fig. 12(b). The modular structure of the FPGA perfectly suits the grid structure of our design. Our test chip enables us to create arbitrary known dynamic power patterns that can verify the estimated post-silicon power characterization maps. D. Estimating the Modeling Matrix Given the test chip, it is necessary to measure the thermal to power model matrix. The matrix can be estimated in a column-by-column basis as follows. Enabling only one microheater block at a time is mathematically equivalent to setting the vector to be equal to, where is the power consumption of the enabled micro-heater block, assuming it is located at the th location. For instance, if then the first micro-heater block (top-left corner) of the grid of Fig. 12(a) is enabled. The power consumption incurred from enabling only one micro-heater block can be easily measured using the external current sensing meter. If denote the steady-state thermal emissions captured from enabling the th micro-heater block, then column of matrix is equal to. We leverage the programmability of our grid to enable each micro-heater block one at a time, and record the emitted temperatures across the field. We automate the whole process in order to measure the 100 columns of matrix with fast turnaround time. In generic computing devices, the matrix can be estimated in the same conceptual way but through different implementation approaches [15], [35]. One approach is to turn off the chip, and scan a laser beam with known power density to deliver the power from the outside to the regions of interest. E. Power Mapping of Arbitrary Patterns To evaluate the effectiveness of our post-silicon power characterization method, we use our test chip to create three arbitrary spatial dynamic power patterns given in Fig. 13(a). The total power consumption of the three maps, from left to right, is equal to 1.22, 1.27, and 1.16 W. We tried to create a variety of patterns that test the accuracy of our inversion method. For instance, the first pattern spells the acronym of this journal JETCAS in the form of dynamic power switching activity. Our patterns are challenging to estimate because each micro-heater block consumes a tiny amount of power equal to 25 mw in an area of about 0.56 mm, and because the underlying created power maps have intricate spatial patterns. We give in Fig. 13(b) the thermal emission maps arising from exciting the injected power patterns. The differences between the maximum and minimum temperatures in the patterns are equal to 0.73 C, 0.52 C, and 0.55 C from left to right. The small temperature ranges make the thermal to power inversion further challenging. The power estimates from our inversion technique are given in Fig. 13(c). The power modeling errors for the estimated maps are 4.1%, 2.0%, and 0.0%, where the modeling error of each pattern is computed as the average absolute error between the known power map and the estimated power map divided by the total power of the known map. The estimated and original power maps visually agree to a great extent. To understand the source of small characterization errors, we simulate the resultant temperatures if the true power maps are used as inputs; the simulation is basically the result of multiplying the injected power patterns by the matrix. We plot in Fig. 14 the residual errors between the measured temperatures and the simulated measurements. We verify that the errors form a Gaussian distribution using the Kolmogorov-Smirnov test, and we compute the standard deviation for each residual distribution. The standard deviations are given as labels (9.10, 8.76, and 7.94 mk) in the figure. For instance, the first pattern, which has an error of 4.1% in its power map estimates, has a residual standard deviation of 9.10 mk which means that the large majority of errors (97%) are about 18 mk. The residual errors fall within the sensitivity limitation (15 mk) of our camera detectors. F. Power Mapping of an Embedded Processor Soft processors are programmable processors implemented in the reconfigurable logic fabric of FPGAs [51]. Soft processors provide a real design test case. We use the devised inversion methodology to estimate the spatial power consumption of the Nios II soft processor while running the standard Dhrystone

REDA: THERMAL AND POWER CHARACTERIZATION OF REAL COMPUTING DEVICES 85 Fig. 13. Evaluating the accuracy of post-silicon power characterization techniques. The total power consumption of the three maps, from left to right, is equal to 1.22, 1.27, and 1.16 W. Thermal emissions are reported in degrees Celsius ( C) above room temperature. Fig. 14. Difference between measured temperatures and simulated temperature. 2.1 application. We experiment with two different configurations of the Nios II processor: the standard model Nios II/s with multipliers implemented in the FPGA s digital signal processor (DSP) blocks, and the full-performance model Nios II/f. The total power consumption of these models is 315 and 477 mw, respectively. That is, the two models consume under half a Watt in total. The layouts of these processors are given in Fig. 15(a). The steady-state temperature maps during runtime are given in Fig. 15(b), and the estimated spatial power maps are given in Fig. 15(c). Our estimated spatial maps augment the floorplan with true operational spatial power density estimates. Given that the design of the Nios II processor is proprietary, it is not possible to match the spatial power consumption estimates to the various functional blocks of the processor. G. Back to Design The post-silicon power characterization results can be used to improve design re-spins and future designs in a number of ways. Dynamic power maps can substitute the estimates of power modeling tools and can be used directly as inputs for thermal simulation tools to evaluate potential design optimizations. One possibility is thermal-aware placement and routing, where layout alternatives are evaluated for thermal effectiveness. A second possibility is to evaluate different heat sink designs. A third possibility is that the post-silicon power maps can be used to calibrate and tune high-level power modeling tools, making them more trustable. The post-silicon power and thermal results could be also used to revise fine-grain DVFS specifications. IV. SUMMARY AND FUTURE DIRECTION With sky rocketing cost of semiconductor fabrication for the most advanced sub-45 nm technologies, only few designs with large production volumes will be custom fabricated. For these high-stakes designs, blindly trusting pre-silicon power and thermal models and analyses could have disastrous consequences, and thus, it is necessary to conduct thorough post-silicon thermal and power characterization of the fabricated device to ensure its safety, reliability and conformation to specifications. Because of the enormous complexity of these designs, it will likely take a few design re-spins before final ramp and production. These re-spins will leverage the power and thermal characterization results to improve the design to meet or exceed the initial specifications. The lessons learned from the characterization of current devices will also lead to better future designs. The economies of scaling have also led to the prevalent fablite model where design houses contract fabrication facilities to fabricate their designs. In this model, it is possible that ma-

86 IEEE TRANSACTIONS ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 1, NO. 2, JUNE 2011 power models that are tuned and validated using the post-silicon power characterization results. ACKNOWLEDGMENT Some of the techniques and results discussed in this paper are based on earlier work with R. Cochran and A. N. Nowroz [10], [38], [11], [39], [43]. The author would like to thank W. Richards for the machining of the oil-based heat sink, and P. Temple for developing an acquisition system to automate data collection. REFERENCES Fig. 15. Spatial power estimates of different Nios II processors running Dhrystone 2.1 application. licious circuitry, also known as Trojans, are inserted into the design during fabrication. Existing works in the literature investigate side channel detection techniques in which the electric current measurements are used to detect Trojans [1], [22]. It is plausible that the proposed thermal and power characterization techniques, which rely on the use of nonobtrusive infrared imaging techniques, can be used to improve Trojan detection methods. Another possible use of thermal and power characterization results is to characterize process variability [25]. Process variations, such as gate length and dopant fluctuations, lead to variations in the threshold voltage of the transistors and consequently leakage power. Since leakage depends exponentially on temperature, infrared-based thermal characterization can be used as a nonintrusive way to characterize process variability. In the past two years, our research group has dramatically improved its characterization results. Our earlier techniques could only resolve power maps that use block heaters with power consumption of about 187 mw in an area of about 1.60 mm [9]. In this paper, we report improved results using block heaters with power consumption of 25 mw with an area of about 0.56 mm. These improvements are the culmination of better algorithmic and experimental techniques. We believe that there is still plenty of room to improve thermal and power characterization techniques. We are currently working on characterization techniques that explore the use of AC, rather than DC, excitation signals. AC signals can reduce the attenuation associated with spatial thermal filtering and the noise [5]. There is also plenty of work to be done towards devising trustable high-level [1] Y. Alkabani, F. Koushanfar, N. Kiyavash, and M. Potkonjak, Trusted integrated circuits: A nondestructive hidden characteristics extraction approach, in Information Hiding. Berlin, Germany: Springer-Verlag, 2008, pp. 102 117. [2] L. Barroso and U. Hölzle, The case for energy-proportional computing, IEEE Computer, vol. 40, no. 12, pp. 33 37, Dec. 2007. [3] L. A. Barroso and U. Holzle, The Datacenter as a Computer. San Rafael, CA: Morgan and Claypool, 2009. [4] M. Bertero and P. Boccacci, Introduction to Inverse Problems in Imaging.. London, U.K.: Institute of Physics, 1998. [5] O. Breitenstein, W. Warta, and M. Langenkamp, Lock-In Thermography: Basics and Use for Functional Diagnostics of Electronic Components, 2nd ed. Berlin, Germany: Springer Verlag, 2010. [6] D. Brooks, R. Dick, R. Joseph, and L. Shang, Power, thermal, and reliability modeling in nanometer-scale microprocessors, IEEE Micro, vol. 27, no. 3, pp. 49 62, 2007. [7] D. Brooks, V. Tiwari, and M. Martonosi, Wattch: A framework for architectural-level power analysis and optimizations, in Int. Symp. on Computer Architec., 2000, pp. 83 94. [8] Y. Cheng, P. Raha, C. Teng, E. Rosenbaum, and S. Kang, ILLIADS-T: An electrothermal timing simulator for temperature-sensitive reliability diagnosis of CMOS VLSI chips, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 17, no. 8, pp. 668 680, Aug. 1998. [9] R. Cochran, A. Nowroz, and S. Reda, Post-silicon power characterization using thermal infrared emissions, in Int. Symp. on Low Power Electron. and Design, 2010, pp. 331 336. [10] R. Cochran and S. Reda, Spectral techniques for high-resolution thermal characterization with limited sensor data, in Des. Autom, Conf,, 2009, pp. 478 483. [11] R. Cochran and S. Reda, Consistent runtime thermal prediction and control through workload phase detection, in Des. Autom, Conf,, 2010, pp. 62 67. [12] A. Coskun, T. Rosing, and K. Gross, Utilizing predictors for efficient thermal management in multiprocessor SoCs, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. and Syst,, vol. 28, no. 10, pp. 1503 1516, Oct. 2009. [13] A. K. Coskun, T. S. Rosing, and K. C. Gross, Proactive temperature management in MPSoCs, in Int. Symp. on Low Power Electronics and Design, 2008, pp. 165 170. [14] K. Etessam-Yazdani, M. Asheghi, and H. Hamann, Investigation of the impact of power granularity on chip thermal modeling using white noise analysis, IEEE Trans Compon. Packag. Technol., vol. 31, no. 1, pp. 211 215, Mar. 2008. [15] H. Hamann, A. Weger, J. Lacey, Z. Hu, and P. Bose, Hotspot-limited microprocessors: Direct temperature and power distribution measurements, IEEE J. Solid-State Circuits, vol. 42, no. 1, pp. 56 65, Jan. 2007. [16] P. C. Hansen, Discrete Inverse Problems: Insight and Algorithms 2010, Society for Industrial and Applied Math. [17] C. Hsieh, C. Wu, F. Jih, and T. Sun, Focal-plane-arrays and CMOS readout techniques of infrared imaging systems, IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 4, pp. 594 605, Aug. 1997. [18] W. Huan, M. R. Stan, K. Sankaranarayanan, R. J. Ribando, and K. Skadron, Many-core design from a thermal perspective, in Des. Autom. Conf., 2008, pp. 746 749. [19] W. Huang, K. Skadron, S. Gurumurthi, R. J. Ribando, and M. R. Stan, Differentiating the roles of IR measurement and simulation for power and temperature-aware design, in Int. Symp. on Perform. Anal. of Syst. Software, 2009, pp. 1 10.

REDA: THERMAL AND POWER CHARACTERIZATION OF REAL COMPUTING DEVICES 87 [20] H. Jung and M. Pedram, A stochastic local hot spot alerting technique, in Asia and South Pacific Des. Autom. Conf., 2008, pp. 468 473. [21] T. Kemper, Y. Zhang, Z. Bian, and A. Shakouri, Ultrafast temperature profile calculation in IC chips, in THERMINIC, 2006, pp. 133 137. [22] F. Koushanfar and A. Mirhoseini, A unified framework for multimodal submodular integrated circuits trojan detection, IEEE Trans. Inf. Forensic Sec., vol. 6, 2011, to be published. [23] A. Kumar, L. Shang, L. Peh, and N. Jha, System-level dynamic thermal management for high-performance microprocessors, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 27, no. 1, pp. 96 108, Jan. 2008. [24] A. Kumar, L. Shang, L.-S. Peh, and N. K. Jha, HybDTM: A coordinated hardware-software approach for dynamic thermal management, in Des. Autom. Conf., 2006, pp. 548 553. [25] E. Kursun and C.-Y. Cher, Variation-aware thermal characterization and management of multi-core architectures, in IEEE Int. Conf. on Comput. Des., 2008, pp. 280 285. [26] B. Lee, K. Chung, B. Koo, and N. Eum, Thermal sensor allocation and placement for reconfigurable systems, Trans. Des. Autom. Electron. Syst., vol. 4, no. 41, pp. 50:1 23, 2009. [27] K.-J. Lee and K. Skadron, Using performance counters for runtime temperature sensing in high-performance processors, in Proc. Workshop on High-Perform., Power-Aware Computing, 2005, p. 232.1. [28] P. Li, L. Pileggi, M. Asheghi, and R. Chandra, IC thermal simulation and modeling via efficient multigrid-based approaches, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 25, no. 9, pp. 1763 1776, Sep. 2006. [29] S. Lin, G. Chrysler, R. Mahajan, V. De, and K. Banerjee, A selfconsistent substrate thermal profile estimation technique for nanoscale ICs Part I: Electrothermal couplings and full-chip package thermal model, IEEE Trans. Electron Devices, vol. 54, no. 12, pp. 3342 3350, Dec. 2007. [30] S. Lin, G. Chrysler, R. Mahajan, V. De, and K. Banerjee, A selfconsistent substrate thermal-profile-estimation technique for nanoscale ICs Part II: Implementation and implications for power estimation and thermal management, IEEE Trans. Electron Devices, vol. 54, no. 12, pp. 3351 3360, Dec. 2007. [31] S.-C. Lin and K. Banerjee, Cool chips: Opportunities and implications for power and thermal management, IEEE Trans. Electron Devices, vol. 55, no. 1, pp. 245 255, Jan. 2008. [32] E. Macii, M. Pedram, and F. Somenzi, High-level power modeling, estimation, and optimization, IEEE Trans Comput. Aided Des. Integr. Circuits Syst., vol. 17, no. 11, pp. 1061 1079, Nov. 1998. [33] S. Memik, R. Mukherjee, M. Ni, and J. Long, Optimizing thermal sensor allocation for microprocessors, IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 27, no. 3, pp. 516 527, Mar. 2008. [34] F. J. Mesa-Martinez, E. Ardestani, and J. Renau, Characterizing processor thermal behavior, in Architectural Support for Programming Languages and Operating Systems, 2010, pp. 193 204. [35] F. J. Mesa-Martinez, M. Brown, J. Nayfach-Battilana, and J. Renau, Measuring performance, power, and temperature from real processors, in Int. Symp. on Comput, Architec., 2007, pp. 1 10. [36] R. Mukherjee and S. Memik, Systematic temperature sensor allocation and placement for microprocessors, in Des. Autom. Conf., 2006, pp. 542 547. [37] R. Mukherjee, S. Mondal, and S. O. Memik, Thermal sensor allocation and placement for reconfigurable systems, in Int. Conf. on Comput. Aided Des., 2006, pp. 437 442. [38] A. N. Nowroz, R. Cochran, and S. Reda, Thermal monitoring of real processors: Techniques for sensor allocation and full characterization, in Des. Autom. Conf., 2010, pp. 56 61. [39] A. N. Nowroz and S. Reda, Thermal and power characterization of field-programmable gate arrays, in Proc. ACM/SIGDA Int. Symp. on Field Programmable Gate Array, 2011, p. N/A.. [40] M. Pedram and S. Nazarin, Thermal modeling, analysis, and management in VLSI circuits: Principles and methods, Proc. IEEE, vol. 94, no. 8, pp. 1487 1501, Aug. 2006. [41] E. Pop, S. Sinha, and K. Goodson, Heat generation and transport in nanometer-scale transistors, Proc. IEEE, vol. 94, no. 8, pp. 1587 1601, Aug. 2006. [42] M. Powell, A. Biswas, J. Emer, and S. Mukherjee, CAMP: A technique to estimate per-structure power at run-time using a few simple parameters, in Int. Symp. on High Perf. Comput. Architec., 2009, pp. 289 300. [43] S. Reda, R. Cochran, and A. N. Nowroz, Improved thermal tracking for processors using hard and soft sensor allocation techniques, IEEE Trans. Computers, vol. 60, no. 6, pp. 841 851, Jun. 2011. [44] A. Rogalski and K. Chrzanowski, Infrared devices and techniques, Opto-Electron. Rev., vol. 10, no. 2, pp. 111 136, 2002. [45] E. Rotem, J. Hermerding, C. Aviad, and C. Harel, Temperature measurement in the Intel core duo processor, in Proc. Int. Workshop on Thermal Investigations of ICs, 2006, pp. 23 27. [46] A. Shakouri, Nanoscale thermal transport and microrefrigerators on a chip, Proc. IEEE, vol. 94, no. 8, pp. 1613 1638, Aug. 2006. [47] G. Spirakis, Designing for 65 nm and beyond: Where is the revolution?, in Electron. Des. Process (EDP) Symp., 2005. [48] C. R. Vogel, Computational Methods for Inverse Problems. Philadelphia, PA: SIAM, 2002. [49] S. Wilton and N. P. Jouppi, CACTI: An enhanced cache access and cycle time model, IEEE J. Solid-State Circuits, vol. 31, no. 5, pp. 677 688, May 1996. [50] I. Yeo, C. Liu, and E. Kim, Predictive dynamic thermal management for multicore systems, in Des. Autom. Conf., 2008, pp. 734 739. [51] P. Yiannacouras, J. Rose, and J. G. Steffan, The microarchitecture of FPGA-based soft processors, in Int. Conf. on Compilers, Architec. Synthesis for Embedded Syst., 2005, pp. 202 212. [52] Y. Zhang, A. Srivastava, and M. Zahran, On-Chip sensor-driven efficient thermal profile estimation algorithms, ACM Trans. Des. Autom. Electron. Syst., vol. 15, no. 3, p. 25:1, 2010. Sherief Reda received the Ph.D. degree in computer engineering from University of California, San Diego, in 2006. He is an Assistant Professor at the School of Engineering, Brown University, Providence, RI. His research interests include physical design and management of computing systems, variability modeling and yield improvement techniques for planar and 3-D integrated circuits and reconfigurable computing. Dr. Reda served as a member of the technical program committee for a number of IEEE/ACM conferences including: DAC, ICCAD, ASPDAC, DATE, GLSVLSI, and SLIP. He has received four best paper nominations and two Best Paper Awards at DATE 02 and ISLPED 10, and also has rreceived an NSF CAREER award.