Microprocessor Floorplanning with Power Load Aware Temporal Temperature Variation

Size: px
Start display at page:

Download "Microprocessor Floorplanning with Power Load Aware Temporal Temperature Variation"

Transcription

1 Microprocessor Floorplanning with Power Load Aware Temporal Temperature Variation Chun-Ta Chu, Xinyi Zhang, Lei He, and Tom Tong Jing Department of Electrical Engineering, University of California at Los Angeles, CA, {chunta, zxy, lhe, ABSTRACT Traditional microprocessor floorplanning only considers area and wire length. Recent research has started to consider performance optimization due to interconnect pipelining and thermal hotspot caused by increased power density. However, there is no efficient approach on microprocessor floorplanning considering application dependent power load for performance and thermal optimization. This paper studies microprocessor floorplanning considering thermal and throughput optimization. We first develop a stochastic heat diffusion model taking into account the application dependent power load for thermal analysis. Then, we design the floorplanning algorithm based on this model. Experimental results show that, compared with the deterministic heat diffusion model, our model obtains up to 3.2 o C reduction of the on-chip peak temperature, 1.29% reduction of the area, and 1.13x better CPI (cycles per instruction) performance, respectively. Compared with temperature aware floorplanning in the HOTSPOT tool set that ignores interconnect pipelining, our algorithm is up to 27x faster, reduces the peak temperature by up to 3 o C, and also reduces CPI significantly with a negligible area overhead. We also use our thermal model to analyze the floorplanning for dual core microarchitecture. The results show that optimizing the floorplan over two cores simultaneously can further reduce temperature by 3.36 o C with slightly better CPI performance. 1. INTRODUCTION Traditional microprocessor floorplanning only considers area and wirelength, which is under the assumption that micro architecture optimization decides the average cycles per instruction (CPI) and therefore system throughput, and floorplanning determines wirelength and clock rate. Due to the increasing clock rate, the interconnect delay may become longer than the clock period. As a result, interconnect pipelining becomes a must and affects CPI as well [1] [2]. Several recent studies [1] [3] [4] optimized microprocessor floorplanning on area and CPI considering interconnect pipelining. As devices keep shrinking, the decreasing rate of power consumption in a chip cannot catch up with the shrinking chip size. As a result, the power density within a chip keeps increasing, which leads to higher temperature and higher possibility to degrade performance and chip reliability [5]. Floorplanning is the design stage with the largest impact on This paper is partially supported by NSF CAREER award CCR / and a UC MICRO grant sponsored by Altera and Intel. Address comments to lhe@ee.ucla.edu. chip temperature. Therefore, the impact of thermal effects should be considered in the floorplanning phase, which has not been done in the above studies. Thermal aware placement has been studied for several years [6] [7] [8] [9]. Different methods were used to make the thermal distribution of standard cells smooth, which makes the temperature distribution quasi-uniform in one module. However, even though the thermal distribution across one module is smooth, without thermal aware floorplan, it is still possible that the temperature gradient between modules is quite large since different modules may have different power density and thus different temperature distribution. Putting modules with high power density together may lead to higher temperature in those modules since the horizon heat diffusion is littler across those modules. As a result, an improper floorplan may lead to large temperature gradient across the modules and thus causes hotspots within a chip. Several recent developments in microprocessor foorplanning have started to consider thermal modeling and thermal aware optimization. In order to estimate the temperature impact on a floorplan, it is straightforward to calculate the temperature to estimate the impact. Given a microarchitecture, [10] first obtained the average power over all benchmarks for each block. For each new floorplan, it used HOTSPOT [11] to model the whole package as a thermal RC network with power as branch current and temperature as node voltage and calculate the peak steady-state temperature with average power as input. As a result, the objective then consists of the area, CPI, and the peak steady-state temperature. [12] further considered transient power and the dependency between power and CPI with consideration of interconnect pipeline for floorplanning. Besides, for each new floorplan, the input is not just average power but transient power over time. In this case, the objective consists of not only the area and CPI but also the peak temperature and the average temperature over time. One major drawback of the aforementioned work is that the temperature is calculated for new floorplan, which is time-consuming. [12] had even longer running time than [10] since [12] applied transient power instead of the average power. Considering this, [13] proposed a simple deterministic heat diffusion model to model the lateral heat diffusion between modules to avoid directly calculating temperature. However, this heat diffusion model is too simplified to guarantee a good solution. Also, [13] just used total wire length instead of CPI in the objective. More importantly, all the above studies do not consider the application dependent power load and its induced temperature variation. As shown in Fig. 1[14], given

2 a micro-architecture with different applications, the temperature distribution across the chip could be totally different. It is not clear how to extend the above thermal aware microprocessor floorplanning to handle application dependent power load. Figure 1: Temperature distribution for different applications In this paper, we develop an accurate, yet efficient thermalaware floorplanning considering the correlation between power consumptions for different micro-architecture modules and for different microprocessor applications. Instead of calculating temperature directly during floorplanning, we develop a stochastic heat diffusion model with consideration of block geometry and the above power correlation. We apply this model to the iterative-improvement-based floorplanning for thermal optimization, and also simultaneously optimize throughput using the trajectory piecewise linear model (TPWL) for pipelined interconnects developed in [3]. Multi-core micro-architecture has been studied for several years because of less area and high efficiency compared with single-core design. However, the thermal effect on floorplanning is only studied based on one core [15] [16]. In this work, we compare the thermal effect and performance on dual-core floorplan, and show the difference if the design considers thermal effect. The rest of this paper is organized as follows. Section 2 introduces the background of floorplanning, microprocessor performance, and the relation between power and temperature. Section 3 reviews the deterministic heat diffusion model, points out its shortcomings, and then presents our stochastic heat diffusion model. Section 4 summarizes the flow of floorplanning, experiment settings, and experimental results. We conclude this work in Section BACKGROUNDS AND PROBLEM FORMULATION 2.1 Floorplanning Floorplanning determines the locations and the shapes of the modules on chip subject to the minimization of a cost function. Traditionally, the cost function considers the chip area and total wire length. However, floorplan in the microarchitecture level has to consider not only the area and the total wire length but also CPI. Moreover, in order to consider the impact of thermal interaction between any two modules for different floorplans, we have to take into consideration the thermal estimation. Therefore, the objective function in our floorplan algorithm consists of area, CPI, and thermal effect as follows. Area CP I W area + W CP I + Area norm CP I norm thermal W thermal (1) thermal norm where W area, W CP I, and W thermal are the weights for corresponding constraints and can be adjusted by different goal, Area norm, CP I norm, and thermal norm are terms for normalization. Most current floorplanning solvers are based on simulated annealing (SA) algorithm [17] [18], which is also used in this paper. The SA starts with an arbitrary initial floorplan and changes the floorplan to a new one in the next iteration by either changing the positions or the shapes of a small set of modules. During each iteration, the cost is calculated by using the cost function. The new floorplan will be accepted either if the cost is smaller than the current best case or with a probability based on the annealing temperature. If the number of iteration is large enough, SA is able to find a high-quality solution. 2.2 Microprocessor Performance Microprocessor performance is measured by CPI (for a given clock rate) or millions instructions per second (MIPS). In this paper, we use the metric of CPI for a given clock rate under a fixed clock rate, which means the lower the CPI, the better the performance. The traditional objective of floorplanning is to minimize interconnect delay and obtain the targeted clock rate. Due to the ever increasing clock rate, interconnect delay may become longer than one clock cycle. In this case, interconnect pipelining is a must and it affects CPI. Note that floorplanning is the primary design stage to decide interconnect length and the need of interconnect pipelining. CPI with interconnect pipelining obtained by micro-architecture level simulation is time-consuming. In this paper, we apply efficient TPWL model from [3]. It starts with a traditional floorplan with objectives of area and wire length to sample the bus latency vector that is the latencies between modules. A CPI table is then generated by using cycle-accurate simulation tool, SimpleScalar [19]. CPI under a new floorplanning is approximated by CPI values from the CPI table and TPWL. 2.3 Power and Temperature To simulate the power consumption for a given microarchitecture, we use PTscalar [20] that is a cycle-accurate micro-architecture-level performance and power simulator for SuperScalar architectures. We get the power consumption for all modules with simulation from SPEC2000 [21] benchmarks. Given a micro-architecture floorplan and power consumption input, HOTSPOT [11] models the whole package shown in Fig. 2 as a thermal RC network [22] with power as a branch current and temperature as a node voltage. In the flip-chip package design as shown in Fig. 2, the power (heat) generated by modules flow both horizontally to the adjacent modules or to the adjacent heat spreader and vertically down to the PCB and up to the heat spreader and then to heat sink. Without the horizontal flow, vertical flow directly dissipates heat from the modules and thus lower the temperature of modules. However, because of the horizon-

3 tal flow between modules, the relative locations of modules may affect the final temperature of each module and thus generate a temperature gradient. Also, different applications with the same floorplan may have completely different power consumptions and thus different temperature gradients. Fig. 1 shows such an example. Therefore, because of lateral heat flow, we can arrange the locations of different modules to make the temperature distribution across the chip smooth, which means the variance of temperature for modules becomes smaller. During each new floorplan, the transient power for each module may also change since the change of wire length between modules affects CPI and CPI affects transient power. Also, different floorplan leads to different temperature, which affects leakage power. In this work, we assume the power is invariant over different floorplan for modules and different temperature but our work can be easily extended to consider leakage, temperature, floorplanning interdependency [23]. Figure 2: Thermal RC model of flip-chip package In order to make the temperature distribution across the die smooth, previous work [10] [12] used HOTSPOT [11] to directly calculate temperature during each iteration in SA and to evaluate the cost function since the goal is to minimize the peak temperature and mean temperature across the die. The objective function with calculating temperature then becomes Area CP I W area + W CP I + Area norm CP I norm W T emp (T avg + T max) T norm (2) The major drawback is that the computation time is long in SA flow. For example, considering temperature effect, it took around forty minutes to complete the whole flow [10], while it only took less than five minutes to complete the flow without considering the temperature effect. Therefore, directly calculating temperature is not efficient for thermalaware micro-architecture floorplanning. We ll describe how to accurately and efficiently model the thermal effect in the next section. 3. STOCHASTIC THERMAL-AWARE FLOORPLANNING In this section, we first describe the deterministic heat diffusion model and point out why it is inaccurate based on four observations. We then show how to build an accurate stochastic heat diffusion model based on those observations. 3.1 Deterministic Heat Diffusion Modeling Deterministic model definition [13] improved the runtime in thermal-aware floorplanning by using heat diffusion model to avoid directly calculating temperature. The concept is that the temperature of an isolated module linearly depends on its power density. Modules with high power density tend to have high temperature as a result. Therefore, in order to lower the temperature of modules with high power density, it is better to make modules with low power density adjacent to those modules to help to dissipate the heat. The temperature of modules with high power density then becomes lower because of large heat diffusion to the modules with low power density. As a result, in order to lower the temperature of modules with high power density, the more modules with lower power density adjacent to the modules with high power density, the smoother the temperature distribution is across the chip. The heat diffusion between two adjacent modules M i and M j can be represented as h(m i, M j) = (P Di P Dj) shared length ij (3) where P Di and P Dj are the average power density over time for modules M i and M j, shared length ij is the shared length between modules M i and M j. The total heat diffusion for module i is X H i = h(m i, M j) (4) j adjacent to i Since the potentially hottest modules are limited, considering the total heat diffusion for those potentially hottest modules can be a good estimation. The total heat diffusion becomes HeatDiff = X i H i (5) The objective function with heat diffusion becomes Area CP I W area + W CP I + Area norm CP I norm HeatDiff W thermal (6) HeatDiff norm Although the heat diffusion is a good representation to estimate the lateral heat flow, this model is too simplified to guarantee finding a good floorplan considering thermal effect since it ignores other factors described below Shortcomings First of all, given a micro-architecture and a series of applications, transient power may fluctuate a lot over time for one module since different applications have different work load on that module and thus the transient power over time is quite nonuniform. Also, power vectors over time for two modules may be either positively or negatively correlated. Using average power may underestimate the peak temperature for positively correlated modules since transient temperature is higher when both modules have high transient power consumption as shown in Fig. 3, where Fig. 3(a) and Fig. 3(b) have the same average power but the difference

4 positively correlated negatively correlated Tmax Tmin Figure 3: Temporal correlation between M1 and M2, (a) positively correlated and (b) negatively correlated. M1 has a higher transient temperature in (a) than in (b), although the average power is same. M1 M2 M3 Tmax Figure 5: The temperature of M2 is higher than that of M1 and M3 of the peak temperature and that of the lowest temperature are around 5 O C, which can not be ignored. Here a real case from our simulation is shown in Fig. 4. In Fig. 4(a), it shows that the correlation between LSQ and IALU4 is almost perfect, while in Fig. 4(b) RUU and F P Add have no correlation. As a result, without considering the correlation of power consumptions between modules, the maximal temperature is underestimated. power/watt power over time LSQ IALU cycle x 10 4 power/watt power over time RUU FPAdd cycle x 10 4 Figure 4: Different styles of lines for different modules: (a)lsq and IALU4 and (b)ruu and F P Add Second, the module next to the border of a die has extra heat flow to the heat spreader or the ambient. Fig. 5 shows that when three modules have similar power consumptions, temperature of the two modules on the side is lower than that of the module between them. The temperature difference is around 0.5 O C in this case and the difference may vary according to the design of the package. Third, since there is no power consumption for any dead space, the heat diffusion from one module to the dead space (shadow in Fig. 6) is much larger than that from one module to another module. As shown in Fig. 6, M1 has the highest power density and M1 has a lower peak temperature (92.96 O C) in Fig. 6(b) compared with that (94.2 O C) in Fig. 6(a) since the dead space can remove more heat than other modules do. In the floorplanning, thermal effect may try to move modules separately as far as possible while the constraint on the area attempts to minimize the total dead space. As a result, the weights of area and temperature constraints have to be adjusted for balance. Fourth, the heat diffusion between two modules is pro- (a) (b) Tmax Tmin Figure 6: Dead space effect: M1 has lower temperature in (b) than in (a) portional to their shared length since the lateral thermal conductance is proportional to the width of the module. However, the area of each adjacent module should also be considered. Given four modules M1, M2, M3, and M4 with power density P D1 > P D4 > P D2 = P D3 in Fig. 7, M1 may have a lower temperature with adjacency of M3 since M1 can diffuse more heat to M3 than M2, which suggests the heat diffusion should consider not only the shared length but also the depth of the adjacent module. As shown in Fig. 7, the difference of the peak temperature is over 1 O C, and the temperature gradient in Fig. 7(a) is sharper than that in Fig. 7(b). However, if the depth of one adjacent module is too large, this may lead to wrong estimation. The reason is that another possible floorplan candidate with an adjacent module having lower power density may still be rejected because the depth of the adjacent module is much smaller than the current one. Considering this, we can predefine a penetration window to enclose the target module. If the depth of the adjacent module is across the window, we only consider the module inside the window. On the other hand, if the adjacent module is inside the window, we take into account the modules adjacent to it as well. For example, in Fig. 7(a), M2 is inside the red window (dash line), we have to consider the area of M4 inside the red window as well. Similarly, since M3 crosses the window, we only consider the area of M3 inside the window. Details will be described in the next section. Finally, considering several hottest modules, just summing their heat diffusion may not gurantee a good solution. For example, module A and module B are the two hottest modules, and P DA > P DB. After running the SA algorithm, we have H A = H B C. This is not the optimal solution since H A = C + D and H B = C D, where C > D > 0, can give a better solution because of more heat flow for module A as shown in Fig. 8. Considering this effect, we should use the weighted sum of the heat flow for those hottest modules

5 (a) (b) Tmax Tmin Figure 7: M1 has lower temperature in (b) since M3 and M2 have same power density but M3 is larger than M2 to reduce peak temperature more effectively. where L ij is the shared length between modules M i and M j, C ij is the shared length between module M i and dead space N j. The heat diffusion vector to the border is f(b i) = P Di B i Con lateral Con adjacent (11) where B i is the shared length between module M i and the border of the die, Con lateral and Con adjacent are the unit lateral conductance between the heat spreader and module M i and between two adjacent modules, respectively, both of which can be calculated according to [11]. The standard deviation of the total heat diffusion for module M i is σ i = sqrt(e((h i adj + H i dead + f(b i)) 2 ) (E(H i adj + H i dead + f(b i))) 2 ) (12) The stochastic heat diffusion model for module M i is H i = E(H i adj ) + E(H i dead ) + E(f(B i)) + 3 σ i (13) Figure 8: Module deserves more heat flow due to higher power density 3.2 Stochastic Modeling Stochastic heat diffusion model Since directly calculating temperature is time-consuming and the deterministic heat diffusion model is not accurate enough, here we propose an accurate and efficient stochastic heat diffusion model based on the observations in Subsection Given a micro-architecture floorplan with m modules, n dead spaces, and power vector P i = [p i1,..., p it ] over T time steps for module M i, 1 i m. The mean power density P Di for module M i is P Di = E(P Di) = 1 A i 1 T TX p ij (7) j=1 where A i is the area for module M i, P Di is the transient power density vector, which equals P i A i. In this paper, E(X) is the expectation value of vector X. The power density covariance between any two modules M i and M j is cov(p Di, P Dj) = E(P Di P Dj) P Di P Dj (8) Given x adjacent modules, y adjacent dead spaces, and a penetration window size W L, the heat diffusion vector to the adjacent modules H i adj and to the adjacent dead spaces H i dead for module M i are defined as follows, respectively. H i adj = H i dead = xx (P Di P Dj) L ij (9) j = 1 yx P Di C ij (10) j = 1 where the first two terms are the mean heat diffusion to the adjacent modules and dead space, respectively, the third term is the mean heat diffusion to the lateral heat spreader, and 3σ i is the term for the correlation impact approximated by Equation (12). The larger the standard deviation between modules is, the smaller the correlation is. If dead space N j or module M j are totally inside the penetration window, we have to consider other modules that are partially inside the window. Then P Dj is modified to P Dj = P K k = 1 P Dk D k (K k + 1) P k k = 1 D k (K k + 1) (14) where K is level number between the target block and the window, the level contacting M i is level 1 and the level contacting the window is level K. P Dk is the average power density in level k. D k is the depth of each level k. In Fig. 9, the red window (dash line) defines the modules involved, and the blue (slash) one defines the modules to calculate modified P D1. The modified P D1 is derived from modules M 1, M 2, and M 3, and the first belongs to level 1 and the second and the third belong to level 2. Also, the power density of level 1 ( P D1) is just P D1 and the power density of level 2 ( P D2) is composed of P D2 and P D3. Figure 9: Illustration of calculation of modified P Dj Note that Equation (12) can be easily calculated once we calculate equations (7) and (8) that are pre-calculated during one-time cycle-accurate micro-architecture simulation for performance and power before entering SA algorithm. Therefore, there is no significant runtime overhead.

6 Considering Z potential hottest modules, the total stochastic heat flow then becomes ZX Stochastic HeatDiff = W i Hi (15) i = 1 where W i is the weight proportional to P Di If the heat diffusion for module M i is positive, it means the total net heat diffusion flows outward. Similarly, negative value means the total net heat diffusion flows inward to module M i. As a result, the larger the heat diffusion, the more heat can be diffused to the adjacent modules with lower power densities and thus the more the temperature can be lowered Hierarchal clustering We only have to consider the heat diffusion for several potential hottest modules and we use K-mean clustering algorithm to find the right number of potential hottest modules. In this paper, the objective is to minimize variance V ar V ar = kx i = 1 X P Dj S i ( P Dj µ i ) 2 (16) where P Dj is power density of module M j and µ i is the average power density within cluster S i. We use a hierarchical K-mean clustering to find the potential hottest modules. First we set a threshold such as 30% of total modules as the maximum number we have to consider. Then we run K-mean to cluster modules into two clusters and perform the same procedure to the cluster with the higher power density. The above recursive procedure stops when the number of modules in the recursively refined cluster with the higher power density is less than the threshold. Using this hierarchical method, we can find the optimal number to be considered in the calculation of total heat diffusion. 3.3 Floorplanning with Stochastic Thermal Model In Equation (6), the thermal term is deterministic without considering correlations of power densities between modules and other effects. We propose the new objective function as Area CP I W area + W CP I + Area norm CP I norm Stochastic HeatDiff W thermal (17) HeatDiff norm where we replace HeatDiff by Stochastic HeatDiff, which is calculated from Sub-section and Sub-section ALGORITHM FLOW AND EXPERIMENTS In this section, we compare the floorplanning based on our proposed model with [13] and [10]. Moreover, we optimize the overall floorplan on the dual core micro-architectures as well as only on one core to study the impact. The experiment setting is described below. 4.1 Flow and Experiment Setting Similar to [3], we assume two SuperScalar processors for both 90nm and 65nm technologies. The settings are shown in Table 1. We treat the blocks as soft and the aspect ratio is between 0.33 and 3 and L2 is partition into three modules. Table 1: Settings in 90nm and 65nm technology. 90nm 65nm Issue Width 4 8 Die Area(mm 2 ) Die Thickness(mm) 0.5 Heat Spreader(mm 2 ) Heat Sink(mm 2 ) Functional Units 3 Integer ALU 6 Integer ALU 1 Integer Mult 2 Integer Mult 1 FP Adder 2 FP Adder 1 FP Mult 2 FP Mult Register Update Unit 64 Instructions 128 Instructions Load Store Queue 32 Instructions 32 Instructions Fetch Queue 8 Instructions 16 Instructions Clock Frequency 3GHz 6GHz FF Insertion Length 2000 um 707 um Blocks composed of multiple modules are the RUU block including Register Update Unit and Load Store Queue, Decode block including Fetch Queue and the Decoder, Branch block including Fetch Unit and Branch Predictor, DL1 block including the Level 1 Data Cache and the DTLB, and the last IL1 block including the Level 1 Instruction Cache and the ITLB. The L2 unified cache and all functional units are treated as independent blocks. The block area is summarized in Table 2. Table 2: Area of logical block in 90nm and 65nm technology. Technology 90nm 65nm Block Area (mm 2 ) 90nm Area (mm 2 ) 65nm IALU IMULT FPADD FPMULT RUU Decode Branch L IL DL The length of interconnect between two blocks in Table 2 are computed according to the Manhattan distance between the centers of two blocks. We view the latency of each such interconnect as an independent variable. Changing the latency of one of these interconnects is an effective change in the micro-architecture and may impact performance. In Table 3, we specify these interconnects with respect to their terminal blocks. We apply TPWL model [3] as our CPI model. Given the architecture configuration in Table 1, we use PTscalar [20] to simulate the power consumption for four integer applications bzip2, gcc, gzip, and mcf and three floating applications art, equake, and mesa in SPEC2000 [21]. With these power vectors, we calculate the mean power density (w/mm 2 ) and standard deviation for each module. Fig. 10 shows the correlation matrix for 90nm technology, which is acquired from the sampling power vectors from all

7 Table 3: Buses that potentially affect CPI Bus ID Terminal blocks Bus ID Terminal blocks Area 1 IALU,RUU 6 IL1,IL2 2 IMULT,RUU 7 DL1,L2 3 FPAdd,RUU 8 Branch,IL1 4 FPMul,RUU 9 Decode,Branch 5 LSQ,DL1 10 Decode,RUU Number Block Name Description 1 decode Instruction Decoder 2 branch Branch Predictor 3 RAT Rename Table 4 RUU Register Updata Unit 5 LSQ Load Store Queue 6 IALU1 Integer Adder 1 7 IALU2 Integer Adder 2 8 IALU3 Integer Adder 3 9 IntReg Integer Register 10 IL1 Level 1 Instruction Cache and ITLB 11 DL1 Level 1 Data Cache and DTLB 12 IALU4 Integer Multiplier 13 FPAdd Floatpoint Adder 14 FPMul Floatpoint Multiplier 15 FPReg Floatpoint Register 16 L2 left Part of Level 2 Cache 17 L2 right Part of Level 2 Cache 18 L2 bottom Part of Level 2 Cache Figure 10: temporal correlation matrix of power consumption modules. In order to get the real correlation, we only count those sampling points in active mode since in sleep mode, every module consumes only static power. If we consider those sampling points, there is big error in estimating the correlation. As shown in Fig. 10, we can roughly partition all modules into three groups, the first group is from Decode(1) to DL1(11), the second group is IALU4(12), which does not have strong correlation to any other module, and the last one is from FPAdd(13) to L2 right(18). Modules in the same group are highly positive correlated and the correlations between modules in the different groups are either uncorrelated or negative correlated. We use SA-based PARQUET [18] as our base floorplan solver combined with the CPI model [3] and our stochastic heat diffusion model. We run the experiments on a Linux workstation. After completing the whole flow with different objectives, HOTSPOT [11] is used to calculate the temper- Figure 11: Whole flow of thermal-aware floorplanning using TPWL to calculate CPI ature for verification purpose only. Also, for each objective, we run ten iterations to acquire the best case and the average case due to SA algorithm. The whole flow of our work is summarized in Fig Comparison between Thermal Models We compare our stochastic heat diffusion model (SHDM) with [10], which calculates the max temperature for every iteration in SA to estimate the cost for each new floorplan. We run seven benchmarks as described in Section 4.1. The objective function is area and thermal effect with weight 0.6 and 0.3, respectively. Table. 4 summarizes the final result. WS means white space (dead space) percentage in the total area. For example, (4.3%) means the total area is and the white space percentage is 4.3%. From the table, SHDM can reduce T max by up to 3 o C (93.0 o C o C) (3.2%) with a 1.34% increase in the area. The above results show our model is quite accurate while it is 27x faster for 90nm processor and 19x faster for 65nm processor. 4.3 Comparison between Different Objectives In this section, we compare the thermal impact on the floorplan with different objectives and we also compare the results with [13]. We use the same setting as in Section 4.2 with different objective function. We summarize the results in Table 5. Since our model is stochastic, we denote our model with objectives area, CPI, and heat diffusion as ACHs and [13] with the same objectives as ACW Hd. The

8 Table 4: Comparison between stochastic heat diffusion model and [10] 90nm 65nm Tmax ( o C) Area (mm 2 ) (WS) Runtime (s) Tmax ( o C) Area (mm 2 ) (WS) Runtime (s) [10] (4.7%) (4.3%) 2980 SHDM (5.6%) (5.8%) 155 impact -3.2% +1.34% 1/27x -0.2% +1.03% 1/19x weight for area, CPI, and heat diffusion is 0.6, 0.3, and 0.2, respectively, in the objective function. We first compare the results with or without considering thermal effect based on our stochastic model. As shown in Table 5, considering the thermal effect with objective AC in the best case, the maximal temperature is reduced by 9.1% from 97.7 o C to 88.8 o C for 90nm and by 5.4% from o C to 97.2 o C for 65nm with negligible area overhead and increase of CPI up to 7.3%. Clearly, there is a trade off between lowering the temperature and reducing CPI, the reason is that to lower temperature, two hot modules with critical bus or wire have to be separated as far as possible, but which also means CPI or wire length increase. However, the huge amount of reduction in temperature with little decrease in performance is worthy since lower temperature implies better clock rate and clock reliability. The work in [13] used simple heat diffusion model without considering correlation, dead space, border effect and the geometry of the modules, which produced up to 2.9% area overhead and up to 21.3% increase of CPI with similar or less temperature reduction compared with our model, which shows that ours is more accurate and robust. Although our runtime is longer than that in [13] as shown in Table 6, since the run time is just less than a few minutes, it does not have much practical impact. AR (aspect ratio) in Table 6 means the ratio of length and width of the final floorplan. 4.4 Dual Core Micro-architecture In this section, we compare the thermal impacts on the dual-core floorplan for 90nm technology. We assume both cores run the same application and both cores have the same micro-architecture as described in Section 4.1. When optimizing is only on one core, we define the aspect ratio to be two to make the aspect ratio be one after combing two cores. In this experiment, the weight for area, CPI, and thermal is 0.5, 0.6, and 0.3, respectively. In Table 7 and Table 8, AC Separate and ACHs Separate mean the floorplan is optimized only on one core while AC Mixed and ACHs Mixed mean the floorplan is optimized over two cores. Fig. 12, 13, 14, and 15 show the final floorplan for different objectives. Modules that are named * 1 (i.e., 1 at the end) with blue color (dark) belong to core one. Modules named * 2 with yellow color (grey) belong to core two. The white space means dead space. As can been seen from the figures, both AC Separate and AC Mixed tend to cluster the core modules together to reduce the CPI but AC mixed has smaller CPI than AC Separate as shown in Table 7. Theoretically, AC Mixed should get the minimal CPI or at least close result compared with AC Mixed. The experiments show the similar results. Another thing should be noticed is that both the average peak temperature and the peak temperature in the best case for AC Mixed is lower than that for AC Separate as shown in Table 7. The reason is mainly because modules with low power densities have more chance to be adjacent to the potentially hottest modules in the mixed floorplan. In brief, without considering thermal effect, AC Mixed generally has lower peak temperature with slightly smaller CPI compared with AC Separate. With considering thermal effect, ACHs Separate still has well-clustered core with lower peak temperature compared with AC Separate while the core of ACHs Mixed is spread out with the lowest peak temperature compared with AC Mixed. Moreover, ACHs Mixed obtains better temperature reduction by 3.36 o C (98.57 o C o C) and even 1.02x (0.926/0.910) faster CPI performance compared with ACHs separate. However, CPI increases by 7.5% (best, ( )/0.846) and 6.9% (average, ( )/0.896), respectively, and area increases by 1.7% (best, ( )/240.3) and 1.0% (average, ( )/246.8) compared with floorplan without considering thermal effect, which shows a trade off between thermal and performance as shown in Table 7. In conclusion, floorplan optimized over two cores can slightly improve CPI performance compared with floorplan optimized separately. With consideration of thermal effect, mixed floorplan can further reduce the peak temperature compared with floorplan optimized separately. 5. CONCLUSIONS AND DISCUSSIONS We have proposed a stochastic thermal-aware floorplanning with consideration of micro-architecture level throughput optimization. First, we have convincingly shown that there are correlations between power for modules for different microprocessor applications. Second, considering dead space, border effect, and the geometry of modules, we have developed a stochastic heat diffusion model and implemented this model on microprocessor floorplanning. Compared with the existing floorplanning using deterministic heat diffusion model, our model obtains up to 3.2 o C (92.0 o C-88.8 o C) reduction of the on-chip peak temperature, 1.29% (( )/224.1) reduction of the area, and 1.13x (0.995/0.880) better CPI performance, respectively. Moreover, compared with temperature aware floorplanning in the HOTSPOT toolset that ignores interconnect pipelining, our algorithm is up to 27x faster and reduces the peak temperature by up to 3 o C with a negligible area overhead. We also study the dual core micro-architecture to see the effect if the floorplan is optimized over two cores instead of only on two cores separately. The results show optimizing over two cores can further reduce the temperature by 3.36 o C with slightly better CPI performance compared with optimizing only one core, which provides different aspects on dual core micro-architecture design. Also, this concept may be applied to multi core to obtain a better balance between performance and thermal issues. In the future, we will further apply our stochastic thermal model to deal with rectilinear blocks, 3D stacking, multi core micro-architecture, and other floorplanning methods.

9 Table 5: Comparison of stochastic and deterministic heat diffusion model with different objectives 90nm Obj. CPI Tmax ( o C) Area (mm 2 ) WS (%) Best Avg Best Avg Best Avg Best Avg AC ACHd % +12.4% -5.8% -4.7% +2.9% +2.3% ACHs % +7.2% -9.1% -8.1% +2.2% +0.6% 65nm Obj. CPI Tmax ( o C) Area (mm 2 ) WS (%) Best Avg Best Avg Best Avg Best Avg AC ACHd % +8.9% -5.0% -4.6% +2.89% -1.00% ACHs % +1.8% -5.4% -7.5% +1.56% -0.27% Table 6: Comparison of runtime and aspect ratio (AR) under different objectives 90nm Obj. Runtime (s) AR Best Avg AC ACHd ACHs nm Obj. Runtime (s) AR Best Avg AC ACHd ACHs Table 7: Comparison between mixed and separate dual core floorplan Obj. CPI Tmax ( o C) Area (mm 2 ) WS (%) Best Avg Best Avg Best Avg Best Avg AC Separate AC Mixed % -4.57% -6.43% -2.44% +0.00% -1.32% ACHs Separate ACHs Mixed % -0.72% -3.41% -1.84% -0.77% +0.20% Table 8: Comparison of runtime and aspect ratio (AR) under different objectives for dual core microarchitecture Obj. Runtime (s) AR Best Avg AC Separate AC Mixed ACHs Separate ACHs Mixed REFERENCES [1] M. Ekpanyapong, J. R. Minz, T. Watewai, H. S. Lee, and S. K. Lim, Profile-guided microarchitecture floorplanning for deep submicro processor design, in IEEE/ACM Design Automation Conf., [2] V. Nookala, Y. Chen, D. Lilja, and S. Sapatnekat, Microarchitectur-aware floorplanning using a statitical design of experiemtns approach, in

10 Figure 12: Floorplan: AC Separate (half) Figure 13: Floorplan: AC Mixed IEEE/ACM Design Automation Conf., pp , [3] C. Long, L. Simonson, W. Liao, and L. He, Floorplanning optimization with trajectory piecewise-linear model for pipelined interconnects, in IEEE/ACM Design Automation Conf., [4] A. Jagannathan, H. H. Yang, K. Konigsfeld, D. Milliron, M. Mohan, M. Romesis1, G. Reinman, and J. Cong1, Microarchitecture evaluation with floorplanning and interconnect pipelining, in Asia South Pacific Design Automation Conf., pp [5] J. Srinivasan, S. Adve, P. Bose, and J. Rivers, The impact of technology scaling on lifetime reliability, in Proc. Dependable Systems and Networks, [6] C. Tsai and S. Kang, Cell-level placement for improving substrate thermal distribution, in IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol.19, no.2, pp , [7] C. C. Chu and D. Wong, A matrix synthesis approach to thermal placement, in IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, vol.17, no.11, pp.1166-pp1174, [8] B. Oberneier and F. Johannes, Temperature-aware global placement, in Asia South Pacific Design Automation Conf., pp , [9] G. Chen and S. Sapatnekar, Partition-driven standard cell thermal placement, in Int. Symp. on Physical Design, vol.1, pp.75-80, 2003.

11 Figure 14: Floorplan: ACHs Separate (half) Figure 15: Floorplan: ACHs Mixed [10] K. Sankaranarayanan, S. Velusamy, M. R. Stan, and K. Skadron, A case for thermal-aware floorplanning at the microarchitectural level, in Journal of Instruction-Level Parallelism, [11] K. Skadrona, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, Temperature-aware microarchitecture, in Proc. IEEE Int. Symp. on Circuits and Systems, [12] V. Nookala, D. J. Lilja, and S. S. Sapatnekar, Temperature-aware floorplanning of microarchitecture blocks with ipc-power dependence modeling and transient analysis, in Int. Symp. on Low Power Electronics and Design, [13] Y. Han, I. Koren, and C. A. Moritz, Temperature aware floorplanning, in Temperature aware Computing Systems, [14] H. Yu, Y. Hu, C. C. Liu, and L. He, Minimal skew clock embedding considering time variant temperature variation with automatic correlation extraction, in Int. Symp. on Physical Design, [15] M. Monchiero, R.Canal, and A. Gonzalez, Design space exploration for multicore architectures: a power/performance/thermal view, in Proc.of Int. Conf. on Supercomputing pp , [16] E. Humenay, D. Tarjan, and K. Skadron, Impact of process variations on multicore performance symmetry, in Proc. Design Automation and Test in Europe, 2007.

12 [17] N. Sherwani, Algorithms for vlsi design automation, Kluwer, 3rd ed., [18] S. N. Adya and I. L. Markov, Fixed-outlined floorplanning through better local research, in IEEE/ACM Int. Conf. on Computer Aided Design, pp , [19] D. Burger and T. Austin, The simplescalar tool set version 2.0, in University of Wisconsin-Madison, [20] W. L. W. Liao, L. He, and K. Lepakand, Ptscalar version 1.0, in University of California-Los Angeles, [21] J. L. Henning, SPEC CPU 2000:Measuring CPU performance in the new millennium, IEEE Trans. on Computers, pp vol.33, [22] A. Krum, Thermal management, in F.Kreith, ed, The CRC Handbook of Thermal Enginnering pp CRC Press, Boca Raton, FL, [23] W. Liao, L. He, and K. Lepak, Temperature and supply voltage aware performance and power modeling at microarchitecture level, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, pp , 2005.

Micro-architecture Pipelining Optimization with Throughput- Aware Floorplanning

Micro-architecture Pipelining Optimization with Throughput- Aware Floorplanning Micro-architecture Pipelining Optimization with Throughput- Aware Floorplanning Yuchun Ma* Zhuoyuan Li* Jason Cong Xianlong Hong Glenn Reinman Sheqin Dong* Qiang Zhou *Department of Computer Science &

More information

Temperature Aware Floorplanning

Temperature Aware Floorplanning Temperature Aware Floorplanning Yongkui Han, Israel Koren and Csaba Andras Moritz Depment of Electrical and Computer Engineering University of Massachusetts,Amherst, MA 13 E-mail: yhan,koren,andras @ecs.umass.edu

More information

Temperature-Aware Floorplanning of Microarchitecture Blocks with IPC-Power Dependence Modeling and Transient Analysis

Temperature-Aware Floorplanning of Microarchitecture Blocks with IPC-Power Dependence Modeling and Transient Analysis Temperature-Aware Floorplanning of Microarchitecture Blocks with IPC-Power Dependence Modeling and Transient Analysis Vidyasagar Nookala David J. Lilja Sachin S. Sapatnekar ECE Dept, University of Minnesota,

More information

Simulated Annealing Based Temperature Aware Floorplanning

Simulated Annealing Based Temperature Aware Floorplanning Copyright 7 American Scientific Publishers All rights reserved Printed in the United States of America Journal of Low Power Electronics Vol. 3, 1 15, 7 Simulated Annealing Based Temperature Aware Floorplanning

More information

Architecture-level Thermal Behavioral Models For Quad-Core Microprocessors

Architecture-level Thermal Behavioral Models For Quad-Core Microprocessors Architecture-level Thermal Behavioral Models For Quad-Core Microprocessors Duo Li Dept. of Electrical Engineering University of California Riverside, CA 951 dli@ee.ucr.edu Sheldon X.-D. Tan Dept. of Electrical

More information

System Level Leakage Reduction Considering the Interdependence of Temperature and Leakage

System Level Leakage Reduction Considering the Interdependence of Temperature and Leakage System Level Leakage Reduction Considering the Interdependence of Temperature and Leakage Lei He, Weiping Liao and Mircea R. Stan EE Department, University of California, Los Angeles 90095 ECE Department,

More information

Thermomechanical Stress-Aware Management for 3-D IC Designs

Thermomechanical Stress-Aware Management for 3-D IC Designs 2678 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 25, NO. 9, SEPTEMBER 2017 Thermomechanical Stress-Aware Management for 3-D IC Designs Qiaosha Zou, Member, IEEE, Eren Kursun,

More information

Temperature-Aware Performance and Power Modeling

Temperature-Aware Performance and Power Modeling Technical Report UCLA Engr. 04-250 University of California at Los Angeles Temperature-Aware Performance and Power Modeling Weiping Liao, Lei He and Kevin Lepak Electrical Engineering Department University

More information

Thermal-reliable 3D Clock-tree Synthesis Considering Nonlinear Electrical-thermal-coupled TSV Model

Thermal-reliable 3D Clock-tree Synthesis Considering Nonlinear Electrical-thermal-coupled TSV Model Thermal-reliable 3D Clock-tree Synthesis Considering Nonlinear Electrical-thermal-coupled TSV Model Yang Shang 1, Chun Zhang 1, Hao Yu 1, Chuan Seng Tan 1, Xin Zhao 2, Sung Kyu Lim 2 1 School of Electrical

More information

Stochastic Dynamic Thermal Management: A Markovian Decision-based Approach. Hwisung Jung, Massoud Pedram

Stochastic Dynamic Thermal Management: A Markovian Decision-based Approach. Hwisung Jung, Massoud Pedram Stochastic Dynamic Thermal Management: A Markovian Decision-based Approach Hwisung Jung, Massoud Pedram Outline Introduction Background Thermal Management Framework Accuracy of Modeling Policy Representation

More information

Buffered Clock Tree Sizing for Skew Minimization under Power and Thermal Budgets

Buffered Clock Tree Sizing for Skew Minimization under Power and Thermal Budgets Buffered Clock Tree Sizing for Skew Minimization under Power and Thermal Budgets Krit Athikulwongse, Xin Zhao, and Sung Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology

More information

Coupled Power and Thermal Simulation with Active Cooling

Coupled Power and Thermal Simulation with Active Cooling Coupled Power and Thermal Simulation with Active Cooling Weiping Liao and Lei He Electrical Engineering Department University of California, Los Angeles, CA 90095 {wliao,lhe}@ee.ucla.edu Abstract. Power

More information

Enhancing the Sniper Simulator with Thermal Measurement

Enhancing the Sniper Simulator with Thermal Measurement Proceedings of the 18th International Conference on System Theory, Control and Computing, Sinaia, Romania, October 17-19, 214 Enhancing the Sniper Simulator with Thermal Measurement Adrian Florea, Claudiu

More information

Coupled Power and Thermal Simulation with Active Cooling

Coupled Power and Thermal Simulation with Active Cooling Coupled Power and Thermal Simulation with Active Cooling Weiping Liao and Lei He Electrical Engineering Department, University of California, Los Angeles, CA 90095 {wliao, lhe}@ee.ucla.edu Abstract. Power

More information

CSE241 VLSI Digital Circuits Winter Lecture 07: Timing II

CSE241 VLSI Digital Circuits Winter Lecture 07: Timing II CSE241 VLSI Digital Circuits Winter 2003 Lecture 07: Timing II CSE241 L3 ASICs.1 Delay Calculation Cell Fall Cap\Tr 0.05 0.2 0.5 0.01 0.02 0.16 0.30 0.5 2.0 0.04 0.32 0.178 0.08 0.64 0.60 1.20 0.1ns 0.147ns

More information

PERFORMANCE METRICS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah

PERFORMANCE METRICS. Mahdi Nazm Bojnordi. CS/ECE 6810: Computer Architecture. Assistant Professor School of Computing University of Utah PERFORMANCE METRICS Mahdi Nazm Bojnordi Assistant Professor School of Computing University of Utah CS/ECE 6810: Computer Architecture Overview Announcement Jan. 17 th : Homework 1 release (due on Jan.

More information

CMOS device technology has scaled rapidly for nearly. Modeling and Analysis of Nonuniform Substrate Temperature Effects on Global ULSI Interconnects

CMOS device technology has scaled rapidly for nearly. Modeling and Analysis of Nonuniform Substrate Temperature Effects on Global ULSI Interconnects IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 24, NO. 6, JUNE 2005 849 Modeling and Analysis of Nonuniform Substrate Temperature Effects on Global ULSI Interconnects

More information

Enabling Power Density and Thermal-Aware Floorplanning

Enabling Power Density and Thermal-Aware Floorplanning Enabling Power Density and Thermal-Aware Floorplanning Ehsan K. Ardestani, Amirkoushyar Ziabari, Ali Shakouri, and Jose Renau School of Engineering University of California Santa Cruz 1156 High Street

More information

Reliability-aware Thermal Management for Hard Real-time Applications on Multi-core Processors

Reliability-aware Thermal Management for Hard Real-time Applications on Multi-core Processors Reliability-aware Thermal Management for Hard Real-time Applications on Multi-core Processors Vinay Hanumaiah Electrical Engineering Department Arizona State University, Tempe, USA Email: vinayh@asu.edu

More information

HIGH-PERFORMANCE circuits consume a considerable

HIGH-PERFORMANCE circuits consume a considerable 1166 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL 17, NO 11, NOVEMBER 1998 A Matrix Synthesis Approach to Thermal Placement Chris C N Chu D F Wong Abstract In this

More information

Performance, Power & Energy. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So

Performance, Power & Energy. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So Performance, Power & Energy ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So Recall: Goal of this class Performance Reconfiguration Power/ Energy H. So, Sp10 Lecture 3 - ELEC8106/6102 2 PERFORMANCE EVALUATION

More information

FastSpot: Host-Compiled Thermal Estimation for Early Design Space Exploration

FastSpot: Host-Compiled Thermal Estimation for Early Design Space Exploration FastSpot: Host-Compiled Thermal Estimation for Early Design Space Exploration Darshan Gandhi, Andreas Gerstlauer, Lizy John Electrical and Computer Engineering The University of Texas at Austin Email:

More information

A Novel Software Solution for Localized Thermal Problems

A Novel Software Solution for Localized Thermal Problems A Novel Software Solution for Localized Thermal Problems Sung Woo Chung 1,* and Kevin Skadron 2 1 Division of Computer and Communication Engineering, Korea University, Seoul 136-713, Korea swchung@korea.ac.kr

More information

Interconnect Lifetime Prediction for Temperature-Aware Design

Interconnect Lifetime Prediction for Temperature-Aware Design Interconnect Lifetime Prediction for Temperature-Aware Design UNIV. OF VIRGINIA DEPT. OF COMPUTER SCIENCE TECH. REPORT CS-23-2 NOVEMBER 23 Zhijian Lu, Mircea Stan, John Lach, Kevin Skadron Departments

More information

Technical Report GIT-CERCS. Thermal Field Management for Many-core Processors

Technical Report GIT-CERCS. Thermal Field Management for Many-core Processors Technical Report GIT-CERCS Thermal Field Management for Many-core Processors Minki Cho, Nikhil Sathe, Sudhakar Yalamanchili and Saibal Mukhopadhyay School of Electrical and Computer Engineering Georgia

More information

TEMPERATURE-AWARE COMPUTER SYSTEMS: OPPORTUNITIES AND CHALLENGES

TEMPERATURE-AWARE COMPUTER SYSTEMS: OPPORTUNITIES AND CHALLENGES TEMPERATURE-AWARE COMPUTER SYSTEMS: OPPORTUNITIES AND CHALLENGES TEMPERATURE-AWARE DESIGN TECHNIQUES HAVE AN IMPORTANT ROLE TO PLAY IN ADDITION TO TRADITIONAL TECHNIQUES LIKE POWER-AWARE DESIGN AND PACKAGE-

More information

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2) INF2270 Spring 2010 Philipp Häfliger Summary/Repetition (1/2) content From Scalar to Superscalar Lecture Summary and Brief Repetition Binary numbers Boolean Algebra Combinational Logic Circuits Encoder/Decoder

More information

Efficient online computation of core speeds to maximize the throughput of thermally constrained multi-core processors

Efficient online computation of core speeds to maximize the throughput of thermally constrained multi-core processors Efficient online computation of core speeds to maximize the throughput of thermally constrained multi-core processors Ravishankar Rao and Sarma Vrudhula Department of Computer Science and Engineering Arizona

More information

On Optimal Physical Synthesis of Sleep Transistors

On Optimal Physical Synthesis of Sleep Transistors On Optimal Physical Synthesis of Sleep Transistors Changbo Long, Jinjun Xiong and Lei He {longchb, jinjun, lhe}@ee.ucla.edu EE department, University of California, Los Angeles, CA, 90095 ABSTRACT Considering

More information

Dynamic Power Management under Uncertain Information. University of Southern California Los Angeles CA

Dynamic Power Management under Uncertain Information. University of Southern California Los Angeles CA Dynamic Power Management under Uncertain Information Hwisung Jung and Massoud Pedram University of Southern California Los Angeles CA Agenda Introduction Background Stochastic Decision-Making Framework

More information

USING ON-CHIP EVENT COUNTERS FOR HIGH-RESOLUTION, REAL-TIME TEMPERATURE MEASUREMENT 1

USING ON-CHIP EVENT COUNTERS FOR HIGH-RESOLUTION, REAL-TIME TEMPERATURE MEASUREMENT 1 USING ON-CHIP EVENT COUNTERS FOR HIGH-RESOLUTION, REAL-TIME TEMPERATURE MEASUREMENT 1 Sung Woo Chung and Kevin Skadron Division of Computer Science and Engineering, Korea University, Seoul 136-713, Korea

More information

Analytical Model for Sensor Placement on Microprocessors

Analytical Model for Sensor Placement on Microprocessors Analytical Model for Sensor Placement on Microprocessors Kyeong-Jae Lee, Kevin Skadron, and Wei Huang Departments of Computer Science, and Electrical and Computer Engineering University of Virginia kl2z@alumni.virginia.edu,

More information

Short Papers. Efficient In-Package Decoupling Capacitor Optimization for I/O Power Integrity

Short Papers. Efficient In-Package Decoupling Capacitor Optimization for I/O Power Integrity 734 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 26, NO. 4, APRIL 2007 Short Papers Efficient In-Package Decoupling Capacitor Optimization for I/O Power Integrity

More information

Design for Manufacturability and Power Estimation. Physical issues verification (DSM)

Design for Manufacturability and Power Estimation. Physical issues verification (DSM) Design for Manufacturability and Power Estimation Lecture 25 Alessandra Nardi Thanks to Prof. Jan Rabaey and Prof. K. Keutzer Physical issues verification (DSM) Interconnects Signal Integrity P/G integrity

More information

Analysis of Temporal and Spatial Temperature Gradients for IC Reliability

Analysis of Temporal and Spatial Temperature Gradients for IC Reliability 1 Analysis of Temporal and Spatial Temperature Gradients for IC Reliability UNIV. OF VIRGINIA DEPT. OF COMPUTER SCIENCE TECH. REPORT CS-24-8 MARCH 24 Zhijian Lu, Wei Huang, Shougata Ghosh, John Lach, Mircea

More information

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis

EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture. Rajeevan Amirtharajah University of California, Davis EEC 216 Lecture #3: Power Estimation, Interconnect, & Architecture Rajeevan Amirtharajah University of California, Davis Outline Announcements Review: PDP, EDP, Intersignal Correlations, Glitching, Top

More information

Lecture 2: CMOS technology. Energy-aware computing

Lecture 2: CMOS technology. Energy-aware computing Energy-Aware Computing Lecture 2: CMOS technology Basic components Transistors Two types: NMOS, PMOS Wires (interconnect) Transistors as switches Gate Drain Source NMOS: When G is @ logic 1 (actually over

More information

Lecture 23. Dealing with Interconnect. Impact of Interconnect Parasitics

Lecture 23. Dealing with Interconnect. Impact of Interconnect Parasitics Lecture 23 Dealing with Interconnect Impact of Interconnect Parasitics Reduce Reliability Affect Performance Classes of Parasitics Capacitive Resistive Inductive 1 INTERCONNECT Dealing with Capacitance

More information

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1

NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 NCU EE -- DSP VLSI Design. Tsung-Han Tsai 1 Multi-processor vs. Multi-computer architecture µp vs. DSP RISC vs. DSP RISC Reduced-instruction-set Register-to-register operation Higher throughput by using

More information

Lecture 2: Metrics to Evaluate Systems

Lecture 2: Metrics to Evaluate Systems Lecture 2: Metrics to Evaluate Systems Topics: Metrics: power, reliability, cost, benchmark suites, performance equation, summarizing performance with AM, GM, HM Sign up for the class mailing list! Video

More information

Accurate Temperature Estimation for Efficient Thermal Management

Accurate Temperature Estimation for Efficient Thermal Management 9th International Symposium on Quality Electronic Design Accurate emperature Estimation for Efficient hermal Management Shervin Sharifi, ChunChen Liu, ajana Simunic Rosing Computer Science and Engineering

More information

Throughput of Multi-core Processors Under Thermal Constraints

Throughput of Multi-core Processors Under Thermal Constraints Throughput of Multi-core Processors Under Thermal Constraints Ravishankar Rao, Sarma Vrudhula, and Chaitali Chakrabarti Consortium for Embedded Systems Arizona State University Tempe, AZ 85281, USA {ravirao,

More information

Variation-aware Clock Network Design Methodology for Ultra-Low Voltage (ULV) Circuits

Variation-aware Clock Network Design Methodology for Ultra-Low Voltage (ULV) Circuits Variation-aware Clock Network Design Methodology for Ultra-Low Voltage (ULV) Circuits Xin Zhao, Jeremy R. Tolbert, Chang Liu, Saibal Mukhopadhyay, and Sung Kyu Lim School of ECE, Georgia Institute of Technology,

More information

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 17, NO. 10, OCTOBER

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 17, NO. 10, OCTOBER IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL 17, NO 10, OCTOBER 2009 1495 Architecture-Level Thermal Characterization for Multicore Microprocessors Duo Li, Student Member, IEEE,

More information

A Detailed Study on Phase Predictors

A Detailed Study on Phase Predictors A Detailed Study on Phase Predictors Frederik Vandeputte, Lieven Eeckhout, and Koen De Bosschere Ghent University, Electronics and Information Systems Department Sint-Pietersnieuwstraat 41, B-9000 Gent,

More information

Fast Thermal Simulation for Run-Time Temperature Tracking and Management

Fast Thermal Simulation for Run-Time Temperature Tracking and Management Fast Thermal Simulation for Run-Time Temperature Tracking and Management Pu Liu, Student Member, IEEE, Hang Li, Student Member, IEEE, Lingling Jin, Wei Wu, Sheldon X.-D. Tan, Senior Member, IEEE, Jun Yang,

More information

3-D Thermal-ADI: A Linear-Time Chip Level Transient Thermal Simulator

3-D Thermal-ADI: A Linear-Time Chip Level Transient Thermal Simulator 1434 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 21, NO. 12, DECEMBER 2002 3-D Thermal-ADI: A Linear-Time Chip Level Transient Thermal Simulator Ting-Yuan Wang and

More information

Online Work Maximization under a Peak Temperature Constraint

Online Work Maximization under a Peak Temperature Constraint Online Work Maximization under a Peak Temperature Constraint Thidapat Chantem Department of CSE University of Notre Dame Notre Dame, IN 46556 tchantem@nd.edu X. Sharon Hu Department of CSE University of

More information

Microprocessor Power Analysis by Labeled Simulation

Microprocessor Power Analysis by Labeled Simulation Microprocessor Power Analysis by Labeled Simulation Cheng-Ta Hsieh, Kevin Chen and Massoud Pedram University of Southern California Dept. of EE-Systems Los Angeles CA 989 Outline! Introduction! Problem

More information

Parameterized Transient Thermal Behavioral Modeling For Chip Multiprocessors

Parameterized Transient Thermal Behavioral Modeling For Chip Multiprocessors Parameterized Transient Thermal Behavioral Modeling For Chip Multiprocessors Duo Li and Sheldon X-D Tan Dept of Electrical Engineering University of California Riverside, CA 95 Eduardo H Pacheco and Murli

More information

An Automated Approach for Evaluating Spatial Correlation in Mixed Signal Designs Using Synopsys HSpice

An Automated Approach for Evaluating Spatial Correlation in Mixed Signal Designs Using Synopsys HSpice Spatial Correlation in Mixed Signal Designs Using Synopsys HSpice Omid Kavehei, Said F. Al-Sarawi, Derek Abbott School of Electrical and Electronic Engineering The University of Adelaide Adelaide, SA 5005,

More information

A Physical-Aware Task Migration Algorithm for Dynamic Thermal Management of SMT Multi-core Processors

A Physical-Aware Task Migration Algorithm for Dynamic Thermal Management of SMT Multi-core Processors A Physical-Aware Task Migration Algorithm for Dynamic Thermal Management of SMT Multi-core Processors Abstract - This paper presents a task migration algorithm for dynamic thermal management of SMT multi-core

More information

Stack Sizing for Optimal Current Drivability in Subthreshold Circuits REFERENCES

Stack Sizing for Optimal Current Drivability in Subthreshold Circuits REFERENCES 598 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL 16, NO 5, MAY 2008 design can be easily expanded to a hierarchical 64-bit adder such that the result will be attained in four cycles

More information

The Fast Optimal Voltage Partitioning Algorithm For Peak Power Density Minimization

The Fast Optimal Voltage Partitioning Algorithm For Peak Power Density Minimization The Fast Optimal Voltage Partitioning Algorithm For Peak Power Density Minimization Jia Wang, Shiyan Hu Department of Electrical and Computer Engineering Michigan Technological University Houghton, Michigan

More information

Chip Level Thermal Profile Estimation Using On-chip Temperature Sensors

Chip Level Thermal Profile Estimation Using On-chip Temperature Sensors Chip Level Thermal Profile Estimation Using On-chip Temperature Sensors Yufu Zhang, Ankur Srivastava and Mohamed Zahran* University of Maryland, *City University of New York {yufuzh, ankurs}@umd.edu, *mzahran@ccny.cuny.edu

More information

Pre-characterization Free, Efficient Power/Performance Analysis of Embedded and General Purpose Software Applications

Pre-characterization Free, Efficient Power/Performance Analysis of Embedded and General Purpose Software Applications Pre-characterization Free, Efficient Power/Performance Analysis of Embedded and General Purpose Software Applications Venkata Syam P. Rapaka, Diana Marculescu Department of Electrical and Computer Engineering

More information

Logical Effort: Designing for Speed on the Back of an Envelope David Harris Harvey Mudd College Claremont, CA

Logical Effort: Designing for Speed on the Back of an Envelope David Harris Harvey Mudd College Claremont, CA Logical Effort: Designing for Speed on the Back of an Envelope David Harris David_Harris@hmc.edu Harvey Mudd College Claremont, CA Outline o Introduction o Delay in a Logic Gate o Multi-stage Logic Networks

More information

School of EECS Seoul National University

School of EECS Seoul National University 4!4 07$ 8902808 3 School of EECS Seoul National University Introduction Low power design 3974/:.9 43 Increasing demand on performance and integrity of VLSI circuits Popularity of portable devices Low power

More information

Granularity of Microprocessor Thermal Management: A Technical Report

Granularity of Microprocessor Thermal Management: A Technical Report Granularity of Microprocessor Thermal Management: A Technical Report Karthik Sankaranarayanan, Wei Huang, Mircea R. Stan, Hossein Haj-Hariri, Robert J. Ribando and Kevin Skadron Department of Computer

More information

Design for Testability

Design for Testability Design for Testability Outline Ad Hoc Design for Testability Techniques Method of test points Multiplexing and demultiplexing of test points Time sharing of I/O for normal working and testing modes Partitioning

More information

Three-Tier 3D ICs for More Power Reduction: Strategies in CAD, Design, and Bonding Selection

Three-Tier 3D ICs for More Power Reduction: Strategies in CAD, Design, and Bonding Selection Three-Tier 3D ICs for More Power Reduction: Strategies in CAD, Design, and Bonding Selection Taigon Song 1, Shreepad Panth 2, Yoo-Jin Chae 3, and Sung Kyu Lim 1 1 School of ECE, Georgia Institute of Technology,

More information

Thermal Scheduling SImulator for Chip Multiprocessors

Thermal Scheduling SImulator for Chip Multiprocessors TSIC: Thermal Scheduling SImulator for Chip Multiprocessors Kyriakos Stavrou Pedro Trancoso CASPER group Department of Computer Science University Of Cyprus The CASPER group: Computer Architecture System

More information

Control-Theoretic Techniques and Thermal-RC Modeling for Accurate and Localized Dynamic Thermal Management

Control-Theoretic Techniques and Thermal-RC Modeling for Accurate and Localized Dynamic Thermal Management Control-Theoretic Techniques and Thermal-RC Modeling for Accurate and Localized ynamic Thermal Management UNIV. O VIRGINIA EPT. O COMPUTER SCIENCE TECH. REPORT CS-2001-27 Kevin Skadron, Tarek Abdelzaher,

More information

Compact Thermal Modeling for Temperature-Aware Design

Compact Thermal Modeling for Temperature-Aware Design Compact Thermal Modeling for Temperature-Aware Design Wei Huang, Mircea R. Stan, Kevin Skadron, Karthik Sankaranarayanan Shougata Ghosh, Sivakumar Velusamy Departments of Electrical and Computer Engineering,

More information

CS 700: Quantitative Methods & Experimental Design in Computer Science

CS 700: Quantitative Methods & Experimental Design in Computer Science CS 700: Quantitative Methods & Experimental Design in Computer Science Sanjeev Setia Dept of Computer Science George Mason University Logistics Grade: 35% project, 25% Homework assignments 20% midterm,

More information

BeiHang Short Course, Part 7: HW Acceleration: It s about Performance, Energy and Power

BeiHang Short Course, Part 7: HW Acceleration: It s about Performance, Energy and Power BeiHang Short Course, Part 7: HW Acceleration: It s about Performance, Energy and Power James C. Hoe Department of ECE Carnegie Mellon niversity Eric S. Chung, et al., Single chip Heterogeneous Computing:

More information

TAU 2014 Contest Pessimism Removal of Timing Analysis v1.6 December 11 th,

TAU 2014 Contest Pessimism Removal of Timing Analysis v1.6 December 11 th, TU 2014 Contest Pessimism Removal of Timing nalysis v1.6 ecember 11 th, 2013 https://sites.google.com/site/taucontest2014 1 Introduction This document outlines the concepts and implementation details necessary

More information

Analytical Heat Transfer Model for Thermal Through-Silicon Vias

Analytical Heat Transfer Model for Thermal Through-Silicon Vias Analytical Heat Transfer Model for Thermal Through-Silicon Vias Hu Xu, Vasilis F. Pavlidis, and Giovanni De Micheli LSI - EPFL, CH-1015, Switzerland Email: {hu.xu, vasileios.pavlidis, giovanni.demicheli}@epfl.ch

More information

Through Silicon Via-Based Grid for Thermal Control in 3D Chips

Through Silicon Via-Based Grid for Thermal Control in 3D Chips Through Silicon Via-Based Grid for Thermal Control in 3D Chips José L. Ayala 1, Arvind Sridhar 2, Vinod Pangracious 2, David Atienza 2, and Yusuf Leblebici 3 1 Dept. of Computer Architecture and Systems

More information

Throughput Maximization for Intel Desktop Platform under the Maximum Temperature Constraint

Throughput Maximization for Intel Desktop Platform under the Maximum Temperature Constraint 2011 IEEE/ACM International Conference on Green Computing and Communications Throughput Maximization for Intel Desktop Platform under the Maximum Temperature Constraint Guanglei Liu 1, Gang Quan 1, Meikang

More information

Design for Variability and Signoff Tips

Design for Variability and Signoff Tips Design for Variability and Signoff Tips Alexander Tetelbaum Abelite Design Automation, Walnut Creek, USA alex@abelite-da.com ABSTRACT The paper provides useful design tips and recommendations on how to

More information

Logic BIST. Sungho Kang Yonsei University

Logic BIST. Sungho Kang Yonsei University Logic BIST Sungho Kang Yonsei University Outline Introduction Basics Issues Weighted Random Pattern Generation BIST Architectures Deterministic BIST Conclusion 2 Built In Self Test Test/ Normal Input Pattern

More information

Measurement & Performance

Measurement & Performance Measurement & Performance Timers Performance measures Time-based metrics Rate-based metrics Benchmarking Amdahl s law Topics 2 Page The Nature of Time real (i.e. wall clock) time = User Time: time spent

More information

A Novel Methodology for Thermal Aware Silicon Area Estimation for 2D & 3D MPSoCs

A Novel Methodology for Thermal Aware Silicon Area Estimation for 2D & 3D MPSoCs A Novel Methodology for Thermal Aware Silicon Area Estimation for 2D & 3D MPSoCs Ramya Menon C. and Vinod Pangracious Department of Electronics & Communication Engineering, Rajagiri School of Engineering

More information

Measurement & Performance

Measurement & Performance Measurement & Performance Topics Timers Performance measures Time-based metrics Rate-based metrics Benchmarking Amdahl s law 2 The Nature of Time real (i.e. wall clock) time = User Time: time spent executing

More information

Computer Architecture

Computer Architecture Lecture 2: Iakovos Mavroidis Computer Science Department University of Crete 1 Previous Lecture CPU Evolution What is? 2 Outline Measurements and metrics : Performance, Cost, Dependability, Power Guidelines

More information

System-Level Modeling and Microprocessor Reliability Analysis for Backend Wearout Mechanisms

System-Level Modeling and Microprocessor Reliability Analysis for Backend Wearout Mechanisms System-Level Modeling and Microprocessor Reliability Analysis for Backend Wearout Mechanisms Chang-Chih Chen and Linda Milor School of Electrical and Comptuer Engineering, Georgia Institute of Technology,

More information

Performance Metrics for Computer Systems. CASS 2018 Lavanya Ramapantulu

Performance Metrics for Computer Systems. CASS 2018 Lavanya Ramapantulu Performance Metrics for Computer Systems CASS 2018 Lavanya Ramapantulu Eight Great Ideas in Computer Architecture Design for Moore s Law Use abstraction to simplify design Make the common case fast Performance

More information

A Mathematical Solution to. by Utilizing Soft Edge Flip Flops

A Mathematical Solution to. by Utilizing Soft Edge Flip Flops A Mathematical Solution to Power Optimal Pipeline Design by Utilizing Soft Edge Flip Flops M. Ghasemazar, B. Amelifard, M. Pedram University of Southern California Department of Electrical Engineering

More information

L16: Power Dissipation in Digital Systems. L16: Spring 2007 Introductory Digital Systems Laboratory

L16: Power Dissipation in Digital Systems. L16: Spring 2007 Introductory Digital Systems Laboratory L16: Power Dissipation in Digital Systems 1 Problem #1: Power Dissipation/Heat Power (Watts) 100000 10000 1000 100 10 1 0.1 4004 80088080 8085 808686 386 486 Pentium proc 18KW 5KW 1.5KW 500W 1971 1974

More information

ECE321 Electronics I

ECE321 Electronics I ECE321 Electronics I Lecture 1: Introduction to Digital Electronics Payman Zarkesh-Ha Office: ECE Bldg. 230B Office hours: Tuesday 2:00-3:00PM or by appointment E-mail: payman@ece.unm.edu Slide: 1 Textbook

More information

CHARACTERIZATION AND CLASSIFICATION OF MODERN MICRO-PROCESSOR BENCHMARKS KUNXIANG YAN, B.S. A thesis submitted to the Graduate School

CHARACTERIZATION AND CLASSIFICATION OF MODERN MICRO-PROCESSOR BENCHMARKS KUNXIANG YAN, B.S. A thesis submitted to the Graduate School CHARACTERIZATION AND CLASSIFICATION OF MODERN MICRO-PROCESSOR BENCHMARKS BY KUNXIANG YAN, B.S. A thesis submitted to the Graduate School in partial fulfillment of the requirements for the degree Master

More information

Timing Driven Power Gating in High-Level Synthesis

Timing Driven Power Gating in High-Level Synthesis Timing Driven Power Gating in High-Level Synthesis Shih-Hsu Huang and Chun-Hua Cheng Department of Electronic Engineering Chung Yuan Christian University, Taiwan Outline Introduction Motivation Our Approach

More information

Multi-Objective Module Placement For 3-D System-On-Package

Multi-Objective Module Placement For 3-D System-On-Package IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 14, NO. 5, MAY 2006 553 Multi-Objective Module Placement For 3-D System-On-Package Eric Wong, Jacob Minz, and Sung Kyu Lim Abstract

More information

Evaluating Overheads of Multi-bit Soft Error Protection Techniques at Hardware Level Sponsored by SRC and Freescale under SRC task number 2042

Evaluating Overheads of Multi-bit Soft Error Protection Techniques at Hardware Level Sponsored by SRC and Freescale under SRC task number 2042 Evaluating Overheads of Multi-bit Soft Error Protection Techniques at Hardware Level Sponsored by SR and Freescale under SR task number 2042 Lukasz G. Szafaryn, Kevin Skadron Department of omputer Science

More information

CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues

CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 17: Dynamic Sequential Circuits And Timing Issues [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan,

More information

VLSI Signal Processing

VLSI Signal Processing VLSI Signal Processing Lecture 1 Pipelining & Retiming ADSP Lecture1 - Pipelining & Retiming (cwliu@twins.ee.nctu.edu.tw) 1-1 Introduction DSP System Real time requirement Data driven synchronized by data

More information

Design for Testability

Design for Testability Design for Testability Outline Ad Hoc Design for Testability Techniques Method of test points Multiplexing and demultiplexing of test points Time sharing of I/O for normal working and testing modes Partitioning

More information

CMP 338: Third Class

CMP 338: Third Class CMP 338: Third Class HW 2 solution Conversion between bases The TINY processor Abstraction and separation of concerns Circuit design big picture Moore s law and chip fabrication cost Performance What does

More information

Lecture 25. Dealing with Interconnect and Timing. Digital Integrated Circuits Interconnect

Lecture 25. Dealing with Interconnect and Timing. Digital Integrated Circuits Interconnect Lecture 25 Dealing with Interconnect and Timing Administrivia Projects will be graded by next week Project phase 3 will be announced next Tu.» Will be homework-like» Report will be combined poster Today

More information

Thermal-reliable 3D Clock-tree Synthesis Considering Nonlinear Electrical-thermal-coupled TSV Model

Thermal-reliable 3D Clock-tree Synthesis Considering Nonlinear Electrical-thermal-coupled TSV Model Thermal-reliable 3D Clock-tree Synthesis Considering Nonlinear Electrical-thermal-coupled TSV Model Yang Shang, Chun Zhang, Hao Yu, Chuan Seng Tan Xin Zhao, Sung Kyu Lim School of Electrical and Electronic

More information

Implementation of Clock Network Based on Clock Mesh

Implementation of Clock Network Based on Clock Mesh International Conference on Information Technology and Management Innovation (ICITMI 2015) Implementation of Clock Network Based on Clock Mesh He Xin 1, a *, Huang Xu 2,b and Li Yujing 3,c 1 Sichuan Institute

More information

Interconnect s Role in Deep Submicron. Second class to first class

Interconnect s Role in Deep Submicron. Second class to first class Interconnect s Role in Deep Submicron Dennis Sylvester EE 219 November 3, 1998 Second class to first class Interconnect effects are no longer secondary # of wires # of devices More metal levels RC delay

More information

Performance Metrics & Architectural Adaptivity. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So

Performance Metrics & Architectural Adaptivity. ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So Performance Metrics & Architectural Adaptivity ELEC8106/ELEC6102 Spring 2010 Hayden Kwok-Hay So What are the Options? Power Consumption Activity factor (amount of circuit switching) Load Capacitance (size

More information

Robust Optimization of a Chip Multiprocessor s Performance under Power and Thermal Constraints

Robust Optimization of a Chip Multiprocessor s Performance under Power and Thermal Constraints Robust Optimization of a Chip Multiprocessor s Performance under Power and Thermal Constraints Mohammad Ghasemazar, Hadi Goudarzi and Massoud Pedram University of Southern California Department of Electrical

More information

AN ANALYTICAL THERMAL MODEL FOR THREE-DIMENSIONAL INTEGRATED CIRCUITS WITH INTEGRATED MICRO-CHANNEL COOLING

AN ANALYTICAL THERMAL MODEL FOR THREE-DIMENSIONAL INTEGRATED CIRCUITS WITH INTEGRATED MICRO-CHANNEL COOLING THERMAL SCIENCE, Year 2017, Vol. 21, No. 4, pp. 1601-1606 1601 AN ANALYTICAL THERMAL MODEL FOR THREE-DIMENSIONAL INTEGRATED CIRCUITS WITH INTEGRATED MICRO-CHANNEL COOLING by Kang-Jia WANG a,b, Hong-Chang

More information

ECE 3060 VLSI and Advanced Digital Design. Testing

ECE 3060 VLSI and Advanced Digital Design. Testing ECE 3060 VLSI and Advanced Digital Design Testing Outline Definitions Faults and Errors Fault models and definitions Fault Detection Undetectable Faults can be used in synthesis Fault Simulation Observability

More information

CMP 334: Seventh Class

CMP 334: Seventh Class CMP 334: Seventh Class Performance HW 5 solution Averages and weighted averages (review) Amdahl's law Ripple-carry adder circuits Binary addition Half-adder circuits Full-adder circuits Subtraction, negative

More information

iretilp : An efficient incremental algorithm for min-period retiming under general delay model

iretilp : An efficient incremental algorithm for min-period retiming under general delay model iretilp : An efficient incremental algorithm for min-period retiming under general delay model Debasish Das, Jia Wang and Hai Zhou EECS, Northwestern University, Evanston, IL 60201 Place and Route Group,

More information

Overlay Aware Interconnect and Timing Variation Modeling for Double Patterning Technology

Overlay Aware Interconnect and Timing Variation Modeling for Double Patterning Technology Overlay Aware Interconnect and Timing Variation Modeling for Double Patterning Technology Jae-Seok Yang, David Z. Pan Dept. of ECE, The University of Texas at Austin, Austin, Tx 78712 jsyang@cerc.utexas.edu,

More information