Parametric Yield in FPGAs Due to Within-die Delay Variations: A Quantitative Analysis
Pete Sedcole and Peter Y. K. Cheung
Dept. Electrical and Electronic Engineering, Imperial College London, UK
{pete.sedcole,p.cheung}@imperial.ac.uk

ABSTRACT

Variations in the semiconductor fabrication process result in variability in parameters between transistors on the same die, a problem exacerbated by lithographic scaling. The reconfigurability of Field-Programmable Gate Arrays presents the opportunity to compensate for within-die delay variability. This paper presents three reconfiguration-based strategies for compensating within-die stochastic delay variability in FPGAs: reconfiguring the entire FPGA, relocating subcircuits within an FPGA, and reconfiguring signal paths within a design. The yield of each strategy is analysed and compared with worst-case design and statistical static timing analysis (SSTA). It is demonstrated that significant improvements in circuit yield and timing are possible using SSTA alone, and these improvements can be enhanced by employing reconfiguration-based techniques.

Categories and Subject Descriptors
B.6. [Integrated Circuits]: Design Styles Logic arrays; B.7. [Integrated Circuits]: Types and Design Styles Advanced technologies

General Terms
Performance

Keywords
Delay, FPGA, modelling, process variation, reconfiguration, statistical theory, within-die variability, yield

1. INTRODUCTION

Variations in process parameters during semiconductor fabrication are manifested in the variability of the performance of the resulting integrated circuits. Historically, performance parameters have varied from wafer to wafer or lot to lot.
At-speed testing techniques combined with speed-binning have been employed to partially compensate for variations in propagation delay between dice.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. FPGA 7, February 8, 7, Monterey, California, USA. Copyright 7 ACM /7/...$5..

In deep-submicron technology nodes, variations in transistor and wire parameters within the same die are expected to become significant [5, 8]. The parametric difference between two nominally identical features on the same die is partly stochastic and partly correlated, with the correlation depending on physical locality. Importantly, several sources of stochastic variation are intrinsic to the materials used in fabrication [3, ]. Stochastic variability therefore cannot be eliminated by improving the fabrication process, and is in fact predicted to increase relative to other sources of variability [5].

Like other high-performance integrated circuits, Field-Programmable Gate Arrays (FPGAs) are affected by parametric variability. However, their reconfigurability gives FPGAs two distinct advantages over ASIC solutions. Firstly, the actual performance of each FPGA can be measured and characterised by configuring the device with Built-In Self-Test (BIST) circuits. Secondly, in theory it is possible to compensate for, or even make use of, the variations in performance by adapting the application circuit based on the measured parameters of the target FPGA (see for example [8]).

There are a number of ways in which a circuit could be made adaptive to within-FPGA variations in performance.
The approach taken has a significant impact on the development of parametric test techniques, circuit design methods and tools. It is crucially important to quantify the performance improvement a given approach is expected to provide.

The novel contributions of this paper are: (a) a discussion of generalised reconfiguration-based strategies for variation-adaptive circuits in FPGAs (in Section 3), (b) analytical models based on the statistical theory underlying each strategy, as well as statistical static timing analysis and worst-case design (Section 4), (c) comparisons of the various techniques using the models, verified by Monte Carlo simulations (Section 5). This fundamental theoretical research provides a basis for exploration of variability-adaption techniques.

2. BACKGROUND

The manufacture of high-performance digital integrated circuits requires rigorous control over many process variables, each of which influences propagation delay to a different extent [5]. Deviations from nominal values in process variables can be systematic or stochastic [4, ]. The effect can be localised to a few transistors, a die, a wafer or an entire lot. Systematic variations induce a shift in circuit parameters, sources of which include, for example, mask errors due to inaccuracies in the process model, lithographic off-axis focusing errors and reticle stepper alignment errors. Stochastic variations cause circuit parameters to increase in spread, and come from sources such as vibrations during lithography, wafer unevenness, and non-uniformity in resist thickness.

Importantly, some sources of stochastic variability are not caused by imperfections in the fabrication process but are the result of the discrete or granular nature of materials at nanometre scales. Such sources of variation are termed intrinsic and include line-edge roughness, random discrete dopants, and oxide thickness fluctuations [3, ]. As these sources cannot be corrected by improving the process, they must be compensated for by new devices or novel circuit and system level design techniques.

Within-die variability exacerbates verification complexity. The conventional approach to verifying circuit designs in the presence of variation uses static timing analysis coupled with corner-case or Monte Carlo simulations. This is feasible if parameters are constant over the entire circuit. In the extreme case, where every transistor and wire in the design has parameters which vary independently, the dimensionality of the parameter space becomes too large for simulation-based methods to be practical.

Statistical static timing analysis (SSTA) is a promising new approach to timing analysis which incorporates the effects of within-die variations [5, 9]. The analysis can consider complete end-to-end signal paths or propagate statistically described delays block-by-block through the circuit. Lin, Hutton and Le recently applied SSTA to enhance placement and routing techniques in FPGAs [3].

Some research has been reported on reducing variability in FPGA architectures. Nabaa et al. describe a self-characterising and adaptive FPGA which compensates for variability using body-biasing [4]. Wong et al.
determine yields using SPICE and numerical methods, and use the information to investigate the effect of LUT and cluster sizes []. The work presented in this paper uses analytical models; moreover, we examine the effect of end-user reconfiguration strategies on yield.

Delay variability testing is similar to delay fault detection, which is performed using at-speed tests. In FPGAs, at-speed testing can be performed by Built-In Self-Test circuits, which can be design-independent [, 6] or design-specific [7, ]. Published work describing at-speed testing for within-FPGA variability includes Li et al. [] and Katsuki et al. [9]. Li et al. used an array of ring oscillators to detect process variations in commercially available FPGAs from Xilinx. Katsuki et al. made measurements on a custom-designed 9nm LUT array, and also describe a yield enhancement scheme, whereby placement is optimised based on the measured LUT variations [8].

3. STRATEGIES

This section describes several strategies for variability-adaptive design in FPGAs. The focus is on delay variations. For reference, a typical signal path in an FPGA is shown in Figure 1. A path comprises a number of elements, which can be considered to be individual basic features (such as LUTs, interconnect switches or wire segments) or groups of basic features. The propagation delay of the path is the sum of the element delays.

Figure 1: A signal path is made up of a number of elements, each contributing to the signal propagation delay.

Note that all strategies focus on stochastic rather than correlated variation. The speed-binning process gives a guarantee on the maximum delay of the slowest element in the FPGA.
Since we do not assume any knowledge about the nature of the correlated variation in the FPGA, to be conservative it is necessary to assume that there are no parts of the FPGA which are significantly faster; in other words, the correlated variation is negligible.

For completeness, worst-case timing analysis is considered first. In this case, parametric yield is a manufacturing issue for the FPGA vendor. The remaining techniques are based on achieving a design-specific yield determined by the end-user.

How the variability-adaptive strategies are executed in practice is an important consideration. However, before embarking on a practical implementation of any approach, it is critically important to know its theoretical limitations. The knowledge obtained from the theoretical investigation can then be used to pursue the development of the most profitable strategy. This paper concentrates on the theory underpinning each approach.

Nevertheless, some practical constraints are assumed. In particular, it will be necessary for the end-user to be able to perform at-speed testing of the design in the FPGA, most likely using BIST techniques to avoid the expense of automatic test equipment. Moreover, it is assumed that BIST will indicate whether a given path or circuit passes or fails the at-speed test. It is improbable that path delays would be quantified, much less delays of individual LUTs and wires, using BIST.

3.1 Worst-case timing

The simplest timing strategy is to assign an upper bound on the value of the delay for each path element in the die. These worst-case values must take into account all sources of variability: within-die and between dice. In speed-binned FPGAs, the delay values can be determined during at-speed testing. If there is little within-die variability, testing is straightforward: each die can be characterised by measuring the delay of a localised test structure.
However, if within-die variation is a significant part of overall delay variability, the test coverage must be either exhaustive, or at least sufficiently comprehensive to enable statistically reliable bounds on the slowest element in the die.

Footnote (to the statement in Section 3 that the propagation delay of a path is the sum of the element delays): Provided the signal path does not include chains of unbuffered wire segments, which would result in a more complex path delay function. In practice, FPGA routing is buffered, so this is a valid assumption.
Using worst-case timing, all parametric yield issues are the responsibility of the FPGA vendor; end-user designs are guaranteed to operate correctly at-speed. The designed delay for a signal path is the sum of the worst-case delays of the individual elements. Under stochastic variation, it is improbable that all elements in a path will exhibit near worst-case delays, making the design overly conservative.

3.2 Statistical static timing analysis

Research into statistical static timing analysis applied to FPGA designs has only recently begun [3]. It is instructive to examine the use of SSTA in FPGAs, particularly as a comparison for the reconfiguration-based techniques described below.

Statistical static timing analysis improves on worst-case timing by taking into account the probability that each path element has a given delay. The delay of a complete path can therefore be statistically described. Conventional STA will identify a single path in a given circuit implementation as the critical path, having the least slack. However, when implemented in FPGAs with significant within-die variation, the critical path may differ from device to device based on the variation in each die. Using SSTA, a circuit can be designed such that required timing is achieved at a given yield, taking into consideration all paths, not just the path with the least nominal slack.

SSTA can be path-based or block-based. A path-based scheme examines complete end-to-end signal paths separately, making it highly accurate but computationally expensive. Block-based schemes are faster and resemble conventional static timing analysers, in that maximum delay values are propagated through leaf nodes of a path in parallel. This is less accurate as the maximum of statistically described values can in general only be estimated.

Theoretically, using SSTA in FPGAs requires at-speed testing of all end-user products.
The tests would need to be specific to the end-user designs. Testing could be neglected by selecting a sufficiently high design yield, such that the risk of not testing is acceptably low, a decision that would be application and market dependent. This strategy, although not using the full benefit of SSTA, nevertheless would outperform worst-case timing, and would be more amenable to in-field upgrades. Ultimately, statistical static timing analysis enables a trade-off between product parametric yield and speed.

3.3 Multiple configurations

We now examine the class of strategies which makes use of the reconfigurability of the FPGA, the first of which is predicated on the use of multiple implementations of the same circuit design. Statistically, a given implementation of a circuit has a certain probability of passing at-speed testing when configured on an FPGA. If several implementations of the circuit are generated, then there is an increased probability that at least one of the implementations will meet at-speed requirements.

In this context, a circuit implementation is stored as a configuration bitstream. All configurations are functionally identical, and could be generated from the same netlist if the placement and routing of the netlist differs between configurations. Ideally, each configuration uses a different set of resources for the critical path (or near-critical paths); if resource usage is highly correlated between configurations the effectiveness of this technique is diminished.

Figure 2: Relocating regions in an FPGA. (Panel labels: swap modules; shift into spare space; move to spare region.)

This strategy requires a specific at-speed test for each configuration. Tests are run one by one in a given FPGA until a configuration is found which passes or all configuration options are exhausted. In the latter case, the FPGA is failed.

The multiple-configuration strategy adds a degree of freedom (the number of configurations) to the design space, in addition to parametric yield and speed.
The design will therefore theoretically outperform statistical static timing analysis.

The approach has limitations. Several circuit configurations and test configurations must be generated and stored. Storage can be particularly problematic if the strategy is implemented on-line in an embedded system. Design constraints will generally preclude completely uncorrelated configurations.

3.4 Region relocation

The second strategy which exploits reconfiguration involves reconfiguring and relocating parts of a complete circuit. The premise is similar to the multiple configuration case: different implementations of the same circuit increase the probability that there exists one implementation that passes at-speed testing. In this case, instead of completely reconfiguring the FPGA, different configurations are created by partitioning the circuit into modules and then assembling the modules in different ways. Different approaches to this strategy are illustrated in Figure 2.

Fundamentally, the circuit design must be sufficiently modular such that critical (or near-critical) paths are encapsulated in module blocks. Moreover, the design must support some degree of relocation of the modular blocks. This can include, for example, swapping the location of modules in the FPGA, or shifting modules into unused areas. With appropriate constraints on the circuit design, it is possible to store the implemented circuit modules as partial bitstreams, and perform module relocation using dynamic reconfiguration [6]. A relocatable at-speed test configuration is required for each module.

Compared with the multiple configuration case, the amount of bitstream storage required is reduced, and the implementation of the circuit only needs to be generated once.

The approach has some limitations. Implementing relocatable modular circuits increases the complexity of system design, in particular in the connectivity between modules.
Moreover, while there are many ways to assemble circuit modules to form different implementations of the system, the implementations are clearly not all independent. The space of potential solutions is therefore large and not trivial to search.

3.5 Path reconfiguration

A signal path, when implemented on a particular FPGA, may fail at-speed testing. Rather than discard the circuit, it may be possible to reduce the propagation delay by making adjustments to the way the path is implemented. A change to a signal path is called isochronal if, in the absence of within-die variability, the path delay would remain the same. Examples of isochronal changes can include altering the equation implemented by a LUT, swapping off-path LUT inputs, or even coarse alterations such as rerouting the path through a different logic cell or component. Although such changes are nominally isochronal, when within-die variations are present the path delay will be affected. It is possible that the delay can be reduced, such that the path meets speed requirements.

4. MODELLING AND ANALYSIS

In the preceding section, several broad strategies for variability-aware design were described. Before pursuing an implementation of any particular strategy, it is expedient to determine quantitatively the benefits the approach will provide. This section presents an analysis of each of the strategies of Section 3. The objective is to determine bounds on the relative yield or speed improvement of each approach.

In the models which follow, die-to-die variation is ignored as it can be accounted for by speed-binning. It is assumed that any within-die variation present is stochastic in nature; correlated within-die variation is negligible.

4.1 Notation

Some of the notation used in the following analysis is listed in Table 1. Other notation will be introduced as necessary.

  Symbol                          Meaning
  $T$                             Target path delay
  $D$                             Cumulative delay distribution
  $\mu_\pi$                       Mean delay of a path $\pi$
  $\sigma_\pi^2$                  Variance of the delay of a path $\pi$
  $\tau = T - \mu_\pi$            Target path delay relative to the mean
  $d_i = \mu_i + X_i$             Delay of path element $i$
  $X_i$                           Random variable of delay of element $i$
  $Z_\pi = \sum_i^N X_i$          Random variable of delay of path $\pi$
  $N$                             Number of elements in a path
  $P$                             Number of paths in a circuit
  $Y$                             Die yield

Table 1: Notation used in the analysis.

The error function, erf, has the usual definition:

$$\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-x^2}\,dx$$

The complementary error function, erfc, is defined as $\operatorname{erfc}(z) = 1 - \operatorname{erf}(z)$.

4.2 Worst-case timing

An FPGA has many types of primitives which may form elemental parts of a signal path, such as LUTs, wires, interconnect switch points, multipliers, and so on. For timing modelling, a path element is composed of the smallest interesting segment of a signal path, such as a LUT together with the input and output wiring. Parametric yield modelling can be applied to one or more different elemental types.

Assume there are $K$ types of element of interest for parametric yield in an FPGA, $L_k$ elements of type $k$, and the delay through any given type $k$ element is normally distributed: $N(\mu_k, \sigma_k^2)$. The cumulative probability distribution for the delay of a type $k$ element is:

$$D_k(d_k) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{d_k - \mu_k}{\sqrt{2}\,\sigma_k}\right)\right] \tag{1}$$

The parametric yield of an element type, $Y_k$, is an order statistic which depends on $L_k$, the total number of elements of that type in the FPGA. The manufacturing parametric yield of the FPGA is:

$$Y = \prod_{k=1}^{K} Y_k = \prod_{k=1}^{K} \left[D_k(d_k)\right]^{L_k} \tag{2}$$

We will assume that the yield of each elemental type is balanced, such that $Y_k = Y^{1/K}$. If this is the case, the designed-for delay of a signal path of $N$ elements is:

$$T = \sum_{i=1}^{N} d_i = \sum_{i=1}^{N} \left[\mu_i + \sqrt{2}\,\sigma_i \operatorname{erf}^{-1}\left(2Y^{1/(L_i K)} - 1\right)\right] \tag{3}$$

where the $i$th element has mean delay $\mu_i$ and variance $\sigma_i^2$, dependent on its type, and $L_i$ denotes the number of elements of that type in the FPGA. Considering the case where parametric yield is applied to LUTs only, for an FPGA with $L$ LUTs, the relative target delay for a given yield $Y$ is:

$$\frac{T - \mu_\pi}{\sigma_\pi} = \sqrt{2N}\,\operatorname{erf}^{-1}\left(2Y^{1/L} - 1\right) \tag{4}$$

4.3 Statistical static timing analysis

Path-based statistical static timing analysis is examined here, as it is generally more accurate than block-based SSTA.
For a single path $\pi$ in a given circuit implementation, with mean delay $\mu_\pi$ and variance $\sigma_\pi^2$, the yield of the path will be:

$$Y_\pi = D_\pi(T) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{T - \mu_\pi}{\sqrt{2}\,\sigma_\pi}\right)\right] \tag{5}$$

In general, a circuit implementation will have a number of paths which will contribute to yield loss. The impact each path has on the die yield is related to the path delay mean and variance. For simplicity, assume that $P$ of the paths have sufficiently little delay slack such that they impact on yield; these we will label near-critical paths. The remaining paths in the circuit have a negligible effect on yield. Moreover, assume each near-critical path has the same mean delay and variance, and therefore the same yield. To be consistent, the same approximation will be made when analysing the other strategies.

The yield of the circuit is the product of the yields of the paths:

$$Y = D_\pi(T)^P \tag{6}$$

The relative target delay for a given yield is:

$$\frac{T - \mu_\pi}{\sigma_\pi} = \sqrt{2}\,\operatorname{erf}^{-1}\left(2Y^{1/P} - 1\right) \tag{7}$$
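As a rough numerical illustration of (5) to (7), with the worst-case expression (4) included for comparison, the following sketch evaluates the closed forms using only the Python standard library. It relies on the identity $\sqrt{2}\,\operatorname{erf}^{-1}(2p - 1) = \Phi^{-1}(p)$, where $\Phi$ is the standard normal cumulative distribution. The function names and the example values of $Y$, $P$, $N$ and $L$ are assumptions made for illustration; they are not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal: Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
_STD = NormalDist()


def path_yield(t_rel: float) -> float:
    """Eq. (5): yield of one path, with t_rel = (T - mu_pi) / sigma_pi."""
    return _STD.cdf(t_rel)


def ssta_yield(t_rel: float, num_paths: int) -> float:
    """Eq. (6): die yield for P independent near-critical paths."""
    return path_yield(t_rel) ** num_paths


def ssta_relative_delay(target_yield: float, num_paths: int) -> float:
    """Eq. (7): (T - mu_pi) / sigma_pi required for a given die yield."""
    return _STD.inv_cdf(target_yield ** (1.0 / num_paths))


def worst_case_relative_delay(target_yield: float, path_len: int,
                              num_luts: int) -> float:
    """Eq. (4): worst-case relative delay, LUTs only, N elements per path."""
    return sqrt(path_len) * _STD.inv_cdf(target_yield ** (1.0 / num_luts))


if __name__ == "__main__":
    # Hypothetical example values (assumptions, not from the paper).
    Y, P, N, L = 0.9, 100, 5, 10_000
    print("SSTA relative delay:      ", ssta_relative_delay(Y, P))
    print("Worst-case relative delay:", worst_case_relative_delay(Y, N, L))
```

Both results are expressed in units of the path standard deviation $\sigma_\pi$, so they can be compared directly; for the assumed values the worst-case margin is the larger of the two.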
Equations (6) and (7) assume that all the $P$ paths are independent and separate. If the near-critical paths share segments, the yield will be higher than (6), as the effective number of paths ($P_{\mathrm{eff}}$) is lower. In the limit, $P_{\mathrm{eff}} \to 1$ as the correlation between paths approaches unity.

4.4 Multiple configurations

To be consistent with the SSTA evaluation, it is again assumed that a circuit has $P$ near-critical paths, each with mean delay $\mu_\pi$ and variance $\sigma_\pi^2$. The yield of an individual path is given by (5). For a single configuration, the die yield is given by (6). When multiple independent configurations are available, and the fastest configuration is chosen for each FPGA through at-speed testing, the circuit yield is:

$$Y = 1 - \left[1 - D_\pi(T)^P\right]^C \tag{8}$$

To derive this, note that for an FPGA to fail, it must fail at-speed tests for all $C$ configurations. The probability that it fails for a single configuration is $1 - D_\pi(T)^P$, and therefore the probability that it fails in all configurations is $\left[1 - D_\pi(T)^P\right]^C$.

The relative target delay for a given yield is:

$$\frac{T - \mu_\pi}{\sigma_\pi} = \sqrt{2}\,\operatorname{erf}^{-1}\left\{2\left[1 - (1 - Y)^{1/C}\right]^{1/P} - 1\right\} \tag{9}$$

From (9) it is possible to determine how many independent configurations are required to achieve a required yield $Y$, given a target path delay of $T$:

$$C \geq \frac{\ln(1 - Y)}{\ln\left(1 - \left(\frac{u + 1}{2}\right)^P\right)} \tag{10}$$

where

$$u = \operatorname{erf}\left(\frac{T - \mu_\pi}{\sqrt{2}\,\sigma_\pi}\right) \tag{11}$$

It should be emphasised that this technique, and the analysis, depends on the independence of the configurations. Configurations are independent if the near-critical paths in each configuration use different resources. In practice, this will not be the case. Correlations between configurations have the effect of reducing the effective value of $C$. The limiting case is where each configuration uses exactly the same resources, and the effective value of $C$ is unity.

4.5 Region relocation

The strategy of subdividing the circuit into many separate modular regions which can be assembled in different ways is next considered.
Assume that the modularisation of a circuit creates $R$ identical regions and $R$ subcircuit modules, each of which can be assigned to any of the $R$ regions. Clearly, there are $R!$ possible permutations for placing the subcircuits. The yield of this strategy is the probability of finding at least one assignment within the $R!$ implementations where all subcircuits function (that is, pass at-speed testing). However, unlike the multiple configuration scheme, the implementations are not independent.

Assume that the circuit is subdivided evenly into $R$ subcircuits such that each subcircuit has $P/R$ critical paths. This we will term a balanced division. Unbalanced divisions will be examined later.

Theorem 1. Given a balanced subdivision of a circuit with $P$ near-critical paths into $R$ subcircuit modules, and considering all possible assignments of modules to regions, the yield of the system can be approximated by:

$$Y \approx \left[1 - (1 - q)^R\right]^{2R} \tag{12}$$

where:

$$q = D_\pi(T)^{P/R} = \left\{\frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{T - \mu_\pi}{\sqrt{2}\,\sigma_\pi}\right)\right]\right\}^{P/R} \tag{13}$$

Note that $q$ is the yield of an individual subcircuit module assigned to a single region.

Proof. The subdivision of the circuit is balanced. Therefore, it is reasonable to assume that any given pairing of a subcircuit and a region will have a fixed probability of functioning, given by $q$ in (13). The probability that a given subcircuit does not function in a given region is $(1 - q)$.

The yield of the region relocation scheme can be derived by determining the probability that, of the $R!$ possible assignments of subcircuits to regions, no combination can be found where all subcircuits function. We denote this event (that no combination works) as $E$. It is conjectured that $E$ generally occurs due to one of two scenarios: either there is a subcircuit which does not function in any region, or there is a region in which none of the subcircuits function. Other situations (such as there being two subcircuits which only function in one region) are sufficiently rare that they can be ignored.
For a given subcircuit module $m_i$, the probability that it does not function in any region (event $F_{m_i}$) is:

$$P(F_{m_i}) \equiv P(m_i \text{ cannot be placed}) = (1 - q)^R \tag{14}$$

Since there are $R$ modules, the probability that there is a subcircuit which cannot be placed is:

$$P(F_m) = P\left(\bigcup_i F_{m_i}\right) = 1 - \left[1 - (1 - q)^R\right]^R \tag{15}$$

Similarly, over all $R$ regions, the probability there exists a region in which no subcircuit works is:

$$P(F_r) = P\left(\bigcup_i F_{r_i}\right) = 1 - \left[1 - (1 - q)^R\right]^R \tag{16}$$

where $F_{r_i}$ is the event that no subcircuit functions in region $r_i$. The yield is therefore:

$$Y = 1 - P(E) \tag{17}$$
$$\approx 1 - P(F_m \cup F_r) \tag{18}$$
$$\approx 1 - \left(P(F_m) + P(F_r) - P(F_m)P(F_r)\right) \tag{19}$$

which can be reduced to (12) by substitution of (15) and (16). Note that events $F_r$ and $F_m$ are only weakly dependent for large $R$, and so $P(F_m \cap F_r) \approx P(F_m)P(F_r)$.

The relative target delay given a yield $Y$ is:

$$\frac{T - \mu_\pi}{\sigma_\pi} = \sqrt{2}\,\operatorname{erf}^{-1}\left\{2\left[1 - \left(1 - Y^{1/(2R)}\right)^{1/R}\right]^{R/P} - 1\right\} \tag{20}$$

If the subdivision of the circuit is unbalanced, the yield will be reduced. The limiting case is where all $P$ near-critical paths are allocated to a single subcircuit. There are then $R$ different placements of this subcircuit, while the placement of the remaining subcircuits is irrelevant. The yield is then similar to the multiple configuration case:

$$Y_{\mathrm{unbalanced}} = 1 - \left[1 - D_\pi(T)^P\right]^R \tag{21}$$

4.6 Path reconfiguration

The path reconfiguration scheme is complex to analyse. The analysis can be approached by treating the strategy as an iterative process: repeated alterations are made to an initial circuit implementation. The result is an estimation of yield after a number of iterations. The drawback is that it is not possible to derive a canonical expression for the relative delay for a given yield.

Before beginning the analysis, an overview of the approach is as follows. The initial yield of a circuit implementation is determined by statistical STA. The paths which fail make up some proportion of the total number of paths (see Figure 3(a)). Path reconfiguration is applied to the failing paths. The reconfigured paths have a different delay distribution, as depicted in Figure 3(b). The resulting yield is calculated from the mean and variance of the reconfigured paths.

Figure 3: The probability density function (PDF) of stochastic path delay $Z_\pi$ before and after reconfiguration of a single path element. (a) Initial PDF of path delay. (b) After path reconfiguration.

To begin with, consider a single path $\pi$ in a circuit implementation, for example as depicted in Figure 1. As described in the worst-case timing analysis, the path is constructed from a number of elements, each of which contributes a delay $d_i$ to the overall path delay $d_\pi$:

$$d_\pi = \sum_i d_i \tag{22}$$

A path will fail at-speed testing if $d_\pi > T$. The delay $d_i$ of an element is a random variable; each element may have a different mean $\mu_i$ and variance $\sigma_i^2$. The mean path delay $\mu_\pi$ is a function of the design and die-to-die variation, while within-die stochastic variability affects the variance of each elemental delay. To make it easier to isolate the stochastic delay variability, define:

$$X_i = d_i - \mu_i \tag{23}$$
$$Z_\pi = d_\pi - \mu_\pi = \sum_i X_i \tag{24}$$
$$\tau = T - \mu_\pi \tag{25}$$

We will assume that the variables $X_i$ are normally distributed with mean $0$ and variance $\sigma_i^2$. The at-speed test will pass the path if $Z_\pi < \tau$.

The aim of path reconfiguration is to improve the speed of a path by making isochronal changes (using reconfiguration) to elements in the path. An isochronal change, by definition, does not affect the mean delay of the path $\mu_\pi$. Instead, a change to element $i$ results in a new value for $X_i$, which we will denote $X_i'$.

The reconfigured paths will have a new mean $\mu_{\pi,r}$ and variance $\sigma_{\pi,r}^2$. The mean is given by:

$$\mu_{\pi,r} = E\{Z_\pi \mid Z_\pi > \tau\} - E\{X_i \mid Z_\pi > \tau\} + E\{X_i'\} \tag{26}$$

Assuming that there is no dependency between the original element delay and the new element delay, the expected value $E\{X_i'\} = 0$. Therefore, on average, after the change, the path delay will have decreased by the value of the original elemental stochastic delay $X_i$. The first term of (26) is straightforward to find:

$$E\{Z_\pi \mid Z_\pi > \tau\} = \sqrt{\frac{2}{\pi}}\,\frac{\sigma_\pi \exp\left(-\frac{\tau^2}{2\sigma_\pi^2}\right)}{\operatorname{erfc}\left(\frac{\tau}{\sqrt{2}\,\sigma_\pi}\right)} \tag{27}$$

In order to quantify the improvement in path speed, we need to determine the expected value of $X_i$ given that the path failed at-speed testing. Following this, we need to quantify the variance of the reconfigured paths, so that the new yield can be determined.

Theorem 2. The expectation value $E\{X_i \mid Z_\pi > \tau\}$ can be approximated by:

$$E\{X_i \mid Z_\pi > \tau\} \approx \frac{\sigma_i^2}{2\sigma_\pi^2}\left(\tau + \sqrt{\tau^2 + \frac{8\sigma_\pi^2}{\pi}}\right) \tag{28}$$

where $\sigma_\pi^2 = \sum_i \sigma_i^2$.

Proof. To start, let us find the probability density function $P(X_i = x \mid Z_\pi > \tau)$. This is the probability that a given stochastic delay will be $x$ given that the path failed at-speed testing.
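By Bayes' Rule:

$$P(X_i = x \mid Z_\pi > \tau) = \frac{P(Z_\pi > \tau \mid X_i = x)\,P(X_i = x)}{P(Z_\pi > \tau)} \tag{29}$$

The terms $P(X_i = x)$ and $P(Z_\pi > \tau)$ are straightforward results as the variables $X_i$ and $Z_\pi$ are (assumed to be) normally distributed:

$$P(X_i = x) = \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left(-\frac{x^2}{2\sigma_i^2}\right) \tag{30}$$

$$P(Z_\pi > \tau) = \frac{1}{2}\operatorname{erfc}\left(\frac{\tau}{\sqrt{2}\,\sigma_\pi}\right) \tag{31}$$

The remaining term $P(Z_\pi > \tau \mid X_i = x)$ is also normally distributed:

$$P(Z_\pi > \tau \mid X_i = x) = P(Z_\rho > \tau - x) \tag{32}$$
$$= \frac{1}{2}\operatorname{erfc}\left(\frac{\tau - x}{\sqrt{2}\,\sigma_\rho}\right) \tag{33}$$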
where $Z_\rho$ is the partial sum $Z_\rho = \sum_{j \neq i} X_j$, and $\sigma_\rho^2 = \sum_{j \neq i} \sigma_j^2$. Making these substitutions:

$$P(X_i = x \mid Z_\pi > \tau) = \frac{\operatorname{erfc}\left(\frac{\tau - x}{\sqrt{2}\,\sigma_\rho}\right) \exp\left(-\frac{x^2}{2\sigma_i^2}\right)}{\sqrt{2\pi}\,\sigma_i \operatorname{erfc}\left(\frac{\tau}{\sqrt{2}\,\sigma_\pi}\right)} \tag{34}$$

The expected value is found from the integral:

$$E\{X_i \mid Z_\pi > \tau\} = \int_{-\infty}^{\infty} x\,P(X_i = x \mid Z_\pi > \tau)\,dx \tag{35}$$

Unfortunately, this is unsolvable analytically. However, by inspection of (34), it may be noted that the function $P(X_i = x \mid Z_\pi > \tau)$ is approximately Gaussian in shape. Therefore, it is possible to estimate the expected value by determining the location of the peak in the function. This is achieved in the standard way, by determining where the differential of (34) is zero. The following approximation is used to simplify the result:

$$\operatorname{erfc}(u) \approx \frac{2}{\sqrt{\pi}}\,\frac{\exp(-u^2)}{u + \sqrt{u^2 + 4/\pi}} \tag{36}$$

and (28) follows.

Thus far, we have found the average decrease in the delay of a failing path after the isochronal change. In order to determine the yield of the path reconfiguration scheme when applied to a large number of paths, it is also necessary to know the variance of the delay after the change. It is then possible to determine the fraction of reconfigured paths which still fail to meet the timing requirement, as shown in Figure 3(b).

The variance of a reconfigured path is determined by finding the variance of the part of the path not being reconfigured, and adding the variance of the reconfigured element:

$$\operatorname{Var}\{Z_\pi'\} = \operatorname{Var}\{Z_\rho \mid Z_\pi > \tau\} + \operatorname{Var}\{X_i'\} \tag{37}$$

Here, $Z_\rho$ is again the sum of the stochastic delays for the elements not undergoing reconfiguration. Note that $\operatorname{Var}\{X_i'\} = \sigma_i^2$.

Theorem 3. The variance $\operatorname{Var}\{Z_\rho \mid Z_\pi > \tau\}$ can be approximated by:

$$\sigma_{\pi,r}^2 = \operatorname{Var}\{Z_\rho \mid Z_\pi > \tau\} \approx \sigma_i^2\left(1 - \frac{\sigma_i^2}{\sigma_\pi^2}\right) \tag{38}$$

The proof of Theorem 3 is omitted for reasons of limited space; it follows from the derivation of Theorem 2.

The resulting path yield after each iteration of path reconfigurations is the yield of the previous iteration plus the yield of the reconfigured paths.
$$Y_{\pi,r} \leftarrow Y_{\pi,r} + (1 - Y_{\pi,r}) \cdot \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{\tau - \mu_{\pi,r}}{\sigma_{\pi,r} \sqrt{2}}\right)\right] \tag{39}$$

The yield of the chip is a function of the path yield and the number of independent critical paths $P$ in the design:

$$Y_{chip,r} = Y_{\pi,r}^{P} \tag{40}$$

Again, where critical paths are correlated (that is, they share segments) the effective value of $P$ is reduced.

4.7 Summary

The derived expressions for the yields of the different strategies are summarised in Table.

Table : Summary of the analysis.

Strategy: Yield
SSTA: $Y = D_\pi(T)^P$
Mult. conf.: $Y = 1 - \left[1 - D_\pi(T)^P\right]^C$
Region reloc.: $Y \le \left[1 - \left(1 - D_\pi(T)^{P/R}\right)^R\right]^R$
Path reconf.: $Y_r = Y_{\pi,r}^P$, with $Y_{\pi,r} \leftarrow Y_{\pi,r} + (1 - Y_{\pi,r})\, D_{\pi,r}(\tau)$

where

$$D_\pi(T) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{T - \mu_\pi}{\sigma_\pi \sqrt{2}}\right)\right], \qquad D_{\pi,r}(\tau) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{\tau - \mu_{\pi,r}}{\sigma_{\pi,r} \sqrt{2}}\right)\right]$$

5. EXPERIMENTAL RESULTS

This section contains results of simulations and experiments, with two objectives. Primarily, it is of interest to examine and compare the relative yield enhancement offered by the alternative strategies under different conditions. Moreover, the expressions derived in the analysis of the previous section are verified using Monte Carlo simulations.

The worst-case strategy is examined first. Assuming an FPGA of moderate size (around 7 logic elements) and parametric yield applied to LUTs only, die yield curves are plotted against target elemental delay in Figure 4. The target delay is normalised to the mean LUT delay. The different curves describe different amounts of LUT delay variability, from a standard deviation of % of the mean up to % of the mean. It can be seen that an increase in variability has a significant impact on the achievable speed of the device.

Figure 4: Die yield for a target elemental delay. Four different values of elemental variability are plotted, covering a standard deviation of % to % of the mean delay.

For the remaining graphs in this section, the elemental variation is set to $\sigma_i$ = %, a realistic value for FPGAs fabricated in 90nm technology [17]. Using statistical static timing analysis (SSTA) provides a significant improvement, as shown in Figure 5.
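The yield expressions summarised in Section 4.7 are simple enough to evaluate directly. The following is a minimal sketch, not the authors' simulation code: the function names are hypothetical, path delays are assumed normally distributed, and paths and configurations are assumed independent, as in the analysis. It covers the SSTA yield, the multiple configuration yield, and a single update step of the path reconfiguration yield in (39).

```python
import math


def normal_cdf(t, mu, sigma):
    """D(t): probability that a normally distributed delay is at most t."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))


def yield_ssta(d_path, num_paths):
    """SSTA yield: all P independent critical paths must meet the target."""
    return d_path ** num_paths


def yield_multi_config(d_path, num_paths, num_configs):
    """Multiple configuration yield: at least one of C independent
    configurations must meet timing on all of its P paths."""
    return 1.0 - (1.0 - yield_ssta(d_path, num_paths)) ** num_configs


def path_reconf_update(y_path, d_reconfigured):
    """One iteration of (39): paths that already pass keep passing, and a
    fraction d_reconfigured of the failing paths pass after reconfiguration."""
    return y_path + (1.0 - y_path) * d_reconfigured
```

With a single configuration the multiple configuration expression reduces to the SSTA yield, which matches the limiting case of fully correlated configurations described in the text. The values of `mu`, `sigma`, `num_paths` and `num_configs` are inputs chosen by the reader; none are taken from the paper's figures.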
The yield curves in Figure 5 are plotted for circuits with differing numbers of critical paths, from to 5, each comprising five elements. Note in particular the scale of the x-axis compared to Figure 4. Indicated on the graph is a timing/yield point for a design with critical paths, showing that if the target path delay is chosen to be .95 the mean path delay, 85% of dice will be able to achieve this or better. Each critical path in this scenario is constructed from five elemental components, and each path is disjoint; there are no common elements shared by any two critical paths. Where critical paths do share elemental components their delay is correlated, and the effective number of paths is reduced, shifting the yield curve to the left. The limiting case, where all critical paths are completely correlated, is the curve for a single critical path, P = 1. The remaining results in this section assume a circuit with the equivalent of 5 uncorrelated paths of near-critical delay (P = 5). Further work is required to ascertain the value of P for an arbitrary circuit.

Figure 5: Circuit yield vs target path delay using statistical static timing analysis. The curves represent different numbers of critical paths P in the circuit, from to 5.

Next, the multiple configuration strategy is examined. In Figure 6, yield curves are plotted for a design with between two and ten independent configurations. Observe the dashed line indicating the yield curve for statistical static timing analysis applied to the same design. It can be seen that the multiple configuration scheme provides some improvement over SSTA alone. However, it should also be noted that if the configurations are not independent the improvement is reduced; in the limiting case (where all configurations are fully correlated) the yield is the same as SSTA.

Figure 6: Circuit yield vs target path delay in the multiple configuration strategy.

When using this strategy to improve performance, it is desirable to know how many independent configurations would be required to meet a given yield and timing target. Figure 7 plots the number of configurations needed to achieve a target yield at a range of different path timing targets. The timing targets are expressed in terms of the mean delay and standard deviation of the path, and range from $\mu_\pi$ + .75$\sigma_\pi$ (aggressive) to $\mu_\pi$ + 4.5$\sigma_\pi$ (conservative). It can be observed that the required number of configurations escalates rapidly for modestly more aggressive timing, particularly when the required yield rate is high.

Figure 7: The minimum necessary number of configurations required to achieve a target yield and path timing in the multiple configuration scheme.

Yield curves for the region relocation strategy are depicted in Figure 8. The number of regions graphed ranges from 4 to . Note that for a large number of regions, the number of critical paths in the design (P = 5) cannot be subdivided evenly between the regions. This results in a divergence between the simulations and the theoretical results. The limiting scenario for unbalanced subdivisions is plotted for the four-region case only (dashed curve). This would result when all critical paths were contained within a single region.

Figure 8: Circuit yield vs target path delay in the region relocation strategy. The theoretical upper bound (for balanced subdivisions) is shown for different numbers of regions from 4 to . The lower (non-balanced) bound is shown for four regions.

As with the multiple configuration strategy, it is of interest to know how many regions would be necessary to achieve a given target yield and timing. Figure 9 graphs this information. Note that the timing targets are more aggressive than in the multiple configuration case, ranging from $\mu_\pi$ + .5$\sigma_\pi$ (aggressive) to $\mu_\pi$ + 3.$\sigma_\pi$ (conservative).

Figure 9: The minimum necessary number of regions required to achieve a target yield and path timing in the region relocation scheme.

The outcome of the final strategy, path reconfiguration, is shown in Figure 10. The initial yield curve (dashed line) is that achieved by statistical STA. As noted earlier, the 5 critical paths in the design are made up of five elements each. The curves plotted in Figure 10 show the effect after successive iterations, where each failing path has one element reconfigured. For this graph, the delay distributions of all elements are identical. The improvement is underestimated (particularly noticeable in the fourth and fifth iterations) because of the approximations made in the analysis.

Figure 10: Circuit yield vs target path delay in the path reconfiguration strategy. The yield is plotted after reconfiguring each of the five elements forming the path.

The yields of all strategies are compared in Figure 11 and Figure 12. The first graph plots the extreme cases for a design with 5 uncorrelated critical paths each comprising 5 elements. Taking as an example the 85% yield point, and compared to the reference worst-case design, the strategies are enumerated from slowest to fastest:

1. SSTA provides a 3.% improvement in the path delay over worst-case design.
2. The multiple configuration strategy achieves a 34.8% improvement with independent configurations.
3. Reconfiguring up to all five elements per path results in a 36.6% improvement.
4. Region relocation provides a 44.7% improvement when the design is divided into balanced subcircuits.

Figure 11: A comparison of the yields of the different strategies.

The graph of Figure 12 is an amalgamation of the theoretical yields of the reconfiguration techniques compared with SSTA. Here, the expected circuit timing is plotted against target yield. The timing is expressed as an offset from the mean path delay ($\mu_\pi$) in terms of the number of standard deviations of path delay ($\sigma_\pi$). For example, a 99% chip yield using SSTA only would mean that the slowest expected path would be 3.54$\sigma_\pi$ slower than the average path delay. From the graph, it can be observed that multiple configurations and path reconfiguration provide a similar range of improvement over SSTA, particularly at high target yields. Adding more configurations provides diminishing returns. The region relocation strategy outperforms both with relatively few regions (eight in this case).

Figure 12: Relative timing vs yield for the reconfiguration strategies, compared with SSTA.
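The relationship between target yield and timing offset discussed above can be sketched numerically by inverting the SSTA and multiple configuration yield expressions. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names are hypothetical, path delays are taken as normal and independent, and the path count P = 50 used in the usage note is an assumption, since the count is garbled in this transcription.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit standard deviation


def offset_for_yield(target_yield, num_paths, num_configs=1):
    """Timing offset k, in units of sigma_pi above mu_pi, at which the
    target chip yield is reached with C independent configurations."""
    if not 0.0 < target_yield < 1.0:
        raise ValueError("target_yield must lie strictly between 0 and 1")
    # Yield each configuration needs so that at least one of C passes.
    per_config = 1.0 - (1.0 - target_yield) ** (1.0 / num_configs)
    # Yield each of the P independent paths needs within a configuration.
    per_path = per_config ** (1.0 / num_paths)
    return STD_NORMAL.inv_cdf(per_path)


def configs_needed(target_yield, k, num_paths):
    """Smallest number of independent configurations that reaches the
    target yield when the timing target is mu_pi + k * sigma_pi."""
    single = STD_NORMAL.cdf(k) ** num_paths
    if single >= target_yield:
        return 1
    if single <= 0.0:
        raise ValueError("timing target is unreachable at this precision")
    return math.ceil(math.log(1.0 - target_yield) / math.log(1.0 - single))
```

Under the assumed P = 50, `offset_for_yield(0.99, 50)` evaluates to roughly 3.54, in line with the SSTA point quoted in the text. Calling it with `num_configs` set to 2, 3 and 4 gives successively smaller reductions in the offset, which is the diminishing return noted above.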
6. CONCLUSIONS AND FUTURE WORK

Within-die delay variation will become increasingly significant in future technology nodes. This paper presented three strategies for compensating for variability by exploiting the reconfigurability of FPGAs. The techniques involve reconfiguring the entire FPGA; relocating subcircuits to different regions within the FPGA; and reconfiguring individual signal path elements. Using probability theory, the yield of each approach was modelled and compared with statistical static timing analysis and worst-case design, demonstrating the benefits of reconfiguration-based techniques.

The analysis presented in this paper provides a foundation from which to explore delay variability adaptive design in FPGAs. In addition to enhancing the verification of this work through more accurate simulations, we plan to refine the simplifying assumptions such as path and configuration independence. Moreover, we will investigate the implementation of the delay adaptive strategies presented.

7. ACKNOWLEDGEMENTS

The authors are grateful for the financial support of the UK Engineering and Physical Sciences Research Council (Platform Grant EP/C54948/). Thanks also to Dr. R. Sedcole for advice and suggestions on statistical theory.

8. REFERENCES

[1] M. Abramovici and C. E. Stroud. BIST-based delay-fault testing in FPGAs. Journal of Electronic Testing: Theory and Applications, 9(5): , Oct 2003.
[2] A. Asenov, S. Kaya, and A. R. Brown. Intrinsic parameter fluctuations in decananometer MOSFETs introduced by gate line edge roughness. IEEE Trans. Electron Devices, 5(5):54 6, May 2003.
[3] A. Asenov, S. Kaya, and J. H. Davies. Intrinsic threshold voltage fluctuations in decananometer MOSFETs due to local oxide thickness variations. IEEE Trans. Electron Devices, 49(6): 9, Jun.
[4] Y. Cao, P. Gupta, A. B. Kahng, D. Sylvester, and J. Yang. Design sensitivities to variability: Extrapolations and assessments in nanometer VLSI. In Proc. IEEE International ASIC/SOC Conference.
[5] H. Chang, V. Zolotov, S. Narayan, and C. Visweswariah. Parameterized block-based statistical timing analysis with non-Gaussian parameters, nonlinear delay functions. In Proc. Design Automation Conference, 2005.
[6] P. Girard, O. Héron, S. Pravossoudovitch, and M. Renovell. High quality TPG for delay faults in look-up tables of FPGAs. In Proc. IEEE International Workshop on Electronic Design, Test and Applications, 2004.
[7] I. G. Harris, P. R. Menon, and R. Tessier. BIST-based delay path testing in FPGA architectures. In Proc. IEEE International Test Conference.
[8] K. Katsuki, M. Kotani, K. Kobayashi, and H. Onodera. A yield and speed enhancement scheme under within-die variations on 90nm LUT array. In Proc. IEEE Custom Integrated Circuits Conference, 2005.
[9] K. Katsuki, M. Kotani, K. Kobayashi, and H. Onodera. Measurement results of within-die variations on a 90nm LUT array for speed and yield enhancement of reconfigurable devices. In Proc. Asia and South Pacific Design Automation Conference, 2006.
[10] K. S. Kim, S. Mitra, and P. G. Ryan. Delay defect characteristics and testing strategies. IEEE Design & Test of Computers, (5):8 6, Sept-Oct 2003.
[11] A. Kraśniewski. Evaluation of testability of path delay faults for user-configured programmable devices. In Proc. Field-Programmable Logic and Applications, 2003.
[12] X.-Y. Li, F. Wang, T. La, and Z.-M. Ling. FPGA as process monitor: an effective method to characterize poly gate CD variation and its impact on product performance and yield. IEEE Transactions on Semiconductor Manufacturing, 7(3):67 7, Aug. 2004.
[13] Y. Lin, M. Hutton, and L. He. Placement and timing for FPGAs considering variations. In Proc. Field-Programmable Logic and Applications, 2006.
[14] G. Nabaa, N. Azizi, and F. N. Najm. An adaptive FPGA architecture with process variation compensation and reduced leakage. In Proc. Design Automation Conference, 2006.
[15] S. R. Nassif. Design for variability in DSM technologies. In Proc. IEEE International Symposium on Quality Electronic Design.
[16] P. Sedcole, B. Blodget, T. Becker, J. Anderson, and P. Lysaght. Modular dynamic reconfiguration in Virtex FPGAs. IEE Proceedings Computers and Digital Techniques, 53(3):57 64, May 2006.
[17] P. Sedcole and P. Y. K. Cheung. Within-die delay variability in 90nm FPGAs and beyond. In Proc. IEEE International Conference on Field Programmable Technology, 2006.
[18] C. Visweswariah. Death, taxes and failing chips. In Proc. Design Automation Conference, 2003.
[19] C. Visweswariah, K. Ravindran, K. Kalafala, S. G. Walker, and S. Narayan. First-order incremental block-based statistical timing analysis. In Proc. Design Automation Conference, 2004.
[20] H.-Y. Wong, L. Cheng, Y. Lin, and L. He. FPGA device and architecture evaluation considering process variation. In Proc. International Conference on Computer Aided Design, 2005.
Synthesizing a Representative Critical Path for Post-Silicon Delay Prediction Qunzeng Liu University of Minnesota liuxx575@umn.edu Sachin S. Sapatnekar University of Minnesota sachin@umn.edu ABSTRACT Several
More informationToward More Accurate Scaling Estimates of CMOS Circuits from 180 nm to 22 nm
Toward More Accurate Scaling Estimates of CMOS Circuits from 180 nm to 22 nm Aaron Stillmaker, Zhibin Xiao, and Bevan Baas VLSI Computation Lab Department of Electrical and Computer Engineering University
More informationStatistical Performance Analysis and Optimization of Digital Circuits
Statistical Performance Analysis and Optimization of Digital Circuits by Kaviraj Chopra A dissertation submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy (Computer
More informationEEC 118 Lecture #16: Manufacturability. Rajeevan Amirtharajah University of California, Davis
EEC 118 Lecture #16: Manufacturability Rajeevan Amirtharajah University of California, Davis Outline Finish interconnect discussion Manufacturability: Rabaey G, H (Kang & Leblebici, 14) Amirtharajah, EEC
More informationYield Evaluation Methods of SRAM Arrays: a Comparative Study
IMTC 2 - Instrumentation and Measurement Technology Conference Como, Italy, 2 May 2 Yield Evaluation Methods of SRAM Arrays: a Comparative Study M. Ottavi,L.Schiano,X.Wang,Y-B.Kim,F.J.Meyer,F.Lombardi
More informationBuilt-In Test Generation for Synchronous Sequential Circuits
Built-In Test Generation for Synchronous Sequential Circuits Irith Pomeranz and Sudhakar M. Reddy + Electrical and Computer Engineering Department University of Iowa Iowa City, IA 52242 Abstract We consider
More informationESE 570: Digital Integrated Circuits and VLSI Fundamentals
ESE 570: Digital Integrated Circuits and VLSI Fundamentals Lec 19: March 29, 2018 Memory Overview, Memory Core Cells Today! Charge Leakage/Charge Sharing " Domino Logic Design Considerations! Logic Comparisons!
More informationDesign for Variability and Signoff Tips
Design for Variability and Signoff Tips Alexander Tetelbaum Abelite Design Automation, Walnut Creek, USA alex@abelite-da.com ABSTRACT The paper provides useful design tips and recommendations on how to
More informationTest Pattern Generator for Built-in Self-Test using Spectral Methods
Test Pattern Generator for Built-in Self-Test using Spectral Methods Alok S. Doshi and Anand S. Mudlapur Auburn University 2 Dept. of Electrical and Computer Engineering, Auburn, AL, USA doshias,anand@auburn.edu
More information388 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 30, NO. 3, MARCH 2011
388 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 30, NO. 3, MARCH 2011 Physically Justifiable Die-Level Modeling of Spatial Variation in View of Systematic Across
More informationDefect-Oriented and Time-Constrained Wafer-Level Test-Length Selection for Core-Based Digital SoCs
Defect-Oriented and Time-Constrained Wafer-Level Test-Length Selection for Core-Based Digital SoCs Sudarshan Bahukudumbi and Krishnendu Chakrabarty Department of Electrical and Computer Engineering Duke
More informationPre and post-silicon techniques to deal with large-scale process variations
Pre and post-silicon techniques to deal with large-scale process variations Jaeyong Chung, Ph.D. Department of Electronic Engineering Incheon National University Outline Introduction to Variability Pre-silicon
More informationEffectiveness of Reverse Body Bias for Leakage Control in Scaled Dual Vt CMOS ICs
Effectiveness of Reverse Body Bias for Leakage Control in Scaled Dual Vt CMOS ICs A. Keshavarzi, S. Ma, S. Narendra, B. Bloechel, K. Mistry*, T. Ghani*, S. Borkar and V. De Microprocessor Research Labs,
More informationPower-Aware Scheduling of Conditional Task Graphs in Real-Time Multiprocessor Systems
Power-Aware Scheduling of Conditional Task Graphs in Real-Time Multiprocessor Systems Dongkun Shin School of Computer Science and Engineering Seoul National University sdk@davinci.snu.ac.kr Jihong Kim
More information! Charge Leakage/Charge Sharing. " Domino Logic Design Considerations. ! Logic Comparisons. ! Memory. " Classification. " ROM Memories.
ESE 57: Digital Integrated Circuits and VLSI Fundamentals Lec 9: March 9, 8 Memory Overview, Memory Core Cells Today! Charge Leakage/ " Domino Logic Design Considerations! Logic Comparisons! Memory " Classification
More informationChapter 2 Fault Modeling
Chapter 2 Fault Modeling Jin-Fu Li Advanced Reliable Systems (ARES) Lab. Department of Electrical Engineering National Central University Jungli, Taiwan Outline Why Model Faults? Fault Models (Faults)
More informationThe PUMA method applied to the measures carried out by using a PC-based measurement instrument. Ciro Spataro
The PUMA applied to the measures carried out by using a PC-based measurement instrument Ciro Spataro 1 DIEET - University of Palermo, ITALY, ciro.spataro@unipa.it Abstract- The paper deals with the uncertainty
More informationFault Tolerant Computing CS 530 Fault Modeling. Yashwant K. Malaiya Colorado State University
CS 530 Fault Modeling Yashwant K. Malaiya Colorado State University 1 Objectives The number of potential defects in a unit under test is extremely large. A fault-model presumes that most of the defects
More informationPower Functions for. Process Behavior Charts
Power Functions for Process Behavior Charts Donald J. Wheeler and Rip Stauffer Every data set contains noise (random, meaningless variation). Some data sets contain signals (nonrandom, meaningful variation).
More informationOn the Complexity of Error Detection Functions for Redundant Residue Number Systems
On the Complexity of Error Detection Functions for Redundant Residue Number Systems Tsutomu Sasao 1 and Yukihiro Iguchi 2 1 Dept. of Computer Science and Electronics, Kyushu Institute of Technology, Iizuka
More informationAD-Al6E 201 VLSI PROCESS PROBLEN DIAGNOSIS AND YIELD PREDICTION: A IS. CONPREHENSIVE TEST.. (U) STANFORD UNIV CA CENTER FOR INTEGRATED SYSTEMS N
AD-Al6E 201 VLSI PROCESS PROBLEN DIAGNOSIS AND YIELD PREDICTION: A IS. CONPREHENSIVE TEST.. (U) STANFORD UNIV CA CENTER FOR INTEGRATED SYSTEMS N YARBROUGH ET AL. 1985 UNCLASSIFIED F/6 9/ M L.~~V 36. MEIIJL
More informationProbabilistic Dual-Vth Leakage Optimization Under Variability
Probabilistic Dual-Vth Leaage Optimization Under Variability Azadeh Davoodi Azade@eng.umd.edu Anur Srivastava Anurs@eng.umd.edu Department of Electrical and Computer Engineering University of Maryland,
More informationFigure 1.1: Schematic symbols of an N-transistor and P-transistor
Chapter 1 The digital abstraction The term a digital circuit refers to a device that works in a binary world. In the binary world, the only values are zeros and ones. Hence, the inputs of a digital circuit
More informationAccurate Estimating Simultaneous Switching Noises by Using Application Specific Device Modeling
Accurate Estimating Simultaneous Switching Noises by Using Application Specific Device Modeling Li Ding and Pinaki Mazumder Department of Electrical Engineering and Computer Science The University of Michigan,
More informationClock Skew Scheduling in the Presence of Heavily Gated Clock Networks
Clock Skew Scheduling in the Presence of Heavily Gated Clock Networks ABSTRACT Weicheng Liu, Emre Salman Department of Electrical and Computer Engineering Stony Brook University Stony Brook, NY 11794 [weicheng.liu,
More informationPOST-SILICON TIMING DIAGNOSIS UNDER PROCESS VARIATIONS
POST-SILICON TIMING DIAGNOSIS UNDER PROCESS VARIATIONS by Lin Xie A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Electrical Engineering) at
More informationSection 3: Combinational Logic Design. Department of Electrical Engineering, University of Waterloo. Combinational Logic
Section 3: Combinational Logic Design Major Topics Design Procedure Multilevel circuits Design with XOR gates Adders and Subtractors Binary parallel adder Decoders Encoders Multiplexers Programmed Logic
More informationThis is the author s final accepted version.
Al-Ameri, T., Georgiev, V.P., Adamu-Lema, F. and Asenov, A. (2017) Does a Nanowire Transistor Follow the Golden Ratio? A 2D Poisson- Schrödinger/3D Monte Carlo Simulation Study. In: 2017 International
More informationEfficient Circuit Analysis under Multiple Input Switching (MIS) Anupama R. Subramaniam
Efficient Circuit Analysis under Multiple Input Switching (MIS) by Anupama R. Subramaniam A Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy Approved
More informationFeasibility Study of Capacitive Tomography
Feasibility Study of apacitive Tomography Tony Warren Southern Polytechnic State University 1100 South Marietta Parkway Marietta, GA 30060 678-915-7269 twarren@spsu.edu Daren R. Wilcox Southern Polytechnic
More informationTrade Space Exploration with Ptolemy II
Trade Space Exploration with Ptolemy II Shanna-Shaye Forbes Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2009-102 http://www.eecs.berkeley.edu/pubs/techrpts/2009/eecs-2009-102.html
More information