M.S. Project Report. Efficient Failure Rate Prediction for SRAM Cells via Gibbs Sampling. Yamei Feng 12/15/2011

M.S. Project Report

Efficient Failure Rate Prediction for SRAM Cells via Gibbs Sampling

Yamei Feng
12/15/2011

Committee Members: Prof. Xin Li, Prof. Ken Mai

Table of Contents

CHAPTER 1 INTRODUCTION
CHAPTER 2 BACKGROUND
CHAPTER 3 GIBBS SAMPLING
  3.1 GIBBS SAMPLING IN THE CARTESIAN COORDINATE SYSTEM
  3.2 GIBBS SAMPLING IN THE SPHERICAL COORDINATE SYSTEM
CHAPTER 4 IMPLEMENTATION DETAILS
  4.1 ONE-DIMENSIONAL INVERSE-TRANSFORM SAMPLING
  4.2 INITIAL STARTING POINT SELECTION
  4.3 TWO-STAGE MONTE CARLO FLOW
CHAPTER 5 NUMERICAL EXAMPLES
  5.1 NOISE MARGIN
  5.2 READ CURRENT
  5.3 CONSIDERATIONS FOR OTHER APPLICATIONS
CHAPTER 6 CONCLUSIONS
APPENDIX
REFERENCES

CHAPTER 1 INTRODUCTION

As deep sub-micron technology advances, process variations pose a new set of challenges for static random access memory (SRAM) design. SRAM is widely embedded in semiconductor chips; roughly half of the area of an advanced microprocessor chip is occupied by SRAM [9]. SRAM cells are generally designed with minimum-size devices [9] and can be significantly impacted by large-scale process variations (e.g., local mismatches caused by random doping fluctuations) in nanoscale technology [1]-[4]. For this reason, it becomes increasingly critical to evaluate the failure rate of SRAM cells both efficiently and accurately in order to achieve a robust design.

Towards this goal, a number of statistical analysis methods have been proposed for SRAM circuits [5]-[15]. For instance, analytical performance models have been derived to predict SRAM parametric yield [5]-[7]. While these models offer great design insight into SRAM circuits, they may not accurately capture the circuit behavior due to various approximations. Another possible approach for SRAM failure rate prediction is based on transistor-level simulation, including both Monte Carlo analysis [8]-[14] and deterministic failure region prediction [15]. Since SRAM cells typically have an extremely small failure probability (e.g., 10^-8 ~ 10^-6), a simple Monte Carlo method that directly samples the variation space suffers from slow convergence, as only a few random samples will fall into the failure region. To improve the sampling efficiency, a number of importance sampling methods have been proposed for fast SRAM failure rate prediction [8]-[14]. The key idea of importance sampling is to directly sample the failure region based on a distorted probability density function (PDF), instead of the original PDF of process variations.

Applying importance sampling to SRAM analysis, however, is not trivial. The efficiency of importance sampling heavily relies on the choice of the distorted PDF that is used to generate random samples. Ideally, we should sample the failure region that is most likely to occur. Such a goal, however, is extremely difficult to achieve, as we never exactly know the failure region in practice. The challenge here is how to determine the optimal PDF so that the SRAM failure rate can be efficiently predicted.

In this paper, a new importance sampling method, Gibbs sampling, is proposed to improve the efficiency of SRAM failure rate prediction. Unlike the traditional Monte Carlo algorithm that samples a given PDF, the proposed Gibbs sampling approach does not need to know the sampling PDF explicitly. It adaptively searches the failure region and then generates random samples in it. When applied to SRAM failure rate analysis, Gibbs sampling can be conceptually viewed as a unique Monte Carlo method with an integrated optimization engine that allows us to efficiently explore the failure region. Thus, the SRAM

failure probability can be accurately predicted with a small number of sampling points. Our experimental results for a commercial 90nm SRAM cell demonstrate that, compared to other state-of-the-art techniques, the proposed Gibbs sampling algorithm achieves 1.6x~3.9x runtime speed-up without surrendering any accuracy. In addition, we further demonstrate that in certain applications the proposed Gibbs sampling algorithm accurately estimates the correct failure probability while the traditional techniques fail to work.

While the Gibbs sampling method was initially developed by the statistics community [16], [], in this paper it is particularly tuned for our SRAM analysis application via four important new contributions. First, the proposed Gibbs sampling algorithm is implemented for both Cartesian and spherical coordinate systems. The accuracy achieved by these two different implementations is application-dependent, as will be demonstrated in Chapter 5. Also, a novel variable mapping scheme is derived to facilitate efficient statistical sampling in the spherical coordinate system. Second, as previously mentioned, Gibbs sampling iteratively searches the failure region. At each iteration step, it needs to generate a random sample from an irregular one-dimensional PDF that is not simply uniform or Normal. Such a sampling task cannot easily be done by directly using a random number generator. For this reason, we propose to adopt the inverse-transform method [] and incorporate it into our Gibbs sampling engine. As such, randomly sampling an arbitrary one-dimensional PDF can be performed efficiently, which is a key technique that makes the proposed Gibbs sampling of practical utility. Third, an efficient implementation of Gibbs sampling requires a good starting point to speed up convergence. This is similar to most optimization algorithms, where a good starting point facilitates fast convergence. In this paper, we propose a model-based optimization to determine a good starting point. Finally, to further improve prediction accuracy and reduce computational cost, a two-stage Monte Carlo flow is developed: Gibbs sampling is applied to create a set of random samples during the first stage, and these samples are used to learn the optimal PDF for importance sampling. Next, during the second stage, a large number of random samples are efficiently generated from the learned PDF to accurately estimate the failure probability.

The remainder of this paper is organized as follows. In Chapter 2, we briefly review the background on importance sampling, and we then propose the Gibbs sampling method in Chapter 3. Several implementation issues (including one-dimensional inverse-transform sampling, initial starting point selection, etc.) are discussed in detail in Chapter 4. A commercial 90nm SRAM cell is used to demonstrate the efficacy of the proposed Gibbs sampling method in Chapter 5. Finally, we conclude in Chapter 6.

CHAPTER 2 BACKGROUND

Suppose that x = [x_1 x_2 ... x_M]^T is an M-dimensional random variable that models process variations and has a joint PDF f(x). Typically, x is modeled as a multivariate Normal distribution [5]-[15]. Without loss of generality, we further assume that the random variables {x_m; m = 1, 2, ..., M} in the vector x are mutually independent and standard Normal (i.e., with zero mean and unit variance):

  f(x) = \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{x_m^2}{2}\right).   (1)

Any correlated random variables that are jointly Normal can be transformed to the independent random variables {x_m; m = 1, 2, ..., M} by principal component analysis (PCA) [].

The failure probability of an SRAM cell can be mathematically represented as [8]:

  P_f = \int_{\Omega} f(x) \, dx   (2)

where Ω denotes the failure region (i.e., the subset of the variation space where the performance of interest does not meet the specification). Alternatively, the failure probability in Eq. (2) can be written as:

  P_f = \int I(x) f(x) \, dx   (3)

where I(x) represents the indicator function:

  I(x) = 1 if x ∈ Ω, and I(x) = 0 otherwise.   (4)

The failure probability P_f can be estimated by Monte Carlo analysis. The key idea is to draw N random samples from f(x) and then compute the mean of the indicator function over these samples:

  \tilde{P}_f^{MC} = \frac{1}{N} \sum_{n=1}^{N} I(x^{(n)})   (5)

where x^{(n)} is the nth random sample generated by Monte Carlo analysis.

For the SRAM application, the failure probability P_f in Eq. (3) is extremely small (e.g., 10^-8 ~ 10^-6) and most random samples created by Monte Carlo analysis do not fall into the failure region. Hence, a large number of samples (e.g., over 10^7 ~ 10^9) are needed to accurately estimate the failure rate. Note that an expensive transistor-level simulation is required to evaluate each sampling point; in other words, 10^7 ~ 10^9 simulation runs must be performed. This, in turn, implies that the aforementioned Monte Carlo method is extremely expensive, or even infeasible, when applied to most SRAM analysis problems.
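To make Eq. (5) concrete, the following illustrative Python sketch estimates a failure probability by plain Monte Carlo. The `fails()` function and its threshold are hypothetical stand-ins for a transistor-level simulation and a performance specification; neither is part of the report.

```python
import numpy as np

rng = np.random.default_rng(0)

def fails(x):
    # Hypothetical stand-in for a transistor-level simulation:
    # the "performance" violates its specification when the variation vector drifts too far.
    return x.sum() > 4.0

def monte_carlo_failure_rate(M, N):
    """Plain Monte Carlo estimate of P_f per Eq. (5)."""
    hits = 0
    for _ in range(N):
        x = rng.standard_normal(M)   # x ~ f(x): independent standard Normal, Eq. (1)
        if fails(x):                 # indicator function I(x), Eq. (4)
            hits += 1
    return hits / N                  # sample mean of the indicator, Eq. (5)

print(monte_carlo_failure_rate(M=6, N=100_000))
```

Because the true P_f of an SRAM cell is on the order of 10^-8 ~ 10^-6, almost every draw in such a loop returns zero, which is exactly the inefficiency the rest of the report addresses.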

To address this computational cost issue, importance sampling has been proposed to improve the efficiency of Monte Carlo analysis [8]-[14]. It aims to directly generate a large number of random samples in the failure region by using a distorted PDF g(x), and the failure probability can be expressed as [8]:

  P_f = \int I(x) \frac{f(x)}{g(x)} g(x) \, dx.   (6)

In other words, Eq. (6) calculates the expected value of the function I(x)·f(x)/g(x), where the random variable x follows the PDF g(x). If N sampling points {x^{(n)}; n = 1, 2, ..., N} are drawn from g(x), the failure probability in Eq. (6) can be estimated by [8]:

  \tilde{P}_f^{IS} = \frac{1}{N} \sum_{n=1}^{N} I(x^{(n)}) \frac{f(x^{(n)})}{g(x^{(n)})}.   (7)

Note that the estimated failure probabilities in Eq. (5) and Eq. (7) are identical if and only if the number of random samples N is infinite. In practice, when a finite number of sampling points is available, the results from Eq. (5) and Eq. (7) can be substantially different. If the distorted PDF g(x) is properly chosen, Eq. (7) can be much more accurate than the simple Monte Carlo estimate in Eq. (5). In theory, the optimal PDF g(x) with maximal estimation accuracy is []:

  g^{OPT}(x) = \frac{I(x) f(x)}{P_f}.   (8)

Intuitively, with Eq. (8), the function I(x)·f(x)/g^{OPT}(x) becomes a constant with zero variance. Hence, its expectation can be accurately estimated by Eq. (7) with few random samples.

Studying Eq. (8), we notice two important properties of the optimal PDF g^{OPT}(x). First, g^{OPT}(x) is non-zero if and only if the variable x sits in the failure region. This, in turn, implies that we should directly sample the failure region to achieve maximal accuracy. Second, g^{OPT}(x) is proportional to the original PDF f(x) of process variations. In other words, we should sample the part of the variation space where performance failure is most likely to occur.

In practice, however, sampling the optimal PDF g^{OPT}(x) in Eq. (8) is not trivial, as the indicator function I(x) is not known in advance. Most existing importance sampling algorithms apply various heuristics to approximate the optimal PDF g^{OPT}(x) [8]-[14]. In this paper, we propose a new Gibbs sampling method that adaptively samples the optimal PDF g^{OPT}(x) without explicitly knowing the indicator function I(x). As a result, the SRAM failure rate can be accurately predicted with low computational cost.
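As a companion to Eq. (7), this sketch re-weights samples drawn from a mean-shifted Normal g(x); the shift value and the `fails()` test are illustrative assumptions, not the distorted PDFs constructed later in the report.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 6

def fails(x):
    # Hypothetical failure test standing in for a transistor-level simulation.
    return x.sum() > 4.0

def log_normal_pdf(x, mean):
    d = x - mean
    return -0.5 * (d @ d) - 0.5 * len(x) * np.log(2 * np.pi)

def importance_sampling_failure_rate(N, shift):
    """Importance sampling estimate of P_f per Eq. (7); g(x) is f(x) shifted by a constant mean."""
    mean_g = np.full(M, shift)
    total = 0.0
    for _ in range(N):
        x = rng.standard_normal(M) + mean_g       # x ~ g(x)
        if fails(x):                              # I(x)
            total += np.exp(log_normal_pdf(x, np.zeros(M)) - log_normal_pdf(x, mean_g))  # f(x)/g(x)
    return total / N

print(importance_sampling_failure_rate(N=20_000, shift=0.8))
```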

CHAPTER 3 GIBBS SAMPLING

As described in Chapter 2, directly sampling the optimal PDF g^{OPT}(x) in Eq. (8) is difficult for two reasons. First, the indicator function I(x) is not known in advance. Second, since g^{OPT}(x) is not a simple multivariate statistical distribution such as a uniform or Normal distribution, it is extremely difficult, if not impossible, to directly draw random samples from g^{OPT}(x).

In this paper, we adopt the Gibbs sampling method [16], [] to predict the failure probability of SRAM cells. Compared to other traditional techniques [8]-[14], Gibbs sampling provides two promising features. First, it can efficiently search the failure region and determine the indicator function I(x) on the fly. Second, Gibbs sampling does not directly draw random samples from a multi-dimensional joint PDF. Instead, it iteratively samples a sequence of one-dimensional PDFs, and this one-dimensional sampling can be efficiently implemented with an inverse-transform method [] at low computational cost.

The proposed Gibbs sampling method is implemented in both Cartesian and spherical coordinate systems. The accuracy achieved by these two different implementations is application-dependent, as will be demonstrated by the experimental results in Chapter 5. In what follows, we describe the technical details of the Gibbs sampling method for Cartesian and spherical coordinate systems.

3.1 GIBBS SAMPLING IN THE CARTESIAN COORDINATE SYSTEM

To illustrate the Gibbs sampling algorithm in the Cartesian coordinate system, we first consider the simple two-dimensional example in Fig. 1. Starting from an initial point [x_1^{(1)} x_2^{(1)}]^T, Gibbs sampling samples the conditional PDF g^{OPT}[x_1 | x_2^{(1)}] and replaces x_1^{(1)} by a new value x_1^{(2)}, as shown in Fig. 1(a)-(b). Next, Gibbs sampling samples a different conditional PDF g^{OPT}[x_2 | x_1^{(2)}] and replaces x_2^{(1)} by a new value x_2^{(2)}, as shown in Fig. 1(c)-(d). Since the random variable x is two-dimensional, Gibbs sampling then varies x_1 again and draws a new value x_1^{(3)} by sampling the conditional PDF g^{OPT}[x_1 | x_2^{(2)}]. These iteration steps are repeated until a sufficient number of random samples are generated, and it can be proven that the iterations yield a sequence of samples that follow the distribution g^{OPT}(x_1, x_2) [16].

Fig. 1. A two-dimensional example of Gibbs sampling in the Cartesian coordinate system. (a) [x_1^{(1)} x_2^{(1)}]^T is the starting point. The intersection between the joint PDF g^{OPT}(x_1, x_2) and the plane x_2 = x_2^{(1)} defines the conditional PDF g^{OPT}[x_1 | x_2^{(1)}]. (b) A new point x_1^{(2)} is sampled from g^{OPT}[x_1 | x_2^{(1)}]. (c) The intersection of the joint PDF g^{OPT}(x_1, x_2) and the plane x_1 = x_1^{(2)} defines the conditional PDF g^{OPT}[x_2 | x_1^{(2)}]. (d) A new point x_2^{(2)} is sampled from g^{OPT}[x_2 | x_1^{(2)}].

Algorithm 1: Gibbs sampling in the Cartesian coordinate system.
1. Start from an M-dimensional PDF g^{OPT}(x_1, x_2, ..., x_M).
2. Select an initial starting point [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T.
3. For t = 1, 2, ...
4.   For m = 1, 2, ..., M
5.     Draw x_m^{(t+1)} from the conditional PDF g^{OPT}[x_m | x_1^{(t+1)}, ..., x_{m-1}^{(t+1)}, x_{m+1}^{(t)}, ..., x_M^{(t)}] to create a new sampling point [x_1^{(t+1)} ... x_m^{(t+1)} x_{m+1}^{(t)} ... x_M^{(t)}]^T.
6.   End For
7. End For
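A minimal Python sketch of the Algorithm 1 loop, under the same toy assumptions as the earlier snippets. The `draw_conditional()` helper is a placeholder: it samples the truncated one-dimensional conditional by naive rejection, whereas the report realizes this step with binary search and the inverse-transform method of Chapter 4.1.

```python
import numpy as np

rng = np.random.default_rng(2)

def fails(x):
    # Hypothetical stand-in for a transistor-level simulation.
    return x.sum() > 4.0

def draw_conditional(x, m):
    """Placeholder for sampling g_OPT(x_m | x_\\m): propose standard-Normal values for
    coordinate m and keep the first one that keeps x inside the failure region."""
    trial = x.copy()
    while True:
        trial[m] = rng.standard_normal()
        if fails(trial):
            return trial[m]

def gibbs_cartesian(x0, num_sweeps):
    """Algorithm 1: cycle through the coordinates, updating one per step."""
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(num_sweeps):             # outer loop over t
        for m in range(len(x)):             # inner loop over m = 1, ..., M
            x[m] = draw_conditional(x, m)
            samples.append(x.copy())        # each update yields a new sampling point
    return np.array(samples)

# The starting point must lie inside the failure region (see Chapter 4.2).
print(gibbs_cartesian(x0=[2.5, 1.0, 0.8, 0.2, 0.1, 0.1], num_sweeps=50).mean(axis=0))
```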

The aforementioned two-dimensional Gibbs sampling can be extended to M dimensions. Starting from an initial point [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T, Gibbs sampling assigns a new value to one of the random variables at each iteration step by randomly sampling the conditional PDF:

  g^{OPT}(x_m | x_{\m}) = g^{OPT}(x_m | x_1, ..., x_{m-1}, x_{m+1}, ..., x_M)   (9)

where x_{\m} denotes the vector x with x_m removed. Algorithm 1 summarizes the major steps of Gibbs sampling.

3.2 GIBBS SAMPLING IN THE SPHERICAL COORDINATE SYSTEM

The Gibbs sampling method summarized by Algorithm 1 can be extended to spherical coordinate systems. To do so, we need to transform the M-dimensional Normal PDF from the Cartesian coordinate system to a spherical coordinate system. Traditionally, a spatial location in an M-dimensional spherical coordinate system can be expressed as:

  x_1 = r cos θ_1
  x_2 = r sin θ_1 cos θ_2
  x_3 = r sin θ_1 sin θ_2 cos θ_3
  ...
  x_{M-1} = r sin θ_1 sin θ_2 ... sin θ_{M-2} cos θ_{M-1}
  x_M = r sin θ_1 sin θ_2 ... sin θ_{M-2} sin θ_{M-1}.   (10)

Eq. (10) shows the mapping from the spherical coordinates r and θ = [θ_1 θ_2 ... θ_{M-1}]^T to the Cartesian coordinate x []. The variable r defines the radius (i.e., the distance to the origin) and the angle variables {θ_m; m = 1, 2, ..., M-1} define the orientation. Fig. 2(a) shows a three-dimensional example of the traditional spherical coordinate system.

Fig. 2. Two three-dimensional examples of spherical coordinate systems. (a) The traditional spherical coordinate system with radius r and angles θ. (b) The proposed spherical coordinate system with radius r and orientation vector α.

While the aforementioned transformation can easily be found for a low-dimensional x, it becomes computationally expensive as the dimensionality increases []. We therefore propose an alternative way to define the spherical coordinate system, in which the M-dimensional Normal PDF can be easily specified.

We write the Cartesian coordinate x as:

  x = r \cdot \frac{α}{\|α\|_2}   (11)

where α = [α_1 α_2 ... α_M]^T and ||·||_2 denotes the L2-norm of a vector. In Eq. (11), the variable r defines the radius, similar to Eq. (10). The other variables {α_m; m = 1, 2, ..., M} define the orientation of the vector x. Fig. 2(b) shows a three-dimensional example of the proposed spherical coordinate system.

There is an important difference between the traditional spherical coordinate system and the proposed one. While the traditional spherical coordinate system uses M variables (i.e., r and θ) to define a spatial location in the M-dimensional space, the proposed spherical coordinate system requires M+1 variables (i.e., r and α) to define the same spatial location. The M+1 variables used by our proposed spherical coordinate system are therefore redundant; in other words, for a given Cartesian coordinate x, the mapping from x to r and α is not unique. Such redundancy, however, allows us to easily find a joint PDF of r and α that makes the variables {x_m; m = 1, 2, ..., M} in Eq. (11) mutually independent and standard Normal. In what follows, we derive the joint probability distribution of r and α in detail.

First, the variable r, which represents the radius, is equal to:

  r = \|x\|_2 = \sqrt{x_1^2 + x_2^2 + ... + x_M^2}.   (12)

Since the random variables {x_m; m = 1, 2, ..., M} are mutually independent and standard Normal, it is easy to verify that the random variable r follows the Chi distribution with M degrees of freedom []:

  f(r) = \frac{r^{M-1} \exp(-0.5 r^2)}{2^{0.5M-1} \Gamma(0.5M)}   (13)

where Γ(·) denotes the Gamma function.

Next, we consider the vector α that defines the orientation. Studying Eq. (11), we make two important observations. On the one hand, the orientation of the vector x is uniquely determined by α/||α||_2; it is independent of the length of the vector α (i.e., ||α||_2). On the other hand, the orientation α/||α||_2 must be uniformly distributed, based on the fact that the random variables {x_m; m = 1, 2, ..., M} in Eq. (11) are mutually independent and standard Normal. That is, the unit-length vector α/||α||_2 must take any possible orientation with equal probability. We therefore define the following PDF for α:

  f(α) = \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{α_m^2}{2}\right)   (14)

where {α_m; m = 1, 2, ..., M} are mutually independent and standard Normal. It is well known that the multivariate Normal distribution in Eq. (14) is spherically symmetric [].
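A small numerical sketch of the proposed coordinate system, Eqs. (11)-(14): draw r from a Chi distribution with M degrees of freedom and α from independent standard Normals, form x = r·α/||α||_2, and check empirically that x behaves as M independent standard Normal variables (the property formalized by Theorem 1 below). The check itself is illustrative and not part of the report.

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 6, 200_000

r = np.sqrt(rng.chisquare(df=M, size=N))        # r ~ Chi(M): square root of a chi-square draw, Eq. (13)
alpha = rng.standard_normal((N, M))             # alpha ~ independent standard Normal, Eq. (14)
unit = alpha / np.linalg.norm(alpha, axis=1, keepdims=True)  # alpha / ||alpha||_2: uniform orientation
x = r[:, None] * unit                           # x = r * alpha / ||alpha||_2, Eq. (11)

# Empirical check of Theorem 1: mean ~ 0 and covariance ~ identity.
print(np.round(x.mean(axis=0), 2))
print(np.round(np.cov(x.T), 2))
```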

Therefore, sampling α based on Eq. (14) and then normalizing α by its L2-norm allows us to generate a uniformly distributed orientation in the M-dimensional space [17].

Finally, based on the fact that the radius and the orientation of the multivariate Normal distribution in Eq. (1) should be mutually independent, we combine f(r) in Eq. (13) and f(α) in Eq. (14). The joint PDF f(r, α) is equal to:

  f(r, α) = \frac{r^{M-1} \exp(-0.5 r^2)}{2^{0.5M-1} \Gamma(0.5M)} \cdot \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{α_m^2}{2}\right).   (15)

Theorem 1 states that given the joint PDF f(r, α) in Eq. (15), the random variables {x_m; m = 1, 2, ..., M} defined in Eq. (11) are mutually independent and standard Normal. The detailed proof of Theorem 1 can be found in the Appendix.

Theorem 1: Given the random variables {x_m; m = 1, 2, ..., M} defined in Eq. (11), where r and α follow the joint PDF f(r, α) in Eq. (15), the joint PDF f(x) is the multivariate Normal distribution represented by Eq. (1).

Given r and α, importance sampling can be applied to estimate the failure probability in the proposed spherical coordinate system. In this case, the estimated failure rate in Eq. (7) can be re-written as:

  \tilde{P}_f^{IS} = \frac{1}{N} \sum_{n=1}^{N} I(r^{(n)}, α^{(n)}) \frac{f(r^{(n)}, α^{(n)})}{g(r^{(n)}, α^{(n)})}   (16)

where I(r, α) and g(r, α) represent the indicator function and the distorted PDF in the spherical coordinate system respectively, [r^{(n)} α^{(n)}]^T stands for the nth sampling point drawn from g(r, α), and N denotes the total number of samples. Similar to Eq. (8), the optimal PDF g(r, α) with maximal estimation accuracy is:

  g^{OPT}(r, α) = \frac{I(r, α) f(r, α)}{P_f}.   (17)

Given Eq. (16) and Eq. (17), Algorithm 2 is formulated to perform Gibbs sampling in the spherical coordinate system. Similar to Algorithm 1, Algorithm 2 samples a one-dimensional conditional PDF g^{OPT}(r | α) or g^{OPT}(α_m | r, α_{\m}) at each iteration step, where α_{\m} denotes the vector α with α_m removed.

As discussed at the beginning of this chapter, directly sampling the multi-dimensional joint PDF g^{OPT}(x) or g^{OPT}(r, α) can be extremely difficult. With Gibbs sampling, we only need to sample the one-dimensional conditional PDFs g^{OPT}(x_m | x_{\m}), g^{OPT}(r | α) or g^{OPT}(α_m | r, α_{\m}). As will be demonstrated in Chapter 4.1, such one-dimensional sampling can be efficiently implemented by using an inverse-transform method [], even if the PDF g^{OPT}(x) in Eq. (8) or g^{OPT}(r, α) in Eq. (17) is not explicitly known. In addition, a model-based optimization can be applied to determine a good starting point [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T or [r^{(1)} α_1^{(1)} α_2^{(1)} ... α_M^{(1)}]^T so that the Gibbs sampling algorithm converges quickly. All these implementation details will be discussed in Chapter 4.

Algorithm 2: Gibbs sampling in the spherical coordinate system.
1. Start from an (M+1)-dimensional PDF g^{OPT}(r, α_1, α_2, ..., α_M).
2. Select an initial starting point [r^{(1)} α_1^{(1)} α_2^{(1)} ... α_M^{(1)}]^T.
3. For t = 1, 2, ...
4.   Draw r^{(t+1)} from the conditional PDF g^{OPT}[r | α_1^{(t)}, ..., α_M^{(t)}] to create a new sampling point [r^{(t+1)} α_1^{(t)} ... α_M^{(t)}]^T.
5.   For m = 1, 2, ..., M
6.     Draw α_m^{(t+1)} from the conditional PDF g^{OPT}[α_m | r^{(t+1)}, α_1^{(t+1)}, ..., α_{m-1}^{(t+1)}, α_{m+1}^{(t)}, ..., α_M^{(t)}] to create a new sampling point [r^{(t+1)} α_1^{(t+1)} ... α_m^{(t+1)} α_{m+1}^{(t)} ... α_M^{(t)}]^T.
7.   End For
8. End For
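For comparison with the Cartesian sketch above, here is a sketch of the Algorithm 2 update schedule in the (r, α) coordinates, again under toy assumptions. The bounded rejection loops are placeholders for the inverse-transform procedure of Chapter 4.1; if no proposal lands in the failure region within the retry budget, the old value is kept so the sketch always terminates.

```python
import numpy as np

rng = np.random.default_rng(4)
M = 6

def fails(x):
    # Hypothetical stand-in for a transistor-level simulation.
    return x.sum() > 4.0

def to_cartesian(r, alpha):
    return r * alpha / np.linalg.norm(alpha)          # Eq. (11)

def gibbs_spherical(r0, alpha0, num_sweeps, max_tries=500):
    """Algorithm 2 sketch: update r first, then each alpha_m, one variable per step."""
    r, alpha = r0, np.array(alpha0, dtype=float)
    samples = []
    for _ in range(num_sweeps):
        for _ in range(max_tries):                    # draw r from g_OPT(r | alpha)
            cand = np.sqrt(rng.chisquare(df=M))       # proposal from the Chi(M) prior f(r)
            if fails(to_cartesian(cand, alpha)):
                r = cand
                break
        for m in range(M):                            # draw alpha_m from g_OPT(alpha_m | r, alpha_\m)
            for _ in range(max_tries):
                trial = alpha.copy()
                trial[m] = rng.standard_normal()      # proposal from the standard Normal f(alpha_m)
                if fails(to_cartesian(r, trial)):
                    alpha = trial
                    break
        samples.append(to_cartesian(r, alpha))
    return np.array(samples)

print(gibbs_spherical(r0=2.0, alpha0=np.ones(M), num_sweeps=30).mean(axis=0))
```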

CHAPTER 4 IMPLEMENTATION DETAILS

The proposed Gibbs sampling method is efficiently implemented with the following techniques: (1) one-dimensional inverse-transform sampling, (2) initial starting point selection, and (3) a two-stage Monte Carlo analysis flow.

4.1 ONE-DIMENSIONAL INVERSE-TRANSFORM SAMPLING

During each iteration step of Gibbs sampling, one random variable is sampled from a one-dimensional conditional PDF g^{OPT}(x_m | x_{\m}), g^{OPT}(r | α) or g^{OPT}(α_m | r, α_{\m}). Such one-dimensional sampling can be efficiently performed by an inverse-transform method []. To derive the one-dimensional sampling algorithm used in this paper, we first consider the scenario of sampling the random variable x_m from the conditional PDF g^{OPT}(x_m | x_{\m}). In this case, we write g^{OPT}(x_m | x_{\m}) as:

  g^{OPT}(x_m | x_{\m}) = \frac{g^{OPT}(x_m, x_{\m})}{g^{OPT}(x_{\m})}.   (18)

Substituting Eq. (8) into Eq. (18) yields:

  g^{OPT}(x_m | x_{\m}) = \frac{I(x_m, x_{\m}) f(x_m, x_{\m})}{P_f \, g^{OPT}(x_{\m})}.   (19)

Since the random variables {x_m; m = 1, 2, ..., M} are mutually independent, as shown in Eq. (1), Eq. (19) can be re-written as:

  g^{OPT}(x_m | x_{\m}) = I(x_m, x_{\m}) f(x_m) \frac{f(x_{\m})}{P_f \, g^{OPT}(x_{\m})}.   (20)

Studying Eq. (20), we make three important observations. First, g^{OPT}(x_m | x_{\m}) is linearly proportional to the probability distribution f(x_m). Second, g^{OPT}(x_m | x_{\m}) is non-zero if and only if the variable x_m, combined with x_{\m}, sits in the failure region. Third, the term f(x_{\m}) / [P_f g^{OPT}(x_{\m})] in Eq. (20) is a constant for any fixed value of x_{\m}. For these reasons, when sampling the one-dimensional conditional PDF g^{OPT}(x_m | x_{\m}), we should draw random samples from the failure region based on the PDF f(x_m).

Fig. 3. One-dimensional inverse-transform sampling. (a) A single continuous failure region Ω_m (gray area) under the PDF f(x_m). (b) A random value of x_m is sampled by the inverse-transform method based on the CDF F(x_m): a uniform sample s_m over [F(u_m), F(v_m)] is mapped back through F^{-1}.

We consider the one-dimensional failure region shown in Fig. 3(a). The symbol Ω_m represents the failure region over the variable x_m while all other random variables are set to given values. Similar to the previous work [8]-[], [14], we assume that there is only a single continuous failure region, which can be represented as a one-dimensional interval [u_m, v_m]. Based on the assumption that Ω_m is continuous, we can find the left boundary u_m and the right boundary v_m by binary search over the variable x_m.

Once u_m and v_m are known, we need to draw a random sample from the interval [u_m, v_m]. It is important to note that the PDF g^{OPT}(x_m | x_{\m}) in Eq. (20) represents a truncated Normal distribution, defined by f(x_m) over the interval [u_m, v_m], and it can be sampled with the inverse-transform method. To apply the inverse-transform method to sample g^{OPT}(x_m | x_{\m}), we first calculate the cumulative distribution function (CDF) F(x_m), where x_m follows a standard Normal distribution. Next, we sample a new random variable s_m that is uniformly distributed over the interval [F(u_m), F(v_m)], as shown in Fig. 3(b). Then, we map the sampling point s_m back to x_m based on the inverse CDF:

  x_m = F^{-1}(s_m)   (21)

where F^{-1}(s_m) is the inverse function of F(x_m). It can be proven that the sampling point x_m generated by the inverse-transform method follows the statistical distribution g^{OPT}(x_m | x_{\m}) in Eq. (20) [].
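The sketch below shows the inverse-transform step of Eq. (21) for the Cartesian case: given a failure interval [u_m, v_m], a sample of x_m is drawn from the standard Normal truncated to that interval. The interval endpoints here are arbitrary illustrative values; in the report they are obtained by binary search over transistor-level simulations.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)

def sample_truncated_normal(u, v):
    """Inverse-transform sampling of a standard Normal restricted to [u, v], per Eq. (21)."""
    s = rng.uniform(norm.cdf(u), norm.cdf(v))   # s ~ Uniform[F(u), F(v)]
    return norm.ppf(s)                          # x_m = F^{-1}(s)

# Example: a failure interval found (hypothetically) by binary search.
u_m, v_m = 2.1, 3.4
draws = np.array([sample_truncated_normal(u_m, v_m) for _ in range(5)])
print(draws)   # all samples fall inside [u_m, v_m]
```

The same pattern applies to r and α_m in Eqs. (24)-(25), with the Chi CDF (e.g., scipy.stats.chi(df=M)) replacing the Normal CDF for the radius r.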

The aforementioned inverse-transform method can similarly be applied to sample the other two one-dimensional conditional PDFs, g^{OPT}(r | α) and g^{OPT}(α_m | r, α_{\m}). Similar to Eq. (20), it is easy to verify the following two equations:

  g^{OPT}(r | α) = I(r, α) f(r) \frac{f(α)}{P_f \, g^{OPT}(α)}   (22)

  g^{OPT}(α_m | r, α_{\m}) = I(r, α) f(α_m) \frac{f(r) f(α_{\m})}{P_f \, g^{OPT}(r, α_{\m})}.   (23)

In Eq. (22) and Eq. (23), f(r) is a Chi distribution with M degrees of freedom and f(α_m) is a standard Normal distribution. To sample g^{OPT}(r | α) and g^{OPT}(α_m | r, α_{\m}), we first apply binary search to find the one-dimensional failure regions [u_r, v_r] for the variable r and [u_{αm}, v_{αm}] for the variable α_m. Next, we sample new random variables s_r and s_{αm} from the uniform distributions over [F(u_r), F(v_r)] and [F(u_{αm}), F(v_{αm})] respectively. Finally, we map s_r and s_{αm} back to r and α_m based on the inverse CDFs:

  r = F^{-1}(s_r)   (24)
  α_m = F^{-1}(s_{αm})   (25)

where F^{-1}(s_r) and F^{-1}(s_{αm}) are the inverse functions of the CDFs F(r) and F(α_m) respectively.

Algorithm 3 summarizes the major steps of the inverse-transform method. It is important to emphasize that the computational cost of Algorithm 3 is dominated by Step 2, where multiple transistor-level simulations are required to find the left and right boundaries of the one-dimensional failure region. All other steps do not involve transistor-level simulations and can be computed at low cost.

Algorithm 3: One-dimensional inverse-transform sampling.
1. Start from the joint PDFs f(x) in Eq. (1) and f(r, α) in Eq. (15).
2. Apply binary search to find the left boundary (i.e., u_m, u_r or u_{αm}) and the right boundary (i.e., v_m, v_r or v_{αm}) of the one-dimensional failure region of the random variable to be sampled (i.e., x_m, r or α_m).
3. Draw one random sample (i.e., s_m, s_r or s_{αm}) from the uniform distribution over the mapped failure interval (i.e., [F(u_m), F(v_m)], [F(u_r), F(v_r)] or [F(u_{αm}), F(v_{αm})]).
4. Map the random sample s_m, s_r or s_{αm} to x_m, r or α_m based on Eq. (21), Eq. (24) or Eq. (25), where F^{-1}(s_m) and F^{-1}(s_{αm}) are the inverse CDFs of a standard Normal distribution and F^{-1}(s_r) is the inverse CDF of a Chi distribution with M degrees of freedom.

4.2 INITIAL STARTING POINT SELECTION

To efficiently apply the proposed Gibbs sampling method, a good initial starting point is needed to speed up the convergence of Algorithm 1 and/or Algorithm 2. During Gibbs sampling, since each sample is drawn from a one-dimensional conditional PDF, the current sampling point is correlated with the previous sampling point. In other words, the random samples generated by Gibbs sampling depend on the initial starting point. It has been proven by the statistics community that, starting from an initial point, the samples generated at the beginning of Gibbs sampling may not follow the given probability distribution (i.e., g^{OPT}(x) for Algorithm 1 and g^{OPT}(r, α) for Algorithm 2) []. These samples constitute the warm-up

interval of Gibbs sampling and should be discarded. In practice, the length of the warm-up interval should be minimized; otherwise, generating a large number of random samples inside the warm-up interval wastes a large amount of computational time. In this paper, we propose to shorten the warm-up interval by appropriately selecting a good initial starting point.

In what follows, we first discuss the proposed initial point selection scheme for Cartesian coordinate systems. Recall that our objective is to draw random samples from the failure region that is most likely to occur, as shown in Eq. (8). Hence, the initial starting point [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T should meet two criteria: (1) it is inside the failure region, and (2) the PDF f(x) takes a large value at it. These two requirements can be mathematically translated into the following optimization problem:

  maximize  f([x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T)
  s.t.      [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T ∈ Ω.   (26)

Since the M-dimensional random variable x is modeled as a multivariate Normal distribution in Eq. (1), the optimization in Eq. (26) is equivalent to:

  minimize  \|[x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T\|_2
  s.t.      [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T ∈ Ω.   (27)

Eq. (27) aims to find the failure point that is closest to the origin x = 0. It is similar to the norm minimization approach proposed in []. Note that solving the optimization problem in Eq. (27) is not trivial, since the failure region Ω is not explicitly known. To address this issue, we propose to approximate the performance of interest by a linear or quadratic model of the M-dimensional random variable x. Once the model is available, the optimization in Eq. (27) can be solved by either quadratic programming (for a linear performance model) or semi-definite programming (for a quadratic performance model) [8]. The detailed algorithms for performance modeling and optimization can be found in [8]. Note that our goal is not to solve Eq. (27) exactly; we only need to find an approximate solution that Gibbs sampling can use to further explore the variation space with high failure probability. It is important to mention that even though the linear or quadratic performance models may not be highly accurate, we can still obtain a good starting point.
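As an illustration of this model-based starting point selection, the sketch below fits a linear model of a hypothetical performance metric from a handful of samples and takes the minimum-norm point on the modeled failure boundary, which for a linear model w·x + b with failure when the metric drops below `spec` has the closed form x* = (spec − b)·w/||w||². The `simulate()` function, the model, and the specification are illustrative assumptions, not the report's performance models or QP/SDP solvers.

```python
import numpy as np

rng = np.random.default_rng(6)
M = 6

def simulate(x):
    """Hypothetical 'performance' in place of a transistor-level simulation."""
    true_w = np.array([0.9, -0.4, 0.7, 0.1, -0.2, 0.3])
    return true_w @ x + 5.0

# Fit a linear performance model p(x) ~ w.x + b by least squares from a few samples.
X = rng.standard_normal((50, M))
p = np.array([simulate(x) for x in X])
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, p, rcond=None)
w, b = coef[:M], coef[M]

# Minimum-norm point on the modeled failure boundary w.x + b = spec (failure: p <= spec),
# i.e., an approximate solution of Eq. (27) used as the Gibbs starting point.
spec = 2.0
x_start = (spec - b) * w / (w @ w)
print(np.round(x_start, 3), simulate(x_start))
```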

We can further extend the aforementioned initial point selection method to spherical coordinate systems. In this case, once the Cartesian coordinate [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T is determined for the initial starting point, we need to map it to the spherical coordinate [r^{(1)} α_1^{(1)} α_2^{(1)} ... α_M^{(1)}]^T. As shown in Eq. (12), the value of r^{(1)} can easily be calculated as:

  r^{(1)} = \|[x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T\|_2.   (28)

To calculate {α_m^{(1)}; m = 1, 2, ..., M}, Eq. (11) can be re-written as:

  α^{(1)} = \frac{\|α^{(1)}\|_2}{r^{(1)}} \cdot x^{(1)}.   (29)

As discussed in Chapter 3.2, the vector α^{(1)} only determines the orientation of x^{(1)}, and the length of α^{(1)} can be arbitrary. For our Gibbs sampling application, since we aim to locate a failure point [r^{(1)} α_1^{(1)} α_2^{(1)} ... α_M^{(1)}]^T that is most likely to occur, we should determine {α_m^{(1)}; m = 1, 2, ..., M} by maximizing the PDF f(α) in Eq. (14). Namely, even though there are multiple possible solutions of α^{(1)} associated with the same Cartesian coordinate x^{(1)}, we are interested in the solution that is most likely to occur (i.e., the maximum-likelihood solution). Based on this maximum-likelihood criterion and the fact that α^{(1)} follows the multivariate Normal distribution defined in Eq. (14), the length of α^{(1)} should be sufficiently small so that the PDF value f[α^{(1)}] is large. Let ||α^{(1)}||_2 = ε, where ε → 0, and Eq. (29) becomes:

  α^{(1)} = \frac{ε}{r^{(1)}} \cdot x^{(1)}.   (30)

In practice, we find that a small value of ε is a good choice based on our numerical experiments. Once ε is set and {x_m^{(1)}; m = 1, 2, ..., M} and r^{(1)} are found by Eq. (27) and Eq. (28), {α_m^{(1)}; m = 1, 2, ..., M} are uniquely determined by Eq. (30).

4.3 TWO-STAGE MONTE CARLO FLOW

As shown in both Algorithm 1 and Algorithm 2, multiple transistor-level simulations are required to perform the binary search needed to generate a single Gibbs sample. To make the proposed Gibbs sampling algorithm more efficient, one additional implementation strategy should be applied: once a set of Gibbs samples has been created, it is desirable to learn the optimal PDF from these samples so that additional random sampling points can be drawn directly from the learned PDF without running binary search. Such a strategy further reduces the computational cost and improves the prediction accuracy. Towards this goal, we adopt a two-stage Monte Carlo flow consisting of two sequential steps. First, Gibbs sampling is applied to generate K random samples inside the failure region. Next, at the second stage, we approximate the optimal PDF g^{OPT}(x) by a multivariate Normal distribution g_{NOR}(x).

The mean value and the covariance matrix of g_{NOR}(x) are calculated from the K Gibbs samples obtained in the first stage. Once g_{NOR}(x) is known, we directly sample it to generate N random samples and estimate the failure rate from these N samples:

  \tilde{P}_f^{IS} = \frac{1}{N} \sum_{n=1}^{N} I(x^{(n)}) \frac{f(x^{(n)})}{g_{NOR}(x^{(n)})}.   (31)

Algorithm 4 summarizes the major steps of the proposed two-stage Monte Carlo flow.

Algorithm 4: Two-stage Monte Carlo flow.
1. Start from a joint PDF f(x), a given value of K (i.e., the number of Gibbs samples for the first stage), and a given value of N (i.e., the number of random samples for the second stage).
2. Apply Algorithm 1 (for Cartesian coordinate systems) or Algorithm 2 (for spherical coordinate systems), together with Algorithm 3 (for both coordinate systems), to generate K Gibbs sampling points.
3. If the Gibbs samples are generated in a spherical coordinate system, map them to Cartesian coordinates by using Eq. (11).
4. Calculate the mean value and the covariance matrix of these K Gibbs samples in the Cartesian coordinate system, and determine a multivariate Normal distribution g_{NOR}(x) that approximates the optimal PDF for importance sampling.
5. Generate N random samples from the multivariate Normal distribution g_{NOR}(x).
6. Calculate the failure rate from these N samples by using Eq. (31).

It should be noted that several traditional importance sampling techniques also draw random samples from a multivariate Normal distribution [8], [14]. However, unlike the traditional techniques, we apply Gibbs sampling to optimally determine both the mean value and the covariance matrix of g_{NOR}(x). Hence, the second-stage random sampling converges quickly. Compared to other traditional techniques, Algorithm 4 provides 1.6x~3.9x speed-up without surrendering any accuracy, as will be demonstrated by the experimental results in Chapter 5. In addition, Chapter 5 shows a practical example in which the proposed Gibbs sampling algorithm accurately estimates the correct failure probability while the traditional techniques fail to work.
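A compact sketch of the Algorithm 4 idea under the same toy assumptions as the earlier snippets: the first-stage failure-region samples (produced here by filtering a crude proposal, in place of the Gibbs samples) are used to fit the mean and covariance of g_NOR(x), and the second stage re-weights samples drawn from g_NOR(x) according to Eq. (31).

```python
import numpy as np

rng = np.random.default_rng(7)
M = 6

def fails(x):
    # Hypothetical stand-in for a transistor-level simulation.
    return x.sum() > 4.0

def log_pdf(x, mean, cov_inv, log_det):
    d = x - mean
    return -0.5 * (d @ cov_inv @ d) - 0.5 * (M * np.log(2 * np.pi) + log_det)

def two_stage(first_stage_samples, N):
    # Stage 2a: fit g_NOR(x) to the first-stage (Gibbs) samples.
    mu = first_stage_samples.mean(axis=0)
    cov = np.cov(first_stage_samples.T)
    cov_inv = np.linalg.inv(cov)
    _, log_det = np.linalg.slogdet(cov)
    # Stage 2b: importance sampling from g_NOR(x), Eq. (31).
    samples = rng.multivariate_normal(mu, cov, size=N)
    est = 0.0
    for x in samples:
        if fails(x):
            log_w = log_pdf(x, np.zeros(M), np.eye(M), 0.0) - log_pdf(x, mu, cov_inv, log_det)
            est += np.exp(log_w)
    return est / N

# First-stage samples: for illustration, failure points filtered from a crude shifted proposal.
crude = rng.multivariate_normal(np.full(M, 0.7), np.eye(M), size=5000)
first_stage = crude[[fails(x) for x in crude]][:200]
print(two_stage(first_stage, N=20_000))
```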

CHAPTER 5 NUMERICAL EXAMPLES

Fig. 4. A 6-T SRAM cell (the test case used to demonstrate the efficacy of the proposed Gibbs sampling method).

Fig. 4 shows the circuit of a 6-T SRAM cell designed in a commercial 90nm process. In this chapter, the SRAM cell is used to demonstrate the efficiency of the proposed Gibbs sampling method. For testing and comparison purposes, four different importance sampling methods are used: (1) mixture importance sampling (MIS) [8], (2) minimum-norm importance sampling (MNIS) [14], (3) the proposed Gibbs sampling implemented for Cartesian coordinate systems (G-C), and (4) the proposed Gibbs sampling implemented for spherical coordinate systems (G-S). All four methods employ a two-stage analysis flow: a multivariate Normal distribution g_{NOR}(x) is first constructed from K random samples during the first stage, and then N random samples are drawn from g_{NOR}(x) to calculate the failure probability at the second stage. In our experiments, all four methods use the same number of first-stage samples K to determine g_{NOR}(x). Our numerical experiments are run on a 2.53 GHz Linux server with 6 GB of memory.

5.1 NOISE MARGIN

In this sub-chapter, two performance metrics, read noise margin (RNM) and write noise margin (WNM), are used to assess the stability of the SRAM cell. Both RNM and WNM must be greater than zero to ensure a stable SRAM cell. The four importance sampling methods are applied to estimate the failure rates associated with RNM and WNM.

Fig. 5. A simulation method to calculate RNM and WNM. (a) Noise margin estimation. The characteristics of the normal and the mirrored inverter are defined by the functions y = F1(x) and y = F2(x), where the latter is the mirrored version of y = F1(x). (b) and (c) Circuit diagrams to calculate v for different values of u.

In this paper, only the static noise sources are taken into account. We use a simulation method, as shown in Fig. 5, to calculate RNM and WNM. The method draws and mirrors the two inverters' characteristics and then finds the maximal possible square between the two curves [8]. Fig. 5(a) shows two coordinate systems that are rotated by 45 degrees relative to each other. In the (u, v) system, the value of v2 − v1 at a given u yields curve A, from which RNM and WNM are estimated. To find F1(x) and F2(x) in terms of u and v, we transform the (x, y) coordinates into the (u, v) system based on the circuits shown in Fig. 5(b) and Fig. 5(c). The maximal v2 − v1 value gives a measure of RNM and WNM. The only difference between calculating RNM and WNM with this method is that the operating states of the inverters F1 and F2 are different: the node voltages of F1 and F2 are set according to the reading or writing state of the SRAM cell.

Fig. 6. Random samples generated by MIS at the second stage, plotted over the VTH mismatches of the critical transistors and labeled as Pass or Fail. (a) RNM. (b) WNM.

Fig. 7. Random samples generated by MNIS at the second stage, plotted in the same way. (a) RNM. (b) WNM.

Fig. 8. Random samples generated by G-C at the second stage, plotted over the VTH mismatches of the critical transistors and labeled as Pass or Fail. (a) RNM. (b) WNM.

Fig. 9. Random samples generated by G-S at the second stage, plotted in the same way. (a) RNM. (b) WNM.

To compare the four importance sampling algorithms, Fig. 6-Fig. 9 plot the random samples that are generated at the second stage. For illustration purposes, these figures only show the VTH mismatches of the two transistors that are most critical to the performance of interest (i.e., VTH1 and VTH3 for RNM, and VTH3 and VTH5 for WNM). Each random sample is labeled as Pass or Fail, indicating whether the performance of interest passes or fails the specification.

There are two important observations from Fig. 6-Fig. 9. First, both MIS and MNIS cannot accurately capture the optimal PDF g^{OPT}(x). These two traditional methods only identify the mean value of g^{OPT}(x) while ignoring the covariance matrix. As a result, a large number of the random samples generated at the second stage do not fall into the failure region. On the other hand, the proposed Gibbs sampling algorithms (i.e., G-C and G-S) are able to accurately capture both the mean value and the covariance matrix of g^{OPT}(x). For this reason, most sampling points generated by G-C and G-S are inside the failure region, and the accuracy of failure rate prediction is thereby improved.

Fig. 10. Estimated failure probability, plotted as a function of the number of transistor-level simulations at the second stage, for MIS, MNIS, G-C and G-S. (a) RNM. (b) WNM.

Fig. 10 shows the estimated failure probability as a function of the number of transistor-level simulations at the second stage. All four importance sampling methods yield the same failure probability if the number of random samples is sufficiently large. However, the proposed Gibbs sampling methods (i.e., G-C and G-S) are more accurate than the two traditional methods (i.e., MIS and MNIS) for the same number of random samples.

Fig. 11. Relative error of failure rate prediction (defined by the 95% confidence interval), plotted as a function of the number of transistor-level simulations at the second stage. (a) RNM. (b) WNM.

To quantitatively assess the accuracy of the different importance sampling methods, Fig. 11 plots the relative prediction error as a function of the number of simulations. Here, the relative error is defined as the ratio of the 95% confidence interval over the estimated failure probability.

TABLE I
NUMBER OF REQUIRED SIMULATIONS DURING BOTH FIRST AND SECOND STAGES TO ACHIEVE 5% ERROR DEFINED BY THE 95% CONFIDENCE INTERVAL

  Method            RNM    WNM
  MIS [8]
  MNIS [4]           9      7
  G-C (Proposed)    69      6
  G-S (Proposed)     8     93

Table I further compares the number of required simulations during both the first and second stages to achieve 5% error. Note that the proposed Gibbs sampling methods require far fewer simulations in total than MIS and MNIS to achieve the same accuracy; in other words, G-C and G-S achieve 1.6x to 3.9x runtime speed-up over MIS and MNIS in this example.

An expensive transistor-level simulation is required to evaluate each sampling point. Table II lists the equivalent transistor-level simulation runtime for the RNM failure rate estimation. The numerical experiments were run on a 2.53 GHz Linux server with 6 GB of memory, where one transistor-level simulation of the SRAM cell circuit takes on the order of seconds. For the standard Monte Carlo analysis, the runtime is impractical because over 10^8 samples are needed to achieve an error below 5%. Compared with standard Monte Carlo, the four importance sampling methods achieve a gain of several orders of magnitude in runtime. Moreover, G-C and G-S perform even better than the MIS and MNIS methods.

TABLE II
RUNTIME IN DAYS FOR RNM FAILURE RATE ESTIMATION TO ACHIEVE 5% ERROR DEFINED BY THE 95% CONFIDENCE INTERVAL

  Method            Runtime (days)
  Monte Carlo       776
  MIS [8]           .875
  MNIS [4]          .493
  G-C (Proposed)    .799
  G-S (Proposed)    .938

Finally, it is worth mentioning that even though G-C and G-S yield similar accuracy in this example,

they can lead to substantially different results in a number of other cases. One such example is discussed in detail in the next sub-chapter.

5.2 READ CURRENT

In this sub-chapter, we consider the read current of the SRAM cell as the performance of interest. Given the circuit schematic shown in Fig. 4, the read current is measured as the drain current of the transistor M3 when the word line and both bit lines are connected to the supply voltage VDD. If the read current is no less than a pre-defined threshold, the performance of the SRAM cell is considered as Pass. We apply the four importance sampling methods to estimate the failure rate associated with the read current, and only the local VTH mismatches of the transistors M1 and M3 are considered, for the following reasons. First, the read current variation is dominated by the local VTH mismatches of these two transistors. Second, considering VTH1 and VTH3 only facilitates a comprehensive comparison between the different importance sampling algorithms.

Fig. 12. Estimated failure probability of the read current, plotted as a function of the number of transistor-level simulations at the second stage, for MIS, MNIS, G-C and G-S.

Fig. 12 shows the estimated failure probability as a function of the number of transistor-level simulations at the second stage. Unlike the example in the previous sub-chapter, where all four importance sampling methods eventually converge to the same failure probability, the proposed Gibbs sampling for spherical coordinate systems (i.e., G-S) results in a failure rate that is different from the other three methods (i.e., MIS, MNIS and G-C). In order to verify the accuracy of these four importance sampling methods, we further implement a brute-force Monte Carlo analysis engine in which seven million random samples are directly drawn from the PDF of process variations, i.e., f(x) in Eq. (1). The failure probability calculated by the brute-force Monte Carlo engine is used as the golden result to evaluate the accuracy of all

importance sampling methods in this example.

TABLE III
FAILURE PROBABILITY OF THE READ CURRENT ESTIMATED BY DIFFERENT IMPORTANCE SAMPLING METHODS (columns: # of Simulations, Failure Rate, Relative Error; rows: MIS [8], MNIS [4], G-C (Proposed), G-S (Proposed), Brute-force MC)

Table III summarizes the failure probability estimated by the four importance sampling methods and the brute-force Monte Carlo analysis engine. Based on Table III, we notice that G-S is more accurate than the other three importance sampling methods, because the failure rate estimated by G-S is almost identical to that estimated by the brute-force Monte Carlo engine. In addition, in this example it is important to note that the error associated with MIS, MNIS and G-C cannot be efficiently reduced by increasing the number of samples at the second stage. In other words, while G-S accurately estimates the correct failure probability, MIS, MNIS and G-C all fail to work in this example.

To fully understand the limitations of MIS, MNIS and G-C, we uniformly sample the variation space to identify the failure region for the read current. Since only the local VTH mismatches of two transistors are considered in this example, the two-dimensional failure region can easily be found by the aforementioned uniform sampling, as shown in Fig. 13. Several important observations can be obtained by studying Fig. 13. First, the proposed G-S method is able to generate random samples that fully cover the high-probability failure region. Second, the other three importance sampling methods cannot appropriately sample the high-probability failure region; they only draw samples from a small portion of it, thereby underestimating the failure probability.

Fig. 13. Two-dimensional failure region for the read current, plotted in the (VTH1, VTH3) plane. Each green square represents a failure point that is randomly sampled from a two-dimensional uniform distribution. Each black cross denotes a failure point that is generated by the importance sampling methods at the second stage: (a) MIS [8], (b) MNIS [4], (c) G-C, and (d) G-S.

In this example, MIS, MNIS and G-C all fail to work because the high-probability failure region has an irregular and non-convex shape. When MIS and MNIS are applied, they only shift the mean value of the multivariate Normal distribution f(x) in Eq. (1) to construct a new Normal distribution g_{NOR}(x). Since the covariance matrix is not optimally estimated, both MIS and MNIS cannot correctly capture the failure region of interest. The proposed G-C method samples a one-dimensional conditional PDF in the Cartesian coordinate system at each iteration step. Since the high-probability failure region is non-convex, G-C is not able to fully explore it with a limited number of Gibbs samples at the first stage. As a result, the random samples generated by the second stage of G-C cannot correctly cover the failure region of interest. This example demonstrates the important fact that even though both G-C and G-S rely on Gibbs sampling, they can lead to substantially different results due to their different implementations.

5.3 CONSIDERATIONS FOR OTHER APPLICATIONS

This work has focused on predicting the failure rate of SRAM cell circuits. Although the proposed method can be applied to other circuits, two constraints must be satisfied for the Gibbs sampling method to be applied efficiently. First, the failure region of the circuit needs to be continuous, meaning that there are no disconnected failure regions. If the circuit has multiple disjoint failure regions, the proposed algorithm will only generate random samples in one of them, and the estimated result would not be accurate. Second, the number of random variables of the circuit should not be too large. For a small or medium number of variables, the method yields a significant gain in speed and accuracy; for a much larger number of variables, however, the method would require a very large number of random samples, and the efficiency of the algorithm would diminish. For other circuit applications, if the above two constraints are satisfied, the proposed Gibbs sampling algorithm can dynamically explore the region of interest and achieve high accuracy with low computational cost; otherwise, the cost of the algorithm would be high and the estimated result might not be accurate enough.

CHAPTER 6 CONCLUSIONS

Statistical analysis of SRAM cells is a challenging problem because the failure rate of SRAM cells is extremely low. In this project, we develop an efficient importance sampling algorithm to estimate the failure rate of SRAM cells. In particular, we adapt the Gibbs sampling technique from the statistics community to find the optimal probability distribution of process variations with low computational cost. More formally, the proposed Gibbs sampling method adaptively explores the failure region in a Cartesian or spherical coordinate system by sampling a sequence of one-dimensional probability distributions. Several implementation issues, such as one-dimensional random sampling and starting point selection, are carefully studied to make the Gibbs sampling method more efficient and accurate. The experimental results for a commercial 90nm SRAM cell demonstrate that the proposed Gibbs sampling method achieves 1.6x to 3.9x runtime speed-up over other state-of-the-art techniques without surrendering any accuracy. We further demonstrate that in certain applications the proposed Gibbs sampling algorithm accurately estimates the correct failure probability while the traditional techniques fail to work. The Gibbs sampling technique can be incorporated into a statistical optimization environment for accurate and efficient parametric yield optimization of SRAM cell circuits.

APPENDIX

Proof of Theorem 1: It has been shown in [17] that if the random variables {α_m; m = 1, 2, ..., M} are mutually independent and standard Normal, the variables {α_m / ||α||_2; m = 1, 2, ..., M} are uniformly distributed on the surface of a sphere with unit radius. Hence, for a given value of r, the random variables {x_m; m = 1, 2, ..., M} defined in Eq. (11) are uniformly distributed on the surface of the sphere ||x||_2 = r. In other words, for all points on the surface of the sphere ||x||_2 = r, the PDF values are identical.

In order to calculate the joint PDF of x, we consider the sphere ||x||_2 = r and add a small perturbation Δr to the radius, as shown by the three-dimensional example in Fig. 14. If the perturbation Δr is sufficiently small (i.e., Δr → 0), the values of f(x) at all points within the region r ≤ ||x||_2 ≤ r + Δr are identical.

Fig. 14. A perturbation Δr, where Δr → 0, is applied to the radius of the three-dimensional sphere ||x||_2 = r.

As shown in Eq. (13), the radius r follows the Chi distribution with M degrees of freedom. The CDF of r is represented as []:

  F(r) = \int_{0}^{r} \frac{t^{M-1} e^{-0.5 t^2}}{2^{0.5M-1} \Gamma(0.5M)} \, dt.   (32)

Hence, the probability of r ≤ ||x||_2 ≤ r + Δr, where Δr → 0, is equal to:

  F(r + Δr) - F(r) ≈ \frac{dF(r)}{dr} \cdot Δr = \frac{r^{M-1} e^{-0.5 r^2}}{2^{0.5M-1} \Gamma(0.5M)} \cdot Δr.   (33)

On the other hand, the volume of an M-dimensional sphere with radius r can be written as []:

  V(r) = \frac{\pi^{0.5M}}{\Gamma(0.5M + 1)} r^{M}.   (34)


More information

A Robustness Optimization of SRAM Dynamic Stability by Sensitivity-based Reachability Analysis

A Robustness Optimization of SRAM Dynamic Stability by Sensitivity-based Reachability Analysis ASP-DAC 2014 A Robustness Optimization of SRAM Dynamic Stability by Sensitivity-based Reachability Analysis Yang Song, Sai Manoj P. D. and Hao Yu School of Electrical and Electronic Engineering, Nanyang

More information

Simulation. Where real stuff starts

Simulation. Where real stuff starts 1 Simulation Where real stuff starts ToC 1. What is a simulation? 2. Accuracy of output 3. Random Number Generators 4. How to sample 5. Monte Carlo 6. Bootstrap 2 1. What is a simulation? 3 What is a simulation?

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

Lecture 3: Pattern Classification. Pattern classification

Lecture 3: Pattern Classification. Pattern classification EE E68: Speech & Audio Processing & Recognition Lecture 3: Pattern Classification 3 4 5 The problem of classification Linear and nonlinear classifiers Probabilistic classification Gaussians, mitures and

More information

Linear Discriminant Functions

Linear Discriminant Functions Linear Discriminant Functions Linear discriminant functions and decision surfaces Definition It is a function that is a linear combination of the components of g() = t + 0 () here is the eight vector and

More information

Class 26: review for final exam 18.05, Spring 2014

Class 26: review for final exam 18.05, Spring 2014 Probability Class 26: review for final eam 8.05, Spring 204 Counting Sets Inclusion-eclusion principle Rule of product (multiplication rule) Permutation and combinations Basics Outcome, sample space, event

More information

Central Limit Theorem and the Law of Large Numbers Class 6, Jeremy Orloff and Jonathan Bloom

Central Limit Theorem and the Law of Large Numbers Class 6, Jeremy Orloff and Jonathan Bloom Central Limit Theorem and the Law of Large Numbers Class 6, 8.5 Jeremy Orloff and Jonathan Bloom Learning Goals. Understand the statement of the law of large numbers. 2. Understand the statement of the

More information

Semi-Supervised Laplacian Regularization of Kernel Canonical Correlation Analysis

Semi-Supervised Laplacian Regularization of Kernel Canonical Correlation Analysis Semi-Supervised Laplacian Regularization of Kernel Canonical Correlation Analsis Matthew B. Blaschko, Christoph H. Lampert, & Arthur Gretton Ma Planck Institute for Biological Cbernetics Tübingen, German

More information

Fast Statistical Analysis of Rare Failure Events for Memory Circuits in High- Dimensional Variation Space

Fast Statistical Analysis of Rare Failure Events for Memory Circuits in High- Dimensional Variation Space Fast Statistical Analysis of Rare Failure Events for Memory Circuits in High- Dimensional Variation Space Shupeng Sun and Xin Li Electrical & Computer Engineering Department, Carnegie Mellon University

More information

CLASS NOTES Computational Methods for Engineering Applications I Spring 2015

CLASS NOTES Computational Methods for Engineering Applications I Spring 2015 CLASS NOTES Computational Methods for Engineering Applications I Spring 2015 Petros Koumoutsakos Gerardo Tauriello (Last update: July 2, 2015) IMPORTANT DISCLAIMERS 1. REFERENCES: Much of the material

More information

Generic Stochastic Effect Model of Noise in Controlling Source to Electronically Controlled Device

Generic Stochastic Effect Model of Noise in Controlling Source to Electronically Controlled Device Generic Stochastic Effect Model of oise in Controlling Source to Electronically Controlled Device RAWID BACHUI Department of Computer Engineering Siam University 35 Petchakasem Rd., Phasi-charoen, Bangkok

More information

Lecture 4.2 Finite Difference Approximation

Lecture 4.2 Finite Difference Approximation Lecture 4. Finite Difference Approimation 1 Discretization As stated in Lecture 1.0, there are three steps in numerically solving the differential equations. They are: 1. Discretization of the domain by

More information

Introduction...2 Chapter Review on probability and random variables Random experiment, sample space and events

Introduction...2 Chapter Review on probability and random variables Random experiment, sample space and events Introduction... Chapter...3 Review on probability and random variables...3. Random eperiment, sample space and events...3. Probability definition...7.3 Conditional Probability and Independence...7.4 heorem

More information

STA 414/2104, Spring 2014, Practice Problem Set #1

STA 414/2104, Spring 2014, Practice Problem Set #1 STA 44/4, Spring 4, Practice Problem Set # Note: these problems are not for credit, and not to be handed in Question : Consider a classification problem in which there are two real-valued inputs, and,

More information

1 Kernel methods & optimization

1 Kernel methods & optimization Machine Learning Class Notes 9-26-13 Prof. David Sontag 1 Kernel methods & optimization One eample of a kernel that is frequently used in practice and which allows for highly non-linear discriminant functions

More information

4.3 How derivatives affect the shape of a graph. The first derivative test and the second derivative test.

4.3 How derivatives affect the shape of a graph. The first derivative test and the second derivative test. Chapter 4: Applications of Differentiation In this chapter we will cover: 41 Maimum and minimum values The critical points method for finding etrema 43 How derivatives affect the shape of a graph The first

More information

ebay/google short course: Problem set 2

ebay/google short course: Problem set 2 18 Jan 013 ebay/google short course: Problem set 1. (the Echange Parado) You are playing the following game against an opponent, with a referee also taking part. The referee has two envelopes (numbered

More information

Calculus of Variation An Introduction To Isoperimetric Problems

Calculus of Variation An Introduction To Isoperimetric Problems Calculus of Variation An Introduction To Isoperimetric Problems Kevin Wang The University of Sydney SSP Working Seminars, MATH2916 May 4, 2013 Contents I Lagrange Multipliers 2 1 Single Constraint Lagrange

More information

MRC: The Maximum Rejection Classifier for Pattern Detection. With Michael Elad, Renato Keshet

MRC: The Maximum Rejection Classifier for Pattern Detection. With Michael Elad, Renato Keshet MRC: The Maimum Rejection Classifier for Pattern Detection With Michael Elad, Renato Keshet 1 The Problem Pattern Detection: Given a pattern that is subjected to a particular type of variation, detect

More information

Lecture 26: April 22nd

Lecture 26: April 22nd 10-725/36-725: Conve Optimization Spring 2015 Lecture 26: April 22nd Lecturer: Ryan Tibshirani Scribes: Eric Wong, Jerzy Wieczorek, Pengcheng Zhou Note: LaTeX template courtesy of UC Berkeley EECS dept.

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

6. Vector Random Variables

6. Vector Random Variables 6. Vector Random Variables In the previous chapter we presented methods for dealing with two random variables. In this chapter we etend these methods to the case of n random variables in the following

More information

Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing

Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing So, What is Statistics? Theory and techniques for learning from data How to collect How to analyze How to interpret

More information

9 - Markov processes and Burt & Allison 1963 AGEC

9 - Markov processes and Burt & Allison 1963 AGEC This document was generated at 8:37 PM on Saturday, March 17, 2018 Copyright 2018 Richard T. Woodward 9 - Markov processes and Burt & Allison 1963 AGEC 642-2018 I. What is a Markov Chain? A Markov chain

More information

Unconstrained Multivariate Optimization

Unconstrained Multivariate Optimization Unconstrained Multivariate Optimization Multivariate optimization means optimization of a scalar function of a several variables: and has the general form: y = () min ( ) where () is a nonlinear scalar-valued

More information

Economics 205 Exercises

Economics 205 Exercises Economics 05 Eercises Prof. Watson, Fall 006 (Includes eaminations through Fall 003) Part 1: Basic Analysis 1. Using ε and δ, write in formal terms the meaning of lim a f() = c, where f : R R.. Write the

More information

MACHINE LEARNING ADVANCED MACHINE LEARNING

MACHINE LEARNING ADVANCED MACHINE LEARNING MACHINE LEARNING ADVANCED MACHINE LEARNING Recap of Important Notions on Estimation of Probability Density Functions 22 MACHINE LEARNING Discrete Probabilities Consider two variables and y taking discrete

More information

Introduction to Probability Theory for Graduate Economics Fall 2008

Introduction to Probability Theory for Graduate Economics Fall 2008 Introduction to Probability Theory for Graduate Economics Fall 008 Yiğit Sağlam October 10, 008 CHAPTER - RANDOM VARIABLES AND EXPECTATION 1 1 Random Variables A random variable (RV) is a real-valued function

More information

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr. Simulation Discrete-Event System Simulation Chapter 4 Statistical Models in Simulation Purpose & Overview The world the model-builder sees is probabilistic rather than deterministic. Some statistical model

More information

5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER /$ IEEE

5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER /$ IEEE 5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER 2009 Hyperplane-Based Vector Quantization for Distributed Estimation in Wireless Sensor Networks Jun Fang, Member, IEEE, and Hongbin

More information

Chapter 4 (Lecture 6-7) Schrodinger equation for some simple systems Table: List of various one dimensional potentials System Physical correspondence

Chapter 4 (Lecture 6-7) Schrodinger equation for some simple systems Table: List of various one dimensional potentials System Physical correspondence V, E, Chapter (Lecture 6-7) Schrodinger equation for some simple systems Table: List of various one dimensional potentials System Physical correspondence Potential Total Energies and Probability density

More information

component risk analysis

component risk analysis 273: Urban Systems Modeling Lec. 3 component risk analysis instructor: Matteo Pozzi 273: Urban Systems Modeling Lec. 3 component reliability outline risk analysis for components uncertain demand and uncertain

More information

Basic methods to solve equations

Basic methods to solve equations Roberto s Notes on Prerequisites for Calculus Chapter 1: Algebra Section 1 Basic methods to solve equations What you need to know already: How to factor an algebraic epression. What you can learn here:

More information

A Statistical Study of the Effectiveness of BIST Jitter Measurement Techniques

A Statistical Study of the Effectiveness of BIST Jitter Measurement Techniques A Statistical Study of the Effectiveness of BIST Jitter Measurement Techniques David Bordoley, Hieu guyen, Mani Soma Department of Electrical Engineering, University of Washington, Seattle WA {bordoley,

More information

Math 180A. Lecture 16 Friday May 7 th. Expectation. Recall the three main probability density functions so far (1) Uniform (2) Exponential.

Math 180A. Lecture 16 Friday May 7 th. Expectation. Recall the three main probability density functions so far (1) Uniform (2) Exponential. Math 8A Lecture 6 Friday May 7 th Epectation Recall the three main probability density functions so far () Uniform () Eponential (3) Power Law e, ( ), Math 8A Lecture 6 Friday May 7 th Epectation Eample

More information

State Space and Hidden Markov Models

State Space and Hidden Markov Models State Space and Hidden Markov Models Kunsch H.R. State Space and Hidden Markov Models. ETH- Zurich Zurich; Aliaksandr Hubin Oslo 2014 Contents 1. Introduction 2. Markov Chains 3. Hidden Markov and State

More information

2014 Mathematics. Advanced Higher. Finalised Marking Instructions

2014 Mathematics. Advanced Higher. Finalised Marking Instructions 0 Mathematics Advanced Higher Finalised ing Instructions Scottish Qualifications Authority 0 The information in this publication may be reproduced to support SQA qualifications only on a noncommercial

More information

Machine Learning 4771

Machine Learning 4771 Machine Learning 4771 Instructor: Tony Jebara Topic 7 Unsupervised Learning Statistical Perspective Probability Models Discrete & Continuous: Gaussian, Bernoulli, Multinomial Maimum Likelihood Logistic

More information

Random Vectors. 1 Joint distribution of a random vector. 1 Joint distribution of a random vector

Random Vectors. 1 Joint distribution of a random vector. 1 Joint distribution of a random vector Random Vectors Joint distribution of a random vector Joint distributionof of a random vector Marginal and conditional distributions Previousl, we studied probabilit distributions of a random variable.

More information

Beyond Newton s method Thomas P. Minka

Beyond Newton s method Thomas P. Minka Beyond Newton s method Thomas P. Minka 2000 (revised 7/21/2017) Abstract Newton s method for optimization is equivalent to iteratively maimizing a local quadratic approimation to the objective function.

More information

Metric-based classifiers. Nuno Vasconcelos UCSD

Metric-based classifiers. Nuno Vasconcelos UCSD Metric-based classifiers Nuno Vasconcelos UCSD Statistical learning goal: given a function f. y f and a collection of eample data-points, learn what the function f. is. this is called training. two major

More information

Machine Learning Lecture 3

Machine Learning Lecture 3 Announcements Machine Learning Lecture 3 Eam dates We re in the process of fiing the first eam date Probability Density Estimation II 9.0.207 Eercises The first eercise sheet is available on L2P now First

More information

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling 10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel

More information

9th International Symposium on Quality Electronic Design

9th International Symposium on Quality Electronic Design 9th International Symposium on Quality Electronic Design Projection-Based Piecewise-Linear Response Surface Modeling for Strongly Nonlinear VLSI Performance Variations Xin Li Department of Electrical and

More information

Figure 1.1: Schematic symbols of an N-transistor and P-transistor

Figure 1.1: Schematic symbols of an N-transistor and P-transistor Chapter 1 The digital abstraction The term a digital circuit refers to a device that works in a binary world. In the binary world, the only values are zeros and ones. Hence, the inputs of a digital circuit

More information

Random Process Examples 1/23

Random Process Examples 1/23 ando Process Eaples /3 E. #: D-T White Noise Let ] be a sequence of V s where each V ] in the sequence is uncorrelated with all the others: E{ ] ] } 0 for This DEFINES a DT White Noise Also called Uncorrelated

More information

Multiple Random Variables

Multiple Random Variables Multiple Random Variables Joint Probability Density Let X and Y be two random variables. Their joint distribution function is F ( XY x, y) P X x Y y. F XY ( ) 1, < x

More information

Statistical Performance Modeling and Optimization

Statistical Performance Modeling and Optimization Foundations and Trends R in Electronic Design Automation Vol. 1, No. 4 (2006) 331 480 c 2007 X. Li, J. Le and L. T. Pileggi DOI: 10.1561/1000000008 Statistical Performance Modeling and Optimization Xin

More information

2 Generating Functions

2 Generating Functions 2 Generating Functions In this part of the course, we re going to introduce algebraic methods for counting and proving combinatorial identities. This is often greatly advantageous over the method of finding

More information

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear

More information

Chapter 2 Process Variability. Overview. 2.1 Sources and Types of Variations

Chapter 2 Process Variability. Overview. 2.1 Sources and Types of Variations Chapter 2 Process Variability Overview Parameter variability has always been an issue in integrated circuits. However, comparing with the size of devices, it is relatively increasing with technology evolution,

More information

Factor Analysis. Qian-Li Xue

Factor Analysis. Qian-Li Xue Factor Analysis Qian-Li Xue Biostatistics Program Harvard Catalyst The Harvard Clinical & Translational Science Center Short course, October 7, 06 Well-used latent variable models Latent variable scale

More information

Counting, symbols, positions, powers, choice, arithmetic, binary, translation, hex, addresses, and gates.

Counting, symbols, positions, powers, choice, arithmetic, binary, translation, hex, addresses, and gates. Counting, symbols, positions, powers, choice, arithmetic, binary, translation, he, addresses, and gates. Counting. Suppose the concern is a collection of objects. As an eample, let the objects be denoted

More information

Safe and Tight Linear Estimators for Global Optimization

Safe and Tight Linear Estimators for Global Optimization Safe and Tight Linear Estimators for Global Optimization Glencora Borradaile and Pascal Van Hentenryck Brown University Bo 1910 Providence, RI, 02912 {glencora, pvh}@cs.brown.edu Abstract. Global optimization

More information

A Comparison of Some Methods for Bounding Connected and Disconnected Solution Sets of Interval Linear Systems

A Comparison of Some Methods for Bounding Connected and Disconnected Solution Sets of Interval Linear Systems A Comparison of Some Methods for Bounding Connected and Disconnected Solution Sets of Interval Linear Systems R. Baker Kearfott December 4, 2007 Abstract Finding bounding sets to solutions to systems of

More information

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models Fatih Cavdur fatihcavdur@uludag.edu.tr March 29, 2014 Introduction Introduction The world of the model-builder

More information

Gaussian Process Vine Copulas for Multivariate Dependence

Gaussian Process Vine Copulas for Multivariate Dependence Gaussian Process Vine Copulas for Multivariate Dependence José Miguel Hernández Lobato 1,2, David López Paz 3,2 and Zoubin Ghahramani 1 June 27, 2013 1 University of Cambridge 2 Equal Contributor 3 Ma-Planck-Institute

More information

Designing Information Devices and Systems I Spring 2018 Lecture Notes Note 11

Designing Information Devices and Systems I Spring 2018 Lecture Notes Note 11 EECS 16A Designing Information Devices and Systems I Spring 2018 Lecture Notes Note 11 11.1 Context Our ultimate goal is to design systems that solve people s problems. To do so, it s critical to understand

More information

Partitioning variation in multilevel models.

Partitioning variation in multilevel models. Partitioning variation in multilevel models. by Harvey Goldstein, William Browne and Jon Rasbash Institute of Education, London, UK. Summary. In multilevel modelling, the residual variation in a response

More information

Probabilistic Graphical Models

Probabilistic Graphical Models School of Computer Science Probabilistic Graphical Models Gaussian graphical models and Ising models: modeling networks Eric Xing Lecture 0, February 5, 06 Reading: See class website Eric Xing @ CMU, 005-06

More information

On the Optimal Scaling of the Modified Metropolis-Hastings algorithm

On the Optimal Scaling of the Modified Metropolis-Hastings algorithm On the Optimal Scaling of the Modified Metropolis-Hastings algorithm K. M. Zuev & J. L. Beck Division of Engineering and Applied Science California Institute of Technology, MC 4-44, Pasadena, CA 925, USA

More information

Gaussian Processes in Reinforcement Learning

Gaussian Processes in Reinforcement Learning Gaussian Processes in Reinforcement Learning Carl Edward Rasmussen and Malte Kuss Ma Planck Institute for Biological Cybernetics Spemannstraße 38, 776 Tübingen, Germany {carl,malte.kuss}@tuebingen.mpg.de

More information

Chapter 5 MOSFET Theory for Submicron Technology

Chapter 5 MOSFET Theory for Submicron Technology Chapter 5 MOSFET Theory for Submicron Technology Short channel effects Other small geometry effects Parasitic components Velocity saturation/overshoot Hot carrier effects ** Majority of these notes are

More information

Optics in a field of gravity

Optics in a field of gravity Optics in a field of gravity E. Eriksen # and Ø. Grøn # # Institute of Physics, University of Oslo, P.O.Bo 48 Blindern, N-36 Oslo, Norway Department of Engineering, Oslo University College, St.Olavs Pl.

More information

1. Sets A set is any collection of elements. Examples: - the set of even numbers between zero and the set of colors on the national flag.

1. Sets A set is any collection of elements. Examples: - the set of even numbers between zero and the set of colors on the national flag. San Francisco State University Math Review Notes Michael Bar Sets A set is any collection of elements Eamples: a A {,,4,6,8,} - the set of even numbers between zero and b B { red, white, bule} - the set

More information

An example to illustrate frequentist and Bayesian approches

An example to illustrate frequentist and Bayesian approches Frequentist_Bayesian_Eample An eample to illustrate frequentist and Bayesian approches This is a trivial eample that illustrates the fundamentally different points of view of the frequentist and Bayesian

More information

Random Fields in Bayesian Inference: Effects of the Random Field Discretization

Random Fields in Bayesian Inference: Effects of the Random Field Discretization Random Fields in Bayesian Inference: Effects of the Random Field Discretization Felipe Uribe a, Iason Papaioannou a, Wolfgang Betz a, Elisabeth Ullmann b, Daniel Straub a a Engineering Risk Analysis Group,

More information

Plotting data is one method for selecting a probability distribution. The following

Plotting data is one method for selecting a probability distribution. The following Advanced Analytical Models: Over 800 Models and 300 Applications from the Basel II Accord to Wall Street and Beyond By Johnathan Mun Copyright 008 by Johnathan Mun APPENDIX C Understanding and Choosing

More information

2.7 The Gaussian Probability Density Function Forms of the Gaussian pdf for Real Variates

2.7 The Gaussian Probability Density Function Forms of the Gaussian pdf for Real Variates .7 The Gaussian Probability Density Function Samples taken from a Gaussian process have a jointly Gaussian pdf (the definition of Gaussian process). Correlator outputs are Gaussian random variables if

More information

Chapter 2. Discrete Distributions

Chapter 2. Discrete Distributions Chapter. Discrete Distributions Objectives ˆ Basic Concepts & Epectations ˆ Binomial, Poisson, Geometric, Negative Binomial, and Hypergeometric Distributions ˆ Introduction to the Maimum Likelihood Estimation

More information

Machine Learning Basics III

Machine Learning Basics III Machine Learning Basics III Benjamin Roth CIS LMU München Benjamin Roth (CIS LMU München) Machine Learning Basics III 1 / 62 Outline 1 Classification Logistic Regression 2 Gradient Based Optimization Gradient

More information