M.S. Project Report. Efficient Failure Rate Prediction for SRAM Cells via Gibbs Sampling. Yamei Feng 12/15/2011

M.S. Project Report

Efficient Failure Rate Prediction for SRAM Cells via Gibbs Sampling

Yamei Feng
12/15/2011

Committee Members: Prof. Xin Li, Prof. Ken Mai

Table of Contents

CHAPTER 1 INTRODUCTION
CHAPTER 2 BACKGROUND
CHAPTER 3 GIBBS SAMPLING
  3.1 GIBBS SAMPLING IN THE CARTESIAN COORDINATE SYSTEM
  3.2 GIBBS SAMPLING IN THE SPHERICAL COORDINATE SYSTEM
CHAPTER 4 IMPLEMENTATION DETAILS
  4.1 ONE-DIMENSIONAL INVERSE-TRANSFORM SAMPLING
  4.2 INITIAL STARTING POINT SELECTION
  4.3 TWO-STAGE MONTE CARLO FLOW
CHAPTER 5 NUMERICAL EXAMPLES
  5.1 NOISE MARGIN
  5.2 READ CURRENT
  5.3 CONSIDERATIONS FOR OTHER APPLICATIONS
CHAPTER 6 CONCLUSIONS
APPENDIX
REFERENCES

CHAPTER 1 INTRODUCTION

As deep sub-micron technology advances, process variations pose a new set of challenges for static random access memory (SRAM) design. SRAM is widely embedded in semiconductor chips; roughly half of the area of an advanced microprocessor chip is occupied by SRAM [9]. SRAM cells are generally designed with minimum-size devices [9] and can be significantly impacted by large-scale process variations (e.g., local mismatches caused by random doping fluctuations) in nanoscale technology [1]-[4]. For this reason, it becomes increasingly critical to evaluate the failure rate of SRAM cells both efficiently and accurately in order to achieve a robust design.

Towards this goal, a number of statistical analysis methods have been proposed for SRAM circuits [5]-[15]. For instance, analytical performance models have been derived to predict SRAM parametric yield [5]-[7]. While these models offer great design insight into SRAM circuits, they may not accurately capture the circuit behavior due to various approximations. Another possible approach for SRAM failure rate prediction is based on transistor-level simulation, including both Monte Carlo analysis [8]-[14] and deterministic failure region prediction [15]. Since SRAM cells typically have an extremely small failure probability (e.g., 10^-8 ~ 10^-6), a simple Monte Carlo method that directly samples the variation space suffers from slow convergence, as only a few random samples will fall into the failure region. To improve the sampling efficiency, a number of importance sampling methods have been proposed for fast SRAM failure rate prediction [8]-[14]. The key idea of importance sampling is to directly sample the failure region based on a distorted probability density function (PDF), instead of the original PDF of process variations.

Applying importance sampling to SRAM analysis, however, is not trivial. The efficiency of importance sampling heavily relies on the choice of the distorted PDF that is used to generate random samples. Ideally, we should sample the failure region that is most likely to occur. Such a goal, however, is extremely difficult to achieve, as we never exactly know the failure region in practice. The challenge here is how to determine the optimal PDF so that the SRAM failure rate can be efficiently predicted.

In this paper, a new importance sampling method, Gibbs sampling, is proposed to improve the efficiency of SRAM failure rate prediction. Unlike the traditional Monte Carlo algorithm that samples a given PDF, the proposed Gibbs sampling approach does not need to know the sampling PDF explicitly. It adaptively searches the failure region and then generates random samples in it. When applied to SRAM failure rate analysis, Gibbs sampling can be conceptually viewed as a unique Monte Carlo method with an integrated optimization engine that allows us to efficiently explore the failure region. Thus, the SRAM

failure probability can be accurately predicted with a small number of sampling points. Our experimental results for a commercial 90nm SRAM cell demonstrate that, compared to other state-of-the-art techniques, the proposed Gibbs sampling algorithm achieves 1.6x~3.9x runtime speed-up without surrendering any accuracy. In addition, we further demonstrate that in certain applications the proposed Gibbs sampling algorithm accurately estimates the correct failure probability while the traditional techniques fail to work.

While the Gibbs sampling method was initially developed by the statistics community [16], [], in this paper it is particularly tuned for our SRAM analysis application via four important new contributions. First, the proposed Gibbs sampling algorithm is implemented for both Cartesian and spherical coordinate systems. The accuracy achieved by these two different implementations is application-dependent, as will be demonstrated in Chapter 5. Also, a novel variable mapping scheme is derived to facilitate efficient statistical sampling in the spherical coordinate system. Second, as previously mentioned, Gibbs sampling iteratively searches the failure region. At each iteration step, it needs to generate a random sample from an irregular one-dimensional PDF that is not simply uniform or Normal. Such a sampling task cannot easily be done by directly using a random number generator. For this reason, we propose to adopt the inverse-transform method [] and incorporate it into our Gibbs sampling engine. As such, randomly sampling an arbitrary one-dimensional PDF can be performed efficiently, which is a key technique that makes the proposed Gibbs sampling of practical utility. Third, an efficient implementation of Gibbs sampling requires a good starting point to speed up convergence. This is similar to most optimization algorithms, where a good starting point facilitates fast convergence. In this paper, we propose a model-based optimization to determine a good starting point. Finally, to further improve prediction accuracy and reduce computational cost, a two-stage Monte Carlo flow is developed: Gibbs sampling is applied to create a set of random samples during the first stage, and these samples are used to learn the optimal PDF for importance sampling. Next, during the second stage, a large number of random samples are efficiently generated from the learned PDF to accurately estimate the failure probability.

The remainder of this paper is organized as follows. In Chapter 2, we briefly review the background on importance sampling, and we then propose the Gibbs sampling method in Chapter 3. Several implementation issues (including one-dimensional inverse-transform sampling, initial starting point selection, etc.) are discussed in detail in Chapter 4. A commercial 90nm SRAM cell is used to demonstrate the efficacy of the proposed Gibbs sampling method in Chapter 5. Finally, we conclude in Chapter 6.

CHAPTER 2 BACKGROUND

Suppose that x = [x_1 x_2 ... x_M]^T is an M-dimensional random variable that models process variations and has a joint PDF f(x). Typically, x is modeled as a multivariate Normal distribution [5]-[15]. Without loss of generality, we further assume that the random variables {x_m; m = 1, 2, ..., M} in the vector x are mutually independent and standard Normal (i.e., with zero mean and unit variance):

  f(x) = \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{x_m^2}{2}\right).   (1)

Any correlated random variables that are jointly Normal can be transformed to the independent random variables {x_m; m = 1, 2, ..., M} by principal component analysis (PCA) [].

The failure probability of an SRAM cell can be mathematically represented as [8]:

  P_f = \int_{\Omega} f(x) \, dx   (2)

where Ω denotes the failure region (i.e., the subset of the variation space where the performance of interest does not meet the specification). Alternatively, the failure probability in Eq. (2) can be written as:

  P_f = \int I(x) f(x) \, dx   (3)

where I(x) represents the indicator function:

  I(x) = 1 if x ∈ Ω, and I(x) = 0 otherwise.   (4)

The failure probability P_f can be estimated by Monte Carlo analysis. The key idea is to draw N random samples from f(x) and then compute the mean of the indicator function over these samples:

  \tilde{P}_f^{MC} = \frac{1}{N} \sum_{n=1}^{N} I(x^{(n)})   (5)

where x^{(n)} is the nth random sample generated by Monte Carlo analysis.

For the SRAM application, the failure probability P_f in Eq. (3) is extremely small (e.g., 10^-8 ~ 10^-6) and most random samples created by Monte Carlo analysis do not fall into the failure region. Hence, a large number of samples (e.g., over 10^7 ~ 10^9) are needed to accurately estimate the failure rate. Note that an expensive transistor-level simulation is required to evaluate each sampling point; in other words, 10^7 ~ 10^9 simulation runs must be performed. This, in turn, implies that the aforementioned Monte Carlo method is extremely expensive, or even infeasible, when applied to most SRAM analysis problems.
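To make Eq. (5) concrete, the following illustrative Python sketch estimates a failure probability by plain Monte Carlo. The `fails()` function and its threshold are hypothetical stand-ins for a transistor-level simulation and a performance specification; neither is part of the report.

```python
import numpy as np

rng = np.random.default_rng(0)

def fails(x):
    # Hypothetical stand-in for a transistor-level simulation:
    # the "performance" violates its specification when the variation vector drifts too far.
    return x.sum() > 4.0

def monte_carlo_failure_rate(M, N):
    """Plain Monte Carlo estimate of P_f per Eq. (5)."""
    hits = 0
    for _ in range(N):
        x = rng.standard_normal(M)   # x ~ f(x): independent standard Normal, Eq. (1)
        if fails(x):                 # indicator function I(x), Eq. (4)
            hits += 1
    return hits / N                  # sample mean of the indicator, Eq. (5)

print(monte_carlo_failure_rate(M=6, N=100_000))
```

Because the true P_f of an SRAM cell is on the order of 10^-8 ~ 10^-6, almost every draw in such a loop returns zero, which is exactly the inefficiency the rest of the report addresses.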

To address this computational cost issue, importance sampling has been proposed to improve the efficiency of Monte Carlo analysis [8]-[14]. It aims to directly generate a large number of random samples in the failure region by using a distorted PDF g(x), and the failure probability can be expressed as [8]:

  P_f = \int I(x) \frac{f(x)}{g(x)} g(x) \, dx.   (6)

In other words, Eq. (6) calculates the expected value of the function I(x)·f(x)/g(x), where the random variable x follows the PDF g(x). If N sampling points {x^{(n)}; n = 1, 2, ..., N} are drawn from g(x), the failure probability in Eq. (6) can be estimated by [8]:

  \tilde{P}_f^{IS} = \frac{1}{N} \sum_{n=1}^{N} I(x^{(n)}) \frac{f(x^{(n)})}{g(x^{(n)})}.   (7)

Note that the estimated failure probabilities in Eq. (5) and Eq. (7) are identical if and only if the number of random samples N is infinite. In practice, when a finite number of sampling points is available, the results from Eq. (5) and Eq. (7) can be substantially different. If the distorted PDF g(x) is properly chosen, Eq. (7) can be much more accurate than the simple Monte Carlo estimate in Eq. (5). In theory, the optimal PDF g(x) with maximal estimation accuracy is []:

  g^{OPT}(x) = \frac{I(x) f(x)}{P_f}.   (8)

Intuitively, with Eq. (8), the function I(x)·f(x)/g^{OPT}(x) becomes a constant with zero variance. Hence, its expectation can be accurately estimated by Eq. (7) with few random samples.

Studying Eq. (8), we notice two important properties of the optimal PDF g^{OPT}(x). First, g^{OPT}(x) is non-zero if and only if the variable x sits in the failure region. This, in turn, implies that we should directly sample the failure region to achieve maximal accuracy. Second, g^{OPT}(x) is proportional to the original PDF f(x) of process variations. In other words, we should sample the part of the variation space where performance failure is most likely to occur.

In practice, however, sampling the optimal PDF g^{OPT}(x) in Eq. (8) is not trivial, as the indicator function I(x) is not known in advance. Most existing importance sampling algorithms apply various heuristics to approximate the optimal PDF g^{OPT}(x) [8]-[14]. In this paper, we propose a new Gibbs sampling method that adaptively samples the optimal PDF g^{OPT}(x) without explicitly knowing the indicator function I(x). As a result, the SRAM failure rate can be accurately predicted with low computational cost.
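As a companion to Eq. (7), this sketch re-weights samples drawn from a mean-shifted Normal g(x); the shift value and the `fails()` test are illustrative assumptions, not the distorted PDFs constructed later in the report.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 6

def fails(x):
    # Hypothetical failure test standing in for a transistor-level simulation.
    return x.sum() > 4.0

def log_normal_pdf(x, mean):
    d = x - mean
    return -0.5 * (d @ d) - 0.5 * len(x) * np.log(2 * np.pi)

def importance_sampling_failure_rate(N, shift):
    """Importance sampling estimate of P_f per Eq. (7); g(x) is f(x) shifted by a constant mean."""
    mean_g = np.full(M, shift)
    total = 0.0
    for _ in range(N):
        x = rng.standard_normal(M) + mean_g       # x ~ g(x)
        if fails(x):                              # I(x)
            total += np.exp(log_normal_pdf(x, np.zeros(M)) - log_normal_pdf(x, mean_g))  # f(x)/g(x)
    return total / N

print(importance_sampling_failure_rate(N=20_000, shift=0.8))
```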

CHAPTER 3 GIBBS SAMPLING

As described in Chapter 2, directly sampling the optimal PDF g^{OPT}(x) in Eq. (8) is difficult for two reasons. First, the indicator function I(x) is not known in advance. Second, since g^{OPT}(x) is not a simple multivariate statistical distribution such as a uniform or Normal distribution, it is extremely difficult, if not impossible, to directly draw random samples from g^{OPT}(x).

In this paper, we adopt the Gibbs sampling method [16], [] to predict the failure probability of SRAM cells. Compared to other traditional techniques [8]-[14], Gibbs sampling provides two promising features. First, it can efficiently search the failure region and determine the indicator function I(x) on the fly. Second, Gibbs sampling does not directly draw random samples from a multi-dimensional joint PDF. Instead, it iteratively samples a sequence of one-dimensional PDFs, and this one-dimensional sampling can be efficiently implemented with an inverse-transform method [] at low computational cost.

The proposed Gibbs sampling method is implemented in both Cartesian and spherical coordinate systems. The accuracy achieved by these two different implementations is application-dependent, as will be demonstrated by the experimental results in Chapter 5. In what follows, we describe the technical details of the Gibbs sampling method for Cartesian and spherical coordinate systems.

3.1 GIBBS SAMPLING IN THE CARTESIAN COORDINATE SYSTEM

To illustrate the Gibbs sampling algorithm in the Cartesian coordinate system, we first consider the simple two-dimensional example in Fig. 1. Starting from an initial point [x_1^{(1)} x_2^{(1)}]^T, Gibbs sampling samples the conditional PDF g^{OPT}[x_1 | x_2^{(1)}] and replaces x_1^{(1)} by a new value x_1^{(2)}, as shown in Fig. 1(a)-(b). Next, Gibbs sampling samples a different conditional PDF g^{OPT}[x_2 | x_1^{(2)}] and replaces x_2^{(1)} by a new value x_2^{(2)}, as shown in Fig. 1(c)-(d). Since the random variable x is two-dimensional, Gibbs sampling then varies x_1 again and draws a new value x_1^{(3)} by sampling the conditional PDF g^{OPT}[x_1 | x_2^{(2)}]. These iteration steps are repeated until a sufficient number of random samples are generated, and it can be proven that the iterations yield a sequence of samples that follow the distribution g^{OPT}(x_1, x_2) [16].

Fig. 1. A two-dimensional example of Gibbs sampling in the Cartesian coordinate system. (a) [x_1^{(1)} x_2^{(1)}]^T is the starting point. The intersection between the joint PDF g^{OPT}(x_1, x_2) and the plane x_2 = x_2^{(1)} defines the conditional PDF g^{OPT}[x_1 | x_2^{(1)}]. (b) A new point x_1^{(2)} is sampled from g^{OPT}[x_1 | x_2^{(1)}]. (c) The intersection of the joint PDF g^{OPT}(x_1, x_2) and the plane x_1 = x_1^{(2)} defines the conditional PDF g^{OPT}[x_2 | x_1^{(2)}]. (d) A new point x_2^{(2)} is sampled from g^{OPT}[x_2 | x_1^{(2)}].

Algorithm 1: Gibbs sampling in the Cartesian coordinate system.
1. Start from an M-dimensional PDF g^{OPT}(x_1, x_2, ..., x_M).
2. Select an initial starting point [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T.
3. For t = 1, 2, ...
4.   For m = 1, 2, ..., M
5.     Draw x_m^{(t+1)} from the conditional PDF g^{OPT}[x_m | x_1^{(t+1)}, ..., x_{m-1}^{(t+1)}, x_{m+1}^{(t)}, ..., x_M^{(t)}] to create a new sampling point [x_1^{(t+1)} ... x_m^{(t+1)} x_{m+1}^{(t)} ... x_M^{(t)}]^T.
6.   End For
7. End For
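A minimal Python sketch of the Algorithm 1 loop, under the same toy assumptions as the earlier snippets. The `draw_conditional()` helper is a placeholder: it samples the truncated one-dimensional conditional by naive rejection, whereas the report realizes this step with binary search and the inverse-transform method of Chapter 4.1.

```python
import numpy as np

rng = np.random.default_rng(2)

def fails(x):
    # Hypothetical stand-in for a transistor-level simulation.
    return x.sum() > 4.0

def draw_conditional(x, m):
    """Placeholder for sampling g_OPT(x_m | x_\\m): propose standard-Normal values for
    coordinate m and keep the first one that keeps x inside the failure region."""
    trial = x.copy()
    while True:
        trial[m] = rng.standard_normal()
        if fails(trial):
            return trial[m]

def gibbs_cartesian(x0, num_sweeps):
    """Algorithm 1: cycle through the coordinates, updating one per step."""
    x = np.array(x0, dtype=float)
    samples = []
    for _ in range(num_sweeps):             # outer loop over t
        for m in range(len(x)):             # inner loop over m = 1, ..., M
            x[m] = draw_conditional(x, m)
            samples.append(x.copy())        # each update yields a new sampling point
    return np.array(samples)

# The starting point must lie inside the failure region (see Chapter 4.2).
print(gibbs_cartesian(x0=[2.5, 1.0, 0.8, 0.2, 0.1, 0.1], num_sweeps=50).mean(axis=0))
```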

The aforementioned two-dimensional Gibbs sampling can be extended to M dimensions. Starting from an initial point [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T, Gibbs sampling assigns a new value to one of the random variables at each iteration step by randomly sampling the conditional PDF:

  g^{OPT}(x_m | x_{\m}) = g^{OPT}(x_m | x_1, ..., x_{m-1}, x_{m+1}, ..., x_M)   (9)

where x_{\m} denotes the vector x with x_m removed. Algorithm 1 summarizes the major steps of Gibbs sampling.

3.2 GIBBS SAMPLING IN THE SPHERICAL COORDINATE SYSTEM

The Gibbs sampling method summarized by Algorithm 1 can be extended to spherical coordinate systems. To do so, we need to transform the M-dimensional Normal PDF from the Cartesian coordinate system to a spherical coordinate system. Traditionally, a spatial location in an M-dimensional spherical coordinate system can be expressed as:

  x_1 = r cos θ_1
  x_2 = r sin θ_1 cos θ_2
  x_3 = r sin θ_1 sin θ_2 cos θ_3
  ...
  x_{M-1} = r sin θ_1 sin θ_2 ... sin θ_{M-2} cos θ_{M-1}
  x_M = r sin θ_1 sin θ_2 ... sin θ_{M-2} sin θ_{M-1}.   (10)

Eq. (10) shows the mapping from the spherical coordinates r and θ = [θ_1 θ_2 ... θ_{M-1}]^T to the Cartesian coordinate x []. The variable r defines the radius (i.e., the distance to the origin) and the angle variables {θ_m; m = 1, 2, ..., M-1} define the orientation. Fig. 2(a) shows a three-dimensional example of the traditional spherical coordinate system.

Fig. 2. Two three-dimensional examples of spherical coordinate systems. (a) The traditional spherical coordinate system with radius r and angles θ. (b) The proposed spherical coordinate system with radius r and orientation vector α.

While the aforementioned transformation can easily be found for a low-dimensional x, it becomes computationally expensive as the dimensionality increases []. We therefore propose an alternative way to define the spherical coordinate system, in which the M-dimensional Normal PDF can be easily specified.

We write the Cartesian coordinate x as:

  x = r \cdot \frac{α}{\|α\|_2}   (11)

where α = [α_1 α_2 ... α_M]^T and ||·||_2 denotes the L2-norm of a vector. In Eq. (11), the variable r defines the radius, similar to Eq. (10). The other variables {α_m; m = 1, 2, ..., M} define the orientation of the vector x. Fig. 2(b) shows a three-dimensional example of the proposed spherical coordinate system.

There is an important difference between the traditional spherical coordinate system and the proposed one. While the traditional spherical coordinate system uses M variables (i.e., r and θ) to define a spatial location in the M-dimensional space, the proposed spherical coordinate system requires M+1 variables (i.e., r and α) to define the same spatial location. The M+1 variables used by our proposed spherical coordinate system are therefore redundant; in other words, for a given Cartesian coordinate x, the mapping from x to r and α is not unique. Such redundancy, however, allows us to easily find a joint PDF of r and α that makes the variables {x_m; m = 1, 2, ..., M} in Eq. (11) mutually independent and standard Normal. In what follows, we derive the joint probability distribution of r and α in detail.

First, the variable r, which represents the radius, is equal to:

  r = \|x\|_2 = \sqrt{x_1^2 + x_2^2 + ... + x_M^2}.   (12)

Since the random variables {x_m; m = 1, 2, ..., M} are mutually independent and standard Normal, it is easy to verify that the random variable r follows the Chi distribution with M degrees of freedom []:

  f(r) = \frac{r^{M-1} \exp(-0.5 r^2)}{2^{0.5M-1} \Gamma(0.5M)}   (13)

where Γ(·) denotes the Gamma function.

Next, we consider the vector α that defines the orientation. Studying Eq. (11), we make two important observations. On the one hand, the orientation of the vector x is uniquely determined by α/||α||_2; it is independent of the length of the vector α (i.e., ||α||_2). On the other hand, the orientation α/||α||_2 must be uniformly distributed, based on the fact that the random variables {x_m; m = 1, 2, ..., M} in Eq. (11) are mutually independent and standard Normal. That is, the unit-length vector α/||α||_2 must take any possible orientation with equal probability. We therefore define the following PDF for α:

  f(α) = \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{α_m^2}{2}\right)   (14)

where {α_m; m = 1, 2, ..., M} are mutually independent and standard Normal. It is well known that the multivariate Normal distribution in Eq. (14) is spherically symmetric [].
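A small numerical sketch of the proposed coordinate system, Eqs. (11)-(14): draw r from a Chi distribution with M degrees of freedom and α from independent standard Normals, form x = r·α/||α||_2, and check empirically that x behaves as M independent standard Normal variables (the property formalized by Theorem 1 below). The check itself is illustrative and not part of the report.

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 6, 200_000

r = np.sqrt(rng.chisquare(df=M, size=N))        # r ~ Chi(M): square root of a chi-square draw, Eq. (13)
alpha = rng.standard_normal((N, M))             # alpha ~ independent standard Normal, Eq. (14)
unit = alpha / np.linalg.norm(alpha, axis=1, keepdims=True)  # alpha / ||alpha||_2: uniform orientation
x = r[:, None] * unit                           # x = r * alpha / ||alpha||_2, Eq. (11)

# Empirical check of Theorem 1: mean ~ 0 and covariance ~ identity.
print(np.round(x.mean(axis=0), 2))
print(np.round(np.cov(x.T), 2))
```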

Therefore, sampling α based on Eq. (14) and then normalizing α by its L2-norm allows us to generate a uniformly distributed orientation in the M-dimensional space [17].

Finally, based on the fact that the radius and the orientation of the multivariate Normal distribution in Eq. (1) should be mutually independent, we combine f(r) in Eq. (13) and f(α) in Eq. (14). The joint PDF f(r, α) is equal to:

  f(r, α) = \frac{r^{M-1} \exp(-0.5 r^2)}{2^{0.5M-1} \Gamma(0.5M)} \cdot \prod_{m=1}^{M} \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{α_m^2}{2}\right).   (15)

Theorem 1 states that given the joint PDF f(r, α) in Eq. (15), the random variables {x_m; m = 1, 2, ..., M} defined in Eq. (11) are mutually independent and standard Normal. The detailed proof of Theorem 1 can be found in the Appendix.

Theorem 1: Given the random variables {x_m; m = 1, 2, ..., M} defined in Eq. (11), where r and α follow the joint PDF f(r, α) in Eq. (15), the joint PDF f(x) is the multivariate Normal distribution represented by Eq. (1).

Given r and α, importance sampling can be applied to estimate the failure probability in the proposed spherical coordinate system. In this case, the estimated failure rate in Eq. (7) can be re-written as:

  \tilde{P}_f^{IS} = \frac{1}{N} \sum_{n=1}^{N} I(r^{(n)}, α^{(n)}) \frac{f(r^{(n)}, α^{(n)})}{g(r^{(n)}, α^{(n)})}   (16)

where I(r, α) and g(r, α) represent the indicator function and the distorted PDF in the spherical coordinate system respectively, [r^{(n)} α^{(n)}]^T stands for the nth sampling point drawn from g(r, α), and N denotes the total number of samples. Similar to Eq. (8), the optimal PDF g(r, α) with maximal estimation accuracy is:

  g^{OPT}(r, α) = \frac{I(r, α) f(r, α)}{P_f}.   (17)

Given Eq. (16) and Eq. (17), Algorithm 2 is formulated to perform Gibbs sampling in the spherical coordinate system. Similar to Algorithm 1, Algorithm 2 samples a one-dimensional conditional PDF g^{OPT}(r | α) or g^{OPT}(α_m | r, α_{\m}) at each iteration step, where α_{\m} denotes the vector α with α_m removed.

As discussed at the beginning of this chapter, directly sampling the multi-dimensional joint PDF g^{OPT}(x) or g^{OPT}(r, α) can be extremely difficult. With Gibbs sampling, we only need to sample the one-dimensional conditional PDFs g^{OPT}(x_m | x_{\m}), g^{OPT}(r | α) or g^{OPT}(α_m | r, α_{\m}). As will be demonstrated in Chapter 4.1, such one-dimensional sampling can be efficiently implemented by using an inverse-transform method [], even if the PDF g^{OPT}(x) in Eq. (8) or g^{OPT}(r, α) in Eq. (17) is not explicitly known. In addition, a model-based optimization can be applied to determine a good starting point [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T or [r^{(1)} α_1^{(1)} α_2^{(1)} ... α_M^{(1)}]^T so that the Gibbs sampling algorithm converges quickly. All these implementation details will be discussed in Chapter 4.

Algorithm 2: Gibbs sampling in the spherical coordinate system.
1. Start from an (M+1)-dimensional PDF g^{OPT}(r, α_1, α_2, ..., α_M).
2. Select an initial starting point [r^{(1)} α_1^{(1)} α_2^{(1)} ... α_M^{(1)}]^T.
3. For t = 1, 2, ...
4.   Draw r^{(t+1)} from the conditional PDF g^{OPT}[r | α_1^{(t)}, ..., α_M^{(t)}] to create a new sampling point [r^{(t+1)} α_1^{(t)} ... α_M^{(t)}]^T.
5.   For m = 1, 2, ..., M
6.     Draw α_m^{(t+1)} from the conditional PDF g^{OPT}[α_m | r^{(t+1)}, α_1^{(t+1)}, ..., α_{m-1}^{(t+1)}, α_{m+1}^{(t)}, ..., α_M^{(t)}] to create a new sampling point [r^{(t+1)} α_1^{(t+1)} ... α_m^{(t+1)} α_{m+1}^{(t)} ... α_M^{(t)}]^T.
7.   End For
8. End For
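For comparison with the Cartesian sketch above, here is a sketch of the Algorithm 2 update schedule in the (r, α) coordinates, again under toy assumptions. The bounded rejection loops are placeholders for the inverse-transform procedure of Chapter 4.1; if no proposal lands in the failure region within the retry budget, the old value is kept so the sketch always terminates.

```python
import numpy as np

rng = np.random.default_rng(4)
M = 6

def fails(x):
    # Hypothetical stand-in for a transistor-level simulation.
    return x.sum() > 4.0

def to_cartesian(r, alpha):
    return r * alpha / np.linalg.norm(alpha)          # Eq. (11)

def gibbs_spherical(r0, alpha0, num_sweeps, max_tries=500):
    """Algorithm 2 sketch: update r first, then each alpha_m, one variable per step."""
    r, alpha = r0, np.array(alpha0, dtype=float)
    samples = []
    for _ in range(num_sweeps):
        for _ in range(max_tries):                    # draw r from g_OPT(r | alpha)
            cand = np.sqrt(rng.chisquare(df=M))       # proposal from the Chi(M) prior f(r)
            if fails(to_cartesian(cand, alpha)):
                r = cand
                break
        for m in range(M):                            # draw alpha_m from g_OPT(alpha_m | r, alpha_\m)
            for _ in range(max_tries):
                trial = alpha.copy()
                trial[m] = rng.standard_normal()      # proposal from the standard Normal f(alpha_m)
                if fails(to_cartesian(r, trial)):
                    alpha = trial
                    break
        samples.append(to_cartesian(r, alpha))
    return np.array(samples)

print(gibbs_spherical(r0=2.0, alpha0=np.ones(M), num_sweeps=30).mean(axis=0))
```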

CHAPTER 4 IMPLEMENTATION DETAILS

The proposed Gibbs sampling method is efficiently implemented with the following techniques: (1) one-dimensional inverse-transform sampling, (2) initial starting point selection, and (3) a two-stage Monte Carlo analysis flow.

4.1 ONE-DIMENSIONAL INVERSE-TRANSFORM SAMPLING

During each iteration step of Gibbs sampling, one random variable is sampled from a one-dimensional conditional PDF g^{OPT}(x_m | x_{\m}), g^{OPT}(r | α) or g^{OPT}(α_m | r, α_{\m}). Such one-dimensional sampling can be efficiently performed by an inverse-transform method []. To derive the one-dimensional sampling algorithm used in this paper, we first consider the scenario of sampling the random variable x_m from the conditional PDF g^{OPT}(x_m | x_{\m}). In this case, we write g^{OPT}(x_m | x_{\m}) as:

  g^{OPT}(x_m | x_{\m}) = \frac{g^{OPT}(x_m, x_{\m})}{g^{OPT}(x_{\m})}.   (18)

Substituting Eq. (8) into Eq. (18) yields:

  g^{OPT}(x_m | x_{\m}) = \frac{I(x_m, x_{\m}) f(x_m, x_{\m})}{P_f \, g^{OPT}(x_{\m})}.   (19)

Since the random variables {x_m; m = 1, 2, ..., M} are mutually independent, as shown in Eq. (1), Eq. (19) can be re-written as:

  g^{OPT}(x_m | x_{\m}) = I(x_m, x_{\m}) f(x_m) \frac{f(x_{\m})}{P_f \, g^{OPT}(x_{\m})}.   (20)

Studying Eq. (20), we make three important observations. First, g^{OPT}(x_m | x_{\m}) is linearly proportional to the probability distribution f(x_m). Second, g^{OPT}(x_m | x_{\m}) is non-zero if and only if the variable x_m, combined with x_{\m}, sits in the failure region. Third, the term f(x_{\m}) / [P_f g^{OPT}(x_{\m})] in Eq. (20) is a constant for any fixed value of x_{\m}. For these reasons, when sampling the one-dimensional conditional PDF g^{OPT}(x_m | x_{\m}), we should draw random samples from the failure region based on the PDF f(x_m).

Fig. 3. One-dimensional inverse-transform sampling. (a) A single continuous failure region Ω_m (gray area) under the PDF f(x_m). (b) A random value of x_m is sampled by the inverse-transform method based on the CDF F(x_m): a uniform sample s_m over [F(u_m), F(v_m)] is mapped back through F^{-1}.

We consider the one-dimensional failure region shown in Fig. 3(a). The symbol Ω_m represents the failure region over the variable x_m while all other random variables are set to given values. Similar to the previous work [8]-[], [14], we assume that there is only a single continuous failure region, which can be represented as a one-dimensional interval [u_m, v_m]. Based on the assumption that Ω_m is continuous, we can find the left boundary u_m and the right boundary v_m by binary search over the variable x_m.

Once u_m and v_m are known, we need to draw a random sample from the interval [u_m, v_m]. It is important to note that the PDF g^{OPT}(x_m | x_{\m}) in Eq. (20) represents a truncated Normal distribution, defined by f(x_m) over the interval [u_m, v_m], and it can be sampled with the inverse-transform method. To apply the inverse-transform method to sample g^{OPT}(x_m | x_{\m}), we first calculate the cumulative distribution function (CDF) F(x_m), where x_m follows a standard Normal distribution. Next, we sample a new random variable s_m that is uniformly distributed over the interval [F(u_m), F(v_m)], as shown in Fig. 3(b). Then, we map the sampling point s_m back to x_m based on the inverse CDF:

  x_m = F^{-1}(s_m)   (21)

where F^{-1}(s_m) is the inverse function of F(x_m). It can be proven that the sampling point x_m generated by the inverse-transform method follows the statistical distribution g^{OPT}(x_m | x_{\m}) in Eq. (20) [].
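The sketch below shows the inverse-transform step of Eq. (21) for the Cartesian case: given a failure interval [u_m, v_m], a sample of x_m is drawn from the standard Normal truncated to that interval. The interval endpoints here are arbitrary illustrative values; in the report they are obtained by binary search over transistor-level simulations.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)

def sample_truncated_normal(u, v):
    """Inverse-transform sampling of a standard Normal restricted to [u, v], per Eq. (21)."""
    s = rng.uniform(norm.cdf(u), norm.cdf(v))   # s ~ Uniform[F(u), F(v)]
    return norm.ppf(s)                          # x_m = F^{-1}(s)

# Example: a failure interval found (hypothetically) by binary search.
u_m, v_m = 2.1, 3.4
draws = np.array([sample_truncated_normal(u_m, v_m) for _ in range(5)])
print(draws)   # all samples fall inside [u_m, v_m]
```

The same pattern applies to r and α_m in Eqs. (24)-(25), with the Chi CDF (e.g., scipy.stats.chi(df=M)) replacing the Normal CDF for the radius r.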

The aforementioned inverse-transform method can similarly be applied to sample the other two one-dimensional conditional PDFs, g^{OPT}(r | α) and g^{OPT}(α_m | r, α_{\m}). Similar to Eq. (20), it is easy to verify the following two equations:

  g^{OPT}(r | α) = I(r, α) f(r) \frac{f(α)}{P_f \, g^{OPT}(α)}   (22)

  g^{OPT}(α_m | r, α_{\m}) = I(r, α) f(α_m) \frac{f(r) f(α_{\m})}{P_f \, g^{OPT}(r, α_{\m})}.   (23)

In Eq. (22) and Eq. (23), f(r) is a Chi distribution with M degrees of freedom and f(α_m) is a standard Normal distribution. To sample g^{OPT}(r | α) and g^{OPT}(α_m | r, α_{\m}), we first apply binary search to find the one-dimensional failure regions [u_r, v_r] for the variable r and [u_{αm}, v_{αm}] for the variable α_m. Next, we sample new random variables s_r and s_{αm} from the uniform distributions over [F(u_r), F(v_r)] and [F(u_{αm}), F(v_{αm})] respectively. Finally, we map s_r and s_{αm} back to r and α_m based on the inverse CDFs:

  r = F^{-1}(s_r)   (24)
  α_m = F^{-1}(s_{αm})   (25)

where F^{-1}(s_r) and F^{-1}(s_{αm}) are the inverse functions of the CDFs F(r) and F(α_m) respectively.

Algorithm 3 summarizes the major steps of the inverse-transform method. It is important to emphasize that the computational cost of Algorithm 3 is dominated by Step 2, where multiple transistor-level simulations are required to find the left and right boundaries of the one-dimensional failure region. All other steps do not involve transistor-level simulations and can be computed at low cost.

Algorithm 3: One-dimensional inverse-transform sampling.
1. Start from the joint PDFs f(x) in Eq. (1) and f(r, α) in Eq. (15).
2. Apply binary search to find the left boundary (i.e., u_m, u_r or u_{αm}) and the right boundary (i.e., v_m, v_r or v_{αm}) of the one-dimensional failure region of the random variable to be sampled (i.e., x_m, r or α_m).
3. Draw one random sample (i.e., s_m, s_r or s_{αm}) from the uniform distribution over the mapped failure interval (i.e., [F(u_m), F(v_m)], [F(u_r), F(v_r)] or [F(u_{αm}), F(v_{αm})]).
4. Map the random sample s_m, s_r or s_{αm} to x_m, r or α_m based on Eq. (21), Eq. (24) or Eq. (25), where F^{-1}(s_m) and F^{-1}(s_{αm}) are the inverse CDFs of a standard Normal distribution and F^{-1}(s_r) is the inverse CDF of a Chi distribution with M degrees of freedom.

4.2 INITIAL STARTING POINT SELECTION

To efficiently apply the proposed Gibbs sampling method, a good initial starting point is needed to speed up the convergence of Algorithm 1 and/or Algorithm 2. During Gibbs sampling, since each sample is drawn from a one-dimensional conditional PDF, the current sampling point is correlated with the previous sampling point. In other words, the random samples generated by Gibbs sampling depend on the initial starting point. It has been proven by the statistics community that, starting from an initial point, the samples generated at the beginning of Gibbs sampling may not follow the given probability distribution (i.e., g^{OPT}(x) for Algorithm 1 and g^{OPT}(r, α) for Algorithm 2) []. These samples constitute the warm-up

interval of Gibbs sampling and should be discarded. In practice, the length of the warm-up interval should be minimized; otherwise, generating a large number of random samples inside the warm-up interval wastes a large amount of computational time. In this paper, we propose to shorten the warm-up interval by appropriately selecting a good initial starting point.

In what follows, we first discuss the proposed initial point selection scheme for Cartesian coordinate systems. Recall that our objective is to draw random samples from the failure region that is most likely to occur, as shown in Eq. (8). Hence, the initial starting point [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T should meet two criteria: (1) it is inside the failure region, and (2) the PDF f(x) takes a large value at it. These two requirements can be mathematically translated into the following optimization problem:

  maximize  f([x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T)
  s.t.      [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T ∈ Ω.   (26)

Since the M-dimensional random variable x is modeled as a multivariate Normal distribution in Eq. (1), the optimization in Eq. (26) is equivalent to:

  minimize  \|[x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T\|_2
  s.t.      [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T ∈ Ω.   (27)

Eq. (27) aims to find the failure point that is closest to the origin x = 0. It is similar to the norm minimization approach proposed in []. Note that solving the optimization problem in Eq. (27) is not trivial, since the failure region Ω is not explicitly known. To address this issue, we propose to approximate the performance of interest by a linear or quadratic model of the M-dimensional random variable x. Once the model is available, the optimization in Eq. (27) can be solved by either quadratic programming (for a linear performance model) or semi-definite programming (for a quadratic performance model) [8]. The detailed algorithms for performance modeling and optimization can be found in [8]. Note that our goal is not to solve Eq. (27) exactly; we only need to find an approximate solution that Gibbs sampling can use to further explore the variation space with high failure probability. It is important to mention that even though the linear or quadratic performance models may not be highly accurate, we can still obtain a good starting point.
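As an illustration of this model-based starting point selection, the sketch below fits a linear model of a hypothetical performance metric from a handful of samples and takes the minimum-norm point on the modeled failure boundary, which for a linear model w·x + b with failure when the metric drops below `spec` has the closed form x* = (spec − b)·w/||w||². The `simulate()` function, the model, and the specification are illustrative assumptions, not the report's performance models or QP/SDP solvers.

```python
import numpy as np

rng = np.random.default_rng(6)
M = 6

def simulate(x):
    """Hypothetical 'performance' in place of a transistor-level simulation."""
    true_w = np.array([0.9, -0.4, 0.7, 0.1, -0.2, 0.3])
    return true_w @ x + 5.0

# Fit a linear performance model p(x) ~ w.x + b by least squares from a few samples.
X = rng.standard_normal((50, M))
p = np.array([simulate(x) for x in X])
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, p, rcond=None)
w, b = coef[:M], coef[M]

# Minimum-norm point on the modeled failure boundary w.x + b = spec (failure: p <= spec),
# i.e., an approximate solution of Eq. (27) used as the Gibbs starting point.
spec = 2.0
x_start = (spec - b) * w / (w @ w)
print(np.round(x_start, 3), simulate(x_start))
```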

We can further extend the aforementioned initial point selection method to spherical coordinate systems. In this case, once the Cartesian coordinate [x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T is determined for the initial starting point, we need to map it to the spherical coordinate [r^{(1)} α_1^{(1)} α_2^{(1)} ... α_M^{(1)}]^T. As shown in Eq. (12), the value of r^{(1)} can easily be calculated as:

  r^{(1)} = \|[x_1^{(1)} x_2^{(1)} ... x_M^{(1)}]^T\|_2.   (28)

To calculate {α_m^{(1)}; m = 1, 2, ..., M}, Eq. (11) can be re-written as:

  α^{(1)} = \frac{\|α^{(1)}\|_2}{r^{(1)}} \cdot x^{(1)}.   (29)

As discussed in Chapter 3.2, the vector α^{(1)} only determines the orientation of x^{(1)}, and the length of α^{(1)} can be arbitrary. For our Gibbs sampling application, since we aim to locate a failure point [r^{(1)} α_1^{(1)} α_2^{(1)} ... α_M^{(1)}]^T that is most likely to occur, we should determine {α_m^{(1)}; m = 1, 2, ..., M} by maximizing the PDF f(α) in Eq. (14). Namely, even though there are multiple possible solutions of α^{(1)} associated with the same Cartesian coordinate x^{(1)}, we are interested in the solution that is most likely to occur (i.e., the maximum-likelihood solution). Based on this maximum-likelihood criterion and the fact that α^{(1)} follows the multivariate Normal distribution defined in Eq. (14), the length of α^{(1)} should be sufficiently small so that the PDF value f[α^{(1)}] is large. Let ||α^{(1)}||_2 = ε, where ε → 0, and Eq. (29) becomes:

  α^{(1)} = \frac{ε}{r^{(1)}} \cdot x^{(1)}.   (30)

In practice, we find that a small value of ε is a good choice based on our numerical experiments. Once ε is set and {x_m^{(1)}; m = 1, 2, ..., M} and r^{(1)} are found by Eq. (27) and Eq. (28), {α_m^{(1)}; m = 1, 2, ..., M} are uniquely determined by Eq. (30).

4.3 TWO-STAGE MONTE CARLO FLOW

As shown in both Algorithm 1 and Algorithm 2, multiple transistor-level simulations are required to perform the binary search needed to generate a single Gibbs sample. To make the proposed Gibbs sampling algorithm more efficient, one additional implementation strategy should be applied: once a set of Gibbs samples has been created, it is desirable to learn the optimal PDF from these samples so that additional random sampling points can be drawn directly from the learned PDF without running binary search. Such a strategy further reduces the computational cost and improves the prediction accuracy. Towards this goal, we adopt a two-stage Monte Carlo flow consisting of two sequential steps. First, Gibbs sampling is applied to generate K random samples inside the failure region. Next, at the second stage, we approximate the optimal PDF g^{OPT}(x) by a multivariate Normal distribution g_{NOR}(x).

The mean value and the covariance matrix of g_{NOR}(x) are calculated from the K Gibbs samples obtained in the first stage. Once g_{NOR}(x) is known, we directly sample it to generate N random samples and estimate the failure rate from these N samples:

  \tilde{P}_f^{IS} = \frac{1}{N} \sum_{n=1}^{N} I(x^{(n)}) \frac{f(x^{(n)})}{g_{NOR}(x^{(n)})}.   (31)

Algorithm 4 summarizes the major steps of the proposed two-stage Monte Carlo flow.

Algorithm 4: Two-stage Monte Carlo flow.
1. Start from a joint PDF f(x), a given value of K (i.e., the number of Gibbs samples for the first stage), and a given value of N (i.e., the number of random samples for the second stage).
2. Apply Algorithm 1 (for Cartesian coordinate systems) or Algorithm 2 (for spherical coordinate systems), together with Algorithm 3 (for both coordinate systems), to generate K Gibbs sampling points.
3. If the Gibbs samples are generated in a spherical coordinate system, map them to Cartesian coordinates by using Eq. (11).
4. Calculate the mean value and the covariance matrix of these K Gibbs samples in the Cartesian coordinate system, and determine a multivariate Normal distribution g_{NOR}(x) that approximates the optimal PDF for importance sampling.
5. Generate N random samples from the multivariate Normal distribution g_{NOR}(x).
6. Calculate the failure rate from these N samples by using Eq. (31).

It should be noted that several traditional importance sampling techniques also draw random samples from a multivariate Normal distribution [8], [14]. However, unlike the traditional techniques, we apply Gibbs sampling to optimally determine both the mean value and the covariance matrix of g_{NOR}(x). Hence, the second-stage random sampling converges quickly. Compared to other traditional techniques, Algorithm 4 provides 1.6x~3.9x speed-up without surrendering any accuracy, as will be demonstrated by the experimental results in Chapter 5. In addition, Chapter 5 shows a practical example in which the proposed Gibbs sampling algorithm accurately estimates the correct failure probability while the traditional techniques fail to work.
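A compact sketch of the Algorithm 4 idea under the same toy assumptions as the earlier snippets: the first-stage failure-region samples (produced here by filtering a crude proposal, in place of the Gibbs samples) are used to fit the mean and covariance of g_NOR(x), and the second stage re-weights samples drawn from g_NOR(x) according to Eq. (31).

```python
import numpy as np

rng = np.random.default_rng(7)
M = 6

def fails(x):
    # Hypothetical stand-in for a transistor-level simulation.
    return x.sum() > 4.0

def log_pdf(x, mean, cov_inv, log_det):
    d = x - mean
    return -0.5 * (d @ cov_inv @ d) - 0.5 * (M * np.log(2 * np.pi) + log_det)

def two_stage(first_stage_samples, N):
    # Stage 2a: fit g_NOR(x) to the first-stage (Gibbs) samples.
    mu = first_stage_samples.mean(axis=0)
    cov = np.cov(first_stage_samples.T)
    cov_inv = np.linalg.inv(cov)
    _, log_det = np.linalg.slogdet(cov)
    # Stage 2b: importance sampling from g_NOR(x), Eq. (31).
    samples = rng.multivariate_normal(mu, cov, size=N)
    est = 0.0
    for x in samples:
        if fails(x):
            log_w = log_pdf(x, np.zeros(M), np.eye(M), 0.0) - log_pdf(x, mu, cov_inv, log_det)
            est += np.exp(log_w)
    return est / N

# First-stage samples: for illustration, failure points filtered from a crude shifted proposal.
crude = rng.multivariate_normal(np.full(M, 0.7), np.eye(M), size=5000)
first_stage = crude[[fails(x) for x in crude]][:200]
print(two_stage(first_stage, N=20_000))
```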

CHAPTER 5 NUMERICAL EXAMPLES

Fig. 4. A 6-T SRAM cell (the test case used to demonstrate the efficacy of the proposed Gibbs sampling method).

Fig. 4 shows the circuit of a 6-T SRAM cell designed in a commercial 90nm process. In this chapter, the SRAM cell is used to demonstrate the efficiency of the proposed Gibbs sampling method. For testing and comparison purposes, four different importance sampling methods are used: (1) mixture importance sampling (MIS) [8], (2) minimum-norm importance sampling (MNIS) [14], (3) the proposed Gibbs sampling implemented for Cartesian coordinate systems (G-C), and (4) the proposed Gibbs sampling implemented for spherical coordinate systems (G-S). All four methods employ a two-stage analysis flow: a multivariate Normal distribution g_{NOR}(x) is first constructed from K random samples during the first stage, and then N random samples are drawn from g_{NOR}(x) to calculate the failure probability at the second stage. In our experiments, all four methods use the same number of first-stage samples K to determine g_{NOR}(x). Our numerical experiments are run on a 2.53 GHz Linux server with 6 GB of memory.

5.1 NOISE MARGIN

In this sub-chapter, two performance metrics, read noise margin (RNM) and write noise margin (WNM), are used to assess the stability of the SRAM cell. Both RNM and WNM must be greater than zero to ensure a stable SRAM cell. The four importance sampling methods are applied to estimate the failure rates associated with RNM and WNM.

Fig. 5. A simulation method to calculate RNM and WNM. (a) Noise margin estimation. The characteristics of the normal and the mirrored inverter are defined by the functions y = F1(x) and y = F2(x), where the latter is the mirrored version of y = F1(x). (b) and (c) Circuit diagrams to calculate v for different values of u.

In this paper, only the static noise sources are taken into account. We use a simulation method, as shown in Fig. 5, to calculate RNM and WNM. The method draws and mirrors the two inverters' characteristics and then finds the maximal possible square between the two curves [8]. Fig. 5(a) shows two coordinate systems that are rotated by 45 degrees relative to each other. In the (u, v) system, the value of v2 − v1 at a given u yields curve A, from which RNM and WNM are estimated. To find F1(x) and F2(x) in terms of u and v, we transform the (x, y) coordinates into the (u, v) system based on the circuits shown in Fig. 5(b) and Fig. 5(c). The maximal v2 − v1 value gives a measure of RNM and WNM. The only difference between calculating RNM and WNM with this method is that the operating states of the inverters F1 and F2 are different: the node voltages of F1 and F2 are set according to the reading or writing state of the SRAM cell.

Fig. 6. Random samples generated by MIS at the second stage, plotted over the VTH mismatches of the critical transistors and labeled as Pass or Fail. (a) RNM. (b) WNM.

Fig. 7. Random samples generated by MNIS at the second stage, plotted in the same way. (a) RNM. (b) WNM.

Fig. 8. Random samples generated by G-C at the second stage, plotted over the VTH mismatches of the critical transistors and labeled as Pass or Fail. (a) RNM. (b) WNM.

Fig. 9. Random samples generated by G-S at the second stage, plotted in the same way. (a) RNM. (b) WNM.

To compare the four importance sampling algorithms, Fig. 6-Fig. 9 plot the random samples that are generated at the second stage. For illustration purposes, these figures only show the VTH mismatches of the two transistors that are most critical to the performance of interest (i.e., VTH1 and VTH3 for RNM, and VTH3 and VTH5 for WNM). Each random sample is labeled as Pass or Fail, indicating whether the performance of interest passes or fails the specification.

There are two important observations from Fig. 6-Fig. 9. First, both MIS and MNIS cannot accurately capture the optimal PDF g^{OPT}(x). These two traditional methods only identify the mean value of g^{OPT}(x) while ignoring the covariance matrix. As a result, a large number of the random samples generated at the second stage do not fall into the failure region. On the other hand, the proposed Gibbs sampling algorithms (i.e., G-C and G-S) are able to accurately capture both the mean value and the covariance matrix of g^{OPT}(x). For this reason, most sampling points generated by G-C and G-S are inside the failure region, and the accuracy of failure rate prediction is thereby improved.

Fig. 10. Estimated failure probability, plotted as a function of the number of transistor-level simulations at the second stage, for MIS, MNIS, G-C and G-S. (a) RNM. (b) WNM.

Fig. 10 shows the estimated failure probability as a function of the number of transistor-level simulations at the second stage. All four importance sampling methods yield the same failure probability if the number of random samples is sufficiently large. However, the proposed Gibbs sampling methods (i.e., G-C and G-S) are more accurate than the two traditional methods (i.e., MIS and MNIS) for the same number of random samples.

Fig. 11. Relative error of failure rate prediction (defined by the 95% confidence interval), plotted as a function of the number of transistor-level simulations at the second stage. (a) RNM. (b) WNM.

To quantitatively assess the accuracy of the different importance sampling methods, Fig. 11 plots the relative prediction error as a function of the number of simulations. Here, the relative error is defined as the ratio of the 95% confidence interval over the estimated failure probability.

TABLE I
NUMBER OF REQUIRED SIMULATIONS DURING BOTH FIRST AND SECOND STAGES TO ACHIEVE 5% ERROR DEFINED BY THE 95% CONFIDENCE INTERVAL

  Method            RNM    WNM
  MIS [8]
  MNIS [4]           9      7
  G-C (Proposed)    69      6
  G-S (Proposed)     8     93

Table I further compares the number of required simulations during both the first and second stages to achieve 5% error. Note that the proposed Gibbs sampling methods require far fewer simulations in total than MIS and MNIS to achieve the same accuracy; in other words, G-C and G-S achieve 1.6x to 3.9x runtime speed-up over MIS and MNIS in this example.

An expensive transistor-level simulation is required to evaluate each sampling point. Table II lists the equivalent transistor-level simulation runtime for the RNM failure rate estimation. The numerical experiments were run on a 2.53 GHz Linux server with 6 GB of memory, where one transistor-level simulation of the SRAM cell circuit takes on the order of seconds. For the standard Monte Carlo analysis, the runtime is impractical because over 10^8 samples are needed to achieve an error below 5%. Compared with standard Monte Carlo, the four importance sampling methods achieve a gain of several orders of magnitude in runtime. Moreover, G-C and G-S perform even better than the MIS and MNIS methods.

TABLE II
RUNTIME IN DAYS FOR RNM FAILURE RATE ESTIMATION TO ACHIEVE 5% ERROR DEFINED BY THE 95% CONFIDENCE INTERVAL

  Method            Runtime (days)
  Monte Carlo       776
  MIS [8]           .875
  MNIS [4]          .493
  G-C (Proposed)    .799
  G-S (Proposed)    .938

Finally, it is worth mentioning that even though G-C and G-S yield similar accuracy in this example,

they can lead to substantially different results in a number of other cases. One such example is discussed in detail in the next sub-chapter.

5.2 READ CURRENT

In this sub-chapter, we consider the read current of the SRAM cell as the performance of interest. Given the circuit schematic shown in Fig. 4, the read current is measured as the drain current of the transistor M3 when the word line and both bit lines are connected to the supply voltage VDD. If the read current is no less than a pre-defined threshold, the performance of the SRAM cell is considered as Pass. We apply the four importance sampling methods to estimate the failure rate associated with the read current, and only the local VTH mismatches of the transistors M1 and M3 are considered, for the following reasons. First, the read current variation is dominated by the local VTH mismatches of these two transistors. Second, considering VTH1 and VTH3 only facilitates a comprehensive comparison between the different importance sampling algorithms.

Fig. 12. Estimated failure probability of the read current, plotted as a function of the number of transistor-level simulations at the second stage, for MIS, MNIS, G-C and G-S.

Fig. 12 shows the estimated failure probability as a function of the number of transistor-level simulations at the second stage. Unlike the example in the previous sub-chapter, where all four importance sampling methods eventually converge to the same failure probability, the proposed Gibbs sampling for spherical coordinate systems (i.e., G-S) results in a failure rate that is different from the other three methods (i.e., MIS, MNIS and G-C). In order to verify the accuracy of these four importance sampling methods, we further implement a brute-force Monte Carlo analysis engine in which seven million random samples are directly drawn from the PDF of process variations, i.e., f(x) in Eq. (1). The failure probability calculated by the brute-force Monte Carlo engine is used as the golden result to evaluate the accuracy of all

importance sampling methods in this example.

TABLE III
FAILURE PROBABILITY OF THE READ CURRENT ESTIMATED BY DIFFERENT IMPORTANCE SAMPLING METHODS (columns: # of Simulations, Failure Rate, Relative Error; rows: MIS [8], MNIS [4], G-C (Proposed), G-S (Proposed), Brute-force MC)

Table III summarizes the failure probability estimated by the four importance sampling methods and the brute-force Monte Carlo analysis engine. Based on Table III, we notice that G-S is more accurate than the other three importance sampling methods, because the failure rate estimated by G-S is almost identical to that estimated by the brute-force Monte Carlo engine. In addition, in this example it is important to note that the error associated with MIS, MNIS and G-C cannot be efficiently reduced by increasing the number of samples at the second stage. In other words, while G-S accurately estimates the correct failure probability, MIS, MNIS and G-C all fail to work in this example.

To fully understand the limitations of MIS, MNIS and G-C, we uniformly sample the variation space to identify the failure region for the read current. Since only the local VTH mismatches of two transistors are considered in this example, the two-dimensional failure region can easily be found by the aforementioned uniform sampling, as shown in Fig. 13. Several important observations can be obtained by studying Fig. 13. First, the proposed G-S method is able to generate random samples that fully cover the high-probability failure region. Second, the other three importance sampling methods cannot appropriately sample the high-probability failure region; they only draw samples from a small portion of it, thereby underestimating the failure probability.

Fig. 13. Two-dimensional failure region for the read current, plotted in the (VTH1, VTH3) plane. Each green square represents a failure point that is randomly sampled from a two-dimensional uniform distribution. Each black cross denotes a failure point that is generated by the importance sampling methods at the second stage: (a) MIS [8], (b) MNIS [4], (c) G-C, and (d) G-S.

In this example, MIS, MNIS and G-C all fail to work because the high-probability failure region has an irregular and non-convex shape. When MIS and MNIS are applied, they only shift the mean value of the multivariate Normal distribution f(x) in Eq. (1) to construct a new Normal distribution g_{NOR}(x). Since the covariance matrix is not optimally estimated, both MIS and MNIS cannot correctly capture the failure region of interest. The proposed G-C method samples a one-dimensional conditional PDF in the Cartesian coordinate system at each iteration step. Since the high-probability failure region is non-convex, G-C is not able to fully explore it with a limited number of Gibbs samples at the first stage. As a result, the random samples generated by the second stage of G-C cannot correctly cover the failure region of interest. This example demonstrates the important fact that even though both G-C and G-S rely on Gibbs sampling, they can lead to substantially different results due to their different implementations.

5.3 CONSIDERATIONS FOR OTHER APPLICATIONS

This work has focused on predicting the failure rate of SRAM cell circuits. Although the proposed method can be applied to other circuits, two constraints must be satisfied for the Gibbs sampling method to be applied efficiently. First, the failure region of the circuit needs to be continuous, meaning that there are no disconnected failure regions. If the circuit has multiple disjoint failure regions, the proposed algorithm will only generate random samples in one of them, and the estimated result would not be accurate. Second, the number of random variables of the circuit should not be too large. For a small or medium number of variables, the method yields a significant gain in speed and accuracy; for a much larger number of variables, however, the method would require a very large number of random samples, and the efficiency of the algorithm would diminish. For other circuit applications, if the above two constraints are satisfied, the proposed Gibbs sampling algorithm can dynamically explore the region of interest and achieve high accuracy with low computational cost; otherwise, the cost of the algorithm would be high and the estimated result might not be accurate enough.

CHAPTER 6 CONCLUSIONS

Statistical analysis of SRAM cells is a challenging problem because the failure rate of SRAM cells is extremely low. In this project, we develop an efficient importance sampling algorithm to estimate the failure rate of SRAM cells. In particular, we adapt the Gibbs sampling technique from the statistics community to find the optimal probability distribution of process variations with low computational cost. More formally, the proposed Gibbs sampling method adaptively explores the failure region in a Cartesian or spherical coordinate system by sampling a sequence of one-dimensional probability distributions. Several implementation issues, such as one-dimensional random sampling and starting point selection, are carefully studied to make the Gibbs sampling method more efficient and accurate. The experimental results for a commercial 90nm SRAM cell demonstrate that the proposed Gibbs sampling method achieves 1.6x to 3.9x runtime speed-up over other state-of-the-art techniques without surrendering any accuracy. We further demonstrate that in certain applications the proposed Gibbs sampling algorithm accurately estimates the correct failure probability while the traditional techniques fail to work. The Gibbs sampling technique can be incorporated into a statistical optimization environment for accurate and efficient parametric yield optimization of SRAM cell circuits.

APPENDIX

Proof of Theorem 1: It has been shown in [17] that if the random variables {α_m; m = 1, 2, ..., M} are mutually independent and standard Normal, the variables {α_m / ||α||_2; m = 1, 2, ..., M} are uniformly distributed on the surface of a sphere with unit radius. Hence, for a given value of r, the random variables {x_m; m = 1, 2, ..., M} defined in Eq. (11) are uniformly distributed on the surface of the sphere ||x||_2 = r. In other words, for all points on the surface of the sphere ||x||_2 = r, the PDF values are identical.

In order to calculate the joint PDF of x, we consider the sphere ||x||_2 = r and add a small perturbation Δr to the radius, as shown by the three-dimensional example in Fig. 14. If the perturbation Δr is sufficiently small (i.e., Δr → 0), the values of f(x) at all points within the region r ≤ ||x||_2 ≤ r + Δr are identical.

Fig. 14. A perturbation Δr, where Δr → 0, is applied to the radius of the three-dimensional sphere ||x||_2 = r.

As shown in Eq. (13), the radius r follows the Chi distribution with M degrees of freedom. The CDF of r is represented as []:

  F(r) = \int_{0}^{r} \frac{t^{M-1} e^{-0.5 t^2}}{2^{0.5M-1} \Gamma(0.5M)} \, dt.   (32)

Hence, the probability of r ≤ ||x||_2 ≤ r + Δr, where Δr → 0, is equal to:

  F(r + Δr) - F(r) ≈ \frac{dF(r)}{dr} \cdot Δr = \frac{r^{M-1} e^{-0.5 r^2}}{2^{0.5M-1} \Gamma(0.5M)} \cdot Δr.   (33)

On the other hand, the volume of an M-dimensional sphere with radius r can be written as []:

  V(r) = \frac{\pi^{0.5M}}{\Gamma(0.5M + 1)} r^{M}.   (34)


More information

A Robustness Optimization of SRAM Dynamic Stability by Sensitivity-based Reachability Analysis

A Robustness Optimization of SRAM Dynamic Stability by Sensitivity-based Reachability Analysis ASP-DAC 2014 A Robustness Optimization of SRAM Dynamic Stability by Sensitivity-based Reachability Analysis Yang Song, Sai Manoj P. D. and Hao Yu School of Electrical and Electronic Engineering, Nanyang

More information

Simulation. Where real stuff starts

Simulation. Where real stuff starts 1 Simulation Where real stuff starts ToC 1. What is a simulation? 2. Accuracy of output 3. Random Number Generators 4. How to sample 5. Monte Carlo 6. Bootstrap 2 1. What is a simulation? 3 What is a simulation?

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

Lecture 3: Pattern Classification. Pattern classification

Lecture 3: Pattern Classification. Pattern classification EE E68: Speech & Audio Processing & Recognition Lecture 3: Pattern Classification 3 4 5 The problem of classification Linear and nonlinear classifiers Probabilistic classification Gaussians, mitures and

More information

Linear Discriminant Functions

Linear Discriminant Functions Linear Discriminant Functions Linear discriminant functions and decision surfaces Definition It is a function that is a linear combination of the components of g() = t + 0 () here is the eight vector and

More information

Class 26: review for final exam 18.05, Spring 2014

Class 26: review for final exam 18.05, Spring 2014 Probability Class 26: review for final eam 8.05, Spring 204 Counting Sets Inclusion-eclusion principle Rule of product (multiplication rule) Permutation and combinations Basics Outcome, sample space, event

More information

Central Limit Theorem and the Law of Large Numbers Class 6, Jeremy Orloff and Jonathan Bloom

Central Limit Theorem and the Law of Large Numbers Class 6, Jeremy Orloff and Jonathan Bloom Central Limit Theorem and the Law of Large Numbers Class 6, 8.5 Jeremy Orloff and Jonathan Bloom Learning Goals. Understand the statement of the law of large numbers. 2. Understand the statement of the

More information

Semi-Supervised Laplacian Regularization of Kernel Canonical Correlation Analysis

Semi-Supervised Laplacian Regularization of Kernel Canonical Correlation Analysis Semi-Supervised Laplacian Regularization of Kernel Canonical Correlation Analsis Matthew B. Blaschko, Christoph H. Lampert, & Arthur Gretton Ma Planck Institute for Biological Cbernetics Tübingen, German

More information

Fast Statistical Analysis of Rare Failure Events for Memory Circuits in High- Dimensional Variation Space

Fast Statistical Analysis of Rare Failure Events for Memory Circuits in High- Dimensional Variation Space Fast Statistical Analysis of Rare Failure Events for Memory Circuits in High- Dimensional Variation Space Shupeng Sun and Xin Li Electrical & Computer Engineering Department, Carnegie Mellon University

More information

CLASS NOTES Computational Methods for Engineering Applications I Spring 2015

CLASS NOTES Computational Methods for Engineering Applications I Spring 2015 CLASS NOTES Computational Methods for Engineering Applications I Spring 2015 Petros Koumoutsakos Gerardo Tauriello (Last update: July 2, 2015) IMPORTANT DISCLAIMERS 1. REFERENCES: Much of the material

More information

Generic Stochastic Effect Model of Noise in Controlling Source to Electronically Controlled Device

Generic Stochastic Effect Model of Noise in Controlling Source to Electronically Controlled Device Generic Stochastic Effect Model of oise in Controlling Source to Electronically Controlled Device RAWID BACHUI Department of Computer Engineering Siam University 35 Petchakasem Rd., Phasi-charoen, Bangkok

More information

Lecture 4.2 Finite Difference Approximation

Lecture 4.2 Finite Difference Approximation Lecture 4. Finite Difference Approimation 1 Discretization As stated in Lecture 1.0, there are three steps in numerically solving the differential equations. They are: 1. Discretization of the domain by

More information

Introduction...2 Chapter Review on probability and random variables Random experiment, sample space and events

Introduction...2 Chapter Review on probability and random variables Random experiment, sample space and events Introduction... Chapter...3 Review on probability and random variables...3. Random eperiment, sample space and events...3. Probability definition...7.3 Conditional Probability and Independence...7.4 heorem

More information

STA 414/2104, Spring 2014, Practice Problem Set #1

STA 414/2104, Spring 2014, Practice Problem Set #1 STA 44/4, Spring 4, Practice Problem Set # Note: these problems are not for credit, and not to be handed in Question : Consider a classification problem in which there are two real-valued inputs, and,

More information

1 Kernel methods & optimization

1 Kernel methods & optimization Machine Learning Class Notes 9-26-13 Prof. David Sontag 1 Kernel methods & optimization One eample of a kernel that is frequently used in practice and which allows for highly non-linear discriminant functions

More information

4.3 How derivatives affect the shape of a graph. The first derivative test and the second derivative test.

4.3 How derivatives affect the shape of a graph. The first derivative test and the second derivative test. Chapter 4: Applications of Differentiation In this chapter we will cover: 41 Maimum and minimum values The critical points method for finding etrema 43 How derivatives affect the shape of a graph The first

More information

ebay/google short course: Problem set 2

ebay/google short course: Problem set 2 18 Jan 013 ebay/google short course: Problem set 1. (the Echange Parado) You are playing the following game against an opponent, with a referee also taking part. The referee has two envelopes (numbered

More information

Calculus of Variation An Introduction To Isoperimetric Problems

Calculus of Variation An Introduction To Isoperimetric Problems Calculus of Variation An Introduction To Isoperimetric Problems Kevin Wang The University of Sydney SSP Working Seminars, MATH2916 May 4, 2013 Contents I Lagrange Multipliers 2 1 Single Constraint Lagrange

More information

MRC: The Maximum Rejection Classifier for Pattern Detection. With Michael Elad, Renato Keshet

MRC: The Maximum Rejection Classifier for Pattern Detection. With Michael Elad, Renato Keshet MRC: The Maimum Rejection Classifier for Pattern Detection With Michael Elad, Renato Keshet 1 The Problem Pattern Detection: Given a pattern that is subjected to a particular type of variation, detect

More information

Lecture 26: April 22nd

Lecture 26: April 22nd 10-725/36-725: Conve Optimization Spring 2015 Lecture 26: April 22nd Lecturer: Ryan Tibshirani Scribes: Eric Wong, Jerzy Wieczorek, Pengcheng Zhou Note: LaTeX template courtesy of UC Berkeley EECS dept.

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

6. Vector Random Variables

6. Vector Random Variables 6. Vector Random Variables In the previous chapter we presented methods for dealing with two random variables. In this chapter we etend these methods to the case of n random variables in the following

More information

Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing

Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing Statistical Modeling and Analysis of Scientific Inquiry: The Basics of Hypothesis Testing So, What is Statistics? Theory and techniques for learning from data How to collect How to analyze How to interpret

More information

9 - Markov processes and Burt & Allison 1963 AGEC

9 - Markov processes and Burt & Allison 1963 AGEC This document was generated at 8:37 PM on Saturday, March 17, 2018 Copyright 2018 Richard T. Woodward 9 - Markov processes and Burt & Allison 1963 AGEC 642-2018 I. What is a Markov Chain? A Markov chain

More information

Unconstrained Multivariate Optimization

Unconstrained Multivariate Optimization Unconstrained Multivariate Optimization Multivariate optimization means optimization of a scalar function of a several variables: and has the general form: y = () min ( ) where () is a nonlinear scalar-valued

More information

Economics 205 Exercises

Economics 205 Exercises Economics 05 Eercises Prof. Watson, Fall 006 (Includes eaminations through Fall 003) Part 1: Basic Analysis 1. Using ε and δ, write in formal terms the meaning of lim a f() = c, where f : R R.. Write the

More information

MACHINE LEARNING ADVANCED MACHINE LEARNING

MACHINE LEARNING ADVANCED MACHINE LEARNING MACHINE LEARNING ADVANCED MACHINE LEARNING Recap of Important Notions on Estimation of Probability Density Functions 22 MACHINE LEARNING Discrete Probabilities Consider two variables and y taking discrete

More information

Introduction to Probability Theory for Graduate Economics Fall 2008

Introduction to Probability Theory for Graduate Economics Fall 2008 Introduction to Probability Theory for Graduate Economics Fall 008 Yiğit Sağlam October 10, 008 CHAPTER - RANDOM VARIABLES AND EXPECTATION 1 1 Random Variables A random variable (RV) is a real-valued function

More information

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr. Simulation Discrete-Event System Simulation Chapter 4 Statistical Models in Simulation Purpose & Overview The world the model-builder sees is probabilistic rather than deterministic. Some statistical model

More information

5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER /$ IEEE

5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER /$ IEEE 5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER 2009 Hyperplane-Based Vector Quantization for Distributed Estimation in Wireless Sensor Networks Jun Fang, Member, IEEE, and Hongbin

More information

Chapter 4 (Lecture 6-7) Schrodinger equation for some simple systems Table: List of various one dimensional potentials System Physical correspondence

Chapter 4 (Lecture 6-7) Schrodinger equation for some simple systems Table: List of various one dimensional potentials System Physical correspondence V, E, Chapter (Lecture 6-7) Schrodinger equation for some simple systems Table: List of various one dimensional potentials System Physical correspondence Potential Total Energies and Probability density

More information

component risk analysis

component risk analysis 273: Urban Systems Modeling Lec. 3 component risk analysis instructor: Matteo Pozzi 273: Urban Systems Modeling Lec. 3 component reliability outline risk analysis for components uncertain demand and uncertain

More information

Basic methods to solve equations

Basic methods to solve equations Roberto s Notes on Prerequisites for Calculus Chapter 1: Algebra Section 1 Basic methods to solve equations What you need to know already: How to factor an algebraic epression. What you can learn here:

More information

A Statistical Study of the Effectiveness of BIST Jitter Measurement Techniques

A Statistical Study of the Effectiveness of BIST Jitter Measurement Techniques A Statistical Study of the Effectiveness of BIST Jitter Measurement Techniques David Bordoley, Hieu guyen, Mani Soma Department of Electrical Engineering, University of Washington, Seattle WA {bordoley,

More information

Math 180A. Lecture 16 Friday May 7 th. Expectation. Recall the three main probability density functions so far (1) Uniform (2) Exponential.

Math 180A. Lecture 16 Friday May 7 th. Expectation. Recall the three main probability density functions so far (1) Uniform (2) Exponential. Math 8A Lecture 6 Friday May 7 th Epectation Recall the three main probability density functions so far () Uniform () Eponential (3) Power Law e, ( ), Math 8A Lecture 6 Friday May 7 th Epectation Eample

More information

State Space and Hidden Markov Models

State Space and Hidden Markov Models State Space and Hidden Markov Models Kunsch H.R. State Space and Hidden Markov Models. ETH- Zurich Zurich; Aliaksandr Hubin Oslo 2014 Contents 1. Introduction 2. Markov Chains 3. Hidden Markov and State

More information

2014 Mathematics. Advanced Higher. Finalised Marking Instructions

2014 Mathematics. Advanced Higher. Finalised Marking Instructions 0 Mathematics Advanced Higher Finalised ing Instructions Scottish Qualifications Authority 0 The information in this publication may be reproduced to support SQA qualifications only on a noncommercial

More information

Machine Learning 4771

Machine Learning 4771 Machine Learning 4771 Instructor: Tony Jebara Topic 7 Unsupervised Learning Statistical Perspective Probability Models Discrete & Continuous: Gaussian, Bernoulli, Multinomial Maimum Likelihood Logistic

More information

Random Vectors. 1 Joint distribution of a random vector. 1 Joint distribution of a random vector

Random Vectors. 1 Joint distribution of a random vector. 1 Joint distribution of a random vector Random Vectors Joint distribution of a random vector Joint distributionof of a random vector Marginal and conditional distributions Previousl, we studied probabilit distributions of a random variable.

More information

Beyond Newton s method Thomas P. Minka

Beyond Newton s method Thomas P. Minka Beyond Newton s method Thomas P. Minka 2000 (revised 7/21/2017) Abstract Newton s method for optimization is equivalent to iteratively maimizing a local quadratic approimation to the objective function.

More information

Metric-based classifiers. Nuno Vasconcelos UCSD

Metric-based classifiers. Nuno Vasconcelos UCSD Metric-based classifiers Nuno Vasconcelos UCSD Statistical learning goal: given a function f. y f and a collection of eample data-points, learn what the function f. is. this is called training. two major

More information

Machine Learning Lecture 3

Machine Learning Lecture 3 Announcements Machine Learning Lecture 3 Eam dates We re in the process of fiing the first eam date Probability Density Estimation II 9.0.207 Eercises The first eercise sheet is available on L2P now First

More information

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling 10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel

More information

9th International Symposium on Quality Electronic Design

9th International Symposium on Quality Electronic Design 9th International Symposium on Quality Electronic Design Projection-Based Piecewise-Linear Response Surface Modeling for Strongly Nonlinear VLSI Performance Variations Xin Li Department of Electrical and

More information

Figure 1.1: Schematic symbols of an N-transistor and P-transistor

Figure 1.1: Schematic symbols of an N-transistor and P-transistor Chapter 1 The digital abstraction The term a digital circuit refers to a device that works in a binary world. In the binary world, the only values are zeros and ones. Hence, the inputs of a digital circuit

More information

Random Process Examples 1/23

Random Process Examples 1/23 ando Process Eaples /3 E. #: D-T White Noise Let ] be a sequence of V s where each V ] in the sequence is uncorrelated with all the others: E{ ] ] } 0 for This DEFINES a DT White Noise Also called Uncorrelated

More information

Multiple Random Variables

Multiple Random Variables Multiple Random Variables Joint Probability Density Let X and Y be two random variables. Their joint distribution function is F ( XY x, y) P X x Y y. F XY ( ) 1, < x

More information

Statistical Performance Modeling and Optimization

Statistical Performance Modeling and Optimization Foundations and Trends R in Electronic Design Automation Vol. 1, No. 4 (2006) 331 480 c 2007 X. Li, J. Le and L. T. Pileggi DOI: 10.1561/1000000008 Statistical Performance Modeling and Optimization Xin

More information

2 Generating Functions

2 Generating Functions 2 Generating Functions In this part of the course, we re going to introduce algebraic methods for counting and proving combinatorial identities. This is often greatly advantageous over the method of finding

More information

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear

More information

Chapter 2 Process Variability. Overview. 2.1 Sources and Types of Variations

Chapter 2 Process Variability. Overview. 2.1 Sources and Types of Variations Chapter 2 Process Variability Overview Parameter variability has always been an issue in integrated circuits. However, comparing with the size of devices, it is relatively increasing with technology evolution,

More information

Factor Analysis. Qian-Li Xue

Factor Analysis. Qian-Li Xue Factor Analysis Qian-Li Xue Biostatistics Program Harvard Catalyst The Harvard Clinical & Translational Science Center Short course, October 7, 06 Well-used latent variable models Latent variable scale

More information

Counting, symbols, positions, powers, choice, arithmetic, binary, translation, hex, addresses, and gates.

Counting, symbols, positions, powers, choice, arithmetic, binary, translation, hex, addresses, and gates. Counting, symbols, positions, powers, choice, arithmetic, binary, translation, he, addresses, and gates. Counting. Suppose the concern is a collection of objects. As an eample, let the objects be denoted

More information

Safe and Tight Linear Estimators for Global Optimization

Safe and Tight Linear Estimators for Global Optimization Safe and Tight Linear Estimators for Global Optimization Glencora Borradaile and Pascal Van Hentenryck Brown University Bo 1910 Providence, RI, 02912 {glencora, pvh}@cs.brown.edu Abstract. Global optimization

More information

A Comparison of Some Methods for Bounding Connected and Disconnected Solution Sets of Interval Linear Systems

A Comparison of Some Methods for Bounding Connected and Disconnected Solution Sets of Interval Linear Systems A Comparison of Some Methods for Bounding Connected and Disconnected Solution Sets of Interval Linear Systems R. Baker Kearfott December 4, 2007 Abstract Finding bounding sets to solutions to systems of

More information

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models Fatih Cavdur fatihcavdur@uludag.edu.tr March 29, 2014 Introduction Introduction The world of the model-builder

More information

Gaussian Process Vine Copulas for Multivariate Dependence

Gaussian Process Vine Copulas for Multivariate Dependence Gaussian Process Vine Copulas for Multivariate Dependence José Miguel Hernández Lobato 1,2, David López Paz 3,2 and Zoubin Ghahramani 1 June 27, 2013 1 University of Cambridge 2 Equal Contributor 3 Ma-Planck-Institute

More information

Designing Information Devices and Systems I Spring 2018 Lecture Notes Note 11

Designing Information Devices and Systems I Spring 2018 Lecture Notes Note 11 EECS 16A Designing Information Devices and Systems I Spring 2018 Lecture Notes Note 11 11.1 Context Our ultimate goal is to design systems that solve people s problems. To do so, it s critical to understand

More information

Partitioning variation in multilevel models.

Partitioning variation in multilevel models. Partitioning variation in multilevel models. by Harvey Goldstein, William Browne and Jon Rasbash Institute of Education, London, UK. Summary. In multilevel modelling, the residual variation in a response

More information

Probabilistic Graphical Models

Probabilistic Graphical Models School of Computer Science Probabilistic Graphical Models Gaussian graphical models and Ising models: modeling networks Eric Xing Lecture 0, February 5, 06 Reading: See class website Eric Xing @ CMU, 005-06

More information

On the Optimal Scaling of the Modified Metropolis-Hastings algorithm

On the Optimal Scaling of the Modified Metropolis-Hastings algorithm On the Optimal Scaling of the Modified Metropolis-Hastings algorithm K. M. Zuev & J. L. Beck Division of Engineering and Applied Science California Institute of Technology, MC 4-44, Pasadena, CA 925, USA

More information

Gaussian Processes in Reinforcement Learning

Gaussian Processes in Reinforcement Learning Gaussian Processes in Reinforcement Learning Carl Edward Rasmussen and Malte Kuss Ma Planck Institute for Biological Cybernetics Spemannstraße 38, 776 Tübingen, Germany {carl,malte.kuss}@tuebingen.mpg.de

More information

Chapter 5 MOSFET Theory for Submicron Technology

Chapter 5 MOSFET Theory for Submicron Technology Chapter 5 MOSFET Theory for Submicron Technology Short channel effects Other small geometry effects Parasitic components Velocity saturation/overshoot Hot carrier effects ** Majority of these notes are

More information

Optics in a field of gravity

Optics in a field of gravity Optics in a field of gravity E. Eriksen # and Ø. Grøn # # Institute of Physics, University of Oslo, P.O.Bo 48 Blindern, N-36 Oslo, Norway Department of Engineering, Oslo University College, St.Olavs Pl.

More information

1. Sets A set is any collection of elements. Examples: - the set of even numbers between zero and the set of colors on the national flag.

1. Sets A set is any collection of elements. Examples: - the set of even numbers between zero and the set of colors on the national flag. San Francisco State University Math Review Notes Michael Bar Sets A set is any collection of elements Eamples: a A {,,4,6,8,} - the set of even numbers between zero and b B { red, white, bule} - the set

More information

An example to illustrate frequentist and Bayesian approches

An example to illustrate frequentist and Bayesian approches Frequentist_Bayesian_Eample An eample to illustrate frequentist and Bayesian approches This is a trivial eample that illustrates the fundamentally different points of view of the frequentist and Bayesian

More information

Random Fields in Bayesian Inference: Effects of the Random Field Discretization

Random Fields in Bayesian Inference: Effects of the Random Field Discretization Random Fields in Bayesian Inference: Effects of the Random Field Discretization Felipe Uribe a, Iason Papaioannou a, Wolfgang Betz a, Elisabeth Ullmann b, Daniel Straub a a Engineering Risk Analysis Group,

More information

Plotting data is one method for selecting a probability distribution. The following

Plotting data is one method for selecting a probability distribution. The following Advanced Analytical Models: Over 800 Models and 300 Applications from the Basel II Accord to Wall Street and Beyond By Johnathan Mun Copyright 008 by Johnathan Mun APPENDIX C Understanding and Choosing

More information

2.7 The Gaussian Probability Density Function Forms of the Gaussian pdf for Real Variates

2.7 The Gaussian Probability Density Function Forms of the Gaussian pdf for Real Variates .7 The Gaussian Probability Density Function Samples taken from a Gaussian process have a jointly Gaussian pdf (the definition of Gaussian process). Correlator outputs are Gaussian random variables if

More information

Chapter 2. Discrete Distributions

Chapter 2. Discrete Distributions Chapter. Discrete Distributions Objectives ˆ Basic Concepts & Epectations ˆ Binomial, Poisson, Geometric, Negative Binomial, and Hypergeometric Distributions ˆ Introduction to the Maimum Likelihood Estimation

More information

Machine Learning Basics III

Machine Learning Basics III Machine Learning Basics III Benjamin Roth CIS LMU München Benjamin Roth (CIS LMU München) Machine Learning Basics III 1 / 62 Outline 1 Classification Logistic Regression 2 Gradient Based Optimization Gradient

More information