MATH5835 Statistical Computing

Jochen Voss

September 27, 2011

Copyright © 2011 Jochen Voss <J.Voss@leeds.ac.uk>

This text is work in progress and may still contain typographical and factual mistakes. Reports about problems and suggestions for improvements are most welcome.

Contents

1 Introduction

2 Random Number Generation
  2.1 Pseudo Random Number Generators
    2.1.1 The Linear Congruential Generator
    2.1.2 Quality of Pseudo Random Numbers
    2.1.3 Pseudo Random Number Generators in Practice
  2.2 The Inverse Transform Method
  2.3 Rejection Sampling
    2.3.1 Uniform Distribution on Sets
    2.3.2 General Densities
  2.4 Other Methods
  2.5 Exercises

3 Monte Carlo Methods
  3.1 Basic Monte Carlo Integration
  3.2 Variance Reduction Methods
    3.2.1 Importance Sampling
    3.2.2 Antithetic Variables
    3.2.3 Control Variates
  3.3 Applications to Statistical Inference
  3.4 Exercises

4 Resampling Methods
  4.1 Bootstrap Methods
  4.2 Jackknife Methods
  4.3 Exercises

5 Markov Chain Monte Carlo Methods
  5.1 Markov Chains
  5.2 The Metropolis-Hastings Method
    5.2.1 Description of the Method
    5.2.2 Random Walk Metropolis Sampling
    5.2.3 Independence Sampler
  5.3 Exercises

6 The EM Algorithm

A Probability Reminders
  A.1 Events and Probability
  A.2 Conditional Probabilities
  A.3 Expectation

B Programming in R
  B.1 General advice
  B.2 R as a calculator
    B.2.1 Mathematical Operations
    B.2.2 Variables
    B.2.3 Data Types
  B.3 Programming Principles
    B.3.1 Principle 1: Don't Repeat Yourself
    B.3.2 Principle 2: Divide and Conquer
    B.3.3 Principle 3: Test your Code
  B.4 Random Number Generation
  B.5 Exercises

C Answers to the Exercises

Bibliography

Index

Chapter 1
Introduction

The use of computers in mathematics and statistics has opened up a wide range of techniques for studying otherwise intractable problems and for analysing very large data sets. Statistical computing is the branch of mathematics which concerns these techniques for situations which either directly involve randomness, or where randomness is used as part of a mathematical model. This text gives an overview of the foundations of and basic methods in statistical computing.

An area of mathematics closely related to statistical computing is computational statistics. In the literature there is no clear consensus about the differentiation between these two areas; sometimes the two terms are even used interchangeably. Here we take the point of view that statistical computing concerns computational methods which involve randomness as part of the method. Such methods are what this text is mainly interested in; typical examples of such techniques are Monte Carlo methods. In contrast, we take computational statistics to be the larger area of computational methods which can be applied in statistics, even if the methods themselves do not involve randomness. Examples of such techniques include methods to find the maximum of a density function in maximum likelihood estimation. Here, we will give such methods only a cursory treatment, and refer to the literature instead (see, for example, Thisted (1988) for an extensive discussion of such methods).

One of the most important ideas in statistical computing is that often properties of a stochastic model can be found experimentally, by using a computer to generate many random instances of the model, and then statistically analysing the resulting sample. The resulting methods are called Monte Carlo methods, and discussion of such methods forms the main body of this text.

This text is mostly self-contained; only basic knowledge of the beginnings of statistics and probability is required. The text should be accessible to an MSc or PhD student in Mathematics, Statistics, Physics or related areas. For reference, appendix A summarises the most important concepts and results from probability. Since we study computational methods, programming skills are required to get maximal benefit out of this text. Previous programming experience would of course be useful, but is not required: appendix B gives an introduction to the aspects of programming we require here and, starting from there, we provide many exercises for the reader to train his or her skills. While the main part of the text is independent of any specific programming language, the exercises use the statistical software package R (R Development Core Team, 2011) for implementation of our algorithms, and the text includes an introduction to programming in R in appendix B.

Notation

For reference, the following table summarises some of the notation used throughout this text.

N              the natural numbers: N = {1, 2, 3, ...}
R              the real numbers
(a_n)_{n∈N}    a sequence of (possibly random) numbers: (a_n)_{n∈N} = (a_1, a_2, ...)
[a, b]         an interval of real numbers: [a, b] = { x ∈ R | a ≤ x ≤ b }
{a, b}         the set containing a and b
U[0, 1]        the uniform distribution on the interval [0, 1]
U{-1, 1}       the uniform distribution on the two-element set {-1, 1}
A^c            the complement of a set: A^c = { x | x ∉ A }
A × B          the Cartesian product of the sets A and B: A × B = { (a, b) | a ∈ A, b ∈ B }
1_{X∈A}        indicator function of the event that the random variable X takes a value in the set A: 1_{X∈A} = 1 if X ∈ A and 0 else (see page 68)
1_A(x)         indicator function of the set A: 1_A(x) = 1 if x ∈ A and 0 else (see page 68)
X ~ µ          indicates that a random variable X is distributed according to a probability distribution µ
|S|            the number of elements in a finite set; in section 2.3 also the volume of subsets S ⊆ R^d
R^{S×S}        matrices where the columns and rows are indexed by elements of S (see page 45)
R^S            vectors where the components are indexed by elements of S (see page 45)

Chapter 2
Random Number Generation

The basis of all stochastic algorithms is formed by methods to generate random samples from a given model, e.g. to generate throws of a coin or of a die, Gaussian random numbers, random functions (e.g. in finance), or random graphs (e.g. in epidemiology). In this chapter we will discuss how to generate random samples from a given distribution on a computer.

Since computer programs are inherently deterministic, some care is needed when trying to generate random numbers on a computer. Usually the task is split into two steps:

a) Generate a sequence X_1, X_2, X_3, ... of independent, identically distributed (i.i.d.) random variables, uniformly distributed on the interval [0, 1] or on a finite set like {1, 2, ..., n}. Methods for this will be discussed in section 2.1.

b) Starting with the result of the first step, transform the random variables to construct samples of more complicated distributions. Different methods for such transformations are discussed later in this chapter, starting in section 2.2.

This split allows the process of generating randomness (which has its own set of concerns, discussed in section 2.1) to be separated from the process of transforming the randomness to have the correct distribution (which can often be done exactly, with mathematical rigour).

2.1 Pseudo Random Number Generators

There are two fundamentally different classes of methods to generate random numbers:

a) True random numbers are generated using some physical phenomenon which is random. Generating such numbers requires specialised hardware and can be expensive and slow. Classical examples of this include tossing a coin or throwing dice. Modern methods utilise quantum effects, thermal noise in electric circuits, the timing of radioactive decay, etc.

b) Pseudo random numbers are generated by computer programs. While these methods are normally fast and resource effective, a challenge with this approach is that computer programs are inherently deterministic and therefore cannot produce truly random output.

In this text we will only consider pseudo random number generators.

Definition 2.1. A pseudo random number generator (PRNG) is an algorithm which outputs a sequence of numbers that can be used as a replacement for an i.i.d. sequence of true random numbers.

In the definition, and throughout this text, the shorthand "i.i.d." is used as an abbreviation for the term "independent and identically distributed".

2.1.1 The Linear Congruential Generator

A (simple) example of a pseudo random number generator is given in the following algorithm:

Algorithm LCG (linear congruential generator)
input:
  m > 1 (the modulus),
  a ∈ {1, 2, ..., m-1} (the multiplier),
  c ∈ {0, 1, ..., m-1} (the increment),
  X_0 ∈ {0, 1, ..., m-1} (the seed).
output: a sequence X_1, X_2, X_3, ... of pseudo-random numbers.
1: for n = 1, 2, 3, ... do
2:   X_n ← (a X_{n-1} + c) mod m
3:   output X_n
4: end for

The sequence generated by the algorithm LCG consists of integers X_n ∈ {0, 1, 2, ..., m-1}. The output depends on the parameters m, a, c and on the seed X_0. If m, a and c are carefully chosen, the resulting sequence behaves similarly to a sequence of independent, uniformly distributed random variables. This is illustrated by the following example and, more extensively, by exercise 2.2.

Example 2.2. For parameters m = 8, a = 5, c = 1 and seed X_0 = 0, algorithm LCG gives the following output:

    n               1   2   3   4   5   6   7   8   9  ...
    5 X_{n-1} + 1   1   6  31  36  21  26  11  16   1  ...
    X_n             1   6   7   4   5   2   3   0   1  ...

The output

    1, 6, 7, 4, 5, 2, 3, 0, 1, 6, ...    (2.1)

shows no obvious pattern and could be considered to be a sample of a random sequence.

Since each new value X_n only depends on X_{n-1}, the generated series will repeat over and over again once it reaches a value X_n which has been generated before. In example 2.2 this happens for X_8 = X_0, and we get X_9 = X_1, X_10 = X_2 and so on. Since X_n can take only m different values, the output of a linear congruential generator starts repeating itself after at most m steps; the generated sequence is eventually periodic.

To work around this problem, typical values of m are very large, for example on the order of m = 2^32 as in example 2.4 below. The values for a and c are then chosen such that the generator actually achieves the maximally possible period length of m. A criterion for the choice of m, a and c is given in the following theorem (Knuth, 1981).

Theorem 2.3. The LCG has period m if and only if the following three conditions are satisfied:

a) m and c are relatively prime,
b) a - 1 is divisible by every prime factor of m, and
c) a - 1 is a multiple of 4 if m is a multiple of 4.

In the situation of the theorem, the period length does not depend on the seed X_0 and usually this parameter is left to be chosen by the user of the pseudo random number generator.

Example 2.4. Let m = 2^32, a = ... and c = ... . Since the only prime factor of m is 2 and c is odd, the values m and c are relatively prime and condition (a) of the theorem is satisfied. Similarly, condition (b) is satisfied, since a - 1 is even and thus divisible by 2. Finally, since m is a multiple of 4, we have to check condition (c); since a - 1 is a multiple of 4, this condition also holds. Therefore the linear congruential generator with these parameters m, a and c has period 2^32 for every seed X_0.

If the period length is longer than the number of random numbers we use, one can use the output of the linear congruential generator as a replacement for a sequence of i.i.d. random numbers which are uniformly distributed on the set {0, 1, ..., m-1}. The applications of random number generators that will be discussed in the remaining parts of this text mostly require a sequence of real-valued i.i.d. random variables, e.g. uniformly distributed on the interval [0, 1]. We can get a sequence (U_n)_{n∈N} of pseudo random numbers to replace an i.i.d. sequence of U[0, 1] random variables by setting U_n = X_n / m, where (X_n)_{n∈N} is the output of the linear congruential generator.
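As a minimal illustration, the recursion of algorithm LCG and the rescaling U_n = X_n / m can be written in a few lines of R; this is only a sketch, and exercise 2.1 below develops it into a proper function.

lcg <- function(n, m, a, c, X0) {
  X <- numeric(n)
  x <- X0
  for (i in 1:n) {
    x <- (a * x + c) %% m   # X_n = (a X_{n-1} + c) mod m
    X[i] <- x
  }
  X
}

lcg(10, m = 8, a = 5, c = 1, X0 = 0)        # 1 6 7 4 5 2 3 0 1 6, as in (2.1)
U <- lcg(10, m = 8, a = 5, c = 1, X0 = 0) / 8   # pseudo U[0,1) values via U_n = X_n / m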

2.1.2 Quality of Pseudo Random Numbers

Pseudo random number generators used in modern software packages like R or Matlab are more sophisticated (and more complicated) than the linear congruential generator presented here, but they still share the following characteristics:

- The sequence of random numbers generated by a pseudo random number generator depends on a seed. One can get different random sequences for different runs by choosing the seed based on variable quantities like the current time of day. Conversely, one can get reproducible results by setting the seed to a known, fixed value.

- The property that the output is eventually periodic is shared by all pseudo random number generators implemented in software. The period length is a measure of the quality of a pseudo random number generator.

- Another problem of the output of the LCG algorithm is that the generated random numbers are not independent (since each value is a deterministic function of the previous value). Again, to some extent this problem is shared by all PRNGs. There are methods available to quantify the effect and there are PRNGs which suffer less from this problem, e.g. the Mersenne Twister algorithm (Matsumoto and Nishimura, 1998).

- Finally, since pseudo random numbers are meant to be used as a replacement for i.i.d. sequences of true random numbers, a common way to test pseudo random numbers is to apply statistical tests (e.g. for testing independence) to the generated sequences. Random number generators used in practice pass such tests without problems.

2.1.3 Pseudo Random Number Generators in Practice

This section contains advice on using pseudo random number generators in practice. First, it is almost always a bad idea to implement your own pseudo random number generator: finding a good algorithm for pseudo random number generation is a difficult problem, and even when an algorithm is available, given the nature of the generated output, it can be a challenge to spot and remove all mistakes in the implementation. Therefore, it is normally advisable to use a well-established method for random number generation, typically the random number generator built into a well-known software package or provided by a well-established library.

A second consideration concerns the rôle of the seed. While different pseudo random number generators differ greatly in implementation details, they all use a seed (like the value X_0 in algorithm LCG) to initialise the state of the random number generator. Often, when non-predictability is required, it is useful to set the seed to some volatile quantity (like the current time) to get a different sequence of random numbers for different runs of the program. At other times it can be more useful to get reproducible results, for example to aid debugging or to ensure repeatability of published results. In these cases, the seed should be set to a known, fixed value.
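In R the seed is set with the set.seed function; a minimal sketch of the two uses of the seed described above is:

set.seed(12345)          # fixed seed: reproducible output on every run
x1 <- runif(5)
set.seed(12345)
x2 <- runif(5)
identical(x1, x2)        # TRUE

set.seed(as.integer(Sys.time()))   # volatile seed from the clock: output varies between runs
runif(5)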

Figure 2.1: Scatter plots to illustrate the correlation between consecutive outputs X_i and X_{i+1} of different pseudo random number generators. The random number generators used are the runif function in R (top left), the LCG with m = 81, a = 1 and c = 8 (top right), the LCG with m = 1024, a = 401, c = 101 (bottom left) and finally the LCG with parameters m = 2^32, a = ..., c = ... (bottom right). Clearly the output in the second and third example does not behave like an i.i.d. sequence of random variables. See exercise 2.2 for details.

2.2 The Inverse Transform Method

The inverse transform method is a method for generating random variables with values in the real numbers, using only U[0, 1]-distributed values as input. In this section we will use the concepts of probability densities and of (cumulative) distribution functions. See appendix A for a quick summary of these concepts.

Theorem 2.5. Let F be a distribution function. Define the inverse of F by

    F^{-1}(u) = inf{ x ∈ R | F(x) ≥ u }    for all u ∈ (0, 1)

and let U ~ U[0, 1]. Define X = F^{-1}(U). Then X has distribution function F.

Proof. Using the definitions of X and F^{-1} we find

    P(X ≤ a) = P(F^{-1}(U) ≤ a) = P(min{ x | F(x) ≥ U } ≤ a).

Since min{ x | F(x) ≥ U } ≤ a holds if and only if F(a) ≥ U, we can conclude

    P(X ≤ a) = P(F(a) ≥ U) = F(a),

where the final equality comes from the definition of the uniform distribution on [0, 1]. (q.e.d.)

Once F^{-1} is determined, this method is very simple to apply and the resulting algorithm is trivial. We state the formal algorithm only for completeness:

Algorithm INV (inverse transform method)
input: the inverse F^{-1} of a CDF F.
randomness used: U ~ U[0, 1].
output: X ~ F.
1: output X = F^{-1}(U)

Example 2.6. Let X have density f(x) = 3x^2 for x ∈ [0, 1], and f(x) = 0 else. Then

    F(a) = ∫_{-∞}^{a} f(x) dx = 0 if a < 0,  a^3 if 0 ≤ a < 1,  and 1 for 1 ≤ a.

Since F maps (0, 1) into (0, 1) injectively, F^{-1} is given by the usual inverse function and thus F^{-1}(u) = u^{1/3} for all u ∈ (0, 1). Thus, by theorem 2.5, if U ~ U[0, 1], the cube root U^{1/3} has the same distribution as X.
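For example 2.6 the method takes only a couple of lines of R; a minimal sketch:

set.seed(1)
U <- runif(10000)
X <- U^(1/3)                              # inverse transform: F^{-1}(u) = u^(1/3)
hist(X, breaks = 50, freq = FALSE)        # compare the histogram ...
curve(3 * x^2, from = 0, to = 1, add = TRUE)   # ... with the density f(x) = 3 x^2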

Figure 2.2: Illustration of the inverse F^{-1} of a CDF F. At level u the function F is continuous and injective; here F^{-1} coincides with the usual inverse of a function. The value v falls in the middle of a jump of F and thus has no preimage; F^{-1}(v) is the preimage of the right-hand limit of F and F(F^{-1}(v)) ≥ v. At level w the function F is not injective, several points map to w; the preimage F^{-1}(w) is the left-most of these points and we have, for example, F^{-1}(F(a)) ≠ a.

Example 2.7. Let X be discrete with P(X = 0) = 0.6 and P(X = 1) = 0.4. Then

    F(a) = 0 if a < 0,  0.6 if 0 ≤ a < 1,  and 1 if 1 ≤ a.

Using the definition of F^{-1} we find

    F^{-1}(u) = 0 if 0 < u ≤ 0.6,  and 1 else.

By theorem 2.5 we can construct a random variable X with the correct distribution from U ~ U[0, 1] by setting X = 0 if U ≤ 0.6, and X = 1 if U > 0.6.

The inverse transform method can always be applied when F^{-1} can be computed explicitly. For some distributions like the normal distribution this is not possible, and the inverse transform method cannot be applied directly. The method can be applied (but may not be very useful) for discrete distributions as in example 2.7.

2.3 Rejection Sampling

Rejection sampling is a method to generate samples of a given target distribution using samples from some related auxiliary distribution. Different from the previous method, rejection sampling may use more than one sample from the auxiliary distribution to generate one sample of the target distribution. The method is not restricted to the generation of random numbers, but works for sampling on arbitrary probability spaces. We formulate the method here for distributions on the Euclidean space R^d.

2.3.1 Uniform Distribution on Sets

In this section we consider a special case of rejection sampling, namely a rejection algorithm to sample from the uniform distribution on a set. This algorithm, while being useful in itself, will be used as the basis for the general rejection sampling method in the following section.

When A ⊆ R^d is a set, we write |A| for the d-dimensional volume of A: the cube Q = [a, b]^3 ⊆ R^3 has |Q| = (b - a)^3, the unit circle C = { x ∈ R^2 | x_1^2 + x_2^2 ≤ 1 } has 2-dimensional volume (area) π, and the line segment [a, b] ⊆ R has 1-dimensional volume (length) b - a.

Definition 2.8. A random variable X with values in R^d is uniformly distributed on a set A ⊆ R^d with |A| < ∞, if

    P(X ∈ B) = |A ∩ B| / |A|    for all B ⊆ R^d.    (2.3)

As for real intervals, we use the notation X ~ U(A) to indicate that X is uniformly distributed on A.

In general, the set B in (2.3) may not be a subset of A. For the case B ⊆ A we have A ∩ B = B and thus (2.3) simplifies to P(X ∈ B) = |B| / |A|. Similarly, if B ⊆ A, we have

    P(X ∉ B) = P(X ∈ R^d \ B) = |A \ B| / |A| = (|A| - |B|) / |A| = 1 - |B| / |A|.

One of the basic ideas of rejection sampling is described in the following lemma, which turns the uniform distribution on a set into the uniform distribution on a subset.

Lemma 2.9. Let B ⊆ A ⊆ R^d with |A| < ∞ and let (X_k)_{k∈N} be a sequence of independent random variables, uniformly distributed on A. Furthermore let N = min{ k ∈ N | X_k ∈ B }. Then X_N ~ U(B).

Proof. The lemma can be proved by verifying that condition (2.3) is satisfied. For C ⊆ R^d we have

    P(X_N ∈ C) = Σ_{k=1}^∞ P(X_k ∈ C, N = k)
               = Σ_{k=1}^∞ P(X_k ∈ B ∩ C) P(X_{k-1} ∉ B) ⋯ P(X_1 ∉ B)
               = Σ_{k=1}^∞ (|A ∩ (B ∩ C)| / |A|) (1 - |B| / |A|)^{k-1},

where we used the independence of the X_k. Using the geometric series formula Σ_{k=0}^∞ q^k = 1/(1 - q) we get

    P(X_N ∈ C) = (|B ∩ C| / |A|) · 1 / (1 - (1 - |B| / |A|)) = |B ∩ C| / |B|

for all C ⊆ R^d. This is the required condition for X_N to be uniformly distributed on B. (q.e.d.)

The result of lemma 2.9 is easily turned into an algorithm for sampling from U(B): we can generate the X_k one by one, for each generated value check the condition X_k ∈ B, and stop once the condition is true. In the context of the rejection sampling algorithm, the random variables X_k are called proposals and we say that the proposals X_1, ..., X_{N-1} are rejected and X_N is accepted.
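A minimal R sketch of this procedure, for the illustrative choice A = [-1, 1]^2 and B the unit disc (this concrete example is an assumption made here for illustration, not an example from the text):

runif_disc <- function() {
  repeat {
    x <- runif(2, min = -1, max = 1)   # proposal X_k, uniform on A = [-1, 1]^2
    if (sum(x^2) <= 1) return(x)       # accept the first proposal that falls into B
  }
}
Z <- t(replicate(2000, runif_disc()))
plot(Z, asp = 1)   # the accepted points fill the disc uniformly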

2.3.2 General Densities

In this section we present the general rejection sampling method. The following lemma, which makes a connection between distributions with general densities and uniform distributions on sets, forms the basis of this method.

Lemma 2.10. Let f: R^d → [0, ∞) be a probability density and let

    A = { (x, v) ∈ R^d × [0, ∞) | 0 ≤ v < f(x) } ⊆ R^{d+1}.

Then the following two statements are equivalent:

a) X has probability density f on R^d and V = f(X) U where U ~ U[0, 1]. The random variables X and U are independent.

b) (X, V) is uniformly distributed on A.

Proof. The volume of the set A can be found by integrating the height f(x) over all of R^d. Since f is a probability density, we get

    |A| = ∫_{R^d} f(x) dx = 1.    (2.4)

Assume first that the random variables X with density f and U ~ U[0, 1] are independent, and let V = f(X) U. Furthermore let C ⊆ R^d, D ⊆ [0, ∞) and B = C × D. Then we get

    P((X, V) ∈ B) = P(X ∈ C, V ∈ D)
                  = ∫_C P(V ∈ D | X = x) f(x) dx
                  = ∫_C P(f(x) U ∈ D) f(x) dx
                  = ∫_C (|D ∩ [0, f(x)]| / f(x)) f(x) dx
                  = ∫_C |D ∩ [0, f(x)]| dx.

On the other hand we have

    |A ∩ B| = ∫_{R^d} ∫_0^{f(x)} 1_{(x,v) ∈ B} dv dx
            = ∫_{R^d} 1_{x ∈ C} ∫_0^{f(x)} 1_{v ∈ D} dv dx
            = ∫_C |D ∩ [0, f(x)]| dx,

and thus P((X, V) ∈ B) = |A ∩ B|. This shows that (X, V) is uniformly distributed on A.

For the converse statement assume now that (X, V) is uniformly distributed on A and define U = V / f(X). Since (X, V) ∈ A, we have f(X) > 0 almost surely and thus there is no problem in dividing by f(X). Given sets C ⊆ R^d and D ⊆ R we find

    P(X ∈ C, U ∈ D) = P((X, V) ∈ { (x, v) | x ∈ C, v/f(x) ∈ D })
                    = |A ∩ { (x, v) | x ∈ C, v/f(x) ∈ D }|
                    = ∫_{R^d} ∫_0^{f(x)} 1_{x ∈ C} 1_{v/f(x) ∈ D} dv dx.

Figure 2.3: Illustration of the rejection sampling method where the graph of the target density is contained in a rectangle R and the proposals are uniformly distributed on R.

Using the substitution u = v/f(x) in the inner integral we get

    P(X ∈ C, U ∈ D) = ∫_{R^d} 1_{x ∈ C} ∫_0^1 1_{u ∈ D} f(x) du dx
                    = ∫_C f(x) dx · ∫_D 1_{[0,1]}(u) du.

Therefore X and U are independent with densities f and 1_{[0,1]}, respectively. (q.e.d.)

An easy application of the lemma is to use the implication from b) to a) to convert a uniform distribution in R^2 to a distribution on R with a given density f: [a, b] → R. For simplicity, we assume here that f lives on a bounded interval [a, b]. Furthermore, assume that f satisfies f(x) ≤ M for all x ∈ [a, b]. We can generate samples from the distribution with density f as follows:

a) Let X_k ~ U[a, b] and Y_k ~ U[0, M] for k ∈ N, independently. Then the (X_k, Y_k) are i.i.d., uniformly distributed on the rectangle R = [a, b] × [0, M].

b) Consider the set A = { (x, y) ∈ R | y ≤ f(x) } and let N = min{ k ∈ N | (X_k, Y_k) ∈ A }. By lemma 2.9, (X_N, Y_N) is uniformly distributed on A.

c) By lemma 2.10, the value X_N is distributed with density f.

This procedure is visualised in figure 2.3, and a small R sketch is given below.
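A minimal R sketch of steps a) to c), using the density f(x) = 3 x^2 from example 2.6 on [a, b] = [0, 1]; the bound M = 3 is an illustrative choice (it is the maximum of f on [0, 1]):

f <- function(x) 3 * x^2
rf_rect <- function() {
  repeat {
    x <- runif(1, 0, 1)       # X_k ~ U[a, b]
    y <- runif(1, 0, 3)       # Y_k ~ U[0, M]
    if (y <= f(x)) return(x)  # accept once (X_k, Y_k) falls below the graph of f
  }
}
X <- replicate(5000, rf_rect())
hist(X, breaks = 50, freq = FALSE)
curve(f, 0, 1, add = TRUE)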

In the general case, when f is defined on an unbounded set, we cannot use proposals which are uniformly distributed on a rectangle anymore. A solution to this problem is to replace the rectangle R with a different, unbounded area which has finite volume (so that the uniform distribution exists) and to use lemma 2.10 a second time to obtain a uniform distribution on this set. This idea is implemented in the following algorithm.

Figure 2.4: Illustration of the rejection sampling method from algorithm REJ. The proposal (X_k, c g(X_k) U_k) is distributed uniformly on the area under the graph of cg. The proposal is accepted if it falls into the area underneath the graph of f.

Algorithm REJ (rejection sampling)
input:
  a probability density f (the target density),
  a probability density g (the proposal density),
  a constant c > 0 such that f(x) ≤ c g(x) for all x.
randomness used:
  X_k i.i.d. with density g (the proposals),
  U_k ~ U[0, 1] i.i.d.
output: i.i.d. random variables Y_j with density f.
1: let j ← 1
2: for k = 1, 2, 3, ... do
3:   generate X_k with density g
4:   generate U_k ~ U[0, 1]
5:   if c g(X_k) U_k ≤ f(X_k) then
6:     output Y_j = X_k
7:     j ← j + 1
8:   end if
9: end for

The assumption in the algorithm is that we can already sample from the distribution with probability density g, but we would like to generate samples from the distribution with density f instead. The theorem below shows that the method works whenever we can find the required constant c. This condition implies, for example, that the support of f cannot be bigger than the support of g, i.e. we need f(x) to be 0 whenever g(x) = 0. Since f and g are both probability densities, we find

    1 = ∫ f(x) dx ≤ ∫ c g(x) dx = c,

i.e. the constant c always satisfies c ≥ 1, with equality only being possible for f = g.

Theorem 2.11. Let (Y_j)_{j∈N} be the sequence of random variables generated by algorithm REJ. Then the following statements hold:

a) The Y_j are i.i.d. with density f.

b) The number N_j of proposals required to generate Y_j is geometrically distributed with parameter p = 1/c. In particular, E(N_j) = c.

Proof. Since the X_k have density g, we know from lemma 2.10 that (X_k, g(X_k) U_k) is uniformly distributed on the set { (x, v) | 0 ≤ v < g(x) }. Consequently, Z_k = (X_k, c g(X_k) U_k) is uniformly distributed on A = { (x, y) | 0 ≤ y < c g(x) }. By lemma 2.9, the accepted values are then uniformly distributed on the set B = { (x, y) | 0 ≤ y < f(x) } ⊆ A and, applying lemma 2.10 again, we find that the X_k, conditional on being accepted, have density f. This completes the proof of the first statement.

Proposals are accepted if and only if Z_i ∈ B. Since |A| = ∫ c g(x) dx = c and |B| = ∫ f(x) dx = 1, the probability of accepting proposal i is

    P(Z_i ∈ B) = |B| / |A| = 1/c,

and the events are independent. Since N_j is the time until the first success of independent trials which succeed with probability p = 1/c, the values N_j are geometrically distributed with parameter p. This completes the proof of the second statement. (q.e.d.)

The average cost of generating one sample is given by the average number of proposals required times the cost for generating each proposal. Therefore the algorithm is efficient if the following two conditions are satisfied:

a) There is an efficient method to generate the proposals X_i.

b) f and g are similar, in the sense that the constant c is close to 1.
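A minimal R sketch of algorithm REJ; the concrete target f(x) = 6 x (1 - x) on [0, 1], the proposal g = U[0, 1] and the constant c = 1.5 are illustrative choices (not examples from the text), with c = 1.5 valid because f(x) ≤ 1.5 for all x:

rej <- function(n, f, rg, dg, c) {
  out <- numeric(n)
  j <- 1
  while (j <= n) {
    x <- rg(1)                      # proposal X_k with density g
    u <- runif(1)                   # U_k ~ U[0, 1]
    if (c * dg(x) * u <= f(x)) {    # accept if c g(X_k) U_k <= f(X_k)
      out[j] <- x
      j <- j + 1
    }
  }
  out
}
f <- function(x) 6 * x * (1 - x)
Y <- rej(5000, f, rg = runif, dg = dunif, c = 1.5)
hist(Y, breaks = 50, freq = FALSE)
curve(f, 0, 1, add = TRUE)

The acceptance probability is 1/c = 2/3 here, in line with statement b) of theorem 2.11.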

2.4 Other Methods

There are many specialised methods to generate samples from specific distributions. These are often faster than the generic methods described in the previous sections, but can typically only be used for a single distribution. Such specialised methods (optimised for speed and often quite complex) form the basis of the built-in random number generators in software packages. In contrast, the methods discussed in the previous sections are general purpose methods which can be used for a wide range of distributions when no pre-existing method is available.

Further Reading

A lot of information about linear congruential generators and about testing of random number generators can be found in Knuth (1981). The Mersenne Twister, a popular modern pseudo random number generator, is described in Matsumoto and Nishimura (1998). Rejection sampling and its extensions are described in Robert and Casella (2004, section 2.3). Specialised methods for generating normally distributed random variables can be found in Box and Muller (1958) and Marsaglia and Tsang (2000).

Figure 2.5: The density of the mixture distribution (1/2) N(1, 0.04) + (1/2) N(4, 1), together with a histogram generated from 4000 samples.

2.5 Exercises

Exercise E-2.1. Write an R function to implement the linear congruential generator. The function should have the following form:

LCG <- function(n, m, a, c, X0) {
  ...
  return(X)
}

The return value should be the vector X = (X_1, X_2, ..., X_n). Test your function LCG by first manually computing X_1, ..., X_6 for m = 5, a = 1, c = 3 and X_0 = 0 and then comparing the result to the output of LCG(6,5,1,3,0).

Exercise E-2.2. Given a sequence X = (X_1, ..., X_n) of U[0, 1]-distributed (pseudo) random numbers, we can use a scatter plot of (X_i, X_{i+1}) for i = 1, ..., n-1 in order to try to assess whether the X_i are independent.

a) Try creating such a plot using the built-in random-number generator of R:

X <- runif(1000)
plot(X[1:999], X[2:1000], asp=1)

Can you explain the resulting plot?

b) Create a similar plot, using your function LCG from exercise 2.1:

m <- 81
a <- 1
c <- 8
seed <- 0
X <- LCG(1000, m, a, c, seed)/m
plot(X[1:999], X[2:1000], asp=1)

Discuss the resulting picture.

c) Repeat the experiment from part b using the parameters m = 1024, a = 401, c = 101 and m = 2^32, a = ..., c = ... (the parameters used in figure 2.1). Discuss the results.

Exercise E-2.3. Let X, Y ~ U[0, 1] be independent. Using definition 2.8 of the uniform distribution on a set, show that (X, Y) is uniformly distributed on the square Q = [0, 1] × [0, 1].

Exercise E-2.4. Let f and g be two probability densities and c ∈ R with f(x) ≤ c g(x) for all x. Show that c ≥ 1 and that c = 1 is only possible for f = g (except possibly on sets with measure 0).

Exercise E-2.5 (rejection sampling for the normal distribution).

a) Work out the optimal constant c for the rejection sampling algorithm when the proposals are Exp(1)-distributed and the target density is given by

    f(x) = √(2/π) exp(-x^2/2)  if x ≥ 0,  and  f(x) = 0  else.

b) Implement the resulting method. Test your program by generating a histogram of the output and by comparing the histogram to the density f.

c) How can the program from part b be modified to generate standard normally distributed random numbers? Hint: Have a look at exercise 2.5.

Exercise E-2.6. In this exercise we will see that the rejection sampling algorithm can still be applied when the density of the target distribution is only known up to a constant. Let f̃: R → [0, ∞) be a function. Define

    f(x) = (1/Z) f̃(x)    for all x ∈ R,

where Z = ∫_R f̃(x) dx. Then f is a probability density.

a) Assume that we can generate proposals X_1, X_2, ... with density g where f̃(x) ≤ c g(x) for all x ∈ R. State the rejection sampling algorithm for sampling from the density f. Write the algorithm in terms of f̃ instead of f and show that the method can be applied even if the value of Z is unknown.

b) Write a program to generate samples of the probability distribution with density f(x) = (1/Z) exp(cos(x)) on the interval [0, 2π]. How can you test your program?

Exercise E-2.7. Consider n probability distributions P_1, ..., P_n with cumulative distribution functions F_1, ..., F_n. Furthermore let θ_1, ..., θ_n > 0 such that Σ_{i=1}^n θ_i = 1. Then the mixture of P_1, ..., P_n with weights θ_1, ..., θ_n is the distribution P_θ with CDF

    F_θ(x) = Σ_{i=1}^n θ_i F_i(x)    for all x ∈ R.

a) Convince yourself that P_θ is the distribution obtained by the following two-step procedure: first choose randomly one of the distributions P_1, ..., P_n, where P_i is chosen with probability θ_i, and then, independently, take a sample of the chosen distribution.

b) Show that, if P_1, ..., P_n have densities f_1, ..., f_n, then P_θ also has a density, which is given by

    f_θ = Σ_{i=1}^n θ_i f_i.

c) Write an R program which generates samples from the mixture of P_1 = N(1, 0.01), P_2 = N(2, 0.04) and P_3 = N(4, 0.01) with weights θ_1 = 0.5, θ_2 = 0.2 and θ_3 = 0.3. Generate a plot, similar to the one in figure 2.5, showing a histogram of 4000 samples of this distribution, together with the density of the mixture. Hint: relevant R functions include rnorm, dnorm, sample, hist, and curve.


Chapter 3
Monte Carlo Methods

Monte Carlo methods are computational methods where one examines properties of a probability distribution by generating a large sample from the given distribution and then studying the statistical properties of this sample. In this chapter we consider various Monte Carlo methods to numerically approximate an expectation E(f(X)) where f is a real-valued function.

3.1 Basic Monte Carlo Integration

Let X be a random variable and f be a real-valued function. Then there are several different methods to compute the expectation E(f(X)):

a) Sometimes we can find the answer analytically. For example, when the distribution of X has a density ϕ, we can use the relation

    E(f(X)) = ∫ f(x) ϕ(x) dx    (3.1)

to obtain the value of the expectation (see appendix A.3). This method only works if we can solve the resulting integral.

b) If the integral in (3.1) cannot be solved analytically, we can try to use numerical integration to get an approximation to the value of the integral. When X takes values in a low-dimensional space, this method often works well, but for higher dimensional spaces numerical approximation can become very expensive and the resulting method may no longer be efficient. Since numerical integration is outside the topic of statistical computing, we will not follow this approach here.

c) The technique we will study in this chapter is called Monte Carlo integration. This technique is based on the strong law of large numbers (see theorem A.9 in the appendix): if (X_i)_{i∈N} is a sequence of i.i.d. random variables with the same distribution as X, then

    lim_{N→∞} (1/N) Σ_{i=1}^N f(X_i) = E(f(X))    (3.2)

almost surely. While exact equality only holds in the limit N → ∞, we can use the approximation

    E(f(X)) ≈ (1/N) Σ_{i=1}^N f(X_i)    (3.3)

when N is large. Note that the estimate on the right-hand side is constructed from the random values X_i and thus is a random quantity itself.

The term "Monte Carlo" in the name of the last of these methods is in reference to the Monte Carlo casino (roulette tables are random number generators, in a sense), and the word "integration" refers to the equivalence (3.1) between computing expectations and computing integrals.

The random variables X_i in (3.3) are sometimes called i.i.d. copies of X. Since, formally, X itself is not used to compute the Monte Carlo estimate, sometimes the random variable X is not explicitly introduced and one writes E(f(X_1)) instead of E(f(X)). Since X_1 has the same distribution as X, these two expressions have the same value. Of course any other X_i could also be used instead of X_1.

For completeness we reformulate formula (3.3) as an algorithm:

Algorithm MC (Monte Carlo integration)
input: a function f, N ∈ N.
randomness used: an i.i.d. sequence (X_i)_{i∈N} of random variables.
output: an estimate for E(f(X_1)).
1: s ← 0
2: for i = 1, 2, ..., N do
3:   s ← s + f(X_i)
4: end for
5: return s/N

Example 3.1. For f(x) = x, the Monte Carlo estimate (3.3) reduces to

    E(X) ≈ (1/N) Σ_{i=1}^N X_i = X̄.

This is just the usual estimator for the mean.

Example 3.2. Assume that for some reason we need to compute E(sin(X)) where X ~ N(µ, σ^2). Obtaining this value analytically will be difficult, but we can easily get an approximation using Monte Carlo integration: if we choose N big and generate independent, N(µ, σ^2)-distributed random variables X_1, X_2, ..., X_N, then by the strong law of large numbers we have

    E(sin(X)) ≈ (1/N) Σ_{i=1}^N sin(X_i).

The right-hand side of this approximation can be easily evaluated using a computer program, giving an estimate for the required expectation.
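A minimal R sketch of example 3.2; the values µ = 1 and σ = 2 are arbitrary illustrative choices:

set.seed(1)
N <- 100000
mu <- 1
sigma <- 2
X <- rnorm(N, mean = mu, sd = sigma)
mean(sin(X))   # Monte Carlo estimate of E(sin(X))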

The basic Monte Carlo method above is presented only for the case of computing an expectation. The following arguments show how the method can be applied to compute probabilities or to determine the value of an integral.

We can compute probabilities using Monte Carlo integration by rewriting them as expectations: if X is a random variable, we have P(X ∈ A) = E(1_A(X)) (see appendix A.3). Thus, we can estimate P(X ∈ A) using Monte Carlo integration by

    P(X ∈ A) = E(1_A(X)) ≈ (1/N) Σ_{i=1}^N 1_A(X_i),    (3.4)

where the X_i are i.i.d. copies of X and N is big.

We can compute integrals like ∫_a^b f(x) dx using Monte Carlo integration by utilising the relation (3.1): let U_i ~ U[a, b]. Then the density of U_i is ϕ(x) = (1/(b-a)) 1_{[a,b]}(x). We get

    ∫_a^b f(x) dx = (b-a) ∫ f(x) ϕ(x) dx = (b-a) E(f(U)) ≈ ((b-a)/N) Σ_{i=1}^N f(U_i)    (3.5)

for big N.

Example 3.3. Let X ~ N(0, 1) and a ∈ R. Then the probability P(X ≥ a) cannot be computed analytically, but

    P(X ≥ a) ≈ (1/N) Σ_{i=1}^N 1_{X_i ≥ a}    (3.6)

can be used as an approximation.

Example 3.4. The method from (3.5) allows us to derive a different method for computing the probability P(X ≥ a) from example 3.3. Assuming a ≥ 0, we know

    P(X ≥ a) = ∫_a^∞ (1/√(2π)) e^{-x^2/2} dx = 1/2 - (1/√(2π)) ∫_0^a e^{-x^2/2} dx.

Using (3.5), we can approximate this by

    P(X ≥ a) ≈ 1/2 - (a / (√(2π) N)) Σ_{i=1}^N e^{-U_i^2/2},

where U_i ~ U[0, a] are i.i.d. For the case a < 0 we can use the relation P(X ≥ a) = 1 - P(X ≥ -a).

With the method discussed so far, an open question is still how big a value of N we should choose. On the one hand, the bigger N is, the more accurate our estimates get; as N → ∞ the Monte Carlo approximation converges to the correct answer. But on the other hand, the bigger N is, the more expensive the method becomes, because we need to generate and process more samples X_i. Key to choosing N are the following two equations:

- The expectation of the Monte Carlo estimate (3.3) for E(f(X)) is given by

    E((1/N) Σ_{i=1}^N f(X_i)) = (1/N) Σ_{i=1}^N E(f(X_i)) = E(f(X)).

  Therefore the estimate is unbiased for every N.

- Since the X_i are independent, the variance of the Monte Carlo estimate is given by

    Var((1/N) Σ_{i=1}^N f(X_i)) = (1/N^2) Σ_{i=1}^N Var(f(X_i)) = (1/N) Var(f(X)).    (3.7)

  Therefore the variance, which controls the fluctuations of our estimates around the correct value, converges to 0 as N → ∞.

Since the estimate is unbiased, the magnitude of a typical estimation error will be given by the standard deviation of the estimate; therefore we expect the error of a Monte Carlo estimate to decay like 1/√N. The fact that the error of Monte Carlo integration decays only like 1/√N means that in practice huge numbers of samples can be required. To increase accuracy by a factor of 10, i.e. to get one more significant digit of the result, one needs to increase the number of samples by a factor of 100.

Example 3.5. Assume that U ~ U[0, 1] and that we want to estimate E(U^2) using Monte Carlo integration. Then we have

    Var((1/N) Σ_{i=1}^N f(U_i)) = (1/N) Var(U^2).

To compute this variance we note

    E(U^2) = ∫_0^1 u^2 du = [u^3/3]_{u=0}^1 = 1/3

and

    E((U^2)^2) = ∫_0^1 u^4 du = [u^5/5]_{u=0}^1 = 1/5.

Therefore, Var(U^2) = 1/5 - (1/3)^2 = 4/45 and the variance of the Monte Carlo estimate is

    Var((1/N) Σ_{i=1}^N f(X_i)) = (1/N) · (4/45) = 4/(45 N).

We can use relation (3.7) to determine which value of N is required to achieve a given precision. We can either directly prescribe the standard deviation of the estimate, or alternatively we can use estimates like the following: by Chebyshev's inequality (see theorem A.8) we have

    P(|(1/N) Σ_{i=1}^N f(X_i) - E(f(X))| ≥ ε) ≤ (1/ε^2) Var((1/N) Σ_{i=1}^N f(X_i))

and thus, by (3.7),

    P(|(1/N) Σ_{i=1}^N f(X_i) - E(f(X))| ≥ ε) ≤ Var(f(X)) / (N ε^2).

Consequently, by choosing N ≥ Var(f(X)) / (α ε^2) we can achieve

    P(|(1/N) Σ_{i=1}^N f(X_i) - E(f(X))| ≥ ε) ≤ α.
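The variance formula (3.7) and the value 4/45 from example 3.5 can be checked empirically; a minimal R sketch:

set.seed(1)
N <- 1000
estimates <- replicate(2000, mean(runif(N)^2))  # 2000 independent Monte Carlo estimates of E(U^2)
sd(estimates)          # observed spread of the estimates
sqrt(4 / (45 * N))     # theoretical standard deviation, approximately 0.00943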

Example 3.6. Assume Var(f(X)) = 1. In order to estimate E(f(X)) so that the error is at most ε = 0.01 with probability at least 1 - α = 95%, we can use a Monte Carlo estimate with

    N ≥ Var(f(X)) / (α ε^2) = 1 / (0.05 · (0.01)^2) = 200 000

samples.

Example 3.7 (estimating probabilities I). Let X be a real-valued random variable and A ⊆ R. Then we can use (3.4) to estimate P(X ∈ A). The error of this method is determined by the variance

    Var(1_A(X)) = E(1_A(X)^2) - E(1_A(X))^2 = P(X ∈ A) - P(X ∈ A)^2.

In the preceding estimates we assumed that the value Var(f(X)) is known, whereas in practice this is normally not the case (since we need to use Monte Carlo integration to estimate the expectation of f(X), it seems unlikely that the variance is known explicitly). There are various ways to work around this problem:

- Sometimes an upper bound for Var(f(X)) is known, which can be used in place of the true variance.

- One can try a two-step procedure where first Monte Carlo integration with a fixed N (say N = 1000) is used to estimate Var(f(X)). Then, in a second step, one can use estimates as above to determine a value of N for use in determining the required expectation. (A small R sketch of this procedure is given at the end of this section.)

- Sequential methods can be used where one generates samples X_i one by one and estimates the standard deviation of the generated f(X_i) from time to time. The procedure is continued until the estimated standard deviation falls below a prescribed limit.

In all the error estimates above, we used the fact that f(X) has finite variance. As Var(f(X)) gets bigger, convergence to the correct result gets slower and the required number of samples to obtain a given error increases. Since the strong law of large numbers does not require the random variables to have finite variance, equation (3.2) still holds for Var(f(X)) = ∞, but convergence in (3.3) will be extremely slow and the resulting method will not be very useful in this case.
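A minimal R sketch of the two-step procedure from the list above; the choices f = sin and X ~ N(0, 1) are illustrative assumptions:

set.seed(1)
f <- sin
rX <- function(n) rnorm(n)          # sampler for X (assumed here for illustration)

pilot <- f(rX(1000))                # step 1: pilot run with fixed N = 1000
v <- var(pilot)                     # estimate of Var(f(X))

eps <- 0.01; alpha <- 0.05
N <- ceiling(v / (alpha * eps^2))   # step 2: choose N from the Chebyshev bound
mean(f(rX(N)))                      # final Monte Carlo estimate of E(f(X))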

3.2 Variance Reduction Methods

As we have seen, the efficiency of Monte Carlo integration is determined by the variance of the estimate: the higher the variance, the more samples are required to obtain a given accuracy. This section describes methods to improve efficiency by considering modified Monte Carlo methods which, for a given sample size N, achieve a lower variance of the Monte Carlo estimate.

3.2.1 Importance Sampling

The importance sampling method is based on the following argument. Assume that X is a random variable with density ϕ, that f is a real-valued function and that ψ is another probability density with ψ(x) > 0 whenever f(x)ϕ(x) > 0. Then we have

    E(f(X)) = ∫ f(x) ϕ(x) dx = ∫ (f(x) ϕ(x) / ψ(x)) ψ(x) dx,

where we define the fraction to be 0 whenever the denominator (and thus the numerator) equals 0. Since ψ is a probability density, the integral on the right can be written as an expectation again: if Y has density ψ, we have

    E(f(X)) = E(f(Y) ϕ(Y) / ψ(Y)).

Using basic Monte Carlo integration, we can estimate this expectation as

    E(f(X)) ≈ (1/N) Σ_{i=1}^N f(Y_i) ϕ(Y_i) / ψ(Y_i),    (3.8)

where the Y_i are i.i.d. copies of Y. This approximation can be used as an alternative to the basic Monte Carlo approximation (3.3). We can write the resulting method as an algorithm as follows:

Algorithm IMP (importance sampling)
input: a function f, the density ϕ of X, an auxiliary density ψ, N ∈ N.
randomness used: an i.i.d. sequence (Y_i)_{i∈N} with density ψ.
output: an estimate for E(f(X)).
1: s ← 0
2: for i = 1, 2, ..., N do
3:   s ← s + f(Y_i) ϕ(Y_i) / ψ(Y_i)
4: end for
5: return s/N

This method is a generalisation of the basic Monte Carlo method: if we choose ψ = ϕ, the two densities in (3.8) cancel, the Y_i are i.i.d. copies of X and, for this case, the method is identical to basic Monte Carlo integration. The usefulness of importance sampling lies in the fact that we can choose the density ψ (and thus the Y_i) in order to maximise efficiency. As we have seen, the error of the Monte Carlo estimate is determined by the variance

    Var((1/N) Σ_{i=1}^N f(Y_i) ϕ(Y_i) / ψ(Y_i)) = (1/N) Var(f(Y) ϕ(Y) / ψ(Y)).

Therefore, the method is efficient if both of the following criteria are satisfied:

a) The Y_i can be generated efficiently.

b) The variance Var(f(Y) ϕ(Y) / ψ(Y)) is small.

To find out how ψ can be chosen to maximise efficiency, we first consider the extreme case where fϕ/ψ is constant: in this case, the variance of the Monte Carlo estimate is 0 and therefore there is no error at all! In this case we have

    ψ(x) = (1/c) f(x) ϕ(x)    (3.9)

for all x, where c is the constant value of fϕ/ψ. We can find the value of c by using the fact that ψ is a probability density:

    c = c ∫ ψ(x) dx = ∫ f(x) ϕ(x) dx = E(f(X)).

Therefore, in order to choose ψ as in (3.9) we would have to already have solved the problem of computing the expectation E(f(X)), and thus we don't get a useful method for this case. Still, this boundary case offers some guidance: since we get optimal efficiency if ψ is chosen proportional to fϕ, we can expect that we will get good efficiency if we choose ψ to be approximately proportional to the function fϕ.

Example 3.8 (estimating probabilities II). Let X be a real-valued random variable and A ⊆ R. Then the importance sampling method for estimating the probability P(X ∈ A) is given by

    P(X ∈ A) = E(1_A(X)) ≈ (1/N) Σ_{i=1}^N 1_A(Y_i) ϕ(Y_i) / ψ(Y_i),

where ψ is a probability density with ψ(x) > 0 for all x ∈ A with ϕ(x) > 0, and (Y_i)_{i∈N} is a sequence of i.i.d. random variables with density ψ. The error of the method is determined by the variance

    Var(1_A(Y) ϕ(Y) / ψ(Y)) = E(1_A(Y)^2 ϕ(Y)^2 / ψ(Y)^2) - E(1_A(Y) ϕ(Y) / ψ(Y))^2
                            = ∫_A (ϕ(x)^2 / ψ(x)^2) ψ(x) dx - (∫_A (ϕ(x) / ψ(x)) ψ(x) dx)^2
                            = ∫_A (ϕ(x) / ψ(x)) ϕ(x) dx - (∫_A ϕ(x) dx)^2
                            = E(1_A(X) ϕ(X) / ψ(X)) - P(X ∈ A)^2.    (3.10)

In example 3.7 we found that the error for the basic Monte Carlo method is determined by

    Var(1_A(X)) = E(1_A(X)) - P(X ∈ A)^2.

Comparing this to (3.10), we see that the importance sampling method will have a smaller variance than basic Monte Carlo integration if we can choose ψ such that ψ > ϕ on the set A.
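A minimal R sketch of importance sampling in the setting of example 3.8; the target probability P(X > 4) for X ~ N(0, 1) and the proposal density ψ = N(4, 1) are illustrative choices, made so that ψ is large where the indicator equals 1:

set.seed(1)
N <- 10000
Y <- rnorm(N, mean = 4, sd = 1)          # proposals with density psi
w <- dnorm(Y, 0, 1) / dnorm(Y, 4, 1)     # weights phi(Y_i) / psi(Y_i)
mean((Y > 4) * w)                        # importance sampling estimate of P(X > 4)
pnorm(4, lower.tail = FALSE)             # exact value, about 3.17e-05
mean(rnorm(N) > 4)                       # basic Monte Carlo estimate, typically 0 for this N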

3.2.2 Antithetic Variables

The antithetic variables method (also called the antithetic variates method) reduces the variance, and thus the error, of Monte Carlo estimates by using pairwise dependent samples X_i instead of independent samples.

For illustration, we first consider the case N = 2: assume that X_1 and X_2 are identically distributed random variables, which are not independent. As for the independent case we have

    E((f(X_1) + f(X_2)) / 2) = (E(f(X_1)) + E(f(X_2))) / 2 = E(f(X)),

but for the variance we get

    Var((f(X_1) + f(X_2)) / 2) = (Var(f(X_1)) + 2 Cov(f(X_1), f(X_2)) + Var(f(X_2))) / 4
                               = (1/2) Var(f(X)) + (1/2) Cov(f(X_1), f(X_2)).

Compared to the independent case, an additional covariance term (1/2) Cov(f(X_1), f(X_2)) is present. The idea of the antithetic variables method is to construct X_1 and X_2 such that Cov(f(X_1), f(X_2)) is negative, thereby reducing the total variance.

If we construct the X_i using the inverse transform method, we can proceed as follows: let F be the distribution function of X and let U ~ U[0, 1]. Then 1 - U ~ U[0, 1] and we can use X_1 = F^{-1}(U) and X_2 = F^{-1}(1 - U) to generate X_1, X_2 ~ F. Since F^{-1} is monotonically increasing, X_1 increases as U increases while X_2 decreases as U increases. Therefore one would expect X_1 and X_2 to be negatively correlated (the following lemma shows that this is indeed the case), and consequently the variance of (X_1 + X_2)/2 will be smaller than one would expect for independent random variables.

Lemma 3.9. Let g: R → R be monotonically increasing and U ~ U[0, 1]. Then Cov(g(U), g(1 - U)) ≤ 0.

Proof. Let V ~ U[0, 1], independent of U. We distinguish two cases: if U ≤ V we have g(U) ≤ g(V) and g(1 - U) ≥ g(1 - V). Otherwise, if U > V, we have g(U) ≥ g(V) and g(1 - U) ≤ g(1 - V). Thus, in both cases we have

    (g(U) - g(V)) (g(1 - U) - g(1 - V)) ≤ 0

and consequently

    Cov(g(U), g(1 - U)) = E(g(U) g(1 - U)) - E(g(U)) E(g(1 - U))
                        = (1/2) E(g(U) g(1 - U) + g(V) g(1 - V) - g(U) g(1 - V) - g(V) g(1 - U))
                        = (1/2) E((g(U) - g(V)) (g(1 - U) - g(1 - V)))
                        ≤ 0.

This completes the proof. (q.e.d.)

If f is monotonically increasing, we can apply this lemma for g(u) = f(F^{-1}(u)). In this case we have f(X_1) = f(F^{-1}(U)) = g(U) and f(X_2) = f(F^{-1}(1 - U)) = g(1 - U), and the lemma allows us to conclude Cov(f(X_1), f(X_2)) ≤ 0.

For N > 2, assuming N is even, we group the samples X_i into pairs and construct every pair as above: let U_j ~ U[0, 1] be i.i.d. and for j ∈ N define X_{2j-1} = F^{-1}(U_j) and X_{2j} = F^{-1}(1 - U_j). Then X_i ~ F for all i ∈ N and we can use the approximation

    E(f(X)) ≈ (1/N) Σ_{i=1}^N f(X_i).

We can write the resulting method as an algorithm as follows.

Algorithm ANT (antithetic variables)
input: a function f, the inverse F^{-1} of the distribution function of X, N ∈ N even.
randomness used: a sequence (U_1, U_2, ..., U_{N/2}) of i.i.d. U[0, 1] random variables.
output: an estimate for E(f(X)).
1: s ← 0
2: for j = 1, 2, ..., N/2 do
3:   s ← s + f(F^{-1}(U_j)) + f(F^{-1}(1 - U_j))
4: end for
5: return s/N

While lemma 3.9 only guarantees that the variance for the antithetic variables method is smaller than or equal to the variance of standard Monte Carlo integration, without giving any quantitative bounds, in practice often a significant reduction of variance is observed.

Example 3.10. Assume that X ~ U[0, 1] and that we want to estimate E(X^2) using the antithetic variables method. Since the distribution function of X is F(x) = x, the antithetic samples we get with the method above are X_{2j-1} = U_j and X_{2j} = 1 - U_j, so that f(X_{2j-1}) = U_j^2 and f(X_{2j}) = (1 - U_j)^2. Thus we have

    Var((1/N) Σ_{i=1}^N f(X_i)) = (1/N^2) (N/2) Var(U^2 + (1 - U)^2)
                               = (1/(2N)) Var(2U^2 - 2U + 1)
                               = (2/N) Var(U^2 - U),

where U ~ U[0, 1]. To compute this variance we note

    E(U^2 - U) = ∫_0^1 (u^2 - u) du = [u^3/3 - u^2/2]_{u=0}^1 = 1/3 - 1/2 = -1/6

and

    E((U^2 - U)^2) = ∫_0^1 (u^4 - 2u^3 + u^2) du = [u^5/5 - u^4/2 + u^3/3]_{u=0}^1 = 1/5 - 1/2 + 1/3 = 1/30.

Therefore, Var(U^2 - U) = 1/30 - (1/6)^2 = 1/180 and the variance of the antithetic variables estimate is

    Var((1/N) Σ_{i=1}^N f(X_i)) = (2/N) Var(U^2 - U) = 1/(90 N).

This is an eight-fold reduction compared to the value 4/(45 N) found in example 3.5.
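The variances computed in examples 3.5 and 3.10 can be checked empirically; a minimal R sketch:

set.seed(1)
N <- 1000
mc  <- replicate(2000, mean(runif(N)^2))                          # basic Monte Carlo estimates
ant <- replicate(2000, { U <- runif(N / 2); mean(c(U^2, (1 - U)^2)) })  # antithetic estimates
var(mc)    # close to 4 / (45 N)  = 8.9e-05
var(ant)   # close to 1 / (90 N)  = 1.1e-05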

Uniform Random Number Generators

Uniform Random Number Generators JHU 553.633/433: Monte Carlo Methods J. C. Spall 25 September 2017 CHAPTER 2 RANDOM NUMBER GENERATION Motivation and criteria for generators Linear generators (e.g., linear congruential generators) Multiple

More information

Random Number Generation. Stephen Booth David Henty

Random Number Generation. Stephen Booth David Henty Random Number Generation Stephen Booth David Henty Introduction Random numbers are frequently used in many types of computer simulation Frequently as part of a sampling process: Generate a representative

More information

Simulation. Where real stuff starts

Simulation. Where real stuff starts 1 Simulation Where real stuff starts ToC 1. What is a simulation? 2. Accuracy of output 3. Random Number Generators 4. How to sample 5. Monte Carlo 6. Bootstrap 2 1. What is a simulation? 3 What is a simulation?

More information

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( ) Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca October 22nd, 2014 E. Tanré (INRIA - Team Tosca) Mathematical

More information

( ) ( ) Monte Carlo Methods Interested in. E f X = f x d x. Examples:

( ) ( ) Monte Carlo Methods Interested in. E f X = f x d x. Examples: Monte Carlo Methods Interested in Examples: µ E f X = f x d x Type I error rate of a hypothesis test Mean width of a confidence interval procedure Evaluating a likelihood Finding posterior mean and variance

More information

Simulation. Where real stuff starts

Simulation. Where real stuff starts Simulation Where real stuff starts March 2019 1 ToC 1. What is a simulation? 2. Accuracy of output 3. Random Number Generators 4. How to sample 5. Monte Carlo 6. Bootstrap 2 1. What is a simulation? 3

More information

Random Number Generation. CS1538: Introduction to simulations

Random Number Generation. CS1538: Introduction to simulations Random Number Generation CS1538: Introduction to simulations Random Numbers Stochastic simulations require random data True random data cannot come from an algorithm We must obtain it from some process

More information

2 Random Variable Generation

2 Random Variable Generation 2 Random Variable Generation Most Monte Carlo computations require, as a starting point, a sequence of i.i.d. random variables with given marginal distribution. We describe here some of the basic methods

More information

Lecture 20. Randomness and Monte Carlo. J. Chaudhry. Department of Mathematics and Statistics University of New Mexico

Lecture 20. Randomness and Monte Carlo. J. Chaudhry. Department of Mathematics and Statistics University of New Mexico Lecture 20 Randomness and Monte Carlo J. Chaudhry Department of Mathematics and Statistics University of New Mexico J. Chaudhry (UNM) CS 357 1 / 40 What we ll do: Random number generators Monte-Carlo integration

More information

Generating pseudo- random numbers

Generating pseudo- random numbers Generating pseudo- random numbers What are pseudo-random numbers? Numbers exhibiting statistical randomness while being generated by a deterministic process. Easier to generate than true random numbers?

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

Monte Carlo. Lecture 15 4/9/18. Harvard SEAS AP 275 Atomistic Modeling of Materials Boris Kozinsky

Monte Carlo. Lecture 15 4/9/18. Harvard SEAS AP 275 Atomistic Modeling of Materials Boris Kozinsky Monte Carlo Lecture 15 4/9/18 1 Sampling with dynamics In Molecular Dynamics we simulate evolution of a system over time according to Newton s equations, conserving energy Averages (thermodynamic properties)

More information

Generating Uniform Random Numbers

Generating Uniform Random Numbers 1 / 43 Generating Uniform Random Numbers Christos Alexopoulos and Dave Goldsman Georgia Institute of Technology, Atlanta, GA, USA March 1, 2016 2 / 43 Outline 1 Introduction 2 Some Generators We Won t

More information

Random processes and probability distributions. Phys 420/580 Lecture 20

Random processes and probability distributions. Phys 420/580 Lecture 20 Random processes and probability distributions Phys 420/580 Lecture 20 Random processes Many physical processes are random in character: e.g., nuclear decay (Poisson distributed event count) P (k, τ) =

More information

Workshop on Heterogeneous Computing, 16-20, July No Monte Carlo is safe Monte Carlo - more so parallel Monte Carlo

Workshop on Heterogeneous Computing, 16-20, July No Monte Carlo is safe Monte Carlo - more so parallel Monte Carlo Workshop on Heterogeneous Computing, 16-20, July 2012 No Monte Carlo is safe Monte Carlo - more so parallel Monte Carlo K. P. N. Murthy School of Physics, University of Hyderabad July 19, 2012 K P N Murthy

More information

Monte Carlo Integration I [RC] Chapter 3

Monte Carlo Integration I [RC] Chapter 3 Aula 3. Monte Carlo Integration I 0 Monte Carlo Integration I [RC] Chapter 3 Anatoli Iambartsev IME-USP Aula 3. Monte Carlo Integration I 1 There is no exact definition of the Monte Carlo methods. In the

More information

CPSC 531: Random Numbers. Jonathan Hudson Department of Computer Science University of Calgary

CPSC 531: Random Numbers. Jonathan Hudson Department of Computer Science University of Calgary CPSC 531: Random Numbers Jonathan Hudson Department of Computer Science University of Calgary http://www.ucalgary.ca/~hudsonj/531f17 Introduction In simulations, we generate random values for variables

More information

Sources of randomness

Sources of randomness Random Number Generator Chapter 7 In simulations, we generate random values for variables with a specified distribution Ex., model service times using the exponential distribution Generation of random

More information

Contents. 1 Probability review Introduction Random variables and distributions Convergence of random variables...

Contents. 1 Probability review Introduction Random variables and distributions Convergence of random variables... Contents Probability review. Introduction..............................2 Random variables and distributions................ 3.3 Convergence of random variables................. 6 2 Monte Carlo methods

More information

Monte Carlo integration (naive Monte Carlo)

Monte Carlo integration (naive Monte Carlo) Monte Carlo integration (naive Monte Carlo) Example: Calculate π by MC integration [xpi] 1/21 INTEGER n total # of points INTEGER i INTEGER nu # of points in a circle REAL x,y coordinates of a point in

More information

Generating Random Variables

Generating Random Variables Generating Random Variables These slides are created by Dr. Yih Huang of George Mason University. Students registered in Dr. Huang's courses at GMU can make a single machine-readable copy and print a single

More information

Scientific Computing: Monte Carlo

Scientific Computing: Monte Carlo Scientific Computing: Monte Carlo Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course MATH-GA.2043 or CSCI-GA.2112, Spring 2012 April 5th and 12th, 2012 A. Donev (Courant Institute)

More information

Markov Chain Monte Carlo Lecture 1

Markov Chain Monte Carlo Lecture 1 What are Monte Carlo Methods? The subject of Monte Carlo methods can be viewed as a branch of experimental mathematics in which one uses random numbers to conduct experiments. Typically the experiments

More information

Monte Carlo Methods. Geoff Gordon February 9, 2006

Monte Carlo Methods. Geoff Gordon February 9, 2006 Monte Carlo Methods Geoff Gordon ggordon@cs.cmu.edu February 9, 2006 Numerical integration problem 5 4 3 f(x,y) 2 1 1 0 0.5 0 X 0.5 1 1 0.8 0.6 0.4 Y 0.2 0 0.2 0.4 0.6 0.8 1 x X f(x)dx Used for: function

More information

Monte Carlo Methods in Engineering

Monte Carlo Methods in Engineering Monte Carlo Methods in Engineering Mikael Amelin Draft version KTH Royal Institute of Technology Electric Power Systems Stockholm 2013 PREFACE This compendium describes how Monte Carlo methods can be

More information

Generating Uniform Random Numbers

Generating Uniform Random Numbers 1 / 41 Generating Uniform Random Numbers Christos Alexopoulos and Dave Goldsman Georgia Institute of Technology, Atlanta, GA, USA 10/13/16 2 / 41 Outline 1 Introduction 2 Some Lousy Generators We Won t

More information

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning for for Advanced Topics in California Institute of Technology April 20th, 2017 1 / 50 Table of Contents for 1 2 3 4 2 / 50 History of methods for Enrico Fermi used to calculate incredibly accurate predictions

More information

How does the computer generate observations from various distributions specified after input analysis?

How does the computer generate observations from various distributions specified after input analysis? 1 How does the computer generate observations from various distributions specified after input analysis? There are two main components to the generation of observations from probability distributions.

More information

Review of Statistical Terminology

Review of Statistical Terminology Review of Statistical Terminology An experiment is a process whose outcome is not known with certainty. The experiment s sample space S is the set of all possible outcomes. A random variable is a function

More information

Modern Methods of Data Analysis - WS 07/08

Modern Methods of Data Analysis - WS 07/08 Modern Methods of Data Analysis Lecture III (29.10.07) Contents: Overview & Test of random number generators Random number distributions Monte Carlo The terminology Monte Carlo-methods originated around

More information

Exercises Tutorial at ICASSP 2016 Learning Nonlinear Dynamical Models Using Particle Filters

Exercises Tutorial at ICASSP 2016 Learning Nonlinear Dynamical Models Using Particle Filters Exercises Tutorial at ICASSP 216 Learning Nonlinear Dynamical Models Using Particle Filters Andreas Svensson, Johan Dahlin and Thomas B. Schön March 18, 216 Good luck! 1 [Bootstrap particle filter for

More information

Preliminary statistics

Preliminary statistics 1 Preliminary statistics The solution of a geophysical inverse problem can be obtained by a combination of information from observed data, the theoretical relation between data and earth parameters (models),

More information

Class 12. Random Numbers

Class 12. Random Numbers Class 12. Random Numbers NRiC 7. Frequently needed to generate initial conditions. Often used to solve problems statistically. How can a computer generate a random number? It can t! Generators are pseudo-random.

More information

Chapter 4: Monte Carlo Methods. Paisan Nakmahachalasint

Chapter 4: Monte Carlo Methods. Paisan Nakmahachalasint Chapter 4: Monte Carlo Methods Paisan Nakmahachalasint Introduction Monte Carlo Methods are a class of computational algorithms that rely on repeated random sampling to compute their results. Monte Carlo

More information

Generating Uniform Random Numbers

Generating Uniform Random Numbers 1 / 44 Generating Uniform Random Numbers Christos Alexopoulos and Dave Goldsman Georgia Institute of Technology, Atlanta, GA, USA 10/29/17 2 / 44 Outline 1 Introduction 2 Some Lousy Generators We Won t

More information

So we will instead use the Jacobian method for inferring the PDF of functionally related random variables; see Bertsekas & Tsitsiklis Sec. 4.1.

So we will instead use the Jacobian method for inferring the PDF of functionally related random variables; see Bertsekas & Tsitsiklis Sec. 4.1. 2011 Page 1 Simulating Gaussian Random Variables Monday, February 14, 2011 2:00 PM Readings: Kloeden and Platen Sec. 1.3 Why does the Box Muller method work? How was it derived? The basic idea involves

More information

From Random Numbers to Monte Carlo. Random Numbers, Random Walks, Diffusion, Monte Carlo integration, and all that

From Random Numbers to Monte Carlo. Random Numbers, Random Walks, Diffusion, Monte Carlo integration, and all that From Random Numbers to Monte Carlo Random Numbers, Random Walks, Diffusion, Monte Carlo integration, and all that Random Walk Through Life Random Walk Through Life If you flip the coin 5 times you will

More information

Random number generators and random processes. Statistics and probability intro. Peg board example. Peg board example. Notes. Eugeniy E.

Random number generators and random processes. Statistics and probability intro. Peg board example. Peg board example. Notes. Eugeniy E. Random number generators and random processes Eugeniy E. Mikhailov The College of William & Mary Lecture 11 Eugeniy Mikhailov (W&M) Practical Computing Lecture 11 1 / 11 Statistics and probability intro

More information

Nonlife Actuarial Models. Chapter 14 Basic Monte Carlo Methods

Nonlife Actuarial Models. Chapter 14 Basic Monte Carlo Methods Nonlife Actuarial Models Chapter 14 Basic Monte Carlo Methods Learning Objectives 1. Generation of uniform random numbers, mixed congruential method 2. Low discrepancy sequence 3. Inversion transformation

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

F denotes cumulative density. denotes probability density function; (.)

F denotes cumulative density. denotes probability density function; (.) BAYESIAN ANALYSIS: FOREWORDS Notation. System means the real thing and a model is an assumed mathematical form for the system.. he probability model class M contains the set of the all admissible models

More information

Topics in Computer Mathematics

Topics in Computer Mathematics Random Number Generation (Uniform random numbers) Introduction We frequently need some way to generate numbers that are random (by some criteria), especially in computer science. Simulations of natural

More information

. Find E(V ) and var(v ).

. Find E(V ) and var(v ). Math 6382/6383: Probability Models and Mathematical Statistics Sample Preliminary Exam Questions 1. A person tosses a fair coin until she obtains 2 heads in a row. She then tosses a fair die the same number

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos Contents Markov Chain Monte Carlo Methods Sampling Rejection Importance Hastings-Metropolis Gibbs Markov Chains

More information

B. Maddah ENMG 622 Simulation 11/11/08

B. Maddah ENMG 622 Simulation 11/11/08 B. Maddah ENMG 622 Simulation 11/11/08 Random-Number Generators (Chapter 7, Law) Overview All stochastic simulations need to generate IID uniformly distributed on (0,1), U(0,1), random numbers. 1 f X (

More information

Answers and expectations

Answers and expectations Answers and expectations For a function f(x) and distribution P(x), the expectation of f with respect to P is The expectation is the average of f, when x is drawn from the probability distribution P E

More information

2008 Winton. Review of Statistical Terminology

2008 Winton. Review of Statistical Terminology 1 Review of Statistical Terminology 2 Formal Terminology An experiment is a process whose outcome is not known with certainty The experiment s sample space S is the set of all possible outcomes. A random

More information

Generating the Sample

Generating the Sample STAT 80: Mathematical Statistics Monte Carlo Suppose you are given random variables X,..., X n whose joint density f (or distribution) is specified and a statistic T (X,..., X n ) whose distribution you

More information

6 Markov Chain Monte Carlo (MCMC)

6 Markov Chain Monte Carlo (MCMC) 6 Markov Chain Monte Carlo (MCMC) The underlying idea in MCMC is to replace the iid samples of basic MC methods, with dependent samples from an ergodic Markov chain, whose limiting (stationary) distribution

More information

Lecture 5: Random numbers and Monte Carlo (Numerical Recipes, Chapter 7) Motivations for generating random numbers

Lecture 5: Random numbers and Monte Carlo (Numerical Recipes, Chapter 7) Motivations for generating random numbers Lecture 5: Random numbers and Monte Carlo (Numerical Recipes, Chapter 7) Motivations for generating random numbers To sample a function in a statistically controlled manner (i.e. for Monte Carlo integration)

More information

Review of Probability Theory

Review of Probability Theory Review of Probability Theory Arian Maleki and Tom Do Stanford University Probability theory is the study of uncertainty Through this class, we will be relying on concepts from probability theory for deriving

More information

Probabilistic Graphical Models

Probabilistic Graphical Models 10-708 Probabilistic Graphical Models Homework 3 (v1.1.0) Due Apr 14, 7:00 PM Rules: 1. Homework is due on the due date at 7:00 PM. The homework should be submitted via Gradescope. Solution to each problem

More information

Computer Intensive Methods in Mathematical Statistics

Computer Intensive Methods in Mathematical Statistics Computer Intensive Methods in Mathematical Statistics Department of mathematics johawes@kth.se Lecture 16 Advanced topics in computational statistics 18 May 2017 Computer Intensive Methods (1) Plan of

More information

Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan

Monte-Carlo MMD-MA, Université Paris-Dauphine. Xiaolu Tan Monte-Carlo MMD-MA, Université Paris-Dauphine Xiaolu Tan tan@ceremade.dauphine.fr Septembre 2015 Contents 1 Introduction 1 1.1 The principle.................................. 1 1.2 The error analysis

More information

(x 3)(x + 5) = (x 3)(x 1) = x + 5. sin 2 x e ax bx 1 = 1 2. lim

(x 3)(x + 5) = (x 3)(x 1) = x + 5. sin 2 x e ax bx 1 = 1 2. lim SMT Calculus Test Solutions February, x + x 5 Compute x x x + Answer: Solution: Note that x + x 5 x x + x )x + 5) = x )x ) = x + 5 x x + 5 Then x x = + 5 = Compute all real values of b such that, for fx)

More information

Simulated Annealing for Constrained Global Optimization

Simulated Annealing for Constrained Global Optimization Monte Carlo Methods for Computation and Optimization Final Presentation Simulated Annealing for Constrained Global Optimization H. Edwin Romeijn & Robert L.Smith (1994) Presented by Ariel Schwartz Objective

More information

Math Stochastic Processes & Simulation. Davar Khoshnevisan University of Utah

Math Stochastic Processes & Simulation. Davar Khoshnevisan University of Utah Math 5040 1 Stochastic Processes & Simulation Davar Khoshnevisan University of Utah Module 1 Generation of Discrete Random Variables Just about every programming language and environment has a randomnumber

More information

Numerical Methods I Monte Carlo Methods

Numerical Methods I Monte Carlo Methods Numerical Methods I Monte Carlo Methods Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 Course G63.2010.001 / G22.2420-001, Fall 2010 Dec. 9th, 2010 A. Donev (Courant Institute) Lecture

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

MATH Notebook 5 Fall 2018/2019

MATH Notebook 5 Fall 2018/2019 MATH442601 2 Notebook 5 Fall 2018/2019 prepared by Professor Jenny Baglivo c Copyright 2004-2019 by Jenny A. Baglivo. All Rights Reserved. 5 MATH442601 2 Notebook 5 3 5.1 Sequences of IID Random Variables.............................

More information

Monte Carlo Techniques

Monte Carlo Techniques Physics 75.502 Part III: Monte Carlo Methods 40 Monte Carlo Techniques Monte Carlo refers to any procedure that makes use of random numbers. Monte Carlo methods are used in: Simulation of natural phenomena

More information

University of Regina. Lecture Notes. Michael Kozdron

University of Regina. Lecture Notes. Michael Kozdron University of Regina Statistics 252 Mathematical Statistics Lecture Notes Winter 2005 Michael Kozdron kozdron@math.uregina.ca www.math.uregina.ca/ kozdron Contents 1 The Basic Idea of Statistics: Estimating

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Systems Simulation Chapter 7: Random-Number Generation

Systems Simulation Chapter 7: Random-Number Generation Systems Simulation Chapter 7: Random-Number Generation Fatih Cavdur fatihcavdur@uludag.edu.tr April 22, 2014 Introduction Introduction Random Numbers (RNs) are a necessary basic ingredient in the simulation

More information

Uniform random numbers generators

Uniform random numbers generators Uniform random numbers generators Lecturer: Dmitri A. Moltchanov E-mail: moltchan@cs.tut.fi http://www.cs.tut.fi/kurssit/tlt-2707/ OUTLINE: The need for random numbers; Basic steps in generation; Uniformly

More information

Brief introduction to Markov Chain Monte Carlo

Brief introduction to Markov Chain Monte Carlo Brief introduction to Department of Probability and Mathematical Statistics seminar Stochastic modeling in economics and finance November 7, 2011 Brief introduction to Content 1 and motivation Classical

More information

A simple algorithm that will generate a sequence of integers between 0 and m is:

A simple algorithm that will generate a sequence of integers between 0 and m is: Using the Pseudo-Random Number generator Generating random numbers is a useful technique in many numerical applications in Physics. This is because many phenomena in physics are random, and algorithms

More information

A Repetition Test for Pseudo-Random Number Generators

A Repetition Test for Pseudo-Random Number Generators Monte Carlo Methods and Appl., Vol. 12, No. 5-6, pp. 385 393 (2006) c VSP 2006 A Repetition Test for Pseudo-Random Number Generators Manuel Gil, Gaston H. Gonnet, Wesley P. Petersen SAM, Mathematik, ETHZ,

More information

Random Number Generators

Random Number Generators 1/18 Random Number Generators Professor Karl Sigman Columbia University Department of IEOR New York City USA 2/18 Introduction Your computer generates" numbers U 1, U 2, U 3,... that are considered independent

More information

Monte Carlo and cold gases. Lode Pollet.

Monte Carlo and cold gases. Lode Pollet. Monte Carlo and cold gases Lode Pollet lpollet@physics.harvard.edu 1 Outline Classical Monte Carlo The Monte Carlo trick Markov chains Metropolis algorithm Ising model critical slowing down Quantum Monte

More information

11 Division Mod n, Linear Integer Equations, Random Numbers, The Fundamental Theorem of Arithmetic

11 Division Mod n, Linear Integer Equations, Random Numbers, The Fundamental Theorem of Arithmetic 11 Division Mod n, Linear Integer Equations, Random Numbers, The Fundamental Theorem of Arithmetic Bezout s Lemma Let's look at the values of 4x + 6y when x and y are integers. If x is -6 and y is 4 we

More information

Stat 451 Lecture Notes Simulating Random Variables

Stat 451 Lecture Notes Simulating Random Variables Stat 451 Lecture Notes 05 12 Simulating Random Variables Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapter 6 in Givens & Hoeting, Chapter 22 in Lange, and Chapter 2 in Robert & Casella 2 Updated:

More information

CS 246 Review of Proof Techniques and Probability 01/14/19

CS 246 Review of Proof Techniques and Probability 01/14/19 Note: This document has been adapted from a similar review session for CS224W (Autumn 2018). It was originally compiled by Jessica Su, with minor edits by Jayadev Bhaskaran. 1 Proof techniques Here we

More information

Random number generators

Random number generators s generators Comp Sci 1570 Introduction to Outline s 1 2 s generator s The of a sequence of s or symbols that cannot be reasonably predicted better than by a random chance, usually through a random- generator

More information

Markov chain Monte Carlo

Markov chain Monte Carlo Markov chain Monte Carlo Feng Li feng.li@cufe.edu.cn School of Statistics and Mathematics Central University of Finance and Economics Revised on April 24, 2017 Today we are going to learn... 1 Markov Chains

More information

Monte Carlo Simulations

Monte Carlo Simulations Monte Carlo Simulations What are Monte Carlo Simulations and why ones them? Pseudo Random Number generators Creating a realization of a general PDF The Bootstrap approach A real life example: LOFAR simulations

More information

Probability and Measure

Probability and Measure Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 84 Paper 4, Section II 26J Let (X, A) be a measurable space. Let T : X X be a measurable map, and µ a probability

More information

Chapter 11 - Sequences and Series

Chapter 11 - Sequences and Series Calculus and Analytic Geometry II Chapter - Sequences and Series. Sequences Definition. A sequence is a list of numbers written in a definite order, We call a n the general term of the sequence. {a, a

More information

Lecture 5: Two-point Sampling

Lecture 5: Two-point Sampling Randomized Algorithms Lecture 5: Two-point Sampling Sotiris Nikoletseas Professor CEID - ETY Course 2017-2018 Sotiris Nikoletseas, Professor Randomized Algorithms - Lecture 5 1 / 26 Overview A. Pairwise

More information

6.045: Automata, Computability, and Complexity (GITCS) Class 17 Nancy Lynch

6.045: Automata, Computability, and Complexity (GITCS) Class 17 Nancy Lynch 6.045: Automata, Computability, and Complexity (GITCS) Class 17 Nancy Lynch Today Probabilistic Turing Machines and Probabilistic Time Complexity Classes Now add a new capability to standard TMs: random

More information

16 : Approximate Inference: Markov Chain Monte Carlo

16 : Approximate Inference: Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models 10-708, Spring 2017 16 : Approximate Inference: Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Yuan Yang, Chao-Ming Yen 1 Introduction As the target distribution

More information

Introducing the Normal Distribution

Introducing the Normal Distribution Department of Mathematics Ma 3/13 KC Border Introduction to Probability and Statistics Winter 219 Lecture 1: Introducing the Normal Distribution Relevant textbook passages: Pitman [5]: Sections 1.2, 2.2,

More information

Recitation 2: Probability

Recitation 2: Probability Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions

More information

Statistics 100A Homework 5 Solutions

Statistics 100A Homework 5 Solutions Chapter 5 Statistics 1A Homework 5 Solutions Ryan Rosario 1. Let X be a random variable with probability density function a What is the value of c? fx { c1 x 1 < x < 1 otherwise We know that for fx to

More information

Lehmer Random Number Generators: Introduction

Lehmer Random Number Generators: Introduction Lehmer Random Number Generators: Introduction Revised version of the slides based on the book Discrete-Event Simulation: a first course LL Leemis & SK Park Section(s) 21, 22 c 2006 Pearson Ed, Inc 0-13-142917-5

More information

Math489/889 Stochastic Processes and Advanced Mathematical Finance Solutions for Homework 7

Math489/889 Stochastic Processes and Advanced Mathematical Finance Solutions for Homework 7 Math489/889 Stochastic Processes and Advanced Mathematical Finance Solutions for Homework 7 Steve Dunbar Due Mon, November 2, 2009. Time to review all of the information we have about coin-tossing fortunes

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

1 Presessional Probability

1 Presessional Probability 1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional

More information

How does the computer generate observations from various distributions specified after input analysis?

How does the computer generate observations from various distributions specified after input analysis? 1 How does the computer generate observations from various distributions specified after input analysis? There are two main components to the generation of observations from probability distributions.

More information

Probability Distributions

Probability Distributions 02/07/07 PHY310: Statistical Data Analysis 1 PHY310: Lecture 05 Probability Distributions Road Map The Gausssian Describing Distributions Expectation Value Variance Basic Distributions Generating Random

More information

Probability and Measure

Probability and Measure Probability and Measure Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Convergence of Random Variables 1. Convergence Concepts 1.1. Convergence of Real

More information

Robert Collins CSE586, PSU Intro to Sampling Methods

Robert Collins CSE586, PSU Intro to Sampling Methods Robert Collins Intro to Sampling Methods CSE586 Computer Vision II Penn State Univ Robert Collins A Brief Overview of Sampling Monte Carlo Integration Sampling and Expected Values Inverse Transform Sampling

More information

CSCI-6971 Lecture Notes: Monte Carlo integration

CSCI-6971 Lecture Notes: Monte Carlo integration CSCI-6971 Lecture otes: Monte Carlo integration Kristopher R. Beevers Department of Computer Science Rensselaer Polytechnic Institute beevek@cs.rpi.edu February 21, 2006 1 Overview Consider the following

More information

Multivariate Simulations

Multivariate Simulations Multivariate Simulations Katarína Starinská starinskak@gmail.com Charles University Faculty of Mathematics and Physics Prague, Czech Republic 21.11.2011 Katarína Starinská Multivariate Simulations 1 /

More information

Topological properties of Z p and Q p and Euclidean models

Topological properties of Z p and Q p and Euclidean models Topological properties of Z p and Q p and Euclidean models Samuel Trautwein, Esther Röder, Giorgio Barozzi November 3, 20 Topology of Q p vs Topology of R Both R and Q p are normed fields and complete

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Solutions Thursday, September 19 What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

Introducing the Normal Distribution

Introducing the Normal Distribution Department of Mathematics Ma 3/103 KC Border Introduction to Probability and Statistics Winter 2017 Lecture 10: Introducing the Normal Distribution Relevant textbook passages: Pitman [5]: Sections 1.2,

More information

Reminder of some Markov Chain properties:

Reminder of some Markov Chain properties: Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent

More information

Computer Intensive Methods in Mathematical Statistics

Computer Intensive Methods in Mathematical Statistics Computer Intensive Methods in Mathematical Statistics Department of mathematics KTH Royal Institute of Technology jimmyol@kth.se Lecture 2 Random number generation 27 March 2014 Computer Intensive Methods

More information

Multimodal Nested Sampling

Multimodal Nested Sampling Multimodal Nested Sampling Farhan Feroz Astrophysics Group, Cavendish Lab, Cambridge Inverse Problems & Cosmology Most obvious example: standard CMB data analysis pipeline But many others: object detection,

More information