CPSC 405 Random Variate Generation
2007W T1 Handout

These notes present techniques for generating samples from desired probability distributions, along with some fundamental results and techniques. Some of this material can be found in Chapter 8 (9 in the 2nd ed.) of the book.

1 Inverse Transform

Suppose we can generate a uniform random number r on [0, 1]. How can we generate numbers x with a given pdf P(x)?

To warm up our brain, let's first think about something else. Suppose we generate a uniform random number 0 < r < 1 and square it, so that x = r^2. Clearly we also have 0 < x < 1. What is the pdf P(x) of x? A wild guess might be that it is just the square of the pdf of r, so that x would also be uniform. It is easy to see, however, that this can't be true. Consider the probability p that x < 1/2. If x were uniform on [0, 1] we would get p = 1/2. But in order to get an x < 1/2 we must have drawn an r < 1/\sqrt{2}, and the probability of that is p = 1/\sqrt{2}, not p = 1/2. So P(x) can't be uniform.

To figure out what P(x) is, consider the probability p that x falls in the interval [x, x + \Delta x]. In the limit \Delta x \to 0 we have, according to the definition of a pdf, p = P(x)\Delta x. So if we can figure out p, we can compute P(x). For x to fall in [x, x + \Delta x], we must generate r in the range [\sqrt{x}, \sqrt{x + \Delta x}]. Since r is uniform, the probability of this (which is also p) is just the length of that interval, so p = \sqrt{x + \Delta x} - \sqrt{x}. We now use p = P(x)\Delta x and solve for P(x), obtaining

    P(x) = \frac{\sqrt{x + \Delta x} - \sqrt{x}}{\Delta x}.

Taking the limit \Delta x \to 0 we thus obtain

    P(x) = \lim_{\Delta x \to 0} \frac{\sqrt{x + \Delta x} - \sqrt{x}}{\Delta x} = \frac{d}{dx} \sqrt{x} = \frac{1}{2\sqrt{x}}.

With this result, we now have an instance of the inverse transform method: to generate random numbers with pdf 1/(2\sqrt{x}), generate a uniform random number r and square it.

The general form of the inverse transform method is obtained by computing the pdf of x = g(r) for some function g, and then trying to find a g such that the desired pdf is obtained. Let us assume that g(r) is invertible with inverse g^{-1}(x). The chance that x lies in the interval [x, x + dx] is P(x)dx, for infinitesimal dx. What values of r should we have gotten to get this? (Remember we are generating values x by calling a uniform random number generator to get r and then setting x = g(r).) We should have gotten an r in the interval [r, r + dr], with r = g^{-1}(x) and r + dr = g^{-1}(x + dx). Expanding the last formula to first order gives

    r + dr = g^{-1}(x) + (g^{-1})'(x) dx,

where the prime denotes the derivative. Using r = g^{-1}(x) we can simplify this to

    dr = (g^{-1})'(x) dx.    (1)

The probability for the r value to be in the interval [r, r + dr] is just dr. This is also the probability for x to be in [x, x + dx], which is P(x)dx. Using Eq. 1 we thus get

    P(x)dx = (g^{-1})'(x) dx,

so P(x) = (g^{-1})'(x). Integrating both sides and remembering that F(x) = \int_{-\infty}^{x} P(y) dy gives us

    F(x) = g^{-1}(x),

or

    g(x) = F^{-1}(x),

provided F(x) has an inverse.

In summary, to generate a number x with pdf P(x) using the inverse transform method, we first figure out the cdf F(x) from P(x). We then invert it by solving r = F(x) for x, which gives the function F^{-1}(r). Finally, we generate a uniform random number 0 < r < 1 and compute x = F^{-1}(r).
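As a concrete illustration (my own example, not from the handout): for the exponential distribution with rate lambda, P(x) = lambda e^{-lambda x} for x >= 0, the cdf is F(x) = 1 - e^{-lambda x}, so solving r = F(x) gives F^{-1}(r) = -ln(1 - r)/lambda. A minimal MATLAB sketch:

function x = expgen(lambda)
% Inverse transform sampling for the exponential distribution
% (illustrative sketch; the function name and parameter are my own choices).
r = rand;                  % uniform random number on (0,1)
x = -log(1 - r)/lambda;    % x = F^{-1}(r)

Since 1 - r is itself uniform on (0, 1), the last line is often shortened to x = -log(rand)/lambda.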

2 Pdf of a function of a random variable

Suppose x has pdf P(x). What is the pdf Q(y) of y = g(x)? The chance for x to be in [x, x + dx] is P(x)dx. Then y is in [y, y + dy], with y = g(x) and dy = g'(x)dx. The chance for y to be in that interval is by definition Q(y)dy. So we get, noting that g' can be negative,

    Q(y) |g'(x)| dx = P(x) dx,

so

    Q(y) = P(x)/|g'(x)| = P(g^{-1}(y)) |(g^{-1})'(y)|.    (2)

Note that we have made an implicit assumption here that g(x) is monotone. If it is not, several intervals in x can map onto the same values of y, and the inverse does not exist. This would complicate matters, and we shall not deal with that case. If g(x) is monotone, the derivative does not change sign, so |dy| = |g'(x)| dx, which is what was used in Eq. 2.

Let's try it on a familiar example: P(x) = 1 on [0, 1] (i.e., the uniform distribution) and g(x) = (b - a)x + a. We already know that the result should be the uniform pdf on [a, b]. Inverting g(x) gives g^{-1}(y) = (y - a)/(b - a) and (g^{-1})'(y) = 1/(b - a), so Q(y) = 1/(b - a), which is correct.

A linear transformation is often used to get a normal distribution with given \mu and \sigma from the standard normal with \mu = 0 and \sigma = 1,

    N(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}.

If z denotes a standard normal variate, then x = \mu + \sigma z is normally distributed with that mean and standard deviation, as can easily be verified using Eq. 2.
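Equation 2 is also easy to check numerically. Here is a small MATLAB sketch (my own illustration, not part of the handout) that squares uniform variates, as in Section 1, and compares the histogram against the predicted pdf 1/(2\sqrt{x}):

% Numerical check of Eq. 2 (illustrative sketch): x = r^2 with r
% uniform on (0,1) should have pdf 1/(2*sqrt(x)).
r = rand(1, 100000);
x = r.^2;
[counts, centers] = hist(x, 50);
binwidth = centers(2) - centers(1);
empirical = counts/(sum(counts)*binwidth);   % normalize counts to a density
predicted = 1./(2*sqrt(centers));            % pdf predicted by Eq. 2
plot(centers, empirical, 'o', centers, predicted, '-');
legend('empirical', 'predicted');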

3 Constructing the pdf from measured data

In this section I will present an algorithm to generate samples from a continuous distribution for which we only know a finite number of measured data points. Suppose we have measured some parameter x in a system, and have recorded N + 1 values, which we have sorted in increasing order: x_1, x_2, ..., x_{N+1}. Our task is now to create samples from the unknown pdf P(x) underlying this data. There are several ways to do this, and since we have only limited data we will have to make some guesses.

Let us denote the N intervals by I_k = [x_k, x_{k+1}] for k = 1, ..., N. We now want to assign equal probability to each of the intervals. Regions with small intervals will then have a higher P(x), as expected: there are more intervals per unit length there, and all intervals are treated as equally probable.

Let's begin with our old friend, the uniform random number generator on [0, 1], and divide the unit interval into N equal intervals R_k = [(k - 1)/N, k/N], with k = 1, ..., N. The plan is now to generate r, figure out which interval R_k it is in, look up the corresponding interval I_k, and generate a value of x in I_k depending on the location of r within R_k: if r lies in the left part of R_k, a value from the left side of I_k is generated, and vice versa. See Figure 1.

Here is MATLAB code (empgen.m) that does it. It takes a vector data with a (sorted) data sample and returns a random number y generated from the empirical pdf. Note that the empirical pdf is never explicitly constructed.

function y = empgen(data)
N = length(data)-1;
r = rand;
% which interval R_k the uniform number r falls into
k = 1 + floor(N*r);
% relative offset of r within R_k (between 0 and 1)
offset = r*N - (k-1);
% interpolate linearly to the corresponding point in I_k
y = data(k)*(1-offset) + data(k+1)*offset;
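To use empgen, call it once per desired sample. A brief sketch (the data vector here is made up for illustration):

% Illustrative usage of empgen; the "measurements" are invented.
data = sort([0.10 0.15 0.20 0.50 0.90 1.40 1.50 1.55 1.60]);
samples = zeros(1, 10000);
for i = 1:10000
    samples(i) = empgen(data);
end
hist(samples, 40);   % regions with densely spaced data points get more mass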

Figure 1: Mapping from the uniform random number r to the value x based on the measured data points.

4 Convolutions

Let

    y = \sum_{k=1}^{K} r_k,

where the r_k are drawn from some (fixed) distribution P(r). The pdf Q(y) of y is called a convolution of the distribution P(r). Note that y is just a sum, and a sum is the same as the average up to a constant factor.

There is no simple way to compute Q(y) in general. It goes like this. Consider the K-dimensional space R spanned by the r_k. Let -\infty < r_k < \infty, where P(r) is possibly zero on big regions of R. The equation

    \sum_{k=1}^{K} r_k = y,

for fixed y, defines a hyperplane H in R. So Q(y) is just the probability density that the r_k lie on the hyperplane H, which is

    Q(y) = \int_H P(r_1) P(r_2) \cdots P(r_K) \, d^{K-1}r,    (3)

a hypersurface integral.

For example, consider K = 2 and take P(r) to be uniform on [0, 1]. The domain H is defined by the equation r_1 + r_2 = y, together with the conditions 0 < r_1 < 1 and 0 < r_2 < 1. This defines a straight line segment, which intersects the r_1 and r_2 axes at y, with 0 < y < 2. Equation 3 now reads

    Q(y) = \int_H ds,

which is the length of the line segment, times some constant we don't worry about here. The length of the line segment, plotted as a function of y, is just a triangle with its peak at y = 1: the triangular distribution.

Figure 2: The sum of two uniform random numbers obeys a triangular distribution. (a) Integral over H. (b) Resulting triangular distribution.

Convolution gives us an easy way to generate the Erlang distribution, as it is defined as the distribution of a sum of exponentially distributed variables. Note also that if K is large, we will always generate an approximately normal distribution (this is the central limit theorem at work), as the sum is just the mean up to a multiplicative constant.
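Both points are easy to see numerically. A short MATLAB sketch (my own illustration, not from the handout): the first part sums two uniforms to produce the triangular distribution of Figure 2; the second generates one Erlang variate as a sum of K unit-rate exponentials, using the inverse transform from Section 1.

% Sum of two uniforms: the histogram shows the triangular pdf on [0,2].
y = rand(1, 100000) + rand(1, 100000);
hist(y, 50);

% One Erlang(K) variate as a sum of K unit-rate exponential variates.
K = 5;
e = sum(-log(rand(K, 1)));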

5 Acceptance-rejection

This is sometimes an easy and fast method to program. Suppose we want to generate uniform random numbers on [c, 1], with 0 < c < 1. We could generate r on [0, 1], accept it if r >= c, and reject it and try again otherwise. In pseudo-C (taking rand() to return a uniform double on (0, 1)):

double f1(double c) {
    double r;
    while ((r = rand()) < c)
        ;  /* rejected; draw again */
    return r;
}

Let's compare this to our inverse transform technique:

double f2(double c) {
    double r;
    r = c + (1 - c)*rand();
    return r;
}

Which one is faster? The chance that f1 will generate a wrong value precisely n times is given by p = (1 - c)c^n: the chance of getting n wrong values in a row is c^n, and the chance of then

getting the right value at the end is 1 - c. The expected value of w, the number of times a wrong value is generated, is thus

    \langle w \rangle = \sum_{n=0}^{\infty} n (1 - c) c^n = \frac{c}{1 - c}.

Suppose now the functions are called N times. On average we have to call rand() N(1 + \langle w \rangle) = N/(1 - c) times, and every time we also have to do a comparison. Method 2 (f2), on the other hand, needs to call rand() only once per value, but it has to do an addition, a subtraction, and a multiplication every time. Which is faster clearly depends on c. Let's work it out. Let T_R be the computation time for rand(), T_A the time to do the addition, subtraction, and multiplication, and T_C the time to do the comparison. If we denote the time spent per call for the two algorithms by T_1 and T_2, we have

    T_1 = \frac{1}{1 - c}(T_R + T_C)

and

    T_2 = T_R + T_A.

The acceptance-rejection algorithm is faster if T_1 < T_2, which we can rewrite as

    \frac{c}{1 - c} T_R + \frac{T_C}{1 - c} < T_A.

For example, if c = 0.1 we get approximately

    T_C + 0.1 T_R < T_A,

which is probably satisfied if we use the linear congruential method for rand(), as it requires about the same time as T_A.

This method is of course not only applicable to uniform distributions. Here's a more realistic example. IQs are normally distributed with a mean of 100 and a standard deviation of 15. This is of course an approximation, and in particular N(x; 100, 15) can generate negative IQs. So if we want to generate a sample of IQs we could try to use the inverse transform technique for a cut-off normal distribution. However, it is much simpler and faster to use acceptance-rejection here and just try again whenever we get a negative value. In fact, the chance of this happening is only about one in a hundred billion!
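A minimal MATLAB sketch of this last example (my own, not from the handout), combining the linear transformation of Section 2 with acceptance-rejection:

function iq = iqgen()
% Acceptance-rejection for a normal distribution cut off at zero
% (illustrative sketch; the function name is my own).
iq = -1;
while iq < 0                % reject the (extremely rare) negative draws
    iq = 100 + 15*randn;    % x = mu + sigma*z with z standard normal
end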