Mainz, May 2, 2013 Statistics, Data Analysis, and Simulation SS 2013 08.128.730 Statistik, Datenanalyse und Simulation Dr. Michael O. Distler <distler@uni-mainz.de>
2. Random Numbers

2.1 Why random numbers
Simulation, sampling, numerical analysis, computer programming, decision making, cryptography, aesthetics, recreation.

2.2 Number representation
unsigned integer
signed integer
floating-point format: float (32-bit), double (64-bit)
2.3 Random Number Generators

1927: L.H.C. Tippett published 40,000 random digits taken at random from census reports.
1939: M.G. Kendall and B. Babington-Smith produced a table of 100,000 random digits using a mechanical random number generator.
1946: John von Neumann first suggested the middle-square method. His idea was to take the square of the previous random number and to extract the middle digits. For example, if we are generating 10-digit numbers:

$r_{j+1} = (r_j^2 \;\mathrm{div}\; 100{,}000) \bmod 10{,}000{,}000{,}000$

$r_0 = 5772156649, \quad r_0^2 = 33317\,\underbrace{7923805949}_{r_1}\,09201, \quad r_1 = 7923805949$

Von Neumann's original middle-square method has actually proved to be a comparatively poor source of random numbers. The danger is that the sequence tends to get into a rut, a short cycle of repeating elements. For example, if zero ever appears as a number of the sequence, it will continually perpetuate itself.
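The middle-square step can be sketched in a few lines of Python. The function name `middle_square_next` and the `digits` parameter are illustrative choices, not part of the lecture; the sketch assumes an even number of digits.

```python
def middle_square_next(r, digits=10):
    """One middle-square step for `digits`-digit numbers (digits assumed even)."""
    half = digits // 2
    # Square, drop the lowest `half` digits, keep the next `digits` digits.
    return (r * r // 10**half) % 10**digits

r0 = 5772156649
r1 = middle_square_next(r0)
print(r1)                      # 7923805949
print(middle_square_next(0))   # 0: once zero appears, it repeats forever
```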
Random Number Generators

There is another fairly obvious objection to the middle-square technique: how can a sequence generated in such a way be random, since each number is completely determined by its predecessor? The answer is that this sequence isn't random, but it appears to be. Sequences generated in a deterministic way such as this are usually called pseudo-random or quasi-random sequences in the technical literature, but here we will make a distinction:

pseudo-random numbers should pass the same statistical tests as true random numbers do;
quasi-random numbers or sequences can be transformed like random numbers, but should only be used for Monte Carlo (MC) integration.

$X_{j+1} = f(X_j, X_{j-1}, \ldots, X_1)$
2.3.1 The Linear Congruential Method

By far the most popular random number generators in use today are special cases of the following scheme, introduced by D.H. Lehmer in 1949. We choose four "magic numbers":

$m$, the modulus; $0 < m$.
$a$, the multiplier; $0 \le a < m$.
$c$, the increment; $0 \le c < m$.
$X_0$, the starting value; $0 \le X_0 < m$.

The desired sequence of random numbers is then obtained by setting

$X_{n+1} = (a X_n + c) \bmod m, \quad n \ge 0$

For example, the sequence obtained when $m = 10$ and $X_0 = a = c = 7$ is 7, 6, 9, 0, 7, 6, 9, 0, ...
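A minimal sketch of the recurrence as a Python generator, reproducing the small example with m = 10. The name `lcg` is an illustrative choice; the generator yields X_0 first and then applies the update rule.

```python
from itertools import islice

def lcg(m, a, c, x0):
    """Yield X_0, X_1, X_2, ... with X_{n+1} = (a * X_n + c) mod m."""
    x = x0
    while True:
        yield x
        x = (a * x + c) % m

first_eight = list(islice(lcg(m=10, a=7, c=7, x0=7), 8))
print(first_eight)  # [7, 6, 9, 0, 7, 6, 9, 0]
```

The output shows a period of only 4, although m = 10 would allow up to 10 distinct values.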
The Linear Congruential Generator (LCG)

The following theorem makes it easy to tell if the maximum period is achieved. The proof can be found in: Donald E. Knuth, The Art of Computer Programming, Vol. 2.

Theorem: The linear congruential sequence defined by $m$, $a$, $c$, and $X_0$ has period length $m$ if and only if
i) $c$ is relatively prime to $m$ (German: "teilerfremd");
ii) $b = a - 1$ is a multiple of $p$, for every prime $p$ dividing $m$;
iii) $b$ is a multiple of 4, if $m$ is a multiple of 4.

Try $m = 16$, $a = 5$, $c = 3$, $X_0 = 0$:

$X_{n+1} = (5 X_n + 3) \bmod 16$
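The three conditions are easy to check by machine. The following sketch tests them and, for comparison, measures the period by brute force. The helper names (`prime_factors`, `has_full_period`, `period`) are illustrative, and the trial-division factorization is only meant for small m.

```python
from math import gcd

def prime_factors(n):
    """Set of prime divisors of n (trial division; fine for small n)."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def has_full_period(m, a, c):
    """Conditions i) to iii) of the theorem."""
    b = a - 1
    return (gcd(c, m) == 1
            and all(b % p == 0 for p in prime_factors(m))
            and (m % 4 != 0 or b % 4 == 0))

def period(m, a, c, x0):
    """Cycle length of the sequence, found by recording first visits."""
    seen = {}
    x, n = x0, 0
    while x not in seen:
        seen[x] = n
        x = (a * x + c) % m
        n += 1
    return n - seen[x]

print(has_full_period(16, 5, 3), period(16, 5, 3, 0))  # True 16
print(has_full_period(10, 7, 7), period(10, 7, 7, 7))  # False 4
```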
LCG parameters in common use

Source                                   m         a (multiplier)             c (increment)          Output bits
Numerical Recipes                        2^32      1664525                    1013904223
Borland C/C++                            2^32      22695477                   1                      bits 30..16 in rand(), bits 30..0 in lrand()
glibc (used by GCC)                      2^31      1103515245                 12345                  bits 30..0
ANSI C: Watcom, ...                      2^31      1103515245                 12345                  bits 30..16
Borland Delphi, Virtual Pascal           2^32      134775813                  1                      bits 63..32 of (seed * L)
Microsoft Visual/Quick C/C++             2^32      214013 (343FD hex)         2531011 (269EC3 hex)   bits 30..16
Microsoft Visual Basic (≤ v6)            2^24      1140671485 (43FD43FD hex)  12820163 (C39EC3 hex)
MMIX by Donald Knuth                     2^64      6364136223846793005        1442695040888963407
VAX's MTH$RANDOM, old versions of glibc  2^32      69069                      1
Java's java.util.Random                  2^48      25214903917                11                     bits 47..16
LC53 in Forth                            2^32 - 5  2^32 - 333333333           0

Source: Wikipedia
2.3.2 Multiplicative congruential method

If $c = 0$, the generator is often called a multiplicative congruential generator, or Lehmer RNG:

$X_{n+1} = (a X_n) \bmod m$

Advantage: faster algorithm.
Disadvantage: the value zero must never occur in the sequence, and the period may be shorter.

Definition: When $a$ is relatively prime to $m$, the smallest integer $\lambda$ for which $a^{\lambda} \bmod m = 1$ is conventionally called the order of $a$ modulo $m$. Any such value of $a$ that has the maximum possible order modulo $m$ is called a primitive element modulo $m$.
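The order of a modulo m can be found directly from the definition by repeated multiplication. This is a brute-force sketch for small m only; the function name `multiplicative_order` is an illustrative choice.

```python
from math import gcd

def multiplicative_order(a, m):
    """Smallest lambda >= 1 with a**lambda mod m == 1 (requires gcd(a, m) == 1, m > 1)."""
    if m < 2 or gcd(a, m) != 1:
        raise ValueError("need m > 1 and a relatively prime to m")
    power, order = a % m, 1
    while power != 1:
        power = (power * a) % m
        order += 1
    return order

print(multiplicative_order(3, 7))   # 6: maximum possible, so 3 is a primitive element modulo 7
print(multiplicative_order(2, 7))   # 3: 2 is not a primitive element modulo 7
```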
Multiplicative congruential generators

Let $\lambda(m)$ denote the order of a primitive element, i.e., the maximum possible order, modulo $m$.

$\lambda(2) = 1$
$\lambda(4) = 2$
$\lambda(2^e) = 2^{e-2}, \quad e > 2$
$\lambda(p^e) = p^{e-1}(p - 1), \quad \text{prime } p > 2$
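As a sanity check, the closed-form values of λ for prime powers can be compared with a brute-force search over all a relatively prime to m. The helper names (`order`, `max_order`, `lam_prime_power`) are illustrative, and the search is only practical for small moduli.

```python
from math import gcd

def order(a, m):
    """Order of a modulo m by repeated multiplication (gcd(a, m) == 1 assumed)."""
    power, n = a % m, 1
    while power != 1:
        power = (power * a) % m
        n += 1
    return n

def max_order(m):
    """Largest order of any a relatively prime to m (brute force)."""
    return max(order(a, m) for a in range(1, m) if gcd(a, m) == 1)

def lam_prime_power(p, e):
    """lambda(p**e) according to the formulas for prime powers."""
    if p == 2:
        return 1 if e == 1 else 2 if e == 2 else 2 ** (e - 2)
    return p ** (e - 1) * (p - 1)

for p, e in [(2, 1), (2, 2), (2, 3), (2, 5), (3, 2), (5, 2), (7, 1)]:
    print(p ** e, max_order(p ** e), lam_prime_power(p, e))
```

Each printed line shows the modulus followed by the brute-force and closed-form values, which should agree.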
Multiplicative congruential generators

Theorem: The number $a$ is a primitive element modulo $p^e$ if and only if one of the following cases applies:
1. $p^e = 2$, $a$ is odd; or $p^e = 4$, $a \bmod 4 = 3$; or $p^e = 8$, $a \bmod 8 = 3, 5, 7$; or $p = 2$, $e \ge 4$, $a \bmod 8 = 3, 5$;
2. $p$ is odd, $e = 1$, $a \not\equiv 0 \pmod{p}$, and $a^{(p-1)/q} \not\equiv 1 \pmod{p}$ for any prime divisor $q$ of $p - 1$;
3. $p$ is odd, $e > 1$, $a$ satisfies (2), and $a^{p-1} \not\equiv 1 \pmod{p^2}$.
Multiplicative congruential generators

Theorem: The maximum period possible when $c = 0$ is $\lambda(m)$. This period is achieved if
1. $X_0$ is relatively prime to $m$;
2. $a$ is a primitive element modulo $m$.

Note that we can obtain a period of length $m - 1$ if $m$ is prime; this is just one less than the maximum length, so for all practical purposes such a period is as long as we want.
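A small experiment illustrates the note on prime moduli. The modulus m = 31 and the multipliers 3 and 2 are illustrative choices, not taken from the lecture: 3 is a primitive element modulo 31, while 2 is not. The function name `mcg_period` is hypothetical as well.

```python
from math import gcd

def mcg_period(m, a, x0):
    """Period of X_{n+1} = (a * X_n) mod m, assuming gcd(a, m) == gcd(x0, m) == 1."""
    if gcd(a, m) != 1 or gcd(x0, m) != 1:
        raise ValueError("a and x0 must be relatively prime to m")
    start = x0 % m
    x, n = (a * start) % m, 1
    # With gcd(a, m) == 1 the map x -> a*x mod m is a permutation,
    # so the sequence is guaranteed to return to its starting value.
    while x != start:
        x = (a * x) % m
        n += 1
    return n

print(mcg_period(31, 3, 1))  # 30 = m - 1, the maximum for a prime modulus
print(mcg_period(31, 2, 1))  # 5, because 2**5 = 32 = 1 (mod 31)
```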
2.3.3 Combination of multiple MLCGs
Techniques for producing numbers from various distributions

The inverse transform sampling method: Let $X$ be a random variable whose distribution can be described by the cumulative distribution function $F$. We want to generate values of $X$ which are distributed according to this distribution. The inverse transform sampling method works as follows:
1. Generate a random number $u$ from the standard uniform distribution in the interval $[0, 1]$.
2. Compute the value $x$ such that $F(x) = u$.
3. Take $x$ to be the random number drawn from the distribution described by $F$.

Inverse transform sampling

Generator gives: $0 \le X_n < m \;\Rightarrow\; 0 \le \frac{X_n}{m} < 1$
Uniform distribution: $U(0, 1)$
Transformation: $f(x)\,dx = U(0, 1)\,du$
CDF: $\int_{-\infty}^{x} f(t)\,dt = F(x) = u \;\Rightarrow\; x = F^{-1}(u)$
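As an illustration, the sketch below applies the method to the exponential distribution, which is an example chosen here and not taken from the lecture. For $F(x) = 1 - e^{-\lambda x}$ the inverse is $x = -\ln(1 - u)/\lambda$. The function name `sample_exponential` and the rate 2.0 are assumptions made for the example; Python's `random.Random` serves as the uniform generator.

```python
import math
import random

def sample_exponential(rate, rng):
    """Draw one exponential variate by inverting F(x) = 1 - exp(-rate * x)."""
    u = rng.random()                    # uniform in [0, 1)
    return -math.log(1.0 - u) / rate    # 1 - u lies in (0, 1], so the log is defined

rng = random.Random(12345)              # fixed seed for reproducibility
samples = [sample_exponential(2.0, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(mean)                             # should be close to 1 / rate = 0.5
```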
Acceptance-Rejection Method

Suppose we want to generate samples from a density $f$ defined on some set $\mathcal{X}$. Let $g$ be a density on $\mathcal{X}$ from which we know how to generate samples and with the property that

$f(x) \le c\,g(x)$

for some constant $c$.
1. Generate $X$ from distribution $g$.
2. Generate $U$ from $U(0, 1)$.
3. If $U \le \frac{f(X)}{c\,g(X)}$, return $X$; otherwise go to step 1.

The acceptance-rejection method for sampling from density $f$ uses candidates from density $g$.
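A minimal sketch of the three steps, using an example chosen here rather than taken from the lecture: the target density is $f(x) = 6x(1 - x)$ on $[0, 1]$, the candidate density $g$ is uniform on $[0, 1]$, and $c = 1.5$ is the maximum of $f$, so $f(x) \le c\,g(x)$ holds everywhere. All names in the code are illustrative.

```python
import random

def f(x):
    """Target density on [0, 1] (illustrative choice)."""
    return 6.0 * x * (1.0 - x)

def g(x):
    """Candidate density: uniform on [0, 1]."""
    return 1.0

C = 1.5  # f(x) <= C * g(x) on [0, 1], since max f = f(0.5) = 1.5

def sample_f(rng):
    """Acceptance-rejection: propose from g, accept with probability f / (C * g)."""
    while True:
        x = rng.random()                # step 1: candidate X from g
        u = rng.random()                # step 2: U from U(0, 1)
        if u <= f(x) / (C * g(x)):      # step 3: accept or try again
            return x

rng = random.Random(2013)               # fixed seed for reproducibility
draws = [sample_f(rng) for _ in range(50_000)]
mean = sum(draws) / len(draws)
print(mean)                             # should be close to 0.5 by symmetry of f
```

On average a fraction 1/c of the candidates is accepted, so a smaller c (a tighter envelope) means fewer wasted draws.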