Uniform random numbers generators Lecturer: Dmitri A. Moltchanov E-mail: moltchan@cs.tut.fi http://www.cs.tut.fi/kurssit/tlt-2707/
OUTLINE:
- The need for random numbers;
- Basic steps in generation;
- Uniformly distributed random numbers: Von Neumann's generator; congruential methods (additive, multiplicative, linear); Tausworthe generator; composite generators;
- Statistical tests for uniform random numbers: independence (runs test and correlation test); uniformity ($\chi^2$ test and Kolmogorov test).
Lecture: Uniform random numbers generators
1. The need for random numbers
Examples of randomness in telecommunications:
- interarrival times of packets, tasks, etc.;
- service times of packets, tasks, etc.;
- time between failures of various components;
- repair time of various components; ...
Importance for simulations:
- random events are characterized by distributions;
- a simulation cannot use a distribution directly: it needs concrete random numbers drawn from it.
For example, in the M/M/1 queuing system:
- arrival process: interarrival times are exponentially distributed with mean $1/\lambda$;
- service times: exponentially distributed with mean $1/\mu$.
Discrete-event simulation of M/M/1 queue
INITIALIZATION
    time := 0; queue := 0; sum := 0; throughput := 0;
    generate first interarrival time;
MAIN PROGRAM
    while time < runlength do
        case nextevent of
            arrival event:
                time := arrivaltime;
                add customer to a queue;
                start new service if the service is idle;
                generate next interarrival time;
            departure event:
                time := departuretime;
                throughput := throughput + 1;
                remove customer from a queue;
                if (queue not empty)
                    sum := sum + waiting time;
                    start new service;
OUTPUT
    mean waiting time = sum / throughput
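The pseudocode above can be turned into a short runnable sketch. The version below is one possible reading, not the lecturer's implementation: the function name `simulate_mm1`, its parameters and the use of Python's built-in `random.Random` are assumptions made for illustration, and the mean is taken over customers that have started service rather than over departures.

```python
import random
from collections import deque


def simulate_mm1(lam, mu, runlength, seed=1):
    """Simulate an M/M/1 queue and return the mean waiting time in the queue."""
    rng = random.Random(seed)            # fixed seed gives a repeatable run
    waiting = deque()                    # arrival times of customers still waiting
    server_busy = False
    total_wait = 0.0                     # sum of waiting times (time before service starts)
    started = 0                          # customers that have started service
    next_arrival = rng.expovariate(lam)  # first interarrival time
    next_departure = float("inf")        # no departure while the server is idle

    while min(next_arrival, next_departure) < runlength:
        if next_arrival <= next_departure:
            # arrival event
            time = next_arrival
            if server_busy:
                waiting.append(time)
            else:
                server_busy = True       # service starts at once, waiting time is 0
                started += 1
                next_departure = time + rng.expovariate(mu)
            next_arrival = time + rng.expovariate(lam)
        else:
            # departure event
            time = next_departure
            if waiting:
                total_wait += time - waiting.popleft()
                started += 1
                next_departure = time + rng.expovariate(mu)
            else:
                server_busy = False
                next_departure = float("inf")

    return total_wait / started if started else 0.0
```

For $\lambda = 0.5$ and $\mu = 1$ the known M/M/1 result $W_q = \lambda / (\mu(\mu - \lambda))$ gives a mean waiting time of 1, which a long run of this sketch should approach.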
2. General notes
All computer-generated numbers are pseudo-random:
- we know the method by which they are generated;
- we can predict any random sequence in advance.
The goal is then to imitate random sequences as well as possible.
Requirements for generators:
- must be fast;
- must have low complexity;
- must be portable;
- must have sufficiently long cycles;
- must allow repeatable sequences to be generated;
- numbers must be independent;
- numbers must closely follow a given distribution.
General approach nowadays:
- transform one random variable into another;
- a uniform distribution is often used as the reference distribution.
Note the following:
- most languages contain a generator of uniformly distributed numbers in the interval (0, 1);
- most languages do not contain implementations of arbitrarily distributed random numbers.
The procedure is to:
- generate a random number with uniform distribution between a and b, $b \gg a$;
- transform it to a random number with uniform distribution on (0, 1);
- transform it to a random number with the desired distribution.
2.1. Step 1: uniform random numbers in (a, b)
Basic approach:
- generate a random number with uniform distribution on (a, b);
- transform these random numbers to (0, 1);
- transform the result to a random number with the desired distribution.
Uniform generators:
- old methods: mostly based on radioactivity;
- Von Neumann's algorithm;
- congruential methods.
Basic approach: the next number is some function of the previous one,
$$\gamma_{i+1} = F(\gamma_i), \quad i = 0, 1, \dots, \qquad (1)$$
- this is a recurrence relation of the first order;
- $\gamma_0$ is known and directly computed from the seed.
2.2. Step 2: transforming to random numbers in (0, 1)
Basic approach:
- generate a random number with uniform distribution on (a, b);
- transform these random numbers to (0, 1);
- transform the result to a random number with the desired distribution.
The uniform U(0, 1) distribution has the following pdf:
$$f(x) = \begin{cases} 1, & 0 \le x \le 1, \\ 0, & \text{otherwise.} \end{cases} \qquad (2)$$
Mean and variance are given by:
$$E[X] = \int_0^1 x\,dx = \left.\frac{x^2}{2}\right|_0^1 = \frac{1}{2}, \qquad \sigma^2[X] = \frac{1}{12}. \qquad (3)$$
How to get U(0, 1): by rescaling from U(0, m) as follows:
$$y_i = \gamma_i / m, \qquad (4)$$
where m is the biggest possible number that can be generated.
What we get:
- something like: 0.12, 0.67, 0.94, 0.04, 0.65, 0.20, ...;
- a sequence that appears to be random...
2.3. Step 3: non-uniform random numbers
Basic approach:
- generate a random number with uniform distribution on (a, b);
- transform these random numbers to (0, 1);
- transform the result to a random number with the desired distribution.
If we have a U(0, 1) generator, the following techniques are available:
- discretization: Bernoulli, binomial, Poisson, geometric;
- rescaling: uniform;
- inverse transform: exponential;
- specific transforms: normal;
- rejection method: universal method;
- reduction method: Erlang, binomial;
- composition method: for complex distributions.
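As an illustration of the inverse transform item above, an exponential random number with rate $\lambda$ can be obtained from a U(0, 1) number $u$ by inverting the exponential cdf $F(x) = 1 - e^{-\lambda x}$, which gives $x = -\ln(1 - u)/\lambda$. The sketch below assumes Python's built-in `random.random()` as the U(0, 1) source; the function name `exponential_from_uniform` is made up for this example.

```python
import math
import random


def exponential_from_uniform(u, rate):
    """Map a U(0, 1) number u to an exponential number with the given rate."""
    # Inverse of the cdf F(x) = 1 - exp(-rate * x).
    return -math.log(1.0 - u) / rate


def exponential_sample(rate, count, seed=1):
    """Draw `count` exponential numbers using the built-in uniform generator."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so 1 - u lies in (0, 1] and the log is defined.
    return [exponential_from_uniform(rng.random(), rate) for _ in range(count)]
```

The sample mean of a long run should be close to $1/\lambda$, the mean of the exponential distribution.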
3. Uniformly distributed random numbers
The generator is fully characterized by $(S, s_0, f, U, g)$:
- $S$ is a finite set of states;
- $s_0 \in S$ is the initial state;
- $f: S \to S$ is the transition function;
- $U$ is a finite set of output values;
- $g: S \to U$ is the output function.
The algorithm is then:
- let $u_0 = g(s_0)$;
- for $i = 1, 2, \dots$ do the following recursion: $s_i = f(s_{i-1})$; $u_i = g(s_i)$.
Note: the functions $f(\cdot)$ and $g(\cdot)$ heavily influence the quality of the algorithm.
[Figure 1: Example of the operations of random number generator. The user chooses the seed $s_0$; the states follow $s_i = f(s_{i-1})$ and the outputs are $u_i = g(s_i)$, drawn for $s_0, \dots, s_4$ and $u_0, \dots, u_4$.]
Here $s_0$ is a random seed:
- it allows the whole sequence to be repeated;
- it allows you to ensure manually that you get a different sequence.
3.1. Von Neumann's generator
The basic procedure:
- start with some number $u_0$ of a certain length x (say, x = 4 digits; this is the seed);
- square the number;
- take the middle 4 digits to get $u_1$;
- repeat...
Example: with seed 1234 we get 1234, 5227, 3215, 3362, 3030, etc.
Shortcomings:
- sensitive to the random seed: seed 2345 gives 2345, 4990, 9001, 180, 324, 1049, 1004, 80, 64, 40, ... (all further values stay below 100);
- may have a very short period: seed 2100 gives 2100, 4100, 8100, 6100, 2100, 4100, 8100, ... (period = 4 numbers).
To generate U(0, 1): divide each obtained number by $10^x$ (x is the length of $u_0$).
Note: this generator is also known as the midsquare generator.
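The procedure above can be sketched in a few lines. The function name `midsquare` and its parameters are chosen for this illustration; the square is padded with leading zeros to twice the number of digits before the middle digits are taken, which reproduces the sequences listed above.

```python
def midsquare(seed, count, digits=4):
    """Return the first `count` numbers of the midsquare sequence started at `seed`."""
    values = [seed]
    x = seed
    for _ in range(count - 1):
        squared = str(x * x).zfill(2 * digits)  # pad the square to 2 * digits digits
        start = digits // 2
        x = int(squared[start:start + digits])  # keep the middle `digits` digits
        values.append(x)
    return values


def midsquare_uniform(seed, count, digits=4):
    """Rescale the midsquare numbers to (0, 1) by dividing by 10**digits."""
    return [v / 10 ** digits for v in midsquare(seed, count, digits)]
```

Calling `midsquare(2100, 5)` shows the short cycle: the fifth value is again 2100.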
3.2. Congruential methods
There are a number of versions:
- additive congruential method;
- multiplicative congruential method;
- linear congruential method;
- Tausworthe generator.
General congruential generator:
$$u_{i+1} = f(u_i, u_{i-1}, \dots) \bmod m, \qquad (5)$$
where $u_i, u_{i-1}, \dots$ are past numbers.
For example, the quadratic congruential generator:
$$u_{i+1} = (a_1 u_i^2 + a_2 u_{i-1} + c) \bmod m. \qquad (6)$$
Note: with $a_1 = a_2 = 1$, $c = 0$ and m a power of 2, this generator is closely related to the midsquare method.
3.3. Additive congruential method
The additive congruential generator is given by:
$$u_{i+1} = (a_1 u_i + a_2 u_{i-1} + \dots + a_k u_{i-k+1}) \bmod m. \qquad (7)$$
A common special case is sometimes used:
$$u_{i+1} = (a_1 u_i + a_2 u_{i-1}) \bmod m. \qquad (8)$$
Characteristics:
- divide by m to get U(0, 1);
- maximum period is $m^k$;
- note: rarely used.
Shortcomings: consider k = 2:
- consider three consecutive numbers $u_{i-2}, u_{i-1}, u_i$;
- we will never get $u_{i-2} < u_i < u_{i-1}$ or $u_{i-1} < u_i < u_{i-2}$, although each of these orderings should occur in 1/6 of all triples.
3.4. Multiplicative congruential method
The multiplicative congruential generator is given by:
$$u_{i+1} = (a u_i) \bmod m. \qquad (9)$$
Characteristics:
- divide by m to get U(0, 1);
- theoretical maximum period is m - 1, since the value 0 must be excluded;
- note: rarely used.
Shortcomings: can never produce 0.
The choice of a and m is very important:
- recommended: $m = 2^p - 1$ with p = 2, 3, 5, 7, 13, 17, 19, 31, 61 (Mersenne primes);
- $m = 2^q$, $q \ge 4$, simplifies the calculation of the modulo;
- for $m = 2^q$ the practical maximum period is at best no longer than m/4.
3.5. Linear congruential method
The linear congruential generator is given by:
$$u_{i+1} = (a u_i + c) \bmod m, \qquad (10)$$
where a, c, m are all positive.
Characteristics:
- divide by m to get U(0, 1);
- maximum period is m;
- frequently used.
The choice of a, c, m is very important. To get the full period m, choose them such that:
- m and c have no common divisor other than 1, that is, c and m are relatively prime;
- if q is a prime divisor of m, then $a \equiv 1 \pmod q$;
- if 4 is a divisor of m, then $a \equiv 1 \pmod 4$.
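The three full-period conditions above are easy to check mechanically. The following is a minimal sketch under that reading of the conditions; the helper names `prime_factors` and `has_full_period` are introduced here for illustration only, and trial division is used because it is sufficient for the small moduli of the examples.

```python
from math import gcd


def prime_factors(n):
    """Return the set of prime divisors of n (trial division)."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def has_full_period(a, c, m):
    """Check the three full-period conditions for u_{i+1} = (a*u_i + c) mod m."""
    if gcd(c, m) != 1:                      # c and m must be relatively prime
        return False
    if any((a - 1) % q != 0 for q in prime_factors(m)):
        return False                        # a = 1 (mod q) for every prime divisor q of m
    if m % 4 == 0 and (a - 1) % 4 != 0:
        return False                        # a = 1 (mod 4) if 4 divides m
    return True
```

For example, `has_full_period(1, 5, 13)` holds, while `has_full_period(7, 7, 10)` fails because $a - 1 = 6$ is not divisible by the prime divisor 5 of m.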
The step-by-step procedure is as follows:
- set the seed $x_0$;
- multiply x by a and add c;
- divide the result by m;
- the remainder is $x_1$;
- repeat to get $x_2, x_3, \dots$.
Examples:
- $x_0 = 7$, a = 7, c = 7, m = 10: we get 7, 6, 9, 0, 7, 6, 9, 0, ... (period = 4);
- $x_0 = 1$, a = 1, c = 5, m = 13: we get 1, 6, 11, 3, 8, 0, 5, 10, 2, 7, 12, 4, 9, 1, ... (period = 13);
- $x_0 = 8$, a = 2, c = 5, m = 13: we get 8, 8, 8, 8, 8, 8, 8, 8, ... (period = 1!).
Recommended values: a = 314159269, c = 453806245, $m = 2^{31}$ for a 32-bit machine.
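The step-by-step procedure and the three examples above can be reproduced with a small sketch. The names `lcg`, `first_values` and `cycle_length` are hypothetical helpers written for this illustration; `cycle_length` simply remembers where each value was first seen, so it is only meant for small moduli.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield x_0, x_1, x_2, ... with x_{i+1} = (a * x_i + c) mod m."""
    x = seed
    while True:
        yield x
        x = (a * x + c) % m


def first_values(seed, a, c, m, count):
    """Return the first `count` numbers of the sequence, starting with the seed."""
    return list(islice(lcg(seed, a, c, m), count))


def cycle_length(seed, a, c, m):
    """Return the period of the sequence by detecting the first repeated value."""
    seen = {}
    for i, x in enumerate(lcg(seed, a, c, m)):
        if x in seen:
            return i - seen[x]
        seen[x] = i
```

Dividing each value by m, for instance `[x / 10 for x in first_values(7, 7, 7, 10, 4)]`, rescales the output to U(0, 1).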
Complexity of the algorithm:
- addition, multiplication and division;
- division is slow: to avoid it, set m to the size of the computer word.
Overflow problem when m equals the size of the word:
- the values a, c and m are such that the result $a x_i + c$ is greater than the word;
- this loses the most significant digits, but it does not hurt: the lost part is a multiple of m and does not change the remainder.
How to deal with it (illustration):
- the register can accommodate 2 digits at maximum, so the largest number that can be stored is 99, and m = 100;
- for a = 8, $u_0 = 2$, c = 10 we get $a u_0 + c = 26$ and 26 mod 100 = 26 (no overflow);
- for a = 8, $u_0 = 20$, c = 10 we get $a u_0 + c = 170$:
  - $a u_0 = 8 \cdot 20 = 160$ causes overflow;
  - the first significant digit is lost and the register contains 60;
  - the result in the register is (60 + 10) mod 100 = 70;
  - this is the same as 170 mod 100 = 70.
3.6. How to get a good congruential generator
Characteristics of a good generator:
- should provide maximum density:
  - no large gaps in [0, 1] are left by the random numbers;
  - problem: each number is discrete;
  - solution: a very large integer for the modulus m;
- should provide maximum period:
  - this achieves maximum density and avoids cycling;
  - achieved by a proper choice of a, c, m, and $x_0$;
- should be efficient on modern computers: set the modulus to a power of 2.
3.7. Tausworthe generator
Tausworthe generator (a case of the linear congruential generator of order k):
$$z_i = (a_1 z_{i-1} + a_2 z_{i-2} + \dots + a_k z_{i-k} + c) \bmod 2 = \Big(\sum_{j=1}^{k} a_j z_{i-j} + c\Big) \bmod 2, \qquad (11)$$
where $a_j \in \{0, 1\}$, $j = 1, \dots, k$;
the output is binary: 0011011101011101000101...
Advantages:
- independent of the system (computer architecture);
- independent of the word size;
- very large periods;
- can be used in composite generators (considered in what follows).
Note: there are several bit selection techniques to get numbers.
A way to generate numbers:
- choose an integer $l \le k$;
- split the bit stream into blocks of length l and interpret each block as a binary fraction:
$$u_n = \sum_{j=0}^{l-1} z_{nl+j}\, 2^{-(j+1)}. \qquad (12)$$
In practice, only two $a_j$ are used and set to 1, at places h and k. We get:
$$z_i = (z_{i-h} + z_{i-k}) \bmod 2. \qquad (13)$$
Example: h = 3, k = 4, initial values 1, 1, 0, 1:
- we get: 110101111000100110101111...;
- period is $2^k - 1 = 15$;
- if l = 4: 13/16, 7/16, 8/16, 9/16, 10/16, 15/16, 1/16, 3/16, ...
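The two-tap recurrence (13) and the block rule (12) can be sketched as follows. The function names `tausworthe_bits` and `bits_to_uniform` are made up for this illustration, and the seed bits 1, 1, 0, 1 used in the usage note are simply the first four bits of the bit stream listed in the example above.

```python
def tausworthe_bits(seed_bits, h, k, count):
    """Return `count` bits of z_i = (z_{i-h} + z_{i-k}) mod 2, starting with the seed bits."""
    bits = list(seed_bits)
    if len(bits) < k:
        raise ValueError("need at least k seed bits")
    while len(bits) < count:
        bits.append((bits[-h] + bits[-k]) % 2)  # bits[-h] is z_{i-h}, bits[-k] is z_{i-k}
    return bits[:count]


def bits_to_uniform(bits, l):
    """Split the bits into blocks of length l and read each block as a binary fraction."""
    numbers = []
    for n in range(len(bits) // l):
        block = bits[n * l:(n + 1) * l]
        numbers.append(sum(z * 2.0 ** -(j + 1) for j, z in enumerate(block)))
    return numbers
```

With `bits = tausworthe_bits([1, 1, 0, 1], 3, 4, 32)`, the call `bits_to_uniform(bits, 4)` returns the fractions 13/16, 7/16, 8/16, 9/16, 10/16, 15/16, 1/16, 3/16 in decimal form.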
3.8. Composite generator
Idea: use two generators of low period to generate another with a longer period.
The basic principle:
- use the first generator to fill the shuffling table (address, entry (random number));
- use random numbers of the second generator as addresses in the next step;
- each number corresponding to the address is replaced by a new random number of the first generator.
The following algorithm uses one generator to shuffle with itself:
1. create a shuffling table of 100 entries $(i, t_i = \gamma_i)$, $i = 1, 2, \dots, 100$;
2. draw a random number $\gamma_k$ and normalize it to the range (1, 100) to obtain an address i;
3. entry i of the table gives the random number $t_i$;
4. draw the next random number $\gamma_{k+1}$ and update $t_i = \gamma_{k+1}$;
5. repeat from step 2.
Note: a table with 100 entries gives fairly good results.
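The self-shuffling algorithm in steps 1 to 5 can be sketched as below. This is only an illustration under stated assumptions: the class name `ShuffledGenerator` is hypothetical, the underlying generator is a linear congruential generator with the recommended constants quoted earlier in the lecture, and table addresses run from 0 to 99 instead of 1 to 100 because Python lists are zero-based.

```python
class ShuffledGenerator:
    """A linear congruential generator shuffled with itself through a table."""

    def __init__(self, seed, a=314159269, c=453806245, m=2 ** 31, table_size=100):
        self.a, self.c, self.m = a, c, m
        self.x = seed
        # Step 1: fill the shuffling table with the first numbers of the generator.
        self.table = [self._next() for _ in range(table_size)]

    def _next(self):
        """Advance the underlying generator and return the raw integer."""
        self.x = (self.a * self.x + self.c) % self.m
        return self.x

    def random(self):
        """Return the next shuffled number, rescaled to [0, 1)."""
        # Step 2: draw a number and normalize it to a table address.
        index = self._next() * len(self.table) // self.m
        # Step 3: the entry at that address is the output.
        value = self.table[index]
        # Step 4: refill the entry with the next number of the generator.
        self.table[index] = self._next()
        return value / self.m
```

Two instances created with the same seed produce the same sequence, so runs remain repeatable.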
4. Tests for random number generators
What do we want to check:
- independence;
- uniformity.
Important notes:
- numbers can be treated as random only if the tests are passed;
- recall: the numbers are actually deterministic!
Commonly used tests for independence:
- runs test;
- correlation test.
Commonly used tests for uniformity:
- Kolmogorov's test;
- $\chi^2$ test.
4.1. Independence: runs test
Basic idea:
- compute patterns of numbers (always increase, always decrease, etc.);
- compare to theoretical probabilities.
[Figure 2: Illustration of the basic idea; each branch in the figure is labelled with probability 1/3.]
Do the following:
- consider a sequence of pseudo-random numbers $\{u_i, i = 0, 1, \dots, n\}$;
- consider unbroken subsequences in which the numbers are monotonically increasing;
  - such a subsequence is called a run-up;
  - example: in 0.78, 0.81, 0.89, 0.81 the first three numbers form a run-up of length 3;
- count all run-ups of length i: $r_i$, $i = 1, 2, 3, 4, 5$;
- all run-ups of length $i \ge 6$ are grouped into $r_6$;
- calculate:
$$R = \frac{1}{n} \sum_{1 \le i, j \le 6} (r_i - n b_i)(r_j - n b_j)\, a_{ij}, \qquad (14)$$
where
$$(b_1, b_2, \dots, b_6) = \left(\frac{1}{6}, \frac{5}{24}, \frac{11}{120}, \frac{19}{720}, \frac{29}{5040}, \frac{1}{840}\right). \qquad (15)$$
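Counting the run-ups $r_1, \dots, r_6$ is the mechanical part of this test and can be sketched as follows. The function name `count_run_ups` is chosen for this illustration, and the sketch assumes that the final, possibly unfinished, run is counted as well; the statistic R itself is not computed here because it also needs the coefficients $a_{ij}$.

```python
def count_run_ups(u, max_len=6):
    """Return [r_1, ..., r_max_len]; runs of length >= max_len go into the last slot."""
    counts = [0] * max_len
    if not u:
        return counts
    length = 1                       # length of the run-up currently being built
    for prev, cur in zip(u, u[1:]):
        if cur > prev:
            length += 1              # the run-up continues
        else:
            counts[min(length, max_len) - 1] += 1
            length = 1               # a new run-up starts at `cur`
    counts[min(length, max_len) - 1] += 1   # close the last run-up
    return counts
```

For the example above, `count_run_ups([0.78, 0.81, 0.89, 0.81])` finds one run-up of length 3 followed by one of length 1.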
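The coefficients $a_{ij}$ must be taken as the elements of a fixed 6 x 6 matrix (the matrix is not recoverable from this text).
The statistic R has a $\chi^2$ distribution:
- number of degrees of freedom: 6;
- valid for n > 4000.
If R does not exceed the tabulated critical value, the observations can be treated as i.i.d.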
4.2. Independence: correlation test
Basic idea:
- compute the autocorrelation coefficient for lag 1;
- if it is not zero and this result is statistically significant, the numbers are not independent.
Compute the statistic (lag-1 autocorrelation coefficient) as:
$$R = \frac{\sum_{j=1}^{N-1} (u_j - E[u])(u_{j+1} - E[u])}{\sum_{j=1}^{N} (u_j - E[u])^2}. \qquad (16)$$
Practice: if R is relatively big, there is serial correlation.
Important notes:
- the exact distribution of R is unknown;
- for large N, if the $u_j$ are uncorrelated, we have $\Pr\{-2/\sqrt{N} \le R \le 2/\sqrt{N}\} \approx 0.95$;
- therefore: reject the hypothesis of no correlation at the 5% level if R is not in $[-2/\sqrt{N}, 2/\sqrt{N}]$.
Notes: other tests for correlation include the Ljung and Box test, the Portmanteau test, etc.
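The lag-1 statistic and the $\pm 2/\sqrt{N}$ decision rule above can be sketched as follows. The function names are hypothetical, and the sample mean is used in place of $E[u]$, which is an assumption of this sketch.

```python
import math


def lag1_autocorrelation(u):
    """Return the lag-1 autocorrelation coefficient of the sequence u."""
    n = len(u)
    mean = sum(u) / n                                  # sample mean stands in for E[u]
    numerator = sum((u[j] - mean) * (u[j + 1] - mean) for j in range(n - 1))
    denominator = sum((x - mean) ** 2 for x in u)
    return numerator / denominator


def passes_correlation_test(u):
    """Return True if R lies within +/- 2/sqrt(N), i.e. no correlation is detected at the 5% level."""
    return abs(lag1_autocorrelation(u)) <= 2.0 / math.sqrt(len(u))
```

A strictly alternating sequence such as 0, 1, 0, 1, ... gives R close to -1 and is rejected, as expected for strongly dependent numbers.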
4.3. Uniformity: $\chi^2$ test
The algorithm:
- divide [0, 1] into k, k > 100, non-overlapping intervals;
- count the number of observations falling into each interval, $f_i$:
  - ensure that there are enough numbers to get $f_i > 5$, $i = 1, 2, \dots, k$;
  - the values $f_i$, $i = 1, 2, \dots, k$, are called observed values;
- if the observations are truly uniformly distributed, then:
  - these values should be equal to $r_i = n/k$, $i = 1, 2, \dots, k$;
  - these values are called theoretical values;
- compute the $\chi^2$ statistic for the uniform distribution:
$$\chi^2 = \frac{k}{n} \sum_{i=1}^{k} \left(f_i - \frac{n}{k}\right)^2, \qquad (17)$$
which must have k - 1 degrees of freedom.
Hypotheses:
- $H_0$: observations are uniformly distributed;
- $H_1$: observations are not uniformly distributed.
$H_0$ is rejected if:
- the computed value of $\chi^2$ is greater than the one obtained from the tables;
- you should check the table entry with k - 1 degrees of freedom and the $1 - \alpha$ quantile, where $\alpha$ is the level of significance.
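The statistic (17) can be computed with a few lines of code. The sketch below uses a hypothetical function name, `chi_square_uniform`, and leaves the comparison with the tabulated critical value to the reader, as in the procedure above.

```python
def chi_square_uniform(u, k):
    """Return the chi-square statistic of the numbers u in [0, 1] for k equal intervals."""
    n = len(u)
    counts = [0] * k
    for x in u:
        # Interval index of x; the value x = 1.0 is put into the last interval.
        counts[min(int(x * k), k - 1)] += 1
    expected = n / k                                   # theoretical value n/k
    return (k / n) * sum((f - expected) ** 2 for f in counts)
```

A perfectly even spread of numbers gives a statistic of 0, while numbers concentrated in one interval give a large value that exceeds any reasonable critical value.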
4.4. Kolmogorov test
Facts about this test:
- compares the empirical distribution with the theoretical one;
- empirical: $F_N(x)$ is the number of observations smaller than or equal to x, divided by N;
- theoretical: uniform distribution in (0, 1): $F(x) = x$, $0 < x < 1$.
Hypotheses:
- $H_0$: $F_N(x)$ follows $F(x)$;
- $H_1$: $F_N(x)$ does not follow $F(x)$.
Statistic: maximum absolute difference over the range:
$$R = \max_x |F(x) - F_N(x)|. \qquad (18)$$
- if $R > R_\alpha$: $H_0$ is rejected;
- if $R \le R_\alpha$: $H_0$ is accepted.
Note: use tables for N and $\alpha$ (significance level) to find $R_\alpha$.
Example: we got 0.44, 0.81, 0.14, 0.05, 0.93:
- $H_0$: the random numbers follow the uniform distribution;
- we have to compute (R(j) are the numbers sorted in increasing order, n = 5, and a dash marks a negative difference):

  R(j)              0.05   0.14   0.44   0.81   0.93
  j/n               0.20   0.40   0.60   0.80   1.00
  j/n - R(j)        0.15   0.26   0.16   -      0.07
  R(j) - (j-1)/n    0.05   -      0.04   0.21   0.13

- compute the statistic as: $R = \max |F(x) - F_N(x)| = 0.26$;
- from tables: for $\alpha = 0.05$, $R_\alpha = 0.565 > R$;
- $H_0$ is accepted: the random numbers are distributed uniformly in (0, 1).
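The two rows of differences in the example correspond to the largest gap above and below the empirical distribution function. A minimal sketch of that computation, with the hypothetical function name `kolmogorov_statistic`, is:

```python
def kolmogorov_statistic(u):
    """Return max |F(x) - F_N(x)| for the uniform distribution F(x) = x on (0, 1)."""
    xs = sorted(u)
    n = len(xs)
    # Row "j/n - R(j)": the empirical cdf just after each point minus the point.
    d_plus = max((j + 1) / n - x for j, x in enumerate(xs))
    # Row "R(j) - (j-1)/n": the point minus the empirical cdf just before it.
    d_minus = max(x - j / n for j, x in enumerate(xs))
    return max(d_plus, d_minus)
```

For the five numbers of the example, `kolmogorov_statistic([0.44, 0.81, 0.14, 0.05, 0.93])` gives 0.26 up to floating-point rounding, which is then compared with the tabulated $R_\alpha$.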
4.5. Other tests
The serial test:
- consider pairs $(u_1, u_2), (u_3, u_4), \dots, (u_{2N-1}, u_{2N})$;
- count how many observations fall into each of $N^2$ different subsquares of the unit square;
- apply the $\chi^2$ test to decide whether they follow the uniform distribution;
- one can formulate an M-dimensional version of this test.
The permutation test:
- look at k-tuples: $(u_1, \dots, u_k), (u_{k+1}, \dots, u_{2k}), \dots, (u_{(N-1)k+1}, \dots, u_{Nk})$;
- in a k-tuple there are k! possible orderings;
- in a k-tuple all orderings are equally likely;
- determine the frequencies of the orderings in the k-tuples;
- apply the $\chi^2$ test to decide whether they follow the uniform distribution.
The gap test:
- let J be some fixed subinterval in (0, 1);
- if $u_{n+j} \notin J$ for $0 \le j < k$, and both $u_{n-1} \in J$ and $u_{n+k} \in J$, we say that there is a gap of length k.
$H_0$: the numbers are independent and uniformly distributed in (0, 1):
- the gap length must be geometrically distributed with some parameter p;
- p is the length of the interval J:
$$\Pr\{\text{gap of length } k\} = p(1 - p)^k. \qquad (19)$$
Practice:
- we observe a large number of gaps, say N;
- choose an integer h and count the number of gaps of length 0, 1, ..., h - 1 and of length at least h;
- apply the $\chi^2$ test to decide whether the numbers are independent and follow the uniform distribution.
4.6. Important notes
Some important notes on the seed number:
- do not use seed 0;
- avoid even values;
- do not use the same sequence for different purposes in a single simulation run.
Note: these instructions may not be applicable to a particular generator.
General notes:
- some common generators have been found to be inadequate;
- even if a generator has passed the tests, some underlying pattern might still be undetected;
- if the task is important, use a composite generator.