So far we have discussed random number generators that need to have the maximum length period. Even the increment series has the maximum length period, yet it is by no means random. How do we decide whether a generator is adequate as a random number generator?

Judging by eye, we tend to avoid sequences that seem non-random, such as pairs of equal adjacent digits. Yet even a random sequence will start appearing to have patterns after a while. Consider $\pi = 3.14159265358979323846264338327950$: the pair 26 repeats after a while, and its second appearance is in the middle of another pattern. In fact Dr. Matrix found dozens of properties that were observed in the digits of $\pi$.

We also notice patterns in numbers used in everyday life to aid our memory; thus human judgment is not a good tool for analysing random number generators. We rely instead on a number of statistical tests. If a random number generator passes a number of tests, it might still fail the next one. Yet each success gives us more confidence in its randomness.

Joseph Cordina
There are two kinds of tests:
Empirical tests, where a generated sequence is analysed and one derives a statistical score for the sequence.
Theoretical tests, where certain properties are derived from the recurrence rule used to form the sequence.

Let us examine the most basic of statistical tests, the chi-square test ($\chi^2$ test). Assume two fair dice. The total $s$ of a throw takes the following values with the following probabilities:

value of s        2     3     4     5     6     7     8     9     10    11    12
probability p_s   1/36  1/18  1/12  1/9   5/36  1/6   5/36  1/9   1/12  1/18  1/36

If we throw the dice $n$ times, we should get the value $s$ approximately $np_s$ times on average. In 144 throws one expects to get the value 4 about 12 times. Let us assume we throw the dice 144 times and record the outcome:

value of s        2     3     4     5     6     7     8     9     10    11    12
observed Y_s      2     4     10    12    22    29    21    15    14    9     6
expected np_s     4     8     12    16    20    24    20    16    12    8     4

As one would expect, the observed values differ from the expected ones. In fact, if all 144 throws gave $s = 2$, one would be convinced the dice are not fair, even though this outcome is still possible with fair dice. Using these values one can devise a probabilistic test, i.e. a test of how probable certain throws are on average.
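The probabilities $p_s$ and the expected counts $np_s$ can be checked by enumerating the 36 equally likely outcomes of two fair dice. The following is a minimal sketch; the function name `dice_total_probabilities` is an illustrative choice and does not come from the notes.

```python
from fractions import Fraction
from itertools import product

def dice_total_probabilities(sides=6):
    """Return {s: p_s} for the total s of two fair dice, as exact fractions."""
    counts = {}
    for a, b in product(range(1, sides + 1), repeat=2):
        counts[a + b] = counts.get(a + b, 0) + 1
    total = sides * sides  # number of equally likely outcomes
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}

p = dice_total_probabilities()
n = 144  # number of throws used in the example
expected = {s: n * p_s for s, p_s in p.items()}  # expected counts n * p_s

print(p[7])         # 1/6
print(expected[4])  # 12
```

Exact fractions are used so that the expected counts come out as whole numbers rather than rounded floats.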
One can look at the difference between the expected number of throws and the actual number of throws for each total, thus

$$V = (Y_2 - np_2)^2 + (Y_3 - np_3)^2 + \cdots + (Y_{12} - np_{12})^2 \qquad (1)$$

A high value of $V$ would indicate that something is wrong. Note that $V$ can still be high for a fair set of dice, and thus we ask how probable a certain value of $V$ is.

In (1) we give equal weight to each of the terms, even though $(Y_7 - np_7)^2$ is likely to be larger than $(Y_2 - np_2)^2$, since a total of 7 occurs more often than a total of 2. Thus we weight each component to give

$$V = \frac{(Y_2 - np_2)^2}{np_2} + \cdots + \frac{(Y_{12} - np_{12})^2}{np_{12}} \qquad (2)$$

This is known as the chi-square ($\chi^2$) statistic. For the experiment shown previously we find that $V = 7\frac{7}{48}$. Yet is this value high or low?
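The weighted sum in Eq. (2) is easy to evaluate mechanically. The sketch below recomputes $V$ for the 144-throw experiment with exact fractions; the function name `chi_square_statistic` is an assumption made for illustration.

```python
from fractions import Fraction

def chi_square_statistic(observed, probabilities):
    """Sum of (Y_s - n*p_s)^2 / (n*p_s) over all categories s."""
    n = sum(observed)  # total number of observations
    return sum((y - n * p) ** 2 / (n * p)
               for y, p in zip(observed, probabilities))

# Totals s = 2..12 of two fair dice: p_s = (number of ways to throw s) / 36.
probabilities = [Fraction(w, 36) for w in (1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1)]
# Observed counts Y_s from the 144-throw experiment.
observed = [2, 4, 10, 12, 22, 29, 21, 15, 14, 9, 6]

V = chi_square_statistic(observed, probabilities)
print(V)  # 343/48, i.e. 7 7/48
```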
In general we take $n$ independent observations. Let $p_s$ be the probability that each observation falls into category $s$, and let $Y_s$ be the number of observations that actually do fall into category $s$. Then we have

$$V = \sum_{s=1}^{k} \frac{(Y_s - np_s)^2}{np_s} \qquad (3)$$

In our previous example we had 11 possible outcomes, so $k = 11$. Expanding $(Y_s - np_s)^2 = Y_s^2 - 2np_sY_s + n^2p_s^2$ and knowing that

$$Y_1 + Y_2 + \cdots + Y_k = n, \qquad p_1 + p_2 + \cdots + p_k = 1,$$

we get

$$V = \frac{1}{n}\sum_{s=1}^{k} \left(\frac{Y_s^2}{p_s}\right) - n \qquad (4)$$

Yet what makes a reasonable value for $V$? We make use of tables that give the chi-square distribution with $\nu$ degrees of freedom for various values of $\nu$. Note that the value $\nu = k - 1$ should be used, since, as seen before, one can calculate the value of $Y_k$ if one knows all the other values. Thus one only has $k - 1$ independent values.
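The step from Eq. (3) to Eq. (4) can be written out in full. Dividing the expanded square by $np_s$, summing over $s$, and then using the two identities for the sums of the $Y_s$ and of the $p_s$ gives:

```latex
\begin{aligned}
V &= \sum_{s=1}^{k} \frac{Y_s^2 - 2np_sY_s + n^2p_s^2}{np_s} \\
  &= \frac{1}{n}\sum_{s=1}^{k}\frac{Y_s^2}{p_s} \;-\; 2\sum_{s=1}^{k} Y_s \;+\; n\sum_{s=1}^{k} p_s \\
  &= \frac{1}{n}\sum_{s=1}^{k}\frac{Y_s^2}{p_s} - 2n + n
   \;=\; \frac{1}{n}\sum_{s=1}^{k}\frac{Y_s^2}{p_s} - n .
\end{aligned}
```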
If the table entry in row $\nu$ under column $p$ is $x$, then the quantity $V$ in Eq. (4) will be less than or equal to $x$ with approximate probability $p$, if $n$ is large enough. Thus if the entry in row 10 under 95 percent is 18.31, then we will have $V > 18.31$ only about 5 percent of the time.

Assume we have two random number generators tested with 10 degrees of freedom, giving $V_1 = 29.5$ and $V_2 = 1.2$. Referring to the table we see that $V_1$ is way too high, since $V$ will be larger than 25.2 only about 0.5 percent of the time. On the other hand, $V_2$ is quite low, meaning that the resulting values are too close to the expected ones. In fact we cannot consider these values to be random at all.

For the previous experiment with $V = 7\frac{7}{48}$ we see that the value falls between the 25 and 50 percent entries, which is a mid-range value and thus satisfactory.

It is surprising that the same table is used regardless of the number of observations $n$ and regardless of the $p_s$. Only $\nu = k - 1$ seems to matter. The table is, however, valid only for a large number of observations. A common rule of thumb is to take enough observations that each $np_s$ is 5 or more.
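When no printed table is at hand, the percentage point of a given $V$ can be computed directly. The sketch below is restricted to an even number of degrees of freedom, where the chi-square distribution function has the closed form $1 - e^{-x/2}\sum_{j=0}^{\nu/2-1}(x/2)^j/j!$; this restriction and the function name `chi_square_cdf_even` are assumptions made to keep the example within the standard library, and odd values of $\nu$ would need a general incomplete gamma routine instead.

```python
import math

def chi_square_cdf_even(x, nu):
    """P(V <= x) for a chi-square variable with an even number nu of degrees of freedom."""
    if nu <= 0 or nu % 2 != 0:
        raise ValueError("this closed form only covers even, positive nu")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term = 1.0   # holds (x/2)^j / j!, starting at j = 0
    total = 0.0
    for j in range(nu // 2):
        total += term
        term *= half / (j + 1)
    return 1.0 - math.exp(-half) * total

# The three values discussed for 10 degrees of freedom.
for v in (18.31, 343 / 48, 29.5):
    print(f"V = {v:.3f}: P(V <= v) is about {chi_square_cdf_even(v, 10):.4f}")
```

A result very close to 0 or to 1 corresponds to the "too low" and "too high" cases described above, while a mid-range result is the satisfactory one.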
In fact the proper choice of $n$, the number of observations, is somewhat obscure. A bias is more likely to be detected as $n$ gets larger. Yet large values of $n$ will smooth out locally non-random behaviour, where a run of numbers with a certain bias is followed by a run of numbers with the opposite bias.

The best way to use the chi-square test is to run it at least three times; if at least two of the results are suspect, the generator is considered not sufficiently random.

Exercise: Apply the chi-square test to the random number generators mentioned previously.

The chi-square test is adequate when the number of categories is known and finite. Yet one can have infinitely many possible values, such as a random fraction. A general notation for specifying probability distributions is to use the distribution function $F(x)$ of a random quantity $X$, where

$$F(x) = \Pr(X \le x) = \text{probability that } (X \le x)$$

Figure 3 shows the distribution function for (a) a random bit, (b) a uniformly distributed random real number and (c) the limiting distribution of the value $V$ in the chi-square test.
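The first two distribution functions of Figure 3 are simple enough to write down directly. This is a minimal sketch assuming the random bit is fair (0 and 1 equally likely) and the real number is uniform between 0 and 1; the function names are illustrative.

```python
def F_random_bit(x):
    """F(x) = Pr(X <= x) for a fair random bit X, which is 0 or 1."""
    if x < 0:
        return 0.0
    if x < 1:
        return 0.5  # only the outcome X = 0 is <= x
    return 1.0

def F_uniform(x):
    """F(x) = Pr(X <= x) for X uniformly distributed between 0 and 1."""
    return min(1.0, max(0.0, x))

print(F_random_bit(0.5), F_uniform(0.3))  # 0.5 0.3
```

The random bit gives a step function with two jumps, while the uniform case rises continuously from 0 to 1.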
If we make $n$ independent observations of the random quantity $X$, thus getting $X_1, X_2, \ldots, X_n$, we can form the empirical distribution function $F_n(x)$, where

$$F_n(x) = \frac{\text{number of } X_1, X_2, \ldots, X_n \text{ that are} \le x}{n} \qquad (5)$$

Figure 4 shows examples of empirical distributions. Note that with a finite number of samples, one is bound to get jumps in the curve. The smooth curve is the distribution function $F(x)$.

The Kolmogorov-Smirnov test (KS test) measures the difference between $F(x)$ and $F_n(x)$. A bad random number generator will give an empirical distribution function that does not approach $F(x)$ sufficiently well. To perform the KS test we use the following statistics:

$$K_n^+ = \sqrt{n} \max_{-\infty < x < +\infty} \bigl(F_n(x) - F(x)\bigr) \qquad (6)$$

$$K_n^- = \sqrt{n} \max_{-\infty < x < +\infty} \bigl(F(x) - F_n(x)\bigr) \qquad (7)$$

$K_n^+$ measures the greatest amount of deviation when $F_n$ is greater than $F$, and $K_n^-$ measures the greatest deviation when $F_n$ is less than $F$.
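The empirical distribution function of Eq. (5) can be sketched in a few lines. The helper name `empirical_distribution` and the four sample values are illustrative assumptions, not data from the notes.

```python
from bisect import bisect_right

def empirical_distribution(observations):
    """Return F_n as a function: F_n(x) = (number of observations <= x) / n."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")

    def F_n(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return F_n

F_4 = empirical_distribution([0.9, 0.4, 0.1, 0.4])
print(F_4(0.0), F_4(0.4), F_4(1.0))  # 0.0 0.75 1.0
```

The jump of size 2/4 at $x = 0.4$ shows how repeated observations produce larger steps in the curve.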
The statistics of Fig. 4 are

                Fig. 4(a)   Fig. 4(b)   Fig. 4(c)
$K_{20}^+$      0.492       0.134       0.313
$K_{20}^-$      0.536       1.027       2.101

As in the chi-square test, we make use of tables to determine the probability of obtaining certain values of $K$. We see that the probability that $K_{20}^-$ is 0.7975 or less is 75 percent. Note that the KS tables are exact for each number of observations $n$, unlike the chi-square table, which is only valid for large $n$.

Equations (6) and (7) are not adequate for computer calculation, since the maximum is taken over infinitely many values of $x$. Since $F(x)$ is increasing and $F_n(x)$ increases only in finite steps, we can use the following procedure to obtain the KS test values.

1. Obtain independent observations $X_1, \ldots, X_n$.
2. Sort the observations in ascending order, so that $X_1 \le \cdots \le X_n$.
3. The desired statistics are now given by

$$K_n^+ = \sqrt{n} \max_{1 \le j \le n} \left(\frac{j}{n} - F(X_j)\right) \qquad (8)$$

$$K_n^- = \sqrt{n} \max_{1 \le j \le n} \left(F(X_j) - \frac{j-1}{n}\right) \qquad (9)$$
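The three-step procedure can be sketched as follows. The function names `ks_statistics` and `uniform_cdf` are illustrative assumptions, and the sketch uses $(j-1)/n$ in $K_n^-$, which is the value of $F_n$ just below the $j$-th sorted observation.

```python
import math

def ks_statistics(observations, F):
    """Return (K_n^+, K_n^-) for the observations against the distribution function F."""
    xs = sorted(observations)  # step 2: sort in ascending order
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one observation")
    root_n = math.sqrt(n)
    # step 3: maximise over the n sorted observations only
    k_plus = root_n * max(j / n - F(x) for j, x in enumerate(xs, start=1))
    k_minus = root_n * max(F(x) - (j - 1) / n for j, x in enumerate(xs, start=1))
    return k_plus, k_minus

def uniform_cdf(x):
    """Distribution function of a uniform random real between 0 and 1."""
    return min(1.0, max(0.0, x))

k_plus, k_minus = ks_statistics([0.75, 0.25], uniform_cdf)
print(k_plus, k_minus)  # both equal sqrt(2) * 0.25
```

Passing `F` as a parameter keeps the routine usable for distributions other than the uniform one.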
Exercise: Perform the KS test on several of the random number generators mentioned previously and compare the results with the corresponding chi-square tests.

Exercise: Take the random number generator of your preferred programming language, use it with a finite number of degrees of freedom, and perform the chi-square and the KS tests on it.

Exercise: Some dice were loaded such that on one die the value 1 appears twice as often as each of the other numbers, and the other die is similarly biased towards 6. The following values were obtained:

value of s        2     3     4     5     6     7     8     9     10    11    12
observed Y_s      2     6     10    16    18    32    20    13    16    9     2

Apply the chi-square test to determine whether it can detect the bad dice. If not, give indications why not.

Exercise: Let $F(x)$ be the uniform distribution given in Fig. 3(b). Find $K_{20}^+$ and $K_{20}^-$ for the following 20 observations:

0.414, 0.732, 0.236, 0.162, 0.259, 0.442, 0.189, 0.693, 0.098, 0.302,
0.442, 0.434, 0.141, 0.017, 0.318, 0.869, 0.772, 0.678, 0.354, 0.718

and state whether these values are to be expected.
The KS test and the chi-square test are not normally used in isolation but as part of a number of empirical tests that evaluate random number sequences. What follows is a list of these tests. More details can be found in pages 61 to 75.

Equidistribution Test: This tests that a series of numbers from 0 to 1 is uniformly distributed.
Serial Test: This tests that pairs of successive numbers are uniformly distributed in the sequence in an independent manner. This test can also be applied to triples, quadruples, etc.
Gap Test: This examines the length of the gaps between occurrences of the same number.
Poker Test: This examines groups of five successive integers and observes the pattern they form.
Coupon Collection Test: This examines the length of the sequence required to obtain every value in a particular range.
Permutation Test: This examines the possible relative orderings of generated sequences.
Run Test: This examines whether the numbers are increasing or decreasing and partitions them into groups.
Maximum-of-t Test: This applies another test and selects the largest score over the sequences.
Collision Test: This is a test very similar to bucketing in hashing techniques.
Birthday Spacing Test: This is similar to the above, yet one applies an ordering sequence to the numbers.
Serial Correlation Test: This calculates the degree of dependency between one number and its predecessor.

Exercise: Read about these tests and experiment with their implementation for your preferred random number generator.
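As a starting point for experimenting, the serial correlation test from the list above can be sketched as follows. The sketch assumes the commonly used cyclic form of the coefficient, in which the last number is paired with the first, $C = \bigl(n\sum U_jU_{j+1} - (\sum U_j)^2\bigr) / \bigl(n\sum U_j^2 - (\sum U_j)^2\bigr)$; the notes do not give a formula, so both this form and the function name `serial_correlation` are assumptions.

```python
def serial_correlation(u):
    """Cyclic serial correlation coefficient between each number and its successor."""
    n = len(u)
    if n < 2:
        raise ValueError("need at least two numbers")
    sum_u = sum(u)
    sum_sq = sum(x * x for x in u)
    # pair each number with the next one, wrapping around at the end
    sum_adjacent = sum(u[j] * u[(j + 1) % n] for j in range(n))
    denominator = n * sum_sq - sum_u ** 2
    if denominator == 0:
        raise ValueError("coefficient is undefined when all numbers are equal")
    return (n * sum_adjacent - sum_u ** 2) / denominator

print(serial_correlation([0, 1, 0, 1]))  # -1.0
print(serial_correlation([1, 2, 3, 4]))  # -0.2
```

The coefficient always lies between -1 and 1; values near zero indicate little linear dependence between neighbours, while the strictly alternating sequence above gives the extreme value of -1.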