Topics in Computer Mathematics


Random Number Generation (Uniform random numbers)

Introduction

We frequently need some way to generate numbers that are random (by some criteria), especially in computer science. Simulations of natural phenomena, for instance, may require that events occur at unknown (i.e. random) points in time. Game programmers clearly need to program events that occur randomly, or to simulate dealing cards, throwing dice, or flipping coins.

In an ideal situation, the numbers in a random sequence are totally independent of each other. There are ways of generating such numbers in a computer, most of which require sensing some real-world phenomenon and converting it to a number. For example, if the computer has a clock which is incremented at a high rate of speed (say every microsecond), then one could extract some number of low-order bits from the clock, perhaps massage them in some way (such as by hashing), and use the result as the required random number.

If one is writing a computer program, however, the debugging process requires that we be able to reliably repeat any given situation exactly, in order to make sure the program behaves correctly. Therefore, we want to be able to produce sequences of numbers which are random, but reproducible. Such sequences are generated using a function which produces the next number in the sequence by applying some mathematical manipulation to the previous number, or even to several previously generated numbers, e.g.

    X_n = f(X_{n-1})

Such sequences can never be truly random, but they may be statistically random enough to suit a given application. Numbers generated in this way are known as pseudo-random numbers or pseudo-random sequences.

John von Neumann produced one of the earliest such functions, known as the "Middle Square Method": given an n-digit number, square it and take the middle n digits of the result to get the next random number. In general, multiplying two numbers together gives a result whose number of digits is the sum of the numbers of digits of the multiplier and the multiplicand, so squaring an n-digit number can produce a 2n-digit result. Unfortunately, this algorithm doesn't produce very good sequences, especially when the results start to have leading zeros - the sequence will end up generating only zeros. For instance, starting with the two-digit number 43:

    43^2 = 1849 => 84
    84^2 = 7056 => 05
    05^2 = 0025 => 02
    02^2 = 0004 => 00

and only 00s will be generated thereafter.
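The middle square step is easy to express in code. The following Python sketch (added here for illustration; it is not part of the original notes) reproduces the two-digit example above and shows the sequence collapsing to zero:

    def middle_square(x, digits=2):
        """One step of von Neumann's middle square method for a `digits`-digit value."""
        squared = str(x * x).zfill(2 * digits)      # pad the square out to 2n digits
        start = (len(squared) - digits) // 2        # index of the middle n digits
        return int(squared[start:start + digits])

    x = 43
    for _ in range(6):
        print(x)
        x = middle_square(x)
    # Prints 43, 84, 5, 2, 0, 0 - once the value collapses to 0 it stays there.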

Most high-level computer languages (HLLs) provide a built-in function (such as RND()) which produces random (i.e. pseudo-random) numbers. Most commonly, these functions produce a fraction X in the range 0 <= X < 1. Of course not every fraction is possible, but the number of distinct values that can be produced is very large and gives a very good approximation to uniform random numbers. (A random number sequence is uniform if its values are chosen from a set of numbers with a uniform probability distribution; that is, if there are a total of n possible numbers, then the k-th number drawn has probability 1/n of being any specific one of them, regardless of the values of the k-1 numbers already chosen.)

Of course, we rarely need fractions between 0 and 1. Suppose we want to simulate throwing a single die. Then we really want integers between 1 and 6, inclusive. We get them by scaling, together with a function such as INT() which extracts the integer portion of a real number. Suppose we let X = RND(), a real number with 0 <= X < 1. Then 6X is a real number with 0 <= 6X < 6, and 1 + 6X is a real number with 1 <= 1 + 6X < 7. Applying INT() to extract the integer portion then gives each of the integers 1 through 6 with equal probability. So we end up with

    Die Value = INT(1 + 6*RND())

But how do we get the random numbers between 0 and 1 to begin with? We will find a function that produces integers between 0 and m - 1, and divide by m. So, if X is a random integer between 0 and m - 1, then U = X/m is a random real number with 0 <= U < 1.
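In Python, the built-in random.random() plays the role of RND() here; a small sketch of the scaling trick (illustrative only):

    import random

    def rnd():
        """Stand-in for the RND() of the notes: a uniform real in [0, 1)."""
        return random.random()

    def die_value():
        """Scale a uniform [0, 1) value to an integer from 1 to 6."""
        return int(1 + 6 * rnd())

    rolls = [die_value() for _ in range(10)]
    print(rolls)    # e.g. [3, 6, 1, 4, 4, 2, 5, 1, 6, 3]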

Linear Congruence Method

The most well-known method for generating statistically random numbers is the Linear Congruence Method. In the simplest form of this method, the n-th number in the sequence is generated from the (n-1)-st number by the formula

    X_n = (a*X_{n-1} + c) mod m,   n > 0                 [R1]

where

    X_0 is the initial value used to start the sequence (called the seed),
    a is a multiplier,
    c is an increment, and
    m is a modulus.

X_0, a, and c are all >= 0, and m is greater than X_0, a, and c.

This function can yield very good random sequences provided X_0, a, c, and m are chosen correctly. E.g. choose X_0 = a = c = 7 and m = 10. Then the sequence is 7, 6, 9, 0, 7, 6, 9, 0, ... Since m is 10, we know that there are 10 possible numbers (0 through 9) which can be generated, yet this choice of parameters produced only four of them. Note also that the method has produced a loop. This method will always produce a loop - our goal should be to maximize the length (known as the period) of the sequence before it starts to repeat. For any m (10 in this example) we should expect to get, ideally, all m numbers (0 through m-1), in random order, before the loop repeats.
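To make the example concrete, here is a short Python sketch of [R1] (added for illustration; the function name is ours, not from the notes):

    def lcg(seed, a, c, m, count):
        """Generate `count` successive values of X_n = (a*X_{n-1} + c) mod m, starting from X_0 = seed."""
        x = seed
        values = [x]
        for _ in range(count):
            x = (a * x + c) % m
            values.append(x)
        return values

    print(lcg(seed=7, a=7, c=7, m=10, count=8))
    # [7, 6, 9, 0, 7, 6, 9, 0, 7] - the period-4 loop from the example above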

Why is this called a pseudo-random sequence rather than just a random sequence? Note that a true random number generator has no memory. That is, the value the random number takes on at any point in time is independent of any and all previous values that may have been generated. For instance, suppose we use a coin to generate the numbers 0 and 1 randomly (heads = 0, tails = 1). On average we expect to get about 50% 0's and 50% 1's, since on any given flip the chance of a 0 or a 1 is 1/2. Let's suppose that we get the following sequence after flipping the coin 10 times:

    1, 1, 1, 1, 1, 1, 1, 1, 1, 1

What will the value be on the eleventh flip? There is a psychological tendency to think that, since we have had such a long string of 1's, it must be time for a 0 to show up, and so a 0 must have a much greater probability of occurring than a 1 on the next flip (this kind of thinking is the downfall of many gamblers). But the coin is a pure random number generator with no memory of what has gone before. The probability of a 0 or a 1 on the eleventh flip is still 1/2, regardless of previous history.

This is not the case for the linear congruence method: once a number has been generated, it will not be generated again until the entire sequence of possible numbers has been generated. Thus, the sequence

    7, 6, 9, 9, 0, 6, 7, 0, 0

could never be generated, even though it may be perfectly random. For this reason, numbers generated by the linear congruence method are not used without further modification. Usually this takes the form of some kind of hash function to produce a more random sequence. For example, suppose I need a random sequence of two-digit numbers. Then I might use the LCM to generate a sequence of five-digit numbers and extract two of the five digits for my actual random numbers. (Note that extracting the two least significant digits is accomplished by a modulo operation with a modulus of 100.)

The parameters for the linear congruence method can all be chosen to maximize the period as well as the randomness of the sequence of numbers.

e. The parameter c affects the length of the period and in general should be non-zero. However, in the above example, if c = 0 instead of 7, the sequence becomes 7, 9, 3, 1, 7, 9, 3, 1, ... Note that the period is still only four.

f. The parameter a must be greater than 1. If a = 0, there is no dependence on X_{n-1} (c will be the only number generated), and if a = 1, then X_n = (X_0 + n*c) mod m, which depends only on X_0, not on X_{n-1}. In the example this gives the sequence 7, 4, 1, 8, 5, 2, 9, 6, 3, 0, 7, ... While this certainly has the maximum period, it would fail statistical tests for randomness (can you see why?).

g. The parameter m is chosen to tread a fine line between maximizing the randomness of the number sequence and keeping the computation fast. That is, m should be large, but the computation of (a*X_{n-1} + c) mod m must be fast. One way (and the most common) is to relate m to the word size, w, of the computer. Arithmetic modulo 2^w then amounts to simply keeping the low-order w bits of the result. Better results are achieved if we use m = 2^w +/- 1.

Can we guarantee that a period of length m, the maximum, is possible? Yes, if and only if

a. c is relatively prime to m,
b. a - 1 is a multiple of p for every prime p dividing m, and
c. a - 1 is a multiple of 4 if m is a multiple of 4.
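These three conditions (the conditions of the Hull-Dobell theorem) are straightforward to check mechanically. The following Python sketch (an illustration, not part of the original notes) tests them for a proposed choice of a, c, and m:

    from math import gcd

    def prime_factors(n):
        """Return the set of prime factors of n (simple trial division)."""
        factors, d = set(), 2
        while d * d <= n:
            while n % d == 0:
                factors.add(d)
                n //= d
            d += 1
        if n > 1:
            factors.add(n)
        return factors

    def has_full_period(a, c, m):
        """Check the three full-period conditions for X_n = (a*X_{n-1} + c) mod m."""
        if gcd(c, m) != 1:                                    # (a) c relatively prime to m
            return False
        if any((a - 1) % p != 0 for p in prime_factors(m)):   # (b) every prime factor of m divides a-1
            return False
        if m % 4 == 0 and (a - 1) % 4 != 0:                   # (c) if 4 divides m, 4 must divide a-1
            return False
        return True

    print(has_full_period(a=7, c=7, m=10))      # False - the example above had period 4, not 10
    print(has_full_period(a=21, c=7, m=10))     # True  - a-1 = 20 is divisible by both 2 and 5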

The conditions on a are satisfied if a = 2^k + 1 (2 < k < w) and c = 1. Then [R1] becomes

    X_n = ((2^k + 1)*X_{n-1} + 1) mod 2^w                [R2]

The implementation avoids multiplication - we can just use shifts and adds. Unfortunately, multipliers of the form a = 2^k + 1 do not give very good randomness, at least for small word sizes (say, less than 47 bits). Better results are obtained with something like

    a = 2^k1 + 2^k2 + 2^k3 + 1

Much better results can be achieved by choosing m to be a prime p. Then

    X_n = (a_1*X_{n-1} + ... + a_k*X_{n-k}) mod p         [R3]

will have a period of length p^k - 1. The constants a_i must be chosen so that

    x^k - a_1*x^{k-1} - ... - a_k

is a primitive polynomial - that is, it has a root which is a primitive element of the field with p^k elements. Finding such a_i is not easy.

Summary for building Linear Congruence Generators (Knuth)

1. X_0 is chosen arbitrarily, perhaps from the current date and time.
2. m should be large. For generating random numbers in computer hardware, take m = 2^w, where w is the computer's word size. This trivializes the modulo arithmetic.
3. a is chosen such that, if m is a power of two (as above), then a mod 8 = 5.
4. Also, we should choose a such that sqrt(m) < a < m - sqrt(m).
5. If m is a power of 2, c should be odd. Ideally, c/m should be approximately 1/2 - sqrt(3)/6 (about 0.21).
6. Use the high-order bits of X, i.e. the most significant digits of X/m.
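As a concrete illustration of this checklist (not from the notes), here is a small Python generator with m = 2^32. The multiplier and increment are the widely quoted "Numerical Recipes" constants, used here only because they meet the criteria above (a mod 8 = 5, sqrt(m) < a < m - sqrt(m), c odd); other good choices exist.

    class LCG32:
        """Illustrative 32-bit linear congruence generator following the checklist above."""
        M = 2 ** 32         # item 2: m = 2^w with w = 32, so "mod m" is just 32-bit truncation
        A = 1664525         # item 3: A mod 8 == 5; item 4: sqrt(M) < A < M - sqrt(M)
        C = 1013904223      # item 5: c is odd (m is a power of 2)

        def __init__(self, seed):
            self.x = seed % self.M          # item 1: X_0 chosen arbitrarily, e.g. from the clock

        def next_uniform(self):
            """Advance [R1] once and return X/m in [0, 1); its leading digits come from
            the high-order bits of X (item 6)."""
            self.x = (self.A * self.x + self.C) % self.M
            return self.x / self.M

    gen = LCG32(seed=12345)
    print([round(gen.next_uniform(), 4) for _ in range(5)])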

Testing for Randomness

Before a random number generator can be approved for use in some application, it should be tested to be sure that the numbers it produces are in fact random. There are many such tests, and several of them need to be applied before a generator can be certified usable. We will discuss just one of the tests here, the Chi-square test.

This test can be used when the RNG is to be used to generate results in some finite number, k, of categories. For instance, if we are going to use the RNG to simulate rolls of a pair of dice, the possible outcomes (categories) are 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, and 12 (a total of 11 possible outcomes). In addition, each outcome must have a known probability (in this case: 1/36, 1/18, 1/12, 1/9, 5/36, 1/6, 5/36, 1/9, 1/12, 1/18, 1/36).

To apply the Chi-square test, a large number of experiments, n, must be performed, and a count kept of how many times the outcome falls in each category. We then compare the results of this experiment with the predicted results based on the probabilities. We label the probabilities p_i, 1 <= i <= k. Then the predicted number of outcomes for category i in n experiments is n*p_i. For example, the probability of throwing a 3 with a pair of dice is 1/18, so if we throw the dice 1000 times, the expected number of times a 3 should appear is 1000 x 1/18 = 55.6. Let's further assume that in the actual experiment a 3 appeared 60 times. What can we say about the reliability of the dice?

The Chi-square test measures, for each category, the deviation of the observed count N_i from the predicted count n*p_i using the formula

    V_i = (N_i - n*p_i)^2 / (n*p_i)

The difference is squared to eliminate negative values of V_i. Finally, the Chi-square value, V, is obtained by summing the V_i over all k categories:

    V = sum of V_i = sum of (N_i - n*p_i)^2 / (n*p_i)

where the summation is over all i, 1 <= i <= k. Finally, the value obtained for V needs to be interpreted, and this is done by looking V up in a table of the Chi-square distribution (with k - 1 degrees of freedom).
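A Python sketch of the computation (illustrative only; the observed counts below are invented purely to show the arithmetic):

    # Probabilities for the totals 2..12 of a pair of fair dice, as listed above.
    probs = [1/36, 1/18, 1/12, 1/9, 5/36, 1/6, 5/36, 1/9, 1/12, 1/18, 1/36]

    def chi_square(observed, probs, n):
        """V = sum over all categories of (N_i - n*p_i)^2 / (n*p_i)."""
        return sum((N - n * p) ** 2 / (n * p) for N, p in zip(observed, probs))

    # Hypothetical counts from n = 1000 throws (invented for illustration).
    observed = [28, 60, 80, 110, 140, 170, 138, 112, 85, 50, 27]
    n = sum(observed)       # 1000 experiments
    print(round(chi_square(observed, probs, n), 2))
    # The result is then compared against a Chi-square table with k - 1 = 10 degrees of freedom.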

Generating a random sequence of binary digits

a. Double the number (shift left one bit).
b. If there was no overflow, use the low-order bit as the next random binary bit.
c. Otherwise, exclusive-or the result with a constant c, and use the low-order bit.

Algebraically, we generate a sequence of random numbers in a word of length w and use the low-order bit of each. The numbers of length w can be described as

    X_n = (2*X_{n-1}) mod 2^w           if 2*X_{n-1} < 2^w          [R4]
    X_n = (2*X_{n-1} XOR c) mod 2^w     if 2*X_{n-1} >= 2^w

The next binary digit is then the least significant bit of X_n, which can be expressed as

    b_n = X_n mod 2

This will produce a sequence of period 2^w - 1, where w is the word length and c contains the coefficients of a primitive polynomial modulo 2 (Knuth, p. 27).

Example: let the word length w = 4 and c = 0011. Then the sequence of 4-bit words generated by [R4] is

    1011 0101 1010 0111 1110 1111 1101 1001
    0001 0010 0100 1000 0011 0110 1100 1011

The starting word 1011 reappears after 15 steps, so the period is 2^4 - 1 = 15.
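A short Python sketch (added for illustration) of recurrence [R4]; it reproduces the 4-bit example above:

    def shift_xor_bits(x0, c, w, steps):
        """Generate (word, bit) pairs from the shift/XOR recurrence [R4]."""
        mask = (1 << w) - 1
        x = x0
        out = []
        for _ in range(steps):
            x <<= 1                      # double the number (shift left one bit)
            if x >= (1 << w):            # overflow out of the w-bit word:
                x = (x & mask) ^ c       #   keep the low w bits and exclusive-or with c
            out.append((format(x, f'0{w}b'), x & 1))
        return out

    # Word length w = 4 and c = 0011; after 15 steps the word returns to the start value 1011.
    for word, bit in shift_xor_bits(x0=0b1011, c=0b0011, w=4, steps=15):
        print(word, bit)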