CPSC 531: Random Numbers. Jonathan Hudson Department of Computer Science University of Calgary

Similar documents
Systems Simulation Chapter 7: Random-Number Generation

Slides 3: Random Numbers

UNIT 5:Random number generation And Variation Generation

Sources of randomness

Random Number Generation. CS1538: Introduction to simulations

B.N.Bandodkar College of Science, Thane. Random-Number Generation. Mrs M.J.Gholba

Independent Events. Two events are independent if knowing that one occurs does not change the probability of the other occurring

IE 303 Discrete-Event Simulation L E C T U R E 6 : R A N D O M N U M B E R G E N E R A T I O N

EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 19

Outline. Simulation of a Single-Server Queueing System. EEC 686/785 Modeling & Performance Evaluation of Computer Systems.

B. Maddah ENMG 622 Simulation 11/11/08

2008 Winton. Review of Statistical Terminology

Uniform Random Number Generators

How does the computer generate observations from various distributions specified after input analysis?

How does the computer generate observations from various distributions specified after input analysis?

Lehmer Random Number Generators: Introduction

Uniform random numbers generators

Review of Statistical Terminology

H 2 : otherwise. that is simply the proportion of the sample points below level x. For any fixed point x the law of large numbers gives that

Topics in Computer Mathematics

Pseudo-Random Numbers Generators. Anne GILLE-GENEST. March 1, Premia Introduction Definitions Good generators...

Generating Uniform Random Numbers

Stream Ciphers. Çetin Kaya Koç Winter / 20

Class 12. Random Numbers

Stochastic Simulation of

Statistics, Data Analysis, and Simulation SS 2013

Generating Uniform Random Numbers

Generating Uniform Random Numbers

Generating pseudo- random numbers

Chapter 4: Monte Carlo Methods. Paisan Nakmahachalasint

Uniform Random Binary Floating Point Number Generation

Tae-Soo Kim and Young-Kyun Yang

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapter 7 Random Numbers

Random Number Generation. Stephen Booth David Henty

Stochastic Simulation of Communication Networks

Dr. Maddah ENMG 617 EM Statistics 10/15/12. Nonparametric Statistics (2) (Goodness of fit tests)

Random Number Generators

Random processes and probability distributions. Phys 420/580 Lecture 20

A TEST OF RANDOMNESS BASED ON THE DISTANCE BETWEEN CONSECUTIVE RANDOM NUMBER PAIRS. Matthew J. Duggan John H. Drew Lawrence M.

Workshop on Heterogeneous Computing, 16-20, July No Monte Carlo is safe Monte Carlo - more so parallel Monte Carlo

14 Random Variables and Simulation

Random number generators

Uniform and Exponential Random Floating Point Number Generation

ISyE 6644 Fall 2014 Test #2 Solutions (revised 11/7/16)

2 P. L'Ecuyer and R. Simard otherwise perform well in the spectral test, fail this independence test in a decisive way. LCGs with multipliers that hav

CPSC 467: Cryptography and Computer Security

Algorithms and Networking for Computer Games

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

Department of Electrical- and Information Technology. ETS061 Lecture 3, Verification, Validation and Input

METHODS FOR GENERATION OF RANDOM NUMBERS IN PARALLEL STOCHASTIC ALGORITHMS FOR GLOBAL OPTIMIZATION

Tables Table A Table B Table C Table D Table E 675

Pseudo-Random Generators

Pseudo-Random Generators

Notation Precedence Diagram

Topics. Pseudo-Random Generators. Pseudo-Random Numbers. Truly Random Numbers

Stochastic Simulation

Topic Contents. Factoring Methods. Unit 3: Factoring Methods. Finding the square root of a number

Section 2.1: Lehmer Random Number Generators: Introduction

Monte Carlo Techniques

A Repetition Test for Pseudo-Random Number Generators

Probability Methods in Civil Engineering Prof. Dr. Rajib Maity Department of Civil Engineering Indian Institute of Technology, Kharagpur

Random numbers and generators

Continuing discussion of CRC s, especially looking at two-bit errors

So far we discussed random number generators that need to have the maximum length period.

( x) ( ) F ( ) ( ) ( ) Prob( ) ( ) ( ) X x F x f s ds

2 Random Variable Generation

Random number generators and random processes. Statistics and probability intro. Peg board example. Peg board example. Notes. Eugeniy E.

Generating Random Variables

Chapter 9 Mathematics of Cryptography Part III: Primes and Related Congruence Equations

Practice Problems Section Problems

Random Number Generators - a brief assessment of those available

Computer Applications for Engineers ET 601

Lecture 20. Randomness and Monte Carlo. J. Chaudhry. Department of Mathematics and Statistics University of New Mexico

Statistical Methods for Astronomy

A NEW RANDOM NUMBER GENERATOR USING FIBONACCI SERIES

Summary of Chapters 7-9

Modeling and Performance Analysis with Discrete-Event Simulation

Bivariate Paired Numerical Data

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

A fast random number generator for stochastic simulations

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama

CSCE 564, Fall 2001 Notes 6 Page 1 13 Random Numbers The great metaphysical truth in the generation of random numbers is this: If you want a function

1 Degree distributions and data

Random Number Generation: RNG: A Practitioner s Overview

EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Lecture 18

2008 Winton. Statistical Testing of RNGs

Lecture 11 - Basic Number Theory.

Transformations and Expectations

Pseudo-random Number Generation. Qiuliang Tang

Math 152. Rumbos Fall Solutions to Exam #2

Computer simulation on homogeneity testing for weighted data sets used in HEP

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

Statistical Testing of Randomness

Random number generation

Logistic Regression Analysis

IE 581 Introduction to Stochastic Simulation. One page of notes, front and back. Closed book. 50 minutes. Score

Title:Varies statistical test of pseudorandom number generator

One-Sample Numerical Data

Distribution Fitting (Censored Data)

Transcription:

CPSC 531: Random Numbers Jonathan Hudson Department of Computer Science University of Calgary http://www.ucalgary.ca/~hudsonj/531f17

Introduction In simulations, we generate random values for variables with a specified distribution E.g., model service times using the exponential distribution Generation of random values is a two step process 1. Random number generation: Generate random numbers uniformly distributed between 0 and 1 2. Random variate generation: Transform the above generated random numbers to obtain numbers satisfying the desired distribution

Pseudo Random Numbers Common pseudo random number generators determine the next random number as a function of the previously generated random number (i.e., recursive calculations are applied) x n = f(x n 1, x n 2, x x 3, ) Random numbers generated, are therefore, deterministic. That is, sequence of random numbers is known a priori (BEFORE) given the starting number (called the seed). For this reason, random numbers are known as pseudo random. True random number generator s would produce numbers that are independent of those previous We can determine quality of uniformity and independence of pseudo RNG with statistical tests

A Sample Generator x n = 5x n 1 + 1 mod 16

A Sample Generator x n = 5x n 1 + 1 mod 16 Starting with x 0 = 5: The first 32 numbers obtained by the above procedure 10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5.

A Sample Generator x n = 5x n 1 + 1 mod 16 Starting with x 0 = 5: The first 32 numbers obtained by the above procedure 10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5. By dividing x's by 16: 0.6250, 0.1875, 0.0000, 0.0625, 0.3750, 0.9375, 0.7500, 0.8125, 0.1250, 0.6875, 0.5000, 0.5625, 0.8750, 0.4375, 0.2500, 0.3125, 0.6250, 0.1875, 0.0000, 0.0625, 0.3750, 0.9375, 0.7500, 0.8125, 0.1250, 0.6875, 0.5000, 0.5625, 0.8750, 0.4375, 0.2500, 0.3125.

A Sample Generator x n = 5x n 1 + 1 mod 16 Starting with x 0 = 5: The first 32 numbers obtained by the above procedure 10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5. The length of the sequence before full repetition is known as the cycle length (period) This example has a period of 16 Some generators do not repeat an initial portion of the sequence referred to as the tail of the sequence

Desirable Properties Random number generation routines should be: Computationally efficient Portable Have sufficiently long cycle Replicable (given the same seed) Helps program debugging Helpful when comparing alternative system design Should have provision to generate several streams of random numbers Closely approximate the ideal statistical properties of uniformity and independence.

Linear Congruential Generator (LCG) Commonly used algorithm A sequence of integers x 1, x 2, between 0 and m-1 is generated according to x = (a x i 1 + c) mod m where multiplier a and increment c are constants, m is the modulus and x 0 is the seed (or starting value) Random numbers u 1, u 2, are given by u i = X i m i = 1,2, The sequence can be reproduced if the seed is known

More on LCG Selection of the values of a, c, m, and X 0 affects the statistical properties of the generator and its cycle length. If c = 0, the generator is called Multiplicative LCG. (Ex Lehmer page 39) x n = 5 x n 1 mod 2 5 If c 0, the generator is called Mixed LCG x n = ( 2 34 + 1 x n 1 + 1) mod 2 35

Even more on LCG Can have at most m distinct integers in the sequence As soon as any number in the sequence is repeated, the whole sequence is repeated Period: number of distinct integers generated before repetition occurs Problem: Instead of continuous, the u i s can only take on discrete values 0, 1/m, 2/m,, (m-1)/m Solution: m should be selected to be very large in order to achieve the effect of a continuous distribution (typically, m > 10 9 ) Most digital computers use a binary representation of numbers Speed and efficiency are aided by a modulus, m, to be (or close to) a power of 2

Seed Selection Using a seed of x 0 = 1: With x 0 = 2: Possible period 32 x n = 5 x n 1 mod 2 5 5, 25, 29, 17, 21, 9, 13, 1, 5, Period = 8 10, 18, 26, 2, 10, Period is only 4 Note: Full period is a nice property but uniformity and independence are more important

Seed Selection Seed selection Any value in the sequence can be used to seed the generator Do not use random seeds: such as the time of day Cannot reproduce. Cannot guarantee non-overlap. Do not use zero: Fine for mixed LCGs But multiplicative LCGs will stuck at zero Avoid even values: For multiplicative LCG with modulus m=2 k, the seed should be odd Do not use successive seeds May result in strong correlation

Example RNGs A currently popular multiplicative LCG is: x n = 7 5 x n 1 mod(2 31 1) 2 31-1 is a prime number and 7 5 is a primitive root of it Full period of 2 31-2. This generator has been extensively analyzed and shown to be rather good Modulus is largest 32 bit integer prime a = 7 5 mod 2 31 1 = 16807 a = 48271 has been shown to generate slightly more random sequences

Myths About Random-Number Generation A complex set of operations leads to random results. It is better to use simple operations that can be analytically evaluated for randomness. Random numbers are unpredictable. Easy to compute the parameters, a, c, and m from a few numbers => LCGs are unsuitable for cryptographic applications

Myths (Cont) Some seeds are better than others. May be true for some. x n = 9806 x n 1 + 1 mod (2 17 1) Works correctly for all seeds except x 0 = 37911 Stuck at x n = 37911 forever Such generators should be avoided Any nonzero seed in the valid range should produce an equally good sequence Generators whose period or randomness depends upon the seed should not be used, since an unsuspecting user may not remember to follow all the guidelines 2 17 1 = 131071

Myths (Cont) Accurate implementation is not important. RNGs must be implemented without any overflow or truncation For example: x n = 1103515245x n 1 + 12345 mod 2 31 Straightforward multiplication above may produce overflow. 2 31 = 2147483648

Testing Random Number Generators Two categories of test Test for uniformity Test for independence Passing a test is only a necessary condition and not a sufficient condition i.e., if a generator fails a test it implies it is bad but if a generator passes a test it does not necessarily imply it is good.

More on Testing... Testing is not necessary if a well-known simulation package is used or if a well-tested generator is used In what follows, we focus on empirical tests, that is tests that are applied to an actual sequence of random numbers Chi-Square Test KS Test

Chi-Square Test Prepare a histogram of the empirical data with k cells Let O i and E i be the observed and expected frequency of the ith cell, respectively. Compute the following: k X 0 2 = i=1 O i E i 2 X 0 2 has a Chi-Square distribution with (k-1) degrees of freedom E i

Chi-Square Test (continued...) Define a null hypothesis, H(0), that observations come from a specified distribution The null hypothesis cannot be rejected at a significance level of α if X 2 2 0 < X [1 α,k s 1] meaning of significance level α = P reject H 0 H 0 is true) s is number parameters in the distribution s = 1 poisson s = 2 normal There is a Chi-Square table that comparison can be made to

Chi-Square Test Example Interval Oi Ei Chi-Sq 1 50 50 0 2 48 50 0.08 3 49 50 0.02 4 42 50 1.28 5 52 50 0.08 6 45 50 0.5 7 63 50 3.38 8 54 50 0.32 9 50 50 0 10 47 50 0.18 500 5.84 Example: 500 random numbers generated using a random number generator; observations categorized into cells at k = 10 intervals of 0.1, between 0 and 1. At level of significance of 0.1, are these numbers IID U(0,1)? X 0 2 = 5.84 2 Chi-Sq table X [0.9,9] = 14.68 Hypothesis accepted at significance level of 0.10.

More on Chi-Square Test Errors in cells with small E i s affect the test statistics more than cells with large E i s. Minimum size of E i debated recommends a value of 3 or more; if not combine adjacent cells. Test designed for discrete distributions and large sample sizes only. For continuous distributions, Chi-Square test is only an approximation (i.e., level of significance holds only for n ).

Kolmogorov-Smirnov (KS) Test Difference between observed CDF F 0 (x) and expected CDF F e (x) should be small; formalizes the idea behind the Q-Q plot. Step 1: Rank observations from smallest to largest: Step 2: Define F o x = (#i: Y i x)/n Number of samples <= x / n Step 3: Compute K as follows: K = max x F e x F 0 (x) K = max 1 j n {j n F e Y j, F e Y j Y 1 Y 2 Y 3 Y n j 1 n }

Kolmogorov-Smirnov (KS) Test Yj j j n F j 1 e Y j F e Y j n 5 1 0.017896 0.048771 6 2 0.075098-0.00843 6 3 0.141765-0.0751 17 4 0.110331-0.04366 25 5 0.112134-0.04547 39 6 0.077057-0.01039 60 7 0.015478 0.051188 61 8 0.076684-0.01002 72 9 0.086752-0.02009 74 10 0.143781-0.07711 104 11 0.086788-0.02012 150 12 0.02313 0.043537 170 13 0.04935 0.017316 195 14 0.075607-0.00894 229 15 0.101266-0.0346 MAX 0.143781 0.051188 Example: Test if given population is exponential with parameter β = 0.01; that is F e x = 1 e βx ; K [0.9,15] = 1.0298. Max is less so observations pass test.

Vs. K-S Test Small Samples Continuous Distributions Differences between observed and expected cumulative probabilities Uses each observation in the sample without any grouping Cell size is not a problem Exact Chi-Square Test Large Samples Discrete Distributions Differences between observed and hypothesized probabilities Groups observations into small number of cells Cells sizes affect the conclusion but no firm guidelines Approximate