PROBABILITY AND STATISTICS IN COMPUTING. III. Discrete Random Variables: Expectation and Deviations. From: [5][7][6]. German Hernandez



1. Random variables

1.1. Measurable function. Let (Ω, A) and (Θ, G) be measurable spaces. A function Υ : Ω → Θ is called a measurable function if

Υ⁻¹(G) ∈ A, for all G ∈ G,

i.e., the pre-image of a measurable set is measurable.

In order to define real random variables we use a special σ-algebra on the real line, called the Borel σ-algebra and denoted B(R) or simply B. This is the minimal σ-algebra on R that contains the sets of the form (−∞, x] for all x ∈ R.

1.2. Random variable. A random variable (r.v.) X is a Borel measurable function defined on a probability space. Here we will consider only real random variables, i.e., a r.v. of the form

X : (Ω, A, P) → (R, B).

A r.v. can be seen as a translation or encoding of the outcomes of a random experiment into the real line. The condition of Borel measurability allows us to translate or encode the probabilistic structure of the experiment in a consistent manner from the events of the random experiment to Borel sets.

[Figure: a random variable X : Ω → R maps the outcomes ω_1, ω_2, ..., ω_k of the probability space (Ω, A, P) to points on the real line R.]

1.3. Induced probability by a r.v. A random variable X : Ω → R from a probability space (Ω, A, P) defines an induced probability measure P_X on (R, B) given by

P_X : B → [0, 1]
B ↦ P(X⁻¹(B)).

Usually it is difficult to work directly with the induced probability. The alternative is to work with a function on R, called a distribution function, that encodes all the relevant information of the induced probability.

1.4. Distribution function of a r.v. The distribution function or cumulative distribution function of a r.v. X, denoted F_X, is defined as

F_X : R → R
x ↦ P(X ≤ x) = P(X⁻¹((−∞, x])).

[Figure: the distribution function F_X(x) = P(X⁻¹((−∞, x])) is obtained by pulling the Borel set (−∞, x] back through the random variable X to an event of the probability space (Ω, A, P).]

The distribution function has the following properties:

(i) lim_{x→−∞} F_X(x) = 0 and lim_{x→+∞} F_X(x) = 1,
(ii) F_X is a nondecreasing function, and
(iii) F_X is right continuous.

1.5. Types of random variables. A random variable is called discrete if its range is countable. In this case we can define the probability mass function f_X(x) = P(X = x).

A random variable is called continuous if its distribution function F_X is continuous, and it is called absolutely continuous if there exists a non-negative Riemann integrable function f_X : R → [0, ∞), called the probability density of the r.v., such that

F_X(x) = ∫_{−∞}^{x} f_X(τ) dτ.

2. Expectation

Let X be a discrete random variable taking on the values x_1, x_2, ..., x_k, .... The mean value or expectation of X, denoted by E[X], is defined by

E[X] = Σ_{k=1}^{∞} x_k P(X = x_k).

Let X be a continuous random variable taking values on R. Then E[X] is defined by

E[X] = ∫_{−∞}^{∞} x f_X(x) dx.

Example 1. Indicator random variable. Given an event A, the indicator random variable of the event, denoted I_A or sometimes I{A}, is equal to 1 if A occurs, or 0 if A does not occur:

I_A = 1, if A occurs; 0, if A^c occurs.

Find its expectation:

E[I_A] = 1 · P(A) + 0 · P(A^c) = P(A).

A random variable X that only takes on the values 0 or 1 is called a Bernoulli random variable, and its expected value is E[X] = P(X = 1).

Example 2. Geometric random variable. Consider a sequence of independent trials, each of which is a success with probability p ∈ (0, 1) and a failure with probability 1 − p. If X represents the trial number of the first success, then X is said to be a geometric random variable with parameter p. Compute its expected value.

We have that

P{X = n} = p(1 − p)^{n−1}, n ≥ 1.

Hence,

E[X] = Σ_{n=1}^{∞} n p(1 − p)^{n−1} = p Σ_{n=1}^{∞} n(1 − p)^{n−1} = p · 1/p² = 1/p,

using, with a = 1 − p, the identity

Σ_{n=1}^{∞} n a^{n−1} = Σ_{n=0}^{∞} d(a^n)/da = d(Σ_{n=0}^{∞} a^n)/da = d(1/(1 − a))/da = 1/(1 − a)².
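
As a quick sanity check of this derivation (my own minimal Python sketch, not part of the original notes), one can estimate E[X] by simulation and compare it with 1/p:

import random

def sample_geometric(p):
    """Trial number of the first success in independent Bernoulli(p) trials."""
    n = 1
    while random.random() >= p:
        n += 1
    return n

if __name__ == "__main__":
    trials = 100_000
    for p in (0.1, 0.3, 0.5):
        estimate = sum(sample_geometric(p) for _ in range(trials)) / trials
        print(f"p = {p}: simulated E[X] ~ {estimate:.3f}, theory 1/p = {1/p:.3f}")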

Example 3. Unbounded expectation. Let X be a random variable with

P(X = 2^i) = 1/2^i, i = 1, 2, ...

Then

E[X] = Σ_{i=1}^{∞} 2^i · (1/2^i) = Σ_{i=1}^{∞} 1 = ∞.

Theorem 1. Let X be a discrete random variable that takes only nonnegative integer values. Then

E[X] = Σ_{i=1}^{∞} P(X ≥ i).

Proof.

Σ_{i=1}^{∞} P(X ≥ i) = Σ_{i=1}^{∞} Σ_{j=i}^{∞} P(X = j) = Σ_{j=1}^{∞} Σ_{i=1}^{j} P(X = j) = Σ_{j=1}^{∞} j P(X = j) = E[X].
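
A small numerical check of the tail-sum formula (my own illustration, using a binomial distribution as an arbitrary test case):

from math import comb

def binom_pmf(n, p, k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.3
direct = sum(k * binom_pmf(n, p, k) for k in range(n + 1))
tail_sum = sum(sum(binom_pmf(n, p, j) for j in range(i, n + 1)) for i in range(1, n + 1))
print(direct, tail_sum, n * p)  # all three agree (up to rounding): 6.0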

Example 4. Geometric random variable. Hence

P(X ≥ i) = Σ_{n=i}^{∞} p(1 − p)^{n−1} = (1 − p)^{i−1},

and

E[X] = Σ_{i=1}^{∞} P(X ≥ i) = Σ_{i=1}^{∞} (1 − p)^{i−1} = 1/(1 − (1 − p)) = 1/p.

2.1. Linearity. Given a family of r.v.'s {X_i}, i = 1, ..., n,

E[ Σ_{i=1}^{n} X_i ] = Σ_{i=1}^{n} E[X_i].

Proof (for two random variables; the general case follows by induction).

E[X + Y] = Σ_x Σ_y (x + y) P((X = x) ∩ (Y = y))
= Σ_x Σ_y x P((X = x) ∩ (Y = y)) + Σ_x Σ_y y P((X = x) ∩ (Y = y))
= Σ_x x Σ_y P((X = x) ∩ (Y = y)) + Σ_y y Σ_x P((X = x) ∩ (Y = y))
= Σ_x x P(X = x) + Σ_y y P(Y = y)
= E[X] + E[Y].

Theorem 2. E[cX_i] = c E[X_i].

Example 5. Bubble sort. Let X denote the number of comparisons needed by Bubble sort. Obtain upper and lower bounds for E[X].

Bubble-Sort(A[1..n])
1  for i = 1 to n − 1
2    do for j = 1 to n − i
3      do if A[j] > A[j + 1]
4        then exchange A[j] ↔ A[j + 1]
5  if no swap occurs, exit

Example trace for n = 5 on the input 5 3 8 7 0:

i = 1:  j = 1: 3 5 8 7 0   j = 2: 3 5 8 7 0   j = 3: 3 5 7 8 0   j = 4: 3 5 7 0 8
i = 2:  j = 1: 3 5 7 0 8   j = 2: 3 5 7 0 8   j = 3: 3 5 0 7 8
i = 3:  j = 1: 3 5 0 7 8   j = 2: 3 0 5 7 8
i = 4:  j = 1: 0 3 5 7 8

In the worst case bubble sort requires

(n − 1) + (n − 2) + ··· + 2 + 1 = n(n − 1)/2

comparisons; then

E[X] ≤ n(n − 1)/2.

In order to obtain a lower bound we need the concept of the number of inversions in a permutation. For a permutation i_1, i_2, ..., i_n of 1, 2, ..., n we say that an ordered pair (i, j) is an inversion of the permutation if i < j and j precedes i in the permutation. For instance, the permutation 2, 4, 1, 5, 6, 3 has five inversions, namely (1, 2), (1, 4), (3, 4), (3, 5), (3, 6). Because the values of every inversion pair will eventually have to be interchanged (and thus compared), it follows that the number of comparisons made by bubble sort is at least as large as the number of inversions of the initial ordering.

Then if I denotes the number of inversions, I ≤ X, which implies that E[I] ≤ E[X]. Let, for i < j,

I(i, j) = 1, if (i, j) is an inversion of the initial ordering; 0, otherwise;

then it follows that

I = Σ_{i<j} I(i, j).

Hence, using linearity of the expectation,

E[I] = Σ_{i<j} E[I(i, j)].

Now, for i < j,

E[I(i, j)] = P{j precedes i in the initial ordering} = 1/2.

There are C(n, 2) pairs i, j for which i < j, so it follows that

E[I] = Σ_{i<j} E[I(i, j)] = C(n, 2) · 1/2 = n(n − 1)/4.

Thus

n(n − 1)/4 ≤ E[X] ≤ n(n − 1)/2,

and E[X] = Θ(n²).
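
These bounds are easy to check empirically. The sketch below (my own illustration, not from the notes) counts the comparisons made by bubble sort with early exit and the number of inversions of random permutations; the averages fall between n(n − 1)/4 and n(n − 1)/2, with the inversion count close to n(n − 1)/4:

import random

def bubble_sort_comparisons(a):
    """Count comparisons made by bubble sort (with early exit) on a copy of a."""
    a = list(a)
    n = len(a)
    comparisons = 0
    for i in range(1, n):
        swapped = False
        for j in range(n - i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:
            break
    return comparisons

def count_inversions(a):
    return sum(1 for i in range(len(a)) for j in range(i + 1, len(a)) if a[i] > a[j])

n, trials = 50, 2000
avg_cmp = sum(bubble_sort_comparisons(random.sample(range(n), n)) for _ in range(trials)) / trials
avg_inv = sum(count_inversions(random.sample(range(n), n)) for _ in range(trials)) / trials
print(f"average inversions  ~ {avg_inv:.1f}  (n(n-1)/4 = {n*(n-1)/4})")
print(f"average comparisons ~ {avg_cmp:.1f}  (n(n-1)/2 = {n*(n-1)/2})")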

3. The Quicksort and Find Algorithms

Suppose that we want to sort a given set of n distinct values, x_1, x_2, ..., x_n. A more efficient algorithm than bubble sort for doing so is the quicksort algorithm, which is recursively defined as follows. When n = 2, the algorithm compares the two values and puts them in the appropriate order. When n > 2, one of the values is chosen, say x_i, and then all of the other values are compared with x_i. Those smaller than x_i are put in a bracket to the left of x_i, and those larger than x_i are put in a bracket to the right of x_i. The algorithm then repeats itself on these brackets, continuing until all values have been sorted. For instance, suppose that we desire to sort the following 10 distinct values:

5, 9, 3, 10, 11, 4, 8, 14, 17, 6

Choosing 10 as the first comparison value gives, successively,

{5, 9, 3, 4, 8, 6}, 10, {11, 14, 17}
{5, 3, 4}, 6, {9, 8}, 10, {11, 14, 17}
{3}, 4, {5}, 6, {9, 8}, 10, {11, 14, 17}

It is intuitively clear that the worst case occurs when every comparison value chosen is an extreme value. In this worst-case scenario, the number of comparisons needed is (n − 1) + (n − 2) + ... + 1 = n(n − 1)/2. A better indication of the algorithm's efficiency is obtained by determining the average number of comparisons needed when the comparison values are randomly chosen.

Let X denote the number of comparisons needed. Let 1 denote the smallest value, let 2 denote the second smallest value, and so on. Then, for 1 ≤ i < j ≤ n, let I(i, j) equal 1 if i and j are ever directly compared, and let it equal 0 otherwise. Then

X = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} I(i, j),

which implies that

E[X] = E[ Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} I(i, j) ]
= Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} E[I(i, j)]
= Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} P{i and j are ever compared}.

To determine the probability that i and j are ever compared, note that the values i, i + 1, ..., j − 1, j will initially be in the same bracket and will remain in the same bracket as long as the comparison value chosen is not one of them; the first time one of i, i + 1, ..., j is chosen as the comparison value, i and j are directly compared if and only if that value is i or j. Since each of these j − i + 1 values is equally likely to be the first one chosen, the probability that it is i or j is 2/(j − i + 1). Therefore,

P{i and j are ever compared} = 2/(j − i + 1).

Consequently, we see that

E[X] = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 2/(j − i + 1)
= Σ_{i=1}^{n−1} 2 (1/2 + 1/3 + ... + 1/(n − i + 1))
= 2 Σ_{i=1}^{n−1} Σ_{k=2}^{n−i+1} 1/k
= 2 Σ_{k=2}^{n} Σ_{i=1}^{n+1−k} 1/k
= 2 Σ_{k=2}^{n} (n + 1 − k)/k
= 2(n + 1) Σ_{k=2}^{n} 1/k − 2(n − 1).

For large n,

E[X] ≈ 2(n + 1) Σ_{k=2}^{n} 1/k − 2(n − 1) ≈ 2n Σ_{k=1}^{n} 1/k ≈ 2n ln(n).

Thus, the quicksort algorithm requires, on average, approximately 2n log(n) comparisons to sort n values.
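
A simulation sketch (my own, not from the notes): count the comparisons made by quicksort when the comparison value is chosen uniformly at random, and compare the average with 2n ln(n).

import math
import random

def quicksort_comparisons(values):
    """Comparisons made by quicksort when the comparison value is chosen uniformly at random."""
    if len(values) <= 1:
        return 0
    pivot = random.choice(values)
    left = [v for v in values if v < pivot]
    right = [v for v in values if v > pivot]
    # each of the other len(values) - 1 distinct values is compared with the pivot once
    return len(values) - 1 + quicksort_comparisons(left) + quicksort_comparisons(right)

n, trials = 1000, 200
avg = sum(quicksort_comparisons(random.sample(range(n), n)) for _ in range(trials)) / trials
print(f"average comparisons ~ {avg:.0f},  2n ln(n) ~ {2 * n * math.log(n):.0f}")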

3.1. The Find Algorithm. Suppose that we want to find the kth-smallest value of a list of n distinct values. The find algorithm is quite similar to quicksort: a comparison value is chosen and the remaining values are split into a left bracket (the smaller values) and a right bracket (the larger values), for instance

{2, 5, 4, 3}, 6, {10, 12, 16, 8}.

Suppose that r items are put in the bracket to the left. There are now three possibilities:

1. r = k − 1: the comparison value is the kth smallest and the algorithm ends.
2. r < k − 1: the kth smallest value is the (k − r − 1)th smallest of the n − r − 1 values in the right bracket.
3. r > k − 1: search for the kth smallest of the r values in the left bracket.

We have, as in the quicksort analysis,

X = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} I(i, j)

and

E[I(i, j)] = P{i and j are ever compared}.

To determine the probability that i and j are ever compared, we consider three cases.

Case 1: i < j ≤ k. In this case i, j, k will remain together until one of the values i, i + 1, ..., k is chosen as the comparison value, and i and j are compared if and only if that value is i or j. Hence

P{i and j are ever compared} = 2/(k − i + 1).

Case 2: i ≤ k < j. In this case

P{i and j are ever compared} = 2/(j − i + 1).

Case 3: k < i < j. In this case

P{i and j are ever compared} = 2/(j − k + 1).

It follows from the preceding that

E[X] = 2 [ Σ_{j=2}^{k} Σ_{i=1}^{j−1} 1/(k − i + 1) + Σ_{j=k+1}^{n} Σ_{i=1}^{k} 1/(j − i + 1) + Σ_{j=k+2}^{n} Σ_{i=k+1}^{j−1} 1/(j − k + 1) ].

To approximate the preceding when n and k are large, let k = αn for 0 < α < 1. Now,

Σ_{j=2}^{k} Σ_{i=1}^{j−1} 1/(k − i + 1) = Σ_{i=1}^{k−1} Σ_{j=i+1}^{k} 1/(k − i + 1)
= Σ_{i=1}^{k−1} (k − i)/(k − i + 1)
= Σ_{j=2}^{k} (j − 1)/j
≈ k − log(k)
≈ k = αn.

Similarly,

Σ_{j=k+1}^{n} Σ_{i=1}^{k} 1/(j − i + 1) = Σ_{j=k+1}^{n} ( 1/(j − k + 1) + ... + 1/j )
≈ Σ_{j=k+1}^{n} (log(j) − log(j − k))
≈ ∫_{k}^{n} log(x) dx − ∫_{0}^{n−k} log(x) dx
= n log(n) − n − (αn log(αn) − αn) − [(n − αn) log(n − αn) − (n − αn)]
= n[−α log(α) − (1 − α) log(1 − α)],

and, as it similarly follows that

Σ_{j=k+2}^{n} Σ_{i=k+1}^{j−1} 1/(j − k + 1) = Σ_{j=k+2}^{n} (j − k − 1)/(j − k + 1) ≈ n − k = n(1 − α),

we see that

E[X] ≈ 2n[1 − α log(α) − (1 − α) log(1 − α)].

Thus, the mean number of comparisons needed by the find algorithm is a linear function of the number of values.
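
The linear behaviour is easy to observe by simulation. The sketch below (my own illustration, not from the notes) counts comparisons made by find with a random comparison value and compares the average with 2n[1 − α ln(α) − (1 − α) ln(1 − α)]:

import math
import random

def find_kth_comparisons(values, k):
    """Comparisons made by the find algorithm (random comparison value); k is 1-based."""
    comparisons = 0
    while len(values) > 1:
        pivot = random.choice(values)
        left = [v for v in values if v < pivot]
        right = [v for v in values if v > pivot]
        comparisons += len(values) - 1
        if len(left) == k - 1:        # the comparison value is the kth smallest
            return comparisons
        elif len(left) > k - 1:       # search the left bracket
            values = left
        else:                         # search the right bracket
            k -= len(left) + 1
            values = right
    return comparisons

n, alpha, trials = 2000, 0.3, 200
k = int(alpha * n)
avg = sum(find_kth_comparisons(random.sample(range(10 * n), n), k) for _ in range(trials)) / trials
theory = 2 * n * (1 - alpha * math.log(alpha) - (1 - alpha) * math.log(1 - alpha))
print(f"average comparisons ~ {avg:.0f},  2n[1 - a ln a - (1-a) ln(1-a)] ~ {theory:.0f}")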

4. Markov's Inequality

Let X be a random variable that assumes only nonnegative values. Then, for all a > 0,

P{X ≥ a} ≤ E[X]/a.

Proof. For a > 0, let I be the indicator variable of the event {X ≥ a}:

I = 1, if X ≥ a; 0, otherwise.

Since X is nonnegative, aI ≤ X; then

E[aI] ≤ E[X]
a E[I] ≤ E[X]
E[I] ≤ E[X]/a
P{X ≥ a} = E[I] ≤ E[X]/a.

The inequality is interesting for a > E[X], and in particular for a = n E[X]. It can be applied when too little is known about a distribution; in general it is too weak to yield useful bounds, but it is fundamental in developing other useful bounds.
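
A quick empirical look at how loose the bound typically is (my own illustration, using a geometric random variable):

import random

def sample_geometric(p):
    n = 1
    while random.random() >= p:
        n += 1
    return n

p, trials = 0.2, 100_000
samples = [sample_geometric(p) for _ in range(trials)]
mean = sum(samples) / trials               # ~ E[X] = 1/p = 5
for a in (10, 20, 40):
    tail = sum(s >= a for s in samples) / trials
    print(f"a = {a}: P(X >= a) ~ {tail:.4f}  <=  E[X]/a ~ {mean / a:.4f}")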

5. Variance

Let X be a random variable. The variance Var[X] of X is defined by

Var[X] = E[(X − E[X])²] = E[X²] − (E[X])².

Let X, Y be random variables. The covariance Cov[X, Y] of X and Y is defined by

Cov[X, Y] = E[(X − E[X])(Y − E[Y])].

5.1. Properties.

1. Cov[X, Y] = E[XY] − E[X]E[Y]
2. Cov[X, Y] = Cov[Y, X]
3. Cov[X, X] = Var[X]
4. Cov[cX, Y] = c Cov[X, Y]
5. Cov[X, Y + Z] = Cov[X, Y] + Cov[X, Z]
6. Cov[ Σ_{i=1}^{n} X_i, Σ_{j=1}^{m} Y_j ] = Σ_{i=1}^{n} Σ_{j=1}^{m} Cov(X_i, Y_j)
7. If X and Y are independent then Cov(X, Y) = 0
8. Var( Σ_{i=1}^{n} X_i ) = Cov( Σ_{i=1}^{n} X_i, Σ_{j=1}^{n} X_j )
   = Σ_{i=1}^{n} Σ_{j=1}^{n} Cov(X_i, X_j)
   = Σ_{i=1}^{n} Cov(X_i, X_i) + Σ_{i=1}^{n} Σ_{j≠i} Cov(X_i, X_j)
   = Σ_{i=1}^{n} Var(X_i) + 2 Σ_{i=1}^{n} Σ_{j<i} Cov(X_i, X_j)
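
Property 8 is the identity used repeatedly in the examples below. As a small sanity check (my own illustration, with X_1, X_2, X_3 built from shared coin flips so that they are dependent), both sides can be evaluated on simulated data:

import random

def covariance(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

trials = 100_000
u = [random.randint(0, 1) for _ in range(trials)]
v = [random.randint(0, 1) for _ in range(trials)]
x1, x2, x3 = u, [a + b for a, b in zip(u, v)], v      # x2 is correlated with x1 and x3

total = [a + b + c for a, b, c in zip(x1, x2, x3)]
lhs = covariance(total, total)                         # Var(X1 + X2 + X3)
rhs = (covariance(x1, x1) + covariance(x2, x2) + covariance(x3, x3)
       + 2 * (covariance(x1, x2) + covariance(x1, x3) + covariance(x2, x3)))
print(f"Var(sum) ~ {lhs:.4f},  sum of variances + 2 * sum of covariances ~ {rhs:.4f}")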

Example 6 [7]. A coupon collecting problem. Suppose that there are m different types of coupons, and each coupon obtained is equally likely to be any of these types. If X denotes the number of coupons one needs to collect in order to have at least one of each type, find the expected value of X.

We have m types. Let X_1 be the number of picks required to get the first type of coupon, X_2 the number of additional picks required to get a second distinct type, X_3 the number of additional picks required to get a third distinct type, and so on. On the first pick we always get a new type, so X_1 = 1. From then on, the number of picks required to get a new type, having one type already, is geometrically distributed with success probability p_2 = (m − 1)/m, i.e.,

P(X_2 = k) = (1 − p_2)^{k−1} p_2, for k = 1, 2, ...,

and the expected time to get the second type of coupon is

E[X_2] = 1/p_2 = m/(m − 1).

Inductively, once we have gotten i − 1 different types of coupons, the additional time X_i to get the ith type is geometrically distributed with success probability

p_i = (m − (i − 1))/m = (m − i + 1)/m,

i.e.,

P(X_i = k) = (1 − p_i)^{k−1} p_i, for k = 1, 2, ...,

and the expected time to get the ith type of coupon is

E[X_i] = 1/p_i = m/(m − i + 1).

We have that the time to get all the coupons is

X = X_1 + X_2 + ... + X_m;

thus, the expected time to get all the coupons is

E[X] = E[X_1] + E[X_2] + ... + E[X_m]
= 1 + m/(m − 1) + ... + m/1
= m (1/m + 1/(m − 1) + ... + 1)
= m Σ_{i=1}^{m} 1/i
≈ m log m.

It is easy to see that the geometric random variables X_i are independent and

Var(X_i) = (1 − p_i)/p_i².

Then

Var(X) = Σ_{i=1}^{m} Var(X_i) = Σ_{i=1}^{m} (1 − p_i)/p_i² ≤ Σ_{i=1}^{m} 1/p_i² = Σ_{i=1}^{m} m²/(m − i + 1)² = m² Σ_{i=1}^{m} 1/i² < m² π²/6.
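
A short simulation of the coupon collector (my own illustration), comparing the sample mean with m·H_m ≈ m log m and the sample variance with the bound m²π²/6:

import math
import random

def coupon_collector_time(m):
    """Number of uniform draws needed to see all m coupon types at least once."""
    seen, draws = set(), 0
    while len(seen) < m:
        seen.add(random.randrange(m))
        draws += 1
    return draws

m, trials = 50, 5000
samples = [coupon_collector_time(m) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials
harmonic = sum(1 / i for i in range(1, m + 1))
print(f"E[X] ~ {mean:.1f}   (m*H_m = {m * harmonic:.1f}, m*log(m) = {m * math.log(m):.1f})")
print(f"Var(X) ~ {var:.0f}  (bound m^2*pi^2/6 = {m**2 * math.pi**2 / 6:.0f})")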

Example 7 [7]. Another coupon collecting problem. Find the expected value and variance of the number of distinct types of coupons obtained after collecting n coupons.

Let X be the number of types of coupons in the collection after n picks, and write

X = Σ_{i=1}^{m} X_i, where X_i = 1, if a coupon of type i is in the collection; 0, otherwise.

As X_i is a Bernoulli variable we have

E[X_i] = 1 − ((m − 1)/m)^n

and

E[X] = Σ_{i=1}^{m} E[X_i] = m [ 1 − ((m − 1)/m)^n ].

Var(X_i) = ((m − 1)/m)^n [ 1 − ((m − 1)/m)^n ].

Also, for i ≠ j, X_i X_j is also Bernoulli, with

E[X_i X_j] = P(X_i X_j = 1) = P(A_i ∩ A_j),

where A_k is the event that the collection contains at least one coupon of type k.

P(A_i ∩ A_j) = 1 − P((A_i ∩ A_j)^c) = 1 − P(A_i^c ∪ A_j^c)
= 1 − P(A_i^c) − P(A_j^c) + P(A_i^c ∩ A_j^c)
= 1 − 2((m − 1)/m)^n + ((m − 2)/m)^n.

Therefore, for i ≠ j,

Cov[X_i, X_j] = 1 − 2((m − 1)/m)^n + ((m − 2)/m)^n − (1 − ((m − 1)/m)^n)²
= ((m − 2)/m)^n − ((m − 1)/m)^{2n}.

Hence

Var(X) = m ((m − 1)/m)^n [ 1 − ((m − 1)/m)^n ] + m(m − 1) [ ((m − 2)/m)^n − ((m − 1)/m)^{2n} ].
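
A simulation check of these two formulas (my own illustration):

import random

def distinct_types(m, n):
    """Number of distinct coupon types seen after n uniform draws from m types."""
    return len({random.randrange(m) for _ in range(n)})

m, n, trials = 20, 30, 50_000
samples = [distinct_types(m, n) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials

q = ((m - 1) / m) ** n
r = ((m - 2) / m) ** n
mean_theory = m * (1 - q)
var_theory = m * q * (1 - q) + m * (m - 1) * (r - q * q)
print(f"E[X] ~ {mean:.3f}   (theory {mean_theory:.3f})")
print(f"Var(X) ~ {var:.3f}  (theory {var_theory:.3f})")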

Example 8 [7]. Runs. Let X_1, X_2, ..., X_n be a sequence of independent binary random variables with P(X_i = 1) = p. A maximal consecutive subsequence of 1s is called a run. For instance, the sequence 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0 has 3 runs. Let R be the number of runs; find E[R] and Var(R).

Let, for i = 1, 2, ..., n,

I_i = 1, if a run begins at position i; 0, otherwise;

then

R = Σ_{i=1}^{n} I_i.

Because

E[I_1] = P(X_1 = 1) = p,
E[I_i] = P(X_{i−1} = 0, X_i = 1) = (1 − p)p, for i > 1,

it follows that

E[R] = Σ_{i=1}^{n} E[I_i] = p + (n − 1)p(1 − p).

To compute the variance,

Var(R) = Σ_{i=1}^{n} Var(I_i) + 2 Σ_{i<j} Cov(I_i, I_j).

Because I_i is a Bernoulli random variable we have that

Var(I_i) = E[I_i](1 − E[I_i]),

and also that if i < j − 1 then I_i and I_j are independent, implying that Cov(I_i, I_j) = 0. Moreover, I_i I_{i+1} = 0 (a run cannot begin at two consecutive positions), so

Cov(I_i, I_{i+1}) = −E[I_i]E[I_{i+1}].

Hence,

Var(R) = Var(I_1) + Σ_{i=2}^{n} Var(I_i) + 2 Cov(I_1, I_2) + 2 Σ_{i=3}^{n} Cov(I_{i−1}, I_i)
= p(1 − p) + (n − 1) [p(1 − p)(1 − p(1 − p))] − 2p²(1 − p) − 2(n − 2) p²(1 − p)².

Application: the number of page ranges for a word in a book index.
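
A simulation check of the formulas for E[R] and Var(R) (my own illustration):

import random

def count_runs(bits):
    """Number of maximal consecutive blocks of 1s."""
    return sum(1 for i, b in enumerate(bits) if b == 1 and (i == 0 or bits[i - 1] == 0))

n, p, trials = 100, 0.3, 50_000
samples = []
for _ in range(trials):
    bits = [1 if random.random() < p else 0 for _ in range(n)]
    samples.append(count_runs(bits))
mean = sum(samples) / trials
var = sum((r - mean) ** 2 for r in samples) / trials

q = p * (1 - p)
mean_theory = p + (n - 1) * q
var_theory = q + (n - 1) * q * (1 - q) - 2 * p * p * (1 - p) - 2 * (n - 2) * q * q
print(f"E[R] ~ {mean:.3f}   (theory {mean_theory:.3f})")
print(f"Var(R) ~ {var:.3f}  (theory {var_theory:.3f})")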

6. Chebyshev's Inequality

For any a > 0,

P{|X − E[X]| ≥ a} ≤ Var[X]/a².

Proof.

P(|X − E[X]| ≥ a) = P([X − E[X]]² ≥ a²).

Since [X − E[X]]² is a nonnegative r.v., applying Markov's inequality gives

P{[X − E[X]]² ≥ a²} ≤ E[(X − E[X])²]/a² = Var[X]/a².

The inequality is interesting for a = n √(Var[X]), i.e., for deviations of several standard deviations.
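
A quick empirical comparison of the Chebyshev bound with the actual tail probability (my own illustration, using a binomial random variable with mean 50 and variance 25):

import random

n, p, trials = 100, 0.5, 100_000
mean, var = n * p, n * p * (1 - p)                     # binomial mean and variance
samples = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]
for a in (5, 10, 15):
    tail = sum(abs(s - mean) >= a for s in samples) / trials
    print(f"a = {a}: P(|X - E[X]| >= a) ~ {tail:.4f}  <=  Var[X]/a^2 = {var / a**2:.4f}")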

References

[1] G.R. Grimmett and D.R. Stirzaker. Probability and Random Processes. Oxford Science Publications, 1998.
[2] C.M. Grinstead and J.L. Snell. Introduction to Probability. American Mathematical Society, 1997.
[3] H.P. Hsu. Probability, Random Variables, and Random Processes. McGraw-Hill Schaum, 1996.
[4] F.P. Kelly. Probability. Notes from The Archimedeans, http://www.cam.ac.uk/societies/archim/notes.html, 1996.
[5] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
[6] R. Motwani and P. Raghavan. Randomized Algorithms. Cambridge University Press, 1995.
[7] S.M. Ross. Probability Models for Computer Science. Academic Press, 2002.