Lecture 2: Concentration Bounds

CSE 521: Design and Analysis of Algorithms I                                Spring 2016
Lecture 2: Concentration Bounds
Lecturer: Shayan Oveis Gharan        March 30th        Scribe: Syuzanna Sargsyan

Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal publications.

Laws of large numbers imply that for a sequence of i.i.d. random variables X_1, X_2, ... with mean µ, the sample average (1/n)(X_1 + X_2 + ... + X_n) converges to µ as n goes to infinity. Concentration bounds provide a quantitative distance between the sample average and the expectation. In this lecture we review several of these fundamental inequalities. In the next few lectures we will see applications of these inequalities in designing randomized algorithms.

Let D be a distribution. Suppose we want to estimate the mean E[X] of D and we only have access to independent samples X ~ D. One way to estimate the mean is to independently draw n samples X_1, X_2, ..., X_n from the distribution and return the empirical mean (1/n) Σ_i X_i. By the law of large numbers the empirical mean converges to E[X] as n → ∞. In this lecture we will prove bounds on the number of samples one needs to obtain an estimate of the mean within ε additive error.

2.1  Markov's Inequality

Markov's Inequality: For any nonnegative random variable (r.v.) X and any number k > 0,

    P[X ≥ k] ≤ E[X]/k.

Proof (for a discrete nonnegative X).

    E[X] = Σ_i i · P[X = i] ≥ Σ_{i ≥ k} i · P[X = i] ≥ k · Σ_{i ≥ k} P[X = i] = k · P[X ≥ k],

and dividing both sides by k proves the claim.

For example, for k = (3/2)·E[X], we can write

    P[X ≥ (3/2)·E[X]] ≤ E[X] / ((3/2)·E[X]) = 2/3.                                (2.1)

Example: Suppose the average grade of CSE 521 is 2.0 (out of 4.0). Give a lower bound on the fraction of students who received a grade of at most 3.0. We assume that a grade can be any real number between 0.0 and 4.0. In this example E[X] = 2.0. Taking k = 3.0 = (3/2)·E[X], Markov's inequality says at most 2/3 of the students received a grade above 3.0, so at least 1/3 of the students received a grade of at most 3.0.
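As a quick numerical illustration (added to these notes, not part of the original lecture), the following Python sketch draws grades from one hypothetical distribution with mean 2.0 and compares the empirical fraction of grades of at least 3.0 against the Markov bound of 2/3.

    import random

    def markov_bound(mean, k):
        # Markov's inequality: P[X >= k] <= E[X]/k for a nonnegative X.
        return mean / k

    def empirical_tail(samples, k):
        # Fraction of samples that are at least k.
        return sum(1 for x in samples if x >= k) / len(samples)

    random.seed(0)
    # One hypothetical grade distribution on [0.0, 4.0] with mean 2.0:
    # grade 4.0 with probability 1/2, grade 0.0 otherwise.
    grades = [4.0 if random.random() < 0.5 else 0.0 for _ in range(100_000)]

    print("empirical P[grade >= 3.0]:", empirical_tail(grades, 3.0))  # about 0.5
    print("Markov bound E[X]/k:      ", markov_bound(2.0, 3.0))       # 2/3

Any other distribution with mean 2.0 supported on [0.0, 4.0] would work as well; the empirical tail can never exceed the Markov bound.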

It turns out that if the only thing we know about X is its expectation, then Markov's inequality is essentially the best bound we can hope for. For a tight example consider the following scenario: assume k ≥ E[X], let ε > 0 be very close to 0, and let

    X = { k + ε    with probability E[X]/(k + ε),
        { 0        with probability 1 − E[X]/(k + ε).

Then X is nonnegative with the given expectation, and P[X ≥ k] = E[X]/(k + ε), which approaches the Markov bound E[X]/k as ε → 0.

Application 1. We use Markov's inequality to prove an upper bound on the number of fixed points of a random permutation. Recall that a permutation is a one-to-one and onto map σ : {1, 2, ..., n} → {1, 2, ..., n}. We say i is a fixed point of σ iff σ(i) = i.

Claim 2.1. With probability at least 1 − 1/k, a uniformly random permutation σ has at most k fixed points.

Proof. The trick is to define the right random variable and then use Markov's inequality. Define X_i = I{σ(i) = i} and X = Σ_{i=1}^n X_i. Observe that X is the number of fixed points of σ. We can write down the expectation of X using linearity of expectation:

    E[X] = Σ_{i=1}^n E[X_i] = Σ_{i=1}^n P[σ(i) = i] = n · (1/n) = 1.

The second equality uses the fact that the expectation of an indicator random variable is equal to the probability of the corresponding event. The last equality holds since σ is a uniform permutation, i.e., P[σ(i) = i] = 1/n for every i. Thus, by Markov's inequality, P[X > k] ≤ P[X ≥ k] ≤ 1/k, so with probability at least 1 − 1/k we have X ≤ k.
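The claim is easy to check numerically. The following sketch (an illustration added here, not from the lecture) samples uniformly random permutations, verifies that the average number of fixed points is about 1, and compares the empirical tail P[X ≥ k] to the Markov bound 1/k.

    import random

    def num_fixed_points(n):
        # Sample a uniformly random permutation of {0, ..., n-1} and count its fixed points.
        perm = list(range(n))
        random.shuffle(perm)
        return sum(1 for i, v in enumerate(perm) if i == v)

    random.seed(0)
    n, k, trials = 100, 3, 50_000
    counts = [num_fixed_points(n) for _ in range(trials)]

    print("empirical E[X]:     ", sum(counts) / trials)                       # about 1
    print("empirical P[X >= k]:", sum(1 for c in counts if c >= k) / trials)  # about 0.08
    print("Markov bound 1/k:   ", 1 / k)                                      # 0.333...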

2.2  Chebyshev's Inequality

Recall the definition of the variance:

    Var(X) := E[(X − E[X])²] = E[X² + (E[X])² − 2X·E[X]] = E[X²] + (E[X])² − 2E[X]·E[X] = E[X²] − (E[X])².        (2.2)

The second and the third equalities follow from the linearity of expectation. Note that since (X − E[X])² is a nonnegative random variable, E[X²] ≥ (E[X])². The standard deviation of a random variable X is defined as σ(X) := √Var(X).

Chebyshev's inequality: For any random variable X and any ε > 0,

    P[|X − E[X]| ≥ ε] ≤ Var(X)/ε²,

or equivalently, for any number k > 0,

    P[|X − E[X]| ≥ kσ] ≤ 1/k².

We can read the above inequality as follows: for any random variable X, with probability at least 8/9 (roughly 90%), X is within three standard deviations of its expectation.

Proof. Let Y := (X − E[X])² ≥ 0. By Markov's inequality,

    P[Y ≥ ε²] ≤ E[Y]/ε².

By the definition of Y, E[Y] = Var(X), so

    P[(X − E[X])² ≥ ε²] ≤ Var(X)/ε²,

or, equivalently,

    P[|X − E[X]| ≥ ε] ≤ Var(X)/ε².

Next, we describe two applications of Chebyshev's inequality.

Application 2. Polling. Consider a large set of individuals each voting 0 or 1 on a presidential candidate, and let p be the expected vote, i.e., the fraction of 1-votes. We will see that using only O(1/ε²) independent samples from the set we can estimate p within an ε-additive error. Let X_1, X_2, ..., X_n be the votes of n independently chosen individuals in this society. Observe that, for each i,

    X_i = { 1    with probability p,
          { 0    with probability 1 − p.

Define the r.v. X = (1/n) Σ_{i=1}^n X_i. Obviously, E[X] = (1/n) Σ_i E[X_i] = p. We use Chebyshev's inequality to show that for n = O(1/ε²), with high probability X is within an additive distance ε of p. To use Chebyshev's inequality, we first need to upper bound the variance of X. We use the following lemma to calculate the variance of a sum of pairwise independent random variables.

Lemma 2.2. Let X_1, X_2, ..., X_n be pairwise independent random variables; this means that for any i ≠ j, E[X_i X_j] = E[X_i] E[X_j]. For X = X_1 + ... + X_n, we have

    Var(X) = Σ_{i=1}^n Var(X_i).

Proof. By (2.2),

    Var(X) = E[X²] − (E[X])² = Σ_{i,j} E[X_i X_j] − Σ_{i,j} E[X_i] E[X_j],

where the second equality follows by linearity of expectation. By the pairwise independence property, for any i ≠ j, E[X_i X_j] = E[X_i] E[X_j], so all cross terms cancel. Therefore, the above expression simplifies to

    Var(X) = Σ_i E[X_i²] − Σ_i (E[X_i])² = Σ_i Var(X_i).

In the polling example, we can write

    Var(X) = Var((1/n) Σ_i X_i) = (1/n²) Σ_i Var(X_i).

Recall that X_i is a Bernoulli random variable with parameter p. We have Var(X_i) = E[X_i²] − E[X_i]². Obviously, E[X_i] = p. In addition, E[X_i²] = 1²·p + 0²·(1 − p) = p. So Var(X_i) = p − p² ≤ 1/4, and

    Var(X) ≤ n/(4n²) = 1/(4n).

Now, by Chebyshev's inequality,

    P[|X − p| ≥ ε] ≤ Var(X)/ε² ≤ 1/(4nε²).

This means that for n = 3/ε², X approximates p within an additive error of ε with probability more than 90%.
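As a sanity check added to these notes, the sketch below repeats the polling experiment with n = 3/ε² samples (the true mean p = 0.6 and accuracy ε = 0.05 are hypothetical choices) and measures how often the empirical mean misses p by more than ε; Chebyshev promises this happens at most about 1/12 of the time.

    import random

    def poll(p, n):
        # Empirical mean of n independent Bernoulli(p) votes.
        return sum(1 if random.random() < p else 0 for _ in range(n)) / n

    random.seed(0)
    p, eps = 0.6, 0.05          # hypothetical true mean and target accuracy
    n = round(3 / eps**2)       # sample size from Chebyshev: n = 3/eps^2 = 1200
    trials = 2_000

    misses = sum(1 for _ in range(trials) if abs(poll(p, n) - p) > eps)
    print("failure rate over", trials, "trials:", misses / trials)  # far below 1/12 in practice

The empirical failure rate is much smaller than 1/12 because Chebyshev's inequality is quite loose here; the Chernoff-type bounds below explain why.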

Application 3. Birthday Paradox. Let X_1, ..., X_n ∈ {1, 2, ..., N} be chosen independently and uniformly at random. How large should n be to get a collision, i.e., to get X_i = X_j for some i ≠ j? We show that if n ≤ √N then with probability at least 1/2 there is no collision, and if n ≥ C·√N then with probability at least roughly 1 − 2/C² there is a collision.

Define the r.v. Y_ij = I(X_i = X_j) and let Y = Σ_{i<j} Y_ij. Note that the Y_ij's are dependent random variables, but they are pairwise independent. This crucial fact allows us to use Lemma 2.2 to calculate the variance of Y. Observe that Y is an integral random variable which counts the number of collisions, so we are interested in P[Y ≥ 1]. We start by calculating the first moment of Y:

    E[Y] = Σ_{i<j} E[Y_ij] = Σ_{i<j} P[Y_ij = 1] = (n choose 2)/N ≤ n²/(2N).

By Markov's inequality,

    P[Y ≥ 1] ≤ E[Y] ≤ n²/(2N).

Therefore, if n ≤ √N, with probability at least 1/2 there is no collision.

Now, let us study the case where n ≫ √N. Here, we use Chebyshev's inequality. First, observe that since Y is an integral random variable,

    P[Y = 0] ≤ P[|Y − E[Y]| ≥ E[Y]].

By Chebyshev's inequality,

    P[|Y − E[Y]| ≥ E[Y]] ≤ Var(Y)/(E[Y])².

Therefore,

    P[Y = 0] ≤ Var(Y)/(E[Y])².

Using the pairwise independence of the Y_ij's, we get

    Var(Y) = Σ_{i<j} Var(Y_ij) = Σ_{i<j} (1/N − 1/N²) ≤ (n choose 2)/N.

Therefore,

    P[Y = 0] ≤ Var(Y)/(E[Y])² ≤ ((n choose 2)/N) / ((n choose 2)/N)² = N/(n choose 2) = 2N/(n(n−1)).

So, for n ≥ C·√N, there is a collision with probability at least 1 − 2/C² up to lower order terms, since 2N/(n(n−1)) ≈ 2N/n² ≤ 2/C².
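The √N threshold is easy to see empirically. The following sketch, added here as an illustration (N = 365 is just a convenient hypothetical choice), estimates the collision probability for n around √N and for n a few times larger.

    import math
    import random

    def has_collision(n, N):
        # Draw n values uniformly from {1, ..., N}; report whether any two coincide.
        draws = [random.randint(1, N) for _ in range(n)]
        return len(set(draws)) < n

    def collision_prob(n, N, trials=20_000):
        return sum(has_collision(n, N) for _ in range(trials)) / trials

    random.seed(0)
    N = 365
    root = int(math.sqrt(N))   # about 19
    for n in (root // 2, root, 3 * root):
        print("n =", n, " empirical collision probability =", round(collision_prob(n, N), 3))

Well below √N collisions are rare, around √N they occur with constant probability, and a few multiples of √N already make them nearly certain.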

2.3  Chernoff Bounds

Central limit theorems in their general form state that for a sequence of i.i.d. random variables X_1, X_2, ... with bounded mean µ and variance σ²,

    (1/√n) · Σ_{i=1}^n (X_i − µ)  →  N(0, σ²)    in distribution.

Chernoff type bounds provide a quantitative bound on this convergence. Recall that Chebyshev's bound implies that the probability that a r.v. X is at distance kσ from its mean is at most 1/k². Roughly speaking, Chernoff type bounds imply that for a suitable r.v. X this probability is at most exp(−Ω(k)). We start by describing Hoeffding's bound.

Hoeffding's Inequality: Let X_1, ..., X_n be a sequence of independent random variables where for each i, a_i ≤ X_i ≤ b_i. Then,

    P[ |(1/n) Σ_i X_i − E[(1/n) Σ_i X_i]| ≥ ε ] ≤ 2 exp( −2n²ε² / Σ_i (b_i − a_i)² ).

In the polling example we had X_i ∈ {0, 1} for each i, and X_1, ..., X_n are independent random variables with E[X_i] = p. Therefore, by Hoeffding's inequality we get

    P[ |(1/n) Σ_i X_i − p| ≥ ε ] ≤ 2 exp( −2n²ε² / n ) = 2 exp(−2nε²).

So, for any δ > 0, (1/n) Σ_i X_i is within additive error ε of p with probability at least 1 − δ as long as n ≥ ln(2/δ)/(2ε²).
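To see concretely how much Hoeffding improves on Chebyshev for polling, the short sketch below (added to these notes; ε = 0.01 and δ = 0.01 are hypothetical values) computes the sample sizes the two bounds require: Chebyshev's 1/(4nε²) ≤ δ gives n ≥ 1/(4δε²), while Hoeffding gives n ≥ ln(2/δ)/(2ε²).

    import math

    def chebyshev_samples(eps, delta):
        # From P[|mean - p| >= eps] <= 1/(4 n eps^2): require 1/(4 n eps^2) <= delta.
        return math.ceil(1 / (4 * delta * eps**2))

    def hoeffding_samples(eps, delta):
        # From P[|mean - p| >= eps] <= 2 exp(-2 n eps^2): require 2 exp(-2 n eps^2) <= delta.
        return math.ceil(math.log(2 / delta) / (2 * eps**2))

    eps, delta = 0.01, 0.01
    print("Chebyshev needs n >=", chebyshev_samples(eps, delta))  # roughly 250,000
    print("Hoeffding needs n >=", hoeffding_samples(eps, delta))  # roughly 26,500

The point is the dependence on the failure probability: 1/δ for Chebyshev versus log(1/δ) for Hoeffding.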

Application 4. Unbiased random walk on a line. Consider a particle which does an unbiased random walk on the real line: it starts at zero, and in each time step it moves one step ahead or one step back, i.e., from position i it goes to i + 1 with probability 1/2 and to i − 1 with the remaining probability. We want to see how far from the origin the particle is at time n. We can model this with a sequence X_1, ..., X_n of independent random variables where for each i,

    X_i = { +1    with probability 1/2,
          { −1    with probability 1/2.

Let X = X_1 + X_2 + ... + X_n. We want to prove an upper bound on |X|. Since E[X] = 0, and each X_i lies in [−1, 1] so that Σ_i (b_i − a_i)² = 4n, Hoeffding's inequality gives

    P[ |X/n − 0| ≥ ε ] ≤ 2 exp( −2n²ε² / 4n ) = 2 exp(−nε²/2).

So if we take ε = 2√(log(n)/n), the right hand side becomes 2 exp(−2 log n) = 2/n², which is at most 1/n for n ≥ 2. In other words, with probability at least 1 − 1/n we have |X|/n ≤ 2√(log(n)/n), or equivalently |X| ≤ 2√(n·log(n)). That is, with high probability the particle is within distance O(√(n log n)) of the origin. In the next lecture, we show that w.h.p. the particle has distance at least Ω(√n) from the origin.
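As a final illustration added to these notes, the sketch below runs many independent walks and checks how rarely the endpoint exceeds the 2√(n log n) threshold; the Hoeffding bound above predicts a failure probability of at most 2/n².

    import math
    import random

    def random_walk_endpoint(n):
        # Sum of n independent +/-1 steps, each direction with probability 1/2.
        return sum(1 if random.random() < 0.5 else -1 for _ in range(n))

    random.seed(0)
    n, trials = 10_000, 1_000
    threshold = 2 * math.sqrt(n * math.log(n))

    exceed = sum(1 for _ in range(trials) if abs(random_walk_endpoint(n)) > threshold)
    print("threshold 2*sqrt(n log n):  ", round(threshold, 1))   # about 607
    print("fraction of walks beyond it:", exceed / trials)       # essentially 0
    print("Hoeffding bound 2/n^2:      ", 2 / n**2)              # 2e-08

Typical endpoints are on the order of √n ≈ 100 here, well inside the threshold, consistent with the lower bound promised for the next lecture.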