Problem 1: Practce wth Chebyshev and Chernoff bounds) When usng concentraton bounds to analyze randomzed algorthms, one often has to approach the problem n dfferent ways dependng on the specfc bound beng used. Typcally, Chebyshev s useful when dealng wth more complcated random varables, and n partcular, when they are parwse ndependent; Chernoff bounds are usually used along wth the unon bound for events whch are easer to analyze. We ll now go back and look at a few of our older examples usng both these technques. Part a) Number of collsons) Recall we showed that f we throw m balls n n bns, the average number of collsons X m,n to be µ m,n = ) m 1 n. Use Chebyshev s nequalty to show that: P X m,n µ m,n c µ m,n 1/c. Next suppose we choose m = n, then µ m,n 1. Use Chernoff bounds plus the unon bound to bound the probablty that no bn has more than 1 ball. Compare ths to the more exact analyss you dd n homework 1. Soluton: Let σ m,n be the standard devaton of X m,n. As we dd before, for every par of balls, j), let Y,j be the ndcator that the two balls collde. Note that snce Y,j are {0, 1} valued r.vs, we have EY,j = EY,j, and thus V ary,j ) EY,j. Moreover, observe that Y,j are parwse ndependent,.e., for any, j), j ), the r.v.s Y,j and Y,j are ndependent n partcular, note that Y,j, Y,k and Y j,k are parwse ndependent however they are not mutually ndependent). Now we have X m,n =,j Y,j, and thus µ m,n =,j EY,j and V arx m,n ) =,j V ary,j),j EY,j = µ m,n. Thus, by Chebyshev s nequalty, we have: P X m,n µ m,n c µ m,n 1/c. To use Chernoff bounds, we nstead consder the number of balls n each bn. Let Z b, be the ndcator that ball fell n bn b these are now mutually ndependent. Further, let B b = Z b, be the number of balls n bn b then we know that EB b = m/n, and moreover usng our standard Chernoff bound, we have: P B b = P B b 1 + n/m 1)) EB b exp n/m 1) m/n) + n/m 1) n m) = exp nm + n) ) ) Usng PX 1 + ɛ)µ exp ɛ )) µ + ɛ ) Fnally, by the unon bound, we have PSome bn has balls n exp n m) nm+n). Now f m = ) n, we get exp = Θ1), and thus the bound on PSome bn has balls s not n m) nm+n) 1
useful as t grows wth n). On the other hand, n the frst homework, we dd a drect calculaton to show that PNo collsons exp EX m,n ), and thus f m = n, we have PNo collsons 1/. Thus, the Chernoff bound does not gve us the tghtest scalng n ths case. Part b) Coupon collector) For n bns, recall that we defned T to be the frst tme when unque bns were flled, and used these random varables to show that T n,.e., the number of balls we need to throw before every bn has at least one ball, satsfes ET n = nh n = Θn log n). Usng the same random varables, show that P T n ET n cet n π. 6c Hn Next, suppose we throw n m = n log n+cn balls usng Chernoff bounds plus the unon bound, choose c such that no bn s empty wth probablty greater than 1 δ. Hnt: Use =1 1/ = π /6 Soluton: Frst, usng the ndependence of T s let s calculate V art n : n 1 V art n = V art = =0 n 1 =0 n 1 1 p = p =0 Now, usng Chebyshev s nequalty, we get: n n n ) = =1 nn ) n P T n ET n cet n = P T n ET n cnh n V art n cnh n ) n =1 1 π n 6. π 6c H n. Next, let m = n log n + cn, and let B b be the number of balls n bn b. We know EB b = log n + c. Then by the Chernoff bound, we have: ) ) EBb log n + c) PB b 0 = PB b 1 1)EB b exp = exp log n+c) Fnally, usng the unon bound, we have PSome bn s empty npb b 0 = explog n ), and we can set c = log n + log1/δ) to get ths less than δ. Here, usng a drect calculaton s better than the Chernoff bound. In partcular, we have: PB b 0 = 1 n) 1 m e m/n = e c /n By the unon bound, we have PSome bn s empty e c, and thus we need c = log1/δ) to ensure ths s less than δ. The man takeaway agan s that Chernoff bounds are fne when probabltes are small and we do not want the tghtest bounds, but may be weak when probabltes are larger and we want tghter bounds.
Problem : The Hoeffdng Extenson) In class, we saw that for a r.v. X Bernoullp ), we have: Ee θx = 1 p + pe θ Pluggng ths nto the Chernoff bound and optmzng over θ, we obtaned a varety of bounds n partcular, for ndependent r.vs {X } wth X = X and µ = EX = p, we showed for any ɛ > 0: ɛ ) µ P X µ ɛµ exp + ɛ We now extend ths to more general bounded r.vs. Part a) Frst, for any θ, argue that the functon fx) = e θx for x 0, 1 s bounded above by the lne jonng 0, 1) and 1, e θ ). Usng ths, fnd constants α, β such that x 0, 1: e θx αx + β Soluton: Recall that fx) = e θx s a convex functon. Now, note that f0) = 1 and f1) = e θ. Therefore, for x 0, 1, t s bounded above by the lne jonng 0, 1) and 1, e θ ), whch means: Part b) e θx e θ 1)x + 1. Next, for any random varable X takng values n 0, 1 such that EX = µ, show that: Ee θx 1 µ + µ e θ. Usng ths, for ndependent r.vs X takng values n 0, 1 wth EX = µ, and defnng X = X, µ = µ, show that: ɛ ) µ P X µ ɛµ exp + ɛ Note: You can drectly use the nequalty for Bernoull r.v.s no need to show the optmzaton.) Soluton: Let s take the expectatons of both sdes of the nequalty n the soluton of the part a): Ee θx Ee θ 1)X + 1 = e θ 1)EX + 1 = e θ 1)µ 1 + 1 = 1 µ + µ e θ. Now, let X = X, µ = µ. Usng exactly the same method s we dd n class for Bernoull r.vs, we get: ɛ ) µ P X µ ɛµ exp + ɛ 3
Part c) Optonal) Next, for any random varable X takng values n a, b such that EX = µ, a smlar boundng technque as above can be used to show: ) E e θx µ ) 1 exp 8 θ b a) Ths s sometmes referred to as Hoeffdng s lemma for the proof, see the wkpeda artcle) Now consder ndependent r.vs X takng values n a, b wth EX = µ, and as before, let X = X, µ = µ. Usng the above nequalty, optmze over θ to show that: ɛ µ ) PX µ) ɛµ exp n =1 b a ) Soluton: As n the standard Chernoff nequalty, we have: PX µ ɛµ = P e θx µ) e θɛµ mn θ>0 = mn θ>0 mn θ>0 exp Ee θx µ) e θɛµ Π Ee θx µ ) e θɛµ θ 8 ) b a ) θɛµ To optmze, we choose θ = 4ɛµ/ b a ), to get: ɛ µ ) PX µ ɛµ exp b a ) Problem 3: A Weaker Samplng Theorem: Adapted from MU Ex 4.9) In class we saw the followng samplng theorem for estmatng the mean of a {0, 1}-valued random varable: In order to get an estmate wthn ±ɛ wth confdence 1 δ, we need n +ɛ ɛ ln δ. A crucal component n ths proof was usng the Chernoff bound for Bernoullp) random varables. Suppose nstead we want to estmate a more general random varable X for example, the average number of hours of TV watched by a random person) we may not be able to use a Chernoff bound f we do not know the moment generatng functon We now show how to to get a smlar samplng theorem whch only uses knowledge of the mean and varance of ax. We want to estmate a r.v. X wth mean EX and varance V arx, gven..d samples X 1, X,.... Let r = V arx/ex ) we now show that we can estmate t up to accuracy ±ɛex and confdence 1 δ usng O ɛ ln 1 r δ samples. 4
Part a) Gven n samples X 1,..., X n, suppose we use the estmator X = n =1 X ) /n. Show that n = Or /ɛ δ) s suffcent to ensure: P X EX ɛex δ Soluton: By lnearty of expectaton, we have that E X = n =1 EX /n = EX. Moreover, snce X s are..d., we have that V ar X = n =1 V arx )/n = V arx)/n = r EX /n. Now, usng Chebyshev s nequalty we get: P X V ar X) EX ɛex ɛ EX = r nɛ To ensure P X EX ɛex δ, we can choose n such that r δ. Usng n = Or /ɛ δ) nɛ samples s thus suffcent. Part b) We say an estmator s a weak estmator f t satsfes that P X EX ɛex 1/4 usng part a), show, that we need Or /ɛ ) samples to obtan a weak estmator. Now suppose we are gven m weak estmates X 1, X,..., X m, and we defne a new estmator X to be the medan of these weak estmates. Show that usng m = Oln1/δ) weak estmates gves us an estmate X that satsfes: P X EX ɛex δ What could go wrong f we used the mean of X 1, X,..., X m nstead of the medan? NOTE: I got the defnton of weak estmator nverted n the Homework - ths s the correct defnton. Essentally, we want the probablty of samples beng close to the mean to be > 1/. Soluton: If we take δ = 1 4 n part a), we see that X s a weak estmator f we use n = Or /ɛ ) samples. Now we want to show that we need Olog 1/δ) such weak estmators to ensure that the medan of these estmators s wthn 1 ± ɛ)ex. Let us ntroduce new r.vs Y s.t. Y = 1, f X EX ɛex, and Y = 0, f X EX ɛex. Y = m = Y s the number of weak estmates for whch P X EX ɛex δ; to ensure that the medan s also wthn 1 ± ɛ)ex, we need Y > m/. In other words, we have: P X EX ɛex = P Y m/. 5
Moreover, PY = 0 3/4, and hence EY 3m/4. Now for ndependent random varables Z Bernoullp ), wth Z = n =1 Z, we saw the followng Chernoff Bound: PZ 1 ɛ)ez exp ɛ ) EZ Usng ths, we can wrte: P Y m/ = P Y 1 1/3)3m/4 PY 1 1/3)EY e 1/9)EY e m/18. Now, f we take m 18 ln1/δ) = Olog 1/δ), we get P X EX ɛex δ. Note though that we only needed to count the weak estmators that fell outsde 1 ± ɛ)ex we dd not need to assume they are bounded, or have bounded varance, etc. If nstead we had used the mean of the weak estmators, we could have a problem f these bad estmates happened to take very large values. Usng the medan made our estmate robust to such outlers. Problem 4: Randomzed Set-Cover) In ths problem, we ll look at randomzed roundng, whch s a very powerful technque for solvng large-scale combnatoral optmzaton problems. The man dea s that gven a problem whch can be wrtten as an optmzaton problem wth nteger constrants, we can sometmes solve the relaxed problem wth non-nteger constrants, and then round the solutons to get a good assgnment. We wll hghlght ths technque for the Mnmum Set-Cover problem. We are gven a collecton of m subsets {S 1, S,..., S m } whch are subsets of some large set U of n elements, such that S = U. The Mnmum Set-Cover problem s that of selectng the smallest number of sets C from the collecton {S 1, S,..., S m } such that they cover U,.e., such that each element n U les n at least one of the sets n C. Part a) Argue that the mnmum-set cover problem s equvalent to the followng nteger program: Mnmze x x subject to x 1, e U e S x {0, 1}, {1,,..., m} Let the soluton to ths problem,.e., the mnmum set-cover, be denoted OP T. Next, argue that f we solve the same problem, but now change the last constrant to x 0, 1 for all, then the resultng soluton OP T LP of ths relaxed problem obeys OP T LP OP T. Note that the relaxed problem s an LP and hence can be solved effcently. 6
Soluton: For each set S, we assocate a varable x {0, 1} that ndcates of we want to choose S or not. We can then wrte the solutons for Mnmum Set-Cover problem as a vector x {0, 1} m. The objectve s clearly to mnmze the sum of these varables moreover, snce the sets we select must cover each element n U, we need one constrant to ensure ths for each element. If we replace each constrant x {0, 1} wth x 0, 1 for all, then the resultng problem can be solved n a polynomal tme as t s a Lnear Program LP). However, by replacng nteger constrants wth contnuous domans, we have expanded the set of feasble solutons thus, the resultng mnmzaton problem must gve a smaller value as all feasble nteger solutons are wthn our new feasble regon), and hence we have that OP T LP OP T. Part b) Gven a soluton z to the relaxed LP, we now round the values to obtan a feasble soluton for the orgnal mnmum set-cover problem. For each set S, we generate k = c log n..d Bernoullz ) random varables X,1, X,,..., X,k f any of them s 1, then we set x = 1,.e., we add S to our cover C. Prove that the resultng set-cover obeys E C c log n OP T. Soluton: Frst, from the defnton of the roundng process, for any j {1,,..., k}, we have: EX,j = PX,j = 1 = z = OP T LP Therefore, by lnearty of expectaton, we have: E C clog n j=1 EX,j = c log n OP T LP c log n OP T. Part c) Fnally, choose c to ensure that the probablty that the resultng set-cover C does not cover any element e U s less than 1/n. Soluton: For any element e U, and for any j {1,,..., c log n}, let Y e,j = e S X,j, and let Y e = c log n j=1 Y e,j. Note that for the sets S contanng element e, f any of the X,j = 1, then e s covered n other words, Pe s covered = PY e 1. Moreover, EY e,j = e S EX,j = e S z. Let k e = { e S } ; snce e s fractonally covered n the LP, we have z 1 + + z ke 1, and thus EY e c log n. Now usng our basc Chernoff bound, we have: PY e 0 = P Y e 1 1)EY e e EYe c log n e = 1 n c/ Thus Pe s not covered n c/. Moreover, by the unon bound, we have that: PAny element e U s not covered n 1 c/. 7
If c 6, we get 1 c/ =, and hence PEvery element e U s covered 1 n. Note: You can actually get a tghter bound of c 3 usng the followng argument: Frst, we need to observe that probablty of e beng covered s mnmzed when z are all equal,.e., z 1 = = z ke = 1/k e ths s not obvous - try to prove t...). Thus we get for any j: P Y,j = 0 1 z 1 ) 1 z ke ) e S 1 1 k e ) ke 1 e. Therefore, each element s covered wth probablty at least 1 1/e f we draw only one sample of each X. Snce we draw c log n samples, the probablty that element e s not covered s: Pe s not covered ) 1 c log n = 1 e n c. Now va the unon bound, we see that f we take c = 3, then we ll have: PAny element e U s not covered 1 n. Problem 5: Papadmtrou s SAT Algorthm) Optonal) The SAT problem Wkpeda entry), and more generally, the boolean satsfablty SAT) problem Wkpeda entry) are one of the cornerstones of theoretcal algorthms, and also a very useful modelng tool for a varety of optmzaton problems. In the general SAT problem, we want to fnd a satsfyng assgnment for a gven a Boolean expresson n n Boolean.e., {0, 1}, or FALSE/TRUE) varables {X 1, X,..., X n } typcally nvolvng conjunctons.e., logcal AND, denoted as ), dsjunctons.e., logcal OR, denoted as ) and negatons logcal NOT; typcally X denotes the negaton of a varable X). In SAT, the expresson s restrcted to beng a conjuncton AND) of several clauses, where each clause s the dsjuncton OR) of two lterals ether a varable or ts negaton). For example, the expresson X 1 X ) X 1 X 3 ) X X 3 ) s a SAT formula, whch has several satsfyng assgnments ncludng X 1 = 1, X = 1, X 3 = 1). Note that n order to fnd a satsfyng assgnment, we need to set the varables such that each clause n the formula has at least one lteral whch s TRUE. Although the general SAT problem s known to be NP-complete, SAT can be solved n polynomal tme we wll now see a smple randomzed algorthm that demonstrates ths fact: Papapdmtrou s SAT Algorthm): Gven a CNF formula F nvolvng n Boolean varables, and an arbtrary assgnment τ, we check f τ satsfes F. If not, we pck an arbtrary unsatsfed clause, pck one of ts lterals unformly at random, and flp t to get a new assgnment τ. We then repeat ths untl we fnd a satsfyng assgnment. Part a) Assume that F has a unque satsfyng assgnment τ, and for any assgnment τ, let Nτ) be the number of lterals n τ whch agree.e., have the same value) as the correspondng lteral n 8
τ. Argue that each tme we execute an teraton of Papadmtrou s SAT algorthm wth nput assgnment τ, the new assgnment τ satsfes: { Nτ Nτ) + 1 wth probablty at least 1 ) = Nτ) 1 wth probablty at most 1 Soluton: Gven any unsatsfed clause, we know that n τ at least one of the lterals s flpped t could be both are flpped). Snce we choose to flp a unform random lteral out of the two, we are guaranteed that we ncrease the agreement between our guess τ and τ by 1 wth probablty at least 1/. Part b) Based on the above, argue that the runnng tme T n of the algorthm s upper bounded by the frst tme { that a symmetrc random walk startng from 0 hts n or n equvalently, T n = arg mn K>0 } K =1 X = n, where X are..d Rademacher random varables). Next, show that ET n = n, and thus, prove that after On 3 ) teratons, Papadmtrou s algorthm termnates wth probablty greater than 1 1/n. Soluton: See ths lecture and the next) from Tm Roughgarden s algorthms course for a beautful exposton of ths proof. One thng you should note: although runnng the algorthm tll On 3 ) steps gves the desred probablty, one can get much better bounds f we nstead stop after On ) steps, and then restart the process. Markov s tells us that after n steps, we fnd the soluton wth probablty at least 1/ now from prevous assgnments, you know that dong Olog n) ndependent trals s suffcent to amplfy the probablty to 1 1/n. However, the analyss works for any startng pont so ths does suggest that On log n) steps of Papadmtrou s algorthm was suffcent... Is ths surprsng? Not really bascally what ths says s that the falure probablty does follow a Chernoff-style exponental decay. We could not show ths drectly as the random varables were not ndependent however, Markov s nequalty stll works, and we can explot that n a clever way to get our result. 9