
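Chernoff Bounds

Wolfgang Mulzer

1 The General Bound

Let $P = (p_1, \dots, p_m)$ and $Q = (q_1, \dots, q_m)$ be two distributions on $m$ elements, i.e., $p_i, q_i \ge 0$, for $i = 1, \dots, m$, and $\sum_{i=1}^m p_i = \sum_{i=1}^m q_i = 1$. The Kullback-Leibler divergence or relative entropy of $P$ and $Q$ is defined as
\[ D_{KL}(P \| Q) := \sum_{i=1}^m p_i \ln \frac{p_i}{q_i}. \]
If $m = 2$, i.e., $P = (p, 1-p)$ and $Q = (q, 1-q)$, we also write $D_{KL}(p \| q)$. The Kullback-Leibler divergence provides a measure of distance between the distributions $P$ and $Q$: it represents the expected loss of efficiency if we encode an $m$-letter alphabet with distribution $P$ with a code that is optimal for distribution $Q$. We can now state the general form of the Chernoff Bound:

Theorem 1.1. Let $X_1, \dots, X_n$ be independent random variables with $X_i \in \{0, 1\}$ and $\Pr[X_i = 1] = p$, for $i = 1, \dots, n$. Set $X := \sum_{i=1}^n X_i$. Then, for any $t \in [0, 1-p]$, we have
\[ \Pr[X \ge (p+t)n] \le e^{-D_{KL}(p+t \| p)\, n}. \]

2 Four Proofs

2.1 The Moment Method

The usual proof of Theorem 1.1 uses the exponential function $\exp$ and Markov's inequality. It is called the moment method because $\exp$ simultaneously encodes all moments of $X$, i.e., $X, X^2, X^3$, etc. The proof technique is very general and can be used to obtain several variants of Theorem 1.1. Let $\lambda > 0$ be a parameter to be determined later. We have
\[ \Pr[X \ge (p+t)n] = \Pr[\lambda X \ge \lambda(p+t)n] = \Pr\bigl[e^{\lambda X} \ge e^{\lambda(p+t)n}\bigr]. \]
From Markov's inequality, we obtain
\[ \Pr\bigl[e^{\lambda X} \ge e^{\lambda(p+t)n}\bigr] \le \frac{E[e^{\lambda X}]}{e^{\lambda(p+t)n}}. \]
Now, the independence of the $X_i$ yields
\[ E[e^{\lambda X}] = E\Bigl[e^{\lambda \sum_{i=1}^n X_i}\Bigr] = E\Bigl[\prod_{i=1}^n e^{\lambda X_i}\Bigr] = \prod_{i=1}^n E\bigl[e^{\lambda X_i}\bigr] = \bigl(p e^{\lambda} + 1 - p\bigr)^n. \]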

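Thus,
\[ \Pr[X \ge (p+t)n] \le \left(\frac{p e^{\lambda} + 1 - p}{e^{\lambda(p+t)}}\right)^n, \tag{1} \]
for every $\lambda > 0$. Optimizing for $\lambda$ using calculus, we get that the right hand side is minimized if
\[ e^{\lambda} = \frac{(1-p)(p+t)}{p(1-p-t)}. \]
Plugging this into (1), we get
\[ \Pr[X \ge (p+t)n] \le \left[\left(\frac{p}{p+t}\right)^{p+t} \left(\frac{1-p}{1-p-t}\right)^{1-p-t}\right]^n = e^{-D_{KL}(p+t \| p)\, n}, \]
as desired.

2.2 Chvátal's Method

Let $B(n, p)$ be the random variable that gives the number of heads in $n$ independent Bernoulli trials with success probability $p$. It is well known that
\[ \Pr[B(n,p) = l] = \binom{n}{l} p^l (1-p)^{n-l}, \]
for $l = 0, \dots, n$. Thus, for any $\tau \ge 1$ and $k \ge pn$, we get
\[ \Pr[B(n,p) \ge k] = \sum_{i=k}^{n} \binom{n}{i} p^i (1-p)^{n-i} \le \sum_{i=k}^{n} \binom{n}{i} p^i (1-p)^{n-i} \underbrace{\tau^{i-k}}_{\ge 1} + \sum_{i=0}^{k-1} \binom{n}{i} p^i (1-p)^{n-i} \underbrace{\tau^{i-k}}_{\ge 0} = \sum_{i=0}^{n} \binom{n}{i} p^i (1-p)^{n-i} \tau^{i-k}. \]
Using the Binomial theorem, we obtain
\[ \Pr[B(n,p) \ge k] \le \sum_{i=0}^{n} \binom{n}{i} p^i (1-p)^{n-i} \tau^{i-k} = \tau^{-k} \sum_{i=0}^{n} \binom{n}{i} (p\tau)^i (1-p)^{n-i} = \frac{(p\tau + 1 - p)^n}{\tau^k}. \]
If we write $k = (p+t)n$ and $\tau = e^{\lambda}$, we can conclude
\[ \Pr[B(n,p) \ge (p+t)n] \le \left(\frac{p e^{\lambda} + 1 - p}{e^{\lambda(p+t)}}\right)^n. \]
This is the same as (1), so we can complete the proof of Theorem 1.1 as in Section 2.1.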
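The optimization step in Section 2.1 can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names and the values n = 100, p = 0.3, p + t = 0.5 are illustrative assumptions. It compares the exact binomial tail with the bound of Theorem 1.1 and evaluates the right hand side of (1) at the optimal $\lambda$ and at one other value.

```python
import math

def kl_bernoulli(q, p):
    """D_KL(q || p) for two-point distributions (natural log), with 0 * ln 0 = 0."""
    def term(a, b):
        return 0.0 if a == 0.0 else a * math.log(a / b)
    return term(q, p) + term(1.0 - q, 1.0 - p)

def chernoff_bound(n, p, q):
    """e^{-D_KL(q || p) n}, the bound of Theorem 1.1 with q = p + t."""
    return math.exp(-kl_bernoulli(q, p) * n)

def moment_bound(n, p, q, lam):
    """Right hand side of (1) for a given lambda > 0, with q = p + t."""
    return ((p * math.exp(lam) + 1.0 - p) / math.exp(lam * q)) ** n

def binomial_upper_tail(n, p, k):
    """Exact Pr[B(n, p) >= k]."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))

# Illustrative values (assumptions): k = (p + t) n = 50, so p + t = 0.5.
n, p, k = 100, 0.3, 50
q = k / n
lam_opt = math.log((1 - p) * q / (p * (1 - q)))  # minimizer of (1) from Section 2.1

print("exact tail       :", binomial_upper_tail(n, p, k))
print("Chernoff bound   :", chernoff_bound(n, p, q))
print("(1) at lam_opt   :", moment_bound(n, p, q, lam_opt))
print("(1) at lam = 0.5 :", moment_bound(n, p, q, 0.5))
```

For these values the exact tail should lie below the Chernoff bound, and (1) evaluated at `lam_opt` should agree with the Chernoff bound up to rounding, while any other $\lambda$ gives a weaker bound.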

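2.3 The Impagliazzo-Kabanets Method

Let $\lambda \in [0, 1]$ be a parameter to be chosen later. Let $I \subseteq \{1, \dots, n\}$ be a random index set obtained by including each element $i \in \{1, \dots, n\}$ with probability $\lambda$. We estimate $\Pr\bigl[\prod_{i \in I} X_i = 1\bigr]$ in two different ways, where the probability is over the random choice of $X_1, \dots, X_n$ and $I$. On the one hand, using the union bound and independence, we have
\[ \Pr\Bigl[\prod_{i \in I} X_i = 1\Bigr] \le \sum_{S \subseteq \{1,\dots,n\}} \Pr\Bigl[I = S \wedge \prod_{i \in S} X_i = 1\Bigr] = \sum_{S \subseteq \{1,\dots,n\}} \Pr[I = S] \cdot \Pr[X_i = 1]^{|S|} = \sum_{S \subseteq \{1,\dots,n\}} \lambda^{|S|} (1-\lambda)^{n-|S|} p^{|S|} = \sum_{s=0}^{n} \binom{n}{s} (\lambda p)^s (1-\lambda)^{n-s} = (\lambda p + 1 - \lambda)^n, \tag{2} \]
by the Binomial theorem. On the other hand, by the law of total probability,
\[ \Pr\Bigl[\prod_{i \in I} X_i = 1\Bigr] \ge \Pr\Bigl[\prod_{i \in I} X_i = 1 \Bigm| X \ge (p+t)n\Bigr] \cdot \Pr[X \ge (p+t)n]. \]
Now, fix $X_1, \dots, X_n$ with $X \ge (p+t)n$. For the fixed choice of $X_1 = x_1, \dots, X_n = x_n$, the probability $\Pr_I\bigl[\prod_{i \in I} x_i = 1\bigr]$ is exactly the probability that $I$ avoids all the $n - X$ indices $i$ where $x_i = 0$. Thus,
\[ \Pr_I\Bigl[\prod_{i \in I} x_i = 1\Bigr] = (1-\lambda)^{n-X} \ge (1-\lambda)^{(1-p-t)n}. \]
Since the bound holds uniformly for every choice of $x_1, \dots, x_n$ with $X \ge (p+t)n$, we get
\[ \Pr\Bigl[\prod_{i \in I} X_i = 1 \Bigm| X \ge (p+t)n\Bigr] \ge (1-\lambda)^{(1-p-t)n}, \]
so
\[ \Pr\Bigl[\prod_{i \in I} X_i = 1\Bigr] \ge (1-\lambda)^{(1-p-t)n} \cdot \Pr[X \ge (p+t)n]. \]
Combining with (2),
\[ \Pr[X \ge (p+t)n] \le \left(\frac{\lambda p + 1 - \lambda}{(1-\lambda)^{1-p-t}}\right)^n. \tag{3} \]
Using calculus, we get that the right hand side is minimized for $\lambda = t / \bigl((1-p)(p+t)\bigr)$ (note that $\lambda \le 1$ for $t \le 1-p$). Plugging this into (3),
\[ \Pr[X \ge (p+t)n] \le \left[\left(\frac{p}{p+t}\right)^{p+t} \left(\frac{1-p}{1-p-t}\right)^{1-p-t}\right]^n = e^{-D_{KL}(p+t \| p)\, n}, \]
as desired.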
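Both halves of this argument can be reproduced on a small instance. The sketch below is an illustration under assumed values (n = 10, p = 0.3, t = 0.2) with hypothetical helper names: it sums over all index sets $S$ to confirm the closed form in (2), and evaluates the right hand side of (3) at the stated minimizer.

```python
import itertools
import math

def prob_product_is_one(n, p, lam):
    """Pr[prod_{i in I} X_i = 1], summing Pr[I = S] * p^|S| over all index sets S."""
    total = 0.0
    for indicator in itertools.product((0, 1), repeat=n):  # indicator vector of S
        size = sum(indicator)
        total += lam**size * (1 - lam) ** (n - size) * p**size
    return total

def ik_bound(n, p, t, lam):
    """Right hand side of (3) for lambda in [0, 1)."""
    return ((lam * p + 1 - lam) / (1 - lam) ** (1 - p - t)) ** n

def kl_bernoulli(q, p):
    """D_KL(q || p) for two-point distributions, assuming 0 < q, p < 1."""
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))

# Illustrative values (assumptions, small enough to enumerate all 2^n sets).
n, p, t = 10, 0.3, 0.2
lam_opt = t / ((1 - p) * (p + t))  # minimizer of (3) stated in the text

print(prob_product_is_one(n, p, 0.4), (0.4 * p + 1 - 0.4) ** n)
print(ik_bound(n, p, t, lam_opt), math.exp(-kl_bernoulli(p + t, p) * n))
```

Each printed pair should agree up to rounding: the first confirms the Binomial-theorem step in (2), the second confirms that (3) at the optimal $\lambda$ equals $e^{-D_{KL}(p+t \| p) n}$.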

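2.4 The Coding Theoretic Argument

The next proof, due to Luc Devroye, Gábor Lugosi, and Pat Morin, is inspired by coding theory. Let $\{0,1\}^n$ be the set of all bit strings of length $n$, and let $w : \{0,1\}^n \to [0,1]$ be a weight function. We call $w$ valid if $\sum_{x \in \{0,1\}^n} w(x) \le 1$. The following lemma says that for any probability distribution $p_x$ on $\{0,1\}^n$, a valid weight function is unlikely to be substantially larger than $p_x$.

Lemma 2.1. Let $D$ be a probability distribution on $\{0,1\}^n$ that assigns to each $x \in \{0,1\}^n$ a probability $p_x$, and let $w$ be a valid weight function. For any $s \ge 1$, we have
\[ \Pr_{x \sim D}[w(x) \ge s p_x] \le 1/s. \]

Proof. Let $Z_s = \{x \in \{0,1\}^n \mid w(x) \ge s p_x\}$. We have
\[ \Pr_{x \sim D}[w(x) \ge s p_x] = \sum_{x \in Z_s,\, p_x > 0} p_x \le \sum_{x \in Z_s,\, p_x > 0} p_x \cdot \frac{w(x)}{s p_x} \le \frac{1}{s} \sum_{x \in Z_s} w(x) \le 1/s, \]
since $w(x)/(s p_x) \ge 1$ for $x \in Z_s$, $p_x > 0$, and since $w$ is valid.

We now show that Lemma 2.1 implies Theorem 1.1. For this, we interpret the sequence $X_1, \dots, X_n$ as a bit string of length $n$. This induces a probability distribution $D$ that assigns to each $x \in \{0,1\}^n$ the probability $p_x = p^{k_x} (1-p)^{n-k_x}$, where $k_x$ denotes the number of 1-bits in $x$. We define a weight function $w : \{0,1\}^n \to [0,1]$ by $w(x) = (p+t)^{k_x} (1-p-t)^{n-k_x}$, for $x \in \{0,1\}^n$. Then $w$ is valid, since $w(x)$ is the probability that $x$ is generated by setting each bit to 1 independently with probability $p + t$. For $x \in \{0,1\}^n$, we have
\[ \frac{w(x)}{p_x} = \left(\frac{p+t}{p}\right)^{k_x} \left(\frac{1-p-t}{1-p}\right)^{n-k_x}. \]
Since $\bigl((p+t)/p\bigr)\bigl((1-p)/(1-p-t)\bigr) \ge 1$, it follows that $w(x)/p_x$ is an increasing function of $k_x$. Hence, if $k_x \ge (p+t)n$, we have
\[ \frac{w(x)}{p_x} \ge \left[\left(\frac{p+t}{p}\right)^{p+t} \left(\frac{1-p-t}{1-p}\right)^{1-p-t}\right]^n = e^{D_{KL}(p+t \| p)\, n}. \]
We now apply Lemma 2.1 to $D$ and $w$ to get
\[ \Pr[X \ge (p+t)n] = \Pr_{x \sim D}[k_x \ge (p+t)n] \le \Pr_{x \sim D}\bigl[w(x) \ge p_x\, e^{D_{KL}(p+t \| p)\, n}\bigr] \le e^{-D_{KL}(p+t \| p)\, n}, \]
as claimed in Theorem 1.1.

We provide some coding-theoretic background to explain the intuition behind the proof. A code for $\{0,1\}^n$ is an injective function $C : \{0,1\}^n \to \{0,1\}^*$. The images of $C$ are called codewords. A code is called prefix-free if no codeword is the prefix of another codeword, i.e., for all $x, y \in \{0,1\}^n$ with $x \ne y$, we have that if $|C(x)| \le |C(y)|$, then $C(x)$ and $C(y)$ differ in at least one of the first $|C(x)|$ bit positions. A prefix-free code has a natural representation as a rooted binary tree in which the leaves correspond to elements of $\{0,1\}^n$. Even though the codeword lengths in a prefix-free code may vary, this structure imposes a restriction on the allowed lengths. This is formalized in Kraft's inequality.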

Lemma 2.2 (Kraft's inequality). Let $C : \{0,1\}^n \to \{0,1\}^*$ be a prefix-free code. Then,
\[ \sum_{x \in \{0,1\}^n} 2^{-|C(x)|} \le 1. \]
Conversely, given a function $l : \{0,1\}^n \to \mathbb{N}$ with
\[ \sum_{x \in \{0,1\}^n} 2^{-l(x)} \le 1, \]
there exists a prefix-free code $C : \{0,1\}^n \to \{0,1\}^*$ with $|C(x)| = l(x)$ for all $x \in \{0,1\}^n$.

Proof. Let $m = \max_{x \in \{0,1\}^n} |C(x)|$, and let $y$ be a random element of $\{0,1\}^m$. Then, for each $x \in \{0,1\}^n$, the probability that $C(x)$ is a prefix of $y$ is exactly $2^{-|C(x)|}$. Furthermore, since $C$ is prefix-free, these events are mutually exclusive. Thus, $\sum_{x \in \{0,1\}^n} 2^{-|C(x)|} \le 1$, as claimed.

Next, we prove the second part. Let $m = \max_{x \in \{0,1\}^n} l(x)$ and let $T$ be a complete binary tree of height $m$. We construct $C$ according to the following algorithm: we set $X = \{0,1\}^n$, and we pick $x^* \in X$ with $l(x^*) = \min_{x \in X} l(x)$. Then we select a node $v \in T$ with depth $l(x^*)$. We assign to $C(x^*)$ the codeword of length $l(x^*)$ that corresponds to $v$, and we remove $v$ and all its descendants from $T$. This deletes exactly $2^{m - l(x^*)}$ leaves from $T$. Next, we remove $x^*$ from $X$ and we repeat this procedure until $X$ is empty. While $X \ne \emptyset$, we have
\[ \sum_{x \in \{0,1\}^n \setminus X} 2^{m - l(x)} < 2^m, \]
so $T$ contains in each iteration at least one leaf and thus also at least one node of depth $l(x^*)$. Since we assign the nodes by increasing depth, and since all descendants of an assigned node are deleted from the tree, the resulting code is prefix-free.

Kraft's inequality shows that a prefix-free code $C$ induces a valid weight function $w(x) = 2^{-|C(x)|}$. Thus, Lemma 2.1 implies that for any probability distribution $p_x$ on $\{0,1\}^n$ and for any prefix-free code, the probability mass of the strings $x$ with codeword length at most $\log(1/p_x) - s$ is at most $2^{-s}$. Now, if we set $l(x) = \lceil -k_x \log(p+t) - (n - k_x) \log(1-p-t) \rceil$ for $x \in \{0,1\}^n$, the converse of Kraft's inequality shows that there exists a prefix-free code $C$ with $|C(x)| = l(x)$. The calculation above shows that $C$ saves roughly $n\bigl((p+t) \log((p+t)/p) + (1-p-t) \log((1-p-t)/(1-p))\bigr)$ bits over $\log(1/p_x)$ for any $x$ with $k_x \ge (p+t)n$, which almost gives the desired result. We generalize to arbitrary valid weight functions to avoid the slack introduced by the ceiling function.
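The constructive half of Kraft's inequality can be sketched in a few lines of Python. This is an illustration under assumptions rather than the construction from the proof verbatim: the proof allows any node of the required depth, whereas the sketch always takes the leftmost surviving one, and the names `kraft_sum`, `prefix_free_code`, and `is_prefix_free` are hypothetical.

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Exact value of sum_x 2^{-l(x)}."""
    return sum(Fraction(1, 2**l) for l in lengths)

def prefix_free_code(lengths):
    """Codewords (as bit strings) with the prescribed lengths, in input order."""
    if kraft_sum(lengths) > 1:
        raise ValueError("lengths violate Kraft's inequality")
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])  # increasing depth
    codewords = [""] * len(lengths)
    next_node, depth = 0, 0  # leftmost surviving node at the current depth
    for i in order:
        next_node <<= lengths[i] - depth  # descend to depth l(x*)
        depth = lengths[i]
        codewords[i] = format(next_node, "b").zfill(depth) if depth > 0 else ""
        next_node += 1  # the assigned node and its subtree are now removed
    return codewords

def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    return not any(
        i != j and b.startswith(a)
        for i, a in enumerate(codewords)
        for j, b in enumerate(codewords)
    )

lengths = [2, 1, 3, 3]  # illustrative lengths with Kraft sum exactly 1
code = prefix_free_code(lengths)
print(code, is_prefix_free(code))
```

Processing the lengths in increasing order mirrors the choice of $x^*$ with minimal $l(x^*)$ in the proof; the Kraft condition guarantees that `next_node` never runs past the last node of its depth.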

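3 Useful Consequences

3.1 The Lower Tail

Corollary 3.1. Let $X_1, \dots, X_n$ be independent random variables with $X_i \in \{0, 1\}$ and $\Pr[X_i = 1] = p$, for $i = 1, \dots, n$. Set $X := \sum_{i=1}^n X_i$. Then, for any $t \in [0, p]$, we have
\[ \Pr[X \le (p-t)n] \le e^{-D_{KL}(p-t \| p)\, n}. \]

Proof.
\[ \Pr[X \le (p-t)n] = \Pr[n - X \ge n - (p-t)n] = \Pr[X' \ge (1-p+t)n], \]
where $X' = \sum_{i=1}^n X'_i$ with independent random variables $X'_i \in \{0, 1\}$ such that $\Pr[X'_i = 1] = 1 - p$. The result follows from $D_{KL}(1-p+t \| 1-p) = D_{KL}(p-t \| p)$.

3.2 Motwani-Raghavan Version

Corollary 3.2. Let $X_1, \dots, X_n$ be independent random variables with $X_i \in \{0, 1\}$ and $\Pr[X_i = 1] = p$, for $i = 1, \dots, n$. Set $X := \sum_{i=1}^n X_i$ and $\mu = pn$. Then, for any $\delta \ge 0$, we have
\[ \Pr[X \ge (1+\delta)\mu] \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu}, \]
and, for any $\delta \in [0, 1)$,
\[ \Pr[X \le (1-\delta)\mu] \le \left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^{\mu}. \]

Proof. For the upper tail, we may assume that $(1+\delta)p < 1$. Setting $t = \delta\mu/n = \delta p$ in Theorem 1.1 yields
\[ \Pr[X \ge (1+\delta)\mu] \le \exp\left(-\mu\left[(1+\delta)\ln(1+\delta) + \left(\frac{1-p}{p} - \delta\right) \ln\left(1 - \frac{\delta p}{1-p}\right)\right]\right). \]
Since $\ln(1-y) \ge -y/(1-y)$ for $y \in [0, 1)$, applying this with $y = \delta p/(1-p)$ gives
\[ \left(\frac{1-p}{p} - \delta\right) \ln\left(1 - \frac{\delta p}{1-p}\right) \ge -\delta, \]
and hence
\[ \Pr[X \ge (1+\delta)\mu] \le \exp\bigl(-\mu\bigl[(1+\delta)\ln(1+\delta) - \delta\bigr]\bigr) = \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu}. \]
Setting $t = \delta\mu/n = \delta p$ in Corollary 3.1 yields
\[ \Pr[X \le (1-\delta)\mu] \le \exp\left(-\mu\left[(1-\delta)\ln(1-\delta) + \left(\frac{1-p}{p} + \delta\right) \ln\left(1 + \frac{\delta p}{1-p}\right)\right]\right). \]
Since $\ln(1+y) \ge y/(1+y)$ for $y \ge 0$, applying this with $y = \delta p/(1-p)$ gives
\[ \left(\frac{1-p}{p} + \delta\right) \ln\left(1 + \frac{\delta p}{1-p}\right) \ge \delta, \]
and hence
\[ \Pr[X \le (1-\delta)\mu] \le \exp\bigl(-\mu\bigl[(1-\delta)\ln(1-\delta) + \delta\bigr]\bigr) = \left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^{\mu}. \]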
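Corollary 3.2 trades tightness for a formula that depends only on $\mu$ and $\delta$. As a rough illustration, the Python sketch below tabulates the bounds of Theorem 1.1 and Corollary 3.1 next to the Motwani-Raghavan forms; the parameters n = 80, p = 0.25 and all function names are assumptions made for the example.

```python
import math

def kl_bernoulli(q, p):
    """D_KL(q || p) for two-point distributions (natural log), with 0 * ln 0 = 0."""
    def term(a, b):
        return 0.0 if a == 0.0 else a * math.log(a / b)
    return term(q, p) + term(1.0 - q, 1.0 - p)

def mr_upper(mu, delta):
    """(e^delta / (1 + delta)^(1 + delta))^mu, evaluated in log space."""
    return math.exp(mu * (delta - (1 + delta) * math.log1p(delta)))

def mr_lower(mu, delta):
    """(e^-delta / (1 - delta)^(1 - delta))^mu for 0 <= delta < 1."""
    return math.exp(mu * (-delta - (1 - delta) * math.log1p(-delta)))

n, p = 80, 0.25  # illustrative values (assumptions)
mu = n * p

# Each row: (delta, KL-based bound, Motwani-Raghavan bound).
upper_rows = [
    (d, math.exp(-n * kl_bernoulli((1 + d) * p, p)), mr_upper(mu, d))
    for d in (0.25, 0.5, 1.0, 2.0)
]
lower_rows = [
    (d, math.exp(-n * kl_bernoulli((1 - d) * p, p)), mr_lower(mu, d))
    for d in (0.25, 0.5, 0.75)
]
for row in upper_rows + lower_rows:
    print(row)
```

In every row the KL-based bound should be at most the Motwani-Raghavan bound, which matches the direction of the inequalities used in the proof.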

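3.3 Handy Versions

Corollary 3.3. Let $X_1, \dots, X_n$ be independent random variables with $X_i \in \{0, 1\}$ and $\Pr[X_i = 1] = p$, for $i = 1, \dots, n$. Set $X := \sum_{i=1}^n X_i$ and $\mu = pn$. Then, for any $\delta \in (0, 1)$, we have
\[ \Pr[X \le (1-\delta)\mu] \le e^{-\delta^2 \mu/2}. \]

Proof. By Corollary 3.2,
\[ \Pr[X \le (1-\delta)\mu] \le \left(\frac{e^{-\delta}}{(1-\delta)^{1-\delta}}\right)^{\mu}. \]
Using the power series expansion of $\ln(1-\delta)$, we get
\[ (1-\delta)\ln(1-\delta) = -(1-\delta) \sum_{i=1}^{\infty} \frac{\delta^i}{i} = -\delta + \sum_{i=2}^{\infty} \frac{\delta^i}{i(i-1)} \ge -\delta + \delta^2/2. \]
Thus,
\[ \Pr[X \le (1-\delta)\mu] \le e^{[-\delta + \delta - \delta^2/2]\mu} = e^{-\delta^2 \mu/2}, \]
as claimed.

Corollary 3.4. Let $X_1, \dots, X_n$ be independent random variables with $X_i \in \{0, 1\}$ and $\Pr[X_i = 1] = p$, for $i = 1, \dots, n$. Set $X := \sum_{i=1}^n X_i$ and $\mu = pn$. Then, for any $\delta \ge 0$, we have
\[ \Pr[X \ge (1+\delta)\mu] \le e^{-\min\{\delta^2, \delta\}\mu/4}. \]

Proof. We may assume that $(1+\delta)p \le 1$. Then Theorem 1.1 gives
\[ \Pr[X \ge (1+\delta)pn] \le e^{-D_{KL}((1+\delta)p \| p)\, n}. \]
Define $f(\delta) := D_{KL}((1+\delta)p \| p)$. Then
\[ f'(\delta) = p\ln(1+\delta) - p\ln\bigl(1 - \delta p/(1-p)\bigr) \]
and
\[ f''(\delta) = \frac{p}{(1+\delta)(1 - p - \delta p)} \ge \frac{p}{1+\delta}. \]
By Taylor's theorem, we have
\[ f(\delta) = f(0) + \delta f'(0) + \frac{\delta^2}{2} f''(\xi), \]
for some $\xi \in [0, \delta]$. Since $f(0) = f'(0) = 0$, it follows that
\[ f(\delta) = \frac{\delta^2}{2} f''(\xi) \ge \frac{\delta^2 p}{2(1+\xi)} \ge \frac{\delta^2 p}{2(1+\delta)}. \]
For $\delta \ge 1$, we have $\delta/(1+\delta) \ge 1/2$; for $\delta < 1$, we have $1/(\delta+1) \ge 1/2$. This gives
\[ f(\delta) \ge p \min\{\delta^2, \delta\}/4, \]
for all $\delta \ge 0$, and the claim follows.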
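To get a feeling for how much is lost in the handy versions, one can compare them with exact binomial tails. The following Python sketch does this for assumed parameters n = 80, p = 0.25 (so $\mu = 20$); the helper names are hypothetical and the choice of $\delta$ values is arbitrary.

```python
import math

def binom_pmf(n, p, i):
    """Pr[B(n, p) = i]."""
    return math.comb(n, i) * p**i * (1 - p) ** (n - i)

def upper_tail(n, p, k):
    """Exact Pr[B(n, p) >= k]."""
    return sum(binom_pmf(n, p, i) for i in range(k, n + 1))

def lower_tail(n, p, k):
    """Exact Pr[B(n, p) <= k]."""
    return sum(binom_pmf(n, p, i) for i in range(0, k + 1))

n, p = 80, 0.25  # illustrative values (assumptions)
mu = n * p

# Each row: (delta, exact tail probability, handy bound).
upper_rows = []
for delta in (0.25, 0.5, 1.0, 2.0):
    k = math.ceil((1 + delta) * mu)  # X >= (1 + delta) mu  iff  X >= k
    upper_rows.append((delta, upper_tail(n, p, k), math.exp(-min(delta**2, delta) * mu / 4)))

lower_rows = []
for delta in (0.25, 0.5, 0.75):
    k = math.floor((1 - delta) * mu)  # X <= (1 - delta) mu  iff  X <= k
    lower_rows.append((delta, lower_tail(n, p, k), math.exp(-(delta**2) * mu / 2)))

for row in upper_rows + lower_rows:
    print(row)
```

The exact tails should lie below the handy bounds in every row; the gap is the price for exponents that are simple functions of $\delta$ and $\mu$.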

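Corollary 3.5. Let $X_1, \dots, X_n$ be independent random variables with $X_i \in \{0, 1\}$ and $\Pr[X_i = 1] = p$, for $i = 1, \dots, n$. Set $X := \sum_{i=1}^n X_i$ and $\mu = pn$. Then, for any $\delta > 0$, we have
\[ \Pr[|X - \mu| \ge \delta\mu] \le 2e^{-\min\{\delta^2, \delta\}\mu/4}. \]

Proof. Combine Corollaries 3.3 and 3.4.

Corollary 3.6. Let $X_1, \dots, X_n$ be independent random variables with $X_i \in \{0, 1\}$ and $\Pr[X_i = 1] = p$, for $i = 1, \dots, n$. Set $X := \sum_{i=1}^n X_i$ and $\mu = pn$. For $t \ge 2e\mu$, we have
\[ \Pr[X \ge t] \le 2^{-t}. \]

Proof. By Corollary 3.2,
\[ \Pr[X \ge (1+\delta)\mu] \le \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu} \le \left(\frac{e}{1+\delta}\right)^{(1+\delta)\mu}. \]
For $\delta \ge 2e - 1$, the denominator in the right hand side is at least $2e$, and the claim follows.

4 Generalizations

We mention a few generalizations of the proof techniques from Section 2. Since the consequences from Section 3 are based on simple algebraic manipulation of the bounds, the same consequences also hold for the generalized settings.

4.1 Hoeffding-Extension

Theorem 4.1. Let $X_1, \dots, X_n$ be independent random variables with $X_i \in [0, 1]$ and $E[X_i] = p_i$. Set $X := \sum_{i=1}^n X_i$ and $p := (1/n) \sum_{i=1}^n p_i$. Then, for any $t \in [0, 1-p]$, we have
\[ \Pr[X \ge (p+t)n] \le e^{-D_{KL}(p+t \| p)\, n}. \]

Proof. The proof generalizes the moment method. Let $\lambda > 0$ be a parameter to be determined later. As before, Markov's inequality yields
\[ \Pr\bigl[e^{\lambda X} \ge e^{\lambda(p+t)n}\bigr] \le \frac{E[e^{\lambda X}]}{e^{\lambda(p+t)n}}. \]
Using independence, we get
\[ E[e^{\lambda X}] = E\Bigl[e^{\lambda \sum_{i=1}^n X_i}\Bigr] = \prod_{i=1}^n E\bigl[e^{\lambda X_i}\bigr]. \tag{4} \]
Now we need to estimate $E[e^{\lambda X_i}]$. The function $z \mapsto e^{\lambda z}$ is convex, so $e^{\lambda z} \le (1-z)e^{0 \cdot \lambda} + z e^{1 \cdot \lambda}$ for $z \in [0, 1]$. Hence,
\[ E\bigl[e^{\lambda X_i}\bigr] \le E[1 - X_i + X_i e^{\lambda}] = 1 - p_i + p_i e^{\lambda}. \]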

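Going back to (4),
\[ E[e^{\lambda X}] \le \prod_{i=1}^n \bigl(1 - p_i + p_i e^{\lambda}\bigr). \]
Using the arithmetic-geometric mean inequality $\prod_{i=1}^n x_i \le \bigl((1/n) \sum_{i=1}^n x_i\bigr)^n$, for $x_i \ge 0$, this is
\[ E[e^{\lambda X}] \le \bigl(1 - p + p e^{\lambda}\bigr)^n. \]
From here we continue as in Section 2.1.

4.2 Hypergeometric Distribution

Chvátal's proof generalizes to the hypergeometric distribution.

Theorem 4.2. Suppose we have an urn with $N$ balls, $P$ of which are red. We randomly draw $n$ balls from the urn without replacement. Let $H(N, P, n)$ denote the number of red balls in the sample. Set $p := P/N$. Then, for any $t \in [0, 1-p]$, we have
\[ \Pr[H(N, P, n) \ge (p+t)n] \le e^{-D_{KL}(p+t \| p)\, n}. \]

Proof. It is well known that
\[ \Pr[H(N, P, n) = l] = \binom{N}{n}^{-1} \binom{P}{l} \binom{N-P}{n-l}, \]
for $l = 0, \dots, n$.

Claim 4.3. For every $j \in \{0, \dots, n\}$, we have
\[ \binom{N}{n}^{-1} \sum_{i=j}^{n} \binom{P}{i} \binom{N-P}{n-i} \binom{i}{j} \le \binom{n}{j} p^j. \]

Proof. Consider the following random experiment: take a random permutation of the $N$ balls in the urn. Let $S$ be the sequence of the first $n$ elements in the permutation. Let $X$ be the number of $j$-subsets of $S$ that contain only red balls. We compute $E[X]$ in two different ways. On the one hand,
\[ E[X] = \sum_{i=j}^{n} \binom{i}{j} \Pr[S \text{ contains } i \text{ red balls}] = \binom{N}{n}^{-1} \sum_{i=j}^{n} \binom{P}{i} \binom{N-P}{n-i} \binom{i}{j}. \tag{5} \]
On the other hand, let $I \subseteq \{1, \dots, n\}$ with $|I| = j$. Then the probability that all the balls in the positions indexed by $I$ are red is
\[ \frac{P}{N} \cdot \frac{P-1}{N-1} \cdots \frac{P-j+1}{N-j+1} \le \left(\frac{P}{N}\right)^j = p^j. \]
Thus, by linearity of expectation, $E[X] \le \binom{n}{j} p^j$. Together with (5), the claim follows.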
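The double counting in the proof of Claim 4.3 can be verified exactly on a small urn. The sketch below uses exact rational arithmetic; the values N = 30, P = 12, n = 10 and the function names are assumptions chosen for illustration. It checks that the sum in (5) equals $\binom{n}{j}$ times the all-red probability, and that this is at most $\binom{n}{j} p^j$.

```python
from fractions import Fraction
from math import comb

def expectation_via_sum(N, P, n, j):
    """E[X] as in (5): C(N,n)^{-1} * sum_{i=j}^{n} C(P,i) C(N-P,n-i) C(i,j)."""
    total = sum(comb(P, i) * comb(N - P, n - i) * comb(i, j) for i in range(j, n + 1))
    return Fraction(total, comb(N, n))

def all_red_probability(N, P, j):
    """P/N * (P-1)/(N-1) * ... * (P-j+1)/(N-j+1) for j fixed positions."""
    prob = Fraction(1)
    for r in range(j):
        prob *= Fraction(P - r, N - r)
    return prob

N, P, n = 30, 12, 10  # illustrative urn (assumptions)
p = Fraction(P, N)

# Each entry: (j, E[X] via (5), E[X] via linearity, claimed upper bound).
checks = [
    (j, expectation_via_sum(N, P, n, j), comb(n, j) * all_red_probability(N, P, j), comb(n, j) * p**j)
    for j in range(n + 1)
]
for j, via_sum, via_linearity, bound in checks:
    print(j, via_sum == via_linearity, via_sum <= bound)
```

Both comparisons should print `True` for every $j$; the first is the identity behind (5), the second is the statement of Claim 4.3.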

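Claim 4.4. For every $\tau \ge 1$, we have
\[ \binom{N}{n}^{-1} \sum_{i=0}^{n} \binom{P}{i} \binom{N-P}{n-i} \tau^i \le \bigl(1 + (\tau - 1)p\bigr)^n. \]

Proof. Using Claim 4.3 and the Binomial theorem (twice),
\[ \binom{N}{n}^{-1} \sum_{i=0}^{n} \binom{P}{i} \binom{N-P}{n-i} \tau^i = \binom{N}{n}^{-1} \sum_{i=0}^{n} \binom{P}{i} \binom{N-P}{n-i} \bigl(1 + (\tau - 1)\bigr)^i \]
\[ = \binom{N}{n}^{-1} \sum_{i=0}^{n} \binom{P}{i} \binom{N-P}{n-i} \sum_{j=0}^{i} \binom{i}{j} (\tau - 1)^j \]
\[ = \sum_{j=0}^{n} (\tau - 1)^j \binom{N}{n}^{-1} \sum_{i=j}^{n} \binom{P}{i} \binom{N-P}{n-i} \binom{i}{j} \]
\[ \le \sum_{j=0}^{n} \binom{n}{j} \bigl(p(\tau - 1)\bigr)^j = \bigl(1 + (\tau - 1)p\bigr)^n, \]
as claimed.

Thus, for any $\tau \ge 1$ and $k \ge pn$, we get as before
\[ \Pr[H(N, P, n) \ge k] = \binom{N}{n}^{-1} \sum_{i=k}^{n} \binom{P}{i} \binom{N-P}{n-i} \le \binom{N}{n}^{-1} \sum_{i=0}^{n} \binom{P}{i} \binom{N-P}{n-i} \tau^{i-k} \le \frac{(p\tau + 1 - p)^n}{\tau^k}, \]
by Claim 4.4. From here the proof proceeds as in Section 2.2.

4.3 General Impagliazzo-Kabanets

Theorem 4.5. Let $X_1, \dots, X_n$ be random variables with $X_i \in \{0, 1\}$. Suppose there exist $p_i \in [0, 1]$, $i = 1, \dots, n$, such that for every index set $I \subseteq \{1, \dots, n\}$, we have $\Pr\bigl[\prod_{i \in I} X_i = 1\bigr] \le \prod_{i \in I} p_i$. Set $X := \sum_{i=1}^n X_i$ and $p := (1/n) \sum_{i=1}^n p_i$. Then, for any $t \in [0, 1-p]$, we have
\[ \Pr[X \ge (p+t)n] \le e^{-D_{KL}(p+t \| p)\, n}. \]

Proof. Let $\lambda \in [0, 1]$ be a parameter to be chosen later. Let $I \subseteq \{1, \dots, n\}$ be a random index set obtained by including each element $i \in \{1, \dots, n\}$ with probability $\lambda$. As before, we estimate the probability $\Pr\bigl[\prod_{i \in I} X_i = 1\bigr]$ in two different ways, where the probability is over the random choice of $X_1, \dots, X_n$ and $I$. Similarly to before,
\[ \Pr\Bigl[\prod_{i \in I} X_i = 1\Bigr] \le \sum_{S \subseteq \{1,\dots,n\}} \Pr\Bigl[I = S \wedge \prod_{i \in S} X_i = 1\Bigr] = \sum_{S \subseteq \{1,\dots,n\}} \Pr[I = S] \cdot \Pr\Bigl[\prod_{i \in S} X_i = 1\Bigr] \le \sum_{S \subseteq \{1,\dots,n\}} \lambda^{|S|} (1-\lambda)^{n-|S|} \prod_{i \in S} p_i. \tag{6} \]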

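We define independent random variables $Z_1, \dots, Z_n$ as follows: for $i = 1, \dots, n$, with probability $1 - \lambda$, we set $Z_i = 1$, and with probability $\lambda$, we set $Z_i = p_i$. By (6), and using independence and the arithmetic-geometric mean inequality,
\[ \Pr\Bigl[\prod_{i \in I} X_i = 1\Bigr] \le E\Bigl[\prod_{i=1}^n Z_i\Bigr] = \prod_{i=1}^n E[Z_i] = \prod_{i=1}^n (1 - \lambda + p_i \lambda) \le (1 - \lambda + p\lambda)^n. \tag{7} \]
The proof of the lower bound remains unchanged and yields
\[ \Pr\Bigl[\prod_{i \in I} X_i = 1\Bigr] \ge (1-\lambda)^{(1-p-t)n} \cdot \Pr[X \ge (p+t)n], \]
as before. Combining with (7) and optimizing for $\lambda$ finishes the proof, see Section 2.3.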
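One concrete setting where the hypothesis of Theorem 4.5 holds without independence is sampling without replacement: if $X_i$ indicates that the $i$-th drawn ball is red, the computation in the proof of Claim 4.3 shows $\Pr\bigl[\prod_{i \in I} X_i = 1\bigr] \le p^{|I|}$ with $p = P/N$. The Python sketch below, with assumed urn parameters N = 50, P = 15, n = 20 and hypothetical function names, compares the exact hypergeometric tail with the resulting bound, which is also the statement of Theorem 4.2.

```python
import math

def kl_bernoulli(q, p):
    """D_KL(q || p) for two-point distributions (natural log), with 0 * ln 0 = 0."""
    def term(a, b):
        return 0.0 if a == 0.0 else a * math.log(a / b)
    return term(q, p) + term(1.0 - q, 1.0 - p)

def hypergeom_upper_tail(N, P, n, k):
    """Exact Pr[H(N, P, n) >= k]; math.comb returns 0 when i exceeds P."""
    total = sum(math.comb(P, i) * math.comb(N - P, n - i) for i in range(k, n + 1))
    return total / math.comb(N, n)

N, P, n = 50, 15, 20  # illustrative urn (assumptions)
p = P / N
k_min = (P * n + N - 1) // N  # smallest integer k with k >= p n

# Each row: (k, exact tail Pr[H >= k], bound e^{-D_KL(k/n || p) n}).
rows = [
    (k, hypergeom_upper_tail(N, P, n, k), math.exp(-n * kl_bernoulli(k / n, p)))
    for k in range(k_min, min(n, P) + 1)
]
for row in rows:
    print(row)
```

For every threshold $k = (p+t)n$ in the table, the exact tail should not exceed the bound.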