Tail estimates for sums of variables sampled by a random walk

Roy Wagner

arXiv:math/0608740v2 [math.PR] 11 Oct 2006

April 1, 2008

Abstract

We prove tail estimates for variables $\sum_i f(X_i)$, where $(X_i)_i$ is the trajectory of a random walk on an undirected graph (or, equivalently, a reversible Markov chain). The estimates are in terms of the maximum of the function $f$, its variance, and the spectrum of the graph. Our proofs are more elementary than other proofs in the literature, and our results are sharper. We obtain Bernstein and Bennett-type inequalities, as well as an inequality for subgaussian variables.

1 Introduction

One of the basic concerns of sampling theory is economising on the cost and quantity of samples required to estimate the average of random variables. Drawing states by conducting a random walk is often considerably cheaper than the standard Monte-Carlo procedure of drawing independent random states. Despite the loss of independence when sampling by a random walk, the empirical average may converge to the actual average at a rate comparable to that of independent sampling (depending, of course, on the structure of the specific random walk). This approach plays an important role in statistical physics and in computer science (a concise summary of applications is provided in [WX]).

Results concerning the rate of convergence of empirical averages sampled by a random walk have been obtained by several authors in [G], [D], [L], [LP] (for vector and matrix valued functions consult [K] and [WX]). Of these, only [L] and [K] allowed the variance to play a role in their estimates. This paper is a further step in this direction. We improve known Bernstein-type inequalities, and prove a new Bennett-type inequality and a new inequality for subgaussian variables. Our methods are much more elementary than the ones prevailing in the literature, as we do not apply Kato's perturbation theory for eigenvalues. Our results are sharper for the case of graphs with large spectral gap (expanders) and tails which go far beyond the variance (large deviations). This regime often features in applications such as the recent [AM].

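2 The results

Let $G$ be a finite undirected, possibly weighted, connected graph with $N$ vertices (random walks on such graphs can represent any finite irreducible reversible Markov chain). Denote by $\pi$ the stationary distribution of the random walk on the graph. Let $f$ be a function on the vertices of $G$, normalised to have maximum 1 and mean 0 relative to the stationary distribution, namely $\sum_i f(i)\pi(i) = 0$. Let $V = \sum_i f^2(i)\pi(i)$ denote the variance of $f$ with respect to the stationary distribution. We will think of functions on $G$ as vectors in $\mathbb{R}^N$ and vice versa, so where $u$ and $v$ are vectors, expressions such as $e^u$ and $uv$ will stand for coordinatewise operations.

Denote by $P$ the transition matrix of the random walk, such that $P_{ij}$ is the probability of moving from node $j$ to node $i$. By the Perron-Frobenius theorem the eigenvalues of this matrix are all real, the top eigenvalue is 1 (with $\pi$ as the only corresponding eigenvector up to scalar multiplication), and the absolute value of all other eigenvalues is smaller or equal to 1. Let $\alpha < 1$ be the second largest eigenvalue of $P$, and $\beta \le 1$ the second largest absolute value of an eigenvalue of $P$.

Given a starting distribution $q$, the random variables $X_0, X_1, \ldots$ will denote the trajectory of the random walk. $P_q$ and $E_q$ will stand for the probability and expectation of events related to this walk respectively. Let $S_n = \sum_{i=1}^n f(X_i)$. Our concern in this paper is tail estimates for the distribution of $S_n$.

We will prove inequalities in terms of both $\alpha$ and $\beta$. Note that inequalities in terms of $\alpha$ cost an additional multiplicative factor outside the exponent, whereas inequalities in terms of $\beta$ are useless in the case of $\beta = 1$ (i.e. bipartite graphs), and become relatively poor if $\alpha$ is small and $\beta$ is large, which may be the case.

Theorem 1. In the above setting,
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le \min_{1 > \beta^2 e^{2r} + \beta^2 V (e^r - 1)^2} \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 \exp\left(-n\left[\gamma r - \frac{V}{2}\left(e^{2r} - 1 - 2r + \frac{4\beta^2 e^{2r}(e^r - 1)^2}{1 - \beta^2 e^{2r} - \beta^2 V (e^r - 1)^2}\right)\right]\right) \tag{1}
\]
and
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le \min_{1 > \alpha e^{2r} + \alpha V (e^r - 1)^2} \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{2r} \exp\left(-n\left[2\gamma r - V\left(e^{2r} - 1 - 2r + \frac{4\alpha e^{2r}(e^r - 1)^2}{1 - \alpha e^{2r} - \alpha V (e^r - 1)^2}\right)\right]\right). \tag{2}
\]

Remark. Note that the results are the same up to the factor $\frac12$ in the exponent, the multiplicative factor $e^{2r}$ and the replacement of $\beta^2$ by $\alpha$.

The infimum is hard to compute, so we must optimise separately for different parameter regimes. First we use the above result to derive a Bennett-type inequality (cf. [B]).

Corollary 2. For any $C > 1$ we have
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-\frac{n}{2} CV\left[\left(1 + \frac{\gamma}{CV}\right)\log\left(1 + \frac{\gamma}{CV}\right) - \frac{\gamma}{CV}\right]} \le \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-\frac{n}{2}\gamma \log\frac{\gamma}{CeV}},
\]
as long as
\[
\gamma \le \left(\frac{(C-1)(1+\beta^2)}{2(C+1)\beta^2} - 1\right) CV,
\]
and
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le \left(1 + \tfrac{\gamma}{CV}\right)\left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-n CV\left[\left(1 + \frac{\gamma}{CV}\right)\log\left(1 + \frac{\gamma}{CV}\right) - \frac{\gamma}{CV}\right]} \le \left(1 + \tfrac{\gamma}{CV}\right)\left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-n\gamma \log\frac{\gamma}{CeV}},
\]
where the same restriction applies as above with $\beta^2$ replaced by $\alpha$.

Our theorem also allows us to reproduce Lezaud's estimates from [L] with improved constants:

Corollary 3.
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-n\frac{\gamma^2(1-\beta)}{4(V+\gamma)}}
\quad\text{and}\quad
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le e^{\frac{\gamma(1-\sqrt{\alpha})}{V+\gamma}} \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-n\frac{\gamma^2(1-\sqrt{\alpha})}{2(V+\gamma)}}.
\]

Remark. These inequalities imply the bound $\left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-n\frac{\gamma^2(1-\beta)}{8V}}$ for $\gamma \le V$, and the bound $\left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-n\frac{\gamma(1-\beta)}{8}}$ for $\gamma \ge V$. If $\gamma$ is much larger or much smaller than $V$, the constant 8 can be decreased towards 4. Similar results apply to the $\alpha$-inequalities. Note that in the $\alpha$-case we can also prove
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le e^{\frac{\gamma(1-\alpha)}{2(V+\gamma)}} \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-n\frac{\gamma^2(1-\alpha)}{4(V+\gamma)}}. \tag{3}
\]

The Bennett-type bound improves upon this Bernstein-type result for $\gamma \gg V$, provided $\beta$ is small enough. This allows one to see how a smaller $\beta$ reduces the number of required samples.

Finally, our technique can be adapted to situations where we have additional information on the distribution of $f$, such as subgaussian tails. Let $\pi$ denote here, by abuse of notation, the measure on the vertices of the graph which corresponds to the stationary distribution.

Theorem 4. In the above setting assume also that $\pi(f \ge t) \le C e^{-Kt^2}$ for positive $t$. Then
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-\frac{n}{2}\left(\gamma^2 K - \log\left(2C\sqrt{\pi K}\,\gamma + 2\right)\right)},
\]
as long as $\gamma \le \frac{\log\left(\frac{1}{2\beta} + 1\right)}{2K}$ (inside the square root, $\pi$ is the circle constant).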

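We also have
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le e^{2\gamma K} \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-n\left(\gamma^2 K - \log\left(2C\sqrt{\pi K}\,\gamma + 2\right)\right)},
\]
as long as $\gamma \le \frac{\log\left(\frac{1}{2\sqrt{\alpha}} + 1\right)}{2K}$.

Note that if we give up the assumption that $f$ is bounded by 1, a simple renormalisation shows that the estimates remain the same, except that now we must assume $\gamma \le \frac{\log\left(\frac{1}{2\beta} + 1\right)}{2K\|f\|_\infty}$ (respectively with $\sqrt{\alpha}$ instead of $\beta$). For some parameter regimes Theorem 4 asymptotically improves upon Theorem 1 from [AM].

3 Proof of results in terms of β

In this section we will prove the inequalities involving $\beta$. Sketches of proofs for inequalities involving $\alpha$ are deferred to the next section.

Before we begin proving we introduce some notation. We will denote by $\|u\|_{1/\pi} = \sqrt{\sum_i \frac{u(i)^2}{\pi(i)}}$ the $\frac{1}{\pi}$-weighted $\ell_2$ norm on $\mathbb{R}^N$. The inner product associated with this norm is $\langle u, v\rangle = \sum_i \frac{u(i)v(i)}{\pi(i)}$. When we refer to the standard $\ell_2$ norm we will use the notation $\|\cdot\|_2$ for the norm and $(\cdot,\cdot)$ for the inner product. In particular $\|q\|_{1/\pi} = \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2$.

The transition matrix $P$ is not necessarily symmetric, and so its eigenvectors need not be orthogonal (this would be the case only if $G$ were a regular graph). Reversibility, however, promises that $\pi_j P_{ij} = \pi_i P_{ji}$, and so $P$ is self-adjoint and its eigenvectors are mutually orthogonal with respect to the $\frac{1}{\pi}$-weighted Euclidean structure. Therefore the $\|\cdot\|_{1/\pi}$ norm of $P$ restricted to the subspace orthogonal to $\pi$ is $\beta$, the second largest absolute value of the eigenvalues of $P$.

Proof of Theorem 1. The beginning of our proof is identical to that of Gillman's and of those which follow its reasoning. By Markov's inequality
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le e^{-rn\gamma} E_q e^{rS_n},
\]
where the expectation can be directly expressed and estimated as
\[
E_q e^{rS_n} = \sum_{(x_0,\ldots,x_n)\in G^{n+1}} q(x_0) \prod_{i=0}^{n-1} e^{rf(x_{i+1})} (P^T)_{x_i,x_{i+1}} = \left\langle \pi, (e^{rf}P)^n q \right\rangle \le \|q\|_{1/\pi}\, \|Pe^{rf}\|_{1/\pi}^n .
\]
Here $e^{rf}$ stands for the diagonal matrix with $e^{rf(i)}$ as diagonal entries, and the inner product is, we recall, the inner product associated with the $\frac{1}{\pi}$-weighted $\ell_2$ norm. The last step uses that $P$ and $e^{rf}$ are self-adjoint in this structure and that $\|\pi\|_{1/\pi} = 1$.

At this point Gillman's proof and its variations symmetrise the operator so that its norm will equal its top eigenvalue, and use Kato's spectral perturbation theory to estimate this eigenvalue. Our proof, on the other hand, will proceed to simply estimate the norm directly. To do that we will use the equality
\[
\|Pe^{rf}\|_{1/\pi}^2 = \max_{\|u\|_{1/\pi}=1} \left\langle Pe^{rf}u, Pe^{rf}u \right\rangle .
\]
In order to perform the computation we split the vector $u$ into stationary and orthogonal components, $u = a\pi + b\rho$, where $\rho$ is normalised and orthogonal to $\pi$ in the weighted Euclidean structure. Applying similar decompositions $e^{rf}\pi = x\pi + z\sigma$ and $e^{rf}\rho = y\pi + w\tau$ we get
\[
\|Pe^{rf}\|_{1/\pi}^2 = \max_{a^2+b^2=1;\ \rho,\sigma,\tau} \left\langle a(x\pi + zP\sigma) + b(y\pi + wP\tau),\ a(x\pi + zP\sigma) + b(y\pi + wP\tau) \right\rangle .
\]
We open the inner product and obtain
\[
\|Pe^{rf}\|_{1/\pi}^2 = \max_{a^2+b^2=1;\ \rho,\sigma,\tau} a^2\left(x^2 + z^2\|P\sigma\|_{1/\pi}^2\right) + b^2\left(y^2 + w^2\|P\tau\|_{1/\pi}^2\right) + 2ab\left(xy + zw\langle P\sigma, P\tau\rangle\right).
\]
Denote $p_\sigma = \|P\sigma\|_{1/\pi}^2$, $p_\tau = \|P\tau\|_{1/\pi}^2$ and $p_{\sigma,\tau} = \langle P\sigma, P\tau\rangle$. Our task is reduced to computing the $\ell_2$ norm of the following 2 by 2 symmetric bilinear form:
\[
\begin{pmatrix} x^2 + z^2 p_\sigma & xy + zw\,p_{\sigma,\tau} \\ xy + zw\,p_{\sigma,\tau} & y^2 + w^2 p_\tau \end{pmatrix}.
\]
The norm of a 2 by 2 bilinear form equals its largest eigenvalue, which can be derived from its trace (sum of eigenvalues) and determinant (their product). We obtain:
\[
\left\| \begin{pmatrix} A & B \\ B & C \end{pmatrix} \right\| = \frac12\left[(A + C) + \sqrt{(A + C)^2 - 4(AC - B^2)}\right].
\]
Substituting the entries of our own bilinear form and rearranging some terms inside the square root, we get
\[
\begin{aligned}
\|Pe^{rf}\|_{1/\pi}^2 &= \frac12\Big[(x^2 + y^2 + z^2 p_\sigma + w^2 p_\tau) + \Big[(x^2 + y^2 - z^2 p_\sigma - w^2 p_\tau)^2 + 4z^2w^2(p_{\sigma,\tau}^2 - p_\sigma p_\tau) \\
&\qquad\qquad + 4x^2z^2 p_\sigma + 4y^2w^2 p_\tau + 8xyzw\,p_{\sigma,\tau}\Big]^{1/2}\Big] \\
&\le \frac12\Big[(x^2 + y^2 + z^2 p_\sigma + w^2 p_\tau) + \Big[(x^2 + y^2 - z^2 p_\sigma - w^2 p_\tau)^2 + 4\left(|xz|\sqrt{p_\sigma} + |yw|\sqrt{p_\tau}\right)^2\Big]^{1/2}\Big],
\end{aligned}
\]
where we used the Cauchy-Schwarz inequality $p_{\sigma,\tau}^2 \le p_\sigma p_\tau$.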
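As a quick numerical sanity check of the closed form for the largest eigenvalue of a symmetric 2 by 2 form, the following Python sketch compares the trace/determinant formula with a brute-force maximisation of $a^2A + 2abB + b^2C$ over the unit circle. The function names and the sample values of $x, y, z, w$ and the $p$'s are illustrative assumptions, not taken from the paper.

```python
import math

def largest_eigenvalue_2x2(A, B, C):
    """Largest eigenvalue of [[A, B], [B, C]] from its trace and determinant."""
    trace = A + C
    det = A * C - B * B
    # trace^2 - 4 det = (A - C)^2 + 4 B^2, so the square root is always real.
    return 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))

def brute_force_max(A, B, C, steps=200_000):
    """Maximise a^2*A + 2*a*b*B + b^2*C over a^2 + b^2 = 1 on an angular grid."""
    best = -math.inf
    for k in range(steps):
        # The form is even in (a, b), so angles in [0, pi) are enough.
        theta = math.pi * k / steps
        a, b = math.cos(theta), math.sin(theta)
        best = max(best, a * a * A + 2.0 * a * b * B + b * b * C)
    return best

if __name__ == "__main__":
    # Hypothetical values, chosen so that p_sigma_tau^2 <= p_sigma * p_tau.
    x, y, z, w = 1.05, 0.20, 0.25, 1.30
    p_sigma, p_tau, p_sigma_tau = 0.16, 0.09, 0.10
    A = x * x + z * z * p_sigma
    B = x * y + z * w * p_sigma_tau
    C = y * y + w * w * p_tau
    print("closed form :", largest_eigenvalue_2x2(A, B, C))
    print("brute force :", brute_force_max(A, B, C))
```

The two printed numbers should agree up to the resolution of the angular grid.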

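To estimate the square root we use the inequality $\sqrt{1 + X} \le 1 + \frac{X}{2}$. This leads us to
\[
\|Pe^{rf}\|_{1/\pi}^2 \le x^2 + y^2 + \frac{\left(|xz|\sqrt{p_\sigma} + |yw|\sqrt{p_\tau}\right)^2}{x^2 + y^2 - z^2 p_\sigma - w^2 p_\tau}. \tag{4}
\]
Note that this result depends on assuming that $x^2 + y^2 \ge z^2 p_\sigma + w^2 p_\tau$. (For the purpose of the proof of Theorem 4 we require the inequality
\[
\|Pe^{rf}\|_{1/\pi}^2 \le x^2 + y^2 + |xz|\sqrt{p_\sigma} + |yw|\sqrt{p_\tau}, \tag{5}
\]
which is obtained by using $\sqrt{1 + X^2} \le 1 + X$, and depends on the same inequality.)

Let us now estimate the components of our formula. We recall that $f$ has mean 0 with respect to the stationary distribution and maximum 1. We obtain
\[
x = \langle e^{rf}\pi, \pi\rangle = 1 + (f,\pi)\frac{r}{1!} + (f^2,\pi)\frac{r^2}{2!} + (f^3,\pi)\frac{r^3}{3!} + \cdots \le 1 + V\left(\frac{r^2}{2!} + \frac{r^3}{3!} + \frac{r^4}{4!} + \cdots\right) = 1 + V(e^r - 1 - r).
\]
Note also that $f \le 1$ implies that $x \le e^r$, and that by the arithmetic-geometric mean inequality $x = \sum_i \pi(i)e^{rf(i)} \ge e^{r\sum_i \pi(i)f(i)} = 1$.

To estimate $y = \langle e^{rf}\rho, \pi\rangle = \langle e^{rf}\pi, \rho\rangle$ recall that $\rho$ is normalised and orthogonal to $\pi$, and that $\langle f\pi, \rho\rangle \le \|f\pi\|_{1/\pi} = \sqrt{V}$ (the same bound holds for higher powers of $f$, since $|f| \le 1$). We get
\[
y = \langle e^{rf}\pi, \rho\rangle = \langle \pi, \rho\rangle + \langle f\pi, \rho\rangle\frac{r}{1!} + \langle f^2\pi, \rho\rangle\frac{r^2}{2!} + \cdots \le \sqrt{V}\left(r + \frac{r^2}{2!} + \frac{r^3}{3!} + \cdots\right) = \sqrt{V}(e^r - 1).
\]
Note that $x^2 + y^2 \le \|e^{rf}\pi\|_{1/\pi}^2 = \langle e^{2rf}\pi, \pi\rangle$, which, as in the computation of $x$ above, is bounded by $1 + V(e^{2r} - 1 - 2r)$.

Next, using the same estimate as for $y$, we get $z = \langle e^{rf}\pi, \sigma\rangle \le \sqrt{V}(e^r - 1)$. For $w = \langle e^{rf}\rho, \tau\rangle$ we use the simple estimate $w \le e^r$. Finally, since the norm of $P$ restricted to the subspace orthogonal to $\pi$ is $\beta$, we have $p_\sigma, p_\tau, p_{\sigma,\tau} \le \beta^2$.

Now we plug our estimates into inequality (4), and derive
\[
\begin{aligned}
\|Pe^{rf}\|_{1/\pi}^2 &\le 1 + V(e^{2r} - 1 - 2r) + \frac{\left(e^r\sqrt{V}(e^r - 1)\beta + \sqrt{V}(e^r - 1)\beta e^r\right)^2}{1 - \beta^2 e^{2r} - \beta^2 V(e^r - 1)^2} \\
&= 1 + V\left(e^{2r} - 1 - 2r + \frac{4\left(\beta e^r(e^r - 1)\right)^2}{1 - \beta^2 e^{2r} - \beta^2 V(e^r - 1)^2}\right) \\
&\le \exp\left(V\left(e^{2r} - 1 - 2r + \frac{4\beta^2 e^{2r}(e^r - 1)^2}{1 - \beta^2 e^{2r} - \beta^2 V(e^r - 1)^2}\right)\right),
\end{aligned}
\]
as long as $1 > \beta^2 e^{2r} + \beta^2 V(e^r - 1)^2$. To conclude, recall that
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le e^{-n\gamma r}\, \|q\|_{1/\pi}\, \|Pe^{rf}\|_{1/\pi}^n,
\]
so we finally obtain
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le \min_{1 > \beta^2 e^{2r} + \beta^2 V (e^r - 1)^2} \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 \exp\left(-n\left[\gamma r - \frac{1}{2}V\left(e^{2r} - 1 - 2r + \frac{4\beta^2 e^{2r}(e^r - 1)^2}{1 - \beta^2 e^{2r} - \beta^2 V (e^r - 1)^2}\right)\right]\right).
\]

To derive the corollaries and Theorem 4, we only need to assign suitable values to $r$. We will restrict to the case $q = \pi$ in order not to have to carry the $\left\|\tfrac{q}{\sqrt{\pi}}\right\|_2$ term.

Proof of Corollary 2. Using the inequalities $(e^r - 1)^2 \le e^{2r} - 1 - 2r$ and $\beta^2 e^{2r} + \beta^2 V(e^r - 1)^2 \le \beta^2(2e^{2r} - 1)$ we bound the expression inside the exponent in inequality (1) by
\[
-n\left[\gamma r - \frac{V}{2}(e^{2r} - 1 - 2r)\left(1 + \frac{4\beta^2 e^{2r}}{1 - \beta^2(2e^{2r} - 1)}\right)\right] = -n\left[\gamma r - \frac{V}{2}(e^{2r} - 1 - 2r)\,\frac{1 + \beta^2 + 2\beta^2 e^{2r}}{1 + \beta^2 - 2\beta^2 e^{2r}}\right].
\]
Take any $C > 1$. If we assume that $e^{2r} \le \frac{(C-1)(1+\beta^2)}{2(C+1)\beta^2}$, then $\frac{1 + \beta^2 + 2\beta^2 e^{2r}}{1 + \beta^2 - 2\beta^2 e^{2r}} \le C$, so the above expression is bounded by
\[
-n\left[\gamma r - \frac{CV}{2}(e^{2r} - 1 - 2r)\right].
\]
We set $r = \frac12\log\left(1 + \frac{\gamma}{CV}\right)$. Substituting this into the above we get
\[
-n\left[\frac{\gamma}{2}\log\left(1 + \frac{\gamma}{CV}\right) - \frac{CV}{2}\left(\frac{\gamma}{CV} - \log\left(1 + \frac{\gamma}{CV}\right)\right)\right] = -\frac{n}{2}CV\left[\left(1 + \frac{\gamma}{CV}\right)\log\left(1 + \frac{\gamma}{CV}\right) - \frac{\gamma}{CV}\right] \le -\frac{n}{2}\gamma\log\frac{\gamma}{CeV}.
\]
Recall that we have required that $e^{2r} \le \frac{(C-1)(1+\beta^2)}{2(C+1)\beta^2}$, which reduces to assuming $\gamma \le \left(\frac{(C-1)(1+\beta^2)}{2(C+1)\beta^2} - 1\right)CV$.

Proof of Corollary 3. First, we will apply the inequalities $e^{2r} - 1 - 2r \le 2r^2 e^{2r}$, $e^r - 1 \le re^r$ and $e^{2r} + V(e^r - 1)^2 \le 2e^{2r} - 1$ to the exponent in inequality (1). The exponent then turns into
\[
-n\left[\gamma r - \frac{V}{2} r^2 e^{2r}\left(2 + \frac{4\beta^2 e^{2r}}{1 - \beta^2(2e^{2r} - 1)}\right)\right] = -n\left[\gamma r - V r^2\,\frac{1 + \beta^2}{(1 + \beta^2)e^{-2r} - 2\beta^2}\right].
\]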
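The choice of $r$ made in the proof of Corollary 2 can be checked numerically against the exponent of inequality (1). The Python sketch below evaluates the bracket in (1) at $r = \frac12\log(1 + \frac{\gamma}{CV})$ and compares it with the Bennett-type rate of Corollary 2; it also runs a crude grid search over $r$. The function names, the grid parameters and the sample values of $\gamma$, $V$, $\beta$, $C$ are assumptions made for illustration only, and $q = \pi$ is assumed so that the norm factor equals 1.

```python
import math

def theorem1_exponent(r, gamma, V, beta):
    """Bracket of inequality (1):
    gamma*r - (V/2) * (e^{2r} - 1 - 2r + 4 beta^2 e^{2r} (e^r - 1)^2 / D),
    with D = 1 - beta^2 e^{2r} - beta^2 V (e^r - 1)^2.
    Returns None when D <= 0, i.e. when r is not admissible.
    """
    e1, e2 = math.exp(r), math.exp(2.0 * r)
    D = 1.0 - beta**2 * e2 - beta**2 * V * (e1 - 1.0) ** 2
    if D <= 0.0:
        return None
    return gamma * r - 0.5 * V * (e2 - 1.0 - 2.0 * r + 4.0 * beta**2 * e2 * (e1 - 1.0) ** 2 / D)

def bennett_rate(gamma, V, C):
    """Rate (CV/2) * [(1 + t) log(1 + t) - t] with t = gamma / (C V), as in Corollary 2."""
    t = gamma / (C * V)
    return 0.5 * C * V * ((1.0 + t) * math.log(1.0 + t) - t)

def bennett_gamma_limit(V, beta, C):
    """Largest gamma allowed by the restriction stated in Corollary 2."""
    return ((C - 1.0) * (1.0 + beta**2) / (2.0 * (C + 1.0) * beta**2) - 1.0) * C * V

def best_exponent_on_grid(gamma, V, beta, r_max=3.0, steps=3000):
    """Crude grid search for the largest admissible exponent in (1)."""
    values = (theorem1_exponent(r_max * k / steps, gamma, V, beta) for k in range(1, steps + 1))
    return max(v for v in values if v is not None)

if __name__ == "__main__":
    gamma, V, beta, C = 0.2, 0.25, 0.3, 2.0  # hypothetical parameters
    r_star = 0.5 * math.log(1.0 + gamma / (C * V))
    print("gamma limit        :", bennett_gamma_limit(V, beta, C))
    print("exponent at r_star :", theorem1_exponent(r_star, gamma, V, beta))
    print("Bennett-type rate  :", bennett_rate(gamma, V, C))
    print("grid-search optimum:", best_exponent_on_grid(gamma, V, beta))
```

A larger exponent means a smaller tail bound $e^{-n\,\cdot\,\text{exponent}}$, so under the restriction on $\gamma$ the exponent at this $r$ should be at least the Bennett-type rate.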

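Now we set $r = C\frac{\gamma(1-\beta)}{V}$. We wish to find $C$ such that
\[
V r^2\,\frac{1 + \beta^2}{(1 + \beta^2)e^{-2r} - 2\beta^2} \le \frac{\gamma r}{2}.
\]
This inequality reduces to verifying that
\[
2C(1 - \beta)(1 + \beta^2) \le (1 + \beta^2)e^{-2r} - 2\beta^2,
\]
which also guarantees that the denominator is positive, as required by Theorem 1. Substituting for $r$ and using the inequality $e^{-2r} \ge 1 - 2r$, the above inequality reduces to
\[
\left(1 - 2C\left(1 + \tfrac{\gamma}{V}\right)\right) + 2C\left(1 + \tfrac{\gamma}{V}\right)\beta + \left(-1 - 2C\left(1 + \tfrac{\gamma}{V}\right)\right)\beta^2 + 2C\left(1 + \tfrac{\gamma}{V}\right)\beta^3 \ge 0. \tag{6}
\]
Setting $C = \frac{V}{2(V+\gamma)}$ guarantees that the above holds for all $0 \le \beta \le 1$, since the left hand side then equals $\beta(1-\beta)^2$. Having verified that
\[
\gamma r - V r^2\,\frac{1 + \beta^2}{(1 + \beta^2)e^{-2r} - 2\beta^2} \ge \frac{\gamma r}{2}
\]
given our choice of $C$, we can plug $-n\frac{\gamma r}{2}$ into the exponent in inequality (1), and obtain the promised bound
\[
P_\pi\left(\tfrac{1}{n} S_n \ge \gamma\right) \le e^{-n\frac{\gamma^2(1-\beta)}{4(V+\gamma)}}.
\]

Proof of Theorem 4. To prove this theorem we offer a different analysis of the bound $x^2 + y^2 \le \sum_i e^{2rf(i)}\pi(i)$. This is simply the expectation of $e^{2rf}$ according to the measure $\pi$. We can now evaluate this quantity using the subgaussian information. Writing $\sqrt{\pi/K}$ with $\pi$ the circle constant, we get
\[
x^2 + y^2 \le \pi(f < 0) + \int_0^\infty e^{2rt}\, d\left(-\pi(f \ge t)\right) = 1 + \int_0^\infty 2re^{2rt}\,\pi(f \ge t)\,dt \le 1 + \int_{-\infty}^\infty 2re^{2rt}\,Ce^{-Kt^2}\,dt = 1 + 2C\sqrt{\frac{\pi}{K}}\; r e^{r^2/K}.
\]
Plugging this estimate into inequality (5) together with the simple estimates $x, w \le e^r$ and $y, z \le e^r - 1$ we obtain
\[
\|Pe^{rf}\|_{1/\pi}^2 \le 1 + 2C\sqrt{\frac{\pi}{K}}\; r e^{r^2/K} + 2\beta(e^{2r} - 1).
\]
As noted, inequality (5) depends on taking $x^2 + y^2 \ge z^2\beta^2 + w^2\beta^2$, which is guaranteed as long as $\beta^2(2e^{2r} - 1) \le 1$. We will make the stronger assumption $2\beta(e^{2r} - 1) \le 1$, and obtain the bound
\[
\|Pe^{rf}\|_{1/\pi}^2 \le \left(2C\sqrt{\frac{\pi}{K}}\; r + 2\right)e^{r^2/K}.
\]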
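The Gaussian integral behind the subgaussian estimate, $\int_{-\infty}^{\infty} 2r e^{2rt}\, C e^{-Kt^2}\, dt = 2C\sqrt{\pi/K}\; r e^{r^2/K}$ (with $\pi$ the circle constant), can be checked numerically. The Python sketch below compares the closed form with a trapezoid-rule approximation; the function names, the truncation window and the sample parameters are assumptions chosen for illustration.

```python
import math

def gaussian_term_closed_form(r, K, C):
    """Closed form 2 C sqrt(pi / K) r exp(r^2 / K)."""
    return 2.0 * C * math.sqrt(math.pi / K) * r * math.exp(r * r / K)

def gaussian_term_numeric(r, K, C, lower=-12.0, upper=12.0, steps=240_000):
    """Trapezoid rule for the integral of 2 r e^{2 r t} C e^{-K t^2} over [lower, upper].

    The default window is wide enough for moderate r and K, where the
    integrand is negligible at both endpoints.
    """
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        t = lower + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # endpoints get half weight
        total += weight * 2.0 * r * C * math.exp(2.0 * r * t - K * t * t)
    return total * h

if __name__ == "__main__":
    r, K, C = 0.5, 2.0, 1.0  # hypothetical parameters
    print("closed form       :", gaussian_term_closed_form(r, K, C))
    print("whole line, numeric:", gaussian_term_numeric(r, K, C))
    print("half line, numeric :", gaussian_term_numeric(r, K, C, lower=0.0))
```

The half-line value is the quantity that actually arises from the tail assumption; it is dominated by the whole-line integral, which is the one with the convenient closed form.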

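Recalling that
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le e^{-n\gamma r}\, \|q\|_{1/\pi}\, \|Pe^{rf}\|_{1/\pi}^n,
\]
and setting $r = \gamma K$, we conclude the required
\[
P_q\left(\tfrac{1}{n} S_n \ge \gamma\right) \le \|q\|_{1/\pi}\, e^{-\frac{n}{2}\gamma^2 K}\left(2C\sqrt{\pi K}\,\gamma + 2\right)^{n/2} = \left\|\tfrac{q}{\sqrt{\pi}}\right\|_2 e^{-\frac{n}{2}\left(\gamma^2 K - \log\left(2C\sqrt{\pi K}\,\gamma + 2\right)\right)}.
\]
The condition $2\beta(e^{2r} - 1) \le 1$ now reduces to $\gamma \le \frac{\log\left(\frac{1}{2\beta} + 1\right)}{2K}$.

Remark. Note that our method allows us to increase $\gamma$ as far as $\frac{\log\left(\frac{1}{2\beta^2} + \frac12\right)}{2K}$, where the estimate blows up.

4 Proof of results in terms of α

The differences between the proofs of results in terms of $\beta$ and $\alpha$ are mostly computational, so I will only sketch the relevant differences.

Proof of Theorem 1. As above, our task is to estimate $\|(Pe^{rf})^n\|_{1/\pi}$. We will use the simple identity $Pe^{rf} = e^{-\frac12 rf}\left(e^{\frac12 rf}Pe^{\frac12 rf}\right)e^{\frac12 rf}$ to obtain
\[
\|(Pe^{rf})^n\|_{1/\pi} \le e^r\, \|(e^{\frac12 rf}Pe^{\frac12 rf})^n\|_{1/\pi} \le e^r\, \|e^{\frac12 rf}Pe^{\frac12 rf}\|_{1/\pi}^n .
\]
Since the operator $e^{\frac12 rf}Pe^{\frac12 rf}$ is self-adjoint with respect to the weighted Euclidean structure, we have
\[
\|e^{\frac12 rf}Pe^{\frac12 rf}\|_{1/\pi} = \max_{\|u\|_{1/\pi}=1} \left\langle e^{\frac12 rf}Pe^{\frac12 rf}u, u\right\rangle = \max_{\|u\|_{1/\pi}=1} \left\langle Pe^{\frac12 rf}u, e^{\frac12 rf}u\right\rangle .
\]
Decomposing the vectors as in the $\beta$-case (with $\frac12 r$ replacing $r$) we get
\[
\|e^{\frac12 rf}Pe^{\frac12 rf}\|_{1/\pi} = \max_{a^2+b^2=1;\ \rho,\sigma,\tau} \left\langle a(x\pi + zP\sigma) + b(y\pi + wP\tau),\ a(x\pi + z\sigma) + b(y\pi + w\tau)\right\rangle .
\]
We open the inner product and obtain
\[
\|e^{\frac12 rf}Pe^{\frac12 rf}\|_{1/\pi} = \max_{a^2+b^2=1;\ \rho,\sigma,\tau} a^2\left(x^2 + z^2\langle P\sigma, \sigma\rangle\right) + b^2\left(y^2 + w^2\langle P\tau, \tau\rangle\right) + 2ab\left(xy + zw\langle P\sigma, \tau\rangle\right).
\]
Our task is reduced to computing the $\ell_2$ norm of the same 2 by 2 symmetric bilinear form as in the $\beta$-case, except that $r$ is replaced by $\frac12 r$, and the definitions of the $p$'s are now $p_\sigma = \langle P\sigma, \sigma\rangle$, $p_\tau = \langle P\tau, \tau\rangle$ and $p_{\sigma,\tau} = \langle P\sigma, \tau\rangle$.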

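The following identity still holds:
\[
\|e^{\frac12 rf}Pe^{\frac12 rf}\|_{1/\pi} = \frac12\Big[(x^2 + y^2 + z^2 p_\sigma + w^2 p_\tau) + \Big[(x^2 + y^2 - z^2 p_\sigma - w^2 p_\tau)^2 + 4z^2w^2(p_{\sigma,\tau}^2 - p_\sigma p_\tau) + 4x^2z^2 p_\sigma + 4y^2w^2 p_\tau + 8xyzw\,p_{\sigma,\tau}\Big]^{1/2}\Big].
\]
This time, however, the treatment of the terms inside the square root is slightly more delicate. Let $\lambda_i$ be the eigenvalues of $P$ in descending order, and let $(\sigma_i)_i$ and $(\tau_i)_i$ be the coordinates of $\sigma$ and $\tau$ respectively in terms of the associated orthonormal system. Define
\[
p^+_\sigma = \sum_{1 > \lambda_i > 0} \lambda_i(\sigma_i)^2 \quad\text{and}\quad p^-_\sigma = \sum_{\lambda_i < 0} |\lambda_i|(\sigma_i)^2,
\]
and decompose $p_\tau$ and $p_{\sigma,\tau}$ analogously, so that $p_\sigma = p^+_\sigma - p^-_\sigma$. By Cauchy-Schwarz $(p^+_{\sigma,\tau})^2 \le p^+_\sigma p^+_\tau$, and the same goes for the $p^-$'s. All this yields
\[
\begin{aligned}
p_{\sigma,\tau}^2 - p_\sigma p_\tau &= (p^+_{\sigma,\tau} - p^-_{\sigma,\tau})^2 - (p^+_\sigma - p^-_\sigma)(p^+_\tau - p^-_\tau) \\
&= \left((p^+_{\sigma,\tau})^2 - p^+_\sigma p^+_\tau\right) + \left((p^-_{\sigma,\tau})^2 - p^-_\sigma p^-_\tau\right) + \left(p^+_\sigma p^-_\tau + p^-_\sigma p^+_\tau - 2p^-_{\sigma,\tau}p^+_{\sigma,\tau}\right) \\
&\le \left(\sqrt{p^+_\sigma p^-_\tau} + \sqrt{p^-_\sigma p^+_\tau}\right)^2
\end{aligned}
\]
and
\[
\begin{aligned}
x^2z^2 p_\sigma + y^2w^2 p_\tau + 2xyzw\,p_{\sigma,\tau} &= \left(x^2z^2 p^+_\sigma + y^2w^2 p^+_\tau + 2xyzw\,p^+_{\sigma,\tau}\right) - \left(x^2z^2 p^-_\sigma + y^2w^2 p^-_\tau + 2xyzw\,p^-_{\sigma,\tau}\right) \\
&\le \left(|xz|\sqrt{p^+_\sigma} + |yw|\sqrt{p^+_\tau}\right)^2 .
\end{aligned}
\]
We now combine the two estimates to get
\[
4z^2w^2(p_{\sigma,\tau}^2 - p_\sigma p_\tau) + 4x^2z^2 p_\sigma + 4y^2w^2 p_\tau + 8xyzw\,p_{\sigma,\tau} \le 4\max(|xz|, |yw|, |zw|)^2\left(\left(\sqrt{p^+_\sigma p^-_\tau} + \sqrt{p^-_\sigma p^+_\tau}\right)^2 + \left(\sqrt{p^+_\sigma} + \sqrt{p^+_\tau}\right)^2\right).
\]
Since $\lambda_2 = \alpha$, all $p^+$ are absolutely bounded by $\alpha$. Note also that $p^+_\sigma + \alpha p^-_\sigma \le \alpha\|\sigma\|_{1/\pi}^2 = \alpha$, and the same goes for $\tau$. So, in fact, the above is bounded by the expression
\[
4\max(|xz|, |yw|, |zw|)^2\left(\left(\sqrt{p^+_\sigma(1 - p^+_\tau/\alpha)} + \sqrt{p^+_\tau(1 - p^+_\sigma/\alpha)}\right)^2 + \left(\sqrt{p^+_\sigma} + \sqrt{p^+_\tau}\right)^2\right).
\]
Rearranging terms and using Cauchy-Schwarz we get
\[
\begin{aligned}
&\left(\sqrt{p^+_\sigma(1 - p^+_\tau/\alpha)} + \sqrt{p^+_\tau(1 - p^+_\sigma/\alpha)}\right)^2 + \left(\sqrt{p^+_\sigma} + \sqrt{p^+_\tau}\right)^2 \\
&\qquad\le \left(p^+_\sigma(1 - p^+_\tau/\alpha) + p^+_\tau\right) + \left(p^+_\tau(1 - p^+_\sigma/\alpha) + p^+_\sigma\right) + 2\sqrt{\left(p^+_\sigma + p^+_\tau(1 - p^+_\sigma/\alpha)\right)\left(p^+_\tau + p^+_\sigma(1 - p^+_\tau/\alpha)\right)} \le 4\alpha .
\end{aligned}
\]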
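The final inequality of this step, with right hand side $4\alpha$, can be probed numerically. The Python sketch below draws random values $0 \le p^+_\sigma, p^+_\tau \le \alpha$ and records the largest observed ratio between the left hand side and $4\alpha$. The function names, the sampling ranges and the seed are illustrative assumptions; a random search of this kind is a sanity check, not a proof.

```python
import math
import random

def lhs(p_sigma, p_tau, alpha):
    """(sqrt(ps (1 - pt/alpha)) + sqrt(pt (1 - ps/alpha)))^2 + (sqrt(ps) + sqrt(pt))^2."""
    # max(0.0, ...) guards against tiny negative values caused by rounding.
    first = (math.sqrt(max(0.0, p_sigma * (1.0 - p_tau / alpha)))
             + math.sqrt(max(0.0, p_tau * (1.0 - p_sigma / alpha))))
    second = math.sqrt(p_sigma) + math.sqrt(p_tau)
    return first * first + second * second

def worst_ratio(trials=100_000, seed=0):
    """Largest observed value of lhs / (4 alpha) over random admissible inputs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        alpha = rng.uniform(0.01, 1.0)
        p_sigma = rng.uniform(0.0, alpha)
        p_tau = rng.uniform(0.0, alpha)
        worst = max(worst, lhs(p_sigma, p_tau, alpha) / (4.0 * alpha))
    return worst

if __name__ == "__main__":
    print("largest observed lhs / (4 alpha):", worst_ratio())
    # Equality case: both p's equal to alpha.
    print("lhs at p_sigma = p_tau = alpha = 0.3:", lhs(0.3, 0.3, 0.3))
```

The ratio should never exceed 1 (up to rounding), with equality attained when both $p^+$ values equal $\alpha$.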

So we finally obtain
\[
\|e^{\frac12 rf}Pe^{\frac12 rf}\|_{1/\pi} \le \frac12\Big[(x^2 + y^2 + z^2 p_\sigma + w^2 p_\tau) + \Big[(x^2 + y^2 - z^2 p_\sigma - w^2 p_\tau)^2 + 16\alpha\max(|xz|, |yw|, |zw|)^2\Big]^{1/2}\Big].
\]
Using the same estimates as in the $\beta$-case, replacing $r$ by $\frac12 r$ in the estimates of $x$, $y$, $z$ and $w$, recalling that $p_\sigma, p_\tau \le \alpha$, and finally changing the bound variable $r$ into $2r$ we obtain the desired result.

The other proofs derive from the remark following Theorem 1, which applies also to the proof of Theorem 4. Note that if in the proof of Corollary 3 we set $r = C\frac{\gamma(1-\alpha)}{V}$ instead of $r = C\frac{\gamma(1-\sqrt{\alpha})}{V}$, we need to satisfy the inequality
\[
\left(1 - 2C\left(1 + \tfrac{\gamma}{V}\right)\right) - \alpha + 2C\left(1 + \tfrac{\gamma}{V}\right)\alpha^2 \ge 0
\]
instead of inequality (6). This holds for $C = \frac{V}{4(V+\gamma)}$, leading to inequality (3) in the remark following Corollary 3.

References

[AM] Shiri Artstein-Avidan & Vitali Milman, Logarithmic reduction of the level of randomness in some probabilistic geometric constructions, Journal of Functional Analysis 235 (2006), 297-329.

[B] G. Bennett, Probability inequalities for the sum of independent random variables, J. Amer. Statist. Assoc. 57 (1962), 33-45.

[D] I.H. Dinwoodie, A probability inequality for the occupation measure of a reversible Markov chain, Ann. of Applied Probability 5 (1995), 37-43.

[G] D. Gillman, A Chernoff bound for random walks on expander graphs, SIAM J. Comput. 27(4) (1998), 1203-1220.

[K] Vladislav Kargin, A Bernstein type inequality for vector functions on finite Markov chains, preprint (2005), arXiv:0508538.

[L] Pascal Lezaud, Chernoff type bound for finite Markov chains, Ann. of Applied Probability 8(3) (1998), 849-867.

[LP] Carlos A. León & François Perron, Optimal Hoeffding bounds for discrete reversible Markov chains, Ann. of Applied Probability 14(2) (2004), 958-970.

[WX] Avi Wigderson & David Xiao, A randomness-efficient sampler for matrix-valued functions and applications, FOCS 2005, 397-406.

Roy Wagner
Computer Science Department
Academic College of Tel Aviv Yaffa
4 Antokolsky St., Tel Aviv 64044, Israel
rwagner@mta.ac.il