A tail inequality for suprema of unbounded empirical processes with applications to Markov chains


Electronic Journal of Probability, Paper no. 34

Radosław Adamczak
Institute of Mathematics, Polish Academy of Sciences
Śniadeckich 8, P.O. Box, Warszawa 10, Poland
R.Adamczak@impan.gov.pl

Abstract

We present a tail inequality for suprema of empirical processes generated by variables with finite $\psi_\alpha$ norms and apply it to some geometrically ergodic Markov chains to derive similar estimates for empirical processes of such chains, generated by bounded functions. We also obtain a bounded difference inequality for symmetric statistics of such Markov chains.

Key words: concentration inequalities, empirical processes, Markov chains.
AMS 2000 Subject Classification: Primary 60E15; Secondary 60J05.
Submitted to EJP on September 19, 2007; final version accepted May 27. Research partially supported by MEiN Grant 1 PO3A.

1 Introduction

Let us consider a sequence $X_1, X_2, \ldots, X_n$ of random variables with values in a measurable space $(S, \mathcal{B})$ and a countable class $\mathcal{F}$ of measurable functions $f \colon S \to \mathbb{R}$. Define moreover the random variable
$$Z = \sup_{f \in \mathcal{F}} \sum_{i=1}^n f(X_i).$$
In recent years a lot of effort has been devoted to describing the behaviour, in particular the concentration properties, of the variable $Z$ under various assumptions on the sequence $X_1, \ldots, X_n$ and the class $\mathcal{F}$. Classically, one considers the case of i.i.d. or independent random variables $X_i$ and uniformly bounded classes of functions, although there are also results for unbounded functions or for sequences of variables possessing some mixing properties.

The aim of this paper is to present tail inequalities for the variable $Z$ under two different types of assumptions, relaxing the classical conditions. In the first part of the article we consider the case of independent variables and unbounded functions satisfying, however, some integrability assumptions. The main result of this part is Theorem 4, presented in Section 2. In the second part we keep the assumption of uniform boundedness of the class $\mathcal{F}$ but relax the condition on the underlying sequence of variables, by considering a class of Markov chains satisfying classical small set conditions with exponentially integrable regeneration times. If the small set assumption is satisfied for the one-step transition kernel, the regeneration technique for Markov chains, together with the results for independent variables and unbounded functions, allows us to derive tail inequalities for the variable $Z$ (Theorem 7, presented in Section 3.2). In a more general situation, when the small set assumption is satisfied only by the $m$-skeleton chain, our results are restricted to sums of real variables, i.e. to the case of $\mathcal{F}$ being a singleton (Theorem 6). Finally, in Section 3.4, using similar arguments, we derive a bounded difference type inequality for Markov chains satisfying the same small set assumptions.
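As a toy numerical illustration of the central object $Z$ (ours, not from the paper): for a finite class $\mathcal{F}$ and simulated i.i.d. data, $Z$ can be computed by brute force. The functions and all parameters below are our own illustrative choices (a small class of polynomials centered under the uniform distribution on $[-1,1]$).

```python
import random

def empirical_sup(sample, function_class):
    """Z = sup_{f in F} sum_i f(X_i) for a finite class F (brute force)."""
    return max(sum(f(x) for x in sample) for f in function_class)

random.seed(0)
sample = [random.uniform(-1.0, 1.0) for _ in range(1000)]
# A small class of bounded functions, centered under U(-1, 1):
# E x^k = 0 for odd k, 1/(k+1) for even k (our choice, purely illustrative).
F = [lambda x, k=k: x ** k - (1.0 / (k + 1) if k % 2 == 0 else 0.0)
     for k in (1, 2, 3)]
Z = empirical_sup(sample, F)
print(Z)
```

For a countable infinite class one would of course need a monotone approximation by finite subclasses; the sketch only shows the definition in action.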
We will start by describing known results for bounded classes of functions and independent random variables, beginning with the celebrated Talagrand inequality. They will serve us both as tools and as a point of reference for presenting our results.

1.1 Talagrand's concentration inequalities

In the paper [24], Talagrand proved the following inequality for empirical processes.

Theorem 1 (Talagrand, [24]). Let $X_1, \ldots, X_n$ be independent random variables with values in a measurable space $(S, \mathcal{B})$ and let $\mathcal{F}$ be a countable class of measurable functions $f \colon S \to \mathbb{R}$, such that $\|f\|_\infty \le a < \infty$ for every $f \in \mathcal{F}$. Consider the random variable $Z = \sup_{f \in \mathcal{F}} \sum_{i=1}^n f(X_i)$. Then for all $t \ge 0$,
$$\mathbf{P}(Z \ge \mathbf{E}Z + t) \le K \exp\Big(-\frac{1}{K}\,\frac{t}{a}\log\Big(1 + \frac{ta}{V}\Big)\Big), \qquad (1)$$
where $V = \mathbf{E}\sup_{f \in \mathcal{F}} \sum_{i=1}^n f(X_i)^2$ and $K$ is an absolute constant. In consequence, for all $t \ge 0$,
$$\mathbf{P}(Z \ge \mathbf{E}Z + t) \le K_1 \exp\Big(-\frac{1}{K_1}\,\frac{t^2}{V + at}\Big) \qquad (2)$$
for some universal constant $K_1$. Moreover, the above inequalities hold when replacing $Z$ by $-Z$.

Inequalities (1) and (2) may be considered functional versions of, respectively, Bennett's and Bernstein's inequalities for sums of independent random variables, and similarly as in the classical case, one of them implies the other. Let us note that Bennett's inequality recovers both the subgaussian and the Poisson behaviour of sums of independent random variables, corresponding to classical limit theorems, whereas Bernstein's inequality recovers the subgaussian behaviour for small values of $t$ and exhibits exponential behaviour for larger values of $t$.

The above inequalities proved to be a very important tool in infinite dimensional probability, machine learning and M-estimation. They drew considerable attention, resulting in several simplified proofs and different versions. In particular, there has been a series of papers, starting from the work by Ledoux [10], exploring concentration of measure for empirical processes with the use of logarithmic Sobolev inequalities with discrete gradients. The first explicit constants were obtained by Massart [16], who proved in particular the following

Theorem 2 (Massart, [16]). Let $X_1, \ldots, X_n$ be independent random variables with values in a measurable space $(S, \mathcal{B})$ and let $\mathcal{F}$ be a countable class of measurable functions $f \colon S \to \mathbb{R}$, such that $\|f\|_\infty \le a < \infty$ for every $f \in \mathcal{F}$. Consider the random variable $Z = \sup_{f \in \mathcal{F}} \sum_{i=1}^n f(X_i)$. Assume moreover that for all $f \in \mathcal{F}$ and all $i$, $\mathbf{E}f(X_i) = 0$, and let $\sigma^2 = \sup_{f \in \mathcal{F}} \sum_{i=1}^n \mathbf{E}f(X_i)^2$. Then for all $\eta > 0$ and $t \ge 0$,
$$\mathbf{P}\big(Z \ge (1+\eta)\mathbf{E}Z + \sigma\sqrt{2K_1 t} + K_2(\eta)at\big) \le e^{-t}$$
and
$$\mathbf{P}\big(Z \le (1-\eta)\mathbf{E}Z - \sigma\sqrt{2K_3 t} - K_4(\eta)at\big) \le e^{-t},$$
where $K_1 = 4$, $K_3 = 5.4$, and $K_2(\eta)$, $K_4(\eta)$ are explicit constants depending only on $\eta$.

Similar, more refined results were obtained subsequently by Bousquet [3] and Klein and Rio [7]. The latter article contains an inequality for suprema of empirical processes with the best known constants.

Theorem 3 (Klein, Rio, [7], Theorems 1.1, 1.2). Let $X_1, X_2, \ldots, X_n$ be independent random variables with values in a measurable space $(S, \mathcal{B})$ and let $\mathcal{F}$ be a countable class of measurable functions $f \colon S \to [-a, a]$, such that for all $i$, $\mathbf{E}f(X_i) = 0$. Consider the random variable
$$Z = \sup_{f \in \mathcal{F}} \sum_{i=1}^n f(X_i).$$
Then, for all $t \ge 0$,
$$\mathbf{P}(Z \ge \mathbf{E}Z + t) \le \exp\Big(-\frac{t^2}{2(\sigma^2 + 2a\mathbf{E}Z) + 3at}\Big)$$
and
$$\mathbf{P}(Z \le \mathbf{E}Z - t) \le \exp\Big(-\frac{t^2}{2(\sigma^2 + 2a\mathbf{E}Z) + 3at}\Big),$$
where $\sigma^2 = \sup_{f \in \mathcal{F}} \sum_{i=1}^n \mathbf{E}f(X_i)^2$.

The reader may notice that, contrary to the original Talagrand result, the estimates of Theorems 2 and 3 use the weak variance $\sigma^2$ rather than the strong parameter $V$ of Theorem 1. This stems from several reasons, e.g. the statistical relevance of the parameter $\sigma$ and the analogy with the concentration of Gaussian processes, which by the CLT (in the case of Donsker classes of functions) correspond to the limiting behaviour of empirical processes. One should also note that by the contraction principle we have $\sigma^2 \le V \le \sigma^2 + Ka\mathbf{E}Z$ for a universal constant $K$ (see [12], Lemma 6.6). Thus, usually one would like to describe the subgaussian behaviour of the variable $Z$ rather in terms of $\sigma$; however, the price to be paid is the additional summand of the form $\eta\mathbf{E}Z$. Let us also remark that if one does not pay attention to constants, the inequalities presented in Theorems 2 and 3 follow from Talagrand's inequality just by the aforementioned estimate on $V$ and, in the case of Theorem 2, the inequality between the geometric and the arithmetic mean.

1.2 Notation, basic definitions

In the article, by $K$ we will denote universal constants and by $C(\alpha, \beta)$, $K_\alpha$ constants depending only on $\alpha, \beta$ (or only on $\alpha$, respectively), where $\alpha, \beta$ are some parameters. In both cases the values of the constants may change from line to line. We will also use the classical definition of exponential Orlicz norms.

Definition 1. For $\alpha > 0$, define the function $\psi_\alpha \colon \mathbb{R}_+ \to \mathbb{R}_+$ by the formula $\psi_\alpha(x) = \exp(x^\alpha) - 1$. For a random variable $X$, define also the Orlicz norm
$$\|X\|_{\psi_\alpha} = \inf\{\lambda > 0 \colon \mathbf{E}\psi_\alpha(|X|/\lambda) \le 1\}.$$
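Definition 1 can be made concrete numerically. The following sketch (ours, not from the paper) computes $\|X\|_{\psi_1}$ by bisection for $X \sim \mathrm{Exp}(1)$, using the closed-form moment generating function $\mathbf{E}e^{sX} = 1/(1-s)$ for $s < 1$; in this case the norm is exactly $2$, and the $\psi_1$ tail bound $\mathbf{P}(X \ge t) \le 2\exp(-t/\|X\|_{\psi_1})$ stated below can be checked directly.

```python
import math

def psi1_norm(mgf, lo=1e-6, hi=1e6):
    """||X||_{psi_1} = inf{lam > 0 : E exp(|X|/lam) - 1 <= 1}, by bisection.
    `mgf(s)` returns E exp(s|X|) (math.inf is fine where it diverges)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mgf(1.0 / mid) - 1.0 <= 1.0:
            hi = mid      # feasible lambda: shrink from above
        else:
            lo = mid
    return hi

# X ~ Exp(1): E exp(sX) = 1/(1-s) for s < 1, so ||X||_{psi_1} = 2 exactly.
exp_mgf = lambda s: 1.0 / (1.0 - s) if s < 1.0 else math.inf
lam = psi1_norm(exp_mgf)
print(lam)

# Chebyshev-type tail bound: P(X >= t) = e^{-t} <= 2 exp(-t / ||X||_{psi_1}).
for t in [1.0, 5.0, 10.0]:
    assert math.exp(-t) <= 2.0 * math.exp(-t / lam)
```

For a variable without a closed-form m.g.f. one would replace `exp_mgf` by a Monte Carlo average; the bisection itself is unchanged.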

Let us also note a basic fact that we will use in the sequel, namely that by Chebyshev's inequality, for $t \ge 0$,
$$\mathbf{P}(|X| \ge t) \le 2\exp\Big(-\frac{t^\alpha}{\|X\|_{\psi_\alpha}^\alpha}\Big).$$

Remark. For $\alpha < 1$ the above definition does not give a norm but only a quasi-norm. This can be fixed by changing the function $\psi_\alpha$ near zero, to make it convex, which would give an equivalent norm. It is however widely accepted in the literature to use the word "norm" also for the quasi-norm given by our definition.

2 Tail inequality for suprema of empirical processes corresponding to classes of unbounded functions

2.1 The main result for the independent case

We will now formulate our main result in the setting of independent variables, namely tail estimates for suprema of empirical processes under the assumption that the summands have finite $\psi_\alpha$ Orlicz norm.

Theorem 4. Let $X_1, \ldots, X_n$ be independent random variables with values in a measurable space $(S, \mathcal{B})$ and let $\mathcal{F}$ be a countable class of measurable functions $f \colon S \to \mathbb{R}$. Assume that for every $f \in \mathcal{F}$ and every $i$, $\mathbf{E}f(X_i) = 0$, and for some $\alpha \in (0, 1]$ and all $i$, $\|\sup_{f \in \mathcal{F}}|f(X_i)|\|_{\psi_\alpha} < \infty$. Let
$$Z = \sup_{f \in \mathcal{F}} \Big|\sum_{i=1}^n f(X_i)\Big|.$$
Define moreover
$$\sigma^2 = \sup_{f \in \mathcal{F}} \sum_{i=1}^n \mathbf{E}f(X_i)^2.$$
Then, for all $0 < \eta < 1$ and $\delta > 0$, there exists a constant $C = C(\alpha, \eta, \delta)$, such that for all $t \ge 0$,
$$\mathbf{P}(Z \ge (1+\eta)\mathbf{E}Z + t) \le \exp\Big(-\frac{t^2}{2(1+\delta)\sigma^2}\Big) + 3\exp\Big(-\Big(\frac{t}{C\,\|\max_i \sup_{f \in \mathcal{F}}|f(X_i)|\|_{\psi_\alpha}}\Big)^\alpha\Big)$$
and
$$\mathbf{P}(Z \le (1-\eta)\mathbf{E}Z - t) \le \exp\Big(-\frac{t^2}{2(1+\delta)\sigma^2}\Big) + 3\exp\Big(-\Big(\frac{t}{C\,\|\max_i \sup_{f \in \mathcal{F}}|f(X_i)|\|_{\psi_\alpha}}\Big)^\alpha\Big).$$

Remark. The above theorem may thus be considered a counterpart of Massart's result (Theorem 2). It is written in a slightly different manner, reflecting the use of Theorem 3 in the proof, but it is easy to see that if one disregards the constants, it yields another version in the flavour of the inequalities presented in Theorem 2. Let us note that some weaker inequalities (e.g. not recovering the proper power $\alpha$ in the subexponential decay of the tail) may be obtained by combining the Pisier inequality (see (13) below) with moment estimates for empirical processes proven by Giné, Latała and Zinn [5] and later obtained by a different method also by Boucheron, Bousquet, Lugosi and Massart [2]. These moment estimates first appeared in the context of tail inequalities for U-statistics and were later used in statistics, in model selection. They are however also of independent interest, as extensions of the classical Rosenthal inequalities for $p$-th moments of sums of independent random variables, with the dependence on $p$ stated explicitly.

The proof of Theorem 4 is a compilation of the classical Hoffmann-Jørgensen inequality with Theorem 3 and another deep result due to Talagrand.

Theorem 5 (Ledoux, Talagrand, [12]). In the setting of Theorem 4, we have
$$\|Z\|_{\psi_\alpha} \le K_\alpha\Big(\|Z\|_1 + \Big\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\Big\|_{\psi_\alpha}\Big).$$

We will also need the following corollary to Theorem 3, which was derived in [4]. Since the proof is very short, we will present it here for the sake of completeness.

Lemma 1. In the setting of Theorem 3, for all $0 < \eta \le 1$, $\delta > 0$ there exists a constant $C = C(\eta, \delta)$, such that for all $t \ge 0$,
$$\mathbf{P}(Z \ge (1+\eta)\mathbf{E}Z + t) \le \exp\Big(-\frac{t^2}{2(1+\delta)\sigma^2}\Big) + \exp\Big(-\frac{t}{Ca}\Big)$$
and
$$\mathbf{P}(Z \le (1-\eta)\mathbf{E}Z - t) \le \exp\Big(-\frac{t^2}{2(1+\delta)\sigma^2}\Big) + \exp\Big(-\frac{t}{Ca}\Big).$$

Proof. It is enough to notice that for all $\delta > 0$,
$$\exp\Big(-\frac{t^2}{2(\sigma^2 + 2a\mathbf{E}Z) + 3at}\Big) \le \exp\Big(-\frac{t^2}{2(1+\delta)\sigma^2}\Big) + \exp\Big(-\frac{t^2}{\delta^{-1}4a\mathbf{E}Z + 3ta}\Big)$$
and use this inequality together with Theorem 3 for $t + \eta\mathbf{E}Z$ instead of $t$, which gives $C = (1 + 1/\delta)(3 + 2/\eta)$.

Proof of Theorem 4. Without loss of generality we may and will assume that
$$t \,\Big/\, \Big\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\Big\|_{\psi_\alpha} > K(\alpha, \eta, \delta), \qquad (3)$$
otherwise we can make the theorem trivial by choosing the constant $C = C(\eta, \delta, \alpha)$ to be large enough. The conditions on the constant $K(\alpha, \eta, \delta)$ will be imposed later on in the proof.

Let $\varepsilon = \varepsilon(\delta) > 0$ (its value will be determined later) and for all $f \in \mathcal{F}$ consider the truncated functions
$$f_1(x) = f(x)\mathbf{1}_{\{\sup_{f \in \mathcal{F}}|f(x)| \le \rho\}}$$
(the truncation level $\rho$ will also be fixed later). Define also the functions
$$f_2(x) = f(x) - f_1(x) = f(x)\mathbf{1}_{\{\sup_{f \in \mathcal{F}}|f(x)| > \rho\}}.$$
Let $\mathcal{F}_i = \{f_i \colon f \in \mathcal{F}\}$, $i = 1, 2$. We have
$$Z = \sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^n f(X_i)\Big| \le \sup_{f_1 \in \mathcal{F}_1}\Big|\sum_{i=1}^n \big(f_1(X_i) - \mathbf{E}f_1(X_i)\big)\Big| + \sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n \big(f_2(X_i) - \mathbf{E}f_2(X_i)\big)\Big| \qquad (4)$$
and
$$Z \ge \sup_{f_1 \in \mathcal{F}_1}\Big|\sum_{i=1}^n \big(f_1(X_i) - \mathbf{E}f_1(X_i)\big)\Big| - \sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n \big(f_2(X_i) - \mathbf{E}f_2(X_i)\big)\Big|, \qquad (5)$$
where we used the fact that $\mathbf{E}f_1(X_i) + \mathbf{E}f_2(X_i) = 0$ for all $f \in \mathcal{F}$. Similarly, by Jensen's inequality, we get
$$\mathbf{E}\sup_{f_1 \in \mathcal{F}_1}\Big|\sum_{i=1}^n \big(f_1(X_i) - \mathbf{E}f_1(X_i)\big)\Big| - 2\mathbf{E}\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n f_2(X_i)\Big| \le \mathbf{E}Z \le \mathbf{E}\sup_{f_1 \in \mathcal{F}_1}\Big|\sum_{i=1}^n \big(f_1(X_i) - \mathbf{E}f_1(X_i)\big)\Big| + 2\mathbf{E}\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n f_2(X_i)\Big|. \qquad (6)$$
Denoting
$$A = \mathbf{E}\sup_{f_1 \in \mathcal{F}_1}\Big|\sum_{i=1}^n \big(f_1(X_i) - \mathbf{E}f_1(X_i)\big)\Big| \quad \text{and} \quad B = \mathbf{E}\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n f_2(X_i)\Big|,$$

we get by (4) and (6),
$$\mathbf{P}(Z \ge (1+\eta)\mathbf{E}Z + t) \le \mathbf{P}\Big(\sup_{f_1 \in \mathcal{F}_1}\Big|\sum_{i=1}^n \big(f_1(X_i) - \mathbf{E}f_1(X_i)\big)\Big| \ge (1+\eta)\mathbf{E}Z + (1-\varepsilon)t\Big) + \mathbf{P}\Big(\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n \big(f_2(X_i) - \mathbf{E}f_2(X_i)\big)\Big| \ge \varepsilon t\Big)$$
$$\le \mathbf{P}\Big(\sup_{f_1 \in \mathcal{F}_1}\Big|\sum_{i=1}^n \big(f_1(X_i) - \mathbf{E}f_1(X_i)\big)\Big| \ge (1+\eta)A - 4B + (1-\varepsilon)t\Big) + \mathbf{P}\Big(\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n \big(f_2(X_i) - \mathbf{E}f_2(X_i)\big)\Big| \ge \varepsilon t\Big) \qquad (7)$$
and similarly, by (5) and (6),
$$\mathbf{P}(Z \le (1-\eta)\mathbf{E}Z - t) \le \mathbf{P}\Big(\sup_{f_1 \in \mathcal{F}_1}\Big|\sum_{i=1}^n \big(f_1(X_i) - \mathbf{E}f_1(X_i)\big)\Big| \le (1-\eta)\mathbf{E}Z - (1-\varepsilon)t\Big) + \mathbf{P}\Big(\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n \big(f_2(X_i) - \mathbf{E}f_2(X_i)\big)\Big| \ge \varepsilon t\Big)$$
$$\le \mathbf{P}\Big(\sup_{f_1 \in \mathcal{F}_1}\Big|\sum_{i=1}^n \big(f_1(X_i) - \mathbf{E}f_1(X_i)\big)\Big| \le (1-\eta)A + 2B - (1-\varepsilon)t\Big) + \mathbf{P}\Big(\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n \big(f_2(X_i) - \mathbf{E}f_2(X_i)\big)\Big| \ge \varepsilon t\Big). \qquad (8)$$
We would like to choose the truncation level $\rho$ in a way which would allow us to bound the first summands on the right-hand sides of (7) and (8) with Lemma 1 and the other summands with Theorem 5. To this end, let us set
$$\rho = 8\,\mathbf{E}\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)| \le K_\alpha\Big\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\Big\|_{\psi_\alpha}. \qquad (9)$$
Let us notice that by the Chebyshev inequality and the definition of the class $\mathcal{F}_2$, we have
$$\mathbf{P}\Big(\max_{k \le n}\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^k f_2(X_i)\Big| > 0\Big) \le \mathbf{P}\Big(\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)| > \rho\Big) \le 1/8$$

and thus by the Hoffmann-Jørgensen inequality (see e.g. [12], Chapter 6, Proposition 6.8, inequality (6.8)), we obtain
$$B = \mathbf{E}\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n f_2(X_i)\Big| \le 8\,\mathbf{E}\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|. \qquad (10)$$
In consequence
$$\mathbf{E}\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n \big(f_2(X_i) - \mathbf{E}f_2(X_i)\big)\Big| \le 16\,\mathbf{E}\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)| \le K_\alpha\Big\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\Big\|_{\psi_\alpha}.$$
We also have
$$\Big\|\max_{1 \le i \le n}\sup_{f_2 \in \mathcal{F}_2}\big|f_2(X_i) - \mathbf{E}f_2(X_i)\big|\Big\|_{\psi_\alpha} \le K_\alpha\max_{1 \le i \le n}\sup_{f_2 \in \mathcal{F}_2}|\mathbf{E}f_2(X_i)| + K_\alpha\Big\|\max_{1 \le i \le n}\sup_{f_2 \in \mathcal{F}_2}|f_2(X_i)|\Big\|_{\psi_\alpha} \le K_\alpha\Big\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\Big\|_{\psi_\alpha}$$
(recall that with our definitions, for $\alpha < 1$, $\|\cdot\|_{\psi_\alpha}$ is a quasi-norm, which explains the presence of the constant $K_\alpha$ in the first inequality). Thus, by Theorem 5, we obtain
$$\Big\|\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n \big(f_2(X_i) - \mathbf{E}f_2(X_i)\big)\Big|\Big\|_{\psi_\alpha} \le K_\alpha\Big\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\Big\|_{\psi_\alpha},$$
which implies
$$\mathbf{P}\Big(\sup_{f_2 \in \mathcal{F}_2}\Big|\sum_{i=1}^n \big(f_2(X_i) - \mathbf{E}f_2(X_i)\big)\Big| \ge \varepsilon t\Big) \le 2\exp\Big(-\Big(\frac{\varepsilon t}{K_\alpha\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\|_{\psi_\alpha}}\Big)^\alpha\Big). \qquad (11)$$
Let us now choose $\varepsilon < 1/10$ and such that
$$(1 - 5\varepsilon)^{-2}(1 + \delta/2) \le 1 + \delta. \qquad (12)$$
Since $\varepsilon$ is a function of $\delta$, in view of (9) and (10), we can choose the constant $K(\alpha, \eta, \delta)$ in (3) to be large enough to assure that
$$B \le 8\,\mathbf{E}\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)| \le \varepsilon t.$$

Notice moreover that for every $f \in \mathcal{F}$ we have
$$\sum_{i=1}^n \mathbf{E}\big(f_1(X_i) - \mathbf{E}f_1(X_i)\big)^2 \le \sum_{i=1}^n \mathbf{E}f_1(X_i)^2 \le \sum_{i=1}^n \mathbf{E}f(X_i)^2 \le \sigma^2.$$
Thus, using inequalities (7), (8), (11) and Lemma 1 applied with $\eta$ and $\delta/2$, we obtain
$$\mathbf{P}(Z \ge (1+\eta)\mathbf{E}Z + t),\ \mathbf{P}(Z \le (1-\eta)\mathbf{E}Z - t) \le \exp\Big(-\frac{t^2(1-5\varepsilon)^2}{2(1+\delta/2)\sigma^2}\Big) + \exp\Big(-\frac{(1-5\varepsilon)t}{K(\eta,\delta)\rho}\Big) + 2\exp\Big(-\Big(\frac{\varepsilon t}{K_\alpha\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\|_{\psi_\alpha}}\Big)^\alpha\Big).$$
Since $\varepsilon < 1/10$, using (9) one can see that for $t$ satisfying (3) with $K(\alpha, \eta, \delta)$ large enough, we have
$$\exp\Big(-\frac{(1-5\varepsilon)t}{K(\eta,\delta)\rho}\Big),\ \exp\Big(-\Big(\frac{\varepsilon t}{K_\alpha\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\|_{\psi_\alpha}}\Big)^\alpha\Big) \le \exp\Big(-\Big(\frac{t}{C(\alpha,\eta,\delta)\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\|_{\psi_\alpha}}\Big)^\alpha\Big)$$
(note that the above inequality holds for all $t$ if $\alpha = 1$). Therefore, for such $t$,
$$\mathbf{P}(Z \ge (1+\eta)\mathbf{E}Z + t),\ \mathbf{P}(Z \le (1-\eta)\mathbf{E}Z - t) \le \exp\Big(-\frac{t^2(1-5\varepsilon)^2}{2(1+\delta/2)\sigma^2}\Big) + 3\exp\Big(-\Big(\frac{t}{C(\alpha,\eta,\delta)\|\max_{1 \le i \le n}\sup_{f \in \mathcal{F}}|f(X_i)|\|_{\psi_\alpha}}\Big)^\alpha\Big).$$
To finish the proof it is now enough to use (12).

Remark. We would like to point out that the use of the Hoffmann-Jørgensen inequality in a similar context is well known. Such applications appeared in the proof of the aforementioned moment estimates for empirical processes by Giné, Latała and Zinn [5], in the proof of Theorem 5, and recently in the proof of Fuk-Nagaev type inequalities for empirical processes, used by Einmahl and Li to investigate generalized laws of the iterated logarithm for Banach space valued variables [4]. As for using Theorem 5 to control the remainder after truncating the original random variables, it was recently used in a somewhat similar way by Mendelson and Tomczak-Jaegermann (see [17]).

2.2 A counterexample

We will now present a simple example showing that in Theorem 4 one cannot replace $\|\max_i \sup_{f \in \mathcal{F}}|f(X_i)|\|_{\psi_\alpha}$ with $\max_i \|\sup_{f \in \mathcal{F}}|f(X_i)|\|_{\psi_\alpha}$. With such a modification, the inequality fails to be true even in the real valued case, i.e. when $\mathcal{F}$ is a singleton. For simplicity we will consider only the case $\alpha = 1$.

Consider a sequence $Y_1, Y_2, \ldots$ of i.i.d. real random variables such that $\mathbf{P}(Y_i = r) = e^{-r} = 1 - \mathbf{P}(Y_i = 0)$. Let $\varepsilon_1, \varepsilon_2, \ldots$ be a Rademacher sequence, independent of $(Y_i)_i$. Define finally $X_i = \varepsilon_i Y_i$. We have
$$\mathbf{E}e^{|X_i|} = e^r e^{-r} + (1 - e^{-r}) \le 2,$$
so $\|X_i\|_{\psi_1} \le 1$. Moreover $\mathbf{E}X_i^2 = r^2 e^{-r}$. Assume now that we have, for all $n, r \in \mathbb{N}$ and $t \ge 0$,
$$\mathbf{P}\Big(\Big|\sum_{i=1}^n X_i\Big| \ge K\big(\sqrt{nt}\sqrt{\mathbf{E}X_1^2} + t\|X_1\|_{\psi_1}\big)\Big) \le Ke^{-t},$$
where $K$ is an absolute constant (which would hold if the corresponding version of Theorem 4 were true). For sufficiently large $r$, the above inequality applied with $n \simeq e^r r^{-2}$ and $t \simeq r$ implies that
$$\mathbf{P}\Big(\Big|\sum_{i=1}^n X_i\Big| \ge r\Big) \le Ke^{-r/K}.$$
On the other hand, by Levy's inequality, we have
$$2\mathbf{P}\Big(\Big|\sum_{i=1}^n X_i\Big| \ge r\Big) \ge \mathbf{P}\Big(\max_{1 \le i \le n}|X_i| \ge r\Big) \ge \frac{1}{2}\min\big(n\mathbf{P}(|X_1| \ge r), 1\big) \simeq \frac{1}{2}r^{-2},$$
which gives a contradiction for large $r$.

Remark. A small modification of the above argument shows that one cannot hope for an inequality
$$\mathbf{P}\Big(Z \ge K\mathbf{E}Z + \sqrt{t}\,\sigma + t\log^\beta n \max_i\Big\|\sup_{f \in \mathcal{F}}|f(X_i)|\Big\|_{\psi_1}\Big) \le Ke^{-t}$$
with $\beta < 1$. For $\beta = 1$, this inequality follows from Theorem 4 via Pisier's inequality [21],
$$\Big\|\max_{i \le n}|Y_i|\Big\|_{\psi_\alpha} \le K_\alpha \max_{i \le n}\|Y_i\|_{\psi_\alpha}\log^{1/\alpha} n \qquad (13)$$
for independent real variables $Y_1, \ldots, Y_n$.

3 Applications to Markov chains

We will now turn to the other class of inequalities we are interested in. We are again concerned with random variables of the form
$$Z = \sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^n f(X_i)\Big|,$$
but this time we assume that the class $\mathcal{F}$ is uniformly bounded and we drop the assumption of independence of the underlying sequence $X_1, \ldots, X_n$. To be more precise, we will assume that $X_1, \ldots, X_n$ form a Markov chain satisfying some additional conditions, which are rather classical in the Markov chain or Markov Chain Monte Carlo literature.

The organization of this part is as follows. First, before stating the main results, we will present all the structural assumptions we will impose on the chain. At the same time we will introduce some notation, which will be used in the sequel. Next, we present our results (Theorems 6 and 7), followed by the proofs (which are quite straightforward but technical) and a discussion of optimality (Section 3.3). At the end, in Section 3.4, we will also present a bounded differences type inequality for Markov chains.

3.1 Assumptions on the Markov chain

Let $X_1, X_2, \ldots$ be a homogeneous Markov chain on $S$, with transition kernel $P = P(x, A)$, satisfying the so-called minorization condition, stated below.

Minorization condition. We assume that there exist a positive integer $m$, $\delta > 0$, a set $C \in \mathcal{B}$ (a small set) and a probability measure $\nu$ on $S$ for which
$$\forall_{x \in C}\ \forall_{A \in \mathcal{B}}\quad P^m(x, A) \ge \delta\nu(A) \qquad (14)$$
and
$$\forall_{x \in S}\ \exists_n\quad P^{nm}(x, C) > 0, \qquad (15)$$
where $P^i(\cdot, \cdot)$ is the transition kernel of the chain after $i$ steps.

One can show that in such a situation, if the chain admits an invariant measure $\pi$, then this measure is unique and satisfies $\pi(C) > 0$ (see [18]). Moreover, under some conditions on the initial distribution $\xi$, the chain can be extended to a new, so-called split chain $(\tilde{X}_n, R_n) \in S \times \{0, 1\}$, satisfying the following properties.

Properties of the split chain.

(P1) $(\tilde{X}_n)_n$ is again a Markov chain with transition kernel $P$ and initial distribution $\xi$ (hence, for our purposes of estimating tail probabilities, we may and will identify $\tilde{X}_n$ and $X_n$),

(P2) if we define $T_1 = \inf\{n > 0 \colon R_{nm} = 1\}$, $T_{i+1} = \inf\{n > 0 \colon R_{(T_1 + \ldots + T_i + n)m} = 1\}$, then $T_1, T_2, \ldots$ are well defined and independent; moreover $T_2, T_3, \ldots$ are i.i.d.,

(P3) if we define $S_i = T_1 + \ldots + T_i$, then the blocks
$$Y_0 = (X_1, \ldots, X_{mT_1 + m - 1}), \qquad Y_i = (X_{mS_i + 1}, \ldots, X_{mS_{i+1} + m - 1}),\ i > 0,$$
form a one-dependent sequence (i.e. for all $i$, $\sigma((Y_j)_{j < i})$ and $\sigma((Y_j)_{j > i})$ are independent). Moreover, the sequence $Y_1, Y_2, \ldots$ is stationary. If $m = 1$, then the variables $Y_0, Y_1, \ldots$ are independent. In consequence, for $f \colon S \to \mathbb{R}$, the variables
$$Z_i = Z_i(f) = \sum_{j = mS_i + 1}^{mS_{i+1} + m - 1} f(X_j), \quad i \ge 1,$$
constitute a one-dependent stationary sequence (an i.i.d. sequence if $m = 1$). Additionally, if $f$ is $\pi$-integrable (recall that $\pi$ is the unique stationary measure of the chain), then
$$\mathbf{E}Z_i = \delta^{-1}\pi(C)^{-1}m\int f\,d\pi. \qquad (16)$$

(P4) the distribution of $T_1$ depends only on $\xi, P, C, \delta, \nu$, whereas the law of $T_2$ depends only on $P, C, \delta$ and $\nu$.

We refrain from specifying the construction of this new chain in full generality, as well as the conditions under which (14) and (15) hold, and refer the reader to the classical monograph [18] or the survey article [22] for a complete exposition. Here we will only sketch the construction for $m = 1$, to give its flavour. Informally speaking, at each step $i$, if we have $X_i = x$ and $x \notin C$, we generate the next value of the chain according to the measure $P(x, \cdot)$. If $x \in C$, then we toss a coin with probability of success equal to $\delta$. In the case of success ($R_i = 1$) we draw the next sample according to the measure $\nu$, otherwise ($R_i = 0$) according to
$$\frac{P(x, \cdot) - \delta\nu}{1 - \delta}.$$
When $R_i = 1$, one usually says that the chain regenerates, as the distribution in the next step (for $m = 1$; after $m$ steps in general) is again $\nu$.

Let us remark that for a recurrent chain on a countable state space, admitting a stationary distribution, the Minorization condition is always satisfied with $m = 1$ and $\delta = 1$; for $C$ we can take $\{x\}$, where $x$ is an arbitrary element of the state space. Also the construction of the split chain becomes trivial.
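The countable state space remark can be illustrated by a short simulation (ours, not from the paper; the chain and all numbers are our own choices). For a two-state chain with $C = \{0\}$, $\delta = 1$ and $\nu = P(0, \cdot)$, every visit to state $0$ is a regeneration, and the inter-regeneration times $T_2, T_3, \ldots$ are i.i.d. with mean $1/\pi(0)$ by Kac's formula; for the kernel below, $\pi = (4/7, 3/7)$, so the mean gap is $7/4$.

```python
import random

random.seed(1)
# Two-state chain; C = {0}, delta = 1, nu = P(0, .): every step out of
# state 0 is a regeneration (the countable-state case described above).
P = {0: (0.7, 0.3), 1: (0.4, 0.6)}

def step(x):
    return 0 if random.random() < P[x][0] else 1

x, n = 0, 200_000
regen_times = []          # times i with R_i = 1, i.e. visits to state 0
for i in range(n):
    if x == 0:
        regen_times.append(i)
    x = step(x)

# Gaps between regenerations are i.i.d.; Kac's formula gives mean 1/pi(0) = 7/4.
gaps = [b - a for a, b in zip(regen_times, regen_times[1:])]
mean_gap = sum(gaps) / len(gaps)
print(mean_gap)
```

With this toy kernel the gaps are geometric-like and clearly exponentially integrable, i.e. $\|T_2\|_{\psi_1} < \infty$, which is the regime the results below require.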

Before we proceed, let us present the general idea our approach is based on. To derive our estimates we will need two types of assumptions.

Regeneration assumption. We will work under the assumption that the chain admits a representation as above (Properties (P1) to (P4)). We will not, however, take advantage of the explicit construction. Instead we will use only the properties stated in the points above. A similar approach is quite common in the literature.

Assumption of exponential integrability of the regeneration time. To derive concentration of measure inequalities, we will also assume that $\|T_1\|_{\psi_1} < \infty$ and $\|T_2\|_{\psi_1} < \infty$. At the end of the article we will present examples for which this assumption is satisfied and relate the obtained inequalities to known results.

The regenerative properties of the chain allow us to decompose it into one-dependent (independent if $m = 1$) blocks of random length, making it possible to reduce the analysis of the chain to sums of independent random variables (this approach is by now classical; it has been successfully used in the analysis of limit theorems for Markov chains, see [18]). Since we are interested in non-asymptotic estimates of exponential type, we have to impose some additional conditions on the regeneration time, which give us control over the random length of the one-dependent blocks. This is the reason for introducing the assumption of exponential integrability, which after some technical steps allows us to apply the inequalities for unbounded empirical processes presented in Section 2.

3.2 Main results concerning Markov chains

Having established all the notation, we are ready to state our main results on Markov chains. As announced in the introduction, our results depend on the parameter $m$ in the Minorization condition. If $m = 1$ we are able to obtain tail inequalities for empirical processes (Theorem 7), whereas for $m > 1$ the result is restricted to linear statistics of Markov chains (Theorem 6), which formally corresponds to empirical processes indexed by a singleton. The variables $T_i$ and $Z_1(f)$ appearing in the theorems were defined in the previous section (see the properties (P2) and (P3) of the split chain).

Theorem 6. Let $X_1, X_2, \ldots$ be a Markov chain with values in $S$, satisfying the Minorization condition and admitting a unique stationary distribution $\pi$. Assume also that $\|T_1\|_{\psi_1}, \|T_2\|_{\psi_1} \le \tau$. Consider a function $f \colon S \to \mathbb{R}$, such that $\|f\|_\infty \le a$ and $\mathbf{E}_\pi f = 0$. Define the random variable
$$Z = \sum_{i=1}^n f(X_i).$$

Then for all $t > 0$,
$$\mathbf{P}(|Z| \ge t) \le K\exp\Big(-\frac{1}{K}\min\Big(\frac{t^2}{nm(\mathbf{E}T_2)^{-1}\mathrm{Var}\,Z_1(f)}, \frac{t}{\tau^2 am\log n}\Big)\Big). \qquad (17)$$

Theorem 7. Let $X_1, X_2, \ldots$ be a Markov chain with values in $S$, satisfying the Minorization condition with $m = 1$ and admitting a unique stationary distribution $\pi$. Assume also that $\|T_1\|_{\psi_1}, \|T_2\|_{\psi_1} \le \tau$. Consider moreover a countable class $\mathcal{F}$ of measurable functions $f \colon S \to \mathbb{R}$, such that $\|f\|_\infty \le a$ and $\mathbf{E}_\pi f = 0$. Define the random variable
$$Z = \sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^n f(X_i)\Big|$$
and the asymptotic weak variance
$$\sigma^2 = \sup_{f \in \mathcal{F}}\mathrm{Var}\,Z_1(f)/\mathbf{E}T_2.$$
Then, for all $t \ge 1$,
$$\mathbf{P}(Z \ge K\mathbf{E}Z + t) \le K\exp\Big(-\frac{1}{K}\min\Big(\frac{t^2}{n\sigma^2}, \frac{t}{\tau^3(\mathbf{E}T_2)^{-1}a\log n}\Big)\Big).$$

Remarks.

1. As mentioned in the previous section, chains satisfying the Minorization condition admit at most one stationary measure.

2. In Theorem 7 the dependence on the chain is worse than in Theorem 6, i.e. we have $\tau^3(\mathbf{E}T_2)^{-1}$ instead of $\tau^2$ in the denominator. This is a result of just one step in the argument we present below; at the moment, however, we do not know how to improve this dependence or extend the result to $m > 1$.

3. Another remark we would like to make is related to the limit behaviour of the Markov chain. Let us notice that the asymptotic variance (the variance in the CLT for $n^{-1/2}(f(X_1) + \ldots + f(X_n))$) equals
$$m^{-1}(\mathbf{E}T_2)^{-1}\big(\mathrm{Var}\,Z_1(f) + 2\mathbf{E}Z_1(f)Z_2(f)\big),$$
which for $m = 1$ reduces to
$$(\mathbf{E}T_2)^{-1}\mathrm{Var}\,Z_1(f)$$
(we again refer the reader to [18], Chapter 17 for details). Thus, for $m = 1$ our estimates reflect the asymptotic behaviour of the variable $Z$.

Let us now pass to the proofs of the above theorems. For a function $f \colon S \to \mathbb{R}$, let us define
$$Z_0 = Z_0(f) = \sum_{i=1}^{mT_1 + m - 1} f(X_i)$$
and recall the variables $S_i$ and
$$Z_i = Z_i(f) = \sum_{j = mS_i + 1}^{mS_{i+1} + m - 1} f(X_j), \quad i \ge 1,$$
defined in the previous section (see property (P3) of the split chain). Recall also that the $Z_i$'s form a one-dependent stationary sequence for $m > 1$ and an i.i.d. sequence for $m = 1$. Using this notation, we have
$$f(X_1) + \ldots + f(X_n) = Z_0 + Z_1 + \ldots + Z_N + \sum_{i = m(S_{N+1} + 1)}^{n} f(X_i), \qquad (18)$$
with
$$N = \sup\{i \in \mathbb{N} \colon mS_{i+1} + m - 1 \le n\}, \qquad (19)$$
where $\sup\emptyset = 0$ (note that $N$ is a random variable). Thus $Z_0$ represents the sum up to the first regeneration time; then $Z_1, \ldots, Z_N$ are identically distributed blocks between consecutive regeneration times, included in the interval $[1, n]$; finally, the last term corresponds to the initial segment of the last block. The sum $Z_1 + \ldots + Z_N$ is empty if up to time $n$ there has been no regeneration (i.e. $mT_1 + m - 1 > n$) or there has been only one regeneration ($mT_1 + m - 1 \le n$ and $m(T_1 + T_2) + m - 1 > n$). The last sum on the right-hand side is empty if there has been no regeneration or the last full block ends at $n$.

We will first bound the initial and the final summand in the decomposition (18). To achieve this we will not need the assumption that $f$ is centered with respect to the stationary distribution $\pi$. In consequence, the same bounds may be applied in the proofs of both Theorem 6 and Theorem 7.

Lemma 2. If $\|T_1\|_{\psi_1} \le \tau$ and $\|f\|_\infty \le a$, then for all $t \ge 0$,
$$\mathbf{P}(|Z_0| \ge t) \le 2\exp\Big(-\frac{t}{2am\tau}\Big).$$

Proof. We have $|Z_0| \le 2amT_1$, so by the remark after Definition 1,
$$\mathbf{P}(|Z_0| \ge t) \le \mathbf{P}\big(T_1 \ge t/(2am)\big) \le 2\exp\Big(-\frac{t}{2am\tau}\Big).$$
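The decomposition (18) is an exact identity and can be checked numerically for $m = 1$. The following sketch (ours) uses a two-state chain with $C = \{0\}$, $\delta = 1$, $\nu = P(0, \cdot)$, so that regenerations are exactly the visits to $0$; the kernel and the function $f$ are our own illustrative choices.

```python
import random

random.seed(3)
P = {0: (0.7, 0.3), 1: (0.4, 0.6)}
f = lambda x: x - 3.0 / 7.0        # centered under pi = (4/7, 3/7)

n = 10_000
X = [None, 0]                      # 1-indexed: X[1..n], started at state 0
for _ in range(n - 1):
    X.append(0 if random.random() < P[X[-1]][0] else 1)

# m = 1, C = {0}, delta = 1: regeneration times S_1 < S_2 < ... are the visits to 0.
S = [i for i in range(1, n + 1) if X[i] == 0]

total = sum(f(X[i]) for i in range(1, n + 1))
Z0 = sum(f(X[i]) for i in range(1, S[0] + 1))                  # up to first regeneration
blocks = [sum(f(X[i]) for i in range(S[j] + 1, S[j + 1] + 1))  # Z_1, ..., Z_N
          for j in range(len(S) - 1)]
tail = sum(f(X[i]) for i in range(S[-1] + 1, n + 1))           # last incomplete block

# Identity (18): the full sum equals Z_0 + Z_1 + ... + Z_N + remainder.
assert abs(total - (Z0 + sum(blocks) + tail)) < 1e-6
print(len(blocks))
```

Since $\delta = 1$ here, the blocks `blocks` are genuinely i.i.d., which is the structure the proofs below exploit.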

The next lemma provides a similar bound for the last summand on the right-hand side of (18). It is a little more complicated, since it involves the additional dependence on the random variable $N$.

Lemma 3. If $\|T_1\|_{\psi_1}, \|T_2\|_{\psi_1} \le \tau$, then for all $t \ge 0$,
$$\mathbf{P}(n - mS_{N+1} > t) \le K\exp\Big(-\frac{t}{Km\tau\log\tau}\Big).$$
In consequence, if $\|f\|_\infty \le a$, then
$$\mathbf{P}\Big(\Big|\sum_{i = m(S_{N+1}+1)}^{n} f(X_i)\Big| > t\Big) \le K\exp\Big(-\frac{t}{Kam\tau\log\tau}\Big).$$

Proof. Let us consider the variable $M_n = n - mS_{N+1}$. If $M_n > t$, then
$$S_{N+1} < \frac{n - t + 1}{m} \le \frac{n}{m} < \infty.$$
Therefore
$$\mathbf{P}(M_n > t) = \sum_{k < (n - t + 1)/m}\mathbf{P}(S_{N+1} = k) = \sum_{k < (n - t + 1)/m}\sum_{l \ge 1}\mathbf{P}(S_l = k \ \&\ N + 1 = l)$$
$$= \sum_{k < (n - t + 1)/m}\sum_{l \ge 1}\mathbf{P}\big(S_l = k \ \&\ m(k + T_{l+1}) + m - 1 > n\big) = \sum_{k < (n - t + 1)/m}\sum_{l \ge 1}\mathbf{P}(S_l = k)\mathbf{P}\Big(T_2 > \frac{n + 1}{m} - 1 - k\Big)$$
$$\le \sum_{k < (n - t + 1)/m}\mathbf{P}\Big(T_2 > \frac{n + 1}{m} - 1 - k\Big) \le \sum_{k < (n - t + 1)/m}2\exp\Big(-\frac{1}{\tau}\Big(\frac{n + 1}{m} - 1 - k\Big)\Big) \le K\tau\exp\Big(-\frac{t}{m\tau}\Big),$$
where the first equality follows from the fact that $S_{N+1} \ge N + 1$, the second from the definition of $N$, the third from the fact that $T_1, T_2, \ldots$ are independent and $T_2, T_3, \ldots$

are i.i.d., and finally the second inequality from the fact that $S_i \ne S_j$ for $i \ne j$ (see the properties (P2) and (P3) of the split chain).

Let us notice that if $t > 2m\tau\log\tau$, then
$$\tau\exp\Big(-\frac{t}{m\tau}\Big) \le \exp\Big(-\frac{t}{2m\tau}\Big) \le \exp\Big(-\frac{t}{Km\tau\log\tau}\Big),$$
where in the last inequality we have used the fact that $\tau > c$ for some universal constant $c > 1$. On the other hand, if $t \le 2m\tau\log\tau$, then
$$1 \le e\exp\Big(-\frac{t}{2m\tau\log\tau}\Big).$$
Therefore we obtain, for $t \ge 0$,
$$\mathbf{P}(M_n > t) \le K\exp\Big(-\frac{t}{Km\tau\log\tau}\Big),$$
which proves the first part of the lemma. Now,
$$\mathbf{P}\Big(\Big|\sum_{i = m(S_{N+1}+1)}^{n} f(X_i)\Big| > t\Big) \le \mathbf{P}(M_n > t/a) \le K\exp\Big(-\frac{t}{Kam\tau\log\tau}\Big),$$
which proves the second part.

Before we proceed with the proofs of Theorem 6 and Theorem 7, we would like to make some additional comments regarding our approach. As already mentioned, thanks to property (P3) of the split chain, we may apply to $Z_1 + \ldots + Z_N$ the inequalities for sums of independent random variables obtained in Section 2 (since for $m > 1$ we have only one-dependence, we will split the sum, treating even and odd indices separately). The number of summands is random, but clearly not larger than $n$. Since the variables $Z_i$ are equidistributed, we can reduce this random sum to a deterministic one by applying the following maximal inequality of Montgomery-Smith [19].

Lemma 4. Let $Y_1, \ldots, Y_n$ be i.i.d. Banach space valued random variables. Then for some universal constant $K$ and every $t > 0$,
$$\mathbf{P}\Big(\max_{k \le n}\Big\|\sum_{i=1}^k Y_i\Big\| > t\Big) \le K\mathbf{P}\Big(\Big\|\sum_{i=1}^n Y_i\Big\| > t/K\Big).$$

Remark. The use of regeneration methods makes our proof similar to the proof of the CLT for Markov chains. In this context, the above lemma can be viewed as a counterpart of the Anscombe theorem (they are quite different statements, but both are used to handle the random number of summands).
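Lemma 4 can be illustrated by a Monte Carlo sketch (ours; this demonstrates, but of course does not prove, the inequality). We take a simple Rademacher random walk and the illustrative choice $K = 3$, which is not the optimal constant.

```python
import random

random.seed(4)
n, trials, t, K = 100, 10_000, 30.0, 3.0

max_exceed = end_exceed = 0
for _ in range(trials):
    s, m = 0, 0
    for _ in range(n):
        s += 1 if random.random() < 0.5 else -1   # Rademacher step
        m = max(m, abs(s))
    max_exceed += (m > t)
    end_exceed += (abs(s) > t / K)

p_max = max_exceed / trials          # estimates P(max_k |S_k| > t)
p_end = end_exceed / trials          # estimates P(|S_n| > t/K)
print(p_max, K * p_end)
assert p_max <= K * p_end            # Montgomery-Smith bound, illustrative K = 3
```

The point of the lemma in the proofs below is exactly this one-line reduction: a maximum over a random number of partial sums is controlled by the tail of a single deterministic sum.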

One could now apply Lemma 4 directly, using the fact that $N \le n$. Then, however, one would not get the asymptotic variance in the exponent (see the remark after Theorem 7). The form of this variance is a consequence of the aforementioned Anscombe theorem and the fact that by the LLN we have (denoting $N = N_n$ to stress the dependence on $n$)
$$\lim_{n \to \infty}\frac{N_n}{n} = \frac{1}{m\mathbf{E}T_2}\quad\text{a.s.} \qquad (20)$$
Therefore, to obtain an inequality which (at least up to universal constants and for $m = 1$) reflects the limiting behaviour of the variable $Z$, we will need a quantitative version of (20), given in the following lemma.

Lemma 5. If $\|T_1\|_{\psi_1}, \|T_2\|_{\psi_1} \le \tau$, then
$$\mathbf{P}\big(N > 3n/(m\mathbf{E}T_2)\big) \le K\exp\Big(-\frac{1}{K}\frac{n\mathbf{E}T_2}{m\tau^2}\Big).$$

To prove the above estimate, we will use the classical Bernstein inequality (actually its version for $\psi_1$ variables).

Lemma 6 (Bernstein's $\psi_1$ inequality, see [25]). If $Y_1, \ldots, Y_n$ are independent random variables such that $\mathbf{E}Y_i = 0$ and $\|Y_i\|_{\psi_1} \le \tau$, then for every $t > 0$,
$$\mathbf{P}\Big(\Big|\sum_{i=1}^n Y_i\Big| > t\Big) \le 2\exp\Big(-\frac{1}{K}\min\Big(\frac{t^2}{n\tau^2}, \frac{t}{\tau}\Big)\Big).$$

Proof of Lemma 5. Assume first that $n/(m\mathbf{E}T_2) \ge 1$. We have
$$\mathbf{P}\big(N > 3n/(m\mathbf{E}T_2)\big) \le \mathbf{P}\big(m(T_2 + \ldots + T_{\lfloor 3n/(m\mathbf{E}T_2)\rfloor + 1}) \le n\big)$$
$$\le \mathbf{P}\Big(\sum_{i=2}^{\lfloor 3n/(m\mathbf{E}T_2)\rfloor + 1}(T_i - \mathbf{E}T_2) \le n/m - \lfloor 3n/(m\mathbf{E}T_2)\rfloor\mathbf{E}T_2\Big)$$
$$\le \mathbf{P}\Big(\sum_{i=2}^{\lfloor 3n/(m\mathbf{E}T_2)\rfloor + 1}(T_i - \mathbf{E}T_2) \le n/m - 3n/(2m)\Big) \le \mathbf{P}\Big(\Big|\sum_{i=2}^{\lfloor 3n/(m\mathbf{E}T_2)\rfloor + 1}(T_i - \mathbf{E}T_2)\Big| \ge n/(2m)\Big).$$
We have $\|T_2 - \mathbf{E}T_2\|_{\psi_1} \le 2\|T_2\|_{\psi_1} \le 2\tau$; therefore Bernstein's inequality (Lemma 6)

gives
$$\mathbf{P}\big(N > 3n/(m\mathbf{E}T_2)\big) \le 2\exp\Big(-\frac{1}{K}\min\Big(\frac{(n/2m)^2}{(3n/(m\mathbf{E}T_2))\tau^2}, \frac{n}{2m\tau}\Big)\Big) \le 2\exp\Big(-\frac{1}{K}\min\Big(\frac{n\mathbf{E}T_2}{m\tau^2}, \frac{n}{m\tau}\Big)\Big) = 2\exp\Big(-\frac{1}{K}\frac{n\mathbf{E}T_2}{m\tau^2}\Big),$$
where the last equality follows from the fact that $\mathbf{E}T_2 \le \tau$. If $n/(m\mathbf{E}T_2) < 1$, then also $n\mathbf{E}T_2/(m\tau^2) < 1$, so finally we have
$$\mathbf{P}\big(N > 3n/(m\mathbf{E}T_2)\big) \le K\exp\Big(-\frac{1}{K}\frac{n\mathbf{E}T_2}{m\tau^2}\Big),$$
which proves the lemma.

We are now in a position to prove Theorem 6.

Proof of Theorem 6. Let us notice that $|Z_i| \le amT_{i+1}$, so for $i \ge 1$, $\|Z_i\|_{\psi_1} \le am\|T_2\|_{\psi_1} \le am\tau$. Additionally, by (16), for $i \ge 1$, $\mathbf{E}Z_i = 0$. Denote now $R = 3n/(m\mathbf{E}T_2)$. Lemma 5, Lemma 4 and Theorem 4 with $\alpha = 1$, combined with Pisier's estimate (13), give
$$\mathbf{P}(|Z_1 + \ldots + Z_N| > 2t) \qquad (21)$$
$$\le \mathbf{P}(|Z_1 + \ldots + Z_N| > 2t \ \&\ N \le R) + K\exp\Big(-\frac{1}{K}\frac{n\mathbf{E}T_2}{m\tau^2}\Big)$$
$$\le \mathbf{P}\big(|Z_1 + Z_3 + \ldots| > t \ \&\ N \le R\big) + \mathbf{P}\big(|Z_2 + Z_4 + \ldots| > t \ \&\ N \le R\big) + K\exp\Big(-\frac{1}{K}\frac{n\mathbf{E}T_2}{m\tau^2}\Big)$$
$$\le \mathbf{P}\Big(\max_{k \le \lceil R/2\rceil}\Big|\sum_{j=0}^{k}Z_{2j+1}\Big| > t\Big) + \mathbf{P}\Big(\max_{k \le \lceil R/2\rceil}\Big|\sum_{j=1}^{k}Z_{2j}\Big| > t\Big) + K\exp\Big(-\frac{1}{K}\frac{n\mathbf{E}T_2}{m\tau^2}\Big)$$
$$\le K\mathbf{P}\Big(\Big|\sum_{j=0}^{\lceil R/2\rceil}Z_{2j+1}\Big| > t/K\Big) + K\mathbf{P}\Big(\Big|\sum_{j=1}^{\lceil R/2\rceil}Z_{2j}\Big| > t/K\Big) + K\exp\Big(-\frac{1}{K}\frac{n\mathbf{E}T_2}{m\tau^2}\Big)$$
$$\le K\exp\Big(-\frac{1}{K}\min\Big(\frac{t^2}{nm(\mathbf{E}T_2)^{-1}\mathrm{Var}\,Z_1(f)}, \frac{t}{\log(3n/(m\mathbf{E}T_2))\,am\tau}\Big)\Big) + K\exp\Big(-\frac{1}{K}\frac{n\mathbf{E}T_2}{m\tau^2}\Big),$$
where we used Lemma 4 for the sums over odd and over even indices separately, and then Theorem 4 with $\alpha = 1$ (for $\mathcal{F}$ a singleton) together with (13) to control the $\psi_1$ norm of the maximal summand.

Combining the above estimate with (18), Lemma 2 and Lemma 3, we obtain
$$\mathbf{P}(|S_n(f)| > 4t) \le K\exp\Big(-\frac{1}{K}\min\Big(\frac{t^2}{nm(\mathbf{E}T_2)^{-1}\mathrm{Var}\,Z_1(f)}, \frac{t}{\log(3n/(m\mathbf{E}T_2))\,am\tau}\Big)\Big) + K\exp\Big(-\frac{1}{K}\frac{n\mathbf{E}T_2}{m\tau^2}\Big) + 2\exp\Big(-\frac{t}{2am\tau}\Big) + K\exp\Big(-\frac{t}{Kam\tau\log\tau}\Big).$$
For $t > na/4$ the left-hand side of the above inequality is equal to $0$; therefore, using the fact that $\mathbf{E}T_2 \ge 1$ and $\tau > 1$, we obtain (17).

The proof of Theorem 7 is quite similar; however, it involves some additional technicalities related to the presence of $\mathbf{E}Z$ in our estimates.

Proof of Theorem 7. Let us first notice that, similarly as in the real valued case, we have
$$\mathbf{P}\Big(\sup_{f \in \mathcal{F}}|Z_0(f)| \ge t\Big) \le \mathbf{P}(T_1 \ge t/a) \le 2\exp\Big(-\frac{t}{a\tau}\Big);$$
moreover, Lemma 3 applied to the function $x \mapsto \sup_{f \in \mathcal{F}}|f(x)|$ gives
$$\mathbf{P}\Big(\sup_{f \in \mathcal{F}}\Big|\sum_{i = S_{N+1}+1}^{n} f(X_i)\Big| > t\Big) \le K\exp\Big(-\frac{t}{Ka\tau\log\tau}\Big).$$
One can also see that, since we assume $m = 1$, the splitting of $Z_1 + \ldots + Z_N$ into sums over even and odd indices is not necessary (by property (P3) of the split chain the summands are independent). Using the fact that Lemma 4 is valid for Banach space valued variables, we can repeat the argument from the proof of Theorem 6 and obtain, for $R = 3n/\mathbf{E}T_2$,
$$\mathbf{P}\Big(Z \ge K\mathbf{E}\sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^{\lceil R\rceil}Z_i(f)\Big| + t\Big) \le K\exp\Big(-\frac{1}{K}\min\Big(\frac{t^2}{n\sigma^2}, \frac{t}{\tau^2 a\log n}\Big)\Big).$$
Thus, Theorem 7 will follow if we prove that
$$\mathbf{E}\sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^{\lceil R\rceil}Z_i(f)\Big| \le K\mathbf{E}\sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^{n}f(X_i)\Big| + K\tau^3 a/\mathbf{E}T_2 \qquad (22)$$
(recall that $K$ may change from line to line).

From the triangle inequality, the fact that the blocks $Y_i = (X_{S_i+1}, \ldots, X_{S_{i+1}})$, $i \ge 1$, are i.i.d., and Jensen's inequality, it follows that
$$\mathbf{E}\sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^{\lceil R\rceil}Z_i(f)\Big| \le 12\,\mathbf{E}\sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^{\lceil n/(4\mathbf{E}T_2)\rceil}Z_i(f)\Big| + 12a\tau, \qquad (23)$$
where in the last inequality we used the fact that $\mathbf{E}\sup_{f \in \mathcal{F}}|Z_i(f)| \le a\mathbf{E}T_{i+1} \le a\tau$. We will split the expectation on the right-hand side into two parts, depending on the size of the variable $N$.

Let us first consider the quantity
$$\mathbf{E}\sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^{\lceil n/(4\mathbf{E}T_2)\rceil}Z_i(f)\Big|\mathbf{1}_{\{N < \lceil n/(4\mathbf{E}T_2)\rceil\}}.$$
Assume that $n/(4\mathbf{E}T_2) \ge 1$. Then, using Bernstein's inequality, we obtain
$$\mathbf{P}\big(N < \lceil n/(4\mathbf{E}T_2)\rceil\big) \le \mathbf{P}\Big(T_1 + \sum_{i=2}^{\lceil n/(4\mathbf{E}T_2)\rceil + 1}T_i > n\Big) \le \mathbf{P}(T_1 > n/2) + \mathbf{P}\Big(\sum_{i=2}^{\lceil n/(4\mathbf{E}T_2)\rceil + 1}(T_i - \mathbf{E}T_2) > n/2 - \lceil n/(4\mathbf{E}T_2)\rceil\mathbf{E}T_2\Big)$$
$$\le 2e^{-n/(2\tau)} + \mathbf{P}\Big(\Big|\sum_{i=2}^{\lceil n/(4\mathbf{E}T_2)\rceil + 1}(T_i - \mathbf{E}T_2)\Big| > n/4\Big) \le 2e^{-n/(2\tau)} + 2\exp\Big(-\frac{1}{K}\min\Big(\frac{n\mathbf{E}T_2}{\tau^2}, \frac{n}{\tau}\Big)\Big) \le Ke^{-n\mathbf{E}T_2/(K\tau^2)}.$$
If $n/(4\mathbf{E}T_2) < 1$, the above estimate holds trivially. Therefore, by the Cauchy-Schwarz inequality,
$$\mathbf{E}\sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^{\lceil n/(4\mathbf{E}T_2)\rceil}Z_i(f)\Big|\mathbf{1}_{\{N < \lceil n/(4\mathbf{E}T_2)\rceil\}} \le a\,\mathbf{E}\Big(\sum_{i=1}^{\lceil n/(4\mathbf{E}T_2)\rceil}T_{i+1}\Big)\mathbf{1}_{\{N < \lceil n/(4\mathbf{E}T_2)\rceil\}} \le Ka\tau n e^{-n\mathbf{E}T_2/(K\tau^2)} \le Ka\tau^3/\mathbf{E}T_2. \qquad (24)$$
Now we will bound the remaining part, i.e.
$$\mathbf{E}\sup_{f \in \mathcal{F}}\Big|\sum_{i=1}^{\lceil n/(4\mathbf{E}T_2)\rceil}Z_i(f)\Big|\mathbf{1}_{\{N \ge \lceil n/(4\mathbf{E}T_2)\rceil\}}.$$

Recall that Y_0 = (X_1, ..., X_{T_1}), Y_i = (X_{S_i+1}, ..., X_{S_{i+1}}) for i ≥ 1, and consider a filtration (F_i)_{i≥0} defined as F_i = σ(Y_0, ..., Y_i), where we regard the blocks Y_i as random variables with values in the disjoint union ⋃_i S^i, with the natural σ-field, i.e. the σ-field generated by ⋃_i B^i (recall that B denotes our σ-field of reference in S). Let us further notice that T_i is measurable with respect to σ(Y_{i−1}) for i ≥ 1. We have, for i ≥ 1, {N + 1 ≤ i} = {T_1 + ... + T_{i+1} > n} ∈ F_i and {N + 1 ≤ 0} = ∅, so N + 1 is a stopping time with respect to the filtration (F_i). Thus we have

E sup_F |Σ_{i=1}^{⌊n/4E T_2⌋} Z_i(f)| 1_{{N ≥ ⌊n/4E T_2⌋}}
= E sup_F |Σ_{i=1}^{⌊n/4E T_2⌋ ∧ (N+1)} Z_i(f)| 1_{{N+1 > ⌊n/4E T_2⌋}}
= E [ E ( sup_F |Σ_{i=1}^{⌊n/4E T_2⌋ ∧ (N+1)} Z_i(f)| | F_{⌊n/4E T_2⌋ ∧ (N+1)} ) 1_{{N+1 > ⌊n/4E T_2⌋}} ]
≤ E sup_F |Σ_{i=1}^{N+1} Z_i(f)| 1_{{N+1 > ⌊n/4E T_2⌋}}
≤ aτ + E sup_F |Σ_{i=0}^{N+1} Z_i(f)|
≤ aτ + E sup_F |Σ_{i=1}^n f(X_i)| + E sup_F |Σ_{i=n+1}^{S_{N+2}} f(X_i)|,

where in the first inequality we used Doob's optional sampling theorem together with the fact that sup_F |Σ_{i=1}^k Z_i(f)| is a submartingale with respect to (F_i) (notice that Z_i is measurable with respect to σ(Y_i) for i ≤ N and f ∈ F). The second equality follows from the fact that {N + 1 > ⌊n/4E T_2⌋} ∈ F_{⌊n/4E T_2⌋ ∧ (N+1)}. Indeed, for i ≥ ⌊n/4E T_2⌋ we have {N + 1 > ⌊n/4E T_2⌋ & ⌊n/4E T_2⌋ ∧ (N + 1) ≤ i} = {N + 1 > ⌊n/4E T_2⌋} ∈ F_{⌊n/4E T_2⌋} ⊆ F_i, whereas for i < ⌊n/4E T_2⌋ this set is empty.

Now, combining the above estimate with (23) and (24) and taking into account the inequality τ ≥ E T_2 ≥ 1, it is easy to see that to finish the proof of (22) it is enough to

show that

E sup_F |Σ_{i=n+1}^{S_{N+2}} f(X_i)| ≤ K a τ³/E T_2.   (25)

This in turn will follow if we prove that E(S_{N+2} − n) ≤ Kτ³/E T_2. Recall now the first part of Lemma 3, stating (under our assumption m = 1) that P(n − S_{N+1} > t) ≤ K exp(−t/(Kτ log τ)) for t ≥ 0. We have

P(S_{N+2} − n > t)
≤ P(n − S_{N+1} > t) + P(n − S_{N+1} ≤ t & S_{N+2} − n > t & N > 0) + P(S_{N+2} − n > t & N = 0)
≤ K e^{−t/(Kτ log τ)} + Σ_{k=0}^{⌊t⌋∧n} P(S_{N+1} = n − k & T_{N+2} > t + k & N > 0) + P(T_1 + T_2 > t)
≤ K e^{−t/(Kτ log τ)} + Σ_{k=0}^{⌊t⌋∧n} Σ_{l=2}^{n−k} P(S_l = n − k & T_{l+1} > t + k) + 2e^{−t/2τ}
= K e^{−t/(Kτ log τ)} + Σ_{k=0}^{⌊t⌋∧n} Σ_{l=2}^{n−k} P(S_l = n − k) P(T_{l+1} > t + k) + 2e^{−t/2τ}
≤ K e^{−t/(Kτ log τ)} + 2(t + 1)e^{−t/τ} ≤ K e^{−t/(Kτ log τ)}.

This implies that E(S_{N+2} − n) ≤ Kτ log τ ≤ Kτ³/E T_2, which proves (25). Thus (22) is shown and Theorem 7 follows.

Remark. Two natural questions to ask in regard to Theorem 7 are, first, whether the constant K in front of the expectation can be reduced to 1 + η (as in Massart's Theorem 2 or Theorem 4) and, second, whether one can reduce the constant K in the Gaussian part to 2(1 + δ) (as in Theorem 4).

3.3 Another counterexample

If we do not pay attention to constants, the main difference between the inequalities presented in the previous section and the classical Bernstein inequality for sums of i.i.d. bounded variables is the presence of the additional factor log n. We would now like to argue that under the assumptions of Theorems 6 and 7 this additional factor is indispensable. To be more precise, we will construct a Markov chain on a countable state space, satisfying the assumptions of Theorem 6 with m = 1, and such that for β < 1 there is

no constant K such that

P(|f(X_1) + ... + f(X_n)| ≥ t) ≤ K exp(−(1/K) min(t²/(n Var Z_1(f)), t/log^β n))   (26)

for all n and all functions f: S → R with |f| ≤ 1 and E_π f = 0. The state space of the chain will be the set

S = {0} ∪ ⋃_{n=1}^∞ {n} × {1, 2, ..., n} × {+1, −1}.

The transition probabilities are as follows:

p_{(n,k,s),(n,k+1,s)} = 1 for n = 1, 2, ..., k = 1, 2, ..., n − 1, s = −1, +1,
p_{(n,n,s),0} = 1 for n = 1, 2, ..., s = −1, +1,
p_{0,(n,1,s)} = e^{−n}/(2A) for n = 1, 2, ..., s = −1, +1,

where A = Σ_{n=1}^∞ e^{−n}. In other words, whenever a particle is at 0, it chooses one of countably many loops and travels deterministically along it until the next return to 0. It is easy to check that this chain has a stationary distribution π, given by

π_0 = A/(A + Σ_{n=1}^∞ n e^{−n}),   π_{(n,i,s)} = e^{−n} π_0/(2A).

The chain satisfies the minorization condition (14) with C = {0}, ν({x}) = p_{0,x}, δ = 1 and m = 1. The random variable T_1 is now just the time of the first visit to 0, and T_2, T_3, ... indicate the times between consecutive visits to 0. Moreover, P(T_2 = n) = e^{−n}/A, so ||T_2||_{ψ1} < ∞. If we start the chain from the initial distribution ν, then T_1 has the same law as T_2, so τ = ||T_2||_{ψ1} = ||T_1||_{ψ1}.

Let us now assume that for some β < 1 there is a constant K such that (26) holds. Since we work with a fixed chain, in what follows we will use the letter K also to denote constants depending on our chain (the value of K may again differ at different occurrences). We can in particular apply (26) to the function f = f_r, where r is a large integer, given by the formula

f(0) = 0,   f((n, i, s)) = s 1_{{n ≥ r}}.
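The loop chain just described is easy to simulate. The following sketch (our own illustration, not code from the paper; the loop-length distribution is truncated at n = 199, which introduces only an error of order e^{−200}) samples the trajectory and records the regeneration times, i.e. the lengths of the excursions between consecutive visits to 0:

```python
import math
import random

# Truncated normalizing constant A = sum_{n>=1} e^{-n}
A = sum(math.exp(-n) for n in range(1, 200))

def sample_loop(rng):
    # Inverse-CDF sampling: loop length n with probability e^{-n}/A, sign s = +/-1.
    u, acc = rng.random(), 0.0
    for n in range(1, 200):
        acc += math.exp(-n) / A
        if u <= acc:
            return n, rng.choice((+1, -1))
    return 199, rng.choice((+1, -1))

def run_chain(steps, rng):
    # Returns the trajectory (started at 0) and the excursion lengths T_i.
    traj, regen = [0], []
    while len(traj) < steps:
        n, s = sample_loop(rng)
        # travel deterministically along the loop, then return to 0
        traj.extend((n, k, s) for k in range(1, n + 1))
        traj.append(0)
        regen.append(n + 1)  # one excursion = n loop states + the return to 0
    return traj[:steps], regen

rng = random.Random(0)
traj, regen = run_chain(10_000, rng)
```

Since E T_2 = 1 + A^{−1} Σ n e^{−n} ≈ 2.58, the empirical mean of `regen` should settle near that value over a long run.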

We have E_π f_r = 0. Moreover,

Var Z_1(f_r) = Σ_{n=r}^∞ n² e^{−n} A^{−1} ≤ K r² e^{−r}.

Therefore (26) gives

P(|f_r(X_1) + ... + f_r(X_n)| ≥ K r e^{−r/2} √(nt) + t log^β n) ≤ e^{−t}   (27)

for t ≥ 1 and n ∈ N. Recall that S_i = T_1 + ... + T_i. By Bernstein's inequality (Lemma 6), we have for large n,

P(S_{⌈n/3E T_2⌉} > n) = P(T_1 + ... + T_{⌈n/3E T_2⌉} > n)
≤ P(Σ_{i=1}^{⌈n/3E T_2⌉} (T_i − E T_i) > n/2)
≤ 2 exp(−(1/K) min(n²/(n ||T_2||²_{ψ1}), n/||T_2||_{ψ1})) = 2e^{−n/K}.

From the above estimate, for some integer L and n large enough, divisible by L,

P(|Σ_{i=0}^{n/L} Z_i(f_r)| ≥ 2K r e^{−r/2} √(nt) + t log^β n)
≤ 2e^{−n/K} + P(|Σ_{i=0}^{n/L} Z_i(f_r)| ≥ 2K r e^{−r/2} √(nt) + t log^β n & S_{n/L+1} ≤ n)
≤ 2e^{−n/K} + P(|Σ_{i=1}^n f_r(X_i)| ≥ K r e^{−r/2} √(nt) + t log^β n & S_{n/L+1} ≤ n)
  + P(|Σ_{i=S_{n/L+1}+1}^n f_r(X_i)| ≥ K r e^{−r/2} √(nt) + t log^β n & S_{n/L+1} ≤ n)
≤ 2e^{−n/K} + e^{−t} + Σ_{k≤n} E[1_{{S_{n/L+1}=k}} P_k(|Σ_{i=1}^{n−k} f_r(X_i)| ≥ K r e^{−r/2} √(nt) + t log^β n)]
= 2e^{−n/K} + e^{−t} + e^{−t} Σ_{k≤n} E 1_{{S_{n/L+1}=k}}
≤ 2e^{−n/K} + 2e^{−t},

where in the third and fourth inequalities we used (27) and in the equality the Markov property. For n ≥ r² e^r and t ≥ 1, we obtain

P(|Z_0(f_r) + ... + Z_{n/L}(f_r)| ≥ K t log^β n) ≤ 2e^{−t} + 2e^{−n/K}.   (28)

On the other hand, we have P(|Z_i(f_r)| ≥ r) > e^{−r}/(2A). Therefore

P(max_{i ≤ n/L} |Z_i(f_r)| > r) ≥ (1/2) min(n e^{−r}/(2AL), 1).

Since the Z_i(f_r) are symmetric, by Lévy's inequality we get

2 P(|Z_0(f_r) + ... + Z_{n/L}(f_r)| ≥ r) ≥ (1/2) min(n e^{−r}/(2AL), 1),

which is bounded away from 0 for n ≥ r² e^r, whereas (28) applied for t = K^{−1} r/log^β n ≥ K^{−1} r^{1−β} − 1 gives

P(|Z_0(f_r) + ... + Z_{n/L}(f_r)| ≥ r) ≤ 2e^{−r^{1−β}/K} + 2e^{−r² e^r/K},

which for r large enough gives a contradiction.

3.4 A bounded difference type inequality for symmetric functions

Now we will present an inequality for more general statistics of the chain. Under the same assumptions on the chain as above (with the additional restriction that m = 1), we will prove a version of the bounded difference inequality for symmetric functions (see e.g. [11] for the classical i.i.d. case). Let us consider a measurable function f: S^n → R which is invariant under permutations of its arguments, i.e.

f(x_1, ..., x_n) = f(x_{σ(1)}, ..., x_{σ(n)})   (29)

for all permutations σ of the set {1, ..., n}. Let us also assume that f is L-Lipschitz with respect to the Hamming distance, i.e.

|f(x_1, ..., x_n) − f(y_1, ..., y_n)| ≤ L #{i: x_i ≠ y_i}.   (30)

Then we have the following

Theorem 8. Let X_1, X_2, ... be a Markov chain with values in S, satisfying the Minorization condition with m = 1 and admitting a unique stationary distribution π. Assume also that ||T_1||_{ψ1}, ||T_2||_{ψ1} ≤ τ. Then for every function f: S^n → R satisfying (29) and (30), we have

P(|f(X_1, ..., X_n) − E f(X_1, ..., X_n)| ≥ t) ≤ 2 exp(−(1/K) · t²/(n L² τ²))

for all t ≥ 0.

To prove the above theorem, we will need the following

Lemma 7. Let φ: R → R be a convex function and G = g(Y_1, ..., Y_n), where Y_1, ..., Y_n are independent random variables with values in a measurable space E and g: E^n → R is a measurable function. Denote G_i = g(Y_1, ..., Y_{i−1}, Ỹ_i, Y_{i+1}, ..., Y_n), where Ỹ_1, ..., Ỹ_n is an independent copy of Y_1, ..., Y_n. Assume moreover that |G − G_i| ≤ F_i(Y_i, Ỹ_i) for some functions F_i: E² → R, i = 1, ..., n. Then

E φ(G − EG) ≤ E φ(Σ_{i=1}^n ε_i F_i(Y_i, Ỹ_i)),   (31)

where ε_1, ..., ε_n is a sequence of independent Rademacher variables, independent of (Y_i)_{i≤n} and (Ỹ_i)_{i≤n}.
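For a concrete instance of conditions (29) and (30), consider the number of distinct values among the coordinates — a symmetric statistic that is 1-Lipschitz with respect to the Hamming distance. This is a toy illustration of ours, not an example from the paper:

```python
import random

def f(xs):
    # symmetric statistic: number of distinct values among x_1, ..., x_n
    return len(set(xs))

def hamming(xs, ys):
    # number of coordinates in which the two sequences differ
    return sum(x != y for x, y in zip(xs, ys))

rng = random.Random(1)
n = 50
xs = [rng.randrange(10) for _ in range(n)]
ys = list(xs)
for i in rng.sample(range(n), 5):  # perturb a few coordinates
    ys[i] = rng.randrange(10)

perm = rng.sample(range(n), n)     # a random permutation of indices
# permutation invariance (29):
assert f([xs[i] for i in perm]) == f(xs)
# Lipschitz property (30) with L = 1:
assert abs(f(xs) - f(ys)) <= hamming(xs, ys)
```

Changing one coordinate can change the count of distinct values by at most one, which is exactly the L = 1 Hamming-Lipschitz condition required by Theorem 8.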

Proof. Induction with respect to n. For n = 0 the statement is obvious, since both the left-hand and the right-hand side of (31) equal φ(0). Let us therefore assume that the lemma is true for n − 1. Then, denoting by E_X integration with respect to the variable X,

E φ(G − EG) = E φ(G − E_{Ỹ_n} G_n + E_{Y_n}(G − EG))
≤ E φ(G − G_n + E_{Y_n}(G − EG))
= E φ(G_n − G + E_{Y_n}(G − EG))
= E φ(ε_n(G − G_n) + E_{Y_n}(G − EG))
≤ E φ(ε_n F_n(Y_n, Ỹ_n) + E_{Y_n}(G − EG)),

where the equalities follow from symmetry and the last inequality from the contraction principle (or simply convexity of φ), applied conditionally on (Y_i)_i, (Ỹ_i)_i. Now, denoting Z = E_{Y_n} G and Z_i = E_{Y_n} G_i, we have for i = 1, ..., n − 1,

|Z − Z_i| = |E_{Y_n} G − E_{Y_n} G_i| ≤ E_{Y_n}|G − G_i| ≤ F_i(Y_i, Ỹ_i),

and thus, for fixed Y_n, Ỹ_n and ε_n, we can apply the induction assumption to the function t ↦ φ(ε_n F_n(Y_n, Ỹ_n) + t) instead of φ and E_{Y_n} G instead of G, to obtain

E φ(G − EG) ≤ E φ(Σ_{i=1}^n F_i(Y_i, Ỹ_i) ε_i).

Lemma 8. In the setting of Lemma 7, if for all i, ||F_i(Y_i, Ỹ_i)||_{ψ1} ≤ τ, then for all t > 0,

P(|g(Y_1, ..., Y_n) − E g(Y_1, ..., Y_n)| ≥ t) ≤ 2 exp(−(1/K) min(t²/(nτ²), t/τ)).

Proof. For p ≥ 1,

||g(Y_1, ..., Y_n) − E g(Y_1, ..., Y_n)||_p ≤ ||Σ_{i=1}^n ε_i F_i(Y_i, Ỹ_i)||_p ≤ K(√p √n τ + pτ),

where the first inequality follows from Lemma 7 and the second one from Bernstein's inequality (Lemma 6) and integration by parts. Now, by the Chebyshev inequality we get

P(|g(Y_1, ..., Y_n) − E g(Y_1, ..., Y_n)| ≥ K(√(tn) + t)τ) ≤ e^{−t}

for t ≥ 1, which is (up to the constant in the exponent) equivalent to the statement of the lemma (note that if we can change the constant in the exponent, the choice of the constant in front of the exponential is arbitrary, provided it is bigger than 1).

Proof of Theorem 8. Consider a disjoint union E = ⋃_{i=1}^∞ S^i

and a function g: E^n → R defined as

g(y_1, ..., y_n) = f(x_1, ..., x_n),

where the x_i's are defined by the condition

y_1 = (x_1, ..., x_{t_1}) ∈ S^{t_1},
y_2 = (x_{t_1+1}, ..., x_{t_1+t_2}) ∈ S^{t_2},
...
y_n = (x_{t_1+...+t_{n−1}+1}, ..., x_{t_1+...+t_n}) ∈ S^{t_n}.   (32)

Let now T_1, ..., T_n be the regeneration times of the chain and set Y_i = (X_{T_1+...+T_{i−1}+1}, ..., X_{T_1+...+T_i}) for i = 1, ..., n (we change the enumeration with respect to previous sections, but there is no longer need to distinguish the initial block). Then Y_1, ..., Y_n are independent E-valued random variables (recall the assumption m = 1). Moreover, we have f(X_1, ..., X_n) = g(Y_1, ..., Y_n). Let now Ỹ_1, ..., Ỹ_n be an independent copy of the sequence Y_1, ..., Y_n. Define G and G_i as in Lemma 7 for the function g. Define also T̃_i by Ỹ_i ∈ S^{T̃_i} and let X̃_{i,1}, ..., X̃_{i,T_1+...+T_{i−1}+T̃_i+T_{i+1}+...+T_n} correspond to Y_1, ..., Y_{i−1}, Ỹ_i, Y_{i+1}, ..., Y_n in the same way as in (32). Let us notice that we can rearrange the sequence X̃_{i,1}, ..., X̃_{i,n} in such a way that the Hamming distance of the new sequence from X_1, ..., X_n will not exceed max(T_i, T̃_i). Since the function f is invariant under permutations of arguments and L-Lipschitz with respect to the Hamming distance, we have

|G − G_i| ≤ L max(T_i, T̃_i) =: F(Y_i, Ỹ_i).

Moreover, ||F(Y_i, Ỹ_i)||_{ψ1} ≤ 2Lτ, so by Lemma 8 we obtain

P(|f(X_1, ..., X_n) − E f(X_1, ..., X_n)| ≥ t) = P(|g(Y_1, ..., Y_n) − E g(Y_1, ..., Y_n)| ≥ t) ≤ 2 exp(−(1/K) min(t²/(nL²τ²), t/(Lτ))).

But from Jensen's inequality and (30) it follows that |f(X_1, ..., X_n) − E f(X_1, ..., X_n)| ≤ Ln; thus for t > Ln the left-hand side of the above inequality is equal to 0, whereas for t ≤ Ln the inequality τ ≥ 1 gives

t²/(nL²τ²) ≤ t/(Lτ),

which proves the theorem.
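The block decomposition (32) is mechanical: given the regeneration times t_1, ..., t_n, the trajectory is simply cut into consecutive segments of those lengths. A small sketch (the function name `split_blocks` is ours, introduced only for illustration):

```python
def split_blocks(xs, ts):
    """Cut xs into blocks y_i with y_i = (x_{t_1+...+t_{i-1}+1}, ..., x_{t_1+...+t_i})."""
    blocks, pos = [], 0
    for t in ts:
        blocks.append(tuple(xs[pos:pos + t]))
        pos += t
    return blocks

xs = list(range(1, 11))   # x_1, ..., x_10
ts = [3, 2, 4, 1]         # regeneration times, summing to 10
ys = split_blocks(xs, ts)
# ys == [(1, 2, 3), (4, 5), (6, 7, 8, 9), (10,)]
```

Under the split-chain construction with m = 1 these blocks are independent, which is what lets Lemma 8 be applied to g.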

3.5 A few words on connections with other results

First we would like to comment on the assumptions of our main theorems concerning Markov chains. We assume that the Orlicz norms ||T_1||_{ψ1} and ||T_2||_{ψ1} are finite, which is equivalent to the existence of a number κ > 1 such that

E_ξ κ^{T_1} < ∞,   E_ν κ^{T_1} < ∞,

where ξ is the initial distribution of the chain and ν the minorizing measure from condition (14). This is true for instance if m = 1 and the chain satisfies the drift condition, i.e. if there is a measurable function V: S → [1, ∞), together with constants λ < 1 and K < ∞, such that

PV(x) = ∫_S V(y) P(x, dy) ≤ λV(x) for x ∉ C,   PV(x) ≤ K for x ∈ C,

and V is ξ- and ν-integrable (see e.g. [1], Propositions 4.1 and 4.4; see also [22], [18]). For m > 1 one can similarly consider the kernel P^m instead of P; however, in this case our inequalities are restricted to averages of real valued functions, as in Theorem 6. Such drift conditions have gained considerable attention in the Markov Chain Monte Carlo theory, as they imply geometric ergodicity of the chain.

Concentration of measure inequalities for general functions of Markov chains were investigated by Marton [13], Samson [23] and, more recently, by Kontorovich and Ramanan [8]. They actually consider more general mixing processes and give estimates on the deviation of a random variable from its mean or median in terms of mixing coefficients. When specialized to Markov chains, their estimates yield inequalities in the spirit of Theorem 8 for general (not necessarily symmetric) functions of uniformly ergodic Markov chains (see [18], Chapter 16 for the definition). To obtain their results, Marton and Samson used transportation inequalities, whereas Kontorovich's and Ramanan's approach was based on martingales. In all cases the bounds involve sums of expressions of the form

sup_{x,y ∈ S} ||P_i(x, ·) − P_i(y, ·)||_{TV},

where P_i is the i-step transition function of the chain.

These results are not well suited for Markov chains which are not uniformly ergodic (like the chain in Section 3.3), since for such chains the summands are bounded from below by a constant, which spoils the dependence on n in the estimates. It would be interesting to know if, in results of this type, the supremum of the total variation distances can be replaced by some other norm, for instance a kind of average. This would allow one to extend the estimates to some classes of non-uniformly ergodic Markov chains.

Inequalities of the bounded difference type for sums f(X_1) + ... + f(X_n), where the X_i's form a uniformly ergodic Markov chain, were also obtained by Glynn and Ormoneit [6]. Their method was to analyze the Poisson equation associated with the chain. Their result has been complemented by an information theoretic approach in Kontoyiannis et al. [9].

Estimates for sums, in terms of variance, appeared in the work by Samson [23], who presents a result for empirical processes of uniformly ergodic chains. He gives a real concentration inequality around the mean (and not just a tail bound as in Theorem 7). The coefficient responsible for the subgaussian behaviour of the tail is Σ_{i=1}^n E sup_F f(X_i)². Replacing it with V = E sup_F Σ_i f(X_i)² (which would correspond to the original Talagrand inequality) is stated in Samson's work as an open problem, which to our best knowledge has not yet been solved. Additionally, in Samson's estimate there is no log n factor, which is present in Theorems 6 and 7. Since we have shown that in our setting this factor is indispensable, we would like to comment on the differences between Samson's results and ours.

Obviously, the first difference is the setting. Although non-uniformly ergodic chains may satisfy our assumptions ||T_1||_{ψ1}, ||T_2||_{ψ1} < ∞, the Minorization condition need not hold for them with m = 1, which restricts our results to linear statistics of the chain (Theorem 6). However, there are many examples of non-uniformly ergodic chains for which one cannot apply Samson's result but which satisfy our assumptions. Such chains have been considered in the MCMC theory.

When specialized to sums of real variables, Samson's result can be considered a counterpart of the Bernstein inequality, valid for uniformly ergodic Markov chains. The subgaussian part of the estimate is controlled by Σ_{i=1}^n E f(X_i)², which can be much bigger than the asymptotic variance and therefore does not reflect the limiting behaviour of f(X_1) + ... + f(X_n). Consider for instance a chain consisting of the origin connected with finitely many loops in which, similarly as in the example from Section 3.3, the randomness appears only at the origin (i.e. after the choice of the loop the particle travels along it deterministically until the next return to the origin).

Then one can easily construct a function f with values in {±1}, centered with respect to the stationary distribution, such that its asymptotic variance is equal to zero, whereas Σ_{i=1}^n E f(X_i)² = n for all n (this happens for instance if the sum of the values of f along each loop vanishes). In consequence, n^{−1/2}(f(X_1) + ... + f(X_n)) converges weakly to the Dirac mass at 0 and we have P(f(X_1) + ... + f(X_n) ≥ √n t) → 0 for all t > 0, which is not recovered by Samson's estimate. One can also construct other examples of similar flavour, in which the asymptotic variance is nonzero but still much smaller than E Σ_{i=1}^n f(X_i)². On the other hand, Samson's results do not require the condition E_π f = 0 and, as already mentioned, in the case of empirical processes they provide two-sided concentration around the mean. As for the log n factor, at present we do not know if, at the cost of replacing the asymptotic variance with Σ_{i=1}^n E f(X_i)², one can eliminate it in our setting.

Summarizing, our inequalities, when compared to known results, have both advantages and disadvantages. On the one hand, when specialized to uniformly ergodic
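The phenomenon described above — Σ_i E f(X_i)² of order n while the partial sums stay bounded — can be seen in a toy simulation. This is our own illustration with finitely many even-length loops and f alternating ±1 along each loop, not the paper's exact chain:

```python
import random

def run(steps, rng):
    # f alternates +1, -1 along each loop, so every completed loop contributes 0
    # to the running sum s; we record the partial sums f(X_1) + ... + f(X_k).
    s, sums = 0, []
    loops = (2, 4, 6)  # even loop lengths
    while len(sums) < steps:
        for k in range(rng.choice(loops)):
            s += 1 if k % 2 == 0 else -1
            sums.append(s)
    return sums[:steps]

rng = random.Random(2)
sums = run(100_000, rng)
# Every step contributes f^2 = 1, so sum_i E f(X_i)^2 = n, yet the partial
# sums never leave {0, 1}: the asymptotic variance is 0.
```

A Bernstein-type bound with subgaussian coefficient Σ_i E f(X_i)² = n is thus far from sharp here, since n^{−1/2}(f(X_1) + ... + f(X_n)) collapses to 0.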

Markov chains, they do not recover the full generality or strength of previous estimates (for instance, Theorem 8 is restricted to symmetric statistics and m = 1); on the other hand, they may be applied to Markov chains arising in statistical applications which are not uniformly ergodic and therefore beyond the scope of the estimates presented above. Another property which, in our opinion, makes the estimates of Theorems 6 and 7 interesting, at least from the theoretical point of view, is the fact that for m = 1 the coefficient responsible for the Gaussian level of concentration corresponds to the variance of the limiting Gaussian distribution.

Acknowledgements

The author would like to thank Witold Bednorz and Krzysztof Łatuszyński for their useful comments concerning the results presented in this article, as well as the anonymous Referee, whose remarks helped improve their presentation.

References

[1] Baxendale, P. H. Renewal theory and computable convergence rates for geometrically ergodic Markov chains. Ann. Appl. Probab., no. 1B.
[2] Boucheron, S., Bousquet, O., Lugosi, G., Massart, P. Moment inequalities for functions of independent random variables. Ann. Probab., no. 2.
[3] Bousquet, O. A Bennett concentration inequality and its application to suprema of empirical processes. C. R. Math. Acad. Sci. Paris, no. 6.
[4] Einmahl, U., Li, D. Characterization of LIL behavior in Banach space. To appear in Trans. Amer. Math. Soc.
[5] Giné, E., Latała, R., Zinn, J. Exponential and moment inequalities for U-statistics. In High Dimensional Probability II, Progr. Probab. 47. Birkhäuser, Boston, MA.
[6] Glynn, P. W., Ormoneit, D. Hoeffding's inequality for uniformly ergodic Markov chains. Statist. Probab. Lett., no. 2.
[7] Klein, T., Rio, E. Concentration around the mean for maxima of empirical processes. Ann. Probab., no. 3.
[8] Kontorovich, L., Ramanan, K. Concentration Inequalities for Dependent Random Variables via the Martingale Method. To appear in Ann. Probab.
[9] Kontoyiannis, I., Lastras-Montaño, L., Meyn, S. P. Relative entropy and exponential deviation bounds for general Markov chains. IEEE International Symposium on Information Theory.

arXiv:0709.3110v1 [math.PR] 19 Sep 2007

More information

Objectives. By the time the student is finished with this section of the workbook, he/she should be able

Objectives. By the time the student is finished with this section of the workbook, he/she should be able FUNCTIONS Quadratic Functions......8 Absolute Value Functions.....48 Translations o Functions..57 Radical Functions...61 Eponential Functions...7 Logarithmic Functions......8 Cubic Functions......91 Piece-Wise

More information

Entropy and Ergodic Theory Lecture 15: A first look at concentration

Entropy and Ergodic Theory Lecture 15: A first look at concentration Entropy and Ergodic Theory Lecture 15: A first look at concentration 1 Introduction to concentration Let X 1, X 2,... be i.i.d. R-valued RVs with common distribution µ, and suppose for simplicity that

More information

Practical conditions on Markov chains for weak convergence of tail empirical processes

Practical conditions on Markov chains for weak convergence of tail empirical processes Practical conditions on Markov chains for weak convergence of tail empirical processes Olivier Wintenberger University of Copenhagen and Paris VI Joint work with Rafa l Kulik and Philippe Soulier Toronto,

More information

VALUATIVE CRITERIA FOR SEPARATED AND PROPER MORPHISMS

VALUATIVE CRITERIA FOR SEPARATED AND PROPER MORPHISMS VALUATIVE CRITERIA FOR SEPARATED AND PROPER MORPHISMS BRIAN OSSERMAN Recall that or prevarieties, we had criteria or being a variety or or being complete in terms o existence and uniqueness o limits, where

More information

Phenomena in high dimensions in geometric analysis, random matrices, and computational geometry Roscoff, France, June 25-29, 2012

Phenomena in high dimensions in geometric analysis, random matrices, and computational geometry Roscoff, France, June 25-29, 2012 Phenomena in high dimensions in geometric analysis, random matrices, and computational geometry Roscoff, France, June 25-29, 202 BOUNDS AND ASYMPTOTICS FOR FISHER INFORMATION IN THE CENTRAL LIMIT THEOREM

More information

BANDELET IMAGE APPROXIMATION AND COMPRESSION

BANDELET IMAGE APPROXIMATION AND COMPRESSION BANDELET IMAGE APPOXIMATION AND COMPESSION E. LE PENNEC AND S. MALLAT Abstract. Finding eicient geometric representations o images is a central issue to improve image compression and noise removal algorithms.

More information

VALUATIVE CRITERIA BRIAN OSSERMAN

VALUATIVE CRITERIA BRIAN OSSERMAN VALUATIVE CRITERIA BRIAN OSSERMAN Intuitively, one can think o separatedness as (a relative version o) uniqueness o limits, and properness as (a relative version o) existence o (unique) limits. It is not

More information

Solution of the Synthesis Problem in Hilbert Spaces

Solution of the Synthesis Problem in Hilbert Spaces Solution o the Synthesis Problem in Hilbert Spaces Valery I. Korobov, Grigory M. Sklyar, Vasily A. Skoryk Kharkov National University 4, sqr. Svoboda 677 Kharkov, Ukraine Szczecin University 5, str. Wielkopolska

More information

EXPANSIONS IN NON-INTEGER BASES: LOWER, MIDDLE AND TOP ORDERS

EXPANSIONS IN NON-INTEGER BASES: LOWER, MIDDLE AND TOP ORDERS EXPANSIONS IN NON-INTEGER BASES: LOWER, MIDDLE AND TOP ORDERS NIKITA SIDOROV To the memory o Bill Parry ABSTRACT. Let q (1, 2); it is known that each x [0, 1/(q 1)] has an expansion o the orm x = n=1 a

More information

Journal of Mathematical Analysis and Applications

Journal of Mathematical Analysis and Applications J. Math. Anal. Appl. 352 2009) 739 748 Contents lists available at ScienceDirect Journal o Mathematical Analysis Applications www.elsevier.com/locate/jmaa The growth, oscillation ixed points o solutions

More information

( x) f = where P and Q are polynomials.

( x) f = where P and Q are polynomials. 9.8 Graphing Rational Functions Lets begin with a deinition. Deinition: Rational Function A rational unction is a unction o the orm ( ) ( ) ( ) P where P and Q are polynomials. Q An eample o a simple rational

More information

RATIONAL FUNCTIONS. Finding Asymptotes..347 The Domain Finding Intercepts Graphing Rational Functions

RATIONAL FUNCTIONS. Finding Asymptotes..347 The Domain Finding Intercepts Graphing Rational Functions RATIONAL FUNCTIONS Finding Asymptotes..347 The Domain....350 Finding Intercepts.....35 Graphing Rational Functions... 35 345 Objectives The ollowing is a list o objectives or this section o the workbook.

More information

Notes 15 : UI Martingales

Notes 15 : UI Martingales Notes 15 : UI Martingales Math 733 - Fall 2013 Lecturer: Sebastien Roch References: [Wil91, Chapter 13, 14], [Dur10, Section 5.5, 5.6, 5.7]. 1 Uniform Integrability We give a characterization of L 1 convergence.

More information

14.1 Finding frequent elements in stream

14.1 Finding frequent elements in stream Chapter 14 Streaming Data Model 14.1 Finding frequent elements in stream A very useful statistics for many applications is to keep track of elements that occur more frequently. It can come in many flavours

More information

Notes 6 : First and second moment methods

Notes 6 : First and second moment methods Notes 6 : First and second moment methods Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Roc, Sections 2.1-2.3]. Recall: THM 6.1 (Markov s inequality) Let X be a non-negative

More information

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2 Order statistics Ex. 4. (*. Let independent variables X,..., X n have U(0, distribution. Show that for every x (0,, we have P ( X ( < x and P ( X (n > x as n. Ex. 4.2 (**. By using induction or otherwise,

More information

Ruelle Operator for Continuous Potentials and DLR-Gibbs Measures

Ruelle Operator for Continuous Potentials and DLR-Gibbs Measures Ruelle Operator or Continuous Potentials and DLR-Gibbs Measures arxiv:1608.03881v4 [math.ds] 22 Apr 2018 Leandro Cioletti Departamento de Matemática - UnB 70910-900, Brasília, Brazil cioletti@mat.unb.br

More information

arxiv: v1 [math.oc] 1 Apr 2012

arxiv: v1 [math.oc] 1 Apr 2012 arxiv.tex; October 31, 2018 On the extreme points o moments sets arxiv:1204.0249v1 [math.oc] 1 Apr 2012 Iosi Pinelis Department o Mathematical ciences Michigan Technological University Houghton, Michigan

More information

SOME CONVERSE LIMIT THEOREMS FOR EXCHANGEABLE BOOTSTRAPS

SOME CONVERSE LIMIT THEOREMS FOR EXCHANGEABLE BOOTSTRAPS SOME CONVERSE LIMIT THEOREMS OR EXCHANGEABLE BOOTSTRAPS Jon A. Wellner University of Washington The bootstrap Glivenko-Cantelli and bootstrap Donsker theorems of Giné and Zinn (990) contain both necessary

More information

Lecture 1 Measure concentration

Lecture 1 Measure concentration CSE 29: Learning Theory Fall 2006 Lecture Measure concentration Lecturer: Sanjoy Dasgupta Scribe: Nakul Verma, Aaron Arvey, and Paul Ruvolo. Concentration of measure: examples We start with some examples

More information

HOPF S DECOMPOSITION AND RECURRENT SEMIGROUPS. Josef Teichmann

HOPF S DECOMPOSITION AND RECURRENT SEMIGROUPS. Josef Teichmann HOPF S DECOMPOSITION AND RECURRENT SEMIGROUPS Josef Teichmann Abstract. Some results of ergodic theory are generalized in the setting of Banach lattices, namely Hopf s maximal ergodic inequality and the

More information

A note on the convex infimum convolution inequality

A note on the convex infimum convolution inequality A note on the convex infimum convolution inequality Naomi Feldheim, Arnaud Marsiglietti, Piotr Nayar, Jing Wang Abstract We characterize the symmetric measures which satisfy the one dimensional convex

More information

Lecture 17 Brownian motion as a Markov process

Lecture 17 Brownian motion as a Markov process Lecture 17: Brownian motion as a Markov process 1 of 14 Course: Theory of Probability II Term: Spring 2015 Instructor: Gordan Zitkovic Lecture 17 Brownian motion as a Markov process Brownian motion is

More information

Sections of Convex Bodies via the Combinatorial Dimension

Sections of Convex Bodies via the Combinatorial Dimension Sections of Convex Bodies via the Combinatorial Dimension (Rough notes - no proofs) These notes are centered at one abstract result in combinatorial geometry, which gives a coordinate approach to several

More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

Chapter 7. Markov chain background. 7.1 Finite state space

Chapter 7. Markov chain background. 7.1 Finite state space Chapter 7 Markov chain background A stochastic process is a family of random variables {X t } indexed by a varaible t which we will think of as time. Time can be discrete or continuous. We will only consider

More information

Web Appendix for The Value of Switching Costs

Web Appendix for The Value of Switching Costs Web Appendix or The Value o Switching Costs Gary Biglaiser University o North Carolina, Chapel Hill Jacques Crémer Toulouse School o Economics (GREMAQ, CNRS and IDEI) Gergely Dobos Gazdasági Versenyhivatal

More information

1 Relative degree and local normal forms

1 Relative degree and local normal forms THE ZERO DYNAMICS OF A NONLINEAR SYSTEM 1 Relative degree and local normal orms The purpose o this Section is to show how single-input single-output nonlinear systems can be locally given, by means o a

More information

On Picard value problem of some difference polynomials

On Picard value problem of some difference polynomials Arab J Math 018 7:7 37 https://doiorg/101007/s40065-017-0189-x Arabian Journal o Mathematics Zinelâabidine Latreuch Benharrat Belaïdi On Picard value problem o some dierence polynomials Received: 4 April

More information

k-protected VERTICES IN BINARY SEARCH TREES

k-protected VERTICES IN BINARY SEARCH TREES k-protected VERTICES IN BINARY SEARCH TREES MIKLÓS BÓNA Abstract. We show that for every k, the probability that a randomly selected vertex of a random binary search tree on n nodes is at distance k from

More information

Rigorous pointwise approximations for invariant densities of non-uniformly expanding maps

Rigorous pointwise approximations for invariant densities of non-uniformly expanding maps Ergod. Th. & Dynam. Sys. 5, 35, 8 44 c Cambridge University Press, 4 doi:.7/etds.3.9 Rigorous pointwise approximations or invariant densities o non-uniormly expanding maps WAEL BAHSOUN, CHRISTOPHER BOSE

More information

Discrete Mathematics. On the number of graphs with a given endomorphism monoid

Discrete Mathematics. On the number of graphs with a given endomorphism monoid Discrete Mathematics 30 00 376 384 Contents lists available at ScienceDirect Discrete Mathematics journal homepage: www.elsevier.com/locate/disc On the number o graphs with a given endomorphism monoid

More information

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011 LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS S. G. Bobkov and F. L. Nazarov September 25, 20 Abstract We study large deviations of linear functionals on an isotropic

More information

Lecture 4 Lebesgue spaces and inequalities

Lecture 4 Lebesgue spaces and inequalities Lecture 4: Lebesgue spaces and inequalities 1 of 10 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 4 Lebesgue spaces and inequalities Lebesgue spaces We have seen how

More information

arxiv: v1 [math.gt] 20 Feb 2008

arxiv: v1 [math.gt] 20 Feb 2008 EQUIVALENCE OF REAL MILNOR S FIBRATIONS FOR QUASI HOMOGENEOUS SINGULARITIES arxiv:0802.2746v1 [math.gt] 20 Feb 2008 ARAÚJO DOS SANTOS, R. Abstract. We are going to use the Euler s vector ields in order

More information

1. Stochastic Processes and filtrations

1. Stochastic Processes and filtrations 1. Stochastic Processes and 1. Stoch. pr., A stochastic process (X t ) t T is a collection of random variables on (Ω, F) with values in a measurable space (S, S), i.e., for all t, In our case X t : Ω S

More information

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3 Brownian Motion Contents 1 Definition 2 1.1 Brownian Motion................................. 2 1.2 Wiener measure.................................. 3 2 Construction 4 2.1 Gaussian process.................................

More information

Feedback Linearization

Feedback Linearization Feedback Linearization Peter Al Hokayem and Eduardo Gallestey May 14, 2015 1 Introduction Consider a class o single-input-single-output (SISO) nonlinear systems o the orm ẋ = (x) + g(x)u (1) y = h(x) (2)

More information

Asymptotically Efficient Estimation of the Derivative of the Invariant Density

Asymptotically Efficient Estimation of the Derivative of the Invariant Density Statistical Inerence or Stochastic Processes 6: 89 17, 3. 3 Kluwer cademic Publishers. Printed in the Netherlands. 89 symptotically Eicient Estimation o the Derivative o the Invariant Density NK S. DLLYN

More information

Independence of some multiple Poisson stochastic integrals with variable-sign kernels

Independence of some multiple Poisson stochastic integrals with variable-sign kernels Independence of some multiple Poisson stochastic integrals with variable-sign kernels Nicolas Privault Division of Mathematical Sciences School of Physical and Mathematical Sciences Nanyang Technological

More information

Chapter 1. Poisson processes. 1.1 Definitions

Chapter 1. Poisson processes. 1.1 Definitions Chapter 1 Poisson processes 1.1 Definitions Let (, F, P) be a probability space. A filtration is a collection of -fields F t contained in F such that F s F t whenever s

More information

NON-AUTONOMOUS INHOMOGENEOUS BOUNDARY CAUCHY PROBLEMS AND RETARDED EQUATIONS. M. Filali and M. Moussi

NON-AUTONOMOUS INHOMOGENEOUS BOUNDARY CAUCHY PROBLEMS AND RETARDED EQUATIONS. M. Filali and M. Moussi Electronic Journal: Southwest Journal o Pure and Applied Mathematics Internet: http://rattler.cameron.edu/swjpam.html ISSN 83-464 Issue 2, December, 23, pp. 26 35. Submitted: December 24, 22. Published:

More information

Notes on Gaussian processes and majorizing measures

Notes on Gaussian processes and majorizing measures Notes on Gaussian processes and majorizing measures James R. Lee 1 Gaussian processes Consider a Gaussian process {X t } for some index set T. This is a collection of jointly Gaussian random variables,

More information

Scattered Data Approximation of Noisy Data via Iterated Moving Least Squares

Scattered Data Approximation of Noisy Data via Iterated Moving Least Squares Scattered Data Approximation o Noisy Data via Iterated Moving Least Squares Gregory E. Fasshauer and Jack G. Zhang Abstract. In this paper we ocus on two methods or multivariate approximation problems

More information

CHOW S LEMMA. Matthew Emerton

CHOW S LEMMA. Matthew Emerton CHOW LEMMA Matthew Emerton The aim o this note is to prove the ollowing orm o Chow s Lemma: uppose that : is a separated inite type morphism o Noetherian schemes. Then (or some suiciently large n) there

More information

Introduction to Dynamical Systems

Introduction to Dynamical Systems Introduction to Dynamical Systems France-Kosovo Undergraduate Research School of Mathematics March 2017 This introduction to dynamical systems was a course given at the march 2017 edition of the France

More information

Complexity and Algorithms for Two-Stage Flexible Flowshop Scheduling with Availability Constraints

Complexity and Algorithms for Two-Stage Flexible Flowshop Scheduling with Availability Constraints Complexity and Algorithms or Two-Stage Flexible Flowshop Scheduling with Availability Constraints Jinxing Xie, Xijun Wang Department o Mathematical Sciences, Tsinghua University, Beijing 100084, China

More information

STAT 200C: High-dimensional Statistics

STAT 200C: High-dimensional Statistics STAT 200C: High-dimensional Statistics Arash A. Amini April 27, 2018 1 / 80 Classical case: n d. Asymptotic assumption: d is fixed and n. Basic tools: LLN and CLT. High-dimensional setting: n d, e.g. n/d

More information

Wiener Measure and Brownian Motion

Wiener Measure and Brownian Motion Chapter 16 Wiener Measure and Brownian Motion Diffusion of particles is a product of their apparently random motion. The density u(t, x) of diffusing particles satisfies the diffusion equation (16.1) u

More information

DIFFERENTIAL POLYNOMIALS GENERATED BY SECOND ORDER LINEAR DIFFERENTIAL EQUATIONS

DIFFERENTIAL POLYNOMIALS GENERATED BY SECOND ORDER LINEAR DIFFERENTIAL EQUATIONS Journal o Applied Analysis Vol. 14, No. 2 2008, pp. 259 271 DIFFERENTIAL POLYNOMIALS GENERATED BY SECOND ORDER LINEAR DIFFERENTIAL EQUATIONS B. BELAÏDI and A. EL FARISSI Received December 5, 2007 and,

More information

Concentration Inequalities for Dependent Random Variables via the Martingale Method

Concentration Inequalities for Dependent Random Variables via the Martingale Method Concentration Inequalities for Dependent Random Variables via the Martingale Method Leonid Kontorovich, and Kavita Ramanan Carnegie Mellon University L. Kontorovich School of Computer Science Carnegie

More information

Generalization theory

Generalization theory Generalization theory Daniel Hsu Columbia TRIPODS Bootcamp 1 Motivation 2 Support vector machines X = R d, Y = { 1, +1}. Return solution ŵ R d to following optimization problem: λ min w R d 2 w 2 2 + 1

More information

Central Limit Theorems and Proofs

Central Limit Theorems and Proofs Math/Stat 394, Winter 0 F.W. Scholz Central Limit Theorems and Proos The ollowin ives a sel-contained treatment o the central limit theorem (CLT). It is based on Lindeber s (9) method. To state the CLT

More information

An almost sure invariance principle for additive functionals of Markov chains

An almost sure invariance principle for additive functionals of Markov chains Statistics and Probability Letters 78 2008 854 860 www.elsevier.com/locate/stapro An almost sure invariance principle for additive functionals of Markov chains F. Rassoul-Agha a, T. Seppäläinen b, a Department

More information

Strong Lyapunov Functions for Systems Satisfying the Conditions of La Salle

Strong Lyapunov Functions for Systems Satisfying the Conditions of La Salle 06 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 6, JUNE 004 Strong Lyapunov Functions or Systems Satisying the Conditions o La Salle Frédéric Mazenc and Dragan Ne sić Abstract We present a construction

More information