A tail inequality for suprema of unbounded empirical processes with applications to Markov chains
Electronic Journal of Probability, Paper no. 34.

A tail inequality for suprema of unbounded empirical processes with applications to Markov chains

Radosław Adamczak
Institute of Mathematics, Polish Academy of Sciences
Śniadeckich 8, P.O. Box, Warszawa 10, Poland
R.Adamczak@impan.gov.pl

Abstract. We present a tail inequality for suprema of empirical processes generated by variables with finite ψ_α norms and apply it to some geometrically ergodic Markov chains to derive similar estimates for empirical processes of such chains, generated by bounded functions. We also obtain a bounded difference inequality for symmetric statistics of such Markov chains.

Key words: concentration inequalities, empirical processes, Markov chains.
AMS 2000 Subject Classification: Primary 60E15; Secondary 60J05.
Submitted to EJP on September 19, 2007; final version accepted May 27.
Research partially supported by MEiN Grant 1 PO3A.
1 Introduction

Let us consider a sequence X_1, X_2, ..., X_n of random variables with values in a measurable space (S, B) and a countable class F of measurable functions f: S → R. Define moreover the random variable

    Z = sup_{f∈F} Σ_{i=1}^n f(X_i).

In recent years a lot of effort has been devoted to describing the behaviour, in particular the concentration properties, of the variable Z under various assumptions on the sequence X_1, ..., X_n and the class F. Classically, one considers the case of i.i.d. or independent random variables X_i and uniformly bounded classes of functions, although there are also results for unbounded functions or for sequences of variables possessing some mixing properties.

The aim of this paper is to present tail inequalities for the variable Z under two different types of assumptions, relaxing the classical conditions. In the first part of the article we consider the case of independent variables and unbounded functions satisfying, however, some integrability assumptions. The main result of this part is Theorem 4, presented in Section 2. In the second part we keep the assumption of uniform boundedness of the class F but relax the condition on the underlying sequence of variables, by considering a class of Markov chains satisfying classical small set conditions with exponentially integrable regeneration times. If the small set assumption is satisfied for the one-step transition kernel, the regeneration technique for Markov chains together with the results for independent variables and unbounded functions allows us to derive tail inequalities for the variable Z (Theorem 7, presented in Section 3.2). In a more general situation, when the small set assumption is satisfied only by the m-skeleton chain, our results are restricted to sums of real variables, i.e. to the case of F being a singleton (Theorem 6). Finally, in Section 3.4, using similar arguments, we derive a bounded difference type inequality for Markov chains satisfying the same small set assumptions.
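For readers who prefer something concrete, the quantity Z is straightforward to compute when F is finite. The following sketch is purely illustrative and not from the paper: the data and the three-function class are invented, and for a finite class the supremum is just a maximum over the per-function sums.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal(100)            # a sample X_1, ..., X_n with S = R

# A small finite class F of measurable functions f: S -> R (invented).
F = [np.sin, np.cos, lambda x: np.tanh(x)]

# Z = sup over f in F of the sum of f(X_i); for finite F this is a max.
sums = [float(np.sum(f(X))) for f in F]
Z = max(sums)

# By construction Z dominates every individual empirical sum.
assert all(Z >= s for s in sums)
print(Z)
```

For countable infinite classes the supremum is approximated by exhausting finite subclasses; measurability of Z is exactly why the paper restricts to countable F.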
We will start by describing known results for bounded classes of functions and independent random variables, beginning with the celebrated Talagrand inequality. They will serve us both as tools and as a point of reference for presenting our results.

1.1 Talagrand's concentration inequalities

In the paper [24], Talagrand proved the following inequality for empirical processes.

Theorem 1 (Talagrand, [24]). Let X_1, ..., X_n be independent random variables with values in a measurable space (S, B) and let F be a countable class of measurable functions f: S → R, such that ||f||_∞ ≤ a < ∞ for every f ∈ F. Consider the random
variable

    Z = sup_{f∈F} Σ_{i=1}^n f(X_i).

Then for all t ≥ 0,

    P(Z ≥ EZ + t) ≤ K exp( -(t/(Ka)) log(1 + ta/V) ),        (1)

where V = E sup_{f∈F} Σ_{i=1}^n f(X_i)^2 and K is an absolute constant. In consequence, for all t ≥ 0,

    P(Z ≥ EZ + t) ≤ K_1 exp( -t^2/(K_1(V + at)) )        (2)

for some universal constant K_1. Moreover, the above inequalities hold when replacing Z by -Z.

Inequalities (1) and (2) may be considered functional versions of, respectively, Bennett's and Bernstein's inequalities for sums of independent random variables, and just as in the classical case, one of them implies the other. Let us note that Bennett's inequality recovers both the subgaussian and the Poisson behaviour of sums of independent random variables, corresponding to classical limit theorems, whereas Bernstein's inequality recovers the subgaussian behaviour for small values of t and exhibits exponential behaviour for larger values of t.

The above inequalities proved to be a very important tool in infinite dimensional probability, machine learning and M-estimation. They drew considerable attention, resulting in several simplified proofs and different versions. In particular, there has been a series of papers, starting from the work by Ledoux [10], exploring concentration of measure for empirical processes with the use of logarithmic Sobolev inequalities with discrete gradients. The first explicit constants were obtained by Massart [16], who proved in particular the following

Theorem 2 (Massart, [16]). Let X_1, ..., X_n be independent random variables with values in a measurable space (S, B) and let F be a countable class of measurable functions f: S → R, such that ||f||_∞ ≤ a < ∞ for every f ∈ F. Consider the random variable Z = sup_{f∈F} Σ_{i=1}^n f(X_i). Assume moreover that for all f ∈ F and all i, Ef(X_i) = 0, and let σ^2 = sup_{f∈F} Σ_{i=1}^n Ef(X_i)^2. Then for all η > 0 and t ≥ 0,

    P( Z ≥ (1+η)EZ + σ√(2K_1 t) + K_2(η) a t ) ≤ e^{-t}

and

    P( Z ≤ (1-η)EZ - σ√(2K_3 t) - K_4(η) a t ) ≤ e^{-t},

where K_1 = 4, K_2(η) = 2.5 + 32/η, K_3 = 5.4, K_4(η) = 2.5 + 43.2/η.

Similar, more refined results were obtained subsequently by Bousquet [3] and Klein and Rio [7].
The latter article contains an inequality for suprema of empirical processes with the best known constants.
Theorem 3 (Klein–Rio, [7], Theorems 1.1, 1.2). Let X_1, X_2, ..., X_n be independent random variables with values in a measurable space (S, B) and let F be a countable class of measurable functions f: S → [-a, a], such that for all i, Ef(X_i) = 0. Consider the random variable

    Z = sup_{f∈F} Σ_{i=1}^n f(X_i).

Then, for all t ≥ 0,

    P(Z ≥ EZ + t) ≤ exp( -t^2/(2σ^2 + 4aEZ + 3at) )

and

    P(Z ≤ EZ - t) ≤ exp( -t^2/(2σ^2 + 4aEZ + 3at) ),

where

    σ^2 = sup_{f∈F} Σ_{i=1}^n Ef(X_i)^2.

The reader may notice that, contrary to the original Talagrand result, the estimates of Theorems 2 and 3 use the weak variance σ^2 rather than the strong parameter V of Theorem 1. This stems from several reasons, e.g. the statistical relevance of the parameter σ and the analogy with the concentration of Gaussian processes, which by the CLT (in the case of Donsker classes of functions) correspond to the limiting behaviour of empirical processes. One should also note that by the contraction principle we have σ^2 ≤ V ≤ σ^2 + 16aEZ (see [12], Lemma 6.6). Thus, usually one would like to describe the subgaussian behaviour of the variable Z in terms of σ; however, the price to be paid is the additional summand of the form ηEZ.

Let us also remark that if one does not pay attention to constants, the inequalities presented in Theorems 2 and 3 follow from Talagrand's inequality just by the aforementioned estimate V ≤ σ^2 + 16aEZ and, in the case of Theorem 2, the inequality between the geometric and the arithmetic mean.

1.2 Notation, basic definitions

In the article, by K we will denote universal constants and by C(α, β), K_α constants depending only on α, β (or only on α, respectively), where α, β are some parameters. In both cases the values of the constants may change from line to line. We will also use the classical definition of exponential Orlicz norms.

Definition 1. For α > 0, define the function ψ_α: R_+ → R_+ by the formula ψ_α(x) = exp(x^α) - 1. For a random variable X, define also the Orlicz norm

    ||X||_{ψ_α} = inf{ λ > 0: E ψ_α(|X|/λ) ≤ 1 }.
Let us also note a basic fact that we will use in the sequel, namely that by Chebyshev's inequality, for t ≥ 0,

    P(|X| ≥ t) ≤ 2 exp( -(t/||X||_{ψ_α})^α ).

Remark. For α < 1 the above definition does not give a norm but only a quasi-norm. This can be fixed by changing the function ψ_α near zero to make it convex, which would give an equivalent norm. It is however widely accepted in the literature to use the word norm also for the quasi-norm given by our definition.

2 Tail inequality for suprema of empirical processes corresponding to classes of unbounded functions

2.1 The main result for the independent case

We will now formulate our main result in the setting of independent variables, namely tail estimates for suprema of empirical processes under the assumption that the summands have finite ψ_α Orlicz norm.

Theorem 4. Let X_1, ..., X_n be independent random variables with values in a measurable space (S, B) and let F be a countable class of measurable functions f: S → R. Assume that for every f ∈ F and every i, Ef(X_i) = 0, and that for some α ∈ (0, 1] and all i, || sup_{f∈F} |f(X_i)| ||_{ψ_α} < ∞. Let

    Z = sup_{f∈F} Σ_{i=1}^n f(X_i).

Define moreover

    σ^2 = sup_{f∈F} Σ_{i=1}^n Ef(X_i)^2.

Then, for all 0 < η < 1 and δ > 0, there exists a constant C = C(α, η, δ), such that for all t ≥ 0,

    P( Z ≥ (1+η)EZ + t ) ≤ exp( -t^2/(2(1+δ)σ^2) ) + 3 exp( -( t / (C || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α}) )^α )

and

    P( Z ≤ (1-η)EZ - t ) ≤ exp( -t^2/(2(1+δ)σ^2) ) + 3 exp( -( t / (C || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α}) )^α ).
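For completeness, the quoted tail bound is a standard one-line consequence of Markov's inequality applied to the increasing function ψ_α (this derivation is supplied here for the reader; it is not spelled out in the text):

```latex
P(|X| \ge t)
  = P\Big(\psi_\alpha\big(|X|/\|X\|_{\psi_\alpha}\big)
          \ge \psi_\alpha\big(t/\|X\|_{\psi_\alpha}\big)\Big)
  \le \frac{E\,\psi_\alpha\big(|X|/\|X\|_{\psi_\alpha}\big)}
           {\psi_\alpha\big(t/\|X\|_{\psi_\alpha}\big)}
  \le \frac{1}{\exp\!\big((t/\|X\|_{\psi_\alpha})^\alpha\big)-1}
  \le 2\exp\!\big(-(t/\|X\|_{\psi_\alpha})^\alpha\big),
```

where the last step uses 1/(e^u - 1) ≤ 2e^{-u} for u ≥ log 2, while for u < log 2 the right-hand side exceeds 1 and the bound is trivial.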
Remark. The above theorem may thus be considered a counterpart of Massart's result (Theorem 2). It is written in a slightly different manner, reflecting the use of Theorem 3 in the proof, but it is easy to see that if one disregards the constants, it yields another version in the flavour of the inequalities presented in Theorem 2.

Let us note that some weaker inequalities (e.g. not recovering the proper power α in the subexponential decay of the tail) may be obtained by combining the Pisier inequality (see (13) below) with moment estimates for empirical processes proven by Giné, Latała and Zinn [5] and later obtained by a different method also by Boucheron, Bousquet, Lugosi and Massart [2]. These moment estimates first appeared in the context of tail inequalities for U-statistics and were later used in statistics, in model selection. They are however also of independent interest, as extensions of classical Rosenthal inequalities for p-th moments of sums of independent random variables, with the dependence on p stated explicitly.

The proof of Theorem 4 is a combination of the classical Hoffmann-Jørgensen inequality with Theorem 3 and another deep result due to Talagrand.

Theorem 5 (Ledoux–Talagrand, [12]). In the setting of Theorem 4, we have

    ||Z||_{ψ_α} ≤ K_α ( ||Z||_1 + || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α} ).

We will also need the following corollary to Theorem 3, which was derived in [4]. Since the proof is very short, we will present it here for the sake of completeness.

Lemma 1. In the setting of Theorem 3, for all 0 < η ≤ 1 and δ > 0 there exists a constant C = C(η, δ), such that for all t ≥ 0,

    P( Z ≥ (1+η)EZ + t ) ≤ exp( -t^2/(2(1+δ)σ^2) ) + exp( -t/(Ca) )

and

    P( Z ≤ (1-η)EZ - t ) ≤ exp( -t^2/(2(1+δ)σ^2) ) + exp( -t/(Ca) ).

Proof. It is enough to notice that for all δ > 0,

    exp( -t^2/(2σ^2 + 4aEZ + 3at) ) ≤ exp( -t^2/(2(1+δ)σ^2) ) + exp( -t^2/((1+δ^{-1})(4aEZ + 3ta)) )

and to use this inequality together with Theorem 3 for t + ηEZ instead of t, which gives C = (1 + 1/δ)(3 + 2/η).
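The first step of the proof of Lemma 1 rests on the elementary observation (standard, spelled out here for the reader) that for A, B > 0 one has A + B ≤ max{(1+δ)A, (1+δ^{-1})B}: indeed, either B ≤ δA, in which case A + B ≤ (1+δ)A, or B > δA, in which case A + B ≤ (1+δ^{-1})B. Consequently,

```latex
\exp\Big(-\frac{t^2}{A+B}\Big)
  \le \max\Big\{\exp\Big(-\frac{t^2}{(1+\delta)A}\Big),\,
                \exp\Big(-\frac{t^2}{(1+\delta^{-1})B}\Big)\Big\}
  \le \exp\Big(-\frac{t^2}{(1+\delta)A}\Big)
    + \exp\Big(-\frac{t^2}{(1+\delta^{-1})B}\Big),
```

which is applied with A = 2σ² and B = 4aEZ + 3at.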
Proof of Theorem 4. Without loss of generality we may and will assume that

    t / || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α} > K(α, η, δ),        (3)

otherwise we can make the theorem trivial by choosing the constant C = C(α, η, δ) large enough. The conditions on the constant K(α, η, δ) will be imposed later on in the proof.

Let ε = ε(δ) > 0 (its value will be determined later) and for all f ∈ F consider the truncated functions

    f_1(x) = f(x) 1_{ {sup_{f∈F} |f(x)| ≤ ρ} }

(the truncation level ρ will also be fixed later). Define also the functions

    f_2(x) = f(x) - f_1(x) = f(x) 1_{ {sup_{f∈F} |f(x)| > ρ} }.

Let F_i = {f_i: f ∈ F}, i = 1, 2. We have

    Z ≤ sup_{f_1∈F_1} Σ_{i=1}^n ( f_1(X_i) - Ef_1(X_i) ) + sup_{f_2∈F_2} Σ_{i=1}^n ( f_2(X_i) - Ef_2(X_i) )        (4)

and

    Z ≥ sup_{f_1∈F_1} Σ_{i=1}^n ( f_1(X_i) - Ef_1(X_i) ) - sup_{f_2∈F_2} | Σ_{i=1}^n ( f_2(X_i) - Ef_2(X_i) ) |,        (5)

where we used the fact that Ef_1(X_i) + Ef_2(X_i) = 0 for all f ∈ F. Similarly, by Jensen's inequality, we get

    E sup_{f_1∈F_1} Σ_{i=1}^n ( f_1(X_i) - Ef_1(X_i) ) ≤ EZ + 2 E sup_{f_2∈F_2} | Σ_{i=1}^n f_2(X_i) |
    and
    EZ ≤ E sup_{f_1∈F_1} Σ_{i=1}^n ( f_1(X_i) - Ef_1(X_i) ) + 2 E sup_{f_2∈F_2} | Σ_{i=1}^n f_2(X_i) |.        (6)

Denoting

    A = E sup_{f_1∈F_1} Σ_{i=1}^n ( f_1(X_i) - Ef_1(X_i) )   and   B = E sup_{f_2∈F_2} | Σ_{i=1}^n f_2(X_i) |,
we get by (4) and (6),

    P( Z ≥ (1+η)EZ + t )
      ≤ P( sup_{f_1∈F_1} Σ ( f_1(X_i) - Ef_1(X_i) ) ≥ (1+η)EZ + (1-ε)t ) + P( sup_{f_2∈F_2} | Σ ( f_2(X_i) - Ef_2(X_i) ) | ≥ εt )
      ≤ P( sup_{f_1∈F_1} Σ ( f_1(X_i) - Ef_1(X_i) ) ≥ (1+η)A - 4B + (1-ε)t ) + P( sup_{f_2∈F_2} | Σ ( f_2(X_i) - Ef_2(X_i) ) | ≥ εt )        (7)

and similarly by (5) and (6),

    P( Z ≤ (1-η)EZ - t )
      ≤ P( sup_{f_1∈F_1} Σ ( f_1(X_i) - Ef_1(X_i) ) ≤ (1-η)EZ - (1-ε)t ) + P( sup_{f_2∈F_2} | Σ ( f_2(X_i) - Ef_2(X_i) ) | ≥ εt )
      ≤ P( sup_{f_1∈F_1} Σ ( f_1(X_i) - Ef_1(X_i) ) ≤ (1-η)A - (1-ε)t + 2B ) + P( sup_{f_2∈F_2} | Σ ( f_2(X_i) - Ef_2(X_i) ) | ≥ εt ).        (8)

We would like to choose the truncation level ρ in a way which would allow us to bound the first summands on the right-hand sides of (7) and (8) with Lemma 1 and the other summands with Theorem 5. To this end let us set

    ρ = 8 E max_{1≤i≤n} sup_{f∈F} |f(X_i)| ≤ K_α || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α}.        (9)

Let us notice that by the Chebyshev inequality and the definition of the class F_2, we have

    P( max_{k≤n} sup_{f_2∈F_2} | Σ_{i=1}^k f_2(X_i) | > 0 ) ≤ P( max_{1≤i≤n} sup_{f∈F} |f(X_i)| > ρ ) ≤ 1/8
and thus by the Hoffmann-Jørgensen inequality (see e.g. [12], Chapter 6, Proposition 6.8, inequality (6.8)), we obtain

    B = E sup_{f_2∈F_2} | Σ_{i=1}^n f_2(X_i) | ≤ 8 E max_{1≤i≤n} sup_{f∈F} |f(X_i)|.        (10)

In consequence,

    E sup_{f_2∈F_2} | Σ_{i=1}^n ( f_2(X_i) - Ef_2(X_i) ) | ≤ 2B ≤ 16 E max_{1≤i≤n} sup_{f∈F} |f(X_i)| ≤ K_α || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α}.

We also have

    || max_{1≤i≤n} sup_{f_2∈F_2} | f_2(X_i) - Ef_2(X_i) | ||_{ψ_α}
      ≤ K_α max_{1≤i≤n} sup_{f_2∈F_2} |Ef_2(X_i)| + K_α || max_{1≤i≤n} sup_{f_2∈F_2} |f_2(X_i)| ||_{ψ_α}
      ≤ K_α || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α}

(recall that with our definitions, for α < 1, ||·||_{ψ_α} is a quasi-norm, which explains the presence of the constant K_α in the first inequality). Thus, by Theorem 5, we obtain

    || sup_{f_2∈F_2} Σ_{i=1}^n ( f_2(X_i) - Ef_2(X_i) ) ||_{ψ_α} ≤ K_α || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α},

which implies

    P( sup_{f_2∈F_2} | Σ_{i=1}^n ( f_2(X_i) - Ef_2(X_i) ) | ≥ εt ) ≤ 2 exp( -( εt / (K_α || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α}) )^α ).        (11)

Let us now choose ε < 1/10 and such that

    (1 - 5ε)^2 ≥ (1 + δ/2)/(1 + δ).        (12)

Since ε is a function of δ, in view of (9) and (10), we can choose the constant K(α, η, δ) in (3) large enough to ensure that

    B ≤ 8 E max_{1≤i≤n} sup_{f∈F} |f(X_i)| ≤ εt.
Notice moreover that for every f ∈ F, we have

    Σ_{i=1}^n E( f_1(X_i) - Ef_1(X_i) )^2 ≤ Σ_{i=1}^n Ef_1(X_i)^2 ≤ Σ_{i=1}^n Ef(X_i)^2 ≤ σ^2.

Thus, using inequalities (7), (8), (11) and Lemma 1 applied with η and δ/2, we obtain

    P( Z ≥ (1+η)EZ + t ), P( Z ≤ (1-η)EZ - t )
      ≤ exp( -t^2(1-5ε)^2/(2(1+δ/2)σ^2) ) + exp( -(1-5ε)t/(K(η,δ)ρ) )
        + 2 exp( -( εt / (K_α || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α}) )^α ).

Since ε < 1/10, using (9) one can see that for t satisfying (3) with K(α, η, δ) large enough, we have

    exp( -(1-5ε)t/(K(η,δ)ρ) ) ≤ exp( -( t / (C(α,η,δ) || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α}) )^α )

(note that the above inequality holds for all t if α = 1). Therefore, for such t,

    P( Z ≥ (1+η)EZ + t ), P( Z ≤ (1-η)EZ - t )
      ≤ exp( -t^2(1-5ε)^2/(2(1+δ/2)σ^2) ) + 3 exp( -( t / (C(α,η,δ) || max_{1≤i≤n} sup_{f∈F} |f(X_i)| ||_{ψ_α}) )^α ).

To finish the proof it is now enough to use (12).

Remark. We would like to point out that the use of the Hoffmann-Jørgensen inequality in a similar context is well known. Such applications appeared in the proof of the aforementioned moment estimates for empirical processes by Giné, Latała and Zinn [5], in the proof of Theorem 5, and recently in the proof of Fuk–Nagaev type inequalities for empirical processes used by Einmahl and Li to investigate generalized laws of the iterated logarithm for Banach space valued variables [4]. As for using Theorem 5 to control the remainder after truncating the original random variables, it was recently used in a somewhat similar way by Mendelson and Tomczak-Jaegermann (see [17]).
2.2 A counterexample

We will now present a simple example showing that in Theorem 4 one cannot replace || max_i sup_{f∈F} |f(X_i)| ||_{ψ_α} with max_i || sup_{f∈F} |f(X_i)| ||_{ψ_α}. With such a modification, the inequality fails to be true even in the real valued case, i.e. when F is a singleton. For simplicity we will consider only the case α = 1.

Consider a sequence Y_1, Y_2, ... of i.i.d. real random variables such that

    P(Y_i = r) = e^{-r} = 1 - P(Y_i = 0).

Let ε_1, ε_2, ... be a Rademacher sequence, independent of (Y_i)_i. Define finally X_i = ε_i Y_i. We have

    E e^{|X_i|} = e^{-r} e^{r} + (1 - e^{-r}) ≤ 2,

so ||X_i||_{ψ_1} ≤ 1. Moreover, EX_i^2 = r^2 e^{-r}. Assume now that we have, for all n, r ∈ N and t ≥ 0,

    P( | Σ_{i=1}^n X_i | ≥ K( √(nt) ||X_1||_2 + t ||X_1||_{ψ_1} ) ) ≤ Ke^{-t},

where K is an absolute constant (which would hold if the corresponding version of Theorem 4 were true). For sufficiently large r, the above inequality applied with n ≈ e^r r^{-2} and t ≈ r implies that

    P( | Σ_{i=1}^n X_i | ≥ r ) ≤ Ke^{-r/K}.

On the other hand, by Lévy's inequality, we have

    2 P( | Σ_{i=1}^n X_i | ≥ r ) ≥ P( max_{1≤i≤n} |X_i| ≥ r ) ≥ (1/2) min( nP(|X_1| ≥ r), 1 ) ≥ r^{-2}/2,

which gives a contradiction for large r.

Remark. A small modification of the above argument shows that one cannot hope for an inequality

    P( Z ≥ K( EZ + tσ + t log^β(n) max_i || sup_{f∈F} |f(X_i)| ||_{ψ_1} ) ) ≤ Ke^{-t}

with β < 1. For β = 1, this inequality follows from Theorem 4 via Pisier's inequality [21],

    || max_{i≤n} |Y_i| ||_{ψ_α} ≤ K_α log^{1/α}(n) max_{i≤n} || Y_i ||_{ψ_α}        (13)

for independent real variables Y_1, ..., Y_n.
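The log factor in Pisier's inequality (13) is already saturated, for α = 1, by i.i.d. standard exponential variables: a classical computation gives E max_{i≤n} Y_i = H_n = Σ_{k≤n} 1/k ≈ log n, while each ||Y_i||_{ψ_1} is a constant. The quick numeric check below is an illustration added by the editor, not part of the original argument:

```python
import math

def harmonic(n: int) -> float:
    """H_n = sum_{k=1}^n 1/k; equals E max of n i.i.d. Exp(1) variables."""
    return sum(1.0 / k for k in range(1, n + 1))

for n in (10, 1000, 100000):
    ratio = harmonic(n) / math.log(n)
    print(n, round(ratio, 3))
# The ratio H_n / log n decreases towards 1, so the expected maximum
# really grows like log n, matching the log factor in (13).
```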
3 Applications to Markov chains

We will now turn to the other class of inequalities we are interested in. We are again concerned with random variables of the form

    Z = sup_{f∈F} Σ_{i=1}^n f(X_i),

but this time we assume that the class F is uniformly bounded and we drop the assumption of independence of the underlying sequence X_1, ..., X_n. To be more precise, we will assume that X_1, ..., X_n form a Markov chain satisfying some additional conditions, which are rather classical in the Markov chain and Markov Chain Monte Carlo literature.

The organization of this part is as follows. First, before stating the main results, we will present all the structural assumptions we will impose on the chain. At the same time we will introduce some notation which will be used in the sequel. Next, we present our results (Theorems 6 and 7), followed by the proofs (which are quite straightforward but technical) and a discussion of optimality (Section 3.3). At the end, in Section 3.4, we will also present a bounded differences type inequality for Markov chains.

3.1 Assumptions on the Markov chain

Let X_1, X_2, ... be a homogeneous Markov chain on S, with transition kernel P = P(x, A), satisfying the so-called minorization condition, stated below.

Minorization condition. We assume that there exist a positive integer m, a constant δ > 0, a set C ∈ B (a "small set") and a probability measure ν on S for which

    ∀ x ∈ C  ∀ A ∈ B    P^m(x, A) ≥ δν(A)        (14)

and

    ∀ x ∈ S  ∃ n    P^{nm}(x, C) > 0,        (15)

where P^i(·, ·) is the transition kernel of the chain after i steps. One can show that in such a situation, if the chain admits an invariant measure π, then this measure is unique and satisfies π(C) > 0 (see [18]). Moreover, under some conditions on the initial distribution ξ, the chain can be extended to a new, so-called split chain (X̃_n, R_n) ∈ S × {0, 1}, satisfying the following properties.

Properties of the split chain

(P1) (X̃_n)_n is again a Markov chain with transition kernel P and initial distribution ξ (hence for our purposes of estimating tail probabilities we may and will identify X̃_n and X_n),
(P2) if we define T_1 = inf{n > 0: R_{nm} = 1}, T_{i+1} = inf{n > 0: R_{(T_1+...+T_i+n)m} = 1}, then T_1, T_2, ... are well defined and independent; moreover T_2, T_3, ... are i.i.d.,

(P3) if we define S_i = T_1 + ... + T_i, then the blocks

    Y_0 = (X_1, ..., X_{mT_1+m-1}),   Y_i = (X_{mS_i+1}, ..., X_{mS_{i+1}+m-1}),   i > 0,

form a one-dependent sequence (i.e. for all i, σ((Y_j)_{j<i}) and σ((Y_j)_{j>i}) are independent). Moreover, the sequence Y_1, Y_2, ... is stationary. If m = 1, then the variables Y_0, Y_1, ... are independent. In consequence, for f: S → R, the variables

    Z_i = Z_i(f) = Σ_{j=mS_i+1}^{mS_{i+1}+m-1} f(X_j),   i ≥ 1,

constitute a one-dependent stationary sequence (an i.i.d. sequence if m = 1). Additionally, if f is π-integrable (recall that π is the unique stationary measure of the chain), then

    EZ_i = δ^{-1} π(C)^{-1} m ∫ f dπ.        (16)

(P4) the distribution of T_1 depends only on ξ, P, C, δ, ν, whereas the law of T_2 depends only on P, C, δ and ν.

We refrain from specifying the construction of this new chain in full generality, as well as the conditions under which (14) and (15) hold, and refer the reader to the classical monograph [18] or the survey article [22] for a complete exposition. Here we will only sketch the construction for m = 1, to give its flavour. Informally speaking, at each step i, if we have X_i = x and x ∉ C, we generate the next value of the chain according to the measure P(x, ·). If x ∈ C, then we toss a coin with probability of success equal to δ. In the case of success (R_i = 1), we draw the next sample according to the measure ν; otherwise (R_i = 0), according to

    ( P(x, ·) - δν(·) ) / (1 - δ).

When R_i = 1, one usually says that the chain regenerates, as the distribution in the next step (for m = 1; after m steps in general) is again ν. Let us remark that for a recurrent chain on a countable state space, admitting a stationary distribution, the Minorization condition is always satisfied with m = 1 and δ = 1 (for C we can take {x}, where x is an arbitrary element of the state space). Also, the construction of the split chain becomes trivial.
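The m = 1 construction sketched above is easy to simulate. The toy script below is a sketch under invented parameters (a two-state chain with small set C = {0}, δ = 0.5 and a measure ν chosen so that (14) holds; none of these numbers come from the paper). It runs the split chain and records the regeneration times at which R_i = 1:

```python
import random

random.seed(1)

# Toy two-state chain; all numbers are invented for illustration.
P = {0: [0.6, 0.4], 1: [0.3, 0.7]}   # transition kernel P(x, .)
C = {0}                              # small set
delta = 0.5
nu = [0.8, 0.2]                      # chosen so that P(0, .) >= delta * nu, cf. (14)
residual = [(P[0][s] - delta * nu[s]) / (1 - delta) for s in (0, 1)]

def step(x):
    """One step of the split chain: returns (next state, regeneration bit R)."""
    if x in C and random.random() < delta:
        return random.choices((0, 1), weights=nu)[0], 1    # success: draw from nu
    w = residual if x in C else P[x]                       # otherwise residual kernel
    return random.choices((0, 1), weights=w)[0], 0

x, regen_times = 0, []
for i in range(1, 10001):
    x, r = step(x)
    if r:
        regen_times.append(i)

gaps = [t2 - t1 for t1, t2 in zip(regen_times, regen_times[1:])]
print(len(regen_times), sum(gaps) / len(gaps))
```

By property (P2) the gaps between regenerations (the variables T_i) are i.i.d., so their empirical mean stabilizes; the trajectory between consecutive regeneration times forms exactly the independent blocks Y_i used throughout this section.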
Before we proceed, let us present the general idea our approach is based on. To derive our estimates we will need two types of assumptions.

Regeneration assumption. We will work under the assumption that the chain admits a representation as above (properties (P1) to (P4)). We will not, however, take advantage of the explicit construction. Instead, we will use only the properties stated in the points above. A similar approach is quite common in the literature.

Assumption of exponential integrability of the regeneration time. To derive concentration of measure inequalities, we will also assume that ||T_1||_{ψ_1} < ∞ and ||T_2||_{ψ_1} < ∞. At the end of the article we will present examples for which this assumption is satisfied and relate the obtained inequalities to known results.

The regenerative properties of the chain allow us to decompose it into one-dependent (independent if m = 1) blocks of random length, making it possible to reduce the analysis of the chain to sums of independent random variables (this approach is by now classical; it has been successfully used in the analysis of limit theorems for Markov chains, see [18]). Since we are interested in non-asymptotic estimates of exponential type, we have to impose some additional conditions on the regeneration time, which give us control over the random length of the one-dependent blocks. This is the reason for introducing the assumption of exponential integrability, which after some technical steps allows us to apply the inequalities for unbounded empirical processes presented in Section 2.

3.2 Main results concerning Markov chains

Having established all the notation, we are ready to state our main results on Markov chains. As announced in the introduction, our results depend on the parameter m in the Minorization condition.
If m = 1 we are able to obtain tail inequalities for empirical processes (Theorem 7), whereas for m > 1 the result is restricted to linear statistics of Markov chains (Theorem 6), which formally corresponds to empirical processes indexed by a singleton. The variables T_i and Z_1 appearing in the theorems were defined in the previous section (see the properties (P2) and (P3) of the split chain).

Theorem 6. Let X_1, X_2, ... be a Markov chain with values in S, satisfying the Minorization condition and admitting a unique stationary distribution π. Assume also that ||T_1||_{ψ_1}, ||T_2||_{ψ_1} ≤ τ. Consider a function f: S → R, such that ||f||_∞ ≤ a and E_π f = 0. Define also the random variable

    Z = Σ_{i=1}^n f(X_i).
Then for all t > 0,

    P( |Z| > t ) ≤ K exp( -(1/K) min( t^2 / (n m (ET_2)^{-1} Var(Z_1)), t / (τ^2 a m log n) ) ).        (17)

Theorem 7. Let X_1, X_2, ... be a Markov chain with values in S, satisfying the Minorization condition with m = 1 and admitting a unique stationary distribution π. Assume also that ||T_1||_{ψ_1}, ||T_2||_{ψ_1} ≤ τ. Consider moreover a countable class F of measurable functions f: S → R, such that ||f||_∞ ≤ a and E_π f = 0 for all f ∈ F. Define the random variable

    Z = sup_{f∈F} Σ_{i=1}^n f(X_i)

and the asymptotic weak variance

    σ^2 = sup_{f∈F} Var(Z_1(f)) / ET_2.

Then, for all t ≥ 1,

    P( Z ≥ KEZ + t ) ≤ K exp( -(1/K) min( t^2/(nσ^2), t/(τ^3 (ET_2)^{-1} a log n) ) ).

Remarks.

1. As was mentioned in the previous section, chains satisfying the Minorization condition admit at most one stationary measure.

2. In Theorem 7, the dependence on the chain is worse than in Theorem 6, i.e. we have τ^3 (ET_2)^{-1} instead of τ^2 in the denominator. This is the result of just one step in the argument we present below; however, at the moment we do not know how to improve this dependence or extend the result to m > 1.

3. Another remark we would like to make is related to the limit behaviour of the Markov chain. Let us notice that the asymptotic variance (the variance in the CLT for n^{-1/2}(f(X_1) + ... + f(X_n))) equals

    m^{-1} (ET_2)^{-1} ( Var(Z_1) + 2E Z_1 Z_2 ),

which for m = 1 reduces to

    (ET_2)^{-1} Var(Z_1)

(we again refer the reader to [18], Chapter 17 for details). Thus, for m = 1 our estimates reflect the asymptotic behaviour of the variable Z.
Let us now pass to the proofs of the above theorems. For a function f: S → R let us define

    Z_0 = Z_0(f) = Σ_{i=1}^{mT_1+m-1} f(X_i)

and recall the variables S_i and

    Z_i = Z_i(f) = Σ_{j=mS_i+1}^{mS_{i+1}+m-1} f(X_j),   i ≥ 1,

defined in the previous section (see property (P3) of the split chain). Recall also that the Z_i form a one-dependent stationary sequence for m > 1 and an i.i.d. sequence for m = 1. Using this notation, we have

    f(X_1) + ... + f(X_n) = Z_0 + Z_1 + ... + Z_N + Σ_{i=m(S_{N+1}+1)}^{n} f(X_i),        (18)

with

    N = sup{ i ∈ N: mS_{i+1} + m - 1 ≤ n },        (19)

where sup ∅ = 0 (note that N is a random variable). Thus Z_0 represents the sum up to the first regeneration time; then Z_1, ..., Z_N are identically distributed blocks between consecutive regeneration times, included in the interval [1, n]; finally, the last term corresponds to the initial segment of the last block. The sum Z_1 + ... + Z_N is empty if up to time n there has not been any regeneration (i.e. mT_1 + m - 1 > n) or there has been only one regeneration (mT_1 + m - 1 ≤ n and m(T_1 + T_2) + m - 1 > n). The last sum on the right-hand side is empty if there has been no regeneration or if the last full block ends at n.

We will first bound the initial and the last summand in the decomposition (18). To achieve this we will not need the assumption that f is centered with respect to the stationary distribution π. In consequence, the same bounds may be applied in the proofs of both Theorem 6 and Theorem 7.

Lemma 2. If ||T_1||_{ψ_1} ≤ τ and ||f||_∞ ≤ a, then for all t ≥ 0,

    P( |Z_0| ≥ t ) ≤ 2 exp( -t/(2amτ) ).

Proof. We have |Z_0| ≤ 2aT_1 m, so by the remark after Definition 1,

    P( |Z_0| ≥ t ) ≤ P( T_1 ≥ t/(2am) ) ≤ 2 exp( -t/(2amτ) ).
The next lemma provides a similar bound for the last summand on the right-hand side of (18). It is a little bit more complicated, since it involves additional dependence on the random variable N.

Lemma 3. If ||T_1||_{ψ_1}, ||T_2||_{ψ_1} ≤ τ, then for all t ≥ 0,

    P( n - mS_{N+1} > t ) ≤ K exp( -t/(Kmτ log τ) ).

In consequence, if ||f||_∞ ≤ a, then

    P( | Σ_{i=m(S_{N+1}+1)}^{n} f(X_i) | > t ) ≤ K exp( -t/(Kamτ log τ) ).

Proof. Let us consider the variable M_n = n - mS_{N+1}. If M_n > t, then S_{N+1} < (n - t + 1)/m. Therefore

    P( M_n > t ) = Σ_{k < (n-t+1)/m} P( S_{N+1} = k )
                 = Σ_{k < (n-t+1)/m} Σ_{l=1}^{k} P( S_l = k & N + 1 = l )
                 = Σ_{k < (n-t+1)/m} Σ_{l=1}^{k} P( S_l = k & m(k + T_{l+1}) + m - 1 > n )
                 = Σ_{k < (n-t+1)/m} Σ_{l=1}^{k} P( S_l = k ) P( T_2 > (n+1)/m - 1 - k )
                 ≤ Σ_{k < (n-t+1)/m} P( T_2 > (n+1)/m - 1 - k )
                 ≤ Σ_{k < (n-t+1)/m} 2 exp( -((n+1)/m - 1 - k)/τ )
                 ≤ Kτ exp( -t/(mτ) ),

where the first equality follows from the fact that S_{N+1} ≥ N + 1, the second from the definition of N, the third from the fact that T_1, T_2, ... are independent and T_2, T_3, ...
are i.i.d., and finally the second inequality from the fact that S_i ≠ S_j for i ≠ j (see the properties (P2) and (P3) of the split chain); the last inequality sums a geometric series.

Let us notice that if t > 2mτ log τ, then

    τ exp( -t/(mτ) ) ≤ exp( -t/(2mτ) ) ≤ exp( -t/(Kmτ log τ) ),

where in the last inequality we have used the fact that τ ≥ c for some universal constant c > 1. On the other hand, if t ≤ 2mτ log τ, then

    exp( -t/(2mτ log τ) ) ≥ 1/e,

so the asserted bound is trivial for K large enough. Therefore we obtain, for t ≥ 0,

    P( M_n > t ) ≤ K exp( -t/(Kmτ log τ) ),

which proves the first part of the lemma. Now,

    P( | Σ_{i=m(S_{N+1}+1)}^{n} f(X_i) | > t ) ≤ P( M_n > t/a ) ≤ K exp( -t/(Kamτ log τ) ).

Before we proceed with the proofs of Theorems 6 and 7, we would like to make some additional comments regarding our approach. As already mentioned, thanks to property (P3) of the split chain, we may apply to Z_1 + ... + Z_N the inequalities for sums of independent random variables obtained in Section 2 (since for m > 1 we have only one-dependence, we will split the sum, treating even and odd indices separately). The number of summands is random, but clearly not larger than n. Since the variables Z_i are equidistributed, we can reduce this random sum to a deterministic one by applying the following maximal inequality of Montgomery-Smith [19].

Lemma 4. Let Y_1, ..., Y_n be i.i.d. Banach space valued random variables. Then for some universal constant K and every t > 0,

    P( max_{k≤n} | Σ_{i=1}^k Y_i | > t ) ≤ K P( | Σ_{i=1}^n Y_i | > t/K ).

Remark. The use of regeneration methods makes our proof similar to the proof of the CLT for Markov chains. In this context, the above lemma can be viewed as a counterpart of the Anscombe theorem (they are quite different statements, but both are used to handle the random number of summands).
One could now apply Lemma 4 directly, using the fact that N ≤ n. Then, however, one would not get the asymptotic variance in the exponent (see the remark after Theorem 7). The form of this variance is a consequence of the aforementioned Anscombe theorem and the fact that by the LLN we have (denoting N = N_n to stress the dependence on n)

    lim_{n→∞} N_n/n = 1/(m ET_2)   a.s.        (20)

Therefore, to obtain an inequality which (at least up to universal constants and for m = 1) reflects the limiting behaviour of the variable Z, we will need a quantitative version of (20), given in the following lemma.

Lemma 5. If ||T_1||_{ψ_1}, ||T_2||_{ψ_1} ≤ τ, then

    P( N > 3n/(mET_2) ) ≤ K exp( -(1/K) nET_2/(mτ^2) ).

To prove the above estimate, we will use the classical Bernstein inequality (actually its version for ψ_1 variables).

Lemma 6 (Bernstein's ψ_1 inequality; see [25]). If Y_1, ..., Y_n are independent random variables such that EY_i = 0 and ||Y_i||_{ψ_1} ≤ τ, then for every t > 0,

    P( | Σ_{i=1}^n Y_i | > t ) ≤ 2 exp( -(1/K) min( t^2/(nτ^2), t/τ ) ).

Proof of Lemma 5. Assume first that n/(mET_2) ≥ 1. We have

    P( N > 3n/(mET_2) ) ≤ P( m(T_1 + ... + T_{⌊3n/(mET_2)⌋+1}) ≤ n )
      ≤ P( Σ_{i=2}^{⌊3n/(mET_2)⌋+1} T_i ≤ n/m )
      ≤ P( Σ_{i=2}^{⌊3n/(mET_2)⌋+1} (T_i - ET_2) ≤ n/m - ⌊3n/(mET_2)⌋ ET_2 )
      ≤ P( Σ_{i=2}^{⌊3n/(mET_2)⌋+1} (T_i - ET_2) ≤ n/m - 3n/(2m) )
      = P( Σ_{i=2}^{⌊3n/(mET_2)⌋+1} (T_i - ET_2) ≤ -n/(2m) ).

We have ||T_2 - ET_2||_{ψ_1} ≤ 2||T_2||_{ψ_1} ≤ 2τ; therefore Bernstein's inequality (Lemma 6)
gives

    P( N > 3n/(mET_2) ) ≤ 2 exp( -(1/K) min( (n/2m)^2 / ((3n/(mET_2)) τ^2), (n/2m)/τ ) )
      ≤ 2 exp( -(1/K) min( nET_2/(mτ^2), n/(mτ) ) )
      = 2 exp( -(1/K) nET_2/(mτ^2) ),

where the equality follows from the fact that ET_2 ≤ τ. If n/(mET_2) < 1, then also nET_2/(mτ^2) < 1, and thus the asserted estimate holds trivially. This proves the lemma.

We are now in a position to prove Theorem 6.

Proof of Theorem 6. Let us notice that |Z_i| ≤ amT_{i+1}, so for i ≥ 1, ||Z_i||_{ψ_1} ≤ am||T_2||_{ψ_1} ≤ amτ. Additionally, by (16), for i ≥ 1, EZ_i = 0. Denote now R = ⌊3n/(mET_2)⌋. Lemma 5, Lemma 4 and Theorem 4 (with α = 1), combined with Pisier's estimate (13), give

    P( |Z_1 + ... + Z_N| > 2t )        (21)
      ≤ P( |Z_1 + ... + Z_N| > 2t & N ≤ R ) + K exp( -(1/K) nET_2/(mτ^2) )
      ≤ P( |Z_1 + Z_3 + ... | > t & N ≤ R ) + P( |Z_2 + Z_4 + ...| > t & N ≤ R ) + K exp( -(1/K) nET_2/(mτ^2) )
      ≤ P( max_{k≤⌈R/2⌉} |Z_1 + Z_3 + ... + Z_{2k+1}| > t ) + P( max_{k≤⌊R/2⌋} |Z_2 + Z_4 + ... + Z_{2k}| > t ) + K exp( -(1/K) nET_2/(mτ^2) )
      ≤ KP( |Z_1 + Z_3 + ... + Z_{2⌈R/2⌉+1}| > t/K ) + KP( |Z_2 + Z_4 + ... + Z_{2⌊R/2⌋}| > t/K ) + K exp( -(1/K) nET_2/(mτ^2) )
      ≤ K exp( -(1/K) min( t^2/(n m (ET_2)^{-1} Var(Z_1)), t/(log(3n/(mET_2)) amτ) ) ) + K exp( -(1/K) nET_2/(mτ^2) ).
Combining the above estimate with (18), Lemma 2 and Lemma 3, we obtain

    P( |S_n| > 4t ) ≤ K exp( -(1/K) min( t^2/(n m (ET_2)^{-1} Var(Z_1)), t/(log(3n/(mET_2)) amτ) ) )
                      + K exp( -(1/K) nET_2/(mτ^2) ) + 2 exp( -t/(2amτ) ) + K exp( -t/(Kamτ log τ) ).

For t > na/4, the left-hand side of the above inequality is equal to 0; therefore, using the fact that ET_2 ≥ 1 and τ > 1, we obtain (17).

The proof of Theorem 7 is quite similar; however, it involves some additional technicalities related to the presence of EZ in our estimates.

Proof of Theorem 7. Let us first notice that, similarly as in the real valued case, we have

    P( sup_{f∈F} |Z_0(f)| ≥ t ) ≤ P( T_1 ≥ t/a ) ≤ 2 exp( -t/(aτ) );

moreover, Lemma 3 applied to the function x ↦ sup_{f∈F} |f(x)| gives

    P( sup_{f∈F} | Σ_{i=S_{N+1}+1}^{n} f(X_i) | > t ) ≤ K exp( -t/(Kaτ log τ) ).

One can also see that since we assume m = 1, the splitting of Z_1 + ... + Z_N into sums over even and odd indices is not necessary (by property (P3) of the split chain, the summands are independent). Using the fact that Lemma 4 is valid for Banach space valued variables, we can repeat the argument from the proof of Theorem 6 and obtain, for R = ⌊3n/ET_2⌋,

    P( Z ≥ K E sup_{f∈F} | Σ_{i=1}^{R} Z_i(f) | + t ) ≤ K exp( -(1/K) min( t^2/(nσ^2), t/(τ^2 a log n) ) ).

Thus, Theorem 7 will follow if we prove that

    E sup_{f∈F} | Σ_{i=1}^{R} Z_i(f) | ≤ K E sup_{f∈F} | Σ_{i=1}^{n} f(X_i) | + Kτ^3 a/ET_2        (22)

(recall that K may change from line to line).
From the triangle inequality, the fact that Y_i = (X_{S_i+1}, ..., X_{S_{i+1}}), i ≥ 1, are i.i.d. and Jensen's inequality, it follows that

    E sup_{f∈F} | Σ_{i=1}^{R} Z_i(f) | ≤ 12 E sup_{f∈F} | Σ_{i=1}^{⌊n/(4ET_2)⌋} Z_i(f) | + 12aτ,        (23)

where in the last inequality we used the fact that E sup_{f∈F} |Z_i(f)| ≤ E aT_{i+1} ≤ aτ. We will split the expectation on the right-hand side into two parts, depending on the size of the variable N. Let us first consider the quantity

    E sup_{f∈F} | Σ_{i=1}^{⌊n/(4ET_2)⌋} Z_i(f) | 1_{ {N < ⌊n/(4ET_2)⌋} }.

Assume that n/(4ET_2) ≥ 1. Then, using Bernstein's inequality, we obtain

    P( N < ⌊n/(4ET_2)⌋ ) ≤ P( T_1 + ... + T_{⌊n/(4ET_2)⌋+1} > n )
      ≤ P( T_1 > n/2 ) + P( Σ_{i=2}^{⌊n/(4ET_2)⌋+1} T_i > n/2 )
      ≤ 2e^{-n/(2τ)} + P( Σ_{i=2}^{⌊n/(4ET_2)⌋+1} (T_i - ET_2) > n/2 - ⌊n/(4ET_2)⌋ ET_2 )
      ≤ 2e^{-n/(2τ)} + P( Σ_{i=2}^{⌊n/(4ET_2)⌋+1} (T_i - ET_2) > n/4 )
      ≤ 2e^{-n/(2τ)} + 2 exp( -(1/K) min( nET_2/τ^2, n/τ ) ).

If n/(4ET_2) < 1, the above estimate holds trivially. Therefore

    E sup_{f∈F} | Σ_{i=1}^{⌊n/(4ET_2)⌋} Z_i(f) | 1_{ {N < ⌊n/(4ET_2)⌋} }
      ≤ a Σ_{i=1}^{⌊n/(4ET_2)⌋} E[ T_{i+1} 1_{ {N < ⌊n/(4ET_2)⌋} } ]
      ≤ a n ||T_2||_2 ( P( N < ⌊n/(4ET_2)⌋ ) )^{1/2}
      ≤ Kaτn e^{-nET_2/(Kτ^2)} ≤ Kaτ^3/ET_2.        (24)

Now we will bound the remaining part, i.e.

    E sup_{f∈F} | Σ_{i=1}^{⌊n/(4ET_2)⌋} Z_i(f) | 1_{ {N ≥ ⌊n/(4ET_2)⌋} }.
Recall that $Y_0 = (X_1,\dots,X_{T_1})$, $Y_i = (X_{S_i+1},\dots,X_{S_{i+1}})$ for $i \ge 1$, and consider the filtration $(\mathcal F_i)_{i\ge 0}$ defined as $\mathcal F_i = \sigma(Y_0,\dots,Y_i)$, where we regard the blocks $Y_i$ as random variables with values in the disjoint union $\bigcup_{i=1}^\infty S^i$, with the natural $\sigma$-field, i.e. the $\sigma$-field generated by $\bigcup_{i=1}^\infty \mathcal B^i$ (recall that $\mathcal B$ denotes our $\sigma$-field of reference in $S$). Let us further notice that $T_i$ is measurable with respect to $\sigma(Y_{i-1})$ for $i \ge 1$. We have, for $i \ge 1$, $\{N+1 \le i\} = \{T_1 + \dots + T_{i+1} > n\} \in \mathcal F_i$ and $\{N+1 \le 0\} = \emptyset$, so $N+1$ is a stopping time with respect to the filtration $(\mathcal F_i)$. Thus we have

$$\begin{aligned}
E\sup_{f\in\mathcal F}\Big|\sum_{i=1}^{n/4E\widetilde T_2} Z_i(f)\Big|\,1_{\{N \ge n/4E\widetilde T_2\}} &= E\sup_{f\in\mathcal F}\Big|\sum_{i=1}^{(n/4E\widetilde T_2)\wedge(N+1)} Z_i(f)\Big|\,1_{\{N+1 > n/4E\widetilde T_2\}}\\
&\le E\Big[E\Big[\sup_{f\in\mathcal F}\Big|\sum_{i=1}^{N+1} Z_i(f)\Big|\,\Big|\,\mathcal F_{(n/4E\widetilde T_2)\wedge(N+1)}\Big]1_{\{N+1 > n/4E\widetilde T_2\}}\Big]\\
&= E\sup_{f\in\mathcal F}\Big|\sum_{i=1}^{N+1} Z_i(f)\Big|\,1_{\{N+1 > n/4E\widetilde T_2\}}\\
&\le a\tau + E\sup_{f\in\mathcal F}\Big|\sum_{i=0}^{N+1} Z_i(f)\Big|\\
&\le a\tau + E\sup_{f\in\mathcal F}\Big|\sum_{i=1}^{n} f(X_i)\Big| + E\sup_{f\in\mathcal F}\Big|\sum_{i=n+1}^{S_{N+2}} f(X_i)\Big|,
\end{aligned}$$

where in the first inequality we used Doob's optional sampling theorem together with the fact that $\sup_{f\in\mathcal F}|\sum_{i=1}^{k} Z_i(f)|$ is a submartingale with respect to $(\mathcal F_i)$ (notice that $Z_i$ is measurable with respect to $\sigma(Y_i)$ for $i \in \mathbb N$ and $f \in \mathcal F$). The second equality follows from the fact that $\{N+1 > n/4E\widetilde T_2\} \in \mathcal F_{(n/4E\widetilde T_2)\wedge(N+1)}$. Indeed, for $i \ge n/4E\widetilde T_2$, we have
$$\{N+1 > n/4E\widetilde T_2\ \&\ (n/4E\widetilde T_2)\wedge(N+1) \le i\} = \{N+1 > n/4E\widetilde T_2\} \in \mathcal F_{n/4E\widetilde T_2} \subseteq \mathcal F_i,$$
whereas for $i < n/4E\widetilde T_2$ this set is empty.

Now, combining the above estimate with (23) and (24) and taking into account the inequality $\tau \ge E\widetilde T_2 \ge 1$, it is easy to see that to finish the proof of (22) it is enough to
show that
$$E\sup_{f\in\mathcal F}\Big|\sum_{i=n+1}^{S_{N+2}} f(X_i)\Big| \le Ka\tau^3/E\widetilde T_2. \tag{25}$$
This in turn will follow if we prove that $E(S_{N+2} - n) \le K\tau^3/E\widetilde T_2$. Recall now the first part of Lemma 3, stating (under our assumptions, $m = 1$) that
$$P(n - S_{N+1} > t) \le K\exp\Big(-\frac{t}{K\tau\log\tau}\Big) \quad\text{for } t \ge 0.$$
We have
$$\begin{aligned}
P(S_{N+2} - n > t) &\le P(n - S_{N+1} > t) + P(n - S_{N+1} \le t\ \&\ S_{N+2} - n > t\ \&\ N > 0) + P(S_{N+2} - n > t\ \&\ N = 0)\\
&\le Ke^{-t/K\tau\log\tau} + \sum_{k=0}^{t\wedge n} P(S_{N+1} = n-k\ \&\ T_{N+2} > t+k\ \&\ N > 0) + P(T_1 + T_2 > t)\\
&\le Ke^{-t/K\tau\log\tau} + \sum_{k=0}^{t\wedge n}\sum_{l=2}^{n-k} P(S_l = n-k\ \&\ T_{l+1} > t+k) + 2e^{-t/2\tau}\\
&\le Ke^{-t/K\tau\log\tau} + \sum_{k=0}^{t\wedge n}\sum_{l=2}^{n-k} P(S_l = n-k)P(T_{l+1} > t+k)\\
&\le Ke^{-t/K\tau\log\tau} + 2(t+1)e^{-t/\tau} \le Ke^{-t/K\tau\log\tau}.
\end{aligned}$$
This implies that $E(S_{N+2} - n) \le K\tau\log\tau \le K\tau^3/E\widetilde T_2$, which proves (25). Thus (22) is shown and Theorem 7 follows. $\square$

Remark. Two natural questions to ask in regard to Theorem 7 are, first, whether the constant $K$ in front of the expectation can be reduced to $1+\eta$ (as in Massart's Theorem 2 or in Theorem 4) and, second, whether one can reduce the constant $K$ in the Gaussian part to $2(1+\delta)$ as in Theorem 4.

3.3 Another counterexample

If we do not pay attention to constants, the main difference between the inequalities presented in the previous section and the classical Bernstein inequality for sums of i.i.d. bounded variables is the presence of the additional factor $\log n$. We would now like to argue that, under the assumptions of Theorems 6 and 7, this additional factor is indispensable. To be more precise, we will construct a Markov chain on a countable state space, satisfying the assumptions of Theorem 6 with $m = 1$ and such that for $\beta < 1$ there is
no constant $K$ such that
$$P(|f(X_1) + \dots + f(X_n)| \ge t) \le K\exp\Big(-\frac{1}{K}\min\Big(\frac{t^2}{n\operatorname{Var} Z_1(f)},\ \frac{t}{\log^\beta n}\Big)\Big) \tag{26}$$
for all $n$ and all functions $f\colon S \to \mathbb R$ with $\|f\|_\infty \le 1$ and $E_\pi f = 0$.

The state space of the chain will be the set
$$S = \{0\} \cup \bigcup_{n=1}^\infty \{n\}\times\{1,2,\dots,n\}\times\{+1,-1\}.$$
The transition probabilities are as follows:
$$p_{(n,k,s),(n,k+1,s)} = 1 \quad\text{for } n = 1,2,\dots,\ k = 1,2,\dots,n-1,\ s = -1,+1,$$
$$p_{(n,n,s),0} = 1 \quad\text{for } n = 1,2,\dots,\ s = -1,+1,$$
$$p_{0,(n,1,s)} = \frac{1}{2A}e^{-n} \quad\text{for } n = 1,2,\dots,\ s = -1,+1,$$
where $A = \sum_{n=1}^\infty e^{-n}$. In other words, whenever a particle is at 0, it chooses one of countably many loops and travels deterministically along it until the next return to 0. It is easy to check that this chain has a stationary distribution $\pi$, given by
$$\pi_0 = \frac{A}{A + \sum_{n=1}^\infty ne^{-n}}, \qquad \pi_{(n,i,s)} = \frac{1}{2A}e^{-n}\pi_0.$$
The chain satisfies the minorization condition (14) with $C = \{0\}$, $\nu(\{x\}) = p_{0,x}$, $\delta = 1$ and $m = 1$. The random variable $T_1$ is now just the time of the first visit to 0, and $T_2, T_3,\dots$ indicate the times between consecutive visits to 0. Moreover $P(T_2 = n) = e^{-n}/A$, so $\|T_2\|_{\psi_1} < \infty$. If we start the chain from the initial distribution $\nu$, then $T_1$ has the same law as $T_2$, so $\tau = \|T_2\|_{\psi_1} = \|T_1\|_{\psi_1}$.

Let us now assume that for some $\beta < 1$ there is a constant $K$ such that (26) holds. Since we work with a fixed chain, in what follows we will use the letter $K$ also to denote constants depending on our chain (the value of $K$ may again differ at different occurrences). We can in particular apply (26) to the function $f = f_r$, where $r$ is a large integer, given by the formula
$$f(0) = 0, \qquad f((n,i,s)) = s\,1_{\{n \ge r\}}.$$
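As a sanity check on the loop-chain construction above, the claimed stationary distribution can be verified numerically. The sketch below (an illustration only; variable names are ours, and the countable state space is truncated at loops of length 40, where the neglected mass $e^{-n}$ is negligible) checks that $\pi$ is a probability distribution and that the inflow into each state equals its $\pi$-mass:

```python
import math

N_MAX = 40  # truncate the countable state space; mass e^{-n} beyond this is negligible
A = sum(math.exp(-n) for n in range(1, N_MAX + 1))      # A = sum_n e^{-n}
B = sum(n * math.exp(-n) for n in range(1, N_MAX + 1))  # sum_n n e^{-n}

pi0 = A / (A + B)                 # pi(0) = A / (A + sum n e^{-n})
def pi(n, k, s):                  # pi((n, k, s)) = e^{-n} pi0 / (2A)
    return math.exp(-n) / (2 * A) * pi0

# normalization: pi sums to 1
total = pi0 + sum(pi(n, k, s)
                  for n in range(1, N_MAX + 1)
                  for k in range(1, n + 1) for s in (-1, 1))
assert abs(total - 1.0) < 1e-10

# stationarity at 0: inflow comes from all loop endpoints (n, n, s)
inflow_0 = sum(pi(n, n, s) for n in range(1, N_MAX + 1) for s in (-1, 1))
assert abs(inflow_0 - pi0) < 1e-10

# stationarity at (n, 1, s): entered from 0 with probability e^{-n}/(2A)
for n in range(1, N_MAX + 1):
    for s in (-1, 1):
        assert abs(pi0 * math.exp(-n) / (2 * A) - pi(n, 1, s)) < 1e-15
print("stationarity checks passed")
```

The checks for the states $(n,k,s)$ with $k \ge 2$ are analogous (each is entered deterministically from $(n,k-1,s)$, which carries the same mass).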
We have $E_\pi f_r = 0$. Moreover
$$\operatorname{Var} Z_1(f_r) = \sum_{n=r}^\infty \frac{n^2 e^{-n}}{A} \le Kr^2 e^{-r}.$$
Therefore (26) gives
$$P\big(|f_r(X_1) + \dots + f_r(X_n)| \ge Kre^{-r/2}\sqrt{nt} + t\log^\beta n\big) \le e^{-t} \tag{27}$$
for $t \ge 1$ and $n \in \mathbb N$.

Recall that $S_i = T_1 + \dots + T_i$. By Bernstein's inequality (Lemma 6), we have for large $n$,
$$\begin{aligned}
P(S_{n/3E\widetilde T_2} > n) &= P(T_1 + \dots + T_{n/3E\widetilde T_2} > n) = P\Big(\sum_{i=1}^{n/3E\widetilde T_2}(T_i - ET_i) > n - \frac{n}{3E\widetilde T_2}E\widetilde T_2\Big)\\
&\le P\Big(\sum_{i=1}^{n/3E\widetilde T_2}(T_i - ET_i) > n/2\Big) \le 2\exp\Big(-\frac{1}{K}\min\Big(\frac{n^2}{n\|T_2\|_{\psi_1}^2},\ \frac{n}{\|T_2\|_{\psi_1}}\Big)\Big) = 2e^{-n/K}.
\end{aligned}$$
From the above estimate, for some integer $L$ and $n$ large enough, divisible by $L$,
$$\begin{aligned}
&P\Big(\Big|\sum_{i=0}^{n/L} Z_i(f_r)\Big| \ge 2Kre^{-r/2}\sqrt{nt} + t\log^\beta n\Big)\\
&\quad\le 2e^{-n/K} + P\Big(\Big|\sum_{i=0}^{n/L} Z_i(f_r)\Big| \ge 2Kre^{-r/2}\sqrt{nt} + t\log^\beta n\ \&\ S_{n/L+1} \le n\Big)\\
&\quad\le 2e^{-n/K} + P\Big(\Big|\sum_{i=1}^{n} f_r(X_i)\Big| \ge Kre^{-r/2}\sqrt{nt} + t\log^\beta n\Big) + P\Big(\Big|\sum_{i=S_{n/L+1}+1}^{n} f_r(X_i)\Big| \ge Kre^{-r/2}\sqrt{nt} + t\log^\beta n\ \&\ S_{n/L+1} \le n\Big)\\
&\quad\le 2e^{-n/K} + e^{-t} + \sum_{k \le n} E\Big[1_{\{S_{n/L+1}=k\}}\,P_k\Big(\Big|\sum_{i=1}^{n-k} f_r(X_i)\Big| \ge Kre^{-r/2}\sqrt{nt} + t\log^\beta n\Big)\Big]\\
&\quad\le 2e^{-n/K} + e^{-t} + e^{-t}\sum_{k \le n} E\,1_{\{S_{n/L+1}=k\}} \le 2e^{-n/K} + 2e^{-t},
\end{aligned}$$
where in the third and fourth inequalities we used (27) and, in the conditioning step, the Markov property. For $n \ge r^2 e^r$ and $t \ge 1$, we obtain
$$P\big(|Z_0(f_r) + \dots + Z_{n/L}(f_r)| \ge Kt\log^\beta n\big) \le 2e^{-t} + 2e^{-n/K}. \tag{28}$$

On the other hand, we have $P(|Z_i(f_r)| \ge r) > \frac{1}{2A}e^{-r}$. Therefore
$$P\Big(\max_{i \le n/L} |Z_i(f_r)| > r\Big) \ge \frac{1}{2}\min\big(ne^{-r}/2AL,\ 1\big).$$
Since the variables $Z_i(f_r)$ are symmetric, by Lévy's inequality we get
$$2P\big(|Z_0(f_r) + \dots + Z_{n/L}(f_r)| \ge r\big) \ge \frac{1}{2}\min\big(ne^{-r}/2AL,\ 1\big),$$
and for $n \ge r^2 e^r$ (and $r$ large) the minimum equals 1, whereas (28) applied for $t = K^{-1}r/\log^\beta n \ge K^{-1}r^{1-\beta} \ge 1$ gives
$$P\big(|Z_0(f_r) + \dots + Z_{n/L}(f_r)| \ge r\big) \le 2e^{-r^{1-\beta}/K} + 2e^{-r^2e^r/K},$$
which gives a contradiction.
3.4 A bounded difference type inequality for symmetric functions

Now we will present an inequality for more general statistics of the chain. Under the same assumptions on the chain as above (with the additional restriction that $m = 1$), we will prove a version of the bounded difference inequality for symmetric functions (see e.g. [11] for the classical i.i.d. case). Let us consider a measurable function $f\colon S^n \to \mathbb R$ which is invariant under permutations of its arguments, i.e.
$$f(x_1,\dots,x_n) = f(x_{\sigma(1)},\dots,x_{\sigma(n)}) \tag{29}$$
for all permutations $\sigma$ of the set $\{1,\dots,n\}$. Let us also assume that $f$ is $L$-Lipschitz with respect to the Hamming distance, i.e.
$$|f(x_1,\dots,x_n) - f(y_1,\dots,y_n)| \le L\#\{i\colon x_i \neq y_i\}. \tag{30}$$
Then we have the following

Theorem 8. Let $X_1, X_2,\dots$ be a Markov chain with values in $S$, satisfying the Minorization condition with $m = 1$ and admitting a unique stationary distribution $\pi$. Assume also that $\|T_1\|_{\psi_1}, \|T_2\|_{\psi_1} \le \tau$. Then, for every function $f\colon S^n \to \mathbb R$ satisfying (29) and (30), we have
$$P\big(|f(X_1,\dots,X_n) - Ef(X_1,\dots,X_n)| \ge t\big) \le 2\exp\Big(-\frac{1}{K}\,\frac{t^2}{nL^2\tau^2}\Big)$$
for all $t \ge 0$.

To prove the above theorem, we will need the following

Lemma 7. Let $\varphi\colon \mathbb R \to \mathbb R$ be a convex function and $G = f(Y_1,\dots,Y_n)$, where $Y_1,\dots,Y_n$ are independent random variables with values in a measurable space $E$ and $f\colon E^n \to \mathbb R$ is a measurable function. Denote $G_i = f(Y_1,\dots,Y_{i-1},\widetilde Y_i,Y_{i+1},\dots,Y_n)$, where $\widetilde Y_1,\dots,\widetilde Y_n$ is an independent copy of $Y_1,\dots,Y_n$. Assume moreover that $|G - G_i| \le F_i(Y_i,\widetilde Y_i)$ for some functions $F_i\colon E^2 \to \mathbb R$, $i = 1,\dots,n$. Then
$$E\varphi(G - EG) \le E\varphi\Big(\sum_{i=1}^n \varepsilon_i F_i(Y_i,\widetilde Y_i)\Big), \tag{31}$$
where $\varepsilon_1,\dots,\varepsilon_n$ is a sequence of independent Rademacher variables, independent of $(Y_i)_{i=1}^n$ and $(\widetilde Y_i)_{i=1}^n$.
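The symmetrized bound (31) of Lemma 7 can be sanity-checked by exact enumeration on a toy example (this is an illustration, not part of the proof): take $n = 3$, the $Y_i$ uniform on $\{0,1\}$, $f = \max$, $\varphi(x) = x^2$ and $F_i(y,\tilde y) = |y - \tilde y|$, which dominates $|G - G_i|$ since $\max$ moves by at most $|y - \tilde y|$ when one coordinate changes:

```python
from itertools import product

def phi(x):          # a convex test function
    return x * x

n = 3
f = max              # |f(..., y, ...) - f(..., y', ...)| <= |y - y'|

bits = (0, 1)
# EG over the uniform distribution on {0,1}^n
EG = sum(f(y) for y in product(bits, repeat=n)) / 2**n

# left-hand side of (31): E phi(G - EG)
lhs = sum(phi(f(y) - EG) for y in product(bits, repeat=n)) / 2**n

# right-hand side: E phi(sum_i eps_i F_i(Y_i, Ytilde_i)) with F_i(y, yt) = |y - yt|,
# computed exactly by summing over all (Y, Ytilde, eps) configurations
rhs, count = 0.0, 0
for y in product(bits, repeat=n):
    for yt in product(bits, repeat=n):
        for eps in product((-1, 1), repeat=n):
            rhs += phi(sum(e * abs(a - b) for e, a, b in zip(eps, y, yt)))
            count += 1
rhs /= count

assert lhs <= rhs + 1e-12   # the inequality of Lemma 7
print(lhs, rhs)             # 0.109375 1.5
```

Here the left-hand side is just the variance of $\max(Y_1,Y_2,Y_3)$, namely $7/64$, while the right-hand side evaluates to $3/2$, so the bound holds with room to spare, as expected for a worst-case symmetrization.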
Proof. Induction with respect to $n$. For $n = 0$ the statement is obvious, since both the left-hand and the right-hand side of (31) equal $\varphi(0)$. Let us therefore assume that the lemma is true for $n - 1$. Then, denoting by $E_X$ integration with respect to the variable $X$,
$$\begin{aligned}
E\varphi(G - EG) &= E\varphi\big(G - E_{\widetilde Y_n}G_n + E_{Y_n}G - EG\big)\\
&\le E\varphi\big(G - G_n + E_{Y_n}G - EG\big) = E\varphi\big(G_n - G + E_{Y_n}G - EG\big)\\
&= E\varphi\big(\varepsilon_n(G - G_n) + E_{Y_n}G - EG\big) \le E\varphi\big(\varepsilon_n F_n(Y_n,\widetilde Y_n) + E_{Y_n}G - EG\big),
\end{aligned}$$
where the first equality uses $E_{\widetilde Y_n}G_n = E_{Y_n}G$, the first inequality is Jensen's inequality, the remaining equalities follow from symmetry, and the last inequality follows from the contraction principle (or simply the convexity of $\varphi$), applied conditionally on $(Y_i)_i$, $(\widetilde Y_i)_i$. Now, denoting $Z = E_{Y_n}G$, $Z_i = E_{Y_n}G_i$, we have for $i = 1,\dots,n-1$,
$$|Z - Z_i| = |E_{Y_n}G - E_{Y_n}G_i| \le E_{Y_n}|G - G_i| \le F_i(Y_i,\widetilde Y_i),$$
and thus, for fixed $Y_n$, $\widetilde Y_n$ and $\varepsilon_n$, we can apply the induction assumption to the function $t \mapsto \varphi(\varepsilon_n F_n(Y_n,\widetilde Y_n) + t)$ instead of $\varphi$ and $E_{Y_n}G$ instead of $G$, to obtain
$$E\varphi(G - EG) \le E\varphi\Big(\sum_{i=1}^n \varepsilon_i F_i(Y_i,\widetilde Y_i)\Big). \qquad\square$$

Lemma 8. In the setting of Lemma 7, if for all $i$ we have $\|F_i(Y_i,\widetilde Y_i)\|_{\psi_1} \le \tau$, then for all $t > 0$,
$$P\big(|f(Y_1,\dots,Y_n) - Ef(Y_1,\dots,Y_n)| \ge t\big) \le 2\exp\Big(-\frac{1}{K}\min\Big(\frac{t^2}{n\tau^2},\ \frac{t}{\tau}\Big)\Big).$$

Proof. For $p \ge 1$,
$$\|f(Y_1,\dots,Y_n) - Ef(Y_1,\dots,Y_n)\|_p \le \Big\|\sum_{i=1}^n \varepsilon_i F_i(Y_i,\widetilde Y_i)\Big\|_p \le K(\sqrt{pn}\,\tau + p\tau),$$
where the first inequality follows from Lemma 7 and the second one from Bernstein's inequality (Lemma 6) and integration by parts. Now, by Chebyshev's inequality, we get
$$P\big(|f(Y_1,\dots,Y_n) - Ef(Y_1,\dots,Y_n)| \ge K(\sqrt{tn}\,\tau + t\tau)\big) \le e^{-t}$$
for $t \ge 1$, which is, up to the constant in the exponent, equivalent to the statement of the lemma (note that if we can change the constant in the exponent, the choice of the constant in front of the exponential is arbitrary, provided it is bigger than 1). $\square$

Proof of Theorem 8. Consider the disjoint union
$$E = \bigcup_{i=1}^\infty S^i$$
and a function $g\colon E^n \to \mathbb R$ defined as
$$g(y_1,\dots,y_n) = f(x_1,\dots,x_n),$$
where the $x_i$'s are defined by the condition
$$y_1 = (x_1,\dots,x_{t_1}) \in S^{t_1},\quad y_2 = (x_{t_1+1},\dots,x_{t_1+t_2}) \in S^{t_2},\quad \dots,\quad y_n = (x_{t_1+\dots+t_{n-1}+1},\dots,x_{t_1+\dots+t_n}) \in S^{t_n} \tag{32}$$
(note that $t_1 + \dots + t_n \ge n$, so only the first $n$ coordinates $x_1,\dots,x_n$ enter $f$).

Let now $T_1,\dots,T_n$ be the regeneration times of the chain and set $Y_i = (X_{T_1+\dots+T_{i-1}+1},\dots,X_{T_1+\dots+T_i})$ for $i = 1,\dots,n$ (we change the enumeration with respect to the previous sections, but there is no longer a need to distinguish the initial block). Then $Y_1,\dots,Y_n$ are independent $E$-valued random variables (recall the assumption $m = 1$). Moreover, we have $f(X_1,\dots,X_n) = g(Y_1,\dots,Y_n)$.

Let now $\widetilde Y_1,\dots,\widetilde Y_n$ be an independent copy of the sequence $Y_1,\dots,Y_n$. Define $G$ and $G_i$ as in Lemma 7 (for the function $g$). Define also $\widetilde T_i = j$ iff $\widetilde Y_i \in S^j$, and let $\widetilde X_{i,1},\dots,\widetilde X_{i,\,T_1+\dots+T_{i-1}+\widetilde T_i+T_{i+1}+\dots+T_n}$ correspond to $Y_1,\dots,Y_{i-1},\widetilde Y_i,Y_{i+1},\dots,Y_n$ in the same way as in (32). Let us notice that we can rearrange the sequence $\widetilde X_{i,1},\dots,\widetilde X_{i,n}$ in such a way that the Hamming distance of the new sequence from $X_1,\dots,X_n$ will not exceed $\max(T_i,\widetilde T_i)$. Since the function $f$ is invariant under permutations of its arguments and $L$-Lipschitz with respect to the Hamming distance, we have
$$|G - G_i| \le L\max(T_i,\widetilde T_i) =: F_i(Y_i,\widetilde Y_i).$$
Moreover $\|F_i(Y_i,\widetilde Y_i)\|_{\psi_1} \le 2L\tau$, so by Lemma 8 we obtain
$$P\big(|f(X_1,\dots,X_n) - Ef(X_1,\dots,X_n)| \ge t\big) = P\big(|g(Y_1,\dots,Y_n) - Eg(Y_1,\dots,Y_n)| \ge t\big) \le 2\exp\Big(-\frac{1}{K}\min\Big(\frac{t^2}{nL^2\tau^2},\ \frac{t}{L\tau}\Big)\Big).$$
But from Jensen's inequality and (30) it follows that $|f(X_1,\dots,X_n) - Ef(X_1,\dots,X_n)| \le Ln$; thus for $t > Ln$ the left-hand side of the above inequality is equal to 0, whereas for $t \le Ln$ the inequality $\tau > 1$ gives
$$\frac{t^2}{nL^2\tau^2} \le \frac{t}{L\tau},$$
which proves the theorem. $\square$
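The block decomposition used in the proof above can be illustrated with a short sketch (helper names are ours): a trajectory is cut into regeneration blocks of lengths $T_i$, the blocks concatenate back to the trajectory, and replacing one block by an independent copy of the same length changes at most $\max(T_i,\widetilde T_i)$ coordinates, which is what feeds the Hamming-Lipschitz condition:

```python
def split_blocks(xs, lengths):
    """Cut xs into consecutive blocks of the given lengths (the blocks Y_i)."""
    blocks, pos = [], 0
    for t in lengths:
        blocks.append(xs[pos:pos + t])
        pos += t
    assert pos == len(xs)
    return blocks

def hamming(a, b):
    """Hamming distance between two sequences of equal length."""
    return sum(x != y for x, y in zip(a, b))

xs = list("abcdefgh")                 # a trajectory X_1, ..., X_8
blocks = split_blocks(xs, [3, 2, 3])  # regeneration times T = (3, 2, 3)
assert [x for b in blocks for x in b] == xs   # blocks concatenate back to xs

# replace the middle block by an independent copy of the same length:
new = blocks[0] + list("XY") + blocks[2]
assert hamming(xs, new) <= max(2, 2)  # at most max(T_i, T~_i) coordinates differ
print(hamming(xs, new))               # 2
```

In the proof the replacement block may of course have a different length, which is why a rearrangement (permitted by the symmetry of $f$) is needed before the Hamming bound applies; the equal-length case shown here is the simplest instance.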
3.5 A few words on connections with other results

First we would like to comment on the assumptions of our main theorems concerning Markov chains. We assume that the Orlicz norms $\|T_1\|_{\psi_1}$ and $\|T_2\|_{\psi_1}$ are finite, which is equivalent to the existence of a number $\kappa > 1$ such that $E_\xi \kappa^{T_1} < \infty$, $E_\nu \kappa^{T_1} < \infty$, where $\xi$ is the initial distribution of the chain and $\nu$ the minorizing measure from condition (14). This is true for instance if $m = 1$ and the chain satisfies the drift condition, i.e. if there is a measurable function $V\colon S \to [1,\infty)$, together with constants $\lambda < 1$ and $K < \infty$, such that
$$PV(x) = \int_S V(y)P(x,dy) \le \begin{cases} \lambda V(x) & \text{for } x \notin C,\\ K & \text{for } x \in C, \end{cases}$$
and $V$ is $\xi$- and $\nu$-integrable (see e.g. [1], Propositions 4.1 and 4.4; see also [22], [18]). For $m > 1$ one can similarly consider the kernel $P^m$ instead of $P$; however, in this case our inequalities are restricted to averages of real valued functions, as in Theorem 6. Such drift conditions have gained considerable attention in the Markov Chain Monte Carlo theory, as they imply geometric ergodicity of the chain.

Concentration of measure inequalities for general functions of Markov chains were investigated by Marton [13], Samson [23] and, more recently, by Kontorovich and Ramanan [8]. They actually consider more general mixing processes and give estimates on the deviation of a random variable from its mean or median in terms of mixing coefficients. When specialized to Markov chains, their estimates yield inequalities in the spirit of Theorem 8 for general (not necessarily symmetric) functions of uniformly ergodic Markov chains (see [18], Chapter 16, for the definition). To obtain their results, Marton and Samson used transportation inequalities, whereas Kontorovich's and Ramanan's approach was based on martingales. In all cases the bounds involve sums of expressions of the form
$$\sup_{x,y\in S} \|P^i(x,\cdot) - P^i(y,\cdot)\|_{TV},$$
where $P^i$ is the $i$-step transition function of the chain.
These results are not well suited for Markov chains which are not uniformly ergodic (like the chain in Section 3.3), since for such chains the summands are bounded from below by a constant, which spoils the dependence on $n$ in the estimates. It would be interesting to know if, in results of this type, the supremum of the total variation distances can be replaced by some other norm, for instance a kind of average. This would allow one to extend the estimates to some classes of non-uniformly ergodic Markov chains.

Inequalities of the bounded difference type for sums $f(X_1) + \dots + f(X_n)$, where the $X_i$'s form a uniformly ergodic Markov chain, were also obtained by Glynn and Ormoneit [6]. Their method was to analyze the Poisson equation associated with the chain. Their result has been complemented by an information theoretic approach in Kontoyiannis et al. [9].
Estimates for sums in terms of the variance appeared in the work by Samson [23], who presents a result for empirical processes of uniformly ergodic chains. He gives a real concentration inequality around the mean, and not just a tail bound as in Theorem 7. The coefficient responsible for the subgaussian behavior of the tail is $E\sum_{i=1}^n \sup_{f\in\mathcal F} f(X_i)^2$. Replacing it with $V = E\sup_{f\in\mathcal F}\sum_i f(X_i)^2$ (which would correspond to the original Talagrand inequality) is stated in Samson's work as an open problem, which to the best of our knowledge has not yet been solved. Additionally, in Samson's estimate there is no $\log n$ factor, which is present in Theorems 6 and 7. Since we have shown that in our setting this factor is indispensable, we would like to comment on the differences between Samson's results and ours.

Obviously, the first difference is the setting. Although non-uniformly ergodic chains may satisfy our assumptions $\|T_1\|_{\psi_1}, \|T_2\|_{\psi_1} < \infty$, the Minorization condition need not hold for them with $m = 1$, which restricts our results to linear statistics of the chain (Theorem 6). However, there are many examples of non-uniformly ergodic chains for which one cannot apply Samson's result but which satisfy our assumptions. Such chains have been considered in the MCMC theory.

When specialized to sums of real variables, Samson's result can be considered a counterpart of the Bernstein inequality, valid for uniformly ergodic Markov chains. The subgaussian part of the estimate is controlled by $\sum_{i=1}^n Ef(X_i)^2$, which can be much bigger than the asymptotic variance and therefore does not reflect the limiting behaviour of $f(X_1) + \dots + f(X_n)$. Consider for instance a chain consisting of the origin connected with finitely many loops, in which, similarly as in the example from Section 3.3, the randomness appears only at the origin (i.e. after the choice of the loop, the particle travels along it deterministically until the next return to the origin).
Then one can easily construct a function $f$ with values in $\{\pm 1\}$, centered with respect to the stationary distribution and such that its asymptotic variance is equal to zero, whereas $\sum_{i=1}^n Ef(X_i)^2 = n$ for all $n$ (this happens for instance if the sum of the values of $f$ along each loop vanishes). In consequence, $n^{-1/2}(f(X_1) + \dots + f(X_n))$ converges weakly to the Dirac mass at 0 and we have $P(|f(X_1) + \dots + f(X_n)| \ge \sqrt{n}\,t) \to 0$ for all $t > 0$, which is not recovered by Samson's estimate. One can also construct other examples of a similar flavour, in which the asymptotic variance is nonzero but still much smaller than $E\sum_{i=1}^n f(X_i)^2$.

On the other hand, Samson's results do not require the condition $E_\pi f = 0$ and, as already mentioned, in the case of empirical processes they provide a two-sided concentration around the mean. As for the $\log n$ factor, at present we do not know if, at the cost of replacing the asymptotic variance with $\sum_{i=1}^n Ef(X_i)^2$, one can eliminate it in our setting.

Summarizing, our inequalities, when compared to known results, have both advantages and disadvantages. On the one hand, when specialized to uniformly ergodic
Markov chains, they do not recover the full generality or strength of previous estimates (for instance, Theorem 8 is restricted to symmetric statistics and $m = 1$); on the other hand, they may be applied to Markov chains arising in statistical applications which are not uniformly ergodic and are therefore beyond the scope of the estimates presented above. Another property which, in our opinion, makes the estimates of Theorems 6 and 7 interesting, at least from the theoretical point of view, is the fact that for $m = 1$ the coefficient responsible for the Gaussian level of concentration corresponds to the variance of the limiting Gaussian distribution.

Acknowledgements

The author would like to thank Witold Bednorz and Krzysztof Łatuszyński for their useful comments concerning the results presented in this article, as well as the anonymous Referee, whose remarks helped improve their presentation.

References

[1] Baxendale P. H. Renewal theory and computable convergence rates for geometrically ergodic Markov chains. Ann. Appl. Probab. 15 (2005), no. 1B.

[2] Boucheron S., Bousquet O., Lugosi G., Massart P. Moment inequalities for functions of independent random variables. Ann. Probab. 33 (2005), no. 2.

[3] Bousquet O. A Bennett concentration inequality and its application to suprema of empirical processes. C. R. Math. Acad. Sci. Paris 334 (2002), no. 6.

[4] Einmahl U., Li D. Characterization of LIL behavior in Banach space. To appear in Trans. Amer. Math. Soc.

[5] Giné E., Latała R., Zinn J. Exponential and moment inequalities for U-statistics. In High Dimensional Probability II, Progr. Probab. 47, Birkhäuser, Boston, MA, 2000.

[6] Glynn P. W., Ormoneit D. Hoeffding's inequality for uniformly ergodic Markov chains. Statist. Probab. Lett. 56 (2002), no. 2.

[7] Klein T., Rio E. Concentration around the mean for maxima of empirical processes. Ann. Probab. 33 (2005), no. 3.

[8] Kontorovich L., Ramanan K. Concentration Inequalities for Dependent Random Variables via the Martingale Method. To appear in Ann. Probab.
[9] Kontoyiannis I., Lastras-Montaño L., Meyn S. P. Relative entropy and exponential deviation bounds for general Markov chains. In Proceedings of the IEEE International Symposium on Information Theory, 2005.
9.8 Graphing Rational Functions Lets begin with a deinition. Deinition: Rational Function A rational unction is a unction o the orm ( ) ( ) ( ) P where P and Q are polynomials. Q An eample o a simple rational
More informationRATIONAL FUNCTIONS. Finding Asymptotes..347 The Domain Finding Intercepts Graphing Rational Functions
RATIONAL FUNCTIONS Finding Asymptotes..347 The Domain....350 Finding Intercepts.....35 Graphing Rational Functions... 35 345 Objectives The ollowing is a list o objectives or this section o the workbook.
More informationNotes 15 : UI Martingales
Notes 15 : UI Martingales Math 733 - Fall 2013 Lecturer: Sebastien Roch References: [Wil91, Chapter 13, 14], [Dur10, Section 5.5, 5.6, 5.7]. 1 Uniform Integrability We give a characterization of L 1 convergence.
More information14.1 Finding frequent elements in stream
Chapter 14 Streaming Data Model 14.1 Finding frequent elements in stream A very useful statistics for many applications is to keep track of elements that occur more frequently. It can come in many flavours
More informationNotes 6 : First and second moment methods
Notes 6 : First and second moment methods Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Roc, Sections 2.1-2.3]. Recall: THM 6.1 (Markov s inequality) Let X be a non-negative
More informationn! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2
Order statistics Ex. 4. (*. Let independent variables X,..., X n have U(0, distribution. Show that for every x (0,, we have P ( X ( < x and P ( X (n > x as n. Ex. 4.2 (**. By using induction or otherwise,
More informationRuelle Operator for Continuous Potentials and DLR-Gibbs Measures
Ruelle Operator or Continuous Potentials and DLR-Gibbs Measures arxiv:1608.03881v4 [math.ds] 22 Apr 2018 Leandro Cioletti Departamento de Matemática - UnB 70910-900, Brasília, Brazil cioletti@mat.unb.br
More informationarxiv: v1 [math.oc] 1 Apr 2012
arxiv.tex; October 31, 2018 On the extreme points o moments sets arxiv:1204.0249v1 [math.oc] 1 Apr 2012 Iosi Pinelis Department o Mathematical ciences Michigan Technological University Houghton, Michigan
More informationSOME CONVERSE LIMIT THEOREMS FOR EXCHANGEABLE BOOTSTRAPS
SOME CONVERSE LIMIT THEOREMS OR EXCHANGEABLE BOOTSTRAPS Jon A. Wellner University of Washington The bootstrap Glivenko-Cantelli and bootstrap Donsker theorems of Giné and Zinn (990) contain both necessary
More informationLecture 1 Measure concentration
CSE 29: Learning Theory Fall 2006 Lecture Measure concentration Lecturer: Sanjoy Dasgupta Scribe: Nakul Verma, Aaron Arvey, and Paul Ruvolo. Concentration of measure: examples We start with some examples
More informationHOPF S DECOMPOSITION AND RECURRENT SEMIGROUPS. Josef Teichmann
HOPF S DECOMPOSITION AND RECURRENT SEMIGROUPS Josef Teichmann Abstract. Some results of ergodic theory are generalized in the setting of Banach lattices, namely Hopf s maximal ergodic inequality and the
More informationA note on the convex infimum convolution inequality
A note on the convex infimum convolution inequality Naomi Feldheim, Arnaud Marsiglietti, Piotr Nayar, Jing Wang Abstract We characterize the symmetric measures which satisfy the one dimensional convex
More informationLecture 17 Brownian motion as a Markov process
Lecture 17: Brownian motion as a Markov process 1 of 14 Course: Theory of Probability II Term: Spring 2015 Instructor: Gordan Zitkovic Lecture 17 Brownian motion as a Markov process Brownian motion is
More informationSections of Convex Bodies via the Combinatorial Dimension
Sections of Convex Bodies via the Combinatorial Dimension (Rough notes - no proofs) These notes are centered at one abstract result in combinatorial geometry, which gives a coordinate approach to several
More informationMAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9
MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended
More informationChapter 7. Markov chain background. 7.1 Finite state space
Chapter 7 Markov chain background A stochastic process is a family of random variables {X t } indexed by a varaible t which we will think of as time. Time can be discrete or continuous. We will only consider
More informationWeb Appendix for The Value of Switching Costs
Web Appendix or The Value o Switching Costs Gary Biglaiser University o North Carolina, Chapel Hill Jacques Crémer Toulouse School o Economics (GREMAQ, CNRS and IDEI) Gergely Dobos Gazdasági Versenyhivatal
More information1 Relative degree and local normal forms
THE ZERO DYNAMICS OF A NONLINEAR SYSTEM 1 Relative degree and local normal orms The purpose o this Section is to show how single-input single-output nonlinear systems can be locally given, by means o a
More informationOn Picard value problem of some difference polynomials
Arab J Math 018 7:7 37 https://doiorg/101007/s40065-017-0189-x Arabian Journal o Mathematics Zinelâabidine Latreuch Benharrat Belaïdi On Picard value problem o some dierence polynomials Received: 4 April
More informationk-protected VERTICES IN BINARY SEARCH TREES
k-protected VERTICES IN BINARY SEARCH TREES MIKLÓS BÓNA Abstract. We show that for every k, the probability that a randomly selected vertex of a random binary search tree on n nodes is at distance k from
More informationRigorous pointwise approximations for invariant densities of non-uniformly expanding maps
Ergod. Th. & Dynam. Sys. 5, 35, 8 44 c Cambridge University Press, 4 doi:.7/etds.3.9 Rigorous pointwise approximations or invariant densities o non-uniormly expanding maps WAEL BAHSOUN, CHRISTOPHER BOSE
More informationDiscrete Mathematics. On the number of graphs with a given endomorphism monoid
Discrete Mathematics 30 00 376 384 Contents lists available at ScienceDirect Discrete Mathematics journal homepage: www.elsevier.com/locate/disc On the number o graphs with a given endomorphism monoid
More informationLARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011
LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS S. G. Bobkov and F. L. Nazarov September 25, 20 Abstract We study large deviations of linear functionals on an isotropic
More informationLecture 4 Lebesgue spaces and inequalities
Lecture 4: Lebesgue spaces and inequalities 1 of 10 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 4 Lebesgue spaces and inequalities Lebesgue spaces We have seen how
More informationarxiv: v1 [math.gt] 20 Feb 2008
EQUIVALENCE OF REAL MILNOR S FIBRATIONS FOR QUASI HOMOGENEOUS SINGULARITIES arxiv:0802.2746v1 [math.gt] 20 Feb 2008 ARAÚJO DOS SANTOS, R. Abstract. We are going to use the Euler s vector ields in order
More information1. Stochastic Processes and filtrations
1. Stochastic Processes and 1. Stoch. pr., A stochastic process (X t ) t T is a collection of random variables on (Ω, F) with values in a measurable space (S, S), i.e., for all t, In our case X t : Ω S
More informationBrownian Motion. 1 Definition Brownian Motion Wiener measure... 3
Brownian Motion Contents 1 Definition 2 1.1 Brownian Motion................................. 2 1.2 Wiener measure.................................. 3 2 Construction 4 2.1 Gaussian process.................................
More informationFeedback Linearization
Feedback Linearization Peter Al Hokayem and Eduardo Gallestey May 14, 2015 1 Introduction Consider a class o single-input-single-output (SISO) nonlinear systems o the orm ẋ = (x) + g(x)u (1) y = h(x) (2)
More informationAsymptotically Efficient Estimation of the Derivative of the Invariant Density
Statistical Inerence or Stochastic Processes 6: 89 17, 3. 3 Kluwer cademic Publishers. Printed in the Netherlands. 89 symptotically Eicient Estimation o the Derivative o the Invariant Density NK S. DLLYN
More informationIndependence of some multiple Poisson stochastic integrals with variable-sign kernels
Independence of some multiple Poisson stochastic integrals with variable-sign kernels Nicolas Privault Division of Mathematical Sciences School of Physical and Mathematical Sciences Nanyang Technological
More informationChapter 1. Poisson processes. 1.1 Definitions
Chapter 1 Poisson processes 1.1 Definitions Let (, F, P) be a probability space. A filtration is a collection of -fields F t contained in F such that F s F t whenever s
More informationNON-AUTONOMOUS INHOMOGENEOUS BOUNDARY CAUCHY PROBLEMS AND RETARDED EQUATIONS. M. Filali and M. Moussi
Electronic Journal: Southwest Journal o Pure and Applied Mathematics Internet: http://rattler.cameron.edu/swjpam.html ISSN 83-464 Issue 2, December, 23, pp. 26 35. Submitted: December 24, 22. Published:
More informationNotes on Gaussian processes and majorizing measures
Notes on Gaussian processes and majorizing measures James R. Lee 1 Gaussian processes Consider a Gaussian process {X t } for some index set T. This is a collection of jointly Gaussian random variables,
More informationScattered Data Approximation of Noisy Data via Iterated Moving Least Squares
Scattered Data Approximation o Noisy Data via Iterated Moving Least Squares Gregory E. Fasshauer and Jack G. Zhang Abstract. In this paper we ocus on two methods or multivariate approximation problems
More informationCHOW S LEMMA. Matthew Emerton
CHOW LEMMA Matthew Emerton The aim o this note is to prove the ollowing orm o Chow s Lemma: uppose that : is a separated inite type morphism o Noetherian schemes. Then (or some suiciently large n) there
More informationIntroduction to Dynamical Systems
Introduction to Dynamical Systems France-Kosovo Undergraduate Research School of Mathematics March 2017 This introduction to dynamical systems was a course given at the march 2017 edition of the France
More informationComplexity and Algorithms for Two-Stage Flexible Flowshop Scheduling with Availability Constraints
Complexity and Algorithms or Two-Stage Flexible Flowshop Scheduling with Availability Constraints Jinxing Xie, Xijun Wang Department o Mathematical Sciences, Tsinghua University, Beijing 100084, China
More informationSTAT 200C: High-dimensional Statistics
STAT 200C: High-dimensional Statistics Arash A. Amini April 27, 2018 1 / 80 Classical case: n d. Asymptotic assumption: d is fixed and n. Basic tools: LLN and CLT. High-dimensional setting: n d, e.g. n/d
More informationWiener Measure and Brownian Motion
Chapter 16 Wiener Measure and Brownian Motion Diffusion of particles is a product of their apparently random motion. The density u(t, x) of diffusing particles satisfies the diffusion equation (16.1) u
More informationDIFFERENTIAL POLYNOMIALS GENERATED BY SECOND ORDER LINEAR DIFFERENTIAL EQUATIONS
Journal o Applied Analysis Vol. 14, No. 2 2008, pp. 259 271 DIFFERENTIAL POLYNOMIALS GENERATED BY SECOND ORDER LINEAR DIFFERENTIAL EQUATIONS B. BELAÏDI and A. EL FARISSI Received December 5, 2007 and,
More informationConcentration Inequalities for Dependent Random Variables via the Martingale Method
Concentration Inequalities for Dependent Random Variables via the Martingale Method Leonid Kontorovich, and Kavita Ramanan Carnegie Mellon University L. Kontorovich School of Computer Science Carnegie
More informationGeneralization theory
Generalization theory Daniel Hsu Columbia TRIPODS Bootcamp 1 Motivation 2 Support vector machines X = R d, Y = { 1, +1}. Return solution ŵ R d to following optimization problem: λ min w R d 2 w 2 2 + 1
More informationCentral Limit Theorems and Proofs
Math/Stat 394, Winter 0 F.W. Scholz Central Limit Theorems and Proos The ollowin ives a sel-contained treatment o the central limit theorem (CLT). It is based on Lindeber s (9) method. To state the CLT
More informationAn almost sure invariance principle for additive functionals of Markov chains
Statistics and Probability Letters 78 2008 854 860 www.elsevier.com/locate/stapro An almost sure invariance principle for additive functionals of Markov chains F. Rassoul-Agha a, T. Seppäläinen b, a Department
More informationStrong Lyapunov Functions for Systems Satisfying the Conditions of La Salle
06 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 6, JUNE 004 Strong Lyapunov Functions or Systems Satisying the Conditions o La Salle Frédéric Mazenc and Dragan Ne sić Abstract We present a construction
More information