Convergence Rates of Nonparametric Posterior Distributions

Size: px
Start display at page:

Download "Convergence Rates of Nonparametric Posterior Distributions"

Transcription

1 Convergence Rates of Nonparametric Posterior Distributions Yang Xing arxiv: v1 [math.st] 17 Apr 008 Swedish University of Agricultural Sciences We study the asymptotic behavior of posterior distributions. We present general posterior convergence rate theorems, which extend several results on posterior convergence rates provided by Ghosal and Van der Vaart 000, Shen and Wasserman 001 and Walker, Lijor and Prunster 007. Our main tools are the Hausdorff α-entropy introduced by Xing and Ranneby 008 and a new notion of prior concentration, which is a slight improvement of the usual prior concentration provided by Ghosal and Van der Vaart 000. We apply our results to several statistical models. 1. Introduction. Recently, a major theoretical advance has occurred in the theory of Bayesian consistency for infinite-dimensional models. Schwartz 1965 first proved that, if the true density function is in the Kullback-Leibler support of the prior distribution, then the sequence of posterior distributions accumulates in all weak neighborhoods of the true density function. It is known that the condition of positivity of prior mass on each Kullback-Leibler neighborhood in Schwartz s theorem is not a necessary condition. When one considers problems of density estimation, it is natural to ask for the strong consistency of Bayesian procedures. Sufficient conditions for the strong Hellinger consistency and for evaluating consistency rates have been currently developed by many authors. In this paper we study the problem of determining whether the posterior distributions accumulate in Hellinger neighborhoods of the true density function. The rate of this convergence can be measured by the size of the smallest shrinking Hellinger balls around the true density function on which posterior masses tend to zero as the sample size increases to infinity. By the fundamental works of Ghosal, Ghosh and Van der Vaart 000 and Shen and Wasserman 001, we know that the convergence rate of posterior distributions is completely determined by two quantities: the rate of the metric entropy and the prior concentration rate. Roughly speaking, the rate of the metric entropy describes how large the model is, and the prior concentration rate depends on prior masses near the true distribution. Since the true distribution is unknown, the later assumption actually requires that the prior AMS 000 Subject Classifications. 6G07, 6G0, 6F15. Key words. Hellinger consistency, posterior distribution, rate of convergence, sieve, infinite-dimensional model. 1

2 distribution spreads its mass uniformly over the whole density space. Another elegant approach for determination of the convergence rate was provided by Walker 004, who obtained a sufficient condition for strong consistency by using summability of square root of prior probability instead of the metric entropy method. In this paper, in dealing with the rate of metric entropies we shall apply the Hausdorff α-entropy introduced by Xing and Ranneby 008, which is much smaller than widely used metric entropies and the bracketing entropy. For some important prior distributions of statistical models the Hausdorff α-entropies of all sieves are uniformly bounded, whereas it is generally impossible to get uniform boundedness of metric entropies of large sieves. The application of the Hausdorff α-entropy leads refinements of several theorems on posterior convergence rates, for instance, the well known assumptions on metric entropies and summability of square root of prior probability have been weakened, which particularly yields that Theorem 5 of Ghosal et al.007b is strengthened into Corollary 3 of this paper. To handle the prior concentration rate, we shall apply a new notion of prior concentration. Our approach is a slight improvement of the prior concentration provided by Ghosal et al.000, and moreover the proof of Lemma 1 on which the approach bases is quite simple. Finally, to get posterior convergence at the optimal rate 1/ n, we give an extension of Ghosal et al.000, Theorem.4, in which the universal testing constant has been replaced by any fixed constant. An outline of this paper is as follows. In Section we define the Hausdorff α-entropy with respect to a given prior and then present general theorems for the determination of posterior convergence rates. We also give a new approach to compute concentration rates. In Section 3 we apply our results to Bernstein polynomial priors, priors based on uniform distribution, log spline models and finite-dimensional models, which leads some improvements on known results for these models. The proofs of the main results are contained in Section 4.. Notations and Theorems. We consider a family of probability measures dominated by a σ-finite measure µ in X, a Polish space endowed with a σ-algebra X. Let X 1,X,...,X n stand for an independent identically distributed i.i.d. sample of n random variables, taking values in X and having a common probability density function f 0 with respect to the measure µ. Denote by F0 the infinite product distribution of the probability distribution F 0 associated with f 0. For two probability densities f and g µdx 1/ we denote the Hellinger distance Hf,g = X fx gx and the Kullback-Leibler divergence Kf,g = X µdx. Assume that the space F of probability density functions is separable with respect to the Hellinger metric and that F is the Borel σ-algebra of F. Given a prior distribution Π on F, the posterior distribution Π n is a random probability measure with the following expression fxlog fx gx Π n A = Π A X1,X,...,X n = n fx i Πdf A i=1 F n fx i Πdf i=1 = AR nfπdf F R nfπdf

3 forallmeasurablesubsetsa F, wherer n f = n { fxi /f 0 X i isthelikelihoodratio. In other words, the posterior distribution Π n is the conditional distribution of Π given the observations X 1,X,...,X n. If the posterior distribution Π n concentrates on arbitrarily small neighborhoods of the true density function f 0 almost surely or in probability, then it is said to be consistent at f 0 almost surely and in probability respectively. Throughout this paper, almost sure convergence and convergence in probability should be understood as to be with respect to the infinite product distribution F 0 of F 0. Our aim of this article is to present general theorems on posterior convergence rates at f 0. By the posterior convergence rate theorems of Ghosal, Ghosh and Van der Vaart 000, we know that the prior concentration rate and the rate of metric entropy both completely determine the convergence rate of posterior distributions. More specifically, a key inequality to determine almost sure convergence rates of posterior distributions is that for each ε > 0, F R nfπdf e 3nε i=1 Π f : Hf 0,f f 0 /f < ε almost surely for all sufficiently large n, where g stands for the supremum norm of the function g on X. This inequality was obtained by Ghosal et al.000, Lemma 8.4 under mild assumptions. It appears almost in all of papers handling strong convergence rates of posterior distributions. The reason is that in order to get the convergence rate of posterior distributions one needs to find a suitable lower bound for the denominator in the expression of posterior distributions. This is successfully done in Ghosal et al.000, who suggested that the prior Π puts sufficiently amount of mass around the true density function f 0 in the sense: Π f : Hf 0,f f 0 /f < ε n e n ε n c for some fixed constant c. Such a sequence { ε n is referred to as the concentration rate of the prior Π around f 0. Here we give a slightly stronger result. We introduce a modification of the Hellinger distance H f 0,f = f0 x fx f 0 x X 3 fx µdx. 3 Itisclearthattheinequality f 0 /f 1holdsforalldensityfunctionsf andf 0 suchthat the supremum is well-defined, and the quality holds if and only if f = f 0 almost surely. Observe also that H f 0,f H f,f 0 and 3 1/ Hf 0,f H f 0,f. Moreover, we have H f 0,f Hf 0,f 3 which yields f0 /f / Hf 0,f f0 /f 1/4 Hf 0,f f0 /f 1/ { f F : H f 0,f ε n { f F : Hf0,f f0 / f < ε n { f F : Hf 0,f f0 / f < ε n. 3

4 The following simplelemma shows that, for ε n tobe a prior concentration rate, it isenough to assume Π W εn e n ε n c 3, where W ε = { f F : H f 0,f ε. Lemma 1. Let ε > 0 and c > 0. Then the inequality F0 F R nfπdf e nε 3+c Π W ε e nε c holds for all n. Lemma 1 provides a useful approach to compute prior concentration rates, particularly for models in which rate of convergence is governed by the prior concentration rate. It leads a simplification of the proof of Theorem. of Ghosal et al.000. Furthermore, we shall present general posterior convergence rate theorems in which the well known assumptions on metric entropies and summability of square root of prior probability are also weakened. We shall apply the Hausdorff α-entropy Jδ, G, α introduced by Xing et al.008. Denote by L µ the space of all nonnegative integrable functions with the norm f 1 = Xfxµdx. Write log0 =. Definition. Let α 0 and G F. For δ > 0, the Hausdorff α-entropy Jδ,G,α with respect to the prior distribution Π is defined as Jδ,G,α = log inf N ΠB j α, where the infimum is taken over all coverings {B 1,B,...,B N of G, where N may take the value, such that each B j is contained in some Hellinger ball {f : Hf j,f < δ of radius δ and center at f j L µ. Notethat theinfimum can beequivalentlytakenover allpartitions{p 1,P,...,P N of G such that the Hellinger radius of each subset P j does not exceed δ. It was proved in Xing et al.008, Lemma 1 that the Hausdorff α-entropy Jδ, G, α is an increasing subadditive function of G and satisfies Jδ,G,α log Nδ,G for all α 0, where Nδ,G stands for the minimal number of Hellinger balls of radius δ needed to cover G. For 0 α 1 and each G F, we also obtained the following useful inequality ΠG α e Jδ,G,α ΠG α Nδ,G 1 α. Our first result in this paper is the following general theorem on posterior strong convergence rates. Denote A ε = { f : Hf 0,f ε. Theorem 1. Let { ε n and { ε n be positive sequences such that n min ε n, ε n as n. Suppose that there exist constants c 1 > 0, c > 0, c 3 0, 0 α < 1 and a sequence {G n of subsets on F such that e n ε n c < and 1 e J ε n,g n,α n ε n c 1 <, 4

5 e n ε n 3+3c +c 3 ΠA εn \G n <, 3 Π f : H f 0,f ε n e n ε n c 3. 3α+αc Then for ε n = max ε n, ε n and each r > + +αc 3 +c 1 1 α, we have Π n Arεn 0 almost surely as n. As direct applications we have Corollary 1. Let c 1 0, c > 0, c 3 0 and 0 α < 1. Suppose that {ε n is a positive sequence satisfying e nε n c < and suppose that there exists a sequence {G n of subsets on F such that 1 Jε n,g n,α nε nc 1, ΠF\G n e nε n 3+3c +c 3, 3 Π f : H f 0,f ε n e nε n c 3. 3α+αc Then for each r > + +αc 3 +c 1 +c 1 α, we have that Π n Arεn 0 almost surely as n. Proof. It is clear that all conditions of Theorem 1 are fulfilled if we let ε n = ε n = ε n and replace the c 1 in Theorem 1 by c 1 +c of Corollary 1, and the proof is complete. Corollary 1 extends Theorem. of Ghosal et al.000, in which they have stronger conditions than 1 and 3 of Corollary 1. It is probably worth mentioning that for several important prior distributions of infinite-dimensional statistical models, the quantities Jε n,g n,α are uniformly bounded for all n and hence condition 1 of Corollary 1 is triviallyfulfilled, whereasgeneral metricentropiesofthesieveg n growtoinfinity asthesample size increases. A slightly different version of Theorem 1 is the following consequence, which is in fact an extension of Proposition 1 of Walker et all.007. Corollary. Let {ε n be a positive sequence such that nε n as n. Suppose that there exist constants c 1 > 0, c > 0, c 3 0, 0 α < 1 and a sequence { G nj with G nj F such that e nε n c < and 1 Nε n,g nj 1 α ΠG nj α e nε n c 1 < ; 5

6 e nε n 3+3c +c 3 ΠA εn \ G nj <, 3 Π f : H f 0,f ε n e nε n c 3. 3α+αc Then for each r > + +αc 3 +c 1, we have that Π 1 α n Arεn 0 almost surely as n. Proof. Let G n = G nj. We only need to verify condition 1 of Theorem 1 for such a sieve G n. By Lemma 1 of Xing et al.008 we have e Jε n,g n,α nε n c 1 Corollary then follows from Theorem 1. e Jε n,g nj,α nε n c 1 Nε n,g nj 1 α ΠG nj α e nε n c 1 <. The assertion of Theorem 1 is an almost sure statement that the posterior distributions outside a Hellinger ball with a multiple of ε n as radius converge to zero almost surely. Now we give an in-probability assertion under weaker conditions. Denote Vf,g = X fx log fx gx µdx. Theorem. Let { ε n and { ε n be positive sequences such that n min ε n, ε n as n. Suppose that there exist constants c 1 > 0, c 0, 0 α < 1 and a sequence {G n of subsets on F such that 1 J ε n,g n,α n ε nc 1 as n, e n ε n +c ΠA εn \G n 0 as n, 3 Π f : Kf 0,f < ε n and Vf 0,f < ε n e n ε n c. Then for ε n = max ε n, ε n and each r > + in probability as n. α+αc +c 1 1 α, we have that Π n Arεn 0 Observe that the conditions of Theorem 1 and Theorem are only used to ensure that Π n A εn \G n 0 as n. So one can replace the conditions of these theorems by Π n A εn \ G n 0 as n almost surely and in probability respectively. Theorem is an extended version of Theorem.1 of Ghosal et al.001 and Theorem 1 of Walker 6

7 et all.007. Furthermore, as a consequence of Theorem we obtain the following slight improvement of Theorem 5 of Ghosal et al.007b. Corollary 3. Let {ε n be a positive sequence such that nε n as n. Suppose that there exist constants c 1 > 0, c 0, 0 α < 1 and a sequence { G nj with G nj F. If 1 e nε n c 1 Nε n,g nj 1 α ΠG nj α 0 as n ; e nε n +c ΠA εn \ G nj 0 as n ; 3 Π f : Kf 0,f < ε n and Vf 0,f < ε n e nε n c, α+αc then for each r > + +c 1 1 α, we have that Π n Arεn 0 in probability as n. Proof. For G n = G nj, by Lemma 1 of Xing et al.008 we get e Jε n,g n,α nε n c 1 e Jε n,g nj,α nε n c 1 e nε n c 1 Nε n,g nj 1 α ΠG nj α which tends to zero as n and condition 1 of Theorem holds. Then by Theorem we conclude the proof. The above theorems cannot yield a convergence rate 1/ n because of the assumption nε n. Particularly, these theorems cannot well serve finite-dimensional models. Ghosal et al.000, 007a have obtained a nice theorem to handle such models. Denote B ε n = { f : Kf 0,f < ε n and Vf 0,f < ε n. Now our result is Theorem 3. Let {ε n be a positive sequence such that nε n are uniformly bounded away from zero, i.e., there exists a constant c 0 > 0 such that nε n c 0 for all n. Suppose that there exist constants 0 < α < 1, c 1 < 1 α 18 and a sequence {G n of subsets on F such that 1 e nε n ΠA ε n \G n ΠB ε n 0 as n, exp J jε n 3, { f G n : jε n Hf 0,f < jε n,α e c 1j nε n ΠBε n α for all sufficiently large positive integers j and n. Then for each r n we have that Π n Arn ε n 0 in probability as n. 7

8 Remark. Here we adopt the convention that if the denominator of a quotient equals zero thenthenumeratormust alsobezero. Hence, Theorem3isstilltrueevenwhenΠB ε n = 0 for some n. Corollary 4. Let {ε n be a positive sequence such that nε n are uniformly bounded away from zero. Suppose that there exist constants c 1, c and a sequence {G n of subsets on F such that 1 logn jε n 3, { f G n : jε n Hf 0,f < jε n c1 nε n for all integer j and n large enough, e nε n ΠA ε n \G n ΠB ε n 0 as n, 3 Π f G n :jε n Hf 0,f<jε n e c j nε n for all integer j and n large enough. ΠB ε n Then for each r n we have that Π n Arn ε n 0 in probability as n. Corollary 4 is a slightly stronger version of Theorem.4 of Ghosal et al.000. A notable improvement in Corollary 4 is that we have no restriction on the constant c, whereas their constant c equals half of some universal testing constant. Proof of Corollary 4. We only need to check condition of Theorem 3. It follows from Lemma 1 of Xing et al.008 and conditions 1 and 3 that J jε n 3,{ f G n : jε n Hf 0,f < jε n,α αlogπ f G n : jε n Hf 0,f < jε n +1 αlogn jε n 3,{ f G n : jε n Hf 0,f < jε n αlog e c j nε n ΠBε n +1 αc 1 nε n = αc + 1 αc 1 j j nε n +αlogπb ε n. Taking a small α in 0,1 and then letting j be large enough, we have that αc + 1 αc 1 1 α 18 j < and hence condition of Theorem 3 is fulfilled. The proof of Corollary4is complete. We will conclude this section by presenting an analogue of Theorem 3, which gives an almost sure assertion under stronger conditions. 8

9 Theorem 4. Let {ε n be a positive sequence such that there exists a constant c 0 > 0 such that nε n c 0 logn for all large n. Suppose that there exist constants 0 < α < 1, c 1 < 1 α 18, c > 1 c 0 and a sequence {G n of subsets on F such that 1 e nε n 3+c ΠA ε n \G n ΠW ε n <, exp J jε n 3, { f G n : jε n Hf 0,f < jε n,α e c 1j nε n ΠWεn α for all sufficiently large positive integers j and n. Then for each r large enough we have that Π n Arεn 0 almost surely as n. Completely following the proof of Corollary 4, we have the following consequence of Theorem 4. Corollary 5. Let {ε n be a positive sequence such that there exists a constant c 0 > 0 such that nε n c 0 logn for all large n. Suppose that there exist constants c 1, c > 1 c 0, c 3 and a sequence {G n of subsets on F such that 1 logn jε n 3, { f G n : jε n Hf 0,f < jε n c1 nε n for all integer j and n large enough, 3 e nε n 3+c ΠA ε n \G n ΠW ε n <, Π f G n :jε n Hf 0,f<jε n ΠW ε n e c 3j nε n for all integer j and n large enough. Then for each r large enough we have that Π n Arεn 0 almost surely as n. 3. Illustrations. In this section we apply our theorems to Bernstein polynomial priors, priors based on uniform distribution, log spline models and finite-dimensional models. This leads some improvements on known results for these models Bernstein polynomial prior. A Bernstein polynomial prior is a probability measure on the space of continuous probability distribution functions on [0, 1]. Petrone 1999 introduced the Bernstein polynomial prior Π by putting a prior distribution on the class of Bernstein densities in [0,1] in the following way: bx;k,f = k Fj/k Fj 1/k βx;j,k j +1, 9

10 where βx;a,b stands for the beta density βx;a,b = Γa+b ΓaΓb xa 1 1 x b 1, k has a distribution ρ, F is a random distribution independent of ρ. In other words, if B j stands for the set of all Bernstein densities of order j, then Π = ρjπ B j, where the probability measure Π Bj is the normalized restriction of Π on B j. We refer to Petrone and Wasserman 00 for a detailed description of the Bernstein polynomial prior, in which consistency of the posterior distribution for the Bernstein polynomial prior is discussed. Rates of convergence have been established under suitable tail conditions on ρ by Ghosal 001 and Walker et al.007, where the convergence is understood as convergence in F0 -probability. Ghosal 001, Theorem.3 proved that the prior concentration rate is logn 1/3 /n 1/3 and the entropy rate is logn 5/6 /n 1/3 under the tail assumption ρj e cj for all j, which yields the convergence rate logn 5/6 /n 1/3. Walker etal.007obtainedtheentropyratelogn 1/3 /n 1/3 under thelightertailconditionρj e 4j logj = 1/j j 4 for all j, which leads the convergence rate logn 1/3 /n 1/3. In the following we establish the entropy rate 1/n γ under the tail condition ρj 1/j j c 0 for all j, where c 0 is any fixed positive constant and γ is any fixed constant strictly less than 1/. Hence we also get the convergence rate logn 1/3 /n 1/3 under the weaker condition ρj 1/j j c 0. This improves the result of Walker et al.007. From Ghosal 001 it follows that there exists an absolute constant c > 0 such that Nε n,b j c/ε n j for all j. Given γ < 1/, choose G nj = B j and ε n = 1/n γ. Take 0 < α < 1 such that γ + 1/d < 1 with d = c 0 α/1 α. To verify condition 1 of Corollary 3, by ρj 1/j j c 0 we obtain that e nε n c 1 Nε n,b j 1 α ΠB j α e nε n c 1 c j1 α ρj α ε n e c 1n 1 γ e nε n c 1 1 j cn γ 1/d cn γ c ε n j c 0 α 1 α j d j1 α + j1 α j cn γ 1/d 1 j1 α, where the last sum on the right hand side tends to zero as n. To estimate the first term, note that for gt = cn γ /t d t we have that g t = cn γ /t d t logcn γ /t d d = 0 is equivalent to t = c 1/d v γ/d e 1. This implies gj e d 1n γ/d for all j, where d 1 stands for the constant dc 1/d e 1. Therefore, by the inequality x e x for x 0 we get e c 1n 1 γ 1 j cn γ 1/d cn γ j d j1 α e c 1 n 1 γ cn γ 1/d e d 11 αn γ/d e c 1n 1 γ + 1/d c 1/d +d 1 d 1 αn γ/d 0 as j, since 1 γ > γ/d, and hence condition1 ofcorollary3holds. Condition of Corollary 3 is trivially fulfilled and hence by Corollary 3 we obtain that the entropy rate is at least 1/n γ for any given γ < 1/. 10

11 3.. Prior based on uniform distribution. Ghosal et al.1997 established posterior consistency for prior distributions based on uniform distributions of finite subsets. Priors based on discrete uniform distributions were further studied in Ghosal et al.000, in which they used the bracketing entropy as a tool to compute the convergence rate of posterior distributions. As an application of Theorem 1, we now give a slight extension. Given ε n > 0, assume that there exist density functions f 1,...,f Nn such that all sets {f : H f,f j 3 1/ ε n form a covering of F. Denote by µ n the uniform discrete probability measure on the set {f 1,...,f Nn. Define a prior distribution Π = a j µ j for a given sequence a j with a j > 0 and a j = 1. Theorem 5. If logn n + logn + log 1 a n = Onε n as n, then the posterior distributions Π n converge almost surely at least at the rate ε n, that is, Π n f : Hf0,f rε n 0 as n almost surely for any given sufficiently large r. Proof. From logn = Onε n it follows that e nε n c < for all large c > 0. For any f with H f,f j 3 1/ ε n we have Hf,f j 3 1/ H f,f j ε n. Hence we obtain that {f : H f,f j 3 1/ ε n {f : Hf,f j < ε n and then Jε n,f,0 logn n = Onε n, which implies condition 1 of Theorem 1. Condition is trivially fulfilled. Condition 3 follows from the fact that Πf : H f 0,f ε n Πf : H f 0,f 3 1/ ε n a n /N n e nε n c 3 for some c 3 > 0, since {f : H f 0,f 3 1/ ε n contains at least some density function of the set {f 1,...,f Nn. The proof of Theorem 5 is complete. It seems to be unusual to find a covering of the density space with covering sets of type {f : H f,f j cε n. The most widely used norm for continuous functions should be the supremum norm. In fact, one can easily construct a new covering N n {f : H f,f j cε n of F in terms of a given covering N n {f : f g j ε n with nonnegative bounded functions g j not necessarily density functions, as shown in the following: Take f j x = g j x+ε n / X g j x+ε n µdx, where we assume that µ is a probability measure on X and ε n 1 for all n. Then for each f with f g j ε n, that is, gj x ε n fx g j x+ε n on X, we have fx f j x = fx g j x+ε n X = 1+4ε n +4ε n g j x+ε n µdx X X f j xµdx 1+ε n, fj x+ε µdx n where f j is some density function in {f : f g j ε n and the last inequality follows from f 1 f = 1. This implies that H f,f j 3 1+ε n Hf,f j Hf,f j H f, g j +ε n +H g j +ε n,f j 11

12 1 µdx 4ε n + g j x+ε n X Therefore, we have N n N { n f : H f,f j 8ε n 1 4ε n + 1+ε n 1 = 8ε n. { f : f gj ε n F. Observe that the numbers of covering subsets in both type coverings are equal. For models with H f,g controlled by a constant multiple of the Hellinger metric Hf, g such as the exponential family and a model with uniformly bounded supremum norm f/g for all density functions f and g, it is not necessary to assume that the probability measures µ n constructed above concentrate on a finite number of points. Here we give an extension of Theorem 5. Let c 0 1 and F c0 be a subfamily of F such that H f,g c 0 Hf,g for all f,g F c0. Given ε n > 0, let {P 1,...,P Kn be a partition of F c0 such that for each P i there exists f i in F c0 with P i {f : Hf i,f ε n /c 0. Take any probability measure µ n on F c0 with µ n P i = 1/K n for i = 1,,...,K n. Define then a prior distribution Π = a j µ j for a given sequence a j with a j > 0 and a j = 1. Now we have Theorem 6. Let f 0 F c0. If logk n + logn + log 1 a n = Onε n as n, then the posterior distributions Π n converge at least at the rate ε n almost surely. Proof. By the proof of Theorem 5, we only need to verify condition 3 of Theorem 1. Take f i0 F c0 such that Hf 0,f i0 ε n /c 0. Then, for all f F c0 with Hf i0,f ε n /c 0, we have that H f 0,f c 0 Hf 0,f c 0 Hf 0,f i0 + c 0 Hf i0,f ε n. Hence we get that Πf : H f 0,f ε n ΠP i0 a n /K n e nε n c 3 for some c 3 > 0, and the proof of Theorem 6 is complete. Observe that, given a covering {O 1,O,...,O Kn of F c0, one can easily construct a partition {P 1,P,...,P Kn of F c0 in the following way: P 1 = O 1 F c0 and P i = O i i 1 l=1 P l F c0 for i =,3,...,K n. Example Exponential families. We consider the exponential family of all density functions of the form e hx, where the function hx belongs to a fixed bounded subset in the Sobolev space C p [0,1] with p > 0. A subclass of this family has been recently studied by Scricciolo 006. Following a result of Kolmogorov and Tihomirov 1959, we know that the ε-entropy of this family with respect to the norm equals Oε 1/p. Thus, using the above argument we get that logk n = Onε n for ε n = n p/p+1 and hence by Theorem 6 the posterior distributions constructed above converge at the rate ε n = n p/p+1, which is known to be the optimal rate of convergence in the minimax sense under the Hellinger loss Log spline models. Log spline models for density estimation have been studied, among others, by Stone 1990 and Ghosal et al.000. Let [ k 1/K n,k/k n with 1

13 k = 1,,...,K n be a partition of the interval [0,1. The space of splines of order q relative to this partition is the set of all functions f : [0,1 R such that f is q times continuously differentiable on [0,1 and the restriction of f on each [ k 1/K n,k/k n is a polynomial of degree strictly less then q. Let J n = q+k n 1. This space of splines is a J n -dimensional vector space with a B-spline basis B 1 x,b x,...,b Jn x, see Ghosal et al.000 for the details of such a basis. Let F be the set of all density functions in C α [0,1]. Assume that the true density function f 0 x is bounded away from zero and infinity. We consider the J n -dimensional exponential subfamily of C α [0,1] of the form J n f θ x = exp θ j B j x cθ where θ = θ 1,θ,...,θ Jn Θ 0 = {θ 1,θ,...,θ Jn R J n : J n θ j = 0 and the constant cθ is chosen such that f θ x is a density function in [0,1]. Each prior on Θ 0 induces naturally a prior on F. Let θ = max j θ j be the infinity norm on Θ 0. Assume that a 1 n 1/α+1 K n a n 1/α+1 for two fixed positive constants a 1 and a. Assume that the prior Π for Θ 0 is supported on [ M,M] J n for some M 1 and has a density function with respect to the Lebesgue measure on Θ 0, which is bounded below by a J n 3 and above by a J n 4. Take a constant d > 0 such that d θ logf θ x for all θ Θ 0. Ghosal et al.000, Theorem 4.5 proved that, if f 0 C α [0,1] with q α 1/ and logf 0 x dm/, the posteriors Π n converge in probability at the rate n α/α+1. Using Corollary 5 we now get that under the same assumptions as in Ghosal et al.000, Theorem 4.5, the posteriors Π n are in fact convergent almost surely at the rate ε n = n α/α+1. To see this, take G n = F. Clearly, nε n logn for all large n. Condition 1 of Corollary 5 has been verified by Ghosal et al.000 and condition is trivially fulfilled. Condition 3 follows also from the proof of Theorem 4.5 in Ghosal et al.000, since the inequality H f 0,f θ Hf 0,f θ f 0 /f θ 1/ holds for all θ Θ Finite-dimensional models. Let β > 0 and let Θ be a bounded subset in R d with the Euclidean norm. Denote by F the family of all density functions f θ with the parameter θ in Θ satisfying a 1 θ 1 θ β Hf θ1,f θ 3H f θ1,f θ a θ 1 θ β for all θ 1, θ Θ, where a 1 and a are two fixed positive constants. Assume that the true value θ 0 is in Θ and that the density function of the prior distribution Π with respect to the Lebesgue measure on Θ is uniformly bounded away from zero and infinity. Under slightly weaker conditions, Ghosal et al.000 proved that the posterior distributions Π n converge in probability at the rate 1/ n. Now we give an almost sure assertion for this model. Theorem 7. Under the above assumptions, the posterior distributions Π n converge almost surely at least at the rate logn/ n. 13,

14 Proof. We shall apply Corollary 5 for G n = F. Clearly, nε n = logn for ε n = logn/ n. Condition 1 has been verified in the proof of Theorem 5.1 of Ghosal et al.000. Condition is trivially fulfilled. Using H f θ0,f θ 3 1/ a θ 0 θ β, we have that ΠW εn Π θ : θ θ 0 3ε n /a 1/β. Hence, the verification of condition 3 follows from the same lines as the proof of Ghosal et al.000, Theorem 5 and then by Corollary 5 we conclude the proof of Theorem Lemmas and Proofs. In this section we give proofs of our lemmas and theorems. For simplicity of notations, we assume throughout this section that ε n = ε n = ε n. Proof of Lemma 1. ItisnorestrictiontoassumethatΠ W ε > 0. UsingJensen sinequality for the convex function x 1/ for x > 0 and Chebyshev s inequality, we obtain that F 0 On the other hand, we have = 1+ X = 1+ X = 1+ X F R nfπdf e nε 3+c Π W ε F0 R n fπdf e nε W ε = F0 e nε 3 +c 1 R n fπdf ΠW ε W ε F0 e nε 3 +c 1 ΠW ε e nε 3 +c 1 ΠW ε E R n f 1 Πdf W ε = e nε 3 +c 1 ΠW ε E W ε E 3+c Π W ε 1 W ε R n f 1 Πdf f 0 X 1 fx 1 n Πdf. f 0 X 1 fx 1 = 1+E f0 X 1 fx 1 fx1 f0 x fx fx f0 x fx+ fx f0 xµdx f0 x f0 x fx µdx+ fx f0 x f0 x fx µdx+ 1 fx 14 X f0 x fxf 0 x µdx X f0 x fx µdx

15 = 1+ 3 H f 0,f e 3 H f 0,f e 3 ε, where the last inequality holds when f W ε. Hence we get F 0 F R nfπdf e nε 3+c Π W ε e nε 3 +c 1 ΠW ε The proof of Lemma 1 is complete. W ε e 3 nε Πdf = e nε c. In the proof of Theorem 1 we use the following Lemma, which is similar to Lemma 5 of Barron et al Lemma. Let c > 0 and c 3 0. Let {ε n be a positive sequence such that ΠW ε n e nε n c 3 for all n and e nε n c <. If a sequence {D n of subsets in F satisfies e nε n 3+3c +c 3 ΠD n <, then Π n D n 0 almost surely as n. Proof. From Chebyshev s inequality and Fubini s theorem it turns out that F 0 { D n R n fπdf e nε = e nε n 3+3c +c 3 n 3+3c +c 3 e nε n 3+3c +c 3 E R n fπdf D n ER n fπdf = e nε n 3+3c +c 3 ΠD n D n for all n. Hence by the first Borel-Cantelli Lemma we get that D n R n fπdf e nε n 3+3c +c 3 almost surely for all n large enough. On the other hand, Lemma 1 and the first Borel- Cantelli Lemma yield that F R nfπdf ΠW εn e nε n 3+c e nε n 3+c +c 3 almost surely for all n. Therefore, we obtain that with probability one, Π n D n = D n R n fπdf F R nfπdf e nε n c, which tends to zero as n and the proof of Lemma is complete. 15

16 Proof of Theorem 1. It isclear that if condition1 holds for some α = α 0 then it also holds 3α+αc for any α α 0. So we may assume that 0 < α < 1. Given r > + +αc 3 +c 1 1 α, we have Π n A rεn Π n Gn A rεn +Πn Aεn \G n. Itthen followsfrom Lemma that Π n Aεn \G n 0 almostsurelyasn. Soitsuffices toprovethatπ n Gn A rεn 0almostsurely asn. BythedefinitionofJεn,G n,α, for each fixed n there exist functions f 1,f,...,f N in L µ such that G n A rεn N B j, where B j = G n A rεn {f : Hf j, f < ε n and N ΠB j α e Jε n,g n,α. It is no restriction to assume that all the sets B j are disjoint and nonempty. Taking a fj B j we get that Hf j,f 0 Hfj,f 0 Hfj,f j r 1ε n. Now for each B j we have B j R n fπdf = ΠB j n 1 B j R k+1 fπdf B j R k fπdf = ΠB j n 1 f kbj X k+1 f 0 X k+1, wheref kbj x = B j fxr k fπdf / B j R k fπdfandr 0 f = 1. Thefunction f kbj was introduced by Walker 004 and can be considered as the predictive density of f with a normalized posterior distribution, restricted on the set B j. Clearly, Jensen s inequality yields that Hf kbj,f j ε n for each k. Hence Hf kbj,f 0 Hf j,f 0 Hf j,f kbj r ε n > 0. Since e nε n c <, it turns out from Lemma 1 and the first Borel- Cantelli Lemma that F R nfπdf e nε n 3+c Π W εn almost surely for all n large enough. Hence, by condition 3 we obtain that Π n Gn A rεn Πn G n A rεn α N Π n B j α N Π n B j α = N ΠB j α n 1 f kbj X k+1 α f 0 X k+1 α F R nfπdf α N ΠB j α n 1 f kbj X k+1 α f 0 X k+1 α α ΠW εn e nε n 3+c e nε n 3+c +c 3 α N n 1 ΠB j α almost surely for all n large enough. Since r > + f kbj X k+1 α f 0 X k+1 α 3α+αc +αc 3 +c 1, the inequality 1 α 3+c +c 3 α < 1 r 1 α c 1 holds. Take a constant b with 3+c +c 3 α < b < 1 r 1 α c 1. Denote F k = σ{x 1,X,...,X k. Then we have F 0 { N n 1 ΠB j α f kbj X k+1 α f 0 X k+1 α e nε n b 16

17 where E n 1 N e nε n b E n 1 ΠB j α N = e nε n b ΠB j α E n 1 n 1 e Jε n,g n,α+nε n b max E 1 j N = E f kbj X k+1 α n 1 f 0 X k+1 α = E E n f kbj X k+1 α f 0 X k+1 α f kbj X k+1 α f 0 X k+1 α f kbj X k+1 α f 0 X k+1 E fn 1Bj X n α α f 0 X n α f kbj X k+1 α f 0 X k+1 α, f kbj X k+1 α f 0 X k+1 α F n 1 F n 1. By the conditional Hölder s inequality we get that with probability one, fn 1Bj X n α E f 0 X n α F n 1 fn 1Bj X n α = E f 0 X n α f n 1Bj X n α f 0 X n α F n 1 fn 1Bj X n α α E f 0 X n α α F n 1 α fn 1Bj X n α α = E f 0 X n α α fn 1Bj X n α α E f 0 X n α α F n 1 α. F n 1 α Take the integer m with α 1 α m < α. Repeating the above procedure m 1 more 1 α times we obtain that with probability one, fn 1Bj X n α E f 0 X n α F n 1 fn 1Bj X n E f 0 X n α m 1 α+α α m 1 α+α F n 1 m 1 α+α m, which by the conditional Hölder s inequality is less than fn 1Bj X n 1 E f 0 X n 1 F n 1 = 1 Hf n 1B j,f 0 α m 1 α = f n 1Bj X n f 0 X n µdx n m 1 α m 1 1 r ε n e m r αε n e 1 r α 1ε n. 17 α m 1

18 Hence, with probability one, we have n 1 E f kbj X k+1 α f 0 X k+1 α n e 1 r α 1ε n E f kbj X k+1 α f 0 X k+1 α. Repeating the same argument n 1 times, we obtain that for each j, n 1 E f kbj X k+1 α f 0 X k+1 α Therefore, we have gotten that for all n, F 0 { N e Jε n,g n,α+nε n n 1 ΠB j α e 1 r α 1nε n. f kbj X k+1 α f 0 X k+1 α b e nε n b+ 1 r α 1 e Jε n,g n,α nε n c 1. Thus, together with condition 1, the first Borel-Cantelli Lemma yields that N n 1 ΠB j α f kbj X k+1 α f 0 X k+1 α e nε n b almost surely for all n large enough. Hence we have Π n Gn A rεn e nε n 3α+αc +αc 3 b, which tends to zero as n, since 3+c +c 3 α < b and nε n proof of Theorem 1 is complete. as n. The To prove Theorem, we need a replacement of Lemma under weaker conditions. Lemma 3. Let c 0 and let {ε n be a positive sequence such that ΠB ε n e nε n c for all n. If a sequence {D n of subsets in F satisfies e nε n +c ΠD n 0 as n, then Π n D n 0 in probability as n. Proof. From Lemma 1 of Shen et al. 001 or Lemma 8.1 of Ghosal et al. 000 it turns out that we have, with probability tending to 1, D Π n D n n R n fπdf e nε n +c R n fπdf. ΠB ε n e nε n D n Hence for any given δ > 0 we have that F 0 { Πn D n δ F 0 { e nε n +c 18 R n fπdf δ +o1 D n

19 1 δ enε n +c E R n fπdf+o1 D n = 1 δ enε n +c ΠD n +o1 0 as n, which concludes the proof of Lemma 3. Proof of Theorem. Assumethat0 < α < 1. TheproofofTheoremfollowsfromthesame lines as the proof of Theorem 1. By Lemma 3 it suffices to prove that Π n Gn A rεn 0 in probability as n. For any given δ > 0, following the proof of Theorem 1 we get F 0 { Πn Gn A rεn δ F 0 { e nε n +c α N n 1 ΠB j α f kbj X k+1 α f 0 X k+1 α δ +o1 1 N δ enε n +c α ΠB j α E n 1 f kbj X k+1 α f 0 X k+1 α +o1 n 1 δ ejε n,g n,α+nε n +cα max E 1 j N δ ejε n,g n,α+nε n +c α+ 1 r α 1 f kbj X k+1 α f 0 X k+1 α +o1 +o1 δ ejε n,g n,α nε n c +o1 0 as n, α+αc where the last inequality follows from r > + +c 1. The proof of Theorem is 1 α complete. Proof of Theorem 3. Since Π n A rn ε n Π n Gn A rn ε n +Πn Aεn \G n, it suffices that the terms on the right hand side both tend to zero in probability. Given δ > 0, the proof of Lemma 3 implies that F 0 { Πn Aεn \G n δ e nε n ΠAεn \G n δ ΠB ε n +o1, which by condition 1 tends to zero as n. Assume that [r n ] stands for the largest integer less than or equal to r n and assume that D j = { f G n : jε n Hf 0,f < jε n Indeed, D j is an empty set for j > /ε n since the Hellinger distance cannot exceed. Then we have F 0 { Πn Gn A rn ε n δ F 0 { j=[r n ] Π n D j δ 19

20 F 0 { j=[r n ] Π n D j α δ 1 δ E j=[r n ] Π n D j α 1 δ j=[r n ] EΠ n D j α. Take a partition N j i=1 D ji for each D j such that D ji {f : Hf ji, f < jε n 3 for some f ji in L µ and N j ΠD ji α exp J jε n 3,D j,α e c 1j nε n ΠBε n α, i=1 where the last inequality follows from condition. Using the same argument as the proof of Theorem 1, one can get Hf kdji,f 0 jε n /3 and hence we have with probability tending to 1 F 0 { Πn Gn A rn ε n δ 1 δ j=[r n ] E N j i=1 α Π n D ji δ 1 δ 3 δnε n j=[r n ] N j j=[r n ] i=1 j=[r n ] EΠ n D ji α eαnε n δπb ε n α e nε n α+c 1j j α 1 δ j=[r n ] N j j=[r n ] i=1 ΠD ji α e 1 18 j nε n α 1 1 nε n α+c1 j j α 1 1 c 1 j j 1 α 3 δnε n c α = 3 δnεn c α [r n ] 1 j=[r n ] 1 j 1 1 j which tends to zero as n, since nε n are uniformly bounded away from zero and r n, where the second inequality follows from α 0,1, the third from the proof of Theorem 1, the fifth from the elementary inequality e x < 1 x for x > 0 and some of the inequalities only hold for all large n. Thus, we have proved that Π n Gn A rn ε n converges to zero in probability. The proof of Theorem 3 is complete. Proofof Theorem 4. Fromnε n c 0 lognandc 0 c > 1itturnsoutthate nε n c 1/n c 0 c and hence, by the first Borel-Cantelli Lemma and Lemma 1, we obtain that F R nfπdf e nε n 3+c Π W εn almost surely for all large n. Then, following the proofs of Lemma 3 and Theorem 3, one can get that for any δ > 0 and r > 1, F 0 { { Πn Arεn δ F δ { 0 Πn Aεn \G n +F δ 0 Πn Gn A rεn 0

21 enε n 3+c ΠA εn \G n + 4 δπw εn δ j=[r] e nε n 3+c α+c 1 j j α 1 := a n +b n. Condition 1 yields that a n <. On the other hand, since c 1 < 1 α 18, we have that for all n and for all r so large that 3+c α+c 1 + α 1 18 r 1 c 0, 4nc 0 b n 4enε n 3+c α δ 3+c α+c 1 + α 1 18 [r] j=[r] e nε δ 4nc0 1 n c 0c 1 + α 1 18 n c 1+ α 1 18 j = 4enε n 3+c α+c 1 + α 1 18 [r] δ 1 e nε n c 1+ α c α+c 1 + α 1 18 r 1 δ 1 c 0c 1 + α 1 4n 18 δ 1 c 0c 1 + α and hence b n <. Thus, by the first Borel-Cantelli Lemma we obtain that Π n Arεn δ almost surely for all large n, which concludes the proof of Theorem 4. REFERENCES BARRON, A., SCHERVISH, M. and WASSERMAN, L The consistency of posterior distributions in nonparametric problems. Ann. Statist. 7, GHOSAL, S Convergence rates for density estimation with Bernstein polynomials. Ann. Statist. 9, GHOSAL, S., GHOSH, J. K. and RAMAMOORTHI, R. V Non-informative priors via sieves and packing numbers. In Advances in Statistical Decision Theory and Applications S. Panchapakeshan and N.Balakrishnan eds Birkhäuser, Boston. GHOSAL, S., GHOSH, J. K. and VAN DERVAART, A. W Convergence rates of posterior distributions. Ann. Statist. 8, GHOSAL,S.andVAN DERVAART,A.W.001. Entropiesand ratesof convergencefor maximum likelihood and Bayes estimation for mixtures of normal densities. Ann. Statist. 9, GHOSAL, S. and VAN DER VAART, A. W. 007a. Convergence rates of posterior distributions for noniid observations. Ann. Statist. 35, GHOSAL, S. and VAN DER VAART, A. W. 007b. Posterior convergence rates of Dirichlet mixtures at smooth densities. Ann. Statist. 35, KOLMOGOROV, A. N. and TIHOMIROV, V. M ε-entropy and ε-capacity of sets in function spaces. Uspekhi Mat. Nauk 14, 3-86 [in Russian; English transl. Amer. Math. Soc. Transl. Ser., 17, ]. LIJOI, A., PRUNSTER, I. and WALKER, S On convergence rates for nonparametric posterior distributions. Aust. N. Z. J. Stat. 493, PETRONE, S Random Bernstein polynomials. Scand. J. Statist. 6, PETRONE, S. and WASSERMAN, L. 00. Consistency of Bernstein polynomial posteriors. J. R. Stat. Soc. Ser. B Stat. Methodol. 64, SCHWARTZ, L On Bayes procedures Z. Wahr. verw. Geb. 4,

22 SCRICCIOLO, C Convergence rates for Bayesian density estimation of infinite-dimensional exponential families. Ann. Statist. 34, SHEN, X. and WASSERMAN, L Rates of convergence of posterior distributions. Ann. Statist. 9, STONE, C. J Lerge-sample inference for log-spline models. Ann. Statist. 18, WALKER, S New approaches to Bayesian consistency. Ann. Statist. 3, WALKER, S., LIJOI, A. and PRUNSTER, I On rates of convergence for posterior distributions in infinite-dimensional models. Ann. Statist. 35, XING, Y. and RANNEBY, B Sufficient conditions for Bayesian consistency. Preprint. Yang Xing Centre of Biostochastics Swedish University of Agricultural Sciences SE , Umeå Sweden address:

BERNSTEIN POLYNOMIAL DENSITY ESTIMATION 1265 bounded away from zero and infinity. Gibbs sampling techniques to compute the posterior mean and other po

BERNSTEIN POLYNOMIAL DENSITY ESTIMATION 1265 bounded away from zero and infinity. Gibbs sampling techniques to compute the posterior mean and other po The Annals of Statistics 2001, Vol. 29, No. 5, 1264 1280 CONVERGENCE RATES FOR DENSITY ESTIMATION WITH BERNSTEIN POLYNOMIALS By Subhashis Ghosal University of Minnesota Mixture models for density estimation

More information

Bayesian Consistency for Markov Models

Bayesian Consistency for Markov Models Bayesian Consistency for Markov Models Isadora Antoniano-Villalobos Bocconi University, Milan, Italy. Stephen G. Walker University of Texas at Austin, USA. Abstract We consider sufficient conditions for

More information

Asymptotic properties of posterior distributions in nonparametric regression with non-gaussian errors

Asymptotic properties of posterior distributions in nonparametric regression with non-gaussian errors Ann Inst Stat Math (29) 61:835 859 DOI 1.17/s1463-8-168-2 Asymptotic properties of posterior distributions in nonparametric regression with non-gaussian errors Taeryon Choi Received: 1 January 26 / Revised:

More information

Priors for the frequentist, consistency beyond Schwartz

Priors for the frequentist, consistency beyond Schwartz Victoria University, Wellington, New Zealand, 11 January 2016 Priors for the frequentist, consistency beyond Schwartz Bas Kleijn, KdV Institute for Mathematics Part I Introduction Bayesian and Frequentist

More information

Bayesian estimation of the discrepancy with misspecified parametric models

Bayesian estimation of the discrepancy with misspecified parametric models Bayesian estimation of the discrepancy with misspecified parametric models Pierpaolo De Blasi University of Torino & Collegio Carlo Alberto Bayesian Nonparametrics workshop ICERM, 17-21 September 2012

More information

Bayesian Regularization

Bayesian Regularization Bayesian Regularization Aad van der Vaart Vrije Universiteit Amsterdam International Congress of Mathematicians Hyderabad, August 2010 Contents Introduction Abstract result Gaussian process priors Co-authors

More information

Introduction and Preliminaries

Introduction and Preliminaries Chapter 1 Introduction and Preliminaries This chapter serves two purposes. The first purpose is to prepare the readers for the more systematic development in later chapters of methods of real analysis

More information

Nonparametric Bayesian model selection and averaging

Nonparametric Bayesian model selection and averaging Electronic Journal of Statistics Vol. 2 2008 63 89 ISSN: 1935-7524 DOI: 10.1214/07-EJS090 Nonparametric Bayesian model selection and averaging Subhashis Ghosal Department of Statistics, North Carolina

More information

1234 S. GHOSAL AND A. W. VAN DER VAART algorithm to adaptively select a finitely supported mixing distribution. These authors showed consistency of th

1234 S. GHOSAL AND A. W. VAN DER VAART algorithm to adaptively select a finitely supported mixing distribution. These authors showed consistency of th The Annals of Statistics 2001, Vol. 29, No. 5, 1233 1263 ENTROPIES AND RATES OF CONVERGENCE FOR MAXIMUM LIKELIHOOD AND BAYES ESTIMATION FOR MIXTURES OF NORMAL DENSITIES By Subhashis Ghosal and Aad W. van

More information

Bayesian Adaptation. Aad van der Vaart. Vrije Universiteit Amsterdam. aad. Bayesian Adaptation p. 1/4

Bayesian Adaptation. Aad van der Vaart. Vrije Universiteit Amsterdam.  aad. Bayesian Adaptation p. 1/4 Bayesian Adaptation Aad van der Vaart http://www.math.vu.nl/ aad Vrije Universiteit Amsterdam Bayesian Adaptation p. 1/4 Joint work with Jyri Lember Bayesian Adaptation p. 2/4 Adaptation Given a collection

More information

Bayesian asymptotics with misspecified models

Bayesian asymptotics with misspecified models ISSN 2279-9362 Bayesian asymptotics with misspecified models Pierpaolo De Blasi Stephen G. Walker No. 270 October 2012 www.carloalberto.org/research/working-papers 2012 by Pierpaolo De Blasi and Stephen

More information

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9 MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended

More information

Nonparametric Bayesian Uncertainty Quantification

Nonparametric Bayesian Uncertainty Quantification Nonparametric Bayesian Uncertainty Quantification Lecture 1: Introduction to Nonparametric Bayes Aad van der Vaart Universiteit Leiden, Netherlands YES, Eindhoven, January 2017 Contents Introduction Recovery

More information

is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications.

is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications. Stat 811 Lecture Notes The Wald Consistency Theorem Charles J. Geyer April 9, 01 1 Analyticity Assumptions Let { f θ : θ Θ } be a family of subprobability densities 1 with respect to a measure µ on a measurable

More information

ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS

ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS Bendikov, A. and Saloff-Coste, L. Osaka J. Math. 4 (5), 677 7 ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS ALEXANDER BENDIKOV and LAURENT SALOFF-COSTE (Received March 4, 4)

More information

An inverse of Sanov s theorem

An inverse of Sanov s theorem An inverse of Sanov s theorem Ayalvadi Ganesh and Neil O Connell BRIMS, Hewlett-Packard Labs, Bristol Abstract Let X k be a sequence of iid random variables taking values in a finite set, and consider

More information

On Consistency of Nonparametric Normal Mixtures for Bayesian Density Estimation

On Consistency of Nonparametric Normal Mixtures for Bayesian Density Estimation On Consistency of Nonparametric Normal Mixtures for Bayesian Density Estimation Antonio LIJOI, Igor PRÜNSTER, andstepheng.walker The past decade has seen a remarkable development in the area of Bayesian

More information

Chapter 4. Measure Theory. 1. Measure Spaces

Chapter 4. Measure Theory. 1. Measure Spaces Chapter 4. Measure Theory 1. Measure Spaces Let X be a nonempty set. A collection S of subsets of X is said to be an algebra on X if S has the following properties: 1. X S; 2. if A S, then A c S; 3. if

More information

Random Bernstein-Markov factors

Random Bernstein-Markov factors Random Bernstein-Markov factors Igor Pritsker and Koushik Ramachandran October 20, 208 Abstract For a polynomial P n of degree n, Bernstein s inequality states that P n n P n for all L p norms on the unit

More information

Nonparametric Bayes Inference on Manifolds with Applications

Nonparametric Bayes Inference on Manifolds with Applications Nonparametric Bayes Inference on Manifolds with Applications Abhishek Bhattacharya Indian Statistical Institute Based on the book Nonparametric Statistics On Manifolds With Applications To Shape Spaces

More information

SHARP BOUNDARY TRACE INEQUALITIES. 1. Introduction

SHARP BOUNDARY TRACE INEQUALITIES. 1. Introduction SHARP BOUNDARY TRACE INEQUALITIES GILES AUCHMUTY Abstract. This paper describes sharp inequalities for the trace of Sobolev functions on the boundary of a bounded region R N. The inequalities bound (semi-)norms

More information

D I S C U S S I O N P A P E R

D I S C U S S I O N P A P E R I N S T I T U T D E S T A T I S T I Q U E B I O S T A T I S T I Q U E E T S C I E N C E S A C T U A R I E L L E S ( I S B A ) UNIVERSITÉ CATHOLIQUE DE LOUVAIN D I S C U S S I O N P A P E R 2014/06 Adaptive

More information

STAT 7032 Probability Spring Wlodek Bryc

STAT 7032 Probability Spring Wlodek Bryc STAT 7032 Probability Spring 2018 Wlodek Bryc Created: Friday, Jan 2, 2014 Revised for Spring 2018 Printed: January 9, 2018 File: Grad-Prob-2018.TEX Department of Mathematical Sciences, University of Cincinnati,

More information

Consistent semiparametric Bayesian inference about a location parameter

Consistent semiparametric Bayesian inference about a location parameter Journal of Statistical Planning and Inference 77 (1999) 181 193 Consistent semiparametric Bayesian inference about a location parameter Subhashis Ghosal a, Jayanta K. Ghosh b; c; 1 d; ; 2, R.V. Ramamoorthi

More information

Entropy and Ergodic Theory Lecture 15: A first look at concentration

Entropy and Ergodic Theory Lecture 15: A first look at concentration Entropy and Ergodic Theory Lecture 15: A first look at concentration 1 Introduction to concentration Let X 1, X 2,... be i.i.d. R-valued RVs with common distribution µ, and suppose for simplicity that

More information

An introduction to some aspects of functional analysis

An introduction to some aspects of functional analysis An introduction to some aspects of functional analysis Stephen Semmes Rice University Abstract These informal notes deal with some very basic objects in functional analysis, including norms and seminorms

More information

1 Topology Definition of a topology Basis (Base) of a topology The subspace topology & the product topology on X Y 3

1 Topology Definition of a topology Basis (Base) of a topology The subspace topology & the product topology on X Y 3 Index Page 1 Topology 2 1.1 Definition of a topology 2 1.2 Basis (Base) of a topology 2 1.3 The subspace topology & the product topology on X Y 3 1.4 Basic topology concepts: limit points, closed sets,

More information

Real Analysis Math 131AH Rudin, Chapter #1. Dominique Abdi

Real Analysis Math 131AH Rudin, Chapter #1. Dominique Abdi Real Analysis Math 3AH Rudin, Chapter # Dominique Abdi.. If r is rational (r 0) and x is irrational, prove that r + x and rx are irrational. Solution. Assume the contrary, that r+x and rx are rational.

More information

RATES OF CONVERGENCE OF ESTIMATES, KOLMOGOROV S ENTROPY AND THE DIMENSIONALITY REDUCTION PRINCIPLE IN REGRESSION 1

RATES OF CONVERGENCE OF ESTIMATES, KOLMOGOROV S ENTROPY AND THE DIMENSIONALITY REDUCTION PRINCIPLE IN REGRESSION 1 The Annals of Statistics 1997, Vol. 25, No. 6, 2493 2511 RATES OF CONVERGENCE OF ESTIMATES, KOLMOGOROV S ENTROPY AND THE DIMENSIONALITY REDUCTION PRINCIPLE IN REGRESSION 1 By Theodoros Nicoleris and Yannis

More information

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS

INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS INDISTINGUISHABILITY OF ABSOLUTELY CONTINUOUS AND SINGULAR DISTRIBUTIONS STEVEN P. LALLEY AND ANDREW NOBEL Abstract. It is shown that there are no consistent decision rules for the hypothesis testing problem

More information

10 Typical compact sets

10 Typical compact sets Tel Aviv University, 2013 Measure and category 83 10 Typical compact sets 10a Covering, packing, volume, and dimension... 83 10b A space of compact sets.............. 85 10c Dimensions of typical sets.............

More information

Some Background Material

Some Background Material Chapter 1 Some Background Material In the first chapter, we present a quick review of elementary - but important - material as a way of dipping our toes in the water. This chapter also introduces important

More information

3 Integration and Expectation

3 Integration and Expectation 3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ

More information

arxiv: v2 [math.st] 18 Feb 2013

arxiv: v2 [math.st] 18 Feb 2013 Bayesian optimal adaptive estimation arxiv:124.2392v2 [math.st] 18 Feb 213 using a sieve prior Julyan Arbel 1,2, Ghislaine Gayraud 1,3 and Judith Rousseau 1,4 1 Laboratoire de Statistique, CREST, France

More information

P-adic Functions - Part 1

P-adic Functions - Part 1 P-adic Functions - Part 1 Nicolae Ciocan 22.11.2011 1 Locally constant functions Motivation: Another big difference between p-adic analysis and real analysis is the existence of nontrivial locally constant

More information

The Hilbert Transform and Fine Continuity

The Hilbert Transform and Fine Continuity Irish Math. Soc. Bulletin 58 (2006), 8 9 8 The Hilbert Transform and Fine Continuity J. B. TWOMEY Abstract. It is shown that the Hilbert transform of a function having bounded variation in a finite interval

More information

Defining the Integral

Defining the Integral Defining the Integral In these notes we provide a careful definition of the Lebesgue integral and we prove each of the three main convergence theorems. For the duration of these notes, let (, M, µ) be

More information

4 Countability axioms

4 Countability axioms 4 COUNTABILITY AXIOMS 4 Countability axioms Definition 4.1. Let X be a topological space X is said to be first countable if for any x X, there is a countable basis for the neighborhoods of x. X is said

More information

DISCUSSION: COVERAGE OF BAYESIAN CREDIBLE SETS. By Subhashis Ghosal North Carolina State University

DISCUSSION: COVERAGE OF BAYESIAN CREDIBLE SETS. By Subhashis Ghosal North Carolina State University Submitted to the Annals of Statistics DISCUSSION: COVERAGE OF BAYESIAN CREDIBLE SETS By Subhashis Ghosal North Carolina State University First I like to congratulate the authors Botond Szabó, Aad van der

More information

IN this paper, we study two related problems of minimaxity

IN this paper, we study two related problems of minimaxity IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 45, NO. 7, NOVEMBER 1999 2271 Minimax Nonparametric Classification Part I: Rates of Convergence Yuhong Yang Abstract This paper studies minimax aspects of

More information

Bayesian Aggregation for Extraordinarily Large Dataset

Bayesian Aggregation for Extraordinarily Large Dataset Bayesian Aggregation for Extraordinarily Large Dataset Guang Cheng 1 Department of Statistics Purdue University www.science.purdue.edu/bigdata Department Seminar Statistics@LSE May 19, 2017 1 A Joint Work

More information

Introduction to Real Analysis Alternative Chapter 1

Introduction to Real Analysis Alternative Chapter 1 Christopher Heil Introduction to Real Analysis Alternative Chapter 1 A Primer on Norms and Banach Spaces Last Updated: March 10, 2018 c 2018 by Christopher Heil Chapter 1 A Primer on Norms and Banach Spaces

More information

Asymptotic efficiency of simple decisions for the compound decision problem

Asymptotic efficiency of simple decisions for the compound decision problem Asymptotic efficiency of simple decisions for the compound decision problem Eitan Greenshtein and Ya acov Ritov Department of Statistical Sciences Duke University Durham, NC 27708-0251, USA e-mail: eitan.greenshtein@gmail.com

More information

HILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define

HILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define HILBERT SPACES AND THE RADON-NIKODYM THEOREM STEVEN P. LALLEY 1. DEFINITIONS Definition 1. A real inner product space is a real vector space V together with a symmetric, bilinear, positive-definite mapping,

More information

arxiv: v2 [math.co] 11 Mar 2008

arxiv: v2 [math.co] 11 Mar 2008 Testing properties of graphs and functions arxiv:0803.1248v2 [math.co] 11 Mar 2008 Contents László Lovász Institute of Mathematics, Eötvös Loránd University, Budapest and Balázs Szegedy Department of Mathematics,

More information

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider

More information

Empirical Processes: General Weak Convergence Theory

Empirical Processes: General Weak Convergence Theory Empirical Processes: General Weak Convergence Theory Moulinath Banerjee May 18, 2010 1 Extended Weak Convergence The lack of measurability of the empirical process with respect to the sigma-field generated

More information

Topological properties

Topological properties CHAPTER 4 Topological properties 1. Connectedness Definitions and examples Basic properties Connected components Connected versus path connected, again 2. Compactness Definition and first examples Topological

More information

A LITTLE REAL ANALYSIS AND TOPOLOGY

A LITTLE REAL ANALYSIS AND TOPOLOGY A LITTLE REAL ANALYSIS AND TOPOLOGY 1. NOTATION Before we begin some notational definitions are useful. (1) Z = {, 3, 2, 1, 0, 1, 2, 3, }is the set of integers. (2) Q = { a b : aεz, bεz {0}} is the set

More information

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1 Chapter 2 Probability measures 1. Existence Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension to the generated σ-field Proof of Theorem 2.1. Let F 0 be

More information

Topology. Xiaolong Han. Department of Mathematics, California State University, Northridge, CA 91330, USA address:

Topology. Xiaolong Han. Department of Mathematics, California State University, Northridge, CA 91330, USA  address: Topology Xiaolong Han Department of Mathematics, California State University, Northridge, CA 91330, USA E-mail address: Xiaolong.Han@csun.edu Remark. You are entitled to a reward of 1 point toward a homework

More information

Stat 8112 Lecture Notes Weak Convergence in Metric Spaces Charles J. Geyer January 23, Metric Spaces

Stat 8112 Lecture Notes Weak Convergence in Metric Spaces Charles J. Geyer January 23, Metric Spaces Stat 8112 Lecture Notes Weak Convergence in Metric Spaces Charles J. Geyer January 23, 2013 1 Metric Spaces Let X be an arbitrary set. A function d : X X R is called a metric if it satisfies the folloing

More information

Convergence of latent mixing measures in nonparametric and mixture models 1

Convergence of latent mixing measures in nonparametric and mixture models 1 Convergence of latent mixing measures in nonparametric and mixture models 1 XuanLong Nguyen xuanlong@umich.edu Department of Statistics University of Michigan Technical Report 527 (Revised, Jan 25, 2012)

More information

Course 212: Academic Year Section 1: Metric Spaces

Course 212: Academic Year Section 1: Metric Spaces Course 212: Academic Year 1991-2 Section 1: Metric Spaces D. R. Wilkins Contents 1 Metric Spaces 3 1.1 Distance Functions and Metric Spaces............. 3 1.2 Convergence and Continuity in Metric Spaces.........

More information

THE SET OF RECURRENT POINTS OF A CONTINUOUS SELF-MAP ON AN INTERVAL AND STRONG CHAOS

THE SET OF RECURRENT POINTS OF A CONTINUOUS SELF-MAP ON AN INTERVAL AND STRONG CHAOS J. Appl. Math. & Computing Vol. 4(2004), No. - 2, pp. 277-288 THE SET OF RECURRENT POINTS OF A CONTINUOUS SELF-MAP ON AN INTERVAL AND STRONG CHAOS LIDONG WANG, GONGFU LIAO, ZHENYAN CHU AND XIAODONG DUAN

More information

VARIETIES OF ABELIAN TOPOLOGICAL GROUPS AND SCATTERED SPACES

VARIETIES OF ABELIAN TOPOLOGICAL GROUPS AND SCATTERED SPACES Bull. Austral. Math. Soc. 78 (2008), 487 495 doi:10.1017/s0004972708000877 VARIETIES OF ABELIAN TOPOLOGICAL GROUPS AND SCATTERED SPACES CAROLYN E. MCPHAIL and SIDNEY A. MORRIS (Received 3 March 2008) Abstract

More information

Optimal global rates of convergence for interpolation problems with random design

Optimal global rates of convergence for interpolation problems with random design Optimal global rates of convergence for interpolation problems with random design Michael Kohler 1 and Adam Krzyżak 2, 1 Fachbereich Mathematik, Technische Universität Darmstadt, Schlossgartenstr. 7, 64289

More information

The Heine-Borel and Arzela-Ascoli Theorems

The Heine-Borel and Arzela-Ascoli Theorems The Heine-Borel and Arzela-Ascoli Theorems David Jekel October 29, 2016 This paper explains two important results about compactness, the Heine- Borel theorem and the Arzela-Ascoli theorem. We prove them

More information

UNIFORMLY DISTRIBUTED MEASURES IN EUCLIDEAN SPACES

UNIFORMLY DISTRIBUTED MEASURES IN EUCLIDEAN SPACES MATH. SCAND. 90 (2002), 152 160 UNIFORMLY DISTRIBUTED MEASURES IN EUCLIDEAN SPACES BERND KIRCHHEIM and DAVID PREISS For every complete metric space X there is, up to a constant multiple, at most one Borel

More information

A PARAMETRIC MODEL FOR DISCRETE-VALUED TIME SERIES. 1. Introduction

A PARAMETRIC MODEL FOR DISCRETE-VALUED TIME SERIES. 1. Introduction tm Tatra Mt. Math. Publ. 00 (XXXX), 1 10 A PARAMETRIC MODEL FOR DISCRETE-VALUED TIME SERIES Martin Janžura and Lucie Fialová ABSTRACT. A parametric model for statistical analysis of Markov chains type

More information

Integral Jensen inequality

Integral Jensen inequality Integral Jensen inequality Let us consider a convex set R d, and a convex function f : (, + ]. For any x,..., x n and λ,..., λ n with n λ i =, we have () f( n λ ix i ) n λ if(x i ). For a R d, let δ a

More information

Approximation by polynomials with bounded coefficients

Approximation by polynomials with bounded coefficients Journal of Number Theory 27 (2007) 03 7 www.elsevier.com/locate/jnt Approximation by polynomials with bounded coefficients Toufik Zaimi Département de mathématiques, Centre universitaire Larbi Ben M hidi,

More information

Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm

Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm Chapter 13 Radon Measures Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm (13.1) f = sup x X f(x). We want to identify

More information

Problem set 1, Real Analysis I, Spring, 2015.

Problem set 1, Real Analysis I, Spring, 2015. Problem set 1, Real Analysis I, Spring, 015. (1) Let f n : D R be a sequence of functions with domain D R n. Recall that f n f uniformly if and only if for all ɛ > 0, there is an N = N(ɛ) so that if n

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and

More information

Adaptive Bayesian procedures using random series priors

Adaptive Bayesian procedures using random series priors Adaptive Bayesian procedures using random series priors WEINING SHEN 1 and SUBHASHIS GHOSAL 2 1 Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030 2 Department

More information

arxiv: v1 [math.st] 1 Aug 2007

arxiv: v1 [math.st] 1 Aug 2007 The Annals of Statistics 2006, Vol. 34, No. 6, 2897 2920 DOI: 10.1214/009053606000000911 c Institute of Mathematical Statistics, 2006 arxiv:0708.0175v1 [math.st] 1 Aug 2007 CONVERGENCE RATES FOR BAYESIAN

More information

Summary of Real Analysis by Royden

Summary of Real Analysis by Royden Summary of Real Analysis by Royden Dan Hathaway May 2010 This document is a summary of the theorems and definitions and theorems from Part 1 of the book Real Analysis by Royden. In some areas, such as

More information

arxiv: v1 [math.fa] 14 Jul 2018

arxiv: v1 [math.fa] 14 Jul 2018 Construction of Regular Non-Atomic arxiv:180705437v1 [mathfa] 14 Jul 2018 Strictly-Positive Measures in Second-Countable Locally Compact Non-Atomic Hausdorff Spaces Abstract Jason Bentley Department of

More information

The Measure Problem. Louis de Branges Department of Mathematics Purdue University West Lafayette, IN , USA

The Measure Problem. Louis de Branges Department of Mathematics Purdue University West Lafayette, IN , USA The Measure Problem Louis de Branges Department of Mathematics Purdue University West Lafayette, IN 47907-2067, USA A problem of Banach is to determine the structure of a nonnegative (countably additive)

More information

Analysis Qualifying Exam

Analysis Qualifying Exam Analysis Qualifying Exam Spring 2017 Problem 1: Let f be differentiable on R. Suppose that there exists M > 0 such that f(k) M for each integer k, and f (x) M for all x R. Show that f is bounded, i.e.,

More information

The main results about probability measures are the following two facts:

The main results about probability measures are the following two facts: Chapter 2 Probability measures The main results about probability measures are the following two facts: Theorem 2.1 (extension). If P is a (continuous) probability measure on a field F 0 then it has a

More information

ON LEFT-INVARIANT BOREL MEASURES ON THE

ON LEFT-INVARIANT BOREL MEASURES ON THE Georgian International Journal of Science... Volume 3, Number 3, pp. 1?? ISSN 1939-5825 c 2010 Nova Science Publishers, Inc. ON LEFT-INVARIANT BOREL MEASURES ON THE PRODUCT OF LOCALLY COMPACT HAUSDORFF

More information

Some topics in analysis related to Banach algebras, 2

Some topics in analysis related to Banach algebras, 2 Some topics in analysis related to Banach algebras, 2 Stephen Semmes Rice University... Abstract Contents I Preliminaries 3 1 A few basic inequalities 3 2 q-semimetrics 4 3 q-absolute value functions 7

More information

Probability and Measure

Probability and Measure Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 84 Paper 4, Section II 26J Let (X, A) be a measurable space. Let T : X X be a measurable map, and µ a probability

More information

Asymptotic Normality of Posterior Distributions for Exponential Families when the Number of Parameters Tends to Infinity

Asymptotic Normality of Posterior Distributions for Exponential Families when the Number of Parameters Tends to Infinity Journal of Multivariate Analysis 74, 4968 (2000) doi:10.1006jmva.1999.1874, available online at http:www.idealibrary.com on Asymptotic Normality of Posterior Distributions for Exponential Families when

More information

Proofs for Large Sample Properties of Generalized Method of Moments Estimators

Proofs for Large Sample Properties of Generalized Method of Moments Estimators Proofs for Large Sample Properties of Generalized Method of Moments Estimators Lars Peter Hansen University of Chicago March 8, 2012 1 Introduction Econometrica did not publish many of the proofs in my

More information

Foundations of Nonparametric Bayesian Methods

Foundations of Nonparametric Bayesian Methods 1 / 27 Foundations of Nonparametric Bayesian Methods Part II: Models on the Simplex Peter Orbanz http://mlg.eng.cam.ac.uk/porbanz/npb-tutorial.html 2 / 27 Tutorial Overview Part I: Basics Part II: Models

More information

APPROXIMATE ISOMETRIES ON FINITE-DIMENSIONAL NORMED SPACES

APPROXIMATE ISOMETRIES ON FINITE-DIMENSIONAL NORMED SPACES APPROXIMATE ISOMETRIES ON FINITE-DIMENSIONAL NORMED SPACES S. J. DILWORTH Abstract. Every ε-isometry u between real normed spaces of the same finite dimension which maps the origin to the origin may by

More information

On maxitive integration

On maxitive integration On maxitive integration Marco E. G. V. Cattaneo Department of Mathematics, University of Hull m.cattaneo@hull.ac.uk Abstract A functional is said to be maxitive if it commutes with the (pointwise supremum

More information

In N we can do addition, but in order to do subtraction we need to extend N to the integers

In N we can do addition, but in order to do subtraction we need to extend N to the integers Chapter 1 The Real Numbers 1.1. Some Preliminaries Discussion: The Irrationality of 2. We begin with the natural numbers N = {1, 2, 3, }. In N we can do addition, but in order to do subtraction we need

More information

PROBLEMS. (b) (Polarization Identity) Show that in any inner product space

PROBLEMS. (b) (Polarization Identity) Show that in any inner product space 1 Professor Carl Cowen Math 54600 Fall 09 PROBLEMS 1. (Geometry in Inner Product Spaces) (a) (Parallelogram Law) Show that in any inner product space x + y 2 + x y 2 = 2( x 2 + y 2 ). (b) (Polarization

More information

Lecture 4 Lebesgue spaces and inequalities

Lecture 4 Lebesgue spaces and inequalities Lecture 4: Lebesgue spaces and inequalities 1 of 10 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 4 Lebesgue spaces and inequalities Lebesgue spaces We have seen how

More information

Chapter 5. Measurable Functions

Chapter 5. Measurable Functions Chapter 5. Measurable Functions 1. Measurable Functions Let X be a nonempty set, and let S be a σ-algebra of subsets of X. Then (X, S) is a measurable space. A subset E of X is said to be measurable if

More information

MATHS 730 FC Lecture Notes March 5, Introduction

MATHS 730 FC Lecture Notes March 5, Introduction 1 INTRODUCTION MATHS 730 FC Lecture Notes March 5, 2014 1 Introduction Definition. If A, B are sets and there exists a bijection A B, they have the same cardinality, which we write as A, #A. If there exists

More information

Estimates for probabilities of independent events and infinite series

Estimates for probabilities of independent events and infinite series Estimates for probabilities of independent events and infinite series Jürgen Grahl and Shahar evo September 9, 06 arxiv:609.0894v [math.pr] 8 Sep 06 Abstract This paper deals with finite or infinite sequences

More information

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2) 14:17 11/16/2 TOPIC. Convergence in distribution and related notions. This section studies the notion of the so-called convergence in distribution of real random variables. This is the kind of convergence

More information

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries Chapter 1 Measure Spaces 1.1 Algebras and σ algebras of sets 1.1.1 Notation and preliminaries We shall denote by X a nonempty set, by P(X) the set of all parts (i.e., subsets) of X, and by the empty set.

More information

1.1. MEASURES AND INTEGRALS

1.1. MEASURES AND INTEGRALS CHAPTER 1: MEASURE THEORY In this chapter we define the notion of measure µ on a space, construct integrals on this space, and establish their basic properties under limits. The measure µ(e) will be defined

More information

A VERY BRIEF REVIEW OF MEASURE THEORY

A VERY BRIEF REVIEW OF MEASURE THEORY A VERY BRIEF REVIEW OF MEASURE THEORY A brief philosophical discussion. Measure theory, as much as any branch of mathematics, is an area where it is important to be acquainted with the basic notions and

More information

The 123 Theorem and its extensions

The 123 Theorem and its extensions The 123 Theorem and its extensions Noga Alon and Raphael Yuster Department of Mathematics Raymond and Beverly Sackler Faculty of Exact Sciences Tel Aviv University, Tel Aviv, Israel Abstract It is shown

More information

Existence and Uniqueness

Existence and Uniqueness Chapter 3 Existence and Uniqueness An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect

More information

APPROXIMATION OF ENTIRE FUNCTIONS OF SLOW GROWTH ON COMPACT SETS. G. S. Srivastava and Susheel Kumar

APPROXIMATION OF ENTIRE FUNCTIONS OF SLOW GROWTH ON COMPACT SETS. G. S. Srivastava and Susheel Kumar ARCHIVUM MATHEMATICUM BRNO) Tomus 45 2009), 137 146 APPROXIMATION OF ENTIRE FUNCTIONS OF SLOW GROWTH ON COMPACT SETS G. S. Srivastava and Susheel Kumar Abstract. In the present paper, we study the polynomial

More information

Integration on Measure Spaces

Integration on Measure Spaces Chapter 3 Integration on Measure Spaces In this chapter we introduce the general notion of a measure on a space X, define the class of measurable functions, and define the integral, first on a class of

More information

Scalar curvature and the Thurston norm

Scalar curvature and the Thurston norm Scalar curvature and the Thurston norm P. B. Kronheimer 1 andt.s.mrowka 2 Harvard University, CAMBRIDGE MA 02138 Massachusetts Institute of Technology, CAMBRIDGE MA 02139 1. Introduction Let Y be a closed,

More information

REVIEW OF ESSENTIAL MATH 346 TOPICS

REVIEW OF ESSENTIAL MATH 346 TOPICS REVIEW OF ESSENTIAL MATH 346 TOPICS 1. AXIOMATIC STRUCTURE OF R Doğan Çömez The real number system is a complete ordered field, i.e., it is a set R which is endowed with addition and multiplication operations

More information

ON THE COMPLETE CONVERGENCE FOR WEIGHTED SUMS OF DEPENDENT RANDOM VARIABLES UNDER CONDITION OF WEIGHTED INTEGRABILITY

ON THE COMPLETE CONVERGENCE FOR WEIGHTED SUMS OF DEPENDENT RANDOM VARIABLES UNDER CONDITION OF WEIGHTED INTEGRABILITY J. Korean Math. Soc. 45 (2008), No. 4, pp. 1101 1111 ON THE COMPLETE CONVERGENCE FOR WEIGHTED SUMS OF DEPENDENT RANDOM VARIABLES UNDER CONDITION OF WEIGHTED INTEGRABILITY Jong-Il Baek, Mi-Hwa Ko, and Tae-Sung

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Igor Prünster University of Turin, Collegio Carlo Alberto and ICER Joint work with P. Di Biasi and G. Peccati Workshop on Limit Theorems and Applications Paris, 16th January

More information

Continuity. Chapter 4

Continuity. Chapter 4 Chapter 4 Continuity Throughout this chapter D is a nonempty subset of the real numbers. We recall the definition of a function. Definition 4.1. A function from D into R, denoted f : D R, is a subset of

More information

1 Definition of the Riemann integral

1 Definition of the Riemann integral MAT337H1, Introduction to Real Analysis: notes on Riemann integration 1 Definition of the Riemann integral Definition 1.1. Let [a, b] R be a closed interval. A partition P of [a, b] is a finite set of

More information