LARGE DEVIATIONS FOR DOUBLY INDEXED STOCHASTIC PROCESSES WITH APPLICATIONS TO STATISTICAL MECHANICS

A Dissertation Presented by CHRISTOPHER L. BOUCHER

Submitted to the Graduate School of the University of Massachusetts Amherst in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

May 1998

Department of Mathematics and Statistics

A Dissertation Presented by CHRISTOPHER L. BOUCHER

Approved as to style and content by:

Richard S. Ellis, Chair
Walter Rosenkrantz, Member
Markos Katsoulakis, Member
Don Towsley, Member
Donald St. Mary, Department Head, Mathematics and Statistics

To Catherine and my parents for their love and patience

ACKNOWLEDGEMENTS

My training as a mathematician and the research that produced this dissertation have been aided by the efforts of many professors, fellow graduate students, and friends. In particular, I would like to thank Markos Katsoulakis, Walter Rosenkrantz, and Don Towsley for their service as members of my dissertation committee. John Buonaccorsi, George Knightly, and Bruce Turkington have been especially helpful teachers of mathematics and guides to the life of a mathematician. The Department of Mathematics and Statistics at the University of Massachusetts has been generous with funding and provided a stimulating atmosphere in which to work. Above all, I owe a deep debt of gratitude to my advisor, Richard S. Ellis. His guidance and inspiration have been indispensable to my graduate career, and his companionship will be greatly missed when I leave.

ABSTRACT

MAY 1998

CHRISTOPHER L. BOUCHER

B.A., TUFTS UNIVERSITY
M.A., BOSTON COLLEGE
Ph.D., UNIVERSITY OF MASSACHUSETTS AMHERST

Directed by: Professor Richard S. Ellis

The theory of large deviations studies situations in which certain probabilities involving a given stochastic process decay to zero exponentially fast. One of the aims of this dissertation is to extend this theory to the setting in which the stochastic processes under consideration are indexed by two parameters, rather than the usual one parameter. The introduction of the second index often allows one to study more easily the large deviation asymptotics of processes with a spatial component. Such doubly indexed processes, interesting in their own right, are especially so because of their applications to a class of statistical mechanical models of fluid turbulence.

Indeed, the powerful apparatus of large deviation theory can be applied to a general statistical mechanical model via the program outlined in Chapter 2. A second aim of this dissertation is to apply our two-parameter large deviation results to a particular model of two-dimensional fluid turbulence introduced in Chapter 6.

The main probabilistic theorem in the dissertation is the large deviation principle for the doubly indexed sequence of random probability measures

$$W_{r,q}(dx \times dy) \doteq \theta(dx) \otimes \sum_{k=1}^{2^r} 1_{D_{r,k}}(x)\, L_{q,k}(dy).$$

Here $\theta$ is a probability measure on a Polish space $\mathcal{X}$, $\{D_{r,k}, k = 1, \ldots, 2^r\}$ is a dyadic partition of $\mathcal{X}$ (hence the use of $2^r$ summands) satisfying $\theta(D_{r,k}) = 1/2^r$, and $L_{q,1}, L_{q,2}, \ldots, L_{q,2^r}$ is an independent, identically distributed sequence of random probability measures on a Polish space $\mathcal{Y}$ such that $\{L_{q,k}, q \in \mathbb{N}\}$ satisfies the large deviation principle with a convex rate function. A number of related asymptotic results are also derived.

In the final two chapters of the dissertation we introduce a statistical mechanical model of two-dimensional turbulence constructed on a uniform lattice of points of the unit torus. We use a doubly indexed process closely related to $W_{r,q}$ to approximate the process which arises naturally in applying large deviation theory to this model. The two-parameter large deviation principle for the doubly indexed process then leads to the evaluation of the asymptotics of certain key statistical mechanical quantities related to the partition function and the Gibbs states.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT

Chapter
1. INTRODUCTION
2. LARGE DEVIATIONS, STATISTICAL MECHANICS AND DOUBLY INDEXED PROCESSES
3. STATEMENT OF THE LARGE DEVIATION THEOREM AND EXAMPLES
4. PROOF OF RESULTS IN CHAPTER 3
   4.1 A Number of Preliminary Lemmas
   4.2 Proof of the Large Deviation Lower Bound
   4.3 Proof of Exponential Tightness
   4.4 Proof of the Large Deviation Upper Bound
   4.5 Proof of Compact Level Sets of J
   4.6 Proof of Part (c) of Corollary
5. EXTENSIONS OF THE RESULTS IN CHAPTER 3
6. STATISTICAL EQUILIBRIUM MODEL
7. PROPERTIES OF THE HIDDEN PROCESS
   7.1 Asymptotics of the Hidden Process
   7.2 Properties of $\tilde{H}$ and $\tilde{A}$

APPENDIX: A COLLECTION OF RESULTS NEEDED IN THE DISSERTATION
BIBLIOGRAPHY

CHAPTER 1

INTRODUCTION

The central theme of this dissertation is the development of a new two-parameter approach to the theory of large deviations. This approach addresses the large deviation behavior of doubly indexed or two-parameter stochastic processes of the form $\{X_{r,q}, r, q \in \mathbb{N}\}$. In many applications, such processes involve a spatial as well as a stochastic component, limits being taken first with respect to the stochastic parameter, then with respect to the spatial parameter. The use of two parameters allows us to separate these components and makes the large deviation analysis relatively straightforward. This leads to a formulation of large deviation theory in a two-parameter setting which parallels the usual formulation, and one can establish links between the two settings in situations where a doubly indexed sequence provides a good approximation to a singly indexed one.

The first part of the dissertation is devoted to outlining this approach, applying it to a particular doubly indexed sequence, and deriving a number of related asymptotic results. Once these asymptotics are understood, they are applied to the analysis of a statistical mechanical model of fluid turbulence. This will involve the approximation of singly indexed processes by doubly indexed processes and the proof of the large deviation principle (LDP) for a singly indexed process via an equivalent but more easily proved two-parameter LDP.

We begin with a general introduction to large deviations in a two-parameter setting. Given a Polish space $\mathcal{Y}$, we denote by $\mathcal{P}(\mathcal{Y})$ the set of probability measures on $\mathcal{Y}$. For $A \subset \mathcal{Y}$ and $f$ a function mapping $\mathcal{Y}$ into $\mathbb{R}$, we use the notation $f(A) \doteq \inf_{y \in A} f(y)$.

Definition 1.1 A rate function $I$ on $\mathcal{Y}$ is a function mapping $\mathcal{Y}$ into $[0, \infty]$ which has compact level sets. That is, for each $M \in \mathbb{R}$, the set $\{y \in \mathcal{Y} : I(y) \le M\}$ is a compact subset of $\mathcal{Y}$.

The definition of the two-parameter LDP is the natural analogue of the corresponding definition in the usual setting. The sequences which appear below, $\{a_r, r \in \mathbb{N}\}$ and $\{b_{r,q}, r, q \in \mathbb{N}\}$, are sequences of positive numbers such that $\lim_{q \to \infty} b_{r,q} = \infty$ for each $r \in \mathbb{N}$.

Definition 1.2 A doubly indexed sequence of random variables $\{X_{r,q}, r, q \in \mathbb{N}\}$ taking values in $\mathcal{Y}$ is said to satisfy the two-parameter large deviation upper bound on $\mathcal{Y}$ with rate function $I$ and scaling sequence $\{a_r b_{r,q}, r, q \in \mathbb{N}\}$ if for any closed subset $F$ of $\mathcal{Y}$

$$\limsup_{r \to \infty} \limsup_{q \to \infty} \frac{1}{a_r b_{r,q}} \log P\{X_{r,q} \in F\} \le -I(F). \tag{1.1}$$

The sequence $\{X_{r,q}, r, q \in \mathbb{N}\}$ is said to satisfy the two-parameter large deviation lower bound on $\mathcal{Y}$ with rate function $I$ and scaling sequence $\{a_r b_{r,q}, r, q \in \mathbb{N}\}$ if for any open subset $G$ of $\mathcal{Y}$

$$\liminf_{r \to \infty} \liminf_{q \to \infty} \frac{1}{a_r b_{r,q}} \log P\{X_{r,q} \in G\} \ge -I(G). \tag{1.2}$$

If $\{X_{r,q}, r, q \in \mathbb{N}\}$ satisfies both (1.1) and (1.2), then it is said to satisfy the two-parameter large deviation principle on $\mathcal{Y}$ with rate function $I$ and scaling sequence $\{a_r b_{r,q}, r, q \in \mathbb{N}\}$.

In many settings including applications to statistical mechanics, one is interested in the asymptotics of normalized logarithms of certain expectations involving the stochastic process in question. If the large deviation behavior of the process is known, then these asymptotics are consequences of the Laplace principle, stated here in the two-parameter setting.

Definition 1.3 A doubly indexed sequence of random variables $\{X_{r,q}, r, q \in \mathbb{N}\}$ taking values in $\mathcal{Y}$ is said to satisfy the two-parameter Laplace principle upper bound on $\mathcal{Y}$ with rate function $I$ and scaling sequence $\{a_r b_{r,q}, r, q \in \mathbb{N}\}$ if for any bounded continuous function $h$ mapping $\mathcal{Y}$ into $\mathbb{R}$

$$\limsup_{r \to \infty} \limsup_{q \to \infty} \frac{1}{a_r b_{r,q}} \log E\{\exp[a_r b_{r,q}\, h(X_{r,q})]\} \le \sup_{y \in \mathcal{Y}} \{h(y) - I(y)\}. \tag{1.3}$$

The sequence $\{X_{r,q}\}$ is said to satisfy the two-parameter Laplace principle lower bound on $\mathcal{Y}$ with rate function $I$ and scaling sequence $\{a_r b_{r,q}, r, q \in \mathbb{N}\}$ if for any bounded continuous function $h$ mapping $\mathcal{Y}$ into $\mathbb{R}$

$$\liminf_{r \to \infty} \liminf_{q \to \infty} \frac{1}{a_r b_{r,q}} \log E\{\exp[a_r b_{r,q}\, h(X_{r,q})]\} \ge \sup_{y \in \mathcal{Y}} \{h(y) - I(y)\}. \tag{1.4}$$

If $\{X_{r,q}, r, q \in \mathbb{N}\}$ satisfies both (1.3) and (1.4), then it is said to satisfy the two-parameter Laplace principle on $\mathcal{Y}$ with rate function $I$ and scaling sequence $\{a_r b_{r,q}, r, q \in \mathbb{N}\}$.

In many of the cases we consider, the scaling sequence will have the form $b_{r,q} \doteq q$ and $a_r \doteq 2^r$ or $a_r \doteq r$. Thus, the coefficient $b_{r,q}$ is independent of $r$ in these situations. However, in applications where we approximate a singly indexed process by a doubly indexed one, we will retain the original index (usually $n$) and introduce a new index, $r$. The scaling $1/n$ which appears in many problems in large deviation theory will be written in the form $1/(a_r b_{r,n})$ with $a_r \doteq r$ or $a_r \doteq 2^r$ and $b_{r,n} \doteq n/a_r$.
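The Laplace principle can be checked numerically in a toy case. The sketch below is only an illustration under assumptions that are not taken from the dissertation: $X_{r,q}$ is the sample mean of $2^r q$ fair coin flips (equivalently, the average of $2^r$ independent block means of $q$ flips each), the scaling sequence is $a_r b_{r,q} = 2^r q$, and the rate function is the Cramér rate function $I(y) = y \log y + (1-y)\log(1-y) + \log 2$ on $[0,1]$. The function names are hypothetical.

```python
import math

def log_binomial_pmf(n, k):
    """log P{S_n = k} for S_n ~ Binomial(n, 1/2), computed via log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            - n * math.log(2.0))

def rate_function(y):
    """Cramer rate function of the sample mean of fair coin flips (assumed example)."""
    def xlogx(t):
        return 0.0 if t <= 0.0 else t * math.log(t)
    return xlogx(y) + xlogx(1.0 - y) + math.log(2.0)

def scaled_log_expectation(h, r, q):
    """(1 / (2^r q)) * log E exp[2^r q * h(X_{r,q})], computed exactly."""
    n = (2 ** r) * q
    log_terms = [log_binomial_pmf(n, k) + n * h(k / n) for k in range(n + 1)]
    top = max(log_terms)  # log-sum-exp trick for numerical stability
    return (top + math.log(sum(math.exp(t - top) for t in log_terms))) / n

def variational_value(h, grid_points=20001):
    """Grid approximation of sup over y in [0, 1] of h(y) - I(y)."""
    ys = [i / (grid_points - 1) for i in range(grid_points)]
    return max(h(y) - rate_function(y) for y in ys)

if __name__ == "__main__":
    h = lambda y: math.sin(3.0 * y)  # a bounded continuous test function
    for r, q in [(2, 50), (4, 100), (6, 200)]:
        print(r, q, scaled_log_expectation(h, r, q), variational_value(h))
```

As $2^r q$ grows, the first printed value should approach the second. For the linear choice $h(y) = y$ both sides equal $\log((1+e)/2)$, which gives a closed-form check of the sketch.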

Often, one is interested in the asymptotics of a sequence of random variables $\{X_n, n \in \mathbb{N}\}$ indexed by a single parameter. Varadhan's Theorem and its converse (Theorems 1.2.1 and 1.2.3 in [9]) state the equivalence of the large deviation principle and the Laplace principle in the one-parameter setting. Two-parameter versions of these results give the equivalence of the two-parameter large deviation principle and the two-parameter Laplace principle. This equivalence is stated in the next proposition; for comments on its proof, see Proposition A.9.

Proposition 1.4 Let $\{a_r, r \in \mathbb{N}\}$ and $\{b_{r,q}, r, q \in \mathbb{N}\}$ be two sequences of positive numbers such that $\lim_{q \to \infty} b_{r,q} = \infty$ for each $r \in \mathbb{N}$ and let $\{X_{r,q}, r, q \in \mathbb{N}\}$ be a doubly indexed stochastic process taking values in the Polish space $\mathcal{Y}$. Then $\{X_{r,q}, r, q \in \mathbb{N}\}$ satisfies the two-parameter large deviation principle on $\mathcal{Y}$ with rate function $I$ and scaling sequence $\{a_r b_{r,q}, r, q \in \mathbb{N}\}$ if and only if it satisfies the two-parameter Laplace principle on $\mathcal{Y}$ with rate function $I$ and scaling sequence $\{a_r b_{r,q}, r, q \in \mathbb{N}\}$.

In cases in which our objective is to prove the LDP for a singly indexed sequence of stochastic processes $\{X_n, n \in \mathbb{N}\}$ via an equivalent two-parameter LDP, our strategy will involve introducing a new index $r$ and constructing an approximating doubly indexed sequence $\{X_{n,r}\}$. We will require that the approximation be superexponentially close to the original sequence in the sense of the following definition.

Definition 1.5 Let $\{X_n, n \in \mathbb{N}\}$ and $\{X_{n,r}, n, r \in \mathbb{N}\}$ be sequences of random variables taking values in the Polish space $\mathcal{Y}$ equipped with metric $d(\cdot, \cdot)$. $\{X_n\}$ and $\{X_{n,r}\}$ are superexponentially close if for any $\delta > 0$

$$\limsup_{r \to \infty} \limsup_{n \to \infty} \frac{1}{n} \log P\{d(X_n, X_{n,r}) > \delta\} = -\infty.$$

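By Corollary 1.2.5 in [9] it suffices to prove the Laplace principle bounds for bounded, Lipschitz continuous functions $h$. Thus, if $\{X_n\}$ and $\{X_{n,r}\}$ are superexponentially close, the following proposition implies that the usual LDP for $X_n$ is a consequence of the two-parameter LDP for $X_{n,r}$.

Proposition 1.6 If $\{X_n\}$ and $\{X_{n,r}\}$ are superexponentially close, then for any bounded, Lipschitz continuous function $h$ mapping $\mathcal{Y}$ into $\mathbb{R}$

$$\limsup_{n \to \infty} \frac{1}{n} \log E\{\exp[-n h(X_n)]\} \le \limsup_{r \to \infty} \limsup_{n \to \infty} \frac{1}{n} \log E\{\exp[-n h(X_{n,r})]\} \tag{1.5}$$

and

$$\liminf_{n \to \infty} \frac{1}{n} \log E\{\exp[-n h(X_n)]\} \ge \liminf_{r \to \infty} \liminf_{n \to \infty} \frac{1}{n} \log E\{\exp[-n h(X_{n,r})]\}. \tag{1.6}$$

Proof: Let $h : \mathcal{Y} \to \mathbb{R}$ be a bounded, Lipschitz continuous function with Lipschitz constant $M$, and let $\varepsilon > 0$ be given. We choose $\delta > 0$ so that $\delta < \varepsilon/M$ and then choose $R \in \mathbb{N}$ so that for all $r \ge R$

$$\limsup_{n \to \infty} \frac{1}{n} \log P\{d(X_n, X_{n,r}) > \delta\} \le -2\|h\|_\infty.$$

We have for $r \ge R$

$$\frac{1}{n} \log E\{\exp[-n h(X_n)]\} = \frac{1}{n} \log \Big( E\big\{\exp[-n h(X_n)]\, 1_{\{d(X_n, X_{n,r}) \le \delta\}}\big\} + E\big\{\exp[-n h(X_n)]\, 1_{\{d(X_n, X_{n,r}) > \delta\}}\big\} \Big)$$

$$\le \frac{1}{n} \log \Big( E\{\exp[-n h(X_{n,r}) + n M \delta]\} + e^{n \|h\|_\infty} P\{d(X_n, X_{n,r}) > \delta\} \Big).$$

Thus,

$$\limsup_{n \to \infty} \frac{1}{n} \log E\{\exp[-n h(X_n)]\} \le \max\Big\{ M\delta + \limsup_{n \to \infty} \frac{1}{n} \log E\{\exp[-n h(X_{n,r})]\}\, ;\ \|h\|_\infty + \limsup_{n \to \infty} \frac{1}{n} \log P\{d(X_n, X_{n,r}) > \delta\} \Big\}$$

$$= M\delta + \limsup_{n \to \infty} \frac{1}{n} \log E\{\exp[-n h(X_{n,r})]\} < \varepsilon + \limsup_{n \to \infty} \frac{1}{n} \log E\{\exp[-n h(X_{n,r})]\}.$$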

Since this display holds for all $r \ge R$ and $\varepsilon > 0$ is arbitrary, the proof of (1.5) is complete. The proof of (1.6) is similar.

Now that we have laid out some of the generalities concerning the theory of large deviations for doubly indexed stochastic processes, we turn to the body of the dissertation. We will undertake an investigation of a specific class of such processes arising in a statistical mechanical model of fluid turbulence and analyze this model using two-parameter large deviations. Chapters 2-5 address this class of processes while Chapters 6 and 7 apply these large deviation results to our model. We now outline this program.

Chapter 2: Statistical Mechanics and Doubly Indexed Processes. Here we introduce a class of doubly indexed, measure-valued stochastic processes, $W_{r,q}$, a close analogue of which will arise in a statistical mechanical model of two-dimensional turbulence discussed in Chapters 6 and 7. A heuristic argument is presented for the two-parameter LDP satisfied by $W_{r,q}$, and a systematic procedure for applying large deviations to statistical mechanical models is discussed in general terms. This leads us to conclude the chapter with a discussion of the role of $W_{r,q}$ in the analysis of the model.

Chapter 3: Statement of the Large Deviation Theorem and Examples. After introducing a more general version of $W_{r,q}$, we define the rate function with which it satisfies the two-parameter LDP. This LDP is the central probabilistic result in the dissertation and is stated in this chapter along with some related asymptotic results. The chapter concludes by considering some examples of stochastic processes which are special cases of the general class of processes introduced in the beginning

of the chapter.

Chapter 4: Proofs of Results in Chapter 3. This chapter is broken into six sections, the first five of which prove components of the two-parameter LDP for $W_{r,q}$. The final section proves one of the related asymptotic results from Chapter 3 regarding weak limits of certain probability measures defined in terms of $W_{r,q}$.

Chapter 5: Extensions of Results in Chapter 3. In this chapter, we introduce $W_{n,r}$, a slight variation of $W_{r,q}$, which is tailored to our statistical mechanical model. We go through the proofs in Chapter 4 step by step, pointing out the modifications needed to extend the two-parameter LDP in Chapter 3 to $W_{n,r}$.

Chapter 6: Statistical Equilibrium Model. It is in this chapter that the central statistical mechanical result is stated. We present the statistical mechanical model of fluid turbulence that we wish to treat, recall the procedure by which one can apply the theory of large deviations to this model, and carry out this procedure. This leads to the evaluation of the asymptotics of certain key statistical mechanical quantities and the determination of a variational equation satisfied by the equilibrium states of the system.

Chapter 7: Properties of the Hidden Process. This chapter provides proofs of several crucial components of our analysis of the model. In particular, we prove that a special case of the class of processes $W_{r,q}$ provides a good approximation to the hidden process associated with the model and that we have represented the conserved quantities as functionals of this hidden process.

CHAPTER 2

LARGE DEVIATIONS, STATISTICAL MECHANICS AND DOUBLY INDEXED PROCESSES

One of the main contributions of the theory of large deviations to statistical mechanics is the systematization of a procedure for the asymptotic evaluation of key statistical mechanical quantities in terms of variational formulas over sets of macrostates. These applications, well known in the study of spin systems such as the Ising model, are explained in that context in [] as well as in numerous other references. Recent applications arising in the statistical mechanics of turbulence can also be treated via the theory of large deviations in a related but novel way. The analysis of such statistical mechanical models is greatly facilitated by using a class of doubly indexed processes, whose large deviation principles it is our aim to derive. The double indexing is not merely a mathematical contrivance, but reflects fundamental multiscale aspects of the models under consideration.

In order to put flesh on this skeleton of assertions, let us consider a special case of the doubly indexed processes with which we will deal. A close analogue of this process arises in the analysis of a specific model of two-dimensional turbulence that is studied in Chapters 6 and 7. The asymptotics of this model lead to a rigorous derivation of maximum entropy principles arising in the well known Miller-Robert

theory of two-dimensional turbulence as well as in a modification of that theory recently proposed by Turkington [28].

Let $T^2$ denote the unit square $[0,1) \times [0,1)$ with periodic boundary conditions. For $u$ and $v$ in $\mathbb{N}$ we set $r \doteq 2u$, $q \doteq 2^{2v}$, and $n \doteq 2^r q$ and consider a regular dyadic partition of $T^2$ into $2^r$ squares $D_{r,k}$, called macrocells, each having area $1/2^r$. We also consider a regular dyadic partition of each $D_{r,k}$ into $q = n/2^r$ squares, called microcells, each having area $1/n$. The model is defined on the sites of the uniform lattice $\mathcal{L}$ of $n$ points in $T^2$ containing the origin and having intersite spacing equal to $1/n^{1/2}$ in each coordinate direction. Thus each macrocell $D_{r,k}$ contains $q = n/2^r$ points of $\mathcal{L}$ and each microcell contains one point of $\mathcal{L}$. For $s \in \mathcal{L}$, $M(s)$ denotes the unique microcell containing $s$.

Let $\rho$ be a probability measure on $\mathbb{R}$ with bounded support $\mathcal{Y}$. The configuration space of the model is the product space $\Omega_n \doteq \mathcal{Y}^n$, and a typical configuration is denoted by $\zeta = \{\zeta(s), s \in \mathcal{L}\}$, which is referred to as the vorticity field. We denote by $P_n$ the finite product measure on $\Omega_n$ which assigns to a Borel subset $B$ of $\Omega_n$ the probability

$$P_n\{B\} \doteq \int_B \prod_{s \in \mathcal{L}} \rho(d\zeta(s)).$$

With respect to $P_n$, the coordinates $\zeta(s)$ are i.i.d. random variables with common distribution $\rho$. Write $\theta(dx)$ or $dx$ for Lebesgue measure on $T^2$ and define $\mathcal{P}_\theta(T^2 \times \mathcal{Y})$ to be the set of probability measures on $T^2 \times \mathcal{Y}$ having first marginal $\theta$. We consider the random probability measure

$$W_{r,q}(dx \times dy) = W_{r,q}(\zeta, dx \times dy) \doteq dx \otimes \sum_{k=1}^{2^r} 1_{D_{r,k}}(x)\, L_{q,k}(dy), \tag{2.1}$$

where $L_{q,k}$ is the empirical measure

$$L_{q,k}(dy) = L_{q,k}(\zeta, dy) \doteq \frac{1}{q} \sum_{s \in \mathcal{L} \cap D_{r,k}} \delta_{\zeta(s)}(dy).$$
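The construction of $W_{r,q}$ in (2.1) can be sketched in a few lines. The following is a minimal illustration, assuming (these choices are not in the dissertation) that $\rho$ is uniform on a small finite support, that macrocells are indexed by a pair `(a, b)` instead of a single index $k$, and that $W_{r,q}$ is stored as one empirical measure per macrocell. All function names are hypothetical.

```python
import random
from collections import Counter

def sample_vorticity_field(u, v, support, rng):
    """Draw zeta(s) i.i.d. (uniform on `support`, an assumed rho) on the lattice."""
    side = 2 ** (u + v)  # n = side * side = 2^r * q with r = 2u and q = 2^(2v)
    return [[rng.choice(support) for _ in range(side)] for _ in range(side)]

def macrocell_empirical_measures(zeta, u, v, support):
    """Return {macrocell (a, b): L_{q,k} as a dict y -> mass}."""
    cells_per_side = 2 ** u     # 2^r = 4^u macrocells in total
    points_per_side = 2 ** v    # each macrocell holds q = 4^v lattice sites
    q = points_per_side ** 2
    measures = {}
    for a in range(cells_per_side):
        for b in range(cells_per_side):
            rows = range(a * points_per_side, (a + 1) * points_per_side)
            cols = range(b * points_per_side, (b + 1) * points_per_side)
            counts = Counter(zeta[i][j] for i in rows for j in cols)
            measures[(a, b)] = {y: counts[y] / q for y in support}
    return measures

def w_mass(measures, macrocells, y_values):
    """W_{r,q}{B} for B = (union of the given macrocells) x (set of y values)."""
    cell_area = 1.0 / len(measures)  # each macrocell has Lebesgue measure 1 / 2^r
    return sum(cell_area * sum(measures[c][y] for y in y_values)
               for c in macrocells)

if __name__ == "__main__":
    rng = random.Random(0)
    support = (-1.0, 0.0, 1.0)
    zeta = sample_vorticity_field(1, 2, support, rng)   # r = 2, q = 16, n = 64
    W = macrocell_empirical_measures(zeta, 1, 2, support)
    print(w_mass(W, [(0, 0)], support))                 # first marginal of one macrocell
```

Because each $L_{q,k}$ is a probability measure, the mass that this sketch assigns to a macrocell times all of $\mathcal{Y}$ is the macrocell's area, which reflects the fact that $W_{r,q}$ has first marginal $\theta$.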

Thus $W_{r,q}$ assigns to a Borel subset $B$ of $T^2 \times \mathcal{Y}$ the probability

$$W_{r,q}\{B\} \doteq \sum_{k=1}^{2^r} \int_B 1_{D_{r,k}}(x)\, dx\, L_{q,k}(dy).$$

Clearly, $W_{r,q}$ takes values in $\mathcal{P}_\theta(T^2 \times \mathcal{Y})$. The two-parameter large deviation principle satisfied by $W_{r,q}$ as well as by a large class of generalizations is stated in Theorem 3.4. These generalizations are obtained by replacing $T^2$ and $\mathcal{Y}$ by arbitrary Polish spaces and $L_{q,k}$ by other random measures on $\mathcal{Y}$. Theorem 3.4 is proved in Chapter 4. This proof can be adapted to give the LDP for function space-valued processes defined by a formula like (2.1) but with $L_{q,k}(dy)$ replaced by other random variables. Such LDPs will be considered in the paper [4], where they are applied to prove the LDP for a new class of processes.

The following heuristic calculation motivates the two-parameter large deviation principle for $W_{r,q}$ and suggests the form of the rate function in this particular case. The proofs of the large deviation lower bound and upper bound in the general context of Theorem 3.4 will follow this line of reasoning. We denote by $R(\cdot \mid \cdot)$ the relative entropy and by $\mathcal{P}(\mathcal{Y})$ the set of probability measures on $\mathcal{Y}$. Let $\tau_1, \ldots, \tau_{2^r}$ be probability measures on $\mathcal{Y}$ and suppose that $\mu \in \mathcal{P}_\theta(T^2 \times \mathcal{Y})$ has the form

$$\mu(dx \times dy) = dx \otimes \tau(x, dy), \quad \text{where} \quad \tau(x, dy) \doteq \sum_{k=1}^{2^r} 1_{D_{r,k}}(x)\, \tau_k(dy);$$

i.e., for any Borel subset $B$ of $T^2 \times \mathcal{Y}$

$$\mu\{B\} = \int_{T^2 \times \mathcal{Y}} 1_B(x, y)\, dx \otimes \tau(x, dy).$$

By Sanov's Theorem, which is stated in Proposition A.1, for each $k$ the sequence $\{L_{q,k}, q \in \mathbb{N}\}$ satisfies the LDP on $\mathcal{P}(\mathcal{Y})$ with rate function $R(\cdot \mid \rho)$. Since $L_{q,1}, L_{q,2}, \ldots, L_{q,2^r}$ are

independent,

$$\lim_{q \to \infty} \frac{1}{2^r q} \log P_n\{W_{r,q} \sim \mu\} \tag{2.2}$$

$$= \lim_{q \to \infty} \frac{1}{2^r q} \log P_n\{L_{q,1} \sim \tau_1, L_{q,2} \sim \tau_2, \ldots, L_{q,2^r} \sim \tau_{2^r}\}$$

$$= \frac{1}{2^r} \sum_{k=1}^{2^r} \lim_{q \to \infty} \frac{1}{q} \log P_n\{L_{q,k} \sim \tau_k\} \approx -\frac{1}{2^r} \sum_{k=1}^{2^r} R(\tau_k \mid \rho)$$

$$= -\sum_{k=1}^{2^r} \int_{D_{r,k}} R(\tau(x, \cdot) \mid \rho(\cdot))\, dx = -\int_{T^2} R(\tau(x, \cdot) \mid \rho(\cdot))\, dx$$

$$= -R(dx \otimes \tau(x, \cdot) \mid dx \times \rho(dy)) = -R(\mu \mid \theta \times \rho).$$

The last line is a consequence of the chain rule [9, Lem. 1.4.3(f)] and the definition of $\mu$. According to Lemma 4.2, any measure $\mu \in \mathcal{P}_\theta(T^2 \times \mathcal{Y})$ can be well approximated by a sequence of measures of the form $dx \otimes \sum_{k=1}^{2^r} 1_{D_{r,k}}(x)\, \tau_k(dy)$ as $r \to \infty$. Hence, the calculation in the last display makes it reasonable to expect that $W_{r,q}$ satisfies the two-parameter LDP, which we summarize by the notation

$$\lim_{r \to \infty} \lim_{q \to \infty} \frac{1}{2^r q} \log P_n\{W_{r,q} \sim \cdot\} = -R(\cdot \mid \theta \times \rho). \tag{2.3}$$

Such doubly indexed processes $W_{r,q}$, of considerable interest in their own right, are doubly interesting because of their applications to the statistical mechanics of turbulence. In order to explain this, it is useful to outline in some detail a systematic procedure, alluded to in the first paragraph of this chapter, for applying the theory of large deviations to the asymptotic evaluation of key statistical mechanical quantities. A more detailed discussion of this procedure can be found in [3]. Whether explicitly stated or not, this procedure is at the heart of numerous analyses of statistical mechanical models; as such, it deserves to be more widely known. The procedure applies to spin systems such as the Ising model, to models of turbulence such as will be considered in [2], and to numerous other models.
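The last two equalities of the heuristic calculation can be verified numerically when $\mathcal{Y}$ is finite. The sketch below is an illustration under assumptions: $\mathcal{Y}$ has finitely many points, measures are stored as lists of masses, and the macrocells are treated as $2^r$ atoms of mass $1/2^r$, which is harmless here because both $\mu$ and $\theta \times \rho$ are uniform in $x$ within each macrocell. The function names are hypothetical.

```python
import math

def relative_entropy(p, q):
    """R(p | q) = sum p log(p / q) for finite distributions; +inf if p is not << q."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total

def rate_of_step_kernel(taus, rho):
    """(1 / 2^r) * sum_k R(tau_k | rho), i.e. the integral of R(tau(x, .) | rho) dx."""
    return sum(relative_entropy(tau, rho) for tau in taus) / len(taus)

def joint_relative_entropy(taus, rho):
    """R(mu | theta x rho) computed directly on {macrocells} x Y."""
    cells = len(taus)
    mu = [t / cells for tau in taus for t in tau]         # mu = theta (x) tau
    ref = [ry / cells for _ in taus for ry in rho]        # theta x rho
    return relative_entropy(mu, ref)

if __name__ == "__main__":
    rho = [0.25, 0.5, 0.25]
    taus = [[0.1, 0.6, 0.3], [0.7, 0.2, 0.1], [0.25, 0.5, 0.25], [0.0, 1.0, 0.0]]
    print(rate_of_step_kernel(taus, rho), joint_relative_entropy(taus, rho))
```

The two printed numbers should agree up to rounding, which is the finite-state form of the chain rule identity used in the last line of (2.2).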

We consider a statistical mechanical model that is defined in terms of the following data.

• A sequence of configuration spaces $\{\Omega_n, n \in \mathbb{N}\}$.

• A Hamiltonian $H_n(\zeta)$ of $\zeta \in \Omega_n$ and an additional function $A_n(\zeta)$ of $\zeta \in \Omega_n$. In the case of spin systems $A_n$ could represent the interactions of the spins with an external magnetic field, while in the case of turbulence it could represent a generalized enstrophy [2]. $A_n = 0$ is allowed.

• A sequence of positive scaling constants $b_n \to \infty$.

• A probability measure $P_n$ on $\Omega_n$.

In terms of these quantities we define for each $n \in \mathbb{N}$ and $\beta \in \mathbb{R}$ the partition function

$$Z_n(\beta) \doteq \int_{\Omega_n} \exp[-\beta H_n(\zeta) - A_n(\zeta)]\, P_n(d\zeta)$$

and the Gibbs state $P_{n,\beta}$, which is the probability measure on $\Omega_n$ that assigns to a Borel subset $B$ of $\Omega_n$ the probability

$$P_{n,\beta}\{B\} \doteq \frac{1}{Z_n(\beta)} \int_B \exp[-\beta H_n(\zeta) - A_n(\zeta)]\, P_n(d\zeta).$$

For $\beta \in \mathbb{R}$ we also consider the limit

$$\varphi(\beta) \doteq \lim_{n \to \infty} \frac{1}{b_n} \log Z_n(\beta)$$

if it exists. The function $\beta \mapsto \varphi(\beta)$ is known as the specific Gibbs free energy for the model.

In order to carry out a large deviation analysis of the model, the following four items are needed.

• A Polish space $\mathcal{S}$, called the hidden space.

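• For each $n \in \mathbb{N}$ a random variable $Y_n$ mapping $\Omega_n$ into $\mathcal{S}$. The sequence $\{Y_n, n \in \mathbb{N}\}$ is called the hidden process.

• Bounded continuous functions $\tilde{H}$ and $\tilde{A}$ mapping $\mathcal{S}$ into $\mathbb{R}$ such that

$$H_n(\zeta) = b_n \tilde{H}(Y_n(\zeta)) + o(b_n) \quad \text{and} \quad A_n(\zeta) = b_n \tilde{A}(Y_n(\zeta)) + o(b_n) \tag{2.4}$$

uniformly for $\zeta \in \Omega_n$. $\tilde{H}$ and $\tilde{A}$ are called representation functions.

• A rate function $J$ on $\mathcal{S}$ such that the sequence of $P_n$-distributions of $Y_n$ satisfies the large deviation principle on $\mathcal{S}$ with scaling constants $b_n$ and rate function $J$. In other words, $J$ maps $\mathcal{S}$ into $[0, \infty]$, $J$ has compact level sets, and for any closed subset $F$ of $\mathcal{S}$ and open subset $G$ of $\mathcal{S}$

$$\limsup_{n \to \infty} \frac{1}{b_n} \log P_n\{Y_n \in F\} \le -J(F), \qquad \liminf_{n \to \infty} \frac{1}{b_n} \log P_n\{Y_n \in G\} \ge -J(G).$$

Given these items, the asymptotic behavior of the model is readily determined. To see this, let us summarize the large deviation principle for the $P_n$-distributions of $Y_n$ by the formal notation $P_n(Y_n \in dz) \asymp \exp[-b_n J(z)]\, dz$. Substituting this into the definition of $Z_n$ and using (2.4), one is led, by analogy with Laplace's method for integrals on $\mathbb{R}$, to the following formal limit, which is not difficult to justify:

$$\varphi(\beta) \doteq \lim_{n \to \infty} \frac{1}{b_n} \log Z_n(\beta) = \lim_{n \to \infty} \frac{1}{b_n} \log \int_{\Omega_n} \exp\big[-b_n\big(\beta \tilde{H}(Y_n) + \tilde{A}(Y_n)\big)\big]\, dP_n$$

$$= \lim_{n \to \infty} \frac{1}{b_n} \log \int_{\mathcal{S}} \exp\big[-b_n\big(\beta \tilde{H}(z) + \tilde{A}(z)\big)\big]\, P_n(Y_n \in dz)$$

$$= \lim_{n \to \infty} \frac{1}{b_n} \log \int_{\mathcal{S}} \exp\big[-b_n\big(\beta \tilde{H}(z) + \tilde{A}(z) + J(z)\big)\big]\, dz = \sup_{z \in \mathcal{S}} \{-\beta \tilde{H}(z) - \tilde{A}(z) - J(z)\}.$$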
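The formal limit for $\varphi(\beta)$ can be compared with an exact finite-$n$ computation in a simple spin model. The sketch below is only an illustration under assumptions that are not spelled out in the dissertation: a Curie-Weiss type model with $H_n(\zeta) = -\frac{1}{2n}\big(\sum_i \zeta_i\big)^2$, $A_n = 0$, $b_n = n$, and $P_n$ the uniform product measure on $\{-1, 1\}^n$, so that $Y_n$ is the sample mean of the spins, $\tilde{H}(z) = -z^2/2$, and $J$ is the Cramér rate function of the sample mean. The function names are hypothetical.

```python
import math

def log_partition_per_spin(beta, n):
    """(1/n) log Z_n(beta) for the assumed model, computed exactly."""
    log_terms = []
    for k in range(n + 1):             # k = number of +1 spins
        m = (2 * k - n) / n            # sample mean Y_n of the spins
        log_prob = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1) - n * math.log(2.0))
        log_terms.append(log_prob + beta * n * m * m / 2.0)  # -beta * H_n
    top = max(log_terms)               # log-sum-exp trick for stability
    return (top + math.log(sum(math.exp(t - top) for t in log_terms))) / n

def cramer_rate(z):
    """Rate function J of the mean of uniform +/-1 spins, for z in [-1, 1]."""
    def xlogx(t):
        return 0.0 if t <= 0.0 else t * math.log(t)
    return 0.5 * (xlogx(1.0 + z) + xlogx(1.0 - z))

def variational_free_energy(beta, grid_points=20001):
    """Grid approximation of sup_z { -beta * H~(z) - J(z) } with H~(z) = -z^2 / 2."""
    zs = [-1.0 + 2.0 * i / (grid_points - 1) for i in range(grid_points)]
    return max(beta * z * z / 2.0 - cramer_rate(z) for z in zs)

if __name__ == "__main__":
    for beta in (0.5, 1.5):
        print(beta, log_partition_per_spin(beta, 2000), variational_free_energy(beta))
```

For each $\beta$ the exact value of $\frac{1}{n} \log Z_n(\beta)$ should be close to the grid supremum once $n$ is large, with a gap that shrinks roughly like $1/n$ in this example.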

Similar considerations motivate the fact that for each $\beta$ the sequence of $P_{n,\beta}$-distributions of the hidden process $Y_n$ satisfies the large deviation principle on $\mathcal{S}$ with scaling constants $b_n$ and rate function

$$J_\beta(z) \doteq J(z) + \beta \tilde{H}(z) + \tilde{A}(z) - \inf_{y \in \mathcal{S}} \{J(y) + \beta \tilde{H}(y) + \tilde{A}(y)\}. \tag{2.5}$$

It follows that if $B$ is a Borel subset of $\mathcal{S}$ whose closure $\bar{B}$ has empty intersection with the 0-level set $E_\beta \doteq \{z \in \mathcal{S} : J_\beta(z) = 0\}$, then $J_\beta(\bar{B}) > 0$, and so by the large deviation upper bound

$$P_{n,\beta}\{Y_n \in B\} \le C \exp(-b_n J_\beta(\bar{B})/2) \to 0.$$

This in turn leads to the identification of $\mathcal{S}$ as the set of possible macrostates for the model and of $E_\beta$ as the set of equilibrium macrostates.

In the case of many important spin systems, the hidden space, the hidden process, the representation functions, and the large deviation principle are well known. For example, as explained in [3], in the case of the Curie-Weiss model, $\mathcal{S}$ equals $\mathbb{R}$, $Y_n$ equals the sample mean of the spins, and the large deviation principle is given by Cramér's Theorem; in the case of the Curie-Weiss-Potts model, $\mathcal{S}$ equals $\mathbb{R}^Q$ for some $Q \in \mathbb{N}$, $Y_n$ equals the empirical vector of the spins, and the large deviation principle is given by Sanov's Theorem; in the case of the $D$-dimensional Ising model, $\mathcal{S}$ equals the set of strictly stationary probability measures on $\{-1, 1\}^{\mathbb{Z}^D}$, $Y_n$ equals the empirical field of the spins, and the large deviation principle is proved in [7, 22]. In any model such as these for which the hidden space, the hidden process, and the representation functions can be identified and the large deviation principle proved, the asymptotic behavior of the model can be determined as discussed in the preceding paragraph.

In the model of two-dimensional turbulence that is presented in Chapter 6 and that is defined on the uniform lattice $\mathcal{L}$ of $T^2$ as described in Chapter 6 for a

standard choice of Hamiltonian $H_n$ and generalized enstrophy $A_n$ given in Chapter 6, the simplest choice of hidden process is the sequence of random measures

$$Y_n(dx \times dy) = Y_n(\zeta, dx \times dy) \doteq dx \otimes \sum_{s \in \mathcal{L}} 1_{M(s)}(x)\, \delta_{\zeta(s)}(dy). \tag{2.6}$$

In this case the hidden space is $\mathcal{P}_\theta(T^2 \times \mathcal{Y})$, and the representation functions are readily determined. On the other hand, the large deviation principle is by no means obvious. The desire to take advantage of the relatively simple spatial dependence of $Y_n$ and to gain insight into why this large deviation principle should hold, coupled with the importance of the application to two-dimensional turbulence, led us to develop the two-parameter techniques elucidated in the introduction.

Our approach is to prove the requisite large deviation principle for the hidden process $Y_n$, not by convex analysis, but by approximating $Y_n$ by the random measures $W_{r,q}$ in (2.1) and applying the almost intuitive two-parameter large deviation principle for $W_{r,q}$ summarized in (2.3). The approximation is straightforward. Replacing, for each $s \in \mathcal{L} \cap D_{r,k}$, the point mass $\delta_{\zeta(s)}$ by the average $\frac{1}{q} \sum_{s \in \mathcal{L} \cap D_{r,k}} \delta_{\zeta(s)} = L_{q,k}$, one expects that for large $q$ and large $r$

$$Y_n(dx \times dy) \approx dx \otimes \sum_{k=1}^{2^r} \sum_{s \in \mathcal{L} \cap D_{r,k}} 1_{M(s)}(x)\, L_{q,k}(dy) = dx \otimes \sum_{k=1}^{2^r} 1_{D_{r,k}}(x)\, L_{q,k}(dy) = W_{r,q}(dx \times dy).$$

In fact, by taking $r$ large enough, one can show that with respect to an appropriate metric on $\mathcal{P}_\theta(T^2 \times \mathcal{Y})$ the distance between $Y_n$ and $W_{r,q}$ can be made as small as desired uniformly in $q$.

The large deviation principle for a wide class of generalizations of $W_{r,q}$ is formulated in Chapter 3 together with related asymptotic results. Proofs are presented

in Chapter 4. Chapter 5 is devoted to a further generalization of the large deviation principle that is needed in [4] and other applications.
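The approximation of $Y_n$ by $W_{r,q}$ described in this chapter can be probed with a single test function. The sketch below is a rough illustration, not the metric argument of the dissertation. It assumes a vorticity field with values in $[-1, 1]$, takes the microcell $M(s)$ of the site $s = (i/\sqrt{n}, j/\sqrt{n})$ to be the square with lower-left corner $s$, and compares the integrals of $f(x, y) = g(x)\, y$ with $g(x) = \sin(2\pi x_1) + \sin(2\pi x_2)$ against both measures. Since $g$ has Lipschitz constant at most $2\pi\sqrt{2}$ and each macrocell has diameter $\sqrt{2}\, 2^{-u}$, an elementary estimate bounds the gap by $4\pi\, 2^{-u}$ for every $q$. All names are hypothetical.

```python
import math
import random

def cell_integral(a1, b1, a2, b2):
    """Exact integral of g(x) = sin(2 pi x1) + sin(2 pi x2) over [a1,b1) x [a2,b2)."""
    def s(a, b):
        return (math.cos(2 * math.pi * a) - math.cos(2 * math.pi * b)) / (2 * math.pi)
    return (b2 - a2) * s(a1, b1) + (b1 - a1) * s(a2, b2)

def integrate_against_Yn(zeta):
    """Integral of f(x, y) = g(x) * y against Y_n: each site keeps its own point mass."""
    side = len(zeta)
    h = 1.0 / side
    return sum(zeta[i][j] * cell_integral(i * h, (i + 1) * h, j * h, (j + 1) * h)
               for i in range(side) for j in range(side))

def integrate_against_Wrq(zeta, u):
    """Same integral against W_{r,q}, r = 2u: point masses become macrocell averages."""
    side = len(zeta)
    cells = 2 ** u
    pts = side // cells              # lattice sites per macrocell side
    width = 1.0 / cells
    total = 0.0
    for a in range(cells):
        for b in range(cells):
            mean = sum(zeta[i][j]
                       for i in range(a * pts, (a + 1) * pts)
                       for j in range(b * pts, (b + 1) * pts)) / (pts * pts)
            total += mean * cell_integral(a * width, (a + 1) * width,
                                          b * width, (b + 1) * width)
    return total

def lipschitz_bound(u):
    """Crude bound 4 pi / 2^u on the gap, valid when |zeta(s)| <= 1."""
    return 4.0 * math.pi / 2 ** u

if __name__ == "__main__":
    rng = random.Random(1)
    for u in (1, 2, 3):
        side = 2 ** (u + 2)          # v = 2, so q = 16 sites per macrocell
        zeta = [[rng.uniform(-1.0, 1.0) for _ in range(side)] for _ in range(side)]
        gap = abs(integrate_against_Yn(zeta) - integrate_against_Wrq(zeta, u))
        print(u, gap, lipschitz_bound(u))
```

The bound depends on $u$ (hence on $r$) but not on $v$ (hence not on $q$), which matches the uniformity in $q$ asserted above, at least for this one test function.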

CHAPTER 3

STATEMENT OF THE LARGE DEVIATION THEOREM AND EXAMPLES

We now formulate the large deviation principle and related asymptotic results for extensive generalizations of the random measures $W_{r,q}$ defined in (2.1) (see also [3]). Let $(\Omega, \mathcal{F}, P)$ be a probability space, $\mathcal{Y}$ a Polish space, $\mathcal{P}(\mathcal{Y})$ the space of probability measures on $\mathcal{Y}$ with the topology of weak convergence, and $I$ a convex rate function on $\mathcal{P}(\mathcal{Y})$. We assume that $\{L_q, q \in \mathbb{N}\}$ is a sequence of random variables mapping $\Omega$ into $\mathcal{P}(\mathcal{Y})$ which satisfies the large deviation principle with rate function $I$. Thus $I$ maps $\mathcal{P}(\mathcal{Y})$ into $[0, \infty]$; for each $M \in [0, \infty)$ the set $\{\gamma \in \mathcal{P}(\mathcal{Y}) : I(\gamma) \le M\}$ is compact (compact level sets); for any closed subset $F$ of $\mathcal{P}(\mathcal{Y})$

$$\limsup_{q \to \infty} \frac{1}{q} \log P\{L_q \in F\} \le -I(F);$$

and for any open subset $G$ of $\mathcal{P}(\mathcal{Y})$

$$\liminf_{q \to \infty} \frac{1}{q} \log P\{L_q \in G\} \ge -I(G).$$

A basic example of such a sequence, and the one that appears in the application to two-dimensional turbulence in Chapters 6 and 7, is the sequence $L_q$ of empirical measures of i.i.d. random variables $\zeta_i$ taking values in $\mathcal{Y}$; thus, $L_q \doteq \frac{1}{q} \sum_{i=1}^{q} \delta_{\zeta_i}$. This and other examples will be discussed in Example 3.7. We also introduce, for

each $r \in \mathbb{N}$, a sequence $L_{q,1}, L_{q,2}, \ldots, L_{q,2^r}$ of $2^r$ independent random variables mapping $\Omega$ into $\mathcal{P}(\mathcal{Y})$, each having the same distribution as $L_q$. Finally, let $\mathcal{X}$ be a Polish space, $\theta$ a probability measure on $\mathcal{X}$, and $\Xi_r \doteq \{D_{r,k}, k = 1, \ldots, 2^r\}$ a dyadic partition of $\mathcal{X}$ satisfying the following condition.

Condition 3.1. For each $r \in \mathbb{N}$
(i) $\theta\{D_{r,k}\} = 1/2^r$.
(ii) $\Xi_{r+1}$ is a refinement of $\Xi_r$ in the sense that $D_{r,k} = D_{r+1,2k-1} \cup D_{r+1,2k}$.
(iii) $\lim_{r \to \infty} \max_{k \in \{1, \ldots, 2^r\}} \operatorname{diam}(D_{r,k}) = 0$.
(iv) $\theta\{\partial D_{r,k}\} = 0$.

Part (i) of this condition states that $\Xi_r$ is an equivolume partition. Part (ii) is needed in order to prove Lemma 4.2, which uses a martingale argument to derive a key approximation property of certain measures. In applications in turbulence, part (ii) reflects the natural way of constructing a sequence of lattice models, where each lattice is a refinement of its predecessor. We use part (iii) of Condition 3.1 to prove Lemma 4.1, which allows us to approximate an arbitrary closed set in $\mathcal{X}$ by sets in the $\sigma$-fields generated by the partitions $\Xi_r$. Finally, part (iv) of Condition 3.1 is needed in Section 4.5 to prove that the function $J$ defined in Definition 3.3 is lower semicontinuous.

Let $\theta$ be Lebesgue measure on $\mathcal{X} \doteq [0, 1)$. For $r \in \mathbb{N}$, taking $D_{r,k} \doteq [(k-1)/2^r, k/2^r)$, $1 \le k \le 2^r$, gives an example of a partition satisfying Condition 3.1.

Remark 3.2 By Lemma 4.3 the assumptions that $L_{q,1}, \ldots, L_{q,2^r}$ are i.i.d. copies of $L_q$ and that $\{L_q, q \in \mathbb{N}\}$ satisfies the LDP on $\mathcal{P}(\mathcal{Y})$ with rate function $I$ guarantee

that for each $r$ the sequence $\{(L_{q,1}, \ldots, L_{q,2^r}), q \in \mathbb{N}\}$ satisfies the LDP on $\mathcal{P}(\mathcal{Y})^{2^r}$ with rate function

$$(\nu_1, \ldots, \nu_{2^r}) \mapsto \sum_{k=1}^{2^r} I(\nu_k).$$

All that is needed in the sequel is this LDP for $\{(L_{q,1}, \ldots, L_{q,2^r}), q \in \mathbb{N}\}$ for each $r$. While this is true under much weaker hypotheses on $L_{q,k}$, we have adopted these assumptions to avoid overcomplicating the exposition.

The process whose asymptotics we wish to analyze is the doubly indexed sequence of random probability measures on $\mathcal{X} \times \mathcal{Y}$ given by

$$W_{r,q}(dx \times dy) \doteq \theta(dx) \otimes \sum_{k=1}^{2^r} 1_{D_{r,k}}(x)\, L_{q,k}(dy). \tag{3.1}$$

In other words, $W_{r,q}$ maps $\Omega$ into $\mathcal{P}(\mathcal{X} \times \mathcal{Y})$ and assigns to a Borel subset $B$ of $\mathcal{X} \times \mathcal{Y}$ the probability

$$W_{r,q}\{B\} \doteq \sum_{k=1}^{2^r} \int_B 1_{D_{r,k}}(x)\, \theta(dx)\, L_{q,k}(dy).$$

The sum in (3.1) defines a stochastic kernel $\tau(x, dy)$ on $\mathcal{Y}$ given $\mathcal{X}$. In other words, $\tau(x, dy)$ is a family of probability measures on $\mathcal{Y}$ indexed by $x \in \mathcal{X}$, and for each Borel subset $B$ of $\mathcal{Y}$ the mapping

$$x \mapsto \tau(x, B) = \sum_{k=1}^{2^r} 1_{D_{r,k}}(x)\, L_{q,k}\{B\}$$

is measurable. In order to avoid the complications involved in working with a space of stochastic kernels, the measure $\theta$ has been included in the definition of $W_{r,q}$. We denote by $\mathcal{P}_\theta(\mathcal{X} \times \mathcal{Y})$ the closed subset of $\mathcal{P}(\mathcal{X} \times \mathcal{Y})$ consisting of measures with first marginal equal to $\theta$. $W_{r,q}$ takes values in $\mathcal{P}_\theta(\mathcal{X} \times \mathcal{Y})$. In Chapter 4 we introduce a standard metric that makes $\mathcal{P}(\mathcal{X} \times \mathcal{Y})$ and $\mathcal{P}_\theta(\mathcal{X} \times \mathcal{Y})$ Polish spaces.

The LDP satisfied by $W_{r,q}$ is stated in Theorem 3.4. The formal calculation given in (2.2) can easily be generalized to motivate this LDP and suggest the form

of the rate function. Let $\tau_1, \ldots, \tau_{2^r}$ be probability measures on $Y$ and suppose that $\mu \in \mathcal{P}_\theta(X \times Y)$ has the form $\mu(dx \times dy) = \theta(dx) \otimes \tau(x, dy)$, where
\[ \tau(x, dy) \doteq \sum_{k=1}^{2^r} 1_{D_{r,k}}(x)\, \tau_k(dy). \]
Since $L_{q,1}, \ldots, L_{q,2^r}$ are i.i.d. copies of $L_q$ and $\{L_q, q \in \mathbb{N}\}$ satisfies the LDP on $\mathcal{P}(Y)$ with rate function $I$, we obtain as in (2.2)
\[ \lim_{q \to \infty} \frac{1}{2^r q} \log P\{W_{r,q} \sim \mu\} \approx -\frac{1}{2^r} \sum_{k=1}^{2^r} I(\tau_k) = -\sum_{k=1}^{2^r} \int_{D_{r,k}} I(\tau(x, \cdot))\, \theta(dx) = -\int_X I(\tau(x, \cdot))\, \theta(dx). \]
Since any $\mu \in \mathcal{P}_\theta(X \times Y)$ can be well approximated by a sequence of measures of the form $\theta(dx) \otimes \sum_{k=1}^{2^r} 1_{D_{r,k}}(x)\, \tau_k(dy)$ as $r \to \infty$ [Lemma 4.2], it is reasonable to expect $W_{r,q}$ to satisfy the two-parameter LDP with scaling sequence $\{2^r q,\ r, q \in \mathbb{N}\}$, the rate function for which we next define.

For any $\mu \in \mathcal{P}_\theta(X \times Y)$ there exists a stochastic kernel $\tau(x, dy)$ on $Y$ given $X$ such that $\mu(dx \times dy) = \theta(dx) \otimes \tau(x, dy)$ [9, Thm. A.5.4]. The definition of the rate function for $W_{r,q}$ uses this decomposition, which we summarize as $\mu = \theta \otimes \tau$.

Definition 3.3. Let $I$ denote the convex rate function for $\{L_q\}$ on $\mathcal{P}(Y)$. Given $\mu = \theta \otimes \tau \in \mathcal{P}_\theta(X \times Y)$ define
\[ J(\mu) \doteq \int_X I(\tau(x, \cdot))\, \theta(dx). \]

$J(\mu)$ is well-defined since the mapping $x \in X \mapsto \tau(x, \cdot) \in \mathcal{P}(Y)$ is measurable [9, Thm. A.5.2] and $I$ is nonnegative and lower semicontinuous. Clearly $J$ is nonnegative, and because $I$ is convex, $J$ is convex. We will prove in general, by an indirect argument, that $J$ has compact level sets. It is possible to see this directly in several cases. For example, assume that $L_{q,1}, L_{q,2}, \ldots, L_{q,2^r}$ are i.i.d. copies of

the empirical measure $L_q \doteq \frac{1}{q} \sum_{i=1}^{q} \delta_{\zeta_i}$ of i.i.d. random variables $\zeta_i$ having the common distribution $\rho$. Then
\[ J(\mu) = R(\mu \,|\, \theta \times \rho), \]
where $R(\cdot \,|\, \cdot)$ denotes the relative entropy. Thus $J$ has compact level sets since the relative entropy has this property. Details are given in Example 3.7(a).

We now state the two-parameter large deviation theorem for $W_{r,q}$. It is proved in Chapter 4.

Theorem 3.4. Let $W_{r,q}$ be defined by (3.1). We assume that $L_{q,1}, L_{q,2}, \ldots, L_{q,2^r}$ are i.i.d. copies of $L_q$, that $\{L_q, q \in \mathbb{N}\}$ satisfies the LDP on $\mathcal{P}(Y)$ with a convex rate function $I$, and that the partitions $\Xi_r = \{D_{r,k}\}$ satisfy Condition 3.1. Then the function $J$ defined in Definition 3.3 is a convex rate function. Furthermore, the sequence $W_{r,q}$ satisfies the two-parameter LDP on $\mathcal{P}_\theta(X \times Y)$ with scaling sequence $\{2^r q,\ r, q \in \mathbb{N}\}$ and rate function $J$. That is, for any closed subset $F$ of $\mathcal{P}_\theta(X \times Y)$
\[ \limsup_{r \to \infty} \limsup_{q \to \infty} \frac{1}{2^r q} \log P\{W_{r,q} \in F\} \le -J(F), \]
and for any open subset $G$ of $\mathcal{P}_\theta(X \times Y)$
\[ \liminf_{r \to \infty} \liminf_{q \to \infty} \frac{1}{2^r q} \log P\{W_{r,q} \in G\} \ge -J(G). \]

By Proposition 1.4, this two-parameter LDP is equivalent to a corresponding two-parameter Laplace principle. We state this Laplace principle next for easy reference; it will be needed in the application to two-dimensional turbulence treated in Chapters 6 and 7.

Corollary 3.5. Under the same conditions as in Theorem 3.4, the sequence $W_{r,q}$ satisfies the two-parameter Laplace principle on $\mathcal{P}_\theta(X \times Y)$ with rate function $J$ in

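the following sense. For any bounded continuous function $h$ mapping $\mathcal{P}_\theta(X \times Y)$ into $\mathbb{R}$
\[ \lim_{r \to \infty} \lim_{q \to \infty} \frac{1}{2^r q} \log \int_{\mathcal{P}_\theta(X \times Y)} \exp[2^r q\, h(\mu)]\, P(W_{r,q} \in d\mu) = \sup_{\mu \in \mathcal{P}_\theta(X \times Y)} \{h(\mu) - J(\mu)\}. \]

Another corollary of Theorem 3.4 that will also be needed in the application to two-dimensional turbulence is the LDP for a sequence of measures on $\mathcal{P}_\theta(X \times Y)$ defined in terms of the distributions of $W_{r,q}$. Let $\Phi$ be a bounded continuous function mapping $\mathcal{P}_\theta(X \times Y)$ into $\mathbb{R}$ and consider the sequence of probability measures $\{\Pi_{r,q},\ r \in \mathbb{N}, q \in \mathbb{N}\}$ on $\mathcal{P}_\theta(X \times Y)$ that assign to a Borel subset $B$ of $\mathcal{P}_\theta(X \times Y)$ the probability
\[ \Pi_{r,q}\{B\} \doteq \frac{\int_B \exp[2^r q\, \Phi(\mu)]\, P(W_{r,q} \in d\mu)}{\int_{\mathcal{P}_\theta(X \times Y)} \exp[2^r q\, \Phi(\mu)]\, P(W_{r,q} \in d\mu)}. \tag{3.2} \]
From Corollary 3.5 we can easily derive the two-parameter Laplace principle for $\Pi_{r,q}$ with scaling sequence $\{2^r q,\ r, q \in \mathbb{N}\}$ and rate function
\[ J_\Phi(\mu) \doteq J(\mu) - \Phi(\mu) - \inf_{\nu \in \mathcal{P}_\theta(X \times Y)} \{J(\nu) - \Phi(\nu)\}. \tag{3.3} \]
Indeed, the simple proof that $J_\Phi$ is a rate function is given in Lemma A.2, and for any bounded continuous function $h$ mapping $\mathcal{P}_\theta(X \times Y)$ into $\mathbb{R}$
\begin{align*}
&\lim_{r \to \infty} \lim_{q \to \infty} \frac{1}{2^r q} \log \int_{\mathcal{P}_\theta(X \times Y)} \exp[2^r q\, h(\mu)]\, \Pi_{r,q}(d\mu) \\
&\quad = \lim_{r \to \infty} \lim_{q \to \infty} \frac{1}{2^r q} \log \int_{\mathcal{P}_\theta(X \times Y)} \exp[2^r q\, (h(\mu) + \Phi(\mu))]\, P(W_{r,q} \in d\mu) \\
&\qquad - \lim_{r \to \infty} \lim_{q \to \infty} \frac{1}{2^r q} \log \int_{\mathcal{P}_\theta(X \times Y)} \exp[2^r q\, \Phi(\mu)]\, P(W_{r,q} \in d\mu) \\
&\quad = \sup_{\mu \in \mathcal{P}_\theta(X \times Y)} \{h(\mu) + \Phi(\mu) - J(\mu)\} - \sup_{\nu \in \mathcal{P}_\theta(X \times Y)} \{\Phi(\nu) - J(\nu)\} \\
&\quad = \sup_{\mu \in \mathcal{P}_\theta(X \times Y)} \{h(\mu) - J_\Phi(\mu)\}.
\end{align*}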
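
As a sanity check on the Laplace limit in Corollary 3.5, the following sketch works out one special case that can be computed by hand. It assumes, purely for illustration, that $L_q$ is the empirical measure of i.i.d. random variables with common distribution $\rho$ on $Y$ (so that $J(\mu) = R(\mu \,|\, \theta \times \rho)$) and that $h$ is the linear functional $h(\mu) = \int f \, d\mu$ for a fixed bounded continuous function $f$ on $X \times Y$. The symbols $f$, $f_{r,k}$, and $\zeta_i^{(k)}$ are not part of the text; they are introduced here only for this example.

```latex
% Illustrative assumptions (not from the text):
%   L_{q,k} = (1/q) \sum_{i=1}^q \delta_{\zeta_i^{(k)}}, with all \zeta_i^{(k)} i.i.d. with law \rho;
%   h(\mu) = \int f \, d\mu with f bounded and continuous on X \times Y.
\begin{align*}
f_{r,k}(y) &\doteq 2^r \int_{D_{r,k}} f(x,y)\,\theta(dx)
  && \text{(cell average, using } \theta\{D_{r,k}\} = 2^{-r}\text{)} \\
2^r q\, h(W_{r,q})
  &= 2^r q \sum_{k=1}^{2^r} \int_{D_{r,k}} \int_Y f(x,y)\, L_{q,k}(dy)\, \theta(dx)
   = \sum_{k=1}^{2^r} \sum_{i=1}^{q} f_{r,k}\bigl(\zeta_i^{(k)}\bigr) \\
\frac{1}{2^r q} \log E \exp[2^r q\, h(W_{r,q})]
  &= \frac{1}{2^r} \sum_{k=1}^{2^r} \log \int_Y e^{f_{r,k}(y)}\, \rho(dy)
  && \text{(independence; the value does not depend on } q\text{)} \\
  &\longrightarrow \int_X \log \int_Y e^{f(x,y)}\, \rho(dy)\, \theta(dx)
  && (r \to \infty)
\end{align*}
```

The last step uses part (iii) of Condition 3.1, the continuity of $f$, and bounded convergence. The limit agrees with $\sup_\mu \{h(\mu) - J(\mu)\}$: writing $\mu = \theta \otimes \tau$ and applying the Donsker-Varadhan variational formula to $\tau(x, \cdot)$ for each $x$ shows that the supremum over $\mathcal{P}_\theta(X \times Y)$ equals $\int_X \log \int_Y e^{f(x,y)} \rho(dy)\, \theta(dx)$.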

Since the two-parameter Laplace principle implies the two-parameter LDP with the same rate function, we have proved part (a) of the next corollary. Part (b) will enable us to characterize the equilibrium states of the continuum limit of the model of two-dimensional turbulence to be considered in Chapters 6 and 7. Part (c) gives information concerning the set of weak limits of the probability measures $\Pi_{r,q}$.

Corollary 3.6. Under the same conditions as in Theorem 3.4, the following conclusions hold.
(a) The sequence of measures $\Pi_{r,q}$ defined in (3.2) satisfies the two-parameter Laplace principle and LDP on $\mathcal{P}_\theta(X \times Y)$ with rate function $J_\Phi$ defined in (3.3).
(b) The set $\mathcal{E} \doteq \{\mu \in \mathcal{P}_\theta(X \times Y) : J_\Phi(\mu) = 0\}$ is a nonempty compact subset of $\mathcal{P}_\theta(X \times Y)$. Furthermore, if $B$ is a Borel subset of $\mathcal{P}_\theta(X \times Y)$ whose closure has empty intersection with $\mathcal{E}$, then
\[ \lim_{r \to \infty} \lim_{q \to \infty} \Pi_{r,q}\{B\} = 0. \]
(c) If $\mathcal{E}$ consists of a unique measure $\mu$, then there exists $N \in \mathbb{N}$ such that for all $r \ge N$, $\Pi_{r,q} \Rightarrow \delta_\mu$ as $q \to \infty$. If $\mathcal{E}$ does not consist of a unique measure, then the following is valid. There exists $N \in \mathbb{N}$ such that for all $r \ge N$ and any subsequence of $q \in \mathbb{N}$, there exists a subsubsequence and $\Pi_r \in \mathcal{P}(\mathcal{P}_\theta(X \times Y))$ such that $\Pi_{r,q} \Rightarrow \Pi_r$ as $q \to \infty$ along this subsubsequence. Furthermore, for any subsequence of $\{r \in \mathbb{N} : r \ge N\}$ there exists a subsubsequence and $\Pi \in \mathcal{P}(\mathcal{P}_\theta(X \times Y))$ such that $\Pi_r \Rightarrow \Pi$ as $r \to \infty$ along this subsubsequence. The measures $\Pi_r$ and $\Pi$ all have support in $\mathcal{E}$.

The assertion about $\mathcal{E}$ in part (b) of the corollary follows from the fact that $J_\Phi$ is a rate function. The second assertion in part (b) is a consequence of the fact that if the closure $\bar{B}$ of $B$ has empty intersection with $\mathcal{E}$, then $J_\Phi(\bar{B}) > 0$. Hence

by the large deviation upper bound in (a)
\[ \Pi_{r,q}\{B\} \le C \exp\bigl(-2^r q\, J_\Phi(\bar{B})/2\bigr) \to 0 \quad \text{as } q \to \infty,\ r \to \infty. \]
Part (c) of the corollary is proved in Section 4.6. An easy example in which $\mathcal{E}$ consists of a unique measure is when $\{L_q, q \in \mathbb{N}\}$ is the sequence of empirical measures of i.i.d. random variables having the common distribution $\rho$ and $\Phi$ is concave. In that case $J_\Phi$ is strictly convex and so attains its infimum of 0 at a unique point in $\mathcal{P}_\theta(X \times Y)$.

The assertion in Theorem 3.4 that $J$ is a rate function requires showing that $J$ has compact level sets. This proof, given in Section 4.5, is surprisingly complicated, and it is only in this proof that part (iv) of Condition 3.1 is needed. The main effort is required to show that $J$ is lower semicontinuous. This property together with other estimates to be obtained will yield the compactness of the level sets of $J$. Compactness of the level sets of a function governing the large deviations of a process is not needed in all aspects of the theory. For example, it is not needed to prove that an LDP implies a corresponding Laplace principle, although it is needed to show the reverse implication.

We next present four cases of processes $L_q$ for which the LDP in Theorem 3.4 is of interest. Case (a) involves the empirical measures of i.i.d. random variables, case (b) the $\alpha$-variate empirical measures of certain Markov chains for $\alpha \ge 2$, and case (d) the empirical processes of certain Markov chains. In all these cases, we can prove directly from the form of $J$ that it has compact level sets, avoiding Condition 3.1 and the complicated proof in Section 4.5. However, in case (c), which involves the empirical measures of certain Markov chains, Condition 3.1 and the proof in Section 4.5 seem unavoidable since we cannot prove directly from the form of $J$ that it has compact level sets.
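
Returning to the example earlier in this section in which the equilibrium set consists of a single measure, the following sketch makes that measure explicit in one special case. It assumes, for illustration only, that $L_q$ is the empirical measure of i.i.d. random variables with distribution $\rho$ on $Y$, so that $J(\mu) = R(\mu \,|\, \theta \times \rho)$, and that $\Phi$ is linear, $\Phi(\mu) = \int \varphi \, d\mu$, for a bounded continuous function $\varphi$ on $X \times Y$. The symbols $\varphi$, $c_\varphi$, $\tau^*$, and $\mu^*$ do not appear in the text and are introduced as assumptions.

```latex
% Illustrative assumptions (not from the text):
%   J(\mu) = R(\mu | \theta \times \rho),  \Phi(\mu) = \int \varphi \, d\mu,  \varphi bounded continuous.
\begin{align*}
c_\varphi(x) &\doteq \int_Y e^{\varphi(x,y)}\, \rho(dy), \qquad
\tau^*(x, dy) \doteq \frac{e^{\varphi(x,y)}}{c_\varphi(x)}\, \rho(dy), \qquad
\mu^* \doteq \theta \otimes \tau^* \\
R(\mu \,|\, \mu^*)
  &= R(\mu \,|\, \theta \times \rho)
     - \int_{X \times Y} \bigl[\varphi(x,y) - \log c_\varphi(x)\bigr]\, \mu(dx \times dy) \\
  &= J(\mu) - \Phi(\mu) + \int_X \log c_\varphi(x)\, \theta(dx)
  && \text{(first marginal of } \mu \text{ is } \theta\text{)} \\
\inf_{\nu} \{J(\nu) - \Phi(\nu)\}
  &= -\int_X \log c_\varphi(x)\, \theta(dx)
  && \text{(} R \ge 0 \text{, with equality at } \nu = \mu^*\text{)} \\
J_\Phi(\mu) &= R(\mu \,|\, \mu^*)
\end{align*}
```

Both sides of the second line are infinite when $\mu$ is not absolutely continuous with respect to $\theta \times \rho$, since $\mu^*$ and $\theta \times \rho$ are mutually absolutely continuous. Because relative entropy vanishes only when its two arguments coincide, under these assumptions $J_\Phi(\mu) = 0$ exactly when $\mu = \mu^*$.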

Example 3.7.

(a) Empirical Measures of I.I.D. Random Variables. Let $Z$ be a Polish space and $\{\zeta_i, i \in \mathbb{N}\}$ a sequence of i.i.d. random variables taking values in $Z$ and having common distribution $\rho$. We define $L_{q,1}, L_{q,2}, \ldots, L_{q,2^r}$ to be i.i.d. copies of
\[ L_q \doteq \frac{1}{q} \sum_{i=1}^{q} \delta_{\zeta_i}, \]
which takes values in $\mathcal{P}(Z)$. Sanov's Theorem implies that $\{L_q, q \in \mathbb{N}\}$ satisfies the LDP on $\mathcal{P}(Z)$ with the convex rate function $I(\gamma) \doteq R(\gamma \,|\, \rho)$, where $R$ is the relative entropy
\[ R(\gamma \,|\, \rho) \doteq \begin{cases} \displaystyle\int_Z \Bigl(\log \frac{d\gamma}{d\rho}\Bigr)\, d\gamma & \text{if } \gamma \ll \rho \\ \infty & \text{otherwise.} \end{cases} \]
By the chain rule [9, Thm. C.3.1], if $\mu \in \mathcal{P}_\theta(X \times Z)$ has the decomposition $\mu = \theta \otimes \tau$, then
\[ J(\mu) = J(\theta \otimes \tau) = \int_X R(\tau(x, \cdot) \,|\, \rho)\, \theta(dx) = R(\theta \otimes \tau \,|\, \theta \times \rho) = R(\mu \,|\, \theta \times \rho). \]
Thus $J$ has compact level sets since the relative entropy has this property [9, Lem. 1.4.3(c)]. This proof that $J$ has compact level sets does not require Condition 3.1. According to Theorem 3.4, $W_{r,q}$ defined with these $L_{q,k}$ satisfies the two-parameter LDP on $\mathcal{P}_\theta(X \times Z)$ with scaling sequence $\{2^r q,\ r, q \in \mathbb{N}\}$ and rate function $J$.

(b) $\alpha$-variate Empirical Measures of Certain Markov Chains, $\alpha \ge 2$. Let $\{\zeta_i, i \in \mathbb{N}\}$ be a Markov chain taking values in a Polish space $Z$ and having transition probability function $p(y, dz)$. We denote by $C_b(Z)$ the set of bounded continuous functions mapping $Z$ into $\mathbb{R}$. It is assumed that $p(y, dz)$ satisfies the Feller property; i.e., for all $f \in C_b(Z)$ the function mapping
\[ y \in Z \mapsto (pf)(y) \doteq \int_Z f(z)\, p(y, dz) \in \mathbb{R} \]

is continuous. It is also assumed that for some $C \in [1, \infty)$, all $y, y' \in Z$, and all Borel subsets $A$ of $Z$
\[ p(y, A) \le C\, p(y', A). \tag{3.4} \]
We consider the bivariate empirical measures of the Markov chain, which is the case $\alpha = 2$ of the $\alpha$-variate empirical measures. The case of general $\alpha \ge 3$ can be handled similarly and will be omitted. Let $L_{q,1}, L_{q,2}, \ldots, L_{q,2^r}$ be i.i.d. copies of the bivariate empirical measure
\[ L_q \doteq \frac{1}{q} \sum_{i=1}^{q} \delta_{(\zeta_i, \zeta_{i+1})}, \tag{3.5} \]
which takes values in $\mathcal{P}(Z^2)$. Given $\gamma \in \mathcal{P}(Z^2)$, we denote by $\gamma_1$ and $\gamma_2$ the first and second marginals of $\gamma$ obtained by projection onto the corresponding coordinates. Under the hypotheses on $p(y, dz)$, Theorem 1.4 in [2] proves that $\{L_q, q \in \mathbb{N}\}$ satisfies the LDP on $\mathcal{P}(Z^2)$ with the convex rate function
\[ I(\gamma) \doteq \begin{cases} R(\gamma \,|\, \gamma_1 \otimes p) & \text{if } \gamma_1 = \gamma_2 \\ \infty & \text{otherwise.} \end{cases} \tag{3.6} \]
Denote by $J^{(2)}$ the function on $\mathcal{P}_\theta(X \times Z^2)$ defined in Definition 3.3 in terms of this $I$. According to Theorem 3.4, $W_{r,q}$ defined with these $L_{q,k}$ satisfies the two-parameter LDP on $\mathcal{P}_\theta(X \times Z^2)$ with scaling sequence $\{2^r q,\ r, q \in \mathbb{N}\}$ and rate function $J^{(2)}$.

At the end of this chapter we prove directly from the form of $J^{(2)}$ that this function has compact level sets. Condition 3.1 is not required. In order to carry this out, it is useful to rewrite $J^{(2)}$. Given $\mu = \mu(dx \times dz_1 \times dz_2) \in \mathcal{P}_\theta(X \times Z^2)$, we denote by $\mu_i$, $i = 1, 2, 3$, the $i$th marginal of $\mu$ obtained by projection onto the $i$th coordinate and by $\mu_{i,j}$, $1 \le i < j \le 3$, the marginal of $\mu$ obtained by projection onto the $i$th and $j$th coordinates. The measure $\mu_1$ equals $\theta$, and if $J^{(2)}(\mu) < \infty$,

then it follows from (3.6) and the chain rule that $\mu_2 = \mu_3$ and
\[ J^{(2)}(\mu) = R\bigl(\mu(dx \times dz_1 \times dz_2) \,\big|\, \mu_{1,2}(dx \times dz_1) \otimes p(z_1, dz_2)\bigr). \tag{3.7} \]

(c) Empirical Measures of Certain Markov Chains. Let $\{\zeta_i, i \in \mathbb{N}\}$ be a Markov chain taking values in a Polish space $Z$ and having transition probability function $p(y, dz)$. We assume that $p(y, dz)$ satisfies the same properties as in case (b). Let $L_{q,1}, L_{q,2}, \ldots, L_{q,2^r}$ be i.i.d. copies of the empirical measure
\[ L_q \doteq \frac{1}{q} \sum_{i=1}^{q} \delta_{\zeta_i}. \]
Under the hypotheses on $p(y, dz)$, [27] proves that $\{L_q, q \in \mathbb{N}\}$ satisfies the LDP on $\mathcal{P}(Z)$ with the convex rate function
\[ I(\mu) \doteq \sup_{u \in U(Z)} \int_Z \log \frac{u(y)}{(pu)(y)}\, \mu(dy), \tag{3.8} \]
where $U(Z)$ denotes the set of $u \in C_b(Z)$ satisfying $u \ge \varepsilon$ on $Z$ for some $\varepsilon = \varepsilon(u) > 0$. See also Chapter IV of [7]. Denote by $J$ the function on $\mathcal{P}_\theta(X \times Z)$ defined in Definition 3.3 in terms of this $I$. According to Theorem 3.4, $W_{r,q}$ defined with these $L_{q,k}$ satisfies the two-parameter LDP on $\mathcal{P}_\theta(X \times Z)$ with rate function $J$. We cannot prove directly from the form of $J$ that it has compact level sets, but must resort to the proof in Subsection 4.5, which requires Condition 3.1.

Using the contraction principle [6, Thm. 4.2.1], we can obtain another representation for the rate function $J$. Indeed, denote by $W^{(2)}_{r,q}$ the process defined in (3.1), where we take $L_{q,1}, L_{q,2}, \ldots, L_{q,2^r}$ to be independent copies of the bivariate empirical measure $L_q$ in (3.5). As pointed out in part (b), $W^{(2)}_{r,q}$ satisfies the two-parameter LDP on $\mathcal{P}_\theta(X \times Z^2)$ with scaling sequence $\{2^r q,\ r, q \in \mathbb{N}\}$ and rate function $J^{(2)}$. Now let $\Gamma$ denote the continuous function mapping $\mathcal{P}_\theta(X \times Z^2)$ into

$\mathcal{P}_\theta(X \times Z)$ defined by $\Gamma(\mu) \doteq \mu_{1,2}$. Since $\Gamma(W^{(2)}_{r,q}) = W_{r,q}$, it follows that $W_{r,q}$ satisfies the two-parameter LDP on $\mathcal{P}_\theta(X \times Z)$ with scaling sequence $\{2^r q,\ r, q \in \mathbb{N}\}$ and rate function
\[ \tilde{J}(\gamma) \doteq \inf\{J^{(2)}(\mu) : \mu \in \mathcal{P}_\theta(X \times Z^2),\ \mu_{1,2} = \gamma\}. \]
Since a rate function for $W_{r,q}$ is unique, it follows that under Condition 3.1 $J$ equals $\tilde{J}$.

(d) Empirical Processes of Certain Markov Chains. Let $\{\zeta_i, i \in \mathbb{N}\}$ be a Markov chain taking values in a Polish space $Z$ and having transition probability function $p(y, dz)$. We assume that $p(y, dz)$ satisfies the same properties as in case (b). Define $Z^{\mathbb{Z}}$ to be the product space $\prod_{j \in \mathbb{Z}} Z_j$, where for each $j$, $Z_j = Z$, and let $\mathcal{P}_T(Z^{\mathbb{Z}})$ denote the space of probability measures $P$ on $Z^{\mathbb{Z}}$ which satisfy $P \circ T^{-1} = P$; $T$ denotes the shift operator on $Z^{\mathbb{Z}}$. For each $q \in \mathbb{N}$ we repeat the sequence $(\zeta_0, \zeta_1, \ldots, \zeta_{q-1})$ periodically into a doubly infinite sequence, obtaining a point $\zeta(q) \in Z^{\mathbb{Z}}$. We then let $L_{q,1}, L_{q,2}, \ldots, L_{q,2^r}$ be i.i.d. copies of the empirical process
\[ L_q \doteq \frac{1}{q} \sum_{i=0}^{q-1} \delta_{T^i \zeta(q)}, \]
which takes values in $\mathcal{P}_T(Z^{\mathbb{Z}})$.

In order to specify the rate function, additional notation is needed. For $z \doteq (\ldots, z_{-2}, z_{-1}, z_0, z_1, \ldots) \in Z^{\mathbb{Z}}$, let $z^- \doteq (\ldots, z_{-2}, z_{-1})$. We denote by $\hat{X}_j$, $j \in \mathbb{Z}$, the mapping that takes $z$ to $z_j$ and by $\hat{X}^-$ the mapping that takes $z$ to $z^-$; thus $\hat{X}^- = (\ldots, \hat{X}_{-2}, \hat{X}_{-1})$. For $P \in \mathcal{P}_T(Z^{\mathbb{Z}})$ we define $P^-(z^-, dz_0)$ to be a regular conditional distribution, with respect to $P$, of $\hat{X}_0$ given $\hat{X}^- = z^-$, and we write $P^-(dz^-)$ for the $P$-distribution of $\hat{X}^-$. Finally, let $Z^- \doteq \prod_{j=-\infty}^{-1} Z_j$. Under the hypotheses on $p(y, dz)$, Theorem 1.3 in [4] proves that $\{L_q, q \in \mathbb{N}\}$

satisfies the LDP on $\mathcal{P}(Z^{\mathbb{Z}})$ with the convex rate function
\[ I(P) \doteq \begin{cases} \displaystyle\int_{Z^-} R\bigl(P^-(z^-, \cdot) \,\big|\, p(z_{-1}, \cdot)\bigr)\, P^-(dz^-) & \text{if } P \in \mathcal{P}_T(Z^{\mathbb{Z}}) \\ \infty & \text{otherwise.} \end{cases} \tag{3.9} \]
Denote by $J$ the function on $\mathcal{P}_\theta(X \times Z^{\mathbb{Z}})$ defined in Definition 3.3 in terms of this $I$. According to Theorem 3.4, $W_{r,q}$ defined with these $L_{q,k}$ satisfies the two-parameter LDP on $\mathcal{P}_\theta(X \times Z^{\mathbb{Z}})$ with rate function $J$. While it is possible to prove directly from the form of $J$ that it has compact level sets, this proof is omitted.

We end this section by giving a direct proof that the function $J^{(2)}$ in case (b) of Example 3.7 has compact level sets. This proof does not require Condition 3.1. For $M \in [0, \infty)$, we show that any sequence $\{\mu_n, n \in \mathbb{N}\}$ in $\mathcal{P}_\theta(X \times Z^2)$ satisfying $J^{(2)}(\mu_n) \le M$ is tight; $J^{(2)}$ is given by (3.7). Since $R$ is lower semicontinuous in both variables [9, Lem. 1.4.3(b)] and $p$ satisfies the Feller property, the proof that $J^{(2)}$ has compact level sets is completed by Prohorov's Theorem.

We fix a point $y \in Z$ and set $\nu(dz_2) \doteq p(y, dz_2)$. By (3.7) and the Donsker-Varadhan variational formula [9, Lem. 1.4.3(a)], for any $n \in \mathbb{N}$ and any $g \in C_b(Z)$
\[ M \ge J^{(2)}(\mu_n) \ge \int_{X \times Z^2} g(z_2)\, \mu_n(dx \times dz_1 \times dz_2) - \log \int_{X \times Z^2} e^{g(z_2)}\, (\mu_n)_{1,2}(dx \times dz_1) \otimes p(z_1, dz_2), \]
and so by (3.4)
\[ M + \log C \ge \int_Z g(z_2)\, (\mu_n)_3(dz_2) - \log \int_Z e^{g(z_2)}\, \nu(dz_2). \]
Since $g \in C_b(Z)$ is arbitrary, we conclude that for any $n \in \mathbb{N}$
\[ M + \log C \ge R((\mu_n)_3 \,|\, \nu). \]
The compactness of the level sets of $R(\cdot \,|\, \nu)$ implies that the sequence of marginals $\{(\mu_n)_3\}$ is tight, and since $J^{(2)}(\mu_n) \le M < \infty$, the sequence $\{(\mu_n)_2\} = \{(\mu_n)_3\}$ is

also tight. Since $(\mu_n)_1 = \theta$ is tight, we conclude that $\{\mu_n, n \in \mathbb{N}\}$ is tight. This completes the proof that $J^{(2)}$ has compact level sets.
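
The step in the tightness argument that invokes (3.4) can be spelled out as follows. This is a sketch of the intermediate inequalities only, using the symbols already introduced ($C$ from (3.4), the fixed point $y$, $\nu = p(y, \cdot)$, and $g \in C_b(Z)$); it relies on the fact that (3.4), stated for Borel sets, extends to integrals of nonnegative measurable functions.

```latex
\begin{align*}
\int_Z e^{g(z_2)}\, p(z_1, dz_2)
  &\le C \int_Z e^{g(z_2)}\, p(y, dz_2) = C \int_Z e^{g}\, d\nu
  && \text{by (3.4), for every } z_1 \in Z \\
\log \int_{X \times Z^2} e^{g(z_2)}\, (\mu_n)_{1,2}(dx \times dz_1) \otimes p(z_1, dz_2)
  &\le \log C + \log \int_Z e^{g}\, d\nu
  && \text{since } (\mu_n)_{1,2} \text{ is a probability measure} \\
M \ge J^{(2)}(\mu_n)
  &\ge \int_Z g\, d(\mu_n)_3 - \log C - \log \int_Z e^{g}\, d\nu \\
M + \log C
  &\ge \sup_{g \in C_b(Z)} \Bigl\{ \int_Z g\, d(\mu_n)_3 - \log \int_Z e^{g}\, d\nu \Bigr\}
   = R((\mu_n)_3 \,|\, \nu)
\end{align*}
```

The final equality is the Donsker-Varadhan variational formula for the relative entropy with respect to $\nu$.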

CHAPTER 4

PROOF OF RESULTS IN CHAPTER 3

This chapter consists of six sections, which treat the following: 1) a number of preliminary lemmas, 2) the large deviation lower bound in Theorem 3.4, 3) the exponential tightness of $\{W_{r,q}, q \in \mathbb{N}\}$ for each fixed $r$, 4) the large deviation upper bound in Theorem 3.4, 5) the compactness of the level sets of $J$, and 6) the proof of part (c) of Corollary 3.6. Together, 2), 4), and 5) yield the theorem.

Denote by $BL(X \times Y)$ the collection of bounded, Lipschitz continuous functions mapping $X \times Y$ into $\mathbb{R}$ and by $m$ the metric on $X \times Y$. Then define for $f \in BL(X \times Y)$
\[ \|f\|_{BL} \doteq \|f\|_L + \|f\|_\infty, \]
where $\|f\|_L \doteq \sup_{x \ne y} |f(x) - f(y)| / m(x, y)$ is the Lipschitz seminorm of $f$. We define $BL_1(X \times Y) \doteq \{f \in BL(X \times Y) : \|f\|_{BL} \le 1\}$ and metrize $\mathcal{P}(X \times Y)$ with the dual-bounded-Lipschitz metric
\[ d(\mu, \nu) \doteq \sup_{f \in BL_1(X \times Y)} \Bigl| \int_{X \times Y} f\, d\mu - \int_{X \times Y} f\, d\nu \Bigr|. \]
This metric is compatible with the topology of weak convergence on $\mathcal{P}(X \times Y)$, and when metrized by $d$, $\mathcal{P}(X \times Y)$ is a Polish space [8, Prop. 11.3.2, Thm. 11.3.3, Cor.

11.5.5]. As a closed subset of $\mathcal{P}(X \times Y)$, $\mathcal{P}_\theta(X \times Y) \doteq \{\mu \in \mathcal{P}(X \times Y) : \mu_1 = \theta\}$ is also a Polish space when metrized by $d$. Given $\mu \in \mathcal{P}_\theta(X \times Y)$ and $\varepsilon > 0$, $B(\mu, \varepsilon)$ denotes the open ball centered at $\mu$ with radius $\varepsilon$ and $\bar{B}(\mu, \varepsilon)$ denotes the closed ball centered at $\mu$ with radius $\varepsilon$.

4.1 A Number of Preliminary Lemmas

The second lemma is an approximation result for stochastic kernels needed in the proof of the large deviation lower bound and in the proof that $J$ has compact level sets. In order to prove this lemma, we first need to know that an arbitrary closed set in $X$ can be approximated by sets in the $\sigma$-fields generated by the partitions $\Xi_r$.

Lemma 4.1. Let $F$ be a closed subset of $X$, $\Xi_r = \{D_{r,k},\ k = 1, \ldots, 2^r\}$ a partition of $X$ satisfying parts (ii) and (iii) of Condition 3.1, and $\mathcal{F}_r \doteq \sigma(\Xi_r)$. Then there exists a sequence of subsets $D_r(F)$ of $X$ such that $D_r(F) \in \mathcal{F}_r$ for each $r$ and $D_r(F) \downarrow F$.

Proof. Define $A_r \doteq \{k \in \{1, \ldots, 2^r\} : D_{r,k} \cap F \ne \emptyset\}$ and $D_r(F) \doteq \bigcup_{k \in A_r} D_{r,k}$. For each $r \in \mathbb{N}$, $D_r(F) \in \mathcal{F}_r$, and due to part (ii) of Condition 3.1, $D_{r+1}(F) \subset D_r(F)$. It must be shown that $\bigcap_{r=1}^{\infty} D_r(F) = F$. Since $F \subset D_r(F)$ for each $r$, one direction of containment is obvious. We prove by contradiction that $\bigcap_{r=1}^{\infty} D_r(F) \subset F$. Thus suppose that there exists $x \in \bigcap_{r=1}^{\infty} D_r(F)$ and $x \notin F$. Since $F$ is closed,
