Preservation Theorems for Glivenko-Cantelli and Uniform Glivenko-Cantelli Classes
|
|
- Hilary Cannon
- 5 years ago
- Views:
Transcription
1 Preservation Theorems for Glivenko-Cantelli and Uniform Glivenko-Cantelli Classes This is page 5 Printer: Opaque this Aad van der Vaart and Jon A. Wellner ABSTRACT We show that the P Glivenko property of classes of functions F,...,F k is preserved by a continuous function ϕ from R k to R in the sense that the new class of functions x ϕ(f (x),...,f k (x)), f i F i, i =,...,k is again a Glivenko-Cantelli class of functions if it has an integrable envelope. We also prove an analogous result for preservation of the uniform Glivenko-Cantelli property. Corollaries of the main theorem include two preservation theorems of Dudley (998). We apply the main result to reprove a theorem of Schick and Dudley 998a or b? Yu (999)concerning consistency of the NPMLE in a model for mixed case interval censoring. Finally a version of the consistency result of Schick and Yu (999)is established for a general model for mixed case interval censoring in which a general sample space Y is partitioned into sets which are members of some VC-class C of subsets of Y. Glivenko-Cantelli theorems Let (X, A,P) be a probability space, and suppose that F L (P ). For such a class of functions, let F 0,P {f Pf : f F}. We also let F F (x) sup f F f(x), the envelope function of F. If f F for all f F with F measurable, then F is an envelope for F. Suppose that X,X 2,... are i.i.d. P. The Glivenko-Cantelli theorems give conditions under which the empirical measure P n converges uniformly to P over a class F, either in probability (in which case we say that F is a weak Glivenko-Cantelli class for P ) or almost surely: P n P F sup P n f Pf a.s. 0; f F in this case we say that F is a strong Glivenko-Cantelli class for P. Useful sufficient conditions for a class F to be a strong Glivenko-Cantelli class for P are that it has an integrable envelope and either log N [] (ɛ, F,L (P )) < for all ɛ>0
2 6 A. van der Vaart and J. A. Wellner or log N(ɛ, F M,L (P n )) = o P (n) for every ɛ>0 and 0 <M<. Here F M = {f [F M] : f F}. In the second case some additional measurability conditions are necessary; see Van der Vaart and Wellner (996), Chapter 2.4, pp In particular, the condition in Theorem 2.4.3, page 23, is shown by Giné and Zinn (984) to be both necessary and sufficient, under measurability assumptions, for the class F to be a strong Glivenko-Cantelli class (see Talagrand (996) for another proof). Talagrand (987b) gives necessary and sufficient conditions for the Glivenko-Cantelli theorem without any measurability hypotheses. Theorem. (Giné and Zinn, 984). Suppose that F is L (P ) bounded and nearly linearly supremum measurable for P ; in particular this holds if F is image admissible Suslin. Then the following are equivalent: (a) F is a strong Glivenko-Cantelli class for P. (b) F has an envelope function F L (P ) and the truncated classes F M {f{f M} : f F}satisfy n E log N(ɛ, F M,L r (P n )) 0 for all ɛ>0, and for all M (0, ) for some (all) r (0, ] where f Lr(P ) f P,r {P ( f r )} r. 2 Preservation of the Glivenko Cantelli property Our goal in this section is to present several results concerning the stability of the Glivenko-Cantelli property of one or more classes of functions under composition with functions ϕ. A theorem which motivated our interest is the following result of Dudley (998a). Theorem 2. (Dudley, 998a). Suppose that F is a strong Glivenko-Cantelli class for P with PF <, J is a possibly unbounded interval including the ranges of all f F, ϕ is continuous and monotone on J, and for some finite constants c, d, ϕ(y) c y + d for all y J. Then ϕ(f) is also a strong Glivenko-Cantelli class for P. Dudley (998a) proves this via the characterization of Glivenko-Cantelli classes due to Talagrand (987b). Dudley (998b) also uses Talagrand s characterization to prove the following interesting proposition. Proposition. (Dudley, 998b). Suppose that F is a strong Glivenko- Cantelli class for P with PF <, and g is a fixed bounded function ( g < ). Then the class of functions g F {g f : f F}is a strong Glivenko-Cantelli class for P.
3 Glivenko-Cantelli Preservation Theorems 7 Yet another proposition in this same vein is: Proposition 2. (Giné and Zinn, 984). Suppose that F is a uniformly bounded strong Glivenko-Cantelli class for P, and g L (P ) is a fixed function. Then the class of functions g F {g f : f F}is a strong Glivenko-Cantelli class for P. Given classes F,...,F k of functions f i : X Rand a function ϕ : R k R, let ϕ(f,...,f k ) be the class of functions x ϕ(f (x),...,f k (x)), where f i F i, i =,...,k. Theorem 2 and Propositions and 2 are all corollaries of the following theorem. Theorem 3. Suppose that F,...,F k are P Glivenko-Cantelli classes of functions, and that ϕ : R k R is continuous. Then H ϕ(f,...,f k ) is P Glivenko-Cantelli provided that it has an integrable envelope function. Proof. We first assume that the classes of functions F i are appropriately measurable. Let F,...,F k and H be integrable envelopes for F,...,F k and H respectively, and set F = F... F k.form (0, ), define Now H M {ϕ(f) [F M] : f =(f,...,f k ) F F 2 F k F}. (P n P )ϕ(f) F (P n + P )H [F>M] + (P n P )h HM. The expectation of the first term on the right converges to 0 as M. Hence it suffices to show that H M is P Glivenko-Cantelli for every fixed M. Let δ = δ(ɛ) betheδ of Lemma 2 below for ϕ :[ M,M] k R, ɛ>0, and the L -norm. Then for any (f j,g j ) F j, j =,...,k, implies that It follows that P n f j g j [Fj M] δ, j =,...,k k P n ϕ(f,...,f k ) ϕ(g,...,g k ) [F M] ɛ. N(ɛ, H M,L (P n )) k N( δ k, F j [Fj M],L (P n )). Thus E log N(ɛ, H M,L (P n )) = o(n) for every ɛ>0, M<. This implies that E log N(ɛ, (H M ) N,L (P n )) = o(n) for (H M ) N the functions h{h
4 8 A. van der Vaart and J. A. Wellner N} for h H M.ThusH M is strong Glivenko-Cantelli for P by Theorem. This concludes the proof that H = ϕ(f) is weak Glivenko-Cantelli. Because it has an integrable envelope, it is strong Glivenko-Cantelli by, e.g., Lemma of Van der Vaart and Wellner (996). This concludes the proof for appropriately measurable classes F j, j =,...,k. We extend the theorem to general Glivenko-Cantelli classes using separable versions as in Talagrand (987a). (Also see van der Vaart and Wellner (996), pp for a discussion.) As shown in the preceding argument, it is not a loss of generality to assume that the classes F i are uniformly bounded. Furthermore, it suffices to show that ϕ(f,...,f k )is weak Glivenko-Cantelli. We first need a lemma. Lemma. Any strong P -Glivenko Cantelli class F is totally bounded in L (P ) if and only if P F <. Furthermore for any r (, ), iff has an envelope that is contained in L r (P ), then F is also totally bounded in L r (P ). (983) or (984)? Proof. A class that is totally bounded is also bounded. Thus for the first statement we only need to prove that a strong Glivenko-Cantelli class F with P F < is totally bounded in L (P ). It is well-known that such a class has an integrable envelope. E.g. see Giné and Zinn (983) or Problem 2.4. of van der Vaart and Wellner (996) to conclude first that P f Pf F <. Next the claim follows from the triangle inequality f F f Pf F + P F. Thus it is no loss of generality to assume that the class F possesses an envelope that is finite everywhere. If F is not totally bounded in L (P ), then there is ε > 0 and G F countable such that the packing number D(ε, G,L (P )) =. This is obvious, since if F is not totally bounded, then D(ε, F,L (P )) = for some ε>0; then take a countable set of ε separated points. So, now G has all the measurablity properties, is GC and L (P )-bounded, and moreover, D(ε, G,L )=. But this is a contradiction because, as shown in Giné and Zinn (984), first half of page 984, D(ε, G,L ) must be finite for all ε>0. This completes the proof of the first part of the lemma. If F has an envelope in L r (P ), then F is totally bounded in L r (P ) if the class F M of functions f{f M} is totally bounded in L r (P ) for every fixed M. The class F M is P -GC by Theorem 3 and hence this class is totally bounded in L (P ). But then it is also totally bounded in L r (P ), because P f r P f M r for any f that is bounded by M and we can construct the ɛ-net over F M in L (P ) to consist of functions that are bounded by M. Because a Glivenko-Cantelli class F with P F < is totally bounded in L (P ) by Lemma, it is separable as a subset of L (P ). A minor generalization of Theorem in van der Vaart and Wellner (996) shows that there exists a bijection f f of F onto a class F L (P ) such that
5 f = f P almost surely for every f F. Glivenko-Cantelli Preservation Theorems 9 there exists a countable subset G F such that for every n there exists a measurable set N n X n with P n (N n ) = 0 such that for all (x,...,x n ) / N n and f F there exists {g m } Gsuch that P g m f 0 and (g m (x ),...,g m (x n )) ( f(x ),..., f(x n )). By an adaptation of a theorem due to Talagrand (987a) (see Theorem in van der Vaart and Wellner (996)) a class F is weak Glivenko- Cantelli if and only if the class F is weak Glivenko-Cantelli and sup P n f f 0 f F in outer probability. Construct a pointwise separable version F i for each of the classes F i. The classes F i possess enough measurability to make the preceding argument work; in particular pointwise separable version in the above sense is sufficient for the nearly linearly supremum measurable hypothesis of Giné and Zinn (984) for both F,...,F k and ϕ(f,...,f k ). Thus the class ϕ( F,..., F k ) is Glivenko-Cantelli for P. Now by Lemma 2 there exists for every ɛ>0aδ>0 such that P n f j f j < δ k, j =,...,k, implies P n ϕ(f,...,f k ) ϕ( f,..., f k ) <ɛ. The theorem follows. Lemma 2. Suppose that ϕ : K R is continuous and K R k is compact. Then for every ɛ>0 there exists δ>0 such that for all n and for all a,...,a n,b,...,b n K R k implies n n n a i b i <δ n ϕ(a i ) ϕ(b i ) <ɛ. Here can be any norm on R k ; in particular it can be ( k ) /r x r = x i r, r [, ) or x max i k x i for x =(x,...,x k ) R k.
6 20 A. van der Vaart and J. A. Wellner Proof. Let U n be uniform on {,...,n}, and set X n = a Un, Y n = b Un. Then we can write n n a i b i = E X n Y n and n ϕ(a i ) ϕ(b i ) = E ϕ(x n ) ϕ(y n ). n Hence it suffices to show that for every ɛ>0 there exists δ>0 such that for all (X, Y ) random vectors in K R k, E X Y <δ implies E ϕ(x) ϕ(y ) <ɛ. Suppose not. Then for some ɛ>0 and for all m =, 2,... there exists (X m,y m ) such that E X m Y m < m, E ϕ(x m) ϕ(y m ) ɛ. But since {(X m,y m )} is tight, there exists (X m,y m ) d (X, Y ). Then it follows that E X Y = lim E X m m Y m =0 so that X = Y a.s., while on the other hand 0=E ϕ(x) ϕ(y ) = lim E ϕ(x m m ) ϕ(y m ) ɛ>0. This contradiction means that the desired implication holds. Another potentially useful preservation theorem is one based on building up Glivenko-Cantelli classes from the restrictions of a class of functions to elements of a partition of the sample space. The following theorem is related to the results of Van der Vaart (996) for Donsker classes. Theorem 4. Suppose that F is a class of functions on (X, A,P), and {X i } is a partition of X : X i = X, X i X j = for i j. Suppose that F j {f Xj : f F}is P Glivenko-Cantelli for each j, and F has an integrable envelope function F. Then F is itself P Glivenko-Cantelli. Proof. Since f = f Xj = f Xj, it follows that E P n P F E P n P Fj 0
7 Glivenko-Cantelli Preservation Theorems 2 by the dominated convergence theorem, since each term in the sum converges to zero by the hypothesis that each F j is P Glivenko-Cantelli, and we have E P n P Fj E P n (F Xj )+P (F Xj ) 2P (F Xj ) where P (F X j )=P (F ) <. 3 Preservation of the uniform Glivenko Cantelli property We say that F is a strong uniform Glivenko-Cantelli class if for all ɛ>0 ( ) sup PrP sup P m P F >ɛ 0 as n P P(X,A) m n where P(X, A) is the set of all probability measures on (X, A). For x = (x,...,x n ) X n, n =, 2,..., and r (0, ), we define on F the pseudo-distances e x,r (f,g) = { n n } r f(x i ) g(x i ) r e x, (f,g) = max i n f(x i) g(x i ),, f,g F. Let N(ɛ, F,e x,r ) denote the ɛ covering number of (F,e x,r ), ɛ>0. Then define, for n =, 2,..., ɛ>0, and r (0, ], the quantities N n,r (ɛ, F) = sup x X n N(ɛ, F,e x,r ). Theorem 5. (Dudley, Giné, and Zinn (99)). Suppose that F is a class of uniformly bounded functions such that F is image admissible Suslin. Then the following are equivalent: (a) F is a strong uniform Glivenko-Cantelli class. (b) log N n,r (ɛ, F) 0 for all ɛ>0 n for some (all) r (0, ]. For the definition of the image admissible Suslin property see Dudley (984), sections 0.3 and.. The following theorem gives natural sufficient conditions for preservation of the uniform Glivenko-Cantelli theorem.
8 22 A. van der Vaart and J. A. Wellner Theorem 6. Suppose that F,...,F k are classes of uniformly bounded functions on (X, A) such that F,...,F k are image-admissible Suslin and strong uniform Glivenko-Cantelli classes. Suppose that ϕ : R k R is continuous. Then H ϕ(f,...,f k ) is a strong uniform Glivenko-Cantelli class. Proof. It follows from Lemma 2 that for any ɛ>0 there exists δ>0 such that for any f j,g j F j, j =,...,k, x X n, implies that It follows that and hence that e x, (f j,g j ) δ, j =,...,k k e x, (ϕ(f,...,f k ),ϕ(g,...,g k )) ɛ. N n, (ɛ, H) n log N n,(ɛ, H) k N n, ( δ k, F j), k n log N n,( δ k, F j) 0 where the convergence follows from part (b) of Theorem 5 with r =.If we show that H = ϕ(f) is image admissible Suslin, then the conclusion follows from (b) implies (a) in Theorem 5. We give a proof of a slightly stronger statement in the following lemma. Lemma 3. If F,...,F k are image admissible Suslin via (Y i, S i,t i ), i =,...,k, and φ : R k R is measurable, then ϕ(f,...,f k ) is image admissible Suslin via ( k Y i, k S i,t) for T (y,...,y k )=ϕ(t y,...,t k y k ). Proof. There exist Polish spaces Z i and measurable maps g i : Z i Y i which are onto. Then g : Z Z k Y Y k defined by g(z,...,z k )=(g (z ),...,g k (z k )) is onto and measurable for S S k. So ( i Y i, i S i) is Suslin. It suffices to check that T is onto and ψ defined by the map ψ(x, y,...,y k )=ϕ(t y (x),...,t k y k (x)) is measurable. Obviously T is onto, and ψ is measurable because each map (x, y i ) T i y i (x) is measurable on X Y i and hence on X Y Y k, and hence (x, y,...,y k ) (T y (x),...,t k y k (x)) is measurable from X Yto R k.
9 Glivenko-Cantelli Preservation Theorems 23 4 Nonparametric maximum likelihood estimation: a general result Now we prove a general result for nonparametricmaximum likelihood estimation in a class of densities. The main proposition in this section is related to results of Pfanzagl (988) and Van de Geer (993), (996). Suppose that P is a class of densities with respect to a fixed σ finite measure µ on a measurable space (X, A). Suppose that X,...,X n are i.i.d. P 0 with density p 0 P. Let p n argmax P n log p. For 0 <α, let ϕ α (t) =(t α )/(t α + ) for t 0, ϕ(t) = for t<0. Then ϕ α is bounded and continuous for each α (0, ]. For 0 <β< define h 2 β(p, q) p β q β dµ. Note that h /2 (p, q) h(p, q) is the Hellinger distance between p and q, and by Hölder s inequality, h β (p, q) 0 with equality if and only if p = q a.e. µ. Proposition 3. Suppose that P is convex. Then ( h 2 α/2 ( p n,p 0 ) (P n P 0 ) ϕ α ( p ) n ). p 0 In particular, when α =we have, with ϕ ϕ, ( h 2 ( p n,p 0 ) h 2 /2 ( p n,p 0 ) (P n P 0 ) ϕ( p ) n ). p 0 Corollary. Suppose that {ϕ(p/p 0 ): p P}is a P 0 Glivenko-Cantelli class. Then for each 0 <α, h α/2 ( p n,p 0 ) a.s. 0. Proof of Proposition 3. It follows from convexity of P and convexity of t ϕ α (/t) that () 0 P n ϕ α ( p n /p 0 )=(P n P 0 )ϕ α ( p n /p 0 )+P 0 ϕ α ( p n /p 0 ); see van der Vaart and Wellner (996) page 330, and Pfanzagl (988), pp Now we show that p α p α ( ) 0 (2) P 0 ϕ α (p/p 0 )= p α + p α dp 0 p β 0 p β dµ 0 for β = α/2. Note that this holds if and only if p α +2 p α 0 + p 0dµ + p β pα 0 p β dµ,
10 24 A. van der Vaart and J. A. Wellner or p β 0 p β dµ 2 p α p α 0 + pα p 0dµ. But this holds if p β 0 p β 2 pα p 0 p α 0 +. pα With β = α/2, this becomes 2 (pα 0 + p α ) p α/2 0 p α/2 = p α 0 pα, and this holds by the arithmeticmean geometricmean inequality. Thus (2) holds. Combining (2) with () yields the claim of the proposition. 5 Example:a result of Schick and Yu Our goal in this section is to give another proof of the consistency result of Schick and Yu (999) for the Non-Parametric Maximum Likelihood Estimator (NPMLE) F n for mixed case interval censored data. Our proof is based on the inequality of the preceding section, and is similar in spirit to results of Van de Geer (993), (996). Suppose that Y is a random variable taking values in R + = [0, ) with distribution function F F= {all df s F on R + }. Unfortunately we are not able to observe Y itself. What we do observe is a vector of times T K =(T K,,...,T K,K ) where K, the number of times, is itself random, and the interval (T K,j,T K,j ] into which Y falls (with T K,0 0, T K,K+ ). More formally, we assume that K is an integer-valued random variable, and T = {T k,j,j =,...,k,k =, 2,...}, is a triangular array of potential observation times, and that Y and (K, T) are independent. Let X =( K,T K,K), with a possible value x =(δ k,t k,k), where k =( k,,..., k,k ) with k,j = (Tk,j,T k,j ](Y ), j =, 2,...,k +, and T k is the kth row of the triangular array T. Suppose we observe n i.i.d. copies of X; X,X 2,...,X n, where X i =( (i),t (i),k (i) ), i = K (i) K (i), 2,...,n. Here (Y (i),t (i),k (i) ), i =, 2,... are the underlying i.i.d. copies of (Y,T,K). We first note that conditionally on K and T K, the vector K has a multinomial distribution: where ( K K, T K ) Multinomial K+ (, F K ) F K (F (T K, ),F(T K,2 ) F (T K, ),..., F (T K,K )).
11 Glivenko-Cantelli Preservation Theorems 25 Suppose for the moment that the distribution G k of (T K K = k) has density g k and p k P (K = k). Then a density of X is given by (3) k+ p F (x) p F (δ k,t k,k)= (F (t k,j ) F (t k,j )) δ k,j g k (t k )p k where t k,0 0, t k,k+. In general, (4) p F (x) p F (δ k,t k,k) = = k+ (F (t k,j ) F (t k,j )) δ k,j k+ δ k,j (F (t k,j ) F (t k,j )) is a density of X with respect to the dominating measure ν where ν is determined by the joint distribution of (K, T), and it is this version of the density of X with which we will work throughout the rest of the paper. Thus the log-likelihood function for F of X,...,X n is given by where n l n(f X) = n n = P n m F K (i) + ( ) (i) K,j log F (T (i) (i) ) F (T K (i),j K (i),j ) m F (X) = K+ K+ K,j log (F (T K,j ) F (T K,j )) K,j log ( F K,j ) and where we have ignored the terms not involving F. We also note that, with P 0 P F0, K+ P 0 m F (X) =P 0 F 0,K,j log ( F K,j ). The NonparametricMaximum Likelihood Estimator (NPMLE) F n is the distribution function F n (t) which puts all its mass at the observed time points and maximizes the log-likelihood l n (F X). It can be calculated via the iterative convex minorant algorithm proposed in Groeneboom and Wellner (992) for case 2 interval censored data.
12 26 A. van der Vaart and J. A. Wellner By Proposition 3 with α = and ϕ ϕ as before, it follows that ( ) h 2 (p Fn,p F0 ) (P n P 0 ) ϕ(p Fn /p F0 ) where ϕ is bounded and continuous from R to R. Now the collection of functions G {p F : F F} is easily seen to be a Glivenko-Cantelli class of functions: this can be seen by first applying Theorem 4 to the collections G k, k =, 2,... obtained from G by restricting to the sets K = k. Then for fixed k, the collections G k = {p F (δ, t k,k): F F}are P 0 Glivenko-Cantelli classes since F is a uniform Glivenko-Cantelli class, and since the functions p F are continuous transformations of the classes of functions x δ k,j and x F (t k,j ) for j =,...,k+, and hence G is P Glivenko-Cantelli by Theorem 3. Note that the single function p F0 is trivially P 0 Glivenko-Cantelli since it is uniformly bounded, and the single function (/p F0 ) is also P 0 GC since P 0 (/p F0 ) <. Thus by Proposition 2 with g =(/p F0 ) and F = G = {p F : F F}, it follows that G {p F /p F0 : F F}is P 0 Glivenko- Cantelli. Finally another application of Theorem 3 shows that the collection H {L (p F /p F0 ): F F}= {ϕ(p F /p F0 ): F F} is also P 0 -Glivenko-Cantelli. When combined with Proposition 3, this yields the following theorem: Theorem 7. The NPMLE F n satisfies h(p Fn,p F0 ) a.s. 0. To relate this result to a recent theorem of Schick and Yu (999), it remains only to understand the relationship between their L (µ) and the Hellinger metric h between p F and p F0. Let B denote the collection of Borel sets in R. OnB we define measures µ and µ as follows: For B B, (5) and (6) µ(b) = µ(b) = P (K = k) k= P (K = k) k k= k P (T k,j B K = k), k P (T k,j B K = k). Let d be the L (µ) metricon the class F; thus for F,F 2 F, d(f,f 2 )= F (t) F 2 (t) dµ(t).
13 Glivenko-Cantelli Preservation Theorems 27 The measure µ was introduced by Schick and Yu (999); note that µ is a finite measure if E(K) <. Note that d(f,f 2 ) can also be written in terms of an expectation as: K (7) d(f,f 2 )=E (K,T ) (F (T K,j ) F 2 (T K,j ). As Schick and Yu (999) observed, consistency of the NPMLE F n in L (µ) holds under virtually no further hypotheses. Theorem 8. (Schick and Yu). Suppose that E(K) <. Then d( F n,f 0 ) a.s. 0. Proof. We will show that Theorem 8 follows from Theorem 7 and the following lemma. Lemma 4. { 2 Proof. We know that F n F 0 d µ} 2 h 2 (p F n,p F 0 ). h 2 (p Fn,p F0 ) d TV (p Fn,p F0 ) 2h(p Fn,p F0 ) where, with y k,0 =, y k,k+ =, h 2 (p Fn,p F0 )= k+ P (K = k) {[ F n (y k,j ) F n (y k,j )] /2 k= [F 0 (y k,j ) F 0 (y k,j )] /2 } 2 dg k (y) while k+ d TV (p Fn,p F0 )= P (K = k) [ F n (y k,j ) F n (y k,j )] k= [F 0 (y k,j ) F 0 (y k,j )] dg k (y). Note that k+ [ F n (y k,j ) F n (y k,j )] [F 0 (y k,j ) F 0 (y k,j )] k+ = ( F n F 0 )(y k,j,y k,j ] max j k+ F n (y k,j ) F 0 (y k,j ),
14 28 A. van der Vaart and J. A. Wellner so integrating across this inequality with respect to G k (y) yields k+ [ F n (y k,j ) F n (y k,j )] [F 0 (y k,j ) F 0 (y k,j )] dg k (y) max j k k k F n (y k,j ) F 0 (y k,j ) dg k,j (y k,j ) F n (y k,j ) F 0 (y k,j ) dg k,j (y k,j ). By multiplying across by P (K = k) and summing over k, this yields d TV (p Fn,p F0 ) F n F 0 d µ, and hence (8) h 2 (p Fn,p F0 ) 2 { F n F 0 d µ} 2. The measure µ figuring in Lemma 4 is not the same as the measure µ of Schick and Yu (999) because of the factor /k. Note that this factor means that the measure µ is always a finite measure, even if E(K) =. It is clear that µ(b) µ(b) for every Borel set B, and that µ µ. The following lemma (Lemma 2.2 of Schick and Yu (999)) together with Lemma 4 shows that Theorem 7 implies the result of Schick and Yu once again: Lemma 5. Suppose that µ and µ are two finite measures, and that g, g, g 2,... are measurable functions with range in [0, ]. Suppose that µ is absolutely continuous with respect to µ. Then g n g d µ 0 implies that gn g dµ 0. Proof. Write g n g dµ = g n g dµ d µ d µ and use the dominated convergence theorem applied to a.e. convergent subsequences. 6 Example:generalizing the result of Schick and Yu Our goal in this section is to give a generalization of the consistency result of Schick and Yu (999).
15 Glivenko-Cantelli Preservation Theorems 29 Suppose that Y is a random variable taking values in Y. Suppose that Y has distribution Q on the measurable space (Y, B). Unfortunately we are not able to observe Y itself. What we do observe is a vector of random sets C K =(C K,,...,C K,K ) where K, the number of sets is itself random, and the set {C K,j } K form a partition of Y: C K,j C K,j = if j j and K C K,j = Y. More formally, we assume that K is an integer-valued random variable, and C = {C k,j,j =,...,k,k =, 2,...}, is a triangular array of random sets, and that Y and (K, C) are independent. Let X =( K,C K,K), with a possible value x =(δ k,c k,k), where k =( k,,..., k,k ) with k,j = Ck,j (Y ), j =, 2,...,k, and C k is the k th row of the triangular array C. Suppose we observe n i.i.d. copies of X; X,X 2,...,X n, where X i =( (i),c (i),k (i) ), i =, 2,...,n. Here K (i) K (i) (Y (i),c (i),k (i) ), i =, 2,... are the underlying i.i.d. copies of (Y,C,K). We first note that conditionally on K and C K, the vector K has a multinomial distribution: where ( K K, C K ) Multinomial K (,Q K ) Q K (Q(C K, ),Q(C K,2 ),...,Q(C K,K )). Suppose for the moment that the distribution G k of (C K K = k) has density g k and p k P (K = k). Then a density of X is given by (9) k p Q (x) p Q (δ k,c k,k)= Q(c k,j ) δ k,j g k (c k )p k. In general, (0) k k p Q (x) p Q (δ k,c k,k)= Q(c k,j ) δ k,j = δ k,j Q(c k,j ) is a density of X with respect to the dominating measure ν where ν is determined by the joint distribution of (K, C), and it is this version of the density of X with which we will work throughout the rest of the paper. Thus the log-likelihood function for Q of X,...,X n is given by where n l n(q X) = n n K (i) (i) K,j log Q(C(i) K (i),j )=P nm Q m Q (X) = K K,j log Q(C K,j )
16 30 A. van der Vaart and J. A. Wellner and where we have ignored the terms not involving Q. We also note that, with P 0 P Q0, K P 0 m Q (X) =P 0 Q 0 (C K,j ) log Q(C K,j ). The NonparametricMaximum Likelihood Estimator (NPMLE) Q n is a probability measure Q n which maximizes the log-likelihood l n (Q X). Here we will bypass the many interesting existence, characterization, and computational issues connected with the NPMLE Q n, and focus instead on the issue of consistency once we have the NPMLE in hand. By Proposition 3 with α = it follows that ( ) h 2 (p Qn,p Q0 ) (P n P 0 ) ϕ(p Qn /p Q0 ). where ϕ is bounded and continuous. Now the collection of functions G {p Q : Q Q} where Q is the collection of all probability distributions on (Y, B), is easily seen to be a Glivenko-Cantelli class of functions: this can be seen by first applying Theorem 4 to the collections G k, k =, 2,... obtained from G by restricting to the sets K = k. The next step is to show that the collections G k {p Q (,,k): Q Q} are P 0 Glivenko-Cantelli for each k. To this end, suppose that (Y, B) isa measurable space and let C be a universal GC - class of sets in Y (i.e. Q GC for every probability measure Q on (Y, B)). Let (T,T,G) be a probability space, and let t C t be a map with values in C. Lemma 6. Suppose that the collection {{t : y C t } : y Y}is G GC. Then the collection of functions t Q(C t ) where Q ranges over all probability measures Q on Y is G GC. Remark. By Assouad (983), Proposition 2.2, page 246, together with Corollary.0, page 24, it follows that {{t T : y C t } : y Y}is a VC-class of subsets of T if and only if {C t : t T } is a VC-class of subsets of Y. Thus if we assume that all the subsets C = C t arising in the partitions are elements of a VC-class C, then, subject to being suitably measurable, the hypothesis of Lemma 6 is satisfied (and the class in question is even universal Glivenko-Cantelli).
17 Glivenko-Cantelli Preservation Theorems 3 Proof. For Q the Diracmeasure at y (unit mass at y), the function t Q(C t ) becomes t Ct (y). By assumption, the set of all such functions, with y ranging over Y, is G GC. Then so is the set of all functions t m m Ct (y i ), y,...,y m Y,m N. Let Y,...,Y m be an i.i.d. sample from a given Q. Because C is Q GC, sup m {Y i C} P (C) 0 a.s.. C C m Then there exists a sequence y,y 2,... in Y such that sup m {y i C} P (C) 0, C C m and consequently sup m {y i C t } P(C t ) 0. t T m It follows that the maps t Q(C t ) are the (uniform) limits of sequences of functions from a G GC class. Hence they are G GC. Note on measurability: It is assumed implicitly that the sets {t : y C t } are measurable in T for every y Y. The proof shows that the maps t P (C t ) are then also measurable. It follows that for fixed k, the collections G k = {p Q (δ, c k,k) : Q Q} are P 0 -Glivenko-Cantelli, and since the functions p Q are continuous transformations of the classes of functions x δ k,j and x Q(c k,j ) for j =,...,k, and hence G is P Glivenko-Cantelli by Theorem 3. Note that the single function p Q0 is trivially P 0 Glivenko-Cantelli since it is uniformly bounded. Now another application of Theorem 3 shows that the collection H {L (p Q /p Q0 ): Q Q}= {ϕ(p Q /p Q0 ): Q Q} is also P 0 -Glivenko-Cantelli. When combined with Proposition 3, this yields the following theorem. Theorem 9. If all C K,j C, a VC collection of subsets of X, then the NPMLE Q n satisfies h(p Qn,p Q0 ) a.s. 0.
18 32 A. van der Vaart and J. A. Wellner To obtain a statement analogous to the theorem of Schick and Yu (999), let Σ denote a sigma algebra for the space C of subsets: we are assuming that P ( K [C K,j C])=.Thus(C, Σ) is a measurable space. On Σ we define the measure µ as follows: () k µ(b) = P (K = k) P (C k,j B K = k), B Σ. k= Note that d TV (p Qn,p Q0 ) = (2) = k P (K = k) Q n (c k,j ) Q 0 (c k,j ) dg k (c) k= Q n (c) Q 0 (c) dµ(c) d( Q n,q 0 ). Here is a generalization of the theorem of Schick and Yu (999). Theorem 0. The NPMLE Q n satisfies d( Q n,q 0 ) a.s. 0. Proof. Theorem 0 follows from Theorem 9, (2) and the observation that d( Q n,q 0 )=d TV (p Qn,p Q0 ) 2h(p Qn,p Q0 ). Example. (Mixed case interval censoring in R 2 ). Suppose that Y Q 0 takes values in R +2, and the partitions {C K,j } K are obtained as the natural rectangles formed from two increasing collections 0 S K,... S K,K and 0 T K2,... T K2,K on the two coordinate axes. Then K =(K + )(K 2 + ), and all the elements of the partitions are elements of the VC-class of rectangles in R 2. It follows that the NPMLE Q n of Q 0 satisfies Q n (s, t] Q 0 (s, t] dµ(s, t) 0 a.s.. s,t R +2 Acknowledgments. We owe thanks to Evarist Giné and Joel Zinn for suggesting the proof given for Lemma. References Assouad, P. (983). Densité et dimension. Annales de l Institut Fourier 33, Dudley, R. M. (984). A Course on Empirical Processes. Ecole d été de probabilités de St.-Flour, 982. Lecture Notes in Mathematics 097, Springer, Berlin, 42.
19 Glivenko-Cantelli Preservation Theorems 33 Dudley, R. M. (998a). Consistency of M-estimators and one-sided bracketing. In Proceedings of the First International Conference on High- Dimensional Probability. E. Eberlein, M. Hahn, and M. Talagrand, eds. Birkhäuser, Basel, pp Dudley, R. M. (998b). Personal communication. Dudley, R. M., Giné, E., and Zinn, J. (99). Uniform and universal Glivenko-Cantelli classes. J. Theoretical Probab. 4, Giné, E., and Zinn, J. (984). Some limit theorems for empirical processes. Ann. Probability 2, Groeneboom, P. and Wellner, J. A. (992). Information Bounds and Nonparametric Maximum Likelihood Estimation. DMV Seminar Band 9, Birkhäuser, Basel. Pfanzagl, J. (988). Consistency of maximum likelihood estimators for certain nonparametric families, in particular: mixtures. J. Statist. Planning and Inference 9, Schick, A. and Yu, Q. (999), Consistency of the GMLE with Mixed Case Interval-Censored Data, Scand. J. Statist., to appear. Talagrand, M. (987a). Measurability problems for empirical measures. Ann. Probability 5, Talagrand, M. (987b). The Glivenko-Cantelli problem. Ann. Probability 5, Talagrand, M. (996). The Glivenko-Cantelli problem, ten years later. J. Theor. Prob. 9, Van de Geer, S. (993). Hellinger consistency of certain nonparametric maximum likelihood estimators. Ann. Statist. 2, Van de Geer, S. (996). Rates of convergence for the maximum likelihood estimator in mixture models. J. Nonparametric Statist. 6, Van der Vaart, A. W. (996). New Donsker classes. Ann. Probability 24, Van der Vaart, A. W., and Wellner, J. A. (996). Weak Convergence and Empirical Processes with Applications to Statistics, Springer-Verlag, New York. update? Department of Mathematics Free University De Boelelaan 08a 08 HV Amsterdam The Netherlands aad@cs.vu.nl Department of Statistics University of Washington P.O. Box Seattle, Washington U.S.A. jaw@stat.washington.edu
Preservation Theorems for Glivenko-Cantelli and Uniform Glivenko-Cantelli Classes
Preservation Theorems for Glivenko-Cantelli and Uniform Glivenko-Cantelli Classes Aad van der Vaart and Jon A. Wellner Free University and University of Washington ABSTRACT We show that the P Glivenko
More informationPRESERVATION THEOREMS FOR GLIVENKO-CANTELLI AND UNIFORM GLIVENKO-CANTELLI CLASSES AAD VAN DER VAART AND JON A. WELLNER ABSTRACT We show that the P Gli
PRESERVATION THEOREMS FOR GLIVENKO-CANTELLI AND UNIFORM GLIVENKO-CANTELLI CLASSES AAD VAN DER VAART AND JON A. WELLNER ABSTRACT We show that the P Glivenko property of classes of functions F ;::: ;F k
More informationSOME CONVERSE LIMIT THEOREMS FOR EXCHANGEABLE BOOTSTRAPS
SOME CONVERSE LIMIT THEOREMS OR EXCHANGEABLE BOOTSTRAPS Jon A. Wellner University of Washington The bootstrap Glivenko-Cantelli and bootstrap Donsker theorems of Giné and Zinn (990) contain both necessary
More informationStatistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation
Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued
Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research
More information3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?
MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due ). Show that the open disk x 2 + y 2 < 1 is a countable union of planar elementary sets. Show that the closed disk x 2 + y 2 1 is a countable
More informationEmpirical Processes: General Weak Convergence Theory
Empirical Processes: General Weak Convergence Theory Moulinath Banerjee May 18, 2010 1 Extended Weak Convergence The lack of measurability of the empirical process with respect to the sigma-field generated
More information2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?
MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due 9/5). Prove that every countable set A is measurable and µ(a) = 0. 2 (Bonus). Let A consist of points (x, y) such that either x or y is
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations
Introduction to Empirical Processes and Semiparametric Inference Lecture 13: Entropy Calculations Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research
More informationL p Spaces and Convexity
L p Spaces and Convexity These notes largely follow the treatments in Royden, Real Analysis, and Rudin, Real & Complex Analysis. 1. Convex functions Let I R be an interval. For I open, we say a function
More informationMATHS 730 FC Lecture Notes March 5, Introduction
1 INTRODUCTION MATHS 730 FC Lecture Notes March 5, 2014 1 Introduction Definition. If A, B are sets and there exists a bijection A B, they have the same cardinality, which we write as A, #A. If there exists
More information5 Measure theory II. (or. lim. Prove the proposition. 5. For fixed F A and φ M define the restriction of φ on F by writing.
5 Measure theory II 1. Charges (signed measures). Let (Ω, A) be a σ -algebra. A map φ: A R is called a charge, (or signed measure or σ -additive set function) if φ = φ(a j ) (5.1) A j for any disjoint
More informationMATH MEASURE THEORY AND FOURIER ANALYSIS. Contents
MATH 3969 - MEASURE THEORY AND FOURIER ANALYSIS ANDREW TULLOCH Contents 1. Measure Theory 2 1.1. Properties of Measures 3 1.2. Constructing σ-algebras and measures 3 1.3. Properties of the Lebesgue measure
More informationTHEOREMS, ETC., FOR MATH 515
THEOREMS, ETC., FOR MATH 515 Proposition 1 (=comment on page 17). If A is an algebra, then any finite union or finite intersection of sets in A is also in A. Proposition 2 (=Proposition 1.1). For every
More informationDefining the Integral
Defining the Integral In these notes we provide a careful definition of the Lebesgue integral and we prove each of the three main convergence theorems. For the duration of these notes, let (, M, µ) be
More information(Y; I[X Y ]), where I[A] is the indicator function of the set A. Examples of the current status data are mentioned in Ayer et al. (1955), Keiding (199
CONSISTENCY OF THE GMLE WITH MIXED CASE INTERVAL-CENSORED DATA By Anton Schick and Qiqing Yu Binghamton University April 1997. Revised December 1997, Revised July 1998 Abstract. In this paper we consider
More informationLecture 2: Uniform Entropy
STAT 583: Advanced Theory of Statistical Inference Spring 218 Lecture 2: Uniform Entropy Lecturer: Fang Han April 16 Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal
More informationProblem set 1, Real Analysis I, Spring, 2015.
Problem set 1, Real Analysis I, Spring, 015. (1) Let f n : D R be a sequence of functions with domain D R n. Recall that f n f uniformly if and only if for all ɛ > 0, there is an N = N(ɛ) so that if n
More informationChapter 8. General Countably Additive Set Functions. 8.1 Hahn Decomposition Theorem
Chapter 8 General Countably dditive Set Functions In Theorem 5.2.2 the reader saw that if f : X R is integrable on the measure space (X,, µ) then we can define a countably additive set function ν on by
More informationHILBERT SPACES AND THE RADON-NIKODYM THEOREM. where the bar in the first equation denotes complex conjugation. In either case, for any x V define
HILBERT SPACES AND THE RADON-NIKODYM THEOREM STEVEN P. LALLEY 1. DEFINITIONS Definition 1. A real inner product space is a real vector space V together with a symmetric, bilinear, positive-definite mapping,
More informationStatistics 581 Revision of Section 4.4: Consistency of Maximum Likelihood Estimates Wellner; 11/30/2001
Statistics 581 Revision of Section 4.4: Consistency of Maximum Likelihood Estimates Wellner; 11/30/2001 Some Uniform Strong Laws of Large Numbers Suppose that: A. X, X 1,...,X n are i.i.d. P on the measurable
More informationMeasure Theory on Topological Spaces. Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond
Measure Theory on Topological Spaces Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond May 22, 2011 Contents 1 Introduction 2 1.1 The Riemann Integral........................................ 2 1.2 Measurable..............................................
More information1. Introduction In many biomedical studies, the random survival time of interest is never observed and is only known to lie before an inspection time
ASYMPTOTIC PROPERTIES OF THE GMLE WITH CASE 2 INTERVAL-CENSORED DATA By Qiqing Yu a;1 Anton Schick a, Linxiong Li b;2 and George Y. C. Wong c;3 a Dept. of Mathematical Sciences, Binghamton University,
More informationMAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9
MAT 570 REAL ANALYSIS LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: FALL 204 Contents. Sets 2 2. Functions 5 3. Countability 7 4. Axiom of choice 8 5. Equivalence relations 9 6. Real numbers 9 7. Extended
More informationMaximum likelihood: counterexamples, examples, and open problems
Maximum likelihood: counterexamples, examples, and open problems Jon A. Wellner University of Washington visiting Vrije Universiteit, Amsterdam Talk at BeNeLuxFra Mathematics Meeting 21 May, 2005 Email:
More informationSTATISTICAL INFERENCE UNDER ORDER RESTRICTIONS LIMIT THEORY FOR THE GRENANDER ESTIMATOR UNDER ALTERNATIVE HYPOTHESES
STATISTICAL INFERENCE UNDER ORDER RESTRICTIONS LIMIT THEORY FOR THE GRENANDER ESTIMATOR UNDER ALTERNATIVE HYPOTHESES By Jon A. Wellner University of Washington 1. Limit theory for the Grenander estimator.
More information3 Integration and Expectation
3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ
More informationA Law of the Iterated Logarithm. for Grenander s Estimator
A Law of the Iterated Logarithm for Grenander s Estimator Jon A. Wellner University of Washington, Seattle Probability Seminar October 24, 2016 Based on joint work with: Lutz Dümbgen and Malcolm Wolff
More informationIntegration on Measure Spaces
Chapter 3 Integration on Measure Spaces In this chapter we introduce the general notion of a measure on a space X, define the class of measurable functions, and define the integral, first on a class of
More informationA SKOROHOD REPRESENTATION THEOREM WITHOUT SEPARABILITY PATRIZIA BERTI, LUCA PRATELLI, AND PIETRO RIGO
A SKOROHOD REPRESENTATION THEOREM WITHOUT SEPARABILITY PATRIZIA BERTI, LUCA PRATELLI, AND PIETRO RIGO Abstract. Let (S, d) be a metric space, G a σ-field on S and (µ n : n 0) a sequence of probabilities
More informationBanach Spaces V: A Closer Look at the w- and the w -Topologies
BS V c Gabriel Nagy Banach Spaces V: A Closer Look at the w- and the w -Topologies Notes from the Functional Analysis Course (Fall 07 - Spring 08) In this section we discuss two important, but highly non-trivial,
More informationMaximum likelihood: counterexamples, examples, and open problems. Jon A. Wellner. University of Washington. Maximum likelihood: p.
Maximum likelihood: counterexamples, examples, and open problems Jon A. Wellner University of Washington Maximum likelihood: p. 1/7 Talk at University of Idaho, Department of Mathematics, September 15,
More informationCHAPTER V DUAL SPACES
CHAPTER V DUAL SPACES DEFINITION Let (X, T ) be a (real) locally convex topological vector space. By the dual space X, or (X, T ), of X we mean the set of all continuous linear functionals on X. By the
More informationPart II Probability and Measure
Part II Probability and Measure Theorems Based on lectures by J. Miller Notes taken by Dexter Chua Michaelmas 2016 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationCHAPTER I THE RIESZ REPRESENTATION THEOREM
CHAPTER I THE RIESZ REPRESENTATION THEOREM We begin our study by identifying certain special kinds of linear functionals on certain special vector spaces of functions. We describe these linear functionals
More informationNonparametric estimation under Shape Restrictions
Nonparametric estimation under Shape Restrictions Jon A. Wellner University of Washington, Seattle Statistical Seminar, Frejus, France August 30 - September 3, 2010 Outline: Five Lectures on Shape Restrictions
More informationExamples of Dual Spaces from Measure Theory
Chapter 9 Examples of Dual Spaces from Measure Theory We have seen that L (, A, µ) is a Banach space for any measure space (, A, µ). We will extend that concept in the following section to identify an
More informationMath212a1413 The Lebesgue integral.
Math212a1413 The Lebesgue integral. October 28, 2014 Simple functions. In what follows, (X, F, m) is a space with a σ-field of sets, and m a measure on F. The purpose of today s lecture is to develop the
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results
Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics
More informationis a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications.
Stat 811 Lecture Notes The Wald Consistency Theorem Charles J. Geyer April 9, 01 1 Analyticity Assumptions Let { f θ : θ Θ } be a family of subprobability densities 1 with respect to a measure µ on a measurable
More information1 Weak Convergence in R k
1 Weak Convergence in R k Byeong U. Park 1 Let X and X n, n 1, be random vectors taking values in R k. These random vectors are allowed to be defined on different probability spaces. Below, for the simplicity
More informationWeak convergence. Amsterdam, 13 November Leiden University. Limit theorems. Shota Gugushvili. Generalities. Criteria
Weak Leiden University Amsterdam, 13 November 2013 Outline 1 2 3 4 5 6 7 Definition Definition Let µ, µ 1, µ 2,... be probability measures on (R, B). It is said that µ n converges weakly to µ, and we then
More informationReal Analysis Notes. Thomas Goller
Real Analysis Notes Thomas Goller September 4, 2011 Contents 1 Abstract Measure Spaces 2 1.1 Basic Definitions........................... 2 1.2 Measurable Functions........................ 2 1.3 Integration..............................
More information7 Convergence in R d and in Metric Spaces
STA 711: Probability & Measure Theory Robert L. Wolpert 7 Convergence in R d and in Metric Spaces A sequence of elements a n of R d converges to a limit a if and only if, for each ǫ > 0, the sequence a
More informationAnalysis Comprehensive Exam Questions Fall 2008
Analysis Comprehensive xam Questions Fall 28. (a) Let R be measurable with finite Lebesgue measure. Suppose that {f n } n N is a bounded sequence in L 2 () and there exists a function f such that f n (x)
More informationReal Analysis Problems
Real Analysis Problems Cristian E. Gutiérrez September 14, 29 1 1 CONTINUITY 1 Continuity Problem 1.1 Let r n be the sequence of rational numbers and Prove that f(x) = 1. f is continuous on the irrationals.
More informationReal Analysis, 2nd Edition, G.B.Folland Elements of Functional Analysis
Real Analysis, 2nd Edition, G.B.Folland Chapter 5 Elements of Functional Analysis Yung-Hsiang Huang 5.1 Normed Vector Spaces 1. Note for any x, y X and a, b K, x+y x + y and by ax b y x + b a x. 2. It
More informationSTAT 7032 Probability Spring Wlodek Bryc
STAT 7032 Probability Spring 2018 Wlodek Bryc Created: Friday, Jan 2, 2014 Revised for Spring 2018 Printed: January 9, 2018 File: Grad-Prob-2018.TEX Department of Mathematical Sciences, University of Cincinnati,
More informationarxiv:math/ v2 [math.st] 17 Jun 2008
The Annals of Statistics 2008, Vol. 36, No. 3, 1031 1063 DOI: 10.1214/009053607000000974 c Institute of Mathematical Statistics, 2008 arxiv:math/0609020v2 [math.st] 17 Jun 2008 CURRENT STATUS DATA WITH
More informationEfficiency of Profile/Partial Likelihood in the Cox Model
Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows
More informationReal Analysis Math 131AH Rudin, Chapter #1. Dominique Abdi
Real Analysis Math 3AH Rudin, Chapter # Dominique Abdi.. If r is rational (r 0) and x is irrational, prove that r + x and rx are irrational. Solution. Assume the contrary, that r+x and rx are rational.
More informationconsists of two disjoint copies of X n, each scaled down by 1,
Homework 4 Solutions, Real Analysis I, Fall, 200. (4) Let be a topological space and M be a σ-algebra on which contains all Borel sets. Let m, µ be two positive measures on M. Assume there is a constant
More informationThe International Journal of Biostatistics
The International Journal of Biostatistics Volume 1, Issue 1 2005 Article 3 Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee Jon A. Wellner
More information2 Sequences, Continuity, and Limits
2 Sequences, Continuity, and Limits In this chapter, we introduce the fundamental notions of continuity and limit of a real-valued function of two variables. As in ACICARA, the definitions as well as proofs
More informationSupplementary Notes for W. Rudin: Principles of Mathematical Analysis
Supplementary Notes for W. Rudin: Principles of Mathematical Analysis SIGURDUR HELGASON In 8.00B it is customary to cover Chapters 7 in Rudin s book. Experience shows that this requires careful planning
More informationFunctional Analysis, Stein-Shakarchi Chapter 1
Functional Analysis, Stein-Shakarchi Chapter 1 L p spaces and Banach Spaces Yung-Hsiang Huang 018.05.1 Abstract Many problems are cited to my solution files for Folland [4] and Rudin [6] post here. 1 Exercises
More informationEMPIRICAL PROCESSES: Theory and Applications
Corso estivo di statistica e calcolo delle probabilità EMPIRICAL PROCESSES: Theory and Applications Torgnon, 23 Corrected Version, 2 July 23; 21 August 24 Jon A. Wellner University of Washington Statistics,
More informationProduct-limit estimators of the survival function with left or right censored data
Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut
More informationThe intersection axiom of
The intersection axiom of conditional independence: some new [?] results Richard D. Gill Mathematical Institute, University Leiden This version: 26 March, 2019 (X Y Z) & (X Z Y) X (Y, Z) Presented at Algebraic
More informationChapter 4: Asymptotic Properties of the MLE
Chapter 4: Asymptotic Properties of the MLE Daniel O. Scharfstein 09/19/13 1 / 1 Maximum Likelihood Maximum likelihood is the most powerful tool for estimation. In this part of the course, we will consider
More informationL p Functions. Given a measure space (X, µ) and a real number p [1, ), recall that the L p -norm of a measurable function f : X R is defined by
L p Functions Given a measure space (, µ) and a real number p [, ), recall that the L p -norm of a measurable function f : R is defined by f p = ( ) /p f p dµ Note that the L p -norm of a function f may
More informationTHE INVERSE FUNCTION THEOREM
THE INVERSE FUNCTION THEOREM W. PATRICK HOOPER The implicit function theorem is the following result: Theorem 1. Let f be a C 1 function from a neighborhood of a point a R n into R n. Suppose A = Df(a)
More informationSummary of Real Analysis by Royden
Summary of Real Analysis by Royden Dan Hathaway May 2010 This document is a summary of the theorems and definitions and theorems from Part 1 of the book Real Analysis by Royden. In some areas, such as
More informationMATH5011 Real Analysis I. Exercise 1 Suggested Solution
MATH5011 Real Analysis I Exercise 1 Suggested Solution Notations in the notes are used. (1) Show that every open set in R can be written as a countable union of mutually disjoint open intervals. Hint:
More informationEstimation of the Bivariate and Marginal Distributions with Censored Data
Estimation of the Bivariate and Marginal Distributions with Censored Data Michael Akritas and Ingrid Van Keilegom Penn State University and Eindhoven University of Technology May 22, 2 Abstract Two new
More informationEmpirical Processes and random projections
Empirical Processes and random projections B. Klartag, S. Mendelson School of Mathematics, Institute for Advanced Study, Princeton, NJ 08540, USA. Institute of Advanced Studies, The Australian National
More informationCommutative Banach algebras 79
8. Commutative Banach algebras In this chapter, we analyze commutative Banach algebras in greater detail. So we always assume that xy = yx for all x, y A here. Definition 8.1. Let A be a (commutative)
More informationCONVERGENCE OF MULTIPLE ERGODIC AVERAGES FOR SOME COMMUTING TRANSFORMATIONS
CONVERGENCE OF MULTIPLE ERGODIC AVERAGES FOR SOME COMMUTING TRANSFORMATIONS NIKOS FRANTZIKINAKIS AND BRYNA KRA Abstract. We prove the L 2 convergence for the linear multiple ergodic averages of commuting
More informationd(x n, x) d(x n, x nk ) + d(x nk, x) where we chose any fixed k > N
Problem 1. Let f : A R R have the property that for every x A, there exists ɛ > 0 such that f(t) > ɛ if t (x ɛ, x + ɛ) A. If the set A is compact, prove there exists c > 0 such that f(x) > c for all x
More informationProblem 1: Compactness (12 points, 2 points each)
Final exam Selected Solutions APPM 5440 Fall 2014 Applied Analysis Date: Tuesday, Dec. 15 2014, 10:30 AM to 1 PM You may assume all vector spaces are over the real field unless otherwise specified. Your
More information1. Bounded linear maps. A linear map T : E F of real Banach
DIFFERENTIABLE MAPS 1. Bounded linear maps. A linear map T : E F of real Banach spaces E, F is bounded if M > 0 so that for all v E: T v M v. If v r T v C for some positive constants r, C, then T is bounded:
More informationTopological properties of Z p and Q p and Euclidean models
Topological properties of Z p and Q p and Euclidean models Samuel Trautwein, Esther Röder, Giorgio Barozzi November 3, 20 Topology of Q p vs Topology of R Both R and Q p are normed fields and complete
More information(U) =, if 0 U, 1 U, (U) = X, if 0 U, and 1 U. (U) = E, if 0 U, but 1 U. (U) = X \ E if 0 U, but 1 U. n=1 A n, then A M.
1. Abstract Integration The main reference for this section is Rudin s Real and Complex Analysis. The purpose of developing an abstract theory of integration is to emphasize the difference between the
More informationREAL AND COMPLEX ANALYSIS
REAL AND COMPLE ANALYSIS Third Edition Walter Rudin Professor of Mathematics University of Wisconsin, Madison Version 1.1 No rights reserved. Any part of this work can be reproduced or transmitted in any
More informationProbability and Measure
Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 84 Paper 4, Section II 26J Let (X, A) be a measurable space. Let T : X X be a measurable map, and µ a probability
More informationLebesgue-Radon-Nikodym Theorem
Lebesgue-Radon-Nikodym Theorem Matt Rosenzweig 1 Lebesgue-Radon-Nikodym Theorem In what follows, (, A) will denote a measurable space. We begin with a review of signed measures. 1.1 Signed Measures Definition
More informationSpectral Continuity Properties of Graph Laplacians
Spectral Continuity Properties of Graph Laplacians David Jekel May 24, 2017 Overview Spectral invariants of the graph Laplacian depend continuously on the graph. We consider triples (G, x, T ), where G
More informationFunctional Analysis I
Functional Analysis I Course Notes by Stefan Richter Transcribed and Annotated by Gregory Zitelli Polar Decomposition Definition. An operator W B(H) is called a partial isometry if W x = X for all x (ker
More information6. Duals of L p spaces
6 Duals of L p spaces This section deals with the problem if identifying the duals of L p spaces, p [1, ) There are essentially two cases of this problem: (i) p = 1; (ii) 1 < p < The major difference between
More informationFUNCTIONAL ANALYSIS LECTURE NOTES: WEAK AND WEAK* CONVERGENCE
FUNCTIONAL ANALYSIS LECTURE NOTES: WEAK AND WEAK* CONVERGENCE CHRISTOPHER HEIL 1. Weak and Weak* Convergence of Vectors Definition 1.1. Let X be a normed linear space, and let x n, x X. a. We say that
More informationIntegral Jensen inequality
Integral Jensen inequality Let us consider a convex set R d, and a convex function f : (, + ]. For any x,..., x n and λ,..., λ n with n λ i =, we have () f( n λ ix i ) n λ if(x i ). For a R d, let δ a
More informationINTRODUCTION TO MEASURE THEORY AND LEBESGUE INTEGRATION
1 INTRODUCTION TO MEASURE THEORY AND LEBESGUE INTEGRATION Eduard EMELYANOV Ankara TURKEY 2007 2 FOREWORD This book grew out of a one-semester course for graduate students that the author have taught at
More informationAn elementary proof of the weak convergence of empirical processes
An elementary proof of the weak convergence of empirical processes Dragan Radulović Department of Mathematics, Florida Atlantic University Marten Wegkamp Department of Mathematics & Department of Statistical
More informationLecture 4 Lebesgue spaces and inequalities
Lecture 4: Lebesgue spaces and inequalities 1 of 10 Course: Theory of Probability I Term: Fall 2013 Instructor: Gordan Zitkovic Lecture 4 Lebesgue spaces and inequalities Lebesgue spaces We have seen how
More informationMTH 404: Measure and Integration
MTH 404: Measure and Integration Semester 2, 2012-2013 Dr. Prahlad Vaidyanathan Contents I. Introduction....................................... 3 1. Motivation................................... 3 2. The
More informationMATS113 ADVANCED MEASURE THEORY SPRING 2016
MATS113 ADVANCED MEASURE THEORY SPRING 2016 Foreword These are the lecture notes for the course Advanced Measure Theory given at the University of Jyväskylä in the Spring of 2016. The lecture notes can
More informationChapter 4. The dominated convergence theorem and applications
Chapter 4. The dominated convergence theorem and applications The Monotone Covergence theorem is one of a number of key theorems alllowing one to exchange limits and [Lebesgue] integrals (or derivatives
More information18.175: Lecture 3 Integration
18.175: Lecture 3 Scott Sheffield MIT Outline Outline Recall definitions Probability space is triple (Ω, F, P) where Ω is sample space, F is set of events (the σ-algebra) and P : F [0, 1] is the probability
More informationScore Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics
Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee 1 and Jon A. Wellner 2 1 Department of Statistics, Department of Statistics, 439, West
More informationAN EFFECTIVE METRIC ON C(H, K) WITH NORMAL STRUCTURE. Mona Nabiei (Received 23 June, 2015)
NEW ZEALAND JOURNAL OF MATHEMATICS Volume 46 (2016), 53-64 AN EFFECTIVE METRIC ON C(H, K) WITH NORMAL STRUCTURE Mona Nabiei (Received 23 June, 2015) Abstract. This study first defines a new metric with
More informationFunctional Analysis HW #1
Functional Analysis HW #1 Sangchul Lee October 9, 2015 1 Solutions Solution of #1.1. Suppose that X
More informationAn introduction to some aspects of functional analysis
An introduction to some aspects of functional analysis Stephen Semmes Rice University Abstract These informal notes deal with some very basic objects in functional analysis, including norms and seminorms
More informationStanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon Measures
2 1 Borel Regular Measures We now state and prove an important regularity property of Borel regular outer measures: Stanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon
More informationSOLUTIONS TO HOMEWORK ASSIGNMENT 4
SOLUTIONS TO HOMEWOK ASSIGNMENT 4 Exercise. A criterion for the image under the Hilbert transform to belong to L Let φ S be given. Show that Hφ L if and only if φx dx = 0. Solution: Suppose first that
More informationMath 4121 Spring 2012 Weaver. Measure Theory. 1. σ-algebras
Math 4121 Spring 2012 Weaver Measure Theory 1. σ-algebras A measure is a function which gauges the size of subsets of a given set. In general we do not ask that a measure evaluate the size of every subset,
More informationApproximate Self Consistency for Middle-Censored Data
Approximate Self Consistency for Middle-Censored Data by S. Rao Jammalamadaka Department of Statistics & Applied Probability, University of California, Santa Barbara, CA 93106, USA. and Srikanth K. Iyer
More informationTheorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1
Chapter 2 Probability measures 1. Existence Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension to the generated σ-field Proof of Theorem 2.1. Let F 0 be
More informationFunctional Analysis HW #3
Functional Analysis HW #3 Sangchul Lee October 26, 2015 1 Solutions Exercise 2.1. Let D = { f C([0, 1]) : f C([0, 1])} and define f d = f + f. Show that D is a Banach algebra and that the Gelfand transform
More informationThe main results about probability measures are the following two facts:
Chapter 2 Probability measures The main results about probability measures are the following two facts: Theorem 2.1 (extension). If P is a (continuous) probability measure on a field F 0 then it has a
More informationFinite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product
Chapter 4 Hilbert Spaces 4.1 Inner Product Spaces Inner Product Space. A complex vector space E is called an inner product space (or a pre-hilbert space, or a unitary space) if there is a mapping (, )
More information