On the Uniform Asymptotic Validity of Subsampling and the Bootstrap
Joseph P. Romano
Departments of Economics and Statistics, Stanford University

Azeem M. Shaikh
Department of Economics, University of Chicago

April 13, 2010

Abstract

This paper provides conditions under which subsampling and the bootstrap can be used to construct estimates of the quantiles of the distribution of a root that behave well uniformly over a large class of distributions $\mathbf{P}$. These results are then applied (i) to construct confidence regions that behave well uniformly over $\mathbf{P}$ in the sense that the coverage probability tends to at least the nominal level uniformly over $\mathbf{P}$ and (ii) to construct tests that behave well uniformly over $\mathbf{P}$ in the sense that the size tends to no greater than the nominal level uniformly over $\mathbf{P}$. Without these stronger notions of convergence, the asymptotic approximations to the coverage probability or size may be poor even in very large samples. Specific applications include the multivariate mean, testing moment inequalities, multiple testing, the empirical process, and U-statistics.

KEYWORDS: Bootstrap, Empirical Process, Moment Inequalities, Multiple Testing, Subsampling, Uniformity, U-Statistic
1 Introduction

Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ and denote by $J_n(x, P)$ the distribution of a real-valued root $R_n = R_n(X^{(n)}, P)$ under $P$. In statistics and econometrics, it is often of interest to estimate certain quantiles of $J_n(x, P)$. Two commonly used methods for this purpose are subsampling and the bootstrap. This paper provides conditions under which these estimates behave well uniformly over $\mathbf{P}$. More precisely, for $\alpha \in (0, 1)$, we provide conditions under which subsampling and the bootstrap may be used to construct estimates $\hat c_n(1-\alpha)$ of the $1-\alpha$ quantiles of $J_n(x, P)$ satisfying

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \le \hat c_n(1-\alpha)\} \ge 1-\alpha. \quad (1)$$

Similarly, for $\alpha \in (0, 1)$, we provide conditions under which subsampling and the bootstrap may be used to construct estimates $\hat c_n(\alpha)$ of the $\alpha$ quantiles of $J_n(x, P)$ satisfying

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \ge \hat c_n(\alpha)\} \ge 1-\alpha. \quad (2)$$

We also provide stronger conditions under which

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{\hat c_n(\alpha) \le R_n \le \hat c_n(1-\alpha)\} \ge 1-2\alpha. \quad (3)$$

In many cases, it is possible to replace the $\liminf$ and $\ge$ in (1), (2), and (3) with $\lim$ and $=$, respectively. These results differ from those usually stated in the literature in that they require the convergence to hold uniformly over $\mathbf{P}$ instead of just pointwise over $\mathbf{P}$. The importance of this stronger notion of convergence when applying these results is discussed further below. The additional flexibility allowed for by the $\liminf$ and $\ge$ in (1) and (2) greatly extends the scope of applicability of the results, especially in the case of subsampling. As we will see, $\hat c_n(1-\alpha)$ may satisfy (1) while $\hat c_n(\alpha)$ fails to satisfy (2), or, conversely, $\hat c_n(\alpha)$ may satisfy (2) while $\hat c_n(1-\alpha)$ fails to satisfy (1). This phenomenon arises when it is not possible to estimate $J_n(x, P)$ uniformly well with respect to a suitable metric, but, in a sense to be made precise by our results, it is possible to estimate it sufficiently well to ensure that (1) or (2) still holds.
It is worth noting that metrics compatible with the weak topology are not sufficient for our purposes. In particular, closeness of distributions with respect to such a metric does not ensure closeness of quantiles. See Remark 2.7 for further discussion of this point. In fact, closeness of distributions with respect to even stronger metrics, such as the Kolmogorov metric, does not ensure closeness of quantiles either. For this reason, our results rely heavily on Lemma 4.1 in the Appendix, which relates closeness of distributions with respect to a suitable metric and coverage statements.

The usual arguments for the pointwise asymptotic validity of subsampling and the bootstrap rely on showing for each $P \in \mathbf{P}$ that $\hat c_n(1-\alpha)$ tends in probability under $P$ to the $1-\alpha$ quantile of the limiting distribution of $R_n$ under $P$. Because our results are uniform in $P \in \mathbf{P}$, we must consider the behavior of $R_n$ and $\hat c_n(1-\alpha)$ under arbitrary sequences $\{P_n \in \mathbf{P} : n \ge 1\}$, under which the quantile estimates need not even settle down. Thus, the results are not trivial extensions of the usual pointwise asymptotic arguments.

The construction of $\hat c_n(\alpha)$ and $\hat c_n(1-\alpha)$ satisfying (1) - (3) is useful for constructing confidence regions that behave well uniformly over $\mathbf{P}$. More precisely, our results provide conditions under which subsampling and the bootstrap can be used to construct confidence regions $C_n = C_n(X^{(n)})$ of level $1-\alpha$ for a parameter $\theta(P)$ that are uniformly consistent in level in the sense that

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{\theta(P) \in C_n\} \ge 1-\alpha. \quad (4)$$

Our results are also useful for constructing tests $\phi_n = \phi_n(X^{(n)})$ of level $\alpha$ for a null hypothesis $P \in \mathbf{P}_0 \subseteq \mathbf{P}$ against the alternative $P \in \mathbf{P}_1 = \mathbf{P} \setminus \mathbf{P}_0$ that are uniformly consistent in level in the sense that

$$\limsup_{n\to\infty} \sup_{P\in\mathbf{P}_0} E_P[\phi_n] \le \alpha. \quad (5)$$

In some cases, it is possible to replace the $\liminf$ and $\ge$ in (4) or the $\limsup$ and $\le$ in (5) with $\lim$ and $=$, respectively. Confidence regions satisfying (4) are desirable because they ensure that for every $\epsilon > 0$ there is an $N$ such that for $n > N$ we have that $P\{\theta(P) \in C_n\}$ is no less than $1-\alpha-\epsilon$ for all $P \in \mathbf{P}$. In contrast, confidence regions that are only pointwise consistent in level in the sense that $\liminf_{n\to\infty} P\{\theta(P) \in C_n\} \ge 1-\alpha$ for all $P \in \mathbf{P}$ may have the feature that for every $\epsilon > 0$ and every $n$, however large, there is some $P \in \mathbf{P}$ for which $P\{\theta(P) \in C_n\}$ is less than $1-\alpha-\epsilon$.
Likewise, tests satisfying (5) are desirable because they ensure that for every $\epsilon > 0$ there is an $N$ such that for $n > N$ we have that $E_P[\phi_n]$ is no greater than $\alpha+\epsilon$ for all $P \in \mathbf{P}_0$. In contrast, tests that are only pointwise consistent in level in the sense that $\limsup_{n\to\infty} E_P[\phi_n] \le \alpha$ for all $P \in \mathbf{P}_0$ may have the feature that for every $\epsilon > 0$ and every $n$, however large, there is some $P \in \mathbf{P}_0$ for which $E_P[\phi_n]$ is greater than $\alpha+\epsilon$. For this reason, inferences based on confidence regions or tests that fail to satisfy (4) or (5) may be very misleading in finite samples. Of course, as pointed out by Bahadur and Savage (1956), there may be no nontrivial confidence region or test satisfying (4) or (5) when $\mathbf{P}$ is sufficiently rich. For this reason, we will have to restrict $\mathbf{P}$ appropriately in our examples. In the case of confidence regions for, or tests about, the mean, for instance, we will have to impose a very weak Lindeberg-type condition.

Some of our results on subsampling are closely related to results in Andrews and Guggenberger (2010), which were developed independently and at about the same time as our results. See the discussion on p. 431 of Andrews and Guggenberger (2010). Our results show that the question of whether subsampling can be used to construct estimates $\hat c_n(\alpha)$ and $\hat c_n(1-\alpha)$ satisfying (1) - (3) reduces to a single, succinct requirement on the asymptotic relationship between $J_n(x, P)$ and $J_b(x, P)$, where $b$ is the subsample size, whereas the results of Andrews and Guggenberger (2010) require the verification of a larger number of conditions. On the other hand, the results of Andrews and Guggenberger (2010) further provide a means of calculating the limiting value of $\inf_{P\in\mathbf{P}} P\{R_n \le \hat c_n(1-\alpha)\}$ or $\inf_{P\in\mathbf{P}} P\{R_n \ge \hat c_n(\alpha)\}$ in the case where it may not satisfy (1) or (2). To the best of our knowledge, our results on the bootstrap are the first to be stated at this level of generality. An important antecedent is Romano (1989), who studies the uniform asymptotic behavior of confidence regions for a univariate cumulative distribution function. See also Mikusheva (2007), who analyzes the uniform asymptotic behavior of some tests that arise in the context of an autoregressive model where the autoregressive parameter may be close to one in absolute value.

The remainder of the paper is organized as follows.
In Section 2, we present the conditions under which $\hat c_n(\alpha)$ and $\hat c_n(1-\alpha)$ satisfying (1) - (3) may be constructed using subsampling or the bootstrap. We then provide in Section 3 several applications of our general results. These applications include the multivariate mean, testing moment inequalities, multiple testing, the empirical process, and U-statistics. The discussion of U-statistics is especially noteworthy because it highlights the fact that the assumptions required for the uniform asymptotic validity of subsampling and the bootstrap may differ. In particular, subsampling may be uniformly asymptotically valid under conditions where the bootstrap fails even to be pointwise asymptotically valid. Proofs of all results can be found in the Appendix. Many of the intermediate results in the Appendix may be of independent interest, including uniform weak laws of large numbers for U-statistics and V-statistics (Lemma 4.12 and Lemma 4.13, respectively).
2 General Results

2.1 Subsampling

Theorem 2.1. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$. Denote by $J_n(x, P)$ the distribution of a real-valued root $R_n = R_n(X^{(n)}, P)$ under $P$. Let $b = b_n < n$ be a sequence of positive integers tending to infinity, but satisfying $b/n \to 0$. Let $N_n = \binom{n}{b}$ and

$$L_n(x, P) = \frac{1}{N_n} \sum_{1 \le i \le N_n} I\{R_b(X_{n,(b),i}, P) \le x\}, \quad (6)$$

where $X_{n,(b),i}$ denotes the $i$th subset of data of size $b$. Then, the following statements are true for any $\alpha \in (0, 1)$:

(i) If $\limsup_{n\to\infty} \sup_{P\in\mathbf{P}} \sup_{x\in\mathbf{R}} \{J_b(x, P) - J_n(x, P)\} \le 0$, then

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \le L_n^{-1}(1-\alpha, P)\} \ge 1-\alpha. \quad (7)$$

(ii) If $\limsup_{n\to\infty} \sup_{P\in\mathbf{P}} \sup_{x\in\mathbf{R}} \{J_n(x, P) - J_b(x, P)\} \le 0$, then

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \ge L_n^{-1}(\alpha, P)\} \ge 1-\alpha. \quad (8)$$

(iii) If $\limsup_{n\to\infty} \sup_{P\in\mathbf{P}} \sup_{x\in\mathbf{R}} |J_b(x, P) - J_n(x, P)| = 0$, then

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{L_n^{-1}(\alpha, P) \le R_n \le L_n^{-1}(1-\alpha, P)\} \ge 1-2\alpha. \quad (9)$$

Remark 2.1. It is typically easy to deduce from the conclusions of Theorem 2.1 stronger results in which the $\liminf$ and $\ge$ in (7), (8) and (9) are replaced by $\lim$ and $=$, respectively. For example, in order to assert that (7) holds with $\liminf$ and $\ge$ replaced by $\lim$ and $=$, respectively, all that is required is that $\lim_{n\to\infty} P\{R_n \le L_n^{-1}(1-\alpha, P)\} = 1-\alpha$ for some $P \in \mathbf{P}$. This can be verified using the usual arguments for the pointwise asymptotic validity of subsampling. Indeed, it suffices to show for some $P \in \mathbf{P}$ that $J_n(x, P)$ tends in distribution to a limiting distribution $J(x, P)$ that is continuous at its $1-\alpha$ quantile. See Politis et al. (1999) for details. Likewise, in order to assert that (8) holds with $\liminf$ and $\ge$ replaced by $\lim$ and $=$, respectively, it suffices to show for some $P \in \mathbf{P}$ that $J_n(x, P)$ tends in distribution to a limiting distribution $J(x, P)$ that is continuous at its $\alpha$ quantile. Finally, in order to assert that (9) holds with $\liminf$ and $\ge$ replaced by $\lim$ and $=$, respectively, it suffices to show for some $P \in \mathbf{P}$ that $J_n(x, P)$ tends in distribution to a limiting distribution $J(x, P)$ that is continuous at its $\alpha$ and $1-\alpha$ quantiles.

Remark 2.2. Note that $L_n(x, P)$ as defined in (6) is infeasible because it still depends on $P$, which is unknown, through $R_b(X_{n,(b),i}, P)$. Even so, Theorem 2.1 may be used without modification to construct feasible confidence regions for a parameter of interest $\theta(P)$ provided that $R_n(X^{(n)}, P)$, and therefore $L_n(x, P)$, depends on $P$ only through $\theta(P)$. If this is the case, then one may simply invert tests of the null hypotheses $\theta(P) = \theta$ for all $\theta \in \Theta$ to construct a confidence region for $\theta(P)$. More concretely, suppose $R_n(X^{(n)}, P) = R_n(X^{(n)}, \theta(P))$ and $L_n(x, P) = L_n(x, \theta(P))$. Whenever we may apply part (i) of Theorem 2.1, we have that

$$C_n = \{\theta \in \Theta : R_n(X^{(n)}, \theta) \le L_n^{-1}(1-\alpha, \theta)\}$$

satisfies (4). Similar conclusions follow from parts (ii) and (iii) of Theorem 2.1.

Remark 2.3. In some cases, Theorem 2.1 may be used to construct feasible confidence regions for a parameter of interest $\theta(P)$ without having to invert hypothesis tests. To illustrate this, suppose interest focuses on a real-valued parameter $\theta(P)$ and consider the root

$$R_n(X^{(n)}, P) = \tau_n(\hat\theta_n - \theta(P)). \quad (10)$$

Here, $\hat\theta_n = \hat\theta_n(X^{(n)})$ is some estimate of $\theta(P)$ and $\tau_n$ is a sequence tending to infinity with the sample size in hopes of satisfying the conditions of Theorem 2.1. Let $\hat L_n(x)$ be the feasible estimate of $J_n(x, P)$ given by

$$\hat L_n(x) = \frac{1}{N_n} \sum_{1 \le i \le N_n} I\{\tau_b(\hat\theta_b(X_{n,(b),i}) - \hat\theta_n) \le x\}.$$

In this case, it is easy to see that

$$L_n^{-1}(1-\alpha, P) = \hat L_n^{-1}(1-\alpha) + \tau_b(\hat\theta_n - \theta(P)).$$

Therefore,

$$P\{\tau_n(\hat\theta_n - \theta(P)) \le L_n^{-1}(1-\alpha, P)\} = P\{(\tau_n - \tau_b)(\hat\theta_n - \theta(P)) \le \hat L_n^{-1}(1-\alpha)\}.$$

Hence, whenever we may apply part (i) of Theorem 2.1, we have that

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{(\tau_n - \tau_b)(\hat\theta_n - \theta(P)) \le \hat L_n^{-1}(1-\alpha)\} \ge 1-\alpha.$$

Similar conclusions follow from parts (ii) and (iii) of Theorem 2.1.
An analogous modification is applicable even in some cases where the root is not of the form given by (10), for example, when $\theta(P)$ takes values in a metric space with metric $d$ and $R_n(X^{(n)}, P) = \tau_n d(\hat\theta_n, \theta(P))$.
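To make the construction in Remark 2.3 concrete, here is a minimal Python sketch of the feasible estimate $\hat L_n(x)$ and the resulting one-sided lower confidence bound for a scalar mean, taking $\hat\theta_n$ to be the sample mean and $\tau_n = \sqrt n$. Drawing a fixed number of random subsets in place of all $N_n = \binom{n}{b}$ subsets is a standard computational shortcut; all names and Monte Carlo sizes here are illustrative, not part of the paper.

```python
import numpy as np

def feasible_subsampling_quantile(x, b, alpha, n_subsets=2000, seed=0):
    """Quantile of the feasible subsampling distribution from Remark 2.3:
    each subsample statistic is centered at the full-sample estimate,
    i.e. tau_b * (theta_hat_b - theta_hat_n) with tau_n = sqrt(n)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_n = x.mean()
    draws = np.empty(n_subsets)
    for i in range(n_subsets):
        idx = rng.choice(n, size=b, replace=False)  # a subset, not a bootstrap resample
        draws[i] = np.sqrt(b) * (x[idx].mean() - theta_n)
    return np.quantile(draws, 1 - alpha)

# Usage: per the rescaling in Remark 2.3, the event
# (tau_n - tau_b)(theta_hat_n - theta) <= q has probability at least 1 - alpha,
# which gives the lower confidence bound theta >= theta_hat_n - q / (sqrt(n) - sqrt(b)).
rng = np.random.default_rng(1)
x = rng.normal(loc=1.0, scale=2.0, size=500)
b = 50
q = feasible_subsampling_quantile(x, b, alpha=0.05)
lower_bound = x.mean() - q / (np.sqrt(len(x)) - np.sqrt(b))
```
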
Remark 2.4. In some cases, it may be possible simply to replace $L_n(x, P)$ in Theorem 2.1 with a feasible estimate $\hat L_n(x)$ without affecting the conclusions of the theorem. For example, when $L_n(x, P) = L_n(x, \theta(P))$, such an estimate may be given by $L_n(x, \hat\theta_n)$. The validity of such a modification will typically require first an argument to establish that the quantiles of $\hat L_n(x)$ are close to the quantiles of $L_n(x, P)$ and then a subsequencing argument to establish the desired coverage property with $\hat L_n(x)$ in place of $L_n(x, P)$. See Example 3.1, Example 3.5 and Example 3.6 for details.

Remark 2.5. It is worth emphasizing that even though Theorem 2.1 is stated for roots, it is, of course, applicable in the special case where $R_n(X^{(n)}, P) = T_n(X^{(n)})$, i.e., in the special case where $R_n(X^{(n)}, P)$, and therefore $L_n(x, P)$, does not depend on $P$. This is especially useful in the context of hypothesis testing. See Example 3.3 for one such instance.

2.2 Bootstrap

Theorem 2.2. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$. Denote by $J_n(x, P)$ the distribution of a real-valued root $R_n = R_n(X^{(n)}, P)$ under $P$. Let $\rho(\cdot, \cdot)$ be a function on $\mathbf{P} \times \mathbf{P}$ and let $\hat P_n = \hat P_n(X^{(n)})$ be a (random) sequence of distributions such that for all sequences $\{P_n \in \mathbf{P} : n \ge 1\}$

$$\rho(\hat P_n, P_n) \overset{P_n}{\to} 0 \quad\text{and}\quad P_n\{\hat P_n \in \mathbf{P}\} \to 1.$$

Then, the following statements are true for any $\alpha \in (0, 1)$:

(i) If for all sequences $\{Q_n \in \mathbf{P} : n \ge 1\}$ and $\{P_n \in \mathbf{P} : n \ge 1\}$ satisfying $\rho(Q_n, P_n) \to 0$

$$\limsup_{n\to\infty} \sup_{x\in\mathbf{R}} \{J_n(x, Q_n) - J_n(x, P_n)\} \le 0,$$

then

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \le J_n^{-1}(1-\alpha, \hat P_n)\} \ge 1-\alpha. \quad (11)$$

(ii) If for all sequences $\{Q_n \in \mathbf{P} : n \ge 1\}$ and $\{P_n \in \mathbf{P} : n \ge 1\}$ satisfying $\rho(Q_n, P_n) \to 0$

$$\limsup_{n\to\infty} \sup_{x\in\mathbf{R}} \{J_n(x, P_n) - J_n(x, Q_n)\} \le 0,$$

then

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \ge J_n^{-1}(\alpha, \hat P_n)\} \ge 1-\alpha. \quad (12)$$

(iii) If for all sequences $\{Q_n \in \mathbf{P} : n \ge 1\}$ and $\{P_n \in \mathbf{P} : n \ge 1\}$ satisfying $\rho(Q_n, P_n) \to 0$

$$\limsup_{n\to\infty} \sup_{x\in\mathbf{R}} |J_n(x, Q_n) - J_n(x, P_n)| = 0,$$

then

$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{J_n^{-1}(\alpha, \hat P_n) \le R_n \le J_n^{-1}(1-\alpha, \hat P_n)\} \ge 1-2\alpha. \quad (13)$$

Remark 2.6. It is typically easy to deduce from the conclusions of Theorem 2.2 stronger results in which the $\liminf$ and $\ge$ in (11), (12) and (13) are replaced by $\lim$ and $=$, respectively. For example, in order to assert that (11) holds with $\liminf$ and $\ge$ replaced by $\lim$ and $=$, respectively, all that is required is that

$$\lim_{n\to\infty} P\{R_n \le J_n^{-1}(1-\alpha, \hat P_n)\} = 1-\alpha$$

for some $P \in \mathbf{P}$. This can be verified using the usual arguments for the pointwise asymptotic validity of the bootstrap. Indeed, if the conditions required to apply part (iii) of Theorem 2.2 hold, then it suffices to show for some $P \in \mathbf{P}$ that $J_n(x, P)$ tends in distribution to a limiting distribution $J(x, P)$ that is continuous at its $1-\alpha$ quantile. This follows after noting that along a further subsequence $J_n(x, \hat P_n)$ converges in distribution with probability one under $P$ to $J(x, P)$. Analogous generalizations of assertions (12) and (13) hold as well.

Remark 2.7. In some cases, it is possible to construct estimates $\hat J_n(x)$ of $J_n(x, P)$ that are uniformly consistent over a large class of distributions $\mathbf{P}$ in the sense that for any $\epsilon > 0$

$$\sup_{P\in\mathbf{P}} P\{\rho(\hat J_n(\cdot), J_n(\cdot, P)) > \epsilon\} \to 0, \quad (14)$$

where $\rho$ is the Levy metric or some other metric compatible with the weak topology. Yet a result such as (14) is not strong enough to yield any of (1) - (3). In other words, uniform coverage statements such as those in Theorem 2.1 and Theorem 2.2 do not follow from uniform approximations of the distribution of interest if the quality of the approximation is measured in terms of metrics metrizing weak convergence. To see this, consider the following simple example.

Example 2.1. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P_\theta = \text{Bernoulli}(\theta)$. Denote by $J_n(x, P_\theta)$ the distribution of the root $\sqrt n(\hat\theta_n - \theta)$ under $P_\theta$.
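Before examining when uniformity fails, the critical value $J_n^{-1}(1-\alpha, \hat P_n)$ of Theorem 2.2 can be sketched in this Bernoulli setting, taking $\hat P_n$ to be the parametric estimate $\text{Bernoulli}(\hat\theta_n)$. This is a hedged illustration only; the function names and Monte Carlo sizes are ours.

```python
import numpy as np

def bootstrap_quantile(x, alpha, n_boot=2000, seed=0):
    """Parametric bootstrap estimate of J_n^{-1}(1 - alpha, P_hat_n) for the
    root sqrt(n) * (theta_hat_n - theta) in the Bernoulli(theta) model:
    samples are drawn from Bernoulli(theta_hat_n), and the root is recomputed
    with theta_hat_n playing the role of theta."""
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_hat = x.mean()
    boot_means = rng.binomial(1, theta_hat, size=(n_boot, n)).mean(axis=1)
    roots = np.sqrt(n) * (boot_means - theta_hat)
    return np.quantile(roots, 1 - alpha)

# Usage: part (i) motivates the one-sided region
# {theta : sqrt(n)(theta_hat_n - theta) <= c}, i.e. theta >= theta_hat_n - c / sqrt(n).
rng = np.random.default_rng(2)
x = rng.binomial(1, 0.4, size=200)
c = bootstrap_quantile(x, alpha=0.05)
lower_bound = x.mean() - c / np.sqrt(len(x))
```
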
It is easy to show for any $\epsilon > 0$ that

$$\sup_{0<\theta<1} P_\theta\{\rho(J_n(\cdot, \hat P_n), J_n(\cdot, P_\theta)) > \epsilon\} \to 0 \quad (15)$$

whenever $\rho$ is a metric compatible with the weak topology. Nevertheless, it follows from p. 78 of Romano (1989) that the coverage statements in Theorem 2.2 fail to hold.
On the other hand, when $\rho$ is the Kolmogorov metric, (15) holds when $\theta$ is restricted to an interval of the form $[\delta, 1-\delta]$ for some $\delta > 0$. Moreover, when $\theta$ is restricted to such an interval, the coverage statements in Theorem 2.2 hold as well. Thus, even when using a parametric bootstrap in very smooth parametric models, uniform coverage statements may require further restrictions on $\mathbf{P}$.

3 Applications

3.1 Subsampling

Example 3.1. (Multivariate Nonparametric Mean) Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}^k$. Suppose one wishes to construct a rectangular confidence region for $\mu(P)$. For this purpose, a natural choice of root is

$$R_n(X^{(n)}, P) = \max_{1\le j\le k} \frac{\sqrt n(\bar X_{j,n} - \mu_j(P))}{S_{j,n}}. \quad (16)$$

We have the following theorem:

Theorem 3.1. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}^k$. Denote by $\mathbf{P}_j$ the set of distributions formed from the $j$th marginal distributions of the distributions in $\mathbf{P}$. Suppose $\mathbf{P}$ is such that for all $1 \le j \le k$

$$\lim_{\lambda\to\infty} \sup_{P\in\bar{\mathbf{P}}} E_P\left[\left(\frac{X - \mu(P)}{\sigma(P)}\right)^2 I\left\{\frac{|X - \mu(P)|}{\sigma(P)} > \lambda\right\}\right] = 0 \quad (17)$$

for $\bar{\mathbf{P}} = \mathbf{P}_j$. Let $J_n(x, P)$ be the distribution of the root (16). Let $b = b_n < n$ be a sequence of positive integers tending to infinity, but satisfying $b/n \to 0$, and define $L_n(x, P)$, the subsampling estimate of $J_n(x, P)$, by (6). Then, for any $\alpha \in (0, 1)$, the following statements are true:

(i) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{L_n^{-1}(\alpha, P) \le \max_{1\le j\le k} \frac{\sqrt n(\bar X_{j,n} - \mu_j(P))}{S_{j,n}}\right\} = 1-\alpha.$

(ii) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{\max_{1\le j\le k} \frac{\sqrt n(\bar X_{j,n} - \mu_j(P))}{S_{j,n}} \le L_n^{-1}(1-\alpha, P)\right\} = 1-\alpha.$

(iii) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{L_n^{-1}(\alpha, P) \le \max_{1\le j\le k} \frac{\sqrt n(\bar X_{j,n} - \mu_j(P))}{S_{j,n}} \le L_n^{-1}(1-\alpha, P)\right\} = 1-2\alpha.$
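Computationally, the subsampling estimate in Theorem 3.1 can be sketched as follows, in the feasible form that centers each subsample at the full-sample means. Random subsets approximate the full collection of $\binom{n}{b}$ subsets, and the names, data, and Monte Carlo sizes are ours.

```python
import numpy as np

def max_t_subsampling_quantile(x, b, alpha, n_subsets=2000, seed=0):
    """Feasible subsampling quantile for the max-t root (16): on each
    subsample of size b, recompute max_j sqrt(b)(Xbar_{j,b} - Xbar_{j,n})/S_{j,b},
    with the full-sample means standing in for mu(P)."""
    rng = np.random.default_rng(seed)
    n, k = x.shape
    xbar_n = x.mean(axis=0)
    draws = np.empty(n_subsets)
    for i in range(n_subsets):
        sub = x[rng.choice(n, size=b, replace=False)]
        s_b = sub.std(axis=0, ddof=1)
        draws[i] = np.max(np.sqrt(b) * (sub.mean(axis=0) - xbar_n) / s_b)
    return np.quantile(draws, 1 - alpha)

# Usage: by part (ii), {mu : max_j sqrt(n)(Xbar_j - mu_j)/S_j <= c} is the
# rectangular region mu_j >= Xbar_j - c * S_j / sqrt(n) for every j.
rng = np.random.default_rng(3)
x = rng.normal(size=(400, 3))
c = max_t_subsampling_quantile(x, b=60, alpha=0.05)
s_n = x.std(axis=0, ddof=1)
rect_lower = x.mean(axis=0) - c * s_n / np.sqrt(len(x))
```
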
Note in Theorem 3.1 that $L_n(x, P)$ only depends on $P$ through $\mu(P)$, i.e., $L_n(x, P) = L_n(x, \mu(P))$. Therefore, as described in Remark 2.2, it follows from part (i) of Theorem 3.1 that

$$C_n = \left\{\mu \in \mathbf{R}^k : L_n^{-1}(\alpha, \mu) \le \max_{1\le j\le k} \frac{\sqrt n(\bar X_{j,n} - \mu_j)}{S_{j,n}}\right\}$$

satisfies

$$\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\{\mu(P) \in C_n\} = 1-\alpha. \quad (18)$$

Similar conclusions follow from parts (ii) and (iii) of Theorem 3.1. Moreover, as described in Remark 2.4, with an additional argument it is possible to show that

$$\tilde C_n = \left\{\mu \in \mathbf{R}^k : L_n^{-1}(\alpha, \bar X_n) \le \max_{1\le j\le k} \frac{\sqrt n(\bar X_{j,n} - \mu_j)}{S_{j,n}}\right\}$$

also satisfies (18) with $\tilde C_n$ in place of $C_n$. To see this, first note that

$$\max_{1\le j\le k} \frac{\sqrt b(\bar X_{j,b} - \bar X_{j,n})}{S_{j,b}} \le \max_{1\le j\le k} \frac{\sqrt b(\bar X_{j,b} - \mu_j(P))}{S_{j,b}} + \max_{1\le j\le k} \frac{\sqrt b(\mu_j(P) - \bar X_{j,n})}{S_{j,b}}.$$

Similarly,

$$\max_{1\le j\le k} \frac{\sqrt b(\bar X_{j,b} - \mu_j(P))}{S_{j,b}} \le \max_{1\le j\le k} \frac{\sqrt b(\bar X_{j,b} - \bar X_{j,n})}{S_{j,b}} + \max_{1\le j\le k} \frac{\sqrt b(\bar X_{j,n} - \mu_j(P))}{S_{j,b}}.$$

Consider any sequence $\{P_n \in \mathbf{P} : n \ge 1\}$. Since

$$\max_{1\le j\le k} \frac{\sqrt b(\mu_j(P_n) - \bar X_{j,n})}{S_{j,b}} = o_{P_n}(1),$$

we have that $L_n^{-1}(\alpha, \bar X_n) = L_n^{-1}(\alpha, P_n) + o_{P_n}(1)$. It now follows from an additional subsequencing argument and part (i) of Theorem 3.1 that

$$\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{L_n^{-1}(\alpha, \bar X_n) \le \max_{1\le j\le k} \frac{\sqrt n(\bar X_{j,n} - \mu_j(P))}{S_{j,n}}\right\} = 1-\alpha,$$

from which the desired conclusion follows. Similar statements follow from parts (ii) and (iii) of Theorem 3.1.

Remark 3.1. Theorem 3.1 generalizes to the case where the root is given by $R_n(X^{(n)}, P) = f(Z_n(P), \Omega(\hat P_n))$, where

$$Z_n(P) = \left(\frac{\sqrt n(\bar X_{1,n} - \mu_1(P))}{S_{1,n}}, \ldots, \frac{\sqrt n(\bar X_{k,n} - \mu_k(P))}{S_{k,n}}\right)', \quad (19)$$
$f$ is a real-valued function that is continuous in both of its arguments, and $f(Z, \Omega)$ is continuously distributed whenever $Z \sim N(0, \Omega)$ for some $\Omega$ satisfying $\Omega_{j,j} = 1$ for all $1 \le j \le k$. Under these conditions,

$$\sup_{x\in\mathbf{R}} |J_n(x, P_n) - P\{f(Z, \Omega) \le x\}| \to 0 \quad (20)$$

whenever $Z_n(P_n) \overset{d}{\to} Z$ under $P_n$ and $\Omega(\hat P_n) \overset{P_n}{\to} \Omega$, where $Z \sim N(0, \Omega)$. As a result, the proof given in Section 4.4 can be adapted appropriately to accommodate this situation. The result can be further generalized to allow the distribution of $f(Z, \Omega)$ to have discontinuities. In this case, we require that at any point of discontinuity $x$

$$P_n\{f(Z_n(P_n), \Omega(\hat P_n)) \le x\} \to P\{f(Z, \Omega) \le x\}$$
$$P_n\{f(Z_n(P_n), \Omega(\hat P_n)) < x\} \to P\{f(Z, \Omega) < x\}$$

whenever $Z_n(P_n) \overset{d}{\to} Z$ under $P_n$ and $\Omega(\hat P_n) \overset{P_n}{\to} \Omega$, where $Z \sim N(0, \Omega)$. When this is true, it follows from Lemma 3 on p. 260 of Chow and Teicher (1978), a generalization of Polya's Theorem, that (20) holds, from which the desired result can again be established by modifying the proof in Section 4.4 appropriately.

Example 3.2. (Constrained Univariate Nonparametric Mean) Andrews (2000) considers the following example. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}$. Suppose it is known that $\mu(P) \ge 0$ for all $P \in \mathbf{P}$ and one wishes to construct a confidence interval for $\mu(P)$. A natural choice of root in this case is

$$R_n = R_n(X^{(n)}, P) = \sqrt n(\max\{\bar X_n, 0\} - \mu(P)).$$

This root differs from the one considered in Theorem 3.1 and the ones discussed in Remark 3.1 in the sense that under weak assumptions on $\mathbf{P}$

$$\limsup_{n\to\infty} \sup_{P\in\mathbf{P}} \sup_{x\in\mathbf{R}} \{J_b(x, P) - J_n(x, P)\} \le 0 \quad (21)$$

holds, but

$$\limsup_{n\to\infty} \sup_{P\in\mathbf{P}} \sup_{x\in\mathbf{R}} \{J_n(x, P) - J_b(x, P)\} \le 0 \quad (22)$$

fails to hold. To see this, suppose $\mathbf{P}$ satisfies the uniform integrability requirement (17) with $\bar{\mathbf{P}} = \mathbf{P}$. Note that

$$J_b(x, P) = P\{\max\{Z_b(P), -\sqrt b\,\mu(P)\} \le x\}$$
$$J_n(x, P) = P\{\max\{Z_n(P), -\sqrt n\,\mu(P)\} \le x\},$$

where $Z_b(P) = \sqrt b(\bar X_b - \mu(P))$ and $Z_n(P) = \sqrt n(\bar X_n - \mu(P))$. Since $-\sqrt b\,\mu(P) \ge -\sqrt n\,\mu(P)$ for any $P \in \mathbf{P}$, $J_b(x, P) - J_n(x, P)$ is bounded from above by

$$P\{\max\{Z_b(P), -\sqrt n\,\mu(P)\} \le x\} - J_n(x, P).$$

It now follows from the uniform central limit theorem established in Romano and Shaikh (2008) and Theorem 2.11 of Bhattacharya and Rao (1976) that (21) holds. A similar argument shows that (22) fails to hold. Thus, under these assumptions, subsampling can be used to construct uniformly valid upper critical values for $R_n$, but not uniformly valid lower critical values for $R_n$.

Example 3.3. (Moment Inequalities) The generality of Theorem 2.1 illustrated in Example 3.2 is also useful when testing multisided hypotheses about the mean. To see this, let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}^k$. Define $\mathbf{P}_0 = \{P \in \mathbf{P} : \mu(P) \le 0\}$ and $\mathbf{P}_1 = \mathbf{P} \setminus \mathbf{P}_0$. Consider testing the null hypothesis that $P \in \mathbf{P}_0$ versus the alternative hypothesis that $P \in \mathbf{P}_1$ at level $\alpha \in (0, 1)$. Such hypothesis testing problems have recently received considerable attention in the moment inequality literature in econometrics. See, for example, Andrews and Soares (2010), Andrews and Guggenberger (2010), Andrews and Jia (2009), Bugni (2010), Canay (2010), Romano and Shaikh (2008), and Romano and Shaikh (2010). Theorem 2.1 may be used to construct tests $\phi_n(X^{(n)})$ that are uniformly consistent in level in the sense that (5) holds under weak assumptions on $\mathbf{P}$. Specifically, consider the test $\phi_n(X^{(n)}) = I\{T_n(X^{(n)}) > L_n^{-1}(1-\alpha)\}$, where

$$T_n(X^{(n)}) = \max_{1\le j\le k} \frac{\sqrt n \bar X_{j,n}}{S_{j,n}}$$

and $L_n(x, P)$ is given by (6) with $R_n(X^{(n)}, P) = T_n(X^{(n)})$. If $\mathbf{P}$ is defined as in Theorem 3.1 and $b = b_n < n$ is a sequence of positive integers tending to infinity, but satisfying $b/n \to 0$, then it is possible to show that (21) holds. The desired conclusion therefore follows from Theorem 2.1.
The required argument is essentially the same as the one presented in Romano and Shaikh (2008) for

$$T_n(X^{(n)}) = \max_{1\le j\le k} \max\{\sqrt n \bar X_{j,n}, 0\}^2,$$

though Lemma 4.5 in the Appendix is also needed for establishing (21) in the case considered here because of Studentization. Related results are obtained by Andrews and Guggenberger (2009). Further applications of Theorem 2.1 can be found in Romano and Shaikh (2010).
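A hedged sketch of the resulting subsampling test for the moment inequalities of Example 3.3: the statistic $T_n = \max_j \sqrt n \bar X_{j,n}/S_{j,n}$ does not depend on $P$, so per Remark 2.5 the critical value is simply the $1-\alpha$ quantile of the same (uncentered) statistic over subsamples of size $b$. Function names, data, and subset counts below are illustrative.

```python
import numpy as np

def subsampling_moment_test(x, b, alpha, n_subsets=2000, seed=0):
    """Test of H0: mu(P) <= 0 componentwise. Rejects when the max-t statistic
    T_n exceeds the 1 - alpha quantile of the subsampling distribution of T_b;
    no centering is applied, as in Remark 2.5."""
    rng = np.random.default_rng(seed)
    n, k = x.shape
    t_stat = lambda z: np.max(np.sqrt(len(z)) * z.mean(axis=0) / z.std(axis=0, ddof=1))
    t_n = t_stat(x)
    draws = np.empty(n_subsets)
    for i in range(n_subsets):
        draws[i] = t_stat(x[rng.choice(n, size=b, replace=False)])
    return bool(t_n > np.quantile(draws, 1 - alpha))

rng = np.random.default_rng(4)
x_null = rng.normal(loc=-0.5, size=(400, 3))  # every mean negative: H0 holds
x_alt = rng.normal(loc=0.5, size=(400, 3))    # every mean positive: H0 fails
reject_null = subsampling_moment_test(x_null, b=60, alpha=0.05)
reject_alt = subsampling_moment_test(x_alt, b=60, alpha=0.05)
```
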
Example 3.4. (Multiple Testing) We now illustrate the use of Theorem 2.1 to construct tests of multiple hypotheses that behave well uniformly over a large class of distributions. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}^k$ and consider testing the family of null hypotheses

$$H_j : \mu_j(P) \le 0 \text{ for } 1 \le j \le k \quad (23)$$

versus the alternative hypotheses

$$H_j : \mu_j(P) > 0 \text{ for } 1 \le j \le k \quad (24)$$

in a way that controls the familywise error rate at level $\alpha \in (0, 1)$ in the sense that

$$\limsup_{n\to\infty} \sup_{P\in\mathbf{P}} FWER_P \le \alpha, \quad (25)$$

where $FWER_P = P\{\text{reject some } H_j \text{ with } \mu_j(P) \le 0\}$. For $K \subseteq \{1, \ldots, k\}$, define $L_n(x, K)$ according to the right-hand side of (6) with

$$R_n(X^{(n)}, P) = \max_{j\in K} \frac{\sqrt n \bar X_{j,n}}{S_{j,n}}$$

and consider the following stepwise multiple testing procedure:

Algorithm 3.1.

Step 1: Set $K_1 = \{1, \ldots, k\}$. If

$$\max_{j\in K_1} \frac{\sqrt n \bar X_{j,n}}{S_{j,n}} \le L_n^{-1}(1-\alpha, K_1),$$

then stop. Otherwise, reject any $H_j$ with

$$\frac{\sqrt n \bar X_{j,n}}{S_{j,n}} > L_n^{-1}(1-\alpha, K_1)$$

and continue to Step 2 with

$$K_2 = \left\{j \in K_1 : \frac{\sqrt n \bar X_{j,n}}{S_{j,n}} \le L_n^{-1}(1-\alpha, K_1)\right\}.$$
Step s: If

$$\max_{j\in K_s} \frac{\sqrt n \bar X_{j,n}}{S_{j,n}} \le L_n^{-1}(1-\alpha, K_s),$$

then stop. Otherwise, reject any $H_j$ with

$$\frac{\sqrt n \bar X_{j,n}}{S_{j,n}} > L_n^{-1}(1-\alpha, K_s)$$

and continue to Step $s+1$ with

$$K_{s+1} = \left\{j \in K_s : \frac{\sqrt n \bar X_{j,n}}{S_{j,n}} \le L_n^{-1}(1-\alpha, K_s)\right\}.$$

It follows from the analysis in Romano and Wolf (2005) that Algorithm 3.1 satisfies

$$FWER_P \le P\left\{\max_{j\in I(P)} \frac{\sqrt n \bar X_{j,n}}{S_{j,n}} > L_n^{-1}(1-\alpha, I(P))\right\}, \quad (26)$$

where

$$I(P) = \{1 \le j \le k : \mu_j(P) \le 0\}. \quad (27)$$

If $\mathbf{P}$ is defined as in Theorem 3.1, then the supremum over $\mathbf{P}$ of the right-hand side of (26) is asymptotically bounded from above by $\alpha$ from Example 3.3. It is, of course, possible to extend the analysis in a straightforward way to two-sided testing. See also Romano and Shaikh (2010) for applications of Theorem 2.1 to testing an uncountably infinite number of null hypotheses in a way that satisfies (25).

Example 3.5. (Empirical Process on R) Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}$. Suppose one wishes to construct a confidence region for the cumulative distribution function associated with $P$, i.e., $P\{(-\infty, t]\}$. For this purpose a natural choice of root is

$$\sup_{t\in\mathbf{R}} \sqrt n\,|\hat P_n\{(-\infty, t]\} - P\{(-\infty, t]\}|, \quad (28)$$

where $\hat P_n$ is the empirical distribution of $X^{(n)}$. We have the following theorem:

Theorem 3.2. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$, where

$$\mathbf{P} = \{P \text{ on } \mathbf{R} : \epsilon < P\{(-\infty, t]\} < 1-\epsilon \text{ for some } t \in \mathbf{R}\} \quad (29)$$

for some fixed $\epsilon > 0$. Let $J_n(x, P)$ be the distribution of the root (28). Then, for any $\alpha \in (0, 1)$, the following statements are true:
(i) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{L_n^{-1}(\alpha, P) \le \sup_{t\in\mathbf{R}} \sqrt n\,|\hat P_n\{(-\infty, t]\} - P\{(-\infty, t]\}|\right\} = 1-\alpha.$

(ii) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{\sup_{t\in\mathbf{R}} \sqrt n\,|\hat P_n\{(-\infty, t]\} - P\{(-\infty, t]\}| \le L_n^{-1}(1-\alpha, P)\right\} = 1-\alpha.$

(iii) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{L_n^{-1}(\alpha, P) \le \sup_{t\in\mathbf{R}} \sqrt n\,|\hat P_n\{(-\infty, t]\} - P\{(-\infty, t]\}| \le L_n^{-1}(1-\alpha, P)\right\} = 1-2\alpha.$

We have immediately from part (i) of Theorem 3.2 that

$$C_n = \left\{P : L_n^{-1}(\alpha, P) \le \sup_{t\in\mathbf{R}} \sqrt n\,|\hat P_n\{(-\infty, t]\} - P\{(-\infty, t]\}|\right\}$$

satisfies

$$\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\{P \in C_n\} = 1-\alpha. \quad (30)$$

Similar conclusions follow from parts (ii) and (iii) of Theorem 3.2. Yet this construction is not very useful in practice since it involves searching through all possible distributions on $\mathbf{R}$. On the other hand, as described in Remark 2.4, with an additional argument it is possible to show that

$$\tilde C_n = \left\{P : L_n^{-1}(\alpha, \hat P_n) \le \sup_{t\in\mathbf{R}} \sqrt n\,|\hat P_n\{(-\infty, t]\} - P\{(-\infty, t]\}|\right\}$$

also satisfies (30) with $\tilde C_n$ in place of $C_n$. To see this, first note that

$$\sup_{t\in\mathbf{R}} \sqrt b\,|\hat P_b\{(-\infty, t]\} - \hat P_n\{(-\infty, t]\}| \le \sup_{t\in\mathbf{R}} \sqrt b\,|\hat P_b\{(-\infty, t]\} - P\{(-\infty, t]\}| + \sup_{t\in\mathbf{R}} \sqrt b\,|P\{(-\infty, t]\} - \hat P_n\{(-\infty, t]\}|.$$

Similarly,

$$\sup_{t\in\mathbf{R}} \sqrt b\,|\hat P_b\{(-\infty, t]\} - P\{(-\infty, t]\}| \le \sup_{t\in\mathbf{R}} \sqrt b\,|\hat P_b\{(-\infty, t]\} - \hat P_n\{(-\infty, t]\}| + \sup_{t\in\mathbf{R}} \sqrt b\,|\hat P_n\{(-\infty, t]\} - P\{(-\infty, t]\}|.$$

Consider any sequence $\{P_n \in \mathbf{P} : n \ge 1\}$. Since

$$\sup_{t\in\mathbf{R}} \sqrt b\,|P_n\{(-\infty, t]\} - \hat P_n\{(-\infty, t]\}| = o_{P_n}(1),$$

we have that $L_n^{-1}(\alpha, \hat P_n) = L_n^{-1}(\alpha, P_n) + o_{P_n}(1)$.
It now follows from an additional subsequencing argument and part (i) of Theorem 3.2 that

$$\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{L_n^{-1}(\alpha, \hat P_n) \le \sup_{t\in\mathbf{R}} \sqrt n\,|\hat P_n\{(-\infty, t]\} - P\{(-\infty, t]\}|\right\} = 1-\alpha,$$

from which the desired conclusion follows. Similar statements follow from parts (ii) and (iii) of Theorem 3.2.

Example 3.6. (One Sample U-Statistics) Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}$. Suppose one wishes to construct a confidence region for

$$\theta(P) = \theta_h(P) = E_P[h(X_1, \ldots, X_m)], \quad (31)$$

where $h$ is a symmetric kernel of degree $m$. The usual estimate of $\theta(P)$ in this case is given by the U-statistic

$$\hat\theta_n = \hat\theta_n(X^{(n)}) = \frac{1}{\binom{n}{m}} \sum_c h(X_{i_1}, \ldots, X_{i_m}).$$

Here, $\sum_c$ denotes summation over all $\binom{n}{m}$ subsets $\{i_1, \ldots, i_m\}$ of $\{1, \ldots, n\}$. A natural choice of root is therefore given by

$$R_n(X^{(n)}, P) = \sqrt n(\hat\theta_n - \theta(P)). \quad (32)$$

We have the following theorem:

Theorem 3.3. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}$ and let $h$ be a symmetric kernel of degree $m$. Define

$$g(x, P) = g_h(x, P) = E_P[h(x, X_2, \ldots, X_m)] - \theta(P) \quad (33)$$

and

$$\sigma^2(P) = \sigma_h^2(P) = m^2 \mathrm{Var}_P[g(X_i, P)]. \quad (34)$$

Suppose $\mathbf{P}$ is such that

$$\lim_{\lambda\to\infty} \sup_{P\in\mathbf{P}} E_P\left[\frac{g^2(X_i, P)}{\sigma^2(P)} I\left\{\frac{|g(X_i, P)|}{\sigma(P)} > \lambda\right\}\right] = 0 \quad (35)$$

and

$$\sup_{P\in\mathbf{P}} \frac{\mathrm{Var}_P[h(X_1, \ldots, X_m)]}{\sigma^2(P)} < \infty. \quad (36)$$

Let $J_n(x, P)$ be the distribution of the root (32). Let $b = b_n < n$ be a sequence of positive integers tending to infinity, but satisfying $b/n \to 0$, and define $L_n(x, P)$, the subsampling estimate of $J_n(x, P)$, by (6). Then, for any $\alpha \in (0, 1)$, the following statements are true:
(i) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\{L_n^{-1}(\alpha, P) \le \sqrt n(\hat\theta_n - \theta(P))\} = 1-\alpha.$

(ii) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\{\sqrt n(\hat\theta_n - \theta(P)) \le L_n^{-1}(1-\alpha, P)\} = 1-\alpha.$

(iii) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\{L_n^{-1}(\alpha, P) \le \sqrt n(\hat\theta_n - \theta(P)) \le L_n^{-1}(1-\alpha, P)\} = 1-2\alpha.$

Note in Theorem 3.3 that $L_n(x, P)$ only depends on $P$ through $\theta(P)$, i.e., $L_n(x, P) = L_n(x, \theta(P))$. Therefore, as described in Remark 2.2, it follows from part (i) of Theorem 3.3 that

$$C_n = \{\theta \in \mathbf{R} : L_n^{-1}(\alpha, \theta) \le \sqrt n(\hat\theta_n - \theta)\}$$

satisfies

$$\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\{\theta(P) \in C_n\} = 1-\alpha. \quad (37)$$

Similar conclusions follow from parts (ii) and (iii) of Theorem 3.3. Moreover, as described in Remark 2.4, with an additional argument it is possible to show that

$$\tilde C_n = \{\theta \in \mathbf{R} : L_n^{-1}(\alpha, \hat\theta_n) \le \sqrt n(\hat\theta_n - \theta)\}$$

also satisfies (37) with $\tilde C_n$ in place of $C_n$. To see this, first note that

$$\sqrt b(\hat\theta_b - \hat\theta_n) = \sqrt b(\hat\theta_b - \theta(P)) + \sqrt b(\theta(P) - \hat\theta_n).$$

Consider any sequence $\{P_n \in \mathbf{P} : n \ge 1\}$. Since $\sqrt b(\theta(P_n) - \hat\theta_n) = o_{P_n}(1)$, we have that $L_n^{-1}(\alpha, \hat\theta_n) = L_n^{-1}(\alpha, P_n) + o_{P_n}(1)$. It now follows from an additional subsequencing argument and part (i) of Theorem 3.3 that

$$\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\{L_n^{-1}(\alpha, \hat\theta_n) \le \sqrt n(\hat\theta_n - \theta(P))\} = 1-\alpha,$$

from which the desired conclusion follows. Similar statements follow from parts (ii) and (iii) of Theorem 3.3.
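As a concrete instance of Example 3.6, the kernel $h(x_1, x_2) = (x_1 - x_2)^2/2$ (degree $m = 2$) gives a U-statistic that is unbiased for $\mathrm{Var}_P(X)$. The sketch below computes it and a feasible two-sided subsampling interval for $\theta(P)$ of the form $[\hat\theta_n - \hat L_n^{-1}(1-\alpha)/\sqrt n,\ \hat\theta_n - \hat L_n^{-1}(\alpha)/\sqrt n]$, which by part (iii) has asymptotic coverage at least $1-2\alpha$; the kernel choice, names, and Monte Carlo sizes are ours.

```python
import numpy as np

def u_stat(x):
    """U-statistic with kernel h(x1, x2) = (x1 - x2)^2 / 2 (degree m = 2),
    averaged over all C(n, 2) pairs; unbiased for Var_P(X)."""
    i, j = np.triu_indices(len(x), k=1)
    return np.mean(0.5 * (x[i] - x[j]) ** 2)

def subsampling_interval(x, b, alpha, n_subsets=500, seed=0):
    """Two-sided subsampling interval for theta(P) based on the root
    sqrt(n)(theta_hat_n - theta), with subsample roots centered at theta_hat_n."""
    rng = np.random.default_rng(seed)
    n = len(x)
    theta_n = u_stat(x)
    draws = np.empty(n_subsets)
    for i in range(n_subsets):
        sub = x[rng.choice(n, size=b, replace=False)]
        draws[i] = np.sqrt(b) * (u_stat(sub) - theta_n)
    q_lo, q_hi = np.quantile(draws, [alpha, 1 - alpha])
    return theta_n - q_hi / np.sqrt(n), theta_n - q_lo / np.sqrt(n)

rng = np.random.default_rng(5)
x = rng.normal(scale=2.0, size=300)  # here theta(P) = Var(X) = 4
lo, hi = subsampling_interval(x, b=50, alpha=0.05)
```
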
3.2 Bootstrap

Example 3.7. (Multivariate Nonparametric Mean) Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}^k$. Suppose one wishes to construct a rectangular confidence region for $\mu(P)$. As described in Example 3.1, a natural choice of root in this case is given by (16). We have the following theorem:

Theorem 3.4. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$, where $\mathbf{P}$ is defined as in Theorem 3.1. Let $J_n(x, P)$ be the distribution of the root $R_n$ defined by (16). Denote by $\hat P_n$ the empirical distribution of $X^{(n)}$. Then, for any $\alpha \in (0, 1)$, the following statements are true:

(i) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{J_n^{-1}(\alpha, \hat P_n) \le \max_{1\le j\le k} \frac{\sqrt n(\bar X_{j,n} - \mu_j(P))}{S_{j,n}}\right\} = 1-\alpha.$

(ii) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{\max_{1\le j\le k} \frac{\sqrt n(\bar X_{j,n} - \mu_j(P))}{S_{j,n}} \le J_n^{-1}(1-\alpha, \hat P_n)\right\} = 1-\alpha.$

(iii) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{J_n^{-1}(\alpha, \hat P_n) \le \max_{1\le j\le k} \frac{\sqrt n(\bar X_{j,n} - \mu_j(P))}{S_{j,n}} \le J_n^{-1}(1-\alpha, \hat P_n)\right\} = 1-2\alpha.$

Theorem 3.4 generalizes in the same way that Theorem 3.1 generalizes as described in Remark 3.1. In this case, the proof given in Section 4.7 can be adapted simply by modifying Lemma 4.7 appropriately.

Example 3.8. (Moment Inequalities) Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}^k$ and define $\mathbf{P}_0$ and $\mathbf{P}_1$ as in Example 3.3. Andrews and Jia (2009) propose testing the null hypothesis that $P \in \mathbf{P}_0$ versus the alternative hypothesis that $P \in \mathbf{P}_1$ at level $\alpha \in (0, 1)$ using an adjusted quasi-likelihood ratio statistic $T_n(X^{(n)})$ defined as follows:

$$T_n(X^{(n)}) = \inf_{t\in\mathbf{R}^k : t\le 0} W_n(t)' \bar\Omega_n^{-1} W_n(t).$$

Here,

$$\bar\Omega_n = \Omega(\hat P_n) + \max\{\epsilon - \det(\Omega(\hat P_n)), 0\}\,\mathrm{diag}(\Omega(\hat P_n)),$$

where $\epsilon > 0$ and

$$W_n(t) = \left(\frac{\sqrt n(\bar X_{1,n} - t_1)}{S_{1,n}}, \ldots, \frac{\sqrt n(\bar X_{k,n} - t_k)}{S_{k,n}}\right)'.$$
Andrews and Jia (2009) propose a procedure for constructing critical values for T_n(X^{(n)}) that they term refined moment selection. For illustrative purposes, we instead consider a simpler construction. To this end, define

R_n(X^{(n)}, P) = inf_{t∈R^k : t≥0} (Z_n(P) − t)' Ω̃_n^{-1} (Z_n(P) − t), (38)

where Z_n(P) is defined as in (19). Note that (38) is a function of Z_n(P) and Ω(P̂_n) only, i.e., R_n(X^{(n)}, P) = f(Z_n(P), Ω(P̂_n)). Furthermore, whenever Z ~ N(0, Ω) for some Ω satisfying Ω_{j,j} = 1 for all 1 ≤ j ≤ k, f(Z, Ω) is continuously distributed except for a point mass at zero. But for any sequence {P_n ∈ 𝐏 : n ≥ 1} such that Z_n(P_n) →_d Z under P_n and Ω(P̂_n) →_{P_n} Ω, where Z ~ N(0, Ω), we see that

P_n{f(Z_n(P_n), Ω(P̂_n)) ≤ 0} → P{f(Z, Ω) ≤ 0}
P_n{f(Z_n(P_n), Ω(P̂_n)) < 0} → P{f(Z, Ω) < 0}.

It therefore follows from the generalization of Theorem 3.4 discussed in Example 3.7 that the test

φ_n(X^{(n)}) = I{T_n(X^{(n)}) > J_n^{-1}(1 − α, P̂_n)},

where J_n(x, P) is the distribution of (38), satisfies (5) whenever 𝐏 is defined as in Theorem 3.4.

Example 3.9. (Multiple Testing) Theorem 2.2 may be used in the same way that Theorem 2.1 was used in Example 3.4 to construct tests of multiple hypotheses that behave well uniformly over a large class of distributions. To see this, let X^{(n)} = (X_1, ..., X_n) be an i.i.d. sequence of random variables with distribution P ∈ 𝐏 on R^k and again consider testing the family of null hypotheses (23) versus the alternative hypotheses (24) in a way that satisfies (25). For K ⊆ {1, ..., k}, let J_n(x, K, P) be the distribution of the root

R_n(X^{(n)}, P) = max_{j∈K} √n(X̄_{j,n} − μ_j(P))/S_{j,n}

under P and consider the following stepwise multiple testing procedure:

Algorithm 3.2.

Step 1: Set K_1 = {1, ..., k}. If

max_{j∈K_1} √n X̄_{j,n}/S_{j,n} ≤ J_n^{-1}(1 − α, K_1, P̂_n),

then stop. Otherwise, reject any H_j with

√n X̄_{j,n}/S_{j,n} > J_n^{-1}(1 − α, K_1, P̂_n)
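The quadratic program f(z, Ω) = inf_{t≥0} (z − t)' Ω^{-1} (z − t) behind (38) is convex and can be solved by any QP routine. The sketch below uses projected gradient descent; that algorithmic choice, and the function name, are ours, not part of the paper's construction. When Ω is the identity, the minimizer is t_j = max(z_j, 0), so the value reduces to the sum of squared negative parts of z, which gives an exact check.

```python
import numpy as np

def aqlr_root(z, omega, iters=5000):
    """Compute f(z, Omega) = inf_{t >= 0} (z - t)' Omega^{-1} (z - t) by
    projected gradient descent on the convex quadratic objective."""
    a = np.linalg.inv(omega)                         # A = Omega^{-1}, positive definite
    step = 1.0 / (2.0 * np.linalg.eigvalsh(a).max()) # 1/L for the 2*lambda_max(A)-smooth objective
    t = np.maximum(z, 0.0)                           # feasible starting point
    for _ in range(iters):
        grad = -2.0 * a @ (z - t)                    # gradient of (z-t)' A (z-t) in t
        t = np.maximum(t - step * grad, 0.0)         # gradient step, then project onto t >= 0
    r = z - t
    return float(r @ a @ r)
```

Since t = 0 is always feasible, the value is bounded above by z' Ω^{-1} z, and it is zero whenever z ≥ 0 componentwise, matching the point mass at zero noted above.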
and continue to Step 2 with

K_2 = {j ∈ K_1 : √n X̄_{j,n}/S_{j,n} ≤ J_n^{-1}(1 − α, K_1, P̂_n)}.

⋮

Step s: If

max_{j∈K_s} √n X̄_{j,n}/S_{j,n} ≤ J_n^{-1}(1 − α, K_s, P̂_n),

then stop. Otherwise, reject any H_j with

√n X̄_{j,n}/S_{j,n} > J_n^{-1}(1 − α, K_s, P̂_n)

and continue to Step s + 1 with

K_{s+1} = {j ∈ K_s : √n X̄_{j,n}/S_{j,n} ≤ J_n^{-1}(1 − α, K_s, P̂_n)}.

⋮

It follows from the analysis in Romano and Wolf (2005) that Algorithm 3.2 satisfies

FWER_P ≤ P{max_{j∈I(P)} √n X̄_{j,n}/S_{j,n} > J_n^{-1}(1 − α, I(P), P̂_n)}, (39)

where I(P) is defined as in (27). The righthand-side of (39) can be further bounded from above by

P{max_{j∈I(P)} √n(X̄_{j,n} − μ_j(P))/S_{j,n} > J_n^{-1}(1 − α, I(P), P̂_n)}.

The desired conclusion therefore follows from Theorem 3.4 if 𝐏 is defined as in Theorem 3.1. It is, of course, possible to extend the analysis in a straightforward way to two-sided testing.

Example 3.10. (Empirical Process on R) Let X^{(n)} = (X_1, ..., X_n) be an i.i.d. sequence of random variables with distribution P ∈ 𝐏 on R. Suppose one wishes to construct a confidence region for the cumulative distribution function associated with P, i.e., t ↦ P{(−∞, t]}. As described in Example 3.5, a natural choice of root in this case is given by (28). We have the following theorem:

Theorem 3.5. Let X^{(n)} = (X_1, ..., X_n) be an i.i.d. sequence of random variables with distribution P ∈ 𝐏, where 𝐏 is defined as in (29). Let J_n(x, P) be the distribution of the root (28). Denote by P̂_n the empirical distribution of X^{(n)}. Then, for any α ∈ (0, 1), the following statements are true:

(i) lim inf_{n→∞} inf_{P∈𝐏} P{J_n^{-1}(α, P̂_n) ≤ sup_{t∈R} √n |P̂_n{(−∞, t]} − P{(−∞, t]}|} = 1 − α.
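The stepwise procedure above translates directly into code. The following is a sketch, not the paper's implementation: the bootstrap approximation of J_n^{-1}(1 − α, K_s, P̂_n), the one-sided t-statistics, and all names are our illustrative choices. Precomputing the per-coordinate bootstrap roots once lets every step's critical value be read off as a quantile of the max over the active set.

```python
import numpy as np

def stepdown_max_t(X, alpha=0.05, n_boot=999, seed=0):
    """Stepdown multiple testing in the spirit of Algorithm 3.2: at each step,
    compare max_{j in K_s} sqrt(n) Xbar_j / S_j with the bootstrap 1-alpha
    quantile of max_{j in K_s} sqrt(n)(Xbar*_j - Xbar_j)/S*_j, reject the
    exceeding hypotheses, shrink K_s, and repeat until no rejection."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    xbar, s = X.mean(axis=0), X.std(axis=0, ddof=1)
    tstat = np.sqrt(n) * xbar / s
    boot = np.empty((n_boot, k))                     # bootstrap roots, one column per coordinate
    for b in range(n_boot):
        Xb = X[rng.integers(0, n, size=n)]
        boot[b] = np.sqrt(n) * (Xb.mean(axis=0) - xbar) / Xb.std(axis=0, ddof=1)
    active, rejected = set(range(k)), set()
    while active:
        idx = sorted(active)
        crit = np.quantile(boot[:, idx].max(axis=1), 1 - alpha)  # J_n^{-1}(1-alpha, K_s, P_hat_n)
        new = {j for j in idx if tstat[j] > crit}
        if not new:
            break
        rejected |= new
        active -= new
    return rejected
```

Because the critical value is recomputed over the shrinking active set, later steps use a smaller quantile, which is what gives the stepdown procedure its extra power over the single-step method.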
(ii) lim inf_{n→∞} inf_{P∈𝐏} P{sup_{t∈R} √n |P̂_n{(−∞, t]} − P{(−∞, t]}| ≤ J_n^{-1}(1 − α, P̂_n)} = 1 − α.

(iii) lim inf_{n→∞} inf_{P∈𝐏} P{J_n^{-1}(α, P̂_n) ≤ sup_{t∈R} √n |P̂_n{(−∞, t]} − P{(−∞, t]}| ≤ J_n^{-1}(1 − α, P̂_n)} = 1 − 2α.

Some of the conclusions of Theorem 3.5 can be found in Romano (1989), though the method of proof given in the Appendix is quite different.

Example 3.11. (One Sample U-Statistics) Let X^{(n)} = (X_1, ..., X_n) be an i.i.d. sequence of random variables with distribution P ∈ 𝐏 on R and let h be a symmetric kernel of degree m. Suppose one wishes to construct a confidence region for θ(P) = θ_h(P) given by (31). As described in Example 3.6, a natural choice of root in this case is given by (32). Before proceeding, it is useful to introduce the following notation. For an arbitrary kernel h, ε > 0 and B > 0, denote by 𝐏_{h,ε,B} the set of all distributions P on R such that

E_P[|h(X_1, ..., X_m) − θ_h(P)|^ε] ≤ B. (40)

Similarly, for an arbitrary kernel h and δ > 0, denote by 𝐒_{h,δ} the set of all distributions P on R such that

σ_h²(P) ≥ δ, (41)

where σ_h²(P) is defined as in (34). Finally, for an arbitrary kernel h, ε > 0 and B > 0, let 𝐏̄_{h,ε,B} be the set of distributions P on R such that

E_P[|h(X_{i_1}, ..., X_{i_m}) − θ_h(P)|^ε] ≤ B

whenever 1 ≤ i_j ≤ n for all 1 ≤ j ≤ m. Using this notation, we have the following theorem:

Theorem 3.6. Let X^{(n)} = (X_1, ..., X_n) be an i.i.d. sequence of random variables with distribution P ∈ 𝐏 on R and let h be a symmetric kernel of degree m. Define the kernel h̄ of degree 2m according to the rule

h̄(x_1, ..., x_{2m}) = h(x_1, ..., x_m) h(x_1, x_{m+2}, ..., x_{2m}) − h(x_1, ..., x_m) h(x_{m+1}, ..., x_{2m}). (42)

Suppose 𝐏 ⊆ 𝐏_{h,2+δ,B} ∩ 𝐒_{h,δ} ∩ 𝐏̄_{h̄,1+δ,B} ∩ 𝐏̄_{h,2+δ,B} for some δ > 0 and B > 0. Let J_n(x, P) be the distribution of the root R_n defined by (32). Denote by P̂_n the empirical distribution of X^{(n)}. Then, for any α ∈ (0, 1), the following statements are true:
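The bootstrap quantile for the Kolmogorov–Smirnov-type root of Theorem 3.5 can be approximated by resampling. The sketch below is ours: it evaluates both empirical CDFs at the pooled jump points, where the supremum of the difference of two right-continuous step functions is attained, and uses a finite number of resamples in place of the exact bootstrap distribution.

```python
import numpy as np

def bootstrap_ks_quantile(x, alpha=0.05, n_boot=500, seed=0):
    """Approximate the 1-alpha bootstrap quantile of the root
    sup_t sqrt(n) |F*_n(t) - F_hat_n(t)|, yielding the uniform band
    F_hat_n(t) +/- c / sqrt(n) for the true CDF."""
    rng = np.random.default_rng(seed)
    n = len(x)
    xs = np.sort(x)
    roots = np.empty(n_boot)
    for b in range(n_boot):
        xb = np.sort(x[rng.integers(0, n, size=n)])
        grid = np.concatenate([xs, xb])              # pooled jump points
        f_hat = np.searchsorted(xs, grid, side="right") / n
        f_star = np.searchsorted(xb, grid, side="right") / n
        roots[b] = np.sqrt(n) * np.abs(f_star - f_hat).max()
    return np.quantile(roots, 1 - alpha)
```

For moderate n the resulting critical value is close to the 1 − α quantile of the Kolmogorov distribution (about 1.36 at α = 0.05), as the theory predicts for distributions without large atoms.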
(i) lim inf_{n→∞} inf_{P∈𝐏} P{J_n^{-1}(α, P̂_n) ≤ √n(θ̂_n − θ(P))} = 1 − α.

(ii) lim inf_{n→∞} inf_{P∈𝐏} P{√n(θ̂_n − θ(P)) ≤ J_n^{-1}(1 − α, P̂_n)} = 1 − α.

(iii) lim inf_{n→∞} inf_{P∈𝐏} P{J_n^{-1}(α, P̂_n) ≤ √n(θ̂_n − θ(P)) ≤ J_n^{-1}(1 − α, P̂_n)} = 1 − 2α.

Note that the conditions on 𝐏 in Theorem 3.6 are stronger than the conditions on 𝐏 in Theorem 3.3. While it may be possible to weaken the restrictions on 𝐏 in Theorem 3.6 some, it is not possible to establish the conclusions of Theorem 3.6 under the conditions on 𝐏 in Theorem 3.3. Indeed, as shown by Bickel and Freedman (1981), the bootstrap need not be even pointwise asymptotically valid under the conditions on 𝐏 in Theorem 3.3.

4 Appendix

4.1 Auxiliary Results

Lemma 4.1. Let F and G be (nonrandom) distribution functions on R. Then the following are true:

(i) If sup_x {G(x) − F(x)} ≤ ε, then G^{-1}(1 − α) ≥ F^{-1}(1 − (α + ε)).

(ii) If sup_x {F(x) − G(x)} ≤ ε, then G^{-1}(α) ≤ F^{-1}(α + ε).

Furthermore, if X ~ F, it follows that:

(iii) If sup_x {G(x) − F(x)} ≤ ε, then P{X ≤ G^{-1}(1 − α)} ≥ 1 − (α + ε).

(iv) If sup_x {F(x) − G(x)} ≤ ε, then P{X ≥ G^{-1}(α)} ≥ 1 − (α + ε).
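For concreteness, a degree-2 U-statistic θ̂_n is just the average of the kernel over all pairs. The sketch below (function name ours) computes it by brute force; with the kernel h(x₁, x₂) = (x₁ − x₂)²/2, the resulting U-statistic is exactly the unbiased sample variance, which gives a closed-form check. The value can then be plugged into the root √n(θ̂_n − θ(P)) of (32) for subsampling or bootstrap resamples.

```python
import numpy as np
from itertools import combinations

def u_statistic(x, h):
    """Degree-2 U-statistic: the average of the symmetric kernel h over
    all n-choose-2 pairs i < j."""
    x = np.asarray(x, dtype=float)
    vals = [h(x[i], x[j]) for i, j in combinations(range(len(x)), 2)]
    return float(np.mean(vals))
```

The identity used in the test, (n choose 2)^{-1} Σ_{i<j} (x_i − x_j)²/2 = (n − 1)^{-1} Σ_i (x_i − x̄)², follows from expanding the squared differences.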
(v) If sup_x |G(x) − F(x)| ≤ ε, then P{G^{-1}(α) ≤ X ≤ G^{-1}(1 − α)} ≥ 1 − 2(α + ε).

If Ĝ is a random distribution function on R, then we have further that:

(vi) If P{sup_x {Ĝ(x) − F(x)} ≤ ε} ≥ 1 − δ, then P{X ≤ Ĝ^{-1}(1 − α)} ≥ 1 − (α + ε + δ).

(vii) If P{sup_x {F(x) − Ĝ(x)} ≤ ε} ≥ 1 − δ, then P{X ≥ Ĝ^{-1}(α)} ≥ 1 − (α + ε + δ).

(viii) If P{sup_x |Ĝ(x) − F(x)| ≤ ε} ≥ 1 − δ, then P{Ĝ^{-1}(α) ≤ X ≤ Ĝ^{-1}(1 − α)} ≥ 1 − 2(α + ε + δ).

Proof: To see (i), first note that sup_x {G(x) − F(x)} ≤ ε implies that G(x) − ε ≤ F(x) for all x ∈ R. Thus,

{x ∈ R : G(x) ≥ 1 − α} = {x ∈ R : G(x) − ε ≥ 1 − α − ε} ⊆ {x ∈ R : F(x) ≥ 1 − α − ε},

from which it follows that

F^{-1}(1 − (α + ε)) = inf{x ∈ R : F(x) ≥ 1 − α − ε} ≤ inf{x ∈ R : G(x) ≥ 1 − α} = G^{-1}(1 − α).

Similarly, to prove (ii), first note that sup_x {F(x) − G(x)} ≤ ε implies that F(x) − ε ≤ G(x) for all x ∈ R, so

{x ∈ R : F(x) ≥ α + ε} = {x ∈ R : F(x) − ε ≥ α} ⊆ {x ∈ R : G(x) ≥ α}.

Therefore, G^{-1}(α) = inf{x ∈ R : G(x) ≥ α} ≤ inf{x ∈ R : F(x) ≥ α + ε} = F^{-1}(α + ε). To prove (iii), note that because sup_x {G(x) − F(x)} ≤ ε, it follows from (i) that {X ≤ G^{-1}(1 − α)} ⊇ {X ≤ F^{-1}(1 − (α + ε))}. Hence, P{X ≤ G^{-1}(1 − α)} ≥ P{X ≤ F^{-1}(1 − (α + ε))} ≥ 1 − (α + ε). Using the same reasoning, (iv) follows from (ii) and the assumption that sup_x {F(x) − G(x)} ≤ ε. To see (v), note that

P{G^{-1}(α) ≤ X ≤ G^{-1}(1 − α)} ≥ 1 − P{X < G^{-1}(α)} − P{X > G^{-1}(1 − α)}
= P{X ≥ G^{-1}(α)} + P{X ≤ G^{-1}(1 − α)} − 1 ≥ 1 − 2(α + ε),

where the first inequality follows from the Bonferroni inequality and the second inequality follows from (iii) and (iv).
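Part (i) of Lemma 4.1 can be checked numerically for empirical distribution functions. The sketch below (names ours) implements the generalized inverse G^{-1}(q) = inf{x : G(x) ≥ q} and the one-sided supremum sup_x {G(x) − F(x)}, which for two step functions is attained at the pooled jump points (and is at least 0, since the difference vanishes at ±∞).

```python
import numpy as np

def cdf_quantile(points, q):
    """Generalized inverse G^{-1}(q) = inf{x : G(x) >= q} of an empirical cdf:
    for sorted sample xs, G(xs[i]) = (i+1)/n, so the answer is xs[ceil(q*n)-1]."""
    xs = np.sort(np.asarray(points, dtype=float))
    idx = max(int(np.ceil(q * len(xs))) - 1, 0)
    return xs[idx]

def sup_diff(points_g, points_f):
    """sup_x {G(x) - F(x)} for two empirical cdfs, via right-limits at the
    pooled jump points; floored at 0 because the difference tends to 0 at +/- infinity."""
    gs, fs = np.sort(points_g), np.sort(points_f)
    grid = np.concatenate([gs, fs])
    g = np.searchsorted(gs, grid, side="right") / len(gs)
    f = np.searchsorted(fs, grid, side="right") / len(fs)
    return float(max((g - f).max(), 0.0))
```

With ε = sup_x {G(x) − F(x)} computed exactly, the lemma guarantees G^{-1}(1 − α) ≥ F^{-1}(1 − (α + ε)) for any α with α + ε < 1, so the assertion in the test must hold for any pair of samples.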
To prove (vi), note that

P{X ≤ Ĝ^{-1}(1 − α)}
≥ P{X ≤ Ĝ^{-1}(1 − α), sup_x {Ĝ(x) − F(x)} ≤ ε}
≥ P{X ≤ F^{-1}(1 − (α + ε)), sup_x {Ĝ(x) − F(x)} ≤ ε}
= P{X ≤ F^{-1}(1 − (α + ε))} − P{X ≤ F^{-1}(1 − (α + ε)), sup_x {Ĝ(x) − F(x)} > ε}
≥ P{X ≤ F^{-1}(1 − (α + ε))} − P{sup_x {Ĝ(x) − F(x)} > ε}
≥ 1 − α − ε − δ,

where the second inequality follows from (i). A similar argument using (ii) establishes (vii). Finally, (viii) follows from (vi) and (vii) by an argument analogous to the one used to establish (v).

4.2 Proof of Theorem 2.1

Lemma 4.2. Let X^{(n)} = (X_1, ..., X_n) be an i.i.d. sequence of random variables with distribution P. Denote by J_n(x, P) the distribution of a real-valued root R_n = R_n(X^{(n)}, P) under P. Let N_n = (n choose b), k_n = ⌊n/b⌋ and define L_n(x, P) according to (6). Then, for any ε > 0, we have that

P{sup_x |L_n(x, P) − J_b(x, P)| > ε} ≤ (1/ε)√(2π/k_n). (43)

We also have that for any 0 < δ < 1

P{sup_x |L_n(x, P) − J_b(x, P)| > ε} ≤ δ/ε + (2/ε) exp{−2 k_n δ²}. (44)

Proof: Let ε > 0 be given and define S_n(x, P; X_1, ..., X_n) by

(1/k_n) Σ_{1≤i≤k_n} I{R_b((X_{b(i−1)+1}, ..., X_{bi}), P) ≤ x} − J_b(x, P).

Denote by 𝕊_n the symmetric group with n elements. Note that using this notation, we may rewrite

(1/N_n) Σ_{1≤i≤N_n} I{R_b(X_{n,(b),i}, P) ≤ x} − J_b(x, P)

as

Z_n(x, P; X_1, ..., X_n) = (1/n!) Σ_{π∈𝕊_n} S_n(x, P; X_{π(1)}, ..., X_{π(n)}).
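The symmetrization step at the end of this proof rests on an exact identity: for a symmetric statistic, the average over all (n choose b) subsamples equals the average over all n! permutations of the average over the k_n = ⌊n/b⌋ disjoint blocks. For small n this can be verified by brute force; the sketch below (names ours) does exactly that with the mean as the block statistic.

```python
import numpy as np
from itertools import combinations, permutations

def subsample_average(x, b, stat):
    """(1/N_n) * sum of stat over all size-b subsets of the sample."""
    vals = [stat(x[list(c)]) for c in combinations(range(len(x)), b)]
    return float(np.mean(vals))

def permuted_block_average(x, b, stat):
    """(1/n!) * sum over permutations pi of the average of stat over the
    k_n = n // b disjoint blocks of the permuted sample."""
    n, k = len(x), len(x) // b
    vals = []
    for pi in permutations(range(n)):
        xp = x[list(pi)]
        vals.append(np.mean([stat(xp[i * b:(i + 1) * b]) for i in range(k)]))
    return float(np.mean(vals))
```

By symmetry, every size-b subset appears in the same number of permutation blocks, which is why the two averages coincide exactly; in the proof this lets the full subsampling estimate Z_n be controlled through the disjoint-block estimate S_n and the Dvoretzky–Kiefer–Wolfowitz inequality.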
Note further that

sup_x |Z_n(x, P; X_1, ..., X_n)| ≤ (1/n!) Σ_{π∈𝕊_n} sup_x |S_n(x, P; X_{π(1)}, ..., X_{π(n)})|,

which is a sum of n! identically distributed random variables. It follows that P{sup_x |Z_n(x, P; X_1, ..., X_n)| > ε} is bounded above by

P{(1/n!) Σ_{π∈𝕊_n} sup_x |S_n(x, P; X_{π(1)}, ..., X_{π(n)})| > ε}. (45)

Using Markov's inequality, (45) can be bounded by

(1/ε) E[sup_x |S_n(x, P; X_1, ..., X_n)|] (46)

= (1/ε) ∫_0^1 P{sup_x |S_n(x, P; X_1, ..., X_n)| > u} du. (47)

We may then use the Dvoretzky–Kiefer–Wolfowitz inequality to bound (47) by

(1/ε) ∫_0^1 2 exp{−2 k_n u²} du,

which, when evaluated, yields

(2√(2π)/(ε√k_n)) [Φ(2√k_n) − 1/2] < (1/ε)√(2π/k_n).

To establish (44), note that for any 0 < δ < 1, we have that

E[sup_x |S_n(x, P; X_1, ..., X_n)|] ≤ δ + P{sup_x |S_n(x, P; X_1, ..., X_n)| > δ}. (48)

The result (44) now follows immediately by using the Dvoretzky–Kiefer–Wolfowitz inequality to bound the second term on the righthand-side of (48).

Lemma 4.3. Let X^{(n)} = (X_1, ..., X_n) be an i.i.d. sequence of random variables with distribution P. Denote by J_n(x, P) the distribution of a real-valued root R_n = R_n(X^{(n)}, P) under P. Let k_n = ⌊n/b⌋ and define L_n(x, P) according to (6). Then, for all ε > 0 and γ ∈ (0, 1), we have the following:

(i) P{sup_x {L_n(x, P) − J_n(x, P)} > ε} ≤ (1/(γε))√(2π/k_n) + I{sup_x {J_b(x, P) − J_n(x, P)} > (1 − γ)ε}.

(ii) P{sup_x {J_n(x, P) − L_n(x, P)} > ε} ≤ (1/(γε))√(2π/k_n) + I{sup_x {J_n(x, P) − J_b(x, P)} > (1 − γ)ε}.
(iii) P{sup_x |L_n(x, P) − J_n(x, P)| > ε} ≤ (1/(γε))√(2π/k_n) + I{sup_x |J_b(x, P) − J_n(x, P)| > (1 − γ)ε}.

Proof: Let ε > 0 and γ ∈ (0, 1) be given. Note that

P{sup_x {L_n(x, P) − J_n(x, P)} > ε}
≤ P{sup_x {L_n(x, P) − J_b(x, P)} + sup_x {J_b(x, P) − J_n(x, P)} > ε}
≤ P{sup_x {L_n(x, P) − J_b(x, P)} > γε} + I{sup_x {J_b(x, P) − J_n(x, P)} > (1 − γ)ε}
≤ (1/(γε))√(2π/k_n) + I{sup_x {J_b(x, P) − J_n(x, P)} > (1 − γ)ε},

where the final inequality follows from Lemma 4.2. A similar argument establishes (ii) and (iii).

Lemma 4.4. Let X^{(n)} = (X_1, ..., X_n) be an i.i.d. sequence of random variables with distribution P ∈ 𝐏. Denote by J_n(x, P) the distribution of a real-valued root R_n = R_n(X^{(n)}, P) under P. Let k_n = ⌊n/b⌋ and define L_n(x, P) according to (6). Let

δ_{1,n}(ε, γ, P) = (1/(γε))√(2π/k_n) + I{sup_x {J_b(x, P) − J_n(x, P)} > (1 − γ)ε}

δ_{2,n}(ε, γ, P) = (1/(γε))√(2π/k_n) + I{sup_x {J_n(x, P) − J_b(x, P)} > (1 − γ)ε}

δ_{3,n}(ε, γ, P) = (1/(γε))√(2π/k_n) + I{sup_x |J_b(x, P) − J_n(x, P)| > (1 − γ)ε}.

Then, for any ε > 0 and γ ∈ (0, 1), we have that

(i) P{R_n ≤ L_n^{-1}(1 − α)} ≥ 1 − (α + ε + δ_{1,n}(ε, γ, P)).

(ii) P{R_n ≥ L_n^{-1}(α)} ≥ 1 − (α + ε + δ_{2,n}(ε, γ, P)).

(iii) P{L_n^{-1}(α/2) ≤ R_n ≤ L_n^{-1}(1 − α/2)} ≥ 1 − (α + 2(ε + δ_{3,n}(ε, γ, P))).

Proof: Let ε > 0 and γ ∈ (0, 1) be given. By part (i) of Lemma 4.3, we have that

P{sup_x {L_n(x, P) − J_n(x, P)} > ε} ≤ δ_{1,n}(ε, γ, P).
Assertion (i) follows by applying part (vi) of Lemma 4.1. Assertions (ii) and (iii) are established similarly.

Proof of Theorem 2.1: To prove (i), note that by part (i) of Lemma 4.4, we have for any ε > 0 and γ ∈ (0, 1) that

inf_{P∈𝐏} P{R_n ≤ L_n^{-1}(1 − α)} ≥ 1 − (α + ε + sup_{P∈𝐏} δ_{1,n}(ε, γ, P)),

where

δ_{1,n}(ε, γ, P) = (1/(γε))√(2π/k_n) + I{sup_x {J_b(x, P) − J_n(x, P)} > (1 − γ)ε}.

By the assumption on sup_{P∈𝐏} sup_x {J_b(x, P) − J_n(x, P)}, we have that for every ε > 0,

sup_{P∈𝐏} δ_{1,n}(ε, γ, P) → 0.

Thus, there exists a sequence ε_n > 0 tending to 0 so that sup_{P∈𝐏} δ_{1,n}(ε_n, γ, P) → 0. The desired claim now follows from applying part (i) of Lemma 4.4 to this sequence. Assertions (ii) and (iii) follow in exactly the same way.

4.3 Proof of Theorem 2.2

We prove only (i). Similar arguments can be used to establish (ii) and (iii). Let η > 0 and α ∈ (0, 1) be given. Choose δ > 0 so that sup_x {J_n(x, P′) − J_n(x, P)} < η/2 whenever ρ(P′, P) < δ for P′ ∈ 𝐏 and P ∈ 𝐏. For n sufficiently large, we have that

sup_{P∈𝐏} P{ρ(P̂_n, P) > δ} < η/4 and sup_{P∈𝐏} P{P̂_n ∉ 𝐏} < η/4.

For such n, we therefore have that

1 − η/2 ≤ inf_{P∈𝐏} P{ρ(P̂_n, P) ≤ δ, P̂_n ∈ 𝐏} ≤ inf_{P∈𝐏} P{sup_x {J_n(x, P̂_n) − J_n(x, P)} ≤ η/2}.

It follows from part (vi) of Lemma 4.1 that for such n

inf_{P∈𝐏} P{R_n ≤ J_n^{-1}(1 − α, P̂_n)} ≥ 1 − (α + η).

Since the choice of η was arbitrary, the desired result follows.
4.4 Proof of Theorem 3.1

Lemma 4.5. Consider any sequence {P_n ∈ 𝐏 : n ≥ 1}, where 𝐏 satisfies (17). Let X_{n,i}, i = 1, ..., n be an i.i.d. sequence of random variables with distribution P_n. Then,

S_n² / σ²(P_n) →_{P_n} 1.

Proof: First assume without loss of generality that μ(P_n) = 0 and σ(P_n) = 1 for all n ≥ 1. Next, note that

S_n² = (1/n) Σ_{1≤i≤n} X_{n,i}² − X̄_n².

By a lemma in Lehmann and Romano (2005), we have that

(1/n) Σ_{1≤i≤n} X_{n,i}² →_{P_n} 1.

By a further lemma in Lehmann and Romano (2005), we have that X̄_n →_{P_n} 0. The desired result therefore follows from the continuous mapping theorem.

Proof of Theorem 3.1: We argue that

sup_{P∈𝐏} sup_x |J_b(x, P) − J_n(x, P)| → 0. (49)

Suppose by way of contradiction that (49) fails. It follows that there exists a subsequence n_l such that Ω(P_{n_l}) → Ω and either

sup_x |J_{n_l}(x, P_{n_l}) − Φ_Ω(x, ..., x)| does not tend to 0 (50)

or

sup_x |J_{b_{n_l}}(x, P_{n_l}) − Φ_Ω(x, ..., x)| does not tend to 0. (51)

To see that neither (50) nor (51) can hold, let

K_n(x, P) = P{√n(X̄_{1,n} − μ_1(P))/σ_1(P) ≤ x_1, ..., √n(X̄_{k,n} − μ_k(P))/σ_k(P) ≤ x_k}. (52)

By Polya's Theorem,

sup_{x∈R^k} |Φ_{Ω(P_{n_l})}(x) − Φ_Ω(x)| → 0.
It therefore follows from the uniform central limit theorem established in Romano and Shaikh (2008) that

K_{n_l}(x, P_{n_l}) →_d Φ_Ω(x).

From Lemma 4.5, we have for 1 ≤ j ≤ k that

S_{j,n_l} / σ_j(P_{n_l}) →_{P_{n_l}} 1.

Hence, by Slutsky's Theorem, the analog K̃_{n_l}(x, P_{n_l}) of (52) with S_{j,n_l} in place of σ_j(P_{n_l}) also satisfies K̃_{n_l}(x, P_{n_l}) →_d Φ_Ω(x). By Polya's Theorem, we therefore have that

sup_{x∈R^k} |K̃_{n_l}(x, P_{n_l}) − Φ_Ω(x)| → 0.

We thus see that (50) can not hold. A similar argument establishes that (51) can not hold. Hence, (49) holds. For any fixed P ∈ 𝐏, we also have that J_n(x, P) tends in distribution to a continuous limiting distribution. Assertions (i)–(iii) therefore follow from Theorem 2.1 and the accompanying remark.

4.5 Proof of Theorem 3.2

We begin with some preliminaries. First note that

J_n(x, P) = P{sup_{t∈R} |B_n(P{(−∞, t]})| ≤ x} = P{sup_{t∈R(P)} |B_n(t)| ≤ x},

where B_n is the uniform empirical process and

R(P) = cl({P{(−∞, t]} : t ∈ R}). (53)

By Theorem 3.85 of Aliprantis and Border (2006), the set of all nonempty closed subsets of [0, 1] is a compact metric space with respect to the Hausdorff metric

d_H(U, V) = inf{η > 0 : U ⊆ V^η, V ⊆ U^η}. (54)

Here,

U^η = ∪_{u∈U} A_η(u),

where A_η(u) is the open ball with center u and radius η. Thus, for any sequence {P_n ∈ 𝐏 : n ≥ 1}, there is a subsequence n_l and a closed set R ⊆ [0, 1] along which

d_H(R(P_{n_l}), R) → 0. (55)

Finally, denote by B the standard Brownian bridge process. By the almost sure representation theorem, we may choose B_n and B so that

sup_{0≤t≤1} |B_n(t) − B(t)| → 0 a.s. (56)
We now argue that

sup_{P∈𝐏} sup_x |J_b(x, P) − J_n(x, P)| → 0. (57)

Suppose by way of contradiction that (57) fails. It follows that there exists a subsequence n_l and a closed subset R ⊆ [0, 1] such that d_H(R(P_{n_l}), R) → 0 and either

sup_x |J_{n_l}(x, P_{n_l}) − J(x)| does not tend to 0 (58)

or

sup_x |J_{b_{n_l}}(x, P_{n_l}) − J(x)| does not tend to 0, (59)

where

J(x) = P{sup_{t∈R} |B(t)| ≤ x}.

Moreover, by the definition of 𝐏, it must be the case that R contains some point different from zero and one. To see that neither (58) nor (59) can hold, note that

|sup_{t∈R(P_{n_l})} |B_n(t)| − sup_{t∈R} |B(t)|| ≤ sup_{0≤t≤1} |B_n(t) − B(t)| + |sup_{t∈R(P_{n_l})} |B(t)| − sup_{t∈R} |B(t)||. (60)

By (56), we see that the first term on the righthand-side of (60) tends to zero a.s. By the a.s. uniform continuity of B(t) and (55), we see that the second term on the righthand-side of (60) tends to zero a.s. Thus,

sup_{t∈R(P_{n_l})} |B_n(t)| →_d sup_{t∈R} |B(t)|.

Since R contains some point different from zero and one, we see from Theorem 11.1 of Davydov et al. (1998) that sup_{t∈R} |B(t)| is continuously distributed. By Polya's Theorem, we therefore have that (58) can not hold. A similar argument establishes that (59) can not hold. Hence, (57) holds. For any fixed P ∈ 𝐏, we also have that J_n(x, P) tends in distribution to a continuous limiting distribution. Assertions (i)–(iii) now follow from Theorem 2.1 and the accompanying remark.

4.6 Proof of Theorem 3.3

Lemma 4.6. Let h be a symmetric kernel of degree m. Denote by J_n(x, P) the distribution of R_n(X^{(n)}, P) defined in (32). Suppose 𝐏 satisfies (35) and (36). Then,

lim_{n→∞} sup_{P∈𝐏} sup_x |J_n(x, P) − Φ(x/σ(P))| = 0.
Proof: Let {P_n ∈ 𝐏 : n ≥ 1} be given and denote by X_{n,i}, i = 1, ..., n an i.i.d. sequence of random variables with distribution P_n. Define

Δ_n(P_n) = n^{1/2}(θ̂_n − θ(P_n)) − m n^{−1/2} Σ_{1≤i≤n} g(X_{n,i}, P_n), (61)

where g(x, P) is defined as in (33). Note that Δ_n(P_n) is itself a mean zero, degenerate U-statistic. By Lemma A on p. 183 of Serfling (1980), we therefore see that

Var_{P_n}[Δ_n(P_n)] = n (n choose m)^{−1} Σ_{2≤c≤m} (m choose c)(n − m choose m − c) ζ_c(P_n), (62)

where the terms ζ_c(P_n) are nondecreasing in c and thus

ζ_c(P_n) ≤ ζ_m(P_n) = Var_{P_n}[h(X_1, ..., X_m)].

Hence,

Var_{P_n}[Δ_n(P_n)] ≤ n (n choose m)^{−1} Σ_{2≤c≤m} (m choose c)(n − m choose m − c) Var_{P_n}[h(X_1, ..., X_m)].

It follows that

Var_{P_n}[Δ_n(P_n)/σ(P_n)] ≤ n (n choose m)^{−1} Σ_{2≤c≤m} (m choose c)(n − m choose m − c) · (Var_{P_n}[h(X_1, ..., X_m)] / σ²(P_n)). (63)

Since

n (n choose m)^{−1} Σ_{2≤c≤m} (m choose c)(n − m choose m − c) → 0,

it follows from (36) that the lefthand-side of (63) tends to zero. Therefore, by Chebyshev's inequality, we see that

Δ_n(P_n)/σ(P_n) →_{P_n} 0.

Next, note that from a lemma in Lehmann and Romano (2005) we have that

m n^{−1/2} Σ_{1≤i≤n} g(X_{n,i}, P_n) / σ(P_n)

converges in distribution to the standard normal distribution under P_n. We therefore have further from Slutsky's Theorem that

n^{1/2}(θ̂_n − θ(P_n)) / σ(P_n)

converges in distribution to the standard normal distribution under P_n. An appeal to Polya's Theorem establishes the desired result.

Proof of Theorem 3.3: From the triangle inequality and Lemma 4.6, we see immediately that

sup_{P∈𝐏} sup_x |J_b(x, P) − J_n(x, P)| → 0.

For any fixed P ∈ 𝐏, we also have that J_n(x, P) tends in distribution to a continuous limiting distribution. Assertions (i)–(iii) therefore follow from Theorem 2.1 and the accompanying remark.
ON THE UNIFORM ASYMPTOTIC VALIDITY OF SUBSAMPLING AND THE BOOTSTRAP. By Joseph P. Romano and Azeem M. Shaikh. Technical Report No. 2010-03, April 2010, Department of Statistics, Stanford University, Stanford, California.
Chapter 13 Sequences and Series of Functions These notes are based on the notes A Teacher s Guide to Calculus by Dr. Louis Talman. The treatment of power series that we find in most of today s elementary
More informationSpring 2014 Advanced Probability Overview. Lecture Notes Set 1: Course Overview, σ-fields, and Measures
36-752 Spring 2014 Advanced Probability Overview Lecture Notes Set 1: Course Overview, σ-fields, and Measures Instructor: Jing Lei Associated reading: Sec 1.1-1.4 of Ash and Doléans-Dade; Sec 1.1 and A.1
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results
Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics
More informationProblem List MATH 5143 Fall, 2013
Problem List MATH 5143 Fall, 2013 On any problem you may use the result of any previous problem (even if you were not able to do it) and any information given in class up to the moment the problem was
More informationR N Completeness and Compactness 1
John Nachbar Washington University in St. Louis October 3, 2017 R N Completeness and Compactness 1 1 Completeness in R. As a preliminary step, I first record the following compactness-like theorem about
More informationStatistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation
Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider
More informationON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS
Bendikov, A. and Saloff-Coste, L. Osaka J. Math. 4 (5), 677 7 ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS ALEXANDER BENDIKOV and LAURENT SALOFF-COSTE (Received March 4, 4)
More informationEmpirical Processes: General Weak Convergence Theory
Empirical Processes: General Weak Convergence Theory Moulinath Banerjee May 18, 2010 1 Extended Weak Convergence The lack of measurability of the empirical process with respect to the sigma-field generated
More informationSection 8.2. Asymptotic normality
30 Section 8.2. Asymptotic normality We assume that X n =(X 1,...,X n ), where the X i s are i.i.d. with common density p(x; θ 0 ) P= {p(x; θ) :θ Θ}. We assume that θ 0 is identified in the sense that
More informationPartial Identification and Confidence Intervals
Partial Identification and Confidence Intervals Jinyong Hahn Department of Economics, UCLA Geert Ridder Department of Economics, USC September 17, 009 Abstract We consider statistical inference on a single
More informationASYMPTOTIC SIZE AND A PROBLEM WITH SUBSAMPLING AND WITH THE M OUT OF N BOOTSTRAP. DONALD W.K. ANDREWS and PATRIK GUGGENBERGER
ASYMPTOTIC SIZE AND A PROBLEM WITH SUBSAMPLING AND WITH THE M OUT OF N BOOTSTRAP BY DONALD W.K. ANDREWS and PATRIK GUGGENBERGER COWLES FOUNDATION PAPER NO. 1295 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS
More informationThe Numerical Delta Method and Bootstrap
The Numerical Delta Method and Bootstrap Han Hong and Jessie Li Stanford University and UCSC 1 / 41 Motivation Recent developments in econometrics have given empirical researchers access to estimators
More informationLecture 5. If we interpret the index n 0 as time, then a Markov chain simply requires that the future depends only on the present and not on the past.
1 Markov chain: definition Lecture 5 Definition 1.1 Markov chain] A sequence of random variables (X n ) n 0 taking values in a measurable state space (S, S) is called a (discrete time) Markov chain, if
More informationMaths 212: Homework Solutions
Maths 212: Homework Solutions 1. The definition of A ensures that x π for all x A, so π is an upper bound of A. To show it is the least upper bound, suppose x < π and consider two cases. If x < 1, then
More informationChapter 7. Hypothesis Testing
Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function
More informationStatistics 581, Problem Set 1 Solutions Wellner; 10/3/ (a) The case r = 1 of Chebychev s Inequality is known as Markov s Inequality
Statistics 581, Problem Set 1 Solutions Wellner; 1/3/218 1. (a) The case r = 1 of Chebychev s Inequality is known as Markov s Inequality and is usually written P ( X ɛ) E( X )/ɛ for an arbitrary random
More informationSIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 Revised March 2012
SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 Revised March 2012 COWLES FOUNDATION DISCUSSION PAPER NO. 1815R COWLES FOUNDATION FOR
More informationarxiv: v3 [math.st] 23 May 2016
Inference in partially identified models with many moment arxiv:1604.02309v3 [math.st] 23 May 2016 inequalities using Lasso Federico A. Bugni Mehmet Caner Department of Economics Department of Economics
More informationProofs for Large Sample Properties of Generalized Method of Moments Estimators
Proofs for Large Sample Properties of Generalized Method of Moments Estimators Lars Peter Hansen University of Chicago March 8, 2012 1 Introduction Econometrica did not publish many of the proofs in my
More informationClosest Moment Estimation under General Conditions
Closest Moment Estimation under General Conditions Chirok Han Victoria University of Wellington New Zealand Robert de Jong Ohio State University U.S.A October, 2003 Abstract This paper considers Closest
More informationSolutions Final Exam May. 14, 2014
Solutions Final Exam May. 14, 2014 1. (a) (10 points) State the formal definition of a Cauchy sequence of real numbers. A sequence, {a n } n N, of real numbers, is Cauchy if and only if for every ɛ > 0,
More informationSome Background Material
Chapter 1 Some Background Material In the first chapter, we present a quick review of elementary - but important - material as a way of dipping our toes in the water. This chapter also introduces important
More informationPriors for the frequentist, consistency beyond Schwartz
Victoria University, Wellington, New Zealand, 11 January 2016 Priors for the frequentist, consistency beyond Schwartz Bas Kleijn, KdV Institute for Mathematics Part I Introduction Bayesian and Frequentist
More informationINFERENCE FOR PARAMETERS DEFINED BY MOMENT INEQUALITIES USING GENERALIZED MOMENT SELECTION. DONALD W. K. ANDREWS and GUSTAVO SOARES
INFERENCE FOR PARAMETERS DEFINED BY MOMENT INEQUALITIES USING GENERALIZED MOMENT SELECTION BY DONALD W. K. ANDREWS and GUSTAVO SOARES COWLES FOUNDATION PAPER NO. 1291 COWLES FOUNDATION FOR RESEARCH IN
More informationCopyright 2010 Pearson Education, Inc. Publishing as Prentice Hall.
.1 Limits of Sequences. CHAPTER.1.0. a) True. If converges, then there is an M > 0 such that M. Choose by Archimedes an N N such that N > M/ε. Then n N implies /n M/n M/N < ε. b) False. = n does not converge,
More informationQED. Queen s Economics Department Working Paper No Hypothesis Testing for Arbitrary Bounds. Jeffrey Penney Queen s University
QED Queen s Economics Department Working Paper No. 1319 Hypothesis Testing for Arbitrary Bounds Jeffrey Penney Queen s University Department of Economics Queen s University 94 University Avenue Kingston,
More informationCHAPTER I THE RIESZ REPRESENTATION THEOREM
CHAPTER I THE RIESZ REPRESENTATION THEOREM We begin our study by identifying certain special kinds of linear functionals on certain special vector spaces of functions. We describe these linear functionals
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence
Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations
More informationTesting Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata
Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function
More informationLarge Sample Theory. Consider a sequence of random variables Z 1, Z 2,..., Z n. Convergence in probability: Z n
Large Sample Theory In statistics, we are interested in the properties of particular random variables (or estimators ), which are functions of our data. In ymptotic analysis, we focus on describing the
More informationFundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales
Fundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales Prakash Balachandran Department of Mathematics Duke University April 2, 2008 1 Review of Discrete-Time
More informationPROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo
PROCEDURES CONTROLLING THE k-fdr USING BIVARIATE DISTRIBUTIONS OF THE NULL p-values Sanat K. Sarkar and Wenge Guo Temple University and National Institute of Environmental Health Sciences Abstract: Procedures
More informationSDS : Theoretical Statistics
SDS 384 11: Theoretical Statistics Lecture 1: Introduction Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin https://psarkar.github.io/teaching Manegerial Stuff
More information2 Sequences, Continuity, and Limits
2 Sequences, Continuity, and Limits In this chapter, we introduce the fundamental notions of continuity and limit of a real-valued function of two variables. As in ACICARA, the definitions as well as proofs
More informationMath 118B Solutions. Charles Martin. March 6, d i (x i, y i ) + d i (y i, z i ) = d(x, y) + d(y, z). i=1
Math 8B Solutions Charles Martin March 6, Homework Problems. Let (X i, d i ), i n, be finitely many metric spaces. Construct a metric on the product space X = X X n. Proof. Denote points in X as x = (x,
More informationEconomics 204 Fall 2011 Problem Set 2 Suggested Solutions
Economics 24 Fall 211 Problem Set 2 Suggested Solutions 1. Determine whether the following sets are open, closed, both or neither under the topology induced by the usual metric. (Hint: think about limit
More informationLecture 8 Inequality Testing and Moment Inequality Models
Lecture 8 Inequality Testing and Moment Inequality Models Inequality Testing In the previous lecture, we discussed how to test the nonlinear hypothesis H 0 : h(θ 0 ) 0 when the sample information comes
More informationSTAT 461/561- Assignments, Year 2015
STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and
More informationDoes k-th Moment Exist?
Does k-th Moment Exist? Hitomi, K. 1 and Y. Nishiyama 2 1 Kyoto Institute of Technology, Japan 2 Institute of Economic Research, Kyoto University, Japan Email: hitomi@kit.ac.jp Keywords: Existence of moments,
More informationFunctional Analysis HW #3
Functional Analysis HW #3 Sangchul Lee October 26, 2015 1 Solutions Exercise 2.1. Let D = { f C([0, 1]) : f C([0, 1])} and define f d = f + f. Show that D is a Banach algebra and that the Gelfand transform
More informationStat 710: Mathematical Statistics Lecture 31
Stat 710: Mathematical Statistics Lecture 31 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 31 April 13, 2009 1 / 13 Lecture 31:
More information3 Integration and Expectation
3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ
More informationEstimation of the functional Weibull-tail coefficient
1/ 29 Estimation of the functional Weibull-tail coefficient Stéphane Girard Inria Grenoble Rhône-Alpes & LJK, France http://mistis.inrialpes.fr/people/girard/ June 2016 joint work with Laurent Gardes,
More informationThe Lindeberg central limit theorem
The Lindeberg central limit theorem Jordan Bell jordan.bell@gmail.com Department of Mathematics, University of Toronto May 29, 205 Convergence in distribution We denote by P d the collection of Borel probability
More informationBOOTSTRAPPING SAMPLE QUANTILES BASED ON COMPLEX SURVEY DATA UNDER HOT DECK IMPUTATION
Statistica Sinica 8(998), 07-085 BOOTSTRAPPING SAMPLE QUANTILES BASED ON COMPLEX SURVEY DATA UNDER HOT DECK IMPUTATION Jun Shao and Yinzhong Chen University of Wisconsin-Madison Abstract: The bootstrap
More informationProblem set 1, Real Analysis I, Spring, 2015.
Problem set 1, Real Analysis I, Spring, 015. (1) Let f n : D R be a sequence of functions with domain D R n. Recall that f n f uniformly if and only if for all ɛ > 0, there is an N = N(ɛ) so that if n
More informationMaximum Likelihood Estimation
Chapter 8 Maximum Likelihood Estimation 8. Consistency If X is a random variable (or vector) with density or mass function f θ (x) that depends on a parameter θ, then the function f θ (X) viewed as a function
More information