On the Uniform Asymptotic Validity of Subsampling and the Bootstrap


Joseph P. Romano
Departments of Economics and Statistics
Stanford University

Azeem M. Shaikh
Department of Economics
University of Chicago

April 13, 2010

Abstract

This paper provides conditions under which subsampling and the bootstrap can be used to construct estimates of the quantiles of the distribution of a root that behave well uniformly over a large class of distributions $\mathbf{P}$. These results are then applied (i) to construct confidence regions that behave well uniformly over $\mathbf{P}$ in the sense that the coverage probability tends to at least the nominal level uniformly over $\mathbf{P}$ and (ii) to construct tests that behave well uniformly over $\mathbf{P}$ in the sense that the size tends to no greater than the nominal level uniformly over $\mathbf{P}$. Without these stronger notions of convergence, the asymptotic approximations to the coverage probability or size may be poor even in very large samples. Specific applications include the multivariate mean, testing moment inequalities, multiple testing, the empirical process, and U-statistics.

KEYWORDS: Bootstrap, Empirical Process, Moment Inequalities, Multiple Testing, Subsampling, Uniformity, U-Statistic

1 Introduction

Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ and denote by $J_n(x, P)$ the distribution of a real-valued root $R_n = R_n(X^{(n)}, P)$ under $P$. In statistics and econometrics, it is often of interest to estimate certain quantiles of $J_n(x, P)$. Two commonly used methods for this purpose are subsampling and the bootstrap. This paper provides conditions under which these estimates behave well uniformly over $P \in \mathbf{P}$. More precisely, for $\alpha \in (0,1)$, we provide conditions under which subsampling and the bootstrap may be used to construct estimates $\hat{c}_n(1-\alpha)$ of the $1-\alpha$ quantiles of $J_n(x, P)$ satisfying
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \le \hat{c}_n(1-\alpha)\} \ge 1-\alpha. \qquad (1)$$
Similarly, for $\alpha \in (0,1)$, we provide conditions under which subsampling and the bootstrap may be used to construct estimates $\hat{c}_n(\alpha)$ of the $\alpha$ quantiles of $J_n(x, P)$ satisfying
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \ge \hat{c}_n(\alpha)\} \ge 1-\alpha. \qquad (2)$$
We also provide stronger conditions under which
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{\hat{c}_n(\alpha) \le R_n \le \hat{c}_n(1-\alpha)\} \ge 1-2\alpha. \qquad (3)$$
In many cases, it is possible to replace the liminf and $\ge$ in (1), (2), and (3) with lim and $=$, respectively. These results differ from those usually stated in the literature in that they require the convergence to hold uniformly over $P \in \mathbf{P}$ instead of just pointwise over $P \in \mathbf{P}$. The importance of this stronger notion of convergence when applying these results is discussed further below.

The additional flexibility allowed for by the liminf and $\ge$ in (1) and (2) greatly extends the scope of applicability of the results, especially in the case of subsampling. As we will see, $\hat{c}_n(1-\alpha)$ may satisfy (1) while $\hat{c}_n(\alpha)$ fails to satisfy (2), or, conversely, $\hat{c}_n(\alpha)$ may satisfy (2) while $\hat{c}_n(1-\alpha)$ fails to satisfy (1). This phenomenon arises when it is not possible to estimate $J_n(x, P)$ uniformly well with respect to a suitable metric, but, in a sense to be made precise by our results, it is possible to estimate it sufficiently well to ensure that (1) or (2) still hold.

It is worth noting that metrics compatible with the weak topology are not sufficient for our purposes. In particular, closeness of distributions with respect to such a metric does not ensure closeness of quantiles. See Remark 2.7 for further discussion of this point. In fact, closeness of distributions with respect to even stronger metrics, such as the Kolmogorov metric, does not ensure closeness of quantiles either.

For this reason, our results rely heavily on Lemma 4.1 in the Appendix, which relates closeness of distributions with respect to a suitable metric to coverage statements.

The usual arguments for the pointwise asymptotic validity of subsampling and the bootstrap rely on showing for each $P \in \mathbf{P}$ that $\hat{c}_n(1-\alpha)$ tends in probability under $P$ to the $1-\alpha$ quantile of the limiting distribution of $R_n$ under $P$. Because our results are uniform in $P \in \mathbf{P}$, we must consider the behavior of $R_n$ and $\hat{c}_n(1-\alpha)$ under arbitrary sequences $\{P_n \in \mathbf{P} : n \ge 1\}$, under which the quantile estimates need not even settle down. Thus, the results are not trivial extensions of the usual pointwise asymptotic arguments.

The construction of $\hat{c}_n(\alpha)$ and $\hat{c}_n(1-\alpha)$ satisfying (1)–(3) is useful for constructing confidence regions that behave well uniformly over $P \in \mathbf{P}$. More precisely, our results provide conditions under which subsampling and the bootstrap can be used to construct confidence regions $C_n = C_n(X^{(n)})$ of level $1-\alpha$ for a parameter $\theta(P)$ that are uniformly consistent in level in the sense that
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{\theta(P) \in C_n\} \ge 1-\alpha. \qquad (4)$$
Our results are also useful for constructing tests $\phi_n = \phi_n(X^{(n)})$ of level $\alpha$ for a null hypothesis $P \in \mathbf{P}_0 \subseteq \mathbf{P}$ against the alternative $P \in \mathbf{P}_1 = \mathbf{P} \setminus \mathbf{P}_0$ that are uniformly consistent in level in the sense that
$$\limsup_{n\to\infty} \sup_{P\in\mathbf{P}_0} E_P[\phi_n] \le \alpha. \qquad (5)$$
In some cases, it is possible to replace the liminf and $\ge$ in (4) or the limsup and $\le$ in (5) with lim and $=$, respectively.

Confidence regions satisfying (4) are desirable because they ensure that for every $\epsilon > 0$ there is an $N$ such that for $n > N$ we have that $P\{\theta(P) \in C_n\}$ is no less than $1-\alpha-\epsilon$ for all $P \in \mathbf{P}$. In contrast, confidence regions that are only pointwise consistent in level in the sense that $\liminf_{n\to\infty} P\{\theta(P) \in C_n\} \ge 1-\alpha$ for all $P \in \mathbf{P}$ have the feature that for every $\epsilon > 0$ and every $n$, however large, there is some $P \in \mathbf{P}$ for which $P\{\theta(P) \in C_n\}$ is less than $1-\alpha-\epsilon$. Likewise, tests satisfying (5) are desirable because they ensure that for every $\epsilon > 0$ there is an $N$ such that for $n > N$ we have that $E_P[\phi_n]$ is no greater than $\alpha+\epsilon$ for all $P \in \mathbf{P}_0$. In contrast, tests that are only pointwise consistent in level in the sense that $\limsup_{n\to\infty} E_P[\phi_n] \le \alpha$ for all $P \in \mathbf{P}_0$ have the feature that for every $\epsilon > 0$ and every $n$, however large, there is some $P \in \mathbf{P}_0$ for which $E_P[\phi_n]$ is greater than $\alpha+\epsilon$.

For this reason, inferences based on confidence regions or tests that fail to satisfy (4) or (5) may be very misleading in finite samples. Of course, as pointed out by Bahadur and Savage (1956), there may be no nontrivial confidence region or test satisfying (4) or (5) when $\mathbf{P}$ is sufficiently rich. For this reason, we will have to restrict $\mathbf{P}$ appropriately in our examples. In the case of confidence regions for or tests about the mean, for instance, we will have to impose a very weak Lindeberg-type condition.

Some of our results on subsampling are closely related to results in Andrews and Guggenberger (2010), which were developed independently and at about the same time as our results. See the discussion on p. 431 of Andrews and Guggenberger (2010). Our results show that the question of whether subsampling can be used to construct estimates $\hat{c}_n(\alpha)$ and $\hat{c}_n(1-\alpha)$ satisfying (1)–(3) reduces to a single, succinct requirement on the asymptotic relationship between $J_n(x, P)$ and $J_b(x, P)$, where $b$ is the subsample size, whereas the results of Andrews and Guggenberger (2010) require the verification of a larger number of conditions. On the other hand, the results of Andrews and Guggenberger (2010) further provide a means of calculating the limiting value of $\inf_{P\in\mathbf{P}} P\{R_n \le \hat{c}_n(1-\alpha)\}$ or $\inf_{P\in\mathbf{P}} P\{R_n \ge \hat{c}_n(\alpha)\}$ in the case where it may not satisfy (1) or (2). To the best of our knowledge, our results on the bootstrap are the first to be stated at this level of generality. An important antecedent is Romano (1989), who studies the uniform asymptotic behavior of confidence regions for a univariate cumulative distribution function. See also Mikusheva (2007), who analyzes the uniform asymptotic behavior of some tests that arise in the context of an autoregressive model where the autoregressive parameter may be close to one in absolute value.

The remainder of the paper is organized as follows. In Section 2, we present the conditions under which $\hat{c}_n(\alpha)$ and $\hat{c}_n(1-\alpha)$ satisfying (1)–(3) may be constructed using subsampling or the bootstrap. We then provide in Section 3 several applications of our general results. These applications include the multivariate mean, testing moment inequalities, multiple testing, the empirical process and U-statistics. The discussion of U-statistics is especially noteworthy because it highlights the fact that the assumptions required for the uniform asymptotic validity of subsampling and the bootstrap may differ. In particular, subsampling may be uniformly asymptotically valid under conditions where the bootstrap fails even to be pointwise asymptotically valid. Proofs of all results can be found in the Appendix. Many of the intermediate results in the Appendix may be of independent interest, including uniform weak laws of large numbers for U-statistics and V-statistics (Lemma 4.12 and Lemma 4.13, respectively).

2 General Results

2.1 Subsampling

Theorem 2.1. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$. Denote by $J_n(x, P)$ the distribution of a real-valued root $R_n = R_n(X^{(n)}, P)$ under $P$. Let $b = b_n < n$ be a sequence of positive integers tending to infinity, but satisfying $b/n \to 0$. Let $N_n = \binom{n}{b}$ and
$$L_n(x, P) = \frac{1}{N_n} \sum_{1 \le i \le N_n} I\{R_b(X_{n,(b),i}, P) \le x\}, \qquad (6)$$
where $X_{n,(b),i}$ denotes the $i$th subset of data of size $b$. Then, the following statements are true for any $\alpha \in (0,1)$:

(i) If $\limsup_{n\to\infty} \sup_{P\in\mathbf{P}} \sup_{x\in\mathbf{R}} \{J_b(x, P) - J_n(x, P)\} \le 0$, then
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \le L_n^{-1}(1-\alpha, P)\} \ge 1-\alpha. \qquad (7)$$

(ii) If $\limsup_{n\to\infty} \sup_{P\in\mathbf{P}} \sup_{x\in\mathbf{R}} \{J_n(x, P) - J_b(x, P)\} \le 0$, then
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \ge L_n^{-1}(\alpha, P)\} \ge 1-\alpha. \qquad (8)$$

(iii) If $\limsup_{n\to\infty} \sup_{P\in\mathbf{P}} \sup_{x\in\mathbf{R}} |J_b(x, P) - J_n(x, P)| = 0$, then
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{L_n^{-1}(\alpha, P) \le R_n \le L_n^{-1}(1-\alpha, P)\} \ge 1-2\alpha. \qquad (9)$$

Remark 2.1. It is typically easy to deduce from the conclusions of Theorem 2.1 stronger results in which the liminf and $\ge$ in (7), (8) and (9) are replaced by lim and $=$, respectively. For example, in order to assert that (7) holds with liminf and $\ge$ replaced by lim and $=$, respectively, all that is required is that $\lim_{n\to\infty} P\{R_n \le L_n^{-1}(1-\alpha, P)\} = 1-\alpha$ for some $P \in \mathbf{P}$. This can be verified using the usual arguments for the pointwise asymptotic validity of subsampling. Indeed, it suffices to show for some $P \in \mathbf{P}$ that $J_n(x, P)$ tends in distribution to a limiting distribution $J(x, P)$ that is continuous at its $1-\alpha$ quantile. See Politis et al. (1999) for details. Likewise, in order to assert that (8) holds with liminf and $\ge$ replaced by lim and $=$, respectively, it suffices to show for some $P \in \mathbf{P}$ that $J_n(x, P)$ tends in distribution to a limiting distribution $J(x, P)$ that is continuous at its $\alpha$ quantile. Finally, in order to assert that (9) holds with liminf and $\ge$ replaced by lim and $=$, respectively, it suffices to show for some $P \in \mathbf{P}$ that $J_n(x, P)$ tends in distribution to a limiting distribution $J(x, P)$ that is continuous at its $\alpha$ and $1-\alpha$ quantiles.
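To make the construction in (6) concrete, the following sketch approximates the subsampling distribution of the simple root $\sqrt{n}(\bar{X}_n - \mu(P))$ in a simulation where $\mu(P)$ is known, and then checks the coverage statement (7) by Monte Carlo. It is only an illustration under stated assumptions: it averages over a random collection of subsets rather than all $N_n = \binom{n}{b}$ of them (a common stochastic approximation that is not part of the theorem), and the data-generating distribution, sample sizes, and function names are ours, not the paper's.

```python
import numpy as np

def subsampling_distribution(x, b, root, rng, n_subsets=2000):
    """Approximate L_n(., P) in (6): the empirical distribution of the root
    computed on subsamples of size b (random subsets instead of all C(n, b))."""
    n = len(x)
    vals = np.empty(n_subsets)
    for i in range(n_subsets):
        idx = rng.choice(n, size=b, replace=False)   # one subset of size b
        vals[i] = root(x[idx])
    return vals

# Monte Carlo check of (7) for the root R_n = sqrt(n) * (mean - mu), mu known here.
rng = np.random.default_rng(0)
mu, n, b, alpha, n_rep = 1.0, 2000, 50, 0.05, 200
root = lambda z: np.sqrt(len(z)) * (z.mean() - mu)
covered = 0
for _ in range(n_rep):
    x = rng.exponential(scale=mu, size=n)            # some P with mean mu
    vals = subsampling_distribution(x, b, root, rng)
    c_hat = np.quantile(vals, 1 - alpha)             # L_n^{-1}(1 - alpha, P)
    covered += (root(x) <= c_hat)
print("empirical coverage of (7):", covered / n_rep)
```

With $b/n$ small, the empirical coverage should be close to, or above, $1-\alpha$ for moderately large $n$.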

Remark 2.2. Note that $L_n(x, P)$ as defined in (6) is infeasible because it still depends on $P$, which is unknown, through $R_b(X_{n,(b),i}, P)$. Even so, Theorem 2.1 may be used without modification to construct feasible confidence regions for a parameter of interest $\theta(P)$ provided that $R_n(X^{(n)}, P)$, and therefore $L_n(x, P)$, depends on $P$ only through $\theta(P)$. If this is the case, then one may simply invert tests of the null hypotheses $\theta(P) = \theta$ for all $\theta \in \Theta$ to construct a confidence region for $\theta(P)$. More concretely, suppose $R_n(X^{(n)}, P) = R_n(X^{(n)}, \theta(P))$ and $L_n(x, P) = L_n(x, \theta(P))$. Whenever we may apply part (i) of Theorem 2.1, we have that
$$C_n = \{\theta \in \Theta : R_n(X^{(n)}, \theta) \le L_n^{-1}(1-\alpha, \theta)\}$$
satisfies (4). Similar conclusions follow from parts (ii) and (iii) of Theorem 2.1.

Remark 2.3. In some cases, Theorem 2.1 may be used to construct feasible confidence regions for a parameter of interest $\theta(P)$ without having to invert hypothesis tests. To illustrate this, suppose interest focuses on a real-valued parameter $\theta(P)$ and consider the root
$$R_n(X^{(n)}, P) = \tau_n(\hat{\theta}_n - \theta(P)). \qquad (10)$$
Here, $\hat{\theta}_n = \hat{\theta}_n(X^{(n)})$ is some estimate of $\theta(P)$ and $\tau_n$ is a sequence tending to infinity with the sample size in hopes of satisfying the conditions of Theorem 2.1. Let $\hat{L}_n(x)$ be the feasible estimate of $J_n(x, P)$ given by
$$\hat{L}_n(x) = \frac{1}{N_n} \sum_{1 \le i \le N_n} I\{\tau_b(\hat{\theta}_b(X_{n,(b),i}) - \hat{\theta}_n) \le x\}.$$
In this case, it is easy to see that
$$L_n^{-1}(1-\alpha, P) = \hat{L}_n^{-1}(1-\alpha) + \tau_b(\hat{\theta}_n - \theta(P)).$$
Therefore,
$$P\{\tau_n(\hat{\theta}_n - \theta(P)) \le L_n^{-1}(1-\alpha, P)\} = P\{(\tau_n - \tau_b)(\hat{\theta}_n - \theta(P)) \le \hat{L}_n^{-1}(1-\alpha)\}.$$
Hence, whenever we may apply part (i) of Theorem 2.1, we have that
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{(\tau_n - \tau_b)(\hat{\theta}_n - \theta(P)) \le \hat{L}_n^{-1}(1-\alpha)\} \ge 1-\alpha.$$
Similar conclusions follow from parts (ii) and (iii) of Theorem 2.1. An analogous modification is applicable even in some cases where the root is not of the form given by (10), for example, when $\theta(P)$ takes values in a metric space with metric $d$ and $R_n(X^{(n)}, P) = \tau_n d(\hat{\theta}_n, \theta(P))$.
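The rescaling identity at the end of Remark 2.3 translates directly into a feasible one-sided confidence bound: on the event in the last display, $\theta(P) \ge \hat{\theta}_n - \hat{L}_n^{-1}(1-\alpha)/(\tau_n - \tau_b)$. The sketch below is illustrative rather than part of the paper: it takes $\theta(P)$ to be the mean, $\tau_n = \sqrt{n}$, and approximates $\hat{L}_n$ over randomly drawn subsets.

```python
import numpy as np

def subsampling_lower_bound(x, b, alpha, rng, n_subsets=2000):
    """Feasible lower confidence bound for theta(P) = E_P[X] via Remark 2.3.

    hat_L_n is built from tau_b * (theta_hat_b - theta_hat_n) over subsamples of
    size b, and the bound uses the rescaling by (tau_n - tau_b) derived above."""
    n = len(x)
    theta_hat_n = x.mean()
    tau_n, tau_b = np.sqrt(n), np.sqrt(b)
    vals = np.empty(n_subsets)
    for i in range(n_subsets):
        idx = rng.choice(n, size=b, replace=False)
        vals[i] = tau_b * (x[idx].mean() - theta_hat_n)   # tau_b * (theta_hat_b - theta_hat_n)
    q = np.quantile(vals, 1 - alpha)                      # hat_L_n^{-1}(1 - alpha)
    return theta_hat_n - q / (tau_n - tau_b)

rng = np.random.default_rng(1)
x = rng.lognormal(mean=0.0, sigma=1.0, size=5000)
print("95% lower bound for the mean:", subsampling_lower_bound(x, b=100, alpha=0.05, rng=rng))
```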

Remark 2.4. In some cases it may be possible simply to replace $L_n(x, P)$ in Theorem 2.1 with a feasible estimate $\hat{L}_n(x)$ without affecting the conclusions of the theorem. For example, when $L_n(x, P) = L_n(x, \theta(P))$, such an estimate may be given by $L_n(x, \hat{\theta}_n)$. The validity of such a modification will typically require first an argument to establish that the quantiles of $\hat{L}_n(x)$ are close to the quantiles of $L_n(x, P)$ and then a subsequencing argument to establish the desired coverage property with $\hat{L}_n(x)$ in place of $L_n(x, P)$. See Example 3.1, Example 3.5 and Example 3.6 for details.

Remark 2.5. It is worth emphasizing that even though Theorem 2.1 is stated for roots, it is, of course, applicable in the special case where $R_n(X^{(n)}, P) = T_n(X^{(n)})$, in other words, in the special case where $R_n(X^{(n)}, P)$, and therefore $L_n(x, P)$, does not depend on $P$. This is especially useful in the context of hypothesis testing. See Example 3.3 for one such instance.

2.2 Bootstrap

Theorem 2.2. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$. Denote by $J_n(x, P)$ the distribution of a real-valued root $R_n = R_n(X^{(n)}, P)$ under $P$. Let $\rho(\cdot, \cdot)$ be a function on $\mathbf{P} \times \mathbf{P}$ and let $\hat{P}_n = \hat{P}_n(X^{(n)})$ be a (random) sequence of distributions such that for all sequences $\{P_n \in \mathbf{P} : n \ge 1\}$, $\rho(\hat{P}_n, P_n) \to 0$ in probability under $P_n$ and $P_n\{\hat{P}_n \in \mathbf{P}\} \to 1$. Then, the following are true for any $\alpha \in (0,1)$:

(i) If for all sequences $\{Q_n \in \mathbf{P} : n \ge 1\}$ and $\{P_n \in \mathbf{P} : n \ge 1\}$ satisfying $\rho(Q_n, P_n) \to 0$,
$$\limsup_{n\to\infty} \sup_{x\in\mathbf{R}} \{J_n(x, Q_n) - J_n(x, P_n)\} \le 0,$$
then
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \le J_n^{-1}(1-\alpha, \hat{P}_n)\} \ge 1-\alpha. \qquad (11)$$

(ii) If for all sequences $\{Q_n \in \mathbf{P} : n \ge 1\}$ and $\{P_n \in \mathbf{P} : n \ge 1\}$ satisfying $\rho(Q_n, P_n) \to 0$,
$$\limsup_{n\to\infty} \sup_{x\in\mathbf{R}} \{J_n(x, P_n) - J_n(x, Q_n)\} \le 0,$$
then
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{R_n \ge J_n^{-1}(\alpha, \hat{P}_n)\} \ge 1-\alpha. \qquad (12)$$

(iii) If for all sequences $\{Q_n \in \mathbf{P} : n \ge 1\}$ and $\{P_n \in \mathbf{P} : n \ge 1\}$ satisfying $\rho(Q_n, P_n) \to 0$,
$$\limsup_{n\to\infty} \sup_{x\in\mathbf{R}} |J_n(x, Q_n) - J_n(x, P_n)| = 0,$$
then
$$\liminf_{n\to\infty} \inf_{P\in\mathbf{P}} P\{J_n^{-1}(\alpha, \hat{P}_n) \le R_n \le J_n^{-1}(1-\alpha, \hat{P}_n)\} \ge 1-2\alpha. \qquad (13)$$

Remark 2.6. It is typically easy to deduce from the conclusions of Theorem 2.2 stronger results in which the liminf and $\ge$ in (11), (12) and (13) are replaced by lim and $=$, respectively. For example, in order to assert that (11) holds with liminf and $\ge$ replaced by lim and $=$, respectively, all that is required is that
$$\lim_{n\to\infty} P\{R_n \le J_n^{-1}(1-\alpha, \hat{P}_n)\} = 1-\alpha$$
for some $P \in \mathbf{P}$. This can be verified using the usual arguments for the pointwise asymptotic validity of the bootstrap. Indeed, if the conditions required to apply part (iii) of Theorem 2.2 hold, then it suffices to show for some $P \in \mathbf{P}$ that $J_n(x, P)$ tends in distribution to a limiting distribution $J(x, P)$ that is continuous at its $1-\alpha$ quantile. This follows after noting that along a further subsequence $J_n(x, \hat{P}_n)$ converges in distribution with probability one under $P$ to $J(x, P)$. Analogous generalizations of assertions (12) and (13) hold as well.

Remark 2.7. In some cases, it is possible to construct estimates $\hat{J}_n(x)$ of $J_n(x, P)$ that are uniformly consistent over a large class of distributions $\mathbf{P}$ in the sense that for any $\epsilon > 0$
$$\sup_{P\in\mathbf{P}} P\{\rho(\hat{J}_n(\cdot), J_n(\cdot, P)) > \epsilon\} \to 0, \qquad (14)$$
where $\rho$ is the Levy metric or some other metric compatible with the weak topology. Yet a result such as (14) is not strong enough to yield any of (1)–(3). In other words, uniform coverage statements such as those in Theorem 2.1 and Theorem 2.2 do not follow from uniform approximations of the distribution of interest if the quality of the approximation is measured in terms of metrics metrizing weak convergence. To see this, consider the following simple example.

Example 2.1. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P_\theta = \mathrm{Bernoulli}(\theta)$. Denote by $J_n(x, P_\theta)$ the distribution of the root $\sqrt{n}(\hat{\theta}_n - \theta)$ under $P_\theta$. It is easy to show for any $\epsilon > 0$ that
$$\sup_{0<\theta<1} P_\theta\{\rho(J_n(\cdot, \hat{P}_n), J_n(\cdot, P_\theta)) > \epsilon\} \to 0 \qquad (15)$$
whenever $\rho$ is a metric compatible with the weak topology. Nevertheless, it follows from p. 78 of Romano (1989) that the coverage statements in Theorem 2.2 fail to hold. On the other hand, when $\rho$ is the Kolmogorov metric, (15) holds when $\theta$ is restricted to an interval of the form $[\delta, 1-\delta]$ for some $\delta > 0$. Moreover, when $\theta$ is restricted to such an interval, the coverage statements in Theorem 2.2 hold as well. Thus, even when using a parametric bootstrap in very smooth parametric models, uniform coverage statements may require further restrictions on $\mathbf{P}$.
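Before turning to the applications, it may help to see the bootstrap analogue of the subsampling computation: the critical value in Theorem 2.2 is a quantile of $J_n(\cdot, \hat{P}_n)$, which is typically approximated by resampling from $\hat{P}_n$. The sketch below does this for a studentized mean root with $\hat{P}_n$ the empirical distribution; the root, data-generating distribution, and number of bootstrap draws are illustrative choices, not taken from the paper.

```python
import numpy as np

def bootstrap_root_quantile(x, alpha, rng, n_boot=2000):
    """Approximate J_n^{-1}(1 - alpha, hat_P_n) for the studentized mean root
    R_n = sqrt(n) * (mean - mu(P)) / S_n, with hat_P_n the empirical distribution."""
    n = len(x)
    xbar = x.mean()
    vals = np.empty(n_boot)
    for i in range(n_boot):
        xs = rng.choice(x, size=n, replace=True)          # a sample from hat_P_n
        vals[i] = np.sqrt(n) * (xs.mean() - xbar) / xs.std(ddof=1)
    return np.quantile(vals, 1 - alpha)

rng = np.random.default_rng(2)
x = rng.gamma(shape=2.0, scale=1.0, size=500)
n = len(x)
c = bootstrap_root_quantile(x, alpha=0.05, rng=rng)
# The event R_n <= c inverts to a one-sided confidence bound for mu(P):
print("lower confidence bound for mu(P):", x.mean() - c * x.std(ddof=1) / np.sqrt(n))
```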

3 Applications

3.1 Subsampling

Example 3.1. (Multivariate Nonparametric Mean) Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}^k$. Suppose one wishes to construct a rectangular confidence region for $\mu(P)$. For this purpose, a natural choice of root is
$$R_n(X^{(n)}, P) = \max_{1 \le j \le k} \frac{\sqrt{n}(\bar{X}_{j,n} - \mu_j(P))}{S_{j,n}}. \qquad (16)$$
We have the following theorem:

Theorem 3.1. Let $X^{(n)} = (X_1, \ldots, X_n)$ be an i.i.d. sequence of random variables with distribution $P \in \mathbf{P}$ on $\mathbf{R}^k$. Denote by $\mathbf{P}_j$ the set of distributions formed from the $j$th marginal distributions of the distributions in $\mathbf{P}$. Suppose $\mathbf{P}$ is such that for all $1 \le j \le k$
$$\lim_{\lambda\to\infty} \sup_{P \in \tilde{\mathbf{P}}} E_P\left[\left(\frac{X - \mu(P)}{\sigma(P)}\right)^2 I\left\{\frac{|X - \mu(P)|}{\sigma(P)} > \lambda\right\}\right] = 0 \qquad (17)$$
for $\tilde{\mathbf{P}} = \mathbf{P}_j$. Let $J_n(x, P)$ be the distribution of the root (16). Let $b = b_n < n$ be a sequence of positive integers tending to infinity, but satisfying $b/n \to 0$, and define $L_n(x, P)$, the subsampling estimate of $J_n(x, P)$, by (6). Then, for any $\alpha \in (0,1)$, the following statements are true:

(i) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{L_n^{-1}(\alpha, P) \le \max_{1 \le j \le k} \frac{\sqrt{n}(\bar{X}_{j,n} - \mu_j(P))}{S_{j,n}}\right\} = 1-\alpha.$

(ii) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{\max_{1 \le j \le k} \frac{\sqrt{n}(\bar{X}_{j,n} - \mu_j(P))}{S_{j,n}} \le L_n^{-1}(1-\alpha, P)\right\} = 1-\alpha.$

(iii) $\lim_{n\to\infty} \inf_{P\in\mathbf{P}} P\left\{L_n^{-1}(\alpha, P) \le \max_{1 \le j \le k} \frac{\sqrt{n}(\bar{X}_{j,n} - \mu_j(P))}{S_{j,n}} \le L_n^{-1}(1-\alpha, P)\right\} = 1-2\alpha.$
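A minimal sketch of how the quantile $L_n^{-1}(1-\alpha, P)$ in Theorem 3.1 might be computed in practice, using the feasible plug-in $L_n(x, \bar{X}_n)$ discussed in the text that follows (the unknown $\mu(P)$ in (16) is replaced by the full-sample mean on each subsample) and a random selection of subsets; the data-generating distribution and tuning constants are illustrative assumptions.

```python
import numpy as np

def maxt_subsampling_quantile(X, b, alpha, rng, n_subsets=2000):
    """1 - alpha quantile of the subsampling distribution of the max-t root in (16),
    with mu(P) replaced by the full-sample mean (the feasible plug-in L_n(x, Xbar_n))."""
    n, k = X.shape
    xbar_n = X.mean(axis=0)
    vals = np.empty(n_subsets)
    for i in range(n_subsets):
        Xb = X[rng.choice(n, size=b, replace=False)]
        t = np.sqrt(b) * (Xb.mean(axis=0) - xbar_n) / Xb.std(axis=0, ddof=1)
        vals[i] = t.max()
    return np.quantile(vals, 1 - alpha)

rng = np.random.default_rng(3)
n, k, b, alpha = 1000, 5, 50, 0.05
X = rng.standard_t(df=8, size=(n, k))
d = maxt_subsampling_quantile(X, b, alpha, rng)
# Inverting max_j sqrt(n)(Xbar_j - mu_j)/S_j <= d gives a rectangular one-sided region:
lower = X.mean(axis=0) - d * X.std(axis=0, ddof=1) / np.sqrt(n)
print("joint lower confidence bounds for mu(P):", np.round(lower, 3))
```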

10 Note in Theorem 3.1 that L n (x, P ), only depends on P through µ(p ), i.e., L n (x, P ) = L n (x, µ(p )). Therefore, as described in Remark 2.2, it follows from part (i) of Theorem 3.1 that } n( C n = µ R k : L 1 Xj,n µ j ) n (α, µ) max 1 j k satisfies S j,n lim inf P µ(p ) C n} = 1 α. (18) Similar conclusions follow from parts (ii) and (iii) of Theorem 3.1. Remark 2.4, with an additional argument it is possible to show that } C n = µ R k : L 1 n (α, X n( Xj,n µ j ) n ) max 1 j k S j,n also satisfies (18) with C n in place of C n. To see this, first note that Moreover, as described in max 1 j k b( Xj,b X j,n ) S j,b max 1 j k b( Xj,b µ j (P )) S j,b + max 1 j k b(µj (P ) X j,n ) S j,b. Similarly, max 1 j k b( Xj,b µ j (P )) S j,b max 1 j k b( Xj,b X j,n ) S j,b + max 1 j k b(µj (P ) X j,n ) S j,b. Consider any sequence P n P : n 1}. Since max 1 j k b(µj (P n ) X j,n ) S j,b = o Pn (1), we have that L 1 n (α, X n ) = L 1 n (α, P n ) + o Pn (1). It now follows from an additional subsequencing argument and part (i) of Theorem 3.1 that } lim inf P L 1 n (α, X n( Xj,n µ j (P )) n ) max = 1 α, 1 j k from which the desired conclusion follows. Similar statements follow from parts (ii) and (iii) of Theorem 3.1. Remark 3.1. Theorem 3.1 generalizes to the case where the root is given by S j,n R n (X (n), P ) = f(z n (P ), Ω( ˆP n )), where Z n (P ) = ( n( X1,n µ 1 (P )) S 1,n,..., ) n( Xk,n µ k (P )) S k,n, (19) 10

11 f is a real-valued function that is continuous in both of its arguments, and f(z, Ω) is continuously distributed whenever Z N(0, Ω) for some Ω satisfying Ω j,j = 1 for all 1 j k. Under these conditions, sup J n (x, P n ) P f(z, Ω) x} 0 (20) whenever Z n (P n ) d Z under P n and Ω( ˆP n ) Pn Ω, where Z N(0, Ω). As a result, the proof given in Section 4.4 can be adapted appropriately to accommodate this situation. The result can be further generalized to allow the distribution of f(z, Ω) to have discontinuities. In this case, we require that at any point of discontinuity P n f(z n (P n ), Ω( ˆP n )) x} P f(z, Ω) x} P n f(z n (P n ), Ω( ˆP n )) < x} P f(z, Ω) < x} whenever Z n (P n ) d Z under P n and Ω( ˆP n ) Pn Ω, where Z N(0, Ω). When this is true, it follows from Lemma 3 on p. 260 of Chow and Teicher (1978), a generalization of Polya s Theorem, that (20) holds, from which the desired result can again be established by modifying the proof in Section 4.4 appropriately. Example 3.2. (Constrained Univariate Nonparametric Mean) Andrews (2000) considers the following example. Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R. Suppose it is known that µ(p ) 0 for all P P and one wishes to construct a confidence interval for µ(p ). A natural choice of root in this case is R n = R n (X (n), P ) = n(max X n, 0} µ(p )). This root differs from the one considered in Theorem 3.1 and the ones discussed in Remark 3.1 in the sense that under weak assumptions on P holds, but lim sup lim sup sup supj b (x, P ) J n (x, P )} 0 (21) sup supj n (x, P ) J b (x, P )} 0 (22) fails to hold. To see this, suppose P satisfies the uniform integrability requirement (17) with P = P. Note that J b (x, P ) = P maxz b (P ), bµ(p )} x} J n (x, P ) = P maxz n (P ), nµ(p )} x}, 11

12 where Z b (P ) = b( X b µ(p )) and Z n (P ) = n( X n µ(p )). Since bµ(p ) nµ(p ) for any P P, J b (x, P ) J n (x, P ) is bounded from above by P maxz b (P ), nµ(p )} x} J n (x, P ). It now follows from the uniform central limit theorem established by Lemma of Romano and Shaikh (2008) and Theorem 2.11 of Bhattacharya and Rao (1976) that (21) holds. A similar argument shows that (22) fails to hold. Thus, under these assumptions, subsampling can be used to construct uniformly valid upper critical values for R n, but not uniformly valid lower critical values for R n. Example 3.3. (Moment Inequalities) The generality of Theorem 2.1 illustrated in Example 3.2 is also useful when testing multisided hypotheses about the mean. To see this, let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R k. Define P 0 = P P : µ(p ) 0} and P 1 = P\P 0. Consider testing the null hypothesis that P P 0 versus the alternative hypothesis that P P 1 at level α (0, 1). Such hypothesis testing problems have recently received considerable attention in the moment inequality literature in econometrics. See, for example, Andrews and Soares (2010), Andrews and Guggenberger (2010), Andrews and Jia (2009), Bugni (2010), Canay (2010), Romano and Shaikh (2008), and Romano and Shaikh (2010). Theorem 2.1 may be used to construct tests φ n (X (n) ) that are uniformly consistent in level in the sense that (5) holds under weak assumptions on P. Specifically, consider the test φ n (X (n) ) = IT n (X (n) ) > L 1 n (1 α)}, where n T n (X (n) Xj,n ) = max 1 j k S j,n and L n (x, P ) is given by (6) with R n (X (n), P ) = T n (X (n) ). If P is defined as in Theorem 3.1 and b = b n < n is a sequence of positive integers tending to infinity, but satisfying b/n 0, then it is possible to show that (21) holds. The desired conclusion therefore follows from Theorem 2.1. The required argument is essentially the same as the one presented in Romano and Shaikh (2008) for T n (X (n) ) = max n X j,n, 0} 2, 1 j k though Lemma 4.5 in the Appendix is also needed for establishing (21) in the case considered here because of Studentization. Related results are obtained by Andrews and Guggenberger (2009). Further applications of Theorem 2.1 can be found in Romano and Shaikh (2010). 12
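As a companion to Example 3.3, the following sketch implements the subsampling test of $H_0: \mu(P) \le 0$ based on $T_n(X^{(n)}) = \max_j \sqrt{n}\,\bar{X}_{j,n}/S_{j,n}$; since the statistic does not depend on $P$ (Remark 2.5), the critical value is simply the $1-\alpha$ quantile of the same statistic computed on subsamples of size $b$. The random selection of subsets and the simulated data are illustrative assumptions, not part of the example.

```python
import numpy as np

def moment_inequality_subsampling_test(X, b, alpha, rng, n_subsets=2000):
    """Subsampling test of H_0: mu(P) <= 0 (componentwise) from Example 3.3.

    Statistic: T_n = max_j sqrt(n) * Xbar_j / S_j.  Critical value: the 1 - alpha
    quantile of T_b over (randomly chosen) subsamples of size b."""
    n, k = X.shape

    def t_stat(Z):
        m = Z.shape[0]
        return (np.sqrt(m) * Z.mean(axis=0) / Z.std(axis=0, ddof=1)).max()

    t_n = t_stat(X)
    vals = np.empty(n_subsets)
    for i in range(n_subsets):
        vals[i] = t_stat(X[rng.choice(n, size=b, replace=False)])
    crit = np.quantile(vals, 1 - alpha)
    return t_n > crit, t_n, crit

rng = np.random.default_rng(4)
X = rng.normal(loc=[-0.2, 0.0, -0.1], scale=1.0, size=(800, 3))  # a P in P_0
reject, t_n, crit = moment_inequality_subsampling_test(X, b=40, alpha=0.05, rng=rng)
print(f"T_n = {t_n:.3f}, critical value = {crit:.3f}, reject = {reject}")
```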

13 Example 3.4. (Multiple Testing) We now illustrate the use of Theorem 2.1 to construct tests of multiple hypotheses that behave well uniformly over a large class of distributions. Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R k and consider testing the family of null hypotheses H j : µ j (P ) 0 for 1 j k (23) versus the alternative hypotheses H j : µ j (P ) > 0 for 1 j k. (24) in a way that controls the familywise error rate at level α (0, 1) in the sense that lim sup sup F W ER P α, (25) where F W ER P = P reject some H j with µ j (P ) 0}. For K 1,..., k}, define L n (x, K) according to the righthand-side of (6) with R n (X (n), P ) = max j K n Xj,n S j,n and consider the following stepwise multiple testing procedure: Algorithm 3.1. Step 1: Set K 1 = 1,..., k}. If n Xj,n max j K 1 S j,n then stop. Otherwise, reject any H j with L 1 n (1 α, K 1 ), n Xj,n S j,n > L 1 n (1 α, K 1 ) and continue to Step 2 with K 2 = n Xj,n j K 1 : S j,n } L 1 n (1 α, K 1 ).. 13

14 Step s: If n Xj,n max j K s S j,n then stop. Otherwise, reject any H j with L 1 n (1 α, K s ), n Xj,n S j,n > L 1 n (1 α, K s ) and continue to Step s + 1 with n Xj,n K s+1 = j K s : S j,n } L 1 n (1 α, K s ).. It follows from the analysis in Romano and Wolf (2005) that Algorithm 3.1 satisfies } n Xj,n F W ER P P > L 1 n (1 α, I(P )), (26) max j I(P ) S j,n where I(P ) = 1 j k : µ j (P ) 0}. (27) If P is defined as in Theorem 3.1, then the supremum over P of the righthand-side of (26) is asymptotically bounded from above by α from Example 3.3. It is, of course, possible to extend the analysis in a straightforward way to two-sided testing. See also Romano and Shaikh (2010) for applications of Theorem 2.1 to testing an uncountably infinite number of null hypotheses in a way that satisfies (25). Example 3.5. (Empirical Process on R) Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R. Suppose one wishes to construct a confidence region for the cumulative distribution function associated with P, i.e., P (, t]}. For this purpose a natural choice of root is sup n ˆPn (, t]} P (, t]}, (28) where ˆP n is the empirical distribution of X (n). We have the following theorem. Theorem 3.2. Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P, where P = P on R : ɛ < P (, t]} < 1 ɛ for some t R}. (29) Let J n (x, P ) be the distribution of the root (28). Then, for any α (0, 1), the following statements are true: 14

15 (i) (ii) (iii) lim inf P lim inf P lim inf P L 1 n sup L 1 n } (α, P ) sup n ˆPn (, t]} P (, t]} = 1 α. } n ˆPn (, t]} P (, t]} L 1 n (1 α, P ) = 1 α. } (α, P ) sup n ˆPn (, t]} P (, t]} L 1 n (1 α, P ) = 1 2α. We have immediately from part (i) of Theorem 3.2 that } C n = P : L 1 n (α, P ) sup n ˆPn (, t]} P (, t]} satisfies lim inf P P C n} = 1 α. (30) Similar conclusions follow from parts (ii) and (iii) of Theorem 3.2. Yet this construction is not very useful in practice since it involves searching through all possible distributions on R. On the other hand, as described in Remark 2.4, with an additional argument it is possible to show that } C n = P : L 1 n (α, ˆP n ) sup n ˆPn (, t]} P (, t]} also satisfies (30) with C n in place of C n. To see this, first note that sup Similarly, b ˆPb (, t]} ˆP n (, t]} sup b ˆPb (, t]} P (, t]} + sup b P (, t]} ˆPn (, t]}. sup b ˆPb (, t]} P (, t]} sup b ˆPb (, t]} ˆP n (, t]} Consider any sequence P n P : n 1}. Since + sup b ˆPn (, t]} P (, t]}. sup b Pn (, t]} ˆP n (, t]} = o Pn (1), we have that L 1 n (α, ˆP n ) = L 1 n (α, P n ) + o Pn (1). 15

16 It now follows from an additional subsequencing argument and part (i) of Theorem 3.2 that } lim inf P L 1 n (α, ˆP n ) sup n ˆPn (, t]} P (, t]} = 1 α, from which the desired conclusion follows. Similar statements follow from parts (ii) and (iii) of Theorem 3.2. Example 3.6. (One Sample U-Statistics) Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R. Suppose one wishes to construct a confidence region for θ(p ) = θ h (P ) = E P [h(x 1,..., X m )], (31) where h is a symmetric kernel of degree m. The usual estimate of θ(p ) in this case is given by the U-statistic ˆθ n = ˆθ n (X (n) ) = ( 1 n ) m c h(x i1,..., X im ). Here, c denotes summation over all ( n m) subsets i1,..., i m } of 1,..., n}. A natural choice of root is therefore given by We have the following theorem: R n (X (n), P ) = n(ˆθ n θ(p )). (32) Theorem 3.3. Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R and let h be a symmetric kernel of degree m. Define g(x, P ) = g h (x, P ) = E P [h(x, X 2,..., X m )] θ(p ), (33) and Suppose P is such that σ 2 (P ) = σ 2 h (P ) = m2 Var P [g(x i, P )]. (34) [ g 2 }] lim sup (X i, P ) g(x i, P ) E P λ σ 2 I (P ) σ(p ) > λ = 0 (35) and Var P [h(x 1,..., X m )] sup σ 2 <. (36) (P ) Let J n (x, P ) be the distribution of the root (32). Let b = b n < n be a sequence of positive integers tending to infinity, but satisfying b/n 0 and define L n (x, P ), the subsampling estimate of J n (x, P ), by (6). Then, for any α (0, 1), the following statements are true: 16

17 (i) (ii) (iii) lim inf P lim inf P L 1 n (α, P ) } n(ˆθ n θ(p )) = 1 α. } lim inf P n(ˆθn θ(p )) L 1 n (1 α, P ) = 1 α. L 1 n (α, P ) } n(ˆθ n θ(p )) L 1 n (1 α, P ) = 1 2α. Note in Theorem 3.3 that L n (x, P ), only depends on P through θ(p ), i.e., L n (x, P ) = L n (x, θ(p )). Therefore, as described in Remark 2.2, it follows from part (i) of Theorem 3.3 that satisfies C n = θ R : L 1 n (α, θ) n(ˆθ n θ)} lim inf P θ(p ) C n} = 1 α. (37) Similar conclusions follow from parts (ii) and (iii) of Theorem 3.3. Remark 2.4, with an additional argument it is possible to show that C n = θ R : L 1 n (α, ˆθ n ) n(ˆθ n θ)} also satisfies (37) with C n in place of C n. To see this, first note that Moreover, as described in b(ˆθb ˆθ n ) = b(ˆθ b θ(p )) + b(θ(p ) ˆθ n ). Consider any sequence P n P : n 1}. Since b(θ(pn ) ˆθ n ) = o Pn (1), we have that L 1 n (α, ˆθ n ) = L 1 n (α, P n ) + o Pn (1). It now follows from an additional subsequencing argument and part (i) of Theorem 3.3 that lim inf P L 1 n (α, ˆθ n ) } n(ˆθ n θ(p )) = 1 α, from which the desired conclusion follows. Similar statements follow from parts (ii) and (iii) of Theorem

18 3.2 Bootstrap Example 3.7. (Multivariate Nonparametric Mean) Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R k. Suppose one wishes to construct a rectangular confidence region for µ(p ). As described in Example 3.1, a natural choice of root in this case is given by (16). We have the following theorem: Theorem 3.4. Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P, where P is defined as in Theorem 3.1. Let J n (x, P ) be the distribution of the root R n defined by (16). Denote by ˆP n the empirical distribution of X (n). Then, for any α (0, 1), the following statements are true: (i) (ii) } lim inf P J 1 n (α, ˆP n( Xj,n µ j (P )) n ) max = 1 α. 1 j k S j,n } n( lim inf P Xj,n µ j (P )) max J n 1 (1 α, 1 j k S ˆP n ) = 1 α. j,n (iii) lim inf P J n 1 (α, ˆP n( Xj,n µ j (P )) n ) max 1 j k S j,n } Jn 1 (1 α, ˆP n ) = 1 2α. Theorem 3.4 generalizes in the same way that Theorem 3.1 generalizes as described in Remark 3.1. In this case, the proof given in Section 4.7 can be adapted simply by modifying Lemma 4.7 appropriately. Example 3.8. (Moment Inequalities) Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R k and define P 0 and P 1 as in Example 3.3. Andrews and Jia (2009) propose testing the null hypothesis that P P 0 versus the alternative hypothesis that P P 1 at level α (0, 1) using an adjusted quasi-likelihood ratio statistic T n (X (n) ) defined as follows: Here, T n (X (n) ) = inf W n (t) Ω 1 k n W n (t). :t 0 Ω n = maxɛ det(ω( ˆP n )), 0}diag(Ω( ˆP n )), where ɛ > 0 and W n (t) = ( n( X1,n t) S 1,n,..., ) n( Xk,n t) S k,n. 18

19 Andrews and Jia (2009) propose a procedure for constructing critical values for T n (X (n) ) that they term refined moment selection. construction. To this end, define R n (X (n), P ) = For illustrative purposes, we instead consider a simpler inf (Z n (P ) t) Ω 1 k n (Z n (P ) t), (38) :t 0 where Z n (P ) is defined as in (19). Note that (38) is a function of Z n (P ) and Ω( ˆP n ) only, i.e., R n (X (n), P ) = f(z n (P ), Ω( ˆP n )). Furthermore, whenever Z N(0, Ω) for some Ω satisfying Ω j,j = 1 for all 1 j k, f(z, Ω) is continuously distributed except for a point mass at zero. But for any sequence P n P : n 1} such that Z n (P n ) d Z under P n and Ω( ˆP n ) Pn Ω, where Z N(0, Ω), we see that P n f(z n (P n ), Ω( ˆP n )) 0} P f(z, Ω) 0} P n f(z n (P n ), Ω( ˆP n )) < 0} P f(z, Ω) < 0}. It therefore follows from the generalization of Theorem 3.4 discussed in Example 3.7 that the test φ n (X (n) ) = IT n (X (n) ) > J 1 n (1 α, ˆP n )}, where J n (x, P ) is the distribution of (38), satisfies (5) whenever P is defined as in Theorem 3.4. Example 3.9. (Multiple Testing) Theorem 2.2 may be used in the same way that Theorem 2.1 was used in Example 3.4 to construct tests of multiple hypotheses that behave well uniformly over a large class of distributions. To see this, let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R k and again consider testing the family of null hypotheses (23) versus the alternative hypotheses (24) in a way that satisfies (25). For K 1,..., k}, let J n (x, K, P ) be the distribution of the root R n (X (n), P ) = max j K n( Xj,n µ j (P )) S j,n under P and consider the following stepwise multiple testing procedure: Algorithm 3.2. Step 1: Set K 1 = 1,..., k}. If max j K 1 n Xj,n S j,n J 1 n (1 α, K 1, ˆP n ), then stop. Otherwise, reject any H j with n Xj,n > Jn 1 (1 α, K 1, ˆP n ) S j,n 19

20 and continue to Step 2 with n Xj,n K 2 = j K 1 : S j,n } Jn 1 (1 α, K 1, ˆP n ).. Step s: If n Xj,n max j K s S j,n J 1 n (1 α, K s, ˆP n ), then stop. Otherwise, reject any H j with n Xj,n > Jn 1 (1 α, K s, ˆP n ) S j,n and continue to Step s + 1 with n Xj,n K s+1 = j K s : S j,n } Jn 1 (1 α, K s, ˆP n ).. It follows from the analysis in Romano and Wolf (2005) that Algorithm 3.1 satisfies } n Xj,n F W ER P P > Jn 1 (1 α, I(P ), ˆP n ), (39) max j I(P ) S j,n where I(P ) is defined as in (27). The righthand-side of (39) can be further bounded from above by } n( Xj,n µ j (P )) P > Jn 1 (1 α, I(P ), ˆP n ). max j I(P ) S j,n The desired conclusion therefore follows from Theorem 3.4 if P is defined as in Theorem 3.1. It is, of course, possible to extend the analysis in a straightforward way to two-sided testing. Example (Empirical Process on R) Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R. Suppose one wishes to construct a confidence region for the cumulative distribution function associated with P, i.e., P (, t]}. As described in Example 3.5, a natural choice of root in this case is given by (28). We have the following theorem: Theorem 3.5. Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P, where P is defined as in (29). Let J n (x, P ) be the distribution of the root (28). Denote by ˆP n the empirical distribution of X (n). Then, for any α (0, 1), the following statements are true: (i) lim inf P J 1 n (α, ˆP n ) sup } n ˆPn (, t]} P (, t]} = 1 α. 20

21 (ii) lim inf P sup } n ˆPn (, t]} P (, t]} Jn 1 (1 α, ˆP n ) = 1 α. (iii) lim inf P J 1 n (α, ˆP n ) sup } n ˆPn (, t]} P (, t]} Jn 1 (1 α, ˆP n ) = 1 2α. Some of the conclusions of Theorem 3.5 can be found in Romano (1989), though the method of proof given in the Appendix is quite different. Example (One Sample U-Statistics) Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R and let h be a symmetric kernel of degree m. Suppose one wishes to construct a confidence region for θ(p ) = θ h (P ) given by (31). As described in Example 3.6, a natural choice of root in this case is given by (32). Before proceeding, it is useful to introduce the following notation. For an arbitrary kernel h, ɛ > 0 and B > 0, denote by P h,ɛ,b the set of all distributions P on R such that E P [ h(x 1,..., X m ) θ h(p ) ɛ ] B. (40) Similarly, for an arbitrary kernel h and δ > 0, denote by S h,δ the set of all distributions P on R such that σ 2 h(p ) δ, (41) where σ 2 h(p ) is defined as in (34). Finally, for an arbitrary kernel h, ɛ > 0 and B > 0, let P h,ɛ,b be the set of distributions P on R such that E P [ h(x i1,..., X im )] θ h(p ) ɛ ] B whenever 1 i j n for all 1 j m. Using this notation, we have the following theorem: Theorem 3.6. Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P on R and let h be a symmetric kernel of degree m. Define the kernel h of degree 2m according to the rule h (x 1,..., x 2m ) = h(x 1,..., x m )h(x 1, x m+2,..., x 2m ) h(x 1,..., x m )h(x m+1,..., x 2m ). (42) Suppose P P h,2+δ,b S h,δ P h,1+δ,b P h,2+δ,b for some δ > 0 and B > 0. Let J n (x, P ) be the distribution of the root R n defined by (32). Denote by ˆP n the empirical distribution of X (n). Then, for any α (0, 1), the following statements are true: 21

22 (i) (ii) (iii) lim inf P lim inf P J n 1 (α, ˆP n ) } n(ˆθ n θ(p )) = 1 α. lim inf P n(ˆθn θ(p )) J n 1 (1 α, ˆP } n ) = 1 α. Jn 1 (α, ˆP n ) n(ˆθ n θ(p )) Jn 1 (1 α, ˆP } n ) = 1 2α. Note that the conditions on P in Theorem 3.6 are stronger than the conditions on P in Theorem 3.3. While it may be possible to weaken the restrictions on P in Theorem 3.6 some, it is not possible to establish the conclusions of Theorem 3.6 under the conditions on P in Theorem 3.3. Indeed, as shown by Bickel and Freedman (1981), the bootstrap need not be even pointwise asymptotically valid under the conditions on P in Theorem Appendix 4.1 Auxiliary Results Lemma 4.1. Let F and G be (nonrandom) distribution functions on R. Then the following are true: (i) If sup G(x) F (x)} ɛ, then G 1 (1 α) F 1 (1 (α + ɛ)). (ii) If sup F (x) G(x)} ɛ, then G 1 (α) F 1 (α + ɛ). Furthermore, if X F, it follows that: (iii) If sup G(x) F (x)} ɛ, then P X G 1 (1 α)} 1 (α + ɛ). (iv) If sup F (x) G(x)} ɛ, then P X G 1 (α)} 1 (α + ɛ). 22

23 (v) If sup G(x) F (x) ɛ, then P G 1 (α) X G 1 (1 α)} 1 2(α + ɛ). If Ĝ is a random distribution function on R, then we have further that: (vi) If P sup Ĝ(x) F (x)} ɛ} 1 δ, then P X Ĝ 1 (1 α)} 1 (α + ɛ + δ). (vii) If P sup F (x) Ĝ(x)} ɛ} 1 δ, then P X Ĝ 1 (α)} 1 (α + ɛ + δ). (viii) If P sup Ĝ(x) F (x) ɛ} 1 δ, then P Ĝ 1 (α) X Ĝ 1 (1 α)} 1 2(α + ɛ + δ). Proof: To see (i), first note that sup G(x) F (x)} ɛ implies that G(x) ɛ F (x) for all x R. Thus, x R : G(x) 1 α} = x R : G(x) ɛ 1 α ɛ} x R : F (x) 1 α ɛ}, from which it follows that F 1 (1 (α + ɛ)) = infx R : F (x) 1 α ɛ} infx R : G(x) 1 α} = G 1 (1 α}. Similarly, to prove (ii), first note that sup F (x) G(x)} ɛ implies that F (x) ɛ G(x) for all x R, so x R : F (x) α + ɛ} = x R : F (x) ɛ α} x R : G(x) α}. Therefore, G 1 (α) = infx R : G(x) α} infx R : F (x) α+ɛ} = F 1 (α+ɛ). To prove (iii), note that because sup G(x) F (x)} ɛ, it follows from (i) that X G 1 (1 α)} X F 1 (1 (α+ɛ))}. Hence, P X G 1 (1 α)} P X F 1 (1 (α+ɛ))} 1 (α+ɛ). Using the same reasoning, (iv) follows from (ii) and the assumption that sup F (x) G(x)} ɛ. To see (v), note that P G 1 (α) X G 1 (1 α)} 1 P X < G 1 (α)} P X > G 1 (1 α)} = P X G 1 (α)} + P X G 1 (1 α)} 1 1 2(α + ɛ), where the first inequality follows from the Bonferroni inequality and the second inequality follows from (iii) and (iv). 23

24 To prove (vi), note that P X Ĝ 1 (1 α)} P X Ĝ 1 (1 α) supĝ(x) F (x)} ɛ} P X F 1 (1 (α + ɛ)) supĝ(x) F (x)} ɛ} = P X F 1 (1 (α + ɛ))} P X F 1 (1 (α + ɛ)) supĝ(x) F (x)} > ɛ} P X F 1 (1 (α + ɛ))} P supĝ(x) F (x)} > ɛ} = 1 α ɛ δ, where the second inequality follows from (i). A similar argument using (ii) establishes (vii). Finally, (viii) follows from (vi) and (vii) by an argument analogous to the one used to establish (v). 4.2 Proof of Theorem 2.1 Lemma 4.2. Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P. Denote by J n (x, P ) the distribution of a real-valued root R n = R n (X (n), P ) under P. Let N n = ( n b), kn = n b and define L n(x, P ) according to (6). Then, for any ɛ > 0, we have that P sup L n (x, P ) J b (x, P ) > ɛ} 1 2π. (43) ɛ k n We also have that for any 0 < δ < 1 P sup L n (x, P ) J b (x, P ) > ɛ} δ ɛ + 2 ɛ exp 2k nδ 2 }. (44) Proof: Let ɛ > 0 be given and define S n (x, P ; X 1,..., X n ) by 1 k n 1 i k n IR b ((X b(i 1)+1,..., X bi ), P ) x} J b (x, P ). Denote by S n the symmetric group with n elements. Note that using this notation, we may rewrite as 1 N n 1 i N n IR b (X n,(b),i, P ) x} J b (x, P ) Z n (x, P ; X 1,..., X n ) = 1 S n (x, P ; X n! π(1),..., X π(n) ). π S n 24

25 Note further that sup Z n (x, P ; X 1,..., X n ) 1 n! π S n sup S n (x, P ; X π(1),..., X π(n) ), which is a sum of n! identically distributed random variables. Let ɛ > 0 be given. It follows that P sup Z n (x, P ; X 1,..., X n ) > ɛ} is bounded above by P 1 n! π S n sup Using Markov s inequality, (45) can be bounded by = 1 ɛ S n (x, P ; X π(1),..., X π(n) ) > ɛ}. (45) 1 ɛ Esup S n (x, P ; X 1,..., X n ) } (46) 1 0 P sup S n (x, P ; X 1,..., X n ) > u}du. (47) We may then use the Dvoretsky-Kiefer-Wolfowitz inequality to bound (47) by which, when evaluated, yields 1 ɛ exp 2k n u 2 }du, 2 2π [Φ(2 k n ) 1 ɛ k n 2 ] < 1 2π. ɛ k n To establish (44), note that for any 0 < δ < 1, we have that Esup S n (x, P ; X 1,..., X n ) } δ + P sup S n (x, P ; X 1,..., X n ) > δ}. (48) The result (44) now follows immediately by using the Dvoretsky-Kiefer-Wolfowitz inequality to bound the second term on the right hand side in (48). Lemma 4.3. Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P. Denote by J n (x, P ) the distribution of a real-valued root R n = R n (X (n), P ) under P. Let k n = n b and define L n(x, P ) according to (6). Then, for all ɛ > 0 and γ (0, 1), we have the following: (i) P sup L n (x, P ) J n (x, P )} > ɛ} 1 2π + IsupJ b (x, P ) J n (x, P )} > (1 γ)ɛ}. γɛ k n (ii) P sup J n (x, P ) L n (x, P )} > ɛ} 1 2π + IsupJ n (x, P ) J b (x, P )} > (1 γ)ɛ}. γɛ k n 25

26 (iii) P sup L n (x, P ) J n (x, P )} > ɛ} 1 2π + Isup J b (x, P ) J n (x, P ) > (1 γ)ɛ}. γɛ k n Proof: Let ɛ > 0 and γ (0, 1) be given. Note that P supl n (x, P ) J n (x, P )} > ɛ} P supl n (x, P ) J b (x, P )} + supj b (x, P ) J n (x, P )} > ɛ} P supl n (x, P ) J b (x, P )} > γɛ} + IsupJ b (x, P ) J n (x, P )} > (1 γ)ɛ} 1 2π + IsupJ b (x, P ) J n (x, P )} > (1 γ)ɛ}, γɛ k n where the final inequality follows from Lemma 4.2. A similar argument establishes (ii) and (iii). Lemma 4.4. Let X (n) = (X 1,..., X n ) be an i.i.d. sequence of random variables with distribution P P. Denote by J n (x, P ) the distribution of a real-valued root R n = R n (X (n), P ) under P. Let k n = n b and define L n(x, P ) according to (6). Let δ 1,n (ɛ, γ, P ) = 1 2π + IsupJ b (x, P ) J n (x, P )} > (1 γ)ɛ} γɛ k n δ 2,n (ɛ, γ, P ) = 1 2π + IsupJ n (x, P ) J b (x, P )} > (1 γ)ɛ} γɛ k n δ 3,n (ɛ, γ, P ) = 1 2π + Isup J b (x, P ) J n (x, P ) > (1 γ)ɛ}. γɛ k n Then, for any ɛ > 0 and γ (0, 1), we have that (i) (ii) (iii) P R n L 1 n (1 α)} 1 (α + ɛ + δ 1,n (ɛ, γ, P )). P R n L 1 n (α)} 1 (α + ɛ + δ 2,n (ɛ, γ, P )). P L 1 n ( α 2 ) R n L 1 n (1 α 2 )} 1 (α + ɛ + δ 3,n(ɛ, γ, P )). Proof: Let ɛ > 0 and γ (0, 1) be given. By part (i) of Lemma 4.3, we have that P supl n (x, P ) J n (x, P )} > ɛ} δ 1,n (ɛ, γ, P ). 26

27 Assertion (i) follows by applying part (vi) of Lemma 4.1. Assertions (ii) and (iii) are established similarly. Proof of Theorem 2.1: To prove (i), note that by part (i) of Lemma 4.4, we have for any ɛ > 0 and γ (0, 1) that where sup By the assumption on P R n L 1 n (1 α)} 1 (α + ɛ + inf δ 1,n(ɛ, γ, P )), δ 1,n (ɛ, γ, P ) = 1 2π + IsupJ b (x, P ) J n (x, P )} > (1 γ)ɛ}. γɛ k n sup supj b (x, P ) J n (x, P )}, we have that for every ɛ > 0, inf δ 1,n (ɛ, γ, P ) 0. Thus, there exists a sequence ɛ n > 0 tending to 0 so that inf δ 1,n (ɛ n, γ, P ) 0. The desired claim now follows from applying part (i) of Lemma 4.4 to this sequence. Assertions (ii) and (iii) follow in exactly the same way. 4.3 Proof of Theorem 2.2 We prove only (i). Similar arguments can be used to establish (ii) and (iii). Let η > 0 and α (0, 1) be given. Choose δ > 0 so that sup J n (x, P ) J n (x, P )} < η 2 whenever ρ(p, P ) < δ for P P and P P. For n sufficiently large, we have that For such n, we therefore have that sup P ρ( ˆP n, P ) > δ} < η 4 and sup P ˆP n P } < η 4. 1 η 2 inf P ρ( ˆP n, P ) δ ˆP n P } inf P sup J n (x, ˆP n ) J n (x, P )} η 2 }. It follows from part (vi) of Lemma 5.1 that for such n inf P R n Jn 1 (1 α, ˆP n )} 1 (α + η). Since the choice of η was arbitrary, the desired result follows. 27

28 4.4 Proof of Theorem 3.1 Lemma 4.5. Consider any sequence P n P : n 1}, where P satisfies (17). Let X n,i, i = 1,..., n be an i.i.d. sequence of random variables with distribution P n. Then, Sn 2 P n 1. σ 2 (P n ) Proof: First assume without loss of generality that µ(p n ) = 0 and σ(p n ) = 1 for all n 1. Next, note that S 2 n = 1 n 1 i n X 2 n,i X 2 n. By Lemma of Lehmann and Romano (2005), we have that 1 n 1 i n X 2 n,i P n 1. By Lemma of Lehmann and Romano (2005), we have further that X n P n 0. The desired result therefore follows from the continuous mapping theorem. Proof of Theorem 3.1: We argue that sup sup J b (x, P ) J n (x, P ) 0. (49) Suppose by way of contradiction that (49) fails. It follows that there exists a subsequence n l such that Ω(P nl ) Ω and either or To see that neither (50) or (51) can hold, let sup J nl (x, P nl ) Φ Ω (x,..., x) 0 (50) sup J bnl (x, P nl ) Φ Ω (x,..., x) 0. (51) } n( X1,n µ 1 (P )) n( Xk,n µ k (P )) K n (x, P ) = P x 1,..., x k. (52) σ 1 (P ) σ k (P ) By Polya s Theorem, sup Φ Ω(Pnl )(x) Φ Ω (x) 0. k 28

29 It therefore follows from the uniform central limit theorem established by Lemma of Romano and Shaikh (2008) that K nl (x, P nl ) d Φ Ω (x). From Lemma 4.5, we have for 1 j k that S j,nl P nl 1. σ j (P nl ) Hence, by Slutsky s Theorem, K nl (x, P nl ) d Φ Ω (x). By Polya s Theorem, we therefore have that sup K nl (x, P nl ) Φ Ω (x) 0. k We thus see that (50) can not hold. A similar argument establishes that (51) can not hold. Hence, (49) holds. For any fixed P P, we also have that J n (x, P ) tends in distribution to a continuous limiting distribution. Assertions (i) - (iii) therefore follow from Theorem 2.1 and Remark Proof of Theorem 3.2 We begin with some preliminaries. First note that where B n is the uniform empirical process and J n (x, P ) = P sup B n (P (, t]}) x} = P sup (P ) B n (t) x}, R(P ) = cl(p (, t]} : t R}). (53) By Theorem 3.85 of Aliprantis and Border (2006), the set of all nonempty closed subsets of [0, 1] is a compact metric space with respect to the Hausdorff metric Here, d H (U, V ) = infη > 0 : U V η, V U η }. (54) U η = A η (u), u U where A η (u) is the open ball with center u and radius η. Thus, for any sequence P n P : n 1}, there is a subsequence n l and a closed set R [0, 1] along which d H (R(P nl ), R) 0. (55) Finally, denote by B the standard Brownian bridge process. By the almost sure representation theorem, we may choose B n and B so that sup B n (t) B(t) 0 a.s. (56) 0 t 1 29

30 We now argue that sup sup J b (x, P ) J n (x, P ) 0. (57) Suppose by way of contradiction that (57) fails. It follows that there exists a subsequence n l and a closed subset R [0, 1] such that d H (R(P nl ), R) 0 and either or where sup J nl (x, P nl ) J (x) 0 (58) sup J bnl (x, P nl ) J (x) 0, (59) J (x) = P sup B(t) x}. Moreover, by the definition of P, it must be the case that R contains some point different from zero and one. To see that neither (58) or (59) can hold, note that sup (P nl ) B n (t) sup B(t) sup B n (t) B(t) + sup (P nl ) (P nl ) B(t) sup B(t). (60) By (56), we see that the first term on the righthand-side of (60) tends to zero a.s. By the a.s. uniform continuity of B(t) and (82), we see that the second term on the righthand-side of (60) tends to zero a.s. Thus, sup B n (t) (P nl ) d sup B(t). Since R contains some point different from zero and one, we see from Theorem 11.1 Davydov et al. (1998) that sup B(t) is continuously distributed. By Polya s Theorem, we therefore have that (58) holds. A similar argument establishes that (59) can not hold. Hence, (57) holds. For any fixed P P, we also have that J n (x, P ) tends in distribution to a continuous limiting distribution. Assertions (i) - (iii) now follow from Theorem 2.1 and Remark Proof of Theorem 3.3 Lemma 4.6. Let h be a symmetric kernel of degree m. Denote by J n (x, P ) the distribution of R n (X (n), P ) defined in (32). Suppose P satisfies (35) and (36). Then, lim sup sup J n (x, P ) Φ(x/σ(P )) = 0. x 30

31 Proof: Let P n P : n 1} be given and denote by X n,i, i = 1,..., n an i.i.d. sequence of random variables with distribution P n. Define n (P n ) = n 1/2 (ˆθ n θ(p n )) mn 1/2 1 i n g(x n,i, P n ), (61) where g(x, P ) is defined as in (33). Note that n (P n ) is itself a mean zero, degenerate U-statistic. By Lemma A on p. 183 of Serfling (1980), we therefore see that ( ) n 1 ( m Var Pn [ n (P n )] = n m c 2 c m where the terms ζ c (P n ) are nondecreasing in c and thus Hence, It follows that Since )( n m m c ζ c (P n ) ζ m (P n ) = V ar Pn [h(x 1,..., X m )]. ( ) n 1 Var Pn [ n (P n )] n m Var Pn [ n (P n ) σ(p n ) 2 c m ] ( ) n 1 n m 2 c m ( ) n 1 m n m c=2 ( m c ( m c )( n m m c ) ζ c (P n ), (62) ) Var Pn [h(x 1,..., X m )]. ( )( ) m n m VarPn [h(x 1,..., X m )] c m c σ(p n ) )( n m m c ) 0,. (63) it follows from (36) that the lefthand-side of (63) tends to zero. Therefore, by Chebychev s inequality, we see that n (P n ) σ(p n ) P n 0. Next, note that from Lemma of Lehmann and Romano (2005) we have that mn 1/2 1 i n g(x n,i, P n ) σ(p n ) d Φ(x) under P n. We therefore have further from Slutsky s Theorem that n 1/2 (ˆθ n θ(p n )) σ(p n ) d Φ(x) under P n. An appeal to Polya s Theorem establishes the desired result. Proof of Theorem 3.3: From the triangle inequality and Lemma 4.6, we see immediately that sup sup J b (x, P ) J n (x, P ) 0. For any fixed P P, we also have that J n (x, P ) tends in distribution to a continuous limiting distribution. Assertions (i) - (iii) therefore follow from Theorem 2.1 and Remark


More information

INVALIDITY OF THE BOOTSTRAP AND THE M OUT OF N BOOTSTRAP FOR INTERVAL ENDPOINTS DEFINED BY MOMENT INEQUALITIES. Donald W. K. Andrews and Sukjin Han

INVALIDITY OF THE BOOTSTRAP AND THE M OUT OF N BOOTSTRAP FOR INTERVAL ENDPOINTS DEFINED BY MOMENT INEQUALITIES. Donald W. K. Andrews and Sukjin Han INVALIDITY OF THE BOOTSTRAP AND THE M OUT OF N BOOTSTRAP FOR INTERVAL ENDPOINTS DEFINED BY MOMENT INEQUALITIES By Donald W. K. Andrews and Sukjin Han July 2008 COWLES FOUNDATION DISCUSSION PAPER NO. 1671

More information

Lecture Notes 3 Convergence (Chapter 5)

Lecture Notes 3 Convergence (Chapter 5) Lecture Notes 3 Convergence (Chapter 5) 1 Convergence of Random Variables Let X 1, X 2,... be a sequence of random variables and let X be another random variable. Let F n denote the cdf of X n and let

More information

Supplement to On the Uniform Asymptotic Validity of Subsampling and the Bootstrap

Supplement to On the Uniform Asymptotic Validity of Subsampling and the Bootstrap Supplemet to O the Uiform Asymptotic Validity of Subsamplig ad the Bootstrap Joseph P. Romao Departmets of Ecoomics ad Statistics Staford Uiversity romao@staford.edu Azeem M. Shaikh Departmet of Ecoomics

More information

Chapter 4: Asymptotic Properties of the MLE

Chapter 4: Asymptotic Properties of the MLE Chapter 4: Asymptotic Properties of the MLE Daniel O. Scharfstein 09/19/13 1 / 1 Maximum Likelihood Maximum likelihood is the most powerful tool for estimation. In this part of the course, we will consider

More information

Economics 583: Econometric Theory I A Primer on Asymptotics

Economics 583: Econometric Theory I A Primer on Asymptotics Economics 583: Econometric Theory I A Primer on Asymptotics Eric Zivot January 14, 2013 The two main concepts in asymptotic theory that we will use are Consistency Asymptotic Normality Intuition consistency:

More information

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Ann Inst Stat Math (2009) 61:773 787 DOI 10.1007/s10463-008-0172-6 Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Taisuke Otsu Received: 1 June 2007 / Revised:

More information

THE LIMIT OF FINITE-SAMPLE SIZE AND A PROBLEM WITH SUBSAMPLING. Donald W. K. Andrews and Patrik Guggenberger. March 2007

THE LIMIT OF FINITE-SAMPLE SIZE AND A PROBLEM WITH SUBSAMPLING. Donald W. K. Andrews and Patrik Guggenberger. March 2007 THE LIMIT OF FINITE-SAMPLE SIZE AND A PROBLEM WITH SUBSAMPLING By Donald W. K. Andrews and Patrik Guggenberger March 2007 COWLES FOUNDATION DISCUSSION PAPER NO. 1605 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS

More information

STAT 512 sp 2018 Summary Sheet

STAT 512 sp 2018 Summary Sheet STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}

More information

Probability and Measure

Probability and Measure Probability and Measure Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Convergence of Random Variables 1. Convergence Concepts 1.1. Convergence of Real

More information

EXPLICIT NONPARAMETRIC CONFIDENCE INTERVALS FOR THE VARIANCE WITH GUARANTEED COVERAGE

EXPLICIT NONPARAMETRIC CONFIDENCE INTERVALS FOR THE VARIANCE WITH GUARANTEED COVERAGE EXPLICIT NONPARAMETRIC CONFIDENCE INTERVALS FOR THE VARIANCE WITH GUARANTEED COVERAGE Joseph P. Romano Department of Statistics Stanford University Stanford, California 94305 romano@stat.stanford.edu Michael

More information

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H 0 : 1 n Chapter 9 Hypothesis Testing 9.1 Wald, Rao, and Likelihood Ratio Tests Suppose we wish to test H 0 : θ = θ 0 against H 1 : θ θ 0. The likelihood-based results of Chapter 8 give rise to several possible

More information

Resampling-Based Control of the FDR

Resampling-Based Control of the FDR Resampling-Based Control of the FDR Joseph P. Romano 1 Azeem S. Shaikh 2 and Michael Wolf 3 1 Departments of Economics and Statistics Stanford University 2 Department of Economics University of Chicago

More information

The properties of L p -GMM estimators

The properties of L p -GMM estimators The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion

More information

The Heine-Borel and Arzela-Ascoli Theorems

The Heine-Borel and Arzela-Ascoli Theorems The Heine-Borel and Arzela-Ascoli Theorems David Jekel October 29, 2016 This paper explains two important results about compactness, the Heine- Borel theorem and the Arzela-Ascoli theorem. We prove them

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano, 02LEu1 ttd ~Lt~S Testing Statistical Hypotheses Third Edition With 6 Illustrations ~Springer 2 The Probability Background 28 2.1 Probability and Measure 28 2.2 Integration.........

More information

1 Topology Definition of a topology Basis (Base) of a topology The subspace topology & the product topology on X Y 3

1 Topology Definition of a topology Basis (Base) of a topology The subspace topology & the product topology on X Y 3 Index Page 1 Topology 2 1.1 Definition of a topology 2 1.2 Basis (Base) of a topology 2 1.3 The subspace topology & the product topology on X Y 3 1.4 Basic topology concepts: limit points, closed sets,

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and

More information

Bahadur representations for bootstrap quantiles 1

Bahadur representations for bootstrap quantiles 1 Bahadur representations for bootstrap quantiles 1 Yijun Zuo Department of Statistics and Probability, Michigan State University East Lansing, MI 48824, USA zuo@msu.edu 1 Research partially supported by

More information

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University

The Bootstrap: Theory and Applications. Biing-Shen Kuo National Chengchi University The Bootstrap: Theory and Applications Biing-Shen Kuo National Chengchi University Motivation: Poor Asymptotic Approximation Most of statistical inference relies on asymptotic theory. Motivation: Poor

More information

Topology. Xiaolong Han. Department of Mathematics, California State University, Northridge, CA 91330, USA address:

Topology. Xiaolong Han. Department of Mathematics, California State University, Northridge, CA 91330, USA  address: Topology Xiaolong Han Department of Mathematics, California State University, Northridge, CA 91330, USA E-mail address: Xiaolong.Han@csun.edu Remark. You are entitled to a reward of 1 point toward a homework

More information

Exact and Approximate Stepdown Methods for Multiple Hypothesis Testing

Exact and Approximate Stepdown Methods for Multiple Hypothesis Testing Exact and Approximate Stepdown Methods for Multiple Hypothesis Testing Joseph P. ROMANO and Michael WOLF Consider the problem of testing k hypotheses simultaneously. In this article we discuss finite-

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Introduction to Empirical Processes and Semiparametric Inference Lecture 02: Overview Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations Research

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 1, Issue 1 2005 Article 3 Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics Moulinath Banerjee Jon A. Wellner

More information

Compact operators on Banach spaces

Compact operators on Banach spaces Compact operators on Banach spaces Jordan Bell jordan.bell@gmail.com Department of Mathematics, University of Toronto November 12, 2017 1 Introduction In this note I prove several things about compact

More information

A Simple, Graphical Procedure for Comparing Multiple Treatment Effects

A Simple, Graphical Procedure for Comparing Multiple Treatment Effects A Simple, Graphical Procedure for Comparing Multiple Treatment Effects Brennan S. Thompson and Matthew D. Webb May 15, 2015 > Abstract In this paper, we utilize a new graphical

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

Sequences and Series of Functions

Sequences and Series of Functions Chapter 13 Sequences and Series of Functions These notes are based on the notes A Teacher s Guide to Calculus by Dr. Louis Talman. The treatment of power series that we find in most of today s elementary

More information

Spring 2014 Advanced Probability Overview. Lecture Notes Set 1: Course Overview, σ-fields, and Measures

Spring 2014 Advanced Probability Overview. Lecture Notes Set 1: Course Overview, σ-fields, and Measures 36-752 Spring 2014 Advanced Probability Overview Lecture Notes Set 1: Course Overview, σ-fields, and Measures Instructor: Jing Lei Associated reading: Sec 1.1-1.4 of Ash and Doléans-Dade; Sec 1.1 and A.1

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results

Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics

More information

Problem List MATH 5143 Fall, 2013

Problem List MATH 5143 Fall, 2013 Problem List MATH 5143 Fall, 2013 On any problem you may use the result of any previous problem (even if you were not able to do it) and any information given in class up to the moment the problem was

More information

R N Completeness and Compactness 1

R N Completeness and Compactness 1 John Nachbar Washington University in St. Louis October 3, 2017 R N Completeness and Compactness 1 1 Completeness in R. As a preliminary step, I first record the following compactness-like theorem about

More information

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation Statistics 62: L p spaces, metrics on spaces of probabilites, and connections to estimation Moulinath Banerjee December 6, 2006 L p spaces and Hilbert spaces We first formally define L p spaces. Consider

More information

ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS

ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS Bendikov, A. and Saloff-Coste, L. Osaka J. Math. 4 (5), 677 7 ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS ALEXANDER BENDIKOV and LAURENT SALOFF-COSTE (Received March 4, 4)

More information

Empirical Processes: General Weak Convergence Theory

Empirical Processes: General Weak Convergence Theory Empirical Processes: General Weak Convergence Theory Moulinath Banerjee May 18, 2010 1 Extended Weak Convergence The lack of measurability of the empirical process with respect to the sigma-field generated

More information

Section 8.2. Asymptotic normality

Section 8.2. Asymptotic normality 30 Section 8.2. Asymptotic normality We assume that X n =(X 1,...,X n ), where the X i s are i.i.d. with common density p(x; θ 0 ) P= {p(x; θ) :θ Θ}. We assume that θ 0 is identified in the sense that

More information

Partial Identification and Confidence Intervals

Partial Identification and Confidence Intervals Partial Identification and Confidence Intervals Jinyong Hahn Department of Economics, UCLA Geert Ridder Department of Economics, USC September 17, 009 Abstract We consider statistical inference on a single

More information

ASYMPTOTIC SIZE AND A PROBLEM WITH SUBSAMPLING AND WITH THE M OUT OF N BOOTSTRAP. DONALD W.K. ANDREWS and PATRIK GUGGENBERGER

ASYMPTOTIC SIZE AND A PROBLEM WITH SUBSAMPLING AND WITH THE M OUT OF N BOOTSTRAP. DONALD W.K. ANDREWS and PATRIK GUGGENBERGER ASYMPTOTIC SIZE AND A PROBLEM WITH SUBSAMPLING AND WITH THE M OUT OF N BOOTSTRAP BY DONALD W.K. ANDREWS and PATRIK GUGGENBERGER COWLES FOUNDATION PAPER NO. 1295 COWLES FOUNDATION FOR RESEARCH IN ECONOMICS

More information

The Numerical Delta Method and Bootstrap

The Numerical Delta Method and Bootstrap The Numerical Delta Method and Bootstrap Han Hong and Jessie Li Stanford University and UCSC 1 / 41 Motivation Recent developments in econometrics have given empirical researchers access to estimators

More information

Lecture 5. If we interpret the index n 0 as time, then a Markov chain simply requires that the future depends only on the present and not on the past.

Lecture 5. If we interpret the index n 0 as time, then a Markov chain simply requires that the future depends only on the present and not on the past. 1 Markov chain: definition Lecture 5 Definition 1.1 Markov chain] A sequence of random variables (X n ) n 0 taking values in a measurable state space (S, S) is called a (discrete time) Markov chain, if

More information

Maths 212: Homework Solutions

Maths 212: Homework Solutions Maths 212: Homework Solutions 1. The definition of A ensures that x π for all x A, so π is an upper bound of A. To show it is the least upper bound, suppose x < π and consider two cases. If x < 1, then

More information

Chapter 7. Hypothesis Testing

Chapter 7. Hypothesis Testing Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function

More information

Statistics 581, Problem Set 1 Solutions Wellner; 10/3/ (a) The case r = 1 of Chebychev s Inequality is known as Markov s Inequality

Statistics 581, Problem Set 1 Solutions Wellner; 10/3/ (a) The case r = 1 of Chebychev s Inequality is known as Markov s Inequality Statistics 581, Problem Set 1 Solutions Wellner; 1/3/218 1. (a) The case r = 1 of Chebychev s Inequality is known as Markov s Inequality and is usually written P ( X ɛ) E( X )/ɛ for an arbitrary random

More information

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 Revised March 2012

SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER. Donald W. K. Andrews. August 2011 Revised March 2012 SIMILAR-ON-THE-BOUNDARY TESTS FOR MOMENT INEQUALITIES EXIST, BUT HAVE POOR POWER By Donald W. K. Andrews August 2011 Revised March 2012 COWLES FOUNDATION DISCUSSION PAPER NO. 1815R COWLES FOUNDATION FOR

More information

arxiv: v3 [math.st] 23 May 2016

arxiv: v3 [math.st] 23 May 2016 Inference in partially identified models with many moment arxiv:1604.02309v3 [math.st] 23 May 2016 inequalities using Lasso Federico A. Bugni Mehmet Caner Department of Economics Department of Economics

More information

Proofs for Large Sample Properties of Generalized Method of Moments Estimators

Proofs for Large Sample Properties of Generalized Method of Moments Estimators Proofs for Large Sample Properties of Generalized Method of Moments Estimators Lars Peter Hansen University of Chicago March 8, 2012 1 Introduction Econometrica did not publish many of the proofs in my

More information

Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Closest Moment Estimation under General Conditions Chirok Han Victoria University of Wellington New Zealand Robert de Jong Ohio State University U.S.A October, 2003 Abstract This paper considers Closest

More information

Solutions Final Exam May. 14, 2014

Solutions Final Exam May. 14, 2014 Solutions Final Exam May. 14, 2014 1. (a) (10 points) State the formal definition of a Cauchy sequence of real numbers. A sequence, {a n } n N, of real numbers, is Cauchy if and only if for every ɛ > 0,

More information

Some Background Material

Some Background Material Chapter 1 Some Background Material In the first chapter, we present a quick review of elementary - but important - material as a way of dipping our toes in the water. This chapter also introduces important

More information

Priors for the frequentist, consistency beyond Schwartz

Priors for the frequentist, consistency beyond Schwartz Victoria University, Wellington, New Zealand, 11 January 2016 Priors for the frequentist, consistency beyond Schwartz Bas Kleijn, KdV Institute for Mathematics Part I Introduction Bayesian and Frequentist

More information

INFERENCE FOR PARAMETERS DEFINED BY MOMENT INEQUALITIES USING GENERALIZED MOMENT SELECTION. DONALD W. K. ANDREWS and GUSTAVO SOARES

INFERENCE FOR PARAMETERS DEFINED BY MOMENT INEQUALITIES USING GENERALIZED MOMENT SELECTION. DONALD W. K. ANDREWS and GUSTAVO SOARES INFERENCE FOR PARAMETERS DEFINED BY MOMENT INEQUALITIES USING GENERALIZED MOMENT SELECTION BY DONALD W. K. ANDREWS and GUSTAVO SOARES COWLES FOUNDATION PAPER NO. 1291 COWLES FOUNDATION FOR RESEARCH IN

More information

Copyright 2010 Pearson Education, Inc. Publishing as Prentice Hall.

Copyright 2010 Pearson Education, Inc. Publishing as Prentice Hall. .1 Limits of Sequences. CHAPTER.1.0. a) True. If converges, then there is an M > 0 such that M. Choose by Archimedes an N N such that N > M/ε. Then n N implies /n M/n M/N < ε. b) False. = n does not converge,

More information

QED. Queen s Economics Department Working Paper No Hypothesis Testing for Arbitrary Bounds. Jeffrey Penney Queen s University

QED. Queen s Economics Department Working Paper No Hypothesis Testing for Arbitrary Bounds. Jeffrey Penney Queen s University QED Queen s Economics Department Working Paper No. 1319 Hypothesis Testing for Arbitrary Bounds Jeffrey Penney Queen s University Department of Economics Queen s University 94 University Avenue Kingston,

More information

CHAPTER I THE RIESZ REPRESENTATION THEOREM

CHAPTER I THE RIESZ REPRESENTATION THEOREM CHAPTER I THE RIESZ REPRESENTATION THEOREM We begin our study by identifying certain special kinds of linear functionals on certain special vector spaces of functions. We describe these linear functionals

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence

Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function

More information

Large Sample Theory. Consider a sequence of random variables Z 1, Z 2,..., Z n. Convergence in probability: Z n

Large Sample Theory. Consider a sequence of random variables Z 1, Z 2,..., Z n. Convergence in probability: Z n Large Sample Theory In statistics, we are interested in the properties of particular random variables (or estimators ), which are functions of our data. In ymptotic analysis, we focus on describing the

More information

Fundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales

Fundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales Fundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales Prakash Balachandran Department of Mathematics Duke University April 2, 2008 1 Review of Discrete-Time

More information

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo PROCEDURES CONTROLLING THE k-fdr USING BIVARIATE DISTRIBUTIONS OF THE NULL p-values Sanat K. Sarkar and Wenge Guo Temple University and National Institute of Environmental Health Sciences Abstract: Procedures

More information

SDS : Theoretical Statistics

SDS : Theoretical Statistics SDS 384 11: Theoretical Statistics Lecture 1: Introduction Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin https://psarkar.github.io/teaching Manegerial Stuff

More information

2 Sequences, Continuity, and Limits

2 Sequences, Continuity, and Limits 2 Sequences, Continuity, and Limits In this chapter, we introduce the fundamental notions of continuity and limit of a real-valued function of two variables. As in ACICARA, the definitions as well as proofs

More information

Math 118B Solutions. Charles Martin. March 6, d i (x i, y i ) + d i (y i, z i ) = d(x, y) + d(y, z). i=1

Math 118B Solutions. Charles Martin. March 6, d i (x i, y i ) + d i (y i, z i ) = d(x, y) + d(y, z). i=1 Math 8B Solutions Charles Martin March 6, Homework Problems. Let (X i, d i ), i n, be finitely many metric spaces. Construct a metric on the product space X = X X n. Proof. Denote points in X as x = (x,

More information

Economics 204 Fall 2011 Problem Set 2 Suggested Solutions

Economics 204 Fall 2011 Problem Set 2 Suggested Solutions Economics 24 Fall 211 Problem Set 2 Suggested Solutions 1. Determine whether the following sets are open, closed, both or neither under the topology induced by the usual metric. (Hint: think about limit

More information

Lecture 8 Inequality Testing and Moment Inequality Models

Lecture 8 Inequality Testing and Moment Inequality Models Lecture 8 Inequality Testing and Moment Inequality Models Inequality Testing In the previous lecture, we discussed how to test the nonlinear hypothesis H 0 : h(θ 0 ) 0 when the sample information comes

More information

STAT 461/561- Assignments, Year 2015

STAT 461/561- Assignments, Year 2015 STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and

More information

Does k-th Moment Exist?

Does k-th Moment Exist? Does k-th Moment Exist? Hitomi, K. 1 and Y. Nishiyama 2 1 Kyoto Institute of Technology, Japan 2 Institute of Economic Research, Kyoto University, Japan Email: hitomi@kit.ac.jp Keywords: Existence of moments,

More information

Functional Analysis HW #3

Functional Analysis HW #3 Functional Analysis HW #3 Sangchul Lee October 26, 2015 1 Solutions Exercise 2.1. Let D = { f C([0, 1]) : f C([0, 1])} and define f d = f + f. Show that D is a Banach algebra and that the Gelfand transform

More information

Stat 710: Mathematical Statistics Lecture 31

Stat 710: Mathematical Statistics Lecture 31 Stat 710: Mathematical Statistics Lecture 31 Jun Shao Department of Statistics University of Wisconsin Madison, WI 53706, USA Jun Shao (UW-Madison) Stat 710, Lecture 31 April 13, 2009 1 / 13 Lecture 31:

More information

3 Integration and Expectation

3 Integration and Expectation 3 Integration and Expectation 3.1 Construction of the Lebesgue Integral Let (, F, µ) be a measure space (not necessarily a probability space). Our objective will be to define the Lebesgue integral R fdµ

More information

Estimation of the functional Weibull-tail coefficient

Estimation of the functional Weibull-tail coefficient 1/ 29 Estimation of the functional Weibull-tail coefficient Stéphane Girard Inria Grenoble Rhône-Alpes & LJK, France http://mistis.inrialpes.fr/people/girard/ June 2016 joint work with Laurent Gardes,

More information

The Lindeberg central limit theorem

The Lindeberg central limit theorem The Lindeberg central limit theorem Jordan Bell jordan.bell@gmail.com Department of Mathematics, University of Toronto May 29, 205 Convergence in distribution We denote by P d the collection of Borel probability

More information

BOOTSTRAPPING SAMPLE QUANTILES BASED ON COMPLEX SURVEY DATA UNDER HOT DECK IMPUTATION

BOOTSTRAPPING SAMPLE QUANTILES BASED ON COMPLEX SURVEY DATA UNDER HOT DECK IMPUTATION Statistica Sinica 8(998), 07-085 BOOTSTRAPPING SAMPLE QUANTILES BASED ON COMPLEX SURVEY DATA UNDER HOT DECK IMPUTATION Jun Shao and Yinzhong Chen University of Wisconsin-Madison Abstract: The bootstrap

More information

Problem set 1, Real Analysis I, Spring, 2015.

Problem set 1, Real Analysis I, Spring, 2015. Problem set 1, Real Analysis I, Spring, 015. (1) Let f n : D R be a sequence of functions with domain D R n. Recall that f n f uniformly if and only if for all ɛ > 0, there is an N = N(ɛ) so that if n

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Chapter 8 Maximum Likelihood Estimation 8. Consistency If X is a random variable (or vector) with density or mass function f θ (x) that depends on a parameter θ, then the function f θ (X) viewed as a function

More information