# University of California San Diego and Stanford University and

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 First International Workshop on Functional and Operatorial Statistics. Toulouse, June 19-21, 2008 K-sample Subsampling Dimitris N. olitis andjoseph.romano University of California San Diego and Stanford University and Abstract The problem of subsampling in two-sample and K-sample settings is addressed where both the data and the statistics of interest take values in general spaces. We show the asymptotic validity of subsampling confidence intervals and hypothesis tests in the case of independent samples, and give a comparison to the bootstrap in the K-sample setting. 1 Introduction Subsampling is a statistical method that is most generally valid for nonparametric inference in a large-sample setting. The applications of subsampling are numerous starting from i.i.d. data and regression, and continuing to time series, random fields, marked point processes, etc. ; see olitis, Romano and Wolf (1999) for a review and list of references. Interestingly, the two-sample and K-sample settings have not been explored yet in the subsampling literature ; we attempt to fill this gap here. So, consider K independent datasets : X (1),...,X (K) where X (k) =(X (k) 1,...,X(k) variables X (k) j n k )fork =1,...,K. The random take values in an arbitrary space 1 S ; typically, S would be R d for some d, but S can very well be a function space. Although the dataset X (k) is independent of X (k ) for k k, there may exist some dependence within a dataset. For conciseness, we will focus on the case of independence within samples here ; the general case will be treated in a follow-up bigger exposition. Thus, in the sequel we will assume that, for any k, X (k) 1,...,X(k) n k are i.i.d. An example in the i.i.d. case is the usual two-sample set-up in biostatistics where d features (body characteristics, gene expressions, etc.) are measured on a group of patients, and then again measured on a control group. The probability law associated with such a K-sample experiment is specified by =( 1,..., K ), where k is the underlying probability of the kth sample ; more formally, the joint distribution of all the observations is the product measure K. The goal is inference (confidence regions, hypothesis tests, etc.) k=1 n k k 1 Actually, one can let S vary with k as well, but we do not pursue this here for lack of space.

2 regarding some parameter θ = θ( ) that takes values in a general normed linear space B with norm denoted by.denoten =(n 1,...,n K ), and let ˆθ n = ˆθ n (X (1),...,X (K) ) be an estimator of θ. It will be assumed that ˆθ n is consistent as min k n k. In general, one could also consider the case where the number of samples K tends to as well. Let g : B R be a continuous function, and let J n ( ) denote the sampling distribution of the root g[τ n (ˆθ n θ( ))] under, with corresponding cumulative distribution function J n (x, )=rob {g[τ n (ˆθ n θ( ))] x} (1) where τ n is a normalizing sequence ; in particular, τ n is to be thought of as a fixed function of n such that τ n when min k n k.asanexample,g( ) might be a continuous function of the norm or a projection operator. As in the one-sample case, the basic assumption that is required for subsampling to work is existence of a bona fide large-sample distribution, i.e., Assumption 1.1 There exists a nondegenerate limiting law J( ) such that J n ( ) converges weakly to J( ) as min k n k. The α quantile of J( ) will be denoted by J 1 (α, )=inf{x : J(x, ) α}. In addition to Assumption 1.1, we will use the following mild assumption. Assumption 1.2 As min k n k, τ b ˆθ n θ( ) = o (1). Assumptions 1.1 and 1.2 are implied by the following assumption, as long as τ b /τ n 0. Assumption 1.3 As min k n k, the distribution of τ n (ˆθ n θ( )) under converges weakly to some distribution (on the Borel σ-field of the normed linear space B). Here, weak convergence is understood to be taken in the modern sense of Hoffmann- Jorgensen ; see Section 1.3 of van der Vaart and Wellner (1996). That Assumption 1.3 implies both Assumptions 1.1 and 1.2 follows by the Continuous Mapping Theorem ; see Theorem of van der Vaart and Wellner (1996). 2 Subsampling hypothesis tests in K samples For k =1,...,K,letS k denote the set of all size b k (unordered) subsets of the dataset {X (k) 1,...,X(k) n k } where b k is an integer in [1,n k ]. Note that the set S k contains Q k = ( nk ) (k) b k elements that are enumerated as S 1,S(k) 2,...,S(k) Q k.ak foldsubsample is then constructed by choosing one element from each super-set S k for k =1,...,K.Thus,a typical K fold subsample has the form : S (1) i 1,S (2) i 2,...,S (K) i K, where i k is an integer in [1,Q k ]fork =1,...,K. It is apparent that the number of possible K fold subsamples is Q = K k=1 Q k. So a subsample value of the general statistic ˆθ n is ˆθ i,b = ˆθ b (S (1) i 1,...,S (K) i K ) (2) where b =(b 1,...,b K )andi =(i 1,...,i K ). The subsampling distribution approximation to J n ( ) is defined by L n,b (x) = 1 Q Q 1 Q 2 i 1=1 i 2=1 Q K i K=1 1{g[τ b (ˆθ i,b ˆθ n )] x}. (3)

3 The distribution L n,b (x) is useful for the construction of subsampling confidence sets as discussed in Section 3. For hypothesis testing, however, we instead let G n,b (x) = 1 Q Q 1 Q 2 i 1=1 i 2=1 Q K i K =1 1{g[τ b (ˆθ i,b )] x}, (4) and consider the general problem of testing a null hypothesis H 0 that =( 1,..., k ) 0 against H 1 that 1. The goal is to construct an asymptotically valid null distribution based on some test statistic of the form g(τ n ˆθn ), whose distribution under is defined to be G n ( )(withc.d.f.g n (,)). The subsampling critical value is obtained as the 1 α quantile of G n,b ( ), denoted g n,b (1 α). We will make use of the following assumption. Assumption 2.1 If 0, there exists a nondegenerate limiting law G( ) such that G n ( ) converges weakly to G( ) as min k n k. Let G(,) denote the c.d.f. corresponding to G( ). Let G 1 (1 α, )denotea1 α quantile of G( ). The following result gives the consistency of the procedure under H 0, and under a sequence of contiguous alternatives ; for the definition contiguity see Section 12.3 of Lehmann and Romano (2007). One could also obtain a simple consistency result under fixed alternatives. Theorem 2.1 Suppose Assumption 2.1 holds. Also, assume that, for each k =1,...,K, we have b k /n k 0, andb k as min k n k. (i) Assume 0.IfG(,) is continuous at its 1 α quantile G 1 (1 α, ), then and g n,b rob {g(τ n ˆθn ) >g n,b (1 α)} α (ii) Suppose, that for some = ( 1,..., k ) 0, n k k,n k G 1 (1 α, ) (5) as min k n k. (6) is contiguous to n k k k =1,...K. Then, under such a contiguous sequence, g(τ n ˆθn ) is tight. Moreover, if it converges in distribution to some random variable T and G(,) is continuous at G 1 (1 α, ), then the limiting power of the test against such a sequence is {T > G 1 (1 α, )}. roof : To prove (i), let x be a continuity point of G(,). We claim G n,b (x) for G(x, ). (7) Toseewhy,notethatE[G n,b (x)] = G b (x, ) G(x, ). So, by Chebychev s inequality, to show (7), it suffices to show Var[G n,b (x)] 0. To do this, let d = d n be the greatest integer min k (n k /b k ). Then, for j =1,...,d,let θ j,b be equal to the statistic ˆθ b evaluated at the data set where the observations from the kth sample are (X (k) b k (j 1)+1,X(k) b k (j 1)+2,...,X(k) b k (j 1)+b k ). Then, set Ḡ n,b (x) =d 1 d 1{g(τ b θj,b ) x}. j=1

4 By construction, Ḡ n,b (x) is an average of i.i.d. 0 1 random variables with expectation G b (x, ) and variance that is bounded above by 1/(4d n ) 0. But, G n,b (x) has smaller variance than Ḡn,b(x). This last statement follows by a sufficiency argument from the Rao-Blackwell Theorem ; indeed, (k) G n,b (x) =E[Ḡn,b(x) ˆ n k, k =1,...,K], (k) where ˆ n k is the empirical measure in the kth sample. Since these empirical measures are sufficient, it follows that Var(G n,b (x)) Var(Ḡn,b(x)) 0. Thus, (7) holds. Then, (5) follows by Lemma (ii) of Lehmann and Romano (2005). Application of Slutsky s Theorem yields (6). To prove (ii), we know that g n,b G 1 (1 α, ) under. Contiguity forces the same convergence under the sequence of contiguous alternatives. The result follows by Slutsky s Theorem. 3 Subsampling confidence sets in K samples Let c n,b (1 α) =inf{x : L n,b (x) 1 α} where L n,b (x) was defined in (3). Theorem 3.1 Assume Assumptions 1.1 and 1.2, where g is assumed uniformly continuous. Also assume that, for each k =1,...,K, we have b k /n k 0, τ b /τ n 0, and b k as min k n k. (i) Then, L n,b (x) J(x, ) for all continuity points x of J(,). (ii) If J(,) is continuous at J 1 (1 α, ), then the event {g[τ n (ˆθ n θ( ))] c n,b (1 α)} (8) has asymptotic probability equal to 1 α ; therefore, the confidence set {θ : g[τ n (ˆθ n θ)] c n,b (1 α)} has asymptotic coverage probability 1 α. roof : Assume without loss of generality that θ( ) = 0 (in which case J n ( )=G n ( )). Let x be a continuity point of J(,). First, we claim that L n,b (x) G n,b (x) 0. (9) Given ɛ>0, there exists δ>0, so that g(x) g(x ) <ɛif x x <δ.butthen, g[τ b (ˆθ i,b ˆθ n )] g(τ bˆθi,b ) <ɛif τ bˆθn <δ; this latter event has probability tending to one. It follows that, for any fixed ɛ>0, G n,b (x ɛ) L n,b (x) G n,b (x + ɛ) with probability tending to one. But, the behavior of G n,b (x) was given in Theorem 2.1. Letting ɛ 0 through continuity points of J(,) yields (9) and (i). art (ii) follows from Slutsky s Theorem.

5 Remark. The uniform continuity assumption for g can be weakened to continuity if Assumptions 1.1 and 1.2 are replaced by Assumption 1.3. However, the proof is much more complicated and relies on a K-sample version of Theorem of olitis, Romano and Wolf (1999). In general, we may also try to approximate the distribution of a studentized root of the form g(τ n [ˆθ n θ( )]/ˆσ n,whereˆσ n is some estimator which tends in probability to some finite nonzero constant σ( ). The subsampling approximation to this distribution is L + n,b (x) = 1 Q Q 1 Q 2 i 1=1 i 2=1 Q K i K =1 1{g[τ b (ˆθ i,b ˆθ n )]/ˆσ i,b x}, (10) where ˆσ i,b is the estimator ˆσ b computed from the ith subsampled data set. Also let c + n,b (1 α) =inf{x : L+ n,b (x) 1 α}. Theorem 3.2 Assume Assumptions 1.1 and 1.2, where g is assumed uniformly continuous. Let ˆσ n satisfy ˆσ n σ( ) > 0. Also assume that, for each k =1,...,K, we have b k /n k 0, τ b /τ n 0, andb k as min k n k. (i) Then, L n,b (x) J(x σ( ),) if J(,) is continuous at xσ( ). (ii) If J(,) is continuous at J 1 (1 α, )/σ( ), then the event has asymptotic probability equal to 1 α. {g[τ n (ˆθ n θ( ))]/ˆσ n c + n,b (1 α)} (11) 4 Random subsamples and the K sample bootstrap ( nk b k ) can be a prohibitively large number ; consider- For large values of n k and b k, Q = k ing all possible subsamples may be impractical and, thus, we may resort to Monte Carlo. To define the algorithm for generating random subsamples of sizes b 1,...,b K respectively, recall that subsampling in the i.i.d. single-sample case is tantamount to sampling without replacement from the original dataset ; see e.g. olitis et al. (1999, Ch. 2.3). Thus, for m =1,...,M, we can generate the mth joint subsample as X (1) m,x (2) m,...,x (K) m where X m (k) = {X(k) I 1,...,X (k) I bk },andi 1,...,I bk are b k numbers drawn randomly without replacement from the index set {1, 2,...,n k }. Note that the random indices drawn to generate X m (k) are independent to those drawn to generate ) X(k m for k k. Thus, a randomly chosen subsample value of the statistic ˆθ n is given by ˆθ m,b = ˆθ b (X (1) m,...,x m (K) ), with corresponding subsampling distribution defined as L n,b (x) = 1 M M 1{g[τ b (ˆθ m,b ˆθ n )] x}. (12) m=1 The following corollary shows that L n,b (x) andits1 α quantile c n,b (1 α) canbeused for the construction of large-sample confidence regions for θ ; its proof is analogous to the proof of Corollary 2.1 of olitis and Romano (1994).

6 Corollary 4.1 Assume the conditions of Theorem 3.1. As M, parts (i) and (ii) of Theorem 3.1 remain valid with L n,b (x) and c n,b (1 α) instead of L n,b (x) and c n,b (1 α). Similarly, hypothesis testing can be conducted using the notion of random subsamples. To describe it, let g n,b (1 α) =inf{x : G n,b (x) 1 α} where G n,b (x) = 1 M M 1{g[τ b (ˆθ m,b )] x}. (13) m=1 Corollary 4.2 Assume the conditions of Theorem 2.1. As M, parts (i) and (ii) of Theorem 2.1 remain valid with G n,b (x) and g n,b (1 α) instead of G n,b (x) and g n,b (1 α). The bootstrap in two-sample settings is often used in practical work ; see Hall and Martin (1988) or van der Vaart and Wellner (1996). In the i.i.d. set-up, resampling and (random) subsampling are very closely related since, as mentioned, they are tantamount to sampling with vs. without replacement from the given i.i.d. sample. By contrast to subsampling, however, no general validity theorem is available for the bootstrap unless a smaller resample size is used ; see olitis and Romano (1993). Actually, the general validity of K-sample bootstrap that uses a resample size b k for sample k follows from the general validity of subsampling as long as b 2 k << n k. To state it, let Jn,b (x) denote the bootstrap (pseudo-empirical) distribution of g[τ b(ˆθ n,b ˆθ n )] where ˆθ n,b is the statistic ˆθ b computed from the bootstrap data. Similarly, let c n,b (1 α) = inf{x : Jn,b (x) 1 α}. The proof of the following corollary parallels the discussion in Section 2.3 of olitis et al. (1999). Corollary 4.3 Under the additional condition b 2 k /n k 0 for all k, Theorem 3.1 is valid as stated with Jn,b (x) and c n,b (1 α) in place of L n,b(x) and c n,b (1 α) respectively. Références [1] Hall,. and Martin, M. (1988). On the bootstrap and two-sample problems, Australian Journal of Statistics, vol. 30A, [2] Lehmann, E.L. and Romano, J. (2005). Testing Statistical Hypotheses, 3rd edition, Springer, New York. [3] olitis, D.N., and Romano, J.. (1993). Estimating the Distribution of a Studentized Statistic by Subsampling, in Bulletin of the International Statistical Institute, 49th Session, Firenze, August 25 - September 2, 1993, Book 2, pp [4] olitis, D.N., and Romano, J.. (1994). Large sample confidence regions based on subsamples under minimal assumptions, Ann. Statist., vol. 22, [5] olitis, D., Romano, J. and Wolf, M. (1999). Subsampling, Springer, New York. [6] van der Vaart, A. and Wellner, J. (1996). Weak Convergence and Empirical rocesses, Springer, New York.

### large number of i.i.d. observations from P. For concreteness, suppose

1 Subsampling Suppose X i, i = 1,..., n is an i.i.d. sequence of random variables with distribution P. Let θ(p ) be some real-valued parameter of interest, and let ˆθ n = ˆθ n (X 1,..., X n ) be some estimate

### Testing Statistical Hypotheses

E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

### Theoretical Statistics. Lecture 1.

1. Organizational issues. 2. Overview. 3. Stochastic convergence. Theoretical Statistics. Lecture 1. eter Bartlett 1 Organizational Issues Lectures: Tue/Thu 11am 12:30pm, 332 Evans. eter Bartlett. bartlett@stat.

### Inference for identifiable parameters in partially identified econometric models

Journal of Statistical Planning and Inference 138 (2008) 2786 2807 www.elsevier.com/locate/jspi Inference for identifiable parameters in partially identified econometric models Joseph P. Romano a,b,, Azeem

### Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions

Supplement to Quantile-Based Nonparametric Inference for First-Price Auctions Vadim Marmer University of British Columbia Artyom Shneyerov CIRANO, CIREQ, and Concordia University August 30, 2010 Abstract

### Testing Statistical Hypotheses

E.L. Lehmann Joseph P. Romano, 02LEu1 ttd ~Lt~S Testing Statistical Hypotheses Third Edition With 6 Illustrations ~Springer 2 The Probability Background 28 2.1 Probability and Measure 28 2.2 Integration.........

### One-Sample Numerical Data

One-Sample Numerical Data quantiles, boxplot, histogram, bootstrap confidence intervals, goodness-of-fit tests University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

### Political Science 236 Hypothesis Testing: Review and Bootstrapping

Political Science 236 Hypothesis Testing: Review and Bootstrapping Rocío Titiunik Fall 2007 1 Hypothesis Testing Definition 1.1 Hypothesis. A hypothesis is a statement about a population parameter The

### 5 Introduction to the Theory of Order Statistics and Rank Statistics

5 Introduction to the Theory of Order Statistics and Rank Statistics This section will contain a summary of important definitions and theorems that will be useful for understanding the theory of order

### EXACT AND ASYMPTOTICALLY ROBUST PERMUTATION TESTS. Eun Yi Chung Joseph P. Romano

EXACT AND ASYMPTOTICALLY ROBUST PERMUTATION TESTS By Eun Yi Chung Joseph P. Romano Technical Report No. 20-05 May 20 Department of Statistics STANFORD UNIVERSITY Stanford, California 94305-4065 EXACT AND

### UNIVERSITÄT POTSDAM Institut für Mathematik

UNIVERSITÄT POTSDAM Institut für Mathematik Testing the Acceleration Function in Life Time Models Hannelore Liero Matthias Liero Mathematische Statistik und Wahrscheinlichkeitstheorie Universität Potsdam

### is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications.

Stat 811 Lecture Notes The Wald Consistency Theorem Charles J. Geyer April 9, 01 1 Analyticity Assumptions Let { f θ : θ Θ } be a family of subprobability densities 1 with respect to a measure µ on a measurable

### Exact and Approximate Stepdown Methods for Multiple Hypothesis Testing

Exact and Approximate Stepdown Methods for Multiple Hypothesis Testing Joseph P. ROMANO and Michael WOLF Consider the problem of testing k hypotheses simultaneously. In this article we discuss finite-

### Lawrence D. Brown* and Daniel McCarthy*

Comments on the paper, An adaptive resampling test for detecting the presence of significant predictors by I. W. McKeague and M. Qian Lawrence D. Brown* and Daniel McCarthy* ABSTRACT: This commentary deals

### Robust Performance Hypothesis Testing with the Variance. Institute for Empirical Research in Economics University of Zurich

Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 516 Robust Performance Hypothesis Testing with the Variance Olivier Ledoit and Michael

### Tail bound inequalities and empirical likelihood for the mean

Tail bound inequalities and empirical likelihood for the mean Sandra Vucane 1 1 University of Latvia, Riga 29 th of September, 2011 Sandra Vucane (LU) Tail bound inequalities and EL for the mean 29.09.2011

### The Numerical Delta Method and Bootstrap

The Numerical Delta Method and Bootstrap Han Hong and Jessie Li Stanford University and UCSC 1 / 41 Motivation Recent developments in econometrics have given empirical researchers access to estimators

### Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence

Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

### Advanced Statistics II: Non Parametric Tests

Advanced Statistics II: Non Parametric Tests Aurélien Garivier ParisTech February 27, 2011 Outline Fitting a distribution Rank Tests for the comparison of two samples Two unrelated samples: Mann-Whitney

### Robust Resampling Methods for Time Series

Robust Resampling Methods for Time Series Lorenzo Camponovo University of Lugano Olivier Scaillet Université de Genève and Swiss Finance Institute Fabio Trojani University of Lugano and Swiss Finance Institute

### Statistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation

Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence

### Chapter 5 Confidence Intervals

Chapter 5 Confidence Intervals Confidence Intervals about a Population Mean, σ, Known Abbas Motamedi Tennessee Tech University A point estimate: a single number, calculated from a set of data, that is

### Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling

Test (2008) 17: 461 471 DOI 10.1007/s11749-008-0134-6 DISCUSSION Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling Joseph P. Romano Azeem M. Shaikh

### Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III)

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part III) Florian Pelgrin HEC September-December 2010 Florian Pelgrin (HEC) Constrained estimators September-December

### Large Sample Properties of Estimators in the Classical Linear Regression Model

Large Sample Properties of Estimators in the Classical Linear Regression Model 7 October 004 A. Statement of the classical linear regression model The classical linear regression model can be written in

### Bootstrap tests. Patrick Breheny. October 11. Bootstrap vs. permutation tests Testing for equality of location

Bootstrap tests Patrick Breheny October 11 Patrick Breheny STA 621: Nonparametric Statistics 1/14 Introduction Conditioning on the observed data to obtain permutation tests is certainly an important idea

### Random Number Generation. CS1538: Introduction to simulations

Random Number Generation CS1538: Introduction to simulations Random Numbers Stochastic simulations require random data True random data cannot come from an algorithm We must obtain it from some process

### Resampling and the Bootstrap

Resampling and the Bootstrap Axel Benner Biostatistics, German Cancer Research Center INF 280, D-69120 Heidelberg benner@dkfz.de Resampling and the Bootstrap 2 Topics Estimation and Statistical Testing

### Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach By Shiqing Ling Department of Mathematics Hong Kong University of Science and Technology Let {y t : t = 0, ±1, ±2,

### Estimation and Inference of Quantile Regression. for Survival Data under Biased Sampling

Estimation and Inference of Quantile Regression for Survival Data under Biased Sampling Supplementary Materials: Proofs of the Main Results S1 Verification of the weight function v i (t) for the lengthbiased

### Central Limit Theorems for Conditional Markov Chains

Mathieu Sinn IBM Research - Ireland Bei Chen IBM Research - Ireland Abstract This paper studies Central Limit Theorems for real-valued functionals of Conditional Markov Chains. Using a classical result

### Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles

Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles Weihua Zhou 1 University of North Carolina at Charlotte and Robert Serfling 2 University of Texas at Dallas Final revision for

### Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Chirok Han and Robert de Jong January 28, 2002 Abstract This paper considers Closest Moment (CM) estimation with a general distance function, and avoids

### Lecture 1: Random number generation, permutation test, and the bootstrap. August 25, 2016

Lecture 1: Random number generation, permutation test, and the bootstrap August 25, 2016 Statistical simulation 1/21 Statistical simulation (Monte Carlo) is an important part of statistical method research.

### A nonparametric two-sample wald test of equality of variances

University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 211 A nonparametric two-sample wald test of equality of variances David

### 6 Single Sample Methods for a Location Parameter

6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually

### An elementary proof of the weak convergence of empirical processes

An elementary proof of the weak convergence of empirical processes Dragan Radulović Department of Mathematics, Florida Atlantic University Marten Wegkamp Department of Mathematics & Department of Statistical

### A Signed-Rank Test Based on the Score Function

Applied Mathematical Sciences, Vol. 10, 2016, no. 51, 2517-2527 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2016.66189 A Signed-Rank Test Based on the Score Function Hyo-Il Park Department

### Bootstrapping high dimensional vector: interplay between dependence and dimensionality

Bootstrapping high dimensional vector: interplay between dependence and dimensionality Xianyang Zhang Joint work with Guang Cheng University of Missouri-Columbia LDHD: Transition Workshop, 2014 Xianyang

### Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued

Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and

### [y i α βx i ] 2 (2) Q = i=1

Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

### Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

### HAIYUN ZHOU, RAVI P. AGARWAL, YEOL JE CHO, AND YONG SOO KIM

Georgian Mathematical Journal Volume 9 (2002), Number 3, 591 600 NONEXPANSIVE MAPPINGS AND ITERATIVE METHODS IN UNIFORMLY CONVEX BANACH SPACES HAIYUN ZHOU, RAVI P. AGARWAL, YEOL JE CHO, AND YONG SOO KIM

### Stochastic Convergence, Delta Method & Moment Estimators

Stochastic Convergence, Delta Method & Moment Estimators Seminar on Asymptotic Statistics Daniel Hoffmann University of Kaiserslautern Department of Mathematics February 13, 2015 Daniel Hoffmann (TU KL)

### An Omnibus Nonparametric Test of Equality in. Distribution for Unknown Functions

An Omnibus Nonparametric Test of Equality in Distribution for Unknown Functions arxiv:151.4195v3 [math.st] 14 Jun 217 Alexander R. Luedtke 1,2, Mark J. van der Laan 3, and Marco Carone 4,1 1 Vaccine and

### Problem set 1, Real Analysis I, Spring, 2015.

Problem set 1, Real Analysis I, Spring, 015. (1) Let f n : D R be a sequence of functions with domain D R n. Recall that f n f uniformly if and only if for all ɛ > 0, there is an N = N(ɛ) so that if n

### Preliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference

1 / 171 Bootstrap inference Francisco Cribari-Neto Departamento de Estatística Universidade Federal de Pernambuco Recife / PE, Brazil email: cribari@gmail.com October 2013 2 / 171 Unpaid advertisement

### A Review of Basic Monte Carlo Methods

A Review of Basic Monte Carlo Methods Julian Haft May 9, 2014 Introduction One of the most powerful techniques in statistical analysis developed in this past century is undoubtedly that of Monte Carlo

### Limit theory and inference about conditional distributions

Limit theory and inference about conditional distributions Purevdorj Tuvaandorj and Victoria Zinde-Walsh McGill University, CIREQ and CIRANO purevdorj.tuvaandorj@mail.mcgill.ca victoria.zinde-walsh@mcgill.ca

### The Essential Equivalence of Pairwise and Mutual Conditional Independence

The Essential Equivalence of Pairwise and Mutual Conditional Independence Peter J. Hammond and Yeneng Sun Probability Theory and Related Fields, forthcoming Abstract For a large collection of random variables,

### Understanding Ding s Apparent Paradox

Submitted to Statistical Science Understanding Ding s Apparent Paradox Peter M. Aronow and Molly R. Offer-Westort Yale University 1. INTRODUCTION We are grateful for the opportunity to comment on A Paradox

### A Bootstrap Test for Conditional Symmetry

ANNALS OF ECONOMICS AND FINANCE 6, 51 61 005) A Bootstrap Test for Conditional Symmetry Liangjun Su Guanghua School of Management, Peking University E-mail: lsu@gsm.pku.edu.cn and Sainan Jin Guanghua School

### Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants

18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009

### Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf

Part : Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section

### On probabilities of large and moderate deviations for L-statistics: a survey of some recent developments

UDC 519.2 On probabilities of large and moderate deviations for L-statistics: a survey of some recent developments N. V. Gribkova Department of Probability Theory and Mathematical Statistics, St.-Petersburg

### asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data

asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data Xingqiu Zhao and Ying Zhang The Hong Kong Polytechnic University and Indiana University Abstract:

### Theory and Methods of Statistical Inference

PhD School in Statistics cycle XXIX, 2014 Theory and Methods of Statistical Inference Instructors: B. Liseo, L. Pace, A. Salvan (course coordinator), N. Sartori, A. Tancredi, L. Ventura Syllabus Some prerequisites:

### Bootstrap of residual processes in regression: to smooth or not to smooth?

Bootstrap of residual processes in regression: to smooth or not to smooth? arxiv:1712.02685v1 [math.st] 7 Dec 2017 Natalie Neumeyer Ingrid Van Keilegom December 8, 2017 Abstract In this paper we consider

### A Perron-type theorem on the principal eigenvalue of nonsymmetric elliptic operators

A Perron-type theorem on the principal eigenvalue of nonsymmetric elliptic operators Lei Ni And I cherish more than anything else the Analogies, my most trustworthy masters. They know all the secrets of

### AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

### Rank-sum Test Based on Order Restricted Randomized Design

Rank-sum Test Based on Order Restricted Randomized Design Omer Ozturk and Yiping Sun Abstract One of the main principles in a design of experiment is to use blocking factors whenever it is possible. On

### IMPROVING TWO RESULTS IN MULTIPLE TESTING

IMPROVING TWO RESULTS IN MULTIPLE TESTING By Sanat K. Sarkar 1, Pranab K. Sen and Helmut Finner Temple University, University of North Carolina at Chapel Hill and University of Duesseldorf October 11,

### So we will instead use the Jacobian method for inferring the PDF of functionally related random variables; see Bertsekas & Tsitsiklis Sec. 4.1.

2011 Page 1 Simulating Gaussian Random Variables Monday, February 14, 2011 2:00 PM Readings: Kloeden and Platen Sec. 1.3 Why does the Box Muller method work? How was it derived? The basic idea involves

### Probability and Statistics

Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be CHAPTER 4: IT IS ALL ABOUT DATA 4a - 1 CHAPTER 4: IT

### Estimation of a Two-component Mixture Model

Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint

### Estimation of the Conditional Variance in Paired Experiments

Estimation of the Conditional Variance in Paired Experiments Alberto Abadie & Guido W. Imbens Harvard University and BER June 008 Abstract In paired randomized experiments units are grouped in pairs, often

### Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.

Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it

### The assumptions are needed to give us... valid standard errors valid confidence intervals valid hypothesis tests and p-values

Statistical Consulting Topics The Bootstrap... The bootstrap is a computer-based method for assigning measures of accuracy to statistical estimates. (Efron and Tibshrani, 1998.) What do we do when our

### 14.30 Introduction to Statistical Methods in Economics Spring 2009

MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

### Asymptotic statistics using the Functional Delta Method

Quantiles, Order Statistics and L-Statsitics TU Kaiserslautern 15. Februar 2015 Motivation Functional The delta method introduced in chapter 3 is an useful technique to turn the weak convergence of random

### Spread, estimators and nuisance parameters

Bernoulli 3(3), 1997, 323±328 Spread, estimators and nuisance parameters EDWIN. VAN DEN HEUVEL and CHIS A.J. KLAASSEN Department of Mathematics and Institute for Business and Industrial Statistics, University

### An algorithm for robust fitting of autoregressive models Dimitris N. Politis

An algorithm for robust fitting of autoregressive models Dimitris N. Politis Abstract: An algorithm for robust fitting of AR models is given, based on a linear regression idea. The new method appears to

### Theory and Methods of Statistical Inference. PART I Frequentist theory and methods

PhD School in Statistics cycle XXVI, 2011 Theory and Methods of Statistical Inference PART I Frequentist theory and methods (A. Salvan, N. Sartori, L. Pace) Syllabus Some prerequisites: Empirical distribution

### Closest Moment Estimation under General Conditions

Closest Moment Estimation under General Conditions Chirok Han Victoria University of Wellington New Zealand Robert de Jong Ohio State University U.S.A October, 2003 Abstract This paper considers Closest

### arxiv: v4 [stat.me] 20 Oct 2017

A simple method for implementing Monte Carlo tests Dong Ding, Axel Gandy, and Georg Hahn Department of Mathematics, Imperial College London arxiv:1611.01675v4 [stat.me] 20 Oct 2017 Abstract We consider

### Stanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon Measures

2 1 Borel Regular Measures We now state and prove an important regularity property of Borel regular outer measures: Stanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon

### Master s Written Examination - Solution

Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

### Least Squares Model Averaging. Bruce E. Hansen University of Wisconsin. January 2006 Revised: August 2006

Least Squares Model Averaging Bruce E. Hansen University of Wisconsin January 2006 Revised: August 2006 Introduction This paper developes a model averaging estimator for linear regression. Model averaging

### Nonparametric Test on Process Capability

Nonparametric Test on Process Capability Stefano Bonnini Abstract The study of process capability is very important in designing a new product or service and in the definition of purchase agreements. In

### Contents 1. Contents

Contents 1 Contents 1 One-Sample Methods 3 1.1 Parametric Methods.................... 4 1.1.1 One-sample Z-test (see Chapter 0.3.1)...... 4 1.1.2 One-sample t-test................. 6 1.1.3 Large sample

### 4 Hypothesis testing. 4.1 Types of hypothesis and types of error 4 HYPOTHESIS TESTING 49

4 HYPOTHESIS TESTING 49 4 Hypothesis testing In sections 2 and 3 we considered the problem of estimating a single parameter of interest, θ. In this section we consider the related problem of testing whether

### Modified Simes Critical Values Under Positive Dependence

Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia

### Uniformity and the delta method

Uniformity and the delta method Maximilian Kasy January, 208 Abstract When are asymptotic approximations using the delta-method uniformly valid? We provide sufficient conditions as well as closely related

### 1. (Regular) Exponential Family

1. (Regular) Exponential Family The density function of a regular exponential family is: [ ] Example. Poisson(θ) [ ] Example. Normal. (both unknown). ) [ ] [ ] [ ] [ ] 2. Theorem (Exponential family &

### Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf

Part 1: Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section

### The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 71. Decide in each case whether the hypothesis is simple

### Asymptotic Refinements of a Misspecification-Robust Bootstrap for Generalized Method of Moments Estimators

Asymptotic Refinements of a Misspecification-Robust Bootstrap for Generalized Method of Moments Estimators SeoJeong Jay) Lee University of Wisconsin-Madison Job Market Paper Abstract I propose a nonparametric

### Theory and Methods of Statistical Inference. PART I Frequentist likelihood methods

PhD School in Statistics XXV cycle, 2010 Theory and Methods of Statistical Inference PART I Frequentist likelihood methods (A. Salvan, N. Sartori, L. Pace) Syllabus Some prerequisites: Empirical distribution

### Smooth nonparametric estimation of a quantile function under right censoring using beta kernels

Smooth nonparametric estimation of a quantile function under right censoring using beta kernels Chanseok Park 1 Department of Mathematical Sciences, Clemson University, Clemson, SC 29634 Short Title: Smooth

### Estimating the accuracy of a hypothesis Setting. Assume a binary classification setting

Estimating the accuracy of a hypothesis Setting Assume a binary classification setting Assume input/output pairs (x, y) are sampled from an unknown probability distribution D = p(x, y) Train a binary classifier

### A Bias Correction for the Minimum Error Rate in Cross-validation

A Bias Correction for the Minimum Error Rate in Cross-validation Ryan J. Tibshirani Robert Tibshirani Abstract Tuning parameters in supervised learning problems are often estimated by cross-validation.

### where x i and u i are iid N (0; 1) random variates and are mutually independent, ff =0; and fi =1. ff(x i )=fl0 + fl1x i with fl0 =1. We examine the e

Inference on the Quantile Regression Process Electronic Appendix Roger Koenker and Zhijie Xiao 1 Asymptotic Critical Values Like many other Kolmogorov-Smirnov type tests (see, e.g. Andrews (1993)), the

### CH.9 Tests of Hypotheses for a Single Sample

CH.9 Tests of Hypotheses for a Single Sample Hypotheses testing Tests on the mean of a normal distributionvariance known Tests on the mean of a normal distributionvariance unknown Tests on the variance

### Eric Shou Stat 598B / CSE 598D METHODS FOR MICRODATA PROTECTION

Eric Shou Stat 598B / CSE 598D METHODS FOR MICRODATA PROTECTION INTRODUCTION Statistical disclosure control part of preparations for disseminating microdata. Data perturbation techniques: Methods assuring

### High Breakdown Analogs of the Trimmed Mean

High Breakdown Analogs of the Trimmed Mean David J. Olive Southern Illinois University April 11, 2004 Abstract Two high breakdown estimators that are asymptotically equivalent to a sequence of trimmed

### The Block Criterion for Multiscale Inference About a Density, With Applications to Other Multiscale Problems

Please see the online version of this article for supplementary materials. The Block Criterion for Multiscale Inference About a Density, With Applications to Other Multiscale Problems Kaspar RUFIBACH and

### Composite Hypotheses and Generalized Likelihood Ratio Tests

Composite Hypotheses and Generalized Likelihood Ratio Tests Rebecca Willett, 06 In many real world problems, it is difficult to precisely specify probability distributions. Our models for data may involve

### AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY

Econometrics Working Paper EWP0401 ISSN 1485-6441 Department of Economics AN EMPIRICAL LIKELIHOOD RATIO TEST FOR NORMALITY Lauren Bin Dong & David E. A. Giles Department of Economics, University of Victoria