Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk

Similar documents
arxiv: v4 [stat.me] 20 Oct 2017

On Using Truncated Sequential Probability Ratio Test Boundaries for Monte Carlo Implementation of Hypothesis Tests

Optimal Generalized Sequential Monte Carlo Test

Resampling and the Bootstrap

Political Science 236 Hypothesis Testing: Review and Bootstrapping

Performance Evaluation and Comparison

Multiple Sample Categorical Data

(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1.

MA 575 Linear Models: Cedric E. Ginestet, Boston University Non-parametric Inference, Polynomial Regression Week 9, Lecture 2

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods

Monte Carlo test under general conditions: Power and number of simulations

Robust Backtesting Tests for Value-at-Risk Models

Random Numbers and Simulation

Likelihood-based inference with missing data under missing-at-random

Bootstrap Testing in Econometrics

Multiple Testing of One-Sided Hypotheses: Combining Bonferroni and the Bootstrap

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

11. Bootstrap Methods

Experimental Design and Data Analysis for Biologists

Bootstrap and Parametric Inference: Successes and Challenges

Bayesian Econometrics

POLI 443 Applied Political Research

Preliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference

April 20th, Advanced Topics in Machine Learning California Institute of Technology. Markov Chain Monte Carlo for Machine Learning

UNIVERSITÄT POTSDAM Institut für Mathematik

Appendix of the paper: Are interest rate options important for the assessment of interest rate risk?

Preliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference

Resampling and the Bootstrap

Hypothesis Testing (May 30, 2016)

Bootstrap, Jackknife and other resampling methods

The chopthin algorithm for resampling

STAT 730 Chapter 5: Hypothesis Testing

Statistical Inference

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process

Statistical Inference

Analytical Bootstrap Methods for Censored Data

The bootstrap. Patrick Breheny. December 6. The empirical distribution function The bootstrap

High-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018

Optimal exact tests for complex alternative hypotheses on cross tabulated data

Sequential Detection. Changes: an overview. George V. Moustakides

Better Bootstrap Confidence Intervals

Formal Statement of Simple Linear Regression Model

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

A fast sampler for data simulation from spatial, and other, Markov random fields

CER Prediction Uncertainty

LARGE NUMBERS OF EXPLANATORY VARIABLES. H.S. Battey. WHAO-PSI, St Louis, 9 September 2018

Hypothesis testing: theory and methods

Analysis of Type-II Progressively Hybrid Censored Data

ECE 5424: Introduction to Machine Learning

Lifetime prediction and confidence bounds in accelerated degradation testing for lognormal response distributions with an Arrhenius rate relationship

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided

Advanced Statistical Methods: Beyond Linear Regression

Empirical likelihood ratio with arbitrarily censored/truncated data by EM algorithm

Lecture 10: Generalized likelihood ratio test

Case study: stochastic simulation via Rademacher bootstrap

A better way to bootstrap pairs

Lecture Particle Filters. Magnus Wiktorsson

Package CCP. R topics documented: December 17, Type Package. Title Significance Tests for Canonical Correlation Analysis (CCA) Version 0.

Probability and Statistics Notes

Maximum Non-extensive Entropy Block Bootstrap

Question. Hypothesis testing. Example. Answer: hypothesis. Test: true or not? Question. Average is not the mean! μ average. Random deviation or not?

Post-exam 2 practice questions 18.05, Spring 2014

Robustness and Distribution Assumptions

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,

Chapter 7. Hypothesis Testing

Package dhsic. R topics documented: July 27, 2017

Pubh 8482: Sequential Analysis

Machine Learning, Midterm Exam: Spring 2008 SOLUTIONS. Q Topic Max. Score Score. 1 Short answer questions 20.

A comparison study of the nonparametric tests based on the empirical distributions

Distributed Estimation, Information Loss and Exponential Families. Qiang Liu Department of Computer Science Dartmouth College

Direct Simulation Methods #2

Decentralized Sequential Hypothesis Testing. Change Detection

Finite Population Correction Methods

Blind Equalization via Particle Filtering

Lecture 21. December 19, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.

Dynamic Call Center Routing Policies Using Call Waiting and Agent Idle Times Online Supplement

CUSUM(t) D data. Supplementary Figure 1: Examples of changes in linear trend and CUSUM. (a) The existence of

An Introduction to Statistical Machine Learning - Theoretical Aspects -

Empirical likelihood-based methods for the difference of two trimmed means

Lecture 1: Random number generation, permutation test, and the bootstrap. August 25, 2016

A Resampling Method on Pivotal Estimating Functions

A Nonparametric Sequential Test for Online Randomized Experiments

STAT 461/561- Assignments, Year 2015

ECE531 Lecture 13: Sequential Detection of Discrete-Time Signals

Bias-Variance Tradeoff

Robert Collins CSE586, PSU Intro to Sampling Methods

Hybrid Censoring; An Introduction 2

Bootstrap Tests: How Many Bootstraps?

Testing Statistical Hypotheses

The Goodness-of-fit Test for Gumbel Distribution: A Comparative Study

Hypothesis Testing with the Bootstrap. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods

Computational statistics

Model Selection, Estimation, and Bootstrap Smoothing. Bradley Efron Stanford University

Some General Types of Tests

Lecture Particle Filters

Midterm exam CS 189/289, Fall 2015

Monte Carlo Methods. Leon Gu CSD, CMU

Large sample distribution for fully functional periodicity tests

6. Fractional Imputation in Survey Sampling

Comparing non-nested models in the search for new physics

Transcription:

Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk Axel Gandy Department of Mathematics Imperial College London a.gandy@imperial.ac.uk user! 2009, Rennes July 8-10, 2009

Introduction Test statistic T, reject for large values. Observation: t. p-value: p = P(T t) Often not available in closed form. Monte Carlo Test: ˆp naive = 1 n where T, T 1,... T n i.i.d. Examples: Bootstrap, Permutation tests. Goal: Estimate p using few X i n I(T i t), i=1 Mainly interested in deciding if p α for some α. Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 2

Introduction Test statistic T, reject for large values. Observation: t. p-value: p = P(T t) Often not available in closed form. Monte Carlo Test: ˆp naive = 1 n where T, T 1,... T n i.i.d. Examples: Bootstrap, Permutation tests. Goal: Estimate p using few X i n I(T i t), }{{} =:X i B(1,p) i=1 Mainly interested in deciding if p α for some α. Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 2

Introduction Test statistic T, reject for large values. Observation: t. p-value: p = P(T t) Often not available in closed form. Monte Carlo Test: ˆp naive = 1 n where T, T 1,... T n i.i.d. Examples: Bootstrap, Permutation tests. Goal: Estimate p using few X i n I(T i t), }{{} =:X i B(1,p) i=1 Mainly interested in deciding if p α for some α. Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 2

Introduction Test statistic T, reject for large values. Observation: t. p-value: p = P(T t) Often not available in closed form. Monte Carlo Test: ˆp naive = 1 n where T, T 1,... T n i.i.d. Examples: Bootstrap, Permutation tests. Goal: Estimate p using few X i n I(T i t), }{{} =:X i B(1,p) i=1 Mainly interested in deciding if p α for some α. Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 2

Sequential approaches based on S n = n i=1 X i 10 9 8 7 6 5 4 3 2 1 0 S n Stop once S n U n or S n L n τ: hitting time Compute ˆp based on S τ and τ. Hit B U : decide p > α, Hit B L : decide p α, 0 1 2 3 4 5 6 7 8 9 10 n Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 3

Sequential approaches based on S n = n i=1 X i 10 9 8 7 6 5 4 3 2 1 0 S n Stop once S n U n or S n L n τ: hitting time Compute ˆp based on S τ and τ. Hit B U : decide p > α, Hit B L : decide p α, 0 1 2 3 4 5 6 7 8 9 10 n Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 3

Sequential approaches based on S n = n i=1 X i 10 9 8 7 6 5 4 3 2 1 0 S n Stop once S n U n or S n L n τ: hitting time Compute ˆp based on S τ and τ. Hit B U : decide p > α, Hit B L : decide p α, 0 1 2 3 4 5 6 7 8 9 10 n Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 3

Previous Approaches Besag & Clifford (1991): h S n 0 n 0 m (Truncated) Sequential Probability Ratio Test, Fay et al. (2007) h S n 0 0 m R-package MChtest. n Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 4

What do we really want? Is p α? Two individuals using the same statistical method on the same data should arrive at the same conclusion. First law of applied statistics, Gleser (1996) Consider the resampling risk Want: RR p (ˆp) { P p (ˆp > α) if p α, P p (ˆp α) if p > α. sup RR p (ˆp) ɛ p [0,1] for some (small) ɛ > 0. For Besag & Clifford (1991), SPRT: sup p RR P 0.5 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 5

What do we really want? Is p α? Two individuals using the same statistical method on the same data should arrive at the same conclusion. First law of applied statistics, Gleser (1996) Consider the resampling risk Want: RR p (ˆp) { P p (ˆp > α) if p α, P p (ˆp α) if p > α. sup RR p (ˆp) ɛ p [0,1] for some (small) ɛ > 0. For Besag & Clifford (1991), SPRT: sup p RR P 0.5 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 5

What do we really want? Is p α? Two individuals using the same statistical method on the same data should arrive at the same conclusion. First law of applied statistics, Gleser (1996) Consider the resampling risk Want: RR p (ˆp) { P p (ˆp > α) if p α, P p (ˆp α) if p > α. sup RR p (ˆp) ɛ p [0,1] for some (small) ɛ > 0. For Besag & Clifford (1991), SPRT: sup p RR P 0.5 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 5

What do we really want? Is p α? Two individuals using the same statistical method on the same data should arrive at the same conclusion. First law of applied statistics, Gleser (1996) Consider the resampling risk Want: RR p (ˆp) { P p (ˆp > α) if p α, P p (ˆp α) if p > α. sup RR p (ˆp) ɛ p [0,1] for some (small) ɛ > 0. For Besag & Clifford (1991), SPRT: sup p RR P 0.5 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 5

What do we really want? Is p α? Two individuals using the same statistical method on the same data should arrive at the same conclusion. First law of applied statistics, Gleser (1996) Consider the resampling risk Want: RR p (ˆp) { P p (ˆp > α) if p α, P p (ˆp α) if p > α. sup RR p (ˆp) ɛ p [0,1] for some (small) ɛ > 0. For Besag & Clifford (1991), SPRT: sup p RR P 0.5 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 5

What do we really want? Is p α? Two individuals using the same statistical method on the same data should arrive at the same conclusion. First law of applied statistics, Gleser (1996) Consider the resampling risk Want: RR p (ˆp) { P p (ˆp > α) if p α, P p (ˆp α) if p > α. sup RR p (ˆp) ɛ p [0,1] for some (small) ɛ > 0. For Besag & Clifford (1991), SPRT: sup p RR P 0.5 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 5

Recursive Definition of the Boundaries Want: sup RR p (ˆp) ɛ p Suffices to ensure P α (hit B U ) ɛ P α (hit B L ) ɛ Recursive definition: Given U 1,..., U n 1 and L 1,..., L n 1, define U n as the minimal value such that P α (hit B U until n) ɛ n and L n as the maximal value such that P α (hit B L until n) ɛ n where ɛ n 0 with ɛ n ɛ (spending sequence). Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 6

Recursive Definition of the Boundaries Want: sup RR p (ˆp) ɛ p Suffices to ensure P α (hit B U ) ɛ P α (hit B L ) ɛ Recursive definition: Given U 1,..., U n 1 and L 1,..., L n 1, define U n as the minimal value such that P α (hit B U until n) ɛ n and L n as the maximal value such that P α (hit B L until n) ɛ n where ɛ n 0 with ɛ n ɛ (spending sequence). Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 6

Recursive Definition of the Boundaries Want: sup RR p (ˆp) ɛ p Suffices to ensure P α (hit B U ) ɛ P α (hit B L ) ɛ Recursive definition: Given U 1,..., U n 1 and L 1,..., L n 1, define U n as the minimal value such that P α (hit B U until n) ɛ n and L n as the maximal value such that P α (hit B L until n) ɛ n where ɛ n 0 with ɛ n ɛ (spending sequence). Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 6

Recursive Definition of the Boundaries Want: sup RR p (ˆp) ɛ p Suffices to ensure P α (hit B U ) ɛ P α (hit B L ) ɛ Recursive definition: Given U 1,..., U n 1 and L 1,..., L n 1, define U n as the minimal value such that P α (hit B U until n) ɛ n and L n as the maximal value such that P α (hit B L until n) ɛ n where ɛ n 0 with ɛ n ɛ (spending sequence). Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 6

Recursive Definition - Example α = 0.2, ɛ n = 0.4 n 5+n. U n =the minimal value such that L n = maximal value such that P α (hit B U until n) ɛ n P α (hit B L until n) ɛ n P α (S n =k, τ n) 0 k= 3 k= 2 k= 1 k= 0 1 ɛ n 0 U n 1 L n -1 n = Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 7

Recursive Definition - Example α = 0.2, ɛ n = 0.4 n 5+n. U n =the minimal value such that L n = maximal value such that P α (hit B U until n) ɛ n P α (hit B L until n) ɛ n P α (S n =k, τ n) 0 1 k= 3 k= 2 k= 1.2 k= 0 1.8 ɛ n 0.07 U n 1 2 L n -1-1 n = Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 7

Recursive Definition - Example α = 0.2, ɛ n = 0.4 n 5+n. U n =the minimal value such that L n = maximal value such that P α (hit B U until n) ɛ n P α (hit B L until n) ɛ n P α (S n =k, τ n) 0 1 2 k= 3 k= 2.04 k= 1.2.32 k= 0 1.8.64 ɛ n 0.07.11 U n 1 2 2 L n -1-1 -1 n = Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 7

Recursive Definition - Example α = 0.2, ɛ n = 0.4 n 5+n. U n =the minimal value such that L n = maximal value such that P α (hit B U until n) ɛ n P α (hit B L until n) ɛ n P α (S n =k, τ n) 0 1 2 3 k= 3 k= 2.04.06 k= 1.2.32.38 k= 0 1.8.64.51 ɛ n 0.07.11.15 U n 1 2 2 2 L n -1-1 -1-1 n = Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 7

Recursive Definition - Example α = 0.2, ɛ n = 0.4 n 5+n. U n =the minimal value such that L n = maximal value such that P α (hit B U until n) ɛ n P α (hit B L until n) ɛ n n = P α (S n =k, τ n) 0 1 2 3 4 k= 3 k= 2.04.06.08 k= 1.2.32.38.41 k= 0 1.8.64.51.41 ɛ n 0.07.11.15.18 U n 1 2 2 2 3 L n -1-1 -1-1 -1 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 7

Recursive Definition - Example α = 0.2, ɛ n = 0.4 n 5+n. U n =the minimal value such that L n = maximal value such that P α (hit B U until n) ɛ n P α (hit B L until n) ɛ n n = P α (S n =k, τ n) 0 1 2 3 4 5 k= 3.02 k= 2.04.06.08.14 k= 1.2.32.38.41.41 k= 0 1.8.64.51.41.33 ɛ n 0.07.11.15.18.20 U n 1 2 2 2 3 3 L n -1-1 -1-1 -1-1 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 7

Recursive Definition - Example α = 0.2, ɛ n = 0.4 n 5+n. U n =the minimal value such that L n = maximal value such that P α (hit B U until n) ɛ n P α (hit B L until n) ɛ n n = P α (S n =k, τ n) 0 1 2 3 4 5 6 k= 3.02.03 k= 2.04.06.08.14.20 k= 1.2.32.38.41.41.39 k= 0 1.8.64.51.41.33.26 ɛ n 0.07.11.15.18.20.22 U n 1 2 2 2 3 3 3 L n -1-1 -1-1 -1-1 -1 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 7

Recursive Definition - Example α = 0.2, ɛ n = 0.4 n 5+n. U n =the minimal value such that L n = maximal value such that P α (hit B U until n) ɛ n P α (hit B L until n) ɛ n n = P α (S n =k, τ n) 0 1 2 3 4 5 6 7 k= 3.02.03.04 k= 2.04.06.08.14.20.24 k= 1.2.32.38.41.41.39.37 k= 0 1.8.64.51.41.33.26.21 ɛ n 0.07.11.15.18.20.22.23 U n 1 2 2 2 3 3 3 3 L n -1-1 -1-1 -1-1 -1 0 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 7

Recursive Definition - Example α = 0.2, ɛ n = 0.4 n 5+n. U n =the minimal value such that L n = maximal value such that P α (hit B U until n) ɛ n P α (hit B L until n) ɛ n n = P α (S n =k, τ n) 0 1 2 3 4 5 6 7 8 k= 3.02.03.04.05 k= 2.04.06.08.14.20.24.26 k= 1.2.32.38.41.41.39.37.29 k= 0 1.8.64.51.41.33.26.21 ɛ n 0.07.11.15.18.20.22.23.25 U n 1 2 2 2 3 3 3 3 3 L n -1-1 -1-1 -1-1 -1 0 0 Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 7

Sequential Decision Procedure - Example α = 0.2, ɛ n = 0.4 n 5+n. 20 10 0 0 10 20 30 40 50 60 70 80 90 100 n Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 8

Influence of ɛ on the stopping rule ɛ = 0.1, 0.001, 10 5, 10 7 ; ɛ n = ɛ n 1000+n 0 50 100 150 200 250 300 350 0 1000 2000 3000 4000 5000 n Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 9

Sequential Estimation based on the MLE One can show: S τ ˆp = τ, τ < α, τ =, hitting the upper boundary implies ˆp > α, hitting the lower boundary implies ˆp < α. Hence, sup RR p (ˆp) ɛ p Furthermore, random interval I n s.t. In only depends on X 1,..., X n, ˆp In. Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 10

Example - Two-way sparse contingency table 1 2 2 1 1 0 1 2 0 0 2 3 0 0 0 1 1 1 2 7 3 1 1 2 0 0 0 1 0 1 1 1 1 0 0 H 0 : variables are independent. Reject for large values of the likelihood ratio test statistic T T d χ 2 (7 1)(5 1) under H 0. Based on this: p = 0.031. Matrix sparse - approximation poor? Use parametric bootstrap based on row and column sums. Naive test statistic ˆp naive with n = 1,000 replicates: p = 0.041 < 0.05. Probability of reporting p > 0.05: roughly 0.08. Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 11

Example - Two-way sparse contingency table 1 2 2 1 1 0 1 2 0 0 2 3 0 0 0 1 1 1 2 7 3 1 1 2 0 0 0 1 0 1 1 1 1 0 0 H 0 : variables are independent. Reject for large values of the likelihood ratio test statistic T T d χ 2 (7 1)(5 1) under H 0. Based on this: p = 0.031. Matrix sparse - approximation poor? Use parametric bootstrap based on row and column sums. Naive test statistic ˆp naive with n = 1,000 replicates: p = 0.041 < 0.05. Probability of reporting p > 0.05: roughly 0.08. Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 11

Example - Two-way sparse contingency table 1 2 2 1 1 0 1 2 0 0 2 3 0 0 0 1 1 1 2 7 3 1 1 2 0 0 0 1 0 1 1 1 1 0 0 H 0 : variables are independent. Reject for large values of the likelihood ratio test statistic T T d χ 2 (7 1)(5 1) under H 0. Based on this: p = 0.031. Matrix sparse - approximation poor? Use parametric bootstrap based on row and column sums. Naive test statistic ˆp naive with n = 1,000 replicates: p = 0.041 < 0.05. Probability of reporting p > 0.05: roughly 0.08. Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 11

Example - Bootstrap and Sequential Algorithm > dat <- matrix(c(1,2,2,1,1,0,1, 2,0,0,2,3,0,0, 0,1,1,1,2,7,3, 1,1,2,0,0,0,1, + 0,1,1,1,1,0,0), nrow=5,ncol=7,byrow=true) > loglikrat <- function(data){ + cs <- colsums(data);rs <- rowsums(data); mu <- outer(rs,cs)/sum(rs) + 2*sum(ifelse(data<=0.5, 0,data*log(data/mu))) + } > resample <- function(data){ + cs <- colsums(data);rs <- rowsums(data); n <- sum(rs) + mu <- outer(rs,cs)/n/n + matrix(rmultinom(1,n,c(mu)),nrow=dim(data)[1],ncol=dim(data)[2]) + } > t <- loglikrat(dat); > library(simctest) > res <- simctest(function(){loglikrat(resample(dat))>=t},maxsteps=1000) > res No decision reached. Final estimate will be in [ 0.02859135, 0.07965451 ] Current estimate of the p.value: 0.041 Number of samples: 1000 > cont(res, steps=10000) p.value: 0.04035456 Number of samples: 8574

Further Uses of the Algorithm Simulation study to evaluate whether a test is liberal/conservative. Determining the sample size to achieve a certain power. Iterated Use: Determining the power of a bootstrap test. Simulation study to evaluate whether a bootstrap test is liberal/conservative. Double bootstrap test. Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 13

Expected Hitting Time Result: E p (τ) < p α Example with α = 0.05, ɛ n = ɛ n 1000+n : E p (τ) 0 400 800 ε = 0.001 ε = 1e 05 ε = 1e 07 E p (τ) µ p 1.0 1.2 1.4 0.0 0.2 0.4 0.6 0.8 1.0 p µ p = theoretical lower bound on E p (τ). Note: 1 0 µ pdp = ; for iterated use: Need to limit the number of steps.

Expected Hitting Time Result: E p (τ) < p α Example with α = 0.05, ɛ n = ɛ n 1000+n : E p (τ) 0 400 800 ε = 0.001 ε = 1e 05 ε = 1e 07 E p (τ) µ p 1.0 1.2 1.4 0.0 0.2 0.4 0.6 0.8 1.0 p µ p = theoretical lower bound on E p (τ). Note: 1 0 µ pdp = ; for iterated use: Need to limit the number of steps.

Summary Sequential implementation of Monte Carlo Tests and computation of p-values. Useful when implementing tests in packages. After a finite number of steps: ˆp or interval [ˆp L n, ˆp U n ] in which ˆp will lie. Guarantee (up to a very small error probability): ˆp is on the correct side of α. R-package simctest available on CRAN. (efficient implementation with C-code) For details see Gandy (2009). Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 15

References Besag, J. & Clifford, P. (1991). Sequential Monte Carlo p-values. Biometrika 78, 301 304. Davison, A. & Hinkley, D. (1997). Bootstrap methods and their application. Cambridge University Press. Fay, M. P., Kim, H.-J. & Hachey, M. (2007). On using truncated sequential probability ratio test boundaries for Monte Carlo implementation of hypothesis tests. Journal of Computational & Graphical Statistics 16, 946 967. Gandy, A. (2009). Sequential implementation of Monte Carlo tests with uniformly bounded resampling risk. Accepted for publication in JASA. Gleser, L. J. (1996). Comment on Bootstrap Confidence Intervals by T. J. DiCiccio and B. Efron. Statistical Science 11, 219 221. Axel Gandy Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk 16