A Brief Introduction to Intersection-Union Tests. Jimmy Akira Doi. North Carolina State University Department of Statistics


Jimmy Akira Doi (jadoi@unity.ncsu.edu), North Carolina State University, Department of Statistics

Introduction

Often, the quality of a product is determined by several parameters. The product is deemed acceptable if each of the parameters meets certain standards.

Acceptance Sampling: Upholstery Fabric Example

ASTM (American Society for Testing and Materials) standards:
- Mean breaking strength (in pounds)
- Probability of passing the flammability test

The fabric is deemed acceptable when each of these parameters satisfies certain criteria.

Hypothesis Testing Framework

- Formulate the null hypothesis $H_0$, which states that one or more of the parameters do not meet their criteria.
- Formulate the alternative hypothesis $H_A$, which states that all of the parameters do meet their criteria.

Under this framework, rejection of the null hypothesis $H_0$ corresponds to the decision that the product is acceptable. Also, the probability of a Type I error is the consumer's risk.

Some Concerns

- Multiple testing procedure: is there a need for a multiplicity adjustment?
- Need to control the consumer's risk.

Outline

- Review of the problem of multiple comparisons
- Definition of the intersection-union test
- 2 relevant theorems
- Example with simulation results
- Bioequivalence introduction

Multiple Comparisons

Example: experiment with 5 treatments
- 10 possible pairwise comparisons or contrasts (0.05-level t-tests: $t_1, t_2, \ldots, t_{10}$)
- Combine all tests into $\phi_t$, where $\phi_t$ rejects if any of the $t_i$ rejects.
- Familywise error rate $\alpha > 0.05$:

    n        3     5     10    20    45
    alpha    14%   23%   40%   64%   90%

- Need multiplicity adjustment.

"According to one survey, multiple comparison methods constitute the second most frequently applied group of statistical methods, second only to the F-test... If they rank second in frequency of use, they rank perhaps first in frequency of abuse."
Jason Hsu, Multiple Comparisons: Theory and Methods

Intersection-Union Tests (IUT)

Given $X \sim f(x \mid \theta)$, suppose $H_0$ is expressed as a union of $k$ sets (the index set need not be finite):

$$H_0: \theta \in \Theta_0 = \bigcup_{i=1}^{k} \Theta_i \quad \text{vs.} \quad H_A: \theta \in \Theta_0^c = \bigcap_{i=1}^{k} \Theta_i^c \qquad (1)$$

Suppose that, for each $i$, $R_i$ is the rejection region for a test of $H_{0i}: \theta \in \Theta_i$ vs. $H_{Ai}: \theta \in \Theta_i^c$.

The rejection region for the IUT of $H_0$ vs. $H_A$ is $\bigcap_{i=1}^{k} R_i$. In other words, the IUT rejects only if all of the individual tests reject.

Do we need a multiplicity adjustment?
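The percentages in the familywise error rate table above are consistent with the formula $1 - (1 - 0.05)^n$ for $n$ independent 0.05-level tests. The slides do not state this formula, so the independence assumption and the function names below are illustrative only. The sketch also contrasts the "reject if any test rejects" rule of $\phi_t$ with the "reject only if all tests reject" rule of the IUT.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false rejection among n_tests tests.

    Assumption (not stated on the slides): the tests are independent
    and every individual null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def any_test_rejects(individual_rejections):
    """Rule used by phi_t: reject if ANY individual test rejects."""
    return any(individual_rejections)


def iut_rejects(individual_rejections):
    """Intersection-union rule: reject only if ALL individual tests reject."""
    return all(individual_rejections)


if __name__ == "__main__":
    for n in (3, 5, 10, 20, 45):
        print(n, f"{familywise_error_rate(n):.0%}")
    decisions = [True, True, False]
    print(any_test_rejects(decisions), iut_rejects(decisions))
```

Because the IUT needs every individual test to reject, adding more tests can only shrink its rejection region, which is the intuition behind the level-$\alpha$ result that follows.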

Theorem 1

The following theorems are from: Berger, RL (1997), Likelihood Ratio Tests and Intersection-Union Tests.

Theorem 1: If $R_i$ is a level-$\alpha$ test of $H_{0i}$, for $i = 1, \ldots, k$, then the IUT with rejection region $R = \bigcap_{i=1}^{k} R_i$ is a level-$\alpha$ test of $H_0$ versus $H_A$ in (1). (No adjustment necessary!)

Usefulness of the IUT: we can rely upon the simpler tests of the individual hypotheses and not worry about the joint multivariate distribution.

Although the IUT above is level-$\alpha$, the test can be very conservative. Can the IUT also be size-$\alpha$?

Proof of Theorem 1

Let $\theta \in \Theta_0 = \bigcup_{i=1}^{k} \Theta_i$. Then, for some $i = 1, \ldots, k$ (say $i = i^*$), $\theta \in \Theta_{i^*}$. Thus,

$$P_\theta\Big(\bigcap_{i=1}^{k} R_i\Big) \le P_\theta(R_{i^*}) \le \alpha.$$

Since $\theta \in \Theta_0$ was arbitrarily chosen, the IUT is level-$\alpha$, as

$$\sup_{\theta \in \Theta_0} P_\theta\Big(\bigcap_{i=1}^{k} R_i\Big) \le \alpha.$$

(No adjustment necessary.)

Theorem 2

Theorem 2: For some $i = 1, \ldots, k$, suppose $R_i$ is a size-$\alpha$ rejection region for testing $H_{0i}$ vs. $H_{Ai}$. For every $j = 1, \ldots, k$, $j \ne i$, suppose $R_j$ is a level-$\alpha$ rejection region for testing $H_{0j}$ vs. $H_{Aj}$. Suppose there exists a sequence of parameter points $\theta_l$, $l = 1, 2, \ldots$, in $\Theta_i$ such that

$$\lim_{l \to \infty} P_{\theta_l}(R_i) = \alpha,$$

and, for every $j = 1, \ldots, k$, $j \ne i$,

$$\lim_{l \to \infty} P_{\theta_l}(R_j) = 1.$$

Then the IUT with rejection region $R = \bigcap_{i=1}^{k} R_i$ is a size-$\alpha$ test of $H_0$ vs. $H_A$.

Proof of Theorem 2

By the Bonferroni inequality,

$$\lim_{l \to \infty} P_{\theta_l}\Big(\bigcap_{m=1}^{k} R_m\Big) \ge \lim_{l \to \infty} \sum_{m=1}^{k} P_{\theta_l}(R_m) - (k - 1) = \alpha + (k - 1) - (k - 1) = \alpha.$$

Combined with the upper bound from Theorem 1, the supremum of the rejection probability over $\Theta_0$ equals $\alpha$, so the IUT has size $\alpha$.
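The two bounds used in the proofs can be checked numerically. The sketch below computes the upper bound from Theorem 1 (the smallest individual rejection probability) and the Bonferroni lower bound from the proof of Theorem 2. As an extra illustration it also evaluates the joint rejection probability under an independence assumption that the theorems themselves do not require. The function names are hypothetical.

```python
import math


def iut_probability_bounds(individual_probs):
    """Bounds on P(all R_i occur) given each individual P(R_i).

    upper: P(intersection) <= min_i P(R_i)            (proof of Theorem 1)
    lower: P(intersection) >= sum_i P(R_i) - (k - 1)  (Bonferroni inequality)
    """
    k = len(individual_probs)
    lower = max(0.0, sum(individual_probs) - (k - 1))
    upper = min(individual_probs)
    return lower, upper


def iut_probability_if_independent(individual_probs):
    """Joint rejection probability, ASSUMING independent tests (illustration only)."""
    return math.prod(individual_probs)


if __name__ == "__main__":
    alpha = 0.05
    # One test sits on its null boundary (probability alpha) while the
    # other test's rejection probability approaches 1, as in Theorem 2.
    for p_other in (0.5, 0.9, 0.99, 0.999):
        probs = [alpha, p_other]
        lower, upper = iut_probability_bounds(probs)
        joint = iut_probability_if_independent(probs)
        print(f"p_other={p_other}: {lower:.4f} <= {joint:.4f} <= {upper:.4f}")
```

As the second probability approaches 1, the lower and upper bounds squeeze together at $\alpha$, which is the mechanism by which the IUT attains size $\alpha$.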

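Acceptance Sampling: Upholstery Fabric Example

- $\theta_1$: mean breaking strength (in pounds)
- $\theta_2$: probability of passing the flammability test

The fabric is acceptable when it meets the standards $\theta_1 > 50$ and $\theta_2 > 0.95$.

$H_0$: $\{\theta_1 \le 50 \text{ or } \theta_2 \le 0.95\}$
$H_A$: $\{\theta_1 > 50 \text{ and } \theta_2 > 0.95\}$

- $X_1, \ldots, X_n$ iid $N(\theta_1, \sigma^2)$; use the LRT of $H_{01}: \theta_1 \le 50$.
- $Y_1, \ldots, Y_m$ iid Bern$(\theta_2)$; use the LRT of $H_{02}: \theta_2 \le 0.95$.

The IUT rejection region is

$$\Big\{ (x, y) : \frac{\bar{x} - 50}{s / \sqrt{n}} > t \ \text{ and } \ \sum_{i=1}^{m} y_i > b \Big\}.$$

MC simulation: $n = m = 58$, $t = 1.672$, $b = 57$. The LRTs are approximately size-$\alpha$ (0.05) tests.

Table 1. Monte Carlo Estimates of $\alpha$: Fixed $\theta_2$ and Increasing $\theta_1$

                    theta_1 = 50   theta_1 = 60   theta_1 = 75   theta_1 = 100
    theta_2 = .95       .002           .041           .049           .051

Table 2. Monte Carlo Estimates of $\alpha$: Fixed $\theta_1$ and Increasing $\theta_2$

                    theta_2 = .95  theta_2 = .99  theta_2 = .999  theta_2 = .9999
    theta_1 = 50        .002           .015           .049            .049

Power Analysis

[Figure: the alternative region $\Theta_A$ in the $(\theta_1, \theta_2)$ plane; $\theta_1$ axis marked at 50, 60, 65, 70 and $\theta_2$ axis marked at .95, .975, .990, .999, 1.]

Table 3. MC Estimates of Power

                    theta_2 = .975  theta_2 = .99  theta_2 = .999
    theta_1 = 60        .185            .553           .757
    theta_1 = 65        .226            .544           .927
    theta_1 = 70        .230            .553           .944

Uses of the IUT

- Acceptance sampling
- Comparisons of regression lines
- Testing for contingency tables
- Bioequivalence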
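A Monte Carlo study of this kind can be sketched as follows. The sketch uses the slide's settings ($n = m = 58$, $t = 1.672$, $b = 57$) and the rejection region given above. The slides do not report the value of $\sigma$ or the number of replications, so the `sigma`, `reps`, and `seed` arguments are assumptions made purely for illustration, and the output is not expected to reproduce the tabulated estimates exactly.

```python
import math
import random
import statistics


def iut_rejects(x, y, theta1_bound=50.0, t_crit=1.672, b=57):
    """IUT decision: the one-sided t-test AND the Bernoulli count test must both reject."""
    n = len(x)
    t_stat = (statistics.fmean(x) - theta1_bound) / (statistics.stdev(x) / math.sqrt(n))
    return t_stat > t_crit and sum(y) > b


def estimate_rejection_rate(theta1, theta2, sigma, n=58, m=58, reps=4000, seed=0):
    """Monte Carlo estimate of P(IUT rejects) at the point (theta1, theta2).

    sigma, reps and seed are illustrative assumptions; the slides do not
    state the values used in the original simulation.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(theta1, sigma) for _ in range(n)]            # breaking strengths
        y = [1 if rng.random() < theta2 else 0 for _ in range(m)]  # flammability passes
        rejections += iut_rejects(x, y)
    return rejections / reps


if __name__ == "__main__":
    # Points on the boundary theta_2 = 0.95 estimate the Type I error rate.
    for theta1 in (50, 60, 75, 100):
        print(theta1, estimate_rejection_rate(theta1, 0.95, sigma=30.0))
```

With $b = 57$ and $m = 58$, the Bernoulli test rejects only when all 58 samples pass, which happens with probability $\theta_2^{58}$ (about 0.051 at $\theta_2 = 0.95$). That caps the IUT's rejection rate on this boundary no matter how large $\theta_1$ becomes.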

New Drug Development and Approval

Typically, the approval of a new drug product requires a process that is very time consuming and expensive.
- Phase I, II, III clinical trials
- Efficacy, safety, risk (side effects)

In the US, if a biopharmaceutical company successfully introduces a new drug into the market, the process typically requires 15 years and approximately 500 million dollars.

Companies that produce generic drugs can bypass this process by showing the drug regulatory agency that their product is therapeutically similar to an approved reference drug. That is, they want to establish bioequivalence.

(Bio)equivalence: A Definition

"Two different drugs or formulations of the same drug are called bioequivalent if they are absorbed into the blood and become available at the drug action site at about the same rate and concentration."
Berger and Hsu (1996)

In order for the generic drug to be approved in the US, the Food and Drug Administration (FDA, www.fda.gov) requires a test of the hypotheses below.

Hypothesis Testing Framework for Bioequivalence (FDA)

Let
- $\mu_R$ = mean blood concentration of a reference drug
- $\mu_T$ = mean blood concentration of a generic drug

$$H_0: \mu_T - \mu_R \le -\delta \ \text{ or } \ \mu_T - \mu_R \ge \delta$$
$$H_A: -\delta < \mu_T - \mu_R < \delta$$

where $\delta > 0$ is a tolerance limit specified by the drug regulatory agency.

IUT: combine two one-sided, size-$\alpha$ tests to obtain an overall size-$\alpha$ test. This test is often referred to as the TOST (two one-sided tests).

TOST

- Most widely used IUT
- Much wider recognition of IUTs due to this application
- But the TOST suffers from a lack of power
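The TOST decision rule can be sketched as two one-sided t-type tests that must both reject. The slides do not specify the test statistics, so the standardized forms below, along with the names `diff_hat`, `std_err`, and `t_crit`, are assumptions chosen for illustration. The critical value is passed in by the caller rather than computed.

```python
def tost_rejects(diff_hat, std_err, delta, t_crit):
    """Two one-sided tests (TOST) for H_A: -delta < mu_T - mu_R < delta.

    diff_hat : estimate of mu_T - mu_R              (hypothetical input)
    std_err  : standard error of diff_hat           (hypothetical input)
    delta    : tolerance limit, delta > 0
    t_crit   : upper-alpha critical value of the reference distribution,
               supplied by the caller (an assumption of this sketch)
    """
    t_lower = (diff_hat + delta) / std_err  # tests H_01: mu_T - mu_R <= -delta
    t_upper = (diff_hat - delta) / std_err  # tests H_02: mu_T - mu_R >= delta
    # Intersection-union rule: declare bioequivalence only if BOTH reject.
    return t_lower > t_crit and t_upper < -t_crit


if __name__ == "__main__":
    print(tost_rejects(diff_hat=0.0, std_err=1.0, delta=5.0, t_crit=1.7))  # both tests reject
    print(tost_rejects(diff_hat=4.0, std_err=1.0, delta=5.0, t_crit=1.7))  # upper test fails
```

Each one-sided test is run at level $\alpha$ with no adjustment, which is what Theorem 1 permits. The lack of power noted above shows up when `std_err` is large relative to `delta`, since then the two conditions cannot hold simultaneously.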

Summary

- The IUT avoids the need for multiplicity adjustments.
- 2 relevant theorems:
  Theorem 1: the IUT is level-$\alpha$ when all of the individual tests are level-$\alpha$.
  Theorem 2: under certain conditions, the IUT is size-$\alpha$.
- No need to postulate a multivariate model: just combine the simpler individual tests.
- Strongest application found through bioequivalence studies.