Pearson's meta-analysis revisited

Pearson's meta-analysis revisited in a microarray context
Art B. Owen, Department of Statistics, Stanford University

Long story short
1) A microarray analysis needed a meta-analysis that accounts for directionality of effects
2) Pearson (1934) already had the same idea
3) And Birnbaum (1954) showed inadmissibility
4) But Birnbaum misread Pearson
5) The method is admissible and competitive vs Fisher (where we need it)
6) And the proof leads to something new that may be better

Karl Pearson quote
Stigler (2008), recounting Karl Pearson's amazing productivity, includes this from Stouffer (1958):
"You Americans would not understand, but I never answer a telephone or attend a committee meeting."
Pearson was born in 1857.

Two example problems
AGEMAP (Zahn et al., PLOS), work with NIA and Kim lab
Is gene i correlated with age in tissue j of the mouse?
For 8932 genes and 16 tissues
We get a matrix of 8932 × 16 p-values
fMRI (Benjamini & Heller)
Is brain location i activated in task j?
Similar problems

AGEMAP goals
Which genes are age related generically?
They should show an age relationship in multiple tissues
Ideally the sign should be common too
Too much to suppose that the slope is exactly the same
Two tasks
1) Combine 16 p-values into one decision per gene
2) Adjust for having tested 8932 genes
Here
We look at task 1), understanding that it is for screening
For this talk: pretend tests are independent and ignore gene groups

Given a collection of p-values:
Multiple hypothesis testing
We have n null hypotheses $H_{01}, \dots, H_{0n}$
We get n p-values $p_1, \dots, p_n$, with $p_i$ for $H_{0i}$
Decide which to reject, controlling false discoveries
Meta-analysis
We have 1 hypothesis $H_0$
We have m tests and m p-values for $H_0$
Combine $p_1, \dots, p_m$ into one decision
Or combine m underlying test statistics

An age related gene
1) should have a statistically significant regression slope
2) in multiple tissues (not necessarily all)
3) predominantly of one sign
4) not necessarily a common slope
The underlying model
Regress expression for gene i and tissue j on age, adjusting for sex.
$Y_{ijk} = \beta_{0ij} + \beta_{1ij}\,\mathrm{Age}_k + \beta_{2ij}\,\mathrm{Sex}_k + \varepsilon_{ijk}$
There were 40 animals, so 37 degrees of freedom
40 × 16 × 8932 responses (apart from some missing values)

Fisher's test
Refer $-2\log\bigl(\prod_{j=1}^m p_j\bigr)$ to $\chi^2_{(2m)}$
Choose 1-tailed or 2-tailed p-values
K. Pearson's test
Run Fisher vs $\beta_j < 0$
Run again vs $\beta_j > 0$
Use whichever one-tailed test is most extreme
What we get
1) Strong preference for concordant alternatives
2) We don't have to know the direction a priori
3) Still have some power if one test is discordant
Pearson gets better power vs concordant alternatives and less power vs discordant.

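Notation for 1 gene
Parameters: $\beta_1, \dots, \beta_m$
Estimates: $\hat\beta_1, \dots, \hat\beta_m$
Obs. values: $\hat\beta_1^{\mathrm{obs}}, \dots, \hat\beta_m^{\mathrm{obs}}$
Null hypothesis $H_{0,j}$: $\beta_j = 0$
Alternatives and their p-values
$H_{L,j}$: $\beta_j < 0$, p-value $\Pr(\hat\beta_j \le \hat\beta_j^{\mathrm{obs}} \mid \beta_j = 0) \equiv \tilde p_j$
$H_{R,j}$: $\beta_j > 0$, p-value $\Pr(\hat\beta_j \ge \hat\beta_j^{\mathrm{obs}} \mid \beta_j = 0) = 1 - \tilde p_j$
$H_{C,j}$: $\beta_j \ne 0$, p-value $\Pr(|\hat\beta_j| \ge |\hat\beta_j^{\mathrm{obs}}| \mid \beta_j = 0) \equiv p_j = 2\min(\tilde p_j, 1 - \tilde p_j)$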
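
The three p-values in this notation can be sketched in a few lines. This is a minimal illustration, assuming the estimate has been standardized so that it is N(0, 1) under the null; the function name `tail_p_values` is hypothetical and not from the talk.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed null distribution: standard normal

def tail_p_values(z_obs):
    """Return (left, right, central) p-values for one observed z-statistic.

    Assumes the standardized estimate is N(0, 1) under H_0.
    """
    p_left = STD_NORMAL.cdf(z_obs)          # one-tailed p~_j = Pr(Z <= z_obs)
    p_right = 1.0 - p_left                  # Pr(Z >= z_obs) = 1 - p~_j
    p_central = 2.0 * min(p_left, p_right)  # two-tailed p_j
    return p_left, p_right, p_central
```

A negative observed estimate gives a small left p-value and a right p-value near 1, while the central p-value ignores the sign.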

Hypotheses on $\beta = (\beta_1, \dots, \beta_m)$
Null $H_0$: $\beta = 0$
Left orthant $H_L$: $\beta \in (-\infty, 0]^m \setminus \{0\}$
Right orthant $H_R$: $\beta \in [0, \infty)^m \setminus \{0\}$
Any $H_A$: $\beta \ne 0$
For $\Delta > 0$
In screening, we don't know whether to use $H_L$ or $H_R$
We prefer $\beta = \pm(\Delta, \Delta, \dots, \Delta)$ to most $\beta = (\pm\Delta, \pm\Delta, \dots, \pm\Delta)$
But $\beta = (\Delta, \Delta, \dots, \Delta, -\Delta)$ or $(\Delta, \Delta, \dots, \Delta, 0)$ is also interesting
So we use $H_A$ and a test with more power in $H_L$ and $H_R$ than elsewhere

Test statistics
Fisher's test, 3 ways
$Q_L = -2\log\bigl(\prod_{j=1}^m \tilde p_j\bigr)$
$Q_R = -2\log\bigl(\prod_{j=1}^m (1 - \tilde p_j)\bigr)$
$Q_C = -2\log\bigl(\prod_{j=1}^m p_j\bigr)$
Pearson's test
$Q_U \equiv \max(Q_L, Q_R)$
Mnemonic: U for undirected
For $m = 1$, $Q_U$ gives the same test as $Q_C$ (they differ by the constant $2\log 2$), but not for $m > 1$
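
As a rough sketch of these four statistics, the function below takes the one-tailed (left) p-values, written p~_j, and returns Q_L, Q_R, Q_C and Q_U. The name `fisher_statistics` is hypothetical, and the code assumes every p-value lies strictly between 0 and 1.

```python
import math

def fisher_statistics(p_left):
    """Compute (Q_L, Q_R, Q_C, Q_U) from one-tailed left p-values p~_j.

    Assumes 0 < p~_j < 1 for every j.
    """
    q_l = -2.0 * sum(math.log(p) for p in p_left)
    q_r = -2.0 * sum(math.log(1.0 - p) for p in p_left)
    # two-tailed p_j = 2 * min(p~_j, 1 - p~_j)
    q_c = -2.0 * sum(math.log(2.0 * min(p, 1.0 - p)) for p in p_left)
    q_u = max(q_l, q_r)  # Pearson's undirected statistic
    return q_l, q_r, q_c, q_u
```

Flipping the sign of one effect (replacing a p~_j by 1 - p~_j) leaves Q_C unchanged but lowers Q_U, which is the preference for concordant alternatives described in the talk.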

Null distributions
$Q_L, Q_R, Q_C \sim \chi^2_{(2m)}$
Via associated random variables, we find
$\Pr(Q_U > x) = \Pr(Q_L > x) + \Pr(Q_R > x) - \Pr(Q_L > x \ \&\ Q_R > x) \ge 2\Pr(Q_L > x) - \Pr(Q_L > x)^2$
So Bonferroni is quite sharp for small $\alpha$
$\alpha - \alpha^2/4 \le \Pr\bigl(Q_U \ge \chi^{2,1-\alpha/2}_{(2m)}\bigr) \le \alpha$
For $\alpha = .01$, the level is in [0.009975, 0.01]
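
The Bonferroni bracket can be checked numerically. The sketch below uses the closed-form upper tail of a chi-square distribution with an even number of degrees of freedom, so it needs only the standard library; the function names are hypothetical.

```python
import math

def chi2_sf_even(x, m):
    """Pr(chi-square with 2m degrees of freedom > x), closed form for even df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, m):
        term *= half / i          # (x/2)^i / i!
        total += term
    return math.exp(-half) * total

def q_u_p_value_bounds(q_u, m):
    """Lower and upper bounds on Pr(Q_U > q_u) under H_0."""
    tail = chi2_sf_even(q_u, m)       # Pr(Q_L > q_u) = Pr(Q_R > q_u)
    upper = min(1.0, 2.0 * tail)      # Bonferroni bound
    lower = 2.0 * tail - tail ** 2    # bound from the association inequality
    return lower, upper
```

When the single-tail probability is 0.005, the two bounds are 0.009975 and 0.01, matching the interval quoted for level 0.01.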

Stouffer et al (1949) test statistics
Under $H_0$, $Z_j = \Phi^{-1}(\tilde p_j) \sim N(0, 1)$
Reject $H_0$ for large $S$
$S_L = \frac{1}{\sqrt{m}} \sum_{j=1}^m \Phi^{-1}(1 - \tilde p_j)$
$S_R = \frac{1}{\sqrt{m}} \sum_{j=1}^m \Phi^{-1}(\tilde p_j)$
$S_C = \frac{1}{\sqrt{m}} \sum_{j=1}^m |\Phi^{-1}(\tilde p_j)|$
$S_U = \max(S_L, S_R)$
Stouffer test is mostly a straw man
Though $S_U$ advocated by Whitlock (2005)
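
A minimal sketch of the Stouffer-type statistics follows, computed from the one-tailed p-values. The absolute value used in S_C is an assumption about the central version (the transcript lost the symbol), and `stouffer_statistics` is a hypothetical name.

```python
import math
from statistics import NormalDist

def stouffer_statistics(p_left):
    """Compute (S_L, S_R, S_C, S_U) from one-tailed left p-values p~_j.

    Assumes 0 < p~_j < 1. S_C uses |Z_j|, an assumed form of the central version.
    """
    inv = NormalDist().inv_cdf
    root_m = math.sqrt(len(p_left))
    s_l = sum(inv(1.0 - p) for p in p_left) / root_m   # large for left-tail evidence
    s_r = sum(inv(p) for p in p_left) / root_m          # large for right-tail evidence
    s_c = sum(abs(inv(p)) for p in p_left) / root_m
    return s_l, s_r, s_c, max(s_l, s_r)
```

Because each Z_j enters the sum linearly, a single p~_j near the wrong tail can cancel several small ones, which is the veto behavior the talk attributes to Stouffer.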

Meta-analysis refresher
Key ref: Hedges and Olkin (1985)
We have 1 hypothesis $H_0$
p-values $p_1, \dots, p_m$ indep $U(0, 1)$ under $H_0$
There is no unique best way to combine them (Birnbaum 1954)
Condition 1
If $H_0$ is rejected for any given $(p_1, \dots, p_m)$ then it will also be rejected for all $(p_1^*, \dots, p_m^*)$ such that $p_j^* \le p_j$ for $j = 1, \dots, m$.
Birnbaum shows that any combination method which satisfies Condition 1 is admissible.

Meta-analysis geometry
[Figure: four panels, $\min(p_1, p_2)$, $\max(p_1, p_2)$, Fisher, Stouffer]
x axis is $p_1$, y axis is $p_2$
Blue for $\alpha = 0.1$ rejection region
They all satisfy Condition 1
min is due to Tippett 1931
max is due to Wilkinson 1951

Geometry again
[Figure: two rows of four panels, $\min(p_1, p_2)$, $\max(p_1, p_2)$, Fisher, Stouffer]
Top row coords $(p_1, p_2)$, bottom row coords $(\tilde p_1, \tilde p_2)$

Undirected tests
[Figure: two panels, Fisher $Q_U$ and Stouffer $S_U$]
Rejection regions in one-tailed $(\tilde p_1, \tilde p_2)$ coords
Thicker rejection region for coordinated alternatives
Stouffer allows one $\tilde p_j$ to veto the others

A more stringent admissibility
Tippett and Wilkinson are optimal at some alternatives, hence admissible
Some alternatives are far fetched
Birnbaum Condition 2: for $\hat\beta_j$ in exponential families, admissibility corresponds to a convex acceptance region for $(\hat\beta_1, \dots, \hat\beta_m)$
In a world of Gaussian data
$\hat\beta_j \sim N(\beta_j, \sigma^2/n_j)$
$\tilde p_j = \Phi(\sqrt{n_j}\,\hat\beta_j/\sigma)$
$\hat\beta_j = \Phi^{-1}(\tilde p_j)\,\sigma/\sqrt{n_j}$
Regions in $\tilde p_j$ correspond to regions in $\hat\beta_j$

Birnbaum's result
$Q_B = -2\log\bigl(\prod_{j=1}^m (1 - p_j)\bigr) \sim \chi^2_{(2m)}$
Reject for small $Q_B$
Get nonconvex acceptance regions
Hence inadmissible test
Quite right, but not Pearson's proposal
What went wrong
Birnbaum (1954) misread Egon Pearson (1938) describing Karl Pearson (1934)
Two problems
1) 1-tailed vs 2-tailed p-values mixed up
2) the word "or" misinterpreted
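
The nonconvexity of the acceptance region for Q_B can be seen with a small numerical example. The sketch assumes m = 2, standardized Gaussian estimates (so the one-tailed p-value is Φ of the estimate), and level 0.01; the function names and the three test points are illustrative choices, not from the talk.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf

def q_b(beta_hat):
    """Q_B = -2 * sum log(1 - p_j), with two-tailed p_j = 2 * min(Phi(b), Phi(-b))."""
    total = 0.0
    for b in beta_hat:
        p_two = 2.0 * min(PHI(b), PHI(-b))
        if p_two >= 1.0:
            return math.inf            # log(0): Q_B is infinite, never rejected
        total += -2.0 * math.log(1.0 - p_two)
    return total

def chi2_cdf_even(x, m):
    """Pr(chi-square with 2m degrees of freedom <= x), closed form for even df."""
    if x == math.inf:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, m):
        term *= half / i
        total += term
    return 1.0 - math.exp(-half) * total

def rejects(beta_hat, alpha=0.01):
    """Reject H_0 when Q_B falls in the lower alpha tail of chi-square(2m)."""
    return chi2_cdf_even(q_b(beta_hat), len(beta_hat)) <= alpha
```

With these definitions the points (6, 0) and (0, 6) are both accepted while their midpoint (3, 3) is rejected, so the acceptance region cannot be convex.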

Acceptance regions
[Figure: four panels, $Q_C$, $Q_U$, $Q_L$, $Q_B$]
x axis is $\hat\beta_1$ and y axis is $\hat\beta_2$
Blue curve = rejection boundary
Dot (origin) is in acceptance region for $H_0$
Admissible = dot in convex region
Pearson's $Q_U$ region looks convex
Of course it is! Intersect $Q_L$ and $Q_R$ regions

Admissibility of $Q_U$
Theorem 1. For $(\hat\beta_1, \dots, \hat\beta_m) \in \mathbb{R}^m$ let
$Q_U = \max\Bigl(-2\log\prod_{j=1}^m \Phi(\hat\beta_j),\ -2\log\prod_{j=1}^m \Phi(-\hat\beta_j)\Bigr)$.
Then $\{(\hat\beta_1, \dots, \hat\beta_m) \mid Q_U < q\}$ is convex, so that Pearson's test is admissible in the exponential family context, for Gaussian data.
Ideas of proof
1) $\varphi(t)$ is log concave
2) so therefore are $\Phi(t)$ and $\Phi(-t)$ (Boyd and Vandenberghe)
3) $-\log$(log concave) is convex
4) sum of convex is convex
5) max of convex is convex
These steps apply in other settings too
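
The convexity claim can be spot-checked numerically. The sketch below tests midpoint convexity of Q_U at random pairs of points. It is a sanity check under the Gaussian setup of the theorem, not a proof, and the helper names, sampling range and tolerance are arbitrary choices.

```python
import math
import random

def log_phi(t):
    """log of the standard normal CDF, via erfc for accuracy in the left tail."""
    return math.log(0.5 * math.erfc(-t / math.sqrt(2.0)))

def q_u(beta_hat):
    """Q_U = max(-2 sum log Phi(b_j), -2 sum log Phi(-b_j))."""
    q_l = -2.0 * sum(log_phi(b) for b in beta_hat)
    q_r = -2.0 * sum(log_phi(-b) for b in beta_hat)
    return max(q_l, q_r)

def midpoint_violations(m, trials, seed=0):
    """Count random pairs (x, y) where Q_U at the midpoint exceeds the average."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        x = [rng.uniform(-4.0, 4.0) for _ in range(m)]
        y = [rng.uniform(-4.0, 4.0) for _ in range(m)]
        mid = [(a + b) / 2.0 for a, b in zip(x, y)]
        if q_u(mid) > 0.5 * (q_u(x) + q_u(y)) + 1e-9:
            count += 1
    return count
```

A convex function has convex sublevel sets, so finding no violations is consistent with the acceptance region being convex.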

Likelihood ratio tests
Marden (1985)
For $Z_j = \Phi^{-1}(\tilde p_j)$
Left, right, and center versions
$\Lambda_L = \sum_{j=1}^m \max(0, -Z_j)^2$
$\Lambda_R = \sum_{j=1}^m \max(0, Z_j)^2$
$\Lambda_C = \sum_{j=1}^m Z_j^2$
New one
$\Lambda_U = \max(\Lambda_L, \Lambda_R)$
Admissible, favors concordant alternatives, Bonferroni fairly tight
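
These likelihood ratio statistics are simple to compute from the one-tailed p-values. A minimal sketch, assuming the sign convention that a small p~_j (negative Z_j) is evidence for the left alternative; `lrt_statistics` is a hypothetical name.

```python
from statistics import NormalDist

def lrt_statistics(p_left):
    """Compute (Lambda_L, Lambda_R, Lambda_C, Lambda_U) from left p-values p~_j.

    Assumes 0 < p~_j < 1 and Z_j = Phi^{-1}(p~_j).
    """
    inv = NormalDist().inv_cdf
    z = [inv(p) for p in p_left]
    lam_l = sum(max(0.0, -v) ** 2 for v in z)   # only negative Z_j contribute
    lam_r = sum(max(0.0, v) ** 2 for v in z)    # only positive Z_j contribute
    lam_c = sum(v * v for v in z)
    return lam_l, lam_r, lam_c, max(lam_l, lam_r)
```

Since each Z_j contributes to exactly one of the two directed sums, the left and right statistics add up to the central one, and a discordant coordinate is simply ignored by the directed statistic rather than counted against it.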

Undirected LRT vs Fisher in $(\tilde p_1, \tilde p_2)$
[Figure: two panels, $\Lambda_U$ and $Q_U$]
$\Lambda_U$ will catch more discordant tests
$Q_U$ has more power for concordant tests

More acceptance regions
[Figure: acceptance regions for two Gaussian variables, both axes from -3 to 3: undirected likelihood ratio $\Lambda_U$, undirected Fisher $Q_U$, Stouffer $S_U$]

Alternatives of interest
$(\beta_1, \dots, \beta_m) \in \mathbb{R}^m$
Most $\beta_j$ either zero or of common sign
Simpler special cases: each $\beta_j \in \{0, \Delta\}$, $\Delta > 0$

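Power of tests
$\beta = \pm(\Delta, \dots, \Delta, 0, \dots, 0) \in H_A \subset \mathbb{R}^m$, with k nonzero entries and m − k zero entries
$\hat\beta \sim N(\beta, I_m)$
$m = 16$, $k \in \{2, 4, 8, 16\}$
[Figure: power (0 to 1) against $\Delta$ (0 to 5), curves labeled 16, 8, 4, 2, for $Q_U$, $\Lambda_U$ and $\Lambda_C = \sum_{j=1}^m \hat\beta_j^2$]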
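
Power under this Gaussian setup can be approximated by simulation. The sketch below is a plain Monte Carlo estimate for Q_U using the Bonferroni critical value. It is not the convolution method described later in the talk, and the function names, seed and replication count are arbitrary choices.

```python
import math
import random

def log_phi(t):
    """log of the standard normal CDF, via erfc for accuracy in the left tail."""
    return math.log(0.5 * math.erfc(-t / math.sqrt(2.0)))

def q_u(beta_hat):
    q_l = -2.0 * sum(log_phi(b) for b in beta_hat)
    q_r = -2.0 * sum(log_phi(-b) for b in beta_hat)
    return max(q_l, q_r)

def chi2_sf_even(x, m):
    """Pr(chi-square with 2m degrees of freedom > x), closed form for even df."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, m):
        term *= half / i
        total += term
    return math.exp(-half) * total

def mc_power_q_u(beta, alpha=0.01, reps=2000, seed=1):
    """Monte Carlo power of the Bonferroni Q_U test when beta_hat ~ N(beta, I_m)."""
    rng = random.Random(seed)
    m = len(beta)
    hits = 0
    for _ in range(reps):
        beta_hat = [b + rng.gauss(0.0, 1.0) for b in beta]
        # reject when 2 * Pr(chi2_{2m} > Q_U) <= alpha
        if 2.0 * chi2_sf_even(q_u(beta_hat), m) <= alpha:
            hits += 1
    return hits / reps
```

Comparing, say, `mc_power_q_u([1.5] * 4)` with `mc_power_q_u([1.5, 1.5, -1.5, -1.5])` illustrates the trade-off: the concordant alternative is detected far more often than the split one.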

Scale $\Delta$ to k
$\beta = \pm(\Delta_k, \dots, \Delta_k, 0, \dots, 0) \in H_A \subset \mathbb{R}^m$, with k nonzero entries and m − k zero entries
$\hat\beta \sim N(\beta, I_m)$
Choose $\Delta_k$ so $\sum_j \hat\beta_j^2$ has power 0.8 at $\alpha = 0.01$
[Figure: power (0 to 1) against number nonzero (5 to 15), for $Q_U$, $\Lambda_U$, $S_U$, $S_C$]

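One negative
$\beta = \pm(-\Delta_k, \Delta_k, \dots, \Delta_k, 0, \dots, 0) \in H_A \subset \mathbb{R}^m$, with one entry $-\Delta_k$, k − 1 entries $\Delta_k$, and m − k zero entries
$\hat\beta \sim N(\beta, I_m)$
Choose $\Delta_k$ so $\sum_j \hat\beta_j^2$ has power 0.8 at $\alpha = 0.01$
[Figure: power (0 to 1) against number nonzero (5 to 15), for $Q_U$, $\Lambda_U$, $S_U$, $S_C$]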

Computing the power
e.g. $Q_L = -2\sum_{j=1}^m \log(\tilde p_j)$, with $\tilde p_j = \Phi(\hat\beta_j)$
A sum of independent random variables, with distributions $F_j$ under $H_A$
Get distribution by convolution (FFT)
Monahan (2001) convolves characteristic functions
New (?) alternative
Get discrete CDFs $F_j^- \preceq F_j \preceq F_j^+$ (stochastic inequality)
Support on grid $\{0, \eta, 2\eta, \dots, (N-1)\eta, +\infty\}$, $\eta > 0$
When convolving upper bounds, round overflow up to $+\infty$
When convolving lower bounds, round overflow down to $(N-1)\eta$
After convolution $\bigotimes_{j=1}^m F_j^- \preceq \mathcal{L}(Q_L) \preceq \bigotimes_{j=1}^m F_j^+$
We get 100% confidence, finite width
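
The bracketing idea can be sketched with direct (non-FFT) convolution on the grid. This is an illustrative reading of the scheme, assuming continuous nonnegative summands specified by their CDFs; the function names, the list layout (last slot holds the mass at +∞) and the quadratic-time convolution are choices made here, not the talk's implementation.

```python
import math

def discretize(cdf, eta, n, upper):
    """Discretize a nonnegative variable onto {0, eta, ..., (n-1)*eta, +inf}.

    Returns n + 1 probabilities; the last slot is the mass at +inf.
    upper=True rounds values up (stochastically larger), else rounds down.
    """
    pmf = [0.0] * (n + 1)
    top = (n - 1) * eta
    if upper:
        pmf[0] = cdf(0.0)
        for k in range(1, n):
            pmf[k] = cdf(k * eta) - cdf((k - 1) * eta)
        pmf[n] = 1.0 - cdf(top)            # overflow rounds up to +inf
    else:
        for k in range(n - 1):
            lo = cdf(k * eta) if k > 0 else 0.0
            pmf[k] = cdf((k + 1) * eta) - lo
        pmf[n - 1] = 1.0 - cdf(top)        # overflow rounds down to (n-1)*eta
    return pmf

def convolve(a, b, upper):
    """Convolve two grid distributions, sending overflow up or down."""
    n = len(a) - 1
    out = [0.0] * (n + 1)
    for i, pa in enumerate(a):
        if pa == 0.0:
            continue
        for j, pb in enumerate(b):
            if i == n or j == n:
                out[n] += pa * pb          # +inf plus anything is +inf
            elif i + j <= n - 1:
                out[i + j] += pa * pb
            elif upper:
                out[n] += pa * pb          # upper bound: overflow to +inf
            else:
                out[n - 1] += pa * pb      # lower bound: overflow to top grid point
    return out

def tail_bounds(cdfs, x, eta, n):
    """Return (lower, upper) bounds on Pr(sum of independent variables > x)."""
    bounds = []
    for upper in (False, True):
        total = discretize(cdfs[0], eta, n, upper)
        for cdf in cdfs[1:]:
            total = convolve(total, discretize(cdf, eta, n, upper), upper)
        # mass at grid points strictly above x, plus the +inf slot
        bounds.append(sum(p for k, p in enumerate(total) if k == n or k * eta > x))
    return tuple(bounds)
```

Under the null each term of Q_L is chi-square with 2 degrees of freedom, so the bracket for a sum of three terms can be compared against the exact chi-square(6) tail. Shrinking `eta` narrows the interval at the cost of a larger grid.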

Recommendations
All $\beta_j$ same sign $\implies$ $S_U \propto |\sum_j \hat\beta_j|$ recommended
Most $\beta_j$ same sign $\implies$ $Q_U = \max(Q_L, Q_R)$ recommended
Many $\beta_j$ same sign $\implies$ $\Lambda_U = \max(\Lambda_L, \Lambda_R)$ recommended

Extensive simulation
Fisher-Pearson $Q_U$ has better precision-recall than $S_U$ or $\sum_j \hat\beta_j^2$ for finding truly age related genes, in a simulation where we know which ones are related, with $\beta = (\Delta, \dots, \Delta, 0, \dots, 0)$ and resampled residuals
No free lunch
Increased power for concordant comes with decreased power for discordant
If we wanted to
We could design a test that preferred discordant results, or concordant within subgroups

Some results, for 9 tissues
[Figure: two scatter plots, "Pool via QC at level 0.001" and "Pool via QU at level 0.001"; axes "Num. of neg coef at 0.05" and "Num. of pos coef at 0.05", each from 0 to 6]
Left shows genes found via $Q_C$, right via $Q_U$
Each circle is one gene (expect 8.932 genes by chance)
x axis is # tissues with $\tilde p_j < 0.025$
y axis is # tissues with $\tilde p_j > 0.975$
$Q_U$ pulls up more unanimous genes (269 vs 216), fewer split decisions, fewer total

A more principled approach
1) Pick a prior on $\beta$
2) Quantify the relative value of split decisions vs unanimous findings
3) Find a test to optimize expected value of discoveries
Steps 1 and 2 look harder than 3

Simes test regions
$p = \min_{1 \le j \le m} \frac{m}{j}\, p_{(j)} \sim U(0, 1)$ under $H_0$
$p = \min(2p_{(1)}, p_{(2)})$ for $m = 2$
[Figure: 95% regions in panels labeled C, L, T; x axis is $\hat\beta_1$, y axis is $\hat\beta_2$, each from -3 to 3]
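
The Simes combination is short enough to write out directly. A minimal sketch, with the hypothetical name `simes_p_value`:

```python
def simes_p_value(p_values):
    """Simes combined p-value: min over j of (m / j) * p_(j), p_(j) sorted ascending."""
    m = len(p_values)
    ordered = sorted(p_values)
    return min(m * p / j for j, p in enumerate(ordered, start=1))
```

For two p-values this reduces to min(2 p_(1), p_(2)), the m = 2 case stated above.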

Partial conjunction hypotheses
Benjamini and Heller (2007)
Alt. is only interesting if r or more of $\beta_j \ne 0$
Null and alternative
$H_{0r}$: $\sum_{j=1}^m 1_{\beta_j \ne 0} < r$
$H_{Cr}$: $\sum_{j=1}^m 1_{\beta_j \ne 0} \ge r$
NB: the null is composite for $r > 1$, e.g. $\{0\}$ and the axes when $r = 2$
Test statistics
Ignore the most significant $r - 1$ p-values, combine the rest

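Partial conjunction test statistics
$p_{(1)} \le p_{(2)} \le \dots \le p_{(m)}$, sorted independently of $\tilde p_{(1)} \le \tilde p_{(2)} \le \dots \le \tilde p_{(m)}$
Fisher style
$-2\log\bigl(\prod_{j=r}^m p_{(j)}\bigr)$
$-2\log\bigl(\prod_{j=r}^m \tilde p_{(j)}\bigr)$
$-2\log\bigl(\prod_{j=1}^{m-r+1} (1 - \tilde p_{(j)})\bigr)$
Stouffer style
$\sum_{j=r}^m \Phi^{-1}(p_{(j)})$
$\sum_{j=r}^m \Phi^{-1}(\tilde p_{(j)})$
$\sum_{j=1}^{m-r+1} \Phi^{-1}(1 - \tilde p_{(j)})$
Simes style
$\min_{r \le j \le m} \frac{m-r+1}{j-r+1}\, p_{(j)}$
$\min_{r \le j \le m} \frac{m-r+1}{j-r+1}\, \tilde p_{(j)}$
$\min_{r \le j \le m} \frac{m-r+1}{j-r+1}\, (1 - \tilde p_{(m-j+1)})$
Worth considering LRT and undirected versions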
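
The first (two-tailed) Fisher-style and Simes-style statistics can be sketched as follows. Both drop the r − 1 smallest p-values and combine the remaining m − r + 1. The function names are hypothetical, and the reference distributions are deliberately left out since the slides do not state them.

```python
import math

def partial_conjunction_fisher(p_values, r):
    """-2 log prod_{j=r}^m p_(j): ignore the r-1 smallest p-values, combine the rest."""
    ordered = sorted(p_values)
    kept = ordered[r - 1:]
    return -2.0 * sum(math.log(p) for p in kept)

def partial_conjunction_simes(p_values, r):
    """min over r <= j <= m of (m - r + 1) / (j - r + 1) * p_(j)."""
    m = len(p_values)
    ordered = sorted(p_values)
    return min((m - r + 1) * ordered[j - 1] / (j - r + 1) for j in range(r, m + 1))
```

With r = 1 these are the ordinary Fisher and Simes combinations. With m = 2 and r = 2 both depend only on the larger p-value.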

Partial conjunction regions
[Figure: three panels, C, L, U]
For $m = 2$ and $r = 2$, need both significant
Simes/Fisher/Stouffer collapse into one: $p_{(r)}, \dots, p_{(m)}$ is just $p_{(2)}$
Null is $\{(\beta_1, \beta_2) \mid \beta_1 = 0 \text{ or } \beta_2 = 0\}$

Next steps
Partial conjunction tests have nonconvex acceptance regions
So they're not suited to a point null
They were not motivated by that null either
So how to pick good tests for this setting?
Or rule out bad ones?

Acknowledgments
Stuart Kim and Jacob Zahn for many discussions about testing
Ingram Olkin and John Marden for comments on meta-analysis
NSF for support
Nancy Zhang, Ed George, Adam Greenberg

Quotes
Given time, here's the history of the mixup.
More details in the paper "Karl Pearson's Meta-Analysis Revisited", Annals of Statistics (2009)

Birnbaum (1954) p 562
Quote
"Karl Pearson's method: reject $H_0$ if and only if $(1 - u_1)(1 - u_2) \cdots (1 - u_k) \ge c$, where c is a predetermined constant corresponding to the desired significance level. In applications, c can be computed by a direct adaptation of the method used to calculate the c used in Fisher's method."
Upshot
In our notation $(1 - u_1)(1 - u_2) \cdots (1 - u_k)$ is $\prod_{j=1}^m (1 - p_j)$.
It is clear from his Figure 4 that it does not mean $\prod_{j=1}^m (1 - \tilde p_j)$.
Birnbaum does not cite any of Karl Pearson's papers. Instead he cites Egon Pearson.

E. Pearson (1938) p 136
Quote
"Following what may be described as the intuitional line of approach, K. Pearson (1933) suggested as suitable test criterion one or other of the products $Q_1 = y_1 y_2 \cdots y_n$, or $Q_1' = (1 - y_1)(1 - y_2) \cdots (1 - y_n)$."
Upshot
In our notation $Q_1 = \prod_{j=1}^m \tilde p_j$ and $Q_1' = \prod_{j=1}^m (1 - \tilde p_j)$.
E. Pearson cites K. Pearson's 1933 paper, although it appears that he should have cited the 1934 paper instead, because the former has only $Q_1$, while the latter has $Q_1$ and $Q_1'$.
"or" vs "or"
K. Pearson's "or" meant try them both and take the more extreme.
A. Birnbaum's "or" meant try either of them, one at a time.
He also used two-tailed $p_j$ where Pearson had one-tailed $\tilde p_j$.

Hedges & Olkin (1985)
"Several other functions for combining p-values have been proposed. In 1933 Karl Pearson suggested combining p-values via the product $(1 - p_1)(1 - p_2) \cdots (1 - p_k)$. Other functions of the statistics $p_i^* = \mathrm{Min}\{p_i, 1 - p_i\}$, $i = 1, \dots, k$, were suggested by David (1934) for the combination of two-sided test statistic, which treat large and small values of the $p_i$ symmetrically. Neither of these procedures has a convex acceptance region, so these procedures are not admissible for combining test statistics from the one-parameter exponential family."
Upshot
The complaint vs $Q_U$ may be stuck in the literature for a while.
Birnbaum points out that finding something inadmissible does not mean it will be easy to find the thing that beats it.