SIGNAL RANKING-BASED COMPARISON OF AUTOMATIC DETECTION METHODS IN PHARMACOVIGILANCE
1 SIGNAL RANKING-BASED COMPARISON OF AUTOMATIC DETECTION METHODS IN PHARMACOVIGILANCE: A HYPOTHESIS TEST APPROACH
Ismaïl Ahmed (1,2), Françoise Haramburu (3,4), Annie Fourrier-Réglat (3,4,5), Frantz Thiessard (4,5,6), Carmen Kreft-Jais (7), Ghada Miremont-Salamé (3,4,5), Bernard Bégaud (3,4,5), Pascale Tubert-Bitter (1,2)
1: Inserm, U780, Villejuif - 2: Univ Paris-Sud, Villejuif - 3: Inserm, U657, Bordeaux - 4: Pellegrin Hospital, Bordeaux - 5: Université Victor Segalen Bordeaux 2, Bordeaux - 6: Inserm, U593, Bordeaux - 7: Afssaps, Saint-Denis
Statistics for health registers and linked databases, Open University, May 2009
2 Introduction (1)
Pharmacovigilance: post-marketing surveillance
Objectives
- Detection: identification, as early as possible, of adverse drug reactions (ADRs) not observed during clinical trials (rare, latent, or affecting sub-groups)
- Characterization of new risks
Data sources: spontaneous reporting systems
Features of spontaneous reporting data
- Real-life capture of events in the exposed population
- Under-reporting, unknown baseline incidence and unknown exposure
3 Introduction (2)
Framework: incorporating a statistical tool for signal detection
Very large databases (spontaneous reports) -> statistical associations within the database (automatic detection) -> signals -> potential adverse drug reactions (ADRs)
4 Introduction (3)
Pharmacovigilance database
- Large contingency table crossing all the drugs (D) and all the adverse events (AEs)
- French database: 672 D × 820 AEs (80 % of the cells are empty)

                 Adverse event i    Other adverse events
  Drug j         $n_{ij}$
  Other drugs

Automatic signal detection methods
- Frequentist methods: Proportional Reporting Ratio (UK), Reporting Odds Ratio (Netherlands)
- Bayesian methods: Gamma Poisson Shrinkage (USA), Bayesian Confidence Propagation Neural Network (WHO)
5 Introduction (4)
Limits of the current methods
- Thresholds for the statistics of interest are chosen arbitrarily
- Multiple comparisons are not taken into account
Proposed approach
- Account for the multiple comparisons when choosing a threshold
- Through the use of recent False Discovery Rate (FDR) approaches
- This leads to alternative statistics of interest: P-values for the frequentist methods, posterior probabilities of the null hypothesis for the Bayesian methods
6 Outline
- Description of the current methods
- Extension to the multiple comparison setting
- Simulation study
- Application to the French data
- Discussion
7 Some notations
For a particular couple (AE i, D j):

                          Drug j              Other drugs               Total
  Adverse event i         $n_{ij}$            $n_{i\bar{j}}$            $n_{i.}$
  Other adverse events    $n_{\bar{i}j}$      $n_{\bar{i}\bar{j}}$      $n_{\bar{i}.}$
  Total                   $n_{.j}$            $n_{.\bar{j}}$            $n$

- $n_{ij}$: number of reports involving AE i and D j
- $n_{i.}$: marginal count involving AE i
- $n_{.j}$: marginal count involving D j
- $n$: total count over all AE-D pairs
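8 Frequentist methods
Reporting Odds Ratio (ROR), van Puijenbroek et al.
For the adverse event-drug pair (i, j):
$$\hat\psi_{ij} = \frac{n_{ij}\, n_{\bar{i}\bar{j}}}{n_{\bar{i}j}\, n_{i\bar{j}}}$$
$\ln(\hat\psi_{ij})$ is assumed to follow a normal distribution with variance
$$\widehat{\mathrm{var}}\{\ln(\hat\psi_{ij})\} = \frac{1}{n_{ij}} + \frac{1}{n_{\bar{i}\bar{j}}} + \frac{1}{n_{\bar{i}j}} + \frac{1}{n_{i\bar{j}}}$$
A signal is generated if
$$\ln(\hat\psi_{ij}) - 1.96\, \widehat{\mathrm{var}}\{\ln(\hat\psi_{ij})\}^{1/2} > 0$$
Proportional Reporting Ratio (PRR), Evans et al.
Same idea, but with the relative risk as the association measure of interest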
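
The two frequentist rules can be sketched for a single AE-drug pair as follows. This is a minimal illustration, not the authors' implementation: the function names and the example counts are hypothetical, and the PRR variance is the usual log relative risk approximation, which the slide does not spell out.

```python
import math

def ror_signal(n11, n10, n01, n00, z=1.96):
    """Reporting Odds Ratio for one AE-drug pair.

    n11: reports with AE i and drug j
    n10: reports with AE i and other drugs
    n01: reports with other AEs and drug j
    n00: reports with other AEs and other drugs
    All four counts must be positive for the normal approximation.
    """
    log_ror = math.log((n11 * n00) / (n01 * n10))
    se = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    lower = log_ror - z * se  # lower bound on the log scale
    return math.exp(log_ror), lower, lower > 0

def prr_signal(n11, n10, n01, n00, z=1.96):
    """Proportional Reporting Ratio (relative risk of reporting AE i).

    Assumption: standard log relative risk variance.
    """
    drug_total = n11 + n01    # n_.j
    other_total = n10 + n00   # reports for all other drugs
    prr = (n11 / drug_total) / (n10 / other_total)
    se = math.sqrt(1 / n11 - 1 / drug_total + 1 / n10 - 1 / other_total)
    lower = math.log(prr) - z * se
    return prr, lower, lower > 0

# Hypothetical counts: 20 reports for the pair of interest.
ror, ror_lower, ror_flag = ror_signal(20, 80, 180, 9720)
prr, prr_lower, prr_flag = prr_signal(20, 80, 180, 9720)
```

Both functions return the point estimate, the lower bound on the log scale and the signal flag, so the arbitrary 1.96 cut-off stays visible as the parameter `z`.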
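9 Bayesian Gamma Poisson Shrinkage (GPS)
DuMouchel (1999)
Poisson - 2 gamma mixture model
$$n_{ij} \mid e_{ij}, \lambda_{ij} \sim Pn(\lambda_{ij} e_{ij}) \quad \text{with} \quad e_{ij} = \frac{n_{i.}\, n_{.j}}{n}$$
$$\lambda_{ij} \sim \hat{w}\, Ga(\hat\alpha_1, \hat\beta_1) + (1 - \hat{w})\, Ga(\hat\alpha_2, \hat\beta_2)$$
where $(\hat{w}, \hat\alpha_1, \hat\alpha_2, \hat\beta_1, \hat\beta_2)$ maximizes the marginal likelihood
$$\prod_{ij} \left[ w\, f_{Bn}\{n_{ij}; \alpha_1, \beta_1/(\beta_1 + e_{ij})\} + (1 - w)\, f_{Bn}\{n_{ij}; \alpha_2, \beta_2/(\beta_2 + e_{ij})\} \right]$$
Association measure
$$\lambda_{ij} \mid n_{ij}, e_{ij} \sim w_{ij}\, Ga(\hat\alpha_1 + n_{ij}, \hat\beta_1 + e_{ij}) + (1 - w_{ij})\, Ga(\hat\alpha_2 + n_{ij}, \hat\beta_2 + e_{ij})$$
Signal generation: $Q_{0.05}(\lambda_{ij}) > 2$, Szarfman et al. (2002)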
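
As a rough sketch of the posterior step, the code below computes the posterior mixture weight $w_{ij}$ and the posterior mean of $\lambda_{ij}$ for one cell, given hyperparameters that are assumed to be already estimated. The weight formula (prior weight times the negative binomial marginal of each component, renormalized) is the standard one for this mixture but is not written on the slide; the hyperparameter values are purely illustrative. The 5 % quantile used by the signal rule needs the incomplete gamma function and is left out.

```python
import math

def nb_log_pmf(n, alpha, p):
    """Log pmf of the negative binomial with size alpha and probability p."""
    return (math.lgamma(alpha + n) - math.lgamma(alpha) - math.lgamma(n + 1)
            + alpha * math.log(p) + n * math.log(1 - p))

def gps_posterior(n, e, w, a1, b1, a2, b2):
    """Posterior mixture weight and posterior mean of lambda for one cell.

    n: observed count n_ij, e: expected count e_ij,
    (w, a1, b1, a2, b2): mixture hyperparameters (assumed already fitted).
    """
    f1 = math.exp(nb_log_pmf(n, a1, b1 / (b1 + e)))
    f2 = math.exp(nb_log_pmf(n, a2, b2 / (b2 + e)))
    w_post = w * f1 / (w * f1 + (1 - w) * f2)
    # Each posterior component is Ga(a_k + n, b_k + e), with mean (a_k + n) / (b_k + e).
    mean = w_post * (a1 + n) / (b1 + e) + (1 - w_post) * (a2 + n) / (b2 + e)
    return w_post, mean

# Illustrative values only: 20 observed reports against 2 expected.
w_post, post_mean = gps_posterior(n=20, e=2.0, w=1 / 3,
                                  a1=0.2, b1=0.1, a2=2.0, b2=4.0)
```

With these values the posterior mean lies below the raw ratio n/e = 10, which is the shrinkage the method is named for.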
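10 Bayesian Confidence Propagation Neural Network (BCPNN) (1)
Bate et al. (1998), Noren et al. (2006)
Multinomial-Dirichlet model
$$(n_{ij}, n_{i\bar{j}}, n_{\bar{i}j}, n_{\bar{i}\bar{j}}) \sim Mu(n, p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}})$$
with
$$(p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}}) \sim Di(\alpha_{ij}, \alpha_{i\bar{j}}, \alpha_{\bar{i}j}, \alpha_{\bar{i}\bar{j}})$$
The hyperparameters depend on the cell counts
The posterior distribution of $(p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}})$ is also a Dirichlet:
$$(p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}}) \sim Di(\gamma_{ij}, \gamma_{i\bar{j}}, \gamma_{\bar{i}j}, \gamma_{\bar{i}\bar{j}}) \quad \text{with} \quad \gamma_{kl} = \alpha_{kl} + n_{kl}$$
In particular
- $p_{ij} \sim Be(\gamma_{ij},\ \gamma_{i\bar{j}} + \gamma_{\bar{i}j} + \gamma_{\bar{i}\bar{j}})$
- $p_{i.} = p_{ij} + p_{i\bar{j}} \sim Be(\gamma_{ij} + \gamma_{i\bar{j}},\ \gamma_{\bar{i}j} + \gamma_{\bar{i}\bar{j}})$
- $p_{.j} = p_{ij} + p_{\bar{i}j} \sim Be(\gamma_{ij} + \gamma_{\bar{i}j},\ \gamma_{i\bar{j}} + \gamma_{\bar{i}\bar{j}})$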
11 Bayesian Confidence Propagation Neural Network (BCPNN) (2)
Bate et al. (1998), Noren et al. (2006)
Association measure
$$IC_{ij} = \log_2 \left( \frac{p_{ij}}{p_{i.}\, p_{.j}} \right)$$
- Ratio of beta distributions
- No analytic form
Signal generation: $Q(IC_{ij}) > 0$, with $Q$ a quantile of the posterior distribution of $IC_{ij}$
Interpolation model built from Monte Carlo simulations: Noren et al. (2006)
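
Because $IC_{ij}$ has no analytic posterior, one simple way to approximate it is to draw directly from the Dirichlet posterior. The sketch below does this with the standard library only; it is not the interpolation model of Noren et al. The flat hyperparameters (0.25 per cell), the 2.5 % quantile level and the counts are assumptions made for the example.

```python
import math
import random

def ic_posterior_draws(counts, alphas, n_draws=5000, seed=0):
    """Draw IC = log2(p_ij / (p_i. * p_.j)) from the Dirichlet posterior.

    counts, alphas: 4-tuples ordered as
    (AE i & drug j, AE i & other drugs, other AEs & drug j, other AEs & other drugs).
    """
    rng = random.Random(seed)
    gammas = [a + n for a, n in zip(alphas, counts)]
    draws = []
    for _ in range(n_draws):
        # A Dirichlet draw is a vector of independent gamma draws, normalized.
        g = [rng.gammavariate(k, 1.0) for k in gammas]
        total = sum(g)
        p_ij, p_i_other, p_other_j, _ = (x / total for x in g)
        p_i = p_ij + p_i_other   # marginal probability of AE i
        p_j = p_ij + p_other_j   # marginal probability of drug j
        draws.append(math.log2(p_ij / (p_i * p_j)))
    return draws

def lower_quantile(draws, level):
    """Empirical quantile of the Monte Carlo draws."""
    ordered = sorted(draws)
    return ordered[int(level * (len(ordered) - 1))]

flat_prior = (0.25, 0.25, 0.25, 0.25)               # assumed hyperparameters
strong = ic_posterior_draws((50, 50, 50, 10000), flat_prior)
q_strong = lower_quantile(strong, 0.025)            # assumed quantile level
independent = ic_posterior_draws((10, 90, 90, 810), flat_prior)
prob_null = sum(d <= 0 for d in independent) / len(independent)
```

The same draws give the posterior probability of the null hypothesis as the fraction of draws at or below the chosen threshold, which is the statistic the FDR-based approach ranks on.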
13 False Discovery Rate and Pharmacovigilance
Automatic signal detection methods are data mining tools
Extension to the hypothesis testing framework
- relies on recent developments in the statistical field of multiple comparisons
- gives detection thresholds based on statistical criteria
False Discovery Rate (Benjamini and Hochberg (1995))
- E(proportion of false discoveries among the generated signals)
- used in genomic data analysis
- suited to massive comparisons and exploratory analysis
14 Frequentist methods - Proposed approach (1)
New statistic of interest: P-values
e.g. ROR: for each cell, we want to test $H_{0ij}: \psi_{ij} \le \psi_0$
The corresponding P-values are
$$p_{ij} = 1 - \Phi\left( \frac{\ln(\hat\psi_{ij}) - \ln(\psi_0)}{\mathrm{var}[\ln(\hat\psi_{ij})]^{1/2}} \right)$$
where $\Phi$ denotes the standard normal cdf
The current decision rule corresponds to choosing $\psi_0 = 1$ and generating signals for cells with $p_{ij} \le 0.025$
Exactly the same idea applies to the PRR method
Alternative: mid-P-values from Fisher's exact test (midRFET)
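
The one-sided P-value for the ROR can be computed directly from the four cell counts. This is a minimal sketch with hypothetical names and counts; it uses `math.erfc` for the upper normal tail, which avoids the loss of precision in $1 - \Phi(z)$ for large $z$.

```python
import math

def ror_p_value(n11, n10, n01, n00, psi0=1.0):
    """One-sided P-value for H0: psi <= psi0, normal approximation on ln(ROR).

    n11: AE i & drug j, n10: AE i & other drugs,
    n01: other AEs & drug j, n00: other AEs & other drugs.
    """
    log_ror = math.log((n11 * n00) / (n01 * n10))
    se = math.sqrt(1 / n11 + 1 / n10 + 1 / n01 + 1 / n00)
    z = (log_ror - math.log(psi0)) / se
    # Upper tail of the standard normal: 1 - Phi(z) = erfc(z / sqrt(2)) / 2
    return 0.5 * math.erfc(z / math.sqrt(2.0))

p_null = ror_p_value(10, 90, 90, 810)              # ROR exactly 1
p_psi1 = ror_p_value(20, 80, 180, 9720, psi0=1.0)
p_psi2 = ror_p_value(20, 80, 180, 9720, psi0=2.0)
```

With $\psi_0 = 1$, thresholding this P-value at 0.025 reproduces the lower-bound rule based on 1.96 standard errors; a larger $\psi_0$ can only increase the P-value.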
15 Frequentist methods - Proposed approach (2)
FDR estimation
P-values are assumed to follow a mixture of two distributions:
$$F(p) = \pi_0 F_0(p) + (1 - \pi_0) F_1(p)$$
- $F_0(p)$ is the cdf of $p$ under the null hypothesis
- $F_1(p)$ is the cdf of $p$ under the alternative hypothesis
For a P-value rejection region $[0, \gamma]$ with $\gamma \in\, ]0, 1]$:
$$\mathrm{FDR}(\gamma) = \frac{\pi_0 F_0(\gamma)}{F(\gamma)}$$
The main difficulty is to estimate $\pi_0$: qvalue, Storey (2003); LBE, Dalmasso et al. (2005)
- They rely on few distributional assumptions
- They provide an upper bound of the FDR
- They were developed for single null hypotheses: uniform distribution of the P-values under $H_0$ ($F_0(\gamma) = \gamma$)
In our case the null hypothesis is one-sided
- The distribution of the P-values is not uniform
- But we can use those procedures on p = 1 2 p
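
The plug-in estimate of $\mathrm{FDR}(\gamma)$ can be sketched as below. The estimator of $\pi_0$ is a simple Storey-type one with a fixed tuning value `lam`, chosen here for brevity; it is neither the qvalue smoother nor the LBE procedure named above. It assumes P-values that are uniform under the null ($F_0(\gamma) = \gamma$), so it applies as written only to P-values for which that holds. The example P-values are synthetic.

```python
def estimate_pi0(p_values, lam=0.5):
    """Storey-type estimate of the proportion of true nulls, capped at 1.

    P-values above lam are assumed to come almost only from true nulls.
    """
    m = len(p_values)
    above = sum(1 for p in p_values if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))

def estimate_fdr(p_values, gamma, lam=0.5):
    """Plug-in estimate of FDR(gamma) = pi0 * gamma / F(gamma)."""
    m = len(p_values)
    rejected = sum(1 for p in p_values if p <= gamma)
    if rejected == 0:
        return 0.0
    return min(1.0, estimate_pi0(p_values, lam) * gamma / (rejected / m))

# Synthetic example: 20 strong signals plus 80 nulls on a uniform grid.
p_values = [0.001] * 20 + [(k + 0.5) / 80 for k in range(80)]
pi0_hat = estimate_pi0(p_values)
fdr_hat = estimate_fdr(p_values, gamma=0.05)
```

In this example 4 of the 24 rejected cells are nulls, so the true false discovery proportion is 1/6, which the estimate recovers because the synthetic nulls are exactly uniform.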
16 Bayesian methods - Proposed approach (1)
New statistic of interest: posterior probability of the null hypothesis
For each cell, we want to test
- $H_{0ij}: \lambda_{ij} \le R_0$ for the GPS model
- $H_{0ij}: \frac{p_{ij}}{p_{i.}\, p_{.j}} \le R_0$ for the BCPNN model
and thus to calculate the posterior probability of $H_{0ij}$
- $\Pr(\lambda_{ij} \le R_0)$ for the GPS model
- $\Pr(IC_{ij} \le \ln(R_0))$ for the BCPNN model
The current decision rules correspond to
- $R_0 = 2$ and $\Pr(\lambda_{ij} \le R_0) \le 0.05$ for the GPS model
- $R_0 = 1$ and $\Pr(IC_{ij} \le \ln(R_0))$ for the BCPNN model
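17 Bayesian methods - Proposed approach (2): FDR estimation
Based on the Bayesian decision theory framework, Müller et al. (2004)
- Status $z_{ij} \in \{0, 1\}$
- Decision $d_{ij} \in \{0, 1\}$
FDR
$$\mathrm{FDP} = \frac{\sum_{ij} (1 - z_{ij})\, d_{ij}}{\sum_{ij} d_{ij}}, \qquad \mathrm{FDR} = E[\mathrm{FDP}]$$
Bayesian FDR estimation
$$E[\mathrm{FDP} \mid \text{data}] = \frac{\sum_{ij} v_{ij}\, d_{ij}}{\sum_{ij} d_{ij}}$$
where $v_{ij} = \Pr(z_{ij} = 0 \mid \text{data})$ is the posterior probability of $H_{0ij}$
- $\Pr(\lambda_{ij} \le R_0)$ for the GPS model
- $\Pr(IC_{ij} \le \ln(R_0))$ for the BCPNN model
and $d_{ij} = 1_{[v_{ij} \le \alpha]}$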
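
The Bayesian FDR estimate reduces to averaging the posterior null probabilities of the selected cells. A minimal sketch, with hypothetical function names and made-up posterior probabilities:

```python
def bayesian_fdr(post_null_probs, alpha):
    """Estimated FDR and number of signals when d_ij = 1 for v_ij <= alpha."""
    selected = [v for v in post_null_probs if v <= alpha]
    if not selected:
        return 0.0, 0
    return sum(selected) / len(selected), len(selected)

def fdr_by_rank(post_null_probs):
    """Estimated FDR when signalling the k cells with the smallest v_ij, k = 1..m."""
    ordered = sorted(post_null_probs)
    estimates, running = [], 0.0
    for k, v in enumerate(ordered, start=1):
        running += v
        estimates.append(running / k)
    return estimates

v = [0.01, 0.02, 0.03, 0.5, 0.9]   # made-up posterior probabilities of H0
fdr_hat, n_signals = bayesian_fdr(v, alpha=0.05)
curve = fdr_by_rank(v)
```

`fdr_by_rank` gives the estimated FDR as a function of the number of generated signals, which is the kind of curve used to compare methods by ranking rather than by a fixed threshold.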
19 Simulation study
Data generation
- Model: $n_{ij} \sim Mu(n, p_{ij})$, from the French database
- $p_{ij}$: $p^w_{i.} \sim Di(n_{i.})$ and $p^w_{.j} \sim Di(n_{.j})$, which gives
$$p_{ij} = \frac{r^w_{ij}\, p^w_{i.}\, p^w_{.j}}{\sum_{ij} r^w_{ij}\, p^w_{i.}\, p^w_{.j}}, \qquad \log(r^w_{ij}) \sim Lo(0, 0.5)$$
- From the $p_{ij}$: the $n_{ij}$'s and the true marginal probabilities $p_{i.} = \sum_j p_{ij}$ and $p_{.j} = \sum_i p_{ij}$
- Real status of the cells according to $\psi_{ij}$ and $\psi_0$ for the frequentist methods, and according to $R_{ij} = \frac{p_{ij}}{p_{i.}\, p_{.j}}$ and $R_0$ for the Bayesian methods
Simulation plan
- 500 simulated datasets
- The FDRs are calculated for cells with $n_{ij} \ge 3$
- Simulation for the ROR, midRFET, GPS and BCPNN methods
- Results are presented for $\{\psi_0, R_0\} = 1$ and $2$
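
The data-generating scheme can be sketched on a toy table as follows. Several points are assumptions rather than facts from the slides: $Lo(0, 0.5)$ is read as a logistic distribution with location 0 and scale 0.5, the Dirichlet parameters are the marginal counts themselves, and the margins below are invented (the French database margins are not reproduced here).

```python
import math
import random
from collections import Counter

def simulate_table(row_margins, col_margins, n_reports, seed=0):
    """Simulate true cell probabilities p_ij and one table of counts n_ij."""
    rng = random.Random(seed)

    def dirichlet(params):
        g = [rng.gammavariate(a, 1.0) for a in params]
        s = sum(g)
        return [x / s for x in g]

    def logistic(loc, scale):
        # Inverse-cdf draw; u is kept away from 0 and 1 to avoid log(0).
        u = min(max(rng.random(), 1e-12), 1 - 1e-12)
        return loc + scale * math.log(u / (1 - u))

    p_row = dirichlet(row_margins)   # p^w_i.
    p_col = dirichlet(col_margins)   # p^w_.j
    cells = [(i, j) for i in range(len(p_row)) for j in range(len(p_col))]
    weights = [math.exp(logistic(0.0, 0.5)) * p_row[i] * p_col[j]
               for i, j in cells]
    total = sum(weights)
    probs = {c: w / total for c, w in zip(cells, weights)}
    # Multinomial draw of n_reports reports over the cells.
    counts = Counter(rng.choices(cells, weights=weights, k=n_reports))
    return probs, counts

def true_ratios(probs):
    """R_ij = p_ij / (p_i. * p_.j), computed from the true marginals."""
    p_i, p_j = {}, {}
    for (i, j), p in probs.items():
        p_i[i] = p_i.get(i, 0.0) + p
        p_j[j] = p_j.get(j, 0.0) + p
    return {(i, j): p / (p_i[i] * p_j[j]) for (i, j), p in probs.items()}

probs, counts = simulate_table([30, 50, 20], [40, 60], n_reports=1000)
ratios = true_ratios(probs)
```

The true status of a cell is then obtained by comparing `ratios[(i, j)]` with $R_0$, and the true FDR of a method by counting how many of its signals fall on cells whose ratio does not exceed $R_0$.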
20 Simulation results - Comparison of the methods
[Figure: true FDRs (Monte Carlo estimation) against the average number of generated signals, for midRFET, ROR, BCPNN and GPS. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
21 Simulation results - midRFET
[Figure: true and estimated FDRs against the average number of generated signals. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
Overestimation of the FDR (as expected)
22 Simulation results - ROR
[Figure: true and estimated FDRs against the average number of generated signals. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
Normal approximation: underestimation of the FDR
23 Simulation results - GPS
[Figure: true and estimated FDRs against the average number of generated signals. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
24 Simulation results - BCPNN
[Figure: true and estimated FDRs against the average number of generated signals. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
Underestimation of the FDR
25 Simulation results - Comparison of the methods
[Figure: true and estimated FDRs against the average number of generated signals, for midRFET, ROR, BCPNN and GPS. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
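27 Application to the French database
[Figure: results against the number of generated signals, for midRFET, ROR, BCPNN and GPS. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
[Table: current decision rules. Columns: Method, Sig., FDR. Rows: ROR, BCPNN, GPS. Values missing.]
[Table: decision rules based on the FDR, e.g. GPS. Number of signals (Sig.) for two values of $R_0$. Values missing.]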
29 Discussion
Extension of the current methods to the multiple comparison framework
- No modification of the models
- New decision rules
The FDR is calculated within the database
- A spontaneous reporting database has several sources of bias
- True associations in the database may not reflect the situation in the population
- It is a measure for evaluating and comparing the performance of the automatic signal detection methods
Close performance for all the automatic methods
The GPS model provides better FDR estimates
30 References
I. Ahmed et al. FDR estimation for frequentist pharmacovigilance signal detection methods. Biometrics, in press.
I. Ahmed et al. Bayesian pharmacovigilance signal detection methods revisited in a multiple comparison setting. Statistics in Medicine, in press.
A. Bate et al. A Bayesian neural network method for adverse drug reaction signal generation. European Journal of Clinical Pharmacology, 54(4), Jun 1998.
Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57(1), 1995.
C. Dalmasso et al. A simple procedure for estimating the false discovery rate. Bioinformatics, 21(5), Mar 2005.
W. DuMouchel. Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system. The American Statistician, 53(3), 1999.
P. Müller et al. Optimal sample size for multiple testing: the case of gene expression microarrays. Journal of the American Statistical Association, 99, 2004.
G. N. Norén et al. Extending the methods used to screen the WHO drug safety database towards analysis of complex associations and improved accuracy for rare events. Statistics in Medicine, 25(21), 2006.
J. D. Storey. The positive false discovery rate: a Bayesian interpretation and the q-value. The Annals of Statistics, 31(6), 2003.
A. Szarfman et al. Use of screening algorithms and computer systems to efficiently signal higher-than-expected combinations of drugs and events in the US FDA's spontaneous reports database. Drug Safety, 25(6), 2002.
E. P. van Puijenbroek et al. A comparison of measures of disproportionality for signal detection in spontaneous reporting systems for adverse drug reactions. Pharmacoepidemiology and Drug Safety, 11(1):3-10.
More informationdiscovery rate control
Optimal design for high-throughput screening via false discovery rate control arxiv:1707.03462v1 [stat.ap] 11 Jul 2017 Tao Feng 1, Pallavi Basu 2, Wenguang Sun 3, Hsun Teresa Ku 4, Wendy J. Mack 1 Abstract
More informationNeutral Bayesian reference models for incidence rates of (rare) clinical events
Neutral Bayesian reference models for incidence rates of (rare) clinical events Jouni Kerman Statistical Methodology, Novartis Pharma AG, Basel BAYES2012, May 10, Aachen Outline Motivation why reference
More informationLarge-Scale Hypothesis Testing
Chapter 2 Large-Scale Hypothesis Testing Progress in statistics is usually at the mercy of our scientific colleagues, whose data is the nature from which we work. Agricultural experimentation in the early
More informationAliaksandr Hubin University of Oslo Aliaksandr Hubin (UIO) Bayesian FDR / 25
Presentation of The Paper: The Positive False Discovery Rate: A Bayesian Interpretation and the q-value, J.D. Storey, The Annals of Statistics, Vol. 31 No.6 (Dec. 2003), pp 2013-2035 Aliaksandr Hubin University
More informationStatistical Methods in Particle Physics
Statistical Methods in Particle Physics Lecture 11 January 7, 2013 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline How to communicate the statistical uncertainty
More informationBayesian Inference. Chapter 2: Conjugate models
Bayesian Inference Chapter 2: Conjugate models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in
More informationThe optimal discovery procedure: a new approach to simultaneous significance testing
J. R. Statist. Soc. B (2007) 69, Part 3, pp. 347 368 The optimal discovery procedure: a new approach to simultaneous significance testing John D. Storey University of Washington, Seattle, USA [Received
More informationHigh-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018
High-Throughput Sequencing Course Multiple Testing Biostatistics and Bioinformatics Summer 2018 Introduction You have previously considered the significance of a single gene Introduction You have previously
More informationSample Size Estimation for Studies of High-Dimensional Data
Sample Size Estimation for Studies of High-Dimensional Data James J. Chen, Ph.D. National Center for Toxicological Research Food and Drug Administration June 3, 2009 China Medical University Taichung,
More informationBayes Factors for Grouped Data
Bayes Factors for Grouped Data Lizanne Raubenheimer and Abrie J. van der Merwe 2 Department of Statistics, Rhodes University, Grahamstown, South Africa, L.Raubenheimer@ru.ac.za 2 Department of Mathematical
More informationBayesian inference J. Daunizeau
Bayesian inference J. Daunizeau Brain and Spine Institute, Paris, France Wellcome Trust Centre for Neuroimaging, London, UK Overview of the talk 1 Probabilistic modelling and representation of uncertainty
More informationStatistics in medicine
Statistics in medicine Lecture 3: Bivariate association : Categorical variables Proportion in one group One group is measured one time: z test Use the z distribution as an approximation to the binomial
More informationPB HLTH 240A: Advanced Categorical Data Analysis Fall 2007
Cohort study s formulations PB HLTH 240A: Advanced Categorical Data Analysis Fall 2007 Srine Dudoit Division of Biostatistics Department of Statistics University of California, Berkeley www.stat.berkeley.edu/~srine
More informationTUTORIAL 8 SOLUTIONS #
TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level
More informationParameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn
Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Temperature fluctuations Variance at multipole l (angle ~180o/l) C. Porciani Estimation
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationRelated Concepts: Lecture 9 SEM, Statistical Modeling, AI, and Data Mining. I. Terminology of SEM
Lecture 9 SEM, Statistical Modeling, AI, and Data Mining I. Terminology of SEM Related Concepts: Causal Modeling Path Analysis Structural Equation Modeling Latent variables (Factors measurable, but thru
More information(1) Introduction to Bayesian statistics
Spring, 2018 A motivating example Student 1 will write down a number and then flip a coin If the flip is heads, they will honestly tell student 2 if the number is even or odd If the flip is tails, they
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationContents. Part I: Fundamentals of Bayesian Inference 1
Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian
More informationQTL model selection: key players
Bayesian Interval Mapping. Bayesian strategy -9. Markov chain sampling 0-7. sampling genetic architectures 8-5 4. criteria for model selection 6-44 QTL : Bayes Seattle SISG: Yandell 008 QTL model selection:
More informationStatistics for the LHC Lecture 1: Introduction
Statistics for the LHC Lecture 1: Introduction Academic Training Lectures CERN, 14 17 June, 2010 indico.cern.ch/conferencedisplay.py?confid=77830 Glen Cowan Physics Department Royal Holloway, University
More informationOn Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses
On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses Gavin Lynch Catchpoint Systems, Inc., 228 Park Ave S 28080 New York, NY 10003, U.S.A. Wenge Guo Department of Mathematical
More informationA class of latent marginal models for capture-recapture data with continuous covariates
A class of latent marginal models for capture-recapture data with continuous covariates F Bartolucci A Forcina Università di Urbino Università di Perugia FrancescoBartolucci@uniurbit forcina@statunipgit
More informationTABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1
TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8
More informationBayesian Regression (1/31/13)
STA613/CBB540: Statistical methods in computational biology Bayesian Regression (1/31/13) Lecturer: Barbara Engelhardt Scribe: Amanda Lea 1 Bayesian Paradigm Bayesian methods ask: given that I have observed
More informationStatistical Inference
Statistical Inference Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Week 12. Testing and Kullback-Leibler Divergence 1. Likelihood Ratios Let 1, 2, 2,...
More information