SIGNAL RANKING-BASED COMPARISON OF AUTOMATIC DETECTION METHODS IN PHARMACOVIGILANCE


1 SIGNAL RANKING-BASED COMPARISON OF AUTOMATIC DETECTION METHODS IN PHARMACOVIGILANCE: A HYPOTHESIS TEST APPROACH
Ismaïl Ahmed (1,2), Françoise Haramburu (3,4), Annie Fourrier-Réglat (3,4,5), Frantz Thiessard (4,5,6), Carmen Kreft-Jais (7), Ghada Miremont-Salamé (3,4,5), Bernard Bégaud (3,4,5), Pascale Tubert-Bitter (1,2)
  1: Inserm, U780, Villejuif
  2: Univ Paris-Sud, Villejuif
  3: Inserm, U657, Bordeaux
  4: Pellegrin Hospital, Bordeaux
  5: Université Victor Segalen Bordeaux 2, Bordeaux
  6: Inserm, U593, Bordeaux
  7: Afssaps, Saint-Denis
Statistics for health registers and linked databases, Open University, May 2009

2 Introduction (1)
Pharmacovigilance: post-marketing surveillance
  Objectives
    Detection: identification, as early as possible, of adverse drug reactions (ADRs) not observed during clinical trials (rare, latent, or affecting sub-groups)
    Characterization of new risks
  Data sources: spontaneous reporting systems
Spontaneous reporting data: features
  Life-size capture of events in the exposed population
  Under-reporting, unknown baseline incidence and exposure

3 Introduction (2)
Framework
  Very large databases (spontaneous reports)
  Incorporating a statistical tool for signal detection
  Statistical associations within the database (automatic detection)
  Signals
  Potential adverse drug reactions (ADRs)

4 Introduction (3)
Pharmacovigilance database
  Large contingency table crossing all the drugs (D) and all the adverse events (AEs)
  French database: 672 drugs $\times$ 820 AEs (80% of the cells are empty)

                 Adverse event i    Other adverse events
  Drug j         $n_{ij}$
  Other drugs

Automatic signal detection methods
  Frequentist methods:
    Proportional Reporting Ratio (UK)
    Reporting Odds Ratio (Netherlands)
  Bayesian methods:
    Gamma Poisson Shrinkage (USA)
    Bayesian Confidence Propagation Neural Network (WHO)

5 Introduction (4)
Limits of the current methods
  Thresholds for the statistics of interest are chosen arbitrarily
  They do not take the multiple comparisons into account
Proposed approach
  Account for the multiple comparisons when choosing a threshold
  Through the use of recent False Discovery Rate (FDR) approaches
  This leads to alternative statistics of interest:
    P-values for the frequentist methods
    Posterior probabilities of the null hypothesis for the Bayesian methods

6 Outline
  Description of the current methods
  Extension to the multiple comparison setting
  Simulation study
  Application to the French data
  Discussion

7 Some notations
For a particular pair (AE i, D j):

                          Drug j               Other drugs                Total
  Adverse event i         $n_{ij}$             $n_{i\bar{j}}$             $n_{i.}$
  Other adverse events    $n_{\bar{i}j}$       $n_{\bar{i}\bar{j}}$       $n_{\bar{i}.}$
  Total                   $n_{.j}$             $n_{.\bar{j}}$             $n$

  $n_{ij}$: number of reports involving AE i and D j
  $n_{i.}$: marginal count involving AE i
  $n_{.j}$: marginal count involving D j
  $n$: total count over all AE-D pairs

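8 Frequentist methods
Reporting Odds Ratio (ROR), van Puijenbroek et al.
  For the adverse event-drug pair (i, j):
    $\hat{\psi}_{ij} = \dfrac{n_{ij}\, n_{\bar{i}\bar{j}}}{n_{\bar{i}j}\, n_{i\bar{j}}}$
  $\ln(\hat{\psi}_{ij})$ is assumed to follow a normal distribution with variance:
    $\widehat{\mathrm{var}}\{\ln(\hat{\psi}_{ij})\} = \dfrac{1}{n_{ij}} + \dfrac{1}{n_{\bar{i}\bar{j}}} + \dfrac{1}{n_{\bar{i}j}} + \dfrac{1}{n_{i\bar{j}}}$
  A signal is generated if:
    $\ln(\hat{\psi}_{ij}) - 1.96\, \widehat{\mathrm{var}}\{\ln(\hat{\psi}_{ij})\}^{1/2} > 0$
Proportional Reporting Ratio (PRR), Evans et al.
  Same idea, but with the relative risk as the association measure of interest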
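The ROR decision rule on this slide can be illustrated with a minimal Python sketch. The function name `ror_signal` and the argument names are hypothetical; the four arguments map to $n_{ij}$, $n_{i\bar{j}}$, $n_{\bar{i}j}$ and $n_{\bar{i}\bar{j}}$ in that order, and all four cells are assumed to be positive so that the normal approximation is defined.

```python
import math


def ror_signal(n11, n10, n01, n00, z=1.96):
    """Return (ROR, lower bound of ln ROR, signal flag) for one 2x2 table.

    n11: reports with adverse event i and drug j
    n10: reports with adverse event i and other drugs
    n01: reports with other adverse events and drug j
    n00: reports with other adverse events and other drugs
    """
    if min(n11, n10, n01, n00) <= 0:
        raise ValueError("all four cells must be positive")
    ror = (n11 * n00) / (n01 * n10)
    # Variance of ln(ROR): sum of the reciprocal cell counts.
    var_log = 1 / n11 + 1 / n00 + 1 / n01 + 1 / n10
    lower = math.log(ror) - z * math.sqrt(var_log)
    # Signal when the lower bound of the interval for ln(ROR) exceeds 0.
    return ror, lower, lower > 0


ror, lower, is_signal = ror_signal(20, 80, 100, 9800)
print(ror, is_signal)
```

Cells with a zero count would need separate handling (for example a continuity correction), which this sketch deliberately leaves out.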

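9 Bayesian Gamma Poisson Shrinkage (GPS), DuMouchel (1999)
Poisson model with a mixture of two gamma distributions as prior
  $n_{ij} \mid e_{ij}, \lambda_{ij} \sim Pn(\lambda_{ij} e_{ij})$ with $e_{ij} = \dfrac{n_{i.}\, n_{.j}}{n}$
  $\lambda_{ij} \sim \hat{w}\, Ga(\hat{\alpha}_1, \hat{\beta}_1) + (1 - \hat{w})\, Ga(\hat{\alpha}_2, \hat{\beta}_2)$
  where $(\hat{w}, \hat{\alpha}_1, \hat{\alpha}_2, \hat{\beta}_1, \hat{\beta}_2)$ maximizes the marginal likelihood
    $\prod_{ij} \left[ w\, f_{Bn}\{n_{ij}; \alpha_1, \beta_1/(\beta_1 + e_{ij})\} + (1 - w)\, f_{Bn}\{n_{ij}; \alpha_2, \beta_2/(\beta_2 + e_{ij})\} \right]$
Association measure
  $\lambda_{ij} \mid n_{ij}, e_{ij} \sim w_{ij}\, Ga(\hat{\alpha}_1 + n_{ij}, \hat{\beta}_1 + e_{ij}) + (1 - w_{ij})\, Ga(\hat{\alpha}_2 + n_{ij}, \hat{\beta}_2 + e_{ij})$
Signal generation
  $Q_{0.05}(\lambda_{ij}) > 2$, Szarfman et al. (2002)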

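10 Bayesian Confidence Propagation Neural Network (BCPNN) (1), Bate et al. (1998), Noren et al. (2006)
Multinomial-Dirichlet model
  $(n_{ij}, n_{i\bar{j}}, n_{\bar{i}j}, n_{\bar{i}\bar{j}}) \sim Mu(n, p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}})$
  with $(p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}}) \sim Di(\alpha_{ij}, \alpha_{i\bar{j}}, \alpha_{\bar{i}j}, \alpha_{\bar{i}\bar{j}})$
  The hyperparameters depend on the cell counts
The posterior distribution of $(p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}})$ is also a Dirichlet:
  $(p_{ij}, p_{i\bar{j}}, p_{\bar{i}j}, p_{\bar{i}\bar{j}}) \sim Di(\gamma_{ij}, \gamma_{i\bar{j}}, \gamma_{\bar{i}j}, \gamma_{\bar{i}\bar{j}})$ with $\gamma_{kl} = \alpha_{kl} + n_{kl}$
In particular:
  $p_{ij} \sim Be(\gamma_{ij},\ \gamma_{i\bar{j}} + \gamma_{\bar{i}j} + \gamma_{\bar{i}\bar{j}})$
  $p_{i.} = p_{ij} + p_{i\bar{j}} \sim Be(\gamma_{ij} + \gamma_{i\bar{j}},\ \gamma_{\bar{i}j} + \gamma_{\bar{i}\bar{j}})$
  $p_{.j} = p_{ij} + p_{\bar{i}j} \sim Be(\gamma_{ij} + \gamma_{\bar{i}j},\ \gamma_{i\bar{j}} + \gamma_{\bar{i}\bar{j}})$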

11 Bayesian Confidence Propagation Neural Network (BCPNN) (2), Bate et al. (1998), Noren et al. (2006)
Association measure
  $IC_{ij} = \log_2\!\left( \dfrac{p_{ij}}{p_{i.}\, p_{.j}} \right)$
  Ratio of beta distributions
  No analytic form
Signal generation
  $Q(IC_{ij}) > 0$, where $Q$ is a lower posterior quantile of $IC_{ij}$
  Interpolation model built from Monte Carlo simulations: Noren et al. (2006)
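Because the posterior of $IC_{ij}$ has no analytic form, its quantile can be approximated by direct Monte Carlo sampling from the Dirichlet posterior. The following Python sketch illustrates that idea only; it is not the interpolation model of Noren et al. The function name `ic_lower_quantile`, the symmetric prior value `prior=0.5`, and the quantile level `level=0.025` are assumptions chosen for illustration, since the slides do not state them.

```python
import math
import random


def ic_lower_quantile(counts, prior=0.5, level=0.025, draws=5000, seed=0):
    """Monte Carlo lower quantile of IC for one 2x2 table.

    counts = (n11, n10, n01, n00): event and drug, event and other drugs,
    other events and drug, other events and other drugs.
    prior and level are illustrative assumptions, not values from the slides.
    """
    rng = random.Random(seed)
    # Posterior Dirichlet parameters: gamma_kl = alpha_kl + n_kl.
    gammas = [prior + c for c in counts]
    samples = []
    for _ in range(draws):
        # A Dirichlet draw is a vector of independent gamma draws, normalised.
        g = [rng.gammavariate(a, 1.0) for a in gammas]
        total = sum(g)
        p11, p10, p01, _p00 = (x / total for x in g)
        # IC = log2( p_ij / (p_i. * p_.j) )
        samples.append(math.log2(p11 / ((p11 + p10) * (p11 + p01))))
    samples.sort()
    return samples[int(level * draws)]


# A cell is flagged when the lower quantile of IC is above 0.
print(ic_lower_quantile((50, 50, 50, 10000)) > 0)
```

Sampling each cell separately is slow for a full database, which is presumably why an interpolation model is used in practice.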

12 Outline
  Description of the current methods
  Extension to the multiple comparison setting
  Simulation study
  Application to the French data
  Discussion

13 False Discovery Rate and Pharmacovigilance
Automatic signal detection methods are data mining tools
  Extension to the hypothesis testing framework, relying on recent developments in the statistical field of multiple comparisons
  This gives detection thresholds based on statistical criteria
False Discovery Rate (Benjamini and Hochberg (1995))
  E(proportion of false discoveries among the generated signals)
  Used in genomic data analysis
  Adapted to massive comparisons and exploratory analysis

14 Frequentist methods - Proposed approach (1)
New statistic of interest: P-values
  e.g. ROR: for each cell, we want to test $H_{0ij}: \psi_{ij} \le \psi_0$
  The corresponding P-values are
    $p_{ij} = 1 - \Phi\!\left( \dfrac{\ln(\hat{\psi}_{ij}) - \ln(\psi_0)}{\mathrm{var}[\ln(\hat{\psi}_{ij})]^{1/2}} \right)$
  where $\Phi$ denotes the standard normal cdf
  The current decision rule corresponds to choosing $\psi_0 = 1$ and generating signals for cells with $p_{ij} < 0.025$ (since $1 - \Phi(1.96) = 0.025$)
  Exactly the same idea applies to the PRR method
Alternative: mid-P-values from Fisher's exact test (midrfet)
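The one-sided P-value for the ROR can be computed directly from a 2x2 table. This is a minimal Python sketch under the normal approximation of the slide; `ror_p_value` and its argument names are hypothetical, with the cells given in the order $n_{ij}$, $n_{i\bar{j}}$, $n_{\bar{i}j}$, $n_{\bar{i}\bar{j}}$ and assumed positive.

```python
import math
from statistics import NormalDist


def ror_p_value(n11, n10, n01, n00, psi0=1.0):
    """One-sided P-value for H0: psi <= psi0, using the normal approximation."""
    log_ror = math.log((n11 * n00) / (n01 * n10))
    sd = math.sqrt(1 / n11 + 1 / n00 + 1 / n01 + 1 / n10)
    z = (log_ror - math.log(psi0)) / sd
    # 1 - Phi(z) equals Phi(-z); the second form is more accurate in the tail.
    return NormalDist().cdf(-z)


p_null_1 = ror_p_value(5, 95, 300, 9600, psi0=1.0)
p_null_2 = ror_p_value(5, 95, 300, 9600, psi0=2.0)
print(p_null_1, p_null_2)
```

Raising `psi0` makes the null hypothesis harder to reject, so the P-value for the same cell increases.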

15 Frequentist methods - Proposed approach (2)
FDR estimation
  P-values are assumed to follow a mixture of two distributions:
    $F(p) = \pi_0 F_0(p) + (1 - \pi_0) F_1(p)$
    $F_0(p)$ is the cdf of $p$ under the null hypothesis
    $F_1(p)$ is the cdf of $p$ under the alternative hypothesis
  For a P-value rejection region $[0, \gamma]$ with $\gamma \in\, ]0, 1]$:
    $\mathrm{FDR}(\gamma) = \dfrac{\pi_0 F_0(\gamma)}{F(\gamma)}$
  The main difficulty is to estimate $\pi_0$
    qvalue: Storey (2003); LBE: Dalmasso et al. (2005)
    They rely on few distributional assumptions
    They provide an upper bound of the FDR
    They were developed for single null hypotheses, which give a uniform distribution of the P-values under $H_0$ ($F_0(\gamma) = \gamma$)
  In our case the null hypothesis is one-sided
    The distribution of the P-value is not uniform
    But we can use those procedures on a transformed P-value [formula not recoverable from the transcription]
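The plug-in formula $\mathrm{FDR}(\gamma) = \pi_0 F_0(\gamma) / F(\gamma)$ can be illustrated in Python under the uniform-null assumption $F_0(\gamma) = \gamma$. This sketch is not the qvalue or LBE procedure: it uses a simple estimate of $\pi_0$ in the spirit of Storey, with a fixed tuning value `lam=0.5` chosen here as an assumption, and the empirical cdf for $F$. The function names are hypothetical.

```python
def estimate_pi0(p_values, lam=0.5):
    """Simple estimate of the proportion of true nulls.

    Uses #{p > lam} / (m * (1 - lam)), capped at 1; lam is an assumed tuning value.
    """
    m = len(p_values)
    return min(1.0, sum(p > lam for p in p_values) / (m * (1.0 - lam)))


def estimate_fdr(p_values, gamma, pi0=None):
    """Plug-in FDR estimate for the rejection region [0, gamma].

    Assumes uniform P-values under the null, so F0(gamma) = gamma,
    and estimates F(gamma) by the proportion of P-values <= gamma.
    """
    if pi0 is None:
        pi0 = estimate_pi0(p_values)
    m = len(p_values)
    rejected = sum(p <= gamma for p in p_values)
    if rejected == 0:
        return 0.0
    return min(1.0, pi0 * gamma * m / rejected)


p_values = [0.001, 0.002, 0.003, 0.004,
            0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
print(estimate_fdr(p_values, gamma=0.01))
```

Because the estimate of $\pi_0$ is conservative, the resulting FDR estimate tends to err on the high side, which matches the upper-bound property mentioned on the slide.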

16 Bayesian methods - Proposed approach (1)
New statistic of interest: posterior probability of the null hypothesis
  For each cell, we want to test
    $H_{0ij}: \lambda_{ij} \le R_0$ for the GPS model
    $H_{0ij}: \dfrac{p_{ij}}{p_{i.}\, p_{.j}} \le R_0$ for the BCPNN model
  and thus to calculate the posterior probability of $H_{0ij}$:
    $\Pr(\lambda_{ij} \le R_0 \mid \text{data})$ for the GPS model
    $\Pr(IC_{ij} \le \log_2(R_0) \mid \text{data})$ for the BCPNN model
  The current decision rules correspond to
    $R_0 = 2$ and $\Pr(\lambda_{ij} \le R_0 \mid \text{data}) \le 0.05$ for the GPS model
    $R_0 = 1$ and a fixed threshold on $\Pr(IC_{ij} \le \log_2(R_0) \mid \text{data})$ for the BCPNN model

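17 Bayesian methods - Proposed approach (2): FDR estimation
Based on the Bayesian decision theory framework, Müller et al. (2004)
  Status $z_{ij} \in \{0, 1\}$
  Decision $d_{ij} \in \{0, 1\}$
FDR
  $\mathrm{FDP} = \dfrac{\sum_{ij} (1 - z_{ij})\, d_{ij}}{\sum_{ij} d_{ij}}$, $\mathrm{FDR} = E[\mathrm{FDP}]$
Bayesian FDR estimation
  $E[\mathrm{FDP} \mid \text{data}] = \dfrac{\sum_{ij} v_{ij}\, d_{ij}}{\sum_{ij} d_{ij}}$
  where $v_{ij} = \Pr(z_{ij} = 0 \mid \text{data})$ is the posterior probability of $H_{0ij}$:
    $\Pr(\lambda_{ij} \le R_0 \mid \text{data})$ for the GPS model
    $\Pr(IC_{ij} \le \log_2(R_0) \mid \text{data})$ for the BCPNN model
  $d_{ij} = 1_{[v_{ij} \le \alpha]}$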
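The Bayesian FDR estimate is the mean of the posterior null probabilities over the flagged cells. This short Python sketch shows the calculation; the function names `bayesian_fdr` and `fdr_by_rank` are hypothetical, and the input is assumed to be a flat list of the $v_{ij}$ values, however they were obtained.

```python
def bayesian_fdr(post_null, alpha):
    """Estimated FDR and number of signals when flagging cells with v <= alpha."""
    flagged = [v for v in post_null if v <= alpha]
    if not flagged:
        return 0.0, 0
    # E[FDP | data] = sum(v * d) / sum(d), with d = 1 for the flagged cells.
    return sum(flagged) / len(flagged), len(flagged)


def fdr_by_rank(post_null):
    """Estimated FDR when the k top-ranked cells are flagged, for k = 1..m.

    Cells are ranked by increasing posterior probability of the null.
    """
    running, curve = 0.0, []
    for k, v in enumerate(sorted(post_null), start=1):
        running += v
        curve.append(running / k)
    return curve


v = [0.01, 0.03, 0.2, 0.6]
print(bayesian_fdr(v, alpha=0.05))
print(fdr_by_rank(v))
```

The second function gives the estimated FDR as a function of the number of generated signals, which is the kind of curve used to compare methods on a common ranking scale.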

18 Outline
  Description of the current methods
  Extension to the multiple comparison setting
  Simulation study
  Application to the French data
  Discussion

19 Simulation study
Data generation
  Model: $n_{ij} \sim Mu(n, p_{ij})$, from the French database
    $p^{w}_{i.} \sim Di(n_{i.})$, $p^{w}_{.j} \sim Di(n_{.j})$
    $p_{ij} = \dfrac{r^{w}_{ij}\, p^{w}_{i.}\, p^{w}_{.j}}{\sum_{ij} r^{w}_{ij}\, p^{w}_{i.}\, p^{w}_{.j}}$
    $\log(r^{w}_{ij}) \sim Lo(0, 0.5)$
  From the $p_{ij}$: the $n_{ij}$'s and the true marginal probabilities $p_{i.} = \sum_j p_{ij}$ and $p_{.j} = \sum_i p_{ij}$
  Real status of the cells according to
    $\psi_{ij}$ and $\psi_0$ for the frequentist methods
    $R_{ij} = \dfrac{p_{ij}}{p_{i.}\, p_{.j}}$ and $R_0$ for the Bayesian methods
Simulation plan
  500 simulated datasets
  The FDRs are calculated for cells with $n_{ij} \ge 3$
  Simulation for the ROR, midrfet, GPS and BCPNN methods
  Results are presented for $\{\psi_0, R_0\} = 1$ and $2$

20 Simulation results - Comparison of the methods
[Figure: true FDRs (Monte Carlo estimation) against the average number of generated signals, for midrfet, ROR, BCPNN and GPS. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]

21 Simulation results - midrfet
[Figure: true and estimated FDRs against the average number of generated signals. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
Overestimation of the FDR (as expected)

22 Simulation results - ROR
[Figure: true and estimated FDRs against the average number of generated signals. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
Normal approximation: underestimation

23 Simulation results - GPS
[Figure: true and estimated FDRs against the average number of generated signals. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]

24 Simulation results - BCPNN
[Figure: true and estimated FDRs against the average number of generated signals. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
Underestimation

25 Simulation results - Comparison of the methods
[Figure: true and estimated FDRs against the average number of generated signals, for midrfet, ROR, BCPNN and GPS. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]

26 Outline
  Description of the current methods
  Extension to the multiple comparison setting
  Simulation study
  Application to the French data
  Discussion

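27 Application to the French database
[Figure: results against the number of generated signals, for midrfet, ROR, BCPNN and GPS. Panel (a): $\psi_0 = 1$, $R_0 = 1$. Panel (b): $\psi_0 = 2$, $R_0 = 2$.]
Current decision rules
  [Table: number of signals (Sig.) and FDR for ROR, BCPNN and GPS; values not recoverable from the transcription]
Based on the FDR, e.g. GPS
  [Table: number of signals (Sig.) for each value of $R_0$; values not recoverable from the transcription]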

28 Outline
  Description of the current methods
  Extension to the multiple comparison setting
  Simulation study
  Application to the French data
  Discussion

29 Discussion
Extension of the current methods to the multiple comparison framework
  No modification of the models
  New decision rules
The FDR is calculated within the database
  Spontaneous reporting databases are subject to several sources of bias
  True associations in the database may not reflect the situation in the population
  The FDR is a measure for evaluating and comparing the performance of the automatic signal detection methods
Performances are close for all the automatic methods
The GPS model provides better FDR estimates

30 References
I. Ahmed et al. FDR estimation for frequentist pharmacovigilance signal detection methods. Biometrics, in press.
I. Ahmed et al. Bayesian pharmacovigilance signal detection methods revisited in a multiple comparison setting. Statistics in Medicine, in press.
A. Bate et al. A Bayesian neural network method for adverse drug reaction signal generation. European Journal of Clinical Pharmacology, 54(4), 1998.
Y. Benjamini and Y. Hochberg. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57(1), 1995.
C. Dalmasso et al. A simple procedure for estimating the false discovery rate. Bioinformatics, 21(5), 2005.
W. DuMouchel. Bayesian data mining in large frequency tables, with an application to the FDA spontaneous reporting system. The American Statistician, 53(3), 1999.
P. Müller et al. Optimal sample size for multiple testing: the case of gene expression microarrays. Journal of the American Statistical Association, 99, 2004.
G. N. Norén et al. Extending the methods used to screen the WHO drug safety database towards analysis of complex associations and improved accuracy for rare events. Statistics in Medicine, 25(21), 2006.
J. D. Storey. The positive false discovery rate: a Bayesian interpretation and the q-value. The Annals of Statistics, 31(6), 2003.
A. Szarfman et al. Use of screening algorithms and computer systems to efficiently signal higher-than-expected combinations of drugs and events in the US FDA's spontaneous reports database. Drug Safety, 25(6), 2002.
E. P. van Puijenbroek et al. A comparison of measures of disproportionality for signal detection in spontaneous reporting systems for adverse drug reactions. Pharmacoepidemiology and Drug Safety, 11(1):3-10.


More information

ROI ANALYSIS OF PHARMAFMRI DATA:

ROI ANALYSIS OF PHARMAFMRI DATA: ROI ANALYSIS OF PHARMAFMRI DATA: AN ADAPTIVE APPROACH FOR GLOBAL TESTING Giorgos Minas, John A.D. Aston, Thomas E. Nichols and Nigel Stallard Department of Statistics and Warwick Centre of Analytical Sciences,

More information

Regularized Regression A Bayesian point of view

Regularized Regression A Bayesian point of view Regularized Regression A Bayesian point of view Vincent MICHEL Director : Gilles Celeux Supervisor : Bertrand Thirion Parietal Team, INRIA Saclay Ile-de-France LRI, Université Paris Sud CEA, DSV, I2BM,

More information

Statistical Methods for Astronomy

Statistical Methods for Astronomy Statistical Methods for Astronomy If your experiment needs statistics, you ought to have done a better experiment. -Ernest Rutherford Lecture 1 Lecture 2 Why do we need statistics? Definitions Statistical

More information

Stat 535 C - Statistical Computing & Monte Carlo Methods. Arnaud Doucet.

Stat 535 C - Statistical Computing & Monte Carlo Methods. Arnaud Doucet. Stat 535 C - Statistical Computing & Monte Carlo Methods Arnaud Doucet Email: arnaud@cs.ubc.ca 1 CS students: don t forget to re-register in CS-535D. Even if you just audit this course, please do register.

More information

Central Limit Theorem ( 5.3)

Central Limit Theorem ( 5.3) Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately

More information

Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model

Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model Centre for Molecular, Environmental, Genetic & Analytic (MEGA) Epidemiology School of Population

More information

Bayesian inference. Justin Chumbley ETH and UZH. (Thanks to Jean Denizeau for slides)

Bayesian inference. Justin Chumbley ETH and UZH. (Thanks to Jean Denizeau for slides) Bayesian inference Justin Chumbley ETH and UZH (Thanks to Jean Denizeau for slides) Overview of the talk Introduction: Bayesian inference Bayesian model comparison Group-level Bayesian model selection

More information

Lecture 7 April 16, 2018

Lecture 7 April 16, 2018 Stats 300C: Theory of Statistics Spring 2018 Lecture 7 April 16, 2018 Prof. Emmanuel Candes Scribe: Feng Ruan; Edited by: Rina Friedberg, Junjie Zhu 1 Outline Agenda: 1. False Discovery Rate (FDR) 2. Properties

More information

Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004

Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004 Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004 Multiple testing methods to control the False Discovery Rate (FDR),

More information

An introduction to Bayesian inference and model comparison J. Daunizeau

An introduction to Bayesian inference and model comparison J. Daunizeau An introduction to Bayesian inference and model comparison J. Daunizeau ICM, Paris, France TNU, Zurich, Switzerland Overview of the talk An introduction to probabilistic modelling Bayesian model comparison

More information

Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses

Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses Amit Zeisel, Or Zuk, Eytan Domany W.I.S. June 5, 29 Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving

More information

Statistical methods for large scale exploratory analysis of post-marketing drug safety data

Statistical methods for large scale exploratory analysis of post-marketing drug safety data Mathematical Statistics Stockholm University Statistical methods for large scale exploratory analysis of post-marketing drug safety data G. Niklas Norén Research Report 2005:9 Licentiate thesis ISSN 1650-0377

More information

Modified Simes Critical Values Under Positive Dependence

Modified Simes Critical Values Under Positive Dependence Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia

More information

Bayesian Nonparametric Regression for Diabetes Deaths

Bayesian Nonparametric Regression for Diabetes Deaths Bayesian Nonparametric Regression for Diabetes Deaths Brian M. Hartman PhD Student, 2010 Texas A&M University College Station, TX, USA David B. Dahl Assistant Professor Texas A&M University College Station,

More information

Non-specific filtering and control of false positives

Non-specific filtering and control of false positives Non-specific filtering and control of false positives Richard Bourgon 16 June 2009 bourgon@ebi.ac.uk EBI is an outstation of the European Molecular Biology Laboratory Outline Multiple testing I: overview

More information

Pattern Recognition and Machine Learning

Pattern Recognition and Machine Learning Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability

More information

discovery rate control

discovery rate control Optimal design for high-throughput screening via false discovery rate control arxiv:1707.03462v1 [stat.ap] 11 Jul 2017 Tao Feng 1, Pallavi Basu 2, Wenguang Sun 3, Hsun Teresa Ku 4, Wendy J. Mack 1 Abstract

More information

Neutral Bayesian reference models for incidence rates of (rare) clinical events

Neutral Bayesian reference models for incidence rates of (rare) clinical events Neutral Bayesian reference models for incidence rates of (rare) clinical events Jouni Kerman Statistical Methodology, Novartis Pharma AG, Basel BAYES2012, May 10, Aachen Outline Motivation why reference

More information

Large-Scale Hypothesis Testing

Large-Scale Hypothesis Testing Chapter 2 Large-Scale Hypothesis Testing Progress in statistics is usually at the mercy of our scientific colleagues, whose data is the nature from which we work. Agricultural experimentation in the early

More information

Aliaksandr Hubin University of Oslo Aliaksandr Hubin (UIO) Bayesian FDR / 25

Aliaksandr Hubin University of Oslo Aliaksandr Hubin (UIO) Bayesian FDR / 25 Presentation of The Paper: The Positive False Discovery Rate: A Bayesian Interpretation and the q-value, J.D. Storey, The Annals of Statistics, Vol. 31 No.6 (Dec. 2003), pp 2013-2035 Aliaksandr Hubin University

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 11 January 7, 2013 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline How to communicate the statistical uncertainty

More information

Bayesian Inference. Chapter 2: Conjugate models

Bayesian Inference. Chapter 2: Conjugate models Bayesian Inference Chapter 2: Conjugate models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in

More information

The optimal discovery procedure: a new approach to simultaneous significance testing

The optimal discovery procedure: a new approach to simultaneous significance testing J. R. Statist. Soc. B (2007) 69, Part 3, pp. 347 368 The optimal discovery procedure: a new approach to simultaneous significance testing John D. Storey University of Washington, Seattle, USA [Received

More information

High-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018

High-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018 High-Throughput Sequencing Course Multiple Testing Biostatistics and Bioinformatics Summer 2018 Introduction You have previously considered the significance of a single gene Introduction You have previously

More information

Sample Size Estimation for Studies of High-Dimensional Data

Sample Size Estimation for Studies of High-Dimensional Data Sample Size Estimation for Studies of High-Dimensional Data James J. Chen, Ph.D. National Center for Toxicological Research Food and Drug Administration June 3, 2009 China Medical University Taichung,

More information

Bayes Factors for Grouped Data

Bayes Factors for Grouped Data Bayes Factors for Grouped Data Lizanne Raubenheimer and Abrie J. van der Merwe 2 Department of Statistics, Rhodes University, Grahamstown, South Africa, L.Raubenheimer@ru.ac.za 2 Department of Mathematical

More information

Bayesian inference J. Daunizeau

Bayesian inference J. Daunizeau Bayesian inference J. Daunizeau Brain and Spine Institute, Paris, France Wellcome Trust Centre for Neuroimaging, London, UK Overview of the talk 1 Probabilistic modelling and representation of uncertainty

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 3: Bivariate association : Categorical variables Proportion in one group One group is measured one time: z test Use the z distribution as an approximation to the binomial

More information

PB HLTH 240A: Advanced Categorical Data Analysis Fall 2007

PB HLTH 240A: Advanced Categorical Data Analysis Fall 2007 Cohort study s formulations PB HLTH 240A: Advanced Categorical Data Analysis Fall 2007 Srine Dudoit Division of Biostatistics Department of Statistics University of California, Berkeley www.stat.berkeley.edu/~srine

More information

TUTORIAL 8 SOLUTIONS #

TUTORIAL 8 SOLUTIONS # TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level

More information

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn

Parameter estimation and forecasting. Cristiano Porciani AIfA, Uni-Bonn Parameter estimation and forecasting Cristiano Porciani AIfA, Uni-Bonn Questions? C. Porciani Estimation & forecasting 2 Temperature fluctuations Variance at multipole l (angle ~180o/l) C. Porciani Estimation

More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for

More information

Related Concepts: Lecture 9 SEM, Statistical Modeling, AI, and Data Mining. I. Terminology of SEM

Related Concepts: Lecture 9 SEM, Statistical Modeling, AI, and Data Mining. I. Terminology of SEM Lecture 9 SEM, Statistical Modeling, AI, and Data Mining I. Terminology of SEM Related Concepts: Causal Modeling Path Analysis Structural Equation Modeling Latent variables (Factors measurable, but thru

More information

(1) Introduction to Bayesian statistics

(1) Introduction to Bayesian statistics Spring, 2018 A motivating example Student 1 will write down a number and then flip a coin If the flip is heads, they will honestly tell student 2 if the number is even or odd If the flip is tails, they

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Contents. Part I: Fundamentals of Bayesian Inference 1

Contents. Part I: Fundamentals of Bayesian Inference 1 Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian

More information

QTL model selection: key players

QTL model selection: key players Bayesian Interval Mapping. Bayesian strategy -9. Markov chain sampling 0-7. sampling genetic architectures 8-5 4. criteria for model selection 6-44 QTL : Bayes Seattle SISG: Yandell 008 QTL model selection:

More information

Statistics for the LHC Lecture 1: Introduction

Statistics for the LHC Lecture 1: Introduction Statistics for the LHC Lecture 1: Introduction Academic Training Lectures CERN, 14 17 June, 2010 indico.cern.ch/conferencedisplay.py?confid=77830 Glen Cowan Physics Department Royal Holloway, University

More information

On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses

On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses Gavin Lynch Catchpoint Systems, Inc., 228 Park Ave S 28080 New York, NY 10003, U.S.A. Wenge Guo Department of Mathematical

More information

A class of latent marginal models for capture-recapture data with continuous covariates

A class of latent marginal models for capture-recapture data with continuous covariates A class of latent marginal models for capture-recapture data with continuous covariates F Bartolucci A Forcina Università di Urbino Università di Perugia FrancescoBartolucci@uniurbit forcina@statunipgit

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Bayesian Regression (1/31/13)

Bayesian Regression (1/31/13) STA613/CBB540: Statistical methods in computational biology Bayesian Regression (1/31/13) Lecturer: Barbara Engelhardt Scribe: Amanda Lea 1 Bayesian Paradigm Bayesian methods ask: given that I have observed

More information

Statistical Inference

Statistical Inference Statistical Inference Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA Week 12. Testing and Kullback-Leibler Divergence 1. Likelihood Ratios Let 1, 2, 2,...

More information