Statistical Methods in Clinical Trials Categorical Data


Types of data
Quantitative: continuous (e.g. blood pressure, time to event) or discrete (e.g. number of relapses)
Qualitative (categorical): nominal (e.g. sex) or ordered categorical (e.g. pain level)

Types of data analysis (inference): parametric vs non-parametric, model-based vs data-driven, frequentist vs Bayesian.

Categorical data In an RCT, endpoints and surrogate endpoints can be categorical or ordered categorical variables. In the simplest case we have a binary response (e.g. responder vs non-responder). In outcomes research it is common to use several ordered categories (no improvement, moderate improvement, high improvement). Examples of binary outcomes: remission; mortality; presence/absence of an AE; responder/non-responder according to some pre-defined criterion; success/failure.

Two proportions Sometimes we want to compare the proportions of successes in two separate groups. For this purpose we take two samples of sizes n1 and n2. Let yi and p̂i = yi/ni be the observed number and proportion of successes in the i-th group. The difference in population proportions of successes is estimated by

    p̂1 − p̂2,

and its large-sample variance is estimated by

    p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2.

Two proportions (continued) Assume we want to test the null hypothesis that there is no difference between the proportions of success in the two groups. Under the null hypothesis we can estimate the common proportion by

    p̂ = (y1 + y2)/(n1 + n2).

Its large-sample variance is estimated by

    p̂(1 − p̂)(1/n1 + 1/n2),

leading to the test statistic

    z = (p̂1 − p̂2) / sqrt( p̂(1 − p̂)(1/n1 + 1/n2) ).

Example: NINDS trial in acute ischemic stroke

    Treatment   n     responders*
    rt-PA       312   147 (47.1%)
    placebo     312   122 (39.1%)

*early improvement defined on a neurological scale

Point estimate: 0.080 (s.e. = 0.0397); 95% CI: (0.003, 0.158); p-value: 0.043
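As a sketch (Python, standard library only; the function names are mine, not from the slides), the pooled two-proportion test above can be applied to the NINDS counts:

```python
from math import sqrt, erf

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def two_proportion_test(y1, n1, y2, n2):
    """Difference in proportions, pooled z statistic, p-value and 95% CI."""
    p1, p2 = y1 / n1, y2 / n2
    p_pool = (y1 + y2) / (n1 + n2)                       # common proportion under H0
    se0 = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se0
    p_value = 2 * (1 - phi(abs(z)))                      # two-sided
    se1 = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # unpooled s.e. for the CI
    ci = (p1 - p2 - 1.96 * se1, p1 - p2 + 1.96 * se1)
    return p1 - p2, z, p_value, ci

diff, z, p, ci = two_proportion_test(147, 312, 122, 312)
```

With these counts the point estimate, z², p-value and CI reproduce the figures quoted in the example.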

Two proportions (Chi square) The problem of comparing two proportions can sometimes be formulated as a problem of independence! Assume we have two groups as above (treatment and placebo). Assume further that the subjects were randomized to these groups. We can then test for independence between belonging to a certain group and the clinical endpoint (success or failure). The data can be organized in the form of a contingency table in which the marginal totals and the total number of subjects are considered as fixed.

2 x 2 contingency table

                 RESPONSE
    TREATMENT    Failure   Success   Total
    Drug           165       147      312
    Placebo        190       122      312
    Total          355       269     N = 624

Hypergeometric distribution Urn containing W white balls and R red balls, N = W + R. n balls are drawn at random without replacement, and Y is the number of white balls (successes) drawn. Then Y follows the hypergeometric distribution with parameters (N, W, n).
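A minimal sketch of the hypergeometric pmf in the urn notation above (standard library only; the function name is mine):

```python
from math import comb

def hypergeom_pmf(y, N, W, n):
    """P(Y = y) for Y ~ HG(N, W, n): y white balls in n draws without replacement."""
    return comb(W, y) * comb(N - W, n - y) / comb(N, n)

# Example: urn with W = 5 white and R = 5 red balls (N = 10), draw n = 4.
probs = [hypergeom_pmf(y, N=10, W=5, n=4) for y in range(5)]
```

The probabilities sum to 1 and the mean of Y is n·W/N, as expected for this distribution.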

Contingency tables N subjects in total; y·1 of these are special (successes); y1· are drawn at random; Y11 is the number of successes among these y1·. In general, Y11 ~ HG(N, y·1, y1·).

Contingency tables The null hypothesis of independence is tested using the (Pearson) chi-square statistic, which under the null hypothesis is chi-square distributed with one degree of freedom, provided the sample sizes in the two groups are large (over 30) and the expected frequency in each cell is non-negligible (over 5).

Contingency tables For moderate sample sizes we use Fisher's exact test, in which the desired probabilities are calculated from the exact hypergeometric distribution; the variance can then be calculated as well. Using this variance and the expectation m11 of the (1,1) cell, we obtain the randomization chi-square statistic. With fixed margins only one cell is free to vary. Randomization is crucial for this approach.
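Fisher's exact test can be sketched directly from the hypergeometric distribution, conditioning on the margins of the 2 x 2 table (a common two-sided convention, used e.g. by SAS, sums the probabilities of all tables no more likely than the observed one; the function name is mine):

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher p: sum of hypergeometric probabilities <= P(observed)."""
    n1, n2, m1 = a + b, c + d, a + c          # row totals and first column total
    N = n1 + n2
    def pmf(x):                               # P(X11 = x | margins fixed)
        return comb(n1, x) * comb(n2, m1 - x) / comb(N, m1)
    p_obs = pmf(a)
    lo, hi = max(0, m1 - n2), min(n1, m1)     # support of X11 given the margins
    return sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-9))

p = fisher_exact_2x2(165, 147, 190, 122)      # NINDS: drug 165/147, placebo 190/122
```

For the NINDS table this reproduces the 2-tail Fisher p-value of about 0.052 shown in the SAS output below.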

The (Pearson) chi-square test The test statistic is

    χ²_P = Σ_i Σ_j (O_ij − E_ij)² / E_ij,

where O_ij are the observed and E_ij the expected frequencies (under independence). The test statistic approximately follows a chi-square distribution.

Example 5 Chi-square test for a 2 x 2 table. Examining the independence between two treatments and a classification into responder/non-responder is equivalent to comparing the proportions of responders in the two groups. NINDS again:

Observed frequencies
                non-resp   responder
    rt-PA          165        147      312
    placebo        190        122      312
                   355        269

Expected frequencies
                non-resp   responder
    rt-PA         177.5      134.5     312
    placebo       177.5      134.5     312
                  355        269
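Example 5 can be recomputed by hand as a sketch: expected counts under independence are (row total × column total)/N, and the Pearson statistic sums (O − E)²/E over the four cells.

```python
observed = [[165, 147],   # rt-PA: non-responder, responder
            [190, 122]]   # placebo
row = [sum(r) for r in observed]
col = [sum(c) for c in zip(*observed)]
N = sum(row)
# Expected counts under independence
expected = [[row[i] * col[j] / N for j in range(2)] for i in range(2)]
chi2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(2) for j in range(2))
```

This reproduces the expected frequencies 177.5 / 134.5 and the chi-square value 4.084 quoted in the example and the SAS output.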

SAS output: TABLE OF GRP BY Y

    Frequency (Row Pct)    nonresp        resp          Total
    placebo                190 (60.90)    122 (39.10)    312
    rt-PA                  165 (52.88)    147 (47.12)    312
    Total                  355            269            624

STATISTICS FOR TABLE OF GRP BY Y

    Statistic                      DF   Value   Prob
    Chi-Square                      1   4.084   0.043
    Likelihood Ratio Chi-Square     1   4.089   0.043
    Continuity Adj. Chi-Square      1   3.764   0.052
    Mantel-Haenszel Chi-Square      1   4.077   0.043
    Fisher's Exact Test (Left)              0.982
                        (Right)             0.026
                        (2-Tail)            0.052
    Phi Coefficient                 0.081
    Contingency Coefficient         0.081
    Cramer's V                      0.081

    Sample Size = 624

Odds, odds ratios and relative risks The odds of success in group i is estimated by p̂i/(1 − p̂i). The odds ratio of success between the two groups is estimated by

    OR = [p̂1/(1 − p̂1)] / [p̂2/(1 − p̂2)].

Define the risk of success in the i-th group as the proportion of cases with success. The relative risk between the two groups is estimated by RR = p̂1/p̂2, and the absolute risk difference by AR = p̂1 − p̂2.
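The three association measures applied to the NINDS 2 x 2 table (group 1 = rt-PA, group 2 = placebo, "success" = responder) can be sketched as:

```python
y1, n1 = 147, 312              # responders / total, rt-PA
y2, n2 = 122, 312              # responders / total, placebo
p1, p2 = y1 / n1, y2 / n2
odds1, odds2 = p1 / (1 - p1), p2 / (1 - p2)
odds_ratio = odds1 / odds2     # OR
relative_risk = p1 / p2        # RR
absolute_risk = p1 - p2        # AR
```

The odds ratio of about 1.387 agrees with the "Odds Ratio" column of the SAS logistic output shown later, and the absolute risk difference of 0.080 with the earlier point estimate.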

Nominal categorical data: e.g. patient residence at end of follow-up (hospital, nursing home, own home, etc.)
Ordinal (ordered) categorical data: e.g. some global rating:
    Normal, not at all ill
    Borderline mentally ill
    Mildly ill
    Moderately ill
    Markedly ill
    Severely ill
    Among the most extremely ill patients

Categorical data & chi-square test

    One factor \ Other factor    A       B       C       D       E       Total
    i                           n_iA    n_iB    n_iC    n_iD    n_iE    n_i
    ii                          n_iiA   n_iiB   n_iiC   n_iiD   n_iiE   n_ii
    iii                         n_iiiA  n_iiiB  n_iiiC  n_iiiD  n_iiiE  n_iii
    Total                       n_A     n_B     n_C     n_D     n_E     n

The chi-square test is useful for detecting a general association between treatment and categorical response (on either the nominal or the ordinal scale), but it cannot identify a particular relationship, e.g. a location shift.

Nominal categorical data

    Disease category     dip   snip   fup   bop   other   Total
    treatment group A     33    15     34    26     8      116
    treatment group B     28    18     34    20    14      114
    Total                 61    33     68    46    22      230

Chi-square test: χ² = 3.084, df = 4, p = 0.544
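The quoted chi-square for this 2 x 5 nominal table can be checked by hand (for df = 4 the chi-square survival function has the closed form e^(−x/2)(1 + x/2), used below):

```python
from math import exp

observed = [[33, 15, 34, 26, 8],    # treatment A
            [28, 18, 34, 20, 14]]   # treatment B
rows = [sum(r) for r in observed]
cols = [sum(c) for c in zip(*observed)]
N = sum(rows)
chi2 = sum((observed[i][j] - rows[i] * cols[j] / N) ** 2 / (rows[i] * cols[j] / N)
           for i in range(2) for j in range(5))
df = (2 - 1) * (5 - 1)
p = exp(-chi2 / 2) * (1 + chi2 / 2)   # chi-square upper tail, df = 4
```

This reproduces χ² = 3.084, df = 4, p = 0.544.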

Ordered categorical data Here we assume two groups, one receiving the drug and one placebo. The response is assumed to be ordered categorical with J categories. The null hypothesis is that the distribution of subjects over response categories is the same for both groups. Again the randomization argument and the hypergeometric distribution lead to the same chi-square test statistic, but this time with (J − 1) df. Moreover, the same relationship exists between the two versions of the chi-square statistic.

The Mantel-Haenszel statistic The aim here is to combine data from several (H) strata for comparing two groups, drug and placebo. The expected frequency and the variance for each stratum are used to define the Mantel-Haenszel statistic, which is chi-square distributed with one df.
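A sketch of the Mantel-Haenszel statistic for H strata of 2 x 2 tables (the function name is mine; a_h is the (1,1) cell of stratum h, and the familiar stratum expectation n1h·m1h/nh and hypergeometric variance are used). With the NINDS table as a single stratum it reproduces the "Mantel-Haenszel Chi-Square" line (4.077) of the SAS output shown earlier:

```python
def mantel_haenszel(tables):
    """Mantel-Haenszel chi-square (1 df) for a list of 2x2 tables [[a, b], [c, d]]."""
    num = var = 0.0
    for (a, b), (c, d) in tables:
        n1, n2 = a + b, c + d                 # row totals
        m1, m2 = a + c, b + d                 # column totals
        n = n1 + n2
        num += a - n1 * m1 / n                # observed minus expected, (1,1) cell
        var += n1 * n2 * m1 * m2 / (n ** 2 * (n - 1))
    return num ** 2 / var

stat = mantel_haenszel([[[165, 147], [190, 122]]])   # one stratum: NINDS table
```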

Logistic regression Logistic regression belongs to the class of statistical models called generalized linear models (GLMs). This broad class includes ordinary regression and ANOVA, as well as ANCOVA and loglinear models. An excellent treatment of generalized linear models is given in Agresti (1996). Logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of these. Generally, the dependent or response variable is dichotomous, such as presence/absence or success/failure.

Multiple logistic regression More than one independent variable (dichotomous, ordinal, nominal, continuous):

    ln[P/(1 − P)] = α + β1 x1 + β2 x2 + ... + βi xi

Interpretation of βi: the increase in log-odds for a one-unit increase in xi with all the other xi held constant; it measures the association between xi and the log-odds adjusted for all other xi.

Fitting the equation to the data Linear regression: least squares or maximum likelihood. Logistic regression: maximum likelihood. The likelihood function is used to estimate the parameters β; in practice it is easier to work with the log-likelihood.

Statistical testing Question: does the model including a given independent variable provide more information about the dependent variable than the model without it? Three tests: the likelihood ratio statistic (LRS), the Wald test, and the score test.

Likelihood ratio statistic Compares two nested models:

    log(odds) = α + β1 x1 + β2 x2 + β3 x3   (model 1)
    log(odds) = α + β1 x1 + β2 x2           (model 2)

LR statistic:

    −2 log(likelihood of model 2 / likelihood of model 1)
    = [−2 log likelihood of model 2] − [−2 log likelihood of model 1]

The LR statistic is a χ² with df = number of extra parameters in the larger model.

Example 6 Fitting a logistic regression model to the NINDS data, using only one covariate (treatment group). NINDS again:

Observed frequencies
                non-resp   responder
    rt-PA          165        147      312
    placebo        190        122      312
                   355        269

SAS output: The LOGISTIC Procedure

Response Profile
    Ordered Value   Binary Outcome   Count
    1               EVENT            269
    2               NO EVENT         355

Model Fitting Information and Testing Global Null Hypothesis BETA=0
    Criterion   Intercept Only   Intercept and Covariates   Chi-Square for Covariates
    AIC            855.157            853.069                .
    SC             859.593            861.941                .
    -2 LOG L       853.157            849.069                4.089 with 1 DF (p=0.0432)
    Score             .                  .                   4.084 with 1 DF (p=0.0433)

Analysis of Maximum Likelihood Estimates
    Variable   DF   Estimate   Std Error   Wald Chi-Sq   Pr > Chi-Sq   Standardized   Odds Ratio
    INTERCPT    1   -0.4430     0.1160       14.5805       0.0001           .             .
    GRP         1    0.3275     0.1622        4.0743       0.0435       0.090350        1.387
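For a single binary covariate the logistic MLEs have closed forms that can be checked by hand: the intercept is the log-odds in the reference group (placebo) and the slope is the log odds ratio; the LR statistic is the difference in −2 log L between the intercept-only and the full model. A sketch (variable names are mine):

```python
from math import log

resp_t, nonresp_t = 147, 165    # rt-PA responders / non-responders
resp_p, nonresp_p = 122, 190    # placebo

intercept = log(resp_p / nonresp_p)                        # log-odds, placebo
slope = log((resp_t / nonresp_t) / (resp_p / nonresp_p))   # log odds ratio

def neg2loglik(groups):
    """-2 log L for a saturated binomial model; groups = [(successes, failures)]."""
    ll = 0.0
    for y, f in groups:
        p = y / (y + f)
        ll += y * log(p) + f * log(1 - p)
    return -2 * ll

null = neg2loglik([(resp_t + resp_p, nonresp_t + nonresp_p)])   # intercept only
full = neg2loglik([(resp_t, nonresp_t), (resp_p, nonresp_p)])   # with GRP
lr = null - full
```

This reproduces the SAS estimates (−0.4430 and 0.3275), the −2 LOG L values (853.157 and 849.069) and the LR statistic 4.089.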



4 measures of association (effect) Quite often we are interested in risk and probability only as a way to measure association or effect: "cure is associated with drug" = "the drug has an effect". This can be done in different ways:
1. Relative risk (prospective studies)
2. Odds ratio (prospective or retrospective studies)
3. Absolute risk difference (prospective studies)
4. Number needed to treat (prospective studies)

Absolute risk difference between proportions of outcomes in two groups 1 and 2. Estimated absolute risk:

    p̂1 = n11/n1·,  p̂2 = n21/n2·,  AR = p̂1 − p̂2

95% confidence interval for the population absolute risk:

    AR ± 1.96 sqrt( p̂1(1 − p̂1)/n1· + p̂2(1 − p̂2)/n2· )

Example:
                   Cured   Not cured   Group total
    New drug        800       200         1000
    Standard drug   646       354         1000
    Total          1446       554         2000

    p̂1 = 0.80, p̂2 = 0.64, AR = 0.16; 95% CI: (0.115, 0.205)

AR > 0: association; AR = 0: no association; AR < 0: (negative) association.

Number needed to treat

NNT Assume n subjects take one treatment and n subjects take a second treatment. Let X1 and X2 be the numbers of successful treatments in the two cases and p1 and p2 the probabilities of success in the two groups. Assume further that we can use the binomial distribution:

    Xi ~ Bin(n, pi),  E[Xi] = n pi,  i = 1, 2.

Then the average difference between the two groups is

    E[X1] − E[X2] = n(p1 − p2),

and the number needed to treat for one additional success is

    NNT = n / [n(p1 − p2)] = 1/(p1 − p2).

Number needed to treat Definition: the number needed to be treated to prevent one event is calculated as the inverse of the absolute risk difference:

    NNT = 1/AR = 1/(p̂1 − p̂2) = 1/0.16 = 6.25

NNT is frequently used in clinical trials to provide insight into the clinical relevance of the effect of the treatment under investigation. It is often claimed that its popularity rests on its simplicity and intuitive interpretation.

                   Cured   Not cured   Group total
    New drug        800       200         1000
    Standard drug   646       354         1000
    Total          1446       554         2000

    p̂1 = 0.80, p̂2 = 0.64, AR = 0.16

Issues with NNT
- NNT should be complemented with the follow-up period and the unfavourable event avoided.
- NNT presupposes a statistically significant difference (*).
- How small must NNT be to be good? There is no magic figure (10-500): risky surgery vs a standard inexpensive drug with no side effects, active treatment vs preventive treatment, etc.
- Statistical properties? Confidence intervals? When AR = 0, NNT = 1/AR = 1/(p̂1 − p̂2) becomes infinite. The distribution of NNT is complicated by its behaviour around AR = 0; the moments of NNT do not exist; and simple calculations with NNT can give nonsensical results.

Example 8 In a study the absolute risk reduction for patients with moderate baseline stroke severity was reported as 16.6%. The number needed to treat is thus 1/0.166, or approximately 6. This benefit was statistically significant: the 95% confidence interval for the absolute risk reduction was [0.9%, 32.2%]. A 95% confidence interval for the number needed to treat is [1/0.322, 1/0.009], or approximately [3.1, 111.1]. This all seems quite straightforward, but what if we try the calculation for a non-significant result, for example for patients with low baseline stroke severity? There the absolute risk reduction was 6.6%, with a 95% confidence interval of [−20.9%, 34.1%]. Naively taking reciprocals gives a number needed to treat of about 15.2 and an apparent 95% confidence interval of [−4.8, 2.9], which does not even seem to include 15.2! Clearly something's wrong.

To understand the source of the confusion, note first that the lower limit of the confidence interval for the absolute risk reduction is negative, because the data do not rule out the possibility that the treatment is actually harmful for this group of patients. The reciprocal of this lower limit is −4.8, i.e. a number needed to harm of 4.8. A better description of positive and negative values of the number needed to treat is the number needed to treat for one additional patient to benefit (or to be harmed): NNTB and NNTH, respectively. The 95% confidence interval for the absolute risk reduction thus extends from an NNTH of 4.8 at one extreme to an NNTB of 2.9 at the other.

To understand what such a confidence interval covers, imagine for a moment that the absolute risk reduction had only just been significant, with a confidence interval extending from slightly more than 0% to 34.1%. The confidence interval for the number needed to treat would then extend from 2.9 to something approaching infinity. This would indicate that, according to the data, for one additional patient to benefit a clinician would need to treat at least 2.9 patients (the reciprocal of 34.1%), but perhaps an extremely large number of patients. Thus, when a confidence interval for an absolute risk reduction overlaps zero, the corresponding confidence interval for the number needed to treat includes infinity. This explains the confusion in the case of the patients with low baseline stroke severity: the 95% confidence interval does, after all, contain the point estimate (see the figure below). The estimated number needed to treat and its confidence interval can be quoted as NNTB = 15.2 (95% confidence interval NNTH 4.8 to ∞ to NNTB 2.9).

[Figure: confidence interval for the absolute risk reduction, [−20.9%, 34.1%], and the corresponding two-part interval for the number needed to treat, NNTH (−∞, −4.8] together with NNTB [2.9, ∞), for patients with low baseline stroke severity.]

In other words, for this group of patients it could be that, on average, treating as few as 3 patients would result in one additional patient benefiting. On the other hand, it could be that, on average, treating as few as 5 patients would result in one additional patient being harmed. It is therefore important that a non-significant number needed to treat is reported with a confidence interval in two parts, one allowing for the possibility that the treatment is actually harmful and the other for the possibility that it is beneficial.
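The arithmetic of Example 8 can be sketched in a few lines (the function name is mine; negative reciprocals read as NNTH values):

```python
def nnt(ar):
    """Reciprocal of an absolute risk reduction; negative values read as NNTH."""
    return 1 / ar

# Moderate baseline severity: ARR 16.6%, 95% CI [0.9%, 32.2%]
point = nnt(0.166)                    # about 6
ci = (nnt(0.322), nnt(0.009))         # about (3.1, 111.1)

# Low baseline severity: ARR 6.6%, 95% CI [-20.9%, 34.1%]
point_low = nnt(0.066)                # about 15.2
limits = (nnt(-0.209), nnt(0.341))    # about (-4.8, 2.9): NNTH 4.8 and NNTB 2.9,
                                      # an interval that passes through infinity
```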

Maximum likelihood Invariance property of MLEs: if θ̂ is the MLE of a parameter θ and t(·) is a one-to-one function, then t(θ̂) is the MLE of t(θ). The invariance property cannot be applied here, for the following reason: for a one-dimensional parameter θ, a function t(θ) must have a single-valued inverse in order for t(θ̂) to be the MLE of t(θ). Moreover, bimodality and the range of definition make convergence to normality slow for small sample sizes.

Unbiasedness Unbiasedness is a matter of scale: if θ̂ is unbiased for θ, then t(θ̂) will be biased for t(θ) unless t is the identity function. Moreover, the singularity at AR = 0 implies that NNT cannot be bias-corrected: attempts to improve the behaviour of the estimator by reducing the bias will fail.

Testing No simple test of no treatment effect can be constructed for the supposedly simple and comprehensible NNT, because no treatment effect corresponds to the value ∞ of the parameter, so a z-statistic of the form (θ̂ − θ0)/SE cannot be used.

Generalized Mixed Effects Models

Various forms of models and the relations between them

Classical statistics (observations are random, parameters are unknown constants):
- LM (linear model). Assumptions: 1. independence, 2. normality, 3. constant parameters.
- LMM (linear mixed model): assumptions 1 and 3 are modified. Repeated measures: assumptions 1 and 3 are modified.
- GLM (generalised linear model): assumption 2 is replaced by an exponential-family assumption.
- GLMM (generalised linear mixed model): assumption 2 is replaced by an exponential-family assumption, and assumptions 1 and 3 are modified.

Estimation by maximum likelihood; related directions: longitudinal data, Bayesian statistics, non-linear models.

Exponential families The exponential family comprises a flexible set of distributions covering both continuous and discrete random variables. The members of this family have many important properties, which merits discussing them in some general form. Many of the usual probability distributions are specific members of this family: Gaussian, Bernoulli, binomial, von Mises, gamma, Poisson, exponential, beta (on (0, 1)), Weibull, etc.

Generalized linear models

The Bernoulli distribution

Generalized linear models

Generalised linear mixed models

Empirical Bayes estimates

Example 1 (cont'd)

A Bayesian alternative

Infection vs. poverty Some studies from around 1990 suggested that the risk of CHD is associated with childhood poverty. Since infection with the bacterium H. pylori is also linked to poverty, some researchers suspected H. pylori to be the missing link. In a study where levels of infection were compared between patients and controls, the following results were obtained. For the data below, the chi-square statistic takes the value 4.73, yielding a p-value of 0.03, which is less than the formal significance level of 0.05.

                 CHD   Healthy control
    High         60%       39%
    Low          40%       61%

Let us try a Bayesian alternative. Since we have no theoretical reason to believe that the above result is true, we take P(H0) = 0.5. Then

    P[H0 | D] = [1 + ((1 − 1/2)/(1/2)) (1/BF)]^(−1) = [1 + 1/BF]^(−1) = BF/(1 + BF).

Berger and Sellke (1987) have shown that, for a very wide range of cases including this one,

    BF ≥ sqrt(χ²) e^((1 − χ²)/2).

Using the value 4.73 for the chi-square variable leads to a BF of at least 0.337.

Reference: M. A. Mendall et al., Relation between H. pylori infection and coronary heart disease. Heart J. (1994).

Conclusion

    P[H0 | D] ≥ 0.337/(1 + 0.337) = 0.252.

Taking other (more or less sceptical) prior attitudes does not change the conclusion much:

    P(H0) = 0.75  =>  P[H0 | D] > 0.5
    P(H0) = 0.25  =>  P[H0 | D] > 0.1

Bayesian properties of NNT Let D = (x1, x2, n1, n2) represent data from some trial. Assuming independent Beta(αi, βi) prior distributions for the pi leads to a joint posterior distribution of (p1, p2) that is a product of independent Beta distributions. Apart from mathematical tractability, beta priors offer great flexibility of distributional shape. One can obtain the posterior distribution of the difference p = p1 − p2, or that of NNT = 1/p, by simple transformation, using Markov chain Monte Carlo (MCMC) to simulate directly from the posterior distributions. The posterior mean μp and variance σp² of p are given by

    E[pi | D] = (xi + αi)/(ni + αi + βi) = μi,   Var[pi | D] = μi(1 − μi)/(ni + αi + βi + 1),   i = 1, 2,

    μp = μ1 − μ2,   σp² = Var[p1 | D] + Var[p2 | D].
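A Monte Carlo sketch of the posterior of p = p1 − p2 and NNT = 1/p under independent beta priors, as described above. The counts are the new-drug / standard-drug example from the earlier slides (800/1000 vs 646/1000 cured); uniform Beta(1, 1) priors are an assumption made here purely for illustration:

```python
import random

random.seed(1)
x1, n1, x2, n2 = 800, 1000, 646, 1000
a1 = b1 = a2 = b2 = 1                 # uniform Beta(1, 1) priors (an assumption)

draws = 50_000
p_diff = [random.betavariate(a1 + x1, b1 + n1 - x1)
          - random.betavariate(a2 + x2, b2 + n2 - x2)
          for _ in range(draws)]
nnt_draws = [1 / d for d in p_diff]   # posterior draws of NNT = 1/p

post_mean = sum(p_diff) / draws       # approximates mu_p = mu_1 - mu_2
median_nnt = sorted(nnt_draws)[draws // 2]
```

With these counts the posterior mean of p is close to the analytic value μ1 − μ2 = 801/1002 − 647/1002 ≈ 0.154, and the posterior median NNT is around 6.5.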

Asymptotically, p has a normal posterior distribution with mean μp and variance σp². The common practice is to estimate NNT by 1/μp, with the corresponding interval estimate given by the reciprocals of the 95% credible limits μp ± 1.96 σp. Making the transformation y = 1/p = NNT, we find that the asymptotic posterior density of Y is

    f(y | D) = 1/(sqrt(2π) σp y²) exp( −(1/y − μp)² / (2σp²) ).

This density is known as the inverse normal distribution (Johnson et al., 1995, p. 171). It is a special case of the generalized inverse normal family of density functions considered by Robert (1991). The mean and variance of this distribution do not exist.

However, the distribution has two modes, at

    NNT̂1 = (−μp − sqrt(μp² + 8σp²)) / (4σp²)   and   NNT̂2 = (−μp + sqrt(μp² + 8σp²)) / (4σp²).

Thus the point estimate of NNT would be given by NNT̂2 when there is efficacy and by NNT̂1 when the control treatment dominates the experimental one. Graphs of f(y | D) for different values of μp and σp show that the pdf for μp < 0 is a mirror image of that for μp > 0.