Distribution-free ROC Analysis Using Binary Regression Techniques

Similar documents
Distribution-free ROC Analysis Using Binary Regression Techniques

Categorical Predictor Variables

Linear Regression With Special Variables

MS&E 226: Small Data

Association studies and regression

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is

Introduction to Econometrics Midterm Examination Fall 2005 Answer Key

1. Let A and B be two events such that P(A)=0.6 and P(B)=0.6. Which of the following MUST be true?

Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

Lecture 10: Introduction to Logistic Regression

Estimation and Centering

Midterm: CS 6375 Spring 2015 Solutions

Introduction to lnmle: An R Package for Marginally Specified Logistic-Normal Models for Longitudinal Binary Data

Bayesian Linear Models

Summer School in Statistics for Astronomers V June 1 - June 6, Regression. Mosuk Chow Statistics Department Penn State University.

STAT2201 Assignment 6

Correlation and regression

Bayesian Linear Models

ECLT 5810 Linear Regression and Logistic Regression for Classification. Prof. Wai Lam

Linear Regression Models P8111

Machine Learning Linear Classification. Prof. Matteo Matteucci

Maximum Likelihood, Logistic Regression, and Stochastic Gradient Training

Part 8: GLMs and Hierarchical LMs and GLMs

MS&E 226: Small Data

Biostat 2065 Analysis of Incomplete Data

Generalized Linear Models. Last time: Background & motivation for moving beyond linear

Interactions among Continuous Predictors

Simple Linear Regression

Lecture 9: Classification, LDA

Lecture 9: Classification, LDA

1 What does the random effect η mean?

Regression diagnostics

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

Simple, Marginal, and Interaction Effects in General Linear Models

Logistic Regression and Generalized Linear Models

Bayesian Linear Regression

Multivariate Regression

The linear model is the most fundamental of all serious statistical models encompassing:

ISQS 5349 Final Exam, Spring 2017.

Accounting for Complex Sample Designs via Mixture Models

Correlation & Simple Regression

Simple Linear Regression Using Ordinary Least Squares

Midterm. Introduction to Machine Learning. CS 189 Spring Please do not open the exam before you are instructed to do so.

A Brief and Friendly Introduction to Mixed-Effects Models in Linguistics

TECHNICAL REPORT # 59 MAY Interim sample size recalculation for linear and logistic regression models: a comprehensive Monte-Carlo study

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018

Lecture 14: Introduction to Poisson Regression

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Optimal Designs for 2 k Experiments with Binary Response

Mixed-Models. version 30 October 2011

Predicted Y Scores. The symbol stands for a predicted Y score

Statistical Modelling with Stata: Binary Outcomes

CSC2515 Assignment #2

Beyond GLM and likelihood

Multiple Regression. Dr. Frank Wood. Frank Wood, Linear Regression Models Lecture 12, Slide 1

CPSC 340: Machine Learning and Data Mining. MLE and MAP Fall 2017

Logit Regression and Quantities of Interest

Accommodating covariates in receiver operating characteristic analysis

Lecture Slides - Part 1

SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

Lecture #11: Classification & Logistic Regression

COS513 LECTURE 8 STATISTICAL CONCEPTS

Hypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal

Models for binary data

A general and robust framework for secondary traits analysis

ACOVA and Interactions

Generalized logit models for nominal multinomial responses. Local odds ratios

Logistic regression: Miscellaneous topics

Basic Medical Statistics Course

BIOS 2083 Linear Models c Abdus S. Wahed

Well-developed and understood properties

6. Multiple regression - PROC GLM

Generalized Linear Models Introduction

Series 6, May 14th, 2018 (EM Algorithm and Semi-Supervised Learning)

Scatter plot of data from the study. Linear Regression

STA441: Spring Multiple Regression. More than one explanatory variable at the same time

Analysing data: regression and correlation S6 and S7

Chapter 1. Modeling Basics

Logistic Regression. Advanced Methods for Data Analysis (36-402/36-608) Spring 2014

STAT2201 Assignment 6 Semester 1, 2017 Due 26/5/2017

Classification. Chapter Introduction. 6.2 The Bayes classifier

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Expression Data Exploration: Association, Patterns, Factors & Regression Modelling

Remedial Measures for Multiple Linear Regression Models

Dimensionality reduction

Semiparametric Inferential Procedures for Comparing Multivariate ROC Curves with Interaction Terms

Instructions: Closed book, notes, and no electronic devices. Points (out of 200) in parentheses

Semiparametric Generalized Linear Models

Modeling Data with Linear Combinations of Basis Functions. Read Chapter 3 in the text by Bishop

The Sum of Standardized Residuals: Goodness of Fit Test for Binary Response Model

Analysis of Categorical Data. Nick Jackson University of Southern California Department of Psychology 10/11/2013

General Linear Model: Statistical Inference

Correlation and Regression

Linear Methods for Prediction

Scatter plot of data from the study. Linear Regression

Eco517 Fall 2014 C. Sims FINAL EXAM

Lecture 9: Classification, LDA

Linear Regression and Its Applications

Transcription:

Distribution-free ROC Analysis Using Binary Regression Techniques Todd A. Alonzo and Margaret S. Pepe As interpreted by: Andrew J. Spieker University of Washington Dept. of Biostatistics Update Talk 1 / 37

Prediction diagram... 2 / 37

Any old ROC curve 3 / 37

Prediction diagram... 4 / 37

Prediction diagram... 5 / 37

Prediction diagram... 6 / 37

ROC G 0 and G 1 the survivor functions for Y 0 and Y 1, respectively... ROC Y0,Y 1 (s) = P ( Y 1 G 1 0 (s)) = G 1 ( G 1 0 (s)) 7 / 37

ROC 8 / 37

ROC Regression X : covariate(s) for all subjects X 1 : variables specific to the diseased group (most often part of the outcome) G 0 (y X ) and G 1 (y X, X 1 ) the (conditional) survivor functions for Y 0 and Y 1, respectively... ROC Y0,Y 1 X,X 1 (s) = P ( Y 1 G0 1 (s X ) ) X, X1 ( = G 1 G 1 0 (s X ) ) X, X1 9 / 37

Estimation (Pepe, 2000) We observe Y - and X -values on n 0 healthy controls and n 1 diseased participants... Key Observation: If U ij = 1(Y 1i Y 0j ), then... E[U ij G 0 (Y 0j X j ) = s, X i, X j, X 1i ] = P(Y 1 Y 0 G 0 (Y 0j X j ) = s, X i, X j, X 1i ) = P(Y 1 Y 0 G 1 0 (s X j) = Y 0j, X i, X j, X 1i ) = P(Y 1 G 1 0 (s X j) X i, X 1i ) = G 1 (G 1 0 (s X j) X i, X 1i ) = ROC Y0j,Y 1i X i,x j,x 1i (s) 10 / 37

Estimation (Pepe, 2000) Hence, it is sensible to fit a GLM to estimate ROC with a binary outcome. But how do we choose a link function? Need to make sure ROC curve is monotonically increasing in s. 11 / 37

Choice of link function? Binormal ROC curve: ROC Y0j,Y 1i X i,x j,x 1i (s) = Φ ( α 0 + α 1 Φ 1 (s) + βx + γx 1 ) Called binormal because if Y 0 and Y 1 are normally distributed with means µ 0, µ 1, and variances σ 2 0, σ2 1 (respectively), then the ROC curve is of this form. 12 / 37

Computational (In?)efficiency Case n 0 = n 1 = 50 n 0 = n 1 = 500 n 0 = n 1 = 1000 Runtime 0.01 sec. 0.95 sec. 2.87 sec. 13 / 37

Computational (In?)efficiency: 500 Bootstraps Case n 0 = n 1 = 50 n 0 = n 1 = 500 n 0 = n 1 = 1000 Runtime > 5 sec. > 8 min. > 24 min. 14 / 37

Computational (In?)efficiency: 500 Bootstraps Solution: Load up Bayes, Gauss, Gosset, Hercules, etc. and annoy everyone in the department. 15 / 37

! Questions?... Just kidding 16 / 37

Computational (In?)efficiency: 500 Bootstraps Solution: Load up Bayes, Gauss, Gosset, Hercules, etc. and annoy everyone in the department. (I like having good standing with people in the department, when possible). Solution: Consider a more sensible set of comparisons... i.e., Not all pairs of diseased and healthy participants... Anticipated Problem: Fewer pairs will likely come with a loss of (statistical) efficiency. (Spoilers!) 17 / 37

Back to the drawing board! (Recall... ) We observe Y - and X -values on n 0 healthy controls and n 1 diseased participants... Key Observation: If U ij = 1(Y 1i Y 0j ), then... E[U ij G 0 (Y 0j X j ) = s, X i, X j, X 1i ] = P(Y 1 Y 0 G 0 (Y 0j X j ) = s, X i, X j, X 1i ) = P(Y 1 Y 0 G 1 0 (s X j) = Y 0j, X i, X j, X 1i ) = P(Y 1 G 1 0 (s X j) X i, X 1i ) = G 1 (G 1 0 (s X j) X i, X 1i ) = ROC Y0j,Y 1i X i,x j,x 1i (s) 18 / 37

Back to the drawing board! We observe Y - and X -values on n 0 healthy controls and n 1 diseased participants... Key Observation: If U ij = 1(Y 1i Y 0j ), then... E[U ij G 0 (Y 0j X j ) = s, X i, X j, X 1i ] = P(Y 1 Y 0 G 0 (Y 0j X j ) = s, X i, X j, X 1i ) = P(Y 1 Y 0 G 1 0 (s X j) = Y 0j, X i, X j, X 1i ) = P(Y 1 G 1 0 (s X j) X i, X 1i ) = G 1 (G 1 0 (s X j) X i, X 1i ) = ROC Y0j,Y 1i X i,x j,x 1i (s) 19 / 37

Back to the drawing board! We observe Y - and X -values on n 0 healthy controls and n 1 diseased participants... Key Observation: If U is = 1(Y 1i G 1 0 (s X j)), then... E[U is X i, X 1i ] = P(Y 1 G 1 0 (s X j) X i, X 1i ) = G 1 (G 1 0 (s X j) X i, X 1i ) = ROC Y0j,Y 1i X i,x j,x 1i (s) For a suitable set of choices for s. 20 / 37

Specificity Set Key Observation: E[U is X i, X 1i ] = ROC Y0j,Y 1i X i,x j,x 1i (s) Indeed, the choice S = {i/n0 : i = 1,..., n 0 1} is maximal and corresponds to the original GLM method proposed. 21 / 37

Simulation Studies There are so many questions we can ask: How well does this approach work (and how does it do compared to maximum likelihood?) How does efficiency change when you shrink the specificity set? Can you estimate standard errors well with bootstrap? Is it robust to model misspecification? 22 / 37

Study 1: How well does it work? n 0 = n 1 = 50 Y 0i N (0, 1); Y 1j N ( α 0 /α 1, 1/α1 2 )... this is done so that ROC Y0,Y 1 (s) is binormal with parameters α 0 and α 1 Case 1: α = (0.75, 0.90) Case 2: α = (1.50, 0.85) Two-thousand replications 23 / 37

Study 1: How well does it work? 24 / 37

Study 1: How well does it work? Note: Maximum likelihood estimates are given by: ˆα 0 = X 1 X 0 ˆσ 1 ; ˆα 1 = ˆσ 0 ˆσ 1 which follows from finding the inverse survivor function for the healthy subjects, and then applying the normal survivor function from the diseased subjects. = Φ ( ) G0 1 (s) µ 1 = 1 Φ σ 1 ( ROC Y0,Y 1 (s) = G 1 G 1 0 (s)) ( σ0 Φ 1 ) (s) + µ 0 µ 1 σ 1 ( µ1 µ 0 = Φ + σ 0 Φ 1 (s) σ 1 σ 1 ), 25 / 37

Study 1: How well does it work? Scenario 1: α = (0.75, 0.90) 26 / 37

Study 1: How well does it work? Scenario 2: α = (1.50, 0.85) 27 / 37

Study 2: Loss of efficiency with smaller S n 0 = n 1 = 200 Y 0i N (0, 1); Y 1j N ( α 0 /α 1, 1/α1 2 )... this is done so that ROC Y0,Y 1 (s) is binormal with parameters α 0 and α 1 Case 1: α = (0.75, 0.90) Case 2: α = (1.50, 0.85) Two-thousand replications Consider S = {i/(l + 1); i = 1,..., l} for several choices of l 28 / 37

Study 2: Loss of efficiency with smaller S α = (0.75, 0.90) α = (1.50, 0.85) 29 / 37

Computational (In?)efficiency: 500 Bootstraps Case n 0 = n 1 = 50 n 0 = n 1 = 500 n 0 = n 1 = 1000 Runtime > 5 sec. > 8 min. > 24 min. 30 / 37

Childhood Predictors of Adult Obesity Study (CPAO) 823 adults (133 obese and 823 non-obese) Determine whether childhood BMI can predict adult obesity Unadjusted model (easy enough) Adjusted model: include age, sex (for everyone), and adult BMI (for the obese participants) in the model 31 / 37

Table 1. ROC-GLM Models for CPAO Data Method Estimate Bootstrap S.E. p-value Unadjusted: Intercept 1.07 0.116 < 0.001 Φ 1 (s) 0.890 0.0916 < 0.001 Adjusted: Intercept 0.150 0.376 0.69 Age 0.0669 0.0276 0.012 Gender -0.284 0.238 0.23 Adult BMI 0.236 0.136 0.083 Φ 1 (s) 0.923 0.0926 < 0.001 32 / 37

Table 1. ROC-GLM Models for CPAO Data Method Estimate Bootstrap S.E. p-value Unadjusted: Intercept 1.07 0.116 < 0.001 Φ 1 (s) 0.890 0.0916 < 0.001 Adjusted: Intercept 0.150 0.376 0.69 Age 0.0669 0.0276 0.012 Gender -0.284 0.238 0.23 Adult BMI 0.236 0.136 0.083 Φ 1 (s) 0.923 0.0926 < 0.001 33 / 37

Table 1. ROC-GLM Models for CPAO Data Method Estimate Bootstrap S.E. p-value Age 0.0669 0.0276 0.012 Two obese adults of the same gender and adult BMI, but differing in age by one year, differ in estimated probability of having a BMI exceed the healthy quantiles of the same respective covariates by 0.0669 on the probit scale (with the older participant having the higher probability) 34 / 37

35 / 37

In Summary Meaningful computational efficiency gains in considering fewer cut-points... At a cost of statistical efficiency Might be acceptable when trying to get done a ton of computations... or working with an absolutely massive data set 36 / 37

What are your thoughts? I am hugely disappointed not to have been able to discuss model misspecification... With the five extra minutes I have next time...... would you rather see the case where the data generation mechanism is not normal, or when curve itself is misspecified? Is there anything from this you would cut out? Anyone still have trouble remembering which is sensitivity and which is specificity? 37 / 37