Distribution-free ROC Analysis Using Binary Regression Techniques Todd A. Alonzo and Margaret S. Pepe As interpreted by: Andrew J. Spieker University of Washington Dept. of Biostatistics Update Talk 1 / 37
Prediction diagram... 2 / 37
Any old ROC curve 3 / 37
Prediction diagram... 4 / 37
Prediction diagram... 5 / 37
Prediction diagram... 6 / 37
ROC G 0 and G 1 the survivor functions for Y 0 and Y 1, respectively... ROC Y0,Y 1 (s) = P ( Y 1 G 1 0 (s)) = G 1 ( G 1 0 (s)) 7 / 37
ROC 8 / 37
ROC Regression X : covariate(s) for all subjects X 1 : variables specific to the diseased group (most often part of the outcome) G 0 (y X ) and G 1 (y X, X 1 ) the (conditional) survivor functions for Y 0 and Y 1, respectively... ROC Y0,Y 1 X,X 1 (s) = P ( Y 1 G0 1 (s X ) ) X, X1 ( = G 1 G 1 0 (s X ) ) X, X1 9 / 37
Estimation (Pepe, 2000) We observe Y - and X -values on n 0 healthy controls and n 1 diseased participants... Key Observation: If U ij = 1(Y 1i Y 0j ), then... E[U ij G 0 (Y 0j X j ) = s, X i, X j, X 1i ] = P(Y 1 Y 0 G 0 (Y 0j X j ) = s, X i, X j, X 1i ) = P(Y 1 Y 0 G 1 0 (s X j) = Y 0j, X i, X j, X 1i ) = P(Y 1 G 1 0 (s X j) X i, X 1i ) = G 1 (G 1 0 (s X j) X i, X 1i ) = ROC Y0j,Y 1i X i,x j,x 1i (s) 10 / 37
Estimation (Pepe, 2000) Hence, it is sensible to fit a GLM to estimate ROC with a binary outcome. But how do we choose a link function? Need to make sure ROC curve is monotonically increasing in s. 11 / 37
Choice of link function? Binormal ROC curve: ROC Y0j,Y 1i X i,x j,x 1i (s) = Φ ( α 0 + α 1 Φ 1 (s) + βx + γx 1 ) Called binormal because if Y 0 and Y 1 are normally distributed with means µ 0, µ 1, and variances σ 2 0, σ2 1 (respectively), then the ROC curve is of this form. 12 / 37
Computational (In?)efficiency Case n 0 = n 1 = 50 n 0 = n 1 = 500 n 0 = n 1 = 1000 Runtime 0.01 sec. 0.95 sec. 2.87 sec. 13 / 37
Computational (In?)efficiency: 500 Bootstraps Case n 0 = n 1 = 50 n 0 = n 1 = 500 n 0 = n 1 = 1000 Runtime > 5 sec. > 8 min. > 24 min. 14 / 37
Computational (In?)efficiency: 500 Bootstraps Solution: Load up Bayes, Gauss, Gosset, Hercules, etc. and annoy everyone in the department. 15 / 37
! Questions?... Just kidding 16 / 37
Computational (In?)efficiency: 500 Bootstraps Solution: Load up Bayes, Gauss, Gosset, Hercules, etc. and annoy everyone in the department. (I like having good standing with people in the department, when possible). Solution: Consider a more sensible set of comparisons... i.e., Not all pairs of diseased and healthy participants... Anticipated Problem: Fewer pairs will likely come with a loss of (statistical) efficiency. (Spoilers!) 17 / 37
Back to the drawing board! (Recall... ) We observe Y - and X -values on n 0 healthy controls and n 1 diseased participants... Key Observation: If U ij = 1(Y 1i Y 0j ), then... E[U ij G 0 (Y 0j X j ) = s, X i, X j, X 1i ] = P(Y 1 Y 0 G 0 (Y 0j X j ) = s, X i, X j, X 1i ) = P(Y 1 Y 0 G 1 0 (s X j) = Y 0j, X i, X j, X 1i ) = P(Y 1 G 1 0 (s X j) X i, X 1i ) = G 1 (G 1 0 (s X j) X i, X 1i ) = ROC Y0j,Y 1i X i,x j,x 1i (s) 18 / 37
Back to the drawing board! We observe Y - and X -values on n 0 healthy controls and n 1 diseased participants... Key Observation: If U ij = 1(Y 1i Y 0j ), then... E[U ij G 0 (Y 0j X j ) = s, X i, X j, X 1i ] = P(Y 1 Y 0 G 0 (Y 0j X j ) = s, X i, X j, X 1i ) = P(Y 1 Y 0 G 1 0 (s X j) = Y 0j, X i, X j, X 1i ) = P(Y 1 G 1 0 (s X j) X i, X 1i ) = G 1 (G 1 0 (s X j) X i, X 1i ) = ROC Y0j,Y 1i X i,x j,x 1i (s) 19 / 37
Back to the drawing board! We observe Y - and X -values on n 0 healthy controls and n 1 diseased participants... Key Observation: If U is = 1(Y 1i G 1 0 (s X j)), then... E[U is X i, X 1i ] = P(Y 1 G 1 0 (s X j) X i, X 1i ) = G 1 (G 1 0 (s X j) X i, X 1i ) = ROC Y0j,Y 1i X i,x j,x 1i (s) For a suitable set of choices for s. 20 / 37
Specificity Set Key Observation: E[U is X i, X 1i ] = ROC Y0j,Y 1i X i,x j,x 1i (s) Indeed, the choice S = {i/n0 : i = 1,..., n 0 1} is maximal and corresponds to the original GLM method proposed. 21 / 37
Simulation Studies There are so many questions we can ask: How well does this approach work (and how does it do compared to maximum likelihood?) How does efficiency change when you shrink the specificity set? Can you estimate standard errors well with bootstrap? Is it robust to model misspecification? 22 / 37
Study 1: How well does it work? n 0 = n 1 = 50 Y 0i N (0, 1); Y 1j N ( α 0 /α 1, 1/α1 2 )... this is done so that ROC Y0,Y 1 (s) is binormal with parameters α 0 and α 1 Case 1: α = (0.75, 0.90) Case 2: α = (1.50, 0.85) Two-thousand replications 23 / 37
Study 1: How well does it work? 24 / 37
Study 1: How well does it work? Note: Maximum likelihood estimates are given by: ˆα 0 = X 1 X 0 ˆσ 1 ; ˆα 1 = ˆσ 0 ˆσ 1 which follows from finding the inverse survivor function for the healthy subjects, and then applying the normal survivor function from the diseased subjects. = Φ ( ) G0 1 (s) µ 1 = 1 Φ σ 1 ( ROC Y0,Y 1 (s) = G 1 G 1 0 (s)) ( σ0 Φ 1 ) (s) + µ 0 µ 1 σ 1 ( µ1 µ 0 = Φ + σ 0 Φ 1 (s) σ 1 σ 1 ), 25 / 37
Study 1: How well does it work? Scenario 1: α = (0.75, 0.90) 26 / 37
Study 1: How well does it work? Scenario 2: α = (1.50, 0.85) 27 / 37
Study 2: Loss of efficiency with smaller S n 0 = n 1 = 200 Y 0i N (0, 1); Y 1j N ( α 0 /α 1, 1/α1 2 )... this is done so that ROC Y0,Y 1 (s) is binormal with parameters α 0 and α 1 Case 1: α = (0.75, 0.90) Case 2: α = (1.50, 0.85) Two-thousand replications Consider S = {i/(l + 1); i = 1,..., l} for several choices of l 28 / 37
Study 2: Loss of efficiency with smaller S α = (0.75, 0.90) α = (1.50, 0.85) 29 / 37
Computational (In?)efficiency: 500 Bootstraps Case n 0 = n 1 = 50 n 0 = n 1 = 500 n 0 = n 1 = 1000 Runtime > 5 sec. > 8 min. > 24 min. 30 / 37
Childhood Predictors of Adult Obesity Study (CPAO) 823 adults (133 obese and 823 non-obese) Determine whether childhood BMI can predict adult obesity Unadjusted model (easy enough) Adjusted model: include age, sex (for everyone), and adult BMI (for the obese participants) in the model 31 / 37
Table 1. ROC-GLM Models for CPAO Data Method Estimate Bootstrap S.E. p-value Unadjusted: Intercept 1.07 0.116 < 0.001 Φ 1 (s) 0.890 0.0916 < 0.001 Adjusted: Intercept 0.150 0.376 0.69 Age 0.0669 0.0276 0.012 Gender -0.284 0.238 0.23 Adult BMI 0.236 0.136 0.083 Φ 1 (s) 0.923 0.0926 < 0.001 32 / 37
Table 1. ROC-GLM Models for CPAO Data Method Estimate Bootstrap S.E. p-value Unadjusted: Intercept 1.07 0.116 < 0.001 Φ 1 (s) 0.890 0.0916 < 0.001 Adjusted: Intercept 0.150 0.376 0.69 Age 0.0669 0.0276 0.012 Gender -0.284 0.238 0.23 Adult BMI 0.236 0.136 0.083 Φ 1 (s) 0.923 0.0926 < 0.001 33 / 37
Table 1. ROC-GLM Models for CPAO Data Method Estimate Bootstrap S.E. p-value Age 0.0669 0.0276 0.012 Two obese adults of the same gender and adult BMI, but differing in age by one year, differ in estimated probability of having a BMI exceed the healthy quantiles of the same respective covariates by 0.0669 on the probit scale (with the older participant having the higher probability) 34 / 37
35 / 37
In Summary Meaningful computational efficiency gains in considering fewer cut-points... At a cost of statistical efficiency Might be acceptable when trying to get done a ton of computations... or working with an absolutely massive data set 36 / 37
What are your thoughts? I am hugely disappointed not to have been able to discuss model misspecification... With the five extra minutes I have next time...... would you rather see the case where the data generation mechanism is not normal, or when curve itself is misspecified? Is there anything from this you would cut out? Anyone still have trouble remembering which is sensitivity and which is specificity? 37 / 37