Performance Measures. Sören Sonnenburg. Fraunhofer FIRST.IDA, Kekuléstr. 7, Berlin, Germany


Sören Sonnenburg, Fraunhofer FIRST.IDA, Kekuléstr. 7, 12489 Berlin, Germany

Roadmap: Contingency Table, Scores from the Contingency Table, Curves from the Contingency Table, Discussion

Contingency Table / Confusion Matrix

Given a sample x_1 ... x_N ∈ X, a 2-class classifier f(x) ∈ {-1, +1} produces outputs f(x_1) ... f(x_N); together with the true labeling (y_1 ... y_N) ∈ {-1, +1}^N, this partitions the N samples into:

  outputs \ labeling |  y = +1 |  y = -1 |  Σ
  f(x) = +1          |   TP    |   FP    |  O+
  f(x) = -1          |   FN    |   TN    |  O-
  Σ                  |   N+    |   N-    |  N
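The four table entries can be counted directly from outputs and labels. A minimal sketch in Python (the function name `contingency_table` is illustrative, not from the slides):

```python
def contingency_table(outputs, labels):
    """Return (TP, FP, FN, TN) for predictions and true labels in {-1, +1}."""
    TP = sum(1 for f, y in zip(outputs, labels) if f == +1 and y == +1)
    FP = sum(1 for f, y in zip(outputs, labels) if f == +1 and y == -1)
    FN = sum(1 for f, y in zip(outputs, labels) if f == -1 and y == +1)
    TN = sum(1 for f, y in zip(outputs, labels) if f == -1 and y == -1)
    return TP, FP, FN, TN
```

The row sums give O+ = TP + FP and O- = FN + TN; the column sums give N+ = TP + FN and N- = FP + TN.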

Scores I

  acc = (TP + TN) / N  - accuracy
  err = (FP + FN) / N  - error

Scores II

  tpr  = TP/N+ = TP/(TP+FN)  - sensitivity / recall
  tnr  = TN/N- = TN/(TN+FP)  - specificity
  fnr  = FN/N+ = FN/(FN+TP)  - 1 - sensitivity
  fpr  = FP/N- = FP/(FP+TN)  - 1 - specificity
  ppv  = TP/O+ = TP/(TP+FP)  - positive predictive value / precision
  fdr  = FP/O+ = FP/(FP+TP)  - false discovery rate
  npv  = TN/O- = TN/(TN+FN)  - negative predictive value
  nfdr = FN/O- = FN/(FN+TN)  - negative false discovery rate
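All eight rates follow from the four table entries. A sketch, assuming the `(TP, FP, FN, TN)` convention used above (the function name `rates` is illustrative):

```python
def rates(TP, FP, FN, TN):
    """Compute the eight contingency-table rates from the slide."""
    Np, Nn = TP + FN, FP + TN   # true class totals N+, N-
    Op, On = TP + FP, FN + TN   # output totals O+, O-
    return {
        "tpr": TP / Np,   # sensitivity / recall
        "tnr": TN / Nn,   # specificity
        "fnr": FN / Np,   # 1 - sensitivity
        "fpr": FP / Nn,   # 1 - specificity
        "ppv": TP / Op,   # positive predictive value / precision
        "fdr": FP / Op,   # false discovery rate
        "npv": TN / On,   # negative predictive value
        "nfdr": FN / On,  # negative false discovery rate
    }
```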

Scores III

  ber   = 1/2 (FN/(FN+TP) + FP/(FP+TN))  - balanced error rate
  WRacc = TP/(TP+FN) - FP/(FP+TN)        - weighted relative accuracy
  F1    = 2 TP / (2 TP + FP + FN)        - F1 score (harmonic mean of precision and recall)
  cc    = (TP·TN - FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))  - cross correlation
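These composite scores can be sketched the same way (function names are illustrative; the cross correlation is the quantity often called the Matthews correlation coefficient):

```python
import math

def ber(TP, FP, FN, TN):
    """Balanced error rate: mean of false negative rate and false positive rate."""
    return 0.5 * (FN / (FN + TP) + FP / (FP + TN))

def f1(TP, FP, FN):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * TP / (2 * TP + FP + FN)

def cc(TP, FP, FN, TN):
    """Cross correlation of outputs and labels."""
    denom = math.sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
    return (TP * TN - FP * FN) / denom
```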

Curves: Receiver Operator Characteristic (ROC) Curve

[Figures: ROC curve, true positive rate vs. false positive rate, shown on a linear and on a logarithmic fpr axis]

  - uses tpr/fpr obtained by varying the bias
  - independent of class skew
  - monotonically ascending
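Varying the bias amounts to sweeping a threshold over the classifier's real-valued outputs; each threshold yields one (fpr, tpr) point. A minimal sketch (the function name `roc_points` is illustrative):

```python
def roc_points(scores, labels):
    """(fpr, tpr) pairs obtained by sweeping the decision threshold (bias)."""
    Np = sum(1 for y in labels if y == +1)
    Nn = len(labels) - Np
    # sort by score descending; each prefix = examples classified positive
    order = sorted(zip(scores, labels), key=lambda p: -p[0])
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for _, y in order:
        if y == +1:
            tp += 1
        else:
            fp += 1
        points.append((fp / Nn, tp / Np))
    return points
```

Both coordinates are normalized by class totals (N-, N+), which is why the curve does not move when the class skew changes.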

Curves: Precision Recall Curve

[Figures: precision recall curve, positive predictive value (PPV) vs. true positive rate, shown on a linear and on a logarithmic axis]

  - uses ppv/tpr obtained by varying the bias
  - depends on class skew
  - not monotone

Curves: ROC vs. PRC

[Figures: ROC and PRC curves for the original data and for data with 10 times as many negatives]

Area under the Curves: auROC

Remember: tpr = TP/N+ = TP/(TP+FN) and fpr = FP/N- = FP/(FP+TN).

Properties:
  - considering the output as a ranking, auROC is the number of (positive, negative) pairs for which the output of the positive example exceeds the output of the negative example, divided by N+·N-
  - equivalent to the Wilcoxon rank test
  - Gini = 2·auROC - 1
  - independent of bias
  - independent of class skew
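The pair-counting interpretation can be computed directly, without constructing the curve. A sketch (the function name `auroc` is illustrative; counting ties as 1/2 is a common convention, not stated on the slide):

```python
def auroc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == +1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0
                  for p in pos for n in neg)
    return correct / (len(pos) * len(neg))
```

Because only the ordering of the outputs matters, adding a constant to every score (changing the bias) leaves the value unchanged.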

Area under the Curves: auPRC

Remember: ppv = TP/O+ = TP/(TP+FP) and tpr = TP/N+ = TP/(TP+FN).

Properties:
  - considering the output as a ranking: interpretation unclear (open question)
  - independent of bias
  - depends on class skew

Comparison: accuracy, auROC, auPRC

[Figure: classification performance in percent for accuracy, area under the ROC, and area under the PRC, as the number of training examples grows from 1000 to 10000000]

Extensions

  - ROC_N score: area under the ROC curve up to the first N false positives
  - others?

Summary: When to use what

  - balanced data: accuracy / auROC / ...
  - unbalanced data: BER / F1 / auPRC / CC
  - other cases / other measures?

auROC/auPRC optimized training

  - current classifiers minimize the training error (+ some complexity term)
  - we want to optimize auROC/auPRC/... directly
  - first approaches (also implemented in SHOGUN) create pairs of examples (x_i, x_j), i ∈ I = {i : y_i = +1}, j ∈ J = {j : y_j = -1}, giving N+·N-, i.e. O(N²), examples - in standard SVM learning this is unusably slow
  - recently T. Joachims (ICML 2005) proposed SVM^multi:

      min_{w, ξ≥0}  1/2 ||w||² + Cξ
      s.t.  ∀ Y' ∈ Y\{Y}:  w^T [Ψ(X, Y) - Ψ(X, Y')] ≥ Δ(Y, Y') - ξ
      with  Ψ(X, Y) = Σ_{i=1}^n y_i x_i
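The quadratic blow-up of the pairwise approach is easy to see: one training example per (positive, negative) pair. A minimal sketch of the pair construction (the function name `roc_pairs` is illustrative; in the pairwise SVM each pair (i, j) would contribute one difference example x_i - x_j):

```python
def roc_pairs(y):
    """Index pairs (i, j) with y_i = +1 and y_j = -1: N+ * N- = O(N^2) pairs."""
    I = [i for i, yi in enumerate(y) if yi == +1]
    J = [j for j, yj in enumerate(y) if yj == -1]
    return [(i, j) for i in I for j in J]
```

For a balanced sample of N examples this already yields N²/4 pairs, which is why the slide calls standard SVM learning on these pairs unusably slow.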

This has been done for auROC, F1, and BER, and is doable for any other measure based on scores in the contingency table.

AIM: do this for auPRC/ROC_N!

      min_{w, ξ≥0}  1/2 ||w||² + Cξ
      s.t.  ∀ Y' ∈ Y\{Y}:  w^T [Ψ(X, Y) - Ψ(X, Y')] ≥ Δ(Y, Y') - ξ
      with  Ψ(X, Y) = Σ_{i=1}^n y_i x_i

Anyone interested?

Outlook: multiclass, regression, ...