Performance Measures. Sören Sonnenburg. Fraunhofer FIRST.IDA, Kekuléstr. 7, Berlin, Germany

Size: px

Start display at page:

Download "Performance Measures. Sören Sonnenburg. Fraunhofer FIRST.IDA, Kekuléstr. 7, Berlin, Germany"

Easter Harvey
5 years ago
Views:

1 Sören Sonnenburg Fraunhofer FIRST.IDA, Kekuléstr. 7, 2489 Berlin, Germany

2 Roadmap: Contingency Table Scores from the Contingency Table Curves from the Contingency Table Discussion Sören Sonnenburg

3 Contingency Table Contingency Table / Confusion Matrix Given sample x...x N X 2-class classifier f(x) {, +}, outputs f(x )...f(x n ) for x...x N true labeling (y...y N ) {,+} N partitions the N sample into outputs\ labeling y=+ y=- Σ f(x)=+ TP FP O + f(x)=- FN TN O Σ N + N N Sören Sonnenburg 2

4 Scores I o\ l y=+ y=- Σ f(x)=+ TP FP O + f(x)=- FN TN O Σ N + N N acc = TP+TN N - accurracy err = FP+FN N - error Sören Sonnenburg 3

5 Scores II o\ l y=+ y=- Σ f(x)=+ TP FP O + f(x)=- FN TN O Σ N + N N tpr = TP/N + = tnr = TN/N = fnr = FN/N + = fpr = FP/N = ppv = TP/O + = fdr = FP/O + = npv = TN/O = fdr = FN/O = TP T P +F N - sensitivity/recall TN T N +F P - specificity FN F N +T P - -sensitivity FP F P +T N - -specificity TP T P +F P - positive predictive value / precision FP F P +T P - false discovery rate TN T N +F N - negative predictive value FN FN+TN - negative false discovery rate Sören Sonnenburg 4

6 Scores III o\ l y=+ y=- Σ f(x)=+ TP FP O + f(x)=- FN TN O Σ N + N N ber = 2 ( FN FN+TP + ) FP FP+TN - balanced error WRacc = TP TP+FN FP F P +T N - weighted relative accuracy f = 2 TP 2 TP+FP+FN - F score (harmonic mean between precision/recall) cc = TP 2 FP FN - cross correlation (TP+FP)(TP+FN)(TN+FP)(TN+FN) Sören Sonnenburg 5

7 curves Receiver Operator Characteristic Curve ROC ROC true positive rate true positive rate false positive rate false positive rate use tpr/fpr obtained by varying bias independent of class skew monotonely ascending Sören Sonnenburg 6

8 curves Precision Recall Curve positive predictive value 0. PPV true positive rate positive predictive value PPV true positive rate use ppv/tpr obtained by varying bias depends on class skew not monotone Sören Sonnenburg 7

9 curves ROC vs. PRC ROC PPV true positive rate false positive rate ROC (0 times as many negatives) positive predictive value true positive rate PRC (0 times as many negatives) true positive rate false positive rate positive predictive value true positive rate Sören Sonnenburg 8

10 Area under the Curves auroc remember: tpr = TP/N + = TP TP+FN and fpr = FP/N = FP FP+TN Properties considering output as ranking, it is the number of swappings such that output of positive examples output of negative examples divided by N + N equivalent to wilcoxon rank test Gini = 2 auroc independant of bias independant of class skew Sören Sonnenburg 9

11 Area under the Curves auprc remember: ppv = TP/O + = TP TP+FP and tpr = TP/N + = TP TP+FN Properties considering output as ranking,...??? meaning??? independant of bias dependant on class skew Sören Sonnenburg 0

12 Comparison accuracy, auroc, auprc Classification Performance (in percent) Accuracy Area under the ROC Area under the PRC Number of training examples Sören Sonnenburg

13 Extensions ROC N Score - area under the ROC curve up to the first N false-positives other??? Sören Sonnenburg 2

14 Summary When use what: balanced data accuracy/auroc/... unbalanced date BER/F/auPPV/CC other cases / other measures??? Sören Sonnenburg 3

15 auroc/auprc optimized training current classifiers minimize trainerror (+ some complexity) want to optimize auroc/auprc/... directly first approaches (also implemented in shogun) create pairs of examples (x I, x J ),I = {i y i = +}, J = {i y i = +}, N + Ṅ,e.g. O(N 2 ) examples in standard SVM learning unusuably slow recently TJ (ICML 2005) SVM multi : min w,ξ 0 2 w 2 + Cξ s.t. Y Y\Y : w T [Ψ(X, Y ) Ψ(X,Y )] (Y, Y ) ξ with n Ψ(X, Y ) = y i x i i= Sören Sonnenburg 4

16 AIM has been done for auroc, F, BER and is doable for any other measure based on scores in contingency table AIM: Do this for auprc/roc n! min w,ξ 0 2 w 2 + Cξ s.t. Y Y\Y : w T [Ψ(X, Y ) Ψ(X,Y )] (Y, Y ) ξ with n Ψ(X, Y ) = y i x i i= Anyone interested? Sören Sonnenburg 5

17 Outlook muliclass regression Sören Sonnenburg 6

Big Data Analytics: Evaluating Classification Performance April, 2016 R. Bohn. Some overheads from Galit Shmueli and Peter Bruce 2010

Big Data Analytics: Evaluating Classification Performance April, 2016 R. Bohn 1 Some overheads from Galit Shmueli and Peter Bruce 2010 Most accurate Best! Actual value Which is more accurate?? 2 Why Evaluate