Concentration-based Delta Check for Laboratory Error Detection


Northeastern University, Department of Electrical and Computer Engineering

Concentration-based Delta Check for Laboratory Error Detection

Biomedical Signal Processing, Imaging, Reasoning, and Learning (BSPIRAL) Group

Author: Jamshid Sourati
Reviewers: Deniz Erdogmus, Murat Akcakaya, Steve C. Kazmierczak, Todd K. Leen
Supervisor: Deniz Erdogmus

June 2015

* Cite this report in the following format: Jamshid Sourati, "Concentration-based Delta Check for Laboratory Error Detection," Technical Report BSPIRAL-563-R, Northeastern University, 2015.

Abstract

Investigating the variation of clinical measurements of patients over time, a technique known as the delta check, is a common way of detecting laboratory errors. Delta checks are based on the expected biological variations and on machine imprecision, where the latter varies across different concentrations of the analytes. Here, we present a novel delta check method in the form of composite thresholding, and provide its sufficient statistics by constructing the corresponding discriminant function, which enables us to use standard statistical and learning analysis tools. Using the scores obtained from this discriminant function, we statistically study the performance of our algorithm on a labeled data set for the purpose of detecting lab errors.

Contents

1 Introduction
2 Notations
3 Decision Rules
3.1 Causal Delta Check
3.2 Non-Causal Delta Check
4 Data Analysis
4.1 Sufficient Statistics
4.2 Estimating the ROC Curves
4.3 Estimating the AUC
4.4 Comparing the ROC Curves
5 Experimental Results
5.1 Experiment Settings
5.2 Statistics
6 Conclusion

1 Introduction

Quality control is an important stage in analyzing specimens in clinical laboratories. The goal is to detect erroneous measurements in a pool of clinical samples. Traditional quality control systems represent a significant expenditure of financial and personnel resources, and are sensitive to only a small percentage of the laboratory errors (Witte et al., 1997; Hawkins, 2012). Computer-aided algorithms have recently been developed for automatically detecting errors among the samples. Each clinical measurement is a vector of numbers, where each component represents the evaluated concentration of a specific substance, or analyte, in the patient's blood. Most automatic algorithms are based on the differences between the analytes of the current measurement of a given patient and those in a prior measurement of the same patient. Normally, these changes do not exceed certain upper limits unless there is an error in the reported analyte values. Such variation-based detection techniques are usually called delta checks in the clinical literature (Strathmann et al., 2011).

Delta checks can be more complicated than simple thresholding. For instance, the magnitude of the analyte variations usually changes across concentrations (Ricos et al., 2009). Therefore, a composite thresholding is needed, where both the cut-off parameters and the values to be thresholded differ across the concentration ranges. Such approaches are not straightforward to analyze, as they do not match the framework of standard statistical and learning analysis. For instance, most of the theoretical results regarding the computation and comparison of ROC curves assume that a one-dimensional continuous test is being thresholded for labeling the samples (Pepe, 2003).

In this paper, we present a novel composite delta check, designed for automatically detecting erroneous clinical measurements, and also present its sufficient statistics. These statistics are obtained by building the discriminant functions of our decision rules and can also be viewed as a dimensionality reduction of the feature vectors. Calculating the scalar scores using such discriminant functions enables us to apply various statistical and learning tools to analyze the delta check, say, by building learning models or using statistical analysis of the data. Here, as simple examples, we use them to evaluate the performance of our detection algorithm by means of statistical analysis based on the ROC curves. Under this analysis, we discuss the strength of each analyte in distinguishing the erroneous samples.

2 Notations

Throughout this paper, we use $\Omega$ to denote the set of analytes that are evaluated for each patient. The vector $x = [x_1, \dots, x_d] \in \mathbb{R}^d$ (where $d = |\Omega|$) contains the evaluations of the analytes in $\Omega$ for a given patient at a specific time. The differences between the analyte values in $x$ and the prior and subsequent measurements of the same patient, within a 24-hour interval, are denoted by the variation vectors $\Delta x = [\Delta x_1, \dots, \Delta x_d]$ and $\Delta' x = [\Delta' x_1, \dots, \Delta' x_d]$, respectively. Also assume that $n$ denotes the number of samples available. Finally, $\Phi(\cdot)$ is the CDF of the standard normal distribution $\mathcal{N}(0, 1)$, and $\mathbb{1}(A)$ is an indicator function whose value is 1 if the expression $A$ is true, and 0 otherwise.

Lab error detection problem: given a set of evaluated analytes stored in $x$, together with either one or both types of variations $\Delta x$ and $\Delta' x$, use all or a subset of the analytes to

determine if there exists an erroneous measurement in $x$.

3 Decision Rules

In this section, we discuss forming the feature vectors and formulate our decision rules in two general cases: (1) a causal check using only $\Delta x$, and (2) a non-causal check using both $\Delta x$ and $\Delta' x$. A decision rule is a mapping from the feature space to the binary space $\{0, 1\}$, where 1 represents the decision of labeling a sample as an error, and 0 otherwise.

3.1 Causal Delta Check

In order to take the different concentrations into account, we divide $[0, \infty)$, the range of all possible values of the analyte evaluations, into three intervals. For instance, for the analyte indexed by $a$ ($1 \le a \le d$), we get $[0, l_a]$, $(l_a, u_a]$ and $(u_a, \infty)$, where each interval is assigned a different threshold: $\beta_{1,a}$ (an absolute value), $\beta_{2,a}$ and $\beta_{3,a}$ (in the form of a percentage with respect to $x_a$), respectively.

First, suppose the decision is to be made based on the individual analyte $a$. The feature vector is constructed as $y_a = [x_a \;\; \Delta x_a]$. Then, our decision rule in this case, denoted by $h_a$, is
$$h_a(y_a) = \begin{cases} \mathbb{1}\left(|\Delta x_a| \ge \beta_{1,a}\right), & x_a \le l_a \\ \mathbb{1}\left(|\Delta x_a| / x_a \ge \beta_{2,a}\right), & l_a < x_a \le u_a \\ \mathbb{1}\left(|\Delta x_a| / x_a \ge \beta_{3,a}\right), & u_a < x_a \end{cases} \qquad (1)$$

Now, if we consider all the analytes in $\Omega$, we construct the feature vector by including all the values and variations:
$$y = [\underbrace{x_1 \; \Delta x_1}_{y_1} \;\; \underbrace{x_2 \; \Delta x_2}_{y_2} \;\; \dots \;\; \underbrace{x_d \; \Delta x_d}_{y_d}].$$
The sample will be labeled as an error if the variation of at least one of the analytes exceeds its threshold. Hence, we can formulate the decision rule as
$$h(y) = \max_{1 \le i \le d} \{h_i(y_i)\}. \qquad (2)$$

3.2 Non-Causal Delta Check

Here, we use the same partitioning of the concentration ranges of the analytes, and the same thresholds in each division, as in the previous case. Again, let us first focus on a single analyte indexed by $a$. The feature vector will be constructed as $y_a = [x_a \;\; \Delta x_a \;\; \Delta' x_a]$. Then, we label the sample as an error if the variations in both directions, i.e. $\Delta x_a$ and $\Delta' x_a$, violate the corresponding threshold. Such a decision

rule can be written as
$$h_a(y_a) = \begin{cases} \mathbb{1}\left(|\Delta x_a| \ge \beta_{1,a}\right)\,\mathbb{1}\left(|\Delta' x_a| \ge \beta_{1,a}\right), & x_a \le l_a \\ \mathbb{1}\left(|\Delta x_a| / x_a \ge \beta_{2,a}\right)\,\mathbb{1}\left(|\Delta' x_a| / x_a \ge \beta_{2,a}\right), & l_a < x_a \le u_a \\ \mathbb{1}\left(|\Delta x_a| / x_a \ge \beta_{3,a}\right)\,\mathbb{1}\left(|\Delta' x_a| / x_a \ge \beta_{3,a}\right), & u_a < x_a \end{cases} \qquad (3)$$

The feature vector when using all the analytes in $\Omega$ is constructed as
$$y = [\underbrace{x_1 \; \Delta x_1 \; \Delta' x_1}_{y_1} \;\; \underbrace{x_2 \; \Delta x_2 \; \Delta' x_2}_{y_2} \;\; \dots \;\; \underbrace{x_d \; \Delta x_d \; \Delta' x_d}_{y_d}].$$
As before, when considering multiple analytes, we label the given sample as an error if there exists one analyte $a \in \Omega$ that has an erroneous measurement according to $h_a(y_a)$. Hence, an equation similar to (2) holds for the non-causal delta check.

4 Data Analysis

In this section, we present the sufficient statistics by constructing the discriminant functions of our delta check. Evaluating the discriminant functions gives us scalar scores that tend to be larger in the error class. The ROC curves of the decision rules can be empirically estimated by varying a threshold over the scores and each time classifying the scores exceeding the threshold as errors. Here, we also discuss estimation of the AUC values and their confidence intervals, as well as comparison of the performance of single-analyte delta checks using different analytes, by means of a one-sided hypothesis test over the difference between their AUC values.

4.1 Sufficient Statistics

The idea in constructing the statistics is to relax the indicator functions in the decision rules that compare the (normalized) variation with the thresholds (see (1) or (3)). First, note that we can rewrite each of the decision rules in a single expression. For example, the single-analyte causal decision rule in (1) can be reformulated as
$$h_a(y_a) = \mathbb{1}(x_a \le l_a)\,\mathbb{1}\left(|\Delta x_a| \ge \beta_{1,a}\right) + \mathbb{1}(l_a < x_a \le u_a)\,\mathbb{1}\left(\frac{|\Delta x_a|}{x_a} \ge \beta_{2,a}\right) + \mathbb{1}(u_a < x_a)\,\mathbb{1}\left(\frac{|\Delta x_a|}{x_a} \ge \beta_{3,a}\right) \qquad (4)$$
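As an illustration, the composite threshold rules in (1)-(3) can be sketched in Python. This is a minimal sketch; the cut-offs and thresholds used in the example below are placeholder values, not the clinically derived parameters used in this report:

```python
def causal_check(x, dx, l, u, beta):
    """Single-analyte causal delta check, rule (1).

    x    : analyte concentration
    dx   : variation with respect to the prior measurement
    l, u : cut-offs splitting the concentration range into three intervals
    beta : (b1, b2, b3) thresholds; b1 is absolute, b2 and b3 are
           fractions of the concentration x
    Returns 1 (error) or 0 (no error).
    """
    b1, b2, b3 = beta
    if x <= l:                        # low range: absolute threshold
        return int(abs(dx) >= b1)
    if x <= u:                        # middle range: relative threshold
        return int(abs(dx) / x >= b2)
    return int(abs(dx) / x >= b3)     # high range: relative threshold

def noncausal_check(x, dx_prior, dx_next, l, u, beta):
    """Rule (3): flag an error only if the variations in BOTH directions
    exceed the threshold (for 0/1 indicators, the product equals the min)."""
    return min(causal_check(x, dx_prior, l, u, beta),
               causal_check(x, dx_next, l, u, beta))

def multi_analyte_check(per_analyte_decisions):
    """Rule (2): a sample is an error if at least one analyte is flagged."""
    return max(per_analyte_decisions)
```

For example, with placeholder parameters `l=5`, `u=20`, `beta=(2.0, 0.2, 0.1)`, a sample at concentration 10 with a prior variation of 3 is flagged by the causal check (3/10 = 0.3 exceeds the 20% relative threshold), while the non-causal check clears the same sample if the subsequent variation is only 0.5.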

By relaxing the second indicator function in each term, we get the following discriminant score (function):
$$s_a(y_a) = \mathbb{1}(x_a \le l_a)\left(|\Delta x_a| - \beta_{1,a}\right) + \mathbb{1}(l_a < x_a \le u_a)\left(\frac{|\Delta x_a|}{x_a} - \beta_{2,a}\right) + \mathbb{1}(u_a < x_a)\left(\frac{|\Delta x_a|}{x_a} - \beta_{3,a}\right) \qquad (5)$$
Note that the resulting scalar tends to be larger (in the positive direction) when $y_a$ is actually an erroneous sample. Our decision is obtained by thresholding the score values at zero. The discriminant score for the causal multiple-analyte case is constructed by replacing the decision rules $h_i$ in equation (2) with the scalar transformations $s_i$:
$$s(y) = \max_{1 \le i \le d} \{s_i(y_i)\}. \qquad (6)$$

Construction of the discriminant function is more complicated in the case of non-causal delta checks. Note that relaxing the indicator functions of (3) directly is troublesome: even if the sample is correct, and therefore the differences between the (normalized) variations and the thresholds in both directions tend to be negative, their multiplication yields a positive value. In order to resolve this issue, we reformulate the rule by replacing the multiplication of the indicators with their minimum (see (7)). The relaxed score can then be written as in (8). Finally, a maximization similar to equation (6) suffices to obtain $s(y)$ for the multiple-analyte case.
$$h_a(y_a) = \begin{cases} \min\left\{\mathbb{1}\left(|\Delta x_a| \ge \beta_{1,a}\right),\; \mathbb{1}\left(|\Delta' x_a| \ge \beta_{1,a}\right)\right\}, & x_a \le l_a \\ \min\left\{\mathbb{1}\left(|\Delta x_a| / x_a \ge \beta_{2,a}\right),\; \mathbb{1}\left(|\Delta' x_a| / x_a \ge \beta_{2,a}\right)\right\}, & l_a < x_a \le u_a \\ \min\left\{\mathbb{1}\left(|\Delta x_a| / x_a \ge \beta_{3,a}\right),\; \mathbb{1}\left(|\Delta' x_a| / x_a \ge \beta_{3,a}\right)\right\}, & u_a < x_a \end{cases} \qquad (7)$$

$$s_a(y_a) = \mathbb{1}(x_a \le l_a)\min\left\{|\Delta x_a| - \beta_{1,a},\; |\Delta' x_a| - \beta_{1,a}\right\} + \mathbb{1}(l_a < x_a \le u_a)\min\left\{\frac{|\Delta x_a|}{x_a} - \beta_{2,a},\; \frac{|\Delta' x_a|}{x_a} - \beta_{2,a}\right\} + \mathbb{1}(u_a < x_a)\min\left\{\frac{|\Delta x_a|}{x_a} - \beta_{3,a},\; \frac{|\Delta' x_a|}{x_a} - \beta_{3,a}\right\} \qquad (8)$$

4.2 Estimating the ROC Curves

After embedding the feature vectors into a scalar space, the ROC curves can easily be estimated empirically by varying a threshold over the transformed scalar values and computing

[Figure 1: ROC curves, together with their confidence regions (90%, 95%, 99%), of (a) the causal single-analyte delta checks, (b) the causal multiple-analyte check, and (c) the non-causal multiple-analyte check.]

the false positive rate ($FPR$) and true positive rate ($TPR$) in each case. In order to compute the $(1-\alpha)$ confidence region of a given operating point $(FPR, TPR)$, we consider the number of false positives ($FP = n \cdot FPR$) and true positives ($TP = n \cdot TPR$) as two independent binomial random variables with success probabilities $FPR$ and $TPR$, respectively, and $n$ as the number of trials. Then, the confidence intervals of $FPR$ and $TPR$ can be computed accordingly (Johnson et al., 2005). Let $I_{1,\alpha'}$ and $I_{2,\alpha'}$ be the $(1-\alpha')$ confidence intervals of $FPR$ and $TPR$ respectively, where $1-\alpha' = \sqrt{1-\alpha}$; then, because of the independence assumption, the rectangle $I_{1,\alpha'} \times I_{2,\alpha'}$ is the $(1-\alpha)$ confidence region of the pair $(FPR, TPR)$. We take the union of the rectangles obtained for all the operating points as the $(1-\alpha)$ confidence region of our empirical ROC.

4.3 Estimating the AUC

In the formulations of this section, we focus on the non-causal multiple-analyte case only, but the same can be done for all the other cases. The AUC value can be estimated either by computing the area under the empirical ROC curve, or by using the survivor functions under the error and non-error classes, which are the same as the probabilities of detection and false

[Figure 2: AUC values, together with their confidence intervals (90%, 95%, 99%), for the single- and multiple-analyte delta checks in (a) the causal and (c) the non-causal mode; the p-values of the pairwise hypothesis tests described in Section 4.4 are displayed as matrices in (b) and (d).]

alarm, respectively: for all $t \in \mathbb{R}$,
$$\text{error:} \quad S_e(t) = \Pr\left(s(y) \ge t \mid y \text{ is an error}\right), \qquad (9a)$$
$$\text{non-error:} \quad S_n(t) = \Pr\left(s(y) \ge t \mid y \text{ is not an error}\right). \qquad (9b)$$
These functions can be approximated empirically or using kernel density estimation (KDE). We denote the resulting approximations by $\hat{S}_e$ and $\hat{S}_n$. Let us denote the vector of scalar scores by $s = [s_1, \dots, s_n]$, where $s_i = s(y_i)$. Without loss of generality, suppose that the transformed values in $s$ are sorted such that the first $k$ scalars are errors and the rest are sound measurements. Then, the AUC and its variance can be approximated as below (DeLong

et al., 1988):
$$\text{AUC} \approx 1 - \frac{1}{k}\sum_{i=1}^{k} \hat{S}_n(s_i), \qquad (10a)$$
$$\text{Var}[\text{AUC}] \approx \frac{1}{k}\,\text{Var}\left\{\hat{S}_n(s_i) \mid i = 1, \dots, k\right\} + \frac{1}{n-k}\,\text{Var}\left\{\hat{S}_e(s_i) \mid i = k+1, \dots, n\right\} \qquad (10b)$$
The confidence interval of the AUC can also be approximated by assuming that its logit transformation, i.e. $\log\left(\frac{\text{AUC}}{1-\text{AUC}}\right)$, is normally distributed with the variance calculated in (10b) (Pepe, 2003).

4.4 Comparing the ROC Curves

For two given analytes $a, b \in \Omega$, denote the vectors of single-analyte sorted scores in the non-causal case by $s_a$ and $s_b$, respectively (the formulation for the causal case is exactly the same). In order to compare the performance of the delta checks using these analytes, we consider the difference between their AUC values, $\Delta\text{AUC}_{a,b} = \text{AUC}_a - \text{AUC}_b$, where $\text{AUC}_a$ and $\text{AUC}_b$ are estimated based on $s_a$ and $s_b$, respectively. However, these two AUC values are not independent, as $s_a$ and $s_b$ are computed from the same data; therefore, the variance of $\Delta\text{AUC}_{a,b}$ is not additive. It can be shown that this variance can be approximated as below (DeLong et al., 1988):
$$\text{Var}[\Delta\text{AUC}_{a,b}] = \frac{1}{k}\,\text{Var}\left\{\hat{S}_n(s_{a,i}) - \hat{S}_n(s_{b,i}) \mid i = 1, \dots, k\right\} + \frac{1}{n-k}\,\text{Var}\left\{\hat{S}_e(s_{a,i}) - \hat{S}_e(s_{b,i}) \mid i = k+1, \dots, n\right\} \qquad (11)$$
Then, assuming that $\Delta\text{AUC}_{a,b} \sim \mathcal{N}(\mu, \text{Var}[\Delta\text{AUC}_{a,b}])$, the following one-sided hypothesis test is performed:
$$H_0: \mu \le 0 \qquad H_1: \mu > 0$$
So $H_0$ is the hypothesis that the AUC obtained by using analyte $a$ is no better than that obtained by using analyte $b$. The AUC difference tends to be larger under $H_1$ than under $H_0$; therefore, it is natural to reject $H_0$ when $\Delta\text{AUC}_{a,b}$ is large. We take the test statistic
$$T = \frac{\Delta\text{AUC}_{a,b}}{\sqrt{\text{Var}[\Delta\text{AUC}_{a,b}]}}$$
and reject $H_0$ when $T \ge c$, for some critical value $c$. The power function of this test is $1 - \Phi\left(c - \frac{\mu}{\sqrt{\text{Var}[\Delta\text{AUC}_{a,b}]}}\right)$, which gives a test size of $1 - \Phi(c)$. We are interested in the p-value of the test, which can be shown to be $1 - \Phi\left(\frac{\Delta\text{AUC}_{a,b}}{\sqrt{\text{Var}[\Delta\text{AUC}_{a,b}]}}\right)$. By definition, the smaller the p-value of an observed AUC difference, the stronger the evidence we have to reject $H_0$, i.e. the more likely it is that using analyte $a$ is not worse than using analyte $b$.
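The score-based AUC estimate and the paired comparison above can be sketched as follows. This is a minimal sketch that substitutes empirical survivor functions for the KDE-based ones; the function names are our own, and the correspondence to equations (10) and (11) is noted in the comments:

```python
import math

def survivor(scores, t):
    """Empirical survivor function: fraction of scores that are >= t."""
    return sum(v >= t for v in scores) / len(scores)

def _var(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def auc_with_var(err, ok):
    """AUC estimate (10a) and its variance (10b) from error / non-error scores."""
    sn = [survivor(ok, s) for s in err]   # S_n evaluated at the error scores
    se = [survivor(err, s) for s in ok]   # S_e evaluated at the non-error scores
    auc = 1.0 - sum(sn) / len(sn)
    return auc, _var(sn) / len(sn) + _var(se) / len(se), sn, se

def compare_analytes(err_a, ok_a, err_b, ok_b):
    """One-sided test of H0: analyte a is no better than analyte b.
    The scores for a and b must come from the same samples, in the same
    order, so that the correlated variance (11) applies."""
    auc_a, _, sn_a, se_a = auc_with_var(err_a, ok_a)
    auc_b, _, sn_b, se_b = auc_with_var(err_b, ok_b)
    dn = [p - q for p, q in zip(sn_a, sn_b)]
    de = [p - q for p, q in zip(se_a, se_b)]
    var_diff = _var(dn) / len(dn) + _var(de) / len(de)   # eq. (11)
    t_stat = (auc_a - auc_b) / math.sqrt(var_diff)
    p_value = 1.0 - 0.5 * (1.0 + math.erf(t_stat / math.sqrt(2.0)))
    return auc_a - auc_b, p_value
```

Since the same survivor functions give the detection and false-alarm probabilities at any threshold, the two helpers also reproduce the empirical operating points of Section 4.2 as a by-product.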

5 Experimental Results

5.1 Experiment Settings

The data that we evaluated consisted of laboratory test values obtained from hospital inpatients with renal failure, 18 years of age or older, seen at Oregon Health & Science University during the period of October 2010 through September 2011. The set of analytes, $\Omega$, that we considered in constructing the sample vectors is urea (BUN), creatinine, sodium (Na), potassium (K), chloride (Cl), total carbon dioxide (CO2), calcium (Ca), phosphorus (P) and albumin. Serial data from samples that were noted by the laboratory to be hemolyzed, which could have significantly affected the measured values of certain analytes (e.g., potassium), were excluded from our evaluation. We queried 805 samples, consisting of a mixture of randomly selected samples and low-likelihood samples under a GMM, to obtain their labels from the clinical experts; 64 (7.95%) of the samples showed an error. Among them, we had 436 samples with at least one prior measurement (with 37 errors), and 254 samples with both prior and subsequent measurements (with 26 errors). Therefore, $n$ is different for our causal and non-causal training data sets.

The parameters $\beta_{1,a}$, $\beta_{2,a}$ and $\beta_{3,a}$ are determined based on the physiological variation of analyte $a$ within the individual, and on machine imprecision. The former is specified based on recent literature (Ricos et al., 2009), and the latter based on quality control data obtained from the instrument used to measure the analytes. The concentration ranges of the analytes are also provided by the clinicians. Note that these parameters can also be learned, for example, by cross validation over the obtained labeled data. Furthermore, the survivor functions in (9) are computed using KDE based on Gaussian kernels with empirically fixed kernel widths.

5.2 Statistics

The ROC curves obtained for the single- and multiple-analyte decision rules are shown in Figure 1.
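The empirical ROC construction behind these curves, with the rectangular confidence regions of Section 4.2, can be sketched as below. This is a minimal sketch on synthetic scores; it uses normal-approximation binomial intervals for brevity (the report relies on the exact intervals of Johnson et al., 2005):

```python
import math
from statistics import NormalDist

def binom_ci(p_hat, n, level):
    """Normal-approximation confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half = z * math.sqrt(max(p_hat * (1.0 - p_hat), 1e-12) / n)
    return max(0.0, p_hat - half), min(1.0, p_hat + half)

def roc_with_regions(err_scores, ok_scores, alpha=0.05):
    """Sweep a threshold over the pooled scores; for each operating point
    return (FPR, TPR) plus the sides of a rectangular (1 - alpha)
    confidence region: each axis uses level sqrt(1 - alpha), so the
    product of the two independent coverages is (1 - alpha)."""
    level = math.sqrt(1.0 - alpha)
    points = []
    for t in sorted(set(err_scores + ok_scores)):
        tpr = sum(s >= t for s in err_scores) / len(err_scores)
        fpr = sum(s >= t for s in ok_scores) / len(ok_scores)
        points.append((fpr, tpr,
                       binom_ci(fpr, len(ok_scores), level),
                       binom_ci(tpr, len(err_scores), level)))
    return points
```

Taking the union of the returned rectangles over all operating points then gives the confidence band drawn around the empirical ROC curve.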
Because of lack of space, and because the curves of the single-analyte delta checks were very similar in the causal and non-causal modes, we show only the former. From Figure 1(a), BUN had the worst performance, and potassium and calcium were among the best, in terms of their average ROC curves (the magenta curves). Other analytes were somewhere in between. The AUC values of these ROC curves, together with their confidence intervals, are shown in Figure 2(a,c). We observe that the relative performance of the single analytes in terms of the AUC values is similar in both the causal and non-causal modes, with longer confidence intervals for the latter. This is because $n$ is smaller in the non-causal case.

The resulting p-values of our pairwise hypothesis tests are also shown in Figure 2(b,d). For each case, a matrix is displayed whose $(a, b)$ entry ($1 \le a, b \le d$) represents the p-value of the hypothesis test over $\Delta\text{AUC}_{a,b}$. Therefore, darkness of such an entry implies that our data provide strong evidence against the null hypothesis that analyte $a$ does not outperform analyte $b$. Observe that the darkest rows of each matrix are those associated with potassium and calcium; that is, in the comparison between these analytes and the rest, it is highly probable that they are not worse than the others. In contrast, the rows corresponding to BUN and creatinine are among the brightest. These are in accordance with our observations on the

ROC curves and the AUC values.

Figures 1(b,c) illustrate the ROC curves of the multiple-analyte decision rules. They showed significantly better performance than the single-analyte checks. This was expected, as they use the information from all the analytes. Also observe that, not surprisingly, the non-causal mode outperforms the causal mode. Recall that in the former we use the variations in both directions, whereas in the latter we are restricted to a subset of this knowledge by focusing only on the prior measurement. The AUC values of these multiple-analyte checks are also shown in the last rows of Figures 2(a,c), which show that their mean AUCs (the red dots) are larger than all the single-analyte AUCs.

6 Conclusion

In this paper, we proposed a novel concentration-dependent delta check algorithm and provided its sufficient statistics by constructing discriminant functions based on the decision rules of our algorithm. Computing the scores with the discriminant functions enabled us to perform various statistical analyses, such as empirically estimating the ROC curves and the AUC values, together with their confidence regions. The performance of our proposed delta check under various single analytes was also compared in a pairwise manner based on the differences between their AUC values. In future work, we will incorporate correlations between the variations of the analytes when detecting lab errors, develop a soft probabilistic classifier for the multiple-analyte delta check (rather than a hard-max classifier), and devise an active learning framework based on the proposed discriminant function to efficiently query samples from the clinical experts.

References

DeLong, E. R., DeLong, D. M., and Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, 44(3):837-845.

Hawkins, R. (2012). Managing the pre- and post-analytical phases of the total testing process.
Annals of Laboratory Medicine, 32(1):5-16.

Johnson, N. L., Kemp, A. W., and Kotz, S. (2005). Univariate Discrete Distributions, volume 444. John Wiley & Sons.

Pepe, M. S. (2003). The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press.

Ricos, C., Alvarez, V., and Cava, F. (2009). Biologic variation and desirable specifications for QC.

Strathmann, F. G., Baird, G. S., and Hoffman, N. G. (2011). Simulations of delta check rule performance to detect specimen mislabeling using historical laboratory data. Clinica Chimica Acta, 412(21-22):1973-1977.

Witte, D. L., VanNess, S. A., Angstadt, D. S., and Pennell, B. J. (1997). Errors, mistakes, blunders, outliers, or unacceptable results: how many? Clinical Chemistry, 43(8):1352-1356.


Sample Size and Power I: Binary Outcomes. James Ware, PhD Harvard School of Public Health Boston, MA Sample Size and Power I: Binary Outcomes James Ware, PhD Harvard School of Public Health Boston, MA Sample Size and Power Principles: Sample size calculations are an essential part of study design Consider

More information

LECTURE 5 HYPOTHESIS TESTING

LECTURE 5 HYPOTHESIS TESTING October 25, 2016 LECTURE 5 HYPOTHESIS TESTING Basic concepts In this lecture we continue to discuss the normal classical linear regression defined by Assumptions A1-A5. Let θ Θ R d be a parameter of interest.

More information

Question. Hypothesis testing. Example. Answer: hypothesis. Test: true or not? Question. Average is not the mean! μ average. Random deviation or not?

Question. Hypothesis testing. Example. Answer: hypothesis. Test: true or not? Question. Average is not the mean! μ average. Random deviation or not? Hypothesis testing Question Very frequently: what is the possible value of μ? Sample: we know only the average! μ average. Random deviation or not? Standard error: the measure of the random deviation.

More information

Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines

Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2018 CS 551, Fall

More information

Introduction to Statistical Inference

Introduction to Statistical Inference Structural Health Monitoring Using Statistical Pattern Recognition Introduction to Statistical Inference Presented by Charles R. Farrar, Ph.D., P.E. Outline Introduce statistical decision making for Structural

More information

TESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN

TESTS FOR EQUIVALENCE BASED ON ODDS RATIO FOR MATCHED-PAIR DESIGN Journal of Biopharmaceutical Statistics, 15: 889 901, 2005 Copyright Taylor & Francis, Inc. ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543400500265561 TESTS FOR EQUIVALENCE BASED ON ODDS RATIO

More information

Anomaly Detection. Jing Gao. SUNY Buffalo

Anomaly Detection. Jing Gao. SUNY Buffalo Anomaly Detection Jing Gao SUNY Buffalo 1 Anomaly Detection Anomalies the set of objects are considerably dissimilar from the remainder of the data occur relatively infrequently when they do occur, their

More information

SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION

SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION 1 Outline Basic terminology Features Training and validation Model selection Error and loss measures Statistical comparison Evaluation measures 2 Terminology

More information

Lecture 3 Classification, Logistic Regression

Lecture 3 Classification, Logistic Regression Lecture 3 Classification, Logistic Regression Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University. Email: fredrik.lindsten@it.uu.se F. Lindsten Summary

More information

Learning Methods for Linear Detectors

Learning Methods for Linear Detectors Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2011/2012 Lesson 20 27 April 2012 Contents Learning Methods for Linear Detectors Learning Linear Detectors...2

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Vapnik Chervonenkis Theory Barnabás Póczos Empirical Risk and True Risk 2 Empirical Risk Shorthand: True risk of f (deterministic): Bayes risk: Let us use the empirical

More information

Support Vector Machine. Industrial AI Lab. Prof. Seungchul Lee

Support Vector Machine. Industrial AI Lab. Prof. Seungchul Lee Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories /

More information

Topic 3: Hypothesis Testing

Topic 3: Hypothesis Testing CS 8850: Advanced Machine Learning Fall 07 Topic 3: Hypothesis Testing Instructor: Daniel L. Pimentel-Alarcón c Copyright 07 3. Introduction One of the simplest inference problems is that of deciding between

More information

Computational paradigms for the measurement signals processing. Metodologies for the development of classification algorithms.

Computational paradigms for the measurement signals processing. Metodologies for the development of classification algorithms. Computational paradigms for the measurement signals processing. Metodologies for the development of classification algorithms. January 5, 25 Outline Methodologies for the development of classification

More information

Lecture Slides for INTRODUCTION TO. Machine Learning. ETHEM ALPAYDIN The MIT Press,

Lecture Slides for INTRODUCTION TO. Machine Learning. ETHEM ALPAYDIN The MIT Press, Lecture Slides for INTRODUCTION TO Machine Learning ETHEM ALPAYDIN The MIT Press, 2004 alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml CHAPTER 14: Assessing and Comparing Classification Algorithms

More information

Bayesian Decision Theory

Bayesian Decision Theory Bayesian Decision Theory Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2017 CS 551, Fall 2017 c 2017, Selim Aksoy (Bilkent University) 1 / 46 Bayesian

More information

Class 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

Class 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700 Class 4 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science Copyright 013 by D.B. Rowe 1 Agenda: Recap Chapter 9. and 9.3 Lecture Chapter 10.1-10.3 Review Exam 6 Problem Solving

More information

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation. CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.

More information

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples Objective Section 9.4 Inferences About Two Means (Matched Pairs) Compare of two matched-paired means using two samples from each population. Hypothesis Tests and Confidence Intervals of two dependent means

More information

An Introduction to Machine Learning

An Introduction to Machine Learning An Introduction to Machine Learning L2: Instance Based Estimation Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia Alex.Smola@nicta.com.au Tata Institute, Pune, January

More information

Unsupervised Learning Methods

Unsupervised Learning Methods Structural Health Monitoring Using Statistical Pattern Recognition Unsupervised Learning Methods Keith Worden and Graeme Manson Presented by Keith Worden The Structural Health Monitoring Process 1. Operational

More information

Inverse Sampling for McNemar s Test

Inverse Sampling for McNemar s Test International Journal of Statistics and Probability; Vol. 6, No. 1; January 27 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Inverse Sampling for McNemar s Test

More information

Final Overview. Introduction to ML. Marek Petrik 4/25/2017

Final Overview. Introduction to ML. Marek Petrik 4/25/2017 Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,

More information

Lecture 4 Discriminant Analysis, k-nearest Neighbors

Lecture 4 Discriminant Analysis, k-nearest Neighbors Lecture 4 Discriminant Analysis, k-nearest Neighbors Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University. Email: fredrik.lindsten@it.uu.se fredrik.lindsten@it.uu.se

More information

Intelligent Systems Statistical Machine Learning

Intelligent Systems Statistical Machine Learning Intelligent Systems Statistical Machine Learning Carsten Rother, Dmitrij Schlesinger WS2014/2015, Our tasks (recap) The model: two variables are usually present: - the first one is typically discrete k

More information

Least Squares Classification

Least Squares Classification Least Squares Classification Stephen Boyd EE103 Stanford University November 4, 2017 Outline Classification Least squares classification Multi-class classifiers Classification 2 Classification data fitting

More information

Class 4: Classification. Quaid Morris February 11 th, 2011 ML4Bio

Class 4: Classification. Quaid Morris February 11 th, 2011 ML4Bio Class 4: Classification Quaid Morris February 11 th, 211 ML4Bio Overview Basic concepts in classification: overfitting, cross-validation, evaluation. Linear Discriminant Analysis and Quadratic Discriminant

More information

Hypothesis Evaluation

Hypothesis Evaluation Hypothesis Evaluation Machine Learning Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Hypothesis Evaluation Fall 1395 1 / 31 Table of contents 1 Introduction

More information

Multivariate statistical methods and data mining in particle physics

Multivariate statistical methods and data mining in particle physics Multivariate statistical methods and data mining in particle physics RHUL Physics www.pp.rhul.ac.uk/~cowan Academic Training Lectures CERN 16 19 June, 2008 1 Outline Statement of the problem Some general

More information

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal

More information

Probabilistic Machine Learning. Industrial AI Lab.

Probabilistic Machine Learning. Industrial AI Lab. Probabilistic Machine Learning Industrial AI Lab. Probabilistic Linear Regression Outline Probabilistic Classification Probabilistic Clustering Probabilistic Dimension Reduction 2 Probabilistic Linear

More information

An Overview of Outlier Detection Techniques and Applications

An Overview of Outlier Detection Techniques and Applications Machine Learning Rhein-Neckar Meetup An Overview of Outlier Detection Techniques and Applications Ying Gu connygy@gmail.com 28.02.2016 Anomaly/Outlier Detection What are anomalies/outliers? The set of

More information

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007 MIT OpenCourseWare http://ocw.mit.edu HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing Spring 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

ECE521: Inference Algorithms and Machine Learning University of Toronto. Assignment 1: k-nn and Linear Regression

ECE521: Inference Algorithms and Machine Learning University of Toronto. Assignment 1: k-nn and Linear Regression ECE521: Inference Algorithms and Machine Learning University of Toronto Assignment 1: k-nn and Linear Regression TA: Use Piazza for Q&A Due date: Feb 7 midnight, 2017 Electronic submission to: ece521ta@gmailcom

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level

More information

Empirical Evaluation (Ch 5)

Empirical Evaluation (Ch 5) Empirical Evaluation (Ch 5) how accurate is a hypothesis/model/dec.tree? given 2 hypotheses, which is better? accuracy on training set is biased error: error train (h) = #misclassifications/ S train error

More information

Intelligent Systems Statistical Machine Learning

Intelligent Systems Statistical Machine Learning Intelligent Systems Statistical Machine Learning Carsten Rother, Dmitrij Schlesinger WS2015/2016, Our model and tasks The model: two variables are usually present: - the first one is typically discrete

More information

Need for Sampling in Machine Learning. Sargur Srihari

Need for Sampling in Machine Learning. Sargur Srihari Need for Sampling in Machine Learning Sargur srihari@cedar.buffalo.edu 1 Rationale for Sampling 1. ML methods model data with probability distributions E.g., p(x,y; θ) 2. Models are used to answer queries,

More information

Anomaly Detection for the CERN Large Hadron Collider injection magnets

Anomaly Detection for the CERN Large Hadron Collider injection magnets Anomaly Detection for the CERN Large Hadron Collider injection magnets Armin Halilovic KU Leuven - Department of Computer Science In cooperation with CERN 2018-07-27 0 Outline 1 Context 2 Data 3 Preprocessing

More information

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE GRAVIS 2016 BASEL. Logistic Regression. Pattern Recognition 2016 Sandro Schönborn University of Basel

> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE GRAVIS 2016 BASEL. Logistic Regression. Pattern Recognition 2016 Sandro Schönborn University of Basel Logistic Regression Pattern Recognition 2016 Sandro Schönborn University of Basel Two Worlds: Probabilistic & Algorithmic We have seen two conceptual approaches to classification: data class density estimation

More information

CS 543 Page 1 John E. Boon, Jr.

CS 543 Page 1 John E. Boon, Jr. CS 543 Machine Learning Spring 2010 Lecture 05 Evaluating Hypotheses I. Overview A. Given observed accuracy of a hypothesis over a limited sample of data, how well does this estimate its accuracy over

More information

Estimating the accuracy of a hypothesis Setting. Assume a binary classification setting

Estimating the accuracy of a hypothesis Setting. Assume a binary classification setting Estimating the accuracy of a hypothesis Setting Assume a binary classification setting Assume input/output pairs (x, y) are sampled from an unknown probability distribution D = p(x, y) Train a binary classifier

More information

Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary

Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary ECE 830 Spring 207 Instructor: R. Willett Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we saw that the likelihood

More information

Nonparametric predictive inference with parametric copulas for combining bivariate diagnostic tests

Nonparametric predictive inference with parametric copulas for combining bivariate diagnostic tests Nonparametric predictive inference with parametric copulas for combining bivariate diagnostic tests Noryanti Muhammad, Universiti Malaysia Pahang, Malaysia, noryanti@ump.edu.my Tahani Coolen-Maturi, Durham

More information

Rejection regions for the bivariate case

Rejection regions for the bivariate case Rejection regions for the bivariate case The rejection region for the T 2 test (and similarly for Z 2 when Σ is known) is the region outside of an ellipse, for which there is a (1-α)% chance that the test

More information

2. What are the tradeoffs among different measures of error (e.g. probability of false alarm, probability of miss, etc.)?

2. What are the tradeoffs among different measures of error (e.g. probability of false alarm, probability of miss, etc.)? ECE 830 / CS 76 Spring 06 Instructors: R. Willett & R. Nowak Lecture 3: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we

More information

Machine Learning and Data Mining. Bayes Classifiers. Prof. Alexander Ihler

Machine Learning and Data Mining. Bayes Classifiers. Prof. Alexander Ihler + Machine Learning and Data Mining Bayes Classifiers Prof. Alexander Ihler A basic classifier Training data D={x (i),y (i) }, Classifier f(x ; D) Discrete feature vector x f(x ; D) is a con@ngency table

More information

Mining Classification Knowledge

Mining Classification Knowledge Mining Classification Knowledge Remarks on NonSymbolic Methods JERZY STEFANOWSKI Institute of Computing Sciences, Poznań University of Technology SE lecture revision 2013 Outline 1. Bayesian classification

More information

EEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1

EEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1 EEL 851: Biometrics An Overview of Statistical Pattern Recognition EEL 851 1 Outline Introduction Pattern Feature Noise Example Problem Analysis Segmentation Feature Extraction Classification Design Cycle

More information

Detection theory 101 ELEC-E5410 Signal Processing for Communications

Detection theory 101 ELEC-E5410 Signal Processing for Communications Detection theory 101 ELEC-E5410 Signal Processing for Communications Binary hypothesis testing Null hypothesis H 0 : e.g. noise only Alternative hypothesis H 1 : signal + noise p(x;h 0 ) γ p(x;h 1 ) Trade-off

More information

Three-group ROC predictive analysis for ordinal outcomes

Three-group ROC predictive analysis for ordinal outcomes Three-group ROC predictive analysis for ordinal outcomes Tahani Coolen-Maturi Durham University Business School Durham University, UK tahani.maturi@durham.ac.uk June 26, 2016 Abstract Measuring the accuracy

More information

Linear and Logistic Regression. Dr. Xiaowei Huang

Linear and Logistic Regression. Dr. Xiaowei Huang Linear and Logistic Regression Dr. Xiaowei Huang https://cgi.csc.liv.ac.uk/~xiaowei/ Up to now, Two Classical Machine Learning Algorithms Decision tree learning K-nearest neighbor Model Evaluation Metrics

More information

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods

Permutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods Permutation Tests Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods The Two-Sample Problem We observe two independent random samples: F z = z 1, z 2,, z n independently of

More information

Classification Ensemble That Maximizes the Area Under Receiver Operating Characteristic Curve (AUC)

Classification Ensemble That Maximizes the Area Under Receiver Operating Characteristic Curve (AUC) Classification Ensemble That Maximizes the Area Under Receiver Operating Characteristic Curve (AUC) Eunsik Park 1 and Y-c Ivan Chang 2 1 Chonnam National University, Gwangju, Korea 2 Academia Sinica, Taipei,

More information

Chapter IR:VIII. VIII. Evaluation. Laboratory Experiments Performance Measures Training and Testing Logging

Chapter IR:VIII. VIII. Evaluation. Laboratory Experiments Performance Measures Training and Testing Logging Chapter IR:VIII VIII. Evaluation Laboratory Experiments Performance Measures Logging IR:VIII-62 Evaluation HAGEN/POTTHAST/STEIN 2018 Statistical Hypothesis Testing Claim: System 1 is better than System

More information

Non-Inferiority Tests for the Ratio of Two Proportions in a Cluster- Randomized Design

Non-Inferiority Tests for the Ratio of Two Proportions in a Cluster- Randomized Design Chapter 236 Non-Inferiority Tests for the Ratio of Two Proportions in a Cluster- Randomized Design Introduction This module provides power analysis and sample size calculation for non-inferiority tests

More information

THE SKILL PLOT: A GRAPHICAL TECHNIQUE FOR EVALUATING CONTINUOUS DIAGNOSTIC TESTS

THE SKILL PLOT: A GRAPHICAL TECHNIQUE FOR EVALUATING CONTINUOUS DIAGNOSTIC TESTS THE SKILL PLOT: A GRAPHICAL TECHNIQUE FOR EVALUATING CONTINUOUS DIAGNOSTIC TESTS William M. Briggs General Internal Medicine, Weill Cornell Medical College 525 E. 68th, Box 46, New York, NY 10021 email:

More information

Pubh 8482: Sequential Analysis

Pubh 8482: Sequential Analysis Pubh 8482: Sequential Analysis Joseph S. Koopmeiners Division of Biostatistics University of Minnesota Week 10 Class Summary Last time... We began our discussion of adaptive clinical trials Specifically,

More information

Final Exam, Machine Learning, Spring 2009

Final Exam, Machine Learning, Spring 2009 Name: Andrew ID: Final Exam, 10701 Machine Learning, Spring 2009 - The exam is open-book, open-notes, no electronics other than calculators. - The maximum possible score on this exam is 100. You have 3

More information

Mining Classification Knowledge

Mining Classification Knowledge Mining Classification Knowledge Remarks on NonSymbolic Methods JERZY STEFANOWSKI Institute of Computing Sciences, Poznań University of Technology COST Doctoral School, Troina 2008 Outline 1. Bayesian classification

More information

Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size

Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size Berkman Sahiner, a) Heang-Ping Chan, Nicholas Petrick, Robert F. Wagner, b) and Lubomir Hadjiiski

More information

Machine Learning Concepts in Chemoinformatics

Machine Learning Concepts in Chemoinformatics Machine Learning Concepts in Chemoinformatics Martin Vogt B-IT Life Science Informatics Rheinische Friedrich-Wilhelms-Universität Bonn BigChem Winter School 2017 25. October Data Mining in Chemoinformatics

More information

6.873/HST.951 Medical Decision Support Spring 2004 Evaluation

6.873/HST.951 Medical Decision Support Spring 2004 Evaluation Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support, Fall 2005 Instructors: Professor Lucila Ohno-Machado and Professor Staal Vinterbo 6.873/HST.951 Medical Decision

More information