Concentration-based Delta Check for Laboratory Error Detection
Northeastern University
Department of Electrical and Computer Engineering

Concentration-based Delta Check for Laboratory Error Detection

Biomedical Signal Processing, Imaging, Reasoning, and Learning (BSPIRAL) Group

Author: Jamshid Sourati
Reviewers: Deniz Erdogmus, Murat Akcakaya, Steve C. Kazmierczak, Todd K. Leen
Supervisor: Deniz Erdogmus

June 2015

* Cite this report in the following format: Jamshid Sourati, "Concentration-based Delta Check for Laboratory Error Detection," Technical Report BSPIRAL-563-R, Northeastern University, 2015.
Abstract

Investigating the variation of clinical measurements of patients over time is a common technique, known as a delta check, for detecting laboratory errors. Delta checks are based on the expected biological variations and on machine imprecision, where the latter varies across different concentrations of the analytes. Here, we present a novel delta check method in the form of composite thresholding, and provide its sufficient statistics by constructing the corresponding discriminant function, which enables us to use standard statistical and learning analysis tools. Using the scores obtained from this discriminant function, we statistically study the performance of our algorithm on a labeled data set for the purpose of detecting lab errors.
Contents

1 Introduction
2 Notations
3 Decision Rules
  3.1 Causal Delta Check
  3.2 Non-Causal Delta Check
4 Data Analysis
  4.1 Sufficient Statistics
  4.2 Estimating the ROC Curves
  4.3 Estimating the AUC
  4.4 Comparing the ROC Curves
5 Experimental Results
  5.1 Experiment Settings
  5.2 Statistics
6 Conclusion
1 Introduction

Quality control is an important stage in analyzing specimens in clinical laboratories. The goal is to detect the erroneous measurements in a pool of clinical samples. Traditional quality control systems represent a significant expenditure of financial and personnel resources, yet are sensitive to only a small fraction of laboratory errors (Witte et al., 1997; Hawkins, 2012). Computer-aided algorithms have recently been developed for automatically detecting errors among the samples. Each clinical measurement is a vector of numbers, where each component represents the evaluated concentration of a specific substance, or analyte, in the patient's blood. Most automatic algorithms are based on the differences between the analytes of the current measurement of a given patient and those in a prior measurement of the same patient. Normally, these changes do not exceed certain upper limits unless there is an error in the reported analyte values. Such variation-based detection techniques are usually called delta checks in the clinical literature (Strathmann et al., 2011). Delta checks can be more complicated than simple thresholding. For instance, the magnitude of typical analyte variation usually changes across concentration ranges (Ricos et al., 2009). Therefore, composite thresholding is needed, where both the cut-off parameters and the values being thresholded differ across concentration ranges. Such approaches are not straightforward to analyze, as they do not fit the framework of standard statistical and learning analysis. For instance, most theoretical results on the computation and comparison of ROC curves assume that a one-dimensional continuous test is thresholded to label the samples (Pepe, 2003). In this paper, we present a novel composite delta check, designed for automatically detecting erroneous clinical measurements, and also present its sufficient statistics.
These statistics are obtained by building the discriminant functions of our decision rules, and can also be viewed as a dimensionality reduction of the feature vectors. Calculating scalar scores with such discriminant functions enables us to apply various statistical and learning tools to analyze the delta check, say, by building learning models or by statistical analysis of the data. Here, as simple examples, we use them to evaluate the performance of our detection algorithm by means of statistical analysis based on ROC curves. Under this analysis, we discuss the strength of each analyte in distinguishing the erroneous samples.

2 Notations

Throughout this paper, we use Ω to denote the set of analytes that are evaluated for each patient. The vector x = [x_1, ..., x_d] ∈ R^d (where d = |Ω|) contains the evaluations of the analytes in Ω for a given patient at a specific time. The differences between the analyte values in x and the prior and subsequent measurements of the same patient, within a 24-hour interval, are denoted by the variation vectors Δx = [Δx_1, ..., Δx_d] and ∇x = [∇x_1, ..., ∇x_d], respectively. Also assume that n denotes the number of samples available. Finally, Φ(·) is the CDF of the standard normal distribution N(0, 1), and 1(A) is an indicator function whose value is 1 if the expression A is true, and 0 otherwise.

Lab error detection problem: given a set of evaluated analytes stored in x, together with either one or both types of variations Δx and ∇x, use all or a subset of the analytes to
determine if there exists an erroneous measurement in x.

3 Decision Rules

In this section, we discuss forming the feature vectors and formulate our decision rules in two general cases: (1) a causal check using only Δx, and (2) a non-causal check using both Δx and ∇x. A decision rule is a mapping from the feature space to the binary space {0, 1}, where 1 represents the decision of labeling a sample as an error, and 0 otherwise.

3.1 Causal Delta Check

In order to take different concentrations into account, we divide [0, ∞), the range of all possible analyte values, into three intervals. For the analyte indexed by a (1 ≤ a ≤ d), these are [0, l_a], (l_a, u_a] and (u_a, ∞), and each interval is assigned a different threshold: β_{1,a} (an absolute value), β_{2,a} and β_{3,a} (percentages with respect to x_a), respectively. First, suppose the decision is to be made based on the individual analyte a. The feature vector is constructed as y_a = [x_a, Δx_a]. Then, our decision rule in this case, denoted by h_a, is

    h_a(y_a) = 1(|Δx_a| ≥ β_{1,a}),          if x_a ≤ l_a
               1(|Δx_a|/x_a ≥ β_{2,a}),      if l_a < x_a ≤ u_a        (1)
               1(|Δx_a|/x_a ≥ β_{3,a}),      if u_a < x_a

Now, if we consider all the analytes in Ω, we construct the feature vector by including all the values and variations:

    y = [x_1, Δx_1, x_2, Δx_2, ..., x_d, Δx_d],    with y_a = [x_a, Δx_a].

The sample will be labeled as an error if the variation of at least one of the analytes exceeds its threshold. Hence, we can formulate the decision rule as

    h(y) = max_{1 ≤ i ≤ d} { h_i(y_i) }.    (2)

3.2 Non-Causal Delta Check

Here, we use the same partitioning of the concentration ranges of the analytes, and the same thresholds in each division, as in the previous case. Again, let us first focus on a single analyte indexed by a. The feature vector will be constructed as y_a = [x_a, Δx_a, ∇x_a]. Then, we label the sample as an error if the variations in both directions, i.e. Δx_a and ∇x_a, violate the corresponding threshold. Such a decision
rule can be written as

    h_a(y_a) = 1(|Δx_a| ≥ β_{1,a}) · 1(|∇x_a| ≥ β_{1,a}),            if x_a ≤ l_a
               1(|Δx_a|/x_a ≥ β_{2,a}) · 1(|∇x_a|/x_a ≥ β_{2,a}),    if l_a < x_a ≤ u_a        (3)
               1(|Δx_a|/x_a ≥ β_{3,a}) · 1(|∇x_a|/x_a ≥ β_{3,a}),    if u_a < x_a

The feature vector when using all the analytes in Ω is constructed as

    y = [x_1, Δx_1, ∇x_1, ..., x_d, Δx_d, ∇x_d],    with y_a = [x_a, Δx_a, ∇x_a].

As before, when considering multiple analytes, we label the given sample as an error if there exists one analyte a ∈ Ω that has an erroneous measurement according to h_a(y_a). Hence, an equation similar to (2) holds for the non-causal delta check.

4 Data Analysis

In this section, we present the sufficient statistics by constructing the discriminant functions of our delta check. Evaluation of the discriminant functions gives us scalar scores that tend to be larger in the error class. The ROC curves of the decision rules can be empirically estimated by varying a threshold over the scores and each time classifying those exceeding the threshold as errors. Here, we also discuss estimation of the AUC values and their confidence intervals, as well as comparing the performance of single-analyte delta checks under different analytes, by means of a one-sided hypothesis test over the difference between their AUC values.

4.1 Sufficient Statistics

The idea in constructing the statistics is to relax the indicator functions in the decision rules that compare the (normalized) variation with the thresholds (see (1) or (3)). First, note that we can rewrite each of the decision rules in a single expression. For example, the single-analyte causal decision rule in (1) can be reformulated as

    h_a(y_a) = 1(x_a ≤ l_a) · 1(|Δx_a| ≥ β_{1,a})
             + 1(l_a < x_a ≤ u_a) · 1(|Δx_a|/x_a ≥ β_{2,a})
             + 1(u_a < x_a) · 1(|Δx_a|/x_a ≥ β_{3,a})    (4)
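As a concrete illustration, the composite rules (1)-(3) can be sketched in code. This is a minimal sketch, not the report's implementation: the cut-offs and thresholds below are hypothetical illustration values (in the report they come from biological-variation tables and instrument imprecision data, per analyte), `dx` is the prior-direction variation Δx_a, and `nx` the subsequent-direction variation ∇x_a.

```python
# Hypothetical per-analyte parameters, for illustration only.
L_A, U_A = 10.0, 100.0        # concentration range cut-offs l_a and u_a
BETA = (5.0, 0.20, 0.10)      # (absolute, fractional, fractional) thresholds

def causal_check(x, dx, l=L_A, u=U_A, beta=BETA):
    """Single-analyte causal rule of Eq. (1): return 1 iff the prior
    variation dx violates the threshold of x's concentration range."""
    b1, b2, b3 = beta
    if x <= l:
        return int(abs(dx) >= b1)        # low range: absolute threshold
    if x <= u:
        return int(abs(dx) / x >= b2)    # mid range: percentage of x
    return int(abs(dx) / x >= b3)        # high range: percentage of x

def noncausal_check(x, dx, nx, l=L_A, u=U_A, beta=BETA):
    """Eq. (3): flag an error only if BOTH directions (prior dx and
    subsequent nx) violate the corresponding threshold."""
    return causal_check(x, dx, l, u, beta) * causal_check(x, nx, l, u, beta)

def multi_analyte_check(per_analyte_decisions):
    """Eq. (2): flag the sample if any analyte flags it (the max rule)."""
    return max(per_analyte_decisions)
```

For instance, with these illustrative thresholds a mid-range value x = 50 with a prior jump of 20 (40%) is flagged by the causal rule, but the non-causal rule clears it when the subsequent variation is small.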
By relaxing the second indicator function in each term of (4), we get the following discriminant score (function):

    s_a(y_a) = 1(x_a ≤ l_a) · (|Δx_a| − β_{1,a})
             + 1(l_a < x_a ≤ u_a) · (|Δx_a|/x_a − β_{2,a})
             + 1(u_a < x_a) · (|Δx_a|/x_a − β_{3,a})    (5)

Note that this scalar tends to be larger (more positive) when y_a is actually an erroneous sample. Our decision is obtained by thresholding the score values at zero. The discriminant score for the causal multiple-analyte case is constructed by replacing the decision rules h_i in equation (2) by the scalar transformations s_i:

    s(y) = max_{1 ≤ i ≤ d} { s_i(y_i) }.    (6)

Construction of the discriminant function is more complicated in the case of non-causal delta checks. Note that relaxing the indicator functions of (3) directly is troublesome: even if the sample is correct, so that the differences between the (normalized) variations and the thresholds tend to be negative in both directions, their multiplication results in a positive value. In order to resolve this issue, we reformulate the rule by replacing the multiplication of the indicators by their minimum:

    h_a(y_a) = min{ 1(|Δx_a| ≥ β_{1,a}), 1(|∇x_a| ≥ β_{1,a}) },            if x_a ≤ l_a
               min{ 1(|Δx_a|/x_a ≥ β_{2,a}), 1(|∇x_a|/x_a ≥ β_{2,a}) },    if l_a < x_a ≤ u_a        (7)
               min{ 1(|Δx_a|/x_a ≥ β_{3,a}), 1(|∇x_a|/x_a ≥ β_{3,a}) },    if u_a < x_a

Relaxing the inner indicators then gives

    s_a(y_a) = 1(x_a ≤ l_a) · min{ |Δx_a| − β_{1,a}, |∇x_a| − β_{1,a} }
             + 1(l_a < x_a ≤ u_a) · min{ |Δx_a|/x_a − β_{2,a}, |∇x_a|/x_a − β_{2,a} }
             + 1(u_a < x_a) · min{ |Δx_a|/x_a − β_{3,a}, |∇x_a|/x_a − β_{3,a} }    (8)

Finally, a maximization similar to equation (6) suffices to get s(y) for the multiple-analyte case.

4.2 Estimating the ROC Curves

After embedding the feature vectors into a scalar space, the ROC curves can be easily estimated empirically by varying the threshold over the transformed scalar values and computing
the false positive rate (FPR) and true positive rate (TPR) in each case.

[Figure 1: ROC curves (TPR vs. FPR), together with 90%, 95% and 99% confidence regions, the mean curve, and the y = x reference line, for (a) the causal single-analyte delta checks, (b) the causal multiple-analyte check, and (c) the non-causal multiple-analyte check.]

In order to compute the (1 − α) confidence region of a given operating point (FPR, TPR), we consider the number of false positives (FP = n · FPR) and true positives (TP = n · TPR) as two independent binomial random variables with success probabilities FPR and TPR, respectively, and n as the number of trials. The confidence intervals of FPR and TPR can then be computed accordingly (Johnson et al., 2005). Let I_{1,α′} and I_{2,α′} be the (1 − α′) confidence intervals of FPR and TPR, respectively, where α′ is chosen so that (1 − α′)² = 1 − α; then, because of the independence assumption, the rectangle I_{1,α′} × I_{2,α′} is a (1 − α) confidence region for the pair (FPR, TPR). We take the union of such rectangles obtained for all the operating points as the (1 − α) confidence region of our empirical ROC.

4.3 Estimating the AUC

In the formulations of this section, we focus on the non-causal multiple-analyte case only, but the same can be done for all the other cases as well. The AUC value can be estimated either by computing the area under the empirical ROC curve, or by using the survivor functions under the error and non-error classes, which are the same as the probabilities of detection and false
alarm, respectively: for all y ∈ R,

    error:      S_e(y) = Pr(s(y) ≥ y | y is an error),      (9a)
    non-error:  S_n(y) = Pr(s(y) ≥ y | y is not an error).  (9b)

[Figure 2: AUC values, together with their 90%, 95% and 99% confidence intervals, for the single- and multiple-analyte delta checks in the (a) causal and (c) non-causal modes; the p-values of the pairwise hypothesis tests described in Section 4.4, as matrices over the analytes (BUN, creatinine, Na, K, Cl, CO2, Ca, P, albumin), in the (b) causal and (d) non-causal modes.]

These functions can be approximated empirically or using kernel density estimation (KDE). We denote the given approximations by Ŝ_e and Ŝ_n. Let us denote the vector of scalar scores by s = [s_1, ..., s_n], where s_i = s(y_i). Without loss of generality, suppose that the transformed values in s are sorted such that the first k scalars are errors and the rest are sound measurements. Then, the AUC and its variance can be approximated as below (DeLong
et al., 1988):

    AUC ≈ 1 − (1/k) Σ_{i=1}^{k} Ŝ_n(s_i),    (10a)

    Var[AUC] ≈ (1/k) Var{ Ŝ_n(s_i) : i = 1, ..., k } + (1/(n − k)) Var{ Ŝ_e(s_i) : i = k+1, ..., n }    (10b)

The confidence interval of the AUC can also be approximated by assuming that its logit transformation, i.e. log(AUC / (1 − AUC)), is distributed normally with the variance calculated in (10b) (Pepe, 2003).

4.4 Comparing the ROC Curves

For two given analytes a, b ∈ Ω, denote the vectors of single-analyte sorted scores in the non-causal case by s_a and s_b, respectively (the formulation for the causal case is exactly the same). In order to compare the performance of the delta checks using these analytes, we consider the difference between their AUC values, ΔAUC_{a,b} = AUC_a − AUC_b, where AUC_a and AUC_b are estimated based on s_a and s_b, respectively. However, these two AUC values are not independent, as s_a and s_b are computed from the same data. Therefore, the variance of ΔAUC_{a,b} is not additive. It can be shown that this variance can be approximated as below (DeLong et al., 1988):

    Var[ΔAUC_{a,b}] ≈ (1/k) Var{ Ŝ_n(s_{a,i}) − Ŝ_n(s_{b,i}) : i = 1, ..., k }
                    + (1/(n − k)) Var{ Ŝ_e(s_{a,i}) − Ŝ_e(s_{b,i}) : i = k+1, ..., n }    (11)

Then, assuming that ΔAUC_{a,b} ~ N(μ, Var[ΔAUC_{a,b}]), the following one-sided hypothesis test is performed:

    H_0: μ ≤ 0        H_1: μ > 0

So H_0 is the hypothesis that the AUC obtained by using analyte a is no better than that obtained by using analyte b. The AUC difference tends to be larger under H_1 than under H_0; therefore, it is natural to reject H_0 when ΔAUC_{a,b} is large. We take the test statistic T = ΔAUC_{a,b} / sqrt(Var[ΔAUC_{a,b}]) and reject H_0 when T ≥ c, for some critical value c. The power function of this test is Φ(μ / sqrt(Var[ΔAUC_{a,b}]) − c), which gives a test size of 1 − Φ(c). We are interested in the p-value of the test, which can be shown to be 1 − Φ(ΔAUC_{a,b} / sqrt(Var[ΔAUC_{a,b}])). By definition, the smaller the p-value of a certain observed AUC difference, the stronger the evidence we have to reject H_0, i.e. the less likely it is that using analyte a is equal to or worse than using analyte b.
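The AUC estimate, its variance, and the one-sided p-value above can be sketched as follows. This is a minimal sketch under simplifying assumptions: it uses empirical survivor functions rather than the KDE of the report, ignores ties except through the fixed "≥" convention, and the sample scores in the usage line are invented for illustration.

```python
from math import erf, sqrt

def survivor(sample, y):
    """Empirical survivor function: fraction of scores in `sample` >= y."""
    return sum(s >= y for s in sample) / len(sample)

def auc_and_var(err_scores, ok_scores):
    """AUC and a DeLong-style variance in the spirit of Eq. (10), from the
    discriminant scores of the k errors and the n-k sound measurements."""
    k, m = len(err_scores), len(ok_scores)
    Sn = [survivor(ok_scores, s) for s in err_scores]   # S_n at error scores
    Se = [survivor(err_scores, s) for s in ok_scores]   # S_e at non-error scores
    auc = 1.0 - sum(Sn) / k        # = Pr(error score > non-error score)
    def svar(xs):                  # sample variance
        mu = sum(xs) / len(xs)
        return sum((x - mu) ** 2 for x in xs) / (len(xs) - 1)
    return auc, svar(Sn) / k + svar(Se) / m

def one_sided_pvalue(delta_auc, var_delta):
    """p-value 1 - Phi(dAUC / sqrt(Var)) of the test in Section 4.4,
    using the standard-normal CDF via math.erf."""
    z = delta_auc / sqrt(var_delta)
    return 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))
```

For example, `auc_and_var([1.0, 3.0], [0.0, 2.0])` scores three of the four error/non-error pairs correctly and returns an AUC of 0.75.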
5 Experimental Results

5.1 Experiment Settings

The data that we evaluated consisted of laboratory test values obtained from hospital inpatients with renal failure, 18 years of age or older, seen at Oregon Health & Science University during a one-year period from October through the following September. The set of analytes, Ω, that we considered in constructing the sample vectors comprises urea nitrogen (BUN), creatinine, sodium (Na), potassium (K), chloride (Cl), total carbon dioxide (CO2), calcium (Ca), phosphorus (P) and albumin. Serial data from samples that were noted by the laboratory to be hemolyzed, and which could have significantly affected the measured values of certain analytes (i.e., potassium), were excluded from our evaluation. We queried 805 samples, consisting of a mixture of randomly selected samples and low-likelihood samples under a GMM, to get their labels from the clinical experts, with 64 (7.95%) of the samples showing an error. Among them, we got 436 samples with at least one prior measurement (with 37 errors), and 254 samples with both prior and subsequent measurements (with 26 errors). Therefore, n is different for our causal and non-causal training data sets.

The parameters β_{1,a}, β_{2,a} and β_{3,a} are determined based on the physiological variation of analyte a within the individual, and on machine imprecision. The former is specified based on recent literature (Ricos et al., 2009), and the latter based on quality control data obtained from the instrument used to measure the analytes. The concentration ranges of the analytes are also provided by the clinicians. Notice that these parameters can also be learned, for example, by cross-validation over the obtained labeled data. Furthermore, the survivor functions in (9) are computed using KDE based on Gaussian kernels with empirically fixed kernel widths.

5.2 Statistics

The ROC curves obtained for the single- and multiple-analyte decision rules are shown in Figure 1.
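The curves and confidence regions in Figure 1 follow the recipe of Section 4.2. A minimal sketch of that computation, using a normal-approximation binomial interval in place of the exact intervals of Johnson et al. (2005) and per-class trial counts rather than the report's pooled n:

```python
from math import sqrt

def roc_points(err_scores, ok_scores):
    """Empirical ROC: sweep the decision threshold over all observed
    scores and record the (FPR, TPR) operating point at each setting."""
    points = []
    for c in sorted(set(err_scores) | set(ok_scores), reverse=True):
        tpr = sum(s >= c for s in err_scores) / len(err_scores)
        fpr = sum(s >= c for s in ok_scores) / len(ok_scores)
        points.append((fpr, tpr))
    return points

def rate_interval(p, n, z):
    """Normal-approximation confidence interval for a binomial rate p
    observed over n trials, clipped to [0, 1]."""
    half = z * sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

def confidence_rectangles(points, n_err, n_ok, z=1.96):
    """One rectangle I1 x I2 per operating point; the union of the
    rectangles approximates the confidence region of the empirical ROC."""
    return [(rate_interval(fpr, n_ok, z), rate_interval(tpr, n_err, z))
            for fpr, tpr in points]
```

Sweeping from the largest score downward traces the curve from (0, 0)-side operating points toward (1, 1), and each rectangle widens as the per-class sample counts shrink.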
Because of lack of space, and because the curves of the single-analyte delta checks were very similar in the causal and non-causal modes, we show only the former. From Figure 1(a), BUN had the worst performance, while potassium and calcium were among the best, in terms of their average ROC curves (the magenta curves); the other analytes fell somewhere in between. The AUC values of these ROC curves, together with their confidence intervals, are shown in Figure 2(a,c). We can observe that the relative performance of the single analytes in terms of AUC is similar in both causal and non-causal modes, with longer confidence intervals for the latter; this is because n is smaller in the non-causal case. The resulting p-values of our pairwise hypothesis tests are also shown in Figure 2(b,d). For each case, a matrix is displayed whose (a, b) entry (1 ≤ a, b ≤ d) represents the p-value of the hypothesis test over ΔAUC_{a,b}. Therefore, darkness of such an entry implies that our data provide strong evidence against the null hypothesis that analyte a does not outperform analyte b. Observe that the darkest rows of each matrix are those associated with potassium and calcium; that is, in the comparison between these analytes and the rest, it is highly probable that they are not worse than the others. In contrast, the rows corresponding to BUN and creatinine are among the brightest. These are in accordance with our observations on
the ROC curves and the AUC values.

Figures 1(b,c) illustrate the ROC curves of the multiple-analyte decision rules. They showed significantly better performance than the single-analyte checks. This was expected, as they use the information from all the analytes. Also observe that, not surprisingly, the non-causal mode outperforms the causal mode. Recall that in the former we use the variations in both directions, whereas in the latter we are restricted to a subset of this knowledge by focusing on only the prior measurement. The AUC values of these multiple-analyte checks are also shown in the last rows of Figures 2(a,c), which show that their mean AUCs (the red dots) are larger than all the single-analyte AUCs.

6 Conclusion

In this paper, we proposed a novel concentration-dependent delta check algorithm and provided its sufficient statistics by constructing discriminant functions based on the decision rules of our algorithm. Computing the scores with the discriminant functions enabled us to perform various statistical analyses, such as empirically estimating the ROC curves and the AUC values, together with their confidence regions. The performance of our proposed delta check under various single analytes was also compared in a pairwise manner based on the differences between their AUC values. In future work, we will incorporate correlations between the variations of the analytes when detecting lab errors, develop a soft probabilistic classifier for the multiple-analyte delta check (rather than a hard-max classifier), and devise an active learning framework based on the proposed discriminant function to efficiently query samples from the clinical experts.

References

DeLong, E. R., DeLong, D. M., and Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics, 44(3):837-845.

Hawkins, R. (2012). Managing the pre- and post-analytical phases of the total testing process.
Annals of Laboratory Medicine, 32(1):5-16.

Johnson, N. L., Kemp, A. W., and Kotz, S. (2005). Univariate Discrete Distributions, volume 444. John Wiley & Sons.

Pepe, M. S. (2003). The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press.

Ricos, C., Alvarez, V., and Cava, F. (2009). Biologic variation and desirable specifications for QC.

Strathmann, F. G., Baird, G. S., and Hoffman, N. G. (2011). Simulations of delta check rule performance to detect specimen mislabeling using historical laboratory data. Clinica Chimica Acta, 412(21-22).
Witte, D. L., VanNess, S. A., Angstadt, D. S., and Pennell, B. J. (1997). Errors, mistakes, blunders, outliers, or unacceptable results: how many? Clinical Chemistry, 43(8).
More informationLearning Methods for Linear Detectors
Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2011/2012 Lesson 20 27 April 2012 Contents Learning Methods for Linear Detectors Learning Linear Detectors...2
More informationIntroduction to Machine Learning
Introduction to Machine Learning Vapnik Chervonenkis Theory Barnabás Póczos Empirical Risk and True Risk 2 Empirical Risk Shorthand: True risk of f (deterministic): Bayes risk: Let us use the empirical
More informationSupport Vector Machine. Industrial AI Lab. Prof. Seungchul Lee
Support Vector Machine Industrial AI Lab. Prof. Seungchul Lee Classification (Linear) Autonomously figure out which category (or class) an unknown item should be categorized into Number of categories /
More informationTopic 3: Hypothesis Testing
CS 8850: Advanced Machine Learning Fall 07 Topic 3: Hypothesis Testing Instructor: Daniel L. Pimentel-Alarcón c Copyright 07 3. Introduction One of the simplest inference problems is that of deciding between
More informationComputational paradigms for the measurement signals processing. Metodologies for the development of classification algorithms.
Computational paradigms for the measurement signals processing. Metodologies for the development of classification algorithms. January 5, 25 Outline Methodologies for the development of classification
More informationLecture Slides for INTRODUCTION TO. Machine Learning. ETHEM ALPAYDIN The MIT Press,
Lecture Slides for INTRODUCTION TO Machine Learning ETHEM ALPAYDIN The MIT Press, 2004 alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml CHAPTER 14: Assessing and Comparing Classification Algorithms
More informationBayesian Decision Theory
Bayesian Decision Theory Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2017 CS 551, Fall 2017 c 2017, Selim Aksoy (Bilkent University) 1 / 46 Bayesian
More informationClass 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700
Class 4 Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science Copyright 013 by D.B. Rowe 1 Agenda: Recap Chapter 9. and 9.3 Lecture Chapter 10.1-10.3 Review Exam 6 Problem Solving
More informationMark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.
CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.
More informationSection 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples
Objective Section 9.4 Inferences About Two Means (Matched Pairs) Compare of two matched-paired means using two samples from each population. Hypothesis Tests and Confidence Intervals of two dependent means
More informationAn Introduction to Machine Learning
An Introduction to Machine Learning L2: Instance Based Estimation Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia Alex.Smola@nicta.com.au Tata Institute, Pune, January
More informationUnsupervised Learning Methods
Structural Health Monitoring Using Statistical Pattern Recognition Unsupervised Learning Methods Keith Worden and Graeme Manson Presented by Keith Worden The Structural Health Monitoring Process 1. Operational
More informationInverse Sampling for McNemar s Test
International Journal of Statistics and Probability; Vol. 6, No. 1; January 27 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Inverse Sampling for McNemar s Test
More informationFinal Overview. Introduction to ML. Marek Petrik 4/25/2017
Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,
More informationLecture 4 Discriminant Analysis, k-nearest Neighbors
Lecture 4 Discriminant Analysis, k-nearest Neighbors Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University. Email: fredrik.lindsten@it.uu.se fredrik.lindsten@it.uu.se
More informationIntelligent Systems Statistical Machine Learning
Intelligent Systems Statistical Machine Learning Carsten Rother, Dmitrij Schlesinger WS2014/2015, Our tasks (recap) The model: two variables are usually present: - the first one is typically discrete k
More informationLeast Squares Classification
Least Squares Classification Stephen Boyd EE103 Stanford University November 4, 2017 Outline Classification Least squares classification Multi-class classifiers Classification 2 Classification data fitting
More informationClass 4: Classification. Quaid Morris February 11 th, 2011 ML4Bio
Class 4: Classification Quaid Morris February 11 th, 211 ML4Bio Overview Basic concepts in classification: overfitting, cross-validation, evaluation. Linear Discriminant Analysis and Quadratic Discriminant
More informationHypothesis Evaluation
Hypothesis Evaluation Machine Learning Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Hypothesis Evaluation Fall 1395 1 / 31 Table of contents 1 Introduction
More informationMultivariate statistical methods and data mining in particle physics
Multivariate statistical methods and data mining in particle physics RHUL Physics www.pp.rhul.ac.uk/~cowan Academic Training Lectures CERN 16 19 June, 2008 1 Outline Statement of the problem Some general
More informationTwo-stage Adaptive Randomization for Delayed Response in Clinical Trials
Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal
More informationProbabilistic Machine Learning. Industrial AI Lab.
Probabilistic Machine Learning Industrial AI Lab. Probabilistic Linear Regression Outline Probabilistic Classification Probabilistic Clustering Probabilistic Dimension Reduction 2 Probabilistic Linear
More informationAn Overview of Outlier Detection Techniques and Applications
Machine Learning Rhein-Neckar Meetup An Overview of Outlier Detection Techniques and Applications Ying Gu connygy@gmail.com 28.02.2016 Anomaly/Outlier Detection What are anomalies/outliers? The set of
More informationHST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007
MIT OpenCourseWare http://ocw.mit.edu HST.582J / 6.555J / 16.456J Biomedical Signal and Image Processing Spring 2007 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
More informationECE521: Inference Algorithms and Machine Learning University of Toronto. Assignment 1: k-nn and Linear Regression
ECE521: Inference Algorithms and Machine Learning University of Toronto Assignment 1: k-nn and Linear Regression TA: Use Piazza for Q&A Due date: Feb 7 midnight, 2017 Electronic submission to: ece521ta@gmailcom
More informationMULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level
More informationEmpirical Evaluation (Ch 5)
Empirical Evaluation (Ch 5) how accurate is a hypothesis/model/dec.tree? given 2 hypotheses, which is better? accuracy on training set is biased error: error train (h) = #misclassifications/ S train error
More informationIntelligent Systems Statistical Machine Learning
Intelligent Systems Statistical Machine Learning Carsten Rother, Dmitrij Schlesinger WS2015/2016, Our model and tasks The model: two variables are usually present: - the first one is typically discrete
More informationNeed for Sampling in Machine Learning. Sargur Srihari
Need for Sampling in Machine Learning Sargur srihari@cedar.buffalo.edu 1 Rationale for Sampling 1. ML methods model data with probability distributions E.g., p(x,y; θ) 2. Models are used to answer queries,
More informationAnomaly Detection for the CERN Large Hadron Collider injection magnets
Anomaly Detection for the CERN Large Hadron Collider injection magnets Armin Halilovic KU Leuven - Department of Computer Science In cooperation with CERN 2018-07-27 0 Outline 1 Context 2 Data 3 Preprocessing
More information> DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE GRAVIS 2016 BASEL. Logistic Regression. Pattern Recognition 2016 Sandro Schönborn University of Basel
Logistic Regression Pattern Recognition 2016 Sandro Schönborn University of Basel Two Worlds: Probabilistic & Algorithmic We have seen two conceptual approaches to classification: data class density estimation
More informationCS 543 Page 1 John E. Boon, Jr.
CS 543 Machine Learning Spring 2010 Lecture 05 Evaluating Hypotheses I. Overview A. Given observed accuracy of a hypothesis over a limited sample of data, how well does this estimate its accuracy over
More informationEstimating the accuracy of a hypothesis Setting. Assume a binary classification setting
Estimating the accuracy of a hypothesis Setting Assume a binary classification setting Assume input/output pairs (x, y) are sampled from an unknown probability distribution D = p(x, y) Train a binary classifier
More informationLecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary
ECE 830 Spring 207 Instructor: R. Willett Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we saw that the likelihood
More informationNonparametric predictive inference with parametric copulas for combining bivariate diagnostic tests
Nonparametric predictive inference with parametric copulas for combining bivariate diagnostic tests Noryanti Muhammad, Universiti Malaysia Pahang, Malaysia, noryanti@ump.edu.my Tahani Coolen-Maturi, Durham
More informationRejection regions for the bivariate case
Rejection regions for the bivariate case The rejection region for the T 2 test (and similarly for Z 2 when Σ is known) is the region outside of an ellipse, for which there is a (1-α)% chance that the test
More information2. What are the tradeoffs among different measures of error (e.g. probability of false alarm, probability of miss, etc.)?
ECE 830 / CS 76 Spring 06 Instructors: R. Willett & R. Nowak Lecture 3: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we
More informationMachine Learning and Data Mining. Bayes Classifiers. Prof. Alexander Ihler
+ Machine Learning and Data Mining Bayes Classifiers Prof. Alexander Ihler A basic classifier Training data D={x (i),y (i) }, Classifier f(x ; D) Discrete feature vector x f(x ; D) is a con@ngency table
More informationMining Classification Knowledge
Mining Classification Knowledge Remarks on NonSymbolic Methods JERZY STEFANOWSKI Institute of Computing Sciences, Poznań University of Technology SE lecture revision 2013 Outline 1. Bayesian classification
More informationEEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1
EEL 851: Biometrics An Overview of Statistical Pattern Recognition EEL 851 1 Outline Introduction Pattern Feature Noise Example Problem Analysis Segmentation Feature Extraction Classification Design Cycle
More informationDetection theory 101 ELEC-E5410 Signal Processing for Communications
Detection theory 101 ELEC-E5410 Signal Processing for Communications Binary hypothesis testing Null hypothesis H 0 : e.g. noise only Alternative hypothesis H 1 : signal + noise p(x;h 0 ) γ p(x;h 1 ) Trade-off
More informationThree-group ROC predictive analysis for ordinal outcomes
Three-group ROC predictive analysis for ordinal outcomes Tahani Coolen-Maturi Durham University Business School Durham University, UK tahani.maturi@durham.ac.uk June 26, 2016 Abstract Measuring the accuracy
More informationLinear and Logistic Regression. Dr. Xiaowei Huang
Linear and Logistic Regression Dr. Xiaowei Huang https://cgi.csc.liv.ac.uk/~xiaowei/ Up to now, Two Classical Machine Learning Algorithms Decision tree learning K-nearest neighbor Model Evaluation Metrics
More informationPermutation Tests. Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods
Permutation Tests Noa Haas Statistics M.Sc. Seminar, Spring 2017 Bootstrap and Resampling Methods The Two-Sample Problem We observe two independent random samples: F z = z 1, z 2,, z n independently of
More informationClassification Ensemble That Maximizes the Area Under Receiver Operating Characteristic Curve (AUC)
Classification Ensemble That Maximizes the Area Under Receiver Operating Characteristic Curve (AUC) Eunsik Park 1 and Y-c Ivan Chang 2 1 Chonnam National University, Gwangju, Korea 2 Academia Sinica, Taipei,
More informationChapter IR:VIII. VIII. Evaluation. Laboratory Experiments Performance Measures Training and Testing Logging
Chapter IR:VIII VIII. Evaluation Laboratory Experiments Performance Measures Logging IR:VIII-62 Evaluation HAGEN/POTTHAST/STEIN 2018 Statistical Hypothesis Testing Claim: System 1 is better than System
More informationNon-Inferiority Tests for the Ratio of Two Proportions in a Cluster- Randomized Design
Chapter 236 Non-Inferiority Tests for the Ratio of Two Proportions in a Cluster- Randomized Design Introduction This module provides power analysis and sample size calculation for non-inferiority tests
More informationTHE SKILL PLOT: A GRAPHICAL TECHNIQUE FOR EVALUATING CONTINUOUS DIAGNOSTIC TESTS
THE SKILL PLOT: A GRAPHICAL TECHNIQUE FOR EVALUATING CONTINUOUS DIAGNOSTIC TESTS William M. Briggs General Internal Medicine, Weill Cornell Medical College 525 E. 68th, Box 46, New York, NY 10021 email:
More informationPubh 8482: Sequential Analysis
Pubh 8482: Sequential Analysis Joseph S. Koopmeiners Division of Biostatistics University of Minnesota Week 10 Class Summary Last time... We began our discussion of adaptive clinical trials Specifically,
More informationFinal Exam, Machine Learning, Spring 2009
Name: Andrew ID: Final Exam, 10701 Machine Learning, Spring 2009 - The exam is open-book, open-notes, no electronics other than calculators. - The maximum possible score on this exam is 100. You have 3
More informationMining Classification Knowledge
Mining Classification Knowledge Remarks on NonSymbolic Methods JERZY STEFANOWSKI Institute of Computing Sciences, Poznań University of Technology COST Doctoral School, Troina 2008 Outline 1. Bayesian classification
More informationFeature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size
Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size Berkman Sahiner, a) Heang-Ping Chan, Nicholas Petrick, Robert F. Wagner, b) and Lubomir Hadjiiski
More informationMachine Learning Concepts in Chemoinformatics
Machine Learning Concepts in Chemoinformatics Martin Vogt B-IT Life Science Informatics Rheinische Friedrich-Wilhelms-Universität Bonn BigChem Winter School 2017 25. October Data Mining in Chemoinformatics
More information6.873/HST.951 Medical Decision Support Spring 2004 Evaluation
Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support, Fall 2005 Instructors: Professor Lucila Ohno-Machado and Professor Staal Vinterbo 6.873/HST.951 Medical Decision
More information