Class 4: Classification. Quaid Morris, February 11th, 2011. ML4Bio
1 Class 4: Classification. Quaid Morris, February 11th, 2011. ML4Bio
2 Overview Basic concepts in classification: overfitting, cross-validation, evaluation. Linear Discriminant Analysis and Quadratic Discriminant Analysis. Logistic Regression. (Next class, probably): other classification algorithms (Decision Trees, Support Vector Machines, Multilayer Perceptrons, Ensemble classifiers)
3 Problem Set #2 1) Recall that the Wilcoxon sign rank statistic depends on W+, the sum of the ranks of the positive values of X_i, and W-, the sum of the ranks of the negative values of X_i. We talked about two test statistics: S = W+ - W- and min(W-, W+). These test statistics have different null distributions but, given the same data, they can be used to generate the same P-values. Is the same true for W-? Why or why not?
4 Problem Set #2 2) The Wilcoxon sign test tests whether or not the distribution of X_i has a median of 0 by comparing the number of positive values of X_i to the number of negative values. Intuitively, there should be about an equal number of each. What is the null distribution of the number of positive values of X_i in N samples? 2b) What's the P-value associated with observing 4 positive values of X_i out of 10? How about 2 out of 20?
5 Problem Set #2 3) The P-value is not the probability of the incorrect rejection of the null hypothesis. Can you explain why? 5) Can the FDR correction ever lead to fewer rejections than the FWER correction? If not, then why not? If so, please give an example.
6 Classification example I: Predicting gene function from expression profiles. Microarray profiles are relatively easily measured and reflect function. Zhang et al (J Biol 2004)
7 Pattern detection RNA splicing + RNA splicing What distinguishes these two sets of profiles?
8 Classification example II: Classifying cancer from cellular profile. Microarray profiles can be used to subcategorize cancer (leukemia). Normalized expression. Golub et al (Science 1999)
9 Classification in a nutshell. Input X (features, aka covariates), e.g. a microarray profile. Parameters Θ (aka coefficients, weights). Classification algorithm, e.g. neural network, SVM, KNN. Output (aka discriminant value, confidence) Y, compared against a threshold η. Goal: Find parameters that make outputs predictive of targets on a training set of matched inputs and labeled target values
10 Formal definition Given: 1. a training set {(X_1, t_1), (X_2, t_2), ..., (X_N, t_N)} of matched inputs X_i and target labels t_i (t_i = 0 or 1) 2. a classification procedure represented by a discriminant function f(x; Θ) and a threshold η, so that I[f(X; Θ) > η] is the predicted label given input X. Goal: Set Θ to maximize the agreement between the predicted target labels and actual target labels on the training set. I[H] is a function that has value 1 if the statement H is true, otherwise it has value 0.
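The formal setup above can be sketched in a few lines of Python. This is a minimal illustration with made-up data and a hypothetical linear discriminant f(x; Θ) = Θx; the lecture does not specify a particular discriminant function, so both the data and f are assumptions:

```python
# Sketch of the formal definition: predicted label is I[f(x; theta) > eta],
# and training agreement is the fraction of matched (X_i, t_i) pairs
# whose predicted label equals the actual label.

def indicator(h):
    """I[H]: 1 if the statement H is true, otherwise 0."""
    return 1 if h else 0

def predict(x, theta, eta):
    """Predicted label I[f(x; theta) > eta], with f(x; theta) = theta * x (assumed)."""
    return indicator(theta * x > eta)

def training_agreement(xs, ts, theta, eta):
    """Fraction of training examples whose predicted label matches t_i."""
    correct = sum(indicator(predict(x, theta, eta) == t) for x, t in zip(xs, ts))
    return correct / len(xs)

xs = [-2.0, -1.0, 1.0, 2.0]   # illustrative 1-D inputs
ts = [0, 0, 1, 1]             # illustrative target labels
print(training_agreement(xs, ts, theta=1.0, eta=0.0))  # 1.0
```

Setting Θ then means searching for the value of theta (and possibly η) that maximizes this agreement on the training set.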
11 Important concepts Training and test sets Uncertainty about classification Overfitting Cross-validation (leave-one-out)
12 Put yourself in the machine s shoes Feature1 Feature2 Feature3 Feature4 Feature5 Expression level during heat shock Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Which uncharacterized genes are involved in tRNA processing?
13 Training Positives Negatives Known genes
14 Training Positives Negatives What pattern distinguishes the positives and the negatives?
15 Training Positives Negatives: 4 green features; features 1, 3, and 5 are green; features 1 and 3 are green and feature 2 is red; features 1 and 3 are green
16 Training Positives Negatives: features 1, 3, and 5 are green; features 1 and 3 are green and feature 2 is red; features 1 and 3 are green. Known genes
17 Training Positives Negatives: features 1 and 3 are green and feature 2 is red; features 1 and 3 are green. Known genes
18 Training Positives Negatives features 1 and 3 are green Known genes
19 Training Positives Negatives features 1 and 3 are green Known genes
20 Prediction Unknowns Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Which genes are involved in tRNA processing?
21 Prediction Feature1 Feature3 Features 1 and 3 green? Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Yes Yes Yes Which genes are involved in tRNA processing?
22 Prediction Feature1 Feature3 Features 1 and 3 green? Prediction: Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Yes Yes Yes Involved, Involved, Not Involved, Involved, Not Involved, Not Involved. Which genes are involved in tRNA processing?
23 Experimental validation Prediction: Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Involved, Involved, Not Involved, Involved, Not Involved, Not Involved
24 Experimental validation Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Prediction: Involved, Involved, Not Involved, Involved, Not Involved, Not Involved. Assay: All predictions are correct!
25 Sparse annotation Positives Negatives What pattern distinguishes the positives and the negatives?
26 Multiple lines separate the two classes (figure: two classes in the x_1, x_2 plane)
27 Training under sparse annotation Positives Negatives: 4 green features; features 1 and 3 are green. What pattern distinguishes the positives and the negatives?
28 Prediction under sparse annotation Feature1 Feature3 Four green features? Features 1 and 3 green? Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Yes Yes Yes Yes Yes Which genes are involved in tRNA processing?
29 Prediction under sparse annotation Feature1 Feature3 Four green features? Features 1 and 3 green? Confidence: Gene1 Yes Yes 1.0; Gene2 Yes 0.5; Gene3 0; Gene4 Yes 0.5; Gene5 Yes 0.5; Gene6 0. Legend: 1.0 definitely involved; 0.5 may be involved; 0 definitely not involved
30 Prediction under sparse annotation Feature1 Feature3 Four green features? Features 1 and 3 green? Confidence: Gene1 Yes Yes 1.0; Gene2 Yes 0.5; Gene3 0; Gene4 Yes 0.5; Gene5 Yes 0.5; Gene6 0. Prediction: Gene1, and probably Genes 2, 4, and 5, are involved in tRNA processing.
31 Experimental validation (figure: confidence values for Gene1 through Gene6)
32 Experimental validation (figure: confidence values and experimental labels for Gene1 through Gene6)
33 Experimental validation (figure). One correct confidence-1.0 prediction
34 Experimental validation (figure). Two out of three confidence-0.5 predictions correct.
35 Validation results (table: # true positives and # false positives at each confidence cutoff; figure: confidence per gene)
36 Noisy features Positives Negatives Incorrect measurement, should be green.
37 Noisy features Positives Negatives What distinguishes the positives and the negatives?
38 Noisy features + sparse data = overfitting Positives Negatives What distinguishes the positives and the negatives?
39 Training Positives Negatives 4 green features
40 Prediction Four green features? Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Yes Yes Which genes are involved in tRNA processing?
41 Prediction Four green features? Confidence Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Yes Yes Prediction: Genes 1 and 5 are involved in tRNA processing.
42 Experimental validation Four green features? Confidence Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Yes Yes 1.0 1.0
43 Experimental validation Four green features? Confidence Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Yes Yes One incorrect high confidence prediction, i.e., one false positive
44 Experimental validation Four green features? Confidence Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Yes Yes Two genes missed completely, i.e., two false negatives
45 Experimental validation Four green features? Confidence Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Yes Yes One incorrect high confidence prediction, two genes missed completely
46 Validation results Confidence Cutoff # True Positives # False Positives Gene1 Gene2 Gene3 Gene4 Gene5 Gene6 Confidence 1.0 1.0
47 What have we learned? Sparse data: many different patterns distinguish positives and negatives.
48 What have we learned? Sparse data: many different patterns distinguish positives and negatives. Noisy features: the actual distinguishing pattern may not be observable
49 What have we learned? Sparse data: many different patterns distinguish positives and negatives. Noisy features: the actual distinguishing pattern may not be observable. Sparse data + noisy features: may detect, and be highly confident in, spurious, incorrect patterns. Overfitting
50 Overfitting For a given training / test set (figure: classification error vs. (effective) # of parameters, aka complexity, aka VC dimension, showing generalization (test set) error and training set error)
51 Validation Different algorithms assign confidence to their predictions differently Need to 1. Determine meaning of each algorithm s confidence score. 2. Determine what level of confidence is warranted by the data
52 Cross-validation Basic idea: Hold out part of the data and use it to validate confidence levels
53 Cross-validation Positives Negatives
54 Cross-validation Positives Negatives Hold-out Label + - +
55 Cross-validation: training Positives Negatives
56 Cross-validation: training Positives Negatives Features 1 and 3 are green
57 Cross-validation: testing Features 1 and 3 green? Hold-out Yes
58 Cross-validation: testing Features 1 and 3 green? Yes Hold-out Confidence 1.0
59 Cross-validation: testing Features 1 and 3 green? Yes Hold-out Confidence 1.0 Label + - +
60 Cross-validation: testing Confidence cutoff # True Positives # False Positives Hold-out Confidence 1.0 Label + - +
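The slide's table of true and false positives on the hold-out set can be computed mechanically. A minimal sketch in Python, using illustrative hold-out confidences and labels (not the slide's actual values):

```python
# Count true and false positives among hold-out examples whose confidence
# meets a given cutoff: a TP is a positive-labelled example predicted
# positive, an FP is a negative-labelled example predicted positive.

def tp_fp_at_cutoff(confidences, labels, cutoff):
    """Return (#TP, #FP) on the hold-out set at this confidence cutoff."""
    tp = sum(1 for c, y in zip(confidences, labels) if c >= cutoff and y == 1)
    fp = sum(1 for c, y in zip(confidences, labels) if c >= cutoff and y == 0)
    return tp, fp

conf = [1.0, 0.5, 0.5, 0.0]   # illustrative hold-out confidences
labels = [1, 0, 1, 0]         # illustrative hold-out labels
print(tp_fp_at_cutoff(conf, labels, 1.0))  # (1, 0)
print(tp_fp_at_cutoff(conf, labels, 0.5))  # (2, 1)
```

Sweeping the cutoff from high to low produces the full validation table.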
61 N-fold cross-validation Step 1: Randomly reorder rows. Step 2: Split into N sets (e.g. N = 5). Step 3: Train N times, using each split, in turn, as the hold-out set. + - Labelled data, Permuted data, Training splits
63 Using N-fold cross-validation to assign confidence to predictions Training set Test set +ves -ves +ves -ves Classification statistics Thres #TP #FP Fold 1 Fold 2 Fold 3... Fold N
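The three steps of N-fold cross-validation can be sketched directly. This is an illustrative Python implementation, not code from the lecture; the data is a stand-in for the labelled rows:

```python
import random

# Step 1: randomly reorder the rows; Step 2: split into N sets;
# Step 3: iterate with each set held out in turn, training on the rest.

def n_fold_splits(data, n, seed=0):
    """Yield (training_split, hold_out) pairs for N-fold cross-validation."""
    permuted = data[:]
    random.Random(seed).shuffle(permuted)        # Step 1: permuted data
    folds = [permuted[i::n] for i in range(n)]   # Step 2: N roughly equal sets
    for i in range(n):                           # Step 3: each fold held out once
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]

data = list(range(10))                           # illustrative labelled rows
for train, held_out in n_fold_splits(data, n=5):
    print(len(train), len(held_out))             # 8 2, five times
```

Each fold's hold-out predictions contribute rows to the pooled classification statistics (threshold, #TP, #FP) across Fold 1 through Fold N.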
64 Cross-validation results Confidence cutoff # True Positives # False Positives
65 Displaying results: ROC curves Confidence cutoff # True Positives # False Positives (figure: ROC curve, # TP vs. # FP)
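The ROC curve is traced by lowering the confidence cutoff one prediction at a time and recording the running (#FP, #TP) counts. A minimal Python sketch with illustrative scores (ties in confidence are broken arbitrarily here; a careful implementation would step diagonally through tied scores):

```python
# Sweep the confidence cutoff from highest to lowest score; each positive
# encountered moves the curve up (#TP + 1), each negative moves it
# right (#FP + 1).

def roc_points(confidences, labels):
    """Return the list of (#FP, #TP) points of the ROC curve."""
    order = sorted(range(len(confidences)), key=lambda i: -confidences[i])
    tp = fp = 0
    points = [(0, 0)]
    for i in order:
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp, tp))
    return points

conf = [0.9, 0.8, 0.7, 0.6]   # illustrative confidences
labels = [1, 0, 1, 0]         # illustrative labels
print(roc_points(conf, labels))  # [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)]
```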
66 Making new predictions Confidence cutoff # True Positives # False Positives (figure: ROC curve with the chosen operating point marked x)
67 Figures of merit Confusion matrix (rows: actual T/F; columns: predicted T/F): TP, FN / FP, TN. Precision: #TP / (#TP + #FP) (also known as positive predictive value). Recall: #TP / (#TP + #FN) (also known as sensitivity). Specificity: #TN / (#FP + #TN). Negative predictive value: #TN / (#FN + #TN). Accuracy: (#TP + #TN) / (#TP + #FP + #TN + #FN)
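The five figures of merit follow directly from the four confusion-matrix counts. A small Python helper (illustrative counts, assumed nonzero denominators):

```python
# Compute the figures of merit from confusion-matrix counts.

def figures_of_merit(tp, fp, fn, tn):
    return {
        "precision": tp / (tp + fp),               # positive predictive value
        "recall": tp / (tp + fn),                  # sensitivity
        "specificity": tn / (fp + tn),
        "npv": tn / (fn + tn),                     # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

m = figures_of_merit(tp=8, fp=2, fn=4, tn=6)       # illustrative counts
print(m["precision"], m["accuracy"])               # 0.8 0.7
```

Note that precision and recall ignore the true negatives entirely, which is why they behave differently from accuracy when negatives vastly outnumber positives.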
68 Area under the ROC curve (figure: ROC curve, sensitivity vs. 1-specificity). Area Under the ROC Curve (AUC) = average proportion of negatives with confidence levels less than a random positive. Quick facts: 0 < AUC < 1; AUC of a random classifier = 0.5
69 Area under the ROC curve (figure: ROC curve, TP rate vs. FP rate). Area under the ROC curve (AUC): proportion of positive/negative pairs correctly ordered. Quick facts: 0 < AUC < 1; AUC of a random classifier = 0.5; AUC is equivalent to the Mann-Whitney U statistic. My favourite classification error measure: 1-AUC
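The pair-ordering view of AUC translates directly into code: enumerate every (positive, negative) pair and count how often the positive scores higher, counting ties as half. A Python sketch with illustrative scores (the O(#pos * #neg) loop is for clarity; rank-based computation is faster):

```python
# AUC as the proportion of (positive, negative) pairs in which the
# positive outranks the negative -- the Mann-Whitney U statistic
# scaled to [0, 1]. Ties contribute 0.5.

def auc(confidences, labels):
    pos = [c for c, y in zip(confidences, labels) if y == 1]
    neg = [c for c, y in zip(confidences, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))  # 0.75
```

A classifier that ranks every positive above every negative gets AUC 1.0; assigning everyone the same score gives 0.5, matching the random-classifier fact above.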
70 Precision-recall curves (figure: precision vs. recall). Baseline precision: #P / (#P + #N). Often, people report Area under the Precision-Recall (PR) curve (AUPRC) as a performance metric when the # of positives is low and when you want to make good predictions about which genes are positives. Area = average precision using thresholds determined by the positives. Unlike the ROC curve, the PR curve is not monotonic, nor is there a statistical test (that I know of) associated with it.
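"Average precision using thresholds determined by the positives" can be computed in a few lines: walk down the ranking, and each time a positive appears, record the precision at that depth. A Python sketch with illustrative scores (one common definition of average precision; variants exist, e.g. interpolated versions):

```python
# Average precision: precision evaluated at the threshold of each
# positive example, averaged over the positives.

def average_precision(confidences, labels):
    order = sorted(range(len(confidences)), key=lambda i: -confidences[i])
    tp = 0
    precisions = []
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            tp += 1
            precisions.append(tp / rank)   # precision at this positive's threshold
    return sum(precisions) / len(precisions)

print(average_precision([0.9, 0.8, 0.7], [1, 0, 1]))
```

Here the two positives sit at ranks 1 and 3, giving precisions 1/1 and 2/3 and an average precision of 5/6.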
71 Simple classifier based on Bayes rule. X is a real value. We know that if y = 0, then X ~ N(m_0, s) and if y = 1, then X ~ N(m_1, s). What is P(y = 1 | X = x)? P(y = 1 | X = x) = P(X = x | y = 1) P(y = 1) / P(X = x). Let's say P(y = 1) = p_1 [and P(y = 0) = p_0 = 1 - p_1]
72 Some definitions P(X = x | y = 1) = N(x; m_1, v) = Z(v)^-1 exp[-0.5 (x - m_1)^2 / v] P(X = x | y = 0) = N(x; m_0, v) = Z(v)^-1 exp[-0.5 (x - m_0)^2 / v] where Z(v) = (2πv)^(1/2)
73 P(y = 1 | X = x) = P(X = x | y = 1) P(y = 1) / [P(X = x | y = 1) P(y = 1) + P(X = x | y = 0) P(y = 0)] = 1 / (1 + P(X = x | y = 0) P(y = 0) / [P(X = x | y = 1) P(y = 1)]). Assuming p_0 = p_1: = 1 / (1 + exp[-0.5 (x - m_0)^2 / v + 0.5 (x - m_1)^2 / v]) = 1 / (1 + exp[x (m_0 - m_1) / v - 0.5 (m_0^2 - m_1^2) / v]). Note (m_0^2 - m_1^2) = (m_0 + m_1)(m_0 - m_1), so with w = (m_0 - m_1) / v: = 1 / (1 + exp[w (x - (m_0 + m_1) / 2)]) = 1 / (1 + exp[w (x - b)]), where b = (m_0 + m_1) / 2
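The derivation above can be checked numerically: under equal-variance Gaussian class-conditionals with equal priors, the Bayes posterior equals the logistic function 1 / (1 + exp[w (x - b)]) with w = (m_0 - m_1) / v and b = (m_0 + m_1) / 2. A Python sketch with illustrative parameter values:

```python
import math

# Compare the Bayes posterior computed from the Gaussian densities
# against the closed-form logistic expression from the derivation.

def gaussian(x, m, v):
    """N(x; m, v) with Z(v) = (2*pi*v)^(1/2)."""
    return math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)

def posterior_bayes(x, m0, m1, v):
    """P(y = 1 | X = x) via Bayes rule; equal priors p0 = p1 cancel."""
    p0, p1 = gaussian(x, m0, v), gaussian(x, m1, v)
    return p1 / (p0 + p1)

def posterior_logistic(x, m0, m1, v):
    """The same posterior via the logistic form 1 / (1 + exp[w (x - b)])."""
    w = (m0 - m1) / v
    b = (m0 + m1) / 2
    return 1.0 / (1.0 + math.exp(w * (x - b)))

x, m0, m1, v = 0.7, 0.0, 2.0, 1.5   # illustrative values
print(posterior_bayes(x, m0, m1, v))
print(posterior_logistic(x, m0, m1, v))
```

The two computations agree to floating-point precision, and the posterior crosses 0.5 exactly at x = b, the midpoint of the two class means.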
Machine Learning for Signal Processing Bayes Classification and Regression Instructor: Bhiksha Raj 11755/18797 1 Recap: KNN A very effective and simple way of performing classification Simple model: For
More informationMining Classification Knowledge
Mining Classification Knowledge Remarks on NonSymbolic Methods JERZY STEFANOWSKI Institute of Computing Sciences, Poznań University of Technology SE lecture revision 2013 Outline 1. Bayesian classification
More informationLearning Methods for Linear Detectors
Intelligent Systems: Reasoning and Recognition James L. Crowley ENSIMAG 2 / MoSIG M1 Second Semester 2011/2012 Lesson 20 27 April 2012 Contents Learning Methods for Linear Detectors Learning Linear Detectors...2
More informationAnomaly Detection for the CERN Large Hadron Collider injection magnets
Anomaly Detection for the CERN Large Hadron Collider injection magnets Armin Halilovic KU Leuven - Department of Computer Science In cooperation with CERN 2018-07-27 0 Outline 1 Context 2 Data 3 Preprocessing
More informationBias-Variance Tradeoff
What s learning, revisited Overfitting Generative versus Discriminative Logistic Regression Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University September 19 th, 2007 Bias-Variance Tradeoff
More informationIntroduction to Supervised Learning. Performance Evaluation
Introduction to Supervised Learning Performance Evaluation Marcelo S. Lauretto Escola de Artes, Ciências e Humanidades, Universidade de São Paulo marcelolauretto@usp.br Lima - Peru Performance Evaluation
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University October 11, 2012 Today: Computational Learning Theory Probably Approximately Coorrect (PAC) learning theorem
More informationGenerative v. Discriminative classifiers Intuition
Logistic Regression Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University September 24 th, 2007 1 Generative v. Discriminative classifiers Intuition Want to Learn: h:x a Y X features
More informationMicroarray Data Analysis: Discovery
Microarray Data Analysis: Discovery Lecture 5 Classification Classification vs. Clustering Classification: Goal: Placing objects (e.g. genes) into meaningful classes Supervised Clustering: Goal: Discover
More informationData Mining and Knowledge Discovery: Practice Notes
Data Mining and Knowledge Discovery: Practice Notes dr. Petra Kralj Novak Petra.Kralj.Novak@ijs.si 7.11.2017 1 Course Prof. Bojan Cestnik Data preparation Prof. Nada Lavrač: Data mining overview Advanced
More informationMachine Learning (CS 567) Lecture 3
Machine Learning (CS 567) Lecture 3 Time: T-Th 5:00pm - 6:20pm Location: GFS 118 Instructor: Sofus A. Macskassy (macskass@usc.edu) Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol
More informationMining Classification Knowledge
Mining Classification Knowledge Remarks on NonSymbolic Methods JERZY STEFANOWSKI Institute of Computing Sciences, Poznań University of Technology COST Doctoral School, Troina 2008 Outline 1. Bayesian classification
More informationIntroduction: MLE, MAP, Bayesian reasoning (28/8/13)
STA561: Probabilistic machine learning Introduction: MLE, MAP, Bayesian reasoning (28/8/13) Lecturer: Barbara Engelhardt Scribes: K. Ulrich, J. Subramanian, N. Raval, J. O Hollaren 1 Classifiers In this
More informationMachine Learning for NLP
Machine Learning for NLP Uppsala University Department of Linguistics and Philology Slides borrowed from Ryan McDonald, Google Research Machine Learning for NLP 1(50) Introduction Linear Classifiers Classifiers
More informationPerformance Evaluation and Hypothesis Testing
Performance Evaluation and Hypothesis Testing 1 Motivation Evaluating the performance of learning systems is important because: Learning systems are usually designed to predict the class of future unlabeled
More informationMachine Learning
Machine Learning 10-701 Tom M. Mitchell Machine Learning Department Carnegie Mellon University February 1, 2011 Today: Generative discriminative classifiers Linear regression Decomposition of error into
More informationIntroduction to Machine Learning. Introduction to ML - TAU 2016/7 1
Introduction to Machine Learning Introduction to ML - TAU 2016/7 1 Course Administration Lecturers: Amir Globerson (gamir@post.tau.ac.il) Yishay Mansour (Mansour@tau.ac.il) Teaching Assistance: Regev Schweiger
More informationTufts COMP 135: Introduction to Machine Learning
Tufts COMP 135: Introduction to Machine Learning https://www.cs.tufts.edu/comp/135/2019s/ Logistic Regression Many slides attributable to: Prof. Mike Hughes Erik Sudderth (UCI) Finale Doshi-Velez (Harvard)
More information10-701/ Machine Learning - Midterm Exam, Fall 2010
10-701/15-781 Machine Learning - Midterm Exam, Fall 2010 Aarti Singh Carnegie Mellon University 1. Personal info: Name: Andrew account: E-mail address: 2. There should be 15 numbered pages in this exam
More informationSupport Vector Machines (SVM) in bioinformatics. Day 1: Introduction to SVM
1 Support Vector Machines (SVM) in bioinformatics Day 1: Introduction to SVM Jean-Philippe Vert Bioinformatics Center, Kyoto University, Japan Jean-Philippe.Vert@mines.org Human Genome Center, University
More informationLinear and Logistic Regression. Dr. Xiaowei Huang
Linear and Logistic Regression Dr. Xiaowei Huang https://cgi.csc.liv.ac.uk/~xiaowei/ Up to now, Two Classical Machine Learning Algorithms Decision tree learning K-nearest neighbor Model Evaluation Metrics
More informationLogistic Regression. COMP 527 Danushka Bollegala
Logistic Regression COMP 527 Danushka Bollegala Binary Classification Given an instance x we must classify it to either positive (1) or negative (0) class We can use {1,-1} instead of {1,0} but we will
More information