Advances in Anomaly Detection


1 Advances in Anomaly Detection. Tom Dietterich, Alan Fern, Weng-Keen Wong, Andrew Emmott, Shubhomoy Das, Md. Amran Siddiqui, Tadesse Zemicheal

2 Outline: Introduction (three application areas); Two general approaches to anomaly detection (under-fitting, over-fitting); DARPA ADAMS Red Team results; Benchmarks for Anomaly Detection (validation, comparison study); Next Steps (anomaly explanations, ensembles)

3 Why Anomaly Detection? Data cleaning: find data points that contain errors. Science: find data points that are interesting or unusual. Security / fraud detection: find users/customers who are behaving weirdly.

4 Data Cleaning for Sensor Networks. An ideal method should produce two things given raw data: [Figure: raw air temperature (degrees Celsius) vs. day index from start of deployment; series x11-x31.]

5 Data Cleaning for Sensor Networks. An ideal method should produce two things given raw data: a label that marks anomalies. [Figure: air temperature (degrees Celsius) vs. day index, with anomaly labels; series x11-x31.]

6 Data Cleaning for Sensor Networks. An ideal method should produce two things given raw data: a label that marks anomalies, and an imputation of the true value (with some confidence measure). Dereszynski & Dietterich, ACM TOS. [Figure: air temperature (degrees Celsius) vs. day index, with imputed values; series x11-x31.]

7 NASA: Finding Interesting Data Points. Ingest the data set; rank points by interestingness; repeat: show the most interesting point to the scientist (Yes: interesting / No: not interesting) and build a model of the uninteresting points. The most interesting point == the most un-uninteresting point, i.e., the most extreme outlier among the uninteresting points. Examples: Mars Science Laboratory ChemCam, olivine, first non-carbonate. Wagstaff, Lanza, Thompson, Dietterich, Gilmore. AAAI

8 Security/Fraud Detection: DARPA ADAMS Program. Desktop activity data collected from ~5000 employees of a corporation using Raytheon-Oakley SureView. The CERT Red Team overlays selected employees with insider threat activity based on real scenarios. Example scenarios: Anomalous Encryption, Layoff Logic Bomb, Insider Startup, Circumventing SureView, Hiding Undue Affluence, Survivor's Burden. Team: LEIDOS (formerly SAIC); Ted Senator, PI; Rand Waltzman, PM.

9 Outline: Introduction (three application areas); Two general approaches to anomaly detection (under-fitting, over-fitting); DARPA ADAMS Red Team results; Benchmarks for Anomaly Detection (validation, comparison study); Next Steps (anomaly explanations, ensembles)

10 What is Anomaly Detection? Input: vectors x_i ∈ R^d for i = 1, …, N, assumed to be a mix of normal and anomalous data points. Anomalies are generated by some distinct process (e.g., instrument failures, fraud, intruders, etc.). Output: an anomaly score s_i for each input x_i such that higher scores are more anomalous and similar scores imply similar levels of anomalousness. Metrics: AUC (the probability that a randomly-chosen anomaly is ranked above a randomly-chosen normal point); precision in the top K.
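The AUC metric above has a direct pairwise reading. A minimal sketch (plain Python, not from the talk) that computes AUC exactly as the slide defines it, as the probability that a randomly-chosen anomaly outscores a randomly-chosen normal point:

```python
def auc_from_scores(anomaly_scores, normal_scores):
    """AUC = P(random anomaly is ranked above a random normal point);
    ties count as 1/2."""
    pairs = [(a > n) + 0.5 * (a == n)
             for a in anomaly_scores for n in normal_scores]
    return sum(pairs) / len(pairs)

# A detector that scores every anomaly above every normal point gets 1.0
print(auc_from_scores([5.0, 4.0], [1.0, 2.0, 3.0]))  # -> 1.0
```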

11 Two General Approaches. Anomaly detection by under-fitting: Gaussian Mixture Model (GMM); Ensemble of Gaussian Mixture Models (EGMM). Anomaly detection by over-fitting: Isolation Forest (IFOR); Repeated Impossible Discrimination Ensemble (RIDE).

12 Anomaly Detection by Under-Fitting. Choose a class of models and fit to the data. Let P_θ(x_i) be the probability density assigned to data point x_i by the model θ. Assign score s_i = −log P_θ(x_i). Low-density points (poorly explained by the model) are the anomalies.
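The under-fitting recipe can be sketched with any off-the-shelf density model; scikit-learn's GaussianMixture is used here as one convenient model class (an assumption, not a tool named on the slide):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))        # training data: the "normal" process

# Fit the model class to the data
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

def anomaly_score(points):
    """s_i = -log P_theta(x_i): low density => high anomaly score."""
    return -gmm.score_samples(np.atleast_2d(points))

# A point far from the fitted density scores much higher than a typical one
print(anomaly_score([0.1, 0.2])[0] < anomaly_score([9.0, 9.0])[0])  # -> True
```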

13 Example: Gaussian Mixture Model. P(x) = Σ_{k=1..K} p_k · Normal(x | μ_k, Σ_k). [Figure: mixture with K = 3.]

14 Ensemble of GMMs. Train M independent Gaussian Mixture Models: train model m = 1, …, M on a bootstrap replicate of the data, varying the number of clusters K. Delete any model with log likelihood < 70% of the best model. Compute the average surprise: (1/M) Σ_m −log P_m(x_i).
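The EGMM steps above can be sketched as follows (a minimal illustration, not the authors' implementation; the slide's "70% of best log likelihood" filter is interpreted loosely here, since log-likelihoods are negative):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def egmm_scores(X_train, X_eval, n_models=10, ks=(2, 3, 4), seed=0):
    """Sketch of EGMM: fit each GMM on a bootstrap replicate with a
    varying number of clusters K, drop clearly bad fits, and return the
    average surprise (1/M) * sum_m -log P_m(x_i) on X_eval."""
    rng = np.random.default_rng(seed)
    fits = []
    for m in range(n_models):
        boot = X_train[rng.integers(len(X_train), size=len(X_train))]
        gmm = GaussianMixture(n_components=ks[m % len(ks)],
                              random_state=m).fit(boot)
        fits.append((gmm.score(X_train), gmm))   # mean log-likelihood on X
    # Loose stand-in for "delete models with log likelihood < 70% of best":
    # keep models within 30% (in absolute terms) of the best fit.
    best = max(ll for ll, _ in fits)
    kept = [g for ll, g in fits if ll >= best - 0.3 * abs(best)]
    return -np.mean([g.score_samples(X_eval) for g in kept], axis=0)

X = np.random.default_rng(1).normal(size=(400, 2))
scores = egmm_scores(X, np.vstack([X, [[9.0, 9.0]]]))
print(scores[-1] > scores[:-1].max())   # the far point is most surprising
```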

15 DARPA ADAMS Vegas Results. Score each user and rank them all. AUC = probability that we correctly rank a randomly-chosen Red Team insert above a randomly-chosen normal user. [ROC curves: Vegas Sept 2012 (AUC = 0.970) and Vegas Oct 2012; the remaining AUC and AvgLift values were not recovered.]

16 New Approach: Anomaly Detection by Over-Fitting. Take the input points; randomly split in half and label one half 0, the other half 1. Apply supervised learning to discriminate the 0's from the 1's (which by construction is impossible): score(x) = |0.5 − P(y_i = 1)|. Repeat the random split and the discrimination, accumulating score(x) = |0.5 − P(y_i = 1)|; total score after 50 iterations. RIDE: Repeated Impossible Discrimination Ensemble.
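The RIDE loop can be sketched directly. The slide does not name the discriminator, so a depth-limited decision tree is assumed here for illustration; normal points in dense regions stay near P(y=1|x) = 0.5, while points the learner can still pick out drift away from it:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def ride_scores(X, n_rounds=50, seed=0):
    """Repeated Impossible Discrimination Ensemble (sketch).
    Repeatedly assign random half/half 0-1 labels, train a discriminator
    on this impossible task, and accumulate |0.5 - P(y=1 | x)|."""
    rng = np.random.default_rng(seed)
    total = np.zeros(len(X))
    for r in range(n_rounds):
        y = rng.permutation(len(X)) % 2          # random half/half labels
        clf = DecisionTreeClassifier(max_depth=4, random_state=r).fit(X, y)
        total += np.abs(0.5 - clf.predict_proba(X)[:, 1])
    return total / n_rounds

X = np.random.default_rng(2).normal(size=(300, 2))
scores = ride_scores(X, n_rounds=20)
print(scores.shape, float(scores.max()) <= 0.5)
```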

17 RIDE Vegas Results. [ROC curves for Vegas Sept and Vegas Oct; the AUC values were not recovered.]

18 Isolation Forest [Liu, Ting, Zhou, 2011]. Construct a fully random binary tree: choose attribute j at random; choose splitting threshold θ uniformly from [min x_j, max x_j]; repeat until every data point is in its own leaf. Let d(x_i) be the depth of point x_i. Repeat 100 times and let d̄(x_i) be the average depth of x_i. score(x_i) = 2^(−d̄(x_i)/r(x_i)), where r(x_i) is the expected depth. [Figure: example tree of random splits x_j > θ.]
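The isolation idea above can be sketched from scratch. For simplicity this sketch replays fresh random splits per point instead of building shared trees, which yields the same distribution of isolation depths; it is an illustration, not the reference implementation:

```python
import math
import random

def expected_depth(n):
    """r(n): average depth of an unsuccessful search in a binary search
    tree over n points (the standard isolation-forest normalizer)."""
    if n <= 1:
        return 1.0
    h = math.log(n - 1) + 0.5772156649   # harmonic-number approximation
    return 2.0 * h - 2.0 * (n - 1) / n

def isolation_depth(x, points, rng, depth=0):
    """Depth at which fully random axis-aligned splits isolate x."""
    if len(points) <= 1:
        return depth
    j = rng.randrange(len(x))                         # random attribute j
    lo = min(p[j] for p in points); hi = max(p[j] for p in points)
    if lo == hi:                                      # cannot split further
        return depth
    theta = rng.uniform(lo, hi)                       # random threshold
    side = [p for p in points if (p[j] <= theta) == (x[j] <= theta)]
    return isolation_depth(x, side, rng, depth + 1)

def iforest_scores(X, n_trees=100, seed=0):
    rng = random.Random(seed)
    r = expected_depth(len(X))
    scores = []
    for x in X:
        dbar = sum(isolation_depth(x, X, rng) for _ in range(n_trees)) / n_trees
        scores.append(2.0 ** (-dbar / r))             # the slide's score
    return scores

cluster = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
           [0.05, 0.05], [0.02, 0.08], [0.07, 0.03], [0.09, 0.06]]
X = cluster + [[10.0, 10.0]]                          # one obvious outlier
scores = iforest_scores(X)
print(scores[-1] == max(scores))   # the outlier isolates fastest -> True
```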

19 Outline: Introduction (three application areas); Two general approaches to anomaly detection (under-fitting, over-fitting); DARPA ADAMS Red Team results; Benchmarks for Anomaly Detection (validation, comparison study); Next Steps (anomaly explanations, ensembles)

20 VEGAS Results May 2013

21 VEGAS Results June 2013

22 VEGAS Results July 2013

23 Outline: Introduction (three application areas); Two general approaches to anomaly detection (under-fitting, over-fitting); DARPA ADAMS Red Team results; Benchmarks for Anomaly Detection (validation, comparison study); Next Steps (anomaly explanations, ensembles)

24 Needed: Benchmarks for Anomaly Detection Algorithms. Shared benchmark databases have helped supervised learning make rapid progress (the UCI Repository of Machine Learning Data Sets). Anomaly detection lacks shared benchmarks: most data sets are proprietary and/or classified. Exception: the Lincoln Labs Simulated Network Intrusion data set (hopelessly out of date). Goal: develop a collection of benchmark data sets with known properties.

25 Benchmark Requirements. The underlying process generating the anomalies should be distinct from the process generating the normal points: anomalies are not merely outliers. We need many benchmark data sets, to prevent the research community from fixating on a small number of problems. Benchmark data sets should systematically vary a set of relevant properties.

26 Relevant Properties. Point difficulty: how difficult is it to separate each individual anomaly point from the normal points? Relative frequency: how rare are the anomalies? Clusteredness: are the anomalous points tightly clustered or widely scattered? Irrelevant features: how many features are irrelevant?

27 Creating an Anomaly Detection Benchmark Data Set. Select a UCI supervised learning dataset. Choose one class to be the anomalies (call this class 0 and the union of the other classes class 1); this ensures that different processes are generating the anomalies and the normal points. Computing point difficulty: fit a kernel logistic regression model to estimate P(y = 1 | x), where y is the class label (the "oracle" model). The difficulty of x_i is defined as P(y = 1 | x_i) for anomaly points according to the oracle. For the desired relative frequency, select points based on difficulty and clusteredness. Optionally: add irrelevant features by selecting existing features and randomly permuting their values.
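The point-difficulty computation can be sketched end-to-end. Plain logistic regression stands in for the kernel logistic regression oracle, and scikit-learn's digits data stands in for a UCI mother set (both are simplifying assumptions for illustration):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

# Stand-in "mother" set: treat digit 0 as the anomaly class (class 0),
# and the union of all other digits as the normal class (class 1).
X, digit = load_digits(return_X_y=True)
y = (digit != 0).astype(int)

# Oracle model estimating P(y = 1 | x)
oracle = LogisticRegression(max_iter=2000).fit(X, y)
p_normal = oracle.predict_proba(X)[:, 1]

# Difficulty of each anomaly point = P(y = 1 | x) under the oracle;
# a difficulty below 0.16 falls in the benchmarks' "low" bin.
difficulty = p_normal[y == 0]
print(len(difficulty), float((difficulty < 0.16).mean()))
```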

28 Benchmark Collection. 19 "mother" UCI data sets. Point difficulty: low (0, 0.16), medium [0.16, 0.33), high [0.33, 0.5), very high [0.5, 1). Relative frequency: 0.001, 0.005, 0.01, 0.05, 0.1. Clusteredness: 7 levels based on log(σ_n² / σ_a²), the variance of the normal points divided by the variance of the anomalous points; a facility location algorithm is used to select well-spaced points, and seed-point neighbors are used to find clustered points. Irrelevant features: 4 levels based on increasing the average distance between normal points. 24,800 benchmark data sets generated.

29 Benchmarking Study. State-of-the-art methods: ocsvm: one-class SVM (Schoelkopf et al., 1999); lof: Local Outlier Factor (Breunig et al., 2000); svdd: Support Vector Data Description (Tax & Duin, 2004); if: Isolation Forest (Liu et al., 2008, 2011); scif: SCiForest (Liu et al., 2010); rkde: Robust Kernel Density Estimation (Kim & Scott, 2012); egmm: ours. Analysis: measure the AUC of each method; compute the mean AUC for each method; fit a logistic regression model: logit(AUC) = method + difficulty + frequency + clusteredness + irrelevance.

30 Benchmark Validity: Point Difficulty

31 Benchmark Validity: Relative Frequency

32 Benchmark Validity: Clusteredness

33 Algorithm Comparisons: Mean AUC. [Bar chart: mean AUC for if, lof, rkde, egmm, svdd, scif, ocsvm.]

34 Algorithm Comparisons: Logistic Regression Results. if: Isolation Forest (Liu et al., 2011); rkde: Robust Kernel Density Estimation (Kim & Scott, 2012); egmm: ours; lof: Local Outlier Factor (Breunig et al., 2000); ocsvm: one-class SVM (Schoelkopf et al., 1999); svdd: Support Vector Data Description (Tax & Duin, 2004).

35 Sensitivity to Irrelevant Features. The performance of all methods drops with an increasing number of irrelevant features. RKDE and IFOR perform very well; OCSVM is extremely sensitive; EGMM was hurt by the largest level of irrelevance but is the top performer when there is no noise. [Chart: average AUC at irrelevance levels 0-3 for egmm, if, lof, rkde, svdd, scif, ocsvm.]

36 Outline: Introduction (three application areas); Two general approaches to anomaly detection (under-fitting, over-fitting); DARPA ADAMS Red Team results; Benchmarks for Anomaly Detection (validation, comparison study); Next Steps (anomaly explanations, ensembles)

37 Next Steps. Generate explanations of each anomaly for the analyst. Ensembles. Model the peer-group structure of the organization: the same user in previous days; all users in the company today; users with the same job class; users who work together; shared-printer cliques.

38 Anomaly Explanations. [Pipeline: data points → anomaly detector → outliers become alarms (threats & false positives) passed to the human analyst; non-outliers are discarded (non-threats & missed threats).] Type 1 missed threats = anomaly detector false negatives; reduce these by improving the anomaly detector. Type 2 missed threats = analyst false negatives; these can occur due to information overload and time constraints. We consider reducing Type 2 misses by providing explanations: why did the detector consider an object to be an outlier? The analyst can then focus on information related to the explanation.

39 Sequential Feature Explanations. [Pipeline: outliers + explanations → alarms → human analyst → threats & false positives.] Goal: reduce analyst effort for correctly detecting outliers that are threats. How: provide the analyst with sequential feature explanations of outlier points. Sequential Feature Explanation (SFE): an ordering on the features of an outlier, prioritized by importance to the anomaly detector. Protocol: incrementally reveal features ordered by the SFE until the analyst can make a confident determination.

40 Typical Sequential Feature Explanation Curve. Performance metric: the number of features that must be examined by the analyst in order to make a confident decision that a proposed threat (outlier) requires opening an investigation.

41 Evaluating Explanations. Methodological problem: evaluation requires access to an analyst, but we can't run large-scale experiments with real analysts. Solution: construct simulated analysts that compute P(normal | x). How: start with an anomaly detection benchmark constructed from a UCI supervised learning data set [Emmott et al., 2013]; learn a classifier to predict anomaly vs. normal from labeled data (cheating). [Diagram: UCI dataset (normal points + anomaly points) → supervised learning → simulated analyst classifier P(normal | x); repeat for each subset of K features.]

42 Explanation Methods for Density-Based Anomaly Detectors. Density-based: rank points x according to estimated density f(x). Marginal methods greedily add features that most decrease the joint marginal f(x_1, …, x_K). Sequential Marginal: the first feature x_i minimizes f(x_i); the second feature x_j minimizes f(x_i, x_j); and so on. Independent Marginal: order features by f(x_i). Dropout methods greedily remove features that most increase f(x). Sequential Dropout: the first feature x_i maximizes f(x_(−i)); the second feature x_j maximizes f(x_(−i,−j)); and so on. Independent Dropout: order features by f(x_(−i)).
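The two "independent" orderings can be sketched with a kernel density estimate. SciPy's gaussian_kde is used here as one convenient density model (an assumption; the talk's detector is EGMM):

```python
import numpy as np
from scipy.stats import gaussian_kde

def independent_marginal_sfe(X, x):
    """Order features of outlier x by the 1-D marginal density f(x_i),
    most surprising (lowest density) first."""
    dens = [gaussian_kde(X[:, i])(x[i])[0] for i in range(X.shape[1])]
    return np.argsort(dens)

def independent_dropout_sfe(X, x):
    """Order features by f(x_{-i}), the density of x with feature i
    dropped: the drop that most 'normalizes' x ranks first."""
    d = X.shape[1]
    dens = []
    for i in range(d):
        keep = [j for j in range(d) if j != i]
        dens.append(gaussian_kde(X[:, keep].T)(x[keep])[0])
    return np.argsort(dens)[::-1]        # highest remaining density first

X = np.random.default_rng(3).normal(size=(200, 3))
x = np.array([0.1, 9.0, -0.2])           # feature 1 is the anomalous one
print(int(independent_marginal_sfe(X, x)[0]),
      int(independent_dropout_sfe(X, x)[0]))  # -> 1 1
```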

43 Empirical Demonstration. Datasets: 10,000 benchmarks derived from 7 UCI datasets. Anomaly detector: Ensemble of Gaussian Mixture Models (EGMM). Simulated analysts: Regularized Random Forests (RRFs). Evaluation metric: mean minimum feature prefix (MMFP) = the average number of features revealed before the analyst is able to make a decision (exonerate vs. open an investigation).

44 Results (EGMM + Explanation Method). [Bar chart: MMFP for IndDO, IndMarg, OptOracle, SeqDO, SeqMarg, Random.] In these domains, an oracle only needs 1-2 features. Dropout methods are often worse than marginal methods. There is often no benefit to sequential methods over independent methods. Random is always worst.

45 Results (EGMM + Explanation Method). All methods significantly beat random. Marginal methods are no worse and sometimes better than dropout. Independent marginal is nearly as good as sequential marginal.

46 KDD99 (Computer Intrusion) Results (EGMM detector). [Bar chart: MMFP with 95% confidence intervals for Independent Dropout, Sequential Dropout, Independent Marginal, Sequential Marginal.] Marginal methods are best; one feature is enough!

47 Ensemble Methods. In supervised learning, ensemble methods have been shown to be very powerful: bagging, random forests, boosting. Can we develop general-purpose ensemble methods for anomaly detection? Our methods employ internal ensembles; can we combine heterogeneous anomaly detection algorithms into an external ensemble?

48 Comparison of Ensemble Methods. [Bar chart: Ensemble Comparison (MAGIC Gamma Telescope); change in logit(AUC) with respect to the 2-component Gaussian (gauss-model) for PCA, Schubert, Schubert-info, glmnet (L1 logistic regression), and Isolation Forest (the best non-ensemble method).]

49 Ensemble Conclusions. No convincing evidence that ensembles work better than simply running iforest.

50 Concluding Remarks. Anomaly detection has received relatively little study in machine learning, statistics, and data mining. There are two main paradigms for designing algorithms: anomaly detection by under-fitting and anomaly detection by over-fitting. The over-fitting paradigm is producing interesting algorithms; they also require less modeling effort and can be very efficient. In the analyst case, simple marginal scores work very well for sequential feature explanations.

51 Questions?

Reference: Andrew Emmott (emmott@eecs.oregonstate.edu) and Thomas Dietterich (tgd@eecs.oregonstate.edu). Systematic Construction of Anomaly Detection Benchmarks from Real Data. Outlier Detection and Description Workshop, 2013.


More information

Class 4: Classification. Quaid Morris February 11 th, 2011 ML4Bio

Class 4: Classification. Quaid Morris February 11 th, 2011 ML4Bio Class 4: Classification Quaid Morris February 11 th, 211 ML4Bio Overview Basic concepts in classification: overfitting, cross-validation, evaluation. Linear Discriminant Analysis and Quadratic Discriminant

More information

CSCI-567: Machine Learning (Spring 2019)

CSCI-567: Machine Learning (Spring 2019) CSCI-567: Machine Learning (Spring 2019) Prof. Victor Adamchik U of Southern California Mar. 19, 2019 March 19, 2019 1 / 43 Administration March 19, 2019 2 / 43 Administration TA3 is due this week March

More information

Machine Learning, Fall 2009: Midterm

Machine Learning, Fall 2009: Midterm 10-601 Machine Learning, Fall 009: Midterm Monday, November nd hours 1. Personal info: Name: Andrew account: E-mail address:. You are permitted two pages of notes and a calculator. Please turn off all

More information

Midterm exam CS 189/289, Fall 2015

Midterm exam CS 189/289, Fall 2015 Midterm exam CS 189/289, Fall 2015 You have 80 minutes for the exam. Total 100 points: 1. True/False: 36 points (18 questions, 2 points each). 2. Multiple-choice questions: 24 points (8 questions, 3 points

More information

Boosting: Foundations and Algorithms. Rob Schapire

Boosting: Foundations and Algorithms. Rob Schapire Boosting: Foundations and Algorithms Rob Schapire Example: Spam Filtering problem: filter out spam (junk email) gather large collection of examples of spam and non-spam: From: yoav@ucsd.edu Rob, can you

More information

Learning theory. Ensemble methods. Boosting. Boosting: history

Learning theory. Ensemble methods. Boosting. Boosting: history Learning theory Probability distribution P over X {0, 1}; let (X, Y ) P. We get S := {(x i, y i )} n i=1, an iid sample from P. Ensemble methods Goal: Fix ɛ, δ (0, 1). With probability at least 1 δ (over

More information

Machine Learning. Lecture 9: Learning Theory. Feng Li.

Machine Learning. Lecture 9: Learning Theory. Feng Li. Machine Learning Lecture 9: Learning Theory Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 2018 Why Learning Theory How can we tell

More information

Summary and discussion of: Dropout Training as Adaptive Regularization

Summary and discussion of: Dropout Training as Adaptive Regularization Summary and discussion of: Dropout Training as Adaptive Regularization Statistics Journal Club, 36-825 Kirstin Early and Calvin Murdock November 21, 2014 1 Introduction Multi-layered (i.e. deep) artificial

More information

Loss Functions, Decision Theory, and Linear Models

Loss Functions, Decision Theory, and Linear Models Loss Functions, Decision Theory, and Linear Models CMSC 678 UMBC January 31 st, 2018 Some slides adapted from Hamed Pirsiavash Logistics Recap Piazza (ask & answer questions): https://piazza.com/umbc/spring2018/cmsc678

More information

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning ECE 5424: Introduction to Machine Learning Topics: Ensemble Methods: Bagging, Boosting PAC Learning Readings: Murphy 16.4;; Hastie 16 Stefan Lee Virginia Tech Fighting the bias-variance tradeoff Simple

More information

Support Vector Machines. CSE 4309 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Support Vector Machines. CSE 4309 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington Support Vector Machines CSE 4309 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 A Linearly Separable Problem Consider the binary classification

More information

Decision Trees: Overfitting

Decision Trees: Overfitting Decision Trees: Overfitting Emily Fox University of Washington January 30, 2017 Decision tree recap Loan status: Root 22 18 poor 4 14 Credit? Income? excellent 9 0 3 years 0 4 Fair 9 4 Term? 5 years 9

More information

From statistics to data science. BAE 815 (Fall 2017) Dr. Zifei Liu

From statistics to data science. BAE 815 (Fall 2017) Dr. Zifei Liu From statistics to data science BAE 815 (Fall 2017) Dr. Zifei Liu Zifeiliu@ksu.edu Why? How? What? How much? How many? Individual facts (quantities, characters, or symbols) The Data-Information-Knowledge-Wisdom

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

Roberto Perdisci^+, Guofei Gu^, Wenke Lee^ presented by Roberto Perdisci. ^Georgia Institute of Technology, Atlanta, GA, USA

Roberto Perdisci^+, Guofei Gu^, Wenke Lee^ presented by Roberto Perdisci. ^Georgia Institute of Technology, Atlanta, GA, USA U s i n g a n E n s e m b l e o f O n e - C l a s s S V M C l a s s i f i e r s t o H a r d e n P a y l o a d - B a s e d A n o m a l y D e t e c t i o n S y s t e m s Roberto Perdisci^+, Guofei Gu^, Wenke

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 Exam policy: This exam allows two one-page, two-sided cheat sheets; No other materials. Time: 2 hours. Be sure to write your name and

More information

Bagging and Other Ensemble Methods

Bagging and Other Ensemble Methods Bagging and Other Ensemble Methods Sargur N. Srihari srihari@buffalo.edu 1 Regularization Strategies 1. Parameter Norm Penalties 2. Norm Penalties as Constrained Optimization 3. Regularization and Underconstrained

More information

The exam is closed book, closed notes except your one-page cheat sheet.

The exam is closed book, closed notes except your one-page cheat sheet. CS 189 Fall 2015 Introduction to Machine Learning Final Please do not turn over the page before you are instructed to do so. You have 2 hours and 50 minutes. Please write your initials on the top-right

More information

Click Prediction and Preference Ranking of RSS Feeds

Click Prediction and Preference Ranking of RSS Feeds Click Prediction and Preference Ranking of RSS Feeds 1 Introduction December 11, 2009 Steven Wu RSS (Really Simple Syndication) is a family of data formats used to publish frequently updated works. RSS

More information

Machine Learning for NLP

Machine Learning for NLP Machine Learning for NLP Linear Models Joakim Nivre Uppsala University Department of Linguistics and Philology Slides adapted from Ryan McDonald, Google Research Machine Learning for NLP 1(26) Outline

More information

A Framework for Adaptive Anomaly Detection Based on Support Vector Data Description

A Framework for Adaptive Anomaly Detection Based on Support Vector Data Description A Framework for Adaptive Anomaly Detection Based on Support Vector Data Description Min Yang, HuanGuo Zhang, JianMing Fu, and Fei Yan School of Computer, State Key Laboratory of Software Engineering, Wuhan

More information

CS249: ADVANCED DATA MINING

CS249: ADVANCED DATA MINING CS249: ADVANCED DATA MINING Vector Data: Clustering: Part II Instructor: Yizhou Sun yzsun@cs.ucla.edu May 3, 2017 Methods to Learn: Last Lecture Classification Clustering Vector Data Text Data Recommender

More information

Final Exam, Fall 2002

Final Exam, Fall 2002 15-781 Final Exam, Fall 22 1. Write your name and your andrew email address below. Name: Andrew ID: 2. There should be 17 pages in this exam (excluding this cover sheet). 3. If you need more room to work

More information

FRaC: A Feature-Modeling Approach for Semi-Supervised and Unsupervised Anomaly Detection

FRaC: A Feature-Modeling Approach for Semi-Supervised and Unsupervised Anomaly Detection Noname manuscript No. (will be inserted by the editor) FRaC: A Feature-Modeling Approach for Semi-Supervised and Unsupervised Anomaly Detection Keith Noto Carla Brodley Donna Slonim Received: date / Accepted:

More information

Large-Margin Thresholded Ensembles for Ordinal Regression

Large-Margin Thresholded Ensembles for Ordinal Regression Large-Margin Thresholded Ensembles for Ordinal Regression Hsuan-Tien Lin and Ling Li Learning Systems Group, California Institute of Technology, U.S.A. Conf. on Algorithmic Learning Theory, October 9,

More information

Chart types and when to use them

Chart types and when to use them APPENDIX A Chart types and when to use them Pie chart Figure illustration of pie chart 2.3 % 4.5 % Browser Usage for April 2012 18.3 % 38.3 % Internet Explorer Firefox Chrome Safari Opera 35.8 % Pie chart

More information

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October,

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, 23 2013 The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run

More information

Machine Learning: Chenhao Tan University of Colorado Boulder LECTURE 9

Machine Learning: Chenhao Tan University of Colorado Boulder LECTURE 9 Machine Learning: Chenhao Tan University of Colorado Boulder LECTURE 9 Slides adapted from Jordan Boyd-Graber Machine Learning: Chenhao Tan Boulder 1 of 39 Recap Supervised learning Previously: KNN, naïve

More information

A Simple Algorithm for Learning Stable Machines

A Simple Algorithm for Learning Stable Machines A Simple Algorithm for Learning Stable Machines Savina Andonova and Andre Elisseeff and Theodoros Evgeniou and Massimiliano ontil Abstract. We present an algorithm for learning stable machines which is

More information

Data Mining und Maschinelles Lernen

Data Mining und Maschinelles Lernen Data Mining und Maschinelles Lernen Ensemble Methods Bias-Variance Trade-off Basic Idea of Ensembles Bagging Basic Algorithm Bagging with Costs Randomization Random Forests Boosting Stacking Error-Correcting

More information

Support Vector Machines. CAP 5610: Machine Learning Instructor: Guo-Jun QI

Support Vector Machines. CAP 5610: Machine Learning Instructor: Guo-Jun QI Support Vector Machines CAP 5610: Machine Learning Instructor: Guo-Jun QI 1 Linear Classifier Naive Bayes Assume each attribute is drawn from Gaussian distribution with the same variance Generative model:

More information

Predicting Storms: Logistic Regression versus Random Forests for Unbalanced Data

Predicting Storms: Logistic Regression versus Random Forests for Unbalanced Data CS-BIGS 1(2): 91-101 2007 CS-BIGS http://www.bentley.edu/cdbigs/vol1-2/ruiz.pdf Predicting Storms: Logistic Regression versus Random Forests for Unbalanced Data Anne Ruiz-Gazen Institut de Mathématiques

More information

Machine Learning (CS 567) Lecture 2

Machine Learning (CS 567) Lecture 2 Machine Learning (CS 567) Lecture 2 Time: T-Th 5:00pm - 6:20pm Location: GFS118 Instructor: Sofus A. Macskassy (macskass@usc.edu) Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol

More information

Automated Discovery of Novel Anomalous Patterns

Automated Discovery of Novel Anomalous Patterns Automated Discovery of Novel Anomalous Patterns Edward McFowland III Machine Learning Department School of Computer Science Carnegie Mellon University mcfowland@cmu.edu DAP Committee: Daniel B. Neill Jeff

More information

UVA CS 4501: Machine Learning

UVA CS 4501: Machine Learning UVA CS 4501: Machine Learning Lecture 21: Decision Tree / Random Forest / Ensemble Dr. Yanjun Qi University of Virginia Department of Computer Science Where are we? è Five major sections of this course

More information

day month year documentname/initials 1

day month year documentname/initials 1 ECE471-571 Pattern Recognition Lecture 13 Decision Tree Hairong Qi, Gonzalez Family Professor Electrical Engineering and Computer Science University of Tennessee, Knoxville http://www.eecs.utk.edu/faculty/qi

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write

More information

Randomized Algorithms

Randomized Algorithms Randomized Algorithms Saniv Kumar, Google Research, NY EECS-6898, Columbia University - Fall, 010 Saniv Kumar 9/13/010 EECS6898 Large Scale Machine Learning 1 Curse of Dimensionality Gaussian Mixture Models

More information

Random Forests. These notes rely heavily on Biau and Scornet (2016) as well as the other references at the end of the notes.

Random Forests. These notes rely heavily on Biau and Scornet (2016) as well as the other references at the end of the notes. Random Forests One of the best known classifiers is the random forest. It is very simple and effective but there is still a large gap between theory and practice. Basically, a random forest is an average

More information

Classification. Classification is similar to regression in that the goal is to use covariates to predict on outcome.

Classification. Classification is similar to regression in that the goal is to use covariates to predict on outcome. Classification Classification is similar to regression in that the goal is to use covariates to predict on outcome. We still have a vector of covariates X. However, the response is binary (or a few classes),

More information

Manual for a computer class in ML

Manual for a computer class in ML Manual for a computer class in ML November 3, 2015 Abstract This document describes a tour of Machine Learning (ML) techniques using tools in MATLAB. We point to the standard implementations, give example

More information

6.036 midterm review. Wednesday, March 18, 15

6.036 midterm review. Wednesday, March 18, 15 6.036 midterm review 1 Topics covered supervised learning labels available unsupervised learning no labels available semi-supervised learning some labels available - what algorithms have you learned that

More information

10-701/ Machine Learning - Midterm Exam, Fall 2010

10-701/ Machine Learning - Midterm Exam, Fall 2010 10-701/15-781 Machine Learning - Midterm Exam, Fall 2010 Aarti Singh Carnegie Mellon University 1. Personal info: Name: Andrew account: E-mail address: 2. There should be 15 numbered pages in this exam

More information

Mining Classification Knowledge

Mining Classification Knowledge Mining Classification Knowledge Remarks on NonSymbolic Methods JERZY STEFANOWSKI Institute of Computing Sciences, Poznań University of Technology COST Doctoral School, Troina 2008 Outline 1. Bayesian classification

More information

Classification: The rest of the story

Classification: The rest of the story U NIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN CS598 Machine Learning for Signal Processing Classification: The rest of the story 3 October 2017 Today s lecture Important things we haven t covered yet Fisher

More information

Concentration-based Delta Check for Laboratory Error Detection

Concentration-based Delta Check for Laboratory Error Detection Northeastern University Department of Electrical and Computer Engineering Concentration-based Delta Check for Laboratory Error Detection Biomedical Signal Processing, Imaging, Reasoning, and Learning (BSPIRAL)

More information

Midterm Exam, Spring 2005

Midterm Exam, Spring 2005 10-701 Midterm Exam, Spring 2005 1. Write your name and your email address below. Name: Email address: 2. There should be 15 numbered pages in this exam (including this cover sheet). 3. Write your name

More information