Anomaly (outlier) detection. Huiping Cao, Anomaly 1

Outline
- General concepts
  - What are outliers
  - Types of outliers
  - Causes of anomalies
  - Challenges of outlier detection
- Outlier detection approaches

What are outliers
- Outliers: the set of data points that are significantly different from the rest of the objects
- Assumption: there are considerably more normal observations than abnormal observations (outliers/anomalies) in the data
- Applications
  - Fraud detection (credit card usage)
  - Intrusion detection (computer systems, computer networks)
  - Ecosystem disturbances
  - Public health
  - Medicine
- Related: novelty detection

Types of outliers: global
- Global outliers deviate significantly from the rest of the dataset
- Also called point anomalies
- Most outlier detection methods are designed to find such outliers
- Example: intrusion detection in network traffic

Types of outliers: contextual (conditional)
- An object is an outlier in one context, but may be normal in another context
- Contextual attributes define the object's context, e.g., date, location
- Behavior attributes define the object's characteristics and are used to evaluate whether the object is an outlier in that context, e.g., temperature
- A generalization of local outliers, which are defined in density-based analysis
- Background information is needed to determine the contextual attributes, etc.

Types of outliers: collective
- A subset of data objects forms a collective outlier if the objects as a whole deviate significantly from the entire data set
- The individual data objects may not be outliers
- Applications: supply chain, web visiting, network (denial-of-service)
- Background information is needed to model the relationships among objects

Causes of anomalies
- Data from different classes
  - Hawkins' definition of an outlier: an outlier is an observation that differs so much from other observations as to arouse suspicion that it was generated by a different mechanism
- Natural variation
  - Anomalies that represent extreme or unlikely variations (e.g., an extremely tall person)
- Data measurement and collection errors
  - Removing such anomalies is the focus of data preprocessing (data cleaning)
- Others: several sources

Challenges of outlier detection
- Modeling normal and outlier objects
  - Hard to model complete normal behavior
  - Some methods assign a label: normal or abnormal
  - Some methods assign a score measuring the outlier-ness of the object
- Universal outlier detection is hard to develop
  - Similarity and distance definitions are application-dependent
- Common issue: noise
- Understandability
  - Understand why the detected objects are outliers
  - Provide justification of the detection

Outline
- General concepts
  - What are outliers
  - Types of outliers
  - Challenges of outlier detection
- Outlier detection approaches
  - Statistical methods
  - Proximity-based methods
  - Clustering-based methods

Outlier detection methods
- Methods differ in whether the data for analysis are labeled as normal or abnormal by domain experts
- Supervised methods
  - Can be modeled as a classification problem
  - Special aspect to consider: normal data points and abnormal points are imbalanced
  - Measures: recall is more meaningful than accuracy
- Unsupervised methods
  - Largely utilize clustering methods
- Semi-supervised methods
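
The point about recall on imbalanced data can be illustrated with a small sketch. The labels below are hypothetical (10 outliers among 1000 points), and the "classifier" is a deliberately trivial one that labels everything as normal; neither comes from the slides.

```python
# Hypothetical labels: 1 = outlier, 0 = normal; 10 outliers among 1000 points.
y_true = [1] * 10 + [0] * 990
# A trivial "classifier" that labels every point as normal.
y_pred = [0] * 1000


def accuracy(truth, pred):
    """Fraction of points whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def recall(truth, pred, positive=1):
    """Fraction of true outliers that were actually flagged."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, pred))
    fn = sum(t == positive and p != positive for t, p in zip(truth, pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


print("accuracy:", accuracy(y_true, y_pred))
print("recall:  ", recall(y_true, y_pred))
```

Accuracy looks excellent even though no outlier is found, which is why recall on the outlier class is the more informative measure here.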

Outlier detection methods
- Outlier detection algorithms make assumptions about outliers versus the rest of the data
- Categories according to the assumptions made
  - Statistical (model-based) methods
    - Normal data follow a statistical (stochastic) model
    - Outliers do not follow the model
  - Proximity-based methods
    - The proximity of an outlier to its neighbors differs from the proximity of most other objects to their neighbors
    - Distance-based, density-based
  - Clustering-based methods
    - Normal objects belong to large and dense clusters
    - Outliers belong to small or sparse clusters, or belong to no cluster

Statistical approaches
- Probabilistic definition of an outlier: an outlier is an object that has a low probability with respect to a probability distribution model of the data
  - Normal objects are generated by a stochastic process and occur in regions of high probability for the stochastic model
  - Outliers occur in regions of low probability
- Approach steps
  - Learn a generative model fitting the given data
  - Identify the objects in low-probability regions of the model
- Categories
  - Parametric methods (univariate, multivariate)
  - Nonparametric methods

Parametric: univariate normal distribution
- Normal distribution, maximum likelihood estimation (MLE)
- Standard normal distribution: $N(0, 1)$
- Non-standard normal distribution: $N(\mu, \sigma^2)$, z-score $z = (x - \mu)/\sigma$
- Use MLE to estimate $\mu$ and $\sigma^2$

Parametric: univariate normal distribution
- $\mathrm{prob}(|x| \ge c) = \alpha$ for $N(0, 1)$
- Mark an object as an outlier if it is more than $3\sigma$ away from the estimated mean $\mu$, where $\sigma$ is the standard deviation (the $\mu \pm 3\sigma$ region contains 99.73% of the data)
- [Table: $(c, \alpha)$ pairs for $N(0, 1)$]
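
The $(c, \alpha)$ pairs for $N(0, 1)$ can be recomputed rather than looked up. The sketch below uses the identity $\mathrm{prob}(|x| \ge c) = \mathrm{erfc}(c/\sqrt{2})$; the function name `two_sided_tail` and the chosen values of `c` are illustrative assumptions, not taken from the slides.

```python
import math


def two_sided_tail(c):
    """Return prob(|x| >= c) for x drawn from the standard normal N(0, 1)."""
    return math.erfc(c / math.sqrt(2.0))


# Print alpha for a few illustrative thresholds c.
for c in (1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0):
    print(f"c = {c:.1f}   alpha = {two_sided_tail(c):.4f}")
```

For `c = 3.0` the tail probability is about 0.0027, which matches the 99.73% coverage of the $\mu \pm 3\sigma$ region.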
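
Parametric: univariate normal distribution, example
- A city's average temperature values in 10 years: 24, 28.9, 28.9, 29, 29.1, 29.1, 29.2, 29.2, 29.3, 29.4
- $\mu = 28.61$, $\sigma^2 = 2.29$, $\sigma = \sqrt{2.29} = 1.51$
- Is 24 an outlier? z-score $= (28.61 - 24)/1.51 = 3.04 > 3$
- Note: recomputing the MLE directly from the ten listed values gives $\sigma^2 \approx 2.38$ and $\sigma \approx 1.54$, so the z-score of 24 is about 2.99, which makes it a borderline case under the $3\sigma$ rule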
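
A minimal sketch of the z-score check on the temperature values listed in the example. It uses the MLE (population) standard deviation, which divides by n. Recomputed this way, the variance comes out near 2.38 rather than the 2.29 shown on the slide, so the sketch prints the numbers instead of asserting a verdict; the helper name `z_score` is an assumption.

```python
import statistics

# The ten yearly average temperatures from the example.
temps = [24.0, 28.9, 28.9, 29.0, 29.1, 29.1, 29.2, 29.2, 29.3, 29.4]

mu = statistics.mean(temps)       # MLE of the mean
sigma = statistics.pstdev(temps)  # MLE of the standard deviation (divides by n)


def z_score(x):
    """Absolute distance of x from the estimated mean, in standard deviations."""
    return abs(x - mu) / sigma


print(f"mu = {mu:.2f}, sigma^2 = {sigma ** 2:.4f}, sigma = {sigma:.4f}")
print(f"z-score of 24: {z_score(24.0):.3f}")
```

Because the recomputed z-score sits very close to 3, the decision for the value 24 depends on how the variance is estimated and on whether the threshold is strict.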

Parametric: other univariate outlier detection approaches (S.S.)
- Boxplot method
- Grubbs' test (maximum normed residual test)

Parametric: multivariate
- Convert the problem to a univariate outlier detection problem
- Use the Mahalanobis distance from object $o$ to the mean $\mu$
- Use the $\chi^2$ statistic: $\chi^2 = \sum_{i=1}^{n} \frac{(o_i - E_i)^2}{E_i}$
  - $o_i$: the value of $o$ on the $i$-th dimension
  - $E_i$: the mean of the $i$-th dimension over all objects
  - $n$: the number of dimensions
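
A small sketch of the $\chi^2$ statistic as a multivariate outlier score, assuming the form $\chi^2 = \sum_i (o_i - E_i)^2 / E_i$ with $E_i$ the per-dimension mean over all objects. The data set and the function name `chi_square_score` are hypothetical, and the sketch assumes every dimension has a positive mean.

```python
def chi_square_score(o, data):
    """Chi-square statistic of object o against the per-dimension means of data.

    Assumes every dimension has a non-zero (positive) mean.
    """
    dims = len(o)
    means = [sum(row[i] for row in data) / len(data) for i in range(dims)]
    return sum((o[i] - means[i]) ** 2 / means[i] for i in range(dims))


# Hypothetical two-dimensional data; the last object is far from the others.
data = [(10, 20), (12, 22), (11, 21), (9, 19), (30, 60)]
scores = [chi_square_score(o, data) for o in data]

for o, s in zip(data, scores):
    print(o, round(s, 3))
```

The multivariate problem is thereby reduced to a single score per object, and a univariate threshold can then be applied to that score.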

Nonparametric
- Nonparametric methods make fewer assumptions about the data distribution, and thus are applicable in more scenarios
- Histogram approach
  - Construct histograms (types: equal width or equal depth; number of bins, or size of each bin)
  - Outliers: objects that fall in no bin or in bins with a small size
  - Drawback: hard to decide the bin size
- Others: kernel functions (discussed more in machine learning)
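
The histogram approach can be sketched with an equal-width histogram in which objects that land in sparsely populated bins are flagged. The function name `histogram_outliers` and the parameters `num_bins` and `min_count` are illustrative assumptions; the temperature values are reused from the earlier univariate example.

```python
def histogram_outliers(values, num_bins, min_count):
    """Flag values that fall into equal-width bins holding fewer than min_count values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1.0  # avoid a zero width when all values are equal

    def bin_of(v):
        # The maximum value is placed in the last bin.
        return min(int((v - lo) / width), num_bins - 1)

    counts = [0] * num_bins
    for v in values:
        counts[bin_of(v)] += 1
    return [v for v in values if counts[bin_of(v)] < min_count]


temps = [24.0, 28.9, 28.9, 29.0, 29.1, 29.1, 29.2, 29.2, 29.3, 29.4]
print(histogram_outliers(temps, num_bins=5, min_count=2))
```

Changing `num_bins` changes which values are flagged, which is the drawback noted above: the result is sensitive to the bin size.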

Proximity-based approaches
- Data is represented as a vector of features
- Based on the neighborhood
- Major approaches
  - Distance based
  - Density based

Distance-based approach
- Anomaly: an object that is distant from most points
- Distance to the k-th nearest neighbor: the outlier score of an object is given by the distance to its k-th nearest neighbor
- Outliers: objects whose score exceeds a threshold
- Problem: hard to decide k
- Improvement: use the average of the distances to the first k nearest neighbors
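
A brute-force sketch of the k-nearest-neighbor outlier score, including the averaged variant mentioned as an improvement. The point set and the function name `knn_scores` are hypothetical, and Euclidean distance is assumed.

```python
import math


def knn_scores(points, k, average=False):
    """Outlier score per point: distance to the k-th nearest neighbor,
    or the mean distance to the k nearest neighbors when average=True."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        scores.append(sum(nearest) / k if average else nearest[-1])
    return scores


# Hypothetical data: a tight square of four points plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
print(knn_scores(points, k=2))
print(knn_scores(points, k=2, average=True))
```

A threshold on the score then separates outliers from normal objects; the choice of `k` and of the threshold remains application-dependent.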

[Figures: effect of k. With k=1, the outlier is O (two panels); with k=5, all points in the upper-right corner are outliers.]
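
Distance-based outlier detection
- Given a dataset $D$ with $n$ data points, a distance threshold $r$, and a fraction threshold $\pi$
- $r$-neighborhood of an object $o$: the objects within distance $r$ of $o$
- Object $o$ is a $DB(r, \pi)$-outlier if $\frac{|\{o' \mid \mathrm{dist}(o, o') \le r\}|}{|D|} \le \pi$
- Approach: compute the distance between every pair of data points, $O(n^2)$
- Practically, $O(n)$, because the neighbor count for an object can stop as soon as it exceeds $\pi n$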
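
A nested-loop sketch of the $DB(r, \pi)$-outlier test with early termination, assuming the definition that $o$ is an outlier when the fraction of objects within distance $r$ of $o$ is at most $\pi$ (the object itself is counted, since its distance to itself is 0). The function name, data, and parameter values are illustrative.

```python
import math


def is_db_outlier(i, points, r, pi):
    """Return True if points[i] is a DB(r, pi)-outlier among points."""
    limit = pi * len(points)
    count = 0
    for q in points:
        if math.dist(points[i], q) <= r:
            count += 1
            if count > limit:
                # Enough neighbors found: stop early, the object is not an outlier.
                return False
    return True


# Hypothetical data: a tight square of four points plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
outliers = [i for i in range(len(points)) if is_db_outlier(i, points, r=1.5, pi=0.2)]
print(outliers)
```

The early exit is what keeps the practical running time well below the quadratic worst case when most objects have many close neighbors.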

A grid-based method implementation
- Cell diagonal length: $r/2$
- Cell edge length: $\frac{r}{2\sqrt{d}}$, where $d$ is the number of dimensions
- Level-1 cells: the direct neighbor cells of a cell $C$
  - Any point $o'$ in such cells has $\mathrm{dist}(o, o') \le r$ for a point $o$ in $C$
- Level-2 cells: one or two cells away from a cell $C$
  - Any point $o'$ with $\mathrm{dist}(o, o') > r$ must be in a level-2 cell

A grid-based method implementation: pruning
- $n_0$: total number of objects in a cell $C$
- $n_1$: total number of objects in $C$'s level-1 cells
- $n_2$: total number of objects in $C$'s level-2 cells
- Level-1 cell pruning: if $(n_0 + n_1) > \pi n$, no object $o$ in $C$ is an outlier
- Level-2 cell pruning: if $(n_0 + n_1 + n_2) < \pi n + 1$, all the points in $C$ are outliers
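
The two pruning rules can be written as a small decision function. This is only a sketch of the cell-level test, not of the full grid construction; the function name `classify_cell` and the returned labels are assumptions made for illustration.

```python
def classify_cell(n0, n1, n2, n_total, pi):
    """Apply the level-1 and level-2 pruning rules to one cell C.

    n0: objects in C, n1: objects in C's level-1 cells,
    n2: objects in C's level-2 cells, n_total: objects in the whole dataset.
    """
    threshold = pi * n_total
    if n0 + n1 > threshold:
        return "no outliers"        # level-1 rule: every object in C has enough neighbors
    if n0 + n1 + n2 < threshold + 1:
        return "all outliers"       # level-2 rule: no object in C can have enough neighbors
    return "check individually"     # neither rule applies: fall back to distance checks


print(classify_cell(n0=5, n1=10, n2=0, n_total=100, pi=0.1))
print(classify_cell(n0=1, n1=2, n2=3, n_total=100, pi=0.1))
print(classify_cell(n0=2, n1=3, n2=20, n_total=100, pi=0.1))
```

Only the cells that fall into the last case need object-by-object distance computations.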

Distance-based outlier detection
- Finds global outliers: cannot handle data sets with regions of different densities
- [Figure: data set with regions of different densities and points p1 and p2]

Density-based outlier detection
- Local proximity-based outliers
- Compare the density around one object with the density around its local neighbors
- [Figure: data set with regions of different densities and points p1 and p2]

Density based
- $D$: a set of objects
- Nearest neighbor of $o$: $d(o, D) = \min\{d(o, o') \mid o' \in D\}$
- Local outliers: outliers relative to their local neighborhoods, particularly with respect to the densities of the neighborhoods
- Density-based outlier: the outlier score of an object is the inverse of the density around the object

Concepts
- $k$-distance of an object $o$, $d_k(o)$: measures the relative density of an object $o$
- Formally, $d_k(o) = d(o, p)$ for an object $p \in D$ such that
  - for at least $k$ objects $o' \in D \setminus \{o\}$, $d(o, o') \le d(o, p)$
  - for at most $k - 1$ objects $o' \in D \setminus \{o\}$, $d(o, o') < d(o, p)$
- $k$-distance neighborhood of an object $o$: $N_k(o) = \{o' \mid o' \in D,\ d(o, o') \le d_k(o)\}$
  - $N_k(o)$ may contain more than $k$ objects
- Measure of local density: the average distance from $o$ to the objects in $N_k(o)$
- Problem: fluctuations
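
A short sketch of the $k$-distance and the $k$-distance neighborhood, using a hypothetical point set in which a tie makes $N_k(o)$ larger than $k$. The function names are assumptions and Euclidean distance is assumed.

```python
import math


def k_distance(i, points, k):
    """Distance from points[i] to its k-th nearest neighbor (excluding itself)."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return dists[k - 1]


def k_neighborhood(i, points, k):
    """Indices of all objects within the k-distance of points[i] (excluding itself)."""
    dk = k_distance(i, points, k)
    return [j for j, q in enumerate(points)
            if j != i and math.dist(points[i], q) <= dk]


# Hypothetical data: three points at distance 1 from the first point, one far away.
points = [(0, 0), (1, 0), (0, 1), (-1, 0), (3, 3)]
print(k_distance(0, points, k=2))
print(k_neighborhood(0, points, k=2))
```

With `k=2` the three points at distance 1 are tied, so the neighborhood of the first point contains three objects rather than two.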

Concepts: reachability distance
- $\mathrm{reachdist}(o' \to o) = \max\{d_k(o),\ d(o, o')\}$
- Alleviates fluctuations
- Not symmetric: $\mathrm{reachdist}(o' \to o) \ne \mathrm{reachdist}(o \to o')$
- Local density of $o$: the inverse of the average reachability distance from $o$ to the objects in $N_k(o)$
  $\mathrm{density}_k(o) = \frac{|N_k(o)|}{\sum_{o' \in N_k(o)} \mathrm{reachdist}(o \to o')} = \frac{|N_k(o)|}{\sum_{o' \in N_k(o)} \max\{d_k(o'),\ d(o, o')\}}$
- Different from the density definition in density-based clustering (global vs. local)
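
Example
- $k = 2$, use Euclidean distance
- [Figure: scatter plot of o and its neighbors p1, p2, p3; axes x and y]
- The distance from o to o's 2NN is 1, so $d_k(o) = 1$ and $N_k(o) = \{p1, p2, p3\}$
- $d_k(p1) = 1.28$, $\mathrm{dist}(o, p1) = 0.8$
- $d_k(p2) = \sqrt{2} = 1.41$, $\mathrm{dist}(o, p2) = 1$
- $d_k(p3) = \sqrt{0.32} = 0.57$, $\mathrm{dist}(o, p3) = 1$
- $\mathrm{reachdist}(o \to p1) = \max\{1.28, 0.8\} = 1.28$
- $\mathrm{reachdist}(o \to p2) = \max\{1.41, 1\} = 1.41$
- $\mathrm{reachdist}(o \to p3) = \max\{0.57, 1\} = 1$
- $\mathrm{density}_k(o) = 3/(1.28 + 1.41 + 1) = 0.81$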
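
The arithmetic of this example can be checked with a few lines. The $d_k$ values and the distances from o are the ones stated on the slide; the reachability distances follow from $\max\{d_k(o'), d(o, o')\}$. The dictionary names are illustrative.

```python
# k-distances of the three neighbors and their distances from o, as given in the example.
d_k = {"p1": 1.28, "p2": 1.41, "p3": 0.57}
dist_o = {"p1": 0.8, "p2": 1.0, "p3": 1.0}

# reachdist(o -> p) = max{d_k(p), d(o, p)} for each neighbor p in N_k(o).
reach = {p: max(d_k[p], dist_o[p]) for p in d_k}

# Local density of o: number of neighbors divided by the sum of reachability distances.
density_o = len(reach) / sum(reach.values())

print(reach)
print(round(density_o, 3))
```

Only p3 has its reachability distance set by the actual distance from o; for p1 and p2 the larger $k$-distance takes over, which is how the fluctuation is smoothed.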

Local outlier factor (LOF), or average relative density of $o$
- The average ratio of the local reachability density of the $k$-nearest neighbors of $o$ to the local reachability density of $o$
  $LOF_k(o) = \frac{1}{|N_k(o)|} \sum_{o' \in N_k(o)} \frac{\mathrm{density}_k(o')}{\mathrm{density}_k(o)}$
- The lower $\mathrm{density}_k(o)$ and the higher $\mathrm{density}_k(o')$, the higher the LOF, and the higher the probability that $o$ is an outlier
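
A brute-force sketch that puts the pieces together: $k$-distance, $k$-distance neighborhood, reachability-based local density, and LOF as the average ratio of the neighbors' densities to the object's own density. The point set and the function name `lof_scores` are hypothetical, and the sketch assumes there are no duplicate points (which would make a density undefined).

```python
import math


def lof_scores(points, k):
    """Local outlier factor of every point, computed by brute force."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k-distance and k-distance neighborhood of each point.
    d_k, nbrs = [], []
    for i in range(n):
        others = sorted(dist[i][j] for j in range(n) if j != i)
        dk = others[k - 1]
        d_k.append(dk)
        nbrs.append([j for j in range(n) if j != i and dist[i][j] <= dk])

    # Local density: |N_k(o)| / sum of reachdist(o -> o') over the neighborhood.
    density = []
    for i in range(n):
        total = sum(max(d_k[j], dist[i][j]) for j in nbrs[i])
        density.append(len(nbrs[i]) / total)

    # LOF: average of density(o') / density(o) over the neighborhood of o.
    return [sum(density[j] for j in nbrs[i]) / (len(nbrs[i]) * density[i])
            for i in range(n)]


# Hypothetical data: a tight square of four points plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
for p, s in zip(points, lof_scores(points, k=2)):
    print(p, round(s, 3))
```

Points inside the square get a LOF of about 1 because their density matches that of their neighbors, while the distant point gets a much larger value.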

Example (continued)
- Same setting as the previous example: $k = 2$, Euclidean distance, $N_k(o) = \{p1, p2, p3\}$
- Then calculate $\mathrm{density}_k(p1)$, $\mathrm{density}_k(p2)$, and $\mathrm{density}_k(p3)$ in the same way to obtain the LOF of o

Clustering-based
- Clustering-based outlier: an object is a cluster-based outlier if the object does not strongly belong to any cluster
- An outlier is an object belonging to a small and remote cluster, or not belonging to any cluster

Clustering-based
- Basic step: cluster the data into groups of different density
- Three general approaches
  - An object does not belong to any cluster -> outlier object
  - There is a large distance between an object and the cluster to which it is closest -> outlier
  - The object is part of a small and sparse cluster -> all the objects in that cluster are outliers

Approach 2
- There is a large distance between an object and the cluster to which it is closest -> outlier
- Calculate the ratio below; the larger the ratio, the farther away $o$ is from its closest cluster $C_o$
  $\mathrm{ratio} = \frac{d(o, c_o)}{\frac{1}{|C_o|} \sum_{o' \in C_o} d(o', c_o)}$
  where $c_o$ is the center of $C_o$
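
A sketch of this ratio for clusters that have already been computed. It assumes that $c_o$ is the centroid of the closest cluster and that the denominator is the average distance of that cluster's members to the centroid. The clusters, the function name, and the query objects are hypothetical, and each cluster is assumed to hold at least two distinct points so the average is non-zero.

```python
import math


def cluster_distance_ratio(o, clusters):
    """Distance from o to its closest cluster center, divided by the average
    distance of that cluster's members to the same center."""
    best = None
    for members in clusters:
        center = tuple(sum(coord) / len(members) for coord in zip(*members))
        d = math.dist(o, center)
        if best is None or d < best[0]:
            avg = sum(math.dist(m, center) for m in members) / len(members)
            best = (d, avg)
    d, avg = best
    return d / avg


# Hypothetical clustering result: two square-shaped clusters.
clusters = [
    [(0, 0), (0, 2), (2, 0), (2, 2)],
    [(10, 10), (10, 12), (12, 10), (12, 12)],
]
print(cluster_distance_ratio((2, 2), clusters))  # a member of the first cluster
print(cluster_distance_ratio((5, 5), clusters))  # an object between the clusters
```

A member of a cluster scores close to 1, while an object lying between the clusters scores several times higher, so a threshold on the ratio separates the two cases.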

Outliers in lower-dimensional projection
- In high-dimensional space, data is sparse and the notion of proximity becomes meaningless
  - Every point is an almost equally good outlier from the perspective of proximity-based definitions
- Lower-dimensional projection methods
  - A point is an outlier if, in some lower-dimensional projection, it is present in a local region of abnormally low density

R packages
- R parallel implementation of Local Outlier Factor (LOF), which uses multiple CPUs to significantly speed up the LOF computation for large datasets
- Python LOF implementation:
Modeling Complex Temporal Composition of Actionlets for Activity Prediction ECCV 2012 Activity Recognition Reading Group Framework of activity prediction What is an Actionlet To segment a long sequence
More informationIntensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis
Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 4 Spatial Point Patterns Definition Set of point locations with recorded events" within study
More informationMULTI-LEVEL RELATIONSHIP OUTLIER DETECTION
MULTI-LEVEL RELATIONSHIP OUTLIER DETECTION by Qiang Jiang B.Eng., East China Normal University, 2010 a Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in
More informationGlobal Scene Representations. Tilke Judd
Global Scene Representations Tilke Judd Papers Oliva and Torralba [2001] Fei Fei and Perona [2005] Labzebnik, Schmid and Ponce [2006] Commonalities Goal: Recognize natural scene categories Extract features
More informationIntensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis
Intensity Analysis of Spatial Point Patterns Geog 210C Introduction to Spatial Data Analysis Chris Funk Lecture 5 Topic Overview 1) Introduction/Unvariate Statistics 2) Bootstrapping/Monte Carlo Simulation/Kernel
More informationA Bayesian Method for Guessing the Extreme Values in a Data Set
A Bayesian Method for Guessing the Extreme Values in a Data Set Mingxi Wu University of Florida May, 2008 Mingxi Wu (University of Florida) May, 2008 1 / 74 Outline Problem Definition Example Applications
More informationMidterm Review CS 7301: Advanced Machine Learning. Vibhav Gogate The University of Texas at Dallas
Midterm Review CS 7301: Advanced Machine Learning Vibhav Gogate The University of Texas at Dallas Supervised Learning Issues in supervised learning What makes learning hard Point Estimation: MLE vs Bayesian
More informationAn Introduction to Machine Learning
An Introduction to Machine Learning L2: Instance Based Estimation Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia Alex.Smola@nicta.com.au Tata Institute, Pune, January
More informationDS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University
DS 4400 Machine Learning and Data Mining I Alina Oprea Associate Professor, CCIS Northeastern University January 17 2019 Logistics HW 1 is on Piazza and Gradescope Deadline: Friday, Jan. 25, 2019 Office
More informationCSE 5243 INTRO. TO DATA MINING
CSE 5243 INTRO. TO DATA MINING Data & Data Preprocessing & Classification (Basic Concepts) Huan Sun, CSE@The Ohio State University Slides adapted from UIUC CS412, Fall 2017, by Prof. Jiawei Han Chapter
More informationBatch Mode Sparse Active Learning. Lixin Shi, Yuhang Zhao Tsinghua University
Batch Mode Sparse Active Learning Lixin Shi, Yuhang Zhao Tsinghua University Our work Propose an unified framework of batch mode active learning Instantiate the framework using classifiers based on sparse
More informationINTRODUCTION TO DATA SCIENCE
INTRODUCTION TO DATA SCIENCE JOHN P DICKERSON Lecture #13 3/9/2017 CMSC320 Tuesdays & Thursdays 3:30pm 4:45pm ANNOUNCEMENTS Mini-Project #1 is due Saturday night (3/11): Seems like people are able to do
More informationMachine Learning (CS 567) Lecture 5
Machine Learning (CS 567) Lecture 5 Time: T-Th 5:00pm - 6:20pm Location: GFS 118 Instructor: Sofus A. Macskassy (macskass@usc.edu) Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol
More informationChemometrics: Classification of spectra
Chemometrics: Classification of spectra Vladimir Bochko Jarmo Alander University of Vaasa November 1, 2010 Vladimir Bochko Chemometrics: Classification 1/36 Contents Terminology Introduction Big picture
More informationGraph-Based Anomaly Detection with Soft Harmonic Functions
Graph-Based Anomaly Detection with Soft Harmonic Functions Michal Valko Advisor: Milos Hauskrecht Computer Science Department, University of Pittsburgh, Computer Science Day 2011, March 18 th, 2011. Anomaly
More informationECE521 lecture 4: 19 January Optimization, MLE, regularization
ECE521 lecture 4: 19 January 2017 Optimization, MLE, regularization First four lectures Lectures 1 and 2: Intro to ML Probability review Types of loss functions and algorithms Lecture 3: KNN Convexity
More informationSurprise Detection in Multivariate Astronomical Data Kirk Borne George Mason University
Surprise Detection in Multivariate Astronomical Data Kirk Borne George Mason University kborne@gmu.edu, http://classweb.gmu.edu/kborne/ Outline What is Surprise Detection? Example Application: The LSST
More informationMultivariate Analysis of Crime Data using Spatial Outlier Detection Algorithm
J. Stat. Appl. Pro. 5, No. 3, 433-438 (2016) 433 Journal of Statistics Applications & Probability An International Journal http://dx.doi.org/10.18576/jsap/050307 Multivariate Analysis of Crime Data using
More informationMidterm Review CS 6375: Machine Learning. Vibhav Gogate The University of Texas at Dallas
Midterm Review CS 6375: Machine Learning Vibhav Gogate The University of Texas at Dallas Machine Learning Supervised Learning Unsupervised Learning Reinforcement Learning Parametric Y Continuous Non-parametric
More informationKernel expansions with unlabeled examples
Kernel expansions with unlabeled examples Martin Szummer MIT AI Lab & CBCL Cambridge, MA szummer@ai.mit.edu Tommi Jaakkola MIT AI Lab Cambridge, MA tommi@ai.mit.edu Abstract Modern classification applications
More information12 - Nonparametric Density Estimation
ST 697 Fall 2017 1/49 12 - Nonparametric Density Estimation ST 697 Fall 2017 University of Alabama Density Review ST 697 Fall 2017 2/49 Continuous Random Variables ST 697 Fall 2017 3/49 1.0 0.8 F(x) 0.6
More informationPrinciples of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata
Principles of Pattern Recognition C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata e-mail: murthy@isical.ac.in Pattern Recognition Measurement Space > Feature Space >Decision
More informationClassification: Decision Trees
Classification: Decision Trees These slides were assembled by Byron Boots, with grateful acknowledgement to Eric Eaton and the many others who made their course materials freely available online. Feel
More informationSupplementary Material for Wang and Serfling paper
Supplementary Material for Wang and Serfling paper March 6, 2017 1 Simulation study Here we provide a simulation study to compare empirically the masking and swamping robustness of our selected outlyingness
More informationCSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18
CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$
More informationProbabilistic clustering
Aprendizagem Automática Probabilistic clustering Ludwig Krippahl Probabilistic clustering Summary Fuzzy sets and clustering Fuzzy c-means Probabilistic Clustering: mixture models Expectation-Maximization,
More informationPATTERN RECOGNITION AND MACHINE LEARNING
PATTERN RECOGNITION AND MACHINE LEARNING Slide Set 3: Detection Theory January 2018 Heikki Huttunen heikki.huttunen@tut.fi Department of Signal Processing Tampere University of Technology Detection theory
More information