Support Measure Data Description for Group Anomaly Detection


1 Support Measure Data Description for Group Anomaly Detection. Jorge Guevara Díaz (IME-USP, Brazil), Stephane Canu (LITIS-INSA, France), Roberto Hirata Jr. (IME-USP, Brazil). November 03, 2015.

2-3 Outline: 1. Who am I? 2. Introduction 3. Hilbert space embedding for distributions 4. SMDD models 5. Experiments 6. Conclusions and further research.

4 Who am I? Three papers in local Peruvian conferences. Lorito: an isolated-word recognition system written from scratch in Java.

5 Who am I? Speech Miner: a speech recognition system using MFCC and HMMs, built with HTK. SOM-TSP: a SOM neural network to solve the TSP, in Java. FNN: a neural network for digit classification, in Java.

6 Who am I? (image-only slide)

7-8 Who am I? Similarity between fuzzy sets using kernels: the link between fuzzy systems and kernels; a theory of positive definite kernels on fuzzy sets; kernels induced by fuzzy distances. A data description model for sets of distributions. Fuzzy kernel hypothesis testing. TSK kernels on fuzzy sets for classification of low-quality datasets. Group anomaly detection using SMDD.

9 Who am I? (image-only slide)

10 Outline (section divider: 2. Introduction).

11 Introduction. Definition (Anomaly): "An anomaly is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism" (Hawkins, D., Identification of Outliers, Chapman and Hall). Figure: rare starfish found, one in a million!

12 Some applications include (Varun Chandola, Arindam Banerjee, and Vipin Kumar, "Anomaly detection: A survey," ACM Comput. Surv.): fraud detection, e.g., abnormal buying patterns; medicine, e.g., unusual symptoms or abnormal tests; sports, e.g., outstanding players; measurement errors, e.g., abnormal values; cyber-intrusion detection; industrial damage detection; image processing; textual anomaly detection. Figures: (a) unusual contraction, (b) ceramic defects, (c) traffic jam recognition.

13 Techniques: generative models, e.g., HMM, GMM; unsupervised methods, e.g., clustering, distance-based, density-based; discriminative models, e.g., SVM, neural networks; information-theoretic methods; geometric methods. Figure: 2D anomalies (Chandola, Banerjee, and Kumar, "Anomaly detection: A survey," ACM Comput. Surv.).

14-15 Problem. Given a data set of the form $T = \{s_i\}_{i=1}^N$, where $s_i = \{x_1^{(i)}, x_2^{(i)}, \dots, x_{L_i}^{(i)}\} \sim \mathbb{P}_i$ and each $\mathbb{P}_i$ is defined on $(\mathbb{R}^D, \mathcal{B}(\mathbb{R}^D))$, try to detect anomalies or group anomalies from $T$ (slide 15 illustrates a more complex scenario).

16 Introduction. Examples of datasets of this form $T = \{s_i\}_{i=1}^N$: (a) clusters of galaxies, (b) SIFT, (c) USPS. Figures from Chang-Dong et al., "Multi-Exemplar Affinity Propagation".

17 Group anomaly: a group of points with unexpected behavior with respect to a dataset of groups of points. Point-based group anomalies: aggregations of anomalous points. Distribution-based group anomalies: anomalous aggregations of non-anomalous points. (A toy generator for both kinds is sketched below.)

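To make the two kinds concrete, here is a minimal illustrative sketch, not from the talk, that builds a toy dataset $T$ of groups from a 2-D Gaussian mixture; all function names and parameters are assumptions:

```python
# Minimal sketch of a toy dataset T of groups; all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
MEANS = np.array([[0.0, 0.0], [4.0, 4.0]])  # mixture component means

def normal_group(n=50):
    # Non-anomalous group: points drawn from both mixture components.
    comp = rng.integers(0, 2, size=n)
    return MEANS[comp] + rng.normal(scale=0.5, size=(n, 2))

def point_based_anomaly(n=50):
    # Aggregation of individually anomalous points (a far-away cluster).
    return rng.normal(loc=[10.0, -10.0], scale=0.5, size=(n, 2))

def distribution_based_anomaly(n=50):
    # Anomalous aggregation of non-anomalous points: all mass on one component.
    return MEANS[np.zeros(n, dtype=int)] + rng.normal(scale=0.5, size=(n, 2))

T = [normal_group() for _ in range(48)] + [point_based_anomaly(),
                                           distribution_based_anomaly()]
```
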
18 Previous work. Feature-engineering approach: feature extraction from each group (Chan et al., "Modeling multiple time series for anomaly detection," Data Mining, IEEE; Keogh et al., "HOT SAX: efficiently finding the most unusual time series subsequence," Data Mining, IEEE); clustering of point anomalies. However, these methods ignore distribution-based group anomalies. Generative approach: flexible genre models (L. Xiong et al., "Group Anomaly Detection using Flexible Genre Models," NIPS); hierarchical probabilistic models (L. Xiong et al., "Hierarchical Probabilistic Models for Group Anomaly Detection," AISTATS). However, these procedures rely on parametric assumptions. Discriminative approach: support measure machines (Muandet et al., "One-Class Support Measure Machines for Group Anomaly Detection," UAI); our work is nonparametric, with performance depending on the kernel choice.

19 Outline (section divider: 3. Hilbert space embedding for distributions).

20 RKHS, where the magic happens. Figure: kernel mapping, from Shawe-Taylor et al., "Kernel Methods for Pattern Analysis," Cambridge University Press.

21 RKHS, where the magic happens. Main ingredient: a real-valued symmetric positive definite kernel $k$, i.e., $\sum_{i,j=1}^{N} c_i c_j k(x_i, x_j) \ge 0$ for all $c_1, \dots, c_N \in \mathbb{R}$ and all $x_1, \dots, x_N$.

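As a sanity check, the positive-definiteness condition can be verified numerically: the Gram matrix of a valid kernel has no negative eigenvalues. A minimal sketch, assuming a Gaussian kernel:

```python
# Sketch: verify sum_{i,j} c_i c_j k(x_i, x_j) >= 0 via the Gram matrix spectrum.
import numpy as np

def gaussian_gram(X, Y, gamma=1.0):
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    return np.exp(-gamma * sq)

X = np.random.default_rng(1).normal(size=(30, 2))
K = gaussian_gram(X, X)
# PSD <=> smallest eigenvalue >= 0 (up to numerical noise).
assert np.linalg.eigvalsh(K).min() > -1e-10
```
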
22 Kernel methods: SVM, kernel PCA, Gaussian processes, SVDD, MKL, SMDD, kernel regression, kernel two-sample test, kernel spectral clustering.

23 Kernel methods (cont.). Representer theorem: $f^* = \operatorname{argmin}_{f \in \mathcal{H}} \operatorname{Cost}((x_1, y_1, f(x_1)), \dots, (x_N, y_N, f(x_N))) + \Omega(\|f\|_{\mathcal{H}})$, (4) whose solution has the form $f(\cdot) = \sum_{i=1}^{N} \alpha_i k(\cdot, x_i)$. (5)

24 Remembering the problem: detecting group anomalies from the set $T = \{s_i\}_{i=1}^N$.

25 Hilbert space embedding for distributions: a framework for embedding probability measures into an RKHS $\mathcal{H}$. Definition: the embedding of probability measures $\mathbb{P} \in \mathcal{P}$ into $\mathcal{H}$ is given by the mapping $\mu : \mathcal{P} \to \mathcal{H}$, $\mathbb{P} \mapsto \mu_{\mathbb{P}} = \mathbb{E}_{\mathbb{P}}[k(X, \cdot)] = \int_{\mathbb{R}^D} k(x, \cdot)\, d\mathbb{P}(x)$.

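In practice $\mu_{\mathbb{P}}$ is replaced by the empirical mean map $\mu_{\text{emp}} = \frac{1}{L}\sum_{l} k(x_l, \cdot)$ computed from a group's sample. A minimal sketch, assuming a Gaussian kernel:

```python
# Sketch: evaluate the empirical mean map mu_emp(z) = (1/L) sum_l k(x_l, z).
import numpy as np

def mean_map(sample, z, gamma=1.0):
    # sample: (L, D) points of one group; z: (M, D) evaluation points.
    sq = ((sample[:, None, :] - z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq).mean(axis=0)  # shape (M,)
```
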
26-30 Hilbert space embedding for distributions. Properties: if $\mathbb{E}_{\mathbb{P}}[k(X, X)] < \infty$, with $k$ measurable, then $\mu_{\mathbb{P}} \in \mathcal{H}$; the reproducing property $\langle f, \mu_{\mathbb{P}} \rangle = \langle f, \mathbb{E}_{\mathbb{P}}[k(X, \cdot)] \rangle = \mathbb{E}_{\mathbb{P}}[f(X)]$ holds for all $f \in \mathcal{H}$; $\mu_{\mathbb{P}}$ is the representative function of $\mathbb{P}$; if $k$ is characteristic, then $\mu : \mathcal{P} \to \mathcal{H}$ is injective; the term $\|\mu_{\mathbb{P}} - \mu_{\text{emp}}\|$ is bounded, where $\mu_{\text{emp}}$ is an empirical estimator of $\mu_{\mathbb{P}}$.

31 Kernel on probability measures. The mapping $\mathcal{P} \times \mathcal{P} \to \mathbb{R}$, $(\mathbb{P}, \mathbb{Q}) \mapsto \langle \mathbb{P}, \mathbb{Q} \rangle_{\mathcal{P}} = \langle \mu_{\mathbb{P}}, \mu_{\mathbb{Q}} \rangle_{\mathcal{H}}$ defines an inner product on $\mathcal{P}$. The real-valued kernel on $\mathcal{P} \times \mathcal{P}$ defined by $k(\mathbb{P}, \mathbb{Q}) = \langle \mathbb{P}, \mathbb{Q} \rangle_{\mathcal{P}} = \langle \mu_{\mathbb{P}}, \mu_{\mathbb{Q}} \rangle_{\mathcal{H}} = \int_{\mathbb{R}^D} \int_{\mathbb{R}^D} k(x, x')\, d\mathbb{P}(x)\, d\mathbb{Q}(x')$ (6) is positive definite.

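Given samples $s_p \sim \mathbb{P}$ and $s_q \sim \mathbb{Q}$, the kernel (6) is estimated by averaging all pairwise kernel values between the two groups. A minimal sketch, assuming a Gaussian base kernel:

```python
# Sketch: empirical k(P, Q) = <mu_P, mu_Q>_H, estimated from two group samples.
import numpy as np

def k_meas(s_p, s_q, gamma=1.0):
    sq = ((s_p[:, None, :] - s_q[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq).mean()

def gram_on_groups(groups, gamma=1.0):
    # Gram matrix over a list of groups: K[i, j] = k(P_i, P_j).
    N = len(groups)
    K = np.empty((N, N))
    for i in range(N):
        for j in range(i, N):
            K[i, j] = K[j, i] = k_meas(groups[i], groups[j], gamma)
    return K
```
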
32 Outline (section divider: 4. SMDD models).

33 Minimum volume sets (MV-sets). An MV-set is a set satisfying an optimization criterion. Definition (MV-set for probability measures): let $(\mathcal{P}, \mathcal{A}, \mathcal{E})$ be a probability space, where $\mathcal{P}$ is the space of all probability measures $\mathbb{P}$ on $(\mathbb{R}^D, \mathcal{B}(\mathbb{R}^D))$, $\mathcal{A}$ is a suitable $\sigma$-algebra on $\mathcal{P}$, and $\mathcal{E}$ is a probability measure on $(\mathcal{P}, \mathcal{A})$. The MV-set is $G^*_\alpha = \operatorname{argmin}_{G \in \mathcal{A}} \{\rho(G) \mid \mathcal{E}(G) \ge \alpha\}$, (7) where $\rho$ is a reference measure on $\mathcal{A}$ and $\alpha \in [0, 1]$. The MV-set $G^*_\alpha$ describes a fraction $\alpha$ of the mass concentration of $\mathcal{E}$.

34-37 Minimum volume sets (MV-sets). Assume $\{\mathbb{P}_i\}_{i=1}^N$ is an i.i.d. sample distributed according to $\mathcal{E}$, and each $\mathbb{P}_i$ is unknown. Three examples of classes of volume sets: $\hat{G}_1(R, c) = \{\mathbb{P}_i \in \mathcal{P} \mid \|\mu_{\mathbb{P}_i} - c\|^2_{\mathcal{H}} \le R^2\}$, (8) $\hat{G}_2(R, c) = \{\mathbb{P}_i \in \mathcal{P} \mid \|\mu_{\mathbb{P}_i} - c\|^2_{\mathcal{H}} \le R^2,\ \|\mu_{\mathbb{P}}\|^2_{\mathcal{H}} = 1\}$, (9) $\hat{G}_3(\mathcal{K}) = \{\mathbb{P}_i \in \mathcal{P} \mid \mathbb{P}_i(\|k(X_i, \cdot) - c\|^2_{\mathcal{H}} \le R^2) \ge 1 - \kappa_i\}$. (10)

38 First SMDD model, built on $\hat{G}_1(R, c) = \{\mathbb{P}_i \in \mathcal{P} \mid \|\mu_{\mathbb{P}_i} - c\|^2_{\mathcal{H}} \le R^2\}$. Given the mean functions $\{\mu_{\mathbb{P}_i}\}_{i=1}^N$ of $\{\mathbb{P}_i\}_{i=1}^N$, the SMDD is: $\min_{c \in \mathcal{H},\, R \in \mathbb{R}^+,\, \xi \in \mathbb{R}^N} R^2 + \lambda \sum_{i=1}^N \xi_i$ subject to $\|\mu_{\mathbb{P}_i} - c\|^2_{\mathcal{H}} \le R^2 + \xi_i$ and $\xi_i \ge 0$, for $i = 1, \dots, N$.

39 Proposition (dual form). The dual of the previous problem is: $\max_{\alpha \in \mathbb{R}^N} \sum_{i=1}^N \alpha_i k(\mathbb{P}_i, \mathbb{P}_i) - \sum_{i,j=1}^N \alpha_i \alpha_j k(\mathbb{P}_i, \mathbb{P}_j)$ subject to $0 \le \alpha_i \le \lambda$, $i = 1, \dots, N$, and $\sum_{i=1}^N \alpha_i = 1$, where $k(\mathbb{P}_i, \mathbb{P}_j) = \langle \mu_{\mathbb{P}_i}, \mu_{\mathbb{P}_j} \rangle_{\mathcal{H}}$ and $\alpha$ is a Lagrange multiplier vector with non-negative components $\alpha_i$.

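This dual is a small quadratic program over the group Gram matrix. A minimal sketch using scipy's SLSQP solver, assuming the Gram matrix K[i, j] = k(P_i, P_j) was built as in the earlier sketch:

```python
# Sketch: solve max_a a.diag(K) - a'Ka  s.t.  0 <= a_i <= lam, sum(a) = 1.
import numpy as np
from scipy.optimize import minimize

def smdd_dual(K, lam=0.2):
    N = K.shape[0]
    obj = lambda a: -(a @ np.diag(K) - a @ K @ a)  # negated: scipy minimizes
    cons = ({"type": "eq", "fun": lambda a: a.sum() - 1.0},)
    res = minimize(obj, np.full(N, 1.0 / N),
                   bounds=[(0.0, lam)] * N, constraints=cons)
    return res.x  # the multipliers alpha
```
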
40-42 Proposition (representer theorem). $c(\cdot) = \sum_i \alpha_i \mu_{\mathbb{P}_i}$, $i \in \{i \in I \mid 0 < \alpha_i \le \lambda\}$, where $I = \{1, 2, \dots, N\}$. Furthermore: all $\mathbb{P}_i$ with $\alpha_i = 0$ are inside the MV-set $\hat{G}_\alpha$; all $\mathbb{P}_i$ with $\alpha_i = \lambda$ are the training errors; all $\mathbb{P}_i$ with $0 < \alpha_i < \lambda$ are the support measures. Theorem: let $\eta$ be the Lagrange multiplier of the constraint $\sum_{i=1}^N \alpha_i = 1$; then $R^2 = \eta + \|c\|^2_{\mathcal{H}}$. If a test measure $\mathbb{P}_t$ is described by this SMDD model, then the following must hold: $\|\mu_{\mathbb{P}_t} - c\|^2_{\mathcal{H}} = k(\mathbb{P}_t, \mathbb{P}_t) - 2\sum_i \alpha_i k(\mathbb{P}_i, \mathbb{P}_t) + \sum_{i,j} \alpha_i \alpha_j k(\mathbb{P}_i, \mathbb{P}_j) \le R^2$. (11)

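At test time, Eq. (11) reduces to a few Gram-matrix operations. A minimal sketch, assuming alpha comes from the dual sketch above and k_t[i] = k(P_i, P_t):

```python
# Sketch: flag P_t when ||mu_{P_t} - c||^2_H exceeds R^2 (Eq. 11).
import numpy as np

def dist2_to_center(alpha, K, k_t, k_tt):
    return k_tt - 2.0 * alpha @ k_t + alpha @ K @ alpha

def radius2(alpha, K, lam=0.2, tol=1e-6):
    # R^2 equals the distance of any support measure (0 < alpha_i < lam).
    sv = np.flatnonzero((alpha > tol) & (alpha < lam - tol))[0]
    return dist2_to_center(alpha, K, K[:, sv], K[sv, sv])

def is_group_anomaly(alpha, K, k_t, k_tt, lam=0.2):
    return dist2_to_center(alpha, K, k_t, k_tt) > radius2(alpha, K, lam)
```
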
43 Second SMDD model. Mean maps with stationary kernels do not have constant norm: $\|\mu_{\mathbb{P}}\|_{\mathcal{H}} = \|\mathbb{E}_{\mathbb{P}}[k(X, \cdot)]\|_{\mathcal{H}} \le \mathbb{E}_{\mathbb{P}}[\|k(X, \cdot)\|_{\mathcal{H}}] = \epsilon$. The idea is to normalize the mean maps to lie on the surface of a hypersphere: $\tilde{k}(\mathbb{P}, \mathbb{Q}) = \langle \mu_{\mathbb{P}}, \mu_{\mathbb{Q}} \rangle_{\mathcal{H}} / \sqrt{\langle \mu_{\mathbb{P}}, \mu_{\mathbb{P}} \rangle_{\mathcal{H}} \langle \mu_{\mathbb{Q}}, \mu_{\mathbb{Q}} \rangle_{\mathcal{H}}}$; (12) the injectivity of $\mu : \mathcal{P} \to \mathcal{H}$ is preserved. Figure from Muandet et al., "One-class support measure machines for group anomaly detection".

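On the group Gram matrix, the normalization (12) is a one-liner. A minimal sketch:

```python
# Sketch: normalized Gram over groups (Eq. 12); puts all mean maps on the unit sphere.
import numpy as np

def normalize_gram(K):
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)
```
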
44 This can be modeled by the class of volume sets $\hat{G}_2(R, c) = \{\mathbb{P}_i \in \mathcal{P} \mid \|\mu_{\mathbb{P}_i} - c\|^2_{\mathcal{H}} \le R^2,\ \|\mu_{\mathbb{P}}\|^2_{\mathcal{H}} = 1\}$ (13) and formulated as the following optimization problem. Problem (M2): $\max_{\alpha \in \mathbb{R}^N} -\sum_{i,j=1}^N \alpha_i \alpha_j \tilde{k}(\mathbb{P}_i, \mathbb{P}_j)$ subject to $0 \le \alpha_i \le \lambda$, $i = 1, \dots, N$, and $\sum_{i=1}^N \alpha_i = 1$.

45 Third SMDD model: estimate the MV-set over the class of volume sets $\hat{G}_3(\mathcal{K}) = \{\mathbb{P}_i \in \mathcal{P} \mid \mathbb{P}_i(\|k(X_i, \cdot) - c\|^2_{\mathcal{H}} \le R^2) \ge 1 - \kappa_i\}$. (14) Given the mean functions $\{\mu_{\mathbb{P}_i}\}_{i=1}^N$ of $\{\mathbb{P}_i\}_{i=1}^N$ and $\{\kappa_i\}_{i=1}^N$, $\kappa_i \in [0, 1]$, the SMDD model is the chance-constrained problem: $\min_{c \in \mathcal{H},\, R \in \mathbb{R},\, \xi \in \mathbb{R}^N} R^2 + \lambda \sum_{i=1}^N \xi_i$ subject to $\mathbb{P}_i(\|k(X_i, \cdot) - c(\cdot)\|^2_{\mathcal{H}} \le R^2 + \xi_i) \ge 1 - \kappa_i$ and $\xi_i \ge 0$, for all $i = 1, \dots, N$.

46 Lemma (Markov's inequality): $\mathbb{P}_i(\|k(X_i, \cdot) - c(\cdot)\|^2_{\mathcal{H}} \ge R^2 + \xi_i) \le \mathbb{E}_{\mathbb{P}_i}[\|k(X_i, \cdot) - c(\cdot)\|^2_{\mathcal{H}}] / (R^2 + \xi_i)$ (15) holds for all $i = 1, 2, \dots, N$. Trace of the covariance operator $\Sigma_{\mathcal{H}} : \mathcal{H} \to \mathcal{H}$: $\operatorname{tr}(\Sigma_{\mathcal{H}}) = \mathbb{E}_{\mathbb{P}}[k(X, X)] - k(\mathbb{P}, \mathbb{P})$. (16) This allows the use of the identity $\mathbb{E}_{\mathbb{P}}[\|k(X, \cdot) - c(\cdot)\|^2_{\mathcal{H}}] = \operatorname{tr}(\Sigma_{\mathcal{H}}) + \|\mu_{\mathbb{P}} - c(\cdot)\|^2_{\mathcal{H}}$.

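The trace term (16) also has a direct empirical estimate. A minimal sketch for a Gaussian kernel, for which k(x, x) = 1 for every x:

```python
# Sketch: empirical tr(Sigma_H) = E_P[k(X, X)] - k(P, P), Gaussian kernel.
import numpy as np

def trace_cov(sample, gamma=1.0):
    sq = ((sample[:, None, :] - sample[None, :, :]) ** 2).sum(-1)
    k_pp = np.exp(-gamma * sq).mean()  # empirical k(P, P)
    return 1.0 - k_pp                  # E_P[k(X, X)] = 1 for this kernel
```
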
47 Deterministic form. Given the mean functions $\{\mu_{\mathbb{P}_i}\}_{i=1}^N$ of $\{\mathbb{P}_i\}_{i=1}^N$ and $\{\kappa_i\}_{i=1}^N$, $\kappa_i \in (0, 1]$, the SMDD model is: $\min_{c \in \mathcal{H},\, R \in \mathbb{R},\, \xi \in \mathbb{R}^N} R^2 + \lambda \sum_{i=1}^N \xi_i$ subject to $\|\mu_{\mathbb{P}_i} - c(\cdot)\|^2_{\mathcal{H}} \le (R^2 + \xi_i)\kappa_i - \operatorname{tr}(\Sigma^i_{\mathcal{H}})$ and $\xi_i \ge 0$, for all $i = 1, \dots, N$.

48 Proposition (dual form). The dual form is given by the following fractional programming problem. Problem (M1): $\max_{\alpha \in \mathbb{R}^N} \sum_{i=1}^N \alpha_i \langle \mu_{\mathbb{P}_i}, \mu_{\mathbb{P}_i} \rangle_{\mathcal{H}} + \sum_{i=1}^N \alpha_i \operatorname{tr}(\Sigma^i_{\mathcal{H}}) - \frac{\sum_{i,j=1}^N \alpha_i \alpha_j \langle \mu_{\mathbb{P}_i}, \mu_{\mathbb{P}_j} \rangle_{\mathcal{H}}}{\sum_{i=1}^N \alpha_i}$ subject to $0 \le \alpha_i \kappa_i \le \lambda$, $i = 1, \dots, N$, and $\sum_{i=1}^N \alpha_i \kappa_i = 1$, where $\langle \mu_{\mathbb{P}_i}, \mu_{\mathbb{P}_j} \rangle_{\mathcal{H}}$ is computed via $k(\mathbb{P}_i, \mathbb{P}_j)$.

49 Proposition (representer theorem). $c(\cdot) = \frac{\sum_i \alpha_i \mu_{\mathbb{P}_i}}{\sum_i \alpha_i}$, $i \in \{i \in I \mid 0 < \alpha_i \kappa_i \le \lambda\}$, (17) where $I = \{1, 2, \dots, N\}$. Furthermore: all $\mathbb{P}_i$ with $\alpha_i = 0$ are inside the MV-set $\hat{G}_\alpha$; all $\mathbb{P}_i$ with $\alpha_i \kappa_i = \lambda$ are the training errors; all $\mathbb{P}_i$ with $0 < \alpha_i \kappa_i < \lambda$ are the support measures. Theorem: let $\eta$ be the Lagrange multiplier of the constraint $\sum_{i=1}^N \alpha_i \kappa_i = 1$; then $R^2 = \eta$.

50 Group anomalies can be detected by checking whether $\|\mu_{\mathbb{P}_t} - c\|^2_{\mathcal{H}} + \operatorname{tr}(\Sigma^t_{\mathcal{H}}) > R^2$, where the left-hand side is computed as $k(\mathbb{P}_t, \mathbb{P}_t) - 2\sum_i \alpha_i k(\mathbb{P}_i, \mathbb{P}_t) + \sum_{i,j} \alpha_i \alpha_j k(\mathbb{P}_i, \mathbb{P}_j) + \operatorname{tr}(\Sigma^t_{\mathcal{H}})$. (18)

51 Outline (section divider: 5. Experiments).

52-54 Group anomaly detection in astronomical data: the Star KIC phenomenon (image slides).

55 Group anomaly detection in astronomical data. Figures: (a) cluster of galaxies, (b) a galaxy, (c) spectrum of a galaxy.

56 Results (four experiment rows of plots); each row shows AUC, ACC on non-anomalous groups, ACC on anomalous groups, and group means.

57 Outline (section divider: 6. Conclusions and further research).

58-63 Conclusions and further research. SMDD is a nonparametric kernel method with promising results on the group anomaly detection task. Performance depends on the kernel choice and the regularization parameter. More models can easily be induced based on the idea of MV-sets. Further research: create models for classification, regression, and other ML tasks; exploit the structural information of groups. References: Guevara, Jorge; Canu, Stephane; Hirata Jr., Roberto. "Support Measure Data Description for Group Anomaly Detection." ACM SIGKDD Workshop ODDx3, 2015. Guevara, Jorge; Canu, Stephane; Hirata Jr., Roberto. "Support Measure Data Description." Technical Report, IME-USP, 2014.

64 Any questions?

65 Point-based group anomaly detection over a Gaussian mixture distribution dataset. Red, blue, and magenta boxes: group anomalies; green and yellow boxes: non-anomalous groups.

66 Experimental setup: M4 = OCSMM, M5 = SVDD; kernel on probability measures: Gaussian kernel with $\gamma$ given by the median heuristic; 200 runs, training set = 50 groups, test set = 30 groups (20 group anomalies); metrics: AUC and ACC. Plots: (a) AUC, (b) ACC on non-anomalous groups, (c) ACC on anomalous groups, (d) group means. (A sketch of the median heuristic follows below.)

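One common convention for the median heuristic mentioned in the setup is gamma = 1 / (2 * median^2) over the pooled pairwise distances of the training points; the exact convention used in the talk is not stated, so this sketch is an assumption:

```python
# Sketch: median heuristic for the Gaussian kernel parameter gamma.
import numpy as np

def median_heuristic_gamma(groups):
    X = np.vstack(groups)  # pool all training points
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    d = np.sqrt(sq[np.triu_indices_from(sq, k=1)])  # pairwise distances
    return 1.0 / (2.0 * np.median(d) ** 2)
```
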
67 Distribution-based group anomaly detection over a Gaussian mixture distribution dataset. Plots (two rows): AUC, ACC on non-anomalous groups, ACC on anomalous groups, and group means.
