Machine Learning. Ludovic Samper, Antidot. September 1st, 2015.
1 Machine Learning. Ludovic Samper, Antidot. September 1st, 2015.
2 Antidot. Software vendor since 1999. Paris, Lyon, Aix-en-Provence; 45 employees. Founders: Fabrice Lacroix CEO, Stéphane Loesel CTO, Jérôme Mainka Chief Scientist Officer. Software products and solutions: Antidot Finder Suite (AFS), a search engine; Antidot Information Factory (AIF), a pipes & filters framework. SaaS, hosted license, on-site license. 50% of the revenue invested in R&D.
3 Antidot Machine Learning. Automatic text document classification. Named entity extraction. Compound splitter (for German words). Clustering algorithm (for news aggregation). Open Data, Semantic Web: a Social Sciences and Humanities research platform enriched with open resources; an open source library to export a database in RDF; Antidot is a partner organization in the WDAqua project.
4 Tutorial. Study a classical task in Machine Learning: text classification. Introduce scikit-learn (scikit-learn.org), a Python machine learning library. Follow the Working with text data tutorial: working_with_text_data.html. Additional material is available online.
5 Summary of the tutorial. 1 Problem definition: supervised classification; evaluation metrics. 2 Extracting features from text files: bag of words model; term frequency inverse document frequency (tfidf). 3 Algorithms for classification: Naïve Bayes; Support Vector Machine (SVM); tuning parameters: cross validation, grid search. 4 Conclusion: methodology.
6 Outline. 1 Problem definition: supervised classification; evaluation metrics. 2 Extracting features from text files. 3 Algorithms for classification. 4 Conclusion.
7 The 20 newsgroups dataset. 20 newsgroups documents collected in the 1990s; the label of a document is the newsgroup it belongs to. A popular collection: 11314 documents in train, 7532 in test. wiss-ml.ipynb#the-20-newsgroups-dataset
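A minimal sketch of loading this dataset with a recent scikit-learn (the sizes in the comments are those of the standard bydate split used here):

```python
# Sketch: load the 20 newsgroups dataset with scikit-learn.
from sklearn.datasets import fetch_20newsgroups

train = fetch_20newsgroups(subset='train')  # 11314 documents
test = fetch_20newsgroups(subset='test')    # 7532 documents

print(len(train.data), len(test.data))  # raw text documents
print(train.target_names[:3])           # newsgroup names used as labels
print(train.target[:5])                 # integer label of each document
```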
8 Classification. Problem statement: one label per document; automatically determine the label of an unseen document, given a set of documents and their labels. This is a supervised classification problem. Training: from the set of documents and their labels, build a model. Inference: given a new document, use the model to predict its label.
9 Precision and Recall I. Binary classification confusion matrix:
                  e in C                 e not in C
Labeled C         TP (True Positive)     FP (False Positive)
Not labeled C     FN (False Negative)    TN (True Negative)
Precision $= P(e \in C \mid e \text{ labeled } C) = \frac{TP}{TP + FP}$. Recall $= P(e \text{ labeled } C \mid e \in C) = \frac{TP}{TP + FN}$.
10 Precision and Recall II. $F_1 = \frac{2 P R}{P + R}$: the harmonic mean of Precision and Recall. Accuracy $= \frac{TP + TN}{TP + TN + FP + FN}$.
11 Multiclass I. $N_C$ = number of classes. Macro average: $B_{macro} = \frac{1}{N_C} \sum_{k=1}^{N_C} B_{binary}(TP_k, FP_k, TN_k, FN_k)$, i.e. the measure averaged by class; large classes count as much as small ones. Micro average: $B_{micro} = B_{binary}\left(\sum_{k=1}^{N_C} TP_k, \sum_{k=1}^{N_C} FP_k, \sum_{k=1}^{N_C} TN_k, \sum_{k=1}^{N_C} FN_k\right)$, i.e. the measure averaged by instance.
12 Multiclass II. Micro average in single-label multiclass: each document has exactly one true label and one predicted label, so $\sum_{k=1}^{N_C} FN_k = \sum_{k=1}^{N_C} FP_k$. Then $\mathrm{Precision}_{micro} = \mathrm{Recall}_{micro} = \mathrm{Accuracy} = \frac{\sum_{k=1}^{N_C} TP_k}{N_{doc}}$.
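As an illustration of macro vs. micro averaging (not from the slides; the toy labels below are made up), scikit-learn computes these averages directly:

```python
# Sketch: macro vs. micro averaged metrics on made-up multiclass labels.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [0, 0, 1, 1, 2, 2]  # hypothetical gold labels
y_pred = [0, 1, 1, 1, 2, 0]  # hypothetical predictions

# Macro: average per-class scores, so small classes count as much as large ones.
print(f1_score(y_true, y_pred, average='macro'))

# Micro: pool TP/FP/FN over all classes. In single-label multiclass,
# micro precision = micro recall = accuracy, as derived above.
print(precision_score(y_true, y_pred, average='micro'))
print(recall_score(y_true, y_pred, average='micro'))
print(accuracy_score(y_true, y_pred))
```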
13 Outline. 1 Problem definition. 2 Extracting features from text files: bag of words model; term frequency inverse document frequency (tfidf). 3 Algorithms for classification. 4 Conclusion.
14 Bag of words. From text to features: count the number of occurrences of words in the text; a "bag" because position isn't taken into account. Extensions: remove stop words; remove too frequent words (max_df); lowercase; ngrams (ngram_range): tokenize ngrams instead of words, useful to take word positions into account (see the sketch below). wiss-ml.ipynb#bag-of-words
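A sketch of these options with scikit-learn's CountVectorizer (the toy corpus is made up; get_feature_names_out assumes scikit-learn 1.0 or later):

```python
# Sketch: bag of words with scikit-learn's CountVectorizer.
from sklearn.feature_extraction.text import CountVectorizer

docs = ["The cat sat on the mat.", "The dog sat."]  # toy corpus

vect = CountVectorizer(
    lowercase=True,          # fold case before counting
    stop_words='english',    # drop stop words
    max_df=0.95,             # drop words occurring in >95% of documents
    ngram_range=(1, 2),      # unigrams and bigrams, to keep some word order
)
X = vect.fit_transform(docs)          # sparse document-term count matrix
print(vect.get_feature_names_out())   # the vocabulary actually kept
print(X.toarray())                    # occurrence counts per document
```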
15 Term frequency inverse document frequency (tfidf) I. Intuition: take into account the relative importance of each word with regard to the whole dataset. If a word occurs in every document, it doesn't hold any information.
16 Term frequency inverse document frequency (tfidf) II. Definition: $tfidf(w, d) = tf(w, d) \cdot idf(w)$, where $tf(w, d)$ is the term frequency of word $w$ in document $d$ and $idf(w) = \log\frac{N_{doc}}{doc\_freq(w)}$. In scikit-learn: $tfidf(w, d) = tf(w, d) \cdot (idf(w) + 1)$, so that terms occurring in all documents ($idf = 0$) are not ignored.
17 Term frequency inverse document frequency (tfidf) III. Options: Normalisation: $\|doc\| = 1$; e.g. for the $L_2$ norm, $\sum_{w \in d} tfidf(w, d)^2 = 1$. Smoothing: add one to document frequencies, as if an extra document contained every term in the collection exactly once: $idf(w) = \log\frac{N_{doc} + 1}{doc\_freq(w) + 1}$. Example: show the most significant words of a doc. wiss-ml.ipynb#tfidf
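A sketch of the same options with TfidfVectorizer (the defaults shown explicitly; the corpus is made up):

```python
# Sketch: tfidf with L2 normalisation and smoothing, as defined above.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the cat sat on the mat", "the dog sat"]  # toy corpus

vect = TfidfVectorizer(norm='l2', smooth_idf=True)
X = vect.fit_transform(docs)

# Each document vector has unit L2 norm.
print(np.linalg.norm(X[0].toarray()))  # ~1.0

# Most significant words of the first document: highest tfidf first.
row = X[0].toarray().ravel()
terms = vect.get_feature_names_out()
print([terms[i] for i in row.argsort()[::-1][:3]])
```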
18 Outline. 1 Problem definition. 2 Extracting features from text files. 3 Algorithms for classification: Naïve Bayes; Support Vector Machine (SVM); tuning parameters: cross validation, grid search. 4 Conclusion.
19 Supervised classification problem I. Notations: $x = (x_1, \dots, x_n) \in \mathbb{R}^n$, a feature vector; $n$, the dimension of the feature space; $\{(x_d, y_d)\}_{0 \le d < D}$, the training set, where $x_d$ is the feature vector of document $d$; $y_d \in \{1, \dots, N_C\}$, the class of document $d$; $N_C$, the number of classes; $\hat{y}$, the class prediction: for a new vector $x$, $\hat{y}$ is the predicted class of $x$.
20 Supervised classification problem II. Goal: find a function $F : \mathbb{R}^n \to \{1, \dots, N_C\}$, $x \mapsto \hat{y}$.
21 In 20newsgroups I. Values in 20 newsgroups: $n$ = nb features (number of unique terms); $D$ = number of training samples; $N_C = 20$ different classes. Goal: find a function $F$ that, given a new document, predicts its class.
22 Naïve Bayes Algorithm I. Bayes' theorem: $P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$.
23 Naïve Bayes Algorithm II. Posterior probability of class $C$: $P(C \mid x) = \frac{P(x \mid C)\, P(C)}{P(x)}$. Since $P(x)$ does not depend on $C$, $P(C \mid x) \propto P(x \mid C)\, P(C)$. Naïve Bayes independence assumption: each feature $i$ is conditionally independent of every other feature $j$ given the class, so $P(C \mid x) \propto P(C) \prod_{i=1}^{n} P(x_i \mid C)$.
24 Naïve Bayes Algorithm III. Classifier from the probability model: $\hat{y} = \arg\max_{k \in \{1,\dots,N_C\}} P(y = k) \prod_{i=1}^{n} P(x_i \mid y = k)$.
25 Parameter estimation in the Naïve Bayes classifier. Prior of a class: $P(y = k) = \frac{\text{nb samples in class } k}{\text{total nb samples}}$. Can also be uniform: $P(y = k) = \frac{1}{N_C}$.
26 Multinomial Naïve Bayes I. Naïve Bayes: $P(x \mid y = k) = \prod_{i=1}^{n} P(x_i \mid y = k)$. Multinomial distribution: the event "word is $i$" follows a multinomial distribution with parameters $(p_1, \dots, p_n)$, where $p_i = P(w = i)$ and $\sum_i p_i = 1$: $P(x_1, \dots, x_n) \propto \prod_{i=1}^{n} p_i^{x_i}$. One distribution for each class $y$.
27 Multinomial Naïve Bayes II. One multinomial distribution for each class: $P(i \mid y = k) = \frac{\text{sum of occurrences of word } i \text{ in class } k}{\text{total nb words in class } k} = \frac{\sum_{d \in k} x_i^d}{\sum_{0 \le j < n} \sum_{d \in k} x_j^d}$. With smoothing: $P(i \mid y = k) = \frac{\sum_{d \in k} x_i^d + \alpha}{\sum_{0 \le j < n} \sum_{d \in k} x_j^d + \alpha n}$.
28 Multinomial Naïve Bayes III. Inference in Multinomial Naïve Bayes: $\hat{y} = \arg\max_k P(y = k \mid x) = \arg\max_k P(y = k) \prod_{0 \le i < n} P(i \mid y = k)^{x_i} = \arg\max_k \left( \log P(y = k) + \sum_{0 \le i < n} x_i \log P(i \mid y = k) \right)$. A sketch of this classifier in scikit-learn follows.
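A minimal pipeline on 20 newsgroups (an illustration, not the tutorial's exact code; alpha is the smoothing parameter above, and the score will depend on the scikit-learn version):

```python
# Sketch: Multinomial Naive Bayes on 20 newsgroups.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

train = fetch_20newsgroups(subset='train')
test = fetch_20newsgroups(subset='test')

clf = Pipeline([
    ('tfidf', TfidfVectorizer()),
    ('nb', MultinomialNB(alpha=1.0)),  # alpha: the smoothing parameter above
])
clf.fit(train.data, train.target)
print(accuracy_score(test.target, clf.predict(test.data)))
```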
29 Multinomial Naïve Bayes IV. A linear model: in the log space, $(\log P(y = k \mid x))_k \propto W_0 + W^T x$, where $W_0$ is the vector of priors, $W_{0k} = \log P(y = k)$, and $W = (w_{ik})$, $i \in [1, n]$, $k \in [1, N_C]$, is the matrix of distributions, $w_{ik} = \log P(i \mid y = k)$.
30 Multinomial Naïve Bayes V. Example step-by-step.
31 Outline. 1 Problem definition. 2 Extracting features from text files. 3 Algorithms for classification: Naïve Bayes; Support Vector Machine (SVM); tuning parameters: cross validation, grid search. 4 Conclusion.
32-36 A linear classifier (figure slides: successive animation steps).
37 Support Vector Machine, notations. Problem: $S$, training set $\{(x_i, y_i)\}_{0 \le i < D}$, $x_i \in \mathbb{R}^n$, $y_i \in \{-1, 1\}$. Find a linear function $\langle w, x \rangle + b$ such that $\mathrm{sign}(\langle w, x_i \rangle + b) = y_i$.
38 SVM, maximum margin classifier (figure).
39 Margin. $\mathrm{distance}(x_+, x_-) = \left\langle \frac{w}{\|w\|}, x_+ - x_- \right\rangle = \frac{1}{\|w\|}\left(\langle w, x_+ \rangle - \langle w, x_- \rangle\right) = \frac{1}{\|w\|}\left((\langle w, x_+ \rangle + b) - (\langle w, x_- \rangle + b)\right) = \frac{1}{\|w\|}(1 - (-1)) = \frac{2}{\|w\|}$.
40 SVM, maximum margin classifier (figure).
41 Solving an optimization problem using the Lagrangian. Primal problem: minimize $f(w, b)$ over $(w, b)$ under the constraints $h_i(w, b) \ge 0$. Lagrange function: $L(w, b, \alpha) = f(w, b) - \sum_i \alpha_i h_i(w, b)$. Let $g(\alpha) = \inf_{(w,b)} L(w, b, \alpha)$; for all $w, b$, $g(\alpha) \le L(w, b, \alpha)$. Moreover, for feasible $(w, b)$ and $\alpha_i \ge 0$, $L(w, b, \alpha) \le f(w, b)$. Thus, for $\alpha_i \ge 0$, $g(\alpha) \le \min_{w,b} f(w, b)$. And with the Karush-Kuhn-Tucker (KKT) optimality conditions, $\max_\alpha g(\alpha) = \min_{w,b} f(w, b)$ and $\alpha_i h_i(w, b) = 0$.
42 Support Vector Machine, problem. Primal problem: minimize $\frac{\|w\|^2}{2}$ over $(w, b)$, under the constraints $y_i(\langle w, x_i \rangle + b) \ge 1$ for all $i$. Lagrange function: $L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_i \alpha_i \left(y_i(\langle w, x_i \rangle + b) - 1\right)$. Dual problem: maximize $\inf_{(w,b)} L(w, b, \alpha)$ over $\alpha$, with $\alpha_i \ge 0$; the optimum in $w, b$ is a saddle point with $\alpha$.
43 Support Vector Machine, problem. The derivatives in $w, b$ need to vanish: $\partial_w L(w, b, \alpha) = w - \sum_i \alpha_i y_i x_i = 0$ and $\partial_b L(w, b, \alpha) = -\sum_i \alpha_i y_i = 0$. Dual problem: maximize over $\alpha$: $-\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle + \sum_i \alpha_i$, under the constraints $\sum_i \alpha_i y_i = 0$ and $\alpha_i \ge 0$.
44 Support Vectors. Support vectors: $w = \sum_i y_i \alpha_i x_i$. Karush-Kuhn-Tucker (KKT) optimality condition (Lagrange multiplier times constraint equals zero): $\alpha_i \left(y_i(\langle w, x_i \rangle + b) - 1\right) = 0$. Thus, either $\alpha_i = 0$, or $\alpha_i > 0$ and $y_i(\langle w, x_i \rangle + b) = 1$.
45 Experiments with a separable space. SVMvaryingC.ipynb
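A toy sketch of a linear SVM on made-up separable points (the notebook above explores this interactively; the data and the large C, which approximates a hard margin, are illustrative assumptions):

```python
# Sketch: linear SVM on a small separable toy set; inspect support vectors.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 1], [2, 0], [3, 3], [4, 4], [4, 2]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])  # made-up labels

clf = SVC(kernel='linear', C=1e6)  # large C approximates the hard margin
clf.fit(X, y)

print(clf.support_vectors_)       # the x_i with alpha_i > 0
print(clf.coef_, clf.intercept_)  # w and b of the separating hyperplane
```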
46 What happens if the space is not separable? (figure)
47 Adding slack variables. The problem was: minimize $\frac{\|w\|^2}{2}$ over $(w, b)$, with $y_i(w \cdot x_i + b) \ge 1$. With slack: minimize $\frac{\|w\|^2}{2} + C \sum_i \xi_i$ over $(w, b)$, with $y_i(w \cdot x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$.
48 Support Vector Machine, without slack. Primal problem: minimize $\frac{\|w\|^2}{2}$ over $(w, b)$, with $y_i(w \cdot x_i + b) \ge 1$. Lagrange function: $L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_i \alpha_i \left(y_i(\langle w, x_i \rangle + b) - 1\right)$. Dual problem: maximize $\inf_{(w,b)} L(w, b, \alpha)$ over $\alpha$; optimality in $w, b$ is a saddle point with $\alpha$.
49 Support Vector Machine, with slack. Primal problem: minimize $\frac{\|w\|^2}{2} + C \sum_i \xi_i$ over $(w, b)$, with $y_i(w \cdot x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$. Lagrange function: $L(w, b, \xi, \alpha, \eta) = \frac{1}{2}\|w\|^2 + C \sum_i \xi_i - \sum_i \alpha_i \left(y_i(\langle x_i, w \rangle + b) + \xi_i - 1\right) - \sum_i \eta_i \xi_i$. Dual problem: maximize $\inf_{(w,b,\xi)} L(w, b, \xi, \alpha, \eta)$ over $\alpha, \eta$; optimality in $w, b, \xi$ is a saddle point with $\alpha, \eta$.
50 Support Vector Machine, problem. The derivatives in $w, b, \xi$ need to vanish: $\partial_w L = w - \sum_i \alpha_i y_i x_i = 0$; $\partial_b L = -\sum_i \alpha_i y_i = 0$; $\partial_{\xi_i} L = C - \alpha_i - \eta_i = 0$, so $\eta_i = C - \alpha_i$. Dual problem: maximize over $\alpha$: $-\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle + \sum_i \alpha_i$, under the constraints $\sum_i \alpha_i y_i = 0$ and $0 \le \alpha_i \le C$.
51 Support Vectors. Support vectors: $w = \sum_i y_i \alpha_i x_i$. Karush-Kuhn-Tucker (KKT) optimality conditions (Lagrange multiplier times constraint equals zero): $\alpha_i \left(y_i(\langle w, x_i \rangle + b) + \xi_i - 1\right) = 0$ and $\eta_i \xi_i = (C - \alpha_i)\xi_i = 0$. Thus: $\alpha_i = 0 \Rightarrow y_i(\langle w, x_i \rangle + b) \ge 1$; $0 < \alpha_i < C \Rightarrow y_i(\langle w, x_i \rangle + b) = 1$; $\alpha_i = C \Rightarrow y_i(\langle w, x_i \rangle + b) \le 1$.
52 Support Vector Machine, loss functions. Primal problem: minimize $\frac{\|w\|^2}{2} + C \sum_i \xi_i$ over $(w, b)$, with $y_i(w \cdot x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$. Equivalently, with a loss function: minimize $\frac{\|w\|^2}{2} + C \sum_i \max(0, 1 - y_i(w \cdot x_i + b))$ over $(w, b)$; here, $\mathrm{loss}(x_i, y_i) = \max(0, 1 - y_i(w \cdot x_i + b)) = \max(0, 1 - y_i f(x_i))$.
53 Support Vector Machine, common loss functions. Hinge loss ($L_1$-loss): $\max(0, 1 - y_i(w \cdot x_i + b))$. Squared hinge ($L_2$-loss): $\max(0, 1 - y_i(w \cdot x_i + b))^2$. Logistic loss: $\log(1 + \exp(-y_i(w \cdot x_i + b)))$.
54 (figure: the loss functions above, plotted).
55 Experiments with different values for C. SVMvaryingC.ipynb#Varying-C-parameter
56 Non-linearly separable data (figure).
57-58 Non-linearly separable data, $\Phi(x) = (x, x^2)$ (figures: before and after the transformation).
59 Linear case. Primal problem: minimize $\frac{1}{2}\|w\|^2 + C \sum_i \xi_i$ over $(w, b)$, subject to $y_i(\langle w, x_i \rangle + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$. Dual problem: maximize $-\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle + \sum_i \alpha_i$ over $\alpha$, subject to $\sum_i \alpha_i y_i = 0$ and $0 \le \alpha_i \le C$. Support vector expansion: $f(x) = \sum_i \alpha_i y_i \langle x_i, x \rangle + b$.
60 With a transformation $\Phi : x \mapsto \Phi(x)$. Primal problem: minimize $\frac{1}{2}\|w\|^2 + C \sum_i \xi_i$ over $(w, b)$, subject to $y_i(\langle w, \Phi(x_i) \rangle + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$. Dual problem: maximize $-\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle \Phi(x_i), \Phi(x_j) \rangle + \sum_i \alpha_i$ over $\alpha$, subject to $\sum_i \alpha_i y_i = 0$ and $0 \le \alpha_i \le C$. Support vector expansion: $f(x) = \sum_i \alpha_i y_i \langle \Phi(x_i), \Phi(x) \rangle + b$.
61 The kernel trick. Kernel function: $k(x, x') = \langle \Phi(x), \Phi(x') \rangle$; we just need to compute the dot product in the new space. Dual problem: maximize $-\frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j k(x_i, x_j) + \sum_i \alpha_i$ over $\alpha$, subject to $\sum_i \alpha_i y_i = 0$ and $0 \le \alpha_i \le C$. Support vector expansion: $f(x) = \sum_i \alpha_i y_i k(x_i, x) + b$.
62 Kernels. Kernel functions: linear: $k(x, x') = \langle x, x' \rangle$; polynomial: $k(x, x') = (\gamma \langle x, x' \rangle + r)^d$; rbf: $k(x, x') = \exp(-\gamma \|x - x'\|^2)$.
63 The RBF kernel implies an infinite-dimensional space. Here we are in dimension 1, $x \in \mathbb{R}$ (with $\gamma = 1$): $k(x, x') = \exp(-(x - x')^2) = \exp(-x^2)\exp(-x'^2)\exp(2xx')$. With the Taylor expansion of the exponential, $k(x, x') = \exp(-x^2)\exp(-x'^2) \sum_{k=0}^{\infty} \frac{2^k x^k x'^k}{k!} = \left\langle \left(\dots, \sqrt{\tfrac{2^k}{k!}}\, e^{-x^2} x^k, \dots\right), \left(\dots, \sqrt{\tfrac{2^k}{k!}}\, e^{-x'^2} x'^k, \dots\right) \right\rangle$.
64 Experiments with different kernels.
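A toy sketch comparing kernels on data that is not linearly separable (make_circles is a scikit-learn helper; gamma='scale' assumes a recent version):

```python
# Sketch: linear vs. polynomial vs. RBF kernels on concentric circles.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

for kernel in ('linear', 'poly', 'rbf'):
    clf = SVC(kernel=kernel, gamma='scale', C=1.0).fit(X, y)
    print(kernel, clf.score(X, y))  # training accuracy; rbf should be ~1.0
```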
65 SVM in multiclass. One-vs-the-rest: $N_C$ binary classifiers (but each involves the whole dataset); at prediction time, choose the class with the maximum decision value. One-vs-one: $\frac{N_C (N_C - 1)}{2}$ binary classifiers; at prediction time, vote.
66 SVM in scikit-learn. SVC: Support Vector Classification. sklearn.svm.LinearSVC: based on the LIBLINEAR library; multiclass strategy: one-vs-the-rest; only a linear kernel; the loss can be hinge or squared hinge. sklearn.svm.SVC: based on LIBSVM; multiclass strategy: one-vs-one; the kernel can be linear, polynomial, RBF, sigmoid, or precomputed; only the hinge loss.
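A sketch of text classification with LinearSVC (a minimal illustration, not the tutorial's exact code):

```python
# Sketch: one-vs-rest linear SVM (liblinear) on 20 newsgroups.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

train = fetch_20newsgroups(subset='train')
test = fetch_20newsgroups(subset='test')

vect = TfidfVectorizer()
X_train = vect.fit_transform(train.data)
X_test = vect.transform(test.data)

clf = LinearSVC(C=1.0)  # squared hinge loss by default
clf.fit(X_train, train.target)
print(clf.score(X_test, test.target))  # mean accuracy on the test set
```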
67 Outline. 1 Problem definition. 2 Extracting features from text files. 3 Algorithms for classification: Naïve Bayes; Support Vector Machine (SVM); tuning parameters: cross validation, grid search. 4 Conclusion.
68 Cross validation I. Overfitting: estimating parameters on the test set can lead to overfitting: the parameters are the best for this test set but not in the general case. Train, test and validation datasets: a solution is to tune the parameters on the test set and validate on a separate validation dataset; drawback: only few data remain in the training dataset.
69 Cross validation II. k-fold cross validation: split the training data into k partitions of the same size; train the model on k-1 partitions; then evaluate on the k-th partition; rotate over the k folds and average the scores.
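A sketch with a current scikit-learn (note the module paths changed since 2015: model_selection replaced cross_validation):

```python
# Sketch: 5-fold cross validation of Naive Bayes on the training set only.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB

train = fetch_20newsgroups(subset='train')
X = TfidfVectorizer().fit_transform(train.data)

# Train on 4 partitions, evaluate on the 5th, rotate over all 5 folds.
scores = cross_val_score(MultinomialNB(), X, train.target, cv=5)
print(scores.mean(), scores.std())
```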
70 Cross validation III (figure).
71 Grid Search. Grid search: test each value of each parameter; a brute-force algorithm to find the best value for each parameter. In scikit-learn: automatically runs one training per combination of parameter values (times the k folds of cross validation) and keeps the best model. Demo with scikit-learn.
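A sketch of grid search over a pipeline (the parameter grid below is an arbitrary example, not the tutorial's):

```python
# Sketch: brute-force grid search with cross validation, keeping the best model.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

train = fetch_20newsgroups(subset='train')

pipe = Pipeline([('tfidf', TfidfVectorizer()), ('nb', MultinomialNB())])
params = {
    'tfidf__ngram_range': [(1, 1), (1, 2)],  # hypothetical grid
    'nb__alpha': [0.01, 0.1, 1.0],
}
search = GridSearchCV(pipe, params, cv=3)  # one CV run per combination
search.fit(train.data, train.target)
print(search.best_params_, search.best_score_)
```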
72 Outline. 1 Problem definition. 2 Extracting features from text files. 3 Algorithms for classification. 4 Conclusion: methodology.
73 1 Problem definition: supervised classification; evaluation metrics. 2 Extracting features from text files: bag of words model; term frequency inverse document frequency (tfidf). 3 Algorithms for classification: Naïve Bayes; Support Vector Machine (SVM); tuning parameters: cross validation, grid search. 4 Conclusion: methodology.
74 Methodology. To solve a problem using Machine Learning, you have to: 1 understand the data; 2 choose an evaluation measure; 3 be able to test the model; 4 find the main features; 5 try the algorithms, with different parameters.
75 Conclusion. Machine Learning has a lot of applications. With libraries like scikit-learn, there is no need to implement the algorithms yourself.
76 Questions?
77 References. Machine Learning in Python: scikit-learn (scikit-learn.org). Alex Smola's very good lecture on Machine Learning at CMU. Kernels. SVM.
78 Bernoulli Naïve Bayes. Features: $x_i = 1$ iff word $i$ is present in the document, else $x_i = 0$; the number of occurrences of word $i$ doesn't matter. Bernoulli: for each feature $i$, $P(x_i \mid y = k) = P(i \mid y = k)\, x_i + (1 - P(i \mid y = k))(1 - x_i)$; the absence of a feature is explicitly taken into account. Estimation of $P(i \mid y = k)$, with add-one smoothing: $P(i \mid y = k) = \frac{1 + \text{nb of documents in } k \text{ that contain word } i}{2 + \text{nb of documents in } k}$.
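A toy sketch of the Bernoulli variant (made-up corpus and labels; binary=True keeps presence/absence only, matching the model above):

```python
# Sketch: Bernoulli Naive Bayes on binary presence/absence features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

docs = ["the cat sat", "dogs bark loudly", "the dog sat"]  # toy corpus
y = [0, 1, 1]                                              # made-up labels

X = CountVectorizer(binary=True).fit_transform(docs)
clf = BernoulliNB(alpha=1.0).fit(X, y)  # alpha: add-one smoothing as above
print(clf.predict(X))
```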
More information