Similarity and kernels in machine learning
1/31 Similarity and kernels in machine learning
Zalán Bodó
Babeş-Bolyai University, Cluj-Napoca/Kolozsvár
Faculty of Mathematics and Computer Science
MACS 2016, Eger, Hungary
2/31 Overview of the presentation
- Machine learning
- Similarity. Similarity in (machine) learning
- Kernels: kernel methods, examples of general purpose kernels, kernels and similarities, a sample/simple method: prototype learning, the representer theorem, dimensionality, the kernelization period
- Semi-supervised learning and kernels: assumptions in SSL, humans and SSL, data-dependent kernels, reweighting cluster kernels
- A toy dataset
3/31 Machine learning
Arthur Samuel, 1959: "field of study that gives computers the ability to learn without being explicitly programmed"
"[...] machine learning is now an independent and mature field that has moved beyond psychologically or neurally inspired algorithms towards providing foundations for a theory of learning that is rooted in statistics and functional analysis" [Jäkel et al., 2007]
Machine learning =
- supervised learning: classification, regression
- unsupervised learning: clustering, density estimation
- reinforcement learning
- + semi-supervised learning (classification)
4/31 Example: Content-based spam filtering
Figure: a spam message vs. a ham message (image)
5/31 Similarity. Similarity in (machine) learning
- similarity is fundamental to learning
- Shepard: in each individual there is an internal metric of similarity between possible situations [Shepard, 1987]
- generalization is based on similarity between situations/events/objects/...
- learning = generalize...
  (a) supervised scenarios: ...from labeled to unlabeled data
  (b) unsupervised scenarios: ...from familiar to novel data
"The fundamental challenge confronted by any system that is expected to generalize from familiar to unfamiliar stimuli is how to estimate similarity over stimuli in a principled and feasible manner." [Shahbazi et al., 2016]
6/31 Similarity of...
- sets, e.g. Jaccard similarity: J(A, B) = |A ∩ B| / |A ∪ B|
- sequences, e.g. edit (Levenshtein) distance-based similarity: E(s, t) = 1 - edist(s, t) / max(|s|, |t|)
- vectors, e.g. cosine similarity (= normalized dot product): C(x, z) = x·z / (‖x‖ ‖z‖)
- complex objects, e.g. of two text segments extracted from a PDF file...
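The three measures above are easy to state in code. A minimal, dependency-free sketch (the function names are ours, not from the slides):

```python
import math

def jaccard(a, b):
    """Jaccard similarity of two sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

def edit_distance(s, t):
    """Levenshtein distance via the classic dynamic program."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]

def edit_similarity(s, t):
    """E(s, t) = 1 - edist(s, t) / max(|s|, |t|)."""
    if not s and not t:
        return 1.0
    return 1.0 - edit_distance(s, t) / max(len(s), len(t))

def cosine(x, z):
    """Cosine similarity: normalized dot product."""
    dot = sum(a * b for a, b in zip(x, z))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in z)))
```

For instance, jaccard({1, 2, 3}, {2, 3, 4}) gives 0.5, and edit_distance("kitten", "sitting") gives the familiar value 3.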
7/31 Outline recap: Machine learning, Similarity in (machine) learning, Kernels, Semi-supervised learning and kernels, A toy dataset (photo: MACS dinner)
9/31 Kernels
Figure: XOR problem: separate the o's from the x's
Marvin Minsky, Seymour Papert. Perceptrons: an introduction to computational geometry. MIT Press, Cambridge, Mass., 1969. A single artificial neuron/perceptron (= linear classifier) cannot solve the problem.
M. A. Aizerman, E. M. Braverman, L. I. Rozonoer. Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control, vol. 25, 1964. Use kernels!
10/31 Figure: Using the polynomial kernel
- map the points using the function φ(x) = [x₁², x₂², √2·x₁x₂]
- this is equivalent to using k(x, z) = ⟨φ(x), φ(z)⟩ = (x·z)² (= polynomial kernel)
- polynomial kernel: links the features using logical AND (the size of the group of linked features is determined by the order of the kernel)
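The equivalence on this slide can be checked numerically: the explicit degree-2 map φ and the kernel (x·z)² produce the same inner products. A small sketch for 2-D inputs (names are ours):

```python
import math

def phi(x):
    # explicit degree-2 feature map: phi(x) = [x1^2, x2^2, sqrt(2)*x1*x2]
    x1, x2 = x
    return [x1 * x1, x2 * x2, math.sqrt(2.0) * x1 * x2]

def k_poly2(x, z):
    # homogeneous polynomial kernel of degree 2: (x . z)^2
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def dot(u, v):
    # ordinary dot product in the feature space
    return sum(a * b for a, b in zip(u, v))
```

For x = (1, 2) and z = (3, 4), both ⟨φ(x), φ(z)⟩ and (x·z)² evaluate to 121: the kernel computes the 3-dimensional inner product without ever forming φ.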
11/31 Kernel methods
- 1909: James Mercer: any continuous, symmetric, positive semi-definite kernel function can be expressed as a dot product in a high-dimensional space [Mercer, 1909]
- 1964: Aizerman, Braverman and Rozonoer: first application [Aizerman et al., 1964]
- 1992: Boser, Guyon and Vapnik: famous application (SVM) [Boser et al., 1992]
- linear algorithms → non-linear algorithms
- feature mapping: φ : X → H (e.g. φ : R^d1 → R^d2)
- kernels: k(x, z) = ⟨φ(x), φ(z)⟩ = φ(x)·φ(z)
- covers all geometric constructions that can be formulated in terms of angles, lengths and distances
Kernel trick: given an algorithm which is formulated in terms of a positive definite kernel k(·,·), one can construct an alternative algorithm by replacing k(·,·) with another positive definite kernel k'(·,·).
12/31 Examples of general purpose kernels
- linear: k(x, z) = x·z
- polynomial: k(x, z) = (a·x·z + b)^c
- Gaussian (RBF): k(x, z) = exp(−γ ‖x − z‖²)
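The three kernels above, written out directly (a minimal sketch; the parameter defaults are illustrative, not from the slides):

```python
import math

def k_linear(x, z):
    # linear kernel: x . z
    return sum(a * b for a, b in zip(x, z))

def k_poly(x, z, a=1.0, b=1.0, c=2):
    # polynomial kernel: (a * x.z + b)^c
    return (a * k_linear(x, z) + b) ** c

def k_rbf(x, z, gamma=0.5):
    # Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))
```

Note that k_rbf(x, x) = 1 for every x, i.e. the RBF kernel maps points onto the unit sphere of its feature space.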
13/31 Kernels and similarities
kernel: real-valued, symmetric, positive definite; similarity: real-valued, not necessarily symmetric, not necessarily p.d.
- k(x, z) = (1/2) [k(x, x) + k(z, z) − ‖φ(x) − φ(z)‖²]
- sim(x, z) = inverse of the distance between x and z
- k(x, z) = ⟨φ(x), φ(z)⟩ = the cosine similarity of the mapped vectors, provided they are normalized
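Rearranging the first identity above gives the feature-space distance purely from kernel evaluations: ‖φ(x) − φ(z)‖² = k(x, x) + k(z, z) − 2k(x, z). A quick sketch (for the linear kernel this reduces to the ordinary Euclidean distance; names are ours):

```python
import math

def kernel_distance(k, x, z):
    # feature-space distance computed from kernel values only:
    # ||phi(x) - phi(z)||^2 = k(x, x) + k(z, z) - 2 * k(x, z)
    return math.sqrt(k(x, x) + k(z, z) - 2.0 * k(x, z))

def k_lin(x, z):
    # linear kernel, for which phi is the identity map
    return sum(a * b for a, b in zip(x, z))
```

With k_lin, kernel_distance([1, 2], [4, 6]) recovers the Euclidean distance 5 of the 3-4-5 triangle.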
14/31 A sample/simple method: prototype learning
Figure: class centers c₊ and c₋, weight vector w, point x
class centers (centroids, prototypes):
c₊ = (1/N₊) Σ_{x_i ∈ X₊} x_i,  c₋ = (1/N₋) Σ_{x_i ∈ X₋} x_i
15/31 define the following vectors: w = c₊ − c₋ and c = (c₊ + c₋)/2; then
y(x) = sgn⟨x − c, w⟩ = sgn(⟨c₊, x⟩ − ⟨c₋, x⟩ + b), with b = (‖c₋‖² − ‖c₊‖²)/2
using dot products between the x_i's:
y(x) = sgn( (1/N₊) Σ_{x_i ∈ X₊} ⟨x, x_i⟩ − (1/N₋) Σ_{x_i ∈ X₋} ⟨x, x_i⟩ + b )
where b = (1/(2N₋²)) Σ_{x_i, x_j ∈ X₋} ⟨x_i, x_j⟩ − (1/(2N₊²)) Σ_{x_i, x_j ∈ X₊} ⟨x_i, x_j⟩
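Since the decision rule uses only dot products, replacing ⟨·,·⟩ with a kernel k kernelizes the centroid classifier directly. A minimal sketch of that rule (function and variable names are ours):

```python
def centroid_predict(x, pos, neg, k):
    # kernelized prototype/centroid classifier: sign of
    # (1/N+) sum_{xi in X+} k(x, xi) - (1/N-) sum_{xi in X-} k(x, xi) + b
    n_pos, n_neg = len(pos), len(neg)
    s_pos = sum(k(x, xi) for xi in pos) / n_pos
    s_neg = sum(k(x, xi) for xi in neg) / n_neg
    # bias from the squared centroid norms, via kernel evaluations only
    b = (sum(k(xi, xj) for xi in neg for xj in neg) / (2.0 * n_neg ** 2)
         - sum(k(xi, xj) for xi in pos for xj in pos) / (2.0 * n_pos ** 2))
    return 1 if s_pos - s_neg + b >= 0 else -1

def k_lin(x, z):
    # linear kernel recovers the plain centroid classifier
    return sum(a * b for a, b in zip(x, z))
```

With the linear kernel and training points clustered around (2, 0) and (−2, 0), a test point is labeled by the nearer centroid; swapping in an RBF or polynomial kernel is the kernel trick from the earlier slide.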
16/31 The representer theorem
Theorem (Schölkopf and Smola, 2002). Let H be the feature space associated with a positive semi-definite kernel k : X × X → R. Denote by Ω : [0, ∞) → R a strictly monotonically increasing function, and by c : (X × R²)^l → R ∪ {∞} an arbitrary loss function. Then each minimizer f ∈ H of the regularized risk
c((x₁, y₁, f(x₁)), ..., (x_l, y_l, f(x_l))) + Ω(‖f‖_H)
admits a representation of the form
f(x) = Σ_{i=1}^{l} α_i k(x_i, x)
17/31 Semiparametric representer theorem
f(x) = Σ_{i=1}^{l} α_i k(x_i, x) + Σ_{p=1}^{M} β_p ψ_p(x)
Loss function + regularization for the centroid classifier:
− (1/N₊) Σ_{y_i=1} y_i f(x_i) − (1/N₋) Σ_{y_i=−1} y_i f(x_i) + (1/2) ‖w‖₂²
where f(x_i) = w·x_i + b
18/31 Dimensionality: curse or blessing?
- usually: φ : R^d1 → R^d2 with d₂ > d₁ or d₂ ≫ d₁
- why? the higher the dimensionality, the easier it is to find a separating hyperplane
- Vapnik-Chervonenkis dimension of a classification algorithm = size of the largest set of points that the algorithm can shatter (shattering a set of points = all possible labelings of the points can be realized by the method)
- the VC dimension of oriented hyperplanes in R^d is d + 1 (see proof in [Burges, 1998])
19/31
- φ need not increase the dimensionality: it suffices to map the points to a better representational space
- in either case: Johnson-Lindenstrauss lemma [Johnson and Lindenstrauss, 1984]: if the number of data points is relatively small (compared to the dimensionality), a random projection to a logarithmically lower dimensionality approximately preserves the relative distances
- corollary: kernels can be used for dimensionality reduction
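The lemma can be illustrated with a plain Gaussian random projection (a sketch with illustrative dimensions, not a tuned implementation; names are ours):

```python
import math
import random

def random_projection(points, d_out, seed=0):
    # project each point with a random Gaussian matrix scaled by 1/sqrt(d_out);
    # by the Johnson-Lindenstrauss lemma, pairwise distances between a small
    # number of points are approximately preserved with high probability
    rng = random.Random(seed)
    d_in = len(points[0])
    R = [[rng.gauss(0.0, 1.0) / math.sqrt(d_out) for _ in range(d_in)]
         for _ in range(d_out)]
    return [[sum(row[j] * p[j] for j in range(d_in)) for row in R]
            for p in points]

def dist(u, v):
    # Euclidean distance
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Projecting, say, a handful of 1000-dimensional points down to a few hundred dimensions changes their pairwise distances only by a small relative factor, which is what makes random projection a cheap preprocessing step.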
20/31 The kernelization period (199x-200y)
- 1992: SVM
- ?: kernel regularized least squares
- 1996: kernel PCA
- 1999: kernel Fisher discriminant analysis, transductive SVM
- 2001: kernel k-means clustering, kernel canonical correlation analysis, SVC (support vector clustering)
- 2005: first data-dependent non-parametric kernel, Laplacian regularized least squares, Laplacian SVM
- ...
21/31 (Some DBLP stats)
Figure: number of works retrieved for the keyword "kernel" on DBLP
Figure: top 10 authors for the same keyword: Bernhard Schölkopf (73), Johan A. K. Suykens (68), José Carlos Príncipe (63), Stefan Kratsch (60), Alessandro Moschitti (56), Alexander J. Smola (53), Hortensia Galeana-Sánchez (51), Arthur Gretton (47), Saket Saurabh (44), Edwin R. Hancock (44)
22/31 Semi-supervised learning and kernels
Semi-supervised learning (SSL):
- supervised learning: D = {(x_i, y_i) | x_i ∈ X ⊆ R^d, y_i ∈ {−1, +1}, i = 1, ..., l}; find f : X → {−1, +1} which agrees with D
- semi-supervised learning: D = {(x_i, y_i) | i = 1, ..., l} ∪ {x_j | j = 1, ..., u}, l ≪ u, N = l + u
  - inductive: find f : X → {−1, +1} which agrees with D + uses the information of D_U
  - transductive: find f : D_U → {−1, +1} by using D = D_L ∪ D_U
23/31 Assumptions in SSL
1. smoothness assumption: if two points x_i and x_j in a high-density region are close, then so should be the corresponding outputs y_i and y_j
2. cluster assumption: if two points are in the same cluster, they are likely to be of the same class
3. manifold assumption (a.k.a. graph-based learning): the high-dimensional data lie roughly on a low-dimensional manifold
24/31 Humans and SSL
- humans do semi-supervised classification too
- 2007: experiment by Zhu and his colleagues, University of Wisconsin [Zhu et al., 2007]
- complex 3D shapes classified into two categories
- participants were told they see microscopic images of pollen particles from two fictitious flowers (Belianthus and Nortulaca)
- data given:
  - 2 labeled examples (each appearing 10 times in 20 trials)
  - a test set of 21 evenly spaced unlabeled examples, to probe the learned decision boundary
  - unlabeled examples whose means are shifted away from the labeled examples (left-shifted or right-shifted)
  - a second test set of 21 evenly spaced unlabeled examples, to test whether the decision boundary has changed
- conclusion: the learned decision boundary is determined by both the labeled and the unlabeled data
25/31 Data-dependent kernels
supervised learning + data-dependent kernels = semi-supervised learning
- conventional kernels: given data sets D₁ ≠ D₂ and x, z ∈ D₁ ∩ D₂: k(x, z) = k(x, z)
- data-dependent kernels: given data sets D₁ ≠ D₂ and x, z ∈ D₁ ∩ D₂: k(x, z; D₁) ≠ k(x, z; D₂), where ≠ reads as "not necessarily equal"
26/31 Reweighting cluster kernels
- idea borrowed from the bagged cluster kernel [Weston et al., 2005]: reweighting conventional kernels according to some clustering of the data [Bodó and Csató, 2010]
- kernel combinations: K₁ + K₂, aK, K₁ ∘ K₂
- cluster kernel: K = K_rw ∘ K_b, where K_b = base kernel (e.g. Gaussian, polynomial, etc.), K_rw = reweighting kernel, K = resulting cluster kernel used in the learning algorithm
- k_rw(x, z) = exp(−‖U_x − U_z‖² / (2σ²))
- K_rw = U'U + α·11', α ∈ [0, 1)
- K_rw = β·U'U + 11', β ∈ (0, ∞)
- U = matrix of cluster membership vectors (columns) of size K × N (K = no. of clusters, N = no. of points)
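With hard cluster memberships, U'U has entry 1 when two points share a cluster and 0 otherwise, so K_rw = U'U + α·11' boosts within-cluster kernel entries. A small sketch, assuming the combination K = K_rw ∘ K_b is the entrywise (Hadamard) product and hard (non-fuzzy) clustering; the function names are ours:

```python
def reweighting_kernel(labels, alpha=0.5):
    # K_rw = U'U + alpha * 11' for hard cluster memberships:
    # entry (i, j) is 1 + alpha if points i and j share a cluster, alpha otherwise
    n = len(labels)
    return [[(1.0 if labels[i] == labels[j] else 0.0) + alpha
             for j in range(n)] for i in range(n)]

def cluster_kernel(K_rw, K_b):
    # entrywise (Hadamard) product of the reweighting and base kernel matrices
    n = len(K_b)
    return [[K_rw[i][j] * K_b[i][j] for j in range(n)] for i in range(n)]
```

Within-cluster entries of the base kernel are thus scaled by (1 + α) and cross-cluster entries by α, which pushes the downstream classifier to respect the cluster assumption of SSL.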
27/31 A toy dataset
Figure: Linked tori dataset; labeled examples: 3 + 3, unlabeled examples:
28/31
Figure: Linear SVM: Accuracy = 70.81% (279/394)
Figure: Gaussian SVM, γ = : Accuracy = 69.54% (274/394)
29/31
Figure: SVM with reweighting cluster kernel (RCK); clustering: fuzzy, p = 2, no. of clusters = 30; 3rd kernel, β = 1000; Accuracy = 76.14% (300/394)
30/31 Thank you!
31/31 References
[Aizerman et al., 1964] M. A. Aizerman, E. M. Braverman, L. I. Rozonoer. Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control, vol. 25, 1964.
[Bodó and Csató, 2010] Z. Bodó, L. Csató. Hierarchical and Reweighting Cluster Kernels for Semi-Supervised Learning. Int. J. of Computers, Communications & Control, vol. V, no. 4, 2010.
[Boser et al., 1992] B. E. Boser, I. M. Guyon, V. N. Vapnik. A Training Algorithm for Optimal Margin Classifiers. COLT, 1992.
[Burges, 1998] C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2(4), 1998.
[Jäkel et al., 2007] F. Jäkel, B. Schölkopf, F. A. Wichmann. A Tutorial on Kernel Methods for Categorization. Journal of Mathematical Psychology, 51(6), 2007.
[Johnson and Lindenstrauss, 1984] W. B. Johnson, J. Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26, 1984.
[Mercer, 1909] J. Mercer. Functions of positive and negative type and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society, Series A, vol. 209, 1909.
[Minsky and Papert, 1969] M. Minsky, S. Papert. Perceptrons: an introduction to computational geometry. MIT Press, Cambridge, Mass., 1969.
[Schölkopf and Smola, 2002] B. Schölkopf, A. J. Smola. Learning with Kernels. MIT Press, Cambridge, Mass., 2002.
[Shahbazi et al., 2016] R. Shahbazi, R. Raizada, S. Edelman. Similarity, kernels, and the fundamental constraints on cognition. Journal of Mathematical Psychology, vol. 70, 2016.
[Shepard, 1987] R. N. Shepard. Toward a universal law of generalization for psychological science. Science, 237, 1987.
[Weston et al., 2005] J. Weston, C. Leslie, D. Zhou, A. Elisseeff, W. S. Noble. Semi-Supervised Protein Classification using Cluster Kernels. Bioinformatics, 21(15), 2005.
[Zhu et al., 2007] X. Zhu, T. Rogers, R. Qian, C. Kalish. Humans perform semi-supervised classification too. AAAI, 2007.
More informationMachine Learning & SVM
Machine Learning & SVM Shannon "Information is any difference that makes a difference. Bateman " It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible
More informationLinear, threshold units. Linear Discriminant Functions and Support Vector Machines. Biometrics CSE 190 Lecture 11. X i : inputs W i : weights
Linear Discriminant Functions and Support Vector Machines Linear, threshold units CSE19, Winter 11 Biometrics CSE 19 Lecture 11 1 X i : inputs W i : weights θ : threshold 3 4 5 1 6 7 Courtesy of University
More informationPolyhedral Computation. Linear Classifiers & the SVM
Polyhedral Computation Linear Classifiers & the SVM mcuturi@i.kyoto-u.ac.jp Nov 26 2010 1 Statistical Inference Statistical: useful to study random systems... Mutations, environmental changes etc. life
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1396 1 / 44 Table
More informationClassifier Complexity and Support Vector Classifiers
Classifier Complexity and Support Vector Classifiers Feature 2 6 4 2 0 2 4 6 8 RBF kernel 10 10 8 6 4 2 0 2 4 6 Feature 1 David M.J. Tax Pattern Recognition Laboratory Delft University of Technology D.M.J.Tax@tudelft.nl
More informationIntroduction to Support Vector Machines
Introduction to Support Vector Machines Hsuan-Tien Lin Learning Systems Group, California Institute of Technology Talk in NTU EE/CS Speech Lab, November 16, 2005 H.-T. Lin (Learning Systems Group) Introduction
More information6.036 midterm review. Wednesday, March 18, 15
6.036 midterm review 1 Topics covered supervised learning labels available unsupervised learning no labels available semi-supervised learning some labels available - what algorithms have you learned that
More informationFrom Last Meeting. Studied Fisher Linear Discrimination. - Mathematics. - Point Cloud view. - Likelihood view. - Toy examples
From Last Meeting Studied Fisher Linear Discrimination - Mathematics - Point Cloud view - Likelihood view - Toy eamples - Etensions (e.g. Principal Discriminant Analysis) Polynomial Embedding Aizerman,
More informationA GENERAL FORMULATION FOR SUPPORT VECTOR MACHINES. Wei Chu, S. Sathiya Keerthi, Chong Jin Ong
A GENERAL FORMULATION FOR SUPPORT VECTOR MACHINES Wei Chu, S. Sathiya Keerthi, Chong Jin Ong Control Division, Department of Mechanical Engineering, National University of Singapore 0 Kent Ridge Crescent,
More informationKernel Methods and Support Vector Machines
Kernel Methods and Support Vector Machines Bernhard Schölkopf Max-Planck-Institut für biologische Kybernetik 72076 Tübingen, Germany Bernhard.Schoelkopf@tuebingen.mpg.de Alex Smola RSISE, Australian National
More informationEvaluation of Support Vector Machines and Minimax Probability. Machines for Weather Prediction. Stephen Sullivan
Generated using version 3.0 of the official AMS L A TEX template Evaluation of Support Vector Machines and Minimax Probability Machines for Weather Prediction Stephen Sullivan UCAR - University Corporation
More information(Kernels +) Support Vector Machines
(Kernels +) Support Vector Machines Machine Learning Torsten Möller Reading Chapter 5 of Machine Learning An Algorithmic Perspective by Marsland Chapter 6+7 of Pattern Recognition and Machine Learning
More informationMachine Learning Lecture 7
Course Outline Machine Learning Lecture 7 Fundamentals (2 weeks) Bayes Decision Theory Probability Density Estimation Statistical Learning Theory 23.05.2016 Discriminative Approaches (5 weeks) Linear Discriminant
More informationSupport Vector Machine. Natural Language Processing Lab lizhonghua
Support Vector Machine Natural Language Processing Lab lizhonghua Support Vector Machine Introduction Theory SVM primal and dual problem Parameter selection and practical issues Compare to other classifier
More informationSupport Vector Machines II. CAP 5610: Machine Learning Instructor: Guo-Jun QI
Support Vector Machines II CAP 5610: Machine Learning Instructor: Guo-Jun QI 1 Outline Linear SVM hard margin Linear SVM soft margin Non-linear SVM Application Linear Support Vector Machine An optimization
More informationOutline. Motivation. Mapping the input space to the feature space Calculating the dot product in the feature space
to The The A s s in to Fabio A. González Ph.D. Depto. de Ing. de Sistemas e Industrial Universidad Nacional de Colombia, Bogotá April 2, 2009 to The The A s s in 1 Motivation Outline 2 The Mapping the
More informationKernel Methods. Konstantin Tretyakov MTAT Machine Learning
Kernel Methods Konstantin Tretyakov (kt@ut.ee) MTAT.03.227 Machine Learning So far Supervised machine learning Linear models Non-linear models Unsupervised machine learning Generic scaffolding So far Supervised
More informationLinear and Non-Linear Dimensionality Reduction
Linear and Non-Linear Dimensionality Reduction Alexander Schulz aschulz(at)techfak.uni-bielefeld.de University of Pisa, Pisa 4.5.215 and 7.5.215 Overview Dimensionality Reduction Motivation Linear Projections
More informationCS798: Selected topics in Machine Learning
CS798: Selected topics in Machine Learning Support Vector Machine Jakramate Bootkrajang Department of Computer Science Chiang Mai University Jakramate Bootkrajang CS798: Selected topics in Machine Learning
More informationBrief Introduction to Machine Learning
Brief Introduction to Machine Learning Yuh-Jye Lee Lab of Data Science and Machine Intelligence Dept. of Applied Math. at NCTU August 29, 2016 1 / 49 1 Introduction 2 Binary Classification 3 Support Vector
More informationSupport Vector Machines
Support Vector Machines Reading: Ben-Hur & Weston, A User s Guide to Support Vector Machines (linked from class web page) Notation Assume a binary classification problem. Instances are represented by vector
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table
More informationSupport Vector Machines and Kernel Methods
2018 CS420 Machine Learning, Lecture 3 Hangout from Prof. Andrew Ng. http://cs229.stanford.edu/notes/cs229-notes3.pdf Support Vector Machines and Kernel Methods Weinan Zhang Shanghai Jiao Tong University
More informationSupport Vector and Kernel Methods
SIGIR 2003 Tutorial Support Vector and Kernel Methods Thorsten Joachims Cornell University Computer Science Department tj@cs.cornell.edu http://www.joachims.org 0 Linear Classifiers Rules of the Form:
More informationKernel Methods. Konstantin Tretyakov MTAT Machine Learning
Kernel Methods Konstantin Tretyakov (kt@ut.ee) MTAT.03.227 Machine Learning So far Supervised machine learning Linear models Least squares regression, SVR Fisher s discriminant, Perceptron, Logistic model,
More informationDeviations from linear separability. Kernel methods. Basis expansion for quadratic boundaries. Adding new features Systematic deviation
Deviations from linear separability Kernel methods CSE 250B Noise Find a separator that minimizes a convex loss function related to the number of mistakes. e.g. SVM, logistic regression. Systematic deviation
More informationKernel Methods and Support Vector Machines
Kernel Methods and Support Vector Machines Oliver Schulte - CMPT 726 Bishop PRML Ch. 6 Support Vector Machines Defining Characteristics Like logistic regression, good for continuous input features, discrete
More informationBits of Machine Learning Part 1: Supervised Learning
Bits of Machine Learning Part 1: Supervised Learning Alexandre Proutiere and Vahan Petrosyan KTH (The Royal Institute of Technology) Outline of the Course 1. Supervised Learning Regression and Classification
More informationKernel methods CSE 250B
Kernel methods CSE 250B Deviations from linear separability Noise Find a separator that minimizes a convex loss function related to the number of mistakes. e.g. SVM, logistic regression. Deviations from
More informationGraph-Based Semi-Supervised Learning
Graph-Based Semi-Supervised Learning Olivier Delalleau, Yoshua Bengio and Nicolas Le Roux Université de Montréal CIAR Workshop - April 26th, 2005 Graph-Based Semi-Supervised Learning Yoshua Bengio, Olivier
More informationMicroarray Data Analysis: Discovery
Microarray Data Analysis: Discovery Lecture 5 Classification Classification vs. Clustering Classification: Goal: Placing objects (e.g. genes) into meaningful classes Supervised Clustering: Goal: Discover
More informationECE-271B. Nuno Vasconcelos ECE Department, UCSD
ECE-271B Statistical ti ti Learning II Nuno Vasconcelos ECE Department, UCSD The course the course is a graduate level course in statistical learning in SLI we covered the foundations of Bayesian or generative
More informationAnalysis of N-terminal Acetylation data with Kernel-Based Clustering
Analysis of N-terminal Acetylation data with Kernel-Based Clustering Ying Liu Department of Computational Biology, School of Medicine University of Pittsburgh yil43@pitt.edu 1 Introduction N-terminal acetylation
More informationMachine Learning. Support Vector Machines. Manfred Huber
Machine Learning Support Vector Machines Manfred Huber 2015 1 Support Vector Machines Both logistic regression and linear discriminant analysis learn a linear discriminant function to separate the data
More informationStatistical Learning Reading Assignments
Statistical Learning Reading Assignments S. Gong et al. Dynamic Vision: From Images to Face Recognition, Imperial College Press, 2001 (Chapt. 3, hard copy). T. Evgeniou, M. Pontil, and T. Poggio, "Statistical
More informationSupport Vector Machine (continued)
Support Vector Machine continued) Overlapping class distribution: In practice the class-conditional distributions may overlap, so that the training data points are no longer linearly separable. We need
More informationNeural Networks. Prof. Dr. Rudolf Kruse. Computational Intelligence Group Faculty for Computer Science
Neural Networks Prof. Dr. Rudolf Kruse Computational Intelligence Group Faculty for Computer Science kruse@iws.cs.uni-magdeburg.de Rudolf Kruse Neural Networks 1 Supervised Learning / Support Vector Machines
More informationA graph based approach to semi-supervised learning
A graph based approach to semi-supervised learning 1 Feb 2011 Two papers M. Belkin, P. Niyogi, and V Sindhwani. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples.
More informationLinear vs Non-linear classifier. CS789: Machine Learning and Neural Network. Introduction
Linear vs Non-linear classifier CS789: Machine Learning and Neural Network Support Vector Machine Jakramate Bootkrajang Department of Computer Science Chiang Mai University Linear classifier is in the
More informationbelow, kernel PCA Eigenvectors, and linear combinations thereof. For the cases where the pre-image does exist, we can provide a means of constructing
Kernel PCA Pattern Reconstruction via Approximate Pre-Images Bernhard Scholkopf, Sebastian Mika, Alex Smola, Gunnar Ratsch, & Klaus-Robert Muller GMD FIRST, Rudower Chaussee 5, 12489 Berlin, Germany fbs,
More informationCSC2545 Topics in Machine Learning: Kernel Methods and Support Vector Machines
CSC2545 Topics in Machine Learning: Kernel Methods and Support Vector Machines A comprehensive introduc@on to SVMs and other kernel methods, including theory, algorithms and applica@ons. Instructor: Anthony
More informationKernel Methods & Support Vector Machines
Kernel Methods & Support Vector Machines Mahdi pakdaman Naeini PhD Candidate, University of Tehran Senior Researcher, TOSAN Intelligent Data Miners Outline Motivation Introduction to pattern recognition
More informationMachine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.
Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted
More information10/05/2016. Computational Methods for Data Analysis. Massimo Poesio SUPPORT VECTOR MACHINES. Support Vector Machines Linear classifiers
Computational Methods for Data Analysis Massimo Poesio SUPPORT VECTOR MACHINES Support Vector Machines Linear classifiers 1 Linear Classifiers denotes +1 denotes -1 w x + b>0 f(x,w,b) = sign(w x + b) How
More informationSUPPORT VECTOR MACHINE
SUPPORT VECTOR MACHINE Mainly based on https://nlp.stanford.edu/ir-book/pdf/15svm.pdf 1 Overview SVM is a huge topic Integration of MMDS, IIR, and Andrew Moore s slides here Our foci: Geometric intuition
More informationPattern Recognition 2018 Support Vector Machines
Pattern Recognition 2018 Support Vector Machines Ad Feelders Universiteit Utrecht Ad Feelders ( Universiteit Utrecht ) Pattern Recognition 1 / 48 Support Vector Machines Ad Feelders ( Universiteit Utrecht
More informationKernel methods for comparing distributions, measuring dependence
Kernel methods for comparing distributions, measuring dependence Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Principal component analysis Given a set of M centered observations
More informationDiscriminative Models
No.5 Discriminative Models Hui Jiang Department of Electrical Engineering and Computer Science Lassonde School of Engineering York University, Toronto, Canada Outline Generative vs. Discriminative models
More informationLINEAR CLASSIFICATION, PERCEPTRON, LOGISTIC REGRESSION, SVC, NAÏVE BAYES. Supervised Learning
LINEAR CLASSIFICATION, PERCEPTRON, LOGISTIC REGRESSION, SVC, NAÏVE BAYES Supervised Learning Linear vs non linear classifiers In K-NN we saw an example of a non-linear classifier: the decision boundary
More informationECE662: Pattern Recognition and Decision Making Processes: HW TWO
ECE662: Pattern Recognition and Decision Making Processes: HW TWO Purdue University Department of Electrical and Computer Engineering West Lafayette, INDIANA, USA Abstract. In this report experiments are
More informationSpace-Time Kernels. Dr. Jiaqiu Wang, Dr. Tao Cheng James Haworth University College London
Space-Time Kernels Dr. Jiaqiu Wang, Dr. Tao Cheng James Haworth University College London Joint International Conference on Theory, Data Handling and Modelling in GeoSpatial Information Science, Hong Kong,
More informationSupport Vector Machine (SVM) and Kernel Methods
Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2015 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin
More informationKernel Methods in Machine Learning
Kernel Methods in Machine Learning Autumn 2015 Lecture 1: Introduction Juho Rousu ICS-E4030 Kernel Methods in Machine Learning 9. September, 2015 uho Rousu (ICS-E4030 Kernel Methods in Machine Learning)
More informationIntroduction to SVM and RVM
Introduction to SVM and RVM Machine Learning Seminar HUS HVL UIB Yushu Li, UIB Overview Support vector machine SVM First introduced by Vapnik, et al. 1992 Several literature and wide applications Relevance
More informationSupport Vector Machine (SVM) and Kernel Methods
Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2014 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin
More informationConnection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis
Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis Alvina Goh Vision Reading Group 13 October 2005 Connection of Local Linear Embedding, ISOMAP, and Kernel Principal
More informationReview: Support vector machines. Machine learning techniques and image analysis
Review: Support vector machines Review: Support vector machines Margin optimization min (w,w 0 ) 1 2 w 2 subject to y i (w 0 + w T x i ) 1 0, i = 1,..., n. Review: Support vector machines Margin optimization
More informationKernel Methods. Barnabás Póczos
Kernel Methods Barnabás Póczos Outline Quick Introduction Feature space Perceptron in the feature space Kernels Mercer s theorem Finite domain Arbitrary domain Kernel families Constructing new kernels
More informationARTIFICIAL NEURAL NETWORKS گروه مطالعاتي 17 بهار 92
ARTIFICIAL NEURAL NETWORKS گروه مطالعاتي 17 بهار 92 BIOLOGICAL INSPIRATIONS Some numbers The human brain contains about 10 billion nerve cells (neurons) Each neuron is connected to the others through 10000
More information