Surrogate models for Single and Multi-Objective Stochastic Optimization: Integrating Support Vector Machines and Covariance-Matrix Adaptation-ES
1 Surrogate models for Single and Multi-Objective Stochastic Optimization: Integrating Support Vector Machines and Covariance-Matrix Adaptation-ES. Ilya Loshchilov, Marc Schoenauer, Michèle Sebag. TAO, CNRS, INRIA, Univ. Paris-Sud. May 23rd, 2011. Michèle Sebag, Surrogate optimization: SVM for CMA, 1/47
2 Motivations. Find Argmin {F : X → IR}. Context: ill-posed optimization problems. Fitness function F on X ⊆ IR^d; gradient not available or not useful; F available as a (continuous) black-box oracle. Build {x_1, x_2, ...} → Argmin(F). Black-box approaches: (+) applicable, (+) robust (comparison-based approaches are invariant); but high computational costs, measured in number of function evaluations.
3 Surrogate optimization. Principle: gather E = {(x_i, F(x_i))} (the training set) and learn a surrogate model F̂ from E. Use the surrogate model F̂ for some time: Optimization (use F̂ instead of the true F in a standard algorithm) or Filtering (select promising x_i based on F̂ in a population-based algorithm). Compute F(x_i) for some x_i, update F̂, and iterate.
4 Surrogate optimization, cont. Issues. Learning: hypothesis space (polynomials, neural nets, Gaussian processes, ...); selection of the training set (prune, update, ...); what is the learning target? Interaction of the Learning and Optimization modules: schedule (when to relearn); how to use F̂ to support the optimization search; how to use search results to support learning F̂. This talk: using covariance-matrix estimation within Support Vector Machines; using SVM for multi-objective optimization.
5 Content. 1 Covariance Matrix Adaptation-Evolution Strategy: Evolution Strategies; CMA-ES; the state of the art of (stochastic) optimization. 2 Statistical Machine Learning: linear classifiers; the kernel trick. 3 Previous Work: mixing Rank-SVM and local information; experiments. 4 Background: dominance-based surrogate; experimental validation.
6 Stochastic Search. A black-box search template to minimize f : IR^n → IR. Initialize distribution parameters θ, set sample size λ ∈ IN. While not terminate: 1. Sample distribution P(x | θ) → x_1, ..., x_λ ∈ IR^n. 2. Evaluate x_1, ..., x_λ on f. 3. Update parameters θ ← F_θ(θ, x_1, ..., x_λ, f(x_1), ..., f(x_λ)). This template covers deterministic algorithms, Evolutionary Algorithms, PSO, DE (with P implicitly defined by the variation operators), and Estimation of Distribution Algorithms.
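The sample/evaluate/update loop above can be sketched in a few lines. This is a minimal illustration, not any specific algorithm from the talk: the sampling distribution is an isotropic Gaussian, the update keeps the mean of the best half of the samples, and the geometric step-size decay is our own naive choice.

```python
import numpy as np

def sphere(x):
    # Toy fitness function standing in for the black-box f
    return float(np.dot(x, x))

def black_box_search(f, n, lam=10, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    m, sigma = np.ones(n), 1.0                  # distribution parameters theta
    for _ in range(iters):
        X = m + sigma * rng.standard_normal((lam, n))  # 1. sample P(x | theta)
        fvals = np.array([f(x) for x in X])            # 2. evaluate on f
        best = X[np.argsort(fvals)[: lam // 2]]        # 3. update theta from
        m = best.mean(axis=0)                          #    the best samples
        sigma *= 0.95                                  #    (naive step-size decay)
    return m

m_final = black_box_search(sphere, n=5)
```

Despite its simplicity, this already exhibits the comparison-based character of the template: only the ranking of f-values is used, never their magnitudes.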
7 The (µ, λ) Evolution Strategy. Gaussian mutations: x_i = m + σ N_i(0, C) for i = 1, ..., λ, as perturbations of m, where x_i, m ∈ IR^n, σ ∈ IR_+, and C ∈ IR^{n×n}: the mean vector m ∈ IR^n represents the favorite solution; the so-called step-size σ ∈ IR_+ controls the step length; the covariance matrix C ∈ IR^{n×n} determines the shape of the distribution ellipsoid. How to update m, σ, and C?
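Sampling x_i = m + σ N_i(0, C) can be done by drawing standard normal vectors and coloring them with a Cholesky factor of C. A minimal sketch with illustrative values for m, σ, and C (none taken from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam = 2, 1000
m = np.array([1.0, -1.0])      # favorite solution
sigma = 0.5                    # step-size
C = np.array([[2.0, 0.8],
              [0.8, 1.0]])     # covariance: shape of the distribution ellipsoid

A = np.linalg.cholesky(C)                 # C = A A^T
Z = rng.standard_normal((lam, n))         # z_i ~ N(0, I)
X = m + sigma * Z @ A.T                   # offspring x_1, ..., x_lambda

emp_cov = np.cov(X.T) / sigma**2          # empirical covariance, approximately C
```

The empirical covariance of the offspring (rescaled by σ²) recovers C up to sampling noise, confirming that the ellipsoid shape is controlled by C alone.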
8 History. The one-fifth rule (Rechenberg, 73): one single parameter σ for the whole population; measure the empirical success rate; increase σ if the success rate is too large, decrease σ if it is too small. Often wrong in non-smooth landscapes. Self-adaptive mutations (Schwefel, 81): each individual carries its own mutation parameters; log-normal mutation of the mutation parameters, then (normal) mutation of the individual. Adaptation is slow in the full-covariance case, where the number of parameters grows from 1 to (n² + n)/2.
9 Cumulative Step-Size Adaptation (CSA). x_i = m + σ y_i, m ← m + σ y_w. Measure the length of the evolution path, the pathway of the mean vector m over the generation sequence: if the path is short, decrease σ; if it is long, increase σ. Loosely speaking, consecutive steps are perpendicular under random selection (in expectation), and perpendicular in the desired situation (to be most efficient).
10 Covariance Matrix Adaptation, Rank-One Update. m ← m + σ y_w, with y_w = Σ_{i=1}^{µ} w_i y_{i:λ}, y_i ~ N_i(0, C). Starting from the initial distribution C = I, the new distribution is C ← 0.8 C + 0.2 y_w y_w^T, a mixture of the old distribution C and the step y_w. Ruling principle: the adaptation increases the probability of the successful step y_w appearing again.
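The rank-one update above fits in a few lines of numpy. This is a single-generation sketch using the slide's illustrative 0.8/0.2 mixing rates and equal weights w_i = 1/µ; the sphere fitness is our stand-in objective:

```python
import numpy as np

rng = np.random.default_rng(2)
n, lam, mu = 2, 10, 5
C = np.eye(n)                                        # initial distribution, C = I
m = np.array([3.0, 3.0])

def sphere(x):
    return float(x @ x)

Y = rng.multivariate_normal(np.zeros(n), C, size=lam)   # y_i ~ N(0, C)
order = np.argsort([sphere(m + y) for y in Y])          # rank the offspring
w = np.full(mu, 1.0 / mu)                               # equal weights w_i = 1/mu
y_w = (w[:, None] * Y[order[:mu]]).sum(axis=0)          # weighted mean step
C = 0.8 * C + 0.2 * np.outer(y_w, y_w)                  # rank-one update
m = m + y_w                                             # move the mean (sigma = 1)
```

Since y_w y_w^T is positive semi-definite and the old C is positive definite, the mixture stays positive definite, so the updated C remains a valid covariance matrix.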
17 Rank-µ Update. x_i = m + σ y_i, y_i ~ N_i(0, C); m ← m + σ y_w, y_w = Σ_{i=1}^{µ} w_i y_{i:λ}. With equal weights w_1 = ... = w_µ = 1/µ: C_µ = (1/µ) Σ_{i=1}^{µ} y_{i:λ} y_{i:λ}^T, C ← (1 − 1)·C + 1·C_µ, m_new ← m + (1/µ) Σ_{i=1}^{µ} y_{i:λ}. Illustration: sampling λ = 150 solutions with C = I and σ = 1, then calculating C_µ from the µ = 50 best points, gives the new distribution. Remark: the old (sample) distribution shape has a great influence on the new distribution; several iterations are needed.
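The rank-µ estimator can be sketched directly with the slide's illustrative numbers (λ = 150, µ = 50, C = I, σ = 1, equal weights); the sphere fitness again stands in for the objective:

```python
import numpy as np

rng = np.random.default_rng(3)
n, lam, mu = 2, 150, 50
m, sigma = np.array([2.0, 0.0]), 1.0

Y = rng.standard_normal((lam, n))            # y_i ~ N(0, I) since C = I
f = ((m + sigma * Y) ** 2).sum(axis=1)       # sphere fitness of the offspring
sel = Y[np.argsort(f)[:mu]]                  # mu best steps y_{i:lambda}
C_mu = sel.T @ sel / mu                      # rank-mu estimator (1/mu) sum y y^T
m_new = m + sigma * sel.mean(axis=0)         # m <- m + (1/mu) sum y_{i:lambda}
```

Because the µ selected steps all point preferentially toward better regions, both the new mean and the estimated covariance reflect the locally successful search directions.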
18 Covariance Matrix Adaptation-Evolution Strategy. Invariance: a guarantee for generalization. Invariance properties of CMA-ES: invariance to order-preserving transformations in function space (like all comparison-based algorithms); translation and rotation invariance (rigid transformations of the search space). CMA-ES is almost parameterless: tuned on a small set of functions (Hansen & Ostermeier 2001), its default values generalize to whole classes of problems. Exception: the population size for multi-modal functions; but try Restart-CMA-ES (Auger & Hansen, 2005).
19 State-of-the-art Results. BBOB (Black-Box Optimization Benchmarking), ACM-GECCO workshop in 2009 and 2010. A set of 25 benchmark functions, dimensions 2 to 40, with known difficulties (ill-conditioning, non-separability, ...); noisy and non-noisy versions. Competitors include: BFGS (Matlab version), Fletcher-Powell, DFO (Derivative-Free Optimization, Powell 04), Differential Evolution, Particle Swarm Optimization, and many more.
20 Contents. 1 Covariance Matrix Adaptation-Evolution Strategy. 2 Statistical Machine Learning: linear classifiers; the kernel trick. 3 Previous Work: mixing Rank-SVM and local information; experiments. 4 Background: dominance-based surrogate; experimental validation.
21 Supervised Machine Learning. Context: the universe provides instances x_i; an oracle provides labels y_i. Input: training set E = {(x_i, y_i), i = 1...n, x_i ∈ X, y_i ∈ Y}. Output: hypothesis h : X → Y. Criterion: quality of h.
22 Supervised Machine Learning, 2. Definitions. E = {(x_i, y_i), x_i ∈ X, y_i ∈ Y, i = 1...n}. Classification: Y finite (e.g., failure/ok); Regression: Y = IR (e.g., time to failure). Hypothesis space H : X → Y. Tasks: select H (model selection); assess h ∈ H (expected accuracy IE[h(x) ≠ y]); find the h in H minimizing the expected loss, h* = Argmin{IE[ℓ(h(x), y)], h ∈ H}.
23 Dilemma. Fitting the data: the bias/variance tradeoff.
24 Statistical Machine Learning. Minimize the expected loss IE[ℓ(h(x), y)]. Principle: if h is well-behaved on the training set, if the training set is representative, and if h is regular, then h is well-behaved in expectation: IE[F] ≤ (1/n) Σ_{i=1}^{n} F(x_i) + c(F, n).
25 Linear classification: the noiseless case. H : X ⊆ IR^d → IR; prediction = sgn(h(x)), with h(x) = ⟨w, x⟩ + b.
26 Linear classification: the noiseless case, 2. Constraints: y_i(⟨w, x_i⟩ + b) ≥ 1 for all i (margin ≥ 0). Maximize the minimum margin, 2/||w||; the examples achieving it are the support vectors. Formalisation: minimize (1/2)||w||² subject to y_i(⟨w, x_i⟩ + b) ≥ 1 for all i.
27 Linear classification: the noiseless case, 3. Primal form: minimize (1/2)||w||² subject to y_i(⟨w, x_i⟩ + b) ≥ 1 for all i. Using Lagrange multipliers: min_{w,b} max_{α} { (1/2)||w||² − Σ_{i=1}^{n} α_i [y_i(⟨w, x_i⟩ + b) − 1] }. Dual form: maximize over α: Σ_{i=1}^{n} α_i − (1/2) Σ_{i,j} α_i α_j y_i y_j ⟨x_i, x_j⟩, subject to α_i ≥ 0, i = 1...n; with w = Σ_i α_i y_i x_i. Optimization: quadratic programming.
28 Linear classification: the noisy case. Allow constraint violations; introduce slack variables ξ_i. Primal form: minimize (1/2)||w||² + C Σ_{i=1}^{n} ξ_i subject to y_i(⟨w, x_i⟩ + b) ≥ 1 − ξ_i and ξ_i ≥ 0 for all i. With Lagrange multipliers: min_{w,b,ξ} max_{α,β} { (1/2)||w||² + C Σ_{i=1}^{n} ξ_i − Σ_{i=1}^{n} α_i [y_i(⟨w, x_i⟩ + b) − 1 + ξ_i] − Σ_{i=1}^{n} β_i ξ_i }. Dual form: maximize over α: Σ_{i=1}^{n} α_i − (1/2) Σ_{i,j} α_i α_j y_i y_j ⟨x_i, x_j⟩, subject to 0 ≤ α_i ≤ C, i = 1...n, and Σ_i α_i y_i = 0. Solution (the x_i with α_i > 0 are the support vectors): h(x) = ⟨w, x⟩ = Σ_i α_i y_i ⟨x_i, x⟩.
29 The kernel trick. Intuition: map X into a feature space Ω, e.g. Φ : x = (x_1, x_2) ↦ (x_1², √2 x_1 x_2, x_2²). Principle: choose Φ and K such that ⟨Φ(x), Φ(x′)⟩ = K(x, x′).
30 The kernel trick, 2. SVM only considers the scalar product: h(x) = Σ_i α_i y_i ⟨x_i, x⟩ (linear case); h(x) = Σ_i α_i y_i K(x_i, x) (kernel trick). PROS: a rich hypothesis space; no computational overhead, since no explicit mapping onto the feature space is needed. Open problem: kernel design.
31 The kernel trick, 3. Kernels. Polynomial: k(x_i, x_j) = (⟨x_i, x_j⟩ + 1)^d. Gaussian or Radial Basis Function: k(x_i, x_j) = exp(−||x_i − x_j||² / (2σ²)). Hyperbolic tangent: k(x_i, x_j) = tanh(κ⟨x_i, x_j⟩ + c). Examples for polynomial (left) and Gaussian (right) kernels.
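The three kernels above translate directly into code. A minimal numpy sketch; the parameter names (d, sigma, kappa, c) follow the slide, and the sample points are illustrative:

```python
import numpy as np

def poly_kernel(xi, xj, d=2):
    # (<x_i, x_j> + 1)^d
    return (np.dot(xi, xj) + 1.0) ** d

def rbf_kernel(xi, xj, sigma=1.0):
    # exp(-||x_i - x_j||^2 / (2 sigma^2))
    diff = np.asarray(xi) - np.asarray(xj)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma**2))

def tanh_kernel(xi, xj, kappa=1.0, c=0.0):
    # tanh(kappa <x_i, x_j> + c)
    return np.tanh(kappa * np.dot(xi, xj) + c)

# The RBF kernel of a point with itself is always 1, and decays with distance.
k_same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
k_far = rbf_kernel([1.0, 2.0], [4.0, 6.0])
```

Note that the RBF kernel only depends on the distance ||x_i − x_j||, which is what makes the covariance-based change of metric later in the talk so natural for it.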
32 Rank-based SVM. Learning to order things: on the training set E = {x_i, i = 1...n}, the expert gives preferences (x_{i_k} ≻ x_{j_k}), k = 1...K (underconstrained regression). Order constraints, primal form: minimize (1/2)||w||² + C Σ_{k=1}^{K} ξ_k subject to ⟨w, x_{i_k}⟩ − ⟨w, x_{j_k}⟩ ≥ 1 − ξ_k for all k.
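In the surrogate setting, the "expert" preferences come from fitness comparisons: x_i ≻ x_j iff F(x_i) < F(x_j). A pure-Python sketch of building such ordering constraints; restricting to consecutive ranks is our own illustrative choice of pair selection, not prescribed by the slide:

```python
def preference_pairs(points, fitness):
    """Return (better, worse) index pairs between consecutive fitness ranks."""
    # Sort indices by fitness (smaller = better, i.e. minimization)
    order = sorted(range(len(points)), key=lambda i: fitness[i])
    # One ordering constraint per pair of adjacent ranks
    return [(order[k], order[k + 1]) for k in range(len(order) - 1)]

pts = [[0.0], [2.0], [1.0]]
fit = [0.0, 4.0, 1.0]                  # e.g. sphere values
pairs = preference_pairs(pts, fit)     # each pair (i, j) means x_i is preferred to x_j
```

Each returned pair (i_k, j_k) becomes one constraint ⟨w, x_{i_k}⟩ − ⟨w, x_{j_k}⟩ ≥ 1 − ξ_k in the primal problem above; consecutive ranks give n − 1 constraints instead of the O(n²) of all pairs.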
33 Contents. 1 Covariance Matrix Adaptation-Evolution Strategy. 2 Statistical Machine Learning. 3 Previous Work: mixing Rank-SVM and local information; experiments. 4 Background: dominance-based surrogate; experimental validation.
34 Surrogate Models for CMA-ES. lmm-CMA-ES: build a full quadratic meta-model around the current point, weighted by the Mahalanobis distance from the covariance matrix. Speed-up: a factor of 2-3 for n 4. Complexity: from O(n^4) to O(n^6) (intractable for n > 16). Rank-invariance is lost. S. Kern et al. (2006), "Local Meta-Models for Optimization Using Evolution Strategies"; Z. Bouzarkouna et al. (2010), "Investigating the lmm-CMA-ES for Large Population Sizes".
35 Surrogate Models for CMA-ES, cont. Using Rank-SVM: builds a global model using Rank-SVM, with x_i ≻ x_j iff F(x_i) < F(x_j). Kernel and parameters are highly problem-dependent. Note: no use of information from the current state of CMA-ES. T. Runarsson (2006), "Ordinal Regression in Evolutionary Computation". ACM algorithm: use C from CMA-ES as the metric of the Gaussian kernel. I. Loshchilov et al. (2010), "Comparison-based optimizers need comparison-based surrogates".
36 Model Learning. Non-separable Ellipsoid problem. [Figure: Rank-SVM regression in the original coordinate system vs. in the transformed coordinate system.] Rank-SVM regression in the original coordinate system; Rank-SVM regression in the transformed coordinate system given by the current covariance matrix C and mean m: x′ = C^{−1/2}(x − m).
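The change of coordinates x′ = C^{−1/2}(x − m) can be computed from the eigendecomposition of C. A minimal sketch with an illustrative diagonal C, so the result is easy to check by hand:

```python
import numpy as np

C = np.array([[4.0, 0.0],
              [0.0, 1.0]])      # current covariance matrix (illustrative)
m = np.array([1.0, 1.0])        # current mean (illustrative)

# C^{-1/2} via the eigendecomposition C = V diag(vals) V^T
vals, vecs = np.linalg.eigh(C)
C_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T

x = np.array([3.0, 2.0])
x_prime = C_inv_sqrt @ (x - m)  # transformed coordinates
```

In the transformed space the sampling distribution becomes isotropic, so the RBF kernel's Euclidean distance matches the metric CMA-ES has learned; this is exactly why the model is trained on x′ rather than x.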
37 Using the Surrogate Model. Optimization: significant speed-up, provided the model is global and accurate. Filtering: "guaranteed" speed-up: prescreen the pre-children with the surrogate, evaluate only λ of them with the true fitness, retaining them according to their surrogate rank. [Figure: probability density of retention as a function of rank.]
38 ACM-ES Optimization Loop. A. Select training points: select the best k training points. B. Build a surrogate model: the change of coordinates, defined from the current covariance matrix and the current mean value, reads x′ = C^{−1/2}(x − m) [4]; build a surrogate model using Rank-SVM. C. Generate pre-children and rank them according to the surrogate fitness function. D. Select the most promising children: prescreen, evaluate λ of them, add the new λ training points, and update the parameters of CMA-ES. [Figure: rank-based surrogate model; retention probability density vs. rank.]
39 Parameters. SVM learning: number of training points N_training = 30 d for all problems, except Rosenbrock and Rastrigin, where N_training = 70 d; number of iterations N_iter = d; kernel function: RBF with σ equal to the average distance between the training points; cost of constraint violation: C_i = 10^6 (N_training − i)^{2.0}. Offspring selection: number of test points N_test = 500; number of evaluated offspring: λ = λ/3; offspring selection pressure parameters: σ²_sel0 = 2σ²_sel1 = 0.8.
40 Results: Speed-up. [Figure: speed-up vs. problem dimension on Schwefel, Schwefel^{1/4}, Rosenbrock, Noisy Sphere, Ackley, and Ellipsoid. Table: per function, dimension n, λ, λ′, number of evaluations, and speed-up of ACM-ES over CMA-ES, for Schwefel, Schwefel^{1/4}, Rosenbrock, NoisySphere, Ackley, Ellipsoid, and Rastrigin.]
41 Results: Learning Time. The cost of model learning/testing increases quasi-linearly with d on the Sphere function. [Figure: learning time (ms) vs. problem dimension, log-log scale.]
42 ACM-ES: Conclusion. ACM-ES is 2 to 4 times faster on uni-modal problems; invariance to rank-preserving transformations is preserved; the computational complexity is O(n). Available online. Open issues: extension to multi-modal optimization; on-line adaptation of the selection pressure and of the surrogate model complexity.
43 Contents. 1 Covariance Matrix Adaptation-Evolution Strategy. 2 Statistical Machine Learning. 3 Previous Work. 4 Background: dominance-based surrogate; experimental validation.
44 Multi-objective CMA-ES (MO-CMA-ES). MO-CMA-ES = µ_mo independent (1+1)-CMA-ES. Each (1+1)-CMA-ES samples a new offspring; the temporary population has size 2µ_mo. Only the µ_mo best solutions are kept for the new population, after hypervolume-based non-dominated sorting; then the CMA individuals are updated. [Figure: dominated points vs. Pareto front in objective space.]
45 A Multi-Objective Surrogate Model. Rationale: find a single function F(x) that defines the aggregated quality of a solution x in the multi-objective case. The idea was originally proposed using a mixture of One-Class SVM and regression SVM (I. Loshchilov, M. Schoenauer, M. Sebag, GECCO 2010, "A Mono Surrogate for Multiobjective Optimization"). [Figure: dominated points, current Pareto front, and new Pareto front in objective space; the surrogate F_SVM takes value p−ε on one side of the front and p+ε on the other, a margin of 2ε.]
46 Using the Surrogate Model. Filtering: generate N_inform pre-children; for each pre-child A and its nearest parent B, calculate Gain(A, B) = F_svm(A) − F_svm(B); the new child is the point with the maximum Gain. [Figure: true Pareto front vs. SVM Pareto front.]
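The filtering rule above can be sketched in plain Python. The surrogate f_svm below is a toy stand-in (larger = better), not the learned model, and measuring "nearest parent" by Euclidean distance is our own assumption:

```python
def nearest(parents, child):
    # Nearest parent under squared Euclidean distance (illustrative choice)
    return min(parents, key=lambda p: sum((a - b) ** 2 for a, b in zip(p, child)))

def select_child(parents, pre_children, f_svm):
    # Gain(A, B) = F_svm(A) - F_svm(B), B the nearest parent of pre-child A
    def gain(child):
        return f_svm(child) - f_svm(nearest(parents, child))
    return max(pre_children, key=gain)

def f_svm(x):
    # Toy surrogate: larger is better (closer to the origin)
    return -(x[0] ** 2 + x[1] ** 2)

parents = [(2.0, 0.0), (0.0, 2.0)]
pre_children = [(1.5, 0.0), (0.5, 0.0), (0.0, 1.9)]
best = select_child(parents, pre_children, f_svm)
```

Scoring each pre-child relative to its own nearest parent, rather than by raw surrogate value, rewards local improvement and so avoids collapsing the whole population toward the single point the surrogate likes best.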
47 Dominance-Based Surrogate Using Rank-SVM. Which ordered pairs? Considering all possible relations may be too expensive. Primary constraints: x and its nearest dominated point. Secondary constraints: any two points not belonging to the same front (according to non-dominated sorting). Use all primary constraints, and a limited number of secondary constraints. [Figure: points a-f in objective space, with primary and secondary ">" constraints on F_SVM.]
48 Dominance-Based Surrogate (2). Construction of the surrogate model: initialize the archive Ω_active as the set of primary constraints, and Ω_passive as the set of secondary constraints; learn the model for 1000·|Ω_active| iterations; add the most violated passive constraint from Ω_passive to Ω_active and optimize the model for 10·|Ω_active| iterations; repeat the last step 0.1·|Ω_active| times.
49 Experimental Validation: Parameters. Surrogate models: ASM (aggregated surrogate model based on One-Class SVM and regression SVM); RASM (the proposed rank-based SVM). SVM learning: number of training points: at most N_training = 1000; number of iterations: 1000·|Ω_active| + |Ω_active|², at most 2N²_training; kernel function: RBF with σ equal to the average distance between the training points; cost of constraint violation: C = 1000. Offspring selection: number of pre-children p = 2 and p = 10.
50 Experimental Validation: Comparative Results. ASM and rank-based ASM applied on top of NSGA-II (with hypervolume as secondary criterion) and MO-CMA-ES, on the ZDT and IHR functions. [Table: for each problem, how many more true evaluations each variant needs than the best performer.]
51 Discussion. Results on the ZDT and IHR problems: the Rank-SVM versions are about 1.5 times faster for p = 2, and faster still for p = 10, before the algorithm reaches nearly-optimal Pareto points; then premature convergence of the approximation of the optimal µ-distribution occurs: the surrogate model only enforces convergence toward the Pareto front, but does not care about diversity.
52 Dominance-based Surrogate: Conclusion. The proposed aggregated surrogate model is invariant to order-preserving transformations of the objective functions. The speed-up is significant, but limited to the convergence to the optimal Pareto front. Thanks to the flexibility of SVM, "any" kind of preference can be taken into account: extreme points, "=" relations, hypervolume contribution, relations defined by the decision maker. [Figure: points a-g in objective space with primary/secondary ">" and "=" constraints on F_SVM.]
53 Machine Learning for Optimization: Discussion. Learning about the landscape is easy: using the available samples; using prior knowledge / constraints. Using it to speed up the search is more difficult: the exploration vs. exploitation dilemma; multi-modal optimization. Still, very doable.
Machine Learning Fall 2017 Kernels (Kernels, Kernelized Perceptron and SVM) Professor Liang Huang (Chap. 12 of CIML) Nonlinear Features x4: -1 x1: +1 x3: +1 x2: -1 Concatenated (combined) features XOR:
More informationIntroduction to Randomized Black-Box Numerical Optimization and CMA-ES
Introduction to Randomized Black-Box Numerical Optimization and CMA-ES July 3, 2017 CEA/EDF/Inria summer school "Numerical Analysis" Université Pierre-et-Marie-Curie, Paris, France Anne Auger, Asma Atamna,
More informationSupport Vector Machine (SVM) and Kernel Methods
Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2014 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin
More informationMax Margin-Classifier
Max Margin-Classifier Oliver Schulte - CMPT 726 Bishop PRML Ch. 7 Outline Maximum Margin Criterion Math Maximizing the Margin Non-Separable Data Kernels and Non-linear Mappings Where does the maximization
More informationBenchmarking Gaussian Processes and Random Forests on the BBOB Noiseless Testbed
Benchmarking Gaussian Processes and Random Forests on the BBOB Noiseless Testbed Lukáš Bajer,, Zbyněk Pitra,, Martin Holeňa Faculty of Mathematics and Physics, Charles University, Institute of Computer
More informationSupport'Vector'Machines. Machine(Learning(Spring(2018 March(5(2018 Kasthuri Kannan
Support'Vector'Machines Machine(Learning(Spring(2018 March(5(2018 Kasthuri Kannan kasthuri.kannan@nyumc.org Overview Support Vector Machines for Classification Linear Discrimination Nonlinear Discrimination
More informationMirrored Variants of the (1,4)-CMA-ES Compared on the Noiseless BBOB-2010 Testbed
Author manuscript, published in "GECCO workshop on Black-Box Optimization Benchmarking (BBOB') () 9-" DOI :./8.8 Mirrored Variants of the (,)-CMA-ES Compared on the Noiseless BBOB- Testbed [Black-Box Optimization
More informationOptimisation numérique par algorithmes stochastiques adaptatifs
Optimisation numérique par algorithmes stochastiques adaptatifs Anne Auger M2R: Apprentissage et Optimisation avancés et Applications anne.auger@inria.fr INRIA Saclay - Ile-de-France, project team TAO
More informationClustering. Professor Ameet Talwalkar. Professor Ameet Talwalkar CS260 Machine Learning Algorithms March 8, / 26
Clustering Professor Ameet Talwalkar Professor Ameet Talwalkar CS26 Machine Learning Algorithms March 8, 217 1 / 26 Outline 1 Administration 2 Review of last lecture 3 Clustering Professor Ameet Talwalkar
More informationSupport Vector Machines.
Support Vector Machines www.cs.wisc.edu/~dpage 1 Goals for the lecture you should understand the following concepts the margin slack variables the linear support vector machine nonlinear SVMs the kernel
More informationClassifier Complexity and Support Vector Classifiers
Classifier Complexity and Support Vector Classifiers Feature 2 6 4 2 0 2 4 6 8 RBF kernel 10 10 8 6 4 2 0 2 4 6 Feature 1 David M.J. Tax Pattern Recognition Laboratory Delft University of Technology D.M.J.Tax@tudelft.nl
More informationNatural Evolution Strategies for Direct Search
Tobias Glasmachers Natural Evolution Strategies for Direct Search 1 Natural Evolution Strategies for Direct Search PGMO-COPI 2014 Recent Advances on Continuous Randomized black-box optimization Thursday
More informationCS6375: Machine Learning Gautam Kunapuli. Support Vector Machines
Gautam Kunapuli Example: Text Categorization Example: Develop a model to classify news stories into various categories based on their content. sports politics Use the bag-of-words representation for this
More informationDiscriminative Models
No.5 Discriminative Models Hui Jiang Department of Electrical Engineering and Computer Science Lassonde School of Engineering York University, Toronto, Canada Outline Generative vs. Discriminative models
More informationIntroduction to Support Vector Machines
Introduction to Support Vector Machines Shivani Agarwal Support Vector Machines (SVMs) Algorithm for learning linear classifiers Motivated by idea of maximizing margin Efficient extension to non-linear
More informationIntroduction to Logistic Regression and Support Vector Machine
Introduction to Logistic Regression and Support Vector Machine guest lecturer: Ming-Wei Chang CS 446 Fall, 2009 () / 25 Fall, 2009 / 25 Before we start () 2 / 25 Fall, 2009 2 / 25 Before we start Feel
More informationInvestigating the Local-Meta-Model CMA-ES for Large Population Sizes
Investigating the Local-Meta-Model CMA-ES for Large Population Sizes Zyed Bouzarkouna, Anne Auger, Didier Yu Ding To cite this version: Zyed Bouzarkouna, Anne Auger, Didier Yu Ding. Investigating the Local-Meta-Model
More informationSupport Vector Machine & Its Applications
Support Vector Machine & Its Applications A portion (1/3) of the slides are taken from Prof. Andrew Moore s SVM tutorial at http://www.cs.cmu.edu/~awm/tutorials Mingyue Tan The University of British Columbia
More informationEE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015
EE613 Machine Learning for Engineers Kernel methods Support Vector Machines jean-marc odobez 2015 overview Kernel methods introductions and main elements defining kernels Kernelization of k-nn, K-Means,
More informationSupport Vector Machine (SVM) and Kernel Methods
Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2015 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin
More informationSupport Vector Machines
Support Vector Machines Some material on these is slides borrowed from Andrew Moore's excellent machine learning tutorials located at: http://www.cs.cmu.edu/~awm/tutorials/ Where Should We Draw the Line????
More informationIntroduction to Support Vector Machines
Introduction to Support Vector Machines Hsuan-Tien Lin Learning Systems Group, California Institute of Technology Talk in NTU EE/CS Speech Lab, November 16, 2005 H.-T. Lin (Learning Systems Group) Introduction
More informationLecture Support Vector Machine (SVM) Classifiers
Introduction to Machine Learning Lecturer: Amir Globerson Lecture 6 Fall Semester Scribe: Yishay Mansour 6.1 Support Vector Machine (SVM) Classifiers Classification is one of the most important tasks in
More informationCSC242: Intro to AI. Lecture 21
CSC242: Intro to AI Lecture 21 Administrivia Project 4 (homeworks 18 & 19) due Mon Apr 16 11:59PM Posters Apr 24 and 26 You need an idea! You need to present it nicely on 2-wide by 4-high landscape pages
More informationBounding the Population Size of IPOP-CMA-ES on the Noiseless BBOB Testbed
Bounding the Population Size of OP-CMA-ES on the Noiseless BBOB Testbed Tianjun Liao IRIDIA, CoDE, Université Libre de Bruxelles (ULB), Brussels, Belgium tliao@ulb.ac.be Thomas Stützle IRIDIA, CoDE, Université
More informationMidterm Review CS 6375: Machine Learning. Vibhav Gogate The University of Texas at Dallas
Midterm Review CS 6375: Machine Learning Vibhav Gogate The University of Texas at Dallas Machine Learning Supervised Learning Unsupervised Learning Reinforcement Learning Parametric Y Continuous Non-parametric
More informationLecture Notes on Support Vector Machine
Lecture Notes on Support Vector Machine Feng Li fli@sdu.edu.cn Shandong University, China 1 Hyperplane and Margin In a n-dimensional space, a hyper plane is defined by ω T x + b = 0 (1) where ω R n is
More informationSupport Vector Machine
Support Vector Machine Fabrice Rossi SAMM Université Paris 1 Panthéon Sorbonne 2018 Outline Linear Support Vector Machine Kernelized SVM Kernels 2 From ERM to RLM Empirical Risk Minimization in the binary
More informationSupport Vector Machine (continued)
Support Vector Machine continued) Overlapping class distribution: In practice the class-conditional distributions may overlap, so that the training data points are no longer linearly separable. We need
More informationSupport Vector Machines
Wien, June, 2010 Paul Hofmarcher, Stefan Theussl, WU Wien Hofmarcher/Theussl SVM 1/21 Linear Separable Separating Hyperplanes Non-Linear Separable Soft-Margin Hyperplanes Hofmarcher/Theussl SVM 2/21 (SVM)
More informationStochastic optimization and a variable metric approach
The challenges for stochastic optimization and a variable metric approach Microsoft Research INRIA Joint Centre, INRIA Saclay April 6, 2009 Content 1 Introduction 2 The Challenges 3 Stochastic Search 4
More informationPattern Recognition 2018 Support Vector Machines
Pattern Recognition 2018 Support Vector Machines Ad Feelders Universiteit Utrecht Ad Feelders ( Universiteit Utrecht ) Pattern Recognition 1 / 48 Support Vector Machines Ad Feelders ( Universiteit Utrecht
More informationStatistical Machine Learning from Data
Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Support Vector Machines Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique
More informationData Mining - SVM. Dr. Jean-Michel RICHER Dr. Jean-Michel RICHER Data Mining - SVM 1 / 55
Data Mining - SVM Dr. Jean-Michel RICHER 2018 jean-michel.richer@univ-angers.fr Dr. Jean-Michel RICHER Data Mining - SVM 1 / 55 Outline 1. Introduction 2. Linear regression 3. Support Vector Machine 4.
More informationMachine Learning. Support Vector Machines. Manfred Huber
Machine Learning Support Vector Machines Manfred Huber 2015 1 Support Vector Machines Both logistic regression and linear discriminant analysis learn a linear discriminant function to separate the data
More informationThe Kernel Trick, Gram Matrices, and Feature Extraction. CS6787 Lecture 4 Fall 2017
The Kernel Trick, Gram Matrices, and Feature Extraction CS6787 Lecture 4 Fall 2017 Momentum for Principle Component Analysis CS6787 Lecture 3.1 Fall 2017 Principle Component Analysis Setting: find the
More informationFoundation of Intelligent Systems, Part I. SVM s & Kernel Methods
Foundation of Intelligent Systems, Part I SVM s & Kernel Methods mcuturi@i.kyoto-u.ac.jp FIS - 2013 1 Support Vector Machines The linearly-separable case FIS - 2013 2 A criterion to select a linear classifier:
More informationSupport Vector Machines. Introduction to Data Mining, 2 nd Edition by Tan, Steinbach, Karpatne, Kumar
Data Mining Support Vector Machines Introduction to Data Mining, 2 nd Edition by Tan, Steinbach, Karpatne, Kumar 02/03/2018 Introduction to Data Mining 1 Support Vector Machines Find a linear hyperplane
More informationNon-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines
Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2018 CS 551, Fall
More information(Kernels +) Support Vector Machines
(Kernels +) Support Vector Machines Machine Learning Torsten Möller Reading Chapter 5 of Machine Learning An Algorithmic Perspective by Marsland Chapter 6+7 of Pattern Recognition and Machine Learning
More informationScale-Invariance of Support Vector Machines based on the Triangular Kernel. Abstract
Scale-Invariance of Support Vector Machines based on the Triangular Kernel François Fleuret Hichem Sahbi IMEDIA Research Group INRIA Domaine de Voluceau 78150 Le Chesnay, France Abstract This paper focuses
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Support Vector Machine (SVM) Hamid R. Rabiee Hadi Asheri, Jafar Muhammadi, Nima Pourdamghani Spring 2013 http://ce.sharif.edu/courses/91-92/2/ce725-1/ Agenda Introduction
More informationStatistical learning theory, Support vector machines, and Bioinformatics
1 Statistical learning theory, Support vector machines, and Bioinformatics Jean-Philippe.Vert@mines.org Ecole des Mines de Paris Computational Biology group ENS Paris, november 25, 2003. 2 Overview 1.
More informationKernel Machines. Pradeep Ravikumar Co-instructor: Manuela Veloso. Machine Learning
Kernel Machines Pradeep Ravikumar Co-instructor: Manuela Veloso Machine Learning 10-701 SVM linearly separable case n training points (x 1,, x n ) d features x j is a d-dimensional vector Primal problem:
More informationLecture 10: A brief introduction to Support Vector Machine
Lecture 10: A brief introduction to Support Vector Machine Advanced Applied Multivariate Analysis STAT 2221, Fall 2013 Sungkyu Jung Department of Statistics, University of Pittsburgh Xingye Qiao Department
More informationSupport Vector Machines
Support Vector Machines Reading: Ben-Hur & Weston, A User s Guide to Support Vector Machines (linked from class web page) Notation Assume a binary classification problem. Instances are represented by vector
More informationLinear, threshold units. Linear Discriminant Functions and Support Vector Machines. Biometrics CSE 190 Lecture 11. X i : inputs W i : weights
Linear Discriminant Functions and Support Vector Machines Linear, threshold units CSE19, Winter 11 Biometrics CSE 19 Lecture 11 1 X i : inputs W i : weights θ : threshold 3 4 5 1 6 7 Courtesy of University
More informationDiscriminative Models
No.5 Discriminative Models Hui Jiang Department of Electrical Engineering and Computer Science Lassonde School of Engineering York University, Toronto, Canada Outline Generative vs. Discriminative models
More informationTDT4173 Machine Learning
TDT4173 Machine Learning Lecture 3 Bagging & Boosting + SVMs Norwegian University of Science and Technology Helge Langseth IT-VEST 310 helgel@idi.ntnu.no 1 TDT4173 Machine Learning Outline 1 Ensemble-methods
More informationAddressing Numerical Black-Box Optimization: CMA-ES (Tutorial)
Addressing Numerical Black-Box Optimization: CMA-ES (Tutorial) Anne Auger & Nikolaus Hansen INRIA Research Centre Saclay Île-de-France Project team TAO University Paris-Sud, LRI (UMR 8623), Bat. 490 91405
More informationMidterm Review CS 7301: Advanced Machine Learning. Vibhav Gogate The University of Texas at Dallas
Midterm Review CS 7301: Advanced Machine Learning Vibhav Gogate The University of Texas at Dallas Supervised Learning Issues in supervised learning What makes learning hard Point Estimation: MLE vs Bayesian
More informationMachine Learning Support Vector Machines. Prof. Matteo Matteucci
Machine Learning Support Vector Machines Prof. Matteo Matteucci Discriminative vs. Generative Approaches 2 o Generative approach: we derived the classifier from some generative hypothesis about the way
More informationMachine Learning Basics: Stochastic Gradient Descent. Sargur N. Srihari
Machine Learning Basics: Stochastic Gradient Descent Sargur N. srihari@cedar.buffalo.edu 1 Topics 1. Learning Algorithms 2. Capacity, Overfitting and Underfitting 3. Hyperparameters and Validation Sets
More informationBasis Expansion and Nonlinear SVM. Kai Yu
Basis Expansion and Nonlinear SVM Kai Yu Linear Classifiers f(x) =w > x + b z(x) = sign(f(x)) Help to learn more general cases, e.g., nonlinear models 8/7/12 2 Nonlinear Classifiers via Basis Expansion
More informationModelli Lineari (Generalizzati) e SVM
Modelli Lineari (Generalizzati) e SVM Corso di AA, anno 2018/19, Padova Fabio Aiolli 19/26 Novembre 2018 Fabio Aiolli Modelli Lineari (Generalizzati) e SVM 19/26 Novembre 2018 1 / 36 Outline Linear methods
More informationLecture 14 : Online Learning, Stochastic Gradient Descent, Perceptron
CS446: Machine Learning, Fall 2017 Lecture 14 : Online Learning, Stochastic Gradient Descent, Perceptron Lecturer: Sanmi Koyejo Scribe: Ke Wang, Oct. 24th, 2017 Agenda Recap: SVM and Hinge loss, Representer
More informationIncorporating detractors into SVM classification
Incorporating detractors into SVM classification AGH University of Science and Technology 1 2 3 4 5 (SVM) SVM - are a set of supervised learning methods used for classification and regression SVM maximal
More informationMachine Learning. Support Vector Machines. Fabio Vandin November 20, 2017
Machine Learning Support Vector Machines Fabio Vandin November 20, 2017 1 Classification and Margin Consider a classification problem with two classes: instance set X = R d label set Y = { 1, 1}. Training
More informationSupport Vector Machines II. CAP 5610: Machine Learning Instructor: Guo-Jun QI
Support Vector Machines II CAP 5610: Machine Learning Instructor: Guo-Jun QI 1 Outline Linear SVM hard margin Linear SVM soft margin Non-linear SVM Application Linear Support Vector Machine An optimization
More informationA Restart CMA Evolution Strategy With Increasing Population Size
Anne Auger and Nikolaus Hansen A Restart CMA Evolution Strategy ith Increasing Population Size Proceedings of the IEEE Congress on Evolutionary Computation, CEC 2005 c IEEE A Restart CMA Evolution Strategy
More information5.6 Nonparametric Logistic Regression
5.6 onparametric Logistic Regression Dmitri Dranishnikov University of Florida Statistical Learning onparametric Logistic Regression onparametric? Doesnt mean that there are no parameters. Just means that
More informationSupport Vector Machines
Support Vector Machines Le Song Machine Learning I CSE 6740, Fall 2013 Naïve Bayes classifier Still use Bayes decision rule for classification P y x = P x y P y P x But assume p x y = 1 is fully factorized
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table
More informationTutorial CMA-ES Evolution Strategies and Covariance Matrix Adaptation
Tutorial CMA-ES Evolution Strategies and Covariance Matrix Adaptation Anne Auger & Nikolaus Hansen INRIA Research Centre Saclay Île-de-France Project team TAO University Paris-Sud, LRI (UMR 8623), Bat.
More informationSVMs, Duality and the Kernel Trick
SVMs, Duality and the Kernel Trick Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University February 26 th, 2007 2005-2007 Carlos Guestrin 1 SVMs reminder 2005-2007 Carlos Guestrin 2 Today
More information