Metric Embedding of Task-Specific Similarity (joint work with Trevor Darrell, MIT)


1 Metric Embedding of Task-Specific Similarity. Greg Shakhnarovich, Brown University. Joint work with Trevor Darrell (MIT). August 9, 2006.

2-4 Task-specific similarity. A toy example: [figure: the same 2-D point set shown repeatedly; which points count as similar depends on the task, e.g. similar norm vs. similar angle]

5 The problem. Learn similarity from examples of what is similar [or not]. Binary similarity: for x, y ∈ X, S(x, y) = +1 if x and y are similar, −1 if they are dissimilar. Two goals in mind: (1) Similarity detection: judge whether two entities are similar. (2) Similarity search: given a query entity and a database, find examples in the database that are similar to the query. Our approach: learn an embedding of the data into a space where similarity corresponds to a simple distance.

6 Task-specific similarity. Articulated pose: [figure: example images illustrating pose-based similarity]

7 Task-specific similarity. Visual similarity of image patches: [figure: example image patches]

8 Related prior work. Learning a parametrized distance metric [Xing et al; Roweis], such as Mahalanobis distances. Lots of work on low-dimensional graph embedding (MDS, LLE, ...), but it is unclear how to generalize to new points without relying on a distance in X. Embedding a known distance [Athitsos et al]: assumes the distance is known, uses the embedding to approximate / speed it up. DistBoost [Hertz et al]: learning a distance for a classification / clustering setup. Locality sensitive hashing: fast similarity search.

9-21 Locality Sensitive Hashing [Indyk et al]. Algorithm for finding an (ɛ, r)-neighbor of x_0 with high probability in sublinear time O(N^(1/(1+ɛ))). Index the data by l random hash functions, and only search the union of the l buckets where the query falls: [figure: animation of a query point being hashed into several tables and the union of its buckets being searched]
22 Locality sensitive hashing [Indyk et al]. A family H of functions is locality sensitive if

P_{h ~ U[H]}( h(x_0) = h(x) | ||x_0 − x|| ≤ r ) ≥ p_1,
P_{h ~ U[H]}( h(x_0) = h(x) | ||x_0 − x|| ≥ R ) ≤ p_2.

Uses the gap between TP and FP rates; amplified by concatenating functions into a hash key. Projections on random lines are locality sensitive w.r.t. L_p norms, p ≤ 2 [Gionis et al, Datar et al].
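
A minimal sketch of the LSH idea above, not the authors' implementation: l hash tables, each keyed by k quantized random projections (the L_2-friendly family of Datar et al), with a query searching only the union of its l buckets. Assumes numpy; all names are illustrative.

    import numpy as np
    from collections import defaultdict

    class L2LSH:
        """Toy LSH index: l tables, each keyed by k quantized random projections."""
        def __init__(self, dim, k=8, l=10, w=2.0, seed=0):
            rng = np.random.default_rng(seed)
            self.A = rng.standard_normal((l, k, dim))   # one (k x dim) projection per table
            self.b = rng.uniform(0.0, w, size=(l, k))   # random offsets
            self.w = w                                  # quantization bin width
            self.tables = [defaultdict(list) for _ in range(l)]
            self.data = []

        def _keys(self, x):
            # key per table: the k values floor((a.x + b) / w), concatenated
            return [tuple(np.floor((A @ x + b) / self.w).astype(int))
                    for A, b in zip(self.A, self.b)]

        def index(self, X):
            for x in X:
                i = len(self.data)
                self.data.append(x)
                for table, key in zip(self.tables, self._keys(x)):
                    table[key].append(i)

        def query(self, q):
            # search only the union of the l buckets where the query falls
            cand = set()
            for table, key in zip(self.tables, self._keys(q)):
                cand.update(table.get(key, []))
            if not cand:
                return None
            cand = list(cand)
            dists = [np.linalg.norm(self.data[i] - q) for i in cand]
            return cand[int(np.argmin(dists))]

    X = np.random.default_rng(1).standard_normal((1000, 20))
    lsh = L2LSH(dim=20, k=6, l=8)
    lsh.index(X)
    print(lsh.query(X[0] + 0.01))   # with high probability: 0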

23 How is this relevant? LSH is excellent if L_p is all we want. But L_p may not be a suitable proxy for S: we may waste lots of bits on irrelevant features, and miss pairs that are similar under S but not close w.r.t. L_p. If we know what S is, we may be able to analytically design an embedding of X into L_1 space [Thaper & Indyk, Grauman & Darrell]. We will instead learn LSH-style binary functions that fit S as conveyed by examples.

24 Our approach. Given: pairs of similar [and pairs of dissimilar] examples, based on the hidden binary similarity S. Two related tasks: (1) Similarity judgment: S(x, y) = ? (2) Similarity search: given a query x_0, find {x_i : S(x_i, x_0) = +1}. Our solution to both problems: a similarity-sensitive embedding H(x) = [α_1 h_1(x), ..., α_M h_M(x)]; we will learn the bits h_m(x) ∈ {0, 1} and the weights α_m.

25 Desired embedding properties. Embedding H(x) = [α_1 h_1(x), ..., α_M h_M(x)]. Rely on the L_1 (= weighted Hamming) distance

||H(x_1) − H(x_2)||_1 = Σ_{m=1}^{M} α_m |h_m(x_1) − h_m(x_2)|.

H is similarity sensitive: for some R, want

high  P_{x_1, x_2 ~ p(x)}( ||H(x_1) − H(x_2)|| ≤ R | S(x_1, x_2) = +1 ),
low   P_{x_1, x_2 ~ p(x)}( ||H(x_1) − H(x_2)|| ≤ R | S(x_1, x_2) = −1 ).

Then ||H(x) − H(y)|| is a proxy for S.
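
For concreteness, a small sketch (assuming numpy, and that the bits h_m(x) have already been computed) of the embedding and its weighted Hamming / L_1 distance:

    import numpy as np

    def weighted_hamming(bits_x, bits_y, alpha):
        """||H(x) - H(y)||_1 = sum_m alpha_m |h_m(x) - h_m(y)| for 0/1 bit vectors."""
        return float(np.sum(alpha * np.abs(bits_x - bits_y)))

    alpha = np.array([0.5, 2.0, 1.0])                  # learned weights alpha_m
    bx, by = np.array([1, 0, 1]), np.array([1, 1, 0])  # bits h_m(x), h_m(y)
    print(weighted_hamming(bx, by, alpha))             # 3.0: bits 2 and 3 disagree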

26 Projection-based classifiers. For a projection f : X → R, consider, for some T ∈ R,

h(x; f, T) = 1 if f(x) ≤ T,  0 if f(x) > T.

This defines a simple similarity classifier of pairs:

c(x, y; f, T) = +1 if h(x; f, T) = h(y; f, T), −1 otherwise.

[figure: points q, y, z, a, b on the projection axis f(x); q, y, z fall on one side of T and a, b on the other, so c(q, y; f, T) = c(q, z; f, T) = +1 and c(q, a; f, T) = c(q, b; f, T) = −1]
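
The definitions above translate directly into code; a sketch assuming numpy and, for illustration, a linear projection f(x) = w·x (any real-valued projection works):

    import numpy as np

    def h(x, f, T):
        """Thresholded projection: 1 if f(x) <= T, else 0."""
        return 1 if f(x) <= T else 0

    def c(x, y, f, T):
        """Pair classifier: +1 if both points fall on the same side of T, else -1."""
        return +1 if h(x, f, T) == h(y, f, T) else -1

    w = np.array([1.0, -0.5])
    f = lambda x: float(w @ x)                                    # example projection
    print(c(np.array([0.0, 0.0]), np.array([0.1, 0.0]), f, 0.5))  # +1: same side of T
    print(c(np.array([0.0, 0.0]), np.array([2.0, 0.0]), f, 0.5))  # -1: pair is dissected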

27-30 Selecting the threshold. Algorithm for selecting the threshold based on similar/dissimilar pairs. For a moment, we focus on N positive pairs only. Sort the 2N values of f(x). Need to check at most 2N + 1 values of T, and count the number of pairs that are dissected by the threshold (a bad event). [figure: the 2N projected values on the f(x) axis, with candidate thresholds between consecutive values]

31-32 Selecting the threshold. Also need to consider negative examples (dissimilar pairs), and estimate the gap: true positive rate TP minus false positive rate FP. [figure: histograms of f(x) for positive (POS) and negative (NEG) pairs]
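
A sketch of the discrete search described on these slides (assuming numpy; the pair lists hold indices into the point set): candidate thresholds are taken between consecutive sorted values of f(x), and the one maximizing TP − FP is kept, where TP (FP) is the fraction of similar (dissimilar) pairs left undissected:

    import numpy as np

    def best_threshold(f_vals, pos_pairs, neg_pairs):
        """Return (T, gap) maximizing TP - FP over the 2N + 1 candidate thresholds."""
        v = np.sort(f_vals)
        cands = np.concatenate(([v[0] - 1.0], (v[:-1] + v[1:]) / 2.0, [v[-1] + 1.0]))
        best_T, best_gap = None, -np.inf
        for T in cands:
            side = f_vals <= T                                  # h(x; f, T) for every point
            tp = np.mean([side[i] == side[j] for i, j in pos_pairs])
            fp = np.mean([side[i] == side[j] for i, j in neg_pairs])
            if tp - fp > best_gap:
                best_T, best_gap = T, tp - fp
        return best_T, best_gap

    rng = np.random.default_rng(0)
    f_vals = rng.standard_normal(20)                # projected values f(x_i)
    pos = [(i, i + 1) for i in range(0, 10, 2)]     # toy similar pairs
    neg = [(i, 19 - i) for i in range(5)]           # toy dissimilar pairs
    print(best_threshold(f_vals, pos, neg))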

33 Optimization of h(x; f, T). Objective (TP − FP gap):

argmax_{f, T} [ Σ_{i=1}^{N_+} c(x_i^{(1)+}, x_i^{(2)+}) − Σ_{j=1}^{N_−} c(x_j^{(1)−}, x_j^{(2)−}) ].

Parametric projection families, e.g. products of powers of selected components, f(x) = Σ_j θ_j x(d_j^1)^{p_1} x(d_j^2)^{p_2}. Soft versions of h and c make the gap differentiable w.r.t. θ, T:

h(x; f, T) = 1 / (1 + exp(f(x) − T)),
c(x, y) = 4 (h(x) − 1/2) (h(y) − 1/2).
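
A sketch of the soft relaxation (assuming numpy; a linear projection f(x; θ) = θ·x stands in for the polynomial family, and finite-difference gradients stand in for analytic ones). The smoothed gap can then be climbed by gradient ascent on (θ, T):

    import numpy as np

    def soft_h(x, theta, T):
        """Sigmoid relaxation of the thresholded projection, in (0, 1)."""
        return 1.0 / (1.0 + np.exp(x @ theta - T))

    def soft_c(x, y, theta, T):
        """Soft pair agreement in (-1, 1): 4 (h(x) - 1/2)(h(y) - 1/2)."""
        return 4.0 * (soft_h(x, theta, T) - 0.5) * (soft_h(y, theta, T) - 0.5)

    def gap(theta, T, pos_pairs, neg_pairs):
        """Smoothed TP - FP objective over labeled pairs."""
        return (np.mean([soft_c(x, y, theta, T) for x, y in pos_pairs])
                - np.mean([soft_c(x, y, theta, T) for x, y in neg_pairs]))

    def ascent_step(theta, T, pos, neg, lr=0.1, eps=1e-5):
        """One gradient-ascent step, using finite differences for brevity."""
        g = np.zeros_like(theta)
        for d in range(len(theta)):
            e = np.zeros_like(theta); e[d] = eps
            g[d] = (gap(theta + e, T, pos, neg) - gap(theta - e, T, pos, neg)) / (2 * eps)
        gT = (gap(theta, T + eps, pos, neg) - gap(theta, T - eps, pos, neg)) / (2 * eps)
        return theta + lr * g, T + lr * gT

    rng = np.random.default_rng(0)
    pos = [(rng.standard_normal(3), rng.standard_normal(3)) for _ in range(5)]
    neg = [(rng.standard_normal(3), rng.standard_normal(3)) for _ in range(5)]
    theta, T = rng.standard_normal(3), 0.0
    theta, T = ascent_step(theta, T, pos, neg)
    print(gap(theta, T, pos, neg))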

34 Ensemble classifiers. A weighted embedding H(x) = [α_1 h_1(x), ..., α_M h_M(x)]. Each h_m(x) defines a classifier c_m. Together they form an ensemble classifier of similarity:

C(x, y) = sgn( Σ_{m=1}^{M} α_m c_m(x, y) ).

We will construct the ensemble of bit-valued h_m by a greedy algorithm based on AdaBoost (operating on the corresponding c_m).

35 Boosting [Schapire et al]. Assumes access to a weak learner that can, at every iteration m, produce a classifier c_m better than chance. Main idea: maintain a distribution W_m(i) on the training examples, and update it according to the prediction of c_m: if c_m(x_i) is correct, then W_{m+1}(i) goes down; otherwise W_{m+1}(i) increases (steering c_{m+1} towards it). Our examples are pairs, and our weak classifiers are thresholded projections. To evaluate a threshold, we calculate the weight of the separated pairs rather than count them.

36 BoostPro. Given pairs (x_i^{(1)}, x_i^{(2)}) labeled with l_i = S(x_i^{(1)}, x_i^{(2)}):

1: Initialize weights W_1(i) to uniform.
2: for all m = 1, ..., M do
3:   Find f, T = argmax_{f,T} r_m(f, T) by gradient ascent on
     r_m(f, T) = Σ_{i=1}^{N} W_m(i) l_i c_m(x_i^{(1)}, x_i^{(2)}).
4:   Set h_m ← h(x; f, T).
5:   Set α_m (see the boosting papers).
6:   Update weights: W_{m+1}(i) ∝ W_m(i) exp( −α_m l_i c_m(x_i^{(1)}, x_i^{(2)}) ).
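
A simplified, hypothetical sketch of this loop (assuming numpy): random linear projections with an exhaustive threshold search stand in for the gradient-optimized polynomial projections, and the standard AdaBoost formula is used for α_m and the weight update. It is meant only to make the structure of the algorithm concrete.

    import numpy as np

    def weighted_stump(F, pairs, labels, W):
        """Best (projection, threshold) maximizing r = sum_i W_i l_i c_i."""
        best = (-np.inf, 0, 0.0)
        for p in range(F.shape[1]):
            v = F[:, p]
            for T in np.unique(v):
                side = v <= T
                c = np.where(side[pairs[:, 0]] == side[pairs[:, 1]], 1.0, -1.0)
                r = np.sum(W * labels * c)
                if r > best[0]:
                    best = (r, p, T)
        return best

    def boostpro(X, pairs, labels, M=10, n_proj=30, seed=0):
        """Greedily build M weighted bits h_m(x) = [proj_m(x) <= T_m]."""
        rng = np.random.default_rng(seed)
        P = rng.standard_normal((X.shape[1], n_proj))     # candidate projections
        F = X @ P                                         # projected values, (n, n_proj)
        W = np.full(len(pairs), 1.0 / len(pairs))         # weights on labeled pairs
        ensemble = []
        for _ in range(M):
            _, p, T = weighted_stump(F, pairs, labels, W)
            side = F[:, p] <= T
            c = np.where(side[pairs[:, 0]] == side[pairs[:, 1]], 1.0, -1.0)
            err = np.clip(np.sum(W[labels * c < 0]), 1e-9, 1 - 1e-9)
            alpha = 0.5 * np.log((1 - err) / err)         # standard AdaBoost weight
            W = W * np.exp(-alpha * labels * c)
            W /= W.sum()
            ensemble.append((alpha, P[:, p], T))
        return ensemble

    def embed_point(x, ensemble):
        """H(x): the alpha-weighted bits for a single point."""
        return np.array([a * float(x @ w <= T) for a, w, T in ensemble])

    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 5))
    pairs = rng.integers(0, 200, size=(300, 2))
    labels = np.where(np.abs(X[pairs[:, 0], 0] - X[pairs[:, 1], 0]) < 0.3, 1.0, -1.0)
    H = boostpro(X, pairs, labels, M=8)
    print(np.sum(np.abs(embed_point(X[0], H) - embed_point(X[1], H))))  # weighted Hamming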

37-41 Similarity is a rare event. In many domains the vast majority of possible pairs are negative: people's poses, image patches, documents, ... A reasonable approximation of the sampling process: independently draw x, y from the data distribution; f(x), f(y) are then independent draws from p(f(x)); label (x, y) negative. [figure: density of f(x) with threshold T and π_T = Pr(f(x) ≤ T); a random pair is dissected with probability 2 π_T (1 − π_T)]

42 Semi-supervised setup. Given similar pairs and unlabeled x_1, ..., x_N. Estimate the TP rate as before. FP rate: estimate the CDF of f(x); let π_T = P̂(f(x) ≤ T). Then FP = π_T² + (1 − π_T)². Note: this means FP ≥ 1/2 [Ke et al].
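
A small sketch (assuming numpy) of the FP estimate from unlabeled data alone: π_T is the empirical CDF of f(x) at T, and a random pair is wrongly declared similar whenever both draws land on the same side of T:

    import numpy as np

    def fp_estimate(f_unlabeled, T):
        """FP rate under the independence approximation: pi_T^2 + (1 - pi_T)^2."""
        pi_T = np.mean(f_unlabeled <= T)        # empirical P(f(x) <= T)
        return pi_T ** 2 + (1.0 - pi_T) ** 2

    f_vals = np.random.default_rng(0).standard_normal(10_000)
    for T in (-1.0, 0.0, 1.0):
        print(T, fp_estimate(f_vals, T))        # always >= 1/2, smallest near the median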

43 BoostPro in a semi-supervised setup. Normal boosting assumes positive and negative examples. Intuition: each unlabeled example x_j represents all possible pairs (x_j, y); those are, w.h.p., negative under our assumption. Two distributions: W_m(i) on positive pairs, as before, and S_m(j) on unlabeled (single) examples. For the unlabeled examples, instead of c_m(x_i^{(1)}, x_i^{(2)}) use the expectation E_y[c_m(x_j, y)] to calculate the objective and set the weights.

44 Semi-supervised boosting: details. Probability of misclassifying a negative pair (x_j, y):

P_j = h(x_j; f, T) π_T + (1 − h(x_j; f, T)) (1 − π_T),
E_y[c(x_j, y)] = P_j · (+1) + (1 − P_j) · (−1) = 2 P_j − 1.

Modified boosting objective:

r = Σ_{i=1}^{N_p} W(i) c(x_i^{(1)}, x_i^{(2)}) − Σ_{j=1}^{N} S(j) E_y[c(x_j, y)]
  = Σ_{i=1}^{N_p} W(i) c(x_i^{(1)}, x_i^{(2)}) − Σ_{j=1}^{N} S(j) (2 P_j − 1).
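
The same quantities in code, a sketch assuming numpy; W and S are the weight vectors on the positive pairs and on the unlabeled points, c_pos the classifications of the positive pairs, and h_unlabeled the bit h(x_j; f, T) for each unlabeled point:

    import numpy as np

    def expected_c(h_unlabeled, pi_T):
        """E_y[c(x_j, y)] = 2 P_j - 1, with P_j = h(x_j) pi_T + (1 - h(x_j)) (1 - pi_T)."""
        P_j = h_unlabeled * pi_T + (1.0 - h_unlabeled) * (1.0 - pi_T)
        return 2.0 * P_j - 1.0

    def semi_supervised_objective(c_pos, W, h_unlabeled, S, pi_T):
        """r = sum_i W(i) c(x_i^(1), x_i^(2)) - sum_j S(j) E_y[c(x_j, y)]."""
        return np.dot(W, c_pos) - np.dot(S, expected_c(h_unlabeled, pi_T))

    c_pos = np.array([1.0, 1.0, -1.0, 1.0])            # toy classifications of 4 positive pairs
    W = np.full(4, 0.25)
    h_unl = np.array([1, 0, 0, 1, 1, 0], dtype=float)  # bits for 6 unlabeled points
    S = np.full(6, 1.0 / 6)
    print(semi_supervised_objective(c_pos, W, h_unl, S, pi_T=0.3))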

45 Results: Toy problems. M = 100, trained on 600 unlabeled points + 1,000 similar pairs.

46 Results: Toy problems. [figure: results on the toy data sets]

47 Results: UCI data sets. Test error on regression benchmark data from UCI/Delve (MPG, CPU, Housing, Abalone, Census): mean ± standard deviation of MSE using locally-weighted regression, comparing L_1, PSH, and BoostPro; the last column of the table gives M, the dimension of the embedding H. Similarity is defined in terms of target function values. [table: numeric entries lost in transcription]

48 Results: pose retrieval. [figure: input images and their 3 nearest neighbors in H]. H built with semi-supervised BoostPro, on 200,000 examples.

49 Results: pose retrieval. Recall for k-NN retrieval: for each value of k, the fraction of the true k nearest neighbors w.r.t. pose that fall within the k nearest neighbors w.r.t. the image-based similarity measure is plotted. Black dotted line: L_1 on EDH. Red dash-dot: PSH. Blue solid: BoostPro.

50 Visual similarity of image patches. Define two patches to be similar under any rotation and mild shift (±4 pixels). This should be covered by many reasonable similarity definitions.

51 Descriptor 1: sparse overcomplete code. Generative model of patches [Olshausen & Field]. Very unstable under transformation, hence L_1 is not a good proxy for similarity. With BoostPro, the area under the ROC curve improves over the 0.56 obtained with L_1.

52 Descriptor 2: SIFT. Scale-Invariant Feature Transform [Lowe]: a histogram of gradient magnitude and orientation within the region, normalized by orientation and computed at an appropriate scale. Discriminative (cannot generate a patch). Designed specifically to make two similar patches match.

53 Results. Area under the ROC curve: L_1 on SIFT vs. L_1 on H(SIFT). [AUC values lost in transcription]

54 Conclusions. It is beneficial to learn similarity directly for the task rather than rely on a default distance. Key property of our approach: similarity search is reduced to L_1 nearest-neighbor search (and thus can be done in sublinear time). Most useful for: regression and multi-label classification; when large amounts of labeled data are available; when L_p distance is not a good proxy for similarity.

55 Open questions. What kinds of similarity concepts can we learn? How do we explore the space of projections more efficiently? Factorization of similarity. Combining unsupervised and supervised similarity learning for image regions.

56 Questions?
