The Power of Asymmetry in Binary Hashing
|
|
- Pamela Harrison
- 5 years ago
- Views:
Transcription
1 The Power of Asymmetry in Binary Hashing Behnam Neyshabur Yury Makarychev Toyota Technological Institute at Chicago Russ Salakhutdinov University of Toronto Nati Srebro Technion/TTIC
2 Search by Image Image Database
3 Search by Image f( ) = Database of hashed image patches f( ) = Brompton
4 2 2 2 Binary Hashing { 1} k { 1} k 0/1 [ d hamming ( f(x), f(x ) ) < µ ] ¼ Sim(x,x )
5 2 2 2 Binary Hashing { 1} k { 1} k 0/1 [ d hamming ( f(x), f(x ) ) < µ ] ¼ Sim(x,x ) g Even if Sim(x,x ) is symmetric and well behaved : Use f(x) to hash objects in the database Use g(x) to hash queries No additional memory, communication or computation to perform query No increase in hash table size or lookup complexity We will show: shorter bit length; higher accuracy
6 Outline Theoretical Observation: Capturing similarity with arbitrary binary hashes Empirical Investigation: Database lookup with linear threshold hashes
7 Capturing Similarity with an Arbitrary Binary Code Given similarity S(x,x ) over objects X={x 1,x 2,,x n }, want mapping f:x! 1 k, s.t. S ij = S(x i,x j ) = [d(f(x i ),f(x j ))<µ] f( ) arbitrary, specified by u 1,,u n, u i =f(x i ) What is shortest k possible? 9 u 1,,u n 2 1 k, µ2r 8 ij S ij =[d(u i,u j )<µ] What is shortest k if we allow asymmetry? 9 u 1,,u n,v 1,,v n 2 1 k, µ2r 8 ij S ij =[d(u i,v j )<µ] v j =g(x j )
8 The Power of Asymmetry for Arbitrary Binary Hashes Theorem: For any r, there exists a set of points in Euclidean space, s.t. to capture S(x i,x j ) = [ x i -x j < 1 ] using a symmetric binary hash we need k sym 2 r-1 bits, but using an asymmetric hash, we need only k asym 2r bits. X X = n n n n n n n=2 r
9 Binary Hashing as Matrix Factorization [ d hamming ( u i, u j ) < µ ] ¼ Sim(x,x )
10 Binary Hashing as Matrix Factorization [ d hamming [ h( u i, i, u j j i) < µ ] ¼ Sim(x,x ) Given similarity matrix S 2 { 1} n n : min k s.t. U 2 { 1} k n symmetric sign(u T U- µ)=s min k s.t. U,V 2 { 1} k n asymmetric sign(u T V- µ)=s
11 Bonus: Approximation Algorithm via SDP Relaxation Given similarity matrix S 2 { 1} n n : min k s.t. U 2 { 1} k n sign( Y - µ)=s Y = U T U min k s.t. U,V 2 { 1} k n sign( Y - µ)=s Y = U T V symmetric asymmetric
12 Bonus: Approximation Algorithm via SDP Relaxation Given similarity matrix S 2 { 1} n n : min k s.t. U k symmetric S Y θ 1 Y = U T U min k s.t. U, V k asymmetric S Y θ 1 Y = U T V
13 Bonus: Approximation Algorithm via SDP Relaxation Given similarity matrix S 2 { 1} n n : min Y max s.t. symmetric S Y θ 1 Y 0 min Y max s.t. asymmetric S Y θ 1 Y 2 max min Y LR L,2 R,2
14 SDP Relaxation Rounding Y 2 max min Y LR L,2 R,2 Using random vectors z 1,,z k : u i = sign(l it z i ) v i = sign(r it z i ) k = O( (k opt ) 2 log n )
15 The Power of Asymmetry for Arbitrary Binary Hashes Theorem: For any r, there exists a set of points in Euclidean space, s.t. to capture S(x i,x j ) = [ x i -x j < 1 ] using a symmetric binary hash we need k sym 2 r-1 bits, but using an asymmetric hash, we need only k asym 2r bits. X X = n n n n n n n=2 r
16 Arbitrary Binary Hashes: Empirical Evaluation 35 10D Uniform 70 LabelMe bits Symmetric Asymmetric bits Symmetric Asymmetric Average Precision Average Precision
17 Parametric Mappings f(x) = Á F (x) has some parametric form E.g. Á F (x) = sign(fx), F2R k d Could be more complex, e.g. multi-layer network, kernel based, etc. Why restrict f( )? Generalization: learn F using objects x 1,,x n, then receive new objects x as queries Compactness of representation Asymmetric extension: S(x,x ) ¼ [ d(á F (x), Á G (x ) ) < µ ] Learn params F, G from training data (= pairs of database objects) Hash database objects using Á G Hash queries using Á F
18 A bit on optimization Loss function : L(Á F (x), Á G (x ), S(x,x )) Parameters F, G For Á F (x) = sign(fx): Updating single row of F (responsible for single bit in the hashing) entails solving a single weighted binary classification problem
19 Empirical Results using Asymmetric Linear Threshold Hashes bits required LabelMe bits required MNIST 5 5 bits required Average Precision Peekaboom Average Precision Comparison with learning symmetric linear threshold hashes using Minimum Loss Hashing [Norouzi Fleet ICML 2011] Training on all pairs of database objects, average precision measured on held out query objects.
20 Even More Power: Searching a Static Database S(x query, x i ) ¼ [ d(áf F (x query ), Ág G (x i ) ) < µ ] f(x query ) g(x 1 ) g(x 5 ) g(x 2 ) g(x 4 ) g(x n ) g(x 3 )
21 Even More Power: Searching a Static Database S(x query, x i ) ¼ [ d(á F (x query ), Á G (x i ) ) < µ ] v i No need to generalize: only used on database objects Stored in database hash, no need for compactness Can use arbitrary mapping g(x i )=v i Learn parametric Á F ( ) and arbitrary vectors v 1,,v n such that on x 1,,x n : S(x i,x j ) ¼ [ d(á F (x i ), v j ) < µ ] For query x, use d(á F (x),v j ) to aprox S(x,x j ) e.g. find v j in DB with small d(á F (x),v j )
22 Empirical Results using Linear-Threshold-to-Arbitrary Hashes bits required LabelMe bits required MNIST 5 5 bits required Average Precision Peekaboom Precision Average Precision LabelMe Lin:Arb 64bits MLH 64bits MLH 256bits Recall
23 Summary Neyshabur et al, The Power of Asymmetry in Binary Hashing, NIPS 2013 Binary Hashing Fast similarity approximation Approximate retrieval Nearest Neighbor search Locality Sensitive Hashing (LSH) In many applications: want short bit-length Smaller hash tables Super-fast hamming distance eval on short words Fewer bits to transmit Asymmetric hashes can enable better approximation with shorter bit-length! Even if similarity function symmetric and well-behaves In most applications: no additional computational or memory costs
On Symmetric and Asymmetric LSHs for Inner Product Search
Behnam Neyshabur Nathan Srebro Toyota Technological Institute at Chicago, Chicago, IL 6637, USA BNEYSHABUR@TTIC.EDU NATI@TTIC.EDU Abstract We consider the problem of designing locality sensitive hashes
More informationMinimal Loss Hashing for Compact Binary Codes
Mohammad Norouzi David J. Fleet Department of Computer Science, University of oronto, Canada norouzi@cs.toronto.edu fleet@cs.toronto.edu Abstract We propose a method for learning similaritypreserving hash
More informationSpectral Hashing: Learning to Leverage 80 Million Images
Spectral Hashing: Learning to Leverage 80 Million Images Yair Weiss, Antonio Torralba, Rob Fergus Hebrew University, MIT, NYU Outline Motivation: Brute Force Computer Vision. Semantic Hashing. Spectral
More informationMetric Embedding of Task-Specific Similarity. joint work with Trevor Darrell (MIT)
Metric Embedding of Task-Specific Similarity Greg Shakhnarovich Brown University joint work with Trevor Darrell (MIT) August 9, 2006 Task-specific similarity A toy example: Task-specific similarity A toy
More informationLinear Spectral Hashing
Linear Spectral Hashing Zalán Bodó and Lehel Csató Babeş Bolyai University - Faculty of Mathematics and Computer Science Kogălniceanu 1., 484 Cluj-Napoca - Romania Abstract. assigns binary hash keys to
More informationThe University of Texas at Austin Department of Electrical and Computer Engineering. EE381V: Large Scale Learning Spring 2013.
The University of Texas at Austin Department of Electrical and Computer Engineering EE381V: Large Scale Learning Spring 2013 Assignment 1 Caramanis/Sanghavi Due: Thursday, Feb. 7, 2013. (Problems 1 and
More informationSupplemental Material for Discrete Graph Hashing
Supplemental Material for Discrete Graph Hashing Wei Liu Cun Mu Sanjiv Kumar Shih-Fu Chang IM T. J. Watson Research Center Columbia University Google Research weiliu@us.ibm.com cm52@columbia.edu sfchang@ee.columbia.edu
More informationImplicit Optimization Bias
Implicit Optimization Bias as a key to Understanding Deep Learning Nati Srebro (TTIC) Based on joint work with Behnam Neyshabur (TTIC IAS), Ryota Tomioka (TTIC MSR), Srinadh Bhojanapalli, Suriya Gunasekar,
More informationComposite Quantization for Approximate Nearest Neighbor Search
Composite Quantization for Approximate Nearest Neighbor Search Jingdong Wang Lead Researcher Microsoft Research http://research.microsoft.com/~jingdw ICML 104, joint work with my interns Ting Zhang from
More informationInstance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016
Instance-based Learning CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Outline Non-parametric approach Unsupervised: Non-parametric density estimation Parzen Windows Kn-Nearest
More informationStochastic Generative Hashing
Stochastic Generative Hashing B. Dai 1, R. Guo 2, S. Kumar 2, N. He 3 and L. Song 1 1 Georgia Institute of Technology, 2 Google Research, NYC, 3 University of Illinois at Urbana-Champaign Discussion by
More informationAd Placement Strategies
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox 2014 Emily Fox January
More informationLOCALITY PRESERVING HASHING. Electrical Engineering and Computer Science University of California, Merced Merced, CA 95344, USA
LOCALITY PRESERVING HASHING Yi-Hsuan Tsai Ming-Hsuan Yang Electrical Engineering and Computer Science University of California, Merced Merced, CA 95344, USA ABSTRACT The spectral hashing algorithm relaxes
More informationMatrix Factorizations: A Tale of Two Norms
Matrix Factorizations: A Tale of Two Norms Nati Srebro Toyota Technological Institute Chicago Maximum Margin Matrix Factorization S, Jason Rennie, Tommi Jaakkola (MIT), NIPS 2004 Rank, Trace-Norm and Max-Norm
More informationSimilarity-Based Theoretical Foundation for Sparse Parzen Window Prediction
Similarity-Based Theoretical Foundation for Sparse Parzen Window Prediction Nina Balcan Avrim Blum Nati Srebro Toyota Technological Institute Chicago Sparse Parzen Window Prediction We are concerned with
More information18.6 Regression and Classification with Linear Models
18.6 Regression and Classification with Linear Models 352 The hypothesis space of linear functions of continuous-valued inputs has been used for hundreds of years A univariate linear function (a straight
More information12 - Nonparametric Density Estimation
ST 697 Fall 2017 1/49 12 - Nonparametric Density Estimation ST 697 Fall 2017 University of Alabama Density Review ST 697 Fall 2017 2/49 Continuous Random Variables ST 697 Fall 2017 3/49 1.0 0.8 F(x) 0.6
More informationHopfield Neural Network and Associative Memory. Typical Myelinated Vertebrate Motoneuron (Wikipedia) Topic 3 Polymers and Neurons Lecture 5
Hopfield Neural Network and Associative Memory Typical Myelinated Vertebrate Motoneuron (Wikipedia) PHY 411-506 Computational Physics 2 1 Wednesday, March 5 1906 Nobel Prize in Physiology or Medicine.
More informationIsotropic Hashing. Abstract
Isotropic Hashing Weihao Kong, Wu-Jun Li Shanghai Key Laboratory of Scalable Computing and Systems Department of Computer Science and Engineering, Shanghai Jiao Tong University, China {kongweihao,liwujun}@cs.sjtu.edu.cn
More informationECE521 Lectures 9 Fully Connected Neural Networks
ECE521 Lectures 9 Fully Connected Neural Networks Outline Multi-class classification Learning multi-layer neural networks 2 Measuring distance in probability space We learnt that the squared L2 distance
More informationMetric Learning. 16 th Feb 2017 Rahul Dey Anurag Chowdhury
Metric Learning 16 th Feb 2017 Rahul Dey Anurag Chowdhury 1 Presentation based on Bellet, Aurélien, Amaury Habrard, and Marc Sebban. "A survey on metric learning for feature vectors and structured data."
More informationCIS 520: Machine Learning Oct 09, Kernel Methods
CIS 520: Machine Learning Oct 09, 207 Kernel Methods Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture They may or may not cover all the material discussed
More informationMachine Learning. Theory of Classification and Nonparametric Classifier. Lecture 2, January 16, What is theoretically the best classifier
Machine Learning 10-701/15 701/15-781, 781, Spring 2008 Theory of Classification and Nonparametric Classifier Eric Xing Lecture 2, January 16, 2006 Reading: Chap. 2,5 CB and handouts Outline What is theoretically
More informationCOMS 4771 Introduction to Machine Learning. Nakul Verma
COMS 4771 Introduction to Machine Learning Nakul Verma Announcements HW1 due next lecture Project details are available decide on the group and topic by Thursday Last time Generative vs. Discriminative
More informationThe Kernel Trick, Gram Matrices, and Feature Extraction. CS6787 Lecture 4 Fall 2017
The Kernel Trick, Gram Matrices, and Feature Extraction CS6787 Lecture 4 Fall 2017 Momentum for Principle Component Analysis CS6787 Lecture 3.1 Fall 2017 Principle Component Analysis Setting: find the
More informationImproved Hamming Distance Search using Variable Length Hashing
Improved Hamming istance Search using Variable Length Hashing Eng-Jon Ong and Miroslaw Bober Centre for Vision, Speech and Signal Processing University of Surrey, Guildford, UK e.ong,m.bober@surrey.ac.uk
More informationProbabilistic Hashing for Efficient Search and Learning
Ping Li Probabilistic Hashing for Efficient Search and Learning July 12, 2012 MMDS2012 1 Probabilistic Hashing for Efficient Search and Learning Ping Li Department of Statistical Science Faculty of omputing
More informationSupervised Quantization for Similarity Search
Supervised Quantization for Similarity Search Xiaojuan Wang 1 Ting Zhang 2 Guo-Jun Qi 3 Jinhui Tang 4 Jingdong Wang 5 1 Sun Yat-sen University, China 2 University of Science and Technology of China, China
More informationA Convex Approach for Designing Good Linear Embeddings. Chinmay Hegde
A Convex Approach for Designing Good Linear Embeddings Chinmay Hegde Redundancy in Images Image Acquisition Sensor Information content
More informationMidterm: CS 6375 Spring 2015 Solutions
Midterm: CS 6375 Spring 2015 Solutions The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run out of room for an
More informationBasic Properties of Metric and Normed Spaces
Basic Properties of Metric and Normed Spaces Computational and Metric Geometry Instructor: Yury Makarychev The second part of this course is about metric geometry. We will study metric spaces, low distortion
More informationRuslan Salakhutdinov Joint work with Geoff Hinton. University of Toronto, Machine Learning Group
NON-LINEAR DIMENSIONALITY REDUCTION USING NEURAL NETORKS Ruslan Salakhutdinov Joint work with Geoff Hinton University of Toronto, Machine Learning Group Overview Document Retrieval Present layer-by-layer
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 27, 2015 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass
More informationCSE446: non-parametric methods Spring 2017
CSE446: non-parametric methods Spring 2017 Ali Farhadi Slides adapted from Carlos Guestrin and Luke Zettlemoyer Linear Regression: What can go wrong? What do we do if the bias is too strong? Might want
More informationECE 661: Homework 10 Fall 2014
ECE 661: Homework 10 Fall 2014 This homework consists of the following two parts: (1) Face recognition with PCA and LDA for dimensionality reduction and the nearest-neighborhood rule for classification;
More informationClustering by Mixture Models. General background on clustering Example method: k-means Mixture model based clustering Model estimation
Clustering by Mixture Models General bacground on clustering Example method: -means Mixture model based clustering Model estimation 1 Clustering A basic tool in data mining/pattern recognition: Divide
More informationImage retrieval, vector quantization and nearest neighbor search
Image retrieval, vector quantization and nearest neighbor search Yannis Avrithis National Technical University of Athens Rennes, October 2014 Part I: Image retrieval Particular object retrieval Match images
More informationQuantum Recommendation Systems
Quantum Recommendation Systems Iordanis Kerenidis 1 Anupam Prakash 2 1 CNRS, Université Paris Diderot, Paris, France, EU. 2 Nanyang Technological University, Singapore. April 4, 2017 The HHL algorithm
More informationLecture 10. Semidefinite Programs and the Max-Cut Problem Max Cut
Lecture 10 Semidefinite Programs and the Max-Cut Problem In this class we will finally introduce the content from the second half of the course title, Semidefinite Programs We will first motivate the discussion
More informationUnsupervised Machine Learning and Data Mining. DS 5230 / DS Fall Lecture 7. Jan-Willem van de Meent
Unsupervised Machine Learning and Data Mining DS 5230 / DS 4420 - Fall 2018 Lecture 7 Jan-Willem van de Meent DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Dimensionality Reduction Goal:
More informationA Bayesian Method for Guessing the Extreme Values in a Data Set
A Bayesian Method for Guessing the Extreme Values in a Data Set Mingxi Wu University of Florida May, 2008 Mingxi Wu (University of Florida) May, 2008 1 / 74 Outline Problem Definition Example Applications
More informationBayes Classifiers. CAP5610 Machine Learning Instructor: Guo-Jun QI
Bayes Classifiers CAP5610 Machine Learning Instructor: Guo-Jun QI Recap: Joint distributions Joint distribution over Input vector X = (X 1, X 2 ) X 1 =B or B (drinking beer or not) X 2 = H or H (headache
More informationKernel Methods. Charles Elkan October 17, 2007
Kernel Methods Charles Elkan elkan@cs.ucsd.edu October 17, 2007 Remember the xor example of a classification problem that is not linearly separable. If we map every example into a new representation, then
More informationAlgorithms for Nearest Neighbors
Algorithms for Nearest Neighbors Background and Two Challenges Yury Lifshits Steklov Institute of Mathematics at St.Petersburg http://logic.pdmi.ras.ru/~yura McGill University, July 2007 1 / 29 Outline
More informationNon-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines
Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2018 CS 551, Fall
More informationDatabase Theory VU , SS Ehrenfeucht-Fraïssé Games. Reinhard Pichler
Database Theory Database Theory VU 181.140, SS 2018 7. Ehrenfeucht-Fraïssé Games Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 15 May, 2018 Pichler 15
More informationMulti-layer Perceptron Networks
Multi-layer Perceptron Networks Our task is to tackle a class of problems that the simple perceptron cannot solve. Complex perceptron networks not only carry some important information for neuroscience,
More informationWeek 4: Hopfield Network
Week 4: Hopfield Network Phong Le, Willem Zuidema November 20, 2013 Last week we studied multi-layer perceptron, a neural network in which information is only allowed to transmit in one direction (from
More informationNeural Networks. Prof. Dr. Rudolf Kruse. Computational Intelligence Group Faculty for Computer Science
Neural Networks Prof. Dr. Rudolf Kruse Computational Intelligence Group Faculty for Computer Science kruse@iws.cs.uni-magdeburg.de Rudolf Kruse Neural Networks 1 Hopfield Networks Rudolf Kruse Neural Networks
More informationDay 3: Classification, logistic regression
Day 3: Classification, logistic regression Introduction to Machine Learning Summer School June 18, 2018 - June 29, 2018, Chicago Instructor: Suriya Gunasekar, TTI Chicago 20 June 2018 Topics so far Supervised
More informationNeural Networks DWML, /25
DWML, 2007 /25 Neural networks: Biological and artificial Consider humans: Neuron switching time 0.00 second Number of neurons 0 0 Connections per neuron 0 4-0 5 Scene recognition time 0. sec 00 inference
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 18, 2016 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass
More informationOnline Learning and Sequential Decision Making
Online Learning and Sequential Decision Making Emilie Kaufmann CNRS & CRIStAL, Inria SequeL, emilie.kaufmann@univ-lille.fr Research School, ENS Lyon, Novembre 12-13th 2018 Emilie Kaufmann Online Learning
More informationStochastic Optimization for Deep CCA via Nonlinear Orthogonal Iterations
Stochastic Optimization for Deep CCA via Nonlinear Orthogonal Iterations Weiran Wang Toyota Technological Institute at Chicago * Joint work with Raman Arora (JHU), Karen Livescu and Nati Srebro (TTIC)
More informationCSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18
CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$
More informationRank, Trace-Norm & Max-Norm
Rank, Trace-Norm & Max-Norm as measures of matrix complexity Nati Srebro University of Toronto Adi Shraibman Hebrew University Matrix Learning users movies 2 1 4 5 5 4? 1 3 3 5 2 4? 5 3? 4 1 3 5 2 1? 4
More informationNotion of Distance. Metric Distance Binary Vector Distances Tangent Distance
Notion of Distance Metric Distance Binary Vector Distances Tangent Distance Distance Measures Many pattern recognition/data mining techniques are based on similarity measures between objects e.g., nearest-neighbor
More information18.9 SUPPORT VECTOR MACHINES
744 Chapter 8. Learning from Examples is the fact that each regression problem will be easier to solve, because it involves only the examples with nonzero weight the examples whose kernels overlap the
More informationOptimized Product Quantization for Approximate Nearest Neighbor Search
Optimized Product Quantization for Approximate Nearest Neighbor Search Tiezheng Ge Kaiming He 2 Qifa Ke 3 Jian Sun 2 University of Science and Technology of China 2 Microsoft Research Asia 3 Microsoft
More informationConnection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis
Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis Alvina Goh Vision Reading Group 13 October 2005 Connection of Local Linear Embedding, ISOMAP, and Kernel Principal
More informationAlgorithms for Data Science: Lecture on Finding Similar Items
Algorithms for Data Science: Lecture on Finding Similar Items Barna Saha 1 Finding Similar Items Finding similar items is a fundamental data mining task. We may want to find whether two documents are similar
More informationSpectral Hashing. Antonio Torralba 1 1 CSAIL, MIT, 32 Vassar St., Cambridge, MA Abstract
Spectral Hashing Yair Weiss,3 3 School of Computer Science, Hebrew University, 9904, Jerusalem, Israel yweiss@cs.huji.ac.il Antonio Torralba CSAIL, MIT, 32 Vassar St., Cambridge, MA 0239 torralba@csail.mit.edu
More informationConsistency of Nearest Neighbor Methods
E0 370 Statistical Learning Theory Lecture 16 Oct 25, 2011 Consistency of Nearest Neighbor Methods Lecturer: Shivani Agarwal Scribe: Arun Rajkumar 1 Introduction In this lecture we return to the study
More informationNEAREST neighbor (NN) search has been a fundamental
JOUNAL OF L A T E X CLASS FILES, VOL. 3, NO. 9, JUNE 26 Composite Quantization Jingdong Wang and Ting Zhang arxiv:72.955v [cs.cv] 4 Dec 27 Abstract This paper studies the compact coding approach to approximate
More informationDay 4: Classification, support vector machines
Day 4: Classification, support vector machines Introduction to Machine Learning Summer School June 18, 2018 - June 29, 2018, Chicago Instructor: Suriya Gunasekar, TTI Chicago 21 June 2018 Topics so far
More informationNearest Neighbor Search with Keywords: Compression
Nearest Neighbor Search with Keywords: Compression Yufei Tao KAIST June 3, 2013 In this lecture, we will continue our discussion on: Problem (Nearest Neighbor Search with Keywords) Let P be a set of points
More informationLearning to Hash with Partial Tags: Exploring Correlation Between Tags and Hashing Bits for Large Scale Image Retrieval
Learning to Hash with Partial Tags: Exploring Correlation Between Tags and Hashing Bits for Large Scale Image Retrieval Qifan Wang 1, Luo Si 1, and Dan Zhang 2 1 Department of Computer Science, Purdue
More informationTrusting the Cloud with Practical Interactive Proofs
Trusting the Cloud with Practical Interactive Proofs Graham Cormode G.Cormode@warwick.ac.uk Amit Chakrabarti (Dartmouth) Andrew McGregor (U Mass Amherst) Justin Thaler (Harvard/Yahoo!) Suresh Venkatasubramanian
More informationCell-Probe Proofs and Nondeterministic Cell-Probe Complexity
Cell-obe oofs and Nondeterministic Cell-obe Complexity Yitong Yin Department of Computer Science, Yale University yitong.yin@yale.edu. Abstract. We study the nondeterministic cell-probe complexity of static
More informationMa/CS 6b Class 25: Error Correcting Codes 2
Ma/CS 6b Class 25: Error Correcting Codes 2 By Adam Sheffer Recall: Codes V n the set of binary sequences of length n. For example, V 3 = 000,001,010,011,100,101,110,111. Codes of length n are subsets
More informationKaggle.
Administrivia Mini-project 2 due April 7, in class implement multi-class reductions, naive bayes, kernel perceptron, multi-class logistic regression and two layer neural networks training set: Project
More informationSupervised Hashing via Uncorrelated Component Analysis
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) Supervised Hashing via Uncorrelated Component Analysis SungRyull Sohn CG Research Team Electronics and Telecommunications
More informationOn the Hopfield algorithm. Foundations and examples
General Mathematics Vol. 13, No. 2 (2005), 35 50 On the Hopfield algorithm. Foundations and examples Nicolae Popoviciu and Mioara Boncuţ Dedicated to Professor Dumitru Acu on his 60th birthday Abstract
More informationArtificial Intelligence Roman Barták
Artificial Intelligence Roman Barták Department of Theoretical Computer Science and Mathematical Logic Introduction We will describe agents that can improve their behavior through diligent study of their
More informationRegularized Least Squares
Regularized Least Squares Ryan M. Rifkin Google, Inc. 2008 Basics: Data Data points S = {(X 1, Y 1 ),...,(X n, Y n )}. We let X simultaneously refer to the set {X 1,...,X n } and to the n by d matrix whose
More informationClassification of handwritten digits using supervised locally linear embedding algorithm and support vector machine
Classification of handwritten digits using supervised locally linear embedding algorithm and support vector machine Olga Kouropteva, Oleg Okun, Matti Pietikäinen Machine Vision Group, Infotech Oulu and
More informationNonlinear Least Squares
Nonlinear Least Squares Stephen Boyd EE103 Stanford University December 6, 2016 Outline Nonlinear equations and least squares Examples Levenberg-Marquardt algorithm Nonlinear least squares classification
More information2- AUTOASSOCIATIVE NET - The feedforward autoassociative net considered in this section is a special case of the heteroassociative net.
2- AUTOASSOCIATIVE NET - The feedforward autoassociative net considered in this section is a special case of the heteroassociative net. - For an autoassociative net, the training input and target output
More informationAsymmetric Minwise Hashing for Indexing Binary Inner Products and Set Containment
Asymmetric Minwise Hashing for Indexing Binary Inner Products and Set Containment Anshumali Shrivastava and Ping Li Cornell University and Rutgers University WWW 25 Florence, Italy May 2st 25 Will Join
More informationSemi Supervised Distance Metric Learning
Semi Supervised Distance Metric Learning wliu@ee.columbia.edu Outline Background Related Work Learning Framework Collaborative Image Retrieval Future Research Background Euclidean distance d( x, x ) =
More informationSTA414/2104. Lecture 11: Gaussian Processes. Department of Statistics
STA414/2104 Lecture 11: Gaussian Processes Department of Statistics www.utstat.utoronto.ca Delivered by Mark Ebden with thanks to Russ Salakhutdinov Outline Gaussian Processes Exam review Course evaluations
More informationMachine Learning 4771
Machine Learning 477 Instructor: Tony Jebara Topic 5 Generalization Guarantees VC-Dimension Nearest Neighbor Classification (infinite VC dimension) Structural Risk Minimization Support Vector Machines
More informationOnline Learning Summer School Copenhagen 2015 Lecture 1
Online Learning Summer School Copenhagen 2015 Lecture 1 Shai Shalev-Shwartz School of CS and Engineering, The Hebrew University of Jerusalem Online Learning Shai Shalev-Shwartz (Hebrew U) OLSS Lecture
More informationOn the power and the limits of evolvability. Vitaly Feldman Almaden Research Center
On the power and the limits of evolvability Vitaly Feldman Almaden Research Center Learning from examples vs evolvability Learnable from examples Evolvable Parity functions The core model: PAC [Valiant
More informationComputer Vision Group Prof. Daniel Cremers. 10a. Markov Chain Monte Carlo
Group Prof. Daniel Cremers 10a. Markov Chain Monte Carlo Markov Chain Monte Carlo In high-dimensional spaces, rejection sampling and importance sampling are very inefficient An alternative is Markov Chain
More informationBrief Introduction to Machine Learning
Brief Introduction to Machine Learning Yuh-Jye Lee Lab of Data Science and Machine Intelligence Dept. of Applied Math. at NCTU August 29, 2016 1 / 49 1 Introduction 2 Binary Classification 3 Support Vector
More informationRecognition Using Class Specific Linear Projection. Magali Segal Stolrasky Nadav Ben Jakov April, 2015
Recognition Using Class Specific Linear Projection Magali Segal Stolrasky Nadav Ben Jakov April, 2015 Articles Eigenfaces vs. Fisherfaces Recognition Using Class Specific Linear Projection, Peter N. Belhumeur,
More informationArtificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011!
Artificial Neural Networks" and Nonparametric Methods" CMPSCI 383 Nov 17, 2011! 1 Todayʼs lecture" How the brain works (!)! Artificial neural networks! Perceptrons! Multilayer feed-forward networks! Error
More informationMidterm Review CS 7301: Advanced Machine Learning. Vibhav Gogate The University of Texas at Dallas
Midterm Review CS 7301: Advanced Machine Learning Vibhav Gogate The University of Texas at Dallas Supervised Learning Issues in supervised learning What makes learning hard Point Estimation: MLE vs Bayesian
More informationComments. x > w = w > x. Clarification: this course is about getting you to be able to think as a machine learning expert
Logistic regression Comments Mini-review and feedback These are equivalent: x > w = w > x Clarification: this course is about getting you to be able to think as a machine learning expert There has to be
More informationEfficient Data Reduction and Summarization
Ping Li Efficient Data Reduction and Summarization December 8, 2011 FODAVA Review 1 Efficient Data Reduction and Summarization Ping Li Department of Statistical Science Faculty of omputing and Information
More informationJeffrey D. Ullman Stanford University
Jeffrey D. Ullman Stanford University 3 We are given a set of training examples, consisting of input-output pairs (x,y), where: 1. x is an item of the type we want to evaluate. 2. y is the value of some
More informationDecision Trees. Machine Learning CSEP546 Carlos Guestrin University of Washington. February 3, 2014
Decision Trees Machine Learning CSEP546 Carlos Guestrin University of Washington February 3, 2014 17 Linear separability n A dataset is linearly separable iff there exists a separating hyperplane: Exists
More informationMidterm exam CS 189/289, Fall 2015
Midterm exam CS 189/289, Fall 2015 You have 80 minutes for the exam. Total 100 points: 1. True/False: 36 points (18 questions, 2 points each). 2. Multiple-choice questions: 24 points (8 questions, 3 points
More informationEfficient and Principled Online Classification Algorithms for Lifelon
Efficient and Principled Online Classification Algorithms for Lifelong Learning Toyota Technological Institute at Chicago Chicago, IL USA Talk @ Lifelong Learning for Mobile Robotics Applications Workshop,
More informationLearning from Examples
Learning from Examples Data fitting Decision trees Cross validation Computational learning theory Linear classifiers Neural networks Nonparametric methods: nearest neighbor Support vector machines Ensemble
More informationLocal regression, intrinsic dimension, and nonparametric sparsity
Local regression, intrinsic dimension, and nonparametric sparsity Samory Kpotufe Toyota Technological Institute - Chicago and Max Planck Institute for Intelligent Systems I. Local regression and (local)
More informationComputational and Statistical Learning theory
Computational and Statistical Learning theory Problem set 2 Due: January 31st Email solutions to : karthik at ttic dot edu Notation : Input space : X Label space : Y = {±1} Sample : (x 1, y 1,..., (x n,
More informationNeural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann
Neural Networks with Applications to Vision and Language Feedforward Networks Marco Kuhlmann Feedforward networks Linear separability x 2 x 2 0 1 0 1 0 0 x 1 1 0 x 1 linearly separable not linearly separable
More informationPositive Semi-definite programing and applications for approximation
Combinatorial Optimization 1 Positive Semi-definite programing and applications for approximation Guy Kortsarz Combinatorial Optimization 2 Positive Sem-Definite (PSD) matrices, a definition Note that
More information