Composite Quantization for Approximate Nearest Neighbor Search

1 Composite Quantization for Approximate Nearest Neighbor Search Jingdong Wang, Lead Researcher, Microsoft Research. ICML 2014, joint work with my interns Ting Zhang from USTC and Chao Du from Tsinghua University

2 Outline Introduction Problem Product quantization Cartesian k-means Composite quantization Experiments

3 Nearest neighbor search Application to similar image search query

4 Nearest neighbor search Application to particular object retrieval query

5 Nearest neighbor search Application to duplicate image search

6 Nearest neighbor search Similar image search Application to K-NN annotation: Annotate the query image using the similar images

7 Nearest neighbor search Definition Database: X = {x_1, x_2, ..., x_n} in R^d Query: q in R^d Nearest neighbor: NN(q) = argmin_{x in X} ||q - x||

8 Nearest neighbor search Exact nearest neighbor search Linear scan: compare q against every database vector

9 Nearest neighbor search - speedup K-dimensional tree (Kd tree) Generalized binary search tree Metric tree Ball tree VP tree BD tree Cover tree

10 Nearest neighbor search Exact nearest neighbor search Linear scan: costly and impractical for large-scale, high-dimensional cases Approximate nearest neighbor (ANN) search Efficient Acceptable accuracy Practically used

11 Two principles for ANN search Recall the complexity of linear scan: O(nd) 1. Reduce the number of distance computations Time complexity: O(n'd), n' much smaller than n Tree structures, neighborhood graph search and inverted indices

12 Our work: TP Tree + NG Search TP Tree Jingdong Wang, Naiyan Wang, You Jia, Jian Li, Gang Zeng, Hongbin Zha, Xian-Sheng Hua: Trinary-Projection Trees for Approximate Nearest Neighbor Search. IEEE Trans. Pattern Anal. Mach. Intell. (2014) You Jia, Jingdong Wang, Gang Zeng, Hongbin Zha, Xian-Sheng Hua: Optimizing kd-trees for scalable visual descriptor indexing. CVPR 2010 Neighborhood graph search Jingdong Wang, Shipeng Li: Query-driven iterated neighborhood graph search for large scale indexing. ACM Multimedia 2012 Jing Wang, Jingdong Wang, Gang Zeng, Rui Gan, Shipeng Li, Baining Guo: Fast Neighborhood Graph Search Using Cartesian Concatenation. ICCV 2013 Neighborhood graph construction Jing Wang, Jingdong Wang, Gang Zeng, Zhuowen Tu, Rui Gan, Shipeng Li: Scalable k-NN graph construction for visual descriptors. CVPR 2012

13 Comparison over SIFT 1M [figure: 1-NN recall curves comparing the ICCV13, ACMMM12 and CVPR10 methods]

14 Comparison over GIST 1M [figure: 1-NN recall curves comparing the ICCV13, ACMMM12 and CVPR10 methods]

15 Comparison over HOG 10M [figure: 1-NN recall curves comparing the ICCV13, ACMMM12 and CVPR10 methods]

16 Neighborhood Graph Search Shipped to Bing Index building time on 40M documents is only 2 hours and 10 minutes Search DPS on each NNS machine is stable at 950, without retries or errors Five times faster than improved FLANN

17 Two principles for ANN search Recall the complexity of linear scan: O(nd) 1. Reduce the number of distance computations Time complexity: O(n'd), n' much smaller than n Tree structures, neighborhood graph search and inverted indices: high efficiency, but large memory cost 2. Reduce the cost of each distance computation Time complexity: O(nd'), d' much smaller than d Hashing (compact codes): small memory cost, but low efficiency

18 Approximate nearest neighbor search Binary embedding methods (hashing) Produce only a few distinct distances Limited ability and flexibility of distance approximation Vector quantization (compact codes) K-means: impossible to use for medium and large code lengths (a B-bit code needs 2^B centers) Impossible to learn the codebook Impossible to compute a code for a vector Product quantization Cartesian k-means

19 Combined solution for very large scale search Retrieve candidates with an index structure using compact codes Load raw features for retrieved candidates from disk Rerank using the true distances Efficient, with small memory consumption IO cost is small

20 Outline Introduction Problem Product quantization Cartesian k-means Composite quantization Experiments

21 Product quantization Approximate x by the concatenation of M subvectors

22 Product quantization Approximate x by the concatenation of M subvectors x = [x_1; x_2; ...; x_M] ~ [p_{1i_1}; p_{2i_2}; ...; p_{Mi_M}] {p_11, p_12, ..., p_1K} Codebook in the 1st subspace {p_21, p_22, ..., p_2K} Codebook in the 2nd subspace {p_M1, p_M2, ..., p_MK} Codebook in the Mth subspace

23 Product quantization Approximate x by the concatenation of M subvectors x_1 ~ p_{1i_1} ({p_11, p_12, ..., p_1K}: codebook in the 1st subspace) x_2 ~ p_{2i_2} ({p_21, p_22, ..., p_2K}: codebook in the 2nd subspace) x_M ~ p_{Mi_M} ({p_M1, p_M2, ..., p_MK}: codebook in the Mth subspace)

26 Product quantization Approximate x by the concatenation of M subvectors Code representation: (i_1, i_2, ..., i_M) Distance computation: d(q, x') = d(q_1, p_{1i_1}) + d(q_2, p_{2i_2}) + ... + d(q_M, p_{Mi_M}) M additions using a pre-computed distance table Distance table for subspace m: {d(q_m, p_m1), d(q_m, p_m2), ..., d(q_m, p_mK)}
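The table-based distance computation above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; the function names are mine, and squared Euclidean distance is assumed.

```python
import numpy as np

def pq_distance_table(query, codebooks):
    # codebooks: list of M arrays, each (K, d/M); the query is split into M subvectors
    M = len(codebooks)
    sub = np.split(query, M)
    # table[m][k] = squared distance between the m-th query subvector
    # and the k-th codeword of the m-th codebook
    return [((cb - s) ** 2).sum(axis=1) for cb, s in zip(codebooks, sub)]

def pq_asymmetric_distances(codes, table):
    # codes: (n, M) integer matrix; each distance is a sum of M table lookups
    n, M = codes.shape
    return sum(table[m][codes[:, m]] for m in range(M))
```

Once the table is built (MK entries per query), scoring each database vector costs only M additions, which is the efficiency argument made on this slide.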

27 Product quantization M = 2 Approximate x by the concatenation of M subvectors Codebook generation Do k-means for each subspace

28 Product quantization M = 2, K = 3 Approximate x by the concatenation of M subvectors Codebook generation Do k-means for each subspace [figure: centers p_11, p_12, p_13 in subspace 1 and p_21, p_22, p_23 in subspace 2]

29 Product quantization M = 2, K = 3 Approximate x by the concatenation of M subvectors Codebook generation Do k-means for each subspace Result in K^M groups The center of each group is the concatenation of M subvectors, e.g., [p_11; p_21]

30 Product quantization M = 2, K = 3 Approximate x by the concatenation of M subvectors Codebook generation Do k-means for each subspace Result in K^M groups The center of each group is the concatenation of M subvectors

32 Product quantization M = 2, K = 3 Approximate x by the concatenation of M subvectors Codebook generation Do k-means for each subspace Result in K^M groups The center of each group is the concatenation of M subvectors Example: x ~ x' = [p_13; p_22]
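The per-subspace codebook generation described on these slides can be sketched as follows. This is a minimal Lloyd's k-means sketch under my own naming, not the reference implementation; in practice a library k-means would be used.

```python
import numpy as np

def train_pq_codebooks(X, M, K, iters=20, seed=0):
    # Split the d dimensions into M subspaces and run k-means independently in each
    rng = np.random.default_rng(seed)
    codebooks = []
    for Xm in np.split(X, M, axis=1):
        # initialize with K randomly chosen subvectors
        centers = Xm[rng.choice(len(Xm), K, replace=False)].copy()
        for _ in range(iters):
            # assign each subvector to its nearest center
            d2 = ((Xm[:, None, :] - centers[None]) ** 2).sum(-1)
            assign = d2.argmin(1)
            # recompute centers; keep the old center if a cluster is empty
            for k in range(K):
                if (assign == k).any():
                    centers[k] = Xm[assign == k].mean(0)
        codebooks.append(centers)
    return codebooks
```

The M codebooks of K centers each implicitly define the K^M composite centers from the slides, without ever enumerating them.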

33 Outline Introduction Problem Product quantization Cartesian k-means Composite quantization Experiments

34 Cartesian k-means Extended product quantization Optimal space rotation Perform PQ over the rotated space x ~ x' = R [p_{1i_1}; p_{2i_2}; ...; p_{Mi_M}]

35 Cartesian k-means Extended product quantization Optimal space rotation Perform PQ over the rotated space [figure: rotated centers p_11, p_12, p_13 and p_21, p_22, p_23] x ~ x' = R [p_{1i_1}; p_{2i_2}; ...; p_{Mi_M}]

36 Cartesian k-means Extended product quantization Optimal space rotation Perform PQ over the rotated space x ~ x' = R [p_{1i_1}; p_{2i_2}; ...; p_{Mi_M}] Example: x ~ x' = R [p_13; p_22]

37 Outline Introduction Composite quantization Experiments

38 Composite quantization Approximate x by the addition of M vectors x ~ x' = c_{1i_1} + c_{2i_2} + ... + c_{Mi_M} {c_11, c_12, ..., c_1K} Source codebook 1 {c_21, c_22, ..., c_2K} Source codebook 2 {c_M1, c_M2, ..., c_MK} Source codebook M Each source codebook is composed of K d-dimensional vectors

39 Composite quantization Source codebooks: {c_11, c_12, c_13}, {c_21, c_22, c_23} [figure: the six source codewords in 2D]

40 Composite quantization Source codebooks: {c_11, c_12, c_13}, {c_21, c_22, c_23} Composite center: c_11 + c_21

41 Composite quantization Source codebooks: {c_11, c_12, c_13}, {c_21, c_22, c_23} Composite center: c_11 + c_22

42 Composite quantization Source codebooks: {c_11, c_12, c_13}, {c_21, c_22, c_23} Composite center: c_11 + c_23

43 Composite quantization Source codebooks: {c_11, c_12, c_13}, {c_21, c_22, c_23} More composite centers

44 Composite quantization Source codebooks: {c_11, c_12, c_13}, {c_21, c_22, c_23} Composite codebook: 9 composite centers

45 Composite quantization Source codebooks: {c_11, c_12, c_13}, {c_21, c_22, c_23} Space partition: 9 groups Example: x ~ x' = c_11 + c_22

46 Composite quantization Approximate x by the addition of M vectors x ~ x' = c_{1i_1} + c_{2i_2} + ... + c_{Mi_M} {c_11, c_12, ..., c_1K} Source codebook 1 {c_21, c_22, ..., c_2K} Source codebook 2 {c_M1, c_M2, ..., c_MK} Source codebook M

47 Composite quantization Approximate x by the addition of M vectors Code representation: (i_1, i_2, ..., i_M), length: M log K bits x ~ x' = c_{1i_1} + c_{2i_2} + ... + c_{Mi_M} {c_11, c_12, ..., c_1K} Source codebook 1 {c_21, c_22, ..., c_2K} Source codebook 2 {c_M1, c_M2, ..., c_MK} Source codebook M
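The additive approximation and the M log K code length can be sketched concretely. This is an illustrative fragment with hypothetical function names, assuming the codebooks and the per-vector indices are already given.

```python
import math
import numpy as np

def cq_reconstruct(codebooks, code):
    # x' = c_{1 i_1} + c_{2 i_2} + ... + c_{M i_M}: one codeword per codebook, summed
    return sum(cb[i] for cb, i in zip(codebooks, code))

def cq_code_length_bits(M, K):
    # each of the M indices takes log2(K) bits, so the code is M * log2(K) bits long
    return M * math.log2(K)
```

For the common setting M = 8 source codebooks of K = 256 codewords each, the code is 64 bits, exactly matching the code lengths reported in the experiments.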

48 Connection to product quantization Concatenation in product quantization = addition x ~ x' = [p_{1i_1}; p_{2i_2}; ...; p_{Mi_M}] = [p_{1i_1}; 0; ...; 0] + [0; p_{2i_2}; ...; 0] + ... + [0; 0; ...; p_{Mi_M}]

52 Connection to product quantization Concatenation in product quantization = addition x ~ x' = [p_{1i_1}; 0; ...; 0] + [0; p_{2i_2}; ...; 0] + ... + [0; 0; ...; p_{Mi_M}] Product quantization is constrained composite quantization x ~ x' = c_{1i_1} + c_{2i_2} + ... + c_{Mi_M}

53 Connection to Cartesian k-means Concatenation in Cartesian k-means = addition x ~ x' = R [p_{1i_1}; p_{2i_2}; ...; p_{Mi_M}] = R [p_{1i_1}; 0; ...; 0] + R [0; p_{2i_2}; ...; 0] + ... + R [0; 0; ...; p_{Mi_M}]

55 Connection to Cartesian k-means Concatenation in Cartesian k-means = addition x ~ x' = R [p_{1i_1}; 0; ...; 0] + R [0; p_{2i_2}; ...; 0] + ... + R [0; 0; ...; p_{Mi_M}] Cartesian k-means is constrained composite quantization x ~ x' = c_{1i_1} + c_{2i_2} + ... + c_{Mi_M}

56 Composite quantization Generalizing product quantization and Cartesian k-means Advantages More flexible codebooks More accurate data approximation Higher search accuracy Search efficiency? Depends on the efficiency of the distance computation Product quantization: coordinate-aligned space partition Cartesian k-means: rotated coordinate-aligned space partition Composite quantization: flexible space partition

57 Approximate distance computation Recall the data approximation x ~ x' = Σ_{m=1}^M c_{mi_m(x)} Approximate distance to the query q: d(q, x) ~ d(q, x') = ||q - Σ_{m=1}^M c_{mi_m(x)}|| Time-consuming to compute directly

58 Approximate distance computation Expanded into three terms: ||q - Σ_{m=1}^M c_{mi_m(x)}||^2 = Σ_{m=1}^M ||q - c_{mi_m(x)}||^2 - (M - 1)||q||^2 + Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)}

59 Approximate distance computation First term Σ_{m=1}^M ||q - c_{mi_m(x)}||^2: O(M) additions, implemented with a pre-computed distance lookup table Distance lookup table: stores the distances from source codebook elements to q

60 Approximate distance computation Second term (M - 1)||q||^2: constant per query

61 Approximate distance computation Third term Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)}: O(M^2) additions, using a pre-computed dot-product lookup table Dot-product lookup table: stores the dot products between codebook elements

62 Approximate distance computation Third term: O(M^2) additions, still expensive

63 Approximate distance computation If the third term were a constant...

64 Approximate distance computation If the third term is a constant, computing the first term is enough for search

65 Formulation Minimize the quantization error Σ_x ||x - Σ_{m=1}^M c_{mi_m(x)}||^2 subject to the third term Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} being a constant

66 Formulation Constrained formulation: min_{{C_m}, {i_m(x)}, ε} Σ_x ||x - Σ_{m=1}^M c_{mi_m(x)}||^2 s.t. Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} = ε for all x Minimize quantization error for search accuracy Constant constraint for search efficiency

67 Connection to PQ and CKM Constrained formulation: min_{{C_m}, {i_m(x)}, ε} Σ_x ||x - Σ_{m=1}^M c_{mi_m(x)}||^2 s.t. Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} = ε for all x Minimize quantization error for search accuracy Constant constraint for search efficiency

68 Connection to PQ and CKM Product quantization and Cartesian k-means: suboptimal solutions of our approach Non-overlapped space partitioning Codebooks are mutually orthogonal, so Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} = 0, a particular constant ε

69 Formulation Constrained formulation: min_{{C_m}, {i_m(x)}, ε} Σ_x ||x - Σ_{m=1}^M c_{mi_m(x)}||^2 s.t. Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} = ε Transformed to an unconstrained formulation by adding the constraints into the objective function: φ({C_m}, {i_m(x)}, ε) = Σ_x [ ||x - Σ_{m=1}^M c_{mi_m(x)}||^2 + μ (Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} - ε)^2 ]

70 Optimization Unconstrained formulation: φ({C_m}, {i_m(x)}, ε) = Σ_x [ ||x - Σ_{m=1}^M c_{mi_m(x)}||^2 (distortion error) + μ (Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} - ε)^2 (constraint violation) ] Alternating optimization over {C_m}, {i_m(x)} and ε μ selected by validation

71 Alternating optimization Update {i_m(x)}: iteratively, fixing i_l(x) for l ≠ m, update i_m(x) Update ε: closed-form solution, ε = (1/#{x}) Σ_x Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} Update {C_m}: L-BFGS algorithm
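The closed-form ε update above (the average of the inter-codeword dot-product term over the dataset) can be sketched directly. The function name is mine; it assumes codebooks and the current assignments are given.

```python
import numpy as np

def update_epsilon(codebooks, codes):
    # closed form: epsilon = (1/N) * sum_x sum_{m != l} c_{m i_m(x)}^T c_{l i_l(x)}
    total = 0.0
    for code in codes:
        cs = [cb[i] for cb, i in zip(codebooks, code)]
        xbar = sum(cs)
        # identity: sum_{m != l} c_m^T c_l = ||sum_m c_m||^2 - sum_m ||c_m||^2
        total += xbar @ xbar - sum(c @ c for c in cs)
    return total / len(codes)
```

Using the norm identity avoids the explicit double loop over codebook pairs, so the update costs O(NMd) rather than O(NM^2 d).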

72 Complexity Update {i_m(x)}: iteratively fixing i_l(x), l ≠ m, and updating i_m(x), O(MKdT_d) Update ε: closed-form solution, O(NM^2) Update {C_m}: L-BFGS algorithm, O(NMdT_lT_c) Convergence?

73 Convergence 1M SIFT, 64 bits Converges in about 10 to 15 iterations

74 Outline Introduction Composite quantization Experiments

75 Experiments on ANN search Datasets 1 million 128D SIFT vectors 1 million 960D GIST vectors, 1000 queries 1 billion 128D SIFT vectors, 1000 queries Evaluation Recall@R: the fraction of queries for which the ground-truth Euclidean nearest neighbor is among the R retrieved items
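The Recall@R metric defined above is simple to compute; a minimal sketch (function name is mine):

```python
def recall_at_R(retrieved, ground_truth, R):
    # retrieved: per-query ranked lists of database ids
    # ground_truth: per-query id of the true Euclidean nearest neighbor
    # Recall@R = fraction of queries whose true NN appears in the top-R list
    hits = sum(gt in ids[:R] for ids, gt in zip(retrieved, ground_truth))
    return hits / len(ground_truth)
```

Note this is the single-ground-truth variant used in the compact-coding literature, not the multi-neighbor recall used elsewhere.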

76 Search pipeline Source codebooks Code of database vector x Query q Distance tables (between query and codebook elements) Distance between q and x Output the nearest vectors Repeated for n database vectors

77 Comparison on 1M SIFT and 1M GIST

78 Comparison on 1M SIFT and 1M GIST 64 bits Ours: 71.59% CKM: 63.83%

79 Comparison on 1M SIFT and 1M GIST Ours: 71.59% CKM: 63.83% The relatively small improvement on 1M GIST might be because CKM has already achieved a large improvement there

80 Comparison on 1M SIFT and 1M GIST Ours: 71.59% (64 bits) ITQ: 53.95% (128 bits) ITQ without asymmetric distance underperforms ITQ with asymmetric distance Our approach with 64 bits outperforms (asymmetric) ITQ with 128 bits, with slightly smaller search cost

81 Comparison on 1B SIFT

82 Comparison on 1B SIFT Ours: 70.1% CKM: 64.57%

83 Average query time

84 Application to object retrieval Two datasets INRIA Holidays contains 500 queries and 991 corresponding relevant images UKBench contains 2550 groups of 4 images each Results mAP on the Holidays dataset Scores on the UKBench dataset

85 Discussion The effect of ε min Σ_x ||x - x'(x)||^2 subject to Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} = ε Setting ε = 0 indicates the dictionaries are mutually orthogonal, like splitting the space Search performance with learnt ε is better, since learning ε is more flexible

86 Discussion The effect of translation: quantize x - t instead of x min Σ_x ||x - t - x'(x)||^2 subject to Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} = ε Search performance doesn't change much The contribution of the offset is relatively small compared with composite quantization itself

87 Extension (CVPR15) Constrained formulation: min_{{C_m}, {i_m(x)}, ε} Σ_x ||x - Σ_{m=1}^M c_{mi_m(x)}||^2 s.t. Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} = ε, Σ_m ||C_m||_1 ≤ T Minimize quantization error for search accuracy Constant constraint for search efficiency Sparsity constraint for precomputation efficiency

88 Extension (CVPR15) Constrained formulation with dimension reduction: min_{{C_m}, {i_m(x)}, ε} Σ_x ||x - P Σ_{m=1}^M c_{mi_m(x)}||^2 s.t. Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)} = ε, Σ_m ||C_m||_1 ≤ T Minimize quantization error for search accuracy Constant constraint for search efficiency Sparsity constraint for precomputation efficiency

89 Multi-stage vector quantization Database X Vector quantization (1) Residuals Vector quantization (2) Residuals ... Residuals Vector quantization (M)
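The multi-stage scheme above quantizes residuals stage by stage. A minimal sketch under my own naming, using plain k-means per stage (not the authors' implementation):

```python
import numpy as np

def train_rq(X, M, K, iters=15, seed=0):
    # stage m runs k-means on the residuals left by stages 1..m-1
    rng = np.random.default_rng(seed)
    residual = X.copy()
    codebooks = []
    for _ in range(M):
        centers = residual[rng.choice(len(residual), K, replace=False)].copy()
        for _ in range(iters):
            d2 = ((residual[:, None, :] - centers[None]) ** 2).sum(-1)
            assign = d2.argmin(1)
            for k in range(K):
                if (assign == k).any():
                    centers[k] = residual[assign == k].mean(0)
        # final assignment against the trained centers, then pass residuals on
        d2 = ((residual[:, None, :] - centers[None]) ** 2).sum(-1)
        assign = d2.argmin(1)
        codebooks.append(centers)
        residual = residual - centers[assign]
    return codebooks, residual
```

A vector is reconstructed by summing one codeword per stage, so multi-stage quantization is another additive scheme, but trained greedily rather than jointly as in composite quantization.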

90 Conclusion Composite quantization A compact coding approach for approximate nearest neighbor search Joint optimization of search accuracy and search efficiency State-of-the-art performance

91 A Survey on Learning to Hash

92 Call for papers

93 Call for papers

94 Thanks Q&A

95 A distance preserving view of quantization Quantization Data approximation: x ~ x'(x) Better search if better distance preserving: |d(q, x) - d(q, x')| is small Distance preserving view Triangle inequality: |d(q, x) - d(q, x')| ≤ d(x, x') Minimize the upper bound: d(x, x') = ||x - x'||

96 A joint minimization view Generalized triangle inequality: |d(q, x) - d~(q, x')| ≤ ||x - x'|| + |δ|^{1/2} Distortion: ||x - x'|| Efficiency: δ where d~(q, x') = (Σ_{m=1}^M ||q - c_{mi_m(x)}||^2 - (M - 1)||q||^2)^{1/2}, δ = Σ_{m≠l} c_{mi_m(x)}^T c_{li_l(x)}, x' = Σ_{m=1}^M c_{mi_m(x)}

97 A joint minimization view Generalized triangle inequality: |d(q, x) - d~(q, x')| ≤ ||x - x'|| + |δ|^{1/2} Distortion and efficiency Our formulation: min_{{C_m}, {i_m(x)}, ε} Σ_x ||x - x'(x)||^2 s.t. δ = ε Minimize distortion for search accuracy Constant constraint for search efficiency


More information

Riemannian Metric Learning for Symmetric Positive Definite Matrices

Riemannian Metric Learning for Symmetric Positive Definite Matrices CMSC 88J: Linear Subspaces and Manifolds for Computer Vision and Machine Learning Riemannian Metric Learning for Symmetric Positive Definite Matrices Raviteja Vemulapalli Guide: Professor David W. Jacobs

More information

Metric Embedding of Task-Specific Similarity. joint work with Trevor Darrell (MIT)

Metric Embedding of Task-Specific Similarity. joint work with Trevor Darrell (MIT) Metric Embedding of Task-Specific Similarity Greg Shakhnarovich Brown University joint work with Trevor Darrell (MIT) August 9, 2006 Task-specific similarity A toy example: Task-specific similarity A toy

More information

Nearest Neighbor Search with Keywords in Spatial Databases

Nearest Neighbor Search with Keywords in Spatial Databases 776 Nearest Neighbor Search with Keywords in Spatial Databases 1 Sphurti S. Sao, 2 Dr. Rahila Sheikh 1 M. Tech Student IV Sem, Dept of CSE, RCERT Chandrapur, MH, India 2 Head of Department, Dept of CSE,

More information

A Randomized Approach for Crowdsourcing in the Presence of Multiple Views

A Randomized Approach for Crowdsourcing in the Presence of Multiple Views A Randomized Approach for Crowdsourcing in the Presence of Multiple Views Presenter: Yao Zhou joint work with: Jingrui He - 1 - Roadmap Motivation Proposed framework: M2VW Experimental results Conclusion

More information

Pictorial Structures Revisited: People Detection and Articulated Pose Estimation. Department of Computer Science TU Darmstadt

Pictorial Structures Revisited: People Detection and Articulated Pose Estimation. Department of Computer Science TU Darmstadt Pictorial Structures Revisited: People Detection and Articulated Pose Estimation Mykhaylo Andriluka Stefan Roth Bernt Schiele Department of Computer Science TU Darmstadt Generic model for human detection

More information

Locality-Sensitive Hashing for Chi2 Distance

Locality-Sensitive Hashing for Chi2 Distance IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. XXXX, NO. XXX, JUNE 010 1 Locality-Sensitive Hashing for Chi Distance David Gorisse, Matthieu Cord, and Frederic Precioso Abstract In

More information

Accelerated Manhattan hashing via bit-remapping with location information

Accelerated Manhattan hashing via bit-remapping with location information DOI 1.17/s1142-15-3217-x Accelerated Manhattan hashing via bit-remapping with location information Wenshuo Chen 1 Guiguang Ding 1 Zijia Lin 2 Jisheng Pei 2 Received: 2 May 215 / Revised: 2 December 215

More information

Dominant Feature Vectors Based Audio Similarity Measure

Dominant Feature Vectors Based Audio Similarity Measure Dominant Feature Vectors Based Audio Similarity Measure Jing Gu 1, Lie Lu 2, Rui Cai 3, Hong-Jiang Zhang 2, and Jian Yang 1 1 Dept. of Electronic Engineering, Tsinghua Univ., Beijing, 100084, China 2 Microsoft

More information

Linear Spectral Hashing

Linear Spectral Hashing Linear Spectral Hashing Zalán Bodó and Lehel Csató Babeş Bolyai University - Faculty of Mathematics and Computer Science Kogălniceanu 1., 484 Cluj-Napoca - Romania Abstract. assigns binary hash keys to

More information

Optimal Data-Dependent Hashing for Approximate Near Neighbors

Optimal Data-Dependent Hashing for Approximate Near Neighbors Optimal Data-Dependent Hashing for Approximate Near Neighbors Alexandr Andoni 1 Ilya Razenshteyn 2 1 Simons Institute 2 MIT, CSAIL April 20, 2015 1 / 30 Nearest Neighbor Search (NNS) Let P be an n-point

More information

Nonnegative Matrix Factorization Clustering on Multiple Manifolds

Nonnegative Matrix Factorization Clustering on Multiple Manifolds Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Nonnegative Matrix Factorization Clustering on Multiple Manifolds Bin Shen, Luo Si Department of Computer Science,

More information

Distance Metric Learning in Data Mining (Part II) Fei Wang and Jimeng Sun IBM TJ Watson Research Center

Distance Metric Learning in Data Mining (Part II) Fei Wang and Jimeng Sun IBM TJ Watson Research Center Distance Metric Learning in Data Mining (Part II) Fei Wang and Jimeng Sun IBM TJ Watson Research Center 1 Outline Part I - Applications Motivation and Introduction Patient similarity application Part II

More information

LOCALITY SENSITIVE HASHING FOR BIG DATA

LOCALITY SENSITIVE HASHING FOR BIG DATA 1 LOCALITY SENSITIVE HASHING FOR BIG DATA Wei Wang, University of New South Wales Outline 2 NN and c-ann Existing LSH methods for Large Data Our approach: SRS [VLDB 2015] Conclusions NN and c-ann Queries

More information

Pivot Selection Techniques

Pivot Selection Techniques Pivot Selection Techniques Proximity Searching in Metric Spaces by Benjamin Bustos, Gonzalo Navarro and Edgar Chávez Catarina Moreira Outline Introduction Pivots and Metric Spaces Pivots in Nearest Neighbor

More information

Data preprocessing. DataBase and Data Mining Group 1. Data set types. Tabular Data. Document Data. Transaction Data. Ordered Data

Data preprocessing. DataBase and Data Mining Group 1. Data set types. Tabular Data. Document Data. Transaction Data. Ordered Data Elena Baralis and Tania Cerquitelli Politecnico di Torino Data set types Record Tables Document Data Transaction Data Graph World Wide Web Molecular Structures Ordered Spatial Data Temporal Data Sequential

More information

arxiv:cs/ v1 [cs.cg] 7 Feb 2006

arxiv:cs/ v1 [cs.cg] 7 Feb 2006 Approximate Weighted Farthest Neighbors and Minimum Dilation Stars John Augustine, David Eppstein, and Kevin A. Wortman Computer Science Department University of California, Irvine Irvine, CA 92697, USA

More information

Photo Tourism and im2gps: 3D Reconstruction and Geolocation of Internet Photo Collections Part II

Photo Tourism and im2gps: 3D Reconstruction and Geolocation of Internet Photo Collections Part II Photo Tourism and im2gps: 3D Reconstruction and Geolocation of Internet Photo Collections Part II Noah Snavely Cornell James Hays CMU MIT (Fall 2009) Brown (Spring 2010 ) Complexity of the Visual World

More information

Spectral Hashing: Learning to Leverage 80 Million Images

Spectral Hashing: Learning to Leverage 80 Million Images Spectral Hashing: Learning to Leverage 80 Million Images Yair Weiss, Antonio Torralba, Rob Fergus Hebrew University, MIT, NYU Outline Motivation: Brute Force Computer Vision. Semantic Hashing. Spectral

More information

Vector Quantization Techniques for Approximate Nearest Neighbor Search on Large- Scale Datasets

Vector Quantization Techniques for Approximate Nearest Neighbor Search on Large- Scale Datasets Tampere University of Technology Vector Quantization Techniques for Approximate Nearest Neighbor Search on Large- Scale Datasets Citation Ozan, E. C. (017). Vector Quantization Techniques for Approximate

More information

Supplemental Material for Discrete Graph Hashing

Supplemental Material for Discrete Graph Hashing Supplemental Material for Discrete Graph Hashing Wei Liu Cun Mu Sanjiv Kumar Shih-Fu Chang IM T. J. Watson Research Center Columbia University Google Research weiliu@us.ibm.com cm52@columbia.edu sfchang@ee.columbia.edu

More information

Large-scale Collaborative Ranking in Near-Linear Time

Large-scale Collaborative Ranking in Near-Linear Time Large-scale Collaborative Ranking in Near-Linear Time Liwei Wu Depts of Statistics and Computer Science UC Davis KDD 17, Halifax, Canada August 13-17, 2017 Joint work with Cho-Jui Hsieh and James Sharpnack

More information

CHAPTER 3. Transformed Vector Quantization with Orthogonal Polynomials Introduction Vector quantization

CHAPTER 3. Transformed Vector Quantization with Orthogonal Polynomials Introduction Vector quantization 3.1. Introduction CHAPTER 3 Transformed Vector Quantization with Orthogonal Polynomials In the previous chapter, a new integer image coding technique based on orthogonal polynomials for monochrome images

More information

Face recognition Computer Vision Spring 2018, Lecture 21

Face recognition Computer Vision Spring 2018, Lecture 21 Face recognition http://www.cs.cmu.edu/~16385/ 16-385 Computer Vision Spring 2018, Lecture 21 Course announcements Homework 6 has been posted and is due on April 27 th. - Any questions about the homework?

More information

CMSC 422 Introduction to Machine Learning Lecture 4 Geometry and Nearest Neighbors. Furong Huang /

CMSC 422 Introduction to Machine Learning Lecture 4 Geometry and Nearest Neighbors. Furong Huang / CMSC 422 Introduction to Machine Learning Lecture 4 Geometry and Nearest Neighbors Furong Huang / furongh@cs.umd.edu What we know so far Decision Trees What is a decision tree, and how to induce it from

More information

Multiple Similarities Based Kernel Subspace Learning for Image Classification

Multiple Similarities Based Kernel Subspace Learning for Image Classification Multiple Similarities Based Kernel Subspace Learning for Image Classification Wang Yan, Qingshan Liu, Hanqing Lu, and Songde Ma National Laboratory of Pattern Recognition, Institute of Automation, Chinese

More information

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 1. Erkun Yang, Cheng Deng, Member, IEEE, Chao Li, Wei Liu, Jie Li, and Dacheng Tao

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 1. Erkun Yang, Cheng Deng, Member, IEEE, Chao Li, Wei Liu, Jie Li, and Dacheng Tao IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 1 Shared Predictive Cross-Modal Deep Quantization Erkun Yang, Cheng Deng, Member, IEEE, Chao Li, Wei Liu, Jie Li, and Dacheng Tao, Fellow, IEEE

More information

Large-Scale Behavioral Targeting

Large-Scale Behavioral Targeting Large-Scale Behavioral Targeting Ye Chen, Dmitry Pavlov, John Canny ebay, Yandex, UC Berkeley (This work was conducted at Yahoo! Labs.) June 30, 2009 Chen et al. (KDD 09) Large-Scale Behavioral Targeting

More information

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins

Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins 11 Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins Wendy OSBORN a, 1 and Saad ZAAMOUT a a Department of Mathematics and Computer Science, University of Lethbridge, Lethbridge,

More information

Configuring Spatial Grids for Efficient Main Memory Joins

Configuring Spatial Grids for Efficient Main Memory Joins Configuring Spatial Grids for Efficient Main Memory Joins Farhan Tauheed, Thomas Heinis, and Anastasia Ailamaki École Polytechnique Fédérale de Lausanne (EPFL), Imperial College London Abstract. The performance

More information

Term Filtering with Bounded Error

Term Filtering with Bounded Error Term Filtering with Bounded Error Zi Yang, Wei Li, Jie Tang, and Juanzi Li Knowledge Engineering Group Department of Computer Science and Technology Tsinghua University, China {yangzi, tangjie, ljz}@keg.cs.tsinghua.edu.cn

More information

A Domain Decomposition Based Jacobi-Davidson Algorithm for Quantum Dot Simulation

A Domain Decomposition Based Jacobi-Davidson Algorithm for Quantum Dot Simulation A Domain Decomposition Based Jacobi-Davidson Algorithm for Quantum Dot Simulation Tao Zhao 1, Feng-Nan Hwang 2 and Xiao-Chuan Cai 3 Abstract In this paper, we develop an overlapping domain decomposition

More information

LOCALITY PRESERVING HASHING. Electrical Engineering and Computer Science University of California, Merced Merced, CA 95344, USA

LOCALITY PRESERVING HASHING. Electrical Engineering and Computer Science University of California, Merced Merced, CA 95344, USA LOCALITY PRESERVING HASHING Yi-Hsuan Tsai Ming-Hsuan Yang Electrical Engineering and Computer Science University of California, Merced Merced, CA 95344, USA ABSTRACT The spectral hashing algorithm relaxes

More information

Ranking from Crowdsourced Pairwise Comparisons via Matrix Manifold Optimization

Ranking from Crowdsourced Pairwise Comparisons via Matrix Manifold Optimization Ranking from Crowdsourced Pairwise Comparisons via Matrix Manifold Optimization Jialin Dong ShanghaiTech University 1 Outline Introduction FourVignettes: System Model and Problem Formulation Problem Analysis

More information

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining Data Mining: Data Lecture Notes for Chapter 2 Introduction to Data Mining by Tan, Steinbach, Kumar 1 Types of data sets Record Tables Document Data Transaction Data Graph World Wide Web Molecular Structures

More information

Toward Correlating and Solving Abstract Tasks Using Convolutional Neural Networks Supplementary Material

Toward Correlating and Solving Abstract Tasks Using Convolutional Neural Networks Supplementary Material Toward Correlating and Solving Abstract Tasks Using Convolutional Neural Networks Supplementary Material Kuan-Chuan Peng Cornell University kp388@cornell.edu Tsuhan Chen Cornell University tsuhan@cornell.edu

More information

Face Recognition. Face Recognition. Subspace-Based Face Recognition Algorithms. Application of Face Recognition

Face Recognition. Face Recognition. Subspace-Based Face Recognition Algorithms. Application of Face Recognition ace Recognition Identify person based on the appearance of face CSED441:Introduction to Computer Vision (2017) Lecture10: Subspace Methods and ace Recognition Bohyung Han CSE, POSTECH bhhan@postech.ac.kr

More information

Asymmetric Minwise Hashing for Indexing Binary Inner Products and Set Containment

Asymmetric Minwise Hashing for Indexing Binary Inner Products and Set Containment Asymmetric Minwise Hashing for Indexing Binary Inner Products and Set Containment Anshumali Shrivastava and Ping Li Cornell University and Rutgers University WWW 25 Florence, Italy May 2st 25 Will Join

More information

Ch. 10 Vector Quantization. Advantages & Design

Ch. 10 Vector Quantization. Advantages & Design Ch. 10 Vector Quantization Advantages & Design 1 Advantages of VQ There are (at least) 3 main characteristics of VQ that help it outperform SQ: 1. Exploit Correlation within vectors 2. Exploit Shape Flexibility

More information

Mid Term-1 : Practice problems

Mid Term-1 : Practice problems Mid Term-1 : Practice problems These problems are meant only to provide practice; they do not necessarily reflect the difficulty level of the problems in the exam. The actual exam problems are likely to

More information

Semi Supervised Distance Metric Learning

Semi Supervised Distance Metric Learning Semi Supervised Distance Metric Learning wliu@ee.columbia.edu Outline Background Related Work Learning Framework Collaborative Image Retrieval Future Research Background Euclidean distance d( x, x ) =

More information

High Dimensional Geometry, Curse of Dimensionality, Dimension Reduction

High Dimensional Geometry, Curse of Dimensionality, Dimension Reduction Chapter 11 High Dimensional Geometry, Curse of Dimensionality, Dimension Reduction High-dimensional vectors are ubiquitous in applications (gene expression data, set of movies watched by Netflix customer,

More information

Efficient identification of Tanimoto nearest neighbors

Efficient identification of Tanimoto nearest neighbors Int J Data Sci Anal (2017) 4:153 172 DOI 10.1007/s41060-017-0064-z REGULAR PAPER Efficient identification of Tanimoto nearest neighbors All-pairs similarity search using the extended Jaccard coefficient

More information

Efficient Data Reduction and Summarization

Efficient Data Reduction and Summarization Ping Li Efficient Data Reduction and Summarization December 8, 2011 FODAVA Review 1 Efficient Data Reduction and Summarization Ping Li Department of Statistical Science Faculty of omputing and Information

More information

For Review Only. Codebook-free Compact Descriptor for Scalable Visual Search

For Review Only. Codebook-free Compact Descriptor for Scalable Visual Search Page of 0 0 0 0 0 0 Codebook-free Compact Descriptor for Scalable Visual Search Yuwei u, Feng Gao, Yicheng Huang, Jie Lin Member, IEEE, Vijay Chandrasekhar Member, IEEE, Junsong Yuan Senior Member, IEEE,

More information

A Multi-task Learning Strategy for Unsupervised Clustering via Explicitly Separating the Commonality

A Multi-task Learning Strategy for Unsupervised Clustering via Explicitly Separating the Commonality A Multi-task Learning Strategy for Unsupervised Clustering via Explicitly Separating the Commonality Shu Kong, Donghui Wang Dept. of Computer Science and Technology, Zhejiang University, Hangzhou 317,

More information

HARMONIC VECTOR QUANTIZATION

HARMONIC VECTOR QUANTIZATION HARMONIC VECTOR QUANTIZATION Volodya Grancharov, Sigurdur Sverrisson, Erik Norvell, Tomas Toftgård, Jonas Svedberg, and Harald Pobloth SMN, Ericsson Research, Ericsson AB 64 8, Stockholm, Sweden ABSTRACT

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Dan Oneaţă 1 Introduction Probabilistic Latent Semantic Analysis (plsa) is a technique from the category of topic models. Its main goal is to model cooccurrence information

More information

Self-Tuning Semantic Image Segmentation

Self-Tuning Semantic Image Segmentation Self-Tuning Semantic Image Segmentation Sergey Milyaev 1,2, Olga Barinova 2 1 Voronezh State University sergey.milyaev@gmail.com 2 Lomonosov Moscow State University obarinova@graphics.cs.msu.su Abstract.

More information

Ad Placement Strategies

Ad Placement Strategies Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox 2014 Emily Fox January

More information

Dimension reduction methods: Algorithms and Applications Yousef Saad Department of Computer Science and Engineering University of Minnesota

Dimension reduction methods: Algorithms and Applications Yousef Saad Department of Computer Science and Engineering University of Minnesota Dimension reduction methods: Algorithms and Applications Yousef Saad Department of Computer Science and Engineering University of Minnesota Université du Littoral- Calais July 11, 16 First..... to the

More information

Face detection and recognition. Detection Recognition Sally

Face detection and recognition. Detection Recognition Sally Face detection and recognition Detection Recognition Sally Face detection & recognition Viola & Jones detector Available in open CV Face recognition Eigenfaces for face recognition Metric learning identification

More information

Transforming Hierarchical Trees on Metric Spaces

Transforming Hierarchical Trees on Metric Spaces CCCG 016, Vancouver, British Columbia, August 3 5, 016 Transforming Hierarchical Trees on Metric Spaces Mahmoodreza Jahanseir Donald R. Sheehy Abstract We show how a simple hierarchical tree called a cover

More information

Regression. Goal: Learn a mapping from observations (features) to continuous labels given a training set (supervised learning)

Regression. Goal: Learn a mapping from observations (features) to continuous labels given a training set (supervised learning) Linear Regression Regression Goal: Learn a mapping from observations (features) to continuous labels given a training set (supervised learning) Example: Height, Gender, Weight Shoe Size Audio features

More information

Methods for sparse analysis of high-dimensional data, II

Methods for sparse analysis of high-dimensional data, II Methods for sparse analysis of high-dimensional data, II Rachel Ward May 26, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 55 High dimensional

More information