Composite Quantization for Approximate Nearest Neighbor Search
1 Composite Quantization for Approximate Nearest Neighbor Search. Jingdong Wang, Lead Researcher, Microsoft Research. ICML 2014, joint work with my interns Ting Zhang from USTC and Chao Du from Tsinghua University.
2 Outline Introduction Problem Product quantization Cartesian k-means Composite quantization Experiments
3 Nearest neighbor search Application to similar image search query
4 Nearest neighbor search Application to particular object retrieval query
5 Nearest neighbor search Application to duplicate image search
6 Nearest neighbor search Similar image search Application to K-NN annotation: Annotate the query image using the similar images
7 Nearest neighbor search Definition. Database: $\mathcal{X} = \{x_1, \dots, x_N\} \subset \mathbb{R}^d$. Query: $q \in \mathbb{R}^d$. Nearest neighbor: $\mathrm{NN}(q) = \arg\min_{x \in \mathcal{X}} d(q, x)$.
8 Nearest neighbor search Exact nearest neighbor search. Linear scan: compute $d(q, x)$ for every database vector, $O(Nd)$ time per query.
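As a baseline, here is a minimal numpy sketch of the exact linear scan (illustrative only; the array names and sizes are assumptions, not from the talk):

```python
import numpy as np

def linear_scan(q, X):
    """Exact nearest neighbor by brute force: O(Nd) per query."""
    # Squared Euclidean distances from q to every row of X.
    dists = np.sum((X - q) ** 2, axis=1)
    return np.argmin(dists), np.sqrt(dists.min())

# Toy usage with random data.
X = np.random.randn(10000, 128).astype(np.float32)  # database
q = np.random.randn(128).astype(np.float32)         # query
idx, dist = linear_scan(q, X)
```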
9 Nearest neighbor search - speedup K-dimensional tree (Kd tree) Generalized binary search tree Metric tree Ball tree VP tree BD tree Cover tree
10 Nearest neighbor search Exact nearest neighbor search linear scan: Costly and impractical for large scale high-dimensional cases Approximate nearest neighbor (ANN) search Efficient Acceptable accuracy Practically used
11 Two principles for ANN search Recall the complexity of linear scan: $O(Nd)$. 1. Reduce the number of distance computations, i.e., visit only $N' \ll N$ candidates: tree structures, neighborhood graph search and inverted indexes.
12 Our work: TP Tree + NG Search. TP Tree: Jingdong Wang, Naiyan Wang, You Jia, Jian Li, Gang Zeng, Hongbin Zha, Xian-Sheng Hua: Trinary-Projection Trees for Approximate Nearest Neighbor Search. IEEE Trans. Pattern Anal. Mach. Intell. 36(2) (2014). You Jia, Jingdong Wang, Gang Zeng, Hongbin Zha, Xian-Sheng Hua: Optimizing kd-trees for scalable visual descriptor indexing. CVPR 2010. Neighborhood graph search: Jingdong Wang, Shipeng Li: Query-driven iterated neighborhood graph search for large scale indexing. ACM Multimedia 2012. Jing Wang, Jingdong Wang, Gang Zeng, Rui Gan, Shipeng Li, Baining Guo: Fast Neighborhood Graph Search Using Cartesian Concatenation. ICCV 2013. Neighborhood graph construction: Jing Wang, Jingdong Wang, Gang Zeng, Zhuowen Tu, Rui Gan, Shipeng Li: Scalable k-NN graph construction for visual descriptors. CVPR 2012.
13 Comparison over SIFT 1M [figure: 1-NN recall vs. search time for the ICCV13, ACMMM12 and CVPR10 methods]
14 Comparison over GIST 1M [figure: 1-NN recall vs. search time for the ICCV13, ACMMM12 and CVPR10 methods]
15 Comparison over HOG 10M [figure: 1-NN recall vs. search time for the ICCV13, ACMMM12 and CVPR10 methods]
16 Neighborhood Graph Search Shipped to Bing. On a cluster bed, index building time on 40M documents is only 2 hours and 10 minutes. Search DPS on each NNS machine is stable at 950, without retries or errors. Five times faster than improved FLANN.
17 Two principles for ANN search Recall the complexity of linear scan: $O(Nd)$. 1. Reduce the number of distance computations (tree structures, neighborhood graph search and inverted indexes): high efficiency, but large memory cost. 2. Reduce the cost of each distance computation (hashing, compact codes): small memory cost, but lower efficiency.
18 Approximate nearest neighbor search. Binary embedding methods (hashing): produce only a few distinct distances, so they have limited ability and flexibility of distance approximation. Vector quantization (compact codes): plain k-means is impossible to use for medium and large code lengths, since the codebook (one center per code) is impossible to learn and computing a code for a vector is impossible at that scale; product quantization and Cartesian k-means address this.
19 Combined solution for very large scale search Retrieve candidates with an index structure using compact codes Load raw features for retrieved candidates from disk Reranking using the true distances Efficient and small memory consumption IO cost is small
20 Outline Introduction Problem Product quantization Cartesian k-means Composite quantization Experiments
21-25 Product quantization. Approximate $x$ by the concatenation of $M$ subvectors: split $x = [x^1; x^2; \dots; x^M]$ and quantize each subvector, $x \approx [p_{1i_1}; p_{2i_2}; \dots; p_{Mi_M}]$, where $\{p_{m1}, p_{m2}, \dots, p_{mK}\}$ is the codebook in the $m$-th subspace.
26 Product quantization. Code representation: $(i_1, i_2, \dots, i_M)$. Distance computation: $d(q, \bar{x})^2 = d(q^1, p_{1i_1})^2 + d(q^2, p_{2i_2})^2 + \dots + d(q^M, p_{Mi_M})^2$, i.e., $M$ additions using a pre-computed distance table per subspace, $\{d(q^m, p_{m1})^2, d(q^m, p_{m2})^2, \dots, d(q^m, p_{mK})^2\}$.
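A minimal numpy sketch of this table-based (asymmetric) distance computation, assuming codebooks of shape (M, K, d/M) and codes of shape (N, M); the names are illustrative, not the authors' implementation:

```python
import numpy as np

def pq_asymmetric_distances(q, codebooks, codes):
    """Squared distances from query q to all PQ-encoded vectors.

    codebooks: (M, K, ds) array, one K-entry codebook per subspace.
    codes:     (N, M) integer array of subspace codeword indices.
    """
    M, K, ds = codebooks.shape
    q_sub = q.reshape(M, ds)
    # Distance table: table[m, k] = ||q^m - p_mk||^2, computed once per query.
    table = np.sum((codebooks - q_sub[:, None, :]) ** 2, axis=2)  # (M, K)
    # Each database distance is M table lookups and M-1 additions.
    return table[np.arange(M), codes].sum(axis=1)                 # (N,)
```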
27-31 Product quantization, $M = 2$, $K = 3$. Codebook generation: approximate $x$ by the concatenation of $M$ subvectors and run k-means in each subspace separately. This results in $K^M$ groups; the center of each group is the concatenation of $M$ subvector centers, e.g., $[p_{11}; p_{21}]$.
32 Product quantization, $M = 2$, $K = 3$. A vector is quantized to the nearest composite center, e.g., $x \approx [p_{13}; p_{22}]$.
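A compact training sketch under these definitions, using scikit-learn's KMeans per subspace (an assumption about tooling, not the authors' code):

```python
import numpy as np
from sklearn.cluster import KMeans

def train_pq(X, M=8, K=256, seed=0):
    """Learn one K-entry codebook per subspace and encode X."""
    N, d = X.shape
    ds = d // M  # subspace dimension (assumes d divisible by M)
    codebooks = np.empty((M, K, ds), dtype=X.dtype)
    codes = np.empty((N, M), dtype=np.int32)
    for m in range(M):
        sub = X[:, m * ds:(m + 1) * ds]          # m-th subvectors
        km = KMeans(n_clusters=K, n_init=4, random_state=seed).fit(sub)
        codebooks[m] = km.cluster_centers_
        codes[:, m] = km.labels_                  # codeword index per vector
    return codebooks, codes
```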
33 Outline Introduction Problem Product quantization Cartesian k-means Composite quantization Experiments
34-36 Cartesian k-means. Extended product quantization: learn an optimal space rotation $R$ and perform PQ over the rotated space, $x \approx R [p_{1i_1}; p_{2i_2}; \dots; p_{Mi_M}]$, e.g., $x \approx R [p_{13}; p_{22}]$ in the $M = 2$, $K = 3$ example.
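One common way to realize the rotation update in such alternating schemes is the orthogonal Procrustes solution; here is a hedged numpy sketch of that general recipe (not necessarily the paper's exact algorithm):

```python
import numpy as np

def update_rotation(X, X_hat):
    """Best orthogonal R minimizing ||X - X_hat @ R.T||_F^2.

    X:     (N, d) original vectors.
    X_hat: (N, d) current PQ reconstructions in the rotated space.
    """
    # Orthogonal Procrustes: R = U V^T from the SVD of X^T X_hat.
    U, _, Vt = np.linalg.svd(X.T @ X_hat)
    return U @ Vt
```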
37 Outline Introduction Composite quantization Experiments
38 Composite quantization. Approximate $x$ by the addition of $M$ vectors: $x \approx \bar{x} = c_{1i_1} + c_{2i_2} + \dots + c_{Mi_M}$, with source codebooks $\{c_{11}, c_{12}, \dots, c_{1K}\}$, $\{c_{21}, c_{22}, \dots, c_{2K}\}$, ..., $\{c_{M1}, c_{M2}, \dots, c_{MK}\}$. Each source codebook is composed of $K$ $d$-dimensional vectors.
39-45 Composite quantization, example with two source codebooks $\{c_{11}, c_{12}, c_{13}\}$ and $\{c_{21}, c_{22}, c_{23}\}$. Each composite center is the sum of one element from each codebook ($c_{11} + c_{21}$, $c_{11} + c_{22}$, $c_{11} + c_{23}$, and so on), giving a composite codebook of $3 \times 3 = 9$ composite centers that partition the space into 9 groups; a vector is approximated by its composite center, e.g., $x \approx c_{11} + c_{22}$.
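A tiny sketch of this enumeration, forming all $K^M$ composite centers from $M$ small codebooks (only feasible for toy sizes; the names are illustrative):

```python
import itertools
import numpy as np

def composite_centers(codebooks):
    """All K^M composite centers from codebooks of shape (M, K, d)."""
    M, K, d = codebooks.shape
    centers = [sum(codebooks[m][k] for m, k in enumerate(idx))
               for idx in itertools.product(range(K), repeat=M)]
    return np.stack(centers)  # (K**M, d)

# The slide's example: M = 2 codebooks with K = 3 entries in 2-D.
codebooks = np.random.randn(2, 3, 2)
print(composite_centers(codebooks).shape)  # (9, 2)
```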
46-47 Composite quantization. Approximate $x$ by the addition of $M$ vectors: $x \approx \bar{x} = c_{1i_1} + c_{2i_2} + \dots + c_{Mi_M}$. Code representation: $(i_1, i_2, \dots, i_M)$, of length $M \log_2 K$ bits.
48-52 Connection to product quantization. Concatenation in product quantization = addition: $x \approx [p_{1i_1}; p_{2i_2}; \dots; p_{Mi_M}] = [p_{1i_1}; 0; \dots; 0] + [0; p_{2i_2}; \dots; 0] + \dots + [0; 0; \dots; p_{Mi_M}]$. Thus product quantization is a constrained form of composite quantization $x \approx c_{1i_1} + c_{2i_2} + \dots + c_{Mi_M}$.
53-55 Connection to Cartesian k-means. Concatenation in Cartesian k-means = addition: $x \approx R [p_{1i_1}; p_{2i_2}; \dots; p_{Mi_M}] = R [p_{1i_1}; 0; \dots; 0] + R [0; p_{2i_2}; \dots; 0] + \dots + R [0; 0; \dots; p_{Mi_M}]$. Thus Cartesian k-means is also a constrained form of composite quantization $x \approx c_{1i_1} + c_{2i_2} + \dots + c_{Mi_M}$.
56 Composite quantization generalizes product quantization and Cartesian k-means. Advantages: more flexible codebooks, more accurate data approximation, higher search accuracy. Search efficiency? It depends on the efficiency of the distance computation. Product quantization: coordinate-aligned space partition. Cartesian k-means: rotated coordinate-aligned space partition. Composite quantization: flexible space partition.
57 Approximate distance computation. Recall the data approximation $x \approx \bar{x} = \sum_{m=1}^{M} c_{mi_m}(x)$. Approximate the distance to the query $q$ by $\|q - x\| \approx \|q - \sum_{m=1}^{M} c_{mi_m}(x)\|$, which is time-consuming to evaluate directly.
58-64 Approximate distance computation. Expand the squared distance into three terms: $\|q - \sum_{m=1}^{M} c_{mi_m}(x)\|^2 = \sum_{m=1}^{M} \|q - c_{mi_m}(x)\|^2 - (M-1)\|q\|^2 + \sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x)$. The first term costs $O(M)$ additions, implemented with a pre-computed distance lookup table that stores the distances from the source codebook elements to $q$. The second term is constant for a given query. The third term costs $O(M^2)$ additions using a pre-computed dot-product lookup table that stores the dot products between codebook elements, but $O(M^2)$ is still expensive. If the third term is constant, computing the first term is enough for search.
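A hedged numpy sketch of this query-time computation under the constant-constraint: ranking can drop the constant second and third terms and use the first term alone (all names are assumptions):

```python
import numpy as np

def cq_search_distances(q, codebooks, codes):
    """Ranking scores for CQ codes: sum_m ||q - c_{m,i_m}||^2.

    With sum_{m!=l} c^T c held constant across vectors and
    -(M-1)||q||^2 fixed per query, this term preserves the ranking.
    codebooks: (M, K, d) source codebooks; codes: (N, M) indices.
    """
    M, K, d = codebooks.shape
    # Lookup table: table[m, k] = ||q - c_{mk}||^2, built once per query.
    table = np.sum((codebooks - q) ** 2, axis=2)      # (M, K)
    return table[np.arange(M), codes].sum(axis=1)     # (N,)
```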
65-66 Formulation. In $\|q - \sum_{m=1}^{M} c_{mi_m}(x)\|^2 = \sum_{m=1}^{M} \|q - c_{mi_m}(x)\|^2 - (M-1)\|q\|^2 + \sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x)$, minimize the quantization error $\sum_x \|x - \sum_{m=1}^{M} c_{mi_m}(x)\|^2$ subject to the third term being a constant. Constrained formulation: $\min_{\{C_m\}, \{i_m(x)\}, \epsilon} \sum_x \|x - \sum_{m=1}^{M} c_{mi_m}(x)\|^2$ s.t. $\sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x) = \epsilon$. Minimizing the quantization error gives search accuracy; the constant constraint gives search efficiency.
67-68 Connection to PQ and CKM. Product quantization and Cartesian k-means are suboptimal solutions of our approach: they use non-overlapping space partitioning with mutually orthogonal codebooks, so $\sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x) = 0$, a special case of the constant constraint.
69 Formulation. The constrained formulation is transformed to an unconstrained one by adding the constraints into the objective function: $\phi(\{C_m\}, \{i_m(x)\}, \epsilon) = \sum_x \|x - \sum_{m=1}^{M} c_{mi_m}(x)\|^2 + \mu \sum_x (\sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x) - \epsilon)^2$.
70 Optimization. Unconstrained formulation: $\phi(\{C_m\}, \{i_m(x)\}, \epsilon) = \sum_x \|x - \sum_{m=1}^{M} c_{mi_m}(x)\|^2$ (distortion error) $+ \mu \sum_x (\sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x) - \epsilon)^2$ (constraint violation). Alternating optimization between $\{C_m\}$, $\{i_m(x)\}$ and $\epsilon$; the penalty $\mu$ is selected by validation.
71 Alternating optimization. Update $\{i_m(x)\}$: iterative alternating optimization; fixing $i_l(x)$ for $l \neq m$, update $i_m(x)$. Update $\epsilon$: closed-form solution, $\epsilon = \frac{1}{\#\{x\}} \sum_x \sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x)$. Update $\{C_m\}$: L-BFGS algorithm.
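A hedged sketch of the $\{i_m(x)\}$ update for one vector: coordinate descent over the subcodebooks, holding the other assignments fixed, with the penalty term included (a direct transcription of the rule above; all names are assumptions):

```python
import numpy as np

def update_codes_one_vector(x, codebooks, code, mu, eps, n_sweeps=3):
    """Greedy update of code (length M) for one vector x.

    Minimizes ||x - sum_m c_{m,code[m]}||^2
              + mu * (sum_{m!=l} c^T c - eps)^2, one index at a time.
    """
    M, K, d = codebooks.shape
    for _ in range(n_sweeps):
        for m in range(M):
            rest = sum(codebooks[l, code[l]] for l in range(M) if l != m)
            best_k, best_obj = code[m], np.inf
            for k in range(K):
                approx = rest + codebooks[m, k]
                # Cross term sum_{m!=l} c^T c = ||sum c||^2 - sum ||c||^2.
                cross = (approx @ approx
                         - sum(codebooks[l, code[l]] @ codebooks[l, code[l]]
                               for l in range(M) if l != m)
                         - codebooks[m, k] @ codebooks[m, k])
                obj = np.sum((x - approx) ** 2) + mu * (cross - eps) ** 2
                if obj < best_obj:
                    best_k, best_obj = k, obj
            code[m] = best_k
    return code
```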
72 Complexity. Update $\{i_m(x)\}$: $O(MKdT_d)$. Update $\epsilon$: closed form, $O(NM^2)$. Update $\{C_m\}$: L-BFGS, $O(NMdT_lT_c)$. Convergence?
73 Convergence: on 1M SIFT with 64 bits, the algorithm converges in about 10-15 iterations.
74 Outline Introduction Composite quantization Experiments
75 Experiments on ANN search. Datasets: 1 million 128D SIFT vectors with held-out queries; 1 million 960D GIST vectors with 1000 queries; 1 billion 128D SIFT vectors with 1000 queries. Evaluation: Recall@R, the fraction of queries for which the ground-truth Euclidean nearest neighbor is among the R retrieved items.
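A small sketch of the Recall@R metric exactly as defined here (the array conventions are assumptions):

```python
import numpy as np

def recall_at_R(retrieved, ground_truth, R):
    """Fraction of queries whose true NN is among the top-R results.

    retrieved:    (Q, >=R) ranked result ids per query.
    ground_truth: (Q,) id of the exact Euclidean NN per query.
    """
    hits = (retrieved[:, :R] == ground_truth[:, None]).any(axis=1)
    return hits.mean()
```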
76 Search pipeline. Given the source codebooks, the codes of the database vectors, and a query $q$: build the distance tables (between the query and the codebook elements), compute the distance between $q$ and each $x$ (repeated for the $n$ database vectors), and output the nearest vectors.
77 Comparison on 1M SIFT and 1M GIST
78 Comparison on 1M SIFT and 1M GIST, 64 bits. Ours: 71.59%, CKM: 63.83%.
79 Comparison on 1M SIFT and 1M GIST. Ours: 71.59%, CKM: 63.83%. The relatively small improvement on 1M GIST may be because CKM has already achieved a large improvement there.
80 Comparison on 1M SIFT and 1M GIST. Ours: 71.59% at 64 bits; ITQ: 53.95% at 128 bits. ITQ without asymmetric distance underperformed ITQ with asymmetric distance; our approach with 64 bits outperforms (A)ITQ with 128 bits, with slightly smaller search cost.
81 Comparison on 1B SIFT
82 Comparison on 1B SIFT. Ours: 70.1%, CKM: 64.57%.
83 Average query time [table: average query time per method omitted]
84 Application to object retrieval. Two datasets: INRIA Holidays contains 500 queries and 991 corresponding relevant images; UKBench contains 2,550 groups of 4 images each. Results: mAP on the Holidays dataset and scores on the UKBench dataset.
85 Discussion. The effect of $\epsilon$ in $\min \sum_x \|x - \bar{x}\|^2$ s.t. $\sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x) = \epsilon$: setting $\epsilon = 0$ indicates the dictionaries are mutually orthogonal, like splitting the space. Search performance with a learnt $\epsilon$ is better, since learning $\epsilon$ is more flexible.
86 Discussion. The effect of translation: replacing $x$ by $x - t$ in $\min \sum_x \|x - \bar{x}\|^2$ s.t. $\sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x) = \epsilon$ does not change the search performance much; the contribution of the offset is relatively small compared with the composite quantization.
87-88 Extension (CVPR15). Constrained formulation: $\min_{\{C_m\}, \{i_m(x)\}, \epsilon} \sum_x \|x - P \sum_{m=1}^{M} c_{mi_m}(x)\|^2$ s.t. $\sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x) = \epsilon$ and $\sum_m \|C_m\|_1 \leq T$. Minimizing the quantization error gives search accuracy, the constant constraint gives search efficiency, the sparsity constraint gives precomputation efficiency, and the projection $P$ performs dimension reduction.
89 Multi-stage vector quantization. Database $X$ → vector quantization (1) → residuals → vector quantization (2) → residuals → ... → vector quantization ($M$).
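A hedged sketch of this multi-stage (residual) scheme: each stage runs k-means on the residuals left by the previous stage (a generic residual quantizer, not the authors' exact code):

```python
import numpy as np
from sklearn.cluster import KMeans

def train_residual_quantizer(X, M=4, K=256, seed=0):
    """M-stage residual quantization: stage m encodes the residuals."""
    residual = X.copy()
    codebooks, codes = [], []
    for m in range(M):
        km = KMeans(n_clusters=K, n_init=4, random_state=seed).fit(residual)
        codebooks.append(km.cluster_centers_)
        codes.append(km.labels_)
        residual = residual - km.cluster_centers_[km.labels_]  # pass residuals on
    return np.stack(codebooks), np.stack(codes, axis=1)  # (M,K,d), (N,M)
```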
90 Conclusion Composite quantization A compact coding approach for approximate nearest neighbor search Joint optimization of search accuracy and search efficiency State-of-the-art performance
91 A Survey on Learning to Hash
92-93 Call for papers
94 Thanks Q&A
95 A distance preserving view of quantization. Quantization gives a data approximation $x \approx \bar{x}$. Search is better if distances are better preserved, i.e., $|d(q, x) - d(q, \bar{x})|$ is small. Distance preserving view via the triangle inequality: $|d(q, x) - d(q, \bar{x})| \leq \|x - \bar{x}\|$, so minimize the upper bound $\|x - \bar{x}\|$.
96 A joint minimization view. Generalized triangle inequality: $|\tilde{d}(q, x) - d(q, x)| \leq \|x - \bar{x}\| + |\delta|^{1/2}$ (compare the plain triangle inequality $|d(q, \bar{x}) - d(q, x)| \leq \|x - \bar{x}\|$), where the search uses $\tilde{d}(q, x) = (\sum_{m=1}^{M} \|q - c_{mi_m}(x)\|^2 - (M-1)\|q\|^2)^{1/2}$, with $\delta = \sum_{m \neq l} c_{mi_m}(x)^T c_{l i_l}(x)$ and $\bar{x} = \sum_{m=1}^{M} c_{mi_m}(x)$. The $\|x - \bar{x}\|$ term is the distortion; the $\delta$ term governs efficiency.
97 A joint minimization view. Generalized triangle inequality: $|\tilde{d}(q, x) - d(q, x)| \leq \|x - \bar{x}\| + |\delta|^{1/2}$, with a distortion term and an efficiency term. Our formulation: $\min_{\{C_m\}, \{i_m(x)\}, \epsilon} \sum_x \|x - \bar{x}\|^2$ s.t. $\delta = \epsilon$. Minimize the distortion for search accuracy; the constant constraint gives search efficiency.
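A short worked derivation of the bound above, under the definitions just given (a sketch of the standard argument, not quoted from the slides):

```latex
% Since d(q,\bar{x})^2 = \|q-\bar{x}\|^2
%   = \sum_m \|q - c_{m i_m}(x)\|^2 - (M-1)\|q\|^2 + \delta
%   = \tilde{d}(q,x)^2 + \delta,
% and |a - b| \le |a^2 - b^2|^{1/2} for a, b \ge 0, we get
% |\tilde{d}(q,x) - d(q,\bar{x})| \le |\delta|^{1/2}.
% Combining with the triangle inequality
% |d(q,\bar{x}) - d(q,x)| \le \|x - \bar{x}\| yields
\[
  |\tilde{d}(q,x) - d(q,x)| \;\le\; \|x - \bar{x}\| + |\delta|^{1/2}.
\]
```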