Image retrieval, vector quantization and nearest neighbor search
Image retrieval, vector quantization and nearest neighbor search
Yannis Avrithis, National Technical University of Athens
Rennes, October 2014
Part I: Image retrieval
- Particular object retrieval: match images under different viewpoint/lighting and occlusion
- Given local descriptors, investigate match kernels beyond Bag-of-Words

Part II: Vector quantization and nearest neighbor search
- Fast nearest neighbor search in high-dimensional spaces
- Encode vectors based on vector quantization
- Improve fitting to the underlying distribution
Part I: Image retrieval
"To aggregate or not to aggregate: selective match kernels for image search"
Joint work with Giorgos Tolias and Hervé Jégou, ICCV 2013
Overview
- Problem: particular object retrieval
- Build a common model for matching (HE) and aggregation (VLAD) methods; derive new match kernels
- Evaluate performance under exact or approximate descriptors
Related work
In our common model:
- Bag-of-Words (BoW) [Sivic & Zisserman 03]
- Descriptor approximation (Hamming embedding) [Jégou et al. 08]
- Aggregated representations (VLAD, Fisher) [Jégou et al. 10][Perronnin et al. 10]
Relevant to Part II:
- Soft (multiple) assignment [Philbin et al. 08][Jégou et al. 10]
Not discussed:
- Spatial matching [Philbin et al. 08][Tolias & Avrithis 11]
- Query expansion [Chum et al. 07][Tolias & Jégou 13]
- Re-ranking [Qin et al. 11][Shen et al. 12]
Image representation
- Entire image: set of local descriptors X = {x_1, ..., x_n}
- Descriptors assigned to cell c: X_c = {x ∈ X : q(x) = c}
Set similarity function
K(X, Y) = γ(X) γ(Y) Σ_{c ∈ C} w_c M(X_c, Y_c)
where γ(·) is a normalization factor, w_c a cell weighting, and M(X_c, Y_c) the cell similarity.
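The generic kernel above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation; the names `set_similarity`, `raw_similarity` and `bow` are hypothetical, and γ is chosen as the self-similarity normalization so that K(X, X) = 1:

```python
import numpy as np

def split_by_cell(Z, quantize):
    """Group descriptors by their quantization cell: X_c = {x : q(x) = c}."""
    cells = {}
    for z in Z:
        cells.setdefault(quantize(z), []).append(z)
    return {c: np.array(v) for c, v in cells.items()}

def raw_similarity(X, Y, quantize, M, w=None):
    """Un-normalized sum over common cells: sum_c w_c M(X_c, Y_c)."""
    Xc, Yc = split_by_cell(X, quantize), split_by_cell(Y, quantize)
    return sum((w(c) if w else 1.0) * M(Xc[c], Yc[c])
               for c in set(Xc) & set(Yc))

def set_similarity(X, Y, quantize, M, w=None):
    """K(X, Y) = gamma(X) gamma(Y) sum_c w_c M(X_c, Y_c),
    with gamma(X) = raw_similarity(X, X)^(-1/2) so K(X, X) = 1."""
    num = raw_similarity(X, Y, quantize, M, w)
    return num / np.sqrt(raw_similarity(X, X, quantize, M, w) *
                         raw_similarity(Y, Y, quantize, M, w))

# Bag-of-words cell similarity: M(X_c, Y_c) = |X_c| * |Y_c|
bow = lambda Xc, Yc: float(len(Xc) * len(Yc))
```

Plugging in a different `M` (HE or VLAD, below) recovers the other kernels in the same frame.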
Bag-of-Words (BoW) similarity function
Cosine similarity:
M(X_c, Y_c) = |X_c| |Y_c| = Σ_{x ∈ X_c} Σ_{y ∈ Y_c} 1
within the generic set similarity K(X, Y) = γ(X) γ(Y) Σ_{c ∈ C} w_c M(X_c, Y_c).
Hamming Embedding (HE)
M(X_c, Y_c) = Σ_{x ∈ X_c} Σ_{y ∈ Y_c} w(h(b_x, b_y))
where w is a weight function, h the Hamming distance, and b_x, b_y binary codes, within the generic set similarity K(X, Y) = γ(X) γ(Y) Σ_{c ∈ C} w_c M(X_c, Y_c).
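A sketch of the HE cell kernel. The decreasing weight with a hard threshold at `tau_ratio * B` is one common choice, not necessarily the exact weight function of [Jégou et al. 08]; treat `tau_ratio` and `alpha` as illustrative parameters:

```python
import numpy as np

def hamming(bx, by):
    """Hamming distance between two binary codes (0/1 numpy arrays)."""
    return int(np.count_nonzero(bx != by))

def he_cell_similarity(Bx, By, B=64, tau_ratio=0.375, alpha=3.0):
    """HE match kernel for one cell: sum over pairs of w(h(b_x, b_y)),
    with a decreasing weight and a hard selectivity threshold."""
    score = 0.0
    for bx in Bx:
        for by in By:
            h = hamming(bx, by)
            if h <= tau_ratio * B:          # discard weak votes (selectivity)
                score += (1.0 - h / B) ** alpha
    return score
```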
VLAD
M(X_c, Y_c) = V(X_c)ᵀ V(Y_c) = Σ_{x ∈ X_c} Σ_{y ∈ Y_c} r(x)ᵀ r(y)
with aggregated residual V(X_c) = Σ_{x ∈ X_c} r(x) and residual r(x) = x − q(x), within the generic set similarity K(X, Y) = γ(X) γ(Y) Σ_{c ∈ C} w_c M(X_c, Y_c).
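The VLAD aggregation step can be sketched as follows; this is a minimal version without the usual power-law and l2 post-normalization, and the names are illustrative:

```python
import numpy as np

def vlad(X, centroids):
    """VLAD: per-cell aggregated residuals V(X_c) = sum_{x in X_c} (x - q(x)),
    concatenated over all k cells into a single k*d vector."""
    k, d = centroids.shape
    V = np.zeros((k, d))
    # q(x): nearest centroid of each descriptor
    assign = np.argmin(((X[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
    for x, c in zip(X, assign):
        V[c] += x - centroids[c]            # accumulate residual in cell c
    return V.reshape(-1)
```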
Design choices
Hamming embedding:
- binary signature and voting per descriptor (not aggregated)
- selective: discards weak votes
VLAD:
- one aggregated vector per cell
- linear operation
Questions:
- Is aggregation good with large vocabularies (e.g. 65k)?
- How important is selectivity in this case?
Common model
Non-aggregated:
M_N(X_c, Y_c) = Σ_{x ∈ X_c} Σ_{y ∈ Y_c} σ(φ(x)ᵀ φ(y))
with selectivity function σ and descriptor representation φ (residual, binary or scalar).
Aggregated:
M_A(X_c, Y_c) = σ( ψ(Σ_{x ∈ X_c} φ(x))ᵀ ψ(Σ_{y ∈ Y_c} φ(y)) ) = σ(Φ(X_c)ᵀ Φ(Y_c))
with normalization ψ (l2, power-law) and cell representation Φ.
BoW, HE and VLAD in the common model

Model | M(X_c, Y_c) | φ(x) | σ(u)          | ψ(z) | Φ(X_c)
BoW   | M_N or M_A  | 1    | u             | z    | |X_c|
HE    | M_N only    | b̂_x  | w(B(1 − u)/2) | —    | —
VLAD  | M_N or M_A  | r(x) | u             | z    | V(X_c)

Instances:
- BoW: M(X_c, Y_c) = Σ_{x ∈ X_c} Σ_{y ∈ Y_c} 1 = |X_c| |Y_c|
- HE: M(X_c, Y_c) = Σ_{x ∈ X_c} Σ_{y ∈ Y_c} w(h(b_x, b_y))
- VLAD: M(X_c, Y_c) = Σ_{x ∈ X_c} Σ_{y ∈ Y_c} r(x)ᵀ r(y) = V(X_c)ᵀ V(Y_c)
General forms:
- M_N(X_c, Y_c) = Σ_{x ∈ X_c} Σ_{y ∈ Y_c} σ(φ(x)ᵀ φ(y))
- M_A(X_c, Y_c) = σ( ψ(Σ_{x ∈ X_c} φ(x))ᵀ ψ(Σ_{y ∈ Y_c} φ(y)) ) = σ(Φ(X_c)ᵀ Φ(Y_c))
Selective Match Kernel (SMK)
SMK(X_c, Y_c) = Σ_{x ∈ X_c} Σ_{y ∈ Y_c} σ_α(r̂(x)ᵀ r̂(y))
Descriptor representation: l2-normalized residual φ(x) = r̂(x) = r(x)/‖r(x)‖.
Selectivity function:
σ_α(u) = sign(u) |u|^α if u > τ, and 0 otherwise.
(Figure: similarity score vs. dot product for α = 3, τ = 0.)
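The selectivity function and the per-cell SMK score above can be sketched directly (illustrative names; default α and τ follow the example values on these slides):

```python
import numpy as np

def sigma_alpha(u, alpha=3.0, tau=0.0):
    """SMK selectivity: sign(u) |u|^alpha if u > tau, else 0."""
    u = np.asarray(u, dtype=float)
    return np.where(u > tau, np.sign(u) * np.abs(u) ** alpha, 0.0)

def smk_cell(Rx, Ry, alpha=3.0, tau=0.25):
    """SMK(X_c, Y_c) = sum_x sum_y sigma_alpha(r_hat(x)^T r_hat(y))
    over l2-normalized residuals in one cell."""
    nx = Rx / np.linalg.norm(Rx, axis=1, keepdims=True)
    ny = Ry / np.linalg.norm(Ry, axis=1, keepdims=True)
    return float(sigma_alpha(nx @ ny.T, alpha, tau).sum())
```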
Matching example: impact of threshold
α = 1, τ = 0.0 vs. α = 1, τ = 0.25: thresholding removes false correspondences.
Matching example: impact of shape parameter
α = 3, τ = 0.0 vs. α = 3, τ = 0.25: weighs matches according to confidence.
Aggregated Selective Match Kernel (ASMK)
ASMK(X_c, Y_c) = σ_α( V̂(X_c)ᵀ V̂(Y_c) )
Cell representation: l2-normalized aggregated residual Φ(X_c) = V̂(X_c) = V(X_c)/‖V(X_c)‖.
Similar to [Arandjelovic & Zisserman 13], but:
- with selectivity function σ_α
- used with large vocabularies
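The aggregated kernel differs from SMK only in aggregating before scoring; a minimal sketch (illustrative names, same σ_α as above):

```python
import numpy as np

def asmk_cell(Rx, Ry, alpha=3.0, tau=0.0):
    """ASMK(X_c, Y_c) = sigma_alpha( V_hat(X_c)^T V_hat(Y_c) ):
    sum residuals per cell, l2-normalize, then apply selectivity."""
    vx = Rx.sum(axis=0)
    vx = vx / np.linalg.norm(vx)
    vy = Ry.sum(axis=0)
    vy = vy / np.linalg.norm(vy)
    u = float(vx @ vy)
    return np.sign(u) * abs(u) ** alpha if u > tau else 0.0
```

Note the cost: one dot product per cell pair, regardless of how many descriptors fell into the cell.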
Aggregated features: k = 128, as in VLAD (figure)
Aggregated features: k = 65k, as in ASMK (figure)
Why aggregate: burstiness
- Burstiness: non-i.i.d. statistical behaviour of descriptors
- Matches of bursty features dominate the total similarity score
- Previous work: [Jégou et al. 09][Chum & Matas 10][Torii et al. 13]
In this work:
- Aggregation and normalization per cell handle burstiness
- Keeps a single representative, similar to max-pooling
Binary counterparts SMK* and ASMK*
Full vector representation: high memory cost. Approximate vector representation: binary vector.
SMK*(X_c, Y_c) = Σ_{x ∈ X_c} Σ_{y ∈ Y_c} σ_α( b̂(r(x))ᵀ b̂(r(y)) )
ASMK*(X_c, Y_c) = σ_α( b̂(Σ_{x ∈ X_c} r(x))ᵀ b̂(Σ_{y ∈ Y_c} r(y)) )
where b̂ includes centering and rotation as in HE, followed by binarization and l2-normalization.
Impact of selectivity
(Figure: mAP as a function of α for SMK, ASMK, SMK* and ASMK* on Oxford5k, Paris6k and Holidays.)
Impact of aggregation
- Improves performance for different vocabulary sizes
- Reduces memory requirements of the inverted file
Memory ratio on Oxford5k (multiple assignment): k = 8k: 69%, k = 16k: 78%, k = 32k: 85%, k = 65k: 89%.
With k = 8k on Oxford5k: VLAD 65.5%, SMK 74.2%, ASMK 78.1% mAP.
(Figure: mAP vs. vocabulary size k ∈ {8k, 16k, 32k, 65k} for SMK, SMK-BURST and ASMK.)
Comparison to the state of the art
(Table: mAP on Oxford5k, Oxford105k, Paris6k and Holidays, with and without multiple assignment; numeric scores did not survive this transcription. Compared methods: ASMK and ASMK*, HE [Jégou et al. 10], HE-BURST and AHE-BURST [Jain et al. 10], fine vocabulary [Mikulík et al. 10], repeated structures [Torii et al. 13].)
Discussion
- Aggregation is also beneficial with large vocabularies (burstiness)
- Selectivity always helps (with or without aggregation)
- Descriptor approximation reduces performance only slightly
Part II: Vector quantization and nearest neighbor search
"Locally optimized product quantization"
Joint work with Yannis Kalantidis, CVPR 2014
Overview
- Problem: given a query point q, find its nearest neighbor with respect to Euclidean distance within a data set X in d-dimensional space
- Focus on large scale: encode (compress) vectors, speed up distance computations
- Fit the underlying distribution better with little space and time overhead
Applications
- Retrieval (image as point) [Jégou et al. 10][Perronnin et al. 10]
- Retrieval (descriptor as point) [Tolias et al. 13][Qin et al. 13]
- Localization, pose estimation [Sattler et al. 12][Li et al. 12]
- Classification [Boiman et al. 08][McCann & Lowe 12]
- Clustering [Philbin et al. 07][Avrithis 13]
Related work
Indexing:
- Inverted index (image retrieval)
- Inverted multi-index [Babenko & Lempitsky 12] (nearest neighbor search)
Encoding and ranking:
- Vector quantization (VQ)
- Product quantization (PQ) [Jégou et al. 11]
- Optimized product quantization (OPQ) [Ge et al. 13]; Cartesian k-means [Norouzi & Fleet 13]
- Locally optimized product quantization (LOPQ) [Kalantidis & Avrithis 14]
Not discussed:
- Tree-based indexing, e.g. [Muja & Lowe 09]
- Hashing and binary codes, e.g. [Norouzi et al. 12]
Inverted index
(Figures: a query descriptor is quantized, its postings list of images is retrieved from the inverted file, and a ranked shortlist of images is returned.)
Inverted index issues
- Are items in a postings list equally important?
- What changes under soft (multiple) assignment?
- How should vectors be encoded for memory efficiency and fast search?
Inverted multi-index
- Decompose vectors as x = (x¹, x²)
- Train codebooks C¹, C² from datasets {x¹_n}, {x²_n}
- The induced codebook C¹ × C² gives a finer partition
- Given a query q, visit cells (c¹, c²) ∈ C¹ × C² in ascending order of distance to q by the multi-sequence algorithm
(Figure 1 of [Babenko & Lempitsky 12]: indexing 600 non-uniformly distributed points in the unit 2D square; an inverted index with 16 2D codewords vs. an inverted multi-index with two 16-word 1D codebooks, at the same query-to-codebook matching cost. Near partition boundaries, the single index returns heavily skewed lists that may miss many nearest neighbors and cannot return lists of a pre-specified small length; the multi-index returns short candidate lists whose neighborhoods are approximately centered at the queries, which in high dimensions translates into considerably higher search accuracy.)
Multi-sequence algorithm
(Figure: traversal of the C¹ × C² grid in order of increasing distance to the query.)
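The traversal can be sketched with a min-heap; this is a simplified two-codebook version of Babenko & Lempitsky's multi-sequence algorithm, with illustrative names:

```python
import heapq
import numpy as np

def multi_sequence(d1, d2):
    """Visit cells (i, j) of the product codebook C1 x C2 in ascending
    d1[i] + d2[j], where d1, d2 are the query subvectors' distances to
    their codebooks. Yields ((i, j), cost) lazily via a min-heap over
    the two sorted distance sequences."""
    o1, o2 = np.argsort(d1), np.argsort(d2)
    heap = [(d1[o1[0]] + d2[o2[0]], 0, 0)]   # start at the jointly nearest cell
    seen = {(0, 0)}
    while heap:
        cost, a, b = heapq.heappop(heap)
        yield (int(o1[a]), int(o2[b])), cost
        # push the two successors in the sorted grid, if not already queued
        for na, nb in ((a + 1, b), (a, b + 1)):
            if na < len(o1) and nb < len(o2) and (na, nb) not in seen:
                seen.add((na, nb))
                heapq.heappush(heap, (d1[o1[na]] + d2[o2[nb]], na, nb))
```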
Vector quantization (VQ)
minimize Σ_{x ∈ X} min_{c ∈ C} ‖x − c‖² = Σ_{x ∈ X} ‖x − q(x)‖² = E(C)
over codebook C, where X is the data set, q the quantizer and E(C) the distortion.
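Minimizing E(C) is the standard k-means objective; a plain Lloyd-iteration sketch (illustrative, with a naive random initialization):

```python
import numpy as np

def distortion(X, C):
    """VQ distortion E(C) = sum_x min_c ||x - c||^2 = sum_x ||x - q(x)||^2."""
    d2 = ((X[:, None, :] - C[None]) ** 2).sum(-1)
    return float(d2.min(axis=1).sum())

def lloyd(X, k, iters=20, seed=0):
    """Plain k-means (Lloyd's algorithm): alternate assignment to the
    nearest centroid and centroid re-estimation as the cell mean."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        a = ((X[:, None, :] - C[None]) ** 2).sum(-1).argmin(1)
        C = np.array([X[a == j].mean(0) if np.any(a == j) else C[j]
                      for j in range(k)])
    return C
```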
Vector quantization (VQ)
For small distortion, a large k = |C| is needed: hard to train, too large to store, too slow to search.
Product quantization (PQ)
minimize Σ_{x ∈ X} min_{c ∈ C} ‖x − c‖² subject to C = C¹ × ··· × C^m

Product quantization (PQ)
- train: q = (q¹, ..., q^m), where q¹, ..., q^m are obtained by VQ
- store: |C| = k^m with |C¹| = ··· = |C^m| = k
- search: ‖y − q(x)‖² = Σ_{j=1}^{m} ‖y^j − q^j(x^j)‖², where q^j(x^j) ∈ C^j
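Encoding and the asymmetric distance above can be sketched as follows (illustrative names; `codebooks` is a list of m sub-codebooks, each of shape k × d/m):

```python
import numpy as np

def pq_encode(x, codebooks):
    """PQ code of x: split into m subvectors, quantize each with its own
    sub-codebook; the code is just m small indices."""
    m = len(codebooks)
    subs = np.split(np.asarray(x, float), m)
    return [int(((C - s) ** 2).sum(1).argmin()) for s, C in zip(subs, codebooks)]

def pq_asym_dist2(y, code, codebooks):
    """Asymmetric distance ||y - q(x)||^2 = sum_j ||y^j - q^j(x^j)||^2:
    the query y stays uncompressed; the database point is its code."""
    subs = np.split(np.asarray(y, float), len(codebooks))
    return float(sum(((s - C[c]) ** 2).sum()
                     for s, c, C in zip(subs, code, codebooks)))
```

In practice the per-subspace distances ‖y^j − c‖² are precomputed into m lookup tables once per query, so scoring each code costs only m table lookups.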
Optimized product quantization (OPQ)
minimize Σ_{x ∈ X} min_{ĉ ∈ Ĉ} ‖x − R ĉ‖² subject to Ĉ = C¹ × ··· × C^m, RᵀR = I
OPQ, parametric solution
For X ~ N(0, Σ):
- independence: PCA-align by diagonalizing Σ as U Λ Uᵀ
- balanced variance: permute Λ such that Π_i λ_i is constant in each subspace; R ← U P_π
- find Ĉ by PQ on rotated data x̂ = Rᵀ x
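The balanced-variance permutation can be approximated with a greedy eigenvalue-allocation step; the bucket-filling heuristic below is an illustrative sketch, not necessarily the exact procedure of [Ge et al. 13]:

```python
import numpy as np

def opq_parametric_rotation(X, m):
    """Parametric OPQ rotation for roughly Gaussian data: PCA-align, then
    permute eigen-directions into m equal-size buckets so each bucket's
    product of eigenvalues (log-sum) is as balanced as possible (greedy)."""
    Sigma = np.cov(X.T)
    lam, U = np.linalg.eigh(Sigma)          # ascending eigenvalues
    order = np.argsort(lam)[::-1]           # allocate largest variance first
    buckets = [[] for _ in range(m)]
    logprod = np.zeros(m)
    per = len(lam) // m                     # dimensions per subspace
    for i in order:
        # put the next direction into the least-loaded non-full bucket
        j = min((b for b in range(m) if len(buckets[b]) < per),
                key=lambda b: logprod[b])
        buckets[j].append(i)
        logprod[j] += np.log(max(lam[i], 1e-12))
    perm = [i for b in buckets for i in b]
    return U[:, perm]                       # rotate data as x_hat = R.T @ x
```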
Locally optimized product quantization (LOPQ)
- compute residuals r(x) = x − q(x) on a coarse quantizer q
- collect residuals Z_i = {r(x) : q(x) = c_i} per cell
- train (R_i, q_i) ← OPQ(Z_i) per cell

Locally optimized product quantization (LOPQ)
- better captures the support of the data distribution, like local PCA [Kambhatla & Leen 97]
- multimodal (e.g. mixture) distributions
- distributions on nonlinear manifolds
- residual distributions closer to the Gaussian assumption
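The per-cell training loop above can be sketched as follows; the `train_local` hook stands in for a per-cell OPQ trainer (returning (R_i, q_i) in the paper's notation) and is a hypothetical parameter here:

```python
import numpy as np

def train_lopq(X, coarse_C, train_local):
    """LOPQ training sketch: assign points to coarse cells, collect
    residuals Z_i = {x - c_i : q(x) = c_i}, and fit a local model per cell."""
    assign = ((X[:, None, :] - coarse_C[None]) ** 2).sum(-1).argmin(1)
    local = {}
    for i, c in enumerate(coarse_C):
        Z = X[assign == i] - c              # residuals of cell i
        if len(Z):
            local[i] = train_local(Z)       # e.g. per-cell OPQ -> (R_i, q_i)
    return assign, local
```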
Multi-LOPQ
(Figure: x = (x¹, x²) with per-subspace quantizers q¹, q², combining LOPQ with the inverted multi-index.)
Comparison to the state of the art: SIFT1B, 64-bit codes
(Table: recall at R = 1, 10, 100; numeric scores did not survive this transcription. Compared methods: Ck-means [Norouzi & Fleet 13], IVFADC [Jégou et al. 11], OPQ, Multi-D-ADC [Babenko & Lempitsky 12], LOR+PQ, LOPQ.)
Most benefit comes from the locally optimized rotation!
Comparison to the state of the art: SIFT1B, 128-bit codes
(Table: recall at R = 1 for shortlist sizes T = 10K, 30K, 100K; numeric scores did not survive this transcription. Compared methods: IVFADC+R [Jégou et al. 11], LOPQ+R, Multi-D-ADC [Babenko & Lempitsky 12], OMulti-D-OADC [Ge et al. 13], Multi-LOPQ.)
Residual encoding in related work
- PQ (IVFADC) [Jégou et al. 11]: a single product quantizer for all cells
- [Uchida et al. 12]: multiple product quantizers shared by multiple cells
- OPQ [Ge et al. 13]: a single product quantizer for all cells, globally optimized for rotation (single/multi-index)
- LOPQ: with/without one product quantizer per cell, with/without rotation optimization per cell (single/multi-index)
- [Babenko & Lempitsky 14]: one product quantizer per cell, optimized for rotation per cell (multi-index)
Thank you!
More informationKernel Square-Loss Exemplar Machines for Image Retrieval
Kernel Square-Loss Exemplar Machines for Image Retrieval Rafael Rezende, Joaquin Zepeda, Jean Ponce, Francis Bach, Patrick Perez To cite this version: Rafael Rezende, Joaquin Zepeda, Jean Ponce, Francis
More informationRandom projection trees and low dimensional manifolds. Sanjoy Dasgupta and Yoav Freund University of California, San Diego
Random projection trees and low dimensional manifolds Sanjoy Dasgupta and Yoav Freund University of California, San Diego I. The new nonparametrics The new nonparametrics The traditional bane of nonparametric
More informationComputer Vision Group Prof. Daniel Cremers. 2. Regression (cont.)
Prof. Daniel Cremers 2. Regression (cont.) Regression with MLE (Rep.) Assume that y is affected by Gaussian noise : t = f(x, w)+ where Thus, we have p(t x, w, )=N (t; f(x, w), 2 ) 2 Maximum A-Posteriori
More informationECE521 week 3: 23/26 January 2017
ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear
More informationAcademic Editors: Andreas Holzinger, Adom Giffin and Kevin H. Knuth Received: 26 February 2016; Accepted: 16 August 2016; Published: 24 August 2016
entropy Article Distribution Entropy Boosted VLAD for Image Retrieval Qiuzhan Zhou 1,2,3, *, Cheng Wang 2, Pingping Liu 4,5,6, Qingliang Li 4, Yeran Wang 4 and Shuozhang Chen 2 1 State Key Laboratory of
More informationLinear Spectral Hashing
Linear Spectral Hashing Zalán Bodó and Lehel Csató Babeş Bolyai University - Faculty of Mathematics and Computer Science Kogălniceanu 1., 484 Cluj-Napoca - Romania Abstract. assigns binary hash keys to
More informationECE 521. Lecture 11 (not on midterm material) 13 February K-means clustering, Dimensionality reduction
ECE 521 Lecture 11 (not on midterm material) 13 February 2017 K-means clustering, Dimensionality reduction With thanks to Ruslan Salakhutdinov for an earlier version of the slides Overview K-means clustering
More informationStochastic Generative Hashing
Stochastic Generative Hashing B. Dai 1, R. Guo 2, S. Kumar 2, N. He 3 and L. Song 1 1 Georgia Institute of Technology, 2 Google Research, NYC, 3 University of Illinois at Urbana-Champaign Discussion by
More informationConnection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis
Connection of Local Linear Embedding, ISOMAP, and Kernel Principal Component Analysis Alvina Goh Vision Reading Group 13 October 2005 Connection of Local Linear Embedding, ISOMAP, and Kernel Principal
More informationMultimedia Systems Giorgio Leonardi A.A Lecture 4 -> 6 : Quantization
Multimedia Systems Giorgio Leonardi A.A.2014-2015 Lecture 4 -> 6 : Quantization Overview Course page (D.I.R.): https://disit.dir.unipmn.it/course/view.php?id=639 Consulting: Office hours by appointment:
More informationPulse characterization with Wavelet transforms combined with classification using binary arrays
Pulse characterization with Wavelet transforms combined with classification using binary arrays Overview Wavelet Transformation Creating binary arrays out of and how to deal with them An estimator for
More informationSPECTRAL CLUSTERING AND KERNEL PRINCIPAL COMPONENT ANALYSIS ARE PURSUING GOOD PROJECTIONS
SPECTRAL CLUSTERING AND KERNEL PRINCIPAL COMPONENT ANALYSIS ARE PURSUING GOOD PROJECTIONS VIKAS CHANDRAKANT RAYKAR DECEMBER 5, 24 Abstract. We interpret spectral clustering algorithms in the light of unsupervised
More information6/9/2010. Feature-based methods and shape retrieval problems. Structure. Combining local and global structures. Photometric stress
1 2 Structure Feature-based methods and shape retrieval problems Global Metric Local Feature descriptors Alexander & Michael Bronstein, 2006-2009 Michael Bronstein, 2010 tosca.cs.technion.ac.il/book 048921
More informationECE 661: Homework 10 Fall 2014
ECE 661: Homework 10 Fall 2014 This homework consists of the following two parts: (1) Face recognition with PCA and LDA for dimensionality reduction and the nearest-neighborhood rule for classification;
More informationAlgorithms for Nearest Neighbors
Algorithms for Nearest Neighbors Background and Two Challenges Yury Lifshits Steklov Institute of Mathematics at St.Petersburg http://logic.pdmi.ras.ru/~yura McGill University, July 2007 1 / 29 Outline
More informationLec 12 Review of Part I: (Hand-crafted) Features and Classifiers in Image Classification
Image Analysis & Retrieval Spring 2017: Image Analysis Lec 12 Review of Part I: (Hand-crafted) Features and Classifiers in Image Classification Zhu Li Dept of CSEE, UMKC Office: FH560E, Email: lizhu@umkc.edu,
More informationInstance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016
Instance-based Learning CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Outline Non-parametric approach Unsupervised: Non-parametric density estimation Parzen Windows Kn-Nearest
More informationNearest Neighbor Search with Keywords: Compression
Nearest Neighbor Search with Keywords: Compression Yufei Tao KAIST June 3, 2013 In this lecture, we will continue our discussion on: Problem (Nearest Neighbor Search with Keywords) Let P be a set of points
More informationNonlinear Dimensionality Reduction
Nonlinear Dimensionality Reduction Piyush Rai CS5350/6350: Machine Learning October 25, 2011 Recap: Linear Dimensionality Reduction Linear Dimensionality Reduction: Based on a linear projection of the
More informationCorners, Blobs & Descriptors. With slides from S. Lazebnik & S. Seitz, D. Lowe, A. Efros
Corners, Blobs & Descriptors With slides from S. Lazebnik & S. Seitz, D. Lowe, A. Efros Motivation: Build a Panorama M. Brown and D. G. Lowe. Recognising Panoramas. ICCV 2003 How do we build panorama?
More informationFace Recognition. Face Recognition. Subspace-Based Face Recognition Algorithms. Application of Face Recognition
ace Recognition Identify person based on the appearance of face CSED441:Introduction to Computer Vision (2017) Lecture10: Subspace Methods and ace Recognition Bohyung Han CSE, POSTECH bhhan@postech.ac.kr
More informationDimension Reduction Techniques. Presented by Jie (Jerry) Yu
Dimension Reduction Techniques Presented by Jie (Jerry) Yu Outline Problem Modeling Review of PCA and MDS Isomap Local Linear Embedding (LLE) Charting Background Advances in data collection and storage
More informationThe E8 Lattice and Error Correction in Multi-Level Flash Memory
The E8 Lattice and Error Correction in Multi-Level Flash Memory Brian M Kurkoski University of Electro-Communications Tokyo, Japan kurkoski@iceuecacjp Abstract A construction using the E8 lattice and Reed-Solomon
More informationAd Placement Strategies
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox 2014 Emily Fox January
More informationCorrelation Autoencoder Hashing for Supervised Cross-Modal Search
Correlation Autoencoder Hashing for Supervised Cross-Modal Search Yue Cao, Mingsheng Long, Jianmin Wang, and Han Zhu School of Software Tsinghua University The Annual ACM International Conference on Multimedia
More informationDiscriminative Learning and Big Data
AIMS-CDT Michaelmas 2016 Discriminative Learning and Big Data Lecture 2: Other loss functions and ANN Andrew Zisserman Visual Geometry Group University of Oxford http://www.robots.ox.ac.uk/~vgg Lecture
More informationIntroduction to Machine Learning Midterm Exam
10-701 Introduction to Machine Learning Midterm Exam Instructors: Eric Xing, Ziv Bar-Joseph 17 November, 2015 There are 11 questions, for a total of 100 points. This exam is open book, open notes, but
More informationMachine Learning. Kernels. Fall (Kernels, Kernelized Perceptron and SVM) Professor Liang Huang. (Chap. 12 of CIML)
Machine Learning Fall 2017 Kernels (Kernels, Kernelized Perceptron and SVM) Professor Liang Huang (Chap. 12 of CIML) Nonlinear Features x4: -1 x1: +1 x3: +1 x2: -1 Concatenated (combined) features XOR:
More informationPhoto Tourism and im2gps: 3D Reconstruction and Geolocation of Internet Photo Collections Part II
Photo Tourism and im2gps: 3D Reconstruction and Geolocation of Internet Photo Collections Part II Noah Snavely Cornell James Hays CMU MIT (Fall 2009) Brown (Spring 2010 ) Complexity of the Visual World
More informationCS145: INTRODUCTION TO DATA MINING
CS145: INTRODUCTION TO DATA MINING 5: Vector Data: Support Vector Machine Instructor: Yizhou Sun yzsun@cs.ucla.edu October 18, 2017 Homework 1 Announcements Due end of the day of this Thursday (11:59pm)
More informationHARMONIC VECTOR QUANTIZATION
HARMONIC VECTOR QUANTIZATION Volodya Grancharov, Sigurdur Sverrisson, Erik Norvell, Tomas Toftgård, Jonas Svedberg, and Harald Pobloth SMN, Ericsson Research, Ericsson AB 64 8, Stockholm, Sweden ABSTRACT
More informationLOCALITY PRESERVING HASHING. Electrical Engineering and Computer Science University of California, Merced Merced, CA 95344, USA
LOCALITY PRESERVING HASHING Yi-Hsuan Tsai Ming-Hsuan Yang Electrical Engineering and Computer Science University of California, Merced Merced, CA 95344, USA ABSTRACT The spectral hashing algorithm relaxes
More informationIsotropic Hashing. Abstract
Isotropic Hashing Weihao Kong, Wu-Jun Li Shanghai Key Laboratory of Scalable Computing and Systems Department of Computer Science and Engineering, Shanghai Jiao Tong University, China {kongweihao,liwujun}@cs.sjtu.edu.cn
More informationSupervised Hashing via Uncorrelated Component Analysis
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) Supervised Hashing via Uncorrelated Component Analysis SungRyull Sohn CG Research Team Electronics and Telecommunications
More informationRiemannian Metric Learning for Symmetric Positive Definite Matrices
CMSC 88J: Linear Subspaces and Manifolds for Computer Vision and Machine Learning Riemannian Metric Learning for Symmetric Positive Definite Matrices Raviteja Vemulapalli Guide: Professor David W. Jacobs
More informationDISCRIMINATIVE DECORELATION FOR CLUSTERING AND CLASSIFICATION
DISCRIMINATIVE DECORELATION FOR CLUSTERING AND CLASSIFICATION ECCV 12 Bharath Hariharan, Jitandra Malik, and Deva Ramanan MOTIVATION State-of-the-art Object Detection HOG Linear SVM Dalal&Triggs Histograms
More informationOptimized Transform Coding for Approximate KNN Search
M. PARK & AL.: OPTIMIZED TRASFORM CODIG FOR APPROXIMATE K SEARCH Optimized Transform Coding for Approximate K Search Minwoo Par mpar@objectvideo.com Kiran Gunda gunda@objectvideo.com Himaanshu, Gupta hgupta@objectvideo.com
More informationData Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection
More informationVector Quantization Techniques for Approximate Nearest Neighbor Search on Large- Scale Datasets
Tampere University of Technology Vector Quantization Techniques for Approximate Nearest Neighbor Search on Large- Scale Datasets Citation Ozan, E. C. (017). Vector Quantization Techniques for Approximate
More informationPattern Recognition and Machine Learning. Bishop Chapter 6: Kernel Methods
Pattern Recognition and Machine Learning Chapter 6: Kernel Methods Vasil Khalidov Alex Kläser December 13, 2007 Training Data: Keep or Discard? Parametric methods (linear/nonlinear) so far: learn parameter
More informationKai Yu NEC Laboratories America, Cupertino, California, USA
Kai Yu NEC Laboratories America, Cupertino, California, USA Joint work with Jinjun Wang, Fengjun Lv, Wei Xu, Yihong Gong Xi Zhou, Jianchao Yang, Thomas Huang, Tong Zhang Chen Wu NEC Laboratories America
More informationON THE STABILITY OF DEEP NETWORKS
ON THE STABILITY OF DEEP NETWORKS AND THEIR RELATIONSHIP TO COMPRESSED SENSING AND METRIC LEARNING RAJA GIRYES AND GUILLERMO SAPIRO DUKE UNIVERSITY Mathematics of Deep Learning International Conference
More informationEE368B Image and Video Compression
EE368B Image and Video Compression Homework Set #2 due Friday, October 20, 2000, 9 a.m. Introduction The Lloyd-Max quantizer is a scalar quantizer which can be seen as a special case of a vector quantizer
More informationEnsemble Methods. NLP ML Web! Fall 2013! Andrew Rosenberg! TA/Grader: David Guy Brizan
Ensemble Methods NLP ML Web! Fall 2013! Andrew Rosenberg! TA/Grader: David Guy Brizan How do you make a decision? What do you want for lunch today?! What did you have last night?! What are your favorite
More informationAlgorithms for Data Science: Lecture on Finding Similar Items
Algorithms for Data Science: Lecture on Finding Similar Items Barna Saha 1 Finding Similar Items Finding similar items is a fundamental data mining task. We may want to find whether two documents are similar
More informationSupervised Quantization for Similarity Search
Supervised Quantization for Similarity Search Xiaojuan Wang 1 Ting Zhang 2 Guo-Jun Qi 3 Jinhui Tang 4 Jingdong Wang 5 1 Sun Yat-sen University, China 2 University of Science and Technology of China, China
More informationMinwise hashing for large-scale regression and classification with sparse data
Minwise hashing for large-scale regression and classification with sparse data Nicolai Meinshausen (Seminar für Statistik, ETH Zürich) joint work with Rajen Shah (Statslab, University of Cambridge) Simons
More informationLie Algebrized Gaussians for Image Representation
Lie Algebrized Gaussians for Image Representation Liyu Gong, Meng Chen and Chunlong Hu School of CS, Huazhong University of Science and Technology {gongliyu,chenmenghust,huchunlong.hust}@gmail.com Abstract
More informationScalar and Vector Quantization. National Chiao Tung University Chun-Jen Tsai 11/06/2014
Scalar and Vector Quantization National Chiao Tung University Chun-Jen Tsai 11/06/014 Basic Concept of Quantization Quantization is the process of representing a large, possibly infinite, set of values
More informationFastfood Approximating Kernel Expansions in Loglinear Time. Quoc Le, Tamas Sarlos, and Alex Smola Presenter: Shuai Zheng (Kyle)
Fastfood Approximating Kernel Expansions in Loglinear Time Quoc Le, Tamas Sarlos, and Alex Smola Presenter: Shuai Zheng (Kyle) Large Scale Problem: ImageNet Challenge Large scale data Number of training
More informationMachine Learning. CUNY Graduate Center, Spring Lectures 11-12: Unsupervised Learning 1. Professor Liang Huang.
Machine Learning CUNY Graduate Center, Spring 2013 Lectures 11-12: Unsupervised Learning 1 (Clustering: k-means, EM, mixture models) Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning
More informationWavelet-based Salient Points with Scale Information for Classification
Wavelet-based Salient Points with Scale Information for Classification Alexandra Teynor and Hans Burkhardt Department of Computer Science, Albert-Ludwigs-Universität Freiburg, Germany {teynor, Hans.Burkhardt}@informatik.uni-freiburg.de
More informationFor Review Only. Codebook-free Compact Descriptor for Scalable Visual Search
Page of 0 0 0 0 0 0 Codebook-free Compact Descriptor for Scalable Visual Search Yuwei u, Feng Gao, Yicheng Huang, Jie Lin Member, IEEE, Vijay Chandrasekhar Member, IEEE, Junsong Yuan Senior Member, IEEE,
More informationMIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications. Class 19: Data Representation by Design
MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications Class 19: Data Representation by Design What is data representation? Let X be a data-space X M (M) F (M) X A data representation
More informationAlgorithm-Independent Learning Issues
Algorithm-Independent Learning Issues Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2007 c 2007, Selim Aksoy Introduction We have seen many learning
More informationPreprocessing & dimensionality reduction
Introduction to Data Mining Preprocessing & dimensionality reduction CPSC/AMTH 445a/545a Guy Wolf guy.wolf@yale.edu Yale University Fall 2016 CPSC 445 (Guy Wolf) Dimensionality reduction Yale - Fall 2016
More informationKernel methods for comparing distributions, measuring dependence
Kernel methods for comparing distributions, measuring dependence Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Principal component analysis Given a set of M centered observations
More informationReview for Exam 1. Erik G. Learned-Miller Department of Computer Science University of Massachusetts, Amherst Amherst, MA
Review for Exam Erik G. Learned-Miller Department of Computer Science University of Massachusetts, Amherst Amherst, MA 0003 March 26, 204 Abstract Here are some things you need to know for the in-class
More informationCS 188: Artificial Intelligence Spring Announcements
CS 188: Artificial Intelligence Spring 2010 Lecture 22: Nearest Neighbors, Kernels 4/18/2011 Pieter Abbeel UC Berkeley Slides adapted from Dan Klein Announcements On-going: contest (optional and FUN!)
More informationTowards Good Practices for Action Video Encoding
Towards Good Practices for Action Video Encoding Jianxin Wu National Key Laboratory for Novel Software Technology Nanjing University, China wujx21@nju.edu.cn Yu Zhang Nanyang Technological University Singapore
More informationWaldHash: sequential similarity-preserving hashing
WaldHash: sequential similarity-preserving hashing Alexander M. Bronstein 1, Michael M. Bronstein 2, Leonidas J. Guibas 3, and Maks Ovsjanikov 4 1 Department of Electrical Engineering, Tel-Aviv University
More informationFast Orthogonal Projection Based on Kronecker Product
Fast Orthogonal Projection Based on Kronecker Product Xu Zhang,2, Felix X. Yu 2,3, Ruiqi Guo 3, Sanjiv Kumar 3, Shengjin Wang, Shih-Fu Chang 2 Tsinghua University, 2 Columbia University, 3 Google Research
More informationC.M. Liu Perceptual Signal Processing Lab College of Computer Science National Chiao-Tung University
Quantization C.M. Liu Perceptual Signal Processing Lab College of Computer Science National Chiao-Tung University http://www.csie.nctu.edu.tw/~cmliu/courses/compression/ Office: EC538 (03)5731877 cmliu@cs.nctu.edu.tw
More informationData dependent operators for the spatial-spectral fusion problem
Data dependent operators for the spatial-spectral fusion problem Wien, December 3, 2012 Joint work with: University of Maryland: J. J. Benedetto, J. A. Dobrosotskaya, T. Doster, K. W. Duke, M. Ehler, A.
More informationKernel Methods. Charles Elkan October 17, 2007
Kernel Methods Charles Elkan elkan@cs.ucsd.edu October 17, 2007 Remember the xor example of a classification problem that is not linearly separable. If we map every example into a new representation, then
More informationUnsupervised Learning Techniques Class 07, 1 March 2006 Andrea Caponnetto
Unsupervised Learning Techniques 9.520 Class 07, 1 March 2006 Andrea Caponnetto About this class Goal To introduce some methods for unsupervised learning: Gaussian Mixtures, K-Means, ISOMAP, HLLE, Laplacian
More informationUNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013
UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 Exam policy: This exam allows two one-page, two-sided cheat sheets; No other materials. Time: 2 hours. Be sure to write your name and
More informationCS798: Selected topics in Machine Learning
CS798: Selected topics in Machine Learning Support Vector Machine Jakramate Bootkrajang Department of Computer Science Chiang Mai University Jakramate Bootkrajang CS798: Selected topics in Machine Learning
More informationLearning Eigenfunctions: Links with Spectral Clustering and Kernel PCA
Learning Eigenfunctions: Links with Spectral Clustering and Kernel PCA Yoshua Bengio Pascal Vincent Jean-François Paiement University of Montreal April 2, Snowbird Learning 2003 Learning Modal Structures
More informationIntroduction to Machine Learning
10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what
More informationL11: Pattern recognition principles
L11: Pattern recognition principles Bayesian decision theory Statistical classifiers Dimensionality reduction Clustering This lecture is partly based on [Huang, Acero and Hon, 2001, ch. 4] Introduction
More information