Image Tag Completion by Noisy Matrix Recovery


Image Tag Completion by Noisy Matrix Recovery: Supplementary Document

Zheyun Feng, Songhe Feng, Rong Jin, Anil K. Jain
{fengzhey, rongjin, jain}@cse.msu.edu, shfeng@bjtu.edu.cn
Michigan State University, Beijing Jiaotong University

Abstract. In this supplementary document, we present:
- Detailed proofs of Lemma 1, Lemma 2, Theorem 2, Theorem 4 and Theorem 5 in the main paper.
- Detailed statistics about the refined datasets.
- Supplementary experimental results, mainly in terms of AR@N and C@N.
Note that all notation is the same as in the main paper.

1 Detailed Proofs

1.1 Proof of Lemma 1

Proof. [The displayed derivation is not recoverable from the source text.]

1.2 Proof of Lemma 2

Proof. To facilitate our analysis, we rewrite each $d_i$ as
\[ d_i = \sum_{j=1}^{m_*} d_i^j, \]
where $d_i^j$ is the image tag vector corresponding to the $j$-th word sampling for the tag vector of the $i$-th image. To utilize Lemma 2, we define $Z_{i,j}$ as
\[ Z_{i,j} = (d_i^j - p_i)\, e_i^\top, \]

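and therefore
\[ M = \frac{1}{m_*} \sum_{i=1}^{n} \sum_{j=1}^{m_*} Z_{i,j}. \]
To bound $U$ in Lemma 2, we have
\[ \|Z_{i,j}\| \le \|d_i^j - p_i\|_2 \le 2. \]
To bound $\sigma_Z$, we compute
[display bounding the sum of $E[Z_{i,j} Z_{i,j}^\top]$ not recoverable from the source text]
Similarly, we have
[display bounding the sum of $E[Z_{i,j}^\top Z_{i,j}]$ not recoverable from the source text]
We complete the proof by plugging in the bounds for $U$ and $\sigma_Z$.

1.3 Proof of Theorem 2

Proof. We consider any solution $Q$. Since $\hat{Q}$ is the optimal solution to Eq. (1) in the main paper, we have $\langle \partial L(\hat{Q}), \hat{Q} - Q \rangle \le 0$, i.e.
\[ -\frac{1}{m_*} \sum_{i=1}^{n} \sum_{j=1}^{m} d_{i,j} \frac{\hat{Q}_{i,j} - Q_{i,j}}{\hat{Q}_{i,j}} + \varepsilon \langle \partial \|\hat{Q}\|_{tr}, \hat{Q} - Q \rangle \le 0, \]
where $\partial \|\hat{Q}\|_{tr}$ is a subgradient of $\|\hat{Q}\|_{tr}$. Using the fact $\langle \partial \|\hat{Q}\|_{tr} - \partial \|Q\|_{tr}, \hat{Q} - Q \rangle \ge 0$, we can replace $\langle \partial \|\hat{Q}\|_{tr}, \hat{Q} - Q \rangle$ with $\langle \partial \|Q\|_{tr}, \hat{Q} - Q \rangle$, which results in the following inequality:
\[ -\frac{1}{m_*} \sum_{i=1}^{n} \sum_{j=1}^{m} d_{i,j} \frac{\hat{Q}_{i,j} - Q_{i,j}}{\hat{Q}_{i,j}} + \varepsilon \langle \partial \|Q\|_{tr}, \hat{Q} - Q \rangle \le 0. \]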
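The subgradient monotonicity used in the last step, $\langle \partial\|A\|_{tr} - \partial\|B\|_{tr}, A - B \rangle \ge 0$, holds for any convex function and can be sanity-checked numerically. The sketch below is illustrative only and is not taken from the paper. It assumes rank-one matrices $A = a b^\top$, for which $u v^\top$ with $u = a/\|a\|_2$ and $v = b/\|b\|_2$ is a valid trace-norm subgradient. All function names are hypothetical.

```python
import math
import random

def outer(a, b):
    """Outer product a b^T as a list of rows."""
    return [[x * y for y in b] for x in a]

def inner(X, Y):
    """Frobenius inner product <X, Y>."""
    return sum(x * y for row_x, row_y in zip(X, Y) for x, y in zip(row_x, row_y))

def unit(v):
    """Normalize a nonzero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def rank_one_subgradient(a, b):
    """Assumption: for A = a b^T, u v^T (unit u, v) is a trace-norm subgradient."""
    return outer(unit(a), unit(b))

def monotonicity_gap(a, b, c, d):
    """Return <G_A - G_B, A - B> for A = a b^T and B = c d^T."""
    A, B = outer(a, b), outer(c, d)
    G_A, G_B = rank_one_subgradient(a, b), rank_one_subgradient(c, d)
    return inner(G_A, A) - inner(G_A, B) - inner(G_B, A) + inner(G_B, B)

rng = random.Random(0)

def rand_vec(k):
    """Random Gaussian vector of length k (nonzero with probability one)."""
    return [rng.gauss(0.0, 1.0) for _ in range(k)]

# Smallest gap over 1000 random pairs of rank-one 4 x 3 matrices.
min_gap = min(
    monotonicity_gap(rand_vec(4), rand_vec(3), rand_vec(4), rand_vec(3))
    for _ in range(1000)
)
```

Up to floating-point error, `min_gap` should be non-negative, which is the property that lets the proof swap $\partial\|\hat{Q}\|_{tr}$ for $\partial\|Q\|_{tr}$.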

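Define $Z_{i,j} = (\hat{Q}_{i,j} - Q_{i,j}) / \hat{Q}_{i,j}$. We have
\[ \frac{1}{m_*} \sum_{i=1}^{n} \sum_{j=1}^{m} d_{i,j} \frac{\hat{Q}_{i,j} - Q_{i,j}}{\hat{Q}_{i,j}} = \frac{1}{m_*} \sum_{i=1}^{n} \langle d_i e_i^\top, Z \rangle = \langle P, Z \rangle + \langle M, Z \rangle. \]
Thus the bound in Eq. (8) in the main paper is modified as
\[ -\sum_{i=1}^{n} \sum_{j=1}^{m} P_{i,j} \frac{\hat{Q}_{i,j} - Q_{i,j}}{\hat{Q}_{i,j}} + \varepsilon \langle \partial \|Q\|_{tr}, \hat{Q} - Q \rangle \le \sum_{i=1}^{n} \sum_{j=1}^{m} M_{i,j} \frac{\hat{Q}_{i,j} - Q_{i,j}}{\hat{Q}_{i,j}}. \]
We have
\[ \sum_{j=1}^{m} P_{i,j} \frac{\hat{Q}_{i,j} - Q_{i,j}}{\hat{Q}_{i,j}} = \sum_{j=1}^{m} \frac{(P_{i,j} - \hat{Q}_{i,j})(\hat{Q}_{i,j} - Q_{i,j})}{\hat{Q}_{i,j}} = \sum_{j=1}^{m} \frac{(P_{i,j} - Q_{i,j})^2 - (P_{i,j} - \hat{Q}_{i,j})^2 - (\hat{Q}_{i,j} - Q_{i,j})^2}{2 \hat{Q}_{i,j}}. \]
Define the matrix $B \in R^{n \times m}$ as $B_{i,j} = M_{i,j} / \hat{Q}_{i,j}$. Using the fact that $\hat{Q}_{i,j}$ is bounded, with upper bound $\mu_+$, and the result from Lemma 1, we have
\[ \frac{1}{2} \|P - \hat{Q}\|_1^2 + \frac{\|\hat{Q} - Q\|_F^2}{2 \mu_+} + \varepsilon \langle \partial \|Q\|_{tr}, \hat{Q} - Q \rangle \le \|M\| \, \|\hat{Q} - Q\|_{tr} + \frac{1}{2} \|P - Q\|_F^2. \]
We write the singular value decomposition of $Q$ as
\[ Q = \sum_{i=1}^{r} \sigma_i u_i v_i^\top, \]
where $r$ is the rank of $Q$, $\sigma_i$ is the $i$-th singular value of $Q$, and $u_i$, $v_i$ are the corresponding left and right singular vectors, with $U = (u_1, \dots, u_r)$ and $V = (v_1, \dots, v_r)$. Let $U_\perp \in R^{n \times (n-r)}$ and $V_\perp \in R^{m \times (m-r)}$ be the orthogonal bases complementary to $U$ and $V$, respectively. Define the linear operators $P_Q$ and $P_Q^\perp$ as
\[ P_Q(Z) = U U^\top Z + Z V V^\top - U U^\top Z V V^\top, \qquad P_Q^\perp(Z) = Z - P_Q(Z). \]
The subgradient $\partial \|Q\|_{tr}$ is given by the set
\[ \mathcal{W} = \left\{ U V^\top + U_\perp W V_\perp^\top : W \in R^{(n-r) \times (m-r)}, \ \|W\| \le 1 \right\}. \]
Thus, by choosing an appropriate matrix $W$ for the subgradient $\partial \|Q\|_{tr}$, we have
\[ \langle \partial \|Q\|_{tr}, \hat{Q} - Q \rangle \ge -\|P_Q(\hat{Q} - Q)\|_{tr} + \|P_Q^\perp(\hat{Q} - Q)\|_{tr}, \]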

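and therefore
\[ \frac{1}{2} \|P - \hat{Q}\|_1^2 + \frac{\|\hat{Q} - Q\|_F^2}{2 \mu_+} + \varepsilon \|P_Q^\perp(\hat{Q} - Q)\|_{tr} \le \varepsilon \|P_Q(\hat{Q} - Q)\|_{tr} + \|M\| \, \|\hat{Q} - Q\|_{tr} + \frac{1}{2} \|P - Q\|_F^2. \]
Using the fact $\varepsilon \ge \|M\|$, we have
\[ \|P - \hat{Q}\|_1^2 + \frac{\|\hat{Q} - Q\|_F^2}{\mu_+} \le 4 \varepsilon \|P_Q(\hat{Q} - Q)\|_{tr} + \|P - Q\|_F^2. \]
We consider two cases. In the first case, we assume
\[ \|P - \hat{Q}\|_1^2 \le \|P - Q\|_F^2, \]
in which the bound in the theorem trivially holds. In the second case, we have the opposite,
\[ \|P - \hat{Q}\|_1^2 > \|P - Q\|_F^2, \]
which implies
\[ \frac{\|\hat{Q} - Q\|_F^2}{\mu_+} \le 4 \varepsilon \|P_Q(\hat{Q} - Q)\|_{tr}, \]
and therefore
\[ \|P_Q(\hat{Q} - Q)\|_{tr} \le 4 \varepsilon r \mu_+. \]
We complete the proof by plugging in the above bound.

1.4 Proof of Theorem 4

Proof. Following the same analysis as that for Theorem 2 in the main paper (see Section 1.3 of this supplementary document for its proof), we have
\[ \sum_i \frac{(p_i - \hat{q}_i)^2}{\hat{q}_i} \le \sum_i \frac{z_i (p_i - \hat{q}_i)}{\hat{q}_i}. \]
Using the fact that $\hat{q}_i$ is bounded, with upper bound $\mu_+$, we have
\[ \frac{\|p - \hat{q}\|_2^2}{\mu_+} \le \|z\|_2 \, \|p - \hat{q}\|_2, \]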

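and therefore
\[ \|p - \hat{q}\|_2 \le \mu_+ \|z\|_2. \]
We finally complete the proof by using the fact
\[ \sum_i \frac{(p_i - \hat{q}_i)^2}{\hat{q}_i} \ge \|p - \hat{q}\|_1^2. \]

1.5 Proof of Theorem 5

Proof. We will use the Chernoff bound: let $X_1, \dots, X_{m_*}$ be independent draws from a Bernoulli distribution with $\Pr(X = 1) = \mu$. We have
\[ \Pr\left( \frac{1}{m_*} \sum_{i=1}^{m_*} X_i \ge (1 + \delta) \mu \right) \le \exp\left( -\frac{\delta^2 \mu m_*}{3} \right), \]
\[ \Pr\left( \frac{1}{m_*} \sum_{i=1}^{m_*} X_i \le (1 - \delta) \mu \right) \le \exp\left( -\frac{\delta^2 \mu m_*}{2} \right). \]
Using the Chernoff bound, we have, with a probability of at least $1 - 2 \exp(-\delta^2 \mu m_* / 3)$,
\[ |\bar{X} - \mu|^2 \le \delta^2 \mu^2, \]
where $\bar{X} = \frac{1}{m_*} \sum_{i=1}^{m_*} X_i$. By taking the union bound, we have, with a probability of at least $1 - 2 e^{-t}$,
[final bound on $z$, involving $t + \log m$ and $m_*$, not recoverable from the source text]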
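The two Chernoff tail bounds quoted in the proof of Theorem 5 can be compared against simulated tail frequencies. The following is a minimal sketch using only the Python standard library. The parameter values (`mu`, `m`, `delta`, the number of trials) are illustrative assumptions, not values from the paper, and the function names are hypothetical.

```python
import math
import random

def chernoff_bounds(mu, m, delta):
    """Upper- and lower-tail Chernoff bounds for the mean of m Bernoulli(mu) draws."""
    upper = math.exp(-delta ** 2 * mu * m / 3)
    lower = math.exp(-delta ** 2 * mu * m / 2)
    return upper, lower

def empirical_tails(mu, m, delta, trials, rng):
    """Estimate both tail probabilities of the sample mean by simulation."""
    hi = lo = 0
    for _ in range(trials):
        mean = sum(rng.random() < mu for _ in range(m)) / m
        hi += mean >= (1 + delta) * mu   # upper-tail event
        lo += mean <= (1 - delta) * mu   # lower-tail event
    return hi / trials, lo / trials

rng = random.Random(0)
mu, m, delta = 0.3, 200, 0.2   # illustrative values only
upper_bound, lower_bound = chernoff_bounds(mu, m, delta)
upper_freq, lower_freq = empirical_tails(mu, m, delta, trials=2000, rng=rng)
```

With these settings the simulated frequencies should fall well below the corresponding bounds, which is expected since Chernoff bounds are not tight for moderate deviations.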

2 Statistics about the Refined Datasets

Table 1. Statistics for the datasets used in the experiments. These datasets are not the original datasets but have been refined according to our setup. Note that NUS-WIDE has two types of tags: those automatically crawled from Flickr and used for model training, and those manually annotated.

                              ESP Game    IAPR TC12   MirFlickr   NUS-WIDE
Number of images              ,45         2,985       5,23        2,968
Visual feature dimension      5
Vocabulary size
Average tags per image
Min/max tags per image        5/5         5/23        4/43        9/5
Average images per tag
Min/max images per tag        6/3,439     4/4,752     /78         78/5,58
Number of observed tags m_*

m_* is the number of observed tags used when training our proposed model throughout the experimental section, unless otherwise specified.

3 Supplementary Experimental Results

In this section, we further present the experimental results of our proposed TCMR in comparison with the baseline approaches.

3.1 Comparison to the State-of-the-Art Tag Completion Methods

[Figure 1: six panels. (a) AR@N on Mir Flickr, (b) AR@N on ESP Game, (c) AR@N on NUS-WIDE, (d) C@N on Mir Flickr, (e) C@N on ESP Game, (f) C@N on NUS-WIDE. Methods compared: LRES, TMC, MC, FastTag, LSR, TagProp, RKML, vknn, TCMR.]

Fig. 1. Tag completion performance of the proposed method and state-of-the-art baselines on the Mir Flickr, ESP Game and NUS-WIDE datasets, reported by AR@N and C@N. This figure can be viewed as supplemental to Fig. 1 in the main paper.

3.2 Evaluation of Noisy Matrix Recovery

[Figure 2: eight panels. Top row, AR@N: (a) Mir Flickr, (b) ESP Game, (c) IAPR TC12, (d) NUS-WIDE. Bottom row, C@N: (e) Mir Flickr, (f) ESP Game, (g) IAPR TC12, (h) NUS-WIDE. Methods compared: Freq, LSA, tknn, LDA, LRES, plsa, TCMR.]

Fig. 2. Comparison of different topic models and matrix completion algorithms without taking into account the visual features. The top row is evaluated by AR@N, and the bottom row is evaluated by C@N. This figure can be viewed as supplemental to Fig. 2 in the main paper.

3.3 Sensitivity to the Number of Observed Tags

[Figure 3: two panels, x-axis: number of observed tags m_*. (a) AR@5 on IAPR TC12, (b) AR@5 on NUS-WIDE. Methods compared: LRES, MC, FastTag, LSR, TagProp, vknn, LSA, tknn, TCMR.]

Fig. 3. Tag completion performance with a varied number of observed tags. This figure can be viewed as supplemental to Fig. 3 in the main paper.
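For readers who want to reproduce numbers of the kind reported in this section, the sketch below shows one way to compute AR@N and C@N. The definitions are an assumption, since this supplementary document does not restate them: AR@N is taken as the mean over images of the fraction of ground-truth tags found among the top N predicted tags, and C@N as the fraction of images with at least one ground-truth tag among the top N. The function names and toy data are hypothetical.

```python
def average_recall_at_n(ranked_tags, true_tags, n):
    """Mean over images of |top-n predicted AND true| / |true| (assumed definition)."""
    recalls = []
    for ranked, truth in zip(ranked_tags, true_tags):
        if not truth:
            continue  # skip images without ground-truth tags
        hits = len(set(ranked[:n]) & set(truth))
        recalls.append(hits / len(truth))
    return sum(recalls) / len(recalls)

def coverage_at_n(ranked_tags, true_tags, n):
    """Fraction of images with at least one true tag in the top n (assumed definition)."""
    covered = [bool(set(ranked[:n]) & set(truth))
               for ranked, truth in zip(ranked_tags, true_tags)]
    return sum(covered) / len(covered)

# Toy data: three images, tags ranked by a hypothetical model's scores.
ranked = [["sky", "sea", "dog"], ["cat", "car", "tree"], ["road", "sun", "sky"]]
truth = [{"sky", "dog"}, {"tree"}, {"sea", "boat"}]

ar2 = average_recall_at_n(ranked, truth, 2)   # (1/2 + 0 + 0) / 3
c2 = coverage_at_n(ranked, truth, 2)          # 1 of 3 images covered
ar3 = average_recall_at_n(ranked, truth, 3)   # (1 + 1 + 0) / 3
c3 = coverage_at_n(ranked, truth, 3)          # 2 of 3 images covered
```

Both metrics are non-decreasing in N under these definitions, which matches the way the curves in the figures are read.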
