Relevance Aggregation Projections for Image Retrieval
CIVR 2008
Wei Liu, Wei Jiang, Shih-Fu Chang
wliu@ee.columbia.edu
Outline
Motivations and Formulation
Our Approach: Relevance Aggregation Projections
Experimental Results
Conclusions
Liu et al. Columbia University 2/28
Motivations and Formulation
Relevance feedback:
closes the semantic gap;
explores knowledge about the user's intention;
selects features and refines models.
Relevance feedback mechanism:
1. The user selects a query image.
2. The system presents the highest-ranked images to the user, excluding already-labeled ones.
3. During each iteration, the user marks relevant (positive) and irrelevant (negative) images.
4. The system gradually refines the retrieval results.
Problems
Small sample learning: the number of labeled images is extremely small.
High dimensionality: feature dimension > 100, number of labeled samples < 100.
Asymmetry: relevant data are coherent while irrelevant data are diverse.
Asymmetry in CBIR
(Figure: a query image, its coherent relevant images, and diverse irrelevant images.)
Possible Solutions
Asymmetry: margin-based separation around the query (figure: margin = 1 on either side of the query).
Small sample learning: semi-supervised learning.
Curse of dimensionality: dimensionality reduction.
Previous Work
Methods and their subspace dimension bounds:
LPP (NIPS 03): bound d
ARE (ACM MM 05): bound l-1
SSP (ACM MM 06): bound l-1
SR (ACM MM 07): bound 2
Compared on: use of labeled data, use of unlabeled data, handling of asymmetry, subspace dimension.
(image dim: d, total sample #: n, labeled sample #: l. In CBIR, n > d > l.)
Disadvantages
LPP: unsupervised.
SSP and SR: fail to exploit the asymmetry. SSP emphasizes the irrelevant set; SR treats the relevant and irrelevant sets equally.
ARE, SSP and SR: produce very low-dimensional subspaces (at most l-1 dimensions), especially SR (a 2D subspace).
Outline
Motivations and Formulation
Relevance Aggregation Projections (RAP)
Experimental Results
Conclusions
Symbols
n: total sample #, l: labeled sample #
d: original dimension, r: reduced dimension
X = [x_1, ..., x_l, x_{l+1}, ..., x_n] ∈ R^{d×n}: samples
X_l = [x_1, ..., x_l] ∈ R^{d×l}: labeled samples
F^+: relevant set, F^-: irrelevant set
l^+: relevant #, l^-: irrelevant #
A ∈ R^{d×r}: subspace, a ∈ R^d: projecting vector
G(V, E, W): graph, L = D - W: graph Laplacian
Graph Construction
Build a k-NN graph with affinity
W_ij = exp(-||x_i - x_j||^2 / σ^2) if x_i ∈ N_k(x_j) or x_j ∈ N_k(x_i), and W_ij = 0 otherwise.
That is, establish an edge if x_i is among the k nearest neighbors of x_j or x_j is among the k nearest neighbors of x_i.
Graph Laplacian regularizer: L = D - W ∈ R^{n×n}, used in the smoothness term.
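The graph construction above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code; the function name and parameter defaults are assumptions.

```python
import numpy as np

def knn_graph_laplacian(X, k=3, sigma=1.0):
    """Build the symmetric k-NN affinity W and Laplacian L = D - W.

    X: d x n data matrix (columns are samples). An edge (i, j) exists
    when x_i is among the k nearest neighbors of x_j or vice versa.
    """
    n = X.shape[1]
    # Pairwise squared Euclidean distances between columns.
    sq = np.sum(X**2, axis=0)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(D2, np.inf)           # exclude self-neighbors
    # k nearest neighbors of each sample.
    nn = np.argsort(D2, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), nn.ravel()] = True
    mask |= mask.T                         # the "or" rule makes W symmetric
    W = np.where(mask, np.exp(-D2 / sigma**2), 0.0)
    L = np.diag(W.sum(axis=1)) - W         # graph Laplacian L = D - W
    return W, L
```

By construction every row of L sums to zero, which is what makes a^T X L X^T a a smoothness penalty on the projections.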
Our Approach
min_{A ∈ R^{d×r}} tr(A^T X L X^T A)   (1.1)
s.t. A^T x_i = A^T Σ_{j∈F^+} x_j / l^+, ∀ i ∈ F^+   (1.2)
     ||A^T (x_i - Σ_{j∈F^+} x_j / l^+)||^2 ≥ r, ∀ i ∈ F^-   (1.3)
Target subspace A reduces the raw data from d dimensions to r dimensions.
Objective (1.1): minimize local scatter using both labeled and unlabeled data.
Constraint (1.2): aggregate the positive data (in F^+) at the positive center.
Constraint (1.3): push the negative data (in F^-) away from the positive center by at least r unit distances.
Constraints (1.2)-(1.3) directly address the asymmetry in CBIR.
Core Idea: Relevance Aggregation
An ideal subspace is one in which the relevant examples are aggregated into a single point and the irrelevant examples are simultaneously separated by a large margin.
Relevance Aggregation Projections
We transform eq. (1) to eq. (2) in terms of each column vector a of A (a is a projecting vector):
min_{a ∈ R^d} a^T X L X^T a   (2.1)
s.t. a^T x_i = a^T c^+, ∀ i ∈ F^+   (2.2)
     (a^T (x_i - c^+))^2 ≥ 1, ∀ i ∈ F^-   (2.3)
where c^+ = Σ_{j∈F^+} x_j / l^+ is the positive center.
Solution
Eq. (2.1)-(2.3) is a quadratically constrained quadratic optimization problem and thus hard to solve directly.
We first remove the constraints and then minimize the cost function, adopting a heuristic to construct the solution:
1. Find ideal 1D projections that satisfy the constraints.
2. Removing the constraints, solve one part of the solution.
3. Solve the remaining part of the solution.
Solution: Find Ideal Projections
Run PCA to get the r principal eigenvectors and renormalize them to get V = [v_1, ..., v_r] ∈ R^{d×r} such that V^T X X^T V = I.
On each vector v in V, |v^T x_i - v^T x_j| < 2 for all i, j = 1, ..., n.
Form the ideal 1D projections on each projecting direction v:
y_i = v^T c^+,       i ∈ F^+
y_i = v^T x_i,       i ∈ F^- and |v^T x_i - v^T c^+| ≥ 1
y_i = v^T c^+ + 1,   i ∈ F^- and 0 ≤ v^T x_i - v^T c^+ < 1
y_i = v^T c^+ - 1,   i ∈ F^- and -1 < v^T x_i - v^T c^+ < 0
y = [y_1, ..., y_l]^T ∈ R^l   (3)
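The case analysis of eq. (3) can be sketched as follows; a minimal illustration, with the function name and index arguments assumed rather than taken from the paper.

```python
import numpy as np

def ideal_targets(v, Xl, pos_idx, neg_idx):
    """Form the ideal 1D projections y of eq. (3) for one direction v.

    Relevant samples collapse to the projected positive center v^T c+;
    irrelevant samples already outside the unit margin keep their
    projections, while those inside are pushed to v^T c+ +/- 1.
    """
    proj = v @ Xl                      # v^T x_i for each labeled sample
    c = proj[pos_idx].mean()           # v^T c+, the projected positive center
    y = np.empty_like(proj)
    y[pos_idx] = c                     # positives collapse to the center
    for i in neg_idx:
        diff = proj[i] - c
        if abs(diff) >= 1.0:
            y[i] = proj[i]             # already outside the margin: keep it
        elif diff >= 0.0:
            y[i] = c + 1.0             # push to the right margin boundary
        else:
            y[i] = c - 1.0             # push to the left margin boundary
    return y
```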
(Figure: forming y for one direction v. The labeled projections v^T X_l are shown on a line around v^T c^+; irrelevant samples with |v^T x_i - v^T c^+| ≥ 1 keep their projections, while those inside the margin are moved to v^T c^+ ± 1.)
The vector y is formed according to each PCA vector v.
Solution: QR Factorization
Remove constraints (2.2)-(2.3) by solving the linear system
X_l^T a = y   (4)
Because l < d, eq. (4) is underdetermined and thus can be strictly satisfied.
Perform the QR factorization X_l = [Q_1 Q_2] [R; 0] = Q_1 R.
The optimal solution is the sum of a particular solution and a complementary solution, i.e.
a = Q_1 b_1 + Q_2 b_2   (5)
where b_1 = (R^T)^{-1} y.
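The QR step can be sketched with NumPy's complete QR factorization. This is an illustrative sketch, assuming X_l has full column rank; names are mine.

```python
import numpy as np

def qr_particular_solution(Xl, y):
    """Solve the underdetermined system Xl^T a = y exactly via QR.

    Xl: d x l labeled-data matrix with l < d. With Xl = [Q1 Q2][R; 0],
    a = Q1 b1 + Q2 b2 satisfies eq. (4) for ANY choice of b2, where
    b1 = (R^T)^{-1} y; the free part b2 is fixed later by regularization.
    """
    d, l = Xl.shape
    Q, R = np.linalg.qr(Xl, mode='complete')   # Q is d x d, R is d x l
    Q1, Q2 = Q[:, :l], Q[:, l:]
    R1 = R[:l, :]                              # upper-triangular l x l block
    b1 = np.linalg.solve(R1.T, y)              # b1 = (R^T)^{-1} y
    return Q1, Q2, R1, b1
```

Because Xl^T Q2 = 0, the Q2 b2 component never disturbs the constraints: the labeled data pin down only l of the d degrees of freedom.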
Solution: Regularization
We want the final solution not to deviate too much from the PCA solution, so we develop a regularization framework:
f(a) = ||a - v||^2 + γ a^T X L X^T a   (6)
γ > 0 controls the trade-off between the PCA solution and data locality preserving (the original loss function); the second term behaves as a regularization term.
Plugging a = Q_1 b_1 + Q_2 b_2 into eq. (6), we solve
b_2 = (I + γ Q_2^T X L X^T Q_2)^{-1} (Q_2^T v - γ Q_2^T X L X^T Q_1 b_1)
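The closed-form b_2 follows from setting the gradient of eq. (6) with respect to b_2 to zero. A minimal sketch (function name mine, S standing for X L X^T):

```python
import numpy as np

def regularized_b2(Q1, Q2, b1, S, v, gamma):
    """Fix the free component b2 of eq. (5) by minimizing eq. (6).

    With f(a) = ||a - v||^2 + gamma * a^T S a and S = X L X^T,
    stationarity in b2 gives
    b2 = (I + gamma Q2^T S Q2)^{-1} (Q2^T v - gamma Q2^T S Q1 b1).
    """
    m = Q2.shape[1]
    A = np.eye(m) + gamma * (Q2.T @ S @ Q2)
    rhs = Q2.T @ v - gamma * (Q2.T @ S @ (Q1 @ b1))
    return np.linalg.solve(A, rhs)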
Algorithm
1. Construct a k-NN graph: W, L, S = X L X^T.
2. PCA initialization: V = [v_1, ..., v_r].
3. QR factorization: Q_1, Q_2, R.
4. Transductive regularization, for j = 1, ..., r:
   form y with v_j;
   b_1 = (R^T)^{-1} y;
   b_2 = (I + γ Q_2^T S Q_2)^{-1} (Q_2^T v_j - γ Q_2^T S Q_1 b_1);
   a_j = Q_1 b_1 + Q_2 b_2.
5. Projecting: x → [a_1, ..., a_r]^T x.
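Steps 1-5 can be combined into one compact end-to-end sketch. This is an illustrative reimplementation under my own naming and default parameters, not the authors' code, and it assumes l < d and full-rank inputs.

```python
import numpy as np

def rap(X, pos_idx, neg_idx, r=2, k=3, sigma=1.0, gamma=0.1):
    """Compact sketch of the RAP algorithm: returns A (d x r)."""
    d, n = X.shape
    labeled = list(pos_idx) + list(neg_idx)
    l, npos = len(labeled), len(pos_idx)
    Xl = X[:, labeled]
    # Step 1: k-NN graph, Laplacian L = D - W, smoothness matrix S = X L X^T.
    sq = np.sum(X**2, axis=0)
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)
    np.fill_diagonal(D2, np.inf)
    nn = np.argsort(D2, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), nn.ravel()] = True
    mask |= mask.T
    W = np.where(mask, np.exp(-D2 / sigma**2), 0.0)
    L = np.diag(W.sum(axis=1)) - W
    S = X @ L @ X.T
    # Step 2: PCA initialization, rescaled so that V^T X X^T V = I.
    evals, evecs = np.linalg.eigh(X @ X.T)
    top = np.argsort(evals)[::-1][:r]
    V = evecs[:, top] / np.sqrt(evals[top])
    # Step 3: QR factorization of the labeled-data matrix.
    Q, R = np.linalg.qr(Xl, mode='complete')
    Q1, Q2, R1 = Q[:, :l], Q[:, l:], R[:l, :]
    # Steps 4-5: one projection vector per PCA direction.
    A = np.zeros((d, r))
    for j in range(r):
        v = V[:, j]
        proj = v @ Xl
        c = proj[:npos].mean()                # projected positive center
        y = np.empty(l)
        y[:npos] = c                          # positives collapse to the center
        for t in range(npos, l):
            diff = proj[t] - c
            y[t] = proj[t] if abs(diff) >= 1 else (c + 1 if diff >= 0 else c - 1)
        b1 = np.linalg.solve(R1.T, y)         # particular part: Xl^T a = y
        M = np.eye(d - l) + gamma * (Q2.T @ S @ Q2)
        b2 = np.linalg.solve(M, Q2.T @ v - gamma * (Q2.T @ S @ (Q1 @ b1)))
        A[:, j] = Q1 @ b1 + Q2 @ b2
    return A
```

Because the constraints are enforced exactly through Xl^T a = y, every relevant sample lands on the same point of the learned subspace, and every irrelevant sample stays at least one unit from that point per dimension.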
Outline
Motivations and Formulation
Our Approach: Relevance Aggregation Projections
Experimental Results
Conclusions
Experimental Setup
Corel image database: 10,000 images, 100 images per category.
Features: two types of color features and two types of texture features, 91 dimensions in total.
Five feedback iterations; the top-10 ranked images are labeled in each iteration.
The statistical average top-N precision is used for performance evaluation.
Evaluation
(Figures: average top-N precision curves over the feedback iterations.)
Outline
Motivations and Formulation
Our Approach: Relevance Aggregation Projections
Experimental Results
Conclusions
Conclusions
We develop RAP to simultaneously address three fundamental issues in relevance feedback:
asymmetry between classes;
small sample size (by incorporating unlabeled samples);
high dimensionality.
RAP learns a semantic subspace in which the relevant samples collapse to a single point while the irrelevant samples are pushed outward with a large margin.
RAP can also be used to solve imbalanced semi-supervised learning problems with few labeled data.
Experiments on Corel demonstrate that RAP achieves significantly higher precision than the state of the art.
Thanks!
http://www.ee.columbia.edu/~wliu/