Kronecker Decomposition for Image Classification


University of Innsbruck, Institute of Computer Science, Intelligent and Interactive Systems

Kronecker Decomposition for Image Classification

Sabrina Fontanella (1,2), Antonio Rodríguez-Sánchez (1), Justus Piater (1), and Sandor Szedmak (3)
(1) University of Innsbruck, (2) University of Salerno, (3) Aalto University

Évora, September 2016

Outline
- Image classification: the problem
- Decomposing the environment
- The tensor decomposition: what it is, compression, interpretation of the image components
- Learning approach: Maximum Margin Regression
- Experimental evaluation: ImageCLEF 2015
- Experimental evaluation: Pascal and Flickr


Image classification I

Images are classified according to their visual content. Applicability:
1. Recognition of specific objects
2. Indoor/outdoor recognition
3. Analysis of medical images

Image classification II

Example of a classification algorithm, Bag of Words:
1. Feature extraction, stored in feature vectors
2. Approximation of the distribution of the features by a histogram
3. Application of a classification algorithm (Support Vector Machine, Neural Network, Markov Random Field, etc.)

A minimal sketch of this pipeline is given below.
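The following is an illustrative sketch of such a Bag-of-Words pipeline, not the authors' implementation: it assumes precomputed local descriptors, uses scikit-learn's KMeans for the visual codebook and an SVM as the final classifier, and all data and sizes here are toy choices.

```python
# Minimal Bag-of-Words sketch: descriptors -> codebook histogram -> SVM.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def bow_histogram(descriptors, codebook):
    """Quantize local descriptors and build a normalized word histogram."""
    words = codebook.predict(descriptors)  # nearest visual word per descriptor
    hist, _ = np.histogram(words, bins=np.arange(codebook.n_clusters + 1))
    return hist / max(hist.sum(), 1)

# Toy data: each "image" is a set of 128-D descriptors (e.g., SIFT-like).
rng = np.random.default_rng(0)
train_desc = [rng.normal(size=(200, 128)) for _ in range(20)]
labels = rng.integers(0, 2, size=20)

codebook = KMeans(n_clusters=50, n_init=10, random_state=0)
codebook.fit(np.vstack(train_desc))        # steps 1-2: learn the codebook

X = np.array([bow_histogram(d, codebook) for d in train_desc])
clf = SVC(kernel="rbf").fit(X, labels)     # step 3: train the classifier
```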

Relations between objects are of interest

Is it possible to recognize relationships between the objects appearing in a scene? This is of interest, since such relationships can provide knowledge necessary to identify and classify the image.
E.g., a car is quite likely to be in an image where there are also buildings and people.
E.g., a zebra is quite likely to be outdoors, surrounded by savanna plants or animals.


Decomposing the environment

Structured decomposition of the environment: learning structured output is a popular stream of machine learning. By decomposing the matrix that represents the image, the structure behind the scene can be captured. Let us consider 2D image decomposition: points close to each other within continuous 2D blocks can relate strongly to each other.


Tensor decomposition

A tensor is a multidimensional or N-way array: an N-way or Nth-order tensor is an element of the tensor product of N vector spaces. Tensor decomposition can be considered a higher-order generalization of the matrix singular value decomposition (SVD) and of principal component analysis (PCA). The tensor decomposition of a given image is not unique. Given an RGB image of size (256,256,3), it is possible to perform the following decompositions:
- (16,16,3), (16,16,1): tensor + matrix (2 components)
- (8,8,3), (8,8,1), (4,4,1): tensor + 2 matrices (3 components)

A quick shape check of these decompositions is shown below.
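A small sanity check of this shape bookkeeping, assuming NumPy's `kron` (which multiplies array shapes elementwise, also for N-dimensional arrays):

```python
# Kronecker factors multiply elementwise in shape, so (16,16,3) x (16,16,1)
# reproduces a (256,256,3) image.
import numpy as np

A = np.random.rand(16, 16, 3)   # tensor component
B = np.random.rand(16, 16, 1)   # matrix component (trailing singleton axis)
print(np.kron(A, B).shape)      # (256, 256, 3)

# Three components: (8,8,3) x (8,8,1) x (4,4,1) -> (256,256,3)
X3 = np.kron(np.kron(np.random.rand(8, 8, 3), np.random.rand(8, 8, 1)),
             np.random.rand(4, 4, 1))
print(X3.shape)                 # (256, 256, 3)
```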

Tensor decomposition

In computer vision, tensor decomposition can be used to represent:
- Color images, where three matrices express the RGB channels and we can use a tensor of order three (for example (1024,1024,3)).
- Video streams of color images, where the dimensions are R, G, B and time.

The Kronecker product

Given two matrices $A \in \mathbb{R}^{m_A \times n_A}$ and $B \in \mathbb{R}^{m_B \times n_B}$, the Kronecker product $X$ can be expressed as:

$$X = A \otimes B = \begin{bmatrix} A_{1,1}B & A_{1,2}B & \cdots & A_{1,n_A}B \\ A_{2,1}B & A_{2,2}B & \cdots & A_{2,n_A}B \\ \vdots & \vdots & \ddots & \vdots \\ A_{m_A,1}B & A_{m_A,2}B & \cdots & A_{m_A,n_A}B \end{bmatrix}$$

with $m_X = m_A m_B$, $n_X = n_A n_B$.

If $X$ is given (the image), how can we compute $A$ and $B$ (its components)? $B$ can be considered a 2D filter of the image represented by the matrix $X$.
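The block structure in the formula above can be checked directly with NumPy (a tiny demo, not part of the original slides):

```python
# Block (i, j) of X = A kron B equals A[i, j] * B, so X is (m_A*m_B) x (n_A*n_B).
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 5],
              [6, 7]])
X = np.kron(A, B)                              # shape (4, 4)
assert np.allclose(X[0:2, 2:4], A[0, 1] * B)   # the A_{1,2} * B block
```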

The Kronecker decomposition and SVD

The Kronecker decomposition can be carried out by Singular Value Decomposition (SVD). Given an arbitrary matrix $X$ of size $m \times n$, the SVD is given by $X = USV^T$ where
- $U \in \mathbb{R}^{m \times m}$ is an orthogonal matrix of left singular vectors, with $UU^T = I_m$,
- $V \in \mathbb{R}^{n \times n}$ is an orthogonal matrix of right singular vectors, with $VV^T = I_n$,
- $S \in \mathbb{R}^{m \times n}$ is a diagonal matrix containing the nonnegative singular values on its diagonal.
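In NumPy this definition reads as follows (illustrative check only):

```python
# np.linalg.svd mirrors the definition above: U is m x m, V is n x n, and the
# singular values come back as a vector filling the diagonal of S.
import numpy as np

X = np.random.rand(5, 3)
U, s, Vt = np.linalg.svd(X)              # full_matrices=True by default
S = np.zeros((5, 3))
S[:3, :3] = np.diag(s)                   # embed singular values in an m x n S
assert np.allclose(U @ S @ Vt, X)
assert np.allclose(U @ U.T, np.eye(5))   # U U^T = I_m
assert np.allclose(Vt @ Vt.T, np.eye(3)) # V V^T = I_n
```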

Note

The algorithm solving the SVD does not depend on the order of the elements of the matrix. Thus, any permutation (reordering) of the columns and/or rows preserves the same solution. We can therefore work on a reordered representation of the matrix $X$.

Algorithm for solving the Kronecker decomposition
1. Reorder the matrix
2. Compute the SVD decomposition
3. Compute the approximation of $X$
4. Invert the reordering

Nearest Kronecker Product (NKP)

Given a matrix $X \in \mathbb{R}^{m \times n}$, the NKP problem involves minimizing

$$\varphi(A, B) = \| X - A \otimes B \|_F$$

where $\|\cdot\|_F$ is the Frobenius norm. This problem can be solved using the SVD, working on a reordered representation of $X$.

Step 1: Reorder the matrix X ¹

$$X = \begin{bmatrix}
x_{11} & x_{12} & x_{13} & x_{14} & x_{15} & x_{16} \\
x_{21} & x_{22} & x_{23} & x_{24} & x_{25} & x_{26} \\
x_{31} & x_{32} & x_{33} & x_{34} & x_{35} & x_{36} \\
x_{41} & x_{42} & x_{43} & x_{44} & x_{45} & x_{46} \\
x_{51} & x_{52} & x_{53} & x_{54} & x_{55} & x_{56} \\
x_{61} & x_{62} & x_{63} & x_{64} & x_{65} & x_{66}
\end{bmatrix}
= A \otimes B
= \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\otimes \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}$$

can be reordered into the rank-one matrix

$$\tilde{X} = \begin{bmatrix}
x_{11} & x_{13} & x_{15} & x_{31} & x_{33} & x_{35} & x_{51} & x_{53} & x_{55} \\
x_{12} & x_{14} & x_{16} & x_{32} & x_{34} & x_{36} & x_{52} & x_{54} & x_{56} \\
x_{21} & x_{23} & x_{25} & x_{41} & x_{43} & x_{45} & x_{61} & x_{63} & x_{65} \\
x_{22} & x_{24} & x_{26} & x_{42} & x_{44} & x_{46} & x_{62} & x_{64} & x_{66}
\end{bmatrix}
= \begin{bmatrix} b_{11} \\ b_{12} \\ b_{21} \\ b_{22} \end{bmatrix}
\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{21} & a_{22} & a_{23} & a_{31} & a_{32} & a_{33} \end{bmatrix}$$

¹ C. F. Van Loan. The ubiquitous Kronecker product. Journal of Computational and Applied Mathematics, 123:85-100, 2000.

Approximation of X and reordering

$$\min_{A,B} \| \tilde{X} - \mathrm{vec}(B)\,\mathrm{vec}(A)^T \|_F$$

vec(·) is a vectorization operator that stacks the rows of a matrix into a single vector, as in the reordering above. This is the problem of finding the nearest rank-1 matrix to $\tilde{X}$, which has well-known solutions using the SVD.

Step 2: Compute the SVD decomposition

Let $\tilde{X} = USV^T$ be the decomposition of $\tilde{X}$. The best $\tilde{A}$ and $\tilde{B}$ are given by

$$\mathrm{vec}(\tilde{B}) = \sqrt{\sigma_1}\, U(:,1) \quad \text{and} \quad \mathrm{vec}(\tilde{A}) = \sqrt{\sigma_1}\, V(:,1)$$

where $\sigma_1$ is the largest singular value and $U(:,1)$, $V(:,1)$ are the corresponding left and right singular vectors.

Steps 3 and 4: Approximation and reordering

Once we have $\tilde{A}$ and $\tilde{B}$, it is possible to compute the approximation of $X$. Since at the beginning we changed the order of the values in the matrix, inverting the reordering is necessary to obtain the original $A$ and $B$.
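A compact sketch of all four steps, under the shapes and the row-wise vec convention used above (an illustrative implementation, not the authors' code):

```python
# X is (ma*mb) x (na*nb); reorder(A kron B) = vec(B) vec(A)^T with row-wise vec.
import numpy as np

def reorder(X, ma, na, mb, nb):
    """Step 1: rearrange X so that an exact Kronecker product becomes rank 1."""
    blocks = X.reshape(ma, mb, na, nb)  # X[i*mb+k, j*nb+l] = A[i,j] * B[k,l]
    return blocks.transpose(1, 3, 0, 2).reshape(mb * nb, ma * na)

def nearest_kronecker(X, ma, na, mb, nb):
    """Steps 2-4: SVD of the reordered matrix, then reshape back (un-reorder)."""
    U, s, Vt = np.linalg.svd(reorder(X, ma, na, mb, nb), full_matrices=False)
    B = np.sqrt(s[0]) * U[:, 0].reshape(mb, nb)   # vec(B) = sqrt(sigma1) * u1
    A = np.sqrt(s[0]) * Vt[0, :].reshape(ma, na)  # vec(A) = sqrt(sigma1) * v1
    return A, B

# Usage: an exact Kronecker product is recovered (up to a scale exchanged
# between the two factors, which cancels in A kron B).
A0, B0 = np.random.rand(3, 3), np.random.rand(2, 2)
X = np.kron(A0, B0)
A, B = nearest_kronecker(X, 3, 3, 2, 2)
assert np.allclose(np.kron(A, B), X)
```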

Components and factorization

The number of components and the factorization influence the level of detail. Given, for example, a gray image of size (1024,1024):
- If it has many details, it is better to choose many components with a small factorization, e.g. (4,4)(4,4)(4,4)(4,4)(4,4).
- If it is less detailed, fewer components with a large factorization, e.g. (32,32)(32,32).

A sketch of such a multi-component decomposition follows.
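One plausible way to obtain such a multi-component decomposition is a greedy chain that factors the image and then keeps factoring the remainder; this is an assumption for illustration, not necessarily the authors' exact scheme. It reuses `nearest_kronecker` from the sketch above.

```python
import numpy as np

def kronecker_chain(X, shapes):
    """Greedy chain X ~ C1 kron C2 kron ... with the given component shapes."""
    components, rest = [], X
    for ma, na in shapes[:-1]:
        mb, nb = rest.shape[0] // ma, rest.shape[1] // na
        C, rest = nearest_kronecker(rest, ma, na, mb, nb)
        components.append(C)     # peel off one component...
    components.append(rest)      # ...the last component is the remainder
    return components

img = np.random.rand(1024, 1024)
coarse = kronecker_chain(img, [(32, 32), (32, 32)])   # 2 components
fine = kronecker_chain(img, [(4, 4)] * 5)             # 5 components
print([c.shape for c in fine])  # [(4, 4), (4, 4), (4, 4), (4, 4), (4, 4)]
```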


Compression I

The tensor decomposition can provide a very high level of image compression: it takes into consideration only the largest singular values (Eckart-Young theorem). The level of compression is given by the ratio between the total number of:
- elements in the image matrix
- elements of the components in the decomposition

Compression II

Let
- $n_{sv}$ be the number of singular values taken into consideration,
- $n_f$ the number of factors per component,
- $v$ the value of the factors,
- $n_c$ the number of components used.

Then the total number of elements of the components is given by

$$n_{sv} \cdot n_c \cdot v^{n_f}$$

To simplify the notation we assume that all factors are equal for every component. Decompositions with different factors can also be considered, for example (32,28)(16,8)(2,4).

Compression III: Example

Given an image of size (1024,1024), it can be compressed:
- with components (32,32)(32,32) and 10 singular values, by $\frac{1024^2}{10 \cdot 2 \cdot 32^2} = 51.2$
- with components (4,4)(4,4)(4,4)(4,4)(4,4) and 10 singular values, by $\frac{1024^2}{10 \cdot 5 \cdot 4^2} = 1310.72$
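The two ratios above can be reproduced with a one-line helper implementing the formula from the previous slide (illustrative only):

```python
# image elements / stored component elements = m*n / (n_sv * n_c * v**n_f)
def compression_ratio(m, n, n_sv, n_c, v, n_f):
    return (m * n) / (n_sv * n_c * v ** n_f)

print(compression_ratio(1024, 1024, n_sv=10, n_c=2, v=32, n_f=2))  # 51.2
print(compression_ratio(1024, 1024, n_sv=10, n_c=5, v=4, n_f=2))   # 1310.72
```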

Compression IV: Example

Figure: Example of compression on the toys room image, with compression ratios 202 and 99.


Interpretation of image components I

$X = A \otimes B$: $B$ can be interpreted as an image filter. It finds the boundaries of the critical regions where most of the structural information concentrates. This represents a big advantage: in general, image filtering processes use a predetermined filter, whereas the Kronecker decomposition automatically tries to predict the optimal filters.

Interpretation of image components II

Figure: Toys room picture and its components. The highest components (A) and the lowest components (B) correspond to the matrices $A_1, \ldots$ and $B_1, \ldots$ respectively.


Learning

Sample set of pairs of output and input objects:

$$\{(y_i, x_i) : y_i \in \mathcal{Y},\ x_i \in \mathcal{X},\ i = 1, \ldots, m\}$$

Define two functions, $\phi$ and $\psi$, that map the input and output objects respectively into linear vector spaces: a feature space in the case of the input, a label space in the case of the output:

$$\phi : \mathcal{X} \to \mathcal{H}_\phi \quad \text{and} \quad \psi : \mathcal{Y} \to \mathcal{H}_\psi$$

Objective

Find a linear function acting on the feature space,

$$f(\phi(x)) = W\phi(x) + b,$$

that produces a prediction of every input object in the label space. The output corresponding to $x$ is:

$$y = \psi^{-1}(f(\phi(x)))$$

MMR (Maximum Margin Regression) vs SVM (Support Vector Machine)

MMR is a framework for multilabel classification based on the Support Vector Machine (SVM). The key idea is a reinterpretation of the normal vector $w$:

SVM: $w$ is the normal vector of the separating hyperplane; $y_i \in \{-1, +1\}$ are binary outputs; the labels are equal to the binary outputs.

Extended view: $W$ is a linear operator projecting the feature space into the label space; $y_i \in \mathcal{Y}$ are arbitrary outputs; $\psi(y_i) \in \mathcal{H}_\psi$ are the labels, the embedded outputs in a linear vector space.
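The extended view can be illustrated with a toy sketch: a matrix $W$ maps feature vectors into the label space, and multilabel prediction thresholds the projected scores. Note the loud caveat: ridge regression is used here only as a simple stand-in for the actual MMR optimizer (which solves a margin-based problem), and all dimensions and data are invented for illustration.

```python
# Toy "extended view": W projects feature space into label space.
import numpy as np

rng = np.random.default_rng(1)
Phi = rng.normal(size=(100, 64))                  # phi(x_i): feature vectors
Psi = (rng.random((100, 5)) > 0.7) * 2.0 - 1.0    # psi(y_i): +/-1 label vectors

lam = 1.0                                          # ridge penalty (stand-in)
W = Psi.T @ Phi @ np.linalg.inv(Phi.T @ Phi + lam * np.eye(64))

scores = W @ Phi[0]                                # project into label space
y_pred = np.where(scores > 0, 1, -1)               # per-class multilabel decision
```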


ImageCLEF dataset

Task: multi-label classification

Figure: The hierarchy of classes in the ImageCLEF multi-label challenge.

Results on ImageCLEF

Figure: F1 scores for six filter sizes (4, 8, 12, 20, 18 and 32) using 3 components, training with two different kernels: (a) polynomial, varying the degree of the polynomial from 1 to 10; (b) Gaussian, varying the standard deviation of the Gaussian.


Pascal and Flickr: Features to compare to

Feature        | Dimension | Source  | Descriptor
Hsv            | 4096      | color   | HSV
Lab            | 4096      | color   | LAB
Rgb            | 4096      | color   | RGB
HsvV3H1        | 5184      | color   | HSV
LabV3H1        | 5184      | color   | LAB
RgbV3H1        | 5184      | color   | RGB
DenseHue       | 100       | texture | hue
HarrisHue      | 100       | texture | hue
DenseHueV3H1   | 300       | texture | hue
HarrisHueV3H1  | 300       | texture | hue
DenseSift      | 1000      | texture | sift
HarrisSift     | 1000      | texture | sift
DenseSiftV3H1  | 3000      | texture | sift
HarrisSiftV3H1 | 3000      | texture | sift

Table: Features ¹ used for comparison with the tensor decomposition on the Pascal07 and Flickr datasets.

¹ Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek, and Cordelia Schmid. TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation, 2009.

Results on Pascal07 dataset

Gaussian kernel:

Feature        | P      | R      | F1
TD             | 0.4158 | 0.2877 | 0.3400
HarrisSiftV3H1 | 0.4623 | 0.4491 | 0.4552
HarrisSift     | 0.4202 | 0.4895 | 0.4522
DenseSiftV3H1  | 0.4189 | 0.4886 | 0.4510
DenseSift      | 0.3750 | 0.5044 | 0.4302
LabV3H1        | 0.3911 | 0.3366 | 0.3618
DenseHueV3H1   | 0.3884 | 0.3282 | 0.3558
HarrisHueV3H1  | 0.3274 | 0.3884 | 0.3552
RgbV3H1        | 0.3907 | 0.3224 | 0.3533
HsvV3H1        | 0.4080 | 0.3048 | 0.3489
Hsv            | 0.3911 | 0.3085 | 0.3449
Lab            | 0.4135 | 0.2920 | 0.3423
Rgb            | 0.3857 | 0.2985 | 0.3350
HarrisHue      | 0.3930 | 0.2887 | 0.3328
DenseHue       | 0.3962 | 0.2828 | 0.3299

Polynomial kernel:

Feature        | P      | R      | F1
TD             | 0.3931 | 0.2855 | 0.3308
HarrisSiftV3H1 | 0.4002 | 0.5520 | 0.4640
HarrisSift     | 0.3728 | 0.5523 | 0.4449
DenseSiftV3H1  | 0.3592 | 0.5663 | 0.4396
DenseSift      | 0.3442 | 0.5337 | 0.4184
HsvV3H1        | 0.3815 | 0.3295 | 0.3536
RgbV3H1        | 0.3479 | 0.3551 | 0.3515
LabV3H1        | 0.3106 | 0.3868 | 0.3434
HarrisHueV3H1  | 0.3110 | 0.3894 | 0.3417
DenseHueV3H1   | 0.3166 | 0.3607 | 0.3363
Hsv            | 0.3390 | 0.3232 | 0.3309
HarrisHue      | 0.3037 | 0.3597 | 0.3241
Rgb            | 0.2906 | 0.3420 | 0.3135
Lab            | 0.2800 | 0.3389 | 0.3031
DenseHue       | 0.2808 | 0.3329 | 0.2995

Table: Comparing the tensor decomposition (TD) with other features ¹ on the Pascal07 dataset with Gaussian and polynomial kernels. The decomposition chosen is 3 components with factorization (22,22).

¹ Guillaumin et al. (2009); see the features table above.

Results on Flickr dataset

Gaussian kernel:

Feature        | P      | R      | F1
TD             | 0.3164 | 0.3780 | 0.3118
HarrisSiftV3H1 | 0.5470 | 0.3842 | 0.4512
DenseSift      | 0.5438 | 0.3862 | 0.4515
HarrisSift     | 0.5368 | 0.3780 | 0.4435
DenseSiftV3H1  | 0.5475 | 0.3807 | 0.4491
LabV3H1        | 0.4693 | 0.3200 | 0.3806
HarrisHueV3H1  | 0.4368 | 0.3288 | 0.3752
DenseHueV3H1   | 0.4221 | 0.3333 | 0.3723
HsvV3H1        | 0.4570 | 0.3062 | 0.3667
HarrisHue      | 0.3753 | 0.3435 | 0.3587
RgbV3H1        | 0.4150 | 0.3089 | 0.3542
Lab            | 0.4153 | 0.3016 | 0.3494
DenseHue       | 0.3854 | 0.3187 | 0.3477
Rgb            | 0.4181 | 0.2824 | 0.3371
Hsv            | 0.4152 | 0.2762 | 0.3317

Polynomial kernel:

Feature        | P      | R      | F1
TD             | 0.2311 | 0.2615 | 0.2453
HarrisSiftV3H1 | 0.5289 | 0.4646 | 0.4940
DenseSiftV3H1  | 0.5328 | 0.4415 | 0.4828
HarrisSift     | 0.5260 | 0.4447 | 0.4819
DenseSift      | 0.5132 | 0.4316 | 0.4688
LabV3H1        | 0.4508 | 0.3533 | 0.3961
HsvV3H1        | 0.3961 | 0.3655 | 0.3798
HarrisHueV3H1  | 0.4115 | 0.3490 | 0.3777
DenseHueV3H1   | 0.4086 | 0.3445 | 0.3737
RgbV3H1        | 0.3996 | 0.3460 | 0.3704
Lab            | 0.2717 | 0.5600 | 0.3658
DenseHue       | 0.2698 | 0.5249 | 0.3564
HarrisHue      | 0.3294 | 0.4159 | 0.3561
Hsv            | 0.3603 | 0.3602 | 0.3540
Rgb            | 0.3495 | 0.3406 | 0.3443

Table: Comparing the tensor decomposition (TD) with other features ¹ on the Flickr dataset with Gaussian and polynomial kernels. The decomposition chosen is 3 components with factorization (22,22).

¹ Guillaumin et al. (2009); see the features table above.

Conclusions

We have presented a method for feature extraction based on a decomposition of the environment.
Pros:
1. Compression
2. Automatic prediction of the best filters to use for extracting features
Cons:
1. Different decompositions can strongly influence the final result
2. Lack of a mechanism for automatically choosing the best parameters