Foundations of Large-Scale Multimedia Information Management and Retrieval
Lecture #4: Similarity
Edward Y. Chang


Similar?

Two Key Technical Problems
Curse of dimensionality
Modeling subjectivity: query-, user-, and application-dependent

Dimensionality Curse
D: data dimension. When D increases:
Nearest neighbors are no longer local
All points become nearly equidistant

Sparse High-D Space [C. Aggarwal et al., ICDT 2001]
Hyper-cube range queries: for data uniform in the unit cube, the probability that a point falls in a query cube of side s in d dimensions is P[s] = s^d

Range Coverage → 0%

Sparse High-D Space: Spherical Range Queries

P[R_d^sp(Q, 0.5)] = π^(d/2) (0.5)^d / Γ(d/2 + 1), i.e., the volume of a d-dimensional ball of radius 0.5, which vanishes as d grows
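The two selectivity formulas are easy to check numerically. Below is a minimal sketch (not from the lecture; the function names are illustrative) that evaluates both the hyper-cube and the spherical range-query selectivities for side/radius 0.5 as the dimension d grows, assuming data uniformly distributed in the unit cube.

```python
import math

def hypercube_selectivity(s, d):
    # P[point falls in a hyper-cube query of side s] = s**d
    return s ** d

def sphere_selectivity(r, d):
    # Volume of a d-ball of radius r: pi**(d/2) * r**d / Gamma(d/2 + 1)
    return math.pi ** (d / 2) * r ** d / math.gamma(d / 2 + 1)

for d in (2, 10, 50, 100):
    print(f"d={d:3d}  cube: {hypercube_selectivity(0.5, d):.2e}  "
          f"sphere: {sphere_selectivity(0.5, d):.2e}")
```

Both quantities shrink toward zero exponentially fast in d, which is why range coverage goes to 0%.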

No Point in the Nearest Neighborhood

Dimensionality Curse

Equidistant Points: 4D vs. 512D
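To make the "equidistant points" effect concrete, here is a small simulation (a sketch of mine, not part of the slides) that draws uniform points in 4 and 512 dimensions and measures the relative contrast between the farthest and nearest neighbor of a random query; the contrast collapses as the dimension grows.

```python
import numpy as np

rng = np.random.default_rng(0)
for d in (4, 512):
    data = rng.random((1000, d))          # 1000 points uniform in [0,1]^d
    q = rng.random(d)                     # a random query point
    dist = np.linalg.norm(data - q, axis=1)
    contrast = (dist.max() - dist.min()) / dist.min()
    print(f"d={d:4d}  relative contrast (max-min)/min = {contrast:.3f}")
```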

Are We Doomed?
How does the curse affect classification?
Similar objects tend to cluster together
Dimensionality reduction

Summary of Approaches
Dynamic Partial Function
Restricted estimators: specifying the nature of the local neighborhood (e.g., manifold learning)
Adaptive feature reduction: PCA, LDA

Distribution of Distances

Some Solutions to High-D
Restricted estimators: specifying the nature of the local neighborhood (manifold learning)
Adaptive feature reduction: PCA, LDA
Dynamic Partial Function

Three Major Paradigms
Preserve the data description in a lower-dimensional space: PCA
Maximize discriminability in a lower-dimensional space: LDA
Activate only similar channels: DPF

Minkowski Distance
For objects P and Q with M features: D(P, Q) = (Σ_{i=1}^{M} |p_i - q_i|^n)^{1/n}
Similar images are similar in all M features

[Figure: two histograms of per-feature distance, log-scale frequency (1.0E-06 to 1.0E-01) vs. feature distance (0 to 0.95)]

Weighted Minkowski Distance
D(P, Q) = (Σ_{i=1}^{M} w_i |p_i - q_i|^n)^{1/n}
Similar images are similar in the same subset of the M features
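As a quick illustration, the following sketch implements both distance functions; the feature vectors and weights are made-up examples, and absolute differences are used so the formula is well defined for odd n.

```python
import numpy as np

def minkowski(p, q, n=2):
    # Unweighted Minkowski distance over all M features
    return np.sum(np.abs(p - q) ** n) ** (1.0 / n)

def weighted_minkowski(p, q, w, n=2):
    # Each feature i contributes with weight w_i
    return np.sum(w * np.abs(p - q) ** n) ** (1.0 / n)

p = np.array([0.2, 0.8, 0.5])
q = np.array([0.1, 0.9, 0.7])
w = np.array([0.5, 0.3, 0.2])   # per-feature weights (sum to 1)
print(minkowski(p, q), weighted_minkowski(p, q, w))
```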

[Figure: average per-feature distance (feature number 1-144) under four image transformations: GIF re-encoding, scale up/down, cropping, and rotation]

Similarity Theories
Objects are similar in all respects (Richardson 1928)
Objects are similar in some respects (Tversky 1977)
Similarity is a process of determining respects, rather than using predefined respects (Goldstone 1994)

DPF: Which Place is Similar to Kyoto?
Dynamic Partial Function
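A hedged sketch of a DPF-style distance is shown below: only the m feature channels with the smallest per-feature differences are activated when comparing two objects. The parameter names m and r are illustrative; the exact formulation used in the lecture is given in the course reading (FLSMIMR).

```python
import numpy as np

def dpf(p, q, m, r=2):
    # Dynamic Partial Function sketch: activate only the m channels on
    # which p and q differ the least, then take a Minkowski-style sum.
    diffs = np.abs(p - q)
    active = np.sort(diffs)[:m]           # the m most similar channels
    return np.sum(active ** r) ** (1.0 / r)

rng = np.random.default_rng(1)
p, q = rng.random(144), rng.random(144)   # two 144-dimensional feature vectors
print(dpf(p, q, m=100))
```

Because the activated subset changes from pair to pair, DPF determines the "respects" of similarity dynamically rather than fixing them in advance.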

Precision/Recall

Summary of Approaches
Dynamic Partial Function
Restricted estimators: specifying the nature of the local neighborhood (e.g., manifold learning)
Adaptive feature reduction: PCA, LDA

Manifold Learning Algorithms

                      Auto.NN  KPCA   Princ. curves  SOM  GTM  MDS  ISOMAP  LLE
Explicit manifold     No       No     Yes            No   Yes  No   Yes     Yes
Parametric            Yes      Yes    No             No   Yes  No   No      No
Dissimilarity matrix  No       No(?)  No             No   No   Yes  Yes     No(?)
Local neighborhood    No       No     No(?)          No   No   No   Yes     Yes

Geodesic Distance
Geodesic: the shortest curve on a manifold that connects two points on the manifold
Example: on a sphere, geodesics are great circles
Geodesic distance: the length of the geodesic
(Figure from http://mathworld.wolfram.com/GreatCircle.html)

Geodesic Distance
Euclidean distance need not be a good measure between two points on a manifold; the length of the geodesic is more appropriate
Example: Swiss roll (figure from the LLE paper)

Isometric Feature Mapping (ISOMAP)
Take a distance matrix {g_ij} as input
Estimate the geodesic distance between any two points by a chain of short paths; formulate this as a graph problem
Perform classical scaling on the matrix of geodesic distances to obtain the final projection

Steps to Estimate Geodesic Distances
1. Find the neighbors of all data items z_i. Two possible definitions of neighbors: the set of items whose distances are less than ε, or the K closest items.
2. Construct a weighted undirected graph: vertex i corresponds to z_i; there is an edge between vertices i and j iff z_i and z_j are neighbors, and its weight is g_ij.

Steps to Estimate Geodesic Distances (cont.)
3. Find the shortest distance between all pairs of vertices in the graph, using Floyd-Warshall (O(m^3)) or repeated Dijkstra (O(m^2 log m + mp)). The shortest distance between vertices i and j in the graph is the estimated geodesic distance between z_i and z_j.
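Putting steps 1-3 together with classical scaling, here is a compact ISOMAP sketch; it is my own illustration using NumPy/SciPy, the neighborhood size is an arbitrary default, and it assumes the K-nearest-neighbor graph is connected (otherwise some geodesic estimates are infinite).

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import shortest_path

def isomap(X, n_neighbors=10, n_components=2):
    # Pairwise Euclidean distances between all points
    D = squareform(pdist(X))
    # Steps 1-2: weighted K-nearest-neighbor graph (0 means no edge)
    W = np.zeros_like(D)
    nbrs = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    for i, js in enumerate(nbrs):
        W[i, js] = D[i, js]
    # Step 3: all-pairs shortest paths approximate the geodesic distances
    G = shortest_path(csr_matrix(W), method="D", directed=False)
    # Classical scaling (MDS) on the geodesic distance matrix
    n = G.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * H @ (G ** 2) @ H
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0))
```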

Rationale for the Geodesic Distance Estimation (figures from the ISOMAP paper)

A Run of ISOMAP (figure from http://isomap.stanford.edu/handfig.html)

A Run of ISOMAP (figures from the ISOMAP paper)

Interpolation on Straight Lines in the Projected Coordinates (figures from the ISOMAP paper)

Summary of Approaches
Dynamic Partial Function
Restricted estimators: specifying the nature of the local neighborhood (e.g., manifold learning)
Adaptive feature reduction: PCA, LDA

Two Key Technical Problems
Curse of dimensionality
Modeling subjectivity: query-, user-, and application-dependent

Distance Function?

Group by Proximity

Group by Proximity

      x1   x2   x3   x4   x5   x6   x7   x8
x1    1    .7   .4   .3   .7   .6   .2   .1
x2         1    .4   .3   .6   .7   .3   .2
x3              1    .7   .3   .4   .7   .6
x4                   1    .1   .2   .6   .7
x5                        1    .7   .3   .2
x6                             1    .6   .4
x7                                  1    .7
x8                                       1

Group by Shape

Group by Shape

      x1   x2   x3   x4   x5   x6   x7   x8
x1    1    .7   .7   .7   .2   .2   .2   .2
x2         1    .7   .7   .2   .2   .2   .2
x3              1    .7   .2   .2   .2   .2
x4                   1    .2   .2   .2   .2
x5                        1    .7   .7   .7
x6                             1    .7   .7
x7                                  1    .7
x8                                       1

Group by Color

Group by Color

      x1   x2   x3   x4   x5   x6   x7   x8
x1    1    .7   .3   .3   .3   .2   .2   .7
x2         1    .3   .3   .3   .3   .7   .7
x3              1    .7   .7   .7   .3   .3
x4                   1    .7   .7   .3   .3
x5                        1    .7   .3   .3
x6                             1    .3   .3
x7                                  1    .7
x8                                       1

Naïve Alignment Rules
Increase the scores of similar pairs
Decrease the scores of dissimilar pairs
so that S_ij > D_ij (similar pairs score higher than dissimilar pairs)

Our Work [ACM KDD 2005, ACM MM 2005]
k_ij ← β1 · k_ij if (x_i, x_j) ∈ D
k_ij ← β2 · k_ij + (1 - β2) if (x_i, x_j) ∈ S
with 0 ≤ β1 ≤ β2 ≤ 1
Theorem #1: the resulting matrix is positive semidefinite (PSD)
Theorem #2: the resulting matrix is better aligned with the ideal kernel
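For concreteness, a minimal sketch of this kernel modification is shown below; the pair sets S and D, the β values, and the example kernel are placeholders to be supplied by the application.

```python
import numpy as np

def modify_kernel(K, S, D, beta1=0.5, beta2=0.8):
    """Scale down entries for dissimilar pairs (D) by beta1 and pull entries
    for similar pairs (S) toward 1 via beta2, keeping the matrix symmetric."""
    K = K.copy()
    for i, j in D:                                   # dissimilar pairs
        K[i, j] = K[j, i] = beta1 * K[i, j]
    for i, j in S:                                   # similar pairs
        K[i, j] = K[j, i] = beta2 * K[i, j] + (1 - beta2)
    return K

# Illustrative usage on an 8x8 similarity matrix:
K = np.full((8, 8), 0.5)
np.fill_diagonal(K, 1.0)
K_new = modify_kernel(K, S={(0, 1), (2, 3)}, D={(0, 5)})
```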

Personalization & Scalability
Unsupervised method: clustering, multi-version clustering
Active learning
Reinforcement learning

Pairs (x1, x2), (x3, x4), (x5, x6), (x7, x8) are stable pairs

ULP: Unified Learning Paradigm (Stable Pairs)

      x1   x2   x3   x4   x5   x6   x7   x8
x1    1    .7   .3   .3   .3   .2   .2   .7
x2         1    .3   .3   .3   .3   .7   .7
x3              1    .7   .7   .7   .3   .3
x4                   1    .7   .7   .3   .3
x5                        1    .7   .3   .3
x6                             1    .3   .3
x7                                  1    .7
x8                                       1

ULP
Stable pairs (green circles): found via shot-gun clustering
Selected uncertain pairs (red circles): identified via the maximum-information or fastest-convergence rule
Propagation (green arrows)

ULP [EITC 05]
Input: D = L + U (labeled and unlabeled data)
1. K = CalcInitKernel(D)
2. M = DoClustering(K)
3. [T, Xu] = DoSimilarityReinforce(K, M, L)
4. M = DoActiveLearning(Xu)
5. K = TransformKernel(K, T)
Repeat steps 2-5 until IsConverge(), then output K*
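The control flow above can be summarized in a short skeleton; the helper routines are passed in as callables because their concrete implementations are described in the cited papers, so this only shows the iteration structure and mirrors the boxes in the original flowchart.

```python
def ulp(D, L, calc_init_kernel, do_clustering, do_similarity_reinforce,
        do_active_learning, transform_kernel, is_converged):
    K = calc_init_kernel(D)                       # initial kernel on D = L + U
    while True:
        M = do_clustering(K)                      # cluster with the current kernel
        T, Xu = do_similarity_reinforce(K, M, L)  # stable pairs give transform T, uncertain pairs Xu
        M = do_active_learning(Xu)                # query labels for selected uncertain pairs
        K = transform_kernel(K, T)                # update the kernel
        if is_converged(K):
            return K                              # output K*
```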

Convex Optimization: LP ⊆ QP ⊆ QCQP ⊆ SOCP ⊆ SDP (each problem class is contained in the next)

Learning Similarity from Data: please refer to Chapter 5 of FLSMIMR

Summary
Curse of dimensionality: Dynamic Partial Function, manifold learning, PCA, LDA
Learning distance functions from data: kernel alignment, Unified Learning Paradigm

Reading
Foundations of Large-Scale Multimedia Information Management and Retrieval, E. Y. Chang, Springer, 2011
Chapter #4: Similarity
Chapter #5: Learning Distance Function