Subspace Methods for Visual Learning and Recognition


1 Subspace Methods for Visual Learning and Recognition Horst Bischof Inst. f. Computer Graphics and Vision, Graz University of Technology Austria Aleš Leonardis Faculty of Computer and Information Science, University of Ljubljana Slovenia

2 Outline Part 1 Motivation Appearance based learning and recognition Subspace methods for visual object recognition Principal Components Analysis (PCA) Linear Discriminant Analysis (LDA) Canonical Correlation Analysis (CCA) Independent Component Analysis (ICA) Non-negative Matrix Factorization (NMF) Kernel methods for non-linear subspaces 2

3 Outline Part 2 Robot localization Robust representations and recognition Robust PCA recognition Scale invariant recognition using PCA Illumination insensitive recognition Representations for panoramic images Incremental building of eigenspaces Multiple eigenspaces for efficient representation Robust building of eigenspaces Research issues 3

4 The name of the game complex objects/scenes varying pose (3D rotation, scale) cluttered background/foreground occlusions (noise) varying illumination 4

5 Object Representation High-Level Shape Models (e.g., Generalized Cylinders): idealized images, textureless. Mid-Level Shape Models (e.g., CAD models, Superquadrics): more complex, well-defined geometry. Low-level Appearance-Based Models (e.g., Eigenspaces): most complex, complicated shapes. 5

6 Problems Segmentation: Pose/Shape: 6

7 Illumination 7

8 Example 8

9 Learning and recognition 3D reconstruction matching learning matching scene training images input image 9

10 Appearance-based approaches A renewed attention in the appearance-based approaches Encompass combined effects of: shape, reflectance properties, pose in the scene, illumination conditions. Acquired through an automatic learning phase. 10

11 A variety of successful applications: Appearance-based approaches Human face recognition e.g. [Beymer & Poggio, Turk & Pentland] Visual inspection e.g. [Yoshimura & Kanade] Visual positioning and tracking of robot manipulators, e.g. [Nayar & Murase] Tracking e.g., [Black & Jepson] Illumination planning e.g., [Murase & Nayar] Image spotting e.g., [Murase & Nayar] Mobile robot localization e.g., [Jogan & Leonardis] Background modeling e.g., [Oliver, Rosario & Pentland] 11

12 Appearance-based approaches Objects are represented by a large number of views: 12

13 Subspace Methods Images are represented as points in the N-dimensional vector space A set of images populates only a small fraction of the space Characterize the subspace spanned by the images Image set Basis images Representation 13

14 Subspace Methods Properties of the representation: Optimal Reconstruction PCA Optimal Separation LDA Optimal Correlation CCA Independent Factors ICA Non-negative Factors NMF Non-linear Extension Kernel Methods 14

15 Image Matching Normalized images: ρ = x^T y / (||x|| ||y||) > Θ, or ||x − y||^2 < Ψ Compress images 15
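As a minimal illustration of this matching criterion, the following Python sketch (numpy assumed; the function names are mine) computes the normalized correlation between two image vectors and thresholds it:

```python
import numpy as np

def normalized_correlation(x, y):
    """rho = x^T y / (||x|| ||y||) for two image vectors."""
    return float(x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

def is_match(x, y, theta=0.95):
    """Accept a match when the normalized correlation exceeds the threshold Theta."""
    return normalized_correlation(x, y) > theta
```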

16 Eigenspace representation Image set (normalised, zero-mean) We are looking for orthonormal basis functions: Individual image is a linear combination of basis functions x = Σ_{j=1..k} q_j(x) u_j, y = Σ_{j=1..k} q_j(y) u_j, so that x^T y ≈ Σ_{j=1..k} q_j(x) q_j(y) 16

17 Optimisation problem Best basis functions ν? Taking the k eigenvectors with the largest eigenvalues of PCA or Karhunen-Loéve Transform (KLT) 17

18 Efficient eigenspace computation n << m Compute the eigenvectors v'_i, i = 0,...,n−1, of the inner product matrix X^T X The eigenvectors of XX^T can then be obtained by using XX^T (X v'_i) = λ'_i (X v'_i) 18
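A small Python sketch of this trick (numpy assumed; names are illustrative): the eigenvectors of the large scatter matrix XX^T are obtained from the small n x n inner product matrix X^T X and mapped back through X:

```python
import numpy as np

def eigenspace(X, k):
    """Eigenimages of the scatter matrix X X^T computed via the small n x n
    inner product matrix X^T X (n images << m pixels).
    X: m x n matrix of zero-mean image vectors (one image per column)."""
    Q = X.T @ X                                 # inner product matrix
    lam, V = np.linalg.eigh(Q)                  # eigenvectors v_i' of Q
    idx = np.argsort(lam)[::-1][:k]             # keep the k largest eigenvalues
    U = X @ V[:, idx]                           # map back: u_i is proportional to X v_i'
    U /= np.linalg.norm(U, axis=0)              # normalise the eigenimages
    return U, lam[idx]
```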

19 Principal Component Analysis 19

20 Principal Component Analysis (Figure: an input image is decomposed as a linear combination q_1 u_1 + q_2 u_2 + q_3 u_3 + … of eigenimages) 20

21 Image representation with PCA u 1 u 2 u 3 21

22 Image presentation with PCA 22

23 Properties PCA Any point x_i can be projected to an appropriate point q_i by: q_i = U^T (x_i − µ) and conversely (since U^{-1} = U^T): U q_i + µ = x_i 23

24 Properties PCA It can be shown that the mean square error between x_i and its reconstruction using only m principal eigenvectors is given by the expression: Σ_{j=1..N} λ_j − Σ_{j=1..m} λ_j = Σ_{j=m+1..N} λ_j PCA minimizes reconstruction error PCA maximizes variance of projection Finds a more natural coordinate system for the sample data. 24

25 PCA for visual recognition and pose estimation Objects are represented as coordinates in an n-dimensional eigenspace. An example: 3-D space with points representing individual objects or a manifold representing parametric eigenspace (e.g., orientation, pose, illumination). 25

26 PCA for visual recognition and pose estimation Calculate coefficients Search for the nearest point (individual or on the curve) Point determines object and/or pose 26

27 Calculation of coefficients To recover a i the image is projected onto the eigenspace Complete image x i is required to calculate a i. Corresponds to Least-Squares Solution 27

28 Linear Discriminant Analysis (LDA) PCA minimizes projection error Best discriminating Projection PCA-Projection PCA is unsupervised no information on classes is used Discriminating information might be lost 28

29 LDA Linear Discriminant Analysis (LDA) Maximize distance between classes Minimize distance within a class Fisher Linear Discriminant: ρ(w) = (w^T S_B w) / (w^T S_W w) 29

30 LDA: Problem formulation Sample images: {x_1, …, x_n} c classes: {χ_1, …, χ_c} Average of each class: µ_i = (1/n_i) Σ_{x_k ∈ χ_i} x_k Total average: µ = (1/n) Σ_{k=1..n} x_k 30

31 LDA: Practice Scatter of class i: S_i = Σ_{x_k ∈ χ_i} (x_k − µ_i)(x_k − µ_i)^T Within class scatter: S_W = Σ_{i=1..c} S_i Between class scatter: S_B = Σ_{i=1..c} |χ_i| (µ_i − µ)(µ_i − µ)^T Total scatter: S_T = S_W + S_B 31

32 LDA: Practice After projection: y_k = W^T x_k Between class scatter (of the y's): S̃_B = W^T S_B W Within class scatter (of the y's): S̃_W = W^T S_W W 32

33 LDA (Figure: two classes with scatters S_1 and S_2, within-class scatter S_W = S_1 + S_2, and between-class scatter S_B) Good separation 33

34 Maximization of ρ(w) = (w^T S_B w) / (w^T S_W w) LDA is given by the solution of the generalized eigenvalue problem S_B w = λ S_W w For the c-class case we obtain (at most) c−1 projections as the eigenvectors with the largest eigenvalues of S_B w_i = λ_i S_W w_i 34

35 LDA How to calculate LDA for high-dimensional images? Problem: S_W is always singular The number of pixels in each image is larger than the number of images in the training set 1. Fisherfaces: reduce dimension by PCA and then perform LDA 2. Simultaneous diagonalization of S_W and S_B 35

36 LDA Fisherfaces (Belhumeur et al. 1997) Reduce dimensionality to N−c with PCA: U_pca = argmax_U |U^T S_T U| = [u_1 u_2 … u_{N−c}] Further reduce to c−1 with FLD: W_fld = argmax_W |W^T U_pca^T S_B U_pca W| / |W^T U_pca^T S_W U_pca W| = [w_1 w_2 … w_{c−1}] The optimal projection becomes W_opt = W_fld^T U_pca^T 36
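The two-stage Fisherface projection can be sketched as follows in Python (numpy assumed; function and variable names are mine, and numerical details such as the handling of near-singular scatter matrices are simplified):

```python
import numpy as np

def fisherfaces(X, labels, n_classes):
    """Fisherface sketch: PCA down to N - c dimensions, then Fisher LDA down
    to c - 1.  X: m x N image matrix (one image per column), labels: class ids."""
    labels = np.asarray(labels)
    N = X.shape[1]
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu
    # Stage 1: PCA via the inner-product trick, keep N - c eigenimages
    lam, V = np.linalg.eigh(Xc.T @ Xc)
    U_pca = Xc @ V[:, np.argsort(lam)[::-1][:N - n_classes]]
    U_pca /= np.linalg.norm(U_pca, axis=0)
    Y = U_pca.T @ Xc                                   # PCA coefficients, (N - c) x N
    # Stage 2: Fisher LDA in the reduced space
    d = Y.shape[0]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    mu_y = Y.mean(axis=1, keepdims=True)
    for c in range(n_classes):
        Yc = Y[:, labels == c]
        mu_c = Yc.mean(axis=1, keepdims=True)
        Sw += (Yc - mu_c) @ (Yc - mu_c).T
        Sb += Yc.shape[1] * (mu_c - mu_y) @ (mu_c - mu_y).T
    # Generalized eigenproblem S_B w = lambda S_W w; keep the c - 1 leading vectors
    evals, W = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(-np.real(evals))[:n_classes - 1]
    W_fld = np.real(W[:, order])
    return U_pca @ W_fld    # columns span the Fisherface subspace; W_opt is its transpose
```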

37 LDA Example Fisherface for recognition of Glasses/NoGlasses (Belhumeur et al. 1997) 37

38 LDA Example Comparison for face recognition (Belhumeur et al. 1997) Superior performance to PCA for face recognition Noise sensitive Requires a larger training set, more sensitive to different training data [Martinez & Kak 2001] 38

39 Canonical Correlation Analysis (CCA) Also a supervised method, but motivated by regression tasks, e.g. pose estimation. Canonical Correlation Analysis relates two sets of observations by determining pairs of directions that yield maximum correlation between these sets. Find a pair of directions (canonical factors) w_x ∈ R^p, w_y ∈ R^q, so that the correlation of the projections c = w_x^T x and d = w_y^T y becomes maximal. 39

40 What is CCA? Canonical correlation: 0 ≤ ρ ≤ 1 Between-set covariance matrix C_xy ρ = E[cd] / sqrt(E[c^2] E[d^2]) = E[w_x^T x y^T w_y] / sqrt(E[w_x^T x x^T w_x] E[w_y^T y y^T w_y]) = w_x^T C_xy w_y / sqrt((w_x^T C_xx w_x)(w_y^T C_yy w_y)) 40

41 What is CCA? Finding solutions: w = (w_x, w_y)^T, A = [0 C_xy; C_yx 0], B = [C_xx 0; 0 C_yy] Rayleigh quotient: r = (w^T A w) / (w^T B w) Generalized eigenproblem: A w = µ B w 41

42 CCA for images Same problem as for LDA Computationally efficient algorithm based on SVD: C_xx^{-1/2} C_xy C_yy^{-1/2} = U D V^T, w_xi = C_xx^{-1/2} u_i, w_yi = C_yy^{-1/2} v_i 42
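A hedged Python sketch of this SVD-based computation (numpy assumed; the small ridge term added for numerical stability is my choice, not part of the original formulation):

```python
import numpy as np

def cca(X, Y, k):
    """CCA sketch via SVD of the whitened cross-covariance:
    C_xx^(-1/2) C_xy C_yy^(-1/2) = U D V^T, w_xi = C_xx^(-1/2) u_i,
    w_yi = C_yy^(-1/2) v_i.  X: p x n, Y: q x n (one sample per column)."""
    n = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    Cxx = Xc @ Xc.T / n + 1e-8 * np.eye(X.shape[0])   # small ridge for stability
    Cyy = Yc @ Yc.T / n + 1e-8 * np.eye(Y.shape[0])
    Cxy = Xc @ Yc.T / n

    def inv_sqrt(C):
        lam, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(lam)) @ V.T

    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, D, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    return Kx @ U[:, :k], Ky @ Vt.T[:, :k], D[:k]     # w_x, w_y, canonical correlations
```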

43 Properties of CCA At most min(p,q,n) CCA factors Invariance w.r.t. affine transformations Orthogonality of the canonical factors: w_xi^T C_xx w_xj = w_yi^T C_yy w_yj = w_xi^T C_xy w_yj = 0 for i ≠ j 43

44 CCA Example Parametric eigenspace obtained by PCA for 2DoF in pose

45 CCA Example CCA representation (projections of training images onto w x1, w x2 )

46 Independent Component Analysis (ICA) ICA is a powerful technique from signal processing (Blind Source Separation) Can be seen as an extension of PCA PCA takes into account only statistics up to 2 nd order ICA finds components that are statistically independent (or as independent as possible) Local descriptors, sparse coding 46

47 Independent Component Analysis (ICA) m scalar variables X=(x 1... x m ) T They are assumed to be obtained as linear mixtures of n sources S=(s 1... s n ) T X = AS Task: Given X find A, S (under the assumption that S are independent) 47

48 ICA Example Original Sources Mixtures Recovered Sources 48

49 ICA Example ICA basis obtained from 16x16 patches of natural images (Bell&Sejnowski 96) 49

50 ICA Algorithms 1. Minimize (maximize) a function Complex matrix (tensor) functions 2. Adaptive algorithms based on stochastic gradient Measure of independence: non-Gaussianity, e.g. kurtosis, negentropy Fast ICA algorithm (Hyvärinen): 1. W = W / sqrt(||W W^T||) 2. Repeat until convergence: W = 3/2 W − 1/2 W W^T W 50
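The following Python sketch (numpy assumed; names are mine) combines a tanh-based fixed-point update with the symmetric decorrelation iteration shown on the slide; it is a simplified illustration rather than the reference FastICA implementation:

```python
import numpy as np

def fast_ica(X, n_iter=200, seed=0):
    """FastICA sketch: centre and whiten the mixed signals, then alternate a
    tanh-based fixed-point update with the symmetric decorrelation
    W <- 3/2 W - 1/2 W W^T W from the slide.  X: n x m mixed signals (rows)."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=1, keepdims=True)                # centering
    lam, E = np.linalg.eigh(np.cov(X))
    Xw = np.diag(1.0 / np.sqrt(lam + 1e-12)) @ E.T @ X   # whitening
    n, m = Xw.shape
    W = rng.standard_normal((n, n))
    for _ in range(n_iter):
        G = np.tanh(W @ Xw)                              # non-linearity g
        W = (G @ Xw.T - np.diag((1 - G**2).sum(axis=1)) @ W) / m
        W /= np.sqrt(np.linalg.norm(W @ W.T))            # W = W / sqrt(||W W^T||)
        for _ in range(10):                              # W = 3/2 W - 1/2 W W^T W
            W = 1.5 * W - 0.5 * W @ W.T @ W
    return W @ Xw                                        # estimated sources
```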

51 ICA Properties ICA works only for non-Gaussian sources Usually centering and whitening of the data is performed We cannot measure the variance of the components: X = (A P^{-1})(P S) ICA does not provide an ordering ICA components are not orthogonal 51

52 ICA for Noise Suppression Sparse Code Shrinkage (similar to Wavelet Shrinkage Hyvärinen 99) 52

53 Face Recognition using ICA PCA vs. ICA on the FERET DB (Baek et al. 02) PCA ICA 53

54 Non-Negative Matrix Factorization (NMF) How can we obtain a part-based representation? Local representation where parts are added E.g. learn from a set of faces the parts a face consists of, i.e. eyes, nose, mouth, etc. Non-Negative Matrix Factorization (Lee & Seung 1999) leads to a part-based representation 54

55 Matrix Factorization - Constraints V ≈ WH PCA: W are orthonormal basis vectors, W = [w_1, w_2, …, w_n], w_i^T w_j = δ_ij VQ: H are unit vectors, H = [h_1, h_2, …, h_n], h_j = [0,0,1,0,…,0]^T NMF: V, W, H are non-negative, V_ij, W_ij, H_ij ≥ 0 ∀ i,j 55

56 NMF - Cost functions Euclidean distance between A and B: ||A − B||^2 = Σ_ij (A_ij − B_ij)^2 Divergence of A from B (relative entropy): D(A||B) = Σ_ij (A_ij log(A_ij / B_ij) − A_ij + B_ij) 56

57 NMF - update rules ||V − WH||^2 is non-increasing under H_aµ ← H_aµ (W^T V)_aµ / (W^T W H)_aµ, W_ia ← W_ia (V H^T)_ia / (W H H^T)_ia D(V||WH) is non-increasing under H_aµ ← H_aµ (Σ_i W_ia V_iµ / (WH)_iµ) / Σ_k W_ka, W_ia ← W_ia (Σ_µ H_aµ V_iµ / (WH)_iµ) / Σ_ν H_aν We can start with random matrices for W and H and update each matrix iteratively until W and H are at a stationary point - the cost functions are invariant at this point. 57
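A compact Python sketch of the multiplicative updates for the Euclidean cost (numpy assumed; the small epsilon guarding against division by zero is my addition):

```python
import numpy as np

def nmf(V, r, n_iter=500, eps=1e-9, seed=0):
    """Multiplicative updates for the Euclidean cost ||V - WH||^2.
    V: non-negative m x n data matrix, r: number of basis images."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)    # H_au <- H_au (W^T V)_au / (W^T W H)_au
        W *= (V @ H.T) / (W @ H @ H.T + eps)    # W_ia <- W_ia (V H^T)_ia / (W H H^T)_ia
    return W, H
```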

58 Concrete example Handwritten Digits Training data set for learning process 58

59 Learning Find basis images from the training set Training images Basis images 59

60 Reconstruction New image Basis images Encoding X [Non-negative] [Non-negative] Approximation 60

61 Face features Basis images Encoding (Coefficients) Reconstructed image 61

62 Kernel Methods All presented methods are linear Can we generalize to non-linear methods in a computational efficient manner? 62

63 Kernel Methods Kernel Methods are powerful methods (introduced with Support Vector Machines) to generalize linear methods BASIC IDEA: 1. Non-linear mapping of the data into a high-dimensional space 2. Perform a linear method in the high-dimensional space ⇒ a non-linear method in the original space 63

64 Kernel Trick Problem: high-dimensional feature spaces (e.g. for N = 16x16 images and a polynomial mapping) Can we avoid computing the non-linear mapping directly? E.g. polynomial mapping and inner products: Φ(x)·Φ(y) = (x_1^2, √2 x_1 x_2, x_2^2)·(y_1^2, √2 y_1 y_2, y_2^2) = (x·y)^2 = k(x, y) If the algorithm can be specified in terms of dot products and the non-linearity satisfies Mercer's condition we can apply the kernel trick 64

65 Example kernels Gaussian: k(x, y) = exp(−||x − y||^2 / (2σ^2)) Polynomial: k(x, y) = (x·y + c)^d, c ≥ 0 Sigmoid: k(x, y) = σ(κ(x·y) + Θ), κ, Θ ∈ R Nonlinear separation can be achieved. 65

66 Kernel Principal Component Analysis KPCA carries out a linear PCA in the feature space F The extracted features take the nonlinear form f_k(x) = Σ_{i=1..l} α_i^k k(x_i, x) The α_i^k are the components of the k-th eigenvector of the matrix (k(x_i, x_j))_ij 66

67 KPCA and Dot Products Find eigenvectors V and eigenvalues λ of the covariance matrix C = (1/l) Σ_{i=1..l} Φ(x_i) Φ(x_i)^T Again, replace Φ(x)·Φ(y) with k(x, y). 67
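A minimal kernel PCA sketch in Python (numpy assumed; the Gaussian kernel and its width are illustrative choices):

```python
import numpy as np

def kernel_pca(X, k, gamma=1e-3):
    """Kernel PCA sketch with a Gaussian kernel: build and centre the Gram
    matrix (k(x_i, x_j))_ij, take its k leading eigenvectors alpha^k, and
    return the nonlinear features of the training set.
    X: n x d data matrix (one sample per row)."""
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))  # Gram matrix
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                   # centring in feature space
    lam, A = np.linalg.eigh(Kc)
    idx = np.argsort(lam)[::-1][:k]
    alpha = A[:, idx] / np.sqrt(np.maximum(lam[idx], 1e-12))  # normalise eigenvectors
    return Kc @ alpha                  # f_k(x_j) = sum_i alpha_i^k k~(x_i, x_j)
```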

68 Kernel PCA Toy Example Artificial data set from three point sources, 100 point each. 68

69 De-noising in 2-dimensions A half circle and a square in the plane De-noised versions are the solid lines 69

70 Kernel-CCA Reformulation of CCA for finite sample size n X: p × n matrix of training images Y: q × n matrix of pose parameters Â = (1/n) [0 XY^T; YX^T 0], B̂ = (1/n) [XX^T 0; 0 YY^T] 70

71 Kernel-CCA Theorem The component vectors w_x*, w_y* of the extremum points w* of r = (w^T Â w) / (w^T B̂ w) lie in the span of the training data X, Y, i.e., ∃ f, g: w_x* = X f, w_y* = Y g 71

72 Kernel-CCA III Inner product (Gram) matrices: K = X^T X, L = Y^T Y (both n × n) r = [f^T g^T] [0 KL; LK 0] [f; g] / [f^T g^T] [KK 0; 0 LL] [f; g] 72

73 Kernel-CCA Apply non-linear transformations φ(), θ() to the data: φ(X) = ⟨φ(x_1), …, φ(x_n)⟩, θ(Y) = ⟨θ(y_1), …, θ(y_n)⟩ The Kernel Trick: K_ij = φ(x_i)^T φ(x_j) = k_φ(x_i, x_j), L_ij = θ(y_i)^T θ(y_j) = k_θ(y_i, y_j) Inner product in feature space = kernel evaluation in input space 73

74 Experiments Hippo: rotated through 360° (1 DOF) in 2° steps. Linear Y-encoding: y_i = turntable position α_i in degrees. (Figure: CCA factor w_x1 and estimates for y_i) 74

75 Experiments Trigonometric Y-encoding: y_i = <sin(α_i), cos(α_i)> (Figure: CCA factors w_x1, w_x2 and estimates for y_i1, y_i2; the pose is recovered via atan2) 75

76 Experiments Using Kernel-CCA, optimal output features can be found automatically Application of an RBF kernel to the scalar output parameters α_i yielded two factor pairs with a canonical correlation of 1. 76

77 Outline Part 1 Motivation Appearance based learning and recognition Subspace methods for visual object recognition Principal Components Analysis (PCA) Linear Discriminant Analysis (LDA) Canonical Correlation Analysis (CCA) Independent Component Analysis (ICA) Non-negative Matrix Factorization (NMF) Kernel methods for non-linear subspaces 77

78 Outline Part 2 Robot localization Robust representations and recognition Robust recognition using PCA Scale invariant recognition using PCA Illumination insensitive recognition Representations for panoramic images Incremental building of eigenspaces Multiple eigenspaces for efficient representation Robust building of eigenspaces Research issues 78

79 Mobile Robot 79

80 Panoramic image 80

81 Environment map environments are represented by a large number of views localisation = recognition 81

82 Compression with PCA 82

83 Image representation with PCA 83

84 Localisation 84

85 Distance vs. similarity 85

86 Robot localisation Interpolated hyper-surface represents the memorized environment. The parameters to be retrieved are related to position and orientation. Parameters of an input image are obtained by scalar product. 86

87 Localisation 87

88 Enhancing recognition and representations Occlusions, varying background, outliers Robust recognition using PCA Scale variance Multiresolution coefficient estimation Scale invariant recognition using PCA Illumination variations Illumination insensitive recognition Rotated panoramic images Spinning eigenimages Incremental building of eigenspaces Multiple eigenspaces for efficient representations Robust building of eigenspaces 88

89 Occlusions 89

90 Standard recovery of coefficients 90

91 Non-robustness 91

92 Robust method 92

93 Robust algorithm 93

94 Selection 94

95 Robust recovery of coefficients 95

96 Robustness Experimental results 96

97 Recognition and pose estimation 97

98 Robust localisation under occlusions 98

99 Robust localisation at 60% occlusion Standard approach Robust approach 99

100 Mean error of localisation Mean error of localisation with respect to % of occlusion 100

101 Multiresolution coefficient estimation Multiresolution is a well-known technique to reduce computational complexity: search for the solution at the coarsest level and then refine it through finer scales The standard eigenspace method cannot be applied in an ordinary multiresolution way since it relies on the orthogonality of eigenimages. 101

102 Standard multiresolution coefficient estimation Eigenimages in each resolution layer are computed from a set of templates in that layer Computationally costly and requires additional storage space 102

103 Robust multiresolution coefficient estimation Robust method requires only a single set of eigenimages obtained on the finest resolution. Linear system of equations: does not require orthogonality. 103

104 Multiresolution coefficient estimation 104

105 Scaled images 105

106 Scale estimation 106

107 Numerical demonstration 107

108 Multiresolution approach Estimate scale & coefficients simultaneously in the pyramid Efficient search structure 108

109 Experimental results test image 109

110 Experimental results 110

111 Illumination insensitive recognition Recognition of objects under varying illumination global illumination changes highlights shadows Dramatic effects of illumination on the objects' appearance Training set under a single ambient illumination 111

112 Illumination insensitive recognition Our Approach Global eigenspace representation Local gradient based filters Efficient combination of global and local representations Robust coefficient recovery in eigenspaces 112

113 Eigenspaces and filtering 113

114 Filtered eigenspaces 114

115 Gradient-based filters Global illumination Gradient-based filters Steerable filters [Simoncelli] 115

116 Robust coefficient recovery Highlights and shadows Robust coefficient recovery Robust solution of a system of linear equations: x(r_1) = a_1 u_1(r_1) + a_2 u_2(r_1) + … + a_k u_k(r_1), x(r_2) = a_1 u_1(r_2) + a_2 u_2(r_2) + … + a_k u_k(r_2), …, x(r_n) = a_1 u_1(r_n) + a_2 u_2(r_n) + … + a_k u_k(r_n) Hypothesize & Select 116
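A hedged sketch of the hypothesize-and-select idea (numpy assumed; the subset size, the number of hypotheses and the median-error selection rule are my simplifications of the robust procedure):

```python
import numpy as np

def robust_coefficients(x, U, n_hyp=50, subset_size=None, rng=None):
    """Hypothesize-and-select sketch: solve the small linear system on random
    pixel subsets and keep the hypothesis with the smallest robust
    (median absolute) reconstruction error over all pixels.
    x: image vector (m,), U: m x k matrix of eigenimages."""
    rng = np.random.default_rng() if rng is None else rng
    m, k = U.shape
    subset_size = 3 * k if subset_size is None else subset_size
    best_a, best_err = None, np.inf
    for _ in range(n_hyp):
        r = rng.choice(m, size=subset_size, replace=False)   # hypothesis: a pixel subset
        a = np.linalg.lstsq(U[r], x[r], rcond=None)[0]       # coefficients from that subset
        err = np.median(np.abs(U @ a - x))                   # robust error over all pixels
        if err < best_err:
            best_a, best_err = a, err
    return best_a
```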

117 Experimental results Test images Our approach Standard method Demo 117

118 Experimental results (Table: per-object recognition rate (%) and average angular pose error, robust filtered method vs. standard method, all eigenvectors used) 118

119 Illumination invariant localisation Illumination variations and occlusions 119

120 Filtered eigenvectors 120

121 Experimental results Training set: straight path, uniform illumination 121

122 Experimental results Test sets T/1/2/3 without occlusion Test sets 4/5/6/7 with occlusion 122

123 Experimental results Comparison with standard method 123

124 Experimental results Comparison with standard method 124

125 Experimental results Comparison with standard method Coefficient error, test sets 2 and 6 125

126 Experimental results Comparison with standard method Coefficient error, test sets 3 and 7 126

127 Experimental results Average localisation error (in cm). 127

128 Rotated panoramic images 128

129 Unwrapping 129

130 A rotated panoramic image, rotated/shifted n times Inner product matrix Q = X^T X: symmetric, Toeplitz, circulant 130

131 Eigenvectors of a circulant matrix Shift theorem: the eigenvectors of a general circulant matrix are the N basis vectors of the Fourier matrix F = [u_0', u_1', ..., u_{N-1}'], where u_k'(j) = (1/√N) e^{2πi jk/N} The eigenvalues can be calculated simply by retrieving the magnitude of the DFT of one row of Q 131
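A small Python sketch of this property (numpy assumed): the eigenvalues of a symmetric circulant matrix are obtained from the DFT of one of its rows, and the eigenvectors are the Fourier basis vectors:

```python
import numpy as np

def circulant_eigendecomposition(q_row):
    """Eigen-decomposition of a symmetric circulant matrix Q given one of its
    rows: the eigenvalues are the DFT of that row, the eigenvectors are the
    Fourier basis vectors (columns of the DFT matrix)."""
    N = len(q_row)
    eigenvalues = np.fft.fft(q_row)                           # lambda_k from the DFT of one row
    j = np.arange(N)
    F = np.exp(2j * np.pi * np.outer(j, j) / N) / np.sqrt(N)  # Fourier basis vectors u_k'
    return eigenvalues, F
```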

132 From u_i' to u_i The eigenvectors of XX^T can be obtained by using XX^T (X u_i') = λ'_i (X u_i'): the eigenvectors u_i have the same frequency as u_i'; phase and amplitude may change 132

133 Eigenvectors 133

134 Generalisation to several locations 134

135 A set of rotated images P different locations, each shifted n times Every Q ij is circulant (but in general not symmetric!) Is it possible to exploit these properties? It is still possible to compute the eigenvectors without performing the SVD decomposition of A. 135

136 Rotated panoramic images 136

137 Eigenvalue problem Solution of the problem Aw' = µw' matrix blocks Q jl of A are circulant matrices every circulant matrix can be diagonalised in the same basis by Fourier matrix F all the submatrices Q jk have the same set of eigenvectors 137

138 Derivations Aw' = µw' written blockwise: since u_i' is an eigenvector of every Q_jl, λ_i^{jl} is an eigenvalue of Q_jl corresponding to u_i'. 138

139 Derivations This implies a new eigenvalue problem Λ α_i = µ α_i, where Λ is the P × P matrix collecting the block eigenvalues λ_i^{jl} and α_i collects the block coefficients of w'. 139

140 Computational complexity Since Q_jk = Q_kj^T, it can be proved that Λ is Hermitian and we have P linearly independent eigenvectors α_i, which provide P linearly independent eigenvectors w'_i. Since the same procedure can be performed for every u_i', we can obtain N·P linearly independent eigenvectors of A. It is therefore possible to solve the eigen-problem using N decompositions of order P (as opposed to one decomposition of order P·N). 140

141 Complex eigenspace of spinning images Real and imaginary part of one of the vectors: Real and imaginary part of one of the w vectors: 141

142 Eigenvectors 142

143 Energy distribution compressing efficiency of the eigenspace 62 images, each in 50 different orientations 143

144 Timings for building the eigenspace Images of dimensions 40x68, each image was rotated/shifted 68 times, i.e., for 40 locations we got 2720 images. This is also the number of image elements (the border case when the covariance matrix is of the same size as the inner product matrix, and the complexity of the SVD method reaches its upper bound). 144

145 Eigenspace of spinning images K-L expansion of a set of rotated panoramic images SVD on the complete covariance matrix is not necessary Instead, we solve a set of smaller eigen-problems The final eigenvectors are composed of locally varying harmonic functions (analytic functions!) 145

146 Other approaches Images stored in arbitrary orientation Images stored in a reference orientation (e.g. gyrocompass) Autocorrelation FFT power spectra Zero Phase Representation Eigenspace of spinning-images 146

147 Batch computation of PCA i i + 1 k l 147

148 Incremental computation of PCA i i + 1 k 148

149 Incremental computation of PCA Algorithm increase dimensionality discard a dimension preserve dimensionality 149

150 Determining the number of eigenvectors Increase the number of dimensions by one if the distance between the last image and its projection is high. Increase the number of dimensions by one if the cumulative distance between the images and their projections is high. 150

151 Incremental PCA in detail Extend U with the residual h_n of the new image y and rotate by R The rotation matrix R is the solution of the eigenproblem D R = R Λ', where Λ' is the new eigenvalue matrix 151

152 Incremental PCA in detail D is assembled from the current eigenvalues Λ, the new image y and its corresponding coefficients a 152
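A sketch of one such update in Python (numpy assumed; for brevity the data are taken as zero-mean, so the mean-update part of the published algorithm is omitted and the construction of D is a simplified assumption):

```python
import numpy as np

def incremental_pca_update(U, lam, n, y, tol=1e-6):
    """One incremental PCA step (zero-mean data assumed): extend U with the
    normalised residual h of the new image y, assemble the small matrix D from
    the current eigenvalues and the coefficients of y, solve D R = R Lambda',
    and rotate the extended basis by R.
    U: m x k eigenvectors, lam: k eigenvalues, n: images seen so far, y: new image."""
    a = U.T @ y                                    # coefficients of y in the subspace
    h = y - U @ a                                  # residual orthogonal to the subspace
    hn = np.linalg.norm(h)
    if hn > tol:                                   # increase the dimensionality by one
        U_ext = np.hstack([U, (h / hn)[:, None]])
        b = np.append(a, hn)
    else:                                          # preserve the dimensionality
        U_ext, b = U, a
    k = U_ext.shape[1]
    D = np.zeros((k, k))
    D[:len(lam), :len(lam)] = np.diag(lam) * n / (n + 1.0)
    D += np.outer(b, b) / (n + 1.0)
    lam_new, R = np.linalg.eigh(D)                 # eigenproblem D R = R Lambda'
    order = np.argsort(lam_new)[::-1]
    return U_ext @ R[:, order], lam_new[order], n + 1
```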

153 Comparison with batch method 153

154 Localization with incremental method (Plot: matched location of a training image vs. location [m] of the test image) 154

155 Reconstruction error through time 155

156 Multiple Eigenspaces - Motivation A single eigenspace high dimensionality low-dimensional structure of data is ignored poor generalisation Ad-hoc partitioning of the image set is not efficient 156

157 Multiple eigenspaces our goal Systematically construct multiple low-dimensional eigenspaces from a set of training images Each image is described as a linear combination Design a numerically feasible and robust procedure 157

158 Eigenspace growing and selection 158

159 Multiple eigenspaces - experiments 159

160 Eigenspace growing and selection 160

161 Eigenimages of individual eigenspaces 161

162 Mean images of individual eigenspaces 162

163 "Box" images in four eigenspaces 163

164 "Block" images in five eigenspaces 164

165 Multiple eigenspaces 165

166 Robust Subspace Learning Subspace learning from data containing outliers: Detect outliers Learn using only inliers. [D. Skočaj, A. Leonardis, H. Bischof: A robust PCA algorithm for building representations from panoramic images, ECCV 2002] 166

167 EM algorithm for learning E-step: Estimate coefficients A using the given principal vectors U. M-step: Estimate principal vectors U using the given coefficients A. Both steps solve systems of linear equations using only the non-missing pixels; smoothing is applied in the missing pixels. 167

168 Energy function Minimization of reconstruction error in non-missing pixels (the main property of PCA). Smoothing reconstructed values in missing pixels (additional constraint to prevent over-fitting). 168

169 Robust learning algorithm Input: Learning images containing outliers and occlusions. 1. Compute PV using SVD on the whole image set. 2. Detect outliers (pixels with large reconstruction error). 3. Compute PV from inliers using EM algorithm. 4. Repeat until change in outlier set is small. Output: Principal subspace, learning images without outliers, detected outliers and occlusions. 169
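A hedged Python sketch of this loop (numpy assumed; the robust scale estimate and the exact way the E- and M-steps are solved are my simplifications, and the smoothing constraint on missing pixels is omitted):

```python
import numpy as np

def robust_pca(X, k, n_outer=5, n_em=20, kappa=3.0):
    """Sketch of the robust learning loop: compute an initial subspace with SVD,
    detect outlier pixels (large reconstruction error), and re-estimate the
    principal vectors with an EM-style procedure that uses only inlier pixels.
    X: m x n matrix of training images (one image per column), k: subspace dim."""
    m, n = X.shape
    inlier = np.ones_like(X, dtype=bool)
    mu = X.mean(axis=1, keepdims=True)
    U = np.linalg.svd(X - mu, full_matrices=False)[0][:, :k]   # step 1: PV via SVD
    for _ in range(n_outer):
        Xc = X - mu
        A = np.zeros((k, n))
        for _ in range(n_em):
            # E-step: coefficients of each image from its inlier pixels only
            for j in range(n):
                r = inlier[:, j]
                A[:, j] = np.linalg.lstsq(U[r], Xc[r, j], rcond=None)[0]
            # M-step: each pixel row of U from the images where that pixel is an inlier
            Unew = np.vstack([np.linalg.lstsq(A[:, inlier[i]].T, Xc[i, inlier[i]],
                                              rcond=None)[0] for i in range(m)])
            U, Rm = np.linalg.qr(Unew)           # re-orthonormalise the basis
            A = Rm @ A                           # keep the reconstruction U @ A unchanged
        # Steps 2/4: re-detect outliers as pixels with large reconstruction error
        E = np.abs(Xc - U @ A)
        sigma = 1.4826 * np.median(E)            # robust (MAD-like) scale estimate
        inlier = E < kappa * sigma
    return U, A, ~inlier                         # subspace, coefficients, detected outliers
```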

170 Experimental results synthetic data ground truth added outliers standard PCA 2PC standard PCA 8PC robust PCA 8PC 170

171 Experimental results real data input standard PCA robust PCA outliers 171

172 Research issues Comparative studies (e.g., LDA versus PCA, PCA versus ICA) Robust learning of other representations (e.g. LDA, CCA) Integration of robust learning with modular eigenspaces Local versus global subspace representations Combination of subspace representations in a hierarchical framework 172

173 Further readings Recognizing objects by their appearance using eigenimages (SOFSEM 2000, LNCS 1963) Robust recognition using eigenimages (CVIU 2000, Special Issue on Robust Methods in CV) Hierarchical top down enhancement of robust PCA (SSPR 2002) Illumination insensitive eigenspaces (ICCV 2001) Mobile robot localization under varying illumination (ICPR 2002) Eigenspace of spinning images (OMNI 2000, ICPR 2000, ICAR 2001) Incremental building of eigenspaces (ICRA 2002, ICPR 2002) Multiple eigenspaces (Pattern Recognition, In press) Robust building of eigenspaces (ECCV 2002) Generalized canonical correlation analysis (ICANN 2001) 173

This is a shortened version of the tutorial given at ECCV 2002, Copenhagen, and ICPR 2002, Quebec City. Copyright 2002 by Aleš Leonardis, University of Ljubljana, and Horst Bischof, Graz University of Technology.
