Advanced data analysis


1 Advanced data analysis Akisato Kimura ( 木村昭悟 ) NTT Communication Science Laboratories akisato@ieee.org

2 Advanced data analysis 1. Introduction (Aug 20) 2. Dimensionality reduction (Aug 20,21) PCA, LPP, FDA, CCA, PLS 3. Non-linear methods (Aug 27) Kernel trick, kernel PCA Kernel LPP, Laplacian eigenmap, kernel FDA/CCA 4. Clustering (Aug 28) K-means, spectral clustering 5. Generalization (Sep 3) 4

3 Class web page: lass.html. Slides and data will be uploaded on this page.

4 Advanced data analysis 1. Introduction 2. Dimensionality reduction PCA, LPP, FDA, CCA, PLS 3. Non-linear methods Kernel trick, kernel PCA Kernel LPP, Laplacian eigenmap, kernel FDA/CCA 4. Clustering K-means, spectral clustering 5. Generalization 6

5 Curse of dimensionality Data: $\{x_i\}_{i=1}^n$, $x_i \in \mathbb{R}^d$, $d \gg 1$. If your data samples are high-dimensional, they are often too complex to analyze directly. Usual geometric intuition is often applicable only to low-dimensional problems, and such intuition can even be misleading in high-dimensional problems.

6 Curse of dimensionality (cont.) When the dimensionality $d$ increases, the volume of the unit hyper-cube $V_c$ is always 1, while the volume of the inscribed hyper-sphere, $V_s = \frac{\pi^{d/2}}{\Gamma(d/2+1)}\left(\frac{1}{2}\right)^d$ ($= 1$ for $d=1$, $\pi/4$ for $d=2$, $\pi/6$ for $d=3$), goes to 0. The relative size of the hyper-sphere gets small: $V_s / V_c \to 0$ as $d \to \infty$, in contradiction to our geometric intuition.
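The shrinkage of the inscribed hyper-sphere is easy to check numerically. The following is a small Python sketch (not part of the original slides) that evaluates the volume formula above for a few dimensions:

```python
# Relative volume of the hyper-sphere inscribed in the unit hyper-cube.
# A ball of radius r in d dimensions has volume pi^(d/2) r^d / Gamma(d/2 + 1);
# the inscribed ball has r = 1/2, while the unit cube always has volume 1.
from math import pi, gamma

for d in (1, 2, 3, 5, 10, 20):
    v_sphere = pi ** (d / 2) * 0.5 ** d / gamma(d / 2 + 1)
    print(f"d = {d:2d}: V_s / V_c = {v_sphere:.6f}")
```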

7 Curse of dimensionality (cont.) Grid sampling requires an exponentially large number of points: with 5 grid points per axis, $n = 5$ for $d = 1$, $n = 5^2$ for $d = 2$, $n = 5^3$ for $d = 3$, ..., $n = 5^d$ in general. Unless you have an exponentially large number of samples, your high-dimensional samples are never dense.

8 Dimensionality reduction We want to reduce the dimensionality of the data while preserving the intrinsic information in the data. Dimensionality reduction is also called embedding. If the dimension is reduced to 3 or below, it is also called data visualization. The basic assumption (or belief) behind dimensionality reduction: your high-dimensional data is redundant in some sense.

9 Notation: Linear embedding Data samples: $\{x_i\}_{i=1}^n$, $x_i \in \mathbb{R}^d$, $d \gg 1$. Embedding matrix: $B \in \mathbb{R}^{m \times d}$, $1 \le m \le d$. Embedded data samples: $\{z_i\}_{i=1}^n$, $z_i = B x_i \in \mathbb{R}^m$.

10 Advanced data analysis 1. Introduction 2. Dimensionality reduction PCA, LPP, FDA, CCA, PLS 3. Non-linear methods Kernel trick, kernel PCA Kernel LPP, Laplacian eigenmap, kernel FDA/CCA 4. Clustering K-means, spectral clustering 5. Generalization 12

11 Principal component analysis (PCA) Idea: we want to get rid of redundant dimensions of the data samples. This can be achieved by minimizing the distance between the embedded samples $z_i$ and the original samples $x_i$.

12 Data centering We center the data samples by $\bar{x}_i = x_i - \frac{1}{n}\sum_{j=1}^n x_j$, so that $\sum_{i=1}^n \bar{x}_i = 0$. In matrix form, $\bar{X} = X H$, where $X = (x_1\ x_2\ \cdots\ x_n)$, $\bar{X} = (\bar{x}_1\ \bar{x}_2\ \cdots\ \bar{x}_n)$, $H = I_n - \frac{1}{n} 1_{n \times n}$, $I_n$: $n$-dimensional identity matrix, $1_{n \times n}$: $n \times n$ matrix with all ones.
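As an illustration of the centering step, here is a minimal Python sketch (not from the slides) showing that multiplying by the centering matrix $H$ is equivalent to subtracting the sample mean; the $d \times n$ data layout follows the slides' convention.

```python
# Data centering: subtract the sample mean, or equivalently multiply by the
# centering matrix H = I_n - (1/n) 1_{n x n}. X is d x n (one sample per column).
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 5
X = rng.normal(size=(d, n))

H = np.eye(n) - np.ones((n, n)) / n
X_bar = X @ H                                     # matrix form: X H
X_bar_direct = X - X.mean(axis=1, keepdims=True)  # direct mean subtraction

assert np.allclose(X_bar, X_bar_direct)
print(X_bar.sum(axis=1))                          # approximately zero in every dimension
```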

13 Orthogonal projection $\{b_i\}_{i=1}^m$, $b_i \in \mathbb{R}^d$: orthonormal basis of the $m$-dimensional embedding subspace, i.e., $\langle b_i, b_j \rangle = b_i^\top b_j = \delta_{i,j}$, where $\delta_{i,j} = 1$ if $i = j$ and $0$ otherwise. In matrix form, $B = (b_1\ b_2\ \cdots\ b_m)^\top$ and $B B^\top = I_m$. The orthogonal projection of $x_i$ onto the subspace is expressed by $\sum_{j=1}^m \langle b_j, x_i \rangle b_j\ (= B^\top B x_i)$.

14 PCA criterion Minimize the sum of squared distances: $\sum_{i=1}^n \| B^\top B \bar{x}_i - \bar{x}_i \|^2 = -\mathrm{tr}(B C B^\top) + \mathrm{tr}(C)$, where $C = \sum_{i=1}^n \bar{x}_i \bar{x}_i^\top = \bar{X}\bar{X}^\top$ and $\mathrm{tr}(B C B^\top) = \sum_{i=1}^m b_i^\top C b_i$. PCA criterion: $B_{\mathrm{PCA}} = \arg\max_{B \in \mathbb{R}^{m \times d}} \mathrm{tr}(B C B^\top)$ subject to $B B^\top = I_m$.

15 PCA: Summary A PCA solution: $B_{\mathrm{PCA}} = (\psi_1\ \psi_2\ \cdots\ \psi_m)^\top$, where $\{\lambda_i, \psi_i\}_{i=1}^d$ are the sorted eigenvalues and normalized eigenvectors of $C \psi = \lambda \psi$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$ and $\langle \psi_i, \psi_j \rangle = \delta_{i,j}$. PCA embedding of a sample $x$: $z = B_{\mathrm{PCA}} \left( x - \frac{1}{n} X 1_n \right)$ (data centering), where $1_n$ is the $n$-dimensional vector with all ones.
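A minimal PCA sketch along the lines of this summary, assuming a $d \times n$ data matrix with one sample per column; variable names mirror the slides, and the code is an illustration rather than the course's reference implementation.

```python
# Minimal PCA sketch: C = X_bar X_bar^T, the rows of B_PCA are the m leading
# eigenvectors of C, and embedding includes the data centering step.
import numpy as np

def pca(X, m):
    """X: d x n data matrix (one sample per column); m: embedding dimension."""
    mean = X.mean(axis=1, keepdims=True)
    X_bar = X - mean                        # data centering
    C = X_bar @ X_bar.T                     # d x d scatter matrix
    eigvals, eigvecs = np.linalg.eigh(C)    # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:m]     # indices of the m largest eigenvalues
    B = eigvecs[:, idx].T                   # m x d embedding matrix
    return B, mean

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 200))              # 10-dimensional toy data, 200 samples
B_pca, mean = pca(X, m=2)
Z = B_pca @ (X - mean)                      # 2 x 200 embedded samples
print(Z.shape)
```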

16 Proof We first derive necessary conditions for a solution. Lagrangian: $L(B, \Delta) = \mathrm{tr}(B C B^\top) - \mathrm{tr}((B B^\top - I_m)\Delta)$, where $\Delta$ is a symmetric matrix of Lagrange multipliers. Necessary conditions: $\frac{\partial L}{\partial B} = 2 B C - 2 \Delta B = 0$ and $\frac{\partial L}{\partial \Delta} = B B^\top - I_m = 0$, i.e., $C B^\top = B^\top \Delta$ (1) and $B B^\top = I_m$ (2).

17 Proof (cont.) Eigendecomposition of $\Delta$: $\Delta = T \Gamma T^\top$ (3), where $T$ is an orthogonal matrix ($T^{-1} = T^\top$) and $\Gamma$ is a diagonal matrix. Substituting into (1): $C B^\top = B^\top T \Gamma T^\top$ (4), hence $C B^\top T = B^\top T \Gamma$ (5). This is an eigensystem of $C$: $\Gamma = \mathrm{diag}(\lambda_{k_1}, \lambda_{k_2}, \ldots, \lambda_{k_m})$ (6) and $B^\top T = (\psi_{k_1}\ \psi_{k_2}\ \cdots\ \psi_{k_m})$ with $k_i \in \{1, 2, \ldots, d\}$, so $B = T (\psi_{k_1}\ \psi_{k_2}\ \cdots\ \psi_{k_m})^\top$ (7).

18 Proof (cont.) $B B^\top = I_m$ implies $\mathrm{rank}(B) = m$, so all $\{k_i\}_{i=1}^m$ are distinct. Summary of the necessary conditions: (3) $\Delta = T \Gamma T^\top$, (6) $\Gamma = \mathrm{diag}(\lambda_{k_1}, \lambda_{k_2}, \ldots, \lambda_{k_m})$, (7) $B = T (\psi_{k_1}\ \psi_{k_2}\ \cdots\ \psi_{k_m})^\top$, with all $\{k_i\}_{i=1}^m$ distinct.

19 Proof (cont.) Now we choose the best $\{k_i\}_{i=1}^m$ that maximizes the objective function $\mathrm{tr}(B C B^\top)$. Using (2), (4), and (6), $\mathrm{tr}(B C B^\top) = \mathrm{tr}(B B^\top T \Gamma T^\top) = \mathrm{tr}(T \Gamma T^\top) = \mathrm{tr}(\Gamma T^\top T) = \sum_{i=1}^m \lambda_{k_i}$, since $T$ is orthogonal. Because $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$, choosing $k_i = i$ maximizes the objective function, and choosing $T = I_m$ gives $B = (\psi_1\ \psi_2\ \cdots\ \psi_m)^\top$.

20 Pearson correlation Correlation coefficient for paired samples $\{(s_i, t_i)\}_{i=1}^n$: $\rho = \frac{\sum_{i=1}^n (s_i - \bar{s})(t_i - \bar{t})}{\sqrt{\sum_{i=1}^n (s_i - \bar{s})^2} \sqrt{\sum_{i=1}^n (t_i - \bar{t})^2}}$, where $\bar{s} = \frac{1}{n}\sum_{i=1}^n s_i$ and $\bar{t} = \frac{1}{n}\sum_{i=1}^n t_i$. Positively correlated: $\rho > 0$; uncorrelated: $\rho \approx 0$; negatively correlated: $\rho < 0$.
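For concreteness, a short Python sketch (an illustration, not from the slides) of the correlation coefficient:

```python
# Pearson correlation coefficient for paired samples {(s_i, t_i)}_{i=1}^n.
import numpy as np

def pearson(s, t):
    s, t = np.asarray(s, dtype=float), np.asarray(t, dtype=float)
    sc, tc = s - s.mean(), t - t.mean()
    return (sc @ tc) / np.sqrt((sc @ sc) * (tc @ tc))

s = np.array([1.0, 2.0, 3.0, 4.0])
print(pearson(s, 2 * s + 1))   #  1.0: positively correlated
print(pearson(s, -s))          # -1.0: negatively correlated
```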

21 PCA uncorrelates data With $B_{\mathrm{PCA}} = (\psi_1\ \psi_2\ \cdots\ \psi_m)^\top$, the covariance matrix of the PCA-embedded samples is diagonal: $\frac{1}{n}\sum_{i=1}^n z_i z_i^\top = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_m)$ (homework). The elements of $z$ are uncorrelated!

22 Examples Data is well described. PCA is intuitive, easy to implement, analytically computable, and fast. 24

23 Examples (cont.) Iris data (4d->2d) Letter data (16d->2d) Embedded samples seem informative. 25

24 Examples (cont.) However, PCA does not necessarily preserve interesting information such as clusters. 26

25 Homework 1. Implement PCA and reproduce the 2-dimensional examples shown in the class. Datasets 1 and 2 are available from the class web page. (Optional) Test PCA on your own (artificial or real) data and analyze the characteristics of PCA.

26 Homework (cont.) 2. Prove that PCA uncorrelates samples. More specifically, prove that the covariance matrix of the PCA-embedded samples is the following diagonal matrix: $\frac{1}{n}\sum_{i=1}^n z_i z_i^\top = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_m)$, where $B_{\mathrm{PCA}} = (\psi_1\ \psi_2\ \cdots\ \psi_m)^\top$ and $z_i = B_{\mathrm{PCA}} \bar{x}_i$.


28 Advanced data analysis 1. Introduction 2. Dimensionality reduction PCA, LPP, FDA, CCA, PLS 3. Non-linear methods Kernel trick, kernel PCA Kernel LPP, Laplacian eigenmap, kernel FDA/CCA 4. Clustering K-means, spectral clustering 5. Generalization 31

29 Locality preserving projection (LPP) PCA finds a subspace that describes the data well. However, PCA can miss some interesting structures such as clusters. Another idea: Find a subspace that well preserves local structures in the data. 32

30 Similarity matrix Similarity matrix $W$: the more similar $x_i$ and $x_j$ are, the larger $W_{i,j}$ is. Assumptions on $W$: symmetric ($W_{i,j} = W_{j,i}$) and normalized ($0 \le W_{i,j} \le 1$). $W$ is also called the affinity matrix.

31 Examples of similarity matrix Distance-based: $W_{i,j} = \exp(-\|x_i - x_j\|^2 / \gamma^2)$, $\gamma > 0$. Nearest-neighbor-based: $W_{i,j} = 1$ if $x_i$ is a $k$-nearest neighbor of $x_j$ or $x_j$ is a $k$-nearest neighbor of $x_i$; otherwise $W_{i,j} = 0$. A combination of the two is also possible.
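A possible implementation of the two similarity matrices mentioned above (a sketch; the function names and the dense pairwise-distance computation are my own choices, not the course code):

```python
# Two similarity (affinity) matrices: Gaussian distance-based and symmetrized
# k-nearest-neighbor-based. X is d x n as in the slides.
import numpy as np

def gaussian_similarity(X, gamma):
    sq_dist = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # n x n squared distances
    return np.exp(-sq_dist / gamma ** 2)

def knn_similarity(X, k):
    sq_dist = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    n = X.shape[1]
    W = np.zeros((n, n))
    for i in range(n):
        nn = np.argsort(sq_dist[i])[1:k + 1]   # skip the point itself (distance 0)
        W[i, nn] = 1.0
    return np.maximum(W, W.T)   # "x_i is a k-NN of x_j or x_j is a k-NN of x_i"
```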

32 LPP criterion Idea: embed two close points as close, i.e., minimize $\sum_{i,j=1}^n W_{i,j} \| B x_i - B x_j \|^2\ (\ge 0)$. This is expressed as $2\,\mathrm{tr}(B X L X^\top B^\top)$ (homework), where $X = (x_1\ x_2\ \cdots\ x_n)$, $L = D - W$, and $D = \mathrm{diag}\left(\sum_{j=1}^n W_{1,j}, \ldots, \sum_{j=1}^n W_{n,j}\right)$. Since $B = 0$ gives a meaningless solution, we impose $B X D X^\top B^\top = I_m$.

33 LPP: Summary LPP criterion: $B_{\mathrm{LPP}} = \arg\min_{B \in \mathbb{R}^{m \times d}} \mathrm{tr}(B X L X^\top B^\top)$ subject to $B X D X^\top B^\top = I_m$. Solution (homework): $B_{\mathrm{LPP}} = (\psi_d\ \psi_{d-1}\ \cdots\ \psi_{d-m+1})^\top$, where $\{\lambda_i, \psi_i\}_{i=1}^d$ are the sorted generalized eigenvalues and normalized eigenvectors of $X L X^\top \psi = \lambda X D X^\top \psi$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$ and $\langle X D X^\top \psi_i, \psi_j \rangle = \delta_{i,j}$. LPP embedding of a sample $x$: $z = B_{\mathrm{LPP}} x$.
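A compact LPP sketch under the same conventions, assuming a precomputed similarity matrix `W` (for example one of those built above); the small ridge term is an implementation convenience, not part of the slides.

```python
# LPP sketch: keep the generalized eigenvectors of X L X^T psi = lambda X D X^T psi
# that correspond to the smallest eigenvalues. W is a precomputed n x n similarity
# matrix; the ridge only keeps X D X^T numerically positive definite.
import numpy as np
from scipy.linalg import eigh

def lpp(X, W, m, ridge=1e-9):
    """X: d x n data matrix; W: n x n similarity matrix; m: embedding dimension."""
    d = X.shape[0]
    D = np.diag(W.sum(axis=1))
    L = D - W                                # graph Laplacian
    A = X @ L @ X.T
    C = X @ D @ X.T + ridge * np.eye(d)
    eigvals, eigvecs = eigh(A, C)            # generalized eigenvalues, ascending order
    return eigvecs[:, :m].T                  # m x d embedding matrix (smallest eigenvalues)
```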

34 Generalized eigenvalue problem $A \psi = \lambda C \psi$, where $C$ is a positive definite symmetric matrix. Then there exists a positive definite symmetric matrix $C^{1/2}$ such that $(C^{1/2})^2 = C$: from the eigenvalue decomposition $C = \sum_i \gamma_i \phi_i \phi_i^\top$ with $\gamma_i > 0$, we have $C^{1/2} = \sum_i \sqrt{\gamma_i}\, \phi_i \phi_i^\top$.

35 Generalized eigenvalue problem (cont.) $A \psi = \lambda C \psi$. Letting $\phi = C^{1/2} \psi$, we obtain $C^{-1/2} A C^{-1/2} \phi = \lambda \phi$, which is an ordinary eigenproblem. Ordinary eigenvectors are orthonormal: $\langle \phi_i, \phi_j \rangle = \delta_{i,j}$. Generalized eigenvectors are $C$-orthonormal: $\langle C \psi_i, \psi_j \rangle = \delta_{i,j}$.
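This reduction is what standard solvers perform internally; for instance, with SciPy one can check the $C$-orthonormality of the generalized eigenvectors directly (a sketch with random matrices, not from the slides):

```python
# Generalized eigenproblem A psi = lambda C psi with SciPy; the returned
# eigenvectors are C-orthonormal, as stated above.
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
A = M + M.T                      # symmetric
N = rng.normal(size=(4, 4))
C = N @ N.T + 4 * np.eye(4)      # symmetric positive definite

eigvals, Psi = eigh(A, C)        # columns of Psi are generalized eigenvectors
print(np.allclose(Psi.T @ C @ Psi, np.eye(4)))   # True: <C psi_i, psi_j> = delta_ij
```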

36 Examples Blue: PCA, Green: LPP. Note: the similarity matrix is defined by the nearest-neighbor-based method with 50 nearest neighbors. LPP describes the data well and also preserves the cluster structure. LPP is intuitive, easy to implement, analytically computable, and fast.

37 Examples (cont.) Embedding handwritten numerals from 3 to 8 into a 2-dimensional subspace. Each image consists of 16x16 pixels.

38 Examples (cont.) LPP finds (slightly) clearer clusters than PCA.

39 Drawbacks of LPP The obtained results highly depend on the similarity matrix $W$. Appropriately designing the similarity matrix (e.g., choosing $k$, $\gamma$) is not always easy.

40 Local scaling of samples Densities of samples may be locally different (dense regions vs. sparse regions). Using the same $\gamma$ globally in the similarity matrix $W_{i,j} = \exp(-\|x_i - x_j\|^2 / \gamma^2)$, $\gamma > 0$, may not be appropriate.

41 Local scaling heuristic $\gamma_i$: scaling around the sample $x_i$, defined as $\gamma_i = \| x_i - x_i^{(k)} \|$, where $x_i^{(k)}$ is the $k$-th nearest neighbor sample of $x_i$. Local-scaling-based similarity matrix: $W_{i,j} = \exp\left(-\|x_i - x_j\|^2 / (\gamma_i \gamma_j)\right)$. A heuristic choice is $k = 7$. L. Zelnik-Manor & P. Perona, Self-tuning spectral clustering, Advances in Neural Information Processing Systems 17, MIT Press.
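A sketch of the local-scaling similarity matrix (the function name and the dense distance computation are my own; the heuristic itself is the one cited above):

```python
# Local-scaling similarity: gamma_i is the distance from x_i to its k-th nearest
# neighbor, and W_ij = exp(-||x_i - x_j||^2 / (gamma_i * gamma_j)); k = 7 by default.
import numpy as np

def local_scaling_similarity(X, k=7):
    """X: d x n data matrix (n must exceed k)."""
    sq_dist = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)   # n x n
    dist = np.sqrt(sq_dist)
    gamma = np.sort(dist, axis=1)[:, k]   # column 0 is the point itself, so column k is the k-th NN
    return np.exp(-sq_dist / np.outer(gamma, gamma))
```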

42 Graph theory Graph: a set of vertices and edges. Adjacency matrix $W$: $W_{i,j}$ is the number of edges from the $i$-th to the $j$-th vertex. Vertex degree $d_i$: the number of edges connected to the $i$-th vertex.

43 Spectral graph theory Spectral graph theory studies relationships between the properties of a graph and its adjacency matrix. Graph Laplacian $L$: $L_{i,j} = d_i$ if $i = j$, $L_{i,j} = -1$ if $i \ne j$ and $W_{i,j} > 0$, and $L_{i,j} = 0$ otherwise.
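Equivalently, $L = D - W$ for an unweighted graph; a tiny numerical example (not from the slides):

```python
# Graph Laplacian of a small unweighted graph: L = D - W reproduces the
# entry-wise definition above (vertex degree on the diagonal, -1 for connected pairs).
import numpy as np

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
D = np.diag(W.sum(axis=1))   # vertex degrees
L = D - W
print(L)
```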

44 Relation to spectral graph theory Suppose our similarity matrix $W$ is defined based on nearest neighbors, and consider the following graph: each vertex corresponds to a point $x_i$, and an edge exists if $W_{i,j} > 0$. Then $W$ is the adjacency matrix, $D$ is the diagonal matrix of vertex degrees, and $L$ is the graph Laplacian.

45 Homework 1. Prove $\sum_{i,j=1}^n W_{i,j} \| B x_i - B x_j \|^2 = 2\,\mathrm{tr}(B X L X^\top B^\top)$, where $X = (x_1\ x_2\ \cdots\ x_n)$, $L = D - W$, and $D = \mathrm{diag}\left(\sum_{j=1}^n W_{1,j}, \ldots, \sum_{j=1}^n W_{n,j}\right)$.

46 Homework (cont.) 2. Let $B$ be an $m \times d$ matrix ($1 \le m \le d$), let $C$, $D$ be $d \times d$ positive definite symmetric matrices, and let $\{\lambda_i, \psi_i\}_{i=1}^d$ be the sorted generalized eigenvalues and normalized eigenvectors of $C \psi = \lambda D \psi$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$ and $\langle D \psi_i, \psi_j \rangle = \delta_{i,j}$. Then prove that a solution of $B_{\min} = \arg\min_{B \in \mathbb{R}^{m \times d}} \mathrm{tr}(B C B^\top)$ subject to $B D B^\top = I_m$ is given by $B_{\min} = (\psi_d\ \psi_{d-1}\ \cdots\ \psi_{d-m+1})^\top$.

47 Homework (cont.) 3. (Optional) Implement LPP and reproduce the 2-dimensional examples shown in the class. Datasets 1 and 2 are available from the class web page. Test LPP on your own (artificial or real) data and analyze the characteristics of LPP.


49 Advanced data analysis 1. Introduction 2. Dimensionality reduction PCA, LPP, FDA, CCA, PLS 3. Non-linear methods Kernel trick, kernel PCA Kernel LPP, Laplacian eigenmap, kernel FDA/CCA 4. Clustering K-means, spectral clustering 5. Generalization 52

50 Supervised dimensionality reduction The best embedding is unknown in general: one candidate may be better for representing large variances, another better for representing local structures, so which is the best? If every sample has a class label, the best embedding is the one in which samples in different classes are well separated.

51 Supervised dimensionality reduction Samples $\{x_i\}_{i=1}^n$ now have class labels $\{y_i\}_{i=1}^n$: the data is $\{(x_i, y_i)\}_{i=1}^n$, $x_i \in \mathbb{R}^d$, $y_i \in \{1, 2, \ldots, c\}$. We want to obtain an embedding such that samples in different classes are well separated from each other.

52 Within-class scatter matrix Sum of scatters within each class: $S^{(w)} = \sum_{y=1}^c \sum_{i: y_i = y} (x_i - \mu_y)(x_i - \mu_y)^\top$, where $\mu_y = \frac{1}{n_y} \sum_{i: y_i = y} x_i$ is the mean of the samples in class $y$ and $n_y$ is the number of samples in class $y$.

53 Between-class scatter matrix Sum of scatters between classes: $S^{(b)} = \sum_{y=1}^c n_y (\mu_y - \mu)(\mu_y - \mu)^\top$, where $\mu_y = \frac{1}{n_y} \sum_{i: y_i = y} x_i$ is the mean of the samples in class $y$ and $\mu = \frac{1}{n} \sum_{i=1}^n x_i$ is the mean of all samples.
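The two scatter matrices can be computed directly from their definitions; a short sketch under the slides' $d \times n$ data convention (the function name is my own):

```python
# Within-class and between-class scatter matrices, computed from their definitions.
# X: d x n data matrix; y: length-n array of class labels in {1, ..., c}.
import numpy as np

def scatter_matrices(X, y):
    d, n = X.shape
    mu = X.mean(axis=1, keepdims=True)         # mean of all samples
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for label in np.unique(y):
        Xy = X[:, y == label]
        n_y = Xy.shape[1]
        mu_y = Xy.mean(axis=1, keepdims=True)  # class mean
        Sw += (Xy - mu_y) @ (Xy - mu_y).T
        Sb += n_y * (mu_y - mu) @ (mu_y - mu).T
    return Sw, Sb
```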

54 Fisher discriminant analysis (FDA) Idea: minimize the within-class scatter and maximize the between-class scatter by maximizing $\mathrm{tr}\left( (B S^{(w)} B^\top)^{-1} B S^{(b)} B^\top \right)$. To disable arbitrary scaling, we impose $B S^{(w)} B^\top = I_m$. FDA criterion: $B_{\mathrm{FDA}} = \arg\max_{B \in \mathbb{R}^{m \times d}} \mathrm{tr}(B S^{(b)} B^\top)$ subject to $B S^{(w)} B^\top = I_m$.

55 FDA: Summary FDA criterion: $B_{\mathrm{FDA}} = \arg\max_{B \in \mathbb{R}^{m \times d}} \mathrm{tr}(B S^{(b)} B^\top)$ subject to $B S^{(w)} B^\top = I_m$. Solution: $B_{\mathrm{FDA}} = (\psi_1\ \psi_2\ \cdots\ \psi_m)^\top$, where $\{\lambda_i, \psi_i\}_{i=1}^d$ are the sorted generalized eigenvalues and normalized eigenvectors of $S^{(b)} \psi = \lambda S^{(w)} \psi$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$ and $\langle S^{(w)} \psi_i, \psi_j \rangle = \delta_{i,j}$. FDA embedding of a sample $x$: $z = B_{\mathrm{FDA}} x$.
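A minimal FDA sketch built on the scatter matrices above; the ridge term guards against a singular $S^{(w)}$ and is an implementation detail, not part of the slides.

```python
# FDA sketch: rows of B_FDA are the leading generalized eigenvectors of
# S^(b) psi = lambda S^(w) psi; uses scatter_matrices() from the previous sketch.
import numpy as np
from scipy.linalg import eigh

def fda(X, y, m, ridge=1e-9):
    Sw, Sb = scatter_matrices(X, y)
    d = Sw.shape[0]
    eigvals, eigvecs = eigh(Sb, Sw + ridge * np.eye(d))
    idx = np.argsort(eigvals)[::-1][:m]   # largest generalized eigenvalues
    return eigvecs[:, idx].T              # m x d embedding matrix
```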

56 Examples of FDA FDA can find an appropriate subspace. 59

57 Examples of FDA (cont.) However, FDA does not work well if samples in a class have multi-modality. 60

58 Dimensionality of embedding space It holds that $\mathrm{rank}(S^{(b)}) \le c - 1$ (homework). This means that the eigenvalues $\{\lambda_i\}_{i=c}^d$ ($\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$) are always zero. Due to the multiplicity of eigenvalues, the eigenvectors $\{\psi_i\}_{i=c}^d$ can be arbitrarily rotated in the null space of $S^{(b)}$. Thus FDA essentially requires $m \le c - 1$. When $c = 2$, $m$ cannot be larger than 1!

59 Local Fisher discriminant analysis (LFDA) Idea: take the locality of the data into account. 1. Nearby samples in the same class are made close. 2. Samples in different classes are made apart. 3. Far-apart samples in the same class can be ignored. M. Sugiyama: Dimensionality reduction of multimodal labeled data by local Fisher discriminant analysis, JMLR, 8(May).

60 Pairwise expressions of scatters (homework) $S^{(w)} = \frac{1}{2}\sum_{i,j=1}^n Q^{(w)}_{i,j} (x_i - x_j)(x_i - x_j)^\top$, where $Q^{(w)}_{i,j} = 1/n_y$ if $y_i = y_j = y$ and $0$ if $y_i \ne y_j$ (samples in the same class are made close). $S^{(b)} = \frac{1}{2}\sum_{i,j=1}^n Q^{(b)}_{i,j} (x_i - x_j)(x_i - x_j)^\top$, where $Q^{(b)}_{i,j} = 1/n - 1/n_y$ if $y_i = y_j = y$ and $1/n$ if $y_i \ne y_j$ (samples in different classes are made apart).

61 Locality-aware scatters $S^{(lw)} = \frac{1}{2}\sum_{i,j=1}^n Q^{(lw)}_{i,j} (x_i - x_j)(x_i - x_j)^\top$, where $Q^{(lw)}_{i,j} = W_{i,j}/n_y$ if $y_i = y_j = y$ and $0$ if $y_i \ne y_j$ (nearby samples in the same class are made close). $S^{(lb)} = \frac{1}{2}\sum_{i,j=1}^n Q^{(lb)}_{i,j} (x_i - x_j)(x_i - x_j)^\top$, where $Q^{(lb)}_{i,j} = W_{i,j}(1/n - 1/n_y)$ if $y_i = y_j = y$ and $1/n$ if $y_i \ne y_j$ (samples in different classes are made apart). $W_{i,j}$: similarity matrix.

62 LFDA: Summary LFDA criterion: $B_{\mathrm{LFDA}} = \arg\max_{B \in \mathbb{R}^{m \times d}} \mathrm{tr}(B S^{(lb)} B^\top)$ subject to $B S^{(lw)} B^\top = I_m$. Solution: $B_{\mathrm{LFDA}} = (\psi_1\ \psi_2\ \cdots\ \psi_m)^\top$, where $\{\lambda_i, \psi_i\}_{i=1}^d$ are the sorted generalized eigenvalues and normalized eigenvectors of $S^{(lb)} \psi = \lambda S^{(lw)} \psi$, with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$ and $\langle S^{(lw)} \psi_i, \psi_j \rangle = \delta_{i,j}$. LFDA embedding of a sample $x$: $z = B_{\mathrm{LFDA}} x$.
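A compact LFDA sketch combining the pairwise weight matrices from the previous slides with a precomputed similarity matrix `W` (for example the local-scaling one); this is an illustration under those definitions, not the reference implementation accompanying the paper.

```python
# LFDA sketch: build the local weight matrices Q^(lw) and Q^(lb) from a similarity
# matrix W, turn them into scatter matrices via the pairwise expression
# (1/2) sum_ij Q_ij (x_i - x_j)(x_i - x_j)^T = X (diag(Q 1) - Q) X^T (Q symmetric),
# and solve the same generalized eigenproblem as FDA.
import numpy as np
from scipy.linalg import eigh

def lfda(X, y, W, m, ridge=1e-9):
    """X: d x n data; y: length-n labels; W: n x n similarity matrix; m: embedding dim."""
    d, n = X.shape
    Qlw = np.zeros((n, n))
    Qlb = np.full((n, n), 1.0 / n)
    for label in np.unique(y):
        same = np.outer(y == label, y == label)   # pairs within this class
        n_y = (y == label).sum()
        Qlw[same] = W[same] / n_y
        Qlb[same] = W[same] * (1.0 / n - 1.0 / n_y)

    def pairwise_scatter(Q):
        return X @ (np.diag(Q.sum(axis=1)) - Q) @ X.T

    Slw, Slb = pairwise_scatter(Qlw), pairwise_scatter(Qlb)
    eigvals, eigvecs = eigh(Slb, Slw + ridge * np.eye(d))
    idx = np.argsort(eigvals)[::-1][:m]           # largest generalized eigenvalues
    return eigvecs[:, idx].T                      # m x d embedding matrix
```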

63 Examples of LFDA Similarity matrix: nearest-neighbor method with 50 nearest neighbors. LFDA works well even for samples with within-class multi-modality. Since $\mathrm{rank}(S^{(lb)}) \ge c$ in general, $m$ can be larger in LFDA.

64 Examples of FDA/LFDA Thyroid disease data (5-dimensional), representing several statistics obtained from blood tests. Label: healthy or sick. Sickness can be caused by hyper-functioning of the thyroid (working too much) or hypo-functioning of the thyroid (working too little).

65 Projected samples onto 1-d space With FDA, sick and healthy are not separated, and hyper- and hypo-functioning are completely mixed. With LFDA, sick and healthy are nicely separated, and hyper- and hypo-functioning are also well separated.

