Dimension Reduction and Iterative Consensus Clustering
- Johnathan Carr
Dimension Reduction and Iterative Consensus Clustering. Southeastern Clustering and Ranking Workshop, August 24, 2009.
Outline
- Document Clustering: Geometry of the SVD (centered and uncentered SVD), Principal Direction Divisive Partitioning
- Nonnegative Matrix Factorization
- Combination Algorithms; Iterating to Reach Consensus
- Results: Medlars/Cranfield/CISI; Benchmark Data Set by Sinka and Corne
Document Clustering
For document clustering, we create an m x n term-document matrix A whose (i, j) entry f_{ij} is the frequency of term i in document j. Various types of term weighting can be used in place of raw frequencies; for our experiments, we simply normalized the columns. Each column of A gives the coordinates of a document in the m-dimensional "term space", where each standard basis vector represents one term from the dictionary.
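As a minimal sketch of this step (the corpus, terms, and variable names below are invented for illustration and are not from the talk's data sets), building such a matrix and normalizing its columns might look like:

```python
import numpy as np

# Hypothetical miniature corpus, invented for the demo
docs = [
    "svd svd cluster",
    "cluster cluster consensus",
    "svd consensus matrix",
]
terms = sorted({w for d in docs for w in d.split()})  # the dictionary

# A[i, j] = f_ij, the frequency of term i in document j
A = np.zeros((len(terms), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        A[terms.index(w), j] += 1

# Normalize each column, as the talk does in place of fancier term weighting
A = A / np.linalg.norm(A, axis=0)
```

Each column is now a unit vector in term space, so cosine similarity between documents reduces to a dot product.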
Geometry of the SVD
Singular Value Decomposition (SVD): decomposes A = U Σ V^T, where U and V are orthogonal matrices and Σ is a diagonal matrix of singular values. The truncated SVD yields the closest rank-r approximation to A in the 2-norm:

a_j ≈ Σ_{i=1}^{r} [V^T]_{i,j} σ_i u_i

Thus a column v_j of the truncated V^T gives the coordinates of a_j after projection into the lower-dimensional space spanned by the orthogonal basis (σ_1 u_1, σ_2 u_2, ..., σ_r u_r). We'll use the columns of V^T as a lower-dimensional representation of the columns of A for the purposes of clustering.
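A brief numerical check of this identity (the matrix here is a random stand-in, not real term-document data): the rank-r approximation from the truncated SVD and the per-column reconstruction from the coordinates in V^T agree term by term.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((50, 20))   # stand-in for an m x n term-document matrix
r = 3                      # target dimension

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-r approximation: A_r = sum_{i=1}^{r} sigma_i * u_i * [V^T]_{i,:}
A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# Columns of the truncated V^T: r-dimensional document coordinates in the
# basis (sigma_1 u_1, ..., sigma_r u_r)
coords = Vt[:r, :]         # r x n

# Rebuilding column j from its coordinates recovers the rank-r approximation
j = 5
rebuilt = sum(coords[i, j] * s[i] * U[:, i] for i in range(r))
assert np.allclose(rebuilt, A_r[:, j])
```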
Geometry of the singular vectors when A is centered
The first left singular vector, u_1, of the centered matrix C = A − μe^T is the direction along which the variance of the data is maximal. The second left singular vector of C, u_2, is the direction orthogonal to u_1 along which the variance is maximal.
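This maximal-variance property is easy to verify numerically. A sketch on an invented 2-D point cloud (not the talk's data): the variance of the projections onto u_1 of the centered matrix dominates the variance along any other unit direction.

```python
import numpy as np

rng = np.random.default_rng(1)
# Anisotropic 2-D point cloud; columns are data points (invented for the demo)
A = rng.normal(size=(2, 500)) * np.array([[3.0], [0.5]])
mu = A.mean(axis=1, keepdims=True)
C = A - mu                          # centered matrix C = A - mu e^T

U, s, Vt = np.linalg.svd(C, full_matrices=False)
u1 = U[:, 0]

# Projections onto u1 have mean zero (C is centered), and their variance
# is at least the variance along any other unit direction
var_u1 = np.var(u1 @ C)
for _ in range(100):
    d = rng.normal(size=2)
    d /= np.linalg.norm(d)
    assert np.var(d @ C) <= var_u1 + 1e-12
```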
Geometry of the SVD when A is uncentered
The first left singular vector of the uncentered matrix A is the direction of the zero-intercept total least-squares line through the origin, T = α u_1(A), which generally differs from the first singular vector u_1(C) of the centered matrix.
Principal Direction Divisive Partitioning (PDDP)
- Algorithm proposed by Daniel Boley at the University of Minnesota in 2002.
- An iterative process that partitions the data into 2 clusters at each iteration, based on their projection onto the direction of maximal variance.
- PDDP can be adapted to use more than just the principal singular vector.
- We will often use the results from PDDP to seed the k-means algorithm with an initial guess.
Illustration of one iteration of PDDP
The signs of the entries in v_1 tell us whether the projected data fall to the left or right of the origin: documents with v_1(i) < 0 go to one cluster, those with v_1(i) > 0 to the other. (Figure: data cloud projected onto the span of u_1(C).)
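One split of this kind can be sketched in a few lines. This is a hedged illustration of the idea, not Boley's implementation; the function name and the toy two-blob data are invented.

```python
import numpy as np

def pddp_split(A):
    """One PDDP iteration (a sketch): split the columns of A by the sign of
    their projection onto the principal direction of the centered data."""
    C = A - A.mean(axis=1, keepdims=True)
    # C = U S V^T, so row 1 of V^T carries each column's projection onto u1
    # (scaled by 1/sigma_1); only the signs matter for the split.
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v1 = Vt[0, :]
    return np.where(v1 < 0)[0], np.where(v1 >= 0)[0]

# Two well-separated blobs as columns: the split should recover them
rng = np.random.default_rng(2)
A = np.hstack([rng.normal(-5, 0.5, size=(2, 20)),
               rng.normal(+5, 0.5, size=(2, 20))])
left, right = pddp_split(A)
assert {frozenset(left), frozenset(right)} == \
       {frozenset(range(20)), frozenset(range(20, 40))}
```

The full algorithm would recurse, re-splitting whichever cluster has the largest scatter until k clusters are produced.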
Nonnegative Matrix Factorization (NMF)
The nonnegative matrix factorization seeks to decompose a nonnegative matrix into the product of two nonnegative matrices, A_{m x n} = W_{m x r} H_{r x n}. The decomposition is created by solving the nonlinear optimization problem

min ||A − WH||_F^2  such that  W ≥ 0 and H ≥ 0.

The inner dimension of the factorization, r, must be input by the user. The result is an additive, parts-based approximation to each data column a_j in the form of a linear combination of "feature" vectors w_i:

a_j ≈ Σ_{i=1}^{r} h_{i,j} w_i
NMF for Dimension Reduction
The columns of H give the coordinates of each document after projection into the lower-dimensional "feature space" spanned by the columns of W: a document (a_1, ..., a_m) in term space is represented by (h_1, ..., h_r) in feature space. We'll use the columns of H as a lower-dimensional representation of the columns of A for the purposes of clustering.
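The talk does not say which NMF solver was used; as an assumption, here is a sketch using the classic Lee-Seung multiplicative updates, which keep W and H nonnegative by construction. The data matrix below is random and invented for the demo.

```python
import numpy as np

def nmf(A, r, iters=1000, eps=1e-9, seed=0):
    """Lee-Seung multiplicative updates for min ||A - WH||_F^2, W, H >= 0.
    A sketch, not a production solver: random init, fixed iteration count."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)   # update H, stays nonnegative
        W *= (A @ H.T) / (W @ H @ H.T + eps)   # update W, stays nonnegative
    return W, H

rng = np.random.default_rng(3)
A = rng.random((30, 10)) @ rng.random((10, 40))   # nonnegative, rank <= 10
W, H = nmf(A, r=10)
assert W.min() >= 0 and H.min() >= 0
rel_err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
assert rel_err < 0.2   # an exact rank-10 NMF exists, so the fit is close
```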
New Algorithms out of Old
Each clustering algorithm is assembled from a pipeline of choices applied to the data matrix A:
- Dimension reduction: SVD of the centered matrix C or of the uncentered matrix A (keep the first r, or k, columns of V^T), or NMF (keep H).
- Clustering: PDDP, optionally followed by k-means seeded with the PDDP result.
One possible clustering algorithm is un.SVDr.PDDP-kmeans: reduce with the uncentered SVD, then run PDDP-seeded k-means. A clustering algorithm with two dimension reductions is H.SVDk.PDDP-kmeans: apply NMF to get H, take the SVD of the reduced data (first k columns of V^T), then run PDDP-seeded k-means.
Combination Algorithms and Iterating to Reach Consensus
- Since no single algorithm will perform better than all others on a given class of data, we propose using several algorithms to find agreement upon clusters.
- We create an adjacency matrix for each clustering, whose (i, j) entry is 1 if a_i and a_j were clustered together and 0 otherwise.
- We sum the adjacency matrices from the various algorithms to create a consensus matrix M, whose (i, j) entry reveals the number of times a_i and a_j were clustered together.
- Entries in the consensus matrix that fall below a certain tolerance may be set to zero.
- This consensus matrix is then clustered using the same algorithms to see if the algorithms will agree upon a solution.
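The construction above can be sketched directly; the three label vectors below are hypothetical outputs of three clustering algorithms, invented for illustration.

```python
import numpy as np

def adjacency(labels):
    """(i, j) entry is 1 if items i and j share a cluster, 0 otherwise."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(int)

# Hypothetical label vectors from three different clustering algorithms
clusterings = [
    [0, 0, 1, 1, 2],
    [0, 0, 1, 2, 2],
    [1, 1, 0, 0, 0],
]

# Consensus matrix: M[i, j] counts how often i and j were clustered together
M = sum(adjacency(c) for c in clusterings)

# Zero out entries below a tolerance (here: pairs together fewer than 2 times)
tol = 2
M[M < tol] = 0

assert M[0, 1] == 3   # items 0 and 1 agreed in all three clusterings
assert M[2, 4] == 0   # together only once, below the tolerance
```

M can then be fed back to the same algorithms as a similarity matrix for the next round.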
Iterating the Consensus Process
The data matrix is clustered by algorithms 1 through n, producing cluster solutions 1 through n, which are combined into a consensus matrix. If no consensus is reached, the consensus matrix itself is clustered by the same n algorithms, and the process repeats until the algorithms agree.
Medlars/Cranfield/CISI - Medical and Scientific Abstracts (11,000 terms, k = 3 clusters)
Accuracies for Med/Cran/CISI were tabulated after dimension reduction to r = 3 and to r = 15, for each algorithm before consensus and after consensus rounds 1 through 3:
- NMF Basic
- PDDP
- PDDP-kmeans
- SVDr-PDDP-kmeans
- un.SVDr-PDDP-kmeans
- H-PDDP
- H-PDDP-kmeans
- H-SVDk-PDDP-kmeans
- H-un.SVDk-PDDP-kmeans
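Scoring a clustering against labeled benchmark classes requires matching cluster ids to class ids. As an assumption about the metric (the talk does not spell out its scoring), a common choice is accuracy maximized over all label matchings, which is cheap for small k:

```python
from itertools import permutations

def cluster_accuracy(truth, pred, k):
    """Accuracy of a clustering against known labels, maximized over all
    matchings of predicted cluster ids to true class ids (fine for small k)."""
    best = 0.0
    for perm in permutations(range(k)):
        hits = sum(1 for t, p in zip(truth, pred) if t == perm[p])
        best = max(best, hits / len(truth))
    return best

truth = [0, 0, 0, 1, 1, 2, 2, 2]
pred  = [2, 2, 2, 0, 0, 1, 1, 2]   # same grouping, renamed ids, one error
assert cluster_accuracy(truth, pred, k=3) == 0.875
assert cluster_accuracy(truth, truth, k=3) == 1.0
```

For larger k, the Hungarian algorithm replaces the factorial search over permutations.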
Benchmark Data Set Collected by Sinka and Corne
- Documents proposed as a benchmark collection for document clustering.
- Documents pertain to 4 broad topics (banking/finance, programming, science, and sport).
- Each topic contains 2 or 3 subtopics (commercial banks, insurance agencies, Java, astronomy, biology, etc.).
- Documents were extracted automatically from the web: some are long, detailed articles; some are just lists of words, addresses, or links. Noisy data!
Benchmark Data, subset BCFG (k = 4 clusters)
Cluster accuracies for Benchmark BCFG were tabulated after dimension reduction to r = 4 and to r = 10, for each algorithm before consensus and after successive consensus rounds: NMF Basic, PDDP, PDDP-kmeans, SVDr-PDDP-kmeans, un.SVDr-PDDP-kmeans, H-PDDP, H-PDDP-kmeans, H-SVDk-PDDP-kmeans, H-un.SVDk-PDDP-kmeans.
Conclusions
The choices for the size of the dimension reduction, r, and the various combinations of algorithms produce hundreds of clusterings for the consensus approach. Iterative consensus clustering shows potential as a technique for determining a final clustering solution from many different algorithms. Although the final clustering solution is not guaranteed to be optimal, experiments suggest that the technique provides a solution well above the average of the algorithms used.
More informationThe Singular Value Decomposition
The Singular Value Decomposition Philippe B. Laval KSU Fall 2015 Philippe B. Laval (KSU) SVD Fall 2015 1 / 13 Review of Key Concepts We review some key definitions and results about matrices that will
More informationMatrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang
Matrix Factorization & Latent Semantic Analysis Review Yize Li, Lanbo Zhang Overview SVD in Latent Semantic Indexing Non-negative Matrix Factorization Probabilistic Latent Semantic Indexing Vector Space
More informationSingular Value Decomposition
Singular Value Decomposition CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Singular Value Decomposition 1 / 35 Understanding
More informationRegularized Discriminant Analysis and Reduced-Rank LDA
Regularized Discriminant Analysis and Reduced-Rank LDA Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Regularized Discriminant Analysis A compromise between LDA and
More information2. Review of Linear Algebra
2. Review of Linear Algebra ECE 83, Spring 217 In this course we will represent signals as vectors and operators (e.g., filters, transforms, etc) as matrices. This lecture reviews basic concepts from linear
More informationFisher s Linear Discriminant Analysis
Fisher s Linear Discriminant Analysis Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationAssignment #10: Diagonalization of Symmetric Matrices, Quadratic Forms, Optimization, Singular Value Decomposition. Name:
Assignment #10: Diagonalization of Symmetric Matrices, Quadratic Forms, Optimization, Singular Value Decomposition Due date: Friday, May 4, 2018 (1:35pm) Name: Section Number Assignment #10: Diagonalization
More informationProbabilistic Latent Semantic Analysis
Probabilistic Latent Semantic Analysis Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationPrincipal Component Analysis and Singular Value Decomposition. Volker Tresp, Clemens Otte Summer 2014
Principal Component Analysis and Singular Value Decomposition Volker Tresp, Clemens Otte Summer 2014 1 Motivation So far we always argued for a high-dimensional feature space Still, in some cases it makes
More informationCOMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017
COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University PRINCIPAL COMPONENT ANALYSIS DIMENSIONALITY
More informationReview problems for MA 54, Fall 2004.
Review problems for MA 54, Fall 2004. Below are the review problems for the final. They are mostly homework problems, or very similar. If you are comfortable doing these problems, you should be fine on
More informationSingular Value Decomposition. 1 Singular Value Decomposition and the Four Fundamental Subspaces
Singular Value Decomposition This handout is a review of some basic concepts in linear algebra For a detailed introduction, consult a linear algebra text Linear lgebra and its pplications by Gilbert Strang
More informationAutomatic Rank Determination in Projective Nonnegative Matrix Factorization
Automatic Rank Determination in Projective Nonnegative Matrix Factorization Zhirong Yang, Zhanxing Zhu, and Erkki Oja Department of Information and Computer Science Aalto University School of Science and
More informationDesigning Information Devices and Systems II
EECS 16B Fall 2016 Designing Information Devices and Systems II Linear Algebra Notes Introduction In this set of notes, we will derive the linear least squares equation, study the properties symmetric
More informationDS-GA 1002 Lecture notes 10 November 23, Linear models
DS-GA 2 Lecture notes November 23, 2 Linear functions Linear models A linear model encodes the assumption that two quantities are linearly related. Mathematically, this is characterized using linear functions.
More informationMODULE 8 Topics: Null space, range, column space, row space and rank of a matrix
MODULE 8 Topics: Null space, range, column space, row space and rank of a matrix Definition: Let L : V 1 V 2 be a linear operator. The null space N (L) of L is the subspace of V 1 defined by N (L) = {x
More informationSolving Linear Systems
Solving Linear Systems Iterative Solutions Methods Philippe B. Laval KSU Fall 207 Philippe B. Laval (KSU) Linear Systems Fall 207 / 2 Introduction We continue looking how to solve linear systems of the
More informationData Mining and Matrices
Data Mining and Matrices 05 Semi-Discrete Decomposition Rainer Gemulla, Pauli Miettinen May 16, 2013 Outline 1 Hunting the Bump 2 Semi-Discrete Decomposition 3 The Algorithm 4 Applications SDD alone SVD
More informationLecture 5 Singular value decomposition
Lecture 5 Singular value decomposition Weinan E 1,2 and Tiejun Li 2 1 Department of Mathematics, Princeton University, weinan@princeton.edu 2 School of Mathematical Sciences, Peking University, tieli@pku.edu.cn
More informationSTA141C: Big Data & High Performance Statistical Computing
STA141C: Big Data & High Performance Statistical Computing Lecture 5: Numerical Linear Algebra Cho-Jui Hsieh UC Davis April 20, 2017 Linear Algebra Background Vectors A vector has a direction and a magnitude
More informationThe Singular Value Decomposition and Least Squares Problems
The Singular Value Decomposition and Least Squares Problems Tom Lyche Centre of Mathematics for Applications, Department of Informatics, University of Oslo September 27, 2009 Applications of SVD solving
More informationConceptual Questions for Review
Conceptual Questions for Review Chapter 1 1.1 Which vectors are linear combinations of v = (3, 1) and w = (4, 3)? 1.2 Compare the dot product of v = (3, 1) and w = (4, 3) to the product of their lengths.
More informationIntroduction to Machine Learning
10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what
More informationLecture Notes 10: Matrix Factorization
Optimization-based data analysis Fall 207 Lecture Notes 0: Matrix Factorization Low-rank models. Rank- model Consider the problem of modeling a quantity y[i, j] that depends on two indices i and j. To
More information5 Linear Algebra and Inverse Problem
5 Linear Algebra and Inverse Problem 5.1 Introduction Direct problem ( Forward problem) is to find field quantities satisfying Governing equations, Boundary conditions, Initial conditions. The direct problem
More informationCSE 494/598 Lecture-6: Latent Semantic Indexing. **Content adapted from last year s slides
CSE 494/598 Lecture-6: Latent Semantic Indexing LYDIA MANIKONDA HT TP://WWW.PUBLIC.ASU.EDU/~LMANIKON / **Content adapted from last year s slides Announcements Homework-1 and Quiz-1 Project part-2 released
More informationLinear Methods for Regression. Lijun Zhang
Linear Methods for Regression Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Linear Regression Models and Least Squares Subset Selection Shrinkage Methods Methods Using Derived
More informationVector Space Models. wine_spectral.r
Vector Space Models 137 wine_spectral.r Latent Semantic Analysis Problem with words Even a small vocabulary as in wine example is challenging LSA Reduce number of columns of DTM by principal components
More informationStructured matrix factorizations. Example: Eigenfaces
Structured matrix factorizations Example: Eigenfaces An extremely large variety of interesting and important problems in machine learning can be formulated as: Given a matrix, find a matrix and a matrix
More informationData Mining and Matrices
Data Mining and Matrices 08 Boolean Matrix Factorization Rainer Gemulla, Pauli Miettinen June 13, 2013 Outline 1 Warm-Up 2 What is BMF 3 BMF vs. other three-letter abbreviations 4 Binary matrices, tiles,
More informationThe Nyström Extension and Spectral Methods in Learning
Introduction Main Results Simulation Studies Summary The Nyström Extension and Spectral Methods in Learning New bounds and algorithms for high-dimensional data sets Patrick J. Wolfe (joint work with Mohamed-Ali
More informationInverse Singular Value Problems
Chapter 8 Inverse Singular Value Problems IEP versus ISVP Existence question A continuous approach An iterative method for the IEP An iterative method for the ISVP 139 140 Lecture 8 IEP versus ISVP Inverse
More informationData Mining Lecture 4: Covariance, EVD, PCA & SVD
Data Mining Lecture 4: Covariance, EVD, PCA & SVD Jo Houghton ECS Southampton February 25, 2019 1 / 28 Variance and Covariance - Expectation A random variable takes on different values due to chance The
More informationCS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works
CS68: The Modern Algorithmic Toolbox Lecture #8: How PCA Works Tim Roughgarden & Gregory Valiant April 20, 206 Introduction Last lecture introduced the idea of principal components analysis (PCA). The
More information