Matrix Approximations
Transcription
1 Matrix Approximations
Sanjiv Kumar, Google Research, NY
EECS-6898, Columbia University - Fall, 2010
Sanjiv Kumar 9/4/2010 EECS6898 Large Scale Machine Learning
2 Latent Semantic Indexing (LSI)
Given n documents with words from a vocabulary of size m, discover hidden topics and represent documents semantically.
Process:
- Form a term-document matrix A (m terms x n documents), A_ij = # of term i in the j-th document
- Top k left singular vectors represent topics
- Transform the input query into the k-dim semantic space
- Match against the semantically transformed database items
n ~ O(1B), m ~ O(100K): large sparse/dense matrix
3 Latent Semantic Indexing (LSI)
Given n documents with words from a vocabulary of size m, discover hidden topics and represent documents semantically.
Document representation and retrieval: A (m terms x n documents) ~ U_k Σ_k V_k^T
4 Latent Semantic Indexing (LSI)
Given n documents with words from a vocabulary of size m, discover hidden topics and represent documents semantically.
Document representation and retrieval: A ~ U_k Σ_k V_k^T
- d_j is the j-th column of A; its semantic representation is d̂_j = Σ_k^{-1} U_k^T d_j
- Given a query q, map it the same way: q̂ = Σ_k^{-1} U_k^T q
- Find the nearest neighbors from the database using sim(q̂, d̂_j)
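The LSI fold-in described above can be sketched in a few lines of NumPy. The toy term-document matrix, the query vector, and k = 2 are illustrative choices, not taken from the slides:

```python
import numpy as np

# Hypothetical toy term-document matrix A (m terms x n documents);
# A[i, j] = count of term i in document j.
A = np.array([[2, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 2, 0, 1],
              [0, 0, 1, 2]], dtype=float)

k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk, Sk = U[:, :k], np.diag(s[:k])

# Fold documents and a query into the k-dim semantic space:
# d_hat = Sk^{-1} Uk^T d,  q_hat = Sk^{-1} Uk^T q
D_hat = np.linalg.inv(Sk) @ Uk.T @ A        # k x n document representations
q = np.array([1.0, 0.0, 1.0, 0.0])          # query as a term-count vector
q_hat = np.linalg.inv(Sk) @ Uk.T @ q

# Rank documents by cosine similarity sim(q_hat, d_hat_j)
sims = (D_hat.T @ q_hat) / (np.linalg.norm(D_hat, axis=0) * np.linalg.norm(q_hat))
ranking = np.argsort(-sims)
```

At scale one would replace the dense SVD with a truncated or randomized solver, but the fold-in formulas are the same.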
5 Principal Component Analysis (PCA)
Given n vectors, find the linear reconstruction with minimum mean squared error.
Process:
- Form the centered data matrix (i.e., subtract the mean from each vector), d dims x n vectors
- Get the top k left singular vectors (equivalent to the top eigenvectors of the covariance matrix)
- Project the input data on the singular vectors
n ~ O(1B), d ~ O(100K): large dense matrix
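A minimal sketch of the PCA process above, on synthetic data. The code stores samples as rows (n x d), so the slide's top left singular vectors of the d x n data matrix appear here as the top right singular vectors of X:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))          # n vectors, d dims (toy data)

# 1. Center the data (subtract the mean from each vector)
Xc = X - X.mean(axis=0)

# 2. Top-k singular vectors of the centered matrix are the top
#    eigenvectors of the covariance matrix
k = 3
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:k]                     # principal directions (k x d)

# 3. Project the input data on the singular vectors
Z = Xc @ components.T                   # n x k low-dim representation
```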
6 Kernel Methods
Commonly used for nonlinear extensions in classification, regression, dimensionality reduction, etc.
Kernel Matrix: most kernel methods boil down to manipulation of the kernel matrix K_ij = k(x_i, x_j), a Symmetric Positive Semi-Definite (SPSD) matrix of size n x n.
Get top/bottom eigenvectors (e.g., kernel PCA) or a low-rank approximation (SVM).
n ~ O(1B): very large dense matrix
7 Spectral Clustering
Graph-based clustering method; makes fewer assumptions on the data distribution than parametric models, e.g., GMM. Luxburg [2]
8 Spectral Clustering
Graph-based clustering method; makes fewer assumptions on the data distribution than parametric models, e.g., GMM.
Key Idea: find a (low) k-dim embedding Y such that neighbors remain close:
Ŷ = arg min_Y Σ_{i,j} W_ij ||y_i − y_j||² = arg min_Y tr(Y^T L Y), where L = D − W
Solution: the bottom k eigenvectors of L (ignoring the last, trivial one), assuming the graph is connected.
9 Spectral Clustering
Graph-based clustering method; makes fewer assumptions on the data distribution than parametric models, e.g., GMM.
Procedure:
- Compute the weight matrix W:
  dense: W_ij = exp(−||x_i − x_j||² / σ²), or
  sparse: W_ij = exp(−||x_i − x_j||² / σ²) if i ~ j, 0 otherwise
- Compute the normalized Laplacian, with D_ii = Σ_j W_ij:
  L = I − D^{−1/2} W D^{−1/2} (symmetric), or L = I − D^{−1} W (non-symmetric)
- Do a k-means clustering in the optimal reduced dims, D^{−1/2} U or U, where U holds the bottom k eigenvectors of L (ignoring the last)
n ~ O(1B), d ~ O(100K): large dense or sparse matrix
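The procedure above can be sketched on a toy two-blob dataset. To stay self-contained, the final k-means step is replaced by a simple median split on the second (Fiedler) eigenvector; the blobs, σ, and seed are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy data: two well-separated blobs of 20 points each
X = np.vstack([rng.normal(0, 0.2, (20, 2)), rng.normal(3, 0.2, (20, 2))])
n = len(X)

# 1. Dense weight matrix W_ij = exp(-||x_i - x_j||^2 / sigma^2)
sigma = 1.0
sq = ((X[:, None] - X[None, :]) ** 2).sum(-1)
W = np.exp(-sq / sigma**2)
np.fill_diagonal(W, 0.0)

# 2. Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}
d = W.sum(1)
Dinv = np.diag(1.0 / np.sqrt(d))
L = np.eye(n) - Dinv @ W @ Dinv

# 3. Bottom eigenvectors of L; embed as D^{-1/2} U and split on the
#    second eigenvector (stand-in for k-means with k = 2)
vals, vecs = np.linalg.eigh(L)
embedding = Dinv @ vecs[:, :2]
labels = (embedding[:, 1] > np.median(embedding[:, 1])).astype(int)
```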
10 Two scenarios
Dense Matrices
1. Number of elements O(n²)
2. Matrix-vector products O(n²)
3. Spectrum usually falls exponentially for most real-world data: most energy is contained in the top few eigenvalues, so a low-rank approximation gives reasonable results
4. Commonly arise in kernel methods, regression, manifold learning, second-order optimization (Hessian)
Sparse Matrices
1. Number of elements O(n)
2. Very fast matrix-vector products O(n)
3. Spectrum is usually flatter (energy falls more slowly)
4. Commonly arise in spectral clustering, manifold learning, semi-supervised learning
11 Low-rank Approximation
Given a matrix A of rank k < min(m, n): A (m x n) = B (m x k) C (k x n)
12 Low-rank Approximation
Given a matrix A of rank k < min(m, n): A (m x n) = B (m x k) C (k x n)
Advantages:
- Commonly used to discover structure in data, e.g., Latent Semantic Indexing (LSI) for discovering topics in documents, or recommendation systems for predicting user preferences
- Storage reduces to only O((m + n)k) instead of O(mn)
- A matrix-vector product requires only O((m + n)k) instead of O(mn)
13 Singular Value Decomposition (SVD)
A [m x n] = U [m x n] Σ [n x n] V^T [n x n]
14 Singular Value Decomposition (SVD)
wlog suppose m > n: A [m x n] = U [m x n] Σ [n x n] V^T [n x n]
Left singular vectors: U^T U = I; the columns of U are eigenvectors of A A^T
15 Singular Value Decomposition (SVD)
wlog suppose m > n: A [m x n] = U [m x n] Σ [n x n] V^T [n x n]
Left singular vectors: U^T U = I; the columns of U are eigenvectors of A A^T
Right singular vectors: V^T V = I; the columns of V are eigenvectors of A^T A
16 Singular Value Decomposition (SVD)
wlog suppose m > n: A [m x n] = U [m x n] Σ [n x n] V^T [n x n]
Left singular vectors: U^T U = I; the columns of U are eigenvectors of A A^T
Right singular vectors: V^T V = I; the columns of V are eigenvectors of A^T A
Σ = diag(σ_1, …, σ_n), σ_1 ≥ … ≥ σ_n ≥ 0; the σ_i² are eigenvalues of A^T A or A A^T
17 Low-rank Approximation
Frobenius norm: ||A||_F² = Σ_{i,j} A_ij² = tr(A^T A) = tr(A A^T) = Σ_i σ_i²
Spectral norm: ||A||_2 = max_{x≠0} ||Ax|| / ||x|| = σ_1
||A||_2 ≤ ||A||_F ≤ √n ||A||_2
18 Low-rank Approximation
Frobenius norm: ||A||_F² = Σ_{i,j} A_ij² = tr(A^T A) = tr(A A^T) = Σ_i σ_i²
Spectral norm: ||A||_2 = max_{x≠0} ||Ax|| / ||x|| = σ_1
The best rank-k approximation in either norm is the truncated SVD:
A_k = arg min_{D: rank(D)=k} ||A − D||_{F,2} = U_k Σ_k V_k^T, with U_k (m x k), V_k (n x k)
19 Low-rank Approximation
A_k = arg min_{D: rank(D)=k} ||A − D||_{F,2} = U_k Σ_k V_k^T, with U_k (m x k), V_k (n x k)
Removing small entries of Σ has a noise-removal interpretation, e.g., PCA.
Computational complexity: full SVD O(mn·min{m, n}); rank-k SVD O(mnk)
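The optimality of the truncated SVD (Eckart-Young) is easy to check numerically: the spectral error equals σ_{k+1} and the squared Frobenius error equals the sum of the discarded σ_i². The random test matrix is illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(50, 30))

def best_rank_k(A, k):
    """Best rank-k approximation A_k = U_k Sigma_k V_k^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k]

k = 5
Ak = best_rank_k(A, k)
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Optimal errors predicted by the theory:
#   ||A - A_k||_2 = sigma_{k+1},  ||A - A_k||_F^2 = sum_{i>k} sigma_i^2
spec_err = np.linalg.norm(A - Ak, 2)
frob_err = np.linalg.norm(A - Ak, 'fro')
```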
20 Matrix Projection
A = U Σ V^T; A_k = U_k U_k^T A = A V_k V_k^T
21 Matrix Projection
A = U Σ V^T; A_k = U_k U_k^T A = A V_k V_k^T: projection of A on the space spanned by the columns of U_k
22 Matrix Projection
A = U Σ V^T; A_k = U_k U_k^T A = A V_k V_k^T: projection of A on the space spanned by the columns of U_k.
For an approximate decomposition A ~ Ũ Σ̃ Ṽ^T, low-rank approximation using SVD (Ũ_k Σ̃_k Ṽ_k^T) and matrix projection (Ũ_k Ũ_k^T A) give different results!
23 Power Method
To compute the largest eigenvalue/eigenvector of a matrix A.
Assume A is a symmetric matrix, i.e., it is diagonalizable with A = U Λ U^T, Λ = diag(λ_1, λ_2, …, λ_n), λ_1 ≥ λ_2 ≥ … ≥ λ_n
24 Power Method
To compute the largest eigenvalue/eigenvector of a matrix A.
Assume A is a symmetric matrix, i.e., it is diagonalizable with A = U Λ U^T, λ_1 ≥ λ_2 ≥ … ≥ λ_n
Power iteration:
1. Start with a random vector q_0 ∈ R^n
2. For k = 1, 2, …, K: z_k = A q_{k−1}; normalize q_k = z_k / ||z_k|| (repeated matrix-vector products)
3. Compute the eigenvalue λ = q_K^T A q_K
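The three steps above in NumPy, on a random symmetric test matrix (size, seed, and iteration count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.normal(size=(20, 20))
A = B + B.T                              # symmetric test matrix

q = rng.normal(size=20)                  # 1. random start q_0
q /= np.linalg.norm(q)
for _ in range(1000):                    # 2. repeated matrix-vector products
    z = A @ q                            #    z_k = A q_{k-1}
    q = z / np.linalg.norm(z)            #    q_k = z_k / ||z_k||
lam = q @ A @ q                          # 3. Rayleigh-quotient eigenvalue
```

The iterate converges to the eigenvector of the eigenvalue largest in magnitude, at a rate governed by the gap |λ_2/λ_1| (next slide).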
25 Convergence properties
Suppose q_0 = a_1 u_1 + a_2 u_2 + … + a_n u_n, since the columns of U make a basis in R^n.
Then A^k q_0 = a_1 λ_1^k u_1 + a_2 λ_2^k u_2 + … + a_n λ_n^k u_n, since A u_i = λ_i u_i implies A^k u_i = λ_i^k u_i.
26 Convergence properties
Suppose q_0 = a_1 u_1 + … + a_n u_n (the columns of U make a basis in R^n).
A^k q_0 = a_1 λ_1^k u_1 + … + a_n λ_n^k u_n = λ_1^k (a_1 u_1 + Σ_{j=2}^n a_j (λ_j/λ_1)^k u_j) → a_1 λ_1^k u_1 if a_1 ≠ 0
dist(span(q_k), span(u_1)) = O((λ_2/λ_1)^k)
27 Convergence properties
A^k q_0 = λ_1^k (a_1 u_1 + Σ_{j=2}^n a_j (λ_j/λ_1)^k u_j) → a_1 λ_1^k u_1 if a_1 ≠ 0, so dist(span(q_k), span(u_1)) = O((λ_2/λ_1)^k).
A bigger gap leads to faster convergence; a_1 ≠ 0 is easy to satisfy by random initialization.
Can be generalized to multiple eigenvectors simultaneously by Orthogonal Iteration via QR-decomposition!
28 PageRank
Which document or link is more popular, independent of the query? [Wikipedia]
Random walk over pages: if one randomly clicks the links, what is the probability that one will be at page B?
29 PageRank
Stationary probability distribution over pages.
Adjacency matrix A [n x n]: A_ij = 1 if i → j, 0 otherwise
Transition matrix T [n x n]: T_ij = p(j | i) if i → j, 0 otherwise
The transition matrix is stochastic, i.e., rows sum to 1.
30 PageRank
Stationary probability distribution over pages.
Adjacency matrix A [n x n]: A_ij = 1 if i → j, 0 otherwise
Transition matrix T [n x n]: T_ij = p(j | i) if i → j, 0 otherwise; rows sum to 1
The stationary probability distribution is given by the eigenvector corresponding to the largest eigenvalue (i.e., 1) of T^T: T^T u = u
31 PageRank
The stationary probability distribution is given by the eigenvector corresponding to the largest eigenvalue (i.e., 1) of T^T: T^T u = u.
Power iteration: u_{k+1} = T^T u_k, starting from u_0.
Damping is used to make sure the graph is connected: u ← α T^T u + (1 − α)(1/n) 1, where 1 is the n-vector of 1s.
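The damped power iteration above on a tiny hypothetical link graph (the graph and α = 0.85 are illustrative; no page here has zero out-degree, so the row normalization is well defined):

```python
import numpy as np

# Tiny hypothetical link graph: A[i, j] = 1 if page i links to page j
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
n = A.shape[0]

# Row-stochastic transition matrix T_ij = p(j | i)
T = A / A.sum(axis=1, keepdims=True)

# Damped power iteration: u <- alpha T^T u + (1 - alpha) (1/n) 1
alpha = 0.85
u = np.full(n, 1.0 / n)
for _ in range(200):
    u = alpha * (T.T @ u) + (1 - alpha) / n
```

Each update costs one (sparse) matrix-vector product, and the damping factor α bounds the contraction rate, so convergence is geometric.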
32 Lanczos Method
A technique to solve large sparse symmetric eigenproblems. Useful when only the top or bottom few eigenvalues/vectors are needed. Generates a sequence of tridiagonal matrices whose extremal eigenvalues are progressively better estimates of the desired values.
33 Lanczos Method
A technique to solve large sparse symmetric eigenproblems; generates a sequence of tridiagonal matrices whose extremal eigenvalues are progressively better estimates of the desired values.
Krylov Subspaces: K(A, q_1, k) = span{q_1, A q_1, …, A^{k−1} q_1}
34 Lanczos Method
Krylov Subspaces: K(A, q_1, k) = span{q_1, A q_1, …, A^{k−1} q_1}
Key Idea: successively generate a new orthonormal vector q_{k+1} such that span{q_1, q_2, …, q_{k+1}} = span{q_1, A q_1, …, A^k q_1}. How to find q_{k+1}?
35 Lanczos Method
Why tridiagonalization? For any symmetric matrix A: Q^T A Q = T, where Q is an n x n orthogonal matrix and T is a tridiagonal matrix with diagonal entries α_1, …, α_n and off-diagonal entries β_1, …, β_{n−1}.
36 Lanczos Method
Why tridiagonalization? For any symmetric A: Q^T A Q = T, with Q orthogonal and T tridiagonal.
Let Q = [q_1 q_2 … q_n]; then q_1 = Q e_1, where e_1 = [1 0 … 0]^T.
37 Lanczos Method
Why tridiagonalization? For any symmetric A: Q^T A Q = T, with Q orthogonal and T tridiagonal; q_1 = Q e_1.
Krylov matrix: K(A, q_1, n) = [q_1 A q_1 … A^{n−1} q_1] = Q [e_1 T e_1 T² e_1 … T^{n−1} e_1]
The columns of Q form an orthogonal basis for the Krylov matrix!
38 Lanczos Method
Construct partial Q and partial T iteratively from A Q = Q T. Focusing on the k-th column of Q:
A q_k = β_{k−1} q_{k−1} + α_k q_k + β_k q_{k+1}, so r_k = (A − α_k I) q_k − β_{k−1} q_{k−1}
and q_{k+1} = r_k / β_k, where β_k = ||r_k||, if r_k ≠ 0.
39 Lanczos Method
From A q_k = β_{k−1} q_{k−1} + α_k q_k + β_k q_{k+1}, premultiplying by q_k^T and using orthonormality of the q's gives α_k = q_k^T A q_k.
Algorithm: r_0 = q_1, β_0 = 1, k = 0; while β_k ≠ 0:
  q_{k+1} = r_k / β_k
  k = k + 1
  α_k = q_k^T A q_k
  r_k = (A − α_k I) q_k − β_{k−1} q_{k−1}   (needs one matrix-vector product)
  β_k = ||r_k||
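The recurrence above can be sketched as follows. The extra reorthogonalization line is one standard way to handle the rounding-error issue mentioned later; the test matrix and k = 30 are illustrative:

```python
import numpy as np

def lanczos(A, q1, k):
    """Run k Lanczos steps; returns Q_k (n x k) and tridiagonal T_k."""
    n = len(q1)
    Q = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k)
    q = q1 / np.linalg.norm(q1)
    q_prev = np.zeros(n)
    for j in range(k):
        Q[:, j] = q
        z = A @ q                             # the only matrix-vector product
        alpha[j] = q @ z                      # alpha_k = q_k^T A q_k
        z -= alpha[j] * q                     # r_k = (A - alpha_k I) q_k
        if j > 0:
            z -= beta[j - 1] * q_prev         #       - beta_{k-1} q_{k-1}
        z -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ z)  # full reorthogonalization
        beta[j] = np.linalg.norm(z)
        if beta[j] < 1e-12:                   # breakdown: invariant subspace found
            break
        q_prev, q = q, z / beta[j]
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    return Q, T

rng = np.random.default_rng(4)
B = rng.normal(size=(100, 100))
A = B + B.T                                   # symmetric test matrix
Q, T = lanczos(A, rng.normal(size=100), 30)
theta = np.linalg.eigvalsh(T)                 # Ritz values (ascending)
```

The extremal Ritz values of the small tridiagonal T_k already approximate the extremal eigenvalues of A well for k much smaller than n.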
40 Lanczos Method
When to stop? Ideally until k = rank of the Krylov matrix; usually too long, so a threshold is used on the residual: A Q_k = Q_k T_k + r_k e_k^T.
Decompose T_k = S_k Θ_k S_k^T, with Θ_k = diag(θ_1, …, θ_k).
41 Lanczos Method
A Q_k = Q_k T_k + r_k e_k^T; decompose T_k = S_k Θ_k S_k^T, Θ_k = diag(θ_1, …, θ_k).
θ_1, …, θ_k are the estimated eigenvalues of A; Q_k S_k are the estimated eigenvectors of A. This decomposition is cheap, since T_k is a small tridiagonal matrix.
42 Lanczos Method
Accuracy: min_{μ ∈ λ(A)} |θ_i − μ| ≤ ||r_k|| |s_{ki}|
Convergence (Kaniel-Paige theory): λ_1 ≥ θ_1 ≥ λ_1 − (λ_1 − λ_n) tan²(φ_1) / c_{k−1}²(1 + 2ρ_1),
where cos(φ_1) = |q_1^T u_1| with u_1 the top eigenvector of A, c_{k−1} is the Chebyshev polynomial of degree k−1, and ρ_1 = (λ_1 − λ_2) / (λ_2 − λ_n).
43 Lanczos Method
Kaniel-Paige theory: λ_1 ≥ θ_1 ≥ λ_1 − (λ_1 − λ_n) tan²(φ_1) / c_{k−1}²(1 + 2ρ_1)
Lanczos iterations converge faster than power iteration!
Practical implementation: avoid rounding errors (e.g., by reorthogonalization).
44 Arnoldi's Method
A technique to solve large sparse asymmetric eigenproblems. Generates a sequence of Hessenberg matrices whose extremal eigenvalues are progressively better estimates.
Q^T A Q = H, A Q = Q H, with H a full upper Hessenberg matrix; the k-th column gives
q_{k+1} h_{k+1,k} = A q_k − Σ_{i=1}^k h_{ik} q_i = r_k, with h_{ik} = q_i^T A q_k.
45 Arnoldi's Method
If h_{k+1,k} ≠ 0: q_{k+1} = r_k / h_{k+1,k}, where h_{k+1,k} = ||r_k||.
After k steps: A Q_k = Q_k H_k + r_k e_k^T. From H_k y = λ y we get (A − λI) Q_k y = (e_k^T y) r_k, giving a Ritz value λ and Ritz vector Q_k y.
In practice, applied with multiple restarts!
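A minimal sketch of the Arnoldi recurrence above, using modified Gram-Schmidt for the projections; the asymmetric random matrix and k = 40 are illustrative:

```python
import numpy as np

def arnoldi(A, q1, k):
    """Run k Arnoldi steps: A Q_k ~ Q_k H_k with H_k upper Hessenberg."""
    n = len(q1)
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for j in range(k):
        r = A @ Q[:, j]                      # one matrix-vector product
        for i in range(j + 1):               # h_ij = q_i^T A q_j (MGS)
            H[i, j] = Q[:, i] @ r
            r = r - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(r)
        if H[j + 1, j] < 1e-12:              # breakdown
            break
        Q[:, j + 1] = r / H[j + 1, j]
    return Q[:, :k], H[:k, :k]

rng = np.random.default_rng(5)
A = rng.normal(size=(80, 80))                # asymmetric matrix
Q, H = arnoldi(A, rng.normal(size=80), 40)
ritz = np.linalg.eigvals(H)                  # Ritz values (possibly complex)
```

Unlike Lanczos, each new vector must be orthogonalized against all previous ones, which is why restarted variants are used in practice.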
46 Randomized SVD and low-rank approximation
Basic problem: given an m x n matrix A and an integer l = k + p (p is oversampling), find an m x l orthonormal matrix Q such that A ~ Q Q^T A. The columns of Q form an orthonormal basis for the (approximate) range of A; l << m, n.
47 Randomized SVD and low-rank approximation
Given Q with A ~ Q Q^T A, how to do SVD?
1. Form a small (l x n) matrix B = Q^T A — O(mnl)
2. Compute the SVD of B: B = Û Σ V^T — O(nl²)
3. Set U = Q Û — O(ml²)
Of course, the best Q is Q = U_l, but that is expensive! How to build an approximate Q?
48 Randomized method
Basic problem: given an m x n matrix A and an integer l, find an m x l orthonormal matrix Q such that A ~ Q Q^T A.
Procedure:
1. Draw l random vectors w_1, w_2, …, w_l ∈ R^n from P(w)
2. Form the projected sample vectors y_1 = A w_1, …, y_l = A w_l ∈ R^m
3. Form orthonormal vectors q_1, q_2, …, q_l ∈ R^m with span(q_1, …, q_l) = span(y_1, …, y_l), via Gram-Schmidt or QR
49 Randomized method
Matrix version:
1. Construct a random matrix Ω_l of size n x l — O(nl)
2. Form the m x l sample matrix Y_l = A Ω_l — O(mnl), the most expensive step! Use a randomized FFT for O(mn log(l))
3. Form an m x l orthonormal matrix Q_l such that Y_l = Q_l Q_l^T Y_l — O(ml²)
50 Randomized method
Matrix version: random Ω_l (n x l); sample matrix Y_l = A Ω_l; orthonormal Q_l with Y_l = Q_l Q_l^T Y_l.
Error in approximation: ||A − Q_l Q_l^T A|| ≥ σ_{l+1}, and
||A − Q_l Q_l^T A|| ≤ [1 + 11 √((k + p) · min(m, n))] σ_{k+1} with probability 1 − 6 p^{−p}
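The three matrix-version steps, followed by the small-SVD recipe from slide 47, can be sketched as below. The synthetic test matrix with exponentially decaying spectrum and the choices k = p = 10 are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
# Synthetic matrix with a rapidly decaying spectrum
m, n, k, p = 200, 120, 10, 10
U0, _ = np.linalg.qr(rng.normal(size=(m, n)))
V0, _ = np.linalg.qr(rng.normal(size=(n, n)))
s_true = 2.0 ** -np.arange(n)
A = U0 @ np.diag(s_true) @ V0.T

l = k + p                               # target rank + oversampling
Omega = rng.normal(size=(n, l))         # 1. random test matrix, O(nl)
Y = A @ Omega                           # 2. sample matrix, O(mnl)
Q, _ = np.linalg.qr(Y)                  # 3. orthonormal basis for range, O(ml^2)

# Small SVD: B = Q^T A = U_hat S V^T, then U = Q U_hat
B = Q.T @ A
U_hat, S, Vt = np.linalg.svd(B, full_matrices=False)
U = Q @ U_hat

err = np.linalg.norm(A - Q @ (Q.T @ A), 2)
```

Because the spectrum decays fast, the measured spectral error sits near σ_{l+1}, far below the worst-case bound involving σ_{k+1}.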
51 Matrices with slowly decaying spectrum
Basic idea: incorporate Krylov-space type power iteration: B = (A A^T)^q A has the same singular vectors as A but fast-decaying singular values.
52 Matrices with slowly decaying spectrum
Basic idea: B = (A A^T)^q A has the same singular vectors as A but fast-decaying singular values.
Algorithm:
1. Construct a random matrix Ω_l of size n x l
2. Form the m x l sample matrix Z = (A A^T)^q A Ω_l
3. Form an m x l orthonormal matrix Q_l from Z using partial QR
53 Matrices with slowly decaying spectrum
Error in approximation: ||A − Q_l Q_l^T A|| ≤ [1 + 11 √((k + p) · min(m, n))]^{1/(2q+1)} σ_{k+1} with probability 1 − 6 p^{−p}
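The power-iteration variant can be sketched by modifying step 2 of the basic scheme. Re-orthonormalizing between applications of A and A^T is a standard stabilization (the plain product (A A^T)^q A Ω loses accuracy in floating point); the 1/i spectrum and q = 2 are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
# Synthetic matrix with a slowly (1/i) decaying spectrum
m, n, k, p, q = 200, 120, 10, 10, 2
U0, _ = np.linalg.qr(rng.normal(size=(m, n)))
V0, _ = np.linalg.qr(rng.normal(size=(n, n)))
s_true = 1.0 / np.arange(1, n + 1)
A = U0 @ np.diag(s_true) @ V0.T

l = k + p
Omega = rng.normal(size=(n, l))

# Z = (A A^T)^q A Omega, computed with QR re-orthonormalization
# between the 2q + 1 applications of A / A^T
Z = A @ Omega
Q, _ = np.linalg.qr(Z)
for _ in range(q):
    Q, _ = np.linalg.qr(A.T @ Q)
    Q, _ = np.linalg.qr(A @ Q)

err_power = np.linalg.norm(A - Q @ (Q.T @ A), 2)
```

A few power steps typically drive the error from well above σ_{k+1} down toward the lower bound σ_{l+1}, matching the exponent 1/(2q+1) in the bound above.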
54 Randomized method: Example
Eigenfaces: m = 754 faces, n = # pixels
55 Randomized method
Advantages:
1. Much faster than standard methods: O(mn log(k)) instead of O(mnk)
2. Need to look at the data only very few times (sometimes a single pass!)
3. Easy parallel implementation
4. The error in approximation can be made arbitrarily small with very high probability, inexpensively
5. Simple to code
56 Randomized Vs Sparse Methods
Randomized methods build the approximate range of A by multiplying A with new random vectors progressively: y_1 = A w_1, …, y_l = A w_l.
Sparse methods build the approximate range of A by multiplying A recursively with the outcome of the previous stage, starting with a random vector: y_1 = A w, …, y_l = A y_{l−1} (if A is square).
Advantages of randomized methods:
1. Similar computational complexity
2. Can be parallelized
3. Provide better approximations even for small spectrum gaps
4. Can learn from very few (sometimes just one) passes over the data
57 References
1. Deerwester, S., et al., Improving Information Retrieval with Latent Semantic Indexing, Proceedings of the 51st Annual Meeting of the American Society for Information Science, 1988.
2. Ulrike von Luxburg, A Tutorial on Spectral Clustering, Tech Report TR-149, Max Planck Institute.
3. Golub and Van Loan, Matrix Computations, Johns Hopkins Press.
4. Sergey Brin, Larry Page (1998), "The Anatomy of a Large-Scale Hypertextual Web Search Engine", Proceedings of the 7th International Conference on World Wide Web (WWW), Brisbane, Australia.
5. A. Frieze, R. Kannan and S. Vempala, Fast Monte-Carlo Algorithms for Finding Low-rank Approximations, Proceedings of the Foundations of Computer Science, 1998.
6. P. Drineas, R. Kannan, M. Mahoney, Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication, Tech Report.
7. Halko, Martinsson and Tropp, "Finding Structure with Randomness: Stochastic Algorithms for Constructing Approximate Matrix Decompositions", ACM Report, 2009.
Math 504, Homework 5 Computation of eigenvalues and singular values Recall that your solutions to these questions will not be collected or evaluated 1 Find the eigenvalues and the associated eigenspaces
More informationA Tour of the Lanczos Algorithm and its Convergence Guarantees through the Decades
A Tour of the Lanczos Algorithm and its Convergence Guarantees through the Decades Qiaochu Yuan Department of Mathematics UC Berkeley Joint work with Prof. Ming Gu, Bo Li April 17, 2018 Qiaochu Yuan Tour
More informationSolution of eigenvalue problems. Subspace iteration, The symmetric Lanczos algorithm. Harmonic Ritz values, Jacobi-Davidson s method
Solution of eigenvalue problems Introduction motivation Projection methods for eigenvalue problems Subspace iteration, The symmetric Lanczos algorithm Nonsymmetric Lanczos procedure; Implicit restarts
More informationInformation Retrieval
Introduction to Information CS276: Information and Web Search Christopher Manning and Pandu Nayak Lecture 13: Latent Semantic Indexing Ch. 18 Today s topic Latent Semantic Indexing Term-document matrices
More informationSolving Constrained Rayleigh Quotient Optimization Problem by Projected QEP Method
Solving Constrained Rayleigh Quotient Optimization Problem by Projected QEP Method Yunshen Zhou Advisor: Prof. Zhaojun Bai University of California, Davis yszhou@math.ucdavis.edu June 15, 2017 Yunshen
More informationKrylov Subspace Methods to Calculate PageRank
Krylov Subspace Methods to Calculate PageRank B. Vadala-Roth REU Final Presentation August 1st, 2013 How does Google Rank Web Pages? The Web The Graph (A) Ranks of Web pages v = v 1... Dominant Eigenvector
More informationA hybrid reordered Arnoldi method to accelerate PageRank computations
A hybrid reordered Arnoldi method to accelerate PageRank computations Danielle Parker Final Presentation Background Modeling the Web The Web The Graph (A) Ranks of Web pages v = v 1... Dominant Eigenvector
More informationKrylov Space Methods. Nonstationary sounds good. Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17
Krylov Space Methods Nonstationary sounds good Radu Trîmbiţaş Babeş-Bolyai University Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17 Introduction These methods are used both to solve
More informationUnsupervised Machine Learning and Data Mining. DS 5230 / DS Fall Lecture 7. Jan-Willem van de Meent
Unsupervised Machine Learning and Data Mining DS 5230 / DS 4420 - Fall 2018 Lecture 7 Jan-Willem van de Meent DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Dimensionality Reduction Goal:
More informationUNIT 6: The singular value decomposition.
UNIT 6: The singular value decomposition. María Barbero Liñán Universidad Carlos III de Madrid Bachelor in Statistics and Business Mathematical methods II 2011-2012 A square matrix is symmetric if A T
More informationFast Krylov Methods for N-Body Learning
Fast Krylov Methods for N-Body Learning Nando de Freitas Department of Computer Science University of British Columbia nando@cs.ubc.ca Maryam Mahdaviani Department of Computer Science University of British
More informationProblems. Looks for literal term matches. Problems:
Problems Looks for literal term matches erms in queries (esp short ones) don t always capture user s information need well Problems: Synonymy: other words with the same meaning Car and automobile 电脑 vs.
More informationLinear Algebra Background
CS76A Text Retrieval and Mining Lecture 5 Recap: Clustering Hierarchical clustering Agglomerative clustering techniques Evaluation Term vs. document space clustering Multi-lingual docs Feature selection
More informationIterative Methods for Solving A x = b
Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http
More informationPrincipal Component Analysis and Singular Value Decomposition. Volker Tresp, Clemens Otte Summer 2014
Principal Component Analysis and Singular Value Decomposition Volker Tresp, Clemens Otte Summer 2014 1 Motivation So far we always argued for a high-dimensional feature space Still, in some cases it makes
More informationComputation. For QDA we need to calculate: Lets first consider the case that
Computation For QDA we need to calculate: δ (x) = 1 2 log( Σ ) 1 2 (x µ ) Σ 1 (x µ ) + log(π ) Lets first consider the case that Σ = I,. This is the case where each distribution is spherical, around the
More informationLatent Semantic Indexing (LSI) CE-324: Modern Information Retrieval Sharif University of Technology
Latent Semantic Indexing (LSI) CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2016 Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276,
More informationEigenvalue Problems. Eigenvalue problems occur in many areas of science and engineering, such as structural analysis
Eigenvalue Problems Eigenvalue problems occur in many areas of science and engineering, such as structural analysis Eigenvalues also important in analyzing numerical methods Theory and algorithms apply
More informationIntroduction to Search Engine Technology Introduction to Link Structure Analysis. Ronny Lempel Yahoo Labs, Haifa
Introduction to Search Engine Technology Introduction to Link Structure Analysis Ronny Lempel Yahoo Labs, Haifa Outline Anchor-text indexing Mathematical Background Motivation for link structure analysis
More informationThe Conjugate Gradient Method
The Conjugate Gradient Method Classical Iterations We have a problem, We assume that the matrix comes from a discretization of a PDE. The best and most popular model problem is, The matrix will be as large
More informationMath 307 Learning Goals
Math 307 Learning Goals May 14, 2018 Chapter 1 Linear Equations 1.1 Solving Linear Equations Write a system of linear equations using matrix notation. Use Gaussian elimination to bring a system of linear
More informationMACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA
1 MACHINE LEARNING Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 2 Practicals Next Week Next Week, Practical Session on Computer Takes Place in Room GR
More informationExample: Face Detection
Announcements HW1 returned New attendance policy Face Recognition: Dimensionality Reduction On time: 1 point Five minutes or more late: 0.5 points Absent: 0 points Biometrics CSE 190 Lecture 14 CSE190,
More informationIKA: Independent Kernel Approximator
IKA: Independent Kernel Approximator Matteo Ronchetti, Università di Pisa his wor started as my graduation thesis in Como with Stefano Serra Capizzano. Currently is being developed with the help of Federico
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu March 16, 2016 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 19: More on Arnoldi Iteration; Lanczos Iteration Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical Analysis I 1 / 17 Outline 1
More informationEigenvalue Problems CHAPTER 1 : PRELIMINARIES
Eigenvalue Problems CHAPTER 1 : PRELIMINARIES Heinrich Voss voss@tu-harburg.de Hamburg University of Technology Institute of Mathematics TUHH Heinrich Voss Preliminaries Eigenvalue problems 2012 1 / 14
More informationAnalysis of Spectral Kernel Design based Semi-supervised Learning
Analysis of Spectral Kernel Design based Semi-supervised Learning Tong Zhang IBM T. J. Watson Research Center Yorktown Heights, NY 10598 Rie Kubota Ando IBM T. J. Watson Research Center Yorktown Heights,
More informationKrylov Subspaces. The order-n Krylov subspace of A generated by x is
Lab 1 Krylov Subspaces Lab Objective: matrices. Use Krylov subspaces to find eigenvalues of extremely large One of the biggest difficulties in computational linear algebra is the amount of memory needed
More informationBig Data Analytics: Optimization and Randomization
Big Data Analytics: Optimization and Randomization Tianbao Yang Tutorial@ACML 2015 Hong Kong Department of Computer Science, The University of Iowa, IA, USA Nov. 20, 2015 Yang Tutorial for ACML 15 Nov.
More informationSpectral Clustering - Survey, Ideas and Propositions
Spectral Clustering - Survey, Ideas and Propositions Manu Bansal, CSE, IITK Rahul Bajaj, CSE, IITB under the guidance of Dr. Anil Vullikanti Networks Dynamics and Simulation Sciences Laboratory Virginia
More informationKrylov Subspaces. Lab 1. The Arnoldi Iteration
Lab 1 Krylov Subspaces Lab Objective: Discuss simple Krylov Subspace Methods for finding eigenvalues and show some interesting applications. One of the biggest difficulties in computational linear algebra
More informationChapter 6: Orthogonality
Chapter 6: Orthogonality (Last Updated: November 7, 7) These notes are derived primarily from Linear Algebra and its applications by David Lay (4ed). A few theorems have been moved around.. Inner products
More informationMain matrix factorizations
Main matrix factorizations A P L U P permutation matrix, L lower triangular, U upper triangular Key use: Solve square linear system Ax b. A Q R Q unitary, R upper triangular Key use: Solve square or overdetrmined
More informationFinding normalized and modularity cuts by spectral clustering. Ljubjana 2010, October
Finding normalized and modularity cuts by spectral clustering Marianna Bolla Institute of Mathematics Budapest University of Technology and Economics marib@math.bme.hu Ljubjana 2010, October Outline Find
More information8.1 Concentration inequality for Gaussian random matrix (cont d)
MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration
More informationLecture # 11 The Power Method for Eigenvalues Part II. The power method find the largest (in magnitude) eigenvalue of. A R n n.
Lecture # 11 The Power Method for Eigenvalues Part II The power method find the largest (in magnitude) eigenvalue of It makes two assumptions. 1. A is diagonalizable. That is, A R n n. A = XΛX 1 for some
More informationJeffrey D. Ullman Stanford University
Jeffrey D. Ullman Stanford University 2 Often, our data can be represented by an m-by-n matrix. And this matrix can be closely approximated by the product of two matrices that share a small common dimension
More informationUnsupervised dimensionality reduction
Unsupervised dimensionality reduction Guillaume Obozinski Ecole des Ponts - ParisTech SOCN course 2014 Guillaume Obozinski Unsupervised dimensionality reduction 1/30 Outline 1 PCA 2 Kernel PCA 3 Multidimensional
More informationChapter 7: Symmetric Matrices and Quadratic Forms
Chapter 7: Symmetric Matrices and Quadratic Forms (Last Updated: December, 06) These notes are derived primarily from Linear Algebra and its applications by David Lay (4ed). A few theorems have been moved
More informationAn Experimental Evaluation of a Monte-Carlo Algorithm for Singular Value Decomposition
An Experimental Evaluation of a Monte-Carlo Algorithm for Singular Value Decomposition Petros Drineas 1, Eleni Drinea 2, and Patrick S. Huggins 1 1 Yale University, Computer Science Department, New Haven,
More informationLecture 9: Krylov Subspace Methods. 2 Derivation of the Conjugate Gradient Algorithm
CS 622 Data-Sparse Matrix Computations September 19, 217 Lecture 9: Krylov Subspace Methods Lecturer: Anil Damle Scribes: David Eriksson, Marc Aurele Gilles, Ariah Klages-Mundt, Sophia Novitzky 1 Introduction
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 1: Course Overview & Matrix-Vector Multiplication Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 20 Outline 1 Course
More informationMath 307 Learning Goals. March 23, 2010
Math 307 Learning Goals March 23, 2010 Course Description The course presents core concepts of linear algebra by focusing on applications in Science and Engineering. Examples of applications from recent
More informationBindel, Fall 2016 Matrix Computations (CS 6210) Notes for
1 Arnoldi Notes for 2016-11-16 Krylov subspaces are good spaces for approximation schemes. But the power basis (i.e. the basis A j b for j = 0,..., k 1) is not good for numerical work. The vectors in the
More information7 Principal Component Analysis
7 Principal Component Analysis This topic will build a series of techniques to deal with high-dimensional data. Unlike regression problems, our goal is not to predict a value (the y-coordinate), it is
More informationMatrices, Vector Spaces, and Information Retrieval
Matrices, Vector Spaces, and Information Authors: M. W. Berry and Z. Drmac and E. R. Jessup SIAM 1999: Society for Industrial and Applied Mathematics Speaker: Mattia Parigiani 1 Introduction Large volumes
More informationFaloutsos, Tong ICDE, 2009
Large Graph Mining: Patterns, Tools and Case Studies Christos Faloutsos Hanghang Tong CMU Copyright: Faloutsos, Tong (29) 2-1 Outline Part 1: Patterns Part 2: Matrix and Tensor Tools Part 3: Proximity
More informationClassification. The goal: map from input X to a label Y. Y has a discrete set of possible values. We focused on binary Y (values 0 or 1).
Regression and PCA Classification The goal: map from input X to a label Y. Y has a discrete set of possible values We focused on binary Y (values 0 or 1). But we also discussed larger number of classes
More information