Faloutsos, Tong ICDE, 2009


Large Graph Mining: Patterns, Tools and Case Studies
Christos Faloutsos, Hanghang Tong (CMU)
Copyright: Faloutsos, Tong (2009)

Outline
Part 1: Patterns
Part 2: Matrix and Tensor Tools
Part 3: Proximity
Part 4: Case Studies

Outline: Part 2
Matrix Tools: SVD, PCA; HITS, PageRank; Co-clustering
Tensor Tools

Examples of Matrices
Example/intuition: documents and terms (e.g., Paper#1, Paper#2, ... versus terms such as data, mining, classif., tree).
Goal: find patterns, groups, concepts.

Singular Value Decomposition (SVD)
X = U Sigma V^T, i.e., the input data X is decomposed into left singular vectors (U), singular values (Sigma = diag(sigma_1, ..., sigma_k)) and right singular vectors (V).

SVD as spectral decomposition
X ~ sigma_1 u_1 v_1^T + sigma_2 u_2 v_2^T + ... + sigma_k u_k v_k^T
is the best rank-k approximation in the L2 and Frobenius norms.
SVD only works for static matrices (a single 2nd-order tensor); see also PARAFAC.
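
[Editor's note: the following is a minimal numerical sketch, not part of the tutorial. It shows a rank-k SVD approximation of a toy document-term matrix with NumPy; the matrix values and k = 2 are illustrative assumptions.]

```python
import numpy as np

# Toy document-term matrix (rows: papers; columns: data, mining, classif., tree, brain, lung)
A = np.array([
    [1, 1, 1, 0, 0, 0],
    [2, 2, 2, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 2, 2],
], dtype=float)

# Full SVD: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Best rank-k approximation (in L2 and Frobenius norms): keep the top-k singular triplets
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print("singular values:", np.round(s, 3))
print("rank-%d reconstruction error:" % k, np.linalg.norm(A - A_k))  # Frobenius residual
```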

Vector outer product - intuition
A 2-d histogram (e.g., owner age versus car type: VW, Volvo, BMW) that factors as the outer product of two 1-d histograms corresponds to an independence assumption.

SVD - Example
A = U Sigma V^T on a term-document matrix (terms: data, inf. retrieval, brain, lung; documents: CS papers and MD papers).
[Figure: the decomposition reveals a "CS concept" and an "MD concept".]

SVD - Example (continued)
In A = U Sigma V^T:
U is the document-to-concept similarity matrix,
Sigma gives the strength of each concept (e.g., the strength of the CS concept),
V is the term-to-concept similarity matrix.

SVD - Interpretation
Documents, terms and concepts:
Q: if A is the document-to-term matrix, what is A^T A?
A: the term-to-term ([m x m]) similarity matrix.
Q: and A A^T?
A: the document-to-document ([n x n]) similarity matrix.

SVD properties
V holds the eigenvectors of the covariance matrix A^T A.
U holds the eigenvectors of the Gram (inner-product) matrix A A^T.
Further reading:
1. Ian T. Jolliffe, Principal Component Analysis (2nd ed.), Springer, 2002.
2. Gilbert Strang, Linear Algebra and Its Applications (4th ed.), Brooks Cole.

Principal Component Analysis (PCA)
PCA is an important application of SVD: A (n x m) ~ U Sigma V^T, where U Sigma gives the loadings and V the principal components (PCs).
Note that U and V are dense and may have negative entries.

PCA - interpretation
Find the best axis to project on ("best" = minimum sum of squared projection errors).
[Figure: points in the (Term1 = "data", Term2 = "lung"/"retrieval") plane; PCA projects them onto the first singular vector v1, the axis with minimum RMS error.]
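
[Editor's note: a minimal sketch, not from the slides, of PCA via SVD: center the data, decompose, and project onto the first right singular vector. The 2-D point cloud is a made-up stand-in for the (Term1, Term2) example.]

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up 2-D data: points roughly along one direction, as in the (Term1, Term2) picture
X = rng.normal(size=(100, 1)) @ np.array([[3.0, 1.0]]) + 0.3 * rng.normal(size=(100, 2))

# PCA via SVD: center, then decompose
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

v1 = Vt[0]                 # first principal axis (first right singular vector)
scores = Xc @ v1           # 1-D projection of every point (minimum RMS projection error)
explained = s**2 / np.sum(s**2)

print("first principal axis:", np.round(v1, 3))
print("fraction of variance explained by PC1:", round(explained[0], 3))
```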

Outline: Part 2
Matrix Tools: SVD, PCA; HITS, PageRank; Co-clustering
Tensor Tools

Kleinberg's algorithm - HITS
Problem definition: given the web and a query, find the most authoritative web pages for this query.
Step 0: find all pages containing the query terms.
Step 1: expand by one move forward and backward.
Further reading: J. Kleinberg. Authoritative sources in a hyperlinked environment. SODA 1998.

Kleinberg's algorithm - HITS
Step 1: expand by one move forward and backward.

Kleinberg's algorithm - HITS
On the resulting graph, give a high score ("authorities") to nodes that many important nodes point to, and a high importance score ("hubs") to nodes that point to good authorities.

Kleinberg's algorithm - HITS: observations
The definition is recursive! Each node (say, the i-th node) has both an authoritativeness score a_i and a hubness score h_i.

Kleinberg's algorithm: HITS
Let A be the adjacency matrix: the (i,j) entry is 1 if the edge from i to j exists. Let h and a be [n x 1] vectors with the hubness and authoritativeness scores. Then:

Kleinberg's algorithm: HITS
Then, if nodes k, l, m point to node i:
a_i = h_k + h_l + h_m
that is, a_i = sum of h_j over all j such that the edge (j,i) exists; in matrix form, a = A^T h.

Symmetrically, for the hubness, if node i points to nodes n, p, q:
h_i = a_n + a_p + a_q
that is, h_i = sum of a_j over all j such that the edge (i,j) exists; in matrix form, h = A a.

In conclusion, we want vectors h and a such that:
h = A a
a = A^T h
That is: a = A^T A a
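
[Editor's note: a minimal power-iteration sketch of these fixed-point equations (a = A^T h, h = A a), with normalization at every step. The tiny adjacency matrix is an illustrative assumption, not the slides' example.]

```python
import numpy as np

def hits(A, n_iter=100):
    """Power iteration for HITS: a = A^T h, h = A a, normalizing each step."""
    n = A.shape[0]
    a = np.ones(n)
    h = np.ones(n)
    for _ in range(n_iter):
        a = A.T @ h
        a /= np.linalg.norm(a)
        h = A @ a
        h /= np.linalg.norm(h)
    return h, a

# Toy directed graph: A[i, j] = 1 if the edge i -> j exists
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
], dtype=float)

hubs, authorities = hits(A)
print("hub scores:      ", np.round(hubs, 3))
print("authority scores:", np.round(authorities, 3))
```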

Kleinberg's algorithm: HITS
So a is a right singular vector of the adjacency matrix A (by definition!), i.e., an eigenvector of A^T A. Starting from a random a and iterating, we'll eventually converge.
Q: to which of all the eigenvectors? Why?
A: to the one with the strongest eigenvalue lambda_1, since after k steps (A^T A)^k a is dominated by the lambda_1^k term.

Kleinberg's algorithm - discussion
The authority score can be used to find similar pages (how?).
Closely related to citation analysis, social networks / small-world phenomena.
See also TOPHITS.

Motivating problem: PageRank
Given a directed graph, find its most interesting/central node.
A node is important if it is connected with important nodes (recursive, but OK!).

Motivating problem - PageRank solution
Given a directed graph, find its most interesting/central node.
Proposed solution: random walk; spot the most popular node (-> steady-state probability (ssp)).
A node has a high ssp if it is connected with high-ssp nodes (recursive, but OK!).

(Simplified) PageRank algorithm
Let A be the transition matrix (= adjacency matrix); let B be its transpose, column-normalized. Then the steady-state probability vector p satisfies
B p = p
[Figure: a 5-node example graph (p1..p5) with its from/to adjacency matrix and the column-normalized matrix B, whose entries are 1 or 1/2 depending on the out-degree.]

(Simplified) PageRank algorithm
B p = 1 * p
Thus p is the eigenvector that corresponds to the highest eigenvalue (which is 1, since the matrix is column-normalized).
Why does such a p exist? p exists if B is n x n, nonnegative and irreducible [Perron-Frobenius theorem].

(Simplified) PageRank algorithm
In short: imagine a particle randomly moving along the edges and compute its steady-state probabilities (ssp).
Full version of the algorithm: with occasional random jumps. Why? To make the matrix irreducible.

Full algorithm
With probability 1-c, fly out to a random node. Then we have
p = c B p + (1-c)/n 1
=> p = (1-c)/n [I - c B]^{-1} 1
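
[Editor's note: a minimal sketch of the full algorithm above, p = c B p + (1-c)/n 1, solved by power iteration rather than the matrix inverse. The 5-node adjacency matrix is made up, and c = 0.85 is a common but assumed choice.]

```python
import numpy as np

def pagerank(A, c=0.85, n_iter=100):
    """Power iteration for p = c * B @ p + (1 - c)/n, where B is the
    column-normalized transpose of the adjacency matrix A."""
    n = A.shape[0]
    out_deg = A.sum(axis=1)
    out_deg[out_deg == 0] = 1.0          # crude guard; the toy graph has no dangling nodes
    B = (A / out_deg[:, None]).T         # column-stochastic transition matrix
    p = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        p = c * (B @ p) + (1.0 - c) / n
    return p / p.sum()

# Toy directed graph: A[i, j] = 1 if the edge i -> j exists
A = np.array([
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
], dtype=float)

print("PageRank scores:", np.round(pagerank(A), 3))
```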

Outline: Part 2
Matrix Tools: SVD, PCA; HITS, PageRank; Co-clustering
Tensor Tools

Co-clustering
Given a data matrix and the number of row and column groups k and l, simultaneously:
cluster the rows of p(X, Y) into k disjoint groups, and
cluster the columns of p(X, Y) into l disjoint groups.

Co-clustering
Let X and Y be discrete random variables taking values in {1, 2, ..., m} and {1, 2, ..., n}; p(X, Y) denotes their joint probability distribution (if not known, it is often estimated from co-occurrence data).
Application areas: text mining, market-basket analysis, analysis of browsing behavior, etc.
Key obstacles in clustering contingency tables: high dimensionality, sparsity, noise; the need for robust and scalable algorithms.
Reference: Dhillon et al. Information-Theoretic Co-clustering, KDD '03.

[Figure: an m x n matrix (e.g., terms x documents) approximated by the product (term x term-group) x (term-group x doc-group) x (doc-group x doc).]

[Figure: example with medical and CS documents; the rows split into medical terms, CS terms and common terms, the columns into a medical doc group and a CS doc group; the middle matrix relates term groups to doc groups.]

Co-clustering - observations
It uses KL divergence instead of L2.
The middle matrix is not diagonal; we'll see that again in the Tucker tensor decomposition.
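
[Editor's note: the information-theoretic algorithm cited above is not reproduced here. As a loose stand-in, this sketch uses scikit-learn's SpectralCoclustering, which is Dhillon's SVD-based bipartite spectral co-clustering (2001), not the KL-divergence method of Dhillon et al. KDD '03. The block matrix, cluster count and random seed are assumptions, shown only to illustrate simultaneous row/column grouping.]

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering

rng = np.random.default_rng(0)

# Made-up block-structured matrix: 2 row groups x 2 column groups
data = np.block([
    [rng.poisson(5.0, (20, 30)), rng.poisson(0.5, (20, 30))],
    [rng.poisson(0.5, (20, 30)), rng.poisson(5.0, (20, 30))],
]).astype(float)

# Spectral co-clustering: cluster rows and columns at the same time
model = SpectralCoclustering(n_clusters=2, random_state=0)
model.fit(data)

print("row groups:   ", model.row_labels_)
print("column groups:", model.column_labels_)
```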

Outline: Part 2
Matrix Tools
Tensor Tools: Tensor Basics; Tucker (Tucker1, Tucker2, Tucker3); PARAFAC; Incrementalization

Tensor Basics

Reminder: SVD
A (m x n) = U Sigma V^T: the best rank-k approximation in L2. See also PARAFAC.

Reminder: SVD
A ~ sigma_1 u_1 v_1^T + sigma_2 u_2 v_2^T + ...: the best rank-k approximation in L2. See also PARAFAC.

Goal: extension to >= 3 modes
X (I x J x K) ~ sum of R rank-1 tensors, with factor matrices A (I x R), B (J x R), C (K x R).

Main points
There are 2 major types of tensor decompositions: PARAFAC and Tucker.
Both can be solved with "alternating least squares" (ALS).
Details follow; we start with terminology.

[T. Kolda, '07] What is a tensor?
A tensor is a multidimensional array. An I x J x K tensor X has entries x_ijk; mode 1 has dimension I, mode 2 has dimension J, mode 3 has dimension K.
Fibers: column (mode-1), row (mode-2) and tube (mode-3) fibers. Slices: horizontal, lateral and frontal slices.
Note: the tutorial focuses on 3rd-order tensors, but everything can be extended to higher orders.

[T. Kolda, '07] Matricization: converting a tensor to a matrix
Matricize (unfolding): (i,j,k) -> (i,j'); X_(n) rearranges the mode-n fibers to be the columns of a matrix. Reverse matricize: (i,j') -> (i,j,k).

Tensor mode-n multiplication
Tensor times matrix (TTM): multiply each row (mode-2) fiber by B.
Tensor times vector (TTV): compute the dot product of a and each column (mode-1) fiber.
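
[Editor's note: a minimal NumPy sketch of the two operations just defined, mode-n unfolding and the mode-n (tensor-times-matrix) product. The helper names are my own, modes are 0-based here, and the column ordering of the unfolding differs from the MATLAB toolbox's convention, but the idea is the same.]

```python
import numpy as np

def unfold(X, mode):
    """Mode-n matricization: the mode-n fibers become the columns of a matrix."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of unfold: rebuild the tensor of the given shape."""
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def mode_n_product(X, U, mode):
    """Tensor times matrix: multiply every mode-n fiber of X by the matrix U."""
    shape = list(X.shape)
    shape[mode] = U.shape[0]
    return fold(U @ unfold(X, mode), mode, tuple(shape))

I, J, K = 4, 5, 6
X = np.arange(I * J * K, dtype=float).reshape(I, J, K)

print("X_(1) shape:", unfold(X, 0).shape)   # (4, 30)
print("X_(2) shape:", unfold(X, 1).shape)   # (5, 24)

B = np.ones((3, J))                          # maps the J mode-2 entries to 3 clusters
Y = mode_n_product(X, B, mode=1)             # multiply each row (mode-2) fiber by B
print("X x_2 B shape:", Y.shape)             # (4, 3, 6)
```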

Pictorial view of mode-n matrix multiplication
[Figure: mode-1 multiplication acts on the frontal slices, mode-2 multiplication on the lateral slices, mode-3 multiplication on the horizontal slices.]

Mode-n product - example (tensor times a matrix)
[Figure: a tensor with location and time modes, multiplied along the time mode by a (time clusters x time) matrix, yields a tensor whose time mode is replaced by a time-clusters mode.]

Mode-n product - example (tensor times a vector)
[Figure: the same tensor multiplied along the time mode by a time vector, which collapses that mode.]

Outer, Kronecker, and Khatri-Rao products
3-way outer product: a o b o c is a rank-1 tensor.
Review - matrix Kronecker product: an (M x N) matrix Kronecker an (P x Q) matrix gives an (MP x NQ) matrix.
Matrix Khatri-Rao product: an (M x R) matrix with an (N x R) matrix gives an (MN x R) matrix (column-wise Kronecker product).
Observe: for two vectors a and b, the outer product a o b and the Kronecker product a (x) b have the same elements, but one is shaped into a matrix and the other into a vector. (A small numerical check follows after this page.)

Specially Structured Tensors

Specially structured tensors
Tucker tensor: core G (R x S x T) multiplied in each mode by factors U (I x R), V (J x S), W (K x T).
Kruskal tensor: a weighted sum of rank-1 tensors, w_1 u_1 o v_1 o ... + ... + w_R u_R o v_R o ..., i.e., a Tucker tensor with a superdiagonal (R x R x R) core.
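
[Editor's note: a small NumPy check, not from the slides, of the size claims above (Kronecker: MP x NQ, Khatri-Rao: MN x R) and of the observation that the outer and Kronecker products of two vectors contain the same elements. The matrices are random placeholders and the khatri_rao helper is my own.]

```python
import numpy as np

rng = np.random.default_rng(0)

def khatri_rao(A, B):
    """Column-wise Kronecker product: (M x R) and (N x R) -> (MN x R)."""
    M, R = A.shape
    N, R2 = B.shape
    assert R == R2, "Khatri-Rao needs the same number of columns"
    return np.einsum('ir,jr->ijr', A, B).reshape(M * N, R)

A = rng.normal(size=(3, 4))   # M x N = 3 x 4
B = rng.normal(size=(2, 5))   # P x Q = 2 x 5
print("Kronecker product shape:", np.kron(A, B).shape)        # (6, 20) = MP x NQ

C = rng.normal(size=(3, 4))   # M x R
D = rng.normal(size=(2, 4))   # N x R
print("Khatri-Rao product shape:", khatri_rao(C, D).shape)    # (6, 4) = MN x R

# Outer vs. Kronecker product of two vectors: same elements, different shape
a, b = rng.normal(size=3), rng.normal(size=2)
print(np.allclose(np.outer(a, b).ravel(), np.kron(a, b)))     # True
```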

Specially structured tensors - in matrix form
Tucker tensor, mode-1 unfolding: X_(1) = U G_(1) (W kron V)^T.
Kruskal tensor, mode-1 unfolding: X_(1) = U diag(w) (W khatri-rao V)^T.

Outline: Part 2
Matrix Tools
Tensor Tools: Tensor Basics; Tucker (Tucker1, Tucker2, Tucker3); PARAFAC; Incrementalization

Tensor Decompositions

Tucker decomposition - intuition
X (I x J x K) ~ G x_1 A x_2 B x_3 C, with A (I x R), B (J x S), C (K x T) and core G (R x S x T).
E.g., for an author x keyword x conference tensor:
A: author x author-group
B: keyword x keyword-group
C: conference x conference-group
G: how the groups relate to each other

[Figure: the earlier co-clustering picture again - (term x term-group) x (term-group x doc-group) x (doc-group x doc), with medical, CS and common term groups.]

Tucker decomposition
X ~ G x_1 A x_2 B x_3 C. Given A, B, C (orthonormal), the optimal core is G = X x_1 A^T x_2 B^T x_3 C^T.
Proposed by Tucker (1966). AKA: three-mode factor analysis, three-mode PCA, orthogonal array decomposition.
A, B, and C are generally assumed to be orthonormal (at the very least, to have full column rank). G is not diagonal. The decomposition is not unique.
Recall the equations for converting a tensor to a matrix.

Tucker variations
See Kroonenberg & De Leeuw, Psychometrika, 1980 for discussion.
Tucker2: one factor is the identity matrix, X ~ G x_1 A x_2 B (core G is R x S x K).
Tucker1: X ~ G x_1 A (core G is R x J x K). Finds principal components in only mode 1; can be solved via a rank-R matrix SVD.

Solving for Tucker
Given A, B, C orthonormal, the optimal core is G = X x_1 A^T x_2 B^T x_3 C^T.
(The tensor norm is the square root of the sum of all the elements squared.)
Eliminating the core shows that minimizing ||X - G x_1 A x_2 B x_3 C|| subject to A, B, C orthonormal is equivalent to maximizing ||X x_1 A^T x_2 B^T x_3 C^T||.
If B and C are fixed, then we can solve for A: the optimal A is the R leading left singular vectors of X_(1) (C kron B).

Higher-Order SVD (HO-SVD)
Take each factor to be the leading left singular vectors of the corresponding unfolding; then compute the core. Not optimal, but often used to initialize the Tucker-ALS algorithm. (Observe the connection to Tucker1.)
De Lathauwer, De Moor, & Vandewalle, SIMAX.
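
[Editor's note: a minimal sketch of HO-SVD as described above: each factor is the leading left singular vectors of the corresponding unfolding, and the core is the tensor multiplied by the transposed factors in every mode. The helper names, the random test tensor and the ranks are assumptions.]

```python
import numpy as np

def unfold(X, mode):
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def mode_n_product(X, U, mode):
    Y = U @ unfold(X, mode)
    new_shape = [U.shape[0]] + [s for i, s in enumerate(X.shape) if i != mode]
    return np.moveaxis(Y.reshape(new_shape), 0, mode)

def hosvd(X, ranks):
    """HO-SVD: factor n = leading left singular vectors of X_(n);
    core G = X x_1 A1^T x_2 A2^T x_3 A3^T. Not optimal, but a good Tucker-ALS start."""
    factors = [np.linalg.svd(unfold(X, n), full_matrices=False)[0][:, :r]
               for n, r in enumerate(ranks)]
    G = X
    for n, A in enumerate(factors):
        G = mode_n_product(G, A.T, n)
    return G, factors

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 7, 8))
G, (A, B, C) = hosvd(X, ranks=(3, 3, 3))

# Reconstruct and report the fit of the (3, 3, 3) Tucker approximation
X_hat = G
for n, F in enumerate((A, B, C)):
    X_hat = mode_n_product(X_hat, F, n)
print("core shape:", G.shape)
print("relative error:", np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```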

Tucker - Alternating Least Squares (ALS)
Successively solve for each component (A, B, C).
Initialize: choose R, S, T; calculate A, B, C via HO-SVD.
Until converged, do:
A <- R leading left singular vectors of X_(1) (C kron B)
B <- S leading left singular vectors of X_(2) (C kron A)
C <- T leading left singular vectors of X_(3) (B kron A)
Then solve for the core: G = X x_1 A^T x_2 B^T x_3 C^T.
Kroonenberg & De Leeuw, Psychometrika, 1980.

Tucker is not unique
The Tucker decomposition is not unique. Let Y be an R x R orthogonal matrix; then replacing A by A Y and the core G by G x_1 Y^T gives exactly the same fit.

Outline: Part 2
Matrix Tools
Tensor Tools: Tensor Basics; Tucker (Tucker1, Tucker2, Tucker3); PARAFAC

CANDECOMP/PARAFAC Decomposition
X (I x J x K) ~ sum over r = 1..R of lambda_r a_r o b_r o c_r, with A (I x R), B (J x R), C (K x R).
CANDECOMP = Canonical Decomposition (Carroll & Chang, 1970); PARAFAC = Parallel Factors (Harshman, 1970).
The core is diagonal (specified by the vector lambda). The columns of A, B, and C are not orthonormal.
If R is minimal, then R is called the rank of the tensor (Kruskal 1977). A tensor can have rank(X) > min{I, J, K}.

PARAFAC - Alternating Least Squares (ALS)
Successively solve for each component (A, B, C): find all the vectors in one mode at a time.
If C, B, and Lambda are fixed, the optimal A is given by A = X_(1) (C khatri-rao B) (C^T C * B^T B)^+, using the Khatri-Rao product (column-wise Kronecker product) and the Hadamard (elementwise) product *. Repeat for B, C, etc. (A code sketch follows after this page.)

PARAFAC is often unique
Assume the PARAFAC decomposition X = a_1 o b_1 o c_1 + ... + a_R o b_R o c_R is exact.
Sufficient condition for uniqueness (Kruskal, 1977): k_A + k_B + k_C >= 2R + 2, where the k-rank k_A of A is the maximum number k such that every set of k columns of A is linearly independent.
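
[Editor's note: a compact sketch of the CP-ALS update above, cycling over the three modes. The Khatri-Rao ordering is chosen to match this code's own unfolding convention (which differs slightly from the slides' C khatri-rao B); initialization, rank, iteration count and the random test tensor are assumptions, and no convergence check or lambda normalization is included.]

```python
import numpy as np

def unfold(X, mode):
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(A, B):
    M, R = A.shape
    N, _ = B.shape
    return np.einsum('ir,jr->ijr', A, B).reshape(M * N, R)

def cp_als(X, R, n_iter=50, seed=0):
    """Plain CP-ALS: update one factor at a time, holding the other two fixed."""
    rng = np.random.default_rng(seed)
    factors = [rng.normal(size=(dim, R)) for dim in X.shape]
    for _ in range(n_iter):
        for n in range(3):
            others = [factors[m] for m in range(3) if m != n]   # e.g. [B, C] when n = 0
            kr = khatri_rao(others[0], others[1])               # matches this unfold() ordering
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])  # Hadamard of Grams
            factors[n] = unfold(X, n) @ kr @ np.linalg.pinv(gram)
    return factors

rng = np.random.default_rng(1)
# Build a rank-2 tensor plus a little noise, then try to recover it
A0, B0, C0 = (rng.normal(size=(d, 2)) for d in (5, 6, 7))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0) + 0.01 * rng.normal(size=(5, 6, 7))

A, B, C = cp_als(X, R=2)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
print("relative fit error:", np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```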

Tucker vs. PARAFAC decompositions
Tucker: variable transformation in each mode; the core G may be dense; A, B, C generally orthonormal; not unique.
PARAFAC: sum of rank-1 components; no core (i.e., a superdiagonal core); A, B, C may have linearly dependent columns; generally unique.

Tensor tools - summary
Two main tools: PARAFAC and Tucker.
Both find row-, column- and tube-groups, but in PARAFAC the three groups are identical.
To solve: Alternating Least Squares.

Tensor tools - resources
Toolbox from Tamara Kolda: csmr.ca.sandia.gov/~tgkolda/tensortoolbox/
T. G. Kolda and B. W. Bader. Tensor Decompositions and Applications. SIAM Review, to appear (accepted June 2008). csmr.ca.sandia.gov/~tgkolda/pubs/bibtgkfiles/tensorreview-preprint.pdf
T. Kolda and J. Sun: Scalable Tensor Decomposition for Multi-Aspect Data Mining (ICDM 2008)
