Tensor decompositions. 15-826: Multimedia Databases and Data Mining, Lecture #21. C. Faloutsos

15-826: Multimedia Databases and Data Mining
Lecture #21: Tensor decompositions
C. Faloutsos

Must-read Material
- Tamara G. Kolda and Brett W. Bader. Tensor decompositions and applications. Technical Report SAND2007-6702, Sandia National Laboratories, Albuquerque, NM and Livermore, CA, November 2007.

Outline
Goal: Find similar / interesting things
- Intro to DB
- Indexing - similarity search
- Data Mining

Indexing - Detailed outline
- primary key indexing
- secondary key / multi-key indexing
- spatial access methods
- fractals
- text
- Singular Value Decomposition (SVD)
  - Tensors
- multimedia

Most of foils by:
- Dr. Tamara Kolda (Sandia N.L.), csmr.ca.sandia.gov/~tgkolda
- Dr. Jimeng Sun (CMU -> IBM), www.cs.cmu.edu/~jimeng
- 3h tutorial: www.cs.cmu.edu/~christos/talks/sdm-tut-07/

Outline
- Motivation - Definitions
- Tensor tools
- Case studies

Motivation 0: Why matrix?
- Why are matrices important?

Examples of Matrices: Graph - social network

          John  Peter  Mary  Nick  ...
  John      0     11    22    55   ...
  Peter     5      0     6     7   ...
  Mary     ...
  Nick     ...

Examples of Matrices: cloud of n-d points
- rows: John, Peter, Mary, Nick, ...
- columns: chol#, blood#, age, ...

Examples of Matrices: Market basket
- market basket as in Association Rules
- rows: John, Peter, Mary, Nick, ...
- columns: milk, bread, choc., wine, ...

Examples of Matrices: Documents and terms
- rows: Paper#1, Paper#2, Paper#3, Paper#4, ...
- columns: data, mining, classif., tree, ...

Examples of Matrices: Authors and terms
- rows: John, Peter, Mary, Nick, ...
- columns: data, mining, classif., tree, ...

Examples of Matrices: sensor-ids and time-ticks
- rows: t1, t2, t3, t4, ...
- columns: temp1, temp2, humid., pressure, ...

Motivation: Why tensors?
- Q: what is a tensor?

Motivation 2: Why tensor?
- A: N-D generalization of matrix
- [Figure: an author x keyword matrix for KDD'07 (rows John, Peter, Mary, Nick; columns data, mining, classif., tree), then the same matrix stacked with those for KDD'05 and KDD'06 to form a 3-mode tensor]
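As a minimal sketch of the "N-D generalization of matrix" idea, the author x keyword x conference example can be stored as a 3-mode NumPy array. The counts below are hypothetical and serve only to illustrate the layout; they do not come from the slides.

```python
import numpy as np

authors = ["John", "Peter", "Mary", "Nick"]
keywords = ["data", "mining", "classif.", "tree"]
conferences = ["KDD'05", "KDD'06", "KDD'07"]

# Mode 1 = authors, mode 2 = keywords, mode 3 = conferences.
X = np.zeros((len(authors), len(keywords), len(conferences)))

# Hypothetical observations: (author, keyword, conference, count).
observations = [
    ("John", "data", "KDD'07", 3),
    ("John", "mining", "KDD'07", 2),
    ("Mary", "tree", "KDD'05", 1),
    ("Nick", "classif.", "KDD'06", 4),
]
for author, keyword, conf, count in observations:
    X[authors.index(author), keywords.index(keyword), conferences.index(conf)] = count

# Fixing the third mode gives back an ordinary author x keyword matrix.
kdd07 = X[:, :, conferences.index("KDD'07")]
```

Each frontal slice `X[:, :, c]` is one of the matrices from the earlier examples; the tensor simply keeps all of them together so that patterns across conferences can be analyzed jointly.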

Tensors are useful for 3 or more modes
- Terminology: "mode" (or "aspect")
- [Figure: a 3-mode tensor with its axes labeled Mode (== aspect) #1, Mode#2 (data, mining, classif., tree) and Mode#3]

Motivating Applications
- Why are matrices important?
- Why are tensors useful?
  - P1: social networks
  - P2: web mining

P1: Social network analysis
- Traditionally, people focus on static networks and find community structures
- We plan to monitor the change of the community structure over time

P2: Web graph mining
- How to order the importance of web pages?
  - Kleinberg's algorithm HITS
  - PageRank
  - Tensor extension on HITS (TOPHITS): context-sensitive hypergraph analysis

Outline
- Motivation - Definitions
- Tensor tools
  - Tensor Basics
  - Tucker
  - PARAFAC
- Case studies

Tensor Basics

Reminder: SVD
- $A = U \Sigma V^T$, with $A$ of size $m \times n$
- Equivalently, $A = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \cdots$
- Keeping the first $k$ terms gives the best rank-k approximation in L2
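The "best rank-k approximation" property can be checked numerically. The following is a small sketch using NumPy; the helper name `best_rank_k` and the random test matrix are assumptions made for illustration, not part of the lecture.

```python
import numpy as np

def best_rank_k(A, k):
    """Sum of the first k terms sigma_i * u_i * v_i^T of the SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))          # arbitrary m x n example matrix
A2 = best_rank_k(A, 2)

# Singular values of A, in decreasing order.
sigma = np.linalg.svd(A, compute_uv=False)

# Eckart-Young: the spectral-norm error is sigma_{k+1}, and the
# Frobenius-norm error is the root of the sum of the discarded sigma_i^2.
spectral_error = np.linalg.norm(A - A2, 2)
frobenius_error = np.linalg.norm(A - A2, "fro")
```

No other rank-2 matrix gets closer to `A` in either norm, which is what makes the truncated SVD a natural starting point for the tensor generalizations that follow.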

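Goal: extension to >=3 modes
- An I x J x K tensor $\approx a_1 \circ b_1 \circ c_1 + \cdots + a_R \circ b_R \circ c_R$
- Factor matrices: A (I x R), B (J x R), C (K x R); core: R x R x R

Main points:
- 2 major types of tensor decompositions: PARAFAC and Tucker
- both can be solved with "alternating least squares" (ALS)
- Details follow

Specially Structured Tensors
- Tucker Tensor (Our Notation): an I x J x K tensor = a core (R x S x T) combined with factor matrices U (I x R), V (J x S), W (K x T)
- Kruskal Tensor (Our Notation): an I x J x K tensor = $u_1 \circ v_1 \circ w_1 + \cdots + u_R \circ v_R \circ w_R$, with U (I x R), V (J x R), W (K x R) and an R x R x R core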
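To make the two structured forms concrete, here is a hedged sketch that builds a Kruskal tensor and a Tucker tensor from their factors with `numpy.einsum`. The function names `kruskal_to_tensor` and `tucker_to_tensor`, the symbol `G` for the core, and the random factors are illustrative assumptions.

```python
import numpy as np

def kruskal_to_tensor(U, V, W, lam=None):
    """Sum over r of lam[r] * (u_r outer v_r outer w_r)."""
    R = U.shape[1]
    lam = np.ones(R) if lam is None else lam
    return np.einsum("r,ir,jr,kr->ijk", lam, U, V, W)

def tucker_to_tensor(G, U, V, W):
    """Core G (R x S x T) multiplied by U, V, W along modes 1, 2, 3."""
    return np.einsum("rst,ir,js,kt->ijk", G, U, V, W)

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 3
U = rng.standard_normal((I, R))
V = rng.standard_normal((J, R))
W = rng.standard_normal((K, R))

X_kruskal = kruskal_to_tensor(U, V, W)

# A Kruskal tensor is a Tucker tensor whose R x R x R core is superdiagonal.
G = np.zeros((R, R, R))
G[np.arange(R), np.arange(R), np.arange(R)] = 1.0
X_tucker = tucker_to_tensor(G, U, V, W)
```

The two results coincide, which is one way to read the R x R x R core in the Kruskal picture: it is there, but only its diagonal is nonzero.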

Specially Structured Tensors (details)
- Tucker Tensor, in matrix form: [formula not preserved in this transcription]
- Kruskal Tensor, in matrix form: [formula not preserved in this transcription]

Tensor Decompositions

Tucker Decomposition - intuition
- An I x J x K tensor ~ core G (R x S x T) combined with A (I x R), B (J x S), C (K x T)
- Example: author x keyword x conference
  - A: author x author-group
  - B: keyword x keyword-group
  - C: conf. x conf-group
  - G: how groups relate to each other
- Needs elaboration!

Intuition behind core tensor
- 2-d case: co-clustering
- [Dhillon et al. Information-Theoretic Co-clustering, KDD'03]
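The matrix forms themselves did not survive in this transcription. As a hedged sketch, the code below checks the mode-1 identities given in the Kolda and Bader report, X_(1) = U G_(1) (W ⊗ V)^T for a Tucker tensor and X_(1) = U (W ⊙ V)^T for a Kruskal tensor, where ⊗ is the Kronecker product and ⊙ the Khatri-Rao (column-wise Kronecker) product. The helper names and the column-major unfolding convention are assumptions chosen to match that report, not something stated on the slides.

```python
import numpy as np

def unfold1(X):
    """Mode-1 unfolding: column index is j + k*J (column-major ordering)."""
    return X.reshape(X.shape[0], -1, order="F")

def khatri_rao(P, Q):
    """Column-wise Kronecker product: column r is kron(P[:, r], Q[:, r])."""
    return np.einsum("pr,qr->pqr", P, Q).reshape(-1, P.shape[1])

rng = np.random.default_rng(0)
I, J, K = 4, 5, 6

# Tucker tensor with core G (R x S x T).
R, S, T = 2, 3, 2
G = rng.standard_normal((R, S, T))
U = rng.standard_normal((I, R))
V = rng.standard_normal((J, S))
W = rng.standard_normal((K, T))
X_tucker = np.einsum("rst,ir,js,kt->ijk", G, U, V, W)
tucker_matrix_form = U @ unfold1(G) @ np.kron(W, V).T

# Kruskal tensor with R2 rank-1 components.
R2 = 3
Uk = rng.standard_normal((I, R2))
Vk = rng.standard_normal((J, R2))
Wk = rng.standard_normal((K, R2))
X_kruskal = np.einsum("ir,jr,kr->ijk", Uk, Vk, Wk)
kruskal_matrix_form = Uk @ khatri_rao(Wk, Vk).T
```

These identities are what make alternating least squares practical: with two factor matrices held fixed, the unfolded equation is linear in the third.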

[Figure: co-clustering, e.g., terms x documents. An m x n matrix (terms x documents) ~ (m x k: term x term-group) times (k x l: term group x doc. group) times (l x n: doc x doc group). Row groups: med. terms, cs terms, common terms. Column groups: med. doc, cs doc.]

Tucker Decomposition
- An I x J x K tensor ~ core G (R x S x T) combined with A (I x R), B (J x S), C (K x T)
- Given A, B, C, the optimal core is: [formula not preserved in this transcription]
- Proposed by Tucker (1966)
- AKA: Three-mode factor analysis, three-mode PCA, orthogonal array decomposition
- A, B, and C generally assumed to be orthonormal (generally assume they have full column rank)
- Core G is not diagonal
- Not unique
- Recall the equations for converting a tensor to a matrix
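A minimal sketch of the Tucker idea follows, under two stated assumptions. First, when A, B, and C have orthonormal columns, the least-squares optimal core is the tensor multiplied by A^T, B^T, C^T in modes 1, 2, 3 (the standard result; the slide's own formula was lost). Second, the factors here come from a truncated higher-order SVD (HOSVD), which is a simpler stand-in for the alternating least squares procedure the lecture mentions. Function names are hypothetical.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding: rows indexed by the chosen mode."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def optimal_core(X, A, B, C):
    """Core for fixed factors, assuming A, B, C have orthonormal columns."""
    return np.einsum("ijk,ir,js,kt->rst", X, A, B, C)

def hosvd(X, ranks):
    """Truncated HOSVD: leading left singular vectors of each unfolding."""
    factors = []
    for mode, r in enumerate(ranks):
        left, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        factors.append(left[:, :r])
    A, B, C = factors
    return optimal_core(X, A, B, C), A, B, C

def tucker_to_tensor(G, A, B, C):
    return np.einsum("rst,ir,js,kt->ijk", G, A, B, C)

# Example: a 5 x 6 x 4 tensor that has an exact (2, 3, 2) Tucker structure.
rng = np.random.default_rng(0)
X = tucker_to_tensor(
    rng.standard_normal((2, 3, 2)),
    rng.standard_normal((5, 2)),
    rng.standard_normal((6, 3)),
    rng.standard_normal((4, 2)),
)
G, A, B, C = hosvd(X, ranks=(2, 3, 2))
X_hat = tucker_to_tensor(G, A, B, C)
```

Note that the recovered factors generally differ from the ones used to build `X`: rotating A, B, C and counter-rotating the core gives the same tensor, which is the "Not unique" point on the slide.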

Outline
- Motivation - Definitions
- Tensor tools
  - Tensor Basics
  - Tucker
  - PARAFAC
- Case studies

CANDECOMP/PARAFAC Decomposition
- An I x J x K tensor $\approx a_1 \circ b_1 \circ c_1 + \cdots + a_R \circ b_R \circ c_R$, with A (I x R), B (J x R), C (K x R) and an R x R x R core
- CANDECOMP = Canonical Decomposition (Carroll & Chang, 1970)
- PARAFAC = Parallel Factors (Harshman, 1970)
- Core is diagonal (specified by the vector λ)
- Columns of A, B, and C are not orthonormal
- If R is minimal, then R is called the rank of the tensor (Kruskal 1977)
- The rank of the tensor can be > min{I, J, K}

Tucker vs. PARAFAC Decompositions (IMPORTANT)
- Tucker
  - Variable transformation in each mode
  - Core G may be dense
  - A, B, C generally orthonormal
  - Not unique
  - Shapes: core R x S x T; A (I x R), B (J x S), C (K x T)
- PARAFAC
  - Sum of rank-1 components: $a_1 \circ b_1 \circ c_1 + \cdots + a_R \circ b_R \circ c_R$
  - No core, i.e., superdiagonal core
  - A, B, C may have linearly dependent columns
  - Generally unique

Tensor tools - summary
- Two main tools
  - PARAFAC
  - Tucker
- Both find row-, column-, tube-groups
  - but in PARAFAC the three groups are identical
- To solve: Alternating Least Squares
- Toolbox: from Tamara Kolda: http://csmr.ca.sandia.gov/~tgkolda/tensortoolbox/
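The "To solve: Alternating Least Squares" step can be sketched for PARAFAC as follows. This is a bare-bones illustration in NumPy, not the Tensor Toolbox implementation: it uses a fixed number of sweeps, random initialization, no normalization of the weights λ, and no convergence test, all of which are simplifying assumptions.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding, remaining modes kept in their original order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(P, Q):
    """Column-wise Kronecker product of P and Q."""
    return np.einsum("pr,qr->pqr", P, Q).reshape(-1, P.shape[1])

def cp_als(X, R, n_iter=200, seed=0):
    """PARAFAC of a 3-mode tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((n, R)) for n in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Hold the other two factor matrices fixed ...
            P, Q = [factors[m] for m in range(3) if m != mode]
            # ... and solve the resulting linear least-squares problem.
            gram = (P.T @ P) * (Q.T @ Q)
            factors[mode] = unfold(X, mode) @ khatri_rao(P, Q) @ np.linalg.pinv(gram)
    return factors

def cp_to_tensor(A, B, C):
    return np.einsum("ir,jr,kr->ijk", A, B, C)

# Example: fit R = 2 components to a tensor built from 2 rank-1 terms.
rng = np.random.default_rng(1)
X = cp_to_tensor(
    rng.standard_normal((4, 2)),
    rng.standard_normal((5, 2)),
    rng.standard_normal((6, 2)),
)
A, B, C = cp_als(X, R=2)
relative_error = np.linalg.norm(X - cp_to_tensor(A, B, C)) / np.linalg.norm(X)
```

Each inner update cannot increase the fitting error, but ALS may converge slowly or stall in a local minimum, so in practice one would monitor `relative_error` and possibly restart from several initializations.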

Outline
- Motivation - Definitions
- Tensor tools
- Case studies

P1: Web graph mining
- How to order the importance of web pages?
  - Kleinberg's algorithm HITS
  - PageRank
  - Tensor extension on HITS (TOPHITS)
- Reference: T. G. Kolda, B. W. Bader and J. P. Kenny, Higher-Order Web Link Analysis Using Multilinear Algebra, ICDM 2005, pp. 242-249, November 2005, doi:10.1109/icdm.2005.77.

Kleinberg's Hubs and Authorities (the HITS method)
- Sparse adjacency matrix and its SVD
- [Figure: adjacency matrix with rows "from" and columns "to"]
- Kleinberg, JACM, 1999
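The link between HITS and the SVD of the adjacency matrix can be illustrated with a short sketch: the hub and authority scores produced by the iterative updates converge to the leading left and right singular vectors. The four-page graph below is a made-up example, and the function name `hits` is an assumption for illustration.

```python
import numpy as np

def hits(adj, n_iter=100):
    """HITS by power iteration; adj[i, j] = 1 if page i links to page j."""
    hubs = np.ones(adj.shape[0])
    for _ in range(n_iter):
        auth = adj.T @ hubs            # good authorities are pointed to by good hubs
        auth /= np.linalg.norm(auth)
        hubs = adj @ auth              # good hubs point to good authorities
        hubs /= np.linalg.norm(hubs)
    return hubs, auth

# Hypothetical 4-page web graph (rows = "from", columns = "to").
adj = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

hubs, auth = hits(adj)

# Leading singular vectors of the adjacency matrix (sign is arbitrary).
U, s, Vt = np.linalg.svd(adj)
svd_hubs, svd_auth = np.abs(U[:, 0]), np.abs(Vt[0])
```

In this toy graph page 2 receives the most links and ends up with the highest authority score. TOPHITS applies the same reasoning to a three-mode tensor, replacing the SVD with PARAFAC.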

HITS Authorities on Sample Data
- We started our crawl from http://www-neos.mcs.anl.gov/neos, and crawled 4700 pages, resulting in 560 cross-linked hosts.
- [Figure: adjacency matrix with rows "from" and columns "to"]

Three-Dimensional View of the Web
- Observe that this tensor is very sparse!
- Kolda, Bader, Kenny, ICDM05

Topical HITS (TOPHITS)
- Main Idea: Extend the idea behind the HITS model to incorporate term (i.e., topical) information.
- [Figure: the from x to x term tensor written as a sum of components, each carrying term scores in addition to its "from" and "to" scores]

TOPHITS Terms & Authorities on Sample Data
- TOPHITS uses 3D analysis to find the dominant groupings of web pages and terms.
- $w_k$ = # unique links using term k
- Tensor PARAFAC
- [Figure: from x to x term tensor and its components with term scores]

GigaTensor: Scaling Tensor Analysis Up By 100 Times - Algorithms and Discoveries
- U Kang, Evangelos Papalexakis, Abhay Harpale, Christos Faloutsos
- KDD 2012

P2: N.E.L.L. analysis
- NELL: Never Ending Language Learner
- Q1: dominant concepts / topics?
- Q2: synonyms for a given new phrase?

A1: Concept Discovery
- Concept Discovery in Knowledge Base
- Example facts: "Eric Clapton plays guitar", "Barack Obama is the president of U.S."
- NELL (Never Ending Language Learner) tensor: mode sizes (48M), (26M), (26M); Nonzeros = 144M

A1: Concept Discovery
- [Figure not preserved in this transcription]

A2: Synonym Discovery
- Synonym Discovery in Knowledge Base
- [Figure: factor vectors $a_1, a_2, \ldots, a_R$, with rows marked (Given) subject, (Discovered) synonym 1, (Discovered) synonym 2]

Conclusions
- Real data are often in high dimensions with multiple aspects (modes)
- Matrices and tensors provide elegant theory and algorithms
- [Figure: tensor $\approx a_1 \circ b_1 \circ c_1 + \cdots + a_R \circ b_R \circ c_R$]

References
- Inderjit S. Dhillon, Subramanyam Mallela, Dharmendra S. Modha. Information-theoretic co-clustering. KDD 2003: 89-98.
- T. G. Kolda, B. W. Bader and J. P. Kenny. Higher-Order Web Link Analysis Using Multilinear Algebra. In: ICDM 2005, pages 242-249, November 2005.
- Jimeng Sun, Spiros Papadimitriou, Philip Yu. Window-based Tensor Analysis on High-dimensional and Multi-aspect Streams. Proc. of the Int. Conf. on Data Mining (ICDM), Hong Kong, China, Dec 2006.