Multi-Way Compressed Sensing for Big Tensor Data
1 Multi-Way Compressed Sensing for Big Tensor Data
Nikos Sidiropoulos, Dept. ECE, University of Minnesota
MIIS, July 1, 2013
2 STAR Group, Collaborators, Credits
Signal and Tensor Analytics Research (STAR) group: signal processing, big data, preference measurement, cognitive radio, spectrum sensing.
Christos Faloutsos, Tom Mitchell, George Karypis (UMN), NSF-NIH/BIGDATA: Big Tensor Mining: Theory, Scalable Algorithms and Applications.
Vaggelis Papalexakis (CMU), Tasos Kyrillidis (EPFL).
3 Tensor? What is this?
- Has a different formal meaning in physics (spin, symmetries).
- Informally adopted in CS as shorthand for a three-way array: a dataset X indexed by three indices, with (i, j, k)-th entry X(i, j, k).
- For two vectors a (I × 1) and b (J × 1), the outer product a ∘ b is an I × J rank-one matrix with (i, j)-th element a(i)b(j); i.e., a ∘ b = ab^T.
- For three vectors a (I × 1), b (J × 1), c (K × 1), a ∘ b ∘ c is an I × J × K rank-one three-way array with (i, j, k)-th element a(i)b(j)c(k).
- The rank of a three-way array X is the smallest number of outer products needed to synthesize X.
Curiosities:
- Two-way (I × J): row-rank = column-rank = rank ≤ min(I, J); three-way: row-rank, column-rank, "tube"-rank, and rank can all differ.
- Two-way: rank(randn(I,J)) = min(I,J) w.p. 1; three-way: rank(randn(2,2,2)) is a random variable (2 w.p. π/4 ≈ 0.79, 3 w.p. ≈ 0.21).
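These definitions are easy to make concrete in numpy; a minimal sketch (the outer product ∘ is built with einsum):

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 4, 5, 6
a, b, c = rng.standard_normal(I), rng.standard_normal(J), rng.standard_normal(K)

# Rank-one three-way array: (i,j,k)-th element is a(i)*b(j)*c(k)
X = np.einsum('i,j,k->ijk', a, b, c)
assert np.allclose(X[1, 2, 3], a[1] * b[2] * c[3])

# Two-way sanity check: the outer product a ∘ b equals a b^T
assert np.allclose(np.einsum('i,j->ij', a, b), np.outer(a, b))

# A random I × J matrix has full rank min(I, J) with probability 1
M = rng.standard_normal((I, J))
assert np.linalg.matrix_rank(M) == min(I, J)
```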
4 CMU / Tom Mitchell
- Crawl the web, learn language like children do: encounter new concepts, learn from context.
- NELL triplets of subject-verb-object naturally lead to a 3-mode tensor (subject × object × verb):
X ≈ a_1 ∘ b_1 ∘ c_1 + a_2 ∘ b_2 ∘ c_2 + ... + a_F ∘ b_F ∘ c_F.
- Each rank-one factor corresponds to a concept, e.g., "leaders" or "tools".
- E.g., say (a_1, b_1, c_1) corresponds to "leaders": subjects/rows with a high score on a_1 will be Obama, Merkel, Steve Jobs; objects/columns with a high score on b_1 will be USA, Germany, Apple Inc.; and verbs/fibers with a high score on c_1 will be verbs like "leads", "is-president-of", and "is-ceo-of".
5 Semantic analysis of brain fMRI data
- fMRI × semantic category scores.
- The fMRI mode is vectorized (O( )).
- Could treat it as three separate spatial modes: a 5-way array...
- ... or even include time as another dimension: a 6-way array.
6 Kronecker and Khatri-Rao products
⊗ stands for the Kronecker product: for a 2 × 2 matrix A,
A ⊗ B = [ A(1,1)B  A(1,2)B ; A(2,1)B  A(2,2)B ].
⊙ stands for the Khatri-Rao (column-wise Kronecker) product: given A (I × F) and B (J × F), A ⊙ B is the JI × F matrix
A ⊙ B = [ A(:,1) ⊗ B(:,1)  ...  A(:,F) ⊗ B(:,F) ].
Useful identities: vec(ABC) = (C^T ⊗ A) vec(B); if D = diag(d), then vec(ADC) = (C^T ⊙ A) d.
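Both identities can be checked numerically; a sketch assuming SciPy's `scipy.linalg.khatri_rao` and column-major (Fortran-order) vectorization:

```python
import numpy as np
from scipy.linalg import khatri_rao

rng = np.random.default_rng(1)
I, F, K = 3, 4, 5
A = rng.standard_normal((I, F))
B = rng.standard_normal((F, F))
C = rng.standard_normal((F, K))
d = rng.standard_normal(F)

vec = lambda M: M.flatten(order='F')  # column-major vectorization

# vec(ABC) = (C^T ⊗ A) vec(B)
assert np.allclose(vec(A @ B @ C), np.kron(C.T, A) @ vec(B))

# If D = diag(d): vec(ADC) = (C^T ⊙ A) d, with ⊙ the Khatri-Rao product
assert np.allclose(vec(A @ np.diag(d) @ C), khatri_rao(C.T, A) @ d)
```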
7 Tucker3
I × J × K three-way array X; A: I × L, B: J × M, C: K × N mode loading matrices; G: L × M × N Tucker3 core.
8 Tucker3, continued
Consider an I × J × K three-way array X comprising K matrix slabs {X_k}, k = 1, ..., K, arranged into the matrix X := [vec(X_1), ..., vec(X_K)]. The Tucker3 model can be written as X ≈ (B ⊗ A) G C^T, where A, B, C are the three mode loading matrices, assumed orthogonal without loss of generality, and G is the so-called Tucker3 core tensor G recast in matrix form. The nonzero elements of the core tensor determine the interactions between columns of A, B, C. The associated model-fitting problem is
min_{A,B,C,G} || X − (B ⊗ A) G C^T ||_F^2,
which is usually solved using an alternating least squares procedure. The Tucker3 model can be fully vectorized as vec(X) ≈ (C ⊗ B ⊗ A) vec(G).
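To pin down the matricization conventions, the following sketch synthesizes a Tucker3 tensor and verifies both identities (column-major vec assumed throughout):

```python
import numpy as np

rng = np.random.default_rng(2)
I, J, K = 4, 5, 6
L, M, N = 2, 3, 2
A = rng.standard_normal((I, L))
B = rng.standard_normal((J, M))
C = rng.standard_normal((K, N))
G = rng.standard_normal((L, M, N))  # Tucker3 core

# Synthesize X(i,j,k) = sum_{l,m,n} A(i,l) B(j,m) C(k,n) G(l,m,n)
X = np.einsum('il,jm,kn,lmn->ijk', A, B, C, G)

# Slab-wise matricization X = [vec(X_1), ..., vec(X_K)] (column-major vec)
Xmat = X.reshape(I * J, K, order='F')
Gmat = G.reshape(L * M, N, order='F')

assert np.allclose(Xmat, np.kron(B, A) @ Gmat @ C.T)  # X = (B ⊗ A) G C^T
assert np.allclose(X.flatten('F'),
                   np.kron(C, np.kron(B, A)) @ G.flatten('F'))  # vec(X) = (C ⊗ B ⊗ A) vec(G)
```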
9 Low-rank tensor decomposition / approximation
X ≈ Σ_{f=1}^{F} a_f ∘ b_f ∘ c_f.
Parallel factor analysis (PARAFAC) model [Harshman 70-72], a.k.a. canonical decomposition [Carroll & Chang, 70], a.k.a. CP; cf. [Hitchcock, 27].
PARAFAC can be written as a system of matrix equations X_k ≈ A D_k(C) B^T, where D_k(C) is a diagonal matrix holding the k-th row of C on its diagonal; or in compact matrix form as X ≈ (B ⊙ A) C^T, using the Khatri-Rao product. In particular, employing a property of the Khatri-Rao product,
X ≈ (B ⊙ A) C^T  ⇒  vec(X) ≈ (C ⊙ B ⊙ A) 1,
where 1 is a vector of all 1's.
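The same kind of numerical check works for PARAFAC; a sketch (SciPy's `khatri_rao`, column-major vec) verifying both Khatri-Rao forms on a synthetic rank-F tensor:

```python
import numpy as np
from scipy.linalg import khatri_rao

rng = np.random.default_rng(3)
I, J, K, F = 4, 5, 6, 3
A = rng.standard_normal((I, F))
B = rng.standard_normal((J, F))
C = rng.standard_normal((K, F))

# Rank-F PARAFAC tensor: X = sum_f a_f ∘ b_f ∘ c_f
X = np.einsum('if,jf,kf->ijk', A, B, C)

Xmat = X.reshape(I * J, K, order='F')  # [vec(X_1), ..., vec(X_K)]
assert np.allclose(Xmat, khatri_rao(B, A) @ C.T)  # X = (B ⊙ A) C^T
assert np.allclose(X.flatten('F'),
                   khatri_rao(C, khatri_rao(B, A)) @ np.ones(F))  # vec(X) = (C ⊙ B ⊙ A) 1
```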
10 Uniqueness
The distinguishing feature of the PARAFAC model is its essential uniqueness: under certain conditions, (A, B, C) can be identified from X, i.e., they are unique up to permutation and scaling of columns [Kruskal 77, Sidiropoulos et al 00-07, de Lathauwer 04-, Stegeman 06-].
Consider an I × J × K tensor X of rank F. In vectorized form, it can be written as the IJK × 1 vector x = (A ⊙ B ⊙ C) 1, for some A (I × F), B (J × F), and C (K × F) - a PARAFAC model of size I × J × K and order F, parameterized by (A, B, C).
The Kruskal-rank of A, denoted k_A, is the maximum k such that any k columns of A are linearly independent (k_A ≤ r_A := rank(A)).
Given X (↔ x), if k_A + k_B + k_C ≥ 2F + 2, then (A, B, C) are unique up to a common column permutation and scaling.
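For small matrices the Kruskal-rank can be computed by brute force over column subsets; a sketch (hypothetical helper `kruskal_rank`, cost exponential in F):

```python
import numpy as np
from itertools import combinations

def kruskal_rank(A, tol=1e-10):
    """Largest k such that EVERY set of k columns of A is linearly independent."""
    F = A.shape[1]
    k = 0
    for r in range(1, F + 1):
        if all(np.linalg.matrix_rank(A[:, list(cols)], tol=tol) == r
               for cols in combinations(range(F), r)):
            k = r
        else:
            break
    return k

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 3))      # generic: k_A = min(I, F) = 3
assert kruskal_rank(A) == 3

# Make the third column a linear combination of the first two: k drops to 2
B = np.c_[A[:, :2], A[:, 0] + A[:, 1]]
assert kruskal_rank(B) == 2
```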
11 Special case: Two slabs
Assume square/tall, full column rank A (I × F) and B (J × F), and distinct nonzero elements along the diagonal of D:
X_1 = A B^T,  X_2 = A D B^T.
Introduce the compact F-component SVD of the stacked slabs:
[ X_1 ; X_2 ] = [ A ; AD ] B^T = U Σ V^H.
rank(B) = F implies that span(U) = span([ A ; AD ]); hence there exists a nonsingular matrix P such that
U = [ U_1 ; U_2 ] = [ A ; AD ] P.
12 Special case: Two slabs, continued
Next, construct the auto- and cross-product matrices
R_1 = U_1^H U_1 = P^H A^H A P =: QP,
R_2 = U_1^H U_2 = P^H A^H A D P = QDP.
All matrices on the right-hand side are square and full rank. Solving the first equation for Q and substituting into the second yields the eigenvalue problem
(R_1^{-1} R_2) P^{-1} = P^{-1} D.
P^{-1} (and P) are recovered up to permutation and scaling; from the bottom of the previous slide, A is likewise recovered, then B.
Starts making sense...
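A small numerical sketch of this two-slab recovery, assuming real-valued data (so Hermitian transposes reduce to transposes):

```python
import numpy as np

rng = np.random.default_rng(5)
I, J, F = 6, 5, 3
A = rng.standard_normal((I, F))
B = rng.standard_normal((J, F))
d = np.array([1.0, 2.0, 3.0])        # distinct nonzero diagonal of D

X1 = A @ B.T
X2 = A @ np.diag(d) @ B.T

# Compact F-component SVD of the stacked slabs [X1; X2]
U, s, Vh = np.linalg.svd(np.vstack([X1, X2]), full_matrices=False)
U1, U2 = U[:I, :F], U[I:, :F]

# Eigenvalue problem: (R1^{-1} R2) P^{-1} = P^{-1} D
R1 = U1.T @ U1
R2 = U1.T @ U2
eigvals, Pinv = np.linalg.eig(np.linalg.solve(R1, R2))

# Eigenvalues recover the diagonal of D (up to permutation)
assert np.allclose(np.sort(eigvals.real), np.sort(d))

# A is recovered as U1 P^{-1}, up to column permutation and scaling
A_hat = U1 @ Pinv.real
for f in range(F):
    cos = np.abs(A.T @ A_hat[:, f]) / (np.linalg.norm(A, axis=0)
                                       * np.linalg.norm(A_hat[:, f]))
    assert np.max(cos) > 1 - 1e-6    # some column of A is parallel to A_hat[:, f]
```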
13 Alternating Least Squares (ALS)
Based on the matrix view: X_(KJ×I) = (B ⊙ C) A^T.
Given interim estimates of B, C, solve for the conditional LS update of A:
A_CLS = ((B ⊙ C)^† X_(KJ×I))^T.
Similarly for the CLS updates of B, C (symmetry); repeat in a circular fashion until the cost converges.
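A minimal CP-ALS sketch (hypothetical helper `cp_als`; it uses the standard mode-n unfoldings, which differ from the slide's X_(KJ×I) arrangement only by transposition/reordering, and omits normalization and convergence checks):

```python
import numpy as np
from scipy.linalg import khatri_rao

def cp_als(X, F, n_iter=500, seed=0):
    """Plain CP-ALS for a 3-way array X (illustrative sketch only)."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((I, F))
    B = rng.standard_normal((J, F))
    C = rng.standard_normal((K, F))
    # Mode-n unfoldings (column-major convention): X_(1) = A (C ⊙ B)^T, etc.
    X1 = X.reshape(I, J * K, order='F')
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K, order='F')
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J, order='F')
    for _ in range(n_iter):
        A = X1 @ np.linalg.pinv(khatri_rao(C, B)).T   # conditional LS update of A
        B = X2 @ np.linalg.pinv(khatri_rao(C, A)).T   # ... of B
        C = X3 @ np.linalg.pinv(khatri_rao(B, A)).T   # ... of C
    return A, B, C

# Noiseless rank-2 sanity check
rng = np.random.default_rng(1)
A0 = rng.standard_normal((5, 2))
B0 = rng.standard_normal((6, 2))
C0 = rng.standard_normal((7, 2))
X = np.einsum('if,jf,kf->ijk', A0, B0, C0)
Ah, Bh, Ch = cp_als(X, 2)
Xh = np.einsum('if,jf,kf->ijk', Ah, Bh, Ch)
rel_err = np.linalg.norm(Xh - X) / np.linalg.norm(X)
```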
14 Heavy-tailed noise, sparse residuals
Instead of LS, use a least-absolute-deviations conditional update, solved as an LP:
min_A || X_(KJ×I) − (B ⊙ C) A^T ||_1.
Almost as good: coordinate-wise updates using weighted median filtering (very cheap!).
Vorobyov, Rong, Sidiropoulos, Gershman, 2005.
15 Big data: need for compression
- Tensors can easily become really big! Size is exponential in the number of dimensions ("ways", or "modes").
- Cannot load in main memory; can reside in cloud storage.
- Tensor compression?
- Commonly used compression method for moderate-size tensors: fit an orthogonal Tucker3 model, regress the data onto the fitted mode bases. Lossless if exact mode bases are used [CANDELINC]; but Tucker3 fitting is itself cumbersome for big tensors (big matrix SVDs), and cannot compress below the mode ranks without introducing errors.
- If the tensor is sparse, it can be stored as [i, j, k, value] tuples + specialized sparse matrix/tensor algorithms [(Sparse) Tensor Toolbox, Bader & Kolda]. Useful if the sparse representation can fit in main memory.
16 Tensor compression
Consider compressing x into y = Sx, where S is d × IJK, d ≪ IJK. In particular, consider the specially structured compression matrix
S = U^T ⊗ V^T ⊗ W^T.
This corresponds to multiplying (every slab of) X from the I-mode with U^T, from the J-mode with V^T, and from the K-mode with W^T, where U is I × L, V is J × M, and W is K × N, with L ≤ I, M ≤ J, N ≤ K, and LMN ≪ IJK.
[figure: the I × J × K tensor X is compressed to an L × M × N tensor Y]
17 Key
Due to a property of the Kronecker product,
(U^T ⊗ V^T ⊗ W^T)(A ⊙ B ⊙ C) = (U^T A) ⊙ (V^T B) ⊙ (W^T C),
from which it follows that
y = ((U^T A) ⊙ (V^T B) ⊙ (W^T C)) 1 = (Ã ⊙ B̃ ⊙ C̃) 1,
i.e., the compressed data follow a PARAFAC model of size L × M × N and order F, parameterized by (Ã, B̃, C̃), with Ã := U^T A, B̃ := V^T B, C̃ := W^T C.
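This mixed-product identity is easy to verify numerically; a sketch assuming SciPy's `khatri_rao`:

```python
import numpy as np
from scipy.linalg import khatri_rao

rng = np.random.default_rng(6)
I, J, K, F = 5, 6, 7, 3
L, M, N = 3, 3, 3
A, B, C = (rng.standard_normal(s) for s in [(I, F), (J, F), (K, F)])
U, V, W = (rng.standard_normal(s) for s in [(I, L), (J, M), (K, N)])

x = khatri_rao(A, khatri_rao(B, C)) @ np.ones(F)  # uncompressed vectorized tensor
S = np.kron(U.T, np.kron(V.T, W.T))               # multi-way compression matrix
y = S @ x

# Compressed data follow a small PARAFAC model parameterized by (U^T A, V^T B, W^T C)
y2 = khatri_rao(U.T @ A, khatri_rao(V.T @ B, W.T @ C)) @ np.ones(F)
assert np.allclose(y, y2)
```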
18 Random multi-way compression can be better!
Sidiropoulos & Kyrillidis, IEEE SPL, Oct. 2012.
Assume that the columns of A, B, C are sparse, and let n_a (n_b, n_c) be an upper bound on the number of nonzero elements per column of A (respectively B, C). Let the mode-compression matrices U (I × L, L ≤ I), V (J × M, M ≤ J), and W (K × N, N ≤ K) be randomly drawn from an absolutely continuous distribution with respect to the Lebesgue measure in R^{IL}, R^{JM}, and R^{KN}, respectively. If
min(L, k_A) + min(M, k_B) + min(N, k_C) ≥ 2F + 2, L ≥ 2n_a, M ≥ 2n_b, and N ≥ 2n_c,
then the original factor loadings A, B, C are almost surely identifiable from the compressed data.
19 Proof rests on two lemmas + Kruskal
Lemma 1: Consider Ã := U^T A, where A is I × F, and let the I × L matrix U be randomly drawn from an absolutely continuous distribution with respect to the Lebesgue measure in R^{IL} (e.g., multivariate Gaussian with a nonsingular covariance matrix). Then k_Ã = min(L, k_A) almost surely (with probability 1).
Lemma 2: Consider Ã := U^T A, where Ã and U are given and A is sought. Suppose that every column of A has at most n_a nonzero elements, and that k_{U^T} ≥ 2n_a. (The latter holds with probability 1 if the I × L matrix U is randomly drawn from an absolutely continuous distribution with respect to the Lebesgue measure in R^{IL}, and min(I, L) ≥ 2n_a.) Then A is the unique solution with at most n_a nonzero elements per column [Donoho & Elad, 03].
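A toy illustration of Lemma 2: when L ≥ 2n_a, the n_a-sparse column is the unique sparse solution, found here by brute-force support enumeration (illustration only; at scale one would use an l1 solver instead):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(7)
I, L, na = 8, 4, 2                  # L >= 2*na, so an na-sparse column is identifiable
U = rng.standard_normal((I, L))

a = np.zeros(I)
a[[1, 5]] = [1.0, -2.0]             # na-sparse ground-truth column
atilde = U.T @ a                    # compressed column

# Enumerate all supports of size na; generically only the true support fits exactly
solutions = []
for supp in combinations(range(I), na):
    Us = U[list(supp), :].T         # L x na subsystem U^T[:, supp]
    z, *_ = np.linalg.lstsq(Us, atilde, rcond=None)
    if np.linalg.norm(Us @ z - atilde) < 1e-9:
        x = np.zeros(I)
        x[list(supp)] = z
        solutions.append(x)

assert len(solutions) == 1          # the na-sparse solution is unique
assert np.allclose(solutions[0], a)
```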
20 Complexity
First fitting PARAFAC in the compressed space and then recovering the sparse A, B, C from the fitted compressed factors entails complexity O(LMNF + (I^{3.5} + J^{3.5} + K^{3.5})F). Using sparsity first and then fitting PARAFAC in the raw space entails complexity O(IJKF + (IJK)^{3.5}) - the difference is huge. Also note that the proposed approach does not require computations in the uncompressed data domain, which is important for big data that do not fit in memory for processing.
21 Further compression - down to O(√F) in 2/3 modes
Sidiropoulos & Kyrillidis, IEEE SPL, Oct. 2012.
Assume that the columns of A, B, C are sparse, and let n_a (n_b, n_c) be an upper bound on the number of nonzero elements per column of A (respectively B, C). Let the mode-compression matrices U (I × L, L ≤ I), V (J × M, M ≤ J), and W (K × N, N ≤ K) be randomly drawn from an absolutely continuous distribution with respect to the Lebesgue measure in R^{IL}, R^{JM}, and R^{KN}, respectively. If
r_A = r_B = r_C = F, L(L−1)M(M−1) ≥ 2F(F−1), N ≥ F, L ≥ 2n_a, M ≥ 2n_b, and N ≥ 2n_c,
then the original factor loadings A, B, C are almost surely identifiable from the compressed data up to a common column permutation and scaling.
22 Proof: Lemma 3 + results on a.s. identifiability of PARAFAC
Lemma 3: Consider Ã = U^T A, where A (I × F) is deterministic, tall/square (I ≥ F) and full column rank (r_A = F), and the elements of U (I × L) are i.i.d. Gaussian, zero-mean, unit-variance random variables. Then the distribution of Ã is nonsingular multivariate Gaussian.
From [Stegeman, ten Berge, de Lathauwer 2006] (see also [Jiang, Sidiropoulos 2004]), we know that PARAFAC is almost surely identifiable if the loading matrices Ã, B̃ are randomly drawn from an absolutely continuous distribution with respect to the Lebesgue measure in R^{(L+M)F}, C̃ is full column rank, and L(L−1)M(M−1) ≥ 2F(F−1).
23 Generalization to higher-way arrays
Theorem 3: Let x = (A_1 ⊙ ... ⊙ A_δ) 1 ∈ R^{∏_{d=1}^{δ} I_d}, where A_d is I_d × F, and consider compressing it to
y = (U_1^T ⊗ ... ⊗ U_δ^T) x = ((U_1^T A_1) ⊙ ... ⊙ (U_δ^T A_δ)) 1 = (Ã_1 ⊙ ... ⊙ Ã_δ) 1 ∈ R^{∏_{d=1}^{δ} L_d},
where the mode-compression matrices U_d (I_d × L_d, L_d ≤ I_d) are randomly drawn from an absolutely continuous distribution with respect to the Lebesgue measure in R^{I_d L_d}. Assume that the columns of A_d are sparse, and let n_d be an upper bound on the number of nonzero elements per column of A_d, for each d ∈ {1, ..., δ}. If
∑_{d=1}^{δ} min(L_d, k_{A_d}) ≥ 2F + δ − 1, and L_d ≥ 2n_d for all d ∈ {1, ..., δ},
then the original factor loadings {A_d}, d = 1, ..., δ, are almost surely identifiable from the compressed data y up to a common column permutation and scaling.
Various additional results are possible, e.g., a generalization of Theorem 2.
24 Latest on PARAFAC uniqueness
Luca Chiantini and Giorgio Ottaviani, "On Generic Identifiability of 3-Tensors of Small Rank," SIAM J. Matrix Anal. & Appl., 33(3), 2012:
- Consider an I × J × K tensor X of rank F, and order the dimensions so that I ≤ J ≤ K.
- Let i be maximal such that 2^i ≤ I, and likewise j maximal such that 2^j ≤ J.
- If F ≤ 2^{i+j−2}, then X has a unique decomposition almost surely.
- For I, J powers of 2, the condition simplifies to F ≤ IJ/4.
- More generally, the condition implies: if F ≤ (I+1)(J+1)/16, then X has a unique decomposition almost surely.
25 Even further compression
Assume that the columns of A, B, C are sparse, and let n_a (n_b, n_c) be an upper bound on the number of nonzero elements per column of A (respectively B, C). Let the mode-compression matrices U (I × L, L ≤ I), V (J × M, M ≤ J), and W (K × N, N ≤ K) be randomly drawn from an absolutely continuous distribution with respect to the Lebesgue measure in R^{IL}, R^{JM}, and R^{KN}, respectively. Assume L ≤ M ≤ N, and that L, M are powers of 2, for simplicity. If
r_A = r_B = r_C = F, LM ≥ 4F, N ≥ M ≥ L, and L ≥ 2n_a, M ≥ 2n_b, N ≥ 2n_c,
then the original factor loadings A, B, C are almost surely identifiable from the compressed data up to a common column permutation and scaling.
This allows compression down to order √F in all three modes.
26 What if A, B, C are not sparse?
If A, B, C are sparse with respect to known bases, i.e., A = R Ǎ, B = S B̌, and C = T Č, with R, S, T the respective sparsifying bases and Ǎ, B̌, Č sparse, then the previous results carry over under appropriate conditions, e.g., when R, S, T are nonsingular.
27 PARCUBE: Parallel sampling-based tensor decomposition
Papalexakis, Faloutsos, Sidiropoulos, ECML-PKDD 2012.
[figure: randomly sampled sub-tensors are decomposed in parallel and their factors merged]
- Challenge: different permutations, scaling.
- Anchor in a small common sample.
- Hadoop implementation.
- 100-fold improvement (size/speedup).
28 Road ahead
- Important first steps / results pave the way, but we have merely scratched the surface.
- Randomized tensor algorithms based on generalized sampling.
- Other models? Rate-distortion theory for big tensor data compression?
- Statistically and computationally efficient algorithms - a big open issue.
- Distributed computations - not all data reside in one place - Hadoop / multicore.
- Statistical inference for big tensors.
- Applications.
Chapter 1 SVD, PCA & Preprocessing Part 2: Pre-processing and selecting the rank Pre-processing Skillicorn chapter 3.1 2 Why pre-process? Consider matrix of weather data Monthly temperatures in degrees
More informationLearning goals: students learn to use the SVD to find good approximations to matrices and to compute the pseudoinverse.
Application of the SVD: Compression and Pseudoinverse Learning goals: students learn to use the SVD to find good approximations to matrices and to compute the pseudoinverse. Low rank approximation One
More informationLinear Algebra V = T = ( 4 3 ).
Linear Algebra Vectors A column vector is a list of numbers stored vertically The dimension of a column vector is the number of values in the vector W is a -dimensional column vector and V is a 5-dimensional
More informationWindow-based Tensor Analysis on High-dimensional and Multi-aspect Streams
Window-based Tensor Analysis on High-dimensional and Multi-aspect Streams Jimeng Sun Spiros Papadimitriou Philip S. Yu Carnegie Mellon University Pittsburgh, PA, USA IBM T.J. Watson Research Center Hawthorne,
More informationThis can be accomplished by left matrix multiplication as follows: I
1 Numerical Linear Algebra 11 The LU Factorization Recall from linear algebra that Gaussian elimination is a method for solving linear systems of the form Ax = b, where A R m n and bran(a) In this method
More information.. CSC 566 Advanced Data Mining Alexander Dekhtyar..
.. CSC 566 Advanced Data Mining Alexander Dekhtyar.. Information Retrieval Latent Semantic Indexing Preliminaries Vector Space Representation of Documents: TF-IDF Documents. A single text document is a
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)
AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) Lecture 1: Course Overview; Matrix Multiplication Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical
More informationOrthogonal tensor decomposition
Orthogonal tensor decomposition Daniel Hsu Columbia University Largely based on 2012 arxiv report Tensor decompositions for learning latent variable models, with Anandkumar, Ge, Kakade, and Telgarsky.
More information14.2 QR Factorization with Column Pivoting
page 531 Chapter 14 Special Topics Background Material Needed Vector and Matrix Norms (Section 25) Rounding Errors in Basic Floating Point Operations (Section 33 37) Forward Elimination and Back Substitution
More informationSingular Value Decompsition
Singular Value Decompsition Massoud Malek One of the most useful results from linear algebra, is a matrix decomposition known as the singular value decomposition It has many useful applications in almost
More informationLecture 9: Numerical Linear Algebra Primer (February 11st)
10-725/36-725: Convex Optimization Spring 2015 Lecture 9: Numerical Linear Algebra Primer (February 11st) Lecturer: Ryan Tibshirani Scribes: Avinash Siravuru, Guofan Wu, Maosheng Liu Note: LaTeX template
More informationStructure in Data. A major objective in data analysis is to identify interesting features or structure in the data.
Structure in Data A major objective in data analysis is to identify interesting features or structure in the data. The graphical methods are very useful in discovering structure. There are basically two
More informationCS281 Section 4: Factor Analysis and PCA
CS81 Section 4: Factor Analysis and PCA Scott Linderman At this point we have seen a variety of machine learning models, with a particular emphasis on models for supervised learning. In particular, we
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 1: Course Overview & Matrix-Vector Multiplication Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 20 Outline 1 Course
More informationMatrix decompositions
Matrix decompositions Zdeněk Dvořák May 19, 2015 Lemma 1 (Schur decomposition). If A is a symmetric real matrix, then there exists an orthogonal matrix Q and a diagonal matrix D such that A = QDQ T. The
More informationLinear Algebra (Review) Volker Tresp 2018
Linear Algebra (Review) Volker Tresp 2018 1 Vectors k, M, N are scalars A one-dimensional array c is a column vector. Thus in two dimensions, ( ) c1 c = c 2 c i is the i-th component of c c T = (c 1, c
More informationPostgraduate Course Signal Processing for Big Data (MSc)
Postgraduate Course Signal Processing for Big Data (MSc) Jesús Gustavo Cuevas del Río E-mail: gustavo.cuevas@upm.es Work Phone: +34 91 549 57 00 Ext: 4039 Course Description Instructor Information Course
More informationThis work has been submitted to ChesterRep the University of Chester s online research repository.
This work has been submitted to ChesterRep the University of Chester s online research repository http://chesterrep.openrepository.com Author(s): Daniel Tock Title: Tensor decomposition and its applications
More informationThe Canonical Tensor Decomposition and Its Applications to Social Network Analysis
The Canonical Tensor Decomposition and Its Applications to Social Network Analysis Evrim Acar, Tamara G. Kolda and Daniel M. Dunlavy Sandia National Labs Sandia is a multiprogram laboratory operated by
More informationImage Registration Lecture 2: Vectors and Matrices
Image Registration Lecture 2: Vectors and Matrices Prof. Charlene Tsai Lecture Overview Vectors Matrices Basics Orthogonal matrices Singular Value Decomposition (SVD) 2 1 Preliminary Comments Some of this
More informationIntroduction to the Tensor Train Decomposition and Its Applications in Machine Learning
Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning Anton Rodomanov Higher School of Economics, Russia Bayesian methods research group (http://bayesgroup.ru) 14 March
More informationA Simpler Approach to Low-Rank Tensor Canonical Polyadic Decomposition
A Simpler Approach to Low-ank Tensor Canonical Polyadic Decomposition Daniel L. Pimentel-Alarcón University of Wisconsin-Madison Abstract In this paper we present a simple and efficient method to compute
More informationDefinition 2.3. We define addition and multiplication of matrices as follows.
14 Chapter 2 Matrices In this chapter, we review matrix algebra from Linear Algebra I, consider row and column operations on matrices, and define the rank of a matrix. Along the way prove that the row
More informationsparse and low-rank tensor recovery Cubic-Sketching
Sparse and Low-Ran Tensor Recovery via Cubic-Setching Guang Cheng Department of Statistics Purdue University www.science.purdue.edu/bigdata CCAM@Purdue Math Oct. 27, 2017 Joint wor with Botao Hao and Anru
More informationBasic Concepts in Matrix Algebra
Basic Concepts in Matrix Algebra An column array of p elements is called a vector of dimension p and is written as x p 1 = x 1 x 2. x p. The transpose of the column vector x p 1 is row vector x = [x 1
More informationDS-GA 1002 Lecture notes 10 November 23, Linear models
DS-GA 2 Lecture notes November 23, 2 Linear functions Linear models A linear model encodes the assumption that two quantities are linearly related. Mathematically, this is characterized using linear functions.
More informationMATRIX COMPLETION AND TENSOR RANK
MATRIX COMPLETION AND TENSOR RANK HARM DERKSEN Abstract. In this paper, we show that the low rank matrix completion problem can be reduced to the problem of finding the rank of a certain tensor. arxiv:1302.2639v2
More informationInformation Retrieval
Introduction to Information CS276: Information and Web Search Christopher Manning and Pandu Nayak Lecture 13: Latent Semantic Indexing Ch. 18 Today s topic Latent Semantic Indexing Term-document matrices
More informationConceptual Questions for Review
Conceptual Questions for Review Chapter 1 1.1 Which vectors are linear combinations of v = (3, 1) and w = (4, 3)? 1.2 Compare the dot product of v = (3, 1) and w = (4, 3) to the product of their lengths.
More informationEE731 Lecture Notes: Matrix Computations for Signal Processing
EE731 Lecture Notes: Matrix Computations for Signal Processing James P. Reilly c Department of Electrical and Computer Engineering McMaster University September 22, 2005 0 Preface This collection of ten
More informationLarge Scale Tensor Decompositions: Algorithmic Developments and Applications
Large Scale Tensor Decompositions: Algorithmic Developments and Applications Evangelos Papalexakis, U Kang, Christos Faloutsos, Nicholas Sidiropoulos, Abhay Harpale Carnegie Mellon University, KAIST, University
More informationLeast Squares. Tom Lyche. October 26, Centre of Mathematics for Applications, Department of Informatics, University of Oslo
Least Squares Tom Lyche Centre of Mathematics for Applications, Department of Informatics, University of Oslo October 26, 2010 Linear system Linear system Ax = b, A C m,n, b C m, x C n. under-determined
More informationLecture 4. Tensor-Related Singular Value Decompositions. Charles F. Van Loan
From Matrix to Tensor: The Transition to Numerical Multilinear Algebra Lecture 4. Tensor-Related Singular Value Decompositions Charles F. Van Loan Cornell University The Gene Golub SIAM Summer School 2010
More informationBOOLEAN TENSOR FACTORIZATIONS. Pauli Miettinen 14 December 2011
BOOLEAN TENSOR FACTORIZATIONS Pauli Miettinen 14 December 2011 BACKGROUND: TENSORS AND TENSOR FACTORIZATIONS X BACKGROUND: TENSORS AND TENSOR FACTORIZATIONS C X A B BACKGROUND: BOOLEAN MATRIX FACTORIZATIONS
More informationLecture 1: Introduction to low-rank tensor representation/approximation. Center for Uncertainty Quantification. Alexander Litvinenko
tifica Lecture 1: Introduction to low-rank tensor representation/approximation Alexander Litvinenko http://sri-uq.kaust.edu.sa/ KAUST Figure : KAUST campus, 5 years old, approx. 7000 people (include 1400
More informationLinear Algebra in Computer Vision. Lecture2: Basic Linear Algebra & Probability. Vector. Vector Operations
Linear Algebra in Computer Vision CSED441:Introduction to Computer Vision (2017F Lecture2: Basic Linear Algebra & Probability Bohyung Han CSE, POSTECH bhhan@postech.ac.kr Mathematics in vector space Linear
More informationBasics of Multivariate Modelling and Data Analysis
Basics of Multivariate Modelling and Data Analysis Kurt-Erik Häggblom 6. Principal component analysis (PCA) 6.1 Overview 6.2 Essentials of PCA 6.3 Numerical calculation of PCs 6.4 Effects of data preprocessing
More informationSPARSE signal representations have gained popularity in recent
6958 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 10, OCTOBER 2011 Blind Compressed Sensing Sivan Gleichman and Yonina C. Eldar, Senior Member, IEEE Abstract The fundamental principle underlying
More informationc 2008 Society for Industrial and Applied Mathematics
SIAM J MATRIX ANAL APPL Vol 30, No 3, pp 1219 1232 c 2008 Society for Industrial and Applied Mathematics A JACOBI-TYPE METHOD FOR COMPUTING ORTHOGONAL TENSOR DECOMPOSITIONS CARLA D MORAVITZ MARTIN AND
More information