A randomized block sampling approach to the canonical polyadic decomposition of large-scale tensors


A randomized block sampling approach to the canonical polyadic decomposition of large-scale tensors. Nico Vervliet, joint work with Lieven De Lathauwer. SIAM AN17, July 13, 2017.

2 Classification of hazardous gases using e-noses
Classify 900 experiments, each containing 72 time series with 26 000 samples. [Figure: third-order data tensor with modes labeled sensor, experiment, and time.]

3 Overview: decomposing large-scale tensors, randomized block sampling, experimental results, chemo-sensing application.

4 Canonical polyadic decomposition
A sum of R rank-1 terms: $\mathcal{T} = \mathbf{a}_1 \circ \mathbf{b}_1 \circ \mathbf{c}_1 + \cdots + \mathbf{a}_R \circ \mathbf{b}_R \circ \mathbf{c}_R$.
Mathematically, for a general Nth-order tensor $\mathcal{T}$:
$$\mathcal{T} = \sum_{r=1}^{R} \mathbf{a}_r^{(1)} \circ \mathbf{a}_r^{(2)} \circ \cdots \circ \mathbf{a}_r^{(N)} = [\![\mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \ldots, \mathbf{A}^{(N)}]\!]$$
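To make the model concrete, here is a minimal NumPy sketch (not code from the talk) that materializes $[\![\mathbf{A}^{(1)}, \ldots, \mathbf{A}^{(N)}]\!]$ as a sum of R outer products; cpd_reconstruct and the example factors are illustrative names, not Tensorlab functions.

```python
import numpy as np

def cpd_reconstruct(factors):
    """Full tensor [[A(1), ..., A(N)]] from a list of I_n x R factor matrices."""
    R = factors[0].shape[1]
    T = np.zeros(tuple(A.shape[0] for A in factors))
    for r in range(R):
        # outer product a_r^(1) o a_r^(2) o ... o a_r^(N) of the r-th columns
        term = factors[0][:, r]
        for A in factors[1:]:
            term = np.multiply.outer(term, A[:, r])
        T += term
    return T

# Example: a rank-3 CPD of a 4 x 5 x 6 tensor
rng = np.random.default_rng(0)
T = cpd_reconstruct([rng.standard_normal((I, 3)) for I in (4, 5, 6)])
```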

5 Computing a CPD
Optimization problem:
$$\min_{\mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \ldots, \mathbf{A}^{(N)}} \frac{1}{2} \left\| \mathcal{T} - [\![\mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \ldots, \mathbf{A}^{(N)}]\!] \right\|_F^2$$
Algorithms:
- alternating least squares
- CPOPT [Acar et al. 2011a]
- (damped) Gauss-Newton [Phan et al. 2013]
- (inexact) nonlinear least squares [Sorber et al. 2013]
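For reference, a minimal sketch of one alternating least squares sweep, the first algorithm in the list above: each factor is updated by solving a linear least-squares problem with the other factors fixed. This is textbook ALS under the usual mode-n unfolding convention, not the restricted variant introduced later; unfold and als_sweep are illustrative names.

```python
import numpy as np
from functools import reduce
from scipy.linalg import khatri_rao

def unfold(T, n):
    """Mode-n unfolding; remaining modes kept in C order."""
    return np.moveaxis(T, n, 0).reshape(T.shape[n], -1)

def als_sweep(T, factors):
    """One ALS sweep: update each factor by linear least squares."""
    for n in range(T.ndim):
        others = [A for m, A in enumerate(factors) if m != n]
        V = reduce(khatri_rao, others)                      # Khatri-Rao product
        W = reduce(np.multiply, (A.T @ A for A in others))  # Hadamard product of Gramians
        factors[n] = unfold(T, n) @ V @ np.linalg.pinv(W)
    return factors
```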

6 Curse of dimensionality
Suppose $\mathcal{T} \in \mathbb{C}^{I \times I \times \cdots \times I}$ is of order N. Then:
- number of entries: $I^N$
- memory and time complexity: $\mathcal{O}(I^N)$
- number of variables: $NIR$
Example [Vervliet et al. 2014]: a ninth-order tensor with I = 100 and rank R = 5 has $10^{18}$ entries but only 4500 variables.
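The gap between entries and variables in the example can be checked directly:

```python
# Entries versus variables for the ninth-order example above.
I, N, R = 100, 9, 5
print(I ** N)     # 1_000_000_000_000_000_000 entries (10^18)
print(N * I * R)  # 4500 variables in the CPD
```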

7 How to handle large tensors?
- Use incomplete tensors [Acar et al. 2011b; Vervliet et al. 2014; Vervliet et al. 2016a]
- Exploit sparsity [Kang et al. 2012; Papalexakis et al. 2012; Bader and Kolda 2007]
- Compress the tensor [Sidiropoulos et al. 2014; Oseledets and Tyrtyshnikov 2010; Vervliet et al. 2016b]
- Decompose subtensors and combine the results [Papalexakis et al. 2012; Phan and Cichocki 2011]
- Parallelize [Liavas and Sidiropoulos 2015], possibly combined with many of the above

8 Overview: decomposing large-scale tensors, randomized block sampling, experimental results, chemo-sensing application.

9 Randomized block sampling CPD: idea
[Figure: take a random sample block from the tensor, initialize from the current factors, compute one step on the block, and update the corresponding part of the factors.]

10 Randomized block sampling CPD: algorithm
input : data $\mathcal{T}$ and initial guess $\mathbf{A}^{(n)}$, n = 1, ..., N
output: $\mathbf{A}^{(n)}$, n = 1, ..., N such that $\mathcal{T} \approx [\![\mathbf{A}^{(1)}, \ldots, \mathbf{A}^{(N)}]\!]$
while k < K and not converged do
    create a sample $\mathcal{T}_s$ and the corresponding $\mathbf{A}^{(n)}_s$, n = 1, ..., N
    let $\bar{\mathbf{A}}^{(n)}_s$ be the result of one iteration of a restricted CPD algorithm on $\mathcal{T}_s$ with initial guess $\mathbf{A}^{(n)}_s$, n = 1, ..., N, and restriction $\Delta_k$
    update the affected variables in $\mathbf{A}^{(n)}$ using $\bar{\mathbf{A}}^{(n)}_s$, n = 1, ..., N
    k ← k + 1
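Putting the loop into code: a minimal NumPy sketch, assuming the damped ALS update of ingredient 2 and a geometric step restriction as in ingredient 3 (with $\Delta_0 = \hat{\Delta}_0 = 1$ for simplicity), both detailed on the next slides. The function names, the fixed iteration budget K, and the absence of a convergence test are simplifications for illustration; this is not the Tensorlab cpd_rbs implementation.

```python
import numpy as np
from functools import reduce
from scipy.linalg import khatri_rao

def unfold(T, n):
    """Mode-n unfolding; remaining modes kept in C order."""
    return np.moveaxis(T, n, 0).reshape(T.shape[n], -1)

def rbs_cpd(T, factors, block_shape, K=500, K_search=100, alpha=0.9, Q=4, seed=0):
    """Randomized block sampling CPD with a damped ALS inner step."""
    rng = np.random.default_rng(seed)
    N = T.ndim
    perms = [rng.permutation(I) for I in T.shape]  # ingredient 1
    pos = [0] * N
    for k in range(K):
        # Sample a block: consecutive entries of each mode's permutation,
        # reshuffling a mode once its indices are exhausted.
        idx = []
        for n in range(N):
            if pos[n] + block_shape[n] > T.shape[n]:
                perms[n], pos[n] = rng.permutation(T.shape[n]), 0
            idx.append(perms[n][pos[n]:pos[n] + block_shape[n]])
            pos[n] += block_shape[n]
        Ts = T[np.ix_(*idx)]
        sub = [factors[n][idx[n]] for n in range(N)]
        # Step restriction Delta_k (ingredient 3), here with Delta_0 = 1.
        delta = 1.0 if k < K_search else alpha ** ((k - K_search) / Q)
        # One restricted ALS iteration on the block (ingredient 2),
        # then write back the affected rows of the full factors.
        for n in range(N):
            others = [A for m, A in enumerate(sub) if m != n]
            V = reduce(khatri_rao, others)
            W = reduce(np.multiply, (A.T @ A for A in others))
            sub[n] = (1 - delta) * sub[n] + delta * (unfold(Ts, n) @ V @ np.linalg.pinv(W))
            factors[n][idx[n]] = sub[n]
    return factors

# Example: recover a rank-5 CPD of a 60 x 60 x 30 tensor from 10 x 10 x 5 blocks.
rng = np.random.default_rng(1)
true = [rng.random((I, 5)) for I in (60, 60, 30)]
T = np.einsum('ir,jr,kr->ijk', *true)
est = rbs_cpd(T, [rng.random((I, 5)) for I in (60, 60, 30)], (10, 10, 5))
```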

11 Ingredient 1: randomized block sampling
For a 6 × 6 tensor and block size 3 × 2, permute the indices of each mode and take consecutive blocks:
- I₁ = {3, 1, 2, 6, 5, 4}, I₂ = {1, 2, 4, 6, 3, 5}: the first block uses rows {3, 1, 2} and columns {1, 2}, the second block rows {6, 5, 4} and columns {4, 6}
- once the indices of a mode are exhausted, that mode is reshuffled: I₁ = {6, 1, 4, 2, 5, 3}, I₂ = {1, 2, 4, 6, 3, 5}, so the third block uses rows {6, 1, 4} and columns {3, 5}

12 Ingredient 2: restricted CPD algorithm
ALS variant:
$$\mathbf{A}^{(n)}_{k+1} = (1 - \alpha)\mathbf{A}^{(n)}_k + \alpha \mathbf{T}_{(n)} \mathbf{V}^{(n)} \left(\mathbf{W}^{(n)}\right)^{-1}$$
Enforce the restriction by setting $\alpha = \Delta_k$.
NLS variant:
$$\min_{\mathbf{p}_k} \frac{1}{2} \left\| \mathrm{vec}\left(\mathcal{F}(\mathbf{x}_k)\right) - \mathbf{J}_k \mathbf{p}_k \right\|^2 \quad \text{s.t.} \quad \|\mathbf{p}_k\| \leq \Delta_k$$
in which $\mathcal{F} = \mathcal{T} - [\![\mathbf{A}^{(1)}, \ldots, \mathbf{A}^{(N)}]\!]$.

13 Ingredient 3: restriction
Use a restriction of the form
$$\Delta_k = \begin{cases} \Delta_0 & \text{if } k < K_{\mathrm{search}} \\ \hat{\Delta}_0 \, \alpha^{(k - K_{\mathrm{search}})/Q} & \text{if } k \geq K_{\mathrm{search}} \end{cases}$$
[Plot: $\Delta_k$ versus iteration k, constant up to $K_{\mathrm{search}}$ and decaying geometrically afterwards.]
Example (selecting Q): for a 100 × 100 × 100 tensor and block size 25 × 25 × 25, Q = 4.
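Written out as a function mirroring the formula above (step_bound is a hypothetical helper; $\hat{\Delta}_0$ is the level at which the decay starts):

```python
def step_bound(k, delta0, delta0_hat, alpha, K_search, Q):
    """Restriction Delta_k: constant during the search phase,
    then geometric decay with rate alpha, stretched over Q iterations."""
    if k < K_search:
        return delta0
    return delta0_hat * alpha ** ((k - K_search) / Q)
```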

14 Ingredient 4: a stopping criterion
- Function evaluation: $f_{\mathrm{val}} = \frac{1}{2} \left\| \mathcal{T} - [\![\mathbf{A}^{(1)}, \ldots, \mathbf{A}^{(N)}]\!] \right\|^2$ [Plot: $f_{\mathrm{val}}$ and the CPD error versus iteration k.]
- Step size

15 Intermezzo: Cramér-Rao bound
Uncertainty of an estimate. [Figure: Gaussian density with 68% of the mass between $-\sigma$ and $\sigma$; the CRB lower-bounds the variance, $\mathrm{CRB} \leq \sigma^2$.]
$$\mathbf{C} = \tau^2 \left(\mathbf{J}^H \mathbf{J}\right)^{-1}$$

16 Ingredient 4: Cramér-Rao bound based stopping criterion
Experimental bound: use the estimates $\mathbf{A}^{(n)}_k$; use $f_{\mathrm{val}}$ to estimate the noise $\tau$.
Stopping criterion: stop when
$$D_{\mathrm{CRB}} = \frac{1}{R \sum_n I_n} \sum_{n=1}^{N} \sum_{i=1}^{I_n} \sum_{r=1}^{R} \frac{\left(\mathbf{A}^{(n)}_k(i, r) - \mathbf{A}^{(n)}_{k - K_{\mathrm{CRB}}}(i, r)\right)^2}{\mathbf{C}^{(n)}(i, r)} \leq \gamma$$
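A sketch of this test, assuming factor snapshots taken $K_{\mathrm{CRB}}$ iterations apart and entrywise CRB estimates C[n] with the same shapes as the factors; crb_stop and gamma are illustrative names.

```python
import numpy as np

def crb_stop(factors_k, factors_prev, C, gamma):
    """True once the mean squared change per factor entry, measured
    relative to its Cramer-Rao bound C^(n)(i, r), drops below gamma."""
    total = sum(A.size for A in factors_k)  # R * sum_n I_n entries
    D = sum(np.sum((Ak - Ap) ** 2 / Cn)
            for Ak, Ap, Cn in zip(factors_k, factors_prev, C))
    return D / total <= gamma
```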

17 Unrestricted phase vs restricted phase
[Plot: CPD error versus iteration k, with phases 1, 2, and 3 marked.]
- Unrestricted phase (1 + 2): converge to a neighborhood of an optimum
- Restricted phase (3): pull the iterates towards the optimum
Assumptions: a CPD of rank R exists; the SNR is high enough; most block dimensions are greater than R.

18 Overview: decomposing large-scale tensors, randomized block sampling, experimental results, chemo-sensing application.

19 Experiment overview
Experiments:
- comparison of ALS vs NLS (see paper)
- influence of block size
- influence of step size (see paper)
Performance: 50 Monte Carlo experiments; CPD error $\max_n \|\mathbf{A}^{(n)}_0 - \mathbf{A}^{(n)}_{\mathrm{res}}\| / \|\mathbf{A}^{(n)}_0\|$.
Implemented as cpd_rbs in Tensorlab 3.0 [Vervliet et al. 2016c].
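The error measure can be sketched as follows, assuming the permutation and scaling indeterminacies between the true factors A0 and the recovered factors Ares have already been resolved (resolving them is omitted here):

```python
import numpy as np

def cpd_error(A0, Ares):
    """max_n ||A0^(n) - Ares^(n)|| / ||A0^(n)|| over the factor matrices."""
    return max(np.linalg.norm(a0 - ar) / np.linalg.norm(a0)
               for a0, ar in zip(A0, Ares))
```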

20 Influence of block size: setup
[Figure: 800 × 800 × 400 tensor written as a sum of R rank-1 terms plus a noise term N.] R = 20, factor entries drawn from U(0, 1), block size (4, 4, 2)·ν, no noise.

21 Influence of block size on computation time
[Plot: time (s), from 0 to 150, versus ν ∈ {5, 10, 20, 40, 80, full}; 800 × 800 × 400 tensor, block size (4, 4, 2)·ν, R = 20, U(0, 1) factors, no noise.]

22 Influence of block size on data accesses
[Plot: data accesses as a percentage of the full tensor (log scale, 10% to 1000%; the full tensor is 100%) versus ν ∈ {5, 10, 20, 40, 80, full}; same setup as above.]

23 Influence of block size on accuracy
[Plot: CPD error $E_{\mathrm{CPD}}$ (log scale, $10^{-2}$ to $10^{0}$) for the unrestricted and restricted variants versus ν ∈ {5, 10, 20, 40, full}; 800 × 800 × 400 tensor, block size (4, 4, 2)·ν, R = 20, U(0, 1) factors, 20 dB SNR.]

24 Overview: decomposing large-scale tensors, randomized block sampling, experimental results, chemo-sensing application.

25 Classify hazardous gases
Does the sample contain CO, acetaldehyde, or ammonia? [Figure: 26 000 × 72 × 900 tensor (time × sensor × experiment) with unknown class labels, sampled in blocks of size 100 × 36 × 100 and decomposed with R = 5.]
Strategy: classify using the coefficients of spatiotemporal patterns.

26 Classify hazardous gases: results
[Figure: resulting factor matrices for the time, sensor, and experiment modes.]
Performance after clustering:
                 Iterations | Time (s) | Error (%)
No restriction | 3000       | 60       | 5.0
Restriction    | 9000       | 170      | 0.3-0.8

27 Conclusion
The randomized block sampling CPD algorithm enables the decomposition of larger tensors using fewer data points and less memory.
- Block size controls accuracy, data accesses, and time.
- Step size restriction improves accuracy.
- The Cramér-Rao bound based stopping criterion combines noise and step information.

28 More details: N. Vervliet and L. De Lathauwer (2016). "A randomized block sampling approach to canonical polyadic decomposition of large-scale tensors." IEEE Journal of Selected Topics in Signal Processing 10.2, pp. 284-295.


References I
Acar, E., D. M. Dunlavy, and T. G. Kolda (2011a). "A scalable optimization approach for fitting canonical tensor decompositions." Journal of Chemometrics 25.2, pp. 67-86.
Acar, E., et al. (2011b). "Scalable tensor factorizations for incomplete data." Chemometrics and Intelligent Laboratory Systems 106.1, pp. 41-56.
Bader, B. W. and T. G. Kolda (2007). "Efficient MATLAB computations with sparse and factored tensors." SIAM J. Sci. Comput. 30.1, pp. 205-231.
Kang, U., et al. (2012). "GigaTensor: Scaling tensor analysis up by 100 times - algorithms and discoveries." In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, pp. 316-324.

References II
Liavas, A. and N. Sidiropoulos (2015). "Parallel algorithms for constrained tensor factorization via the alternating direction method of multipliers." IEEE Trans. Signal Process. PP.99, pp. 1-1.
Oseledets, I. V. and E. E. Tyrtyshnikov (2010). "TT-cross approximation for multidimensional arrays." Linear Algebra and its Applications 432.1, pp. 70-88.
Papalexakis, E., C. Faloutsos, and N. Sidiropoulos (2012). "ParCube: Sparse parallelizable tensor decompositions." In: Machine Learning and Knowledge Discovery in Databases. Ed. by Peter A. Flach, Tijl De Bie, and Nello Cristianini. Vol. 7523. Lecture Notes in Computer Science. Springer Berlin Heidelberg, pp. 521-536.

References III
Phan, A.-H. and A. Cichocki (2011). "PARAFAC algorithms for large-scale problems." Neurocomputing 74.11, pp. 1970-1984.
Phan, A.-H., P. Tichavský, and A. Cichocki (2013). "Low complexity damped Gauss-Newton algorithms for CANDECOMP/PARAFAC." SIAM J. Matrix Anal. Appl. 34.1, pp. 126-147.
Sidiropoulos, N., E. Papalexakis, and C. Faloutsos (2014). "Parallel randomly compressed cubes: A scalable distributed architecture for big tensor decomposition." IEEE Signal Process. Mag. 31.5, pp. 57-70.

References IV
Sorber, L., M. Van Barel, and L. De Lathauwer (2013). "Optimization-based algorithms for tensor decompositions: Canonical polyadic decomposition, decomposition in rank-(L_r, L_r, 1) terms, and a new generalization." SIAM Journal on Optimization 23.2, pp. 695-720.
Vervliet, N. and L. De Lathauwer (2016). "A randomized block sampling approach to canonical polyadic decomposition of large-scale tensors." IEEE Journal of Selected Topics in Signal Processing 10.2, pp. 284-295.
Vervliet, N., O. Debals, and L. De Lathauwer (2016a). "Canonical polyadic decomposition of incomplete tensors with linearly constrained factors." Technical Report 16-172, ESAT-STADIUS, KU Leuven, Belgium.

References V
Vervliet, N., O. Debals, and L. De Lathauwer (2016b). "Tensorlab 3.0 - numerical optimization strategies for large-scale constrained and coupled matrix/tensor factorization." In: 2016 50th Asilomar Conference on Signals, Systems and Computers.
Vervliet, N., et al. (2014). "Breaking the curse of dimensionality using decompositions of incomplete tensors: Tensor-based scientific computing in big data analysis." IEEE Signal Process. Mag. 31.5, pp. 71-79.
Vervliet, N., et al. (2016c). Tensorlab 3.0. Available online at http://www.tensorlab.net.