High Performance Parallel Tucker Decomposition of Sparse Tensors


1 High Performance Parallel Tucker Decomposition of Sparse Tensors
Oguz Kaya, INRIA and LIP, ENS Lyon, France
SIAM PP'16, April 14, 2016, Paris, France
Joint work with Bora Uçar, CNRS and LIP, ENS Lyon, France

2 Tucker Tensor Decomposition
[Figure: a third-order tensor $\mathcal{X}$ approximated by a core tensor $\mathcal{G}$ multiplied by factor matrices $A$, $B$, $C$ with $R_1$, $R_2$, $R_3$ columns.]
- Tucker decomposition provides a rank-$(R_1, \ldots, R_N)$ approximation of a tensor.
- It consists of a core tensor $\mathcal{G} \in \mathbb{R}^{R_1 \times \cdots \times R_N}$ and $N$ factor matrices having $R_1, \ldots, R_N$ columns.
- We are interested in the case when $\mathcal{X}$ is big, sparse, and of low rank.
- Examples: Google web queries, Netflix movie ratings, Amazon product reviews, etc.

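For reference, the model on this slide written entry-wise — the standard Tucker form, stated here for the third-order case implied by the figure:

```latex
% Tucker model X \approx G x_1 A x_2 B x_3 C, entry-wise: every entry
% of X is approximated by a small R1 x R2 x R3 core G mixed through
% one row of each factor matrix.
\[
  x_{ijk} \;\approx\; \sum_{p=1}^{R_1} \sum_{q=1}^{R_2} \sum_{r=1}^{R_3}
      g_{pqr}\, a_{ip}\, b_{jq}\, c_{kr},
  \qquad A \in \mathbb{R}^{I \times R_1},\;
         B \in \mathbb{R}^{J \times R_2},\;
         C \in \mathbb{R}^{K \times R_3}.
\]
```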

4 Tucker Tensor Decomposition
Related work:
- MATLAB Tensor Toolbox by Kolda et al.
- Efficient and scalable computations with sparse tensors (Baskaran et al., '12)
- Parallel tensor compression for large-scale scientific data (Austin et al., '15)
- HaTen2: Billion-scale tensor decompositions (Jeon et al., '15)
Applications (in data mining):
- CubeSVD: A novel approach to personalized web search (Sun et al., '05)
- Tag recommendations based on tensor dimensionality reduction (Symeonidis et al., '08)
- Extended feature combination model for recommendations in location-based mobile services (Sattari et al., '15)
Goal: to compute a sparse Tucker decomposition in parallel (shared/distributed memory).



7 Higher Order Orthogonal Iteration (HOOI) Algorithm
Algorithm: HOOI for 3rd-order tensors
repeat
  1. $\hat{A} \leftarrow [\mathcal{X} \times_2 B \times_3 C]_{(1)}$
  2. $A \leftarrow \mathrm{TRSVD}(\hat{A}, R_1)$  // $R_1$ leading left singular vectors
  3. $\hat{B} \leftarrow [\mathcal{X} \times_1 A \times_3 C]_{(2)}$
  4. $B \leftarrow \mathrm{TRSVD}(\hat{B}, R_2)$
  5. $\hat{C} \leftarrow [\mathcal{X} \times_1 A \times_2 B]_{(3)}$
  6. $C \leftarrow \mathrm{TRSVD}(\hat{C}, R_3)$
until no more improvement or maximum iterations reached
  7. $\mathcal{G} \leftarrow \mathcal{X} \times_1 A^T \times_2 B^T \times_3 C^T$
  8. return $[\mathcal{G}; A, B, C]$
Notes:
- We discuss the case where $R_1 = R_2 = \cdots = R_N = R$ and $N = 3$.
- $A \in \mathbb{R}^{I \times R}$, $B \in \mathbb{R}^{J \times R}$, and $C \in \mathbb{R}^{K \times R}$ are dense.
- $\hat{A} \leftarrow [\mathcal{X} \times_2 B \times_3 C]_{(1)} \in \mathbb{R}^{I \times R^2}$ is called a tensor-times-matrix multiply (TTM).
- $\hat{A} \in \mathbb{R}^{I \times R^{N-1}}$, $\hat{B} \in \mathbb{R}^{J \times R^{N-1}}$, and $\hat{C} \in \mathbb{R}^{K \times R^{N-1}}$ are dense ($R^2$ columns for $N = 3$).





12 Tensor-Times-Matrix Multiply
$\hat{A} \leftarrow [\mathcal{X} \times_2 B \times_3 C]_{(1)}$, $\hat{A} \in \mathbb{R}^{I \times R^2}$
- $B(j,:) \otimes C(k,:) \in \mathbb{R}^{R^2}$ is a Kronecker product.
- For each nonzero $x_{i,j,k}$, $\hat{A}(i,:)$ receives the update $x_{i,j,k}[B(j,:) \otimes C(k,:)]$.
Algorithm: $\hat{A} \leftarrow [\mathcal{X} \times_2 B \times_3 C]_{(1)}$
  1. $\hat{A} \leftarrow \mathrm{zeros}(I, R^2)$
  2. foreach $x_{i,j,k} \in \mathcal{X}$ do
       $\hat{A}(i,:) \leftarrow \hat{A}(i,:) + x_{i,j,k}[B(j,:) \otimes C(k,:)]$
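A minimal sequential sketch of this kernel in C++, assuming a coordinate-format tensor and row-major factor matrices; the names `Coord` and `ttm1` and the data layout are illustrative choices, not the HyperTensor API:

```cpp
#include <vector>
#include <cstddef>

// One nonzero of a third-order sparse tensor in coordinate format.
struct Coord { std::size_t i, j, k; double val; };

// Ahat <- [X x2 B x3 C]_(1): for each nonzero x_{ijk}, add
// x_{ijk} * (B(j,:) kron C(k,:)) to row i of the I x R^2 output.
// B is J x R and C is K x R, both stored row-major.
std::vector<double> ttm1(const std::vector<Coord>& X,
                         const std::vector<double>& B,
                         const std::vector<double>& C,
                         std::size_t I, std::size_t R) {
  std::vector<double> Ahat(I * R * R, 0.0);
  for (const Coord& nz : X) {
    const double* b = &B[nz.j * R];   // row B(j,:)
    const double* c = &C[nz.k * R];   // row C(k,:)
    double* a = &Ahat[nz.i * R * R];  // row Ahat(i,:), length R^2
    for (std::size_t p = 0; p < R; ++p)
      for (std::size_t q = 0; q < R; ++q)
        a[p * R + q] += nz.val * b[p] * c[q];  // Kronecker entry (p,q)
  }
  return Ahat;
}
```

Note the cost structure this exposes: each nonzero triggers $O(R^2)$ work ($O(R^{N-1})$ in general), which is what makes the fold communication of full $\hat{A}$ rows expensive later.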

13 Outline 1 Introduction 2 Parallel HOOI 3 Results 4 Conclusion

14 (Bad) Fine-Grain Parallel TTM within Tucker-ALS
- $A$ and $\hat{A}$ are distributed rowwise: process $p$ owns and computes $A(I_p,:)$ and $\hat{A}(I_p,:)$.
- Tensor nonzeros are partitioned (arbitrarily): process $p$ owns the subset of nonzeros $\mathcal{X}_p$.
- Performing $x_{i,j,k}[B(j,:) \otimes C(k,:)]$ and generating a partial result for $\hat{A}(i,:)$ is a fine-grain task.
- We use a post-communication scheme at each iteration:
  - at the beginning, rows of $A$, $B$, and $C$ are available;
  - at the end, only $A$ is updated and communicated;
  - partial row results of $\hat{A}$ are sent/received (fold);
  - rows of $A(I_p,:)$ are sent/received (expand).
Algorithm: Computing $A$ in fine-grain HOOI at process $p$
  1. foreach $x_{i,j,k} \in \mathcal{X}_p$ do
       $\hat{A}(i,:) \leftarrow \hat{A}(i,:) + x_{i,j,k}[B(j,:) \otimes C(k,:)]$
  2. Send/receive and sum up partial rows of $\hat{A}$
  3. $A(I_p,:) \leftarrow \mathrm{TRSVD}(\hat{A}, R)$
  4. Send/receive rows of $A$

15 (Bad) Fine-Grain Parallel TTM within Tucker-ALS
- The numbers of rows sent/received in fold and expand are equal, but:
  - each communication unit of expand has size $R$;
  - each communication unit of fold has size $R^{N-1}$.
- We therefore want to avoid assembling $\hat{A}$ in the fold communication.
- Yet we still need to compute $\mathrm{TRSVD}(\hat{A}, R)$.
A sketch of the local fold contribution appears after this slide.

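To make the fold step concrete, here is a hedged sketch of what each process computes locally before communication: partial rows of $\hat{A}$ accumulated per global row index. The MPI exchange itself is elided, and `local_partial_rows` with its map-based layout is an illustrative choice, not HyperTensor's data structure:

```cpp
#include <unordered_map>
#include <vector>
#include <cstddef>

struct Coord { std::size_t i, j, k; double val; };

// Each process owns a subset Xp of the nonzeros and accumulates
// partial rows of Ahat keyed by the global row index i. In the fold
// phase, rows owned by other processes are sent to their owners and
// summed there; each such unit has R^2 entries (R^{N-1} in general).
std::unordered_map<std::size_t, std::vector<double>>
local_partial_rows(const std::vector<Coord>& Xp,
                   const std::vector<double>& B,   // J x R, row-major
                   const std::vector<double>& C,   // K x R, row-major
                   std::size_t R) {
  std::unordered_map<std::size_t, std::vector<double>> partial;
  for (const Coord& nz : Xp) {
    std::vector<double>& row = partial[nz.i];
    if (row.empty()) row.assign(R * R, 0.0);     // lazily create row i
    const double* b = &B[nz.j * R];
    const double* c = &C[nz.k * R];
    for (std::size_t p = 0; p < R; ++p)
      for (std::size_t q = 0; q < R; ++q)
        row[p * R + q] += nz.val * b[p] * c[q];  // x_ijk * (B(j,:) kron C(k,:))
  }
  return partial;  // fold: exchange and sum these rows by owner
}
```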

17 Computing TRSVD
[Figure: $\hat{A} = \tilde{A}_1 + \tilde{A}_2 + \tilde{A}_3$; the unassembled $\hat{A}$ is the sum of the processes' local pieces.]
- Gram matrix $\hat{A}\hat{A}^T$? Iterative solvers?
- Either way, we need to perform $\hat{A}x$ and $\hat{A}^T x$ efficiently.



20 Computing $y \leftarrow \hat{A}x$
[Figure: $\hat{A}x = \tilde{A}_1 x + \tilde{A}_2 x + \tilde{A}_3 x$; each process multiplies its local piece by $x$, and the partial vectors $y_1 + y_2 + y_3$ are summed into $y$.]
- For each unit of communication, we perform extra work in the MxV.
- Instead of communicating $R^{N-1}$ entries, we communicate 1 (per SVD iteration)!
- $y \leftarrow \hat{A}^T x$ works in reverse with the same communication cost.
- The row distribution of $y$ and of the left singular vectors is the same as that of $\hat{A}$, so $A$ gets the same row distribution as $\hat{A}$.
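A hedged sequential sketch of this matrix-free MxV in C++: $y \leftarrow \hat{A}x$ is computed straight from the nonzeros without ever assembling $\hat{A}$. Since row $i$ of $\hat{A}$ is a sum of scaled Kronecker products, each nonzero contributes $x_{ijk} \cdot B(j,:)\,X_v\,C(k,:)^T$ to $y(i)$, where $X_v$ is the vector $x$ reshaped to $R \times R$. The name `mxv_matfree` and the layout (matching the `ttm1` sketch above) are assumptions, not the HyperTensor interface:

```cpp
#include <vector>
#include <cstddef>

struct Coord { std::size_t i, j, k; double val; };

// y <- Ahat * x without forming Ahat, where row i of Ahat is the sum
// over nonzeros x_{ijk} of x_{ijk} * (B(j,:) kron C(k,:)). The input
// x has length R^2 and is read as an R x R row-major matrix Xv, so
// (B(j,:) kron C(k,:)) . x == B(j,:) * Xv * C(k,:)^T.
std::vector<double> mxv_matfree(const std::vector<Coord>& X,
                                const std::vector<double>& B,  // J x R
                                const std::vector<double>& C,  // K x R
                                const std::vector<double>& x,  // length R^2
                                std::size_t I, std::size_t R) {
  std::vector<double> y(I, 0.0);
  for (const Coord& nz : X) {
    const double* b = &B[nz.j * R];
    const double* c = &C[nz.k * R];
    double dot = 0.0;
    for (std::size_t p = 0; p < R; ++p) {
      double inner = 0.0;                  // (Xv * C(k,:)^T)(p)
      for (std::size_t q = 0; q < R; ++q)
        inner += x[p * R + q] * c[q];
      dot += b[p] * inner;                 // B(j,:) . (Xv * C(k,:)^T)
    }
    y[nz.i] += nz.val * dot;               // O(R^2) extra work per nonzero
  }
  return y;
}
```

$y \leftarrow \hat{A}^T x$ follows the same pattern with the roles of the length-$I$ and length-$R^2$ vectors swapped; this is the extra MxV work traded for communicating one scalar per row instead of $R^{N-1}$ entries.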

25 (Good) Fine-Grain Parallel TTM within Tucker-ALS
Algorithm: Computing $A$ in fine-grain HOOI at process $p$
  1. foreach $x_{i,j,k} \in \mathcal{X}_p$ do
       $\hat{A}(i,:) \leftarrow \hat{A}(i,:) + x_{i,j,k}[B(j,:) \otimes C(k,:)]$
  2. $A(I_p,:) \leftarrow \mathrm{TRSVD}(\hat{A}, R, \mathrm{MxV}(\ldots), \mathrm{MTxV}(\ldots))$
  3. Send/receive rows of $A$

26 Hypergraph Model for Parallel HOOI
- Multi-constraint hypergraph partitioning: we balance computation and memory costs.
- By minimizing the cutsize of the hypergraph, we minimize:
  - the total communication volume of MxV/MTxV,
  - the total extra MxV/MTxV work, and
  - the total volume of communication for TTM.
- Ideally, we should minimize the maximum, not the total.
[Figure: hypergraph for $\mathcal{X} = \{(1,2,3), (2,3,1), (3,1,2)\}$, with vertices for the nonzeros $x_{1,2,3}$, $x_{2,3,1}$, $x_{3,1,2}$ and hyperedges $a_1, a_2, a_3$, $b_1, b_2, b_3$, $c_1, c_2, c_3$ for the factor-matrix rows each nonzero touches.]
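As a rough illustration of the model, not the exact construction in the paper, the sketch below builds one vertex per nonzero and one hyperedge per factor-matrix row, connecting each nonzero to the three rows it touches; the name `build_hypergraph` and the pin-list output format are invented for the example:

```cpp
#include <vector>
#include <cstddef>

struct Coord { std::size_t i, j, k; double val; };

// Hyperedges as pin lists: edge -> vertices (nonzeros) it contains.
// Edges 0..I-1 model rows of A, I..I+J-1 rows of B, and
// I+J..I+J+K-1 rows of C; cutting an edge across parts corresponds
// to communicating that row (or its partial results).
std::vector<std::vector<std::size_t>>
build_hypergraph(const std::vector<Coord>& X,
                 std::size_t I, std::size_t J, std::size_t K) {
  std::vector<std::vector<std::size_t>> pins(I + J + K);
  for (std::size_t v = 0; v < X.size(); ++v) {   // vertex v = nonzero v
    pins[X[v].i].push_back(v);                   // hyperedge a_i
    pins[I + X[v].j].push_back(v);               // hyperedge b_j
    pins[I + J + X[v].k].push_back(v);           // hyperedge c_k
  }
  return pins;  // feed to a hypergraph partitioner with vertex weights
}
```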

27 Outline 1 Introduction 2 Parallel HOOI 3 Results 4 Conclusion

28 Experimental Setup
HyperTensor:
- hybrid OpenMP/MPI code in C++;
- depends on BLAS, LAPACK, and the C++11 STL;
- uses SLEPc/PETSc for distributed-memory TRSVD computations.
IBM BlueGene/Q machine:
- 16 GB of memory and 16 cores (at 1.6 GHz) per node;
- experiments use up to 4096 cores (256 nodes);
- $R_i$ is set to 5 for 4-dimensional and to 10 for 3-dimensional tensors.
Tensor sizes (entries marked ? were not recovered):
Tensor     I1     I2     I3    I4     #nonzeros
Netflix    480K   17K    2K    -      100M
NELL       3.2M   ?      ?K    -      78M
Delicious  1K     530K   17M   2.4M   140M
Flickr     ?      ?K     28M   1.6M   112M

29 Results - Flickr/Delicious
[Table: per-iteration runtime of the parallel HOOI (in seconds) for fine-hp, fine-rd, coarse-hp, and coarse-bl over varying #nodes/#cores; the numeric entries were not recovered.]
- The coarse-grain kernel is slow due to load imbalance and communication.
- On Delicious, fine-hp is 1.8x/5.4x/6.4x faster than fine-rd/coarse-hp/coarse-bl.
- On Flickr, fine-hp is 1.5x/3.7x/4.3x faster than fine-rd/coarse-hp/coarse-bl.
- All instances achieve scalability to 4096 cores.

30 Results - NELL/Netflix
[Table: per-iteration runtime of the parallel HOOI (in seconds) for fine-hp, fine-rd, coarse-hp, and coarse-bl over varying #nodes/#cores; the numeric entries were not recovered.]
- On Netflix, fine-hp is 2.8x/5x/5.5x faster than fine-rd/coarse-hp/coarse-bl.
- On NELL, fine-rd is faster than fine-hp (5x less total communication, but 2x more maximum communication).

31 Outline 1 Introduction 2 Parallel HOOI 3 Results 4 Conclusion

32 Conclusion
We provide:
- the first high-performance shared/distributed-memory parallel algorithm and implementation for the Tucker decomposition of sparse tensors;
- hypergraph partitioning models of these computations for better scalability.
We achieve scalability up to 4096 cores even with random partitioning.
We enable Tucker-based tensor analysis of very big sparse data.



35 References
[1] O. Kaya and B. Uçar, "High-performance parallel algorithms for the Tucker decomposition of higher order sparse tensors," Tech. Rep. RR-8801, Inria, Grenoble Rhône-Alpes, Oct. 2015.
[2] W. Austin, G. Ballard, and T. G. Kolda, "Parallel tensor compression for large-scale scientific data," tech. rep., arXiv, 2015.
[3] I. Jeon, E. E. Papalexakis, U. Kang, and C. Faloutsos, "HaTen2: Billion-scale tensor decompositions," in IEEE 31st International Conference on Data Engineering (ICDE), 2015.
[4] M. Baskaran, B. Meister, N. Vasilache, and R. Lethin, "Efficient and scalable computations with sparse tensors," in IEEE Conference on High Performance Extreme Computing (HPEC), pp. 1-6, Sept. 2012.
[5] J. Sun, H. Zeng, H. Liu, Y. Lu, and Z. Chen, "CubeSVD: A novel approach to personalized web search," in Proceedings of the 14th International Conference on World Wide Web (WWW '05), New York, NY, USA, ACM, 2005.

36 Contact: perso.ens-lyon.fr/bora.ucar
