Computationally Efficient CP Tensor Decomposition Update Framework for Emerging Component Discovery in Streaming Data

Pierre-David Letourneau, Muthu Baskaran, Tom Henretty, James Ezick, Richard Lethin
Reservoir Labs, 632 Broadway Suite 803, New York, NY

Abstract: We present Streaming CP Update, an algorithmic framework for updating CP tensor decompositions that is capable of identifying emerging components and can produce decompositions of large, sparse tensors streaming along multiple modes at a low computational cost. We discuss a large-scale implementation of the proposed scheme integrated within the ENSIGN tensor analysis package, and we evaluate and demonstrate the performance of the framework, in terms of computational efficiency and ability to discover emerging components, on a real cyber dataset.

I. INTRODUCTION

A tensor is a multidimensional array that can be used to represent and store multidimensional data. A tensor decomposition is an object that can extract relationships and correlations among tensor data by representing the latter as a combination of simple components (factors; rank-1 tensors; see Figure 1). Tensor decompositions have been successfully used in a multitude of applications, including genomics [1], geospatial analysis [2], cybersecurity [3], chemometrics [4], computer vision [5], data mining [6] and precision medicine [7], to name but a few. There exists a variety of numerical packages capable of computing tensor decompositions, including GigaTensor [8], HaTen2 [9], SPLATT [10], SCouT [11] and BIGtensor [12]. With the exception of SPLATT [10], existing packages can generally only treat immutable tensors. That is, in situations where the amount of data increases (e.g., temporally), they are bound to treat the new data by reconstructing the tensor and re-computing an entirely new decomposition, with little, if any, reuse of the information provided by the older decomposition. This, of course, leads to serious computational inefficiencies.

In this work, we address these inefficiencies in cases where the amount of data increases in a streaming fashion; that is, we consider cases where data increase is related to the growth of the size of an existing tensor. We focus on the CANDECOMP/PARAFAC (CP) decomposition and present the Streaming CP Update, an algorithmic framework for updating CP tensor decompositions that possesses the ability to identify emerging components and can produce robust decompositions of large, sparse streaming tensors at a low computational cost.

Fig. 1. Diagram representing the proposed Streaming CP Update framework for a single streaming mode: an updated decomposition is created from the CP tensor decomposition of an original tensor (top) and that of an update data tensor (middle) that adds information along a streaming mode (horizontal). Non-streaming modes are merged. The streaming (temporal) mode is fully updated. The framework generalizes to multiple streaming modes as well.

The development of decomposition algorithms for efficiently treating streaming tensors is relatively novel within the field of numerical tensor analysis. Recently proposed approaches fall within two categories: 1) perturbation-based methods, which perform the update through a continuous modification (perturbation) of the factors found in an existing decomposition [13], [14]; and 2) component discovery methods, which focus on merging an existing decomposition with a second one obtained from the update data, without further modification of the factors [15].
Our method lies at the intersection of both categories; it is a component discovery method because it merges an existing decomposition with a decomposition of the update data along non-streaming modes. However, it is also a perturbation-based method because it modifies and adapts the streaming-mode factors following the merging step. In this sense, it offers the best of both worlds while keeping computational costs to a minimum. Our contributions in this regard include:
1) A streaming tensor decomposition framework and algorithm: a low-computational-cost, small-memory-footprint algorithm for updating existing tensor decompositions in light of new streaming data;
2) Superior capabilities for identifying and extracting emerging components not present in the original data [16];
3) Extension of streaming updates to multiple modes;
4) Implementation of the framework using high-performance tensor decomposition and manipulation routines (ENSIGN [17]);
5) Evaluation and demonstration of performance on real data.

II. BACKGROUND AND NOTATION

We shall use the following notation: vectors are represented by bold lowercase letters ($\mathbf{v}$), matrices are represented by bold capital letters ($A$), and tensors are represented by bold calligraphic capital letters ($\mathcal{X}$). Tensors are elements of $\mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, where $N$ is the number of modes of the tensor and $I_n$ is the dimension of the tensor along mode $n$. The CP decomposition of a tensor $\mathcal{X}$ is an object denoted $[[A^{(1)}, A^{(2)}, \ldots, A^{(N)}]]$, or $[[A^{(n)}]]$ when the context is clear, where $\{A^{(n)}\}_{n=1}^{N}$ are $I_n \times K$ factor matrices, $K$ is a fixed integer called the rank, and
$$[[A^{(1)}, A^{(2)}, \ldots, A^{(N)}]] = \sum_{k=1}^{K} A^{(1)}(:,k) \circ A^{(2)}(:,k) \circ \cdots \circ A^{(N)}(:,k),$$
where $\circ$ represents the outer product. Here, we use MATLAB-like notation for array indexing (1-indexed: the first index of an array is 1, not 0), so that a colon ($:$) represents all the elements along a certain dimension and a sequence $n : n+m$ represents a restriction to the elements with indices $n$ to $n+m$ included. For instance, the quantity $A^{(1)}(:,k)$ above refers to the $k$th column of $A^{(1)}$. We will also use the symbol $0_{M \times N}$ to represent a matrix of size $M \times N$ with all-zero entries.

We further introduce certain operations on vectors and tensors that will become important in future sections. We denote by $\langle \cdot, \cdot \rangle$ the standard inner product between vectors, or the Frobenius inner product between matrices and tensors. Similarly, $\|\cdot\|$ is the Euclidean norm on vectors and the Frobenius norm on matrices and tensors. Given two matrices $A$ and $B$ of size $M \times N$, the symbols $\ast$ and $\oslash$ represent element-wise multiplication and division, i.e., $(A \ast B)(i,j) = A(i,j)\,B(i,j)$ and $(A \oslash B)(i,j) = A(i,j)/B(i,j)$. We also introduce the Khatri-Rao product
$$A \odot B = \left[\, A(:,1) \otimes B(:,1),\; A(:,2) \otimes B(:,2),\; \ldots,\; A(:,N) \otimes B(:,N) \,\right],$$
where $\otimes$ is the Kronecker product, i.e.,
$$A \otimes B = \begin{bmatrix} A(1,1)B & \cdots & A(1,N)B \\ \vdots & & \vdots \\ A(M,1)B & \cdots & A(M,N)B \end{bmatrix},$$
and $[A_1, A_2]$ indicates horizontal matrix concatenation. Finally, we denote by $X_{(n)}$ the matricization of $\mathcal{X}$ along the $n$th mode, i.e., the re-ordering of the elements of the tensor $\mathcal{X}$ in matrix form such that, for any fixed indices $i_1, \ldots, i_{n-1}, i_{n+1}, \ldots, i_N$, the vector $\{\mathcal{X}(i_1, \ldots, i_{n-1}, j, i_{n+1}, \ldots, i_N)\}_{j=1}^{I_n}$ is a column of $X_{(n)}$.

III. RELATED WORK

Original work pertaining to CP streaming tensor updates is generally associated with Nion et al. [18]. Similar work has also been presented with regard to the Tucker decomposition update in [19], [20], and [21]. The Tucker decomposition, although related to the CP, is more restrictive and will not be addressed in this paper. More recent work in streaming tensor decomposition includes that of Zhou et al. [13], Smith [14], and Pasricha et al. [15]. Zhou et al.'s method [13] modifies the factors of an already-existing decomposition in order to account for the update data. The emphasis is on scalability, and the cornerstone of the method is a thorough form of computational recycling (redundant-computation avoidance) using a special hierarchy specific to the Alternating Least-Squares (ALS) approach. Smith [14] proceeds in a similar fashion, but emphasizes the need to down-weight information that was observed far in the past, while paying particular attention to memory and computational costs. Pasricha et al. [15] focus on discovering emerging components from the update data rather than performing perturbations of existing factors. To do so, a full decomposition of the update data is performed, and the factor matrices are then merged to create an updated decomposition. No further operations are performed following the merging step.

IV. STREAMING TENSOR DECOMPOSITION

In this section, we introduce the Streaming CP Update framework.
We focus on the case of an $(N+1)$-mode tensor for which the $(N+1)$th mode is the streaming mode, and the only mode in which size changes (single-mode streaming). (The choice of the last mode is made purely for convenience and ease of notation; the framework is oblivious to the actual index of the streaming mode.) The generalization of the framework to multi-mode streaming is discussed in Section IV-A. We will generally refer to the streaming mode as the temporal mode and to the remaining modes as the non-temporal modes.

To begin with, we assume that we have access to a rank-$K$ tensor decomposition $[[A^{(n)}]]$ of the original $I_1 \times \cdots \times I_N \times T$ tensor $\mathcal{X}$. We further denote the update tensor by $\mathcal{X}_{new}$ and assume that its size is compatible with that of the original tensor, i.e., that it is of size $I_1 \times \cdots \times I_N \times T_{new}$ for some $T_{new} \in \mathbb{N}$. Under these circumstances, our method can be described as follows:
1) Compute a tensor decomposition of the update data $\mathcal{X}_{new}$;
2) Merge the existing and update tensor decomposition factor matrices along the non-temporal modes;
3) Update the temporal-mode factor matrix;
4) Classify the factors of the updated decomposition (optional);
5) Truncate the updated decomposition (optional).

This framework is summarized in Algorithm 1. The first step uses existing routines for computing a rank-$K_{new}$ (where $K_{new}$ is user-provided) tensor decomposition $[[A^{(n)}_{new}]]$ of the update data tensor $\mathcal{X}_{new}$.

Algorithm 1 Streaming CP update
Input: $\{A^{(n)}\}$, $\mathcal{X}_{new}$, $K_{new} > 0$, $0 < \nu_{sim} \le 1$, $\tau > 0$, $K'$
Compute: $\{A^{(n)}_{new}\} \leftarrow$ rank-$K_{new}$ decomp. of $\mathcal{X}_{new}$
$\{A^{(n)}\},\ \tilde{A}^{N+1}_{new} \leftarrow$ MERGE$(\{A^{(n)}\}, \{A^{(n)}_{new}\}, \nu_{sim})$
$A^{N+1} \leftarrow$ UPDATE$(\{A^{(n)}\}, \tilde{A}^{N+1}_{new})$
$\{C_1, C_2, C_3\} \leftarrow$ CLASSIFY$(\{A^{(n)}\}, K, \bar{K}, \tau)$
$\{A^{(n)}\},\ S_{trunc} \leftarrow$ TRUNCATE$(\{A^{(n)}\}, \bar{K}, K')$
Output: $\{A^{(n)}\}$, $\{C_1, C_2, C_3\}$, $S_{trunc}$
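The update routines below repeatedly use the Khatri-Rao product and the temporal-mode matricization. The following minimal numpy sketch (our own illustration, not ENSIGN's API; the unfolding convention is an assumption chosen to satisfy the identity $X_{(N+1)} = A^{N+1}(A^{(1)} \odot \cdots \odot A^{(N)})^T$) shows both operations on a toy dense tensor:

import numpy as np

def khatri_rao(mats):
    # Column-wise Kronecker product A^(1) ⊙ ... ⊙ A^(N); earlier matrices vary slowest.
    out = mats[0]
    for M in mats[1:]:
        out = (out[:, None, :] * M[None, :, :]).reshape(-1, out.shape[-1])
    return out

def unfold_last(X):
    # Matricize a dense array along its last (temporal) mode: one row per time index.
    return np.moveaxis(X, -1, 0).reshape(X.shape[-1], -1)

# Toy check: a CP tensor satisfies X_(N+1) = A_temporal @ khatri_rao(non_temporal).T
rng = np.random.default_rng(0)
A1, A2, At = rng.random((3, 2)), rng.random((4, 2)), rng.random((5, 2))
X = np.einsum('ik,jk,tk->ijt', A1, A2, At)   # rank-2, 3-mode tensor
assert np.allclose(unfold_last(X), At @ khatri_rao([A1, A2]).T)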

The purpose of the second step is to merge the existing and update tensor factor matrices along the non-temporal modes, by first eliminating redundant factors and then concatenating the resulting matrices. For instance, assume that $\mathcal{X}$ and $\mathcal{X}_{new}$ have non-temporal factor matrices $\{A^{(i)}\}_{i=1}^{N}$ and $\{A^{(j)}_{new}\}_{j=1}^{N}$ respectively. First, we identify the non-temporal factors $A^{(1)}(:,i) \circ \cdots \circ A^{(N)}(:,i)$ and $A^{(1)}_{new}(:,j) \circ \cdots \circ A^{(N)}_{new}(:,j)$ that are shared among both decompositions. To do so, we measure the cosine similarity
$$\sigma(i,j) = \prod_{n=1}^{N} \frac{\langle A^{(n)}(:,i),\; A^{(n)}_{new}(:,j)\rangle}{\|A^{(n)}(:,i)\|\,\|A^{(n)}_{new}(:,j)\|}$$
among each pair, and eliminate from the update data decomposition the factors whose similarity, $\max_i \sigma(i,j)$, exceeds some threshold $\nu_{sim}$. This leaves non-temporal factor matrices $\{\tilde{A}^{(n)}_{new}\}_{n=1}^{N}$ of size $I_n \times \tilde{K}_{new}$, where $\tilde{K}_{new} \le K_{new}$. These correspond to novel components not previously observed in the original data, and they are of prime importance in capturing phenomena that are not mere perturbations of existing components (such as the beginning of a cyber attack in network analysis). Finally, we concatenate the latter with the original factor matrices to create updated factor matrices:
$$A^{(n)} \leftarrow [A^{(n)},\ \tilde{A}^{(n)}_{new}] \quad \text{for } n = 1, \ldots, N.$$
This is summarized in Algorithm 2.

Algorithm 2 Merging non-temporal modes (MERGE)
Input: $\{A^{(n)}\}$, $\{A^{(n)}_{new}\}$, $0 < \nu_{sim} \le 1$
Initialize $\{\tilde{A}^{(n)}_{new}\}_{n=1}^{N+1}$ as empty matrices
for $j$ from 1 to $K_{new}$ do
  Compute: $\sigma_j \leftarrow \max_i \prod_{n=1}^{N} \frac{\langle A^{(n)}(:,i),\, A^{(n)}_{new}(:,j)\rangle}{\|A^{(n)}(:,i)\|\,\|A^{(n)}_{new}(:,j)\|}$
  if $\sigma_j < \nu_{sim}$ then
    for $n$ from 1 to $N+1$ do
      Add $A^{(n)}_{new}(:,j)$ to $\tilde{A}^{(n)}_{new}$
    end for
  end if
end for
for $n$ from 1 to $N$ do
  $A^{(n)} \leftarrow [A^{(n)},\ \tilde{A}^{(n)}_{new}]$
end for
Output: $\{A^{(n)}\}$, $\tilde{A}^{N+1}_{new}$

This step ensures that:
- Non-temporal components found in the decomposition of the original tensor are leveraged to explain similar components in the update data;
- Completely new components are allowed to emerge.

The third step involves the update of the temporal-mode factor matrix. For this purpose, we write
$$A^{N+1} = \begin{bmatrix} A^{N+1}_{old} & 0_{T \times \tilde{K}_{new}} \\ A^{N+1}_{upd,old} & A^{N+1}_{upd,new} \end{bmatrix},$$
a matrix of size $(T + T_{new}) \times \bar{K}$, where $\bar{K} = K + \tilde{K}_{new}$, which corresponds to the temporal-mode factor matrix of the updated decomposition. The zero matrix on the top right indicates that the updated non-temporal factors associated with the right-most indices have no influence on the decomposition until time $T$. $A^{N+1}_{old}$ corresponds to the temporal-mode factor matrix of the original decomposition, $A^{N+1}_{upd,old}$ is a $T_{new} \times K$ matrix corresponding to the temporal-mode update associated with previously observed components, whereas $A^{N+1}_{upd,new}$ is a $T_{new} \times \tilde{K}_{new}$ matrix corresponding to the temporal-mode update associated with novel components.

Under our proposed framework, the upper portion $[A^{N+1}_{old},\ 0]$ of the temporal factor matrix does not require any modifications throughout the update process (see Appendix); only the lower part $A_{upd} = [A^{N+1}_{upd,old},\ A^{N+1}_{upd,new}]$ involves computations. This relies on the assumption that the components of the decomposition are stable, i.e., that an ab initio decomposition of the full tensor would produce non-temporal components associated with the old portion of the data similar to those of $\mathcal{X}$. (In practice, we have found that this is indeed the case; see Section V.) It is also key to the low cost of the method, because the size of the latter ($T_{new} \times \bar{K}$), and therefore the cost of the update, is often orders of magnitude smaller than that of the former ($T \times \bar{K}$). Also, $A^{N+1}_{upd,new}$ is initialized using the temporal factor matrix obtained from the decomposition of $\mathcal{X}_{new}$, i.e., $\tilde{A}^{N+1}_{new}$, whereas $A^{N+1}_{upd,old}$ is initialized randomly. The framework remains the same across decompositions, although the explicit nature of the update process varies.
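As an illustration, a minimal numpy sketch of the merging step follows (our own sketch of Algorithm 2, not ENSIGN's implementation; the function and variable names are hypothetical). It keeps an update component only when the product of its per-mode cosine similarities against every existing component stays below the threshold:

import numpy as np

def merge_factors(factors_old, factors_new, nu_sim):
    """Algorithm 2 (MERGE) sketch: factors_old/factors_new are lists of N+1
    factor matrices, the last one being the temporal mode."""
    N = len(factors_old) - 1
    novel = []
    for j in range(factors_new[0].shape[1]):
        # sigma(i, j): product over non-temporal modes of cosine similarities.
        sims = np.ones(factors_old[0].shape[1])
        for n in range(N):
            a, B = factors_new[n][:, j], factors_old[n]
            sims *= (B.T @ a) / (np.linalg.norm(B, axis=0) * np.linalg.norm(a))
        if sims.max() < nu_sim:   # not similar to any old component: keep as novel
            novel.append(j)
    merged = [np.hstack([factors_old[n], factors_new[n][:, novel]]) for n in range(N)]
    A_temp_new = factors_new[N][:, novel]   # temporal block of the novel components
    return merged, A_temp_new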
In this paper, we focus on three types of decompositions: CP-APR [22] (a probabilistic Poisson framework for count data), CP-ALS [23] (Alternating Least-Squares), and nonnegative CP-ALS (CP-ALS-NN) [23] (CP-ALS with nonnegativity constraints). Algorithms 3-5 provide an explicit representation of the update process for each case. We further emphasize that the Streaming CP Update is a general framework that is not limited to this short list.

Algorithm 3 CP-APR update (UPDATE); see proof in Appendix
Input: $\{A^{(n)}\}$, $A_{upd}$ (initial guess)
Compute: $\Pi \leftarrow (A^{(1)} \odot A^{(2)} \odot \cdots \odot A^{(N)})^T$
while NOT CONVERGED do
  $A_{upd} \leftarrow A_{upd} \ast \left( \left( X_{(N+1),new} \oslash (A_{upd}\,\Pi) \right) \Pi^T \right)$
end while
Output: $A^{N+1} = \begin{bmatrix} [A^{N+1}_{old},\ 0_{T \times \tilde{K}_{new}}] \\ A_{upd} \end{bmatrix}$

Algorithm 4 CP-ALS update (UPDATE)
Input: $\{A^{(n)}\}$, $A_{upd}$ (initial guess)
Compute: $V \leftarrow (A^{(1)T} A^{(1)}) \ast (A^{(2)T} A^{(2)}) \ast \cdots \ast (A^{(N)T} A^{(N)})$
Compute: $W \leftarrow A^{(1)} \odot A^{(2)} \odot \cdots \odot A^{(N)}$
$A_{upd} \leftarrow X_{(N+1),new}\, W\, V^{\dagger}$
Output: $A^{N+1} = \begin{bmatrix} [A^{N+1}_{old},\ 0_{T \times \tilde{K}_{new}}] \\ A_{upd} \end{bmatrix}$
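For concreteness, here is a numpy sketch of the least-squares temporal update of Algorithm 4 (our own illustration under the unfolding convention assumed earlier, with $V^{\dagger}$ taken to be the Moore-Penrose pseudo-inverse):

import numpy as np

def khatri_rao(mats):   # as in the sketch after Algorithm 1
    out = mats[0]
    for M in mats[1:]:
        out = (out[:, None, :] * M[None, :, :]).reshape(-1, out.shape[-1])
    return out

def cp_als_temporal_update(nontemporal, A_temp_old, X_new_unf):
    """nontemporal: merged factor matrices A^(1..N), each with K_bar columns;
    A_temp_old: T x K temporal factor of the original decomposition;
    X_new_unf: update tensor matricized along the temporal mode (T_new x prod I_n)."""
    K_bar = nontemporal[0].shape[1]
    T, K_old = A_temp_old.shape
    V = np.ones((K_bar, K_bar))
    for A in nontemporal:
        V *= A.T @ A                               # Hadamard product of Gram matrices
    W = khatri_rao(nontemporal)                    # A^(1) ⊙ ... ⊙ A^(N)
    A_upd = X_new_unf @ W @ np.linalg.pinv(V)      # T_new x K_bar least-squares solve
    # Assemble the full temporal factor: [A_old, 0] on top, A_upd below.
    top = np.hstack([A_temp_old, np.zeros((T, K_bar - K_old))])
    return np.vstack([top, A_upd])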

Algorithm 5 CP-ALS-NN update (UPDATE)
Input: $\{A^{(n)}\}$, $A_{upd}$ (initial guess)
Compute: $V \leftarrow (A^{(1)T} A^{(1)}) \ast (A^{(2)T} A^{(2)}) \ast \cdots \ast (A^{(N)T} A^{(N)})$
Compute: $W \leftarrow A^{(1)} \odot A^{(2)} \odot \cdots \odot A^{(N)}$
$A_{upd} \leftarrow A_{upd} \ast \left( (X_{(N+1),new}\, W) \oslash (A_{upd}\, V) \right)$
Output: $A^{N+1} = \begin{bmatrix} [A^{N+1}_{old},\ 0_{T \times \tilde{K}_{new}}] \\ A_{upd} \end{bmatrix}$

Following the update stage, we proceed to post-processing. First, our aim is to provide classification information to the user as to which components of the updated decomposition belong to which of the following three categories:
1) Components present in the original decomposition that do not appear in the update data ($C_1$);
2) Components present in the original decomposition that appear in the update data ($C_2$);
3) Novel components ($C_3$).
Our criteria in each case can be described as follows: a component $A^{(1)}(:,i_0) \circ \cdots \circ A^{(N)}(:,i_0)$ present in the original decomposition belongs to class $C_1$ if the associated updated portion of the temporal mode is small; otherwise it belongs to class $C_2$. Class $C_3$ components are those associated with the non-temporal factor matrices $\{\tilde{A}^{(n)}_{new}\}_{n=1}^{N}$. Explicit formulas can be found in Algorithm 6, and the actual threshold $\tau$ is user-provided. This classification process is flexible and may easily accommodate additional features such as forgetfulness, by which factors not contributing to the explanation of recent data are discarded (see, e.g., [14]). This may be achieved using, for instance, weighted norms in Algorithm 6. In addition, it provides a means of quantifying the evolution of the updated decomposition's quality.

The final step involves truncating the resulting decomposition, which now has rank $\bar{K} = K + \tilde{K}_{new}$, back to a decomposition of rank $0 < K' \le \bar{K}$ provided by the user. To do so, we order the components according to their norms and select only the $K'$ largest. Information about which components were eliminated is also provided to the user. Pseudo-code for this operation can be found in Algorithm 7.

Algorithm 6 Classify modes (CLASSIFY)
Input: $\{A^{(n)}\}$, $K$, $\bar{K}$, $\tau > 0$
for $k$ from 1 to $\bar{K}$ do
  Let: $\phi_k \leftarrow A^{(1)}(:,k) \circ \cdots \circ A^{(N)}(:,k)$
  if $k \le K$ then
    if $\left( \sum_{i=1}^{T_{new}} A^{N+1}_{upd,old}(i,k)^2 \right)^{1/2} \Big/ \|A^{N+1}(:,k)\| < \tau$ then
      $\phi_k \in C_1$
    else
      $\phi_k \in C_2$
    end if
  else
    $\phi_k \in C_3$
  end if
end for
Output: $\{C_i\}_{i=1}^{3}$

Although these last two steps are not necessary for the scheme to succeed, they provide a useful and informative summary to the end user, who may not be fully familiar with the details of the framework. Furthermore, by keeping control over the rank of the updated decomposition, we can achieve a trade-off between computational and memory efficiency on the one hand and the quality of the decomposition on the other.

Algorithm 7 Truncate updated decomposition (TRUNCATE)
Input: $\{A^{(n)}\}$, $\bar{K}$, $K' > 0$
Compute: $\lambda_k \leftarrow \prod_{n=1}^{N+1} \|A^{(n)}(:,k)\|$
Sort: $A^{(n)}(:, j_1), \ldots, A^{(n)}(:, j_{\bar{K}})$ such that $\lambda_{j_1} \ge \lambda_{j_2} \ge \cdots \ge \lambda_{j_{\bar{K}}}$
Truncate: $A^{(n)} \leftarrow [A^{(n)}(:, j_1), \ldots, A^{(n)}(:, j_{K'})]$
Output: $\{A^{(n)}\}$, $S_{trunc} = \{j_{K'+1}, \ldots, j_{\bar{K}}\}$

A. Multi-mode Streaming Decomposition

To update along multiple modes, we order the modes and proceed using the single-mode streaming update, updating one mode at a time. This extension, although not the focus of the present paper, has been implemented and has produced results similar to those observed in the single-mode case (Section V). One caveat worth mentioning, however, is that the final decomposition may exhibit a dependence on the particular mode ordering in which one performs the updates, especially if truncation is performed at each step.

B. Streaming CP Downdate

As more and more data enters the stream, the temporal factor matrix grows ever larger. In certain applications, this may quickly prove prohibitive from a memory perspective. In those circumstances, it may be deemed adequate to remove data from the dataset along the streaming mode.
We designate by downdate the process of removing data and then adjusting the decomposition to take the removal into consideration. Downdating can be performed in a way that is analogous to our proposed updating process: assuming the existence of a decomposition $\{A^{(n)}\}$ of a tensor of size $I_1 \times \cdots \times I_N \times (T_{rem} + T)$, our proposed scheme proceeds as follows (a sketch of the bookkeeping in steps 2-3 appears below):
1) Identify the non-temporal components observed in the data to be removed but not present in the remaining data (using, e.g., cosine similarity);
2) Remove the identified non-temporal factors from $\{A^{(n)}\}$;
3) Remove the rows $A^{N+1}(1 : T_{rem}, :)$ from $A^{N+1}$;
4) Adjust the temporal mode using variants of Algorithms 3-5.
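The following numpy sketch (our own illustration; the names are hypothetical) captures the bookkeeping of steps 2-3, with the component identification of step 1 and the temporal readjustment of step 4 assumed to be supplied separately:

import numpy as np

def cp_downdate(factors, T_rem, drop):
    """factors: list of N+1 factor matrices (the last one temporal);
    T_rem: number of leading temporal indices to remove;
    drop: component indices judged absent from the remaining data (step 1)."""
    keep = [k for k in range(factors[0].shape[1]) if k not in set(drop)]
    out = [A[:, keep] for A in factors[:-1]]   # step 2: drop identified factors
    out.append(factors[-1][T_rem:, keep])      # step 3: drop removed time rows
    return out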

V. RESULTS

A. Component Discovery in Streaming Cyber Data

In this section, we illustrate the ability of our streaming tensor decomposition to rapidly discover components in streaming data. We show this capability with a real application use case in the cybersecurity domain. Specifically, we show how we identify a cyber attack at its onset in a real operational network, namely SCinet, and trace its evolution. SCinet, described as "the fastest network connecting the fastest computers," is set up each year at SC, the International Conference for High Performance Computing, Networking, Storage and Analysis.

Fig. 2. A component revealing a suspected DNS amplification DDoS attack from the decomposition of a DNS query tensor.

At SCinet 2017, we installed and operated ENSIGN from a node in the SCinet Network Security Cloud. ENSIGN enabled us to identify a number of suspicious network activities, including network mapping attempts, port scans, and multiple suspected DNS amplification DDoS attacks. We base our illustration of the streaming tensor decomposition on one of the multiple suspected DNS amplification DDoS attacks that were detected using ENSIGN's CP-APR decomposition implementation. We chose CP-APR since it is particularly well suited to the positive integer (count) data found in cyber tensors. It also benefits the most from the streaming framework among all tested methods (Table II).

To create a DNS query tensor from a cyber log containing DNS queries, we used the following fields as tensor modes: time, sender IP, receiver IP, DNS query, and DNS query type. In Figure 2, we show one of the components from the output of the tensor decomposition on the DNS query tensor. The component shows a single subnet from Seychelles attempting to look up a single domain across a large number of SCinet hosts. This lookup was for all DNS records related to a single domain and was repeatedly performed for a period of approximately five hours. Presumably, the sender address was forged and query responses were sent to this forged victim address. In theory, this would overwhelm the victim with response traffic. Since the vast majority of SCinet hosts are not DNS servers, significant traffic to the victim domain was not seen. In this case, it is likely that SCinet was a smaller part of a larger DDoS attack against the victim in Seychelles.

From Figure 2, we infer that the attack took place from 8:30am until 1:30pm. We used our streaming analysis capability to detect and validate that the attack could be identified at its onset in near real-time. This opened up the opportunity to notify the network administrators about suspicious activities and attacks for timely action. Figure 3 illustrates the tracking of the activity as it happens over time, describing the activity at a magnified resolution.

B. Computational Efficiency

In this section, we present experimental results to illustrate the computational efficiency of our framework and how our approach improves the response time of tensor decompositions when analyzing dynamically changing real-world data. We once again use cyber tensors for our illustration of computational efficiency. Specifically, we use the DNS query tensor described in the previous section. We formed tensors from DNS query logs generated over one day, from midnight to 4pm. For these experiments, we fixed the base tensor as the one formed from the collection of DNS queries accumulated from midnight to 9am, and we considered analysis of data streams coming in every hour. Further, we demonstrate the computational efficiency for the single-mode streaming case (i.e., assuming the data grows only along one mode, namely the time mode). Table I lists the different tensors in the order in which they are formed in time, along with their sizes.
TABLE I
CYBER TENSOR DATASETS USED FOR THE EXPERIMENTS

Tensor    | Mode sizes           | Number of non-zeros
dns 0to9  | ..., 128, 69608, ... | ...
dns 0to10 | ..., 128, 69608, ... | ...
dns 0to11 | ..., 128, 69608, ... | ...
dns 0to12 | ..., 128, 69608, ... | ...
dns 0to13 | ..., 128, 69608, ... | ...
dns 0to14 | ..., 128, 69608, ... | ...
dns 0to15 | ..., 128, 69608, ... | ...
dns 0to16 | ..., 128, 69608, ... | ...

In the absence of streaming tensor decomposition, the analysis of each tensor in the list involves a full decomposition of that tensor, without using any information from the decomposition of any of the preceding tensors in the list. Table II presents the computational efficiency, in terms of faster analysis time, when a streaming version of a particular CP method (APR, ALS, ALS-NN) is used instead of the base (full) version of the CP method.

We used a modern multi-core system to evaluate our framework: a quad-socket, 8-core-per-socket system with Intel Xeon processors (Intel Sandy Bridge microarchitecture) and 128 GB of DRAM. We use 64 threads, with hyperthreading on, for the performance runs. We observe a prominent reduction in decomposition analysis time for the streaming version of CP-APR (between 25x and 80x reduction in time). We observe between 3x and 12x reduction in time with the streaming version of CP-ALS. With the streaming version of CP-ALS-NN, we observe between 2x and 3.5x reduction in time. For all of our experiments, we used an existing parallelized and scalable implementation of the various CP decomposition routines [3]. Our profiling shows that the most expensive part of the Streaming CP Update is the decomposition of the update data itself. The remaining components of the framework, although not currently parallelized, do not represent significant overhead and may themselves be easily parallelized.

Fig. 3. Components from the streaming decomposition showing the evolution of the DNS amplification DDoS attack as it happens over time. The attack is identified at its onset.

TABLE II
TIME TAKEN BY THE THREE CP METHODS AND THEIR STREAMING VERSIONS ON CYBER TENSOR DATASETS (TIME IN SECONDS)

          | CP-APR         | CP-ALS         | CP-ALS-NN
Tensor    | Full  | Stream | Full  | Stream | Full  | Stream
dns 0to9  | ...   | ...    | ...   | ...    | ...   | ...
dns 0to10 | ...   | ...    | ...   | ...    | ...   | ...
dns 0to11 | ...   | ...    | ...   | ...    | ...   | ...
dns 0to12 | ...   | ...    | ...   | ...    | ...   | ...
dns 0to13 | ...   | ...    | ...   | ...    | ...   | ...
dns 0to14 | ...   | ...    | ...   | ...    | ...   | ...
dns 0to15 | ...   | ...    | ...   | ...    | ...   | ...
dns 0to16 | ...   | ...    | ...   | ...    | ...   | ...

Finally, Figure 4 compares the final fit of the decomposition output resulting from the application of our Streaming CP Update framework with that of an ab initio decomposition, i.e., merging the old and update data into a single, large tensor and performing an entirely new decomposition. It is clear from the final fit that our streaming decomposition method does not result in any significant loss of accuracy in the decomposition, even after multiple updates. In fact, we even see minor improvements in some cases. This counterintuitive behavior may be accounted for by the nature of the optimization problem underlying the decomposition (descent path, non-convexity), but it is not yet fully understood.

Fig. 4. Final fit (1.0 represents perfect reconstruction) resulting from our Streaming CP Update versus an ab initio decomposition. Using the streaming update framework does not significantly affect the fit (less than 7% for APR, 10% for ALS and 3% for ALS-NN after 7 updates).

VI. CONCLUSIONS

We have presented a novel, computationally efficient framework for performing CP tensor decomposition updates that is capable of updating and tracking the most important components, as well as the quality of the decomposition, on the fly. We have completed a full implementation of the method using high-performance tensor tools, which we used to evaluate the performance of the approach on real data. In doing so, we demonstrated the ability of the technique to capture important information and emerging components within a stream of data, as well as its competitive computational performance.

REFERENCES

[1] Victoria Hore, Ana Viñuela, Alfonso Buil, Julian Knight, Mark I. McCarthy, Kerrin Small, and Jonathan Marchini. Tensor decomposition for multiple-tissue gene expression experiments. Nature Genetics, 48(9):1094, 2016.
[2] Tom Henretty, Muthu Baskaran, James Ezick, David Bruns-Smith, and Tyler A. Simon. A quantitative and qualitative analysis of tensor decompositions on spatiotemporal data. In High Performance Extreme Computing Conference (HPEC), 2017 IEEE, pages 1-7. IEEE, 2017.
[3] Muthu Baskaran, Tom Henretty, Benoit Pradelle, M. Harper Langston, David Bruns-Smith, James Ezick, and Richard Lethin. Memory-efficient parallel tensor decompositions. In High Performance Extreme Computing Conference (HPEC), 2017 IEEE, pages 1-7. IEEE, 2017.
[4] Charlotte Møller Andersen and R. Bro. Practical aspects of PARAFAC modeling of fluorescence excitation-emission data. Journal of Chemometrics, 17(4), 2003.
[5] Tamir Hazan, Simon Polak, and Amnon Shashua. Sparse image coding using a 3D non-negative tensor factorization. In Computer Vision (ICCV), Tenth IEEE International Conference on, volume 1. IEEE, 2005.
[6] Furong Huang. Discovery of latent factors in high-dimensional data using tensor methods. arXiv preprint.
[7] Yuan Luo, Fei Wang, and Peter Szolovits. Tensor factorization toward precision medicine. Briefings in Bioinformatics, 18(3), 2017.
[8] U Kang, Evangelos Papalexakis, Abhay Harpale, and Christos Faloutsos. GigaTensor: scaling tensor analysis up by 100 times - algorithms and discoveries. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2012.
[9] Inah Jeon, Evangelos E. Papalexakis, U Kang, and Christos Faloutsos. HaTen2: Billion-scale tensor decompositions. In Data Engineering (ICDE), 2015 IEEE 31st International Conference on. IEEE, 2015.
[10] Shaden Smith, Niranjay Ravindran, Nicholas D. Sidiropoulos, and George Karypis. SPLATT: Efficient and parallel sparse tensor-matrix multiplication. In Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International. IEEE, 2015.
[11] ByungSoo Jeon, Inah Jeon, Lee Sael, and U Kang. SCouT: Scalable coupled matrix-tensor factorization - algorithm and discoveries. In Data Engineering (ICDE), 2016 IEEE 32nd International Conference on. IEEE, 2016.
[12] Namyong Park, Byungsoo Jeon, Jungwoo Lee, and U Kang. BIGtensor: Mining billion-scale tensor made easy. In Proceedings of the 25th ACM International Conference on Information and Knowledge Management. ACM, 2016.
[13] Shuo Zhou, Nguyen Xuan Vinh, James Bailey, Yunzhe Jia, and Ian Davidson. Accelerating online CP decompositions for higher order tensors. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2016.
[14] Shaden Smith, Kejun Huang, Nicholas D. Sidiropoulos, and George Karypis. Streaming tensor factorization for infinite data sources. In Proceedings of the 2018 SIAM International Conference on Data Mining. SIAM, 2018.
[15] Ravdeep Pasricha, Ekta Gujral, and Evangelos E. Papalexakis. Identifying and alleviating concept drift in streaming tensor decomposition. arXiv preprint, 2018.
[16] J. Ezick, M. Baskaran, A. Commike, A. Gudibanda, T. Henretty, M. H. Langston, P. Letourneau, J. Ros-Giralt, and R. Lethin. Eliminating barriers to automated tensor analysis for large-scale flows, January.
[17] Reservoir Labs. ENSIGN Tensor Toolbox.
[18] Dimitri Nion and Nicholas D. Sidiropoulos. Adaptive algorithms to track the PARAFAC decomposition of a third-order tensor. IEEE Transactions on Signal Processing, 57(6), 2009.
[19] Jimeng Sun, Dacheng Tao, and Christos Faloutsos. Beyond streams and graphs: dynamic tensor analysis. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2006.
[20] Jimeng Sun, Dacheng Tao, Spiros Papadimitriou, Philip S. Yu, and Christos Faloutsos. Incremental tensor analysis: Theory and applications. ACM Transactions on Knowledge Discovery from Data (TKDD), 2(3):11, 2008.
[21] Muthu Baskaran, M. Harper Langston, Tahina Ramananandro, David Bruns-Smith, Tom Henretty, James Ezick, and Richard Lethin. Accelerated low-rank updates to tensor decompositions. In High Performance Extreme Computing Conference (HPEC), 2016 IEEE, pages 1-7. IEEE, 2016.
[22] Eric C. Chi and Tamara G. Kolda. On tensors, sparsity, and nonnegative factorizations. SIAM Journal on Matrix Analysis and Applications, 33(4), 2012.
[23] Tamara G. Kolda and Brett W. Bader. Tensor decompositions and applications. SIAM Review, 51(3):455-500, 2009.

APPENDIX
STREAMING CP-APR UPDATE - PROOF

In this section, we prove the correctness of the streaming CP-APR update algorithm. Other update algorithms within the framework can be derived in a similar fashion. Our starting point is Algorithms 1-2 of Chi et al. [22], which describe the CP-APR algorithm as an alternating descent scheme. In this case, we fix $n = N+1$ (the temporal mode) and consider the update process, which takes the form:

Algorithm 8 CP-APR descent along mode N+1
while NOT CONVERGED do
  $A^{N+1} \leftarrow A^{N+1} \ast \left( \left( \bar{X}_{(N+1)} \oslash (A^{N+1}\, \Pi) \right) \Pi^T \right)$
end while

where
$$A^{N+1} = \begin{bmatrix} A^{N+1}_{old} & 0_{T \times \tilde{K}_{new}} \\ A^{N+1}_{upd,old} & A^{N+1}_{upd,new} \end{bmatrix}, \qquad \Pi = \left( A^{(N)} \odot A^{(N-1)} \odot \cdots \odot A^{(1)} \right)^T = \begin{bmatrix} \Pi_{old} \\ \Pi_{new} \end{bmatrix}.$$

Under our framework, $\Pi$ is a matrix of size $(K + \tilde{K}_{new}) \times \prod_{n=1}^{N} I_n$; $A^{N+1}$ is a matrix of size $(T + T_{new}) \times (K + \tilde{K}_{new})$ corresponding to the temporal-mode factor matrix of the updated decomposition; $A^{N+1}_{old}$ is of size $T \times K$ and corresponds to the temporal-mode factor matrix of the original decomposition; $A^{N+1}_{upd,old}$ is a $T_{new} \times K$ matrix corresponding to the temporal-mode update of the components we wish to compute; and $A^{N+1}_{upd,new}$ is an analogous $T_{new} \times \tilde{K}_{new}$ matrix associated with emerging components. In particular, we note that Algorithm 8 stops when it has found a matrix $A$ that is a fixed point of the operator
$$U(A) = A \ast \left( \left( \bar{X}_{(N+1)} \oslash (A\, \Pi) \right) \Pi^T \right).$$

Finally, we introduce $X_{(N+1)}$ and $X_{(N+1),new}$, the matricizations of $\mathcal{X}$ and $\mathcal{X}_{new}$ along the temporal mode, respectively, and write $\bar{X}_{(N+1)} = [X_{(N+1)}^T,\ X_{(N+1),new}^T]^T$. We also write $A_{upd} = [A^{N+1}_{upd,old},\ A^{N+1}_{upd,new}]$.

With this notation, it follows that
$$\bar{X}_{(N+1)} \oslash (A^{N+1}\, \Pi) = \begin{bmatrix} X_{(N+1)} \oslash \left( [A^{N+1}_{old},\ 0]\, \Pi \right) \\ X_{(N+1),new} \oslash (A_{upd}\, \Pi) \end{bmatrix} = \begin{bmatrix} X_{(N+1)} \oslash (A^{N+1}_{old}\, \Pi_{old}) \\ X_{(N+1),new} \oslash (A_{upd}\, \Pi) \end{bmatrix}.$$

Therefore, the update takes the form
$$U(A^{N+1}) = \begin{bmatrix} [A^{N+1}_{old},\ 0_{T \times \tilde{K}_{new}}] \ast \left( \left( X_{(N+1)} \oslash (A^{N+1}_{old}\, \Pi_{old}) \right) \Pi^T \right) \\ A_{upd} \ast \left( \left( X_{(N+1),new} \oslash (A_{upd}\, \Pi) \right) \Pi^T \right) \end{bmatrix}.$$

However, since $[[A^{(1)}, \ldots, A^{(N)}, A^{N+1}_{old}]]$ (restricted to the original $K$ components) is a decomposition of $\mathcal{X}$, the top block must be a fixed point by construction, i.e.,
$$[A^{N+1}_{old},\ 0_{T \times \tilde{K}_{new}}] = [A^{N+1}_{old},\ 0_{T \times \tilde{K}_{new}}] \ast \left( \left( X_{(N+1)} \oslash (A^{N+1}_{old}\, \Pi_{old}) \right) \Pi^T \right).$$

This indicates that the known (original) portion of the temporal factor matrix remains fixed throughout the CP-APR update, and that only the portion corresponding to the update data must be modified at each iteration. The streaming CP-APR update thus becomes:

Algorithm 9 CP-APR update (streaming)
$A^{N+1}(1:T, :) \leftarrow [A^{N+1}_{old},\ 0_{T \times \tilde{K}_{new}}]$
Initialize $A_{upd} = A^{N+1}(T+1 : T+T_{new}, :)$
while NOT CONVERGED do
  $A_{upd} \leftarrow A_{upd} \ast \left( \left( X_{(N+1),new} \oslash (A_{upd}\, \Pi) \right) \Pi^T \right)$
end while

which is the form found in Algorithm 3.
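As a small numerical illustration of this fixed-point property, the following self-contained numpy check (our own sketch; it assumes the CP-APR convention of [22] in which non-temporal factor columns are normalized to sum to one, so that each row of $\Pi$ sums to one) builds a synthetic tensor whose old block is exactly decomposed and verifies that one multiplicative step leaves $[A^{N+1}_{old},\ 0]$ unchanged:

import numpy as np

rng = np.random.default_rng(1)
I, J, T, Tn, K, Kn = 4, 5, 6, 2, 3, 2   # toy sizes: two non-temporal modes

# Non-temporal factors with columns normalized to sum to one.
A1 = rng.random((I, K + Kn)); A1 /= A1.sum(axis=0)
A2 = rng.random((J, K + Kn)); A2 /= A2.sum(axis=0)

def khatri_rao2(A, B):
    return (A[:, None, :] * B[None, :, :]).reshape(-1, A.shape[1])

Pi = khatri_rao2(A1, A2).T                     # (K+Kn) x (I*J); rows sum to 1

# Old data exactly reproduced by the padded old temporal block [A_old, 0].
A_old = rng.random((T, K))
A_top = np.hstack([A_old, np.zeros((T, Kn))])
X_old = A_top @ Pi

A_upd = rng.random((Tn, K + Kn))               # lower block: initial guess
X_new = rng.random((Tn, I * J))                # arbitrary update data

A = np.vstack([A_top, A_upd])
X = np.vstack([X_old, X_new])
A_next = A * ((X / (A @ Pi)) @ Pi.T)           # one CP-APR multiplicative step

print(np.allclose(A_next[:T], A_top))          # True: the old block is fixed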


More information

CVPR A New Tensor Algebra - Tutorial. July 26, 2017

CVPR A New Tensor Algebra - Tutorial. July 26, 2017 CVPR 2017 A New Tensor Algebra - Tutorial Lior Horesh lhoresh@us.ibm.com Misha Kilmer misha.kilmer@tufts.edu July 26, 2017 Outline Motivation Background and notation New t-product and associated algebraic

More information

Recommendation Systems

Recommendation Systems Recommendation Systems Popularity Recommendation Systems Predicting user responses to options Offering news articles based on users interests Offering suggestions on what the user might like to buy/consume

More information

Streaming multiscale anomaly detection

Streaming multiscale anomaly detection Streaming multiscale anomaly detection DATA-ENS Paris and ThalesAlenia Space B Ravi Kiran, Université Lille 3, CRISTaL Joint work with Mathieu Andreux beedotkiran@gmail.com June 20, 2017 (CRISTaL) Streaming

More information

to be more efficient on enormous scale, in a stream, or in distributed settings.

to be more efficient on enormous scale, in a stream, or in distributed settings. 16 Matrix Sketching The singular value decomposition (SVD) can be interpreted as finding the most dominant directions in an (n d) matrix A (or n points in R d ). Typically n > d. It is typically easy to

More information

Large-scale Matrix Factorization. Kijung Shin Ph.D. Student, CSD

Large-scale Matrix Factorization. Kijung Shin Ph.D. Student, CSD Large-scale Matrix Factorization Kijung Shin Ph.D. Student, CSD Roadmap Matrix Factorization (review) Algorithms Distributed SGD: DSGD Alternating Least Square: ALS Cyclic Coordinate Descent: CCD++ Experiments

More information

Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization

Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization Tabitha Samuel, Master s Candidate Dr. Michael W. Berry, Major Professor Abstract: Increasingly

More information

ARestricted Boltzmann machine (RBM) [1] is a probabilistic

ARestricted Boltzmann machine (RBM) [1] is a probabilistic 1 Matrix Product Operator Restricted Boltzmann Machines Cong Chen, Kim Batselier, Ching-Yun Ko, and Ngai Wong chencong@eee.hku.hk, k.batselier@tudelft.nl, cyko@eee.hku.hk, nwong@eee.hku.hk arxiv:1811.04608v1

More information

Postgraduate Course Signal Processing for Big Data (MSc)

Postgraduate Course Signal Processing for Big Data (MSc) Postgraduate Course Signal Processing for Big Data (MSc) Jesús Gustavo Cuevas del Río E-mail: gustavo.cuevas@upm.es Work Phone: +34 91 549 57 00 Ext: 4039 Course Description Instructor Information Course

More information

Branch Prediction based attacks using Hardware performance Counters IIT Kharagpur

Branch Prediction based attacks using Hardware performance Counters IIT Kharagpur Branch Prediction based attacks using Hardware performance Counters IIT Kharagpur March 19, 2018 Modular Exponentiation Public key Cryptography March 19, 2018 Branch Prediction Attacks 2 / 54 Modular Exponentiation

More information

Efficient Cryptanalysis of Homophonic Substitution Ciphers

Efficient Cryptanalysis of Homophonic Substitution Ciphers Efficient Cryptanalysis of Homophonic Substitution Ciphers Amrapali Dhavare Richard M. Low Mark Stamp Abstract Substitution ciphers are among the earliest methods of encryption. Examples of classic substitution

More information

Preserving Privacy in Data Mining using Data Distortion Approach

Preserving Privacy in Data Mining using Data Distortion Approach Preserving Privacy in Data Mining using Data Distortion Approach Mrs. Prachi Karandikar #, Prof. Sachin Deshpande * # M.E. Comp,VIT, Wadala, University of Mumbai * VIT Wadala,University of Mumbai 1. prachiv21@yahoo.co.in

More information

Recovering Tensor Data from Incomplete Measurement via Compressive Sampling

Recovering Tensor Data from Incomplete Measurement via Compressive Sampling Recovering Tensor Data from Incomplete Measurement via Compressive Sampling Jason R. Holloway hollowjr@clarkson.edu Carmeliza Navasca cnavasca@clarkson.edu Department of Electrical Engineering Clarkson

More information

Quick Introduction to Nonnegative Matrix Factorization

Quick Introduction to Nonnegative Matrix Factorization Quick Introduction to Nonnegative Matrix Factorization Norm Matloff University of California at Davis 1 The Goal Given an u v matrix A with nonnegative elements, we wish to find nonnegative, rank-k matrices

More information

Research Article A Novel Differential Evolution Invasive Weed Optimization Algorithm for Solving Nonlinear Equations Systems

Research Article A Novel Differential Evolution Invasive Weed Optimization Algorithm for Solving Nonlinear Equations Systems Journal of Applied Mathematics Volume 2013, Article ID 757391, 18 pages http://dx.doi.org/10.1155/2013/757391 Research Article A Novel Differential Evolution Invasive Weed Optimization for Solving Nonlinear

More information

A Bregman alternating direction method of multipliers for sparse probabilistic Boolean network problem

A Bregman alternating direction method of multipliers for sparse probabilistic Boolean network problem A Bregman alternating direction method of multipliers for sparse probabilistic Boolean network problem Kangkang Deng, Zheng Peng Abstract: The main task of genetic regulatory networks is to construct a

More information

CS224W: Methods of Parallelized Kronecker Graph Generation

CS224W: Methods of Parallelized Kronecker Graph Generation CS224W: Methods of Parallelized Kronecker Graph Generation Sean Choi, Group 35 December 10th, 2012 1 Introduction The question of generating realistic graphs has always been a topic of huge interests.

More information

Kronecker Product Approximation with Multiple Factor Matrices via the Tensor Product Algorithm

Kronecker Product Approximation with Multiple Factor Matrices via the Tensor Product Algorithm Kronecker Product Approximation with Multiple actor Matrices via the Tensor Product Algorithm King Keung Wu, Yeung Yam, Helen Meng and Mehran Mesbahi Department of Mechanical and Automation Engineering,

More information

LU Factorization. LU factorization is the most common way of solving linear systems! Ax = b LUx = b

LU Factorization. LU factorization is the most common way of solving linear systems! Ax = b LUx = b AM 205: lecture 7 Last time: LU factorization Today s lecture: Cholesky factorization, timing, QR factorization Reminder: assignment 1 due at 5 PM on Friday September 22 LU Factorization LU factorization

More information

Process Model Formulation and Solution, 3E4

Process Model Formulation and Solution, 3E4 Process Model Formulation and Solution, 3E4 Section B: Linear Algebraic Equations Instructor: Kevin Dunn dunnkg@mcmasterca Department of Chemical Engineering Course notes: Dr Benoît Chachuat 06 October

More information

Count-Min Tree Sketch: Approximate counting for NLP

Count-Min Tree Sketch: Approximate counting for NLP Count-Min Tree Sketch: Approximate counting for NLP Guillaume Pitel, Geoffroy Fouquier, Emmanuel Marchand and Abdul Mouhamadsultane exensa firstname.lastname@exensa.com arxiv:64.5492v [cs.ir] 9 Apr 26

More information