Window-based Tensor Analysis on High-dimensional and Multi-aspect Streams
Jimeng Sun, Spiros Papadimitriou, Philip S. Yu
Carnegie Mellon University, Pittsburgh, PA, USA
IBM T.J. Watson Research Center, Hawthorne, NY, USA

Abstract

Data stream values are often associated with multiple aspects. For example, each value from environmental sensors may have an associated type (e.g., temperature, humidity, etc.) as well as a location. Aside from the timestamp, type and location are the two additional aspects. How to model such streams? How to simultaneously find patterns within and across the multiple aspects? How to do it incrementally in a streaming fashion? In this paper, all these problems are addressed through a general data model, tensor streams, and an effective algorithmic framework, window-based tensor analysis (WTA). Two variations of WTA, independent-window tensor analysis (IW) and moving-window tensor analysis (MW), are presented and evaluated extensively on real datasets. Finally, we illustrate one important application, Multi-Aspect Correlation Analysis (MACA), which uses WTA, and we demonstrate its effectiveness on an environmental monitoring application.

1 Introduction

Data streams have received attention in different communities due to emerging applications, such as environmental monitoring and sensor networks [7]. The data are modelled as a number of co-evolving streams (time series with an increasing length). Most data mining operations need to be redesigned for data streams because of the streaming requirement, i.e., the mining result has to be updated efficiently for the newly arrived data.

In the standard stream model, each value is associated with a (timestamp, stream-id) pair. However, the stream-id itself may have some additional structure. For example, it may be decomposed into (location-id, type) ≡ stream-id. We call each such component of the stream-id an aspect.
The number of discrete values each aspect may take is called its dimensionality, e.g., the individual dimensions of the type aspect are temperature, humidity, etc. This additional structure should not be ignored in data exploration tasks, since it may provide additional insights. Thus the typical flat-world view may be insufficient. In summary, even though the traditional data stream model is quite general, it cannot easily capture some important aspects of the data.

In this paper, we tackle the problem at three levels. First, we address the issue of modeling such high-dimensional and multi-aspect streams. We present the tensor stream model, which generalizes multiple streams into high-order tensors represented using a sequence of multi-arrays.

Second, we study how to summarize the tensor stream efficiently. We generalize the moving/sliding window model from a single stream to a tensor stream. Every tensor window includes multiple tensors. Each of these tensors corresponds to the multi-aspect set of measurements associated with one timestamp. Subsequently, using multilinear analysis [3], which is a generalization of matrix analysis, we propose window-based tensor analysis (WTA) for tensor streams, which summarizes the tensor windows efficiently, using small core tensors associated with different projection matrices, where core tensors and projection matrices are analogous to the singular values and singular vectors of a matrix. Two variations of the algorithms for WTA are proposed: 1) independent-window tensor analysis (IW), which treats each tensor window independently; 2) moving-window tensor analysis (MW), which exploits the time dependence across neighboring windows to reduce computational cost significantly.

Third, we introduce an important application using WTA, which demonstrates its practical significance.
In particular, we describe Multi-Aspect Correlation Analysis (MACA), which simultaneously finds correlations within a single aspect and also across multiple aspects.

In summary, our main contributions are the following:
- Data model: We introduce tensor streams to deal with high-dimensional and multi-aspect streams.
- Algorithmic framework: We propose window-based tensor analysis (WTA) to effectively extract core patterns from tensor streams.
- Application: Based on WTA, multi-aspect correlation analysis (MACA) is presented to simultaneously compute the correlation within and across all aspects.
2 Problem Formulation

[Figure 1. Window-based tensor analysis. Labels: tensor stream, tensor window (size W), location, type, projection matrices U, core tensor.]

In this section, we first formally define the notions of tensor stream and tensor window. Then we formulate two versions of window-based tensor analysis (WTA). Recall that a tensor mode corresponds to an aspect of the data streams.

Definition 1 (Tensor stream) A sequence of Mth-order tensors $\mathcal{D}_1, \ldots, \mathcal{D}_n$, where each $\mathcal{D}_i \in \mathbb{R}^{N_1 \times \cdots \times N_M}$ ($1 \le i \le n$) and $n$ is an integer that increases with time, is called a tensor stream, denoted as $\{\mathcal{D}_i \mid 1 \le i \le n,\ n \to \infty\}$.

We can consider that a tensor stream is built incrementally over time. The most recent tensor in the stream is $\mathcal{D}_n$. In the environmental monitoring example, a new 2nd-order tensor (as shown in Figure 1) arrives every minute.

Definition 2 (Tensor window) A tensor window $\mathcal{D}(n, W)$ consists of a subset of a tensor stream ending at time $n$ with size $W$. Formally, $\mathcal{D}(n, W) \in \mathbb{R}^{W \times N_1 \times \cdots \times N_M} = \{\mathcal{D}_{n-W+1}, \ldots, \mathcal{D}_n\}$, where each $\mathcal{D}_i \in \mathbb{R}^{N_1 \times \cdots \times N_M}$.

A tensor window localizes the tensor stream into a (smaller) tensor sequence with cardinality $W$ at a particular time $n$. The current tensor window refers to the window ending at the current time. Notice that the tensor window is a natural high-order generalization of the sliding window model in data streams.

The goal of window-based tensor analysis (WTA) is to incrementally extract patterns from high-dimensional and multi-aspect streams. In this paper, we formulate this pattern extraction process as a dimensionality reduction problem on tensor windows. Two versions of WTA are presented as follows:

Problem 1 (Independent-window tensor analysis (IW)) Given a tensor window $\mathcal{D} \in \mathbb{R}^{W \times N_1 \times \cdots \times N_M}$, find the projection matrices $U_0 \in \mathbb{R}^{W \times R_0}$ and $\{U_i \in \mathbb{R}^{N_i \times R_i}\}_{i=1}^{M}$ such that the reconstruction error is minimized:

$$e = \left\| \mathcal{D} - \mathcal{D} \prod_{i=0}^{M} \times_i \left(U_i U_i^T\right) \right\|_F^2$$

The core tensor $\mathcal{Y}$ is defined as $\mathcal{D} \prod_{i=0}^{M} \times_i U_i$.
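The definitions above can be made concrete with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`mode_product`, `tensor_window`, `core_tensor`, `reconstruction_error`) are hypothetical, the stream is held as a Python list whose first element is $\mathcal{D}_1$, and the projection written $\times_i U_i$ in the text is coded explicitly as multiplication by $U_i^T$ along mode $i$.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along `mode` by `matrix` (shape: new_dim x old_dim)."""
    moved = np.moveaxis(tensor, mode, -1)      # bring the chosen mode to the last axis
    return np.moveaxis(moved @ matrix.T, -1, mode)

def tensor_window(stream, n, W):
    """D(n, W): stack D_{n-W+1}, ..., D_n (1-based n) along a new time mode 0."""
    return np.stack(stream[n - W:n], axis=0)

def core_tensor(window, projections):
    """Project mode i onto the R_i columns of U_i (each U_i has shape N_i x R_i)."""
    core = window
    for mode, U in enumerate(projections):
        core = mode_product(core, U.T, mode)
    return core

def reconstruction_error(window, projections):
    """Squared Frobenius norm of window minus its projection U_i U_i^T on every mode."""
    approx = window
    for mode, U in enumerate(projections):
        approx = mode_product(approx, U @ U.T, mode)
    return np.linalg.norm(window - approx) ** 2
```

With full-rank identity projections the error is zero; truncating any $U_i$ shrinks the core tensor and generally leaves a positive error.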
Intuitively, a projection matrix specifies the concepts along a given aspect/mode, while the core tensor consists of the concept association across all aspects/modes. Note that for 2nd-order tensors (i.e., matrices), the core tensor is the diagonal matrix $\Sigma$ of singular values.

Problem 2 (Moving-window tensor analysis (MW)) Given the current tensor window $\mathcal{D}(n, W) \in \mathbb{R}^{W \times N_1 \times \cdots \times N_M}$ and the old result for $\mathcal{D}(n-1, W)$, find the new projection matrices $U_0 \in \mathbb{R}^{W \times R_{0,n}}$ and $\{U_i \in \mathbb{R}^{N_i \times R_{i,n}}\}_{i=1}^{M}$ such that the reconstruction error is minimized:

$$e = \left\| \mathcal{D}(n, W) - \mathcal{D}(n, W) \prod_{i=0}^{M} \times_i \left(U_i U_i^T\right) \right\|_F^2$$

Finally, we illustrate a mining application using the output of WTA in Section 5.

3 Window-based Tensor Analysis

Section 3.1 first introduces the goal and insight of the general window-based tensor analysis, where we point out the importance of good initialization for the algorithm. Next we propose two algorithms, independent-window and moving-window tensor analysis, based on different initialization strategies in Section 3.2.

3.1 Iterative Optimization on windows

The goal of tensor analysis is to find the set of orthonormal projection matrices $\{U_i\}_{i=0}^{M}$ that minimize the reconstruction error $d(\mathcal{D}, \tilde{\mathcal{D}})$, where $d(\cdot, \cdot)$ is a divergence function between two tensors; $\mathcal{D}$ is the input tensor; $\tilde{\mathcal{D}}$ is the approximation tensor defined as $\mathcal{D} \prod_{i=0}^{M} \times_i (U_i U_i^T)$.

The principle is to optimize parameters one at a time by fixing all the rest. The benefit comes from the simplicity and robustness of the algorithms. An iterative meta-algorithm for window-based tensor analysis is shown in Figure 2. To instantiate the algorithm, three things are needed:

Initialization condition: This turns out to be the crucial component for data streams. Different schemes for this are presented in Section 3.2.

Optimization strategy: This is closely related to the divergence function $d(\cdot, \cdot)$. Gradient descent type methods can be developed in most cases.
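However, in this paper, we use the squared Frobenius norm $\|\cdot\|_F^2$ as the divergence function, which naturally leads to a simpler and faster iterative method, alternating least squares. More specifically, line 4 of Figure 2 is replaced by the following steps:
1. Construct $\mathcal{D}_d = \mathcal{D} \left( \prod_{i \ne d} \times_i U_i \right) \in \mathbb{R}^{R_0 \times \cdots \times N_d \times \cdots \times R_M}$
2. unfold($\mathcal{D}_d$, $d$) as $D_{(d)} \in \mathbb{R}^{\left(\prod_{k \ne d} R_k\right) \times N_d}$
3. Construct the variance matrix $C_d = D_{(d)}^T D_{(d)}$
4. Compute $U_d$ by diagonalizing $C_d$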
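The four steps above can be sketched in NumPy as follows. This is a hedged illustration rather than the paper's code: `mode_product`, `update_mode` and `als_sweeps` are hypothetical names, the starting point is a truncated identity matrix per mode, and a fixed number of sweeps stands in for a convergence test.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along `mode` by `matrix` (shape: new_dim x old_dim)."""
    moved = np.moveaxis(tensor, mode, -1)
    return np.moveaxis(moved @ matrix.T, -1, mode)

def update_mode(window, projections, d):
    """One alternating-least-squares update of U_d with all other U_i fixed."""
    # Step 1: project every mode except d onto its current subspace.
    partial = window
    for i, U in enumerate(projections):
        if i != d:
            partial = mode_product(partial, U.T, i)
    # Step 2: unfold so rows index the other modes and columns index mode d.
    unfolded = np.moveaxis(partial, d, -1).reshape(-1, partial.shape[d])
    # Step 3: variance matrix of size N_d x N_d.
    C_d = unfolded.T @ unfolded
    # Step 4: diagonalize and keep the R_d leading eigenvectors.
    R_d = projections[d].shape[1]
    _, eigvecs = np.linalg.eigh(C_d)          # eigenvalues come back ascending
    return eigvecs[:, ::-1][:, :R_d]

def als_sweeps(window, ranks, sweeps=5):
    """Run a fixed number of sweeps over all modes (assumed stopping rule)."""
    projections = [np.eye(n)[:, :r] for n, r in zip(window.shape, ranks)]
    for _ in range(sweeps):
        for d in range(window.ndim):
            projections[d] = update_mode(window, projections, d)
    return projections
```

Because each update picks the best orthonormal $U_d$ for the fixed remaining modes, the squared reconstruction error does not increase from one update to the next.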
Input:
  The tensor window $\mathcal{D} \in \mathbb{R}^{W \times N_1 \times \cdots \times N_M}$
  The dimensionality of the output tensors $\mathcal{Y} \in \mathbb{R}^{R_0 \times \cdots \times R_M}$
Output:
  The projection matrices $U_0 \in \mathbb{R}^{W \times R_0}$, $\{U_i\}_{i=1}^{M} \in \mathbb{R}^{N_i \times R_i}$ and the core tensor $\mathcal{Y}$
Algorithm:
  1. Initialize $\{U_i\}_{i=0}^{M}$
  2. Conduct 3-5 iteratively
  3. For k = 0 to M
  4.   Fix $U_i$ for $i \ne k$ and find the $U_k$ that minimizes $d(\mathcal{D}, \mathcal{D} \prod_{i=0}^{M} \times_i (U_i U_i^T))$
  5. Check convergence
  6. Calculate the core tensor $\mathcal{Y} = \mathcal{D} \prod_{i=0}^{M} \times_i U_i$
Figure 2. Iterative tensor analysis

Convergence checking: We use the standard approach of monitoring the change of the projection matrices until it is sufficiently small. Formally, the change on the i-th mode is quantified by $\mathrm{trace}\left(\left|\, |U_{i,\mathrm{new}}^T U_{i,\mathrm{old}}| - I \,\right|\right)$.

3.2 Independent and moving window tensor analysis

Here we first introduce independent-window tensor analysis (IW) as the baseline algorithm. Then moving-window tensor analysis (MW) is presented, which exploits the time dependence structure to quickly set a good initial condition, thereby significantly reducing the computational cost.

Independent-window tensor analysis (IW) is a simple way to deal with tensor windows by fitting the model independently. At every timestamp a tensor window $\mathcal{D}(n, W)$ is formed, which includes the current tensor $\mathcal{D}_n$ and $W - 1$ old tensors. Then we can apply the alternating least squares method (Figure 2) on $\mathcal{D}(n, W)$ to compute the projection matrices $\{U_i\}_{i=0}^{M}$. In theory, the projection matrices $U_i$ can be initialized to any orthonormal matrices. For instance, we initialize $U_i$ to be an $N_i \times R_i$ truncated identity matrix in the experiments, which leads to extremely fast initialization of the projection matrices. However, the number of iterations until convergence can be large.

Moving-window tensor analysis (MW) utilizes the overlapping information of two consecutive tensor windows to update the variance matrices $\{C_d\}_{d=1}^{M}$. More specifically, given a tensor window $\mathcal{D}(n, W) \in \mathbb{R}^{W \times N_1 \times \cdots \times N_M}$, we have $M + 1$ variance matrices $\{C_i\}_{i=0}^{M}$, one for each mode.
Note that the current window $\mathcal{D}(n, W)$ removes an old tensor $\mathcal{D}_{n-W}$ and includes a new tensor $\mathcal{D}_n$, compared to the previous window $\mathcal{D}(n-1, W)$.

Update modes 1 to M: For all but the time mode, the variance matrix is as follows:

$$C_d^{\mathrm{old}} = \begin{bmatrix} X \\ D \end{bmatrix}^T \begin{bmatrix} X \\ D \end{bmatrix} = X^T X + D^T D$$

where $X$ is the unfolding matrix of the old tensor $\mathcal{D}_{n-W}$ and $D$ is the unfolding matrix of the tensor window $\mathcal{D}(n-1, W-1)$ (i.e., the overlapping part of the two consecutive tensor windows). Similarly, $C_d^{\mathrm{new}} = D^T D + Y^T Y$, where $Y$ is the unfolding matrix of the new tensor $\mathcal{D}_n$. As a result, the update can be easily achieved as follows:

$$C_d \leftarrow C_d - D_{n-W,(d)}^T D_{n-W,(d)} + D_{n,(d)}^T D_{n,(d)}$$

where $D_{n-W,(d)}$ ($D_{n,(d)}$) is the mode-d unfolding matrix of tensor $\mathcal{D}_{n-W}$ ($\mathcal{D}_n$). Intuitively, the variance matrix can be updated easily when adding or deleting rows from an unfolded matrix, since all computation only involves the added and deleted rows.

Updating the time mode (mode 0) cannot be done as efficiently as the other modes. Fortunately, the iterative algorithm actually only needs initialization for all but one mode in order to start. Therefore, after initialization of the other modes, the iterated update starts from the time mode and proceeds until convergence. This gives both quick convergence and fast initialization. The pseudo-code is listed in Figure 3.

Input:
  The new tensor $\mathcal{D}_n \in \mathbb{R}^{N_1 \times \cdots \times N_M}$ for inclusion
  The old tensor $\mathcal{D}_{n-W} \in \mathbb{R}^{N_1 \times \cdots \times N_M}$ for removal
  The old variance matrices $\{C_d\}_{d=1}^{M}$
  The dimensionality of the output tensors $\mathcal{Y} \in \mathbb{R}^{R_0 \times \cdots \times R_M}$
Output:
  The new variance matrices $\{C_d\}_{d=1}^{M}$
  The projection matrices $\{U_i\}_{i=0}^{M} \in \mathbb{R}^{N_i \times R_i}$
Algorithm:
  // Initialize every mode except time
  1. For d = 1 to M
  2.   Mode-d matricize $\mathcal{D}_{n-W}$ ($\mathcal{D}_n$) as $D_{n-W,(d)}$ ($D_{n,(d)}$)
  3.   Update $C_d \leftarrow C_d - D_{n-W,(d)}^T D_{n-W,(d)} + D_{n,(d)}^T D_{n,(d)}$
  4.   Diagonalization $C_d = U_d \Lambda_d U_d^T$
  5.   Truncate $U_d$ to first $R_d$ columns
  6. Apply the iterative algorithm with the new initialization
Figure 3. Moving-window tensor analysis (MW)
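The variance-matrix update of Figure 3 can be sketched as below. This is a minimal illustration under assumptions: the function names are hypothetical, tensors are NumPy arrays, and the list `C` holds $C_1, \ldots, C_M$ in order, so entry `k` corresponds to mode `k` of an individual (time-free) tensor.

```python
import numpy as np

def unfold(tensor, mode):
    """Unfold so that rows index all other modes and columns index `mode`."""
    return np.moveaxis(tensor, mode, -1).reshape(-1, tensor.shape[mode])

def variance_matrices(window):
    """C_1..C_M of a window (time is mode 0), computed from scratch."""
    return [unfold(window, d).T @ unfold(window, d) for d in range(1, window.ndim)]

def update_variance_matrices(C, old_tensor, new_tensor):
    """Slide the window by one step: drop D_{n-W}, add D_n."""
    updated = []
    for mode, C_d in enumerate(C):
        X = unfold(old_tensor, mode)   # rows leaving the window
        Y = unfold(new_tensor, mode)   # rows entering the window
        updated.append(C_d - X.T @ X + Y.T @ Y)
    return updated

def initial_projections(C, ranks):
    """Diagonalize each C_d and keep its R_d leading eigenvectors."""
    projections = []
    for C_d, R_d in zip(C, ranks):
        _, eigvecs = np.linalg.eigh(C_d)   # eigenvalues come back ascending
        projections.append(eigvecs[:, ::-1][:, :R_d])
    return projections
```

The update touches only the departing and arriving tensors, so its cost does not depend on the window size $W$; the resulting projections would then seed the iterative algorithm, starting from the time mode.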
4 Performance Evaluation

Data Description

SENSOR: The sensor data are collected from 5 MICA Mote sensors placed in a lab, over a period of a month. Every 3 seconds each Mote sensor sends to the central collector via wireless radio four types of measurements: light intensity, humidity, temperature, battery voltage. In order to compare different types, we scale each type of measurement to zero mean and unit variance. This calibration process can actually be done online since mean and variance can be easily updated incrementally. Note that we put equal weight on all measurements across all nodes. However, other weighting schemes can be easily applied, based on the application.

Characteristics: The data are very bursty but still correlated across locations and time. For example, the measurements of the same type behave similarly, since all the nodes are deployed in the same lab.

Tensor construction: By scanning through the data once, the tensor windows are incrementally constructed and processed/decomposed. More specifically, every tensor window is a 3-mode tensor with dimensions (W, 5, 4) where W varies from 1 to 5 in the experiments.

Parameters: This experiment has three parameters:
1) Window size W: the number of timestamps included in each tensor window.
2) Step ratio S: the number of newly arrived tensors in the new tensor window divided by the window size W (a ratio between 0 and 1).
3) Core tensor size: $(R_0, R_1, R_2)$ where $R_0$ is the size of the time mode.

IW and MW reach the same error level¹ across all the experiments, since we use the same termination criterion for the iterative algorithm in both cases.

Stable over time: Figure 4(a) shows the CPU time as a function of elapsed time, where we set W = 1, S = . (i.e. % new tensors). Overall, CPU time for both IW and MW exhibits a constant trend. MW achieves about 3% overall improvement compared to IW, on both datasets. The performance gain of MW comes from its incremental initialization scheme. As shown in Figure 4(b), the CPU time is strongly correlated with the number of iterations. As a result, MW, which reduces the number of iterations needed, is much faster than IW.
Consistent across different parameter settings:

Window size: Figure 4(c) shows CPU time (in log scale) vs. window size W. CPU time increases with window size. Note that the MW method achieves big computational savings across all sizes, compared to IW.

Step size: Figure 4(d) presents step size vs. CPU time. MW is much faster than IW across all settings, even when there is no overlap between two consecutive tensor windows (i.e., step size equals 1). This clearly shows the importance of a good initialization for the iterative algorithm.

Core tensor size: We vary the core tensor size along the time mode and show CPU time as a function of this size (see Figure 4(e)). Again, MW performs much faster than IW, over all sizes. Similar results are achieved when varying the sizes of the other modes, so we omit them for brevity.

¹ Reconstruction error is $e(\mathcal{D}) = \|\mathcal{D} - \tilde{\mathcal{D}}\|_F^2 / \|\mathcal{D}\|_F^2$, where the tensor reconstruction is $\tilde{\mathcal{D}} = \mathcal{D} \prod_i \times_i (U_i U_i^T)$.

5 Application and Case study

In this section, we introduce a powerful mining application of window-based tensor analysis, Multi-Aspect Correlation Analysis (MACA). Then we present a case study of MACA on the SENSOR dataset.

[Figure 5. Case study on environmental data. 1st factor (scaling constant 38): (a1) daily pattern over time (min), (b1) main pattern over location, (c1) main correlation over type. 2nd factor: (a2) abnormal residual over time (min), (b2) three abnormal sensors over location, (c2) voltage anomaly over type. Type axis: Volt, Humid, Temp, Light.]

Figure 5 shows two factors across three aspects (modes): time, location, and type. Intuitively, each factor corresponds to a trend and consists of a global importance weight of this trend, a pattern across time summarizing the typical behavior for this trend and, finally, one set of weights for each aspect (representing their participation in this trend). More specifically, the trends are from projection matrices and the weights from core tensors.
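One plausible way to read off such a factor is sketched below. The text does not spell out how a factor is extracted from the WTA output, so this is an assumption: factor `j` is taken to be the `j`-th column of every projection matrix, scaled by the diagonal core-tensor entry at index `(j, ..., j)`. The name `maca_factor` is hypothetical.

```python
import numpy as np

def maca_factor(core, projections, j):
    """Return (weight, per-aspect components, rank-one approximation) for factor j.

    Assumption: the global weight is the diagonal core entry core[j, ..., j],
    and the component for each aspect is column j of its projection matrix.
    """
    weight = core[(j,) * core.ndim]
    components = [U[:, j] for U in projections]
    approx = weight
    for vec in components:
        # Outer product across aspects: time x location x type x ...
        approx = np.multiply.outer(approx, vec)
    return weight, components, approx
```

Under this reading, each component can be plotted on its own axis (time, location, type), while the weight plays the role of the scaling constant that combines them into an approximation of the data.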
Figures 5(a1,b1,c1) show the three components (one for each aspect) of the first factor, which is the main trend. If we read the components independently, Figure 5(a1) shows the daily periodic pattern over time (high activation during the day, low activation at night); (b1) shows the participation weights of all sensor locations, where the weights seem to be uniformly spread out; (c1) shows that light, temperature and voltage are positively correlated (all positive values) while they are anti-correlated with humidity (negative values). All of those patterns can be confirmed from the raw data. All three components act together, scaled by the constant factor 38, to form an approximation of the data.

Part of the residual of that approximation is shown as the second factor in Figures 5(a2,b2,c2), which corresponds to some major anomalies. More specifically, Figure 5(a2) suggests that this pattern is uniform across all time, without a clear trend as in (a1); (b2) shows that two sensors have strong positive weights while one other sensor has a dominating low weight; (c2) tells us that this happened mainly at voltage, which has the most prominent weight. The residual (i.e., the information remaining after the first factor is subtracted) is in turn approximated by combining all three components and scaling by the constant -15. In layman's terms, two sensors have abnormally low voltage compared to the rest, while one other sensor has unusually high voltage. Again all three cases are confirmed from the data.

[Figure 4. Experiments. CPU time vs. (a) elapsed time (min), (b) number of iterations, (c) window size, (d) step size, (e) rank ($R_0$ on the time mode).]

6 Related work

Data streams have been extensively studied in recent years. Recent surveys [6] have discussed many data stream algorithms. Among them, the sliding (or moving) window is a popular model in the data stream literature. Most of these methods monitor some statistics in the sliding window over a single stream using probabilistic counting techniques, while we work with a more general notion of multi-aspect streams.

Tensor algebra and multilinear analysis have been applied successfully in many domains. Powerful tools have been proposed, including the Tucker decomposition [9], parallel factor analysis [4] or canonical decomposition [2]. Kolda et al. [5] apply PARAFAC on web graphs to generalize the hub and authority scores for web ranking through term information. These methods usually assume static data, while we are interested in streams of tensors.
For the dynamic case, Sun et al. [8] proposed dynamic and streaming tensor analysis for higher-order data streams, but the time aspect is not analyzed explicitly as in this paper.

7 Conclusion

In collections of multiple, time-evolving data streams from a large number of sources, different data values are often associated with multiple aspects. Usually, the patterns change over time. In this paper, we propose the tensor stream model to capture the structured dynamics in such collections of streams. Furthermore, we introduce tensor windows, which naturally generalize the sliding window model. Under this data model, two techniques, independent-window and moving-window tensor analysis (IW and MW), are presented to incrementally summarize the tensor stream. The summary consists of local patterns over time, which formally correspond to core tensors with the associated projection matrices. Finally, extensive performance evaluation and a case study demonstrate the efficiency and effectiveness of the proposed methods.

8 Acknowledgement

We are pleased to acknowledge Brett Bader and Tamara Kolda from Sandia National Laboratories for providing the tensor toolbox [1], which made our implementation and experiments an easy job.

References

[1] B. W. Bader and T. G. Kolda. Matlab tensor toolbox version., tgkolda/tensortoolbox/..
[2] J. D. Carroll and J. Chang. Analysis of individual differences in multidimensional scaling via an n-way generalization of Eckart-Young decomposition. Psychometrika, 35(3), 1970.
[3] L. de Lathauwer. Signal Processing Based on Multilinear Algebra. PhD thesis, Katholieke Universiteit Leuven, Belgium, 1997.
[4] R. Harshman. Foundations of the PARAFAC procedure: model and conditions for an explanatory multi-mode factor analysis. UCLA Working Papers in Phonetics, 16, 1970.
[5] T. G. Kolda, B. W. Bader, and J. P. Kenny. Higher-order web link analysis using multilinear algebra. In ICDM, 2005.
[6] S. Muthukrishnan. Data streams: algorithms and applications, volume 1. Foundations and Trends in Theoretical Computer Science, 2005.
[7] S. Papadimitriou, J. Sun, and C. Faloutsos. Streaming pattern discovery in multiple time-series. In VLDB, 2005.
[8] J. Sun, D. Tao, and C. Faloutsos. Beyond streams and graphs: Dynamic tensor analysis. In KDD, 2006.
[9] L. R. Tucker. Some mathematical notes on three-mode factor analysis. Psychometrika, 31(3), 1966.
More informationA Randomized Approach for Crowdsourcing in the Presence of Multiple Views
A Randomized Approach for Crowdsourcing in the Presence of Multiple Views Presenter: Yao Zhou joint work with: Jingrui He - 1 - Roadmap Motivation Proposed framework: M2VW Experimental results Conclusion
More informationBruce Hendrickson Discrete Algorithms & Math Dept. Sandia National Labs Albuquerque, New Mexico Also, CS Department, UNM
Latent Semantic Analysis and Fiedler Retrieval Bruce Hendrickson Discrete Algorithms & Math Dept. Sandia National Labs Albuquerque, New Mexico Also, CS Department, UNM Informatics & Linear Algebra Eigenvectors
More informationPV211: Introduction to Information Retrieval https://www.fi.muni.cz/~sojka/pv211
PV211: Introduction to Information Retrieval https://www.fi.muni.cz/~sojka/pv211 IIR 18: Latent Semantic Indexing Handout version Petr Sojka, Hinrich Schütze et al. Faculty of Informatics, Masaryk University,
More informationA FLEXIBLE MODELING FRAMEWORK FOR COUPLED MATRIX AND TENSOR FACTORIZATIONS
A FLEXIBLE MODELING FRAMEWORK FOR COUPLED MATRIX AND TENSOR FACTORIZATIONS Evrim Acar, Mathias Nilsson, Michael Saunders University of Copenhagen, Faculty of Science, Frederiksberg C, Denmark University
More informationSalt Dome Detection and Tracking Using Texture Analysis and Tensor-based Subspace Learning
Salt Dome Detection and Tracking Using Texture Analysis and Tensor-based Subspace Learning Zhen Wang*, Dr. Tamir Hegazy*, Dr. Zhiling Long, and Prof. Ghassan AlRegib 02/18/2015 1 /42 Outline Introduction
More informationDimitri Nion & Lieven De Lathauwer
he decomposition of a third-order tensor in block-terms of rank-l,l, Model, lgorithms, Uniqueness, Estimation of and L Dimitri Nion & Lieven De Lathauwer.U. Leuven, ortrijk campus, Belgium E-mails: Dimitri.Nion@kuleuven-kortrijk.be
More informationMining Frequent Items in a Stream Using Flexible Windows (Extended Abstract)
Mining Frequent Items in a Stream Using Flexible Windows (Extended Abstract) Toon Calders, Nele Dexters and Bart Goethals University of Antwerp, Belgium firstname.lastname@ua.ac.be Abstract. In this paper,
More informationPower System Analysis Prof. A. K. Sinha Department of Electrical Engineering Indian Institute of Technology, Kharagpur. Lecture - 21 Power Flow VI
Power System Analysis Prof. A. K. Sinha Department of Electrical Engineering Indian Institute of Technology, Kharagpur Lecture - 21 Power Flow VI (Refer Slide Time: 00:57) Welcome to lesson 21. In this
More informationSVD, PCA & Preprocessing
Chapter 1 SVD, PCA & Preprocessing Part 2: Pre-processing and selecting the rank Pre-processing Skillicorn chapter 3.1 2 Why pre-process? Consider matrix of weather data Monthly temperatures in degrees
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 6: Bias and variance (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 49 Our plan today We saw in last lecture that model scoring methods seem to be trading off two different
More informationData Preprocessing Tasks
Data Tasks 1 2 3 Data Reduction 4 We re here. 1 Dimensionality Reduction Dimensionality reduction is a commonly used approach for generating fewer features. Typically used because too many features can
More informationTensor Decompositions and Applications
Tamara G. Kolda and Brett W. Bader Part I September 22, 2015 What is tensor? A N-th order tensor is an element of the tensor product of N vector spaces, each of which has its own coordinate system. a =
More informationEECS 275 Matrix Computation
EECS 275 Matrix Computation Ming-Hsuan Yang Electrical Engineering and Computer Science University of California at Merced Merced, CA 95344 http://faculty.ucmerced.edu/mhyang Lecture 20 1 / 20 Overview
More informationSVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices)
Chapter 14 SVD, Power method, and Planted Graph problems (+ eigenvalues of random matrices) Today we continue the topic of low-dimensional approximation to datasets and matrices. Last time we saw the singular
More informationc 2008 Society for Industrial and Applied Mathematics
SIAM J MATRIX ANAL APPL Vol 30, No 3, pp 1219 1232 c 2008 Society for Industrial and Applied Mathematics A JACOBI-TYPE METHOD FOR COMPUTING ORTHOGONAL TENSOR DECOMPOSITIONS CARLA D MORAVITZ MARTIN AND
More informationMath 671: Tensor Train decomposition methods II
Math 671: Tensor Train decomposition methods II Eduardo Corona 1 1 University of Michigan at Ann Arbor December 13, 2016 Table of Contents 1 What we ve talked about so far: 2 The Tensor Train decomposition
More informationLab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018
Lab 8: Measuring Graph Centrality - PageRank Monday, November 5 CompSci 531, Fall 2018 Outline Measuring Graph Centrality: Motivation Random Walks, Markov Chains, and Stationarity Distributions Google
More informationMatrix Assembly in FEA
Matrix Assembly in FEA 1 In Chapter 2, we spoke about how the global matrix equations are assembled in the finite element method. We now want to revisit that discussion and add some details. For example,
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
More informationParallel Singular Value Decomposition. Jiaxing Tan
Parallel Singular Value Decomposition Jiaxing Tan Outline What is SVD? How to calculate SVD? How to parallelize SVD? Future Work What is SVD? Matrix Decomposition Eigen Decomposition A (non-zero) vector
More informationBehavioral Data Mining. Lecture 7 Linear and Logistic Regression
Behavioral Data Mining Lecture 7 Linear and Logistic Regression Outline Linear Regression Regularization Logistic Regression Stochastic Gradient Fast Stochastic Methods Performance tips Linear Regression
More informationCS246 Final Exam, Winter 2011
CS246 Final Exam, Winter 2011 1. Your name and student ID. Name:... Student ID:... 2. I agree to comply with Stanford Honor Code. Signature:... 3. There should be 17 numbered pages in this exam (including
More informationMobile Robotics 1. A Compact Course on Linear Algebra. Giorgio Grisetti
Mobile Robotics 1 A Compact Course on Linear Algebra Giorgio Grisetti SA-1 Vectors Arrays of numbers They represent a point in a n dimensional space 2 Vectors: Scalar Product Scalar-Vector Product Changes
More informationMachine Learning for Large-Scale Data Analysis and Decision Making A. Week #1
Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Week #1 Today Introduction to machine learning The course (syllabus) Math review (probability + linear algebra) The future
More informationA Reservoir Sampling Algorithm with Adaptive Estimation of Conditional Expectation
A Reservoir Sampling Algorithm with Adaptive Estimation of Conditional Expectation Vu Malbasa and Slobodan Vucetic Abstract Resource-constrained data mining introduces many constraints when learning from
More informationLecture 2. Tensor Unfoldings. Charles F. Van Loan
From Matrix to Tensor: The Transition to Numerical Multilinear Algebra Lecture 2. Tensor Unfoldings Charles F. Van Loan Cornell University The Gene Golub SIAM Summer School 2010 Selva di Fasano, Brindisi,
More informationOptimization of Symmetric Tensor Computations
Optimization of Symmetric Tensor Computations Jonathon Cai Department of Computer Science, Yale University New Haven, CT 0650 Email: jonathon.cai@yale.edu Muthu Baskaran, Benoît Meister, Richard Lethin
More informationRank Determination for Low-Rank Data Completion
Journal of Machine Learning Research 18 017) 1-9 Submitted 7/17; Revised 8/17; Published 9/17 Rank Determination for Low-Rank Data Completion Morteza Ashraphijuo Columbia University New York, NY 1007,
More informationLecture 7 MIMO Communica2ons
Wireless Communications Lecture 7 MIMO Communica2ons Prof. Chun-Hung Liu Dept. of Electrical and Computer Engineering National Chiao Tung University Fall 2014 1 Outline MIMO Communications (Chapter 10
More informationvmmlib Tensor Approximation Classes Susanne K. Suter April, 2013
vmmlib Tensor pproximation Classes Susanne K. Suter pril, 2013 Outline Part 1: Vmmlib data structures Part 2: T Models Part 3: Typical T algorithms and operations 2 Downloads and Resources Tensor pproximation
More informationA Cross-Associative Neural Network for SVD of Nonsquared Data Matrix in Signal Processing
IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 12, NO. 5, SEPTEMBER 2001 1215 A Cross-Associative Neural Network for SVD of Nonsquared Data Matrix in Signal Processing Da-Zheng Feng, Zheng Bao, Xian-Da Zhang
More informationIntroduction to Machine Learning
10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what
More informationFast Nonnegative Tensor Factorization with an Active-Set-Like Method
Fast Nonnegative Tensor Factorization with an Active-Set-Like Method Jingu Kim and Haesun Park Abstract We introduce an efficient algorithm for computing a low-rank nonnegative CANDECOMP/PARAFAC(NNCP)
More informationTensor rank-one decomposition of probability tables
Tensor rank-one decomposition of probability tables Petr Savicky Inst. of Comp. Science Academy of Sciences of the Czech Rep. Pod vodárenskou věží 2 82 7 Prague, Czech Rep. http://www.cs.cas.cz/~savicky/
More informationSingular Value Decompsition
Singular Value Decompsition Massoud Malek One of the most useful results from linear algebra, is a matrix decomposition known as the singular value decomposition It has many useful applications in almost
More informationCS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works
CS68: The Modern Algorithmic Toolbox Lecture #8: How PCA Works Tim Roughgarden & Gregory Valiant April 20, 206 Introduction Last lecture introduced the idea of principal components analysis (PCA). The
More informationTemporal Multi-View Inconsistency Detection for Network Traffic Analysis
WWW 15 Florence, Italy Temporal Multi-View Inconsistency Detection for Network Traffic Analysis Houping Xiao 1, Jing Gao 1, Deepak Turaga 2, Long Vu 2, and Alain Biem 2 1 Department of Computer Science
More informationSPARSE TENSORS DECOMPOSITION SOFTWARE
SPARSE TENSORS DECOMPOSITION SOFTWARE Papa S Diaw, Master s Candidate Dr. Michael W. Berry, Major Professor Department of Electrical Engineering and Computer Science University of Tennessee, Knoxville
More informationAnomaly Detection. Davide Mottin, Konstantina Lazaridou. HassoPlattner Institute. Graph Mining course Winter Semester 2016
Anomaly Detection Davide Mottin, Konstantina Lazaridou HassoPlattner Institute Graph Mining course Winter Semester 2016 and Next week February 7, third round presentations Slides are due by February 6,
More informationLinear Algebra and Robot Modeling
Linear Algebra and Robot Modeling Nathan Ratliff Abstract Linear algebra is fundamental to robot modeling, control, and optimization. This document reviews some of the basic kinematic equations and uses
More informationCS246: Mining Massive Data Sets Winter Only one late period is allowed for this homework (11:59pm 2/14). General Instructions
CS246: Mining Massive Data Sets Winter 2017 Problem Set 2 Due 11:59pm February 9, 2017 Only one late period is allowed for this homework (11:59pm 2/14). General Instructions Submission instructions: These
More informationEfficient CP-ALS and Reconstruction From CP
Efficient CP-ALS and Reconstruction From CP Jed A. Duersch & Tamara G. Kolda Sandia National Laboratories Livermore, CA Sandia National Laboratories is a multimission laboratory managed and operated by
More information1 Singular Value Decomposition and Principal Component
Singular Value Decomposition and Principal Component Analysis In these lectures we discuss the SVD and the PCA, two of the most widely used tools in machine learning. Principal Component Analysis (PCA)
More informationInformation Retrieval
Introduction to Information CS276: Information and Web Search Christopher Manning and Pandu Nayak Lecture 13: Latent Semantic Indexing Ch. 18 Today s topic Latent Semantic Indexing Term-document matrices
More informationTHE PERTURBATION BOUND FOR THE SPECTRAL RADIUS OF A NON-NEGATIVE TENSOR
THE PERTURBATION BOUND FOR THE SPECTRAL RADIUS OF A NON-NEGATIVE TENSOR WEN LI AND MICHAEL K. NG Abstract. In this paper, we study the perturbation bound for the spectral radius of an m th - order n-dimensional
More informationTensor-Based Dictionary Learning for Multidimensional Sparse Recovery. Florian Römer and Giovanni Del Galdo
Tensor-Based Dictionary Learning for Multidimensional Sparse Recovery Florian Römer and Giovanni Del Galdo 2nd CoSeRa, Bonn, 17-19 Sept. 2013 Ilmenau University of Technology Institute for Information
More informationUAPD: Predicting Urban Anomalies from Spatial-Temporal Data
UAPD: Predicting Urban Anomalies from Spatial-Temporal Data Xian Wu, Yuxiao Dong, Chao Huang, Jian Xu, Dong Wang and Nitesh V. Chawla* Department of Computer Science and Engineering University of Notre
More informationPerformance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization
Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization Tabitha Samuel, Master s Candidate Dr. Michael W. Berry, Major Professor Abstract: Increasingly
More informationAs it is not necessarily possible to satisfy this equation, we just ask for a solution to the more general equation
Graphs and Networks Page 1 Lecture 2, Ranking 1 Tuesday, September 12, 2006 1:14 PM I. II. I. How search engines work: a. Crawl the web, creating a database b. Answer query somehow, e.g. grep. (ex. Funk
More informationGoogle Page Rank Project Linear Algebra Summer 2012
Google Page Rank Project Linear Algebra Summer 2012 How does an internet search engine, like Google, work? In this project you will discover how the Page Rank algorithm works to give the most relevant
More informationVectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =
Linear Algebra Review Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1 x x = 2. x n Vectors of up to three dimensions are easy to diagram.
More informationSingular Value Decomposition. 1 Singular Value Decomposition and the Four Fundamental Subspaces
Singular Value Decomposition This handout is a review of some basic concepts in linear algebra For a detailed introduction, consult a linear algebra text Linear lgebra and its pplications by Gilbert Strang
More informationStreaming multiscale anomaly detection
Streaming multiscale anomaly detection DATA-ENS Paris and ThalesAlenia Space B Ravi Kiran, Université Lille 3, CRISTaL Joint work with Mathieu Andreux beedotkiran@gmail.com June 20, 2017 (CRISTaL) Streaming
More informationCS246 Final Exam. March 16, :30AM - 11:30AM
CS246 Final Exam March 16, 2016 8:30AM - 11:30AM Name : SUID : I acknowledge and accept the Stanford Honor Code. I have neither given nor received unpermitted help on this examination. (signed) Directions
More informationLinear Regression. Robot Image Credit: Viktoriya Sukhanova 123RF.com
Linear Regression These slides were assembled by Eric Eaton, with grateful acknowledgement of the many others who made their course materials freely available online. Feel free to reuse or adapt these
More information