ON TENSOR TUCKER DECOMPOSITION: THE CASE FOR AN ADJUSTABLE CORE SIZE

BILIAN CHEN, ZHENING LI, AND SHUZHONG ZHANG

Abstract. This paper is concerned with the problem of finding a Tucker decomposition for tensors. Traditionally, solution methods for Tucker decomposition presume that the size of the core tensor is specified in advance, which may not be a realistic assumption in some applications. In this paper we propose a new computational model in which the configuration and the size of the core become part of the decisions to be optimized. Our approach is based on the so-called maximum block improvement (MBI) method for non-convex block optimization. We consider a number of real applications in this paper, showing the promising numerical performance of the proposed algorithms.

Key words. multiway array, Tucker decomposition, low-rank approximation, maximum block improvement

AMS subject classifications. 15A69, 90C26, 90C59, 49M27, 65F99

1. Introduction. The models and solution methods for high order tensor decompositions were initially motivated and developed in psychometrics (see, e.g., [41, 42, 43]) and chemometrics (see, e.g., [4]), where the need for the analysis of multiway data is imperative. Since then, tensor decomposition has become an active field of research in mathematics, due partly to the intricate algebraic structures of tensors. In addition, high order (tensor) statistics and independent component analysis (ICA) have been developed by engineers and statisticians, primarily because of their wide practical applications.
There are two major tensor decompositions that can be considered as higher order extensions of the matrix singular value decomposition (SVD): one is known as the CANDECOMP/PARAFAC (CP) decomposition, which attempts to represent the tensor as a sum of rank-one tensors and has various applications in chemometrics [2, 4], neuroscience [1], and data mining [6]; the other is known as the Tucker decomposition, which is a higher order generalization of principal component analysis (PCA) and has found wide applications in signal processing [18], data mining [37], computer vision [45, 46], and chemical analysis [22, 23]. There are extensive studies in the literature on finding the CP decomposition and the Tucker decomposition of tensors; see, e.g., [16, 15, 27, 30, 31, 35, 17, 47, 50, 40]. On the computational side, the existing popular algorithms are based on the alternating least squares (ALS) method proposed originally by Carroll and Chang [12] and Harshman [20]. The ALS method is not guaranteed to converge to a global optimum or a stationary point, but only to a solution where the objective function ceases to decrease. If the ALS method does converge, however, then under a certain non-degeneracy assumption it actually has a local linear convergence rate [44]. Recently, Chen et al. [14] put forward a block coordinate descent type search method known as maximum block improvement (MBI) for solving a class of nonlinear optimization problems with separable structure (including tensor decompositions as special cases), and proved that the sequence produced by the MBI method converges to a stationary point in general. The method also performs very well numerically. The MBI method has been applied successfully to the problem of finding the best rank-one approximation of tensors [14], and to solving problems arising from bioinformatics [49, 48] and radar systems [5]. Local linear convergence of the MBI method for certain types of problems was established by Li et al. [33]. A recent treatise on this topic can be found in the Ph.D. thesis of Chen [13].

In this paper, we study the Tucker decomposition problem with the characteristic that the size of its core is unspecified. We shall demonstrate how the MBI method can be applied to solve such problems. Tucker decomposition can be viewed as a generalization of CP decomposition, the latter being a Tucker model with an equal number of components in each mode. It was first introduced in 1963 by Tucker [41], and later redefined by Levin [32] and Tucker [42, 43]. The goal of Tucker decomposition is to decompose a tensor into a core tensor multiplied by a matrix along each mode. It is related to the best rank-(r_1, r_2, ..., r_d) approximation of d-th order tensors (cf. [16]). Typically, a rank-(r_1, r_2, ..., r_d) Tucker decomposition (sometimes called the best multilinear rank-(r_1, r_2, ..., r_d) approximation) means that the size of the core tensor is r_1 × r_2 × ⋯ × r_d. In terms of decomposition methods, Tucker [43] proposed three methods to find the Tucker decomposition of three-way tensors in 1966, among which the first is referred to as the Tucker1 method, now better known as the higher order singular value decomposition (HOSVD) (see [15]). In [15], De Lathauwer et al. not only showed that the HOSVD of a tensor is a convincing generalization of the SVD of a matrix, but also proposed a truncated HOSVD that gives a suboptimal rank-(r_1, r_2, ..., r_d) approximation of tensors.

Department of Automation, Xiamen University, Xiamen, China (blchen@xmu.edu.cn).
Department of Mathematics, Shanghai University, Baoshan District, Shanghai, China (zheningli@gmail.com). The research of this author was supported in part by the Natural Science Foundation of Shanghai (Grant 12ZR ) and the Ph.D. Programs Foundation of the Chinese Ministry of Education (Grant ).
Department of Industrial and Systems Engineering, University of Minnesota, Minneapolis, MN 55455, USA (zhangs@umn.edu). The research of this author was supported in part by the National Science Foundation of USA (Grant CMMI ).
Analogous to the ALS method developed for computing the CP decomposition, researchers have also derived similar methods for computing the Tucker decomposition. Kroonenberg and De Leeuw [30] proposed TUCKALS3 for computing the Tucker decomposition of three-way arrays. Later, Kapteyn et al. [25] extended TUCKALS3 to decompose general d-way arrays for d > 3. Moreover, De Lathauwer et al. [16] derived a more efficient technique, the higher order orthogonal iteration (HOOI) method, for computing the rank-(r_1, r_2, ..., r_d) Tucker decomposition. Possible improvements of the HOOI algorithm are considered by Andersson and Bro [3]. However, as mentioned earlier, the ALS method has no convergence guarantee in general. Alternatively, there are methods for Tucker decomposition with a convergence guarantee. Eldén and Savas [19], and Ishteva et al. [24], simultaneously proposed Newton-based methods for computing the best multilinear rank-(r_1, r_2, r_3) approximation of tensors; the former is known as the Newton-Grassmann method, and the latter is also called the differential-geometric Newton method. Both methods have the local quadratic convergence property. (Note that computing the Hessian is usually a very demanding task.) For more discussion of the Tucker decomposition and its variants, one is referred to the survey paper by Kolda and Bader [29].

In Tucker decomposition, the rank-(r_1, r_2, ..., r_d) (namely, the size of the core) is considered a set of predetermined parameters. This requirement, however, is restrictive in some applications. For example, the problem of how to choose a good number of clusters for co-clustering of gene expression data comes up in bioinformatics; in fact, the number of suitable co-clusters is not known in advance (see Zhang et al. [48]). The question of how to choose the rank of a Tucker model (cf. [39, 26, 21]) is challenging, since the problem is already NP-hard even if the Tucker rank (r_1, r_2, ..., r_d) is known. A similar issue of selecting an appropriate rank has also been considered for the CP decomposition, e.g., the consistency diagnostic named CORCONDIA designed in [11]. Timmerman and Kiers [39] proposed the DIFFIT procedure, based on optimal fit, to choose the numbers of components in the Tucker decomposition of three-way data. The procedure, however, is rather time-consuming. Kiers and Der Kinderen [26] revised the procedure and computed DIFFIT on approximate fit to save computational time. In this paper, we propose a new scheme to choose good ranks in Tucker decomposition, provided that the summation of the ranks, Σ_{i=1}^d r_i, is fixed.

The remainder of the paper is organized as follows. In Section 2, we introduce the notation and the tensor operations used throughout the paper. We then apply the MBI method to solve the rank-(r_1, r_2, ..., r_d) Tucker decomposition in Section 3. In Section 4, a new model for Tucker decomposition with unspecified core size is proposed and solved based on the MBI method and the penalty method. A heuristic approach to compute the new model is also developed in that section. Finally, numerical results on testing the new model and algorithms are presented in Section 5.

2. Notations. Throughout this paper, we uniformly use non-bold lowercase letters, boldface lowercase letters, capital letters, and calligraphic letters to denote scalars, vectors, matrices, and tensors, respectively; e.g., scalar i, vector y, matrix A, and tensor F. We use subscripts to denote the components of a vector, a matrix, or a tensor; e.g., y_i is the i-th entry of the vector y, A_{ij} the (i, j)-th entry of the matrix A, and F_{ijk} the (i, j, k)-th entry of the tensor F. Let us first introduce some important tensor operations that appear frequently in this paper, which are largely in line with those in [29, 15]. For an overview of tensor operations and properties, we refer to the survey paper [29]. A tensor is a multidimensional array, and the order of a tensor is its dimension, also known as the ways or the modes of the tensor.
In particular, a vector is a tensor of order one, and a matrix is a tensor of order two. Consider A = (A_{i_1 i_2 ... i_d}) ∈ R^{n_1 × n_2 × ⋯ × n_d}, a standard tensor of order d, where d ≥ 3. A usual way to handle a tensor is to reorder its elements into a matrix; the process is called matricization, also known as unfolding or flattening. A ∈ R^{n_1 × n_2 × ⋯ × n_d} has d modes, namely, mode-1, mode-2, ..., mode-d. Denote the mode-k matricization of tensor A by A_(k); then the (i_1, i_2, ..., i_d)-th entry of tensor A is mapped to the (i_k, j)-th entry of the matrix A_(k) ∈ R^{n_k × ∏_{l≠k} n_l}, where

j = 1 + Σ_{l=1, l≠k}^{d} (i_l − 1) ∏_{t=1, t≠k}^{l−1} n_t.

The k-rank of tensor A, denoted by rank_k(A), is the column rank of the mode-k unfolding A_(k), i.e., rank_k(A) = rank(A_(k)). A d-th order tensor with rank_k(A) = r_k for k = 1, 2, ..., d is briefly called a rank-(r_1, r_2, ..., r_d) tensor. Analogous to the Frobenius norm of a matrix, the Frobenius norm of tensor A is the usual 2-norm, defined by

‖A‖ := ( Σ_{i_1=1}^{n_1} Σ_{i_2=1}^{n_2} ⋯ Σ_{i_d=1}^{n_d} A_{i_1 i_2 ... i_d}^2 )^{1/2}.

In this paper we uniformly denote the 2-norm for vectors and the Frobenius norm for matrices and tensors by the notation ‖·‖. The inner product of two same-sized tensors A and B is given by

⟨A, B⟩ = Σ_{i_1=1}^{n_1} Σ_{i_2=1}^{n_2} ⋯ Σ_{i_d=1}^{n_d} A_{i_1 i_2 ... i_d} B_{i_1 i_2 ... i_d}.

Hence, it is clear that ⟨A, A⟩ = ‖A‖². One important tensor operation is the multiplication of a tensor by a matrix. The k-mode product of tensor A by a matrix U ∈ R^{m × n_k}, denoted by A ×_k U, is a tensor in R^{n_1 × ⋯ × n_{k−1} × m × n_{k+1} × ⋯ × n_d}, whose (i_1, ..., i_{k−1}, l, i_{k+1}, ..., i_d)-th entry is defined by

(A ×_k U)_{i_1 ... i_{k−1} l i_{k+1} ... i_d} = Σ_{i_k=1}^{n_k} A_{i_1 i_2 ... i_d} U_{l i_k}.

The equation can also be written in terms of tensor unfolding, i.e.,

Y = A ×_k U  ⟺  Y_(k) = U A_(k).

This multiplication in fact changes the dimension of tensor A in mode k. In particular, if U is a vector in R^{n_k}, the order of the tensor A ×_k U is reduced to d − 1, and its size is n_1 × ⋯ × n_{k−1} × n_{k+1} × ⋯ × n_d. The k-mode products can also be expressed by the matrix Kronecker product as follows:

Y = A ×_1 U^(1) ×_2 U^(2) ⋯ ×_d U^(d)  ⟺  Y_(k) = U^(k) A_(k) (U^(d) ⊗ ⋯ ⊗ U^(k+1) ⊗ U^(k−1) ⊗ ⋯ ⊗ U^(1))^T,

for any k = 1, 2, ..., d, where U^(i) ∈ R^{m_i × n_i} for i = 1, 2, ..., d. The proof of this property can be found in [28]. Throughout this paper we uniformly use a subscript in parentheses to denote the matricization of a tensor (e.g., A_(1) is the mode-1 matricization of tensor A), and a superscript in parentheses to denote a matrix in the mode product of a tensor (e.g., U^(1) of appropriate size in the mode-1 product of a tensor).

3. Convergence of the traditional Tucker decomposition. Traditionally, Tucker decomposition attempts to find the best approximation of a large-sized tensor by a small-sized tensor with pre-specified dimensions (called the core tensor) multiplied by a matrix on each mode. The problem can be formulated as follows: Given a real tensor F ∈ R^{n_1 × n_2 × ⋯ × n_d}, find a core tensor C ∈ R^{r_1 × r_2 × ⋯ × r_d}, with pre-specified integers r_i satisfying 1 ≤ r_i ≤ n_i for i = 1, 2, ..., d, that solves

(T_min)  min ‖F − C ×_1 A^(1) ×_2 A^(2) ⋯ ×_d A^(d)‖
         s.t. C ∈ R^{r_1 × r_2 × ⋯ × r_d}, A^(i) ∈ R^{n_i × r_i}, i = 1, 2, ..., d.

Here, the matrices A^(i) are the factor matrices.
Without loss of generality, these matrices are assumed to be columnwise orthogonal. The problem can be considered as a generalization of the best rank-one approximation problem, namely, the case r_1 = r_2 = ⋯ = r_d = 1. For any fixed matrices A^(i), if we optimize the objective function of (T_min) over C, it is not hard to verify that (T_min) is equivalent to the following maximization model (see the discussion in [29, 16, 3]):

(T_max)  max ‖F ×_1 (A^(1))^T ×_2 (A^(2))^T ⋯ ×_d (A^(d))^T‖
         s.t. A^(i) ∈ R^{n_i × r_i}, (A^(i))^T A^(i) = I, i = 1, 2, ..., d.
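To make the operations of Section 2 concrete, the following is a small NumPy sketch of mode-k unfolding, the k-mode product, and the identity Y = A ×_k U ⟺ Y_(k) = U A_(k). This is our own illustration, not the authors' code (the paper's experiments use the MATLAB Tensor Toolbox), and the helper names are ours:

```python
import numpy as np

def unfold(A, k):
    """Mode-k matricization: mode k becomes the rows. (The column ordering
    differs from the index map in Section 2, but the column rank, hence the
    k-rank, and the Frobenius norm are unaffected.)"""
    return np.moveaxis(A, k, 0).reshape(A.shape[k], -1)

def mode_product(A, U, k):
    """k-mode product A x_k U: contract mode k of A with the second index of U."""
    return np.moveaxis(np.tensordot(U, A, axes=(1, k)), 0, k)

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4, 5))   # a third-order tensor
U = rng.standard_normal((6, 4))      # acts on mode 2 (Python axis 1)

Y = mode_product(A, U, 1)

# The unfolding identity Y = A x_k U  <=>  Y_(k) = U A_(k):
assert Y.shape == (3, 6, 5)
assert np.allclose(unfold(Y, 1), U @ unfold(A, 1))

# The Frobenius norm of a tensor equals that of any of its unfoldings:
assert np.isclose(np.linalg.norm(A), np.linalg.norm(unfold(A, 0)))
```

Any fixed column ordering of the unfolding works for the purposes of this paper, since ranks and norms are invariant under column permutations.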
The workhorse for solving (T_max) or (T_min) has traditionally been the ALS method. However, the convergence of the ALS method is not guaranteed in general. Chen et al. [14] proposed the MBI method, a greedy-type search algorithm for optimization models with a general separable structure:

(G)  max f(x_1, x_2, ..., x_d)
     s.t. x_i ∈ S_i ⊆ R^{n_i}, i = 1, 2, ..., d,

where f : R^{n_1 + n_2 + ⋯ + n_d} → R is a general continuous function. Assuming that, for any fixed d − 1 blocks of variables x_i, the optimization over the remaining block can be done, the MBI method is guaranteed to converge to a stationary point under the mild condition that S_i is compact for i = 1, 2, ..., d. We notice that (T_max) is a particular instance of (G). Moreover, by fixing any d − 1 blocks A^(i) in (T_max), the optimization subproblem in the MBI method can be easily solved by the singular value decomposition (SVD). To be specific, the subproblem of (T_max) when applying the MBI method is

(T^i_max)  max ‖F ×_1 (A^(1))^T ×_2 (A^(2))^T ⋯ ×_d (A^(d))^T‖
           s.t. A^(i) ∈ R^{n_i × r_i}, (A^(i))^T A^(i) = I,

where the matrices A^(1), A^(2), ..., A^(i−1), A^(i+1), ..., A^(d) are given. The objective function of (T^i_max) can be written in matrix form as

‖(A^(i))^T F_(i) (A^(d) ⊗ ⋯ ⊗ A^(i+1) ⊗ A^(i−1) ⊗ ⋯ ⊗ A^(1))‖.

Therefore, the optimal solution of (T^i_max) is given by the r_i leading left singular vectors of the matrix F_(i) (A^(d) ⊗ ⋯ ⊗ A^(i+1) ⊗ A^(i−1) ⊗ ⋯ ⊗ A^(1)). Let us now present the algorithm for solving (T_max) by the MBI approach.

Algorithm 1. The MBI method for Tucker decomposition
Input: tensor F ∈ R^{n_1 × n_2 × ⋯ × n_d} and scalars r_1, r_2, ..., r_d.
Output: core tensor C ∈ R^{r_1 × r_2 × ⋯ × r_d} and matrices A^(i) ∈ R^{n_i × r_i} for i = 1, 2, ..., d.
Step 0: Choose an initial feasible solution (A^(1)_0, A^(2)_0, ..., A^(d)_0) and compute the initial objective value v_0 := ‖F ×_1 (A^(1)_0)^T ×_2 (A^(2)_0)^T ⋯ ×_d (A^(d)_0)^T‖. Set k := 0.
Step 1: For each i = 1, 2, ..., d, compute B^(i)_{k+1}, consisting of the r_i leading left singular vectors of F_(i) (A^(d)_k ⊗ ⋯ ⊗ A^(i+1)_k ⊗ A^(i−1)_k ⊗ ⋯ ⊗ A^(1)_k), and let

w^i_{k+1} := ‖F ×_1 (A^(1)_k)^T ⋯ ×_{i−1} (A^(i−1)_k)^T ×_i (B^(i)_{k+1})^T ×_{i+1} (A^(i+1)_k)^T ⋯ ×_d (A^(d)_k)^T‖.

Step 2: Let v_{k+1} := max_{1≤i≤d} w^i_{k+1}, choose one i* = arg max_{1≤i≤d} w^i_{k+1}, and denote

A^(i)_{k+1} := A^(i)_k for i ∈ {1, 2, ..., d} \ {i*};  A^(i*)_{k+1} := B^(i*)_{k+1}.

Step 3: If v_{k+1} − v_k ≤ ε, stop, and output the core tensor C := F ×_1 (A^(1)_{k+1})^T ×_2 (A^(2)_{k+1})^T ⋯ ×_d (A^(d)_{k+1})^T and the matrices A^(i) := A^(i)_{k+1} for i = 1, 2, ..., d; otherwise, set k := k + 1 and go to Step 1.
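The steps above can be sketched in NumPy as follows. This is an illustrative implementation under our own naming and default tolerances, not the authors' MATLAB code; per iteration a candidate SVD update is computed for every block, but only the single best block is accepted:

```python
import numpy as np

def unfold(A, k):
    """Mode-k matricization (any fixed column ordering works here)."""
    return np.moveaxis(A, k, 0).reshape(A.shape[k], -1)

def multi_mode_product(F, mats):
    """Compute F x_1 mats[0]^T x_2 mats[1]^T ... x_d mats[d-1]^T."""
    X = F
    for k, A in enumerate(mats):
        X = np.moveaxis(np.tensordot(A.T, X, axes=(1, k)), 0, k)
    return X

def mbi_tucker(F, ranks, max_iter=200, tol=1e-8, seed=0):
    """Sketch of Algorithm 1 (the MBI method for Tucker decomposition)."""
    rng = np.random.default_rng(seed)
    d = F.ndim
    # Step 0: random columnwise-orthonormal starting factors
    A = [np.linalg.qr(rng.standard_normal((F.shape[i], ranks[i])))[0]
         for i in range(d)]
    v = np.linalg.norm(multi_mode_product(F, A))
    for _ in range(max_iter):
        best_w, best_i, best_B = -np.inf, None, None
        for i in range(d):  # Step 1: candidate SVD update for each block
            G = multi_mode_product(
                F, [A[j] if j != i else np.eye(F.shape[i]) for j in range(d)])
            U = np.linalg.svd(unfold(G, i), full_matrices=False)[0]
            B = U[:, :ranks[i]]
            w = np.linalg.norm(
                multi_mode_product(F, [A[j] if j != i else B for j in range(d)]))
            if w > best_w:
                best_w, best_i, best_B = w, i, B
        if best_w - v <= tol:   # Step 3: stop when no block improves enough
            break
        A[best_i] = best_B      # Step 2: accept only the best block
        v = best_w
    core = multi_mode_product(F, A)
    return core, A, v

F = np.random.default_rng(1).standard_normal((5, 5, 5))
core, A, v = mbi_tucker(F, (2, 2, 2))
```

By construction the objective value v is non-decreasing over the iterations and is bounded above by ‖F‖, which is what the convergence argument of [14] exploits.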
According to the convergence result of the MBI method established in [14], Algorithm 1 is guaranteed to converge to a stationary point of the Tucker decomposition problem, which is not the case for many other algorithms, e.g., the HOSVD and HOOI methods.

4. Tucker decomposition with unspecified size of the core. In this section, we propose a new model for Tucker decomposition that does not pre-specify the size of the core tensor. The dimension of each mode of the core is no longer a constant; rather, it is a variable that also needs to be optimized, which is a key ingredient of our model. Several algorithms are proposed to solve the new model as well.

4.1. The formulation. Given a tensor F ∈ R^{n_1 × n_2 × ⋯ × n_d}, the goal is to find a small-sized (low rank) core tensor C and d slim matrices A^(i) that express F as closely as possible. We are interested in determining the rank of each mode of C as well as the best approximation of F. Let the i-rank of C be r_i for i = 1, 2, ..., d. Clearly, we have 1 ≤ r_i ≤ n_i for i = 1, 2, ..., d. Unlike the general Tucker decomposition, the r_i are now decision variables that need to be determined. Denote by c a given constant for the summation of all the i-rank variables, i.e., Σ_{i=1}^d r_i = c, which in general prevents the r_i from being too large. To determine the allocation of the total number c among the r_i, the new model is

(NT_min)  min ‖F − C ×_1 A^(1) ×_2 A^(2) ⋯ ×_d A^(d)‖
          s.t. C ∈ R^{r_1 × r_2 × ⋯ × r_d}, A^(i) ∈ R^{n_i × r_i}, i = 1, 2, ..., d,
               r_i ∈ Z, 1 ≤ r_i ≤ n_i, i = 1, 2, ..., d,
               Σ_{i=1}^d r_i = c.

It is difficult to solve (NT_min) directly, since the first two constraints of (NT_min) couple the block variables A^(i) and the i-rank variables r_i. A straightforward remedy is to separate these variables: we introduce d more block variables Y^(i) ∈ R^{m_i × m_i}, where m_i := min{n_i, c} for i = 1, 2, ..., d, with Y^(i) = diag(y^(i)), y^(i) ∈ {0, 1}^{m_i}, and Σ_{j=1}^{m_i} y^(i)_j = r_i for i = 1, 2, ..., d.
By the equivalence between the Tucker decomposition models (T_min) and (T_max), we may reformulate (NT_min) as follows:

(NT_max)  max ‖F ×_1 (A^(1) Y^(1))^T ×_2 (A^(2) Y^(2))^T ⋯ ×_d (A^(d) Y^(d))^T‖
          s.t. A^(i) ∈ R^{n_i × m_i}, (A^(i))^T A^(i) = I, i = 1, 2, ..., d,
               y^(i) ∈ {0, 1}^{m_i}, i = 1, 2, ..., d,
               Σ_{j=1}^{m_i} y^(i)_j ≥ 1, i = 1, 2, ..., d,
               Σ_{i=1}^d Σ_{j=1}^{m_i} y^(i)_j = c.

Throughout this paper, Y^(i) always represents diag(y^(i)); we do not write Y^(i) = diag(y^(i)) explicitly in the constraints, in order to keep the models concise. Note that r_i in (NT_min) has been replaced by the number of nonzero entries of y^(i) in (NT_max). Let

X = F ×_1 (A^(1) Y^(1))^T ×_2 (A^(2) Y^(2))^T ⋯ ×_d (A^(d) Y^(d))^T ∈ R^{m_1 × m_2 × ⋯ × m_d}.

If the feasible block variables Y^(i) satisfy
Y^(i) = diag(1, 1, ..., 1, 0, 0, ..., 0), with r_i leading ones and m_i − r_i trailing zeros, i = 1, 2, ..., d,   (4.1)

then the size of the core tensor X can be reduced to r_1 × r_2 × ⋯ × r_d by deleting the tails of zero entries in all modes, and the ranks of X in the modes are equal to r_1, r_2, ..., r_d, respectively. This observation thus establishes the equivalence between (NT_min) and (NT_max). As we shall see in the next subsection, we may without loss of generality assume Y^(i) to be of the form (4.1). This allows us to easily construct a rank-(r_1, r_2, ..., r_d) core tensor, which is exactly the same size as the core tensor C to be optimized in (NT_min). Besides, we remark that the third constraint, Σ_{j=1}^{m_i} y^(i)_j ≥ 1 for i = 1, 2, ..., d, in (NT_max) is actually redundant: if Σ_{j=1}^{m_i} y^(i)_j = 0 for some i, then the objective is clearly zero, which can never be optimal. We nevertheless keep it in (NT_max) for contingency in implementing the algorithms discussed in the following subsections.

4.2. The penalty method. Let us focus on the model for Tucker decomposition with unspecified size of the core in the maximization form (NT_max). Recall that the separable structure of (G) is required to implement the MBI method. Therefore, we move the nonseparable constraint Σ_{i=1}^d Σ_{j=1}^{m_i} y^(i)_j = c into the objective function, i.e., we define the penalty function

p(λ, A^(1), ..., A^(d), Y^(1), ..., Y^(d)) := ‖F ×_1 (A^(1) Y^(1))^T ×_2 (A^(2) Y^(2))^T ⋯ ×_d (A^(d) Y^(d))^T‖² − λ ( Σ_{i=1}^d Σ_{j=1}^{m_i} y^(i)_j − c )²,

where λ > 0 is a penalty parameter. The penalty model for (NT_max) is then

(PT)  max p(λ, A^(1), ..., A^(d), Y^(1), ..., Y^(d))
      s.t. A^(i) ∈ R^{n_i × m_i}, (A^(i))^T A^(i) = I, i = 1, 2, ..., d,
           y^(i) ∈ {0, 1}^{m_i}, i = 1, 2, ..., d,
           Σ_{j=1}^{m_i} y^(i)_j ≥ 1, i = 1, 2, ..., d.

We are ready to apply the MBI method to solve (PT), since the block constraints are now separable.
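For concreteness, the penalty function p can be evaluated as in the following NumPy sketch (variable names are ours; F, A^(i), y^(i), λ, and c are as in (PT)):

```python
import numpy as np

def penalty_value(F, A_list, y_list, lam, c):
    """Evaluate p(lambda, A's, Y's): the squared norm of the projected
    tensor minus lam * (total number of selected columns - c)^2."""
    X = F
    for k, (A, y) in enumerate(zip(A_list, y_list)):
        M = (A * y).T                       # (A Y)^T with Y = diag(y)
        X = np.moveaxis(np.tensordot(M, X, axes=(1, k)), 0, k)
    total = sum(int(y.sum()) for y in y_list)
    return np.linalg.norm(X) ** 2 - lam * (total - c) ** 2

# With orthonormal factors, all columns selected, and the budget met,
# the penalty term vanishes and the value is ||F||^2.
F = np.eye(2)
val = penalty_value(F, [np.eye(2), np.eye(2)], [np.ones(2), np.ones(2)],
                    lam=5.0, c=4)
```

When the budget Σ_i Σ_j y^(i)_j = c is met exactly, the penalty term vanishes and p reduces to the (NT_max) objective squared, which is the standard penalty-method mechanism.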
Before presenting the formal algorithm, let us first discuss the subproblems arising in the MBI method, which have to be solved globally as a requirement to guarantee convergence. Without loss of generality, we wish to optimize (A^(1), Y^(1)) while all the other block variables (A^(i), Y^(i)) for i = 2, 3, ..., d are fixed, i.e.,

(PT_1)  max ‖(A^(1) Y^(1))^T W^(1)‖² − λ ( Σ_{j=1}^{m_1} y^(1)_j + c_1 )²
        s.t. A^(1) ∈ R^{n_1 × m_1}, (A^(1))^T A^(1) = I,
             y^(1) ∈ {0, 1}^{m_1}, Σ_{j=1}^{m_1} y^(1)_j ≥ 1,

where W^(1) := F_(1) (A^(d) Y^(d) ⊗ ⋯ ⊗ A^(2) Y^(2)) and c_1 := Σ_{i=2}^d Σ_{j=1}^{m_i} y^(i)_j − c. This subproblem can indeed be solved easily. First, as the optimization of A^(1) is irrelevant to the penalty term, the optimal A^(1) consists of the m_1 leading left singular
vectors of the matrix W^(1), obtained by the SVD. Next, we search for the optimal Y^(1) given the optimal A^(1). Denote by v_j the j-th row vector of the matrix (A^(1))^T W^(1) for j = 1, 2, ..., m_1; then we have

‖(A^(1) Y^(1))^T W^(1)‖² = Σ_{j=1}^{m_1} y^(1)_j ‖v_j‖².

Therefore, the optimization of Y^(1) becomes the following problem:

max  −λ ( Σ_{j=1}^{m_1} y^(1)_j )² + Σ_{j=1}^{m_1} ( ‖v_j‖² − 2λ c_1 ) y^(1)_j − λ c_1²
s.t. y^(1) ∈ {0, 1}^{m_1}, Σ_{j=1}^{m_1} y^(1)_j ≥ 1.

Although the above model appears to be combinatorial, it is solvable in polynomial time. This is because the possible values of Σ_{j=1}^{m_1} y^(1)_j are {1, 2, ..., m_1}, and for any fixed value of Σ_{j=1}^{m_1} y^(1)_j the optimal allocation of y^(1) can be made greedily, as the objective function is then linear. Thus we only need to try m_1 different values of Σ_{j=1}^{m_1} y^(1)_j and pick the best solution.

From the above discussion, we know that the optimal Y^(1) automatically satisfies the form (4.1). This is because A^(1) consists of the m_1 leading left singular vectors of the matrix W^(1), so the ‖v_j‖ are in non-increasing order. Therefore we can update A^(1) and Y^(1) simultaneously when solving the subproblem of the MBI method for a given penalty parameter λ. To summarize, the whole procedure for solving (NT_max) using the MBI method and the penalty function method (cf. [8, 38]) is as follows.

Algorithm 2. The MBI method for the penalty model
Input: tensor F ∈ R^{n_1 × n_2 × ⋯ × n_d} and integer c ≥ d.
Output: core tensor C ∈ R^{m_1 × m_2 × ⋯ × m_d} and matrices A^(i) ∈ R^{n_i × m_i} for i = 1, 2, ..., d.
Step 0: Choose penalty parameters λ_0 > 0, σ > 1 and an initial feasible solution (A^(1)_0, ..., A^(d)_0, Y^(1)_0, ..., Y^(d)_0), and compute the initial objective value v_0 := p(λ_0, A^(1)_0, ..., A^(d)_0, Y^(1)_0, ..., Y^(d)_0). Set k := 0, l := 0 and λ := λ_0.
Step 1: For each i = 1, 2, ..., d, solve (PT_i) and get its optimal solution (B^(i)_{k+1}, Z^(i)_{k+1}) with optimal value w^i_{k+1}, where (PT_i) is defined analogously to (PT_1), with block 1 replaced by block i.
Step 2: Let v_{k+1} := max_{1≤i≤d} w^i_{k+1}, choose one i* = arg max_{1≤i≤d} w^i_{k+1}, and denote

(A^(i)_{k+1}, Y^(i)_{k+1}) := (A^(i)_k, Y^(i)_k) for i ∈ {1, 2, ..., d} \ {i*};  (A^(i*)_{k+1}, Y^(i*)_{k+1}) := (B^(i*)_{k+1}, Z^(i*)_{k+1}).

Step 3: If v_{k+1} − v_k ≤ ε, go to Step 4; otherwise, set k := k + 1 and go to Step 1.
Step 4: If λ_l ( Σ_{i=1}^d Σ_{j=1}^{m_i} (y^(i)_{k+1})_j − c )² ≤ ε_0, stop, and output the core tensor

C := F ×_1 (A^(1)_{k+1} Y^(1)_{k+1})^T ×_2 (A^(2)_{k+1} Y^(2)_{k+1})^T ⋯ ×_d (A^(d)_{k+1} Y^(d)_{k+1})^T

and the matrices A^(i) := A^(i)_{k+1} Y^(i)_{k+1} for i = 1, 2, ..., d; otherwise, set l := l + 1, λ := λ_0 σ^l and k := 0, and go to Step 1.
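The binary subproblem for y^(1) discussed before Algorithm 2 can be solved exactly by the scan-and-greedy procedure. A minimal sketch, assuming λ and c_1 are as defined in (PT_1) and row_norms_sq holds the squared row norms ‖v_j‖² of (A^(1))^T W^(1) (the function name is ours):

```python
import numpy as np

def solve_y_block(row_norms_sq, lam, c1):
    """Exactly solve the binary subproblem for y^(1): for each support size
    s, picking the s largest ||v_j||^2 is optimal (the objective is linear
    once sum(y) = s is fixed); then scan all s = 1, ..., m_1."""
    order = np.argsort(row_norms_sq)[::-1]      # non-increasing ||v_j||^2
    prefix = np.cumsum(row_norms_sq[order])
    best_s, best_val = 1, -np.inf
    for s in range(1, len(row_norms_sq) + 1):
        # equals sum of top-s ||v_j||^2 minus lam*(s + c1)^2, which matches
        # the expanded objective in (PT_1) term by term
        val = prefix[s - 1] - lam * (s + c1) ** 2
        if val > best_val:
            best_s, best_val = s, val
    y = np.zeros(len(row_norms_sq), dtype=int)
    y[order[:best_s]] = 1
    return y, best_val

# Toy instance: ||v_j||^2 = (10, 5, 1), lam = 1, c1 = -2
y, val = solve_y_block(np.array([10.0, 5.0, 1.0]), lam=1.0, c1=-2.0)
```

Since the ‖v_j‖ are already non-increasing when A^(1) comes from the SVD of W^(1), the returned support is a leading block, i.e., the optimal Y^(1) has the form (4.1).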
We remark that when Algorithm 2 stops, we can shrink the size of the core tensor to r_1 × r_2 × ⋯ × r_d by deleting the tails of zero entries in all modes, where r_i is the number of nonzero entries of y^(i). As an MBI method, Algorithm 2 is guaranteed to converge to a stationary point of (PT), as claimed in [14].

4.3. A heuristic method. To further save computational effort, in this subsection we present a heuristic approach to the Tucker decomposition model with unspecified core size, i.e., (NT_max). We know that if each i-rank (i = 1, 2, ..., d) of the core tensor C is given, then the problem becomes the traditional Tucker decomposition discussed in Section 3, which can be solved by the MBI method (Algorithm 1). From the discussion in Section 4, we notice that the key issue of the model is to allocate the constant number c among the i-ranks of the core tensor. Therefore, the optimal value of (T_max) can be considered as a function of (r_1, r_2, ..., r_d), denoted by g(r_1, r_2, ..., r_d). Our heuristic approach tries to find a distribution of the constant c by adapting the idea of the MBI method. Specifically, we may start with low i-ranks, e.g., r^0_1 = r^0_2 = ⋯ = r^0_d = 1, and compute the Tucker decomposition using Algorithm 1; then we increase one of the i-ranks (1 ≤ i ≤ d) by one, choosing i to give the best Tucker decomposition among the d possible increments of the i-ranks, i.e.,

(r^{k+1}_1, r^{k+1}_2, ..., r^{k+1}_d) = arg max_{1≤i≤d} g(r^k_1, r^k_2, ..., r^k_{i−1}, r^k_i + 1, r^k_{i+1}, ..., r^k_d).

This procedure continues until Σ_{i=1}^d r^k_i = c for some k (= c − d). If the function g(r_1, r_2, ..., r_d) can be globally solved for any given (r_1, r_2, ..., r_d) (though this is NP-hard in general), then the above approach is a dynamic program, and the optimality of (NT_max) is guaranteed when the method stops.
Though in general Algorithm 1 can only find a stationary solution for g(r_1, r_2, ..., r_d), this heuristic approach is in fact very effective; see the numerical tests in Section 5. Apart from the rank increasing approach, a rank decreasing strategy is an alternative, i.e., starting from large values of r_i and decreasing them until Σ_{i=1}^d r_i = c. We conclude this subsection by presenting the heuristic algorithm for the rank decreasing method below, which is actually very easy to implement.

Algorithm 3. The rank decreasing method
Input: tensor F ∈ R^{n_1 × n_2 × ⋯ × n_d} and integer c ≥ d.
Output: core tensor C ∈ R^{r_1 × r_2 × ⋯ × r_d} and matrices A^(i) ∈ R^{n_i × r_i} for i = 1, 2, ..., d.
Step 0: Choose initial ranks (r_1, r_2, ..., r_d) with Σ_{i=1}^d r_i > c.
Step 1: Apply Algorithm 1 to solve (T_max) with input (r_1, r_2, ..., r_d), and output the matrices A^(i) for i = 1, 2, ..., d.
Step 2: For each i = 1, 2, ..., d, let B^(i) ∈ R^{n_i × (r_i − 1)} be the matrix obtained by deleting the r_i-th column of A^(i), and compute

i* := arg max_{1≤i≤d} ‖F ×_1 (A^(1))^T ⋯ ×_{i−1} (A^(i−1))^T ×_i (B^(i))^T ×_{i+1} (A^(i+1))^T ⋯ ×_d (A^(d))^T‖.

Update r_{i*} := r_{i*} − 1 and A^(i*) := B^(i*). Repeat if necessary until Σ_{i=1}^d r_i = c.
Step 3: Apply Algorithm 1 to solve (T_max) with input (r_1, r_2, ..., r_d), and output the core tensor C and the matrices A^(i) for i = 1, 2, ..., d.
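One pass of the column-deletion step (Step 2 of Algorithm 3) can be sketched as follows; this is an illustrative NumPy version under our own naming, not the authors' implementation:

```python
import numpy as np

def multi_mode_product(F, mats):
    """Compute F x_1 mats[0]^T x_2 mats[1]^T ... x_d mats[d-1]^T."""
    X = F
    for k, A in enumerate(mats):
        X = np.moveaxis(np.tensordot(A.T, X, axes=(1, k)), 0, k)
    return X

def rank_decrease_step(F, A):
    """One pass of Step 2 of Algorithm 3: drop the last column of whichever
    factor matrix keeps the objective value largest."""
    d = len(A)
    best_w, i_star = -np.inf, None
    for i in range(d):
        trial = [A[j] if j != i else A[i][:, :-1] for j in range(d)]
        w = np.linalg.norm(multi_mode_product(F, trial))
        if w > best_w:
            best_w, i_star = w, i
    return [A[j] if j != i_star else A[j][:, :-1] for j in range(d)], i_star

rng = np.random.default_rng(0)
F = rng.standard_normal((4, 4, 4))
A = [np.linalg.qr(rng.standard_normal((4, 3)))[0] for _ in range(3)]
A, i_star = rank_decrease_step(F, A)
```

Each call reduces the total rank budget Σ_i r_i by exactly one, so the step is repeated until the budget c is reached, after which Algorithm 1 is run once more for the final decomposition.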
5. Numerical experiments. In this section, we apply the algorithms proposed in the previous sections to solve Tucker decomposition without specification of the core size. Four different types of data are tested to demonstrate the effectiveness and efficiency of our algorithms. All the numerical computations are conducted on an Intel Xeon 3.40GHz computer with 8GB RAM, with MATLAB (R2011a) as the supporting platform. We use the MATLAB Tensor Toolbox Version 2.5 [7] whenever tensor operations are called, and also apply its embedded algorithm (the ALS method) to solve Tucker decomposition with given rank-(r_1, r_2, ..., r_d). The termination precision for the ALS method is set to be .

In implementing Algorithm 2, the penalty increment parameter is set as σ := 2. The starting matrices A^(i)_0 are randomly generated and then made orthonormal. The starting diagonal matrices Y^(i)_0 are all set as Y^(i)_0 := diag(1, 0, 0, ..., 0) for i = 1, 2, ..., d. The termination precision for Algorithm 2 is also set to be . For Algorithm 3, the initial ranks are set as r_i := min(n_i, c) for i = 1, 2, ..., d. When applying Algorithm 1 in Step 1, the starting matrices A^(i)_0 for solving (T_max) are randomly generated, and the termination precision is set to be . When applying Algorithm 1 in Step 3, the starting matrices A^(i)_0 for solving (T_max) are obtained from Step 2, and the termination precision is set to be .

For a given data tensor F in the following tests and a rank-(r_1, r_2, ..., r_d) approximation tensor F̂ computed by the algorithms, the relative error of the approximation is defined as ‖F − F̂‖ / ‖F‖. To measure how close the approximation is to the original tensor F, the fit is defined as

fit := 1 − ‖F − F̂‖ / ‖F‖.

The larger the fit, the better the performance.

5.1. Noisy tensor decomposition. First we present some preliminary test results on two synthetic data sets.
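The fit measure defined above is straightforward to compute; a trivial sketch:

```python
import numpy as np

def fit(F, F_hat):
    """fit := 1 - ||F - F_hat|| / ||F||; fit = 1 means an exact decomposition."""
    return 1.0 - np.linalg.norm(F - F_hat) / np.linalg.norm(F)

F = np.ones((2, 3, 4))
assert fit(F, F) == 1.0                  # exact approximation
assert fit(F, np.zeros_like(F)) == 0.0   # trivial zero approximation
```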
We use the Tensor Toolbox to randomly generate a noise-free tensor Y of rank-(4, 4, 2), where the entries of the core tensor follow standard normal distributions and the entries of the orthogonal factor matrices follow uniform distributions. A noise tensor N of the same size is randomly generated, whose entries follow standard normal distributions. The noisy tensor Z to be tested is then constructed as

Z = Y + η (‖Y‖ / ‖N‖) N,

where η is the noise parameter or, more formally, the perturbed ratio. Our task is to find the real i-ranks of the core tensor from the corrupted tensor Z. In this set of tests, the initial penalty parameter for Algorithm 2 is set as λ_0 = ‖Z‖, and the summation of the ranks is set as c = 10. The results of this experiment are summarized in Figure 5.1, showing the fit and the computational time of Algorithms 2 and 3 for different perturbed ratios. For comparison, we also call the ALS method to solve the traditional Tucker decomposition with given (r_1, r_2, r_3), randomly generated so that r_1 + r_2 + r_3 = 10. Also, 10 random rank samples are generated for the ALS method, and its maximum, average, and minimum fits are presented on the left-hand side of Figure 5.1, corresponding to the three pentagon dots from top to bottom in each line segment, respectively. The computational time reported for the ALS method is the total over these 10 rank samples. We observe that Algorithms 2 and 3 can easily and quickly find the exact i-ranks (4, 4, 2) of the core tensor for different perturbed ratios, and can still approximate the original tensor Y accurately even under large noise. Figure 5.1 shows that Algorithms 2 and 3 outperform the ALS method significantly, both in fit and in computational time.

Fig. 5.1. Approximating a noisy tensor with original rank-(4, 4, 2).

Figure 5.2 shows another synthetic experiment whose data are generated in the same way as those in Figure 5.1. In this set of data, the i-ranks of Y are (5, 5, 4). The summation of the ranks in testing Algorithms 2 and 3 is set as c = 14. Figure 5.2 again shows that our algorithms perform quite well against interference and are far superior to the ALS method. Furthermore, both Algorithms 2 and 3 find the exact i-ranks (5, 5, 4) of the core tensor for various perturbed ratios.

Fig. 5.2. Approximating a noisy tensor with original rank-(5, 5, 4).

5.2. Amino acid fluorescence data. This data set was originally generated and measured by Claus Andersson and was later published and tested by Bro [9, 10].
It consists of five laboratory-made samples. Each sample contains different amounts of tyrosine, tryptophan, and phenylalanine dissolved in phosphate buffered water. The three modes of the array A to be decomposed correspond to samples, emission wavelength (nm), and excitation wavelength (nm), respectively. Ideally the array should be describable with three PARAFAC components, for which the fit obtained by the Tensor Toolbox is 97.44%. This data set can also be approximated well in the sense of Tucker decomposition. In implementing Algorithm 2, the initial penalty parameter is set as λ_0 = 0.01 ‖A‖.

The numerical results for the three methods (ALS, Algorithms 2 and 3) are presented in Figure 5.3. Each pentagon dot in Figure 5.3 denotes the average fit over 10 runs of the ALS method with randomly generated rank-(r_1, r_2, r_3) satisfying r_1 + r_2 + r_3 = c, and the CPU time is the total over the 10 random samples. The performances of Algorithms 2 and 3 are much better than that of the ALS method, especially for Algorithm 2, which obtains a convincing fit even for low c. Furthermore, Algorithm 2 decomposes this data set as well as PARAFAC does without knowing the exact i-ranks; e.g., the fit reaches 97.55% when c = 9, for which rank-(3, 3, 3) is found at termination.

Fig. 5.3. Approximating the amino acid fluorescence data for different c = r_1 + r_2 + r_3.

To further justify the importance of Tucker decomposition with unspecified core size, as well as of Algorithm 2, we also run the ALS method on this data set with all possible pre-specified i-ranks. For the given summation of i-ranks c = 9, there are in total 25 possible combinations of (r_1, r_2, r_3), and their corresponding fits are listed in Figure 5.4. It shows that the Tucker decomposition of rank-(3, 3, 3) outperforms all other combinations, which is exactly the set of i-ranks found by Algorithm 2.

5.3. Gene expression data.
We further test a real three-way tensor F ∈ R^(2395×6×9), which comes from 3D Arabidopsis gene expression data (we thank Professor Xiuzhen Huang for providing this set of data). Essentially this is a set of gene expression data with 2395 genes, measured at 6 time points and under 9 different conditions. We again apply the three methods to this data set. The initial penalty parameter for Algorithm 2 is set as λ0 = 0.01‖F‖. The results of our experiments are summarized in Table 5.1. For the fit in the last three columns obtained by the ALS method, Fit-2 and Fit-3 denote the fit of the ALS method run with the ranks obtained from
Fig. 5.4. The ALS method on the amino acid fluorescence data for c = 9.

Algorithms 2 and 3, respectively, while the fit in the last column is the average fit over 10 runs of the ALS method with randomly generated ranks satisfying r1 + r2 + r3 = c.

c  | Algorithm 2: (r1, r2, r3), CPU, Fit(%) | Algorithm 3: (r1, r2, r3), CPU, Fit(%) | ALS: Fit-2(%), Fit-3(%), Fit(%)
3  | (1, 1, 1)   | (1, 1, 1)  |
5  | (2, 1, 2)   | (2, 1, 2)  |
10 | (4, 2, 4)   | (1, 3, 6)  |
15 | (7, 3, 5)   | (5, 6, 4)  |
20 | (10, 4, 6)  | (5, 6, 9)  |
Table 5.1. Approximating the gene expression data.

The following observations are clearly visible:
- The excellent performances of Algorithms 2 and 3 are validated.
- Computing the i-rank information is important, as the ranks output by Algorithms 2 and 3 are more suitable for the ALS method, in terms of fit, than randomly generated ranks.
- In combination with the ALS method, Algorithm 2 works better than Algorithm 3 in terms of fit, while its CPU time is more costly than that of Algorithm 3.

Tensor compression of image data. Our final tests apply the models and algorithms in this paper to compress tensors arising from image data. Two sets of faces are presented, each in its own subsection.

The ORL database of faces. The first set of images is from the ORL database of faces [36] of AT&T Laboratories Cambridge. In this database there are 10 different images for each of the 40 distinct subjects. Here we draw one single distinct subject and construct a third-order tensor T from its 10 images. When applying Algorithm 2, the initial penalty parameter λ0 is set proportional to ‖T‖. Figure 5.5 displays the images compressed and recovered by the three methods
when c = 50 and 70, respectively. The detailed numerical values for the two cases are listed in Table 5.2. Some standard abbreviations from image science are adopted, namely CP for compression, CR for compression ratio, and RMSE for root mean squared error. For the ALS method, the CPU time is computed as the summation over 10 randomly generated i-ranks satisfying r1 + r2 + r3 = c, and its best and worst compressions in terms of fit are reported for comparison.

c  | Method      | (r1, r2, r3) | CR | CP(%) | RMSE | Fit(%) | CPU
50 | Algorithm 3 | (25, 22, 3)  |    |       |      |        |
50 | ALS (best)  | (16, 25, 9)  |    |       |      |        |
50 | ALS (worst) | (43, 6, 1)   |    |       |      |        |
50 | Algorithm 2 | (21, 19, 10) |    |       |      |        |
70 | Algorithm 3 | (35, 31, 4)  |    |       |      |        |
70 | ALS (best)  | (34, 29, 7)  |    |       |      |        |
70 | ALS (worst) | (11, 58, 1)  |    |       |      |        |
70 | Algorithm 2 | (30, 30, 10) |    |       |      |        |
Table 5.2. Tensor compression for the ORL database of faces.

Fig. 5.5. Recovered images for the ORL database of faces (rows from top to bottom: 1. Original; 2. Algorithm 3; 3. ALS (best); 4. ALS (worst); 5. Algorithm 2).

Observation of Figure 5.5 indicates that only the compressed images produced by the best ALS run and by Algorithm 2 keep all the facial expressions. Moreover, Algorithm 2 indeed outperforms the best ALS run in terms of fit and RMSE, as shown in Table 5.2. It is worth mentioning that most of the computational effort of Algorithm 2 goes into finding a suitable combination of ranks (r1, r2, r3), whereas this choice is pre-specified for the ALS method. This effort in selecting better ranks is worthwhile: the ALS method for a large c may perform worse than Algorithm 2 for a smaller c; e.g., the 4th row on the right of Figure 5.5
is not clearer than the 2nd row on the left of Figure 5.5, which is also confirmed by Table 5.2. Figure 5.6 presents the performance of Algorithm 2 for various c. Clearly, the larger c is, the larger the fit and the lower the RMSE. This figure also suggests a guideline for choosing a suitable c, depending on the quality of the compressed images required by the user.

Fig. 5.6. Performance of Algorithm 2 for the ORL database of faces.

The Japanese female facial expression (JAFFE) database. Our last tests are on the facial database [34] from the Psychology Department at Kyushu University, which contains 213 images of 7 different emotional facial expressions (sadness, happiness, surprise, anger, disgust, fear and neutral) posed by 10 Japanese female models. Here we draw the 7 different facial expressions of one female model and construct a third-order tensor T. A similar set of tests as for the ORL database is conducted. The results are presented in Figures 5.7 and 5.8, Table 5.3, and Figure 5.9, which again confirm the observations made for the ORL database of faces. In particular, comparing the best two sets of images (the best ALS run and Algorithm 2) in Figures 5.7 and 5.8, Algorithm 2 is clearly better in details, e.g. the eyebrows.

c   | Method      | (r1, r2, r3) | CR | CP(%) | RMSE | Fit(%) | CPU
80  | Algorithm 3 | (40, 39, 1)  |    |       |      |        |
80  | ALS (best)  | (39, 38, 3)  |    |       |      |        |
80  | ALS (worst) | (75, 2, 3)   |    |       |      |        |
80  | Algorithm 2 | (39, 34, 7)  |    |       |      |        |
110 | Algorithm 3 | (53, 55, 2)  |    |       |      |        |
110 | ALS (best)  | (54, 52, 4)  |    |       |      |        |
110 | ALS (worst) | (1, 105, 4)  |    |       |      |        |
110 | Algorithm 2 | (53, 50, 7)  |    |       |      |        |
Table 5.3. Tensor compression for the JAFFE database.
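The storage quantities behind Tables 5.2 and 5.3 can be made concrete. The sketch below assumes the common definitions, with CR as original storage divided by Tucker storage (core plus one factor matrix per mode) and RMSE taken over all entries, since the paper's exact formulas are not reproduced in this excerpt; the image dimensions in the example are hypothetical, while the rank triple (25, 22, 3) is taken from Table 5.2.

```python
import numpy as np

def tucker_storage(dims, ranks):
    """Number of values stored by a rank-(r1, r2, r3) Tucker model:
    the core tensor plus one factor matrix per mode."""
    return int(np.prod(ranks)) + sum(d * r for d, r in zip(dims, ranks))

def compression_ratio(dims, ranks):
    """CR as original storage over compressed storage (assumed definition)."""
    return int(np.prod(dims)) / tucker_storage(dims, ranks)

def rmse(T, T_hat):
    """Root mean squared error over all tensor entries (assumed definition)."""
    return float(np.sqrt(np.mean((np.asarray(T) - np.asarray(T_hat)) ** 2)))

# Hypothetical example: a 100 x 100 x 10 image stack at rank-(25, 22, 3)
cr = compression_ratio((100, 100, 10), (25, 22, 3))
```

Under these definitions a larger c allows larger ranks, hence a lower CR but a higher fit and lower RMSE, which is the trade-off illustrated by Figures 5.6 and 5.9.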
Fig. 5.7. Recovered images for the JAFFE database when c = 80 (rows from top to bottom: 1. Original; 2. Algorithm 3; 3. ALS (best); 4. ALS (worst); 5. Algorithm 2).

6. Conclusion. In this paper we study the problem of finding a proper Tucker decomposition of a general tensor without a pre-specified size of the core tensor, which addresses an important practical issue in real applications. The size of the core tensor is assumed to be given in almost all existing methods for tensor Tucker decomposition. The approach we propose is based on the so-called maximum block improvement (MBI) method. Our numerical experiments on a variety of instances taken from real applications suggest that the proposed methods perform robustly and effectively.

REFERENCES

[1] A. H. Andersen and W. S. Rayens, Structure-seeking multilinear methods for the analysis of fMRI data, NeuroImage, 22 (2004).
[2] C. M. Andersen and R. Bro, Practical aspects of PARAFAC modeling of fluorescence excitation-emission data, Journal of Chemometrics, 17 (2003).
[3] C. A. Andersson and R. Bro, Improving the speed of multi-way algorithms: Part I. Tucker3, Chemometrics and Intelligent Laboratory Systems, 42 (1998).
[4] C. J. Appellof and E. R. Davidson, Strategies for analyzing data from video fluorometric monitoring of liquid chromatographic effluents, Analytical Chemistry, 53 (1981).
[5] A. Aubry, A. De Maio, B. Jiang, and S. Zhang, A cognitive approach for ambiguity function shaping, Proceedings of the 7th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2012.
[6] B. W. Bader, M. W. Berry, and M. Browne, Discussion tracking in Enron email using PARAFAC, in M. W. Berry and M. Castellanos (eds.), Survey of Text Mining: Clustering, Classification, and Retrieval (2nd edition), Springer, New York, 2007.
[7] B. W. Bader and T. G. Kolda, MATLAB Tensor Toolbox, Version 2.5, tgkolda/TensorToolbox
Fig. 5.8. Recovered images for the JAFFE database when c = 110 (rows from top to bottom: 1. Original; 2. Algorithm 3; 3. ALS (best); 4. ALS (worst); 5. Algorithm 2).

Fig. 5.9. Performance of Algorithm 2 for the JAFFE database.

[8] D. P. Bertsekas, Nonlinear Programming (2nd edition), Athena Scientific, Belmont, MA.
[9] R. Bro, PARAFAC: Tutorial and applications, Chemometrics and Intelligent Laboratory Systems, 38 (1997).
[10] R. Bro, Multi-way Analysis in the Food Industry: Models, Algorithms, and Applications, Ph.D. Thesis, University of Amsterdam, The Netherlands, and Royal Veterinary and Agricultural University, Denmark.
[11] R. Bro and H. A. L. Kiers, A new efficient method for determining the number of components in PARAFAC models, Journal of Chemometrics, 17 (2003).
[12] J. D. Carroll and J. J. Chang, Analysis of individual differences in multidimensional scaling via an N-way generalization of Eckart-Young decomposition, Psychometrika, 35 (1970).
[13] B. Chen, Optimization with Block Variables: Theory and Applications, Ph.D. Thesis, The Chinese University of Hong Kong, Hong Kong.
[14] B. Chen, S. He, Z. Li, and S. Zhang, Maximum block improvement and polynomial optimization, SIAM Journal on Optimization, 22 (2012).
[15] L. De Lathauwer, B. De Moor, and J. Vandewalle, A multilinear singular value decomposition, SIAM Journal on Matrix Analysis and Applications, 21 (2000).
[16] L. De Lathauwer, B. De Moor, and J. Vandewalle, On the best rank-1 and rank-(R1, R2, ..., RN) approximation of higher-order tensors, SIAM Journal on Matrix Analysis and Applications, 21 (2000).
[17] L. De Lathauwer and D. Nion, Decompositions of a higher-order tensor in block terms, Part III: Alternating least squares algorithms, SIAM Journal on Matrix Analysis and Applications, 30 (2008).
[18] L. De Lathauwer and J. Vandewalle, Dimensionality reduction in higher-order signal processing and rank-(R1, R2, ..., RN) reduction in multilinear algebra, Linear Algebra and its Applications, 391 (2004).
[19] L. Eldén and B. Savas, A Newton-Grassmann method for computing the best multilinear rank-(r1, r2, r3) approximation of a tensor, SIAM Journal on Matrix Analysis and Applications, 31 (2009).
[20] R. A. Harshman, Foundations of the PARAFAC procedure: Models and conditions for an explanatory multi-modal factor analysis, UCLA Working Papers in Phonetics, 16 (1970). Also available online at harshman/wpppfac0.pdf
[21] Z. He, A. Cichocki, and S. Xie, Efficient method for Tucker3 model selection, Electronics Letters, 45 (2009).
[22] R. Henrion, Simultaneous simplification of loading and core matrices in N-way PCA: Application to chemometric data arrays, Fresenius Journal of Analytical Chemistry, 361 (1998).
[23] R. Henrion, On global, local and stationary solutions in three-way data analysis, Journal of Chemometrics, 14 (2000).
[24] M. Ishteva, L. De Lathauwer, P. A. Absil, and S. Van Huffel, Differential-geometric Newton method for the best rank-(R1, R2, R3) approximation of tensors, Numerical Algorithms, 51 (2009).
[25] A. Kapteyn, H. Neudecker, and T. Wansbeek, An approach to n-mode components analysis, Psychometrika, 51 (1986).
[26] H. A. L. Kiers and A. Der Kinderen, A fast method for choosing the numbers of components in Tucker3 analysis, British Journal of Mathematical and Statistical Psychology, 56 (2003).
[27] E. Kofidis and P. A. Regalia, On the best rank-1 approximation of higher order supersymmetric tensors, SIAM Journal on Matrix Analysis and Applications, 23 (2002).
[28] T. G. Kolda, Multilinear operators for higher-order decompositions, Technical Report, Sandia National Laboratories, Albuquerque, NM.
[29] T. G. Kolda and B. W. Bader, Tensor decompositions and applications, SIAM Review, 51 (2009).
[30] P. M. Kroonenberg and J. De Leeuw, Principal component analysis of three-mode data by means of alternating least squares algorithms, Psychometrika, 45 (1980).
[31] D. Leibovici and R. Sabatier, A singular value decomposition of a k-way array for a principal component analysis of multiway data, PTA-k, Linear Algebra and its Applications, 269 (1998).
[32] J. Levin, Three-mode Factor Analysis, Ph.D. Thesis, University of Illinois, Urbana, IL.
[33] Z. Li, A. Uschmajew, and S. Zhang, Linear convergence analysis of the maximum block improvement method for spherically constrained optimization, Technical Report.
[34] M. J. Lyons, S. Akamatsu, M. Kamachi, and J. Gyoba, Coding facial expressions with Gabor wavelets, Proceedings of the 3rd IEEE International Conference on Automatic Face and Gesture Recognition, 1998.
[35] L. Qi, The best rank-one approximation ratio of a tensor space, SIAM Journal on Matrix Analysis and Applications, 32 (2011).
[36] F. Samaria and A. Harter, Parameterisation of a stochastic model for human face identification, Proceedings of the 2nd IEEE Workshop on Applications of Computer Vision, 1994.
[37] B. Savas and L. Eldén, Handwritten digit classification using higher order singular value decomposition, Pattern Recognition, 40 (2007).
[38] W. Sun and Y. Yuan, Optimization Theory and Methods: Nonlinear Programming, Springer Optimization and Its Applications, Volume 1, Springer, New York.
[39] M. E. Timmerman and H. A. L. Kiers, Three-mode principal components analysis: Choosing the numbers of components and sensitivity to local optima, British Journal of Mathematical and Statistical Psychology, 53 (2000).
[40] R. Tomioka, T. Suzuki, K. Hayashi, and H. Kashima, Statistical performance of convex tensor decomposition, Proceedings of the 25th Annual Conference on Neural Information Processing Systems, 2011.
[41] L. R. Tucker, Implications of factor analysis of three-way matrices for measurement of change, in C. W. Harris (ed.), Problems in Measuring Change, University of Wisconsin Press, 1963.
[42] L. R. Tucker, The extension of factor analysis to three-dimensional matrices, in H. Gulliksen and N. Frederiksen (eds.), Contributions to Mathematical Psychology, Holt, Rinehart & Winston, New York, NY.
[43] L. R. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika, 31 (1966).
[44] A. Uschmajew, Local convergence of the alternating least squares algorithm for canonical tensor approximation, SIAM Journal on Matrix Analysis and Applications, 33 (2012).
[45] M. A. O. Vasilescu and D. Terzopoulos, Multilinear analysis of image ensembles: TensorFaces, Proceedings of the 7th European Conference on Computer Vision, Lecture Notes in Computer Science, 2350 (2002), Springer, New York.
[46] H. Wang, Q. Wu, L. Shi, Y. Yu, and N. Ahuja, Out-of-core tensor approximation of multi-dimensional matrices of visual data, ACM Transactions on Graphics (Special Issue for SIGGRAPH 2005), 24 (2005).
[47] Y. Wang and L. Qi, On the successive supersymmetric rank-1 decomposition of higher-order supersymmetric tensors, Numerical Linear Algebra with Applications, 14 (2007).
[48] S. Zhang, K. Wang, C. Ashby, B. Chen, and X. Huang, A unified adaptive co-identification framework for high-D expression data, in T. Shibuya et al. (eds.), Proceedings of the 7th IAPR International Conference on Pattern Recognition in Bioinformatics, Lecture Notes in Computer Science, 7632 (2012), Springer, New York.
[49] S. Zhang, K. Wang, B. Chen, and X. Huang, A new framework for co-clustering of gene expression data, in M. Loog et al. (eds.), Proceedings of the 6th IAPR International Conference on Pattern Recognition in Bioinformatics, Lecture Notes in Computer Science, 7036 (2011), Springer, New York.
[50] T. Zhang and G. H. Golub, Rank-one approximation to high order tensors, SIAM Journal on Matrix Analysis and Applications, 23 (2001).
More informationLecture 2. Tensor Unfoldings. Charles F. Van Loan
From Matrix to Tensor: The Transition to Numerical Multilinear Algebra Lecture 2. Tensor Unfoldings Charles F. Van Loan Cornell University The Gene Golub SIAM Summer School 2010 Selva di Fasano, Brindisi,
More informationWafer Pattern Recognition Using Tucker Decomposition
Wafer Pattern Recognition Using Tucker Decomposition Ahmed Wahba, Li-C. Wang, Zheng Zhang UC Santa Barbara Nik Sumikawa NXP Semiconductors Abstract In production test data analytics, it is often that an
More informationTensor Decompositions and Applications
Tamara G. Kolda and Brett W. Bader Part I September 22, 2015 What is tensor? A N-th order tensor is an element of the tensor product of N vector spaces, each of which has its own coordinate system. a =
More informationsparse and low-rank tensor recovery Cubic-Sketching
Sparse and Low-Ran Tensor Recovery via Cubic-Setching Guang Cheng Department of Statistics Purdue University www.science.purdue.edu/bigdata CCAM@Purdue Math Oct. 27, 2017 Joint wor with Botao Hao and Anru
More informationRank Determination for Low-Rank Data Completion
Journal of Machine Learning Research 18 017) 1-9 Submitted 7/17; Revised 8/17; Published 9/17 Rank Determination for Low-Rank Data Completion Morteza Ashraphijuo Columbia University New York, NY 1007,
More informationthe tensor rank is equal tor. The vectorsf (r)
EXTENSION OF THE SEMI-ALGEBRAIC FRAMEWORK FOR APPROXIMATE CP DECOMPOSITIONS VIA NON-SYMMETRIC SIMULTANEOUS MATRIX DIAGONALIZATION Kristina Naskovska, Martin Haardt Ilmenau University of Technology Communications
More informationA practical method for computing the largest M-eigenvalue of a fourth-order partially symmetric tensor
NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. 2007; 00:1 6 [Version: 2002/09/18 v1.02] A practical method for computing the largest M-eigenvalue of a fourth-order partially symmetric
More informationTensor Low-Rank Completion and Invariance of the Tucker Core
Tensor Low-Rank Completion and Invariance of the Tucker Core Shuzhong Zhang Department of Industrial & Systems Engineering University of Minnesota zhangs@umn.edu Joint work with Bo JIANG, Shiqian MA, and
More information4y Springer NONLINEAR INTEGER PROGRAMMING
NONLINEAR INTEGER PROGRAMMING DUAN LI Department of Systems Engineering and Engineering Management The Chinese University of Hong Kong Shatin, N. T. Hong Kong XIAOLING SUN Department of Mathematics Shanghai
More informationTensor Decompositions Workshop Discussion Notes American Institute of Mathematics (AIM) Palo Alto, CA
Tensor Decompositions Workshop Discussion Notes American Institute of Mathematics (AIM) Palo Alto, CA Carla D. Moravitz Martin July 9-23, 24 This work was supported by AIM. Center for Applied Mathematics,
More informationAre There Sixth Order Three Dimensional PNS Hankel Tensors?
Are There Sixth Order Three Dimensional PNS Hankel Tensors? Guoyin Li Liqun Qi Qun Wang November 17, 014 Abstract Are there positive semi-definite PSD) but not sums of squares SOS) Hankel tensors? If the
More informationPrincipal Component Analysis
Machine Learning Michaelmas 2017 James Worrell Principal Component Analysis 1 Introduction 1.1 Goals of PCA Principal components analysis (PCA) is a dimensionality reduction technique that can be used
More informationApproximation algorithms for nonnegative polynomial optimization problems over unit spheres
Front. Math. China 2017, 12(6): 1409 1426 https://doi.org/10.1007/s11464-017-0644-1 Approximation algorithms for nonnegative polynomial optimization problems over unit spheres Xinzhen ZHANG 1, Guanglu
More informationAn Even Order Symmetric B Tensor is Positive Definite
An Even Order Symmetric B Tensor is Positive Definite Liqun Qi, Yisheng Song arxiv:1404.0452v4 [math.sp] 14 May 2014 October 17, 2018 Abstract It is easily checkable if a given tensor is a B tensor, or
More informationUsing Matrix Decompositions in Formal Concept Analysis
Using Matrix Decompositions in Formal Concept Analysis Vaclav Snasel 1, Petr Gajdos 1, Hussam M. Dahwa Abdulla 1, Martin Polovincak 1 1 Dept. of Computer Science, Faculty of Electrical Engineering and
More informationNonnegative Tensor Factorization using a proximal algorithm: application to 3D fluorescence spectroscopy
Nonnegative Tensor Factorization using a proximal algorithm: application to 3D fluorescence spectroscopy Caroline Chaux Joint work with X. Vu, N. Thirion-Moreau and S. Maire (LSIS, Toulon) Aix-Marseille
More informationNON-NEGATIVE SPARSE CODING
NON-NEGATIVE SPARSE CODING Patrik O. Hoyer Neural Networks Research Centre Helsinki University of Technology P.O. Box 9800, FIN-02015 HUT, Finland patrik.hoyer@hut.fi To appear in: Neural Networks for
More informationNon-Negative Tensor Factorization with Applications to Statistics and Computer Vision
Non-Negative Tensor Factorization with Applications to Statistics and Computer Vision Amnon Shashua shashua@cs.huji.ac.il School of Engineering and Computer Science, The Hebrew University of Jerusalem,
More informationOrthogonal tensor decomposition
Orthogonal tensor decomposition Daniel Hsu Columbia University Largely based on 2012 arxiv report Tensor decompositions for learning latent variable models, with Anandkumar, Ge, Kakade, and Telgarsky.
More informationA Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations
A Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations Chuangchuang Sun and Ran Dai Abstract This paper proposes a customized Alternating Direction Method of Multipliers
More informationDELFT UNIVERSITY OF TECHNOLOGY
DELFT UNIVERSITY OF TECHNOLOGY REPORT -09 Computational and Sensitivity Aspects of Eigenvalue-Based Methods for the Large-Scale Trust-Region Subproblem Marielba Rojas, Bjørn H. Fotland, and Trond Steihaug
More informationA Characterization of Sampling Patterns for Union of Low-Rank Subspaces Retrieval Problem
A Characterization of Sampling Patterns for Union of Low-Rank Subspaces Retrieval Problem Morteza Ashraphijuo Columbia University ashraphijuo@ee.columbia.edu Xiaodong Wang Columbia University wangx@ee.columbia.edu
More informationStructured matrix factorizations. Example: Eigenfaces
Structured matrix factorizations Example: Eigenfaces An extremely large variety of interesting and important problems in machine learning can be formulated as: Given a matrix, find a matrix and a matrix
More informationFast Hankel Tensor-Vector Products and Application to Exponential Data Fitting
Fast Hankel Tensor-Vector Products and Application to Exponential Data Fitting Weiyang Ding Liqun Qi Yimin Wei arxiv:1401.6238v1 [math.na] 24 Jan 2014 January 27, 2014 Abstract This paper is contributed
More informationMATRIX COMPLETION AND TENSOR RANK
MATRIX COMPLETION AND TENSOR RANK HARM DERKSEN Abstract. In this paper, we show that the low rank matrix completion problem can be reduced to the problem of finding the rank of a certain tensor. arxiv:1302.2639v2
More informationNew Ranks for Even-Order Tensors and Their Applications in Low-Rank Tensor Optimization
New Ranks for Even-Order Tensors and Their Applications in Low-Rank Tensor Optimization Bo JIANG Shiqian MA Shuzhong ZHANG January 12, 2015 Abstract In this paper, we propose three new tensor decompositions
More informationarxiv: v1 [math.co] 10 Aug 2016
POLYTOPES OF STOCHASTIC TENSORS HAIXIA CHANG 1, VEHBI E. PAKSOY 2 AND FUZHEN ZHANG 2 arxiv:1608.03203v1 [math.co] 10 Aug 2016 Abstract. Considering n n n stochastic tensors (a ijk ) (i.e., nonnegative
More informationMulti-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems
Multi-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems Lars Eldén Department of Mathematics, Linköping University 1 April 2005 ERCIM April 2005 Multi-Linear
More informationLinear Algebra for Machine Learning. Sargur N. Srihari
Linear Algebra for Machine Learning Sargur N. srihari@cedar.buffalo.edu 1 Overview Linear Algebra is based on continuous math rather than discrete math Computer scientists have little experience with it
More informationMultiscale Tensor Decomposition
Multiscale Tensor Decomposition Alp Ozdemir 1, Mark A. Iwen 1,2 and Selin Aviyente 1 1 Department of Electrical and Computer Engineering, Michigan State University 2 Deparment of the Mathematics, Michigan
More informationAlgorithm 862: MATLAB Tensor Classes for Fast Algorithm Prototyping
Algorithm 862: MATLAB Tensor Classes for Fast Algorithm Prototyping BRETT W. BADER and TAMARA G. KOLDA Sandia National Laboratories Tensors (also known as multidimensional arrays or N-way arrays) are used
More informationOutline Introduction: Problem Description Diculties Algebraic Structure: Algebraic Varieties Rank Decient Toeplitz Matrices Constructing Lower Rank St
Structured Lower Rank Approximation by Moody T. Chu (NCSU) joint with Robert E. Funderlic (NCSU) and Robert J. Plemmons (Wake Forest) March 5, 1998 Outline Introduction: Problem Description Diculties Algebraic
More informationRecovering Tensor Data from Incomplete Measurement via Compressive Sampling
Recovering Tensor Data from Incomplete Measurement via Compressive Sampling Jason R. Holloway hollowjr@clarkson.edu Carmeliza Navasca cnavasca@clarkson.edu Department of Electrical Engineering Clarkson
More informationPerformance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization
Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization Tabitha Samuel, Master s Candidate Dr. Michael W. Berry, Major Professor Abstract: Increasingly
More informationNotes on Latent Semantic Analysis
Notes on Latent Semantic Analysis Costas Boulis 1 Introduction One of the most fundamental problems of information retrieval (IR) is to find all documents (and nothing but those) that are semantically
More information