
ON TENSOR TUCKER DECOMPOSITION: THE CASE FOR AN ADJUSTABLE CORE SIZE

BILIAN CHEN, ZHENING LI, AND SHUZHONG ZHANG

Abstract. This paper is concerned with the problem of finding a Tucker decomposition for tensors. Traditionally, solution methods for Tucker decomposition presume that the size of the core tensor is specified in advance, which may not be a realistic assumption in some applications. In this paper we propose a new computational model in which the configuration and the size of the core become part of the decisions to be optimized. Our approach is based on the so-called maximum block improvement (MBI) method for non-convex block optimization. We consider a number of real applications in this paper, showing the promising numerical performance of the proposed algorithms.

Key words. multiway array, Tucker decomposition, low-rank approximation, maximum block improvement

AMS subject classifications. 15A69, 90C26, 90C59, 49M27, 65F99

Department of Automation, Xiamen University, Xiamen, China (blchen@xmu.edu.cn). Department of Mathematics, Shanghai University, Baoshan District, Shanghai, China (zheningli@gmail.com); the research of this author was supported in part by the Natural Science Foundation of Shanghai (Grant 12ZR) and the Ph.D. Programs Foundation of the Chinese Ministry of Education. Department of Industrial and Systems Engineering, University of Minnesota, Minneapolis, MN 55455, USA (zhangs@umn.edu); the research of this author was supported in part by the National Science Foundation of the USA (Grant CMMI).

1. Introduction. The models and solution methods for high order tensor decompositions were initially motivated and developed in psychometrics (see, e.g. [41, 42, 43]) and chemometrics (see, e.g. [4]), where the need for analyzing multiway data is imperative. Since then, tensor decomposition has become an active field of research in mathematics, due partly to the intricate algebraic structures of tensors. In addition, high order (tensor) statistics and independent component analysis (ICA) have been developed by engineers and statisticians, primarily because of their wide practical applications. There are two major tensor decompositions that can be considered as higher order extensions of the matrix singular value decomposition (SVD): one is known as the CANDECOMP/PARAFAC (CP) decomposition, which attempts to present the tensor as a sum of rank-one tensors and has various applications in chemometrics [2, 4], neuroscience [1], and data mining [6]; the other is known as the Tucker decomposition, which is a higher order generalization of principal component analysis (PCA) and has found wide applications in signal processing [18], data mining [37], computer vision [45, 46], and chemical analysis [22, 23]. There are extensive studies in the literature on finding the CP decomposition and the Tucker decomposition of tensors; see, e.g. [16, 15, 27, 30, 31, 35, 17, 47, 50, 40]. On the computational side, the existing popular algorithms are based on the alternating least squares (ALS) method proposed originally by Carroll and Chang [12] and Harshman [20]. The ALS method is not guaranteed to converge to a global optimum or even a stationary point, but only to a solution where the objective function ceases to decrease. However, if the ALS method does converge under a certain non-degeneracy assumption, then it actually has a local linear convergence rate [44]. Recently, Chen et al. [14] put forward a block coordinate descent type search method known as maximum block improvement (MBI) for solving a class of nonlinear optimization problems with separable structure (including tensor decompositions as special cases), and proved that the

sequence produced by the MBI method converges to a stationary point in general. The method also performs very well numerically. The MBI method has been applied successfully to the problem of finding the best rank-one approximation of tensors [14], and to problems arising from bioinformatics [49, 48] and radar systems [5]. Local linear convergence of the MBI method for certain types of problems was established by Li et al. [33]. A recent treatise on this topic can be found in the Ph.D. thesis of Chen [13]. In this paper, we study the Tucker decomposition problem with the characteristic that the size of its core is unspecified. We shall demonstrate how the MBI method can be applied to solve such problems.

Tucker decomposition can be viewed as a generalization of CP decomposition, which is a Tucker model with an equal number of components in each mode. It was first introduced in 1963 by Tucker [41], and later redefined in Levin [32] and Tucker [42, 43]. The goal of Tucker decomposition is to decompose a tensor into a core tensor multiplied by a matrix along each mode. It is related to the best rank-$(r_1, r_2, \ldots, r_d)$ approximation of $d$-th order tensors (cf. [16]). Typically, a rank-$(r_1, r_2, \ldots, r_d)$ Tucker decomposition (sometimes called the best multilinear rank-$(r_1, r_2, \ldots, r_d)$ approximation) means that the size of the core tensor is $r_1 \times r_2 \times \cdots \times r_d$. In terms of decomposition methods, Tucker [43] proposed three methods to find the Tucker decomposition of three-way tensors in 1966; the first of these is referred to as the Tucker1 method, and is now better known as the higher order singular value decomposition (HOSVD) (see [15]). In [15], De Lathauwer et al. not only showed that the HOSVD of a tensor is a convincing generalization of the SVD for the matrix case, but also proposed a truncated HOSVD that gives a suboptimal rank-$(r_1, r_2, \ldots, r_d)$ approximation of tensors. Analogous to the ALS method developed for computing the CP decomposition, researchers have derived similar methods for computing the Tucker decomposition. Kroonenberg and De Leeuw [30] proposed TUCKALS3 for computing the Tucker decomposition of three-way arrays. Later Kapteyn et al. [25] extended TUCKALS3 to decompose general $d$-way arrays for $d > 3$. Moreover, De Lathauwer et al. [16] derived a more efficient technique called the higher order orthogonal iteration (HOOI) method for computing the rank-$(r_1, r_2, \ldots, r_d)$ Tucker decomposition. Possible improvements to the HOOI algorithm are considered by Andersson and Bro [3]. However, as mentioned earlier, the ALS method has no convergence guarantee in general. Alternatively, there are methods for Tucker decomposition with a convergence guarantee. Eldén and Savas [19], and Ishteva et al. [24], simultaneously proposed Newton-based methods for computing the best multilinear rank-$(r_1, r_2, r_3)$ approximation of tensors; the former is known as the Newton-Grassmann method, and the latter is called the differential-geometric Newton method. Both methods have a quadratic local convergence property (note that computing the Hessian is usually a very demanding task). For more discussions on the Tucker decomposition and its variants, one is referred to the survey paper by Kolda and Bader [29].

In Tucker decomposition, the rank $(r_1, r_2, \ldots, r_d)$ (namely the size of the core) is considered a set of predetermined parameters. This requirement, however, is restrictive in some applications.
For example, the problem of how to choose a good number of clusters for co-clustering of gene expression data comes up in bioinformatics; in fact, the number of suitable co-clusters is not known in advance (see Zhang et al. [48]). The question of how to choose the rank of a Tucker model (cf. [39, 26, 21]) is challenging, since the decomposition problem is already NP-hard even when the Tucker rank $(r_1, r_2, \ldots, r_d)$ is known. A similar issue of selecting an appropriate rank has also been considered for the CP decomposition; e.g. a consistency diagnostic named CORCONDIA was designed in [11].

Timmerman and Kiers [39] proposed the DIFFIT procedure, based on optimal fit, to choose the numbers of components in the Tucker decomposition of three-way data. The procedure, however, is rather time-consuming. Kiers and Der Kinderen [26] revised the procedure and computed DIFFIT on an approximate fit to save computational time. In this paper, we propose a new scheme to choose good ranks in Tucker decomposition, provided that the summation of the ranks $\sum_{i=1}^d r_i$ is fixed.

The remainder of the paper is organized as follows. In Section 2, we introduce the notation and the frequently used tensor operations throughout the paper. We then apply the MBI method to solve the rank-$(r_1, r_2, \ldots, r_d)$ Tucker decomposition in Section 3. In Section 4, a new model for Tucker decomposition with unspecified core size is proposed and solved based on the MBI method and the penalty method; a heuristic approach for the new model is also developed in that section. Finally, numerical results on testing the new model and algorithms are presented in Section 5.

2. Notations. Throughout this paper, we uniformly use non-bold lowercase letters, boldface lowercase letters, capital letters, and calligraphic letters to denote scalars, vectors, matrices, and tensors, respectively; e.g. scalar $i$, vector $\mathbf{y}$, matrix $A$, and tensor $\mathcal{F}$. We use subscripts to denote the components of a vector, a matrix, or a tensor; e.g. $y_i$ is the $i$-th entry of the vector $\mathbf{y}$, $A_{ij}$ is the $(i,j)$-th entry of the matrix $A$, and $F_{ijk}$ is the $(i,j,k)$-th entry of the tensor $\mathcal{F}$. Let us first introduce some important tensor operations that appear frequently in this paper, largely in line with [29, 15]. For an overview of tensor operations and properties, we refer to the survey paper [29].

A tensor is a multidimensional array, and the order of a tensor is its dimension, also known as the ways or the modes of the tensor. In particular, a vector is a tensor of order one, and a matrix is a tensor of order two. Consider $\mathcal{A} = (A_{i_1 i_2 \ldots i_d}) \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$, a standard tensor of order $d$, where $d \geq 3$. A usual way to handle a tensor is to reorder its elements into a matrix; this process is called matricization, also known as unfolding or flattening. $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ has $d$ modes, namely mode-1, mode-2, ..., mode-$d$. Denote the mode-$k$ matricization of tensor $\mathcal{A}$ by $A_{(k)}$; then the $(i_1, i_2, \ldots, i_d)$-th entry of tensor $\mathcal{A}$ is mapped to the $(i_k, j)$-th entry of the matrix $A_{(k)} \in \mathbb{R}^{n_k \times \prod_{l \neq k} n_l}$, where
$$ j = 1 + \sum_{l=1,\, l \neq k}^{d} (i_l - 1) \prod_{t=1,\, t \neq k}^{l-1} n_t. $$
The $k$-rank of tensor $\mathcal{A}$, denoted by $\mathrm{rank}_k(\mathcal{A})$, is the column rank of the mode-$k$ unfolding $A_{(k)}$, i.e., $\mathrm{rank}_k(\mathcal{A}) = \mathrm{rank}(A_{(k)})$. A $d$-th order tensor with $\mathrm{rank}_k(\mathcal{A}) = r_k$ for $k = 1, 2, \ldots, d$ is briefly called a rank-$(r_1, r_2, \ldots, r_d)$ tensor. Analogous to the Frobenius norm of a matrix, the Frobenius norm of tensor $\mathcal{A}$ is the usual 2-norm, defined by
$$ \|\mathcal{A}\| := \sqrt{\sum_{i_1=1}^{n_1} \sum_{i_2=1}^{n_2} \cdots \sum_{i_d=1}^{n_d} A_{i_1 i_2 \ldots i_d}^2}. $$
In this paper we uniformly denote the 2-norm for vectors, and the Frobenius norm for matrices and tensors, by the notation $\|\cdot\|$.
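To make the matricization and the $k$-rank concrete, here is a minimal sketch in Python/NumPy (the paper itself works in MATLAB with the Tensor Toolbox, so this is only an illustration of the definitions above); the reshape order is chosen so that the column index matches the formula for $j$.

```python
import numpy as np

def unfold(A, k):
    """Mode-k unfolding A_(k): entry (i_1,...,i_d) goes to row i_k, and the remaining
    indices are flattened so that the earliest mode varies fastest, matching the
    formula for j in the text."""
    return np.reshape(np.moveaxis(A, k, 0), (A.shape[k], -1), order="F")

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 3))

for k in range(A.ndim):
    Ak = unfold(A, k)
    # rank_k(A) = rank(A_(k))
    print(k, Ak.shape, np.linalg.matrix_rank(Ak))

# Frobenius norm of a tensor: the 2-norm of all its entries
print(np.sqrt(np.sum(A ** 2)), np.linalg.norm(A))
```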

The inner product of two tensors $\mathcal{A}, \mathcal{B}$ of the same size is given by
$$ \langle \mathcal{A}, \mathcal{B} \rangle = \sum_{i_1=1}^{n_1} \sum_{i_2=1}^{n_2} \cdots \sum_{i_d=1}^{n_d} A_{i_1 i_2 \ldots i_d} B_{i_1 i_2 \ldots i_d}. $$
Hence, it is clear that $\langle \mathcal{A}, \mathcal{A} \rangle = \|\mathcal{A}\|^2$.

One important tensor operation is the multiplication of a tensor by a matrix. The $k$-mode product of tensor $\mathcal{A}$ by a matrix $U \in \mathbb{R}^{m \times n_k}$, denoted by $\mathcal{A} \times_k U$, is a tensor in $\mathbb{R}^{n_1 \times \cdots \times n_{k-1} \times m \times n_{k+1} \times \cdots \times n_d}$, whose $(i_1, \ldots, i_{k-1}, l, i_{k+1}, \ldots, i_d)$-th entry is defined by
$$ (\mathcal{A} \times_k U)_{i_1 \ldots i_{k-1}\, l\, i_{k+1} \ldots i_d} = \sum_{i_k=1}^{n_k} A_{i_1 i_2 \ldots i_d} U_{l i_k}. $$
The equation can also be written in terms of tensor unfoldings, i.e.,
$$ \mathcal{Y} = \mathcal{A} \times_k U \iff Y_{(k)} = U A_{(k)}. $$
This multiplication in fact changes the dimension of tensor $\mathcal{A}$ in mode $k$. In particular, if $U$ is a vector in $\mathbb{R}^{n_k}$, the order of the tensor $\mathcal{A} \times_k U$ is reduced to $d-1$, and its size is $n_1 \times \cdots \times n_{k-1} \times n_{k+1} \times \cdots \times n_d$. The $k$-mode products can also be expressed via the matrix Kronecker product as follows:
$$ \mathcal{Y} = \mathcal{A} \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_d U^{(d)} \iff Y_{(k)} = U^{(k)} A_{(k)} \left( U^{(d)} \otimes \cdots \otimes U^{(k+1)} \otimes U^{(k-1)} \otimes \cdots \otimes U^{(1)} \right)^{\mathrm T}, $$
for any $k = 1, 2, \ldots, d$, with $U^{(k)} \in \mathbb{R}^{m_k \times n_k}$ for $k = 1, 2, \ldots, d$. The proof of this property can be found in [28]. Throughout this paper we uniformly use a subscript in parentheses to denote the matricization of a tensor (e.g. $A_{(1)}$ is the mode-1 matricization of tensor $\mathcal{A}$), and a superscript in parentheses to denote the matrix in a mode product of a tensor (e.g. $U^{(1)}$, of appropriate size, appears in the mode-1 product of a tensor).

3. Convergence of the traditional Tucker decomposition. Traditionally, Tucker decomposition attempts to find the best approximation of a large tensor by a small core tensor with pre-specified dimensions, multiplied by a matrix on each mode. The problem can be formulated as follows: given a real tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$, find a core tensor $\mathcal{C} \in \mathbb{R}^{r_1 \times r_2 \times \cdots \times r_d}$, with pre-specified integers $r_i$ satisfying $1 \leq r_i \leq n_i$ for $i = 1, 2, \ldots, d$, that solves
$$ (T_{\min}) \quad \min \ \|\mathcal{F} - \mathcal{C} \times_1 A^{(1)} \times_2 A^{(2)} \cdots \times_d A^{(d)}\| \quad \text{s.t.} \ \mathcal{C} \in \mathbb{R}^{r_1 \times r_2 \times \cdots \times r_d}, \ A^{(i)} \in \mathbb{R}^{n_i \times r_i}, \ i = 1, 2, \ldots, d. $$
Here the matrices $A^{(i)}$ are the factor matrices. Without loss of generality, these matrices are assumed to be columnwise orthogonal. The problem can be considered as a generalization of the best rank-one approximation problem, namely the case $r_1 = r_2 = \cdots = r_d = 1$. For any fixed matrices $A^{(i)}$, if we optimize the objective function of $(T_{\min})$ over $\mathcal{C}$, it is not hard to verify that $(T_{\min})$ is equivalent to the following maximization model (see the discussion in [29, 16, 3]):
$$ (T_{\max}) \quad \max \ \|\mathcal{F} \times_1 (A^{(1)})^{\mathrm T} \times_2 (A^{(2)})^{\mathrm T} \cdots \times_d (A^{(d)})^{\mathrm T}\| \quad \text{s.t.} \ A^{(i)} \in \mathbb{R}^{n_i \times r_i}, \ (A^{(i)})^{\mathrm T} A^{(i)} = I, \ i = 1, 2, \ldots, d. $$
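The $k$-mode product and the Kronecker-product identity above can be checked numerically. The following sketch (again Python/NumPy rather than the paper's MATLAB setting) implements `unfold`, its inverse `fold`, and `mode_product`, and verifies both $Y_{(k)} = U A_{(k)}$ and the Kronecker identity on a random example; the helper names are ours, not the paper's.

```python
import numpy as np

def unfold(A, k):
    """Mode-k unfolding, repeated here so the snippet runs on its own."""
    return np.reshape(np.moveaxis(A, k, 0), (A.shape[k], -1), order="F")

def fold(M, k, shape):
    """Inverse of unfold: rebuild a tensor of the given shape from its mode-k unfolding."""
    full = [shape[k]] + [s for i, s in enumerate(shape) if i != k]
    return np.moveaxis(M.reshape(full, order="F"), 0, k)

def mode_product(A, U, k):
    """k-mode product A x_k U, defined through (A x_k U)_(k) = U A_(k)."""
    new_shape = list(A.shape)
    new_shape[k] = U.shape[0]
    return fold(U @ unfold(A, k), k, tuple(new_shape))

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 3))
U = rng.standard_normal((6, 5))          # acts on mode 2 (size 5 -> 6), zero-based index 1

Y = mode_product(A, U, 1)
print(Y.shape)                           # (4, 6, 3)
print(np.allclose(unfold(Y, 1), U @ unfold(A, 1)))   # the defining identity

# Kronecker-product identity for a full multilinear product
U1 = rng.standard_normal((2, 4))
U2 = rng.standard_normal((6, 5))
U3 = rng.standard_normal((7, 3))
Y = mode_product(mode_product(mode_product(A, U1, 0), U2, 1), U3, 2)
print(np.allclose(unfold(Y, 0), U1 @ unfold(A, 0) @ np.kron(U3, U2).T))
```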

The workhorse for solving $(T_{\max})$ or $(T_{\min})$ has traditionally been the ALS method. However, the convergence of the ALS method is not guaranteed in general. Chen et al. [14] proposed the MBI method, a greedy-type search algorithm for optimization models with a general separable structure
$$ (G) \quad \max \ f(x^1, x^2, \ldots, x^d) \quad \text{s.t.} \ x^i \in S^i \subseteq \mathbb{R}^{n_i}, \ i = 1, 2, \ldots, d, $$
where $f: \mathbb{R}^{n_1 + n_2 + \cdots + n_d} \to \mathbb{R}$ is a general continuous function. Assuming that, for any fixed $d-1$ blocks of variables, the optimization over the remaining block can be carried out, the MBI method is guaranteed to converge to a stationary point under the mild condition that $S^i$ is compact for $i = 1, 2, \ldots, d$. We notice that $(T_{\max})$ is a particular instance of $(G)$. Moreover, by fixing any $d-1$ blocks $A^{(i)}$ in $(T_{\max})$, the optimization subproblem in the MBI method can be easily solved by the singular value decomposition (SVD). To be specific, the subproblem of $(T_{\max})$ obtained by applying the MBI method is
$$ (T_{\max}^i) \quad \max \ \|\mathcal{F} \times_1 (A^{(1)})^{\mathrm T} \times_2 (A^{(2)})^{\mathrm T} \cdots \times_d (A^{(d)})^{\mathrm T}\| \quad \text{s.t.} \ A^{(i)} \in \mathbb{R}^{n_i \times r_i}, \ (A^{(i)})^{\mathrm T} A^{(i)} = I, $$
where the matrices $A^{(1)}, A^{(2)}, \ldots, A^{(i-1)}, A^{(i+1)}, \ldots, A^{(d)}$ are given. The objective function of $(T_{\max}^i)$ can be written in matrix form as
$$ \left\| (A^{(i)})^{\mathrm T} F_{(i)} \left( A^{(d)} \otimes \cdots \otimes A^{(i+1)} \otimes A^{(i-1)} \otimes \cdots \otimes A^{(1)} \right) \right\|. $$
Therefore, the optimal solution of $(T_{\max}^i)$ consists of the $r_i$ leading left singular vectors of the matrix $F_{(i)} \left( A^{(d)} \otimes \cdots \otimes A^{(i+1)} \otimes A^{(i-1)} \otimes \cdots \otimes A^{(1)} \right)$. Let us now present the algorithm for solving $(T_{\max})$ by the MBI approach.

Algorithm 1. The MBI method for Tucker decomposition
Input: tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ and integers $r_1, r_2, \ldots, r_d$.
Output: core tensor $\mathcal{C} \in \mathbb{R}^{r_1 \times r_2 \times \cdots \times r_d}$ and matrices $A^{(i)} \in \mathbb{R}^{n_i \times r_i}$ for $i = 1, 2, \ldots, d$.
Step 0. Choose an initial feasible solution $(A^{(1)}_0, A^{(2)}_0, \ldots, A^{(d)}_0)$ and compute the initial objective value $v_0 := \|\mathcal{F} \times_1 (A^{(1)}_0)^{\mathrm T} \times_2 (A^{(2)}_0)^{\mathrm T} \cdots \times_d (A^{(d)}_0)^{\mathrm T}\|$. Set $k := 0$.
Step 1. For each $i = 1, 2, \ldots, d$, compute $B^{(i)}_{k+1}$, consisting of the $r_i$ leading left singular vectors of $F_{(i)} \left( A^{(d)}_k \otimes \cdots \otimes A^{(i+1)}_k \otimes A^{(i-1)}_k \otimes \cdots \otimes A^{(1)}_k \right)$, and let
$$ w^i_{k+1} := \|\mathcal{F} \times_1 (A^{(1)}_k)^{\mathrm T} \cdots \times_{i-1} (A^{(i-1)}_k)^{\mathrm T} \times_i (B^{(i)}_{k+1})^{\mathrm T} \times_{i+1} (A^{(i+1)}_k)^{\mathrm T} \cdots \times_d (A^{(d)}_k)^{\mathrm T}\|. $$
Step 2. Let $v_{k+1} := \max_{1 \leq i \leq d} w^i_{k+1}$, choose one $i^* = \arg\max_{1 \leq i \leq d} w^i_{k+1}$, and set
$$ A^{(i)}_{k+1} := \begin{cases} A^{(i)}_k, & i \in \{1, 2, \ldots, d\} \setminus \{i^*\}, \\ B^{(i)}_{k+1}, & i = i^*. \end{cases} $$
Step 3. If $v_{k+1} - v_k \leq \epsilon$, stop, and output the core tensor $\mathcal{C} := \mathcal{F} \times_1 (A^{(1)}_{k+1})^{\mathrm T} \times_2 (A^{(2)}_{k+1})^{\mathrm T} \cdots \times_d (A^{(d)}_{k+1})^{\mathrm T}$ and the matrices $A^{(i)} = A^{(i)}_{k+1}$ for $i = 1, 2, \ldots, d$; otherwise, set $k := k+1$ and go to Step 1.
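Below is a compact sketch of Algorithm 1, assuming the `unfold` and `mode_product` helpers from the previous snippet; the single-block subproblem is solved by an SVD exactly as described above, but this is an illustrative re-implementation in Python/NumPy, not the authors' MATLAB code.

```python
import numpy as np
# assumes unfold(A, k) and mode_product(A, U, k) from the earlier sketches

def partial_compress(F, A, skip=None):
    """Multiply F by (A^(j))^T on every mode j except `skip` (skip=None: all modes)."""
    G = F
    for j, M in enumerate(A):
        if j != skip:
            G = mode_product(G, M.T, j)
    return G

def mbi_tucker(F, ranks, max_iter=500, tol=1e-8, seed=0):
    """Algorithm 1 (sketch): MBI for the rank-(r_1,...,r_d) Tucker problem (T_max)."""
    rng = np.random.default_rng(seed)
    d = F.ndim
    # Step 0: random orthonormal starting factors
    A = [np.linalg.qr(rng.standard_normal((F.shape[i], ranks[i])))[0] for i in range(d)]
    obj = lambda mats: np.linalg.norm(partial_compress(F, mats, skip=None))
    v = obj(A)
    for _ in range(max_iter):
        trial_vals, trial_blocks = [], []
        for i in range(d):
            # Step 1: (F x_{j!=i} (A^(j))^T)_(i) equals F_(i)(A^(d) x ... x A^(1)),
            # so its leading left singular vectors solve the block-i subproblem
            W = unfold(partial_compress(F, A, skip=i), i)
            U, _, _ = np.linalg.svd(W, full_matrices=False)
            B = U[:, :ranks[i]]
            trial_blocks.append(B)
            trial_vals.append(obj(A[:i] + [B] + A[i + 1:]))
        # Step 2: accept only the single block giving the maximum improvement
        i_star = int(np.argmax(trial_vals))
        v_new = trial_vals[i_star]
        A[i_star] = trial_blocks[i_star]
        # Step 3: stop when the improvement falls below the tolerance
        if v_new - v <= tol:
            break
        v = v_new
    core = partial_compress(F, A, skip=None)
    return core, A
```

Note that only the block giving the largest improvement is accepted in each round; this is what distinguishes MBI from the cyclic updates used in ALS/HOOI.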

According to the convergence result for the MBI method established in [14], Algorithm 1 is guaranteed to converge to a stationary point of the Tucker decomposition problem, which is not the case for many other algorithms, e.g. the HOSVD method and the HOOI method.

4. Tucker decomposition with unspecified size of the core. In this section, we propose a new model for Tucker decomposition that does not pre-specify the size of the core tensor. The dimension of each mode of the core is no longer a constant; rather, it is a variable that also needs to be optimized, which is a key ingredient of our model. Several algorithms are proposed to solve the new model as well.

4.1. The formulation. Given a tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$, the goal is to find a small-sized (low rank) core tensor $\mathcal{C}$ and $d$ slim matrices $A^{(i)}$ that express $\mathcal{F}$ as closely as possible. We are interested in determining the rank of each mode of $\mathcal{C}$ as well as the best approximation of $\mathcal{F}$. Let the $i$-rank of $\mathcal{C}$ be $r_i$ for $i = 1, 2, \ldots, d$; clearly, $1 \leq r_i \leq n_i$. Unlike the general Tucker decomposition, the $r_i$ are now decision variables which need to be determined. Denote by $c$ a given constant for the summation of all $i$-rank variables, i.e., $\sum_{i=1}^d r_i = c$, which in general prevents the $r_i$ from being too large. To determine the allocation of the total budget $c$ among the $r_i$, the new model is
$$ (NT_{\min}) \quad \begin{array}{ll} \min & \|\mathcal{F} - \mathcal{C} \times_1 A^{(1)} \times_2 A^{(2)} \cdots \times_d A^{(d)}\| \\ \text{s.t.} & \mathcal{C} \in \mathbb{R}^{r_1 \times r_2 \times \cdots \times r_d}, \ A^{(i)} \in \mathbb{R}^{n_i \times r_i}, \ i = 1, 2, \ldots, d, \\ & r_i \in \mathbb{Z}, \ 1 \leq r_i \leq n_i, \ i = 1, 2, \ldots, d, \ \sum_{i=1}^d r_i = c. \end{array} $$
It is difficult to solve $(NT_{\min})$ directly, since the first two constraints of $(NT_{\min})$ couple the block variables $A^{(i)}$ and the $i$-rank variables $r_i$. A straightforward remedy is to separate these variables: we introduce $d$ more block variables $Y^{(i)} \in \mathbb{R}^{m_i \times m_i}$, where $m_i := \min\{n_i, c\}$ for $i = 1, 2, \ldots, d$, with $Y^{(i)} = \mathrm{diag}(y^{(i)})$, $y^{(i)} \in \{0,1\}^{m_i}$, and $\sum_{j=1}^{m_i} y^{(i)}_j = r_i$ for $i = 1, 2, \ldots, d$. By the equivalence between the Tucker decomposition models $(T_{\min})$ and $(T_{\max})$, we may reformulate $(NT_{\min})$ as follows:
$$ (NT_{\max}) \quad \begin{array}{ll} \max & \|\mathcal{F} \times_1 (A^{(1)} Y^{(1)})^{\mathrm T} \times_2 (A^{(2)} Y^{(2)})^{\mathrm T} \cdots \times_d (A^{(d)} Y^{(d)})^{\mathrm T}\| \\ \text{s.t.} & A^{(i)} \in \mathbb{R}^{n_i \times m_i}, \ (A^{(i)})^{\mathrm T} A^{(i)} = I, \ i = 1, 2, \ldots, d, \\ & y^{(i)} \in \{0,1\}^{m_i}, \ \sum_{j=1}^{m_i} y^{(i)}_j \geq 1, \ i = 1, 2, \ldots, d, \ \sum_{i=1}^d \sum_{j=1}^{m_i} y^{(i)}_j = c. \end{array} $$
Throughout this paper, $Y^{(i)}$ always stands for $\mathrm{diag}(y^{(i)})$, rather than carrying $Y^{(i)} = \mathrm{diag}(y^{(i)})$ explicitly in the constraints, in order to keep the models concise. Note that $r_i$ in $(NT_{\min})$ has been replaced by the number of nonzero entries of $y^{(i)}$ in $(NT_{\max})$. Let
$$ \mathcal{X} = \mathcal{F} \times_1 (A^{(1)} Y^{(1)})^{\mathrm T} \times_2 (A^{(2)} Y^{(2)})^{\mathrm T} \cdots \times_d (A^{(d)} Y^{(d)})^{\mathrm T} \in \mathbb{R}^{m_1 \times m_2 \times \cdots \times m_d}. $$
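The role of the diagonal 0/1 matrices $Y^{(i)}$ can be seen in a small experiment (Python/NumPy, reusing `mode_product` from the earlier sketch; sizes are illustrative): when the ones in $y^{(i)}$ come first, the trailing slices of $\mathcal{X}$ vanish and the core can be trimmed, which is exactly the observation formalized in (4.1) below.

```python
import numpy as np
# assumes mode_product(A, U, k) from the earlier sketch

rng = np.random.default_rng(1)
F = rng.standard_normal((8, 9, 7))
# orthonormal factors with m_i = 5 columns each (illustrative choice)
A = [np.linalg.qr(rng.standard_normal((F.shape[k], 5)))[0] for k in range(3)]
y = [np.array([1, 1, 1, 0, 0]),        # r_1 = 3
     np.array([1, 1, 0, 0, 0]),        # r_2 = 2
     np.array([1, 1, 1, 1, 0])]        # r_3 = 4

X = F
for k in range(3):
    X = mode_product(X, (A[k] @ np.diag(y[k])).T, k)

# slices beyond the leading r_k ones are exactly zero, so the core can be
# trimmed to size 3 x 2 x 4 without changing the approximation
print(np.allclose(X[3:, :, :], 0), np.allclose(X[:, 2:, :], 0), np.allclose(X[:, :, 4:], 0))
print(X[:3, :2, :4].shape)
```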

If the feasible block variables $Y^{(i)}$ satisfy
$$ Y^{(i)} = \mathrm{diag}(\underbrace{1, 1, \ldots, 1}_{r_i}, \underbrace{0, 0, \ldots, 0}_{m_i - r_i}), \quad i = 1, 2, \ldots, d, \tag{4.1} $$
then the size of the core tensor $\mathcal{X}$ can be reduced to $r_1 \times r_2 \times \cdots \times r_d$ by deleting the tails of zero entries in all modes, and the rank of $\mathcal{X}$ in each mode equals $r_1, r_2, \ldots, r_d$, respectively. This observation establishes the equivalence between $(NT_{\min})$ and $(NT_{\max})$. As we shall see in the next subsection, we may assume without loss of generality that $Y^{(i)}$ is of the form (4.1). This allows us to easily construct a core tensor of rank-$(r_1, r_2, \ldots, r_d)$, which is exactly the size of the core tensor $\mathcal{C}$ to be optimized in $(NT_{\min})$. Besides, we would like to remark that the third constraint $\sum_{j=1}^{m_i} y^{(i)}_j \geq 1$ for $i = 1, 2, \ldots, d$ in $(NT_{\max})$ is actually redundant: if $\sum_{j=1}^{m_i} y^{(i)}_j = 0$ for some $i$, then the objective is clearly zero, which can never be optimal. We nevertheless keep it in $(NT_{\max})$ for convenience in implementing the algorithms discussed in the following subsections.

4.2. The penalty method. Let us focus on the model for Tucker decomposition with unspecified core size in its maximization form $(NT_{\max})$. Recall that the separable structure of $(G)$ is required to implement the MBI method. Therefore, we move the nonseparable constraint $\sum_{i=1}^d \sum_{j=1}^{m_i} y^{(i)}_j = c$ into the objective function, i.e., we form the penalty function
$$ p\left(\lambda, A^{(1)}, \ldots, A^{(d)}, Y^{(1)}, \ldots, Y^{(d)}\right) := \left\| \mathcal{F} \times_1 (A^{(1)} Y^{(1)})^{\mathrm T} \times_2 (A^{(2)} Y^{(2)})^{\mathrm T} \cdots \times_d (A^{(d)} Y^{(d)})^{\mathrm T} \right\|^2 - \lambda \left( \sum_{i=1}^d \sum_{j=1}^{m_i} y^{(i)}_j - c \right)^2, $$
where $\lambda > 0$ is a penalty parameter. The penalty model for $(NT_{\max})$ is then
$$ (PT) \quad \begin{array}{ll} \max & p\left(\lambda, A^{(1)}, \ldots, A^{(d)}, Y^{(1)}, \ldots, Y^{(d)}\right) \\ \text{s.t.} & A^{(i)} \in \mathbb{R}^{n_i \times m_i}, \ (A^{(i)})^{\mathrm T} A^{(i)} = I, \ i = 1, 2, \ldots, d, \\ & y^{(i)} \in \{0,1\}^{m_i}, \ \sum_{j=1}^{m_i} y^{(i)}_j \geq 1, \ i = 1, 2, \ldots, d. \end{array} $$
We are now ready to apply the MBI method to solve $(PT)$, since the block constraints are separable. Before presenting the formal algorithm, let us first discuss the subproblems arising in the MBI method, which have to be solved globally in order to guarantee convergence. Without loss of generality, we wish to optimize $(A^{(1)}, Y^{(1)})$ while all the other block variables $(A^{(i)}, Y^{(i)})$ for $i = 2, 3, \ldots, d$ are fixed, i.e.,
$$ (PT_1) \quad \begin{array}{ll} \max & \left\| (A^{(1)} Y^{(1)})^{\mathrm T} W^{(1)} \right\|^2 - \lambda \left( \sum_{j=1}^{m_1} y^{(1)}_j + c_1 \right)^2 \\ \text{s.t.} & A^{(1)} \in \mathbb{R}^{n_1 \times m_1}, \ (A^{(1)})^{\mathrm T} A^{(1)} = I, \ y^{(1)} \in \{0,1\}^{m_1}, \ \sum_{j=1}^{m_1} y^{(1)}_j \geq 1, \end{array} $$
where $W^{(1)} := F_{(1)} \left( A^{(d)} Y^{(d)} \otimes \cdots \otimes A^{(2)} Y^{(2)} \right)$ and $c_1 := \sum_{i=2}^d \sum_{j=1}^{m_i} y^{(i)}_j - c$. This subproblem can indeed be solved easily.

First, since the optimization of $A^{(1)}$ is not affected by the penalty term, the optimal $A^{(1)}$ consists of the $m_1$ leading left singular vectors of the matrix $W^{(1)}$, obtained by applying the SVD. Next, we search for the optimal $Y^{(1)}$ given this optimal $A^{(1)}$. Denote by $v_j$ the $j$-th row vector of the matrix $(A^{(1)})^{\mathrm T} W^{(1)}$ for $j = 1, 2, \ldots, m_1$; then
$$ \left\| (A^{(1)} Y^{(1)})^{\mathrm T} W^{(1)} \right\|^2 = \sum_{j=1}^{m_1} y^{(1)}_j \|v_j\|^2. $$
Therefore, the optimization over $Y^{(1)}$ is the following problem:
$$ \begin{array}{ll} \max & -\lambda \left( \sum_{j=1}^{m_1} y^{(1)}_j \right)^2 + \sum_{j=1}^{m_1} \left( \|v_j\|^2 - 2\lambda c_1 \right) y^{(1)}_j - \lambda c_1^2 \\ \text{s.t.} & y^{(1)} \in \{0,1\}^{m_1}, \ \sum_{j=1}^{m_1} y^{(1)}_j \geq 1. \end{array} $$
Although this model appears to be combinatorial, it is solvable in polynomial time. The reason is that the possible values of $\sum_{j=1}^{m_1} y^{(1)}_j$ are $\{1, 2, \ldots, m_1\}$, and for any fixed value of $\sum_{j=1}^{m_1} y^{(1)}_j$ the objective function becomes linear, so the optimal allocation of $y^{(1)}$ can be made greedily. Thus we only need to try $m_1$ different values of $\sum_{j=1}^{m_1} y^{(1)}_j$ and pick the best solution.

From the above discussion, we know that the optimal $Y^{(1)}$ automatically satisfies the form (4.1). This is because $A^{(1)}$ consists of the $m_1$ leading left singular vectors of the matrix $W^{(1)}$, so the norms $\|v_j\|$ are in non-increasing order. We can therefore update $A^{(1)}$ and $Y^{(1)}$ simultaneously when solving the subproblem of the MBI method for a given penalty parameter $\lambda$. To summarize, the whole procedure for solving $(NT_{\max})$ using the MBI method and the penalty function method (cf. [8, 38]) is as follows.

Algorithm 2. The MBI method for the penalty model
Input: tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ and an integer $c \geq d$.
Output: core tensor $\mathcal{C} \in \mathbb{R}^{m_1 \times m_2 \times \cdots \times m_d}$ and matrices $A^{(i)} \in \mathbb{R}^{n_i \times m_i}$ for $i = 1, 2, \ldots, d$.
Step 0. Choose penalty parameters $\lambda_0 > 0$, $\sigma > 1$ and an initial feasible solution $(A^{(1)}_0, \ldots, A^{(d)}_0, Y^{(1)}_0, \ldots, Y^{(d)}_0)$, and compute the initial objective value $v_0 := p(\lambda_0, A^{(1)}_0, \ldots, A^{(d)}_0, Y^{(1)}_0, \ldots, Y^{(d)}_0)$. Set $k := 0$, $l := 0$ and $\lambda := \lambda_0$.
Step 1. For each $i = 1, 2, \ldots, d$, solve $(PT_i)$, defined analogously to $(PT_1)$ with block 1 replaced by block $i$, and obtain its optimal solution $(B^{(i)}_{k+1}, Z^{(i)}_{k+1})$ with optimal value $w^i_{k+1}$.
Step 2. Let $v_{k+1} := \max_{1 \leq i \leq d} w^i_{k+1}$, choose one $i^* = \arg\max_{1 \leq i \leq d} w^i_{k+1}$, and set
$$ \left( A^{(i)}_{k+1}, Y^{(i)}_{k+1} \right) := \begin{cases} \left( A^{(i)}_k, Y^{(i)}_k \right), & i \in \{1, 2, \ldots, d\} \setminus \{i^*\}, \\ \left( B^{(i)}_{k+1}, Z^{(i)}_{k+1} \right), & i = i^*. \end{cases} $$
Step 3. If $v_{k+1} - v_k \leq \epsilon$, go to Step 4; otherwise, set $k := k+1$ and go to Step 1.
Step 4. If $\lambda_l \left( \sum_{i=1}^d \sum_{j=1}^{m_i} (y^{(i)}_{k+1})_j - c \right)^2 \leq \epsilon_0$, stop, and output the core tensor $\mathcal{C} := \mathcal{F} \times_1 (A^{(1)}_{k+1} Y^{(1)}_{k+1})^{\mathrm T} \times_2 (A^{(2)}_{k+1} Y^{(2)}_{k+1})^{\mathrm T} \cdots \times_d (A^{(d)}_{k+1} Y^{(d)}_{k+1})^{\mathrm T}$ and the matrices $A^{(i)} = A^{(i)}_{k+1} Y^{(i)}_{k+1}$ for $i = 1, 2, \ldots, d$; otherwise, set $l := l + 1$, $\lambda := \lambda_0 \sigma^l$ and $k := 0$, and go to Step 1.
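For concreteness, here is a minimal sketch of the single-block subproblem $(PT_i)$ used inside Algorithm 2 (Python/NumPy; `W`, `lam`, `c1` correspond to $W^{(1)}$, $\lambda$, $c_1$ above, and the function name is ours, not the paper's).

```python
import numpy as np

def solve_block_subproblem(W, lam, c1, m):
    """Globally solve (PT_1): A consists of the m leading left singular vectors of W,
    then a 0/1 vector y maximizes sum_j y_j ||v_j||^2 - lam*(sum(y) + c1)^2, sum(y) >= 1."""
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    A = U[:, :m]                              # optimal A, unaffected by the penalty term
    v = np.sum((A.T @ W) ** 2, axis=1)        # v[j] = ||j-th row of A^T W||^2, non-increasing
    best_val, best_r = -np.inf, 1
    for r in range(1, m + 1):                 # try every possible number of ones in y
        val = v[:r].sum() - lam * (r + c1) ** 2
        if val > best_val:
            best_val, best_r = val, r
    y = np.zeros(m)
    y[:best_r] = 1.0                          # ones first, so the form (4.1) holds automatically
    return A, y, best_val

# tiny illustrative call with hypothetical inputs
rng = np.random.default_rng(0)
W = rng.standard_normal((30, 40))
A, y, val = solve_block_subproblem(W, lam=1.0, c1=-5, m=10)
print(int(y.sum()), val)
```

In Algorithm 2 this routine is called once per block, with $W^{(i)}$ assembled from the other fixed blocks, and only the block with the largest returned value is updated, following the MBI rule.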

We remark that when Algorithm 2 stops, we can shrink the size of the core tensor to $r_1 \times r_2 \times \cdots \times r_d$ by deleting the tails of zero entries in all modes, where $r_i$ is the number of nonzero entries of $y^{(i)}$. As an MBI method, Algorithm 2 is guaranteed to converge to a stationary point of $(PT)$, as claimed in [14].

4.3. A heuristic method. To further save computational effort, in this subsection we present a heuristic approach for the Tucker decomposition model with unspecified core size, i.e., $(NT_{\max})$. We know that if each $i$-rank ($i = 1, 2, \ldots, d$) of the core tensor $\mathcal{C}$ is given, then the problem becomes the traditional Tucker decomposition discussed in Section 3, which can be solved by the MBI method (Algorithm 1). From the discussion in Section 4, we notice that the key issue in the model is how to allocate the total budget $c$ among the $i$-ranks of the core tensor. Therefore, the optimal value of $(T_{\max})$ can be considered as a function of $(r_1, r_2, \ldots, r_d)$, denoted by $g(r_1, r_2, \ldots, r_d)$. Our heuristic approach tries to find a distribution of the constant $c$ by adapting the idea of the MBI method. Specifically, we may start with low $i$-ranks, e.g. $r^0_1 = r^0_2 = \cdots = r^0_d = 1$, and compute a Tucker decomposition using Algorithm 1; then we increase one of the $i$-ranks by one, choosing the mode that yields the best Tucker decomposition among the $d$ possible increments, i.e.,
$$ (r^{k+1}_1, r^{k+1}_2, \ldots, r^{k+1}_d) = \arg\max_{1 \leq i \leq d} \ g(r^k_1, r^k_2, \ldots, r^k_{i-1}, r^k_i + 1, r^k_{i+1}, \ldots, r^k_d). $$
This procedure is continued until $\sum_{i=1}^d r^k_i = c$ for some $k$ ($= c - d$). If the function $g(r_1, r_2, \ldots, r_d)$ for Tucker decomposition could be globally solved for any given $(r_1, r_2, \ldots, r_d)$ (though this is NP-hard in general), then the above approach would be a dynamic program and the optimality of $(NT_{\max})$ would be guaranteed when the method stops. Although in general Algorithm 1 can only find a stationary solution for $g(r_1, r_2, \ldots, r_d)$, this heuristic approach is in fact very effective; see the numerical tests in Section 5. Apart from the rank increasing approach, a rank decreasing strategy is another alternative, i.e., starting from large $r_i$ and decreasing them until $\sum_{i=1}^d r_i = c$. We conclude this subsection by presenting the heuristic algorithm for the rank decreasing method below, which is very easy to implement; a code sketch follows after the algorithm.

Algorithm 3. The rank decreasing method
Input: tensor $\mathcal{F} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ and an integer $c \geq d$.
Output: core tensor $\mathcal{C} \in \mathbb{R}^{r_1 \times r_2 \times \cdots \times r_d}$ and matrices $A^{(i)} \in \mathbb{R}^{n_i \times r_i}$ for $i = 1, 2, \ldots, d$.
Step 0. Choose initial ranks $(r_1, r_2, \ldots, r_d)$ with $\sum_{i=1}^d r_i > c$.
Step 1. Apply Algorithm 1 to solve $(T_{\max})$ with input $(r_1, r_2, \ldots, r_d)$, and output the matrices $A^{(i)}$ for $i = 1, 2, \ldots, d$.
Step 2. For each $i = 1, 2, \ldots, d$, let $B^{(i)} \in \mathbb{R}^{n_i \times (r_i - 1)}$ be the matrix obtained by deleting the $r_i$-th column of $A^{(i)}$, and compute
$$ i^* := \arg\max_{1 \leq i \leq d} \ \left\| \mathcal{F} \times_1 (A^{(1)})^{\mathrm T} \cdots \times_{i-1} (A^{(i-1)})^{\mathrm T} \times_i (B^{(i)})^{\mathrm T} \times_{i+1} (A^{(i+1)})^{\mathrm T} \cdots \times_d (A^{(d)})^{\mathrm T} \right\|. $$
Update $r_{i^*} := r_{i^*} - 1$ and $A^{(i^*)} := B^{(i^*)}$. Repeat this step until $\sum_{i=1}^d r_i = c$.
Step 3. Apply Algorithm 1 to solve $(T_{\max})$ with input $(r_1, r_2, \ldots, r_d)$, and output the core tensor $\mathcal{C}$ and matrices $A^{(i)}$ for $i = 1, 2, \ldots, d$.
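As announced above, here is a sketch of the rank decreasing method in Python/NumPy, reusing `partial_compress` and `mbi_tucker` from the earlier sketches; for simplicity the final refit starts from random factors rather than warm-starting from Step 2, as the paper's Algorithm 3 does.

```python
import numpy as np
# assumes partial_compress(F, A, skip) and mbi_tucker(F, ranks) from the earlier sketches

def rank_decreasing_tucker(F, c, seed=0):
    """Algorithm 3 (sketch): start from r_i = min(n_i, c), drop one factor column at a
    time until sum(r_i) == c, then refit with Algorithm 1."""
    d = F.ndim
    ranks = [min(n, c) for n in F.shape]
    _, A = mbi_tucker(F, ranks, seed=seed)                 # Step 1
    value = lambda mats: np.linalg.norm(partial_compress(F, mats, skip=None))
    while sum(ranks) > c:                                  # Step 2
        best_val, best_i = -np.inf, None
        for i in range(d):
            if ranks[i] == 1:
                continue                                   # keep every mode rank at least 1
            trial = A[:i] + [A[i][:, :-1]] + A[i + 1:]     # delete the last column of A^(i)
            val = value(trial)
            if val > best_val:
                best_val, best_i = val, i
        ranks[best_i] -= 1
        A[best_i] = A[best_i][:, :-1]
    core, A = mbi_tucker(F, ranks, seed=seed)              # Step 3: refit at the chosen ranks
    return core, A, ranks
```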

5. Numerical experiments. In this section, we apply the algorithms proposed in the previous sections to solve Tucker decomposition without specification of the core size. Four different types of data are tested to demonstrate the effectiveness and efficiency of our algorithms. All the numerical computations are conducted on an Intel Xeon 3.40GHz CPU with 8GB of RAM, using MATLAB (R2011a) as the platform. We use the MATLAB Tensor Toolbox Version 2.5 [7] whenever tensor operations are called, and we also apply its embedded algorithm (the ALS method) to solve Tucker decomposition with a given rank-$(r_1, r_2, \ldots, r_d)$. The termination precision for the ALS method is set to a fixed small tolerance.

In implementing Algorithm 2, the penalty increment parameter is set as $\sigma := 2$. The starting matrices $A^{(i)}_0$ are randomly generated and then orthonormalized. The starting diagonal matrices are all set as $Y^{(i)}_0 := \mathrm{diag}(1, 0, 0, \ldots, 0)$ for $i = 1, 2, \ldots, d$, and the termination precision for Algorithm 2 is set to the same tolerance. For Algorithm 3, the initial ranks are set as $r_i := \min(n_i, c)$ for $i = 1, 2, \ldots, d$. When applying Algorithm 1 in Step 1, the starting matrices $A^{(i)}_0$ for solving $(T_{\max})$ are randomly generated; when applying Algorithm 1 in Step 3, the starting matrices for solving $(T_{\max})$ are obtained from Step 2.

For a given data tensor $\mathcal{F}$ in the following tests and the rank-$(r_1, r_2, \ldots, r_d)$ approximation tensor $\hat{\mathcal{F}}$ computed by the algorithms, the relative square error of the approximation is defined as $\|\mathcal{F} - \hat{\mathcal{F}}\| / \|\mathcal{F}\|$. To measure how close the approximation is to the original tensor $\mathcal{F}$, the fit is defined as
$$ \mathrm{fit} := 1 - \frac{\|\mathcal{F} - \hat{\mathcal{F}}\|}{\|\mathcal{F}\|}. $$
The larger the fit, the better the performance.

5.1. Noisy tensor decomposition. First we present some preliminary test results on two synthetic data sets. We use the Tensor Toolbox to randomly generate a noise-free third-order tensor $\mathcal{Y}$ with rank-$(4, 4, 2)$, where the entries of the core tensor follow standard normal distributions and the entries of the orthogonal factor matrices follow uniform distributions. A noise tensor $\mathcal{N}$ of the same size is randomly generated, whose entries follow standard normal distributions. The noisy tensor $\mathcal{Z}$ to be tested is then constructed as
$$ \mathcal{Z} = \mathcal{Y} + \eta \, \frac{\|\mathcal{Y}\|}{\|\mathcal{N}\|} \, \mathcal{N}, $$
where $\eta$ is the noise parameter, or more formally the perturbation ratio. Our task is to recover the true $i$-ranks of the core tensor from the corrupted tensor $\mathcal{Z}$. In this set of tests, the initial penalty parameter for Algorithm 2 is set as $\lambda_0 = \|\mathcal{Z}\|$, and the summation of the ranks is set as $c = 10$. The results of this experiment are summarized in Figure 5.1, reporting the fit and the computational time of Algorithms 2 and 3 for different perturbation ratios. For comparison, we also call the ALS method to solve the traditional Tucker decomposition with given $(r_1, r_2, r_3)$, which are randomly generated satisfying $r_1 + r_2 + r_3 = 10$.
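A small end-to-end illustration of the fit measure and the noisy-tensor construction, in Python/NumPy and reusing `mode_product` and `mbi_tucker` from the earlier sketches; the tensor sizes below are illustrative choices, not the ones used in the paper.

```python
import numpy as np
# assumes mode_product(A, U, k) and mbi_tucker(F, ranks) from the earlier sketches

def fit(F, F_hat):
    """fit = 1 - ||F - F_hat|| / ||F|| (larger is better)."""
    return 1.0 - np.linalg.norm(F - F_hat) / np.linalg.norm(F)

rng = np.random.default_rng(0)
# synthetic rank-(4, 4, 2) tensor built from a random core and orthonormal factors
core = rng.standard_normal((4, 4, 2))
factors = [np.linalg.qr(rng.standard_normal((n, r)))[0]
           for n, r in zip((20, 20, 20), (4, 4, 2))]
Y = core
for k, U in enumerate(factors):
    Y = mode_product(Y, U, k)

N = rng.standard_normal(Y.shape)
eta = 0.1                                            # perturbation ratio
Z = Y + eta * np.linalg.norm(Y) / np.linalg.norm(N) * N

C, A = mbi_tucker(Z, (4, 4, 2))
Z_hat = C
for k, U in enumerate(A):
    Z_hat = mode_product(Z_hat, U, k)
print("fit to the noise-free tensor:", fit(Y, Z_hat))
```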

Also, 10 random rank samples are generated for the ALS method, and its maximum, average and minimum fit are presented on the left-hand side of Figure 5.1, corresponding to the three pentagon dots from top to bottom in each line segment, respectively. The computational time reported for the ALS method is the total over these 10 rank samples. We observe that Algorithms 2 and 3 easily and quickly find the exact $i$-ranks $(4, 4, 2)$ of the core tensor for the different perturbation ratios, and can still approximate the original tensor $\mathcal{Y}$ accurately even under large noise. Figure 5.1 shows that Algorithms 2 and 3 outperform the ALS method significantly, both in fit and in computational time.

Fig. 5.1. Approximating a noisy tensor with original rank-(4, 4, 2).

Figure 5.2 shows another synthetic experiment whose data are generated in the same manner as those in Figure 5.1. In this data set the $i$-ranks of $\mathcal{Y}$ are $(5, 5, 4)$, and the summation of the ranks for Algorithms 2 and 3 is set as $c = 14$. Figure 5.2 again shows that our algorithms perform very well against the noise and are far superior to the ALS method. Furthermore, both Algorithms 2 and 3 find the exact $i$-ranks $(5, 5, 4)$ of the core tensor for the various perturbation ratios.

Fig. 5.2. Approximating a noisy tensor with original rank-(5, 5, 4).

5.2. Amino acid fluorescence data. This data set was originally generated and measured by Claus Andersson and was later published and tested by Bro [9, 10].

It consists of five laboratory-made samples, each containing different amounts of tyrosine, tryptophan and phenylalanine dissolved in phosphate buffered water. The array $\mathcal{A}$ to be decomposed has three modes, corresponding to the samples, the emission wavelengths (nm) and the excitation wavelengths (nm), respectively. Ideally the array should be describable with three PARAFAC components, whose fit equals 97.44% as computed by the Tensor Toolbox. This data set can also be approximated well in the sense of Tucker decomposition. In implementing Algorithm 2, the initial penalty parameter is set as $\lambda_0 = 0.01 \|\mathcal{A}\|$.

The numerical results for the three methods (ALS, Algorithms 2 and 3) are presented in Figure 5.3. Each pentagon dot in Figure 5.3 denotes the average fit obtained by running the ALS method 10 times with randomly generated rank-$(r_1, r_2, r_3)$ satisfying $r_1 + r_2 + r_3 = c$, and the CPU time is the total over the 10 random samples. The performance of Algorithms 2 and 3 is much better than that of the ALS method, especially for Algorithm 2, which attains a convincing fit even for low $c$. Furthermore, Algorithm 2 decomposes this data set as well as PARAFAC does without knowing the exact $i$-ranks; e.g. the fit reaches 97.55% when $c = 9$, for which the method finds rank-$(3, 3, 3)$ when it stops.

Fig. 5.3. Approximating the amino acid fluorescence data for different $c = r_1 + r_2 + r_3$.

To further justify the importance of Tucker decomposition with unspecified core size, as well as of Algorithm 2, we investigate the ALS method with all possible pre-specified $i$-ranks on this data set. For a given summation of $i$-ranks $c = 9$, there are in total 25 possible combinations of $(r_1, r_2, r_3)$, and their corresponding fits are listed in Figure 5.4. It shows that the Tucker decomposition of rank-$(3, 3, 3)$ outperforms all other combinations, and this is exactly the set of $i$-ranks found by Algorithm 2.

5.3. Gene expression data. We further test a real three-way tensor $\mathcal{F} \in \mathbb{R}^{2395 \times 6 \times 9}$ coming from 3D Arabidopsis gene expression data¹. Essentially this is a set of gene expression data with 2395 genes, measured at 6 time points under 9 different conditions. We again apply the three methods to this data set. The initial penalty parameter for Algorithm 2 is set as $\lambda_0 = 0.01 \|\mathcal{F}\|$. The results of our experiments are summarized in Table 5.1.

¹We thank Professor Xiuzhen Huang for providing this set of data.

Fig. 5.4. The ALS method on the amino acid fluorescence data for $c = 9$.

Table 5.1. Approximating the gene expression data: ranks selected by Algorithms 2 and 3 for each $c$.

c     Algorithm 2 (r1, r2, r3)     Algorithm 3 (r1, r2, r3)
3     (1, 1, 1)                    (1, 1, 1)
5     (2, 1, 2)                    (2, 1, 2)
10    (4, 2, 4)                    (1, 3, 6)
15    (7, 3, 5)                    (5, 6, 4)
20    (10, 4, 6)                   (5, 6, 9)

For the ALS method, fit-2 and fit-3 denote the fit obtained with the ranks found by Algorithms 2 and 3, respectively, while the remaining fit value is the average fit over 10 runs of the ALS method with randomly generated ranks satisfying $r_1 + r_2 + r_3 = c$. The following observations are clearly visible:
- The excellent performance of Algorithms 2 and 3 is validated.
- Computing the $i$-rank information is important: the ranks output by Algorithms 2 and 3 are more suitable for the ALS method, in terms of fit, than randomly generated ranks.
- In combination with the ALS method, Algorithm 2 works better than Algorithm 3 in terms of fit, while its CPU time is higher than that of Algorithm 3.

5.4. Tensor compression of image data. Our final tests apply the models and algorithms in this paper to compress tensors built from image data. Two sets of faces are presented, each in one subsection.

5.4.1. The ORL database of faces. The first set of images is from the ORL database of faces [36] of AT&T Laboratories Cambridge. In this database there are 10 different images for each of 40 distinct subjects, and all images have the same size in pixels. Here we draw one single subject and construct a third-order tensor $\mathcal{T}$ from its 10 images. When applying Algorithm 2, the initial penalty parameter is set as $\lambda_0 = \|\mathcal{T}\|$. Figure 5.5 displays the images compressed and recovered by the three methods when $c = 50$ and $c = 70$, respectively.

Table 5.2. Ranks selected in the tensor compression of the ORL database of faces.

c = 50:   Algorithm 3 (25, 22, 3);   ALS best (16, 25, 9);   ALS worst (43, 6, 1);   Algorithm 2 (21, 19, 10)
c = 70:   Algorithm 3 (35, 31, 4);   ALS best (34, 29, 7);   ALS worst (11, 58, 1);   Algorithm 2 (30, 30, 10)

The detailed numerical values for the two cases are listed in Table 5.2. Some standard abbreviations from image science are adopted in Table 5.2, namely CP for compression, CR for compression ratio, and RMSE for root mean squared error. For the ALS method, the CPU time is the total over 10 randomly generated sets of $i$-ranks satisfying $r_1 + r_2 + r_3 = c$, and its best and worst compressions in terms of fit are reported for comparison.

Fig. 5.5. Recovered images for the ORL database of faces (rows from top to bottom: 1. Original, 2. Algorithm 3, 3. ALS (best), 4. ALS (worst), 5. Algorithm 2).

The observations from Figure 5.5 indicate that only the images compressed by the best ALS run and by Algorithm 2 keep all the facial expressions. Moreover, Algorithm 2 indeed outperforms the best ALS run in terms of fit and RMSE, as shown in Table 5.2. It is worth mentioning that most of the computational effort of Algorithm 2 goes into finding a suitable combination of ranks $(r_1, r_2, r_3)$, whereas for the ALS method these ranks are pre-specified. This effort in selecting better ranks is worthwhile: the ALS method with a larger $c$ may perform worse than Algorithm 2 with a smaller $c$; e.g. the 4th row on the right of Figure 5.5 is not clearer than the 2nd row on the left of Figure 5.5, which is also confirmed by Table 5.2.
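Tables 5.2 and 5.3 report CR, CP and RMSE; the paper does not spell out these formulas, so the following Python/NumPy sketch uses standard definitions as an assumption (compressed storage counted as the number of core entries plus factor-matrix entries).

```python
import numpy as np

def compression_stats(shape, ranks, F, F_hat):
    """Assumed standard definitions (not stated explicitly in the paper):
    CR   = original storage / compressed storage,
    CP   = compressed storage as a percentage of the original,
    RMSE = root mean squared entrywise reconstruction error."""
    original = np.prod(shape)
    compressed = np.prod(ranks) + sum(n * r for n, r in zip(shape, ranks))
    cr = original / compressed
    cp = 100.0 * compressed / original
    rmse = np.sqrt(np.mean((F - F_hat) ** 2))
    return cr, cp, rmse
```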

Figure 5.6 presents the performance of Algorithm 2 for various $c$. Clearly, the larger $c$ is, the larger the fit and the lower the RMSE. This figure also suggests a guideline for choosing a suitable $c$, depending on the quality of the compressed images required by the user.

Fig. 5.6. Performance of Algorithm 2 for the ORL database of faces.

5.4.2. The Japanese female facial expression (JAFFE) database. Our last tests are on the facial database [34] from the Psychology Department of Kyushu University, which contains 213 images of 7 different emotional facial expressions (sadness, happiness, surprise, anger, disgust, fear and neutral) posed by 10 Japanese female models, all images having the same size in pixels. Here we draw the 7 different facial expressions of one female model and construct a third-order tensor $\mathcal{T}$ from them. A set of tests similar to those in Section 5.4.1 is conducted. The results are presented in Figures 5.7 and 5.8, Table 5.3, and Figure 5.9, and they again confirm the observations made for the ORL database of faces. In particular, comparing the two best sets of images (the best ALS run and Algorithm 2) in Figures 5.7 and 5.8, Algorithm 2 is clearly better in the details, e.g. the eyebrows.

Table 5.3. Ranks selected in the tensor compression of the JAFFE database.

c = 80:    Algorithm 3 (40, 39, 1);   ALS best (39, 38, 3);   ALS worst (75, 2, 3);    Algorithm 2 (39, 34, 7)
c = 110:   Algorithm 3 (53, 55, 2);   ALS best (54, 52, 4);   ALS worst (1, 105, 4);   Algorithm 2 (53, 50, 7)

Fig. 5.7. Recovered images for the JAFFE database when $c = 80$ (rows from top to bottom: 1. Original, 2. Algorithm 3, 3. ALS (best), 4. ALS (worst), 5. Algorithm 2).

6. Conclusion. In this paper we study the problem of finding a proper Tucker decomposition of a general tensor without a pre-specified size of the core tensor, which addresses an important practical issue in real applications. The size of the core tensor is assumed to be given in almost all existing methods for tensor Tucker decomposition. The approach that we propose is based on the so-called maximum block improvement (MBI) method. Our numerical experiments on a variety of instances taken from real applications suggest that the proposed methods perform robustly and effectively.

REFERENCES

[1] A. H. Andersen and W. S. Rayens, Structure-seeking multilinear methods for the analysis of fMRI data, NeuroImage, 22 (2004).
[2] C. M. Andersen and R. Bro, Practical aspects of PARAFAC modeling of fluorescence excitation-emission data, Journal of Chemometrics, 17 (2003).
[3] C. A. Andersson and R. Bro, Improving the speed of multi-way algorithms: Part I. Tucker3, Chemometrics and Intelligent Laboratory Systems, 42 (1998).
[4] C. J. Appellof and E. R. Davidson, Strategies for analyzing data from video fluorometric monitoring of liquid chromatographic effluents, Analytical Chemistry, 53 (1981).
[5] A. Aubry, A. De Maio, B. Jiang, and S. Zhang, A cognitive approach for ambiguity function shaping, Proceedings of the 7th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2012.
[6] B. W. Bader, M. W. Berry, and M. Browne, Discussion tracking in Enron using PARAFAC, in M. W. Berry and M. Castellanos (eds.), Survey of Text Mining: Clustering, Classification, and Retrieval (2nd edition), Springer, New York, 2007.
[7] B. W. Bader and T. G. Kolda, MATLAB Tensor Toolbox, Version 2.5.

Fig. 5.8. Recovered images for the JAFFE database when $c = 110$ (rows from top to bottom: 1. Original, 2. Algorithm 3, 3. ALS (best), 4. ALS (worst), 5. Algorithm 2).

Fig. 5.9. Performance of Algorithm 2 for the JAFFE database.

[8] D. P. Bertsekas, Nonlinear Programming (2nd edition), Athena Scientific, Belmont, MA.
[9] R. Bro, PARAFAC: Tutorial and applications, Chemometrics and Intelligent Laboratory Systems, 38 (1997).
[10] R. Bro, Multi-way Analysis in the Food Industry: Models, Algorithms, and Applications, Ph.D. Thesis, University of Amsterdam, The Netherlands, and Royal Veterinary and Agricultural University, Denmark.
[11] R. Bro and H. A. L. Kiers, A new efficient method for determining the number of components in PARAFAC models, Journal of Chemometrics, 17 (2003).
[12] J. D. Carroll and J. J. Chang, Analysis of individual differences in multidimensional scaling

via an N-way generalization of "Eckart-Young" decomposition, Psychometrika, 35 (1970).
[13] B. Chen, Optimization with Block Variables: Theory and Applications, Ph.D. Thesis, The Chinese University of Hong Kong, Hong Kong.
[14] B. Chen, S. He, Z. Li, and S. Zhang, Maximum block improvement and polynomial optimization, SIAM Journal on Optimization, 22 (2012).
[15] L. De Lathauwer, B. De Moor, and J. Vandewalle, A multilinear singular value decomposition, SIAM Journal on Matrix Analysis and Applications, 21 (2000).
[16] L. De Lathauwer, B. De Moor, and J. Vandewalle, On the best rank-1 and rank-$(R_1, R_2, \ldots, R_N)$ approximation of higher-order tensors, SIAM Journal on Matrix Analysis and Applications, 21 (2000).
[17] L. De Lathauwer and D. Nion, Decompositions of a higher-order tensor in block terms. Part III: Alternating least squares algorithms, SIAM Journal on Matrix Analysis and Applications, 30 (2008).
[18] L. De Lathauwer and J. Vandewalle, Dimensionality reduction in higher-order signal processing and rank-$(R_1, R_2, \ldots, R_N)$ reduction in multilinear algebra, Linear Algebra and its Applications, 391 (2004).
[19] L. Eldén and B. Savas, A Newton-Grassmann method for computing the best multilinear rank-$(r_1, r_2, r_3)$ approximation of a tensor, SIAM Journal on Matrix Analysis and Applications, 31 (2009).
[20] R. A. Harshman, Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis, UCLA Working Papers in Phonetics, 16 (1970).
[21] Z. He, A. Cichocki, and S. Xie, Efficient method for Tucker3 model selection, Electronics Letters, 45 (2009).
[22] R. Henrion, Simultaneous simplification of loading and core matrices in N-way PCA: Application to chemometric data arrays, Fresenius Journal of Analytical Chemistry, 361 (1998).
[23] R. Henrion, On global, local and stationary solutions in three-way data analysis, Journal of Chemometrics, 14 (2000).
[24] M. Ishteva, L. De Lathauwer, P.-A. Absil, and S. Van Huffel, Differential-geometric Newton method for the best rank-$(R_1, R_2, R_3)$ approximation of tensors, Numerical Algorithms, 51 (2009).
[25] A. Kapteyn, H. Neudecker, and T. Wansbeek, An approach to n-mode components analysis, Psychometrika, 51 (1986).
[26] H. A. L. Kiers and A. Der Kinderen, A fast method for choosing the numbers of components in Tucker3 analysis, British Journal of Mathematical and Statistical Psychology, 56 (2003).
[27] E. Kofidis and P. A. Regalia, On the best rank-1 approximation of higher-order supersymmetric tensors, SIAM Journal on Matrix Analysis and Applications, 23 (2002).
[28] T. G. Kolda, Multilinear operators for higher-order decompositions, Technical Report, Sandia National Laboratories, Albuquerque, NM.
[29] T. G. Kolda and B. W. Bader, Tensor decompositions and applications, SIAM Review, 51 (2009).
[30] P. M. Kroonenberg and J. De Leeuw, Principal component analysis of three-mode data by means of alternating least squares algorithms, Psychometrika, 45 (1980).
[31] D. Leibovici and R. Sabatier, A singular value decomposition of a k-way array for a principal component analysis of multiway data, PTA-k, Linear Algebra and its Applications, 269 (1998).
[32] J. Levin, Three-mode Factor Analysis, Ph.D. Thesis, University of Illinois, Urbana, IL.
[33] Z. Li, A. Uschmajew, and S. Zhang, Linear convergence analysis of the maximum block improvement method for spherically constrained optimization, Technical Report.
[34] M. J. Lyons, S. Akamatsu, M. Kamachi, and J. Gyoba, Coding facial expressions with Gabor wavelets, Proceedings of the 3rd IEEE International Conference on Automatic Face and Gesture Recognition, 1998.
[35] L. Qi, The best rank-one approximation ratio of a tensor space, SIAM Journal on Matrix Analysis and Applications, 32 (2011).
[36] F. Samaria and A. Harter, Parameterisation of a stochastic model for human face identification, Proceedings of the 2nd IEEE Workshop on Applications of Computer Vision, 1994.
[37] B. Savas and L. Eldén, Handwritten digit classification using higher order singular value decomposition, Pattern Recognition, 40 (2007).

[38] W. Sun and Y. Yuan, Optimization Theory and Methods: Nonlinear Programming, Springer Optimization and Its Applications, Volume 1, Springer, New York.
[39] M. E. Timmerman and H. A. L. Kiers, Three-mode principal components analysis: Choosing the numbers of components and sensitivity to local optima, British Journal of Mathematical and Statistical Psychology, 53 (2000).
[40] R. Tomioka, T. Suzuki, K. Hayashi, and H. Kashima, Statistical performance of convex tensor decomposition, Proceedings of the 25th Annual Conference on Neural Information Processing Systems, 2011.
[41] L. R. Tucker, Implications of factor analysis of three-way matrices for measurement of change, in C. W. Harris (ed.), Problems in Measuring Change, University of Wisconsin Press, 1963.
[42] L. R. Tucker, The extension of factor analysis to three-dimensional matrices, in H. Gulliksen and N. Frederiksen (eds.), Contributions to Mathematical Psychology, Holt, Rinehart & Winston, New York, NY.
[43] L. R. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika, 31 (1966).
[44] A. Uschmajew, Local convergence of the alternating least squares algorithm for canonical tensor approximation, SIAM Journal on Matrix Analysis and Applications, 33 (2012).
[45] M. A. O. Vasilescu and D. Terzopoulos, Multilinear analysis of image ensembles: TensorFaces, Proceedings of the 7th European Conference on Computer Vision, Lecture Notes in Computer Science, 2350 (2002), Springer, New York.
[46] H. Wang, Q. Wu, L. Shi, Y. Yu, and N. Ahuja, Out-of-core tensor approximation of multi-dimensional matrices of visual data, ACM Transactions on Graphics (Special Issue for SIGGRAPH 2005), 24 (2005).
[47] Y. Wang and L. Qi, On the successive supersymmetric rank-1 decomposition of higher-order supersymmetric tensors, Numerical Linear Algebra with Applications, 14 (2007).
[48] S. Zhang, K. Wang, C. Ashby, B. Chen, and X. Huang, A unified adaptive co-identification framework for high-D expression data, in T. Shibuya et al. (eds.), Proceedings of the 7th IAPR International Conference on Pattern Recognition in Bioinformatics, Lecture Notes in Computer Science, 7632 (2012), Springer, New York.
[49] S. Zhang, K. Wang, B. Chen, and X. Huang, A new framework for co-clustering of gene expression data, in M. Loog et al. (eds.), Proceedings of the 6th IAPR International Conference on Pattern Recognition in Bioinformatics, Lecture Notes in Computer Science, 7036 (2011), Springer, New York.
[50] T. Zhang and G. H. Golub, Rank-one approximation to high order tensors, SIAM Journal on Matrix Analysis and Applications, 23 (2001).


More information

15 Singular Value Decomposition

15 Singular Value Decomposition 15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information

18 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 19, NO. 1, JANUARY 2008

18 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 19, NO. 1, JANUARY 2008 18 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 19, NO. 1, JANUARY 2008 MPCA: Multilinear Principal Component Analysis of Tensor Objects Haiping Lu, Student Member, IEEE, Konstantinos N. (Kostas) Plataniotis,

More information

Faloutsos, Tong ICDE, 2009

Faloutsos, Tong ICDE, 2009 Large Graph Mining: Patterns, Tools and Case Studies Christos Faloutsos Hanghang Tong CMU Copyright: Faloutsos, Tong (29) 2-1 Outline Part 1: Patterns Part 2: Matrix and Tensor Tools Part 3: Proximity

More information

AN ALTERNATING MINIMIZATION ALGORITHM FOR NON-NEGATIVE MATRIX APPROXIMATION

AN ALTERNATING MINIMIZATION ALGORITHM FOR NON-NEGATIVE MATRIX APPROXIMATION AN ALTERNATING MINIMIZATION ALGORITHM FOR NON-NEGATIVE MATRIX APPROXIMATION JOEL A. TROPP Abstract. Matrix approximation problems with non-negativity constraints arise during the analysis of high-dimensional

More information

Non-Negative Tensor Factorisation for Sound Source Separation

Non-Negative Tensor Factorisation for Sound Source Separation ISSC 2005, Dublin, Sept. -2 Non-Negative Tensor Factorisation for Sound Source Separation Derry FitzGerald, Matt Cranitch φ and Eugene Coyle* φ Dept. of Electronic Engineering, Cor Institute of Technology

More information

PAVEMENT CRACK CLASSIFICATION BASED ON TENSOR FACTORIZATION. Offei Amanor Adarkwa

PAVEMENT CRACK CLASSIFICATION BASED ON TENSOR FACTORIZATION. Offei Amanor Adarkwa PAVEMENT CRACK CLASSIFICATION BASED ON TENSOR FACTORIZATION by Offei Amanor Adarkwa A thesis submitted to the Faculty of the University of Delaware in partial fulfillment of the requirements for the degree

More information

Parallel Tensor Compression for Large-Scale Scientific Data

Parallel Tensor Compression for Large-Scale Scientific Data Parallel Tensor Compression for Large-Scale Scientific Data Woody Austin, Grey Ballard, Tamara G. Kolda April 14, 2016 SIAM Conference on Parallel Processing for Scientific Computing MS 44/52: Parallel

More information

ARestricted Boltzmann machine (RBM) [1] is a probabilistic

ARestricted Boltzmann machine (RBM) [1] is a probabilistic 1 Matrix Product Operator Restricted Boltzmann Machines Cong Chen, Kim Batselier, Ching-Yun Ko, and Ngai Wong chencong@eee.hku.hk, k.batselier@tudelft.nl, cyko@eee.hku.hk, nwong@eee.hku.hk arxiv:1811.04608v1

More information

A FLEXIBLE MODELING FRAMEWORK FOR COUPLED MATRIX AND TENSOR FACTORIZATIONS

A FLEXIBLE MODELING FRAMEWORK FOR COUPLED MATRIX AND TENSOR FACTORIZATIONS A FLEXIBLE MODELING FRAMEWORK FOR COUPLED MATRIX AND TENSOR FACTORIZATIONS Evrim Acar, Mathias Nilsson, Michael Saunders University of Copenhagen, Faculty of Science, Frederiksberg C, Denmark University

More information

Latent Semantic Models. Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze

Latent Semantic Models. Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze Latent Semantic Models Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze 1 Vector Space Model: Pros Automatic selection of index terms Partial matching of queries

More information

Fast Nonnegative Tensor Factorization with an Active-Set-Like Method

Fast Nonnegative Tensor Factorization with an Active-Set-Like Method Fast Nonnegative Tensor Factorization with an Active-Set-Like Method Jingu Kim and Haesun Park Abstract We introduce an efficient algorithm for computing a low-rank nonnegative CANDECOMP/PARAFAC(NNCP)

More information

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works

CS168: The Modern Algorithmic Toolbox Lecture #8: How PCA Works CS68: The Modern Algorithmic Toolbox Lecture #8: How PCA Works Tim Roughgarden & Gregory Valiant April 20, 206 Introduction Last lecture introduced the idea of principal components analysis (PCA). The

More information

Linear Algebra Background

Linear Algebra Background CS76A Text Retrieval and Mining Lecture 5 Recap: Clustering Hierarchical clustering Agglomerative clustering techniques Evaluation Term vs. document space clustering Multi-lingual docs Feature selection

More information

SPARSE signal representations have gained popularity in recent

SPARSE signal representations have gained popularity in recent 6958 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 10, OCTOBER 2011 Blind Compressed Sensing Sivan Gleichman and Yonina C. Eldar, Senior Member, IEEE Abstract The fundamental principle underlying

More information

Towards A New Multiway Array Decomposition Algorithm: Elementwise Multiway Array High Dimensional Model Representation (EMAHDMR)

Towards A New Multiway Array Decomposition Algorithm: Elementwise Multiway Array High Dimensional Model Representation (EMAHDMR) Towards A New Multiway Array Decomposition Algorithm: Elementwise Multiway Array High Dimensional Model Representation (EMAHDMR) MUZAFFER AYVAZ İTU Informatics Institute, Computational Sci. and Eng. Program,

More information

LOW MULTILINEAR RANK TENSOR APPROXIMATION VIA SEMIDEFINITE PROGRAMMING

LOW MULTILINEAR RANK TENSOR APPROXIMATION VIA SEMIDEFINITE PROGRAMMING 17th European Signal Processing Conference (EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 LOW MULTILINEAR RANK TENSOR APPROXIMATION VIA SEMIDEFINITE PROGRAMMING Carmeliza Navasca and Lieven De Lathauwer

More information

THE PERTURBATION BOUND FOR THE SPECTRAL RADIUS OF A NON-NEGATIVE TENSOR

THE PERTURBATION BOUND FOR THE SPECTRAL RADIUS OF A NON-NEGATIVE TENSOR THE PERTURBATION BOUND FOR THE SPECTRAL RADIUS OF A NON-NEGATIVE TENSOR WEN LI AND MICHAEL K. NG Abstract. In this paper, we study the perturbation bound for the spectral radius of an m th - order n-dimensional

More information

JOS M.F. TEN BERGE SIMPLICITY AND TYPICAL RANK RESULTS FOR THREE-WAY ARRAYS

JOS M.F. TEN BERGE SIMPLICITY AND TYPICAL RANK RESULTS FOR THREE-WAY ARRAYS PSYCHOMETRIKA VOL. 76, NO. 1, 3 12 JANUARY 2011 DOI: 10.1007/S11336-010-9193-1 SIMPLICITY AND TYPICAL RANK RESULTS FOR THREE-WAY ARRAYS JOS M.F. TEN BERGE UNIVERSITY OF GRONINGEN Matrices can be diagonalized

More information

THERE is an increasing need to handle large multidimensional

THERE is an increasing need to handle large multidimensional 1 Matrix Product State for Higher-Order Tensor Compression and Classification Johann A. Bengua, Ho N. Phien, Hoang D. Tuan and Minh N. Do Abstract This paper introduces matrix product state (MPS) decomposition

More information

Sparse Higher-Order Principal Components Analysis

Sparse Higher-Order Principal Components Analysis Genevera I. Allen Baylor College of Medicine & Rice University Abstract Traditional tensor decompositions such as the CANDECOMP / PARAFAC CP) and Tucer decompositions yield higher-order principal components

More information

Image Registration Lecture 2: Vectors and Matrices

Image Registration Lecture 2: Vectors and Matrices Image Registration Lecture 2: Vectors and Matrices Prof. Charlene Tsai Lecture Overview Vectors Matrices Basics Orthogonal matrices Singular Value Decomposition (SVD) 2 1 Preliminary Comments Some of this

More information

Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization

Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization Haiping Lu 1 K. N. Plataniotis 1 A. N. Venetsanopoulos 1,2 1 Department of Electrical & Computer Engineering,

More information

Theoretical Performance Analysis of Tucker Higher Order SVD in Extracting Structure from Multiple Signal-plus-Noise Matrices

Theoretical Performance Analysis of Tucker Higher Order SVD in Extracting Structure from Multiple Signal-plus-Noise Matrices Theoretical Performance Analysis of Tucker Higher Order SVD in Extracting Structure from Multiple Signal-plus-Noise Matrices Himanshu Nayar Dept. of EECS University of Michigan Ann Arbor Michigan 484 email:

More information

SECTION: CONTINUOUS OPTIMISATION LECTURE 4: QUASI-NEWTON METHODS

SECTION: CONTINUOUS OPTIMISATION LECTURE 4: QUASI-NEWTON METHODS SECTION: CONTINUOUS OPTIMISATION LECTURE 4: QUASI-NEWTON METHODS HONOUR SCHOOL OF MATHEMATICS, OXFORD UNIVERSITY HILARY TERM 2005, DR RAPHAEL HAUSER 1. The Quasi-Newton Idea. In this lecture we will discuss

More information

c 2008 Society for Industrial and Applied Mathematics

c 2008 Society for Industrial and Applied Mathematics SIAM J. MATRIX ANAL. APPL. Vol. 30, No. 3, pp. 1067 1083 c 2008 Society for Industrial and Applied Mathematics DECOMPOSITIONS OF A HIGHER-ORDER TENSOR IN BLOCK TERMS PART III: ALTERNATING LEAST SQUARES

More information

Factor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE Artificial Intelligence Grad Project Dr. Debasis Mitra

Factor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE Artificial Intelligence Grad Project Dr. Debasis Mitra Factor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE 5290 - Artificial Intelligence Grad Project Dr. Debasis Mitra Group 6 Taher Patanwala Zubin Kadva Factor Analysis (FA) 1. Introduction Factor

More information

Generalized Higher-Order Tensor Decomposition via Parallel ADMM

Generalized Higher-Order Tensor Decomposition via Parallel ADMM Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence Generalized Higher-Order Tensor Decomposition via Parallel ADMM Fanhua Shang 1, Yuanyuan Liu 2, James Cheng 1 1 Department of

More information

Absolute Value Programming

Absolute Value Programming O. L. Mangasarian Absolute Value Programming Abstract. We investigate equations, inequalities and mathematical programs involving absolute values of variables such as the equation Ax + B x = b, where A

More information

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A =

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = 30 MATHEMATICS REVIEW G A.1.1 Matrices and Vectors Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = a 11 a 12... a 1N a 21 a 22... a 2N...... a M1 a M2... a MN A matrix can

More information

EE 381V: Large Scale Optimization Fall Lecture 24 April 11

EE 381V: Large Scale Optimization Fall Lecture 24 April 11 EE 381V: Large Scale Optimization Fall 2012 Lecture 24 April 11 Lecturer: Caramanis & Sanghavi Scribe: Tao Huang 24.1 Review In past classes, we studied the problem of sparsity. Sparsity problem is that

More information

A Bregman alternating direction method of multipliers for sparse probabilistic Boolean network problem

A Bregman alternating direction method of multipliers for sparse probabilistic Boolean network problem A Bregman alternating direction method of multipliers for sparse probabilistic Boolean network problem Kangkang Deng, Zheng Peng Abstract: The main task of genetic regulatory networks is to construct a

More information

Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing

Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing 1 Zhong-Yuan Zhang, 2 Chris Ding, 3 Jie Tang *1, Corresponding Author School of Statistics,

More information

Lecture 2. Tensor Unfoldings. Charles F. Van Loan

Lecture 2. Tensor Unfoldings. Charles F. Van Loan From Matrix to Tensor: The Transition to Numerical Multilinear Algebra Lecture 2. Tensor Unfoldings Charles F. Van Loan Cornell University The Gene Golub SIAM Summer School 2010 Selva di Fasano, Brindisi,

More information

Wafer Pattern Recognition Using Tucker Decomposition

Wafer Pattern Recognition Using Tucker Decomposition Wafer Pattern Recognition Using Tucker Decomposition Ahmed Wahba, Li-C. Wang, Zheng Zhang UC Santa Barbara Nik Sumikawa NXP Semiconductors Abstract In production test data analytics, it is often that an

More information

Tensor Decompositions and Applications

Tensor Decompositions and Applications Tamara G. Kolda and Brett W. Bader Part I September 22, 2015 What is tensor? A N-th order tensor is an element of the tensor product of N vector spaces, each of which has its own coordinate system. a =

More information

sparse and low-rank tensor recovery Cubic-Sketching

sparse and low-rank tensor recovery Cubic-Sketching Sparse and Low-Ran Tensor Recovery via Cubic-Setching Guang Cheng Department of Statistics Purdue University www.science.purdue.edu/bigdata CCAM@Purdue Math Oct. 27, 2017 Joint wor with Botao Hao and Anru

More information

Rank Determination for Low-Rank Data Completion

Rank Determination for Low-Rank Data Completion Journal of Machine Learning Research 18 017) 1-9 Submitted 7/17; Revised 8/17; Published 9/17 Rank Determination for Low-Rank Data Completion Morteza Ashraphijuo Columbia University New York, NY 1007,

More information

the tensor rank is equal tor. The vectorsf (r)

the tensor rank is equal tor. The vectorsf (r) EXTENSION OF THE SEMI-ALGEBRAIC FRAMEWORK FOR APPROXIMATE CP DECOMPOSITIONS VIA NON-SYMMETRIC SIMULTANEOUS MATRIX DIAGONALIZATION Kristina Naskovska, Martin Haardt Ilmenau University of Technology Communications

More information

A practical method for computing the largest M-eigenvalue of a fourth-order partially symmetric tensor

A practical method for computing the largest M-eigenvalue of a fourth-order partially symmetric tensor NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. 2007; 00:1 6 [Version: 2002/09/18 v1.02] A practical method for computing the largest M-eigenvalue of a fourth-order partially symmetric

More information

Tensor Low-Rank Completion and Invariance of the Tucker Core

Tensor Low-Rank Completion and Invariance of the Tucker Core Tensor Low-Rank Completion and Invariance of the Tucker Core Shuzhong Zhang Department of Industrial & Systems Engineering University of Minnesota zhangs@umn.edu Joint work with Bo JIANG, Shiqian MA, and

More information

4y Springer NONLINEAR INTEGER PROGRAMMING

4y Springer NONLINEAR INTEGER PROGRAMMING NONLINEAR INTEGER PROGRAMMING DUAN LI Department of Systems Engineering and Engineering Management The Chinese University of Hong Kong Shatin, N. T. Hong Kong XIAOLING SUN Department of Mathematics Shanghai

More information

Tensor Decompositions Workshop Discussion Notes American Institute of Mathematics (AIM) Palo Alto, CA

Tensor Decompositions Workshop Discussion Notes American Institute of Mathematics (AIM) Palo Alto, CA Tensor Decompositions Workshop Discussion Notes American Institute of Mathematics (AIM) Palo Alto, CA Carla D. Moravitz Martin July 9-23, 24 This work was supported by AIM. Center for Applied Mathematics,

More information

Are There Sixth Order Three Dimensional PNS Hankel Tensors?

Are There Sixth Order Three Dimensional PNS Hankel Tensors? Are There Sixth Order Three Dimensional PNS Hankel Tensors? Guoyin Li Liqun Qi Qun Wang November 17, 014 Abstract Are there positive semi-definite PSD) but not sums of squares SOS) Hankel tensors? If the

More information

Principal Component Analysis

Principal Component Analysis Machine Learning Michaelmas 2017 James Worrell Principal Component Analysis 1 Introduction 1.1 Goals of PCA Principal components analysis (PCA) is a dimensionality reduction technique that can be used

More information

Approximation algorithms for nonnegative polynomial optimization problems over unit spheres

Approximation algorithms for nonnegative polynomial optimization problems over unit spheres Front. Math. China 2017, 12(6): 1409 1426 https://doi.org/10.1007/s11464-017-0644-1 Approximation algorithms for nonnegative polynomial optimization problems over unit spheres Xinzhen ZHANG 1, Guanglu

More information

An Even Order Symmetric B Tensor is Positive Definite

An Even Order Symmetric B Tensor is Positive Definite An Even Order Symmetric B Tensor is Positive Definite Liqun Qi, Yisheng Song arxiv:1404.0452v4 [math.sp] 14 May 2014 October 17, 2018 Abstract It is easily checkable if a given tensor is a B tensor, or

More information

Using Matrix Decompositions in Formal Concept Analysis

Using Matrix Decompositions in Formal Concept Analysis Using Matrix Decompositions in Formal Concept Analysis Vaclav Snasel 1, Petr Gajdos 1, Hussam M. Dahwa Abdulla 1, Martin Polovincak 1 1 Dept. of Computer Science, Faculty of Electrical Engineering and

More information

Nonnegative Tensor Factorization using a proximal algorithm: application to 3D fluorescence spectroscopy

Nonnegative Tensor Factorization using a proximal algorithm: application to 3D fluorescence spectroscopy Nonnegative Tensor Factorization using a proximal algorithm: application to 3D fluorescence spectroscopy Caroline Chaux Joint work with X. Vu, N. Thirion-Moreau and S. Maire (LSIS, Toulon) Aix-Marseille

More information

NON-NEGATIVE SPARSE CODING

NON-NEGATIVE SPARSE CODING NON-NEGATIVE SPARSE CODING Patrik O. Hoyer Neural Networks Research Centre Helsinki University of Technology P.O. Box 9800, FIN-02015 HUT, Finland patrik.hoyer@hut.fi To appear in: Neural Networks for

More information

Non-Negative Tensor Factorization with Applications to Statistics and Computer Vision

Non-Negative Tensor Factorization with Applications to Statistics and Computer Vision Non-Negative Tensor Factorization with Applications to Statistics and Computer Vision Amnon Shashua shashua@cs.huji.ac.il School of Engineering and Computer Science, The Hebrew University of Jerusalem,

More information

Orthogonal tensor decomposition

Orthogonal tensor decomposition Orthogonal tensor decomposition Daniel Hsu Columbia University Largely based on 2012 arxiv report Tensor decompositions for learning latent variable models, with Anandkumar, Ge, Kakade, and Telgarsky.

More information

A Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations

A Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations A Customized ADMM for Rank-Constrained Optimization Problems with Approximate Formulations Chuangchuang Sun and Ran Dai Abstract This paper proposes a customized Alternating Direction Method of Multipliers

More information

DELFT UNIVERSITY OF TECHNOLOGY

DELFT UNIVERSITY OF TECHNOLOGY DELFT UNIVERSITY OF TECHNOLOGY REPORT -09 Computational and Sensitivity Aspects of Eigenvalue-Based Methods for the Large-Scale Trust-Region Subproblem Marielba Rojas, Bjørn H. Fotland, and Trond Steihaug

More information

A Characterization of Sampling Patterns for Union of Low-Rank Subspaces Retrieval Problem

A Characterization of Sampling Patterns for Union of Low-Rank Subspaces Retrieval Problem A Characterization of Sampling Patterns for Union of Low-Rank Subspaces Retrieval Problem Morteza Ashraphijuo Columbia University ashraphijuo@ee.columbia.edu Xiaodong Wang Columbia University wangx@ee.columbia.edu

More information

Structured matrix factorizations. Example: Eigenfaces

Structured matrix factorizations. Example: Eigenfaces Structured matrix factorizations Example: Eigenfaces An extremely large variety of interesting and important problems in machine learning can be formulated as: Given a matrix, find a matrix and a matrix

More information

Fast Hankel Tensor-Vector Products and Application to Exponential Data Fitting

Fast Hankel Tensor-Vector Products and Application to Exponential Data Fitting Fast Hankel Tensor-Vector Products and Application to Exponential Data Fitting Weiyang Ding Liqun Qi Yimin Wei arxiv:1401.6238v1 [math.na] 24 Jan 2014 January 27, 2014 Abstract This paper is contributed

More information

MATRIX COMPLETION AND TENSOR RANK

MATRIX COMPLETION AND TENSOR RANK MATRIX COMPLETION AND TENSOR RANK HARM DERKSEN Abstract. In this paper, we show that the low rank matrix completion problem can be reduced to the problem of finding the rank of a certain tensor. arxiv:1302.2639v2

More information

New Ranks for Even-Order Tensors and Their Applications in Low-Rank Tensor Optimization

New Ranks for Even-Order Tensors and Their Applications in Low-Rank Tensor Optimization New Ranks for Even-Order Tensors and Their Applications in Low-Rank Tensor Optimization Bo JIANG Shiqian MA Shuzhong ZHANG January 12, 2015 Abstract In this paper, we propose three new tensor decompositions

More information

arxiv: v1 [math.co] 10 Aug 2016

arxiv: v1 [math.co] 10 Aug 2016 POLYTOPES OF STOCHASTIC TENSORS HAIXIA CHANG 1, VEHBI E. PAKSOY 2 AND FUZHEN ZHANG 2 arxiv:1608.03203v1 [math.co] 10 Aug 2016 Abstract. Considering n n n stochastic tensors (a ijk ) (i.e., nonnegative

More information

Multi-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems

Multi-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems Multi-Linear Mappings, SVD, HOSVD, and the Numerical Solution of Ill-Conditioned Tensor Least Squares Problems Lars Eldén Department of Mathematics, Linköping University 1 April 2005 ERCIM April 2005 Multi-Linear

More information

Linear Algebra for Machine Learning. Sargur N. Srihari

Linear Algebra for Machine Learning. Sargur N. Srihari Linear Algebra for Machine Learning Sargur N. srihari@cedar.buffalo.edu 1 Overview Linear Algebra is based on continuous math rather than discrete math Computer scientists have little experience with it

More information

Multiscale Tensor Decomposition

Multiscale Tensor Decomposition Multiscale Tensor Decomposition Alp Ozdemir 1, Mark A. Iwen 1,2 and Selin Aviyente 1 1 Department of Electrical and Computer Engineering, Michigan State University 2 Deparment of the Mathematics, Michigan

More information

Algorithm 862: MATLAB Tensor Classes for Fast Algorithm Prototyping

Algorithm 862: MATLAB Tensor Classes for Fast Algorithm Prototyping Algorithm 862: MATLAB Tensor Classes for Fast Algorithm Prototyping BRETT W. BADER and TAMARA G. KOLDA Sandia National Laboratories Tensors (also known as multidimensional arrays or N-way arrays) are used

More information

Outline Introduction: Problem Description Diculties Algebraic Structure: Algebraic Varieties Rank Decient Toeplitz Matrices Constructing Lower Rank St

Outline Introduction: Problem Description Diculties Algebraic Structure: Algebraic Varieties Rank Decient Toeplitz Matrices Constructing Lower Rank St Structured Lower Rank Approximation by Moody T. Chu (NCSU) joint with Robert E. Funderlic (NCSU) and Robert J. Plemmons (Wake Forest) March 5, 1998 Outline Introduction: Problem Description Diculties Algebraic

More information

Recovering Tensor Data from Incomplete Measurement via Compressive Sampling

Recovering Tensor Data from Incomplete Measurement via Compressive Sampling Recovering Tensor Data from Incomplete Measurement via Compressive Sampling Jason R. Holloway hollowjr@clarkson.edu Carmeliza Navasca cnavasca@clarkson.edu Department of Electrical Engineering Clarkson

More information

Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization

Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization Performance Evaluation of the Matlab PCT for Parallel Implementations of Nonnegative Tensor Factorization Tabitha Samuel, Master s Candidate Dr. Michael W. Berry, Major Professor Abstract: Increasingly

More information

Notes on Latent Semantic Analysis

Notes on Latent Semantic Analysis Notes on Latent Semantic Analysis Costas Boulis 1 Introduction One of the most fundamental problems of information retrieval (IR) is to find all documents (and nothing but those) that are semantically

More information