
ON THE TENSOR SVD AND THE OPTIMAL LOW RANK ORTHOGONAL APPROXIMATION OF TENSORS

JIE CHEN AND YOUSEF SAAD

Abstract. It is known that a higher order tensor does not necessarily have an optimal low rank approximation, and that a tensor might not be orthogonally decomposable (i.e., admit a tensor SVD). We provide several sufficient conditions which lead to the failure of the tensor SVD, and characterize the existence of the tensor SVD with respect to the Higher Order SVD (HOSVD). In the face of these difficulties in generalizing standard results known in the matrix case to tensors, we consider the low rank orthogonal approximation of tensors. The existence of an optimal approximation is theoretically guaranteed under certain conditions, and this optimal approximation yields a tensor decomposition where the diagonal of the core is maximized. We present an algorithm to compute this approximation and analyze its convergence behavior. Numerical experiments indicate a linear convergence rate for this algorithm.

Key words. multilinear algebra, singular value decomposition, tensor decomposition, low rank approximation

AMS subject classifications. 15A69, 15A18

1. Introduction. There has been renewed interest in studying the properties and decompositions of tensors (also known as N-way arrays or multidimensional arrays) in numerical linear algebra in recent years [30, 13, 12, 43, 17, 9, 28, 29, 15, 11]. Tensor approximation techniques have been fruitfully applied in various areas which include, among others, chemometrics [38, 4], signal processing [10, 8], vision and graphics [41, 42], and network analysis [31, 1].

From the point of view of practical applications, the matrix SVD and the optimal rank-r approximation of matrices (a.k.a. the Eckart-Young theorem [18]) are of particular interest, and it would be nice if these properties could be directly generalized to higher order tensors. However, for any order N >= 3, de Silva and Lim [17] showed that the problem of optimal low rank approximation of higher order tensors is ill-posed for many ranks r, and that this ill-posedness is not rare for order-3 tensors. Furthermore, Kolda presented numerous examples to illustrate the difficulties of orthogonal tensor decompositions [28, 29]. These studies revealed many aspects of the dissimilarities between tensors and matrices, in spite of the fact that higher order tensors are multidimensional generalizations of matrices.

The most commonly used generalization of the matrix SVD to higher order tensors to date is the Higher Order Singular Value Decomposition (HOSVD) [12]. The HOSVD decomposes an order-N tensor into a core tensor that has the same size as the original tensor together with N orthogonal(1) side-matrices. Although this decomposition preserves many nice aspects of the matrix SVD (e.g., the core has the all-orthogonality property and the ordering property), a notable difference is that the core is in general not diagonal. Hence, in contrast with the matrix SVD, the HOSVD cannot be written as a sum of a few orthogonal outer-product terms(2).

This work was supported by NSF under grants DMS and DMS and by the Minnesota Supercomputing Institute. Department of Computer Science and Engineering, University of Minnesota at Twin Cities, MN. Email: {jchen, saad}@cs.umn.edu.

(1) Throughout this paper, a matrix A in R^{m x n}, m >= n, is said to be orthogonal if A^T A = I. This generalizes the definition for square matrices.
(2) For discussions of orthogonality, see Section 2.4.

There exist three well-known approximations to higher order tensors: (1) the rank-1 approximation [13, 43, 27]; (2) the rank-(r_1, r_2, ..., r_N) approximation with a full core and N orthogonal side-matrices (in the Tucker/HOOI fashion) [40, 13]; and (3) the approximation using r outer-product terms (in the CANDECOMP/PARAFAC fashion) [6, 19]. Note that the approximated tensor in case (3) might have rank less than r. Among these approximations, the rank-1 approximation [17] and the rank-(r_1, r_2, ..., r_N) approximation are theoretically guaranteed to have a global optimum. In practical applications, the three approximations are generally computed using an alternating least squares (ALS) method [33, 3, 25] (the so-called "workhorse" algorithm [30]), although many other methods have also been proposed [34, 37, 43, 28, 15, 11]. The convergence behavior of the ALS method is theoretically unknown except under a few strong conditions [32]. Besides, it has long been observed that the ALS method for the PARAFAC model may converge extremely slowly, if at all [36, 26]. An illustration of this phenomenon is given in the Appendix.

Kolda [28] investigated several orthogonal decompositions of tensors related to different definitions of orthogonality, including the orthogonal rank decomposition, the complete orthogonal rank decomposition, and the strong orthogonal rank decomposition. These decompositions might not be unique, or even exist. Among these definitions, only the complete orthogonality gives a situation which parallels that of the matrix SVD. This approach demands that the side-matrices all be orthogonal, in which case we use the term tensor singular value decomposition (tensor SVD, see Definition 4.1) in this paper. Zhang and Golub [43] proved that for all tensors of order N >= 3, the tensor SVD is unique (up to signs) if it exists, and that the incremental rank-1 approximation approach will compute this decomposition.

The following contributions are made in this paper:
1. Sufficient conditions indicating which tensors fail to have a tensor SVD are given. These conditions are related to the rank, the order, and the dimensions of the tensor, and hence can be viewed as generalizations of results given in the literature with specific examples. Furthermore, the existence of the tensor SVD can be characterized by the diagonality of the core in the HOSVD of the tensor.
2. A form of low rank approximations, one that requires a diagonal core and orthogonal side-matrices, is discussed. Theoretically the global optimum of this approximation can be attained for any (appropriate) rank. We present an iterative algorithm to compute this approximation and analyze its convergence behavior.
3. The proposed approximation at the maximally possible rank leads to a decomposition of the tensor, where the diagonal of the core is maximized. This maximal diagonality for symmetric order-3 [14] and order-4 [7] tensors and for general order-3 tensors [15, 24, 35] has been previously investigated, and Jacobi algorithms were used in the cited papers, but our discussion is in a more general context and the proposed algorithm is not of a Jacobi type.

2. Tensor algebra. In this section, we briefly review some concepts and notions that are used throughout the paper. A tensor is a multidimensional array of data whose elements are referred to by using multiple indices. The number of indices required is called the order of the tensor. We use A = (a_{i_1, i_2, ..., i_N}) in R^{d_1 x d_2 x ... x d_N} to denote a tensor A of order N. For n = 1, 2, ..., N, d_n is the n-th dimension of A. As a special case, a vector is an order-1 tensor and a matrix is an order-2 tensor.

2.1. Unfoldings and mode-n products. It is hard to visualize tensors of order N > 3. They can be flexibly represented when unfolded into matrices. The unfolding of a tensor along mode n is a matrix of dimension d_n x (d_{n+1} ... d_N d_1 ... d_{n-1}). We denote the mode-n unfolding of tensor A by A_(n). Each column of A_(n) is a column of A along the n-th mode.

An important operation for a tensor is the tensor-matrix multiplication, also known as the mode-n product. Given a tensor A in R^{d_1 x ... x d_N} and a matrix M in R^{c_n x d_n}, the mode-n product

    B = A \times_n M \in R^{d_1 x ... x d_{n-1} x c_n x d_{n+1} x ... x d_N}

is a tensor where

    b_{i_1,...,i_{n-1},j_n,i_{n+1},...,i_N} := \sum_{i_n=1}^{d_n} a_{i_1,...,i_{n-1},i_n,i_{n+1},...,i_N} \, m_{j_n,i_n}

for j_n = 1, 2, ..., c_n. In matrix representation, this is

    B_(n) = M A_(n).    (2.1)

2.2. Inner products and tensor norms. The inner product of two tensors A and B of the same size is defined by

    \langle A, B \rangle_F := \sum_{i_1=1}^{d_1} \cdots \sum_{i_N=1}^{d_N} a_{i_1,...,i_N} b_{i_1,...,i_N},

and the norm induced from this inner product is ||A||_F := sqrt(<A, A>_F). We say that A is a unit tensor if ||A||_F = 1. When N = 2, ||A||_F is the Frobenius norm of the matrix A. The norm of a tensor is equal to the Frobenius norm of the unfolding of the tensor along any mode:

    ||A||_F = ||A_(n)||_F,  for n = 1, ..., N.

2.3. Tensor products and outer products of vectors. The tensor product of an order-N tensor A in R^{d_1 x ... x d_N} and an order-N' tensor B in R^{c_1 x ... x c_{N'}} is an order-(N + N') tensor

    C = A \otimes B \in R^{d_1 x ... x d_N x c_1 x ... x c_{N'}},

where

    c_{i_1,...,i_N,j_1,...,j_{N'}} := a_{i_1,...,i_N} b_{j_1,...,j_{N'}}.

Note that the operator for tensor products unfortunately coincides with the one used to denote the Kronecker product of two matrices. In particular, the tensor product of two matrices (order-2 tensors) is an order-4 tensor, while the Kronecker product of two matrices is again a matrix. The reader shall not be confused by this notation since in this paper Kronecker products are not involved.
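As a concrete illustration, the following MATLAB sketch implements the mode-n unfolding and the mode-n product under the cyclic column ordering described above, and checks identity (2.1) together with the norm identity of Section 2.2. The helper names unfold and modeprod are ours, and script-local functions assume MATLAB R2016b or later; this is a minimal sketch, not the authors' implementation.

% Mode-n unfolding and mode-n product, cyclic column ordering
A = randn(3, 4, 5);               % an order-3 tensor
M = randn(7, 4);                  % to be applied along mode 2

A2 = unfold(A, 2);                % 4 x (5*3) matrix of mode-2 fibers
B  = modeprod(A, M, 2);           % B = A x_2 M, of size 3 x 7 x 5

% Sanity checks: (2.1) and the norm identity of Section 2.2
norm(unfold(B, 2) - M * A2, 'fro')           % ~ 0
abs(norm(A(:)) - norm(unfold(A, 1), 'fro'))  % ~ 0

function An = unfold(A, n)
    N  = ndims(A);
    An = reshape(permute(A, [n, n+1:N, 1:n-1]), size(A, n), []);
end

function B = modeprod(A, M, n)
    N  = ndims(A);
    sz = size(A); sz(n) = size(M, 1);
    % multiply on the unfolding, then fold back and undo the permutation
    B = reshape(M * unfold(A, n), sz([n, n+1:N, 1:n-1]));
    B = ipermute(B, [n, n+1:N, 1:n-1]);
end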

The outer product of N (column) vectors, which generalizes the standard outer product of two vectors (a rank-1 matrix), is a special case of tensor products. The outer product of N vectors x^(n) in R^{d_n} is an order-N tensor of dimension d_1 x d_2 x ... x d_N:

    X = x^{(1)} \otimes x^{(2)} \otimes \cdots \otimes x^{(N)}.

The (i_1, i_2, ..., i_N)-entry of X is \prod_{n=1}^N (x^{(n)})_{i_n}, where (x^(n))_{i_n} denotes the i_n-th entry of the vector x^(n). The tensor X is also called a rank-1 tensor. (The rank of a tensor is defined in Section 3.)

It can be verified that the mode-n product of a rank-1 tensor X with a matrix M can be computed as follows:

    X \times_n M = x^{(1)} \otimes \cdots \otimes x^{(n-1)} \otimes (M x^{(n)}) \otimes x^{(n+1)} \otimes \cdots \otimes x^{(N)},

and that the inner product of X with a general tensor A is

    \langle A, X \rangle_F = \langle A, x^{(1)} \otimes \cdots \otimes x^{(N)} \rangle_F = A \times_1 x^{(1)T} \times_2 x^{(2)T} \cdots \times_N x^{(N)T}.

If U = u^{(1)} \otimes \cdots \otimes u^{(N)} and V = v^{(1)} \otimes \cdots \otimes v^{(N)} are two rank-1 tensors, then

    \langle U, V \rangle_F = \prod_{n=1}^N \langle u^{(n)}, v^{(n)} \rangle,

where <.,.> denotes the standard Euclidean inner product of two vectors. A consequence of the above relation is that ||U||_F is the product of the 2-norms of the vectors u^(n).

2.4. Orthogonality of tensors. Two tensors A and B of the same size are F-orthogonal (Frobenius orthogonal) if their inner product is zero, i.e., <A, B>_F = 0. For rank-1 tensors U = u^(1) \otimes ... \otimes u^(N) and V = v^(1) \otimes ... \otimes v^(N), the above definition implies that they are F-orthogonal if \prod_{n=1}^N <u^(n), v^(n)> = 0. This leads to other options for defining orthogonality for two rank-1 tensors. The paper [28] discussed two cases:
1. Complete orthogonality: <u^(n), v^(n)> = 0 for all n = 1, ..., N.
2. Strong orthogonality: For all n, either <u^(n), v^(n)> = 0 or u^(n) and v^(n) are collinear, but there is at least one l such that <u^(l), v^(l)> = 0.
In this paper we will simply use the term orthogonal for two outer products that are completely orthogonal (case 1).
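The rank-1 identities above are easy to check numerically. In the following MATLAB sketch (the helper name outer3 is ours), an order-3 outer product is materialized via Kronecker products, exploiting MATLAB's column-major layout, and the inner product and norm relations for rank-1 tensors are verified; a minimal sketch, not from the paper.

% outer3(x,y,z) builds the tensor with entries x(i1)*y(i2)*z(i3)
outer3 = @(x, y, z) reshape(kron(z, kron(y, x)), numel(x), numel(y), numel(z));

u = {randn(3,1), randn(4,1), randn(5,1)};
v = {randn(3,1), randn(4,1), randn(5,1)};
U = outer3(u{1}, u{2}, u{3});
V = outer3(v{1}, v{2}, v{3});

lhs = U(:)' * V(:);                                % <U, V>_F
rhs = (u{1}'*v{1}) * (u{2}'*v{2}) * (u{3}'*v{3});  % prod of <u^(n), v^(n)>
abs(lhs - rhs)                                     % ~ 0
norm(U(:)) - norm(u{1})*norm(u{2})*norm(u{3})      % ~ 0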

[Fig. 2.1. A decomposition of an order-3 tensor A as B \times_1 S^{(1)} \times_2 S^{(2)} \times_3 S^{(3)}.]

2.5. Tensor decompositions. A decomposition of a tensor A in R^{d_1 x ... x d_N} is of the form

    A = B \times_1 S^{(1)} \times_2 S^{(2)} \cdots \times_N S^{(N)},

where B in R^{c_1 x ... x c_N} is called the core tensor, and S^(n) in R^{d_n x c_n} for n = 1, ..., N are called side-matrices. An illustration is given in Figure 2.1. Let s_i^(n) be the i-th column of S^(n). The decomposition of A can equivalently be written as a linear combination of rank-1 tensors:

    A = \sum_{i_1=1}^{c_1} \cdots \sum_{i_N=1}^{c_N} b_{i_1,i_2,...,i_N} \, s_{i_1}^{(1)} \otimes s_{i_2}^{(2)} \otimes \cdots \otimes s_{i_N}^{(N)}.    (2.2)

In particular, if B is diagonal, i.e., b_{i_1,...,i_N} = 0 except when i_1 = i_2 = ... = i_N, then

    A = \sum_{i=1}^{r} b_{i...i} \, s_i^{(1)} \otimes s_i^{(2)} \otimes \cdots \otimes s_i^{(N)},    (2.3)

where r = min{c_1, ..., c_N}.

In the literature, the term decomposition is often used when approximation is meant instead. The Tucker3 decomposition is an approximation in the form of the right-hand side of (2.2), for given dimensions c_1, c_2, ..., c_N. Usually, it is required that c_n is less than the rank of A_(n) for all n, otherwise the problem is trivial. The HOOI approach computes this approximation with the additional property that all the S^(n)'s are orthogonal matrices. The CANDECOMP/PARAFAC decomposition is an approximation in the form of the right-hand side of (2.3), for a pre-specified r. Usually, r is smaller than the smallest dimension of all modes of A, although requiring a larger r is also possible in the ALS and other algorithms. As will be discussed in the next section, the smallest r that satisfies equality (2.3) is the rank of the tensor A.

3. Tensor ranks. The rank of a tensor causes difficulties when attempting to generalize matrix properties to tensors. There are several possible generalizations of the notion of rank. The n-rank of a tensor A in R^{d_1 x ... x d_N}, for n = 1, ..., N, denoted by rank_n(A), is the rank of the unfolding A_(n):

    rank_n(A) := rank(A_(n)).
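The n-ranks can thus be computed directly from the unfoldings. A small MATLAB sketch follows (helper names ours, as in the earlier sketches); for a sum of two generic rank-1 terms, each n-rank typically comes out as 2.

outer3 = @(x, y, z) reshape(kron(z, kron(y, x)), numel(x), numel(y), numel(z));
unfold = @(A, n) reshape(permute(A, [n, n+1:ndims(A), 1:n-1]), size(A, n), []);

A = outer3(randn(3,1), randn(4,1), randn(5,1)) + ...
    outer3(randn(3,1), randn(4,1), randn(5,1));

nranks = arrayfun(@(n) rank(unfold(A, n)), 1:3)   % typically [2 2 2]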

The (outer-product) rank of A, denoted by rank(A), is defined as

    rank(A) := \min\{ r : \exists\, x_i^{(n)} \in R^{d_n},\ i = 1,...,r,\ n = 1,...,N, \text{ s.t. } A = \sum_{i=1}^r x_i^{(1)} \otimes x_i^{(2)} \otimes \cdots \otimes x_i^{(N)} \}.

Hence, a tensor is the outer product of N vectors if and only if it has rank one, and the rank of a general tensor A is the minimum number of rank-1 tensors that sum to A. There are a few notable differences between the notion of rank for matrices and that for tensors:

1. For N = 2, i.e., when A is a matrix, rank_1(A) is the row rank, rank_2(A) is the column rank, and rank(A) is the outer-product rank, and they are all equal. However, for higher order tensors (N > 2), in general, the n-ranks are different for different modes n, and they are different from rank(A) [12]. Furthermore, the rank of a matrix A cannot be larger than the smallest dimension of both modes of A, but for tensors this is no longer true, i.e., the rank can be larger than the smallest dimension of the tensor [12].
2. The matrix SVD yields one possible way of writing a matrix as a sum of outer-product terms, and the number of nonzero singular values is equal to the rank of the matrix. However, a tensor SVD does not always exist (see Section 4), but if it indeed does, it is unique up to signs [34, 43] and the number of singular values is equal to the rank of the tensor (see Definition 4.1 and Proposition 4.2).
3. It is well-known that the optimal rank-r approximation of a matrix is simply its truncated SVD. However some tensors may fail to have an optimal rank-r approximation [17]. If such an approximation exists, it is unclear whether it can be written in the form of a diagonal core multiplied by orthogonal side-matrices.

Next are some lemmas and a theorem related to tensor ranks, which were also given in [17]. They are useful in deriving the results in Section 4. The first lemma indicates that the rank of a tensor cannot be smaller than any of its n-ranks:

Lemma 3.1. Let A in R^{d_1 x ... x d_N} be an order-N tensor. Then rank_n(A) <= min{rank(A), d_n}, for n = 1, 2, ..., N.

The next lemma illustrates a way to construct higher order tensors while preserving the rank.

Lemma 3.2. Let A be a tensor and x be a non-zero vector. Then rank(A) = rank(A \otimes x).

The following lemma indicates that given any dimensions d_1, d_2, ..., d_N, we can construct a tensor of arbitrary rank R <= min{d_1, d_2, ..., d_N}.

Lemma 3.3. For n = 1, ..., N, let x_1^{(n)}, ..., x_R^{(n)} in R^{d_n} be linearly independent. Then the tensor

    A = \sum_{i=1}^R x_i^{(1)} \otimes x_i^{(2)} \otimes \cdots \otimes x_i^{(N)}

has rank R.

The next theorem is due to JáJá and Takche [23]. They showed that if A and B are order-3 tensors and at least one of them is a stack of two matrices, then the rank of their direct sum is equal to the sum of their ranks.

Theorem 3.4 (JáJá and Takche). Let A in R^{d_1 x d_2 x d_3} and B in R^{c_1 x c_2 x c_3}. If 2 in {d_1, d_2, d_3, c_1, c_2, c_3}, then rank(A \oplus B) = rank(A) + rank(B).

3.1. The ill-posedness of the optimal low rank approximation problem. de Silva and Lim [17] proved that for any order N >= 3 and dimensions d_1, ..., d_N >= 2, there exists a rank-(r+1) tensor that has no optimal rank-r approximation, for any r = 2, ..., min{d_1, ..., d_N}. This result was further generalized to an arbitrary rank gap, i.e., there exists a rank-(r+s) tensor that has no optimal rank-r approximation, for some r's and s's. Essentially, this ill-posedness of the optimal approximation problem is illustrated by the fact that the tensor

    E := e_2 \otimes e_1 \otimes e_1 + e_1 \otimes e_2 \otimes e_1 + e_1 \otimes e_1 \otimes e_2 \in R^{2 x 2 x 2},

where e_i is the i-th column of the identity matrix, has rank 3 but can be approximated arbitrarily closely by rank-at-most-2 tensors. Hence E does not have an optimal rank-2 approximation. Then according to Theorem 3.4 and Lemma 3.2, the ill-posedness of the problem can be generalized to arbitrary rank and order, by constructing higher rank and higher order tensors using direct sums and tensor products. We restate one of the results of [17] in the following theorem. For details of the proof, see the original paper.

Theorem 3.5. For N >= 3 and d_1, d_2, ..., d_N >= 2, there exists a tensor A in R^{d_1 x ... x d_N} of rank r + s that has no optimal rank-r approximation, for any r and s >= 1 satisfying 2s <= r <= min{d_1, d_2, ..., d_N}.
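The degeneracy behind the tensor E above can be observed numerically. In the following MATLAB sketch (outer3 as in the earlier sketches; the sequence X_n is the standard construction for this example, supplied here by us rather than quoted from the paper), rank-2 tensors X_n approach the rank-3 tensor E at a rate O(1/n):

outer3 = @(x, y, z) reshape(kron(z, kron(y, x)), numel(x), numel(y), numel(z));
e1 = [1; 0]; e2 = [0; 1];
E = outer3(e2,e1,e1) + outer3(e1,e2,e1) + outer3(e1,e1,e2);
for n = [1e1 1e2 1e3 1e4]
    w  = e1 + e2/n;
    Xn = n*outer3(w,w,w) - n*outer3(e1,e1,e1);   % a rank-2 tensor
    fprintf('n = %g, ||E - X_n||_F = %.2e\n', n, norm(E(:) - Xn(:)));
end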

4. The tensor SVD and its (non-)existence. The definition used for the singular value decomposition of a tensor generalizes the matrix SVD from the angle of an expansion into outer product matrices, which becomes an expansion into a sum of rank-1 tensors.

Definition 4.1. If a tensor A in R^{d_1 x ... x d_N} can be written in the form

    A = \sum_{i=1}^R \sigma_i \, u_i^{(1)} \otimes u_i^{(2)} \otimes \cdots \otimes u_i^{(N)},    (4.1)

where sigma_1 >= sigma_2 >= ... >= sigma_R > 0 and <u_j^(n), u_k^(n)> = delta_{jk} (Kronecker delta) for n = 1, 2, ..., N, then (4.1) is said to be the tensor singular value decomposition (tensor SVD) of A. The sigma_i's are singular values and the u_i^(n)'s for i = 1, ..., R are the mode-n singular vectors.

We also call (4.1) the SVD of tensor A for short where there is no ambiguity about tensors and matrices. In fact, when N = 2, i.e., A is a matrix, the tensor SVD of A boils down to the matrix SVD. Expression (4.1) can equivalently be written in the form

    A = D \times_1 U^{(1)} \times_2 U^{(2)} \cdots \times_N U^{(N)},    (4.2)

where D in R^{R x R x ... x R} is the diagonal core tensor with D_{ii...i} = sigma_i, and

    U^{(n)} = [ u_1^{(n)}, u_2^{(n)}, ..., u_R^{(n)} ] \in R^{d_n x R}    (4.3)

are orthogonal matrices for n = 1, 2, ..., N.

The following proposition indicates that the tensor SVD is rank revealing.

Proposition 4.2. If a tensor A has the SVD as in (4.1), then rank(A) = R.
Proof. This follows from Lemma 3.3.

Trivially, if a tensor is constructed as in (4.1), its SVD exists. However, in general, a tensor of order N >= 3 may fail to have such a decomposition. In this section, we identify some of these situations. To begin with, note that the orthogonality of each U^(n) implies that R <= d_n for each n, i.e., R <= min{d_1, d_2, ..., d_N}. This leads to the following simple result.

Proposition 4.3. A tensor A in R^{d_1 x ... x d_N} with rank(A) > min{d_1, d_2, ..., d_N} does not admit a tensor SVD.
Proof. The existence of a tensor SVD such as in (4.1) would trivially lead to a contradiction, since the tensor in (4.1) has rank R with R <= min{d_1, d_2, ..., d_N}.

Note that Theorem 3.5 guarantees that the condition of Proposition 4.3 is not vacuously satisfied, for any order N >= 3 and dimensions d_1, d_2, ..., d_N >= 2.

Corollary 4.4. Given a tensor A satisfying the condition in Proposition 4.3, any tensor of the form A \otimes x^{(N+1)} \otimes \cdots \otimes x^{(N+l)}, where l >= 1 and x^(N+1), ..., x^(N+l) are nonzero vectors, does not admit a tensor SVD.
Proof. This follows from Proposition 4.3 and Lemma 3.2.

Corollary 4.5. A tensor A in R^{d_1 x ... x d_N} does not admit a tensor SVD if there exists at least one mode n such that rank_n(A) > min{d_1, d_2, ..., d_N}.
Proof. This follows from Proposition 4.3 and Lemma 3.1.

Proposition 4.6. There exists a tensor A in R^{d_1 x ... x d_N} which does not admit a tensor SVD whenever d := max_n{d_n} > min_n{d_n} and d^2 <= \prod_{n=1}^N d_n.

Proof. Without loss of generality, assume that d = d_1 >= d_2 >= ... >= d_N and let d' = d_2 ... d_N. Since d <= d', for an arbitrary set of orthonormal vectors {a_i in R^{d'} : i = 1, ..., d}, we can construct a tensor A whose unfolding A_(1) = [a_1, a_2, ..., a_d]^T. Then rank_1(A) = d. By Corollary 4.5, A does not admit a tensor SVD.

Note that when N = 2, i.e., for the matrix case, it is impossible for d_1 and d_2 to satisfy the condition in the proposition.

In closing this section, we provide a necessary and sufficient condition to characterize the existence of the tensor SVD.(3) This is related to the HOSVD proposed by [12]. The essential relation underlying the theorem is that the mode-n singular vectors of A, when its SVD exists, are also the left singular vectors of the unfolding A_(n).

(3) As pointed out by a referee, the provided relation may have long been recognized in other fields of research, such as signal processing, at least in the case of distinct singular values.

Theorem 4.7. A tensor A admits an SVD if and only if there exists an HOSVD of A such that the core is diagonal.

Proof. The sufficient condition is obvious. Consider the necessary condition. If A can be written in the form (4.1), define the tensor

    W_i^{(n)} := u_i^{(n+1)} \otimes \cdots \otimes u_i^{(N)} \otimes u_i^{(1)} \otimes \cdots \otimes u_i^{(n-1)},

and let w_i^(n) be the vectorization of W_i^(n). Then the unfolding of A along mode n is

    A_{(n)} = \sum_{i=1}^R \sigma_i \, u_i^{(n)} w_i^{(n)T}.

Since <u_j^(n), u_k^(n)> = delta_{jk} for all n, we have <w_j^(n), w_k^(n)> = delta_{jk}. Hence the above form is the SVD of the matrix A_(n). In other words, the vectors u_1^(n), ..., u_R^(n) are the left singular vectors of A_(n). From the construction of the HOSVD, Equation (4.2) is a valid HOSVD for A.(4)

The proof of the above theorem indicates that if the SVD of a tensor A exists, its singular values coincide with the nonzero mode-n singular values in its HOSVD. However the HOSVD of a tensor may not be unique, since the SVDs of the unfoldings A_(n) are not guaranteed to be unique. Hence even if a tensor is constructed as in (4.1), its HOSVD will not necessarily recover this form. This is the reason why in the above theorem we use the phrase "... if there exists ...". It is interesting to note again that the non-uniqueness of the matrix SVD is caused by duplicate singular values; however the tensor SVD is unique (if it exists) even when some of the singular values are the same [43, Theorem 3.2].

(4) In order to strictly conform to the definition of the HOSVD defined in [12], in (4.2) the size of D should be enlarged from R x R x ... x R to d_1 x d_2 x ... x d_N by padding zeros, and the U^(n)'s should be padded with orthogonal columns to make square shapes.
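The characterization of Theorem 4.7 can be illustrated numerically: build a tensor with an exact tensor SVD, take the leading left singular vectors of each unfolding (the HOSVD side-matrices), and observe that the resulting core is diagonal up to signs. The MATLAB sketch below is ours (helpers as before); distinct singular values are assumed so that the unfolding SVDs are essentially unique.

outer3 = @(x, y, z) reshape(kron(z, kron(y, x)), numel(x), numel(y), numel(z));
unfold = @(A, n) reshape(permute(A, [n, n+1:ndims(A), 1:n-1]), size(A, n), []);

d = 4; R = 2;
[U1,~] = qr(randn(d,R), 0); [U2,~] = qr(randn(d,R), 0); [U3,~] = qr(randn(d,R), 0);
sigma = [3; 1];
A = sigma(1)*outer3(U1(:,1),U2(:,1),U3(:,1)) + sigma(2)*outer3(U1(:,2),U2(:,2),U3(:,2));

% leading left singular vectors of each unfolding
[W1,~,~] = svd(unfold(A,1), 'econ'); W1 = W1(:,1:R);
[W2,~,~] = svd(unfold(A,2), 'econ'); W2 = W2(:,1:R);
[W3,~,~] = svd(unfold(A,3), 'econ'); W3 = W3(:,1:R);

% core S = A x_1 W1' x_2 W2' x_3 W3', computed entrywise for brevity
S = zeros(R, R, R);
for i = 1:R, for j = 1:R, for k = 1:R
    S(i,j,k) = W1(:,i)' * unfold(A,1) * kron(W3(:,k), W2(:,j));
end, end, end
S   % off-diagonal entries ~ 0; |S(1,1,1)|, |S(2,2,2)| recover the sigma's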

5. The optimal low rank orthogonal approximation. The problem addressed by tensor analysis is to approximate some tensor A by a linear combination of tensors T_1, T_2, ..., T_r that have special structures, e.g., rank-1 tensors, orthogonal tensors, or simple tensors.(5) For this it is desirable to minimize

    || A - \sum_{i=1}^r \sigma_i T_i ||_F

for a given r. Without loss of generality, we assume that ||T_i||_F = 1 for all i. As discussed in Section 3.1, if the T_i's are rank-1 tensors, the infimum of the above expression might not necessarily be attained. The following proposition reveals some properties when the infimum is indeed achieved.

Proposition 5.1. Given a tensor A and a positive integer r, consider the set of linear combinations of tensors of the form

    T := \sum_{i=1}^r \sigma_i T_i    (5.1)

where the T_i's are arbitrary unit tensors. If inf ||A - T||_F is reached on this set, then for the optimal T and T_i's,

    <A - T, T_i>_F = 0  for i = 1, 2, ..., r.

Furthermore, if the T_i's are required to be mutually F-orthogonal, then the optimal sigma_i's are related to the optimal T_i's by

    sigma_i = <A, T_i>_F  for i = 1, 2, ..., r.    (5.2)

In this situation,

    ||T||_F^2 = \sum_{i=1}^r \sigma_i^2  and  ||A - T||_F^2 = ||A||_F^2 - ||T||_F^2.    (5.3)

Proof. If the infimum is attained by a certain set of sigma_i's and T_i's, and if there is a j such that <A - sum_i sigma_i T_i, T_j>_F = eps != 0, then

    || A - \sum_i \sigma_i T_i - eps T_j ||_F^2
      = || A - \sum_i \sigma_i T_i ||_F^2 - 2 eps < A - \sum_i \sigma_i T_i, T_j >_F + eps^2 ||T_j||_F^2
      = || A - \sum_i \sigma_i T_i ||_F^2 - eps^2
      < || A - \sum_i \sigma_i T_i ||_F^2,

which contradicts the assumption.

If the unit tensors T_i's are mutually F-orthogonal, then

    0 = < A - \sum_i \sigma_i T_i, T_j >_F = <A, T_j>_F - sigma_j <T_j, T_j>_F = <A, T_j>_F - sigma_j.

Also,

    ||T||_F^2 = <T, T>_F = < \sum_i \sigma_i T_i, \sum_j \sigma_j T_j >_F = \sum_{i=1}^r \sigma_i^2,

and

    || A - \sum_i \sigma_i T_i ||_F^2 = ||A||_F^2 - 2 \sum_i \sigma_i <A, T_i>_F + \sum_i \sigma_i^2 = ||A||_F^2 - \sum_i \sigma_i^2 = ||A||_F^2 - ||T||_F^2.

The last part of the proof indicates that the equalities in (5.3) follow from the orthogonality of the T_i's and the relations (5.2). They do not require optimality.

In this section, we will see that if the T_i's are mutually orthogonal rank-1 tensors, then the infimum in the proposition can be attained. Formally, we will prove that the problem

    min E := || A - \sum_{i=1}^r \sigma_i \, u_i^{(1)} \otimes u_i^{(2)} \otimes \cdots \otimes u_i^{(N)} ||_F
    s.t. <u_j^(n), u_k^(n)> = delta_{jk},  for n = 1, 2, ..., N,    (5.4)

has a solution for any A in R^{d_1 x ... x d_N} and any r <= min{d_1, d_2, ..., d_N}. The solution for the case r = min{d_1, d_2, ..., d_N} leads to a decomposition of A where the diagonal of the core is maximized.

(5) A tensor is simple if it is the tensor product of two tensors.

5.1. Existence of the global optimum. Let

    T_i := u_i^{(1)} \otimes u_i^{(2)} \otimes \cdots \otimes u_i^{(N)},  for i = 1, ..., r,    (5.5)

and the sigma_i's be defined as in (5.2); then according to Proposition 5.1 (see the comments following the proof),

    E^2 = || A - \sum_{i=1}^r \sigma_i T_i ||_F^2 = ||A||_F^2 - \sum_{i=1}^r \sigma_i^2.

Hence minimizing E is equivalent to maximizing sum_i sigma_i^2, i.e., the optimization problem (5.4) is equivalent to the following:

    max E' := \sum_{i=1}^r ( A \times_1 u_i^{(1)T} \times_2 u_i^{(2)T} \cdots \times_N u_i^{(N)T} )^2
    s.t. <u_j^(n), u_k^(n)> = delta_{jk},  for n = 1, 2, ..., N.    (5.6)

Let

    U^{(n)} = [ u_1^{(n)}, u_2^{(n)}, ..., u_r^{(n)} ] \in \Omega^{(n)}    (5.7)

where

    \Omega^{(n)} := { W \in R^{d_n x r} : W^T W = I }    (5.8)

for n = 1, 2, ..., N. The problem (5.6) can be interpreted as that of maximizing E' within the feasible region

    \Omega := \Omega^{(1)} \times \Omega^{(2)} \times \cdots \times \Omega^{(N)}.    (5.9)

Since for each n the set Omega^(n) is compact (see, e.g., [22, p. 69]), by the Tychonoff theorem, the feasible region Omega is compact. Under the continuous mapping E', the image E'(Omega) is also compact. Hence it has a maximum. This proves the following theorem:

Theorem 5.2. There exists a solution to the problem (5.6) (or equivalently (5.4) with sigma_i defined in (5.2)) for any r <= min{d_1, d_2, ..., d_N}.
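The identity E^2 = ||A||_F^2 - sum_i sigma_i^2 underlying the equivalence of (5.4) and (5.6) holds for any mutually orthonormal sets of vectors, optimal or not. A quick MATLAB check follows (notation and helpers ours; order 3 for concreteness):

outer3 = @(x, y, z) reshape(kron(z, kron(y, x)), numel(x), numel(y), numel(z));
unfold = @(A, n) reshape(permute(A, [n, n+1:ndims(A), 1:n-1]), size(A, n), []);

A = randn(3, 4, 5); r = 2;
[U1,~] = qr(randn(3,r), 0); [U2,~] = qr(randn(4,r), 0); [U3,~] = qr(randn(5,r), 0);

T = zeros(size(A)); ssq = 0;
for i = 1:r
    % sigma_i = A x_1 u_i^(1)T x_2 u_i^(2)T x_3 u_i^(3)T, cf. (5.2)
    s   = U1(:,i)' * unfold(A,1) * kron(U3(:,i), U2(:,i));
    T   = T + s * outer3(U1(:,i), U2(:,i), U3(:,i));
    ssq = ssq + s^2;
end
abs( norm(A(:) - T(:))^2 - (norm(A(:))^2 - ssq) )   % ~ 0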

5.2. Relation to tensor decompositions. Let U^(n), n = 1, ..., N, be the solution to the problem (5.4) with r = min{d_1, ..., d_N} and sigma_i be defined in (5.2). Also for n = 1, ..., N, let U_perp^(n) be a d_n x (d_n - r) matrix such that the square matrix

    \tilde{U}^{(n)} := [ U^{(n)}, U_perp^{(n)} ] \in R^{d_n x d_n}    (5.10)

is orthogonal. Further, define the tensor

    S := A \times_1 \tilde{U}^{(1)T} \times_2 \tilde{U}^{(2)T} \cdots \times_N \tilde{U}^{(N)T} \in R^{d_1 x d_2 x ... x d_N}.    (5.11)

Then the equality

    A = S \times_1 \tilde{U}^{(1)} \times_2 \tilde{U}^{(2)} \cdots \times_N \tilde{U}^{(N)}    (5.12)

holds. This decomposition of A has the following two properties:

(i) The side-matrices \tilde{U}^(n)'s are orthogonal.
(ii) The (squared) norm of the diagonal of the core S,

    \sum_{i=1}^{\min\{d_1,...,d_N\}} s_{i...i}^2 = \sum_{i=1}^r ( A \times_1 u_i^{(1)T} \times_2 u_i^{(2)T} \cdots \times_N u_i^{(N)T} )^2 = \sum_{i=1}^r \sigma_i^2,

is maximized among all the choices of the orthogonal side-matrices. This is known as maximal diagonality in [12].

5.3. First order condition. The Lagrangian of (5.6) is

    L = \sum_{i=1}^r \sigma_i^2 - \sum_{n=1}^N \sum_{j,k=1}^r \mu_{j,k}^n ( <u_j^{(n)}, u_k^{(n)}> - \delta_{jk} ),    (5.13)

where

    \sigma_i = A \times_1 u_i^{(1)T} \times_2 u_i^{(2)T} \cdots \times_N u_i^{(N)T}    (5.14)

and the mu_{j,k}^n's are Lagrange multipliers. Define the vector

    v_i^{(n)} := A \times_1 u_i^{(1)T} \cdots \times_{n-1} u_i^{(n-1)T} \times_{n+1} u_i^{(n+1)T} \cdots \times_N u_i^{(N)T} \in R^{1 x ... x 1 x d_n x 1 x ... x 1}.    (5.15)

(Here we abuse the notation "=". More precisely, v_i^(n) should be the mode-n unfolding of the tensor on the right-hand side of (5.15).) It is not hard to see that <u_i^(n), v_i^(n)> = sigma_i for all n and i, and v_i^(n) is the partial derivative of sigma_i with respect to u_i^(n).

The partial derivative of the Lagrangian with respect to u_i^(n) is

    \partial L / \partial u_i^{(n)} = 2 \sigma_i v_i^{(n)} - \sum_{j=1}^r \mu_{j,i}^n u_j^{(n)} - \sum_{k=1}^r \mu_{i,k}^n u_k^{(n)},

for any n and i. By setting the partial derivatives to 0 and putting all equations related to the same n in matrix form, we obtain the following equations:

    [ v_1^{(n)}, ..., v_r^{(n)} ] \, diag(\sigma_1, ..., \sigma_r) = [ u_1^{(n)}, ..., u_r^{(n)} ] \, M^{(n)},    (5.16)

for all n = 1, 2, ..., N, where M^(n) is the symmetric matrix with (j,k)-entry (mu_{j,k}^n + mu_{k,j}^n)/2. Let

    V^{(n)} := [ v_1^{(n)}, v_2^{(n)}, ..., v_r^{(n)} ],    (5.17)
    \Sigma := diag(\sigma_1, ..., \sigma_r).    (5.18)

Then (5.16) is compactly represented as

    V^{(n)} \Sigma = U^{(n)} M^{(n)},  n = 1, 2, ..., N.    (5.19)

In summary, the necessary condition for an extremum of the Lagrangian is the equation (5.19), where V^(n) is defined in (5.17), Sigma is defined in (5.18), U^(n) is defined in (5.7), and M^(n) is symmetric, for all n = 1, 2, ..., N.

5.4. Algorithm: LROAT. We seek orthogonal matrices U^(n)'s and symmetric matrices M^(n)'s which satisfy the system (5.19). (The Sigma and V^(n)'s are computed from the U^(n)'s.) Note that the pair U^(n), M^(n) happens to be the polar decomposition of the matrix V^(n) Sigma. Hence the system can be solved in an iterative fashion: We begin with an initial guess of the set of orthogonal matrices {U^(1), U^(2), ..., U^(N)}, which can be obtained, say, by the HOSVD of A. For each n, we compute V^(n) and Sigma, and update U^(n) as an orthogonal polar factor of V^(n) Sigma. This procedure is iterated until convergence is observed. Algorithm 1 (LROAT) summarizes this idea.

Algorithm 1 Low Rank Orthogonal Approximation of Tensors (LROAT)
Input: Tensor A, rank r, orthogonal matrices U^(1), ..., U^(N) as initial guess
Output: sigma_1, ..., sigma_r, U^(1), ..., U^(N)
1: repeat
2:   for n = 1, ..., N do
3:     Compute V^(n) = [v_1^(n), ..., v_r^(n)] according to (5.15)
4:     Compute Sigma = diag(sigma_1, ..., sigma_r) according to (5.14)
5:     [Q^(n), H^(n)] <- polar-decomp(V^(n) Sigma)
6:     Update U^(n) <- Q^(n)
7:   end for
8: until convergence

Note that when r = 1, the matrix V^(n) = [v_1^(n)] and U^(n) = [u_1^(n)], which means that for each iteration u_1^(n) is updated as the normalized v_1^(n). This indicates that the LROAT algorithm for r = 1 boils down to the ALS method [43] (or the so-called higher-order power method [13, 27]) for computing the optimal rank-1 approximation. Hence, it is not unexpected to see in the numerical experiments that in general LROAT converges linearly. We also comment that LROAT is not an alternating least squares method (except for the case r = 1) by the nature of the update of U^(n).
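A minimal MATLAB sketch of Algorithm 1 for an order-3 tensor is given below, assuming the cyclic unfolding of Section 2.1. The names (unfold, polarfactor) and the fixed iteration count are ours; the orthogonal polar factor is obtained from a thin SVD, and script-local functions assume MATLAB R2016b or later. This is an illustration of the iteration, not the authors' code.

A = randn(5, 5, 5); r = 3;
[U1,~] = qr(randn(5,r), 0); U2 = U1; U3 = U1;    % some initial guess
unfold = @(A, n) reshape(permute(A, [n, n+1:ndims(A), 1:n-1]), size(A, n), []);

for iter = 1:200                    % convergence test omitted for brevity
    % mode 1: columns v_i^(1) and sigma_i, cf. (5.14)-(5.15)
    V1 = zeros(5, r);
    for i = 1:r, V1(:,i) = unfold(A,1) * kron(U3(:,i), U2(:,i)); end
    Sig = diag(sum(U1 .* V1, 1));   % sigma_i = <u_i^(1), v_i^(1)>
    U1 = polarfactor(V1 * Sig);     % lines 5-6 of Algorithm 1
    % mode 2
    V2 = zeros(5, r);
    for i = 1:r, V2(:,i) = unfold(A,2) * kron(U1(:,i), U3(:,i)); end
    Sig = diag(sum(U2 .* V2, 1));
    U2 = polarfactor(V2 * Sig);
    % mode 3
    V3 = zeros(5, r);
    for i = 1:r, V3(:,i) = unfold(A,3) * kron(U2(:,i), U1(:,i)); end
    Sig = diag(sum(U3 .* V3, 1));
    U3 = polarfactor(V3 * Sig);
end
sigma = sum(U3 .* V3, 1)            % weights of the r orthogonal terms

function Q = polarfactor(X)
    % orthogonal polar factor of X (full column rank assumed)
    [W, ~, Z] = svd(X, 'econ');
    Q = W * Z';
end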

5.5. Convergence analysis. LROAT employs an alternating procedure (iterating through U^(1), U^(2), ..., U^(N)), where in each step all but one (U^(n)) of the parameters are fixed. In general, algorithms of this type, including alternating least squares, are not guaranteed to converge. Specifically, the objective function may converge but not the parameters. (See, for example, [32] for some discussions.) For LROAT, we are also unable yet to prove global convergence, though empirically it appears to hold. However, in this section, we will prove that: (1) The iterations monotonically increase the objective value E' (Theorem 5.4); (2) Under a mild condition, every converging subsequence of the generated parameter sequence converges to a stationary point of the objective function (Theorem 5.7); and (3) In a neighborhood of a local maximum, the parameter sequence converges to this stationary point (Theorem 5.9).

Before analyzing the convergence behavior of LROAT, we index all the iterates. The outer loop is indexed by p and the overall iterations are indexed by idx, which is equal to n + (p-1)N. In other words, Algorithm 1 is rewritten as follows. In particular, the numbered lines correspond to the lines in Algorithm 1.

for p = 1, 2, ... do
  for n = 1, ..., N do
    idx = n + (p-1)N
    For all i, compute sigma_i^(idx) according to U^(1)_(p+1), ..., U^(n-1)_(p+1), U^(n)_(p), U^(n+1)_(p), ..., U^(N)_(p)
    Objective E'^(idx) = \sum_{i=1}^r (sigma_i^(idx))^2
3:  Compute V^(n)_(p) from U^(1)_(p+1), ..., U^(n-1)_(p+1), U^(n+1)_(p), ..., U^(N)_(p)
4:  Assign Sigma^(idx) = diag(sigma_1^(idx), ..., sigma_r^(idx))
5:  Polar decomposition V^(n)_(p) Sigma^(idx) = Q^(n)_(p) H^(n)_(p)
6:  Update U^(n)_(p+1) = Q^(n)_(p)
  end for
end for

The following lemma, which is well-known when the matrix A is square, reveals the trace maximizing property that is important for the convergence analysis of LROAT.

Lemma 5.3. Let a matrix A in R^{m x n}, m >= n, have the polar decomposition A = QH, where Q in R^{m x n} is the orthogonal polar factor and H in R^{n x n} is the symmetric positive semi-definite factor. Then

    max { tr(P^T A) : P in R^{m x n}, P^T P = I }

is attained when P = Q.

Proof. Any P can be written as ZQ, where Z in R^{m x m} is orthogonal. Then tr(P^T A) = tr(Q^T Z^T Q H) = tr(Z^T Q H Q^T). Since Q H Q^T is symmetric positive semi-definite, max tr(Z^T Q H Q^T) is attained when Z = I.

Since U^(n)_(p+1) is the orthogonal polar factor of V^(n)_(p) Sigma^(idx), by Lemma 5.3,

    \sum_{i=1}^r (\sigma_i^{(idx)})^2 = tr( U^{(n)T}_{(p)} V^{(n)}_{(p)} \Sigma^{(idx)} ) <= tr( U^{(n)T}_{(p+1)} V^{(n)}_{(p)} \Sigma^{(idx)} ) = \sum_{i=1}^r \sigma_i^{(idx+1)} \sigma_i^{(idx)}.

Then by the Cauchy-Schwarz inequality,

    \sum_{i=1}^r \sigma_i^{(idx+1)} \sigma_i^{(idx)} <= ( \sum_i (\sigma_i^{(idx+1)})^2 )^{1/2} ( \sum_i (\sigma_i^{(idx)})^2 )^{1/2},

so that

    \sum_{i=1}^r (\sigma_i^{(idx)})^2 <= \sum_{i=1}^r (\sigma_i^{(idx+1)})^2,    (5.20)

and

    \sum_{i=1}^r (\sigma_i^{(idx)})^2 = \sum_{i=1}^r (\sigma_i^{(idx+1)})^2  iff  \sigma_i^{(idx)} = \sigma_i^{(idx+1)} for all i.    (5.21)

Inequality (5.20) means that each update of U^(n) increases the value of the objective function E', i.e., E'^(idx) <= E'^(idx+1). Since E' is bounded from above (existence of the maximum, see Theorem 5.2), the sequence {E'^(idx)} converges. Note that the convergence does not depend on the initial guess input to the algorithm.
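The trace-maximizing property of Lemma 5.3 is easy to probe empirically. In the following MATLAB sketch (ours), the orthogonal polar factor Q of a random matrix is compared against random competitors with orthonormal columns; tr(Q'*A) comes out at least as large:

A = randn(6, 3);
[W, ~, Z] = svd(A, 'econ');
Q  = W * Z';                        % orthogonal polar factor of A
tQ = trace(Q' * A);                 % equals the sum of singular values

best = -inf;
for t = 1:1000                      % random orthonormal competitors
    [P, ~] = qr(randn(6, 3), 0);
    best = max(best, trace(P' * A));
end
[tQ, best]                          % tQ >= best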

Formally, we have established the following theorem:

Theorem 5.4. Given any initial guess, the iterations of Algorithm 1 monotonically increase the objective function E' defined in (5.6) to a limit.

The convergence of the objective function does not necessarily imply that the function parameters will converge. However, in our case, since the parameters U^(n)'s are bounded, they admit converging subsequences. Next we will show that every such subsequence converges to a stationary point of E'. For this, the following lemma uses a helper function f.

Lemma 5.5. Let T : Theta -> Theta be a continuous mapping and a sequence {theta_n in Theta} be generated from the fixed point iteration theta_{n+1} = T(theta_n). If there exists a continuous function f : Theta -> R satisfying the following two conditions:
(i) The sequence {f(theta_n)} converges;
(ii) For theta in Theta, if f(T(theta)) = f(theta) then T(theta) = theta;
then every converging subsequence of {theta_n} converges to a fixed point of T.

Proof. Let {theta_{s_n}} be a converging subsequence of {theta_n}, where theta_{s_n} -> theta*. Also let f* be the limit of f(theta_n). Then f(theta_{s_n}) -> f(theta*), therefore f(theta*) = f*. Meanwhile, from the continuity of T and f, we have T(theta_{s_n}) -> T(theta*) and f(theta_{s_n + 1}) = f(T(theta_{s_n})) -> f(T(theta*)), which implies that f(T(theta*)) = f*. Condition (ii) of the lemma now implies that theta* = T(theta*).

Our objective function E' is just one such helper function f, and the orthogonal polar factor function plays the role of the mapping T in the above lemma. The following lemma establishes the fact that the orthogonal polar factor function is continuous.

Lemma 5.6. The orthogonal polar factor function g : A -> Q, defined on the set of matrices with full column rank, is continuous. Here Q is the orthogonal polar factor of A in R^{m x n}, m >= n.

Proof. First, the function g is well defined, since the orthogonal polar factor of a full rank matrix exists and is unique [21]. If Q and Q~ are the orthogonal polar factors of A and A~, respectively, Sun and Chen [39] have shown that

    || Q - Q~ ||_F <= 2 ||A^+||_2 || A - A~ ||_F,

where ^+ denotes the pseudo-inverse. Hence if A_1, A_2, ... converges to A*, then g(A_1), g(A_2), ... converges to g(A*).

Now we are ready to prove the following result.

Theorem 5.7. Every converging subsequence of { U^(1)_(p), ..., U^(N)_(p) } generated by Algorithm 1 converges to a stationary point of the objective function E' defined in (5.6), provided the matrices V^(n) in line 3 of the algorithm do not become rank-deficient throughout the iterations.

Proof. For convenience, let U denote the side-by-side concatenation of the U^(n)'s, i.e., at iteration number p we write U_(p) := [U^(1)_(p), ..., U^(N)_(p)]. For each iteration, V^(n)_(p) Sigma^(idx) is computed from U_(p) and polar factorized, and U^(n)_(p) is updated. Let T be the composite of all these iterations running n from 1 to N. That is, U_(p+1) = T(U_(p)). It is not hard to see that T is continuous by Lemma 5.6. The objective function E' taking parameter U_(p) has been previously shown to be such that the sequence {E'(U_(p))} is monotonically converging. Hence by Lemma 5.5, in order to prove this theorem, it will suffice to show that E'(T(U)) = E'(U) implies T(U) = U. Then every converging subsequence of {U_(p)} converges to a fixed point, which satisfies the first order condition (5.19), i.e., it is also a stationary point of E'.

If E'(T(U)) = E'(U), formula (5.21) indicates that the sigma values have not changed after the iteration. In particular, for any n, the update of U^(n) has not changed tr( U^{(n)T} V^{(n)} Sigma ). Since the orthogonal polar factor of V^(n) Sigma is unique when V^(n) is not rank-deficient, this means that U^(n) has not changed. This in turn means that U is a fixed point of the mapping T.

The condition in the theorem is not a strong requirement in general. Of course, the columns v_i^(n) of the matrix V^(n), as computed from (5.15), will be linearly dependent if the n-rank of A is less than r. For practical applications, the tensor usually has full n-ranks for all n, so this does not hamper the applicability of the theorem.

Though the global convergence of {U_(p)} is not determined, when localized, it is possible that this parameter sequence converges. The following lemma and theorems consider this situation.

Lemma 5.8. If a sequence {theta_n} is bounded, and all of its converging subsequences converge to theta*, then theta_n -> theta*.

Proof. (By contradiction.) If {theta_n} does not converge to theta*, then there is an eps > 0 such that there exists a subsequence S = {theta_{s_n}}, where ||theta_{s_n} - theta*|| >= eps for all n. Since S is bounded, it has a converging subsequence S'. Then S', as a subsequence of {theta_n}, converges to a limit other than theta*.

Theorem 5.9. Let U* = [U^(1)*, ..., U^(N)*] be a local maximum of the objective function E' defined in (5.6). If the sequence {U_(p) := [U^(1)_(p), ..., U^(N)_(p)]} generated by Algorithm 1 lies in a neighborhood of U*, where U* is the only stationary point in that neighborhood, and if the full rank requirement in Theorem 5.7 is satisfied, then the sequence {U_(p)} converges to U*.

Proof. This immediately follows from Theorem 5.7 and Lemma 5.8.

Note that since the starting elements of a sequence have no effect on its convergence behavior, the above theorem holds whenever a tailing subsequence, starting from a sufficiently large p, lies within the neighborhood. A weaker, but simpler, result is the following corollary.

Corollary 5.10. Let U* = [U^(1)*, ..., U^(N)*] be a local maximum of the objective function E' defined in (5.6). If this local maximum is unique and if the full rank requirement in Theorem 5.7 is satisfied, then the sequence {U_(p) := [U^(1)_(p), ..., U^(N)_(p)]} generated by Algorithm 1 converges to U*.

5.6. LROAT for symmetric tensors. An order-N tensor A in R^{d x d x ... x d}, whose dimensions along all modes are the same, is symmetric if for all permutations pi,

    a_{i_1, i_2, ..., i_N} = a_{i_{pi(1)}, i_{pi(2)}, ..., i_{pi(N)}}.

For symmetric tensors, usually the approximation problem (5.4) has the additional constraint that the side-matrices U^(n)'s are the same for all n, i.e., the problem becomes

    min E = || A - \sum_{i=1}^r \sigma_i \, u_i \otimes u_i \otimes \cdots \otimes u_i ||_F
    s.t. <u_j, u_k> = delta_{jk}.    (5.22)

Applying similar arguments to those in Section 5.1, it is easily seen that (5.22) is

equivalent to the following problem:

    max E' = \sum_{i=1}^r ( A \times_1 u_i^T \times_2 u_i^T \cdots \times_N u_i^T )^2
    s.t. <u_j, u_k> = delta_{jk}.    (5.23)

The supremum of E' can be attained. Further, the maximal-diagonality decomposition of A (cf. Equation (5.12)) has the additional property that the core S is symmetric. Also, the first order condition (5.19) is simplified to V Sigma = U M.

Hence, there are two approaches to compute the approximation for a symmetric tensor A. The first approach is to directly apply LROAT to A. The theorems in the above section guarantee convergence under mild assumptions, but the side-matrices might no longer be the same, though in the next section an experiment indicates that they indeed converge to the same matrix. The second approach is to use only a single initial guess U and omit the for-loop on n (line 2 of Algorithm 1). We call this the symmetric variant of LROAT. In this case Theorem 5.4 no longer holds, i.e., the iterations might not monotonically increase the objective value E' defined in (5.23), since the for-loop on n is omitted. An experiment in the next section shows an oscillating phenomenon, similar to the one indicated in Figure 4.1 of [27], for the objective value E'.

6. Numerical experiments. This section will show a few experiments to illustrate the convergence behavior and the approximation quality of LROAT. For comparison, we use the ALS methods for Tucker and PARAFAC, whose implementations are based on the codes from the MATLAB Tensor Toolbox developed by Bader and Kolda [2]. We use the major left singular vectors of the unfoldings as the initial guess input for all the algorithms compared. When it comes to the quality of the final approximation, experience shows that compared with random orthonormal vectors, singular vectors as initial guesses do not offer any advantage. It has been argued that running the algorithms several times using different sets of random initial guesses enhances the probability of hitting the global optimum. We use singular vectors here only for repeatability.

6.1. Convergence of LROAT. In the first experiment, we test LROAT (and the symmetric variant of LROAT mentioned in Section 5.6) on a few tensors listed in Table 6.1. The results are shown in Figures 6.1 and 6.2. Each row of the figures is one test on a tensor. The left plot shows the objective value (the same as the norm of the approximated tensor T) for each iteration p, while the right plot shows the convergence history of the U^(n)'s. Since the optima are unknown, we plot the values

    || U^{(n)}_{(p)} - U^{(n)}_{(p-1)} ||_F / || U^{(n)}_{(p)} ||_F,  p = 1, 2, ...,

to indicate the convergence of the sequence. Since these values are plotted on a logarithmic scale, if the curves are bounded from above by a straight decreasing line, then the convergence of the sequence is at least linear.

Figures 6.1 and 6.2 show in total five tests. The first test (Figure 6.1(a)) uses a randomly generated tensor A_1. The second test (Figure 6.1(b)) uses a low-rank-plus-Gaussian-noise tensor

    A_2 = B_1 + rho B_2,

where the low rank tensor B_1 is in the form (2.3) with r = 5, the Gaussian noise tensor B_2 has normally distributed elements, and rho = 0.1 ||B_1||_F / ||B_2||_F.

Table 6.1
The tensors used for the first experiment. The value r is the rank input to LROAT; it is not the rank of the tensor.

Tensor | Dimensions | r | Notes
A_1    |            |   | a random tensor
A_2    |            |   | a rank-5 tensor + Gaussian noise
A_3    |            |   | the (i, j, k)-entry = 1/(i^2 + j^2 + k^2)
A_4    |            |   | see [27, Example 1]

In these two tests the two tensors are applied to the LROAT algorithm. The third test (Figure 6.1(c)) uses a symmetric tensor A_3 with entries (A_3)_{ijk} = 1/(i^2 + j^2 + k^2). In this test A_3 is applied to the symmetric variant of LROAT. All three tests show a linear convergence rate.

The fourth (Figure 6.2(a)) and the fifth (Figure 6.2(b)) tests use a symmetric tensor A_4 introduced in [27, Example 1]:

(A_4)_1111 = , (A_4)_1112 = , (A_4)_1113 = ,
(A_4)_1122 = , (A_4)_1123 = , (A_4)_1133 = ,
(A_4)_1222 = , (A_4)_1223 = , (A_4)_1233 = ,
(A_4)_1333 = , (A_4)_2222 = , (A_4)_2223 = ,
(A_4)_2233 = , (A_4)_2333 = , (A_4)_3333 = .

In [27], the symmetric higher-order power method for computing the optimal rank-1 approximation of A_4 is shown to be non-converging. We experiment with this tensor with r = 2 on LROAT and on the symmetric variant of LROAT. Figure 6.2(a) shows that when applied to LROAT, the approximation to A_4 indeed converges linearly, and what is more, all the side-matrices converge to the same result. The approximation computed by LROAT is

    A_4 ~ sigma_1 u^(1) \otimes u^(1) \otimes u^(1) \otimes u^(1) + sigma_2 u^(2) \otimes u^(2) \otimes u^(2) \otimes u^(2)

with

    sigma_1 = , u^(1) = [ ]^T,
    sigma_2 = , u^(2) = [ ]^T.

On the other hand, Figure 6.2(b) shows that the symmetric variant of LROAT fails to converge.

6.2. Low rank orthogonal approximation compared with Tucker and PARAFAC. In the second experiment, we compare the approximation quality of three different models: the low rank orthogonal approximation (without confusion, in this section we call this model LROAT for short, which happens to be the name of the algorithm), Tucker, and PARAFAC. See Figure 6.3. We experiment with two tensors: a low-rank-plus-Gaussian-noise tensor which is generated the same way

as A_2, and a real-life tensor. The latter is obtained from a problem in acoustics [20], and the data can be downloaded from [16]. The residual norms

    res(p) := || A - T_{(p)} ||_F / || A ||_F

over all the iterations p are plotted. Figure 6.3 indicates three facts: (1) The three models approximate the data tensor well to some extent (less than 35% of the information is lost due to approximation); (2) PARAFAC is usually slow to converge; (3) The residual norm for LROAT is larger than those of Tucker and PARAFAC. The last fact is not unexpected since LROAT can be considered a special case of Tucker and of PARAFAC: The Tucker model has a full core while the core for LROAT is diagonal, and unlike LROAT the side-matrices in the PARAFAC model are not restricted to be orthogonal.

6.3. An application. In the blind source separation (BSS) problem [5], the cumulant tensor of order 4 is a rank-R tensor:

    \sum_{i=1}^R \sigma_i \, u_i \otimes u_i \otimes u_i \otimes u_i,    (6.1)

where R is the number of sources and u_i is the i-th column of the mixing matrix. In the prewhitening approach for the BSS problem, the u_i's become the columns of the composite of the whitening matrix and the mixing matrix, that is, the u_i's are length-R vectors and are orthonormal. Hence, this prewhitening approach reduces to computing the tensor SVD of the cumulant tensor. Since in practice this tensor is estimated from a finite data set, it is not exact. Thus, the low rank orthogonal approximation becomes a suitable tool to recover the u_i's.
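A hedged MATLAB sketch of a synthetic setup of this kind follows (all names, sizes, and weights below are ours, chosen only for illustration): a symmetric order-4 tensor of the form (6.1) is built from R orthonormal vectors, and a symmetrized Gaussian noise term is added.

R = 3;
[U, ~] = qr(randn(R), 0);            % orthonormal u_1, ..., u_R
sigma = [3; 2; 1];                   % illustrative weights

outer4 = @(u) reshape(kron(kron(u, u), kron(u, u)), R, R, R, R);
B3 = zeros(R, R, R, R);
for i = 1:R, B3 = B3 + sigma(i) * outer4(U(:,i)); end

N = randn(R, R, R, R);               % symmetrize a Gaussian tensor
B4 = zeros(R, R, R, R);
P = perms(1:4);
for k = 1:size(P, 1), B4 = B4 + permute(N, P(k,:)); end
B4 = B4 / size(P, 1);

rho = 0.05 * norm(B3(:)) / norm(B4(:));
A5 = B3 + rho * B4;                  % the noisy cumulant-like tensor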

In an experiment, we let R = 3 and generate a data tensor

    A_5 = B_3 + rho B_4,

where B_3 is as in (6.1), B_4 is a symmetric tensor with normally distributed elements, and rho = 0.05 ||B_3||_F / ||B_4||_F. The sigma_i's and the u_i's are

    sigma_1 = , sigma_2 = , sigma_3 = ,  U = [u_1, u_2, u_3] = .

We use four methods to compute the rank-R (or rank-(R, R, R)) approximations to A_5: LROAT, incremental rank-1 approximation, PARAFAC, and Tucker. All four methods return the same side-matrices for all modes. They are:

    U_LROAT = ,  U_inc = ,  U_PARAFAC = ,  U_Tucker = .

Observations are as follows:
1. The matrices U_inc and U_PARAFAC are not orthogonal.
2. Compared with Tucker, LROAT gives better approximations to the vectors u_i:

    || U - U_LROAT || = ,  || U - U_Tucker || = .

3. In terms of approximation quality, the residual norms (in percentage of the norm of A_5) are res_LROAT = 3.07%, res_inc = 1.36%, res_PARAFAC = 1.36%, res_Tucker = 0%.

7. Concluding remarks. In the present paper we studied the tensor SVD, and characterized its existence in relation to the HOSVD. Similar to the concept of rank, the SVD of higher order tensors exhibits quite different behavior and characteristics from those of matrices. Thus, the SVD of a matrix is guaranteed to exist, though it may have different representations due to orthogonal transformations of singular vectors corresponding to the same singular value. On the other hand, there are many ways in which a tensor can fail to have an SVD (see the results in Section 4), but when it exists, this decomposition is unique up to signs.

We have also discussed a new form of optimal low rank approximation of tensors, where orthogonality is required. This approximation is inspired by the constraints of the Tucker model and the PARAFAC model. In some applications, the proposed approximation model may be favored, since it results in N sets of orthonormal vectors or, equivalently, r mutually orthogonal unit rank-1 tensors with different weights. Among the advantages of this approximation over the Tucker model is the fact that it requires far fewer entries to represent the core, and that it is easier to interpret. Also, compared with the PARAFAC model, the orthogonality of the vectors may be useful in some cases. Further, the LROAT algorithm for computing the proposed approximation does not seem to exhibit the well-known slow convergence from which the ALS algorithm for PARAFAC suffers.

A major restriction of the proposed model is that the number of terms r cannot exceed the smallest dimension of all modes of the tensor. A consequence is that the approximation may still be very different from the original tensor even when the maximum r is employed. However, we note that when performing data analysis, the interpretation of the vectors and the core tensor might be more important than how much is lost when the data is approximated. A nice aspect of the proposed approximation is that the optimum of the objective function can theoretically be attained, in contrast to the PARAFAC model, which is ill-posed in a strict mathematical sense. We presented an algorithm to compute this approximation, but the computed result is only optimal in a local neighborhood. It will be interesting to study for which tensors or which initial guesses the LROAT algorithm converges to the global optimum, or to devise a new algorithm to solve this optimization problem. It is an open problem how fast LROAT converges, although empirically convergence is observed to be linear. We also discussed the symmetric variant of LROAT and pointed out the possibility of its non-convergence. Hence the convergence properties of this variant, and the observed phenomenon that the original LROAT algorithm can yield the same side-matrices for symmetric tensors, remain to be investigated.

Appendix. Does the ALS algorithm for PARAFAC converge? It has been pointed out that the ALS algorithm for computing the PARAFAC model may

converge very slowly due to degenerate solutions or multicollinearities, and many alternatives have been proposed to address this problem [36, 37, 26]. During the iterations, the objective value monotonically decreases by the nature of the alternating least squares procedure, and since the sequence is bounded, it converges. However, a proof of the convergence of the parallel factors is lacking. In general it is assumed that these factors converge, but may take a very large number of iterations. In this section, we discuss an experiment showing that the general concept of convergence is unclear in this context. Though only one example is given, we note that the exhibited behavior is not rare for randomly generated tensors. (On the other hand, it may be argued that tensors in real applications are far from being filled with random entries.)

We generate an order-3 tensor A and run the ALS algorithm with r = 2, i.e., to compute the approximation

    A ~ lambda_1 u_1^(1) \otimes u_1^(2) \otimes u_1^(3) + lambda_2 u_2^(1) \otimes u_2^(2) \otimes u_2^(3).

The MATLAB code which generates the tensor A is as follows:

A(:,:,1) = [ ; ; ];
A(:,:,2) = [ ; ; ];
A(:,:,3) = [ ; ; ];

We use u_i^(n) = e_i, where n = 1, 2, 3 and e_i is the i-th column of the identity matrix, as the initial guess. Denote U^(n) = [u_1^(n), u_2^(n)] for n = 1, 2, 3. Two plots are shown after running 10^5 iterations (see Figure A.1). Figure A.1(a) shows the convergence history for each U^(n). The curves represent ||U^(n)_(p) - U^(n)_(p-1)||, where p is the index of the iterations. A necessary condition for convergence to occur is that all three curves decrease to zero. However we see from the figure that this may not be the case. To test the conjecture that each of the curves tends to a nonzero value, we use the following expression

    log_10 y = a (10^{-4} x)^{1/alpha} + b

to fit the tailing part of the curves (starting from the th iteration). Table A.1 gives the fitting results for different alpha's. When the number of iterations tends to infinity, the value 10^b will show the limit of the differences between two consecutive U^(n)'s.

It is still difficult to conclude for this example that the iterations do not converge, since rounding has not been taken into account. However, it makes no practical difference for this case whether the sequence actually converges or whether it is exceedingly slow to converge. The result, if convergence holds, will be an inordinate number of iterations to reach a desirable level of convergence, and the cost will be too high in practice. This can be made evident by examining Figure A.1(b), which plots the parallel factor u_2^(1) over all iterations: The 3rd entry of u_2^(1) decreases from at the th iteration to at the th iteration.

REFERENCES


2.3 Nilpotent endomorphisms s a block dagonal matrx, wth A Mat dm U (C) In fact, we can assume that B = B 1 B k, wth B an ordered bass of U, and that A = [f U ] B, where f U : U U s the restrcton of f to U 40 23 Nlpotent endomorphsms

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Lecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem.

Lecture 20: Lift and Project, SDP Duality. Today we will study the Lift and Project method. Then we will prove the SDP duality theorem. prnceton u. sp 02 cos 598B: algorthms and complexty Lecture 20: Lft and Project, SDP Dualty Lecturer: Sanjeev Arora Scrbe:Yury Makarychev Today we wll study the Lft and Project method. Then we wll prove

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

A New Refinement of Jacobi Method for Solution of Linear System Equations AX=b

A New Refinement of Jacobi Method for Solution of Linear System Equations AX=b Int J Contemp Math Scences, Vol 3, 28, no 17, 819-827 A New Refnement of Jacob Method for Soluton of Lnear System Equatons AX=b F Naem Dafchah Department of Mathematcs, Faculty of Scences Unversty of Gulan,

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

CHAPTER III Neural Networks as Associative Memory

CHAPTER III Neural Networks as Associative Memory CHAPTER III Neural Networs as Assocatve Memory Introducton One of the prmary functons of the bran s assocatve memory. We assocate the faces wth names, letters wth sounds, or we can recognze the people

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

Affine transformations and convexity

Affine transformations and convexity Affne transformatons and convexty The purpose of ths document s to prove some basc propertes of affne transformatons nvolvng convex sets. Here are a few onlne references for background nformaton: http://math.ucr.edu/

More information

Numerical Heat and Mass Transfer

Numerical Heat and Mass Transfer Master degree n Mechancal Engneerng Numercal Heat and Mass Transfer 06-Fnte-Dfference Method (One-dmensonal, steady state heat conducton) Fausto Arpno f.arpno@uncas.t Introducton Why we use models and

More information

8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS

8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 493 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces you have studed thus far n the text are real vector spaces because the scalars

More information

Lecture 3. Ax x i a i. i i

Lecture 3. Ax x i a i. i i 18.409 The Behavor of Algorthms n Practce 2/14/2 Lecturer: Dan Spelman Lecture 3 Scrbe: Arvnd Sankar 1 Largest sngular value In order to bound the condton number, we need an upper bound on the largest

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso Supplement: Proofs and Techncal Detals for The Soluton Path of the Generalzed Lasso Ryan J. Tbshran Jonathan Taylor In ths document we gve supplementary detals to the paper The Soluton Path of the Generalzed

More information

Deriving the X-Z Identity from Auxiliary Space Method

Deriving the X-Z Identity from Auxiliary Space Method Dervng the X-Z Identty from Auxlary Space Method Long Chen Department of Mathematcs, Unversty of Calforna at Irvne, Irvne, CA 92697 chenlong@math.uc.edu 1 Iteratve Methods In ths paper we dscuss teratve

More information

MATH 241B FUNCTIONAL ANALYSIS - NOTES EXAMPLES OF C ALGEBRAS

MATH 241B FUNCTIONAL ANALYSIS - NOTES EXAMPLES OF C ALGEBRAS MATH 241B FUNCTIONAL ANALYSIS - NOTES EXAMPLES OF C ALGEBRAS These are nformal notes whch cover some of the materal whch s not n the course book. The man purpose s to gve a number of nontrval examples

More information

Lecture 21: Numerical methods for pricing American type derivatives

Lecture 21: Numerical methods for pricing American type derivatives Lecture 21: Numercal methods for prcng Amercan type dervatves Xaoguang Wang STAT 598W Aprl 10th, 2014 (STAT 598W) Lecture 21 1 / 26 Outlne 1 Fnte Dfference Method Explct Method Penalty Method (STAT 598W)

More information

10-801: Advanced Optimization and Randomized Methods Lecture 2: Convex functions (Jan 15, 2014)

10-801: Advanced Optimization and Randomized Methods Lecture 2: Convex functions (Jan 15, 2014) 0-80: Advanced Optmzaton and Randomzed Methods Lecture : Convex functons (Jan 5, 04) Lecturer: Suvrt Sra Addr: Carnege Mellon Unversty, Sprng 04 Scrbes: Avnava Dubey, Ahmed Hefny Dsclamer: These notes

More information

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 )

Yong Joon Ryang. 1. Introduction Consider the multicommodity transportation problem with convex quadratic cost function. 1 2 (x x0 ) T Q(x x 0 ) Kangweon-Kyungk Math. Jour. 4 1996), No. 1, pp. 7 16 AN ITERATIVE ROW-ACTION METHOD FOR MULTICOMMODITY TRANSPORTATION PROBLEMS Yong Joon Ryang Abstract. The optmzaton problems wth quadratc constrants often

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Convexity preserving interpolation by splines of arbitrary degree

Convexity preserving interpolation by splines of arbitrary degree Computer Scence Journal of Moldova, vol.18, no.1(52), 2010 Convexty preservng nterpolaton by splnes of arbtrary degree Igor Verlan Abstract In the present paper an algorthm of C 2 nterpolaton of dscrete

More information

a b a In case b 0, a being divisible by b is the same as to say that

a b a In case b 0, a being divisible by b is the same as to say that Secton 6.2 Dvsblty among the ntegers An nteger a ε s dvsble by b ε f there s an nteger c ε such that a = bc. Note that s dvsble by any nteger b, snce = b. On the other hand, a s dvsble by only f a = :

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

EEE 241: Linear Systems

EEE 241: Linear Systems EEE : Lnear Systems Summary #: Backpropagaton BACKPROPAGATION The perceptron rule as well as the Wdrow Hoff learnng were desgned to tran sngle layer networks. They suffer from the same dsadvantage: they

More information

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016

U.C. Berkeley CS294: Spectral Methods and Expanders Handout 8 Luca Trevisan February 17, 2016 U.C. Berkeley CS94: Spectral Methods and Expanders Handout 8 Luca Trevsan February 7, 06 Lecture 8: Spectral Algorthms Wrap-up In whch we talk about even more generalzatons of Cheeger s nequaltes, and

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

Exercise Solutions to Real Analysis

Exercise Solutions to Real Analysis xercse Solutons to Real Analyss Note: References refer to H. L. Royden, Real Analyss xersze 1. Gven any set A any ɛ > 0, there s an open set O such that A O m O m A + ɛ. Soluton 1. If m A =, then there

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

The Geometry of Logit and Probit

The Geometry of Logit and Probit The Geometry of Logt and Probt Ths short note s meant as a supplement to Chapters and 3 of Spatal Models of Parlamentary Votng and the notaton and reference to fgures n the text below s to those two chapters.

More information

Math 217 Fall 2013 Homework 2 Solutions

Math 217 Fall 2013 Homework 2 Solutions Math 17 Fall 013 Homework Solutons Due Thursday Sept. 6, 013 5pm Ths homework conssts of 6 problems of 5 ponts each. The total s 30. You need to fully justfy your answer prove that your functon ndeed has

More information

Appendix for Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

Appendix for Causal Interaction in Factorial Experiments: Application to Conjoint Analysis A Appendx for Causal Interacton n Factoral Experments: Applcaton to Conjont Analyss Mathematcal Appendx: Proofs of Theorems A. Lemmas Below, we descrbe all the lemmas, whch are used to prove the man theorems

More information

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011

Stanford University CS359G: Graph Partitioning and Expanders Handout 4 Luca Trevisan January 13, 2011 Stanford Unversty CS359G: Graph Parttonng and Expanders Handout 4 Luca Trevsan January 3, 0 Lecture 4 In whch we prove the dffcult drecton of Cheeger s nequalty. As n the past lectures, consder an undrected

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Time-Varying Systems and Computations Lecture 6

Time-Varying Systems and Computations Lecture 6 Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy

More information

Randić Energy and Randić Estrada Index of a Graph

Randić Energy and Randić Estrada Index of a Graph EUROPEAN JOURNAL OF PURE AND APPLIED MATHEMATICS Vol. 5, No., 202, 88-96 ISSN 307-5543 www.ejpam.com SPECIAL ISSUE FOR THE INTERNATIONAL CONFERENCE ON APPLIED ANALYSIS AND ALGEBRA 29 JUNE -02JULY 20, ISTANBUL

More information

Solutions to exam in SF1811 Optimization, Jan 14, 2015

Solutions to exam in SF1811 Optimization, Jan 14, 2015 Solutons to exam n SF8 Optmzaton, Jan 4, 25 3 3 O------O -4 \ / \ / The network: \/ where all lnks go from left to rght. /\ / \ / \ 6 O------O -5 2 4.(a) Let x = ( x 3, x 4, x 23, x 24 ) T, where the varable

More information

Maximizing the number of nonnegative subsets

Maximizing the number of nonnegative subsets Maxmzng the number of nonnegatve subsets Noga Alon Hao Huang December 1, 213 Abstract Gven a set of n real numbers, f the sum of elements of every subset of sze larger than k s negatve, what s the maxmum

More information

Perron Vectors of an Irreducible Nonnegative Interval Matrix

Perron Vectors of an Irreducible Nonnegative Interval Matrix Perron Vectors of an Irreducble Nonnegatve Interval Matrx Jr Rohn August 4 2005 Abstract As s well known an rreducble nonnegatve matrx possesses a unquely determned Perron vector. As the man result of

More information

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14

APPROXIMATE PRICES OF BASKET AND ASIAN OPTIONS DUPONT OLIVIER. Premia 14 APPROXIMAE PRICES OF BASKE AND ASIAN OPIONS DUPON OLIVIER Prema 14 Contents Introducton 1 1. Framewor 1 1.1. Baset optons 1.. Asan optons. Computng the prce 3. Lower bound 3.1. Closed formula for the prce

More information

Chapter Newton s Method

Chapter Newton s Method Chapter 9. Newton s Method After readng ths chapter, you should be able to:. Understand how Newton s method s dfferent from the Golden Secton Search method. Understand how Newton s method works 3. Solve

More information

Formulas for the Determinant

Formulas for the Determinant page 224 224 CHAPTER 3 Determnants e t te t e 2t 38 A = e t 2te t e 2t e t te t 2e 2t 39 If 123 A = 345, 456 compute the matrx product A adj(a) What can you conclude about det(a)? For Problems 40 43, use

More information

Estimating the Fundamental Matrix by Transforming Image Points in Projective Space 1

Estimating the Fundamental Matrix by Transforming Image Points in Projective Space 1 Estmatng the Fundamental Matrx by Transformng Image Ponts n Projectve Space 1 Zhengyou Zhang and Charles Loop Mcrosoft Research, One Mcrosoft Way, Redmond, WA 98052, USA E-mal: fzhang,cloopg@mcrosoft.com

More information

General viscosity iterative method for a sequence of quasi-nonexpansive mappings

General viscosity iterative method for a sequence of quasi-nonexpansive mappings Avalable onlne at www.tjnsa.com J. Nonlnear Sc. Appl. 9 (2016), 5672 5682 Research Artcle General vscosty teratve method for a sequence of quas-nonexpansve mappngs Cuje Zhang, Ynan Wang College of Scence,

More information

A new construction of 3-separable matrices via an improved decoding of Macula s construction

A new construction of 3-separable matrices via an improved decoding of Macula s construction Dscrete Optmzaton 5 008 700 704 Contents lsts avalable at ScenceDrect Dscrete Optmzaton journal homepage: wwwelsevercom/locate/dsopt A new constructon of 3-separable matrces va an mproved decodng of Macula

More information

4DVAR, according to the name, is a four-dimensional variational method.

4DVAR, according to the name, is a four-dimensional variational method. 4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The

More information

REAL ANALYSIS I HOMEWORK 1

REAL ANALYSIS I HOMEWORK 1 REAL ANALYSIS I HOMEWORK CİHAN BAHRAN The questons are from Tao s text. Exercse 0.0.. If (x α ) α A s a collecton of numbers x α [0, + ] such that x α

More information

COS 521: Advanced Algorithms Game Theory and Linear Programming

COS 521: Advanced Algorithms Game Theory and Linear Programming COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton

More information

SL n (F ) Equals its Own Derived Group

SL n (F ) Equals its Own Derived Group Internatonal Journal of Algebra, Vol. 2, 2008, no. 12, 585-594 SL n (F ) Equals ts Own Derved Group Jorge Macel BMCC-The Cty Unversty of New York, CUNY 199 Chambers street, New York, NY 10007, USA macel@cms.nyu.edu

More information

Appendix B. The Finite Difference Scheme

Appendix B. The Finite Difference Scheme 140 APPENDIXES Appendx B. The Fnte Dfference Scheme In ths appendx we present numercal technques whch are used to approxmate solutons of system 3.1 3.3. A comprehensve treatment of theoretcal and mplementaton

More information

1 GSW Iterative Techniques for y = Ax

1 GSW Iterative Techniques for y = Ax 1 for y = A I m gong to cheat here. here are a lot of teratve technques that can be used to solve the general case of a set of smultaneous equatons (wrtten n the matr form as y = A), but ths chapter sn

More information

The Second Anti-Mathima on Game Theory

The Second Anti-Mathima on Game Theory The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player

More information

Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS)

Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS) Some Comments on Acceleratng Convergence of Iteratve Sequences Usng Drect Inverson of the Iteratve Subspace (DIIS) C. Davd Sherrll School of Chemstry and Bochemstry Georga Insttute of Technology May 1998

More information

ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION

ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EQUATION Advanced Mathematcal Models & Applcatons Vol.3, No.3, 2018, pp.215-222 ON A DETERMINATION OF THE INITIAL FUNCTIONS FROM THE OBSERVED VALUES OF THE BOUNDARY FUNCTIONS FOR THE SECOND-ORDER HYPERBOLIC EUATION

More information

6.854J / J Advanced Algorithms Fall 2008

6.854J / J Advanced Algorithms Fall 2008 MIT OpenCourseWare http://ocw.mt.edu 6.854J / 18.415J Advanced Algorthms Fall 2008 For nformaton about ctng these materals or our Terms of Use, vst: http://ocw.mt.edu/terms. 18.415/6.854 Advanced Algorthms

More information

Approximate Smallest Enclosing Balls

Approximate Smallest Enclosing Balls Chapter 5 Approxmate Smallest Enclosng Balls 5. Boundng Volumes A boundng volume for a set S R d s a superset of S wth a smple shape, for example a box, a ball, or an ellpsod. Fgure 5.: Boundng boxes Q(P

More information

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS BOUNDEDNESS OF THE IESZ TANSFOM WITH MATIX A WEIGHTS Introducton Let L = L ( n, be the functon space wth norm (ˆ f L = f(x C dx d < For a d d matrx valued functon W : wth W (x postve sem-defnte for all

More information

Welfare Properties of General Equilibrium. What can be said about optimality properties of resource allocation implied by general equilibrium?

Welfare Properties of General Equilibrium. What can be said about optimality properties of resource allocation implied by general equilibrium? APPLIED WELFARE ECONOMICS AND POLICY ANALYSIS Welfare Propertes of General Equlbrum What can be sad about optmalty propertes of resource allocaton mpled by general equlbrum? Any crteron used to compare

More information

On the symmetric character of the thermal conductivity tensor

On the symmetric character of the thermal conductivity tensor On the symmetrc character of the thermal conductvty tensor Al R. Hadjesfandar Department of Mechancal and Aerospace Engneerng Unversty at Buffalo, State Unversty of New York Buffalo, NY 146 USA ah@buffalo.edu

More information

Lecture 12: Classification

Lecture 12: Classification Lecture : Classfcaton g Dscrmnant functons g The optmal Bayes classfer g Quadratc classfers g Eucldean and Mahalanobs metrcs g K Nearest Neghbor Classfers Intellgent Sensor Systems Rcardo Guterrez-Osuna

More information

Inexact Newton Methods for Inverse Eigenvalue Problems

Inexact Newton Methods for Inverse Eigenvalue Problems Inexact Newton Methods for Inverse Egenvalue Problems Zheng-jan Ba Abstract In ths paper, we survey some of the latest development n usng nexact Newton-lke methods for solvng nverse egenvalue problems.

More information

Graph Reconstruction by Permutations

Graph Reconstruction by Permutations Graph Reconstructon by Permutatons Perre Ille and Wllam Kocay* Insttut de Mathémathques de Lumny CNRS UMR 6206 163 avenue de Lumny, Case 907 13288 Marselle Cedex 9, France e-mal: lle@ml.unv-mrs.fr Computer

More information

On the correction of the h-index for career length

On the correction of the h-index for career length 1 On the correcton of the h-ndex for career length by L. Egghe Unverstet Hasselt (UHasselt), Campus Depenbeek, Agoralaan, B-3590 Depenbeek, Belgum 1 and Unverstet Antwerpen (UA), IBW, Stadscampus, Venusstraat

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

FINITELY-GENERATED MODULES OVER A PRINCIPAL IDEAL DOMAIN

FINITELY-GENERATED MODULES OVER A PRINCIPAL IDEAL DOMAIN FINITELY-GENERTED MODULES OVER PRINCIPL IDEL DOMIN EMMNUEL KOWLSKI Throughout ths note, s a prncpal deal doman. We recall the classfcaton theorem: Theorem 1. Let M be a fntely-generated -module. (1) There

More information

Some basic inequalities. Definition. Let V be a vector space over the complex numbers. An inner product is given by a function, V V C

Some basic inequalities. Definition. Let V be a vector space over the complex numbers. An inner product is given by a function, V V C Some basc nequaltes Defnton. Let V be a vector space over the complex numbers. An nner product s gven by a functon, V V C (x, y) x, y satsfyng the followng propertes (for all x V, y V and c C) (1) x +

More information

MEM 255 Introduction to Control Systems Review: Basics of Linear Algebra

MEM 255 Introduction to Control Systems Review: Basics of Linear Algebra MEM 255 Introducton to Control Systems Revew: Bascs of Lnear Algebra Harry G. Kwatny Department of Mechancal Engneerng & Mechancs Drexel Unversty Outlne Vectors Matrces MATLAB Advanced Topcs Vectors A

More information

A Local Variational Problem of Second Order for a Class of Optimal Control Problems with Nonsmooth Objective Function

A Local Variational Problem of Second Order for a Class of Optimal Control Problems with Nonsmooth Objective Function A Local Varatonal Problem of Second Order for a Class of Optmal Control Problems wth Nonsmooth Objectve Functon Alexander P. Afanasev Insttute for Informaton Transmsson Problems, Russan Academy of Scences,

More information

form, and they present results of tests comparng the new algorthms wth other methods. Recently, Olschowka & Neumaer [7] ntroduced another dea for choo

form, and they present results of tests comparng the new algorthms wth other methods. Recently, Olschowka & Neumaer [7] ntroduced another dea for choo Scalng and structural condton numbers Arnold Neumaer Insttut fur Mathematk, Unverstat Wen Strudlhofgasse 4, A-1090 Wen, Austra emal: neum@cma.unve.ac.at revsed, August 1996 Abstract. We ntroduce structural

More information

7. Products and matrix elements

7. Products and matrix elements 7. Products and matrx elements 1 7. Products and matrx elements Based on the propertes of group representatons, a number of useful results can be derved. Consder a vector space V wth an nner product ψ

More information

Economics 101. Lecture 4 - Equilibrium and Efficiency

Economics 101. Lecture 4 - Equilibrium and Efficiency Economcs 0 Lecture 4 - Equlbrum and Effcency Intro As dscussed n the prevous lecture, we wll now move from an envronment where we looed at consumers mang decsons n solaton to analyzng economes full of

More information

On the Multicriteria Integer Network Flow Problem

On the Multicriteria Integer Network Flow Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of

More information