Algebraic models for higher-order correlations
1 Algebraic models for higher-order correlations

Lek-Heng Lim and Jason Morton
U.C. Berkeley and Stanford Univ.
December 15, 2008

L.-H. Lim & J. Morton (MSRI Workshop), Algebraic models for higher-order correlations, December 15, 2008 (28 slides).
2 Tensors as hypermatrices

Up to a choice of bases on $U$, $V$, $W$, a tensor $A \in U \otimes V \otimes W$ may be represented as a hypermatrix
$$A = [a_{ijk}]_{i,j,k=1}^{l,m,n} \in \mathbb{R}^{l \times m \times n}$$
where $\dim(U) = l$, $\dim(V) = m$, $\dim(W) = n$, provided

1. we give it coordinates;
2. we ignore covariance and contravariance.

Henceforth, tensor = hypermatrix.
3 Multilinear matrix multiplication

Matrices can be multiplied on the left and right: for $A \in \mathbb{R}^{m \times n}$, $X \in \mathbb{R}^{p \times m}$, $Y \in \mathbb{R}^{q \times n}$,
$$C = (X, Y) \cdot A = XAY^\top \in \mathbb{R}^{p \times q}, \qquad c_{\alpha\beta} = \sum_{i,j=1}^{m,n} x_{\alpha i} y_{\beta j} a_{ij}.$$
3-tensors can be multiplied on three sides: for $A \in \mathbb{R}^{l \times m \times n}$, $X \in \mathbb{R}^{p \times l}$, $Y \in \mathbb{R}^{q \times m}$, $Z \in \mathbb{R}^{r \times n}$,
$$C = (X, Y, Z) \cdot A \in \mathbb{R}^{p \times q \times r}, \qquad c_{\alpha\beta\gamma} = \sum_{i,j,k=1}^{l,m,n} x_{\alpha i} y_{\beta j} z_{\gamma k} a_{ijk}.$$
These correspond to change-of-basis transformations for tensors. Define right (covariant) multiplication by $A \cdot (X, Y, Z) = (X^\top, Y^\top, Z^\top) \cdot A$.
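The contraction above maps directly onto an einsum call. This is our own NumPy sketch, not code from the talk; all names are ours.

```python
import numpy as np

# Multilinear matrix multiplication C = (X, Y, Z) . A for a 3-tensor A:
# c_{abc} = sum_{i,j,k} x_{ai} y_{bj} z_{ck} a_{ijk}
def multilinear_mult(X, Y, Z, A):
    return np.einsum('ai,bj,ck,ijk->abc', X, Y, Z, A)

rng = np.random.default_rng(0)
l, m, n = 4, 5, 6
A = rng.standard_normal((l, m, n))
X = rng.standard_normal((2, l))
Y = rng.standard_normal((3, m))
Z = rng.standard_normal((2, n))
C = multilinear_mult(X, Y, Z, A)
print(C.shape)  # (2, 3, 2)

# Sanity check: for matrices, the same pattern reduces to X B Y^T.
B = rng.standard_normal((m, n))
XB = rng.standard_normal((2, m))
YB = rng.standard_normal((3, n))
assert np.allclose(np.einsum('ai,bj,ij->ab', XB, YB, B), XB @ B @ YB.T)
```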
4 Symmetric tensors

A cubical tensor $[a_{ijk}] \in \mathbb{R}^{n \times n \times n}$ is symmetric if
$$a_{ijk} = a_{ikj} = a_{jik} = a_{jki} = a_{kij} = a_{kji}.$$
For order $p$: invariant under all permutations $\sigma \in S_p$ of the indices. $\mathsf{S}^p(\mathbb{R}^n)$ denotes the set of all order-$p$ symmetric tensors.

Symmetric multilinear matrix multiplication: $C = (X, X, X) \cdot A$, where
$$c_{\alpha\beta\gamma} = \sum_{i,j,k=1}^{n} x_{\alpha i} x_{\beta j} x_{\gamma k} a_{ijk}.$$
5 Examples of symmetric tensors

- Higher-order derivatives of real-valued multivariate functions.
- Moments of a vector-valued random variable $x = (x_1, \dots, x_n)$:
$$S_p(x) = \bigl[ E(x_{j_1} x_{j_2} \cdots x_{j_p}) \bigr]_{j_1,\dots,j_p=1}^{n}.$$
- Cumulants of a random vector $x = (x_1, \dots, x_n)$:
$$K_p(x) = \Bigl[ \sum_{A_1 \sqcup \cdots \sqcup A_q = \{j_1,\dots,j_p\}} (-1)^{q-1}(q-1)!\, E\Bigl(\prod\nolimits_{j \in A_1} x_j\Bigr) \cdots E\Bigl(\prod\nolimits_{j \in A_q} x_j\Bigr) \Bigr]_{j_1,\dots,j_p=1}^{n}.$$

$K_p(x)$ for $p = 1, 2, 3, 4$ gives the expectation, variance, skewness, and kurtosis.
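The moment tensor $S_p(x)$ can be estimated from samples by replacing the expectation with a sample mean; the symmetry claimed on this slide is then visible numerically. A NumPy sketch of ours (variable names hypothetical):

```python
import numpy as np
import itertools

# Sample estimate of the order-3 moment tensor S_3(x)[i,j,k] = E(x_i x_j x_k)
# from N observations stored as the rows of a data matrix X.
rng = np.random.default_rng(1)
N, n = 1000, 3
X = rng.standard_normal((N, n))
S3 = np.einsum('ti,tj,tk->ijk', X, X, X) / N

# S3 is symmetric: invariant under every permutation of its three indices.
for perm in itertools.permutations(range(3)):
    assert np.allclose(S3, S3.transpose(perm))
```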
6 Cumulants

In terms of the log characteristic and cumulant generating functions,
$$\kappa_{j_1 \cdots j_p}(x) = \frac{\partial^p}{\partial t_{j_1} \cdots \partial t_{j_p}} \log E(\exp\langle t, x\rangle)\Big|_{t=0} = (-i)^p \frac{\partial^p}{\partial t_{j_1} \cdots \partial t_{j_p}} \log E(\exp(i\langle t, x\rangle))\Big|_{t=0}.$$
In terms of the Edgeworth expansion,
$$\log E(\exp(i\langle t, x\rangle)) = \sum_{|\alpha|=0}^{\infty} i^{|\alpha|} \kappa_\alpha(x) \frac{t^\alpha}{\alpha!}, \qquad \log E(\exp\langle t, x\rangle) = \sum_{|\alpha|=0}^{\infty} \kappa_\alpha(x) \frac{t^\alpha}{\alpha!},$$
where $\alpha = (\alpha_1, \dots, \alpha_n)$ is a multi-index, $t^\alpha = t_1^{\alpha_1} \cdots t_n^{\alpha_n}$, and $\alpha! = \alpha_1! \cdots \alpha_n!$.

For each $x$, $K_p(x) = [\kappa_{j_1 \cdots j_p}(x)] \in \mathsf{S}^p(\mathbb{R}^n)$ is a symmetric tensor. [Fisher, Wishart; 1932]
7 Properties of cumulants

- Multilinearity: if $x$ is an $\mathbb{R}^n$-valued random variable and $A \in \mathbb{R}^{m \times n}$, then
$$K_p(Ax) = (A, \dots, A) \cdot K_p(x).$$
- Independence: if $x_1, \dots, x_k$ are mutually independent of $y_1, \dots, y_k$, then
$$K_p(x_1 + y_1, \dots, x_k + y_k) = K_p(x_1, \dots, x_k) + K_p(y_1, \dots, y_k).$$
If $I$ and $J$ partition $\{j_1, \dots, j_p\}$ so that $x_I$ and $x_J$ are independent, then $\kappa_{j_1 \cdots j_p}(x) = 0$.
- Gaussian: if $x$ is multivariate normal, then $K_p(x) = 0$ for all $p \ge 3$.
- Support: there is no distribution for which
$$K_p(x) \begin{cases} = 0 & 3 \le p \le n, \\ \ne 0 & p > n. \end{cases}$$
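For $p = 2$ the multilinearity property is the familiar rule $\operatorname{Cov}(Ax) = A \operatorname{Cov}(x) A^\top$, and it holds exactly even for sample covariances. A quick numerical check of ours:

```python
import numpy as np

# Multilinearity of the second cumulant: K_2(Ax) = (A, A) . K_2(x) = A K_2(x) A^T.
rng = np.random.default_rng(2)
X = rng.standard_normal((500, 4))        # 500 samples of x in R^4, one per row
A = rng.standard_normal((3, 4))
K2_x = np.cov(X, rowvar=False)           # sample K_2(x)
K2_Ax = np.cov(X @ A.T, rowvar=False)    # sample K_2(Ax)
assert np.allclose(K2_Ax, A @ K2_x @ A.T)
```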
8 Estimation of cumulants

How do we estimate $K_p(x)$ given $n$ observations $x_1, \dots, x_n$ of $x$? The central moments and power sums are
$$\hat m_{ij\cdots} = \frac{1}{n}\sum_{t=1}^n (x_{ti} - \bar x_i)(x_{tj} - \bar x_j)\cdots, \qquad \hat s_{ij\cdots} = \sum_{t=1}^n x_{ti} x_{tj} \cdots,$$
with $\hat m_i = \bar x_i$. The cumulant estimators $\hat K_p(x)$ for $p = 1, 2, 3, 4$ are given by
$$\hat\kappa_i = \hat m_i = \tfrac{1}{n}\hat s_i,$$
$$\hat\kappa_{ij} = \tfrac{n}{n-1}\hat m_{ij} = \tfrac{1}{n-1}\bigl(\hat s_{ij} - \tfrac{1}{n}\hat s_i \hat s_j\bigr),$$
$$\hat\kappa_{ijk} = \tfrac{n^2}{(n-1)(n-2)}\hat m_{ijk} = \tfrac{n}{(n-1)(n-2)}\bigl[\hat s_{ijk} - \tfrac{1}{n}(\hat s_i\hat s_{jk} + \hat s_j\hat s_{ik} + \hat s_k\hat s_{ij}) + \tfrac{2}{n^2}\hat s_i\hat s_j\hat s_k\bigr],$$
$$\hat\kappa_{ijkl} = \tfrac{n^2}{(n-1)(n-2)(n-3)}\bigl[(n+1)\hat m_{ijkl} - (n-1)(\hat m_{ij}\hat m_{kl} + \hat m_{ik}\hat m_{jl} + \hat m_{il}\hat m_{jk})\bigr]$$
$$= \tfrac{n}{(n-1)(n-2)(n-3)}\Bigl[(n+1)\hat s_{ijkl} - \tfrac{n+1}{n}(\hat s_i\hat s_{jkl} + \hat s_j\hat s_{ikl} + \hat s_k\hat s_{ijl} + \hat s_l\hat s_{ijk}) - \tfrac{n-1}{n}(\hat s_{ij}\hat s_{kl} + \hat s_{ik}\hat s_{jl} + \hat s_{il}\hat s_{jk})$$
$$\quad + \tfrac{2}{n}(\hat s_i\hat s_j\hat s_{kl} + \hat s_i\hat s_k\hat s_{jl} + \hat s_i\hat s_l\hat s_{jk} + \hat s_j\hat s_k\hat s_{il} + \hat s_j\hat s_l\hat s_{ik} + \hat s_k\hat s_l\hat s_{ij}) - \tfrac{6}{n^2}\hat s_i\hat s_j\hat s_k\hat s_l\Bigr].$$
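On the diagonal ($i = j = k = l$) these estimators reduce to the classical univariate k-statistics, which can be coded in a few lines. A sketch of ours (function name hypothetical); the second-order estimate should coincide with the unbiased sample variance:

```python
import numpy as np

def k_statistics(x):
    """Unbiased cumulant estimates (univariate k-statistics) k1..k4,
    the diagonal entries of the multivariate estimators above."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()
    m2, m3, m4 = (d**2).mean(), (d**3).mean(), (d**4).mean()
    k1 = x.mean()
    k2 = n / (n - 1) * m2
    k3 = n**2 / ((n - 1) * (n - 2)) * m3
    k4 = n**2 * ((n + 1) * m4 - 3 * (n - 1) * m2**2) / ((n - 1) * (n - 2) * (n - 3))
    return k1, k2, k3, k4

x = np.random.default_rng(3).standard_normal(200)
k1, k2, k3, k4 = k_statistics(x)
assert np.isclose(k2, np.var(x, ddof=1))   # k2 is the unbiased variance
```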
9 Factor analysis

Linear generative model
$$y = As + \varepsilon,$$
with noise $\varepsilon \in \mathbb{R}^m$, factor loadings $A \in \mathbb{R}^{m \times r}$, hidden factors $s \in \mathbb{R}^r$, and observed data $y \in \mathbb{R}^m$.

We do not know $A$, $s$, $\varepsilon$, but need to recover $s$, and sometimes $A$, from multiple observations of $y$.

From a time series of observations we get matrices $Y = [y_1, \dots, y_n]$, $S = [s_1, \dots, s_n]$, $E = [\varepsilon_1, \dots, \varepsilon_n]$, and $Y = AS + E$.

Factor analysis: recover $A$ and $S$ from $Y$ by a low-rank matrix approximation $Y \approx AS$.
10 Principal and independent components analysis

Principal components analysis: $s$ Gaussian,
$$\hat K_2(y) = Q\Lambda_2 Q^\top = (Q, Q) \cdot \Lambda_2,$$
with $\Lambda_2 \approx \hat K_2(s)$ a diagonal matrix and $Q \in \mathrm{O}(n, r)$. [Pearson; 1901]

Independent components analysis: $s$ with statistically independent entries, $\varepsilon$ Gaussian,
$$\hat K_p(y) = (Q, \dots, Q) \cdot \Lambda_p, \quad p = 2, 3, \dots,$$
with $\Lambda_p \approx \hat K_p(s)$ a diagonal tensor and $Q \in \mathrm{O}(n, r)$. [Comon; 1994]

What if
- $s$ is not Gaussian, e.g. power-law distributed data in social networks;
- $s$ is not independent, e.g. functional components in neuroimaging;
- $\varepsilon$ is not white noise, e.g. idiosyncratic factors in financial modelling?
11 Principal cumulant components analysis

Note that if $\varepsilon = 0$, then $K_p(y) = K_p(Qs) = (Q, \dots, Q) \cdot K_p(s)$.

In general, we want principal components that account for variation in all cumulants simultaneously:
$$\min_{Q \in \mathrm{O}(n,r),\; C_p \in \mathsf{S}^p(\mathbb{R}^r)} \sum_{p=1}^{\infty} \alpha_p \bigl\| \hat K_p(y) - (Q, \dots, Q) \cdot C_p \bigr\|_F^2,$$
where $C_p \approx \hat K_p(s)$ is not necessarily diagonal.

Appears intractable: optimization over the infinite-dimensional manifold $\mathrm{O}(n,r) \times \prod_{p=1}^{\infty} \mathsf{S}^p(\mathbb{R}^r)$.

Surprising relaxation: optimization over a single Grassmannian $\mathrm{Gr}(n,r)$ of dimension $r(n-r)$,
$$\max_{Q \in \mathrm{Gr}(n,r)} \sum_{p=1}^{\infty} \alpha_p \bigl\| \hat K_p(y) \cdot (Q, \dots, Q) \bigr\|_F^2.$$
In practice the sum is truncated at $p = 3$ or $4$.
12 Geometric insights

- Secants of the Veronese variety in $\mathsf{S}^p(\mathbb{R}^n)$: not closed, not irreducible, difficult to study.
- Symmetric subspace variety in $\mathsf{S}^p(\mathbb{R}^n)$: closed, irreducible, easy to study.
- Stiefel manifold $\mathrm{O}(n, r)$: set of $n \times r$ real matrices with orthonormal columns; $\mathrm{O}(n, n) = \mathrm{O}(n)$, the usual orthogonal group.
- Grassmann manifold $\mathrm{Gr}(n, r)$: set of equivalence classes of $\mathrm{O}(n, r)$ under right multiplication by $\mathrm{O}(r)$.

Parameterization of the subspace variety in $\mathsf{S}^p(\mathbb{R}^n)$ via
$$\mathrm{Gr}(n, r) \times \mathsf{S}^p(\mathbb{R}^r) \to \mathsf{S}^p(\mathbb{R}^n).$$
More generally,
$$\mathrm{Gr}(n, r) \times \prod\nolimits_{p=1}^{\infty} \mathsf{S}^p(\mathbb{R}^r) \to \prod\nolimits_{p=1}^{\infty} \mathsf{S}^p(\mathbb{R}^n).$$
13 From Stiefel to Grassmann

Given $A \in \mathsf{S}^p(\mathbb{R}^n)$ and some $r \le n$, we want
$$\min_{X \in \mathrm{O}(n,r),\; C \in \mathsf{S}^p(\mathbb{R}^r)} \| A - (X, \dots, X) \cdot C \|_F.$$
Unlike approximation by secants of the Veronese, this subspace approximation problem always has a globally optimal solution. It is equivalent to
$$\max_{X \in \mathrm{O}(n,r)} \| (X^\top, \dots, X^\top) \cdot A \|_F = \max_{X \in \mathrm{O}(n,r)} \| A \cdot (X, \dots, X) \|_F.$$
The problem is defined on a Grassmannian since
$$\| A \cdot (X, \dots, X) \|_F = \| A \cdot (XQ, \dots, XQ) \|_F$$
for any $Q \in \mathrm{O}(r)$: only the subspace spanned by $X$ matters. Equivalent to
$$\max_{X \in \mathrm{Gr}(n,r)} \| A \cdot (X, \dots, X) \|_F.$$
Once we have an optimal $X \in \mathrm{Gr}(n, r)$, we may obtain $C \in \mathsf{S}^p(\mathbb{R}^r)$, up to $\mathrm{O}(r)$-equivalence, as $C = (X^\top, \dots, X^\top) \cdot A$.
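The equivalence between minimizing the residual and maximizing the projected norm rests on the identity $\|A - (X,\dots,X)\cdot C\|_F^2 = \|A\|_F^2 - \|C\|_F^2$ when $C = (X^\top,\dots,X^\top)\cdot A$ and $X$ has orthonormal columns. A numerical check of ours:

```python
import numpy as np

# For fixed X in O(n, r), the best core is C = (X^T, X^T, X^T) . A, and the
# residual satisfies ||A - (X,X,X).C||_F^2 = ||A||_F^2 - ||C||_F^2,
# so minimizing the residual = maximizing ||C||_F.
rng = np.random.default_rng(4)
n, r = 6, 3
A = rng.standard_normal((n, n, n))
X, _ = np.linalg.qr(rng.standard_normal((n, r)))     # X in O(n, r)
C = np.einsum('ia,jb,kc,ijk->abc', X, X, X, A)       # (X^T, X^T, X^T) . A
Ahat = np.einsum('ai,bj,ck,ijk->abc', X, X, X, C)    # (X, X, X) . C
resid2 = np.sum((A - Ahat)**2)
assert np.isclose(resid2, np.sum(A**2) - np.sum(C**2))
```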
14 Coordinate-cycling heuristics

Alternating least squares (i.e. Gauss-Seidel) is commonly used for optimizing
$$\Psi(X, Y, Z) = \| A \cdot (X, Y, Z) \|_F^2$$
for $A \in \mathbb{R}^{l \times m \times n}$, cycling between $X$, $Y$, $Z$ and solving a least-squares problem at each iteration.

What if $A \in \mathsf{S}^3(\mathbb{R}^n)$ and $\Phi(X) = \| A \cdot (X, X, X) \|_F^2$?

Present approach: disregard the symmetry of $A$, solve for $\Psi(X, Y, Z)$, and upon the final iteration set
$$X = Y = Z = (X + Y + Z)/3.$$
Better: L-BFGS on the Grassmannian.
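The coordinate-cycling heuristic can be sketched as follows. This is our own minimal HOOI-style implementation for illustration, not the authors' code; each cycle updates one factor from the leading singular vectors of a contracted unfolding.

```python
import numpy as np

def als_multilinear(A, r, iters=30, seed=0):
    """Gauss-Seidel sketch: cycle through X, Y, Z, increasing
    ||A . (X, Y, Z)||_F via an SVD of an unfolding at each step."""
    l, m, n = A.shape
    rng = np.random.default_rng(seed)
    Y, _ = np.linalg.qr(rng.standard_normal((m, r)))
    Z, _ = np.linalg.qr(rng.standard_normal((n, r)))
    for _ in range(iters):
        B = np.einsum('ijk,jb,kc->ibc', A, Y, Z).reshape(l, -1)
        X = np.linalg.svd(B, full_matrices=False)[0][:, :r]
        B = np.einsum('ijk,ia,kc->jac', A, X, Z).reshape(m, -1)
        Y = np.linalg.svd(B, full_matrices=False)[0][:, :r]
        B = np.einsum('ijk,ia,jb->kab', A, X, Y).reshape(n, -1)
        Z = np.linalg.svd(B, full_matrices=False)[0][:, :r]
    return X, Y, Z

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5, 5))
A = A + A.transpose(0, 2, 1) + A.transpose(1, 0, 2) + A.transpose(1, 2, 0) \
      + A.transpose(2, 0, 1) + A.transpose(2, 1, 0)   # symmetrize A
X, Y, Z = als_multilinear(A, r=2)
core = np.einsum('ia,jb,kc,ijk->abc', X, Y, Z, A)     # (X^T, Y^T, Z^T) . A
assert np.sum(core**2) <= np.sum(A**2) + 1e-9         # projection never grows norm
```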
15 Newton/quasi-Newton on a Grassmannian

Objective $\Phi : \mathrm{Gr}(n, r) \to \mathbb{R}$, $\Phi(X) = \| A \cdot (X, X, X) \|_F^2$. Let $T_X$ be the tangent space at $X \in \mathrm{Gr}(n, r)$; for $\Delta \in T_X \subset \mathbb{R}^{n \times r}$ we have $X^\top \Delta = 0$.

1. Compute the Grassmann gradient $\nabla\Phi \in T_X$.
2. Compute the Hessian, or update a Hessian approximation, $H : T_X \to T_X$.
3. At $X \in \mathrm{Gr}(n, r)$, solve $H\Delta = -\nabla\Phi$ for the search direction $\Delta$.
4. Update the iterate $X$: move along the geodesic from $X$ in the direction given by $\Delta$.

[Arias, Edelman, Smith; 1999], [Eldén, Savas; 2008], [Savas, Lim; 2008].
16 Picture

[Figure omitted in transcription.]
17 BFGS on Grassmannian

The BFGS update is
$$H_{k+1} = H_k - \frac{H_k s_k s_k^\top H_k}{s_k^\top H_k s_k} + \frac{y_k y_k^\top}{y_k^\top s_k},$$
where $s_k = x_{k+1} - x_k = t_k p_k$ and $y_k = \nabla f_{k+1} - \nabla f_k$.

On a Grassmannian, these vectors are defined at different points and belong to different tangent spaces.
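The defining invariant of this update is the secant condition $H_{k+1} s_k = y_k$, which follows directly from the two rank-one corrections. A NumPy check of ours:

```python
import numpy as np

# Euclidean BFGS update of a Hessian approximation H:
# H+ = H - (H s s^T H)/(s^T H s) + (y y^T)/(y^T s)
def bfgs_update(H, s, y):
    Hs = H @ s
    return H - np.outer(Hs, Hs) / (s @ Hs) + np.outer(y, y) / (y @ s)

rng = np.random.default_rng(5)
n = 6
H = np.eye(n)
s = rng.standard_normal(n)
y = s + 0.1 * rng.standard_normal(n)   # keeps y^T s > 0 (curvature condition)
H1 = bfgs_update(H, s, y)
assert np.allclose(H1 @ s, y)          # secant condition
assert np.allclose(H1, H1.T)           # symmetry is preserved
```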
18 Different ways of parallel transporting vectors

Let $X \in \mathrm{Gr}(n, r)$, $\Delta_1, \Delta_2 \in T_X$, and let $X(t)$ be the geodesic path along $\Delta_1$.

Parallel transport using global coordinates:
$$\Delta_2(t) = T_{\Delta_1}(t)\,\Delta_2.$$
We may also write $\Delta_1 = \mathcal{X} D_1$ and $\Delta_2 = \mathcal{X} D_2$, where $\mathcal{X}$ is a basis for $T_X$. Let $\mathcal{X}(t)$ be a basis for $T_{X(t)}$. Parallel transport using local coordinates:
$$\Delta_2(t) = \mathcal{X}(t) D_2.$$
19 Parallel transport in local coordinates

All transported tangent vectors have the same coordinate representation in the basis $\mathcal{X}(t)$ at all points on the path $X(t)$.
- Plus: no need to transport the gradient or the Hessian.
- Minus: need to compute $\mathcal{X}(t)$.

In global coordinates we compute
$$s_k = t_k T_k(t_k) p_k \in T_{k+1}, \qquad y_k = \nabla f_{k+1} - T_k(t_k) \nabla f_k \in T_{k+1},$$
$$T_k(t_k) H_k T_k^{-1}(t_k) : T_{k+1} \to T_{k+1},$$
and then apply the BFGS update
$$H_{k+1} = H_k - \frac{H_k s_k s_k^\top H_k}{s_k^\top H_k s_k} + \frac{y_k y_k^\top}{y_k^\top s_k}.$$
20 BFGS

Compact representation of BFGS in Euclidean space:
$$H_k = H_0 + \begin{bmatrix} S_k & H_0 Y_k \end{bmatrix} \begin{bmatrix} R_k^{-\top}(D_k + Y_k^\top H_0 Y_k)R_k^{-1} & -R_k^{-\top} \\ -R_k^{-1} & 0 \end{bmatrix} \begin{bmatrix} S_k^\top \\ Y_k^\top H_0 \end{bmatrix},$$
where
$$S_k = [s_0, \dots, s_{k-1}], \qquad Y_k = [y_0, \dots, y_{k-1}], \qquad D_k = \mathrm{diag}\bigl[s_0^\top y_0, \dots, s_{k-1}^\top y_{k-1}\bigr],$$
$$R_k = \begin{bmatrix} s_0^\top y_0 & s_0^\top y_1 & \cdots & s_0^\top y_{k-1} \\ 0 & s_1^\top y_1 & \cdots & s_1^\top y_{k-1} \\ & & \ddots & \vdots \\ 0 & & & s_{k-1}^\top y_{k-1} \end{bmatrix}.$$
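This compact formula (the Byrd-Nocedal-Schnabel representation of the inverse-Hessian BFGS matrix) can be checked against the sequential rank-two updates. Our own verification sketch, with hypothetical function names:

```python
import numpy as np

def inv_bfgs_update(H, s, y):
    # Standard inverse-BFGS update: H+ = V H V^T + rho s s^T, rho = 1/(y^T s).
    rho = 1.0 / (y @ s)
    V = np.eye(len(s)) - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

def compact_bfgs(H0, S, Y):
    # The compact representation from this slide.
    k = S.shape[1]
    SY = S.T @ Y
    R = np.triu(SY)                      # R_ij = s_i^T y_j for i <= j
    D = np.diag(np.diag(SY))
    Ri = np.linalg.inv(R)
    M = np.block([[Ri.T @ (D + Y.T @ H0 @ Y) @ Ri, -Ri.T],
                  [-Ri,                            np.zeros((k, k))]])
    W = np.hstack([S, H0 @ Y])
    return H0 + W @ M @ np.vstack([S.T, Y.T @ H0])

rng = np.random.default_rng(6)
n, k = 5, 3
H0 = np.eye(n)
G = rng.standard_normal((n, n))
M_spd = G @ G.T + n * np.eye(n)          # SPD "Hessian" so that s_j^T y_j > 0
S = rng.standard_normal((n, k))
Y = M_spd @ S
H_seq = H0.copy()
for j in range(k):
    H_seq = inv_bfgs_update(H_seq, S[:, j], Y[:, j])
H_compact = compact_bfgs(H0, S, Y)
assert np.allclose(H_seq, H_compact)
```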
21 L-BFGS

Limited-memory BFGS [Byrd et al; 1994]: replace $H_0$ by $\gamma_k I$ and keep only the $m$ most recent $s_j$ and $y_j$:
$$H_k = \gamma_k I + \begin{bmatrix} S_k & \gamma_k Y_k \end{bmatrix} \begin{bmatrix} R_k^{-\top}(D_k + \gamma_k Y_k^\top Y_k)R_k^{-1} & -R_k^{-\top} \\ -R_k^{-1} & 0 \end{bmatrix} \begin{bmatrix} S_k^\top \\ \gamma_k Y_k^\top \end{bmatrix},$$
where
$$S_k = [s_{k-m}, \dots, s_{k-1}], \qquad Y_k = [y_{k-m}, \dots, y_{k-1}], \qquad D_k = \mathrm{diag}\bigl[s_{k-m}^\top y_{k-m}, \dots, s_{k-1}^\top y_{k-1}\bigr],$$
$$R_k = \begin{bmatrix} s_{k-m}^\top y_{k-m} & s_{k-m}^\top y_{k-m+1} & \cdots & s_{k-m}^\top y_{k-1} \\ 0 & s_{k-m+1}^\top y_{k-m+1} & \cdots & s_{k-m+1}^\top y_{k-1} \\ & & \ddots & \vdots \\ 0 & & & s_{k-1}^\top y_{k-1} \end{bmatrix}.$$
22 L-BFGS on the Grassmannian

In each iteration, parallel transport the vectors in $S_k$ and $Y_k$, i.e. perform
$$\bar S_k = T S_k, \qquad \bar Y_k = T Y_k,$$
where $T$ is the transport matrix.

- No need to modify $R_k$ or $D_k$: $\langle u, v \rangle = \langle Tu, Tv \rangle$, where $u, v \in T_k$ and $Tu, Tv \in T_{k+1}$.
- $H_k$ is nonsingular while the true Hessian is singular. No problem: $T_k$ at $x_k$ is an invariant subspace of $H_k$, i.e. if $v \in T_k$ then $H_k v \in T_k$.

[Savas, Lim; 2008]
23 Convergence

Compares favorably with alternating least squares.
24 Higher order eigenfaces

Principal cumulant subspaces supplement the varimax subspace from PCA. Taking face recognition as an example, eigenfaces ($p = 2$) become skewfaces ($p = 3$) and kurtofaces ($p = 4$).

Eigenfaces: given an image-pixel matrix $A \in \mathbb{R}^{n \times m}$ with centered columns ($n$ pixels, $m$ images, $n \gg m$). Eigenvectors of the pixel-pixel covariance matrix $K_2^{\mathrm{pixel}} \in \mathsf{S}^2(\mathbb{R}^n)$ are the eigenfaces. For efficiency, compute the image-image covariance matrix $K_2^{\mathrm{image}} \in \mathsf{S}^2(\mathbb{R}^m)$ instead. The SVD $A = U\Sigma V^\top$ gives both implicitly:
$$K_2^{\mathrm{image}} = \frac{1}{n}(A^\top, A^\top) \cdot I_n = \frac{1}{n} A^\top A = \frac{1}{n} V \Lambda V^\top, \qquad K_2^{\mathrm{pixel}} = \frac{1}{m}(A, A) \cdot I_m = \frac{1}{m} A A^\top = \frac{1}{m} U \Lambda U^\top.$$
The orthonormal columns of $U$, eigenvectors of $m K_2^{\mathrm{pixel}} = AA^\top$, are the eigenfaces.
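One thin SVD does expose both factorizations, and it also shows how the eigenfaces $U$ are recovered from the small $m \times m$ eigenproblem via $U = AV\Sigma^{-1}$. A NumPy check of ours (illustrative sizes; we assume the columns of $A$ are already centered):

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 50, 8                      # n pixels >> m images
A = rng.standard_normal((n, m))   # stand-in for a centered image-pixel matrix
U, sv, Vt = np.linalg.svd(A, full_matrices=False)
Lam = sv**2

K_image = A.T @ A / n             # m x m: cheap to form
K_pixel = A @ A.T / m             # n x n: huge in practice, shown only for the check
assert np.allclose(K_image, Vt.T @ np.diag(Lam) @ Vt / n)
assert np.allclose(K_pixel, U @ np.diag(Lam) @ U.T / m)

# Eigenfaces from the small problem: U = A V Sigma^{-1}
assert np.allclose(U, A @ Vt.T / sv)
```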
25 Computing image and pixel skewness

We want to implicitly compute $K_3^{\mathrm{pixel}} \in \mathsf{S}^3(\mathbb{R}^n)$, the third cumulant tensor of the pixels (huge). We just need the projector $\Pi$ onto the subspace of skewfaces that best explains $K_3^{\mathrm{pixel}}$.

Let $A = U\Sigma V^\top$, $U \in \mathrm{O}(n, m)$, $\Sigma \in \mathbb{R}^{m \times m}$, $V \in \mathrm{O}(m)$. Then
$$K_3^{\mathrm{pixel}} = \frac{1}{m}(A, A, A) \cdot I_m = \frac{1}{m}(U, U, U) \cdot (\Sigma, \Sigma, \Sigma) \cdot (V^\top, V^\top, V^\top) \cdot I_m,$$
$$K_3^{\mathrm{image}} = \frac{1}{n}(A^\top, A^\top, A^\top) \cdot I_n = \frac{1}{n}(V, V, V) \cdot (\Sigma, \Sigma, \Sigma) \cdot (U^\top, U^\top, U^\top) \cdot I_n.$$
Here $I_m = [\delta_{ijk}] \in \mathsf{S}^3(\mathbb{R}^m)$ is the Kronecker delta tensor, i.e. $\delta_{ijk} = 1$ iff $i = j = k$ and $\delta_{ijk} = 0$ otherwise (likewise $I_n \in \mathsf{S}^3(\mathbb{R}^n)$).
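Contracting against the order-3 delta tensor just sums rank-one cubes of the columns of $A$, which is why the pixel skewness never has to be formed from an explicit $n \times n \times n$ delta. A small check of ours:

```python
import numpy as np

# (A, A, A) . I_m with I_m the order-3 Kronecker delta tensor equals
# sum_t a_t (x) a_t (x) a_t over the columns a_t of A.
rng = np.random.default_rng(8)
n, m = 6, 4
A = rng.standard_normal((n, m))
I3 = np.zeros((m, m, m))
for t in range(m):
    I3[t, t, t] = 1.0
K3 = np.einsum('ai,bj,ck,ijk->abc', A, A, A, I3) / m    # (A, A, A) . I_m / m
K3_direct = np.einsum('it,jt,kt->ijk', A, A, A) / m     # sum of rank-one cubes
assert np.allclose(K3, K3_direct)
```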
26 Computing skewmax projection

Define $\mathcal{A} \in \mathsf{S}^3(\mathbb{R}^m)$ by
$$\mathcal{A} = (\Sigma, \Sigma, \Sigma) \cdot (V^\top, V^\top, V^\top) \cdot I_m.$$
We want $Q \in \mathrm{O}(m, s)$ and a core tensor $C \in \mathsf{S}^3(\mathbb{R}^s)$, not necessarily diagonal, so that $\mathcal{A} \approx (Q, Q, Q) \cdot C$ and thus
$$K_3^{\mathrm{pixel}} \approx \frac{1}{m}(U, U, U) \cdot (Q, Q, Q) \cdot C = \frac{1}{m}(UQ, UQ, UQ) \cdot C.$$
Solve
$$\min_{Q \in \mathrm{O}(m,s),\; C \in \mathsf{S}^3(\mathbb{R}^s)} \| \mathcal{A} - (Q, Q, Q) \cdot C \|_F.$$
Then $\Pi = UQ \in \mathrm{O}(n, s)$ is our orthonormal-column projection matrix onto the skewmax subspace.

Caveat: $Q$ is only determined up to $\mathrm{O}(s)$-equivalence. Not a problem if we are just interested in the associated subspace or its projector.
27 Combining eigen-, skew-, and kurtofaces

Combine information from multiple cumulants:
- The same procedure works for the kurtosis tensor (a little more complicated).
- Say we keep the first $r$ eigenfaces (columns of $U$), $s$ skewfaces, and $t$ kurtofaces. Their span is our optimal subspace.
- These three subspaces may overlap; orthogonalize the resulting $r + s + t$ column vectors to get a final projector.

This gives an orthonormal projector basis $W$ for the column space of $A$:
- its first $r$ vectors best explain the pixel covariance $K_2^{\mathrm{pixel}} \in \mathsf{S}^2(\mathbb{R}^n)$;
- the next $s$ vectors, together with $W_{1:r}$, best explain the pixel skewness $K_3^{\mathrm{pixel}} \in \mathsf{S}^3(\mathbb{R}^n)$;
- the last $t$ vectors, together with $W_{1:r+s}$, best explain the pixel kurtosis $K_4^{\mathrm{pixel}} \in \mathsf{S}^4(\mathbb{R}^n)$.
28 Advertisement and acknowledgement

Jason Morton, Algebraic models for multilinear dependence, SAMSI Workshop on Algebraic Statistical Models, Research Triangle Park, NC, January 15-17.

Thanks: J.M. Landsberg, Berkant Savas.
More informationTutorial on Principal Component Analysis
Tutorial on Principal Component Analysis Copyright c 1997, 2003 Javier R. Movellan. This is an open source document. Permission is granted to copy, distribute and/or modify this document under the terms
More informationEE731 Lecture Notes: Matrix Computations for Signal Processing
EE731 Lecture Notes: Matrix Computations for Signal Processing James P. Reilly c Department of Electrical and Computer Engineering McMaster University September 22, 2005 0 Preface This collection of ten
More informationCOMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017
COMS 4721: Machine Learning for Data Science Lecture 19, 4/6/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University PRINCIPAL COMPONENT ANALYSIS DIMENSIONALITY
More informationLecture 9: Numerical Linear Algebra Primer (February 11st)
10-725/36-725: Convex Optimization Spring 2015 Lecture 9: Numerical Linear Algebra Primer (February 11st) Lecturer: Ryan Tibshirani Scribes: Avinash Siravuru, Guofan Wu, Maosheng Liu Note: LaTeX template
More informationSingular Value Decomposition and Principal Component Analysis (PCA) I
Singular Value Decomposition and Principal Component Analysis (PCA) I Prof Ned Wingreen MOL 40/50 Microarray review Data per array: 0000 genes, I (green) i,i (red) i 000 000+ data points! The expression
More informationSolving linear equations with Gaussian Elimination (I)
Term Projects Solving linear equations with Gaussian Elimination The QR Algorithm for Symmetric Eigenvalue Problem The QR Algorithm for The SVD Quasi-Newton Methods Solving linear equations with Gaussian
More informationGeneric properties of Symmetric Tensors
2006 1/48 PComon Generic properties of Symmetric Tensors Pierre COMON - CNRS other contributors: Bernard MOURRAIN INRIA institute Lek-Heng LIM, Stanford University 2006 2/48 PComon Tensors & Arrays Definitions
More information14 Singular Value Decomposition
14 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing
More informationFocus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations.
Previously Focus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations y = Ax Or A simply represents data Notion of eigenvectors,
More informationReview (Probability & Linear Algebra)
Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint
More informationFoundations of Matrix Analysis
1 Foundations of Matrix Analysis In this chapter we recall the basic elements of linear algebra which will be employed in the remainder of the text For most of the proofs as well as for the details, the
More informationLINEAR ALGEBRA REVIEW
LINEAR ALGEBRA REVIEW JC Stuff you should know for the exam. 1. Basics on vector spaces (1) F n is the set of all n-tuples (a 1,... a n ) with a i F. It forms a VS with the operations of + and scalar multiplication
More informationAlgebra C Numerical Linear Algebra Sample Exam Problems
Algebra C Numerical Linear Algebra Sample Exam Problems Notation. Denote by V a finite-dimensional Hilbert space with inner product (, ) and corresponding norm. The abbreviation SPD is used for symmetric
More informationProblems in Linear Algebra and Representation Theory
Problems in Linear Algebra and Representation Theory (Most of these were provided by Victor Ginzburg) The problems appearing below have varying level of difficulty. They are not listed in any specific
More informationUNIT 6: The singular value decomposition.
UNIT 6: The singular value decomposition. María Barbero Liñán Universidad Carlos III de Madrid Bachelor in Statistics and Business Mathematical methods II 2011-2012 A square matrix is symmetric if A T
More informationPHYS 705: Classical Mechanics. Rigid Body Motion Introduction + Math Review
1 PHYS 705: Classical Mechanics Rigid Body Motion Introduction + Math Review 2 How to describe a rigid body? Rigid Body - a system of point particles fixed in space i r ij j subject to a holonomic constraint:
More informationChapter 7 Iterative Techniques in Matrix Algebra
Chapter 7 Iterative Techniques in Matrix Algebra Per-Olof Persson persson@berkeley.edu Department of Mathematics University of California, Berkeley Math 128B Numerical Analysis Vector Norms Definition
More informationGatsby Theoretical Neuroscience Lectures: Non-Gaussian statistics and natural images Parts I-II
Gatsby Theoretical Neuroscience Lectures: Non-Gaussian statistics and natural images Parts I-II Gatsby Unit University College London 27 Feb 2017 Outline Part I: Theory of ICA Definition and difference
More information1 Cricket chirps: an example
Notes for 2016-09-26 1 Cricket chirps: an example Did you know that you can estimate the temperature by listening to the rate of chirps? The data set in Table 1 1. represents measurements of the number
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 5: Projectors and QR Factorization Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 14 Outline 1 Projectors 2 QR Factorization
More informationIndex Notation for Vector Calculus
Index Notation for Vector Calculus by Ilan Ben-Yaacov and Francesc Roig Copyright c 2006 Index notation, also commonly known as subscript notation or tensor notation, is an extremely useful tool for performing
More informationThe QR Factorization
The QR Factorization How to Make Matrices Nicer Radu Trîmbiţaş Babeş-Bolyai University March 11, 2009 Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 1 / 25 Projectors A projector
More informationCIFAR Lectures: Non-Gaussian statistics and natural images
CIFAR Lectures: Non-Gaussian statistics and natural images Dept of Computer Science University of Helsinki, Finland Outline Part I: Theory of ICA Definition and difference to PCA Importance of non-gaussianity
More information1 Singular Value Decomposition and Principal Component
Singular Value Decomposition and Principal Component Analysis In these lectures we discuss the SVD and the PCA, two of the most widely used tools in machine learning. Principal Component Analysis (PCA)
More informationAlgebra Workshops 10 and 11
Algebra Workshops 1 and 11 Suggestion: For Workshop 1 please do questions 2,3 and 14. For the other questions, it s best to wait till the material is covered in lectures. Bilinear and Quadratic Forms on
More informationMath 321: Linear Algebra
Math 32: Linear Algebra T. Kapitula Department of Mathematics and Statistics University of New Mexico September 8, 24 Textbook: Linear Algebra,by J. Hefferon E-mail: kapitula@math.unm.edu Prof. Kapitula,
More informationMatrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A =
30 MATHEMATICS REVIEW G A.1.1 Matrices and Vectors Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = a 11 a 12... a 1N a 21 a 22... a 2N...... a M1 a M2... a MN A matrix can
More informationMachine Learning - MT & 14. PCA and MDS
Machine Learning - MT 2016 13 & 14. PCA and MDS Varun Kanade University of Oxford November 21 & 23, 2016 Announcements Sheet 4 due this Friday by noon Practical 3 this week (continue next week if necessary)
More informationLecture: Face Recognition and Feature Reduction
Lecture: Face Recognition and Feature Reduction Juan Carlos Niebles and Ranjay Krishna Stanford Vision and Learning Lab 1 Recap - Curse of dimensionality Assume 5000 points uniformly distributed in the
More informationLecture Notes 1: Vector spaces
Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector
More informationto appear insiam Journal on Matrix Analysis and Applications.
to appear insiam Journal on Matrix Analysis and Applications. SYMMETRIC TENSORS AND SYMMETRIC TENSOR RANK PIERRE COMON, GENE GOLUB, LEK-HENG LIM, AND BERNARD MOURRAIN Abstract. A symmetric tensor is a
More information