Overlapping Variable Clustering with Statistical Guarantees and LOVE
1 Overlapping Variable Clustering with Statistical Guarantees and LOVE. Department of Statistical Science, Cornell University. WHOA-PSI, St. Louis, August 2017.
2 Joint work with Mike Bing, Yang Ning, and Marten Wegkamp (Cornell University, Department of Statistical Science).
3 Variable clustering. What is variable clustering? Observable: $X = (X_1, \dots, X_j, \dots, X_p) \in \mathbb{R}^p$, a random vector. Data: $X^{(1)}, \dots, X^{(n)}$, i.i.d. copies of $X \in \mathbb{R}^p$. Goal of variable clustering: find sub-groups of similar coordinates of $X$, using the data. This goal differs from data/point clustering, which finds sub-groups of similar observations $X^{(i)}$, $1 \le i \le n$. The data also differ from network clustering, where the data are a 0/1 adjacency matrix.
4 Co-clustering genes using expression profiles. [Figure: example of co-clustered genes, labeled by ENSG identifiers.]
5 Model-based clustering. Objectives of model-based (overlapping) variable clustering: define a model-based similarity between the coordinates of $X$. The model definition depends crucially on what we want to cluster and on the type of data we have: here we cluster variables, and we observe their values. Use an identifiable model to define clusters of coordinates, allowing for overlap. Estimate the clusters and assess their accuracy theoretically, within the model-based framework.
6 A first step towards a model for overlapping clustering. A sparse latent variable model with unstructured sparsity: (1) $X = AZ + E$; $A$ is a $p \times K$ allocation matrix. (2) $Z \in \mathbb{R}^K$ is a latent vector, $E \in \mathbb{R}^p$ a noise vector; $Z \perp E$. (3) $A$ is row sparse: $\sum_{k=1}^K |A_{jk}| \le 1$ for each $j \in \{1, \dots, p\}$. Variable similarity and clusters: $X_j$ and $X_l$ are similar if they connect to the same $Z_k$. This suggests a definition for clusters with overlap: $G_k := \{ j \in \{1, \dots, p\} : A_{jk} \neq 0 \}$. Issue: the model and the clusters are not identifiable: $AZ = (AQ)(Q^T Z)$ for any orthogonal $Q$, and $A_{jk}$ may be $0$ while $(AQ)_{jk}$ is not.
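To make the model concrete, here is a minimal NumPy sketch (not from the talk) that simulates data from $X = AZ + E$; the allocation matrix $A$, the covariance $C$, and the noise scale below are illustrative assumptions chosen so that each row satisfies $\sum_k |A_{jk}| \le 1$ and each column has at least two pure rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, K = 500, 9, 3

# Illustrative allocation matrix: the first six rows are pure
# (a single +/-1 entry); the last three rows are impure (overlapping).
# Every row satisfies sum_k |A_jk| <= 1.
A = np.array([
    [1.0,  0.0,  0.0],
    [1.0,  0.0,  0.0],
    [0.0,  1.0,  0.0],
    [0.0, -1.0,  0.0],
    [0.0,  0.0,  1.0],
    [0.0,  0.0,  1.0],
    [0.5,  0.5,  0.0],    # impure: belongs to clusters G_1 and G_2
    [0.25, 0.25, 0.5],    # impure: belongs to all three clusters
    [0.0,  0.5, -0.5],    # impure, with a negative (inhibitory) loading
])

C = np.array([[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.2],
              [0.1, 0.2, 1.0]])     # Cov(Z)

Z = rng.multivariate_normal(np.zeros(K), C, size=n)  # latent factors
E = 0.1 * rng.standard_normal((n, p))                # noise, independent of Z
X = Z @ A.T + E                                      # n x p data matrix

Sigma_hat = np.cov(X, rowvar=False)                  # input for estimation
```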
7 Identifiable models for overlapping clustering. $X = AZ + E$; $X \in \mathbb{R}^p$, $Z \in \mathbb{R}^K$ (latent), $A \in \mathbb{R}^{p \times K}$. A sparse latent variable model with structured sparsity, the pure variable assumption: (i) $A$ is row sparse: $\sum_{k=1}^K |A_{jk}| \le 1$ for each $j \in \{1, \dots, p\}$. (ii) For every column $k \in \{1, \dots, K\}$, there exist at least two rows $j \in \{1, \dots, p\}$ such that $|A_{jk}| = 1$ and $A_{jl} = 0$ for all $l \neq k$. Spoiler alert! This $A$ is identifiable up to signed permutations.
8 The pure variable assumption: interpretation. Cluster: $G_k := \{ j \in \{1, \dots, p\} : A_{jk} \neq 0 \}$. The pure variable assumption: a pure variable $X_j$ associates with only one latent factor $Z_k$. Pure variables are crucial in building overlapping clusters: cluster $G_k$ is defined by $Z_k$, which is not observable; $Z_k$ represents a (biological) function. A pure variable $X_j$ is an observable proxy for $Z_k$: the observable $X_j$ performs function $Z_k$, and it anchors $G_k$.
9 Overlapping clustering: interpretation. An instance of determining unknown functions of variables: Gene 1 ($X_1$), with function 1, anchors $G_1$. Gene 3 ($X_3$), with function 2, anchors $G_2$. Gene 2 ($X_2$) $\in G_1$: Gene 2 performs function 1. Gene 2 ($X_2$) $\in G_2$: Gene 2 also performs function 2. Before clustering, Gene 2 had unknown function; after clustering, Gene 2 is found to have a dual function.
10 Identifiable models for overlapping clustering. Ingredients for identifiability: a latent variable model $X = AZ + E$ with structure on $A$ (the pure variable assumption), plus a mild assumption on $C = \mathrm{Cov}(Z)$: $\Delta(C) := \min_{j \neq k} \big( \min\{C_{jj}, C_{kk}\} - C_{jk} \big) > 0$. Note that $\Delta(C) > 0 \implies Z_j \neq Z_k$ a.s., for all $j \neq k$.
11 Identifiability in structured sparse latent models. The pure variable set $I$ is identifiable: $I$ and its partition $\mathcal{I} := \{I_k\}_{1 \le k \le K}$ can be constructed uniquely from $\Sigma$, up to label permutations. The allocation matrix $A$ is identifiable: under the pure variable assumption, there exists a unique matrix $A$, up to signed permutations, such that $X = AZ + E$. The clusters $G = \{G_k\}_{1 \le k \le K}$ are identifiable: under the pure variable assumption, the overlapping clusters $G_k$ are identifiable, up to label switching. If pure variables do not exist, identifiability of $A$ fails.
12 Central challenge in proving identifiability. $X = AZ + E \implies \Sigma = ACA^T + \Gamma$. $I$ = pure variable index set; $J = \{1, \dots, p\} \setminus I$ = impure variable index set. Central challenge: how to distinguish between $I$ and $J$? Added challenge: how to distinguish between $I$ and $J$ when we do not know the noise $\Gamma$?
13 What is pure and what is impure? A necessary and sufficient condition for purity. For each $1 \le i \le p$, set $M_i := \max_{j \in [p] \setminus \{i\}} |\Sigma_{ij}|$ and $S_i := \{ j \in [p] \setminus \{i\} : |\Sigma_{ij}| = M_i \}$. For a given $A$ and its induced pure variable set $I$, we have $i \in I \iff M_i = \max_{k \in [p] \setminus \{j\}} |\Sigma_{jk}|$ for all $j \in S_i$.
14 Look for maxima in $\Sigma$, ignoring the diagonal! $\mathcal{I} = \{\{1, 2, 3\}, \{4, 5\}, \{6, 7\}\}$ and $J = \{8, 9\}$. [A $9 \times 9$ covariance matrix $\Sigma$ is displayed; visible entries include $\pm 1$, $\pm 1/2$, $2/3$, $1/3$, and $1/6$.] $M_1 = \max_{k \neq 1} |\Sigma_{1k}| = 1$. $S_1 = \{j \neq 1 : |\Sigma_{1j}| = 1\} = \{2, 3\}$. Check: $1 = M_1 = \max_{k \neq 2} |\Sigma_{2k}|$ and $1 = M_1 = \max_{k \neq 3} |\Sigma_{3k}|$. Hence $I_1 = S_1 \cup \{1\} = \{1, 2, 3\}$: pure.
15 Look for maxima in $\Sigma$, ignoring the diagonal! $\mathcal{I} = \{\{1, 2, 3\}, \{4, 5\}, \{6, 7\}\}$ and $J = \{8, 9\}$. [The same $9 \times 9$ matrix $\Sigma$ is displayed.] $M_8 = \max_{k \neq 8} |\Sigma_{8k}| = 1$. $S_8 = \{j \neq 8 : |\Sigma_{8j}| = 1\} = \{4, 5\}$. But $1 = M_8 \neq \max_{k \neq 4} |\Sigma_{4k}| = 2$, so 8 cannot be pure: $8 \in J$.
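The purity test worked through on the two slides above is easy to mechanize. Below is a small Python sketch of the population-level criterion; the function name is hypothetical, indices are 0-based, and the test is stated exactly as on the slides (exact equality of maxima, diagonal ignored).

```python
import numpy as np

def is_pure(Sigma: np.ndarray, i: int) -> bool:
    """Population-level purity test from the slides:
    i is pure iff M_i = max_{k != j} |Sigma_jk| for every j in S_i."""
    off = np.abs(Sigma).astype(float).copy()
    np.fill_diagonal(off, -np.inf)           # ignore the diagonal
    M_i = off[i].max()
    S_i = np.flatnonzero(off[i] == M_i)      # the argmax set S_i
    return all(off[j].max() == M_i for j in S_i)
```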
16 Estimation. Estimate $I$. Estimate $A_I$, the $|I| \times K$ sub-matrix of $A$ with rows in $I$. Estimate $A_J$, with $J = \{1, \dots, p\} \setminus I$. Estimate the clusters $G_k$.
17 Estimation of the pure variable set. Reminder: $I$ = pure variable set. Constructive characterization of $I$, population version: for each $1 \le i \le p$, set $M_i := \max_{j \in [p] \setminus \{i\}} |\Sigma_{ij}|$ and $S_i := \{ j \in [p] \setminus \{i\} : |\Sigma_{ij}| = M_i \}$. For a given $A$ and its induced pure variable set $I$, we have $i \in I \iff M_i = \max_{k \in [p] \setminus \{j\}} |\Sigma_{jk}|$ for all $j \in S_i$. Moreover, $S_i \cup \{i\} = I_k$ for some $k$, where $\mathcal{I} := \{I_k\}_{1 \le k \le K}$ is a partition of $I$.
18 Estimation of the pure variable set. Algorithm idea: use the constructive characterization of $I$ at the population level; replace $\Sigma$ by the sample covariance $\hat{\Sigma}$; replace equalities by inequalities, allowing for a tolerance level $\delta := \|\hat{\Sigma} - \Sigma\|_\infty$. The algorithm has complexity $O(p^2)$, requires inputs $\hat{\Sigma}$ and $\delta$, and returns $\hat{I}$, its partition $\hat{\mathcal{I}}$, and therefore $\hat{K}$. A sketch of this search is given below.
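The following Python sketch implements the idea just described: the population equalities become $2\delta$-tolerant inequalities on $|\hat{\Sigma}|$. The function name and the exact thresholding rule are assumptions for illustration; the published LOVE algorithm may differ in its details.

```python
import numpy as np

def estimate_pure_set(Sigma_hat: np.ndarray, delta: float):
    """Sketch of the O(p^2) pure-variable search: for each candidate i,
    collect near-maximizers S_i and accept i as pure if every j in S_i
    attains (up to 2*delta) the same off-diagonal maximum."""
    p = Sigma_hat.shape[0]
    off = np.abs(Sigma_hat).copy()
    np.fill_diagonal(off, -np.inf)            # ignore the diagonal
    groups, assigned = [], set()
    for i in range(p):
        if i in assigned:
            continue
        M_i = off[i].max()
        S_i = [j for j in range(p) if off[i, j] >= M_i - 2 * delta]
        if all(abs(off[j].max() - M_i) <= 2 * delta for j in S_i):
            group = sorted(set(S_i) | {i})    # estimated group I_k
            groups.append(group)
            assigned.update(group)
    # groups is the partition \hat{I} = {\hat I_1, ..., \hat I_K}; K_hat = len(groups)
    return groups
```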
19 Estimation of the allocation sub-matrix $A_I$, where $A = \begin{bmatrix} A_I \\ A_J \end{bmatrix}$. Estimated previously: $\hat{I}$ and its partition $\hat{\mathcal{I}} = \{\hat{I}_1, \dots, \hat{I}_{\hat{K}}\}$. The estimator $\hat{A}_{\hat{I}}$ has rows $i \in \hat{I}$ consisting of $\hat{K} - 1$ zeros and one entry equal to either $+1$ or $-1$. Signs are determined up to signed permutations: (1) pick $i \in \hat{I}_k$ and pick a sign for $\hat{A}_{ik}$, say $\hat{A}_{ik} = 1$; (2) for any $j \in \hat{I}_k \setminus \{i\}$, set $\hat{A}_{jk} = +1$ if $\hat{\Sigma}_{ij} > 0$ and $\hat{A}_{jk} = -1$ if $\hat{\Sigma}_{ij} < 0$. A sketch of this step follows.
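This step is a direct translation of the two rules above. The helper below (hypothetical name, continuing the earlier sketches) fixes the sign of one anchor variable per group and copies the remaining signs from $\hat{\Sigma}$.

```python
import numpy as np

def estimate_A_I(Sigma_hat: np.ndarray, pure_groups):
    """Fill in the rows of A for estimated pure variables: one anchor per
    group gets +1 by convention; other group members copy the sign of
    their sample covariance with the anchor."""
    p, K = Sigma_hat.shape[0], len(pure_groups)
    A_hat = np.zeros((p, K))
    for k, group in enumerate(pure_groups):
        anchor = group[0]
        A_hat[anchor, k] = 1.0                 # sign fixed by convention
        for j in group[1:]:
            A_hat[j, k] = 1.0 if Sigma_hat[anchor, j] > 0 else -1.0
    return A_hat
```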
20 Estimation of the allocation sub-matrix $A_J$. $\Sigma = \begin{bmatrix} \Sigma_{II} & \Sigma_{IJ} \\ \Sigma_{JI} & \Sigma_{JJ} \end{bmatrix} = \begin{bmatrix} A_I C A_I^T & A_I C A_J^T \\ A_J C A_I^T & A_J C A_J^T \end{bmatrix} + \begin{bmatrix} \Gamma_{II} & 0 \\ 0 & \Gamma_{JJ} \end{bmatrix}$. Estimate $A_J$ row by row; motivation: $\Sigma_{IJ} = A_I C A_J^T$, hence $\theta_j = C A_j$ for each $j \in J$, where $\theta_{jk} := \frac{1}{|I_k|} \sum_{i \in I_k} A_{ik} \Sigma_{ij}$, $C_{kk} := \frac{1}{|I_k|(|I_k|-1)} \sum_{i, j \in I_k,\, i \neq j} A_{ik} A_{jk} \Sigma_{ij}$, and $C_{km} := \frac{1}{|I_k| |I_m|} \sum_{i \in I_k,\, j \in I_m} A_{ik} A_{jm} \Sigma_{ij}$.
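The block averages above plug in directly. The sketch below (hypothetical helper, building on the earlier ones) estimates $C$ from the pure-variable blocks of $\hat{\Sigma}$; the sign corrections $\hat{A}_{ik}\hat{A}_{jm}$ are my reading of the garbled formulas, consistent with $\hat{A}$ having $\pm 1$ entries on pure rows.

```python
import numpy as np

def estimate_C(Sigma_hat: np.ndarray, A_hat: np.ndarray, pure_groups):
    """Estimate C = Cov(Z) by averaging sign-corrected sample covariances
    over the estimated pure-variable blocks."""
    K = len(pure_groups)
    C_hat = np.zeros((K, K))
    for k, Ik in enumerate(pure_groups):
        # diagonal entry: average over distinct pairs within the block
        s = sum(A_hat[i, k] * A_hat[j, k] * Sigma_hat[i, j]
                for i in Ik for j in Ik if i != j)
        C_hat[k, k] = s / (len(Ik) * (len(Ik) - 1))
        # off-diagonal entries: average over cross-block pairs
        for m in range(k + 1, K):
            Im = pure_groups[m]
            s = sum(A_hat[i, k] * A_hat[j, m] * Sigma_hat[i, j]
                    for i in Ik for j in Im)
            C_hat[k, m] = C_hat[m, k] = s / (len(Ik) * len(Im))
    return C_hat
```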
21 Estimation of the allocation sub-matrix $A_J$. Estimation of the rows of $A_J$: under the model, $\theta_j = C A_j$ with $A_j$ sparse. Available: $\hat{\Sigma}$ and the estimated partition of pure variables $\hat{\mathcal{I}}$. Use $\hat{\Sigma}$ and $\hat{\mathcal{I}}$ to construct $\hat{\theta}_j \approx \theta_j$ and $\hat{C} \approx C$. There are many choices for estimating the sparse $A_j$; a Dantzig-type program: minimize $\|\beta\|_1$ over $\beta \in \mathbb{R}^{\hat{K}}$ such that $\|\hat{\theta}_j - \hat{C}\beta\| \le 2\delta$. Repeat for each $j \in \hat{J}$ to obtain $\hat{A}_{\hat{J}}$.
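The Dantzig-type program is a linear program after the standard lift $\beta = u - v$ with $u, v \ge 0$. The sketch below assumes the constraint uses the sup-norm (the norm symbol was lost in the transcription); the function name is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def dantzig_row(theta_j: np.ndarray, C_hat: np.ndarray, delta: float):
    """One row of A_J: minimize ||beta||_1 subject to
    ||theta_j - C_hat @ beta||_inf <= 2*delta, via an LP in (u, v)."""
    K = C_hat.shape[0]
    c = np.ones(2 * K)                    # objective: sum(u) + sum(v) = ||beta||_1
    M = np.hstack([C_hat, -C_hat])        # M @ (u, v) = C_hat @ (u - v)
    A_ub = np.vstack([M, -M])
    b_ub = np.concatenate([theta_j + 2 * delta, -theta_j + 2 * delta])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (2 * K))
    assert res.success, "LP infeasible: try a larger delta"
    u, v = res.x[:K], res.x[K:]
    return u - v
```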
22 Statistical guarantees: assumptions. Recall $\Sigma = ACA^T + \Gamma$, with $X$ sub-Gaussian and $\|\hat{\Sigma} - \Sigma\|_\infty =: \delta = O(\sqrt{(\log p)/n})$. Signal strength conditions: (1) either on $C$: $\Delta(C) = \min_{j \neq k} \big( \min\{C_{jj}, C_{kk}\} - C_{jk} \big) \ge c\delta$; (2) or on $A$: the smallest non-zero entry of $A$ is larger than $\delta$ in absolute value.
23 Estimation of the pure variable set: guarantees. $I$ = pure variables, $J = \{1, \dots, p\} \setminus I$. Quasi-pure variables: for each $k \in [K]$, $J_1^k = \{ i \in J : |A_{ik}| \ge 1 - 4\delta/\tau^2 \}$, and $J_1 = \bigcup_{k=1}^K J_1^k$. If $X_1$ is pure, then $|A_{1k}| = 1$ for some $k \in [K]$. If $X_2$ is quasi-pure, then $|A_{2m}| \approx 1$ for some $m \in [K]$.
24 Estimation of the pure variable set: guarantees. $J_1^k = \{i : |A_{ik}| \approx 1\}$; $J_1 := \bigcup_{k=1}^K J_1^k$; $I_k = \{i : |A_{ik}| = 1\}$. Recovery guarantees, with no signal strength conditions on $A$: (a) $\hat{K} = K$; (b) $I \subseteq \hat{I} \subseteq I \cup J_1$; (c) $I_k \subseteq \hat{I}_k \subseteq I_k \cup J_1^k$, w.h.p., for each $k \in [K]$. Minimal recovery mistakes, with no conditions on $A$: Pure $(1, 0, 0, 0, 0, 0)$: in, correct. Quasi-pure $(0.99, 0.01, 0, 0, 0, 0)$: in, slight mistake. Impure $(0.25, 0.25, 0.001, 0.099, 0.2, 0.2)$: out, correct.
25 Estimation of the pure variable set: guarantees. Exact recovery, under conditions on $A$: $\hat{I} = I$, up to label switching, with $I = \bigcup_{a=1}^K I_a$. Exact recovery, conditions on $A$: Pure $(1, 0, 0, 0, 0, 0)$: in, correct. Quasi-pure $(0.99, 0.01, 0, 0, 0, 0)$: not allowed. Impure I $(0.25, 0.25, 0.001, 0.099, 0.2, 0.2)$: not allowed. Impure II $(0.25, 0.25, 0.1, 0.1, 0.3, 0.2)$: out, correct.
26 Estimation of the allocation matrix $A$: guarantees. Sup-norm consistency: let $\mathcal{H}$ denote the set of all $K \times K$ signed permutation matrices. With probability exceeding $1 - c_1 p^{-c_2}$: (1) $\hat{K} = K$; (2) $\min_{P \in \mathcal{H}} \|\hat{A} - PA\|_\infty \le \kappa \sqrt{\log p / n}$, with $\kappa := \|C^{-1}\|_\infty$ (the maximum absolute row sum of $C^{-1}$). This is a non-standard bound, similar to bounds for errors-in-variables models. If $C$ is diagonally dominant, then $\kappa$ is constant.
27 Activation and inhibition. [Example matrices $A$ and $\hat{A}$ are displayed; the entries of $\hat{A}$ include values such as $\pm 1/2$, $\pm 1/6$, and $2/3$.] Care is needed in interpreting the signs: for each latent factor $Z_k$ we can consistently determine which of the $X_j$'s are associated with $Z_k$ in the same direction, but not the direction itself.
28 Estimation of the overlapping groups. $\hat{G} = \{\hat{G}_1, \dots, \hat{G}_{\hat{K}}\}$, with $\hat{G}_k = \{ i : \hat{A}_{ik} \neq 0 \}$. Define $\mathrm{FPR} = \frac{\sum_{i,k} 1\{A_{ik} = 0,\, \hat{A}_{ik} \neq 0\}}{\sum_{i,k} 1\{A_{ik} = 0\}}$ and $\mathrm{FNR} = \frac{\sum_{i,k} 1\{A_{ik} \neq 0,\, \hat{A}_{ik} = 0\}}{\sum_{i,k} 1\{A_{ik} \neq 0\}}$. Guarantees for cluster recovery (all results hold w.h.p.): under conditions on $C$: $\hat{K} = K$, $\mathrm{FPR} = 0$, $\mathrm{FNR} = \beta$. Under conditions on $A$: $\hat{K} = K$, $\mathrm{FPR} = 0$, $\mathrm{FNR} = 0$, $\hat{G} = G$. A small helper computing these rates is sketched below.
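In a simulation, the two rates above can be computed directly from the supports of $A$ and $\hat{A}$; the helper name is hypothetical, and $\hat{A}$ should first be aligned to $A$ over signed column permutations.

```python
import numpy as np

def cluster_error_rates(A_true: np.ndarray, A_hat: np.ndarray):
    """FPR/FNR over the support of A, as defined on the slide.
    A_hat is assumed already aligned with A_true (signed permutation)."""
    true_nz = A_true != 0
    est_nz = A_hat != 0
    fpr = (est_nz & ~true_nz).sum() / max((~true_nz).sum(), 1)
    fnr = (~est_nz & true_nz).sum() / max(true_nz.sum(), 1)
    return fpr, fnr
```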
29 Sparsity per row: $s_j = \|A_j\|_0 = \sum_k 1\{A_{jk} \neq 0\}$. $J_1$ = quasi-pure variables; $J_2$ = variables associated with some $Z$'s below the noise level; $J_3$ = variables associated with all $Z$'s above the noise level. Then $\mathrm{FNR} = \beta \le \frac{\sum_{j \in J_1 \cup J_2} s_j}{\sum_{j \in J_1 \cup J_2} s_j + \sum_{j \in J_3 \cup I} s_j}$. If $|J_3| + |I| \gg |J_1| + |J_2|$, then $\beta$ is very small.
30 LOVE: a Latent model approach to OVErlapping clustering. Estimate the partition $\mathcal{I}$ of pure variables by $\hat{\mathcal{I}}$. Estimate $A_I$ and $A_J$ separately to obtain $\hat{A}$, the estimated allocation matrix. Estimate the overlapping clusters by $\hat{G}$. An end-to-end sketch combining the pieces above is given below.
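As a usage example, the following sketch chains the hypothetical helpers from the earlier blocks (`estimate_pure_set`, `estimate_A_I`, `estimate_C`, `dantzig_row`) on the simulated `Sigma_hat`, `n`, `p` from the first block; the choice of $\delta$ is a heuristic stand-in for $\|\hat{\Sigma} - \Sigma\|_\infty$ and would be tuned in practice.

```python
import numpy as np

delta = np.sqrt(np.log(p) / n)                 # heuristic tolerance level
pure_groups = estimate_pure_set(Sigma_hat, delta)
A_hat = estimate_A_I(Sigma_hat, pure_groups)   # fills rows in \hat{I}
C_hat = estimate_C(Sigma_hat, A_hat, pure_groups)

pure = {j for g in pure_groups for j in g}
for j in (set(range(p)) - pure):               # rows in \hat{J}: one LP each
    theta_j = np.array([np.mean([A_hat[i, k] * Sigma_hat[i, j] for i in g])
                        for k, g in enumerate(pure_groups)])
    A_hat[j] = dantzig_row(theta_j, C_hat, delta)

# overlapping clusters: supports of the columns of A_hat
G_hat = [np.flatnonzero(A_hat[:, k]) for k in range(A_hat.shape[1])]
```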
31 Co-clustering genes using expression profiles: $p = 500$. Benchmark data set: RNA-seq transcript-level data; blood platelet samples from $n = 285$ individuals. Genes ENSG… and ENSG…, both non-coding RNAs, are placed together in Cluster 4; each is also placed in other clusters. Non-coding RNAs are pleiotropic (they have multiple functions). [Figure: example clusters, labeled by ENSG identifiers.]
32 Related work. There is a large literature on Non-Negative Matrix Factorization (NMF): $X = AZ + E$, with $X$, $A$, $Z$ non-negative matrices. Goal of NMF: find $\tilde{A}$ and $\tilde{Z}$ with $\|X - \tilde{A}\tilde{Z}\| \le \epsilon$. In NMF, the pure variable assumption is needed for: identifiability of $A$ when $E = 0$ (Donoho and Stodden, 2007); identifiability in topic models (count data), Arora et al. (2013), where the column sums of $X$ and $A$ are 1 and $E = 0$. Polynomial-time NMF algorithms: Arora et al. (2012, 2013); Bittorf et al. (2013). Other restrictions on the matrices are needed.
33 What can you do with LOVE? All you need is LOVE: (1) a flexible, identifiable latent factor model for overlapping clustering, with no restrictions on $X$ and $Z$; (2) new in the clustering literature: $A$ has both $+$ and $-$ entries; (3) new: $A$ and the clusters are identifiable in the presence of non-ignorable noise $E$; (4) a new algorithm, LOVE, that runs in $O(p^2 + pK)$ time; (5) new: statistical guarantees for data generated from $X = AZ + E$ with $X$ sub-Gaussian, with immediate extensions to Gaussian copulas; (6) new: an $A$ with both $+$ and $-$ entries allows a more refined cluster interpretation.
34 Overlapping Variable Clustering with Statistical Guarantees (2017); F. Bunea, Y. Ning, M. Wegkamp [old version; new version coming soon!]. Minimax Optimal Variable Clustering in G-models via Cord (2016); F. Bunea, C. Giraud, X. Luo [non-overlapping clustering]. PECOK: a convex optimization approach to variable clustering (2016); F. Bunea, C. Giraud, M. Royer, N. Verzelen [non-overlapping clustering].
35 Thanks!