Overlapping Variable Clustering with Statistical Guarantees and LOVE

1 Overlapping Variable Clustering with Statistical Guarantees and LOVE. Department of Statistical Science, Cornell University. WHOA-PSI, St. Louis, August 2017.

2 Joint work with Mike Bing, Yang Ning and Marten Wegkamp (Cornell University, Department of Statistical Science).

3 Variable clustering
What is variable clustering? Observable: X = (X_1, ..., X_j, ..., X_p) ∈ R^p, a random vector. Data: X^(1), ..., X^(n), i.i.d. copies of X ∈ R^p.
Goal of variable clustering: find sub-groups of similar coordinates of X, using the data.
The goal differs from that of data/point clustering: find sub-groups of similar observations X^(i), 1 ≤ i ≤ n.
The data differ from those of network clustering: network data is a 0/1 adjacency matrix.

4 Co-clustering genes using expression profiles
[Figure: overlapping gene clusters, with genes labeled by ENSG identifiers.]

5 Model-based
Objectives of model-based (overlapping) variable clustering:
Define a model-based similarity between coordinates of X. The model definition depends crucially on what we want to cluster and what type of data we have: here we cluster variables, and we observe their values.
Use an identifiable model to define clusters of coordinates; allow for overlap.
Estimate the clusters; assess their accuracy theoretically, within the model-based framework.

6 A first step towards a model for overlapping clustering
A sparse latent variable model with unstructured sparsity:
1. X = AZ + E; A is a p × K allocation matrix.
2. Z ∈ R^K is a latent vector, E ∈ R^p a noise vector; Z ⊥ E.
3. A is row sparse: ∑_{k=1}^K |A_jk| ≤ 1, for each j ∈ {1, ..., p}.
Variable similarity and clusters: X_j and X_l are similar if they are connected with the same Z_k. This suggests a definition of clusters with overlap: G_k := { j ∈ {1, ..., p} : A_jk ≠ 0 }.
Issue: the model and the clusters are not identifiable: AZ = (AQ)(Qᵀ Z) for any orthogonal Q, and A_jk may be 0 while (AQ)_jk may not be.

7 Identifiable models for overlapping clustering
X = AZ + E; X ∈ R^p, Z ∈ R^K (latent), A ∈ R^{p×K}.
A sparse latent variable model with structured sparsity: the pure variable assumption.
(i) A is row sparse: ∑_{k=1}^K |A_jk| ≤ 1, for each j ∈ {1, ..., p}.
(ii) For every (column) k ∈ {1, ..., K}, there exist at least two indices (rows) j ∈ {1, ..., p} such that |A_jk| = 1 and A_jl = 0 for all l ≠ k.
Spoiler alert! This A is identifiable up to signed permutations.
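A minimal simulation sketch (not from the talk) of the model on this slide: it builds an allocation matrix A that satisfies the pure variable assumption (at least two rows per column carrying a single ±1 entry), draws Z ⊥ E, and forms X = AZ + E together with its sample covariance. All dimensions, the specific A and C, and the noise level are illustrative choices.

```python
# Illustrative simulation from X = A Z + E under the pure variable assumption.
import numpy as np

rng = np.random.default_rng(0)

K, n = 3, 2000
# Rows 0-5 are pure (two per latent factor, single +/-1 entry);
# rows 6-7 are impure with ||A_j||_1 <= 1.
A = np.array([
    [ 1.0,  0.0,  0.0],
    [-1.0,  0.0,  0.0],
    [ 0.0,  1.0,  0.0],
    [ 0.0,  1.0,  0.0],
    [ 0.0,  0.0,  1.0],
    [ 0.0,  0.0, -1.0],
    [ 0.5,  0.5,  0.0],
    [ 0.3,  0.0, -0.5],
])
p = A.shape[0]

C = np.array([[1.0, 0.3, 0.2],      # C = Cov(Z), positive definite
              [0.3, 1.0, 0.1],
              [0.2, 0.1, 1.0]])

Z = rng.multivariate_normal(np.zeros(K), C, size=n)   # latent factors, n x K
E = 0.3 * rng.standard_normal((n, p))                 # independent noise
X = Z @ A.T + E                                       # observed data, n x p

Sigma_hat = np.cov(X, rowvar=False)                   # sample covariance, p x p
Sigma_pop = A @ C @ A.T + 0.09 * np.eye(p)            # population Sigma = A C A^T + Gamma
```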

8 The pure variable assumption: interpretation
Cluster: G_k := { j ∈ {1, ..., p} : A_jk ≠ 0 }.
The pure variable assumption: a pure variable X_j is associated with only one latent factor Z_k.
Pure variables are crucial in building overlapping clusters: cluster G_k is given by Z_k, which is not observable. Z_k = (biological) function. A pure variable X_j is an observable proxy for Z_k: the observable X_j performs function Z_k. It anchors G_k.

9 Overlapping clustering: interpretation
An instance of determining unknown functions of variables:
Gene 1 (X_1), with function 1, anchors G_1. Gene 3 (X_3), with function 2, anchors G_2.
Gene 2 (X_2) ∈ G_1: Gene 2 performs function 1. Gene 2 (X_2) ∈ G_2: Gene 2 also performs function 2.
Before clustering, Gene 2 had unknown function. After clustering, Gene 2 is found to have a dual function.

10 Identifiable models for overlapping clustering
Ingredients for identifiability:
Latent variable model X = AZ + E with structure on A: the pure variable assumption.
Mild assumption on C = Cov(Z): Δ(C) := min_{j≠k} ( min{C_jj, C_kk} − C_jk ) > 0.
Δ(C) > 0 ⇒ Z_j ≠ Z_k a.s., for all j ≠ k.
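As a quick check of the condition on C = Cov(Z), the sketch below computes the gap min_{j≠k}( min{C_jj, C_kk} − C_jk ); the helper name cov_gap and the example matrix are illustrative, not from the talk.

```python
# Compute the separation quantity for a covariance matrix C of the latent factors.
import numpy as np

def cov_gap(C: np.ndarray) -> float:
    """min_{j != k} ( min(C_jj, C_kk) - C_jk ) for a covariance matrix C."""
    K = C.shape[0]
    d = np.diag(C)
    gaps = [min(d[j], d[k]) - C[j, k] for j in range(K) for k in range(K) if j != k]
    return min(gaps)

C = np.array([[1.0, 0.3, 0.2],
              [0.3, 1.0, 0.1],
              [0.2, 0.1, 1.0]])
print(cov_gap(C))   # 0.7 > 0, so no two latent factors coincide a.s.
```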

11 Identifiability in structured sparse latent models
The pure variable set I is identifiable: I and its partition {I_k}_{1≤k≤K} can be constructed uniquely from Σ, up to label permutations.
The allocation matrix A is identifiable: under the pure variable assumption, there exists a unique matrix A, up to signed permutations, such that X = AZ + E.
The clusters G = {G_k}_{1≤k≤K} are identifiable: under the pure variable assumption, the overlapping clusters G_k are identifiable, up to label switching.
If pure variables do not exist, identifiability of A fails.

12 Central challenge in proving identifiability
X = AZ + E ⇒ Σ = ACAᵀ + Γ.
I = pure variable index set. J = {1, ..., p} \ I = impure variable index set.
Central challenge: how to distinguish between I and J?
Added challenge: how to distinguish between I and J when we don't know the noise covariance Γ?

13 What is pure and what is impure?
A necessary and sufficient condition for purity. For each 1 ≤ i ≤ p, set
M_i := max_{j ∈ [p]\{i}} Σ_ij and S_i := { j ∈ [p] \ {i} : Σ_ij = M_i }.
For given A and its induced pure variable set I, we have
i ∈ I ⟺ M_i = max_{k ∈ [p]\{j}} Σ_jk for all j ∈ S_i.

14 Look for maxima in Σ, ignore the diagonal!
I = {{1, 2, 3}, {4, 5}, {6, 7}} and J = {8, 9}.
[Slide shows the 9 × 9 matrix Σ.]
M_1 = max_{k≠1} Σ_1k = 1. S_1 = {j ≠ 1 : Σ_1j = 1} = {2, 3}.
Check: max_{k≠2} Σ_2k = 1 = M_1 and max_{k≠3} Σ_3k = 1 = M_1.
So I_1 = S_1 ∪ {1} = {1, 2, 3} is pure.

15 Look for maxima in Σ, ignore the diagonal!
I = {{1, 2, 3}, {4, 5}, {6, 7}} and J = {8, 9}.
[Same 9 × 9 matrix Σ as on the previous slide.]
M_8 = max_{k≠8} Σ_8k = 1. S_8 = {j ≠ 8 : Σ_8j = 1} = {4, 5}.
Check: max_{k≠4} Σ_4k = 2 ≠ 1 = M_8, so 8 cannot be pure: 8 ∈ J.
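A small population-level sketch of the purity criterion illustrated on the last two slides: compute M_i and S_i from Σ with the diagonal ignored, and keep i only when every j ∈ S_i attains the same off-diagonal maximum M_i. The function names and the toy Σ are illustrative; the talk's 9 × 9 example matrix is not reproduced here.

```python
# Population-level purity check based on off-diagonal maxima of Sigma.
import numpy as np

def max_and_argmax_offdiag(Sigma, i):
    """M_i and S_i: the largest off-diagonal entry of row i and where it is attained."""
    row = Sigma[i].copy()
    row[i] = -np.inf                       # ignore the diagonal
    M_i = row.max()
    S_i = {j for j in range(len(row)) if np.isclose(row[j], M_i)}
    return M_i, S_i

def population_pure_set(Sigma):
    """Indices i such that M_i = max_{k != j} Sigma_{jk} for every j in S_i."""
    pure = []
    for i in range(Sigma.shape[0]):
        M_i, S_i = max_and_argmax_offdiag(Sigma, i)
        if all(np.isclose(max_and_argmax_offdiag(Sigma, j)[0], M_i) for j in S_i):
            pure.append(i)
    return pure

# Toy Sigma with pure variables {0, 1} (one latent factor) and an impure variable 2.
Sigma = np.array([[1.0, 1.0, 0.5],
                  [1.0, 1.0, 0.5],
                  [0.5, 0.5, 1.0]])
print(population_pure_set(Sigma))   # [0, 1]
```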

16 Estimation
Estimate I. Estimate A_I, the sub-matrix of A with rows in I. Estimate A_J, with J = {1, ..., p} \ I. Estimate the clusters G_k.

17 Estimation of the pure variable set
Reminder: I = pure variable set.
Constructive characterization of I, population version. For each 1 ≤ i ≤ p, set
M_i := max_{j ∈ [p]\{i}} Σ_ij and S_i := { j ∈ [p] \ {i} : Σ_ij = M_i }.
For given A and its induced pure variable set I, we have
i ∈ I ⟺ M_i = max_{k ∈ [p]\{j}} Σ_jk for all j ∈ S_i.
Moreover, S_i ∪ {i} = I_k, for some k, where {I_k}_{1≤k≤K} is a partition of I.

18 Estimation of the pure variable set
Algorithm idea: use the constructive characterization of I, stated at the population level. Replace Σ by the sample covariance Σ̂. Replace equalities by inequalities, allowing for a tolerance level δ := ‖Σ̂ − Σ‖_∞.
The algorithm has complexity O(p²). It requires inputs Σ̂ and δ, and returns Î, its partition {Î_k}, and therefore K̂.
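One possible reading of the algorithm idea above, as a sketch: run the purity criterion on the sample covariance, with exact equalities replaced by comparisons up to a 2δ tolerance, and group each surviving index i with S_i into an estimated pure group. This is an illustrative O(p²) implementation of the slide's description, not the speakers' LOVE code; the tolerance handling in particular is an assumption.

```python
# Delta-tolerant estimation of the pure variable partition from a sample covariance.
import numpy as np

def estimate_pure_partition(Sigma_hat, delta):
    p = Sigma_hat.shape[0]
    off = Sigma_hat.copy()
    np.fill_diagonal(off, -np.inf)

    def M_and_S(i):
        M_i = off[i].max()
        # equality up to tolerance: entries within 2*delta of the row maximum
        S_i = [j for j in range(p) if off[i, j] >= M_i - 2 * delta]
        return M_i, S_i

    groups = []
    assigned = set()
    for i in range(p):
        if i in assigned:
            continue
        M_i, S_i = M_and_S(i)
        # i is (approximately) pure if its maximum is also maximal from every j in S_i
        if all(abs(M_and_S(j)[0] - M_i) <= 2 * delta for j in S_i):
            group = sorted(set([i] + S_i) - assigned)
            if len(group) >= 2:            # at least two pure variables per factor
                groups.append(group)
                assigned.update(group)
    return groups                          # partition {I_k-hat}; K_hat = len(groups)
```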

19 Estimation of the allocation sub-matrix A_I
A = [A_I ; A_J] (rows in I stacked above rows in J).
Estimated previously: Î and its partition {Î_1, ..., Î_K̂}.
The estimator Â_Î has rows i ∈ Î consisting of K̂ − 1 zeros and one entry equal to either +1 or −1. Signs will be determined up to signed permutations.
(1) Pick i ∈ Î_k. Pick a sign for Â_ik, say Â_ik = 1.
(2) For any j ∈ Î_k \ {i}, set Â_jk = +1 if Σ̂_ij > 0, and Â_jk = −1 if Σ̂_ij < 0.
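A sketch of the sign step above: within each estimated pure group, one variable is anchored at +1 and every other member receives the sign of its sample covariance with the anchor. The function name and the convention of returning a p × K̂ matrix with zero rows for impure variables are illustrative choices.

```python
# Assign +/-1 entries to the rows of A_hat corresponding to estimated pure variables.
import numpy as np

def estimate_A_pure(Sigma_hat, groups, p):
    """Rows of A_hat for the estimated pure variables; one +/-1 entry per such row."""
    K_hat = len(groups)
    A_hat_I = np.zeros((p, K_hat))
    for k, group in enumerate(groups):
        anchor = group[0]
        A_hat_I[anchor, k] = 1.0                     # fix the sign of the anchor
        for j in group[1:]:
            A_hat_I[j, k] = 1.0 if Sigma_hat[anchor, j] > 0 else -1.0
    return A_hat_I                                   # zero rows correspond to impure variables
```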

20 Estimation of the allocation sub-matrix A_J
Σ = [ Σ_II  Σ_IJ ; Σ_JI  Σ_JJ ] = [ A_I C A_Iᵀ  A_I C A_Jᵀ ; A_J C A_Iᵀ  A_J C A_Jᵀ ] + [ Γ_II  0 ; 0  Γ_JJ ].
Estimate A_J row by row, motivation: Σ_IJ = A_I C A_Jᵀ, so θ_j := C A_j (with A_j the j-th row of A) for each j ∈ J. Since the rows of A_I are signed indicators, averaging entries of Σ over the pure groups recovers θ_j and C:
θ_jk = (1/|I_k|) ∑_{i∈I_k} A_ik Σ_ij;
C_kk = (1/(|I_k|(|I_k|−1))) ∑_{i,j∈I_k, i≠j} Σ_ij  and  C_km = (1/(|I_k||I_m|)) ∑_{i∈I_k, j∈I_m} Σ_ij.
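The averaging step above, sketched in code: Ĉ and θ̂_j are built by averaging sample-covariance entries over the estimated pure groups. The sketch folds in the ±1 signs from the estimated pure rows so that groups containing negatively loaded pure variables are handled; that sign handling is my reading, not a formula shown on the slide.

```python
# Estimate C and theta_j = C A_j by averaging sample covariances over pure groups.
import numpy as np

def estimate_C_and_theta(Sigma_hat, groups, A_hat_I, J):
    K_hat = len(groups)
    C_hat = np.zeros((K_hat, K_hat))
    for k, Ik in enumerate(groups):
        s_k = A_hat_I[Ik, k]                         # +/-1 signs within group k
        # C_kk: average of (signed) covariances between distinct pure variables in group k
        vals = [s_k[a] * s_k[b] * Sigma_hat[Ik[a], Ik[b]]
                for a in range(len(Ik)) for b in range(len(Ik)) if a != b]
        C_hat[k, k] = np.mean(vals)
        for m, Im in enumerate(groups):
            if m == k:
                continue
            s_m = A_hat_I[Im, m]
            C_hat[k, m] = np.mean([s_k[a] * s_m[b] * Sigma_hat[Ik[a], Im[b]]
                                   for a in range(len(Ik)) for b in range(len(Im))])
    # theta_j for each impure variable j: signed averages of its covariances with group k
    Theta_hat = np.zeros((len(J), K_hat))
    for r, j in enumerate(J):
        for k, Ik in enumerate(groups):
            s_k = A_hat_I[Ik, k]
            Theta_hat[r, k] = np.mean([s_k[a] * Sigma_hat[Ik[a], j] for a in range(len(Ik))])
    return C_hat, Theta_hat
```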

21 Estimation of the allocation sub-matrix A_J
Estimation of the rows of A_J. Under the model: θ_j = C A_j, with A_j sparse.
Available: Σ̂ and the estimated partition {Î_k} of the pure variables.
Use Σ̂ and {Î_k} to construct θ̂_j ≈ θ_j and Ĉ ≈ C.
Many choices to estimate the sparse A_j. Dantzig-type: minimize ‖β‖_1 over β ∈ R^K̂ such that ‖θ̂_j − Ĉβ‖ ≤ 2δ.
Repeat for each j ∈ Ĵ := {1, ..., p} \ Î to obtain Â_Ĵ.
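A sketch of the Dantzig-type program above, assuming the constraint is in the entrywise sup-norm (the norm is not legible in the transcript); it is solved as a linear program with the usual split β = β⁺ − β⁻.

```python
# Dantzig-type estimation of one sparse row of A_J, written as a linear program.
import numpy as np
from scipy.optimize import linprog

def dantzig_row(theta_j, C_hat, delta):
    """argmin ||beta||_1 subject to ||theta_j - C_hat beta||_inf <= 2*delta."""
    K = C_hat.shape[1]
    c = np.ones(2 * K)                                   # minimize sum(beta_plus + beta_minus)
    A_ub = np.block([[ C_hat, -C_hat],                   #  C beta <=  theta + 2 delta
                     [-C_hat,  C_hat]])                  # -C beta <= -theta + 2 delta
    b_ub = np.concatenate([theta_j + 2 * delta, -theta_j + 2 * delta])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (2 * K))
    if not res.success:
        raise RuntimeError("infeasible Dantzig program; try a larger delta")
    return res.x[:K] - res.x[K:]                         # beta = beta_plus - beta_minus
```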

22 Statistical guarantees: assumptions
Recall: Σ = ACAᵀ + Γ; X sub-Gaussian; ‖Σ̂ − Σ‖_∞ =: δ = O(√((log p)/n)).
Signal strength conditions:
1. Either on C: Δ(C) = min_{j≠k} ( min{C_jj, C_kk} − C_jk ) ≥ cδ.
2. Or on A: the smallest non-zero entry of A is larger than δ.

23 Estimation of the pure variable set: guarantees
I = pure variables, J = {1, ..., p} \ I.
Quasi-pure variables: for each k ∈ [K], J_1^k = { i ∈ J : A_ik ≥ 1 − 4δ/τ² }, and J_1 = ∪_{k=1}^K J_1^k.
If X_1 is pure then A_1k = 1 for some k ∈ [K]. If X_2 is quasi-pure then A_2m ≈ 1 for some m ∈ [K].

24 Estimation of the pure variable set: guarantees
J_1^k = {i : A_ik ≈ 1}; J_1 := ∪_{k=1}^K J_1^k; I_k = {i : A_ik = 1}.
Recovery guarantees, with no signal strength conditions on A:
(a) K̂ = K. (b) I ⊆ Î ⊆ I ∪ J_1. (c) I_k ⊆ Î_k ⊆ I_k ∪ J_1^k, for each k ∈ [K]. All hold w.h.p.
Minimal recovery mistakes, no conditions on A:
Pure (1, 0, 0, 0, 0, 0): In, correct.
Quasi-pure (0.99, 0.01, 0, 0, 0, 0): In, slight mistake.
Impure (0.25, 0.25, 0.001, 0.099, 0.2, 0.2): Out, correct.

25 Estimation of the pure variable set: guarantees
Exact recovery, under conditions on A: Î = I, up to label switching, with I = ∪_{a=1}^K I_a.
Exact recovery, conditions on A:
Pure (1, 0, 0, 0, 0, 0): In, correct.
Quasi-pure (0.99, 0.01, 0, 0, 0, 0): Not allowed.
Impure I (0.25, 0.25, 0.001, 0.099, 0.2, 0.2): Not allowed.
Impure II (0.25, 0.25, 0.1, 0.1, 0.3, 0.2): Out, correct.

26 Estimation of the allocation matrix A: guarantees
Sup-norm consistency. Let H denote the set of all K × K signed permutation matrices. We have, with probability exceeding 1 − c_1 p^{−c_2}:
1. K̂ = K.
2. min_{P∈H} ‖Â − AP‖_∞ ≲ κ √((log p)/n), with κ := ‖C^{−1}‖_{1,1}.
This is a non-standard bound, similar to errors-in-variables model bounds. If C is diagonally dominant then κ is constant.
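The error metric in the bound above can be made concrete with a small brute-force utility: align the columns of Â with those of A over all signed permutations and report the best entrywise sup-norm distance. Purely illustrative, and only sensible for small K.

```python
# Sup-norm distance between A_hat and A, minimized over signed column permutations.
import numpy as np
from itertools import permutations, product

def signed_perm_sup_error(A_hat, A):
    K = A.shape[1]
    best = np.inf
    for perm in permutations(range(K)):
        for signs in product([1.0, -1.0], repeat=K):
            # Column k of the aligned estimate is sign_k * column perm[k] of A_hat.
            aligned = A_hat[:, list(perm)] * np.array(signs)
            best = min(best, np.max(np.abs(aligned - A)))
    return best
```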

27 Activation and inhibition
[Slide: an example allocation matrix A, with entries such as ±1/2, ±1/6, 2/3, shown next to its estimate Â; the two agree up to the signs of the columns.]
Care in interpreting the signs: for each latent factor Z_k we can consistently determine which of the X_j's are associated with Z_k in the same direction, but not the direction itself.

28 Estimation of the overlapping groups
Ĝ = { Ĝ_1, ..., Ĝ_K̂ }, with Ĝ_k = { i : Â_ik ≠ 0 }.
FPR = ∑_{i,k} 1{A_ik = 0, Â_ik ≠ 0} / ∑_{i,k} 1{A_ik = 0},  FNR = ∑_{i,k} 1{A_ik ≠ 0, Â_ik = 0} / ∑_{i,k} 1{A_ik ≠ 0}.
Guarantees for cluster recovery (all results hold w.h.p.):
Under conditions on C: K̂ = K; FPR = 0; FNR = β.
Under conditions on A: K̂ = K; FPR = 0; FNR = 0; Ĝ = G.
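The FPR and FNR displays above translate directly into code; the sketch below also extracts the estimated overlapping clusters from the support of Â. Helper names are illustrative.

```python
# Support-recovery error rates and overlapping clusters read off an estimated A.
import numpy as np

def support_errors(A, A_hat, tol=0.0):
    """False positive / false negative rates of the estimated support of A."""
    true_zero = np.abs(A) <= tol
    true_nonzero = ~true_zero
    est_zero = np.abs(A_hat) <= tol
    est_nonzero = ~est_zero
    fpr = np.sum(true_zero & est_nonzero) / max(np.sum(true_zero), 1)
    fnr = np.sum(true_nonzero & est_zero) / max(np.sum(true_nonzero), 1)
    return fpr, fnr

def clusters_from_A(A_hat, tol=0.0):
    """G_k = { i : A_hat_{ik} != 0 }: one (possibly overlapping) cluster per column."""
    return [set(np.flatnonzero(np.abs(A_hat[:, k]) > tol)) for k in range(A_hat.shape[1])]
```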

29 Sparsity per row: s_j = ‖A_j‖_0 = ∑_k 1{A_jk ≠ 0}.
J_1 = quasi-pure variables. J_2 = variables associated with some Z's below the noise level. J_3 = variables associated with all Z's above the noise level.
FNR = β ≤ (∑_{j ∈ J_1 ∪ J_2} s_j) / (∑_{j ∈ J_1 ∪ J_2} s_j + ∑_{j ∈ J_3 ∪ I} s_j).
If |J_3| + |I| ≫ |J_1| + |J_2|, β is very small.

30 LOVE: A Latent model approach to OVErlapping clustering.
Estimate the partition of the pure variables I by Î.
Estimate A_I and A_J separately to obtain Â, the allocation matrix estimate.
Estimate the overlapping clusters by Ĝ.
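Putting the pieces together, a hedged end-to-end driver for the recipe on this slide, chaining the sketches introduced after the earlier slides (estimate_pure_partition, estimate_A_pure, estimate_C_and_theta, dantzig_row, clusters_from_A). Those helpers are my illustrative reconstructions and must be in scope; the δ plugged in below is only the √(log p / n) rate used as a rough default, not a tuned choice from the talk.

```python
# End-to-end illustration of the LOVE recipe, built on the earlier sketches.
import numpy as np

def love_sketch(X, delta):
    n, p = X.shape
    Sigma_hat = np.cov(X, rowvar=False)
    groups = estimate_pure_partition(Sigma_hat, delta)          # I_hat and its partition
    A_hat = estimate_A_pure(Sigma_hat, groups, p)               # +/-1 rows for pure variables
    I_hat = sorted(i for g in groups for i in g)
    J_hat = [j for j in range(p) if j not in I_hat]
    C_hat, Theta_hat = estimate_C_and_theta(Sigma_hat, groups, A_hat, J_hat)
    for r, j in enumerate(J_hat):                               # sparse rows for impure variables
        A_hat[j] = dantzig_row(Theta_hat[r], C_hat, delta)
    return A_hat, clusters_from_A(A_hat, tol=1e-8)

# Example (with X from the simulation sketch earlier):
# A_hat, G_hat = love_sketch(X, delta=np.sqrt(np.log(X.shape[1]) / X.shape[0]))
```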

31 Co-clustering genes using expression profiles: p = 500
Benchmark data set: RNA-seq transcript-level data; blood platelet samples from n = 285 individuals.
ENSG… and ENSG…, both non-coding RNAs, are placed together in Cluster 4; each is also placed in other clusters. Non-coding RNAs are pleiotropic (they have multiple functions).

32 Related work
Large literature on Non-negative Matrix Factorization (NMF): X = AZ + E, with X, A, Z non-negative matrices. Goal of NMF: find Ã and Z̃ with ‖X − ÃZ̃‖ ≤ ε.
In NMF, the pure variable assumption is needed for:
Identifiability of A, when E = 0 (Donoho and Stodden, 2007).
Identifiability in topic models (count data), Arora et al. (2013): column sums of X and A are 1; E = 0.
Polynomial-time NMF algorithms: Arora et al. (2012, 2013); Bittorf et al. (2013). Other restrictions on the matrices are needed.

33 What can you do with LOVE? All you need is LOVE
1. A flexible, identifiable latent factor model for overlapping clustering: no restrictions on X and Z.
2. New in the clustering literature: A has both + and − entries.
3. New: A and the clusters are identifiable in the presence of non-ignorable noise E.
4. New algorithm: LOVE, runs in O(p² + pK) time.
5. New: statistical guarantees for data generated from X = AZ + E, with X sub-Gaussian; immediate extensions to the Gaussian copula.
6. New: A with both + and − entries allows for a more refined cluster interpretation.

34 Overlapping Variable Clustering with Statistical Guarantees (2017); F. Bunea, Y. Ning, M. Wegkamp [old version; new version coming soon!]
Minimax Optimal Variable Clustering in G-models via Cord (2016); F. Bunea, C. Giraud, X. Luo [non-overlapping clustering]
PECOK: a convex optimization approach to variable clustering (2016); F. Bunea, C. Giraud, M. Royer, N. Verzelen [non-overlapping clustering]

35 Thanks!
