Deterministic Conditions for Subspace Identifiability from Incomplete Sampling

Daniel L. Pimentel-Alarcón, Nigel Boston, Robert D. Nowak
University of Wisconsin-Madison

Abstract. Consider an r-dimensional subspace of R^d, r < d, and suppose that we are only given projections of this subspace onto small subsets of the canonical coordinates. This paper establishes necessary and sufficient deterministic conditions on the subsets for subspace identifiability. The results also shed new light on low-rank matrix completion.

I. INTRODUCTION

Subspace identification arises in a wide variety of signal and information processing applications. In many cases, especially high-dimensional situations, it is common to encounter missing data. Hence the growing literature concerning the estimation of low-dimensional subspaces and matrices from incomplete data, in theory [1]-[7] and applications [8], [9].

This paper considers the problem of identifying an r-dimensional subspace of R^d from projections of the subspace onto small subsets of the canonical coordinates. The main contribution of this paper is to establish deterministic necessary and sufficient conditions on such subsets that guarantee that there is only one r-dimensional subspace consistent with all the projections. These conditions also have implications for low-rank matrix completion and related problems.

Organization of the paper. In Section II we formally state the problem and our main results. We present the proof of our main theorem in Section III. Section IV illustrates the implications of our results for low-rank matrix completion. Section V presents the graphical interpretation of the problem and another necessary condition based on this viewpoint.

II. MODEL AND MAIN RESULTS

Let S denote an r-dimensional subspace of R^d. Define Ω as a d × N binary matrix, and let ω_i denote the i-th column of Ω. The nonzero entries of ω_i indicate the canonical coordinates involved in the i-th projection. Since S is r-dimensional, the restriction of S onto ℓ ≤ r coordinates will be R^ℓ (in general), and hence such a projection will provide no information specific to S. Therefore, without loss of generality (see the appendix for immediate generalizations) we will assume that:

A1. Ω has exactly r + 1 nonzero entries per column.

Given an r-dimensional subspace S, let S_{ω_i} ⊂ R^{r+1} denote the restriction of S to the nonzero coordinates in ω_i. The question addressed in this paper is whether the restrictions {S_{ω_i}}_{i=1}^N uniquely determine S. This depends on the sampling pattern Ω.

[Fig. 1. When can S be identified from its canonical projections {S_{ω_i}}_{i=1}^N ?]

We will see that identifiability of this sort can only be possible if N ≥ d − r, since ker S is (d − r)-dimensional. Thus, unless otherwise stated, we will also assume that:

A2. Ω has exactly N = d − r columns.

Let Gr(r, R^d) denote the Grassmannian manifold of r-dimensional subspaces in R^d. Define S(S, Ω) ⊂ Gr(r, R^d) such that every S' ∈ S(S, Ω) satisfies S'_{ω_i} = S_{ω_i} for all i. In words, S(S, Ω) is the set of all r-dimensional subspaces matching S on Ω.

Example 1. Let d = 5, r = 1,

    S = span{ [1 1 1 1 1]^T },    Ω = [ 1 0 1 0
                                        1 1 0 0
                                        0 1 1 0
                                        0 0 0 1
                                        0 0 0 1 ].

Then, for example, ω_1 = [1 1 0 0 0]^T and S_{ω_1} = span{[1 1]^T}. It is easy to see that there are infinitely many 1-dimensional subspaces that match S on Ω: the projections tie the first three coordinates to one another and the last two coordinates to one another, but never tie the two groups together. In fact, S(S, Ω) = { span{[1 1 1 β β]^T} : β ∈ R\{0} }. However, if we instead had ω_3 = [0 0 1 1 0]^T, then S would be the only subspace in S(S, Ω).
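As a quick numerical check of Example 1, the following Python/NumPy sketch (illustrative code, not part of the original paper; the helper `matches_on` is our own) verifies that a different line S' matches S on every column of Ω, and that the modified ω_3 removes the ambiguity:

```python
import numpy as np

# Example 1: d = 5, r = 1; each column of Omega marks the sampled coordinates
u = np.ones(5)                        # S = span{u}
Omega = np.array([[1, 0, 1, 0],
                  [1, 1, 0, 0],
                  [0, 1, 1, 0],
                  [0, 0, 0, 1],
                  [0, 0, 0, 1]], dtype=bool)

def matches_on(u, v, Omega):
    """True iff span{v} agrees with span{u} on every projection omega_i
    (two lines in R^{r+1} coincide iff stacking them gives a rank-1 matrix)."""
    return all(np.linalg.matrix_rank(np.vstack([u[w], v[w]])) == 1
               for w in Omega.T)

v = np.array([1, 1, 1, 0.5, 0.5])     # a different 1-dimensional subspace S'
print(matches_on(u, v, Omega))        # True: S' is in S(S, Omega)

Omega2 = Omega.copy()
Omega2[:, 2] = [0, 0, 1, 1, 0]        # replace omega_3 as in the example
print(matches_on(u, v, Omega2))       # False: now only S matches
```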

The main result of this paper is the following theorem, which gives necessary and sufficient conditions on Ω to guarantee that S(S, Ω) contains no subspace other than S. Our results hold for almost every (a.e.) S, with respect to the uniform measure over Gr(r, R^d).

Given a matrix, let n(·) denote its number of columns, and m(·) the number of its nonzero rows.

Theorem 1. Let A1 and A2 hold. For almost every S, S is the only subspace in S(S, Ω) if and only if every matrix Ω' formed with a subset of the columns in Ω satisfies

    m(Ω') ≥ n(Ω') + r.    (1)

The proof of Theorem 1 is given in Section III. In words, Theorem 1 states that S is the only subspace that matches S on Ω if and only if every subset of n columns of Ω has at least n + r nonzero rows.

Example 2. The following matrix, where 1 denotes a block of all 1's and I denotes the identity matrix, satisfies the conditions of Theorem 1:

    Ω = [ 1_{r×(d−r)}
          I_{(d−r)×(d−r)} ].

When the conditions of Theorem 1 are satisfied, identifying S becomes a trivial task: S = ker A^T, with A as defined in Section III.

In general, verifying the conditions on Ω in Theorem 1 may be computationally prohibitive, especially for large d. However, as the next theorem states, uniform random sampling patterns will satisfy the conditions of Theorem 1 with high probability (whp).

Theorem 2. Assume A2 and let 0 < ε < 1 be given. Suppose r ≤ d/6 and that each column of Ω contains at least ℓ nonzero entries, selected uniformly at random and independently across columns, with

    ℓ ≥ max{ 9 log(d/ε) + 12, 2r }.    (2)

Then Ω will satisfy the conditions of Theorem 1 with probability at least 1 − ε.

Theorem 2 is proved in the appendix. Notice that O(r log d) nonzero entries per column is a typical requirement of LRMC methods, while O(max{r, log d}) is sufficient for subspace identifiability.
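Since the condition in Theorem 1 quantifies over all column subsets, a direct test is exponential in d − r, but trivial to write for small patterns. Below is an illustrative Python sketch (our own code and naming, not the authors'); it confirms, for instance, that the Ω of Example 2 passes:

```python
import numpy as np
from itertools import combinations

def identifiable_pattern(Omega, r):
    """Brute-force check of Theorem 1: every subset of n columns of Omega
    must have at least n + r nonzero rows. Exponential in d - r."""
    d, N = Omega.shape
    for n in range(1, N + 1):
        for cols in combinations(range(N), n):
            m = int(Omega[:, cols].any(axis=1).sum())   # nonzero rows of Omega'
            if m < n + r:
                return False
    return True

# Example 2: an all-ones r x (d - r) block stacked over the (d - r) identity
d, r = 6, 2
Omega_ex2 = np.vstack([np.ones((r, d - r)), np.eye(d - r)]).astype(bool)
print(identifiable_pattern(Omega_ex2, r))   # True: m = n + r for every subset
```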

III. PROOF OF THEOREM 1

For any subspace, matrix or vector that is compatible with a binary vector υ, we will use the subscript υ to denote its restriction to the nonzero coordinates/rows in υ. For a.e. S, S_{ω_i} is an r-dimensional subspace of R^{r+1}, and the kernel of S_{ω_i} is a 1-dimensional subspace of R^{r+1}.

Lemma 1. Let a_{ω_i} ∈ R^{r+1} be a nonzero element of ker S_{ω_i}. All entries of a_{ω_i} are nonzero for a.e. S.

Proof. Suppose a_{ω_i} has at least one zero entry. Use υ to denote the binary vector of the nonzero entries of a_{ω_i}. Since a_{ω_i} is orthogonal to S_{ω_i}, for every u_{ω_i} ∈ S_{ω_i} we have that a_{ω_i}^T u_{ω_i} = a_υ^T u_υ = 0. Then S satisfies

    dim S_υ ≤ dim ker a_υ^T = |υ| − 1 < |υ|.    (3)

Observe that for every binary vector υ with |υ| ≤ r, a.e. r-dimensional subspace S satisfies dim S_υ = |υ|. Thus (3) holds only in a set of measure zero. ∎

Define a_i as the vector in R^d with the entries of a_{ω_i} in the nonzero positions of ω_i and zeros elsewhere. Then S' ⊂ ker a_i^T for every S' ∈ S(S, Ω) and every i. Letting A be the d × (d − r) matrix formed with {a_i}_{i=1}^{d−r} as columns, we have that S' ⊂ ker A^T for every S' ∈ S(S, Ω). Note that if dim ker A^T = r, then S(S, Ω) contains just one element, S, which is the identifiability condition of interest. Thus, we will establish conditions on Ω guaranteeing that the d − r columns of A are linearly independent.

Recall that for any matrix A' formed with a subset of the columns in A, n(A') denotes the number of columns in A', and m(A') denotes the number of nonzero rows in A'.

Lemma 2. For a.e. S, the columns of A are linearly dependent if and only if n(A') > m(A') − r for some matrix A' formed with a subset of the columns in A.

We will show Lemma 2 using Lemmas 3 and 4 below. Let ℵ(A') be the largest number of linearly independent columns in A', i.e., the column rank of A'.

Lemma 3. For a.e. S, ℵ(A') ≤ m(A') − r.

Proof. Let υ be the binary vector of the nonzero rows of A', and A'_υ be the m(A') × n(A') matrix formed with these rows. For a.e. S, dim S_υ = r. Since S_υ ⊂ ker A'^T_υ,

    r = dim S_υ ≤ dim ker A'^T_υ = m(A') − ℵ(A'). ∎

We say A' is minimally linearly dependent if the columns in A' are linearly dependent, but every proper subset of the columns in A' is linearly independent.

Lemma 4. Let A' be minimally linearly dependent. Then for a.e. S, n(A') = m(A') − r + 1.

Proof. Let A' = [A'' a_i] be minimally linearly dependent. Let m = m(A'), n = n(A'), and ℵ = ℵ(A''). Define β ∈ R^{n−1} such that

    A'' β = a_i.    (4)

Note that because A' is minimally linearly dependent, all entries in β are nonzero. Since the columns of A'' are linearly independent, n − 1 = ℵ. Thus, by Lemma 3, n − 1 ≤ m − r. We want to show that n − 1 = m − r, so suppose for contradiction that n − 1 < m − r.

We can assume without loss of generality that A' has all its zero rows (if any) in the first positions. In that case, since A' is minimally linearly dependent, it follows that the nonzero entries of a_i cannot be in the corresponding rows. Thus, without loss of generality, assume that a_i has its first r nonzero entries in the first r nonzero rows of A'', that the last nonzero entry of a_i is a 1 (i.e., rescale a_i if needed), and that this 1 is located in the last row. Letting â_i ∈ R^r denote the vector with the first r nonzero entries of a_i, we can write

    [ A''  a_i ] = [ C  â_i ]  } r rows
                   [ B   b  ]  } m − r rows,    (5)

where C and B are the submatrices of A'' corresponding to this partition of a_i, and b has exactly one nonzero entry, a 1 in its last position.

The columns of B are linearly independent. To see this, suppose for contradiction that they are not. This means that there exists some nonzero γ ∈ R^{n−1} such that Bγ = 0. Let c = A''γ and note that only the r rows of c corresponding to the block C may be nonzero. Let υ denote the binary vector of these nonzero entries. Since S is orthogonal to every column of A and c is a linear combination of the columns in A, it follows that S ⊂ ker c^T. This implies that dim S_υ ≤ dim ker c_υ^T = |υ| − 1. As in the proof of Lemma 1, this implies that the columns of B are linearly dependent only in a set of measure zero.

Going back to (5), since the n − 1 columns of B are linearly independent and because we are assuming that n − 1 < m − r, it follows that B has n − 1 linearly independent rows. Let B₁ denote an (n − 1) × (n − 1) block of B containing n − 1 linearly independent rows, and B₂ the (m − r − n + 1) × (n − 1) remaining block of B. Notice that the row of B corresponding to the 1 in a_i must belong to B₁, since otherwise we would have B₁β = 0, with β as in (4), which would imply that B₁ is rank deficient, in contradiction to its construction. We can further assume without loss of generality that this row is the last row of B₁ (otherwise we may just permute the rows accordingly). We will let b₀^T denote the first row of B₂, and B₂' all but the first row of B₂. Thus, our matrix is organized as

    [ A''  a_i ] = [ C     â_i ]  } r rows
                   [ B₁     e  ]  } n − 1 rows
                   [ b₀^T   0  ]  } 1 row
                   [ B₂'    0  ]  } m − r − n rows,    (6)

where e = [0 ⋯ 0 1]^T.

Now (4) implies B₁β = e, and since B₁ is full rank, we may write β = B₁^{−1} e, i.e., β is the last column of the inverse of B₁, which is a rational function of the elements of B₁. Next, let us look back at (4). If n − 1 < m − r, then using the additional row [b₀^T 0] of (6) (which does not appear if n − 1 = m − r) we obtain b₀^T β = 0. Recall that all the entries of β are nonzero. Thus, the last equation defines the following nonzero rational function of the elements of B:

    b₀^T B₁^{−1} e = 0.    (7)

Equivalently, (7) is a polynomial equation in the elements of B, which we will denote as f(B) = 0.

Next note that for a.e. S, we can write S = ker A*^T for a unique A* ∈ R^{d×(d−r)} in column echelon form¹:

    A* = [  D_{r×(d−r)}
           −I_{(d−r)×(d−r)} ].    (8)

On the other hand, every D ∈ R^{r×(d−r)} defines a unique r-dimensional subspace of R^d via (8). Thus, we have a bijection between R^{r×(d−r)} and a dense open subset of Gr(r, R^d). Since the columns of A must be linear combinations of the columns of A*, the elements of B are linear functions of the entries of D. Therefore, we can express f(B) as a nonzero polynomial function g of the entries of D, and rewrite (7) as g(D) = 0. But g(D) ≠ 0 for almost every D ∈ R^{r×(d−r)}, and hence for almost every S ∈ Gr(r, R^d). We conclude that almost every subspace in Gr(r, R^d) will not satisfy (7), and thus n − 1 = m − r. ∎

We are now ready to present the proofs of Lemma 2 and Theorem 1.

Proof (Lemma 2). (⇒) Suppose the columns of A are linearly dependent. Then some subset of them is minimally linearly dependent; call it A'. By Lemma 4, n(A') = m(A') − r + 1 > m(A') − r, and we have the first implication. (⇐) Suppose there exists an A' with n(A') > m(A') − r. By Lemma 3, n(A') > ℵ(A'), which implies the columns in A', and hence those of A, are linearly dependent. ∎

Proof (Theorem 1). Lemma 1 shows that for a.e. S, the (j, i)-th entry of A is nonzero if and only if the (j, i)-th entry of Ω is nonzero. (⇒) Suppose there exists an Ω' such that m(Ω') < n(Ω') + r. Then m(A') < n(A') + r for some A'. Lemma 2 implies that the columns of A', and hence those of A, are linearly dependent. This implies dim ker A^T > r. (⇐) Suppose every Ω' satisfies m(Ω') ≥ n(Ω') + r. Then m(A') ≥ n(A') + r for every A', including A itself. Therefore, by Lemma 2, the d − r columns of A are linearly independent, hence dim ker A^T = r. ∎

¹ Certain S may not admit this representation, e.g., if S is orthogonal to certain canonical coordinates, which, as discussed in Lemma 1, is not the case for almost every S in Gr(r, R^d).
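The proof is constructive: each a_{ω_i} is a kernel vector of the projected subspace, and stacking the zero-padded a_i recovers S as ker A^T. The following Python sketch (ours, not the paper's; it uses the pattern of Example 2 so that Theorem 1 is guaranteed to hold, and SciPy's `null_space`) illustrates the pipeline:

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
d, r = 8, 2
U = rng.standard_normal((d, r))       # basis of a generic r-dimensional S

# sampling pattern of Example 2: r + 1 nonzero entries per column, N = d - r
Omega = np.vstack([np.ones((r, d - r)), np.eye(d - r)]).astype(bool)

# build A: column i spans ker(S_{omega_i}), zero-padded to R^d
A = np.zeros((d, d - r))
for i in range(d - r):
    rows = Omega[:, i]
    A[rows, i] = null_space(U[rows].T)[:, 0]   # 1-dimensional kernel for a.e. S

S_hat = null_space(A.T)               # r-dimensional for a.e. S, by Theorem 1
same = np.linalg.matrix_rank(np.hstack([U, S_hat])) == r
print(S_hat.shape[1], same)           # 2 True: ker(A^T) equals S
```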

IV. IMPLICATIONS FOR LOW-RANK MATRIX COMPLETION

Subspace identifiability is closely related to the low-rank matrix completion (LRMC) problem [4]: given a subset of entries in a rank-r matrix, exactly recover all of the missing entries. This requires, implicitly, identification of the subspace spanned by the columns of the complete matrix. We use this section to present the implications of our results for LRMC.

Let X be a d × N, rank-r matrix, and assume that:

A3. The columns of X are drawn independently according to ν, an absolutely continuous distribution with respect to the Lebesgue measure on S.

Let X_Ω be the incomplete version of X, observed only in the nonzero positions of Ω.

Necessary and sufficient conditions for LRMC. To relate the LRMC problem to our main results, define Ñ as the number of distinct columns (sampling patterns) in Ω, and let Θ denote a d × Ñ matrix composed of these columns.

Corollary 1. If Θ does not contain a d × (d − r) submatrix satisfying the conditions of Theorem 1, then X cannot be uniquely recovered from X_Ω.

Since X is rank-r, a column with fewer than r observed entries cannot be completed (in general). We will thus assume without loss of generality the following relaxation of A1:

A1'. Ω has at least r nonzero entries per column.

Corollary 2. Let A1' and A3 hold. Suppose Θ contains a d × (d − r) submatrix satisfying the conditions of Theorem 1, and that for every column ω_i in this submatrix, at least r columns of X_Ω are observed at the nonzero locations of ω_i. Then for a.e. S, and almost surely with respect to ν, X can be uniquely recovered from X_Ω.

Proofs of these results are given in the appendix. The intuition behind Corollary 1 is simply that identifying a subspace from its projections onto sets of canonical coordinates is easier than LRMC, and so the necessary condition of Theorem 1 is also necessary for LRMC. Corollary 2 follows from the fact that S (or its projections) can be determined from r or more observations drawn from ν.

Validating LRMC. Under certain assumptions on the subset of observed entries (e.g., random sampling) and on S (e.g., incoherence), existing methods, for example nuclear norm minimization [4], succeed with high probability in completing the matrix exactly and thus identifying S. These assumptions are sufficient but not necessary, and are sometimes unverifiable or unjustified in practice. Therefore, the result of an LRMC algorithm can be suspect. Simply finding a low-rank matrix that agrees with the observed data does not guarantee that it is the correct completion: it is possible that there exist other r-dimensional subspaces, different from S, that agree with the observed entries.

Example 3. Suppose we run an LRMC algorithm on a matrix observed on the support of Ω, with Ω and S as in Example 1 of Section II. Suppose that the algorithm produces a completion with columns from S' = span{[1 1 1 0.5 0.5]^T} instead of S. The residual of the projection of any vector from S'_{ω_i} onto S_{ω_i} will be zero, despite the fact that S' ≠ S. In other words, if the residuals are nonzero, we can discard an incorrect solution; but if the residuals are zero, we cannot validate whether our solution is correct or not.

Corollary 3, below, allows one to drop the sampling and incoherence assumptions and validate the result of any LRMC algorithm deterministically. Let x_i denote the i-th column of X, and x_{ω_i} the restriction of x_i to the nonzero coordinates of ω_i. We say that a subspace S' fits X_Ω if x_{ω_i} ∈ S'_{ω_i} for every i.

Corollary 3. Let A3 hold, and suppose X_Ω contains two disjoint sets of columns, X_{Ω₁} and X_{Ω₂}, such that Ω₂ is a d × (d − r) matrix satisfying the conditions of Theorem 1. Let S' be the subspace spanned by the columns of a completion of X_{Ω₁}. Then for a.e. S, and almost surely with respect to ν, S' fits X_{Ω₂} if and only if S' = S.

The proof of Corollary 3 is given in the appendix. In words, Corollary 3 states that if one runs an LRMC algorithm on X_{Ω₁}, then the uniqueness and correctness of the resulting low-rank completion can be verified by testing whether it agrees with the validation set X_{Ω₂}.

Example 4. Consider a small matrix X of low rank with ideal incoherence. In this case, the best sufficient conditions for LRMC that we are aware of [5] require that all entries be observed. Simulations show that alternating minimization [7] can exactly complete such matrices when fewer than half of the entries are observed, and using only half of the columns. While previous theory for matrix completion gives no guarantees in scenarios like this, our new results do. To see this, split X into two submatrices X_{Ω₁} and X_{Ω₂}. Use nuclear norm minimization, alternating minimization, or any other LRMC method to find a completion of X_{Ω₁}. Theorem 2 can be used to show that the sampling of X_{Ω₂} will satisfy the conditions of Theorem 1 whp, even when only half the entries are observed at random. We can then use Corollary 3 to show that if X_{Ω₂} is consistent with the completion of X_{Ω₁}, then the completion is unique and correct.
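In code, the validation step of Corollary 3 is a residual test on the held-out columns. A minimal sketch (our own; `U_hat` stands for a basis of the column space of the completion of X_{Ω₁}, and the tolerance is an arbitrary choice):

```python
import numpy as np

def fits(U_hat, X, Omega, tol=1e-9):
    """Corollary 3 check: does span(U_hat) fit every validation column,
    i.e., is each observed x_{omega_i} inside the restricted subspace?"""
    for i in range(Omega.shape[1]):
        rows = Omega[:, i].astype(bool)
        Ur, x = U_hat[rows], X[rows, i]
        coef = np.linalg.lstsq(Ur, x, rcond=None)[0]
        if np.linalg.norm(x - Ur @ coef) > tol * max(1.0, np.linalg.norm(x)):
            return False        # nonzero residual: discard the completion
    return True

# If Omega here satisfies Theorem 1 (e.g., checked with identifiable_pattern
# above), a return value of True certifies S' = S almost surely.
```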

Remarks. Observe that the necessary and sufficient conditions in Corollaries 1 and 2, and the validation in Corollary 3, do not require the incoherence assumptions typically needed in LRMC results in order to guarantee correctness and uniqueness. Another advantage of the results above is that they work for matrices of any rank, while standard LRMC results only hold for ranks significantly smaller than the dimension. Finally, the results above hold with probability 1, as opposed to standard LRMC statements, which hold whp. On the other hand, verifying whether Ω meets the conditions of Theorem 1 may be difficult. Nevertheless, if the entries in our data matrix are sampled randomly at rates comparable to standard conditions in LRMC, we know by Theorem 2 that whp Ω will satisfy such conditions.

V. GRAPHICAL INTERPRETATION OF THE PROBLEM

The problem of LRMC has also been studied from the graph theory perspective. For example, it has been shown that graph connectivity is a necessary condition for completion [6]. Subspace identifiability being so tightly related to LRMC, it comes as no surprise that there also exist graph conditions for subspace identifiability. In this section we draw some connections between subspace identifiability and graph theory that give insight into the conditions of Theorem 1. We use this interpretation to show that graph connectivity is a necessary yet insufficient condition for subspace identification.

Define G(Ω) as the bipartite graph with disjoint sets of row and column vertices, in which there is an edge between row vertex j and column vertex i if and only if the (j, i)-th entry of Ω is nonzero.

Example 5. With d = 5, r = 1 and

    Ω = [ 1 0 0 0
          1 1 0 0
          0 1 1 0
          0 0 1 1
          0 0 0 1 ],

G(Ω) is the bipartite graph with row vertices {1, ..., 5}, column vertices {1, ..., 4}, and one edge for each nonzero entry of Ω.

Recall that the neighborhood of a set of vertices is the collection of all their adjacent vertices. The graph-theoretic interpretation of the condition on Ω in Theorem 1 is that every set of n column vertices in G(Ω) must have a neighborhood of at least n + r row vertices.

Example 6. One may verify that every set of n column vertices in G(Ω) from Example 5 has a neighborhood of at least n + r row vertices. On the other hand, if we consider Ω as in Example 1, the neighborhood of the column vertices {1, 2, 3} in G(Ω) contains only the three row vertices {1, 2, 3}, i.e., fewer than n + r row vertices.

With this interpretation of Theorem 1, we can extend terms and results from graph theory to our context. One example is the next corollary, which states that r-row-connectivity is a necessary but insufficient condition for subspace identifiability. We say G(Ω) is r-row-connected if G(Ω) remains a connected graph after removing any set of r − 1 row vertices and all their adjacent edges.

Corollary 4. For a.e. S, |S(S, Ω)| > 1 if G(Ω) is not r-row-connected. The converse is only true for r = 1.

Corollary 4 is proved in the appendix.

VI. CONCLUSIONS

In this paper we determined when, and only when, one can identify a subspace from its projections onto subsets of the canonical coordinates. We showed that the conditions for identifiability hold whp under standard random sampling schemes, and that when these conditions are met, identifying the subspace becomes a trivial task. This gives new necessary and sufficient conditions for LRMC, and allows one to verify whether the result of any LRMC algorithm is unique and correct without prior incoherence or sampling assumptions.

REFERENCES

[1] L. Balzano, B. Recht and R. Nowak, "High-dimensional matched subspace detection when data are missing," IEEE International Symposium on Information Theory, 2010.
[2] Y. Chi, Y. Eldar and R. Calderbank, "PETRELS: Subspace estimation and tracking from partial observations," IEEE International Conference on Acoustics, Speech and Signal Processing, 2012.
[3] M. Mardani, G. Mateos and G. Giannakis, "Rank minimization for subspace tracking from incomplete data," IEEE International Conference on Acoustics, Speech and Signal Processing, 2013.
[4] E. Candès and B. Recht, "Exact matrix completion via convex optimization," Foundations of Computational Mathematics, 2009.
[5] B. Recht, "A simpler approach to matrix completion," Journal of Machine Learning Research, 2011.
[6] F. Király and R. Tomioka, "A combinatorial algebraic approach for the identifiability of low-rank matrix completion," International Conference on Machine Learning, 2012.
[7] P. Jain, P. Netrapalli and S. Sanghavi, "Low-rank matrix completion using alternating minimization," ACM Symposium on Theory of Computing, 2013.
[8] B. Eriksson, P. Barford and R. Nowak, "Network discovery from passive measurements," ACM SIGCOMM, 2008.
[9] J. He, L. Balzano and A. Szlam, "Incremental gradient on the Grassmannian for online foreground and background separation in subsampled video," IEEE Conference on Computer Vision and Pattern Recognition, 2012.
[10] B. Bollobás, Extremal Graph Theory, Dover Publications, 2004.
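To accompany Section V, here is an illustrative check of r-row-connectivity (our own code, not the paper's; it tests connectivity of G(Ω) after deleting every set of at most r − 1 row vertices, matching the definition above):

```python
import numpy as np
from itertools import combinations
from collections import deque

def connected_after_removal(Omega, removed):
    """BFS connectivity of G(Omega) with the given row vertices removed."""
    d, N = Omega.shape
    nodes = [("row", j) for j in range(d) if j not in removed] \
          + [("col", i) for i in range(N)]
    adj = {v: [] for v in nodes}
    for j in range(d):
        if j in removed:
            continue
        for i in range(N):
            if Omega[j, i]:
                adj[("row", j)].append(("col", i))
                adj[("col", i)].append(("row", j))
    seen, queue = {nodes[0]}, deque([nodes[0]])
    while queue:
        for w in adj[queue.popleft()]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)

def r_row_connected(Omega, r):
    d = Omega.shape[0]
    return all(connected_after_removal(Omega, set(rem))
               for k in range(r) for rem in combinations(range(d), k))

# the path-like pattern of Example 5 is 1-row-connected (and identifiable)
Omega5 = np.array([[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 1, 0],
                   [0, 0, 1, 1], [0, 0, 0, 1]], dtype=bool)
print(r_row_connected(Omega5, 1))   # True
```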

APPENDIX

Generalization of our results. Since the restriction of S onto ℓ ≤ r coordinates will be R^ℓ (in general), such a projection will provide no information specific to S. We will thus assume without loss of generality that:

A1''. Ω has at least r + 1 nonzero entries per column.

Under A1'', a column with ℓ_i observed entries restricts S(S, Ω) just as ℓ_i − r columns under A1 would. Thus, in general, if there are columns in Ω with more than r + 1 nonzero entries, we can split them to obtain an expanded matrix Ω⁺ (defined below), with exactly r + 1 nonzero entries per column, and use Theorem 1 directly on this expanded matrix. More precisely, let k_1, ..., k_{ℓ_i} denote the indices of the ℓ_i nonzero entries in the i-th column of Ω. Define Ω⁺_i as the d × (ℓ_i − r) matrix whose j-th column has the value 1 in rows k_1, ..., k_r, k_{r+j}, and zeros elsewhere. For example, if k_1 = 1, ..., k_{ℓ_i} = ℓ_i, then

    Ω⁺_i = [ 1_{r×(ℓ_i−r)}
             I_{(ℓ_i−r)×(ℓ_i−r)}
             0_{(d−ℓ_i)×(ℓ_i−r)} ],

where 1 denotes a block of all 1's and I the identity matrix. Finally, define Ω⁺ = [Ω⁺_1 ⋯ Ω⁺_N].

The following is a generalization of Theorem 1 to an arbitrary number of projections and an arbitrary number of canonical coordinates involved in each projection. It states that S will be the only subspace in S(S, Ω) if and only if there is a matrix, formed with d − r columns of Ω⁺, that satisfies the conditions of Theorem 1.

Theorem 3. Let A1'' hold. For almost every S, S is the only subspace in S(S, Ω) if and only if there is a matrix Ω', formed with d − r columns of Ω⁺, such that every matrix Ω'' formed with a subset of the columns in Ω' satisfies (1).

Proof. It suffices to show that S(S, Ω) = S(S, Ω⁺). Let ω⁺_{ij} denote the j-th column of Ω⁺_i. (⊆) Let S' ∈ S(S, Ω). By definition, S'_{ω_i} = S_{ω_i}, which trivially implies S'_{ω⁺_{ij}} = S_{ω⁺_{ij}} for j = 1, ..., ℓ_i − r. Since this is true for every i, we conclude S' ∈ S(S, Ω⁺). (⊇) Let S' ∈ S(S, Ω⁺). By definition, S'_{ω⁺_{ij}} = S_{ω⁺_{ij}} for j = 1, ..., ℓ_i − r. Notice that Ω⁺_i satisfies the conditions of Theorem 1 restricted to the nonzero rows in ω_i, which implies S'_{ω_i} = S_{ω_i}. Since this is true for every i, we conclude S' ∈ S(S, Ω). ∎

Proof of Theorem 2. Let E be the event that Ω fails to satisfy the conditions of Theorem 1. This may only occur if there is a matrix formed with n columns from Ω that has all its nonzero entries in the same n + r − 1 rows. Let E_n denote the event that the matrix formed with the first n columns from Ω has all its nonzero entries in the first n + r − 1 rows. A union bound over the choices of columns and rows gives

    P(E) ≤ Σ_{n=1}^{d−r} (d−r choose n) (d choose n+r−1) P(E_n).    (9)

If each column of Ω contains at least ℓ nonzero entries, distributed uniformly and independently at random with ℓ as in (2), it is easy to see that P(E_n) = 0 for n ≤ ℓ − r, and for ℓ − r < n ≤ d − r,

    P(E_n) ≤ [ (n+r−1 choose ℓ) / (d choose ℓ) ]^n ≤ ( (n+r−1)/d )^{ℓn}.

Substituting this bound into (9) and splitting the resulting sum into the terms with n ≤ d/2 and the terms with n > d/2, a straightforward computation using r ≤ d/6 and the choice of ℓ in (2) bounds each of the two partial sums by ε/2. We conclude that P(E) < ε, as desired. ∎

Proof of Corollary 1. A subspace satisfying S'_{ω_i} = S_{ω_i} will fit all the columns of X_Ω observed in the nonzero positions of ω_i. Therefore, any subspace that satisfies S'_{ω_i} = S_{ω_i} for every ω_i in Ω will fit all the columns in X_Ω. If Θ does not contain a d × (d − r) submatrix satisfying the conditions of Theorem 1, there will exist multiple such subspaces, whence X cannot be uniquely recovered from X_Ω. ∎

Proof of Corollary 2. Suppose there are at least r columns of X_Ω observed in the nonzero positions of ω_i. Then, almost surely with respect to ν, the restrictions of such columns form a basis for S_{ω_i}. Therefore, any subspace S' that fits these columns must satisfy S'_{ω_i} = S_{ω_i}. If this is true for every ω_i in a d × (d − r) submatrix of Θ, then any subspace that fits X_Ω must satisfy S'_{ω_i} = S_{ω_i} for every ω_i in this submatrix. There will be only one subspace satisfying this condition if the submatrix satisfies the conditions of Theorem 1. Finally, observe that under A1', the condition that X can be uniquely recovered from X_Ω is equivalent to saying that S is the only r-dimensional subspace that fits X_Ω. ∎

Proof of Corollary 3. (⇐) x_{ω_i} ∈ S_{ω_i} by assumption, so if S' = S, it is trivially true that x_{ω_i} ∈ S'_{ω_i}. (⇒) Use i = 1, ..., d − r to index the columns in X_{Ω₂}. Since S' fits X_{Ω₂}, by definition x_{ω_i} ∈ S'_{ω_i}. On the other hand, x_{ω_i} ∈ S_{ω_i} by assumption, which implies that for every i, x_{ω_i} lies in the intersection of S'_{ω_i} and S_{ω_i}. Recall that x_i is drawn independently according to ν, an absolutely continuous distribution with respect to the Lebesgue measure on S. Since dim S'_{ω_i} ≤ r, and for a.e. S, dim S_{ω_i} = r, the event that x_{ω_i} ∈ S'_{ω_i} ∩ S_{ω_i} for every i = 1, ..., d − r will (almost surely with respect to ν) only occur if S'_{ω_i} = S_{ω_i} for all i, that is, if S' ∈ S(S, Ω₂). Since Ω₂ satisfies the conditions of Theorem 1, S is the only subspace in S(S, Ω₂). This implies S' = S, which concludes the proof. ∎

Proof of Corollary 4. (⇒) Suppose G(Ω) is not r-row-connected. This means there exists a set Υ of r − 1 row vertices such that if they are removed with their respective edges, G(Ω) becomes a disconnected graph. Let Υ₁, Υ and Υ₂ be a partition of the row vertices of G(Ω) such that Υ₁ and Υ₂ become disconnected when Υ is removed. Similarly, let Ω₁ and Ω₂ be a partition of the columns of Ω such that the column vertices corresponding to Ω₁ are disconnected from the row vertices in Υ₂, and the column vertices corresponding to Ω₂ are disconnected from the row vertices in Υ₁.

[Fig.: the bipartite graph G(Ω), with the row vertices partitioned into Υ₁, Υ, Υ₂ and the column vertices partitioned into Ω₁, Ω₂.]

Let m₁ = m(Ω₁), m₂ = m(Ω₂), n₁ = n(Ω₁) and n₂ = n(Ω₂). Since m₁ is the number of row vertices that Ω₁ is connected to, and these all lie in Υ₁ ∪ Υ,

    |Υ₁| + r − 1 ≥ m₁.    (17)

Now suppose for contradiction that S(S, Ω) = {S}. By Theorem 1, m₁ ≥ n₁ + r. Substituting this into (17) we obtain

    |Υ₁| ≥ n₁ + 1,    (18)
    |Υ₂| ≥ n₂ + 1,    (19)

where (19) follows by symmetry. Now observe that since Υ₁, Υ and Υ₂ form a partition of the row vertices, d = |Υ₁| + |Υ₂| + r − 1, so using (18) and (19) we obtain

    d − r + 1 ≥ n₁ + n₂ + 2.    (20)

On the other hand, since Ω₁ and Ω₂ form a partition of the d − r columns of Ω, d − r = n₁ + n₂. Plugging this into (20), we obtain d − r + 1 ≥ d − r + 2, which is a contradiction. We thus conclude that |S(S, Ω)| > 1.

(⇐) For r = 1, we prove the converse by contrapositive. Suppose |S(S, Ω)| > 1. By Theorem 1 there exists a matrix Ω₁ formed with a subset of the columns of Ω with m₁ < n₁ + 1, i.e., m₁ ≤ n₁. Let Ω₂ be the matrix formed with the remaining columns of Ω. If m₁ + m₂ < d, there is at least one isolated row vertex in G(Ω), and the converse follows trivially, so suppose m₁ + m₂ ≥ d. Observe that n₁ + n₂ = d − 1. Putting these together, we obtain

    m₂ ≥ d − m₁ ≥ d − n₁ = n₂ + 1 > n₂.

Let Υ₁ and Υ₂ be the row vertices connected to the column vertices in Ω₁ and Ω₂, respectively.

[Figs. (i) and (ii): the two ways in which G(Ω) fails to be connected.]

Now observe that since each column vertex has exactly two edges (r + 1 = 2), the n₁ column vertices of Ω₁ together with the row vertices in Υ₁ span a subgraph with 2n₁ edges on n₁ + m₁ ≤ 2n₁ vertices, and a graph with at least as many edges as vertices necessarily contains a cycle. But G(Ω) has 2d − 1 vertices and only 2(d − 1) edges in total, so it can be connected only if it is a tree, and a tree contains no cycle. In either reading, G(Ω) is disconnected, as claimed.

For r = 2, one may construct samplings Ω such that G(Ω) is r-row-connected and yet Ω does not satisfy the conditions of Theorem 1, with, for instance, the first few columns of Ω failing (1). This example can be easily generalized for r > 2.
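Finally, the conclusion of Theorem 2 is easy to probe empirically for small d. An illustrative sketch (ours, not the paper's; it reuses `identifiable_pattern` from the Section II snippet, which must be in scope, and sweeps ℓ rather than using the theorem's constants, which exceed d at this scale):

```python
import numpy as np

def random_pattern(d, r, ell, rng):
    """d x (d - r) pattern with ell uniformly placed nonzeros per column."""
    Omega = np.zeros((d, d - r), dtype=bool)
    for i in range(d - r):
        Omega[rng.choice(d, size=ell, replace=False), i] = True
    return Omega

d, r, trials = 12, 2, 200
rng = np.random.default_rng(1)
for ell in range(r + 1, d + 1):
    # identifiable_pattern: the brute-force Theorem 1 checker from Section II
    ok = sum(identifiable_pattern(random_pattern(d, r, ell, rng), r)
             for _ in range(trials))
    print(ell, ok / trials)   # empirical success rate climbs toward 1
```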