The Maximum Likelihood Threshold of a Graph

The Maximum Likelihood Threshold of a Graph Elizabeth Gross and Seth Sullivant San Jose State University, North Carolina State University August 28, 2014 Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 1 / 25

Gaussian Graphical Models S m = set of m m symmetric real matrices S m >0 S m 0 positive definite matrices in Sm positive semidefinite matrices in Sm Let G = (V, E) be an undirected graph, V = m. Definition M G = {Σ S m >0 : (Σ 1 ) ij = 0 for all i, j s.t. i j, ij / E} The centered Gaussian graphical model associated to the graph G consists of all multivariate normal distributions N (0, Σ) such that Σ M G. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 2 / 25

Sufficient Statistics Let φ G : S m R V +E φ G (Σ) = (σ ii ) i V (σ ij ) ij E For data X 1,..., X n R m, let S = 1 n n i=1 X ixi T covariance matrix. be the sample Proposition The minimal sufficient statistics of M G are φ G (S). The set φ G (S m 0 ) = K G is the cone of sufficient statistics. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 3 / 25

Maximum Likelihood Estimation in GGMs Theorem GGM M G is a regular exponential family. For sufficient statistics φ G (S), the likelihood function is convex, with at most one local maximum. If maximum likelihood estimate exists, it is the unique solution to the equation φ G (S) = φ G (Σ) subject to Σ M G. For a given sample covariance matrix S, the maximum likelihood estimate exists if and only if there exists Σ S m >0 such that φ G (S) = φ G (Σ). Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 4 / 25

The Maximum Likelihood Threshold Note if n m then generically S = 1 n n i=1 X ixi T is positive definite. MLE exists almost surely (since φ G (S) = φ G (S)). What happens when n < m? Definition Let G = ([m], E). Smallest n such that for almost all S S m 0 with rank S = n, there exist Σ S m >0 such that φ G(S) = φ G (Σ) is called the maximum likelihood threshold of G. Denoted: mlt(g) n mlt(g) means MLE exists almost surely for n data points. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 5 / 25

Work of Ben-David Definition (Ben-David 2014) The Gaussian rank of a graph is the smallest n such that for all general position rank n matrices S in S m 0, there exist Σ Sm >0 such that φ G (S) = φ G (Σ). A general position positive semidefinite matrix S is one where if S = L T L then every n n submatrix of L is invertible. Both maximum likelihood threshold and gaussian rank ask for this condition to hold on a dense open subset. In gaussian rank, we explicitly specify which open subset to look at. Thus mlt(g) gaussrank(g). Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 6 / 25

Basic Facts Clique number: ω(g) = size of a largest clique of G. A chordal graph is a graph with no induced cycle of length 4. A chordal cover of G = (V, E) is a graph H = (V, E ) such H is chordal and E E. Tree Width: τ(g) = min{ω(h) 1 : H chordal cover of G} Proposition Let G H. Then mlt(g) mlt(h). If G is chordal then mlt(g) = ω(g). Arbitrary G: ω(g) mlt(g) τ(g) + 1. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 7 / 25

Basic Facts Note that the bounds ω(g) mlt(g) τ(g) + 1. are far from each other in many situations. Proposition [Buhl 1993] For m-cycle C m mlt(c m ) = 3. mlt(g) = 1 if and only if G has no edges. mlt(g) = 2 if and only if G has no cycles. mlt(g) = m if and only if G = K m. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 8 / 25

Other Ways To Think About mlt(g) Instead of computing S = 1 n n i=1 X ixi T take p 1,..., p m R n and let S ij = p i p j. Problem For each graph G, what is the smallest n such that for almost all P = (p 1,..., p m ) R n m there exists a set of linearly independent vectors P = (q 1,..., q m ) R m m such that p i 2 = q i 2 for all i, and p i p j = q i q j for all ij E? We can assume all p i 2 = 1, i.e. all p i are confined to sphere. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 9 / 25

The Cycle Proposition (Buhl 1993) If C m is a cycle then mlt(c m ) = 3. Proof. τ(c m ) = 2, so mlt(c m ) 3. mlt(c m ) > 2, since there are non-liftable embeddings on the circle. 1 5 2 1 2 3 4 4 3 5 Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 10 / 25

Yet another way to think about mlt(g) K G := φ G (S m 0 ) is the cone of sufficient statistics. For a given S S m 0 the maximum likelihood estimate exists if and only if φ G (S) int(k G ). Let S(m, n) = {Σ R m m : Σ = Σ T, rank(σ) n}. Proposition If mlt(g) > n then φ G (S(m, n)) is contained in the Zariski closure of the boundary of K G. Definition Minimal n such that dim φ G (S(m, n)) = dim K G = V + E is rank(g). Corollary (Uhler 2012) mlt(g) rank(g). Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 11 / 25

Algebraic Matroids Definition Let I K[x 1,..., x n ] be a prime ideal. This defines an algebraic matroid with ground set {x 1,..., x n }, and K {x 1,..., x n } an independent set if and only if I K[K ] = 0. Corollary (Uhler 2012) Let I n K[σ ij : 1 i j m] be the ideal of n n minors of an m m symmetric matrix Σ. Let G = (V, E) be a graph. If {σ ii : i [m]} {σ ij : ij E} is an independent set in the algebraic matroid associated to I n+1 then mlt(g) n. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 12 / 25

Combinatorial Rigidity Theory Consider the map ψ n : R n m R m(m 1)/2 (p 1,..., p m ) ( p i p j 2 2 : 1 i < j m) Let J n = I(im(ψ n )) K[x ij : 1 i < j m]. The resulting matroid A(n) is the n-dimensional generic rigidity matroid. Spanning sets in the matroid are called (generically infinitesimally) rigid graphs. Bases in the matroid are called (generically) isostatic graphs. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 13 / 25

Laman s Theorem To be a spanning set in A(n) must have nm ( ) n+1 2 E. Conversely, to be independent must have nm ( ) n+1 2 E. Theorem (Laman 1970) A graph G = (V, E) is isostatic in the rigidity matroid A(2) if and only if E = 2m 3, and For every induced subgraph G W = (W, E W ), E W 2 W 3. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 14 / 25

Rigidity Matroid = Symmetric Minor Matroid Theorem (Gross-S 2014) Proof. A graph G = (V, E) has rank(g) = n if and only E is an independent set in A(n 1) and not an independent set in A(n 2). The matroid A(n 1) is isomorphic to the contraction of the rank n symmetric minor matroid via the diagonal elements. Compare the Jacobian of the map to the Jacobian of the map (p 1,..., p m ) ( p i p j 2 2 : 1 i < j m) (p 1,..., p m ) (p i p j : 1 i j m). Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 15 / 25

Applications of Rigidity Theory Definition Let G be a graph, and r N. The r-core of G is the graph obtained by successively removing vertices of G of degree < r. Theorem (Ben-David 2014, Gross-S 2014) If mlt(core r (G)) r then mlt(g) r. Example (Grid Graphs) mlt(gr k1,k 2 ) = 3, but τ(gr k1,k 2 ) + 1 = min(k 1, k 2 ) + 1. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 16 / 25

Planar Graphs Theorem If G is a planar graph then mlt(g) 4. Proof. Cauchy s theorem implies that every edge graph of a simplicial 3-polytope is rigid. Edge count = G isostatic = rank(g) 4. Every planar graph is a subgraph of graph of a simplicial 3-polytope. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 17 / 25

Splitting Theorem Theorem (Splitting Theorem) Let G be a graph, r 1,..., r k integers and V 1,..., V k a partition of the vertices of G, such that 1 for all i, rank(g Vi ) r i and 2 for all i j, (r i, r j ) birank(g(v i, V j )). Then rank(g) r 1 + + r k. Proof idea: Decompose R r 1+ +r k = R r 1 R r k for i V j let p i = (0, 0,..., 0, q }{{} i, 0,..., 0) jth spot Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 18 / 25

Cycle-Free Coloring Corollary Let G be a graph and V 1,..., V k be a partition of the vertices of G such that 1 for all i, V i is an independent set of G and 2 for all i j, G(V i, V j ) has no cycles. Then rank(g) k. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 19 / 25

Weak Maximum Likelihood Threshold 1 1 2 5 2 1 2 3 4 4 3 4 3 5 5 Definition For each graph G, the weak maximum likelihood threshold, wmlt(g), is the smallest n such that there exists a Σ 0 Sym(m, n) S m 0 and a Σ S m >0 such that φ G(Σ 0 ) = φ G (Σ). In other words, MLE exists with probability > 0. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 20 / 25

Splitting Lemma (Splitting Lemma) Let G be a graph, r 1,..., r k integers and V 1,..., V k a partition of the vertices of G, such that for all i, wmlt(g Vi ) r i. Then wmlt(g) r 1 + + r k. Corollary Let G be a graph. Then wmlt(g) χ(g). Corollary Let G be a nonempty bipartite graph. Then wmlt(g) = 2. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 21 / 25

Buhl s Cycle Condition Proposition Let G = ([m], E) be a graph with wmlt(g) = 2. Then 1 G is triangle-free and 2 there exists a cyclic order w = w 1 w 2 w m of the vertices of G that puts every cycle of G in the wrong cyclic order. Question Does every triangle free graph have a cycle disruptive ordering of its vertices? Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 22 / 25

More Stuff We Don t Know Find an example of a graph where mlt(g) < rank(g). e.g. A graph where dim φ G (S(m, n)) V + E but φ G (S(m, n) S m 0 ) is not in the algebraic boundary of K G. How are boundary components of K G related to circuits in the rigidity matroid? Maximum likelihood threshold has a natural rigidity theory analogue: are they equivalent? Weak maximum likelihood threshold has a natural rigidity theory analogue: are they equivalent? Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 23 / 25

Summary and Conclusions The maximum likelihood threshold bounds the amount of data needed to guarantee the maximum likelihood estimate exists for a given Gaussian graphical model. Result of Uhler connects mlt(g) to algebraic matroids. We connect the symmetric minor matroid to the rigidity matroid. We use results from rigidity theory to get new bounds on mlt(g). Many open problems remain! Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 24 / 25

References E. Ben-David. Sharper lower and upper bounds for the gaussian rank of a graph. (2014) ArXiv:1406.4777 S. Buhl. On the existence of maximum likelihood estimators for graphical Gaussian models. Scand. J. Statist. 20 (1993), no. 3, 263 270. J. Graver, B. Servatius, and H. Servatius. Combinatorial Rigidity. Graduate Studies in Mathematics, Vol. 2. American Mathematical Society, 1993. E. Gross and S. Sullivant. The maximum likelihood threshold of a graph. (2014) ArXiv:1404.6989 G. Kalai, E. Nevo, I. Novik. Bipartite rigidity. (2013) ArXiv: 1312.0209 G. Laman. On graphs and the rigidity of plane skeletal structures, J. Engineering Mathematics 4 (1970): 331 340. C. Uhler. Geometry of maximum likelihood estimation in Gaussian graphical models. Ann. Statist. 40 (2012), no. 1, 238 261. Seth Sullivant (NCSU) Maximum Likelihood Threshold August 28, 2014 25 / 25