An Algebraic and Geometric Perspective on Exponential Families


Caroline Uhler (IST Austria)
Based on two papers: with Mateusz Michałek, Bernd Sturmfels, and Piotr Zwiernik, and with Liam Solus and Ruriko Yoshida
Current Trends on Gröbner Bases, July 10, 2015
Exponential Varieties, Osaka, July 2015

Gaussian Graphical Models
A random vector $X \in \mathbb{R}^m$ follows a multivariate Gaussian distribution with concentration matrix $\theta \in \mathbb{S}^m_{\succ 0}$ if it has density
$$p_\theta(x) = (2\pi)^{-m/2} \det(\theta)^{1/2} \exp\left(-\tfrac{1}{2} x^T \theta x\right).$$
$\theta_{ij} = 0$ if and only if $X_i \perp\!\!\!\perp X_j \mid X_{\{1,\ldots,m\} \setminus \{i,j\}}$.
Represent conditional independence relations by an undirected graph.
Figure: (a) Gene interactome (Novarino et al., Science 343, 2014); (b) Stock market (Garos & Panos, Physica A 380, 2007).

Exponential Families
An exponential family is a parametric statistical model
$$p_\theta(x) = \exp\big(\langle \theta, T(x) \rangle - A(\theta)\big)$$
with sample space $\mathcal{X}$, base measure $\nu$ on $\mathcal{X}$, and sufficient statistics $T : \mathcal{X} \to \mathbb{R}^d$ (measurable). Here
$$A(\theta) = \log \int_{\mathcal{X}} \exp\big(\langle \theta, T(x) \rangle\big)\, \nu(dx)$$
is the log-partition function.
Theorem. The following sets are convex:
Space of canonical parameters: $C = \{\theta \in \mathbb{R}^d : A(\theta) < +\infty\}$
Space of sufficient statistics: $K = \mathrm{conv}\big(T(\mathcal{X})\big) \subset \mathbb{R}^d$
Suppose $C$ is open and $K$ spans $\mathbb{R}^d$. Then the gradient map $F : \mathbb{R}^d \to \mathbb{R}^d$, $\theta \mapsto \nabla A(\theta)$, defines an analytic bijection between $C$ and $\mathrm{int}(K)$.

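From Analysis to Algebra
Our exponential families satisfy $A(\theta) = -\alpha \log(f(\theta))$, where $f(\theta)$ is a homogeneous polynomial and $\alpha > 0$.
The gradient of the log-partition function is the rational function
$$F : \mathbb{R}^d \dashrightarrow \mathbb{R}^d, \quad \theta \mapsto -\frac{\alpha}{f(\theta)} \left( \frac{\partial f}{\partial \theta_1}, \frac{\partial f}{\partial \theta_2}, \ldots, \frac{\partial f}{\partial \theta_d} \right).$$
Algebraic geometers prefer
$$\nabla f : \mathbb{CP}^{d-1} \dashrightarrow \mathbb{CP}^{d-1}, \quad \theta \mapsto \left( \frac{\partial f}{\partial \theta_1} : \frac{\partial f}{\partial \theta_2} : \cdots : \frac{\partial f}{\partial \theta_d} \right).$$
The partition function $f(\theta)^{-\alpha} = \int_K \exp(\langle \theta, x \rangle)\, \nu(dx)$ admits a nice integral representation.
Which polynomials $f(\theta)$, exponents $\alpha > 0$, and convex sets $C, K \subset \mathbb{R}^d$ are possible?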

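Multivariate Gaussian Distribution
The multivariate Gaussian distribution is the exponential family
$$p_\theta(x) = (2\pi)^{-m/2} \det(\theta)^{1/2} \exp\left(-\tfrac{1}{2} x^T \theta x\right) = \exp\left( \left\langle \theta, -\tfrac{1}{2} x x^T \right\rangle - \left( -\tfrac{1}{2} \log\det(\theta) + \tfrac{m}{2} \log(2\pi) \right) \right).$$
$\mathcal{X} = \mathbb{R}^m$, $\nu$ Lebesgue measure on $\mathcal{X}$
$d = \binom{m+1}{2}$, $\langle A, B \rangle = \mathrm{tr}(AB)$
$T(x) = -\tfrac{1}{2} x x^T$, $A(\theta) = -\tfrac{1}{2} \log\det(\theta) + \tfrac{m}{2} \log(2\pi)$
$C = \mathbb{S}^m_{\succ 0}$, $K = -\mathbb{S}^m_{\succeq 0}$, $F(\theta) = -\tfrac{1}{2} \theta^{-1}$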
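
The gradient formula on this slide can be checked numerically. The following is a minimal sketch, not part of the talk, assuming the sign convention $A(\theta) = -\tfrac{1}{2}\log\det(\theta) + \tfrac{m}{2}\log(2\pi)$ and the pairing $\langle A, B \rangle = \mathrm{tr}(AB)$. The matrices `theta` and `direction` are arbitrary illustrative choices. It compares a finite-difference directional derivative of $A$ with $\langle F(\theta), \text{direction} \rangle$ for $F(\theta) = -\tfrac{1}{2}\theta^{-1}$.

```python
import numpy as np

def log_partition(theta):
    """A(theta) = -1/2 log det(theta) + m/2 log(2 pi) (assumed sign convention)."""
    m = theta.shape[0]
    _, logdet = np.linalg.slogdet(theta)
    return -0.5 * logdet + 0.5 * m * np.log(2 * np.pi)

def directional_derivative(theta, direction, h=1e-6):
    """Central finite difference of A at theta along a symmetric direction."""
    return (log_partition(theta + h * direction)
            - log_partition(theta - h * direction)) / (2 * h)

# Illustrative positive definite concentration matrix and symmetric direction.
theta = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.5, 0.3],
                  [0.0, 0.3, 1.0]])
direction = np.array([[1.0, 0.2, -0.1],
                      [0.2, 0.5, 0.4],
                      [-0.1, 0.4, 0.3]])

F_theta = -0.5 * np.linalg.inv(theta)        # claimed gradient of A
numeric = directional_derivative(theta, direction)
exact = np.trace(F_theta @ direction)        # <F(theta), direction> = tr(F D)
print(numeric, exact)
```

The two printed numbers should agree up to finite-difference error, which is what the identity $\frac{d}{dt} \log\det(\theta + tD) = \mathrm{tr}(\theta^{-1} D)$ predicts.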

Duality of Polytopes
Example (How to morph a cube into an octahedron?) [Sturmfels & U. 2010, Example 3.5]

Exponential family for cube → octahedron
Fix the product of linear forms $f(\theta) = (\theta_1^2 - \theta_4^2)(\theta_2^2 - \theta_4^2)(\theta_3^2 - \theta_4^2)$.
Space of canonical parameters is $C$ = cone over the 3-cube $\{ |\theta_i| < 1 : i = 1, 2, 3 \}$.
Space of sufficient statistics is $K$ = cone over the octahedron $\mathrm{conv}\{\pm e_1, \pm e_2, \pm e_3\}$.
Gradient map $\nabla f : \mathbb{P}^3 \dashrightarrow \mathbb{P}^3$ gives a bijection between $C$ and $\mathrm{int}(K)$.
Question: What is $(\mathcal{X}, \nu, T)$ in this case?
Answer: $\mathcal{X} = K$, $T = \mathrm{id}$, and $\nu$ is constructed via hypergeometric functions.

Hyperbolic Polynomials
A homogeneous polynomial $f \in \mathbb{R}[\theta_1, \ldots, \theta_d]$ of degree $k$ is hyperbolic if, for some $t \in \mathbb{R}^d$, every line through $t$ intersects the complex hypersurface $\{f = 0\}$ in $k$ real points.
The connected component $C$ of $t$ in $\mathbb{R}^d \setminus \{f = 0\}$ is the hyperbolicity cone. It is an open convex cone.
Theorem (Scott & Sokal, 2015). Let $f$ be a homogeneous polynomial in $\mathbb{R}[\theta_1, \ldots, \theta_d]$ that is strictly positive on an open convex cone $C$. If there exist $\alpha > 0$ and a measure $\nu$ such that
$$f(\theta)^{-\alpha} = \int_{C^\vee} \exp(-\langle \theta, \sigma \rangle)\, \nu(d\sigma) \quad \text{for all } \theta \in C,$$
then $f$ is hyperbolic with respect to each point in $C$.
The resulting statistical models are hyperbolic exponential families.
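
The definition of hyperbolicity can be probed numerically on a single line. The sketch below is an illustration added here, not code from the talk. The helper `real_rooted_on_line` is a hypothetical name. It restricts $f$ to the line $s \mapsto x + s\,t$, interpolates the resulting univariate polynomial, and checks that all of its roots are real. It is applied to the cube polynomial $f(\theta) = (\theta_1^2 - \theta_4^2)(\theta_2^2 - \theta_4^2)(\theta_3^2 - \theta_4^2)$ with $t = (0,0,0,1)$, and to $g(\theta) = \theta_1^2 + \theta_2^2$ as a contrast. Passing this test on one line is necessary but not sufficient: the definition requires it for every line through $t$.

```python
import numpy as np

def real_rooted_on_line(f, degree, t, x, tol=1e-6):
    """Numerically check that s -> f(x + s*t) has `degree` real roots."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    s_samples = np.linspace(-3.0, 3.0, degree + 1)
    values = [f(x + s * t) for s in s_samples]
    # degree + 1 samples determine the degree-`degree` polynomial exactly.
    coeffs = np.polyfit(s_samples, values, degree)
    roots = np.roots(coeffs)
    return len(roots) == degree and bool(np.all(np.abs(roots.imag) < tol))

def cube_poly(theta):
    """f = (t1^2 - t4^2)(t2^2 - t4^2)(t3^2 - t4^2), degree 6."""
    t1, t2, t3, t4 = theta
    return (t1**2 - t4**2) * (t2**2 - t4**2) * (t3**2 - t4**2)

def sum_of_squares(theta):
    """g = t1^2 + t2^2, degree 2, used only as a non-real-rooted contrast."""
    return theta[0]**2 + theta[1]**2

cube_ok = real_rooted_on_line(cube_poly, 6, t=[0, 0, 0, 1], x=[0.3, -1.2, 2.0, 0.5])
sos_ok = real_rooted_on_line(sum_of_squares, 2, t=[1, 0], x=[0.0, 1.0])
print(cube_ok, sos_ok)
```

For the cube polynomial the roots along this line are $s = -x_4 \pm x_i$, all real, while $g$ restricted to its line is $s^2 + 1$, which has no real roots.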

Riesz Kernel
Theorem (Gårding 1951). Let $f \in \mathbb{R}[\theta_1, \ldots, \theta_d]$ be hyperbolic with hyperbolicity cone $C$. If $\alpha > d$, then the following integral converges for any $\theta \in C$, is independent of $\theta$, and is supported on $K = C^\vee$:
$$q_\alpha(\sigma) = (2\pi)^{-d} \int_{\mathbb{R}^d} f(\theta + i\eta)^{-\alpha} \exp(\langle \theta + i\eta, \sigma \rangle)\, d\eta.$$
The polynomial $f$ can be recovered from the Riesz kernel $q_\alpha$ via
$$f(\theta)^{-\alpha} = \int_K \exp(-\langle \theta, \sigma \rangle)\, q_\alpha(\sigma)\, d\sigma \quad \text{for all } \theta \in C.$$
Given a hyperbolic polynomial, what is the annihilator of the Riesz kernel?
$f$ product of linear forms: what is the GKZ-system / D-ideal?
$f$ elementary symmetric polynomial?

Symmetric Determinant
$f(\theta) = \det(\theta)$ is a hyperbolic polynomial in $d = \binom{m+1}{2}$ unknowns.
Hyperbolicity cone $C = \mathbb{S}^m_{\succ 0}$; its dual is $K = C^\vee = \mathbb{S}^m_{\succeq 0}$.
$f(\theta)$ has the integral representation
$$f(\theta)^{-\alpha} = \int_K \exp(-\langle \theta, \sigma \rangle)\, \nu(d\sigma) \quad \text{for all } \theta \in C$$
if and only if $\alpha \in \{0, \tfrac{1}{2}, \ldots, \tfrac{m-1}{2}\}$ or $\alpha > \tfrac{m-1}{2}$.
Measure $\nu(d\sigma) = q_\alpha(\sigma)\, d\sigma$ is given by the Wishart density (measure induced on $\mathbb{S}^m_{\succeq 0}$ by the multivariate Gaussian distribution on $\mathbb{R}^m$).
Riesz kernel: $q_\alpha(\sigma) = \dfrac{1}{\Gamma_m(\alpha)} \det(\sigma)^{\alpha - \frac{m+1}{2}}$
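
As a sanity check, the smallest case $m = 1$ can be worked by hand. This is a special case added here for illustration, not taken from the slides. It assumes the sign convention $\exp(-\langle \theta, \sigma \rangle)$ in the integral representation and that $\Gamma_m$ denotes the multivariate gamma function, so that $\Gamma_1 = \Gamma$. Then $\theta$ is a positive scalar, $f(\theta) = \theta$, the exponent $\alpha - \tfrac{m+1}{2}$ becomes $\alpha - 1$, and the representation reduces to the classical Gamma integral:

```latex
q_\alpha(\sigma) = \frac{\sigma^{\alpha - 1}}{\Gamma(\alpha)}, \qquad
\int_0^\infty e^{-\theta \sigma}\, \frac{\sigma^{\alpha - 1}}{\Gamma(\alpha)}\, d\sigma
= \frac{1}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha)}{\theta^{\alpha}}
= \theta^{-\alpha}
\qquad (\theta > 0,\ \alpha > 0).
```

In this case the admissible set of exponents is $\alpha = 0$ or $\alpha > 0$, which matches the condition on the slide for $m = 1$.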

Hyperbolic Exponential Families: Another Example
The space of canonical parameters $C$ is the hyperbolicity cone of
$$f = \theta_1 \theta_2 \theta_3 + \theta_1 \theta_2 \theta_4 + \theta_1 \theta_3 \theta_4 + \theta_2 \theta_3 \theta_4.$$

Hyperbolic Exponential Families: Another Example
The space of sufficient statistics $K = C^\vee$ is defined by the Steiner surface
$$\sum \sigma_i^4 - 4 \sum \sigma_i^3 \sigma_j + 6 \sum \sigma_i^2 \sigma_j^2 + 4 \sum \sigma_i^2 \sigma_j \sigma_k - 40\, \sigma_1 \sigma_2 \sigma_3 \sigma_4 = 0.$$
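
The relation between the cubic $f = e_3(\theta)$ and the Steiner quartic can be verified in exact arithmetic. The sketch below is an illustration added here, with assumptions labeled: the sums in the quartic are read as running over distinct indices ($\sigma_i^3 \sigma_j$ over ordered pairs $i \neq j$, $\sigma_i^2 \sigma_j^2$ over unordered pairs, $\sigma_i^2 \sigma_j \sigma_k$ over $i$ and unordered pairs $\{j, k\}$ not containing $i$), and the test point `theta` is an arbitrary rational point chosen on the surface $\{f = 0\}$. The gradient of $f$ at such a point should land on the quartic.

```python
from fractions import Fraction
from itertools import combinations, permutations

def e3(theta):
    """f = sum of all products of three distinct coordinates."""
    return sum(a * b * c for a, b, c in combinations(theta, 3))

def grad_e3(theta):
    """Gradient of e3: the i-th partial is e_2 of the other three coordinates."""
    grad = []
    for i in range(4):
        others = [theta[j] for j in range(4) if j != i]
        grad.append(sum(a * b for a, b in combinations(others, 2)))
    return grad

def steiner_quartic(s):
    """Evaluate the quartic, with sums over distinct indices (assumed reading)."""
    idx = range(4)
    p4 = sum(s[i] ** 4 for i in idx)
    p31 = sum(s[i] ** 3 * s[j] for i, j in permutations(idx, 2))
    p22 = sum(s[i] ** 2 * s[j] ** 2 for i, j in combinations(idx, 2))
    p211 = sum(s[i] ** 2 * s[j] * s[k]
               for i in idx
               for j, k in combinations([a for a in idx if a != i], 2))
    return p4 - 4 * p31 + 6 * p22 + 4 * p211 - 40 * s[0] * s[1] * s[2] * s[3]

# A rational point with 1/1 + 1/2 + 1/3 + 1/theta_4 = 0, hence e3(theta) = 0.
theta = [Fraction(1), Fraction(2), Fraction(3), Fraction(-6, 11)]
sigma = grad_e3(theta)
print(e3(theta), sigma, steiner_quartic(sigma))
```

Exact rational arithmetic is used so that the vanishing of the quartic is an identity rather than a floating-point approximation.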

Duality
Gradient map $\nabla f : \mathbb{P}^3 \dashrightarrow \mathbb{P}^3$ gives a bijection between $C$ and $K$.
Open Problem: What is the Riesz kernel?

Intersecting with a Subspace
Fix an exponential family with rational gradient map $F : C \to K$.
Main case: $F = \nabla f$ where $f$ is hyperbolic.
Consider a linear subspace $\mathcal{L} \subset \mathbb{R}^d$ with $C_{\mathcal{L}} := \mathcal{L} \cap C$ nonempty.

Exponential Varieties
The exponential variety is the image under the gradient map:
$$\mathcal{L}^F := F(\mathcal{L}) \subset \mathbb{P}^{d-1}.$$
Its positive part $\mathcal{L}^F_{\succ 0}$ lives in $K$.

Convexity and Positivity
Theorem. Let $(\mathcal{X}, \nu, T)$ be an exponential family with rational gradient map $F : \mathbb{R}^d \dashrightarrow \mathbb{R}^d$, and $\mathcal{L} \subset \mathbb{R}^d$ a linear subspace. The restricted gradient map $F_{\mathcal{L}}$ is the composition
$$C_{\mathcal{L}} \hookrightarrow C \xrightarrow{\;F\;} K \xrightarrow{\;\pi_{\mathcal{L}}\;} K_{\mathcal{L}}.$$
The convex set $C_{\mathcal{L}}$ of canonical parameters maps bijectively to the positive exponential variety $\mathcal{L}^F_{\succ 0}$, and $\mathcal{L}^F_{\succ 0}$ maps bijectively to the interior of the convex set $K_{\mathcal{L}}$ of sufficient statistics.
Maximum Likelihood Estimation for an exponential variety means inverting these two bijections (by solving polynomial equations).

Bijections in Pictures
Green maps to blue maps to green. Inverting this map is MLE.

Maximum Likelihood Estimation
Questions:
Algebraic degree of inversion of $F_{\mathcal{L}}$? [MSUZ, 2015]
When does the MLE exist? I.e. characterize $\mathrm{int}(K_{\mathcal{L}})$. Study $K_{\mathcal{L}}$ and its defining polynomial. [SU, 2010]
Study the extremal rays of $C_{\mathcal{L}}$.
Study Gaussian graphical models on an undirected graph $G = (\{1, \ldots, m\}, E)$:
$$C_G = \{\theta \in \mathbb{S}^m_{\succeq 0} \mid \theta_{ij} = 0 \text{ for all } (i, j) \notin E\}, \qquad K_G = C_G^\vee = \pi_G(\mathbb{S}^m_{\succeq 0}).$$
Characterize the ranks of extremal rays of $C_G$.
Maximal rank is 1 if and only if $G$ is chordal (Agler et al., 1988).
All graphs of maximal rank 2 have been characterized (Laurent, 2001).
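
For a concrete instance of inverting the two bijections, consider the path graph 1 - 2 - 3, which is chordal. This example is added here and is not from the slides. For this graph the MLE has a well-known closed form: the estimate $\hat\Sigma$ agrees with the sample covariance $S$ on the diagonal and on the edges, and the non-edge entry is $\hat\sigma_{13} = s_{12} s_{23} / s_{22}$, which forces $(\hat\Sigma^{-1})_{13} = 0$. The matrix `S` below is an arbitrary illustrative choice, not data from the talk.

```python
import numpy as np

# Illustrative sample covariance matrix (positive definite).
S = np.array([[2.0, 0.8, 0.3],
              [0.8, 1.5, 0.6],
              [0.3, 0.6, 1.0]])

# Path graph 1 - 2 - 3: edges (1,2) and (2,3); the pair (1,3) is a non-edge.
# Keep S on the diagonal and on the edges, and complete the non-edge entry.
Sigma_hat = S.copy()
Sigma_hat[0, 2] = Sigma_hat[2, 0] = S[0, 1] * S[1, 2] / S[1, 1]

# The concentration matrix of the completion has a zero at the non-edge.
theta_hat = np.linalg.inv(Sigma_hat)
print(np.round(theta_hat, 6))
```

The zero in position (1,3) of `theta_hat` is the conditional independence of $X_1$ and $X_3$ given $X_2$. For non-chordal graphs such as the 4-cycle no such closed form is available in general, and the inversion requires solving polynomial equations.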

Elliptopes and Spectrahedral Shadows
Without loss of generality we study the following convex bodies instead of the corresponding convex cones. These convex bodies are dual to each other.
Elliptope of $G$: $K_G = \pi_G(\{\sigma \in \mathbb{S}^m_{\succeq 0} \mid \mathrm{diag}(\sigma) = (1, \ldots, 1)\})$
Spectrahedral shadow of $G$: $C_G = \pi_G(\{\theta \in \mathbb{S}^m_{\succeq 0} \mid \theta_{ij} = 0 \text{ for all } (i, j) \notin E \text{ and } \mathrm{tr}(\theta) = 2\})$
Problem: Characterize the ranks of the extremal points of $C_G$.

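Example: 4-cycle
Figure: the 4-cycle on the vertices 1, 2, 3, 4.
$$K_G = \left\{ \begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \\ \sigma_4 \end{pmatrix} \in \mathbb{R}^4 : \exists\, u, v \text{ s.t. } \begin{pmatrix} 1 & \sigma_1 & u & \sigma_4 \\ \sigma_1 & 1 & \sigma_2 & v \\ u & \sigma_2 & 1 & \sigma_3 \\ \sigma_4 & v & \sigma_3 & 1 \end{pmatrix} \succeq 0 \right\}$$
$$C_G = \left\{ \begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \\ \theta_4 \end{pmatrix} \in \mathbb{R}^4 : \exists\, a, b, c \text{ s.t. } \begin{pmatrix} a & \theta_1 & 0 & \theta_4 \\ \theta_1 & b & \theta_2 & 0 \\ 0 & \theta_2 & c & \theta_3 \\ \theta_4 & 0 & \theta_3 & 2 - a - b - c \end{pmatrix} \succeq 0 \right\}$$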

Cut Polytopes, Elliptopes and Spectrahedral Shadows
Consider another convex body, the cut polytope of $G$:
Let $U \subset V$; the corresponding cutset is the collection of edges $\delta(U) \subset E$ with one endpoint in $U$ and the other endpoint in $U^c$.
Assign to each cutset $\delta(U)$ a $(\pm 1)$-vector $v \in \mathbb{R}^E$ with $v_e = -1$ if and only if $e \in \delta(U)$.
The convex hull of all such vectors is the cut polytope $\mathrm{CUT}^{\pm 1}(G)$.
Note: $\mathrm{CUT}^{\pm 1}(G) \subseteq K_G$

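Example: 3-cycle
Figure panels: (a) $\mathrm{CUT}^{\pm 1}(G)$, (b) $K_G$, (c) $C_G$.
$$\mathrm{CUT}^{\pm 1}(G) = \mathrm{conv}\big((1, 1, 1),\ (-1, -1, 1),\ (-1, 1, -1),\ (1, -1, -1)\big)$$
$$K_G = \left\{ \begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \end{pmatrix} : \begin{pmatrix} 1 & \sigma_1 & \sigma_3 \\ \sigma_1 & 1 & \sigma_2 \\ \sigma_3 & \sigma_2 & 1 \end{pmatrix} \succeq 0 \right\}, \qquad C_G = \left\{ \begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{pmatrix} : \exists\, a, b \text{ s.t. } \begin{pmatrix} a & \theta_1 & \theta_3 \\ \theta_1 & b & \theta_2 \\ \theta_3 & \theta_2 & 2 - a - b \end{pmatrix} \succeq 0 \right\}$$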
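
The vertices of the cut polytope and the inclusion $\mathrm{CUT}^{\pm 1}(G) \subseteq K_G$ can be checked by enumeration for the 3-cycle. The sketch below is an illustration added here. The helper names `cut_vectors` and `in_elliptope_3cycle` are hypothetical, nodes are numbered from 0, and the edges are ordered as $\sigma_1 = (1,2)$, $\sigma_2 = (2,3)$, $\sigma_3 = (1,3)$ to match the matrix above. A cut $U$ is encoded by a sign vector $x \in \{\pm 1\}^m$, so that $v_e = x_i x_j$ equals $-1$ exactly when the edge $e = (i, j)$ crosses the cut.

```python
from itertools import product
import numpy as np

def cut_vectors(num_nodes, edges):
    """All (+1/-1)-vectors in R^E with entry -1 iff the edge crosses the cut."""
    vectors = set()
    for signs in product([1, -1], repeat=num_nodes):
        vectors.add(tuple(signs[i] * signs[j] for i, j in edges))
    return vectors

def in_elliptope_3cycle(sigma, tol=1e-9):
    """Check positive semidefiniteness of the 3x3 matrix with unit diagonal."""
    s1, s2, s3 = sigma
    M = np.array([[1, s1, s3],
                  [s1, 1, s2],
                  [s3, s2, 1]], dtype=float)
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

edges = [(0, 1), (1, 2), (0, 2)]      # sigma_1, sigma_2, sigma_3
vertices = cut_vectors(3, edges)
print(sorted(vertices))
print(all(in_elliptope_3cycle(v) for v in vertices))
print(in_elliptope_3cycle((-1, -1, -1)))   # a (+1/-1)-vector that is not a cut
```

Each cut vector gives the rank-one matrix $x x^T$, which is positive semidefinite with unit diagonal, so all four vertices lie in the elliptope. The vector $(-1, -1, -1)$ is not a cut vector of the triangle and fails the test.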

Graphs with no $K_4$ / $K_5$ minors
Theorem. Graphs with no $K_5$-minor have the facet-ray identification property, i.e. the normal vectors to the facets of $\mathrm{CUT}^{\pm 1}(G)$ identify extremal points in $C_G$.
If $v^T x = b$ is a supporting hyperplane of a facet of $\mathrm{CUT}^{\pm 1}(G)$, then the extremal ray given by the normal vector $v$ has rank $b$.
Theorem. For graphs with no $K_4$-minor the facets of the cut polytope identify all extremal ranks. The extremal ranks are $\{1,\ m - 2 \mid C_m \text{ is an induced cycle of } G\}$.

References
Michałek, Sturmfels, U., and Zwiernik: Exponential varieties, arXiv:1412.6185 (2014).
Solus, U., and Yoshida: Extremal positive semidefinite matrices for weakly bipartite graphs, arXiv:1506.06702 (2015).
Sturmfels and U.: Multivariate Gaussians, semidefinite matrix completion, and convex algebraic geometry, Ann. Inst. Stat. Math. 62 (2010).