arxiv: v2 [math.st] 29 Oct 2015


EXPONENTIAL RANDOM SIMPLICIAL COMPLEXES

KONSTANTIN ZUEV, OR EISENBERG, AND DMITRI KRIOUKOV

arXiv:1502.05032v2 [math.ST] 29 Oct 2015

Department of Physics, Department of Mathematics, Department of Electrical & Computer Engineering, Northeastern University, Boston, MA 02115, USA. E-mail address: k.zuev@neu.edu, eisenberg.o@husky.neu.edu, dima@neu.edu.

Abstract. Exponential random graph models have attracted significant research attention over the past decades. These models are maximum-entropy ensembles subject to the constraints that the expected values of a set of graph observables are equal to given values. Here we extend these maximum-entropy ensembles to random simplicial complexes, which are more adequate and versatile constructions to model complex systems in many applications. We show that many random simplicial complex models considered in the literature can be cast as maximum-entropy ensembles under certain constraints. We introduce and analyze the most general random simplicial complex ensemble with statistically independent simplices. Our analysis is simplified by the observation that any distribution $\mathbb{P}(O)$ on any collection of objects $\mathcal{O} = \{O\}$, including graphs and simplicial complexes, is maximum-entropy subject to the constraint that the expected value of $-\ln \mathbb{P}(O)$ is equal to the entropy of the distribution. With the help of this observation, we prove that this ensemble is maximum-entropy subject to the two types of constraints which fix the expected numbers of simplices and their boundaries.

Keywords. Random simplicial complexes, random graphs, maximum-entropy distributions, exponential random graphs model, network models.

1. Introduction

When studying complex systems consisting of many interconnected, interacting components, it is rather natural to represent the system as a graph or, more generally, as a simplicial complex. Modeling complex systems with graphs has proved to be useful for understanding systems as intricate as the Internet, the human brain, and interwoven social groups, and has led to a new area of research, called network science [12, 14, 35]. A host of developed network models (e.g. see [21] for a survey) can be roughly divided into two classes: generative models and descriptive models [1].

Generative models are algorithms which describe how to generate a network using some probabilistic rules for connecting nodes. These models primarily aim to uncover the hidden evolution mechanisms responsible for certain properties observed in real networks. A classical, and perhaps the simplest and best studied, example of a generative model is the Erdős–Rényi random graph $G(n, p)$ [17, 18, 43]: given $n$ nodes, place a link between every two nodes independently at random with probability $p$. Among other prominent examples are the preferential attachment model [4, 13, 30] and the small-world model [36, 47, 48], which explain the power-law degree distributions and small distances between most nodes, two universal properties observed in many real networks. Any generative model gives rise to an ensemble $(\mathcal{G}, \mathbb{P})$, where $\mathcal{G}$ is a set of all graphs the model can possibly generate and $\mathbb{P}$ is the probability distribution on $\mathcal{G}$, where $\mathbb{P}(G)$ is the probability that the model generates $G \in \mathcal{G}$. One can always readily sample from $\mathbb{P}$ (using the network generating algorithm), but often cannot obtain a closed-form expression for $\mathbb{P}(G)$, or even implicitly describe $\mathbb{P}$ as a solution of some optimization problem or equation.

Generative models can help to understand the fundamental organizing principles behind real networks and explain their qualitative behavior, but they are not specifically designed for network data analysis. Descriptive models attempt to fill this gap. A descriptive model is explicitly defined as an ensemble $(\mathcal{G}, \mathbb{P}_\theta)$, where $\mathcal{G}$ is a set of graphs and $\mathbb{P}_\theta$ is the joint probability distribution on $\mathcal{G}$ parameterized by a vector of parameters $\theta$, which are to be inferred from the observed network data. For any graph $G \in \mathcal{G}$, a descriptive model gives a closed-form expression for $\mathbb{P}_\theta(G)$ which can be used for further statistical inference, e.g. for estimating ensemble averages $\sum_{G \in \mathcal{G}} x(G)\,\mathbb{P}_\theta(G)$, where $x$ is a network property of interest. In contrast to generative models, however, a descriptive model does not specify how to sample networks from $\mathbb{P}_\theta$, which is often a challenging task.

In simple cases, a network model can be represented as both a generative and a descriptive model. For example, the Erdős–Rényi random graph $G(n, p)$ can be defined, as above, by a generative algorithm, or by the formula for the probability distribution
$$\mathbb{P}(G) = p^{f_1(G)} (1 - p)^{\binom{n}{2} - f_1(G)},$$
where $f_1(G)$ is the number of edges in $G$. In general, however, representing a generative model as descriptive (and vice versa) is a very difficult problem whose solution could be very useful for applications.

Exponential random graphs (ERGs) [19, 23, 29, 37, 42, 45], often called $p^*$ models in the social network research community [3, 40, 46], are among the most popular and best studied descriptive models which provide a conceptual framework for statistical modeling of network data. Let $\mathcal{G}_n$ be the set of all simple graphs (without self-loops or multi-edges) with $n$ nodes, $x_1, \dots, x_r$ be functions on $\mathcal{G}_n$, henceforth referred to as the graph observables, and let $\bar{x}_1, \dots, \bar{x}_r$ be the values of these observables $x_1(\bar{G}), \dots, x_r(\bar{G})$ for a network of interest $\bar{G} \in \mathcal{G}_n$ computed from available network data.
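The closed-form Erdős–Rényi probability above can be checked by brute force on a small node set. The following is a minimal sketch, not part of the original paper: the helper names `er_probability` and `enumerate_graphs` and the values `n = 4`, `p = 0.3` are illustrative assumptions. It enumerates every simple graph on $n$ labeled nodes, and accumulates the total probability and the ensemble average of the edge count $f_1$.

```python
from itertools import combinations, product
from math import comb

def er_probability(num_edges: int, n: int, p: float) -> float:
    """Closed-form P(G) = p^f1 * (1-p)^(C(n,2) - f1) for a graph with f1 edges."""
    return p ** num_edges * (1 - p) ** (comb(n, 2) - num_edges)

def enumerate_graphs(n: int):
    """Yield every simple graph on n labeled nodes as a tuple of edges."""
    possible_edges = list(combinations(range(n), 2))
    for mask in product((0, 1), repeat=len(possible_edges)):
        yield tuple(e for e, bit in zip(possible_edges, mask) if bit)

# Illustrative (hypothetical) parameter values.
n, p = 4, 0.3

total = 0.0        # sum of P(G) over all graphs; should equal 1
mean_edges = 0.0   # ensemble average of f1; should equal C(n,2) * p
for graph in enumerate_graphs(n):
    prob = er_probability(len(graph), n, p)
    total += prob
    mean_edges += len(graph) * prob

print(total, mean_edges, comb(n, 2) * p)
```

The exhaustive enumeration grows as $2^{\binom{n}{2}}$, so this check is only practical for very small $n$; it is meant to illustrate the descriptive view of $G(n, p)$, not to replace sampling.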
The ERG model defined by $\bar{G}$ and its observables $x_1, \dots, x_r$ is the exact analog of the Boltzmann distribution in statistical mechanics:
$$\mathbb{P}_\theta(G) = \frac{e^{-H_\theta(G)}}{Z(\theta)}, \qquad H_\theta(G) = \sum_{i=1}^{r} \theta_i x_i(G), \tag{1.1}$$
where $H_\theta(G)$ is called the graph Hamiltonian, $Z(\theta)$ the partition function (the normalization constant), and $\theta = (\theta_1, \dots, \theta_r)$ is a vector of model parameters which satisfy
$$-\frac{\partial \ln Z}{\partial \theta_i} = \bar{x}_i. \tag{1.2}$$
Whereas originally (1.1) was simply postulated and used in empirical studies [23], it was later recognized [19, 37, 45] that ERGs are maximum-entropy ensembles. Namely, the distribution defined by (1.1) and (1.2) maximizes the Gibbs entropy
$$S(\mathbb{P}) = -\sum_{G \in \mathcal{G}_n} \mathbb{P}(G) \ln \mathbb{P}(G), \tag{1.3}$$
subject to the $r$ soft constraints and the normalization condition
$$E_{\mathbb{P}}[x_i] = \sum_{G \in \mathcal{G}_n} x_i(G)\,\mathbb{P}(G) = \bar{x}_i, \tag{1.4}$$
$$\sum_{G \in \mathcal{G}_n} \mathbb{P}(G) = 1. \tag{1.5}$$
The general principle of maximum entropy is thoroughly reviewed in [38]. In the context of complex networks, the principle of maximum entropy and different entropy measures are discussed in [2]. Despite some known problems with ERGs with nonlinearly correlated constraints [7, 24, 41], ERGs remain one of the most popular descriptive models for network data analysis, especially in social science.

In many cases, however, representing a complex system with a simplicial complex, a higher-dimensional analog of a graph, is conceptually more sound than the basic network representation, and provides a higher order approximation of the system. Consider for example a social system of scientific collaboration. Three researchers may co-author a single article or they may have three different papers with two authors each. The network representation, where nodes are connected if the corresponding scientists co-authored a paper, will not distinguish between these two cases. But we can do this by placing (in the former case), or not (in the latter case), a 2-simplex on the three nodes. This is illustrated in Fig. 1.

Figure 1. Networks vs simplicial complexes. The outcome of collaboration between scientists Z, E, and K could be three different papers co-authored by Z and E, E and K, and K and Z (Panel (a)), or a single paper co-authored by all three scientists (Panel (b)). While the network representation does not distinguish between these two cases and results in the graph in Panel (a), the simplicial complex representation does by adding in the latter case the triangle {Z, E, K} in Panel (b).

Other examples, where the simplicial complex representation is more accurate, include biological protein-interaction systems, where proteins form protein complexes often consisting of more than two proteins, economic systems of financial transactions often involving several parties, and social systems, where groups of people are united by a common motive, interest, or goal, as opposed to merely being pairwise connected. In general, compared to graphs, simplicial complexes encode more relevant information about a complex system, and make possible modeling beyond dyadic interactions. They have been used in many applications, including modeling social aggregation [28], agent interaction [44], opinion formation and dynamics [32, 33], coverage and hole-detection in sensor networks [20], and broadcasting in wireless networks [39], to name just a few. We remark that prior to their being used for studying complex interactions, simplicial complexes were used in a rich variety of geometric problems, ranging from grid generation in finite element analysis to modeling configuration spaces of dynamical systems [15]. Further details and applications can be found in [16].

In this paper, we introduce exponential random simplicial complexes (ERSCs), which are higher dimensional generalizations of exponential random graphs, develop the formalism for ERSCs, and show that several popular generative models of random simplicial complexes, namely random flag complexes [25], Linial–Meshulam complexes [31], and Kahle's multi-parameter model [27], can all be explicitly represented as ERSCs.
We also introduce the most general ensemble of random simplicial complexes with statistically independent simplices, and show that this ensemble is an ERSC ensemble as well.

2. Basic Definitions and Notations

Here we recall a few basic definitions and introduce notation that we use throughout the paper. For a comprehensive reference on simplicial complexes the reader is referred to [34].

A simplicial complex $C$ on $n$ vertices $V = \{1, \dots, n\}$ is a collection of non-empty subsets of $V$, called simplices. Complex $C$ contains all vertices, $\{i\} \in C$, and is closed under the subset relation: if $\sigma \in C$ and $\tau \subset \sigma$, then $\tau \in C$, where $\tau$ is called a face of simplex $\sigma$, and $\sigma$ is a coface of $\tau$. A simplex $\sigma$ is called a $k$-simplex of dimension $k$ if its cardinality is $|\sigma| = k + 1$. It is useful to think of a $k$-simplex as the convex hull of $(k + 1)$ points in general position in $\mathbb{R}^K$, $K \geq k$ [22]. For instance, 0-, 1-, 2-, and 3-simplices are, respectively, vertices, edges, triangles, and tetrahedra. A simplicial complex is then a collection of simplices of different dimensions properly glued together. We say that $C$ has dimension $m$ if it has at least one $m$-simplex, but does not have simplices of higher dimension. Clearly, $m \leq n - 1$.
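The definition above translates directly into a membership check. The sketch below is an illustration only, under the assumption that a complex is stored as a collection of vertex sets; the function names `is_simplicial_complex` and `dimension` are hypothetical and do not come from the paper. It verifies that all vertices are present and that the collection is closed under taking non-empty faces, and computes the dimension $m$.

```python
from itertools import combinations

def is_simplicial_complex(simplices, n):
    """Check that `simplices` is a simplicial complex on vertices {1, ..., n}."""
    vertex_set = set(range(1, n + 1))
    cx = {frozenset(s) for s in simplices}
    # Every simplex must be a non-empty subset of V.
    if any(not s or not s <= vertex_set for s in cx):
        return False
    # The complex must contain every vertex {i}.
    if any(frozenset({v}) not in cx for v in vertex_set):
        return False
    # Closure: every non-empty proper subset (face) of a simplex is present.
    return all(
        frozenset(face) in cx
        for s in cx
        for k in range(1, len(s))
        for face in combinations(sorted(s), k)
    )

def dimension(simplices):
    """Dimension m of a complex: largest simplex cardinality minus one."""
    return max(len(s) for s in simplices) - 1

# A filled triangle (dimension 2), a hollow triangle (dimension 1),
# and a collection that is not closed because edge {2, 3} is missing.
filled_triangle = [{1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}]
hollow_triangle = [{1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}]
missing_edge = [{1}, {2}, {3}, {1, 2}, {1, 3}, {1, 2, 3}]

print(is_simplicial_complex(filled_triangle, 3), dimension(filled_triangle))
print(is_simplicial_complex(hollow_triangle, 3), dimension(hollow_triangle))
print(is_simplicial_complex(missing_edge, 3))
```

The hollow and filled triangles share the same 1-skeleton, which is exactly the distinction a graph cannot express.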

Figure 2. Simplicial complex and its adjacency tensors. In this example, $C \in \mathcal{C}_{10}$, $\dim C = 3$, and the non-zero elements $a_{i_\delta}$ of adjacency tensor $a_\delta$, $\delta = 1, 2, 3, 4$, are: $a_{i_1} = 1$ for all $i_1 = 1, \dots, 10$; $a_{i_2} = 1$ for $i_2 = (1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7), (7, 8), (8, 9), (8, 10)$, and $(9, 10)$; $a_{i_3} = 1$ for $i_3 = (2, 4, 5), (4, 5, 6), (4, 5, 7), (4, 6, 7)$, and $(5, 6, 7)$; $a_{i_4} = 1$ only for $i_4 = (4, 5, 6, 7)$. The edge $\{4, 6\}$ is not visible because of the 3-simplex $\{4, 5, 6, 7\}$.

Let $\mathcal{C}_n$ be the set of all simplicial complexes on $n$ vertices. By analogy with graphs, where there exists a one-to-one correspondence between $\mathcal{G}_n$ and the set of all boolean symmetric $n$-by-$n$ matrices with zeros on the diagonal, known as adjacency matrices, we can represent $\mathcal{C}_n$ by a tensor product
$$\mathcal{C}_n \cong \bigotimes_{\delta=1}^{n} a_\delta, \tag{2.1}$$
where $a_\delta = \{a_{i_1, \dots, i_\delta}\}$, $i_j = 1, \dots, n$, $j = 1, \dots, \delta$, is a boolean symmetric tensor of order $\delta$ with zeros on all its diagonals. These conditions require precisely that $a_{i_1, \dots, i_\delta} = a_{i_{\kappa(1)}, \dots, i_{\kappa(\delta)}}$ for any permutation $\kappa$ of subsubindices $1, \dots, \delta$, and $a_{i_1, \dots, i_\delta} = 0$ if $i_j = i_k$ for any pair of $j$ and $k$. The non-redundant elements of tensor $a_\delta$ are thus $a_{i_\delta}$, where multi-index $i_\delta$ denotes a $\delta$-tuple of indices with increasing values:
$$i_\delta = i_1, \dots, i_\delta, \tag{2.2}$$
$$1 \leq i_1 < \dots < i_\delta \leq n. \tag{2.3}$$
The only requirement for $\bigotimes_{\delta=1}^{n} a_\delta$ to be in bijection with $\mathcal{C}_n$ is then the following compatibility condition:
$$a_{i_\delta} = 1 \;\Rightarrow\; b_{i_\delta} \overset{\mathrm{def}}{=} \prod_{k=1}^{\delta} a_{i_\delta^{\hat{k}}} = 1, \tag{2.4}$$
where
$$i_\delta^{\hat{k}} = i_1, \dots, \hat{i}_k, \dots, i_\delta \tag{2.5}$$
is the $(\delta - 1)$-long multi-index obtained from multi-index $i_\delta$ by omitting index $i_k$. It is useful to think of $i_\delta^{\hat{k}}$ as the result of operation $(\cdot)^{\hat{k}}$, which is the deletion of the $k$-th index, applied to multi-index $i_\delta$. Condition (2.4) simply formalizes the requirement that if the complex contains simplex $\{i_\delta\}$, then it also contains all its faces. For a simplicial complex $C \in \mathcal{C}_n$, $a_\delta = \{a_{i_\delta}\}$ is thus its adjacency tensor that encodes the presence of $(\delta - 1)$-simplices: $a_{i_\delta} = 1$ if $\{i_\delta\} \in C$, and zero otherwise.
Since we assume that $C$ has $n$ vertices, we trivially have $a_1 = 1_n = (1, \dots, 1)$. Figure 2 illustrates the correspondence between simplicial complexes and their adjacency tensors.

A subcomplex of $C$ is a subset $C' \subset C$ that is also a simplicial complex. The $\delta$-skeleton of $C$, denoted $C^{(\delta)}$, is a subcomplex consisting of all $k$-simplices of $C$ with $k \leq \delta$. The 1-skeleton of a simplicial complex, for example, is a graph.

Figure 3. Empty and filled skeletons. The left simplicial complex is the 1-skeleton $C^{(1)}$ of the complex $C$ in Fig. 2. The filled 1-skeleton $C^{[2]}$ on the right is obtained by adding all 2-simplices based on triangular subgraphs of $C^{(1)}$. The 3-simplex $\{4, 5, 6, 7\} \in C$ does not belong to $C^{[2]}$. The filled skeleton $C^{[2]}$ is not a subcomplex of $C$ since, for example, $\{8, 9, 10\} \notin C$. The filling operation is denoted by [2].

Definition 1. The filled $\delta$-skeleton, denoted $C^{[\delta+1]}$, is a simplicial complex
$$C^{[\delta+1]} = C^{(\delta)} \cup \left\{ \{i_{\delta+2}\} : b_{i_{\delta+2}} = 1 \right\}. \tag{2.6}$$
In other words, $C^{[\delta+1]}$ is obtained from $C^{(\delta)}$ by adding $(\delta + 1)$-simplices as follows. For every $(\delta + 1)$-simplex $\{i_{\delta+2}\}$, if $C^{(\delta)}$ contains all $(\delta + 2)$ $\delta$-simplices $\{i_{\delta+2}^{\hat{k}}\}$, $k = 1, \dots, \delta + 2$, we add $\{i_{\delta+2}\}$ to $C^{(\delta)}$. Intuitively, we add $\{i_{\delta+2}\}$ if its $\delta$-dimensional boundary is already in $C^{(\delta)}$. Note that in this case we add $\{i_{\delta+2}\}$ even if $\{i_{\delta+2}\} \notin C$, and, therefore, $C^{[\delta+1]}$ is not necessarily a subcomplex of $C$. For example, $C^{[1]}$ is a complete graph on $n$ vertices, and $C^{[2]}$ is the 1-skeleton of $C$ with all its triangular subgraphs filled by 2-simplices. We denote the filled $\delta$-skeleton by $C^{[\delta+1]}$ (instead of $C^{[\delta]}$), to emphasize that generally it has dimension $(\delta + 1)$. Figure 3 illustrates the construction of a filled skeleton.

Thus, we have the following hierarchy of empty and filled skeletons:
$$V = C^{(0)} \subset C^{(1)} \subset \dots \subset C^{(\delta-1)} \subset C^{(\delta)} \subset \dots \subset C^{(m-1)} \subset C^{(m)} = C, \qquad C^{(\delta-1)} \xrightarrow{[\cdot]} C^{[\delta]}, \quad \delta = 1, \dots, m, \tag{2.7}$$
where $[\cdot]$ denotes the filling operation. Let $f_\delta$ denote the number of $\delta$-simplices in $C^{(\delta)}$ (and therefore in $C$), and $\varphi_\delta$ be the number of $\delta$-simplices in $C^{[\delta]}$. By construction, $\varphi_\delta \geq f_\delta$, and
$$f_\delta = \sum_{i_{\delta+1}} a_{i_{\delta+1}} \quad \text{and} \quad \varphi_\delta = \sum_{i_{\delta+1}} b_{i_{\delta+1}}. \tag{2.8}$$
Figure 6 shows all simplicial complexes $C \in \mathcal{C}_3$ and the values of $f_1$, $f_2$, and $\varphi_2$ for each $C$.

3. Exponential Random Simplicial Complexes

Let $\mathcal{S}$ be any subset of $\mathcal{C}_n$, $\{x_1, \dots, x_r\}$ be a set of functions on $\mathcal{S}$, $x_i : \mathcal{S} \to \mathbb{R}$, and $\{\bar{x}_1, \dots, \bar{x}_r\}$ be a set of numbers, $\bar{x}_i \in \mathbb{R}$. We define the exponential random simplicial complex (ERSC) as a maximum-entropy ensemble of complexes with soft constraints that require the observables $x_i$ to have the expected values $\bar{x}_i$ in the ensemble.

Definition 2. An exponential random simplicial complex $\mathrm{ERSC}(\mathcal{S}, \{x_i\}, \{\bar{x}_i\})$ is a pair $(\mathcal{S}, \mathbb{P})$, where $\mathbb{P}$ is a probability distribution on $\mathcal{S}$ that maximizes the entropy
$$S(\mathbb{P}) = -\sum_{C \in \mathcal{S}} \mathbb{P}(C) \ln \mathbb{P}(C) \to \max, \tag{3.1}$$
subject to the following constraints
$$E_{\mathbb{P}}[x_i] = \sum_{C \in \mathcal{S}} x_i(C)\,\mathbb{P}(C) = \bar{x}_i, \tag{3.2}$$
$$\sum_{C \in \mathcal{S}} \mathbb{P}(C) = 1. \tag{3.3}$$

An exponential random simplicial complex is thus a descriptive model for random simplicial complexes. Generative models have been recently introduced and analyzed in [5, 6, 49]. We can define ERSC for any set of simplicial complexes, but, for most of the paper, we restrict ourselves to $\mathcal{C}_n$ and its subsets. If we use $\mathcal{S} = \mathcal{G}_n \subset \mathcal{C}_n$, then we recover the definition of ERGs. As with ERGs, the solution of the constrained optimization problem (3.1)-(3.3) belongs to the exponential family, hence the name of the ensemble.

Theorem 1. The maximum-entropy distribution $\mathbb{P}$ defined by (3.1)-(3.3) can be written as follows
$$\mathbb{P}(C) = \frac{e^{-H(C)}}{Z(\theta)}, \qquad H(C) = \sum_{i=1}^{r} \theta_i x_i(C), \qquad Z(\theta) = \sum_{C \in \mathcal{S}} e^{-H(C)}, \tag{3.4}$$
where $H(C)$ is the Hamiltonian of simplicial complex $C \in \mathcal{S}$, $Z(\theta)$ is the normalizing constant, called the partition function, and $\theta = (\theta_1, \dots, \theta_r)$ are the parameters satisfying the following system of $r$ equations
$$-\frac{\partial \ln Z}{\partial \theta_i} = \bar{x}_i. \tag{3.5}$$

The proof is nearly identical to the proof for ERGs [37], but we give it here for completeness.

Proof. We use the standard method of Lagrange multipliers to solve the optimization problem (3.1)-(3.3). Let $\theta_1, \dots, \theta_r$ and $\alpha$ be the Lagrange multipliers for the constraints in (3.2) and (3.3). The Lagrangian is then
$$\mathcal{L} = -\sum_{C \in \mathcal{S}} \mathbb{P}(C) \ln \mathbb{P}(C) + \sum_{i=1}^{r} \theta_i \left( \bar{x}_i - \sum_{C \in \mathcal{S}} x_i(C)\,\mathbb{P}(C) \right) + \alpha \left( 1 - \sum_{C \in \mathcal{S}} \mathbb{P}(C) \right). \tag{3.6}$$
The maximum entropy is achieved if the distribution $\mathbb{P}$ satisfies $\partial \mathcal{L} / \partial \mathbb{P}(C) = 0$ for any $C \in \mathcal{S}$. This gives
$$-\ln \mathbb{P}(C) - 1 - \sum_{i=1}^{r} \theta_i x_i(C) - \alpha = 0, \tag{3.7}$$
or,
$$\mathbb{P}(C) \propto \exp\left( -\sum_{i=1}^{r} \theta_i x_i(C) \right), \tag{3.8}$$
which is equivalent to (3.4), since $\sum_{C \in \mathcal{S}} \mathbb{P}(C) = 1$. It remains to check that (3.5) indeed holds:
$$-\frac{\partial \ln Z}{\partial \theta_i} = -\frac{1}{Z} \frac{\partial}{\partial \theta_i} \sum_{C \in \mathcal{S}} e^{-H(C)} = \frac{1}{Z} \sum_{C \in \mathcal{S}} \frac{\partial H(C)}{\partial \theta_i} e^{-H(C)} = \frac{1}{Z} \sum_{C \in \mathcal{S}} x_i(C)\, e^{-H(C)} = \sum_{C \in \mathcal{S}} x_i(C)\,\mathbb{P}(C) = \bar{x}_i, \tag{3.9}$$
since the expected value of the observable $x_i$ in the ensemble is $\bar{x}_i$.

4. Simple Examples of ERSCs

Here we illustrate ERSCs with three simple examples: Erdős–Rényi random graphs $G(n, p)$, random flag complexes $X(n, p)$, and Linial–Meshulam random complexes $Y(n, p)$.

4.1. Erdős–Rényi Random Graphs. Perhaps the simplest nontrivial example of an ERSC is the Erdős–Rényi random graph ensemble $G(n, p)$, which can be viewed as a generative model for 1-dimensional simplicial complexes. $G(n, p)$ is a maximum-entropy ensemble with only one constraint that the expected number of edges $f_1$ in the ensemble is $\binom{n}{2} p$ [37]:
$$G(n, p) = \mathrm{ERSC}\left( \mathcal{G}_n, f_1, \binom{n}{2} p \right). \tag{4.1}$$

4.2. Random Flag Complexes. The flag complex $X(G)$ of a graph $G \in \mathcal{G}_n$, also called the clique complex or the Vietoris–Rips complex, is a (deterministic) simplicial complex in $\mathcal{C}_n$ whose 1-skeleton is $G$ and whose $k$-simplices correspond to complete subgraphs of $G$, called cliques, of size $k + 1$. Since any simplicial complex is homeomorphic to a flag complex, simplicial complexes arise in different applications and are often used for topological data analysis [50]. Kahle [25, 26] defines the random flag complex $X(n, p)$ as the flag complex of the Erdős–Rényi random graph, $X(n, p) = X(G(n, p))$, and studies phase transitions of its homology groups. Here we show that $X(n, p)$ is, in fact, an ERSC.

Proposition 1. Let $\mathcal{F}_n \subset \mathcal{C}_n$ be the set of all flag complexes on $n$ vertices, then
$$X(n, p) = \mathrm{ERSC}\left( \mathcal{F}_n, f_1, \binom{n}{2} p \right). \tag{4.2}$$

Before giving the proof, we comment on what exactly Proposition 1 states. $X(n, p)$ is a generative model of simplicial complexes: to generate $C \sim X(n, p)$, one first generates $G \sim G(n, p)$, and then sets $C = X(G)$.
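The two-step generative procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not an implementation from the paper: the function names `sample_er_graph` and `flag_complex` are hypothetical, simplices are stored as sorted vertex tuples, and cliques are found by a brute-force pass over vertex subsets that is only suitable for small $n$.

```python
import random
from itertools import combinations

def sample_er_graph(n, p, rng):
    """Sample G ~ G(n, p): keep each of the C(n, 2) edges independently with probability p."""
    return {e for e in combinations(range(1, n + 1), 2) if rng.random() < p}

def flag_complex(n, edges):
    """Return the flag (clique) complex X(G) as a set of sorted vertex tuples."""
    simplices = {(v,) for v in range(1, n + 1)} | set(edges)
    current, size = set(edges), 3
    # A vertex set of size k >= 3 is a clique iff all its (k - 1)-subsets are cliques.
    while current and size <= n:
        current = {
            s
            for s in combinations(range(1, n + 1), size)
            if all(face in current for face in combinations(s, size - 1))
        }
        simplices |= current
        size += 1
    return simplices

# Illustrative (hypothetical) parameters and seed.
rng = random.Random(2015)
graph = sample_er_graph(6, 0.5, rng)
complex_sample = flag_complex(6, graph)
print(len(graph), len(complex_sample))
```

All randomness enters through `sample_er_graph`; `flag_complex` is a deterministic function of the sampled 1-skeleton.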
Let $\mathcal{S}_{X(n,p)} \subset \mathcal{F}_n$ denote the sample space of this random generative process, and $\mathbb{P}_{X(n,p)}$ be the resulting probability distribution on $\mathcal{S}_{X(n,p)}$. The random flag complex $X(n, p)$ can therefore be viewed as ensemble $(\mathcal{S}_{X(n,p)}, \mathbb{P}_{X(n,p)})$. Proposition 1 claims that $(\mathcal{S}_{X(n,p)}, \mathbb{P}_{X(n,p)})$ is a maximum-entropy ensemble with $\mathcal{S}_{X(n,p)} = \mathcal{F}_n$ and a single constraint that the expected number of 1-simplices is $\binom{n}{2} p$. The proof is the same as for (4.1), but we give it here for illustrative purposes.

Proof. First, note that any flag complex $C \in \mathcal{F}_n$ can be generated by $X(n, p)$ with a non-zero probability:
$$\mathbb{P}_{X(n,p)}(C) = \mathbb{P}_{G(n,p)}(C^{(1)}) = p^{f_1(C)} (1 - p)^{\binom{n}{2} - f_1(C)}. \tag{4.3}$$
Therefore, $\mathcal{S}_{X(n,p)}$ is indeed equal to $\mathcal{F}_n$. To prove (4.2), we need to show that $\mathbb{P}_{X(n,p)}$ is in fact the ERSC probability distribution (3.4), (3.5). Since every flag complex $C \in \mathcal{F}_n$ is completely defined by the adjacency matrix of its 1-skeleton, $\mathcal{F}_n \cong \bigotimes_{\delta=1}^{2} a_\delta = 1_n \otimes a_2$. The partition function $Z$ can then be computed as follows:
$$Z(\theta_1) = \sum_{C \in \mathcal{F}_n} e^{-H(C)} = \sum_{C \in \mathcal{F}_n} e^{-\theta_1 f_1(C)} = \sum_{a_2} e^{-\theta_1 \sum_{i_2} a_{i_2}} = \sum_{a_2} \prod_{i_2} e^{-\theta_1 a_{i_2}} = \prod_{i_2} \sum_{a_{i_2}=0}^{1} e^{-\theta_1 a_{i_2}} = \left( 1 + e^{-\theta_1} \right)^{\binom{n}{2}}. \tag{4.4}$$
We can now solve (3.5) with $\bar{x}_1 = \binom{n}{2} p$ for the parameter $\theta_1$,
$$\theta_1 = -\ln \frac{p}{1 - p}, \tag{4.5}$$
and check that indeed
$$\mathbb{P}_{X(n,p)}(C) = \frac{e^{-\theta_1 f_1(C)}}{Z(\theta_1)}. \tag{4.6}$$
This completes the proof.

Given (4.1), the result in (4.2) is intuitively expected since the random part of generating $C \sim X(n, p)$ is sampling the 1-skeleton $C^{(1)} \sim G(n, p)$. The rest of the construction, $C = X(C^{(1)})$, is fully deterministic.

4.3. Linial–Meshulam Random Complexes. Another example of ERSC is a generative model $Y(n, p)$ for random 2-complexes. To generate $Y \sim Y(n, p)$, we start with a complete graph on $n$ vertices, the 1-skeleton of a future simplicial complex, and add each of the $\binom{n}{3}$ possible triangle faces independently at random with probability $p$. Linial and Meshulam introduced this model in [31] and studied its topological properties. In particular, they proved for $Y(n, p)$ a cohomological analog of the celebrated Erdős–Rényi theorem on connectivity of the Erdős–Rényi random graphs [17].

The model $Y(n, p)$ can be readily generalized to higher dimensions: start with a full $\delta$-complex on $n$ vertices, $1 \leq \delta \leq n - 2$, and add each of the $\binom{n}{\delta+2}$ possible $(\delta + 1)$-simplices independently at random with probability $p$. We denote this model by $Y_\delta(n, p)$. The original Linial–Meshulam random complex $Y(n, p)$ is then $Y_1(n, p)$.

Let $\mathcal{C}_n^{(\delta+1)} \subset \mathcal{C}_n$ be a set of all simplicial complexes of dimension $(\delta + 1)$ or less, and $\mathcal{Y}_\delta \subset \mathcal{C}_n^{(\delta+1)}$ be a subset of complexes with full $\delta$-skeleton. In other words,
$$\mathcal{Y}_\delta = \left\{ C \in \mathcal{C}_n^{(\delta+1)} : C^{(k)} = C^{[k]},\ k = 1, \dots, \delta \right\}. \tag{4.7}$$
Since for any $C \in \mathcal{Y}_\delta$, the first $(\delta + 1)$ adjacency tensors $a_1, \dots, a_{\delta+1}$ are unit tensors with zero diagonals, $\mathcal{Y}_\delta \cong a_{\delta+2}$.

Proposition 2. The Linial–Meshulam random complex $Y_\delta(n, p)$ is the ERSC ensemble:
$$Y_\delta(n, p) = \mathrm{ERSC}\left( \mathcal{Y}_\delta, f_{\delta+1}, \binom{n}{\delta+2} p \right). \tag{4.8}$$

Proof. The proof is similar to that for random flag complexes. Given $C \in \mathcal{Y}_\delta$, the probability that the complex has been generated by $Y_\delta(n, p)$ is
$$\mathbb{P}_{Y_\delta(n,p)}(C) = p^{f_{\delta+1}(C)} (1 - p)^{\binom{n}{\delta+2} - f_{\delta+1}(C)}. \tag{4.9}$$

We need to show that this is in fact the maximum-entropy distribution under the constraint $E[f_{\delta+1}] = \binom{n}{\delta+2} p$. The partition function:
$$Z(\theta_1) = \sum_{C \in \mathcal{Y}_\delta} e^{-H(C)} = \sum_{C \in \mathcal{Y}_\delta} e^{-\theta_1 f_{\delta+1}(C)} = \sum_{a_{\delta+2}} e^{-\theta_1 \sum_{i_{\delta+2}} a_{i_{\delta+2}}} = \sum_{a_{\delta+2}} \prod_{i_{\delta+2}} e^{-\theta_1 a_{i_{\delta+2}}} = \prod_{i_{\delta+2}} \sum_{a_{i_{\delta+2}}=0}^{1} e^{-\theta_1 a_{i_{\delta+2}}} = \left( 1 + e^{-\theta_1} \right)^{\binom{n}{\delta+2}}. \tag{4.10}$$
The Lagrange multiplier is then $\theta_1 = -\ln \frac{p}{1 - p}$, and
$$\mathbb{P}_{Y_\delta(n,p)}(C) = \frac{e^{-\theta_1 f_{\delta+1}(C)}}{Z(\theta_1)}, \tag{4.11}$$
as claimed.

This result is also expected, since the Linial–Meshulam random complex is a higher dimensional analog of the Erdős–Rényi random graph: sampling from $Y_\delta(n, p)$ is the same Bernoulli trials process as in $G(n, p)$, with the only difference being that now we are creating $(\delta + 1)$-simplices instead of 1-simplices (edges).

5. Any Distribution is Maximum-Entropy

It is a well-known fact in statistics and information theory (e.g. [11]) that any discrete distribution $\mathbb{P}^*$ is maximum-entropy under properly specified constraints. Specifically, if one can write $-\ln \mathbb{P}^*$ as a linear combination $\sum_i \lambda_i h_i + \xi$ of some functions $\{h_i\}$, then distribution $\mathbb{P}^*$ uniquely maximizes entropy $S(\mathbb{P})$ across all distributions $\mathbb{P}$ that satisfy constraints $E_{\mathbb{P}}[h_i] = E_{\mathbb{P}^*}[h_i]$. In this section, we briefly review this general result, and show how it applies to the already considered models $G(n, p)$, $X(n, p)$, and $Y_\delta(n, p)$, where $-\ln \mathbb{P}^*$ can be written as a linear combination. We will see in the next section that this result simplifies dramatically the proofs for more complicated ERSCs.

Let us consider a discrete probability space $(\Omega, \mathbb{P}^*)$, where $\Omega$ is a finite sample space and $\mathbb{P}^*$ is some fixed probability distribution on $\Omega$. Let us represent the distribution $\mathbb{P}^*$ in the Gibbs form as follows:
$$\mathbb{P}^*(\omega) = e^{-(-\ln \mathbb{P}^*(\omega))} = e^{-H^*(\omega)}, \tag{5.1}$$
where
$$H^*(\omega) = -\ln \mathbb{P}^*(\omega). \tag{5.2}$$
Let $\bar{H}^*$ denote the expectation of the function $H^* : \Omega \to \mathbb{R}$ with respect to $\mathbb{P}^*$, which is exactly the entropy of $\mathbb{P}^*$,
$$\bar{H}^* = E_{\mathbb{P}^*}[H^*] = \sum_{\omega \in \Omega} H^*(\omega)\,\mathbb{P}^*(\omega) = S(\mathbb{P}^*). \tag{5.3}$$

Lemma 1. The probability distribution $\mathbb{P}^*$ is the solution of the following optimization problem:
$$S(\mathbb{P}) = -\sum_{\omega \in \Omega} \mathbb{P}(\omega) \ln \mathbb{P}(\omega) \to \max, \tag{5.4}$$
subject to the constraints $\sum_{\omega \in \Omega} \mathbb{P}(\omega) = 1$ and
$$E_{\mathbb{P}}[H^*] \equiv -\sum_{\omega \in \Omega} \mathbb{P}(\omega) \ln \mathbb{P}^*(\omega) = \bar{H}^*. \tag{5.5}$$

In other words, Lemma 1 states that any discrete probability distribution is a maximum-entropy distribution. The entropy maximization is across all possible distributions $\mathbb{P}$ satisfying the constraint that the expected value of $H^*$ in distribution $\mathbb{P}$ is equal to $H^*$'s expected value in distribution $\mathbb{P}^*$, which is $\mathbb{P}^*$'s entropy. In what follows, we will need a more general version of Lemma 1.
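Lemma 1 can be probed numerically on a small sample space. The sketch below is only an illustration under stated assumptions: the five-point sample space, the seed, the step size `eps`, and the helper names `entropy` and `kl_divergence` are all hypothetical choices, and the check covers a single feasible competitor rather than the full optimization. It perturbs $\mathbb{P}^*$ along a direction that preserves both constraints in (5.5), then compares the entropy drop with the Kullback–Leibler divergence from $\mathbb{P}^*$.

```python
import math
import random

def entropy(dist):
    """Gibbs/Shannon entropy -sum P ln P of a discrete distribution."""
    return -sum(q * math.log(q) for q in dist if q > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(p || q) for strictly positive q."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

# A fixed (hypothetical) distribution P* on a five-point sample space.
rng = random.Random(0)
weights = [rng.random() + 0.5 for _ in range(5)]
p_star = [w / sum(weights) for w in weights]
h_star = [-math.log(q) for q in p_star]   # H*(w) = -ln P*(w)
s_star = entropy(p_star)                  # equals E_{P*}[H*]

# Direction d supported on three outcomes with sum(d) = 0 and sum(d * H*) = 0,
# so P* + eps * d keeps both the normalization and the constraint (5.5).
i, j, k = 0, 1, 2
d = [0.0] * 5
d[i] = h_star[j] - h_star[k]
d[j] = h_star[k] - h_star[i]
d[k] = h_star[i] - h_star[j]

eps = 0.1
p_alt = [q + eps * step for q, step in zip(p_star, d)]

constraint_value = sum(q * h for q, h in zip(p_alt, h_star))
entropy_gap = s_star - entropy(p_alt)
print(min(p_alt) > 0, constraint_value, s_star)
print(entropy_gap, kl_divergence(p_alt, p_star))
```

For any feasible competitor the entropy gap coincides with its KL divergence from $\mathbb{P}^*$, which is non-negative; this is the mechanism behind the statement of the lemma.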

Suppose that function $H^*$ in (5.2) can be written as a linear combination of $r$ other functions $h_i : \Omega \to \mathbb{R}$:
\[
H^*(\omega) = -\ln P^*(\omega) = \sum_{i=1}^{r} \lambda_i h_i(\omega) + \xi, \tag{5.6}
\]
where $\lambda_i, \xi \in \mathbb{R}$ are constants. Let $\bar{h}_i^*$ denote the expectation of $h_i$ with respect to $P^*$,
\[
\bar{h}_i^* = E_{P^*}[h_i] = \sum_{\omega \in \Omega} h_i(\omega) P^*(\omega). \tag{5.7}
\]

Lemma 2. (Th. 11.1.1 [11]) The probability distribution $P^*$ is the solution of the following optimization problem:
\[
S(P) = -\sum_{\omega \in \Omega} P(\omega) \ln P(\omega) \to \max, \tag{5.8}
\]
subject to the constraints
\[
\sum_{\omega \in \Omega} P(\omega) = 1 \quad \text{and} \quad E_P[h_i] = \sum_{\omega \in \Omega} h_i(\omega) P(\omega) = \bar{h}_i^*, \quad i = 1, \ldots, r. \tag{5.9}
\]

We note that the main utility of Lemma 2 is not in observing that $P^*(\omega) \propto e^{-\sum_{i=1}^{r} \lambda_i h_i(\omega)}$ is a maximum-entropy distribution (in fact, Lemma 1 states that any distribution is), but in specifying more general constraints (5.9) under which distribution $P^*$ is maximum-entropy. Lemma 1 is a special case of the more general Lemma 2 with $\xi = 0$, $r = 1$, and $\lambda_1 = 1$. Indeed, in this case $H^* = h_1$, and the constraints in (5.5) and (5.9) become manifestly identical. Lemma 2 is identical to Theorem 11.1.1 in [11], but we provide the proof here for completeness.

Proof. Let $P$ be any distribution that satisfies the constraints in (5.9). Then its entropy
\[
S(P) = -\sum_{\omega \in \Omega} P(\omega) \ln P(\omega) = -\sum_{\omega \in \Omega} P(\omega) \ln \frac{P(\omega) P^*(\omega)}{P^*(\omega)} = -D_{KL}(P \,\|\, P^*) + \sum_{\omega \in \Omega} P(\omega) H^*(\omega), \tag{5.10}
\]
where $D_{KL}(P \,\|\, P^*)$ is the Kullback–Leibler (KL) divergence of $P$ from $P^*$. Since the KL divergence is always non-negative,
\[
S(P) \le \sum_{\omega \in \Omega} P(\omega) H^*(\omega) = \sum_{\omega \in \Omega} P(\omega) \left( \sum_{i=1}^{r} \lambda_i h_i(\omega) + \xi \right) = \sum_{i=1}^{r} \lambda_i E_P[h_i] + \xi = \sum_{i=1}^{r} \lambda_i \bar{h}_i^* + \xi = S(P^*). \tag{5.11}
\]
This shows that $P^*$ indeed maximizes the entropy. The uniqueness follows from the fact that $D_{KL}(P \,\|\, P^*) = 0$ if and only if $P = P^*$.

Lemmas 1&2 can be formulated for any ensemble of discrete objects, including sets of graphs and simplicial complexes. Using the notation introduced in Definition 2 and applying the Lemmas to $\Omega = \mathcal{S} \subseteq \mathcal{C}_n$, we can concisely write Lemma 1 as
\[
(\mathcal{S}, P^*) = \mathrm{ERSC}\left(\mathcal{S}, H^*, \bar{H}^*\right), \tag{5.12}
\]
and Lemma 2 as
\[
(\mathcal{S}, P^*) = \mathrm{ERSC}\left(\mathcal{S}, \{h_i\}, \{\bar{h}_i^*\}\right). \tag{5.13}
\]

In many cases, Lemmas 1&2 are not useful, since for many generative models $(\mathcal{S}, P^*)$ the distribution $P^*$ cannot be explicitly written in the Gibbs form with linear Hamiltonian (5.6). Moreover, in generative

models, the complexity of the generating algorithm often makes it impossible to even explicitly compute the probability $P^*(C)$ for a given $C$. Even in the preferential attachment model [4], where the algorithm which generates a network appears to be fairly simple (a new node connects to existing node $i$ with probability proportional to its degree, $p_i \propto k_i$), the resulting distribution is unknown. However, if we do know $P^*$ as a function of observables, $P^*(C) \propto e^{-\sum_{i=1}^{r} \lambda_i h_i(C)}$, Lemma 2 is very helpful in representing $(\mathcal{S}, P^*)$ as an ERSC. Indeed, let us briefly see how Lemma 2 applies to the already considered generative models.

For $G(n, p)$, the probability of a graph $G \in \mathcal{G}_n$ in the model is
\[
P_{G(n,p)}(G) = p^{f_1(G)} (1-p)^{\binom{n}{2} - f_1(G)}. \tag{5.14}
\]
The corresponding Hamiltonian is then
\[
H_{G(n,p)}(G) = -f_1(G) \ln p - \left( \binom{n}{2} - f_1(G) \right) \ln(1-p) = \underbrace{\ln\frac{1-p}{p}}_{\lambda_1} \underbrace{f_1(G)}_{h_1(G)} \underbrace{{} - \binom{n}{2} \ln(1-p)}_{\xi}, \tag{5.15}
\]
where the bottom notations refer to the notations in Lemma 2. The observation that $G(n, p)$ is an ERG (4.1) then follows from Lemma 2, since $E_{P_{G(n,p)}}[f_1] = \binom{n}{2} p$. Similarly, for $X(n, p)$ and $Y_d(n, p)$,
\[
\begin{aligned}
H_{X(n,p)}(C) &= \ln\frac{1-p}{p}\, f_1(C) - \binom{n}{2} \ln(1-p), & \bar{f}_1 &= E_{X(n,p)}[f_1] = \binom{n}{2} p, \\
H_{Y_d(n,p)}(C) &= \ln\frac{1-p}{p}\, f_{d+1}(C) - \binom{n}{d+2} \ln(1-p), & \bar{f}_{d+1} &= E_{Y_d(n,p)}[f_{d+1}] = \binom{n}{d+2} p,
\end{aligned} \tag{5.16}
\]
and the observations (4.2) and (4.8) that these ensembles are ERSCs are direct corollaries of Lemma 2.

The main point of this section is the following. If the probability distribution is a known exponential function of a linear combination of structural observables, then computing the partition function, which tends to be a nontrivial task in general, is not necessary. Without it, one can still show that the distribution is the unique maximizer of entropy across all distributions in which the expected values of these observables equal their expected values in this distribution.

6. Kahle's $\Delta$-Ensembles

We now turn to a more general model which contains the Erdős–Rényi random graphs, the random flag complexes, and the Linial–Meshulam complexes as special cases. In a recent survey [27], Kahle introduced the following multi-parameter model $\Delta(n; p_1, \ldots, p_{n-1})$ which generates random simplicial complexes inductively by dimension. First, build a 1-skeleton by putting an edge between any two vertices with probability $p_1$. Then, for $d = 2, \ldots, n-1$, add every $d$-simplex with probability $p_d$, but only if the entire $(d-1)$-dimensional boundary of that simplex is already in place. More formally, we have the following definition.

Definition 3. The Kahle model $\Delta(n; p_1, \ldots, p_{n-1})$ is a random simplicial complex model that generates $C \in \mathcal{C}_n$ as follows: for $d = 1, \ldots, n-1$, for every $i_{d+1}$,
\[
\begin{aligned}
&\text{if } b_{i_{d+1}} = 0, \text{ set } a_{i_{d+1}} = 0; \\
&\text{if } b_{i_{d+1}} = 1, \text{ set } a_{i_{d+1}} = \begin{cases} 1 & \text{with probability } p_d, \\ 0 & \text{with probability } 1 - p_d. \end{cases}
\end{aligned} \tag{6.1}
\]

Topological properties of the Kahle model are studied in [8–10]. Here we study its maximum-entropy properties. In Appendix A.1 we prove the following proposition.

Proposition 3. Let $C \sim \Delta(n; p_1, \ldots, p_{n-1})$. The expected numbers of $d$-simplices in $C^{(d)}$ and $C^{[d]}$ are
\[
\bar{f}_d = \binom{n}{d+1} \prod_{k=1}^{d} p_k^{\binom{d+1}{d-k}} \quad \text{and} \quad \bar{\phi}_d = \binom{n}{d+1} \prod_{k=1}^{d-1} p_k^{\binom{d+1}{d-k}}. \tag{6.2}
\]

The Kahle model unifies all the random simplicial complexes we have considered so far:
\[
\begin{aligned}
G(n, p) &= \Delta(n; p, 0, \ldots, 0), \\
X(n, p) &= \Delta(n; p, 1, \ldots, 1), \\
Y(n, p) &= \Delta(n; 1, p, 0, \ldots, 0), \\
Y_d(n, p) &= \Delta(n; \underbrace{1, \ldots, 1}_{d}, p, 0, \ldots, 0).
\end{aligned} \tag{6.3}
\]
Since all these special cases are ERSCs, it is natural to expect that so is $\Delta(n; p_1, \ldots, p_{n-1})$. We cannot prove this using the same method as for the Erdős–Rényi random graphs and the random flag and Linial–Meshulam complexes in Section 4. As with ERGs, analytical computation of the partition function $Z(\theta)$ for ERSCs is rarely possible, and $G(n, p)$, $X(n, p)$, and $Y_d(n, p)$ are lucky exceptions. In Appendix A.3 we illustrate the difficulties one encounters when attempting to compute the partition function for $\Delta(n; p_1, \ldots, p_{n-1})$ with $n = 3$. However, with the help of Lemmas 1&2 in Section 5 there exists a simpler alternative proof. The fact that $\Delta(n; p_1, \ldots, p_{n-1})$ is an ERSC is a direct corollary of those lemmas.

Theorem 2. The Kahle $\Delta$-ensemble is the ERSC ensemble:
\[
\Delta(n; p_1, \ldots, p_{n-1}) = \mathrm{ERSC}\left( \mathcal{C}_n, \left\{ \{f_d\}_{d=1}^{n-1}, \{\phi_d\}_{d=2}^{n-1} \right\}, \left\{ \{\bar{f}_d\}_{d=1}^{n-1}, \{\bar{\phi}_d\}_{d=2}^{n-1} \right\} \right), \tag{6.4}
\]
where $\bar{f}_d$ and $\bar{\phi}_d$ are the expected numbers of $d$-simplices in $C^{(d)}$ and $C^{[d]}$.

Proof. For any $C \in \mathcal{C}_n$, the probability $P_\Delta(C)$ that $\Delta(n; p_1, \ldots, p_{n-1})$ generates $C$ can be computed by induction:
\[
P_\Delta(C) = \prod_{d=1}^{n-1} P\left( C^{(d)} \mid C^{(d-1)} \right) = \prod_{d=1}^{n-1} p_d^{f_d(C)} (1 - p_d)^{\phi_d(C) - f_d(C)}. \tag{6.5}
\]
Indeed, given the $(d-1)$-skeleton $C^{(d-1)}$, the maximum possible number of $d$-simplices in $C^{(d)}$ is exactly $\phi_d(C)$, the number of $d$-simplices in the filled skeleton $C^{[d]}$. Since each of these $d$-simplices appears independently with probability $p_d$, the conditional probability is $P\left( C^{(d)} \mid C^{(d-1)} \right) = p_d^{f_d(C)} (1 - p_d)^{\phi_d(C) - f_d(C)}$, where $f_d(C)$ is the actual number of $d$-simplices in $C^{(d)}$. The Hamiltonian of $C$ is therefore
\[
\begin{aligned}
H_\Delta(C) &= \sum_{d=1}^{n-1} \left( f_d(C) \ln\frac{1 - p_d}{p_d} + \phi_d(C) \ln\frac{1}{1 - p_d} \right) \\
&= \sum_{d=1}^{n-1} f_d(C) \ln\frac{1 - p_d}{p_d} + \sum_{d=2}^{n-1} \phi_d(C) \ln\frac{1}{1 - p_d} + \binom{n}{2} \ln\frac{1}{1 - p_1},
\end{aligned} \tag{6.6}
\]

since $\phi_1(C) = \binom{n}{2}$ for any $C$. Using Lemma 2 with
\[
\begin{aligned}
\{h_i\} &= \{f_1, \ldots, f_{n-1}, \phi_2, \ldots, \phi_{n-1}\}, \\
\{\lambda_i\} &= \left\{ \ln\frac{1 - p_1}{p_1}, \ldots, \ln\frac{1 - p_{n-1}}{p_{n-1}}, \ln\frac{1}{1 - p_2}, \ldots, \ln\frac{1}{1 - p_{n-1}} \right\}, \\
\xi &= \binom{n}{2} \ln\frac{1}{1 - p_1},
\end{aligned} \tag{6.7}
\]
completes the proof.

Figure 4. Sampling from $\boldsymbol{\Delta}$ with $n = 10$ (panels: (a) $n = 10$, (b) $C^{(0)}$, (c) $C^{(1)}$, (d) $C^{(2)}$). Panel (a) shows the starting point: $n = 10$ potential 0-simplices (vertices) represented by empty dots. At the first step, shown in Panel (b), we create the 0-skeleton of a future complex $C$ by including (excluding) every vertex $i$ in (from) $C^{(0)}$ with probability $p_i$ (with probability $1 - p_i$). In this example, five vertices represented by filled dots belong to $C^{(0)}$. Next, in Panel (c), we generate the 1-skeleton by creating 1-simplices (edges). Each of the $\binom{5}{2}$ possible edges $\{i_2\} = \{1, 4\}, \{1, 5\}, \ldots, \{7, 8\}$ appears in $C^{(1)}$ with probability $p_{i_2}$. We represent the accepted (rejected) edges by solid (dashed) lines. Finally, in Panel (d), we create the 2-skeleton by adding 2-simplices (triangles). There are only two possible triangles in $C^{(2)}$: $\{1, 4, 5\}$ and $\{5, 7, 8\}$. Here, the former (empty) was rejected with probability $1 - p_{145}$, and the latter (filled) was accepted with probability $p_{578}$.

7. General Random Simplicial Complexes with Independent Simplices

Finally, we introduce and consider the most general case of random simplicial complexes with statistically independent simplices. In this case each simplex has its own individual probability of appearance. To stay as general as possible, we must allow for even the 0-simplices (vertices) to be present with any probabilities, which are not necessarily equal to 1. We denote this new model by $\boldsymbol{\Delta}(n; \mathbf{p}_1, \ldots, \mathbf{p}_n)$, or $\boldsymbol{\Delta}$ for brevity, where $\mathbf{p}_d = \{p_{i_d}\}$ is a collection of $\binom{n}{d}$ appearance probabilities for each $(d-1)$-simplex.
Whereas in $\Delta$ in the previous section the subindex $d$ in $p_d$ refers to the simplex dimension, in $\boldsymbol{\Delta}$ the sub-multi-index $i_d$ in $p_{i_d}$ refers to the specific $(d-1)$-simplex $\{i_d\}$. To generate $C \sim \boldsymbol{\Delta}$, we first create its 0-skeleton by having vertices $\{i_1\} \in C$ with probabilities $p_{i_1}$, $i_1 = 1, \ldots, n$. Then, for $d = 1, \ldots, n-1$, we add every $d$-simplex $\{i_{d+1}\}$ with probability $p_{i_{d+1}}$, but only if the entire $(d-1)$-dimensional boundary of that simplex is already in place. Figure 4 illustrates the generation of a 2-complex from $\boldsymbol{\Delta}$ with $n = 10$. More formally, we have the following definition.

Let $\overline{\mathcal{C}}_n = \bigcup_{k \le n} \mathcal{C}_k$ denote the set of all simplicial complexes with $n$ vertices or fewer. As in Section 2, any $C \in \overline{\mathcal{C}}_n$ is uniquely determined by a collection of its adjacency tensors $a_d = \{a_{i_d}\}$, $d = 1, \ldots, n$, except that now $a_1$ is not necessarily equal to the all-ones vector $1_n$, since $C$ may have fewer than $n$ vertices.
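The generative procedure described above can be sketched in a few lines of code. This is an illustrative sketch rather than the authors' implementation: the function name `sample_complex` and the callable `prob`, which maps a simplex (a sorted tuple of vertex labels) to its appearance probability, are assumptions introduced here.

```python
import random
from itertools import combinations


def sample_complex(n, prob, rng=None):
    """Sample a simplicial complex on at most n vertices, dimension by dimension.

    prob is a hypothetical callable: prob(simplex) -> appearance probability of
    that simplex, conditional on its whole boundary being present.
    Returns the set of simplices, each stored as a sorted tuple of vertices.
    """
    rng = rng or random.Random()
    # 0-skeleton: every vertex is included independently.
    previous = {(v,) for v in range(1, n + 1) if rng.random() < prob((v,))}
    vertices = sorted(v for (v,) in previous)
    simplices = set(previous)
    for size in range(2, len(vertices) + 1):
        level = set()
        for s in combinations(vertices, size):
            # Boundary check: all faces with one vertex removed must exist.
            if all(face in previous for face in combinations(s, size - 1)):
                if rng.random() < prob(s):
                    level.add(s)
        if not level:
            break  # no simplices at this size, so none can exist above it
        simplices |= level
        previous = level
    return simplices


# Kahle-type special case: all vertices present, edges with probability 0.5,
# triangles with probability 0.3, nothing of higher dimension.
kahle = {1: 1.0, 2: 0.5, 3: 0.3}
complex_ = sample_complex(6, lambda s: kahle.get(len(s), 0.0), random.Random(7))
print(sorted(complex_, key=lambda s: (len(s), s)))
```

Because a simplex is only ever considered once its whole boundary is in place, the returned set is closed under taking faces, so it is a valid simplicial complex by construction.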

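Definition 4. The model $\boldsymbol{\Delta}(n; \mathbf{p}_1, \ldots, \mathbf{p}_n)$ is a random simplicial complex model that generates $C \in \overline{\mathcal{C}}_n$ as follows: for $d = 0, \ldots, n-1$, for every $i_{d+1}$,
\[
\begin{aligned}
&\text{if } b_{i_{d+1}} = 0, \text{ set } a_{i_{d+1}} = 0; \\
&\text{if } b_{i_{d+1}} = 1, \text{ set } a_{i_{d+1}} = \begin{cases} 1 & \text{with probability } p_{i_{d+1}}, \\ 0 & \text{with probability } 1 - p_{i_{d+1}}. \end{cases}
\end{aligned} \tag{7.1}
\]
Here, for convenience, we use the convention that $b_{i_1} = 1$ for all $i_1 = 1, \ldots, n$. If $p_{i_{d+1}} = p_d$ for $d = 0, \ldots, n-1$ and $p_0 = 1$, then we recover the original Kahle model $\Delta(n; p_1, \ldots, p_{n-1})$ from the previous section.

To give the expressions for the expected values of the observables $a_{i_d}$ and $b_{i_d}$ in $\boldsymbol{\Delta}(n; \mathbf{p}_1, \ldots, \mathbf{p}_n)$, we need a bit of new notation. Let multi-index $k_m = k_1, \ldots, k_m$ denote an $m$-tuple with increasing values $1 \le k_1 < \ldots < k_m \le d$, and let $i_d^{k_m} = i_1, \ldots, \hat{i}_{k_1}, \ldots, \hat{i}_{k_m}, \ldots, i_d$ be the $(d-m)$-long multi-index with $i_{k_1}, \ldots, i_{k_m}$ omitted, with the convention that $k_0 = \emptyset$ and $i_d^{k_0} = i_d$. In Appendix A.2 we prove the following proposition.

Proposition 4. Let $C \sim \boldsymbol{\Delta}(n; \mathbf{p}_1, \ldots, \mathbf{p}_n)$. The expected values of the observables $a_{i_d}$ and $b_{i_d}$ are
\[
\bar{a}_{i_d} = \prod_{m=0}^{d-1} \prod_{k_m} p_{i_d^{k_m}} \quad \text{and} \quad \bar{b}_{i_d} = \prod_{m=1}^{d-1} \prod_{k_m} p_{i_d^{k_m}}. \tag{7.2}
\]

Lemma 2 helps again to prove that the general model $\boldsymbol{\Delta}$ is also an ERSC.

Theorem 3. The $\boldsymbol{\Delta}$-ensemble is the ERSC ensemble:
\[
\boldsymbol{\Delta}(n; \mathbf{p}_1, \ldots, \mathbf{p}_n) = \mathrm{ERSC}\left( \overline{\mathcal{C}}_n, \left\{ \{a_{i_d}\}_{d=1}^{n}, \{b_{i_d}\}_{d=2}^{n} \right\}, \left\{ \{\bar{a}_{i_d}\}_{d=1}^{n}, \{\bar{b}_{i_d}\}_{d=2}^{n} \right\} \right), \tag{7.3}
\]
where $\bar{a}_{i_d}$ and $\bar{b}_{i_d}$ are the expected values of the observables $a_{i_d}$ and $b_{i_d}$.

Proof. The probability that $\boldsymbol{\Delta}(n; \mathbf{p}_1, \ldots, \mathbf{p}_n)$ generates $C \in \overline{\mathcal{C}}_n$ is
\[
\begin{aligned}
P_{\boldsymbol{\Delta}}(C) &= P\left( C^{(0)} \right) \prod_{d=1}^{n-1} P\left( C^{(d)} \mid C^{(d-1)} \right) \\
&= \prod_{i_1} p_{i_1}^{a_{i_1}} (1 - p_{i_1})^{1 - a_{i_1}} \prod_{d=1}^{n-1} \prod_{i_{d+1}} p_{i_{d+1}}^{a_{i_{d+1}}} (1 - p_{i_{d+1}})^{b_{i_{d+1}} - a_{i_{d+1}}} \\
&= \prod_{d=1}^{n} \prod_{i_d} p_{i_d}^{a_{i_d}} (1 - p_{i_d})^{b_{i_d} - a_{i_d}}.
\end{aligned} \tag{7.4}
\]
The Hamiltonian of $C$ is then
\[
\begin{aligned}
H_{\boldsymbol{\Delta}}(C) &= \sum_{d=1}^{n} \sum_{i_d} \left( a_{i_d} \ln\frac{1 - p_{i_d}}{p_{i_d}} + b_{i_d} \ln\frac{1}{1 - p_{i_d}} \right) \\
&= \sum_{d=1}^{n} \sum_{i_d} \alpha_{i_d} a_{i_d} + \sum_{d=2}^{n} \sum_{i_d} \beta_{i_d} b_{i_d} + \sum_{i_1} \ln\frac{1}{1 - p_{i_1}},
\end{aligned} \tag{7.5}
\]
where $\alpha_{i_d}$ and $\beta_{i_d}$ are the Lagrange multipliers coupled to observables $a_{i_d}$ and $b_{i_d}$,
\[
\alpha_{i_d} = \ln\frac{1 - p_{i_d}}{p_{i_d}} \quad \text{and} \quad \beta_{i_d} = \ln\frac{1}{1 - p_{i_d}}. \tag{7.6}
\]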

Using Lemma 2 with
\[
\begin{aligned}
\{h_i\} &= \{\{a_{i_1}\}, \ldots, \{a_{i_n}\}, \{b_{i_2}\}, \ldots, \{b_{i_n}\}\}, \\
\{\lambda_i\} &= \{\{\alpha_{i_1}\}, \ldots, \{\alpha_{i_n}\}, \{\beta_{i_2}\}, \ldots, \{\beta_{i_n}\}\}, \\
\xi &= \sum_{i_1} \ln\frac{1}{1 - p_{i_1}},
\end{aligned} \tag{7.7}
\]
completes the proof.

8. Discussion

In summary, exponential random simplicial complexes (ERSCs) are a natural higher dimensional analog of exponential random graphs, which are extensively used for modeling network data and statistical inference. An ERSC ensemble is a maximum-entropy ensemble of simplicial complexes under soft constraints that fix expected values of some observables or properties of simplicial complexes. We have developed the formalism for ERSCs, and introduced the most general generative model $\boldsymbol{\Delta}$ of random simplicial complexes with statistically independent simplices. This model has as special cases several popular models studied in the literature: Erdős–Rényi random graphs, random flag complexes, Linial–Meshulam complexes, and Kahle's $\Delta$-ensembles. Like all these models, $\boldsymbol{\Delta}$ is an ERSC ensemble. The constraints in this ensemble are the expected numbers of simplices and their boundaries.

This result is a direct corollary of the general observation that any probability distribution $P$ is maximum-entropy under the constraint that the expected value of $-\ln P$ is equal to the entropy of $P$. This observation dramatically simplifies the representation of many ensembles of random simplicial complexes as ERSCs since the calculation of the partition function is no longer needed. For example, to show that the Erdős–Rényi random graphs $G(n, p)$ are exponential random graphs with a given expected number of edges, one does not really have to calculate the partition function. This calculation is trivial in the Erdős–Rényi case or in the general case of exponential random graphs with statistically independent edges [37]. However, the analogous calculation for the general case of random simplicial complexes with statistically independent simplices appears to be intractable.
The multi-parameter model $\boldsymbol{\Delta}(n; \mathbf{p}_1, \ldots, \mathbf{p}_n)$ is the ERSC ensemble with two types of constrained observables: $\{a_{i_d}\}$ and $\{b_{i_d}\}$. The observables of the first type are simplices themselves: $a_{i_d}(C) = 1$ if the $(d-1)$-simplex $\{i_d\}$ belongs to $C$, and zero otherwise. The observables of the second type are their boundaries: $b_{i_d}(C) = 1$ if the entire $(d-2)$-dimensional boundary of simplex $\{i_d\}$ belongs to $C$, and zero otherwise. Theorem 3 states that $\boldsymbol{\Delta}(n; \mathbf{p}_1, \ldots, \mathbf{p}_n)$ is a solution of the following optimization problem:
\[
S(P) \to \max, \qquad \sum_{C \in \overline{\mathcal{C}}_n} P(C) = 1, \tag{8.1}
\]
\[
E_P[a_{i_d}] = \bar{a}_{i_d}, \quad d = 1, \ldots, n, \qquad E_P[b_{i_d}] = \bar{b}_{i_d}, \quad d = 2, \ldots, n. \tag{8.2}
\]
If we drop the observables of the second type in this optimization problem, we alter the maximum-entropy distribution as illustrated in Figure 5. Since the distribution has changed, the ensemble $\Lambda(n; \mathbf{p}_1, \ldots, \mathbf{p}_n) = \mathrm{ERSC}\left( \overline{\mathcal{C}}_n, \{a_{i_d}\}_{d=1}^{n}, \{\bar{a}_{i_d}\}_{d=1}^{n} \right)$ defined by this distribution is now also different from $\boldsymbol{\Delta}(n; \mathbf{p}_1, \ldots, \mathbf{p}_n)$.

The fact that the boundary-presence observables of the second type are also constrained in $\boldsymbol{\Delta}(n; \mathbf{p}_1, \ldots, \mathbf{p}_n)$ may appear quite unexpected at first glance. The reason for the presence of these constraints is that simplex existence probabilities are actually conditional, where the conditions are the presence of simplex boundaries. If we go from conditional to unconditional probabilities, we change $\boldsymbol{\Delta}$ to $\Lambda$. Indeed, in $\boldsymbol{\Delta}$, $p_{i_d}$ is the conditional probability of the $(d-1)$-simplex $\{i_d\}$ to appear in $C$, given that its $(d-2)$-dimensional boundary is already in place,
\[
p_{i_d} = P(a_{i_d} = 1 \mid b_{i_d} = 1) = \frac{P(a_{i_d} = 1, b_{i_d} = 1)}{P(b_{i_d} = 1)} = \frac{P(a_{i_d} = 1)}{P(b_{i_d} = 1)}, \tag{8.3}
\]

where the last equation follows from the compatibility condition $a_{i_d} = 1 \Rightarrow b_{i_d} = 1$. This means that the unconditional probability of having $\{i_d\} \in C$ is
\[
P(a_{i_d} = 1) = p_{i_d} P(b_{i_d} = 1), \tag{8.4}
\]
and, therefore, the expected values of observables $a_{i_d}$ and $b_{i_d}$ satisfy
\[
\bar{a}_{i_d} = E[a_{i_d}] = P(a_{i_d} = 1) = p_{i_d} P(b_{i_d} = 1) = p_{i_d} \bar{b}_{i_d}. \tag{8.5}
\]
Thus, if we want to represent $\boldsymbol{\Delta}$ as an ERSC and we fix the expected values of the observables of the first type $a_{i_d}$, we must also fix the expected values of the observables of the second type $b_{i_d}$. Moreover, these expected values are not independent and must satisfy $\bar{a}_{i_d} = p_{i_d} \bar{b}_{i_d}$, which is consistent with Proposition 4. In Appendix A.4 we consider a special case with $n = 3$, $p_{i_1} = 1$, $p_{i_2} = p_1$, and $p_{i_3} = p_2$, and explicitly show that the maximum-entropy distributions with and without the second type constraints are different.

Figure 5. Constrained entropy maximization. The surface represents the Gibbs entropy $S$, which, in this schematic example, is a function on the set of all probability distributions on $\overline{\mathcal{C}}_n$. The global maximum corresponds to the uniform distribution $U$, which is the maximum-entropy distribution among all distributions supported on $\overline{\mathcal{C}}_n$. Theorem 3 shows that if we have two sets of constraints, $E_P[a_{i_d}] = \bar{a}_{i_d}$ and $E_P[b_{i_d}] = \bar{b}_{i_d}$, then the resulting maximum-entropy distribution is $P_{\boldsymbol{\Delta}}$. If we drop the second set of constraints, then we get some other maximum-entropy distribution $P_\Lambda \neq P_{\boldsymbol{\Delta}}$ for ensemble $\Lambda \neq \boldsymbol{\Delta}$.

To conclude, $\boldsymbol{\Delta} \neq \Lambda$. From the maximum-entropy point of view, the ensemble $\Lambda$, with only the observables of the first type constrained, appears more natural than $\boldsymbol{\Delta}$. Yet $\boldsymbol{\Delta}$ is more natural than $\Lambda$ in terms of the simplicity of its constructive Definition 4, which allows for efficient sampling of simplicial complexes. We leave open the questions of whether there exist ways to calculate the probability distribution $P_\Lambda(C)$ in ensemble $\Lambda$, and to efficiently sample from it, i.e., to easily generate simplicial complexes $C$ with this probability.
Acknowledgments

We thank Alexandru Suciu, Gabor Lippner, and Christopher King for very useful discussions, comments, and suggestions. This work was supported by DARPA grant No. HR0011-12-1-0012; NSF grants No. CNS-1344289, CNS-1442999, CNS-0964236, CNS-1441828, CNS-1039646, and CNS-1345286; and by Cisco Systems.

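Appendix

A.1. Proof of Proposition 3.

Let us first compute the expected number of $d$-simplices in $C^{[d]}$, where $C \sim \Delta(n; p_1, \ldots, p_{n-1})$. Writing $i_{d+1}^{\hat{k}}$ for the multi-index $i_{d+1}$ with $i_k$ omitted,
\[
\bar{\phi}_d = E[\phi_d] = E\left[ \sum_{i_{d+1}} b_{i_{d+1}} \right] = E\left[ \sum_{i_{d+1}} \prod_{k=1}^{d+1} a_{i_{d+1}^{\hat{k}}} \right] = E\left[ E\left[ \sum_{i_{d+1}} \prod_{k=1}^{d+1} a_{i_{d+1}^{\hat{k}}} \,\Big|\, a_{d-1} \right] \right] = E\left[ \sum_{i_{d+1}} \prod_{k=1}^{d+1} E\left[ a_{i_{d+1}^{\hat{k}}} \,\Big|\, a_{d-1} \right] \right]. \tag{A.6}
\]
If the boundary of the $(d-1)$-simplex $\{i_{d+1}^{\hat{k}}\}$ belongs to $C$, i.e. $b_{i_{d+1}^{\hat{k}}} = 1$, then $\{i_{d+1}^{\hat{k}}\} \in C$, i.e. $a_{i_{d+1}^{\hat{k}}} = 1$, with probability $p_{d-1}$. Otherwise, if $b_{i_{d+1}^{\hat{k}}} = 0$, then automatically $a_{i_{d+1}^{\hat{k}}} = 0$. Therefore, the inner expected value is
\[
E\left[ a_{i_{d+1}^{\hat{k}}} \,\Big|\, a_{d-1} \right] = p_{d-1}\, b_{i_{d+1}^{\hat{k}}}. \tag{A.7}
\]
So,
\[
\bar{\phi}_d = E\left[ \sum_{i_{d+1}} \prod_{k=1}^{d+1} p_{d-1}\, b_{i_{d+1}^{\hat{k}}} \right] = p_{d-1}^{d+1}\, E\left[ \sum_{i_{d+1}} \prod_{k_2} a_{i_{d+1}^{k_2}} \right], \tag{A.8}
\]
where $k_2 = k_1, k_2$ is a pair of indices $1 \le k_1 < k_2 \le d+1$, and $i_{d+1}^{k_2} = i_1, \ldots, \hat{i}_{k_1}, \ldots, \hat{i}_{k_2}, \ldots, i_{d+1}$ is the $(d-1)$-long multi-index with $i_{k_1}$ and $i_{k_2}$ omitted. Proceeding in this manner, we have:
\[
\begin{aligned}
\bar{\phi}_d &= p_{d-1}^{\binom{d+1}{1}} E\left[ \sum_{i_{d+1}} \prod_{k_2} E\left[ a_{i_{d+1}^{k_2}} \,\Big|\, a_{d-2} \right] \right] = p_{d-1}^{\binom{d+1}{1}} E\left[ \sum_{i_{d+1}} \prod_{k_2} p_{d-2}\, b_{i_{d+1}^{k_2}} \right] \\
&= p_{d-1}^{\binom{d+1}{1}} p_{d-2}^{\binom{d+1}{2}} E\left[ \sum_{i_{d+1}} \prod_{k_3} a_{i_{d+1}^{k_3}} \right] = \ldots \\
&= p_{d-1}^{\binom{d+1}{1}} \cdots p_1^{\binom{d+1}{d-1}} E\left[ \sum_{i_{d+1}} \prod_{k_d} a_{i_{d+1}^{k_d}} \right] = \binom{n}{d+1} \prod_{k=1}^{d-1} p_k^{\binom{d+1}{d-k}}.
\end{aligned} \tag{A.9}
\]
The last equation holds because $a_{i_{d+1}^{k_d}} = 1$ for any $i_{d+1}$ and $k_d$, since all simplicial complexes $C \sim \Delta(n; p_1, \ldots, p_{n-1})$ have exactly $n$ vertices. The expected number of $d$-simplices in $C^{(d)}$ is now:
\[
\bar{f}_d = E[f_d] = E[E[f_d \mid \phi_d]] = E[p_d \phi_d] = p_d \bar{\phi}_d = \binom{n}{d+1} \prod_{k=1}^{d} p_k^{\binom{d+1}{d-k}}. \tag{A.10}
\]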
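As a sanity check of the closed forms in Proposition 3, one can enumerate the Kahle model exhaustively for a very small $n$ and compare the exact expectations with the product formulas. The sketch below is an illustration only: the function names and the parameter values used in the example are assumptions, and the enumeration is feasible only for tiny $n$.

```python
from itertools import combinations
from math import comb, prod


def exact_expected_counts(n, p):
    """Exact E[f_d] and E[phi_d], d = 1..n-1, by enumerating the Kahle model.

    p maps each dimension d = 1..n-1 to the probability p_d.
    """
    f_bar = {d: 0.0 for d in range(1, n)}
    phi_bar = {d: 0.0 for d in range(1, n)}

    def recurse(d, faces, weight, counts):
        if d == n:
            # A complete complex has been built; accumulate its contribution.
            for dim, (f, phi) in counts.items():
                f_bar[dim] += weight * f
                phi_bar[dim] += weight * phi
            return
        # Candidate d-simplices: every (d-1)-dimensional face is present.
        candidates = [s for s in combinations(range(n), d + 1)
                      if all(face in faces for face in combinations(s, d))]
        m = len(candidates)
        for r in range(m + 1):
            w = weight * p[d] ** r * (1.0 - p[d]) ** (m - r)
            if w == 0.0:
                continue
            for chosen in combinations(candidates, r):
                recurse(d + 1, set(chosen), w, {**counts, d: (r, m)})

    # All n vertices are present in the Kahle model.
    recurse(1, {(v,) for v in range(n)}, 1.0, {})
    return f_bar, phi_bar


def formula_counts(n, p):
    """Closed forms of Proposition 3 for the same expectations."""
    f_bar, phi_bar = {}, {}
    for d in range(1, n):
        phi_bar[d] = comb(n, d + 1) * prod(
            p[k] ** comb(d + 1, d - k) for k in range(1, d))
        f_bar[d] = p[d] * phi_bar[d]
    return f_bar, phi_bar


probabilities = {1: 0.6, 2: 0.5, 3: 0.3}   # illustrative p_1, p_2, p_3
print(exact_expected_counts(4, probabilities))
print(formula_counts(4, probabilities))
```

The enumeration weights each complex by the product of Bernoulli factors in (6.5), so agreement between the two outputs also serves as an indirect check of that probability formula.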

A.2. Proof of Proposition 4.

Computations are similar to those in the previous section.
\[
\begin{aligned}
\bar{b}_{i_d} &= E[b_{i_d}] = E\left[ \prod_{k_1} a_{i_d^{k_1}} \right] = E\left[ E\left[ \prod_{k_1} a_{i_d^{k_1}} \,\Big|\, a_{d-2} \right] \right] = E\left[ \prod_{k_1} E\left[ a_{i_d^{k_1}} \,\Big|\, a_{d-2} \right] \right] \\
&= E\left[ \prod_{k_1} p_{i_d^{k_1}} b_{i_d^{k_1}} \right] = \prod_{k_1} p_{i_d^{k_1}}\, E\left[ \prod_{k_1} b_{i_d^{k_1}} \right] = \prod_{k_1} p_{i_d^{k_1}}\, E\left[ \prod_{k_2} a_{i_d^{k_2}} \right] = \ldots \\
&= \prod_{k_1} p_{i_d^{k_1}} \prod_{k_2} p_{i_d^{k_2}} \cdots \prod_{k_{d-1}} p_{i_d^{k_{d-1}}}\, E\left[ \prod_{k_{d-1}} b_{i_d^{k_{d-1}}} \right] = \prod_{m=1}^{d-1} \prod_{k_m} p_{i_d^{k_m}},
\end{aligned} \tag{A.11}
\]
since $b_{i_d^{k_{d-1}}} = 1$ for any $i_d$ and $k_{d-1}$. Finally,
\[
\bar{a}_{i_d} = E[a_{i_d}] = E[E[a_{i_d} \mid b_{i_d}]] = E[p_{i_d} b_{i_d}] = p_{i_d} \bar{b}_{i_d} = \prod_{m=0}^{d-1} \prod_{k_m} p_{i_d^{k_m}}. \tag{A.12}
\]

A.3. Special case: $\Delta(3; p_1, p_2)$.

Theorem 2 in Section 6 explicitly represents Kahle's multi-parameter model of random simplicial complexes $\Delta(n; p_1, \ldots, p_{n-1})$ as an ERSC for any values of the parameters. This theorem is a direct corollary of Lemmas 1&2 in Section 5, which assert that any distribution is, in fact, the maximum-entropy distribution under certain constraints. Here we illustrate the difficulties that arise when one tries to compute the maximum-entropy distribution $P_\Delta$ using Theorem 1. We successfully used this method, which is based on computing the partition function, in Section 4 for the Erdős–Rényi random graphs and the random flag and Linial–Meshulam complexes. For Kahle's $\Delta$-ensemble, however, the partition function becomes intractable.

Consider a special case of Kahle's model with $n = 3$. According to Theorem 2 and Proposition 3, $\Delta(3; p_1, p_2)$ is the maximum-entropy ensemble of simplicial complexes on 3 vertices with three constraints:
\[
E[f_1] = 3p_1, \qquad E[f_2] = p_1^3 p_2, \qquad E[\phi_2] = p_1^3. \tag{A.13}
\]
Let us compute the corresponding maximum-entropy distribution $P_{\Delta(3; p_1, p_2)}$ using Theorem 1. The partition function $Z$ in (3.4) is
\[
\begin{aligned}
Z(\theta_1, \theta_2, \theta_3) &= \sum_{C \in \mathcal{C}_3} e^{-H(C)} = \sum_{C \in \mathcal{C}_3} e^{-\theta_1 f_1(C) - \theta_2 f_2(C) - \theta_3 \phi_2(C)} \\
&= 1 + 3e^{-\theta_1} + 3e^{-2\theta_1} + e^{-3\theta_1 - \theta_3} + e^{-3\theta_1 - \theta_2 - \theta_3},
\end{aligned} \tag{A.14}
\]
where the last equality follows from Figure 6, where we list all complexes in $\mathcal{C}_3$ along with the corresponding values of observables $f_1$, $f_2$, and $\phi_2$.
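To find parameters $\theta_1$, $\theta_2$, and $\theta_3$, which are the Lagrange multipliers coupled to observables $f_1$, $f_2$, and $\phi_2$, we need to solve the system of three equations (3.5), where $\bar{x}_i$ are replaced by the expected values in (A.13):
\[
\begin{aligned}
\frac{3e^{-\theta_1} + 6e^{-2\theta_1} + 3e^{-3\theta_1} e^{-\theta_3} + 3e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}}{1 + 3e^{-\theta_1} + 3e^{-2\theta_1} + e^{-3\theta_1} e^{-\theta_3} + e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}} &= 3p_1, \\
\frac{e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}}{1 + 3e^{-\theta_1} + 3e^{-2\theta_1} + e^{-3\theta_1} e^{-\theta_3} + e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}} &= p_1^3 p_2, \\
\frac{e^{-3\theta_1} e^{-\theta_3} + e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}}{1 + 3e^{-\theta_1} + 3e^{-2\theta_1} + e^{-3\theta_1} e^{-\theta_3} + e^{-3\theta_1} e^{-\theta_2} e^{-\theta_3}} &= p_1^3.
\end{aligned} \tag{A.15}
\]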
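The system (A.15) can be checked numerically by enumerating the nine complexes of Figure 6 and computing the Gibbs expectations directly. The sketch below is illustrative: the function names and the test values of $p_1$ and $p_2$ are assumptions, and the multipliers are taken from the closed-form solution (A.16) stated in this appendix rather than obtained by a numerical solver.

```python
import math
from itertools import combinations


def complexes_on_three_vertices():
    """List (f1, f2, phi2) for every simplicial complex on 3 labeled vertices."""
    edges = [(1, 2), (1, 3), (2, 3)]
    rows = []
    for r in range(len(edges) + 1):
        for chosen in combinations(edges, r):
            f1 = len(chosen)
            phi2 = 1 if f1 == 3 else 0      # triangle boundary is complete
            rows.append((f1, 0, phi2))      # triangle left empty
            if phi2:
                rows.append((f1, 1, phi2))  # triangle filled
    return rows


def gibbs_expectations(theta1, theta2, theta3):
    """Return Z and (E[f1], E[f2], E[phi2]) under P(C) = exp(-H(C)) / Z."""
    rows = complexes_on_three_vertices()
    weights = [math.exp(-theta1 * f1 - theta2 * f2 - theta3 * phi2)
               for f1, f2, phi2 in rows]
    z = sum(weights)
    means = tuple(sum(w * row[j] for w, row in zip(weights, rows)) / z
                  for j in range(3))
    return z, means


def thetas_from_probabilities(p1, p2):
    """Lagrange multipliers corresponding to the closed form (A.16)."""
    return (-math.log(p1 / (1 - p1)),
            -math.log(p2 / (1 - p2)),
            -math.log(1 - p2))


p1, p2 = 0.3, 0.6                      # illustrative parameter values
z, (mean_f1, mean_f2, mean_phi2) = gibbs_expectations(
    *thetas_from_probabilities(p1, p2))
print("Z =", z, "vs", 1 / (1 - p1) ** 3)
print("E[f1] =", mean_f1, "vs", 3 * p1)
print("E[f2] =", mean_f2, "vs", p1 ** 3 * p2)
print("E[phi2] =", mean_phi2, "vs", p1 ** 3)
```

Each printed pair compares a quantity computed by brute-force enumeration with the corresponding right-hand side of (A.13) or (A.17).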

Figure 6. Simplicial complexes on three vertices. Here we show all $C \in \mathcal{C}_3$ and the values of $f_1$ (number of 1-simplices), $f_2$ (number of 2-simplices), and $\phi_2$ (number of 2-simplices in $C^{[2]}$) for each $C$: one complex with $(f_1, f_2, \phi_2) = (0, 0, 0)$, three with $(1, 0, 0)$, three with $(2, 0, 0)$, one with $(3, 0, 1)$, and one with $(3, 1, 1)$.

After some tedious algebra, one can show that the solution is
\[
e^{-\theta_1} = \frac{p_1}{1 - p_1}, \qquad e^{-\theta_2} = \frac{p_2}{1 - p_2}, \qquad e^{-\theta_3} = 1 - p_2. \tag{A.16}
\]
The partition function then simplifies to
\[
Z = \frac{1}{(1 - p_1)^3}. \tag{A.17}
\]
Therefore, the maximum-entropy distribution is
\[
\begin{aligned}
P_{\Delta(3; p_1, p_2)}(C) &= \frac{e^{-H(C)}}{Z} = \frac{e^{-\theta_1 f_1(C) - \theta_2 f_2(C) - \theta_3 \phi_2(C)}}{Z} \\
&= (1 - p_1)^3 \left( \frac{p_1}{1 - p_1} \right)^{f_1(C)} \left( \frac{p_2}{1 - p_2} \right)^{f_2(C)} (1 - p_2)^{\phi_2(C)} \\
&= p_1^{f_1(C)} (1 - p_1)^{3 - f_1(C)}\, p_2^{f_2(C)} (1 - p_2)^{\phi_2(C) - f_2(C)}.
\end{aligned} \tag{A.18}
\]
As expected, the obtained distribution coincides with the distribution in (6.5), where $n = 3$ and $\phi_1(C) = 3$. Unfortunately, this method of computing $P_\Delta$ cannot be extended to the general case $\Delta(n; p_1, \ldots, p_{n-1})$: when $n > 3$ the partition function $Z$ and the corresponding analog of system (A.15) become analytically intractable. This makes Lemmas 1&2 an essential tool for proving Theorem 2 and the more general Theorem 3.

A.4. $\mathrm{ERSC}(\mathcal{C}_3, \{f_1, f_2\}, \{\bar{f}_1, \bar{f}_2\})$.

Here we derive the maximum-entropy distribution on $\mathcal{C}_3$ only under the constraints of the first type, $E[f_1] = \bar{f}_1$ and $E[f_2] = \bar{f}_2$, and show that it is different from $P_{\Delta(3; p_1, p_2)}$. This explicitly demonstrates that the constraint of the second type, $E[\phi_2] = \bar{\phi}_2$, is not redundant, and, if dropped, the resulting maximum-entropy ensemble will no longer be $\Delta$.

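Let $(\mathcal{C}_3, P)$ be the maximum-entropy ensemble $\mathrm{ERSC}(\mathcal{C}_3, \{f_1, f_2\}, \{\bar{f}_1, \bar{f}_2\})$. In other words, $P$ is the maximum-entropy distribution on $\mathcal{C}_3$ under the constraints
\[
E[f_1] = \bar{f}_1 \quad \text{and} \quad E[f_2] = \bar{f}_2. \tag{A.19}
\]
We can find $P$ using Theorem 1 as in the previous section. The partition function is
\[
Z(\theta_1, \theta_2) = \sum_{C \in \mathcal{C}_3} e^{-H(C)} = \sum_{C \in \mathcal{C}_3} e^{-\theta_1 f_1(C) - \theta_2 f_2(C)} = \left( 1 + e^{-\theta_1} \right)^3 + e^{-3\theta_1} e^{-\theta_2}, \tag{A.20}
\]
where the last equality is obtained with the help of Figure 6. The system of equations (3.5) for $\theta_1$ and $\theta_2$ is then
\[
3\, \frac{e^{-\theta_1} \left( 1 + e^{-\theta_1} \right)^2 + e^{-3\theta_1} e^{-\theta_2}}{\left( 1 + e^{-\theta_1} \right)^3 + e^{-3\theta_1} e^{-\theta_2}} = \bar{f}_1, \qquad \frac{e^{-3\theta_1} e^{-\theta_2}}{\left( 1 + e^{-\theta_1} \right)^3 + e^{-3\theta_1} e^{-\theta_2}} = \bar{f}_2, \tag{A.21}
\]
and one can check that the solution is given by
\[
e^{-\theta_1} = \frac{\frac{\bar{f}_1}{3} - \bar{f}_2}{1 - \frac{\bar{f}_1}{3}} \quad \text{and} \quad e^{-\theta_2} = \frac{\bar{f}_2 \left( 1 - \bar{f}_2 \right)^2}{\left( \frac{\bar{f}_1}{3} - \bar{f}_2 \right)^3}. \tag{A.22}
\]
The partition function, as a function of $\bar{f}_1$ and $\bar{f}_2$, is then
\[
Z = \frac{\left( 1 - \bar{f}_2 \right)^2}{\left( 1 - \frac{\bar{f}_1}{3} \right)^3}. \tag{A.23}
\]
Therefore, the maximum-entropy distribution is
\[
\begin{aligned}
P(C) &= \frac{e^{-H(C)}}{Z} = \frac{e^{-\theta_1 f_1(C) - \theta_2 f_2(C)}}{Z} \\
&= \frac{\left( 1 - \frac{\bar{f}_1}{3} \right)^3}{\left( 1 - \bar{f}_2 \right)^2} \left( \frac{\frac{\bar{f}_1}{3} - \bar{f}_2}{1 - \frac{\bar{f}_1}{3}} \right)^{f_1(C)} \left( \frac{\bar{f}_2 \left( 1 - \bar{f}_2 \right)^2}{\left( \frac{\bar{f}_1}{3} - \bar{f}_2 \right)^3} \right)^{f_2(C)} \\
&= \left( \frac{\bar{f}_1}{3} - \bar{f}_2 \right)^{f_1(C) - 3f_2(C)} \left( 1 - \frac{\bar{f}_1}{3} \right)^{3 - f_1(C)} \bar{f}_2^{\,f_2(C)} \left( 1 - \bar{f}_2 \right)^{2f_2(C) - 2}.
\end{aligned} \tag{A.24}
\]
This is a general expression for $P$ for any expected values $\bar{f}_1$ and $\bar{f}_2$. In the special case when $\bar{f}_1$ and $\bar{f}_2$ coincide with the corresponding values for $\Delta(3; p_1, p_2)$ in (A.13), that is $\bar{f}_1 = 3p_1$ and $\bar{f}_2 = p_1^3 p_2$, the distribution $P$ reduces to
\[
P(C) = p_1^{f_1(C)} (1 - p_1)^{3 - f_1(C)}\, p_2^{f_2(C)} \left( 1 - p_1^2 p_2 \right)^{f_1(C) - 3f_2(C)} \left( 1 - p_1^3 p_2 \right)^{2f_2(C) - 2}. \tag{A.25}
\]
We see that $P \neq P_{\Delta(3; p_1, p_2)}$. This means that the two maximum-entropy ensembles $\Delta(3; p_1, p_2)$ and $(\mathcal{C}_3, P)$ are different,
\[
\mathrm{ERSC}(\mathcal{C}_3, \{f_1, f_2, \phi_2\}, \{\bar{f}_1, \bar{f}_2, \bar{\phi}_2\}) \neq \mathrm{ERSC}(\mathcal{C}_3, \{f_1, f_2\}, \{\bar{f}_1, \bar{f}_2\}), \tag{A.26}
\]
and, more generally,
\[
\mathrm{ERSC}\left( \mathcal{C}_n, \left\{ \{f_d\}_{d=1}^{n-1}, \{\phi_d\}_{d=2}^{n-1} \right\}, \left\{ \{\bar{f}_d\}_{d=1}^{n-1}, \{\bar{\phi}_d\}_{d=2}^{n-1} \right\} \right) \neq \mathrm{ERSC}\left( \mathcal{C}_n, \{f_d\}_{d=1}^{n-1}, \{\bar{f}_d\}_{d=1}^{n-1} \right). \tag{A.27}
\]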
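The difference between the two ensembles can be made concrete by evaluating (A.18) and (A.25) on the nine complexes of Figure 6. The following sketch is illustrative, with function names and the values $p_1 = p_2 = 0.5$ chosen here as assumptions. It checks that both distributions are normalized and share the same $E[f_1]$ and $E[f_2]$, while their values of $E[\phi_2]$ differ.

```python
def complexes_on_three_vertices():
    """(f1, f2, phi2, multiplicity) for the nine complexes of Figure 6."""
    return [(0, 0, 0, 1), (1, 0, 0, 3), (2, 0, 0, 3), (3, 0, 1, 1), (3, 1, 1, 1)]


def p_delta(f1, f2, phi2, p1, p2):
    """Distribution (A.18): all three observables are constrained."""
    return p1**f1 * (1 - p1)**(3 - f1) * p2**f2 * (1 - p2)**(phi2 - f2)


def p_lambda(f1, f2, phi2, p1, p2):
    """Distribution (A.25): only E[f1] and E[f2] are constrained.

    phi2 is accepted for a uniform signature but does not enter the formula.
    """
    return (p1**f1 * (1 - p1)**(3 - f1) * p2**f2
            * (1 - p1**2 * p2)**(f1 - 3 * f2)
            * (1 - p1**3 * p2)**(2 * f2 - 2))


def moments(prob, p1, p2):
    """Total mass and (E[f1], E[f2], E[phi2]) of a per-complex probability."""
    mass = 0.0
    means = [0.0, 0.0, 0.0]
    for f1, f2, phi2, multiplicity in complexes_on_three_vertices():
        w = multiplicity * prob(f1, f2, phi2, p1, p2)
        mass += w
        for j, value in enumerate((f1, f2, phi2)):
            means[j] += w * value
    return mass, tuple(means)


print("Delta :", moments(p_delta, 0.5, 0.5))
print("Lambda:", moments(p_lambda, 0.5, 0.5))
```

In this example the first two expectations agree between the two lines of output, and only the third, the expected number of complete triangle boundaries, separates the ensembles.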