The degree of a typical vertex in generalized random intersection graph models

Discrete Mathematics 306 (2006) 2152–2165
www.elsevier.com/locate/disc

Jerzy Jaworski (a), Michał Karoński (a), Dudley Stark (b)

(a) Department of Discrete Mathematics, Adam Mickiewicz University, ul. Umultowska 87, 61-614 Poznan, Poland
(b) School of Mathematical Sciences, Queen Mary, University of London, London E1 4NS, UK

Received 17 May 2004; received in revised form 6 January 2006; accepted 19 May 2006. Available online 11 July 2006.

Abstract

In this paper we consider the degree of a typical vertex in two models of random intersection graphs introduced in [E. Godehardt, J. Jaworski, Two models of random intersection graphs for classification, in: M. Schwaiger, O. Opitz (Eds.), Exploratory Data Analysis in Empirical Research, Proceedings of the 25th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Munich, March 14–16, 2001, Springer, Berlin, Heidelberg, New York, 2002, pp. 67–81], the active and passive models. The active models are those for which vertices are assigned a random subset of a list of objects and two vertices are made adjacent when their subsets intersect. We prove sufficient conditions for vertex degree to be asymptotically Poisson as well as closely related necessary conditions. We also consider the passive model of intersection graphs, in which objects are vertices and two objects are made adjacent if there is at least one vertex in the corresponding active model containing both objects. We prove a necessary condition for vertex degree to be asymptotically Poisson for passive intersection graphs.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Random graphs; Random intersection graphs

1. Introduction

Random intersection graphs are random graph models with dependent edges, as opposed to the model with independent edges introduced by Erdős and Rényi. The first model of random intersection graphs was introduced in [3,4] and generalized random intersection graph models were introduced in [2].
In this paper we look at the degree of a typical vertex of the graphs described in [2]. Our results generalize some of the results in [5] on the binomial model [3,4].

In defining the model of [2], it is helpful to first define a related model of bipartite graphs, which we denote by $BG_{n,m}(n, P_{(m)})$, where $n$ and $m$ are positive integers and $P_{(m)} = (P_0, P_1, \dots, P_m)$ with $\sum_{k=0}^{m} P_k = 1$ is a probability distribution on the set of integers $[m] := \{0, 1, 2, \dots, m\}$. The independent sets of $BG_{n,m}(n, P_{(m)})$ are denoted by $V$ and $W$ and the cardinalities of $V$ and $W$ are given by $|V| = n$ and $|W| = m$, respectively.

E-mail address: D.S.Stark@maths.qmul.ac.uk (D. Stark).
0012-365X/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.disc.2006.05.013

In the model $BG_{n,m}(n, P_{(m)})$ a random graph is chosen according to the following procedure:

(1) The vertex degrees of the elements in the independent set $V$ are determined by i.i.d. random variables $X(v)$, $v \in V$, with common probability distribution $P_{(m)} = (P_0, P_1, \dots, P_m)$.

(2) Given the values of $X(v)$, the set of neighbors $\Gamma(v) \subseteq W$ of $v$ is determined by choosing a subset $\Gamma(v)$ in such a way that

$$P(\Gamma(v) = S \mid X(v) = k) = \frac{1}{\binom{m}{k}}$$

for all $S \subseteq [m]$ of cardinality $k = X(v)$. The subsets are chosen independently of one another.

Thus, given $S \subseteq [m]$,

$$P(\Gamma(v) = S) = \frac{P_{|S|}}{\binom{m}{|S|}}$$

and the $\Gamma(v)$ are mutually independent random elements. This procedure determines $BG_{n,m}(n, P_{(m)})$ uniquely and, in particular, the neighbors of vertices in $W$ are given by $\Gamma(w) = \{v \in V : w \in \Gamma(v)\}$. By construction, we have $X(v) = |\Gamma(v)|$. Moreover, we let $Y(w)$ denote $Y(w) = |\Gamma(w)|$.

Two models of random intersection graphs, called active and passive, are defined in [2]. The active graphs have vertex set $V$, and $v_1, v_2 \in V$ are adjacent if and only if $\Gamma(v_1) \cap \Gamma(v_2) \neq \emptyset$. The active random intersection graph model is denoted by $IG^{a}_{n}(P_{(m)})$. The passive graphs have vertex set $W$, and $w_1, w_2 \in W$ are adjacent if and only if $\Gamma(w_1) \cap \Gamma(w_2) \neq \emptyset$. The passive random intersection graph model is denoted by $IG^{p}_{n}(P_{(m)})$.

The parameter $m$ will be set to $m = \lfloor n^{\alpha} \rfloor$ for some real number $\alpha > 0$. We will characteristically ignore the integer part notation and write $m = n^{\alpha}$. Note that the probabilities $P_{(m)}$ implicitly depend on $n$.

Active graphs may be more interesting with respect to applications than the passive graphs $IG^{p}_{n}(P_{(m)})$, as [2] shows that they may be interpreted as models of classification. The intersection graph model proposed in [3,4] can be treated as the active model with $P_{(m)}$ taken as $\mathrm{Binomial}(m, p)$. We will denote this model by $G(n, m, p)$ and call it the binomial model. On the other hand, it was shown in [2] that for any $n$ and $m$, and any distribution $P_{(m)}$, the random variable $Y(w)$, $w \in W$ (the degree of a vertex $w$ in $BG_{n,m}(n, P_{(m)})$) is $\mathrm{Binomial}(n, \mu/m)$ distributed for any $w \in W$, where $\mu = EX(v)$, $v \in V$.
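The two-step sampling procedure and the active and passive adjacency rules can be sketched in code. This is a minimal illustration under stated assumptions, not part of the original paper: the function names `sample_bipartite`, `active_degrees` and `passive_degrees` are hypothetical, the objects of $W$ are labelled $0, \dots, m-1$, and `probs[k]` stands for $P_k$.

```python
import random
from itertools import combinations


def sample_bipartite(n, m, probs, rng):
    """Sample Gamma(v) for each of the n vertices of V.

    probs[k] is assumed to hold P_k for k = 0..m (hypothetical layout).
    Step (1): draw the set sizes X(v) i.i.d. from P_(m).
    Step (2): given X(v) = k, pick a uniformly random k-subset of W.
    """
    sizes = rng.choices(range(m + 1), weights=probs, k=n)
    return [frozenset(rng.sample(range(m), k)) for k in sizes]


def active_degrees(gammas):
    """Degrees in the active graph: v1 ~ v2 iff Gamma(v1) and Gamma(v2) intersect."""
    deg = [0] * len(gammas)
    for u, v in combinations(range(len(gammas)), 2):
        if gammas[u] & gammas[v]:
            deg[u] += 1
            deg[v] += 1
    return deg


def passive_degrees(gammas, m):
    """Degrees in the passive graph: w1 ~ w2 iff some Gamma(v) contains both."""
    neighbours = [set() for _ in range(m)]
    for gamma in gammas:
        for w in gamma:
            neighbours[w].update(gamma)
    # Each object was added to its own neighbour set, so remove it again.
    return [len(s - {w}) for w, s in enumerate(neighbours)]
```

For example, `active_degrees(sample_bipartite(200, 400, probs, random.Random(1)))` returns one sampled degree sequence of the active graph; the quadratic pair loop is chosen for clarity and is only suitable for small $n$.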
Moreover, $Y(w_1), Y(w_2), \dots, Y(w_m)$ are mutually independent if and only if $BG_{n,m}(n, P_{(m)}) = BG_{n,m}(n, \mathrm{Binomial}(m, p)) = G(n, m, p)$. Therefore, when $P_{(k)}$ is taken as $\mathrm{Binomial}(k, p)$, $IG^{a}_{n}(P_{(m)})$ is the same as $IG^{p}_{m}(P_{(n)})$ and both models are equivalent to $G(n, m, p)$.

Properties such as the emergence of small subgraphs and contiguity with the classical random graph model $G(n, p)$ of Erdős and Rényi, for which there are $n$ vertices and edges are present independently and with probability $p$, were studied for $G(n, m, p)$ in [1,3], while the degree of a typical vertex was examined and limiting distributions were derived in [5].

2. The active graphs

In this section we are interested in the degree of a typical vertex of a random intersection graph $IG^{a}_{n}(P_{(m)})$. We let $m$ and $P_{(m)}$ depend on $n$ and examine convergence in distribution of vertex degree as $n \to \infty$. For positive integers $a$, $b$ we let $(a)_b$ denote the falling factorial $(a)_b := a(a-1)\cdots(a-b+1)$ and let $(a)_0 = 1$, $(0)_0 = 1$, $(0)_b = 0$.

A simple example of active random intersection graphs which may be analyzed without difficulty is the degenerate case where $P_d = 1$ for some $d = d(m)$. This example was considered in [2]. The degree of a fixed vertex in the active model is then $\mathrm{Binomial}\big(n-1,\ 1 - (m-d)_d/(m)_d\big)$ distributed. Hence, if for some constant $c$

$$M = (n-1)\left(1 - \frac{(m-d)_d}{(m)_d}\right) \to c \tag{1}$$

as $n \to \infty$, then the distribution of the degree of a typical vertex will be asymptotically $\mathrm{Poisson}(c)$. Given sequences $a_n$, $b_n$, we use the notation $a_n \sim b_n$ to denote $a_n/b_n \to 1$ as $n \to \infty$. If $d \ge \kappa\sqrt{m}$ infinitely often for some constant $\kappa > 0$, then $(m-d)_d/(m)_d \le e^{-\kappa^2}$ and (1) is impossible. Therefore, we may assume that $d = o(\sqrt{m})$, in which case (1) holds if $nd^2/m \to c$ or $d \sim \sqrt{c}\, m^{(1-1/\alpha)/2}$. If $\alpha = 1$, then with $d = \sqrt{c}$ the limiting distribution is $\mathrm{Poisson}(c)$. This contrasts with $G(n, m, p)$, which has a compound Poisson limiting distribution when $\alpha = 1$ and $p = c/n$; see [5].

The question thus arises as to what distributions $P_{(m)}$ give a limiting Poisson distribution for vertex degree as $n \to \infty$. A partial answer to this question is given by Theorem 1.

Theorem 1. Let $X$ be a random variable with the probability distribution $P_{(m)}$ and denote by $\mu$ and $\sigma$ its expected value and standard deviation, respectively. Denote by $\xi$ the degree of a vertex in an active intersection random graph $IG^{a}_{n}(P_{(m)})$, where $m = n^{\alpha}$, $\alpha > 0$. Then

(i) if $\alpha < 1$ and $E\xi \to c$, where $c$ is a nonnegative constant, then almost all vertices of $IG^{a}_{n}(P_{(m)})$ are isolated;

(ii) if $\alpha \ge 1$, $E\xi \to c$, and $\sigma = o(\mu)$, then $\xi$ converges weakly to the $\mathrm{Poisson}(c)$ distribution;

(iii) if $\alpha \ge 1$, $E\xi \to c$, $\mu = o(\sqrt{m})$, $\sigma \sim d\mu$ for a constant $d > 0$, and $\mu_3 = EX^3 = O(\mu^3)$, then $\xi$ does not have a limiting Poisson distribution.

Proof. The proof of the theorem is given in a sequence of lemmas. Part (i) is proved in Lemma 2. The proofs of parts (ii) and (iii) are given in Lemmas 5 and 6, respectively, using Lemmas 1, 3, and 4.

Consider $IG^{a}_{n}(P_{(m)})$ and fix a vertex $v \in V$. We define $a_i$ to be

$$a_i = P\big(v \text{ has degree } i \text{ in } IG^{a}_{n}(P_{(m)})\big), \qquad i = 0, 1, \dots, n-1.$$

Our first lemma gives an exact formula for the probability generating function $F(x) = \sum_{i=0}^{n-1} a_i x^i$ of vertex degree in $IG^{a}_{n}(P_{(m)})$.

Lemma 1. For the model $IG^{a}_{n}(P_{(m)})$, where $P_{(m)} = (P_0, P_1, \dots, P_m)$, the probability generating function $F(x)$ of vertex degree is given by

$$F(x) = \sum_{k=0}^{m} P_k \left[ \sum_{j=0}^{m} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} + \left( 1 - \sum_{j=0}^{m} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \right) x \right]^{n-1}. \tag{2}$$

Moreover, the expected vertex degree is

$$M := \sum_{j=0}^{n-1} j a_j = F'(1) = (n-1)\left( 1 - \sum_{k=0}^{m} \sum_{j=0}^{m} P_k P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \right) \tag{3}$$

and the variance of the vertex degree is

$$S^2 := \sum_{j=0}^{n-1} j(j-1) a_j + M - M^2 = F''(1) + M - M^2 = (n-1)(n-2) \sum_{k=0}^{m} P_k \left( 1 - \sum_{j=0}^{m} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \right)^{2} + M - M^2. \tag{4}$$
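Formulas (3) and (4) are finite sums and can be evaluated directly. The following is a small sketch, assuming `probs[k]` holds $P_k$ for $k = 0, \dots, m$; the names `ratio` and `degree_moments` are hypothetical and the plain floating-point arithmetic is only meant for moderate $m$.

```python
from math import comb


def ratio(m, k, j):
    """C(m-k, j) / C(m, j): chance that a uniform j-subset misses a fixed k-subset."""
    return comb(m - k, j) / comb(m, j)


def degree_moments(n, m, probs):
    """Return (M, S^2) for the active model, following formulas (3) and (4)."""
    # q[k] = probability that a second vertex is adjacent to a vertex with k objects.
    q = [
        1.0 - sum(probs[j] * ratio(m, k, j) for j in range(m + 1))
        for k in range(m + 1)
    ]
    mean = (n - 1) * sum(p * qk for p, qk in zip(probs, q))
    second_factorial = (n - 1) * (n - 2) * sum(p * qk * qk for p, qk in zip(probs, q))
    variance = second_factorial + mean - mean ** 2
    return mean, variance
```

In the degenerate case $P_d = 1$ the output can be compared with the mean and variance of the $\mathrm{Binomial}\big(n-1,\ 1 - (m-d)_d/(m)_d\big)$ distribution described earlier in this section, which gives a quick consistency check of (3) and (4).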

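Proof. The degree distribution of an arbitrary vertex $v$ in the active model is given by [2]

$$P\{v \text{ has degree } i \text{ in } IG^{a}_{n}(P_{(m)})\} = \sum_{k=0}^{m} P_k \binom{n-1}{i} \left( 1 - \frac{E_k(m-X)}{(m)_k} \right)^{i} \left( \frac{E_k(m-X)}{(m)_k} \right)^{n-1-i}, \qquad i = 0, 1, \dots, n-1,$$

where we define $0^0 = 1$ and where $E_k$ denotes the $k$-factorial moment operator. The probability generating function $F(x)$ of $\xi$ is given by

$$F(x) = \sum_{k=0}^{m} P_k \left[ \left( 1 - \frac{E_k(m-X)}{(m)_k} \right) x + \frac{E_k(m-X)}{(m)_k} \right]^{n-1},$$

and

$$M = E\xi = F'(1) = (n-1) \sum_{k=0}^{m} P_k \left( 1 - \frac{E_k(m-X)}{(m)_k} \right),$$

$$E_2(\xi) = E\xi(\xi-1) = F''(1) = (n-1)(n-2) \sum_{k=0}^{m} P_k \left( 1 - \frac{E_k(m-X)}{(m)_k} \right)^{2}.$$

Since

$$\frac{\binom{m-k}{j}}{\binom{m}{j}} = \frac{(m-j)_k}{(m)_k},$$

the formulae for $F(x)$, $M$, and $S^2$ given in the lemma follow immediately.

We use a variant of the degenerate case to help show that most vertices in $IG^{a}_{n}(P_{(m)})$ are isolated almost surely if $\alpha < 1$ and $M$ is bounded above.

Lemma 2. Suppose that $\alpha < 1$ and $M$ is bounded above. Then $P_0 \to 1$ as $n \to \infty$ and the number of isolated vertices $I$ has expectation $EI \ge nP_0 \sim n$.

Proof. Let $\rho = 1 - P_0$. The random variable $\xi$ is stochastically larger than the vertex degree of a fixed vertex of the degenerate active graph with $P_{(m)}$ given by $P_0 = 1 - \rho$, $P_1 = \rho$. Fix $v \in V$. The number $Z$ of non-isolated vertices in $V \setminus \{v\}$ in $BG_{n,m}(n, P_{(m)})$ for the degenerate graph is $\mathrm{Binomial}(n-1, \rho)$. The degenerate graph has vertex degree distribution $\tilde{\xi}$ given by

$$P(\tilde{\xi} = k) = \begin{cases} 1 - \rho + \rho P(W = 0) & \text{if } k = 0, \\ \rho P(W = k) & \text{if } k > 0, \end{cases}$$

where, conditional on the value of $Z$, $W$ is $\mathrm{Binomial}(Z, 1/m)$ distributed. Thus, we have the lower bound

$$M = E\xi \ge E\tilde{\xi} = \rho EW = \rho^2 (n-1)/m.$$

Since $n/m \to \infty$ and $M$ is bounded, $\rho \to 0$.

Our next result shows that $\mu = o(\sqrt{m})$ is necessary for $M \to c$ when $\sigma = o(\mu)$.

Lemma 3. Suppose that $\sigma = o(\mu)$ and $M \to c$ for a constant $c$. Then $\mu = o(\sqrt{m})$.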

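Proof. If it is not true that $\mu = o(\sqrt{m})$, then for some constant $\kappa > 0$ it holds that $\mu \ge \kappa\sqrt{m}$ for infinitely many $m$. For such $m$, we lower bound the formula for $M$ at (3) by

$$M = (n-1) \sum_{k} \sum_{j} P_k P_j \left( 1 - \frac{\binom{m-k}{j}}{\binom{m}{j}} \right) \ge (n-1) \sum_{k, j \ge \kappa\sqrt{m}/2} P_k P_j \left( 1 - e^{-kj/m} \right) \ge (n-1)\left( 1 - e^{-\kappa^2/4} \right) \left( 1 - P(X < \kappa\sqrt{m}/2) \right)^{2},$$

where $X$ is a random variable with distribution given by $P_{(m)}$. Chebyshev's inequality gives

$$P(X < \kappa\sqrt{m}/2) \le \frac{\sigma^2}{(\mu - \kappa\sqrt{m}/2)^2} = o(1),$$

hence $M \ge (n-1)\big(1 - e^{-\kappa^2/4} + o(1)\big)$ for $m$ satisfying $\mu \ge \kappa\sqrt{m}$. This contradicts $M \to c$.

Next we will show that under the assumptions $\sigma = O(\mu)$ and $\mu = o(\sqrt{m})$, there is a natural asymptotic relationship between $m$, $n$, $\mu$, and $c$. The previous lemma shows that when $\sigma = o(\mu)$ the assumption $\mu = o(\sqrt{m})$ is unnecessary.

Suppose that $v_1, v_2 \in V$. Conditional on the event that $v_1$ has degree $k$ and $v_2$ has degree $j$ in $BG_{n,m}(n, P_{(m)})$, the number of vertices in $W$ to which $v_1$, $v_2$ are mutually adjacent is hypergeometrically distributed with probability distribution function

$$\frac{\binom{k}{x}\binom{m-k}{j-x}}{\binom{m}{j}},$$

where $x$ represents the number of mutually adjacent vertices, and mean $kj/m$. The expected number of mutually adjacent vertices is

$$\sum_{k} \sum_{j} P_k P_j \frac{kj}{m} = \frac{\mu^2}{m}.$$

It is reasonable to suppose that if $\mu^2/m$ is small, then it will be close to the probability that $v_1$, $v_2$ are mutually adjacent to a positive number of vertices in $W$. Hence, $\mu^2/m$ should approximately be the edge density in $IG^{a}_{n}(P_{(m)})$. It is therefore reasonable to suppose that $\mu^2/m$ should be close to $c/n$. We will use the asymptotic result of Lemma 4 frequently in succeeding proofs, often in the form $\mu^2 = O(m/n)$.

Lemma 4. Suppose that $\alpha > 0$, $\mu = o(\sqrt{m})$, and $\sigma = O(\mu)$. Under these assumptions $M \to c$ if and only if

$$\frac{\mu^2}{m} \sim \frac{c}{n}.$$

Proof. First suppose that $M \to c$. We will find upper and lower bounds on $\mu^2/m$ of the form $c/n + o(n^{-1})$. Let $\mu_2$ be the second moment

$$\mu_2 := \sum_{j} j^2 P_j.$$

Because

$$\frac{\binom{m-k}{j}}{\binom{m}{j}} \le e^{-jk/m} \le 1 - \frac{jk}{m} + \frac{j^2k^2}{2m^2}, \tag{5}$$

(3) implies

$$1 - \frac{M}{n-1} = \sum_{k}\sum_{j} P_k P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \le \sum_{k}\sum_{j} P_k P_j \left( 1 - \frac{jk}{m} + \frac{j^2k^2}{2m^2} \right) = 1 - \frac{\mu^2}{m} + \frac{\mu_2^2}{2m^2},$$

from which

$$\frac{\mu^2}{m} \le \frac{M}{n-1} + \frac{\mu_2^2}{2m^2}.$$

Since $\sigma = O(\mu)$ we have $\mu_2 = O(\mu^2)$, and now $\mu = o(\sqrt{m})$ and $M = c + o(1)$ give

$$\frac{\mu^2}{m} \le \frac{c}{n} + o(n^{-1}). \tag{6}$$

For the lower bound on $\mu^2/m$ we use the following inequality, which holds as long as $j + k < m$:

$$\frac{\binom{m-k}{j}}{\binom{m}{j}} \ge \left( 1 - \frac{k}{m-j} \right)^{j} \ge \exp\left( -\frac{jk}{(m-j)\big(1 - k/(m-j)\big)} \right) = \exp\left( -\frac{jk}{m-j-k} \right) \ge 1 - \frac{jk}{m-j-k} = 1 - \frac{jk/m}{1 - (j+k)/m}. \tag{7}$$

Fix any $\varepsilon$ satisfying $0 < \varepsilon < \min(\alpha, 1)/2$. We find that

$$1 - \frac{M}{n-1} \ge \sum_{j, k \le mn^{-\varepsilon}} P_j P_k \frac{\binom{m-k}{j}}{\binom{m}{j}} \ge \sum_{j, k \le mn^{-\varepsilon}} P_j P_k \left[ 1 - \frac{jk/m}{1 - (j+k)/m} \right] \ge P(X \le mn^{-\varepsilon})^2 - \frac{\mu^2/m}{1 - 2n^{-\varepsilon}}.$$

The upper bound (6) implies $\mu^2 = O(m/n)$, which, together with $\varepsilon < 1/2$, produces the estimate $\mu = o(mn^{-\varepsilon})$. As a result, Chebyshev's inequality, (6), and the assumptions $\sigma = O(\mu)$ and $\varepsilon < \alpha/2$ imply

$$P(X > mn^{-\varepsilon}) = O\left( \frac{\sigma^2}{(mn^{-\varepsilon})^2} \right) = O\left( \frac{\mu^2}{(mn^{-\varepsilon})^2} \right) = O\left( \frac{n^{2\varepsilon - 1}}{m} \right) = o(n^{-1}). \tag{8}$$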

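Now, using $M = c + o(1)$, we infer that

$$\frac{\mu^2}{m} \ge \frac{c}{n} + o(n^{-1}). \tag{9}$$

Combining (6) and (9) results in $\mu^2/m \sim c/n$.

Supposing now that $\mu^2/m \sim c/n$, we will prove that $M = c + o(1)$. In light of (5) and the assumptions of the lemma,

$$\sum_{k}\sum_{j} P_k P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \le 1 - \frac{\mu^2}{m} + \frac{\mu_2^2}{2m^2} = 1 - \frac{c}{n} + o(n^{-1}),$$

which, when substituted in (3), implies

$$\liminf_{n \to \infty} M \ge c. \tag{10}$$

Similarly, using (7) and again choosing $\varepsilon$ satisfying $0 < \varepsilon < \min(\alpha, 1)/2$, we have

$$\sum_{k}\sum_{j} P_k P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \ge \sum_{k, j \le mn^{-\varepsilon}} P_k P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \ge P(X \le mn^{-\varepsilon})^2 - \frac{\mu^2}{m}\big(1 + O(n^{-\varepsilon})\big) = P(X \le mn^{-\varepsilon})^2 - \frac{c}{n} + o(n^{-1}).$$

Using (8) and (3), we conclude as a result that

$$\limsup_{n \to \infty} M \le c. \tag{11}$$

The asymptotics (10) and (11) imply $M = c + o(1)$.

There are choices of $P_{(m)}$ for which $M \to c$ but $\mu^2/m \sim c/n$ fails to hold. Consider the active graphs with $P_0 = 1 - \rho$, $P_m = \rho$, and $\rho^2 = c/(n-1)$, and let $W$ be a $\mathrm{Binomial}(n-1, \rho)$ distributed random variable. The degree $\xi$ of a typical vertex has distribution

$$P(\xi = k) = \begin{cases} 1 - \rho + \rho P(W = 0) & \text{if } k = 0, \\ \rho P(W = k) & \text{if } k > 0. \end{cases}$$

Despite the fact that $M = \rho^2 (n-1) = c$, we have $\mu = \rho m$, so $\mu^2/m \not\sim c/n$.

In the next lemma we give sufficient conditions for $\xi$ to have a limiting Poisson distribution.

Lemma 5. Suppose that $\alpha > 0$, $\sigma = o(\mu)$, and $M \to c$. Under these assumptions the distribution of $\xi$ converges weakly to the $\mathrm{Poisson}(c)$ distribution.

Proof. We will demonstrate that

$$\lim_{n \to \infty} F(x) = e^{-c + cx} \tag{12}$$

when $x \in [0, 1]$, where $F(x)$ was given by (2). It then follows that the Laplace transform of $\xi$ converges to the Laplace transform of a Poisson distributed variable and that the distribution of $\xi$ is asymptotically Poisson.

First note that the expression in square brackets in (2) is bounded by 1 whenever $x \in [0, 1]$. If we define $\omega$ to be

$$\omega := \min\left( \sqrt{\mu/\sigma},\ \log n \right), \tag{13}$$

then it follows that we have

$$F(x) = \sum_{|k - \mu| \le \omega\sigma} P_k \left[ \sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} + \left( 1 - \sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \right) x \right]^{n-1} + \Delta, \tag{14}$$

where

$$\Delta := \sum_{|k - \mu| > \omega\sigma} P_k \left[ \sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} + \left( 1 - \sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \right) x \right]^{n-1}$$

and Chebyshev's inequality implies

$$\Delta \le P(|X - \mu| > \omega\sigma) = O(\omega^{-2}) = o(1). \tag{15}$$

The last equality is a consequence of our assumption $\sigma = o(\mu)$ and definition (13). We will show that for all $k$ in $\{k \in [0, m] : |k - \mu| \le \omega\sigma\}$

$$\sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} = 1 - \frac{c}{n} + o(n^{-1}), \tag{16}$$

with the $o(n^{-1})$ term bounded uniformly. The asymptotic (12) will then follow from (14)–(16).

The analysis showing (16) is similar to the argument proving Lemma 4. Using (5), we have

$$\sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \le \sum_{j} P_j \left[ 1 - \frac{jk}{m} + \frac{j^2k^2}{2m^2} \right] = 1 - \frac{\mu k}{m} + \frac{\mu_2 k^2}{2m^2}.$$

For all $k$ such that $|k - \mu| \le \omega\sigma$, we have

$$\sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \le 1 - \frac{\mu(\mu - \omega\sigma)}{m} + \frac{\mu_2(\mu + \omega\sigma)^2}{2m^2} \le 1 - \frac{\mu^2}{m} + \frac{\mu\omega\sigma}{m} + \frac{\mu_2(\mu + \sigma\log n)^2}{2m^2}. \tag{17}$$

By definition (13), $\omega \le \sqrt{\mu/\sigma}$, from which

$$\frac{\mu\omega\sigma}{m} = o\left( \frac{\mu^2}{m} \right). \tag{18}$$

The last term in (17) is bounded using $\sigma = o(\mu)$ and $\mu_2 = O(\mu^2)$, giving

$$\frac{\mu_2(\mu + \sigma\log n)^2}{2m^2} = O\left( \frac{\mu^4 \log^2 n}{m^2} \right).$$

Lemmas 3 and 4 imply $\mu^2/m \sim c/n$, hence

$$\sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \le 1 - \frac{c}{n} + o(n^{-1}).$$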

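In a similar way, choosing $\varepsilon$ in the range $0 < \varepsilon < \min(\alpha, 1)/2$, for all $k$ such that $|k - \mu| \le \omega\sigma$, observing that $k = O(\mu)$, Lemma 4 and the choice of $\varepsilon$ imply $k/m = O\big((nm)^{-1/2}\big) = o(n^{-\varepsilon})$, and for large enough $n$ (7) gives us

$$\sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \ge \sum_{j \le mn^{-\varepsilon}} P_j \left[ 1 - \frac{jk/m}{1 - 2n^{-\varepsilon}} \right] \ge P(X \le mn^{-\varepsilon}) - \frac{\mu k/m}{1 - 2n^{-\varepsilon}} \ge 1 - P(X > mn^{-\varepsilon}) - \frac{\mu(\mu + \omega\sigma)/m}{1 - 2n^{-\varepsilon}}.$$

Our assumptions imply (8) and Lemma 4, which, together with (18), give us

$$\sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \ge 1 - \frac{\mu^2}{m} + o(n^{-1}) = 1 - \frac{c}{n} + o(n^{-1}).$$

Although $\alpha < 1$ is not officially excluded in Lemma 5, when $\alpha < 1$ Lemma 5 cannot apply because of Lemma 2. For an illustrative example, fix $\alpha < 1$ and consider $P_0 = 1 - \rho$, $P_1 = \rho$ with $\rho = \mu \sim \sqrt{cm/n} = o(1)$, in which case $\sigma^2 = \rho - \rho^2 \gg \mu^2$.

The next lemma shows that when $\sigma$ is of the same order as $\mu$ and assumptions are made on the third moment of $P_{(m)}$ we never get Poisson convergence. The mean $M$ and variance $S^2$ of the vertex degree were given in Lemma 1. The third moment of $P_{(m)}$ is defined to be

$$\mu_3 := \sum_{k} k^3 P_k.$$

Lemma 6. Suppose that $\alpha \ge 1$, $M \to c$, $\mu = o(\sqrt{m})$, and $\sigma \sim d\mu$ for a constant $d > 0$, and that $\mu_3 = O(\mu^3)$. Then

$$S^2 \to c + c^2 d^2$$

when $\alpha > 1$ and

$$c + c^2 d^2 \le \liminf_{n \to \infty} S^2 \le \limsup_{n \to \infty} S^2 = O(1)$$

when $\alpha = 1$. For $\alpha \ge 1$, it follows that $\xi$ does not have a limiting Poisson distribution.

Proof. The formula for $F''(1)$ obtained from differentiating (2) twice is

$$F''(1) = (n-1)(n-2) \sum_{k} P_k \left( 1 - \sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \right)^{2}.$$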

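If we prove for all $\alpha \ge 1$ that

$$\liminf_{n \to \infty} F''(1) \ge (d^2 + 1)c^2, \tag{19}$$

then it will follow that

$$\liminf_{n \to \infty} S^2 = \liminf_{n \to \infty} \big( F''(1) + M - M^2 \big) = \liminf_{n \to \infty} F''(1) + c - c^2 \ge c + c^2 d^2,$$

resulting in the lower bounds in the statement of the lemma.

In the proof of Lemma 4 it was convenient to use $mn^{-\varepsilon}$ as a dividing point to use different analyses. In this proof it seems better to use $m/(\mu\log n)$ for the estimates in (25) below. We begin with

$$F''(1) \ge (n-1)(n-2) \sum_{0 \le k \le m/(\mu\log n)} P_k \left( 1 - \sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \right)^{2}.$$

Note that

$$\sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \le 1.$$

For all $k$ in the range $0 \le k \le m/(\mu\log n)$ we may use the bound $\mu_2 = O(\mu^2)$ to conclude that there is a constant $C > 0$ such that for all $n$

$$\frac{k\mu_2}{2m\mu} \le \frac{C}{\log n}. \tag{20}$$

Thus, we have

$$\sum_{j} P_j \left[ 1 - \frac{jk}{m} + \frac{j^2k^2}{2m^2} \right] = 1 - \frac{\mu k}{m} + \frac{\mu_2 k^2}{2m^2} = 1 - \frac{\mu k}{m}\left( 1 - \frac{\mu_2 k}{2m\mu} \right) \le 1 - \frac{\mu k}{m}\left( 1 - \frac{C}{\log n} \right). \tag{21}$$

We conclude from (5) and (21) that for $n$ large enough, uniformly for all $0 \le k \le m/(\mu\log n)$ it holds that

$$\sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \le \sum_{j} P_j \left[ 1 - \frac{jk}{m} + \frac{j^2k^2}{2m^2} \right] \le 1, \tag{22}$$

and so for large enough $n$

$$\begin{aligned}
F''(1) &\ge (n-1)(n-2) \sum_{0 \le k \le m/(\mu\log n)} P_k \left( 1 - \sum_{j} P_j \left[ 1 - \frac{jk}{m} + \frac{j^2k^2}{2m^2} \right] \right)^{2} \\
&= (n-1)(n-2) \sum_{0 \le k \le m/(\mu\log n)} P_k \left( \frac{\mu k}{m} - \frac{\mu_2 k^2}{2m^2} \right)^{2} \\
&= (n-1)(n-2) \frac{\mu^2}{m^2} \sum_{0 \le k \le m/(\mu\log n)} P_k k^2 \left( 1 - \frac{k\mu_2}{2m\mu} \right)^{2} \\
&\ge \left( 1 - \frac{C}{\log n} \right)^{2} (n-1)(n-2) \frac{\mu^2}{m^2} \sum_{0 \le k \le m/(\mu\log n)} P_k k^2 \\
&= \left( 1 - \frac{C}{\log n} \right)^{2} (n-1)(n-2) \frac{\mu^2}{m^2} \left( \mu_2 - \sum_{m/(\mu\log n) < k} P_k k^2 \right),
\end{aligned} \tag{23}$$

where we have used (20) at the second inequality. Observe that Lemma 4 implies

$$(n-1)(n-2) \frac{\mu^2 \mu_2}{m^2} \sim \frac{n^2 \mu^2 (\sigma^2 + \mu^2)}{m^2} \sim (d^2 + 1) \frac{n^2 \mu^4}{m^2} \to (d^2 + 1)c^2. \tag{24}$$

Moreover, Markov's inequality applied to the probability distribution given by $k^2 P_k/\mu_2$, which has expectation $\mu_3/\mu_2 = O(\mu^3/\mu^2) = O(\mu)$, gives us

$$\sum_{m/(\mu\log n) < k} k^2 P_k \le \mu_2 \frac{\mu_3/\mu_2}{m/(\mu\log n)} = O\left( \frac{\mu_2 \mu^2 \log n}{m} \right) = O\left( \frac{\mu_2 \log n}{n} \right) = o(\mu_2). \tag{25}$$

Now, (19) results from (23)–(25).

We now complete the proof of the lemma when $\alpha > 1$ by deriving the upper bound $\limsup_{n \to \infty} F''(1) \le (d^2 + 1)c^2$ to match (19). Fix $\varepsilon$ such that

$$0 < \varepsilon < \min(\alpha - 1, 1)/2. \tag{26}$$

For all $0 \le k \le mn^{-\varepsilon}$, from (7) we have

$$\sum_{0 \le j \le mn^{-\varepsilon}} P_j \left[ 1 - \frac{jk/m}{1 - (j+k)/m} \right] \le \sum_{j} P_j \frac{\binom{m-k}{j}}{\binom{m}{j}} \le 1.$$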

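It follows that

$$\begin{aligned}
F''(1) &\le n^2 \sum_{0 \le k \le mn^{-\varepsilon}} P_k \left( 1 - \sum_{0 \le j \le mn^{-\varepsilon}} P_j \left[ 1 - \frac{jk/m}{1 - (j+k)/m} \right] \right)^{2} + n^2 \sum_{mn^{-\varepsilon} < k} P_k \\
&= n^2 \sum_{0 \le k \le mn^{-\varepsilon}} P_k \left( \sum_{0 \le j \le mn^{-\varepsilon}} P_j \frac{jk/m}{1 - (j+k)/m} + \sum_{mn^{-\varepsilon} < j} P_j \right)^{2} + n^2 \sum_{mn^{-\varepsilon} < k} P_k \\
&\le n^2 \sum_{0 \le k \le mn^{-\varepsilon}} P_k \left( \frac{\mu k/m}{1 - 2n^{-\varepsilon}} + \sum_{mn^{-\varepsilon} < j} P_j \right)^{2} + n^2 \sum_{mn^{-\varepsilon} < k} P_k.
\end{aligned} \tag{27}$$

Define $\tau_n$ to be

$$\tau_n := \sum_{mn^{-\varepsilon} < k} P_k.$$

Because $\mu = o(mn^{-\varepsilon})$, by Chebyshev's inequality and Lemma 4 we have

$$\tau_n = O\left( \frac{\sigma^2}{(mn^{-\varepsilon})^2} \right) = O\left( \frac{\mu^2}{m^2 n^{-2\varepsilon}} \right) = O\left( \frac{n^{2\varepsilon - 1}}{m} \right)$$

and therefore $n^2 \tau_n = o(1)$ by the choice of $\varepsilon$ at (26). We now have

$$\begin{aligned}
F''(1) &\le n^2 \sum_{0 \le k \le mn^{-\varepsilon}} P_k \left[ \frac{\mu^2 k^2/m^2}{(1 - 2n^{-\varepsilon})^2} + \frac{2\tau_n \mu k/m}{1 - 2n^{-\varepsilon}} + \tau_n^2 \right] + n^2 \tau_n \\
&\le \frac{n^2 \mu^2 \mu_2/m^2}{(1 - 2n^{-\varepsilon})^2} + \frac{2n^2 \tau_n \mu^2/m}{1 - 2n^{-\varepsilon}} + n^2 \tau_n^2 + n^2 \tau_n \\
&= \frac{n^2 \mu^2 \mu_2}{m^2} + o(1) = (d^2 + 1)c^2 + o(1),
\end{aligned} \tag{28}$$

where the last equality in (28) was shown at (24).

When $\alpha = 1$ we use an argument similar to the one for $\alpha > 1$, but replacing the function $mn^{-\varepsilon}$ by $\beta n$, where $\beta < 1/2$ is a positive constant. Note that $m = n$. We also have $\mu = O(1)$ because of Lemma 4, hence

$$\sum_{\beta n < k \le n} P_k = O(n^{-2})$$

by Chebyshev's inequality. As a result, (27) becomes

$$\begin{aligned}
F''(1) &\le n^2 \sum_{0 \le k \le \beta n} P_k \left( \sum_{0 \le j \le \beta n} P_j \frac{jk/n}{1 - (j+k)/n} + \sum_{\beta n < j \le n} P_j \right)^{2} + n^2 \sum_{\beta n < k \le n} P_k \\
&\le O(1)\, n^2 \sum_{0 \le k \le \beta n} P_k \left( \sum_{0 \le j \le \beta n} P_j \frac{jk}{n} + O\left( \frac{1}{n^2} \right) \right)^{2} + O(1) \\
&\le O(1)\, n^2 \sum_{0 \le k \le \beta n} P_k \left[ \frac{k\mu}{n} + O\left( \frac{1}{n^2} \right) \right]^{2} + O(1) \\
&\le O(1)\, \mu^2 \mu_2 + O(1) = O(1).
\end{aligned}$$

By assumption $M = O(1)$, and the estimate $F''(1) = O(1)$ gives $S^2 = O(1)$ as well.

We have shown for all $\alpha \ge 1$ that $E\xi^2 = S^2 + M^2$ is bounded, and so the sequence of random variables $\xi$ as a function of $n$ is tight. If the distribution of $\xi$ were to converge weakly to the $\mathrm{Poisson}(c)$ distribution, then tightness would imply that $S^2$ would converge to $c$, but we have shown that $\liminf_{n \to \infty} S^2 > c$.

3. The passive graphs

The results for the passive model given in [2] imply the following formula for the distribution of the vertex degree of a typical vertex $w$ in $IG^{p}_{n}(P_{(m)})$.

Lemma 7. Let $\psi$ be a random variable with the distribution of the vertex degree of a typical vertex $w$ in $IG^{p}_{n}(P_{(m)})$. Then

$$P\{\psi = i\} = \binom{m-1}{i} \sum_{s=0}^{i} (-1)^s \binom{i}{s} \left( 1 - \frac{\mu}{m} + \sum_{k} \frac{k P_k\, (m-k)_{m-i+s-1}}{(m)_{m-i+s}} \right)^{n}.$$

Moreover, the expected vertex degree is

$$E\psi = (m-1)\left[ 1 - \left( 1 - \sum_{k} \frac{k(k-1) P_k}{m(m-1)} \right)^{n} \right].$$

Proof. In [2] it was shown that $\psi$ has the distribution given by

$$P\{w \text{ has degree } i \text{ in } IG^{p}_{n}(P_{(m)})\} = P\{\psi = i\} = \binom{m-1}{i} \sum_{s=0}^{i} (-1)^s \binom{i}{s} \left( 1 - \frac{EX}{m} + \frac{E\big(X (m-X)_{m-i+s-1}\big)}{(m)_{m-i+s}} \right)^{n}.$$

This implies the first formula immediately. The formula for the expected value of $\psi$ follows from the fact that the probability of an edge in $IG^{p}_{n}(P_{(m)})$ is given (see [2]) by

$$P\{w_k w_l \text{ is an edge of } IG^{p}_{n}(P_{(m)})\} = 1 - \left[ 1 - \frac{E_2(X)}{(m)_2} \right]^{n}.$$

The remarks on the relationship between the models $G(n, m, p)$, $IG^{a}_{n}(P_{(m)})$, and $IG^{p}_{m}(P_{(n)})$ at the end of Section 1 and the results in [5] show that $IG^{p}_{n}(P_{(m)})$ has asymptotic Poisson vertex degree when $\alpha > 1$, $n = m^{\alpha}$, $p = \sqrt{c}\, m^{-(1+\alpha)/2}$, and $P_j = \binom{m}{j} p^j (1-p)^{m-j}$. However, one can easily see that the $d$-degenerate passive graph $IG^{p}_{n}(P_{(m)})$ with $P_d = 1$ for some $d \ge 3$ does not have Poisson vertex degree in the limit, for the reason that either the vertex degree is 0 or it is at least $d - 1$. Therefore it seems to be justified that the conditions implying Poisson convergence of the distribution of $\psi$ must involve $P_2$, as well as higher moments of $X$. We intend to use Lemma 7 to get precise results about vertex degree in $IG^{p}_{n}(P_{(m)})$ in future research, but in Theorem 2 we are able to prove a sufficient condition for $\psi$ to be asymptotically Poisson by a different argument.

Theorem 2. Suppose that the probability distribution $P_{(m)}$ satisfies

(i) $\sum_{k=3}^{m} k^2 P_k = o(m/n)$;

(ii) $P_2 \sim cm/(2n)$ for some constant $c > 0$.

Then the distribution of $\psi$ is asymptotically $\mathrm{Poisson}(c)$.

Proof. Let us fix $w \in W$. The expected number of neighbors of $w$ in the passive model which are adjacent to a vertex in $V$ of degree 3 or more in the graph $BG_{n,m}(n, P_{(m)})$ is bounded by

$$n(m-1) \sum_{k=3}^{m} P_k \frac{\binom{m-2}{k-2}}{\binom{m}{k}} \le \frac{n}{m} \sum_{k=3}^{m} k^2 P_k = o(1),$$

where the sum in the first bound is the probability that a particular vertex of $V$ has degree at least 3 and has $w$ and a fixed vertex of $W \setminus \{w\}$ as neighbors. We may ignore vertices in $V$ of degree 0 or 1 because they do not contribute any edges to $IG^{p}_{n}(P_{(m)})$. Thus edges in $IG^{p}_{n}(P_{(m)})$ adjacent to $w$ are almost surely determined by vertices in $V$ of degree 2, and each such vertex corresponds to a $K_2$ in $IG^{p}_{n}(P_{(m)})$. The expected number of these $K_2$'s for which $w$ is in the vertex set of the $K_2$ and for which there exists more than one vertex of $V$ adjacent in $BG_{n,m}(n, P_{(m)})$ to both vertices in the $K_2$ is bounded by

$$(m-1) \binom{n}{2} \left( \frac{P_2}{\binom{m}{2}} \right)^{2} = O(1/m) = o(1).$$

We conclude that almost surely each edge adjacent to $w$ in $IG^{p}_{n}(P_{(m)})$ is determined by a unique vertex in $V$ of degree 2 in $BG_{n,m}(n, P_{(m)})$. Conditional on $v$ being of degree 2 in $BG_{n,m}(n, P_{(m)})$, the probability that $v$ does not determine an edge adjacent to $w$ is

$$\frac{\binom{m-1}{2}}{\binom{m}{2}} = 1 - \frac{2}{m}.$$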
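The counting identity just used, and the Poisson limit claimed in Theorem 2, can be checked numerically. The sketch below is an illustration under assumptions of its own, not part of the proof: it takes $P_2 = cm/(2n)$ exactly, so that each vertex of $V$ contributes an edge at $w$ with probability $2P_2/m = c/n$, and it uses hypothetical helper names and a truncated sum for the distance between the two distributions.

```python
from math import comb, exp, factorial


def miss_probability(m):
    """C(m-1, 2) / C(m, 2): a uniform 2-subset of W avoids a fixed object w."""
    return comb(m - 1, 2) / comb(m, 2)


def binomial_pmf(n, p, i):
    return comb(n, i) * p ** i * (1 - p) ** (n - i)


def poisson_pmf(c, i):
    return exp(-c) * c ** i / factorial(i)


def total_variation_gap(n, m, c, terms=40):
    """Truncated total variation distance between Binomial(n, 2*P_2/m) and Poisson(c).

    Assumes P_2 = c*m/(2*n) exactly (the idealised form of condition (ii)),
    so the success probability reduces to c/n and m cancels.
    """
    p2 = c * m / (2 * n)
    p = 2 * p2 / m
    return 0.5 * sum(
        abs(binomial_pmf(n, p, i) - poisson_pmf(c, i)) for i in range(terms)
    )
```

Calling `total_variation_gap` with increasing `n` and fixed `c` shows the gap shrinking, which is the behaviour the theorem describes; the truncation at 40 terms is an arbitrary choice that is adequate for small `c`.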
The probability that $v$ contributes an edge adjacent to $w$, ignoring the graphs for which $\deg(v) \ge 3$, which we showed are almost surely insignificant, is therefore $2P_2/m$. The choices for the vertices in $V$ are independent, hence the degree of $w$ is distributed asymptotically as $\mathrm{Binomial}(n, 2P_2/m)$, which converges in distribution to $\mathrm{Poisson}(c)$.

It is easy to check that $IG^{p}_{n}(P_{(m)})$ satisfies the conditions of Theorem 2 when $\alpha > 1$, $n = m^{\alpha}$, $p = \sqrt{c}\, m^{-(1+\alpha)/2}$, and $P_j = \binom{m}{j} p^j (1-p)^{m-j}$.

References

[1] J.A. Fill, E.R. Scheinerman, K.B. Singer-Cohen, Random intersection graphs when m = ω(n): an equivalence theorem relating the evolution of the G(n, m, p) and G(n, p) models, Random Structures Algorithms 16 (2000) 156–176.
[2] E. Godehardt, J. Jaworski, Two models of random intersection graphs for classification, in: M. Schwaiger, O. Opitz (Eds.), Exploratory Data Analysis in Empirical Research, Proceedings of the 25th Annual Conference of the Gesellschaft für Klassifikation e.V., University of Munich, March 14–16, 2001, Springer, Berlin, Heidelberg, New York, 2002, pp. 67–81.
[3] M. Karoński, E.R. Scheinerman, K.B. Singer-Cohen, On random intersection graphs: the subgraph problem, Combin. Probab. Comput. 8 (1999) 131–159.
[4] K.B. Singer, Random intersection graphs, Dissertation, Johns Hopkins University, 1995.
[5] D. Stark, The vertex degree distribution of random intersection graphs, Random Structures Algorithms 24 (2004) 249–258.