The Covariance of Topological Indices that Depend on the Degree of a Vertex

Similar documents
CCO Commun. Comb. Optim.

If g is also continuous and strictly increasing on J, we may apply the strictly increasing inverse function g 1 to this inequality to get

The Linear Chain as an Extremal Value of VDB Topological Indices of Polyomino Chains

Extremal trees with fixed degree sequence for atom-bond connectivity index

Extremal Topological Indices for Graphs of Given Connectivity

The smallest Randić index for trees

On the Randić index. Huiqing Liu. School of Mathematics and Computer Science, Hubei University, Wuhan , China

Hexagonal Chains with Segments of Equal Lengths Having Distinct Sizes and the Same Wiener Index

On the Randić Index of Polyomino Chains

arxiv: v1 [cs.dm] 9 Jun 2011

Kragujevac J. Sci. 33 (2011) UDC : ZAGREB POLYNOMIALS OF THORN GRAPHS. Shuxian Li

FIRST ORDER SENTENCES ON G(n, p), ZERO-ONE LAWS, ALMOST SURE AND COMPLETE THEORIES ON SPARSE RANDOM GRAPHS

Dedicated to Prof. Jaroslav Kurzweil on the occasion of his 80th birthday

Seminar In Topological Dynamics

1/12/05: sec 3.1 and my article: How good is the Lebesgue measure?, Math. Intelligencer 11(2) (1989),

Isomorphism of free G-subflows (preliminary draft)

Constructive proof of deficiency theorem of (g, f)-factor

CCO Commun. Comb. Optim.

Math 341: Convex Geometry. Xi Chen

CLIQUES IN THE UNION OF GRAPHS

On the Existence of Almost Periodic Solutions of a Nonlinear Volterra Difference Equation

Lecture 7: February 6

Extremal Zagreb Indices of Graphs with a Given Number of Cut Edges

A Class of Vertex Transitive Graphs

Notes on Random Vectors and Multivariate Normal

On the Inverse Sum Indeg Index

A Generalization of Wigner s Law

Thus f is continuous at x 0. Matthew Straughn Math 402 Homework 6

TWO TYPES OF CONNECTIVITY INDICES OF THE LINEAR PARALLELOGRAM BENZENOID

Topological indices: the modified Schultz index

MULTIPLICATIVE ZAGREB ECCENTRICITY INDICES OF SOME COMPOSITE GRAPHS. Communicated by Ali Reza Ashrafi. 1. Introduction

1 + lim. n n+1. f(x) = x + 1, x 1. and we check that f is increasing, instead. Using the quotient rule, we easily find that. 1 (x + 1) 1 x (x + 1) 2 =

Asymptotic Statistics-III. Changliang Zou

Estimating Some General Molecular Descriptors of Saturated Hydrocarbons

17. Convergence of Random Variables

Existence and Uniqueness

INVERSE LIMITS WITH SET VALUED FUNCTIONS. 1. Introduction

Large topological cliques in graphs without a 4-cycle

Weak convergence. Amsterdam, 13 November Leiden University. Limit theorems. Shota Gugushvili. Generalities. Criteria

Lecture Notes 1: Vector spaces

k-protected VERTICES IN BINARY SEARCH TREES

SOME VERTEX-DEGREE-BASED TOPOLOGICAL INDICES UNDER EDGE CORONA PRODUCT

Random Graphs. EECS 126 (UC Berkeley) Spring 2019

a n b n ) n N is convergent with b n is convergent.

Solutions: Problem Set 4 Math 201B, Winter 2007

On the harmonic index of the unicyclic and bicyclic graphs

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries

Notes 6 : First and second moment methods

2. Metric Spaces. 2.1 Definitions etc.

X n D X lim n F n (x) = F (x) for all x C F. lim n F n(u) = F (u) for all u C F. (2)

MATHS 730 FC Lecture Notes March 5, Introduction

The Fibonacci numbers for the molecular graphs of linear phenylenes

Counting the average size of Markov graphs

A New Multiplicative Arithmetic-Geometric Index

THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION

D-EQUIENERGETIC SELF-COMPLEMENTARY GRAPHS

Combinatorics in Banach space theory Lecture 12

LEBESGUE INTEGRATION. Introduction

Measures and Measure Spaces

Math 117: Continuity of Functions

Why Do Partitions Occur in Faà di Bruno s Chain Rule For Higher Derivatives?

Problems - Section 17-2, 4, 6c, 9, 10, 13, 14; Section 18-1, 3, 4, 6, 8, 10; Section 19-1, 3, 5, 7, 8, 9;

Estrada and L-Estrada Indices of Edge-Independent Random Graphs

Multivariate Random Variable

Lecture 4: Two-point Sampling, Coupon Collector s problem

RANDOM FIBONACCI-TYPE SEQUENCES

Math212a1413 The Lebesgue integral.

Existence and uniqueness: Picard s theorem

ON THE NUMBER OF ALTERNATING PATHS IN BIPARTITE COMPLETE GRAPHS

Modified Zagreb M 2 Index Comparison with the Randi} Connectivity Index for Benzenoid Systems

Extremal Values of Vertex Degree Topological Indices Over Hexagonal Systems

Bipartite decomposition of random graphs

COMPUTING SANSKRUTI INDEX OF TITANIA NANOTUBES

Probability Lecture III (August, 2006)

Harmonic Temperature Index of Certain Nanostructures Kishori P. Narayankar. #1, Afework Teka Kahsay *2, Dickson Selvan. +3

Universal examples. Chapter The Bernoulli process

1 Exercises for lecture 1

CONCEPT OF DENSITY FOR FUNCTIONAL DATA

3 Integration and Expectation

REVIEW OF MAIN CONCEPTS AND FORMULAS A B = Ā B. Pr(A B C) = Pr(A) Pr(A B C) =Pr(A) Pr(B A) Pr(C A B)

Approximability of Dense Instances of Nearest Codeword Problem

Lectures 6: Degree Distributions and Concentration Inequalities

Measure Theory and Lebesgue Integration. Joshua H. Lifton

A Martingale Central Limit Theorem

ACM 116: Lectures 3 4

Methods of constructing topological vector spaces

Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =

SMSTC (2007/08) Probability.

COMPUTING SANSKRUTI INDEX OF DENDRIMER NANOSTARS. Chengdu University Chengdu, , P.R. CHINA 2 Department of Mathematics

1.1 Limits and Continuity. Precise definition of a limit and limit laws. Squeeze Theorem. Intermediate Value Theorem. Extreme Value Theorem.

Completely regular Bishop spaces

Decomposing oriented graphs into transitive tournaments

be a path in L G ; we can associated to P the following alternating sequence of vertices and edges in G:

1 Complete Statistics

Exercises Measure Theoretic Probability

P (A G) dp G P (A G)

then it is called a non-archimedean absolute value. If condition (1.1.2) fails for some x, y F, then is called an archimedean absolute value.

Upper Bounds for Partitions into k-th Powers Elementary Methods

Fractional and circular 1-defective colorings of outerplanar graphs

Discrete Mathematics

Transcription:

MATCH Communications in Mathematical and in Computer Chemistry MATCH Commun. Math. Comput. Chem. 54 (2005) 177-187 ISSN 0340-6253 The Covariance of Topological Indices that Depend on the Degree of a Vertex Boris Hollas Theoretische Informatik, Universität Ulm, D-89081 Ulm E-mail: hollas@informatik.uni-ulm.de Received August 23, 2004 Abstract We consider topological indices I that are sums of f(deg(u), deg(v)), where u, v are adjacent vertices and f is a function. The Randić connectivity index, the 2nd Zagreb index or the Platt number are examples for indices of this kind. In earlier work on topological indices that are sums of independent random variables, we identified the correlation between I and the number of edges of the molecular graph as the main cause for correlated indices. For random graphs on a Poisson-distributed number of vertices, we show how I can be transformed to be uncorrelated to the number of edges. More important, we give reason to assume that this should not simply transform a linear to a non-linear dependence. A similar result holds for zero-order indices, i.e. topological indices that are sums of f(deg(v)) as for example the 1st Zagreb index. 1 Introduction For quite some time it has been known that topological indices (graph invariants on molecular graphs) exhibit considerable mutual correlation [1, 2]. This is a major problem when performing structure-activity studies as the employed statistical methods may fail or give little meaningful results on sets of correlated data. In addition, strong correlations among a set of topological indices raise doubt whether these indices describe different and meaningful biological, chemical or physical properties of molecules. Many topological indices also depend on the number of bonds in a molecule, which is information that should better be coded spearately. Principal component analysis (PCA) is sometimes used to obtain uncorrelated descriptors from a set of correlated descriptors. By construction, the new set of descriptors is a linear combination of the old ones. Hence, each new descriptor may be a combination of all previous descriptors, which complicates model interpretability. Also, uncorrelated descriptors may still be non-linearly related. In an attempt to investigate the reasons for these correlations, we used random graphs [3] as a model for chemical graphs and for topological indices of the form I X(G) X ux v {u,v} E where E is the edge set of the molecular graph G (V, E) and X {X v v V } is a set of independent random variables with a common expectation E(X) [4, 5, 6]. We proved that I X, I Y, and I 1 are necessarily dependent for independent sets of vertex properties X,Y with E(X), E(Y ) 0. For E(X) E(Y ) 0 however these indices are uncorrelated. Here, I 1 denotes a topological index with X v 1 for all v V, that is, I 1 E. In the first case, these indices are thus correlated as a result of the graph invariant used, not as a result of similar chemical properties.

While the random graph model we used in [6] encompasses graphs of arbitrary structure, including chemical graphs, the notion of vertex (or atom) properties X v that are independent of the molecular graph is a serious abstraction from computational chemistry where atom properties used for topological indices are a function of the graph or even the molecule. In this paper, we use a special case of the random graph model used in [6]. In particular, the number of vertices is a Poisson-distributed random variable N. For a fixed number of vertices N n, edges are chosen independently with a probability p n Θ(1/n) so that the expected number of edges is a linear function of n. We use this to model an approximately linear relation of bonds to vertices present in molecules. For example, homologous series of aliphatic or aromatic hydrocarbons with n atoms contain n + c bonds for some constant c. Polyphenyls contain 7 6 n + c bonds as each monomer adds 6 atoms and 7 bonds. On the other hand, there is some variation in the number of bonds for a given number of atoms in a heterogenous set of molecules, which is also true for the random graph model. Note however that this random graph model describes only some specific aspects of chemical graphs, namely the average degree of a vertex and a variation in the number of vertices and edges. Random graphs may not be connected or may contain vertices of degree greater 4. As a more significant difference we consider the vertex properties to be a function of the vertex degree instead of being independent. Thus, our results are valid for topological indices such as the Randić connectivity index, the second Zagreb index or the Platt number [7]. Since we discovered that topological indices that are correlated with I 1 are also mutually correlated within the setting of [4, 5, 6], we focus on the covariance between I X and I 1. We show how to make I X uncorrelated with I 1 in a way that should not result in a non-linear dependence. In section 7 we also consider zero-order indices, which include the first Zagreb index and the zero-order Kier & Hall index. 2 Preliminaries - 178 - The random graph model is constructed in two steps. We shall first consider a fixed, then a variable number of vertices. In the first step, we obtain random graph model G (n, p n). From here, we obtain model G (N, p N) by the use of expectation. For a graph (V, E) on the vertex set V {1,..., N} where N is a random variable let { 1 if {u, v} E 1 uv 1 {{u,v} E} 0 else be the indicator function for {{u, v} E}. For N fixed, let 1 uv (u, v V ) be independent random variables with P (1 uv 1 N n) p n for some p n (0, 1). The space of random graphs G (n, p n) can be identified with the distribution of (1 uv) u,v V with respect to P ( N n), that is, all edges are chosen independently with probability p n. We choose the edge-probabilities p n in a way such that E E α(n 2) for a fixed parameter α > 0 as motivated in the introduction. For the expected degree of a vertex follows E(deg(v)) 2E E n 2α The parameter α thus describes the branching of the graphs. To let the number of vertices vary, let N be a Poisson-distributed random variable and G (N, p N) a space of random graphs with distribution P (G) E(P (G N)) n P (G N n) P (N n) To describe the vertex properties, let f : N 2 R be a function and let X uv f(deg(u), deg(v)) (2.1)

- 179 - be the properties of vertices u, v. We consider the topological index I X I X(G) X uv (2.2) {u,v} E(G) where G is a random graph. In section 3 we will write this as I X X uv1 uv (2.3) uv which is better suited to employ the expectation operator. In section 7, we also consider topological indices of the form N I X X v v1 with X v f(deg(v)) for a function f. We use the following notations throughout the text: O(f) denotes a function g with g(x) cf(x) for all large x and a constant c > 0 D X n X denotes that random element Xn converges to X in distribution P α denotes the Poisson distribution with parameter α. 3 Expectations for n fixed In this section, we use the random graph model G (n, p n), that is, all expectation values are defined via P ( N n). To determine expectation values, we have to eliminate the dependence among X uv and (1 uv) u,v V in (2.1). Therefore, we define the conditional expectation δ (i,j) f,n E ( i+1 X 12 1 12 k3 j+1 1 1k l3 1 2l 1 Since 1 uv (u, v V ) are independent and identically distributed, we have δ (i,j) f,n δ(j,i) f,n. Note that this definition does not depend on the choice of (u, v) (1, 2) since all X uv (u, v V ) have the same distribution. As we shall see in section 4, lim n δ (i,j) f,n exists and is a function of α if f does not grow too steeply. Thus, we may regard δ (i,j) f,n as almost constant for large n. Lemma 1. E(I X) δ (1,1) f,n E E Proof. E(I X) uv E(X uv 1 uv 1) p n by (2.3) δ (1,1) f,n E E by (3.1) ) (3.1) Lemma 2. [ ] E(I XI 1) δ (1,1) n 2 f,n p n + 2δ (1,2) f,n (n 2)pn + δ(1,1) f,n E E 2

Proof. To dissect the sum according to {u, v} {u, v }, consider Then - 180 - E(I XI 1) E(X uv1 uv1 u v ) uv u v S k {(u, v, u, v ) u v u v {u, v} {u, v } k}, 0 k 2 n n 2 S 0 2 2 n S 1 6 3 n S 2 2 (3.2) and (3.4) are obvious. To verify (3.3) let (u, v, u, v ) S 1. Exactly two numbers are equal as indicated in figure 1. Cases (a), (b) allow just one way to distribute three distinct numbers on u, v, u, v while there are two ways for cases (c), (d). For symmetry reasons, E(X uv1 uv1 u v ) (3.2) (3.3) (3.4) u v u v u v u v u v u v u v u v (a) (b) (c) (d) Figure 1: Possibilities for (u, v, u, v ) S 1 E(X 121 121 13) for all (u, v, u, v ) S 1. Hence, we get E(I XI 1) S 0 E(X 121 121 34) + S 1 E(X 121 121 13) + S ( 2 E X 121 2 ) 12 S 0 E(X 12 1 12 1) p 2 n (3.5) + S 1 E(X 12 1 121 13 1) p 2 n With and (3.2)-(3.4), (3.5), we have + S 2 E(X 12 1 12 1) p n n n n 2 3 2 3 E(I XI 1) δ (1,1) n 2 f,n E E p n 2 + 2δ (1,2) f,n E E (n 2)pn + δ (1,1) f,n E E Remark. With f 1, lemma 1 and the help of Mathematica it follows Var(I 1) E E (1 p n), as it should be.

- 181-4 Convergence of δ (i,j) f,n In this section, we prove a result on the convergence of δ (i,j) f,n for n. We need this to generalize the results in section 3 for a Poisson-distributed number of vertices and to justify the main conclusion of this paper. Recall that for random vectors X n, X in R d holds iff D X n X E(f(X n)) E(f(X)) for all bounded and continuous functions f : R d R d. This does not hold for arbitrary unbounded functions f. Therefore, we require that f is bounded by an exponential function in theorem 3. This does not have to be the most general restriction but it is sufficient to treat currently used topological indices. Theorem 3. If f(x, y) b x+y for a constant b > 0 and lim n p nn α, then for all i, j N lim n δ(i,j) f,n E(f(i + P1, j + P2)) where P 1, P 2 are independent and P α-distributed random variables. Proof. First, we show that the claims holds for bounded functions. Let and By definition (3.1), δ (i,j) f,n E S (1) n,i i + S (2) n,j j + n 1 1k ki+2 n 1 2k kj+2 ( i+1 X 12 1 12 k3 ( ) E f S (1) n,i, S(2) n,j j+1 1 1k l3 1 2l 1 E(f(S n)) (4.1) if we write S n S (1) n,i, S(2) n,j. Since p nn α, Poisson s limit theorem gives n D 1 1k Pα (n ) ki+2 The function f : N 2 R can be extended to a continuous function f : R 2 R in an arbitrary way. Hence, the continuous mapping theorem gives f(s n) D f(i + P 1, j + P 2) (n ) For all bounded and continuous functions f : R 2 R follows by (4.1) δ (i,j) f,n E(f (i + P 1, j + P 1)) (n ) (4.2) )

To prove (4.2) for arbitrary functions f with f(x, y) b x+y we cut f off above a limit to divide f into a bounded and an unbounded part. We show that the unbounded part tends to zero as the limit tends to infinity. To do so, let { x if x m c m(x) 0 else and Then c m(x) - 182 - { 0 if x m x else E(( c m f)(s n)) E ( f(s n)1 {f(sn) m}) ( E f(s f(sn) ) n) m 1 m E b 2S(1) n,i +2S(2) n,j 1 m b2i E ( b 2 112) n (i 1) b 2j E ( b 2 112) n (j 1) since S (1) n,i and S(2) n,j are independent since p nn α. Thus, b2(i+j) m (1 + b 2 p n) 2n (1 + b 2 p n) i+j 2 O(1/m) (4.3) lim n δ(i,j) f,n lim m lim n E(f(Sn)) by (4.1) lim lim [E((cm f)(sn)) + E(( cm f)(sn))] m n lim [E((cm f)(i + P1, j + P2)) + O(1/m)] m by (4.2) and (4.3) E(f(i + P 2, j + P 1)) by the convergence theorem of Lebesgue. 5 Expectations for Poisson-distributed N In this section, we use the random graph model G (N, p N). Hence, the number of vertices N is not constant anymore. As we need the results of section 3 here, we denote by E n the edge set of a random graph in G (n, p n), i. e. on a fixed number of vertices, whereas E is the edge set of a graph in G (N, p N). To circumvent dividing by zero and other technical problems we require that N 2 in this section. Therefore, N is a random variable with { β n 2 P (N n) (n 2)! e β if n 2 0 else

- 183 - for β > 0. In the strict sense, N is thus not Poisson-distributed, but N 2 is. Since E(N) E(N 2) + 2 β + 2, the parameter β describes the expectation of N. For N n and α (0, 3) we define the edge-probabilities p n as p n α n ( 2 n 2) so that E E n α(n 2). This results in the useful properties E E n 2 E E n P (N n) αβ βn 3 (n 3)! e β αβ (5.1) and E E n P (N n) E E P (N n 1) (5.2) Thus, E E is the product of a parameter that is related to the average degree of a vertex (see section 2) and a parameter related to the number of vertices in the graph. With lemma 1 and (5.2) we get Lemma 4. E(I X) E δ (1,1) f,n+1 E E The generalization of lemma 2 requires more effort. Lemma 5. [ E(I XI 1) E E E E E δ (1,1) f,n+2 + (1 3α)E δ (1,1) f,n+1 ] +4αE δ (1,2) f,n+1 + O(1/β) Proof. By lemma 2 and (5.2), we have E(I XI 1) E(E(I XI 1 N)) E E [ ] δ (1,1) n 2 f,n p n + 2δ (1,2) f,n (n 2)pn + δ(1,1) f,n 2 P (N n 1) We deal with the three summands separately using α(n 3)P (N n 1) E E P (N n 2) n 2 1 p n α(n 3) 3α + O 2 n 2 1 (n 2)p n 2α + O n 2 and δ (i,j) 1 f,n O P (N n 1) O(1/β) n 2

which holds since δ (i,j) f,n converges by theorem 3. Hence, we get E(I XI 1) E E - 184 - E E δ (1,1) f,n P (N n 2) n 4 3α δ (1,1) f,n P (N n 1) + O(1/β) + 4α δ (1,2) f,n P (N n 1) + O(1/β) + δ (1,1) f,n P (N n 1) [ E E E E E δ (1,1) f,n+2 ] + 4αE δ (1,2) f,n+1 + O(1/β) ( + (1 3α)E δ (1,1) f,n+1 ) With lemma 4 follows Corollary 6. Var(I 1) E E (1 + α + O(1/β)) 6 Covariance with I 1 Any topological index I X of the form (2.2) can be made uncorrelated to E E I 1 by adding a suitable constant: If we write I X (X uv + c) (6.1) we have iff {u,v} E(G) 0 Cov ( I X, I1 ) Cov(IX, I 1) + cvar(i 1) Cov(IX, I1) c Var(I 1) However, this does not make any sense if (6.1), (6.2) merely transform a linear into a non-linear dependence. Figure 2 illustrates how a non-linear dependence may appear. In this example however Cov(I X, I 1) is not zero if E(I 1) is in a region where the curve rises or falls. We therefore require that Cov ( I X, I1 ) 0 for different values of β, or equivalently, that c does not depend on β. It will turn out that this can be achieved approximately, which will precisely be stated in terms of a limit of c for β. With lemma 1, 2 can be shown that Cov(I X, I 1) Var(I 1) δ (1,1) f,n (1 2α) + 2δ(1,2) f,n α + O(1/n) holds for every fixed n, hence (6.2) does not much depend on the number of vertices in this case. On the other hand, if the number of vertices is a Poisson-distributed random variable N, the results of section 5 show that Cov(IX,I1) Var(I1) contains the product ( ) E E E δ (1,1) f,n+2 δ(1,1) f,n+1 (6.3) (6.2)

- 185 - I X 0.3 0.2 0.1 PSfrag replacements -0.1 I 10 20 30 40 50 60 1 Figure 2: A non-linear dependence If the convergence speed of δ (1,1) f,n is not known, nothing can be proved about the behavior of (6.3) for different values of β. Therefore, we define ɛ n δ (1,1) f, δ(1,1) f,n (6.4) with δ (1,1) f, limn δ(1,1) f,n and ˆXuv X uv + ɛ n (6.5) so that δ (1,1) ˆf,n δ(1,1) f, for all n and thus E δ (1,1) ˆf,N+k always zero and Cov ( I ˆX, I1 ) Var(I 1) δ (1,1) f, (1 3α)δ (1,1) f, + 4αE δ (1,2) + O(1/β) ˆf,N+1 1 + α + O(1/β) for all k. Then, product (6.3) is by lemma 4, 5 and corollary 6. This still depends on β, but only to a small extent. More precisely: lim β c exists and is finite. Note that β iff E(N). As a result, we get Theorem 7. For a random graph in G (N, p N), let ˆX uv X uv + ɛ N X uv ˆX uv + c (6.6) with c Cov( I ˆX, I1 ) Var(I 1) and let ɛ n be defined according to (6.4). Then we have: 1. I X is uncorrelated to I1 2. lim β c exists and is finite Proof. The first claim follows from (6.1), (6.2). The only assertion left to show is that E converges for β. By theorem 3 and (6.3), (6.5), δ (1,2) ˆf,n δ (1,2) ˆf,N+1 converges. Hence, there is for any

ε > 0 an n ε such that δ (1,2) ˆf,n+1 δ(1,2) ε for all n > n ˆf, ε and lim E δ (1,2) δ (1,2) lim lim P (N nε) O(1) + εp (N > nε) 0 β ˆf,N+1 ˆf, ε 0 β Thus, the right hand side of (6.6) converges. 7 Zero-Order Indices As a second application, we now consider topological indices of the form N I X X v v1 with X v f(deg(v)) for a function f with f(x) b x for a constant b > 0. Examples for indices of this form are the first Zagreb group index or the zero-order Kier & Hall index. It follows similarly to the results of (6.1), (6.2) that I X with X v X v + c is uncorrelated to I 1 N iff Cov(IX, I1) c E(X 1) (7.1) Var(I 1) This holds for arbitrary random variables N. If we write δ f,n E(X 1 N n), we have c E(δ f,n) and it follows as in theorem 7 that lim β c exists and is finite. Theorem 8. For the zero-order index I X with X v X v + c and c defined according to (7.1) holds: 1. I X is uncorrelated to I1 2. lim β c exists and is finite 8 Discussion Theorem 7 shows that a topological index I X of the form (2.2) can be transformed to an index I X that is uncorrelated to I1 E E within model G (n, pn). A similar assertion holds for zeroorder indices. This transformation should not introduce a non-linear dependence as illustrated in figure 2. This is formally stated by (6.6) and the fact that c does not depend on β for β, or equivalently, for E(N). Thus, the dependence among I X and E E is much reduced. However, the number of bonds may still influence the variance of I X. The values ɛ n introduced in (6.4) are needed to prove assertion (2) of theorem 7. If, in a practical application, δ (1,1) f,n converges rapidly or if the size of the molecules in the data set does not vary much it may not be necessary to transform X uv to ˆX uv since (6.3) will not change much either. In this case, the same is true for c. To show the necessity of this transformation, it still remains to show when I X and E E are correlated within this setting, as it was shown for independent vertex properties with non-zero expectation [4, 5, 6]. Further, a computer experiment with chemical data is needed to see the difference this transformation makes to the correlations among topological indices of the kind we consider here. References - 186 - [1] I. Motoc, A. Balaban, O. Mekenyan, and D. Bonchev. Topological indices: Inter-relations and composition. MATCH Commun. Math. Comput. Chem., 13:369 404, 1982.

- 187 - [2] S. Basak, V. Magnuson, G. Niemi, R. Regal, and G. Veith. Topological indices: Their nature, mutual relatedness, and applications. Math. Model., 8:300 305, 1987. [3] B. Bollobás. Modern Graph Theory. Springer, 1998. [4] B. Hollas. Correlation properties of the autocorrelation descriptor for molecules. MATCH Commun. Math. Comput. Chem., 45:27 33, 2002. [5] B. Hollas. An analysis of the autocorrelation descriptor for molecules. J. Math. Chem., 33(2):91 101, 2003. [6] B. Hollas. Correlations in distance-based descriptors. MATCH Commun. Math. Comput. Chem., 47:79 86, 2003. [7] R. Todeschini and V. Consonni. Handbook of Molecular Descriptors. Wiley, 2000.